Use an allocator to manage the temporary buffers used when copying
unmanaged data from a GPU buffer to the host. This is necessary
because the buffers have to be pinned for better performance, and
pinning is an expensive operation.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
This commit adds initial support for CUDA buffers in ompio, for blocking
and non-blocking individual read and write operations.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>