It would be great to have support for unified memory. CUDA.jl already supports the CUDA equivalent of this, and there is even support for it in KernelAbstractions. Implementation-wise, this would probably mean adding another buffer type besides `HIPBuffer` and `HostBuffer`. I'm not quite sure how this would interact with the pool allocator, though.
We already support wrapping host memory as a `ROCArray`, but AFAIU that's not the same as unified memory: it is unidirectional and supposedly less optimized for accessing data from the device. With unified memory we could also offer a no-copy `unsafe_wrap(Array, ::ROCArray)`, which we're currently lacking.
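To make the idea concrete, here is a minimal sketch of what such a third buffer type might look like. The underlying HIP entry points (`hipMallocManaged`, `hipFree`) are real; the `UnifiedBuffer` name, the error handling, and calling straight into `libamdhip64` via `ccall` are assumptions for illustration only — an actual implementation would go through AMDGPU.jl's existing HIP wrappers and the pool allocator.

```julia
# Hypothetical sketch: a unified-memory buffer alongside HIPBuffer/HostBuffer.
# hipMallocManaged/hipFree exist in the HIP runtime; everything else here
# (type name, library handle, layout) is assumed, not AMDGPU.jl's actual API.

const hipMemAttachGlobal = Cuint(0x1)  # default attach flag from the HIP headers

struct UnifiedBuffer
    ptr::Ptr{Cvoid}
    bytesize::Int
end

function UnifiedBuffer(bytesize::Integer)
    ptr_ref = Ref{Ptr{Cvoid}}(C_NULL)
    err = ccall((:hipMallocManaged, "libamdhip64"), Cint,
                (Ptr{Ptr{Cvoid}}, Csize_t, Cuint),
                ptr_ref, bytesize, hipMemAttachGlobal)
    err == 0 || error("hipMallocManaged failed with code $err")
    return UnifiedBuffer(ptr_ref[], Int(bytesize))
end

free!(buf::UnifiedBuffer) =
    ccall((:hipFree, "libamdhip64"), Cint, (Ptr{Cvoid},), buf.ptr)

# Since the allocation is host- and device-accessible, the no-copy wrap
# mentioned above falls out naturally, e.g. for a Float32 buffer:
#   host_view = unsafe_wrap(Array, Ptr{Float32}(buf.ptr),
#                           buf.bytesize ÷ sizeof(Float32))
```

The main open question remains how such managed allocations would coexist with the pool allocator, since pooled `HIPBuffer` memory and managed memory would need separate free lists.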