CUDA Support
Creating GPU Tensors
C API:
size_t shape[] = {1000, 1000};
Tensor* t = tensr_zeros(shape, 2, TENSR_FLOAT32, TENSR_CUDA);
C++ API:
auto t = tensr::Tensor::zeros({1000, 1000}, tensr::DType::Float32, tensr::Device::CUDA);
Device Transfer
C API:
tensr_to_device(t, TENSR_CUDA, 0);
C++ API:
t.to(tensr::Device::CUDA, 0);
Synchronization
C API:
tensr_synchronize(TENSR_CUDA, 0);
C++ API:
tensr::synchronize(tensr::Device::CUDA, 0);
Device Count
C API:
int count = tensr_device_count(TENSR_CUDA);
C++ API:
int count = tensr::device_count(tensr::Device::CUDA);