CUDA constant memory, namespace, and weird bugs

Edit: the usage of cudaMemcpyToSymbol describded below is deprecated since CUDA 4.1 (See also my new entry Upgrade to CUDA 5.0: cudaMemcpyToSymbol invalid device symbol error)

Today I want to discuss some issues I had with CUDA constant memory and share some workarounds.

Constant memory is a useful feature that enables CUDA programmers to share data within a kernel. Without constant  memory one can declare and use a global variable like this:

__device__ float c_array[10]

__global__ void kernel(float* d_array){ d_array[0] = c_array[0]; }

void test(){
    float* d_array = 0;
    float h_array[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};    
    cudaMalloc((void**)&d_array, 10 * sizeof(float));
    cudaMemcpyToSymbol("c_array", h_array, sizeof(float)*10, 0, cudaMemcpyHostToDevice );
    kernel<<< 1, 1 >>>(d_array);

 The global variable c_array is in global device memory. This type of memory is known to be the slowest memory you can access from a kernel. Instead one should prefer constant memory by putting __constant__ before c_array:

__device__ __constant__ float c_array[10]

[GPU specs]

I've been facing the run-time error "invalid device symbol" when doing something like:

namespace Test_name {
__device__ __constant__ float c_array[10]

void test(){    
    cudaMemcpyToSymbol("c_array", h_array, sizeof(float)*10, 0, cudaMemcpyHostToDevice );

I'm sure people doing language designs would find the error trivial but for me it took some time to figure out what it was. NVIDIA documentation is not really helping here. Explanations about constant memory and its usage are quiet sparse...

So here it goes: nvcc doesn't find the c_array symbol because it is declared within a namespace. The solution is to give the correct symbol:

cudaMemcpyToSymbol("Test_name::c_array", h_array, sizeof(float)*10, 0, cudaMemcpyHostToDevice );

A shame I had to guess this syntax because it is not described anywhere in the CUDA documentation. Or at least it is too well hidden for me to find it!

You may face weird bugs with constant memory depending on your project and GPU architectures. Here is few things you can try:

Hope this will help.

No comments

(optional field, I won't disclose or spam but it's necessary to notify you if I respond to your comment)
All html tags except <b> and <i> will be removed from your comment. You can make links by just typing the url or mail-address.
Anti-spam question: