This guide describes the basic rules for constructing an on-the-fly CUDA program.
On-the-fly programs can be of various types, depending on the type of function you want to implement.
Your code runs directly on the graphics card (and can therefore easily crash it), but the MATLAB wrapper handles all parameter conversion and data transport, including the call to the CUDA function itself. This makes programming such a function very easy.
The following function types currently exist; hopefully many more will be added later:
- CUDA_UnaryFkt : Takes one input array (referred to as “a” in the CUDA code) and has one output array (referred to as “c”).
- CUDA_BinaryFkt : Takes two input arrays (referred to as “a” and “b” in the CUDA code) and has one output array (referred to as “c”).
See also “help cuda_define”
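To make the two function types concrete, here is a minimal host-side sketch in plain C. It is not the toolbox's wrapper code: it only assumes that the wrapper executes your snippet once for every linear index idx. The names emulate_unary and emulate_binary and the two example bodies (doubling and addition) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CUDA_UnaryFkt: the loop plays the role of
   the GPU threads, and the loop body is what your snippet would contain. */
void emulate_unary(const float *a, float *c, size_t n)
{
    assert(a != NULL && c != NULL);
    for (size_t idx = 0; idx < n; ++idx) {
        c[idx] = 2.0f * a[idx];   /* example snippet body: double each element */
    }
}

/* Hypothetical stand-in for a CUDA_BinaryFkt with the two inputs a and b. */
void emulate_binary(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t idx = 0; idx < n; ++idx) {
        c[idx] = a[idx] + b[idx]; /* example snippet body: elementwise sum */
    }
}
```

On the GPU there is no explicit loop; each processor would execute only the single statement in the loop body for its own idx.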
Write your program
In every code snippet you write, a number of variables are predefined:
- idx : The linear index of your processor
- a,b,c : Depending on the function type (see above) input and output arrays
- sSize : A structure containing the source size array in its component s. E.g. the size along X is read from sSize.s
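As an illustration of where a linear index such as idx typically comes from, the sketch below emulates a one-dimensional CUDA launch on the host in plain C. The formula block * block_dim + thread mirrors the common CUDA convention blockIdx.x * blockDim.x + threadIdx.x; whether the MATLAB wrapper computes idx in exactly this way is an assumption, and run_grid, block_dim, and the squaring body are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side emulation of a 1D CUDA launch over n elements.
   Returns the number of emulated threads that processed an element. */
size_t run_grid(const float *a, float *c, size_t n, size_t block_dim)
{
    assert(block_dim > 0);
    size_t grid_dim = (n + block_dim - 1) / block_dim; /* round up */
    size_t active = 0;
    for (size_t block = 0; block < grid_dim; ++block) {
        for (size_t thread = 0; thread < block_dim; ++thread) {
            size_t idx = block * block_dim + thread;   /* linear index */
            if (idx >= n) {
                continue; /* surplus threads in the last block do nothing */
            }
            c[idx] = a[idx] * a[idx]; /* example snippet body: square */
            ++active;
        }
    }
    return active;
}
```

The bounds check matters because the number of launched threads is usually a multiple of the block size and can exceed the array length.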
If you need to calculate with the position coordinates in the arrays, you can start your program with the line
which creates the array “pos”, assigning each processor the multidimensional coordinate of its pixel directly.
Furthermore the lines
calculate the index idd corresponding to the position given in pos, so that a result can, for example, be written as c[idd]=a[idx];
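The conversion between a linear index and a multidimensional position can be sketched in plain C as follows. This is a host-side illustration, not the toolbox's own code: it assumes three dimensions and that X varies fastest in memory (as in MATLAB's column-major storage). The names SizeSketch, PosSketch, coords_from_idx, and idx_from_coords are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NDIMS 3  /* assumption: three dimensions (X, Y, Z) */

typedef struct { size_t s[NDIMS]; } SizeSketch; /* hypothetical analogue of sSize */
typedef struct { size_t s[NDIMS]; } PosSketch;  /* hypothetical analogue of pos */

/* Split a linear index into coordinates, with X varying fastest. */
PosSketch coords_from_idx(size_t idx, SizeSketch size)
{
    PosSketch pos;
    for (int d = 0; d < NDIMS; ++d) {
        assert(size.s[d] > 0);
        pos.s[d] = idx % size.s[d]; /* coordinate along dimension d */
        idx /= size.s[d];           /* remaining index for higher dimensions */
    }
    return pos;
}

/* Inverse operation: combine coordinates back into a linear index. */
size_t idx_from_coords(PosSketch pos, SizeSketch size)
{
    size_t idd = 0;
    for (int d = NDIMS - 1; d >= 0; --d) {
        idd = idd * size.s[d] + pos.s[d];
    }
    return idd;
}
```

With helpers of this kind, a snippet could change pos between the two conversions (for example shift it by one pixel along X) and then store the result with c[idd]=a[idx].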