cudaChannelFormatDesc () in CUDA | How to use
cudaChannelFormatDesc in CUDA
In this article we learn how to use
cudaChannelFormatDesc
()
in-build function in CUDA. Background of
this function is very interesting [ (I can’t open that secret here ;) ].
So, if you are planning to use CUDA Array’s, then you have to be
understand what is
cudaChannelFormatDesc()
function
in CUDA.
So, Today I going to teach you, how to use
cudaChannelFormatDesc()
function
in CUDA, related to CUDA Array’s. For sake of advice, I advise you please
go through this article for sake of understanding.
Let’s start our discussion;
cudaChannelFormatDesc ( )
channelDesc describes the format of the value
that is returned when fetching the texture;
The cudaChannelFormatDesc describes the
format of a texture element.
struct cudaChannelFormatDesc {
int x, y, z, w;
enum cudaChannelFormatKind f;
}
|
Where,
x, y,
z, w : are set to the number of
bits for each component, and
f: is one of the
cudaChannelFormatKindSigned (if
these components are of signed integer type)
cudaChannelFormatKindUnsigned (if they are of
unsigned integer type)
cudaChannelFormatKindFloat (if they are of floating point type )
Example
For float texels we could create a
channel with;
cudaCreateChannelDesc( 32, 0,
0, 0, cudaChannelFormatKindFloat );
While for short4 texels this would be
cudaCreateChannelDesc( 16, 16,
16, 16, cudaChannelFormatKindSigned );
|
According to CUDA programming
guide 4.2 ;
Before a
kernel can use a texture reference to read from texture memory, the texture
reference must be bound to a texture using cudaBindTexture() or cudaBindTexture2D()
for linear memory, or cudaBindTextureToArray() for CUDA arrays.
cudaUnbindTexture() is used to unbind a texture reference. It is
recommended to allocate two-dimensional textures in linear memory using cudaMallocPitch()
and use the pitch returned by cudaMallocPitch() as input parameter
to cudaBindTexture2D().
The following code samples
bind a texture reference to aCUDA Array’s cuArray:
Using the low-level API:
texture
<float, cudaTextureType2D, cudaReadModeElementType> texRef;
textureReference*
texRefPtr;
cudaGetTextureReference(&texRefPtr,
"texRef");
cudaChannelFormatDesc
channelDesc;
cudaGetChannelDesc(&channelDesc,
cuArray);
cudaBindTextureToArray(texRef,
cuArray, &channelDesc);
|
Using the high-level API:
texture<float,
cudaTextureType2D, cudaReadModeElementType> texRef;
cudaBindTextureToArray(texRef,
cuArray);
|
The
format specified when binding a texture to a texture reference must match the
parameters specified when declaring the texture reference; otherwise, the
results of texture fetches are undefined.
Complete
Example of CUDA Array and
cudaChannelFormatDesc ()
#include "HEADER.h"
#include "ERROR.h"
// 2D float texture
texture<float, cudaTextureType2D,
cudaReadModeElementType> texRef;
// Simple transformation kernel
__global__
void transformKernel(float* output, int width, int height, float theta)
{
// Calculate
normalized texture coordinates
unsigned
int x =
blockIdx.x *
blockDim.x +
threadIdx.x;
unsigned
int y =
blockIdx.y *
blockDim.y +
threadIdx.y;
float u = x / (float)width;
float v = y / (float)height;
// Transform
coordinates
u -=
0.5f;
v -=
0.5f;
float tu
= u * cosf(theta) - v * sinf(theta) + 0.5f ;
float tv
= v * cosf(theta) + u * sinf(theta) + 0.5f ;
// Read from
texture and write to global memory
output[y
* width + x] = tex2D(texRef, tu, tv);
}
// Host code
int main()
{
int
width = 3 ,
height = 3 ;
float
h_data[3][3] ;
for ( int i =0 ; i<3 ; i++ )
for
( int j = 0 ; j<3; j++ )
h_data [i][j] = i*j ;
int size
= width*height*sizeof(float) ;
// Allocate
CUDA array in device memory
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(32, 0, 0, 0,
cudaChannelFormatKindFloat );
cudaArray*
cuArray;
cudaMallocArray(&cuArray, &channelDesc, width, height);
// Copy to
device memory some data located at address h_data
// in host
memory
cudaMemcpyToArray(cuArray, 0, 0, h_data, size,
cudaMemcpyHostToDevice);
// Set texture
parameters
texRef.addressMode[0] = cudaAddressModeWrap;
texRef.addressMode[1] = cudaAddressModeWrap;
texRef.filterMode
= cudaFilterModeLinear;
texRef.normalized
= true;
// Bind the
array to the texture reference
cudaBindTextureToArray(texRef, cuArray, channelDesc);
// Allocate
result of transformation in device memory
float* output;
cudaMalloc(&output, size );
// Invoke
kernel
dim3 dimBlock(16, 16);
dim3 dimGrid(
(width +
dimBlock.x -
1) /
dimBlock.x,
(height +
dimBlock.y -
1) /
dimBlock.y );
transformKernel<<<dimGrid, dimBlock>>>(output, width, height, 90 );
printf ("\n Original array \n");
for ( int i =0 ; i<3 ; i++ )
{
for
( int j = 0 ; j<3; j++ )
printf ("%f ", h_data [i][j] );
printf ("\n");
}
CUDA_CALL (
cudaMemcpy (h_data ,
output, size,
cudaMemcpyDeviceToHost ) );
printf ("\n After operation \n ");
for ( int i =0 ; i<3 ; i++ )
{
for
( int j = 0 ; j<3; j++ )
printf ("%f ", h_data [i][j] );
printf ("\n");
}
system ("pause");
// Free device
memory
cudaFreeArray(cuArray);
cudaFree(output);
return 0;
}
|
Sample
output
Fig.
I hope you must like this article and have learned cudaChannelFormatDesc ( ) and how to use cudaChannelFormatDesc ( ) in
CUDA.
If you want to know about Texture memory; click here
Got
Questions?
Feel
free to ask me any question because I'd be happy to walk you through step by
step!
grt!!! thanks
ReplyDeleteYour blog provided us with valuable information to work with. Each & every tips of your post are awesome. Thanks a lot for sharing.
ReplyDelete안전놀이터
We are really grateful for your blog post. You will find a lot of approaches after visiting your post. Great work!
ReplyDelete성인웹툰
pusulabet
ReplyDeletesex hattı
https://izmirkizlari.com
rulet siteleri
rexbet
5AQ3MA