TEXTURE OBJECT IN CUDA | Bindless Texture in CUDA

Posted by Unknown at 02:33 | 52 comments

I have talked about texture memory in CUDA in this article previously. In the previous article, we were talking about Texture Reference. Now in this article I’m introducing new feature of Kepler architecture called Texture object.

For this, we recap some of the important concept of texture memory that we have discussed in details in this article previously. Feel free to brows this article for more theory and example.

TEXTURE Memory

Texture memory is read from kernels using the device functions described in Texture Functions. The process of reading a texture calling one of these functions is called a texture fetch. Each texture fetch specifies a parameter called a texture object for the texture object API or a texture reference for the texture reference API.

the TEXTURE

Texture which is the piece of texture memory that is fetched. Texture objects are created at runtime and the texture is specified when creating the texture object as described in Texture Object API. Texture references are created at compile time and the texture is specified at runtime by bounding the texture reference to the texture through runtime functions as described in Texture Reference API; several distinct texture references might be bound to the same texture or to textures that overlap in memory. A texture can be any region of linear memory or a CUDA array (described in CUDA Arrays).

TEXTURE dimensionality

Its dimensionality that specifies whether the texture is addressed as a one dimensional array using one texture coordinate, a two-dimensional array using two texture coordinates, or a three-dimensional array using three texture coordinates. Elements of the array are called texels, short for texture elements. The texture width, height, and depth refer to the size of the array in each dimension. Table Wiki lists the maximum texture width, height, and depth depending on the compute capability of the device.

TEXTURE type (Texel)

The type of a texel, which is restricted to the basic integer and single-precision floating-point types and any of the 1-, 2-, and 4-component vector types defined in char, short, int, long, long long, float, double.

TEXTURE Read Mode

The read mode, which is equal to cudaReadModeNormalizedFloat or cudaReadModeElementType. If it is cudaReadModeNormalizedFloat and the type of the texel is a 16-bit or 8-bit integer type, the value returned by the texture fetch is actually returned as floating-point type and the full range of the integer type is mapped to [0.0, 1.0] for unsigned integer type and [-1.0, 1.0] for signed integer type; for example, an unsigned 8-bit texture element with the value 0xff reads as 1. If it is

cudaReadModeElementType, no conversion is performed.

TEXTURE Normalized coordinates

Whether texture coordinates are normalized or not. By default, textures are referenced (by the functions of Texture Functions) using floating-point coordinates in the range [0, N-1] where N is the size of the texture in the dimension corresponding to the coordinate. For example, a texture that is 64x32 in size will be referenced with coordinates in the range [0, 63] and [0, 31] for the x and y dimensions, respectively. Normalized texture coordinates cause the coordinates to be specified in the range [0.0, 1.0-1/N] instead of [0, N-1], so the same 64x32 texture would be addressed by normalized coordinates in the range [0, 1-1/N] in both the x and y dimensions. Normalized texture coordinates are a natural fit to some applications' requirements, if it is preferable for the texture coordinates to be independent of the texture size.

TEXTURE addressing mode

It is valid to call the device functions with coordinates that are out of range. The addressing mode defines what happens in that case. The default addressing mode is to clamp the coordinates to the valid range: [0, N) for non-normalized coordinates and [0.0, 1.0) for normalized coordinates. If the border mode is specified instead, texture fetches with out-of-range texture coordinates return zero. For normalized coordinates, the warp mode and the mirror mode are also available. When using the wrap mode, each coordinate x is converted to frac(x)= x floor(x) where floor(x) is the largest integer not greater than x. When using the mirror mode, each coordinate x is converted to frac(x) if floor(x) is even and 1-frac(x) if floor(x) is odd. The addressing mode is specified as an array of size three whose first, second, and third elements specify the addressing mode for the first, second, and third texture coordinates, respectively; the addressing mode are cudaAddressModeBorder, cudaAddressModeClamp, cudaAddressModeWrap and cudaAddressModeMirror; cudaAddressModeWrap and

cudaAddressModeMirror are only supported for normalized texture coordinates.

TEXTURE filtering mode

The filtering mode which specifies how the value returned when fetching the texture is computed based on the input texture coordinates. Linear texture filtering may be done only for textures that are configured to return floating-point data. It performs

Low-precision interpolation between neighboring texels. When enabled, the texels surrounding a texture fetch location are read and the return value of the texture fetch is interpolated based on where the texture coordinates fell between the texels. Simple linear interpolation is performed for one-dimensional textures, bilinear interpolation for two-dimensional textures, and trilinear interpolation for three-dimensional textures. Texture Fetching gives more details on texture fetching. The filtering mode is equal to cudaFilterModePoint or cudaFilterModeLinear. If it is cudaFilterModePoint, the returned value is the texel whose texture coordinates are the closest to the input texture coordinates. If it is cudaFilterModeLinear, the returned value is the linear interpolation of the two (for a one-dimensional texture), four (for a two dimensional texture), or eight (for a three dimensional texture) texels whose texture coordinates are the closest to the input texture coordinates. cudaFilterModeLinear is only valid for returned values of floating-point type.

TEXTURE object api

A texture object is created using cudaCreateTextureObject() from a resource description of type struct cudaResourceDesc, which specifies the texture, and from a texture description defined as such:

struct cudaTextureDesc

{

enum cudaTextureAddressMode addressMode[3];

enum cudaTextureFilterMode filterMode;

enum cudaTextureReadMode readMode;

int sRGB;

int normalizedCoords;

unsigned int maxAnisotropy;

enum cudaTextureFilterMode mipmapFilterMode;

float mipmapLevelBias;

float minMipmapLevelClamp;

float maxMipmapLevelClamp;

};

· addressMode specifies the addressing mode;

· filterMode specifies the filter mode;

· readMode specifies the read mode;

· normalizedCoords specifies whether texture coordinates are normalized or not;

TEXTURE REFERENCES

Textures are likely a familiar concept to anyone who’s done much CUDA programming. A feature from the graphics world, textures are images that are stretched, rotated and pasted on polygons to form the 3D graphics we are familiar with. Using textures for GPU computing has always been a pro tip for the CUDA programmer; they enable fast random access to arrays and use a cache to provide bandwidth aggregation. On the flip side, the legacy texture reference API is cumbersome to use because it requires manual binding and unbinding of texture references to memory addresses, as the following code demonstrates. Also, texture references can only be declared as static global variables and cannot be passed as function arguments.

#define N 1024

texture<float, 1, cudaReadModeElementType> tex;

// texture reference name must be known at compile time

__global__ void kernel() {

int i = blockIdx.x * blockDim.x + threadIdx.x;

float x = tex1D(tex, i);

// do some work using x...

}

void call_kernel(float *buffer) {

// bind texture to buffer

cudaBindTexture(0, tex, buffer, N*sizeof(float));

kernel<<<grid, block>>>();

// unbind texture from buffer

cudaUnbindTexture(tex);

}

int main() {

// declare and allocate memory

float *buffer;

cudaMalloc(&buffer, N*sizeof(float));

call_kernel(buffer);

cudaFree(buffer);

}

TEXTURE OBJECTS

Kepler GPUs and CUDA 5.0 introduce a new feature called texture objects (sometimes called bindless textures, since they don’t require manual binding/unbinding) that greatly improves the usability and programmability of textures. Texture objects use the new cudaTextureObject_t class API, whereby textures become first-class C++ objects and can be passed as arguments just as if they were pointers. There is no need to know at compile time which textures will be used at run time, which enables much more dynamic execution and flexible programming, as shown in the following code.

#define N 1024

// texture object is a kernel argument

__global__ void kernel(cudaTextureObject_t tex) {

int i = blockIdx.x * blockDim.x + threadIdx.x;

float x = tex1Dfetch(tex, i);

// do some work using x ...

}

void call_kernel(cudaTextureObject_t tex) {

kernel <<<grid, block>>>(tex);

}

int main() {

// declare and allocate memory

float *buffer;

cudaMalloc(&buffer, N*sizeof(float));

// create texture object

cudaResourceDesc resDesc;

memset(&resDesc, 0, sizeof(resDesc));

resDesc.resType = cudaResourceTypeLinear;

resDesc.res.linear.devPtr = buffer;

resDesc.res.linear.desc.f = cudaChannelFormatKindFloat;

resDesc.desc.x = 32; // bits per channel

resDesc.res.linear.sizeInBytes = N*sizeof(float);

cudaTextureDesc texDesc;

memset(&texDesc, 0, sizeof(texDesc));

texDesc.readMode = cudaReadModeElementType;

// create texture object: we only have to do this once!

cudaTextureObject_t tex;

cudaCreateTextureObject(&tex, &resDesc, &texDesc, NULL);

call_kernel(tex); // pass texture as argument

// destroy texture object

cudaDestroyTextureObject(tex);

cudaFree(buffer);

}

Moreover, texture objects only need to be instantiated once, and are not subject to the hardware limit of 128 texture references, so there is no need to continuously bind and unbind them. (The Kepler hardware limit is over one million texture objects, a limit that will likely never be reached by any practical CUDA application in the lifetime of the architecture!) Using texture objects, the overhead of binding (up to 1 μs) and unbinding (up to 0.5 μs) textures is eliminated. What is not commonly known is that each outstanding texture reference that is bound when a kernel is launched incurs added launch latency—up to 0.5 μs per texture reference. This launch overhead persists even if the outstanding bound textures are not even referenced by the kernel. Again, using texture objects instead of texture references completely removes this overhead.

STRONG SCALING IN HPC

While these overheads may sound minor for any application that uses kernels that run for milliseconds or greater in duration, they are important for latency-sensitive, short-running kernels. High Performance Computing (HPC) applications that need strong scaling typify this use case. Strong scaling is how the solution time varies with the number of processors for a fixed global problem size. The goal is to solve a fixed problem in as short a time as possible. There is usually a limit to strong scaling, as at some point an application will become bound by the inter-processor communication bandwidth or latencies that arise from doing less work on each processor. Beyond this limit the total performance will actually decrease.

Figure 1: Strong-scaling performance of the QUDA solver using both texture references and texture objects.

There are many large-scale HPC applications that will be running on the Titan supercomputer at Oak Ridge National Laboratory in Tennessee, for which achieving excellent strong scaling performance will be vital. Lattice Quantum Chromodynamics (LQCD) is one such application. LQCD uses a combination of sparse linear solvers, molecular dynamics algorithms and Monte Carlo methods to probe the structure of the nucleus. Using the highly optimized QUDA library, legacy LQCD applications MILC and Chroma can run on Titan making full use of the attached Tesla K20X accelerators. The QUDA library makes extensive use of textures, and using texture references the strong scaling tops out at 2 TFLOP/s, as you can see in Figure 1. Profiling using the CUDA visual profiler revealed that the overhead of texture binding and unbinding contributed significantly to time-critical communication routines running on the GPU. Moreover, many of the kernels run for O(10μs) using up to ten textures simultaneously. Thus the launch latency overhead is a significant bottleneck. Once the library was rewritten to use texture objects, strong scaling performance increased significantly and throughput increased to nearly 3 TFLOP/s at 27 GPUs (see Figure 1).

Texture objects are yet another powerful feature of the Kepler architecture that make it easier to achieve high performance and scalability with CUDA.

Another example

// Simple transformation kernel

__global__ void transformKernel(float* output, cudaTextureObject_t texObj, int width, int height, float theta)

{

// Calculate normalized texture coordinates

unsigned int x = blockIdx.x * blockDim.x + threadIdx.x;

unsigned int y = blockIdx.y * blockDim.y + threadIdx.y;

float u = x / (float)width;

float v = y / (float)height;

// Transform coordinates

u -= 0.5f;

v -= 0.5f;

float tu = u * cosf(theta) – v * sinf(theta) + 0.5f;

float tv = v * cosf(theta) + u * sinf(theta) + 0.5f;

// Read from texture and write to global memory

output[y * width + x] = tex2D<float>(texObj, tu, tv);

}

// Host code

int main()

{

// Allocate CUDA array in device memory

cudaChannelFormatDesc channelDesc =cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);

cudaArray* cuArray;

cudaMallocArray(&cuArray, &channelDesc, width, height);

// Copy to device memory some data located at address h_data

// in host memory

cudaMemcpyToArray(cuArray, 0, 0, h_data, size,cudaMemcpyHostToDevice);

// Specify texture

struct cudaResourceDesc resDesc;

memset(&resDesc, 0, sizeof(resDesc));

resDesc.resType = cudaResourceTypeArray;

resDesc.res.array.array = cuArray;

// Specify texture object parameters

struct cudaTextureDesc texDesc;

memset(&texDesc, 0, sizeof(texDesc));

texDesc.addressMode[0] = cudaAddressModeWrap;

texDesc.addressMode[1] = cudaAddressModeWrap;

texDesc.filterMode = cudaFilterModeLinear;

texDesc.readMode = cudaReadModeElementType;

texDesc.normalizedCoords = 1;

// Create texture object

cudaTextureObject_t texObj = 0;

cudaCreateTextureObject(&texObj, &resDesc, &texDesc, NULL);

// Allocate result of transformation in device memory

float* output;

cudaMalloc(&output, width * height * sizeof(float));

// Invoke kernel

dim3 dimGrid((width + dimBlock.x – 1) / dimBlock.x, (height + dimBlock.y – 1) / dimBlock.y);

transformKernel<<<dimGrid, dimBlock>>>(output, texObj, width, height, angle);

// Destroy texture object

cudaDestroyTextureObject(texObject);

// Free device memory

cudaFreeArray(cuArray);

cudaFree(output);

return 0;

}

Reference

CUDA C programming Guide

52 comments:

Unknown29 July 2016 at 04:25
the blog is very interesting and will be much useful for us. thank you for sharing the blog with us. please keep on updating.
SEO Company In Chennai
ReplyDelete
Replies
YouLoseBellyFat17 August 2016 at 06:17
you'll find c codes on c codes for beginners
ReplyDelete
Replies
Anonymous10 October 2016 at 11:56
Great website! I wonder if people at Nvidia know you copy their blog posts and publish on your page...
ReplyDelete
Replies
Unknown31 January 2017 at 22:03
I just see the post i am so happy to the communication science post of information's.So I have really enjoyed and reading your blogs for these posts.Any way I’ll be replay for your great thinks and I hope you post again soon.

seo training in chennai
ReplyDelete
Replies
Unknown5 April 2017 at 23:42
Thanks, very useful descriptions!
ReplyDelete
Replies
Ancy merina16 February 2018 at 01:28
This comment has been removed by the author.
ReplyDelete
Replies
suman26 August 2018 at 04:29
I foubd nice articlein your blog.thank you for useful info
.
ReplyDelete
Replies
Kate8 October 2018 at 05:47
Great reading these your posts.
ReplyDelete
Replies
afiah b31 October 2018 at 01:59
Thank you for an additional great post. Exactly where else could anybody get that kind of facts in this kind of a ideal way of writing? I have a presentation next week, and I’m around the appear for this kind of data.
Java training in Bangalore | Java training in Electronic city

Java training in Chennai | Java training institute in Chennai | Java course in Chennai

Java training in USA

Java training in Bangalore | Java training in Indira nagar
ReplyDelete
Replies
nivatha31 October 2018 at 04:28
Really great post, Thank you for sharing This knowledge.Excellently written article, if only all bloggers offered the same level of content as you, the internet would be a much better place. Please keep it up!
Data Science course in Chennai | Best Data Science course in Chennai
Data science course in bangalore | Best Data Science course in Bangalore
Data science course in pune | Data Science Course institute in Pune
Data science online course | Online Data Science certification course-Gangboard
Data Science Interview questions and answers
ReplyDelete
Replies
gowsalya1 November 2018 at 23:46
It's interesting that many of the bloggers to helped clarify a few things for me as well as giving.Most of ideas can be nice content.The people to give them a good shake to get your point and across the command
python training institute in marathahalli
python training institute in btm
Python training course in Chennai

ReplyDelete
Replies
sathya shri2 November 2018 at 23:17
Needed to compose you a very little word to thank you yet again regarding the nice suggestions you’ve contributed here.
angularjs-Training in chennai

angularjs Training in chennai

angularjs-Training in tambaram

angularjs-Training in sholinganallur

angularjs-Training in velachery
ReplyDelete
Replies
rohini7 February 2019 at 02:08
" I’d love to be a part of group where I can get advice from other experienced people that share the same interest. If you have any recommendations, please let me know. Thank you.
"
apple service center chennai | ipod service center in chennai | Apple laptop service center in chennai | apple iphone service center in chennai | apple iphone service center in chennai
ReplyDelete
Replies
Chiến SEOCAM3 April 2019 at 02:25
Koj tsab xov xwm tseem ceeb heev

chó Corgi

bán chó Corgi

chó Corgi giá bao nhiêu

mua chó Corgi
ReplyDelete
Replies
rohini14 May 2019 at 00:41
There are a lot of occasions to be in contact with the job world (workshops, presentations in front of the company, guest speakers, seminars).Along that blogs is helping through learn on different path to move on different career.
Laptop battery replacement in chennai
Laptop unlocking service in chennai
100% genuine laptop parts
Laptop service centre in chennai
Laptop display replacement in chennai
ReplyDelete
Replies
Softgen Infotech10 December 2019 at 07:43
I am happy for sharing on this blog its awesome blog I really impressed. thanks for sharing.

Became an Expert In Google Cloud Platform Training in Bangalore! Learn from experienced Trainers and get the knowledge to crack a coding interview, @Softgen Infotech Located in BTM Layout.
ReplyDelete
Replies
Anonymous30 June 2020 at 17:23
Дамская сумочка – поистине предмет искусства. Итальянцы постоянно экперементируют с цветом и материалом, ваяя невероятные сумочки из разных веществ. Однако наряду с такими будоражащими воображение остаются неизменно привычнми традицонные сумочки из кожи, красивые и подкупающие непосредственной элегантностью и высоким стилем. Красивая природная кожа, идеальная фурнитура, многолетние традиции и акцент последних стилей итальянской моды мотивируют проивзодителей саломея сумки на создание всё более идеальных коллекций. Новых трендом стала треугольная кожаная сумочка. Украшенная стильными кругами из медных деталей - будет блистательным акцентом твоего наряда. Выбирайте коричневую, для того, чтобы бысть стильной и заслужить восторженные взоры.
ReplyDelete
Replies
Christina Alpanda23 January 2021 at 07:30
Nice reading, I love your content. This is really a fantastic and informative post. Keep it up and if you are looking for devops course then visit igmguru.com.
data science with python training
ReplyDelete
Replies
Best Training Institute15 February 2021 at 21:33
Really Nice Information It's Very Helpful All courses Checkout Here.
Microsoft Dynamics CRM Training in Bangalore
ReplyDelete
Replies
Gesper66621 March 2021 at 21:45
Nice to be visiting your blog again, it has been months for me. Well this article that i’ve been waited for so long. I need this article to complete my assignment in the college, and it has same topic with your article. Thanks, great share.

Gesper666
Resep Ntaps
ReplyDelete
Replies
gusdri25 May 2021 at 13:12
Texture memory is read from kernels using the device functions described in Texture Functions

gabung file jpg
ReplyDelete
Replies
Anonymous29 October 2021 at 22:18
I got such a useful stuff on your website that helps me a lot to gain information.

check university ba 1st year time table
ReplyDelete
Replies
Keerthi5517 November 2021 at 01:41
will omit your great writing due to this problem.
MuleSoft training
MuleSoft online training
ReplyDelete
Replies
periyannan2 December 2021 at 21:41
Nice blog post for reading and Thanks for sharing the wonderful article

Internship providing companies in chennai | Where to do internship | internship opportunities in Chennai | internship offer letter | What internship should i do | How internship works | how many internships should i do ? | internship and inplant training difference | internship guidelines for students | why internship is necessary
ReplyDelete
Replies
Bdfy8 February 2022 at 07:44
ladyhammer casino no depositing bonuses codes
slottyway
tornadobet bonus code
dzisiejsze numery keno
ReplyDelete
Replies
Anonymous8 February 2022 at 19:34
This comment has been removed by the author.
ReplyDelete
Replies
Alex Jarboe27 February 2022 at 21:15
This is a great article with lots of informative resources. I appreciate your work this is really helpful for everyone. Check out our website DevOps Training for more igmguru.com, related info!
ReplyDelete
Replies
Copycat Figma24 August 2022 at 04:49
React.js offers a faster development environment, and its very easy to use. Learn them without taking care of too many setups, when you are developing apps by serving HTML files.
Learn How to Use Reactjs Cdn
ReplyDelete
Replies
Scofio morgan2 November 2022 at 08:08
This comment has been removed by the author.
ReplyDelete
Replies
Scofio morgan2 November 2022 at 08:18
Ask him any Questions about herpes

[â€¦] Itâ€™s a crazy miracle product. Read my review here. [â€¦]

I was cured of herpes simplex virus his contact ___________Robinsonbuckler11@gmail. com.......................

this is the best herbal remedyâ€¦
ReplyDelete
Replies
baby day pannel chart24 November 2022 at 03:09
I am really really impressed with your writing skills as well as with the layout on your blog.

baby night pannel chart, golden day express pannel chart, meenakshi morning pannel chart.
ReplyDelete
Replies
laxmicynixit27 May 2024 at 11:18
nice...............!
mulesoft training
pega training
servicenow training training
ReplyDelete
Replies
파워볼게임20 December 2024 at 02:34
registered as an official certified guarantee company for over 150 community sites, and is a playground that provides a safe betting environment as a premium Toto site that has been in use without changing its company name for over 6 years.
ReplyDelete
Replies
바카라 사이트20 December 2024 at 02:34
Toto without restrictions and unlimited winnings and cash out, just click on the Superstar Bet registration code and address provided.
ReplyDelete
Replies
카지노사이트20 December 2024 at 02:37
preparing exciting promotions that cannot be experienced in reality and provides the best games in the industry to provide true entertainment and opportunities for winning.
ReplyDelete
Replies
파워볼20 December 2024 at 02:39
Playground boasts the largest domestic capital and is the best real-time mini-game specialized site that provides the highest odds in the industry, 1.96, and provides major site codes.
ReplyDelete
Replies
바카라사이트13 February 2025 at 01:10
Muchos Gracias for your post. Really looking forward to read more. Want more.
ReplyDelete
Replies
카지노사이트13 February 2025 at 01:11
Good way of explaining, and fastidious piece of writing to take facts, thankyou!
ReplyDelete
Replies
파워볼사이트13 February 2025 at 01:11
Very neat blog. Thank you! Great. Really looking forward to read more. Awesome.
ReplyDelete
Replies
토토사이트15 February 2025 at 22:50
I learn something totally new and challenging on blogs, Its great.
ReplyDelete
Replies
토토사이트15 February 2025 at 22:51
Great blog love to read thanks for sharing such well written .
ReplyDelete
Replies
토토사이트15 February 2025 at 22:53
Thank you for ssharing this
ReplyDelete
Replies
파워볼사이트15 February 2025 at 22:54
Such a excellent post. Continue for sharing like this, Its helpful info
ReplyDelete
Replies
슬롯사이트15 February 2025 at 22:56
Really nice and informative blog, keep it up buddy…
ReplyDelete
Replies
Sumeet Chopra3 March 2025 at 22:49
Well Explained! I appreciate the way you broke down the details. It made it easy to understand. Keep up the great work!

Mysore Ooty Coorg Tour Package
Mysore Ooty Kodaikanal Tour Package
Ooty Kodaikanal Tour Package
ReplyDelete
Replies
슬롯사이트4 April 2025 at 17:55

Hey keep posting such good and meaningful articles.
ReplyDelete
Replies
파워볼사이트4 April 2025 at 17:57
very nice work, keep it up
ReplyDelete
Replies
토토사이트4 April 2025 at 17:58
This is a very interesting post. Thank you for posting a lot of interesting posts.
ReplyDelete
Replies
토토사이트4 April 2025 at 18:00
This is such a great resource that you are providing and you give it away for free.
ReplyDelete
Replies
토토사이트4 April 2025 at 18:08
Very good written information.
ReplyDelete
Replies
신용카드 현금화4 April 2025 at 18:09
Your writing is perfect and complete.
ReplyDelete
Replies

Add comment

Help us to improve our quality and become contributor to our blog

CUDA Programming

Prefer Your Language

Search This Blog

Tags

TEXTURE OBJECT IN CUDA | Bindless Texture in CUDA

52 comments:

Recent Post

About Me

Total Pageviews

Labels

Blog Archive

Labels

Like Us

Cloud

Admin

CUDA Programming

Prefer Your Language

Search This Blog

Tags

Related Posts

Share This

TEXTURE OBJECT IN CUDA | Bindless Texture in CUDA

52 comments:

Recent Post

About Me

Subscribe To

Total Pageviews

Labels

Blog Archive

Labels

Like Us

Cloud

Admin