
Saturday, 12 January 2013

What Are the CUDA Driver API and the CUDA Runtime API, and What Is the Difference Between Them?


CUDA Runtime API
The CUDA runtime makes it possible to compile and link your CUDA kernels into the executable. This means that you don't have to distribute cubin files with your application, or deal with loading them through the driver API. It is generally the easier of the two APIs to use.
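To make this concrete, here is a minimal sketch of a runtime API program, assuming it is compiled with nvcc into a single executable. The kernel vector_add and the helper run_vector_add_runtime are names I made up for illustration. Note that there is no initialization call and no module loading: the kernel is linked into the binary and launched with the <<<...>>> syntax.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Device code: each thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Hypothetical helper: adds two n-element vectors on the GPU and
// returns true if every result equals 3.0f (1.0f + 2.0f).
bool run_vector_add_runtime(int n) {
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;

    // No explicit initialization: the first runtime call sets up the context.
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess &&
              cudaMalloc(&d_b, bytes) == cudaSuccess &&
              cudaMalloc(&d_c, bytes) == cudaSuccess &&
              cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        // Execution configuration syntax: grid size and block size.
        vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts a null pointer, so this is safe after a partial failure.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);

    for (int i = 0; ok && i < n; ++i) ok = (c[i] == 3.0f);
    return ok;
}
```

Calling run_vector_add_runtime(1000) from main is enough to try it; something like nvcc vector_add.cu -o vector_add should build it, with the file name being my own choice.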

CUDA Driver API

In contrast, the driver API is harder to program but provides more control over how CUDA is used. The programmer has to deal directly with initialization, module loading, and so on.
The driver API has often been described as exposing more detailed device information than the runtime API. The free memory available on the device is the usual example: the driver API reports it through cuMemGetInfo. The runtime API offers cudaMemGetInfo for the same query, however, so this alone is not a reason to choose the driver API.
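As a rough sketch of what dealing with initialization directly looks like, the function below queries device 0 through the driver API. The helper name print_device_info_driver is hypothetical, and I am assuming a toolkit recent enough to provide cuDevicePrimaryCtxRetain (older code would typically create its own context instead). The program has to be linked against the driver library, for example with -lcuda.

```cuda
#include <cstdio>
#include <cuda.h>  // driver API header; link with -lcuda

// Hypothetical helper: prints the name, total memory and free memory of
// device 0 using only driver API calls. Returns true on success.
bool print_device_info_driver() {
    // Unlike the runtime, the driver API must be initialized explicitly.
    if (cuInit(0) != CUDA_SUCCESS) return false;

    int count = 0;
    if (cuDeviceGetCount(&count) != CUDA_SUCCESS || count == 0) return false;

    CUdevice dev;
    if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return false;

    char name[256];
    size_t total = 0;
    if (cuDeviceGetName(name, static_cast<int>(sizeof(name)), dev) != CUDA_SUCCESS) return false;
    if (cuDeviceTotalMem(&total, dev) != CUDA_SUCCESS) return false;

    // The free-memory query needs a current context, so retain the
    // device's primary context and make it current on this thread.
    CUcontext ctx;
    if (cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS) return false;

    size_t free_bytes = 0, total_bytes = 0;
    bool ok = cuCtxSetCurrent(ctx) == CUDA_SUCCESS &&
              cuMemGetInfo(&free_bytes, &total_bytes) == CUDA_SUCCESS;
    if (ok) {
        std::printf("Device 0: %s\n", name);
        std::printf("Total memory: %zu bytes\n", total);
        std::printf("Free memory:  %zu bytes\n", free_bytes);
    }

    cuDevicePrimaryCtxRelease(dev);
    return ok;
}
```

Every step that the runtime would do behind the scenes (initialization, picking a device, getting a context) is spelled out here, which is the extra control and the extra work the driver API brings.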
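Difference between the CUDA Driver API and the CUDA Runtime API
CUDA provides two APIs:
  • A low-level API called the CUDA driver API.
  • A higher-level API called the CUDA runtime API, which is implemented on top of the CUDA driver API.
In early CUDA releases these APIs were mutually exclusive: an application had to use either one or the other. From CUDA 3.0 onward the two interoperate, so runtime and driver calls can be mixed in the same application, although most programs still pick one.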
The CUDA runtime eases device code management by providing implicit initialization, context management, and module management. The C host code generated by nvcc is based on the CUDA runtime, so applications that link to this code must use the CUDA runtime API.
In contrast, the CUDA driver API requires more code and is harder to program and debug, but it offers a better level of control and is language-independent, since it deals only with already-compiled device code such as cubin objects. In particular, it is more difficult to configure and launch kernels using the CUDA driver API, because the execution configuration and kernel parameters must be specified with explicit function calls instead of the execution configuration syntax described in Section 4.2.3 of the CUDA Programming Guide. Also, device emulation (see Section 4.5.2.9 of the CUDA Programming Guide) does not work with the CUDA driver API. Note that device emulation existed only in early toolkits and has since been removed.
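The sketch below shows what launching a kernel with explicit function calls can look like. It is only an illustration under a few assumptions: the kernel fill_index is a tiny hand-written PTX routine I made up (each thread stores its global index), it is embedded as a string so the example needs no separate cubin file, and the launch uses cuLaunchKernel, which newer toolkits provide (early toolkits used a family of cuParamSet-style calls instead). The helper name run_fill_index_driver is hypothetical too.

```cuda
#include <vector>
#include <cuda.h>  // driver API header; link with -lcuda

// Hand-written PTX for a hypothetical kernel: out[i] = i, where i is the
// global thread index. A real project would normally load nvcc output.
static const char* kPtx = R"(.version 6.0
.target sm_50
.address_size 64

.visible .entry fill_index(.param .u64 out_ptr)
{
    .reg .b32 %r<5>;
    .reg .b64 %rd<5>;

    ld.param.u64        %rd1, [out_ptr];
    cvta.to.global.u64  %rd2, %rd1;
    mov.u32             %r1, %ctaid.x;
    mov.u32             %r2, %ntid.x;
    mov.u32             %r3, %tid.x;
    mad.lo.s32          %r4, %r1, %r2, %r3;
    mul.wide.s32        %rd3, %r4, 4;
    add.s64             %rd4, %rd2, %rd3;
    st.global.u32       [%rd4], %r4;
    ret;
}
)";

// Hypothetical helper: loads the module, launches blocks x threads threads,
// and returns true if element i of the result equals i.
bool run_fill_index_driver(int blocks, int threads) {
    const int n = blocks * threads;
    std::vector<int> host(n, -1);

    CUdevice dev;
    CUcontext ctx;
    CUmodule mod = nullptr;
    CUfunction fn = nullptr;
    CUdeviceptr d_out = 0;

    // Explicit initialization and context management.
    if (cuInit(0) != CUDA_SUCCESS || cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return false;
    if (cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS) return false;

    // Explicit module management: load the code, then look up the kernel by name.
    bool ok = cuCtxSetCurrent(ctx) == CUDA_SUCCESS &&
              cuModuleLoadData(&mod, kPtx) == CUDA_SUCCESS &&
              cuModuleGetFunction(&fn, mod, "fill_index") == CUDA_SUCCESS &&
              cuMemAlloc(&d_out, n * sizeof(int)) == CUDA_SUCCESS;

    if (ok) {
        // Kernel parameters are passed as an array of pointers to each argument.
        void* params[] = { &d_out };
        // Grid and block dimensions are ordinary function arguments here,
        // not <<<...>>> syntax.
        ok = cuLaunchKernel(fn, blocks, 1, 1, threads, 1, 1,
                            0, nullptr, params, nullptr) == CUDA_SUCCESS &&
             cuCtxSynchronize() == CUDA_SUCCESS &&
             cuMemcpyDtoH(host.data(), d_out, n * sizeof(int)) == CUDA_SUCCESS;
    }

    if (d_out) cuMemFree(d_out);
    if (mod) cuModuleUnload(mod);
    cuDevicePrimaryCtxRelease(dev);

    for (int i = 0; ok && i < n; ++i) ok = (host[i] == i);
    return ok;
}
```

To load a cubin or PTX file from disk instead of an embedded string, cuModuleLoad takes a file name in place of the cuModuleLoadData call; the rest of the flow stays the same.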
There is no noticeable performance difference between the APIs. How your kernels use memory and how they are laid out on the GPU (in warps and blocks) will have a much more pronounced effect on performance.



Got Questions?
Feel free to ask me any question because I'd be happy to walk you through step by step!  



2 comments:

  1. Is there an example of implicit loading *.cubin using the Runtime API? Thank you in advance

  2. This blog has interesting content.

