Compute kernel

"Kernel (compute)" redirects here. For other uses, see Kernel (disambiguation).

In computing, a compute kernel is a routine compiled for high-throughput accelerators (such as GPUs, DSPs or FPGAs), separate from the main program. Compute kernels are sometimes called compute shaders, sharing execution resources with vertex shaders and pixel shaders on GPUs, but they are not limited to execution on one class of device or to graphics APIs.

Compute kernels roughly correspond to inner loops when implementing algorithms in traditional languages (except that there is no implied sequential ordering), or to the code passed to internal iterators.
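
As a minimal sketch of this correspondence (CUDA is used here purely for illustration; the function names are invented), the same element-wise operation can be written first as a conventional inner loop and then as a compute kernel:

    // Traditional sequential inner loop: iterations have a defined order.
    void saxpy_loop(int n, float a, const float *x, float *y) {
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }

    // Compute kernel: each invocation handles one index; no ordering is
    // implied between invocations, so the hardware may run them in parallel.
    __global__ void saxpy_kernel(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // this invocation's index
        if (i < n)
            y[i] = a * x[i] + y[i];
    }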

They may be specified in a separate language such as "OpenCL C" (managed by the OpenCL API), or as "compute shaders" (managed by a graphics API such as OpenGL), or embedded directly in application code written in a high-level language, as in the case of C++ AMP.
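
In CUDA, for example, the kernel is likewise embedded in the application source and dispatched from host code through the runtime API. A minimal host-side sketch (assuming the saxpy_kernel above and an arbitrary problem size chosen for illustration) might look like:

    #include <cuda_runtime.h>

    __global__ void saxpy_kernel(int n, float a, const float *x, float *y);  // defined in the earlier sketch

    int main() {
        const int N = 1 << 20;                      // illustrative problem size
        float *x, *y;
        cudaMallocManaged(&x, N * sizeof(float));   // unified (managed) memory
        cudaMallocManaged(&y, N * sizeof(float));
        for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // One kernel invocation per element, grouped into blocks of 256.
        int threads = 256;
        int blocks  = (N + threads - 1) / threads;
        saxpy_kernel<<<blocks, threads>>>(N, 3.0f, x, y);
        cudaDeviceSynchronize();                    // wait for the whole batch

        cudaFree(x);
        cudaFree(y);
        return 0;
    }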

This programming paradigm maps well to vector processors: there is an assumption that each invocation of a kernel within a batch is independent, allowing for data-parallel execution. However, atomic operations may be used for synchronisation between elements in scenarios where the work is interdependent. Individual invocations are given indices (in one or more dimensions) from which arbitrary addressing of buffer data may be performed (including scatter/gather operations), so long as the non-overlapping assumption is respected.
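
A minimal sketch of these ideas (an invented histogram example, again in CUDA): each invocation derives its own index, gathers from wherever that index points, and falls back on an atomic update where outputs would overlap:

    // Per-invocation indices, a gathered read, and an atomic update where
    // invocations would otherwise write to the same location.
    __global__ void histogram_kernel(const unsigned char *data, int n,
                                     unsigned int *bins /* 256 entries */) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // 1-dimensional index
        if (i < n) {
            unsigned char value = data[i];              // gather by index
            // Many invocations may hit the same bin, so this write is
            // interdependent and must be atomic.
            atomicAdd(&bins[value], 1u);
        }
    }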

The Vulkan API provides the intermediate SPIR-V representation to describe both graphical shaders and compute kernels in a language-independent and machine-independent manner. The intention is to facilitate language evolution and a more natural ability to leverage GPU compute capabilities, in line with hardware developments such as Unified Memory Architecture and heterogeneous system architecture, which allow closer co-operation between a CPU and GPU.
