- Programming …
DPC++
Links:
Always use in code:
#include <CL/sycl.hpp>
Using namespace sycl;
1. Memory Management
Malloc
To manage memory use a pointer-based approach, more specifically Malloc. For example, create memory location to share:
char *example = malloc_shared<char>(variable,queue);
In SYCL you can use 3 types: malloc_host, malloc_shared and malloc_device:
int* host_array = malloc_host(N, Q);
int* shared_array = malloc_shared(N, Q);
int* device_array = malloc_device(N, Q);
Characteristics:
Type | Description | Accessible on host? | Accessible on device? | Located on |
---|---|---|---|---|
device | Allocation in device memory | No | Yes | device |
host | Allocations in host memory | Yes | Yes | host |
shared | Allocations shared between host and device | Yes | Yes | can migrate back and forth |
Buffer
Another option is the use of buffers: The easiest way to declare them is by indicating the data source in their constructor: array, vector, pointer, etc. Example:
buffer my_buffer(my_data);
To access these you must use an ‘accessor’
Example:
accessor my_accessor(my_buffer, h)
Access type | Description |
---|---|
read_only | Read only access |
write_only | Write only access |
read_write | Read and write access |
2. Implementation and management of Kernels
Queues
The queues allow us to connect with the devices, with them we send kernels to execute work and move data.
Queue <queue name>;
Parallel_for
It allows to define a parallel loop that will be executed by many threads depending on the number of iterations. In its simplified version, the compiler performs the division of work (iterations) in the threads.
h.parallel_for (range {N}, [=] (id <1> idx) {
As you can see it is like a lambda function, the function takes two arguments, the first is called a range which specifies the number of elements to throw in each dimension and the second is a kernel function that will be executed for each index of the range .
Note: They can use for several dimensions, for two dimensions it would be:
h.parallel_for (range {N, M}, [=] (id <2> idx)
A complete code example:
#include <CL/sycl.hpp>
#include <array>
#include <iostream>
using namespace sycl;
int main(){
constexpr int size = 16;
std::array<int, size> data;
queue Q;
buffer B { data };
Q.submit([&](handler& h){
accessor A{B,h};
h.parallel_for(size, [=](auto& idx){
A[idx] = idx;
});
});
host_accessor A{B};
for(int i=0; i<size; i++)
std::cout << "data" << i << " = " << A[i] << "\n";
return 0;
}