Describe the bug
When I run my application on the latest TornadoVM build, it intermittently throws an error indicating that PTX JIT compilation failed:
Unable to compile task 300502d1-daec-4e34-b335-8fde2503eb00.mxm - addFloat
The internal error is: [Error During the Task Compilation]
How To Reproduce
My Main.java simply runs the following:
import java.util.UUID;

import uk.ac.manchester.tornado.api.ImmutableTaskGraph;
import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.TornadoExecutionResult;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.exceptions.TornadoExecutionPlanException;
import uk.ac.manchester.tornado.api.types.arrays.FloatArray;

public class Main {
    public static void main(String[] args) {
        run();
    }

    // Static so that it can be called from main() without an instance
    public static void run() {
        FloatArray nativeBuffer = getFrom...();
        FloatArray nativeResultBuffer = getFrom...();
        int size = 1024;
        // Create a task-graph with multiple tasks. Each task points to an existing Java method
        // that can be accelerated on a GPU/FPGA
        TaskGraph taskGraph = new TaskGraph(UUID.randomUUID().toString())
            .transferToDevice(DataTransferMode.FIRST_EXECUTION, nativeBuffer, nativeResultBuffer) // Transfer data from host to device only in the first execution
            .task("mxm", AdditionTask::addFloat, nativeBuffer, nativeResultBuffer, 1.0f, size) // Each task points to an existing Java method
            .transferToHost(DataTransferMode.EVERY_EXECUTION, nativeResultBuffer); // Transfer data from device to host
        // Create an immutable task-graph
        ImmutableTaskGraph immutableTaskGraph = taskGraph.snapshot();
        // Create an execution plan from an immutable task-graph
        try (TornadoExecutionPlan executionPlan = new TornadoExecutionPlan(immutableTaskGraph)) {
            // Run the execution plan on the default device
            TornadoExecutionResult executionResult = executionPlan.execute();
            if (executionResult.isReady()) {
                ...
            }
        } catch (TornadoExecutionPlanException ex) {
            ex.printStackTrace();
        }
    }
}
The AdditionTask class looks like this:
public class AdditionTask implements BufferTask {
    public static void addFloat(@NotNull FloatArray input, @NotNull FloatArray output,
                                float value, int size) {
        for (@Parallel int i = 0; i < size; i++) {
            output.set(i, 1.0f);
        }
    }
}
In this case, BufferTask is just an empty interface.
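One value in the reproduction that visibly changes between runs is the task-graph name, since it comes from UUID.randomUUID().toString(). To rule it out as a factor, the name could be pinned or sanitized before it is passed to new TaskGraph(...). The following is a minimal sketch in plain Java, not a confirmed fix: TaskGraphNames and toIdentifier are hypothetical helpers, and whether the task-graph name influences PTX code generation at all is an unverified assumption.

```java
import java.util.UUID;

public class TaskGraphNames {

    // Hypothetical helper (not part of TornadoVM): maps any string to a name that
    // starts with a letter and contains only letters, digits and underscores.
    public static String toIdentifier(String raw) {
        String cleaned = raw.replaceAll("[^A-Za-z0-9_]", "_");
        if (cleaned.isEmpty() || !Character.isLetter(cleaned.charAt(0))) {
            return "g_" + cleaned;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // The name taken from the error message in this report
        System.out.println(toIdentifier("300502d1-daec-4e34-b335-8fde2503eb00"));
        // A fresh random name, as used in the reproduction
        System.out.println(toIdentifier(UUID.randomUUID().toString()));
    }
}
```

If the failure no longer appears with a fixed name such as "s0", or with a sanitized one, that would point at the name; if it still appears, the name can be excluded.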
Expected behavior
I expect the code to run without any compilation errors. It does so in some runs, but not in all of them.
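For reference, the result expected from a successful run can be written as a sequential check. This is a sketch in plain Java that uses float[] as a stand-in for FloatArray so that it runs without TornadoVM. ExpectedResult, addFloatReference and matchesExpected are hypothetical names, and the loop mirrors addFloat as posted: it writes 1.0f to every element and does not read input or value.

```java
public class ExpectedResult {

    // Sequential stand-in for AdditionTask.addFloat, using float[] instead of FloatArray.
    public static void addFloatReference(float[] input, float[] output, float value, int size) {
        for (int i = 0; i < size; i++) {
            output[i] = 1.0f;
        }
    }

    // Returns true if the first `size` elements all hold the expected value 1.0f.
    public static boolean matchesExpected(float[] output, int size) {
        for (int i = 0; i < size; i++) {
            if (output[i] != 1.0f) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int size = 1024;
        float[] input = new float[size];
        float[] output = new float[size];
        addFloatReference(input, output, 1.0f, size);
        System.out.println(matchesExpected(output, size)); // prints true
    }
}
```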
Computing system setup (please complete the following information):
- OS: Windows 10
- CUDA: cuda_11.7.r11.7/compiler.31294372_0
- PTX: How do I obtain this?
- TornadoVM commit id: 8db121e
Additional context
The attached log was generated with the --debug flag and comes from my original program. The minimal reproducible example above should still be a valid proxy.