I am running a Batch job whose tasks use the GPU. When the task parallelism is greater than 1, the first task exhausts the GPU memory. What is the best practice for handling concurrent tasks that use the GPU? How do I prevent container tasks from being scheduled on the same VM? Or should I change the allocationPolicy to request more than one GPU and, in the code, set the visible GPU device per task?
"allocationPolicy": {
"instances": [
{
"installGpuDrivers": true,
"policy": {
"machineType": "g2-standard-16",
"accelerators": [
{
"type": "nvidia-l4",
"count": 1
}
]
}
}
]
}
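For the "prevent tasks from landing on the same VM" option, the Batch TaskGroup spec has a `taskCountPerNode` field that caps how many tasks run on each instance; setting it to 1 gives every concurrent task its own VM (and its own GPU). A hedged sketch of the relevant taskGroups fragment (the counts are illustrative, only `taskCountPerNode` is the point):

```json
"taskGroups": [
  {
    "taskCount": 4,
    "parallelism": 4,
    "taskCountPerNode": 1,
    "taskSpec": { ... }
  }
]
```

This trades cost for isolation: each task gets a whole g2-standard-16, so it avoids GPU memory contention without any code changes.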
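For the multi-GPU alternative, each task can pin itself to one device by setting `CUDA_VISIBLE_DEVICES` from the `BATCH_TASK_INDEX` environment variable that Batch sets inside each task. A minimal sketch, assuming `NUM_GPUS` matches `accelerators.count` in the allocationPolicy:

```shell
# Pin this Batch task to a single GPU based on its task index.
# NUM_GPUS is an assumption: keep it equal to accelerators.count.
NUM_GPUS=1
: "${BATCH_TASK_INDEX:=0}"   # set by Cloud Batch; default 0 for local testing
export CUDA_VISIBLE_DEVICES=$(( BATCH_TASK_INDEX % NUM_GPUS ))
echo "task ${BATCH_TASK_INDEX} -> GPU ${CUDA_VISIBLE_DEVICES}"
```

With `parallelism` no larger than `NUM_GPUS`, each concurrently running task then sees exactly one distinct GPU; CUDA device 0 inside the task maps to the assigned physical device.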