Slurm on VM instance doesn't work with different MPI implementations

Hello,

I have created a Compute Engine VM instance running Slurm, using one of the standard blueprints ("hpc-cluster-small.yaml"). On the VM I have a code base that depends on a variety of packages installed with the Spack package manager, including MPICH. When I run the code with srun, I get the following error:

============================================================
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM support. This usually happens
when OMPI was not configured --with-slurm and we weren't able
to discover a SLURM installation in the usual places.
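
The message above is printed by Open MPI ("OMPI"), although the Spack environment is meant to provide MPICH, so it may be worth checking which MPI library the binary is actually linked against. The following is a minimal sketch of such a check; the function name `mpi_libs_of` and the binary name `./my_app` are hypothetical and not part of the blueprint or of Spack.

```shell
# Sketch: list MPI-related shared libraries that a binary is linked against.
# Assumes a dynamically linked Linux binary and that ldd is available.
mpi_libs_of() {
    bin="$1"
    # ldd prints one resolved library per line; keep only MPI-looking names.
    # libopen-rte / libopen-pal are Open MPI support libraries.
    libs=$(ldd "$bin" 2>/dev/null | grep -Ei 'libmpi|libmpich|libopen-(rte|pal)' || true)
    if [ -n "$libs" ]; then
        printf '%s\n' "$libs"
    else
        echo "no MPI libraries found"
    fi
}

# Usage (hypothetical binary name):
#   mpi_libs_of ./my_app
#   srun --mpi=list    # lists the MPI/PMI plugin types this Slurm build supports
```

The resolved paths in the output usually show which installation each library comes from, for example a Spack prefix containing "openmpi" or "mpich".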

I've tried a number of things, including building MPICH with Spack against the Slurm installation that comes with this instance. I've also tried installing MPICH together with a Spack-installed Slurm, but whenever I load that module, it seems to break Slurm altogether and I get errors like:

============================================================
sinfo: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host
sinfo: error: fetch_config: DNS SRV lookup failed
sinfo: error: _establish_config_source: failed to fetch config
sinfo: fatal: Could not establish a configuration source
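
These errors usually mean that the `sinfo` being run could not find a `slurm.conf` and fell back to locating the controller through a DNS SRV record. One plausible cause, stated here as an assumption, is that the Spack-built Slurm client now comes first on `PATH` and looks for its configuration under its own prefix rather than where the cluster keeps it. A minimal sketch for checking which client and configuration are in play (the function name `slurm_client_report` is hypothetical):

```shell
# Sketch: show which Slurm client commands are first on PATH and whether
# SLURM_CONF (the environment variable Slurm reads for the config path) is set.
slurm_client_report() {
    for tool in sinfo srun; do
        path=$(command -v "$tool" 2>/dev/null || true)
        echo "$tool: ${path:-not found on PATH}"
    done
    echo "SLURM_CONF: ${SLURM_CONF:-unset}"
}

# Usage: run once before and once after loading the Spack module, then compare.
#   slurm_client_report
```

If the reported paths change to a Spack prefix after loading the module, the module is shadowing the system Slurm client.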

Is there a way to easily reconfigure the VM instance so that it recognizes different MPI implementations or different Slurm installations?

Thank you

For this type of question about Slurm and Google Cloud products, you would get better answers in the following Google Group: