Hello,
I have created a Compute Engine VM instance running Slurm, deployed from one of the standard blueprints (“hpc-cluster-small.yaml”). On the VM I have a code base that depends on a variety of packages installed with the Spack package manager, including MPICH. The issue occurs when I run the code with srun:
============================================================
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM support. This usually happens
when OMPI was not configured --with-slurm and we weren't able
to discover a SLURM installation in the usual places.
I’ve tried a number of things, including building the MPICH library with Spack against the existing Slurm installation that ships with this instance. I’ve also tried installing MPICH with a Spack-installed Slurm, but whenever I load this module, it seems to break Slurm altogether and I get errors like:
============================================================
sinfo: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host
sinfo: error: fetch_config: DNS SRV lookup failed
sinfo: error: _establish_config_source: failed to fetch config
sinfo: fatal: Could not establish a configuration source
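For reference, pointing Spack at the instance's existing Slurm (the first attempt described above) would typically mean registering it as an external package in packages.yaml, roughly as in the sketch below. The version and prefix are placeholders, not the real values from this instance; they would need to match the output of `sinfo --version` and the actual install location.

```yaml
# packages.yaml (sketch): treat the system Slurm as an external package
# so Spack links MPI against it instead of building its own copy.
packages:
  slurm:
    externals:
    - spec: slurm@23.02.4   # placeholder version, check with `sinfo --version`
      prefix: /usr/local    # placeholder prefix, assumed install location
    buildable: false        # never build a second Slurm from source
```

With an entry like this, a spec such as `mpich +slurm pmi=pmi2` should concretize against the external Slurm rather than pulling in a Spack-built one whose binaries end up shadowing the system `sinfo` and `srun`.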
Is there a way to easily reconfigure the VM instance so that it recognizes different MPI implementations or different Slurm installations?
Thank you