Fix munge key race condition and update docs

- Add sleep to prevent munge.key race condition
- Warn about CPU override conflicts in README
- Update README with CPU config details for minimal setup
- Correct `less` command examples for prime number output
- Force update slurm/cgroups configs in the provision script
Kris Lamoureux 2024-08-18 00:00:08 -04:00
parent b1aee01586
commit 1e95dd7b2f
Signed by: kris
GPG Key ID: 3EDA9C3441EDA925
3 changed files with 17 additions and 7 deletions

@@ -63,7 +63,7 @@ By default, each node is allocated:
4. View the resulting prime numbers found, check `ls` for exact filenames
less slurm-1_0.out
-less slurm-2_1.out
+less slurm-1_1.out
### Configuration Tool
@@ -96,6 +96,10 @@ ignored by .gitignore. Be cautious when using this command as it will delete
files that are not tracked by Git. Use the `-n` flag to dry-run first.
## Global Overrides
+**WARNING:** Always update `slurm.conf` to match any CPU overrides to prevent
+resource allocation conflicts.
If you wish to override the default settings on a global level,
you can do so by creating a `.settings.yml` file based on the provided
`example-.settings.yml` file:
@@ -125,6 +129,11 @@ file without modifications. This results in a cluster configuration using only
1 vCPU and 1 GB RAM per node (totaling 4 threads/cores and 4 GB RAM), allowing
basic operation on modest hardware.
+When using this minimal setup with 1 vCPU, you'll need to update the `slurm.conf` file.
+Apply the following change to the default `slurm.conf`:
+sed -i 's/CPUs=2/CPUs=1/g' slurm.conf
### Slurm Settings Overrides
- `SLURM_NODES`
- Default: `4`
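
Note on the README hunks above: the documented `sed` one-liner only works while `slurm.conf` still contains the literal `CPUs=2`. The following is a minimal sketch, not part of this commit, of a helper that rewrites whatever numeric value is currently there. The name `set_node_cpus` is hypothetical, and it assumes node definitions use the `CPUs=<number>` form seen in the README command. Like the original one-liner, it rewrites every `CPUs=<number>` occurrence in the file.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): set every CPUs=<n> entry
# in a slurm.conf-style file to the requested value.
set_node_cpus() {
    conf=$1   # path to the config file to edit
    cpus=$2   # desired CPU count per node
    tmp=$(mktemp) || return 1
    # Write to a temp file first, then copy the content back, so the
    # original file keeps its owner and permissions.
    sed "s/CPUs=[0-9][0-9]*/CPUs=${cpus}/g" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Under these assumptions, `set_node_cpus slurm.conf 1` would have the same effect as the README command on the default file, and would also work after the value has already been changed once.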

@@ -46,12 +46,12 @@ fi
dpkg -s slurm-client &>/dev/null || apt-get install -y slurm-client
# Create directories for Slurm
-mkdir -p /var/spool/slurm /var/log/slurm /etc/slurm
-chown slurm:slurm /var/spool/slurm /var/log/slurm /etc/slurm
+mkdir -p /var/spool/slurm /etc/slurm
+chown slurm:slurm /var/spool/slurm /etc/slurm
# Copy slurm.conf and cgroup.conf
-cp -u /vagrant/slurm.conf /etc/slurm/slurm.conf
-cp -u /vagrant/cgroup.conf /etc/slurm/cgroup.conf
+cp -f /vagrant/slurm.conf /etc/slurm/slurm.conf
+cp -f /vagrant/cgroup.conf /etc/slurm/cgroup.conf
chown slurm:slurm /etc/slurm/slurm.conf /etc/slurm/cgroup.conf
chmod 644 /etc/slurm/slurm.conf /etc/slurm/cgroup.conf
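
Note on the `cp -u` to `cp -f` change above: `cp -u` skips the copy whenever the destination is newer than the source, which is one way a stale config can survive re-provisioning. The sketch below reproduces that behavior in a throwaway directory. The file names and contents are made up for illustration, and it assumes a `cp` that supports `-u` (GNU coreutils does).

```shell
#!/bin/sh
# Illustration only: file names and contents are hypothetical.
demo_dir=$(mktemp -d)
printf 'new config\n' > "$demo_dir/src.conf"
printf 'stale config\n' > "$demo_dir/dst.conf"
# Make the source look older than the destination.
touch -t 202001010000 "$demo_dir/src.conf"

# -u: destination is newer than the source, so nothing is copied.
cp -u "$demo_dir/src.conf" "$demo_dir/dst.conf"
after_u=$(cat "$demo_dir/dst.conf")

# Without -u the copy happens regardless of timestamps.
cp -f "$demo_dir/src.conf" "$demo_dir/dst.conf"
after_f=$(cat "$demo_dir/dst.conf")

echo "after cp -u: $after_u"
echo "after cp -f: $after_f"
rm -rf "$demo_dir"
```

The forced update comes mainly from dropping `-u`. On top of that, `-f` makes `cp` remove a destination file it cannot open and try again.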
@@ -107,7 +107,8 @@ else
sleep 10
done
-# Enable/start/test munge service
+# Enable/start munge service
+sleep 3
cp -f /vagrant/munge.key /etc/munge/munge.key
chown munge:munge /etc/munge/munge.key
chmod 400 /etc/munge/munge.key
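
Note on the `sleep 3` added above: a fixed delay narrows a race but cannot rule it out. The diff does not show what the race is. Assuming it is that `munge.key` can become visible in the shared folder before it has been completely written, one alternative is to poll until the file is non-empty and its size has stopped changing. This is a sketch of that swapped-in technique, not what the commit does, and `wait_for_stable_file` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): wait until FILE is
# non-empty and has the same size on two consecutive one-second checks.
# Returns 0 once the size is stable, 1 if TIMEOUT seconds pass first.
wait_for_stable_file() {
    file=$1
    timeout=${2:-30}
    elapsed=0
    last_size=-1
    while [ "$elapsed" -lt "$timeout" ]; do
        if [ -s "$file" ]; then
            # Arithmetic expansion strips any padding from wc output.
            size=$(( $(wc -c < "$file") ))
            if [ "$size" -eq "$last_size" ]; then
                return 0
            fi
            last_size=$size
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "timed out waiting for $file" >&2
    return 1
}
```

In the provision script this could stand in for the fixed delay, for example `wait_for_stable_file /vagrant/munge.key 30 || exit 1` before the `cp -f` line. A stable size is only a heuristic: it does not prove the writer has finished.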

@@ -1,4 +1,4 @@
-#slurm.conf file generated by configurator easy.html.
+# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#