Fix munge key race condition and update docs
- Add sleep to prevent munge.key race condition
- Warn about CPU override conflicts in README
- Update README with CPU config details for minimal setup
- Correct `less` command examples for prime number output
- Force update slurm/cgroups configs in the provision script
parent b1aee01586
commit 1e95dd7b2f
11
README.md
@@ -63,7 +63,7 @@ By default, each node is allocated:
 4. View the resulting prime numbers found, check `ls` for exact filenames
 
 less slurm-1_0.out
-less slurm-2_1.out
+less slurm-1_1.out
 
 ### Configuration Tool
 
@@ -96,6 +96,10 @@ ignored by .gitignore. Be cautious when using this command as it will delete
 files that are not tracked by Git. Use the `-n` flag to dry-run first.
 
 ## Global Overrides
+
+**WARNING:** Always update `slurm.conf` to match any CPU overrides to prevent
+resource allocation conflicts.
+
 If you wish to override the default settings on a global level,
 you can do so by creating a `.settings.yml` file based on the provided
 `example-.settings.yml` file:
@@ -125,6 +129,11 @@ file without modifications. This results in a cluster configuration using only
 1 vCPU and 1 GB RAM per node (totaling 4 threads/cores and 4 GB RAM), allowing
 basic operation on modest hardware.
 
+When using this minimal setup with 1 vCPU, you'll need to update the `slurm.conf` file.
+Apply the following change to the default `slurm.conf`:
+
+sed -i 's/CPUs=2/CPUs=1/g' slurm.conf
+
 ### Slurm Settings Overrides
 - `SLURM_NODES`
 - Default: `4`
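To illustrate the `sed` change documented in the README hunk above, here is a minimal sketch that applies the same substitution to a scratch file and counts how many `CPUs=` entries disagree with the expected value. The sample `NodeName` line and the `count_cpu_mismatches` helper are assumptions made for illustration; the repository's actual `slurm.conf` may look different. It assumes GNU `sed` (for `-i` without a suffix), as the README command does.

```shell
# Sketch only: the NodeName line is a hypothetical stand-in for the real slurm.conf.

# Print the number of CPUs=<n> tokens whose value differs from the expected count.
count_cpu_mismatches() {
    local conf="$1" expected="$2"
    # grep -o emits one CPUs=<n> token per line; grep -vc counts the non-matching ones.
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
    grep -o 'CPUs=[0-9]*' "$conf" | grep -vc "^CPUs=${expected}\$" || true
}

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

printf 'NodeName=node[1-4] CPUs=2 State=UNKNOWN\n' > "$workdir/slurm.conf"

before="$(count_cpu_mismatches "$workdir/slurm.conf" 1)"
sed -i 's/CPUs=2/CPUs=1/g' "$workdir/slurm.conf"
after="$(count_cpu_mismatches "$workdir/slurm.conf" 1)"

echo "mismatches before: $before, after: $after"
```

A check like this could run before provisioning to catch the conflict the new WARNING describes, though the commit itself does not add one.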
11
provision.sh
@@ -46,12 +46,12 @@ fi
 dpkg -s slurm-client &>/dev/null || apt-get install -y slurm-client
 
 # Create directories for Slurm
-mkdir -p /var/spool/slurm /var/log/slurm /etc/slurm
-chown slurm:slurm /var/spool/slurm /var/log/slurm /etc/slurm
+mkdir -p /var/spool/slurm /etc/slurm
+chown slurm:slurm /var/spool/slurm /etc/slurm
 
 # Copy slurm.conf and cgroup.conf
-cp -u /vagrant/slurm.conf /etc/slurm/slurm.conf
-cp -u /vagrant/cgroup.conf /etc/slurm/cgroup.conf
+cp -f /vagrant/slurm.conf /etc/slurm/slurm.conf
+cp -f /vagrant/cgroup.conf /etc/slurm/cgroup.conf
 chown slurm:slurm /etc/slurm/slurm.conf /etc/slurm/cgroup.conf
 chmod 644 /etc/slurm/slurm.conf /etc/slurm/cgroup.conf
 
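The switch from `cp -u` to `cp -f` in the hunk above changes when the config files are overwritten. The sketch below shows the difference on scratch files: `cp -u` skips the copy when the destination is newer than the source, while a copy without `-u` replaces it regardless of timestamps. The file names and contents are hypothetical, and a stale-but-newer destination is only one plausible reason for the change; the commit message says only that the configs are now force-updated. It assumes a `cp` that supports `-u` (GNU coreutils or BusyBox).

```shell
# Sketch only: scratch files stand in for /vagrant/slurm.conf and /etc/slurm/slurm.conf.
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT

src="$demo_dir/source.conf"
dst="$demo_dir/dest.conf"

printf 'CPUs=1\n' > "$src"   # the edited copy we want to deploy
printf 'CPUs=2\n' > "$dst"   # the stale copy already in place

# Make the destination's mtime newer than the source's.
touch -t 202001010000 "$src"
touch -t 202101010000 "$dst"

cp -u "$src" "$dst"
after_update="$(cat "$dst")"   # unchanged: -u skips because dst is newer

cp -f "$src" "$dst"
after_force="$(cat "$dst")"    # replaced: without -u, timestamps are not consulted

echo "after cp -u: $after_update"
echo "after cp -f: $after_force"
```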
@@ -107,7 +107,8 @@ else
 sleep 10
 done
 
-# Enable/start/test munge service
+# Enable/start munge service
+sleep 3
 cp -f /vagrant/munge.key /etc/munge/munge.key
 chown munge:munge /etc/munge/munge.key
 chmod 400 /etc/munge/munge.key
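The added `sleep 3` gives the shared `munge.key` a fixed grace period before it is copied. As a possible alternative, the sketch below polls until the file exists and is non-empty, and gives up after a timeout. The `wait_for_file` helper is hypothetical and not part of this commit, and it assumes the race is about the key not yet being fully written, which the diff does not state explicitly.

```shell
# Hypothetical helper, not part of this commit: poll for a non-empty file
# instead of sleeping for a fixed interval.
wait_for_file() {
    local path="$1" timeout="${2:-30}" waited=0
    while [ ! -s "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Demo on a scratch directory: a background job writes a dummy key after one second.
key_dir="$(mktemp -d)"
trap 'rm -rf "$key_dir"' EXIT
( sleep 1; printf 'dummy-key\n' > "$key_dir/munge.key" ) &

if wait_for_file "$key_dir/munge.key" 10; then
    result="ready"
else
    result="timeout"
fi
wait   # reap the background writer
echo "$result"
```

In the provision script this might be used as `wait_for_file /vagrant/munge.key 30` before the `cp -f` line; a fixed `sleep` is simpler but can be either too short or needlessly long.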
@@ -1,4 +1,4 @@
-#slurm.conf file generated by configurator easy.html.
+# slurm.conf file generated by configurator easy.html.
 # Put this file on all nodes of your cluster.
 # See the slurm.conf man page for more information.
 #