Add README and prime number finding script
parent 943b64fdcf
commit 95c7198280
88
README.md
Normal file
@@ -0,0 +1,88 @@
# Vagrant Slurm

**Warning: For demonstration/testing purposes only, not suitable for use in production**

This repository contains a `Vagrantfile` and the necessary configuration for
automating the setup of a Slurm cluster using Vagrant's shell provisioning on
Debian 12 x86_64 VMs.

### Prerequisites

This setup was developed using vagrant-libvirt with NFS for file sharing,
rather than the more common VirtualBox configuration, which typically uses
VirtualBox's Shared Folders. However, VirtualBox should work fine.

The core requirements for this setup are:
- Vagrant (with functioning file sharing)
- (Optional) Make (for convenience commands)

### Cluster Structure
- `node1`: Head Node (runs `slurmctld`)
- `node2`: Login/Submit Node
- `node3` / `node4`: Compute Nodes (run `slurmd`)

By default, each node is allocated:
- 2 threads/cores (depending on architecture)
- 2 GB of RAM

**Warning: In total, the cluster uses 8 vCPUs and 8 GB of RAM**

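The total follows from four nodes at the default per-node allocation. As a minimal sketch of that arithmetic, which may help if you change the defaults (the variable names here are illustrative and are not taken from the `Vagrantfile`):

```shell
#!/bin/bash
# Illustrative values based on the defaults described above;
# these variable names are assumptions, not Vagrantfile settings.
NODES=4
CPUS_PER_NODE=2
RAM_GB_PER_NODE=2

# Host resources consumed when all nodes are running.
TOTAL_CPUS=$((NODES * CPUS_PER_NODE))
TOTAL_RAM_GB=$((NODES * RAM_GB_PER_NODE))

echo "Total: ${TOTAL_CPUS} vCPUs, ${TOTAL_RAM_GB} GB RAM"
```
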
## Quick Start

1. To build the cluster, use either of these methods:

   Using the Makefile (recommended):

       make

   Using Vagrant directly:

       vagrant up

2. Log in to the Login Node (`node2`) as the `submit` user:

       vagrant ssh node2 -c "sudo -iu submit"


3. Run the example prime number search script:

       /vagrant/primes.sh

   By default, this script searches for prime numbers from `1-10,000` and `10,001-20,000`.

   You can adjust the range searched per node by providing an integer argument, e.g.:

       /vagrant/primes.sh 20000

   The script will then drop you into a `watch -n0.1 squeue` view so you can see
   each job computing on `node[3-4]`. You may `CTRL+c` out of this view, and
   the jobs will continue in the background. The home directory for the `submit`
   user is in the shared `/vagrant` directory, so the results from each node are
   shared back to the login node.

4. View the resulting prime numbers (check `ls` for the exact filenames):

       less slurm-1.out
       less slurm-2.out

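The integer argument in step 3 sets the size of each job's range, and the second job starts where the first one ends. A minimal sketch of how the two ranges are derived, mirroring the arithmetic in `primes.sh` (the `job_ranges` helper is hypothetical and only prints the ranges; it does not submit anything):

```shell
#!/bin/bash
# Hypothetical helper: print the two inclusive ranges that would be
# searched for a given per-job range size (default: 10000).
job_ranges() {
    local range="${1:-10000}"
    echo "job 1: 1-${range}"
    echo "job 2: $((range + 1))-$((range * 2))"
}

job_ranges          # default range
job_ranges 20000    # custom range
```

With an argument of `20000`, the first job covers 1 to 20000 and the second covers 20001 to 40000.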
### Configuration Tool

On the Head Node (`node1`), you can access the configuration tools specific to
the Slurm version distributed with Debian. Since this may not be the latest Slurm
release, it's important to use the configuration tool that matches the
installed version. To access these tools, you can use Python to run a simple
web server:

    python3 -m http.server 8080 --directory /usr/share/doc/slurm-wlm/html/

You can then access the HTML documentation via the VM's IP address at port 8080
in your web browser on the host machine.

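If you do not know the VM's IP address, one possible approach is to ask the VM for it over `vagrant ssh`. The sketch below is an assumption rather than part of this repository: `node_addresses` and `docs_url` are hypothetical helpers, and `192.0.2.10` is a placeholder address.

```shell
#!/bin/bash
# Hypothetical helper (not part of this repository): list the addresses
# assigned to a VM. Run on the host from the repository directory,
# e.g. `node_addresses node1`.
node_addresses() {
    vagrant ssh "$1" -c "hostname -I"
}

# Hypothetical helper: build the documentation URL for an address and port.
docs_url() {
    printf 'http://%s:%s/\n' "$1" "${2:-8080}"
}

# 192.0.2.10 is a placeholder; substitute an address printed by node_addresses.
docs_url 192.0.2.10
```

Depending on the provider, some of the listed addresses may not be reachable from the host (for example, a NAT-only interface), so you may need to try more than one.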
### Cleanup
To clean up files placed on the host through Vagrant file sharing:

    make clean

This command is useful when you want to remove all generated files and return
to a clean state. The Makefile is quite simple, so you can refer to it directly
to see exactly what's being cleaned up.
41
primes.sh
Executable file
@@ -0,0 +1,41 @@
#!/bin/bash

# This script finds prime numbers using the Slurm workload manager.
# It operates in two modes:
#
# 1. When run without a Slurm job ID:
#    - It accepts an optional argument to set the range (default: 10000).
#    - It submits two Slurm jobs:
#      a) First job searches for primes from 1 to RANGE.
#      b) Second job searches for primes from (RANGE + 1) to (RANGE * 2).
#    - After submission, it watches the Slurm queue, updating every 0.1 seconds.
#
# 2. When run as a Slurm job:
#    - It uses the 'factor' command to identify prime numbers within the given range.
#    - It prints each prime number found to stdout.
#    - It logs the job ID and the range being searched.
#
# Usage:
#   Without arguments: ./primes.sh
#   With custom range: ./primes.sh 20000

RANGE=${1:-10000}

function find_primes() {
    local START="$1"
    local END="$2"
    echo "INFO: Job $SLURM_JOB_ID looking for prime numbers from $START to $END"
    for ((i=START;i<=END;i++)); do
        if [ "$(factor "$i")" == "$i: $i" ]; then
            echo "$i"
        fi
    done
}

if [ -z "$SLURM_JOB_ID" ]; then
    sbatch -N1 --wrap="$0 1 $RANGE"
    sbatch -N1 --wrap="$0 $((RANGE + 1)) $((RANGE * 2))"
    watch -n0.1 squeue
else
    find_primes "$1" "$2"
fi
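The primality test in `find_primes` relies on the output format of `factor` from GNU coreutils: for a prime `N` it prints `N: N`, while a composite number is followed by several factors and `1` by none. A minimal standalone sketch of the same check, which can be run without Slurm (the `is_prime` wrapper is illustrative and is not part of `primes.sh`):

```shell
#!/bin/bash
# Illustrative wrapper: succeed (exit status 0) only if "$1" is prime.
# factor prints "N: N" exactly when N is its own sole prime factor.
is_prime() {
    [ "$(factor "$1")" = "$1: $1" ]
}

for n in 1 2 9 97; do
    if is_prime "$n"; then
        echo "$n is prime"
    else
        echo "$n is not prime"
    fi
done
```
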