We maintain a system, cuda-hw2, with 40 logical CPU cores and several General Purpose GPUs (GPGPUs):
| Count | GPU model | Memory (GB) |
|-------|-----------------|----|
| 2 | Nvidia Tesla K40c | 12 |
| 4 | Nvidia Tesla K40m | 12 |
The GPUs are intended primarily for coursework in high-performance computing and parallel programming, but we are also making them available for research purposes. Jobs submitted for courses take precedence.
We use the SLURM workload manager/job queueing system. The previous link points to the SLURM quick-start documentation; from there you can reach the rest of the SLURM documentation.
On our systems, the bash shell is configured with a global alias for `srun` that automatically selects the correct partition. Users of shells other than bash, or those submitting jobs with `sbatch`, must make sure to specify the correct partition in their job submissions.
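For example, a batch script can pin the partition explicitly with an `#SBATCH` directive. This is only a sketch: the partition name `cpeg655` and the job binary are illustrative, so substitute the partition for your course and your own program.

```shell
#!/bin/bash
# Illustrative batch script -- replace 'cpeg655' with your course's
# partition and './myjob' with your own executable.
#SBATCH -p cpeg655          # select the partition explicitly
#SBATCH -n 1                # one task
#SBATCH -t 00:10:00         # ten-minute time limit

srun ./myjob
```

Submit the script with `sbatch script.sh`; the `#SBATCH` lines are read by SLURM, so no alias is involved.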
You will log in to a virtual machine tailored to your course’s needs. The following table enumerates the VMs available at the time of this writing (others may be added):
Use the SLURM partition from the above table corresponding to your course.
Again, this is the default behavior of `srun` if you are using the bash shell:

```shell
cpeg655:~$ alias srun
alias srun='srun -p cpeg655'
```
Log in to the research login node. All jobs submitted for research purposes should be launched from there, using the research partition for all jobs.
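As a sketch (the job names here are placeholders), a research job would be submitted against that partition with the standard `-p` flag:

```shell
# Interactive run on the research partition:
srun -p research ./my_research_job

# Or as a batch job:
sbatch -p research my_research_job.sh
```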
This allows research jobs to be preempted by academic jobs, which will often be operating under strict deadlines.
We are looking into options for dealing with overcommitted resources, such as job suspension and requeuing. For now, your long-running jobs may be killed to make room for new academic jobs; this will only happen when there aren't enough resources to run the academic jobs at the time they are submitted.
As of January 2019, the system packages of MPI and CUDA are in use. You may use these without altering your environment.
Other high-performance computing software (compilers, MPI implementations, etc.) may be available, or can be installed upon request. Use the `module` command to inspect what is currently available (`module avail`) and load what you need (`module load <name>`).
GPUs are exposed by SLURM as a Generic Resource (GRES). Example run using 2 GPUs:

```shell
srun --gres=gpu:2 ./myjob
```

See the SLURM documentation on generic resources for more.
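The batch-job equivalent (script layout and binary name are placeholders) requests the GPUs with an `#SBATCH` directive:

```shell
#!/bin/bash
#SBATCH --gres=gpu:2        # request two GPUs on the node
#SBATCH -n 1                # one task

srun ./myjob                # placeholder for your GPU binary
```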
cuda-hw2 has 2x Intel Xeon E5-2630 CPUs clocked at 2.20 GHz. Each CPU contains 10 cores, and each core is dual-threaded, so 2 x 10 x 2 = 40 logical cores are available for CPU-level parallelism.
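You can confirm the logical core count a process sees with a one-liner from the Python standard library (run it on the node, e.g. via `srun`):

```python
import os

# Number of logical CPUs visible to the OS; on cuda-hw2 this
# should report 40 (2 sockets x 10 cores x 2 hardware threads).
print(os.cpu_count())
```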
Here we are requesting 4 processes with the `-n` flag:

```shell
% srun -n 4 mpi_program
Hello world from processor cuda-hw2, rank 2 out of 4 processors
Hello world from processor cuda-hw2, rank 0 out of 4 processors
Hello world from processor cuda-hw2, rank 1 out of 4 processors
Hello world from processor cuda-hw2, rank 3 out of 4 processors
```
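For reference, a minimal MPI source file consistent with the output above might look like the following sketch (the name `mpi_program` is assumed from the transcript; build with `mpicc hello.c -o mpi_program`):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    MPI_Get_processor_name(name, &len);     /* host name, e.g. cuda-hw2 */

    printf("Hello world from processor %s, rank %d out of %d processors\n",
           name, rank, size);

    MPI_Finalize();
    return 0;
}
```

Ranks print in arbitrary order, which is why the transcript above is not sorted by rank.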
Here we are requesting 4 threads with the `-c` flag:

```shell
% srun -c 4 omp_program
Hello from thread 1
Hello from thread 3
Hello from thread 0
Hello from thread 2
```
Here we are requesting 2 processes (`-n 2`), each with two threads (`-c 2`):

```shell
% mpicc -fopenmp ompi.c -o ompi
% srun -n 2 -c 2 ./ompi
Hello from thread 1 out of 2 from process 2 out of 2 on cuda-hw2
Hello from thread 2 out of 2 from process 2 out of 2 on cuda-hw2
Hello from thread 1 out of 2 from process 1 out of 2 on cuda-hw2
Hello from thread 2 out of 2 from process 1 out of 2 on cuda-hw2
```