We maintain a system with 40 logical CPU cores and several general-purpose GPUs (GPGPUs). Current hardware:
|Quantity||Model||Memory (GB)|
|2||Nvidia Tesla K40c||12|
|4||Nvidia Tesla K40m||12|
The primary purpose of the GPUs is coursework related to high-performance computing and parallel programming. We are also making them available for research purposes. Jobs submitted for courses will take precedence.
You will log in to a virtual machine tailored to your course’s needs. The following table enumerates the VMs available at the time of this writing (although others may be added):
System packages of OpenMPI and CUDA are being used as of January 2019. You may use these without altering your environment.
Other high-performance computing software (such as compilers and MPI implementations) is also installed. Use the module command to inspect what is available (i.e. module avail). Load desired modules with module load.
We use the SLURM workload manager/job queueing system. The SLURM quick start documentation is the best entry point; from there you can access all other documentation on SLURM.
The bash shell is configured via a global alias for srun to automatically use the correct partition:
cpeg655:~$ alias srun
alias srun='srun -p cpeg655'
Users of shells other than bash should make sure to call srun -p [PARTITION] when submitting jobs to SLURM. Users of sbatch should likewise make sure to specify the correct partition.
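As a rough illustration of specifying the partition for a batch job, a minimal batch script might look like the sketch below. The partition name cpeg655 is borrowed from the alias example above; the output file pattern and the helper function report_partition are made up for illustration. SLURM_JOB_PARTITION is an environment variable SLURM sets inside a running job, so the script falls back to a placeholder when run by hand.

```shell
#!/bin/bash
#SBATCH --partition=cpeg655      # partition borrowed from the alias example; use your course's
#SBATCH --ntasks=1               # a single task is enough for this sketch
#SBATCH --output=hello-%j.out    # %j expands to the job ID (file name is illustrative)

# Print the partition the job landed in. SLURM_JOB_PARTITION is only
# defined inside a SLURM job, so provide a fallback for manual runs.
report_partition() {
    echo "partition: ${SLURM_JOB_PARTITION:-not running under SLURM}"
}

report_partition
```

Saved under a hypothetical name such as hello.sbatch, the script would be submitted with sbatch hello.sbatch. Passing -p [PARTITION] on the sbatch command line should work as well.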
GPUs are exposed by SLURM as a Generic Resource (GRES). Example run (using 2 GPUs):
srun -p cisc360 --gres=gpu:2 ./myjob
See the SLURM documentation on Generic Resources (GRES) for more.
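One way to check what a job step actually received is to inspect CUDA_VISIBLE_DEVICES from inside the job. The sketch below assumes SLURM exports that variable as a comma-separated list of device indices for GPU allocations. This is common but depends on how the site's GRES configuration is set up, so treat it as an assumption. The function name count_visible_gpus is hypothetical.

```shell
#!/bin/bash
# Count the GPUs visible to this job step.
# Assumption: CUDA_VISIBLE_DEVICES holds a comma-separated list such as "0,1".
count_visible_gpus() {
    local devices="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devices" ]; then
        # Variable unset or empty: no GPUs were exposed to this process.
        echo 0
        return
    fi
    # Split the list on commas and count the resulting fields.
    local IFS=','
    set -- $devices
    echo "$#"
}

echo "GPUs visible: $(count_visible_gpus)"
```

Saved as, say, count_gpus.sh (an illustrative name), it could be run with srun -p cisc360 --gres=gpu:2 ./count_gpus.sh to confirm that two devices are exposed.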
cuda0 has 2x Intel Xeon E5-2630 CPUs clocked at 2.20GHz.
Each CPU contains 10 cores, each of which is dual-threaded.
This gives 2 x 10 x 2 = 40 logical cores available for CPU-level parallelism.
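The core count above is sockets times cores per socket times threads per core. A small sketch of that arithmetic follows, with a comparison against what the current process can use. The function name logical_cores is illustrative. nproc comes from GNU coreutils and may report fewer cores inside a SLURM job, since it counts only the CPUs available to the calling process.

```shell
#!/bin/bash
# logical cores = sockets x cores per socket x threads per core
logical_cores() {
    echo $(( $1 * $2 * $3 ))
}

echo "expected logical cores: $(logical_cores 2 10 2)"

# nproc reports the CPUs usable by this process; skip it if not installed.
if command -v nproc >/dev/null 2>&1; then
    echo "usable by this process: $(nproc)"
fi
```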
«mpi» is the name of the compiled program. Here we are requesting 4 processes with the -n option:
cisc372% srun -p cisc372 -n 4 mpi
Hello world from processor cuda-hw2, rank 2 out of 4 processors
Hello world from processor cuda-hw2, rank 0 out of 4 processors
Hello world from processor cuda-hw2, rank 1 out of 4 processors
Hello world from processor cuda-hw2, rank 3 out of 4 processors
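To see what the -n option does without compiling an MPI program, the sketch below prints the task index that SLURM assigns to each launched copy. It is not MPI. It only reads SLURM_PROCID and SLURM_NTASKS, which srun sets for every task, and it falls back to a single task when run outside SLURM. The function name hello_task is hypothetical.

```shell
#!/bin/bash
# Each copy launched by srun gets its own SLURM_PROCID (0-based)
# and the shared SLURM_NTASKS (total number of tasks).
hello_task() {
    echo "Hello from task ${SLURM_PROCID:-0} of ${SLURM_NTASKS:-1} on $(uname -n)"
}

hello_task
```

Run with srun -p cisc372 -n 4 ./hello_task.sh (an illustrative file name), this should print four lines, one per task, in no particular order.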
«omp» is the name of the compiled program. Here we are requesting 4 threads with the -c option:
cisc372% srun -p cisc372 -c 4 omp
Hello from thread 1
Hello from thread 3
Hello from thread 0
Hello from thread 2
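If you prefer to set the thread count explicitly rather than rely on the OpenMP runtime's default, a small wrapper can derive OMP_NUM_THREADS from the allocation. This is a sketch, not the site's configuration. SLURM_CPUS_PER_TASK is set by SLURM only when -c (--cpus-per-task) is given, so the wrapper defaults to 1. The function name omp_thread_count is illustrative.

```shell
#!/bin/bash
# Derive the OpenMP thread count from the SLURM allocation.
# SLURM_CPUS_PER_TASK is only defined when -c/--cpus-per-task was requested.
omp_thread_count() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

export OMP_NUM_THREADS="$(omp_thread_count)"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Uncomment to launch the compiled program from the example above:
# exec ./omp
```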