CLIC Servers
Connecting to a Server
To connect to a machine, open a shell and type:
ssh [-Y] username@serveraddress
using your UNITN username and password. As for serveraddress, it can be any of the following:
- masterclic.cimec.unitn.it - cluster connected to several CPU-only nodes
- clicgpu1.cimec.unitn.it - GPU server
- clicgpu2.cimec.unitn.it - GPU server
- clicgpu3.cimec.unitn.it - GPU server
- clicgpu4.cimec.unitn.it - GPU server
- clicgpu5.cimec.unitn.it - GPU server
Note: The -Y flag is optional. Use it if you need to run a graphical application on the server (e.g., matlab).
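For example, to log into clicgpu3 with X11 forwarding enabled (jane.doe is the placeholder username used throughout this page):
ssh -Y jane.doe@clicgpu3.cimec.unitn.it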
Tip: if you access the machines through a terminal, it can be useful to use the `tmux` terminal multiplexer (already installed on all servers; see Software below). The main advantage is that processes started within `tmux` keep running even after you disconnect from the server. Another advantage is the ability to keep several virtual terminals open at once in the same terminal window.
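A minimal `tmux` workflow could look like this (the session name is just an example):
tmux new -s myjob      # start a named session; run your job inside it
# detach with Ctrl-b then d (the job keeps running on the server)
tmux attach -t myjob   # reattach after logging back in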
TODO: explain how to schedule work on the CPU nodes managed by masterclic.
Hardware
Note: This description is valid as of August 2023, but it may become outdated. You can use `lscpu`, `htop`, `nvidia-smi`, and `lspci` to figure out what hardware is installed.
Note: We report the number of cores shown by `htop`. This is the number of logical cores (hardware threads), which on CPUs with hyperthreading/SMT is typically twice the number of physical cores.
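For instance, a quick way to query this information yourself on any of the servers:
lscpu | grep -E '^Model name|^CPU\(s\)'                  # CPU model and logical core count
nvidia-smi --query-gpu=name,memory.total --format=csv    # GPU models and memory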
clicgpu1
- CPU: Xeon CPU E5-2683 v4 @ 2.10GHz, 64 Cores
- GPUs:
- 1 x Quadro P6000 (24 GiB, released 2016)
- 2 x Tesla K80 (12 GiB, released 2014)
- RAM: 256 GiB
clicgpu2
- CPU: Core i7-5930K CPU @ 3.50GHz, 12 Cores
- GPUs:
- 2 x TITAN V (12 GiB, released 2017)
- 1 x TITAN X (12 GiB, released 2015)
- RAM: 128 GiB
clicgpu3
- CPU: Core i9-7920X CPU @ 2.90GHz, 24 Cores
- GPUs:
- 2 x A5000 (24 GiB, released 2021)
- 1 x Quadro P6000 (24 GiB, released 2016)
- RAM: 256 GiB
clicgpu4
- CPU: Ryzen Threadripper PRO 3955WX, 32 Cores
- GPUs:
- 2 x GeForce RTX 3090 (24 GiB, released 2020)
- RAM: 256 GiB
clicgpu5
- CPU: AMD Ryzen Threadripper PRO 5975WX, 32 Cores
- GPUs:
- 1 x GeForce RTX 4090 (24 GiB, released 2022)
- RAM: 256 GiB
TODO: add masterclic.
Storage
Each server has its own local storage, where users' home directories live. Your local home directory is at:
/home/jane.doe
Important: Local storage is limited and shared across all users. If you fill up your local home, you leave less space for everyone else. For this reason, local storage should only be used for scripts and configuration files. Large data sets and the results of large experiments should be stored on remote storage, not in your local home.
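To check how much space your local home uses and how much is left on the local disk (using the placeholder username jane.doe):
du -sh /home/jane.doe    # total size of your local home
df -h /home              # free space on the local disk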
The remote storage is shared across servers and can be found here:
/mnt/cimec-storage6 (30 TiB)
/mnt/povobackup (8 TiB)
Every user is given access to a (private) remote home at:
/mnt/cimec-storage6/users/jane.doe
Please store all your large files here. There is also a shared directory that all users have access to at:
/mnt/cimec-storage6/shared/
Use this directory for sharing things with other users.
Note: the directory structure of povobackup is not as well defined. Use with caution.
Note: other remote storage options are deprecated, and all data contained therein will be moved to storage6 in the near future. Please refrain from using them, and use storage6 (or povobackup) instead.
Storage Policy
If you are a thesis student or a Ph.D. student, please note that the contents of your local home (/home/jane.doe) will be ERASED within 2 months after the completion of your studies.
Please make sure to move everything you do not want to lose (data, results, and scripts) to your remote home (that is, /mnt/cimec-storage6/users/jane.doe) within two months of your graduation.
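For example, a sketch of how to move a project directory to remote storage with `rsync` (myproject is a hypothetical directory; paths use the placeholder username jane.doe):
rsync -av /home/jane.doe/myproject/ /mnt/cimec-storage6/users/jane.doe/myproject/
rm -rf /home/jane.doe/myproject    # only after verifying the copy succeeded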
Software
All servers run Linux (CentOS 7, CentOS 8, or Rocky Linux, depending on the machine).
Available software includes:
- Tools: htop, mc, tmux, screen, vim, nano
- Python:
- python 2.7 (clicgpu 1, 2, 3, 4)
- python 3.6 (clicgpu 1, 2, 4)
- python 3.9 (clicgpu 4)
- you can use miniconda (https://conda.io/) to install your favorite Python version and libraries in your home directory
- R 3.6.0 (clicgpu 1, 2)
- Matlab R2016b (clicgpu 1, 2)
- CUDA 11.4 (clicgpu 1) or CUDA 12 (clicgpu 2, 3, 4)
- Other: perl, Qt4, openjdk
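To check what is actually available on the server you are logged into (output varies per machine):
python3 --version    # system Python 3
R --version          # R, where installed
nvidia-smi           # GPU driver and the CUDA version it supports
nvcc --version       # CUDA toolkit version, if nvcc is on your PATH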
If you need other software, open a ticket here:
https://servicedesk.unitn.it/goauth/en?sf=SF00445
Setting Up Python/Conda
Python is installed by default on all servers. However, we strongly suggest using `miniconda` for your Python needs. This allows you to:
- Use environments to manage the dependencies of your projects.
- Install arbitrary python versions (say, 3.10).
- Install arbitrary packages.
- Easily install the cuda libraries to interact with the GPUs.
- Share environments across servers.
Note: while you can install miniconda in your local home, we strongly encourage you to install it in your remote home. This way, the conda environments you create and the packages cached by conda do not take up space in your local home. This is easy to do by following these steps:
- Download a miniconda linux installation script from:
https://docs.conda.io/en/latest/miniconda.html#linux-installers
for instance using `wget`:
wget https://repo.anaconda.com/miniconda/Miniconda3-py38_23.5.2-0-Linux-x86_64.sh
- Execute the script:
bash ./Miniconda3-py38_23.5.2-0-Linux-x86_64.sh
and tell it to install miniconda within your remote home, for instance in:
/mnt/cimec-storage6/users/jane.doe/miniconda3
Also, tell the installer to initialize conda automatically when you connect to the machine. This adds an initialization snippet to your .bashrc file, which can be found in your local home.
- Log out and log in again. If conda is working correctly, you should see a “(base)” string at the beginning of the terminal prompt. If so, you are all set!
- The above steps install conda on the remote storage and additionally ensure it runs automatically every time you access the one server you installed conda from (say, clicgpu1).
To get it to run automatically on the other servers (clicgpu2, 3, and 4) as well, you have to edit the .bashrc files on those servers. Simply copy the conda initialization snippet from the clicgpu1 .bashrc file into the .bashrc files of the other servers.
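Alternatively, instead of copying the whole snippet, you can add these two lines to the .bashrc on each other server (a sketch, assuming miniconda was installed at the remote path used above):
source /mnt/cimec-storage6/users/jane.doe/miniconda3/etc/profile.d/conda.sh    # make conda available in this shell
conda activate base                                                            # start in the base environment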
Now, every time you log into a server, you will have access to the environments you have created on the other servers, along with all the packages they contain.
Example: if, on clicgpu1, you create a new environment with:
conda create -n myenvironment39 python=3.9
then you can activate the same environment from a different server (say, clicgpu3) with:
conda activate myenvironment39
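Once an environment is active, you can install whatever packages you need into it; the packages below are just examples:
conda install numpy     # install a package with conda
pip install requests    # pip also works inside a conda environment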