This guide is intended as a quick reference for users who are already familiar with HPC compute environments.
If you don't yet have an account on Rosalind, please see: Requesting an Account on Rosalind
Rosalind is accessible from the KCL network. It is linked to the KCL Active Directory system, so you should be able to connect over SSH with your KCL user credentials, e.g. ssh <username>@login.rosalind.compute.estate
If you are not on campus, you will need to log into the KCL VPN. Details can be found on the KCL IT Website. If you have issues getting connected to the KCL VPN, then please contact the KCL IT helpdesk on 020 7848 8888
NOTE: login.rosalind.compute.estate is currently set to round-robin address resolution, which occasionally produces errors containing the following message:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@       WARNING: POSSIBLE DNS SPOOFING DETECTED!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Until this is fixed, you can log in to either login node directly by IP address (10.202.64.28 or 10.202.64.29), e.g. ssh <username>@10.202.64.28
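If you connect by IP address regularly, one option is to keep the address in a small SSH client configuration file rather than typing it each time. The sketch below is only an illustration: the alias rosalind-login1, the file name rosalind_ssh_config and the username k1234567 are placeholders, and it simply pins the first of the two addresses given above.

```shell
# Write a small SSH client config that pins one login node by IP.
# The alias, file name and username are placeholders; adjust them to your account.
cat > rosalind_ssh_config <<'EOF'
Host rosalind-login1
    HostName 10.202.64.28
    User k1234567
EOF

# Connect through the alias (run this yourself; it needs network access):
#   ssh -F rosalind_ssh_config rosalind-login1
```

Once the round-robin issue is resolved, the HostName line can be switched back to login.rosalind.compute.estate.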
If you are a collaborator from an external institution who has been given an account on the cluster, you will need to log in to Rosalind's own VPN. This will give you access to Rosalind services, but not to the rest of the KCL network.
You should have received an OpenVPN configuration file and a certificate file when your account was created. If you don't have these files, please contact your Rosalind administrator. You will need an OpenVPN client on your machine in order to connect. Please see the list below for suggestions. You should authenticate on the VPN with the cluster username and password received when your account was created. If you have issues getting connected to the Rosalind VPN, please contact your Rosalind Administrator for assistance.
Once connected to the VPN, you should be able to log in to the login nodes with your cluster credentials, e.g. ssh <username>@login.rosalind.compute.estate
When you connect to the login node, you will be placed in your home directory. Users' home directories are on the /users file system, which is mounted on all of the compute nodes and login nodes. /users has a 10TB capacity, so each user will have a quota of about 20GB.
Your HPC workloads should read from and write to the high-performance Lustre file system. This is mounted on the login nodes and compute nodes, and all users have their own Lustre directory, which can be accessed via a symbolic link in your home directory (e.g. ~/brc_scratch). If you are collaborating with other Rosalind users and you require a shared working directory on the Lustre storage, contact your Rosalind Administrator. The Lustre file system also has a quota. For the moment, we give a default Lustre quota of 1TB, but this will be reduced. To find out what your quota is and how much data you have, use the command lfs quota /mnt/lustre, e.g.
[k1234567@login1(rosalind)~]$ lfs quota /mnt/lustre
Disk quotas for user k1234567 (uid 103800009):
Filesystem    kbytes     quota      limit      grace  files   quota    limit    grace
/mnt/lustre   607898384  610445160  620000000  -      130523  2800000  3000000
Disk quotas for group k1234567 (gid 103800009):
Filesystem    kbytes     quota      limit      grace  files   quota    limit    grace
/mnt/lustre   607898384  0          0          -      130523  0        0
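The kbytes figures in this output are hard to read at a glance. As a rough sketch, the helper below (quota_summary is a hypothetical name, not part of Lustre) uses awk to convert the used and hard-limit columns to GiB and report the percentage used. It assumes each file system's values appear on a single line, as in the sample above, and it skips lines where the limit is 0 (no limit set).

```shell
# Sample user line copied from the output above, so the sketch runs anywhere.
sample='/mnt/lustre 607898384 610445160 620000000 - 130523 2800000 3000000'

quota_summary() {
    # Columns: 1=file system, 2=kbytes used, 3=soft quota, 4=hard limit
    awk '$1 == "/mnt/lustre" && $4 > 0 {
        printf "%.1f GiB used of %.1f GiB limit (%.0f%%)\n",
               $2 / 1048576, $4 / 1048576, 100 * $2 / $4
    }'
}

printf '%s\n' "$sample" | quota_summary
```

On the cluster, the same function could be fed real data with lfs quota /mnt/lustre | quota_summary.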
The login node has the fastest external connection and can be used to transfer data to and from the cluster. Data can be transferred with rsync or scp e.g.
rsync -avz ./sourceDir <username>@login.rosalind.compute.estate:~/brc_scratch
scp -r ./sourceDir <username>@login.rosalind.compute.estate:~/brc_scratch
The scheduling software Open Grid Scheduler can be used to launch jobs on Rosalind. Open Grid Scheduler is a descendant of Sun Grid Engine (so you may see it referred to as SGE), and the commands and usage are almost identical.
Interactive jobs can be launched using the qrsh command.
Batch jobs can be launched using the qsub command.
If you're unfamiliar with Grid Engine, have a look at the Open Grid Scheduler Docs.
For more specific information about available queues, parallel environments, quotas and other Rosalind-specific Grid Engine settings, see: Rosalind Open Grid Scheduler Configuration
Standard Linux build tools are available, and users are welcome to install and/or develop software in their own scratch space (e.g. ~/brc_scratch). There is also software available in /opt/gridware, and reasonable requests for software installation there will be accommodated.
Rosalind uses the Environment Modules package to allow users to load appropriate environment variables to use a particular version of an installed piece of software.
You can call module in your Grid Engine script to set up your environment appropriately, e.g.:
module add bioinformatics/R/3.2.1
More details can be found at http://modules.sourceforge.net/ or in the module man page.
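To show where the module call might sit in practice, here is a minimal sketch of a Grid Engine batch script. The file name r_job.sh, the job name r_example and the R script analysis.R are hypothetical, and the #$ options shown are common Grid Engine defaults rather than Rosalind-specific requirements; check the Rosalind scheduler configuration for queue and resource settings.

```shell
# Write a minimal Grid Engine job script (file and script names are hypothetical).
cat > r_job.sh <<'EOF'
#!/bin/bash
# Run the job under bash
#$ -S /bin/bash
# Start in the directory the job was submitted from
#$ -cwd
# Merge stderr into stdout
#$ -j y
# Job name shown in the queue listing
#$ -N r_example

# Load the R version named above, then run a hypothetical analysis script
module add bioinformatics/R/3.2.1
Rscript analysis.R
EOF

# Submit from a login node with: qsub r_job.sh
```

Keeping the module call inside the script means the job sets up its own environment on whichever compute node it runs on, rather than relying on what was loaded in the login shell.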
All users log in through the login nodes and login node resources are limited.
Login nodes are for logging in and for transferring data files. Other work should be done through the Grid Scheduler.
If you run anything directly on the login node that uses a significant amount of CPU or memory then you can slow down (or potentially even crash) the login node, which will not make you popular with other users.
Even seemingly trivial things can use significant resources when dealing with big datasets. For example, opening large genomic data files in a text editor can exhaust the RAM on a login node.
Anything found running on a login node that is consuming a significant amount of resources or is found to be slowing access to the file system is liable to be terminated without warning.
It is the user's responsibility to ensure that their data are stored with appropriate file and folder permissions. It is therefore essential that users understand how these permissions work, know how to control access, and are familiar with the commands chmod and chgrp. An introduction to Linux file permissions can be found in the following two links:
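As a minimal illustration of chmod and chgrp, the sketch below creates a directory that the owner can fully use, group members can read, and other users cannot access at all. The directory and file names are hypothetical, and the example uses your own primary group as a stand-in; for a real collaboration you would substitute the name of a shared group.

```shell
# Create a hypothetical shared directory with one file in it
mkdir -p shared_results
touch shared_results/data.txt

# Assign a group; our own primary group stands in for a real project group
chgrp -R "$(id -gn)" shared_results

# Directory: rwx for owner, r-x for group, nothing for others
chmod 750 shared_results
# File: rw- for owner, r-- for group, nothing for others
chmod 640 shared_results/data.txt

# Inspect the result
ls -ld shared_results shared_results/data.txt
```

The first column of the ls output shows the resulting permission bits, which is a quick way to confirm that others have no access.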
If your research has made use of Rosalind, you should make sure to acknowledge this in your publications. Details can be found here: Acknowledging Rosalind