HPC Quick Start

This guide is intended as a quick reference for users who are already familiar with HPC compute environments.

Getting an Account

If you don't yet have an account on Rosalind, please see: Requesting an Account on Rosalind

Logging on

Secure shell (SSH) access to the login nodes is available directly from within the KCL network. The hostname is: login.rosalind.kcl.ac.uk
Your connection will be automatically directed to one of two login nodes. If you need access to a specific login node, you can SSH between them or reach them directly via their individual hostnames: login1.rosalind.kcl.ac.uk, login2.rosalind.kcl.ac.uk
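
As a minimal sketch of the connection options above, the helper below wraps the SSH command for either the load-balanced hostname or a specific login node. The function name rosalind_ssh and the ROSALIND_USER variable are illustrative only and are not provided by the cluster; k1234567 stands in for your own username.

```shell
# Hypothetical helper: ROSALIND_USER and rosalind_ssh are illustrative names.
ROSALIND_USER="${ROSALIND_USER:-k1234567}"   # replace with your own username

# Connect to the load-balanced alias by default, or to a named login node.
rosalind_ssh() {
    node="${1:-login}"   # login, login1 or login2
    ssh "${ROSALIND_USER}@${node}.rosalind.kcl.ac.uk"
}
```

Running rosalind_ssh with no argument connects to login.rosalind.kcl.ac.uk; rosalind_ssh login2 connects to the second login node directly.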

VPN access

If you require access from outside of the KCL network you will need to connect to either the KCL VPN (see http://www.kcl.ac.uk/it) or the Rosalind VPN first.
You should have received the OpenVPN configuration file for Rosalind when your account was created. You will also need an OpenVPN client for this; some recommended programs are:

  • macOS: Tunnelblick (https://tunnelblick.net); install, then import or double-click the .ovpn config file
  • Windows: OpenVPN Community Project edition (https://openvpn.net/index.php/open-source/downloads.html); install, then import the .ovpn config file or copy it to the OpenVPN configuration folder (e.g. C:\Users\<username>\OpenVPN\config)
  • Linux: install openvpn from your package manager and copy the .ovpn config file into the OpenVPN configuration directory (e.g. /etc/openvpn). You should then be able to connect with, e.g., sudo openvpn --config <configfile>. You may also be able to configure the connection from your distro's desktop GUI; see this guide for an example: https://askubuntu.com/questions/187511/how-can-i-use-a-ovpn-file-with-network-manager
  • iOS/Android: although not supported by the Rosalind team, the OpenVPN Connect app is available on these platforms. You will need to copy the config file to your device.

Directory structure

When you connect to the login node, you will be placed in your home directory. Users' home directories are on the /users file system, which is mounted on all of the compute nodes and login nodes. /users has a capacity of 10TB, so each user has a quota of about 20GB.

Your HPC workloads should read from and write to the high-performance Lustre file system. This is mounted on the login nodes and compute nodes, and all users have their own Lustre directory, which can be accessed via a symbolic link in your home directory (e.g. ~/brc_scratch). If you are collaborating with other Rosalind users and you require a shared working directory on the Lustre storage, contact your Rosalind Administrator. The Lustre file system also has a quota. For the moment the default Lustre quota is 1TB, but this will be reduced. To find out what your quota is and how much data you have stored, use the command lfs quota /mnt/lustre, e.g.

[k1234567@login1(rosalind)~]$ lfs quota /mnt/lustre
Disk quotas for user k1234567 (uid 103800009):
Filesystem      kbytes     quota     limit grace  files   quota   limit grace
/mnt/lustre  607898384 610445160 620000000     - 130523 2800000 3000000
Disk quotas for group k1234567 (gid 103800009):
Filesystem      kbytes     quota     limit grace  files   quota   limit grace
/mnt/lustre  607898384         0         0     - 130523       0       0
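
To make the quota report easier to read at a glance, the sketch below turns it into a single percentage. It assumes the column layout shown in the sample output above (kbytes used in the second field, the quota in the third) and that the per-user entry is the first /mnt/lustre line; the function name lustre_quota_percent is illustrative, and output that wraps or differs in layout would need a different parser.

```shell
# Hypothetical helper: print the percentage of the block quota in use.
# Reads `lfs quota` output on stdin and uses the first /mnt/lustre line
# that has a non-zero quota (the per-user entry in the sample output).
lustre_quota_percent() {
    awk '$1 == "/mnt/lustre" && $3 > 0 {
        # $2 = kbytes used, $3 = quota in kbytes
        printf "%.1f\n", 100 * $2 / $3
        exit
    }'
}

# On a login node:
#   lfs quota /mnt/lustre | lustre_quota_percent
```

For the sample output above this prints 99.6, meaning the user is very close to their quota.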

Transferring data on/off the cluster

The login node has the fastest external connection and can be used to transfer data to and from the cluster. Data can be transferred with rsync or scp, e.g.

rsync -avz ./sourceDir <username>@login.rosalind.kcl.ac.uk:~/brc_scratch

scp -r ./sourceDir <username>@login.rosalind.kcl.ac.uk:~/brc_scratch
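
For larger transfers it can help to preview what rsync would copy before committing to it. The following is a sketch under assumptions: push_to_scratch and ROSALIND_USER are illustrative names, and the destination is the ~/brc_scratch link described earlier. The -n (dry run) and --partial options are standard rsync flags.

```shell
# Hypothetical helper: preview, then perform, a copy into Lustre scratch.
# ROSALIND_USER and push_to_scratch are illustrative names.
ROSALIND_USER="${ROSALIND_USER:-k1234567}"   # replace with your own username

push_to_scratch() {
    src="$1"
    dest="${ROSALIND_USER}@login.rosalind.kcl.ac.uk:~/brc_scratch"
    # -n (--dry-run) lists what would be transferred without copying anything.
    rsync -avzn "$src" "$dest" || return 1
    # --partial keeps partially transferred files so an interrupted copy
    # does not have to start again from scratch.
    rsync -avz --partial "$src" "$dest"
}
```

For example, push_to_scratch ./sourceDir first lists the files that would be sent and then transfers them.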

Launching jobs

The scheduling software Open Grid Scheduler can be used to launch jobs on Rosalind. Open Grid Scheduler is a descendant of Sun Grid Engine (so you may see it referred to as SGE), and the commands and usage are almost identical.

Interactive jobs can be launched using the qrsh command.

Batch jobs can be launched using the qsub command.

If you're unfamiliar with Grid Engine, have a look at the Open Grid Scheduler Docs.

For more specific information about available queues, parallel environments, quotas and other Rosalind-specific Grid Engine settings, see: Rosalind Open Grid Scheduler Configuration
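
As an illustration of a batch job, here is a minimal sketch of a Grid Engine script. The file name hello.sh, the job name and the requested run time are assumptions chosen for the example; the queues and resource limits that actually apply on Rosalind are described in the configuration page mentioned above.

```shell
#!/bin/bash
# hello.sh: a minimal Grid Engine batch script (illustrative values throughout).
# Job name:
#$ -N hello
# Run from the directory the job was submitted from:
#$ -cwd
# Merge stderr into stdout:
#$ -j y
# Requested wall-clock limit (an assumed value; check the Rosalind configuration):
#$ -l h_rt=00:10:00

# Grid Engine sets JOB_NAME and JOB_ID inside a running job; the fallbacks
# only apply when the script is run by hand for testing.
job_label="${JOB_NAME:-hello}.${JOB_ID:-local}"
echo "Job ${job_label} running on $(hostname)"
```

Submit the script with qsub hello.sh and check its state with qstat. With the options above, the job's output is written to a file named hello.o<jobid> in the submission directory.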

Software

Standard Linux build tools are available, and users are welcome to install and/or develop software in their own scratch space (e.g. ~/brc_scratch). Software is also available in /opt/gridware, and reasonable requests for software installation there will be catered for.

Modules

Rosalind uses the Environment Modules package to allow users to load appropriate environment variables to use a particular version of an installed piece of software.

You can call module in your Grid Engine script to set up your environment appropriately, e.g.:

module add bioinformatics/R/3.2.1

More details can be found at http://modules.sourceforge.net/ or in the module man page.
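
When a job needs several modules, it can be convenient to load them in one place and record what was loaded in the job log. The fragment below is a sketch only: load_modules is an illustrative name rather than a Rosalind-provided command, and whether module add returns a failure status for a missing module varies between Environment Modules versions.

```shell
# Sketch of a job-script fragment that loads modules before running software.
# load_modules is an illustrative name, not a Rosalind-provided command.
load_modules() {
    for m in "$@"; do
        # Stop if `module add` reports a failure (exit-status behaviour
        # varies between Environment Modules versions).
        module add "$m" || return 1
    done
    # Show what is now loaded, for the job log.
    module list
}

# In a Grid Engine script:
#   load_modules bioinformatics/R/3.2.1
#   Rscript my_analysis.R    # my_analysis.R is a placeholder
```

Use module avail on the cluster to see which module names are actually installed.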

WARNING: Do Not Use Login Nodes for Work

All users log in through the login nodes and login node resources are limited.

Login nodes are for logging in and for transferring data files. Other work should be done through the Grid Scheduler.

If you run anything directly on the login node that uses a significant amount of CPU or memory then you can slow down (or potentially even crash) the login node, which will not make you popular with other users.

Even seemingly trivial things can use significant resources when dealing with big datasets; for example, opening large genomic data files in a text editor can exhaust the RAM on a login node.

Anything found running on a login node that is consuming a significant amount of resources or is found to be slowing access to the file system is liable to be terminated without warning.

Security and file permissions

It's the user's responsibility to ensure that their data are stored with appropriate file and folder permissions. It's therefore essential that users understand how these permissions work, know how to control access, and are familiar with the commands chmod and chgrp. An introduction to Linux file permissions can be found at the following two links:

https://www.linux.com/learn/understanding-linux-file-permissions

http://ryanstutorials.net/linuxtutorial/permissions.php
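
As a short, hedged illustration of chmod and chgrp, the commands below restrict a throwaway directory so that only the owner and one group can read it. The directory is created with mktemp purely for demonstration, and your primary group (from id -gn) stands in for whichever shared project group you would use in practice.

```shell
# Demonstrate chmod/chgrp on a throwaway directory (safe to run anywhere).
demo_dir="$(mktemp -d)"
touch "$demo_dir/results.txt"

# Assign the directory and its contents to a group; here your primary group,
# as a stand-in for a shared project group.
chgrp -R "$(id -gn)" "$demo_dir"

# Owner: full access; group: read and traverse; others: nothing.
chmod 750 "$demo_dir"
# Owner: read/write; group: read-only; others: nothing.
chmod 640 "$demo_dir/results.txt"

# Inspect the result.
ls -ld "$demo_dir" "$demo_dir/results.txt"
```

The same chmod and chgrp calls can be applied to a directory under ~/brc_scratch once you know which group should have access.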

Acknowledging use of the cluster

If your research has made use of Rosalind, you should make sure to acknowledge this in your publications. Details can be found here: Acknowledging Rosalind

Getting help

If you need any further help, please log a ticket through our ticketing system at https://helpdesk.rosalind.kcl.ac.uk (accessible on the KCL network only; you don't need to log in to the site to submit a ticket, simply click on “Open a new ticket”), or email rosalind-support@kcl.ac.uk.

A web-based online course, 'Introduction to King's High Performance Computing Service', is also available via KEATS (King's E-learning and Teaching Service): http://keats.kcl.ac.uk/enrol/index.php?id=27710

hpc_quick_start_guide/start.txt · Last modified: 2018/05/31 13:47 by igrant