Scientific Software

A range of software is installed on Rosalind, including a good number of compilers, MPI implementations and libraries. Various software packages have been compiled and validated using these compilers and libraries.

Users can compile and validate software packages according to their own needs, in their own home directory. This matters for each user because:

  • Each user uses a software package differently, e.g. different libraries, different parts of the code, different compilation parameters for optimization;
  • Although a software package may have been tested with the tests (regression etc.) provided by its developers, it can still fail under different tests, e.g. crashes, numerical errors or failed sanity checks. Compilations must therefore be checked with sanity checks that exercise the same “subroutines” as the production simulations.

For these reasons, software packages will not be compiled on demand or on a case-by-case basis.

Software Environment Modules

The Environment Modules system is a tool to help users manage their Linux shell environment by allowing groups of related environment variable settings to be made or removed dynamically. This tool is very useful as it allows users to simply switch between different versions of the same program or library.

You can see the list of software modules available on a system with: module av (short for module avail)

--------------------- /usr/share/Modules/modulefiles ---------------------
GAMS                    mpich2-x86_64
Python2.7.6             dot                     null
Python3.3.3             module-info


---------------------------- /etc/modulefiles ----------------------------
compat-openmpi-x86_64 openmpi-x86_64

You can load the module that you need, for instance Python 2.7.6, with: module load Python2.7.6

* Hint – the module command will autocomplete using the tab key, e.g. if you type module load bio<tab> the word bioinformatics should appear.

Use it for your needs and then unload it with: module unload Python2.7.6

You can check what modules you have loaded at any particular moment with: module list

Currently Loaded Modulefiles:
  1) Python2.7.6

You can unload all the modules with: module purge

It's important to understand that these commands only apply to your current session, so if you have different terminal sessions open, they can each have different modules loaded.
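The per-session behaviour follows from the fact that module commands only change environment variables in the shell where they are run. The sketch below imitates this without the module command itself: a subshell stands in for a second terminal session, and the directory /opt/apps/example/bin is a made-up placeholder, not a real path on the system.

```shell
# Remember the PATH of the "first session".
before="$PATH"

(
    # Stand-in for "module load" in another session: prepend a
    # hypothetical directory to PATH. The change stays in this subshell.
    PATH="/opt/apps/example/bin:$PATH"
    echo "second session sees first PATH entry: ${PATH%%:*}"
)

# Back in the first session, PATH has not changed.
after="$PATH"
if [ "$before" = "$after" ]; then
    echo "first session PATH is unchanged"
fi
```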

It is also important not to load different versions of the same module at the same time, as this can cause problems.
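One way to catch this before submitting a job is to check the list of modules a script intends to load. The helper below is only a sketch: find_version_conflicts is a hypothetical name, it assumes module names end in softwarename/version, and the 3.1.0 version in the usage line is invented for illustration.

```shell
# Print every software name that is requested in more than one version.
# Assumes each argument ends in ".../softwarename/version".
find_version_conflicts() {
    # Drop the trailing "/version", then report names that occur twice or more.
    printf '%s\n' "$@" | sed 's|/[^/]*$||' | sort | uniq -d
}

# Example (second R version is hypothetical):
#   find_version_conflicts bioinformatics/R/3.2.1 bioinformatics/R/3.1.0
```

If the function prints anything, the same software has been requested in two versions and one of them should be removed from the list.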

Sometimes, for scientific software that requires several libraries, a combination of modules might need to be loaded.
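A combination of modules can be loaded from a clean state with a small wrapper like the one below. This is a sketch rather than a site-provided tool: load_stack is a hypothetical name, and the module names in the usage line are simply taken from the listing above.

```shell
# Start from an empty environment, load each requested module in order,
# then show what ended up loaded.
load_stack() {
    module purge || return 1
    for m in "$@"; do
        # Note: depending on the Modules version, a failed load may not
        # return a non-zero status, so also check the "module list" output.
        module load "$m" || { echo "failed to load $m" >&2; return 1; }
    done
    module list
}

# Example:
#   load_stack openmpi-x86_64 Python2.7.6
```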

Creating your modules

You can also add your own module files for software that you have installed in your home directory.
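As a rough illustration, a personal module file can be written with a few lines of shell. Everything named here is an assumption made for the example: the helper write_modulefile, the tool name mytool, and the directories passed to it. The module file itself uses the standard module-whatis and prepend-path commands.

```shell
# Write a minimal module file to <modroot>/<name>/<version> for software
# installed under <prefix>.
write_modulefile() {
    modroot="$1"; prefix="$2"; name="$3"; version="$4"
    mkdir -p "$modroot/$name" || return 1
    cat > "$modroot/$name/$version" <<EOF
#%Module1.0
module-whatis "$name $version (personal install)"
prepend-path PATH            $prefix/bin
prepend-path LD_LIBRARY_PATH $prefix/lib
EOF
}

# Example with hypothetical paths:
#   write_modulefile "$HOME/modulefiles" "$HOME/apps/mytool/1.0" mytool 1.0
#   module use "$HOME/modulefiles"    # add the directory to the module search path
#   module load mytool/1.0
```

The module use line needs to be run in each new session (or placed in a shell startup file) before the personal module can be loaded.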

* Caveat. Modules often inserts a leading '/' in its listings. This should be removed when loading a module. e.g.:

module load /bioinformatics/bwa/0.7.12-r1039

will not load but:

module load bioinformatics/bwa/0.7.12-r1039

should work.
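If module names are copied from a listing by a script, the leading '/' can be stripped automatically. The helper name strip_leading_slash is made up for this sketch; it relies only on standard shell parameter expansion.

```shell
# Remove a single leading "/" from a module name, if present.
strip_leading_slash() {
    printf '%s\n' "${1#/}"
}

# Example:
#   module load "$(strip_leading_slash /bioinformatics/bwa/0.7.12-r1039)"
```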

Versions

It's often necessary to keep older versions of software so that experiments can be repeated and pipeline scripts don't have to be edited each time a program is upgraded. Because multiple versions of software will exist in the /opt/apps directory, you should always use the version number when loading a module (modules are always put on the system with softwarename/version at the end of the path, e.g. python/2.7.10, bwa/0.7.12-r1039 etc.). Although you could load the R module without supplying the version number, if you do so you'll be selecting the last version in alphanumeric order.
It's a good idea therefore to include the version number of the software that you're using in your script, i.e.

module load bioinformatics/R/3.2.1
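The remark about "the last version in alphanumeric order" is worth a quick illustration, because the last name in character order is not always the newest release. The sketch below assumes the ordering is a plain character-by-character comparison, and the version 3.10.0 is invented for the example. sort -V is a GNU coreutils option.

```shell
# Two version strings: 3.2.1 (from the text) and a hypothetical 3.10.0.
versions='3.2.1
3.10.0'

# Character-by-character order: "3.10.0" sorts before "3.2.1".
lexical_last=$(printf '%s\n' "$versions" | LC_ALL=C sort | tail -n 1)

# Version-aware order: 3.10.0 is the newer release.
version_last=$(printf '%s\n' "$versions" | sort -V | tail -n 1)

echo "last in character order: $lexical_last"
echo "last in version order:   $version_last"
```

Under that assumption an unversioned load would pick 3.2.1 even though 3.10.0 is newer, which is another reason to give the version explicitly.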

Pinning the version in this way ensures that when the software gets upgraded you'll still be using the same version of R.

A number of the programs available in /opt/apps, e.g. python, R and perl, have sub-programs and modules installed within them. Maintaining the versions within these is often quite difficult because installing some modules may automatically upgrade others without giving any warning.

In R, the sessionInfo() command will display the versions of libraries that you have loaded, so including this in your script will give you a record of the exact versions used during a particular run.

To get the versions of the Python libraries for whatever version of Python you're using, you can use pip freeze at a shell prompt:
[k1214122@login2(rosalind) ~]$ pip freeze
cutadapt==1.8.3
numpy==1.9.2
scipy==0.16.0
wheel==0.24.0
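Both checks can be written to a log file at the start of a run so that the versions are kept alongside the results. The function below is a sketch: record_versions and the log file name are hypothetical, and each tool is only queried if it is found on the PATH.

```shell
# Write the versions of R and Python packages visible in the current
# environment to the given log file.
record_versions() {
    log="$1"
    {
        echo "# run started: $(date)"
        if command -v Rscript >/dev/null 2>&1; then
            echo "## R sessionInfo()"
            Rscript -e 'sessionInfo()'
        fi
        if command -v pip >/dev/null 2>&1; then
            echo "## pip freeze"
            pip freeze
        fi
    } > "$log" 2>&1
}

# Example:
#   record_versions versions.log
```

Run from the shell like this, sessionInfo() only reports the packages attached by default, so for R it is still worth calling sessionInfo() at the end of the analysis script itself.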

Perl versions are a little more tricky and may have to be checked on an individual basis. The instmodsh command can be used to list all the installed packages, which at the current time may need to be queried individually for their version numbers.

Python virtualenv (http://docs.python-guide.org/en/latest/dev/virtualenvs/) can be used to control local Python environments and versions. In Perl the equivalent is perlbrew (http://perlbrew.pl/).
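A minimal sketch of the virtualenv workflow is shown below, assuming the virtualenv command is available once the chosen Python module is loaded. The function name make_env and the directory in the usage line are placeholders.

```shell
# Create a virtualenv in the given directory, activate it in the current
# shell, and save the list of installed packages next to it.
make_env() {
    env_dir="$1"
    virtualenv "$env_dir" || return 1
    # Activation changes PATH so that python and pip come from env_dir.
    . "$env_dir/bin/activate" || return 1
    pip freeze > "$env_dir/requirements.txt"
}

# Example (hypothetical directory):
#   make_env "$HOME/envs/myproject"
#   pip install numpy==1.9.2     # pinned version, as in the pip freeze output
#   deactivate                   # leave the environment when finished
```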

scientific_software/start.txt · Last modified: 2016/11/06 11:26 by admin