Software Resources

Through a partnership with WebStore, ACCC provides campus access to discounted software licenses negotiated for all University of Illinois campuses. In addition to software sales, ACCC provides researchers with educational resources, such as Lynda.com, for learning new technologies. The ACER group provides specialized software consulting services: installing and optimizing complex scientific packages for a distributed computing environment.

 

Over the last 2.5 years, ACER has provided researchers with over 150 scientific applications, compilers, and tools. The following is a detailed list of all software provided by ACER on its HPC platform:

 

System software

 

Nodes on the Extreme cluster use the following system software by default:

 

  • CentOS 6.9
  • GCC 4.7.7
  • OpenMPI 1.6.2
  • Java 1.6.0_35
  • Python 2.6.6
  • R 3.2.0

 

Modules repository

 

We use the Modules software environment management system to load and unload modules dynamically and cleanly, and to maintain different versions of the software. Use the “module avail” command to view the list of available software modules. To use a particular module, run “module load apps/{module_name}”; this command puts the specified module in your path. Please include this load command in the jobs you submit to the cluster.
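
As a rough illustration of the load step described above, the following sketch shows how the top of a job script might look. The module name “samtools” is only a placeholder and may not match the actual name in the apps/ tree, and the variable names are assumptions made for this example. Scheduler directives are left out because they depend on the cluster's batch system.

```shell
#!/bin/bash
# Sketch of the module setup at the top of a job script.
# Scheduler directives are omitted; they depend on the cluster's batch system.

# Placeholder name (an assumption): replace it with an entry shown by "module avail".
MODULE_NAME="samtools"
APP_MODULE="apps/${MODULE_NAME}"

if type module >/dev/null 2>&1; then
    # Load the module so its programs are found on PATH, then show what is loaded.
    module load "$APP_MODULE"
    module list
else
    # The module command is only defined on systems with Modules installed.
    echo "module command not found; run this on the cluster" >&2
fi
```

The guard around the module commands is there only so the script fails gently on a machine without Modules; on the cluster the two module lines are the part that matters.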

 

We continuously update the following list of software installed on the Extreme cluster.

 

Applications ( packages) –

 

  • Abaqus (6.13) – An Abaqus environment that provides a simple, consistent interface for creating, submitting, monitoring, and evaluating results from Abaqus simulations.
  • Abinit (7.10.5) – ABINIT is a package whose main program allows one to find the total energy, charge density and electronic structure of systems made of electrons and nuclei (molecules and periodic solids) within Density Functional Theory (DFT), using pseudopotentials and a planewave or wavelet basis.
  • Abyss (1.9.0) – ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads.
  • Amber (12) – Amber is the collective name for a suite of programs that allow users to carry out molecular dynamics simulations, particularly on biomolecules.
  • AMOS (3.1.0) – A Modular Open Source Assembler.
  • ANNOVAR – An efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes.
  • Apt (1.15.2)
  • ASPECT (1.3) – An extensible code written in C++ to support research in simulating convection in the Earth mantle and elsewhere.
  • Bamm (2.3.0) – Program for modeling complex dynamics of speciation, extinction, and trait evolution on phylogenetic trees.
  • Bedtools (2.25.0) – Bedtools utilities are tools for a wide-range of genomics analysis tasks.
  • Bedtools2 (2.19.1) – Bedtools utilities are tools for a wide-range of genomics analysis tasks.
  • BLAST+ (2.5.0) – BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequence databases and calculates the statistical significance. (Note: The software would require the user to download the required database set and change the environment variables accordingly.)
  • Boost (1.65.0) – Boost speeds initial development, results in fewer bugs, reduces reinvention-of-the-wheel, and cuts long-term maintenance costs.
  • Bowtie2 (2.1.0)/(2.2.2)/(2.2.5)/(2.2.9) – Ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.
  • Bwa (0.7.5a) (0.7.15) – BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome.
  • CEAS (1.0.2) – A tool designed to characterize genome-wide protein-DNA interaction patterns from ChIP-chip and ChIP-Seq of both sharp and broad binding factors.
  • CellProfiler (3.0.0) – Software for quantitative analysis of biological images
  • CLOVER
  • COMSOL (4.3b, 5.2a)* – A finite element analysis, solver and Simulation software / FEA Software package for various physics and engineering applications, especially coupled phenomena, or multiphysics.
  • CONN (v.15) – A Matlab-based cross-platform software for the computation, display, and analysis of functional connectivity in fMRI (fcMRI).
  • Converge (2.1.0, 2.4.13) – A multipurpose computational fluid dynamics (CFD) code with innovative features including a fully coupled automated mesh created at runtime and Adaptive Mesh Refinement (AMR).
  • Cp2k (2.5.1-ssmp) – CP2K is a freely available (GPL) program, written in Fortran 95, to perform atomistic and molecular simulations of solid state, liquid, molecular and biological systems.
  • Cufflinks (2.2.0) – Transcriptome assembly and differential expression analysis for RNA-Seq.
  • DANPOS (2.2.2) – A toolkit for Dynamic Analysis of Nucleosome and Protein Occupancy by Sequencing, version 2.
  • Deal.II (8.2.1) – A C++ software library supporting the creation of finite element codes and an open community of users and developers.
  • Espresso (Quantum-Espresso-5.3) – An integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.
  • FALCON – A set of tools for fast aligning long reads for consensus and assembly.
  • FastQC (0.10.1) – FastQC is a quality control application for high throughput sequence data. It reads in sequence data in a variety of formats and can either provide an interactive application to review the results of several different QC checks, or create an HTML based report which can be integrated into a pipeline.
  • Fastx – The FASTX-Toolkit is a collection of command line tools for preprocessing short-read FASTA/FASTQ files.
  • FSL (5.0.10) – A comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data.
  • Gaussian (09) – It provides state-of-the-art capabilities for electronic structure modeling.
  • GaussView (4.1.2) – A graphical interface used with Gaussian
  • GROMACS (2016, 5.1.4) – GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
  • Gtk (3.1.1)
  • Hmmer (2.3.2)(3.1b2) – HMMER is used for searching sequence databases for sequence homologs, and for making sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs).
  • HPL (2.1) – Benchmark for High performance clusters.
  • HTSeq (0.6.1) – A Python package that provides infrastructure to process data from high-throughput sequencing assays.
  • JAGS (3.4.0) – JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation.
  • Java-genomics-toolkit – This is a collection of applications for genomics data processing, primarily high-throughput next-generation sequencing.
  • Lammps (11Nov13-mpich3) – Molecular Dynamics Simulator.
  • MACS (1.4.2) – Model-based Analysis of ChIP-Seq.
  • Mafft (7) – MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <∼200 sequences), FFT-NS-2 (fast; for alignment of <∼30,000 sequences), etc.
  • Maker (2.31.9) – A portable and easily configurable genome annotation pipeline
  • mapDamage – Tracking and quantifying damage patterns in ancient DNA sequences.
  • Marc(MSC) – Simulate products more accurately with the industry’s leading nonlinear FEA solver technology.
  • Mathematica-10.3 – A mathematical tool for analysis.
  • MATLAB (R2010b, R2016b) – A multi-paradigm numerical computing environment and fourth-generation programming language.
  • Megacc (7.0.26) – Sophisticated and user-friendly software suite for analyzing DNA and protein sequence data from species and populations.
  • Meme (4.9.1) – Motif-based sequence analysis tools.
  • MetaVelvet (1.2.02) – An extension of Velvet assembler to de novo metagenomic assembly.
  • Mixcr (2.1.6) – A universal software for fast and accurate analysis of raw T- or B- cell receptor repertoire sequencing data.
  • Molgw (1.F) – A code that implements the many-body perturbation theory (MBPT) to describe the excited electronic states in finite systems (atoms, molecules, clusters).
  • Moose Framework (PETSc) – The Multiphysics Object-Oriented Simulation Environment (MOOSE) is a finite-element, multiphysics framework.
  • Mothur (1.39.5) – A single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
  • Mrbayes (3.2.2, 3.2.3, 3.2.5) – A program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models.
  • MSC-Nastran – A multidisciplinary structural analysis application to perform static, dynamic, and thermal analysis across the linear and nonlinear domains, complemented with automated structural optimization and award winning embedded fatigue analysis technologies, all enabled by high performance computing.
  • Msg (0.4.9) – A pipeline of scripts to assign ancestry to genomic segments using next-gen sequence data.
  • MUMmer (3.23) – A system for rapidly aligning entire genomes.
  • MUSCLE – One of the best-performing multiple alignment programs according to published benchmark tests, with accuracy and speed that are consistently better than CLUSTALW.
  • NAMD (2.9, 2.10, 2.12) – A parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
  • Ncbi_cxx (12.0.0) – The NCBI C++ Toolkit is a public-domain collection of portable libraries, consisting of a cross-platform application framework and a set of utilities and supporting classes to work with biological data.
  • Nclncargs (6.2.0) – An interpreted language designed by the National Center for Atmospheric Research for scientific visualization and data processing.
  • Nwchem (6.6) – Computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters.
  • Oases (0.2.08) – De novo transcriptome assembler for very short reads.
  • OpenFOAM (2.4.0) – OpenFOAM is free, open source software for computational fluid dynamics (CFD).
  • P4est (1.1) – Parallel adaptive mesh refinement library. (FAST and DEBUG modes).
  • Perl (5.20.2, 5.26.0) – Perl is a family of high-level, general-purpose, interpreted, dynamic programming languages.
  • Picard-tools (1.107) – A set of tools (in Java) for working with next generation sequencing data in the BAM format.
  • PICRUSt (1.0.0) – A bioinformatics software package designed to predict metagenome functional content from marker gene surveys and full genomes.
  • Pplacer (1.1) – The pplacer binary actually does phylogenetic placement and produces place files, guppy does all of the downstream analysis of placements, and rppr does useful things having to do with reference packages.
  • Prodigal (2.6.3) – Fast, reliable protein-coding gene prediction for prokaryotic genomes.
  • Pyicoteo (2.0.7) – Pyicos is a command line utility for the conversion and manipulation of genomic coordinates files.
  • Pymummer (0.6.1) – Python3 wrapper for running MUMmer and parsing the output.
  • Qualimap (0.8.1) – Evaluating next generation sequencing alignment data.
  • R (3.0.2, 3.2.0, 3.4.1) – R is a programming language and software environment for statistical computing and graphics.
  • RAxML (8.2.9) – Randomized Axelerated Maximum Likelihood – a fast program for the inference of phylogenies with maximum likelihood method.
  • Revbayes – An interactive environment for statistical computation in phylogenetics, primarily intended for modeling, simulation, and Bayesian inference in evolutionary biology, particularly phylogenetics.
  • Scons (2.5.1) – An improved, cross-platform substitute for the classic Make utility with integrated functionality similar to autoconf/automake and compiler caches such as ccache.
  • Scope
  • SICER (1.1) – A clustering approach for identification of enriched domains from histone modification ChIP-Seq data.
  • Simpson (4.1.1) – A general-purpose software package for simulating virtually all kinds of solid-state NMR experiments.
  • Singularity (2.2) – Used to package entire scientific workflows, software and libraries, and even data.
  • SOAPdenovo2 – A novel short-read assembly method that can build a de novo draft assembly for the human-sized genomes.
  • SPAdes (3.7.1, 3.10.1) – St. Petersburg genome assembler – is an assembly toolkit containing various assembly pipelines.
  • SPM12 (MATLAB) – Statistical Parametric Mapping refers to the construction and assessment of spatially extended statistical processes used to test hypotheses about functional imaging data.
  • STAR (2.3.0e) – STAR is an ultrafast universal RNA-seq aligner.
  • STORM-Cread (0.84) – Comprehensive Regulatory Element Analysis and Discovery.
  • Subread (1.4.6) – A tool kit for processing next-gen sequencing data.
  • Swig (3.0.12) – A software development tool that connects programs written in C and C++ with a variety of high-level programming languages.
  • Tensorflow (1.3.0) – An open source machine learning framework for numerical computation using data flow graphs.
  • Tophat (2.0.11)(2.1.0) – A fast splice junction mapper for RNA-Seq reads.
  • Trimmomatic (0.32) – Trimmomatic performs a variety of useful trimming tasks for illumina paired-end and single ended data.
  • Trinityrnaseq (2.0.2) – Package for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.
  • Usearch (8.1.1861) – USEARCH offers search and clustering algorithms that are often orders of magnitude faster than BLAST.
  • VASP (5.3.5) – The Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling.
  • Vcftools (0.1.11) – A program package designed for working with VCF files, such as those generated by the 1000 Genomes Project.
  • Velvet (1.2.10) – Velvet is a de novo genomic assembler specially designed for short read sequencing technologies.
  • Vicuna (1.3) – A de novo assembly program targeting populations with high mutation rates.
  • Visit (2.9.2) – VisIt is an Open Source, interactive, scalable, visualization, animation and analysis tool.
  • Vmd (1.9.2) – VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.
  • VSCode (1.15.1) – A lightweight but powerful source code editor.
  • Zlib (1.2.8) – A massively spiffy yet delicately unobtrusive compression library.

 

Tools and libraries ( packages)-

 

  • Autodock Vina (1.1.2) – AutoDock Vina is an open-source program for doing molecular docking.
  • Basespacepy – A Python based SDK to be used in the development of Apps and scripts for working with Illumina’s BaseSpace cloud-computing solution for next-gen sequencing data analysis.
  • BEAGLE – BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages.
  • Biom-format (2.1.5) – Python tool designed to be a general-use format for representing biological sample by observation contingency tables.
  • BioPython – The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology.
  • Boost (1.58.0)/(1.55.0) – Boost provides free peer-reviewed portable C++ source libraries.
  • CheckM – CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.
  • Circlator (1.4.1) – A tool to circularize genome assemblies.
  • Cmake (2.8.11.2) (3.2.3) – A cross-platform, open-source build system.
  • CPLEX Studio (12.6.1) – Analytical decision support toolkit for rapid development and deployment of optimization models using mathematical and constraint programming.
  • Crispresso – Analysis of CRISPR-Cas9 genome editing outcomes from deep sequencing data.
  • Cython (0.23.4) – Cython is an optimising static compiler for both the Python programming language and the extended Cython programming language.
  • Dendropy – A Python library for phylogenetic computing.
  • Easel – An ANSI C code library for computational analysis of biological sequences using probabilistic models.
  • ELK (3.1.12) – An all-electron full-potential linearised augmented-plane wave (FP-LAPW) code with many advanced features.
  • FFTW (3.3.3) – FFTW is a C subroutine library for computing the discrete Fourier transform (DFT).
  • GATK (3.2-0) (3.6) – A software package developed at the Broad Institute to analyze high-throughput sequencing data.
  • glibc (2.16) – Core libraries for the GNU system and GNU/Linux systems, as well as many other systems that use Linux as the kernel.
  • GNUPlot (5.0.3) – A portable command-line driven graphing utility.
  • gperftools – A collection of a high-performance multi-threaded malloc() implementation, plus some performance analysis tools.
  • GSL (1.16) – A numerical library for C and C++ programmers.
  • H5py – The h5py package is a Pythonic interface to the HDF5 binary data format.
  • Hdf5 – HDF5 is a data model, library, and file format for storing and managing data.
  • Hmmlearn (0.2.0) – Hmmlearn is a set of algorithms for unsupervised learning and inference of Hidden Markov Models.
  • LibXScrnSaver (1.2.2) – Support for changing the image on a display screen after a user-settable period of inactivity to avoid burning the cathode ray tube phosphors.
  • MACS2 (2.1) – A python module to provide a powerful ChIP-Seq analysis method.
  • Matplotlib (1.3.1) (1.4.3) – Matplotlib is a python 2D plotting library.
  • Moose Framework – The Multiphysics Object-Oriented Simulation Environment (MOOSE) is a finite-element, multiphysics framework.
  • Mpi4py (2.0.0) – Python bindings for the Message Passing Interface (MPI) standard.
  • MPICH (2-1.5)/(3.0.4) – A high performance and widely portable implementation of the Message Passing Interface (MPI) standard.
  • Mpmath – Python library for real and complex floating-point arithmetic with arbitrary precision.
  • Ngslib – A Python-based package for next-generation sequencing analysis.
  • Numpy (1.8.1)(1.11.1) – NumPy is the fundamental package for scientific computing with Python.
  • Octave (4.0.0) – A high-level interpreted language, primarily intended for numerical computations.
  • Odin – A declarative framework for defining resources (classes) and their relationships, validation of the fields that make up the resources and mapping between objects (either a resource, or other python structures).
  • Omics Pipe – An open-source, modular computational platform that automates best practice multi-omics data analysis pipelines
  • OpenBabel (2.3.2) – Open Babel is a chemical toolbox designed to speak the many languages of chemical data.
  • OpenBLAS (0.2.18) – OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
  • Parallel MPI Simple Perl – A non-compliant wrapper around the widely implemented MPI libraries, allowing messages to consist of arbitrarily nested Perl data structures whose size is limited by available memory.
  • Pathlib – Classes representing filesystem paths with semantics appropriate for different operating systems.
  • Pbh5tools – Tools for manipulating HDF5 files produced by Pacific Biosciences.
  • PETSc (3.6.3, 3.7.3) – A suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.
  • Picard tools (2.5.0) – A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
  • Poretools – A toolkit for working with nanopore sequencing data from Oxford Nanopore.
  • PyGTK (2.16.0) – PyGTK lets you easily create programs with a graphical user interface using the Python programming language.
  • PySam (0.9.1.4) – Pysam is a python module for reading and manipulating files in the SAM/BAM format.
  • PyYAML (3.11) – A Python implementation of YAML, a data serialization format designed for human readability and interaction with scripting languages.
  • Qiime (1.9.1) – An open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data.
  • Qt4 (4.8.7) (qmake) – Application development environment.
  • Requests (2.13.0) – Allows you to send organic, grass-fed HTTP/1.1 requests, without the need for manual labor.
  • Ruffus – A Computation Pipeline library for python. Open-sourced, powerful and user-friendly, and widely used in science and bioinformatics.
  • Samtools (1.3, 1.4) – Samtools is a suite of programs for interacting with high-throughput sequencing data.
  • Scikit-Learn (0.17.1) – Simple and efficient tools for data mining, machine learning and data analysis.
  • SciPy (0.13.3)(0.18.0) – A Python-based ecosystem of open-source software for mathematics, science, and engineering.
  • ScreamingBackPack – A utility for handling syncing of remote and local data resources. Developed for use in CheckM but hopefully generic enough to be used elsewhere.
  • Snakemake (3.11.2) – A workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment.
  • SymPy (0.7.5) – A Python library for symbolic mathematics.
  • TCL (8.6.4) – Tcl (Tool Command Language) is a very powerful but easy-to-learn dynamic programming language.
  • Thor and Odin – THOR & ODIN is an HMM-based approach to detect and analyze differential peaks in two sets of ChIP-seq data from distinct biological conditions with replicates.
  • UCSC – Genome Browser and Blat application binaries.
  • VarScan (2.3.6) – A variant detection tool for data from massively parallel sequencing.

 

Compilers ( packages)-

 

 

* Disclaimer: Some restrictions apply. Please note that not all software is available to everyone by default; some packages might require departmental approval.