HPC & Big Data FAQ
How to get an account
Please fill out the form and we will create your ID once your department coordinator approves.
How to share account resources
We can create a shared directory for users to access and exchange data from. Let us know your requirements accordingly.
Please do not share your login credentials. Accounts will be disabled if found to violate this policy.
Logging into Extreme
We do not provide local access to Extreme. If you’ve been granted access, you should log into the cluster using your UIC netID and ACCC common password.
Use an SSH client to connect to login-1.<cluster-name>.acer.uic.edu.
Access using Unix, Linux, and OS X systems: run ssh login-1.<cluster-name>.acer.uic.edu -l <netID>, substituting your own netID.
Access using Windows-based systems (using X11 forwarding):
“X forwarding” is a feature of X where a graphical program runs on one computer, but the user interacts with it on another computer.
All you need is an X server that runs on Windows, and an SSH client, both of which are freely available.
Follow these steps to configure PuTTY:
1. Enter the hostname you want to connect to: login-1.<cluster-name>.acer.uic.edu on port 22. Make sure the connection type is SSH.
2. Scroll to Connection > SSH > X11. Check the box next to Enable X11 Forwarding. By default the X Display location is empty. You can enter localhost:0. The remote authentication should be set to MIT-Magic-Cookie-1.
3. Finally go back to Session. You can save your session too, and load it each time you want to connect.
4. Click Open to bring up the terminal and log in using your netID and password.
Getting started with your environment
Resetting or lost password
Transferring files to Extreme
To transfer files between Extreme and your workstation, use a secure copy client from your workstation.
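The transfer commands themselves are not reproduced in this FAQ. As a minimal sketch, assuming the standard scp client is installed on your workstation, the two directions could be wrapped as below. The function names to_extreme and from_extreme are hypothetical, and the <cluster-name> part of the host must be replaced before use.

```shell
# Login node from this FAQ; replace <cluster-name> with your cluster.
EXTREME_HOST="login-1.<cluster-name>.acer.uic.edu"

# Hypothetical helper: copy a local file to your home directory on Extreme.
# Usage: to_extreme <netid> <local-file>
to_extreme() {
    scp "$2" "$1@${EXTREME_HOST}:~/"
}

# Hypothetical helper: copy a file from your Extreme home directory
# into the current directory on your workstation.
# Usage: from_extreme <netid> <remote-file>
from_extreme() {
    scp "$1@${EXTREME_HOST}:~/$2" .
}
```

Both helpers are meant to be run on the workstation, not on the cluster, since Extreme cannot normally open a connection back to your machine.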
Downloading files and data sets off the internet
To download files from external FTP and web servers, use wget. Copy the link location of the file to be downloaded, then run wget followed by that link.
To download from a Git repository, use git clone followed by the repository URL.
Setting up a bashrc file
Follow these quick steps to create a profile. A .bashrc file is placed in your home directory when a new account is created. It is a hidden file, so list it with ‘ls -a’ from your home directory.
The .bashrc file should appear in the listing. If not, create it with a text editor or with ‘touch ~/.bashrc’.
Then write the following line into it: PATH=$PATH:~/bin
More environment variables can be added as required.
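As a hedged sketch of the same steps, the snippet below adds a PATH line to .bashrc only if it is not already present, so it can be re-run safely. RC_FILE is a hypothetical override variable introduced here for illustration; by default the snippet edits your own ~/.bashrc.

```shell
# File to edit; defaults to your own ~/.bashrc.
rc_file="${RC_FILE:-$HOME/.bashrc}"
path_line='export PATH="$PATH:$HOME/bin"'

# Create the file if it does not exist yet.
touch "$rc_file"

# Append the line only if it is not already there, so re-running is harmless.
grep -qxF "$path_line" "$rc_file" || printf '%s\n' "$path_line" >> "$rc_file"

# Show hidden files to confirm that .bashrc exists.
ls -a "$(dirname "$rc_file")"
```

The change takes effect at your next login, or immediately after running ‘source ~/.bashrc’.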
New software installation request
Note that ACER is not responsible for registering or purchasing licenses for software packages. Users must purchase or register the package and download it to their home directory; we will then proceed with installation on the cluster.
Software available on Extreme
Use ‘module avail’ in your bash terminal.
Software/modules are segregated into apps, tools and compilers.
How to load or unload software/module
Load a package with ‘module load’ followed by the module name. Module names carry naming conventions that tell you what else to load.
If the name includes intel, load the compilers/intel module along with the software package. If it includes a Python version, load that Python from the compilers group. These modules provide the dependencies the software package needs to function properly.
To unload a package, use ‘module unload’ in your bash terminal.
To list the modules currently loaded, use ‘module list’.
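A typical module session might look like the sketch below. The module name compilers/intel comes from the naming example in this FAQ; any other name you need should be taken from the ‘module avail’ listing.

```shell
# Module name taken from the naming example in this FAQ; others will differ.
intel_module="compilers/intel"

# 'module' is only defined on systems with Environment Modules (such as the
# cluster), so the guard lets this sketch run harmlessly elsewhere.
if type module >/dev/null 2>&1; then
    module avail                    # list everything that is installed
    module load "$intel_module"     # add the Intel compilers to your environment
    module list                     # confirm what is currently loaded
    module unload "$intel_module"   # remove it again
fi
```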
Compiling software in your/shared directories
Compiling software in directories where you have write permission does not require superuser/root privileges. Sudo is normally needed only because packages install into system library and binary directories by default. With that out of the way, let’s look at how to compile software.
For most packages,
- Load the dependencies from modules provided on Extreme/SABER (compilers, tools etc).
- Configure with ‘--prefix’ set to your directory (you need write permission for that directory).
- To run programs, give the full path to the binaries.
- To avoid long path names, add the package to your environment variables. The best way is to set the variable in your ~/.bashrc file, so it loads every time you log in.
*Note: The installation of the package is local to the user and cannot be accessed by others.
- For a package installed in a shared directory, supply the full path to the executable (or source your ~/.bashrc) to access it.
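The steps above can be sketched as follows for a package that uses the common configure/make build. The install location $HOME/apps/mypkg is a hypothetical example, and packages with other build systems will need different commands.

```shell
# Hypothetical install location; pick any directory you can write to.
PREFIX="$HOME/apps/mypkg"
mkdir -p "$PREFIX"

# Run from the unpacked source directory of the package. The guard only
# skips the build when no configure script is present.
if [ -x ./configure ]; then
    ./configure --prefix="$PREFIX"
    make
    make install
fi

# Make the installed binaries reachable without typing the full path.
# Put this line in ~/.bashrc to apply it at every login.
export PATH="$PREFIX/bin:$PATH"
```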
Compiling Python modules in your/shared directories
- There are two general ways to install Python modules: with ‘easy_install’ or with ‘pip’.
- If the target directory does not exist, the installer will exit and ask you to create it.
- Also add the newly created directory to the PYTHONPATH environment variable.
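The original install commands are not shown here. As one possible approach, assuming pip is available after loading a Python module, the sketch below uses pip's --target option rather than an easy_install prefix. The directory name and the helper install_py_module are hypothetical.

```shell
# Hypothetical directory for your own Python modules.
PYLIB="$HOME/python-modules"
mkdir -p "$PYLIB"

# Hypothetical helper: install one package into $PYLIB with pip.
# Usage: install_py_module <package-name>
install_py_module() {
    pip install --target="$PYLIB" "$1"
}

# Let Python find modules installed there (add this to ~/.bashrc to persist).
export PYTHONPATH="$PYLIB${PYTHONPATH:+:$PYTHONPATH}"
```

Creating the directory up front avoids the installer stopping to ask for it.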
Compiling your program on Extreme/SABER
Compiling with Intel compiler
- When you invoke the compiler with icc, the compiler builds C source files using C libraries and C include files. If you use icc with a C++ source file, it is compiled as a C++ file. Use icc to link C object files.
- When you invoke the compiler with icpc the compiler builds C++ source files using C++ libraries and C++ include files. If you use icpc with a C source file, it is compiled as a C++ file. Use icpc to link C++ object files.
The icc or icpc command does the following:
- Compiles and links the input source file(s).
- Produces one executable file, a.out, in the current directory.
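Input files can be any of the following types:
- C or C++ source file (.c, .cc, .cpp, .cxx, .i)
- assembly file (.s on Linux; .asm on Windows)
- object file (.o on Linux; .obj on Windows)
- static library (.a on Linux; .lib on Windows)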
Appropriate file name extensions are required for each compiler. By default, the executable filename is “a.out”, but it may be renamed with the “-o” option. The compiler command performs two operations: it makes a compiled object file (having a .o suffix) for each file listed on the command-line, and then combines them with system library files in a link step to create an executable. To compile without the link step, use the “-c” option.
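The two-step compile and link described above can be tried with a small test program, as in the sketch below. It assumes the Intel module is loaded so that icc is on your PATH; the fallback to the system cc is only there so the example also runs off the cluster, and the file names are placeholders.

```shell
# Work in a scratch directory so nothing in your project is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { printf("hello\n"); return 0; }
EOF

# Prefer icc; fall back to the system cc only so the sketch runs off-cluster.
CC=icc
command -v "$CC" >/dev/null 2>&1 || CC=cc

if command -v "$CC" >/dev/null 2>&1; then
    "$CC" -c hello.c            # compile only: produces hello.o
    "$CC" -o hello hello.o      # link step: names the executable instead of a.out
    ./hello
fi
```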
Compiling with MPICH2 & MPICH3
MPICH is an open source implementation of MPI (Message Passing Interface). Similar to Intel’s MPI, this is an alternative implementation of MPI.
To get started, load the MPICH module before working on it.
Following versions are installed on the system for your use.
Once either of the above commands is executed, it automatically sets the environment variables required to use MPICH, i.e. PATH, MPICH2_HOME (MPICH3_HOME in the case of MPICH3) and LD_LIBRARY_PATH.
To run programs built with MPICH, use mpiexec.
The following scripts are available to compile and link your mpi programs:
Each script will invoke the appropriate compiler.
Make a job script to reserve nodes for your job to run on. Refer to how to create a job script (FAQ)
To compile or run a program with MPICH,
To test that you can run an ’n’ process job on multiple nodes:
The ’machinefile’ is of the form:
‘host1’, ‘host2’, ‘host3’ and ‘host4’ are the hostnames of the machines you want to run the job on.
For more information about MPICH, see the MPICH manual.
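As an illustration, a machinefile and a matching build and launch could look like the sketch below. It assumes the MPICH module is loaded and that mpiexec is the Hydra launcher shipped with MPICH 3, which takes the host file through -f; older MPICH2 installations may use a different option. The source file hello_mpi.c is hypothetical.

```shell
# One hostname per line; host1..host4 are placeholder names.
cat > machinefile <<'EOF'
host1
host2
host3
host4
EOF

# Only attempt the build and launch where the MPICH wrappers are on PATH,
# i.e. after loading the MPICH module on the cluster.
# hello_mpi.c is a hypothetical source file of your own.
if command -v mpicc >/dev/null 2>&1 && [ -f hello_mpi.c ]; then
    mpicc -o hello_mpi hello_mpi.c
    mpiexec -f machinefile -n 4 ./hello_mpi
fi
```

In practice the hostnames should be nodes reserved for you by a job, not nodes picked by hand.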
Compiling with OpenMPI
OpenMPI is another open source MPI implementation, similar to MPICH.
Its usage is similar to MPICH and Intel MPI.
To get started with OpenMPI, you do not have to load the module. It is the default MPI implementation for Rocks OS.
The following scripts are available to compile and link your mpi programs:
Each script will invoke the appropriate compiler.
mpicc <flags> <filename.c>
mpiCC <flags> <filename.cpp>
mpif77 <flags> <filename.f>
mpif90 <flags> <filename.f90>
To get more information on specific compiler wrappers in OpenMPI, use -help with each wrapper.
Compiling OpenMP Code with Intel Compilers
Load the intel compiler:
To cross compile a C program:
To cross compile a C++ program:
To cross compile a Fortran program:
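The original commands are missing from this section. The sketch below assumes ordinary compilation for the cluster's own nodes, without any cross-compilation target flags the originals may have carried. It uses -qopenmp, the OpenMP switch in current Intel compilers; older releases used -openmp, so check ‘icc -help’ for the installed version. All source file names are placeholders.

```shell
# OpenMP switch for current Intel compilers (older releases: -openmp).
OMP_FLAG="-qopenmp"

# Hypothetical source file names; replace them with your own.
if command -v icc >/dev/null 2>&1; then
    icc   "$OMP_FLAG" -o omp_c   omp_hello.c      # C
    icpc  "$OMP_FLAG" -o omp_cxx omp_hello.cpp    # C++
    ifort "$OMP_FLAG" -o omp_f   omp_hello.f90    # Fortran
fi

# Choose the thread count at run time.
export OMP_NUM_THREADS=4
```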
Submit and manage jobs on Extreme
How to submit/run a job on Extreme
- Submit a job script
Submit the script:
*Be sure to substitute your own UIC NetID for NetID.
*Please make sure to transfer files from your Lustre directory into your home directory. Files left on Lustre are subject to removal after 90 days.
- For nodes, request no more nodes than your queue has permission to access. E.g., nodes=10 will reserve 10 nodes for your job, even if the job does not end up using all of them.
- Specify the number of cores needed per node using ppn. If you’re using the batch queue on Extreme, the value can be up to 16 cores. For example,
#PBS -l nodes=10:ppn=1,walltime=5:00:00
may result in a higher wait time than
#PBS -l nodes=1:ppn=10,walltime=5:00:00
- Keep in mind that we have different types of nodes in our cluster: G1 nodes have 16 cores, G2 nodes have 20 cores and Highmem nodes have 32 cores. So if your queue has only G1 nodes, you cannot have ppn>16.
- After you submit your job script, changes to the contents of the script file will have no effect on your job as Torque has already spooled a copy to a separate file system.
- If your job requests too many resources, showq will classify it as idle until resources become available.
- We recommend always including your email address in your scripts so that you are alerted to any status changes in the job.
| Script directive | qsub option | Description |
| #PBS -a | -a | Declares the time after which the job is eligible for execution. Syntax (brackets delimit optional items, with the default being the current date/time): [CC][YY][MM][DD]hhmm[.SS] |
| #PBS -A account | -A account | Defines the account associated with the job. |
| #PBS -d path | -d path | Specifies the directory in which the job should begin executing. |
| #PBS -e filename | -e filename | Defines the file name to be used for stderr. |
| #PBS -h | -h | Puts a user hold on the job at submission time. |
| #PBS -j oe | -j oe | Combines stdout and stderr into the same output file. This is the default. If you want to give the combined stdout/stderr file a specific name, include the -o path flag also. |
| #PBS -l string | -l string | Defines the resources that are required by the job. See the discussion below for this important flag. |
| #PBS -m option(s) | -m option(s) | Defines the set of conditions (a=abort, b=begin, e=end) under which the server will send a mail message about the job to the user. |
| #PBS -N name | -N name | Gives a user-specified name to the job. Note that job names do not appear in all Moab job info displays, and do not determine how your job’s stdout/stderr files are named. |
| #PBS -o filename | -o filename | Defines the file name to be used for stdout. |
| #PBS -p priority | -p priority | Assigns a user priority value to a job. See the discussion under Setting Job Priority. |
| #PBS -q queue, #PBS -q queue@host | -q queue | Runs the job in the specified queue (pdebug, pbatch, etc.). A host may also be specified if it is not the local host. |
| #PBS -r y | -r y | Automatically reruns the job if there is a system failure. The default behavior at LC is to NOT automatically rerun a job in such cases. |
| #PBS -S path | -S path | Specifies the shell which interprets the job script. The default is your login shell. |
| #PBS -v list | -v list | Adds a comma-separated list of environment variables that are exported to the job. |
| #PBS -V | -V | Declares that all environment variables in the qsub environment are exported to the batch job. |
| #PBS -W | -W | This option has been deprecated and should be ignored. |
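Several of these directives can be combined into a job script like the sketch below. The job name, the program my_program and the mail address are placeholders, the queue name batch is the shared queue mentioned in this FAQ, and -M (the mail recipient list) is a standard Torque option that the table above does not list.

```shell
# Write a minimal job script. The job name, queue, e-mail address and
# program name are placeholders to replace with your own.
cat > myjob.pbs <<'EOF'
#!/bin/bash
#PBS -N myjob
#PBS -q batch
#PBS -l nodes=1:ppn=10,walltime=5:00:00
#PBS -j oe
#PBS -m abe
#PBS -M NetID@uic.edu

# Torque starts jobs in the home directory; move to the submission directory.
cd "$PBS_O_WORKDIR"
./my_program
EOF

# Submit only where qsub exists (on the cluster login node).
if command -v qsub >/dev/null 2>&1; then
    qsub myjob.pbs
fi
```

On success qsub prints the job ID, which the monitoring commands take as their argument.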
- Submit an interactive job
HPCC staff recommends that jobs normally be submitted using a script and the qsub command. However, qsub will also allow interactive jobs, which are useful when debugging scripts and applications.
To run an interactive job, you must include the -I (capital i) flag to qsub. Additionally, any job submission parameters in your script file with #PBS prefixes should be included at the command line.
This command assigns a compute node to the user to run their jobs. Please note that if you log out or exit your interactive session, your job will be marked as completed by the scheduler.
To pass multiple options with your interactive job script, use the -l (lowercase L) option.
Flags used at the command line follow the same syntax as those flags listed in the table above.
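A minimal interactive request might look like the sketch below. The resource values are illustrative only and should match what your queue allows.

```shell
# Resource request in the same syntax as the #PBS -l directive.
resources="nodes=1:ppn=1,walltime=1:00:00"

# -I requests an interactive session; -l passes the resource list.
# The guard keeps the sketch from failing where qsub is not installed.
if command -v qsub >/dev/null 2>&1; then
    qsub -I -l "$resources"
fi
```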
Monitor a Job
To see the status of all your submitted jobs, use showq.
The showq command has several options. A few that may prove useful include:
- -r shows only running jobs plus additional information such as partition, qos, account and start time.
- -i shows only idle jobs plus additional information such as priority, qos, account and class.
- -b shows only blocked jobs
- -p partition shows only those jobs on a specified partition. Can be combined with -r, -i and -b to further narrow the scope of the display.
- -c shows recently completed jobs.
To check the status of a specific job, use checkjob followed by the job ID.
- Displays detailed job state information and diagnostic output for a selected job.
- The checkjob command is probably the most useful user command for troubleshooting your job, especially if used with the -v flag. Sometimes, additional diagnostic information can be viewed by using multiple “v”s: -vv or -v -v.
Cancel a job
Cancel a running or queued job.
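The monitoring and cancel commands can be sketched together as below. The job ID is a made-up placeholder, and qdel is the standard Torque command for removing a job; this FAQ does not say which cancel command the site prefers, so treat that choice as an assumption.

```shell
job_id="12345"   # hypothetical job ID, as printed by qsub at submission

# The guard keeps the sketch from failing where Moab is not installed.
if command -v showq >/dev/null 2>&1; then
    showq -u "$USER"          # all of your jobs
    checkjob -v "$job_id"     # detailed state of one job
fi

# Cancel the job (commented out so the sketch never removes a real job).
# qdel "$job_id"
```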
Run an Interactive job using screen
To run an interactive job, you must spawn a screen session first.
1. To start a session, type ‘screen’; this opens a new session from which to run your interactive job.
2. Start your interactive job from inside the session with qsub -I, as described under ‘Submit an interactive job’.
3. To detach from a screen, press Ctrl+A and then D. Your job keeps running in the background, so you are free to log out of Extreme.
4. To reattach a screen and check on your job, first type ‘screen -ls’.
This will show all your active screens. To reattach, use ‘screen -r’ followed by the ID of the session.
By this method you can start a job in a screen session and detach it. Job submission is done in batch mode, so while a job waits for resources to be allocated, you can detach the screen and let it wait in the background. This way your job will not quit when you log out of your session on Extreme.
For more information about screen, use ‘man screen’.
Why is my job taking time to start
When submitting jobs to the batch queue, please have patience. Maximum wall time for a job in batch queue is 10 days. Some groups/departments have their own reserved queues, and rules for their queues are different from batch queue.
For now, just submit your job to a queue you are eligible for; once the resources requested by your job become available, it will move to the active state. There are no reservations in the batch queue. It is shared by all users. Please do not abuse the shared space.
Specific Software (Best practices)
Operating with Gaussian.
Gaussian manual states:
“It is always best to use SMP-parallelism within nodes and Linda only between nodes. For example on a cluster of 4 nodes, each with a dual quad-core EM64T, one should use %NProcShared=8 %LindaWorkers=node1,node2,node3,node4”.
If you are part of a queue that is shared by several users, you could list all of its processor names (there will be many), but each time a job is scheduled you will not necessarily be assigned the same processors.
Follow the below steps to get the names of the nodes you are operating on,
1. Start an interactive job with the number of nodes you require.
Below is the command for starting an interactive job.
To read more about Interactive jobs, see our Technical Documentation page.
2. Once the job has started, use ‘checkjob -v’ followed by the job ID to check the names of the nodes it is running on.
3. Lastly, input them in your input file and run Gaussian.
This might be a tedious task, but it guarantees the performance and utilization of nodes.
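Part of this can be scripted. The sketch below swaps in a different technique from the checkjob lookup: it assumes the job runs under Torque, which normally sets PBS_NODEFILE to a file listing the assigned hosts (one line per core), and builds the two Link 0 lines from it. The value 8 for %NProcShared is copied from the manual's example and should be changed to the number of cores per node you requested. The stand-in host names are placeholders.

```shell
# Stand-in node file so the sketch runs outside a job; inside a Torque job,
# PBS_NODEFILE is normally set by the scheduler.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf '%s\n' node1 node1 node2 node2 > "$PBS_NODEFILE"
fi

# Collapse the per-core list to unique host names, joined with commas.
workers=$(sort -u "$PBS_NODEFILE" | paste -sd, -)

# Print the Link 0 lines to paste at the top of the Gaussian input file.
printf '%%NProcShared=8\n%%LindaWorkers=%s\n' "$workers"
```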