Extreme Storage Migration

Per the November 15, 2019 announcement, original storage on the Extreme HPC Resource is now out of warranty and users are asked to migrate their data to other filesystems in advance of the UIC Spring Semester 2020. The affected filesystems are:

  • /mnt/store1: home directories
  • /mnt/store2: home directories
  • /mnt/store3: project lab shares and class directories
  • /mnt/lustre: fast scratch

ACER users whose Extreme account was created prior to summer 2017 likely have their home directory (~) on /mnt/store1 or /mnt/store2; similarly, projects established on ACER resources prior to summer 2017 likely have a project directory in /mnt/store3. Some users may have inadvertently also used their /mnt/lustre directories for auxiliary storage.

If you wish to preserve data currently on any of these filesystems, you must take action.

To facilitate a smooth transition off out-of-warranty storage, users should review the constraints and timelines below.

1.1. General cleanup

Extreme filesystems are again at near-capacity. Regardless of migration plans, all users are asked to delete any unnecessary or redundant files from the four Extreme filesystems listed above. If a filesystem fills completely, large numbers of Extreme users may lose computation service. ACER requests that all users and groups remove all unneeded files by Monday, December 16, 2019.
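One way to identify candidates for deletion is to list the largest items under a directory using du and sort; the directory and file names below are illustrative only:

    # List items in the current directory by size, largest last (paths are illustrative)
    cd /mnt/lustre/<NetID>
    du -sh * | sort -h

    # Remove files and directories that are no longer needed
    rm unneeded_results.tar
    rm -r old_scratch_runs/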

1.2. User home directories

In an effort to manage filesystem capacity and performance, home directories shared throughout ACER HPC clusters are being consolidated into a single /home filesystem. All home directories will be subject to a quota with a soft limit of 10 GB and a hard limit of 15 GB; home directories that remain over 10 GB for more than seven (7) days will become read-only until usage is brought back under the 10 GB soft limit.

In order to decommission and migrate data off the old filesystems before the beginning of the Spring 2020 term, all home directories must be reduced to under 10 GB in size by Friday, January 10, 2020. Users whose home directories are not within the 10 GB limit may not have their accounts migrated, and their access to ACER HPC clusters will effectively be disabled.
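To check whether a home directory is within the soft limit, standard Linux tools are sufficient; for example:

    # Report the total size of your home directory in human-readable form
    du -sh ~

    # Break usage down by (non-hidden) top-level item to find what to move or delete
    du -sh ~/* | sort -h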

In lieu of large data storage within /home, each user will be provided an individual directory within the project directory associated with their respective research group. (See below.)

1.3. Project directories

Over the course of Extreme’s operation, research groups on Extreme have been provisioned different storage options for cluster data:

  1. Directory in /projects; project directory name is of the format /projects/<dept>_<group> . This is the active standard location for research project data. Anything within the /projects filesystem resides on in-warranty storage arrays and does not need to be migrated.
  2. Legacy lab share; project directory name is  /mnt/store3/clust<group>lab .
  3. No project or lab share directory. Some users and groups have only operated out of their /home or /mnt/lustre user directories.

Availability of data stored via option 2 or 3 will be discontinued at the beginning of the Spring 2020 term. Therefore, active research data expected to be in use on ACER HPC clusters must be migrated to /projects by Wednesday, January 15, 2020.

Users wishing to maintain use of large datasets on the clusters should migrate that data to their individual space within their research group’s directory located at /projects/<dept>_<group>/<NetID> . Users should refer to the ACER Projects Index to determine what legacy and active directories exist for their respective research groups.
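As a rough sketch of such a migration, assuming a legacy lab share of the format described above (the dataset name and placeholder paths are illustrative):

    # Copy a dataset from a legacy lab share into your individual project space,
    # preserving permissions and timestamps (my_dataset is a placeholder name)
    cp -a /mnt/store3/clust<group>lab/my_dataset /projects/<dept>_<group>/<NetID>/

    # Spot-check the copy, then remove the original to free space on the legacy filesystem
    du -sh /projects/<dept>_<group>/<NetID>/my_dataset
    rm -r /mnt/store3/clust<group>lab/my_dataset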

1.4. Local backups

While all efforts are made to maintain availability and integrity of data, filesystems mounted on ACER HPC resources are not backed up and ACER provides no assurances for long-term storage.

Therefore, users and groups are encouraged to maintain local copies of all critical data on research lab or departmental storage.

This section provides users with basic instructions for data transfer and filesystem housekeeping operations.

2.1. External transfers

Operations for external file transfers are intended for the evacuation and/or backup of data that currently exists on Extreme storage.

SCP/SFTP via Extreme login nodes

The long-standing method for transferring data on and off cluster storage is the use of Secure Copy (SCP) and the Secure File Transfer Protocol (SFTP). Both command line interface (CLI) and graphical user interface (GUI) tools are available.

Both login-1.extreme.acer.uic.edu and login-2.extreme.acer.uic.edu provide SCP and SFTP services. ACER asks Extreme users to do their primary interactive work on login-1 and allow login-2 to be the primary SCP/SFTP transfer node.
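As a minimal sketch, run from a local workstation (your NetID and the remote paths are placeholders):

    # Copy a directory from Extreme to local storage via the designated transfer node
    scp -r <NetID>@login-2.extreme.acer.uic.edu:/mnt/store3/clust<group>lab/results ./results_backup

    # Alternatively, open an interactive SFTP session
    sftp <NetID>@login-2.extreme.acer.uic.edu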

Globus via DTN

ACER is launching a new Data Transfer Node (DTN) resource providing the Globus Connect Service. Users familiar with Globus can request early adopter access to the DTN by sending a support ticket to extreme@uic.edu with the subject “DTN access request.”

Once a user has been provisioned access to the DTN, they may access their files by using the endpoint uicacer#dtn .
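Most users will drive Globus transfers through the Globus web interface. For those who prefer the command line, the sketch below assumes the globus-cli package is installed and that you are already logged in; the endpoint IDs and paths are placeholders:

    # Locate the DTN endpoint and note its UUID
    globus endpoint search "uicacer#dtn"

    # Transfer a directory to another endpoint you have access to
    # (<dtn-endpoint-id>, <destination-endpoint-id>, and paths are placeholders)
    globus transfer --recursive \
        <dtn-endpoint-id>:/projects/<dept>_<group>/<NetID>/data \
        <destination-endpoint-id>:/backups/data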

2.2. Internal transfers (via SSH)

Operations for internal file transfers are meant for migrating data that currently exists on Extreme storage in /mnt/store{1,2,3} and /mnt/lustre to the /projects filesystem.

Users working with ACER HPC clusters should have some basic experience at the Linux command line, accessed through SSH. Basic filesystem operations include the following (a short walk-through follows the list):

  • cp <source> <destination>: copy a source filename or directory to a destination filename or directory. Use of the -R flag will recursively copy directory contents, while use of the -a flag will recursively copy and retain file properties (such as permissions).
  • mv <source> <destination>: move a source file/directory to a destination file/directory, analogous to cp.
  • rm <filename>: permanently remove (delete) the file named filename. Running rm -r <directory> will recursively delete all the contents within a directory.
  • du -sh *: provides the disk usage of all files in the current directory in “human-readable” format (K = kilobytes, M = megabytes, G = gigabytes).
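As a concrete walk-through under the quota and migration plan above (directory names are illustrative), these commands can be combined to move a large directory out of a home directory and into individual project space:

    # Find what is consuming space in your home directory
    cd ~
    du -sh *

    # Copy the large directory to your individual project space, preserving file properties
    cp -a ~/simulation_output /projects/<dept>_<group>/<NetID>/

    # After confirming the copy succeeded, delete the original to get under the quota
    rm -r ~/simulation_output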

2.3. Note on rsync

The rsync tool is typically installed on most macOS and Linux systems (the latter including ACER resources); Windows options are also available. The advantage of rsync is its ability to resume interrupted, partially-completed file transfers without having to start from the beginning. The syntax for rsync is similar to that of scp, i.e.:

  • rsync <flags> <source> <destination>

Where <flags> are command line arguments such as -a (archive mode, which recursively copies and preserves file properties, analogous to cp -a). Users with very large datasets containing numerous files may wish to use this utility for file transfers. It is effective for data transfers both internal and external to ACER resources.
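For example, to migrate a directory to project space in a way that can be resumed if interrupted (paths are illustrative):

    # -a preserves file properties; -P shows progress and keeps partially-transferred
    # files so an interrupted run can be resumed by re-running the same command
    rsync -aP /mnt/lustre/<NetID>/large_dataset/ /projects/<dept>_<group>/<NetID>/large_dataset/

    # The same syntax also works for external transfers, e.g. pulling data to a local machine:
    rsync -aP <NetID>@login-2.extreme.acer.uic.edu:/mnt/store3/clust<group>lab/data/ ./data_backup/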

3.1 Enforcement of deadlines

Deadlines are required in order to synchronize with administrative downtimes scheduled for Tuesday, December 10, 2019 (for home directories) and Thursday, January 16, 2020 (for project directories). If you believe your transfer will not be completed before these times, please contact Extreme technical support as soon as possible.

3.2 Contacting support

As always, please direct all Extreme support questions to extreme@uic.edu . For the purposes of this data migration effort, we request that you use the subject line “Extreme storage migration”.