Fixing Quota Issues

The most common quota problem is being close to the quota or exceeding it.

The other common problem is that show-quota (see Checking Usage And Quotas) doesn’t show your usage and quotas, or shows very out-of-date data. In this case, please contact support so that we can fix it (see Getting Started for the email address to use).

This page, however, will focus on the most common problem – exceeding quota or being close to it.

Determine Where Your Usage Is

The filesystem quotas and show-quota work on whole directories and don’t indicate anything about where in a directory the usage is. Often, much of the usage is old files, directories, or caches that are no longer needed and can be cleaned up. Use show-quota to get the path(s) to your (or your project’s) directory(ies) in the data store. Then, you can use ls and du to determine where your usage is within those directories.

If you are in a directory, you can run

ls -alh

to list all files and directories, including hidden ones (the -a option), along with the sizes of all files in a human-readable format (the -l and -h options together). The -a option is particularly useful in your HOME directory, where there are many hidden files and directories. An example of its output would be

glogin9:/scratch-grete/usr/gzadmfnord $ ls -alh
total 79M
drwx------    4 gzadmfnord gzadmfnord 4.0K May 27 15:09 .
drwxr-xr-x 4606 root       root       268K May 27 10:50 ..
lrwxrwxrwx    1 gzadmfnord gzadmfnord    4 May 27 15:09 .a -> .baz
drwxr-x---    2 gzadmfnord gzadmfnord 4.0K May 27 15:08 bar
-rw-r-----    1 gzadmfnord gzadmfnord  78M May 27 15:08 .baz
drwxr-x---    2 gzadmfnord gzadmfnord 4.0K May 27 15:08 foo

In this case, if the -a option had not been used, the 78 MiB file .baz would have been completely overlooked. Notice that ls -alh also shows the total size of all files directly in the directory (not in subdirectories), the permissions, and the targets of symlinks.

One of the limitations of ls is that it does not show the total amount of space and inodes used inside directories. For that, you need to use du. To get the block usage for one or more directories, run

du -sch DIR1 ... DIRN

and for inodes

du --inodes -sch DIR1 ... DIRN

where the -s option generates a summary of the total usage for each directory, the -c option prints the grand total over all the given directories, and -h converts the usage to a human-readable format (e.g. converting 1073741824 bytes into 1G).

An example with a user’s two directories in the scratch-grete data store would be

glogin9:~ $ du -sch /scratch-grete/usr/$USER /scratch-grete/tmp/$USER
79M     /scratch-grete/usr/gzadmfnord
60K     /scratch-grete/tmp/gzadmfnord
79M     total

and for inodes, it would be

glogin9:~ $ du --inodes -sch /scratch-grete/usr/$USER /scratch-grete/tmp/$USER
5       /scratch-grete/usr/gzadmfnord
1       /scratch-grete/tmp/gzadmfnord
6       total

Reducing Filesystem Usage

After determining where your high usage is, there are several things that can be done to reduce it.

First, if the files/directories are no longer needed or are trivially regenerated, they can simply be deleted. In particular, the ~/.cache directory is a common cause of exceeded quotas in one’s HOME directory, since many tools cache files there and never clean up (conda is a frequent offender). It is often safe to outright delete the ~/.cache directory or any of its sub-directories. Temporary files also tend to accrue in the /scratch-*/tmp/USERNAME and /mnt/lustre-*/tmp/USERNAME directories, so they are good candidates for cleanup.
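For example, one way to see which parts of ~/.cache use the most space before deleting anything is shown below (the pip sub-directory in the second command is purely illustrative; check what is actually in your own ~/.cache first):

du -sch ~/.cache/*
rm -r ~/.cache/pip

The same approach works for the tmp directories mentioned above.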

Second, the files/directories could be moved somewhere else. Are you using your user’s data stores for data that should really be in the project’s data stores (projects get bigger quotas)? Another example is files/directories residing on a SCRATCH/WORK data store that won’t be used again for a long time, which could be moved to an ARCHIVE/PERM data store or downloaded to your own machine(s) to be re-uploaded later (obviously, special considerations must be taken for very large data sets). Yet another example would be moving files/directories needing less IO performance from an SSD-based SCRATCH/WORK data store to a larger HDD-based one.
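For example, moving a directory from your user directory on scratch-grete to a project directory might look like the following, where /path/to/project/dir is only a placeholder for the actual project path reported by show-quota:

mv /scratch-grete/usr/$USER/old_results /path/to/project/dir/

Note that if the source and destination are on different filesystems, mv has to copy the data and then delete the source, which can take a long time for large directories.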

Third, the files/directories could be compressed or bundled together in some way. Compression can reduce the block usage depending on the kind of data and the compression algorithm chosen, and can even improve read performance afterwards in some cases. Bundling many files/directories together can greatly reduce inode usage (e.g. packing a directory with a million files into a single TAR or ZIP file). Common methods include:

  • tar, zip, etc. for bundling and compressing files and directories (see the sketch after this list).
  • Changing file formats to ones that store the same data in less space. For example, PNG will almost always store an image in less space than BMP (most BMP images are either uncompressed or use a very weak compression algorithm).
  • Concatenating files and making an index of the byte offsets and sizes of each file to reduce the inode usage. A common example is concatenating many JPEG files into an MJPG (Motion JPEG) file, which is supported by many tools directly (e.g. ffmpeg can read them just like any other video format).
  • Putting the data from several files and/or directories in an HDF5 or NetCDF4 file with compression on the Datasets/Variables.
  • Converting directories of thousands of images into a multi-page image format (e.g. multi-page TIFF) or a video. Note that videos can use the similarities between successive frames to compress. Be careful to choose an appropriate video codec for your data if you do this. To avoid compounding losses (e.g. lossy JPEG frames re-encoded with a lossy video codec), you may want either a lossless codec (even if your original image frames are lossy), a codec put into its lossless mode, or a codec tuned to be low loss. Encoding and decoding speed can be important as well.
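As a rough sketch of the first method, a directory DIR full of small files could be bundled into a single compressed TAR file and the inode savings checked afterwards (DIR is a placeholder; only delete the original directory once you have verified the archive):

tar -czf DIR.tar.gz DIR
du --inodes -sch DIR DIR.tar.gz
rm -r DIR

On newer versions of GNU tar, a different compressor such as zstd (the --zstd option) can be chosen if gzip is too slow or compresses your data poorly.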
Tip

A directory with many thousands or millions of small files generally has terrible performance anyway, so bundling them up into a much smaller number of larger files can improve performance by an order of magnitude or more.