login.deepthought.umd.edu.
Here's an example of a simple script, which we'll call test.sh:

    #PBS -lwalltime=1:00
    #PBS -lnodes=4
    hostname
    date

The first two lines specify parameters to the scheduler. The first, walltime, specifies the maximum amount of time you expect your job to run. The walltime parameter is in the form HH:MM:SS. If you leave off any digits, the ones you provide will be assumed to be the smallest units available, for instance a walltime of 1:00 is equal to one minute. You should specify a reasonable estimate for this number, because if you specify too large of a number your job may not be scheduled appropriately, and if you specify too small of a number your job will be terminated before it completes.
The second line, nodes, tells the scheduler on how many cores you want your job to run. This method of specification doesn't care how those cores are distributed across machines or about how those machines are configured. If you want a more detailed method of specifying CPU/machine requirements, check out the examples section.
The remaining lines in the file are just standard commands; you will
replace them with whatever your job requires. In this case, once the
job runs, it will print the hostname and the time to the output file.
By default the script will be run in whatever shell you use to log in
to the cluster, so if your normal shell is tcsh then the
script will be run inside tcsh. If you want to change
this, check out the examples section.
To submit your job, pick a queue that fits your needs and pass it to qsub. We'll choose the queue serial for this test. (The serial queue is the default queue, but for this example we'll specify it anyway.)

    f20-l1:~: qsub -q serial test.sh
    4178.deepthought.umd.edu

The number that is returned to you is the identifier for the job, and you should use that anytime you want to find out more information about your job. For information on how to verify that your job is running, see the section Monitoring Your Jobs.
Once your job completes, unless you've specified otherwise, your
output and any errors that occur will be written to two files in the
same directory from which you submitted your job. The files will be
named with the same name as your job script, with .eNNNN
and .oNNNN appended, where the Ns are replaced by the job
identifier.
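The naming rule above can be sketched in shell. This is a minimal illustration, assuming the identifier printed by qsub has the form NNNN.deepthought.umd.edu as in the example; the function name output_files is hypothetical and not part of the cluster tools.

```shell
#!/bin/bash
# Hypothetical helper: given a job script name and the identifier printed by
# qsub, print the names of the output and error files described above.
output_files() {
    local script="$1"
    local jobid="$2"
    local num="${jobid%%.*}"   # keep only the part before the first dot
    echo "$script.o$num"
    echo "$script.e$num"
}

output_files test.sh 4178.deepthought.umd.edu
```

For the job submitted above, this prints test.sh.o4178 and test.sh.e4178.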
Note that by default when you log in to the cluster, you are sitting in your home directory, and all output and submissions will be transferred to and from your home directory. For best performance, you should consider running your jobs from a space set aside for them. See Files and Storage and the qsub example on Running Your Job in a Different Directory for more information.
Here's what you should see when your job completes:

    f20-l1:~: cat test.sh.o4178
    Warning: no access to tty (Bad file descriptor).
    Thus no job control in this shell.
    compute-2-39.deepthought.umd.edu
    Mon Jan 22 11:13:09 EST 2007
    f20-l1:~: cat test.sh.e4178
    term: Undefined variable.

As you can see in the output files above, the script ran and printed the hostname and date as specified by the job script. The few error messages that you see above are expected and can be ignored.
In addition to queue priorities, users of the cluster with paid
allocations (users that contribute money or resources to the cluster)
get priority over non-paying users. All users are provided with a
certain number of service units (SUs) as determined by the
HPCC Allocations and
Advisory Committee. In addition, "free" usage of the cluster is
provided to users with paid or non-paid allocations, assuming cycles
are available. "Free" jobs run at low priority and will be preempted
(evicted) if a higher-priority job comes along. To specify a queue,
use the -q option to qsub. Note:
paid users will also need to specify their high-priority account in
order to take advantage of their elevated priority. If no account is
specified, the default priorities will be used. See the section Job Accounting for information on how to
specify an alternate account.
The queues are as follows:
| queue | nodes (min) | nodes (max) | wallclock (min) | wallclock (max) | priority | notes |
|---|---|---|---|---|---|---|
| debug | | 2 | | 15 min | high | always available; use for interactive jobs |
| wide-debug | 5 | 100% | | 15 min | high | |
| narrow-med | | 20% | | 8 hr | med | |
| wide-short | 5 | 100% | | 2 hr | med | |
| narrow-long | | 20% | | 3 days | low | |
| narrow-extended | | 25% | | 2 weeks | low | paid allocations only |
| med-extended | 5 | 50% | | 1 week | low | paid allocations only |
| wide-med | 5 | 100% | | 8 hr | low | |
| ib | 2 | 100% | | 1 week | high | InfiniBand connected hosts |
| serial | | 100% | | unlimited | very low | free; preemptible |
The scheduler sets the environment variable
$PBS_NODEFILE, which contains the name of a file that
lists all of the nodes that you've been assigned.
OpenMPI reads the node list from $PBS_NODEFILE automatically, and as such you don't need to include it on the
command line. OpenMPI is also compiled to support all of the various
interconnect hardware, so for nodes with fast transport (InfiniBand/Myrinet),
the fastest interface will be selected automatically.
The following example will run the MPI executable alltoall on
a total of 40 cores. Note that you will need to add the command
tap -q openmpi-gnu to your .cshrc.mine file
to set up your environment properly to run OpenMPI. For further information
on the tap command check out the section Setting
Up Your Environment.

    #PBS -l nodes=40
    #PBS -l walltime=00:00:60
    mpirun -np 40 alltoall

The following example will run the MPI executable alltoall on
a total of 40 cores. Note that you will need to add the command
tap -q lam-gnu (or one of the other MPI flavors) to your
.cshrc.mine file to set up your environment properly to run
LAM. For further information on the tap command check out
the section Setting Up Your Environment.

    #PBS -l nodes=40
    #PBS -l walltime=00:00:60
    lamboot $PBS_NODEFILE
    mpirun C alltoall
    lamhalt

If you see errors in your output of the form "LAM failed to execute a
LAM binary on the remote node X", it is most likely because you failed
to add the appropriate tap command to your
.cshrc.mine file.
The following example will run the MPI executable alltoall on
a total of 40 cores. Note that you will need to add the command
tap -q mpich-gnu (or one of the other
MPI flavors) to your .cshrc.mine file to set up your
environment properly to run MPICH. For further information on the
tap command check out the section Setting
Up Your Environment.
Note also that if you've never run MPICH before, you'll need to create the file .mpd.conf in your home directory. This file should contain at least a line of the form MPD_SECRETWORD=we23jfn82933. (DO NOT use the example provided; make up your own secret word.)

    #PBS -l nodes=40
    #PBS -l walltime=00:00:60
    mpdboot -n 40 -f $PBS_NODEFILE
    mpiexec -n 40 alltoall
    mpdallexit


    #PBS -l nodes=40
    #PBS -l walltime=00:00:60
    foreach node (`cat $PBS_NODEFILE`)
        ssh $node hostname
    end

And if your shell is sh/ksh/bash, use this:

    #PBS -l nodes=40
    #PBS -l walltime=00:00:60
    for node in `cat $PBS_NODEFILE`; do
        ssh $node hostname
    done


    #PBS -l nodes=2:mhz3000
    #PBS -l walltime=00:00:60
    myjob

nodes=2, both of these jobs can be scheduled
simultaneously onto the same 4-core machine.
There are two different arguments available to specify how many cores and
separate physical nodes you are allocated. The nodes argument,
when specified by itself, specifies the number of cores you will
be allocated. If you specify nodes by itself, your job may all
fit onto a single physical node, or may be split across multiple nodes
depending on what's available.
If you know you'll need 12 cores, but don't care how they're distributed, try the following:

    #PBS -l nodes=12
    myjob

This might give you three 4-core nodes, or an 8-core node and a 4-core node, or even two 8-core nodes.
If you require that your cores are all allocated on the same physical node,
you can add the ppn argument. Specifying
nodes=x:ppn=y says that you want x physical nodes with at least
y cores per node.
If you know you'll need at least four 4-core nodes (16 cores total), try the following:

    #PBS -l nodes=4:ppn=4
    myjob

This might give you four 4-core nodes, or even four 8-core nodes, depending
on what's available. Note that if you're using MPI, and you get larger nodes
than you've requested, the mpirun command will pack more of your
jobs onto the larger nodes leaving some nodes empty. For this reason, using
the ppn argument is not recommended when using MPI.
The nodes and ppn arguments can be confusing, so
you should check to make sure you're getting what you want, and not allocating
more nodes than you need. Your best bet is to specify only the
nodes argument, and let the scheduler pick the appropriate
resources for you.

    #PBS -l nodes=1
    #PBS -l mem=1024mb
    myjob

This example requests a single node with at least 4 processors and at least
1GB (1024MB) of memory total. Note that the mem parameter
specifies the total amount of memory required across all of your
allocation. So if you end up with multiple nodes allocated, this memory
figure will be divided across all of them.
If you want to request a specific amount of memory on a per-core basis, use the following:

    #PBS -l nodes=4
    #PBS -l pmem=1024mb
    myjob

This example requests at least 1GB per core, on four cores, with a total of 4GB memory requested for the entire job.
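The difference between mem and pmem comes down to simple arithmetic, sketched below. The function names are hypothetical, and the even split of a mem request across cores is an assumption made for illustration; the scheduler may apportion memory differently.

```shell
#!/bin/bash
# Total memory implied by a per-core (pmem) request: cores * pmem.
total_mem_mb() {
    local cores="$1" pmem_mb="$2"
    echo $(( cores * pmem_mb ))
}

# Per-core share of a total (mem) request, ASSUMING an even split.
per_core_mem_mb() {
    local cores="$1" mem_mb="$2"
    echo $(( mem_mb / cores ))
}

total_mem_mb 4 1024      # nodes=4 with pmem=1024mb
per_core_mem_mb 4 1024   # nodes=4 with mem=1024mb
```

The first call prints 4096 (the 4GB total mentioned above); the second prints 256, showing how much smaller a mem request of the same size is per core.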
Most of the nodes currently have at least 30GB of scratch space, some have as much as 250GB available, and a few have as little as 1GB available. Scratch space is currently mounted as /tmp. Scratch space will be cleared once your job completes.
The following example specifies a scratch space requirement of 5GB. Note however that if you do this, the scheduler will set a filesize limit of 5GB. If you then try to create a file larger than that, your job will automatically be killed, so be sure to specify a size large enough for your needs.

    #PBS -l nodes=1
    #PBS -l file=5gb
    myjob

If you want to be notified via email when your job completes, you can
add the -mXX option to your description file. If you want
to receive mail when the job starts, replace the Xs with the
letter b. If you want to receive mail when your job
completes, replace the Xs with the letter e. You
may add both letters if you like, and you'll get two email messages. By
default, you will always be sent email if your job is aborted by the
scheduler for any reason. The completion email will tell you the exit
status of your job as well as the amount of resources the job
consumed. Note that the CPU time and memory usage numbers provided in
this email are unreliable at best. The email messages by default will
be sent to your Glue account. If you'd like them to go elsewhere, you
can add the -M option followed by a comma-separated list
of usernames.

    #PBS -l walltime=00:00:60
    #PBS -mbe -Mbob@myhost.com,jane@yourhost.com
    date

-S option to your description file. Also note that when
using the bash shell, you must explicitly run your
.profile script, as it is not run for you automatically.
If you have tap commands in your submit script, this is
especially important because tap is defined in
.profile. If you're using tcsh you don't
need to worry about this.
The following example changes to using /bin/bash as the
execution shell.

    #PBS -lwalltime=00:01:00
    #PBS -S /bin/bash
    . ~/.profile # only needed for bash shell
    date
    hostname

/data/dt-raid5/bob/my_program when you submit your
job, when the job runs, it will look in your home directory for any
files that don't have a full pathname specified. To change this
behavior, you'll need to add the -d argument to your job
description file.
Also note that if you are using MPI, you may also need to add either
the -wd option for LAM (mpirun) or the
-wdir option for MPICH (mpiexec) to specify the
working directory.
The following example (using LAM) switches the working directory to
/data/dt-raid5/bob/my_program.

    #PBS -lwalltime=00:01:00
    #PBS -d /data/dt-raid5/bob/my_program
    lamboot $PBS_NODEFILE
    mpirun -wd /data/dt-raid5/bob/my_program C alltoall
    lamhalt

To specify your estimated runtime, use the walltime
parameter. This value should be specified in the form
HHH:MM:SS. Note that if your job is expected to run over
multiple days, simply convert the number of days into hours. For
example, a 3-day job would have a walltime value of 72:00:00.
You may leave off the leading digits if you like, so a walltime of
15:00 will represent 15 minutes. Note also that while the
scheduler may show walltimes in the form DD:HH:MM:SS when you
view the queue status, this format will not be accepted when you
submit a job.
If you do not specify a walltime, the default (maximum) permitted walltime for the queue will be used. See the section entitled Choosing a Queue for more information on queues and their assigned limits.
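The walltime rules described above can be checked with a small shell sketch. The two function names are hypothetical helpers, not cluster commands; they only mirror the stated rules (missing leading fields are treated as zero, and days must be converted to hours).

```shell
#!/bin/bash
# Convert a walltime string (SS, MM:SS or HHH:MM:SS) to seconds.
walltime_to_seconds() {
    local IFS=:
    local fields total=0 f
    read -r -a fields <<< "$1"
    for f in "${fields[@]}"; do
        total=$(( total * 60 + 10#$f ))   # 10# avoids octal parsing of "08"
    done
    echo "$total"
}

# Convert a whole number of days to the HHH:MM:SS form accepted at submission.
days_to_walltime() {
    printf '%d:00:00\n' $(( $1 * 24 ))
}

walltime_to_seconds 15:00
days_to_walltime 3
```

The first call prints 900 (15 minutes) and the second prints 72:00:00, matching the 3-day example above.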
The following example specifies a walltime of 60 seconds, which should be more than enough for the job to complete.

    #PBS -l nodes=1
    #PBS -l walltime=00:00:60
    hostname

tap openmpi,
tap openmpi-intel, tap openmpi-pgi)
will automatically use InfiniBand if it is available.
When using OpenMPI, if you want to force your code to only use
InfiniBand, add the argument --mca btl openib,self to your
mpirun command. If you want to force your code to only use
the TCP transport, instead add --mca btl tcp,self. (If you
really want just TCP, though, please run your job in a queue other than the
ib queue.)
To check the status of your job, use the command showq. For example:

    f20-l1:~: showq
    ACTIVE JOBS--------------------
    JOBNAME USERNAME STATE PROC REMAINING STARTTIME
    4178 kevin Running 4 00:01:00 Mon Jan 22 11:13:09
    1 Active Job 4 of 236 Processors Active (1.69%)
    1 of 59 Nodes Active (1.69%)
    IDLE JOBS----------------------
    JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
    0 Idle Jobs
    BLOCKED JOBS----------------
    JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
    Total Jobs: 1 Active Jobs: 1 Idle Jobs: 0 Blocked Jobs: 0

If your job shows up in the ACTIVE JOBS section as shown above,
your job should be off and running.
If your job shows up in the
IDLE JOBS section, that means that there currently are
insufficient resources available to run your job. Check to make sure
you haven't requested more processors than you need, and that you've
specified a reasonable walltime. If you see lots of jobs in the
ACTIVE JOBS section, it's probable that you'll just need to
wait for someone else's job to finish before yours can start.
If your job shows up in the BLOCKED JOBS section, it most
likely means that you did not have a sufficient amount of time
remaining in your CPU allocation to run the job. Either specify a
smaller walltime, or obtain an additional allocation. See the
section Diagnosing Job Problems for further
information.
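As a rough illustration of reading the three sections, the sketch below reports which section of showq-style output a job appears in. The function name job_section is hypothetical, and the parsing assumes the layout shown in the sample output above (section header lines beginning with ACTIVE JOBS, IDLE JOBS, or BLOCKED JOBS, and the job identifier in the first column); real output may differ.

```shell
#!/bin/bash
# Hypothetical helper: read showq-style text on stdin and print the section
# (ACTIVE, IDLE or BLOCKED) in which job $1 is listed, or NOTFOUND.
job_section() {
    awk -v id="$1" '
        /^ACTIVE JOBS/  { section = "ACTIVE" }
        /^IDLE JOBS/    { section = "IDLE" }
        /^BLOCKED JOBS/ { section = "BLOCKED" }
        $1 == id        { print section; found = 1; exit }
        END             { if (!found) print "NOTFOUND" }
    '
}

# Sample input copied from the showq output above.
job_section 4178 <<'EOF'
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
4178 kevin Running 4 00:01:00 Mon Jan 22 11:13:09
IDLE JOBS----------------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
BLOCKED JOBS----------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
EOF
```

On the cluster the same function could be fed live data with showq piped into job_section, although very small job numbers may collide with the summary lines (such as "1 Active Job"), so treat this as a sketch only.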
To find out more detailed information about your job, use the
checkjob command. This command will show you which
specific nodes were allocated to your job, and it will also show you
the job requirements you specified when you submitted the job.

    f20-l1:~: checkjob 4209
    checking job 4209
    State: Running
    Creds: user:kevin group:wheel account:kevin class:serial qos:serial
    WallTime: 00:00:00 of 00:01:00
    SubmitTime: Tue Jan 23 10:33:55
    (Time Queued Total: 00:00:01 Eligible: 00:00:01)
    StartTime: Tue Jan 23 10:33:56
    Total Tasks: 1
    Req[0] TaskCount: 1 Partition: DEFAULT
    Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
    Opsys: [NONE] Arch: [NONE] Features: [prod]
    Allocated Nodes:
    [compute-2-39.deeptho:1]
    IWD: [NONE] Executable: [NONE]
    Bypass: 0 StartCount: 1
    PartitionMask: [ALL]
    Flags: RESTARTABLE PREEMPTEE PREEMPTOR
    Attr: PREEMPTEE
    Reservation '4209' (00:00:00 -> 00:01:00 Duration: 00:01:00)
    PE: 1.00 StartPriority: 200

If you want to view the output of your job while it is running, you
can use the command qpeek. This command can be used to view
both the standard output and standard error streams from your job, and
can also be used to follow the output as it occurs.

    f20-l1:~: qpeek
    qpeek: Peek into a job's output spool files
    Usage: qpeek [options] JOBID
    Options:
    -c Show all of the output file ("cat", default)
    -h Show only the beginning of the output file ("head")
    -t Show only the end of the output file ("tail")
    -f Show only the end of the file and keep listening ("tail -f")
    -

To cancel a job, use the canceljob
command.

    f20-l1:~: canceljob 7274
    job '7274' cancelled

To check your account balance, use the command mybalance.

    f20-l1:~: mybalance
    Project  Machines Balance
    -------- -------- --------
    test     ANY      12093571
    test-hi  ANY      71999976

This shows you the balance remaining (in seconds) in all of the accounts to which you are authorized to charge. The account with the -hi suffix is your high-priority account.
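Because the balance is reported in seconds, it can be easier to read in hours. The sketch below is a hypothetical helper (balance_hours is not a cluster command) that assumes the two header lines and three-column layout shown in the sample mybalance output; it truncates to whole hours.

```shell
#!/bin/bash
# Hypothetical helper: read mybalance-style text on stdin and print each
# project with its balance converted from seconds to whole hours.
balance_hours() {
    # Skip the two header lines; column 1 is the project, column 3 the balance.
    awk 'NR > 2 { printf "%s %d\n", $1, $3 / 3600 }'
}

# Sample input copied from the mybalance output above.
balance_hours <<'EOF'
Project  Machines Balance
-------- -------- --------
test     ANY      12093571
test-hi  ANY      71999976
EOF
```

For the sample balances this prints "test 3359" and "test-hi 19999".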
To submit jobs to an account other than your default
(standard-priority) account, use the -A option to
qsub.

    f20-l1:~: qsub -A test-hi test.sh
    4194.deepthought.umd.edu

Use the checkjob command to get more
information about why your job isn't running.

    f20-l1:~: checkjob 4195
    [ ... deleted for brevity ... ]
    job is deferred. Reason: NoResources (cannot create reservation for job '4195' (intital reservation attempt))
    Holds: Defer (hold reason: NoResources)
    PE: 232.00 StartPriority: 200
    cannot select job 4195 for partition DEFAULT (job hold active)

In this example, we see that the job was deferred because there are insufficient resources available to run the job. Once sufficient resources become available, the job will run automatically.
If, instead, you see the following as part of the checkjob output, it means that the job you are trying to run will exceed the allocation you have remaining. This may simply be because you did not specify a walltime as part of your job specification. If your specifications are correct, you can resubmit your job to your standard-priority account or to the free serial queue, or you can request an additional allocation from the committee.

    f20-l1:~: checkjob 4204
    [ ... deleted for brevity ... ]
    job is deferred. Reason: BankFailure (cannot debit job account)
    Holds: Defer (hold reason: BankFailure)
    PE: 32.00 StartPriority: 200
    cannot select job 4204 for partition DEFAULT (job hold active)

If none of the above conditions apply, and your job is listed in the IDLE JOBS section, keep the following in mind:
The showq command lists jobs in
priority order, with the highest-priority jobs listed first.
If you need shell access to additional nodes, provided some are
available you can ask the scheduler to assign them to you with
qsub -I. Assuming your requirements are met, you will be
given a shell on the first node, and on that node,
$PBS_NODEFILE will be set to the name of a file
containing the list of nodes to which you now have access. You can
then ssh to and between any of the nodes in that list, and you can
also ssh to all of your assigned nodes from the login nodes.
For example, if you want to request two separate nodes, try this:

    f20-l1:~: qsub -lnodes=2 -lwalltime=00:15:00 -I
    qsub: waiting for job 4216.deepthought.umd.edu to start
    qsub: job 4216.deepthought.umd.edu ready
    DISPLAY not set.
    compute-2-39:~: cat $PBS_NODEFILE
    compute-2-39.deepthought.umd.edu
    compute-2-39.deepthought.umd.edu
    compute-2-39.deepthought.umd.edu
    compute-2-39.deepthought.umd.edu
    compute-2-38.deepthought.umd.edu
    compute-2-38.deepthought.umd.edu
    compute-2-38.deepthought.umd.edu
    compute-2-38.deepthought.umd.edu
    compute-2-39:~: ssh compute-2-38 date
    Tue Jan 23 11:22:48 EST 2007

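In the listing above, each host appears in the node file once per allocated core. A minimal sketch for summarizing such a file follows; cores_per_node is a hypothetical name, and the temporary file only stands in for $PBS_NODEFILE so that the example can run outside a job.

```shell
#!/bin/bash
# Hypothetical helper: print each host in a node file with the number of
# times it is listed (one entry per allocated core).
cores_per_node() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Stand-in for $PBS_NODEFILE, with the same contents as the listing above.
nodefile="$(mktemp)"
for host in compute-2-39 compute-2-38; do
    for i in 1 2 3 4; do
        echo "$host.deepthought.umd.edu"
    done
done > "$nodefile"

cores_per_node "$nodefile"
rm -f "$nodefile"
```

This prints one line per host, here showing 4 entries each for compute-2-38 and compute-2-39. Inside a running job you would call cores_per_node "$PBS_NODEFILE" instead of building a sample file.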
PATH.
Your account as provided gives you access to the basic tools needed to
submit and monitor jobs, access basic GNU compilers, etc. It is
HIGHLY suggested that you DO NOT remove or modify the dot files
(.cshrc, .profile, etc.) that are provided
for you. Instead, add any customizations you need to the alternate
set of files described here.
If you choose to modify the system default files, you run the risk of
losing any systemwide changes that are necessary to keep your account
running smoothly.
For packages that are not included in your default environment, the
tap command is provided. When run, this command will
modify your current session by adding the appropriate entries to your
PATH, MANPATH, LD_LIBRARY_PATH
and will set any other variables necessary to ensure the proper
functioning of the package in question. Note that these changes are
temporary and only exist until you log out. If you want to have
tap run for you automatically, add the command tap
-q <package> to your .cshrc.mine file. (The
-q argument prevents tap from displaying any
text output when it runs, which can confuse some shells.) If
you run the tap command without any arguments, it will
provide a list of available packages. Note that many of these
packages are not accessible on the cluster by default. If you want
access to them, let us know and, if possible, we'll make them available.
For example, if you want to run Matlab, you'll want to do the
following. Notice that Matlab is not available until after the
tap command has been run.

    f20-l1:~: matlab
    matlab: Command not found.
    f20-l1:~: tap matlab
    ----------------------------------------------------------------------
    This is a shortcut to the default version of Matlab available
    on your platform.
    Run command "matlab" to start up the program,
    or "matlab -h" to see various command-line options.
    There may be other versions of Matlab available. Please check
    the Dash/KDE menu for specific versions of Matlab.
    ----------------------------------------------------------------------
    f20-l1:~: matlab
    < M A T L A B >
    Copyright 1984-2005 The MathWorks, Inc.
    Version 7.0.4.352 (R14) Service Pack 2
    January 29, 2005
    To get started, type one of these: helpwin, helpdesk, or demo.
    For product information, visit www.mathworks.com.
    >>

| Package Name | Description |
|---|---|
| ansys100 | Ansys 10.0 |
| aria | Aria 2.2 (built against CNS 1.21, GNU compilers) |
| blast | BLAST 2.2.18 |
| blcr | BLCR 0.8.1 |
| cap3 | CAP3 compiled with Intel compilers |
| clustalw | ClustalW 1.83 compiled with Intel compilers |
| cns | CNS 1.2 compiled with Intel compilers |
| cns121 | CNS 1.21 compiled with GNU compilers |
| fftw | FFTW 2.1.5 (with MPI extensions, built with lam-gnu) |
| fftw-intel-openmpi | FFTW 2.1.5 (with MPI extensions, built with openmpi, intel) |
| fftw32 | FFTW 3.2 (with MPI extensions, built with lam-gnu) |
| garli | GARLI 0.951 compiled with Intel compilers (single process version) |
| garli-mpi | GARLI 0.942 compiled with Intel compilers and OpenMPI (MPI version) |
| gromacs | GROMACS version 3.3.3 compiled with Intel compilers |
| gsl | GNU Scientific Library version 1.8 |
| haddock | Haddock 2.0 (built against CNS 1.21, GNU compilers) |
| hdf | HDF 4.2r1 |
| hdf5 | HDF5 1.6.5 |
| intel | Intel Compilers 10.1.008, MKL 10.0.011 |
| intel-mpi | Intel MPI 3.1 - note that this does NOT work with InfiniBand |
| java | Java 1.5.0_11 |
| java6 | Java 1.6.0_04 |
| lam-gnu | LAM 7.1.2 compiled with Gnu compilers |
| lam-intel | LAM 7.1.2 compiled with Intel compilers |
| lammps | Large-scale Atomic/Molecular Massively Parallel Simulator (Sandia), Version 9Jan2009/19May2009 compiled with Gnu compilers, OpenMPI |
| lapack | LAPACK 3.1.0 (this is the reference implementation, non-optimised). It includes the BLAS library as well. |
| lapack32 | LAPACK 3.2.1 (this is the reference implementation, non-optimised). It includes the BLAS library as well. |
| lucy | lucy 1.19 compiled with Intel compilers |
| mangled-mesa | Mesa v7.4.4, with symbols mangled |
| mathematica60 | Mathematica 6.0 |
| matlab | Matlab 7.0.4 |
| matlab2007b | Matlab 7.5.0 |
| matlab2008b | Matlab 2008b (7.7.0) |
| mesa | Mesa v7.4.4, standard version |
| modeltest | modeltest 3.7 compiled with Intel compilers also includes MrModeltest 2.2 |
| mpich-gnu | MPICH 1.0.4p1 compiled with Gnu compilers |
| mpich-intel | MPICH 1.0.4p1 compiled with Intel compilers |
| mrbayes | MrBayes 3.1.2 compiled with Intel compilers (single process version) |
| mrbayes-mpi | MrBayes 3.1.2 compiled with Intel compilers and OpenMPI (MPI version) |
| muscle | MUSCLE 3.6 compiled with Intel compilers |
| namd | NAMD 2.6 |
| nwchem | nwchem 5.1.1 compiled with Intel compilers, openmpi. For use by abinitio group only |
| openmpi-gnu | OpenMPI 1.2.5 compiled with Gnu compilers |
| openmpi-intel | OpenMPI 1.2.5 compiled with Intel compilers |
| openmpi-pgi | OpenMPI 1.2.5 compiled with PGI compilers |
| netcdf | NetCDF 3.6.1 |
| paml | PAML 4b compiled with Intel compilers |
| povray | POV-Ray 3.6 |
| qespresso | Quantum Espresso 4.0.5 (with intel/mkl, openmpi) |
| R | R 2.4.1 |
| R280 | R version 2.8.0 |
| RAxML | RAxML phylogenetic software. "serial" version, intel compiler, openMP parallel version, intel compiler, openMPI |
| sctk | NIST Speech Recognition Scoring Toolkit v2.3 |
| siesta | siesta: Ab initio electronic dynamics program, v 2.0.2. abinitio group only |
| silo | Silo Scientific Data File Format Library v 4.7 |
| snack | Snack Sound Toolkit v2.2.10 |
| speech_tools | Edinburgh Speech Tools Library v1.2.96-beta |
| unio08 | UNIO'08 NMR Data Analysis Tool |
| vasp46 | VASP 4.6, licensed to abinitio group only. Intel + MPICH |
| vasp46vtst | VASP 4.6 with VTST 2.03d extensions, licensed to abinitio group only. Intel + MPICH |
| vasp522 | VASP 5.2.2, licensed to abinitio group only. Intel + OpenMPI |
| vasp522vtst | VASP 5.2.2 with VTST 2.03d extensions, licensed to abinitio group only. Intel + OpenMPI |
| visit | VisIt scientific visualization and graphical analysis package |
| vtk500c | Visualization Toolkit v5.0.0c (mainly for VisIt) |
| xplor-nih | xplor-nih 2.19 |
All of the filesystems listed here are network filesystems. You will get
much better performance if you use the local scratch filesystem on each
compute node (/tmp) for your computations, and then copy
your final results to the network filesystems. See the qsub example that
requests scratch space for instructions on how to specify how much
scratch space your job needs. Keep in mind, however, that any data
written to /tmp is not backed up and will be automatically
removed once your job completes, so be sure to save what you need.
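One way to follow this advice is to do the work in a private directory under /tmp and copy the results out before the job ends. The sketch below is an illustration only: run_in_scratch is a hypothetical helper, the results directory is a temporary stand-in (on the cluster it would be a directory you own under one of the network filesystems), and the date command stands in for a real computation.

```shell
#!/bin/bash
# Hypothetical helper: run a command inside a private scratch directory,
# then copy everything it produced to a results directory and clean up.
run_in_scratch() {
    local results_dir="$1"; shift
    local scratch_dir status
    scratch_dir="$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX")" || return 1
    ( cd "$scratch_dir" && "$@" )      # run the command in scratch space
    status=$?
    mkdir -p "$results_dir" && cp -R "$scratch_dir"/. "$results_dir"/
    rm -rf "$scratch_dir"              # scratch data is not kept
    return "$status"
}

# Demo: the results directory here is a temporary stand-in.
demo_results="$(mktemp -d)"
run_in_scratch "$demo_results" sh -c 'date > output.txt'
ls "$demo_results"
rm -rf "$demo_results"
```

The helper returns the exit status of the command it ran, so a failing computation is still visible to the job script after the copy.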
Because much of the data generated on the cluster is of a transient nature and because of its size, data stored in the /data partitions is not backed up. This data resides on RAID protected filesystems, however there is always a small chance of loss or corruption. If you have critical data that must be saved, be sure to copy it elsewhere.
There are several general purpose areas that are intended for storage of computational data. These areas are accessible to all users of the cluster and as such you should be sure to protect any files or directories you create there. See Securing Your Data for more information.
The areas are:
| Path | Filesystem Type | Approximate Size | Available to |
|---|---|---|---|
| /data/dt-raid5 | NFS on RAID5 | 500GB | all Deepthought users |
| /data/dt-raid10 | NFS on RAID10 | 500GB | all Deepthought users |
| /data/dt-vol6 | NFS on RAID5 | 3TB | all Deepthought users |
| /data/dt-vol0 | NFS on RAID5 | 2TB | members of the ME group only |
| /data/dt-vol1 | NFS on RAID5 | 2TB | members of the ME group only |
| /data/dt-vol2 | NFS on RAID5 | 1.7TB | members of the CLFS group only |
| /data/dt-vol3 | NFS on RAID5 | 1.7TB | members of the CLFS group only |
| /data/dt-vol4 | NFS on RAID5 | 1.7TB | members of the CLFS group only |
| /data/dt-vol5 | NFS on RAID5 | 1.7TB | members of the CLFS group only |
Please remember that you are sharing these filesystems with other
researchers and other groups. If you have data residing there that
you don't need, please remove it promptly. If you know you are going
to create large files, make sure there is sufficient space available
in the filesystem you are using. You can check this yourself with
the df command:

    f20-l1:~: df -h /data/dt-vol0
    Filesystem Size Used Avail Use% Mounted on
    g20-fs1.deepthought.umd.edu:/export/data/vol0
    1.8T 850G 888G 49% /a/g20-fs1/data/dt-vol0

This output shows that there are currently 888 GB of free space
available on /data/dt-vol0.
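The same check can be scripted before a job writes large files. The sketch below uses the POSIX options df -P (one line per filesystem, which avoids the wrapped line seen above) and -k (sizes in kilobytes); the function names avail_kb and has_space are hypothetical, and /tmp with a 1024 KB threshold is used only so the example runs anywhere.

```shell
#!/bin/bash
# Print the available space, in kilobytes, on the filesystem holding a path.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 kilobytes are free on the filesystem holding $1.
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

if has_space /tmp 1024; then
    echo "enough space"
else
    echo "not enough space"
fi
```

On the cluster you would pass the data directory you plan to write to, for example /data/dt-vol0, and a threshold sized for your output.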
If you have a Glue account and you want to share your data back and
forth with that account, you can access it at
/glue_homes/<username>. Note that you cannot have
jobs read or write directly from your Glue directory; you'll need to
copy data back and forth by hand as needed.
Policies regarding usage of Disk Space on HPCC
Your home directory as configured is private and only you have access to it. Any directories you create outside your home directory are your responsibility to secure appropriately. If you are unsure of how to do so, please contact hpcc-help@umd.edu for additional assistance.
If you're a member of a group, you'll want to make sure that you give
your group access to these directories, and you may want to consider
setting your umask so that any files you create automatically have
group read and write access. To do so, add the line umask
002 to your .cshrc.mine file.
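To see what the umask setting changes, the following sketch creates a file under two different umask values and prints the resulting permission string. It is written for sh/bash, and perms_with_umask is a hypothetical helper; the umask command itself takes the same argument in tcsh.

```shell
#!/bin/bash
# Create an empty file under the given umask and print its permission string.
perms_with_umask() {
    local dir
    dir="$(mktemp -d)" || return 1
    # The subshell keeps the umask change from affecting the calling shell.
    ( umask "$1" && : > "$dir/demo.txt" && ls -l "$dir/demo.txt" | cut -c1-10 )
    rm -rf "$dir"
}

perms_with_umask 022   # common default: no group write access
perms_with_umask 002   # group read and write access
```

With a umask of 022 the new file is -rw-r--r--, while 002 gives -rw-rw-r--, which is the group read and write access described above.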