NewStorage

This page outlines the structure of the CBCB storage platform and provides general guidance on its usage.

Accessing CBCB Storage

The CBCB network storage platform is available to all UMIACS-supported hosts within CBCB. Directories on this platform are mounted on demand via Autofs, so a directory does not appear under /fs until it is first accessed.

For example:

$ ls /fs
$ ls /fs/cbcb-scratch
file1  file2

As shown above, the /fs/ directory will not be populated until the specific resource is accessed.

Directory Structure

/fs
├── cbcb-data        -- Common datasets. Limited write access.
├── cbcb-lab         -- Lab-specific storage.
├── cbcb-scratch     -- Project/user-specific storage.
└── cbcb-software    -- Software collection.
data
Stores read-only copies of common data sets used within the center. This centralized location ensures datasets remain pristine, and reduces the amount of duplicate data present in the system.
lab
Serves as a general storage location for each Lab/Group. Each Lab/Group receives its own directory and data quota.
scratch
Intended for general purpose intermediate data storage. Data in this directory is not backed up.
software
Stores precompiled software modules used throughout the center.
Note: Data in the scratch directory is not backed up and is not covered by snapshots.

Requesting An Allocation

To request an allocation within the CBCB Storage platform, please submit a request to staff@umiacs.umd.edu containing the following information:

  • Name of allocation.
  • Whether it is a personal or group allocation.
  • Initial size of the allocation.
  • What the allocation will be used for.
Note: There may be additional constraints for the allocation, as outlined below.

/fs/cbcb-data

Allocations in the CBCB data directory need to meet the following requirements:

  • Solely for the storage of data sets.
  • Data will be immutable once stored.

/fs/cbcb-lab

Allocations in the CBCB lab directory need to meet the following requirements:

  • Base allocation for each faculty member within the Center.
  • Increased allocations for additional investments in the underlying storage system.
  • Used by more than one user within the center.
  • Long-term storage of lab-specific data/code.

If your Lab/Group wants to make a larger investment in available storage, please contact staff for pricing.

/fs/cbcb-scratch

Individual users may be granted allocations in /fs/cbcb-scratch, limited to 1 TB each. Temporary exemptions to this limit may be granted for a period of no more than 90 days with approval from the director of CBCB or from a PI who has made a storage investment (within the space they have invested in).

At the end of the 90-day period, the allocation limit returns to 1 TB unless an extension is secured, and the user/group must clean up any data exceeding this limit within 1 week. If the allocation still exceeds 1 TB after this 1-week period, technical staff will remove data to bring it back within the limit.
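
One way to see whether a scratch allocation is close to its limit is to total its disk usage with du. The sketch below is illustrative only: the function names usage_kb and within_limit are hypothetical, the default limit treats 1 TB as 1024^3 KiB (an assumption; the platform's own quota accounting may count differently), and du can take a while on a large directory.

```shell
# Assumption: "1 TB" is treated as 1024^3 KiB. The platform's actual
# quota accounting may differ from what du reports.
LIMIT_KB=$((1024 * 1024 * 1024))

# usage_kb <directory>
# Print the total disk usage of <directory> in KiB.
usage_kb() {
    du -sk "$1" | awk '{print $1}'
}

# within_limit <directory> [limit_kb]
# Succeed (exit status 0) if usage is at or below the limit.
# The limit defaults to LIMIT_KB when no second argument is given.
within_limit() {
    used=$(usage_kb "$1")
    [ "$used" -le "${2:-$LIMIT_KB}" ]
}
```

For example, `within_limit /fs/cbcb-scratch/myproject || echo "over limit"`, where myproject is a placeholder allocation name.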

Where do I put my data?

This section outlines some general guidelines on where to store data.

Is this a data set that will be used by multiple users within the Center?
Store it in CBCB data (/fs/cbcb-data).
Is this lab-specific data critical to an ongoing research project?
Store it in CBCB lab (/fs/cbcb-lab).
Is this lab-specific or personal data that is non-critical or reproducible?
Store it in CBCB scratch (/fs/cbcb-scratch).
Is this personal code and files?
Store it in your CBCB home directory.

I accidentally deleted my data. What next?

Data existing in the following directories is backed up and protected by snapshots.

Note: Data recovery may not be possible if the file was created and deleted between backup cycles.

Snapshots

Snapshots are the first line of defense in data recovery. Similar to how a picture captures the state of a scene at a given time, a snapshot captures the state of data at a given time.

Snapshots can be accessed by changing (cd) into the .snapshot directory from any directory within the file share. The .snapshot directory contains one subdirectory for each available snapshot, and the name of each subdirectory indicates the date/time at which that snapshot was created. Look in the subdirectory most likely to contain the file(s) you want to recover, and copy them out of the .snapshot directory.

If you are unable to find your data in the .snapshot directory, please contact staff@umiacs.umd.edu to determine whether any further recovery options exist.
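
The recovery steps described in this section can be sketched as two small shell helpers. This is a minimal illustration, not an official tool: the function names find_in_snapshots and restore_from_snapshot are hypothetical, and the snapshot subdirectory names on the real system are whatever appears under .snapshot.

```shell
# find_in_snapshots <directory> <filename>
# Print the path of <filename> in every snapshot of <directory> that
# contains it. Prints nothing if no snapshot has the file.
find_in_snapshots() {
    dir=$1
    name=$2
    for snap in "$dir"/.snapshot/*/; do
        if [ -e "${snap}${name}" ]; then
            printf '%s\n' "${snap}${name}"
        fi
    done
}

# restore_from_snapshot <directory> <snapshot> <filename>
# Copy <filename> out of the named snapshot into <directory>.
# The copy is written as <filename>.restored so that a file currently
# in place is not overwritten.
restore_from_snapshot() {
    dir=$1
    snap=$2
    name=$3
    cp -p "$dir/.snapshot/$snap/$name" "$dir/$name.restored"
}
```

A typical sequence would be to run find_in_snapshots on the directory that held the file, pick one of the listed snapshots, and pass its name to restore_from_snapshot. Reviewing the .restored copy before renaming it avoids replacing newer work with an older version.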

Snapshot Schedule

The following table outlines the snapshot schedule and retention policy.

Snapshot Schedule
Name      Execution Time                        # Copies Stored
Hourly    8:00am, 12:00pm, 4:00pm, 8:00pm       4
Daily     Every day @ 12:00am                   2
Weekly    Sunday @ 12:00am                      1
  • At any given time there will be four hourly snapshots, two daily snapshots, and one weekly snapshot.