Data Storage: Difference between revisions


Revision as of 17:20, 25 October 2024

This is a landing page for all topics related to data storage that are available at UMIACS. It is under active development.

Where can I store my data?

Before choosing where to store your data, consider how you may need to interact with that data in the short term and in the long term, and with whom you might need to share it. You can copy data between the different types of data storage listed below, but doing so may be unnecessarily cumbersome if you don't choose the right place for the "master copy" of your data.

  • Local data storage is best suited for data that you are actively working on. Examples include code that you are developing or using (perhaps to run computational jobs in SLURM), results from computational jobs that have finished but have not yet been published in a paper, and files that need to be processed by another desktop-based or server-based application.
    Data on UMIACS' local storage can be moved between different UMIACS-supported hosts and shared with other UMIACS account holders using common methods such as cp, File Explorer, Finder, and more.
  • UMIACS' Object Store is best suited for data that will remain static but needs to stay accessible for very long periods of time, such as data referenced by published papers. It is also suitable for transferring specific versions of data in and out of UMIACS as single large files, such as archive (tar, zip, etc.) files.
    Data can be moved between UMIACS' Object Store and local storage (whether UMIACS' or otherwise) via the Object Store's built-in web interface or one of many compatible clients. Data can be shared publicly via simple download links or static websites that can visualize the data in a more accessible way. UMIACS account holders can sponsor Collaborator accounts for external collaborators who may need to upload and download new versions of data without needing direct access to UMIACS-supported hosts.
  • Cloud data storage is best suited for collaborative data, such as simple or rich text documents, PDF forms, spreadsheets, presentations, pictures, videos, and more. Many cloud storage service providers also offer web-based apps attached to their storage service that can be used to edit these types of data without ever having to download them to a specific device.
    The methods that can be used to move data between cloud storage and local storage vary based on the specific service provider, but often involve web-based or mobile app-based upload and download. Data can be shared with others across the web via accounts associated with that service provider, and often publicly via simple download links.
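As an illustration of the "single large file" approach mentioned for the Object Store, the following is a minimal Python sketch that packs a directory into one gzip-compressed tar archive and computes a SHA-256 checksum that can be compared after upload or download. The function name bundle_directory and the example paths are hypothetical and chosen only for illustration; the sketch uses only the Python standard library and does not perform the upload itself.

```python
import hashlib
import tarfile
from pathlib import Path


def bundle_directory(src_dir: str, archive_path: str) -> str:
    """Pack src_dir into one gzip-compressed tar file and return its SHA-256.

    archive_path should be located outside src_dir so the archive
    does not try to include itself.
    """
    src = Path(src_dir)
    # arcname keeps only the directory's own name inside the archive,
    # rather than the full path on the host it was created on.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(str(src), arcname=src.name)

    # Hash the archive in 1 MiB chunks so large files need not fit in memory.
    digest = hashlib.sha256()
    with open(archive_path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical paths): python bundle.py results results.tar.gz
    if len(sys.argv) == 3:
        print(bundle_directory(sys.argv[1], sys.argv[2]))
```

Recording the checksum next to the archive gives anyone who later downloads it a simple way to confirm that the file arrived intact.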

How can I transfer my data?

  • Local data storage is typically best transferred using commands or programs available in the operating system where the data is stored or from which it is most commonly accessed. A number of these commands are covered here.
  • Cloud data storage transfer methods vary based on the service provider. Most major providers offer simple upload/download functionality, which works well for individual files or relatively small numbers of files and folders. For bulk transfers, Rclone is one program that UMIACS staff often use, as it is compatible with many major providers.
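For bulk transfers with Rclone, a small wrapper can make it harder to start a large copy by accident. The sketch below builds an "rclone copy" command as an argument list and defaults to Rclone's --dry-run mode, which reports what would be transferred without copying anything. The remote name "mydrive:" and both paths are hypothetical; a real remote would first need to be set up with "rclone config", and this is only one possible way to invoke the tool, not necessarily how UMIACS staff use it.

```python
import shlex
import subprocess
from typing import List


def build_rclone_copy(source: str, dest: str, dry_run: bool = True) -> List[str]:
    """Build an `rclone copy` command as an argument list."""
    cmd = ["rclone", "copy", source, dest, "--progress"]
    if dry_run:
        # --dry-run lists what would be transferred without copying anything.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # "mydrive:" is a hypothetical remote name; replace it with a remote
    # that has been configured via `rclone config`.
    cmd = build_rclone_copy("/path/to/project", "mydrive:project-backup")
    print(shlex.join(cmd))
    # Uncomment to actually run the command (requires rclone to be installed):
    # subprocess.run(cmd, check=True)
```

Reviewing the dry-run output first, and only then calling the function with dry_run=False, is a cautious pattern for any large transfer.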

How is my data retained?

  • Local data storage is retained by UMIACS staff in a number of different ways, namely:
    Snapshots: Point-in-time copies of specific file systems, easily accessible for quick restores. Taken more often than daily and retained for up to a week (see page for more details).
    NightlyBackups: Daily copies of specific file systems, sent to a backup server managed by staff. Taken daily and retained for up to 90 days.
    Archives: Final copies of specific types of data, stored on an archive server managed by staff. Taken when specific types of data are decommissioned and retained for up to 5 years (see page for more details).
  • Cloud data storage backup policies vary based on the service provider. You can often buy additional storage or storage protection for a set price per renewal period. Some providers also offer a Trash folder that keeps accidentally deleted items for some period of time, often 30 days.
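The local retention windows above can be read as a simple lookup: given how long ago a file was last in a known good state, which mechanisms might still hold a copy? The following sketch encodes only the upper bounds stated above (a week for Snapshots, 90 days for NightlyBackups). The function name is hypothetical, actual retention may be shorter, and only file systems covered by each mechanism are included, so treat the result as a starting point for a conversation with staff rather than a guarantee. Archives are left out because they apply only to decommissioned data.

```python
from datetime import date
from typing import List

# Upper bounds taken from the descriptions above; actual retention may be
# shorter and applies only to the specific file systems that are covered.
RETENTION_DAYS = {"Snapshots": 7, "NightlyBackups": 90}


def possible_restore_sources(last_good: date, today: date) -> List[str]:
    """Return the mechanisms whose retention window could still cover last_good."""
    age_days = (today - last_good).days
    return [
        name
        for name, limit in RETENTION_DAYS.items()
        if 0 <= age_days <= limit
    ]


if __name__ == "__main__":
    # A file that was last known to be good 30 days before "today".
    print(possible_restore_sources(date(2024, 9, 25), date(2024, 10, 25)))
```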

What are some data storage best practices?