Data Transfer

From UMIACS
Revision as of 20:27, 28 May 2012 by Ridge (talk | contribs)

Which command you should use depends on how much data you are transferring. The classic cp command has a number of edge cases in which it will not copy everything that is expected, so use cp only when the copy can be easily verified, such as when you are copying a single file. For larger transfers, tar/gtar is the better choice. After transferring files or directories, you can use rsync to check that everything was copied correctly; it will also update any files that have changed.

Transfer a single file

If you want to copy a single file into another directory, you can use the cp command. The cp command takes the file you name and makes a copy of it in the directory you specify. If the target is a plain file name rather than a directory, cp creates the copy under that name in the current directory.

The format for the cp command is: cp file /target/

Example: To transfer the file foo.txt from your home directory into Documents

cd

cp foo.txt /nfshomes/YOURUSERNAME/Documents

The end result will leave the original file in the home directory and create a copy of it in Documents.

Important: The cp command should only be used for small file transfers. If you try to transfer a large amount of data, cp may not copy all of the files properly.
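Since cp is recommended only when the copy can be verified, the following is a minimal sketch of one way to check a single-file copy with cmp. The scratch directory and file contents are made up for illustration; substitute your own paths.

```shell
# Hypothetical scratch area standing in for your home directory.
work=$(mktemp -d)
mkdir "$work/Documents"
echo "sample contents" > "$work/foo.txt"

# Copy the file into the Documents directory.
cp "$work/foo.txt" "$work/Documents"

# cmp -s prints nothing and exits 0 only if both files are byte-for-byte identical.
if cmp -s "$work/foo.txt" "$work/Documents/foo.txt"; then
    echo "copy verified"
else
    echo "copy differs from the original" >&2
fi
```

The original file is left in place, so the comparison can be repeated at any time.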


Transfer a directory or large amounts of data to another location

If you want to transfer a whole directory or a large amount of data to another location, you can use the gtar command. The command below archives all files (including those in subdirectories) within the current directory and re-creates them in the directory that you specify. The target directory must already exist.

The format for the gtar command is: gtar -cpf - . | gtar -C /dir -xpvf -

Example: To transfer all files from your Documents directory to a folder in your home directory called foo:

cd /nfshomes/YOURUSERNAME/Documents

gtar -cpf - . | gtar -C /nfshomes/YOURUSERNAME/foo -xpvf -

When you use this command it will display a list of the files it has transferred.

This command will leave Documents the same, but create a full copy of all files and folders from Documents in foo.

Note: This command will preserve permissions, attributes, and meta-data of all files transferred.
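To try the pipeline safely before using it on real data, here is a small self-contained sketch that runs it between two scratch directories. The directory and file names are hypothetical, and the fallback to tar assumes that tar on your system is GNU tar or compatible.

```shell
# Use gtar if it is installed, otherwise fall back to tar (assumed compatible).
TAR=$(command -v gtar || command -v tar)

# Scratch directories standing in for the source and target (hypothetical).
src=$(mktemp -d)
dst=$(mktemp -d)

# Build a small tree: an executable script and a file in a subdirectory.
mkdir "$src/sub"
echo "hello" > "$src/sub/notes.txt"
printf '#!/bin/sh\necho hi\n' > "$src/run.sh"
chmod 750 "$src/run.sh"

# The pipeline from this section; the subshell keeps the cd from
# changing the working directory of the rest of the session.
(cd "$src" && "$TAR" -cpf - .) | "$TAR" -C "$dst" -xpf -

# Compare the two trees and show that the permission bits survived.
diff -r "$src" "$dst" && echo "contents match"
ls -l "$src/run.sh" "$dst/run.sh"
```

The v flag is left out here to keep the output short; add it back to list each file as it is extracted.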

Transfer between two different hosts

To transfer between two hosts, put ssh USERNAME@FULLYQUALIFIEDHOSTNAME in front of the gtar command that should run on the other host.

If the other host is the one with the data to transfer, put the ssh command before the first gtar.

If the other host is the one receiving the data, put the ssh command after the pipe "|" and before the second gtar.

Example: To transfer files from the directory /tmp/ on example1.umiacs.umd.edu to the folder /foo/ on your current host:

ssh USERNAME@example1.umiacs.umd.edu gtar -C /tmp -cpf - . | gtar -C /foo -xpvf -
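The opposite direction, where the other host receives the data, can be sketched as follows. The function name push_dir and the DRY_RUN switch are hypothetical helpers invented for this illustration. The sketch also assumes that gtar is installed on both hosts and that the paths contain no spaces, since the remote command is re-parsed by the remote shell.

```shell
# Hypothetical helper: push_dir SRC_DIR USER@HOST DEST_DIR
# Streams the contents of SRC_DIR over ssh into DEST_DIR on the other host.
# With DRY_RUN=1 it only prints the pipeline instead of running it.
push_dir() {
    push_src=$1
    push_remote=$2
    push_dest=$3
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "gtar -C $push_src -cpf - . | ssh $push_remote gtar -C $push_dest -xpvf -"
    else
        # ssh sits after the pipe, so the second gtar runs on the other host.
        gtar -C "$push_src" -cpf - . | ssh "$push_remote" gtar -C "$push_dest" -xpvf -
    fi
}

# Print the command that would send /foo to /tmp/foo on example1.
DRY_RUN=1 push_dir /foo USERNAME@example1.umiacs.umd.edu /tmp/foo
```

Run it once with DRY_RUN=1 to inspect the command, then without it to perform the transfer.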

If your data transfer is interrupted, you can use the rsync command described below to copy the remaining files without re-copying files that have already been transferred.

Rsync can also be used for the initial transfer if you expect it to be interrupted. Otherwise, this method should not be used, as it takes more time and memory.

To copy files with rsync, use the same command format given below under "Verifying transfer."

Verifying transfer

To verify that your transfer copied everything, you can use the rsync command, which compares the contents of the two directories and updates any files that differ.

The format for the rsync command is:

rsync -aH /source/ /target

Example: To ensure that the files are the same in Documents and foo from the previous example:

rsync -aH /nfshomes/YOURUSERNAME/Documents/ /nfshomes/YOURUSERNAME/foo

This command will compare the files and directories within Documents to the files and directories within foo. If there are files within Documents and its subdirectories that do not appear in foo, this command will copy the missing files from Documents to foo.

Important: Make sure to include the slash after the name of the source directory. If you do not include it, rsync will copy the directory itself into the target rather than only its contents.
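The effect of the trailing slash is easiest to see side by side. This is a small sketch using scratch directories; the variable names with_slash and without_slash are made up for illustration, and rsync is assumed to be installed.

```shell
# A hypothetical source directory named Documents inside a scratch area.
src=$(mktemp -d)/Documents
mkdir -p "$src"
echo "one" > "$src/a.txt"

with_slash=$(mktemp -d)
without_slash=$(mktemp -d)

# Trailing slash: only the contents are copied, giving with_slash/a.txt
rsync -aH "$src/" "$with_slash"

# No trailing slash: the directory itself is copied,
# giving without_slash/Documents/a.txt
rsync -aH "$src" "$without_slash"

ls "$with_slash" "$without_slash"
```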