Data Transfer

Which command to use depends on how much data you are transferring. The classic cp command has a number of edge cases in which it does not copy everything that is expected, so use cp only when the copy can be easily verified, for example when you are copying a single file. For larger transfers, tar/gtar is the better choice. After transferring files or directories, you can use rsync to check that everything arrived correctly; rsync also updates any files that have changed.

Transfer a single file

If you want to copy a single file into another directory, you can use the cp command. The format for the cp command is: cp Source Destination. Both arguments are required. If Destination is an existing directory, cp places a copy of the file inside it under the same name.

Example 1: To transfer the file foo.txt from your home directory into Documents

cp foo.txt ~/Documents

The end result will leave the original file in the home directory and create a copy of it in Documents.
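Since cp is recommended here only when the copy can be verified, a single-file copy can be checked with cmp. The following is a minimal sketch, assuming a scratch directory created with mktemp so that no real files are touched; the names workdir and copy_status are illustrative only.

```shell
# Scratch area standing in for the home directory (assumption for this sketch).
workdir=$(mktemp -d)
mkdir "$workdir/Documents"
printf 'hello\n' > "$workdir/foo.txt"

# Same form as the example above: cp Source Destination
cp "$workdir/foo.txt" "$workdir/Documents"

# cmp -s prints nothing and exits 0 only if both files are byte-for-byte identical.
if cmp -s "$workdir/foo.txt" "$workdir/Documents/foo.txt"; then
    copy_status="verified"
else
    copy_status="mismatch"
fi
echo "$copy_status"
```

For a real copy, the same check would be cmp foo.txt ~/Documents/foo.txt.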

Example 2: To transfer the file foo.txt from your current directory to another host use scp (secure copy) with $USERNAME@<FQDN>:<PATH>

scp foo.txt $USERNAME@<FQDN>:~/Documents

This command copies the file foo.txt from your current directory to the folder Documents in your home directory on the host example1. You can also use scp to copy a file from a remote host to your current directory, or to copy a file between two remote hosts.

Example 3: To transfer the file foo.txt back from the host example1 to another directory on your current host

scp $USERNAME@<FQDN>:<PATH> ~/Videos

Important: The cp command should only be used for small file transfers. If you try to transfer a large amount of data, cp may not copy all of the files correctly.

Transfer a directory or large amounts of data to another location

If you want to transfer a whole directory or a large amount of data to another location, you can use the gtar command. Although we use gtar here to copy and transfer directories, its more common use is to make tarballs (archives of specified directories or files).

Example 1: To combine two files in your current directory in an archive:

gtar -cpvf archive.tar file1 file2

To extract the archive into a different directory, for example Documents in your home directory:

gtar -C ~/Documents/ -xpvf archive.tar
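Before extracting, it can help to list what an archive contains with the -t option. A minimal sketch, assuming a scratch directory and falling back to plain tar where gtar is not installed (the TAR, workdir, and listing variables are illustrative, not part of the original commands):

```shell
# Use gtar if present, otherwise tar (assumption: either one accepts these options).
TAR=$(command -v gtar || command -v tar)

workdir=$(mktemp -d)
cd "$workdir"
printf 'one\n' > file1
printf 'two\n' > file2
mkdir Documents

# Create the archive, as in Example 1 (without -v to keep the output short).
"$TAR" -cpf archive.tar file1 file2

# -t lists the member names without extracting anything.
listing=$("$TAR" -tf archive.tar)
echo "$listing"

# Extract into a different directory with -C.
"$TAR" -C Documents -xpf archive.tar
```

The listing shows the names exactly as they will be re-created under the directory given to -C.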

The format for the gtar command for data transfer is: gtar -cpf - . | gtar -C /dir -xpvf -

This command archives all files (including those in subdirectories) within the current directory and re-creates them in the directory that you specify.

In this command, the archive named by -f in the first gtar is a dash '-', which stands for standard output, and the source is a period '.', which means everything in your current directory. The pipe sends that output to the second gtar, whose archive is also a dash '-'; when extracting, gtar interprets it as standard input. The second gtar extracts into /dir, the directory given with -C.

Example 2: To transfer all files from Documents to a folder in your home directory called foo:

cd ~/Documents

gtar -cpf - . | gtar -C ~/foo -xpvf -

When you use this command it will display a list of the files it has transferred.

This command will leave Documents the same, but create a full copy of all files and folders from Documents in foo.

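Note: Because of the -p option, this command preserves the permissions of all files transferred. Modification times are preserved as well; ownership is preserved only when the extracting gtar runs as root.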
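The pipe form can be tried safely on throwaway directories first. This is a sketch under stated assumptions: src and dst are scratch directories made with mktemp, TAR falls back to plain tar if gtar is missing, and the file names are invented for illustration.

```shell
TAR=$(command -v gtar || command -v tar)

src=$(mktemp -d)
dst=$(mktemp -d)
mkdir "$src/sub"
printf 'data\n' > "$src/sub/notes.txt"
chmod 640 "$src/sub/notes.txt"

# The subshell keeps the cd local, so the caller's working directory is unchanged.
(cd "$src" && "$TAR" -cpf - .) | "$TAR" -C "$dst" -xpf -

# First ten characters of ls -l are the file type and permission bits.
mode=$(ls -l "$dst/sub/notes.txt" | cut -c1-10)
echo "$mode"
```

Wrapping the first half in a subshell is a small design choice: it avoids the separate cd step and leaves the shell where it was after the copy.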

Transfer between two different hosts

Put ssh USERNAME@FULLYQUALIFIEDHOSTNAME in front of whichever gtar command has to run on the other host.

If the other host is the one with the data to transfer, put it before the first gtar.

If the other host is the one receiving the data, put it after the pipe "|" and before the second gtar.

Example 1: To transfer files from the directory /tmp/ on the other host to the folder /foo/ on your current host:

ssh USERNAME@FULLYQUALIFIEDHOSTNAME gtar -cpf - /tmp | gtar -C /foo -xpvf -

Example 2: To transfer files from the directory /foo/ on your current host to the directory /tmp on the other host:

gtar -cpf - /foo/ | ssh USERNAME@FULLYQUALIFIEDHOSTNAME gtar -C /tmp -xpvf -
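The pull direction can be wrapped in a small shell function. This is only a sketch: pull_dir is a hypothetical helper name, it assumes that ssh login to the other host already works, that the remote directory name contains no single quotes, and that the same tar command name (gtar by default, overridable through the TAR variable) exists on both hosts.

```shell
# pull_dir REMOTE REMOTE_DIR LOCAL_DIR
# Hypothetical helper: copies the contents of REMOTE_DIR on REMOTE into LOCAL_DIR.
pull_dir() {
    remote=$1
    remote_dir=$2
    local_dir=$3
    tar_cmd=${TAR:-gtar}    # assumption: same command name on both hosts

    mkdir -p "$local_dir" || return 1

    # The quoted string runs on the remote host; the second tar runs locally.
    # Changing directory first stores relative names, like the '.' form above.
    ssh "$remote" "cd '$remote_dir' && $tar_cmd -cpf - ." |
        "$tar_cmd" -C "$local_dir" -xpf -
}

# Example call (not executed here):
# pull_dir USERNAME@FULLYQUALIFIEDHOSTNAME /tmp /foo
```

Changing into the directory on the remote side matters because GNU tar strips the leading slash from absolute paths: archiving /tmp directly stores the members as tmp/..., so they would be extracted into /foo/tmp rather than /foo. The function's exit status is that of the local extracting tar only.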

If your data transfer is interrupted, you can use the rsync command described below to copy the remaining files without duplicating files that have already been transferred.

rsync can also be used for the initial transfer if you expect it to be interrupted. Otherwise, prefer gtar, because rsync takes more time and memory.

To copy files with rsync, use the same command format given below under "Verifying transfer."

Verifying transfer

To verify that your transfer copied everything, you can use the rsync command, which compares the contents of the two directories and updates any files in which it finds differences.

The format for the rsync command is:

rsync -aH /source/ /target

Example: To ensure that the files are the same in Documents and foo from the previous example:

rsync -aH ~/Documents/ ~/foo

This command will compare the files and directories within Documents to the files and directories within foo. If there are files within Documents and its subdirectories that do not appear in foo, this command will copy the missing files from Documents to foo.
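To see what rsync would copy before it changes anything, the -n (dry run) and -i (itemize changes) options can be added. A minimal sketch on scratch directories, where src, dst, pending, and remaining are illustrative names and one file is deliberately left out of the target:

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
printf 'a\n' > "$src/a.txt"
printf 'b\n' > "$src/b.txt"
cp -p "$src/a.txt" "$dst/a.txt"    # b.txt is deliberately missing from the target

# Dry run: report what would be transferred, without copying anything.
pending=$(rsync -aHni "$src/" "$dst")
echo "$pending"

# Real run: copies only what is missing or changed.
rsync -aH "$src/" "$dst"

# A second dry run should now have nothing left to report.
remaining=$(rsync -aHni "$src/" "$dst")
echo "remaining: [$remaining]"
```

An empty report from the dry run is a reasonable sign that the two directories match, at least by rsync's default comparison of file size and modification time.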

Important: Make sure to include the slash after the name of the source directory. If you leave it off, rsync copies the directory itself into the target (for example, creating ~/foo/Documents) instead of copying only its contents.
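The effect of the trailing slash is easy to see on scratch directories. In this sketch the directory and file names are invented for illustration, and the two targets receive the same source with and without the slash:

```shell
src_parent=$(mktemp -d)
with_slash=$(mktemp -d)
without_slash=$(mktemp -d)
mkdir "$src_parent/Documents"
printf 'x\n' > "$src_parent/Documents/report.txt"

# Trailing slash: the contents of Documents are copied.
rsync -aH "$src_parent/Documents/" "$with_slash"

# No trailing slash: the Documents directory itself is copied.
rsync -aH "$src_parent/Documents" "$without_slash"

ls "$with_slash"       # lists report.txt
ls "$without_slash"    # lists Documents
```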