LocalDataTransfer
Which command you should use depends on how much data you are transferring. The classic cp command has a number of edge cases in which it will not copy everything that is expected, so use cp only when the copy can be easily verified, such as when you are copying a single file. For larger transfers, tar or gtar is the better choice. After transferring files or directories, you can use rsync to check that everything was copied correctly; it will also update any files that have changed.
Transfer a single file
If you want to copy a single file into another directory, you can use the cp command. The format for the cp command is: cp Source Destination. Both arguments are required; to copy a file into your current directory, use a period '.' as the destination.
- To copy the file foo.txt from your home directory into Documents:
cp ~/foo.txt ~/Documents/
- To transfer the file foo.txt from your current directory to another host use scp (secure copy) with $USERNAME@<FQDN>:<PATH>
scp foo.txt USERNAME@example1.umiacs.umd.edu:~/Documents
This command will copy the file foo.txt from your current directory to the folder Documents located in your home directory on the host example1.
- To copy the file foo.txt back from the host example1 into a directory on your current host:
scp USERNAME@example1.umiacs.umd.edu:~/Documents/foo.txt ~/Videos
Important: The cp command should only be used for small file transfers. If you try to transfer a large amount of data, cp may not copy all of the files properly. To learn more about the cp command, type man cp in the terminal.
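Since cp is only recommended when the copy can be verified, the following is a minimal sketch of one way to check a single-file copy, using cmp to compare the source and the copy byte for byte. The directory and variable names (workdir, copy_status) are hypothetical and exist only for this illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
mkdir "$workdir/Documents"
printf 'example contents\n' > "$workdir/foo.txt"

# Copy the file, then compare the source and the copy byte for byte.
# cmp -s prints nothing and reports the result through its exit status.
cp "$workdir/foo.txt" "$workdir/Documents/"
if cmp -s "$workdir/foo.txt" "$workdir/Documents/foo.txt"; then
    copy_status="verified"
else
    copy_status="mismatch"
fi
echo "copy $copy_status"
```

The same cmp check works on your real files: replace the two paths with the original file and its copy.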
Transfer a directory or large amounts of data to another location
If you want to transfer a whole directory or a large amount of data to another location, you can use the gtar command. Although we use gtar here to copy and transfer a directory, its more common use is to make tarballs (archives of specified directories or files).
- To combine two files in your current directory in an archive:
gtar -cpvf archive.tar file1 file2
To extract the archive into a different directory, for example Documents in your home directory:
gtar -C ~/Documents/ -xpvf archive.tar
This command re-creates the archived files (including any subdirectories stored in the archive) in the directory that you specify with -C.
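The create and extract steps above can be tried safely in a scratch directory. The sketch below also lists the archive with -t before extracting, which is a quick way to confirm what it contains. It assumes gtar may not be installed everywhere and falls back to tar; the TAR, workdir, and listing names are hypothetical.

```shell
# Assumption: use gtar when it is installed, otherwise fall back to tar.
TAR=$(command -v gtar || command -v tar)

# Build a scratch source and destination directory.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/dest"
printf 'one\n' > "$workdir/src/file1"
printf 'two\n' > "$workdir/src/file2"

# Create the archive from inside the source directory.
(cd "$workdir/src" && "$TAR" -cpf "$workdir/archive.tar" file1 file2)

# -t lists the archive members without extracting anything.
listing=$("$TAR" -tf "$workdir/archive.tar")
echo "$listing"

# Extract into the destination directory.
"$TAR" -C "$workdir/dest" -xpf "$workdir/archive.tar"
```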
The format for the gtar command for data transfer is:
gtar -cpf - . | gtar -C /dir -xpvf -
In the first command, the archive name given to -f is a dash '-', which tells gtar to write the archive to standard output, and the source is a period '.', which means everything in your current directory. The pipe sends that output to the second command, where the dash '-' tells gtar to read the archive from standard input. The -C option makes the second command extract into the target directory, /dir.
- To transfer all files from Documents to a folder in your home directory called foo (the foo directory must already exist):
cd ~/Documents
gtar -cpf - . | gtar -C ~/foo -xpvf -
When you use this command it will display a list of the files it has transferred.
This command will leave Documents the same, but create a full copy of all files and folders from Documents in foo.
Note: This command will preserve the permissions, attributes, and metadata of all files transferred.
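As a hedged variation on the pipeline above, the sketch below runs the cd inside a subshell so that your working directory is unchanged afterwards, and joins the cd and the archive step with && so that nothing is archived if the cd fails. It again assumes a fallback from gtar to tar, and the scratch paths under workdir are hypothetical stand-ins for ~/Documents and ~/foo.

```shell
# Assumption: use gtar when it is installed, otherwise fall back to tar.
TAR=$(command -v gtar || command -v tar)

# Scratch stand-ins for ~/Documents and ~/foo.
workdir=$(mktemp -d)
mkdir -p "$workdir/Documents/sub" "$workdir/foo"
printf 'a\n' > "$workdir/Documents/a.txt"
printf 'b\n' > "$workdir/Documents/sub/b.txt"

# The subshell keeps the cd local to the left side of the pipe, and
# && skips the archive step if the cd fails.
(cd "$workdir/Documents" && "$TAR" -cpf - .) | "$TAR" -C "$workdir/foo" -xpf -

# Show what arrived in the target directory.
ls -lR "$workdir/foo"
```

The -v flag is left out here to keep the output short; add it back to see each file as it is extracted.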
Transfer between two different hosts
- To transfer files from the tmp directory on example1.umiacs.umd.edu to the foo directory on your current host:
ssh USERNAME@example1.umiacs.umd.edu gtar -cpf - /tmp | gtar -C /foo -xpvf -
- To transfer files from the foo directory on your current host to the tmp directory on example1.umiacs.umd.edu:
gtar -cpf - /foo/ | ssh USERNAME@example1.umiacs.umd.edu gtar -C /tmp -xpvf -
In both commands gtar stores the source directory name in the archive (with the leading slash removed), so the files are re-created under /foo/tmp and /tmp/foo respectively.
If your data transfer is interrupted, you can use the rsync command described below to copy the remaining files without re-copying files that have already been transferred.
rsync can also be used for the initial transfer if you expect it to be interrupted. Otherwise, prefer gtar, since rsync takes more time and memory.
Verifying transfer
To verify that your transfer copied everything, you can use the rsync command, which compares the contents of the two directories and updates any files that differ.
The format for the rsync command is:
rsync -aH /source/ /target
To ensure that the files are the same in Documents and foo from the previous example:
rsync -aH ~/Documents/ ~/foo
This command will compare the files and directories within Documents to the files and directories within foo. If there are files within Documents and its subdirectories that do not appear in foo, this command will copy the missing files from Documents to foo.
Important: Make sure to include the slash after the name of the source directory. If you leave it out, rsync will copy the directory itself into the target (for example, creating ~/foo/Documents) instead of copying only its contents.
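If you only want to see what differs before letting rsync change anything, one option is a dry run. The sketch below is an illustration under stated assumptions: it uses rsync's -n (dry run) and -i (itemize changes) options on scratch directories, and the workdir, pending, and remaining names are hypothetical.

```shell
# Scratch stand-ins for ~/Documents and ~/foo; b.txt is deliberately
# missing from the target so that there is something to report.
workdir=$(mktemp -d)
mkdir -p "$workdir/Documents" "$workdir/foo"
printf 'a\n' > "$workdir/Documents/a.txt"
printf 'b\n' > "$workdir/Documents/b.txt"
cp -p "$workdir/Documents/a.txt" "$workdir/foo/a.txt"

# -n reports what would be transferred without changing the target;
# -i prints one line per item that differs.
pending=$(rsync -aHni "$workdir/Documents/" "$workdir/foo")
echo "$pending"

# Run the real sync, then repeat the dry run to confirm nothing is left.
rsync -aH "$workdir/Documents/" "$workdir/foo"
remaining=$(rsync -aHni "$workdir/Documents/" "$workdir/foo")
echo "items still differing: ${remaining:-none}"
```

An empty result from the second dry run indicates that rsync sees no remaining differences between the two directories.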