From time to time I transfer big files (up to 150 GiB) over relatively slow network links or to slow servers. These are tar.gz backups, disk images or huge photo collections. Since the fancy web UIs are non-scriptable, bloated and suffer from timeout issues, I prefer to use the scp command over an SSH connection:
$ scp some-big-file.zip your-server.example.com:tmp/
The command assumes that you have configured the host your-server.example.com in your local ~/.ssh/config, or that your local username is the same as the remote one.
Needless to say, a long-running scp command can fail after some hours or days when your network connection is unstable, e.g. when your upstream provider rotates your IP address every 24 hours.
In that case, simply re-executing scp is suboptimal, because it truncates the remote file, restarts the transfer from scratch and will most likely fail again.
It would be really nice to have a command-line argument to reuse the already transferred bytes. Sadly scp does not have this feature, but rsync comes to the rescue:
Using rsync with --append-verify
To restart the upload without reuploading the already transmitted bytes, you can use the rsync command as follows:
$ rsync --progress --append-verify -v --rsh=ssh \
some-big-file.zip your-server.example.com:tmp/some-big-file.zip
The trick is the --append-verify option. It's an advanced version of --append.
If you use the argument --append, rsync reuses existing files on the remote side and only appends the missing bytes. Additionally, --append-verify includes the already existing bytes in the checksum verification of the whole file at the end of the transfer; if that check fails, rsync sends the file again. In the end you can be sure that the huge file was transferred to the remote server without any sort of data corruption.
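To see the append behaviour without a server, the following sketch simulates an interrupted upload on the local disk. The function name demo_resume, the file names and the sizes are made up for the demo; it only assumes that rsync, head and cmp are installed.

```shell
# Local demonstration of resuming with --append-verify (no server needed).
# All names and sizes here are hypothetical and only serve the demo.
demo_resume() {
    workdir=$(mktemp -d) || return 1
    mkdir "$workdir/dest"

    # Create a 1 MiB "big file" and a destination copy that stops after
    # 300 KiB, as if an earlier upload had been interrupted.
    head -c 1048576 /dev/urandom > "$workdir/big-file.bin"
    head -c 307200 "$workdir/big-file.bin" > "$workdir/dest/big-file.bin"

    # rsync appends the missing bytes and verifies the whole file afterwards.
    rsync --append-verify "$workdir/big-file.bin" "$workdir/dest/big-file.bin"

    # cmp exits with 0 only if both files are byte-for-byte identical.
    cmp "$workdir/big-file.bin" "$workdir/dest/big-file.bin"
    status=$?

    rm -rf "$workdir"
    return $status
}
```

Calling demo_resume should return exit status 0 if the truncated copy was completed correctly.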
Nevertheless, I always do an extra round with md5sum to ensure that the local and remote files are identical. Just in case. (Until now I have never witnessed data corruption caused by rsync.)
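The extra md5sum round could be scripted roughly like this. It is only a sketch: the helper names file_md5 and verify_upload are invented here, and it assumes that md5sum is available on both machines and that the remote path contains no spaces.

```shell
# Print only the hex digest; md5sum itself prints "<digest>  <file name>".
file_md5() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Succeeds (exit status 0) only if the local and the remote digest are equal.
# Arguments: local file, SSH host, remote path (all placeholders).
verify_upload() {
    local_sum=$(file_md5 "$1") || return 1
    remote_sum=$(ssh "$2" md5sum "$3" | cut -d ' ' -f 1) || return 1
    [ -n "$local_sum" ] && [ "$local_sum" = "$remote_sum" ]
}
```

With the file names from the example above, a check might look like: verify_upload some-big-file.zip your-server.example.com tmp/some-big-file.zip && echo "checksums match"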
Notes
It's totally safe to interrupt an rsync upload with CTRL+C, e.g. to pause an already running upload. Just re-execute the rsync command and it will continue.
rsync also has the argument --partial. It's a different mechanism (rsync keeps the partially transferred file instead of deleting it), but for a single file it behaves much like --append. In the past I have also used --partial to restart interrupted uploads successfully.
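Since an interrupted upload continues when the rsync command is run again, the re-running can be automated on an unstable connection. The following is a minimal sketch of such a wrapper; the function name retry and its arguments (attempt limit, pause in seconds) are assumptions made for illustration, not part of rsync.

```shell
# Run a command until it succeeds or the attempt limit is reached.
# Arguments: max attempts, pause in seconds between attempts, then the command.
retry() {
    max_attempts=$1
    pause=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$pause"
    done
}
```

A possible call with the upload command from above, retrying up to 20 times with a 60 second pause: retry 20 60 rsync --progress --append-verify -v --rsh=ssh some-big-file.zip your-server.example.com:tmp/some-big-file.zip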