Sunday 17 November 2019

linux - Fastest way to copy 1TB safely over the wire



I want to transmit a 1TB file over the internet. I have control over both machines (sender unix, receiver linux). I can open FTP, OpenVPN, NFS, ...


So far, resuming has for some reason not been stable over FTP, so I cut the file into 1GB pieces, transmit them one at a time, and then check each piece with md5 on the other side. It's very annoying.
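For reference, the per-piece md5 check can be sketched in Python as below. This is only an illustration of the idea, not what the poster actually ran; the function name `md5_of_file` and the 1MB read size are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file),
        # so a 1GB piece never has to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Demo on a small throwaway file standing in for one transferred piece.
    payload = os.urandom(3 * 1024 * 1024 + 17)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    try:
        # The chunked digest should match hashing the whole payload in one go.
        print(md5_of_file(tmp.name) == hashlib.md5(payload).hexdigest())
    finally:
        os.remove(tmp.name)
```

Run the same function on both machines and compare the two hex strings; a mismatch means that piece has to be sent again.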


I get between 3 and 6 MB/s to the other site, which is a pretty decent speed.
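To put that rate in perspective, here is a rough back-of-the-envelope calculation. It assumes decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes) and a perfectly steady link, neither of which is stated in the question.

```python
def transfer_hours(size_bytes, rate_bytes_per_sec):
    """Hours needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_sec / 3600


ONE_TB = 10**12  # assumption: decimal terabyte

for rate_mb in (3, 6):
    hours = transfer_hours(ONE_TB, rate_mb * 10**6)
    print(f"{rate_mb} MB/s: about {hours:.1f} hours")
```

Under those assumptions the transfer takes roughly two to four days, which is why being able to resume matters so much here.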


I want it to be safe and I want resuming if it fails. I tried NFS - no resuming.


Is there a cleaner, safer way?



Answer



If you need restart, rsync is probably the way to go. One of the fastest ways over a network that I've found is Samba, but to get excellent performance it usually needs to be tuned for the environment. What I mean by "fast" is, on 1Gbit Ethernet, 125MB/s writes and 119MB/s reads. 125MB/s is the raw capacity of a 1Gbit link (1000Mbit/s divided by 8), so writes are not going to get any faster than that unless your payload can be compressed; for example, a 1TB text file would likely compress to 1/10th the size.


Note: one problem with using rsync directly is that it generally relies on some other protocol (rsh or ssh) to do the transfer. If you use rsync over ssh you will incur an encryption cost, which on a good machine imposes a ceiling of 140-160MB/s. It will also usually drive up latency on the network connection and can slow down the overall transfer by 50% or more.
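As a rough sketch of what a restartable rsync-over-ssh transfer could look like, the Python wrapper below builds an rsync command line and reruns it until it exits cleanly. The helper names, the retry count and the wait time are all assumptions made for illustration; `--partial` is used so that an interrupted run keeps the partly transferred file for the next attempt to build on.

```python
import subprocess
import time


def build_rsync_command(source, destination):
    """Build an rsync command line for a resumable transfer over ssh."""
    return [
        "rsync",
        "--archive",   # preserve permissions, times, etc.
        "--partial",   # keep partially transferred files after an interruption
        "--progress",  # show progress while transferring
        "--rsh=ssh",   # use ssh as the transport (this is where the encryption cost comes from)
        source,
        destination,
    ]


def run_until_complete(command, max_attempts=10, wait_seconds=30):
    """Rerun the command until it exits with status 0; return the attempt number."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        # Assumption: a short pause gives a dropped link time to come back.
        time.sleep(wait_seconds)
    raise RuntimeError(f"transfer did not complete after {max_attempts} attempts")
```

A call would look something like `run_until_complete(build_rsync_command("/data/big.img", "user@receiver:/data/"))`, where the path and host are made up for the example.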


So the fastest way depends on what type of network you have in place between the two machines: slow-ish Internet speeds (e.g. 0.5-10MB/s) or a local area network (1Gb or perhaps even 10Gb).


If the time spent on the wire matters more to you than the overall elapsed time, I'd "prep" the file for transfer by running it through a good compressor (like xz or 7z). BUT that will take a large chunk of time by itself, so the overall time is likely to be larger. If the actual transfer time is what matters most, though, compression is a good way to cut it down.
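The trade-off can be estimated with some simple arithmetic. The sketch below assumes the file is compressed completely before sending, ignores decompression on the receiving side, and uses a purely hypothetical compressor throughput of 10MB/s; only the 10x compression factor and the link speeds are taken from the discussion above.

```python
def direct_seconds(size_bytes, link_rate):
    """Time to send the file as-is."""
    return size_bytes / link_rate


def compressed_seconds(size_bytes, compression_factor, compress_rate, link_rate):
    """Return (overall_time, time_on_wire) when compressing first, then sending."""
    compress_time = size_bytes / compress_rate
    wire_time = size_bytes / compression_factor / link_rate
    return compress_time + wire_time, wire_time


ONE_TB = 10**12                # assumption: decimal terabyte
COMPRESS_RATE = 10 * 10**6     # hypothetical compressor throughput, 10MB/s

for label, link_rate in (("5MB/s Internet link", 5 * 10**6),
                         ("1Gbit LAN (125MB/s)", 125 * 10**6)):
    direct = direct_seconds(ONE_TB, link_rate)
    overall, wire = compressed_seconds(ONE_TB, 10, COMPRESS_RATE, link_rate)
    print(f"{label}: direct {direct / 3600:.1f}h, "
          f"compressed {overall / 3600:.1f}h overall ({wire / 3600:.1f}h on the wire)")
```

With these made-up numbers the time on the wire always drops by the compression factor, but the overall time only improves on the slow link; on the LAN the compression step dominates. Real results depend heavily on the data and the compressor settings.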


It really depends on where your priorities are and how much time you want to spend optimizing the transfer speed. Overall, though, I think sirlancelot gave the right answer; it's just that there can be many mitigating factors depending on your priorities.

