March 25, 2021 — The U.S. Department of Energy (DOE) SLAC National Accelerator Laboratory has released a new Technical Report and associated open source testing tools. The report describes and illustrates a rigorous, comprehensive, and fully automated investigation of the highly popular rsync data copying tool. It answers a key question: “When to use rsync?” We believe this is the first study at this level and scope. It was carried out using two expertly designed, flexible testbeds: Zettar Inc.’s testbed and the U.S. DOE ESnet’s 100G SDN testbed.
First released in 1996, rsync remains the go-to data mover for many IT professionals. Yet the world is facing exponential data growth. So,
- Is rsync still the proper tool to use for almost every data moving task?
- If it is still useful, what is its proper range of operations?
- How about the effectiveness of some rsync-based tools that run multiple rsync instances?
- Are there any alternatives?
The full DOE Technical Report “When to use rsync” answers these questions. It also includes file size histograms from two U.S. DOE user facilities, which should show the typical file size distributions in large modern research datasets.
The report is available at https://slac.stanford.edu/pubs/slactns/tn06/slac-tn-21-001.pdf.
The testing tools are available at https://github.com/fangchin/test_rsync. They enable any interested party to apply the same methodology and obtain further results in its own environment.
Source: Zettar Inc.