Named data networking strategies for improving large scientific data transfers


Scientific workflows in fields such as climate science and High Energy Particle Physics (HEP) routinely generate and use large volumes of observed or simulated data. Users are often geographically dispersed and need to transfer this data over the network for replication, archiving, or local analysis. Scientific communities have built sophisticated applications and dedicated networks to facilitate such transfers, yet users continue to experience failures, delays, and unpredictable transfer latency. Named Data Networking (NDN) is a new Internet architecture that provides a more flexible and intelligent network layer, suitable for large data transfers. In this work, we use a real scientific data flow to demonstrate the flexibility and versatility that make NDN a suitable choice for large-data workflows. We use deadline-based data transfers as our driving example, since HEP communities use them widely, and we discuss several NDN forwarding strategies that can help such flows. In addition to using typical forwarding strategies, we propose, at a high level, a bandwidth reservation protocol for NDN and an on-demand high-speed path creation mechanism. Using these as building blocks, we create a deadline-based data transfer protocol and show how NDN can simplify and improve scientific data distribution. Finally, we use a week-long HEP data log to evaluate our protocol analytically.

Publication Title

2018 IEEE International Conference on Communications Workshops, ICC Workshops 2018 - Proceedings