Request aggregation, caching, and forwarding strategies for improving large climate data distribution with NDN: A case study


Scientific domains such as Climate Science and High Energy Particle Physics (HEP) routinely generate and manage petabytes of data, projected to rise into exabytes [26]. The sheer volume and long life of the data stress IP networking and traditional content distribution network mechanisms. Thus, each scientific domain typically designs, develops, implements, deploys, and maintains its own data management and distribution system, often duplicating functionality. Supporting various incarnations of similar software is wasteful, prone to bugs, and results in an ecosystem of one-off solutions. In this paper, we present the first trace-driven study that investigates NDN in the context of a scientific application domain. Our contribution is threefold. First, we analyze a three-year climate data server log and characterize data access patterns to expose important variables such as cache size. Second, using an approximated topology derived from the log, we replay log requests in real time over an NDN simulator to evaluate how NDN improves traffic flows through aggregation and caching. Finally, we implement a simple, nearest-replica NDN forwarding strategy and evaluate how NDN can improve scientific content delivery.

Publication Title

ICN 2017 - Proceedings of the 4th ACM Conference on Information Centric Networking