Sampling large operational datasets, such as ISP usage measurements, can effectively reduce storage requirements and execution time while prolonging the useful life of the data for baselining and retrospective analysis. Sampling must mediate between the characteristics of the data and the accuracy needs of queries. This talk presents a cost-based formulation that expresses these opposing priorities and shows how it leads to optimal sampling schemes without prior statistical assumptions.
Nick Duffield joined Rutgers University as a Research Professor in 2013. Previously he worked at AT&T Labs Research, where he was a Distinguished Member of Technical Staff and an AT&T Fellow. He works on network and data science, particularly the acquisition, analysis, and applications of operational network data. He was formerly chair of the IETF Working Group on Packet Sampling and an associate editor of the IEEE/ACM Transactions on Networking. He is an IEEE Fellow and was a co-recipient of the ACM SIGMETRICS Test of Time Award in both 2012 and 2013.