BIG DATA AND STATISTICS: A STATISTICIAN'S PERSPECTIVE

Metode Sci Stud J. 2015;5:143-149. doi: 10.7203/metode.83.3590.

Abstract

Big Data brings unprecedented power to address scientific, economic and societal issues, but also amplifies the risk of certain pitfalls. These include using purely data-driven approaches that disregard understanding of the phenomenon under study, aiming at a dynamically moving target, ignoring critical data collection issues, summarizing or preprocessing the data inadequately and mistaking noise for signal. We review some success stories and illustrate how statistical principles can help obtain more reliable information from data. We also touch upon current challenges that require active methodological research, such as developing strategies for efficient computation, integrating heterogeneous data, extending the underlying theory to increasingly complex questions and, perhaps most importantly, training a new generation of scientists to develop and deploy these strategies.

Keywords: Big Data; case studies; challenges; pitfalls; statistics.