While “no insight without analysis” is even truer for big data, organizations today still face the challenge of identifying an adequate technical architecture for that analysis. With traditional data, this task is routinely solved by deploying a relational database management system (RDBMS). For big data, the authors convincingly derive a simple, easy-to-use contingency matrix.
Based on an analysis of the (actually quite limited) scientific literature, the paper first identifies four different big data strategies: simply using a traditional RDBMS with larger datasets; resorting to cloud-based big data analytics as a service; deploying MapReduce algorithms on a distributed file system; and a hybrid approach combining an RDBMS with MapReduce.
Next, eight factors are derived that capture the contingencies relevant to organizations when choosing their individual big data strategy (among them the relevance of big data analytics, urgency, resource availability, absorption capacity, and data privacy).
Mapping the impact of the eight contingency factors onto each of the four big data strategies finally produces the 8×4 contingency matrix.
The authors thoroughly describe all elements of the contingency matrix and also present vivid and differentiated examples of its entries. For instance, manufacturing companies might favor an RDBMS-based approach (frequent and fast execution of standardized queries on less frequently changing data), while retailers might find a MapReduce strategy better suited to their needs (high urgency, frequently changing data). This allows the interested reader to easily fine-tune the matrix to her individual needs, thus supporting a differentiated stakeholder discussion regarding available big data strategy options.
The paper is well written and easy to follow, and it is a must-read for data analysts and anyone in a professional or academic setting who is looking for methods to determine strategies for coping with big data.