Computing Reviews

Responsible data management
Stoyanovich J., Abiteboul S., Howe B., Jagadish H., Schelter S. Communications of the ACM 65(6): 64-74, 2022. Type: Article
Date Reviewed: 06/08/23

There is an abundance of papers and resources that discuss data management, but few have taken the approach that Stoyanovich et al. take in this article. Starting off with a discussion of the ethical and legal requirements and considerations for systems that use algorithms, the article highlights a key argument that I have made for many years: much of the work done, including data quality and data audits, happens in the “last mile” of data analysis. The authors stress that this disregards and minimizes the work done during the system design, development, and use life cycle, as well as the data life cycle, both of which are critical to the understanding and appropriate use of the data. The authors’ two main points (that decisions made during data collection have a direct impact on the systems, and that these systems continue to be our responsibility after deployment) are also key components of my philosophy as a database administrator at a hospital-based research institute.

Focusing on automated decision systems (ADSs), the authors thoroughly explore areas ranging from bias to data cleaning and other critical aspects of data management for applications typically seen as “turn on and forget.” I really appreciate their coverage of marginalized folks (specifically people who are non-binary), as this is something I struggle to get people to understand in my own work. Having my own position so articulately described in an article like this reaffirms my belief that inclusion, sensitivity, and a full understanding of the population being examined are key factors for surveys and other formats of data collection.

Overall, I highly recommend this article to anyone interested in data management or artificial intelligence (AI), as well as those working in an industry that collects potentially sensitive data about people. This will definitely be an article I provide to the students and new data analysts I mentor, and will be a component of the conversations I have with them.

Reviewer:  Christopher Battiston Review #: CR147599 (2309-0123)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™