Computing Reviews
The use of regression methodology for the compromise of confidential information in statistical databases
Palley M., Simonoff J. ACM Transactions on Database Systems 12(4): 593-608, 1987. Type: Article
Date Reviewed: Oct 1 1988

Confidentiality in statistical databases can be compromised by intruders using regression methodologies, even when the regression cannot be run on the database itself. Instead, a representative sample of queries yields a synthetic database, which is then analyzed to produce a prediction equation that is applied back to the main database. Technical controls, such as table size restrictions, are either ineffective against this strategy or have side effects that reduce the usability of the data. The problem is real and serious, and has received attention from professional associations.
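The strategy summarized above can be illustrated with a minimal sketch. It is not the authors' procedure: the table layout, the linear salary model, the cell boundaries, the minimum query-set size, and every function name are assumptions made only for the demonstration. The sketch issues aggregate queries that each pass a size restriction, treats the answers as a synthetic database, fits a weighted least-squares equation to it, and applies that equation to one individual.

```python
import random

random.seed(0)

# Hypothetical confidential table of (age, years_of_education, salary).
# Salary is the protected attribute; its linear form is assumed for the demo.
def make_database(n=2000):
    rows = []
    for _ in range(n):
        age = random.randint(25, 64)
        edu = random.randint(10, 21)
        salary = 5000 + 600 * age + 2500 * edu + random.gauss(0, 4000)
        rows.append((age, edu, salary))
    return rows

MIN_QUERY_SET = 10  # assumed size control: smaller query sets are refused

def mean_salary_query(db, age_lo, age_hi, edu_lo, edu_hi):
    """Statistical interface: return (count, mean salary), or None if refused."""
    hits = [s for a, e, s in db if age_lo <= a <= age_hi and edu_lo <= e <= edu_hi]
    if len(hits) < MIN_QUERY_SET:
        return None
    return len(hits), sum(hits) / len(hits)

def build_synthetic(db):
    """One synthetic row per answered query: cell midpoints, mean salary, count."""
    synthetic = []
    for age_lo in range(25, 65, 5):
        for edu_lo in range(10, 22, 2):
            answer = mean_salary_query(db, age_lo, age_lo + 4, edu_lo, edu_lo + 1)
            if answer is not None:
                count, mean = answer
                synthetic.append((age_lo + 2, edu_lo + 0.5, mean, count))
    return synthetic

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_weighted_ols(rows):
    """Weighted normal equations for salary ~ b0 + b1*age + b2*edu."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for age, edu, y, w in rows:
        x = (1.0, age, edu)
        for i in range(3):
            b[i] += w * x[i] * y
            for j in range(3):
                A[i][j] += w * x[i] * x[j]
    return solve(A, b)

db = make_database()
b0, b1, b2 = fit_weighted_ols(build_synthetic(db))

# Apply the prediction equation to one individual whose non-confidential
# attributes the intruder is assumed to know from outside the database.
target_age, target_edu, true_salary = db[0]
estimate = b0 + b1 * target_age + b2 * target_edu
print(f"estimated {estimate:.0f}, actual {true_salary:.0f}")
```

No single query in the sketch returns an individual record, and each one passes the size check, yet the fitted equation still yields an estimate for a specific person. How close that estimate comes depends on how well the non-confidential attributes explain the confidential one.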

The paper introduces the problem, reviews the use of regression techniques for estimating individual data, and examines the prospects of using several types of controls. The discussion lists database characteristics that make successful use of regression more difficult: low correlations between database attributes, uniform distributions, and continuous variables. The problem was studied using census data.
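The first of these characteristics, low correlation, can be illustrated with a small simulation. This is a sketch under assumed conditions (standardized, normally distributed attributes and a single predictor), not an experiment from the paper, and `prediction_rmse` is a hypothetical helper. For simple linear regression the residual standard deviation is roughly the standard deviation of the target times sqrt(1 - rho^2), so the intruder's error grows as the correlation rho falls.

```python
import math
import random

def prediction_rmse(rho, n=5000, seed=1):
    """Fit y ~ x by least squares on data with correlation rho; return the RMSE."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y has unit variance and correlation rho with x by construction.
        y = rho * x + math.sqrt(1 - rho * rho) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sq_err = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq_err / n)

for rho in (0.9, 0.5, 0.1):
    print(rho, round(prediction_rmse(rho), 3))
```

With a correlation near 0.1 the fitted equation is barely better than guessing the mean, which is why weakly correlated attributes offer some natural protection.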

The issue posed at the end is how to defend against regression-based intrusion strategies without over-restricting legitimate access.

The authors do a good job on what they set out to do: evaluating specific technological fixes for the problem of the legitimate user who compromises system integrity. However, one wishes they had also chosen to review other technical defenses, such as monitoring types of query patterns and their frequency, forced identification of terminal users, and identifying terminals when they come on line. Administrative defenses are not treated.

The real problem is the betrayal of trust by a legitimate user of the data. No combination of technological fixes is likely to solve this problem. Management training procedures, good employee relations, ethical standards supported by professional associations, and other sociological approaches will also be needed if confidentiality in statistical databases is not to be compromised.

Reviewer: John A. Sonquist
Review #: CR112622
Database Administration (H.2.7)
Privacy (K.4.1 ...)
Security (K.6.m ...)
Security, Integrity, And Protection (H.2.0 ...)
Statistical Computing (G.3 ...)
Database Applications (H.2.8)
