Measuring the information loss
Most anonymization techniques work by reducing the level of detail in the data or by suppressing information outright, so they typically cause a loss of information. The challenge for the statistician is to strike a proper balance between two conflicting objectives: reducing the disclosure risk and minimizing this loss.
Various methods are available to assess the information loss, typically by comparing the original data with the anonymized data. For categorical data, these methods include direct comparison of values, comparison of contingency tables, and entropy-based measures. For continuous data, they include the mean square error, the mean absolute error, and the mean variation.
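As a rough illustration, some of these measures can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a reference implementation: the function names (`continuous_loss`, `direct_comparison`, `contingency_distance`) and the sample data are hypothetical. The original and anonymized files are assumed to hold the same records in the same order, and the mean variation is taken as the average relative absolute change, which requires non-zero original values. Entropy-based measures are left out because they need a model of how values are changed.

```python
from collections import Counter


def continuous_loss(original, masked):
    """Return (mean square error, mean absolute error, mean variation)
    for one continuous variable, comparing values record by record."""
    if len(original) != len(masked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(original)
    diffs = [m - o for o, m in zip(original, masked)]
    mse = sum(d * d for d in diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    # Assumption: no original value is zero, otherwise this division fails.
    mean_variation = sum(abs(d) / abs(o) for d, o in zip(diffs, original)) / n
    return mse, mae, mean_variation


def direct_comparison(original, masked):
    """Share of records whose categorical value(s) changed."""
    if len(original) != len(masked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(o != m for o, m in zip(original, masked)) / len(original)


def contingency_distance(original, masked):
    """Sum of absolute differences between the cell counts of the
    contingency tables built from the original and the masked records.
    Each record is a tuple of category values, one per variable."""
    before, after = Counter(original), Counter(masked)
    return sum(abs(before[cell] - after[cell]) for cell in before.keys() | after.keys())


# Hypothetical data: three incomes before and after perturbation.
income = [100.0, 200.0, 400.0]
income_masked = [110.0, 190.0, 400.0]
mse, mae, mean_variation = continuous_loss(income, income_masked)

# Hypothetical data: (sex, area) before and after recoding one record.
cats = [("M", "urban"), ("F", "rural"), ("F", "urban"), ("M", "rural")]
cats_masked = [("M", "urban"), ("F", "urban"), ("F", "urban"), ("M", "rural")]
changed_share = direct_comparison(cats, cats_masked)
table_distance = contingency_distance(cats, cats_masked)
```

In this toy example one record out of four changes category, so the direct comparison gives 0.25, while the contingency tables differ in two cells by one record each. Lower values on all of these measures indicate less information loss; they say nothing about disclosure risk, which has to be assessed separately.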
Information on the techniques is available in the following documents:
- Daniel Vorgrimler and Martin Rosemann. Effects of anonymization on analytical validity and reidentification risk
- A. F. Karr, C. N. Kohnen, A. Oganian, J. P. Reiter, and A. P. Sanil. A Framework for Evaluating the Utility of Data Altered to Protect Confidentiality. National Institute of Statistical Sciences. Technical Report Number 153. June 2006
- Grup Crises. Trading off Information Loss and Disclosure Risk in Database Privacy Protection. Research Report CRIREP-04-002, September 2004
- Grup Crises. Information Loss Measures for Microdata in Database Privacy Protection. Research Report CRIREP-04-004, September 2004
- Josep Domingo-Ferrer, Josep M. Mateo-Sanz and Vicenç Torra. Comparing SDC Methods for Microdata on the Basis of Information Loss and Disclosure Risk
- Josep M. Mateo-Sanz, Josep Domingo-Ferrer and Francesc Sebé, Probabilistic information loss measures in confidentiality protection of continuous microdata, Data Mining and Knowledge Discovery, vol. 11, Number 2, pp. 181-193, September 2005. ISSN: 1384-5810.
- Josep M. Mateo-Sanz, Josep Domingo-Ferrer and Francesc Sebé, Probabilistic Information Loss Measures for Continuous Microdata, 2004
- Shanti Gomatam and Alan F. Karr. Distortion Measures for Categorical Data Swapping. National Institute of Statistical Sciences. Technical Report Number 131. January 2003
