On Identity Disclosure Risk Measurement for Shared Microdata

Authors: M. N. Huda, S. Yamada, N. Sonehara

Abstract:

Probability-based identity disclosure risk measurement may give the same overall risk for different anonymization strategies applied to the same dataset. Some entities in the anonymized dataset may have higher identification risks than others. Individuals are more concerned about above-average risks and want to know whether they may be exposed to such a higher risk. The overall risk figure produced by this measurement method does not indicate whether some of the involved entities have a higher identity disclosure risk than others. In this paper, we introduce an identity disclosure risk measurement method that not only conveys the overall risk but also indicates whether some members have a higher risk than others. The proposed method quantifies the overall risk based on the individual risk values, the percentage of records that have a risk value higher than the average, and how much larger the higher risk values are compared to the average. We analyze the disclosure risks for different disclosure control techniques applied to the original microdata and present the results.
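
The problem described above can be illustrated with a minimal sketch. It is not the formula proposed in the paper, which the abstract does not state. It assumes, for illustration only, that a record's risk is 1 divided by the size of its equivalence class (the group of records sharing the same quasi-identifier values). The function names `record_risks` and `risk_summary` and the summary field names are hypothetical.

```python
from collections import Counter


def record_risks(quasi_ids):
    """Per-record risk, assumed here to be 1 / (equivalence class size)."""
    sizes = Counter(quasi_ids)
    return [1.0 / sizes[q] for q in quasi_ids]


def risk_summary(risks, tol=1e-12):
    """Summarize the three quantities named in the abstract.

    Illustrative assumption: they are reported separately rather than
    combined into one score, since the abstract gives no combining formula.
    """
    n = len(risks)
    average = sum(risks) / n
    # Records whose risk exceeds the average (tolerance guards float noise).
    high = [r for r in risks if r > average + tol]
    share_above_average = len(high) / n
    # How much larger the higher risks are, as a ratio to the average.
    excess_ratio = (sum(high) / len(high)) / average if high else 1.0
    return {
        "average": average,
        "share_above_average": share_above_average,
        "excess_ratio": excess_ratio,
    }


# Two hypothetical anonymizations of the same 8 records.
# Strategy A: two equivalence classes of size 4 each.
strategy_a = ["x"] * 4 + ["y"] * 4
# Strategy B: one class of size 2 and one of size 6.
strategy_b = ["x"] * 2 + ["y"] * 6

summary_a = risk_summary(record_risks(strategy_a))
summary_b = risk_summary(record_risks(strategy_b))
print(summary_a)
print(summary_b)
```

Under this assumption both strategies have the same average risk (two classes over eight records), yet only strategy B leaves a quarter of the records with a risk about twice the average. That is the kind of difference an average-only measure hides.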

Keywords: Anonymization, microdata, disclosure risk, privacy.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1079598

