An Efficient Model for Publishing Microdata with Multiple Sensitive Attributes


Surapon Riyana
Kittikorn Sasujit
Nigran Homdoung
Noppamas Riyana

Abstract

This work proposes an anonymization model that addresses privacy violations in datasets with multiple sensitive attributes. To satisfy privacy preservation constraints while maintaining data utility, the sensitive attributes of a dataset are divided into nominal and continuous attributes. For nominal sensitive attributes, data utility and privacy are maintained by limiting the confidence of data re-identification. For continuous sensitive attributes, data utility and privacy are maintained by data bounding. The proposed model is evaluated through extensive experiments. The experimental results indicate that the proposed model is more effective and efficient than the compared models. Moreover, datasets that satisfy the privacy preservation constraints of the proposed model guarantee both the confidence and the bounding of data re-identification.
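
The two kinds of constraint described above can be illustrated with a minimal sketch. It is not the paper's definition: the abstract does not state the exact conditions, so the sketch assumes one plausible reading in which a group of records passes if the most frequent nominal sensitive value does not exceed a confidence threshold and the continuous sensitive values span at least a minimum range. The function name `group_satisfies`, the parameters `max_confidence` and `min_range`, and the sample records are all hypothetical.

```python
from collections import Counter


def group_satisfies(records, nominal_key, continuous_key, max_confidence, min_range):
    """Check one group of records against two illustrative constraints.

    Assumed reading (not taken from the paper):
    - confidence: share of the most frequent nominal sensitive value in the group
    - bounding: width of the interval covered by the continuous sensitive values
    """
    if not records:
        return False

    # Confidence with which the most common nominal value can be inferred.
    counts = Counter(r[nominal_key] for r in records)
    confidence = max(counts.values()) / len(records)

    # Width of the range spanned by the continuous sensitive values.
    values = [r[continuous_key] for r in records]
    spread = max(values) - min(values)

    return confidence <= max_confidence and spread >= min_range


# Hypothetical group of four records with one nominal and one continuous
# sensitive attribute.
group = [
    {"disease": "flu", "salary": 30000},
    {"disease": "flu", "salary": 42000},
    {"disease": "asthma", "salary": 55000},
    {"disease": "diabetes", "salary": 61000},
]

ok = group_satisfies(group, "disease", "salary", max_confidence=0.5, min_range=20000)
print(ok)
```

Under these assumptions, tightening `max_confidence` or raising `min_range` makes a group harder to publish unchanged, which is the usual trade-off between privacy and data utility.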

Article Details

How to Cite
S. Riyana, K. Sasujit, N. Homdoung, and N. Riyana, “An Efficient Model for Publishing Microdata with Multiple Sensitive Attributes”, ECTI-CIT Transactions, vol. 19, no. 3, pp. 429–441, Aug. 2025.
Section
Research Article
