Improved Fairlet Decomposition for Fair Correlation Clustering

Main Article Content

Vacharapat Mettanant
Adisak Supeesun
Jittat Fakcharoenphol

Abstract

We consider fairness in correlation clustering, a fundamental clustering problem when the similarity between data points is represented as graphs. The fairness conditions we focus on guarantee that the resulting clusters satisfy the required proportion of population groups, modeled as vertex colors. Recently, Ahmadian, Epasto, Kumar, and Mahdian considered fairness constraints parameterized by α, the maximum fraction of points in a cluster having the same color. They addressed fairness constraints in correlation clustering using the fairlet decomposition framework and solved the fairlet decomposition problem for many fairness constraints α using optimization problems on median costs. This paper gives algorithms for the median-cost problem with better approximation ratios for various cases, to obtain improvements on the approximation ratios for the correlation clustering problem from 256 to 174 when α = 1/2 and from 16.48C2 to 40.96C+4.12 when α = 1/C, where C is the number of colors. We also consider the notion of proportional fairness, where there are only two colors with dierent population sizes. We give an approximation algorithm with provable approximation ratios, and provide connections to a star-covering problem.

Article Details

How to Cite
[1]
V. Mettanant, A. Supeesun, and J. Fakcharoenphol, “Improved Fairlet Decomposition for Fair Correlation Clustering”, ECTI-CIT Transactions, vol. 17, no. 1, pp. 137–152, Mar. 2023.
Section
Research Article

References

M. Kearns and A. Roth, “Ethical Algorithm Design,” ACM SIGecom Exchanges, vol. 18, pp. 31–36, Jul. 2020.

C. Dwork, N. Kohli and D. Mulligan, “Differential Privacy in Practice: Expose your Epsilons!,”

Journal of Privacy and Confidentiality, vol. 9, no. 2, pp. 1-22, Oct. 2019.

C. Dwork, “Differential Privacy in Distributed Environments: An Overview and Open Questions,” in Proceedings of the 2021 ACM Symposium on Principles of Distributed Computing, pp. 5, Jul. 2021.

C. Ro ̈sner and M. Schmidt, “Privacy Preserving Clustering with Constraints,” in The 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018), vol. 107, pp. 96:1-96:14, 2018.

M. Hardt, E. Price and N. Srebro, “Equality of Opportunity in Supervised Learning,” in The 30th International Conference on Neural Information Processing Systems, vol. 29, pp. 1-9, 2016.

H. Elzayn, et al., “Fair Algorithms for Learning in Allocation Problems,” in Proceedings of the Conference on Fairness, Accountability, and Transparency, pp.170-179, Jan. 2019.

K. Donahue and J. Kleinberg, “Fairness and Utilization in Allocating Resources with Uncertain Demand,” in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp.658-668, Jan. 2020.

A. Weller, “Transparency: Motivations and Challenges,” in Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pp.23- 40, 2019.

B. Wagner, et al., “Regulating Transparency? Facebook, Twitter and the German Network Enforcement Act,” in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 261-271, Jan. 2020.

A. Masoomi, et al., “Explanations of Black-Box Models based on Directional Feature Interactions,” in International Conference on Learning Representations, pp. 1-31, 2022.

F. Chierichetti, et al. “Fair Clustering through Fairlets,” in The 31th International Conference on Neural Information Processing Systems, pp. 1-9, 2017.

S. Ahmadian, et al., “Clustering without Over- Representation,” in The 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp.267-275, Jul. 2019.

S. Ahmadian, et al., “Fair Correlation Clustering,” in Proceedings of the 23rdInternational Conference on Artificial Intelligence and Statistics (AISTATS) 2020, 2020.

A. Backurs, et al., “Scalable fair clustering,” in Proceedings of the 36 th International Conference on Machine Learning, 2019.

S. K. Bera, et al., “Fair Algorithms for Clustering,” in Proceedings of the 33rd International Conference on Neural Information Processing Systems, no. 446, pp. 4954-4965, 2019.

I.O. Bercea, et al., “On the Cost of Essentially Fair Clusterings, in Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2019, no.18, pp.18:1-18:22, 2019.

S. A. Esmaeili, et al., “Probabilistic Fair Clustering,” in 34th Conference on Neural Information Processing Systems (NeurIPS 2020), pp. 1-13, 2020.

S. A. Esmaeili, et al., “Fair Clustering Under a Bounded Cost,” in 35th Conference on Neural Information Processing Systems (NeurIPS 2021), pp. 1-13, 2021.

M. Kleindessner, et al., “Guarantees for Spectral Clustering with Fairness Constraints,” in

Proceedings of the 36 th International Conference on Machine Learning, pp.1-10, 2019.

M. Schmidt, C. Schwiegelshohn, and C. Sohler, “Fair Coresets and Streaming Algorithms for Fair k-means,” in Approximation and Online Algorithms, pp. 232-251, 2020.

J. Tang, et al., “A Survey of Signed Network Mining in Social Media,” ACM Computing Surveys, vol. 49, no. 3, pp. 1-37, 2016.

P. Li, H. Dau, G. Puleo and O. Milenkovic, “Motif clustering and overlapping clustering for social network analysis,” IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, pp. 1-9, 2017.

Y. Yan, L. Chen and W.-C. Tjhi, “Semisupervised fuzzy co-clustering algorithm for document categorization,” Knowledge and information systems, vol. 34, pp. 55-74, 2013.

W.W. Cohen and J. Richman, “Learning to Match and Cluster Large High-Dimensional Data Sets for Data Integration,” in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 475-480, Jul. 2002.

N. Bansal, A. Blum and S. Chawla, “Correlation Clustering,” Machine Learning, vol. 56, pp. 89- 113, 2004.

S. Ahmadian and M. Negahbani, “Improved Approximation for Fair Correlation Clustering,” arXiv preprint arXiv:2206.05050, 2022.

S. Ahmadi, et al., “Fair Correlation Clustering,” arXiv preprint arXiv:2002.03508, 2020.

S. Chawla, et al., “Near Optimal LP Rounding Algorithm for CorrelationClustering on Complete and Complete k-Partite Graphs,” in Pro- ceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pp. 219-228, Jun. 2015.

N. Ailon, M. Charikar and A. Newman, “Aggregating Inconsistent Information: Ranking and Clustering,” Journal of the ACM, vol. 55, no. 5, pp. 1-27, 2008.

E.D. Demaine, et al., “Correlation Clustering in General Weighted Graphs,” Theoretical Computer Science, vol. 361, no. 2-3, op. 172-187, Sep. 2006.

S. Mahabadi and A. Vakilian. “Individual Fairness for k-Clustering,” in Proceedings of the 37th International Conference on Machine Learning, no. 611, pp. 6586-6596, 2020.

A. Vakilian and M. Yal ̧cıner, “Improved Approximation Algorithms for Individually Fair Clustering,” arXiv preprint arXiv:2106.14043, 2021.

D. Chakrabarty and M. Negahbani, “Better Algorithms for Individually Fair k-Clustering,” Advances in Neural Information Processing Systems, vol. 34, 2021.

C. Jung, S. Kannan, and N. Lutz, “A Center in Your Neighborhood: Fairness in Facility Location,” ArXiv:abs/1908.09041, 2019.

M. Abbasi, A. Bhaskara, and S. Venkatasubramanian. “Fair Clustering via Equitable Group Representations,” in The 2021 ACM Conference on Fairness, Accountability, and Transparency, pp. 504-514, Mar .2021.

M. Ghadiri,, S. Samadi, and S. Vempala. “Socially Fair K-Means Clustering,” in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pp. 438-448, Mar. 2021.

Y. Makarychev and A. Vakilian. “Approximation Algorithms for Socially Fair Clustering,” in Proceedings of Thirty Fourth Conference on Learning Theory, vol. 134, pp. 1-19, 2021.

D. Goyal and R. Jaiswal, “FPT Approximation for Socially Fair Clustering,” ArXiv:abs/2106.06755, 2021.

A. Schrijver, ‘Combinatorial Optimization - Polyhedra and Efficiency, Springer, 2003.

R.K. Ahuja, T.L. Magnanti and J.B. Orlin, Network Flows: Theory, Algorithms, and Applications, Prentice Hall, 1993.

P. Hall, “On Representatives of Subsets,” Journal of the London Mathematical Society, vol. s1- 10, no. 1, pp. 26-30, 1935.

M. Adamczyk,, et al. “Constant-Factor FPT Approximation for Capacitated k-Median,” in 27th Annual European Symposium on Algorithms (ESA 2019). 2019.

D. Goyal, R. Jaiswal and A. Kumar. “FPT Approximation for Constrained Metric k-Median/Means,” in 15th International Symposium on Parameterized and Exact Computation (IPEC 2020), pp. 14:1-14:19, 2020.

J. Fakcharoenphol, S. Rao and K. Talwar, “A Tight Bound on Approximating Arbitrary Metrics by Tree Metrics,” Journal of Computer and System Sciences, vol. 69, no. 3, pp. 485-497, Nov. 2004.