FEDIS: Graph-Based Social Media Evidence Collection with Correlation-Aware Forensic Analysis
Abstract
Online Social Networks (OSNs) have emerged as a major source of digital evidence in cybercrime investigations, abuse detection, and online incident analysis. Publicly available data, such as posts, comments, reactions, and user interactions, provide critical insight into suspicious activities and behavioral patterns. However, extracting and analyzing such data in a forensic context remains challenging. Existing social media data acquisition approaches rely primarily on web scraping or API-based techniques that produce unstructured outputs (e.g., raw text and images) without preserving the relationships among entities such as users, posts, and interactions, so the contextual information essential for forensic analysis is lost. Furthermore, data collection and analysis are often performed as separate steps, which delays investigations and limits real-time insight. To address these limitations, this paper focuses on two key forensic requirements: (i) efficient and structured acquisition of social media evidence and (ii) correlation-aware analysis of interactions. We propose FEDIS, a unified forensic data acquisition and analysis system that integrates a hybrid DOM-TAO data model, a graph-based representation, and parallel keyword-based search. Experimental results demonstrate that FEDIS achieves complete relationship preservation, improves data collection completeness, and significantly reduces search latency compared to traditional approaches, making it practical for real-world social media forensic investigations.
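To make the two ideas named in the abstract concrete, the following is a minimal Python sketch of a graph that keeps user, post, and comment relationships as typed edges, together with a keyword search that runs one query per worker thread. It is an illustration only and not the FEDIS implementation: the class EvidenceGraph, the functions search_keyword and parallel_search, the relation names AUTHORED and REPLY_TO, and the sample data are all hypothetical, and the DOM-TAO data model itself is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class EvidenceGraph:
    """Toy property graph (hypothetical): nodes carry attributes, edges carry a relation type."""
    nodes: dict = field(default_factory=dict)  # node_id -> {"kind": ..., "text": ...}
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_node(self, node_id, kind, text=""):
        self.nodes[node_id] = {"kind": kind, "text": text}

    def add_edge(self, source_id, relation, target_id):
        self.edges.append((source_id, relation, target_id))

    def neighbors(self, node_id):
        """Return every edge that touches node_id, in either direction."""
        return [e for e in self.edges if node_id in (e[0], e[2])]


def search_keyword(graph, keyword):
    """Return sorted IDs of nodes whose text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(
        node_id
        for node_id, attrs in graph.nodes.items()
        if needle in attrs["text"].lower()
    )


def parallel_search(graph, keywords, max_workers=4):
    """Submit one keyword search per task to a thread pool and collect hits per keyword."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda kw: search_keyword(graph, kw), keywords)
        return dict(zip(keywords, results))


# Invented sample evidence: two users, one post, one reply.
graph = EvidenceGraph()
graph.add_node("u1", "user")
graph.add_node("u2", "user")
graph.add_node("p1", "post", "Selling stolen accounts, DM me")
graph.add_node("c1", "comment", "How much for the accounts?")
graph.add_edge("u1", "AUTHORED", "p1")
graph.add_edge("u2", "AUTHORED", "c1")
graph.add_edge("c1", "REPLY_TO", "p1")

hits = parallel_search(graph, ["accounts", "stolen"])
print(hits)                   # matching node IDs per keyword
print(graph.neighbors("p1"))  # author and reply edges around the matched post
```

Because each hit is a node in the graph, the edges around it (who authored the post, which comment replies to it) remain available after the search, which is the contextual information that flat scraped output loses. Threads are used here only to show the fan-out pattern; a production system would more likely delegate the search to the graph database's own index.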
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.