MEASURING THE ENERGY EFFICIENCY OF RDF QUERY PROCESSING

Authors

  • Marko Niinimaki, Lecturer, Chulalongkorn School of Integrated Innovation, Chulalongkorn University, 254 Phayathai Road, Pathumwan, Bangkok 10330
  • Kitichai Chanyalikit, Assistant, Chulalongkorn School of Integrated Innovation, Chulalongkorn University, 254 Phayathai Road, Pathumwan, Bangkok 10330
  • Atamfon Udofia, Student, Chulalongkorn School of Integrated Innovation, Chulalongkorn University, 254 Phayathai Road, Pathumwan, Bangkok 10330

Keywords:

database, energy, RDF, RDF4J

Abstract

The cost of the electric power consumed by a server is a significant component of its total cost of ownership. Since database servers are essential in the era of Big Data, we studied the performance and energy consumption of a small database server. We stored a large set of RDF (Resource Description Framework) data in an RDF4J database running on consumer-grade hardware. Using realistic SPARQL queries from Wikidata and a low-cost power/energy meter, we measured the energy consumption of RDF query processing. Our database management system responded to queries over a network connection, and we found the network processing overhead in query processing to be low (about 2 to 4%). The most energy-efficient processing (queries per watt) was achieved with a slightly larger degree of parallelism than the one that gave the best throughput (queries per hour). Moreover, using a stripped-down version of the operating system on which the database ran did not affect the energy consumption of query processing.
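
The two metrics named in the abstract can be related with a small calculation. The sketch below is only an illustration, not the authors' tooling: the `Run` class, its field names, and all numbers are invented placeholders rather than measurements from the study. It also assumes that "queries per watt" means throughput divided by average power, which equals queries per watt-hour.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One benchmark run (hypothetical structure, placeholder values)."""
    parallelism: int   # number of concurrent query clients
    queries: int       # queries completed during the run
    seconds: float     # wall-clock duration of the run
    energy_wh: float   # energy read from the meter, in watt-hours

    @property
    def queries_per_hour(self) -> float:
        # Throughput: completed queries normalised to one hour.
        return self.queries / (self.seconds / 3600.0)

    @property
    def average_power_w(self) -> float:
        # Average power is energy divided by elapsed time in hours.
        return self.energy_wh / (self.seconds / 3600.0)

    @property
    def queries_per_wh(self) -> float:
        # Energy efficiency: throughput / average power = queries / energy.
        return self.queries / self.energy_wh


# Invented placeholder numbers, NOT results from the study.
runs = [
    Run(parallelism=1, queries=1000, seconds=3600, energy_wh=50.0),
    Run(parallelism=2, queries=1800, seconds=3600, energy_wh=80.0),
    Run(parallelism=4, queries=1760, seconds=3600, energy_wh=76.0),
]

best_throughput = max(runs, key=lambda r: r.queries_per_hour)
best_efficiency = max(runs, key=lambda r: r.queries_per_wh)

for r in runs:
    print(f"p={r.parallelism}: {r.queries_per_hour:.0f} q/h, "
          f"{r.average_power_w:.1f} W, {r.queries_per_wh:.2f} q/Wh")
print("best throughput at parallelism", best_throughput.parallelism)
print("best efficiency at parallelism", best_efficiency.parallelism)
```

With these placeholder values the throughput optimum and the efficiency optimum fall at different degrees of parallelism, which is the kind of divergence the abstract describes. Real meter readings would replace the `runs` list.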

Published

2024-08-30

Section

Research Article