A Comparison of Real-Time Data Analytics Algorithms
Abstract
Today's world is flooded with streams of data generated by IoT sensors, smartphone applications, e-commerce transactions, and other sources. Data streaming and real-time analytics tools are needed for purposes such as financial fraud detection, product recommendation, and disaster warning systems. Existing real-time data analytics tools such as StreamDM, Scikit-multiflow, and Massive Online Analysis (MOA) play a significant role in this field. However, thorough comparisons of the streaming algorithms in these tools are still lacking. In this paper, we study and compare the performance of the streaming algorithms provided by Scikit-multiflow, one of the most popular of these tools. In the experiments, we compare various algorithms on classification and regression problems in terms of accuracy, model size, memory usage, and other metrics. Both synthetic and real-world datasets are used. The experimental results show that the Hoeffding Tree algorithm performs best among the algorithms compared.
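To make the idea of measuring accuracy on a data stream concrete, the following is a minimal sketch of test-then-train (prequential) evaluation, a common way to score streaming classifiers. The abstract does not state which evaluation protocol the experiments used, so this choice is an assumption. The sketch deliberately avoids the Scikit-multiflow API: `MajorityClassLearner`, `synthetic_stream`, and `prequential_accuracy` are hypothetical names introduced only for illustration, and the baseline learner is a stand-in, not one of the algorithms compared in the paper.

```python
import random
from collections import Counter


class MajorityClassLearner:
    """Hypothetical baseline: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any training sample has arrived, fall back to label 0.
        if not self.counts:
            return 0
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        # Incremental update: only a label count is kept, so memory stays constant.
        self.counts[y] += 1


def synthetic_stream(n, seed=42):
    """Yield n (features, label) pairs; the label is 1 when x0 + x1 > 1."""
    rng = random.Random(seed)
    for _ in range(n):
        x = (rng.random(), rng.random())
        yield x, int(x[0] + x[1] > 1.0)


def prequential_accuracy(learner, stream):
    """Test-then-train: predict on each sample first, then learn from it."""
    correct = 0
    total = 0
    for x, y in stream:
        correct += int(learner.predict(x) == y)
        learner.learn(x, y)
        total += 1
    return correct / total if total else 0.0


accuracy = prequential_accuracy(MajorityClassLearner(), synthetic_stream(1000))
print(f"prequential accuracy: {accuracy:.3f}")
```

Because every sample is used for testing before it is used for training, no separate hold-out set is needed, which suits unbounded streams. Any learner exposing the same `predict` and `learn` methods could be passed to `prequential_accuracy` so that several algorithms are scored on an identical stream.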
Article Details
Article Accepting Policy
The editorial board of Thai-Nichi Institute of Technology is pleased to receive articles, written in Thai or English, from lecturers and experts in the fields of business administration, languages, engineering, and technology. Academic work submitted for publication must not have been published elsewhere and must not be under consideration by another journal. Those interested in sharing their work and knowledge can submit an article to the editorial board, which forwards it to the screening committee for consideration. Only research articles are published. Interested persons can prepare their articles by reviewing the recommendations for article authors.
Copyright infringement is solely the responsibility of the author(s) of the article. Published articles must have been screened and reviewed for quality by qualified experts approved by the editorial board.
The text of each article published in this research journal is the personal opinion of its author(s) and is not related in any way to Thai-Nichi Institute of Technology or to other faculty members of the institution. Each author is responsible for the content and accuracy of his or her article, including any mistakes it contains.
The editorial board reserves the right not to publish any content, views, or comments from articles in the Journal of Thai-Nichi Institute of Technology without first receiving written permission from the authorized author(s). Published work is the copyright of the Journal of Thai-Nichi Institute of Technology.