Sentiment Analysis of Patient Satisfaction with the Halodoc Application Using Naive Bayes, SVM, and KNN
Abstract
This research analyzes the sentiment of patient satisfaction with the Halodoc application, a digital health platform that provides online medical consultations, medicine ordering, and other health services. The study applies three machine learning classifiers, Naive Bayes, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), to patient reviews collected from sources such as the Google Play Store, the App Store, and online health forums. A total of 5,000 reviews were preprocessed through tokenization, stemming, and stop-word removal, and then classified as positive, negative, or neutral. Naive Bayes achieved the highest classification accuracy at 85%, followed by SVM at 82% and KNN at 78%. Key satisfaction factors include ease of access, doctor response speed, and service quality, while common complaints concern technical issues in the application, service costs, and feature limitations. These findings give Halodoc developers insight into how to improve the user experience and contribute to the literature on sentiment analysis in the digital health sector.
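The abstract does not describe the implementation, so the following is only a minimal sketch of the kind of pipeline it outlines: tokenization, stop-word removal, stemming, and a multinomial Naive Bayes classifier scored by accuracy. Everything specific here is an assumption made for illustration: the stop-word list, the crude English suffix-stripping rule (a stand-in for whatever stemmer the study used), the class and function names, and the six invented training reviews, none of which come from the study's 5,000-review data set. SVM and KNN are not shown; they would be trained on the same preprocessed tokens.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list and suffix rules, for illustration only.
STOP_WORDS = {"the", "is", "a", "and", "to", "was", "very", "it", "i", "my"}
SUFFIXES = ("ing", "ed", "s")


def preprocess(text):
    """Tokenize, drop stop words, and apply a crude suffix-stripping stemmer."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        for suffix in SUFFIXES:
            # Strip at most one suffix and keep at least three characters.
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems


class MultinomialNB:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc:
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


# Invented toy reviews (NOT taken from the study's data).
train = [
    ("The doctor responded quickly and the service was great", "positive"),
    ("Easy to use and very helpful consultation", "positive"),
    ("The app keeps crashing and payment failed", "negative"),
    ("Expensive fees and slow loading", "negative"),
    ("I ordered medicine through the app", "neutral"),
    ("Consultation scheduled for tomorrow", "neutral"),
]
test = [
    ("Great doctor and helpful service", "positive"),
    ("App crashed and loading is slow", "negative"),
]

model = MultinomialNB().fit([preprocess(t) for t, _ in train], [y for _, y in train])
test_docs = [preprocess(t) for t, _ in test]
test_labels = [y for _, y in test]
print(f"accuracy = {accuracy(model, test_docs, test_labels):.2f}")
```

Accuracy on a handful of toy sentences says nothing about the reported 85%, 82%, and 78% figures; a real comparison would need a held-out split of the labeled reviews and the same preprocessing applied to all three classifiers.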

This work is licensed under a Creative Commons Attribution 4.0 International License.