Date of Award
12-2024
Document Type
Project
Degree Name
Master of Science in Information Systems and Technology
Department
Information and Decision Sciences
First Reader/Committee Chair
Conrad Shayo
Abstract
In recent years, the proliferation of online patient-generated drug reviews has created a valuable resource for assessing drug effectiveness and patient satisfaction, with sentiment analysis emerging as a powerful tool for extracting insights from this unstructured data.
This culminating research project conducted a comparative analysis of traditional Machine Learning (ML) and Deep Learning (DL) models for assessing drug effectiveness using sentiment analysis of participant reviews. The research aimed to evaluate the performance of Support Vector Machine (SVM), XGBoost, Random Forest, Long Short-Term Memory (LSTM), and Bidirectional Encoder Representations from Transformers (BERT) models in this context. The project addressed three main research questions: (1) How do traditional ML models compare to each other in assessing drug effectiveness ratings? (2) How do DL models compare to each other in this assessment? (3) How does the performance of DL models compare to that of traditional ML methods? The data was sourced from the UCI Machine Learning Repository and analyzed using Python within the Kaggle environment.
For the first question, Random Forest demonstrated superior performance among traditional ML models, achieving 97% accuracy, followed by SVM (95%) and XGBoost (94%). Random Forest excelled in precision for negative reviews (0.99) and recall for positive reviews (0.99), while SVM and XGBoost showed slightly better precision for positive reviews (0.97). Regarding the second question, BERT outperformed LSTM in assessing drug effectiveness ratings, achieving 86% accuracy compared to LSTM's 82%. BERT demonstrated higher precision, especially for positive reviews (0.90), and better recall for negative reviews (0.68). BERT's F1 scores were consistently higher than LSTM's for both positive and negative reviews. For the third question, traditional ML models, particularly Random Forest, outperformed DL models in overall accuracy, precision, recall, and F1 score. However, DL models, especially BERT, showed promise in handling complex language patterns and nuances in sentiment analysis.
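For readers less familiar with the per-class figures quoted above, the following minimal Python sketch shows how accuracy, precision, recall, and F1 score are conventionally computed for a two-class (positive/negative) sentiment task. It is an illustration only, not the project's code: the function name `per_class_metrics`, the label strings, and the ten toy labels are assumptions made for this example and are not drawn from the study's data or results.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, labels=("negative", "positive")):
    """Return overall accuracy plus precision, recall, and F1 for each label.

    Hypothetical helper for illustration; not taken from the project.
    """
    # Count every (true label, predicted label) pair, i.e. the confusion matrix.
    pairs = Counter(zip(y_true, y_pred))
    total = len(y_true)
    correct = sum(pairs[(label, label)] for label in labels)
    report = {"accuracy": correct / total if total else 0.0}

    for label in labels:
        tp = pairs[(label, label)]
        predicted = sum(pairs[(other, label)] for other in labels)  # TP + FP
        actual = sum(pairs[(label, other)] for other in labels)     # TP + FN
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy labels, invented for this example (not the study's data).
y_true = ["positive"] * 6 + ["negative"] * 4
y_pred = ["positive"] * 5 + ["negative"] + ["negative"] * 3 + ["positive"]

report = per_class_metrics(y_true, y_pred)
print(report)
```

Precision for a class is the share of reviews predicted as that class that truly belong to it, while recall is the share of reviews truly in that class that the model recovered. This is why a model can score high precision on positive reviews and still have comparatively low recall on negative ones.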
The study concludes that while traditional ML models, particularly Random Forest, currently offer superior performance in assessing drug effectiveness through sentiment analysis, DL models like BERT show potential for handling complex linguistic patterns. Future research directions include exploring multiclass classification, incorporating multimodal data, and investigating additional ML models for sentiment prediction. Using more recent and comprehensive datasets could provide more current insights, while analyzing sentiment evolution over time for specific drugs or conditions may reveal valuable trends.
Recommended Citation
Nwogu, Blessing Ogechukwu, "COMPARATIVE ASSESSMENT OF MACHINE LEARNING AND DEEP LEARNING MODELS FOR DRUG EFFECTIVENESS USING SENTIMENT ANALYSIS" (2024). Electronic Theses, Projects, and Dissertations. 2061.
https://scholarworks.lib.csusb.edu/etd/2061
Included in
Business Intelligence Commons, Health Information Technology Commons, Technology and Innovation Commons