This project analyzes customer reviews from Flipkart to determine whether a review expresses a Positive or Negative sentiment. Natural Language Processing (NLP) techniques convert the textual reviews into numerical representations, which are then classified using machine learning models.
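To make "numerical representation" concrete, here is a minimal bag-of-words sketch using only the Python standard library. The exact vectorization used by this project is not stated here, so treat the tokenizer, the helper names (`tokenize`, `build_vocabulary`, `vectorize`), and the sample reviews as illustrative assumptions rather than the project's actual pipeline.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a review and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(reviews):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for review in reviews for tok in tokenize(review)})
    return {tok: idx for idx, tok in enumerate(tokens)}

def vectorize(review, vocabulary):
    """Return a word-count vector for one review; unknown words are skipped."""
    counts = Counter(tokenize(review))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector

# Made-up reviews, for illustration only.
reviews = ["Good product, good price", "Bad quality"]
vocab = build_vocabulary(reviews)
vectors = [vectorize(r, vocab) for r in reviews]
```

Each review becomes a fixed-length list of counts, one column per vocabulary word, which is the kind of input a classifier such as Logistic Regression can consume.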
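Multiple models were trained and evaluated. The final deployed model is a Balanced Logistic Regression model, selected for its strong performance on imbalanced data: it has the highest Macro F1 (0.780) in the comparison table and, at 0.662, the best Negative Recall among the models that keep accuracy above 0.85, which reflects its ability to correctly identify negative customer feedback.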
When a review is submitted, the system performs the following steps:
Class imbalance is handled using class weighting: errors on the minority (negative) class are penalized more heavily during training, so negative reviews are less likely to be overlooked at prediction time.
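As a rough illustration of class weighting, the sketch below computes weights inversely proportional to class frequency, `n_samples / (n_classes * class_count)`. This is the heuristic scikit-learn documents for `class_weight="balanced"`; whether this project uses that exact setting is an assumption, and the label names and counts are made up.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced training labels: 8 positive, 2 negative.
labels = ["positive"] * 8 + ["negative"] * 2
weights = balanced_class_weights(labels)
```

With this 8:2 split, a mistake on a negative review counts four times as much as a mistake on a positive one (2.5 versus 0.625), which pushes the model to pay attention to the rarer class.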
| Model | Accuracy | Weighted F1 | Macro F1 | Negative Recall |
|---|---|---|---|---|
| SVM (Linear) | 0.869 | 0.859 | 0.765 | 0.513 |
| Logistic Regression (Balanced) – Final Model | 0.858 | 0.859 | 0.780 | 0.662 |
| Logistic Regression (Baseline) | 0.871 | 0.854 | 0.748 | 0.436 |
| Decision Tree + SMOTE | 0.854 | 0.843 | 0.738 | 0.478 |
| Naive Bayes | 0.860 | 0.833 | 0.703 | 0.338 |
| Logistic Regression + SMOTE | 0.700 | 0.729 | 0.643 | 0.763 |
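
The four metrics in the table can be computed from true and predicted labels as in the sketch below. It uses the standard definitions (Macro F1 is the unweighted mean of per-class F1; Weighted F1 weights each class's F1 by its support). The function names and the toy labels are hypothetical and do not reproduce the numbers above.

```python
def per_class_scores(y_true, y_pred, label):
    """Return (recall, f1, support) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1, tp + fn

def summarize(y_true, y_pred, negative_label="neg"):
    """Compute accuracy, weighted F1, macro F1, and recall on the negative class."""
    labels = sorted(set(y_true))
    scores = {lab: per_class_scores(y_true, y_pred, lab) for lab in labels}
    total = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / total,
        "weighted_f1": sum(f1 * support for _, f1, support in scores.values()) / total,
        "macro_f1": sum(f1 for _, f1, _ in scores.values()) / len(labels),
        "negative_recall": scores[negative_label][0],
    }

# Toy data: 8 positive and 2 negative reviews; one negative is missed.
y_true = ["pos"] * 8 + ["neg"] * 2
y_pred = ["pos"] * 8 + ["neg", "pos"]
report = summarize(y_true, y_pred)
```

On this toy data accuracy is 0.9 while Negative Recall is only 0.5, which shows why accuracy alone can hide poor performance on the minority class and why Macro F1 and Negative Recall are reported alongside it.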