Sentiment Analysis of Flipkart Product Reviews

Project Overview

This project analyzes customer reviews from Flipkart to determine whether a review expresses a Positive or Negative sentiment. Natural Language Processing (NLP) techniques convert the textual reviews into numerical representations, which are then classified using machine learning models.

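Multiple models were trained and evaluated. The final deployed model is a Balanced Logistic Regression model, selected for its strong performance on imbalanced data: it has the highest Macro F1 of all models compared (0.780) and the highest Negative Recall (0.662) among the models with accuracy above 0.85, so it identifies negative customer feedback more reliably.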

Model Input and Prediction



How the Model Works

When a review is submitted, the system performs the following steps:

  1. Text cleaning and preprocessing.
  2. TF-IDF vectorization to convert text into numerical features.
  3. Prediction using a Balanced Logistic Regression classifier.
  4. Final sentiment is returned as Positive or Negative.
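
The steps above can be sketched as a single pipeline. This is a minimal illustration, assuming scikit-learn; the source does not name the library, the exact cleaning rules, or the hyperparameters, so `clean_text`, `predict_sentiment`, and the toy reviews below are hypothetical stand-ins rather than the project's actual code or data.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def clean_text(text: str) -> str:
    """Hypothetical cleaning step: lowercase, drop non-letters, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Toy data for illustration only; this is not the Flipkart dataset.
reviews = [
    "Great phone, excellent battery life",
    "Very good quality, loved it",
    "Amazing product, works perfectly",
    "Good value for money, happy with it",
    "Terrible product, stopped working in a week",
    "Worst purchase, very bad quality",
]
labels = ["Positive", "Positive", "Positive", "Positive", "Negative", "Negative"]

# Steps 1-3: cleaning, TF-IDF vectorization, class-weighted logistic regression.
model = Pipeline([
    ("tfidf", TfidfVectorizer(preprocessor=clean_text)),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
model.fit(reviews, labels)


def predict_sentiment(review: str) -> str:
    """Step 4: return the predicted label, 'Positive' or 'Negative'."""
    return str(model.predict([review])[0])


print(predict_sentiment("bad quality, terrible product"))
```

Wrapping the vectorizer and the classifier in one `Pipeline` means a submitted review passes through the same cleaning and TF-IDF vocabulary that were used at training time.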

Class imbalance is handled using class weighting during training: errors on the less frequent class are penalized more heavily, so that negative reviews are not overlooked in favor of the majority class.
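
To illustrate what class weighting does, the sketch below computes weights with the "balanced" heuristic documented by scikit-learn, `n_samples / (n_classes * class_count)`. Whether the project uses exactly this heuristic is an assumption, and the 8:2 label split is a made-up example, not the real class ratio of the dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses n_samples / (n_classes * class_count), the heuristic behind
    scikit-learn's class_weight="balanced" (assumed here, not confirmed
    by the source).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical 8:2 split, for illustration only.
labels = ["Positive"] * 8 + ["Negative"] * 2
weights = balanced_class_weights(labels)
print(weights)  # {'Positive': 0.625, 'Negative': 2.5}
```

With these weights, a misclassified negative review contributes four times as much to the training loss as a misclassified positive one.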

Model Performance Comparison

Model                                          Accuracy  Weighted F1  Macro F1  Negative Recall
SVM (Linear)                                   0.869     0.859        0.765     0.513
Logistic Regression (Balanced) – Final Model   0.858     0.859        0.780     0.662
Logistic Regression (Baseline)                 0.871     0.854        0.748     0.436
Decision Tree + SMOTE                          0.854     0.843        0.738     0.478
Naive Bayes                                    0.860     0.833        0.703     0.338
Logistic Regression + SMOTE                    0.700     0.729        0.643     0.763
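
To make the table's columns concrete, the sketch below computes accuracy, weighted F1, macro F1, and negative recall from true and predicted labels in plain Python. The helper names and the toy labels are hypothetical; the numbers it prints are unrelated to the results reported above.

```python
def per_class_scores(y_true, y_pred, label):
    """Return (recall, f1, support) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1, tp + fn


def summarize(y_true, y_pred, labels=("Positive", "Negative")):
    """Compute the four metrics used in the comparison table."""
    n = len(y_true)
    scores = {label: per_class_scores(y_true, y_pred, label) for label in labels}
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        # Weighted F1: per-class F1 averaged by class support.
        "weighted_f1": sum(f1 * support for _, f1, support in scores.values()) / n,
        # Macro F1: unweighted mean, so the minority class counts equally.
        "macro_f1": sum(f1 for _, f1, _ in scores.values()) / len(labels),
        "negative_recall": scores["Negative"][0],
    }


# Toy example: 8 positive and 2 negative reviews, one negative missed.
y_true = ["Positive"] * 8 + ["Negative"] * 2
y_pred = ["Positive"] * 8 + ["Negative", "Positive"]
metrics = summarize(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

In this toy case accuracy is 0.9 while negative recall is only 0.5, which shows why accuracy alone can hide poor detection of negative reviews on imbalanced data.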