Global Movie & Webseries Recommendation System

This repository provides a content-based recommendation system for movies and web series, built on a 100,000-row synthetic dataset.

Table of Contents

  • Features
  • Project Structure
  • Getting Started
  • Usage
  • Dependencies

Features

  • Large Synthetic Dataset: Generates 100,000 entries that blend realistic movie and web series titles with diverse genres, casts, directors, and other attributes.
  • Enhanced Diversity: Minimizes repetition through broad lists of unique titles, actors, and directors.
  • TF-IDF Based Similarity: Merges metadata (genres, keywords, cast, director, and tagline) into a unified text field, then applies TF-IDF and cosine similarity to find similar content.
  • 1 x N Computation: Computes similarity between the selected item and all others efficiently, avoiding the overhead of a full NxN comparison matrix.
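The TF-IDF and 1 x N ideas above can be sketched as follows. This is a minimal illustration, not the code in recommendation_engine.py: the four-item catalogue, the column names, the default vectorizer settings, and the helper names `combine` and `similar_to` are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny hypothetical catalogue; the real dataset's columns may be named differently.
items = [
    {"title": "Star Quest", "genres": "sci-fi adventure", "keywords": "space alien",
     "cast": "Ann Lee", "director": "Raj Mehta", "tagline": "Beyond distant stars"},
    {"title": "Galaxy Run", "genres": "sci-fi adventure", "keywords": "space pilot",
     "cast": "Ann Lee", "director": "Raj Mehta", "tagline": "Race across galaxies"},
    {"title": "Kitchen Wars", "genres": "comedy reality", "keywords": "cooking contest",
     "cast": "Tom Ruiz", "director": "Mia Chen", "tagline": "Taste real heat"},
    {"title": "Silent Harbor", "genres": "crime drama", "keywords": "detective murder",
     "cast": "Omar Diaz", "director": "Lena Park", "tagline": "Every tide hides secrets"},
]

def combine(item):
    """Merge the metadata fields into one text field."""
    fields = ("genres", "keywords", "cast", "director", "tagline")
    return " ".join(item[f] for f in fields)

# Fit TF-IDF once over the whole catalogue: a sparse matrix of shape (N, vocabulary).
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform([combine(it) for it in items])

def similar_to(index, top_n=3):
    """Compare one row against all rows (1 x N) instead of building an N x N matrix."""
    scores = cosine_similarity(tfidf[index], tfidf).ravel()
    scores[index] = -1.0  # exclude the query item itself
    order = scores.argsort()[::-1][:top_n]
    return [(items[i]["title"], float(scores[i])) for i in order]

print(similar_to(0, top_n=2))
```

With 100,000 rows, a full N x N similarity matrix would hold 10^10 entries, whereas the 1 x N query keeps only one score per item, which is why the single-row comparison is the practical choice at this scale.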

Project Structure

├── generate_data.py                    # Script to generate the 100K-row dataset
├── recommendation_engine.py            # Script to run the interactive recommendation system
├── final_100k_movies_webseries.csv     # Generated CSV file (after you run generate_data.py)
├── MovieRecommendationSystem.ipynb     # (Optional) Jupyter Notebook version (if any)
└── README.md                           # Project documentation

Getting Started

  1. Clone this repository.
  2. Ensure you have a Python 3.x environment set up.

Usage

1. Dataset Generation

Run:

python generate_data.py

This creates a file called final_100k_movies_webseries.csv in the same directory, containing 100,000 rows of synthetic data.
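For intuition, synthetic rows of this kind could be produced roughly as in the sketch below. It is not the contents of generate_data.py: the word lists, the column names, the numeric title suffix used to keep titles unique, and the helpers `generate_rows` and `to_csv_text` are hypothetical, and the real script draws on much broader lists.

```python
import csv
import io
import random

# Hypothetical vocabularies for illustration only.
ADJECTIVES = ["Silent", "Crimson", "Lost", "Golden"]
NOUNS = ["Harbor", "Empire", "Signal", "Garden"]
GENRES = ["drama", "comedy", "thriller", "sci-fi", "romance"]
DIRECTORS = ["Raj Mehta", "Mia Chen", "Lena Park"]
FIELDS = ["title", "type", "genres", "director"]

def generate_rows(n, seed=0):
    """Build n synthetic rows; a fixed seed makes the output reproducible."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        rows.append({
            "title": f"{rng.choice(ADJECTIVES)} {rng.choice(NOUNS)} {i}",
            "type": rng.choice(["Movie", "Webseries"]),
            "genres": " ".join(rng.sample(GENRES, 2)),
            "director": rng.choice(DIRECTORS),
        })
    return rows

def to_csv_text(rows):
    """Serialize rows to CSV text (in memory here, rather than writing a file)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

csv_text = to_csv_text(generate_rows(5))
print(csv_text)
```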

2. Recommendation Engine

After generating the dataset, run:

python recommendation_engine.py

The program will:

  1. Ask you to choose Movie or Webseries.
  2. Display some titles from that category.
  3. Ask you to enter your favorite title.
  4. Find the closest match.
  5. Compute cosine similarity (1 x N) between your chosen item and all others.
  6. Print the top recommendations (up to 30).
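Steps 1 to 4 amount to filtering by category and then fuzzy-matching the typed title. The README does not say how the closest match is found, so the sketch below assumes Python's standard `difflib` module; the small catalogue, the `closest_title` helper, and the 0.6 cutoff are hypothetical.

```python
import difflib

# Hypothetical (title, category) pairs standing in for the CSV rows.
catalogue = [
    ("Galaxy Run", "Movie"),
    ("Star Quest", "Movie"),
    ("Silent Harbor", "Webseries"),
    ("Kitchen Wars", "Webseries"),
]

def closest_title(query, category, cutoff=0.6):
    """Return the best-matching title within a category, or None if nothing is close."""
    # Compare in lowercase so capitalization in the user's input does not matter.
    lowered = {title.lower(): title for title, kind in catalogue if kind == category}
    matches = difflib.get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

print(closest_title("silent harbour", "Webseries"))
print(closest_title("galxy run", "Movie"))
```

The matched title would then be the query row for the 1 x N cosine similarity in step 5, with the highest-scoring rows (up to 30) printed in step 6.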

Dependencies

The project depends on numpy, pandas, and scikit-learn. Install them with:

pip install -r requirements.txt

Or individually:

pip install numpy pandas scikit-learn