This repository contains the project that I completed as part of my internship with Afame Technologies. The project is described below.
The Indian film industry is one of the largest in the world, producing many movies across various genres each year. In this project, we aimed to understand how different factors contribute to a movie's success, particularly its rating, and to predict movie ratings from features such as release year, duration, genre, and number of votes using machine learning techniques.
The dataset was pulled from IMDb.com and covers all the Indian movies on the platform. It includes information such as movie title, release year, duration, genre, number of votes, and IMDb rating. Data preprocessing involved handling missing values, encoding categorical variables, transforming features, and scaling numerical features to ensure compatibility with machine learning algorithms.
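The preprocessing steps above can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the exact pipeline used in the project: the column names (`Year`, `Duration`, `Genre`, `Votes`), the sample rows, the median and most-frequent imputation strategies, and the log transform on votes are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

# Made-up rows for illustration only; the column names are assumptions.
movies = pd.DataFrame({
    "Year": [2001, 2010, 2015, 2019, np.nan],
    "Duration": [150, np.nan, 132, 121, 140],
    "Genre": ["Drama", "Action", np.nan, "Comedy", "Drama"],
    "Votes": [1200, 35, 48000, np.nan, 560],
})

# Numeric columns: fill missing values with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Vote counts are heavily skewed, so apply log(1 + x) before scaling.
votes = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("log", FunctionTransformer(np.log1p)),
    ("scale", StandardScaler()),
])

# Categorical column: fill missing values with the most frequent genre,
# then one-hot encode.
genre = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric, ["Year", "Duration"]),
        ("votes", votes, ["Votes"]),
        ("genre", genre, ["Genre"]),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

features = preprocess.fit_transform(movies)
print(features.shape)
```

Wrapping the steps in a `ColumnTransformer` keeps imputation and scaling statistics tied to the training data, which matters once the same object is reused inside cross-validation.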
We first performed exploratory data analysis to gain insight into the features and their relationships with the target variable, and used several plots to visualize the findings.
For our prediction task, we experimented with several machine learning algorithms, including linear regression, k-nearest neighbors, support vector machines, random forests, adaptive boosting, and gradient boosting. We used cross-validation to evaluate each model and random search to tune its hyperparameters, with the aim of improving predictive accuracy and preventing overfitting.
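As a rough illustration of combining cross-validation with random search, the sketch below tunes a random forest using scikit-learn's `RandomizedSearchCV`. The synthetic data from `make_regression`, the parameter ranges, the number of sampled candidates, and the choice of MAE as the scoring metric are assumptions for the example, not the settings used in the project.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the preprocessed movie features and ratings.
X, y = make_regression(n_samples=200, n_features=6, noise=10.0, random_state=0)

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    # Hypothetical search ranges; each candidate is sampled at random.
    param_distributions={
        "n_estimators": randint(20, 80),
        "max_depth": randint(2, 10),
        "min_samples_leaf": randint(1, 8),
    },
    n_iter=5,                            # number of sampled candidates
    cv=5,                                # 5-fold cross-validation per candidate
    scoring="neg_mean_absolute_error",   # scikit-learn maximizes, hence "neg"
    random_state=0,
)
search.fit(X, y)

best_params = search.best_params_
cv_mae = -search.best_score_  # flip the sign back to a positive MAE
print(best_params, cv_mae)
```

The same pattern applies to the other model families by swapping the estimator and its parameter distributions.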
We evaluated the performance of our models using root mean squared error (RMSE), R-squared (R2) score, and mean absolute error (MAE). Our best-performing model demonstrated strong predictive performance, estimating the ratings of Indian movies with an MAE of 0.83.
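For reference, the three metrics can be computed with scikit-learn as in the short sketch below. The ratings and predictions are made-up numbers chosen for easy arithmetic and are not results from the project.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up true ratings and predictions, for illustration only.
y_true = np.array([7.0, 5.5, 6.0, 8.0])
y_pred = np.array([6.5, 6.0, 6.0, 7.0])

mae = mean_absolute_error(y_true, y_pred)           # average absolute error
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # penalizes large errors more
r2 = r2_score(y_true, y_pred)                       # share of variance explained

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```

MAE and RMSE are in the same units as the rating itself, so an MAE of 0.83 means predictions are off by 0.83 rating points on average.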
In conclusion, our project demonstrated the feasibility of predicting the ratings of Indian movies using machine learning techniques. By leveraging features such as release year, duration, genre, and number of votes, we developed a predictive model that can assist us in understanding and optimizing the factors that contribute to a movie's success. Our findings contribute to the growing body of research that applies data-driven approaches to decision-making in the entertainment industry.