About Data: The dataset contains transactional records of a UK-based online retail company, covering the period from 01/12/2010 to 09/12/2011. Key attributes include InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, and Country. The company specializes in unique all-occasion gifts, serving mainly wholesalers.
Problem Statement: The objective is to analyze customer purchasing behavior, identify patterns, and improve business strategies for the online retail company.
Libraries Used:
- numpy
- pandas
- matplotlib.pyplot
- seaborn
- datetime
- sklearn
- scipy.cluster.hierarchy
Solution: The project follows a structured data analysis pipeline:
- Data Cleaning: Preprocessing and cleaning of the dataset to ensure data quality.
- Exploratory Data Analysis (EDA): Descriptive analysis and visualization to gain insights into customer behavior and transaction patterns.
- Clustering Analysis: Customer segmentation with K-Means and hierarchical clustering, based on transactional attributes.
- Modeling: Application of clustering algorithms to identify customer segments and provide recommendations for business improvement.
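
The cleaning and feature-preparation steps above could look roughly like the sketch below. It is illustrative, not the project's actual code. The sample rows are made up, the helper names `clean_transactions` and `customer_features` are hypothetical, and the specific cleaning rules (dropping rows without a CustomerID, keeping only positive Quantity and UnitPrice) and the recency/frequency/monetary features are assumptions about what "transactional attributes" might mean here.

```python
import pandas as pd

# Tiny made-up sample that reuses the column names from the dataset description.
raw = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "536366", "C536367", "536368"],
    "StockCode": ["85123A", "71053", "22633", "22633", "84406B"],
    "Description": ["HEART HOLDER", "METAL LANTERN", "HAND WARMER",
                    "HAND WARMER", "COAT HANGER"],
    "Quantity": [6, 6, 2, -2, 8],
    "InvoiceDate": ["01/12/2010 08:26", "01/12/2010 08:26", "02/12/2010 09:00",
                    "03/12/2010 10:00", "05/12/2010 11:00"],
    "UnitPrice": [2.55, 3.39, 1.85, 1.85, 2.75],
    "CustomerID": [17850.0, 17850.0, 13047.0, 13047.0, None],
    "Country": ["United Kingdom"] * 5,
})


def clean_transactions(df):
    """Assumed cleaning rules: known customer, positive quantity and price."""
    out = df.dropna(subset=["CustomerID"]).copy()
    out = out[(out["Quantity"] > 0) & (out["UnitPrice"] > 0)].copy()
    # Dates are assumed to be day-first, matching the period stated above.
    out["InvoiceDate"] = pd.to_datetime(out["InvoiceDate"], format="%d/%m/%Y %H:%M")
    out["Amount"] = out["Quantity"] * out["UnitPrice"]
    return out


def customer_features(df):
    """Aggregate transactions into one row per customer (assumed feature set)."""
    snapshot = df["InvoiceDate"].max()  # reference date for recency
    return df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda s: (snapshot - s.max()).days),
        Frequency=("InvoiceNo", "nunique"),
        Monetary=("Amount", "sum"),
    )


clean = clean_transactions(raw)
features = customer_features(clean)
print(features)
```

A per-customer table like `features` is one possible input for the clustering step. Because its columns are on very different scales, they would normally be standardized first.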
Algorithms:
- Hopkins Test: Computes the Hopkins statistic to evaluate the clustering tendency of the data.
- K-Means Clustering: Used for customer segmentation based on transactional attributes.
- Hierarchical Clustering: Employed for visualizing cluster relationships.
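
As a rough illustration of how these three techniques fit together, the sketch below runs them on synthetic data rather than on the retail dataset. The `hopkins_statistic` helper is hypothetical and uses one common formulation, in which values near 0.5 suggest random data and values near 1 suggest clustered data (some references use the opposite convention). The number of clusters (3) and Ward linkage are assumptions chosen for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def hopkins_statistic(X, sample_size=50, seed=0):
    """Hopkins statistic: near 0.5 for random data, near 1 for clustered data."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = min(sample_size, n - 1)
    nn = NearestNeighbors(n_neighbors=2).fit(X)

    # w: distance from sampled real points to their nearest *other* real point
    # (column 0 is the point itself at distance 0, so take column 1).
    idx = rng.choice(n, size=m, replace=False)
    w = nn.kneighbors(X[idx], n_neighbors=2)[0][:, 1]

    # u: distance from uniform random points in the bounding box of X
    # to their nearest real point.
    uniform = rng.uniform(X.min(axis=0), X.max(axis=0), size=(m, d))
    u = nn.kneighbors(uniform, n_neighbors=1)[0][:, 0]

    return u.sum() / (u.sum() + w.sum())


# Synthetic stand-in for scaled customer features: three well-separated blobs.
rng = np.random.default_rng(42)
centers = np.array([[0, 0, 0], [8, 8, 8], [-8, 8, 0]])
X_raw = np.vstack([c + rng.normal(size=(60, 3)) for c in centers])
X = StandardScaler().fit_transform(X_raw)

# 1. Check clustering tendency before clustering.
h = hopkins_statistic(X)

# 2. K-Means segmentation (k=3 is an assumption for this toy data).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# 3. Hierarchical clustering: build the linkage tree, then cut it into 3 groups.
Z = linkage(X, method="ward")
hier_labels = fcluster(Z, t=3, criterion="maxclust")

# Compare the two segmentations (1.0 means identical up to label names).
agreement = adjusted_rand_score(kmeans.labels_, hier_labels)
print(f"Hopkins: {h:.2f}, agreement between methods: {agreement:.2f}")
```

To visualize the cluster relationships, the linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` and drawn with matplotlib. On real data the number of clusters would typically be chosen with a diagnostic such as an elbow plot or silhouette scores, not fixed in advance.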
Outcome: The project aims to provide actionable insights and recommendations to the online retail company for enhancing customer satisfaction, optimizing marketing strategies, and improving overall business performance.