Face clustering

This Python script clusters similar faces together using the DBSCAN algorithm. Faces are first detected in each image (using HOG or a CNN, via the face recognition library) and then encoded into 128-dimensional arrays. Note that HOG is only the detection method; the 128-dimensional encodings come from the library's face encoding model. Dimensionality reduction is performed on the encodings using PCA, and the reduced encodings are then clustered using DBSCAN. Finally, the script displays the faces that fall in the same cluster and plots the clusters.

Prerequisites

Before running the script, make sure you have the following installed:

dlib==19.24.99
face-recognition==1.3.0
imutils==0.5.4
matplotlib==3.7.2
numpy==1.25.1
opencv-python==4.8.0.74
scikit-learn==1.3.0

You can install dlib and the other required packages using pip:

pip install -r requirements.txt

How to Use

Clone the repository to your local machine:

git clone https://github.com/iamadityavishnu/face-clustering.git

Navigate to the project directory:
```
cd face-clustering
```
Place your dataset of images in the dataset folder. The images should contain faces that you want to cluster. You can have images with multiple faces.
Run the script using the following command:
```
python encode_faces.py --dataset dataset --encodings <name_your_pickle_file> --detection-method hog
```
The --dataset argument takes the path to the input directory of images. The --encodings argument takes the name of the pickle file you want to create (e.g. encodings.pickle). The --detection-method argument selects the face detector and defaults to CNN. On a GPU-enabled device, CNN should give better accuracy than HOG. On a CPU-only device, HOG is the better choice because it runs much faster than CNN.
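
As a rough illustration of the three arguments described above, here is a minimal sketch of how they could be parsed with `argparse`. The function name `build_parser`, the `required` flags, and the `choices` restriction are assumptions made for this example; the actual `encode_faces.py` may be written differently.

```python
import argparse


def build_parser():
    # Hypothetical helper: mirrors the flags described in this README.
    parser = argparse.ArgumentParser(
        description="Encode faces found in a directory of images."
    )
    parser.add_argument("--dataset", required=True,
                        help="path to input directory of images")
    parser.add_argument("--encodings", required=True,
                        help="name of the pickle file to create")
    # The README states that CNN is the default detection method.
    parser.add_argument("--detection-method", default="cnn",
                        choices=["hog", "cnn"],
                        help="face detector to use")
    return parser


# Parse the same arguments as the command shown above.
args = build_parser().parse_args(
    ["--dataset", "dataset",
     "--encodings", "encodings.pickle",
     "--detection-method", "hog"]
)
print(args.dataset, args.encodings, args.detection_method)
```

Note that `argparse` exposes `--detection-method` as `args.detection_method`, with the hyphen converted to an underscore.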

To install GPU enabled dlib using conda, please follow this tutorial.

The script will perform the following steps:

In the encode_faces.py file

Face Detection: The script will use the face recognition library to detect faces in the images in the input directory passed via --dataset (the dataset folder in the command above), using either the HOG or the CNN detector.
Face Encoding: Each detected face will be encoded into a 128-dimensional array. The encodings will be saved as a pickle file.
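
The two steps above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of `encode_faces.py`: the function names and the record keys (`imagePath`, `loc`, `encoding`) are hypothetical. The `face_recognition` calls used are `load_image_file`, `face_locations`, and `face_encodings`.

```python
import pickle


def encode_image(image_path, detection_method="hog"):
    """Return one record per face found in the image (hypothetical helper)."""
    # Imported lazily so the pickle helpers below work without dlib installed.
    import face_recognition

    image = face_recognition.load_image_file(image_path)  # RGB numpy array
    # Bounding boxes as (top, right, bottom, left); model is "hog" or "cnn".
    boxes = face_recognition.face_locations(image, model=detection_method)
    # One 128-dimensional vector per bounding box.
    encodings = face_recognition.face_encodings(image, boxes)
    return [
        {"imagePath": image_path, "loc": box, "encoding": enc}
        for box, enc in zip(boxes, encodings)
    ]


def save_encodings(records, out_path):
    """Write the list of face records to a pickle file."""
    with open(out_path, "wb") as f:
        pickle.dump(records, f)


def load_encodings(path):
    """Read the list of face records back from a pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A typical use would be to call `encode_image` for every file in the dataset folder, concatenate the returned lists, and pass the result to `save_encodings`.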

In the notebook file

Dimensionality Reduction: Principal Component Analysis (PCA) will be applied to reduce the dimensionality of the face encodings.
Clustering: The reduced encodings will be clustered together using the DBSCAN algorithm.
Display: The notebook will display the faces that belong to the same cluster. Faces that DBSCAN marks as outliers are typically people who appear only incidentally in the dataset, for example in the background of group photos.
Plot Clusters: The clusters will be plotted to visualize the groupings of similar faces.
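
The reduction and clustering steps above can be sketched with scikit-learn as follows. This is a minimal example on synthetic data, not the notebook's code: the random 128-dimensional vectors stand in for real face encodings, and `n_components=2`, `eps=0.5`, and `min_samples=5` are illustrative assumptions that would need tuning on a real dataset.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for face encodings: three "people", each a tight
# group of 20 vectors around a well-separated 128-dimensional center.
centers = rng.normal(size=(3, 128))
encodings = np.vstack(
    [center + 0.01 * rng.normal(size=(20, 128)) for center in centers]
)

# Reduce the 128-dimensional encodings to 2 dimensions (assumed value).
reduced = PCA(n_components=2).fit_transform(encodings)

# Cluster the reduced encodings; DBSCAN labels outliers as -1.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(reduced)

n_clusters = len(set(labels) - {-1})
print("clusters found:", n_clusters)
```

With two components, the clusters can be plotted directly, for instance with `matplotlib.pyplot.scatter(reduced[:, 0], reduced[:, 1], c=labels)`.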

Example Output

Here's an example of what the outputs might look like:

Contributing

If you want to contribute to this project, feel free to fork the repository and submit pull requests.
