Face clustering

This Python project clusters similar faces together using the DBSCAN algorithm. Faces are first detected with the face_recognition library, using either the HOG or the CNN detector, and each detected face is then encoded into a 128-dimensional array. Dimensionality reduction is performed on the encodings using PCA, and the reduced encodings are clustered using DBSCAN. Finally, the faces in each cluster are displayed and the clusters are plotted.

Prerequisites

Before running the script, make sure you have the following installed:

dlib==19.24.99
face-recognition==1.3.0
imutils==0.5.4
matplotlib==3.7.2
numpy==1.25.1
opencv-python==4.8.0.74
scikit-learn==1.3.0

You can install dlib and the other required packages using pip:

pip install -r requirements.txt

How to Use

  1. Clone the repository to your local machine:

    git clone https://github.com/iamadityavishnu/face-clustering.git
    
  2. Navigate to the project directory:

    cd face-clustering
    
  3. Place your dataset of images in the dataset folder. The images should contain faces that you want to cluster. You can have images with multiple faces.

  4. Run the script using the following command:

    python encode_faces.py --dataset dataset --encodings <name_your_pickle_file> --detection-method hog
    

    The --dataset argument takes the path to the input directory of images. Pass the name of the pickle file you want to create in the --encodings argument (e.g. encodings.pickle). The --detection-method argument uses CNN for face detection by default. On a GPU-enabled device, CNN should give better accuracy than HOG. Without a GPU, HOG is the better choice because it runs much faster than CNN on a CPU-only device.

    To install GPU enabled dlib using conda, please follow this tutorial.
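The three command-line arguments described above could be declared roughly as follows. This is a minimal sketch using the standard argparse module, not the actual contents of encode_faces.py: the function name build_parser, the help strings, and the choice to make --dataset and --encodings required are assumptions. Only the flag names and the CNN default come from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the CLI described in this README."""
    parser = argparse.ArgumentParser(
        description="Detect and encode faces found in a folder of images."
    )
    parser.add_argument(
        "--dataset", required=True,
        help="path to the input directory of images",
    )
    parser.add_argument(
        "--encodings", required=True,
        help="name of the pickle file to create, e.g. encodings.pickle",
    )
    parser.add_argument(
        "--detection-method", default="cnn", choices=["hog", "cnn"],
        help="face detector to use; hog is faster on CPU-only machines",
    )
    return parser


# Parse the same arguments as the command shown in step 4.
example_args = build_parser().parse_args(
    ["--dataset", "dataset", "--encodings", "encodings.pickle",
     "--detection-method", "hog"]
)
print(example_args.dataset, example_args.encodings, example_args.detection_method)
```

argparse converts the dash in --detection-method to an underscore, so the value is read as example_args.detection_method.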

The script will perform the following steps:

In the encode_faces.py file

  1. Face Detection: The script will use the face_recognition library to detect faces in the images in the folder passed to --dataset (the dataset folder in the command above).

  2. Face Encoding: Each detected face will be encoded into a 128-dimensional array by the face_recognition library. Histogram of Oriented Gradients (HOG) and CNN are the two options for the detection step, not for the encoding. The encodings will be saved as a pickle file.
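The two steps above can be sketched with the face_recognition library as shown below. This is an illustration under stated assumptions rather than the repository's own code: the function names encode_faces, save_records and load_records, and the record keys image_path, loc and encoding, are hypothetical. The library calls load_image_file, face_locations and face_encodings are part of face_recognition.

```python
import pickle


def encode_faces(image_paths, detection_method="hog"):
    """Return one record per detected face: source path, box, 128-d encoding."""
    # Imported here so the helpers below work even where dlib is not installed.
    import face_recognition

    records = []
    for path in image_paths:
        image = face_recognition.load_image_file(str(path))  # RGB numpy array
        # model is "hog" or "cnn", matching the --detection-method argument.
        boxes = face_recognition.face_locations(image, model=detection_method)
        # One 128-dimensional vector per bounding box; an image may hold several faces.
        encodings = face_recognition.face_encodings(image, boxes)
        for box, encoding in zip(boxes, encodings):
            records.append(
                {"image_path": str(path), "loc": box, "encoding": encoding}
            )
    return records


def save_records(records, pickle_path):
    """Write the list of face records to a pickle file."""
    with open(pickle_path, "wb") as f:
        pickle.dump(records, f)


def load_records(pickle_path):
    """Read the list of face records back from a pickle file."""
    with open(pickle_path, "rb") as f:
        return pickle.load(f)
```

Keeping the image path and bounding box next to each encoding makes it possible to crop and display the matching face once cluster labels are known.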


In the notebook file

  1. Dimensionality Reduction: Principal Component Analysis (PCA) will be applied to reduce the dimensionality of the face encodings.

  2. Clustering: The reduced encodings will be clustered together using the DBSCAN algorithm.

  3. Display: The notebook will display the faces that belong to the same cluster. Faces that DBSCAN marks as outliers are the random people who appeared in group photos in the dataset.

  4. Plot Clusters: The clusters will be plotted to visualize the groupings of similar faces.
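The notebook steps above can be sketched with scikit-learn as follows. This is a minimal example on synthetic data, not the notebook's actual code: the function names cluster_encodings and group_by_label are hypothetical, and the values n_components=2, eps=0.5 and min_samples=3 are assumptions chosen for the synthetic blobs. Real face encodings will need their own tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA


def cluster_encodings(encodings, n_components=2, eps=0.5, min_samples=3):
    """Reduce encodings with PCA, then label them with DBSCAN."""
    reduced = PCA(n_components=n_components).fit_transform(encodings)
    # DBSCAN assigns the label -1 to points it treats as outliers (noise).
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(reduced)
    return reduced, labels


def group_by_label(labels):
    """Map each cluster label to the indices of the faces assigned to it."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(int(label), []).append(index)
    return groups


# Synthetic stand-in for real face encodings: two tight blobs in 128 dimensions.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.01, size=(10, 128))
blob_b = rng.normal(loc=1.0, scale=0.01, size=(10, 128))
encodings = np.vstack([blob_a, blob_b])

reduced, labels = cluster_encodings(encodings)
groups = group_by_label(labels)
for label, indices in sorted(groups.items()):
    print(label, indices)
```

The indices in each group can be used to look up the corresponding face crops for display, and the two columns of reduced can be passed to a matplotlib scatter plot with labels as the color argument to plot the clusters.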

Example Output

Here's an example of what the outputs might look like:

[Images: two example outputs]

Contributing

If you want to contribute to this project, feel free to fork the repository and submit pull requests.
