samoed / autoannotator

This project forked from catswhotrain/autoannotator

A Python library that enables automatic image annotation.

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

Auto Annotator

An extendable tool for automatic annotation of image data by a combination of deep neural networks.

This annotator prioritizes the accuracy and quality of predictions over speed. It is designed to be more precise than most publicly available tools and relies on ensembles of deep neural models to achieve that quality. Note that neural networks trained on clean datasets tend to yield better results than those trained on larger but noisier datasets.

Supported tasks

  • Face and landmarks detection
  • Face alignment via keypoints
  • Face descriptor extraction
  • Clustering
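
To illustrate what "face alignment via keypoints" means, here is a minimal sketch that is not taken from the autoannotator code base. It assumes only the two eye keypoints are used and that the target eye positions (38, 52) and (74, 52) in a 112x112 crop are a hypothetical template; a real pipeline would more likely fit a transform to all detected landmarks. The function names are illustrative, not part of the library's API.

```python
import cmath
import math


def similarity_from_two_points(src_a, src_b, dst_a, dst_b):
    """Fit z' = m * z + t (rotation, uniform scale, translation).

    Points are (x, y) tuples treated as complex numbers, so a single
    complex multiplier m encodes both rotation and scale.
    """
    p1, p2 = complex(*src_a), complex(*src_b)
    q1, q2 = complex(*dst_a), complex(*dst_b)
    if p1 == p2:
        raise ValueError("source keypoints must be distinct")
    m = (q2 - q1) / (p2 - p1)  # rotation + uniform scale
    t = q1 - m * p1            # translation
    return m, t


def apply_transform(m, t, point):
    """Map one (x, y) point through the fitted transform."""
    z = m * complex(*point) + t
    return (z.real, z.imag)


def scale_and_angle(m):
    """Return the scale factor and the rotation angle in degrees."""
    return abs(m), math.degrees(cmath.phase(m))


if __name__ == "__main__":
    # Hypothetical detections: a tilted face in the source image.
    left_eye, right_eye = (120.0, 160.0), (180.0, 140.0)
    # Hypothetical template positions in the aligned crop.
    m, t = similarity_from_two_points(left_eye, right_eye, (38.0, 52.0), (74.0, 52.0))
    print(apply_transform(m, t, left_eye))
    print(apply_transform(m, t, right_eye))
    print(scale_and_angle(m))
```

The same `m` and `t` can be applied to every pixel coordinate (or converted to a 2x3 affine matrix) to warp the face into the canonical crop before descriptor extraction.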

📊 Benchmarks

Speed

Task                                    | Hardware        | Time, s
Face detection + landmarks + extraction | Xeon E5-2678 v3 | ~1

๐Ÿ— Installation

PIP package

pip install autoannotator

🚲 Getting started

Face recognition example

Check out our demo face recognition pipeline at: examples/face_recognition_example.py

[Optional] Run frontend and backend

git clone https://github.com/CatsWhoTrain/autoannotator_client
cd autoannotator_client
docker compose up

The web interface is then available locally at: http://localhost:8080/

FAQ

Do companies and engineers actually need this tool?

We have asked engineers in the field of video analytics whether they are interested in such a library. Their responses were:

  • IREX: would use this library and contribute to it.
  • NapoleonIT: would use this library and contribute to it.
  • ITMO.Lens: would use this library.

What are the reasons for choosing this data labeling tool over the alternative of employing human annotators?

Human accuracy is not so good

A long time ago, Andrej Karpathy observed that his accuracy was only 94% when he tried to label just 400 images of the CIFAR-10 dataset, while the SOTA method "Efficient adaptive ensembling for image classification" (August 29, 2023) achieves >99.6% accuracy. When expert labelers had to choose from ~100 labels while annotating ImageNet, their error rate increased to 13-15%.

Andrej's error rate was determined to be 5.1%, and he initially invested approximately one minute in labeling a single image. By contrast, Florence or newer models can deliver a top-5 error rate of less than 1% on the same task.

Industry case: human face classification.

An undisclosed company (the details are covered by a non-disclosure agreement) pre-processes face images captured under challenging environmental conditions. The pre-processing applies a face recognition network and the DBSCAN algorithm to group the images by individual. Human annotators then validated the pre-processed data, and their team leader inspected the annotators' work. Ultimately, an ML engineer determined that 1.4% of the clustered face images were mislabeled.
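
The clustering step described above can be sketched as follows. This is a dependency-free illustration of DBSCAN over face descriptors, not the company's pipeline and not autoannotator's implementation. The toy 3-dimensional "embeddings", the cosine distance, and the `eps` and `min_samples` values are all assumptions chosen for the example; in practice one would probably use a library implementation such as scikit-learn's `DBSCAN` on real descriptors.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dbscan(points, eps, min_samples, distance=cosine_distance):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query; includes the point itself.
        return [j for j in range(len(points))
                if distance(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = noise  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


if __name__ == "__main__":
    # Toy descriptors: three images of person A, three of person B, one outlier.
    embeddings = [
        (1.0, 0.0, 0.1), (0.9, 0.1, 0.0), (1.0, 0.05, 0.05),
        (0.0, 1.0, 0.1), (0.1, 0.9, 0.0), (0.05, 1.0, 0.05),
        (0.0, 0.0, 1.0),
    ]
    print(dbscan(embeddings, eps=0.1, min_samples=2))
```

Images labeled -1 are the ones DBSCAN could not assign to any individual, which makes them natural candidates for human review.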

๐Ÿฐ Legacy

The current repository takes ideas and some code from the following projects:

✒ Legal

All individuals depicted in the "assets" images have provided explicit consent to utilize their photos for the purpose of showcasing the current library's work. Kindly refrain from utilizing these images within your individual projects.

Contributors

filonenkoa, catswhotrain, ponjoru, samoed
