varun-suresh / clustering

Implements "Clustering a Million Faces by Identity"

License: MIT License

Python 93.95% HTML 6.05%
clustering clustering-algorithm high-dimensional face-verification-experiment python3

clustering's People

Contributors

dependabot[bot], gmy001, varun-suresh

clustering's Issues

running error

Hi Varun,
I tested your code and got the following:
~/Clustering$ python demo.py --lfw_path ./lfw -v LightenedCNN_A_lfw.mat

Distance calculation time : 52.4087600708
N Clusters: 6854, thresh: 1.1
No of clusters: 6854
Threshold : 1.1
Traceback (most recent call last):
  File "demo.py", line 113, in <module>
    f1_score = evaluate_clusters(clusters['clusters'], labels_lookup)
  File "demo.py", line 64, in evaluate_clusters
    precision, recall = calculate_pairwise_pr(clusters, labels_lookup)
  File "/home/king/Clustering/evaluation.py", line 30, in calculate_pairwise_pr
    cp, tp = count_correct_pairs(cluster, labels_lookup)
  File "/home/king/Clustering/evaluation.py", line 16, in count_correct_pairs
    if labels_lookup[f1] == labels_lookup[f2]:
KeyError: 8944
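
The KeyError means that index 8944 appears in a cluster but has no entry in labels_lookup. A minimal diagnostic sketch, assuming clusters is a list of lists of integer ids and labels_lookup is a dict (the helper name find_unlabeled and the sample data are hypothetical, not part of the repository):

```python
def find_unlabeled(clusters, labels_lookup):
    """Return the sorted ids that occur in some cluster but have no label."""
    missing = {
        idx
        for cluster in clusters
        for idx in cluster
        if idx not in labels_lookup
    }
    return sorted(missing)


# Hypothetical sample data that reproduces the shape of the failure.
clusters = [[0, 1], [2, 8944]]
labels_lookup = {0: "person_a", 1: "person_a", 2: "person_b"}

missing = find_unlabeled(clusters, labels_lookup)
print(missing)
```

If the list is non-empty for the A features, the feature file and the label list probably cover different sets of images.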

But when I tried B and C, it worked. Here is the result:

~/Clustering$ python demo.py --lfw_path ./lfw -v LightenedCNN_C_lfw.mat
Distance calculation time : 51.666009903
N Clusters: 6987, thresh: 1.1
No of clusters: 6987
Threshold : 1.1
Correct Pairs that are in the same cluster:196778
Total pairs as per the clusters created: 204500
Total possible true pairs:242225.0
Precision : 0.962239608802
Recall : 0.812376922283
f1_score : 0.880980468969
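
The reported scores are consistent with pairwise precision, recall, and F1 computed from the three pair counts in the log. A minimal sketch of that arithmetic (the function name pairwise_scores is hypothetical; this is not necessarily how evaluation.py computes it):

```python
def pairwise_scores(correct_pairs, cluster_pairs, true_pairs):
    """Compute pairwise precision, recall, and F1 from pair counts."""
    # Fraction of same-cluster pairs that really share an identity.
    precision = correct_pairs / cluster_pairs
    # Fraction of true same-identity pairs that ended up in one cluster.
    recall = correct_pairs / true_pairs
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the log output above.
precision, recall, f1 = pairwise_scores(196778, 204500, 242225)
print(precision, recall, f1)
```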

Another question is about the visualization.
When I tried json.load(), I got:

  File "visualize.py", line 72, in <module>
    clusters = json.load(args.clusters)
  File "/usr/lib/python2.7/json/__init__.py", line 286, in load
    return loads(fp.read(),
AttributeError: 'str' object has no attribute 'read'

When I tried json.loads(), I got:

  File "visualize.py", line 72, in <module>
    clusters = json.loads(args.clusters)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")

Please give me some tips. Thank you!
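
Both tracebacks are what you would expect if args.clusters holds a file path as a string: json.load() needs an open file object, and json.loads() needs a string that already contains JSON text. A minimal sketch of reading the file first (the helper name load_clusters and the sample file contents are hypothetical; the real clusters file may have a different structure):

```python
import json
import os
import tempfile


def load_clusters(path):
    """Open the JSON file at `path` and return the decoded data."""
    with open(path, "r") as fh:
        return json.load(fh)


# Round trip with a throwaway file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "clusters.json")
    with open(demo_path, "w") as fh:
        json.dump([[0, 1], [2]], fh)
    loaded = load_clusters(demo_path)

print(loaded)
```

In visualize.py this would correspond to passing args.clusters to open() before decoding, assuming the argument is indeed a path.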

[Security] Path Traversal Vulnerability found

A path traversal attack (also known as directory traversal) aims to access files and directories that are stored outside the web root folder. By manipulating variables that reference files with “dot-dot-slash (../)” sequences and their variations, or by using absolute file paths, it may be possible to access arbitrary files and directories stored on the file system, including application source code, configuration, and critical system files. Note that access to files is still limited by the operating system's access control (such as in the case of locked or in-use files on Microsoft Windows).

This attack is also known as “dot-dot-slash”, “directory traversal”, “directory climbing” and “backtracking”.

Root Cause Analysis

In this case, the path traversal vulnerability is caused by incorrect use of Flask's send_from_directory call in the following code:

Clustering/visualize.py

Lines 33 to 36 in 53e663e

@app.route('/img/<path:fpath>', methods=['GET'])
def get_img_path(fpath):
    print(os.path.dirname(fpath), fpath.split('/')[-1])
    return send_from_directory(os.path.dirname(fpath), fpath.split('/')[-1])

Here, the fpath parameter is attacker-controlled, so the attacker also controls both the directory and the filename passed to send_from_directory, which leads to a path traversal attack.

Proof of Concept

The bug can be verified with a proof of concept similar to the one below.

curl -i --path-as-is -s -k -X $'GET' \
    -H $'Host: 0.0.0.0:5000' -H $'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0' -H $'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H $'Accept-Language: en-US,en;q=0.5' -H $'Accept-Encoding: gzip, deflate' -H $'Connection: close' -H $'Upgrade-Insecure-Requests: 1' \
    $'http://0.0.0.0:5000/img/../../../../../../etc/passwd'

Remediation

This can be fixed by restricting the file and path parameters to a fixed whitelist of allowed values.
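
One way to restrict the path, sketched here under assumptions, is to resolve every requested path against a single fixed image root and reject anything that resolves outside it. The names IMAGE_ROOT and resolve_inside are hypothetical, and this is a containment check, not the project's own fix or a literal whitelist:

```python
import os

# Hypothetical directory that is allowed to be served.
IMAGE_ROOT = "/srv/images"


def resolve_inside(base_dir, user_path):
    """Return the absolute path for user_path, or raise if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # os.path.join discards base when user_path is absolute, so absolute
    # paths are caught by the same containment check below.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("requested path is outside the image directory")
    return candidate


# A normal request stays under the root.
safe = resolve_inside(IMAGE_ROOT, "lfw/Some_Person/0001.jpg")
print(safe)

# The proof-of-concept path is rejected.
try:
    resolve_inside(IMAGE_ROOT, "../../../../../../etc/passwd")
except ValueError as err:
    print("rejected:", err)
```

The route handler could call such a check on fpath and return a 404 when it raises, instead of deriving the directory from the request itself.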

CVSS 3 Score

CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H

This bug was found using CodeQL by GitHub.

New data clustering

Does it support clustering new data based on a model that has already converged?

Precision drop?

Hi Varun,

Were you able to figure out why your implementation has such a large drop in precision? Thanks, and I look forward to your reply.

Best, Vivek

Usage

Could this work on a dir of unlabeled faces?

run error

When I run clustering.py, I get this error:
Traceback (most recent call last):
  File "clustering.py", line 4, in <module>
    import pyflann
  File "/Users/youyin/anaconda3/lib/python3.6/site-packages/pyflann/__init__.py", line 27, in <module>
    from index import *
ModuleNotFoundError: No module named 'index'

My Python version is 3.6.4.
Do you have any suggestions? Thank you.
