zhaoj9014 / face.evolve

🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥

License: MIT License

Python 100.00%
pytorch face-recognition face-detection face-alignment face-landmark-detection model-training feature-extraction fine-tuning data-augmentation deep-learning

face.evolve's Issues

Question about retraining the model

Thank you very much for your high-performance repo!
I want to improve the recognition rate on my own dataset, so I used the released "backbone_ir50_asia.pth" as the pretrained backbone and retrained it, updating all of the backbone's parameters, but the validation accuracy on CALFW is only 56% even after 50 epochs. So I have decided to retrain only the fc1 and fc2 parameters of the backbone; do you think that will help?
Could you give me some advice on how to improve the model's recognition rate on real-world scenes, and on how to fine-tune the pretrained backbone model?
Thanks, and best wishes!
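
For what it's worth, a minimal sketch of the fc-only idea: load the released weights, freeze everything, and leave only the final output block trainable. The module filter below ("output_layer") matches the IR_50 definition in this repo as far as I can tell, but verify it against print(backbone) before relying on it:

import torch
from backbone.model_irse import IR_50   # module path as in this repo

backbone = IR_50(input_size=[112, 112])
backbone.load_state_dict(torch.load("backbone_ir50_asia.pth", map_location="cpu"))

# Freeze everything, then re-enable gradients only for the final output block.
for name, param in backbone.named_parameters():
    param.requires_grad = name.startswith("output_layer")

optimizer = torch.optim.SGD(
    (p for p in backbone.parameters() if p.requires_grad),
    lr=1e-3, momentum=0.9, weight_decay=5e-4)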

About the model of MTCNN

Hello,
Thanks for your great work!!! My question is: what is the difference between the MTCNN model provided in this repo and the original model released by the author? Did you retrain this model on your own dataset? Thanks!

How to disable this user warning?

When running the detector to display detection results, there is a warning:

/align/get_nets.py:70: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  a = F.softmax(a)
detector.py:82: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  img_boxes = Variable(torch.FloatTensor(img_boxes), volatile = True)
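
Both warnings state the fix themselves; a self-contained sketch of the two changes (dim=1 below is an assumption that the class scores live in the channel dimension of the net's 4-D output, which is how the MTCNN heads are usually laid out):

import torch
import torch.nn.functional as F

scores = torch.randn(8, 2, 5, 5)        # dummy PNet-style output map

# get_nets.py: give softmax an explicit dimension instead of the implicit default
probs = F.softmax(scores, dim=1)

# detector.py: `volatile` is gone; wrap inference in no_grad and drop Variable
candidates = torch.randn(4, 3, 24, 24).numpy()   # dummy candidate crops
with torch.no_grad():
    img_boxes = torch.FloatTensor(candidates)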

Failed building wheel for bcolz

The build fails with:
'''
gcc -pthread -B /data/software/anaconda3/envs/evoLVe/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -DHAVE_ZSTD=1 -Ibcolz -Ic-blosc/blosc -Ic-blosc/internal-complibs/zstd-1.3.4 -Ic-blosc/internal-complibs/lz4-1.8.1.2 -Ic-blosc/internal-complibs/snappy-1.1.1 -Ic-blosc/internal-complibs/zlib-1.2.8 -Ic-blosc/internal-complibs/zstd-1.3.4/compress -Ic-blosc/internal-complibs/zstd-1.3.4/dictBuilder -Ic-blosc/internal-complibs/zstd-1.3.4/decompress -Ic-blosc/internal-complibs/zstd-1.3.4/legacy -Ic-blosc/internal-complibs/zstd-1.3.4/common -Ic-blosc/internal-complibs/zstd-1.3.4/dll -Ic-blosc/internal-complibs/zstd-1.3.4/deprecated -I/data/software/anaconda3/envs/evoLVe/lib/python3.7/site-packages/numpy/core/include -I/data/software/anaconda3/envs/evoLVe/include/python3.7m -c c-blosc/internal-complibs/snappy-1.1.1/snappy-stubs-internal.cc -o build/temp.linux-x86_64-3.7/c-blosc/internal-complibs/snappy-1.1.1/snappy-stubs-internal.o -DSHUFFLE_SSE2_ENABLED -msse2 -DSHUFFLE_AVX2_ENABLED -mavx2
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
c-blosc/internal-complibs/snappy-1.1.1/snappy-stubs-internal.cc:29:21: fatal error: algorithm: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1


Failed building wheel for bcolz
Running setup.py clean for bcolz
'''

How do I fix it?

Detail about the folder structure

Hi

What should the training data folder structure be?

Should it be $rootfolder/data/train, $rootfolder/data/val, $rootfolder/data/test,

or $rootfolder/data/imgs?

Thanks in advance.
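
As far as I can tell, the training script builds its dataset with torchvision's ImageFolder rooted at an "imgs" directory, so the layout it wants is one subfolder per identity rather than train/val/test splits; treat the exact root name as an assumption and check train.py/config.py. A sketch:

import os
from torchvision import datasets, transforms

# Assumed layout (one folder per identity):
#   $DATA_ROOT/imgs/id_0001/a.jpg
#   $DATA_ROOT/imgs/id_0001/b.jpg
#   $DATA_ROOT/imgs/id_0002/c.jpg
DATA_ROOT = "./data"

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

dataset_train = datasets.ImageFolder(os.path.join(DATA_ROOT, "imgs"), train_transform)
print(len(dataset_train.classes), "identities,", len(dataset_train), "images")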

Where are the MTCNN parameters from?

Thanks for your work.
Did you train the MTCNN parameters p/o/rnet.npy yourself, or did you take them from somewhere else? I'd like to know how they were trained.

Question about MegaFace results

Hi~

Thank you for your great work!
We would like to know whether the validation results (or code) for MegaFace will be reported.

Thank you.

I am trying to extract faces and train

Hello,
I have live video feeds streaming from stores, and I want to automatically extract each person's face and save it.
Currently I am doing this manually, cropping each face and saving it in a folder with a unique ID.
I tried OpenCV's model, but it was not accurate; then I used the DNN face detector and got decent accuracy, but I sometimes get blurry faces, maybe because people move fast or because of camera issues. Does anyone know how I can save each detected face with a unique ID inside its own folder? The script is given below. I have searched a lot but did not find anything that automatically extracts faces from video.

I am using a git project that uses KNN, but it is not scalable.
What I am trying to do is extract faces and keep them in per-ID folders, with each face assigned a unique ID.

My questions are:

  1. Is there any way to solve the above problem?
  2. How many images of a person's face do I need to train the model using your library?
  3. Do I need to label my data, e.g. with bounding boxes, or is keeping the photos in separate folders enough (assuming my first problem is solved)?

DNN code:
# USAGE
# python object_tracker.py --prototxt deploy.prototxt --model res10_300x300_ssd_iter_140000.caffemodel

# import the necessary packages
from data.centroidtracker import CentroidTracker
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
	help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# initialize our centroid tracker and frame dimensions
ct = CentroidTracker()
(H, W) = (None, None)

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# initialize the video stream and allow the camera sensor to warmup
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)

# loop over the frames from the video stream
while True:
	# read the next frame from the video stream and resize it
	frame = vs.read()
	frame = imutils.resize(frame, width=400)

	# if the frame dimensions are None, grab them
	if W is None or H is None:
		(H, W) = frame.shape[:2]

	# construct a blob from the frame, pass it through the network,
	# obtain our output predictions, and initialize the list of
	# bounding box rectangles
	blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H),
		(104.0, 177.0, 123.0))
	net.setInput(blob)
	detections = net.forward()
	rects = []

	# loop over the detections
	for i in range(0, detections.shape[2]):
		# filter out weak detections by ensuring the predicted
		# probability is greater than a minimum threshold
		if detections[0, 0, i, 2] > args["confidence"]:
			# compute the (x, y)-coordinates of the bounding box for
			# the object, then update the bounding box rectangles list
			box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
			rects.append(box.astype("int"))

			# draw a bounding box surrounding the object so we can
			# visualize it
			(startX, startY, endX, endY) = box.astype("int")
			cv2.rectangle(frame, (startX, startY), (endX, endY),
				(0, 255, 0), 2)

	# update our centroid tracker using the computed set of bounding
	# box rectangles
	objects = ct.update(rects)

	# loop over the tracked objects
	for (objectID, centroid) in objects.items():
		# draw both the ID of the object and the centroid of the
		# object on the output frame
		text = "ID {}".format(objectID)
		cv2.putText(frame, text, (centroid[0] - 10, centroid[1] - 10),
			cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
		cv2.circle(frame, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)

	# show the output frame
	cv2.imshow("Frame", frame)
	key = cv2.waitKey(1) & 0xFF

	# if the `q` key was pressed, break from the loop
	if key == ord("q"):
		break

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
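
To keep each person's crops in their own folder, one option (a sketch under my own assumptions, not part of this repo) is a small helper that files each detection under the tracker's object ID; associating a tracked centroid back to its bounding box is left to the caller, since the stock CentroidTracker only returns centroids:

import os
import cv2

def save_face_crop(frame, box, object_id, out_root="faces"):
    # Crop one detection and store it under a per-ID folder, e.g. faces/7/000012.jpg.
    start_x, start_y, end_x, end_y = [max(0, int(v)) for v in box]
    crop = frame[start_y:end_y, start_x:end_x]
    if crop.size == 0:          # skip degenerate boxes
        return None
    id_dir = os.path.join(out_root, str(object_id))
    os.makedirs(id_dir, exist_ok=True)
    path = os.path.join(id_dir, "{:06d}.jpg".format(len(os.listdir(id_dir))))
    cv2.imwrite(path, crop)
    return path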

Inference code

Hi

Thanks for sharing your work,

I don't have the resources to train any models; is there inference-only code available so I can test on my own images?

Thanks

Performance Issue

https://github.com/ZhaoJ9014/face.evoLVe.PyTorch/blob/master/align/detector.py#L24

# LOAD MODELS
pnet = PNet()
rnet = RNet()
onet = ONet()
onet.eval()

The models are loaded inside detect_faces.

So in face_align.py
for subfolder in tqdm(os.listdir(source_root)):
    if not os.path.isdir(os.path.join(dest_root, subfolder)):
        os.mkdir(os.path.join(dest_root, subfolder))
    for image_name in os.listdir(os.path.join(source_root, subfolder)):
        print("Processing\t{}".format(os.path.join(source_root, subfolder, image_name)))
        img = Image.open(os.path.join(source_root, subfolder, image_name))
        try: # Handle exception
            _, landmarks = detect_faces(img)

So the models are loaded again for every image.
They should be moved outside of the function.
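
One way to avoid the repeated loading (a sketch of the suggestion above, not the repo's current API) is to build the three nets exactly once and reuse them, for example via a cached constructor that detect_faces could call instead of instantiating PNet/RNet/ONet itself:

from functools import lru_cache
from align.get_nets import PNet, RNet, ONet

@lru_cache(maxsize=1)
def get_mtcnn_nets():
    # Construct the three MTCNN stages only once, however often this is called.
    pnet, rnet, onet = PNet(), RNet(), ONet()
    for net in (pnet, rnet, onet):
        net.eval()
    return pnet, rnet, onet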

REFERENCE_FACIAL_POINTS

I noticed that you define REFERENCE_FACIAL_POINTS in align_trans.py. Could you please tell me how you calculated its values? If I want to use more landmarks (detected by other models) for face alignment, how can I compute those reference points myself?
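
One common way to derive such a template (not necessarily how this repo did it) is to map a set of well-detected frontal faces into the target crop size and average each landmark's position; the mean positions then serve as the reference points for any landmark count. A minimal sketch:

import numpy as np

def mean_reference_points(landmark_sets):
    # landmark_sets: (num_faces, num_landmarks, 2) array of landmark coordinates,
    # each face already mapped into the target crop (e.g. 112x112).
    landmark_sets = np.asarray(landmark_sets, dtype=np.float32)
    return landmark_sets.mean(axis=0)   # (num_landmarks, 2) reference template

# Hypothetical usage with a 68-point detector:
# ref_68 = mean_reference_points(landmarks_68pt_in_112x112_crops)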

extract_feature_v1.py: I think l2_norm is missing on line 75

Hi.

while idx + batch_size <= len(loader.dataset):
    batch, _ = iter(loader).next()
    if tta:
        fliped = hflip_batch(batch)
        emb_batch = backbone(batch.to(device)).cpu() + backbone(fliped.to(device)).cpu()
        features[idx:idx + batch_size] = l2_norm(emb_batch)
    else:
        features[idx:idx + batch_size] = backbone(batch.to(device)).cpu()  # line 75
    idx += batch_size

if idx < len(loader.dataset):
    batch, _ = iter(loader).next()
    if tta:
        fliped = hflip_batch(batch)
        emb_batch = backbone(batch.to(device)).cpu() + backbone(fliped.to(device)).cpu()
        features[idx:] = l2_norm(emb_batch)
    else:
        features[idx:] = l2_norm(backbone(batch.to(device)).cpu())

Line 75 seems to be missing the l2_norm that the other three cases have.

I would also appreciate an explanation of why l2_norm is necessary. If the aim is to search a database for who a person is, normalizing doesn't seem like a good idea.

Thanks.
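
On the l2_norm question: backbones trained with angular-margin heads (ArcFace and the like) encode identity in the direction of the embedding rather than its magnitude, so after L2 normalization the Euclidean distance is a monotone function of cosine similarity and nothing needed for identification is lost. A quick numeric check of that identity:

import torch
import torch.nn.functional as F

a, b = torch.randn(512), torch.randn(512)
a_n, b_n = F.normalize(a, dim=0), F.normalize(b, dim=0)

cos = torch.dot(a_n, b_n)              # cosine similarity
dist_sq = torch.sum((a_n - b_n) ** 2)  # squared Euclidean distance on the unit sphere

# ||a_n - b_n||^2 == 2 - 2 * cos, so ranking by distance equals ranking by cosine
print(torch.allclose(dist_sq, 2 - 2 * cos, atol=1e-5))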

Training difficulties

Hi,

I am training on CASIA-clean + align, using the IR_SE_50 backbone and the ArcFace head. Somehow the network just isn't learning well: the loss sits around 24 for the first few epochs, and even after the stage-1 learning-rate adjustment it drops only to around 19.
I see that during your training the loss starts around 10. Is there anything I am doing wrong or missing?

Thanks

Why not MxNet

@ZhaoJ9014
First, thank you for NUS-Panasonic's great work!
Regarding a dynamic-computation-graph approach in TensorFlow, just try Eager Execution and TensorFlow Fold :)
A similar question to #14:
Could you explain in detail why you did not use the MXNet framework (the ArcFace source code)?
Thanks.

Face alignment speed up and GPU usage

To whom it may concern,

This repo provides really amazing tools. Thanks for the great work.
I tried face alignment and feature extraction with this library and found that face alignment can take 1.3 s per image. After reading the code, I realized that MTCNN is not running on the GPU, so I made a few small changes, e.g., torch.FloatTensor => torch.cuda.FloatTensor, PNet() => PNet().cuda(), etc.

This reduced the face alignment time per image from 1.3 s to 0.8 s. It works, but the result still doesn't satisfy me. Is there a way to make face detection/alignment run faster?

Another thing confuses me: the GPU usage is very low, 1%-2%. Please see the attachments.

(screenshot: 2019-02-26, 12:22:32)

I'm not sure whether this is because I didn't configure the GPU properly or whether it is simply how this library behaves.
The installed CUDA version is 9.2, the cuDNN version is 7.4, and the graphics card is an RTX 2070. An error is reported after I run the Python code. Can anyone tell me how to fix it?
(screenshot: 2019-02-26, 12:23:37)

Again, many thanks for the great work!

training loss nan

Thanks for sharing this fantastic repository!
But when I train on my own dataset, the training loss becomes NaN after a few epochs.
Could you tell me how to solve this problem?

Cannot unzip ms1m_align_112.zip

Hello!

I downloaded the file ms1m_align_112.zip from this Google Drive link. However, I am getting the following strange error while extracting the data.

   creating: imgs/67619/
  inflating: imgs/67619/4636064.jpg  
  inflating: imgs/67619/4636050.jpg  
  inflating: imgs/67619/4636028.jpg  
  inflating: imgs/67619/4636004.jpg  
  inflating: imgs/67619/4635977.jpg  
  inflating: imgs/67619/4635994.jpg  
  inflating: imgs/67619/4636076.jpg  
  inflating: imgs/67619/4635998.jpg  
  inflating: imgs/67619/4635981.jpg  
  inflating: imgs/67619/4636018.jpg  
  inflating: imgs/67619/4636027.jpg  
  inflating: imgs/67619/4636043.jpg  
imgs:  mismatching "local" filename (imgs/67619/4636066.jpg),
         continuing with "central" filename version
replace imgs? [y]es, [n]o, [A]ll, [N]one, [r]ename:

Can you kindly look into it?

Regards!

General Question

I want to perform face verification. Correct me if I'm wrong, but most recognition models are trained on a million images or so.

I was thinking: if I combined most of the open-source datasets, such as 1millionceleb, faces_emore, LFW, etc., and trained the network on that, would it give me a better face embedding?

What do you think?

Thanks in advance.

Asking for meta and sizes

I am trying to train the network on LFW.

My folder structure looks like:

$RootFolder/data/train, $RootFolder/data/val, $RootFolder/data/test

but when I run train.py, I get:

FileNotFoundError: [Errno 2] No such file or directory: '/media/ryan/shakira/face.evoLVe.PyTorch/data/lfw/meta/sizes'

Am I missing something?

Thanks in advance.

Cannot unzip ms1m_align_112.zip

Hi~

We downloaded the dataset from Google Drive.
But we cannot unzip the file ms1m_align_112.zip (>25GB).

Could you check this issue?
If possible, could you provide the md5 checksum of the dataset (or of all the datasets)?
Thank you.

Problem with dimension when trying to extract features.

Hi.

I get this error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 3, 3], but got 3-dimensional input of size [3, 112, 112] instead

in this line:
features = l2_norm(backbone(images.to(device)).cpu())
When running this code:

def get_transform(input_size = [112, 112], rgb_mean = [0.5, 0.5, 0.5], rgb_std = [0.5, 0.5, 0.5]): 
  transform = transforms.Compose([
  transforms.Resize([int(128 * input_size[0] / 112), int(128 * input_size[0] / 112)]), # smaller side resized
  transforms.CenterCrop([input_size[0], input_size[1]]),
  transforms.ToTensor(),
  transforms.Normalize(mean = rgb_mean, std = rgb_std)])
  return transform

def extract_feature(images, backbone, embedding_size = 512, 
                    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")):
  batch_size = len(images)
  features = None
  print("backbone:", backbone)
  with torch.no_grad():
    features = l2_norm(backbone(images.to(device)).cpu())
  return features

if __name__ == "__main__":
  cap = cv2.VideoCapture(0)
  backbone = load_backbone(IR_50(input_size = [112, 112]), './backbone_ir50_ms1m_epoch120.pth')
  transform = get_transform(input_size = [112, 112])
  while(True):
    ret, frame = cap.read()
    #cv_image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) #cv2.COLOR_BGR2GRAY)
    pil_image = Image.fromarray(frame)
    bounding_boxes, landmarks = detect_faces(pil_image)
    faces = []
    for box in bounding_boxes:
      face = pil_image.crop( ( box[0], box[1], box[2] , box[3]))
      face = transform(face)
      faces.append(face.float())
    if len(faces) > 0:
      I_ = torch.cat(faces, 0)
      I_ = Variable(I_, requires_grad=False)
      features = extract_feature(I_, backbone)
      print("features:", features)

I am a beginner with PyTorch.

Thanks.
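
For reference, the error says the backbone got a 3-D tensor [3, 112, 112] where it expects a 4-D batch [N, 3, 112, 112]: torch.cat of C x H x W tensors along dim 0 only concatenates channels. Building the batch with torch.stack (or unsqueezing each face) is the usual fix; a minimal standalone check:

import torch

faces = [torch.randn(3, 112, 112) for _ in range(2)]   # stand-ins for transform(face)

# torch.cat(faces, 0) would give a (N*3) x 112 x 112 tensor -> the 3-D error above.
# torch.stack adds the batch dimension the backbone expects.
batch = torch.stack(faces, dim=0)
print(batch.shape)   # torch.Size([2, 3, 112, 112])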

About model parallel for weight matrix in the head

Thank you very much for your high performance repo!

By splitting the large weight matrix (refer to here),
the memory consumption is more balanced; however, training does not seem to speed up. I guess the bottleneck lies in the communication cost of transferring the head matrix from device to device.

May I ask how to implement parallel_module_local_v1.py from insightface efficiently?

Looking forward to your suggestions! Thank you!

Question about the required training epochs

Hi~

According to the original paper, 180K training iterations are required for MS1MV2.
Converting this into epochs gives 512 * 180K / 5.8M ≈ 16 epochs.
(If there is an error in this estimate, please correct me.)

The provided pre-trained model (IR-50) was trained for up to 120 epochs.
We would like to know how many training epochs are sufficient for MS1MV2.

On the other hand, we validated the pre-trained model on MegaFace (cleaned by deepinsight).
The accuracy is less than 95%.
Is this result reasonable?

Thank you.

The network detects 18 different versions of me (farther than 0.99 apart) when I move my head

Hi.

I have developed a little app that, when it detects a face, tries to find it in the database and, if it is not there, creates a new record.
As I move my head, the app thinks I am a new user (distance >= 0.99) and creates a new entry in the database.
Up to 18 records have been created.

I was thinking of developing a professional app for controlling access to places, etc.
What can I do to filter out (or otherwise avoid) these extra records created by different head poses?

Thanks.
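
One common mitigation (a sketch under my own assumptions, not a feature of this repo) is to enroll several embeddings per person, captured at different poses, and to compare a query against the minimum distance over each identity's stored embeddings before deciding it is someone new:

import numpy as np

def best_match(query, gallery, threshold=0.99):
    # gallery: dict mapping person_id -> list of stored embedding vectors.
    # Returns (person_id, distance) of the closest identity, or (None, None)
    # if nothing falls under the threshold (0.99 mirrors the value in the
    # question; tune it on your own data).
    best_id, best_dist = None, float("inf")
    for person_id, embeddings in gallery.items():
        d = min(np.linalg.norm(query - e) for e in embeddings)
        if d < best_dist:
            best_id, best_dist = person_id, d
    return (best_id, best_dist) if best_dist <= threshold else (None, None)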

If I use a pretrained model, do I need to train it to recognize people?

Hi.

What I have understood is that a trained model assigns a vector to each face, and with each vector I check in the database whether that person is already registered (distance between vectors <= 0.99).
Is this wrong? Do I have to train the network on the faces of the people I want it to recognize?

I have read the documentation several times, but I don't know how to assign a vector to a face. Where can I find that code?

Thanks.
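
Your understanding matches how these models are normally used: the backbone maps an aligned face crop to a 512-D embedding, and recognition is done by comparing embeddings, with no per-person retraining. A hedged sketch of producing such a vector with the released IR_50 weights (module path as in the repo, preprocessing values are the repo's defaults, but treat the details as assumptions):

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms
from backbone.model_irse import IR_50

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

backbone = IR_50(input_size=[112, 112])
backbone.load_state_dict(torch.load("backbone_ir50_ms1m_epoch120.pth", map_location="cpu"))
backbone.to(device).eval()

preprocess = transforms.Compose([
    transforms.Resize([112, 112]),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

face = Image.open("aligned_face.jpg").convert("RGB")   # an already-aligned face crop
with torch.no_grad():
    embedding = F.normalize(backbone(preprocess(face).unsqueeze(0).to(device)), dim=1)
print(embedding.shape)   # torch.Size([1, 512])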

About some problems in evaluation

Thank you for your great repo! May I ask a few questions about evaluation?

  • The model is trained on RGB images but evaluated on BGR images, which causes a slight performance degradation; specifically, accuracy on cfp_fp can reach 98% if this is fixed.
  • May I ask why you use ccrop, i.e., first resizing 112x112 to 128x128 and then cropping back to 112x112?
  • What is the difference from the validation datasets released by InsightFace_Pytorch? I notice that using your released cfp_fp yields better accuracy.

There are a lot of errors in this library, e.g. extra parentheses in utils.py; after solving nearly 50 problems I got stuck here

 python train.py
============================================================
Overall Configurations:
{'SEED': 1337, 'DATA_ROOT': '/media/mustafa/ubuntu_backup/face/face.evoLVe.PyTorch/alogn_faces', 'MODEL_ROOT': '/media/mustafa/ubuntu_backup/face/face.evoLVe.PyTorch/model', 'LOG_ROOT': '/media/mustafa/ubuntu_backup/face/face.evoLVe.PyTorch/log', 'BACKBONE_RESUME_ROOT': './', 'HEAD_RESUME_ROOT': './', 'BACKBONE_NAME': 'IR_SE_50', 'HEAD_NAME': 'ArcFace', 'LOSS_NAME': 'Focal', 'INPUT_SIZE': [112, 112], 'RGB_MEAN': [0.5, 0.5, 0.5], 'RGB_STD': [0.5, 0.5, 0.5], 'EMBEDDING_SIZE': 512, 'BATCH_SIZE': 512, 'DROP_LAST': True, 'LR': 0.1, 'NUM_EPOCH': 125, 'WEIGHT_DECAY': 0.0005, 'MOMENTUM': 0.9, 'STAGES': [35, 65, 95], 'DEVICE': device(type='cuda', index=0), 'MULTI_GPU': True, 'GPU_ID': [0, 1, 2, 3], 'PIN_MEMORY': True, 'NUM_WORKERS': 0}
============================================================
Traceback (most recent call last):
  File "train.py", line 73, in <module>
    weights = make_weights_for_balanced_classes(dataset_train.imgs, len(dataset_train.classes))
  File "/media/mustafa/ubuntu_backup/face/face.evoLVe.PyTorch/util/utils.py", line 47, in make_weights_for_balanced_classes
    weight_per_class[i] = N / float(count[i])
ZeroDivisionError: float division by zero
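
The ZeroDivisionError means count[i] is 0 for some class, i.e. at least one identity folder under DATA_ROOT is indexed by ImageFolder but contains no readable images (typically an empty or stray directory). A quick check you can run before training (paths here are placeholders):

import os

DATA_ROOT = "/path/to/aligned_faces"   # the folder passed to ImageFolder

# Identity folders that will become classes but contain no files at all.
empty = [d for d in sorted(os.listdir(DATA_ROOT))
         if os.path.isdir(os.path.join(DATA_ROOT, d))
         and not os.listdir(os.path.join(DATA_ROOT, d))]
print("empty identity folders:", empty)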

AM-Softmax size mismatch?

cos_theta = torch.mm(embbedings, kernel_norm)

RuntimeError: size mismatch, m1: [1 x 5994], m2: [512 x 5994]

I use a ResNeXt as the backbone and pass the output features to AM-Softmax, but I get the error above: the kernel size does not match the feature size, as if the ordering were wrong. Am I using it incorrectly?
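
For reference, the shapes in the error are the whole story: torch.mm needs [N x 512] embeddings against the head's [512 x num_classes] kernel (num_classes is 5994 here), so a backbone that emits 5994-D features has to be projected down to the head's embedding size (or EMBEDDING_SIZE changed to match). A shape sketch:

import torch
import torch.nn.functional as F

embedding_size, num_classes, batch = 512, 5994, 4

kernel = torch.randn(embedding_size, num_classes)                     # head weight: 512 x 5994
embeddings = F.normalize(torch.randn(batch, embedding_size), dim=1)   # must be N x 512

kernel_norm = F.normalize(kernel, dim=0)
cos_theta = torch.mm(embeddings, kernel_norm)                         # N x 5994, as the head expects
print(cos_theta.shape)

# If the ResNeXt ends in a 5994-D feature, add e.g. nn.Linear(5994, 512)
# before the head, or set the embedding size to the backbone's output width.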
