Comments (20)
The results of the Face Detection CPU demo are fine.
from mediapipe.
@xuguozhi Why does the first screenshot show the bounding box positioned incorrectly? Was this only temporary?
from mediapipe.
Thanks for reporting. We are aware of this rendering issue; another user already hit the same problem on a Xiaomi Mix 2S. We believe it's a GPU rendering issue inside the AnnotationOverlayCalculator on certain Android phones. Unfortunately, we have not been able to reproduce it on our test devices. To help us reproduce this issue, please let us know your device type if possible. Thanks.
from mediapipe.
@mgyong It's not temporary; it always seems to show up like that. My test phone is a Xiaomi Black Shark: https://www.mi.com/blackshark-game2/
from mediapipe.
@jiuqiant My device is a Xiaomi Black Shark: https://www.mi.com/blackshark-game2/
from mediapipe.
@xuguozhi, we are able to reproduce the problem on a Redmi Note 4. We will be working on a fix. Thanks.
from mediapipe.
Hi @xuguozhi, I think the root cause has been found:
Some Xiaomi phones have an odd-size camera image resolution (1269x1692) by default, and mediapipe GpuBuffer assumes all texture sizes are evenly divisible by 4.
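As a quick check of the numbers in this diagnosis, here is a minimal sketch that tests each dimension of the reported 1269x1692 resolution for divisibility by 4. The helper name `misaligned_dimensions` is hypothetical and not part of MediaPipe; it only illustrates the arithmetic.

```python
def misaligned_dimensions(width, height, alignment=4):
    """Return the names of the dimensions that are not multiples of `alignment`.

    Hypothetical helper for illustration only; not a MediaPipe API.
    """
    bad = []
    if width % alignment != 0:
        bad.append("width")
    if height % alignment != 0:
        bad.append("height")
    return bad


# The default camera resolution reported in this thread.
print(misaligned_dimensions(1269, 1692))  # only the width is misaligned
# The resolution reported after requesting a different size.
print(misaligned_dimensions(1080, 1920))  # nothing is misaligned
```

For 1269x1692 only the width breaks the assumption (1269 = 4 * 317 + 1), while the height is already a multiple of 4.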
I can provide a temporary workaround until a proper solution is found.
Please make the following edits:
- In tflite_tensors_to_detections_calculator.cc, line 631, change num_coords_ to kNumCoordsPerBox, so that the final line reads:
  size_t raw_anchors_length = num_boxes_ * kNumCoordsPerBox;
- In CameraXPreviewHelper.java, line 49, add .setTargetResolution(new Size(600,800)) to the builder, so that the final line reads:
  new PreviewConfig.Builder().setLensFacing(cameraLensFacing).setTargetResolution(new Size(600,800)).build();
The first fix (num coords) addresses a genuine bug that is broader than this rendering issue, and it fixes a memory allocation problem.
The second fix (java) is a way to request a different camera texture size. You can play around with the requested size, but the goal is to have the camera deliver an image whose dimensions are multiples of 4. In my experiments, the size given here results in a 1080x1920 image.
Hope that helps, and thanks for working with us to help make MediaPipe better.
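One way to pick a request size without trial and error is to look for the largest size that keeps the camera's aspect ratio and has both dimensions divisible by 4. The following is a minimal sketch under that assumption; `largest_aligned_size` is a hypothetical helper, not part of MediaPipe or CameraX. As noted above, the camera may still deliver a different size than the one requested, so the actual frame size should be verified on the device.

```python
from math import gcd


def largest_aligned_size(width, height, alignment=4):
    """Largest (w, h) <= (width, height) with the same aspect ratio where
    both w and h are multiples of `alignment`.

    Hypothetical helper for illustration only.
    """
    g = gcd(width, height)
    ratio_w, ratio_h = width // g, height // g  # reduced aspect ratio
    # ratio_w * k is a multiple of `alignment` exactly when k is a multiple
    # of alignment / gcd(ratio_w, alignment); the same holds for ratio_h.
    step_w = alignment // gcd(ratio_w, alignment)
    step_h = alignment // gcd(ratio_h, alignment)
    step = step_w * step_h // gcd(step_w, step_h)  # least common multiple
    k = (g // step) * step
    if k == 0:
        raise ValueError("no aligned size with this aspect ratio fits")
    return ratio_w * k, ratio_h * k


print(largest_aligned_size(1269, 1692))  # (1260, 1680)
```

For the 1269x1692 frames discussed here the aspect ratio reduces to 3:4, so any size of the form (3k, 4k) with k a multiple of 4 qualifies; 600x800 and 1200x1600 are both of that form.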
from mediapipe.
Sorry, for line 624 of tflite_tensors_to_detections_calculator.cc, should the changed line be
size_t raw_boxes_length = num_boxes_ * kNumCoordsPerBox;
or
size_t raw_anchors_length = num_boxes_ * kNumCoordsPerBox;
?
from mediapipe.
@xuguozhi, sorry for the confusion.
Please modify line 631 of tflite_tensors_to_detections_calculator.cc to read:
size_t raw_anchors_length = num_boxes_ * kNumCoordsPerBox;
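To see why the two lengths differ, here is a small arithmetic sketch using the option values from the face detection graph posted in this thread (num_boxes: 896, num_coords: 16, keypoint_coord_offset: 4, num_keypoints: 6, num_values_per_keypoint: 2). The value 4 for kNumCoordsPerBox is an assumption; the thread does not state it.

```python
# Values taken from the TfLiteTensorsToDetectionsCalculatorOptions in this thread.
NUM_BOXES = 896
NUM_COORDS = 16
BOX_COORDS = 4               # keypoint_coord_offset (4) minus box_coord_offset (0)
NUM_KEYPOINTS = 6
NUM_VALUES_PER_KEYPOINT = 2

# ASSUMPTION: kNumCoordsPerBox is 4 (one box = 4 values); not stated in the thread.
K_NUM_COORDS_PER_BOX = 4

# Each raw box carries the box coordinates plus all keypoint values.
coords_per_raw_box = BOX_COORDS + NUM_KEYPOINTS * NUM_VALUES_PER_KEYPOINT

raw_boxes_length = NUM_BOXES * NUM_COORDS               # per-box model output
raw_anchors_length = NUM_BOXES * K_NUM_COORDS_PER_BOX   # per-box anchor data

print(coords_per_raw_box)    # 16, matches num_coords
print(raw_boxes_length)      # 14336
print(raw_anchors_length)    # 3584
```

Under that assumption, sizing the anchor buffer with num_coords_ instead of kNumCoordsPerBox would make it four times larger than the anchor data for this model.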
from mediapipe.
It doesn't work. The boxes in the face detection GPU and object detection GPU demos are drawn in red, but they are not proper rectangles; they look like rectangles with an incorrect affine transformation applied. However, the CPU versions of both face detection and object detection work fine.
from mediapipe.
Ah, sorry about the line number mix-up; it's fixed now.
Seeing red squares is progress!
Did you also modify the camera size? Can you verify the new resolution? Another screenshot may also help.
from mediapipe.
Hi @mcclanahoochie,
I have modified the camera size as: new PreviewConfig.Builder().setLensFacing(cameraLensFacing).setTargetResolution(new Size(600,800)).build();
and the screenshot is like this:
from mediapipe.
What is the resolution of the camera frames (before and after the Java edit)?
Another option, instead of the Java edit, is to insert an ImageTransformationCalculator at the beginning of the graph to resize the image to a known size.
It would look like this (comments removed; 1200x1600 is based on the 1269x1692 frames on the phone here):
input_stream: "input_video"
output_stream: "output_video"
node {
  calculator: "RealTimeFlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:detections"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video_0"
}
node: {
  calculator: "ImageTransformationCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video_0"
  output_stream: "IMAGE_GPU:throttled_input_video"
  node_options: {
    [type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] {
      output_width: 1200
      output_height: 1600
    }
  }
}
node: {
  calculator: "ImageTransformationCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  output_stream: "IMAGE_GPU:transformed_input_video"
  output_stream: "LETTERBOX_PADDING:letterbox_padding"
  node_options: {
    [type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] {
      output_width: 128
      output_height: 128
      scale_mode: FIT
    }
  }
}
node {
  calculator: "TfLiteConverterCalculator"
  input_stream: "IMAGE_GPU:transformed_input_video"
  output_stream: "TENSORS_GPU:image_tensor"
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteConverterCalculatorOptions] {
      zero_center: true
      flip_vertically: true
    }
  }
}
node {
  calculator: "TfLiteInferenceCalculator"
  input_stream: "TENSORS_GPU:image_tensor"
  output_stream: "TENSORS_GPU:detection_tensors"
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions] {
      model_path: "facedetector_front.tflite"
    }
  }
}
node {
  calculator: "SsdAnchorsCalculator"
  output_side_packet: "anchors"
  node_options: {
    [type.googleapis.com/mediapipe.SsdAnchorsCalculatorOptions] {
      num_layers: 4
      min_scale: 0.1484375
      max_scale: 0.75
      input_size_height: 128
      input_size_width: 128
      anchor_offset_x: 0.5
      anchor_offset_y: 0.5
      strides: 8
      strides: 16
      strides: 16
      strides: 16
      aspect_ratios: 1.0
      fixed_anchor_size: true
    }
  }
}
node {
  calculator: "TfLiteTensorsToDetectionsCalculator"
  input_stream: "TENSORS_GPU:detection_tensors"
  input_side_packet: "ANCHORS:anchors"
  output_stream: "DETECTIONS:detections"
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteTensorsToDetectionsCalculatorOptions] {
      num_classes: 1
      num_boxes: 896
      num_coords: 16
      box_coord_offset: 0
      keypoint_coord_offset: 4
      num_keypoints: 6
      num_values_per_keypoint: 2
      sigmoid_score: true
      score_clipping_thresh: 100.0
      reverse_output_order: true
      x_scale: 128.0
      y_scale: 128.0
      h_scale: 128.0
      w_scale: 128.0
      flip_vertically: true
    }
  }
}
node {
  calculator: "NonMaxSuppressionCalculator"
  input_stream: "detections"
  output_stream: "filtered_detections"
  node_options: {
    [type.googleapis.com/mediapipe.NonMaxSuppressionCalculatorOptions] {
      min_suppression_threshold: 0.3
      min_score_threshold: 0.75
      overlap_type: INTERSECTION_OVER_UNION
      algorithm: WEIGHTED
    }
  }
}
node {
  calculator: "DetectionLabelIdToTextCalculator"
  input_stream: "filtered_detections"
  output_stream: "labeled_detections"
  node_options: {
    [type.googleapis.com/mediapipe.DetectionLabelIdToTextCalculatorOptions] {
      label_map_path: "facedetector_front_labelmap.txt"
    }
  }
}
node {
  calculator: "DetectionLetterboxRemovalCalculator"
  input_stream: "DETECTIONS:labeled_detections"
  input_stream: "LETTERBOX_PADDING:letterbox_padding"
  output_stream: "DETECTIONS:output_detections"
}
node {
  calculator: "DetectionsToRenderDataCalculator"
  input_stream: "DETECTION_VECTOR:output_detections"
  output_stream: "RENDER_DATA:render_data"
  node_options: {
    [type.googleapis.com/mediapipe.DetectionsToRenderDataCalculatorOptions] {
      thickness: 8.0
      color { r: 255 g: 0 b: 0 }
    }
  }
}
node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "INPUT_FRAME_GPU:throttled_input_video"
  input_stream: "render_data"
  output_stream: "OUTPUT_FRAME_GPU:output_video"
  node_options: {
    [type.googleapis.com/mediapipe.AnnotationOverlayCalculatorOptions] {
      flip_text_vertically: true
    }
  }
}
This is to test my theory about the %4 size issue. This graph fixes the skew on the Xiaomi phone here.
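The graph above resizes the 1200x1600 frame to 128x128 with scale_mode: FIT and later undoes the resulting letterbox padding on the detections. The sketch below works through that arithmetic for these sizes. It reflects my understanding of how FIT scaling and letterbox removal behave (padding expressed as normalized left, top, right, bottom fractions of the output image) and is not MediaPipe's implementation; both function names are hypothetical.

```python
def letterbox_padding(in_w, in_h, out_w, out_h):
    """Normalized (left, top, right, bottom) padding when an in_w x in_h image
    is scaled to fit inside out_w x out_h while keeping its aspect ratio.

    Illustrative sketch only; assumes the image is centered in the output.
    """
    if in_w * out_h <= in_h * out_w:
        # Input is relatively narrower: height fills the output, width is padded.
        scaled_w = in_w * out_h / in_h
        pad_x = (out_w - scaled_w) / (2 * out_w)
        pad_y = 0.0
    else:
        # Input is relatively wider: width fills the output, height is padded.
        scaled_h = in_h * out_w / in_w
        pad_x = 0.0
        pad_y = (out_h - scaled_h) / (2 * out_h)
    return (pad_x, pad_y, pad_x, pad_y)


def remove_letterbox(x, y, padding):
    """Map a normalized point in the padded image back to the unpadded image."""
    left, top, right, bottom = padding
    return ((x - left) / (1.0 - left - right),
            (y - top) / (1.0 - top - bottom))


padding = letterbox_padding(1200, 1600, 128, 128)
print(padding)                               # (0.125, 0.0, 0.125, 0.0)
print(remove_letterbox(0.125, 0.0, padding))  # left edge of the content maps to x = 0.0
```

With these sizes the frame is scaled to 96x128, leaving 16 pixels of padding on each side of the 128-pixel-wide model input.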
from mediapipe.
@xuguozhi Did you try out @mcclanahoochie's ImageTransformationCalculator suggestion, and did it work? If it did, please let us know.
from mediapipe.
@mgyong Sorry for the late reply, I will try it soon :)
from mediapipe.
Hi, the same issues appear on OPPO Find X and Xiaomi MAX2 phones.
@mcclanahoochie You suggested inserting an ImageTransformationCalculator at the beginning of the graph to resize the image to a known size.
At which line should it be inserted?
from mediapipe.
@xuguozhi What @mcclanahoochie gave you is a new MediaPipe graph. You can visualize it at http://viz.mediapipe.dev. Please manually replace the content of the face detection GPU graph with the code snippet in @mcclanahoochie's comment.
FYI, the new graph looks like:
from mediapipe.
@mgyong @jiuqiant @mcclanahoochie Cool~, it works! Thanks, you guys are great!
from mediapipe.
Awesome!
from mediapipe.
Not sure, but when I input a high-resolution image into MediaPipe I get this error:
NoneType Object
But when I crop the same image and send it, it gives me an error. Any clue?
from mediapipe.