

RPG Monocular Pose Estimator

Disclaimer and License

The RPG Monocular Pose Estimator is recommended for use with ROS Kinetic and Ubuntu 16.04. This is research code; expect it to change often (or not at all), and any fitness for a particular purpose is disclaimed.

The source code is released under a GPL licence. Please contact the authors for a commercial license.

Package Summary

The RPG Monocular Pose Estimator uses infrared LEDs mounted on a target object and a camera with an infrared-pass filter to estimate the pose of an object.

The positions of the LEDs on the target object are provided by the user in a YAML configuration file. The LEDs are detected in the image, and a pose estimate for the target object is subsequently calculated.

Publication

If you use this work in an academic context, please cite the following ICRA 2014 publication:

M. Faessler, E. Mueggler, K. Schwabe, D. Scaramuzza: A Monocular Pose Estimation System based on Infrared LEDs. IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, 2014.

@InProceedings{Faessler14icra,
  author        = {Matthias Faessler and Elias Mueggler and Karl Schwabe and
                  Davide Scaramuzza},
  title         = {A Monocular Pose Estimation System based on Infrared {LED}s},
  booktitle     = {{IEEE} Int. Conf. Robot. Autom. (ICRA)},
  year          = 2014,
  pages         = {907--913},
  doi           = {10.1109/ICRA.2014.6906962}
}

Watch the video demonstrating the RPG Monocular Pose Estimator: RPG Monocular Pose Estimator Video

Installation

Installation of the package

System Dependencies

The RPG Monocular Pose Estimator is built on the Robot Operating System (ROS). In order to install the package, ROS has to be installed.

  • In order to install the Robot Operating System (ROS), please follow the instructions provided in the link.
  • Make sure you have properly set up a ROS catkin workspace as described here.

Additionally, the RPG Monocular Pose Estimator makes use of OpenCV for image processing and the Eigen linear algebra library. These should come preinstalled with ROS; however, if a dependency is missing, it can be installed from the respective website:

  • To install OpenCV, follow the installation instructions provided on the OpenCV website.

  • To install the Eigen linear algebra library, follow the installation instructions provided on the Eigen website.

Main Installation

In order to install the RPG Monocular Pose Estimator and its dependencies, first install vcstool:

sudo apt-get install python-vcstool

Then clone the latest version from our GitHub repository into your catkin workspace, fetch the dependencies with vcstool, and compile the packages with catkin:

cd catkin_workspace/src
git clone https://github.com/uzh-rpg/rpg_monocular_pose_estimator.git
vcs-import < rpg_monocular_pose_estimator/dependencies.yaml
catkin_make

Source your catkin workspace after this.

Building the Documentation

The RPG Monocular Pose Estimator makes use of Doxygen to produce its documentation. Please ensure that you have the latest version of Doxygen which can be installed with:

sudo apt-get install doxygen

The Doxygen configuration file is located in the doxygen_documentation folder within the RPG Monocular Pose Estimator directory.

Change to the doxygen_documentation directory:

roscd monocular_pose_estimator/../doxygen_documentation

To produce the documentation, run Doxygen on the configuration file:

doxygen doxygen_config_file.cfg

Open the index.html file in the html directory that has been produced.

Test Installation on Basic Dataset

In order to test the installation on a data set, download the data set from here, and follow these instructions.

  1. Download and Untar a sample ROS bag file

    roscd monocular_pose_estimator
    mkdir bags
    cd bags
    wget http://rpg.ifi.uzh.ch/data/monocular-pose-estimator-data.tar.gz
    tar -zxvf monocular-pose-estimator-data.tar.gz
    rm monocular-pose-estimator-data.tar.gz
    
  2. Launch the demo launch file using

    roslaunch monocular_pose_estimator demo.launch
    
  3. You should see a visualisation of the system: detected LEDs should be circled in red, the region of interest in the image that is being processed should be bounded by a blue rectangle, and the orientation of the tracked object will be represented by the red-green-blue trivector located at the origin of the target object's coordinate frame.

Basic Usage

The program makes use of a number of parameters to estimate the pose of the tracked object. These include the location of the LEDs on the object, the intrinsic parameters of the camera, and various runtime parameters, such as thresholding values.

Setting up the Marker Positions Parameter File

In order to predict the pose of the target object, the RPG Monocular Pose Estimator needs to know the positions of the LEDs on the object in the object's frame of reference. These positions are given to the program using a YAML file located in the monocular_pose_estimator/marker_positions folder.

By default, the file is called marker_positions.yaml and has the following format.

# The marker positions in the trackable's frame of reference:
#
marker_positions:
  - x: 0.0714197
    y: 0.0800214
    z: 0.0622611
  - x: 0.0400755
    y: -0.0912328
    z: 0.0317064
  - x: -0.0647293
    y: -0.0879977
    z: 0.0830852
  - x: -0.0558663
    y: -0.0165446
    z: 0.053473

Note that each new LED position is indicated by a dash (-). The position is given in the x, y, and z coordinates of the object frame in metres.
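As a quick sanity check (not part of the package), the example coordinates above can be examined in plain Python. Distinct pairwise LED distances make the configuration unambiguous, in line with the non-symmetric placement recommended under 'LED Configuration' below:

```python
import itertools
import math

# Example marker positions from marker_positions.yaml (metres, object frame).
markers = [
    (0.0714197, 0.0800214, 0.0622611),
    (0.0400755, -0.0912328, 0.0317064),
    (-0.0647293, -0.0879977, 0.0830852),
    (-0.0558663, -0.0165446, 0.053473),
]

def dist(a, b):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Four LEDs yield C(4,2) = 6 pairwise distances; if these are all distinct,
# the geometry carries no symmetry that could confuse the correspondences.
pairs = itertools.combinations(range(len(markers)), 2)
distances = {(i, j): dist(markers[i], markers[j]) for i, j in pairs}
for (i, j), d in distances.items():
    print(f"LED {i} <-> LED {j}: {d:.4f} m")
```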

If you would like to use your own marker positions file, place it in the monocular_pose_estimator/marker_positions folder and alter the launch file as explained below in the section 'Launch Files'.

Running the RPG Monocular Pose Estimator with a USB Camera

Ensure that you have a working USB camera.

The camera needs to be calibrated. Follow the instructions at http://www.ros.org/wiki/camera_calibration/Tutorials/MonocularCalibration.

The pose estimator listens to the camera's camera/image_raw and camera/camera_info topics at launch. An example of such a camera_info message is:

header: 
  seq: 4686
  stamp: 
    secs: 1378817190
    nsecs: 598124104
  frame_id: /camera
height: 480
width: 752
distortion_model: plumb_bob
D: [-0.358561237166698, 0.149312912580924, 0.000484551782515636, -0.000200189442379448, 0.0]
K: [615.652408400557, 0.0, 362.655454167686, 0.0, 616.760184718123, 256.67210750994, 0.0, 0.0, 1.0]
R: [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
P: [525.978149414062, 0.0, 357.619310580667, 0.0, 0.0, 579.796447753906, 258.377118195804, 0.0, 0.0, 0.0, 1.0, 0.0]
binning_x: 0
binning_y: 0
roi: 
  x_offset: 0
  y_offset: 0
  height: 0
  width: 0
  do_rectify: False
---
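For reference, the K matrix in this message holds the camera's focal lengths and principal point in row-major order. A minimal pinhole-projection sketch in Python (ignoring the distortion coefficients D for brevity; this is an illustration, not code from the package):

```python
# Row-major 3x3 intrinsic matrix K from the camera_info message above.
K = [615.652408400557, 0.0, 362.655454167686,
     0.0, 616.760184718123, 256.67210750994,
     0.0, 0.0, 1.0]

fx, fy = K[0], K[4]   # focal lengths in pixels
cx, cy = K[2], K[5]   # principal point in pixels

def project(point_camera_frame):
    """Pinhole projection of a 3-D point (camera frame, metres) to pixel
    coordinates, ignoring lens distortion D for brevity."""
    x, y, z = point_camera_frame
    return (fx * x / z + cx, fy * y / z + cy)

# A point on the optical axis projects onto the principal point (cx, cy).
print(project((0.0, 0.0, 1.0)))
```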

The camera should be adjusted so that the gain and shutter/exposure time of the camera are fixed. (Use 'rqt_reconfigure' to adjust the camera parameters while the camera is running). Ideally, the LEDs should appear very bright in the image so that they can easily be segmented from the surrounding scene by simple thresholding. See 'Parameter Settings' below on how to change the thresholding value.
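The 'simple thresholding' mentioned above can be sketched in plain Python. The package itself uses OpenCV for this, so the snippet is only an illustration of the behaviour of threshold_value described in 'Parameter Settings' below:

```python
def threshold_to_zero(pixels, threshold_value=220):
    """Pixels below the threshold are set to zero; the rest keep their
    intensity (the behaviour described for threshold_value, analogous to
    OpenCV's THRESH_TOZERO, applied here to a flat list of intensities)."""
    return [p if p >= threshold_value else 0 for p in pixels]

row = [10, 219, 220, 255, 100]
print(threshold_to_zero(row))  # -> [0, 0, 220, 255, 0]
```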

Inputs, Outputs

Configuration Files

The RPG Monocular Pose Estimator requires the LED positions on the target object. These are entered in a YAML file and are loaded at runtime. See 'Setting up the Marker Positions Parameter File' above for more details.

Subscribed Topics

The RPG Monocular Pose Estimator subscribes to the following topics:

  • camera/image_raw (sensor_msgs/Image)

    The image from the camera.

  • camera/camera_info (sensor_msgs/CameraInfo)

    The intrinsic calibration parameters of the camera (see the example message above).

Published Topics

The RPG Monocular Pose Estimator publishes the following topics:

  • estimated_pose (geometry_msgs/PoseWithCovarianceStamped)

    The estimated pose of the target object with respect to the camera.

  • image_with_detections (sensor_msgs/Image)

The image with the detected LEDs circled in red, the region of interest of the image that was processed bounded by a blue rectangle, and the orientation trivector projected onto the object.

Parameter Settings

The following parameters can be set dynamically during runtime. (Use 'rqt_reconfigure' to adjust the parameters).


  • threshold_value (int, default: 220, min: 0, max: 255)

    This is the pixel intensity that will be used to threshold the image. All pixels with intensity values below this value will be set to zero. Pixels with intensity values equal to or higher than this value will retain their intensities.

  • gaussian_sigma (double, default: 0.6, min: 0, max: 6)

This is the standard deviation of the Gaussian that will be used to blur the image after it has been thresholded.

  • min_blob_area (double, default: 10, min: 0, max: 100)

    This is the minimum blob area (in pixels squared) that will be detected as a blob/LED.

  • max_blob_area (double, default: 200, min: 0, max: 1000)

    This is the maximum blob area (in pixels squared) that will be detected as a blob/LED. Blobs having an area larger than this will not be detected as LEDs.

  • max_width_height_distortion (double, default: 0.5, min: 0, max: 1)

    This is a parameter related to the circular distortion of the detected blobs. It is the maximum allowable distortion of a bounding box around the detected blob calculated as the ratio of the width to the height of the bounding rectangle. Ideally the ratio of the width to the height of the bounding rectangle should be 1.

  • max_circular_distortion (double, default: 0.5, min: 0, max: 1)

    This is a parameter related to the circular distortion of the detected blobs. It is the maximum allowable distortion of a bounding box around the detected blob, calculated as the area of the blob divided by pi times half the height or half the width of the bounding rectangle.

  • back_projection_pixel_tolerance (double, default: 3, min: 0, max: 10)

    This is the tolerance (in pixels) between a back projected LED and an image detection. If a back projected LED is within this threshold, then the pose for that back projection is deemed to be a possible candidate for the pose.

  • nearest_neighbour_pixel_tolerance (double, default: 5, min: 0, max: 10)

    This is the tolerance for the prediction of the correspondences between object LEDs and image detections. If the predicted position of an LED in the image is within this tolerance of the image detections, then the LED and image detection are considered to correspond to each other.

  • certainty_threshold (double, default: 0.75, min: 0, max: 1)

    This is the proportion of back projected LEDs into the image that have to be within the back_projection_pixel_tolerance, for a pose to be considered valid.

  • valid_correspondence_threshold (double, default: 0.7, min: 0, max: 1)

    This is the ratio of all combinations of 3 of the correspondences that yielded valid poses (i.e., were within the certainty_threshold), for a set of correspondences to be considered valid.

  • roi_border_thickness (int, default: 10, min: 0, max: 200)

    This is the thickness of the border (in pixels) around the predicted area of the LEDs in the image that defines the region of interest for image processing and detection of the LEDs.
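To make the interplay of these parameters concrete, here is a hedged Python sketch, not the package's actual implementation: the exact formulas for the two distortion measures and the P3P step are assumptions based on the descriptions above (both distortion measures are taken as deviations from the ideal value of 1, and back-projection errors stand in for full pose candidates):

```python
import math
from itertools import combinations

def blob_accepted(area, width, height,
                  min_blob_area=10, max_blob_area=200,
                  max_width_height_distortion=0.5,
                  max_circular_distortion=0.5):
    """Shape filter for candidate LED blobs (illustrative interpretation)."""
    if not (min_blob_area <= area <= max_blob_area):
        return False
    # Aspect-ratio check: bounding box of an LED blob should be near-square.
    if abs(1.0 - width / height) > max_width_height_distortion:
        return False
    # Circularity check: compare the blob area against the inscribed ellipse,
    # pi * (width / 2) * (height / 2).
    ellipse_area = math.pi * (width / 2.0) * (height / 2.0)
    if abs(1.0 - area / ellipse_area) > max_circular_distortion:
        return False
    return True

def pose_is_valid(back_projection_errors,
                  back_projection_pixel_tolerance=3.0,
                  certainty_threshold=0.75):
    """A pose candidate is valid if a sufficient fraction of back-projected
    LEDs fall within the pixel tolerance of an image detection."""
    within = sum(e <= back_projection_pixel_tolerance
                 for e in back_projection_errors)
    return within / len(back_projection_errors) >= certainty_threshold

def correspondences_are_valid(errors_per_triplet,
                              valid_correspondence_threshold=0.7):
    """A correspondence set is valid if a sufficient ratio of its 3-element
    combinations yielded a valid pose (toy stand-in for the P3P step)."""
    results = [pose_is_valid(e) for e in errors_per_triplet.values()]
    return sum(results) / len(results) >= valid_correspondence_threshold

# Toy data: four correspondences -> C(4,3) = 4 triplets; three give poses
# with small back-projection errors (pixels), one gives a bad pose.
triplets = list(combinations(range(4), 3))
errors = {t: [1.0, 2.0, 1.5, 2.5] for t in triplets[:3]}
errors[triplets[3]] = [8.0, 9.0, 7.0, 6.0]
print(blob_accepted(area=80, width=10, height=10))   # round LED blob -> True
print(blob_accepted(area=80, width=25, height=5))    # elongated glare -> False
print(correspondences_are_valid(errors))             # 3/4 >= 0.7 -> True
```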

Launch Files

The RPG Monocular Pose Estimator needs to be launched with a launch file, since the location of the YAML configuration file containing the LED positions on the object needs to be specified. (See 'Setting up the Marker Positions Parameter File' above for further details). An example launch file is presented below.

<launch> 
    
	<!-- Name of the YAML file containing the marker positions -->
    <arg name="YAML_file_name" default="marker_positions"/>

	<!-- File containing the the marker positions in the trackable's frame of reference -->
	<arg name="marker_positions_file" default="$(find monocular_pose_estimator)/marker_positions/$(arg YAML_file_name).yaml"/> 

    <node name="monocular_pose_estimator" pkg="monocular_pose_estimator" type="monocular_pose_estimator" respawn="false" output="screen"> 
	    <rosparam command="load" file="$(arg marker_positions_file)"/>
		<param name= "threshold_value" value = "140" />
		<param name= "gaussian_sigma" value = "0.6" />
		<param name= "min_blob_area" value = "10" />
		<param name= "max_blob_area" value = "200" />
		<param name= "max_width_height_distortion" value = "0.5" />
		<param name= "max_circular_distortion" value = "0.5" />
	</node>
</launch>

The name of the YAML configuration file may be specified directly as the default value of the argument YAML_file_name, or may be passed to the system via the command line on launch, e.g. roslaunch monocular_pose_estimator launch_file_name.launch YAML_file_name:=<file_name>, where <file_name> is replaced with the name of the marker positions YAML file.

This example launch file also sets some of the parameters described in 'Parameter Settings' above. They needn't be set in the launch file, as they can be changed dynamically at runtime. (Use 'rqt_reconfigure' to adjust the parameters.)

For more information on how to use ROS launch files, see the ROS website.

Hardware

The RPG Monocular Pose Estimator makes use of infrared LEDs mounted on the target object and a camera fitted with an infrared-pass filter. The details of the LEDs and camera that were used in our evaluation of the package are outlined below. Our system will be able to work with any kind of LEDs and camera, provided that the LEDs appear bright in the image and can consequently be segmented from the image using simple thresholding operations. Also the emission pattern of the LEDs should be as wide as possible, so that they still appear bright in the image even when viewed from shallow viewing angles.

Infrared LEDs

The infrared LEDs that were used were SMD LEDs of type Harvatek HT-260IRPJ. They have a wide emission pattern and consequently can be viewed from varying viewpoints.

LED Configuration

The placement of the LEDs on the target object can be arbitrary, but must be non-symmetric. The LEDs should also not lie in a plane in order to reduce the ambiguities of the pose estimation. Also the larger the volume that the LEDs fill, the more accurate the estimated pose will be. The robustness and accuracy are improved if the LEDs are visible from many view points. The RPG Monocular Pose Estimator requires a minimum of four (4) LEDs mounted and visible on the target object in order for its pose to be resolved.
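The non-coplanarity requirement can be verified with a scalar triple product. The following Python sketch (an illustration, not part of the package) flags a four-LED configuration that lies in a plane:

```python
def coplanar(p0, p1, p2, p3, eps=1e-9):
    """Four 3-D points are coplanar iff the scalar triple product of the
    edge vectors from p0 is (numerically) zero."""
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    w = [p3[i] - p0[i] for i in range(3)]
    # triple product u . (v x w)
    cross = [v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0]]
    triple = sum(u[i] * cross[i] for i in range(3))
    return abs(triple) < eps

flat = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]       # all in z = 0 plane
good = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0.2, 0.3, 0.5)]
print(coplanar(*flat))  # True  -> ambiguous configuration, avoid
print(coplanar(*good))  # False -> usable configuration
```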

Camera

The camera used was a MatrixVision mvBlueFOX-MLC200w monochrome camera fitted with an infrared-pass filter. Its resolution is 752x480 pixels.

rpg_monocular_pose_estimator's People

Contributors

eliasm, mfaessle


rpg_monocular_pose_estimator's Issues

The marker positions in the trackable's frame of reference

Hello, I would like to build my own marker to test the pose estimator with live cameras. Say I mount the LEDs on a board, would the center of the board be (0,0,0), so that the positions in marker_positions.yaml are calculated from the center?

Opencv 3.0 Assertion failed

After roslaunch monocular_pose_estimator demo.launch

OpenCV Error: Assertion failed (CV_IS_MAT(_cameraMatrix) && _cameraMatrix->rows == 3 && _cameraMatrix->cols == 3) in cvUndistortPoints, file /home/tuuzdu/tmp/opencv-3.0.0/modules/imgproc/src/undistort.cpp, line 288
terminate called after throwing an instance of 'cv::Exception'
what(): /home/tuuzdu/tmp/opencv-3.0.0/modules/imgproc/src/undistort.cpp:288: error: (-215) CV_IS_MAT(_cameraMatrix) && _cameraMatrix->rows == 3 && _cameraMatrix->cols == 3 in function cvUndistortPoints

Why the P is not identity?

D: [-0.358561237166698, 0.149312912580924, 0.000484551782515636, -0.000200189442379448, 0.0]
K: [615.652408400557, 0.0, 362.655454167686, 0.0, 616.760184718123, 256.67210750994, 0.0, 0.0, 1.0]
R: [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
P: [525.978149414062, 0.0, 357.619310580667, 0.0, 0.0, 579.796447753906, 258.377118195804, 0.0, 0.0, 0.0, 1.0, 0.0]

Hi,
In your example bag file, why is P not identity? (Or it should have the same numbers as K, I think.)
Usually only K is effective; however, P is not similar to K.

For example, in http://wiki.ros.org/camera_calibration/Tutorials/MonocularCalibration, P is identity.

Recommended Camera

Good afternoon,

I would like to know which cameras and, especially, which IR filters (and where to get them) you advise using with IR LEDs. I've tried the following approaches:

  • Logitech C270 with RGB LEDs, without much success (even with low noise in the image I couldn't get an estimated pose);
  • Thought of a Pixy with the IR-Lock setup, but I am not sure if it can detect LEDs up to 5 m from the sensor and 30 cm from each other;

What camera and filter do you use in your setup, and what distance can you get? I am going to use this on a Quadrotor.

With my best regards,

Pedro Roque

Unable to resolve pose when one led is occluded

Hi,

I am testing this node in simulation, in V-REP. I have created a scene with 6 markers and one camera. The markers are not coplanar and are well spaced.
The markers are simple spheres, and I added some filters on the camera in order to mimic the IR emitters and IR camera.

I set up the positions of the markers in the YAML file and launched the node. What happens is that when all 6 markers are visible, the pose estimation works very well. However, as soon as I occlude any of the 6 markers, the pose estimation fails and I get the warning "Unable to resolve a pose".

I verified that the blobs are detected correctly. I believe my problem is at a later step, perhaps in the correspondence check. I have also tried to force the estimation to always reinitialize, and the problem persists. So it looks like the problem is limited neither to the brute-force correspondence check nor to the correspondence check from the predicted pose.

Here are the parameters I used in the launch file:

<param name="threshold_value" value="140" />
<param name="gaussian_sigma" value="0.6" />
<param name="min_blob_area" value="60" />
<param name="max_blob_area" value="2000" />
<param name="max_width_height_distortion" value="0.7" />
<param name="max_circular_distortion" value="0.7" />
<param name="back_projection_pixel_tolerance" value="5" />
<param name="nearest_neighbour_pixel_tolerance" value="70" />
<param name="certainty_threshold" value="0.75" />
<param name="valid_correspondence_threshold" value="0.7" />
<param name="roi_border_thickness" value="100" />

Any suggestion at what could be the cause or to what part of the algorithm I should focus looking at?

ROS Indigo OpenCV3: undefined reference

Hi! When I try compiling the project, I get:

Building CXX object rpg_monocular_pose_estimator/monocular_pose_estimator/CMakeFiles/monocular_pose_estimator.dir/src/monocular_pose_estimator.o
Linking CXX executable /home/tuuzdu/catkin_ws/devel/lib/monocular_pose_estimator/monocular_pose_estimator
/home/tuuzdu/catkin_ws/devel/lib/libmonocular_pose_estimator_lib.so: undefined reference to `cv::line(cv::_InputOutputArray const&, cv::Point_<int>, cv::Point_<int>, cv::Scalar_<double> const&, int, int, int)'
/home/tuuzdu/catkin_ws/devel/lib/libmonocular_pose_estimator_lib.so: undefined reference to `cv::findContours(cv::_InputOutputArray const&, cv::_OutputArray const&, int, int, cv::Point_<int>)'
/home/tuuzdu/catkin_ws/devel/lib/libmonocular_pose_estimator_lib.so: undefined reference to `cv::circle(cv::_InputOutputArray const&, cv::Point_<int>, int, cv::Scalar_<double> const&, int, int, int)'
collect2: error: ld returned 1 exit status

libmonocular_pose_estimator_lib.so compiles normally.
I have ROS Indigo and ros-indigo-opencv3.

Tracking in close proximity for takeoff/landing

Very nice and useful work first of all.

It works just fine for me, but only when the distance between the camera and the LEDs is at least 0.5 metres.
I would like to use your project for takeoff/landing operations. I was not able to set up the parameters so that the tracking works in close proximity (100-300 mm between the camera and the LEDs).
Increasing the max blob area doesn't help (I set it to 1000).

Could you please help.

My bag file is attached (with good tracking on 0.5 m distance and no tracking on 200 mm distance).

marker_positions:
  - x: 0.036
    y: 0.055
    z: 0.009
  - x: 0.004
    y: 0.023
    z: 0.009
  - x: -0.022
    y: -0.015
    z: 0.009
  - x: 0.016
    y: -0.059
    z: 0.009

2019-03-21-08-48-19.bag.tar.gz

Success at compiling, but not running (Tag: Instructions)

Hi--

Congrats on a really neat project. I'm trying to make it run, but I find that the directions assume a lot of knowledge and familiarity with ROS. While I have succeeded in compiling the pose estimator, the launch commands do not work.

roslaunch monocular_pose_estimator demo.launch
[demo.launch] is neither a launch file in package [monocular_pose_estimator] nor is [monocular_pose_estimator] a launch file name

Completely unrelated, and this isn't a feature request, but FWIW the code would probably have wider appeal if it linked against a simpler library. Reading the paper and browsing through the code, it looks like OpenCV and Eigen might be enough for 99% of the required third-party functionality.

Provide sample data in a standard format

The library is useful outside of ROS. Therefore, it would be nice if the test data were provided in a format that does not depend on ROS (i.e., not in the .bag format). A simple video file of the test data along with parameters for the detector would be perfect.
