royzon / 3d-object-tracking-project

This project forked from uzmafiza24/3d-object-tracking-project

Detect and track objects from the benchmark KITTI dataset. Classify those objects and project them into three dimensions. Fuse those projections together with lidar data to create 3D objects to track over time.

CMake 1.15% C++ 98.85%

SFND 3D Object Tracking

Welcome to the final project of the camera course. By completing all the lessons, you now have a solid understanding of keypoint detectors, descriptors, and methods to match them between successive images. Also, you know how to detect objects in an image using the YOLO deep-learning framework. And finally, you know how to associate regions in a camera image with Lidar points in 3D space. Let's take a look at our program schematic to see what we already have accomplished and what's still missing.

In this final project, you will implement the missing parts in the schematic. To do this, you will complete four major tasks:

  1. First, you will develop a way to match 3D objects over time by using keypoint correspondences.
  2. Second, you will compute the TTC based on Lidar measurements.
  3. You will then proceed to do the same using the camera, which requires first associating keypoint matches with regions of interest and then computing the TTC based on those matches.
  4. And lastly, you will conduct various tests with the framework. Your goal is to identify the most suitable detector/descriptor combination for TTC estimation and also to search for problems that can lead to faulty measurements by the camera or Lidar sensor. In the last course of this Nanodegree, you will learn about the Kalman filter, which is a great way to combine the two independent TTC measurements into an improved version which is much more reliable than a single sensor alone can be. But before we think about such things, let us focus on your final project in the camera course.

Dependencies for Running Locally

Basic Build Instructions

  1. Clone this repo.
  2. Make a build directory in the top level project directory: mkdir build && cd build
  3. Compile: cmake .. && make
  4. Run it: ./3D_object_tracking.

SFND 3D Object Tracking Rubric

FP.1 Match 3D Objects

Implement the method matchBoundingBoxes, which takes as input both the previous and the current data frames and provides as output the ids of the matched regions of interest (i.e. the boxID property). Matches must be the ones with the highest number of keypoint correspondences.

Implemented in camFusion_Student.cpp.
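The matching rule described above can be sketched as follows. This is a minimal illustration, not the code in camFusion_Student.cpp: the `Box`, `Point` and `Match` structs are hypothetical stand-ins for the project's bounding box type and OpenCV's `cv::KeyPoint` and `cv::DMatch`, and the function name `matchBoundingBoxesSketch` is made up for this example. It counts, for every pair of previous and current boxes, how many keypoint matches have one end in each box, then keeps the current box with the highest count for each previous box.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the project's types (assumptions for this sketch).
struct Point { double x, y; };
struct Match { int prevIdx, currIdx; };  // indices into the two keypoint vectors
struct Box {
    int boxID;
    double x, y, width, height;  // region of interest in pixels
    bool contains(const Point& p) const {
        return p.x >= x && p.x < x + width && p.y >= y && p.y < y + height;
    }
};

// Returns a map prevBoxID -> currBoxID, choosing for each previous box the
// current box that shares the highest number of keypoint matches with it.
std::map<int, int> matchBoundingBoxesSketch(
    const std::vector<Match>& matches,
    const std::vector<Point>& prevKpts, const std::vector<Point>& currKpts,
    const std::vector<Box>& prevBoxes, const std::vector<Box>& currBoxes)
{
    // counts[{prevID, currID}] = number of matches with one end in each box
    std::map<std::pair<int, int>, int> counts;
    for (const Match& m : matches) {
        const Point& p = prevKpts[m.prevIdx];
        const Point& c = currKpts[m.currIdx];
        for (const Box& pb : prevBoxes) {
            if (!pb.contains(p)) continue;
            for (const Box& cb : currBoxes) {
                if (cb.contains(c)) ++counts[{pb.boxID, cb.boxID}];
            }
        }
    }

    std::map<int, int> best;       // prevID -> currID of the winning pair
    std::map<int, int> bestCount;  // prevID -> match count of the winning pair
    for (const auto& kv : counts) {
        const int prevID = kv.first.first;
        const int currID = kv.first.second;
        if (kv.second > bestCount[prevID]) {  // ties keep the lower currID
            bestCount[prevID] = kv.second;
            best[prevID] = currID;
        }
    }
    return best;
}
```

A keypoint that lies in several overlapping boxes votes for each of them in this sketch; whether the actual implementation handles overlaps the same way is not stated in the write-up.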

FP.2 Compute Lidar Based TTC

Compute the time-to-collision in seconds for all matched 3D objects using only Lidar measurements from the matched bounding boxes between current and previous frame.

Implemented in camFusion_Student.cpp.

FP.3 Associate Keypoint Correspondences with Bounding Boxes

Prepare the TTC computation based on camera measurements by associating keypoint correspondences to the bounding boxes which enclose them. All matches which satisfy this condition must be added to a vector in the respective bounding box.

Implemented in camFusion_Student.cpp.
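The association step can be sketched as below. The structs are hypothetical stand-ins for the project's bounding box and OpenCV's keypoint and match types, and the function name `clusterMatchesWithROISketch` is invented for this example. A match is added to the box when its current-frame keypoint lies inside the box's region of interest; any additional outlier filtering the real implementation may perform is not shown.

```cpp
#include <vector>

// Hypothetical stand-ins for the project's types (assumptions for this sketch).
struct Point { double x, y; };
struct Match { int prevIdx, currIdx; };  // indices into the two keypoint vectors
struct Box {
    double x, y, width, height;     // region of interest in pixels
    std::vector<Match> kptMatches;  // matches enclosed by this box
    bool contains(const Point& p) const {
        return p.x >= x && p.x < x + width && p.y >= y && p.y < y + height;
    }
};

// Appends to box.kptMatches every match whose current-frame keypoint
// falls inside the box.
void clusterMatchesWithROISketch(Box& box,
                                 const std::vector<Point>& currKpts,
                                 const std::vector<Match>& matches) {
    for (const Match& m : matches) {
        if (box.contains(currKpts[m.currIdx])) {
            box.kptMatches.push_back(m);
        }
    }
}
```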

FP.4 Compute Camera-based TTC

Compute the time-to-collision in seconds for all matched 3D objects using only keypoint correspondences from the matched bounding boxes between current and previous frame.

Implemented in camFusion_Student.cpp.
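One common way to estimate a camera-based TTC, sketched here under stated assumptions, is to measure how much the distances between pairs of matched keypoints grow from one frame to the next and to use the median of those ratios: TTC = -dT / (1 - medianRatio). Whether camFusion_Student.cpp uses the median or another statistic is not stated in this write-up. The `PointPair` struct, the function name `ttcCameraSketch` and the `minDist` default of 100 pixels are hypothetical choices for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };
struct PointPair { Point prev, curr; };  // one matched keypoint in both frames

// pairs: matched keypoints inside the bounding box. frameRate is in Hz.
// minDist: shortest current-frame distance (pixels) used for a ratio.
double ttcCameraSketch(const std::vector<PointPair>& pairs,
                       double frameRate, double minDist = 100.0) {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double eps = 1e-9;

    // Ratio of current to previous distance for every pair of keypoints.
    std::vector<double> ratios;
    for (std::size_t i = 0; i < pairs.size(); ++i) {
        for (std::size_t j = i + 1; j < pairs.size(); ++j) {
            const double dPrev = std::hypot(pairs[i].prev.x - pairs[j].prev.x,
                                            pairs[i].prev.y - pairs[j].prev.y);
            const double dCurr = std::hypot(pairs[i].curr.x - pairs[j].curr.x,
                                            pairs[i].curr.y - pairs[j].curr.y);
            if (dPrev > eps && dCurr >= minDist) ratios.push_back(dCurr / dPrev);
        }
    }
    if (ratios.empty()) return nan;  // too few usable keypoints

    std::sort(ratios.begin(), ratios.end());
    const std::size_t n = ratios.size();
    const double med = (n % 2 == 1) ? ratios[n / 2]
                                    : 0.5 * (ratios[n / 2 - 1] + ratios[n / 2]);

    // A ratio of 1 means no scale change, so the TTC is undefined.
    if (std::abs(1.0 - med) < eps) return nan;
    return -(1.0 / frameRate) / (1.0 - med);
}
```

With few matched keypoints the ratio list can be empty or dominated by mismatches, which is one way an estimate can come out as nan or as a very large value.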

FP.5 Performance Evaluation 1

Find examples where the TTC estimate of the Lidar sensor does not seem plausible. Describe your observations and provide a sound argument for why you think this happened. Several examples (2-3) have been identified and described in detail. The assertion that the TTC is off has been based on manually estimating the distance to the rear of the preceding vehicle from a top view perspective of the Lidar points.

  • As the graph in Figure 1 shows, calculating the TTC using lidar gives good results.
  • There is a clear trend: the TTC decreases over time as the ego car gets closer to the preceding vehicle.
  • The lidar TTC estimation uses the median x value of the lidar points in each frame, which reduces the influence of outliers. This makes the TTC estimation more robust and avoids severe errors.
  • The estimate is not perfect, however, as there are some 'jumps' in TTC between frames.
  • This may be due to noise in the lidar data, which affects the spread and consistency of the point cloud.
  • In particular, the edges along the horizontal axis have large spreads in some frames, which can throw the TTC estimate way off. This may be due to reflections from the car's rear lights.
  • See Figures 2-4 for inconsistencies in the captured lidar data between frames.
  • The figures show a top-down view of the back of the preceding vehicle as measured by the lidar sensor.

Figure 1 Graph of TTC lidar

Figure 2 Large spread at edges in y axis

Figure 3 Tight spread overall

Figure 4 Wide spread of points overall and a few outliers

FP.6 Performance Evaluation 2

Run several detector / descriptor combinations and look at the differences in TTC estimation. Find out which methods perform best and also include several examples where camera-based TTC estimation is way off. As with Lidar, describe your observations again and also look into potential reasons. All detector / descriptor combinations implemented in previous chapters have been compared with regard to the TTC estimate on a frame-by-frame basis. To facilitate comparison, a spreadsheet and graph should be used to represent the different TTCs.

  • Figure 5 compares the TTC estimates from lidar and camera.
  • As with lidar, the camera-based TTC estimation does a good job: there is a clear trend as the ego vehicle gets closer to the preceding vehicle.
  • The detector type is the biggest factor in the performance of the TTC calculation.
  • Figure 6 shows the camera TTC for all detector / descriptor combinations.
  • The following combinations are not displayed to make the graph clearer: HARRIS/BRISK, HARRIS/BRIEF, HARRIS/FREAK, HARRIS/SIFT, ORB/BRISK, ORB/BRIEF, ORB/ORB, ORB/FREAK, ORB/SIFT.
  • They have been removed because in some instances they fail to calculate the TTC at all (nan) and also produce severe outliers with TTC estimates of ~500 s.
  • It's clear that the HARRIS and ORB detectors fail or perform poorly because of the small number of keypoints found, which leads to few reliable matches between frames.
  • SHITOMASI, BRISK, AKAZE and SIFT based detectors produce good results, but the TTC can be unstable in some frames. They also have the disadvantage of being computationally expensive.
  • A FAST based detector seems to perform the best. This is due to the number of keypoints found, which was also evident in the mid-term project.
  • The TTC estimates using a FAST based detector are shown in Figure 7.
  • In general there is a relationship between the number of keypoints found and performance. There could therefore be scenarios where the scene in the image does not present many keypoints (e.g. low light, night time), in which case estimating TTC using the camera may fail even with a FAST based detector.

Figure 5 TTC comparison of Lidar and camera

Figure 6 TTC using camera for all detector / descriptor combinations

Figure 7 FAST based detector to estimate TTC
