This project implements an object detection algorithm in two parts, training and testing, built on the following components:
- Generalized Hough Transform (GHT)
- Visual Vocabulary
- Euclidean distance (sum of squared differences, SSD)
- Given a set of cropped training images, prepare a visual vocabulary and the GHT translation vectors.
- Use a Harris corner detector to collect interest points.
- At each interest point, extract a fixed-size image patch and use the vector of raw pixel intensities as the descriptor.
- Cluster the patches with K-means; the cluster centers form the "visual vocabulary".
- Go back to the training images and assign each patch to the visual word at the smallest Euclidean distance.
- For each visual word, record the displacement vectors from its patch positions to the object center.
- Detect interest points in the test image by corner detection.
- At each interest point, extract a fixed-size image patch and use its raw pixel intensities as the descriptor.
- Assign each patch to its closest visual word.
- Let each visual word occurrence vote for the position of the object using the stored displacement vectors.
- After all votes are cast, analyze the votes and predict where the object occurs.
- Compute the accuracy of the predictions.
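
The word assignment, displacement recording, and voting steps above can be sketched as follows. This is a minimal illustration and not the project's actual code: the function names (`ssd`, `nearest_word`, `record_displacements`, `vote_for_center`) and the toy descriptors are hypothetical, and plain Python lists stand in for the NumPy arrays a real implementation would likely use.

```python
from collections import Counter, defaultdict

def ssd(a, b):
    """Sum of squared differences between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_word(descriptor, vocabulary):
    """Index of the visual word with the smallest SSD to the descriptor."""
    return min(range(len(vocabulary)), key=lambda k: ssd(descriptor, vocabulary[k]))

def record_displacements(samples, vocabulary):
    """Training: map each word index to the displacements it was observed with.

    samples: iterable of (descriptor, (px, py), (cx, cy)), where (px, py) is the
    patch position and (cx, cy) is the object center in the same image.
    """
    table = defaultdict(list)
    for descriptor, (px, py), (cx, cy) in samples:
        table[nearest_word(descriptor, vocabulary)].append((cx - px, cy - py))
    return table

def vote_for_center(patches, vocabulary, table):
    """Testing: each patch votes for candidate centers; return the top candidate."""
    votes = Counter()
    for descriptor, (px, py) in patches:
        for dx, dy in table.get(nearest_word(descriptor, vocabulary), []):
            votes[(px + dx, py + dy)] += 1
    return votes.most_common(1)[0] if votes else None

# Toy data (made up): two visual words, two training patches, three test patches.
vocabulary = [[0, 0, 0, 0], [10, 10, 10, 10]]
training = [
    ([1, 0, 0, 1], (20, 20), (50, 20)),     # left part of the object
    ([9, 10, 11, 10], (80, 20), (50, 20)),  # right part of the object
]
table = record_displacements(training, vocabulary)

test_patches = [
    ([0, 1, 0, 0], (120, 60)),
    ([10, 9, 10, 10], (180, 60)),
    ([10, 10, 10, 10], (300, 10)),  # stray patch, votes elsewhere
]
best = vote_for_center(test_patches, vocabulary, table)
print(best)  # ((150, 60), 2)
```

Votes here are counted per exact pixel. With real images, votes for the same object rarely land on an identical pixel, so a practical version would probably accumulate them in coarser bins or smooth the vote map before picking the peak.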
550 cropped training images of cars, each 40 × 100 pixels
100 test images
A list of locations: topLeftLocs = [x1, y1; x2, y2; ...; xn, yn]
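
One possible way to score predictions against `topLeftLocs` is sketched below. The source does not state the accuracy criterion, so everything here is an assumption: a prediction counts as correct when its box overlaps the ground-truth box with an intersection over union of at least 0.5, boxes are 40 rows by 100 columns like the training crops, and there is one ground-truth location per prediction, aligned by index. The names `box_iou` and `detection_accuracy` are hypothetical.

```python
BOX_H, BOX_W = 40, 100  # assumed: 40 rows by 100 columns, as in the training crops

def box_iou(top_left_a, top_left_b, w=BOX_W, h=BOX_H):
    """Intersection over union of two equally sized, axis-aligned boxes."""
    (ax, ay), (bx, by) = top_left_a, top_left_b
    overlap_w = max(0, w - abs(ax - bx))
    overlap_h = max(0, h - abs(ay - by))
    inter = overlap_w * overlap_h
    return inter / (2 * w * h - inter)

def detection_accuracy(predicted, top_left_locs, threshold=0.5):
    """Fraction of predictions whose box matches the ground-truth box.

    predicted and top_left_locs are equal-length sequences of (x, y)
    top-left corners, paired by index (an assumption about the data layout).
    """
    hits = sum(
        box_iou(p, gt) >= threshold for p, gt in zip(predicted, top_left_locs)
    )
    return hits / len(top_left_locs)

# Made-up example: the first prediction is close, the second is far off.
predicted = [(10, 10), (200, 50)]
top_left_locs = [(12, 11), (60, 50)]
print(detection_accuracy(predicted, top_left_locs))  # 0.5
```

If the voting step predicts an object center rather than a top-left corner, the center would first need to be shifted by half the box size, for example by subtracting (50, 20) under the box dimensions assumed here.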
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import filters
import math
from sklearn.cluster import KMeans
from scipy.io import loadmat