
kkduncan / saliencydetection


Saliency detection in images and videos.

C++ 11.15% C 57.95% Shell 22.05% SAS 0.44% Smalltalk 0.08% Assembly 0.26% Makefile 4.87% QMake 0.01% M4 0.34% HTML 0.24% Module Management System 0.43% CMake 0.05% Roff 2.13%


saliencydetection's Issues

minor issues

  1. typo in CannyEdgeDetector.h

`#ifndef CANNYEDGEDETECTOR_H_
#define CANNEDGEDETECTOR_H_`

  2. In CannyEdgeDetector.cpp the data flow of smoothedImg seems to have no effect:

`/*
 * Perform Gaussian smoothing
 */
smoothImage(src, smoothedImg);

/*
 * Compute the first derivative in the x and y directions
 */
calculateDerivatives(src, deltaX, deltaY);`

In other words, you create smoothedImg but calculate the derivatives on src, so the smoothing result is never used.

  3. Is there a spurious edge effect on the right and bottom of SaliencyTestOutput.jpg?

Also, I didn't fork but derived another repository, https://github.com/bootchk/saliencyLibrary.git, where I have made some trivial changes to clean up the code (while I try to understand it and, hopefully, wrap it in a GIMP plugin).

I started here when I saw you forked resynthesizer. It is curious that repeated applications of resynthesizer (from an image to itself) tend to obliterate unique, man-made (salient?) features, whereas your algorithm seeks to find those features.

What I am hoping to do as a plugin is image summarization or autocrop, by thresholding the saliency map and then cropping the original to the bounding box of that.
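A minimal sketch of that threshold-then-crop step, assuming the saliency map is available as a row-major array of floats. The names `BoundingBox` and `salientBoundingBox` are hypothetical and not part of the repository or of GIMP:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Axis-aligned box in pixel coordinates; right and bottom are inclusive.
// `found` is false when no pixel exceeded the threshold.
struct BoundingBox {
    bool found;
    std::size_t left, top, right, bottom;
};

// Returns the smallest box containing every pixel whose saliency is strictly
// greater than `threshold`. `saliency` is assumed to be row-major with
// width * height entries.
BoundingBox salientBoundingBox(const std::vector<float>& saliency,
                               std::size_t width, std::size_t height,
                               float threshold) {
    assert(saliency.size() == width * height);
    BoundingBox box = {false, 0, 0, 0, 0};
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            if (saliency[y * width + x] <= threshold) {
                continue;  // below threshold: not salient
            }
            if (!box.found) {
                // First salient pixel: rows are scanned top-down, so this
                // row is the top edge of the box.
                box.found = true;
                box.left = box.right = x;
                box.top = box.bottom = y;
            } else {
                if (x < box.left) box.left = x;
                if (x > box.right) box.right = x;
                box.bottom = y;  // y never decreases during the scan
            }
        }
    }
    return box;
}
```

Cropping the original image to the returned box would give the summary; when `found` is false the image is best left uncropped.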

I don't see that this code operates at all scales (Gaussian pyramid) as you say in the paper, but possibly I just misunderstand.

purpose of updateEntropy

The updatePixelEntropy() function is called every 32 iterations. It stores a result in the densityEstimate, but the result does not seem to be a function of prior results. Also, after iteration stops, there have likely been additional samples (taken after the last call to updateEntropy) that have not contributed to the estimate.

I removed the periodic call to this function and made entropy a method, KernelDensityInfo.entropy(), instead of a data field; it is called only once, when creating the saliency map. That doesn't seem to affect the results, for better or worse, so this is not much of an issue.
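The change described above can be sketched as follows. This is only an illustration: the struct is a hypothetical stand-in for the repository's KernelDensityInfo, and both the field names and the entropy formula (negative log of the mean kernel value) are assumptions, not the paper's equation 5.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for KernelDensityInfo; fields and formula are
// assumptions made for illustration only.
struct KernelDensityInfo {
    double kernelSum = 0.0;  // running sum of kernel values
    int sampleCount = 0;     // number of samples accumulated so far

    // Called once per sample; only the accumulators change.
    void addSample(double kernelValue) {
        assert(kernelValue >= 0.0);
        kernelSum += kernelValue;
        ++sampleCount;
    }

    // Computed on demand from the accumulators, so every sample taken so
    // far contributes, including those after the last periodic update.
    double entropy() const {
        if (sampleCount == 0 || kernelSum <= 0.0) {
            return 0.0;  // no information yet
        }
        const double density = kernelSum / sampleCount;
        return -std::log(density);
    }
};
```

If the stored value really is a pure function of the accumulators, as it is in this sketch, then calling entropy() once at the end gives the same or a slightly more complete result than caching it every 32 iterations.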

But just for my understanding, is the purpose to be found in equation 5 of your paper? My best guess is that the top equation of 5 should be computed every iteration by updatePixelEntropy(), and that computing it every 32 iterations is a heuristic simplification of the ideal. But the data flow doesn't seem to support that purpose.

channel-wise saliency

An enhancement.

My initial thought was that retaining color channels throughout would give better results. In my current code I calculate an orientation difference for each of the R, G, and B channels, each in the range [0, pi], but then sum them into one angle difference (theta) in the range [0, 3pi], still using a kernelSum of K(d, theta). That seems ineffective, e.g. for a yellow highway stripe.

So I will try again, calculating a kernel sum of K(d, thetaR, thetaG, thetaB). Do you have an intuition about that?
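To make the difference between the two kernels concrete, here is a minimal sketch. It assumes a separable Gaussian kernel and illustrative bandwidths; the repository's actual kernel may differ, and all function names here are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Unnormalized Gaussian factor; sigma is an illustrative bandwidth.
inline double gaussian(double x, double sigma) {
    return std::exp(-(x * x) / (2.0 * sigma * sigma));
}

// K(d, theta) with theta = thetaR + thetaG + thetaB, in [0, 3*pi].
// Different channel patterns with the same total are indistinguishable.
double kernelSummedTheta(double d, double thetaR, double thetaG, double thetaB,
                         double sigmaD, double sigmaTheta) {
    const double theta = thetaR + thetaG + thetaB;
    return gaussian(d, sigmaD) * gaussian(theta, sigmaTheta);
}

// K(d, thetaR, thetaG, thetaB) as a product of one factor per channel, so a
// large difference in a single channel is not averaged away.
double kernelPerChannel(double d, double thetaR, double thetaG, double thetaB,
                        double sigmaD, double sigmaTheta) {
    return gaussian(d, sigmaD) * gaussian(thetaR, sigmaTheta) *
           gaussian(thetaG, sigmaTheta) * gaussian(thetaB, sigmaTheta);
}
```

With d = 0 and both bandwidths set to 1, the angle triples (0.9, 0, 0) and (0.3, 0.3, 0.3) give essentially the same value under kernelSummedTheta but clearly different values under kernelPerChannel, which is the kind of single-channel contrast a yellow stripe might produce.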

Consider depth-enabled photography, where the image has additional channels: infrared (IR) and laser-ranging depth, e.g. from the Intel RealSense cameras. In many cases, one channel might have the most readily available object (saliency) information. Without any a priori knowledge of which channel that is, shouldn't a general-purpose algorithm (for naive GIMP users) work channel-wise?

difference of gradient orientations

Using the minus operator for float types holding gradient orientations (angular coordinate of polar coordinates) in units of radians yields either the interior (small) or exterior (large) subtended angle, depending on the order of the operands. Instead, using the function atan2(sin(x-y), cos(x-y)) gives the smallest angle, and seems to give crisper saliency results.
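A small sketch of that function, using only the C++ standard library. The helper names are hypothetical; atan2 returns a signed result, so the magnitude is taken when an unsigned difference is wanted:

```cpp
#include <cassert>
#include <cmath>

// Signed smallest difference between two angles given in radians.
// The result lies in [-pi, pi] regardless of operand order or wrap-around.
inline double angleDifference(double x, double y) {
    return std::atan2(std::sin(x - y), std::cos(x - y));
}

// Unsigned smallest angle between two orientations, in [0, pi].
inline double angleDistance(double x, double y) {
    return std::fabs(angleDifference(x, y));
}
```

For example, orientations of 0.1 and 2*pi - 0.1 radians differ by about 6.08 under plain subtraction, but angleDistance reports 0.2, the interior angle.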

This issue might have crept in during your translation from legacy code to OpenCV and polar coordinates. I just stumbled upon this as I was working on color gradients.

I am surprised OpenCV does not have an Angle class with this method.

http://stackoverflow.com/questions/1878907/the-smallest-difference-between-2-angles

thresholding preprocess

I found that thresholding before computing saliency gives results more like I expected.

For example, a yellow highway stripe (that I mentioned earlier) now becomes salient. Another example is image 'i2.jpg' from the MIT saliency database (an image of a postage stamp on a letter), where the postage stamp now becomes salient.

I also removed the smoothing preprocess, without much change in the result. As an architectural issue, I think any preprocess should be in the API like postProcessSaliencyMap() is, making it optional.

It's non-intuitive to me that thresholding should give better(?) results. Why should discarding information up front give better results? (Not discarding information is why I attempted to retain color channels in the input and throughout the calculations, but that was ineffective. More on that to follow in another issue.)

If you threshold, then many values become discrete. Then you could speed it up by eliminating floating-point operations and using table lookups for transcendental functions? Premature to try that, I suppose.
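As a rough sketch of the table-lookup idea: assuming the image is thresholded to 0/1 and derivatives come from a 3x3 Sobel operator (both assumptions, not confirmed for this code), each derivative is an integer in [-4, 4], so every possible gradient orientation fits in an 81-entry table. The class name is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Assumption: a 3x3 Sobel operator on a 0/1 image yields integers in [-4, 4].
constexpr int kMaxDerivative = 4;
constexpr int kTableSize = 2 * kMaxDerivative + 1;

// Precomputes atan2(dy, dx) for every possible integer derivative pair.
class OrientationTable {
public:
    OrientationTable() : table_{} {
        for (int dy = -kMaxDerivative; dy <= kMaxDerivative; ++dy) {
            for (int dx = -kMaxDerivative; dx <= kMaxDerivative; ++dx) {
                table_[index(dx, dy)] = static_cast<float>(
                    std::atan2(static_cast<double>(dy), static_cast<double>(dx)));
            }
        }
    }

    // Gradient orientation in radians, read from the table instead of
    // calling atan2 per pixel.
    float orientation(int dx, int dy) const {
        assert(dx >= -kMaxDerivative && dx <= kMaxDerivative);
        assert(dy >= -kMaxDerivative && dy <= kMaxDerivative);
        return table_[index(dx, dy)];
    }

private:
    static int index(int dx, int dy) {
        return (dy + kMaxDerivative) * kTableSize + (dx + kMaxDerivative);
    }

    std::array<float, kTableSize * kTableSize> table_;
};
```

Whether this is faster than calling atan2 directly would need measuring; the table is built once and then shared across all pixels.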
