Comments (9)

mxndrwgrdnr commented on July 20, 2024

I updated the scoring metric to count each unmatched segment as a miss, even if it's been missed already (e.g. if the route loops around the block). The results look similar to previous findings but more closely mimic the phenomenon identified in the Newson and Krumm paper, where higher sampling rates produce poorer results at high levels of noise. This trend inverts around 40-60 m of noise.
[figure: score vs. noise by sample rate]
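
A minimal sketch of that scoring rule, assuming the true route is a sequence of segment IDs (with repeats for loops) and the match is a collection of segment IDs; the function and input names are hypothetical, not the repo's actual code:

```python
def segment_match_score(true_segments, matched_segments):
    """Fraction of the true segment sequence that was matched.

    Each unmatched occurrence counts as a miss, even if the same
    segment was already missed (e.g. a route that loops around the
    block traverses the same segment twice).
    """
    matched = set(matched_segments)
    misses = sum(1 for seg in true_segments if seg not in matched)
    return 1 - misses / len(true_segments)
```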

kpwebb commented on July 20, 2024

@mxndrwgrdnr this looks great.

How are things looking re: measuring in terms of matched distance, rather than just segments?

I'm increasingly convinced that the failure case we're looking for is when a mostly good GPS trace falls apart temporarily (signal loss, etc.) and the match jumps way off. Does the matched distance spike in those cases as the matcher tries to find a realistic route?

This raises two things about metrics: 1) the distance of the matched trace matters, and 2) are there ways to perturb GPS traces that don't just mess up the whole trace, but rather degrade them periodically? (We'd need to think about what GPS failure modes look like, but it may be possible to get an idea from real-world traces; see the sketch below.)
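
One way to simulate that kind of periodic degradation, as a rough sketch; the burst probability, length, and noise magnitude are made-up placeholders, not tuned values:

```python
import numpy as np

def degrade_trace(coords, burst_prob=0.05, burst_len=5, burst_noise_m=100.0):
    """Leave most of a trace intact, but occasionally inject a short
    "signal loss" burst of heavily noised points.

    coords: (n, 2) array of projected (x, y) positions in meters.
    """
    coords = np.asarray(coords, dtype=float).copy()
    i = 0
    while i < len(coords):
        if np.random.rand() < burst_prob:
            # noise every point in the burst, then skip past it
            end = min(i + burst_len, len(coords))
            coords[i:end] += np.random.normal(0.0, burst_noise_m, (end - i, 2))
            i = end
        else:
            i += 1
    return coords
```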

mxndrwgrdnr commented on July 20, 2024

@kpwebb Distance-traveled and, relatedly, speed comparisons are still on the "to-do" list. I'm holding off until we've actually tuned the map-matching HMM using the segment-match-based metric, which should happen at some point next week. Also worth keeping in mind that the speed- and distance-based metrics will be significantly impacted by the inclusion of time in the HMM, which is still on the docket. Any distance/speed-based scoring generated now won't necessarily reflect the performance of the finished product, although it will give us a good idea of where we're starting from. In any event, I will have something for you to look at next week.

In the meantime I will keep thinking about the different failure modes of GPS, as I agree that's a good way of producing more realistic traces.

mxndrwgrdnr commented on July 20, 2024

Might need to pass reporter-generated segments back to Valhalla's trace_attributes endpoint in order to do the length/distance-traveled comparison. The code is already doing this for the sake of route visualizations, but I'm currently not saving the rest of the output, which we'd need in order to compare the relevant attributes.
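
For reference, a sketch of what that round trip might look like against a local Valhalla instance. The URL and helper name are assumptions; the request shape follows Valhalla's documented trace_attributes API:

```python
import requests

VALHALLA_URL = "http://localhost:8002/trace_attributes"  # assumed local instance

def matched_route_length_km(coords):
    """Send reporter-generated coordinates to trace_attributes and sum
    the matched edge lengths (reported in kilometers by default)."""
    payload = {
        "shape": [{"lat": lat, "lon": lon} for lat, lon in coords],
        "costing": "auto",
        "shape_match": "map_snap",
    }
    resp = requests.post(VALHALLA_URL, json=payload)
    resp.raise_for_status()
    return sum(edge["length"] for edge in resp.json().get("edges", []))
```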

mxndrwgrdnr commented on July 20, 2024

Implemented a distance-traveled-based scoring metric based on the method used in the Newson and Krumm paper:
[screenshot, 2017-06-27]
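
For readers without the screenshot: the Newson and Krumm measure is the length of road erroneously added to the matched route plus the length erroneously omitted, normalized by the length of the true route. A sketch with hypothetical inputs (dicts mapping edge ID to length in meters):

```python
def nk_distance_error(true_edges, matched_edges):
    """Newson & Krumm route-mismatch fraction: (d_minus + d_plus) / d_true."""
    d_true = sum(true_edges.values())
    # length in the true route but missing from the match (undermatch)
    d_minus = sum(l for e, l in true_edges.items() if e not in matched_edges)
    # length in the match but not in the true route (overmatch)
    d_plus = sum(l for e, l in matched_edges.items() if e not in true_edges)
    return (d_minus + d_plus) / d_true
```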

The results are a near mirror-image of the segment-based matching:
[figure: scores vs. noise by sample rate]

mxndrwgrdnr commented on July 20, 2024

All distance-based metrics:
[figure: match errors by sample rate]

The top row of plots comprises composite metrics of both under- and overmatches (i.e. false negatives and false positives). The left column shows count-based scores, and the right column distance-based ones. They all track each other nicely, at least in the test region (the San Francisco Bay Area).

One pattern that sticks out to me is that the "undermatches" appear to be more sensitive to sample rate at lower noise levels, while overmatches exhibit greater differentiation at higher levels of noise. Also of note: the inversion in match quality mentioned above, whereby higher sample rates produce worse matches, is more pronounced for undermatches (false negatives).
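
A sketch of the four quantities being plotted, under the same assumption of edge-ID-to-length dicts (illustrative only, not the repo's exact code):

```python
def match_error_metrics(true_edges, matched_edges):
    """Count- and distance-based under/overmatch rates plus composites."""
    t, m = set(true_edges), set(matched_edges)
    d_true = sum(true_edges.values())
    under_cnt = len(t - m) / len(t)    # false-negative rate by count
    over_cnt = len(m - t) / len(m)     # false-positive rate by count
    under_dist = sum(true_edges[e] for e in t - m) / d_true
    over_dist = sum(matched_edges[e] for e in m - t) / d_true
    return {
        "undermatch_count": under_cnt,
        "overmatch_count": over_cnt,
        "undermatch_dist": under_dist,
        "overmatch_dist": over_dist,
        "composite_count": under_cnt + over_cnt,
        "composite_dist": under_dist + over_dist,
    }
```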

mxndrwgrdnr commented on July 20, 2024

I've been exploring different metrics for speed-based matching and I think I've arrived at a useful result. The graph below shows two CDF curves, one for successfully matched segments (red) and one for incorrectly matched segments (blue), of the % error of GPS-derived speed relative to OSM speeds ((GPS speed - OSM speed) / OSM speed). The results suggest a definitive breakpoint for a threshold above which we would throw out the most erroneous matches while retaining the most correct ones.

In your post above, @kpwebb, you suggested 2x as a threshold, and the graph certainly supports the notion that any derived/measured/observed speed above 2x the OSM speed is going to be a true negative. However, it also suggests that we'd still be getting a ton of false positives at this threshold (about 70% of them). At least for this region, the SF Bay Area, we could drop that threshold down to 37% above the OSM speed, which would allow us to retain > 90% of our good matches while discarding 60% of the false positives. Even more conservative would be a threshold around 15%, which would retain almost 80% of true positives while rejecting over 70% of the false positives.

It's worth noting, too, that this plot represents results from simulated GPS data at all sample rates and all noise levels. The threshold could easily be customized depending on sample rate and expected positional accuracy. My next move will be to see how this threshold might vary along those lines, and to compare the trend across different regions.
[figure: cumulative frequency of speed error]
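
A sketch of how the retention/rejection trade-off at each candidate threshold could be computed (the array names and reporting format are assumptions):

```python
import numpy as np

def threshold_tradeoffs(gps_speeds, osm_speeds, correct_mask, thresholds):
    """For each threshold on % speed error, report how many correct
    matches would be retained and how many false positives rejected."""
    gps = np.asarray(gps_speeds, dtype=float)
    osm = np.asarray(osm_speeds, dtype=float)
    mask = np.asarray(correct_mask, dtype=bool)
    pct_err = (gps - osm) / osm
    good, bad = pct_err[mask], pct_err[~mask]
    for thr in thresholds:
        kept_good = np.mean(good <= thr)   # true positives retained
        dropped_bad = np.mean(bad > thr)   # false positives rejected
        print(f"thr={thr:.2f}: keep {kept_good:.0%} good, drop {dropped_bad:.0%} bad")

# e.g. the thresholds discussed above: 15%, 37%, and 2x (100% over)
# threshold_tradeoffs(gps_speeds, osm_speeds, correct_mask, [0.15, 0.37, 1.0])
```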

nvkelso commented on July 20, 2024

💯 great work!

mxndrwgrdnr commented on July 20, 2024

Distance-based QA metrics are available as a Python function here. Speed-based metrics are here. And the wrapper function that iterates over a number of routes and performs the calculations is here. Functions for generating the metric plots as seen in the validation notebook can be found here. The plots themselves are featured in sections 3 and 6 of the validation notebook.
