

worldveil commented on July 23, 2024

Could you please share the modifications you made to the script so that I or others can run it? That way the issue is completely reproducible.

from dejavu.

balnagy commented on July 23, 2024

Sure, I will try and add more instructions.

  1. I modified this line: https://github.com/worldveil/dejavu/blob/master/test_dejavu.sh#L11 to
python dejavu.py fingerprint ./mp3/ wav
  2. I removed all the other mp3 files from the mp3 directory, so only my wav file stayed there.
  3. I created a clean database, named dejavu_test2, in the local MySQL.
  4. I created a dejavu.cnf config file like this:
{
    "database": {
        "host": "127.0.0.1",
        "user": "root",
        "passwd": "",
        "db": "dejavu_test2"
    }
}
  5. Then I ran ./test_dejavu.sh

Results

  • Confidence_*sec.png has only 0 values
  • matching_perc_*sec.png has only invalid values

Thanks!


worldveil commented on July 23, 2024

@balnagy ah I see. There is no problem with Dejavu, but the testing framework has a bug where if the name of your track on disk includes an underscore (_), then the match will always be invalid because the strings compared will be different.

If you look in the results/dejavu-tests.log generated by the testing suite (as you should!), you'll see it is predicting the correct song, but the expected track name (song) is missing the part after the underscore (_JM):

file: Taylor Swift - Shake It Off-nfWlot6h_JM_69_3sec.wav
song: Taylor Swift - Shake It Off-nfWlot6h
song_result: Taylor Swift - Shake It Off-nfWlot6h_JM
invalid match

But if I extract a random 3-second segment from the Taylor Swift track to mytest.wav and use the command line tool, everything works fine:

$ python dejavu.py recognize file mytest.wav 
{'match_time': 0.3390800952911377, 'song_id': 1, 'confidence': 1326, 'song_name': 'Taylor Swift - Shake It Off-nfWlot6h_JM', 'offset': 646L}

The problem is here. I didn't originally write the testing suite, but had a contributor kind enough to make it. It needs some love, though. Long term, this obviously needs to be fixed.

In the short term, I might instead use 3-4 underscores (____) as a separator, which is even more hackish (ugh) but works as a temporary fix. In the even shorter term, you could rename the file, changing nfWlot6h_JM to nfWlot6hJM by removing the underscore.
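
As an illustration of the bug described above, here is a minimal sketch, not the actual test suite code. It assumes the generated test files follow the pattern <track name>_<start second>_<length>sec.<ext>, as the log lines suggest. Both function names are hypothetical. Splitting from the right keeps any underscores inside the track name intact:

```python
import os


def naive_song_name(filename):
    """Hypothetical parser that reproduces the symptom: cut at the first underscore."""
    stem, _ext = os.path.splitext(filename)
    return stem.split("_")[0]


def parse_test_filename(filename):
    """Hypothetical parser: split from the right so the track name may contain underscores.

    Assumes the pattern <track name>_<start second>_<length>sec.<ext>.
    """
    stem, _ext = os.path.splitext(filename)
    name, start, length = stem.rsplit("_", 2)
    if not length.endswith("sec"):
        raise ValueError("unexpected test file name: %r" % filename)
    return name, int(start), int(length[:-3])


test_file = "Taylor Swift - Shake It Off-nfWlot6h_JM_69_3sec.wav"

# The naive parser drops "_JM", so the comparison with the recognized name fails.
print(naive_song_name(test_file))

# Splitting from the right recovers the full track name, start second and clip length.
print(parse_test_filename(test_file))
```

With this approach the expected name would equal the song_result from the log, so renaming the file would not be needed.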


balnagy commented on July 23, 2024

@worldveil, wow, thanks. I have now repeated the test after renaming the file to 1.wav, and I get higher confidence (40-700), but the offset still doesn't match. I could imagine that the song is repetitive, but none of the 5 samples matches, which I think is very unlikely.

DEBUG:root:--------------------------------------------------
DEBUG:root:file: 1_170_5sec.wav
DEBUG:root:song: 1
DEBUG:root:song_result: 1
DEBUG:root:correct match
DEBUG:root:query duration: 0.599
DEBUG:root:confidence: 146
DEBUG:root:song start_time: 170
DEBUG:root:result start time: 94.0
DEBUG:root:inaccurate match
DEBUG:root:--------------------------------------------------

Song: https://www.youtube.com/watch?v=nfWlot6h_JM&t=170s
Result: https://www.youtube.com/watch?v=nfWlot6h_JM&t=94s


worldveil commented on July 23, 2024

@balnagy, apologies, I haven't had much time to look into these issues lately. Any progress or thoughts?


balnagy commented on July 23, 2024

It's very hard to debug such a problem, so I gave up and implemented my own algorithm just to find the offset, since I know the song. And it's kind of weird, since if the offset is not precise, then it makes the whole result questionable.

Where would you start debugging?


worldveil commented on July 23, 2024

Not necessarily. The way music is produced now, many of the sounds are direct copies or looped clips, meaning that it might actually be legitimately ambiguous as to which loop the fingerprints matched to.

I would start with the test case script and ensure the algorithm is actually messing up and not just the test suite. The test suite was contributed by someone else and, while I applaud the effort, it leaves some room for improvement.

If that isn't it, you might just need to tweak the parameters of the hashing to ensure better offset matching.
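
To make the point about looped clips concrete, here is a toy sketch. It is a simplified model and not Dejavu's actual code: each time step carries a single made-up hash, and every matching hash votes for the difference between its position in the song and its position in the sample. The function name and the data are hypothetical. When a sample comes from a section that repeats, two offsets collect the same number of votes:

```python
from collections import Counter


def candidate_offsets(song_hashes, sample_hashes):
    """Count how many hashes agree on each (song position - sample position)."""
    positions = {}
    for t_song, h in enumerate(song_hashes):
        positions.setdefault(h, []).append(t_song)

    votes = Counter()
    for t_sample, h in enumerate(sample_hashes):
        for t_song in positions.get(h, []):
            votes[t_song - t_sample] += 1
    return votes


# Toy song: an intro, the same four-step loop played twice, then an outro.
loop = ["a", "b", "c", "d"]
song = ["intro"] + loop + loop + ["outro"]

# The sample is cut from the second pass of the loop (song positions 5 to 7).
sample = song[5:8]

votes = candidate_offsets(song, sample)
print(votes.most_common())  # offsets 1 and 5 each receive 3 votes
```

In this model the true offset (5) and the earlier copy of the loop (1) are tied, so either one could be reported. Real fingerprints are less clean than this, but the same kind of tie could explain an offset that points at an earlier repetition of the same material.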

