
Song matching with wrong artists (redlist, open issue, 3 comments)

laharah commented on June 16, 2024
Song matching with wrong artists

from redlist.

Comments (3)

Laharah commented on June 16, 2024

I've come across this issue before. Redlist uses beets' track distance logic to determine matches. To allow matches that come from different albums or sources (e.g. your Spotify playlist has "Tiny Dancer" from "Madman Across the Water" but you have the radio edit of "Tiny Dancer" from a "Best of Elton John" album), I had to tune the maximum accepted distance to something fairly liberal and use fairly few fields, so as to minimize unnecessary torrent downloading. I found 0.3 to be a happy medium, but beets uses its own weights to score how important certain differences are. In beets, track title and length are more strongly weighted than artist or album.

Altogether, that means that if you have a track with the same title and a very similar length, it can outweigh the fact that the artist and album are fairly different, pushing the distance score to just below 0.3. Unfortunately, there's a lot of nuance that goes into artist matching that simple fuzzy/Levenshtein distance won't cover (e.g. stripping "feat" tags, scoring differences further down the string lower, "the" being scored lower so that "The Doors" and "Doors" are closer than "Hat Doors", etc.).
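To make the effect above concrete, here is a minimal sketch of a weighted, normalized distance. It is a simplification for illustration, not beets' or Redlist's actual code: the function names are hypothetical, the title and length weights are assumed values, and only the track_artist weight of 2.0 and the 0.3 threshold come from this thread. Each penalty is assumed to lie between 0 (identical) and 1 (completely different).

```python
# Illustrative weights. track_artist = 2.0 is the default quoted in this thread;
# the title and length values are assumptions made for the example.
WEIGHTS = {"track_title": 3.0, "track_length": 2.0, "track_artist": 2.0}


def weighted_distance(penalties, weights=WEIGHTS):
    """Return sum(weight * penalty) / sum(weight) over the scored fields."""
    total = sum(weights[field] for field in penalties)
    if total == 0:
        return 0.0
    raw = sum(weights[field] * penalty for field, penalty in penalties.items())
    return raw / total


def is_match(penalties, threshold=0.3, weights=WEIGHTS):
    """Accept a candidate when its distance does not exceed the threshold."""
    return weighted_distance(penalties, weights) <= threshold


# Same title, same length, completely different artist.
same_song_other_artist = {
    "track_title": 0.0,
    "track_length": 0.0,
    "track_artist": 1.0,
}

print(round(weighted_distance(same_song_other_artist), 3))  # 0.286
print(is_match(same_song_other_artist))                     # True
print(is_match(same_song_other_artist, threshold=0.25))     # False
```

Under these assumed weights, a perfect title and length match scores 2/7, which is just under 0.3 even when the artist penalty is at its maximum.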

So there are really 2 ways to limit this:

1. Adjusting the match threshold

I've added a beets_match_threshold option to the config so that you can override the 0.3 value I've tuned it to. If you lower it to 0.2 or 0.25 it'll be stricter and you'll see fewer false matches, but you're also likely to miss some more valid matches.

2. Upping the Artist weight in beets

You can also adjust how heavily beets weights the artist value. Adding this to your beets config.yaml file will raise it from the default of 2.0 to 3.0.

match:
    distance_weights:
        track_artist: 3.0

You can see all the default weights here for reference, but you should know that Redlist only uses track_length, track_title, and track_artist to calculate the matching distance, and only adds album if the --restrict-album option is set.
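As a rough way to compare the two options, the sketch below asks how large the artist penalty must be before a track with a perfect title and length match is rejected. It assumes the distance is a weighted average of per-field penalties in the range 0 to 1; the helper name is hypothetical and the title and length weights are assumed values, so treat the numbers as illustrative only.

```python
def min_artist_penalty_to_reject(weights, threshold):
    """Artist penalty above which the distance exceeds the threshold,
    assuming the title and length penalties are both zero.

    Derived from: weights["track_artist"] * p / sum(weights) > threshold.
    """
    total = sum(weights.values())
    return threshold * total / weights["track_artist"]


# track_artist = 2.0 is the default quoted above; the other two are assumptions.
base = {"track_title": 3.0, "track_length": 2.0, "track_artist": 2.0}
raised = dict(base, track_artist=3.0)

scenarios = [
    ("threshold 0.3, artist weight 2.0", base, 0.3),
    ("threshold 0.3, artist weight 3.0", raised, 0.3),
    ("threshold 0.25, artist weight 2.0", base, 0.25),
]

for label, weights, threshold in scenarios:
    needed = min_artist_penalty_to_reject(weights, threshold)
    if needed >= 1.0:
        verdict = "never rejected on artist alone"
    else:
        verdict = f"rejected when artist penalty exceeds {needed:.3f}"
    print(f"{label}: {verdict}")
```

With these assumed weights, either change is enough to let a badly mismatched artist reject the candidate, which the first scenario cannot do.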

I'll leave this issue open in case you or anyone else has other ideas on how to limit the problem, since I agree it is annoying. I suppose I could add an additional penalty for artist mismatch; I'll have to think some more about it.


Laharah commented on June 16, 2024

After thinking about it some more and running some tests, I think the best thing to do is simply to weight track_artist more heavily in the distance calculation. I've configured redlist to temporarily patch the weight to a higher number when doing its matching.
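One general way to patch a weight temporarily is a context manager that restores the old value on exit, even if matching raises. The sketch below works on a plain dict and is not Redlist's actual implementation: the helper name is hypothetical, and the value 4.0 is a placeholder since the thread does not say which number is used.

```python
from contextlib import contextmanager

_MISSING = object()  # sentinel for "field was not present before patching"


@contextmanager
def patched_weight(weights, field, value):
    """Temporarily set weights[field] to value, restoring the old state on exit."""
    old = weights.get(field, _MISSING)
    weights[field] = value
    try:
        yield weights
    finally:
        if old is _MISSING:
            del weights[field]
        else:
            weights[field] = old


# Assumed example weights; only track_artist = 2.0 is quoted in this thread.
weights = {"track_title": 3.0, "track_length": 2.0, "track_artist": 2.0}

with patched_weight(weights, "track_artist", 4.0):  # 4.0 is a placeholder value
    print(weights["track_artist"])  # 4.0 while matching runs

print(weights["track_artist"])  # 2.0, the original value is back
```

Restoring in a finally block means the user's own beets settings are left untouched once matching is done.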

I think that since there are on average only 1-2 missed matches on a 100-song playlist, the trade-off of an extra 2-3 missed matches is worth it to reduce some of the more blatant false positives.

If you try out the new build, let me know how the new weights do on your playlists. I'm still trying to tune them.


lbesnard commented on June 16, 2024

Sorry for the slow reply, I was away for the last week! Cheers for looking into this. Your explanation makes heaps more sense, and yeah, an fzf approach is probably not enough. I'm trying the code with your latest changes and it seems to be way better tuned. I have yet to come across a death metal tune in my chillout playlist :D!
I'll keep on testing this, especially with some Spotify playlists which are always adding very recent tunes, since these will most likely not be in my beets library.

