
python-musicbrainzngs's People

Contributors

alastair, doskir, dosoe, dufferzafar, freso, frewsxcv, galenhz, gward, horrendus, ianmcorvidae, ibmibmibm, itaybb, jonnyjd, laarmen, marineam, mineo, navap, paulbailey, rlhelinski, ruippeixotog, samdoshi, sampsyo, stefanor, timgates42

python-musicbrainzngs's Issues

automatic rate limiting

The server enforces a rate limit. Queries should be throttled automatically if they happen too quickly. There should be an option to turn this off for people who run a local server.
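A minimal sketch of what such throttling could look like, assuming a single minimum interval between requests. The `RateLimiter` class, its `min_interval` and `enabled` parameters, and the `wait()` method are hypothetical names for illustration, not part of the library:

```python
import threading
import time


class RateLimiter:
    """Enforce a minimum delay between calls (hypothetical helper)."""

    def __init__(self, min_interval=1.0, enabled=True):
        self.min_interval = min_interval
        self.enabled = enabled          # set to False for a local server
        self._lock = threading.Lock()
        self._last_call = None

    def wait(self):
        """Block until the next request is allowed; return seconds slept."""
        if not self.enabled:
            return 0.0
        with self._lock:
            slept = 0.0
            if self._last_call is not None:
                elapsed = time.monotonic() - self._last_call
                remaining = self.min_interval - elapsed
                if remaining > 0:
                    time.sleep(remaining)
                    slept = remaining
            self._last_call = time.monotonic()
            return slept
```

Calling `wait()` right before each request would be enough; constructing the limiter with `enabled=False` gives the opt-out for local servers.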

Catch socket errors that occur during read (not just open)

Currently, we use the _safe_open function to catch lots of errors when opening a URL and retry when necessary. For example, if we see a "connection reset by peer" error during URL open, it gets retried rather than propagated to the application.

However, errors like this can also occur during the data transfer, not just at opening. The call to message.read() inside mbxml.parse_message can raise socket errors that go unhandled.

The easy way to address this would be to move the read() call into _safe_open and make that function return a string instead of a file-like object.
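Roughly, the idea could be sketched like this. The `safe_read` function, its `opener` callable, and the retry parameters are stand-ins invented for the example; the real `_safe_open` may look quite different:

```python
import time


def safe_read(opener, max_retries=3, delay=0.0):
    """Open and read in one retried step, returning the body as bytes.

    `opener` is any zero-argument callable that returns a file-like
    object (hypothetical; it stands in for the URL-opening code).
    """
    last_error = None
    for _ in range(max(1, max_retries)):
        try:
            response = opener()
            try:
                # Errors raised here are retried just like open errors.
                return response.read()
            finally:
                response.close()
        except OSError as exc:  # covers socket errors on Python 3
            last_error = exc
            time.sleep(delay)
    raise last_error
```

Because the function returns the complete body, the XML parser never touches the network, so a mid-transfer failure cannot surface inside the parsing code.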

Handle request errors

It would be useful to have the library make communication errors with the server "nicer" by avoiding the exposure of lower-level exceptions. For example, XML parse errors should maybe be translated into something like MalformedMBResponseError exceptions; HTTP timeouts and such could be turned into ServerBusyErrors or the like. This would greatly reduce the headaches involved for clients trying to implement robust queries.

At the same time, perhaps the library should be responsible for retrying under certain conditions (e.g., after 502 errors)?
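One way the translation could look, as a sketch only. The exception names follow the suggestions above; the `MusicBrainzError` base class and the `call_with_translation` wrapper are assumptions made for the example:

```python
import socket
import xml.etree.ElementTree as ET


class MusicBrainzError(Exception):
    """Base class for the hypothetical library-level errors below."""


class MalformedMBResponseError(MusicBrainzError):
    """The server's reply could not be parsed as XML."""


class ServerBusyError(MusicBrainzError):
    """The server timed out or was otherwise unavailable."""


def call_with_translation(func, *args, **kwargs):
    """Run func and re-raise low-level errors as library errors."""
    try:
        return func(*args, **kwargs)
    except ET.ParseError as exc:
        raise MalformedMBResponseError(str(exc)) from exc
    except socket.timeout as exc:
        raise ServerBusyError(str(exc)) from exc
```

Clients would then only need to catch `MusicBrainzError`, while the original exception stays reachable through `__cause__` for debugging.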

user ratings, tags

Show the tags and ratings given by the authenticated user. It should be an error to ask for them if no login info has been given.

ext:score support?

I looked around for a way to get the score back from various searches but I don't see one.

It looks like adding "{http://musicbrainz.org/ns/ext#-2.0}score" to the list of attributes in, for example, parse_recording will give me the attribute score.

I messed around with trying to fix the attributes with namespace but I'm too dim to figure it out (as is being done with ws:recording, etc).

Could someone more versed in this maybe add support for the ext:score attributes? They are very handy. Or maybe there's already a way and I'm just not seeing it?

Thanks!
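For reference, a small sketch of how a namespaced attribute can be read with ElementTree. The sample XML is made up for illustration, and `recording_attributes` and `strip_namespace` are hypothetical helpers rather than the library's `parse_recording`:

```python
import xml.etree.ElementTree as ET

MMD_NS = "http://musicbrainz.org/ns/mmd-2.0#"
EXT_NS = "http://musicbrainz.org/ns/ext#-2.0"

# Illustrative search result, not a real server response.
SAMPLE = (
    '<metadata xmlns="http://musicbrainz.org/ns/mmd-2.0#" '
    'xmlns:ext="http://musicbrainz.org/ns/ext#-2.0">'
    '<recording-list><recording id="abc" ext:score="100">'
    '<title>Example</title></recording></recording-list></metadata>'
)


def strip_namespace(name):
    """Turn '{uri}local' into 'local'; leave plain names alone."""
    return name.split("}", 1)[1] if name.startswith("{") else name


def recording_attributes(xml_text):
    """Collect the attributes of every recording element as dicts."""
    root = ET.fromstring(xml_text)
    results = []
    for element in root.iter("{%s}recording" % MMD_NS):
        attrs = {}
        for key, value in element.attrib.items():
            if key == "{%s}score" % EXT_NS:
                attrs["ext:score"] = value  # keep the familiar prefix
            else:
                attrs[strip_namespace(key)] = value
        results.append(attrs)
    return results
```

ElementTree stores namespaced attributes under `{uri}name` keys, which is why the plain name `score` never matches.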

Change user agent

Clients should be able to set the user agent that gets sent with requests.

Cover art archive

We should support the new cover art archive API, either as part of pymb or as a separate library.

404 while fetching by ID should not be a simple ResponseError

Currently when issuing get_releases_by_discid I get a ResponseError for any of the status codes 400, 404 and 411.

However, I do think a 404 is quite distinct from other ResponseErrors.
When checking for a discid I want to know if the disc ID is not found on the server or if there really was a response error.

That should either be a None as a return value or an Exception distinguishable from ResponseError (without having to check the cause or message). It can be a derived class, though.
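The derived-class option might look like the sketch below. `ResponseError` is redefined here only so the example runs on its own; `NotFoundError` and `error_for_status` are hypothetical names:

```python
class ResponseError(Exception):
    """Stand-in for the library's ResponseError (redefined for the sketch)."""

    def __init__(self, status, message=""):
        super().__init__("HTTP %d %s" % (status, message))
        self.status = status


class NotFoundError(ResponseError):
    """Hypothetical subclass raised only for 404 responses."""


def error_for_status(status, message=""):
    """Pick the exception class that matches an HTTP error status."""
    if status == 404:
        return NotFoundError(status, message)
    return ResponseError(status, message)
```

Existing code that catches `ResponseError` keeps working, while callers who care about a missing disc ID can catch `NotFoundError` first.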

Encode Unicode search query terms

If Unicode arguments are passed to the search function, the library eventually dies in urllib.urlencode(), which only supports byte strings. This library should encode arguments (using UTF-8, like the old library) when building the request.
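A sketch of the encoding step, written for Python 3, where `urllib.parse.urlencode` takes the place of Python 2's `urllib.urlencode` and already accepts text. The explicit UTF-8 step mirrors what the Python 2 code would need; `encode_params` is a hypothetical helper:

```python
from urllib.parse import urlencode


def encode_params(params, encoding="utf-8"):
    """Encode text values to bytes, then build the query string."""
    encoded = {}
    for key, value in params.items():
        if isinstance(value, str):
            value = value.encode(encoding)
        encoded[key] = value
    # Sorting keeps the output stable regardless of dict order.
    return urlencode(sorted(encoded.items()))
```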

Add support for attributes for aliases

Background: An alias-list element, which includes multiple alias elements, is included on artist, label or work entities when they are requested with the aliases include. For example: http://musicbrainz.org/ws/2/artist/0e43fe9d-c472-4b62-be9e-55f971a023e1?inc=aliases

Currently alias-list elements are treated as a list of strings via parse_element_list. However, each alias element can have one of several attributes, from the schema:

    <define name="def_alias">
        <element name="alias">
            <optional>
                <attribute name="locale">
                    <ref name="def_iso-3166-2" />
                </attribute>
            </optional>
            <optional>
              <attribute name="sort-name">
                <text />
              </attribute>
            </optional>
            <optional>
              <attribute name="type">
                <text />
              </attribute>
            </optional>
            <optional>
              <attribute name="primary">
                <text />
              </attribute>
            </optional>
            <optional>
              <attribute name="begin-date">
                <ref name="def_incomplete-date"/>
              </attribute>
            </optional>
            <optional>
              <attribute name="end-date">
                <ref name="def_incomplete-date"/>
              </attribute>
            </optional>
            <text/>
        </element>
    </define>

In particular the locale and primary attributes are critical to being able to select the most appropriate alias for a given language.

I would like to change this by introducing a parse_alias_list function that returns a list of dictionaries and so follows the XML schema more closely. However, such a change will break any software that uses the existing alias-list implementation.
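A rough sketch of what the proposed function might return. Storing the alias text under the key `alias` is an assumption made for the example, and the sample XML omits the namespace for brevity:

```python
import xml.etree.ElementTree as ET


def parse_alias_list(alias_list):
    """Return one dict per alias: its attributes plus the alias text."""
    aliases = []
    for alias in alias_list:
        entry = dict(alias.attrib)          # locale, sort-name, primary, ...
        entry["alias"] = alias.text or ""   # key name is an assumption
        aliases.append(entry)
    return aliases


# Made-up sample data for illustration.
SAMPLE_ALIASES = ET.fromstring(
    '<alias-list>'
    '<alias locale="en" sort-name="Example, The" primary="primary">'
    'The Example</alias>'
    '<alias>Example</alias>'
    '</alias-list>'
)
```

With this shape, picking the primary alias for a locale becomes a simple filter over the list.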

Field access on returned data

It'd be neat to be able to access information in the returned data as fields as well as dictionary keys:

release.title

instead of (or in addition to)

release["title"]
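One possible way to get both styles is a dict subclass, sketched below. The `Record` class is hypothetical, and mapping underscores to hyphens (so that `sort_name` finds `sort-name`) is an assumption about how hyphenated keys could be exposed:

```python
class Record(dict):
    """Dict that also allows attribute-style access to its keys."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict
        # methods such as items() keep working.
        for key in (name, name.replace("_", "-")):
            if key in self:
                return _wrap(self[key])
        raise AttributeError(name)


def _wrap(value):
    """Wrap nested dicts (and dicts inside lists) as Record objects."""
    if isinstance(value, dict) and not isinstance(value, Record):
        return Record(value)
    if isinstance(value, list):
        return [_wrap(item) for item in value]
    return value
```

A key that collides with a dict method name (for example `items`) would still need the bracket form.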

life-span is parsed wrong

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://musicbrainz.org/ns/mmd-2.0#">
<artist type="Group" id="952a4205-023d-4235-897c-6fdb6f58dfaa">
<name>Dynamo Go</name><sort-name>Dynamo Go</sort-name>
<life-span><begin>2005-06</begin></life-span>
</artist></metadata>

becomes

{'artist': {'sort-name': 'Dynamo Go', 'type': 'Group', 'id': '952a4205-023d-4235-897c-6fdb6f58dfaa', 
'2005-06': '',
'name': 'Dynamo Go'}}

Please note the second line: the begin date has become a dictionary key, and the life-span element itself is missing.

artist credits

inc=artist-credits is valid for releases and recordings. The library should provide the XML-to-dict conversion as well as a way to combine the credit back together.

Avoid Bad Request when searching for "AND" or "OR"

The search_* functions' keyword arguments are escaped for inclusion in Lucene queries, but strings that look like boolean operators (e.g., "AND" and "OR" in upper case) still work as boolean operators. This can cause undesired behavior when someone intends to actually search for one of these words, or, more seriously, it leads to an HTTP 400 error ("Bad Request") when the query seems malformed. For example, if OR appears at the end of a query part (as in album:(PORTLAND, OR)), an error occurs.

I can think of three ways of addressing this:

  • Lower case all keyword arguments so they can't contain operators. (The query parameter can still be used if the user actually wants to use boolean operators.)
  • Detect incorrect usage of boolean operators and throw an error on the client side.
  • Leave it as is, but document the behavior.

I think the first seems the most reasonable. I'll implement it unless anyone has an objection.
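A sketch of the first option. The escaping regex, the `build_query` helper, and joining fields with `AND` are all assumptions for the example; the library's real query builder may differ:

```python
import re

# Characters that have special meaning in Lucene query syntax.
LUCENE_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def build_query(**fields):
    """Escape and lower-case each keyword argument, then join them."""
    parts = []
    for key, value in sorted(fields.items()):
        escaped = LUCENE_SPECIAL.sub(r"\\\1", value)
        # Lower-casing means "AND"/"OR" can no longer act as operators.
        parts.append("%s:(%s)" % (key, escaped.lower()))
    return " AND ".join(parts)
```

With this, `build_query(album="PORTLAND, OR")` yields `album:(portland, or)`, so the trailing `OR` is treated as an ordinary word.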

Internal consistency

Ensure all methods are called in a similar way and return similar-looking objects.

Automatically add required includes

Some includes require other includes to be present. E.g. puid requires recordings. We should consider adding these checks in ourselves (because otherwise we let people make an invalid request). An alternative could be to automatically add the includes required to make a request valid - e.g. if someone adds puid we automatically add recordings.
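The automatic variant could be sketched as follows. The `REQUIRED_INCLUDES` table contains only the example from the text (spelled `puids` here, which is an assumption about the include name), and `add_required_includes` is a hypothetical helper:

```python
# Illustrative dependency table; a real one would cover every include.
REQUIRED_INCLUDES = {
    "puids": ("recordings",),
}


def add_required_includes(includes, required=REQUIRED_INCLUDES):
    """Return the includes plus any direct dependencies they need."""
    result = list(includes)
    for inc in includes:
        for dependency in required.get(inc, ()):
            if dependency not in result:
                result.append(dependency)
    return result
```

The sketch only resolves direct dependencies; chains of requirements would need a loop until nothing new is added.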

validate rate limiting values

If you call set_rate_limit(0,0) you get a division-by-zero error. If you call set_rate_limit(1,0) it hangs forever, trying to complete 0 requests in 1 second.
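A minimal sketch of the kind of check that would catch both cases. The function name and the parameter names `interval` and `requests` are hypothetical; only the argument order follows the calls shown above:

```python
def validate_rate_limit(interval, requests):
    """Reject values that would divide by zero or never finish.

    Returns the resulting minimum delay between two requests.
    """
    if interval <= 0:
        raise ValueError("interval must be greater than zero")
    if requests <= 0:
        raise ValueError("requests must be greater than zero")
    return float(interval) / requests
```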

browse requests

"Browse requests are a direct lookup of all the entities directly linked to another entity"
for example, all releases given a label.

Also consider an object for valid links for a particular browse request (like #3)

Rework examples/demos

Move examples from a single file to a series of demos that use all the features that are available

Features that should be demonstrated:

  • Get
  • Includes
  • Release status / type
  • Browse
  • Paging
  • Submission
  • Search
  • Advanced search
  • Setting passwords
  • Collections

Make common length field that uses track length or recording length

When requesting the track-list, in some cases the length will only be parsed correctly if it is inside a recording element.
Example: calling
musicbrainzngs.get_release_by_id("7118801c-cb38-43a3-a76a-b25ee81769bd",["artists","release-groups","media","recordings"])

will not have a length on most of the tracks. The xml response has length parameters for all tracks but they are outside of the recording elements.

This should be fixable by parsing the length right inside the track element.
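On the dictionary side, a common length lookup could be sketched like this. The `track_length` helper is hypothetical, and the key names assume the parsed track is a dict with an optional nested `recording` dict:

```python
def track_length(track):
    """Return the track's own length, else its recording's length."""
    if "length" in track:
        return track["length"]
    # Fall back to the recording; None if neither has a length.
    return track.get("recording", {}).get("length")
```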

include libdiscid binding

I was using libdiscid through python-musicbrainz2. It would be nice to have something similar here.

The implementation in Pymb2 was in musicbrainz2.disc and one could use readDisc(devicename) to get the discID from a cd in a drive.

Always return Unicode strings

I noticed recently that the strings returned from our library are sometimes bytes and sometimes Unicode. Due to ElementTree's default behavior, only those strings that are non-ASCII are returned as Unicode objects. For example:

>>> rec = musicbrainzngs.search_recordings(artist='alt-j', recording='piano', limit=1)['recording-list'][0]
>>> rec['title']
u'\u2766 (Piano)'
>>> rec['release-list'][0]['title']
'An Awesome Wave'

The recording title, which has a "special" character in it, is a unicode object. The release title, which is all ASCII, is a str object. For consistency's sake (and for an eventual Python 3 port), the library should always return unicode objects.

Anyone have any bright ideas about the best way to go about addressing this? (I have a nagging sensation that we might have discussed this in the past, but I can't remember if we came to a conclusion about what to do.)

ext:score not properly exported into result.

It seems the parsing of the ext:score attribute is not handled correctly, so the score is never exported into the resulting dictionary object.

I haven't found the actual problem myself yet due to a lack of time. I will try to look into it later this week.

various artists

from the docs:

  • various-artists include only those releases where the artist appears on one of the tracks,
    but not in the artist credit for the release itself (this is only valid on a
    /ws/2/artist?inc=releases request).

encode/check non-ascii input

ASCII input is no problem.

Unicode input currently works throughout the code (at least I haven't found problems), partly because of the unicode literals in _do_mb_search and the conversion from unicode to UTF-8 for the output in _mb_request (see #28).

However, we don't have any checking or conversion on the input. We just expect everything to be in unicode or ASCII.
Just using sys.argv does not generate unicode strings, and other input might also have problems.

So if we decide on handling unicode strings in the library itself, we have to decode non-ASCII byte strings to unicode on input.

Otherwise every function must be prepared to handle both non-ASCII byte strings and unicode.

Right now I only see _do_mb_search handling user input that is possibly non-ASCII. So we should probably convert there.

We should also check how things change when we add support for Python 3.

fill in "missing" track information

For space reasons, the MusicBrainz web service skips filling in track details that can be inherited from the recording (if the recording and track don't differ).

We should fill in these values again, so that we don't need to check one element to see if it exists before falling back to another one.

For some examples, see:
http://test.musicbrainz.org/ws/2/release/5e3524ca-b4a1-4e51-9ba5-63ea2de8f49b?inc=recordings (track name)
https://beta.musicbrainz.org/ws/2/release/704b7bbd-ffdb-4e01-b211-713d0506ba85?inc=recordings+artists+artist-credits (artist credits)
https://beta.musicbrainz.org/ws/2/release/5dc6c088-0b65-4501-90e5-2b07d60618a2?inc=artists+recordings+artist-credits (compare to previous)

automatic paging

Browse requests and searches support paging. We should return from these requests an object that gives the results, with an easy method to call to get the next set of results.

An option might be to make these responses iterable, so that you can just call next() on them and paging will happen magically in the background.
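The iterable idea could be sketched with a generator. `iter_pages` and the shape of the `fetch` callable (keyword arguments `limit` and `offset`, returning a dict with a list under `list_key`) are assumptions for the example:

```python
def iter_pages(fetch, list_key, limit=25):
    """Yield every item of a paged result, fetching pages lazily."""
    offset = 0
    while True:
        page = fetch(limit=limit, offset=offset)
        items = page.get(list_key, [])
        for item in items:
            yield item
        offset += len(items)
        # A short (or empty) page means there is nothing left.
        if len(items) < limit:
            break
```

A caller could then write `for release in iter_pages(some_browse_function, "release-list")` and never handle offsets directly.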

filter by release type, status

"Any query which includes release-groups in the results can be filtered to only include release groups of a certain type"

AttributeError: 'etree' object has no attribute 'ParseError'

File "query.py", line 14, in main
print m.get_recordings_by_puid("070359fc-8219-e62b-7bfd-5a01e742b490")
[...]
File "python-musicbrainz-ngs/musicbrainz.py", line 576, in _mb_request
except etree.ParseError, exc:
AttributeError: 'module' object has no attribute 'ParseError'

using Python 2.6
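A possible workaround, sketched under the assumption that older ElementTree versions (such as the one shipped with Python 2.6) let expat's `ExpatError` propagate instead of defining `ParseError`. `PARSE_ERRORS` and `parse_or_none` are hypothetical names:

```python
import xml.etree.ElementTree as etree
from xml.parsers import expat

# Build the tuple of exceptions to catch based on what is available.
if hasattr(etree, "ParseError"):
    PARSE_ERRORS = (etree.ParseError, expat.ExpatError)
else:
    PARSE_ERRORS = (expat.ExpatError,)


def parse_or_none(text):
    """Parse XML text, returning None when it is malformed."""
    try:
        return etree.fromstring(text)
    except PARSE_ERRORS:
        return None
```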
