
Comments (3)

laurentS commented on July 20, 2024

I opened https://gitlab.com/meltano/sdk/-/issues/282 to suggest a solution at the SDK level.

from tap-github.

ericboucher commented on July 20, 2024
  • MAX_PER_PAGE was updated in #53

  • Implementation idea:
    Similar to our list of tolerated_error, we could have a list of continue_on_errors. When a partition exits on such an error, it would add a flag in its context, error_occured: true. If such a flag is raised, we could then override [Stream]._increment_stream_state to check this flag and NOT update the state for this partition if it has been raised.
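
A minimal sketch of that idea, not the SDK's actual code: `PartitionState`, `sync_partition` and the contents of `CONTINUE_ON_ERRORS` are hypothetical stand-ins, and the `increment` method only plays the role that an overridden `_increment_stream_state` would play in the tap.

```python
# Hypothetical list of status codes on which a partition is skipped
# instead of failing the whole run.
CONTINUE_ON_ERRORS = {500, 502}


class PartitionState:
    """Stand-in for the per-partition bookmarks of a stream (not the SDK class)."""

    def __init__(self) -> None:
        self.bookmarks: dict = {}

    def increment(self, context: dict, partition_key: str, new_value: str) -> None:
        """Advance the bookmark unless the partition was flagged as errored."""
        if context.get("error_occured"):
            # Keep the old bookmark so the next run retries this partition.
            return
        self.bookmarks[partition_key] = new_value


def sync_partition(
    state: PartitionState,
    context: dict,
    partition_key: str,
    status_code: int,
    latest_value: str,
) -> None:
    """Flag the context on a continue-on-error status, then try to update state."""
    if status_code in CONTINUE_ON_ERRORS:
        context["error_occured"] = True
    state.increment(context, partition_key, latest_value)
```

With this shape, a partition that ends on a 502 keeps its previous bookmark, while healthy partitions advance as usual.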


laurentS commented on July 20, 2024

A couple of ideas to let the tap continue upon errors (assuming backoff has tried N times and failed, like we've seen a number of times in our logs):

Tolerated errors

We could add all sorts of error codes to the tolerated_http_errors list for a stream, but this has the downside of losing data, because the tap will advance the bookmark past the records it failed to fetch.

Override stream request decorator

The RestStream class has a request_decorator which we could override in the tap when a specific config option (for example continue_on_error) is set. The decorator could then build a mock Response so that the rest of the stream moves on to the next partition. The state bookmark should be left "before" the error, so the failed request can be retried on the next run.
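
As a rough illustration of the decorator approach, here is a self-contained sketch that does not use the SDK. `RetriesExhausted`, `EmptyResponse` and `continue_on_error` are hypothetical names, and the real request_decorator may have a different shape.

```python
import functools
from dataclasses import dataclass, field
from typing import Any, Callable


class RetriesExhausted(Exception):
    """Hypothetical error raised once backoff has given up on a request."""


@dataclass
class EmptyResponse:
    """Hypothetical stand-in for a response that carries no records."""

    status_code: int = 200
    records: list = field(default_factory=list)
    skipped: bool = True  # lets the caller know not to advance the bookmark

    def json(self) -> list:
        return self.records


def continue_on_error(enabled: bool) -> Callable:
    """Build a decorator that swaps a failed request for an empty response."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                return func(*args, **kwargs)
            except RetriesExhausted:
                if not enabled:
                    raise  # config option not set: keep the failing behaviour
                return EmptyResponse()

        return wrapper

    return decorator
```

The `skipped` marker is one possible way for the stream to tell a mocked response from a genuinely empty page, so that the bookmark stays where it was.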

The difficulty would be in figuring out which errors are temporary, and which are not:

  • the example URL above still returns the same 502 after 10s, more than 3 months later. It's unlikely it will ever work. At what point do we consider an error "permanent"?
  • on the other hand, we had a 500 on /repos/opensourcedesign/opensourcedesign.github.io/commits yesterday which worked fine about 24h later.

A possible option would be to add a retried_count in the state bookmark which keeps track of how many times (or over how long) a specific partition+stream has been retried, and gives up after N attempts over multiple runs.
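
One way this counter could work, as a sketch only: the bookmark is modelled as a plain dict, and `MAX_RETRIED_RUNS`, `record_failure`, `record_success` and the `replication_key_value` key are assumptions made for illustration.

```python
MAX_RETRIED_RUNS = 3  # hypothetical N: runs to attempt before giving up


def record_failure(bookmark: dict, max_runs: int = MAX_RETRIED_RUNS) -> bool:
    """Count one failed run for this partition+stream.

    Returns True if the partition should be retried on the next run,
    False once it has failed max_runs times and should be given up on.
    """
    bookmark["retried_count"] = bookmark.get("retried_count", 0) + 1
    return bookmark["retried_count"] < max_runs


def record_success(bookmark: dict, new_value: str) -> None:
    """Advance the bookmark and reset the failure counter."""
    bookmark["replication_key_value"] = new_value
    bookmark.pop("retried_count", None)
```

Counting runs is the simpler variant. Tracking "over how long" would mean storing the timestamp of the first failure instead of, or alongside, the counter.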

