Comments (3)
I opened https://gitlab.com/meltano/sdk/-/issues/282 to suggest a solution at the SDK level.
from tap-github.
- `MAX_PER_PAGE` was updated in #53
Implementation idea:
Similar to our list of `tolerated_error`, we could have a list of `continue_on_errors`. When a partition exits on such an error, it would add a flag in its context, `error_occured: true`. If such a flag is raised, we could then override `[Stream]._increment_stream_state` to check this flag and NOT update the state for this partition if it has been raised.
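A minimal sketch of that idea in plain Python, without the SDK. Everything here is hypothetical and only illustrates the flow: `CONTINUE_ON_ERRORS`, `PartitionError`, `sync_partitions`, the `repo` and `updated_at` keys, and the dict used as state are all assumptions; only the `error_occured` flag name comes from the proposal. In the real tap the check would live in an override of `_increment_stream_state`.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical: status codes that skip a partition instead of failing the tap.
CONTINUE_ON_ERRORS = {500, 502}


class PartitionError(Exception):
    """Hypothetical error raised when a partition fails after all retries."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def sync_partitions(
    partitions: List[dict],
    fetch: Callable[[dict], Iterable[dict]],
    state: Dict[str, str],
) -> None:
    """Sync each partition, leaving the bookmark untouched for failed ones."""
    for context in partitions:
        latest = None
        try:
            for record in fetch(context):
                latest = record["updated_at"]
        except PartitionError as err:
            if err.status_code not in CONTINUE_ON_ERRORS:
                raise  # not a "continue" error: fail as before
            context["error_occured"] = True  # flag name as spelled in the proposal
        # Stand-in for the overridden state increment: skip flagged partitions.
        if context.get("error_occured") or latest is None:
            continue
        state[context["repo"]] = latest
```

With this shape, a partition that fails halfway keeps its previous bookmark, so the next run starts again from the last known-good position.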
A couple of ideas to let the tap continue upon errors (assuming backoff has tried N times and failed, like we've seen a number of times in our logs):
Tolerated errors
We could add all sorts of error codes to the `tolerated_http_errors` list for a stream, but this has the downside of losing data, as the tap will move the bookmark past the erroneous data.
Override stream request decorator
The `RESTStream` class has a `request_decorator` which we could override in the tap when a specific config option is passed to `continue_on_error`. The decorator could then build a mock `Response` so that the rest of the stream moves on to the next partition. The state bookmark should be set to "before" the error, so it can be retried on the next run.
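As a rough illustration of the decorator approach, here is a standalone sketch that does not import the SDK. `RetriesExhausted`, `EmptyResponse` and the `continue_on_error` factory are hypothetical stand-ins (for the error raised once backoff gives up, for the mock `Response`, and for the config-driven override); the real override would wrap the stream's request function instead.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable


class RetriesExhausted(Exception):
    """Hypothetical stand-in for the error raised once backoff gives up."""


@dataclass
class EmptyResponse:
    """Hypothetical stand-in for a mock Response that carries no records."""

    status_code: int = 200

    def json(self) -> list:
        return []


def continue_on_error(enabled: bool) -> Callable:
    """Build a decorator that swaps exhausted-retry failures for an empty response."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                return func(*args, **kwargs)
            except RetriesExhausted:
                if not enabled:
                    raise  # option not set: keep the current failing behaviour
                return EmptyResponse()  # lets the stream move to the next partition

        return wrapper

    return decorator
```

Note that the sketch only covers moving on to the next partition; keeping the bookmark "before" the error still has to be handled separately.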
The difficulty would be in figuring out which errors are temporary, and which are not:
- the example URL above is still returning the same 502 after 10s, more than 3 months later. It's unlikely it will ever work. At what point do we consider an error "permanent"?
- on the other hand, we had a 500 on `/repos/opensourcedesign/opensourcedesign.github.io/commits` yesterday, which worked fine about 24h later.
A possible option would be to add a `retried_count` in the state bookmark which keeps track of how many times (or over how long) a specific partition+stream has been retried, and gives up after N attempts over multiple runs.
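The bookkeeping for such a counter could look roughly like the sketch below. The helper names, `MAX_ATTEMPTS`, and the use of a plain dict with a `replication_key_value` key as the bookmark are assumptions made for illustration; only `retried_count` comes from the suggestion above.

```python
from typing import Any, Dict

MAX_ATTEMPTS = 3  # hypothetical value for N


def record_failure(bookmark: Dict[str, Any], max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Count one failed run; return True if the partition should be retried next run."""
    count = int(bookmark.get("retried_count", 0)) + 1
    bookmark["retried_count"] = count
    # The replication value is deliberately left alone so the next run retries
    # from the same position.
    return count < max_attempts


def record_success(bookmark: Dict[str, Any], replication_value: str) -> None:
    """Advance the bookmark and reset the counter after a clean run."""
    bookmark.pop("retried_count", None)
    bookmark["replication_key_value"] = replication_value
```

Once `record_failure` returns `False`, the tap could treat the error as permanent and advance past it, accepting the data loss for that partition.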