
Comments (6)

taganaka avatar taganaka commented on May 29, 2024

I'm more inclined to let the client handle page download failures than to apply a re-enqueue strategy blindly.
A naive built-in retry strategy for temporary network failures is already in place (https://github.com/taganaka/polipus/blob/master/lib/polipus/http.rb#L170-L179).

A client can easily implement its own error-handling workflow for finer-grained control:

polipus.on_page_downloaded do |page|
  polipus.add_to_queue(page) if page.error
end
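
One way to avoid re-enqueueing blindly is to cap the number of retries per URL. The following is only a sketch: `RetryBudget` and its `retry?` method are hypothetical names introduced for illustration and are not part of Polipus.

```ruby
# Hypothetical helper, not part of Polipus: caps the retries allowed per URL.
class RetryBudget
  def initialize(max_retries)
    @max_retries = max_retries
    @attempts = Hash.new(0) # URL => retries granted so far
  end

  # Returns true while the URL still has retries left and counts the attempt.
  def retry?(url)
    return false if @attempts[url] >= @max_retries

    @attempts[url] += 1
    true
  end
end

budget = RetryBudget.new(2)
decisions = 3.times.map { budget.retry?("http://example.com/flaky") }
```

Inside the handler above, the condition could then become something like `page.error && budget.retry?(page.url.to_s)`, so that a permanently failing URL is eventually dropped.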

I can add a new block handler like on_page_error {|page|...} so that the error handling logic can be encapsulated there.

on_page_downloaded blocks are always called, no matter what.
Once an on_page_error handler is introduced, the on_page_downloaded call should be skipped for failed pages, since the page has not been downloaded.
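
The proposed dispatch rule could look roughly like this. It is a minimal sketch of the behaviour described above, not the Polipus implementation: `CallbackDispatcher`, `dispatch`, and the `Page` struct are hypothetical stand-ins.

```ruby
# Hypothetical stand-ins for the crawler's callback dispatch; not Polipus code.
Page = Struct.new(:url, :error)

class CallbackDispatcher
  def initialize
    @on_page_downloaded = []
    @on_page_error = []
  end

  def on_page_downloaded(&block)
    @on_page_downloaded << block
    self
  end

  def on_page_error(&block)
    @on_page_error << block
    self
  end

  # Failed pages reach the error handlers only; successful pages reach the
  # download handlers only.
  def dispatch(page)
    handlers = page.error ? @on_page_error : @on_page_downloaded
    handlers.each { |handler| handler.call(page) }
  end
end

events = []
dispatcher = CallbackDispatcher.new
dispatcher.on_page_downloaded { |page| events << [:downloaded, page.url] }
dispatcher.on_page_error { |page| events << [:error, page.url] }

dispatcher.dispatch(Page.new("http://example.com/ok", nil))
dispatcher.dispatch(Page.new("http://example.com/broken", "timeout"))
```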

from polipus.

tmaier avatar tmaier commented on May 29, 2024

I think Page#storable? should return false if there was an error.
The drawback is that this would prevent us from investigating the error.
The advantage would be that if we queue the URL again, it would skip the URL later.

I like on_page_error, as this would encapsulate the error handling in one place.

And one of the examples should show:

polipus.on_page_error do |page|
  polipus.add_to_queue(page) if page.error
end


tmaier avatar tmaier commented on May 29, 2024

I was wrong. The drawback mentioned does not exist, as we do not store the error message in the database.


taganaka avatar taganaka commented on May 29, 2024

Storing the page together with its error in the DB could be useful for offline debugging.
Since on_page_error blocks are executed before [1] page.storable? is evaluated, users could easily modify the flow:

polipus.on_page_error do |page|
  page.storable = false
  polipus.add_to_queue(page)
end

However, to store the error message in the DB, the error attribute needs to be serialized and exposed by the page.to_hash method.
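
Both points can be sketched together. This is only an illustration under assumptions: `SketchPage` and `SketchCrawler` are hypothetical stand-ins that borrow names from this thread, and the `"error"` key in `to_hash` is an assumed serialization, not the actual Polipus schema.

```ruby
# Hypothetical stand-ins that mirror names from the thread; not Polipus code.
class SketchPage
  attr_reader :url, :error
  attr_writer :storable

  def initialize(url, error = nil)
    @url = url
    @error = error
    @storable = true
  end

  def storable?
    @storable
  end

  # Expose the error so that a stored record keeps it for offline debugging.
  def to_hash
    { "url" => url, "error" => error&.to_s }
  end
end

class SketchCrawler
  attr_reader :queue, :stored

  def initialize
    @queue = []
    @stored = []
    @on_page_error = []
  end

  def on_page_error(&block)
    @on_page_error << block
  end

  def add_to_queue(page)
    @queue << page.url
  end

  def process(page)
    # Error handlers run first, so they can still change page.storable.
    @on_page_error.each { |handler| handler.call(page) } if page.error
    @stored << page.to_hash if page.storable?
  end
end

# Variant 1: re-queue failed pages but still store them, error included.
crawler = SketchCrawler.new
crawler.on_page_error { |page| crawler.add_to_queue(page) }
crawler.process(SketchPage.new("http://example.com/broken", "timeout"))

# Variant 2 (as in the snippet above): re-queue and do not store.
strict = SketchCrawler.new
strict.on_page_error do |page|
  page.storable = false
  strict.add_to_queue(page)
end
strict.process(SketchPage.new("http://example.com/broken", "timeout"))
```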

[1] https://github.com/taganaka/polipus/blob/on_page_error/lib/polipus.rb#L207


taganaka avatar taganaka commented on May 29, 2024

implemented here: #29


taganaka avatar taganaka commented on May 29, 2024

Addressed in v0.3.0

https://github.com/taganaka/polipus/releases/tag/0.3.0

