Comments (14)

Mr0grog commented on June 15, 2024 4

Actually, it might be helpful to clarify here (or via quick video/chat meeting this week?) what we mean by V1. I think we should shoot first for something that merely replaces the current Google Sheets workflow but nothing beyond that:

  • Organizers should need to do nothing to set things up for analysts (e.g. run the scraper, upload CSVs to Google Docs)
  • Analysts can go to some site to see the diffs they need to look at for the current N-day period.
  • Analysts can click a link to view diffs in Versionista
  • Analysts can fill in a form to annotate a change
  • ? (I’m sure I’m missing exactly how we’d want to handle the dictionary stuff, seeing what analysts had marked as significant/insignificant, etc.)

That’s simple and concrete. I think we should be able to achieve it by May 1st… or it’s a sign we probably aren’t coordinating well.

Once we can do that, then we can move on to all the more complex things we want (e.g. PageFreezer, Internet Archiver, fancier/smarter diffs, automated filtering services, etc etc etc). Is the above v0 or v1 or v-uh-sorry-we're-not-interested-in-that?

from web-monitoring.

danielballan commented on June 15, 2024 3

Good luck with the boxes, @lightandluck.


Mr0grog commented on June 15, 2024 1

OK! Per conversation from Saturday, we’re going to try and break this up into two items:

  • v.0: A simple implementation that can functionally accomplish what’s done in Google Docs now. This is primarily for evaluation—if it works well enough, we may want to shift analysts directly to it, but that is not its main goal. This is pretty close to the set of requirements listed above:

    • Data (including raw HTML content) is automatically and continually scraped out of Versionista
    • Data can be queried by site (to maintain current process for how work is split up by analyst)
    • Analysts can view all versions/diffs over an N-day period (I think, at this point, it’s still OK to link them back to Versionista for diff viewing. That should be done away with by 1.0)
    • Analysts can fill in a form to annotate a change
    • Annotations can be queried by significance
    • This does NOT include:
      • Custom diffing separate from Versionista
      • Tagging
      • Dictionary (we’ll leave that part of the workflow in Google Docs, though it could now be a link to this system instead of a full line from a spreadsheet)
  • v.1: A fully deployed implementation across all projects that can absolutely replace the current Google Docs workflow.

    • Showing our own diffs
    • Flagging changes for the dictionary
    • Probably a much nicer UI
    • Maybe also includes…
      • Tagging (of pages and maybe of changes [we might be able to do change tagging through annotations…?]; only some people have permissions to make new tags so we can keep a controlled vocabulary). See also #30.
      • Sources other than Versionista
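To make the v.0 query requirements concrete, here is a minimal sketch in Python of the two lookups listed above: versions for one site over an N-day period, and annotations filtered by significance. Every name here (`Version`, `Annotation`, `versions_for_site`, `annotated_by_significance`, and all field names) is hypothetical and chosen only for illustration; it is not the web-monitoring-db schema, and a real implementation would presumably query a database rather than filter a list in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Annotation:
    """Hypothetical analyst annotation on a change."""
    significant: bool
    notes: str = ""


@dataclass
class Version:
    """Hypothetical record for one scraped version of a page."""
    site: str
    url: str
    captured_at: datetime
    versionista_diff_url: str  # v.0 still links back to Versionista for diffs
    annotation: Optional[Annotation] = None


def versions_for_site(versions: List[Version], site: str,
                      days: int, now: datetime) -> List[Version]:
    """Return versions of one site captured within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [v for v in versions
            if v.site == site and v.captured_at >= cutoff]


def annotated_by_significance(versions: List[Version],
                              significant: bool) -> List[Version]:
    """Return annotated versions whose significance flag matches."""
    return [v for v in versions
            if v.annotation is not None
            and v.annotation.significant == significant]
```

Keeping `annotation` optional mirrors the workflow described above: a version exists as soon as it is scraped, and an analyst fills in the annotation later through the form.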

Shooting for v0 by the end of April and v1 by the end of May.

I’ve made a v0 and a v1 milestone for web-monitoring-db (but not properly sorted and tagged all issues yet). We should probably do the same for all the web-monitoring* projects.

More thoughts/feedback/amendments welcome if this doesn’t seem complete or doesn’t quite jibe with everyone.

/cc @lightandluck


lightandluck commented on June 15, 2024 1

Thanks for the write-up and keeping me in the loop! I'm moved in but now comes the unpacking and organizing phase. I'll be able to continue contributing by next week. I'll try to catch myself up in the meantime.


danielballan commented on June 15, 2024 1

Version 0 updated task list, with effort estimates:

These times are not hours of on-the-job effort but actual calendar time, given our likely availability for volunteer development work


Mr0grog commented on June 15, 2024

👍 I have added a corresponding milestone in the DB project and started to tag PRs and issues.


danielballan commented on June 15, 2024

The purpose of the next web-monitoring dev call should be to nail down the scope and ensure that we all understand it consistently. We can either co-opt the main #dev call for this purpose or hold a continuing call after.


Mr0grog commented on June 15, 2024

@trinberg @ambergman @dcwalk Would love any quick (or not) input you have here that we can think about before Saturday’s call.


dcwalk commented on June 15, 2024

I likely won't have time to add thoughts before the townhall today (sorry!), but I'll try to be available for ~30 mins at 6pm for your call and chime in.


danielballan commented on June 15, 2024

Thanks for writing this up, @Mr0grog. This matches my current understanding.


dcwalk commented on June 15, 2024

Yeah, good luck with the boxes :))


Mr0grog commented on June 15, 2024

All the plans we enumerated here have gone out the window multiple times over. Should we close this issue, or do we need to do a better job keeping it updated?


danielballan commented on June 15, 2024

We have finer granularity than "v0" now. I think we should close this issue and set up more fine-grained milestones.


Mr0grog commented on June 15, 2024

Closing in favor of #75.

