Comments (10)
I am working on this. Just wanted to let people know so that we don't end up doing duplicate work.
from autowebcompat.
Either CSV or line-delimited JSON (https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON, that is, one JSON object per line).
A normal JSON file is a bit problematic because you can't easily see diffs between two versions of the file (e.g. if you add just one entry, the diff for a normal JSON file can show you the entire file).
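A minimal sketch of the line-delimited approach, to show why it diffs well: each record is appended as its own line, so adding an entry changes only one line of the file. The field names in the usage note (`webcompat_id`, `missing_browser`) are hypothetical and not taken from the project.

```python
import json


def append_inconsistency(path, record):
    """Append one record to the file as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        # sort_keys keeps the key order stable, which keeps diffs small.
        f.write(json.dumps(record, sort_keys=True) + "\n")


def read_inconsistencies(path):
    """Read all records back, one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, calling `append_inconsistency("inconsistencies.jsonl", {"webcompat_id": 1491, "missing_browser": "chrome"})` adds exactly one line to the file and leaves the existing lines untouched.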
To do this, the first step would be to create a script to list all the inconsistencies.
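A rough sketch of what such a listing script could look like. It rests on assumptions that are not confirmed in this thread: that screenshots are PNG files named `<name>_<browser>.png`, that the browsers are `firefox` and `chrome`, and that an inconsistency means a screenshot exists for one browser but not the other.

```python
import os
from collections import defaultdict

BROWSERS = ("firefox", "chrome")  # assumed browser names


def find_inconsistencies(filenames, browsers=BROWSERS):
    """Return {name: [missing browsers]} for names lacking a screenshot in some browser."""
    seen = defaultdict(set)
    for filename in filenames:
        stem, ext = os.path.splitext(filename)
        if ext != ".png":
            continue
        # Assumed naming scheme: everything before the last "_" is the name.
        name, _, browser = stem.rpartition("_")
        if name and browser in browsers:
            seen[name].add(browser)
    return {
        name: [b for b in browsers if b not in found]
        for name, found in sorted(seen.items())
        if len(found) < len(browsers)
    }
```

It could be fed with `os.listdir(...)` on whatever directory holds the screenshots, and its result written out in the chosen export format.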
Which format would be best for exporting the inconsistencies?
- A CSV with every row being an inconsistency
- A JSON file containing a list of inconsistencies, each one a dict describing it
@marco-c what do you think we could do next regarding this?
Manually look at the inconsistencies and see what prevented us from taking a screenshot. E.g. force the crawler to only load the website with an inconsistency and see if the crawler throws an exception in one of the browsers.
Can you point me in a direction as to how I could work with the crawler in this case and force it to load a site?
The crawler is in collect.py; you need to change it to load a URL of your choice instead of loading a URL from one of the webcompat bugs.
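As an illustration only, and not the actual code in collect.py, loading a single URL in each browser and recording any exception could be sketched like this. It assumes driver objects that expose `get(url)` and `quit()`, as Selenium WebDriver instances do; `try_load` and `driver_factories` are hypothetical names.

```python
import traceback


def try_load(url, driver_factories):
    """Load url in each browser; return {browser: traceback text, or None on success}."""
    results = {}
    for name, make_driver in driver_factories.items():
        driver = None
        try:
            driver = make_driver()
            driver.get(url)
            results[name] = None
        except Exception:
            # Keep the full traceback so the failing browser can be inspected later.
            results[name] = traceback.format_exc()
        finally:
            if driver is not None:
                driver.quit()
    return results
```

With Selenium installed, `driver_factories` could be something like `{"firefox": webdriver.Firefox, "chrome": webdriver.Chrome}`; a browser whose entry is not `None` is the one that threw.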
@marco-c where do I get the URLs of the websites for which we have inconsistent screenshots? We haven't stored these website URLs anywhere.
We have stored the webcompat ID, so you can retrieve the URLs either with Python, by using utils.get_bugs() and finding the bug you want, or by loading the bug on the webcompat.com website (e.g. https://webcompat.com/issues/1491).
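A small sketch of the Python route. The thread only names utils.get_bugs(); the shape of what it returns is an assumption here (a list of dicts), and the `number` and `url` keys are hypothetical defaults that may need adjusting to the real data.

```python
def find_bug_url(bugs, webcompat_id, id_key="number", url_key="url"):
    """Return the URL stored for the bug with the given webcompat ID, or None."""
    for bug in bugs:
        # Compare as strings so that 1491 and "1491" both match.
        if str(bug.get(id_key)) == str(webcompat_id):
            return bug.get(url_key)
    return None
```

Under those assumptions, `find_bug_url(utils.get_bugs(), 1491)` would give the URL to pass to the crawler.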