digitaldutch / roaddanger.org

A website and database for news reports about traffic crashes around the world.

Home Page: https://www.roaddanger.org/

License: MIT License

JavaScript 57.94% PHP 39.25% CSS 2.81%
danger mariadb php road traffic-crashes

roaddanger.org's Issues

Data export seems incomplete, limited at 10000 crashes

Sorry, another issue... (I think).

It seems as if not all data is exported in the gzip JSON file.

When I export the data via https://nl.roaddanger.org/export/, the file contains exactly 10000 crashes.

However, when I visit https://nl.roaddanger.org/statistics/general I see there are approximately 14k crashes (this corresponds with the current crash IDs).

Due to the neat amount of 10000 items I suspect there is some limiting going on. I don't know PHP, but the export function does mention $maxRows = 10000; and this variable is used later in the SQL query. Could this be the culprit?
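
If the row cap does turn out to be the cause, one possible approach is to export in pages instead of a single capped query. The sketch below is only an illustration in JavaScript, not the project's actual PHP export code: `exportAllCrashes` and `fetchPage(offset, limit)` are hypothetical names, and the in-memory array stands in for the database.

```javascript
// Hypothetical sketch: keep requesting pages until the source runs out,
// instead of stopping after one query capped at pageSize rows.
// fetchPage(offset, limit) is an assumed callback that returns an array of rows.
function exportAllCrashes(fetchPage, pageSize = 10000) {
  const all = [];
  let offset = 0;
  while (true) {
    const page = fetchPage(offset, pageSize);
    for (const row of page) all.push(row);
    // A page shorter than pageSize means there is nothing left to read.
    if (page.length < pageSize) break;
    offset += pageSize;
  }
  return all;
}

// Stand-in data source with 14 rows; a real one would query the database.
const rows = Array.from({ length: 14 }, (_, i) => ({ id: i + 1 }));
const fakeFetchPage = (offset, limit) => rows.slice(offset, offset + limit);

const exported = exportAllCrashes(fakeFetchPage, 10);
console.log(exported.length);
```

With a page size of 10 and 14 rows, the loop reads two pages and returns all rows rather than only the first 10.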

Editing accident returns database error

Article: "Fietsster overleden na verkeersongeval" (Cyclist dies after traffic accident)

I wanted to change the unknown vehicle to something that represents a mobile home as closely as possible, but it returned the following error:

SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect integer value: '' for column `hetongeluk`.`crashpersons`.`child` at row 1

Something did change, though: the old unknown vehicle was deleted, but the new vehicle was not added.

On further inspection, I cannot add an accident at all. I attached the HAR file as a text file:
nl.roaddanger.org.har.txt

I can reproduce this in both Vivaldi (Chromium) and Firefox.
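
The error message suggests that an empty string reached the integer column `child`. One way to guard against that is to normalize such flags before they are sent or stored. This is only a sketch: `toIntFlag` is a hypothetical helper, the record shape is an assumption, and it presumes the column holds a 0/1 flag.

```javascript
// Hypothetical helper: coerce a form value to 0 or 1 for an integer column.
// Assumes the column stores a 0/1 flag, which the issue does not confirm.
function toIntFlag(value) {
  if (value === true || value === 1 || value === '1' || value === 'true') return 1;
  return 0; // '', null, undefined, false, 0 and '0' all become 0
}

// Assumed shape of a person record as sent by the edit form.
// Only the column name "child" comes from the error message above.
const person = { transportationmode: 5, child: '', intoxicated: '1' };
const cleaned = {
  ...person,
  child: toIntFlag(person.child),
  intoxicated: toIntFlag(person.intoxicated),
};
console.log(cleaned);
```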

Regional statistics

Is there any chance to extend thecrashes.org to provide a regional analysis of the data? I'm adding news articles from Aachen/Germany at the moment and at some point there could be enough data that a regional analysis could make sense.

Extract crash info from crash image

Almost all crash articles include a photograph. Maybe crash meta information can be extracted from these images.

Currently all meta info, like involved modes of transportation and injuries, is extracted from the article text. AI image analysis may help. This may be quite hard, as the available photos often contain little info.

Can't add people on iPad with keyboard

I see you do these in English...

I was trying to add an article on my iPad, but adding people does not seem to work there. Without adding people, I can't submit the article, so I had to switch to my laptop.

After submitting, I did some more debugging:

  • it works on my iPhone, so that's weird.
  • when I unplug the keyboard from my iPad it works again.

I think this is because of the use of <span> and onclick. When there is no keyboard connected, the iPad stays compatible and fires onclick for taps. But when a keyboard is connected, it may come with a mouse or trackpad, so the iPad will probably distinguish between a click and a tap. (Mine has no mouse, but it seems like it still distinguishes.)

One could think the bug is with Apple, but the better way is not to use a <span> for something that is meant to be clicked. The accessible option would be to use an <input type="radio">, which does what you want out of the box, plus has advantages for screen readers and different devices :)

Web crawler to find media web pages

Currently volunteers find crash articles themselves and add the link to our spider, which loads the page and extracts meta information from this webpage.

A web crawler that crawls media websites or crash aggregation websites (like ongeluk vandaag) for relevant articles would be quite handy.

Extract involved crash humans and their injuries automatically from article text

Currently, the humans involved in a crash and their injuries are determined by volunteers, who copy the article's full text and select this data manually. Helpers click the Add button and fill in a URL to a media page. The roaddanger spider then tries to read the meta tags (JSON-LD, Twitter/X, Open Graph, etc.).

Any missing data is copied manually. From the title and full text, all important data is extracted, such as:

  • All involved humans and their mode of transportation. All transport options can be found on the data export page.
  • Their injuries: dead, injured, unharmed or unknown
  • If the human is a child (below 18)
  • If the human was intoxicated
  • If the human drove away or fled
  • If it was a one-sided crash (no other humans involved)

Entry screen: (screenshot of the data entry screen)

Entering data will be faster if these steps can be automated using an AI language model that reads the text and then automatically selects all involved humans and their characteristics. As roaddanger.org is multilingual, it would be nice if this feature supports multiple languages.

All current crash data (full texts and all meta data like involved humans) can be downloaded in JSON format from this page. This data can be used to train or test the language models.

Add new transportation mode "speed pedelec"

Hello,

It would be helpful if there would be a new transportation mode "speed pedelec"/"high speed e-bike". I think this could be a valuable addition because:

  1. They sometimes get a bad reputation of being dangerous because they, in Belgium at least, are oftentimes allowed to make use of the cycle path, which is often not built with the higher average speed of the pedelec in mind. It would be interesting to see data about how often they are involved in an accident vs. a "normal" bicycle.

  2. Currently it is not clear how they should be categorised. Some authorities view them as mopeds (for example the Belgian traffic code), while others see them as bicycles (for example the Flemish citizen science project Straatvinken). It is not good for the data if in some collisions the speed pedelecs involved are categorised as bicycles and in others as mopeds.

In my humble opinion, a new category would be very helpful.
Another possibility would be to classify the pedelec as a bicycle or a moped and then change the name of the category to bicycle/pedelec, for example, so it is clear where they need to be placed.

Thank you very much!

Automatically extract the full article text from a media webpage

Currently our spider uses the JSON-LD article tag to find the full text of a media article web page. The problem is that few media websites support this tag. Consequently, our volunteers have to manually copy and paste the text from the web page into our input field.

Any method (scraping, not yet used tags) that helps to automatically read the full text is welcome.

As roaddanger.org is multilingual, it would be nice if the full text extractor supports multiple languages.
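
For reference, the JSON-LD approach described above can be sketched roughly as follows. This is an illustration in JavaScript, not the spider's actual code: `extractArticleBody` is a hypothetical name, and the regex-based scan is a simplification that a real HTML parser would replace. It looks for the schema.org `articleBody` property and returns `null` when a page does not provide it, which is the case any new extraction method would need to cover.

```javascript
// Sketch: pull articleBody out of JSON-LD script tags in an HTML string.
// extractArticleBody is a hypothetical name, not a function from the project.
function extractArticleBody(html) {
  const scriptRe = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptRe)) {
    let data;
    try {
      data = JSON.parse(match[1]);
    } catch {
      continue; // skip malformed JSON-LD blocks
    }
    // JSON-LD may be a single object, an array, or wrapped in @graph.
    let nodes;
    if (Array.isArray(data)) nodes = data;
    else if (Array.isArray(data?.['@graph'])) nodes = data['@graph'];
    else nodes = [data];
    for (const node of nodes) {
      if (node && typeof node.articleBody === 'string') return node.articleBody;
    }
  }
  return null; // no usable articleBody found
}

// Made-up sample page for illustration only.
const sampleHtml = `
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "NewsArticle", "headline": "Example", "articleBody": "Example full text."}
</script>
</head><body></body></html>`;

console.log(extractArticleBody(sampleHtml));
```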

Additional graphs to better visualize the data

Currently there are a few graphs to visualize the crash data.

New or better graphs that show which transportation modes cause the most victims, or how the media report on crashes, are welcome. Preferably interactive with filters and clickable, but any graph or visualization that can be used on social media is welcome.

The current graphs use the d3 JavaScript library. Data is pulled from the server using a fetch call to a server side PHP script and the graphs are generated in the browser using d3.

Suggestions:

If summary cannot be scraped it remains empty, but filling it is required to close out form

Example URL: https://regio15.nl/nieuws/ongevallen/36619/wegvervoer-ongeval-letsel-neherkade-2980-den-haag/

Steps to reproduce:

  1. Go to https://nl.roaddanger.org/
  2. Add new article via + sign in the upper right
  3. Add this URL: https://regio15.nl/nieuws/ongevallen/36621/spoorvervoer-ongeval-letsel-scheveningseweg-den-haag-lijn-1/ and press on "Artikel ophalen"
  4. Note that the "Samenvatting" field remains empty, it cannot be scraped for this URL
  5. Fill in the "Tekst" field under "Ongeluk" to create a manual summary
  6. Fill in the rest of the fields ("Datum", "Betrokken mensen", "Locatie", etc...)
  7. Press "Opslaan"

Expected result: the article is saved and text from the "Tekst" field is used as the summary.

Actual result: error message is shown with the text: "Artikel samenvatting niet ingevuld".

Workaround: in the browser inspector I can remove the data-readonlyhelper attribute of the editArticleText element and paste in my own summary, then I can save it. But this is quite cumbersome.
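
The expected behaviour could be sketched as a small fallback applied before validation. The function name `resolveSummary` and the field names `summary` and `text` are assumptions for illustration; the actual form fields in the code base may be named differently.

```javascript
// Hypothetical sketch: use the scraped summary when present,
// otherwise fall back to the manually entered crash text.
function resolveSummary(article) {
  const summary = (article.summary || '').trim();
  if (summary !== '') return summary;
  return (article.text || '').trim();
}

// Made-up form data: the summary could not be scraped, the text was typed in.
const formData = { summary: '', text: 'Tram and car collided on the Scheveningseweg.' };
console.log(resolveSummary(formData));
```

Validation would then only need to reject the article when both fields are empty.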

Extract location of crash from the article text automatically

Currently, the location of a crash is determined by a human who reads the media article text and finds a street or corner and a municipality. They then enter that into the map search area and put a marker on the map. That marker is saved as coordinates in the database.

It would be nice if these steps could be automated by using a language model to extract it automatically.

  1. Find street and municipality from the article text
  2. Convert street and municipality to coordinates
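
Step 2 could be sketched as below, assuming step 1 has already produced a street and a municipality. The choice of the public OpenStreetMap Nominatim search endpoint is an assumption for illustration; the issue does not name a geocoder, and the site's existing map search may use a different service. `buildGeocodeUrl` and `parseGeocodeResponse` are hypothetical names, and the sample response is made up.

```javascript
// Sketch: build a geocoding request URL for a street and municipality.
// Uses the Nominatim search endpoint as an assumed geocoder.
function buildGeocodeUrl(street, municipality) {
  const url = new URL('https://nominatim.openstreetmap.org/search');
  url.searchParams.set('q', `${street}, ${municipality}`);
  url.searchParams.set('format', 'json');
  url.searchParams.set('limit', '1');
  return url.toString();
}

// Turn the parsed JSON response (an array of matches with string
// "lat" and "lon" fields) into numeric coordinates, or null if no match.
function parseGeocodeResponse(results) {
  if (!Array.isArray(results) || results.length === 0) return null;
  const latitude = Number.parseFloat(results[0].lat);
  const longitude = Number.parseFloat(results[0].lon);
  if (Number.isNaN(latitude) || Number.isNaN(longitude)) return null;
  return { latitude, longitude };
}

const geocodeUrl = buildGeocodeUrl('Neherkade', 'Den Haag');
console.log(geocodeUrl);

// Made-up response for illustration; a real one would come from fetch(geocodeUrl).
const sampleResponse = [{ lat: '52.0', lon: '4.3' }];
console.log(parseGeocodeResponse(sampleResponse));
```

A human check of the resulting marker would probably still be useful, since street names can be ambiguous across municipalities.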
