thegreenwebfoundation / datasets

This project forked from openamee/datasets


Open datasets & methodologies for carbon emissions from different activities. Forked from OpenAMEE, and npm installable

License: MIT License


datasets's Introduction

Green Web Foundation API

In this repo you can find the source code for the API, and the checking code that the Green Web Foundation servers use to check what kind of power a domain uses.


Overview

Following Simon Brown's C4 model, this repo includes the API server code, along with the green check worker code in packages/greencheck.

API

Apps - API Server at api.thegreenwebfoundation.org

This repository contains the code served to you when you visit http://api.thegreenwebfoundation.org.

When requests come in, Symfony accepts and validates the request, and creates a job for Enqueue to service with a worker.


The greenweb API application runs at https://api.thegreenwebfoundation.org

It provides a backend for the browser extensions and for the website at https://www.thegreenwebfoundation.org

This needs:

  • an Enqueue adapter, such as fs for development and amqp for production
  • PHP 7.3
  • nginx
  • Redis, for the greencheck library
  • Ansible and SSH access to the server, for deploys

It currently runs on Symfony 5.x

To start development:

  • Clone the monorepo: git clone git@github.com:thegreenwebfoundation/thegreenwebfoundation.git
  • Configure .env.local (copy from .env) for a local MySQL database
  • Run composer install
  • Run bin/console server:run
  • Check the fixtures in packages/greencheck/src/TGWF/Fixtures to set up a fixture database

To deploy:

  • bin/deploy

To test locally:

Packages - Greencheck

In packages/greencheck is the library used for carrying out checks against the Green Web Foundation database. Workers take jobs from a RabbitMQ queue and call the greencheck code to return a result quickly, passing the result back, RPC-style, to the original calling code in the Symfony API server.
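The RPC-over-queue pattern described above can be sketched in JavaScript, with an in-memory queue standing in for RabbitMQ. `createQueue`, `startWorker` and `checkDomain` are illustrative names, not the real worker API:

```javascript
// Minimal sketch of the RPC-over-queue pattern, with an in-memory queue
// standing in for RabbitMQ.
function createQueue() {
  const handlers = []
  return {
    publish(job) { handlers.forEach((handle) => handle(job)) },
    subscribe(handler) { handlers.push(handler) },
  }
}

// stand-in for the real greencheck database lookup
function checkDomain(domain) {
  return { domain, green: domain.endsWith('.example') }
}

// the worker takes each job from the queue, runs the check, and replies
// via the job's replyTo callback, RPC-style
function startWorker(queue) {
  queue.subscribe((job) => job.replyTo(checkDomain(job.domain)))
}
```

The real setup does the same thing asynchronously across processes: the API server publishes a job and blocks briefly on the reply queue, so slow checks never block the web tier.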


Packages - public suffix

In packages/publicsuffix is a library that provides helpers for retrieving the public suffix of a domain name, based on the Mozilla Public Suffix List. It is used by the API server.
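As a toy illustration of what such a helper does (the real library consults the full Public Suffix List; this sketch hardcodes just two suffixes):

```javascript
// Toy public suffix lookup: find the longest known suffix that the
// domain ends with. A real implementation loads the full Mozilla list.
const SUFFIXES = ['co.uk', 'com']

function publicSuffix(domain) {
  const parts = domain.split('.')
  // try progressively shorter tails of the domain against the list
  for (let i = 0; i < parts.length; i++) {
    const candidate = parts.slice(i).join('.')
    if (SUFFIXES.includes(candidate)) return candidate
  }
  // fall back to the last label
  return parts[parts.length - 1]
}
```

Knowing the public suffix lets the checker reduce a hostname like www.bbc.co.uk to its registrable domain before looking it up.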

datasets's People

Contributors

amee, davidkeen, mamhoff, mrchrisadams, spatchcock

Forkers

mamhoff

datasets's Issues

Work out how to create multiple npm modules from a single monorepo

I'm not sure how we should best handle having a single repo, but multiple installable packages.

All the cool kids seem to be using LernaJS for managing stuff like this.

https://lerna.js.org/

I think this is the most popular tool, and it would save us a bunch of work here, but I'd appreciate some pointers.

Questions:

We don't need all these answered.

  • what `npm run <taskname>` scripts would we need?
  • what would a generated leaf dataset package (i.e. one with a calculation js file) look like using this?
  • what would the npm publishing process look like under Lerna?
  • how do we handle contributions, or use datasets from other repos?
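For reference, a minimal lerna.json for a packages/-style monorepo might look like the following. This is a sketch, not a tested setup; "independent" versioning lets each dataset package publish on its own cadence:

```json
{
  "packages": ["packages/*"],
  "version": "independent"
}
```

With that in place, `lerna publish` walks the packages/ directory and publishes each changed package to npm.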

Deal with `units` column in data.csv file

All of our data.csv files have a units column. However, it's unclear why it is there as the itemdef.csv files ALSO have unit and per_unit columns, duplicating the information in the data.csv file.

In most datasets, the units column in the data.csv file is completely empty. In others, it duplicates the information in the itemdef.csv file.

Here's a Ruby script that deletes the column if it's entirely unnecessary, or otherwise prints the distinct values it contains:

#!/usr/bin/env ruby

require 'csv'

data_csvs = Dir.glob('**/data.csv')

# A units column is unnecessary when every value is empty or a placeholder
def unnecessary_units_column?(csv)
  csv.all? do |row|
    row['units'].nil? ||
      row['units'].downcase == 'none' ||
      row['units'].downcase == 'dummy' ||
      row['units'].downcase == 'n/a' ||
      row['units'].downcase =~ / or /
  end
end

data_csvs.each do |file|
  begin
    csv = CSV.read(file, headers: true)
    if unnecessary_units_column?(csv)
      # Rewrite the file without the units column
      CSV.open(file, 'wb') do |new_csv|
        new_csv << csv.headers.reject { |h| h == 'units' }
        csv.each do |row|
          new_csv << row.reject { |k, _v| k == 'units' }.map { |_k, v| v }
        end
      end
    else
      # Otherwise, print the distinct units values for manual review
      puts "#{file}: #{csv.map { |r| r['units'] }.uniq}"
    end
  rescue => e
    puts "ERROR in #{file}: #{e}"
  end
end

Update the branch node creole files too

Martin's magic bash-fu worked on leaf node datasets, but we need a different invocation for the tree nodes, as datasets can be nested within other datasets.

Look over deprecated datasets and work out how to remove them

We have loads of deprecated datasets in here that are confusing, and that make choosing the right model for calculating things a real pain.

This issue is to remove them, or at least get a better idea of why they are there, and whether alternatives exist.

Name this

We've been chatting about how this is like a distributed, decentralised version of AMEE, as it's supposed to be less monolithic, and something that could work in a federated way.

D.A.M.E.E. ?

Also, it's worth making the readme clearer and the name more obvious.

Make note of how the default.js works and the unexpected idiosyncrasies

this is stream of consciousness stuff; see #13 for (slightly more) cogent thoughts on this.

Soz.

models we're looking at:

https://www.carbonkit.net/categories/Heating_US
https://www.carbonkit.net/categories/DEFRA_journey_based_flight_methodology

itemdef.csv

this lists the inputs for default.js, and something like the type you pass in.

if it's a drill down, it's an argument

if it's not a data item, or it has a default, it's an argument we need to pass in

the entire default.js looks like it implicitly returns its values
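Those rules can be sketched as a small classifier. The field names (isDrilldown, dataItem) are assumptions, matching the pseudocode in #13:

```javascript
// Sketch: split itemdef.csv rows into caller-supplied arguments and
// values drilled out of the dataset. Field names are assumptions.
function classifyDefinitions(itemDefinitions) {
  const args = []
  const drilled = []
  for (const def of itemDefinitions) {
    // a drill down, or anything that is not a data item, must be passed in
    if (def.isDrilldown || !def.dataItem) {
      args.push(def)
    } else {
      // plain data items can be looked up from the dataset itself
      drilled.push(def)
    }
  }
  return { args, drilled }
}
```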

Set up instructions for running calcs in a single dataset

Martin and I had some success making a little npm module for the greenhouse gases dataset, with the aim of being able to require it from other modules, for calculating emissions from deliveries and other activities.

https://github.com/mamhoff/ameeless-planet-greenhousegases-gwp

We agreed it would make sense to work on it as a fork in the Green Web Foundation, and to find a way to break this repo into a series of smaller npm modules that can be installed separately.

So, when Bret Victor writes in What can a technologist do about climate change? A personal view:

What if there were an “npm” for scientific models?

And mentions some sample code like:

var lighting = require('residential-lighting-energy-model');

var energyPerYear = lighting.getEnergyPerYear({
    zipCode: 94110,
    houseSize: "small",
    occupants: 3,
    fluorescentFraction: 0.75,
});

This would, in theory, be possible. You could do something like:

npm install @tgwf/datasets-delivery-emissions-model

Then you'd be able to call it like the example above.

We're starting with the gases dataset, because different models emit data as different gases, and we want to be able to convert emissions to carbon dioxide equivalent.
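That conversion can be sketched with a hypothetical toCO2e helper, using the published IPCC AR5 100-year global warming potentials for illustration:

```javascript
// Sketch: convert emissions of different gases to CO2 equivalent using
// global warming potential (GWP) factors. Values are illustrative IPCC
// AR5 100-year GWPs; the real dataset would supply them.
const GWP_100 = {
  co2: 1,
  ch4: 28,
  n2o: 265,
}

function toCO2e(gas, kg) {
  const factor = GWP_100[gas]
  if (factor === undefined) throw new Error(`unknown gas: ${gas}`)
  // kg of gas × GWP factor = kg CO2 equivalent
  return kg * factor
}
```

So a model that reports 2 kg of methane would contribute 56 kg CO2e, making its output comparable with models that report CO2 directly.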

Todo

  • get tests set up for a single dataset
  • add a script to convert the itemdef csv to an actual csv
  • move in a script to precompile the csv file to JSON, removing the need to install any csv deps
  • document the assumed plan for converting other datasets
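The "precompile the csv file to JSON" step from the todo list could be sketched like this. A minimal parser that assumes no quoted fields; a real build step would use a proper CSV library at publish time only:

```javascript
// Sketch: turn a data.csv file into an array of objects at build time,
// so installing the published package needs no CSV dependency.
// Minimal parser; assumes no quoted or escaped fields.
function csvToJson(csvText) {
  const [headerLine, ...lines] = csvText.trim().split('\n')
  const headers = headerLine.split(',')
  return lines.map((line) => {
    const values = line.split(',')
    return Object.fromEntries(headers.map((header, i) => [header, values[i]]))
  })
}
```

At publish time this would run once per dataset, writing a data.json next to each data.csv for the package to require.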

Create index.js from default.js

Because the default.js files depend on context, we need to set that context up. Using data from the respective itemdef.csv file, we can generate that context with a templating system like Handlebars.

Here's how we can imagine that happening:

// load item definitions
const itemDefinitions = convertData.getItemDefinitions(currentDirectory)

// arguments are the definitions the caller has to supply
const args = itemDefinitions.filter((definition) => {
  if (!definition.dataItem) return true
  if (definition.isDrilldown) return true
  return false
})

// everything else can be drilled out of the dataset itself
const variablesThatNeedToBeThere = itemDefinitions.filter(
  (definition) => !args.includes(definition)
)

// Pseudocode for Mustache

// e.g. export function run({ gas = undefined, emissionRate = undefined })
export function run({
  {{#args}}
  {{path}}: {{default}},
  {{/args}}
}) {
  {{#variablesThatNeedToBeThere}}
  const {{path}} = drillDataFor('{{path}}', data, args)
  {{/variablesThatNeedToBeThere}}

  // Dump the default.js in here so we have it as a string

  const returnValue = {{ load default.js }}
  return returnValue;
}
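Equivalently, skipping Mustache, the generation step could be sketched with plain string building. Function and field names here are assumptions, and drillDataFor is the hypothetical lookup helper from the template above:

```javascript
// Sketch: generate the source of an index.js for one dataset.
// args and drilled follow the argument/data-item split; defaultJsSource
// is the dataset's default.js loaded as a string.
function generateIndexJs(args, drilled, defaultJsSource) {
  // caller-supplied arguments become destructured parameters with defaults
  const params = args
    .map((arg) => `${arg.path} = ${JSON.stringify(arg.default)}`)
    .join(', ')
  // everything else is drilled out of the bundled data table
  const lookups = drilled
    .map((v) => `  const ${v.path} = drillDataFor('${v.path}', data, args)`)
    .join('\n')
  return [
    `export function run({ ${params} }) {`,
    lookups,
    `  return (${defaultJsSource})`,
    `}`,
  ].join('\n')
}
```

The generated file is what a leaf dataset package would ship as its entry point, so callers never see the templating machinery.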
