
Introduction

Green Web Foundation API

In this repo you can find the source code for the API and the checking code that the Green Web Foundation servers use to determine whether a domain runs on green power.


Overview

Following Simon Brown's C4 model, this repo includes the API server code, along with the green check worker code in packages/greencheck.


Apps - API Server at api.thegreenwebfoundation.org

This repository contains the code served to you when you visit http://api.thegreenwebfoundation.org.

When requests come in, Symfony accepts and validates the request, and creates a job for enqueue to service with a worker.


The greenweb API application runs at https://api.thegreenwebfoundation.org.

It provides a backend for the browser extensions and the website at https://www.thegreenwebfoundation.org.

This needs:

  • an enqueue adapter, such as fs for development or amqp for production
  • php 7.3
  • nginx
  • redis for the greencheck library
  • ansible and ssh access to the server for deploys

Currently runs on Symfony 5.x.

To start development:

  • Clone the monorepo: git clone git@github.com:thegreenwebfoundation/thegreenwebfoundation.git
  • Configure .env.local (copy from .env) for a local mysql database - see the example below
  • composer install
  • bin/console server:run
  • Check the fixtures in packages/greencheck/src/TGWF/Fixtures to set up a fixture database
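
For illustration, a minimal .env.local might look like the following - the credentials, database name, and queue path here are placeholders, not values from this repo:

# .env.local - local overrides, not committed to the repo
DATABASE_URL=mysql://db_user:db_password@127.0.0.1:3306/greencheck
# 'fs' enqueue adapter for development; amqp would be used in production
ENQUEUE_DSN=file://var/queue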

To deploy:

  • bin/deploy

To test locally:

Packages - Greencheck

In packages/greencheck is the library used for carrying out checks against the Green Web Foundation database. Workers take jobs from a RabbitMQ queue and call the greencheck code to compute the result quickly, before passing the result back, RPC-style, to the original calling code in the Symfony API server.
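
The workers in this repo are PHP, but the queue handling follows the standard RabbitMQ RPC pattern. As a minimal sketch of that pattern in Python with the pika client - the queue name and the greencheck stub are assumptions for illustration, not this repo's code:

import pika

def greencheck(domain):
    # placeholder: the real library looks the domain up in the green domains database
    return f"{domain}: unknown"

def on_request(ch, method, props, body):
    result = greencheck(body.decode())
    # send the result back, RPC-style, to the queue named in reply_to
    ch.basic_publish(
        exchange="",
        routing_key=props.reply_to,
        properties=pika.BasicProperties(correlation_id=props.correlation_id),
        body=result.encode(),
    )
    ch.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="greencheck")  # hypothetical queue name
channel.basic_consume(queue="greencheck", on_message_callback=on_request)
channel.start_consuming()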


Packages - public suffix

In packages/publicsuffix is a library that provides helpers for retrieving the public suffix of a domain name, based on the Mozilla Public Suffix List. It is used by the API server.
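
The package here is PHP, but to show the kind of helper it provides, this is the equivalent lookup with Python's tldextract library, which also works from the Public Suffix List:

import tldextract

ext = tldextract.extract("forums.bbc.co.uk")
print(ext.suffix)             # "co.uk" - the public suffix
print(ext.registered_domain)  # "bbc.co.uk" - the suffix plus one label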

Contributors

fershad, hanopcan, mrchrisadams, ross-spencer


Issues

Extending to present more green credentials

Good work, Chris! While I totally understand that this is meant for presenting metadata for websites (like robots.txt) that can be crawled, I wonder if there's an opportunity to extend the idea to also present more green credentials for an organisation, such as the providers they choose to use.

For example, to show that the organisation chose a green energy supplier, you might have something like:

[electricity]
www.ecotricity.co.uk

[gas]
www.ecotricity.co.uk

Allowing for supporting documents to existing bodies requesting relevant data

There seems to be no discernible pattern for finding structured information on company websites about their actions relating to environmental or climate responsibility, even when they have invested significant amounts of time making shiny reports like Google has here:

https://storage.googleapis.com/gweb-sustainability.appspot.com/pdf/Google_2018-Environmental-Report.pdf

Or when Scout24 does something worth sharing:

https://csrbericht.scout24.com/wp-content/uploads/2018/04/180418_Scout24_GRI_Chapter_Environment.pdf

Or when companies share detailed info about their DC use:
https://storage.googleapis.com/gweb-sustainability.appspot.com/pdf/24x7-carbon-free-energy-data-centers.pdf

Or when they've made submissions to groups like the CDP, as Google did here:

https://storage.googleapis.com/gweb-environment.appspot.com/pdf/alphabet-2017-cdp-climate-change-response.pdf

Or when they've made filings in line with the SASB, where Etsy lists the steps they're taking:

See page 25 in this SEC filing. This stuff is really hard to find!

https://investors.etsy.com/financials/sec-filings/sec-filings-details/default.aspx?FilingId=13261228

Someone who's checking the information in a carbon.txt can reasonably be assumed to care about this too, so it's worth making it easier to find.

Define a set of accepted values for the "service" key

The current carbon.txt syntax allows upstream providers to be listed as an object with the keys domain and service. For example:

{ domain="cloud.google.com", service = "infrastructure" },

Currently, the service key is not used/referenced anywhere. However, it is easy to imagine a future where it might come in handy for reporting, checks, or other instances where data might be dissected.

The purpose of this issue is to define a set of accepted values for the service key.
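
As a purely hypothetical strawman - assuming the provider objects sit in a list, as the trailing comma above suggests - a set of accepted values might be used like this; none of these values are in the spec yet:

[upstream]
providers = [
    { domain = "cloud.google.com", service = "infrastructure" },
    { domain = "cdn.example.com", service = "cdn" },
    { domain = "buckets.example.com", service = "object-storage" },
]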

Support a way to confirm IP addresses without exposing them, to avoid DDoS attacks

As Matthew mentions, the web's shift from relying on transit to relying on CDNs is making it harder to use the public-facing IP of a site to tell whether it is using green power or not:

https://mobile.twitter.com/dracos/status/1268915142621835269

An example is Cloudflare: when they were offsetting for their North American network operations, it made things unclear when they were in the middle handling bandwidth. This was resolved somewhat when they switched over to accounting for the emissions in all their regions with RECs and so on, but as more people use them, it becomes much harder to tell if the origin server was running on green infra.

Listing the real IP might work in a carbon.txt file, but one of the key ideas behind DDoS protection, or using a CDN, is not exposing the origin server to attack; if you know the IP address, it's possible to target that server directly.

Define a way for provider locations to be specified

There are cases when one might use a provider's service in just one or a couple of regions. For example, you might use Object Storage services from Provider X, but only provision those services in that provider's us-east location.

In the current carbon.txt specification, there is no way to capture this detail. To provide more granularity and transparency through carbon.txt, there should be a method through which implementors can specify the provider regions they use as part of their service.
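
One hypothetical shape for this, reusing the syntax above - the region key and its value are illustrative, not part of the spec:

[upstream]
providers = [
    { domain = "cloud.google.com", service = "object-storage", region = "us-east1" },
]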

Work out how to make it discoverable - `well-known`, TXT records or root domains

We have loads of prior work to look to for establishing a convention for this.

  • The .well-known convention has been around for ages, and stops us polluting the root namespace

  • Google uses DNS TXT records as a way to tie information to a domain as well.

  • Amazon and others use email address conventions to check SSL certificates, by sending an email to confirm that information ought to be associated with a domain. This page outlines how it works.
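
For a feel of the DNS TXT route, a lookup and a possible record value might look like this - the domain and record value are illustrative:

$ dig +short TXT example.com
"carbon-txt=https://example.com/carbon.txt"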

What would this look like?

What about individual websites?

This issue has been created to track future discussion on how the carbon.txt specification can be adopted by individual website owners for their own sites.

Document flow for checking a carbon.txt file

The flow is as follows (a sketch in code follows the list):

  1. Check the domain name is a valid one.
  2. Check if there is a carbon-txt DNS TXT record for the given domain.
  3. Perform an HTTP request at https://domain.com/carbon.txt, OR at the override URL given as the value in the DNS TXT lookup.
  4. If there is a valid 200 response and a parseable file, parse the file.
  5. If there is no valid 200/OK response at https://domain.com/carbon.txt (i.e. a 404 or 403), check the HTTP response for a Via header naming a new domain to check.
  6. Repeat steps 1 through 5 until we end up with a 200 response with a parseable carbon.txt payload, or a failed request (i.e. 40x, 50x) with no HTTP Via header.
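
A minimal sketch of that loop in Python, using the dnspython and requests libraries - step 1's domain validation is omitted, and Via parsing is simplified:

import dns.exception
import dns.resolver
import requests

def find_carbon_txt(domain, max_hops=5):
    for _ in range(max_hops):
        # step 3's default location, unless a TXT record overrides it (step 2)
        url = f"https://{domain}/carbon.txt"
        try:
            for record in dns.resolver.resolve(domain, "TXT"):
                text = record.to_text().strip('"')
                if text.startswith("carbon-txt="):
                    url = text.split("=", 1)[1]
        except dns.exception.DNSException:
            pass  # no TXT override; use the default URL
        response = requests.get(url)
        if response.ok:
            return response.text  # step 4: hand off to the carbon.txt parser
        # step 5: follow a Via header to a new domain, if one is present
        via = response.headers.get("Via")
        if via is None:
            return None
        domain = via.split()[-1]  # e.g. "1.1 alternative-domain.com"
    return None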

Why do it this way?

This flow is designed to allow CDNs and managed service providers to serve information in a default carbon.txt file, whilst allowing "downstream" providers to share their own, more detailed information if need be.

Why support the carbon.txt DNS TXT record?

Supporting the DNS lookup allows an organisation that owns or operates multiple domains to point them all at a single URL to maintain.

If you served traffic from a domain like cdn-domain.com, you would add a TXT record to cdn-domain.com with the following content:

carbon-txt=https://actual-domain.com/carbon.txt

This sets an override URL, allowing multiple domains to point to the one carbon.txt file for an organisation.

The "override URL" also allows for organisations that prefer to serve their file from a .well-known directory to do so:

carbon-txt=https://actual-domain.com/.well-known/carbon.txt

This allows folks to support the .well-known convention of storing files in a clearly identified place where it makes sense to do so, without requiring it of people who do not know what a .well-known directory is, or who do not have control over what may be written to the .well-known directory on a server.

Why use the Via header?

Consider the case where managed-service-provider.com is hosting customer-a.com's website.

The managed service provider may be offering a CDN or managed hosting service, but they may not have control over the customer-a.com domain. They may not have, or want, direct control over what a downstream user is sharing at a given URL. However, because they are offering some service "in front" of customer-a's website, and serving it over a secure connection, they are able to add headers to HTTP responses.

The HTTP Via header exists specifically to serve this purpose, and provides a well-specified way to pass along the domain of the organisation providing a managed service, when that domain is different.

The link above outlines the spec, but for convenience you would add a header looking like so:

Via: 1.1 alternative-domain.com
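
For example, a provider using nginx could add the header with a single directive in their server block - the domain here is illustrative:

add_header Via "1.1 alternative-domain.com";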

Why use domain/carbon.txt as the path?

Defaulting to a root carbon.txt makes it possible to implement a carbon.txt file without needing to know about .well-known directories, which by convention are normally hidden files. Having a single default place to look avoids needing to support a hierarchy of potential locations, and precedence rules for where to look - there is either one default place to make an HTTP request, OR the single override.

Create examples in a directory for common use cases

Ideally we'd have some examples to show what this might look like for common stacks, so people who run them can understand how this would look, and to make it easier to adopt.

Own sites, and small-medium hosting company

  • Own static site on own server -> hosting co -> DC provider -> Energy Co -> Carbon Credits
  • Own wordpress.org on own server -> hosting co -> DC provider -> Energy Co -> Carbon Credits
  • Managed hosting of wordpress.org -> DC Provider -> Energy Co -> Carbon Credits

Own sites, and cloud giant (Digital Ocean, M$, GCP, AWS, etc.)

The main difference here is that the bigger cloud companies tend to act as both hosting provider and DC operator, as they often own their own DCs. They often have more complicated products in their portfolio, like hosted database as a service, object storage, and so on (smaller ones have these too, but it's less common).

  • Own static site on own server -> hosting co -> DC provider -> Energy Co -> Carbon Credits
  • Own dynamic site on own server -> hosting co -> DC provider -> Energy Co -> Carbon Credits
  • Own static site on object storage -> hosting co -> DC provider -> Energy Co -> Carbon Credits

Larger hosting company, providing carbon.txt data for all their customer's sites

  • hosted wordpress.com style site -> xxx -> Energy Co -> Carbon Credits

To think about

  • PaaS
  • Hosted static sites like Netlify etc. (would it be different to the static-site-on-own-server case? Not sure)
  • Distributed web cases (this sounds really hard - Dat acts like a kind of collaborative CDN, for example, so in theory every peer would need to be able to declare where its own power came from…)

Define how to include comments

I might have missed it, but there doesn't seem to be any mention of how to include human-readable comments in the carbon.txt format. Ignoring anything from a "#" char to the end of line would be fairly standard, and is what robots.txt uses.
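
If that convention were adopted, a commented carbon.txt might look like this - the contents are illustrative:

# green credentials for example.com; comments run from "#" to end of line
[upstream]
providers = [
    { domain = "cloud.google.com", service = "infrastructure" }, # our main host
]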

Supporting multiple domains

Websites rely on content from multiple domains. How do we handle this?

We could do something like:

6 out of 8 domains run on green power on this site.

use robots.txt-style 'Name: value' formatting

I think this is a fantastic approach and well worth doing. One thing that would improve it, IMO, would be to use a "Header: value" style more similar to robots.txt.

So for instance, instead of

[upstream]
krystal.co.uk

it'd use something like

[upstream]
Domain: krystal.co.uk

Or other kinds of identifier:

[upstream]
Name: Ecotricity Group Ltd

This gives more room for extensibility in future, where new forms of addressing or relationships can be supported.
