uvacoder / benfords-law

This project forked from jasonlong/benfords-law

Experimenting with Benford's Law

Home Page: http://testingbenfordslaw.com

Ruby 5.50% CSS 24.65% JavaScript 19.86% CoffeeScript 17.22% HTML 17.31% SCSS 15.46%

benfords-law's Introduction

Testing Benford's Law

This is a simple experiment by Jason Long and Bryce Thornton to test how many real-life, publicly available datasets satisfy Benford's Law.

Contributing Datasets

If you find this to be an interesting idea, we'd encourage you to help add more datasets to the site. We've intentionally kept the site as simple and lightweight as possible. There is no real backend - the data has been crunched in advance and the results are simply entered into JSON files.

To contribute a new dataset, you'll need to do two things:

1. Add the dataset name to the JSON index file

The js/datasets/index.json file is a flat set of key/value pairs. Each key is the dataset's identifier and each value is its display name:

{
  "twitter-users-by-followers-count": "Twitter users by followers count",
  "distance-of-stars-from-earth-in-light-years": "Distance of stars from Earth in light years",
  "loan-amounts-on-kiva-org": "Loan amounts on kiva.org",
  "total-number-of-print-materials-in-us-libraries": "Total number of print materials in US libraries",
  "population-of-spanish-cities": "Population of Spanish cities"
}
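
The keys in the listing above look like lowercased, hyphenated versions of the display names. As a minimal sketch under that assumption, the hypothetical `slugify` helper below (not part of the project) derives a key from a name and adds the pair to an in-memory copy of the index:

```ruby
require 'json'

# Hypothetical helper (not part of the project): derive an index.json key
# from a human-readable dataset name.
def slugify(name)
  name.downcase
      .gsub(/[^a-z0-9]+/, "-")  # collapse every run of other characters into one hyphen
      .gsub(/\A-+|-+\z/, "")    # trim leading and trailing hyphens
end

# In-memory stand-in for the parsed contents of js/datasets/index.json.
index = { "population-of-spanish-cities" => "Population of Spanish cities" }

name = "Loan amounts on kiva.org"
index[slugify(name)] = name

puts JSON.pretty_generate(index)
```

Check the generated key by eye before committing, since the project does not document a required naming rule.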

2. Create a dataset JSON file

Add your new file in the /js/datasets/ directory with a name that matches the key used in step 1. The format looks like this:

{
  "values": {
    "1": 32.62,
    "2": 16.66,
    "3": 11.80,
    "4": 9.26,
    "5": 7.63,
    "6": 6.55,
    "7": 5.76,
    "8": 5.14,
    "9": 4.56
  },
  "num_records": "38,670,514",
  "min_value": "1",
  "max_value": "4,706,631",
  "source": "http://www.infochimps.com/datasets/twitter-census-twitter-users-by-friends-count"
}

It's important to include the source of the data used so that others can verify and reproduce the results.
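
Before submitting a file, a quick sanity check can catch a missing field or mistyped percentage. The sketch below is not part of the project: `dataset_problems` is a hypothetical helper, and the 0.5 tolerance on the percentage total is an assumed value chosen to allow for rounding.

```ruby
require 'json'

REQUIRED_KEYS = %w[values num_records min_value max_value source].freeze

# Hypothetical checker (not part of the project). Returns a list of problems
# found in a parsed dataset hash; an empty list means nothing was flagged.
def dataset_problems(data, tolerance = 0.5)
  problems = REQUIRED_KEYS.reject { |key| data.key?(key) }
                          .map { |key| "missing key: #{key}" }
  values = data.fetch("values", {})
  unless values.keys.sort == ("1".."9").to_a
    problems << "values must contain exactly the keys 1 through 9"
  end
  total = values.values.sum
  if (total - 100).abs > tolerance  # assumed tolerance for rounding error
    problems << "percentages sum to #{total.round(2)}, expected about 100"
  end
  problems
end

# The sample file shown above.
sample = JSON.parse(<<~EOS)
  {
    "values": {"1": 32.62, "2": 16.66, "3": 11.80, "4": 9.26, "5": 7.63,
               "6": 6.55, "7": 5.76, "8": 5.14, "9": 4.56},
    "num_records": "38,670,514",
    "min_value": "1",
    "max_value": "4,706,631",
    "source": "http://www.infochimps.com/datasets/twitter-census-twitter-users-by-friends-count"
  }
EOS

puts dataset_problems(sample).inspect  # prints []
```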

Crunching the data

Generating Benford stats is a fairly straightforward process. We've made a simple Ruby class for you to use if you'd like.

First, grab a copy of the class from here: https://gist.github.com/1044174

Second, include the class in your script like so:

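# Assumes the gist was saved as benford_counter.rb next to this script.
require_relative 'benford_counter'
require 'csv'

counter = BenfordCounter.new

# Pass the tenth column (index 9) of each row to the counter.
CSV.foreach("spain.txt") do |row|
  counter.count(row[9])
end

counter.results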
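
For readers who want to see what such a counter does, here is a minimal sketch. It is not the `BenfordCounter` class from the gist: `LeadingDigitTally` and `benford_expected` are hypothetical names, and taking the first non-zero digit of the value's text as its leading digit is an assumption. The expected percentages come from Benford's Law, which gives the probability of leading digit d as log10(1 + 1/d).

```ruby
# Hypothetical stand-in, NOT the BenfordCounter class from the gist:
# tallies leading digits and reports them as percentages.
class LeadingDigitTally
  def initialize
    @counts = Hash.new(0)
    @total = 0
  end

  # Accepts strings or numbers; values with no non-zero digit are ignored.
  def count(value)
    digit = value.to_s[/[1-9]/]  # first non-zero digit, or nil
    return if digit.nil?
    @counts[digit] += 1
    @total += 1
  end

  # Percentage of counted values per leading digit, rounded to two decimals.
  def percentages
    ("1".."9").map do |d|
      [d, @total.zero? ? 0.0 : (100.0 * @counts[d] / @total).round(2)]
    end.to_h
  end
end

# Percentages predicted by Benford's Law: 100 * log10(1 + 1/d).
def benford_expected
  (1..9).map { |d| [d.to_s, (100 * Math.log10(1 + 1.0 / d)).round(2)] }.to_h
end

tally = LeadingDigitTally.new
[120, 13, 2_500, "0.034", 19, 7, 1_001].each { |v| tally.count(v) }
observed = tally.percentages

benford_expected.each do |digit, expected|
  puts format("%s  observed %6.2f  expected %6.2f", digit, observed[digit], expected)
end
```

The `percentages` hash has the same shape as the "values" object in a dataset JSON file, so it can be compared directly against the expected figures.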

Additional Tools

fweez contributed the Linux file size dataset and created a Python script for tallying file sizes in a directory.
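
The same idea can be sketched in Ruby. This is not fweez's script: `leading_digit_counts` is a hypothetical function that walks a directory tree and counts the leading digit of each regular file's size in bytes, skipping empty files. The demo runs on a throwaway directory so it works anywhere.

```ruby
require 'find'
require 'tmpdir'

# Hypothetical sketch (not fweez's Python script): count the leading digits
# of file sizes, in bytes, for every regular file under a directory.
def leading_digit_counts(dir)
  counts = Hash.new(0)
  Find.find(dir) do |path|
    next unless File.file?(path)
    digit = File.size(path).to_s[/[1-9]/]  # nil for empty files
    counts[digit] += 1 if digit
  end
  counts
end

# Demo on a temporary directory that is removed when the block ends.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "a.txt"), "x" * 120)  # 120 bytes, leading digit 1
  File.write(File.join(dir, "b.txt"), "x" * 35)   # 35 bytes, leading digit 3
  File.write(File.join(dir, "empty.txt"), "")     # 0 bytes, skipped
  p leading_digit_counts(dir)
end
```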

Updating JavaScript and CSS

We're using CoffeeScript for the JavaScript and Sass/Compass for the CSS.

Once CoffeeScript is installed (see the CoffeeScript docs), run this command from the project root to watch for changes and compile them:

coffee --watch -o js/ --compile js/coffee/*.coffee

Note that the only file that should be edited is /js/coffee/app.coffee. The /js/app.js file is generated by CoffeeScript (the -o js/ option in the command above sets the output directory), so do not edit it by hand.

To make changes to the CSS, you need to install Sass and Compass (see the Compass docs). Then edit /css/sass/screen.scss. To watch for changes and compile them, run this command from the project root:

compass watch

To compile a production-ready compressed version:

compass compile --output-style compressed --force

benfords-law's People

Contributors

blackant, brycethornton, eliardocosta, ferhatelmas, fweez, jasonlong, jmcastagnetto, jontas, jywarren, nyanag, prakashk, richq, sparksmaths, yipeng
