gem_survey's Introduction

Gem survey

Hi, I would like to get some statistics about gem installations so that I can better understand how we should make performance improvements to RubyGems.

QUICK START

Do this once:

$ curl https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey.rb | ruby

Then in each project do this:

$ curl https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey.rb | bundle exec ruby

Or run it all at once:

$ curl https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey_all.rb | bundle exec ruby

Not so quick start

This script gathers information about the gems you have installed, along with your Ruby version, RubyGems version, and operating system, and uploads it to a Google form anonymously, but with a mostly unique ID. I've outlined exactly what information is collected, and why, below.

I would like to collect system-wide information as well as per-project information.

Running the script

You can run via curl or wget as below, or just download the file and run it directly. It only depends on code in stdlib, so you shouldn't need to install anything.

System wide statistics

For system wide statistics, run the script like this:

wget:

$ wget -qO- https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey.rb | ruby 

curl:

$ curl https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey.rb | ruby

Per project stats

For per-project statistics, run the script like this:

wget:

$ wget -qO- https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey.rb | bundle exec ruby 

curl:

$ curl https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey.rb | bundle exec ruby

What information is collected?

Here is a table of what information is collected and why:

| Name | Description / Reason |
| --- | --- |
| ID | A mostly unique ID: a SHA256 hash of your hostname, IP address, time zone, and home directory. This field helps show how many projects each person has and helps weed out duplicate data. |
| BUNDLER | A SHA256 hash of the project directory, if the project uses Bundler. This field helps differentiate system-wide statistics from per-project Bundler statistics. It also helps remove duplicate per-project records. |
| GEM_COUNT | The number of gems available for activation |
| ENGINE | The Ruby implementation that you're using |
| RUBY_VERSION | The version of Ruby that you're using |
| ENGINE_VERSION | The version of the engine that you're using |
| HOST_OS | Your operating system |
| FILE_COUNT_MIN | The fewest files in a gem specification |
| FILE_COUNT_MAX | The most files in a gem specification |
| FILE_COUNT_MEDIAN | The median number of files per gem specification |
| FILE_COUNT_MEAN | The mean number of files per gem specification |
| FILE_COUNT_STDDEV | The standard deviation of the number of files per gem specification |
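
The GEM_COUNT and FILE_COUNT_* fields are plain summary statistics. Here is a minimal sketch of how they could be computed, assuming the counts come from `spec.files` on each installed specification. The helper name `file_count_stats` and the use of the population standard deviation are assumptions for illustration, not taken from survey.rb.

```ruby
# Sketch only: summary statistics over files-per-gem counts.
# `file_count_stats` is a hypothetical helper, not part of survey.rb.
def file_count_stats(counts)
  sorted = counts.sort
  n      = sorted.length
  mean   = sorted.sum.to_f / n
  mid    = n / 2
  median = n.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
  # Population standard deviation (assumption; the script may use the sample form).
  variance = sorted.sum { |c| (c - mean)**2 } / n
  { min: sorted.first, max: sorted.last, median: median,
    mean: mean, stddev: Math.sqrt(variance) }
end

# Gem::Specification is enumerable over the specifications RubyGems can see.
counts = Gem::Specification.map { |spec| spec.files.length }
puts "GEM_COUNT=#{counts.length}"
unless counts.empty?
  file_count_stats(counts).each do |name, value|
    puts "FILE_COUNT_#{name.upcase}=#{value}"
  end
end
```

Run under plain `ruby`, this looks at every gem RubyGems can see. Run under `bundle exec ruby`, it should only see the gems in the bundle, which is what separates the system-wide numbers from the per-project ones.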

This data will be posted to a Google form. I don't have access to access logs, so I shouldn't be able to tell who posted what data. I'll only have the data listed above.
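
The ID and BUNDLER fields are hashes, so the raw hostname or directory name is not what gets uploaded. Here is a minimal sketch of how such values could be derived. The helper names `survey_id` and `bundler_id`, the newline join, and the use of `BUNDLE_GEMFILE` to detect Bundler are all assumptions for illustration, not taken from survey.rb.

```ruby
require "digest"

# Sketch only: hash the identifying inputs together into one hex digest.
# The join format is an assumption; survey.rb may combine them differently.
def survey_id(hostname, ip, time_zone, home)
  Digest::SHA256.hexdigest([hostname, ip, time_zone, home].join("\n"))
end

# Sketch only: hash the project directory when running under Bundler.
# `bundle exec` sets BUNDLE_GEMFILE; using it as the signal is an assumption.
def bundler_id(project_dir, env = ENV)
  return nil unless env["BUNDLE_GEMFILE"]
  Digest::SHA256.hexdigest(project_dir)
end

# Example values only; a real run would use the machine's own values.
id = survey_id("example-host.local", "192.0.2.10", "UTC", "/home/example")
puts "ID=#{id}"
puts "BUNDLER=#{bundler_id(Dir.pwd).inspect}"
```

The same machine always produces the same ID, which is what lets duplicate submissions be weeded out.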

What will this data be used for?

I would like to use this data to determine how best to speed up RubyGems. My goal is to add different types of caches to RubyGems, but the type of cache depends on the usage. If a particular optimization only helps people who have thousands of gems, but most people only have hundreds, then maybe the optimization isn't worthwhile.

I may be able to backport these optimizations to older versions of RubyGems (by using a gem). But that would depend on the usage.

Finally, I think the things we can do to speed up RubyGems could also be used to speed up Bundler by providing the right APIs.

gem_survey's People

Contributors

baweaver, tenderlove

gem_survey's Issues

failure on IPSocket.getaddress

Hi! I'm getting a failure with the following output:

-:53:in `getaddress': getaddrinfo: nodename nor servname provided, or not known (SocketError)
    from -:53:in `<main>'

in irb:

2.2.0 :001 > require 'socket'
 => true
2.2.0 :002 > Socket.gethostname
 => "tmm-mbp-014.domain"
2.2.0 :003 > IPSocket.getaddress(Socket.gethostname)
SocketError: getaddrinfo: nodename nor servname provided, or not known
    from (irb):3:in `getaddress'
    from (irb):3
    from /Users/timwade/.rvm/rubies/ruby-2.2.0/bin/irb:11:in `<main>'

I'd love to find a way to submit my results.
Thanks!
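
One possible workaround, sketched under the assumption that the address is only used as an input to the ID hash: rescue the `SocketError` and fall back to a placeholder. The helper name `address_or_fallback` and the placeholder value are hypothetical, not part of survey.rb.

```ruby
require "socket"

# Sketch only: resolve the hostname, or return a placeholder if it
# cannot be resolved (as in the SocketError shown above).
def address_or_fallback(hostname, fallback = "0.0.0.0")
  IPSocket.getaddress(hostname)
rescue SocketError
  fallback
end

puts address_or_fallback(Socket.gethostname)
```

Machines that hit the fallback would all contribute the same address to the hash, so their IDs would be a little less unique, but the rest of the data could still be submitted.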

DONE

Just sent system-wide statistics, plus statistics for my three projects. Thanks for trying to make the Ruby world a better place :)

OpenSSL verification error

While trying to report the data using the script, I get:

$ curl https://raw.githubusercontent.com/tenderlove/gem_survey/master/survey.rb | ruby
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2932  100  2932    0     0  20644      0 --:--:-- --:--:-- --:--:-- 20794
/Users/janhab/.rvm/rubies/ruby-2.1.6/lib/ruby/2.1.0/net/http.rb:923:in `connect': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (OpenSSL::SSL::SSLError)
    from /Users/janhab/.rvm/rubies/ruby-2.1.6/lib/ruby/2.1.0/net/http.rb:923:in `block in connect'
    from /Users/janhab/.rvm/rubies/ruby-2.1.6/lib/ruby/2.1.0/timeout.rb:76:in `timeout'
    from /Users/janhab/.rvm/rubies/ruby-2.1.6/lib/ruby/2.1.0/net/http.rb:923:in `connect'
    from /Users/janhab/.rvm/rubies/ruby-2.1.6/lib/ruby/2.1.0/net/http.rb:863:in `do_start'
    from /Users/janhab/.rvm/rubies/ruby-2.1.6/lib/ruby/2.1.0/net/http.rb:852:in `start'
    from /Users/janhab/.rvm/rubies/ruby-2.1.6/lib/ruby/2.1.0/net/http.rb:1375:in `request'
    from -:120:in `<main>'

Operating system: OSX Yosemite

I use RVM at work and rbenv at home.
I get the same error on both systems.

I'm afraid it is somehow an issue with outdated certificates.
Possible solutions can be found here: http://railsapps.github.io/openssl-certificate-verify-failed.html

I just wanted to report this because I may not be the only one running into it.
Many people might try the script and give up => useful survey data lost :-(
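
To check which certificate store a given Ruby is actually using, a small diagnostic like the following may help. It is only a sketch: it prints the OpenSSL defaults and honors the `SSL_CERT_FILE` and `SSL_CERT_DIR` environment variables, and it does not fix anything on its own.

```ruby
require "openssl"

# Where this Ruby's OpenSSL looks for trusted CA certificates.
# Environment overrides take precedence over the compiled-in defaults.
cert_file = ENV["SSL_CERT_FILE"] || OpenSSL::X509::DEFAULT_CERT_FILE
cert_dir  = ENV["SSL_CERT_DIR"]  || OpenSSL::X509::DEFAULT_CERT_DIR

puts "OpenSSL version: #{OpenSSL::OPENSSL_VERSION}"
puts "CA file: #{cert_file} (exists: #{File.exist?(cert_file)})"
puts "CA dir:  #{cert_dir} (exists: #{Dir.exist?(cert_dir)})"
```

If the reported file or directory is missing or stale, that would be consistent with the `certificate verify failed` error above.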
