epam / osci

Open Source Contributor Index

Home Page: https://opensourceindex.io/

License: GNU General Public License v3.0

Languages: Python 98.91%, HTML 0.95%, Dockerfile 0.15%
Topics: open-source, analytics, pyspark, python, azure-functions

osci's Introduction

OSCI Logo

Open Source Contributor Index (OSCI)

OSCI is an open source project that aims to track and measure open source activity on GitHub by commercial organizations. It allows organizations, communities, analysts and individuals involved in open source to get insights into contribution trends among commercial organizations, by providing access to up-to-date data through an intuitive interface.

OSCI Working_Group

How does OSCI work?

To create this index, the system processes GitHub push events data from GH Archive:

GitHub OSCI Schematic Diagram
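
For illustration, here is a minimal sketch of that first step, assuming only the public GH Archive URL scheme (one gzipped JSON-lines file per hour); the project's real crawler lives in osci/crawlers/github/gharchive.py and is more involved.

    import gzip
    import json
    import urllib.request

    def fetch_push_events(date: str, hour: int):
        """Download one hourly GH Archive file and yield its PushEvent records."""
        # GH Archive publishes e.g. https://data.gharchive.org/2020-01-01-0.json.gz
        url = f"https://data.gharchive.org/{date}-{hour}.json.gz"
        with urllib.request.urlopen(url) as resp:
            for line in gzip.decompress(resp.read()).splitlines():
                event = json.loads(line)
                if event.get("type") == "PushEvent":
                    yield event

    # Author emails and SHAs of push-event commits for the first hour of the day
    for event in fetch_push_events("2020-01-01", 0):
        for commit in event["payload"].get("commits", []):
            print(commit["author"]["email"], commit["sha"])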

OSCI tracks two measures for each organization:

  • Active contributors, the number of people who authored 10 or more commits over a period of time
  • Total community, the number of people who made at least one commit over a period of time
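
In pandas terms (a sketch under assumed column names, not the project's actual job code), these two measures reduce to a couple of group-bys:

    import pandas as pd

    def measures(commits: pd.DataFrame) -> pd.DataFrame:
        """commits: one row per commit, with 'company' and 'author_email'
        columns, already restricted to the period of interest."""
        per_author = commits.groupby(["company", "author_email"]).size()
        active = per_author[per_author >= 10].groupby(level="company").size()
        community = per_author.groupby(level="company").size()
        return (pd.DataFrame({"active_contributors": active,
                              "total_community": community})
                .fillna(0).astype(int))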

How are commit authors linked to commercial organizations?

The system uses the email domain of the commit author to identify the organization. Is your organization missing from the ranking? Feel free to add it to the list.

Note: OSCI does not rank open source activity contributed by universities, research institutions and individual entrepreneurs.
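
A minimal sketch of this matching, mirroring the shape of the company_domain_match_list.yaml entries shown in the submission steps below (exact domain match first, then the optional regex entries); the project's real implementation may differ in detail:

    import re

    COMPANIES = [  # shape mirrors company_domain_match_list.yaml
        {"company": "Facebook",
         "domains": ["fb.com", "facebook.com"],
         "regex": [r"^.*\.fb\.com$", r"^.*\.facebook\.com$"]},
    ]

    def match_company(email: str):
        domain = email.rsplit("@", 1)[-1].lower()
        for entry in COMPANIES:
            if domain in (d.lower() for d in entry["domains"]):
                return entry["company"]
            if any(re.match(p, domain) for p in entry.get("regex") or []):
                return entry["company"]
        return None

    assert match_company("alice@fb.com") == "Facebook"
    assert match_company("bob@corp.facebook.com") == "Facebook"  # via regex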

How can I submit my company for ranking?

  1. Check whether the organization you propose to add matches the OSCI definition:

    • not an educational, governmental, non-profit or research institution;
    • registered, commercial organization;
    • sells goods or services for the purpose of making a profit.
  2. Create a new pull request.

  3. Go to company domain match list (company_domain_match_list.yaml)

  4. Double-check that the organization you want to add is not already listed.

  5. Add the email domain of the company and the company name to the list. For example:

    - company: Facebook
      domains:
        - fb.com
      regex:
  6. If the company has more than one email domain for its employees, add all of them to the domains block (or use regex entries for regular expressions). For example:

    - company: Facebook
      domains:
        - fb.com
        - facebook.com
      regex:
        - ^.*\.fb\.com$
        - ^.*\.facebook\.com$
  7. Select the industry to which your company belongs from the following list:

    • Automotive;
    • Banking, Insurance & Financial Services;
    • Education;
    • Energy & Utilities;
    • Entertainment;
    • Healthcare and Pharma;
    • Professional Services;
    • Public Sector;
    • Retail & Hospitality;
    • Technology;
    • Media & Telecoms;
    • Travel & Transport;
    • Other (please specify);

    For example:

    - company: Facebook
      domains:
        - fb.com
        - facebook.com
      regex:
        - ^.*\.fb\.com$
        - ^.*\.facebook\.com$
      industry: Media & Telecoms

Our team will review your pull request and merge it if everything is correct.

Note: since OSCI processes the data for the previous month, you'll see your organization's rank at the beginning of the next month.

How can I contribute to OSCI?

See CONTRIBUTING.md for details on the contribution process.

QuickStart

OSCI is deployed into an Azure cloud environment using Azure Data Factory, Azure Functions and Azure Databricks. However, the code available on GitHub does not require Azure: you can run the application from the command line using the instructions below.

Installation

  1. Clone the repository
         git clone https://github.com/epam/OSCI.git
  2. Go to the project directory
         cd OSCI
  3. Install the requirements
         pip install -r requirements.txt

Configuration

Create a file local.yml (by default this file is added to .gitignore) in the directory osci/config/files. A sample file default.yml is included; please don't change the values in that file.
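
For reference, a minimal local.yml for a purely local run might look like the sketch below; the keys are copied from the configuration dump in the "Unable to get basic example to run" issue log further down this page, so treat them as illustrative rather than authoritative, and adjust the paths for your machine.

    meta:
      config_source: yaml
    file_system:
      type: local
      base_path: /data          # root folder for the landing/staging/public areas
    areas:
      landing:
        container: landing
      staging:
        container: staging
      public:
        container: public
    bq:
      project: ''
      secret: '{}'
    web:
      fs: local
      base_path: /web
      account_name: ''
      account_key: ''
      container: data
    github:
      token: ''                 # optional GitHub API token
    company:
      default: EPAM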

Sample run

  1. Run the script to download data from the archive (for example, for 01 January 2020)
         python3 osci-cli.py get-github-daily-push-events -d 2020-01-01
  2. Run the script to add the company field (matched by domain) (for example, for 01 January 2020)
         python3 osci-cli.py process-github-daily-push-events -d 2020-01-01
  3. Run the script to generate the daily OSCI rankings (for example, up to 02 January 2020)
         python3 osci-cli.py daily-osci-rankings -td 2020-01-02

OSCI Versioning

For comprehensive OSCI versioning we adopted the scheme <year>.<month>.<patch number>, e.g. 2021.05.0. We expect regular monthly updates, including releases associated with the submission of new companies for ranking.

License

OSCI is licensed under the GNU General Public License v3.0.

Contact Us

For support or help using OSCI, please contact us at [email protected].

osci's People

Contributors

abitrolly, achimnol, ashleywolf, dependabot[bot], embeddalex, irynastr, mike-n-jacobs, nikos912000, nslsrv, patrickstephens1, revfactory, richardlitt, sfermigier, simenandre, uliana2019, vlad-isayko


osci's Issues

Measuring company support for known OSS public projects

While OSCI is primarily a reputation tool, it could actually do some good if it gave companies a concrete way to support and compete on this user story.

What companies can do in particular, if they would like to support poetry in some way, is to give their employees a fixed amount of time they can spend supporting poetry. Having more people who can contribute on a regular basis would help us a lot. They don't necessarily need to start coding: looking into the issue tracker and finding duplicates or outdated tickets, or answering questions there, is important as well and would give us more time to actually fix bugs or implement new features.

python-poetry/poetry#4160 (comment)

Data inconsistency or update issues?

Active contributor : query

Hi,
Say developer X makes 10 commits in Jan 2021 and another 10 the next month, i.e. Feb 2021, and developer Y makes 10 commits in Feb 2021 only.

Does the active contributor number for the organization show 2 or 3?

I assume it is 2.
Please can someone help confirm?
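
Going by the definition near the top of this page (active contributors are the distinct people with 10 or more commits over the period), a window spanning Jan-Feb 2021 should indeed show 2, since X counts once however many months they are active in. A toy check of that reading:

    from collections import Counter

    # Jan: X makes 10 commits; Feb: X makes 10 more and Y makes 10
    commits = ["x@corp.com"] * 10 + ["x@corp.com"] * 10 + ["y@corp.com"] * 10
    per_author = Counter(commits)  # deduplicated by author email
    active = [a for a, n in per_author.items() if n >= 10]
    print(len(active))  # 2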

SmartBear not appearing in the list

As mentioned in #122, I'm puzzled why SmartBear does not appear in the list, now that we're into the month of April.

We were added in v2022.03.0.

I'm not sure if this is user error on my part, so please let me know if there's something I'm misunderstanding, or we need to make some additional configuration.

Here's an example of a commit that I think should have been counted: https://github.com/cucumber/cucumber-js/commits/d98c6deabd39e1adb8e52a1a65324662143108e8

How to run OSCI in 2023?

Dear team,

I am trying to run a local OSCI installation to get some stats. After failing to install the tools on Ubuntu 22.04 LTS, I switched to 20.04 LTS and at least got the dependencies installed using pip and Python 3.8.

Now the first step of the pipeline is failing, i.e. running python3 osci-cli.py get-github-daily-push-events -d 2020-01-02

[2023-07-11 12:09:27,286] [ERROR] Failed to parse json: . Error: Expecting value: line 1 column 1 (char 0)
[2023-07-11 12:09:27,329] [INFO] Save push events commits for 2020-01-02 00:00:00 into file /data/landing/github/events/push/2020/01/02/2020-01-02-0.parquet
Traceback (most recent call last):
  File "osci-cli.py", line 93, in <module>
    cli(standalone_mode=False)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/sten/Desktop/OSCI/osci/actions/base.py", line 59, in execute
    return self._execute(**self._process_params(kwargs))
  File "/home/sten/Desktop/OSCI/osci/actions/load/load.py", line 34, in _execute
    return get_github_daily_push_events(day=day)
  File "/home/sten/Desktop/OSCI/osci/crawlers/github/gharchive.py", line 34, in get_github_daily_push_events
    DataLake().landing.save_push_events_commits(push_event_commits=push_events_commits, date=day)
  File "/home/sten/Desktop/OSCI/osci/datalake/local/landing.py", line 42, in save_push_events_commits
    log.info(f'Push events commits df info {get_pandas_data_frame_info(df)}')
  File "/home/sten/Desktop/OSCI/osci/utils.py", line 46, in get_pandas_data_frame_info
    df.info(buf=buf)
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 2497, in info
    mem_usage = self.memory_usage(index=True, deep=deep).sum()
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 2590, in memory_usage
    result = Series(self.index.memory_usage(deep=deep), index=["Index"]).append(
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/series.py", line 305, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/construction.py", line 465, in sanitize_array
    subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1452, in construct_1d_arraylike_from_scalar
    subarr = np.empty(length, dtype=dtype)
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type

Would Docker be a stable environment to run this in? My aim is to count GitHub contributions based on some email regexps.

Thanks!

Unable to get basic example to run

Hey, folks --

I'm having trouble getting the basic example provided to run. Specifically the failure I'm encountering is at the daily-osci-rankings stage. I have confirmed that I have a functioning local version of Hadoop installed. Running on Ubuntu 20.04 LTS VPS with a fresh install.

I pulled out the two most visible errors from the log below (full log at the bottom of the issue). It's unclear to me whether they are related, though.

Any help pointing me in the right direction would be appreciated!

$ python3 osci-cli.py get-github-daily-push-events -d 2020-01-01
# success
$ python3 osci-cli.py process-github-daily-push-events -d 2020-01-01
# success
$ python3 osci-cli.py daily-osci-rankings -td 2020-01-02
# failure (see full log below)

# ...

[2022-03-22 18:11:11,850] [DEBUG] Answer received: !ysorg.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.;
	at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$12(DataSource.scala:200)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:200)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:297)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:286)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:286)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
<osci.datalake.local.landing.LocalLandingArea object at 0x7fa5e8753f40> /data landing
<osci.datalake.local.staging.LocalStagingArea object at 0x7fa5e87609a0> /data staging
<osci.datalake.local.public.LocalPublicArea object at 0x7fa5e8760940> /data public
<osci.datalake.local.web.LocalWebArea object at 0x7fa5e8760a90> /web data

# ...

[2022-03-22 18:11:11,855] [DEBUG] Answer received: !yv
Traceback (most recent call last):
  File "osci-cli.py", line 93, in <module>
    cli(standalone_mode=False)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ubuntu/OSCI/osci/actions/base.py", line 59, in execute
    return self._execute(**self._process_params(kwargs))
  File "/home/ubuntu/OSCI/osci/actions/process/generate_daily_osci_rankings.py", line 49, in _execute
    commits = osci_ranking_job.extract(to_date=to_day).cache()
  File "/home/ubuntu/OSCI/osci/jobs/base.py", line 44, in extract
    commits=Session().load_dataframe(paths=self._get_dataset_paths(to_date, from_date))
  File "/home/ubuntu/OSCI/osci/jobs/session.py", line 39, in load_dataframe
    return self.spark_session.read.load(paths, **options)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pyspark/sql/readwriter.py", line 182, in load
    return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
  File "/home/ubuntu/.local/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pyspark/sql/utils.py", line 134, in deco
    raise_from(converted)
  File "<string>", line 3, in raise_from

Full Error Log:
[2022-03-22 18:11:05,996] [INFO] ENV: None
[2022-03-22 18:11:05,997] [DEBUG] Check config file for env local exists
[2022-03-22 18:11:05,997] [DEBUG] Read config from /home/ubuntu/OSCI/osci/config/files/local.yml
[2022-03-22 18:11:06,000] [DEBUG] Prod yml load: {'meta': {'config_source': 'yaml'}, 'file_system': {'type': 'local', 'base_path': '/data'}, 'areas': {'landing': {'container': 'landing'}, 'staging': {'container': 'staging'}, 'public': {'container': 'public'}}, 'bq': {'project': '', 'secret': '{}'}, 'web': {'fs': 'local', 'base_path': '/web', 'account_name': '', 'account_key': '', 'container': 'data'}, 'github': {'token': ''}, 'company': {'default': 'EPAM'}}
[2022-03-22 18:11:06,000] [DEBUG] Prod yml res: {'meta': {'config_source': 'yaml'}, 'file_system': {'type': 'local', 'base_path': '/data'}, 'areas': {'landing': {'container': 'landing'}, 'staging': {'container': 'staging'}, 'public': {'container': 'public'}}, 'bq': {'project': '', 'secret': '{}'}, 'web': {'fs': 'local', 'base_path': '/web', 'account_name': '', 'account_key': '', 'container': 'data'}, 'github': {'token': ''}, 'company': {'default': 'EPAM'}}
[2022-03-22 18:11:06,000] [INFO] Full config: {'meta': {'config_source': 'yaml'}, 'file_system': {'type': 'local', 'base_path': '/data'}, 'areas': {'landing': {'container': 'landing'}, 'staging': {'container': 'staging'}, 'public': {'container': 'public'}}, 'bq': {'project': '', 'secret': '{}'}, 'web': {'fs': 'local', 'base_path': '/web', 'account_name': '', 'account_key': '', 'container': 'data'}, 'github': {'token': ''}, 'company': {'default': 'EPAM'}}
[2022-03-22 18:11:06,000] [INFO] Configuration loaded for env: local
[2022-03-22 18:11:06,000] [DEBUG] Create new <class 'osci.config.base.LocalFileSystemConfig'>
[2022-03-22 18:11:06,000] [DEBUG] {'fs': 'local', 'base_path': '/web', 'account_name': '', 'account_key': '', 'container': 'data'}
[2022-03-22 18:11:06,000] [DEBUG] Create new <class 'osci.config.base.Config'>
[2022-03-22 18:11:06,000] [DEBUG] Create new <class 'osci.datalake.datalake.DataLake'>
[2022-03-22 18:11:06,113] [INFO] Execute action `daily-osci-rankings`
[2022-03-22 18:11:06,113] [INFO] Action params `{'to_day': '2020-01-02'}`
[2022-03-22 18:11:06,114] [DEBUG] Create new <class 'osci.datalake.reports.general.osci_ranking.OSCIRankingFactory'>
[2022-03-22 18:11:06,114] [DEBUG] Create new <class 'osci.datalake.reports.general.commits_ranking.OSCICommitsRankingFactory'>
[2022-03-22 18:11:06,114] [DEBUG] Create new <class 'osci.jobs.session.Session'>
[2022-03-22 18:11:06,115] [DEBUG] Loaded paths for (None 2020-01-02 00:00:00) []
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
[2022-03-22 18:11:08,127] [DEBUG] Command to send: A
# ... (py4j DEBUG command/answer exchange during Spark session startup omitted)
22/03/22 18:11:10 WARN DataSource: All paths were ignored:

[Stage 0:>                                                          (0 + 1) / 1]
[2022-03-22 18:11:11,839] [DEBUG] Answer received: !xro26
# ... (further py4j DEBUG exchange omitted)
[2022-03-22 18:11:11,850] [DEBUG] Answer received: !ysorg.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.;
<osci.datalake.local.landing.LocalLandingArea object at 0x7fa5e8753f40> /data landing
<osci.datalake.local.staging.LocalStagingArea object at 0x7fa5e87609a0> /data staging
<osci.datalake.local.public.LocalPublicArea object at 0x7fa5e8760940> /data public
<osci.datalake.local.web.LocalWebArea object at 0x7fa5e8760a90> /web data
Traceback (most recent call last):
  File "osci-cli.py", line 93, in <module>
    cli(standalone_mode=False)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ubuntu/OSCI/osci/actions/base.py", line 59, in execute
    return self._execute(**self._process_params(kwargs))
  File "/home/ubuntu/OSCI/osci/actions/process/generate_daily_osci_rankings.py", line 49, in _execute
    commits = osci_ranking_job.extract(to_date=to_day).cache()
  File "/home/ubuntu/OSCI/osci/jobs/base.py", line 44, in extract
    commits=Session().load_dataframe(paths=self._get_dataset_paths(to_date, from_date))
  File "/home/ubuntu/OSCI/osci/jobs/session.py", line 39, in load_dataframe
    return self.spark_session.read.load(paths, **options)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pyspark/sql/readwriter.py", line 182, in load
    return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
  File "/home/ubuntu/.local/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pyspark/sql/utils.py", line 134, in deco
    raise_from(converted)
  File "<string>", line 3, in raise_from
pyspark.sql.utils.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.;

Company missing from the latest report

Hi and thanks for this project :)

We recently raised a PR to add Expedia Group to the list.
Unfortunately we couldn't find the company in the latest report, published today. Have we missed anything in the PR, or can we expect to make it into the next report?

Query on ranking

Hello,
My organization (Societe Generale), as of Feb 2022, seems to have a total community number of 7 (+5) and 0 active contributors. It is also indicated that we are down by 3 places.
I'd like to know why we are placed at 284 and not 276.

When the number of active contributors is equal, what is considered in the ranking?
Please can you help with this query?

Clarification counting method

Am I right in assuming that these are the steps you take to count the open source contributions?

  1. get commits from push event data (GH Archive/BigQuery)
  2. only keep commits to repositories, which do have a license (GitHub API for license info)
  3. match author email domains for selected organizations
  4. use the author email to identify unique contributors and count commits
  5. count total community / active contributors

I just wanted to clarify so I better understand how to interpret the results. Great project.

Bitbucket OSCI

The goal is to create and automate analysis of repos hosted on BitBucket. This would be similar to our existing OSCI ranking which analyses repos hosted on GitHub, with a focus on the activity by commercial organizations.

  1. Solution that crawls data about push events commits (PEC) that should contain the following required fields:
    • event creation date;
    • commit author (email address, name);
    • SHA.
  2. Adapt existing pipeline to process Bitbucket data.

We did a high-level technical analysis of the feasibility of making an OSCI for repos hosted on Bitbucket. This is a summary of our findings:

| Criteria | Status (Yes/No) | Notes (e.g. about how it is possible, or limitations, etc) |
| --- | --- | --- |
| Is this site free to use for open source projects? | yes | Seems to be free only for teams under 5 people unless you request a community license. |
| Does it look like this site hosts many open source projects? | unclear | It's not clear that there are large numbers of open source projects hosted. Most public projects seem to be non-commercial (not outsourced by companies). It appears most users do not use company domains - need to investigate further. The pages/repos of many companies appear to be inactive. |
| Size of user base | - | In the order of 5,000,000 users |
| Is there a public API we can query? | yes | |
| API type | REST | |
| API URL | https://api.bitbucket.org/2.0 | |
| Query Limits (if any) | 1,000 per hour / 60,000 per hour | |
| Is there a paid access with more information? | - | (to be investigated) |
| Is it possible to query the project license? | yes | |
| Is it possible to query commit events/commit counts by a user in a time period? | yes | /repositories?before=timestamp&after=timestamp, e.g. https://api.bitbucket.org/2.0/repositories?after=2020-03-01T09%3A37%3A06.254721%2B00%3A00 |
| Is it possible to query email address or else some organization information for the person making a commit? | yes | email address |
| Is there a public archive we can use instead of the public API? | no | |
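
As a rough illustration of step 1, here is a paginated pull of repositories from the public API named above; the endpoint and the after parameter come from the table, while the commit-level field names are assumptions based on Bitbucket's documented 2.0 response shape and would need verification:

    import requests

    API = "https://api.bitbucket.org/2.0"

    def iter_public_repos(after: str):
        """Walk the paginated public repository listing (after=ISO timestamp)."""
        url = f"{API}/repositories?after={after}"
        while url:
            page = requests.get(url).json()
            yield from page.get("values", [])
            url = page.get("next")  # Bitbucket paginates via a 'next' link

    def iter_push_event_commits(full_name: str):
        """Yield (event creation date, commit author, SHA) per commit."""
        url = f"{API}/repositories/{full_name}/commits"
        while url:
            page = requests.get(url).json()
            for c in page.get("values", []):
                yield c["date"], c["author"]["raw"], c["hash"]
            url = page.get("next")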

If you have additional questions, feel free to contact our team.

Improve identification of committers' organizations

The current OSCI implementation uses the email domain of the committer to identify their organization. Many developers do not use their company email address on GitHub, or do not make their email address public. However, many of these people do include their organizational information in their GitHub user profiles.

We would like to improve the identification of committers' organizations using the data in their user profiles.

We already ran an experiment to do this, but with minimal success; it is described below.
The basic matching algorithm works like this:

  1. The domain is extracted from the committer's email;
  2. Each domain is compared, case-insensitively, with the list of company domains (google.com, microsoft.com, etc.);
  3. If no match is found, a regular expression analysis is performed to handle domains of level 3 and higher.

If the basic algorithm produces no match, an extended algorithm was proposed:

  1. The profile information from the user's GitHub account is downloaded;
  2. The website field is taken from the profile; if it is empty, go to step 3. Otherwise, the basic algorithm is applied to the specified domain. If no match occurs, go to step 3.
  3. The company field is taken and compared with the list of companies that we are processing (fuzzy matching algorithms were used: Levenshtein distance, Sorensen-Dice coefficient, etc.).

Result of the experiment:
For only 38% of all the users examined did we manage to match a company from their profile; the remaining profiles had no clear match. Relaxing the match rules adds only another 5%.

It is also worth noting that for the users whose company we did manage to match from their profile, the company was in every case the same as the one derived from their email.

Finally, this method of identifying the company carries a large overhead. Implementing this approach would require downloading information for all users who made push events in 2020, of whom there are 5M (as of June 2020); at the authenticated GitHub API limit of 5,000 requests per hour, loading their profiles would take about 1,000 hours, i.e. roughly 42 days. We would also have to load new profiles every day, and that download, in turn, may not fit into the daily usage limits.
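
For reference, the fuzzy comparison in step 3 can be approximated with the standard library alone; the experiment itself used Levenshtein distance and the Sorensen-Dice coefficient, so difflib's ratio is only a stand-in here, and the 0.85 cutoff is an arbitrary assumption:

    from difflib import SequenceMatcher

    KNOWN_COMPANIES = ["Google", "Microsoft", "Societe Generale"]  # illustrative

    def match_profile_company(profile_company: str, cutoff: float = 0.85):
        """Compare a GitHub profile's free-text company field against the
        tracked companies and return the best match above the cutoff."""
        cleaned = profile_company.strip().lstrip("@").lower()
        scored = [(SequenceMatcher(None, cleaned, known.lower()).ratio(), known)
                  for known in KNOWN_COMPANIES]
        score, best = max(scored)
        return best if score >= cutoff else None

    print(match_profile_company("@google"))        # Google
    print(match_profile_company("Self-employed"))  # None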

TypeError: an integer is required (got type bytes)

When I run the command below

python3 osci.py daily-osci-rankings -td 2020-01-02

I get the following error

Traceback (most recent call last):
  File "osci.py", line 44, in <module>
    from cli import company_rankers
  File "/home/mirai/OSCI/cli/company_rankers.py", line 23, in <module>
    from __app__.jobs.contributors_ranking import ContributorsRankingJob
  File "/home/mirai/OSCI/__app__/jobs/contributors_ranking.py", line 17, in <module>
    from pyspark.sql import DataFrame
  File "/home/mirai/.local/lib/python3.8/site-packages/pyspark/__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "/home/mirai/.local/lib/python3.8/site-packages/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/home/mirai/.local/lib/python3.8/site-packages/pyspark/accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "/home/mirai/.local/lib/python3.8/site-packages/pyspark/serializers.py", line 72, in <module>
    from pyspark import cloudpickle
  File "/home/mirai/.local/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/home/mirai/.local/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)

os: Ubuntu 20.04.1 LTS
python : Python 3.8.5
pip: 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)

Any ideas what's causing this?

Calculating ranking of an organization

Hello,
My understanding is that OSCI considers commits made by an organization AFTER it has been added, and that any commits made BEFORE will NOT be considered for ranking.

Example: I add Societe Generale in September. OSCI does NOT consider any commits made by Soc Gen before September, and the ranking is based on commits made from September to date.

Is that right?

Exclude company's own projects filter

I think it would be pertinent to include a filter that excludes contributions to a company's own open source projects.
As much as I enjoy seeing the numbers, I feel it would be amazing to see which companies contribute the most outside their own circle of influence; this could shift the rankings somewhat and showcase a bit more of the open source community on the top lists.

Throwing this out there as an idea; I absolutely understand if it is not relevant to this project, but maybe it is something worth thinking about!

Unusual spike in data? (starting Nov 21)

While visualizing some of the data OSCI provides, I noticed an unusual spike in the data starting at the end of October/beginning of November 2021 and potentially still ongoing. I was wondering if there was a change to how active contributors or community are measured?

This graph shows the development for Google over the last 3 years. Besides the usual spikes at the beginning of each year, there is an additional spike in November, and after that the daily increase seems to be higher than in previous periods as well.
[image: graph of Google's contributor numbers over the last 3 years]

The spike is there for different companies and not specific for the above example.

Gitlab OSCI

The goal is to create and automate analysis of repos hosted on GitLab (https://gitlab.com). This would be similar to our existing OSCI ranking which analyses repos hosted on GitHub, with a focus on the activity by commercial organizations.

  1. Solution that crawls data about push events commits (PEC) that should contain the following required fields:
    • event creation date;
    • commit author (email address, name);
    • SHA.
  2. Adapt existing pipeline to process Gitlab data.

We did a high-level technical analysis of the feasibility of making an OSCI for repos hosted on GitLab. This is a summary of our findings:

| Criteria | Status (Yes/No) | Notes (e.g. about how it is possible, or limitations, etc) |
| --- | --- | --- |
| Is this site free to use for open source projects? | yes | Seems to be free with basic features (they sell more advanced CI/CD features) |
| Does it look like this site hosts many open source projects? | yes | |
| Is there a public API we can query? | yes | |
| API type | REST | |
| API URL | https://gitlab.com/api/v4/ | |
| Query Limits (if any) | 600 queries per 60 second period | |
| Is there a paid access with more information? | no | |
| Is it possible to query the project license? | no | |
| Is it possible to query commit events/commit counts by a user in a time period? | no | |
| Is it possible to query email address or else some organization information for the person making a commit? | yes | email address |
| Is there a public archive we can use instead of the public API? | no | |
| Any additional Information worth knowing? | no | |

Indexing Subsidiaries and their email domains

Hello,

What is the policy for adding subsidiaries to the OSCI index?
Is it the decision of the parent company, or of EPAM/OSCI, how the email domains are listed?

Should all subsidiaries of a company be under the same umbrella (and index calculation) as the parent company,
OR
should each subsidiary with a different email domain be added as a new company (and index calculation) under the subsidiary's name?

For example, a company "X" has 2 subsidiaries "Y" and "Z"

Option A:

- company: X
  domains:
    - X.com
    - Y.com
    - Z.com

Option B:

- company: X
  domains:
    - X.com

- company: Y
  domains:
    - Y.com

- company: Z
  domains:
    - Z.com

Which would be acceptable ?

SourceForge OSCI

The goal is to create and automate analysis of repos hosted on SourceForge (https://sourceforge.net/). This would be similar to our existing OSCI ranking which analyses repos hosted on GitHub, with a focus on the activity by commercial organizations.

  1. Solution that crawls data about push events commits (PEC) that should contain the following required fields:
    • event creation date;
    • commit author (email address, name);
    • SHA.
  2. Adapt existing pipeline to process SourceForge data.

We did a high-level technical analysis of the feasibility of making an OSCI for repos hosted on SourceForge. This is a summary of our findings:

| Criteria | Status (Yes/No) | Notes (e.g. about how it is possible, or limitations, etc) |
| --- | --- | --- |
| Is this site free to use for open source projects? | yes | |
| Does it look like this site hosts many open source projects? | yes | "over 430,000 projects". Popular in the open source community. BUT it hosts a lot of binaries and mirrors of repos which are primarily hosted on GitHub or elsewhere. |
| Size of user base | - | "we host over 3.7 million registered users" |
| Is there a public API we can query? | yes | |
| API type | not studied yet | |
| API URL | not studied yet | |
| Query Limits (if any) | not studied yet | |
| Is there a paid access with more information? | not studied yet | |
| Is it possible to query the project license? | not studied yet | |
| Is it possible to query commit events/commit counts by a user in a time period? | not studied yet | |
| Is it possible to query email address or else some organization information for the person making a commit? | not studied yet | |
| Is there a public archive we can use instead of the public API? | not studied yet | |
| Any additional Information worth knowing? | not studied yet | |

Filtering bots from OSCI ranking

The goal is to improve our existing OSCI code, which ranks companies on the basis of the number of commits. The current situation is that there appear to be a large number of commits made by automated processes associated with GitHub accounts that have a company (commercial organization) email domain. These skew the ranking of companies based on commits, which is precisely why our OSCI ranking is based on the number of contributors rather than the number of commits.

For example, when we look at the OSCI commit-based company counts to the end of June 2020, we see:

| OrgName | Commits |
| --- | --- |
| Microsoft | 640009 |
| GitHub | 519108 |
| Renovateapp | 472705 |
| Google | 379847 |
| Red Hat | 331087 |
| Travis CI | 195377 |
| Intel | 150613 |
| IBM | 131510 |
| Exoplatform | 125844 |
| Odoo | 113452 |
| Pyup | 82118 |

However, Renovateapp, Travis CI, Exoplatform and Pyup do not feature highly in our OSCI contributor-based company ranking. In fact, Renovateapp has only 4 active contributors, Travis CI has 67, Exoplatform has 41, and Pyup has 4.

When we dig deeper into this, we see:

These are the top commit authors for Pyup:

| Company | AuthorName | Commits |
| --- | --- | --- |
| Pyup | pyup-bot | 349717 |
| Pyup | pyup.io bot | 10146 |
| Pyup | pyup.io vuln bot | 22 |
| Pyup | pyup.io bot (via Travis CI) | 1 |

As you can see, all of them are bots.
The same picture holds for Renovateapp:

| Company | AuthorName | Commits |
| --- | --- | --- |
| Renovateapp | Renovate Bot | 2348935 |
| Renovateapp | WhiteSource Renovate | 65148 |
| Renovateapp | Renovate Bot (via Travis CI) | 358 |
| Renovateapp | renovate-bot | 63 |
| Renovateapp | Rhys Arkins | 3 |

Travis CI (top 10 by commits):

| Company | AuthorName | Commits |
| --- | --- | --- |
| Travis CI | Deployment Bot (from Travis CI) | 426727 |
| Travis CI | Travis CI | 92799 |
| Travis CI | travis-ci | 11824 |
| Travis CI | TravisCI | 9511 |
| Travis CI | Travis | 8128 |
| Travis CI | Deployment Bot (Travis) | 7723 |
| Travis CI | Deployment Bot | 1917 |
| Travis CI | raveit65 | 1322 |
| Travis CI | Piotr Milcarz | 1317 |
| Travis CI | Travis Build Bot (from Travis CI) | 1015 |

The biggest part of the commits is coming from bots.

We would like a way to filter out these automated processes/bot commits, so that we could more accurately generate a ranking of companies based on commits.

One obvious way is to simply have a 'blacklist' of GitHub accounts / email addresses, but perhaps something more sophisticated could be devised, based on 'unhuman' levels of activity.

At the moment, we use the domain <-> company match list to select the companies that appear in the rankings we produce. Perhaps the problem of bots can be solved by creating a similar list that filters out bot accounts (a sketch follows below).
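
A minimal sketch of such a bot filter, assuming a hand-maintained pattern list; the patterns below are illustrative, derived from the author names in the tables above:

import re

# Illustrative patterns only; a real list would live in a YAML file similar to
# company_domain_match_list.yaml so it can be maintained via pull requests.
BOT_NAME_PATTERNS = [
    re.compile(r'\bbot\b', re.IGNORECASE),           # "pyup-bot", "Renovate Bot", ...
    re.compile(r'^travis[\s-]?ci$', re.IGNORECASE),  # "Travis CI", "travis-ci", "TravisCI"
    re.compile(r'^renovate\b', re.IGNORECASE),       # "Renovate Bot (via Travis CI)"
]

def is_probably_bot(author_name: str) -> bool:
    """Return True if a commit author name matches a known bot pattern."""
    return any(pattern.search(author_name) for pattern in BOT_NAME_PATTERNS)

commits = [('pyup-bot', 349717), ('Rhys Arkins', 3)]
human_commits = [(name, n) for name, n in commits if not is_probably_bot(name)]
# -> [('Rhys Arkins', 3)]

A threshold on commits per author per day could complement this, catching 'unhuman' levels of activity that no pattern list anticipates.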

LaunchPad OSCI

The goal is to create and automate analysis of repos hosted on LaunchPad (https://launchpad.net/). This would be similar to our existing OSCI ranking which analyses repos hosted on GitHub, with a focus on the activity by commercial organizations.

  1. A solution that crawls push event commit (PEC) data containing the following required fields:
    • event creation date;
    • commit author (email address, name);
    • SHA.
  2. Adapt the existing pipeline to process LaunchPad data.

We did a high-level technical analysis of the feasibility of making an OSCI for repos hosted on LaunchPad. This is a summary of our findings:

| Criteria | Status (Yes/No) | Notes (e.g. about how it is possible, or limitations, etc.) |
| --- | --- | --- |
| Is this site free to use for open source projects? | yes | |
| Does it look like this site hosts many open source projects? | yes | In total (all projects): "43,314 projects, 1,808,413 bugs, 1,004,760 branches, 17,009 Git repositories, 2,977,004 translations, 685,925 answers, 77,280 blueprints, and counting..." (https://launchpad.net/projects/+all?batch=75). Seems to be popular mainly with the Ubuntu community; many repos appear to be mirrors of projects hosted elsewhere (more data needed to prove this); a lot of Linux-focused projects. |
| Size of user base | - | c. 4,000,000 |
| Is there a public API we can query? | yes | |
| API type | HTTP | |
| API URL | http://api.launchpad.net/1.0/ | |
| Query Limits (if any) | - (to be investigated) | |
| Is there a paid access with more information? | - (to be investigated) | |
| Is it possible to query the project license? | yes | |
| Is it possible to query commit events/commit counts by a user in a time period? | yes | |
| Is it possible to query email address or else some organization information for the person making a commit? | yes | email address (requires authentication) |
| Is there a public archive we can use instead of the public API? | no | |
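
A quick probe of the API is possible with launchpadlib, Launchpad's official Python client; this is a hedged sketch, and 'example-project' is a placeholder name:

from launchpadlib.launchpad import Launchpad

# anonymous access suffices for public project metadata; querying commit
# author email addresses would require an authenticated login_with() session
launchpad = Launchpad.login_anonymously('osci-feasibility-check', 'production')
project = launchpad.projects['example-project']  # placeholder project name
print(project.display_name, project.licenses)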

Location-based OSCI rating

Proposed by @abitrolly.

Add the ability to filter companies by regions (created after discussion in #5 and #6)

  • Country of registration
  • Country of presence
  • Country of origin (native perception)
This will require some external datasets.

Identify and classify non-standard open source licenses in GitHub repos

The goal is to identify the license used by GitHub repos where it is not classified automatically by GitHub. As stated in the GitHub API documentation (https://developer.github.com/v3/licenses/), the open source Ruby gem Licensee (https://github.com/licensee/licensee) is used to identify the license type.

Licensee automates the process of reading LICENSE files and compares their contents to known licenses using several strategies (which we call "Matchers"). It attempts to determine a project's license in the following order:

  1. If the license file has an explicit copyright notice, and nothing more (e.g., Copyright (c) 2015 Ben Balter), we'll assume the author intends to retain all rights, and thus the project isn't licensed.
  2. If the license is an exact match to a known license. If we strip away whitespace and copyright notice, we might get lucky, and direct string comparison in Ruby is cheap.
  3. If we still can't match the license, we use a fancy math thing called the Sørensen–Dice coefficient, which is really good at calculating the similarity between two strings. By calculating the percent changed from the known license to the license file, you can tell, e.g., that a given license is 95% similar to the MIT license, that 5% likely representing legally insignificant changes to the license text (a sketch of this coefficient follows below).
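
A minimal sketch of the bigram-based Sørensen–Dice similarity described in step 3 (Licensee's actual matcher is more elaborate than this):

def dice_coefficient(a: str, b: str) -> float:
    """Sørensen–Dice similarity over sets of character bigrams, in [0.0, 1.0]."""
    def bigrams(s: str) -> set:
        return {s[i:i + 2] for i in range(len(s) - 1)}
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))

# e.g. flag a repo's LICENSE as "effectively MIT" when
# dice_coefficient(license_text, canonical_mit_text) >= 0.90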

We published a blog with an analysis of open source licenses used on GitHub (https://solutionshub.epam.com/blog/post/examining-open-source-license-usage), but it includes a lot of repos which have a "custom" license. Our manual analysis of a selection of these shows that many organizations use licenses which are minor modifications of standard licenses, but GitHub does not automatically identify their license. If this could be improved, we would be able to make a better analysis of the popularity of open source repos among commercial organizations as well as across all of GitHub.

The goal is to identify such licenses with a high level of probability, e.g. the content of the LICENSE file is 90+% the same as the standard Apache license text.

Our suggestion would be to do this for the top license types, which based on our analysis are Apache 2.0, BSD 3-clause, MIT, GPL 2.0, GPL 3.0, LGPL 2.1 and EPL 1.0.

Some examples:

  • https://github.com/MicrosoftDocs/microsoft-365-docs/blob/public/LICENSE - Creative Commons license
  • https://github.com/dotnet/runtime/blob/master/LICENSE.TXT - MIT license
  • https://github.com/IBM-Cloud/webapp-with-cos-and-cdn/blob/master/License.txt, https://github.com/IBM-Cloud/serverless-followupapp-ios/blob/master/License.txt - Apache 2.0
  • https://github.com/strongloop/loopback.io/blob/gh-pages/LICENSE, https://github.com/strongloop/loopback-next/blob/master/LICENSE - MIT license
  • https://github.com/mono/mono/blob/master/LICENSE - a mix of licenses, so it won't be possible to identify a single license type

Unable to get data

I am trying to run the example commands, and there doesn't seem to be a /data folder, or the permissions for it are wrong. Am I missing something? Thank you.

Error:

➜  OSCI git:(master) python3 osci.py get-github-daily-push-events -d 2020-01-01
[2021-02-11 17:55:21,671] [INFO] ENV: None
[2021-02-11 17:55:21,671] [DEBUG] Check config file for env local exists
[2021-02-11 17:55:21,671] [DEBUG] Read config from /Users/richard/src/OSCI/__app__/config/files/local.yml
[2021-02-11 17:55:21,674] [INFO] Configuration loaded for env: local
[2021-02-11 17:55:21,674] [DEBUG] Create new <class '__app__.config.base.LocalFileSystemConfig'>
[2021-02-11 17:55:21,674] [DEBUG] Create new <class '__app__.config.base.Config'>
[2021-02-11 17:55:21,676] [DEBUG] Create new <class '__app__.datalake.datalake.DataLake'>
[2021-02-11 17:55:21,681] [INFO] Crawl events for 2020-01-01 00:00:00
[2021-02-11 17:55:21,681] [INFO] Load events for date: 2020-01-01 00:00:00
[2021-02-11 17:55:21,691] [DEBUG] Starting new HTTPS connection (1): data.gharchive.org:443
[2021-02-11 17:55:21,968] [DEBUG] https://data.gharchive.org:443 "GET /2020-01-01-0.json.gz HTTP/1.1" 200 15670114
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/landing/github/events/push/2020/01/01'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/landing/github/events/push/2020/01'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/landing/github/events/push/2020'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/landing/github/events/push'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/landing/github/events'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/landing/github'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/landing'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "osci.py", line 75, in <module>
    cli(standalone_mode=False)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/richard/src/OSCI/cli/gharchive.py", line 36, in get_github_daily_push_events
    gharchive.get_github_daily_push_events(day=day)
  File "/Users/richard/src/OSCI/__app__/crawlers/github/gharchive.py", line 34, in get_github_daily_push_events
    DataLake().landing.save_push_events_commits(push_event_commits=push_events_commits, date=day)
  File "/Users/richard/src/OSCI/__app__/datalake/local/landing.py", line 39, in save_push_events_commits
    file_path = self._get_hourly_push_events_commits_path(date)
  File "/Users/richard/src/OSCI/__app__/datalake/local/landing.py", line 73, in _get_hourly_push_events_commits_path
    return self.get_push_events_commits_parent_dir(date=date, create_if_not_exists=True) / \
  File "/Users/richard/src/OSCI/__app__/datalake/local/landing.py", line 69, in get_push_events_commits_parent_dir
    path.mkdir(parents=True, exist_ok=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1277, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1277, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1277, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  [Previous line repeated 4 more times]
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
OSError: [Errno 30] Read-only file system: '/data'

Add CI

@patrickstephens1, adding Travis CI or Cirrus CI to this repo will help to check that my PR for #2 doesn't break anything.

Report issues - data does not add up

I found 2 issues in the locally generated reports:

People appear in the contributor ranking report but are missing from the repository commits report.
For example:

cat OSCI_Contributors_ranking_YTD_2022-01-31.csv | grep enderborg
Sony,peter enderborg,xxxxxxxxxxxxx@xxxxxxxxxx,57

cat Company-contributors-repository-commits_YTD_2022-01-31.csv | grep enderborg

returns nothing.

Do you have any idea why some people are missing?

In the same report, my contributions are counted separately for the same email address:

cat OSCI_Contributors_ranking_YTD_2022-01-31.csv | grep Alin
Sony,Alin Jerpelea,xxxxxxxxxxxxx@xxxxxxxxxx,90
Sony,Alin,xxxxxxxxxxxxx@xxxxxxxxxx,56

(the email address is the same)

Thanks

Savannah OSCI

The goal is to create and automate analysis of repos hosted on Savannah (https://savannah.gnu.org/). This would be similar to our existing OSCI ranking which analyses repos hosted on GitHub, with a focus on the activity by commercial organizations.

  1. A solution that crawls push event commit (PEC) data containing the following required fields:
    • event creation date;
    • commit author (email address, name);
    • SHA.
  2. Adapt the existing pipeline to process Savannah data.

We did a high-level technical analysis of the feasibility of making an OSCI for repos hosted on Savannah. This is a summary of our findings:

| Criteria | Status (Yes/No) | Notes (e.g. about how it is possible, or limitations, etc.) |
| --- | --- | --- |
| Is this site free to use for open source projects? | yes | |
| Does it look like this site hosts many open source projects? | yes | In total (all projects): "23990 registered users, 3829 hosted projects" |
| Is there a public API we can query? | no | however, we can parse HTML pages |
| API type | - | |
| API URL | - | |
| Query Limits (if any) | - | |
| Is there a paid access with more information? | - | |
| Is it possible to query the project license? | yes | by parsing the HTML page |
| Is it possible to query commit events/commit counts by a user in a time period? | no | |
| Is it possible to query email address or else some organization information for the person making a commit? | yes | by parsing the HTML page (email address) |
| Is there a public archive we can use instead of the public API? | no | |
| Any additional Information worth knowing? | yes | it is possible to get information about commits by parsing web pages if the project is based on Git (a sketch follows below) |
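
A hedged scraping sketch, since Savannah exposes no API; the URL is a placeholder and the selectors would have to be adapted to the real page structure:

import requests
from bs4 import BeautifulSoup

def fetch_page(url: str) -> BeautifulSoup:
    """Download a Savannah page and parse it for further extraction."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return BeautifulSoup(response.text, 'html.parser')

# placeholder project URL; a real crawler would walk the project index first
page = fetch_page('https://savannah.gnu.org/projects/example-project')
for link in page.find_all('a'):
    print(link.get('href'), link.get_text(strip=True))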

OS Distribution vs OS Development Index

Releasing code on GitHub doesn't mean that the project will be properly maintained once the company loses interest in it. Project survivability is better if, in addition to "Open Source as a Distribution Model", the project also exhibits an open governance and participation model and helps people to socialize. In contrast, many companies who release code in the open are not interested in supporting external contributions and communicating with the general public. Depending on such projects in the long term is risky.

A metric that shows the commitment of companies towards the OS Distribution Model rather than open governance and development could help not only with evaluating whether a solution is sustainable, but also with drafting more effective Open Source Policies in companies.

For researchers and initiatives such as SustainOSS, it will also be beneficial to get a deeper analysis of the survivability and inclusiveness of projects with and without company support. To draft best practices it is necessary to know how many companies commit only to their own repos, and how many of them collaborate with other companies and individual maintainers. This data can then be used for further analysis of whether corporate sponsorship of projects through foundations such as the Django Software Foundation, the PSF, etc. provides more value than forking and maintaining one's own toolsets, and can also be used as an argument for businesses to support such foundations.

We are all interested in using maintained and secure solutions. It may turn out that using open development models not only helps projects survive in the long term, but also provides secondary benefits, such as spreading good engineering practices, socializing, and onboarding newcomers.

Find your own company

Currently both the list of companies and their email domains are hardcoded. At the very least, an instruction on how to build the stats for your own company or employer could be added.

There is a concern that at other companies the requirement to use a corporate email for contributions is not so strict, especially if contributors or maintainers are hired or sponsored by corporations to do some jobs.

Decouple MS SQL queries from local filesystem

The OSCI software uses MS SQL-specific instructions to load data from a file on the local filesystem. The error is easily reproducible when running MS SQL on a different host or in a CI/CD container - https://travis-ci.com/epam/OSCI/builds/148387944#L369

[SQL Server]Cannot bulk load. The file "/home/travis/build/epam/OSCI/resources/2019-01-01-9-formatted.json" does not exist or you don\'t have file access rights. (4860)

This is a blocker for porting the code to an open source database (#2) and for integration tests.

script: echo "Skipped til fix https://github.com/epam/OSCI/issues/9" # - python -m pytest test/integration

Extension of analytics scope (Add licenses and programming languages)

Plans

We plan to expand the scope of research.

We want to add two new reports:

  1. OSCI_Languages_YTD: a report on the number of company commits per programming language since the beginning of the year.
  2. OSCI_Licenses_YTD: a report on the number of company commits per repository license since the beginning of the year.

TODO

OSCI Languages YTD

  1. create a transformation function which takes push event commits as input and returns a commit-count report grouped by company and language (a sketch follows the example output);
  2. create a Spark job;
  3. create a CLI command for this job;
  4. add the job to the daily-osci-rankings CLI command.

Example output:

| company | language | commits |
| --- | --- | --- |
| Google | python | 50 |
| Google | go | 30 |
| Microsoft | typescript | 40 |
| Microsoft | powershell | 20 |
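
A minimal PySpark sketch of step 1, assuming the push event commits DataFrame already carries 'company', 'language' and 'sha' columns (the real column names may differ):

import pyspark.sql.functions as F
from pyspark.sql import DataFrame

def get_company_language_commits(push_events_commits: DataFrame) -> DataFrame:
    """Count distinct commits per company and repository language."""
    return (push_events_commits
            .groupBy('company', 'language')
            .agg(F.countDistinct('sha').alias('commits'))
            .orderBy(F.col('commits').desc()))

The OSCI_Licenses_YTD transformation below would be the same aggregation with 'license' in place of 'language'.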

OSCI Licenses YTD

  1. create a transformation function which takes push event commits as input and returns a commit-count report grouped by company and license;
  2. create a Spark job;
  3. create a CLI command for this job;
  4. add the job to the daily-osci-rankings CLI command.

Example output:

| company | license | commits |
| --- | --- | --- |
| Google | apache-2.0 | 50 |
| Google | mit | 30 |
| Microsoft | gpl-3.0 | 40 |
| Microsoft | lgpl-2.1 | 20 |

License clarity

Looks like some files in this repo were copied from the Mypy project (471ccc3). It would be nice to see that EPAM actually fulfilled all license obligations towards the open source projects it is using, giving proper credit where due.

Something was missed in Industry drop-down list

Hello,
I examined https://opensourceindex.io/

What I did:

  • Clicked the Industry menu filter
  • Unchecked "select all"
  • Scrolled down
  • Noticed an empty line between the entries "Healthcare & Pharma" and "Public Sector"

What I got:

  • A line for the FARFETCH organization.

What I expected:

  • No empty lines inside the Industry drop-down menu filter.

I checked osci/preprocess/match_company/company_domain_match_list.yaml and I assume that the industry field was missed for the FARFETCH company.

Possible solutions:

  1. Add the industry field for FARFETCH with the value "Retail & Hospitality" or "Other".
  2. Add validation rules for the YAML file, e.g. "there is a set of mandatory fields; these fields cannot be empty" (a sketch follows below).
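
A minimal validation sketch for solution 2; the mandatory-field set is an assumption for illustration, not an existing OSCI rule:

import yaml

MANDATORY_FIELDS = ('company', 'domains', 'industry')  # assumed mandatory set

def validate_match_list(path: str) -> list:
    """Return error messages for entries that are missing mandatory fields."""
    with open(path) as f:
        entries = yaml.safe_load(f)
    errors = []
    for entry in entries:
        missing = [field for field in MANDATORY_FIELDS if not entry.get(field)]
        if missing:
            errors.append(f"{entry.get('company', '<unnamed>')}: missing {', '.join(missing)}")
    return errors

print(validate_match_list('osci/preprocess/match_company/company_domain_match_list.yaml'))

Run as part of CI, this would catch entries like FARFETCH's before they reach the site.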

Thanks.

Extension of analytics scope (Add companies contributors reports)

Plans

We plan to expand the scope of research.

We want to add two new reports:

  1. Company-contributors-repository-commits: a report on the number of company commits per repository (with information about the license and programming language) per day (this data needs to be recorded in Google BigQuery).
  2. OSCI_Contributors_YTD: a report on the number of commits by company employees since the beginning of the year.

TODO

Company contributors repository commits

  1. create a transformation function which takes push event commits as input and returns the number of commits grouped by company, repository (language, license) and contributor for a day;
  2. create a Spark job;
  3. add the job to the daily-osci-rankings CLI command.

Example output:

| author_email | author_name | repo_name | language | license | company | commits |
| --- | --- | --- | --- | --- | --- | --- |
| [email protected] | Lorem | datalayer-contrib/hadoop | Java | apache-2.0 | Cloudera | 3 |
| [email protected] | Ipsum | docker-library/docs | Shell | mit | Infosiftr | 200 |
| [email protected] | Dolor | golang/go | Go | other | Google | 12 |
| [email protected] | Sit | konnectors/darty | JavaScript | agpl-3.0 | Renovateapp | 4 |

OSCI Contributors YTD

  1. create a transformation function which takes push event commits as input and returns the top 5 contributors for each company based on the number of commits (a sketch follows the example output);
  2. create a Spark job;
  3. create a CLI command for this job;
  4. add the job to the daily-osci-rankings CLI command.

Example output:

| company | author | author_email | commits |
| --- | --- | --- | --- |
| Google | Lorem | [email protected] | 50 |
| Google | Ipsum | [email protected] | 30 |
| Google | Dolor | [email protected] | 20 |
| Microsoft | Sit | [email protected] | 40 |
| Microsoft | Amet | [email protected] | 20 |
| Microsoft | Consectetur | [email protected] | 10 |
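
A minimal PySpark sketch of step 1, assuming 'company', 'author_name', 'author_email' and 'sha' columns (the real column names may differ):

import pyspark.sql.functions as F
from pyspark.sql import DataFrame, Window

def get_top_contributors(push_events_commits: DataFrame, top_n: int = 5) -> DataFrame:
    """Top-N contributors per company by distinct commit count."""
    counts = (push_events_commits
              .groupBy('company', 'author_name', 'author_email')
              .agg(F.countDistinct('sha').alias('commits')))
    window = Window.partitionBy('company').orderBy(F.col('commits').desc())
    return (counts
            .withColumn('rank', F.row_number().over(window))
            .where(F.col('rank') <= top_n)
            .drop('rank'))

Note that grouping by both name and email keeps name variants of the same email address separate, which is exactly the duplication reported in "Report issues - data does not add up" above; grouping by email alone would merge them.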

OSCI shows wrong count of ACTIVE CONTRIBUTORS in yearly data

When the YEARLY data is chosen, the list shows the same numbers of ACTIVE CONTRIBUTORS and TOTAL COMMUNITY per year and company as for January of the same year. The data therefore seems to be wrong, and at least the ranking by year could be inaccurate.
Could this problem be fixed?
