Comments (8)
I will respond to this in two parts.
- It is a good idea to add instructions for how to modify this file. A little background - to create this list we reviewed email addresses associated with commits to GitHub and then filtered out which ones were from companies (commercial organizations), leading to the creation of this list in the SQL file. Since the concept of OSCI was to rank companies (following on from some earlier published studies of this kind), we excluded email addresses from universities, freemail providers, etc. It was a fair bit of effort to research all the email domains and decide which categories to put them in. We also had to combine email domains where a single organization uses several. It's possible we missed some companies or subdomains in this exercise, and of course new companies will need to be added from time to time. So this list needs to be maintained, but yes, you are right, it will be necessary to publish the rationale and instructions.
from osci.
- The other issue is people using non-corporate emails. We researched many ways to identify which organization contributions were coming from, such as the contributor's profile and the org of the repo. All of these have pros and cons, and in the end we concluded that - for now - we would use the email domain - even knowing that this will undercount the totals. There is no perfect science to these analyses but we felt this still gives valuable results. We would like to return to the idea of improving this algorithm - this task will need a bit of scoping first, which I do plan to work on.
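As a rough illustration of the list-building step described above, here is a minimal Python sketch that counts commits per email domain while skipping freemail and academic addresses. The exclusion lists, function names, and sample addresses are hypothetical; the actual OSCI categories live in the project's SQL file and are not shown in this thread.

```python
from collections import Counter

# Hypothetical exclusion lists, for illustration only.
FREEMAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}
ACADEMIC_SUFFIXES = (".edu", ".ac.uk")


def email_domain(email):
    """Return the lower-cased domain of an email address, or None if malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


def candidate_company_domains(commit_emails):
    """Count commits per domain, skipping freemail and academic domains."""
    counts = Counter()
    for email in commit_emails:
        domain = email_domain(email)
        if domain is None:
            continue  # not a usable address
        if domain in FREEMAIL_DOMAINS or domain.endswith(ACADEMIC_SUFFIXES):
            continue  # excluded category
        counts[domain] += 1
    return counts


if __name__ == "__main__":
    sample = ["a@epam.com", "b@EPAM.com", "c@gmail.com", "d@mit.edu", "broken"]
    print(candidate_company_domains(sample).most_common())
```

The remaining domains would still need the manual research mentioned above before they could be assigned to a company.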
There is a pain point in open source projects around contributions made while the contributor is under contract with a company. The burden of proving that a project does not receive corporate code through a third-party contribution falls on the open source project, which leads to various CLAs and conditional acceptance of contributions. This really hurts.
What could be improved from the corporate side is to make it clear and public which contributions are sponsored or covered by an existing contract. Companies could also set standards that bring clarity to contracts stating that any code a person writes belongs to the company. If company lawyers were responsible for tracking a person's official relationship with the company and maintaining that record online, open source developers would feel less pressure over these legal issues, and OSCI could get fine-grained information about which emails were involved with which projects, from which company, in a given period.
@patrickstephens1 - I'm a member of EPAM and have a GitHub account, but not part of the EPAM organization in GitHub. In my profile, I have plenty of MIT licensed stuff that I use in conferences and workshops. So how do I show up in EPAM as a contributor in OSCI based on the current algorithm?
@gitaroktato Hey Oresztesz. The OSCI algorithm uses the email domain of the author of the commit. We have a filter which picks out all the company email domains we have found in our analysis. Anyone (EPAM or other) who wants their contributions to be picked up should set their company email address on their public profile, or alternatively it can be set at the repo level (see here https://help.github.com/en/enterprise/2.19/user/github/setting-up-and-managing-your-github-user-account/setting-your-commit-email-address). Does that answer your question?
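The lookup described in this reply can be sketched roughly as follows. This is only an illustration in Python, not the OSCI implementation: the mapping, company names, and the `company_for_author` helper are hypothetical, and matching subdomains against a parent domain is an assumption rather than something confirmed in this thread.

```python
# Hypothetical mapping; several domains may point at the same company.
DOMAIN_TO_COMPANY = {
    "example.com": "Example Corp",
    "example.co.uk": "Example Corp",
    "sample.io": "Sample Inc",
}


def company_for_author(author_email):
    """Return the company for a commit author email, or None if unmapped."""
    _, sep, domain = author_email.strip().rpartition("@")
    if not sep or not domain:
        return None
    parts = domain.lower().split(".")
    # Try the full domain first, then each parent domain (assumed behaviour).
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_TO_COMPANY:
            return DOMAIN_TO_COMPANY[candidate]
    return None


if __name__ == "__main__":
    for email in ["dev@example.com", "dev@eng.example.co.uk", "dev@gmail.com"]:
        print(email, "->", company_for_author(email))
```

A commit authored with a freemail address falls through to `None`, which is why contributions made under a personal email are not counted for the employer.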
@abitrolly to an earlier question, the project README is now updated with instructions on how to add companies and email domains. Actually we are almost ready to publish an update to this mapping, having recently completed another analysis of the email domains we see in larger numbers of commits. This will be done in the next week or so.
@patrickstephens1 thanks for the heads up! The commit with instructions is 54e8526
Ideally the mappings should be in the repository root in a self-describing format. The files people interact with most often (custom mappings and configuration) should not be buried so deep that specialized docs are needed to find them. Anyway, the docs are awesome.
@patrickstephens1 OK, I've changed my e-mail address in the git history. I hope it makes some impact 😃. Cheers!