Comments (6)
In that case, I suggest that bodega supports both --repo (the default positional argument right now) and --accounts. If --repo is specified, then all accounts from the given repository are considered. If --accounts is provided, then all the listed accounts are considered (i.e. their comments are downloaded, regardless of the repositories where they were made). If both are provided, then only the specified accounts active in the given repository are considered, and only their comments within that repository.
Additionally, --repo could be a list of repositories instead of a single repository. In that case, all accounts from the given repositories are considered (but only comments within these repositories are downloaded and processed).
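To make the proposed semantics concrete, here is a minimal sketch in Python. It is not the tool's actual code: the parser layout, the `select_comments` helper and the `(repo, account, text)` tuple shape are all assumptions made for illustration, and --repo is modelled as an optional flag rather than the current positional argument.

```python
import argparse


def build_parser():
    # Hypothetical parser: the flag names follow the proposal above,
    # everything else (prog name, metavars) is assumed.
    parser = argparse.ArgumentParser(prog="bodega")
    parser.add_argument("--repo", nargs="+", default=[], metavar="OWNER/NAME",
                        help="one or more repositories")
    parser.add_argument("--accounts", nargs="+", default=[], metavar="LOGIN",
                        help="one or more accounts")
    return parser


def select_comments(comments, repos, accounts):
    """Filter (repo, account, text) tuples according to the proposal.

    - only repos given: every account, comments within those repos only
    - only accounts given: those accounts, comments from any repo
    - both given: those accounts, comments within those repos only
    """
    if not repos and not accounts:
        raise ValueError("provide --repo, --accounts, or both")
    repos, accounts = set(repos), set(accounts)
    return [
        c for c in comments
        if (not repos or c[0] in repos) and (not accounts or c[1] in accounts)
    ]
```

With this shape, the three cases collapse into a single filter: an empty list for either flag means "no restriction on that dimension".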
from bodegha.
Yes, I think we should have a --repo that can take either a single repository or a list of repositories, just like --accounts can take one or several accounts. I also very much like the other ideas suggested by @AlexandreDecan above.
But our model was trained on repository-account pairs. Is it correct to classify an account based on comments from several repositories?
You can check for this. I don't see any major reason why it would fail. Human comments will be more varied, and bot comments are likely to be more similar to each other.
But increasing the number of repositories could increase the number of patterns for bots.
But it also increases the number of comments considered for them. The best you can do is to check this against the ground truth dataset: download extra comments for some accounts and see if the model is still somewhat reliable ;-)
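One way such a check could be scripted, as a rough sketch only: the helper names and the dict-of-labels shape are assumptions, not part of the tool. It presumes you already have ground-truth labels per account, plus predictions obtained once from single-repository comments and once from comments gathered across several repositories.

```python
def accuracy(truth, predicted):
    """Share of ground-truth accounts whose predicted label matches.

    Both arguments are assumed to be dicts mapping account -> label.
    Accounts missing from `predicted` are ignored.
    """
    shared = [account for account in truth if account in predicted]
    if not shared:
        raise ValueError("no account in common")
    return sum(truth[a] == predicted[a] for a in shared) / len(shared)


def changed_accounts(single_repo, multi_repo):
    """Accounts whose predicted label flips once extra comments are added."""
    return sorted(
        a for a in single_repo
        if a in multi_repo and single_repo[a] != multi_repo[a]
    )
```

Comparing `accuracy(truth, single_repo)` with `accuracy(truth, multi_repo)` shows whether the extra comments help or hurt overall, and `changed_accounts` points at the individual cases worth inspecting by hand.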