
csurfer / gitsuggest

A tool to suggest github repositories based on the repositories you have shown interest in.

Home Page: http://www.gitsuggest.com

License: MIT License

Language: Python 100.00%

Topics: github, repository, suggestion-engine, lda-model, nltk

Contributors: albalitz, csurfer, dependabot[bot]

gitsuggest's Issues

Traceback when running gitsuggest

Hi,

I installed gitsuggest using pip. When I run gitsuggest, I get the following traceback.

I am running it on a Mac.

Traceback (most recent call last):
  File "/usr/local/bin/gitsuggest", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/commandline.py", line 57, in main
    repos = list(gs.get_suggested_repositories())
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 64, in get_suggested_repositories
    query = self.__get_query_for_repos(term_count=term_count)
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 230, in __get_query_for_repos
    self.__construct_lda_model()
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 213, in __construct_lda_model
    cleaned_tokens = self.__clean_and_tokenize(repos_of_interest)
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 170, in __clean_and_tokenize
    stopwords = self.__get_words_to_ignore()
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 139, in __get_words_to_ignore
    with open('../gitlang/languages.txt', 'r') as langauges:
IOError: [Errno 2] No such file or directory: '../gitlang/languages.txt'
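
The traceback shows `open('../gitlang/languages.txt')`, a relative path that Python resolves against the current working directory rather than against the installed package. A minimal sketch of one way around that, assuming the file lives in a `gitlang` directory beside the package directory; the helper name `data_file_path` and the example path are hypothetical, not taken from gitsuggest:

```python
import os


def data_file_path(module_file, *parts):
    """Return a normalised absolute path built relative to module_file's directory."""
    base_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.normpath(os.path.join(base_dir, *parts))


# Inside a module, the first argument would normally be __file__.
# A literal path is used here only so the sketch runs anywhere.
language_file = data_file_path(
    "/opt/site-packages/gitsuggest/suggest.py", "..", "gitlang", "languages.txt"
)
```

This only fixes the lookup; the data file still has to be shipped with the package for the path to exist after installation.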

Error happens when a wrong password is entered

[adam@macbook ~]$ gitsuggest lordadamson

INFO: Authentication (with password) have higher rate limits.
INFO: Skipping password might cause failure due to rate limit.

Password (to skip press enter):
Generating suggestions...
Traceback (most recent call last):
  File "/usr/bin/gitsuggest", line 9, in <module>
    load_entry_point('gitsuggest==0.0.8', 'console_scripts', 'gitsuggest')()
  File "/usr/lib/python3.5/site-packages/gitsuggest/commandline.py", line 61, in main
    gs = GitSuggest(arguments.username, password)
  File "/usr/lib/python3.5/site-packages/gitsuggest/suggest.py", line 50, in __init__
    self.__populate_repositories_of_interest(username)
  File "/usr/lib/python3.5/site-packages/gitsuggest/suggest.py", line 114, in __populate_repositories_of_interest
    user = self.github.get_user(username)
  File "/usr/lib/python3.5/site-packages/github/MainClass.py", line 167, in get_user
    "/users/" + login
  File "/usr/lib/python3.5/site-packages/github/Requester.py", line 172, in requestJsonAndCheck
    return self.__check(*self.requestJson(verb, url, parameters, headers, input, cnx))
  File "/usr/lib/python3.5/site-packages/github/Requester.py", line 180, in __check
    raise self.__createException(status, responseHeaders, output)
github.GithubException.BadCredentialsException: 401 {'documentation_url': 'https://developer.github.com/v3', 'message': 'Bad credentials'}

Maybe we could handle the bad credentials exception by just printing "wrong password, try again".
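
A rough sketch of that idea, kept independent of any GitHub client so it can run on its own. The function name and its parameters are hypothetical; in gitsuggest, `connect` would presumably wrap `GitSuggest(username, password)` and `auth_error` would be the `BadCredentialsException` seen in the traceback above:

```python
def prompt_until_authenticated(connect, ask_password, auth_error, max_attempts=3):
    """Call connect(password) until it succeeds or the attempts run out.

    connect, ask_password and auth_error are supplied by the caller, so this
    helper makes no assumption about the underlying client library.
    """
    for attempt in range(1, max_attempts + 1):
        password = ask_password()
        try:
            return connect(password)
        except auth_error:
            remaining = max_attempts - attempt
            print("Wrong password, try again ({} attempts left).".format(remaining))
    # Exit with a short message instead of a full traceback.
    raise SystemExit("Authentication failed.")
```

For interactive use, `ask_password` could be `getpass.getpass` from the standard library.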

[Feature-Request] Additional Information Regarding the Repositories Suggested

The suggested repositories are currently listed without their language, star count, or fork count. It would be great to include this information with each suggestion. It would help users pick the repositories that suit them best based on language and popularity, and make it easier to start working on a suggested project.
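
A minimal sketch of what such output could look like. It relies only on attributes that PyGithub's `Repository` objects expose (`full_name`, `language`, `stargazers_count`, `forks_count`, `description`); the function name and the layout are assumptions, not gitsuggest's actual output format:

```python
def format_suggestion(repo):
    """Build a two-line summary from attributes a PyGithub Repository exposes."""
    return "{0} [{1}] stars: {2} forks: {3}\n    {4}".format(
        repo.full_name,
        repo.language or "unknown",  # language is None for some repositories
        repo.stargazers_count,
        repo.forks_count,
        repo.description or "(no description)",
    )
```

Any object with these five attributes works, which keeps the formatter easy to test without network access.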

2FA issue

As you might imagine, getting the password from a user won't really work if the user has 2FA set up and authenticates to GitHub via SSH.

$ gitsuggest aleksandar-todorovic
Password: 
Generating suggestions...
Traceback (most recent call last):
  File "/usr/local/bin/gitsuggest", line 11, in <module>
    load_entry_point('gitsuggest==0.0.3', 'console_scripts', 'gitsuggest')()
  File "/usr/local/lib/python2.7/dist-packages/gitsuggest/commandline.py", line 56, in main
    gs = GitSuggest(arguments.username, password)
  File "/usr/local/lib/python2.7/dist-packages/gitsuggest/suggest.py", line 53, in __init__
    raise ValueError('Unable to authenticate the user.')

use README instead of repo description

Just an idea:
I think README would be the best thing to run LDA on, since it contains a pretty good description of the project. Projects without README should be penalised either way. Often times the repository description is too short to describe in detail what the repository is all about.
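
A sketch of how the text source could be swapped, assuming PyGithub's `Repository.get_readme()`, which returns a content file whose `decoded_content` holds the raw bytes. The function name is hypothetical, and the broad `except` is a simplification to keep the sketch library-agnostic:

```python
def text_for_topic_model(repo):
    """Prefer the README text; fall back to the short repository description."""
    try:
        readme = repo.get_readme().decoded_content.decode("utf-8", "replace")
    except Exception:
        # A missing README makes the client raise; treat that as "no README".
        readme = ""
    if readme.strip():
        return readme
    return repo.description or ""
```

Note that fetching each README costs one extra API request per repository, so this interacts with the rate-limit concerns raised in other issues.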

NLTK resource not found

I get the following when running gitsuggest:

Generating suggestions...
Traceback (most recent call last):
  File "/usr/local/bin/gitsuggest", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/commandline.py", line 57, in main
    repos = list(gs.get_suggested_repositories())
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 64, in get_suggested_repositories
    query = self.__get_query_for_repos(term_count=term_count)
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 230, in __get_query_for_repos
    self.__construct_lda_model()
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 213, in __construct_lda_model
    cleaned_tokens = self.__clean_and_tokenize(repos_of_interest)
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 170, in __clean_and_tokenize
    stopwords = self.__get_words_to_ignore()
  File "/usr/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 135, in __get_words_to_ignore
    english_stopwords = nltk.corpus.stopwords.words('english')
  File "/usr/local/lib/python2.7/site-packages/nltk/corpus/util.py", line 116, in __getattr__
    self.__load()
  File "/usr/local/lib/python2.7/site-packages/nltk/corpus/util.py", line 81, in __load
    except LookupError: raise e
LookupError:
**********************************************************************
  Resource u'corpora/stopwords' not found.  Please use the NLTK
  Downloader to obtain the resource:  >>> nltk.download()
  Searched in:
    - '/Users/callumhoward/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************

On macOS, installed with pip:

$ pip show gitsuggest
Name: gitsuggest
Version: 0.0.3
Summary: A tool to suggest github repositories based on the repositories you have shown interest in.
Home-page: https://github.com/csurfer/gitsuggest
Author: Vishwas B Sharma
Author-email: [email protected]
License: MIT
Location: /usr/local/lib/python2.7/site-packages
Requires: PyGithub, gensim, nltk, pyenchant
$ python --version
Python 2.7.13
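
One possible workaround is to check for the corpus at startup and download it when it is missing. The sketch below uses `nltk.data.find` and `nltk.download`, both part of NLTK; the wrapper name and its injectable `find`/`download` parameters are assumptions added so the logic can be exercised without NLTK or network access:

```python
def ensure_nltk_resource(resource_path, package, find=None, download=None):
    """Download an NLTK package if its resource cannot be found.

    Returns True when a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested with stubs

        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)
        return False
    except LookupError:
        download(package)
        return True
```

In gitsuggest this would be called as `ensure_nltk_resource('corpora/stopwords', 'stopwords')` before the stopword list is read.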

IOError: [Errno 2] No such file or directory: '/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/../gitlang/languages.txt

[adam@macbook antiyoy]$ gitsuggest lordadamson
Password: 
Generating suggestions...
Traceback (most recent call last):
  File "/usr/bin/gitsuggest", line 9, in <module>
    load_entry_point('gitsuggest==0.0.3', 'console_scripts', 'gitsuggest')()
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/commandline.py", line 57, in main
    repos = list(gs.get_suggested_repositories())
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 66, in get_suggested_repositories
    query = self.__get_query_for_repos(term_count=term_count)
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 250, in __get_query_for_repos
    self.__construct_lda_model()
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 232, in __construct_lda_model
    cleaned_tokens = self.__clean_and_tokenize(repos_of_interest)
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 186, in __clean_and_tokenize
    stopwords = self.__get_words_to_ignore()
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 153, in __get_words_to_ignore
    with open(language_file, 'r') as langauges:
IOError: [Errno 2] No such file or directory: '/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/../gitlang/languages.txt'

Use public API and rate-limit appropriately

Thanks for the tool and your response on Reddit. I would still prefer not to type my password into the script and instead just have it rate-limit its requests to fit the unauthenticated quota. #4 could be avoided as well.
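
A minimal sketch of client-side throttling, assuming a quota of 60 requests per hour for unauthenticated calls (the figure GitHub has documented for its REST API; treat it as an assumption and check the current limits). The `Throttle` class is hypothetical and simply spaces calls evenly:

```python
import time


class Throttle(object):
    """Space calls so that at most max_calls happen per period_seconds."""

    def __init__(self, max_calls, period_seconds, clock=time.time, sleep=time.sleep):
        self.min_interval = float(period_seconds) / max_calls
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            delay = self.min_interval - (now - self._last_call)
            if delay > 0:
                self._sleep(delay)
                now = self._clock()
        self._last_call = now


# 60 requests per hour works out to one request per minute.
unauthenticated = Throttle(max_calls=60, period_seconds=3600)
```

Calling `unauthenticated.wait()` before each API request keeps the tool inside the quota, at the cost of a much slower run.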

py3 incompatibility

After a pip install in a default py3 environment, it runs with an appropriate error. Would you add some Python version restrictions to setup.py?

Raises TypeError From Commandline

Just ran it with:

gitsuggest j-chad --password ******
Traceback (most recent call last):
  File "gitsuggest-script.py", line 11, in <module>
    load_entry_point('gitsuggest==0.0.2', 'console_scripts', 'gitsuggest')()
  File "gitsuggest\commandline.py", line 63, in main
    repos = list(gs.get_suggested_repositories())
  File "gitsuggest\suggest.py", line 64, in get_suggested_repositories
    query = self.__get_query_for_repos(term_count=term_count)
  File "gitsuggest\suggest.py", line 230, in __get_query_for_repos
    self.__construct_lda_model()
  File "gitsuggest\suggest.py", line 204, in __construct_lda_model
    repos_of_interest = self.__get_interests()
  File "gitsuggest\suggest.py", line 118, in __get_interests
    self.__repositories_interested_in.update(cur_user.get_starred())
TypeError: unhashable type: 'Repository'
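
The error comes from `set.update()` receiving objects that cannot be hashed. A sketch of one way to deduplicate without hashing the objects themselves, keyed on the `full_name` attribute that PyGithub repositories carry; the helper name is hypothetical:

```python
def unique_by_name(repositories, seen=None):
    """Collect repositories into a dict keyed by full_name, skipping duplicates."""
    seen = {} if seen is None else seen
    for repo in repositories:
        # Only the string key is hashed, never the repository object.
        seen.setdefault(repo.full_name, repo)
    return seen
```

The result can be passed back in as `seen` to merge starred repositories from several users into one collection.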

Please implement SOME kind of caching

We do not care that much about complete results; just give us some suggestions.

I had to try 4 times and still got no results, and now I have used up all the rate limits.

  • old version (0.0.3), crashed

  • without password, didn't finish

  • first logged in, missing nltk stopwords

  • second

     enchant.errors.DictNotFoundError: Dictionary for language 'en_US' could not be found
    
  • third, rate limited

If you had stored the fetched information and deleted it only after a successful run, this wouldn't have happened. Maybe store the results in /tmp?
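
A small sketch of such a cache, written for Python 3 and using only the standard library. The file naming scheme and function names are assumptions; the data is whatever JSON-serialisable structure the tool has fetched so far:

```python
import json
import os
import tempfile


def cache_path(username, cache_dir=None):
    """Return the cache file location for a user (the system temp dir by default)."""
    cache_dir = cache_dir or tempfile.gettempdir()
    return os.path.join(cache_dir, "gitsuggest-{}.json".format(username))


def load_cache(path):
    """Return the cached data, or None if the file is missing or unreadable."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except (OSError, ValueError):
        return None


def save_cache(path, data):
    """Write to a temporary file first so a crash never leaves a half-written cache."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(data, handle)
    os.replace(tmp_path, path)
```

After a successful run the tool could call `os.remove(path)`, which matches the "delete after success" behaviour suggested above.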

gitsuggest what?

Hello,
This is more of a question than an issue. I have reviewed a few repos and I want a list of similar repos in terms of project type, e.g. simulations, computational modeling, or GIS projects. Would gitsuggest be able to find such repos?

Resource u'corpora/stopwords' not found

Generating suggestions...
Traceback (most recent call last):
  File "/home/username/.envs/gitsuggest/bin/gitsuggest", line 11, in <module>
    sys.exit(main())
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/gitsuggest/commandline.py", line 57, in main
    repos = list(gs.get_suggested_repositories())
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 64, in get_suggested_repositories
    query = self.__get_query_for_repos(term_count=term_count)
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 230, in __get_query_for_repos
    self.__construct_lda_model()
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 213, in __construct_lda_model
    cleaned_tokens = self.__clean_and_tokenize(repos_of_interest)
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 170, in __clean_and_tokenize
    stopwords = self.__get_words_to_ignore()
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/gitsuggest/suggest.py", line 135, in __get_words_to_ignore
    english_stopwords = nltk.corpus.stopwords.words('english')
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/nltk/corpus/util.py", line 116, in __getattr__
    self.__load()
  File "/home/username/.envs/gitsuggest/local/lib/python2.7/site-packages/nltk/corpus/util.py", line 81, in __load
    except LookupError: raise e
LookupError: 
**********************************************************************
  Resource u'corpora/stopwords' not found.  Please use the NLTK
  Downloader to obtain the resource:  >>> nltk.download()
  Searched in:
    - '/home/username/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************

Application error

Right after GitHub login it reports an application error. I have 1.2k stars, so maybe that's the reason. There should probably be a limit on the number of records processed to avoid an app crash.
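
Whether the star count actually caused the crash is not confirmed here, but capping the input is simple. A sketch using `itertools.islice`, which stops pulling from a lazy source once the cap is reached; the function name and the cap value are hypothetical:

```python
from itertools import islice


def take_limited(items, limit):
    """Return at most `limit` items without consuming the rest of the iterable."""
    return list(islice(items, limit))


# Example with a plain range standing in for a paginated list of starred repos.
first_batch = take_limited(range(1200), 100)
```

With a paginated API result, stopping early also avoids requesting the remaining pages.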

Resource u'corpora/stopwords' not found. Please use the NLTK

Hi. I cloned the repo and installed it using: sudo python setup.py install

[adam@macbook ~]$ gitsuggest lordadamson
Password: 
Generating suggestions...
Traceback (most recent call last):
  File "/usr/bin/gitsuggest", line 9, in <module>
    load_entry_point('gitsuggest==0.0.3', 'console_scripts', 'gitsuggest')()
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/commandline.py", line 57, in main
    repos = list(gs.get_suggested_repositories())
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 66, in get_suggested_repositories
    query = self.__get_query_for_repos(term_count=term_count)
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 250, in __get_query_for_repos
    self.__construct_lda_model()
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 232, in __construct_lda_model
    cleaned_tokens = self.__clean_and_tokenize(repos_of_interest)
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 186, in __clean_and_tokenize
    stopwords = self.__get_words_to_ignore()
  File "/usr/lib/python2.7/site-packages/gitsuggest-0.0.3-py2.7.egg/gitsuggest/suggest.py", line 147, in __get_words_to_ignore
    english_stopwords = nltk.corpus.stopwords.words('english')
  File "/usr/lib/python2.7/site-packages/nltk-3.2.4-py2.7.egg/nltk/corpus/util.py", line 116, in __getattr__
    self.__load()
  File "/usr/lib/python2.7/site-packages/nltk-3.2.4-py2.7.egg/nltk/corpus/util.py", line 81, in __load
    except LookupError: raise e
LookupError: 
**********************************************************************
  Resource u'corpora/stopwords' not found.  Please use the NLTK
  Downloader to obtain the resource:  >>> nltk.download()
  Searched in:
    - '/home/adam/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************

I can see it's using Python 2.7 (which is my default Python). Should I have installed it using Python 3 instead?
I don't know what nltk_data is, but maybe the installer could check whether it exists?
