musehd / multidefine

πŸ“ Compiles the definitions of multiple words into a single defintions list.

License: MIT License

Python 100.00%
vocabulary scraper vocabulary-lists hacktoberfest selenium python definition-generation definition-list

multidefine's Introduction

MultiDefine

Compiles the definitions of multiple words into a single definitions list.
Report Bug · Request Feature

About the Project:

πŸ“This program queries Google, WikiPedia and Oxford Dictionary for the definitions of multiple words and compiles these definitions into a single list of definitions.

It was created to help with my schoolwork, which often involved finding definitions for a long list of technical words. I realised I was wasting a lot of time manually searching for the definition of each word and writing it down, so I decided to automate the process.

Getting Started

Prerequisites

  • The program has been built and tested on Windows 10+ machines, so it's recommended to use a Windows machine
  • If using macOS or another environment, try the manual installation process. If you have any problems, please create a new issue
  • For manual installation, Python needs to be installed

Use one of the following methods to use the program:

  • Manual Installation

    Open a terminal and run:
    cd <PATH FOR INSTALLATION>
    git clone https://github.com/museHD/MultiDefine
    cd MultiDefine
    pip install -r requirements.txt
    
  • Downloading Release (Only on Windows)

    It is recommended to create a separate folder for the application. Download the latest release to the folder and run it. Most of the dependencies will be contained within the release and those that aren't will be downloaded to the folder.

Usage:

  • Manually Installed

    Open a terminal in the cloned MultiDefine folder and run python multidefine.py
  • EXE File

    Open the folder where the exe was downloaded. Double-click it to run the program.

Enter words separated by commas into the program and it will return the list of words with the top definition for each one. Note: Parts of the program are still quite unstable and may not always give the most relevant definition.
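As a rough illustration of the comma-separated input described above, the sketch below splits a raw string into a clean word list. The function name parse_words is hypothetical and is not taken from multidefine.py; the actual program may handle input differently.

```python
def parse_words(raw: str) -> list:
    """Split a comma-separated string into a clean list of words.

    Hypothetical helper for illustration; not part of multidefine.py.
    """
    words = []
    for part in raw.split(","):
        # Trim the ends and collapse runs of inner whitespace.
        word = " ".join(part.split())
        # Skip empty entries (e.g. from ", ,") and repeated words.
        if word and word not in words:
            words.append(word)
    return words


if __name__ == "__main__":
    print(parse_words("osmosis,  mitosis , ,cell  wall"))
```

Multi-word phrases such as "cell wall" are kept as a single entry, since only commas act as separators.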

Contributing

Please feel free to raise issues and suggest changes to the project. We are always looking for improvements! Please make pull requests against the development branch; these changes will be merged into master once the development branch is stable. Thank you!

Dependencies:

The program uses the following libraries:


These libraries are already bundled with the exe release.

All users need matching (compatible) versions of Google Chrome and [Chromedriver](https://chromedriver.chromium.org/). If you are facing issues, make sure that the chromedriver.exe file is in the same directory as the application. For example, if you have Chrome 86, download ChromeDriver v86 from the above link and place the exe in the same directory as this program.
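The two checks above (driver present next to the application, versions compatible) could be sketched as follows. This is a minimal sketch, not code from the project: the helper names are hypothetical, and comparing only the major version number is an assumption based on the "Chrome 86 / ChromeDriver v86" example.

```python
import sys
from pathlib import Path
from typing import Optional


def find_chromedriver(app_dir: Path) -> Optional[Path]:
    """Return the driver path if it sits next to the application, else None."""
    # Assumption: the file is called chromedriver.exe on Windows
    # and chromedriver elsewhere.
    name = "chromedriver.exe" if sys.platform.startswith("win") else "chromedriver"
    candidate = app_dir / name
    return candidate if candidate.is_file() else None


def major_version(version: str) -> int:
    """Extract the leading number from a version such as '86.0.4240.75'."""
    return int(version.strip().split(".")[0])


def versions_match(chrome_version: str, driver_version: str) -> bool:
    """Treat versions as compatible when their major numbers are equal."""
    return major_version(chrome_version) == major_version(driver_version)
```

The version strings themselves would have to come from somewhere else, for instance Chrome's "About" page and the name of the ChromeDriver download.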


multidefine's People

Contributors

deepsourcebot, delaguardianick, lpuv, metavinayak, musehd, suvanbalu

multidefine's Issues

Async Code execution

One of the major problems right now is performance. The program goes through every step for one word before it moves on to the next.

I've wanted to look into async calls but haven't gotten around to it. The current code will most likely need to be restructured, as the order of operations needs to be taken into account: the program should retrieve the definition for a word from only one source, rather than from several. Ideally, it should also detect how much additional load is being put on the system and add threads accordingly, improving performance as well as accessibility.

If anyone is interested in implementing this, please let me know.
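One possible direction, sketched below, is to look up several words at once with a thread pool from the standard library. Note that this uses threads rather than async/await, and that the lookup function here is a stand-in: in the real program each worker would presumably need its own browser session, which this sketch does not cover. All names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def define_all(words, lookup, max_workers=4):
    """Run lookup(word) for every word in parallel; failures map to None."""
    words = list(words)

    def safe_lookup(word):
        # One failing word should not abort the whole batch.
        try:
            return lookup(word)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input words.
        return dict(zip(words, pool.map(safe_lookup, words)))


def slow_lookup(word):
    """Stand-in for a real lookup: sleeps to imitate network latency."""
    time.sleep(0.1)
    if word == "bad":
        raise LookupError(word)
    return word.upper()


if __name__ == "__main__":
    print(define_all(["cell", "atom", "bad"], slow_lookup))
```

A fixed max_workers is used here; adjusting the number of threads to the current system load, as suggested above, is left out of the sketch.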

New search doesn't clear issues from previous search

When the definition for a word cannot be obtained, the word is added to the list of failed words. When the program is re-run by pressing any key, the list is not cleared, so the failed words stay in it for the rest of the session. This should be easy to fix by clearing the list every time the user reruns the program.
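A minimal sketch of the fix, assuming the failed-word list can be created inside the function that handles one run instead of living at module level. The names run_once and lookup are hypothetical and do not come from the project's code.

```python
def run_once(words, lookup):
    """Process one batch of words and return (definitions, failed)."""
    definitions = {}
    failed = []  # created fresh on every call, so nothing carries over
    for word in words:
        definition = lookup(word)
        if definition:
            definitions[word] = definition
        else:
            failed.append(word)
    return definitions, failed


if __name__ == "__main__":
    known = {"cell": "the basic unit of life"}
    print(run_once(["cell", "zzz"], known.get))  # first run: 'zzz' fails
    print(run_once(["cell"], known.get))         # second run: empty failed list
```

If the list has to stay global, calling its clear() method at the start of each run would have the same effect.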

Messy Code

Currently, all of the lookup logic lives in a single function called get_ans(), which is not ideal from a development perspective.
It would be worth breaking up the statements in the try and except blocks into separate functions to make the code easier to maintain and debug.
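One way this split might look is sketched below: each source gets its own small function, and get_ans() only walks through them in order. The per-source functions here are placeholders that return canned data; their names and the (name, function) pairing are assumptions for illustration, not the project's actual structure.

```python
def from_google(word):
    """Placeholder for the Google lookup; returns None when nothing is found."""
    return {"cell": "the basic unit of life"}.get(word)


def from_wikipedia(word):
    """Placeholder for the Wikipedia lookup; raises to imitate a scraping error."""
    if word == "atom":
        return "the smallest unit of an element"
    raise LookupError(word)


SOURCES = [("Google", from_google), ("Wikipedia", from_wikipedia)]


def get_ans(word, sources=SOURCES):
    """Return (source_name, definition) from the first source that succeeds."""
    for name, fetch in sources:
        try:
            definition = fetch(word)
        except Exception:
            # Broad on purpose in this sketch: any failure moves to the next source.
            continue
        if definition:
            return name, definition
    return None


if __name__ == "__main__":
    for word in ("cell", "atom", "zzz"):
        print(word, "->", get_ans(word))
```

With this layout, adding or reordering a source means editing the SOURCES list rather than nesting another try and except block.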

New driver not updated

update.py downloads the required zip file and extracts it, but doesn't replace the old driver. I've updated it to install the new driver in #8, so please take a look.
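For illustration, the replacement step could be sketched as below. This is not the code from update.py or from #8; the function name install_driver, the default file name, and the archive layout in the demo are all assumptions.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def install_driver(zip_path: Path, app_dir: Path, name: str = "chromedriver.exe") -> Path:
    """Extract the driver from the zip and overwrite the copy in app_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        # The driver may sit inside a sub-folder of the archive.
        member = next((m for m in archive.namelist() if Path(m).name == name), None)
        if member is None:
            raise FileNotFoundError(f"{name} not found in {zip_path}")
        data = archive.read(member)
    target = app_dir / name
    staged = target.with_name(name + ".new")
    staged.write_bytes(data)
    # os.replace overwrites an existing file; both paths are in the same folder.
    os.replace(staged, target)
    return target


def demo() -> bytes:
    """Build a fake download in a temp folder and replace a fake old driver."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp)
        (app_dir / "chromedriver.exe").write_bytes(b"old driver")
        zip_path = app_dir / "download.zip"
        with zipfile.ZipFile(zip_path, "w") as archive:
            archive.writestr("chromedriver_win32/chromedriver.exe", b"new driver")
        return install_driver(zip_path, app_dir).read_bytes()


if __name__ == "__main__":
    print(demo())
```

On Windows the replacement can fail while the old driver is still running, so it would make sense to run this step before a browser session is started.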

Fix Duplication for Wikipedia

Definitions often break when searching for specific phrases because of Google's new design.
Try selecting by id and/or using next siblings in Selenium to make sure that the right element is selected and displayed.

Inconsistent Formatting/Alignment of Results

The current formatting of results is quite inconsistent, adding spaces for some results but not for others.
I believe this is because different sources format their text differently. This results in unaligned output.

This should be easy to fix by stripping the data collected, removing spaces and/or newlines and making sure that all of them are aligned.
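A small sketch of that clean-up, assuming the results are available as a dict of word to definition. The function name format_results and the "word : definition" layout are hypothetical; the program's real output format may differ.

```python
def format_results(results: dict) -> str:
    """Normalise whitespace and align definitions in one column."""
    # " ".join(text.split()) strips the ends and collapses newlines and
    # repeated spaces into single spaces.
    cleaned = {" ".join(w.split()): " ".join(d.split()) for w, d in results.items()}
    # Pad every word to the length of the longest one.
    width = max((len(w) for w in cleaned), default=0)
    return "\n".join(f"{w.ljust(width)} : {d}" for w, d in cleaned.items())


if __name__ == "__main__":
    print(format_results({
        " osmosis ": "  movement of water\nacross a membrane ",
        "ion": "a charged atom",
    }))
```

Because every source's text passes through the same normalisation, differences in how each site pads or wraps its text no longer reach the output.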
