This project forked from strangetom/ingredient-parser

A tool to parse recipe ingredients into structured data

Home Page: https://ingredient-parser.readthedocs.io/en/latest/

License: MIT License

Ingredient Parser

Ingredient Parser is a Python package for parsing structured information out of recipe ingredient sentences.

Documentation

Documentation on using the package and training the model can be found at https://ingredient-parser.readthedocs.io/en/latest/.

Quick Start

Install the package using pip:

python -m pip install ingredient-parser-nlp

Import the parse_ingredient function and pass it an ingredient sentence:

>>> from ingredient_parser import parse_ingredient

>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks")
{'sentence': '3 pounds pork shoulder, cut into 2-inch chunks',
 'quantity': '3',
 'unit': 'pound',
 'name': 'pork shoulder',
 'comment': ', cut into 2-inch chunks',
 'other': ''}

# Output confidence for each label
>>> parse_ingredient("3 pounds pork shoulder, cut into 2-inch chunks", confidence=True)
{'sentence': '3 pounds pork shoulder, cut into 2-inch chunks',
 'quantity': '3',
 'unit': 'pound',
 'name': 'pork shoulder',
 'comment': ', cut into 2-inch chunks',
 'other': '',
 'confidence': {'quantity': 0.9986,
  'unit': 0.9967,
  'name': 0.9535,
  'comment': 0.9967,
  'other': 0}}
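
The confidence scores can be used to flag fields that may need manual review. The following is a minimal sketch, not part of the package: the helper name low_confidence_labels and the 0.96 threshold are assumptions chosen for illustration, and the dictionary is copied from the example output above so the snippet runs without the package installed.

```python
def low_confidence_labels(parsed: dict, threshold: float = 0.96) -> list[str]:
    """Return labels with non-empty text whose confidence is below threshold.

    Hypothetical helper for illustration; it is not provided by the package.
    """
    scores = parsed.get("confidence", {})
    return [
        label
        for label, score in scores.items()
        # Skip labels with no extracted text (e.g. 'other' is empty here).
        if parsed.get(label) and score < threshold
    ]


# Dictionary copied from the confidence=True example output.
parsed = {
    "sentence": "3 pounds pork shoulder, cut into 2-inch chunks",
    "quantity": "3",
    "unit": "pound",
    "name": "pork shoulder",
    "comment": ", cut into 2-inch chunks",
    "other": "",
    "confidence": {
        "quantity": 0.9986,
        "unit": 0.9967,
        "name": 0.9535,
        "comment": 0.9967,
        "other": 0,
    },
}

flagged = low_confidence_labels(parsed)
print(flagged)  # ['name']
```

A suitable threshold depends on how much manual review is acceptable; the value used here is arbitrary.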

The returned dictionary has the following format (with an additional confidence key when confidence=True is passed):

{
    "sentence": str,
    "quantity": str,
    "unit": str,
    "name": str,
    "comment": str,
    "other": str
}
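
For code that consumes the result, the format above can be written down as a typed dictionary. This is a sketch under assumptions: the names ParsedIngredient and describe are hypothetical and not part of the package, and the example value is copied from the Quick Start output rather than produced by calling the parser.

```python
from typing import TypedDict


class ParsedIngredient(TypedDict):
    """Hypothetical type mirroring the documented return format."""

    sentence: str
    quantity: str
    unit: str
    name: str
    comment: str
    other: str


def describe(parsed: ParsedIngredient) -> str:
    """Build a short one-line summary from the parsed fields."""
    parts = [parsed["quantity"], parsed["unit"], parsed["name"]]
    text = " ".join(part for part in parts if part)
    # The comment may keep its leading ", " from the sentence; strip it.
    comment = parsed["comment"].lstrip(", ")
    return f"{text} ({comment})" if comment else text


# Value copied from the Quick Start example output.
example: ParsedIngredient = {
    "sentence": "3 pounds pork shoulder, cut into 2-inch chunks",
    "quantity": "3",
    "unit": "pound",
    "name": "pork shoulder",
    "comment": ", cut into 2-inch chunks",
    "other": "",
}

summary = describe(example)
print(summary)  # 3 pound pork shoulder (cut into 2-inch chunks)
```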

Model accuracy

The model provided in the ingredient-parser/ directory has the following accuracy on a test set comprising 25% of the total data:

Sentence-level results:
	Total: 9448
	Correct: 8189
	-> 86.67%

Word-level results:
	Total: 54854
	Correct: 52509
	-> 95.73%
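
The percentages above follow directly from the counts. The sketch below reproduces them and shows, on toy data, how the two metrics differ. It assumes a sentence counts as correct only when every word in it is labelled correctly; that definition, the function names, and the label names (QTY, UNIT, NAME, COMMENT) are illustrative assumptions, not taken from the package.

```python
def accuracy(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to two decimal places."""
    return round(100 * correct / total, 2)


def evaluate(true_labels: list[list[str]], predicted: list[list[str]]) -> dict:
    """Count correct words and sentences for parallel lists of label sequences."""
    words_total = words_correct = sentences_correct = 0
    for truth, guess in zip(true_labels, predicted):
        matches = sum(t == g for t, g in zip(truth, guess))
        words_total += len(truth)
        words_correct += matches
        # Assumed definition: a sentence is correct only if all words match.
        sentences_correct += matches == len(truth)
    return {
        "sentence": accuracy(sentences_correct, len(true_labels)),
        "word": accuracy(words_correct, words_total),
    }


# Counts reported above.
sentence_accuracy = accuracy(8189, 9448)
word_accuracy = accuracy(52509, 54854)
print(sentence_accuracy, word_accuracy)  # 86.67 95.73

# Toy data: one wrong word makes the whole second sentence incorrect.
truth = [["QTY", "UNIT", "NAME"], ["QTY", "NAME", "COMMENT", "COMMENT"]]
guess = [["QTY", "UNIT", "NAME"], ["QTY", "NAME", "NAME", "COMMENT"]]
toy = evaluate(truth, guess)
print(toy)  # {'sentence': 50.0, 'word': 85.71}
```

Under this definition a single mislabelled word fails its whole sentence, which is why sentence-level accuracy is lower than word-level accuracy.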

Development

The development dependencies are in the requirements-dev.txt file.

Note that development includes training the model.

  • Black is used for code formatting.

  • isort is used for import sorting.

  • flake8 is used for linting. Note the line length standard (E501) is ignored.

  • pyright is used for static type analysis.

  • pytest is used for tests, with coverage used to measure test coverage.

The documentation dependencies are in the requirement-doc.txt file.

Contributors

strangetom, mihasm
