Giter Site home page Giter Site logo

auto-evaluator's Introduction

Auto-evaluator 🧠 📝

This is a lightweight evaluation tool for question-answering using Langchain to:

  • Ask the user to input a set of documents of interest

  • Use an LLM (GPT-3.5-turbo) to auto-generate question``answer pairs from these docs

  • Generate a question-answering chain with a specified set of UI-chosen configurations

  • Use the chain to generate a response to each question

  • Use an LLM (GPT-3.5-turbo) to score the response relative to the answer

  • Explore scoring across various chain configurations

Run as Streamlit app

pip install -r requirements.txt

streamlit run auto-evaluator.py

Inputs

num_eval_questions - Number of question to auto-generate (if the user does not supply an eval set)

split_method - Method for text splitting

chunk_chars - Chunk size for text splitting

overlap - Chunk overlap for text splitting

embeddings - Embedding method for chunks

retriever_type - Chunk retrival method

num_neighbors - Neighbors in retrivial

model - LLM for summarization from retrived chunks

grade_prompt - Promp choice for grading

Blog

https://blog.langchain.dev/auto-eval-of-question-answering-tasks/

UI

ui

Disclaimer

You will need an OpenAI API key with with access to `GPT-4` and an Anthropic API key to take advantage of all of the default dashboard model settings. However, additional models (e.g., from HuggingFace) can be easily added to the app.

auto-evaluator's People

Contributors

rlancemartin avatar prem2012 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.