BALM: Batch Analysis with Language Models

BALM makes processing batches of documents via large language models easy. Features include:

  • Accessible user interface without coding.
  • Support for zero-shot and few-shot learning.
  • Easy profiling of models on the same task.
  • Automatic result aggregation and visualization.

Local Setup

The following installation instructions have been tested with Python 3.10 on Ubuntu Server 22.04.

  1. Download this repository, e.g., by executing
git clone https://github.com/itrummer/balm
  2. Make sure that pip is installed:
pip --version

If you get an error message, install pip:

sudo apt-get update
sudo apt install python3-pip
  3. Change into the balm directory and install requirements:
cd balm
sudo pip install -r requirements.txt

Running BALM

From the balm directory, execute:

./start.sh

You should see the message "You can now view your Streamlit app in your browser.", followed by a Network URL and an External URL. Open the Network URL in your browser to access a local BALM installation, or the External URL to access a remote BALM server. If you use BALM remotely, make sure that port 8501 is reachable. For example, when running BALM on an Amazon EC2 instance, add a custom TCP rule for port 8501 to the instance's Inbound Rules.
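If the page does not load when BALM runs remotely, a quick TCP check can tell you whether port 8501 is reachable at all. The following is a minimal sketch using only the Python standard library. The function name `port_is_reachable` is hypothetical and not part of BALM, and the host you pass in is whatever address your own server uses.

```python
import socket


def port_is_reachable(host: str, port: int = 8501, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (replace the placeholder with the address of your BALM server):
# print(port_is_reachable("your-server-address"))
```

A result of False only shows that the TCP connection failed. The cause may be a firewall rule, a stopped BALM process, or a wrong address.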

Example: Analyzing Movie Reviews

We introduce the BALM interface with an example scenario. You find the example data here. It is a .csv file containing 100 movie reviews in the first column. In the following, we use language models to map each review to a sentiment (positive or negative). This example uses language models by OpenAI and requires a corresponding account (see here).

  1. Open the BALM interface in your Web browser (this example was tested on Google Chrome but most browsers will work).
  2. Click on the Credentials box. Copy your OpenAI API key into the corresponding field (see here).
  3. Click on the Models box. Leave the number of models at the default (1) and select a model, e.g., gpt-3.5-turbo.
  4. Enter a task description in the prompt field. For instance:
Is the sentiment positive (Yes/No)?
  5. Optionally, specify examples to increase output quality (few-shot learning). Click on the Examples box, choose the number of examples, then enter example input and output. E.g.:
Example input: This movie was really bad.
Example output: No
  6. Select CSV for the input type (the default), then click on Browse files and select the input file you previously downloaded.
  7. Movie reviews are stored in the first column, therefore use 0 (default) for the column index (we count starting from zero).
  8. Optionally, restrict the number of reviews to process by clicking on the Limit rows checkbox and setting a maximal number.
  9. Click on the Process Data button to start processing.

You will see results in the result table as they become available from the language model. After processing all input, BALM automatically generates several aggregate statistics. Click on the Output Distribution box to obtain a visualization showing how often the language model produced each output. The Model Agreement box is only interesting if multiple models are applied to the same data (see the following section). Finally, you can download the results as a .csv file by clicking on the Download Results button. Note that this will erase the current results and reset the interface.
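The output distribution is a count of how often each distinct answer occurs. If you want to reproduce it from downloaded results, the following sketch shows one way to do so. The function `output_distribution` is hypothetical, and stripping surrounding whitespace before counting is an assumption that BALM itself may not make.

```python
from collections import Counter


def output_distribution(outputs: list[str]) -> Counter:
    """Count how often each distinct model output occurs."""
    # Strip whitespace so that "Yes" and "Yes " are counted together.
    return Counter(output.strip() for output in outputs)


distribution = output_distribution(["Yes", "No", "Yes ", " Yes"])
for output, count in distribution.most_common():
    print(f"{output}: {count}")
```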

Comparing Models

OpenAI and other providers offer language models in different sizes. Using smaller models is often hundreds of times cheaper (per token) than using large models like GPT-4. To avoid overpaying, it is good practice to compare the output of different models on a data sample before processing a large batch.

BALM makes comparing models easy:

  1. Continuing with the previous example, click on the Models box and increase the number of models to two. Select a cheap model like Ada in addition to gpt-3.5-turbo.
  2. Select the same .csv file as before and check the Limit rows checkbox to limit the number of rows, e.g., to ten. Then click on Process Data again to start processing.

The result table now contains one column for each of the selected models (the column header is the model name). In that column, you find the output generated by the corresponding model.

After processing, click on Model Agreement to see aggregate statistics on output consistency between models. The section contains a table with rows and columns labeled by model names. In each cell, you find the ratio of input documents for which the two models produced exactly the same output. If this ratio is close to one, both models can be used interchangeably. In those cases, select the cheapest of all equivalent models.
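The agreement ratio described above can be sketched as follows: for each pair of models, count the inputs on which both outputs are exactly equal and divide by the number of inputs. The function `agreement_ratios` and the sample outputs are hypothetical and only illustrate the calculation. They are not taken from BALM's code or from real model runs.

```python
from itertools import combinations


def agreement_ratios(results: dict[str, list[str]]) -> dict[tuple[str, str], float]:
    """Map each pair of models to the ratio of inputs with identical output."""
    ratios = {}
    for model_a, model_b in combinations(results, 2):
        outputs_a, outputs_b = results[model_a], results[model_b]
        if not outputs_a or len(outputs_a) != len(outputs_b):
            raise ValueError("Each model needs one output per input document.")
        # Exact string comparison, as in the description above.
        matches = sum(a == b for a, b in zip(outputs_a, outputs_b))
        ratios[(model_a, model_b)] = matches / len(outputs_a)
    return ratios


# Made-up outputs for four reviews, for illustration only.
sample = {
    "gpt-3.5-turbo": ["Yes", "No", "Yes", "No"],
    "ada": ["Yes", "No", "No", "No"],
}
print(agreement_ratios(sample))  # {('gpt-3.5-turbo', 'ada'): 0.75}
```

Because the comparison is exact, outputs such as "Yes" and "Yes." count as a disagreement, so a task description that constrains the answer format tends to give more meaningful ratios.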

Configuring BALM

Adding New Models

You can add new models by changing the models.json file in the configuration folder. Each entry maps a model label (shown in the model selection drop-down menu) to a model description. This description is a dictionary with the following keys:

  • name: the model name assigned by the provider (which may differ from the label).
  • provider: currently, this has to be set to "OpenAI" (support for other providers is coming soon).
  • type: either "chat" for chat models or "default".
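As an illustration of the three keys above, the following sketch builds one entry and prints it as JSON. It assumes that models.json is a single JSON object mapping labels to descriptions, and the label "GPT-3.5 Turbo" is made up for this example. Check the existing file in the configuration folder for the exact layout before editing it. The helper `is_valid_description` is hypothetical and only encodes the constraints listed above.

```python
import json

REQUIRED_KEYS = {"name", "provider", "type"}


def is_valid_description(description: dict) -> bool:
    """Check a model description against the constraints listed above."""
    return (
        REQUIRED_KEYS.issubset(description)
        and description["provider"] == "OpenAI"
        and description["type"] in {"chat", "default"}
    )


# Hypothetical label (shown in the drop-down menu) mapped to its description.
models = {
    "GPT-3.5 Turbo": {
        "name": "gpt-3.5-turbo",  # model name assigned by the provider
        "provider": "OpenAI",
        "type": "chat",
    }
}

assert all(is_valid_description(d) for d in models.values())
print(json.dumps(models, indent=2))
```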

Increasing Limit on File Size

By default, BALM restricts the size of input files to 10 MB. You can change that number in the file .streamlit/config.toml.

How to Cite

Please cite the following paper to refer to BALM:

@article{Trummer2023balm,
author = {Trummer, Immanuel},
journal = {CoRR},
title = {{BALM: Batch Analysis with Language Models}},
year = {2023}
}
