hongtangshui / nemo-skills-openmathinstruct-1

This project forked from kipok/nemo-skills

A pipeline to improve skills of large language models

License: Apache License 2.0

Shell 0.43% JavaScript 0.36% Python 98.80% CSS 0.42%

NeMo Skills

In this repository we provide a pipeline to improve "skills" of large language models (LLMs). Currently we focus on the ability to solve simple mathematical problems, but more skills are coming (such as coding and table understanding).

Our pipeline consists of 3 steps and can be directly applied to any LLM that is supported in NVIDIA's NeMo Toolkit.

  1. Setup
    • Pick a "student" model that you want to improve. E.g. Mistral-7B.
    • [optionally] Pick a "teacher" model (can also use the student model itself). E.g. Mixtral-8x7B.
    • Choose evaluation benchmarks and training datasets. E.g. GSM8K and MATH.
  2. Generate synthetic data
    • Write a couple of examples of solutions that you want the student LLM to learn. E.g. teach it to use code to solve math problems.
    • Run a large-scale generation of diverse solutions on the training datasets showing your examples in the prompt to the teacher model.
    • Filter the generated solutions based on correctness and quality.
  3. Finetune the student model on the generated dataset
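The correctness filter in step 2 can be sketched as follows. This is a minimal illustration, not the repository's implementation: the `GeneratedSolution` record, the `normalize_answer` helper, and the `filter_correct` function are hypothetical names, and the answer normalization is deliberately crude. It assumes each training problem has a known reference answer and that a final answer has already been extracted from every generated solution.

```python
from dataclasses import dataclass


@dataclass
class GeneratedSolution:
    # Hypothetical record for one solution sampled from the teacher model.
    problem_id: str
    solution_text: str
    predicted_answer: str


def normalize_answer(answer: str) -> str:
    """Crude normalization: trim whitespace, a trailing period, and commas."""
    return answer.strip().rstrip(".").replace(",", "")


def filter_correct(solutions, reference_answers):
    """Keep solutions whose answer matches the reference, dropping exact duplicates."""
    kept = []
    seen = set()
    for solution in solutions:
        reference = reference_answers.get(solution.problem_id)
        if reference is None:
            continue  # no ground truth available, so correctness cannot be checked
        if normalize_answer(solution.predicted_answer) != normalize_answer(reference):
            continue  # wrong final answer
        key = (solution.problem_id, solution.solution_text)
        if key in seen:
            continue  # identical solution text already kept for this problem
        seen.add(key)
        kept.append(solution)
    return kept


if __name__ == "__main__":
    references = {"p1": "1,000"}
    candidates = [
        GeneratedSolution("p1", "10 * 100 = 1000", "1000"),
        GeneratedSolution("p1", "10 * 99 = 990", "990"),
    ]
    for solution in filter_correct(candidates, references):
        print(solution.problem_id, solution.solution_text)
```

A real quality filter would likely add further checks (for example on solution length or on whether the code in a solution executes), which are left out here.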

We release a series of OpenMath models improved with this pipeline. They are among the best open models for solving mathematical problems and are currently the only state-of-the-art open models that do not rely on OpenAI for data generation!

| model | GSM8K (greedy) | MATH (greedy) | GSM8K (majority@50) | MATH (majority@50) |
|---|---|---|---|---|
| GPT-4 [1] | 94.4 | 56.2 | - | - |
| GPT-4 + code [2] | 92.9 | 69.7 | - | - |
| OpenMath-CodeLlama-7B (nemo, HF) | 75.9 | 43.6 | 84.8 | 55.6 |
| OpenMath-Mistral-7B (nemo, HF) | 80.2 | 44.5 | 86.9 | 57.2 |
| OpenMath-CodeLlama-13B (nemo, HF) | 78.8 | 45.5 | 86.8 | 57.6 |
| OpenMath-CodeLlama-34B (nemo, HF) | 80.7 | 48.3 | 88.0 | 60.2 |
| OpenMath-Llama2-70B (nemo, HF) | 84.7 | 46.3 | 90.1 | 58.3 |
| OpenMath-CodeLlama-70B (nemo, HF) | 84.6 | 50.7 | 90.8 | 60.4 |
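For readers unfamiliar with the two decoding settings in the table: greedy scores a single deterministic solution per problem, while majority@50 is commonly computed by sampling 50 solutions per problem and scoring the most frequent final answer. The sketch below shows that voting step under those assumptions. The function names are hypothetical, and the tie-breaking rule (first answer seen wins) is a choice made here, not something the source specifies.

```python
from collections import Counter


def majority_answer(sampled_answers):
    """Return the most common non-empty answer, or None if no answer was produced."""
    valid = [a for a in sampled_answers if a is not None and a != ""]
    if not valid:
        return None
    # most_common keeps first-seen order among ties (Python 3.7+).
    return Counter(valid).most_common(1)[0][0]


def majority_accuracy(samples_per_problem, reference_answers):
    """Percentage of problems whose majority-voted answer equals the reference."""
    correct = sum(
        majority_answer(samples) == reference
        for samples, reference in zip(samples_per_problem, reference_answers)
    )
    return 100.0 * correct / len(reference_answers)


if __name__ == "__main__":
    # Three samples per problem here for brevity; majority@50 would use 50.
    samples = [["4", "4", "5"], ["1", "2", "2"]]
    references = ["4", "1"]
    print(majority_accuracy(samples, references))
```

Voting helps when wrong answers are scattered across many values while the correct one recurs, which is consistent with the majority@50 columns being higher than the greedy ones for every OpenMath model listed.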

We also release OpenMathInstruct-1, a math instruction tuning dataset with 1.8M problem-solution pairs generated using the permissively licensed Mixtral-8x7B model.

Please see our paper "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" for more details!

Getting started

Run inference with our models in just a few commands!

We provide all instructions to fully reproduce our results.

If you want to improve your own models or to learn more about our pipeline, read through the relevant docs below.

We also provide a convenient tool for visualizing inference and data analysis.

[Demos: overview of the tool, the inference page, and the analyze page]

Supported models and datasets

Any model that is supported by NeMo can be used as a "student". Many popular models are supported, e.g. LLaMA2, CodeLLaMA, Mistral-7B and Mixtral-8x7B. For the "teacher" you can use virtually any openly available LLM, since only inference support is needed.

We currently support the following datasets.

Evaluation:

Training:

Please check out the evaluation and finetuning sections to learn more!

Paper and Citation

If you find our work useful, please consider citing us!

@article{toshniwal2024openmath,
  title   = {OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset},
  author  = {Shubham Toshniwal and Ivan Moshkov and Sean Narenthiran and Daria Gitman and Fei Jia and Igor Gitman},
  year    = {2024},
  journal = {arXiv preprint arXiv:2402.10176}
}

Disclaimer: This project is strictly for research purposes, and not an official product from NVIDIA.
