jcarbonnell / nearcoder

This project forked from annavalentinahirsch/web3codellm

Fine-tuning StarCoder2, a recent state-of-the-art open-source code LLM, for the Near Protocol blockchain.

License: Apache License 2.0

Jupyter Notebook 100.00%

nearcoder's Introduction

NEARCoder - Web3 Code LLM

[Figure 1: NearCoder draft training protocol]

NEARCoder - Web3 Code LLM aims to help blockchain developers with their coding challenges. While most web2 technologies are well covered by tutorials, code examples on forums, and online courses, web3 is still a recent technology with a scattered technological landscape, as competing ecosystems release their own solutions and try to grow a user base.

The fast-changing set of programming languages and the nascent stage of the web3 industry make it hard to hire developers willing to start coding from scratch. We believe that a powerful coding assistant would be a significant help in that context.

NEARCoder - Web3 Code LLM is a StarCoder2-3b model fine-tuned on the Near Protocol documentation, dApp structures, and full dApp repositories collected from public GitHub repositories.

NEARCoder - Web3 Code LLM started as a course project at opencampus.sh.

Our fine-tuning protocol includes three steps (see figure 1):

    1. Continued Pre-Training
    2. Structure-Aware Fine-Tuning
    3. Specialized Fine-Tuning
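
The three steps above can be read as a chain in which each stage resumes from the weights produced by the previous one. The sketch below only illustrates that ordering and is not the project's training code. It assumes the stages run sequentially, that the base checkpoint is the Hugging Face Hub ID `bigcode/starcoder2-3b`, and it uses hypothetical output names (`nearcoder-stage1`, and so on) that do not come from the repository.

```python
from dataclasses import dataclass

# Assumption: the Hugging Face Hub ID of the StarCoder2-3b base model.
BASE_MODEL = "bigcode/starcoder2-3b"

# Stage names come from the list above; everything else is illustrative.
STAGE_NAMES = [
    "Continued Pre-Training",
    "Structure-Aware Fine-Tuning",
    "Specialized Fine-Tuning",
]


@dataclass(frozen=True)
class Stage:
    name: str
    init_checkpoint: str    # weights the stage starts from
    output_checkpoint: str  # hypothetical name for the weights it produces


def plan_stages(base_model: str = BASE_MODEL) -> list[Stage]:
    """Chain the stages so each one resumes from the previous stage's output."""
    stages = []
    current = base_model
    for index, name in enumerate(STAGE_NAMES, start=1):
        output = f"nearcoder-stage{index}"  # hypothetical checkpoint name
        stages.append(Stage(name, current, output))
        current = output
    return stages


if __name__ == "__main__":
    for stage in plan_stages():
        print(f"{stage.name}: {stage.init_checkpoint} -> {stage.output_checkpoint}")
```

Keeping the plan as plain data makes it easy to swap in the real checkpoint names and datasets once they are known, without touching the chaining logic.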

The datasets and the models are open-sourced on Hugging Face.

A presentation of NEARCoder was given at opencampus.sh on June 17th. The slide deck is available here.

Contributors

jcarbonnell, annavalentinahirsch
