Transformer-based Calculator

Simple exploration of basic transformer architectures.

Tutorial

Step-by-step Jupyter notebook tutorial, following Karpathy's YouTube series: https://www.youtube.com/watch?v=kCc8FmEb1nY

Calculator

gen_calculator_dataset.py: generates a plain-text (.txt) dataset of arithmetic problems using the operators +, -, *, and //, with integer operands drawn from a specified min and max.
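
As a rough illustration of what such a generator might look like, here is a minimal sketch. It is not the code in gen_calculator_dataset.py: the function name `generate_problems`, its parameters, and the `a op b = result` line format (inferred from the trained sample in this README) are assumptions, as is the choice to skip problems that would divide by zero.

```python
import operator
import random

# Hypothetical operator table; the real script may structure this differently.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "//": operator.floordiv,
}


def generate_problems(count, lo, hi, seed=None):
    """Return `count` lines of the form 'a op b = result' with lo <= a, b <= hi."""
    rng = random.Random(seed)  # local RNG so a seed gives reproducible output
    symbols = list(OPS)
    lines = []
    while len(lines) < count:
        a = rng.randint(lo, hi)
        b = rng.randint(lo, hi)
        op = rng.choice(symbols)
        if op == "//" and b == 0:
            continue  # assumption: problems that divide by zero are skipped
        lines.append(f"{a} {op} {b} = {OPS[op](a, b)}")
    return lines


if __name__ == "__main__":
    # Print a few problems; writing them to a .txt file is left out of the sketch.
    for line in generate_problems(3, 0, 99, seed=0):
        print(line)
```

Passing a seed makes the output reproducible, which is handy when comparing training runs on the same data.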

transformer.py: GPT-2-like architecture, refactored out of the Jupyter notebook tutorial into a standalone file.
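
The Karpathy tutorial tokenizes text one character at a time. Assuming this repo does the same (not confirmed here), the dataset's alphabet would be the ten digits plus `+`, `-`, `*`, `/`, `=`, space, and newline, which is consistent with vocab_size=17 in the training config shown in this README. A minimal sketch of that encoding, with hypothetical helper names `build_vocab`, `encode`, and `decode`:

```python
# Hypothetical helpers; the names and layout are illustrative, not taken from the repo.

def build_vocab(text):
    """Map each distinct character in `text` to an integer id and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(s, stoi):
    """Turn a string into a list of integer token ids."""
    return [stoi[ch] for ch in s]


def decode(ids, itos):
    """Turn a list of integer token ids back into a string."""
    return "".join(itos[i] for i in ids)


# Assumed alphabet: digits, the operator characters, '=', space, and newline.
ALPHABET = "0123456789+-*/= \n"
stoi, itos = build_vocab(ALPHABET)

if __name__ == "__main__":
    ids = encode("57 // 74 = 0\n", stoi)
    print(len(stoi), ids)
    print(repr(decode(ids, itos)))
```

With a vocabulary this small, the embedding table has only 17 rows, so nearly all of the model's parameters sit in the attention and feed-forward layers.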

train.py: trains the model on a given dataset. Example output:

Training config:  Config(batch_size=64, num_iterations=5000, lr=0.0003, eval_interval=100, block_size=12, vocab_size=17, n_layer=6, n_head=6, n_embed=384, dropout=0.1)
Dataset location:  math.txt
Training set size:  1209022
Validation set size:  134336

Fresh model sample:
 +92010518/57-8+35++
65
4*=4/+=** 3/6+ 7++87-41-6+=5

Training model...
Estimated losses: 1.27914 train, 1.28612 val: 100%|██████████| 5000/5000 [31:35<00:00,  2.64it/s]

Trained sample:
  + 3 = 52
57 // 74 = 0
68 * 26 = 1126
22 * 24 = 54
48 // 28 = 1
39 // 82 = 0
12 - 26 = -2
48 // 59 = 1
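
The trained sample is well-formed, but several lines are arithmetically wrong (68 * 26 is 1768, not 1126, for example). One way to quantify this is to parse each sampled line and recompute the answer. The sketch below is not part of the repo: `check_line` and `score` are hypothetical helpers, and the `a op b = result` line format is assumed from the sample above.

```python
import operator
import re

# Assumed line format: "<int> <op> <int> = <int>", with op one of + - * //
LINE_RE = re.compile(r"^(-?\d+) (\+|-|\*|//) (-?\d+) = (-?\d+)$")
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "//": operator.floordiv}


def check_line(line):
    """Return True/False if the line parses and is right/wrong, None if malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    a, op, b, claimed = int(match[1]), match[2], int(match[3]), int(match[4])
    if op == "//" and b == 0:
        return None  # treat division by zero as malformed
    return OPS[op](a, b) == claimed


def score(lines):
    """Count (correct, wrong, malformed) lines."""
    results = [check_line(line) for line in lines]
    return (
        sum(r is True for r in results),
        sum(r is False for r in results),
        sum(r is None for r in results),
    )


if __name__ == "__main__":
    sample = [
        "  + 3 = 52", "57 // 74 = 0", "68 * 26 = 1126", "22 * 24 = 54",
        "48 // 28 = 1", "39 // 82 = 0", "12 - 26 = -2", "48 // 59 = 1",
    ]
    print(score(sample))  # (3, 4, 1)
```

On the sample above, 3 lines are correct, 4 are wrong, and the truncated first line is malformed. All three correct lines are floor divisions.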
