sungsujaing / letter_digit_generator_vae

generate arbitrary handwritten letter/digits based on the inputs

License: MIT License

Languages: Jupyter Notebook 99.74%, Python 0.26%
Topics: vae, beta-vae, tensorflow-keras, conditional-vae, emnist-dataset, convolutional-cvae

letter_digit_generator_VAE

This project builds a conditional variational autoencoder (CVAE) that generates arbitrary handwritten letters/digits from keyboard input. The CVAE model is trained on the EMNIST dataset to encode handwritten letters/digits into a latent vector space. New, imaginary letters and digits are then generated by random sampling or interpolation in that latent space.
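As a rough illustration of the random sampling step, a VAE usually draws a latent vector with the reparameterization trick, `z = mu + sigma * eps`. The sketch below is written in plain Python so it runs without TensorFlow; the function name `sample_latent` and the use of a log-variance output are assumptions, not taken from the notebooks.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension.

    Hypothetical helper: assumes the encoder outputs a mean and a log-variance.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


# Sample from the prior N(0, I) with a 6-dimensional latent space
# (6 is the latent dimension listed for LDG Version 3).
rng = random.Random(0)
z = sample_latent([0.0] * 6, [0.0] * 6, rng)
print(len(z))
```

With `mu = 0` and `log_var = 0` this reduces to sampling from the standard normal prior, which is what generation without an input image would use.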

EMNIST data examples

LDG Version 3

  • Loss: binary crossentropy
  • Optimizer: Adam
  • Latent dimension: 6
  • Image normalization: [0, 1]
  • Last activation function of the decoder: sigmoid
  • Convolutional CVAE layers: [784,62]-[784]-[(28,28,1)]-[(14,14,16)]-[(7,7,32)]-[1568]-[64]-[6] // [6,62]-[64]-[1568]-[(7,7,32)]-[(14,14,32)]-[(28,28,16)]-[(28,28,1)]-[784]
  • Multi-layer CVAE layers: [784,62]-[256]-[128]-[6] // [6,62]-[128]-[256]-[784]
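The settings above imply a training objective of binary crossentropy on the reconstruction plus the usual KL term of a VAE. The following is a minimal plain-Python sketch of that objective, not the notebook's code: summing over pixels (rather than averaging) and the `beta` weight are assumptions, the latter suggested only by the repository's beta-vae topic tag.

```python
import math


def bce(x, x_hat, eps=1e-7):
    """Binary crossentropy summed over pixels; x and x_hat are flat lists in [0, 1]."""
    total = 0.0
    for target, pred in zip(x, x_hat):
        pred = min(max(pred, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred)
    return total


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)) summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def cvae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction term plus beta-weighted KL term (beta is an assumed knob)."""
    return bce(x, x_hat) + beta * kl_divergence(mu, log_var)
```

The sigmoid output and the [0, 1] image normalization listed above are what make binary crossentropy a valid reconstruction term here, since both targets and predictions stay in [0, 1].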

A command-line letters/digits generator based on the ldg_v3 Conv-CVAE model (details below). It simply loads the Conv-CVAE model and the corresponding best weights to produce results.

  • label inputs to both encoder and decoder
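One way to picture the label conditioning is as a 62-dimensional one-hot vector attached to both the encoder input (784 pixels) and the decoder input (the latent vector). The sketch below makes two assumptions that the notebooks may not share: the class order follows the EMNIST ByClass convention (digits, uppercase, lowercase), and the label is merged by simple concatenation. All function names are hypothetical.

```python
import string

# Assumed class order (EMNIST ByClass convention): 0-9, A-Z, a-z -> 62 classes.
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase


def one_hot(char):
    """Return a 62-dim one-hot vector; raises ValueError for unsupported characters."""
    vec = [0.0] * len(CLASSES)
    vec[CLASSES.index(char)] = 1.0
    return vec


def encoder_input(pixels, char):
    """Flattened 28x28 image (784 values) followed by its label vector."""
    return list(pixels) + one_hot(char)


def decoder_input(z, char):
    """Latent vector followed by the label of the character to generate."""
    return list(z) + one_hot(char)


print(len(encoder_input([0.0] * 784, "a")), len(decoder_input([0.0] * 6, "7")))
```

Because the decoder also sees the label, a keyboard character can be turned into an image by pairing its one-hot vector with any sampled latent vector.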

Training

Dataset reconstruction

Generating new letters/digits (with/without arbitrary binary threshold filter)
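The binary threshold filter mentioned above can be sketched as a per-pixel cutoff applied to the decoder output. This is an illustration only: the function name `binarize` and the default cutoff of 0.5 are assumptions, since the README describes the threshold as arbitrary.

```python
def binarize(image, threshold=0.5):
    """Map each pixel of a 2-D image in [0, 1] to 1.0 (ink) or 0.0 (background)."""
    return [[1.0 if pixel >= threshold else 0.0 for pixel in row] for row in image]


# Tiny 2x3 stand-in for a decoded 28x28 image.
decoded = [[0.1, 0.6, 0.9],
           [0.5, 0.4, 0.0]]
print(binarize(decoded))
```

Raising the threshold thins the strokes and removes faint blur around them; lowering it keeps more of the soft edges that VAE decoders tend to produce.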

LDG Version 2

  • Loss: MSE
  • Optimizer: Adam
  • Latent dimension: 10
  • Image normalization: [-1, 1]
  • Last activation function of the decoder: tanh
  • Convolutional CVAE layers: [784,62]-[784]-[(28,28,1)]-[(28,28,16)]-[(28,28,32)]-[(28,28,64)]-[12544]-[128]-[10] // [10,62]-[128]-[12544]-[(14,14,64)]-[(28,28,32)]-[(28,28,16)]-[(28,28,1)]-[784]
  • Multi-layer CVAE layers: [784,62]-[512]-[256]-[10] // [10,62]-[256]-[512]-[784]
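The [-1, 1] normalization pairs with the tanh output, just as [0, 1] pairs with sigmoid in Version 3. A small sketch of both scalings for 8-bit grayscale pixels follows; the helper names are hypothetical and the notebooks may scale the data differently.

```python
def to_unit_range(pixel):
    """Scale an 8-bit pixel (0..255) to [0, 1], matching a sigmoid output."""
    return pixel / 255.0


def to_signed_range(pixel):
    """Scale an 8-bit pixel (0..255) to [-1, 1], matching a tanh output."""
    return pixel / 127.5 - 1.0


def from_signed_range(value):
    """Invert to_signed_range so a decoded value can be displayed as a pixel."""
    return (value + 1.0) * 127.5


print(to_signed_range(0), to_signed_range(255), to_unit_range(255))
```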

A command-line letters/digits generator based on the ldg_v2 Conv-CVAE model (details below). It simply loads the Conv-CVAE model and the corresponding best weights to produce results.

  • label inputs to both encoder and decoder

Training (direct comparison is difficult due to the difference in epochs)

Dataset reconstruction

Generating new letters/digits (with/without arbitrary binary threshold filter)

LDG Version 1

Initial convolutional conditional variational autoencoder model.

  • label inputs only to decoder
  • training/test data reconstructions were satisfactory, but generating the characters of a specific input string was difficult.

VAE interpolation from image 1 to image 2
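Interpolation between two images is typically done in latent space: encode both images, walk along the line between their latent vectors, and decode each intermediate point. The sketch below covers only the latent-space walk, in plain Python; linear interpolation and the name `interpolate` are assumptions, and the encoder and decoder calls are left out.

```python
def interpolate(z1, z2, steps):
    """Return `steps` latent vectors evenly spaced on the line from z1 to z2."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at z1, 1.0 at z2
        path.append([(1.0 - t) * a + t * b for a, b in zip(z1, z2)])
    return path


# Two toy 2-dimensional latent vectors standing in for encoded images.
path = interpolate([0.0, 0.0], [1.0, 2.0], steps=5)
print(path[2])
```

Decoding every vector in `path` would give a sequence of images that morphs from image 1 to image 2.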

While the model architecture seems to be okay, the Stanford Dogs dataset may not be suitable for training a VAE.
