
This project is a fork of datalogue/keras-attention.


License: GNU Affero General Public License v3.0


Attention RNNs in Keras

Implementation and visualization of a custom RNN layer with attention in Keras for translating dates.

This repository comes with a tutorial found here: https://medium.com/datalogue/attention-in-keras-1892773a4f22
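
The layer implements Bahdanau-style additive attention (see the reference at the end of this README). As a rough numpy sketch of the core computation (variable names here are illustrative, not the repo's actual code):

import numpy as np

def additive_attention(h_enc, s_prev, W_a, U_a, v_a):
    # h_enc:  encoder hidden states, shape (T_in, n_enc)
    # s_prev: previous decoder state, shape (n_dec,)
    # W_a (n_dec, k), U_a (n_enc, k), v_a (k,): learned weights
    e = np.tanh(s_prev @ W_a + h_enc @ U_a) @ v_a   # one alignment score per input step
    alpha = np.exp(e - e.max())
    alpha /= alpha.sum()                            # softmax -> attention weights
    context = alpha @ h_enc                         # weighted sum of encoder states
    return context, alpha

At each decoding step the decoder consumes the context vector, and the weights alpha are what the visualization described below plots.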

Setting up the repository

  1. Make sure you have Python 3.4+ installed.

  2. Clone this repository to your local system:

git clone https://github.com/datalogue/keras-attention.git

  3. Install the requirements (you can skip this step if you already have them all installed)

We recommend using a GPU; otherwise training might be prohibitively slow:

pip install -r requirements-gpu.txt

If you do not have a GPU or want to prototype on your local machine:

pip install -r requirements.txt

Creating the dataset

cd into data and run

python generate.py

This will create 4 files (a quick way to inspect them is sketched after this list):

  1. training.csv - data to train the model
  2. validation.csv - data to evaluate the model and compare performance
  3. human_vocab.json - vocabulary for the human dates
  4. machine_vocab.json - vocabulary for the machine dates
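
A minimal sketch for peeking at the generated files, assuming each CSV row is a (human date, machine date) pair and the vocabularies are JSON mappings from characters to integer ids; if in doubt, check generate.py for the exact format:

import csv
import json

# Print the first few training pairs (expected: human-readable date, machine/ISO date).
with open('training.csv') as f:
    for i, row in enumerate(csv.reader(f)):
        print(row)
        if i == 2:
            break

# Inspect the vocabulary sizes.
with open('human_vocab.json') as f:
    human_vocab = json.load(f)
with open('machine_vocab.json') as f:
    machine_vocab = json.load(f)
print(len(human_vocab), 'human tokens;', len(machine_vocab), 'machine tokens')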

Running the model

We highly recommend having a machine with a GPU to run this software; otherwise, training might be prohibitively slow. To see what arguments are accepted, you can run python run.py -h from the main directory:

usage: run.py [-h] [-e |] [-g |] [-p |] [-t |] [-v |] [-b |]

optional arguments:
  -h, --help            show this help message and exit

named arguments:
  -e |, --epochs |      Number of Epochs to Run
  -g |, --gpu |         GPU to use
  -p |, --padding |     Amount of padding to use
  -t |, --training-data |
                        Location of training data
  -v |, --validation-data |
                        Location of validation data
  -b |, --batch-size |  Batch size to use during training

All parameters have default values, so if you want to just run it, you can type python run.py. You can always stop running the model early using Ctrl+C.
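
For example, a run with a few of the flags set explicitly (the values are illustrative, not recommendations):

python run.py -e 50 -g 0 -b 64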

Visualizing Attention

You can use the script visualize.py to visualize the attention map. Sample weights and vocabularies are provided in data/ and weights/ so that the script can be run out of the box on an example. Run it with the -h argument to see what arguments are accepted:

usage: visualize.py [-h] -e | [-w |] [-p |] [-hv |] [-mv |]

optional arguments:
  -h, --help            show this help message and exit

named arguments:
  -e |, --examples |    Example string/file to visualize the attention map
                        for. If a file, it must end with '.txt'
  -w |, --weights |     Location of weights
  -p |, --padding |     Length of padding
  -hv |, --human-vocab |
                        Path to the human vocabulary
  -mv |, --machine-vocab |
                        Path to the machine vocabulary

The default padding parameter is the same in run.py and visualize.py; if you change it in one, make sure to use the same value in the other. You must supply the path to the weights you want to use and an example string (or file of examples). An example file is provided in examples.txt.
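
For example, to visualize the provided example file (the weights path below is a placeholder; point it at the sample weights in weights/ or at a checkpoint you trained):

python visualize.py -e examples.txt -w weights/<your-weights-file>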

Example visualizations

Here are some example visuals you can obtain:

[attention map: input date containing “Saturday”]

The model has learned that “Saturday” has no predictive value!

[attention map: input “January 2016 5”]

We can see that the weirdly formatted date “January 2016 5” is incorrectly translated as 2016-01-02, where the “02” comes from the “20” in “2016”.
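
If you want to produce a similar plot from your own attention matrix, a minimal matplotlib sketch (assuming one row per output character and one column per input character) is:

import numpy as np
import matplotlib.pyplot as plt

# Dummy attention matrix; in practice this comes from the model's attention layer.
attention = np.random.rand(10, 20)
attention /= attention.sum(axis=1, keepdims=True)  # each output step's weights sum to 1

plt.imshow(attention, cmap='gray')
plt.xlabel('input characters')
plt.ylabel('output characters')
plt.colorbar()
plt.show()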

Help

Start an issue if you find a bug or would like to contribute!

For other matters, you can contact @zafarali at [email protected] or reach us directly at [email protected]

Acknowledgements

As with all open source code, we could not have built this without other code out there. Special thanks to:

  1. rasmusbergpalm/normalization - for some of the data generation code.
  2. joke2k/faker - for their fake data generator.

References

Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).

