dfencoder - AutoEncoders for DataFrames

Want to learn useful nonlinear representations of your tabular data? Don't have time to mess with autoencoders? This library aims to simplify your life.

Currently under development.

Installation

We highly recommend installing into a virtual environment. This software has only been tested with Python 3.6.

The bare-bones requirements are installed automatically by pip. You may also want to install jupyter and matplotlib to run the notebooks and the ipynb logger, but both are optional.

Install using:

pip install dfencoder



Or get the latest version by cloning this repository and installing from its root directory:

pip install .



Usage

Thorough documentation is still being written, but the demo notebook shows some of this library's features.

Running the tests

The adult.csv dataset is used in the testing script. Make sure the file (found in the root of this repo) is in the same directory as test.py when you run the script.

Contributing

Contributors are welcome! Please reach out with PRs.

Feature Requests and Bugs

We'd like to release a stable version soon, so in the meantime please submit feature requests and bug reports on this repository's issues page.

Thanks for your interest in this project!

Dataframe Encoding

dfencoder manipulates the input features to encode them before feeding them into the feed-forward MLP. The high-level design (HLD) diagram shows how this works.

[Figure: HLD for how inputs are encoded by dfencoder]

Features

This library is a personal project, so progress is slow. The latest release as of this writing is v0.0.37, which introduces an "inference mode" that optimizes inference for single records supplied as JSON inputs.

Previous Releases:

v0.0.36 introduces handling for timestamp data. It uses cyclical encoding for time of day, day of week, day of month, and day of year, and also scales the raw timestamp as a numeric feature to capture linear time.
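
For readers unfamiliar with cyclical encoding, here is a minimal sketch of the general technique using only the Python standard library. It is not dfencoder's actual implementation: the function names, the chosen periods (31 days per month, 366 days per year), and the output layout are all assumptions made for illustration. Scaling of the raw timestamp is not shown.

```python
import math
from datetime import datetime


def cyclical(value, period):
    """Map a periodic value onto the unit circle as a (sin, cos) pair.

    Values near the end of a period land close to values near its start,
    which a plain numeric encoding would not capture.
    """
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encode_timestamp(ts):
    """Return illustrative cyclical features for one datetime (hypothetical helper)."""
    seconds_into_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    return {
        "time_of_day": cyclical(seconds_into_day, 24 * 3600),
        "day_of_week": cyclical(ts.weekday(), 7),
        # Periods of 31 and 366 are simplifying assumptions.
        "day_of_month": cyclical(ts.day - 1, 31),
        "day_of_year": cyclical(ts.timetuple().tm_yday - 1, 366),
    }


features = encode_timestamp(datetime(2020, 1, 1, 6, 0, 0))
for name, (sin_part, cos_part) in features.items():
    print(f"{name}: sin={sin_part:.3f}, cos={cos_part:.3f}")
```

Each periodic quantity becomes two numeric columns, so 23:59 and 00:00 end up close together in feature space instead of at opposite ends of a scale.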

Pre-process your timestamp columns with pandas' pd.to_datetime() so that dfencoder can infer the datatype and handle it accordingly.
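
As a rough illustration of that pre-processing step, the sketch below converts a string column to a datetime column with pandas. The DataFrame, its column names, and its values are hypothetical; only pd.to_datetime() comes from the text above.

```python
import pandas as pd

# Hypothetical data: timestamps often arrive as plain strings.
df = pd.DataFrame({
    "event_time": ["2020-01-01 06:00:00", "2020-01-02 18:30:00"],
    "amount": [10.5, 3.2],
})

# Convert the column so its dtype is a datetime type rather than text.
df["event_time"] = pd.to_datetime(df["event_time"])

# Confirm the conversion before handing the frame to a model.
print(pd.api.types.is_datetime64_any_dtype(df["event_time"]))
```

Checking the dtype after conversion is a cheap way to catch columns that silently stayed as strings.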

Contributors

alliedtoasters, dagardner-nv, efajardo-nv, gbatmaz, gputester, hsin-c, mdemoret-nv, shawn-davis
