koaning / drawdata

Draw datasets from within Jupyter.

Home Page: https://calmcode.io/labs/drawdata.html

License: MIT License

drawdata

"Just draw some data and get on with your day."

This small Python library contains Jupyter widgets that allow you to draw a dataset in a Jupyter notebook. This should be very useful when teaching machine learning algorithms.

The project uses anywidget under the hood, so our tools should work in Jupyter, VSCode and Colab. That also means that you get a proper widget that can interact with ipywidgets natively. For example, updating a drawing can trigger a new scikit-learn model to train.

You can really get creative with this in a notebook, so feel free to give it a spin!
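As a rough sketch of the "drawing triggers training" idea, the helper below builds a callback that refits a model whenever the drawing changes. The helper name `make_retrain_handler` is made up for illustration, and the trait name `"data"` passed to `observe` in the commented notebook usage is an assumption rather than something this README confirms. The callback relies on the `data_as_X_y` property described under Usage.

```python
def make_retrain_handler(widget, fit):
    """Build a callback that refits a model each time the drawing changes.

    `widget` is any object exposing `data_as_X_y`; `fit` is any callable
    that accepts (X, y), such as the `fit` method of a scikit-learn estimator.
    """
    def on_change(change):
        X, y = widget.data_as_X_y
        if len(X) == 0:
            # Nothing has been drawn yet, so there is nothing to fit.
            return None
        return fit(X, y)

    return on_change


# Possible notebook usage (assumes the widget exposes an observable
# trait named "data"; this name is an assumption):
#
# from drawdata import ScatterWidget
# from sklearn.linear_model import LogisticRegression
#
# widget = ScatterWidget()
# model = LogisticRegression()  # suits drawings that use multiple colors
# widget.observe(make_retrain_handler(widget, model.fit), names=["data"])
# widget
```

Keeping the callback free of notebook-specific imports makes it easy to try with a stand-in object before wiring it to a real widget.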

Installation

Installation occurs via pip.

python -m pip install drawdata

To read the data, polars is useful, but this library also supports pandas:

python -m pip install pandas polars

Usage

You can load the scatter widget to start drawing immediately.

from drawdata import ScatterWidget

widget = ScatterWidget()
widget

If you want to use the dataset that you've just drawn you can do so via:

# Get the drawn data as a list of dictionaries
widget.data

# Get the drawn data as a dataframe
widget.data_as_pandas
widget.data_as_polars

If you're eager to do scikit-learn stuff with your drawn data you may appreciate this property instead:

X, y = widget.data_as_X_y

This property assumes that if you've drawn with multiple colors you're interested in classification, and that if you've drawn with only one color you're interested in regression. In the regression case, y refers to the y-axis.
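To make that rule concrete, here is a minimal plain-Python sketch of it. This is not the library's implementation: the function name `split_X_y` and the record keys `"x"`, `"y"` and `"color"` are assumptions made for illustration, as is the choice to use the x-axis as the single feature in the regression case.

```python
def split_X_y(records):
    """Apply the classification-versus-regression rule to drawn points.

    Assumes each record is a dictionary with "x", "y" and "color" keys
    (hypothetical names chosen for this sketch).
    """
    colors = {record["color"] for record in records}
    if len(colors) > 1:
        # Several colors: classification. Both coordinates are features
        # and the color is the target.
        X = [[record["x"], record["y"]] for record in records]
        y = [record["color"] for record in records]
    else:
        # One color (or no points): regression. The x-axis is the only
        # feature and the y-axis is the target.
        X = [[record["x"]] for record in records]
        y = [record["y"] for record in records]
    return X, y
```

With the real widget you would simply read `widget.data_as_X_y`; the sketch only spells out the decision that the property makes for you.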

Shoutout

This project was originally part of my work over at calmcode labs but my employer probabl has been very supportive and has allowed me to work on this project during my working hours. This was super cool and I wanted to make sure I recognise them for it.





Old Features

The original implementation of our widget used an iframe to load a site in order to be able to draw from a Jupyter notebook. This works, but it requires more manual effort, only works with pandas via the clipboard feature, and needs an internet connection.

It will be kept around, but the way forward for this library is to build on top of anywidget.

Old Feature Usage

When you run this from Jupyter, the drawing site should load in an iframe.

from drawdata import draw_scatter

draw_scatter()

Once you're done drawing you can copy the data to the clipboard. After this you can use pandas to read the clipboard to get your drawn data into a dataframe.

import pandas as pd

# Read the comma-separated data that was copied to the clipboard
pd.read_clipboard(sep=",")

Contributors

archydeberker, koaning, thewchan

drawdata's Issues

Configure the number of bins in the histogram

I understand that this library only returns iframes from the website, but it would be awesome if the plots could be configured, for instance to control the number of bins in the histogram.