brainscopypaste's People

Contributors

wehlutyk

brainscopypaste's Issues

Look at the distribution of features in successive timebags 1, 2, 3...

Do this for the clusters that last at least N timebags (or for all clusters, if #25 shows that the duration of a cluster does not depend on its linguistic starting point).

If there is an evolution, the distributions should gradually become more sharply peaked (e.g. more words learned earlier, fewer words learned later).
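One way to start looking at this is sketched below, using only the standard library. It assumes the data can be flattened into hypothetical `(timebag_index, feature_value)` pairs, and it uses a shrinking standard deviation as a rough proxy for a distribution becoming more peaked. Both the input shape and that proxy are assumptions for illustration, not something the project defines.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_timebag(observations):
    """Group feature values by timebag index and summarize each group.

    `observations` is an iterable of (timebag_index, feature_value) pairs
    (a hypothetical input shape). Returns a dict mapping each timebag index
    to (mean, population standard deviation), ordered by timebag index.
    """
    grouped = defaultdict(list)
    for timebag, value in observations:
        grouped[timebag].append(value)
    return {t: (mean(vs), pstdev(vs)) for t, vs in sorted(grouped.items())}


# Made-up values, for illustration only.
observations = [(1, 4.0), (1, 8.0), (2, 5.0), (2, 7.0), (3, 6.0), (3, 6.0)]
summary = summarize_by_timebag(observations)
for timebag, (m, s) in summary.items():
    print(f"timebag {timebag}: mean={m:.2f} std={s:.2f}")
```

A spread that decreases from one timebag to the next would be consistent with the narrowing described above; a full comparison of the distributions (histograms, for instance) would say more than these two numbers.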

Sphinx toctree warning

The warning is:

WARNING: toctree contains reference to nonexisting document u'reference/analyze.args.GroupAnalysisArgs.title'

Also decide on the options for automodule (etc.) to get all members of all classes, but not inherited members.
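A possible starting point for those options is sketched below as a `conf.py` fragment. It assumes Sphinx 1.8 or later, where `autodoc_default_options` is available; older versions use the `autodoc_default_flags` list instead. Whether undocumented members should be included is a guess at what "all members" means here.

```python
# conf.py (fragment), a sketch assuming Sphinx >= 1.8
extensions = ["sphinx.ext.autodoc"]

autodoc_default_options = {
    # Document every member of each module and class...
    "members": True,
    # ...including members without docstrings (an assumption about intent).
    "undoc-members": True,
    # "inherited-members" is deliberately left unset: autodoc only includes
    # inherited members when that option is given.
}
```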

Great refactoring

I have discovered so many new tools since this started: MongoDB, SQLAlchemy, joblib, logging, pandas, seaborn, to name a few. So there's a great refactoring in sight. Basically, there will be three levels:

Database

Models in SQLAlchemy. This will also solve #33. Basically it eases everything.

Flow

Mining operations are to be generalized. You can mine for substitutions, for evolution of timebags, for other things. There is a single command with one subcommand per operation: brains mine {substitutions | evolution | ...}. Each operation has prerequisites that go through joblib and are run beforehand if needed (with confirmation), like a kind of parallelized make. This includes #32.

Creating a new mining operation must be straightforward, because that's the way to go when a new question appears. Maybe also allow for quick immediate tweaking of an analysis by adding variation-ids to analyses?
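The shape of such a command could be sketched as follows, using only `argparse` from the standard library. The registry, the `operation` decorator, the `load` prerequisite, and the function bodies are all hypothetical names made up for illustration. The sketch leaves out joblib caching, confirmation prompts, parallelism, and cycle detection; it only shows make-style prerequisite ordering and how a new operation would be registered.

```python
import argparse

# Registry of mining operations: name -> (function, prerequisite names).
OPERATIONS = {}


def operation(name, requires=()):
    """Register a function as a mining operation with optional prerequisites."""
    def decorator(func):
        OPERATIONS[name] = (func, list(requires))
        return func
    return decorator


@operation("load")
def load():
    return "data loaded"  # placeholder body


@operation("substitutions", requires=["load"])
def substitutions():
    return "substitutions mined"  # placeholder body


@operation("evolution", requires=["load"])
def evolution():
    return "evolution mined"  # placeholder body


def run(name, done=None):
    """Run the prerequisites of `name` depth-first, each at most once, then `name`.

    Returns the list of operation names in the order they were run.
    """
    done = [] if done is None else done
    if name in done:
        return done
    func, requires = OPERATIONS[name]
    for dependency in requires:
        run(dependency, done)
    func()
    done.append(name)
    return done


def build_parser():
    parser = argparse.ArgumentParser(prog="brains")
    commands = parser.add_subparsers(dest="command", required=True)
    mine = commands.add_parser("mine", help="run a mining operation")
    mine.add_argument("operation", choices=sorted(OPERATIONS))
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return run(args.operation)
```

With this layout, `main(["mine", "substitutions"])` runs `load` and then `substitutions`, and adding a new mining operation amounts to writing one decorated function.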

Viz

In notebooks, and nowhere else. Arrange them into a story and order them too. Graphs read mined data, stating their prerequisites at the beginning.

IPythonize

Graphing/visualizing (not mining) can be put in ipython with nice views and interfaces.

All the figures from the paper should be included in this.

Update the sphinx doc with the notebooks.

Harmonise ids

In Clusters they're ints; in QtStrings in substitutions they seem to be floats; and in Quotes in Clusters they're strings!
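One possible way to harmonise them is to convert every id to a plain int at the point where it enters the code. The helper below is a minimal sketch of that idea; `normalize_id` is a hypothetical name, and choosing int as the canonical type is an assumption.

```python
def normalize_id(raw):
    """Convert an int, an integral float, or an integer string to a plain int."""
    if isinstance(raw, bool):
        # bool is a subclass of int, but it is never a meaningful id.
        raise TypeError("booleans are not valid ids")
    if isinstance(raw, int):
        return raw
    if isinstance(raw, float):
        # Rejects 42.5 as well as nan and inf, for which is_integer() is False.
        if not raw.is_integer():
            raise ValueError(f"non-integral id: {raw!r}")
        return int(raw)
    if isinstance(raw, str):
        # int() raises ValueError for anything that is not an integer literal.
        return int(raw.strip())
    raise TypeError(f"unsupported id type: {type(raw).__name__}")
```

Refusing non-integral floats, rather than rounding them, keeps a corrupted id from being silently mapped onto a different cluster or quote.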

Look at examples

Print out reasonably sized clusters with all their quotes, about 5 clusters per page, on 5-10 pages.

And see what there is to see.
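A minimal sketch of such a printout follows. It assumes the clusters are available as a dict mapping a cluster id to a list of quote strings, and it reads "reasonably sized" as a hypothetical range of 2 to 20 quotes; both are assumptions to adjust.

```python
def paginate(items, per_page=5):
    """Split a list into consecutive chunks of at most `per_page` items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


def format_pages(clusters, per_page=5, max_pages=10, min_size=2, max_size=20):
    """Render clusters with all their quotes as a list of page strings.

    `clusters` maps a cluster id to a list of quote strings. Only clusters
    with between `min_size` and `max_size` quotes (inclusive) are kept.
    """
    kept = [(cid, quotes) for cid, quotes in sorted(clusters.items())
            if min_size <= len(quotes) <= max_size]
    pages = []
    for number, page in enumerate(paginate(kept, per_page)[:max_pages], start=1):
        lines = [f"=== Page {number} ==="]
        for cluster_id, quotes in page:
            lines.append(f"Cluster {cluster_id} ({len(quotes)} quotes)")
            lines.extend(f"  - {quote}" for quote in quotes)
        pages.append("\n".join(lines))
    return pages
```

Printing the result is then a matter of `print("\n\n".join(format_pages(clusters)))`.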

Plan documentation

Check what's documented in the code and what isn't. Make a list of things that still need documenting.

List fact-checks

Many facts stated in the paper need checking. List them. Check them.
