wehlutyk / brainscopypaste
Analysis of mutation of quotes on the web
License: GNU General Public License v3.0
It's smarter to have susceptibilities as a curve, and feature density as the colors.
For the clusters that last at least N timebags. (Or for all clusters, if #25 shows that the duration of a cluster does not depend on its linguistic initial point.)
If there is an evolution, the distributions should gradually become more spiked (e.g. more words learned earlier, fewer words learned later).
See also wehlutyk/brainscopypaste-paper#7
In http://nbviewer.ipython.org/github/wehlutyk/brainscopypaste/blob/master/features_timebags_evolution_recursive_shifting.ipynb , in the MNSyllables plot, there's a growing spike at 4 syllables. What's that? It could also be related to a similar spike in the AoA plots.
The error is:
WARNING: toctree contains reference to nonexisting document u'reference/analyze.args.GroupAnalysisArgs.title'
Also decide on the options for automodule (etc.) to get all members of all classes, but not inherited members.
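One possible starting point for the automodule options, as a sketch only: it assumes a Sphinx version that supports `autodoc_default_options` (1.8 or later) and has not been checked against this project's `conf.py`. Leaving out the `inherited-members` key is what keeps inherited members out of the output.

```python
# Fragment for docs/conf.py (the path is an assumption about the project layout).
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
]

# Defaults applied to every automodule/autoclass directive.
autodoc_default_options = {
    "members": True,          # document all members that have docstrings
    "undoc-members": True,    # also include members without docstrings
    "show-inheritance": True, # list base classes under each class
    # "inherited-members" is deliberately absent, so members defined
    # only on base classes are not repeated in each subclass.
}
```

Individual directives can still override these defaults where a class needs different treatment.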
I have discovered so many new tools since this started: MongoDB, SQLAlchemy, joblib, logging, pandas, and seaborn, to name a few. So there's a great refactoring in sight. Basically, there will be three levels:
Models in SQLAlchemy. This will also solve #33. Basically it eases everything.
Mining operations are to be generalized. You can mine for substitutions, for evolution of timebags, for other things. One unique command and subcommand: brains mine {substitutions | evolution | ...}. Each operation has prerequisites that go through joblib and are run beforehand if needed (with confirmation), a kind of parallelized make. This includes #32.
Creating a new mining operation must be straightforward, because that's the way to go when a new question appears. Maybe also allow for quick immediate tweaking of an analysis by adding variation-ids to analyses?
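A minimal sketch of how the single command with subcommands could be wired up with `argparse`, so that adding a mining operation only means writing one decorated function. The registry, the decorator, and the function names are all hypothetical and not taken from the existing code; the joblib prerequisite handling is left out.

```python
import argparse

# Hypothetical registry: subcommand name -> function implementing the operation.
MINING_OPERATIONS = {}


def mining_operation(name):
    """Register a function as the `brains mine <name>` subcommand."""
    def decorator(func):
        MINING_OPERATIONS[name] = func
        return func
    return decorator


@mining_operation("substitutions")
def mine_substitutions(args):
    # Placeholder body: the real operation would mine substitutions here.
    return "mining substitutions"


@mining_operation("evolution")
def mine_evolution(args):
    # Placeholder body: the real operation would mine timebag evolution here.
    return "mining evolution of timebags"


def build_parser():
    """Build the `brains mine {substitutions | evolution | ...}` parser."""
    parser = argparse.ArgumentParser(prog="brains")
    commands = parser.add_subparsers(dest="command", required=True)
    mine = commands.add_parser("mine", help="run a mining operation")
    operations = mine.add_subparsers(dest="operation", required=True)
    for name, func in MINING_OPERATIONS.items():
        sub = operations.add_parser(name)
        sub.set_defaults(run=func)  # remember which function to call
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.run(args)
```

Calling `main(["mine", "evolution"])` dispatches to `mine_evolution`; a new question would only need another function decorated with `@mining_operation("...")`.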
In notebooks, and nowhere else. Storify and order them too. Graphs read mined stuff, writing their prerequisites at the beginning.
The language detector relies on this. See here.
Graphing/visualizing (not mining) can be put in ipython with nice views and interfaces.
All the figures from the paper should be included in this.
Update the sphinx doc with the notebooks.
Will go with the documentation
And rebuild the whole dependency list. See also #19.
For Clusters they're ints, in QtStrings in substitutions they seem to be floats, and in Quotes in Clusters they're strings!
Print out reasonably sized clusters with all their quotes, about 5 clusters per page, on 5-10 pages.
And see what there is to see.
Check what's documented in the code and what isn't. Make a list of things that still need documenting.
Instead of a duration depending on the cluster duration
Many facts stated in the paper need checking. List them. Check them.
So pip install -r requirements.txt fails.
Useful?
As an answer to what?