Comments (6)
Also, there is no text data in the combined text from the old_taxons in new_content.csv. This needs fixing.
from govuk-taxonomy-supervised-learning.
Has this been completed now @ellieking17?
Not yet. We have a Trello card for it.
This turns out to be a problem in the pipeline: some scripts write their files out zipped, while the following script reads them in as CSV. The Makefile is also feeding unzipped files between scripts. @ff-l is working on standardising the pipeline to zipped file types, as these are being read by notebooks too.
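For illustration, here is a minimal sketch of what a consistent gzipped CSV hand-off between two scripts could look like, using only the Python standard library. The helper names, the `.csv.gz` file name and the column names are hypothetical and not taken from the repository.

```python
import csv
import gzip


def write_rows_gz(path, header, rows):
    """Write a header and rows to a gzip-compressed CSV file."""
    # "wt" opens the gzip stream in text mode so csv.writer can be used directly.
    with gzip.open(path, "wt", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_rows_gz(path):
    """Read a gzip-compressed CSV file back as a list of dicts."""
    # The reader must also go through gzip.open; a plain open() on the same
    # path would see compressed bytes rather than CSV text.
    with gzip.open(path, "rt", newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

If every script writes and reads through the same pair of helpers, the mismatch described above (one step writing zipped output, the next expecting plain CSV) cannot occur silently.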
@ellieking17 I've set up the scripts so that all outputs are compressed (gzip). The final thing to change would be the output from the Python crawl pipeline (to get gzipped JSON files).
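As a rough sketch of that remaining step, assuming the crawl output is a JSON-serialisable list of records (the actual output format of the crawl pipeline is not shown in this thread, and the helper names are hypothetical), gzipped JSON can be written and read with the standard library:

```python
import gzip
import json


def write_json_gz(path, records):
    """Serialise records as JSON into a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(records, f)


def read_json_gz(path):
    """Load JSON back from a gzip-compressed file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```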
This is now fixed.
Related Issues (14)
- Algorithm V2.0.0 needs to be run as script
- create_unlabelled_predictions_meta as script and task to makefile
- Add asserts to cleaning scripts
- logger.warn when overwriting existing output files
- Create lookup table for document_type group
- Add level1 only and level2 tags for labelled df to labelled.py
- Remove data cleaning activities from EDA notebooks
- labelled doesn't have document_type_gp column. Why?
- untagged index is rangeindex not timestamp. Why?
- Steps for deep learning on AWS
- ./databox.sh fails on ubuntu 16.04
- Cache tokens during model runs
- dataprep.py and new_dataprep.py need to be added to Makefile