mirador / nhanes
Scripts to download and aggregate NHANES data
Home Page: http://www.cdc.gov/nchs/nhanes.htm
License: GNU General Public License v2.0
So far this has happened only with the 2013-2014 cycle: opening the CSV files with the default encoding on Mac results in the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position N: invalid start/continuing byte. The files should be opened with latin-1 encoding.
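A minimal sketch of the failure and the fix (the column names and value are hypothetical, chosen only to reproduce the error):

```python
# Byte 0xf6 is "o-umlaut" in Latin-1 but is not a valid UTF-8 sequence,
# which matches the error reported for the 2013-2014 files.
raw = b"RIAGENDR,DMDBORN\n1,K\xf6ln\n"  # hypothetical CSV contents

try:
    raw.decode("utf-8")          # raises UnicodeDecodeError on byte 0xf6
except UnicodeDecodeError as err:
    print(err)

# Latin-1 maps every byte to a code point, so decoding always succeeds:
text = raw.decode("latin-1")

# When reading the downloaded files directly, pass the encoding to open():
# with open("some_cycle_file.csv", encoding="latin-1") as f: ...
```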
The config.mira file is not copied into the mirador datasets, although running the steps individually works fine.
Hi! First of all, I'm a big fan of you guys; thanks for making such an effort to make the NHANES data easy to use!
That said, I'm struggling to generate the data. I've cloned your repo and am running
makeall.sh 1999 2018
from the terminal, which creates a folder (mirador/1999-2018/) containing a single file called config.mira. Am I doing anything wrong? How can I access the final dataset?
The "Data release cycle" variable is generated when aggregating consecutive cycles. It is a categorical variable with a coded value for each NHANES cycle (e.g. 7 for 2011-2012, 8 for 2013-2014, etc.). The order of these values is not correct in the dictionary.
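A minimal sketch of one way to keep the dictionary entries in cycle order (codes 7 and 8 come from the issue text; the code 6 for 2009-2010 is extrapolated here purely for illustration):

```python
# Hypothetical cycle-to-code mapping; 7 and 8 are given in the issue,
# 6 for 2009-2010 is an extrapolated, illustrative value.
cycle_codes = {"2013-2014": 8, "2009-2010": 6, "2011-2012": 7}

# Emitting the entries sorted by their numeric code keeps the dictionary
# in chronological cycle order regardless of insertion order.
ordered = sorted(cycle_codes.items(), key=lambda kv: kv[1])
for label, code in ordered:
    print(code, label)
```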
Right now, xpt2csv calls R to convert the .xpt files into .csv. The xport package for Python could be used instead. It was tested with one file and seems to work well, but more files need to be checked to make sure there are no discrepancies with the R-based conversion.
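As a sketch of what an R-free conversion path could look like, pandas' read_sas also parses the XPORT format and could serve as a cross-check against the xport package's output (the file names below are hypothetical):

```python
import pandas as pd

def xpt_to_csv(xpt_path: str, csv_path: str) -> None:
    # pandas reads SAS XPORT transport files natively with format="xport".
    df = pd.read_sas(xpt_path, format="xport")
    df.to_csv(csv_path, index=False)

# xpt_to_csv("DEMO_H.xpt", "DEMO_H.csv")  # hypothetical file names
```

Running the same file through both converters and diffing the CSVs would be one way to check for discrepancies.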
It appears that the FTP structure has changed, rendering getdata.py and download.py useless. If anyone has a fix, I'd love to see the changes merged. This looks like an incredibly useful set of scripts otherwise.
Command: python makemeta.py 2015-2016 Questionnaire data/sources/csv/2015-2016 data/mirador/2015-2016/question.xml_strings
XML validation error:
Traceback (most recent call last):
File "makemeta.py", line 461, in
doc = parseString(''.join(xml_strings))
File "/Users/andres/anaconda3/lib/python3.7/xml/dom/minidom.py", line 1968, in parseString
return expatbuilder.parseString(string)
File "/Users/andres/anaconda3/lib/python3.7/xml/dom/expatbuilder.py", line 925, in parseString
return builder.parseString(string)
File "/Users/andres/anaconda3/lib/python3.7/xml/dom/expatbuilder.py", line 223, in parseString
parser.Parse(string, True)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 847, column 256
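A likely cause is an unescaped character in one of the joined strings; a raw "&" or "<" in a label is exactly the kind of "invalid token" expat rejects. A minimal, illustrative reproduction (the actual offending string in question.xml_strings is not shown here):

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape

# A raw "&" inside an attribute value is an invalid token for expat.
bad = "<variables><var label='Income & assets'/></variables>"
try:
    parseString(bad)
except ExpatError as err:
    # err.lineno / err.offset point at the bad token, as in the traceback.
    print(f"not well-formed (invalid token): line {err.lineno}, column {err.offset}")

# Escaping text before assembling the XML strings avoids the error:
good = f"<variables><var label='{escape('Income & assets')}'/></variables>"
doc = parseString(good)  # parses cleanly
```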
Hi there, I really appreciate this great work for downloading the NHANES dataset.
But when I run
python getdata.py 2007-2008
it gets stuck at the DOWNLOADING XPT FILES message for 30 minutes.
Please take a look at it.
Thanks a lot!
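One thing worth checking is whether a transfer is simply hanging with no timeout. A hedged sketch of a fail-fast download helper (the URL pattern and file names are illustrative, not the exact paths getdata.py builds):

```python
import urllib.request

def fetch(url: str, dest: str, timeout: float = 60.0) -> None:
    # An explicit timeout makes a stalled connection raise an exception
    # instead of blocking forever at the "DOWNLOADING XPT FILES" step.
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        with open(dest, "wb") as out:
            out.write(resp.read())

# fetch("https://wwwn.cdc.gov/.../DEMO_E.xpt", "DEMO_E.xpt")  # illustrative
```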
Add an argument to the composites.py script to indicate where to insert the composites group in the XML file.