innolitics / dicom-attribute-scraper
Scripts for scraping and aggregating DICOM attributes
Given multiple JSON mappings from #3, aggregate the results into a single mapping. For example, { tag: "stringA" } and { tag: "stringB" } can be aggregated into a single { tag: ["stringA", "stringB"] }.
This aggregator should accept the same write-method arguments as #3. That is, if the scraping script can output both JSON and SQLite, the aggregator should be able to aggregate both (I suspect the SQLite aggregator would be quite simple).
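A minimal sketch of the JSON side of the aggregation described above. The function names are hypothetical, and dropping repeated values is an assumption that the issue does not state.

```python
import json


def aggregate_mappings(mappings):
    """Merge several {tag: value} mappings into one {tag: [values, ...]} mapping."""
    aggregated = {}
    for mapping in mappings:
        for tag, value in mapping.items():
            values = aggregated.setdefault(tag, [])
            # Assumption: a value seen more than once is kept only once,
            # in first-seen order. Remove this check to keep duplicates.
            if value not in values:
                values.append(value)
    return aggregated


def aggregate_json_files(input_paths, output_path):
    """Load each JSON mapping from disk, aggregate them, and save the result."""
    mappings = []
    for path in input_paths:
        with open(path, encoding="utf-8") as f:
            mappings.append(json.load(f))
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(aggregate_mappings(mappings), f, indent=2)
```

With the example from the issue, aggregate_mappings([{"tag": "stringA"}, {"tag": "stringB"}]) gives {"tag": ["stringA", "stringB"]}.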
We need to generate the "big" example file. Since we may need to do this again in the future, we should make a little script to handle it. Here are the tasks I expect we will need to complete, but chime in below if you see anything I have missed.
Information gathering:
Script:
(find may be useful here).
Follow-up:
I think a Makefile or a shell script would be the best format for this. I would probably lean toward Make but do not have a strong preference one way or the other.
Tags used in the browser do not include the space after the comma: "(0008,0008)" instead of "(0008, 0008)". This discrepancy prevents the tags from being recognized by the DICOM browser code, so the spaces must be removed during attribute scraping.
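One way the space removal could look, as a small sketch. The helper name normalize_tag is hypothetical, and it assumes the tag is already a string such as "(0008, 0008)".

```python
def normalize_tag(tag):
    """Return the tag without spaces, e.g. "(0008, 0008)" -> "(0008,0008)"."""
    # Tags contain only parentheses, hex digits, and a comma, so stripping
    # every space is safe here.
    return tag.replace(" ", "")
```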
When processing a large number of files, it may be more efficient to write to a simple SQLite table instead of creating a JSON file for every DICOM file. For example:
attribute | value |
---|---|
"0010,0020" | "patient-id-12345" |
"0010,0040" | "M" |
... | ... |
"0010,0040" | "O" |
This functionality can be enabled by passing an argument to the script. E.g., python scraper.py --use-sqlite "sqlite-db-file-name".
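A rough sketch of writing scraped attributes to such a table with Python's built-in sqlite3 module. The table name attributes and the function name write_attributes are assumptions; only the two columns come from the example above.

```python
import sqlite3


def write_attributes(conn, mapping):
    """Append one (attribute, value) row per entry of a {tag: value} mapping."""
    # Assumption: a single table named "attributes" with two text columns,
    # mirroring the example table. Rows from many files accumulate here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attributes (attribute TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO attributes (attribute, value) VALUES (?, ?)",
        [(tag, str(value)) for tag, value in mapping.items()],
    )
    conn.commit()
```

The connection would come from sqlite3.connect("sqlite-db-file-name"). Because every file appends to the same table, the distinct values for a tag can later be read with a query such as SELECT DISTINCT value FROM attributes WHERE attribute = ?.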
It would be nice to have a few shared example files to use during development. There aren't any hard requirements, but ideally, the files would include a variety of attributes (i.e., overlapping attributes are less useful). The analyzer tab at https://dicom.innolitics.com may be useful for roughly gauging overlap.
Given a DICOM file, create a mapping from tag (“(0010,0010)” or its ID or hex equivalent) to value for each attribute in the file, subject to the criteria below.
Deliverable: Python script that accepts a DICOM file path and saves the result. Ex:
python scriptname.py input-file.dcm --json output-file.json
I suspect https://pydicom.github.io/pydicom/stable/ will be useful.
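A possible starting point for the scraping step, assuming pydicom is installed. The function names are hypothetical. The sketch applies no filtering criteria, looks only at top-level elements, and converts every value to a string, so nested sequences are not expanded.

```python
import json


def format_tag(group, element):
    """Format a tag as "(GGGG,EEEE)" in hex, with no space after the comma."""
    return "({:04X},{:04X})".format(group, element)


def dataset_to_mapping(dataset):
    """Build a {tag: value} mapping from an iterable of DICOM data elements."""
    mapping = {}
    for elem in dataset:
        # Each element is expected to expose .tag.group, .tag.element and
        # .value, as pydicom's DataElement does.
        mapping[format_tag(elem.tag.group, elem.tag.element)] = str(elem.value)
    return mapping


def scrape_file(dicom_path, json_path):
    """Read one DICOM file and save its tag-to-value mapping as JSON."""
    import pydicom  # third-party; imported here so the helpers work without it

    dataset = pydicom.dcmread(dicom_path)
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(dataset_to_mapping(dataset), f, indent=2)
```

A command-line wrapper using argparse could then map the positional input path and the --json option onto scrape_file.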