Comments (3)
More thoughts on what would be nice to have here and possible approaches:
- A SQLite database that stores the date-time, software version, and script version for each run of each dataset, plus a hash of an export of that dataset to check that it hasn't been changed.
- An additional table (joined many-to-one) that records the manipulations performed on each dataset.
- Raw data files stored with an associated identifier (probably not forever, but maybe a few versions back).
- A view_changes function that reports the manipulations conducted, and a reproduce function that uses git to check out the versions of the code used and then runs them on the associated raw data files.
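To make the proposal concrete, here is a minimal sketch of the provenance tables and a view_changes function using Python's standard sqlite3 and hashlib modules. The table names, column names, and the helpers record_run and is_unchanged are assumptions made for illustration, not existing retriever code. The sketch also assumes manipulations are attached to a run rather than directly to a dataset. The git-based reproduce step is not covered.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per run, many manipulation rows per run.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    run_id           INTEGER PRIMARY KEY,
    dataset          TEXT NOT NULL,
    run_time         TEXT NOT NULL,
    software_version TEXT NOT NULL,
    script_version   TEXT NOT NULL,
    export_hash      TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS manipulations (
    manipulation_id INTEGER PRIMARY KEY,
    run_id          INTEGER NOT NULL REFERENCES runs(run_id),
    step            INTEGER NOT NULL,
    description     TEXT NOT NULL
);
"""


def hash_export(export_bytes):
    """Return a SHA-256 hex digest of an exported dataset."""
    return hashlib.sha256(export_bytes).hexdigest()


def record_run(conn, dataset, software_version, script_version,
               export_bytes, manipulations):
    """Store one run and its ordered manipulations; return the run id."""
    run_time = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO runs (dataset, run_time, software_version,"
        " script_version, export_hash) VALUES (?, ?, ?, ?, ?)",
        (dataset, run_time, software_version, script_version,
         hash_export(export_bytes)),
    )
    run_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO manipulations (run_id, step, description)"
        " VALUES (?, ?, ?)",
        [(run_id, step, text)
         for step, text in enumerate(manipulations, start=1)],
    )
    conn.commit()
    return run_id


def view_changes(conn, dataset):
    """List (run_id, description) for every manipulation of a dataset."""
    rows = conn.execute(
        "SELECT r.run_id, m.description FROM runs r"
        " JOIN manipulations m ON m.run_id = r.run_id"
        " WHERE r.dataset = ? ORDER BY r.run_id, m.step",
        (dataset,),
    )
    return rows.fetchall()


def is_unchanged(conn, run_id, export_bytes):
    """Check a fresh export against the hash stored for a run."""
    row = conn.execute(
        "SELECT export_hash FROM runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row is not None and row[0] == hash_export(export_bytes)
```

To try it, open a connection with sqlite3.connect(":memory:"), run conn.executescript(SCHEMA), and call record_run with a dataset name, two version strings, the exported bytes, and a list of manipulation descriptions. Hashing an export rather than the database file itself keeps the check independent of the storage backend, at the cost of requiring a deterministic export format.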
from retriever.
Possible provenance modules to use:
Discussion of this as part of GSoC applications:
Added #1369