Comments (7)
I don't really know much about the formatting recommendations/guidelines for Jupyter notebooks, or whether Jupyter Notebook and Jupyter Lab differ in terms of what gets written to .ipynb files. However, I noticed that in the Jupyter Lab UI, there's a metadata field, which would probably be equivalent to what @bollwyvl mentioned:
{
  "metadata": {
    "watermark": {
      "date": "2015-17-06T15:04:35",
      "CPython": "3.4.3",
      "IPython": "3.1.0",
      "compiler": "GCC 4.2.1 (Apple Inc. build 5577)",
      "system": "Darwin",
      "release": "14.3.0",
      "machine": "x86_64",
      "processor": "i386",
      "CPU cores": "4",
      "interpreter": "64bit"
    }
  }
}
In any case, if you or @bollwyvl or someone else would like to implement this (a way to optionally write metadata), I'd be very open to it and happy to merge it (there was good work in progress over at #7).
This could be done via either a
- magic command,
- decorator, or
- --metadata flag.
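Whichever entry point is chosen, the write itself could look roughly like the sketch below. It uses only the standard library and treats the .ipynb file as plain JSON. The function names collect_watermark and write_watermark are hypothetical and not part of watermark's API, and the field names simply mirror the example above.

```python
import json
import platform
from datetime import datetime, timezone


def collect_watermark():
    """Gather a few of the fields shown in the example (illustrative subset)."""
    return {
        # ISO 8601 timestamp in UTC, e.g. 2015-06-17T15:04:35+00:00
        "date": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        # Key name copied from the example; the value is the running
        # interpreter's version, whatever the implementation is.
        "CPython": platform.python_version(),
        "compiler": platform.python_compiler(),
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
    }


def write_watermark(nb_path):
    """Store the watermark under the notebook's top-level "metadata" key."""
    with open(nb_path, encoding="utf-8") as f:
        nb = json.load(f)
    # Other metadata entries (kernelspec, language_info, ...) are left untouched.
    nb.setdefault("metadata", {})["watermark"] = collect_watermark()
    with open(nb_path, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1)
    return nb["metadata"]["watermark"]
```

This assumes the notebook file is at rest. From inside a live session the frontend owns the document and would likely overwrite the file on the next save, so a real implementation would probably have to go through the frontend instead.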
from watermark.
Thanks for the suggestion, this sounds interesting. API-wise, I would think of an additional (optional) flag that would maybe write the produced output into the meta-tag.
Just wondering, what application and use-case would you have in mind? Right now, for example, I'd use this plugin to conveniently show the time-stamp of the last update to users. Or to show Python versions and packages that were used to create those results. I am just wondering how the "meta" tag could be additionally used to improve reproducibility.
Thanks for the response. Yeah, -m is already taken, but something to that
effect.
I think the big win is that metadata in standard formats (iso, etc) is more
unambiguously parseable by downstream consumers and UI than inline text.
Instead of writing some regular expressions, one can call
json.load(f)["metadata"]["watermark"]. For example, on nbviewer, we show the
kernel that was used to create the notebook.
So if one has a big stack of documentation notebooks in a repo, one can
check for when they were actually executed, not when they were checked out,
etc.
When we get better search, either in JupyterHub or in custom deployments,
metadata fields will just be ready to go as facets. An organization that
has watermark as part of their "standard distribution" could gain a lot of
insight, about a snapshot or over time.
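As a rough illustration of the "stack of documentation notebooks" case, the sketch below walks a directory and collects the recorded date from each notebook. It assumes the watermark sits under metadata["watermark"] with a "date" field, as proposed in this thread; read_watermark and execution_dates are hypothetical helper names.

```python
import json
from pathlib import Path


def read_watermark(nb_path):
    """Return the notebook's metadata["watermark"] dict, or None if absent."""
    with open(nb_path, encoding="utf-8") as f:
        nb = json.load(f)
    return nb.get("metadata", {}).get("watermark")


def execution_dates(repo_dir):
    """Map each watermarked notebook under repo_dir to its recorded date."""
    dates = {}
    for path in sorted(Path(repo_dir).rglob("*.ipynb")):
        mark = read_watermark(path)
        if mark is not None:
            dates[str(path)] = mark.get("date")
    return dates
```

Notebooks without a watermark are skipped rather than reported, which keeps the result usable as a plain lookup table; a real tool might prefer to list them separately.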
On 23:34, Tue, Sep 1, 2015, Sebastian Raschka wrote:
Thanks for the suggestion, this sounds interesting. API-wise, I would
think of an additional (optional) flag that would maybe write the produced
output into the meta-tag.
Just wondering, what application and use-case would you have in mind?
Right now, for example, I'd use this plugin to conveniently show the
time-stamp of the last update to users. Or to show Python versions and
packages that were used to create those results. I am just wondering how
the "meta" tag could be additionally used to improve reproducibility.
metadata in standard formats (iso, etc) is more
unambiguously parseable by downstream consumers and UI than inline text.
Good point, I agree. In this context, I could also imagine an optional little add-on that writes all current package specifications of the Python environment into the metadata, as in pip freeze > requirements.txt.
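One way such an add-on might gather the package list without shelling out to pip is importlib.metadata from the standard library (Python 3.8+). This is only an approximation of pip freeze: it does not reproduce pip's handling of editable or VCS installs, and frozen_requirements is a hypothetical name.

```python
from importlib import metadata


def frozen_requirements():
    """Return sorted "name==version" strings for the installed distributions."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip broken installs with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)
```

The resulting list is JSON-serializable as-is, so it could be stored next to the other watermark fields in the notebook metadata.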
Btw., something like
-s --save_meta
-g --generate_meta
seems to be okay! However, I would suggest not using the one-letter short form here and going with --generate_meta to make it clear to a "user" of this notebook that the current watermark would change the notebook's metadata in some way upon re-execution.
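To make the long-form-only idea concrete, here is a small sketch using plain argparse as a stand-in; watermark itself is an IPython magic, so the real argument declaration would look different. The prog name and build_parser are hypothetical. Passing allow_abbrev=False also stops argparse from accepting prefixes such as --gen.

```python
import argparse


def build_parser():
    """Parser with a long-only flag, so the side effect is spelled out in full."""
    parser = argparse.ArgumentParser(prog="watermark-sketch", allow_abbrev=False)
    parser.add_argument(
        "--generate_meta",
        action="store_true",
        help="also write the watermark into the notebook's metadata",
    )
    return parser
```

With this, build_parser().parse_args(["--generate_meta"]).generate_meta is True, and it defaults to False when the flag is omitted.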
Would you be interested in implementing such a feature?
Sorry I didn't get back to you sooner: traveling!
I'd love to take a whack at this. Hopefully I can get a PoC up quickly.
Addons are great, but likely outside the scope of this particular request!
But, since we're off topic... I highly recommend building them with entry_points vs. namespace tomfoolery or magic module/function names.
In addition to pip, I'd consider being able to serialize the state of:
- python
  - conda
- "native" managers
  - apt
  - dnf/yum
  - brew
- other vcs
  - hg
No need to apologize, and I am sorry, too. It was a pretty hectic week. I am currently in the final stage of finishing up my new book, which is coming out in 1-2 weeks, and there is a lot of stuff to be done :).
So, I think writing to the meta-tags as an option would be great. And I will open separate issues for the other suggestions. I like the idea of considering other "managers"/"environments".
Cheers,
Sebastian
Worth reheating this discussion? I think it would be cool to have the information inside the metadata of the notebook, then follow up with a PR for conda-tools/conda-execute#3, which might make the notebook a "shareable unit". Right now, to share notebooks you need to make a repository with a requirements.txt or some such.