
notebooks's People

Contributors

amygdala, brandondutra, chmeyers, clemens-tolboom, craigcitro, gramster, lakshmanok, nikhilk, ojarjur, parthea, qimingj, rajivpb, rileyjbauer, supriyagarg, yebrahim, yuxuanchen


notebooks's Issues

Map charts don't load, missing key

Getting this error:

Google Maps API error: MissingKeyMapError https://developers.google.com/maps/documentation/javascript/error-messages#missing-key-map-error
_.kb @ js?v=3&callback=google.loader.callbacks.maps&sensor=false:37
(anonymous) @ common.js:53
(anonymous) @ common.js:194
c @ common.js:49
(anonymous) @ AuthenticationService.Authenticate?1shttp%3A%2F%2Flocalhost%3A8081%2Fnotebooks%2Fdev%2Fnotebooks%2F…:1

Add sort_index() to closing data

The dataframes do not seem to be in date-sorted order. This is not a problem for plotting the time series, but the autocorrelation, etc., assumes they are sorted.

A workaround is to add a sort_index() call to the first "Munge the data" code cell, e.g., so it reads:

closing_data = pd.DataFrame()
   . . .
closing_data['aord_close'] = aord['Close']

# Put the closing_data in sorted order *** Needed for autocorrelation to work ***
closing_data = closing_data.sort_index()

# Pandas includes a very convenient function for filling gaps in the data.
closing_data = closing_data.fillna(method='ffill')

Missing package causes error in sample: "Introduction to Python"

Copy issue from googledatalab/datalab#936

We need to either add the package or change the sample (the latter may be easier for now).
%%bash

apt-get install -y -q libxslt-dev libxml2-dev
pip install -q scrapy

debconf: delaying package configuration, since apt-utils is not installed
Command "/usr/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-build- OqPUd_/cryptography/setup.py';exec(compile(getattr(tokenize, 'open', open)(file).read().replace('\r\n', '\n'), file, 'exec'))" install --record /tmp/pip-NBepB4-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-OqPUd_/cryptography/

Is Storage API in sample notebook referencing pydatalab?

Hi,

Does the Storage API referenced in https://github.com/googledatalab/notebooks/blob/master/tutorials/Storage/Storage%20APIs.ipynb follow the API documented at http://googledatalab.github.io/pydatalab/datalab.storage.html? The source code for that documentation is in https://github.com/googledatalab/pydatalab/tree/v1.1/datalab/storage

If so, I couldn't find some classes, such as Item. The error thrown is: AttributeError: 'module' object has no attribute 'Item'

Code snippet for your reference:

import google.datalab.storage as storage

shared_bucket = storage.Item('BUCKET_NAME', "KEY_NAME")
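
For what it's worth, this may be a mismatch between two APIs: the snippet imports the newer google.datalab.storage module, while the linked docs describe the older datalab.storage package where Item lives. A minimal sketch of both variants, assuming the newer module exposes Bucket/Object rather than Item (the class names for the newer API are an assumption, not verified against the docs; constructor arguments are mirrored from the snippet above):

# Older API, matching the linked pydatalab v1.1 docs (this is where Item exists):
import datalab.storage as legacy_storage
shared_item = legacy_storage.Item('BUCKET_NAME', 'KEY_NAME')

# Newer API; assumption: objects are represented by storage.Object, not storage.Item.
import google.datalab.storage as storage
shared_bucket = storage.Bucket('BUCKET_NAME')
shared_object = storage.Object('BUCKET_NAME', 'KEY_NAME')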

ga sd notebooks

TODO

  1. Done: remove job.wait() calls, as jobs are now blocking.
  2. Done: check that I don't say 'transforms.json' or 'numerical_analysis.json'.
  3. Done: use the job name from the ctc? It doesn't have one.
  4. Done: make a README that says you have to run %projects set cloud-ml-dev. Not needed.

Unsupported loss function in seq2seq model.

I am exploring the following TensorFlow example: https://github.com/googledatalab/notebooks/blob/master/samples/TensorFlow/LSTM%20Punctuation%20Model%20With%20TensorFlow.ipynb. It is apparently written for TF v1, so I upgraded it with the v2 upgrade script, which reported three main issues:

ERROR: Using member tf.contrib.rnn.DropoutWrapper in deprecated module tf.contrib. tf.contrib.rnn.DropoutWrapper cannot be converted automatically. tf.contrib will not be distributed with TensorFlow 2.0, please consider an alternative in non-contrib TensorFlow, a community-maintained repository such as tensorflow/addons, or fork the required code.
ERROR: Using member tf.contrib.legacy_seq2seq.sequence_loss_by_example in deprecated module tf.contrib. tf.contrib.legacy_seq2seq.sequence_loss_by_example cannot be converted automatically. tf.contrib will not be distributed with TensorFlow 2.0, please consider an alternative in non-contrib TensorFlow, a community-maintained repository such as tensorflow/addons, or fork the required code.
ERROR: Using member tf.contrib.framework.get_or_create_global_step in deprecated module tf.contrib. tf.contrib.framework.get_or_create_global_step cannot be converted automatically. tf.contrib will not be distributed with TensorFlow 2.0, please consider an alternative in non-contrib TensorFlow, a community-maintained repository such as tensorflow/addons, or fork the required code.

So for compatibility I manually replaced framework.get_or_create_global_step with tf.compat.v1.train.get_or_create_global_step, and also rnn.DropoutWrapper with tf.compat.v1.nn.rnn_cell.DropoutWrapper.

But I was unable to find a solution for the tf.contrib.legacy_seq2seq.sequence_loss_by_example method, since I cannot find a backwards-compatible alternative. I tried installing TensorFlow Addons and using its seq2seq loss function, but I wasn't able to figure out how to adapt it to work with the rest of the code.

I stumbled across errors like "Consider casting elements to a supported type." and "Logits must be a [batch_size x sequence_length x logits] tensor", probably because I am not implementing something correctly.

My question: how do I implement a supported TensorFlow v2 alternative to this loss function so that it behaves like the code below?

    output = tf.reshape(tf.concat(axis=1, values=outputs), [-1, size])
    softmax_w = tf.compat.v1.get_variable("softmax_w", [size, len(TARGETS)], dtype=tf.float32)
    softmax_b = tf.compat.v1.get_variable("softmax_b", [len(TARGETS)], dtype=tf.float32)
    logits = tf.matmul(output, softmax_w) + softmax_b
    self._predictions = tf.argmax(input=logits, axis=1)    
    self._targets = tf.reshape(input_.targets, [-1])
    loss = tfa.seq2seq.sequence_loss(
        [logits],
        [tf.reshape(input_.targets, [-1])],
        [tf.ones([batch_size * num_steps], dtype=tf.float32)])
    self._cost = cost = tf.reduce_sum(input_tensor=loss) / batch_size
    self._final_state = state

Full code here.
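
For reference, a minimal sketch of how tfa.seq2seq.sequence_loss might be wired in so that it mimics the old loss. Unlike legacy sequence_loss_by_example, it takes 3-D logits of shape [batch_size, sequence_length, vocab_size] and 2-D targets/weights rather than lists. The variable names (logits, batch_size, num_steps, TARGETS, input_) are taken from the snippet above, and this is an untested assumption, not the notebook's actual fix:

import tensorflow as tf
import tensorflow_addons as tfa

# Reshape the flat [batch_size * num_steps, vocab] logits back to 3-D.
logits_3d = tf.reshape(logits, [batch_size, num_steps, len(TARGETS)])
targets_2d = tf.reshape(input_.targets, [batch_size, num_steps])
weights_2d = tf.ones([batch_size, num_steps], dtype=tf.float32)

# With averaging disabled this returns a [batch_size, num_steps] loss tensor,
# which is the closest analogue of sequence_loss_by_example's per-step losses.
loss = tfa.seq2seq.sequence_loss(
    logits_3d, targets_2d, weights_2d,
    average_across_timesteps=False,
    average_across_batch=False)

cost = tf.reduce_sum(input_tensor=loss) / batch_size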

My proposal: When this is resolved please update the notebook with newer version example.

Make the notebooks more reliable in test runs

We've recently added the .test.sh script which runs each of these notebooks to validate that none of the cells raise errors when executed.

Conceptually, this is fine, but most of these notebooks were written with the idea that they would be run once rather than repeatedly.

As such, we should go through each notebook and see if there are things we should change to make them more reliable when run multiple times.

As a concrete example, we have at least one issue in the 'tutorials/storage/Storage APIs.ipynb' notebook: it creates a sample bucket, adds an item to it, deletes just that one item, and then deletes the bucket. If that process fails after creating the sample item but before deleting it, then every subsequent run will find that the sample bucket is not empty, and attempts to delete the sample bucket will fail.

That particular notebook needs to be updated so it better handles the scenario where the sample bucket already exists, and there are probably similar issues in the other notebooks.
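
One way to make that cleanup idempotent is to delete any leftover objects before deleting the bucket, and to tolerate the bucket already existing. A minimal sketch using the standard google-cloud-storage client (the notebook itself uses the datalab storage wrapper, so the client choice and the bucket name here are illustrative assumptions):

from google.cloud import storage

client = storage.Client()
bucket_name = 'sample-bucket-name'  # hypothetical name

# Reuse the bucket if a previous run left it behind; otherwise create it.
bucket = client.bucket(bucket_name)
if not bucket.exists():
    bucket = client.create_bucket(bucket_name)

# ... run the sample steps ...

# Delete any leftover objects first, so a partially failed earlier run
# cannot leave a non-empty bucket that blocks deletion.
for blob in bucket.list_blobs():
    blob.delete()
bucket.delete()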

Error when using old training data

Hi all,

When I use model_file_prefix = model_dir to skip the training part (as I've already done it), I get an error upon running the following code:

from google.datalab.ml import ConfusionMatrix
from pprint import pprint

cm_data = run_eval(model_file_prefix, '/content/datalab/punctuation/datapreped/test.txt')
pprint(cm_data.tolist())
cm = ConfusionMatrix(cm_data, TARGETS)
cm.plot()

Error:
INFO:tensorflow:Restoring parameters from /content/lab/datalab/punctuation/model/eval/model.ckpt
INFO:tensorflow:Starting standard services.
INFO:tensorflow:Saving checkpoint to path /content/lab/datalab/punctuation/model/eval/model.ckpt
INFO:tensorflow:Starting queue runners.
INFO:tensorflow:Recording summary at step None.
INFO:tensorflow:Restoring parameters from /content/lab/datalab/punctuation/model/
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.NotFoundError'>, Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/
[[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]

Caused by op 'save/RestoreV2_6', defined at:
File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/main.py", line 3, in
app.launch_new_instance()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 4, in
cm_data = run_eval(model_file_prefix, '/content/lab/datalab/punctuation/datapreped/test.txt')
File "", line 25, in run_eval
sv = tf.train.Supervisor(logdir=logdir)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 136, in new_func
return func(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 316, in init
self._init_saver(saver=saver)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 464, in _init_saver
saver = saver_mod.Saver()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1239, in init
self.build()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1248, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
restore_sequentially, reshape)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 268, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1031, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1625, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/
[[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]


NotFoundError Traceback (most recent call last)
/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1349 try:
-> 1350 return fn(*args)
1351 except errors.OpError as e:

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
1328 feed_dict, fetch_list, target_list,
-> 1329 status, run_metadata)
1330

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
472 compat.as_text(c_api.TF_Message(self.status.status)),
--> 473 c_api.TF_GetCode(self.status.status))
474 # Delete the underlying status object from memory otherwise it stays alive

NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/
[[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]

During handling of the above exception, another exception occurred:

NotFoundError Traceback (most recent call last)
in ()
2 from pprint import pprint
3
----> 4 cm_data = run_eval(model_file_prefix, '/content/lab/datalab/punctuation/datapreped/test.txt')
5 #'/content/datalab/docs/samples/TensorFlow'
6 pprint(cm_data.tolist())

in run_eval(model_file_prefix, test_data_path)
25 sv = tf.train.Supervisor(logdir=logdir)
26 with sv.managed_session() as session:
---> 27 sv.saver.restore(session, model_file_prefix)
28 test_perplexity, cm_data = run_epoch(session, mtest, 1, word_to_id, is_eval=True)
29 return cm_data

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py in restore(self, sess, save_path)
1684 if context.in_graph_mode():
1685 sess.run(self.saver_def.restore_op_name,
-> 1686 {self.saver_def.filename_tensor_name: save_path})
1687 else:
1688 self._build_eager(save_path, build_save=False, build_restore=True)

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
893 try:
894 result = self._run(None, fetches, feed_dict, options_ptr,
--> 895 run_metadata_ptr)
896 if run_metadata:
897 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1126 if final_fetches or final_targets or (handle and feed_dict_tensor):
1127 results = self._do_run(handle, final_targets, final_fetches,
-> 1128 feed_dict_tensor, options, run_metadata)
1129 else:
1130 results = []

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1342 if handle is None:
1343 return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1344 options, run_metadata)
1345 else:
1346 return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1361 except KeyError:
1362 pass
-> 1363 raise type(e)(node_def, op, message)
1364
1365 def _extend_graph(self):

NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/
[[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]

Caused by op 'save/RestoreV2_6', defined at:
File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/usr/local/envs/py3env/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/main.py", line 3, in
app.launch_new_instance()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/envs/py3env/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 4, in
cm_data = run_eval(model_file_prefix, '/content/lab/datalab/punctuation/datapreped/test.txt')
File "", line 25, in run_eval
sv = tf.train.Supervisor(logdir=logdir)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 136, in new_func
return func(*args, **kwargs)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 316, in init
self._init_saver(saver=saver)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 464, in _init_saver
saver = saver_mod.Saver()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1239, in init
self.build()
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1248, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
restore_sequentially, reshape)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 268, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1031, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1625, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /content/lab/datalab/punctuation/model/
[[Node: save/RestoreV2_6 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_6/tensor_names, save/RestoreV2_6/shape_and_slices)]]

The path and file both exist.

When I use model_file_prefix = train(ttxt, vtxt, saved_model_path), the code works without error. Any idea why this occurs and how to fix it?
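
A guess at the cause, based on the error text: the saver is handed the bare model directory (/content/lab/datalab/punctuation/model/) rather than an actual checkpoint prefix such as .../punctuation-5796. A minimal sketch of resolving the prefix first; the directory path comes from the error above, and the rest is an assumption about how the notebook's variables are wired:

import tensorflow as tf

model_dir = '/content/lab/datalab/punctuation/model/'

# Resolve the latest checkpoint prefix (e.g. '.../punctuation-5796')
# instead of passing the directory itself to saver.restore().
model_file_prefix = tf.train.latest_checkpoint(model_dir)
if model_file_prefix is None:
    raise ValueError('No checkpoint found under %s' % model_dir)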

Internal Temporary file not writable

In the notebook at notebooks/samples/ML Toolbox/Regression/Census/2 Service Preprocess.ipynb, the following section:

analysis_path = os.path.join(workspace_path, 'analysis')
regression.analyze(dataset=train_data, output_dir=analysis_path, cloud=True)

gives the following error:

Running numerical analysis...Analyze: failed with error: The internal temporary file is not writable.

Machine Learning with Financial Data - unstable results

Hi,
I was trying out the "Machine Learning with Financial Data" notebook.
In the tutorial video, the prediction accuracy of the first trivial model is 65% (as described in the notebook's text). However, the current version of the notebook on GitHub shows an accuracy of 90%, and when I ran the notebook myself, both on my PC and on Google Cloud, I got 84%. I'm wondering what is going on here. Can anyone explain that to me?
Thanks,
Yang

Needed: DataLab integration with Google BigTable, Google DataProc (Spark)

We use Jupyter notebooks to access BigTable data like so:

from google.cloud import bigtable
from google.cloud import happybase

client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
connection = happybase.Connection(instance=instance)
table = connection.table(table_name)

for key, row in table.scan():
    ...  # process each (row key, row dict) pair

(we then convert this into pandas DataFrames)
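
For context, a minimal sketch of that conversion step, assuming UTF-8 encoded column qualifiers and cell values (the exact decoding depends on the table schema):

import pandas as pd

records = []
for key, row in table.scan():
    # row is a dict of {column qualifier: cell value}, both as bytes.
    record = {col.decode('utf-8'): val.decode('utf-8') for col, val in row.items()}
    record['row_key'] = key.decode('utf-8')
    records.append(record)

df = pd.DataFrame(records)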

Regarding DataLab and DataProc integration: Jupyter-Spark integration (http://blog.insightdatalabs.com/jupyter-on-apache-spark-step-by-step/) is well established in data science, so how can we leverage DataLab notebooks for Spark jobs running on DataProc (e.g., stepwise PySpark job definitions, visualising job results)?

Also, how do we leverage IPython Parallel (https://ipyparallel.readthedocs.io/en/latest/) and the Jupyter Cluster notebook extensions in DataLab?

Beta2: Update tutorial notebook "Importing and Exporting Data"

table.to_file('/tmp/cars.csv')

TypeError Traceback (most recent call last)
in ()
----> 1 table.to_file('/tmp/cars.csv')

/usr/local/lib/python2.7/dist-packages/datalab/bigquery/_table.pyc in to_file(self, destination, format, csv_delimiter, csv_header)
648 for column in self.schema:
649 fieldnames.append(column.name)
--> 650 writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter=csv_delimiter)
651 if csv_header:
652 writer.writeheader()

/usr/lib/python2.7/csv.pyc in __init__(self, f, fieldnames, restval, extrasaction, dialect, *args, **kwds)
135 extrasaction)
136 self.extrasaction = extrasaction
--> 137 self.writer = writer(f, dialect, *args, **kwds)
138
139 def writeheader(self):

TypeError: "delimiter" must be string, not unicode

How-to on handling large data

Creating a bug to track an internal bug...

It's very difficult to understand anything about best practices for magics or modules without linking out to readthedocs.

This needs to be covered in tutorials and sample notebooks, through markdown discussion and code, and in help text for queries.
Specific worthwhile additions:

  1. How to handle large data (a common complaint) for in-memory work when retrieved from BigQuery; this is needed for GA (see the sketch after this list).
  2. How to handle large data in memory; DataFrames won't scale. This is post-GA, and I have a separate tracking bug to use GraphLab's OSS alternative.

We should figure out how to better surface reference docs, as well as improve docs with a how-to set of notebooks to cover this sort of information.
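
As a starting point for item 1, a minimal sketch of keeping the in-memory result bounded by limiting the query before pulling it into a DataFrame. The bq.Query(...).results() pattern follows the BigQuery tutorial issues below; treat to_dataframe() and the row limit as assumptions about the datalab API and the data size:

import datalab.bigquery as bq

# Keep only a bounded number of rows in memory; push heavier filtering
# and aggregation into BigQuery itself rather than into pandas.
query = bq.Query(
    'SELECT * FROM [cloud-datalab-samples:httplogs.logs_20140615] LIMIT 10000')
df = query.results().to_dataframe()  # to_dataframe() is assumed, not verified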

Failed to send HTTP request on BigQuery APIs Notebook

I'm trying to run the BigQuery API tutorial that is included in Datalab, but I'm hitting an error on the very first step:

# Create and run a SQL query
bq.Query('SELECT * FROM [cloud-datalab-samples:httplogs.logs_20140615] LIMIT 3').results()

I feel like I might be missing something basic. Here is the Traceback:

Exception Traceback (most recent call last)
<ipython-input-2-03b0534f5548> in <module>()
      1 # Create and run a SQL query
----> 2 bq.Query('SELECT * FROM [cloud-datalab-samples:httplogs.logs_20140615] LIMIT 3').results()

/usr/local/lib/python2.7/dist-packages/datalab/bigquery/_query.pyc in results(self, use_cache, dialect, billing_tier)
    226     """
    227     if not use_cache or (self._results is None):
--> 228       self.execute(use_cache=use_cache, dialect=dialect, billing_tier=billing_tier)
    229     return self._results.results
    230 

/usr/local/lib/python2.7/dist-packages/datalab/bigquery/_query.pyc in execute(self, table_name, table_mode, use_cache, priority, allow_large_results, dialect, billing_tier)
    524     job = self.execute_async(table_name=table_name, table_mode=table_mode, use_cache=use_cache,
    525                              priority=priority, allow_large_results=allow_large_results,
--> 526                              dialect=dialect, billing_tier=billing_tier)
    527     self._results = job.wait()
    528     return self._results

/usr/local/lib/python2.7/dist-packages/datalab/bigquery/_query.pyc in execute_async(self, table_name, table_mode, use_cache, priority, allow_large_results, dialect, billing_tier)
    479                                                  billing_tier=billing_tier)
    480     except Exception as e:
--> 481       raise e
    482     if 'jobReference' not in query_result:
    483       raise Exception('Unexpected response from server')

Exception: Failed to send HTTP request.

AttributeError: 'dict_values' has no attribute 'index'

Upon running:

for s in sources:
    source, predicted = predictor.predict(s)
    print('\n---SOURCE----\n' + source)
    print('---PREDICTED----\n' + predicted)

I get the following error:
INFO:tensorflow:Restoring parameters from /content/lab/datalab/punctuation/model/punctuation-5796

AttributeError Traceback (most recent call last)
in ()
9
10 for s in sources:
---> 11 source, predicted = predictor.predict(s)
12 print('\n---SOURCE----\n' + source)
13 print('---PREDICTED----\n' + predicted)

in predict(self, content)
88 for i in indices:
89 words1[i], words1[i-1] = words1[i-1], words1[i]
---> 90 words2 = [self._word_to_id.keys()[self._word_to_id.values().index(data_x[index])] for index in range(len(puncts) - 1, len(data_x))]
91 all_words = words1 + [puncts[-1]] + words2
92 content = ' '.join(all_words)

in (.0)
88 for i in indices:
89 words1[i], words1[i-1] = words1[i-1], words1[i]
---> 90 words2 = [self._word_to_id.keys()[self._word_to_id.values().index(data_x[index])] for index in range(len(puncts) - 1, len(data_x))]
91 all_words = words1 + [puncts[-1]] + words2
92 content = ' '.join(all_words)

AttributeError: 'dict_values' object has no attribute 'index'

This is how index is defined:
words1 = [self._word_to_id.keys()[self._word_to_id.values().index(data_x[index])] for index in range(len(puncts) - 1)]

indices = [i for i, w in enumerate(words1) if w in PUNCTUATIONS]

for i in indices:
    words1[i], words1[i-1] = words1[i-1], words1[i]  # only line in the for loop

words2 = [self._word_to_id.keys()[self._word_to_id.values().index(data_x[index])] for index in range(len(puncts) - 1, len(data_x))]
all_words = words1 + [puncts[-1]] + words2
content = ' '.join(all_words)
min_step = len(puncts)

Can anyone explain why this occurs and/or how to fix this?
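
A likely cause, for anyone hitting this: under Python 3, dict.keys() and dict.values() return view objects, which cannot be indexed and have no .index() method, whereas the notebook code assumes Python 2 lists. A minimal sketch of a fix that builds an inverse lookup once instead of repeatedly indexing the values (variable names follow the snippet above):

# Build an id -> word mapping once; this avoids indexing dict views,
# which fails on Python 3 and is O(n) per lookup anyway.
id_to_word = {word_id: word for word, word_id in self._word_to_id.items()}

words1 = [id_to_word[data_x[index]] for index in range(len(puncts) - 1)]
words2 = [id_to_word[data_x[index]] for index in range(len(puncts) - 1, len(data_x))]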

BigQuery API Notebook throws error

I'm running through /docs/tutorials/BigQuery/BigQuery%20APIs.ipynb

In the second cell I get:

# Create and run a SQL query
bq.Query('SELECT * FROM [cloud-datalab-samples:httplogs.logs_20140615] LIMIT 3').results()

RequestException Traceback (most recent call last)
<ipython-input-2-03b0534f5548> in <module>()
      1 # Create and run a SQL query
----> 2 bq.Query('SELECT * FROM [cloud-datalab-samples:httplogs.logs_20140615] LIMIT 3').results()

/usr/local/lib/python2.7/dist-packages/datalab/bigquery/_query.pyc in results(self, use_cache, dialect, billing_tier)
    226     """
    227     if not use_cache or (self._results is None):
--> 228       self.execute(use_cache=use_cache, dialect=dialect, billing_tier=billing_tier)
    229     return self._results.results
    230 

/usr/local/lib/python2.7/dist-packages/datalab/bigquery/_query.pyc in execute(self, table_name, table_mode, use_cache, priority, allow_large_results, dialect, billing_tier)
    524     job = self.execute_async(table_name=table_name, table_mode=table_mode, use_cache=use_cache,
    525                              priority=priority, allow_large_results=allow_large_results,
--> 526                              dialect=dialect, billing_tier=billing_tier)
    527     self._results = job.wait()
    528     return self._results

/usr/local/lib/python2.7/dist-packages/datalab/bigquery/_query.pyc in execute_async(self, table_name, table_mode, use_cache, priority, allow_large_results, dialect, billing_tier)
    479                                                  billing_tier=billing_tier)
    480     except Exception as e:
--> 481       raise e
    482     if 'jobReference' not in query_result:
    483       raise Exception('Unexpected response from server')

RequestException: HTTP request failed: Invalid project ID '<PROJECT_ID>'. Project IDs must contain 6-63 lowercase letters, digits, or dashes. IDs must start with a letter and may not end with a dash.

The query itself runs fine in BigQuery.
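
The error suggests the notebook's '<PROJECT_ID>' placeholder was never replaced with a real project. A minimal sketch of setting the project before running the query; treat Context.default().set_project_id as an assumption about the datalab API, and 'my-gcp-project' as a placeholder:

from datalab.context import Context

# Replace with your own project ID before running the BigQuery cells.
Context.default().set_project_id('my-gcp-project')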

Consider adding a standalone sample notebook for Facets

Usage of Facets can be found in samples such as:
https://github.com/googledatalab/notebooks/blob/master/samples/contrib/mlworkbench/structured_data_classification_police/Predict%20Case%20Resolution%20(small%20data%20experience).ipynb

Given the importance of data visualization, please consider creating a standalone notebook to address best practices and how to use the built-in Facets feature in Datalab.

In addition, please add concrete examples using a sample dataset that explain how to interpret the Facets output and how to translate the insights into action items, such as data cleaning or feature engineering.
One possible structure could be:

  1. Image of the Facets result
  2. Human interpretation
  3. The code
