

TensorFlow Model Analysis


TensorFlow Model Analysis (TFMA) is a library for evaluating TensorFlow models. It allows users to evaluate their models on large amounts of data in a distributed manner, using the same metrics defined in their trainer. These metrics can be computed over different slices of data and visualized in Jupyter notebooks.

TFMA Slicing Metrics Browser

Caution: TFMA may introduce backwards incompatible changes before version 1.0.

Installation

The recommended way to install TFMA is using the PyPI package:

pip install tensorflow-model-analysis

To install the nightly package from https://pypi-nightly.tensorflow.org:

pip install -i https://pypi-nightly.tensorflow.org/simple tensorflow-model-analysis

To install from the HEAD of the git repository:

pip install git+https://github.com/tensorflow/model-analysis.git#egg=tensorflow_model_analysis

To install a released version directly from git:

pip install git+https://github.com/tensorflow/model-analysis.git@<release_tag>#egg=tensorflow_model_analysis

If you have cloned the repository locally and want to test your local changes, pip install from the local folder:

pip install -e $FOLDER_OF_THE_LOCAL_LOCATION

Note that protobuf must be installed correctly for the above option, since it builds TFMA from source and requires protoc and all of its includes to be reference-able. Please see the protobuf install instructions for the latest guidance.

Currently, TFMA requires that TensorFlow is installed but does not have an explicit dependency on the TensorFlow PyPI package. See the TensorFlow install guides for instructions.

Build TFMA from source

To build from source, follow these steps:

Install protoc as described in the protobuf install instructions mentioned above.

Create a virtual environment, then clone the repository and build the wheel by running the following commands:

python3 -m venv <virtualenv_name>
source <virtualenv_name>/bin/activate
pip3 install setuptools wheel
git clone https://github.com/tensorflow/model-analysis.git
cd model-analysis
python3 setup.py bdist_wheel

This will build the TFMA wheel in the dist directory. To install the wheel from the dist directory, run the following commands:

cd dist
pip3 install tensorflow_model_analysis-<version>-py3-none-any.whl

Jupyter Lab

As of writing, because of pypa/pip#9187, pip install might never finish. In that case, you should revert pip to version 19 instead of 20: pip install "pip<20".

Using the TFMA JupyterLab extension requires installing dependencies, either from the console in the JupyterLab UI or from the command line. This includes separately installing the pip package dependencies and the JupyterLab labextension plugin dependencies, and the version numbers must be compatible. JupyterLab labextension packages refer to npm packages (e.g., tensorflow_model_analysis).

The examples below use version 0.32.0; check the available versions to use the latest.

Jupyter Lab 3.0.x

pip install tensorflow_model_analysis==0.32.0
jupyter labextension install tensorflow_model_analysis@0.32.0
pip install jupyterlab_widgets==1.0.0

Jupyter Lab 2.2.x

pip install tensorflow_model_analysis==0.32.0
jupyter labextension install tensorflow_model_analysis@0.32.0
jupyter labextension install @jupyter-widgets/jupyterlab-manager@2

Jupyter Lab 1.2.x

pip install tensorflow_model_analysis==0.32.0
jupyter labextension install tensorflow_model_analysis@0.32.0
jupyter labextension install @jupyter-widgets/jupyterlab-manager@1

Classic Jupyter Notebook

To enable TFMA visualization in the classic Jupyter Notebook (either through jupyter notebook or through the JupyterLab UI), you'll also need to run:

jupyter nbextension enable --py widgetsnbextension
jupyter nbextension enable --py tensorflow_model_analysis

Note: If Jupyter notebook is already installed in your home directory, add --user to these commands. If Jupyter is installed as root, or using a virtual environment, the parameter --sys-prefix might be required.

Building TFMA from source

If you want to build TFMA from source and use the UI in JupyterLab, you'll need to make sure that the source contains valid version numbers. Check that the Python package version number and the npm package version number are exactly the same, and that both are valid version numbers (e.g., remove the -dev suffix).

Troubleshooting

Check pip packages:

pip list

Check JupyterLab extensions:

jupyter labextension list  # for JupyterLab
jupyter nbextension list  # for classic Jupyter Notebook

Standalone HTML page with embed_minimal_html

TFMA notebook extension can be built into a standalone HTML file that also bundles data into the HTML file. See the Jupyter Widgets docs on embed_minimal_html.
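As an illustration, a minimal sketch (mirroring the snippet in the feature-request issue further below; the output path is a placeholder):

import tensorflow_model_analysis as tfma
from ipywidgets.embed import embed_minimal_html

# Placeholder path to an existing TFMA evaluation output directory.
result = tfma.load_eval_result(output_path='gs://<TFMA_OUTPUT_DIRECTORY>')
slicing_metrics_view = tfma.view.render_slicing_metrics(result)

# Write a standalone HTML file that bundles the widget and its data.
embed_minimal_html('tfma_export.html', views=[slicing_metrics_view], title='Slicing Metrics')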

Kubeflow Pipelines

Kubeflow Pipelines includes integrations that embed the TFMA notebook extension (code). This integration relies on network access at runtime to load a variant of the JavaScript build published on unpkg.com (see config and loader code).

Notable Dependencies

TensorFlow is required.

Apache Beam is required; it's the way that efficient distributed computation is supported. By default, Apache Beam runs in local mode but can also run in distributed mode using Google Cloud Dataflow and other Apache Beam runners.

Apache Arrow is also required. TFMA uses Arrow to represent data internally in order to make use of vectorized numpy functions.

Getting Started

For instructions on using TFMA, see the get started guide.
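For orientation, a minimal sketch of a typical run (paths and the label key are placeholders; see the guide for the authoritative walkthrough):

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='label')],          # placeholder label key
    metrics_specs=[tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(class_name='ExampleCount')])],
    slicing_specs=[tfma.SlicingSpec()])                       # empty spec = overall slice

eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path='/path/to/saved_model',             # placeholder path
    eval_config=eval_config)

eval_result = tfma.run_model_analysis(
    eval_shared_model=eval_shared_model,
    eval_config=eval_config,
    data_location='/path/to/examples.tfrecord',               # placeholder path
    output_path='/path/to/output')                            # placeholder path

# In a notebook, visualize per-slice metrics.
tfma.view.render_slicing_metrics(eval_result)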

Compatible Versions

The following table lists the package versions that are compatible with each other. This is determined by our testing framework, but other untested combinations may also work.

tensorflow-model-analysis apache-beam[gcp] pyarrow tensorflow tensorflow-metadata tfx-bsl
GitHub master 2.47.0 10.0.0 nightly (2.x) 1.15.0 1.15.1
0.46.0 2.47.0 10.0.0 2.15 1.15.0 1.15.1
0.45.0 2.47.0 10.0.0 2.13 1.14.0 1.14.0
0.44.0 2.40.0 6.0.0 2.12 1.13.1 1.13.0
0.43.0 2.40.0 6.0.0 2.11 1.12.0 1.12.0
0.42.0 2.40.0 6.0.0 1.15.5 / 2.10 1.11.0 1.11.1
0.41.0 2.40.0 6.0.0 1.15.5 / 2.9 1.10.0 1.10.1
0.40.0 2.38.0 5.0.0 1.15.5 / 2.9 1.9.0 1.9.0
0.39.0 2.38.0 5.0.0 1.15.5 / 2.8 1.8.0 1.8.0
0.38.0 2.36.0 5.0.0 1.15.5 / 2.8 1.7.0 1.7.0
0.37.0 2.35.0 5.0.0 1.15.5 / 2.7 1.6.0 1.6.0
0.36.0 2.34.0 5.0.0 1.15.5 / 2.7 1.5.0 1.5.0
0.35.0 2.33.0 5.0.0 1.15 / 2.6 1.4.0 1.4.0
0.34.1 2.32.0 2.0.0 1.15 / 2.6 1.2.0 1.3.0
0.34.0 2.31.0 2.0.0 1.15 / 2.6 1.2.0 1.3.1
0.33.0 2.31.0 2.0.0 1.15 / 2.5 1.2.0 1.2.0
0.32.1 2.29.0 2.0.0 1.15 / 2.5 1.1.0 1.1.1
0.32.0 2.29.0 2.0.0 1.15 / 2.5 1.1.0 1.1.0
0.31.0 2.29.0 2.0.0 1.15 / 2.5 1.0.0 1.0.0
0.30.0 2.28.0 2.0.0 1.15 / 2.4 0.30.0 0.30.0
0.29.0 2.28.0 2.0.0 1.15 / 2.4 0.29.0 0.29.0
0.28.0 2.28.0 2.0.0 1.15 / 2.4 0.28.0 0.28.0
0.27.0 2.27.0 2.0.0 1.15 / 2.4 0.27.0 0.27.0
0.26.1 2.28.0 0.17.0 1.15 / 2.3 0.26.0 0.26.0
0.26.0 2.25.0 0.17.0 1.15 / 2.3 0.26.0 0.26.0
0.25.0 2.25.0 0.17.0 1.15 / 2.3 0.25.0 0.25.0
0.24.3 2.24.0 0.17.0 1.15 / 2.3 0.24.0 0.24.1
0.24.2 2.23.0 0.17.0 1.15 / 2.3 0.24.0 0.24.0
0.24.1 2.23.0 0.17.0 1.15 / 2.3 0.24.0 0.24.0
0.24.0 2.23.0 0.17.0 1.15 / 2.3 0.24.0 0.24.0
0.23.0 2.23.0 0.17.0 1.15 / 2.3 0.23.0 0.23.0
0.22.2 2.20.0 0.16.0 1.15 / 2.2 0.22.2 0.22.0
0.22.1 2.20.0 0.16.0 1.15 / 2.2 0.22.2 0.22.0
0.22.0 2.20.0 0.16.0 1.15 / 2.2 0.22.0 0.22.0
0.21.6 2.19.0 0.15.0 1.15 / 2.1 0.21.0 0.21.3
0.21.5 2.19.0 0.15.0 1.15 / 2.1 0.21.0 0.21.3
0.21.4 2.19.0 0.15.0 1.15 / 2.1 0.21.0 0.21.3
0.21.3 2.17.0 0.15.0 1.15 / 2.1 0.21.0 0.21.0
0.21.2 2.17.0 0.15.0 1.15 / 2.1 0.21.0 0.21.0
0.21.1 2.17.0 0.15.0 1.15 / 2.1 0.21.0 0.21.0
0.21.0 2.17.0 0.15.0 1.15 / 2.1 0.21.0 0.21.0
0.15.4 2.16.0 0.15.0 1.15 / 2.0 n/a 0.15.1
0.15.3 2.16.0 0.15.0 1.15 / 2.0 n/a 0.15.1
0.15.2 2.16.0 0.15.0 1.15 / 2.0 n/a 0.15.1
0.15.1 2.16.0 0.15.0 1.15 / 2.0 n/a 0.15.0
0.15.0 2.16.0 0.15.0 1.15 n/a n/a
0.14.0 2.14.0 n/a 1.14 n/a n/a
0.13.1 2.11.0 n/a 1.13 n/a n/a
0.13.0 2.11.0 n/a 1.13 n/a n/a
0.12.1 2.10.0 n/a 1.12 n/a n/a
0.12.0 2.10.0 n/a 1.12 n/a n/a
0.11.0 2.8.0 n/a 1.11 n/a n/a
0.9.2 2.6.0 n/a 1.9 n/a n/a
0.9.1 2.6.0 n/a 1.10 n/a n/a
0.9.0 2.5.0 n/a 1.9 n/a n/a
0.6.0 2.4.0 n/a 1.6 n/a n/a

Questions

Please direct any questions about working with TFMA to Stack Overflow using the tensorflow-model-analysis tag.


model-analysis's Issues

KeyError: TupleType for load_eval_result from GCS with TFX 0.13.0

TFMA's load_eval_result('gs://bucket/path/to/evaluator/output') on TFX 0.13 (Kubeflow 0.4.1 on GCP with KFP 0.1.19 and Python 3.5) gives the following error:

KeyError                                  Traceback (most recent call last)
<ipython-input-7-4f9962d5bddb> in <module>
----> 1 result = tfma.load_eval_result('gs://bucket/path/to/evaluator/output')

/usr/lib/python3.5/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in load_eval_result(output_path)
    194            for key, plot_data in plots_proto_list]
    195 
--> 196   eval_config = load_eval_config(output_path)
    197   return EvalResult(
    198       slicing_metrics=slicing_metrics, plots=plots, config=eval_config)

/usr/lib/python3.5/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in load_eval_config(output_path)
    115       tf.python_io.tf_record_iterator(
    116           os.path.join(output_path, _EVAL_CONFIG_FILE)))
--> 117   final_dict = pickle.loads(serialized_record)
    118   _check_version(final_dict, output_path)
    119   return final_dict[_EVAL_CONFIG_KEY]

/usr/lib/python3.5/site-packages/dill/_dill.py in _load_type(name)
    575 
    576 def _load_type(name):
--> 577     return _reverse_typemap[name]
    578 
    579 def _create_type(typeobj, *args):

KeyError: 'TupleType'

CMLE's Python version is also 3.5 (as it was provided as a training input).

For TFX 0.12 on the same Kubeflow cluster it works as expected. I updated TFMA's version according to the instructions.

TFDV works without any issues (in the same notebook). And it also works fine when using Airflow for local development.

Could it be related to this?

WriteTFRecord includes full path to output file

This is not a very significant bug, but it is somewhat unexpected behavior. In the code, output files are written via a beam.io.WriteToTFRecord PTransform, but the step name is assigned as 'WriteTFRecord(%s)' % output_file. The problem is that output_file is generally an absolute path to some file, using / as a delimiter. In the Google Cloud Dataflow UI it seems that / in the step name is used to group together similar operations, so you end up with a deeply nested graph structure where WriteTFRecord(gs://path/to/file) gets mapped to 'WriteTFRecord(gs:' > '' > 'path' > 'to' > 'file'.
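A possible workaround (a hedged sketch, not part of the report) is to sanitize the path before it is used as the step name, e.g. by replacing the delimiter:

def _write_tfrecord_step_name(output_file):
  # Hypothetical helper: Dataflow's UI treats '/' in step names as nesting,
  # so replace it before building the PTransform label.
  return 'WriteTFRecord(%s)' % output_file.replace('/', '__')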

Bug: module 'tfx_bsl.coders.example_coder' has no attribute 'ExampleToNumpyDict'

System information

  • Have I written custom code (Yes):
  • OS Platform and Distribution (Docker of Jupyter):
  • TensorFlow Model Analysis installed from (pip install):
  • TensorFlow Model Analysis version (0.21.6):
  • Python version: 3.7.6
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce: as shown below.

Describe the problem

I've tried the examples here, and everything works well in Jupyter except that tfma.view.render_time_series doesn't show anything.

So I tried TFMA on my own Keras saved model. The model info below was inspected using the saved_model_cli command.

saved_model_cli show --dir xdeepfm --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['input_1'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 31)
      name: serving_default_input_1:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

With the code below, an error occurs when run_model_analysis is called: AttributeError: module 'tfx_bsl.coders.example_coder' has no attribute 'ExampleToNumpyDict'.

The version of tfx_bsl on my system is 0.21.4, and I've checked the source code here. It does have ExampleToNumpyDict: from tfx_bsl.cc.tfx_bsl_extension.coders import ExampleToNumpyDict.

So are there errors in my code? Or is something else wrong? Thanks for your help~

Source code / logs

Code

import tensorflow as tf
import tensorflow_model_analysis as tfma
import os
import numpy as np
from tensorflow.core.example import example_pb2
from google.protobuf import text_format

BASE_DIR = '/home/jovyan/work'
TFMA_DIR = os.path.join(BASE_DIR, 'tfma')
OUTPUT_DIR = os.path.join(TFMA_DIR, 'output')

SAMPLE_SIZE=100
inputs = np.random.randint(1001, size=(SAMPLE_SIZE, 31))
labels = np.random.randint(2, size=(SAMPLE_SIZE, 1))

examples = []
for i in range(SAMPLE_SIZE):
    example = example_pb2.Example()
    example.features.feature['input_1'].int64_list.value[:] = inputs[i].tolist()
    example.features.feature['output_1'].int64_list.value[:] = labels[i].tolist()
    examples.append(example)

TFRecord_file = os.path.join(TFMA_DIR, 'train_data.rio')
with tf.io.TFRecordWriter(TFRecord_file) as writer:
    for example in examples:
        writer.write(example.SerializeToString())
    writer.flush()
    writer.close()

eval_config = text_format.Parse("""
  model_specs {
    # This assumes a serving model with a "serving_default" signature.
    label_key: "output_1"
    # example_weight_key: "weight"
  }
  metrics_specs {
    # This assumes a binary classification model.
    # metrics { class_name: "AUC" }
    # ... other metrics ...
  }
  slicing_specs {}
""", tfma.EvalConfig())

eval_model_dir = os.path.join(TFMA_DIR, 'xdeepfm')
print(eval_model_dir)
eval_model = tfma.default_eval_shared_model(eval_saved_model_path=eval_model_dir,eval_config=eval_config)
eval_result = tfma.run_model_analysis(eval_shared_model=eval_model,
                                      data_location=TFRecord_file,
                                      file_format='tfrecords',
                                      output_path=OUTPUT_DIR,
                                      extractors=None)
tfma.view.render_slicing_metrics(eval_result)

Logs

/opt/conda/lib/python3.7/site-packages/apache_beam/transforms/core.py in <lambda>(x, *args, **kwargs)
   1434   if _fn_takes_side_inputs(fn):
-> 1435     wrapper = lambda x, *args, **kwargs: [fn(x, *args, **kwargs)]
   1436   else:

/opt/conda/lib/python3.7/site-packages/tensorflow_model_analysis/extractors/input_extractor.py in _ParseExample(extracts, eval_config)
    104 
--> 105   features = example_coder.ExampleToNumpyDict(extracts[constants.INPUT_KEY])
    106   extracts = copy.copy(extracts)

AttributeError: module 'tfx_bsl.coders.example_coder' has no attribute 'ExampleToNumpyDict'

JupyterLab support?

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : N/A
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux CentOS 7
  • TensorFlow Model Analysis installed from (source or binary): pypi binary
  • TensorFlow Model Analysis version (use command below): tensorflow-model-analysis 0.14.0
  • Python version: 3.6
  • Jupyter Notebook version: 7.0.0
  • Exact command to reproduce: N/A

Describe the problem

At Twitter, we primarily use the JupyterLab front-end for our notebook-based workflows. TFMA currently only supports running as an nbextension for the "Classic" Notebook UI, rather than providing a labextension for JupyterLab.

Thus, consuming TFMA currently requires that our ML practitioners revert to the "Classic" Notebook UI which has largely been deprecated internally. It'd be great if TFMA could provide a JupyterLab plugin so that our users didn't have to switch UIs and interrupt their typical workflow.

Source code / logs

N/A

Binarize options in default_multi_class_classification_specs is broken

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow Model Analysis installed from (source or binary): binary (PyPI)
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.8
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce:
tfma.metrics.default_multi_class_classification_specs(
    model_names=['model'],
    binarize=tfma.BinarizationOptions(class_ids={'values' : list(range(len(classes)))}),
    sparse=True
)

This gives the following error:

<ipython-input-17-1d79e900e9de> in <module>
      2     model_names=['model'],
      3     binarize=tfma.BinarizationOptions(class_ids={'values' : list(range(len(classes)))}),
----> 4     sparse=True
      5 )

/usr/local/lib/python3.6/site-packages/tensorflow_model_analysis/metrics/metric_specs.py in default_multi_class_classification_specs(model_names, output_names, binarize, aggregate, sparse)
    325       ])
    326     binarize = config.BinarizationOptions().CopyFrom(binarize)
--> 327     binarize.ClearField('top_k_list')  # pytype: disable=attribute-error
    328   multi_class_metrics = specs_from_metrics(
    329       metrics, model_names=model_names, output_names=output_names)

AttributeError: 'NoneType' object has no attribute 'ClearField'
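The traceback points at the fact that protobuf's CopyFrom() returns None, so its result cannot be assigned and used directly. A hedged sketch of a possible fix inside metric_specs.py:

# Hedged sketch: make the copy first, then clear the field on the copy.
binarize_copy = config.BinarizationOptions()
binarize_copy.CopyFrom(binarize)          # CopyFrom() mutates in place and returns None
binarize_copy.ClearField('top_k_list')
binarize = binarize_copy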

Confusion matrices and calibration plots don't visualize in Kubeflow pipeline

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow Model Analysis installed from (source or binary): binary (PyPI)
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.9
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce:
from tensorflow_model_analysis import EvalConfig
from tensorflow_model_analysis.metrics import default_multi_class_classification_specs
from google.protobuf.json_format import ParseDict

classes = ['class_1', 'class_2', ...]

eval_config = {
    'model_specs': [
        {
            'name': 'rig_state',
            'model_type': 'tf_keras',
            'signature_name': 'serve_raw',
            'label_key': ...,
            'example_weight_key': 'sample_weight'
        }
    ],
    'metrics_specs': [
        {
            'metrics': [
                {
                    'class_name': 'MultiClassConfusionMatrixPlot',
                    'config': '"thresholds": [0.5]'
                },
                {'class_name': 'ExampleCount'},
                {'class_name': 'WeightedExampleCount'},
                {'class_name': 'SparseCategoricalAccuracy'},
            ],
        },
        {
            'binarize': {'class_ids': {'values': list(range(len(classes)))}},
            'metrics': [
                {'class_name': 'AUC'},
                {'class_name': 'CalibrationPlot'},
                {'class_name': 'BinaryAccuracy'},
                {'class_name': 'MeanPrediction'}
            ]
        }
    ],
    'slicing_specs': [...]
}
eval_config: EvalConfig = ParseDict(eval_config, EvalConfig())

Describe the problem

When using this EvalConfig in tfma.run_model_analysis everything runs successfully. Inspecting the result, there is plot data for the confusion matrix, yet when trying to plot the result using tfma.view.render_slicing_metrics, the confusion matrix won't show.

This happens in a kubeflow pipeline.

Documentation for Slicer.SingleSliceSpec

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): N/A
  • TensorFlow Model Analysis installed from (source or binary): N/A
  • TensorFlow Model Analysis version (use command below): N/A
  • Python version: N/A
  • Jupyter Notebook version: N/A
  • Exact command to reproduce: N/A

You can obtain the TensorFlow Model Analysis version with

python -c "import tensorflow_model_analysis as tfma; print(tfma.version.VERSION)"

Describe the problem

Documentation for SingleSliceSpec is missing. The link from the tutorials goes to the following page, which 404s.

If documentation is available in any other format I'd love to look at it.
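Until the documentation is restored, here is a hedged usage sketch based on the Chicago Taxi tutorial code (feature names are placeholders, not an authoritative reference):

import tensorflow_model_analysis as tfma

# The empty spec represents the overall (unsliced) data.
overall_slice = tfma.slicer.SingleSliceSpec()

# Slice by every value of a single column.
by_hour = tfma.slicer.SingleSliceSpec(columns=['trip_start_hour'])

# Slice on a fixed feature/value pair.
day_three = tfma.slicer.SingleSliceSpec(features=[('trip_start_day', 3)])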

TFMA fails when trying to use a custom metric from a module that ends with `keras.metrics`

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): tensorflow/tensorflow:2.2.0-gpu-jupyter docker image
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.9
  • Jupyter Notebook version: 6.0.2
  • Exact command to reproduce: NA

Describe the problem

TFMA fails when trying to use a custom metric coming from a module that ends with keras.metrics. I have a contrib module whose name is {PROJECT_NAME}.ml_utils.tf.keras.metrics in which I have lots of custom metrics. When I try to use those in TFMA, it fails saying that the metrics can't be found. I don't have this issue if I rename my module to something like {PROJECT_NAME}.ml_utils.tf.keras.custom_metrics. After tracking the issue down, I noticed this snippet to be the issue:

def _custom_objects(
    metrics: Dict[Text, List[tf.keras.metrics.Metric]]) -> Dict[Text, Any]:
  custom_objects = {}
  for metric_list in metrics.values():
    for metric in metric_list:
      if (not metric.__class__.__module__.endswith('keras.metrics') and
          not metric.__class__.__module__.endswith('keras.losses')):
        custom_objects[metric.__class__.__module__] = metric.__class__.__name__
  return custom_objects

The if statement is problematic because my module indeed ends with keras.metrics yet is a custom module. A more correct and robust approach would be to replace it with:

def _custom_objects(
    metrics: Dict[Text, List[tf.keras.metrics.Metric]]) -> Dict[Text, Any]:
  custom_objects = {}
  for metric_list in metrics.values():
    for metric in metric_list:
      if (not metric.__class__.__module__ == tf.keras.metrics.__name__ and
          not metric.__class__.__module__ == tf.keras.losses.__name__):
        custom_objects[metric.__class__.__module__] = metric.__class__.__name__
  return custom_objects

Can't load metrics inside tensorboard

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : NO
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • TensorFlow Model Analysis installed from (source or binary): binary (Pypi)
  • TensorFlow Model Analysis version (use command below): 0.21.6
  • Python version: 3.7.7
  • Jupyter Notebook version: -
  • Exact command to reproduce: -

Describe the problem

I successfully generated the logs for TFMA, but I'm having trouble getting them displayed in TensorBoard. Basically I can see only the plot of the first metric (which appears when the page loads); the others appear empty, and even the first one appears empty when I try to load it again further down (screenshot omitted).
Trying to use TFMA inside a notebook displays metrics properly, but using TF Serving and a local TensorBoard fails to display them.

Source code / logs

I suspect there is a problem loading a script. This is what I see in the Firefox console:

paper-header-panel is deprecated. Please use app-layout instead! tf-tensorboard.html.js:18569:69
paper-toolbar is deprecated. Please use app-layout instead! tf-tensorboard.html.js:18575:204
Content Security Policy: Ignoring 'x-frame-options' because of 'frame-ancestors' directive.
Loading failed for the <script> with source "https://www.gstatic.com/charts/loader.js". plugin_entry.html:1:1
Content Security Policy: The page's settings blocked the loading of a resource at https://www.gstatic.com/charts/loader.js ("script-src"). vulcanized_tfma.js:1046:317
Content Security Policy: The page's settings blocked the loading of a resource at https://fonts.googleapis.com/css?family=Roboto:400,300,300italic,400italic,500,500italic,700,700italic ("style-src"). vulcanized_tfma.js:1280:354
Content Security Policy: The page's settings blocked the loading of a resource at https://fonts.googleapis.com/css?family=Roboto+Mono:400,700 ("style-src"). vulcanized_tfma.js:1280:354
uncaught exception: Object
Source map error: Error: request failed with status 404
Resource URL: http://localhost:6006/data/plugin/fairness_indicators/vulcanized_tfma.js
Source Map URL: web-animations-next-lite.min.js.map

Source map error: Error: request failed with status 404
Resource URL: http://localhost:6006/data/plugin/fairness_indicators/vulcanized_tfma.js
Source Map URL: web-animations-next-lite.min.js.map

I also tried to load the page in Chrome with content security policies disabled, but the source map error persists.

MultiOutput Keras Model Evaluation Issue

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : NO
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Google Colab
  • TensorFlow Model Analysis installed from (source or binary): source
  • TensorFlow Model Analysis version (use command below): 0.27.0
  • Python version: 3.6.9
  • Jupyter Notebook version: Google Colab
  • Exact command to reproduce:

Describe the problem

Hi, I've been following the TFX Chicago Taxi Example (https://www.tensorflow.org/tfx/tutorials/tfx/components_keras#evaluator) to refactor my TensorFlow code into the TFX framework.

However, my use case is a multi-output Keras model, where the model consumes a given input and produces 2 outputs (both multi-class).

If I run the evaluator component with just 1 output (e.g., disabling the other output in my model), it works fine and I can run tfma.run_model_analysis without an issue.

However, after reverting to my multi-output model, running the evaluator component throws an error.

Model - output_0 has 5 classes, and output_1 has 8 classes to predict >>

signature_def['serving_raw']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['CREDIT'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1)
        name: serving_raw_CREDIT:0
    inputs['DEBIT'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1)
        name: serving_raw_DEBIT:0
    inputs['DESCRIPTION'] tensor_info:
        dtype: DT_STRING
        shape: (-1, -1)
        name: serving_raw_DESCRIPTION:0
    inputs['TRADEDATE'] tensor_info:
        dtype: DT_STRING
        shape: (-1, -1)
        name: serving_raw_TRADEDATE:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['output_0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 5)
        name: StatefulPartitionedCall_2:0
    outputs['output_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 8)

Eval_Config >>

eval_config = tfma.EvalConfig(
    model_specs=[     
        tfma.ModelSpec(label_key='my_label_key')
    ],
    metrics_specs=[
        tfma.MetricsSpec(
            metrics=[
                tfma.MetricConfig(class_name = 'SparseCategoricalAccuracy', 
                                  threshold=tfma.MetricThreshold(
                                      value_threshold=tfma.GenericValueThreshold(lower_bound={'value': 0.5}),
                                      change_threshold=tfma.GenericChangeThreshold(
                                          direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                                          absolute={'value': -1e-10}))),
                tfma.MetricConfig(class_name = 'MultiClassConfusionMatrixPlot'),
                tfma.MetricConfig(class_name = "Precision"),
                tfma.MetricConfig(class_name = "Recall")
            ], 
            output_names =['output_0']
        ),
         tfma.MetricsSpec(
            metrics=[
                tfma.MetricConfig(class_name = 'SparseCategoricalAccuracy', 
                                  threshold=tfma.MetricThreshold(
                                      value_threshold=tfma.GenericValueThreshold(lower_bound={'value': 0.5}),
                                      change_threshold=tfma.GenericChangeThreshold(
                                          direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                                          absolute={'value': -1e-10}))),
                tfma.MetricConfig(class_name = 'MultiClassConfusionMatrixPlot'),
                tfma.MetricConfig(class_name = "Precision"),
                tfma.MetricConfig(class_name = "Recall")
            ], 
            output_names =['output_1']
         )
    ],
    slicing_specs=[
        tfma.SlicingSpec(),
    ])

Running tfma.run_model_analysis using the above eval_config,

keras_model_path = os.path.join(trainer.outputs['model'].get()[0].uri,'serving_model_dir') # gets the model from the trainer stage
keras_eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path=keras_model_path,
    eval_config=eval_config)

keras_output_path = os.path.join(os.getcwd(), 'keras2')
tfrecord_file = '/tmp/tfx-interactive-2021-02-09T06_02_48.210135-95bh38cw/Transform/transformed_examples/5/train/transformed_examples-00000-of-00001.gz'
# Run TFMA
keras_eval_result = tfma.run_model_analysis(
    eval_shared_model=keras_eval_shared_model,
    eval_config=eval_config,
    data_location=tfrecord_file,
    output_path=keras_output_path)

I get the error message below >>


ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow_model_analysis/model_util.py in process(self, element)
    667     try:
--> 668       result = self._batch_reducible_process(element)
    669       self._batch_size.update(batch_size)

118 frames
ValueError: could not broadcast input array from shape (5) into shape (1)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
ValueError: could not broadcast input array from shape (5) into shape (1)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     81 
     82     """
---> 83     return array(a, dtype, copy=False, order=order)
     84 
     85 

ValueError: could not broadcast input array from shape (5) into shape (1) [while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/ExtractPredictions/Predict']

I've tried to find code examples of multi-output eval_config but haven't come across one yet.

Following the documentation, I've arrived at what I think the eval_config should be for a multi-output model; however, is it set up correctly given the error message?
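One thing worth checking (a hedged guess, not verified against this model): for multi-output models, ModelSpec supports a per-output label_keys map instead of the single label_key used above, e.g.:

# Hedged sketch; the label column names are hypothetical.
model_spec = tfma.ModelSpec(
    label_keys={'output_0': 'label_for_output_0',
                'output_1': 'label_for_output_1'})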

render_plot won't render, getting 404 on GET /static/tensorflow_model-analysis.js?v=2020103020382

I'm not getting tfma.view.render_plot to render plots with:

tfma.view.render_plot(result)
or
tfma.view.render_time_series
No output appears in the cell other than
Error rendering Jupyter widget: missing widget manager

Note that tfma.view.render_slicing_metrics does work.

I suspect it's related to this 404... I'm getting the following error in jupyter notebook stdout log:

[W 20:42:09.329 NotebookApp] 404 GET /static/tensorflow_model-analysis.js?v=20201030203829 (192.168.112.1) 14.49ms referer=http://localhost:18888/notebooks/analysis.ipynb

It seems relevant that the /static/tensorflow_model-analysis.js file above in the 404 log is not part of the package, as can be seen in the github repo itself: https://github.com/tensorflow/model-analysis/tree/master/tensorflow_model_analysis/static

This is running in a Docker image based on python:3.7-slim-buster (Debian).

jupyter commands:

        /usr/local/airflow/.local/bin/jupyter nbextension enable --py widgetsnbextension --sys-prefix
        /usr/local/airflow/.local/bin/jupyter nbextension install --py --symlink tensorflow_model_analysis --sys-prefix
        /usr/local/airflow/.local/bin/jupyter nbextension enable --py tensorflow_model_analysis --sys-prefix
        /usr/local/airflow/.local/bin/jupyter nbextension install --py --symlink witwidget --sys-prefix
        /usr/local/airflow/.local/bin/jupyter nbextension enable witwidget --py  --sys-prefix

        /usr/local/airflow/.local/bin/jupyter nbextensions_configurator enable --user

        /usr/local/airflow/.local/bin/jupyter notebook --ip=0.0.0.0 --notebook-dir=/usr/local/airflow/dags --port=18888 

versions are:

cat requirements.txt | grep tensor

tensorboard==2.1.1
tensorflow-data-validation==0.21.5
tensorflow-estimator==1.15.1
tensorflow-metadata==0.21.1
tensorflow-model-analysis==0.21.5
#tensorflow-model-analysis==0.22.0
tensorflow-serving-api==2.1.0
tensorflow-transform==0.21.2
tensorflow-gpu==2.1.0
tensorflow-serving-api==2.1.0
tensorflow-datasets==3.1.0
tensorflow-hub==0.9.0

I tried tfma version 0.22.0 and had the same result, all else being equal.

Here are my nbextensions:

airflow@8158874d751e:~$ jupyter nbextension list --debug
Searching ['/usr/local/airflow', '/usr/local/airflow/.jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
Looking for jupyter_config in /etc/jupyter
Looking for jupyter_config in /usr/local/etc/jupyter
Looking for jupyter_config in /usr/local/airflow/.jupyter
Looking for jupyter_config in /usr/local/airflow
Looking for jupyter nbextension list_config in /etc/jupyter
Looking for jupyter nbextension list_config in /usr/local/etc/jupyter
Looking for jupyter nbextension list_config in /usr/local/airflow/.jupyter
Looking for jupyter nbextension list_config in /usr/local/airflow
Known nbextensions:
Paths used for configuration of common: 
    	/usr/local/airflow/.jupyter/nbconfig/common.json
Paths used for configuration of notebook: 
    	/usr/local/airflow/.jupyter/nbconfig/notebook.json
Paths used for configuration of tree: 
    	/usr/local/airflow/.jupyter/nbconfig/tree.json
Paths used for configuration of edit: 
    	/usr/local/airflow/.jupyter/nbconfig/edit.json
Paths used for configuration of terminal: 
    	/usr/local/airflow/.jupyter/nbconfig/terminal.json
Paths used for configuration of common: 
    	/usr/local/etc/jupyter/nbconfig/common.json
Paths used for configuration of notebook: 
    	/usr/local/etc/jupyter/nbconfig/notebook.json
  config dir: /usr/local/etc/jupyter/nbconfig
    notebook section
      jupyter-js-widgets/extension  enabled 
      - Validating: OK
      tensorflow_model_analysis/extension  enabled 
      - Validating: OK
      wit-widget/extension  enabled 
      - Validating: OK
Paths used for configuration of tree: 
    	/usr/local/etc/jupyter/nbconfig/tree.json
Paths used for configuration of edit: 
    	/usr/local/etc/jupyter/nbconfig/edit.json
Paths used for configuration of terminal: 
    	/usr/local/etc/jupyter/nbconfig/terminal.json
Paths used for configuration of common: 
    	/etc/jupyter/nbconfig/common.json
Paths used for configuration of notebook: 
    	/etc/jupyter/nbconfig/notebook.json
Paths used for configuration of tree: 
    	/etc/jupyter/nbconfig/tree.json
Paths used for configuration of edit: 
    	/etc/jupyter/nbconfig/edit.json
Paths used for configuration of terminal: 
    	/etc/jupyter/nbconfig/terminal.json

I've found that if I wait long enough, the Jupyter cell will return the following response when running tfma.view.render_plot:

PlotViewer(config={'sliceName': 'Overall', 'metricKeys': {'calibrationPlot': {'metricName': 'calibrationHistog…

I'm happy to provide more info if needed. Many Thanks.
Jason

Unable to view the visualization / dashboard on a machine without internet

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux
  • TensorFlow Model Analysis installed from (source or binary): Using pip install
  • TensorFlow Model Analysis version (use command below): 0.25.0
  • Python version: 3.7
  • Jupyter Notebook version: 1.0.0
  • Exact command to reproduce:
tfma.view.render_slicing_metrics(eval_result, slicing_spec=slices[0])

Describe the problem

I'm able to view the visualization on a machine with internet access. But when I run the command on our internal server it doesn't show the plot. The only difference I see between the machines is the availability of public network access. Can I make tfma work there? Thank you.

Attached the errors from browser's console log below

Source code / logs

manager-base.js:273 Could not instantiate widget
(anonymous) @ manager-base.js:273
(anonymous) @ manager-base.js:44
(anonymous) @ manager-base.js:25
a @ manager-base.js:17
Promise.then (async)
u @ manager-base.js:18
(anonymous) @ manager-base.js:19
A @ manager-base.js:15
t._make_model @ manager-base.js:257
(anonymous) @ manager-base.js:246
(anonymous) @ manager-base.js:44
(anonymous) @ manager-base.js:25
(anonymous) @ manager-base.js:19
A @ manager-base.js:15
t.new_model @ manager-base.js:232
t.handle_comm_open @ manager-base.js:144
L @ underscore.js:762
(anonymous) @ underscore.js:775
(anonymous) @ underscore.js:122
(anonymous) @ comm.js:89
Promise.then (async)
CommManager.comm_open @ comm.js:85
i @ jquery.min.js:2
Kernel._handle_iopub_message @ kernel.js:1223
Kernel._finish_ws_message @ kernel.js:1015
(anonymous) @ kernel.js:1006
Promise.then (async)
Kernel._handle_ws_message @ kernel.js:1006
i @ jquery.min.js:2
utils.js:119 Error: Could not create a model.
    at utils.js:119
(anonymous) @ utils.js:119
Promise.catch (async)
t.handle_comm_open @ manager-base.js:149
L @ underscore.js:762
(anonymous) @ underscore.js:775
(anonymous) @ underscore.js:122
(anonymous) @ comm.js:89
Promise.then (async)
CommManager.comm_open @ comm.js:85
i @ jquery.min.js:2
Kernel._handle_iopub_message @ kernel.js:1223
Kernel._finish_ws_message @ kernel.js:1015
(anonymous) @ kernel.js:1006
Promise.then (async)
Kernel._handle_ws_message @ kernel.js:1006
i @ jquery.min.js:2
kernel.js:1007 Couldn't process kernel message TypeError: Cannot read property 'SlicingMetricsModel' of undefined
    at manager.js:153
(anonymous) @ kernel.js:1007
Promise.catch (async)
Kernel._handle_ws_message @ kernel.js:1007
i @ jquery.min.js:2
kernel.js:1007 Couldn't process kernel message TypeError: Cannot read property 'SlicingMetricsModel' of undefined
    at manager.js:153
(anonymous) @ kernel.js:1007
Promise.catch (async)
Kernel._handle_ws_message @ kernel.js:1007
i @ jquery.min.js:2
kernel.js:1007 Couldn't process kernel message TypeError: Cannot read property 'SlicingMetricsModel' of undefined
    at manager.js:153
(anonymous) @ kernel.js:1007
Promise.catch (async)
Kernel._handle_ws_message @ kernel.js:1007
i @ jquery.min.js:2
manager.js:153 Uncaught (in promise) TypeError: Cannot read property 'SlicingMetricsModel' of undefined
    at manager.js:153
(anonymous) @ manager.js:153
Promise.then (async)
t.register_model @ manager-base.js:208
(anonymous) @ manager-base.js:248
(anonymous) @ manager-base.js:44
(anonymous) @ manager-base.js:25
(anonymous) @ manager-base.js:19
A @ manager-base.js:15
t.new_model @ manager-base.js:232
t.handle_comm_open @ manager-base.js:144
L @ underscore.js:762
(anonymous) @ underscore.js:775
(anonymous) @ underscore.js:122
(anonymous) @ comm.js:89
Promise.then (async)
CommManager.comm_open @ comm.js:85
i @ jquery.min.js:2
Kernel._handle_iopub_message @ kernel.js:1223
Kernel._finish_ws_message @ kernel.js:1015
(anonymous) @ kernel.js:1006
Promise.then (async)
Kernel._handle_ws_message @ kernel.js:1006
i @ jquery.min.js:2
manager.js:153 Uncaught (in promise) TypeError: Cannot read property 'SlicingMetricsModel' of undefined
    at manager.js:153
(anonymous) @ manager.js:153
Promise.then (async)
(anonymous) @ extension.js:121
n.OutputArea.register_mime_type.safe @ extension.js:145
OutputArea.append_mime_type @ outputarea.js:696
OutputArea.append_display_data @ outputarea.js:659
OutputArea.append_output @ outputarea.js:346
OutputArea.handle_output @ outputarea.js:257
output @ codecell.js:395
Kernel._handle_output_message @ kernel.js:1196
i @ jquery.min.js:2
Kernel._handle_iopub_message @ kernel.js:1223
Kernel._finish_ws_message @ kernel.js:1015
(anonymous) @ kernel.js:1006
Promise.then (async)
Kernel._handle_ws_message @ kernel.js:1006
i @ jquery.min.js:2

Support for Python 3

What's the timeline for supporting Python 3?

TensorFlow only runs on Python 3.5 and 3.6 on Windows. So those of us who work on Windows machines have a harder time trying out TFMA.

Feature Request: Support viewing the slicing metrics widget outside of a Jupyter notebook

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): N/A
  • TensorFlow Model Analysis installed from (source or binary): PyPI
  • TensorFlow Model Analysis version (use command below): 0.6.0
  • Python version: 2.7
  • Jupyter Notebook version: N/A
  • Exact command to reproduce:
import tensorflow_model_analysis as tfma
from ipywidgets.embed import embed_minimal_html

analysis_path = 'gs://<TFMA_OUTPUT_DIRECTORY>'
result = tfma.load_eval_result(output_path=analysis_path)
slicing_metrics_view = tfma.view.render_slicing_metrics(result)
embed_minimal_html('tfma_export.html', views=[slicing_metrics_view], title='Slicing Metrics')

Describe the problem

Jupyter notebook widgets support embedding the widget in a static HTML file that can be loaded outside of a Jupyter notebook (see here for details).

This almost works for the TFMA widgets, except that the tfma_widget_js.js file assumes that it is running in the context of a notebook page, and tries to load the vulcanized_template.html file from the notebook server (which won't exist in the case of a static HTML file).

Thus, I am filing a feature request to put that file on a CDN somewhere, and to teach the tfma_widget_js.js code to load it from the CDN if necessary.

Source code / logs

Here is the line in the JS file where it tries to load the vulcanized_template.html file from the notebook server.

The simplest way I could suggest to enhance this would be to check if the data-base-url document attribute is null (indicating that the code is running outside of a notebook), and in that case have the __webpack_require__.p location resolve to a CDN URL.

Model Analysis is very slow.


System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Debian 10
  • TensorFlow Model Analysis version (use command below): 0.26.0
  • Python version: 3.7.9
  • Jupyter Notebook version: 6.1.6
  • Exact command to reproduce: tfma.run_model_analysis or context.run(evaluator) (TFX standard component)

Running in an AI Platform Notebook with 4 CPU cores and 15 GB of RAM using the interactive context.

Describe the problem

TensorFlow Model Analysis takes a very long time to run. For context, I am trying to run evaluation on a very small dataset (only 8,000 examples) generated by an example_gen component in a TFX pipeline. If I manually create a tfrecord dataset and run model.evaluate (it's a Keras model) on it, it only takes a minute or two, including the time it takes to load the model. Using TFMA (either directly or with the evaluator component) takes about 9 minutes, and if I provide a slicing key (with only 6 unique values) it takes 16 minutes.

I am not using the Dataflow runner, and I understand that running on Dataflow would allow me to scale out my evaluation, but that seems like complete overkill for just a few thousand rows of data. Is it expected that running TFMA should create this much overhead? Is there anything I might be doing that's making this slower than it could be, or anything I could try to make it go faster? I'm currently only calculating a single metric (in addition to the two saved with my Keras model).

I'm really liking the capabilities of TFMA, but it's currently slowing things down quite a bit.

Source code / logs

As you can see, my configuration is very simple:

accuracy_threshold = tfma.MetricThreshold(
    value_threshold=tfma.GenericValueThreshold(
        lower_bound={'value': 0.0001}
    )
)

metrics_spec = tfma.MetricsSpec(
    metrics = [
        tfma.MetricConfig(class_name='BinaryCrossentropy', threshold=accuracy_threshold)
    ]
)

eval_config = tfma.EvalConfig(
    model_specs=[
        tfma.ModelSpec(label_key='series_ep_tags_xf', signature_name="serving_default", preprocessing_function_names=['tft_layer'])
    ],
    metrics_specs=[metrics_spec],
    slicing_specs=[tfma.SlicingSpec()]
)

evaluator = Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'], 
    eval_config=eval_config
)
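Regarding making this faster: a hedged sketch (assuming TFMA's ExtractEvaluateAndWriteResults PTransform, which appears in error traces elsewhere in this document, and Beam's DirectRunner options) of running the evaluation in an explicit Beam pipeline so that multi-process DirectRunner options can be passed. Whether this actually helps is an assumption worth benchmarking, not a guaranteed speedup.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
import tensorflow_model_analysis as tfma

# DirectRunner options for local multi-process execution.
options = PipelineOptions([
    '--direct_running_mode=multi_processing',
    '--direct_num_workers=4',
])

eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path='/path/to/serving_model_dir',  # placeholder
    eval_config=eval_config)                             # eval_config from above

with beam.Pipeline(options=options) as p:
  _ = (p
       | 'ReadData' >> beam.io.ReadFromTFRecord('/path/to/examples.tfrecord')  # placeholder
       | 'ExtractEvaluateAndWriteResults' >> tfma.ExtractEvaluateAndWriteResults(
           eval_shared_model=eval_shared_model,
           eval_config=eval_config,
           output_path='/path/to/output'))               # placeholder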

Doesn't work on Firefox

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 10.13.6
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.9.0
  • Python version: 2.7.14
  • Jupyter Notebook version: 1.0.0
  • Exact command to reproduce: Follow Chicago Taxi Example (local example) tutorial.

Describe the problem

I ran the Chicago Taxi Example notebook in Firefox and the TF Model Analysis interactive widget didn't show up. At first I thought there was a problem with how I installed and enabled the tensorflow_model_analysis Jupyter nbextension, but it turned out that when I opened the notebook in Chrome, it worked perfectly.

Source code / logs

Firefox' console showed that tfma_widget_js is successfully loaded:

Use of Mutation Events is deprecated. Use MutationObserver instead. jquery.min.js:2
actions jupyter-notebook:find-and-replace does not exist, still binding it in case it will be defined later... menubar.js:277
accessing "actions" on the global IPython/Jupyter is not recommended. Pass it to your objects contructors at creation time main.js:208
Loaded moment locale en bidi.js:19
load_extensions 
Arguments { 0: "jupyter-js-widgets/extension", 1: "tfma_widget_js/extension", … }
utils.js:60
Loading extension: tfma_widget_js/extension utils.js:37
Session: kernel_created (fedc5d8c-5b4b-4b18-9029-685745010dd4) session.js:54
Starting WebSockets: ws://localhost:8888/api/kernels/fd4631da-a6de-4077-bc96-b52c0fa6a0e9 kernel.js:459
Loading extension: jupyter-js-widgets/extension utils.js:37
Kernel: kernel_connected (fd4631da-a6de-4077-bc96-b52c0fa6a0e9) kernel.js:103
Kernel: kernel_ready (fd4631da-a6de-4077-bc96-b52c0fa6a0e9) kernel.js:103 

Comparison to Chrome's console:

load_extensions Arguments(2) ["jupyter-js-widgets/extension", "tfma_widget_js/extension", callee: (...), Symbol(Symbol.iterator): ƒ]
bidi.js:19 Loaded moment locale en
session.js:54 Session: kernel_created (fedc5d8c-5b4b-4b18-9029-685745010dd4)
kernel.js:459 Starting WebSockets: ws://localhost:8888/api/kernels/fd4631da-a6de-4077-bc96-b52c0fa6a0e9
utils.js:37 Loading extension: tfma_widget_js/extension
kernel.js:103 Kernel: kernel_connected (fd4631da-a6de-4077-bc96-b52c0fa6a0e9)
kernel.js:103 Kernel: kernel_ready (fd4631da-a6de-4077-bc96-b52c0fa6a0e9)
utils.js:37 Loading extension: jupyter-js-widgets/extension
[Deprecation] Styling master document from stylesheets defined in HTML Imports is deprecated. Please refer to https://goo.gl/EGXzpw for possible migration paths.

TimeSeriesView and PlotViewer broken due to incorrect model_module in HTML export

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian Buster
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.22.1
  • Python version: 3.7.3
  • Jupyter Notebook version: 6.0.3

Describe the problem

Module loading fails via unpkg.com due to incorrect naming.

_model_module = traitlets.Unicode('tensorflow_model-analysis').tag(sync=True)
and
_model_module = traitlets.Unicode('tensorflow_model-analysis').tag(sync=True)
should be tensorflow_model_analysis instead of tensorflow_model-analysis

Ready for general use?

Hi,

I'd love to try this to test the accuracy of trained models.

Would you say this is alpha or beta quality at this point? How much bandwidth is being dedicated to it by the TensorFlow team? In other words, is this a part of the framework that we can already use, or is it in the early stages of development?

Thanks,

Unclear EvalConfig nomenclature for model outputs

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Catalina
  • TensorFlow Model Analysis installed from (source or binary): pypi
  • TensorFlow Model Analysis version (use command below): 0.22.1
  • Python version: 3.7.5
  • Jupyter Notebook version: 1.0.0
  • TensorFlow version: 2.3.0

Describe the problem

I have trained a Keras model with a custom training loop, i.e. using gradient tape instead of the .fit method. I have specified a custom serving signature as follows:

@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name="examples")])
def _serving_function(examples):
  mu = model(x)
  sigma = 100.0
  return {"mu": mu, "sigma": sigma}

This is because we want the model to output distribution parameters rather than a point estimate. The serving signature as seen by saved_model_cli is:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['examples'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: serving_default_examples:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['mu'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall_1:0
    outputs['sigma'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall_1:1
  Method name is: tensorflow/serving/predict

I would like to analyse this model using TFMA, and I have configured it using the following EvalConfig:

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="my_label_key")],
    metrics_specs=[
        tfma.MetricsSpec(
            metrics=[tfma.MetricConfig(
                class_name="MeanSquaredError",
                config='"dtype": "float32", "name": "mse"'
            )]
        )
    ],
    slicing_specs=[tfma.SlicingSpec()]
)
eval_shared_model = tfma.default_eval_shared_model(saved_model_dir, eval_config=eval_config)

However, when I invoke the run_model_analysis function, I get the following error:

ValueError: unable to prepare labels and predictions because the labels and/or predictions are dicts with unrecognized keys. If a multi-output keras model (or estimator) was used check that an output_name was provided. If an estimator was used check that common prediction keys were provided (e.g. logistic, probabilities, etc): labels=[1416.], predictions={'mu': array([91.325935], dtype=float32), 'sigma': array([100.], dtype=float32)}, prediction_key= [while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/EvaluateMetricsAndPlots/ComputeMetricsAndPlots()/ComputePerSlice/ComputeUnsampledMetrics/CombinePerSliceKey/WindowIntoDiscarding']

This leads me to believe that I might have misconfigured the labels and/or predictions key. However, the protobuf is a little vague (to me) as to what those terms refer to.

What is the difference between labels, predictions, and output? Is my current model a single-output model with different prediction keys, or a multi-output model? How should I be configuring the EvalConfig?
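One configuration detail that may be relevant (a hedged note, not an authoritative answer): when a serving signature returns a dict of outputs, ModelSpec's prediction_key can tell TFMA which entry to treat as the prediction, e.g.:

import tensorflow_model_analysis as tfma

# Hedged sketch: treat the "mu" output as the prediction for metric computation.
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="my_label_key", prediction_key="mu")],
    metrics_specs=[tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(class_name="MeanSquaredError")])],
    slicing_specs=[tfma.SlicingSpec()])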

Multilabel Metrics TFX

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS Linux 7 (Core)
  • TensorFlow Model Analysis installed from (source or binary): Binary
  • TensorFlow Model Analysis version (use command below):0.24.3
  • Python version:3.6.9
  • Jupyter Notebook version:6.0.1
  • Exact command to reproduce:

Describe the problem

I am following the TFX tutorial https://www.tensorflow.org/tfx/tutorials/tfx/components_keras, but with multilabel data (the number of categories is 5). Here is the output from example_gen:

{
 'Text': array([b'Football fans looking forward to seeing the renewal of the rivalry between Cristiano Ronaldo and Lionel Messi were made to wait a while longer after the Portuguese forward was forced to miss Juventus' Champions League tie against Barcelona on Wednesday.'],dtype=object),
 'Headline': array([b"Lionel Messi scores as Cristiano Ronaldo misses Barcelona's victory over Juventus."], dtype=object),
 'categories': array([b'Sports'], dtype=object), 
}

{
 'Text': array([b'COVID-19 has changed fan behavior and accelerated three to five years of technology adoption into six months'],dtype=object),
 'Headline': array([b"How Technology Is Improving Fan Transactions at Sports Venues"], dtype=object),
 'categories': array([b'Sports', b'Science and Technology'], dtype=object), 
}

Output from tf transform:

{
 'Text': array([b'Football fans looking forward to seeing the renewal of the rivalry between Cristiano Ronaldo and Lionel Messi were made to wait a while longer after the Portuguese forward was forced to miss Juventus' Champions League tie against Barcelona on Wednesday.'],dtype=object),
 'Headline': array([b"Lionel Messi scores as Cristiano Ronaldo misses Barcelona's victory over Juventus."], dtype=object),
 'categories': array([1., 0., 0., 0., 0.], dtype=object), 
}

{
 'Text_xf': array([b'COVID-19 has changed fan behavior and accelerated three to five years of technology adoption into six months'],dtype=object),
 'Headline_xf': array([b"How Technology Is Improving Fan Transactions at Sports Venues"], dtype=object),
 'categories_xf': array([1., 1., 0., 0., 0.], dtype=object), 
}

I have trained the model using Trainer, and now I want to use TFMA:

metrics = [
    tf.keras.metrics.Recall(name='recall', top_k=3),
]
metrics_specs = tfma.metrics.specs_from_metrics(metrics)

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="categories")],
    slicing_specs=[tfma.SlicingSpec()],
    metrics_specs=metrics_specs,
)

evaluator = Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    baseline_model=model_resolver.outputs['model'],
    eval_config=eval_config
)
context.run(evaluator)

Logs

WARNING:tensorflow:5 out of the last 5 calls to <function recreate_function.<locals>.restored_function_body at 0x7f5a27adc9d8> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
WARNING:tensorflow:6 out of the last 6 calls to <function recreate_function.<locals>.restored_function_body at 0x7f5a27abc6a8> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
WARNING:tensorflow:7 out of the last 7 calls to <function recreate_function.<locals>.restored_function_body at 0x7f5a27abc0d0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
WARNING:tensorflow:8 out of the last 8 calls to <function recreate_function.<locals>.restored_function_body at 0x7f59386e9c80> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
WARNING:tensorflow:9 out of the last 9 calls to <function recreate_function.<locals>.restored_function_body at 0x7f59386d39d8> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.PGBKCVOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.PGBKCVOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow_model_analysis/evaluators/metrics_and_plots_evaluator_v2.py in add_input(self, accumulator, element)
    340     for i, (c, a) in enumerate(zip(self._combiners, accumulator)):
--> 341       result = c.add_input(a, get_combiner_input(elements[0], i))
    342       for e in elements[1:]:

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py in add_input(self, accumulator, element)
    576         if self._is_top_k() and label.shape != prediction.shape:
--> 577           label = metric_util.one_hot(label, prediction)
    578         accumulator.add_input(i, label, prediction, example_weight)

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow_model_analysis/metrics/metric_util.py in one_hot(tensor, target)
    703   # indexing the -1 and then removing it after.
--> 704   tensor = np.delete(np.eye(target.shape[-1] + 1)[tensor], -1, axis=-1)
    705   return tensor.reshape(target.shape)

IndexError: arrays used as indices must be of integer (or boolean) type

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
<ipython-input-31-952eda92fce9> in <module>
      5     eval_config=eval_config
      6 )
----> 7 context.run(evaluator)

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tfx/orchestration/experimental/interactive/interactive_context.py in run_if_ipython(*args, **kwargs)
     65       # __IPYTHON__ variable is set by IPython, see
     66       # https://ipython.org/ipython-doc/rel-0.10.2/html/interactive/reference.html#embedding-ipython.
---> 67       return fn(*args, **kwargs)
     68     else:
     69       absl.logging.warning(

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tfx/orchestration/experimental/interactive/interactive_context.py in run(self, component, enable_cache, beam_pipeline_args)
    180         telemetry_utils.LABEL_TFX_RUNNER: runner_label,
    181     }):
--> 182       execution_id = launcher.launch().execution_id
    183 
    184     return execution_result.ExecutionResult(

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tfx/orchestration/launcher/base_component_launcher.py in launch(self)
    203                          execution_decision.input_dict,
    204                          execution_decision.output_dict,
--> 205                          execution_decision.exec_properties)
    206 
    207     absl.logging.info('Running publisher for %s',

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tfx/orchestration/launcher/in_process_component_launcher.py in _run_executor(self, execution_id, input_dict, output_dict, exec_properties)
     65         executor_context)  # type: ignore
     66 
---> 67     executor.Do(input_dict, output_dict, exec_properties)

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tfx/components/evaluator/executor.py in Do(self, input_dict, output_dict, exec_properties)
    258            output_path=output_uri,
    259            slice_spec=slice_spec,
--> 260            tensor_adapter_config=tensor_adapter_config))
    261     logging.info('Evaluation complete. Results written to %s.', output_uri)
    262 

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/pipeline.py in __exit__(self, exc_type, exc_val, exc_tb)
    553     try:
    554       if not exc_type:
--> 555         self.result = self.run()
    556         self.result.wait_until_finish()
    557     finally:

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/pipeline.py in run(self, test_runner_api)
    532         finally:
    533           shutil.rmtree(tmpdir)
--> 534       return self.runner.run_pipeline(self, self._options)
    535     finally:
    536       shutil.rmtree(self.local_tempdir, ignore_errors=True)

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_pipeline(self, pipeline, options)
    174 
    175     self._latest_run_result = self.run_via_runner_api(
--> 176         pipeline.to_runner_api(default_environment=self._default_environment))
    177     return self._latest_run_result
    178 

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_via_runner_api(self, pipeline_proto)
    184     # TODO(pabloem, BEAM-7514): Create a watermark manager (that has access to
    185     #   the teststream (if any), and all the stages).
--> 186     return self.run_stages(stage_context, stages)
    187 
    188   @contextlib.contextmanager

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_stages(self, stage_context, stages)
    342           stage_results = self._run_stage(
    343               runner_execution_context,
--> 344               bundle_context_manager,
    345           )
    346           monitoring_infos_by_stage[stage.name] = (

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in _run_stage(self, runner_execution_context, bundle_context_manager)
    521               input_timers,
    522               expected_timer_output,
--> 523               bundle_manager)
    524 
    525       final_result = merge_results(last_result)

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in _run_bundle(self, runner_execution_context, bundle_context_manager, data_input, data_output, input_timers, expected_timer_output, bundle_manager)
    559 
    560     result, splits = bundle_manager.process_bundle(
--> 561         data_input, data_output, input_timers, expected_timer_output)
    562     # Now we collect all the deferred inputs remaining from bundle execution.
    563     # Deferred inputs can be:

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in process_bundle(self, inputs, expected_outputs, fired_timers, expected_output_timers, dry_run)
    943     with thread_pool_executor.shared_unbounded_instance() as executor:
    944       for result, split_result in executor.map(execute, zip(part_inputs,  # pylint: disable=zip-builtin-not-iterating
--> 945                                                             timer_inputs)):
    946         split_result_list += split_result
    947         if merged_result is None:

~/anaconda3/envs/tf2/lib/python3.6/concurrent/futures/_base.py in result_iterator()
    584                     # Careful not to keep a reference to the popped future
    585                     if timeout is None:
--> 586                         yield fs.pop().result()
    587                     else:
    588                         yield fs.pop().result(end_time - time.monotonic())

~/anaconda3/envs/tf2/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
    430                 raise CancelledError()
    431             elif self._state == FINISHED:
--> 432                 return self.__get_result()
    433             else:
    434                 raise TimeoutError()

~/anaconda3/envs/tf2/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/utils/thread_pool_executor.py in run(self)
     42       # If the future wasn't cancelled, then attempt to execute it.
     43       try:
---> 44         self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs))
     45       except BaseException as exc:
     46         # Even though Python 2 futures library has #set_exection(),

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in execute(part_map_input_timers)
    939           input_timers,
    940           expected_output_timers,
--> 941           dry_run)
    942 
    943     with thread_pool_executor.shared_unbounded_instance() as executor:

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in process_bundle(self, inputs, expected_outputs, fired_timers, expected_output_timers, dry_run)
    839             process_bundle_descriptor.id,
    840             cache_tokens=[next(self._cache_token_generator)]))
--> 841     result_future = self._worker_handler.control_conn.push(process_bundle_req)
    842 
    843     split_results = []  # type: List[beam_fn_api_pb2.ProcessBundleSplitResponse]

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/portability/fn_api_runner/worker_handlers.py in push(self, request)
    351       self._uid_counter += 1
    352       request.instruction_id = 'control_%s' % self._uid_counter
--> 353     response = self.worker.do_instruction(request)
    354     return ControlFuture(request.instruction_id, response)
    355 

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/sdk_worker.py in do_instruction(self, request)
    481       # E.g. if register is set, this will call self.register(request.register))
    482       return getattr(self, request_type)(
--> 483           getattr(request, request_type), request.instruction_id)
    484     else:
    485       raise NotImplementedError

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/sdk_worker.py in process_bundle(self, request, instruction_id)
    516         with self.maybe_profile(instruction_id):
    517           delayed_applications, requests_finalization = (
--> 518               bundle_processor.process_bundle(instruction_id))
    519           monitoring_infos = bundle_processor.monitoring_infos()
    520           monitoring_infos.extend(self.state_cache_metrics_fn())

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/bundle_processor.py in process_bundle(self, instruction_id)
    981           elif isinstance(element, beam_fn_api_pb2.Elements.Data):
    982             input_op_by_transform_id[element.transform_id].process_encoded(
--> 983                 element.data)
    984 
    985       # Finish all operations.

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/bundle_processor.py in process_encoded(self, encoded_windowed_values)
    217       decoded_value = self.windowed_coder_impl.decode_from_stream(
    218           input_stream, True)
--> 219       self.output(decoded_value)
    220 
    221   def monitoring_infos(self, transform_id, tag_to_pcollection_id):

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SdfProcessSizedElements.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SdfProcessSizedElements.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process_with_sized_restriction()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.FlattenOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.FlattenOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.FlattenOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.FlattenOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.Operation.output()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.ConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.SimpleInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.DoOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner._reraise_augmented()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/future/utils/__init__.py in raise_with_traceback(exc, traceback)
    444         if traceback == Ellipsis:
    445             _, _, traceback = sys.exc_info()
--> 446         raise exc.with_traceback(traceback)
    447 
    448 else:

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.DoFnRunner.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker.invoke_process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/common.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.common._OutputProcessor.process_outputs()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.SingletonConsumerSet.receive()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.PGBKCVOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/apache_beam/runners/worker/operations.cpython-36m-x86_64-linux-gnu.so in apache_beam.runners.worker.operations.PGBKCVOperation.process()

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow_model_analysis/evaluators/metrics_and_plots_evaluator_v2.py in add_input(self, accumulator, element)
    339     results = []
    340     for i, (c, a) in enumerate(zip(self._combiners, accumulator)):
--> 341       result = c.add_input(a, get_combiner_input(elements[0], i))
    342       for e in elements[1:]:
    343         result = c.add_input(result, get_combiner_input(e, i))

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py in add_input(self, accumulator, element)
    575         # Keras requires non-sparse keys for top_k calcuations.
    576         if self._is_top_k() and label.shape != prediction.shape:
--> 577           label = metric_util.one_hot(label, prediction)
    578         accumulator.add_input(i, label, prediction, example_weight)
    579     if (accumulator.len_inputs() >= self._batch_size or

~/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow_model_analysis/metrics/metric_util.py in one_hot(tensor, target)
    702   # the row. The following handles -1 values by adding an additional column for
    703   # indexing the -1 and then removing it after.
--> 704   tensor = np.delete(np.eye(target.shape[-1] + 1)[tensor], -1, axis=-1)
    705   return tensor.reshape(target.shape)
    706 

IndexError: arrays used as indices must be of integer (or boolean) type [while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/EvaluateMetricsAndPlots/ComputeMetricsAndPlots()/ComputePerSlice/ComputeUnsampledMetrics/CombinePerSliceKey/WindowIntoDiscarding']
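
For reference, a minimal NumPy sketch of the failure mode (assuming the transformed multi-hot float labels reach metric_util.one_hot unchanged, as the traceback suggests):

import numpy as np

predictions = np.array([0.1, 0.7, 0.1, 0.05, 0.05])

# Multi-hot float labels, as emitted by Transform above: indexing np.eye with
# a float array is exactly what raises the IndexError in metric_util.one_hot.
float_multi_hot = np.array([1., 0., 1., 0., 0.])
try:
    np.eye(predictions.shape[-1] + 1)[float_multi_hot]
except IndexError as e:
    print(e)  # arrays used as indices must be of integer (or boolean) type

# Sparse integer class indices are what the one-hot indexing expects.
int_indices = np.array([0, 2])
one_hot = np.delete(np.eye(predictions.shape[-1] + 1)[int_indices], -1, axis=-1)
print(one_hot.sum(axis=0))  # [1. 0. 1. 0. 0.]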

Wrong batch size passed to predictor

System information

  • TensorFlow Model Analysis version: 0.13.*

Describe the problem

When the predictor expects a fixed batch size, say 256, we need to specify that batch size (by setting desired_batch_size) when calling run_model_analysis (as in PR #48) or ExtractEvaluateAndWriteResults. The problem is that if my input has 300 examples, the first batch of 256 runs fine, but the second batch feeds only 44 examples to the predictor, which crashes it (since the batch size does not match).

One easy fix is to skip batches whose size does not match by adding a filter here:
https://github.com/tensorflow/model-analysis/blob/master/tensorflow_model_analysis/extractors/predict_extractor.py#L81
if self._desired_batch_size and self._desired_batch_size != batch_size:
  print("Skipping batch of {} elements since the batch size does not match".format(batch_size))
  return

Source code / logs

Documentation needed?

Hi developers,

I don't know if this is the right place or channel to communicate this, but the TFMA website could use a bit more documentation.

Even though it is possible to dive into the source code to find out the parameters and usage, the library's usability could be greatly improved by a higher-level explanation of the API available in TFMA.

For example, the tfma.GenericValueThreshold class conditions the Pusher to only push models whose metric values fall within the two bounds configured in this class; a sketch of how this is typically wired up follows.
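
For illustration, a minimal sketch of attaching such a threshold to a metric in the metrics specs (the metric name and bound values here are assumptions, following the pattern used in the TFX tutorials):

import tensorflow_model_analysis as tfma

# Illustrative sketch: gate model blessing on BinaryAccuracy staying within bounds.
# The metric name and bound values are placeholders.
metrics_specs = [
    tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(
            class_name='BinaryAccuracy',
            threshold=tfma.MetricThreshold(
                value_threshold=tfma.GenericValueThreshold(
                    lower_bound={'value': 0.5},
                    upper_bound={'value': 0.99}))),
    ])
]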

I think this effort should be given some priority, as documentation is the gateway for people to start using the library.

Cannot read property 'SlicingMetricsModel' of undefined

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow Model Analysis installed from (source or binary): source
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.9
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce: tfma.view.render_slicing_metrics(result)

Describe the problem

I'm running TFMA inside a Docker container to get around any issues with the TFMA extension not being available in my conda environment. Everything looks good, but I just can't display the interactive charts. When I call tfma.view.render_slicing_metrics(result), I get the following error in the browser console (Chrome version 83.0.4103.106).

Source code / logs

[Screenshot of the browser console error]

ExampleCount metric always returns 1 for multi-output models

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac OS X
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.21.6
  • Python version: Python 3.6.9
  • Jupyter Notebook version: doesn't apply
  • Exact command to reproduce:

Describe the problem

tfma.metrics.ExampleCount always returns 1 for multi-output models. I believe this happens because when the metric key is created, the output name is not considered (see https://github.com/tensorflow/model-analysis/blob/master/tensorflow_model_analysis/metrics/example_count.py#L49). This is not a problem in the similar tfma.metrics.WeightedExampleCount (https://github.com/tensorflow/model-analysis/blob/master/tensorflow_model_analysis/metrics/weighted_example_count.py#L60).
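
For illustration, a minimal sketch of the key asymmetry described above, using TFMA's internal metric_types module (the 'softmax' output name is just the one from my config below):

from tensorflow_model_analysis.metrics import metric_types

# Construct a key without an output_name (what example_count.py does, per the
# report above) and one with an output_name (what weighted_example_count.py does).
plain_key = metric_types.MetricKey(name='example_count')
weighted_key = metric_types.MetricKey(name='weighted_example_count',
                                      output_name='softmax')
print(plain_key)
print(weighted_key)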

Source code / logs

eval_config = tfma.EvalConfig(
    model_specs=[
        tfma.ModelSpec(
            prediction_key='softmax',
            signature_name="eval",
            label_key="candidate_app_id_integerized"
        )
    ],
    metrics_specs=tfma.metrics.specs_from_metrics(
        metrics=[
            tfma.metrics.ExampleCount(), 
            tfma.metrics.WeightedExampleCount(),
            tf.keras.metrics.SparseCategoricalCrossentropy()
        ],
        output_names=['softmax'],
        include_example_count=False,
        include_weighted_example_count=False
    ),
    slicing_specs=[
        tfma.SlicingSpec(),
        tfma.SlicingSpec(
            feature_keys=["language"]
        ),
        tfma.SlicingSpec(
            feature_keys=["country_code"]
        ),
        tfma.SlicingSpec(
            feature_keys=["is_ios"]
        )
    ]
)

eval_result = tfma.run_model_analysis(
    eval_shared_model=eval_shared_model,
    eval_config=eval_config,
    data_location=examples_dir,
    output_path=output_path
)

Build configuration is missing definitions

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : N/A
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): N/A
  • TensorFlow Model Analysis installed from (source or binary): N/A
  • TensorFlow Model Analysis version (use command below): N/A
  • Python version: N/A
  • Jupyter Notebook version: N/A
  • Exact command to reproduce: N/A

Describe the problem

The HEAD version of TFMA seems to be missing some definitions for third-party dependencies. The easy-to-fix one is ProtoBuf:

ERROR: error loading package 'tensorflow_model_analysis/proto': Extension file not found. Unable to load package for '@protobuf_bzl//:protobuf.bzl': The repository could not be resolved

which can be fixed by changing the load call slightly

diff --git a/tensorflow_model_analysis/proto/BUILD b/tensorflow_model_analysis/proto/BUILD
index af3386c..af787da 100644
--- a/tensorflow_model_analysis/proto/BUILD
+++ b/tensorflow_model_analysis/proto/BUILD
@@ -2,7 +2,7 @@ licenses(["notice"])  # Apache 2.0

 package(default_visibility = ["//visibility:public"])

-load("@protobuf_bzl//:protobuf.bzl", "py_proto_library")
+load("@com_google_protobuf//:protobuf.bzl", "py_proto_library")

The other one is a bit more difficult since the BUILD file for third_party/py/typing is truly missing from the repo

ERROR: [...]/model-analysis/tensorflow_model_analysis/slicer/BUILD:3:1: no such package 'third_party/py/typing': BUILD file not found on package path and referenced by '//tensorflow_model_analysis/slicer:slicer'

Finally, some of the TFMA BUILD files reference third_party/py/numpy which is missing as well.

"no value provided for label" error with TFX Keras + Evaluator component

Issue

I'm part of the team supporting TFX/Kubeflow Pipelines at Spotify; we are currently upgrading our internal stack to tfx==0.22.1 and tensorflow-model-analysis==0.22.2.

We can successfully evaluate an Estimator-based model with the open-source evaluator component. Unfortunately, when using a Keras-based model, the Beam evaluation pipeline running on Dataflow fails with the following error:

Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 961, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 726, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 812, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "apache_beam/runners/common.py", line 1122, in apache_beam.runners.common._OutputProcessor.process_outputs
  File "apache_beam/runners/worker/operations.py", line 195, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 949, in apache_beam.runners.worker.operations.PGBKCVOperation.process
  File "apache_beam/runners/worker/operations.py", line 978, in apache_beam.runners.worker.operations.PGBKCVOperation.process
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_model_analysis/evaluators/metrics_and_plots_evaluator_v2.py", line 356, in add_input
    result = c.add_input(a, get_combiner_input(elements[0], i))
  File "/usr/local/lib/python3.6/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py", line 551, in add_input
    flatten=self._class_weights is not None)):
  File "/usr/local/lib/python3.6/site-packages/tensorflow_model_analysis/metrics/metric_util.py", line 264, in to_label_prediction_example_weight
    sub_key, inputs))
ValueError: no value provided for label: model_name=, output_name=, sub_key=None, StandardMetricInputs=StandardMetricInputs(label=None, prediction=0.025542974, example_weight=None, features=None)
This may be caused by a configuration error (i.e. label, and/or prediction keys were not specified) or an error in the pipeline.

Have you ever faced this issue?

Additional context

  • We run a version of the Chicago taxi example pipeline
  • Keras model was trained using a copy of the GenericExecutor
  • The model's code was copied from TFX's tutorial
  • The following eval config is passed to the Evaluator component:
import tensorflow_model_analysis as tfma
from google.protobuf.wrappers_pb2 import BoolValue
eval_config = tfma.EvalConfig(
    model_specs=[
      tfma.ModelSpec(model_type=tfma.constants.TF_KERAS, label_key="tips")
    ],
    slicing_specs=[
        tfma.SlicingSpec(),
    ],
    options=tfma.Options(include_default_metrics=BoolValue(value=True)),
)

System information

  • Have I written custom code: Yes
  • TensorFlow Model Analysis installed from: binary via pip
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.9

MinLabelPosition does not display in Slicing Metrics graphs

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.26.0
  • Python version: 3.7.6
  • Jupyter Notebook version: 6.2.0
  • Exact command to reproduce:
    (pandas version: 1.2.0)
import os
import json
from typing import List

import pandas as pd
import tensorflow_model_analysis as tfma


df_data = pd.DataFrame({
    'score': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    'label': [0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
    'query_id': [0] * 4 + [1] * 4 + [2] * 4 + [3] * 4, 
    'slice': ['aaa'] * 4 + ['bbb'] * 12
})

def set_eval_config(prediction: str, label: str, query_key: str, slices: List) -> tfma.EvalConfig:
    model_specs = [
      tfma.ModelSpec(
          prediction_key=prediction,
          label_key=label)
    ]

    metrics = [
        tfma.metrics.NDCG(name='ndcg', gain_key=label, top_k_list=[4]),
        tfma.metrics.MinLabelPosition(name='min_label_position', label_key='label'),
    ]

    metrics_specs = tfma.metrics.specs_from_metrics(metrics, query_key=query_key)

    slicing_specs = [tfma.SlicingSpec()]  # the empty slice represents the overall dataset  
    slicing_specs += [tfma.SlicingSpec(feature_keys=[slice]) for slice in slices]
    

    eval_config = tfma.EvalConfig(
        model_specs=model_specs,
        metrics_specs=metrics_specs,
        slicing_specs=slicing_specs)
    return eval_config

eval_config = set_eval_config(prediction='score',
                              label='label',
                              query_key='query_id',
                              slices=['slice'])

eval_result = tfma.analyze_raw_data(df_data, eval_config)

tfma.view.render_slicing_metrics(eval_result, slicing_column='slice')

#print(eval_result) 

Describe the problem

I am trying to evaluate a ranking model based on a dataframe of predictions.
When I evaluate MinLabelPosition and try to visualize it using tfma.view.render_slicing_metrics, the graph does not display for this metric, whereas it displays correctly for NDCG.
Nonetheless, the MinLabelPosition value is available in the dataframe view output by tfma.view.render_slicing_metrics.

There seems to be a type problem with the output generated by the MinLabelPosition metric (see the Source code / logs section below).

I suggested a fix in the submitted pull request: #106

Source code / logs

When printing what's inside eval_result:

EvalResult(slicing_metrics=[((), {'': {'': {'example_count': {'doubleValue': 16.0}, 
'weighted_example_count': {'doubleValue': 16.0}, 
'min_label_position': {'arrayValue': {'dataType': 'FLOAT32', 'shape': [1], 'float32Values': [1.5]}}}, 
'topK:4': {'ndcg': {'doubleValue': 0.8154648767857289}}}}), 
((('slice', 'aaa'),), {'': {'': {'example_count': {'doubleValue': 4.0}, 'weighted_example_count': {'doubleValue': 4.0}, 'min_label_position': {'arrayValue': {'dataType': 'FLOAT32', 'shape': [1], 'float32Values': [1.0]}}}, 
'topK:4': {'ndcg': {'doubleValue': 1.0}}}}),
 ((('slice', 'bbb'),), {'': {'': {'example_count': {'doubleValue': 12.0}, 
'weighted_example_count': {'doubleValue': 12.0},
 'min_label_position': {'arrayValue': {'dataType': 'FLOAT32', 'shape': [1], 'float32Values': [1.6666666]}}}, 
'topK:4': {'ndcg': {'doubleValue': 0.7539531690476383}}}})], 
plots=[((), None), ((('slice', 'aaa'),), None), ((('slice', 'bbb'),), None)], attributions=[((), None), ((('slice', 'aaa'),), None), ((('slice', 'bbb'),), None)], config=model_specs {
  label_key: "label"
  prediction_key: "score"
}
slicing_specs {
}
slicing_specs {
  feature_keys: "slice"
}
metrics_specs {
  metrics {
    class_name: "ExampleCount"
    config: "{\"name\": \"example_count\"}"
  }
}
metrics_specs {
  metrics {
    class_name: "WeightedExampleCount"
    config: "{\"name\": \"weighted_example_count\"}"
  }
}
metrics_specs {
  metrics {
    class_name: "NDCG"
    config: "{\"gain_key\": \"label\", \"name\": \"ndcg\", \"top_k_list\": [4]}"
  }
  metrics {
    class_name: "MinLabelPosition"
    config: "{\"label_key\": \"label\", \"name\": \"min_label_position\"}"
  }
  query_key: "query_id"
}
, data_location='<user provided PCollection>', file_format='<unknown>', model_location='<unknown>')

[Screenshot of the output of render_slicing_metrics]

`static` folder missing when installing Jupyter notebook extensions

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.13.2
  • Python version: 2.7 / 3.5
  • Jupyter Notebook version: 5.7.8
  • Exact command to reproduce:

Describe the problem

I'm following the instructions to install the TFMA Jupyter extension via Conda, and I noticed a problem at the jupyter nbextension install --py --symlink tensorflow_model_analysis step:

% jupyter nbextension install --py --symlink tensorflow_model_analysis
/home/benjamintan/miniconda3/lib/python3.7/site-packages/apache_beam/__init__.py:84: UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully supported. You may encounter buggy behavior or missing features.
  'Running the Apache Beam SDK on Python 3 is not yet fully supported. '

Installing /home/benjamintan/miniconda3/lib/python3.7/site-packages/tensorflow_model_analysis/static -> tfma_widget_js
Symlinking: /usr/local/share/jupyter/nbextensions/tfma_widget_js -> /home/benjamintan/miniconda3/lib/python3.7/site-packages/tensorflow_model_analysis/static
- Validating: problems found:
   X  require: /usr/local/share/jupyter/nbextensions/tfma_widget_js/extension.js
  OK section: notebook
Full spec: {'section': 'notebook', 'src': 'static', 'dest': 'tfma_widget_js', 'require': 'tfma_widget_js/extension'}

    To initialize this nbextension in the browser every time the notebook (or other app) loads:

          jupyter nbextension enable tensorflow_model_analysis --py

What I've found is that /home/benjamintan/miniconda3/lib/python3.7/site-packages/tensorflow_model_analysis/static doesn't exist, which causes the widgets not to load.

However, when I manually copy the files from GitHub to the aforementioned directory, the widgets then load.

I've tried to use --system and --sys-prefix without any success.

Saw the demo at the conference, but...

I was so hopeful after seeing the demo. When going to use the analysis, it was very disappointing to note the Py27 dependency, which seems to be due to Beam. Why not just take the normal TF input flow, e.g., among other things, taking plain old CSV files as input (without Beam)? I can think of no reason that Beam should be in the analysis package; if it is needed/desired, it should be structured as part of the TF IO package facilities. As it sits now, the package seems like special-purpose code that is not ready for users doing normal development on local machines for a variety of use cases. Even Google's docs generally advise working out flows, including the initial analysis, on local machines before deploying to the cloud. It seems pretty apparent that this is not getting much usage, if only because of the lack of SO activity (none) and the scarcity of GH issues, just these two... Hopefully there is a plan to make this more usable, as I feel it would be a huge help to model development. We really need tools like this though!

TFMA and TF KERAS 2.0 model on pretrained model

Hello all,

I am referring here to a Stack Overflow question that I published a couple of days ago: [https://stackoverflow.com/questions/56248024/tensorflow-model-analysis-tfma-for-keras-model]
I didn't receive any response. There might not be many people using TFX with tf.keras 2.0 at the moment, so I am trying my luck here.
In general, I want to analyze a pre-trained (VGG16) model. The model was

  1. imported with TF KERAS 2.0 API
  2. saved
  3. loaded and converted to estimator using the keras to estimator API

However, the execution requires all VGG features in TF features format. What is the right way to extract these features?
Can anyone refer me to an example where TFX is being used with a pre-trained model?

The code is available in the Stack Overflow question. If it's easier, I can copy it here; let me know.
Many thanks,
eilalan

tfma.run_model_analysis only handles uncompressed tfrecords

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : No
  • TensorFlow Model Analysis version (use command below): 0.13.2 + master

Describe the problem

The compression type used when reading TFRecords in run_model_analysis is hard-coded, even though Beam correctly handles other kinds of compression. CompressionTypes.AUTO should be used instead: "Default value is CompressionTypes.AUTO, in which case the file_path's extension will be used to detect the compression."
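
As a rough illustration of the requested behaviour (not TFMA's actual implementation), a Beam read that lets the file extension determine the compression would look roughly like this; the file pattern is a placeholder:

# Sketch only: let Beam detect the compression from the file extension.
import apache_beam as beam
from apache_beam.io.filesystem import CompressionTypes

with beam.Pipeline() as pipeline:
    examples = (
        pipeline
        | 'ReadExamples' >> beam.io.ReadFromTFRecord(
            '/path/to/eval_data*',  # placeholder path
            compression_type=CompressionTypes.AUTO))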

Multiclass confusion matrix / binarized metrics need class names, not just class IDs

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow Model Analysis installed from (source or binary): binary (PyPI)
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.9
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce:
from tensorflow_model_analysis import EvalConfig
from tensorflow_model_analysis.metrics import default_multi_class_classification_specs
from google.protobuf.json_format import ParseDict

classes = ['class_1', 'class_2', ...]

eval_config = {
    'model_specs': [
        {
            'name': 'rig_state',
            'model_type': 'tf_keras',
            'signature_name': 'serve_raw',
            'label_key': ...,
            'example_weight_key': 'sample_weight'
        }
    ],
    'metrics_specs': [
        {
            'metrics': [
                {
                    'class_name': 'MultiClassConfusionMatrixPlot',
                    'config': '"thresholds": [0.5]'
                },
                {'class_name': 'ExampleCount'},
                {'class_name': 'WeightedExampleCount'},
                {'class_name': 'SparseCategoricalAccuracy'},
            ],
        },
        {
            'binarize': {'class_ids': {'values': list(range(len(classes)))}},
            'metrics': [
                {'class_name': 'AUC'},
                {'class_name': 'CalibrationPlot'},
                {'class_name': 'BinaryAccuracy'},
                {'class_name': 'MeanPrediction'}
            ]
        }
    ],
    'slicing_specs': [...]
}
eval_config: EvalConfig = ParseDict(eval_config, EvalConfig())

Describe the problem

Multiclass confusion matrices and binarized metrics should support class names, not just class IDs; something like 'binarize': {'classes': [{'id': _id, 'name': name} for _id, name in enumerate(classes)]} (sketched below). As it stands, integer class IDs are meaningless to the data scientists and business stakeholders looking at the TFMA visualizations.
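
To make the request concrete, the desired configuration might look something like the sketch below; note that a 'classes' field with names is hypothetical and does not exist in current TFMA versions, which only accept class_ids here.

# Hypothetical sketch of the requested API; a 'classes' field with names is NOT a real TFMA option today.
classes = ['class_1', 'class_2']  # example class names

proposed_metrics_spec = {
    'binarize': {
        'classes': [  # hypothetical field pairing IDs with display names
            {'id': class_id, 'name': name}
            for class_id, name in enumerate(classes)
        ]
    },
    'metrics': [
        {'class_name': 'AUC'},
        {'class_name': 'BinaryAccuracy'},
    ],
}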

Documentation for run_model_analysis is outdated

System information

N/A

tfma version 0.13.0

Describe the problem

The documentation for run_model_analysis (found here) does not match the latest version of tfma. For instance, the model_location parameter no longer exists.

Source code / logs

N/A

TFMA unable to find metrics for Keras model when loading eval result

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Catalina
  • TensorFlow Model Analysis installed from (source or binary): pypi
  • TensorFlow Model Analysis version (use command below): 0.22.1
  • Python version: 3.7.5
  • Jupyter Notebook version: 1.0.0

Describe the problem

I have trained a Keras model (not estimator) with the following serving signature:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['examples'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: serving_default_examples:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['mu'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall_1:0
    outputs['sigma'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall_1:1
  Method name is: tensorflow/serving/predict

The weights are updated using a custom training loop with gradient tape, instead of the model.fit method, before the model is exported as a saved_model. As I am unable to get TFMA to work without first compiling the model, I compile the model while specifying a set of custom Keras metrics:

model.compile(metrics=custom_keras_metrics) # each custom metric inherits from keras.Metric
custom_training_loop(model)
model.save("path/to/saved_model", save_format="tf")

I would like to evaluate this model using TFMA, so I first initialise an eval shared model as follows:

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="my_label_key")],
    slicing_specs=[tfma.SlicingSpec()] # empty slice refers to the entire dataset
)
eval_shared_model = tfma.default_eval_shared_model("path/to/saved_model", eval_config=eval_config)

However, when I try to run model analysis:

eval_results = tfma.run_model_analysis(
    eval_shared_model=eval_shared_model,
    data_location="path/to/test/tfrecords*",
    file_format="tfrecords"
)

I am faced with the following error:

ValueError          Traceback (most recent call last)
<ipython-input-156-f9a9684a6797> in <module>
      2     eval_shared_model=eval_shared_model,
      3     data_location="tfma/test_raw-*",
----> 4     file_format="tfrecords"
      5 )

~/.pyenv/versions/miniconda3-4.3.30/envs/tensorflow/lib/python3.7/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in run_model_analysis(eval_shared_model, eval_config, data_location, file_format, output_path, extractors, evaluators, writers, pipeline_options, slice_spec, write_config, compute_confidence_intervals, min_slice_size, random_seed_for_testing, schema)
   1204 
   1205   if len(eval_config.model_specs) <= 1:
-> 1206     return load_eval_result(output_path)
   1207   else:
   1208     results = []

~/.pyenv/versions/miniconda3-4.3.30/envs/tensorflow/lib/python3.7/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in load_eval_result(output_path, model_name)
    383       metrics_and_plots_serialization.load_and_deserialize_metrics(
    384           path=os.path.join(output_path, constants.METRICS_KEY),
--> 385           model_name=model_name))
    386   plots_proto_list = (
    387       metrics_and_plots_serialization.load_and_deserialize_plots(

~/.pyenv/versions/miniconda3-4.3.30/envs/tensorflow/lib/python3.7/site-packages/tensorflow_model_analysis/writers/metrics_and_plots_serialization.py in load_and_deserialize_metrics(path, model_name)
    180       raise ValueError('Fail to find metrics for model name: %s . '
    181                        'Available model names are [%s]' %
--> 182                        (model_name, ', '.join(keys)))
    183 
    184     result.append((

ValueError: Fail to find metrics for model name: None . Available model names are []

Why is TFMA raising this exception, and where should I begin debugging this error? I tried specifying the model names manually (which should not be required since I'm only using one model), but that did not seem to help either. I tried tracing the source code and it seems this happens when TFMA tries to load the eval result generated by the PTransform.
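
One avenue worth exploring (an assumption on the editor's part, not a confirmed fix) is that no metrics end up in the output because none are declared in the EvalConfig; declaring them explicitly via specs_from_metrics would look roughly like this, with the label key and metric choice as placeholders:

# Sketch under the assumption that metrics should be declared in the EvalConfig
# rather than relying on the compiled Keras metrics; names below are placeholders.
import tensorflow as tf
import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='my_label_key')],
    metrics_specs=tfma.metrics.specs_from_metrics(
        [tf.keras.metrics.MeanSquaredError(name='mse')]),
    slicing_specs=[tfma.SlicingSpec()])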

TFMA not rendering in JupyterLab

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes, minor
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 and image python:3.7-slim
  • TensorFlow Model Analysis installed from (source or binary): pip install
  • TensorFlow Model Analysis version (use command below): 0.27.0
  • Python version: 3.7
  • Jupyter Notebook version: jupyterlab 2.2.9
  • Exact command to reproduce: see sample notebook

Describe the problem

tfma.view.render_slicing_metrics shows no output.

Source code / logs

Slim docker image to reproduce the issue:

FROM python:3.7-slim

ENV DEBIAN_FRONTEND=noninteractive

# This is used because our k8s cluster can only access our internal pypi
#COPY pip.conf /etc/pip.conf

# # TFMA is installed in the notebook because pip complained otherwise
RUN python3.7 -m pip install --no-cache-dir jupyterlab==2.2.9

# Install Node (for jupyter lab extensions)
RUN apt update && \
    apt -y install nano curl dirmngr apt-transport-https lsb-release ca-certificates && \
    curl -L https://deb.nodesource.com/setup_15.x | bash - && \
    apt update && apt install -y nodejs && \
    node -v

RUN jupyter labextension install [email protected] && \
    jupyter labextension install @jupyter-widgets/jupyterlab-manager@2

RUN jupyter lab build

ENV DEBIAN_FRONTEND=

ENV NB_PREFIX /
ENV SHELL=/bin/bash

# Standard KubeFlow jupyter startup command
CMD ["/bin/bash","-c", "jupyter lab --notebook-dir=/home/jovyan --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]

Below is an evaluation artifact from a small TFX pipeline and a minimal notebook to reproduce the issue. The notebook consists of unzipping the artifact, installing tfma, loading the eval result, and trying to display it.
3053.zip
tfma-render-issue.ipynb.zip

Edit: I've also run the above with node 12 instead of 15 with the exact same result.

Null-Value Validation Results prevent push with no explanation

Please go to Stack Overflow for help and support:

https://stackoverflow.com/questions/tagged/tensorflow-model-analysis

If you open a GitHub issue, here is our policy:

  1. It must be a bug, a feature request, or a significant problem with
    documentation (for small docs fixes please send a PR instead).
  2. The form below must be filled out.

Here's why we have that policy: TensorFlow Model Analysis developers respond
to issues. We want to focus on work that benefits the whole community, e.g.,
fixing bugs and adding features. Support only helps individuals. GitHub also
notifies thousands of people when issues are filed. We want them to see you
communicating an interesting problem, rather than being redirected to Stack
Overflow.


System information

Lambda Quad; 4 2080 RTX Cards, 128 Gb RAM, 24 Virtual CPUs

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    :

No

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):

UBUNTU 18.04

  • TensorFlow Model Analysis installed from (source or binary):

pip install tensorflow-tfx

  • TensorFlow Model Analysis version (use command below):

TensorFlow version: 2.1.0
TFX version: 0.21.4
pandas version: 1.0.3
TF Data Validation: 0.21.5
TF Model Analysis: 0.21.5

  • Python version:

Python 3.7.7

  • Jupyter Notebook version:

The version of the notebook server is: 5.7.8

  • Exact command to reproduce:
    So I'm looking at what appears to be a TFMA bug. Every so often when training a model, it fails to push even though TensorBoard would suggest a successful, passing model. The validation results, as you can see below, are for some reason excluding the offending value.

image

You can obtain the TensorFlow Model Analysis version with

python -c "import tensorflow_model_analysis as tfma; print(tfma.version.VERSION)"

0.21.5


No Output in Jupyter Notebook

Instrumented a model according to the docs. Created a notebook with a cell to render the tfma widget.

eval_result = tfma.run_model_analysis(
  model_location=MODEL_LOCATION,
  data_location=EVAL_TFRECORDS_LOCATION,
  file_format='tfrecords')

tfma.view.render_slicing_metrics(eval_result)

This gives me some INFO log messages but no widget. I have installed the nb extensions as indicated in the docs. I used the --user flag for installations.

This output is received in the notebook:

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

INFO:tensorflow:Restoring parameters from ./data/tfma/1523504954/variables/variables

minimal example for launching/confirming tfma integration in a notebook?

Any chance someone from this project could provide a minimal code example for launching the notebook-based visual TFMA widget, to ensure that it's properly integrated and installed in the notebook environment?

Ideally this would just render the visual widget using dummy data that could be generated on a single machine, without some complex training process behind it, primarily for cursory integration testing in the notebook environment.
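
Not a full answer to this request, but the shortest render path I'm aware of assumes an eval result has already been written to disk by tfma.run_model_analysis; the output path below is a placeholder:

# Minimal sketch: load a previously written eval result and render the widget.
import tensorflow_model_analysis as tfma

OUTPUT_DIR = '/path/to/eval_output'  # placeholder for an existing run_model_analysis output
eval_result = tfma.load_eval_result(OUTPUT_DIR)
tfma.view.render_slicing_metrics(eval_result)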

Using keras metrics in TF 1.X

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac OS X
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.21.6
  • Python version: 3.7.4
  • Jupyter Notebook version: N/A
  • Exact command to reproduce: Running keras metrics evaluation with TF 1.X in dataflow distributed mode

Describe the problem

TFMA assumes a V2 execution environment, and to use it in TF 1.X the user needs to call 'tf.compat.v1.enable_v2_behavior()', as instructed by this issue. This works fine in local execution; however, in Dataflow distributed mode it's hard to do this for every worker. There are two places where TF is used in TFMA, predict_extractor_v2 and tf_metric_wrapper, and both will fail in Dataflow. The first can be solved by injecting tf.compat.v1.enable_v2_behavior() in model_construct_fn, but the latter has no setup_fn exposed and is hard to work around. This can be reproduced by running TFMA in TF 1.X on Cloud Dataflow with any Keras metrics.

Stack trace:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 650, in do_work
    work_executor.execute()
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 176, in execute
    op.start()
  File "dataflow_worker/shuffle_operations.py", line 50, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
  File "dataflow_worker/shuffle_operations.py", line 51, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
  File "dataflow_worker/shuffle_operations.py", line 66, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
  File "dataflow_worker/shuffle_operations.py", line 67, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
  File "dataflow_worker/shuffle_operations.py", line 71, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
  File "apache_beam/runners/worker/operations.py", line 256, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 143, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
  File "dataflow_worker/shuffle_operations.py", line 234, in dataflow_worker.shuffle_operations.BatchGroupAlsoByWindowsOperation.process
  File "dataflow_worker/shuffle_operations.py", line 241, in dataflow_worker.shuffle_operations.BatchGroupAlsoByWindowsOperation.process
  File "apache_beam/runners/worker/operations.py", line 256, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 143, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 753, in apache_beam.runners.worker.operations.CombineOperation.process
  File "apache_beam/runners/worker/operations.py", line 758, in apache_beam.runners.worker.operations.CombineOperation.process
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/combiners.py", line 866, in extract_only
    return self.combine_fn.extract_output(accumulator)
  File "/Users/anbang.zhao/repo/masterchef-training/venv/lib/python3.7/site-packages/tensorflow_model_analysis/evaluators/metrics_and_plots_evaluator_v2.py", line 350, in extract_output
    output = c.extract_output(a)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py", line 591, in extract_output
    result[key] = metric.result().numpy()
AttributeError: 'Tensor' object has no attribute 'numpy'

I've tried using the --save_main_session option in Dataflow and calling tf.compat.v1.enable_v2_behavior() in the driver's main function, but it does not work.
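
For reference, the local workaround mentioned above is just the following call before any TFMA code runs; the unsolved part is getting the same call to execute on every Dataflow worker:

# Local-only workaround: enable TF2 behavior before TFMA builds the pipeline.
# This does not propagate to remote Dataflow workers on its own.
import tensorflow as tf

tf.compat.v1.enable_v2_behavior()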

Shows all zeros for model analysis



System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes / No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow Model Analysis installed from (source or binary): PyPI
  • TensorFlow Model Analysis version (use command below): 0.9.2
  • Python version: 2.7
  • Jupyter Notebook version: 1.0.0
  • Exact command to reproduce: Follow Chicago Taxi Example (local example) tutorial.

You can obtain the TensorFlow Model Analysis version with

python -c "import tensorflow_model_analysis as tfma; print(tfma.VERSION_STRING)"

Describe the problem

I ran the Chicago Taxi Example notebook but got all zeros when visualizing the sliced data; it looked like the picture below.
image

Then I tried to use the Iris flower data set as input and modified some code, but still got the same result, shown below.
image
I used the first column to slice the data.

I don't know whether the import warning shown in the picture below affects the result or not.
image
This import warning did not appear after the first run without restarting the kernel.

MultiClassConfusionMatrixPlot - no plot data found

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : Yes. See below.
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 10.14.6
  • TensorFlow Model Analysis installed from (source or binary): pip
  • TensorFlow Model Analysis version (use command below): 0.21.2
  • Python version: 3.6.8
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce:

Describe the problem

I have an EvalResult that contains data for a multiclass confusion matrix but tfma.view.render_plot can't seem to find it.

Source code / logs

import tensorflow as tf
import tensorflow_model_analysis as tfma
m = tfma.default_eval_shared_model('train', tags='serve', example_weight_key='sample_weight', include_default_metrics=False)
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(signature_name='eval', label_key='activity1', example_weight_key='sample_weight')],
    metrics_specs=tfma.metrics.specs_from_metrics([
        tfma.metrics.MultiClassConfusionMatrixPlot(),
    ], include_weighted_example_count=False),
    slicing_specs=[
        tfma.SlicingSpec(),
    ]
)

import numpy as np
from tensorflow_model_analysis import types
import apache_beam as beam

LABELS = [...]

eye = np.eye(len(LABELS))

@beam.ptransform_fn
@beam.typehints.with_input_types(types.Extracts)
@beam.typehints.with_output_types(types.Extracts)
def f(extracts):
    def _f(ex):
        ex['labels'] = eye[LABELS.index(ex['labels'][0].decode())]
        return ex
    return extracts | 'OHE' >> beam.Map(_f)

integerize_labels = tfma.extractors.Extractor('ohe', f())

result = tfma.run_model_analysis(
    eval_shared_model=m,
    eval_config=eval_config,
    data_location='...',
    extractors=tfma.default_extractors(eval_shared_model=m, eval_config=eval_config) + [integerize_labels],
    evaluators=[
         tfma.evaluators.metrics_and_plots_evaluator_v2.MetricsAndPlotsEvaluator(eval_config=eval_config, eval_shared_model=m, run_after='ohe')
    ]
)
tfma.view.render_plot(result)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-84-348d3cfecd7f> in <module>
----> 1 tfma.view.render_plot(result)

/anaconda3/envs/gcp_test_env/lib/python3.6/site-packages/tensorflow_model_analysis/view/widget_view.py in render_plot(result, slicing_spec, output_name, class_id, top_k, k, label)
    109   data, config = util.get_plot_data_and_config(result.plots, slice_spec_to_use,
    110                                                output_name, class_id, top_k, k,
--> 111                                                label)
    112   return visualization.render_plot(data, config)

/anaconda3/envs/gcp_test_env/lib/python3.6/site-packages/tensorflow_model_analysis/view/util.py in get_plot_data_and_config(results, slicing_spec, output_name, class_id, top_k, k, label)
    335     if not contains_supported_plot_data:
    336       raise ValueError('No plot data found. Maybe provide a label? %s' %
--> 337                        json.dumps(plot_data))
    338 
    339   plot_data = _replace_nan_with_none(plot_data, _SUPPORTED_PLOT_KEYS)

ValueError: No plot data found. Maybe provide a label? {"multiClassConfusionMatrixAtThresholds": {"matrices": [{"entries": [{"numWeightedExamples": 23878.0}, {"predictedClassId": 2, "numWeightedExamples": 4.0}, {"predictedClassId": 3, "numWeightedExamples": 873.0}, {"actualClassId": 1, "predictedClassId": 1, "numWeightedExamples": 306.0}, {"actualClassId": 2, "predictedClassId": 2, "numWeightedExamples": 1173.0}, {"actualClassId": 2, "predictedClassId": 3, "numWeightedExamples": 22.0}, {"actualClassId": 3, "predictedClassId": 3, "numWeightedExamples": 52.0}, {"actualClassId": 4, "predictedClassId": 4, "numWeightedExamples": 1754.0}, {"actualClassId": 5, "predictedClassId": 1, "numWeightedExamples": 37.0}, {"actualClassId": 5, "predictedClassId": 2, "numWeightedExamples": 273.0}, {"actualClassId": 5, "predictedClassId": 5, "numWeightedExamples": 9.0}, {"actualClassId": 5, "predictedClassId": 6, "numWeightedExamples": 4.0}, {"actualClassId": 6, "predictedClassId": 1, "numWeightedExamples": 361.0}, {"actualClassId": 6, "predictedClassId": 2, "numWeightedExamples": 106.0}, {"actualClassId": 6, "predictedClassId": 6, "numWeightedExamples": 5.0}]}]}}

When tf.keras meets get_model_type, a memory leak happens

System information

  • Have I specified the code to reproduce the issue
    (Yes/No): yes
  • Environment in which the code is executed (e.g., Local
    (Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc):
  • TensorFlow version (you are using): 2.3.2
  • TFX Version: 0.26.1
  • Python version: 3.6.7

Describe the current behavior
In the tfma module (model_util.py:109), the get_model_type function loads the model onto the GPU to infer the model type, and then returns it.

  if model_path:
    try:
      keras_model = tf.keras.models.load_model(model_path)
      # In some cases, tf.keras.models.load_model can successfully load a
      # saved_model but it won't actually be a keras model.
      if isinstance(keras_model, tf.keras.models.Model):
        return constants.TF_KERAS
    except Exception:  # pylint: disable=broad-except
      pass

  if tags:
    if tags and eval_constants.EVAL_TAG in tags:
      return constants.TF_ESTIMATOR
    else:
      return constants.TF_GENERIC

  signature_name = None
  if model_spec:
    if model_spec.signature_name:
      signature_name = model_spec.signature_name
    else:
      signature_name = tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY

  if signature_name == eval_constants.EVAL_TAG:
    return constants.TF_ESTIMATOR
  else:
    return constants.TF_GENERIC

Because of this function, the loaded model stays in GPU memory and takes a large amount of it. When the evaluator continues and starts the actual evaluation, another load is triggered and causes an OOM error.

Describe the expected behavior
When get_model_type finishes, the GPU memory should be released, or the already-loaded model should simply be passed on to the following process instead of being loaded again.
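
A minimal sketch of the kind of cleanup being asked for, assuming that dropping the Python reference and clearing the Keras session is enough to let the GPU memory be reclaimed in this environment:

# Sketch only: release the model that was loaded purely to infer its type.
import gc

import tensorflow as tf


def get_model_type_and_release(model_path):
    model_type = None
    keras_model = None
    try:
        keras_model = tf.keras.models.load_model(model_path)
        if isinstance(keras_model, tf.keras.models.Model):
            model_type = 'tf_keras'  # stands in for constants.TF_KERAS
    except Exception:  # pylint: disable=broad-except
        pass
    finally:
        # Drop the reference and clear the Keras session so GPU memory can be freed.
        del keras_model
        tf.keras.backend.clear_session()
        gc.collect()
    return model_type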

Standalone code to reproduce the issue

Just try a larger pretrained model, like multilingual BERT, and put it into a pipeline with the Evaluator.

Name of your Organization (Optional)

Other info / logs

RuntimeError: Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1280, in apache_beam.runners.common.DoFnRunner._invoke_lifecycle_method
File "apache_beam/runners/common.py", line 516, in apache_beam.runners.common.DoFnInvoker.invoke_setup
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow_model_analysis/model_util.py", line 758, in setup
super(ModelSignaturesDoFn, self).setup()
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow_model_analysis/model_util.py", line 574, in setup
model_load_time_callback=self._set_model_load_seconds)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow_model_analysis/types.py", line 160, in load
return self._shared_handle.acquire(construct_fn)
File "/home/luban/.local/lib/python3.6/site-packages/tfx_bsl/beam/shared.py", line 238, in acquire
return _shared_map.acquire(self._key, constructor_fn, tag)
File "/home/luban/.local/lib/python3.6/site-packages/tfx_bsl/beam/shared.py", line 194, in acquire
result = control_block.acquire(constructor_fn, tag)
File "/home/luban/.local/lib/python3.6/site-packages/tfx_bsl/beam/shared.py", line 89, in acquire
result = constructor_fn()
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow_model_analysis/types.py", line 169, in with_load_times
model = self.construct_fn()
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow_model_analysis/model_util.py", line 550, in construct_fn
model = tf.compat.v1.saved_model.load_v2(eval_saved_model_path, tags=tags)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 603, in load
return load_internal(export_dir, tags, options)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 633, in load_internal
ckpt_options)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 130, in init
self._load_all()
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 141, in _load_all
self._load_nodes()
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 296, in _load_nodes
slot_name=slot_variable_proto.slot_name)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 764, in add_slot
initial_value=initial_value)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 262, in call
return cls._variable_v2_call(*args, **kwargs)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 256, in _variable_v2_call
shape=shape)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 237, in
previous_getter = lambda **kws: default_variable_creator_v2(None, **kws)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2646, in default_variable_creator_v2
shape=shape)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 264, in call
return super(VariableMetaclass, cls).call(*args, **kwargs)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1518, in init
distribute_strategy=distribute_strategy)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1651, in _init_from_args
initial_value() if init_from_fn else initial_value,
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/keras/initializers/initializers_v2.py", line 137, in call
return super(Zeros, self).call(shape, dtype=_get_dtype(dtype))
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/init_ops_v2.py", line 132, in call
return array_ops.zeros(shape, dtype)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2747, in wrapped
tensor = fun(*args, **kwargs)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2806, in zeros
output = fill(shape, constant(zero, dtype=dtype), name=name)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 239, in fill
result = gen_array_ops.fill(dims, value, name=name)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3402, in fill
_ops.raise_from_not_ok_status(e, name)
File "/home/luban/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6843, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "", line 3, in raise_from
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[119547,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:Fill]
""

Running the example on Google Cloud Datalab

Hi all, no one is responding on the tensorflow-model-analysis Stack Overflow tag (https://stackoverflow.com/questions/tagged/tensorflow-model-analysis). As much as I want to use the library, I find it very challenging to get help.

Please advise,
eilalan


Support for multi-dimensional labels

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis)
    : stock configuration with custom model
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 10.15.5
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.22.2 and 0.24.0
  • Python version: 3.7
  • Jupyter Notebook version: 2.1.5
  • Exact command to reproduce:

Describe the problem

I encounter the following error when running the evaluator on a model where the output shape is (-1, -1, 5). The label is a variable length sparse categorical feature. Are multi-dimensional labels currently supported? I am unsure whether this qualifies as a bug or a feature request.

The given SavedModel SignatureDef contains the following input(s):
  inputs['examples'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_examples:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_0'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, -1, 5)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

Source code / logs

WARNING:absl:Tensorflow version (2.3.1) found. Note that TFMA support for TF 2.0 is currently in beta
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-1ba4457d108a> in <module>
     41     file_format='tfrecords',
     42     output_path='test',
---> 43     extractors=None)

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow_model_analysis/api/model_eval_lib.py in run_model_analysis(eval_shared_model, eval_config, data_location, file_format, output_path, extractors, evaluators, writers, pipeline_options, slice_spec, write_config, compute_confidence_intervals, min_slice_size, random_seed_for_testing, schema)
   1124             random_seed_for_testing=random_seed_for_testing,
   1125             tensor_adapter_config=tensor_adapter_config,
-> 1126             schema=schema))
   1127     # pylint: enable=no-value-for-parameter
   1128 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/pipeline.py in __exit__(self, exc_type, exc_val, exc_tb)
    553     try:
    554       if not exc_type:
--> 555         self.run().wait_until_finish()
    556     finally:
    557       self._extra_context.__exit__(exc_type, exc_val, exc_tb)

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/pipeline.py in run(self, test_runner_api)
    532         finally:
    533           shutil.rmtree(tmpdir)
--> 534       return self.runner.run_pipeline(self, self._options)
    535     finally:
    536       shutil.rmtree(self.local_tempdir, ignore_errors=True)

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/direct/direct_runner.py in run_pipeline(self, pipeline, options)
    117       runner = BundleBasedDirectRunner()
    118 
--> 119     return runner.run_pipeline(pipeline, options)
    120 
    121 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_pipeline(self, pipeline, options)
    171 
    172     self._latest_run_result = self.run_via_runner_api(
--> 173         pipeline.to_runner_api(default_environment=self._default_environment))
    174     return self._latest_run_result
    175 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_via_runner_api(self, pipeline_proto)
    181     # TODO(pabloem, BEAM-7514): Create a watermark manager (that has access to
    182     #   the teststream (if any), and all the stages).
--> 183     return self.run_stages(stage_context, stages)
    184 
    185   @contextlib.contextmanager

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in run_stages(self, stage_context, stages)
    338           stage_results = self._run_stage(
    339               runner_execution_context,
--> 340               bundle_context_manager,
    341           )
    342           monitoring_infos_by_stage[stage.name] = (

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in _run_stage(self, runner_execution_context, bundle_context_manager)
    517               input_timers,
    518               expected_timer_output,
--> 519               bundle_manager)
    520 
    521       final_result = merge_results(last_result)

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in _run_bundle(self, runner_execution_context, bundle_context_manager, data_input, data_output, input_timers, expected_timer_output, bundle_manager)
    555 
    556     result, splits = bundle_manager.process_bundle(
--> 557         data_input, data_output, input_timers, expected_timer_output)
    558     # Now we collect all the deferred inputs remaining from bundle execution.
    559     # Deferred inputs can be:

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in process_bundle(self, inputs, expected_outputs, fired_timers, expected_output_timers, dry_run)
    939     with thread_pool_executor.shared_unbounded_instance() as executor:
    940       for result, split_result in executor.map(execute, zip(part_inputs,  # pylint: disable=zip-builtin-not-iterating
--> 941                                                             timer_inputs)):
    942         split_result_list += split_result
    943         if merged_result is None:

~/.pyenv/versions/3.7.7/lib/python3.7/concurrent/futures/_base.py in result_iterator()
    596                     # Careful not to keep a reference to the popped future
    597                     if timeout is None:
--> 598                         yield fs.pop().result()
    599                     else:
    600                         yield fs.pop().result(end_time - time.monotonic())

~/.pyenv/versions/3.7.7/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
    433                 raise CancelledError()
    434             elif self._state == FINISHED:
--> 435                 return self.__get_result()
    436             else:
    437                 raise TimeoutError()

~/.pyenv/versions/3.7.7/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/utils/thread_pool_executor.py in run(self)
     42       # If the future wasn't cancelled, then attempt to execute it.
     43       try:
---> 44         self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs))
     45       except BaseException as exc:
     46         # Even though Python 2 futures library has #set_exection(),

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in execute(part_map_input_timers)
    935           input_timers,
    936           expected_output_timers,
--> 937           dry_run)
    938 
    939     with thread_pool_executor.shared_unbounded_instance() as executor:

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py in process_bundle(self, inputs, expected_outputs, fired_timers, expected_output_timers, dry_run)
    835             process_bundle_descriptor.id,
    836             cache_tokens=[next(self._cache_token_generator)]))
--> 837     result_future = self._worker_handler.control_conn.push(process_bundle_req)
    838 
    839     split_results = []  # type: List[beam_fn_api_pb2.ProcessBundleSplitResponse]

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner/worker_handlers.py in push(self, request)
    350       self._uid_counter += 1
    351       request.instruction_id = 'control_%s' % self._uid_counter
--> 352     response = self.worker.do_instruction(request)
    353     return ControlFuture(request.instruction_id, response)
    354 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py in do_instruction(self, request)
    478       # E.g. if register is set, this will call self.register(request.register))
    479       return getattr(self, request_type)(
--> 480           getattr(request, request_type), request.instruction_id)
    481     else:
    482       raise NotImplementedError

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py in process_bundle(self, request, instruction_id)
    513         with self.maybe_profile(instruction_id):
    514           delayed_applications, requests_finalization = (
--> 515               bundle_processor.process_bundle(instruction_id))
    516           monitoring_infos = bundle_processor.monitoring_infos()
    517           monitoring_infos.extend(self.state_cache_metrics_fn())

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py in process_bundle(self, instruction_id)
    981       for op in self.ops.values():
    982         _LOGGER.debug('finish %s', op)
--> 983         op.finish()
    984 
    985       # Close every timer output stream

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/worker/operations.cpython-37m-darwin.so in apache_beam.runners.worker.operations.PGBKCVOperation.finish()

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/worker/operations.cpython-37m-darwin.so in apache_beam.runners.worker.operations.PGBKCVOperation.finish()

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/runners/worker/operations.cpython-37m-darwin.so in apache_beam.runners.worker.operations.PGBKCVOperation.output_key()

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow_model_analysis/evaluators/metrics_and_plots_evaluator_v2.py in compact(self, accumulator)
    362   def compact(self, accumulator: Any) -> Any:
    363     self._num_compacts.inc(1)
--> 364     return super(_ComputationsCombineFn, self).compact(accumulator)
    365 
    366   def extract_output(self, accumulator: Any) -> metric_types.MetricsDict:

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/transforms/combiners.py in compact(self, accumulator)
    717 
    718   def compact(self, accumulator):
--> 719     return [c.compact(a) for c, a in zip(self._combiners, accumulator)]
    720 
    721   def extract_output(self, accumulator):

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/apache_beam/transforms/combiners.py in <listcomp>(.0)
    717 
    718   def compact(self, accumulator):
--> 719     return [c.compact(a) for c, a in zip(self._combiners, accumulator)]
    720 
    721   def extract_output(self, accumulator):

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py in compact(self, accumulator)
    558       self, accumulator: _CompilableMetricsAccumulator
    559   ) -> _CompilableMetricsAccumulator:
--> 560     self._process_batch(accumulator)
    561     return accumulator
    562 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py in _process_batch(self, accumulator)
    524       for metric_index, metric in enumerate(self._metrics[output_name]):
    525         metric.reset_states()
--> 526         metric.update_state(*inputs)
    527         accumulator.add_weights(output_index, metric_index,
    528                                 metric.get_weights())

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/keras/utils/metrics_utils.py in decorated(metric_obj, *args, **kwargs)
     88 
     89     with tf_utils.graph_context_for_symbolic_tensors(*args, **kwargs):
---> 90       update_op = update_state_fn(*args, **kwargs)
     91     if update_op is not None:  # update_op will be None in eager execution.
     92       metric_obj.add_update(update_op)

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/keras/metrics.py in update_state_fn(*args, **kwargs)
    174         control_status = ag_ctx.control_status_ctx()
    175         ag_update_state = autograph.tf_convert(obj_update_state, control_status)
--> 176         return ag_update_state(*args, **kwargs)
    177     else:
    178       if isinstance(obj.update_state, def_function.Function):

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py in wrapper(*args, **kwargs)
    253       try:
    254         with conversion_ctx:
--> 255           return converted_call(f, args, kwargs, options=options)
    256       except Exception as e:  # pylint:disable=broad-except
    257         if hasattr(e, 'ag_error_metadata'):

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py in converted_call(f, args, kwargs, caller_fn_scope, options)
    455   if conversion.is_in_whitelist_cache(f, options):
    456     logging.log(2, 'Whitelisted %s: from cache', f)
--> 457     return _call_unconverted(f, args, kwargs, options, False)
    458 
    459   if ag_ctx.control_status_ctx().status == ag_ctx.Status.DISABLED:

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py in _call_unconverted(f, args, kwargs, options, update_cache)
    337 
    338   if kwargs is not None:
--> 339     return f(*args, **kwargs)
    340   return f(*args)
    341 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/keras/metrics.py in update_state(self, y_true, y_pred, sample_weight)
    601       Update op.
    602     """
--> 603     y_true = math_ops.cast(y_true, self._dtype)
    604     y_pred = math_ops.cast(y_pred, self._dtype)
    605     [y_true, y_pred], sample_weight = \

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
    199     """Call target, and fall back on dispatchers if there is a TypeError."""
    200     try:
--> 201       return target(*args, **kwargs)
    202     except (TypeError, ValueError):
    203       # Note: convert_to_eager_tensor currently raises a ValueError, not a

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py in cast(x, dtype, name)
    918       # allows some conversions that cast() can't do, e.g. casting numbers to
    919       # strings.
--> 920       x = ops.convert_to_tensor(x, name="x")
    921       if x.dtype.base_dtype != base_type:
    922         x = gen_math_ops.cast(x, base_type, name=name)

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, dtype_hint, ctx, accepted_result_types)
   1497 
   1498     if ret is None:
-> 1499       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1500 
   1501     if ret is NotImplemented:

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/framework/tensor_conversion_registry.py in _default_conversion_function(***failed resolving arguments***)
     50 def _default_conversion_function(value, dtype, name, as_ref):
     51   del as_ref  # Unused.
---> 52   return constant_op.constant(value, dtype, name=name)
     53 
     54 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name)
    262   """
    263   return _constant_impl(value, dtype, shape, name, verify_shape=False,
--> 264                         allow_broadcast=True)
    265 
    266 

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py in _constant_impl(value, dtype, shape, name, verify_shape, allow_broadcast)
    273       with trace.Trace("tf.constant"):
    274         return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
--> 275     return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    276 
    277   g = ops.get_default_graph()

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py in _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    298 def _constant_eager_impl(ctx, value, dtype, shape, verify_shape):
    299   """Implementation of eager constant."""
--> 300   t = convert_to_eager_tensor(value, ctx, dtype)
    301   if shape is None:
    302     return t

~/.pyenv/versions/3.7.7/envs/beam-3.7/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py in convert_to_eager_tensor(value, ctx, dtype)
     96       dtype = dtypes.as_dtype(dtype).as_datatype_enum
     97   ctx.ensure_initialized()
---> 98   return ops.EagerTensor(value, ctx.device_name, dtype)
     99 
    100 

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
