pair-code / what-if-tool

Source code/webpage/demos for the What-If Tool

Home Page: https://pair-code.github.io/what-if-tool

License: Apache License 2.0

Jupyter Notebook 9.86% Python 0.58% HTML 88.99% TypeScript 0.21% Shell 0.02% JavaScript 0.19% Starlark 0.06% Liquid 0.04% CSS 0.05%
ml-fairness visualization machine-learning jupyterlab-extension colaboratory tensorboard

what-if-tool's Introduction

What-If Tool

What-If Tool Screenshot

The What-If Tool (WIT) provides an easy-to-use interface for expanding understanding of a black-box classification or regression ML model. With the plugin, you can perform inference on a large set of examples and immediately visualize the results in a variety of ways. Additionally, examples can be edited manually or programmatically and re-run through the model in order to see the results of the changes. It contains tooling for investigating model performance and fairness over subsets of a dataset.

The purpose of the tool is to give people a simple, intuitive, and powerful way to play with a trained ML model on a set of data through a visual interface, with absolutely no code required.

The tool can be accessed through TensorBoard or as an extension in a Jupyter or Colab notebook.

I don’t want to read this document. Can I just play with a demo?

Check out the large set of web and colab demos in the demo section of the What-If Tool website.

To build the web demos yourself:

  • Binary classifier for UCI Census dataset salary prediction
    • Dataset: UCI Census
    • Task: Predict whether a person earns more or less than $50k based on their census information
    • To build and run the demo from code: bazel run wit_dashboard/demo:demoserver then navigate to http://localhost:6006/wit-dashboard/demo.html
  • Binary classifier for smile detection in images
    • Dataset: CelebA
    • Task: Predict whether the person in an image is smiling
    • To build and run the demo from code: bazel run wit_dashboard/demo:imagedemoserver then navigate to http://localhost:6006/wit-dashboard/image_demo.html
  • Multiclass classifier for Iris dataset
    • Dataset: UCI Iris
    • Task: Predict which of three iris species a flower belongs to, based on four measurements of the flower
    • To build and run the demo from code: bazel run wit_dashboard/demo:irisdemoserver then navigate to http://localhost:6006/wit-dashboard/iris_demo.html
  • Regression model for UCI Census dataset age prediction
    • Dataset: UCI Census
    • Task: Predict the age of a person based on their census information
    • To build and run the demo from code: bazel run wit_dashboard/demo:agedemoserver then navigate to http://localhost:6006/wit-dashboard/age_demo.html
    • This demo model returns attribution values in addition to predictions (through the use of vanilla gradients) in order to demonstrate how the tool can display attribution values alongside predictions.

What do I need to use it in a Jupyter or Colab notebook?

You can use the What-If Tool to analyze a classification or regression TensorFlow Estimator that takes TensorFlow Example or SequenceExample protos (data points) as inputs, directly in a Jupyter or Colab notebook.

Additionally, the What-If Tool can analyze AI Platform Prediction-hosted classification or regression models that take TensorFlow Example protos, SequenceExample protos, or raw JSON objects as inputs.

You can also use the What-If Tool with a custom prediction function that takes TensorFlow examples and produces predictions. In this mode, you can load any model (including non-TensorFlow models that don't use Example protos as inputs) as long as your custom function's input and output specifications are correct.

With either AI Platform models or a custom prediction function, the What-If Tool can display and make use of attribution values for each input feature in relation to each prediction. See the below section on attribution values for more information.

If you want to train an ML model from a dataset and explore the dataset and model, check out the What_If_Tool_Notebook_Usage.ipynb notebook in Colab, which starts from a CSV file, converts the data to tf.Example protos, trains a classifier, and then uses the What-If Tool to show the classifier's performance on the data.
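
The conversion step mentioned above (CSV rows into tf.Example protos) amounts to copying each column value into the proto's feature map. Here is a minimal sketch of that idea, assuming a hypothetical my_data.csv; the notebook linked above contains the authoritative version.

import pandas as pd
import tensorflow as tf

def df_to_examples(df):
  # Convert each DataFrame row into a tf.Example proto, choosing the feature
  # type (int64, float, or bytes) from the column dtype.
  examples = []
  for _, row in df.iterrows():
    example = tf.train.Example()
    for col in df.columns:
      if pd.api.types.is_integer_dtype(df[col]):
        example.features.feature[col].int64_list.value.append(int(row[col]))
      elif pd.api.types.is_float_dtype(df[col]):
        example.features.feature[col].float_list.value.append(float(row[col]))
      else:
        example.features.feature[col].bytes_list.value.append(str(row[col]).encode('utf-8'))
    examples.append(example)
  return examples

examples = df_to_examples(pd.read_csv('my_data.csv'))  # hypothetical file name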

What do I need to use it in TensorBoard?

A walkthrough of using the tool in TensorBoard, including a pretrained model and test dataset, can be found on the What-If Tool page on the TensorBoard website.

To use the tool in TensorBoard, only the following information needs to be provided:

  • The model server host and port, served using TensorFlow Serving. The model can use the TensorFlow Serving Classification, Regression, or Predict API.
    • Information on how to create a saved model with the Estimator API that will use the appropriate TensorFlow Serving Classification or Regression APIs can be found in the saved model documentation and in this helpful tutorial. Models that use these APIs are the simplest to use with the What-If Tool, as they require no set-up in the tool beyond setting the model type.
    • If the model uses the Predict API, the input must be serialized tf.Example or tf.SequenceExample protos and the output must be as follows:
      • For classification models, the output must include a 2D float tensor containing a list of class probabilities for all possible class indices for each inferred example.
      • For regression models, the output must include a float tensor containing a single regression score for each inferred example.
    • The What-If Tool queries the served model using the gRPC API, not the RESTful API. See the TensorFlow Serving docker documentation for more information on the two APIs. The docker image uses port 8500 for the gRPC API, so if using the docker approach, the port to specify in the What-If Tool will be 8500.
    • Alternatively, instead of querying a model hosted by TensorFlow Serving, you can provide a python function for model prediction to the tool through the "--whatif-use-unsafe-custom-prediction" runtime argument as described in more detail below.
  • A TFRecord file of tf.Examples or tf.SequenceExamples to perform inference on and the number of examples to load from the file.
    • Can handle up to tens of thousands of examples. The exact amount depends on the size of each example (how many features there are and how large the feature values are).
    • The file must be in the logdir provided to TensorBoard on startup. Alternatively, you can allow loading files from another directory through use of the --whatif-data-dir=PATH runtime parameter. A minimal sketch of writing such a file is shown after this list.
  • An indication of whether the model is a regression, binary classification or multi-class classification model.
  • An optional vocab file for the labels for a classification model. This file maps the predicted class indices returned from the model prediction into class labels. The text file contains one label per line, corresponding to the class indices returned by the model, starting with index 0.
    • If this file is provided, then the dashboard will show the predicted labels for a classification model. If not, it will show the predicted class indices.
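
As a rough illustration of the TFRecord requirement above (see the examples-file bullet), the sketch below serializes a list of tf.train.Example protos into a TFRecord file placed in a hypothetical TensorBoard logdir; the paths are placeholders.

import tensorflow as tf

# Placeholder path inside the logdir passed to TensorBoard (or inside a
# directory supplied via --whatif-data-dir).
output_path = '/tmp/tensorboard_logdir/census_examples.tfrecord'

with tf.io.TFRecordWriter(output_path) as writer:
  for example in examples:  # a list of tf.train.Example protos
    writer.write(example.SerializeToString())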

Alternatively, the What-If Tool can be used to explore a dataset directly from a CSV file. See the next section for details.

The information can be provided in the settings dialog screen, which pops up automatically upon opening this tool and is accessible through the settings icon button in the top-right of the tool. The information can also be provided directly through URL parameters. Changing the settings through the controls automatically updates the URL so that it can be shared with others for them to view the same data in the What-If Tool.

All I have is a dataset. What can I do in TensorBoard? Where do I start?

If you just want to explore the information in a CSV file using the What-If Tool in TensorBoard, set the path to the examples to the file (with a ".csv" extension) and leave the inference address and model name fields blank. The first line of the CSV file must contain column names. Each subsequent line contains one example from the dataset, with values for each of the columns defined on the first line. The pipe character ("|") delimits separate feature values in a list of feature values for a given feature.

In order to make use of the model understanding features of the tool, you can have columns in your dataset that contain the output from an ML model. If your file has a column named "predictions__probabilities" with a pipe-delimited ("|") list of probability scores (between 0 and 1), then the tool will treat those as the output scores of a classification model. If your file has a numeric column named "predictions" then the tool will treat those as the output of a regression model. In this way, the tool can be used to analyze any dataset and the results of any model run offline against the dataset. Note that in this mode, the examples aren't editable as there is no way to get new inference results when an example changes.
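
For illustration, a CSV laid out as described might look like the following (the column names and values are made up); note the pipe-delimited class probabilities in the predictions__probabilities column.

age,education,hours-per-week,label,predictions__probabilities
39,Bachelors,40,<=50K,0.82|0.18
50,Masters,13,<=50K,0.61|0.39
38,HS-grad,45,>50K,0.27|0.73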

What can it do?

Details on the capabilities of the tool, including a guided walkthrough, can be found on the What-If Tool website. Here is a basic rundown of what it can do:

  • Visualize a dataset of TensorFlow Example protos.

    • The main panel shows the dataset using Facets Dive, where the examples can be organized/sliced/positioned/colored by any of the dataset’s features.
      • The examples can also be organized by the results of their inferences.
        • For classification models, this includes inferred label, confidence of inferred label, and inference correctness.
        • For regression models, this includes inferred score and amount of error (including absolute or squared error) in the inference.
    • A selected example can be viewed in detail in the side panel, showing all feature values for all features of that example.
    • For examples that contain an encoded image in a bytes feature named "image/encoded", Facets Dive will create a thumbnail of the image to display the point, and the full-size image is visible in the side panel for a selected example.
    • Aggregate statistics for all loaded examples can be viewed in the side panel using Facets Overview.
  • Visualize the results of the inference

    • By default, examples in the main panel are colored by their inference results.
    • The examples in the main panel can be organized into confusion matrices and other custom layouts to show the inference results faceted by a number of different features, or faceted/positioned by inference result, allowing the creation of small multiples of 1D and 2D histograms and scatter plots.
    • For a selected example, detailed inference results (e.g. predicted classes and their confidence scores) are shown in the side panel.
    • If the model returns attribution values in addition to predictions, they are displayed for each selected example, and the attribution values can be used to control custom layouts and as dimensions to slice the dataset on for performance analysis.
  • Explore counterfactual examples

    • For classification models, for any selected example, with one click you can compare the example to the most similar example that is classified differently.
    • Similarity is calculated based on the distribution of feature values across all loaded examples and similarity can be calculated using either L1 or L2 distance.
      • Distance is normalized between features by:
        • For numeric features, the distance is the difference between values divided by the standard deviation of the values across all examples.
        • For categorical features, the distance is 0 if the values are the same, otherwise the distance is the probability that any two examples have the same value for that feature across all examples.
    • In notebook mode, the tool also allows you to set a custom distance function using set_custom_distance_fn in WitConfigBuilder, in which case that function is used to compute the closest counterfactuals instead. As with custom_predict_fn, the custom distance function can be any Python function (see the sketch after this list).
  • Edit a selected example in the browser and re-run inference and visualize the difference in the inference results.

    • See auto-generated partial dependence plots, which show, for each feature, how the inference results change as that feature's value is changed to different valid values for that feature.
    • Edit/add/remove any feature or feature value in the side panel and re-run inference on the edited datapoint. A history of the inference values of that point as it is edited and re-inferred is shown.
    • For examples that contain encoded images, upload your own image and re-run inference.
    • Clone an existing example for editing/comparison.
    • Revert edits to an edited example.
  • Compare the results of two models on the same input data.

    • If you provide two models to the tool during setup, it will run inference with the provided data on both models and you can compare the results between the two models using all the features defined above.
  • If using a binary classification model and your examples include a feature that describes the true label, you can do the following:

    • See the ROC curve and numeric confusion matrices in the side panel, including the point on the curve where your model lives, given the current positive classification threshold value.
    • See separate ROC curves and numeric confusion matrices split out for subsets of the dataset, sliced on any feature or features of your dataset (e.g. by gender).
    • Manually adjust the positive classification threshold (or thresholds, if slicing the dataset by a feature) and see the difference in inference results, ROC curve position and confusion matrices immediately.
    • Set the positive classification thresholds with one click based on concepts such as the cost of a false positive vs a false negative and satisfying fairness measures such as equality of opportunity or demographic parity.
  • If using a multi-class classification model and your examples include a feature that describes the true label, you can do the following:

    • See a confusion matrix in the side panel for all classifications and all classes.
    • See separate confusion matrices split out for subsets of the dataset, sliced on any feature or features of your dataset.
  • If using a regression model and your examples include a feature that describes the true label, you can do the following:

    • See the mean error, mean absolute error and mean squared error across the dataset.
    • See separate calculations of those mean errors split out for subsets of the dataset, sliced on any feature or features of your dataset.
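
As referenced in the counterfactual bullet above, notebook mode accepts a custom distance function via set_custom_distance_fn. Below is a rough sketch of what such a function might look like; the argument names and structure are assumptions, the feature names are hypothetical, and the exact contract expected by set_custom_distance_fn should be confirmed in the WitConfigBuilder documentation.

import numpy as np

# Hypothetical custom distance: L1 distance over two numeric features.
# The (sample, list-of-samples, params) argument structure is an assumption;
# check the WitConfigBuilder docs for the exact signature.
def custom_distance_fn(example_to_compare, other_examples, params=None):
  def numeric_values(ex):
    return np.array([
        ex.features.feature['age'].int64_list.value[0],
        ex.features.feature['hours-per-week'].int64_list.value[0],
    ], dtype=float)
  base = numeric_values(example_to_compare)
  return [float(np.abs(base - numeric_values(ex)).sum()) for ex in other_examples]

# config_builder.set_custom_distance_fn(custom_distance_fn)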

Who is it for?

We imagine WIT to be useful for a wide variety of users.

  • ML researchers and model developers - Investigate your datasets and models and explore inference results. Poke at the data and model to gain insights, for tasks such as debugging strange results and looking into ML fairness.
  • Non-technical stakeholders - Gain an understanding of the performance of a model on a dataset. Try it out with your own data.
  • Lay users - Learn about machine learning by interactively playing with datasets and models.

Notebook mode details

As seen in the example notebook, creating the WitWidget object is what causes the What-If Tool to be displayed in an output cell. The WitWidget object takes a WitConfigBuilder object as a constructor argument. The WitConfigBuilder object specifies the data and model information that the What-If Tool will use.

The WitConfigBuilder object takes a list of tf.Example or tf.SequenceExample protos as a constructor argument. These protos will be shown in the tool and inferred in the specified model.

The model to be used for inference by the tool can be specified in many ways:

  • As a TensorFlow Estimator object that is provided through the set_estimator_and_feature_spec method. In this case the inference will be done inside the notebook using the provided estimator.
  • As a model hosted by AI Platform Prediction through the set_ai_platform_model method.
  • As a custom prediction function provided through set_custom_predict_fn method. In this case WIT will directly call the function for inference.
  • As an endpoint for a model being served by TensorFlow Serving, through the set_inference_address and set_model_name methods. In this case the inference will be done on the model server specified. To query a model served on host "localhost" on port 8888, named "my_model", you would set on your builder builder.set_inference_address('localhost:8888').set_model_name('my_model').

See the documentation of WitConfigBuilder for all options you can provide, including how to specify other model types (defaults to binary classification) and how to specify an optional second model to compare to the first model.
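
Putting the pieces above together, a minimal notebook-mode setup against a TensorFlow Serving endpoint might look like the sketch below; the host, port, and model name are the placeholder values from the text, and "examples" is a list of tf.Example protos.

from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

# Placeholder serving endpoint and model name, as in the text above.
config_builder = (
    WitConfigBuilder(examples)
    .set_inference_address('localhost:8888')
    .set_model_name('my_model')
    .set_model_type('classification'))

# Creating the WitWidget renders the tool in the output cell.
WitWidget(config_builder, height=800)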

How can the What-If Tool use attribution values and other prediction-time information?

Feature attribution values are numeric values for each input feature to an ML model that indicate how much impact that feature value had on the model's prediction. There are a variety of approaches to get feature attribution values for a prediction from an ML model, including SHAP, Integrated Gradients, SmoothGrad, and more.

They can be a powerful way of analyzing how a model reacts to certain input values, beyond simply studying the effect that changing individual feature values has on model predictions, as is done with partial dependence plots. Some attribution techniques require access to model internals, such as the gradient-based methods, whereas others can be performed on black-box models. Regardless, the What-If Tool can visualize the results of attribution methods in addition to the standard model prediction results.

There are two ways to use the What-If Tool to visualize attribution values. If you have deployed a model to Cloud AI Platform with the explainability feature enabled, and provide this model to the tool through the standard set_ai_platform_model method, then attribution values will automatically be generated and visualized by the tool with no additional setup needed. If you wish to view attribution values for a different model setup, this can be accomplished through use of the custom prediction function.

As described in the set_custom_predict_fn documentation in WitConfigBuilder, this method must return a list of the same size as the number of examples provided to it, with each list entry representing the prediction-time information for that example. In the case of a standard model with no attribution information, the list entry is just a number (in the case of a regression model), or a list of class probabilities (in the case of a classification model).

However, if there is attribution or other prediction-time information, then the list entry can instead be a dictionary, with the standard model prediction output under the predictions key. Attribution information can be returned under the attributions key and any other supplemental information under its own descriptive key. The exact format of the attributions and other supplemental information can be found in the code documentation linked above.
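
As a rough sketch of that dictionary shape, the hypothetical prediction function below returns class probabilities under the predictions key and per-feature attribution values under the attributions key. The attribution values here are random placeholders, and the exact expected format of the attributions entry should be taken from the WitConfigBuilder documentation linked above.

import random

def custom_predict_with_attributions(examples):
  # One dictionary per example: model output under 'predictions',
  # per-feature attribution values (placeholders here) under 'attributions'.
  outputs = []
  for ex in examples:
    score = random.random()
    attributions = {name: random.uniform(-1, 1)
                    for name in ex.features.feature}
    outputs.append({
        'predictions': [score, 1 - score],  # binary classification scores
        'attributions': attributions,
    })
  return outputs

# config_builder.set_custom_predict_fn(custom_predict_with_attributions)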

If attribution values are provided to the What-If Tool, they can be used in a number of ways. First, when selecting a datapoint in the Datapoint Editor tab, the attribution values are displayed next to each feature value and the features can be ordered by their attribution strength instead of alphabetically. Additionally, the feature values are colored by their attribution values for quick interpretation of attribution strengths.

Beyond displaying the attribution values for the selected datapoint, the attribution values for each feature can be used in the tool in the same ways as any other feature of the datapoints. They can be selected in the datapoint visualization controls to create custom scatter plots and histograms. For example, you can create a scatter plot showing the relationship between the attribution values of two different features, with the datapoints colored by the predicted result from the model. They can also be used in the Performance tab as a way to slice a dataset for comparing performance statistics of different slices. For example, you can quickly compare the aggregate performance of the model on datapoints with low attribution of a specified feature against the datapoints with high attribution of that feature.

Any other supplemental information returned from a custom prediction function will appear in the tool as a feature named after its key in the dictionary. It can be used in the same ways, driving custom visualizations and serving as a dimension to slice on when analyzing aggregate model performance.

When a datapoint is edited and then re-inferred through the model with the "Run inference" button, the attributions and other supplemental information are recalculated and updated in the tool.

For an example of returning attribution values from a custom prediction function (in this case using the SHAP library to get attributions), see the WIT COMPAS with SHAP notebook.

How do I enable it for use in a Jupyter notebook?

First, install and enable WIT for Jupyter through the following commands:

pip install witwidget
jupyter nbextension install --py --symlink --sys-prefix witwidget
jupyter nbextension enable --py --sys-prefix witwidget

Then, use it as seen at the bottom of the What_If_Tool_Notebook_Usage.ipynb notebook.

How do I enable it for use in a Colab notebook?

Install the widget into the runtime of the notebook kernel by running a cell containing:

!pip install witwidget

Then, use it as seen at the bottom of the What_If_Tool_Notebook_Usage.ipynb notebook.

How do I enable it for use in a JupyterLab or Cloud AI Platform notebook?

WIT has been tested in JupyterLab versions 1.x, 2.x, and 3.x.

Install and enable WIT for JupyterLab 3.x by running a cell containing:

!pip install witwidget
!jupyter labextension install wit-widget
!jupyter labextension install @jupyter-widgets/jupyterlab-manager

Note that you may need to specify the correct version of jupyterlab-manager for your JupyterLab version as per https://www.npmjs.com/package/@jupyter-widgets/jupyterlab-manager.

Note that you may need to run !sudo jupyter labextension ... commands depending on your notebook setup.

Use of WIT after installation is the same as with the other notebook installations.

Can I use a custom prediction function in TensorBoard?

Yes. You can do this by defining a python function named custom_predict_fn which takes two arguments: a list of examples to perform inference on, and the serving bundle object which contains information about the model to query. The function should return a list of results, one entry per example provided. For regression models, the result is just a number. For classification models, the result is a list of numbers representing the class scores for each possible class. Here is a minimal example that just returns random results:

import random

# The function name "custom_predict_fn" must be exact.
def custom_predict_fn(examples, serving_bundle):
  # "examples" is the list of datapoints to run inference on; each entry holds
  # the features of one example.
  # "serving_bundle" carries the setup information provided to the tool,
  # such as server address, model name, model version, etc.

  number_of_examples = len(examples)
  results = []
  for _ in range(number_of_examples):
    score = random.random()
    results.append([score, 1 - score]) # For binary classification
    # results.append(score) # For regression
  return results

Define this function in a file you save to disk. For this example, let's assume the file is saved as /tmp/my_custom_predict_function.py. Then start the TensorBoard server with tensorboard --whatif-use-unsafe-custom-prediction /tmp/my_custom_predict_function.py and the function will be invoked once you have set up your data and model in the What-If Tool setup dialog. The "unsafe" in the flag name means that the function is not sandboxed, so make sure your function doesn't do anything destructive, such as accidentally deleting your experiment data.
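
For example, assuming your TensorBoard logs live in a placeholder directory /tmp/logdir, the launch command would look like:

tensorboard --logdir /tmp/logdir --whatif-use-unsafe-custom-prediction /tmp/my_custom_predict_function.py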

How can I help develop it?

Check out the development guide.

What's new in WIT?

Check out the release notes.

what-if-tool's People

Contributors

abrandenb, alanfranz, dependabot[bot], florist-notes, jameswex, kevinrobinson, kianmeng, lanpa, mpushkarna, ozen, sandersk, sararob, simonewu, stephanwlee, thedriftofwords, tolga-b, tonysinghmss, tusharsadhwani, vmaudgalya, wchargin


what-if-tool's Issues

Can we use the what-if-tool off-line?

Good morning. Thank you so much for this amazing piece of work. When I saw the presentation I immediately thought of our leaders and how they might benefit from the understanding that this process brings. Due to the nature of the data that we are working with here at the Ministry of Education, we would never be able to use this tool effectively because of the breach in confidentiality. Is there a way to use this tool on the local machine and have it display the results just as it would have done in the cloud?
Thank you again for your time and support.
Best Regards,

Andrei.

Smile detection demo and CelebA datapoint IDs

This is an awesome demo! I spent some time exploring the performance & fairness tabs, and then digging into facets dive, and individual examples. It's really interesting, thanks for putting together such an accessible demo 👍

In the process I found a bunch of data points that were labeled differently than I would have expected. I figured this was errors from the labeling that were part of the CelebA set, but it came up enough times that it led to me investigate further and try to see how many data points appeared to be mislabeled, when compared to my own personal oracle-like labeling truth :)

To debug, I downloaded the CelebA dataset and then started looking at individual data points, assuming the Datapoint ID in WIT would correspond to the image ID and filename in CelebA. This doesn't seem to be the case though, and so I can't figure out how to verify further.

You can pick any number as an example (Datapoint ID 1 is what I started with when trying to compare to CelebA). But for one full example, in facets within WIT, I noticed a datapoint was labeled "Sideburns" in a way I didn't expect looking at the image myself, so I clicked in to see the image and the datapoint ID:
[screenshot]

But then checking in the CelebA set, this is what I see for image 000038.jpg:
[screenshot]

The data in CelebA for 38 in the list_attr_celeba.csv file is also different than the data in WIT for Datapoint ID 38:
[screenshot]

Since there's only 250 examples in the WIT tool, I'm wondering if the full dataset is being sampled for the demo, and then the Datapoint ID values are being mapped to 0-249 and the reference back to the original dataset is lost? That's just a guess though. I see there's a bunch of data in https://github.com/PAIR-code/what-if-tool/tree/master/data/images but not sure how to debug further.

Thanks for sharing this work!

Widget not loading across multiple applications

I am working on several different tools that use the What-If tool for interactive visuals. It seems like I keep running into the same error where I see 'Loading Widget...' and no visual.

Currently, I am working in the JupyterLab Environment of Google Cloud's AI Platform.

I have done the following installations:
pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension
conda install -c conda-forge nodejs
jupyter nbextension install --py --user witwidget
jupyter nbextension enable witwidget --user --py
jupyter labextension install @jupyter-widgets/jupyterlab-manager

And have the following jupyterlab list:

JupyterLab v1.2.16
Known labextensions:
app dir: /opt/conda/share/jupyter/lab
@jupyter-widgets/jupyterlab-manager v1.1.0 enabled OK
@jupyterlab/celltags v0.2.0 enabled OK
@jupyterlab/git v0.10.1 enabled OK
js v0.1.0 enabled OK
jupyter-matplotlib v0.7.2 enabled OK
jupyterlab-plotly v4.8.1 enabled OK
nbdime-jupyterlab v1.0.0 enabled OK
plotlywidget v4.8.1 enabled OK
wit-widget v1.6.0 enabled OK

It would be helpful if you had any insight on why this problem might be occurring!

ReferenceError: configCallback is not defined

https://colab.research.google.com/github/pair-code/what-if-tool/blob/master/WIT_COMPAS.ipynb

In this notebook, running the "Invoke What-If Tool for test data and the trained models" cell gives the following.

MessageErrorTraceback (most recent call last)

<ipython-input-8-18dbcd24366f> in <module>()
     10 config_builder = WitConfigBuilder(examples[0:num_datapoints]).set_estimator_and_feature_spec(
     11     classifier, feature_spec)
---> 12 WitWidget(config_builder, height=tool_height_in_px)

3 frames

/usr/local/lib/python2.7/dist-packages/witwidget/notebook/colab/wit.pyc in __init__(self, config_builder, height, delay_rendering)
    238 
    239     if not delay_rendering:
--> 240       self.render()
    241 
    242     # Increment the static instance WitWidget index counter

/usr/local/lib/python2.7/dist-packages/witwidget/notebook/colab/wit.pyc in render(self)
    252     # Send the provided config and examples to JS
    253     output.eval_js("""configCallback({config})""".format(
--> 254         config=json.dumps(self.config)))
    255     output.eval_js("""updateExamplesCallback({examples})""".format(
    256         examples=json.dumps(self.examples)))

/usr/local/lib/python2.7/dist-packages/google/colab/output/_js.pyc in eval_js(script, ignore_result)
     37   if ignore_result:
     38     return
---> 39   return _message.read_reply_from_input(request_id)
     40 
     41 

/usr/local/lib/python2.7/dist-packages/google/colab/_message.pyc in read_reply_from_input(message_id, timeout_sec)
    104         reply.get('colab_msg_id') == message_id):
    105       if 'error' in reply:
--> 106         raise MessageError(reply['error'])
    107       return reply.get('data', None)
    108 

MessageError: ReferenceError: configCallback is not defined

Incomplete categorical feature list in "Partial Dependence Plots"

WhatIf currently only reads first 50 Datapoints to generate candidates for categorical features to be used in "Partial Dependence Plots". This could be too restrictive. It shall read more data to get a more complete list of categories and choose the most frequent ones for plots.

use my own data

Hi James and Team,

Is it possible to use my own data but still have the same front end?

Can the attribution values be visualized in Tensorboard mode?

I worked with the Notebook mode and was successfully able to project attribution scores (using the Shapley algorithm) onto the WIT dashboard. Due to a bigger data size, I then tried the visualization in the TensorBoard mode. The instructions given on the documentation page mention only two requirements: 1. the ML model in TF Serving format and 2. a TFRecord file of the example dataset. There isn't any mention of generating or uploading attribution values (generated by Google Integrated Gradients or SHAP) in the TensorBoard mode. Please suggest whether it's possible to add attribution values in TensorBoard mode, or am I missing anything?

Add rank correlation when comparing two models

When comparing two models, we could calculate rank correlation (at least for binary classification and regression models). Rank correlation is a number indicating how much the scores from the two models across the test examples line up in terms of order of the scores across those examples.

Need to think about where this info would go though. Would be valuable to calculate on slices as well, when user is slicing in performance tab.

Web demos fail to load in Safari or Firefox

Thanks for sharing this awesome work! 👍

This is what folks see:
[screenshot]

This is what the console outputs:
[console screenshots]

So I'm guessing something about the Bazel config, and that it needs to include a polyfill or some other way to load Polymer code. From poking around a bit I think it's in the config of this build command (to use the smiling dataset as an example): https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/interactive_inference/tf_interactive_inference_dashboard/demo/BUILD#L69

From searching around, I didn't find much info on building Polymer code with Bazel outside the googleplex. And reading that build file, it looks like it's pulling in what I'd expect in https://github.com/tensorflow/tensorboard/blob/master/tensorboard/components/tf_imports/BUILD#L16. And it looks like it pulls in external Polymer artifacts in https://github.com/tensorflow/tensorboard/blob/d3a6cfd6eb5c0fff4a405b23c5361875adf908f0/third_party/polymer.bzl#L1379. Those must be working right, since the Facets demos work fine in other browsers, so I'm guessing it's something about the specific BUILD task within tf_interactive_inference_dashboard/demo/ but not sure what.

Thanks! :)

Clarify connection between fairness optimization choices on left and data on right

Thanks for sharing! This is awesome, and super cool to see tools that let people now do explorations like in https://research.google.com/bigpicture/attacking-discrimination-in-ml/ with their own models or plain CSVs :)

In reading the UI, and in talking through this with other people about what's going on in the fairness optimizations, I found myself marking up screenshots to explain what was going on, like this:

[annotated screenshot]

I thought it might be a helpful improvement to make these connections more explicit and obvious, rather than having to parse the text definitions and map them to the UI and data points on the right. The screenshot above isn't a UI proposal, but I could sketch some options if you're interested in brainstorming. It's particularly hard to see what's being compared when you slice by more than one dimension and the confusion matrix isn't visible, so would be interesting in seeing if there's ways to make it possible to see this across say four slices. If there are other ways to look at this, that'd be awesome to learn about too! There's a lot of information and conceptual density here, so laying it out and staying simple seems like a great design challenge but also super hard :)

Relatedly, if I'm understanding right, for some of the choices the actual metric being optimized isn't visible anywhere at all (putting aside the cost ratio altogether for now). So for single threshold, as an example, I believe the number being optimized is the overall accuracy, the aggregation of these two numbers weighted by the count of examples:

[screenshot]

So in this case I'm trying to see how much the overall accuracy goes down when trying different optimization strategies that will all bring it down as they trade off other goals (e.g., equal opportunity). These questions may just come from me exploring the impact of different parameters to build intuition, but the kind of question I'm trying to ask is "how much worse is the metric that equal opportunity is optimizing for overall, when I choose demographic parity?" and "how much worse is the metric for equal opportunity for each slice when I choose demographic parity?" Not sure if I'm making any sense, but essentially I'm trying to compare how one optimization choice impacts the other optimization choices' metrics.

Thanks!

Data split for binning - Datapoint editor vs. Performance & Fairness

Hi,

we really like using the What-If Tool. Over the last few days we noticed that the split of the data between the datapoint editor and the performance and fairness tabs isn't performed in the same way. As an example, we binned the data of the UCI census income dataset by age into 10 bins. The number of data points in each bin for the datapoint editor and performance & fairness tabs can vary (see figure below).
[screenshot]

For us, it would be extremely helpful if the data in e.g. the first bin of the datapoint editor would be exactly the same as in the first bin of the performance and fairness tab.

Best,
Timo

Handle large images

Currently WIT sends all features to the front-end for all examples. If the examples contain image features, this means we can't load a ton of examples for that model.

Instead, for large features like images, don't send them to the front-end immediately; only send the image feature to the front-end when an example is clicked to view in the datapoint editor.

which trained models can be used?

Hi There,
I am new to using the What-If Tool. I would like to use it to see if my ML model is fair or not; I already have a trained XGBoost model (booster object, saved model). How can I use this model with the What-If Tool, and is this even possible? I notice in your user-modifiable code (the WIT-from scratch.ipynb) that you use classifier = tf.estimator.LinearClassifier.
What if I already have a saved model of the form I mentioned above (XGBoost)? Can I still use the What-If Tool or not? I am concerned that TensorFlow does not support these types of booster-object models. Any help would be appreciated!!!! Thanks!

Use WIT for model trained in tfx

Hi
I trained a model with tfx and it was exported as saved_model.pb.
Now, I want to reload it and visualize it using WIT.
How can I do this?

I couldn't find a way to do it since when reloading the model:
imported = tf.saved_model.load(export_dir=trained_model_path), I get an object of type:
<tensorflow.python.training.tracking.tracking.AutoTrackable at 0x7f3d71e456a0>
instead of an estimator.

Thanks

Calculate/display AUC for PR curve

For classification models, we display a PR curve in the performance tab. We should calculate the area under this curve and display it above the curve, in the title for the chart.

Docs/Meta for Packagers: Packaging info on PyPI, tags for patches

I'm representing TF SIG Build, a TensorFlow special interest group dedicated to building and testing TF in open source. Our last meeting surfaced confusion from community members involved in packaging TF for other environments (e.g. Gentoo, Anaconda) about tensorboard-plugin-wit, which I think could be resolved with these two asks:

  • Could the PyPI page be updated to answer these questions? Our packagers only know about WIT from the indirect tensorflow -> tensorboard -> tensorboard-plugin-wit dependency, which points to an empty PyPI page.
    • What does the plugin do? (e.g. a short description and links to the canonical WIT site)
    • Why does tensorboard depend on it? (e.g. "it was once part of core tensorboard but was moved to a plugin")
    • Where is the source code, and how is it built? (i.e. this repo)
  • Could we have assurance that future PyPI releases match up with a tag from a Git source? In this case, the 1.6.0post* patch releases lack a matching tag in this repo. For packagers, a tag for each release means they can rebuild the package in the necessary configuration for their platform, and helps verify that the package on PyPI really matches up with the code.

These would help a lot!

Issues in converting to correct json format.

We are not able to convert our custom dataset into the correct format that the What-If Tool requires.
Sample data that we are trying to convert is below.

"ID","age","workclass","fnlwgt","education","education-num","marital-status","occupation","relationship","race","sex","capital-gain","capital-loss","hours-per-week","native-country","result"
19122,42," Federal-gov",178470," HS-grad",9," Divorced"," Adm-clerical"," Not-in-family"," White"," Female",0,0,40," United-States"," <=50K"
20798,49," Federal-gov",115784," Some-college",10," Married-civ-spouse"," Craft-repair"," Husband"," White"," Male",0,0,40," United-States"," <=50K"
32472,34," Private",30673," Masters",14," Married-civ-spouse"," Prof-specialty"," Husband"," White"," Male",0,0,55," United-States"," <=50K"
21476,29," Private",157612," Bachelors",13," Never-married"," Prof-specialty"," Not-in-family"," White"," Female",14344,0,40," United-States"," >50K"
24836,30," Private",175931," HS-grad",9," Married-civ-spouse"," Craft-repair"," Husband"," White"," Male",0,0,40," United-States"," <=50K"
5285,31," Self-emp-inc",236415," Some-college",10," Married-civ-spouse"," Adm-clerical"," Wife"," White"," Female",0,0,20," United-States"," >50K"

This is the json format that i currently have.
[{ "ID": 19122, "age": 42, "capital-gain": 0, "capital-loss": 0, "education": " HS-grad", "education-num": 9, "fnlwgt": 178470, "hours-per-week": 40, "marital-status": " Divorced", "native-country": " United-States", "occupation": " Adm-clerical", "race": " White", "relationship": " Not-in-family", "result": " <=50K", "sex": " Female", "workclass": " Federal-gov" }]

What do we need to do to generate the JSON in the format that is required by the What-If Tool?

Installing error: cannot import "ensure_str" from "six"

Hi, I encounter an error while executing jupyter nbextension install --py --symlink --sys-prefix witwidget:
Traceback (most recent call last):
  File "/usr/bin/jupyter-nbextension", line 11, in <module>
    load_entry_point('notebook==5.2.2', 'console_scripts', 'jupyter-nbextension')()
  File "/usr/lib/python3/dist-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/usr/lib/python3/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 988, in start
    super(NBExtensionApp, self).start()
  File "/usr/lib/python3/dist-packages/jupyter_core/application.py", line 255, in start
    self.subapp.start()
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 716, in start
    self.install_extensions()
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 695, in install_extensions
    **kwargs
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 211, in install_nbextension_python
    m, nbexts = _get_nbextension_metadata(module)
  File "/usr/lib/python3/dist-packages/notebook/nbextensions.py", line 1122, in _get_nbextension_metadata
    m = import_item(module)
  File "/usr/lib/python3/dist-packages/traitlets/utils/importstring.py", line 42, in import_item
    return __import__(parts[0])
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/__init__.py", line 15, in <module>
    from witwidget.notebook.visualization import *
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/notebook/visualization.py", line 27, in <module>
    from witwidget.notebook.jupyter.wit import *  # pylint: disable=wildcard-import,g-import-not-at-top
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/notebook/jupyter/wit.py", line 25, in <module>
    from witwidget.notebook import base
  File "/home/linin/.local/lib/python3.6/site-packages/witwidget/notebook/base.py", line 26, in <module>
    from six import ensure_str
ImportError: cannot import name 'ensure_str'

But I can import ensure_str in both my Python 2 and Python 3 environments, so where could this be going wrong?
Thanks a lot.

WIT missing functionality with TFX exports

Similar to issue #37 I would like to use the WIT with a TFX pipeline. I am trying this out with the Iris/Native Keras example from TFX (https://github.com/tensorflow/tfx/blob/master/tfx/examples/iris/iris_pipeline_native_keras.py). I have tried both set_custom_predict_fn and set_estimator_and_feature_spec. Both allow me to load the WIT but the Predict button cannot be used. In the set_custom_predict_fn case, the WIT gives me the error "AttributeError("'list' object has no attribute 'SerializeToString'",)". In the set_estimator_and_feature_spec case, the WIT gives me the error "AttributeError("'str' object has no attribute 'predict'",)".

Here's the code in a Colab: https://colab.research.google.com/drive/1tfUZ4MLT2Ynj8LNeghOUnL7iBstEQTgv

Which is the correct way to use the WitConfigBuilder with a TFX model, and how do I correct the error?

Thanks!

Getting a full stacktrace for errors

Hey there,

I've used the WIT in the past and am now coming back to it for a new project. I'm trying to use the Jupyter integrated widget with the inference done through a

  • TF serving instance
  • In the tensorflow/serving docker container (2.1.0)
  • Running a servable created from a Keras model
    • Using tf.model.save
  • On another machine
  • Talking via gRPC

I'm getting a fairly unhelpful error whenever I try to run inference (when it starts up and when I click that Infer button):

TypeError('None has type NoneType, but expected one of: bytes, unicode',)

The configuration I've setup is as follows

wit_config = (
    witwidget.WitConfigBuilder(pred_df_ex)
    .set_inference_address('<host_redacted>:8500')
    .set_model_name('fts_test')
    .set_uses_predict_api(True)
    .set_predict_output_tensor('outputs')
    .set_model_type('classification')
    #.set_predict_input_tensor('inputs')
    .set_target_feature('label')
    .set_label_vocab(['No','Yes'])
)

I'm able to rule out that it's not getting a response from the server, since if I mess with the configuration to make it intentionally broken (e.g. if I uncomment that input tensor line) I'll get an error that probably could only come from the server.

<_Rendezvous of RPC that terminated with: status = StatusCode.INVALID_ARGUMENT details = "input size does not match signature: 1!=11 len({inputs}) != 
len({<feature_names_redacted>}). Sent extra: {inputs}. Missing but required: {<feature_names_redacted>}." debug_error_string = 
"{"created":"@1585001946.586897426","description":"Error received from peer ipv4:<ipaddr_redacted>:8500","file":"src/core/lib/surface
/call.cc","file_line":1052,"grpc_message":"input size does not match signature: 1!=11 
len({inputs}) != len({<feature_names_redacted>}). Sent extra: {inputs}. Missing but required: {<feature_names_redacted>}.","grpc_status":3}" >

I've verified that I can talk to the host no problem from this machine as well.

I'm not sure, honestly, whether or not that's an accurate assessment however -- what I'd need is the full stacktrace of that error. Any help would be appreciated.

Style the global attributions table

For both classification and regression models, we now have a global attributions table in the performance tab, if the model returns attributions along with predictions.

We need to appropriately style and position this table for both model types and for one or two models.

Jupyter Lab 2.x support

The witwidget does not seem to work in Jupyter Lab 2.x versions. Even before running witWidget, I get the error:
[error screenshot]

Whenever I use witwidget with Jupyter Lab 2.x, I get the error:
[error screenshot]

But when I use JupyterLab 1.x, everything works fine. I guess the witwidget has not been ported to JupyterLab 2.x. The JupyterLab docs contain an extension migration guide, which may help in updating the widget to JupyterLab 2.x.

Automatic slice detection

In performance tab, would be nice to have a button to (in the background) calculate slices with the largest performance disparities and surface those to the user for them to explore.

Currently users have to check slices one by one to look at their performance disparities.

Need to do this in an efficient manner as with intersectional slices this becomes quadratic in scale.

Multi-point editing

Would be nice to have a mode where one can click on a feature in the datapoint editor and edit it, and have that edit take effect for ALL datapoints, not just a single one.

Need to think about proper UI for that experience.

WIT Error in Dashboard (Using Jupyter Extention)

Age Demo from WIT (from here: https://colab.research.google.com/github/pair-code/what-if-tool/blob/master/WIT_Age_Regression.ipynb)

current behavior:
On executing the command- WitWidget(config_builder, height=tool_height_in_px)
I encountered the below error above the WIT extension dashboard:

  `Cannot set tensorflow.serving.Regression.value to array([21.733267], dtype=float32): array([21.733267], dtype=float32) has type <class 'numpy.ndarray'>, but expected one of: numbers.Real`

Problem: Performance tab has no output in WIT dashboard
Browser: Chrome

Expected Output:

Performance tab: Should work with all features of graphs and plots

Please see what the issue is. I have tried to debug the WitWidget function but have been unable to overcome this error.
Thanks.
[screenshot]

Problems replicating web demo using TensorFlow Serving on Docker

Hi,
I was trying to run the toy CelebA model using TF Serving and I couldn't connect to the model.

Here is the command:

 docker run -p 8500:8500 --mount type=bind,source='/Users/user/Downloads/wit_testing',target=/models/wit_testing -e MODEL_NAME=wit_testing -it tensorflow/serving

I ran TensorBoard on localhost:6006 and then I configured the WIT tool as follows:
[configuration screenshot]

I serialized the model to a ProtoBuf as instructed, serialized data to a TFRecords file but:

  1. The photos on the right do not display in the same way they are displayed in the Web Demo, there are just dots representing datapoints.

  2. Whenever I try to run an inference call to the model, I get bad request error (500). The error I am getting:

grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "Expects arg[0] to be float but string is provided"
        debug_error_string = "{"created":"@1560888632.009402000","description":"Error received from peer ipv6:[::1]:8500","file":"src/core/lib/surface/call.cc","file_line":1041,"grpc_message":"Expects arg[0] to be float but string is provided","grpc_status":3}"

Is there some convention in naming and paths here that I am missing? Or am I doing something else completely wrong?

How to use what-it-tool in front end?

Hi James and Team,

config_builder = (WitConfigBuilder(test_examples.tolist(), X_test.columns.tolist() + ["total_time"])
.set_custom_predict_fn(adjust_prediction)
.set_target_feature('total_time')
.set_model_type('regression'))
WitWidget(config_builder, height=800)

Can we return HTML or something else from the above code that we can render on the frontend? I want to use this in one of my applications.

Just like in Plotly, where we can return the HTML or open the plot in a new web page.

colab notebook runtime disconnects when numdata_points = 20000

I started from the demo income classification notebook, using the linear regressor. I am using my own dataset to predict insurance payments using about 12 features. My dataset has over 30,000 points; the What-If Tool disconnects when I set the number of data points to 20000. WIT works up to about 15,000 numdata_points.
This is not a timeout problem; the code runs for under 5 minutes. Is there a limit on the number of data points that WitConfigBuilder can handle in Colab?
A response would be much appreciated.

'gcloud beta ai-platform explain' giving errors with 3d input array for LSTM model

The following post is not exactly an error with WIT, but I'm having issues with the output from Google's explain command, which acts as input for the WIT tool. Please help, if possible.

I have a 3d input keras model which trains successfully.

The model summary:

Model: "model"
Layer (type) Output Shape Param #

input_1 (InputLayer) [(None, 5, 1815)] 0
bidirectional (Bidirectional (None, 5, 64) 473088
bidirectional_1 (Bidirection (None, 5, 64) 24832
output (TimeDistributed) (None, 5, 25) 1625
Total params: 499,545
Trainable params: 499,545
Non-trainable params: 0

Post that estimator is defined and the serving is created as:

Convert our Keras model to an estimator

keras_estimator = tf.keras.estimator.model_to_estimator(keras_model=model, model_dir='export')

We need this serving input function to export our model in the next cell

serving_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
{'input_1': model.input}
)

export the model to bucket

export_path = keras_estimator.export_saved_model(
'gs://' + BUCKET_NAME + '/explanations',
serving_input_receiver_fn=serving_fn
).decode('utf-8')
print(export_path)

The explanation metadata definition is defined and copied to required destination as below:
explanation_metadata = {
    "inputs": {
        "data": {
            "input_tensor_name": "input_1:0",
            "input_baselines": [np.mean(data_X, axis=0).tolist()],
            "encoding": "bag_of_features",
            "index_feature_mapping": feature_X.tolist()
        }
    },
    "outputs": {
        "duration": {
            "output_tensor_name": "output/Reshape_1:0"
        }
    },
    "framework": "tensorflow"
}

Write the json to a local file

with open('explanation_metadata.json', 'w') as output_file:
json.dump(explanation_metadata, output_file)
!gsutil cp explanation_metadata.json $export_path

Post that the model is created and the version is defined as:

Create the model if it doesn't exist yet (you only need to run this once)

!gcloud ai-platform models create $MODEL --enable-logging --regions=us-central1

Create the version with gcloud

explain_method = 'integrated-gradients'
!gcloud beta ai-platform versions create $VERSION \
--model $MODEL \
--origin $export_path \
--runtime-version 1.15 \
--framework TENSORFLOW \
--python-version 3.7 \
--machine-type n1-standard-4 \
--explanation-method $explain_method \
--num-integral-steps 25

Everything works fine until this step, but now when I create and send the explain request as:

prediction_json = {'input_1': data_X[:5].tolist()}
with open('diag-data.json', 'w') as outfile:
json.dump(prediction_json, outfile)

Send the request to google cloud

!gcloud beta ai-platform explain --model $MODEL --json-instances='diag-data.json'

I get the following error

{
"error": "Explainability failed with exception: <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.INVALID_ARGUMENT\n\tdetails = "transpose expects a vector of size 4. But input(1) is a vector of size 3\n\t [[{{node bidirectional/forward_lstm_1/transpose}}]]"\n\tdebug_error_string = "{"created":"@1586068796.692241013","description":"Error received from peer ipv4:10.7.252.78:8500","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"transpose expects a vector of size 4. But input(1) is a vector of size 3\n\t [[{{node bidirectional/forward_lstm_1/transpose}}]]","grpc_status":3}"\n>"
}

I tried altering the input shape, but nothing worked. Then to verify the format I tried with google cloud predict command which initially did not work, but worked after reshaping the input as:

prediction_json = {'input_1': data_X[:5].reshape(-1,1815).tolist()}
with open('diag-data.json', 'w') as outfile:
json.dump(prediction_json, outfile)

send the predict request

!gcloud beta ai-platform predict --model $MODEL --json-instances='diag-data.json'

I'm at a dead end now with !gcloud beta ai-platform explain --model $MODEL --json-instances='diag-data.json' and looking for the much needed help from SO community.

Also, for ease of experimenting, the notebook could be accessed from google_explain_notebook

Add sort by variation documentation

In the partial dependence plot view, there is a sort by variation button to sort features by how much their partial dependence plots vary (total Y axis distance traveled across the chart). Also, if comparing two models, each feature is ranked by its largest Y axis distance traveled across the two models' PD plots for that feature.

This information should be displayed in an information popup next to the sort button, like we have with other non-obvious buttons/controls.

Matching format for multi input keras model with tfrecord

I am new to TensorFlow and I'm quite confused about the format WIT expects from the TFRecords used to feed the model.
Basically I have a multi-input Keras model with the following signature:

The given SavedModel SignatureDef contains the following input(s):
  inputs['byte_entropy'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 256)
      name: serving_default_byte_entropy:0
  inputs['data_directories'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 30)
      name: serving_default_data_directories:0
  inputs['exports'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 128)
      name: serving_default_exports:0
  inputs['general'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 10)
      name: serving_default_general:0
  inputs['header'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 62)
      name: serving_default_header:0
  inputs['histogram'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 256)
      name: serving_default_histogram:0
  inputs['imports'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1280)
      name: serving_default_imports:0
  inputs['section'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 255)
      name: serving_default_section:0
  inputs['strings'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 104)
      name: serving_default_strings:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['final_output'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

And some TFRecords (records of Examples) with the same features:

feature = {'histogram': FixedLenFeature(shape=[256], dtype=tf.float32, default_value=None),
           'byte_entropy': FixedLenFeature(shape=[256], dtype=tf.float32, default_value=None),
           'strings': FixedLenFeature(shape=[104], dtype=tf.float32, default_value=None),
           'general': FixedLenFeature(shape=[10], dtype=tf.float32, default_value=None),
           'header': FixedLenFeature(shape=[62], dtype=tf.float32, default_value=None),
           'section': FixedLenFeature(shape=[255], dtype=tf.float32, default_value=None),
           'imports': FixedLenFeature(shape=[1280], dtype=tf.float32, default_value=None),
           'exports': FixedLenFeature(shape=[128], dtype=tf.float32, default_value=None),
           'data_directories': FixedLenFeature(shape=[30], dtype=tf.float32, default_value=None),
           'final_output': FixedLenFeature(shape=(), dtype=tf.float32, default_value=None)}

When trying to use it as a regression model, I get an invalid argument error:

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "input size does not match signature: 1!=9 len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}) != len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}). Sent extra: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}. Missing but required: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}."
        debug_error_string = "{"created":"@1588279350.370000000","description":"Error received from peer ipv6:[::1]:8500","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"input size does not match signature: 1!=9 len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}) != len({byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}). Sent extra: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}. Missing but required: {byte_entropy,data_directories,exports,general,header,histogram,imports,section,strings}.","grpc_status":3}"

Where or how should I specify which TFRecord feature goes to which input of the model?

And if you are kind, do you know how I might alter the model prediction to make it suitable for classification? At the moment the prediction is a single number from zero to one, and to suit WIT it should be an array with two probabilities. I know this is not strictly a question about WIT.
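
One possible direction, sketched below under assumptions (the Keras model is loaded as model, and WIT is driven through WitConfigBuilder.set_custom_predict_fn rather than TF Serving), is to parse each tf.Example into the nine named inputs yourself and expand the sigmoid output into two class probabilities; this is just one way to bridge the formats, not a prescribed recipe:

import numpy as np

# Assumed: `model` is the loaded multi-input Keras model; feature names and
# sizes are taken from the feature spec above.
FEATURE_SIZES = {
    'byte_entropy': 256, 'data_directories': 30, 'exports': 128,
    'general': 10, 'header': 62, 'histogram': 256,
    'imports': 1280, 'section': 255, 'strings': 104,
}

def custom_predict_fn(examples):
    # Build one batch array per named model input from the tf.Example protos.
    batch = {name: [] for name in FEATURE_SIZES}
    for ex in examples:
        for name, size in FEATURE_SIZES.items():
            values = ex.features.feature[name].float_list.value
            batch[name].append(np.array(values, dtype=np.float32).reshape(size))
    inputs = {name: np.stack(vals) for name, vals in batch.items()}
    # Sigmoid output in [0, 1]; expand to [P(class 0), P(class 1)] for WIT.
    p = model.predict(inputs).reshape(-1)
    return np.stack([1.0 - p, p], axis=1).tolist()

# config_builder = WitConfigBuilder(examples).set_custom_predict_fn(custom_predict_fn)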

Nothing shows up in my jupyter notebook

I can successfully run the COMPAS Recidivism Classifier demo script in a Jupyter notebook on my local machine, but nothing shows up at the end of the notebook. I assume the visualization interface should appear after I run WitWidget(config_builder, height=tool_height_in_px).

And I have installed all the extensions at the beginning of the Jupyter notebook (Python 3).

! jupyter nbextension install --py --symlink --sys-prefix witwidget
! jupyter nbextension enable --py --sys-prefix witwidget
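
For comparison, a minimal sketch of the usual notebook flow (examples, classifier, and feature_spec are placeholders for whatever the COMPAS demo builds earlier); note that, as with other ipywidgets, the widget typically renders only when it is the last expression in a cell or is passed to display():

from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

# Placeholders: `examples`, `classifier`, and `feature_spec` stand in for the
# objects the COMPAS demo notebook builds earlier.
config_builder = WitConfigBuilder(examples).set_estimator_and_feature_spec(
    classifier, feature_spec)

# Rendered only if this is the cell's last expression (or wrapped in display()).
WitWidget(config_builder, height=800)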

Release tags

Could you start tagging the releases here on GitHub?

Create a feedback module

Add a control for users to send feedback about the tool to the WIT team.

Investigate how to best do this. Could just be a bug/feedback button that links to a new github issue as the simplest approach.

Can't find wheel file in the package folder

I'm trying to add some functionality of my own to the What-If Tool dashboard. I followed https://github.com/PAIR-code/what-if-tool/blob/master/DEVELOPMENT.md to set up and rebuild the package.
Because I just use a Jupyter notebook to play around with the What-If Tool, I follow these steps:

a. rm -rf /tmp/wit-pip (if it already exists)
b. bazel run witwidget/pip_package:build_pip_package
c. Install the package
For use in Jupyter notebooks, install and enable the locally-built pip package per the instructions in the README, but instead use pip install <pathToBuiltPipPackageWhlFile>, then launch the Jupyter notebook kernel.

But after I run step b, I don't know where <pathToBuiltPipPackageWhlFile> is. I can't find any files ending with .whl in the folder. Maybe I missed something. I'd appreciate any help.

And there is a WARNING when I run step b. I'm not sure if it causes the problem:

WARNING: Download from https://mirror.bazel.build/repo1.maven.org/maven2/com/google/javascript/closure-compiler-unshaded/v20190909/closure-compiler-unshaded-v20190909.jar failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
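
In case it helps, one quick way to search for the built wheel (a sketch assuming the /tmp/wit-pip staging directory from step a; the exact output location may differ):

import glob

# Look for any wheel staged under /tmp/wit-pip (the directory from step a);
# the exact path depends on how the build lays things out.
print(glob.glob('/tmp/wit-pip/**/*.whl', recursive=True))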

Local development setup, adjusting Closure compiler compilation level

Hello! When doing development, what's a good workflow? I'll share what I discovered and hope that this helps other folks new to the repo, or perhaps they can help me understand better ways to approach this. The TL;DR is that compilation_level = "BUNDLE" seems useful for development :)

I started with the web demo for the smiling classifier, and it seems that with the way the build works, changes to the filesystem aren't detected, so you have to kill the process and re-run the full vulcanize step on each change. This takes about a minute on my laptop, which is what prompted me to look into this.

In the Bazel output I see:

$ bazel run wit_dashboard/demo:imagedemoserver
INFO: Analyzed target //wit_dashboard/demo:imagedemoserver (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
INFO: From Vulcanizing /wit-dashboard/image_index.html:
...

The vulcanizing step takes about a full minute to rebuild when changing just the text in a line of logging. To understand what it's doing, I read through the BUILD files and the related .bzl definitions over in TensorBoard. I noticed that there are references to other libraries in https://github.com/PAIR-code/what-if-tool/blob/master/wit_dashboard/demo/wit-image-demo.html#L19, and figured maybe that's why the build is taking so long.

Removing those cut the build time down to ~45 seconds.
Removing wit-dashboard cut the build down to ~15 seconds.
Removing everything but just the Polymer bits brought the build down to ~2 seconds.

Stepping back, I see that the wit-dashboard includes dependencies in the BUILD file from Polymer, Facets, and TensorBoard (as well as WIT components). If I comment out the WIT dependencies from the BUILD file and from the <link /> tags in wit-dashboard.html, this still takes ~40 seconds to build. So it seems to me like most of the build time, even on just changing text in a console.log statement is from either re-compiling dependencies, or from whole-program optimization the Vulcanize task is doing (or maybe that Closure compiler is doing on its behalf).

I tried copying the vulcanize.bzl from TensorBoard into the WIT folder so I could look at it and understand what it's doing. But in the process, I noticed some params in the BUILD task that ultimately does the vulcanizing:

tensorboard_html_binary(
    name = "imagedemoserver",
    testonly = True,  # Keeps JavaScript somewhat readable
    compile = True,  # Run Closure Compiler
    input_path = "/wit-dashboard/image_index.html",
    output_path = "/wit-dashboard/image_demo.html",
    deps = [":imagedemo"],
)

Changing compile = False cuts the build to 2 seconds! But it doesn't work because somewhere in the project there are goog.require style dependencies.

Changing the compilation_level helps though! I found these options in the Closure compiler, and luckily the build task in TensorBoard that calls Closure passes those right along. This gets things working again and down to ~20 seconds. The Closure Bazel defs say to use WHITESPACE_ONLY but that it will disable type checking (https://github.com/bazelbuild/rules_closure/blob/4925e6228e89e3b051a89ff57b8a033fa3fb9544/README.md#arguments-2). This helps (~10 seconds) but breaks the app. The Closure docs don't mention BUNDLE but you can see it in the source:

public enum CompilationLevel {
  /** BUNDLE Simply orders and concatenates files to the output. */
  BUNDLE,

And using this takes like half the time to build as compared to SIMPLE_OPTIMIZATIONS.

In the end, this is the impact on my sweet 2012 MacBook Pro:

# master, after just changing a logging string
$ bazel run wit_dashboard/demo:imagedemoserver
INFO: Elapsed time: 53.611s, Critical Path: 52.66s

# set compilation_level = "BUNDLE" instead of default ("ADVANCED")
$ bazel run wit_dashboard/demo:imagedemoserver
INFO: Elapsed time: 17.940s, Critical Path: 17.45s

So, I'll do this locally now, but would also love to learn if there are better ways to do this :)

Alternately, I also poked around to see if there was a way to update these calls to listen to a command-line arg or env variable passed through bazel run. I skimmed the Bazel docs and issues and saw things like aspects and bazelrc, but nothing seemed fast and direct. I suppose this could be done in TensorBoard in the tensorboard_html_binary task. But I also discovered that there are WORKSPACE and workspace.bzl tasks here, so maybe that could be a place to add a layer of indirection: the project could call into a tensorboard_html_binary_wrapper that reads some env switch, building for production by default, but if you do bazel run whatev --be-faster then it can run the Closure compiler without the slower advanced optimizations. If doing something like that is helpful I can try, but attempting changes to the build setup is always dicey :)

If that's too much of a pain I can just add a note to https://github.com/PAIR-code/what-if-tool/blob/master/DEVELOPMENT.md to help folks discover how to speed up local builds. Thanks!

EDIT: Also noticed that a long time ago @stephanwlee was thinking about this upstream tensorflow/tensorboard#1599 and some other open issues reference related things about advanced compilation mode in dev (eg, tensorflow/tensorboard#2687)

Clarifying how selection updates UI for counterfactuals

hi, thanks for sharing all your awesome work! 👍

I was exploring the UCI dataset on the web demo while reading the paper, and it looks to me like there might be a bug in how the UI state updates to color which elements of the counterfactual are different. Alternately, it might be that I'm just misunderstanding the UX :)

I'm expecting that when I look at a data point, the attributes of the counterfactual that are different will be shown in green, like the "occupation" and "relationship" values here:

[screenshot: counterfactual with "occupation" and "relationship" highlighted in green]

To reproduce:

  1. Bin the x-axis by occupation:

     [screenshot: x-axis binned by occupation]

  2. Zoom into "Transport-moving" at the bottom and click on the lowest ">50k" data point, colored red and highlighted here:

     [screenshot: selected ">50k" data point]

  3. Enable showing counterfactuals.
     Notice that "occupation" and "relationship" are highlighted in green, which is in line with what I'd expect since they're different:

     [screenshot: counterfactual with "occupation" and "relationship" in green]

  4. Click on the highest "<50k" data point, colored blue and highlighted here:

     [screenshot: selected "<50k" data point]

  5. Check out the counterfactual.
     It looks like some attributes that are the same are highlighted in green, which is not what I would expect. In this screenshot, I'd expect "occupation" to be green but "relationship" to be standard black text.

     [screenshot: counterfactual with unexpected green highlighting]

Note that the highlighting behavior is different if you clear the selection, and then click on the data point in step 4 directly. That shows these attributes highlighted, as I'd expected:

[screenshot: expected highlighting after selecting the data point directly]

So this might be a UX misunderstanding, and maybe I'm not understanding how the counterfactual computation is supposed to interact with the selection. But since the behavior is different depending on the order of doing this, I suspect it's a UI bug with updating in response to state changes. I poked around a bit, and it seemed like around here is where the syncing between selection interactions, changing the values, and rendering the highlight color happens.
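
To make the expected behavior concrete, here is a toy sketch (not WIT's actual code) of the comparison I'd expect to drive the green highlighting:

# Toy illustration: an attribute should be highlighted only when the selected
# datapoint and its counterfactual disagree on that attribute's value.
def differing_attributes(selected, counterfactual):
    return {k for k in selected if selected.get(k) != counterfactual.get(k)}

selected = {'occupation': 'Transport-moving', 'relationship': 'Husband'}
counterfactual = {'occupation': 'Craft-repair', 'relationship': 'Husband'}
print(differing_attributes(selected, counterfactual))  # {'occupation'}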

Thanks! Let me know if there's anything else I can provide that's helpful for debugging.

Problems replicating the COMPAS web demo using TensorFlow Serving on Docker

Hi,

I am new to TensorFlow and WIT and I'm not even sure I should be posting this here, but I am trying to replicate the COMPAS demo using TensorFlow Serving on Docker and I get the following error:

Request for model inference failed: RequestNetworkError: RequestNetworkError: 500 at /data/plugin/whatif/infer?inference_address.

I am using the following docker command:

docker run -p 8500:8500 --mount type=bind,source="C:\Users\arancha.abad\Importar_modelos\versiones",target=/models/saved_model -e MODEL_NAME=saved_model -t tensorflow/serving

and it seems to work properly, but when I open WIT in TensorBoard, the only thing I can see is everything related to the .tfrecord file. I can see the datapoints and edit them, and I can also go to Features and see every histogram, but I can't run inference, and when WIT is opened, the error described above is displayed.

I am using TensorFlow 2.2 (rc) and TensorBoard 2.1.1, and this is how I export the COMPAS model:
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
export_path = classifier.export_saved_model(export_path, serving_input_fn)

I get the saved_model.pb file and the variables folder. If I use saved_model_cli to show the model, I get the following:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['classification']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_example_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['classes'] tensor_info:
        dtype: DT_STRING
        shape: (-1, 2)
        name: head/Tile:0
    outputs['scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 2)
        name: head/predictions/probabilities:0
  Method name is: tensorflow/serving/classify

signature_def['predict']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['examples'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_example_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['all_class_ids'] tensor_info:
        dtype: DT_INT32
        shape: (-1, 2)
        name: head/predictions/Tile:0
    outputs['all_classes'] tensor_info:
        dtype: DT_STRING
        shape: (-1, 2)
        name: head/predictions/Tile_1:0
    outputs['class_ids'] tensor_info:
        dtype: DT_INT64
        shape: (-1, 1)
        name: head/predictions/ExpandDims:0
    outputs['classes'] tensor_info:
        dtype: DT_STRING
        shape: (-1, 1)
        name: head/predictions/str_classes:0
    outputs['logistic'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: head/predictions/logistic:0
    outputs['logits'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: linear/linear_model/linear/linear_model/linear/linear_model/weighted_sum:0
    outputs['probabilities'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 2)
        name: head/predictions/probabilities:0
  Method name is: tensorflow/serving/predict

signature_def['regression']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_example_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['outputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: head/predictions/logistic:0
  Method name is: tensorflow/serving/regress

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_example_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['classes'] tensor_info:
        dtype: DT_STRING
        shape: (-1, 2)
        name: head/Tile:0
    outputs['scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 2)
        name: head/predictions/probabilities:0
  Method name is: tensorflow/serving/classify

The model has the signatures predict, classification, regression, and serving_default, so everything seems to be fine. Right now I don't know what else I should do to make it work; maybe my mistake is in the way I create the serving_input_fn, or something else, so any help would be appreciated.
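
One thing that might help narrow this down (a sketch under assumptions: the tensorflow-serving-api package is installed, the container from the docker command above is listening on port 8500 with model name saved_model, and example is a tf.train.Example built from the same feature_spec used at export time) is to query the served model directly over gRPC, bypassing WIT, to check whether serving itself can run inference:

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Assumptions: TF Serving from the docker command above is on localhost:8500
# and serves the model under the name 'saved_model'; `example` is a
# tf.train.Example with the same features used at export time.
channel = grpc.insecure_channel('localhost:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'saved_model'
request.model_spec.signature_name = 'predict'
request.inputs['examples'].CopyFrom(
    tf.make_tensor_proto([example.SerializeToString()]))

print(stub.Predict(request, 30.0))  # 30-second timeout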

Thank you for your help!

Incomplete categorical feature list in Partial Dependence Plots

WIT currently reads only the first 50 datapoints to generate candidate values for categorical features used in partial dependence plots. This can be too restrictive. It should read more data to get a more complete list of categories and choose the most frequent ones for the plots.
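
For illustration, a sketch (not the tool's actual code) of picking the most frequent values of a categorical feature from a larger sample of tf.Example datapoints:

from collections import Counter

# Illustrative only: scan more datapoints than the current 50 and keep the
# most frequent values of a categorical (string) feature as PD plot candidates.
def top_categories(examples, feature_name, sample_size=1000, num_candidates=10):
    counts = Counter()
    for ex in examples[:sample_size]:
        values = ex.features.feature[feature_name].bytes_list.value
        counts.update(v.decode('utf-8') for v in values)
    return [value for value, _ in counts.most_common(num_candidates)]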

Unable to build the repository due to bazel version

As per package.json, this project is using bazel 0.23.2.

"@bazel/bazel": "^0.23.2",

However, in WORKSPACE file, it requires bazel 0.26.1.

versions.check(minimum_bazel_version = "0.26.1")

I tried yarn add @bazel/bazel@0.26.1; the build can start but always fails at some bazel rules package, with errors like error loading package 'node_modules/@schematics/update/node_modules/rxjs/src/operators': Unable to find package for @build_bazel_rules_typescript//:defs.bzl: The repository '@build_bazel_rules_typescript' could not be resolved.
or ERROR: error loading package '': in .../org_tensorflow_tensorboard/third_party/workspace.bzl: in .../npm_bazel_typescript/index.bzl: in .../npm_bazel_typescript/internal/ts_repositories.bzl: Unable to load file '@build_bazel_rules_nodejs//:index.bzl': file doesn't exist (when I tried to upgrade the @bazel/typescript package to latest).

What are the correct versions of bazel, bazel rules, etc., to use?
