google / gematria

Machine learning for machine code.

License: Apache License 2.0

Starlark 4.59% Python 52.21% C++ 38.80% Dockerfile 0.03% CMake 0.16% Assembly 3.73% C 0.15% Shell 0.33%

compiler machine-code machine-learning performance-analysis

gematria's Introduction

Gematria - machine learning for machine code

Contains sources of Gematria, a framework for machine learning on machine code. It includes implementations of the GRANITE model and the Ithemal hierarchical LSTM model for learning inverse throughput of basic blocks.

Installation

Requirements and installation

Our models are built on top of TensorFlow 2.x (using the TensorFlow 1.x compatibility layer) in a mix of C++ and Python. Most of the training code is written in Python; we use C++ for the more demanding parts of the code like graph construction. We use pybind11 to make C++ APIs available in Python.

Basic requirements that need to be installed before starting:

Bazel 6.0 or newer.
A C++ compiler supported by Bazel that compiles C++17. Recent versions of GCC and Clang on Linux both fit the bill.
Python 3.10 or newer.
Git.
PIP.

Additional dependencies, including TensorFlow, Protocol buffers, and different Python libraries are installed through PIP and through Bazel's WORKSPACE file. We strongly recommend using virtualenv to install Python packages to avoid dependency version conflicts with other libraries.

# Get the source code.
$ git clone https://github.com/google/gematria.git
$ cd gematria

# Set up virtualenv.
$ pip install virtualenv
$ virtualenv env
$ . env/bin/activate

# Install Python dependencies.
$ pip install -r requirements.in

# On OS X only. The dependencies of tensorflow-ranking are not set up correctly
# and it needs to be installed manually.
$ pip install --no-deps tensorflow-ranking

# Build the project, run tests, ...
$ bazel build ...
$ bazel test ...

Building with CMake

A subset of the project, consisting of tools and libraries we eventually plan to merge into the LLVM monorepo, is built with CMake. The requirements are inherited from LLVM, as we use LLVM's "external project" mechanism to build.

First, build TFLite. In addition to the requirements above, see also these prerequisites, noting the reference to the buildbot script which lists additional packages.

Then:

mkdir /tmp/tflite && cd /tmp/tflite
curl https://raw.githubusercontent.com/google/ml-compiler-opt/main/buildbot/build_tflite.sh | bash

This should produce a /tmp/tflite/tflite.cmake.

cd ${GEMATRIA_SRC}
mkdir cmake-build && cd cmake-build
cmake -GNinja -DCMAKE_BUILD_TYPE=Release \
  -C /tmp/tflite/tflite.cmake \
  ${LLVM_PROJECT_SRC}/llvm \
  -DLLVM_EXTERNAL_PROJECTS=gematria \
  -DLLVM_EXTERNAL_GEMATRIA_SOURCE_DIR=${GEMATRIA_SRC}
ninja llvm-granite llvm-cm

Where LLVM_PROJECT_SRC is the absolute path to your local llvm repo, and GEMATRIA_SRC the path to this (the gematria) repo.

To run the llvm-cm tests, you can run the following target:

ninja check-llvm-tools-llvm-cm

Platform Support

We develop and test our code on Linux and x86-64, and we also test it on Mac OS X and ARM. Although we have not tested it elsewhere, we expect it to work with minimal changes on other architectures and platforms that run TensorFlow.

Using the models

See the training guide and guides for Python inference and C++ inference.

Repository structure

See the separate document.

Get Involved

Issue tracker: https://github.com/google/Gematria/issues

We welcome patches -- see CONTRIBUTING for more information on how to submit a patch.

Cite us

@inproceedings{granite:iiswc:2022,
  author = {O. Sykora and P. Phothilimthana and C. Mendis and A. Yazdanbakhsh},
  booktitle = {2022 IEEE International Symposium on Workload Characterization (IISWC)},
  title = {{GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation}},
  year = {2022},
}

gematria's Issues

Incorrect canonicalized instruction

Looking at the following block:

basic_block {
  machine_instructions {
    assembly: "\tmovl\t$7, %eax"
    machine_code: "\270\007\000\000\000"
  }
  machine_instructions {
    address: 5
    assembly: "\trep\t\tmovl\t$1, %eax"
    machine_code: "\363\270\001\000\000\000"
  }
  canonicalized_instructions {
    mnemonic: "MOV"
    llvm_mnemonic: "MOV32ri"
    output_operands {
      register_name: "EAX"
    }
    input_operands {
      immediate_value: 7
    }
  }
  canonicalized_instructions {
    mnemonic: "MOV\tEAX,"
    prefixes: "REP"
    llvm_mnemonic: "MOV32ri"
    output_operands {
      register_name: "EAX"
    }
    input_operands {
      immediate_value: 1
    }
  }
}
inverse_throughputs {
  source: "zen2"
  inverse_throughput_cycles: 100.0
}

The mnemonic for the second canonicalized instruction is incorrect: for some reason it also includes the register. This causes issues when training a model, because it leads to an out-of-bounds embedding table access, which causes the job to fail.
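One way to catch this kind of malformed entry before training is to check every canonicalized mnemonic against the token vocabulary. The following is a minimal sketch, not Gematria's own code: the function name `find_bad_mnemonics`, the plain-list input, and the three-token vocabulary are all hypothetical and only illustrate the check.

```python
def find_bad_mnemonics(mnemonics, vocabulary):
    """Returns (index, mnemonic) pairs that cannot be a valid token.

    A mnemonic is flagged when it contains whitespace or a comma (a sign
    that operands leaked into it) or when it is missing from `vocabulary`.
    """
    bad = []
    for index, mnemonic in enumerate(mnemonics):
        has_stray_chars = any(ch.isspace() or ch == "," for ch in mnemonic)
        if has_stray_chars or mnemonic not in vocabulary:
            bad.append((index, mnemonic))
    return bad


# The two mnemonics from the block above; the vocabulary is illustrative.
vocabulary = {"MOV", "REP", "EAX"}
print(find_bad_mnemonics(["MOV", "MOV\tEAX,"], vocabulary))
```

Running a check like this while building the dataset would report the second instruction before it reaches the embedding lookup.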

Annotator running out of processes

After a while (roughly 1000 blocks in my testing), the annotator begins to fail on every block with the following message:

Failed to find addresses for block '488B442410488B7808837C240C00': INTERNAL: Failed to create child process: Resource temporarily unavailable
Block disassembly:
                movq    16(%rsp), %rax
                movq    8(%rax), %rdi
                cmpl    $0, 12(%rsp)

This is presumably because the underlying exegesis code is keeping processes around (although I have yet to confirm that hypothesis). More debugging is needed.

Implement benchmarking script

In order to construct large-scale BB datasets, we need a script that can perform these benchmarking runs, taking in annotated basic blocks from the annotation script (most likely in JSON), and then returning them with throughput information.

Parallelize memory annotations

The current script in ./gematria/datasets/convert_bhive_to_exegesis_inputs.cc runs sequentially. This is somewhat of a problem for using the Exegesis annotator, which isn't particularly fast. This can easily be parallelized as we don't care about the timings at all while running the annotations. This should be doable with some refactoring and use of LLVM's threading APIs.
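The shape of the change can be sketched in Python, even though the real tool is C++ and the issue proposes LLVM's threading APIs. In this sketch `annotate_block` is a hypothetical stand-in for the slow per-block annotation step and `max_workers=4` is an arbitrary choice; neither reflects the actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def annotate_block(block_hex):
    """Hypothetical stand-in for the slow per-block annotation step."""
    return {"Hex": block_hex, "NumBytes": len(bytes.fromhex(block_hex))}


def annotate_all(blocks_hex, max_workers=4):
    """Annotates blocks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(annotate_block, blocks_hex))


annotations = annotate_all(["85c044897c2460", "3b31"])
print(annotations)
```

Because timings do not matter for annotation, the only ordering requirement is that each result stays associated with its block, which `pool.map` preserves.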

Parallelize benchmarking

With the large scale of our datasets (potentially 10^8 BBs), we will need a reasonably fast way to benchmark basic blocks. Parallelizing this is an obvious first step. This needs a couple of things implemented on the LLVM side:

Shared memory names (used for memory annotations) need a name that is also based on the thread ID rather than just the process ID.
There needs to be an option to pin a benchmarking process to a specific core within llvm-exegesis.

(There might be more on the llvm-exegesis side).

Then, we need to do the following:

Implement parallel benchmarking using LLVM threading primitives.
Validate that running on multiple threads doesn't impact results (using validation counters).
Ship it.
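The two LLVM-side requirements can be illustrated with a small Python sketch. It is not llvm-exegesis code: the `base-pid-tid` naming scheme and both function names are assumptions made for illustration. `os.sched_setaffinity` is only available on some platforms (notably Linux), so the helper reports whether pinning was possible.

```python
import os
import threading


def shared_memory_name(base, pid=None, thread_id=None):
    """Builds a name that is unique per (process, thread) pair."""
    pid = os.getpid() if pid is None else pid
    thread_id = threading.get_native_id() if thread_id is None else thread_id
    return f"{base}-{pid}-{thread_id}"


def pin_caller_to_core(core):
    """Pins the caller to one core; returns False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # Not available on this platform (e.g. macOS).
    os.sched_setaffinity(0, {core})
    return True


print(shared_memory_name("memdef"))
```

Including the thread ID keeps two benchmarking threads in the same process from colliding on a shared memory name, and pinning each worker to its own core limits interference between concurrent measurements.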

Attempt a complete mlgo regalloc training using gematria as latency predictor

Main goals are to:

see what's missing.
fix what's missing so we have a complete testbed others can use (basically llvm-cm plugin)

We can start with @boomanaiden154 's very simple decompression benchmark and then @virajbshah 's cache missing benchmarks - totally fine if models are overfitting initially.

Error when training model with rep mov instruction

Traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1379, in _do_call
    return fn(*args)
           ^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1362, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1455, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 8 is not in [0, 8)
         [[{{node encoder_1/edge_model/embed/embedding_lookup}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
    app.run(main)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
             ^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
    main_function.run_gematria_model_from_command_line_flags(
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 871, in run_gematria_model_from_command_line_flags
    model.train(
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1535, in train
    stats = run_one_epoch()
            ^^^^^^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1500, in run_one_epoch
    return self.train_mini_batch(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1628, in train_mini_batch
    return self.train_batch(sess, train_schedule)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1590, in train_batch
    (_, stats) = sess.run((self._train_step, stats_ops), feed_dict=schedule)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 778, in run
    return self._sess.run(
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1307, in run
    return self._sess.run(
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1397, in run
    return self._sess.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1464, in run
    outputs = _WrappedSession.run(
              ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1228, in run
    return self._sess.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 969, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1192, in _run
    results = self._do_run(handle, final_targets, final_fetches,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1372, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1398, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'encoder_1/edge_model/embed/embedding_lookup' defined at (most recent call last):
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
      app.run(main)
    File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
      _run_main(main, args)
    File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
      sys.exit(main(argv))
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
      main_function.run_gematria_model_from_command_line_flags(
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 803, in run_gematria_model_from_command_line_flags
      model.initialize()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 391, in initialize
      self._create_tf_graph()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/graph_builder_model_base.py", line 170, in _create_tf_graph
      super()._create_tf_graph()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/token_model.py", line 200, in _create_tf_graph
      super()._create_tf_graph()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 238, in _create_tf_graph
      self._graphs_tuple_outputs = self._create_graph_network()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 353, in _create_graph_network
      graphs_tuple = layer.module(graphs_tuple)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
      return self._call(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
      outputs, subgraph_name_scope = self._template(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
      output = self._build(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/modules.py", line 409, in _build
      edges=self._edge_model(graph.edges, **edge_model_kwargs),
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
      return self._call(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
      outputs, subgraph_name_scope = self._template(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
      output = self._build(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/_base.py", line 112, in _build
      return self._model(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
      return self._call(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
      outputs, subgraph_name_scope = self._template(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
      output = self._build(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/embed.py", line 182, in _build
      return tf.nn.embedding_lookup(embeddings, ids, name="embedding_lookup")
Node: 'encoder_1/edge_model/embed/embedding_lookup'
indices[0] = 8 is not in [0, 8)
         [[{{node encoder_1/edge_model/embed/embedding_lookup}}]]

Original stack trace for 'encoder_1/edge_model/embed/embedding_lookup':
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
    app.run(main)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
    main_function.run_gematria_model_from_command_line_flags(
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 803, in run_gematria_model_from_command_line_flags
    model.initialize()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 391, in initialize
    self._create_tf_graph()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/graph_builder_model_base.py", line 170, in _create_tf_graph
    super()._create_tf_graph()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/token_model.py", line 200, in _create_tf_graph
    super()._create_tf_graph()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 238, in _create_tf_graph
    self._graphs_tuple_outputs = self._create_graph_network()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 353, in _create_graph_network
    graphs_tuple = layer.module(graphs_tuple)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
    return self._call(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
    outputs, subgraph_name_scope = self._template(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
    return self._call_func(args, kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
    result = self._func(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
    output = self._build(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/modules.py", line 409, in _build
    edges=self._edge_model(graph.edges, **edge_model_kwargs),
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
    return self._call(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
    outputs, subgraph_name_scope = self._template(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
    return self._call_func(args, kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
    result = self._func(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
    output = self._build(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/_base.py", line 112, in _build
    return self._model(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
    return self._call(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
    outputs, subgraph_name_scope = self._template(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
    return self._call_func(args, kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
    result = self._func(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
    output = self._build(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/embed.py", line 182, in _build
    return tf.nn.embedding_lookup(embeddings, ids, name="embedding_lookup")
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/embedding_ops.py", line 326, in embedding_lookup
    return _embedding_lookup_and_transform(
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/embedding_ops.py", line 145, in _embedding_lookup_and_transform
    array_ops.gather(params[0], ids, name=name), ids, max_norm)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/deprecation.py", line 576, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/array_ops.py", line 5138, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3982, in gather_v2
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/framework/op_def_library.py", line 795, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/framework/ops.py", line 3381, in _create_op_internal
    ret = Operation.from_node_def(

With the following command line invocation:

bazel run //gematria/granite/python:run_granite_model -- --gematria_action=train --gematria_checkpoint_dir=/tmp/test_model/ --gematria_learning_rate=0.001 --gematria_loss_type=mean_absolute_error --gematria_training_num_epochs=100000 --gematria_tokens_file=/data/vocab_10u7.txt  --gematria_input_file=/tmp/test.tfrecord  --gematria_max_blocks_in_batch=100 --gematria_learning_rate_schedule=cosine --gematria_decay_steps=100000

With the tfrecord dataset produced from the following csv:

f3b801000000,1

With the patch from #107 applied.
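For reference, the single block in that CSV is the REP-prefixed `movl $1, %eax` that appears as machine code "\363\270\001\000\000\000" in the "Incorrect canonicalized instruction" report. The sketch below decodes the hex string by hand to make that visible. The helper name `decode_mov_eax_imm32` is hypothetical, and it handles only this one encoding.

```python
def decode_mov_eax_imm32(block_hex):
    """Decodes an optionally REP-prefixed `movl $imm32, %eax` encoding.

    Returns (has_rep_prefix, immediate). Handles only this one encoding.
    """
    data = bytes.fromhex(block_hex)
    has_rep_prefix = data[:1] == b"\xf3"  # 0xF3 is the x86 REP prefix byte.
    if has_rep_prefix:
        data = data[1:]
    if len(data) != 5 or data[0] != 0xB8:  # 0xB8: mov imm32 into EAX.
        raise ValueError("not a `movl $imm32, %eax` encoding")
    # The 32-bit immediate is stored in little-endian byte order.
    return has_rep_prefix, int.from_bytes(data[1:], byteorder="little")


print(decode_mov_eax_imm32("f3b801000000"))
```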

--blocks_per_json_file flag not working as expected

Using the following CSV, test.csv:

85c044897c2460,98.000000
3b31,45.000000

With the following command line invocation, assuming ./json exists:

./bazel-bin/gematria/datasets/convert_bhive_to_llvm_exegesis_input --json_output_dir=./json --bhive_csv=./test.csv --blocks_per_json_file=1

We get the following in ./json:

0.json  1.json  2.json

Note that we should only get two files.

0.json:

[
  {
    "Hex": "85c044897c2460",
    "MemoryDefinitions": [
      {
        "Name": "MEM",
        "Size": 4096,
        "Value": 305419776
      }
    ],
    "MemoryMappings": [
      {
        "Address": 65536,
        "Value": "MEM"
      }
    ]
  }
]

1.json:

[
  {
    "Hex": "3b31",
    "MemoryDefinitions": [
      {
        "Name": "MEM",
        "Size": 4096,
        "Value": 305419776
      }
    ],
    "MemoryMappings": [
      {
        "Address": 65536,
        "Value": "MEM"
      }
    ]
  }
]

2.json:

[]

We see one of the blocks duplicated, the second block shows up twice, we get an extra file, and the extra file is empty. This needs to be fixed.
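The expected behavior can be written down as a short Python sketch: N blocks with `--blocks_per_json_file=K` should yield ceil(N / K) files, each block should appear exactly once, and no file should be empty. This is an illustration of the intended semantics rather than the C++ tool's code, and `split_into_files` is a hypothetical name.

```python
def split_into_files(blocks, blocks_per_file):
    """Splits `blocks` into consecutive chunks of at most `blocks_per_file`.

    Never yields an empty chunk, so N blocks produce
    ceil(N / blocks_per_file) chunks.
    """
    if blocks_per_file <= 0:
        raise ValueError("blocks_per_file must be positive")
    return [
        blocks[start:start + blocks_per_file]
        for start in range(0, len(blocks), blocks_per_file)
    ]


chunks = split_into_files(["85c044897c2460", "3b31"], blocks_per_file=1)
print(len(chunks))
```

Stepping over the input by start index avoids the usual source of a trailing empty file, which is flushing the current batch once more after the loop even when it holds nothing.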

Snippet causing remappings of the same address in the exegesis annotator

# LLVM-EXEGESIS-DEFREG EFLAGS 12345600
# LLVM-EXEGESIS-DEFREG RCX 12345600
# LLVM-EXEGESIS-DEFREG RDI 12345600
# LLVM-EXEGESIS-DEFREG RIP 12345600
# LLVM-EXEGESIS-DEFREG XMM2 12345600
# LLVM-EXEGESIS-LOOP-REGISTER RDX
        movzbl  (%rcx), %eax
        movd    %edi, %xmm0
        pshufd  $0, %xmm0, %xmm0
        movdqa  (%rip), %xmm1
        pand    %xmm0, %xmm1
        pand    (%rip), %xmm0
        pxor    %xmm2, %xmm2
        movdqa  %xmm0, %xmm3
        pcmpeqd %xmm2, %xmm3
        movdqa  %xmm1, %xmm4
        pcmpeqd %xmm2, %xmm4
        packssdw        %xmm3, %xmm4
        packsswb        %xmm4, %xmm4
        movdqa  %xmm4, %xmm3
        pandn   (%rip), %xmm3
        movb    %al, (%rip)
        pand    (%rip), %xmm4
        por     %xmm3, %xmm4
        movq    %xmm4, (%rip)
        testb   $1, %dil
        movl    $45, %eax
        movl    $120, %ecx
        cmovel  %eax, %ecx
        movb    %cl, (%rip)
        testl   $2048, %edi

This snippet ends up making the exegesis annotator map the same address over and over (but eventually, it moves on to another page). Not sure why this behavior is occurring and more investigation is needed.

Bus error in annotator

Using the parallelized annotator:

Bus error (core dumped)

Need to see whether this is reproducible and debug why it is happening. It seemed like this happened in the parent process, rather than being a signal received in the child process that would have been handled through ptrace.

Write comparison script

It would be good to validate that the benchmarking numbers we're getting match previous results (like BHive and uica-eval) to ensure that we aren't doing anything egregiously wrong. To do this we need to do a couple of things:

Write a script (probably Python) that can compare CSVs in the BHive format and identify (major) discrepancies.
Do a benchmarking run using our tooling against one of these datasets.
Run the comparison script, observe the results.
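The comparison step could be sketched roughly as follows. This is a minimal illustration, not the planned script: it assumes the BHive format is one `<hex>,<throughput>` pair per line, and the function names, the 10% relative tolerance, and the sample values are all made up.

```python
import csv
import io


def load_bhive_csv(text):
    """Parse "<hex>,<throughput>" lines into a dict keyed by lowercase hex."""
    throughputs = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # Skip blank or malformed lines.
        throughputs[row[0].strip().lower()] = float(row[1])
    return throughputs


def find_discrepancies(ours, reference, rel_tolerance=0.1):
    """Return (hex, ours, reference, relative_error) for blocks that disagree.

    Only blocks present in both datasets are compared. The relative error
    is measured against the reference value.
    """
    discrepancies = []
    for block_hex in sorted(ours.keys() & reference.keys()):
        measured, expected = ours[block_hex], reference[block_hex]
        if expected == 0:
            rel_error = 0.0 if measured == 0 else float("inf")
        else:
            rel_error = abs(measured - expected) / abs(expected)
        if rel_error > rel_tolerance:
            discrepancies.append((block_hex, measured, expected, rel_error))
    return discrepancies


# Made-up sample data: the first block agrees within 10%, the second does not.
ours = load_bhive_csv("3b31,100\n4829d3,210\n")
reference = load_bhive_csv("3B31,102\n4829d3,150\n")
discrepancies = find_discrepancies(ours, reference)
for block_hex, measured, expected, rel_error in discrepancies:
    print(f"{block_hex}: ours={measured} reference={expected} "
          f"rel_error={rel_error:.2f}")
```

Hex strings are lowercased before matching so that differences in case between datasets do not hide a shared block. Blocks that appear in only one dataset are ignored here; a real script would probably report those as well.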

Update llvm-cm against recent LLVM changes

llvm-cm needs to be updated to reflect some recent LLVM changes:

BBRanges are now used to represent the basic blocks in a function. llvm-cm needs to support these for cases like basic block sections and split machine functions (and should at the very least have test coverage for them).
-mbb-profile-dump no longer exists, and PGOAnalysisMap should be used instead. Tests and code need to be updated for this.

This (at least the second part) is sort of a prerequisite for #55.

Too many open files error

Failed to annotate block: INTERNAL: Failed to create a pipe for interprocess communication between llvm-exegesis and the benchmarking subprocess: Too many open files

More investigation is needed. Probably an issue on the LLVM side, but opening here first in case there is some complicated interaction.
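One way to start investigating is to watch the number of open file descriptors against the process limit. The sketch below is a generic Linux diagnostic, not part of Gematria or llvm-exegesis: it assumes `/proc/self/fd` is available, the helper name `fd_usage` is hypothetical, and it inspects the current Python process only.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process.

    The count includes the descriptor that os.listdir() itself opens
    briefly, so it is most useful for comparing two readings.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    open_fds = len(os.listdir("/proc/self/fd"))
    return open_fds, soft, hard


before, soft_limit, hard_limit = fd_usage()

# A pipe consumes two descriptors until both ends are closed.
read_fd, write_fd = os.pipe()
during, _, _ = fd_usage()
os.close(read_fd)
os.close(write_fd)
after, _, _ = fd_usage()

print(f"open: {before} -> {during} -> {after}, soft limit: {soft_limit}")
```

If the count keeps growing across annotated blocks, pipes or other descriptors are probably not being closed; if it stays flat, the soft limit itself may simply be too low for the workload.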

Implement tooling for python formatting

There is currently no Python formatting tooling in the repository, although the code presumably follows the Google Python style guide.

I'd like to propose using yapf:

It seems to be the standard for Google open source projects.
It does require some reformatting of the code (based on my testing; I might be missing a flag or something).
It is fairly well maintained.

The only issue is that yapf currently doesn't support match-case statements because some of its third-party dependencies are unmaintained. This is being tracked in this issue and some work has been going on recently in this PR, but nothing has been finalized yet. This means yapf currently fails when formatting gematria. Building from a fork seems like it should work in the meantime.

Noting here that other formatters like black or pyink don't have this issue.

tokens.txt

Hi,

I'm trying to follow the g3doc inference-api.md documentation, but when I run the command I'm missing the /tmp/tokens.txt file. Could you please let me know how to generate this file?

Thanks,
Z

Bazel test failed: bhive_importer_test

Hi, after building gematria (bazel build ...), I ran bazel test ..., but it failed at bhive_importer_test.test_x86_parse_csv_line. The error log shows:

e_importer_test/test.log 
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //gematria/datasets/python:bhive_importer_test
-----------------------------------------------------------------------------
Running tests under Python 3.10.0: /home/gematria/gematria_env/bin/python3
[ RUN      ] BhiveImporterTest.test_x86_basic_block_proto_from_bytes
[       OK ] BhiveImporterTest.test_x86_basic_block_proto_from_bytes
[ RUN      ] BhiveImporterTest.test_x86_basic_block_proto_from_hex
[       OK ] BhiveImporterTest.test_x86_basic_block_proto_from_hex
[ RUN      ] BhiveImporterTest.test_x86_nonstandard_columns
[       OK ] BhiveImporterTest.test_x86_nonstandard_columns
[ RUN      ] BhiveImporterTest.test_x86_parse_csv_line
[  FAILED  ] BhiveImporterTest.test_x86_parse_csv_line
======================================================================
ERROR: test_x86_parse_csv_line (__main__.BhiveImporterTest)
BhiveImporterTest.test_x86_parse_csv_line
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/lukez/.cache/bazel/_bazel_lukez/6ec059981b607312b48b2c4811597fe7/sandbox/linux-sandbox/6/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/python/bhive_importer_test.runfiles/com_google_gematria/gematria/datasets/python/bhive_importer_test.py", line 203, in test_x86_parse_csv_line
    block_proto = importer.basic_block_with_throughput_proto_from_csv_line(
TypeError: basic_block_with_throughput_proto_from_csv_line(): incompatible function arguments. The following argument types are supported:
    1. (self: gematria.datasets.python.bhive_importer.BHiveImporter, source_name: str, line: str, machine_code_hex_column_index: int, throughput_column_index: int, throughput_scaling: float = 1.0, base_address: int = 0) -> gematria::BasicBlockWithThroughputProto

Invoked with: <gematria.datasets.python.bhive_importer.BHiveImporter object at 0x7ffb153321f0>; kwargs: source_name='test: made-up', line='4829d38b44246c8b54246848c1fb034829d04839c3,10', base_address=600, throughput_scaling=2.0

----------------------------------------------------------------------
Ran 4 tests in 0.069s

FAILED (errors=1)

Do you know how to address this issue? The machine I am using is an Intel Broadwell (x86).
