google / gematria

Machine learning for machine code.

License: Apache License 2.0

Starlark 4.62% Python 51.92% C++ 39.12% Dockerfile 0.04% CMake 0.16% Assembly 3.68% C 0.14% Shell 0.33%

compiler machine-code machine-learning performance-analysis

gematria's Issues

--blocks_per_json_file flag not working as expected

Using the following CSV, test.csv:

85c044897c2460,98.000000
3b31,45.000000

With the following command line invocation, assuming ./json exists:

./bazel-bin/gematria/datasets/convert_bhive_to_llvm_exegesis_input --json_output_dir=./json --bhive_csv=./test.csv --blocks_per_json_file=1

We get the following in ./json:

0.json  1.json  2.json

Note that we should only get two files.

0.json:

[
  {
    "Hex": "85c044897c2460",
    "MemoryDefinitions": [
      {
        "Name": "MEM",
        "Size": 4096,
        "Value": 305419776
      }
    ],
    "MemoryMappings": [
      {
        "Address": 65536,
        "Value": "MEM"
      }
    ]
  }
]

1.json:

[
  {
    "Hex": "3b31",
    "MemoryDefinitions": [
      {
        "Name": "MEM",
        "Size": 4096,
        "Value": 305419776
      }
    ],
    "MemoryMappings": [
      {
        "Address": 65536,
        "Value": "MEM"
      }
    ]
  }
]

2.json:

[]

We see one of the blocks duplicated, the second block shows up twice, we get an extra file, and the extra file is empty. This needs to be fixed.
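
The expected behavior can be stated as a chunking rule: N blocks with --blocks_per_json_file=K should give ceil(N/K) files, none of them empty. Below is a minimal Python sketch of that rule. `chunk_blocks` is a hypothetical helper written for illustration only; it is not the converter's actual code.

```python
import math


def chunk_blocks(blocks, blocks_per_file):
    """Splits blocks into consecutive chunks of at most blocks_per_file items."""
    if blocks_per_file <= 0:
        raise ValueError("blocks_per_file must be positive")
    # Stepping the start index by blocks_per_file never yields an empty
    # trailing chunk, so no extra file would be written.
    return [
        blocks[start:start + blocks_per_file]
        for start in range(0, len(blocks), blocks_per_file)
    ]


blocks = ["85c044897c2460", "3b31"]  # hex column of test.csv above
chunks = chunk_blocks(blocks, 1)
assert len(chunks) == math.ceil(len(blocks) / 1)
for index, chunk in enumerate(chunks):
    print(f"{index}.json: {chunk}")
```

With the two-line CSV above this produces exactly two chunks, which is what a regression test for the flag could assert.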

Update llvm-cm against recent LLVM changes

llvm-cm needs to be updated to reflect some recent LLVM changes:

BBRanges are now used to represent the basic blocks in a function. llvm-cm needs to support these for cases like basic block sections and split machine functions (and should at the very least have test coverage for them).
-mbb-profile-dump no longer exists, and PGOAnalysisMap should be used instead. The tests and code need to be updated for this.

This (at least the second part) is sort of a prerequisite for #55.

Annotator running out of processes

After a while (roughly 1000 blocks in my testing), the annotator begins to fail on every block with the following message:

Failed to find addresses for block '488B442410488B7808837C240C00': INTERNAL: Failed to create child process: Resource temporarily unavailable
Block disassembly:
                movq    16(%rsp), %rax
                movq    8(%rax), %rdi
                cmpl    $0, 12(%rsp)

This is presumably because the underlying exegesis code is keeping processes around (although I have yet to confirm that hypothesis). More debugging is needed.
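
One way to test the hypothesis above would be to count the annotator's child processes, grouped by state, while it runs. The sketch below assumes Linux and the documented `/proc/<pid>/stat` layout (`pid (comm) state ppid ...`). The helper names are hypothetical, and the sample lines are made up for illustration.

```python
import collections


def parse_stat(stat_text):
    """Returns (pid, comm, state, ppid) from the text of /proc/<pid>/stat."""
    # comm is parenthesised and may itself contain spaces or ')', so split
    # on the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    fields = stat_text[close_paren + 1:].split()
    return pid, comm, fields[0], int(fields[1])


def count_children_by_state(stat_texts, parent_pid):
    """Counts processes whose parent is parent_pid, keyed by state letter."""
    counts = collections.Counter()
    for text in stat_texts:
        _, _, state, ppid = parse_stat(text)
        if ppid == parent_pid:
            counts[state] += 1
    return counts


# Made-up sample: parent 100 with one zombie (Z) and one sleeping (S) child.
sample = [
    "100 (annotator) S 1 100 100",
    "101 (child one) Z 100 100 100",
    "102 (child) S 100 100 100",
]
print(count_children_by_state(sample, 100))
```

On a live system the input would be the contents of every `/proc/[0-9]*/stat` file. A steadily growing count of `Z` (zombie) or stopped children of the annotator would support the hypothesis; a flat count would point elsewhere.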

Bus error in annotator

Using the parallelized annotator:

Bus error (core dumped)

Need to see if this is reproducible and debug why it is happening. It seemed like this happened in the parent process, rather than being a signal received in the child process, which would have been handled through ptrace.

Snippet causing remappings of the same address in the exegesis annotator

# LLVM-EXEGESIS-DEFREG EFLAGS 12345600
# LLVM-EXEGESIS-DEFREG RCX 12345600
# LLVM-EXEGESIS-DEFREG RDI 12345600
# LLVM-EXEGESIS-DEFREG RIP 12345600
# LLVM-EXEGESIS-DEFREG XMM2 12345600
# LLVM-EXEGESIS-LOOP-REGISTER RDX
        movzbl  (%rcx), %eax
        movd    %edi, %xmm0
        pshufd  $0, %xmm0, %xmm0
        movdqa  (%rip), %xmm1
        pand    %xmm0, %xmm1
        pand    (%rip), %xmm0
        pxor    %xmm2, %xmm2
        movdqa  %xmm0, %xmm3
        pcmpeqd %xmm2, %xmm3
        movdqa  %xmm1, %xmm4
        pcmpeqd %xmm2, %xmm4
        packssdw        %xmm3, %xmm4
        packsswb        %xmm4, %xmm4
        movdqa  %xmm4, %xmm3
        pandn   (%rip), %xmm3
        movb    %al, (%rip)
        pand    (%rip), %xmm4
        por     %xmm3, %xmm4
        movq    %xmm4, (%rip)
        testb   $1, %dil
        movl    $45, %eax
        movl    $120, %ecx
        cmovel  %eax, %ecx
        movb    %cl, (%rip)
        testl   $2048, %edi

This snippet ends up making the exegesis annotator map the same address over and over (but eventually, it moves on to another page). Not sure why this behavior is occurring and more investigation is needed.

Incorrect canonicalized instruction

Looking at the following block:

basic_block {
  machine_instructions {
    assembly: "\tmovl\t$7, %eax"
    machine_code: "\270\007\000\000\000"
  }
  machine_instructions {
    address: 5
    assembly: "\trep\t\tmovl\t$1, %eax"
    machine_code: "\363\270\001\000\000\000"
  }
  canonicalized_instructions {
    mnemonic: "MOV"
    llvm_mnemonic: "MOV32ri"
    output_operands {
      register_name: "EAX"
    }
    input_operands {
      immediate_value: 7
    }
  }
  canonicalized_instructions {
    mnemonic: "MOV\tEAX,"
    prefixes: "REP"
    llvm_mnemonic: "MOV32ri"
    output_operands {
      register_name: "EAX"
    }
    input_operands {
      immediate_value: 1
    }
  }
}
inverse_throughputs {
  source: "zen2"
  inverse_throughput_cycles: 100.0
}

The mnemonic for the second canonicalized instruction is incorrect: for some reason it also includes the register. This causes issues when training a model, because it leads to an out-of-bounds embedding table access, which causes the job to fail.
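
Until the root cause is found, a cheap sanity check over the canonicalized mnemonics could catch such blocks before they reach training. The sketch below assumes that a well-formed mnemonic never contains whitespace or a comma. That is an assumption based on the two examples above, not a documented invariant, and `find_malformed_mnemonics` is a hypothetical helper.

```python
def find_malformed_mnemonics(mnemonics):
    """Returns the mnemonics that are empty or contain whitespace or commas."""
    return [
        mnemonic
        for mnemonic in mnemonics
        if not mnemonic or any(c.isspace() or c == "," for c in mnemonic)
    ]


# The two mnemonics from the block above.
print(find_malformed_mnemonics(["MOV", "MOV\tEAX,"]))
```

Only the second mnemonic, `"MOV\tEAX,"`, is flagged.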

Implement tooling for python formatting

There is currently no python formatting tooling in the repository, although it presumably follows the Google Python style guide.

I'd like to propose using yapf:

It seems to be the standard for Google open source projects.
It does require some reformatting of the code (based on my testing; I might be missing a flag or something).
It is fairly well maintained.

The only issue is that yapf currently doesn't support match-case statements because some of its third-party dependencies are unmaintained. This is being tracked in this issue and some work has been going on recently in this PR, but nothing has been finalized yet. This means yapf currently fails when formatting gematria. Building from a fork seems like it should work in the meantime.

Noting here that other formatters like black or pyink don't have this issue.

Error when training model with rep mov instruction

Traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1379, in _do_call
    return fn(*args)
           ^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1362, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1455, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 8 is not in [0, 8)
         [[{{node encoder_1/edge_model/embed/embedding_lookup}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
    app.run(main)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
             ^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
    main_function.run_gematria_model_from_command_line_flags(
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 871, in run_gematria_model_from_command_line_flags
    model.train(
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1535, in train
    stats = run_one_epoch()
            ^^^^^^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1500, in run_one_epoch
    return self.train_mini_batch(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1628, in train_mini_batch
    return self.train_batch(sess, train_schedule)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 1590, in train_batch
    (_, stats) = sess.run((self._train_step, stats_ops), feed_dict=schedule)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 778, in run
    return self._sess.run(
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1307, in run
    return self._sess.run(
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1397, in run
    return self._sess.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1464, in run
    outputs = _WrappedSession.run(
              ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/training/monitored_session.py", line 1228, in run
    return self._sess.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 969, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1192, in _run
    results = self._do_run(handle, final_targets, final_fetches,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1372, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/client/session.py", line 1398, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'encoder_1/edge_model/embed/embedding_lookup' defined at (most recent call last):
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
      app.run(main)
    File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
      _run_main(main, args)
    File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
      sys.exit(main(argv))
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
      main_function.run_gematria_model_from_command_line_flags(
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 803, in run_gematria_model_from_command_line_flags
      model.initialize()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 391, in initialize
      self._create_tf_graph()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/graph_builder_model_base.py", line 170, in _create_tf_graph
      super()._create_tf_graph()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/token_model.py", line 200, in _create_tf_graph
      super()._create_tf_graph()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 238, in _create_tf_graph
      self._graphs_tuple_outputs = self._create_graph_network()
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 353, in _create_graph_network
      graphs_tuple = layer.module(graphs_tuple)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
      return self._call(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
      outputs, subgraph_name_scope = self._template(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
      output = self._build(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/modules.py", line 409, in _build
      edges=self._edge_model(graph.edges, **edge_model_kwargs),
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
      return self._call(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
      outputs, subgraph_name_scope = self._template(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
      output = self._build(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/_base.py", line 112, in _build
      return self._model(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
      return self._call(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
      outputs, subgraph_name_scope = self._template(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
      output = self._build(*args, **kwargs)
    File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/embed.py", line 182, in _build
      return tf.nn.embedding_lookup(embeddings, ids, name="embedding_lookup")
Node: 'encoder_1/edge_model/embed/embedding_lookup'
indices[0] = 8 is not in [0, 8)
         [[{{node encoder_1/edge_model/embed/embedding_lookup}}]]

Original stack trace for 'encoder_1/edge_model/embed/embedding_lookup':
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 109, in <module>
    app.run(main)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.11/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/run_granite_model.py", line 48, in main
    main_function.run_gematria_model_from_command_line_flags(
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/main_function.py", line 803, in run_gematria_model_from_command_line_flags
    model.initialize()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/model_base.py", line 391, in initialize
    self._create_tf_graph()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/graph_builder_model_base.py", line 170, in _create_tf_graph
    super()._create_tf_graph()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/model/python/token_model.py", line 200, in _create_tf_graph
    super()._create_tf_graph()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 238, in _create_tf_graph
    self._graphs_tuple_outputs = self._create_graph_network()
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/com_google_gematria/gematria/granite/python/gnn_model_base.py", line 353, in _create_graph_network
    graphs_tuple = layer.module(graphs_tuple)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
    return self._call(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
    outputs, subgraph_name_scope = self._template(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
    return self._call_func(args, kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
    result = self._func(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
    output = self._build(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/modules.py", line 409, in _build
    edges=self._edge_model(graph.edges, **edge_model_kwargs),
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
    return self._call(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
    outputs, subgraph_name_scope = self._template(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
    return self._call_func(args, kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
    result = self._func(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
    output = self._build(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/graph_nets_repo/graph_nets/_base.py", line 112, in _build
    return self._model(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 397, in __call__
    return self._call(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 419, in _call
    outputs, subgraph_name_scope = self._template(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 398, in __call__
    return self._call_func(args, kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/template.py", line 368, in _call_func
    result = self._func(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/base.py", line 227, in _build_wrapper
    output = self._build(*args, **kwargs)
  File "/tmp/bazel-cache/_bazel_aidengro/ab2551f03460bb1db9bd438eba2ec331/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/granite/python/run_granite_model.runfiles/sonnet_repo/sonnet/python/modules/embed.py", line 182, in _build
    return tf.nn.embedding_lookup(embeddings, ids, name="embedding_lookup")
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/embedding_ops.py", line 326, in embedding_lookup
    return _embedding_lookup_and_transform(
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/embedding_ops.py", line 145, in _embedding_lookup_and_transform
    array_ops.gather(params[0], ids, name=name), ids, max_norm)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/dispatch.py", line 1176, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/util/deprecation.py", line 576, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/array_ops.py", line 5138, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3982, in gather_v2
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/framework/op_def_library.py", line 795, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/usr/local/lib/python3.11/dist-packages/tensorflow/python/framework/ops.py", line 3381, in _create_op_internal
    ret = Operation.from_node_def(

With the following command line invocation:

bazel run //gematria/granite/python:run_granite_model -- --gematria_action=train --gematria_checkpoint_dir=/tmp/test_model/ --gematria_learning_rate=0.001 --gematria_loss_type=mean_absolute_error --gematria_training_num_epochs=100000 --gematria_tokens_file=/data/vocab_10u7.txt  --gematria_input_file=/tmp/test.tfrecord  --gematria_max_blocks_in_batch=100 --gematria_learning_rate_schedule=cosine --gematria_decay_steps=100000

With the tfrecord dataset produced from the following csv:

f3b801000000,1

With the patch from #107 applied.
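
The key line in the traceback is `indices[0] = 8 is not in [0, 8)`: the embedding lookup received index 8, but the table has 8 rows, so only 0 through 7 are valid. A range check of the following shape, run before the indices are fed to the model, would report the offending position up front. `out_of_range_indices` is a hypothetical helper, not part of gematria or TensorFlow.

```python
def out_of_range_indices(indices, table_size):
    """Returns (position, index) pairs that fall outside [0, table_size)."""
    return [
        (position, index)
        for position, index in enumerate(indices)
        if not 0 <= index < table_size
    ]


# Mirrors the failing lookup: index 8 against a table with 8 rows.
print(out_of_range_indices([8], 8))
```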

Bazel test failed: bhive_importer_test

Hi, after building gematria (bazel build ...), I ran bazel test ..., but it failed at bhive_importer_test.test_x86_parse_csv_line. The error log shows the following:

e_importer_test/test.log 
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //gematria/datasets/python:bhive_importer_test
-----------------------------------------------------------------------------
Running tests under Python 3.10.0: /home/gematria/gematria_env/bin/python3
[ RUN      ] BhiveImporterTest.test_x86_basic_block_proto_from_bytes
[       OK ] BhiveImporterTest.test_x86_basic_block_proto_from_bytes
[ RUN      ] BhiveImporterTest.test_x86_basic_block_proto_from_hex
[       OK ] BhiveImporterTest.test_x86_basic_block_proto_from_hex
[ RUN      ] BhiveImporterTest.test_x86_nonstandard_columns
[       OK ] BhiveImporterTest.test_x86_nonstandard_columns
[ RUN      ] BhiveImporterTest.test_x86_parse_csv_line
[  FAILED  ] BhiveImporterTest.test_x86_parse_csv_line
======================================================================
ERROR: test_x86_parse_csv_line (__main__.BhiveImporterTest)
BhiveImporterTest.test_x86_parse_csv_line
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/lukez/.cache/bazel/_bazel_lukez/6ec059981b607312b48b2c4811597fe7/sandbox/linux-sandbox/6/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/python/bhive_importer_test.runfiles/com_google_gematria/gematria/datasets/python/bhive_importer_test.py", line 203, in test_x86_parse_csv_line
    block_proto = importer.basic_block_with_throughput_proto_from_csv_line(
TypeError: basic_block_with_throughput_proto_from_csv_line(): incompatible function arguments. The following argument types are supported:
    1. (self: gematria.datasets.python.bhive_importer.BHiveImporter, source_name: str, line: str, machine_code_hex_column_index: int, throughput_column_index: int, throughput_scaling: float = 1.0, base_address: int = 0) -> gematria::BasicBlockWithThroughputProto

Invoked with: <gematria.datasets.python.bhive_importer.BHiveImporter object at 0x7ffb153321f0>; kwargs: source_name='test: made-up', line='4829d38b44246c8b54246848c1fb034829d04839c3,10', base_address=600, throughput_scaling=2.0

----------------------------------------------------------------------
Ran 4 tests in 0.069s

FAILED (errors=1)

Do you know how to address this issue? The machine I am using is an Intel Broadwell (x86).
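
The TypeError above lists machine_code_hex_column_index and throughput_column_index as parameters without defaults, while the failing call passes only source_name, line, base_address and throughput_scaling. The sketch below reproduces that mismatch with a pure-Python stand-in. `parse_csv_line_stub` is hypothetical: only its parameter list is copied from the error message, and its body is a guess at what the column indices mean.

```python
def parse_csv_line_stub(
    source_name,
    line,
    machine_code_hex_column_index,
    throughput_column_index,
    throughput_scaling=1.0,
    base_address=0,
):
    """Stand-in with the parameter list shown in the TypeError message."""
    columns = line.split(",")
    return {
        "source_name": source_name,
        "machine_code": bytes.fromhex(columns[machine_code_hex_column_index]),
        "throughput": float(columns[throughput_column_index]) * throughput_scaling,
        "base_address": base_address,
    }


kwargs = dict(
    source_name="test: made-up",
    line="4829d38b44246c8b54246848c1fb034829d04839c3,10",
    base_address=600,
    throughput_scaling=2.0,
)

try:
    # Same keyword arguments as in the log: the two column indices are missing.
    parse_csv_line_stub(**kwargs)
except TypeError as error:
    print("fails as in the log:", error)

# Supplying the column indices satisfies the parameter list.
result = parse_csv_line_stub(
    machine_code_hex_column_index=0, throughput_column_index=1, **kwargs
)
print(result["throughput"])
```

Whether the test or the built extension is the out-of-date side is not something the log settles.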

FindAccessedAddrsExegesisTest: can't run 'latency' mode

Hello, after building gematria, there is 1 failure with FindAccessedAddrsExegesisTest:

(env)$ env USE_BAZEL_VERSION=6.4.0 ../bazelisk-linux-amd64 test ...
...
//gematria/datasets:find_accessed_addrs_exegesis_test                    FAILED in 0.8s
  /home/hrong1/.cache/bazel/_bazel_hrong1/32246067180bfaeac7e17e4449bcdc84/execroot/com_google_gematria/bazel-out/k8-fastbuild/testlogs/gematria/datasets/find_accessed_addrs_exegesis_test/test.log

Executed 1 out of 51 tests: 50 tests pass and 1 fails locally.

Here is the content of find_accessed_addrs_exegesis_test/test.log:

Executing tests from //gematria/datasets:find_accessed_addrs_exegesis_test
-----------------------------------------------------------------------------
Running main() from gmock_main.cc
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from FindAccessedAddrsExegesisTest
[ RUN      ] FindAccessedAddrsExegesisTest.ExegesisNoAccess
Failure value returned from cantFail wrapped call
can't run 'latency' mode, sched model does not define a cycle counter. You can pass --benchmark-phase=... to skip the actual benchmarking or --use-dummy-perf-counters to not query the kernel for real event counts.
UNREACHABLE executed at external/llvm-project/llvm/include/llvm/Support/Error.h:790!

Is this expected? This looks like an important test to fix, as it seems to measure cycles of instructions, which I guess is a basic functionality of gematria.
Thanks!

build failure: no such attribute 'exec_tools' in 'genrule' rule

Hello, I followed the instructions in the README to install gematria, but the build failed with the error: no such attribute 'exec_tools' in 'genrule' rule. Details are below.

This is the version of bazelisk:

(env) [gematria]$ ../bazelisk-linux-amd64 version
Bazelisk version: v1.19.0
WARNING: Output base '/data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86' is on NFS. This may lead to surprising failures and undetermined behavior.
Build label: 7.1.1
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Thu Mar 21 18:08:37 2024 (1711044517)
Build timestamp: 1711044517
Build timestamp as int: 1711044517

And this is the build:

(env) [gematria]$ ../bazelisk-linux-amd64 build ...
WARNING: Output base '/data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86' is on NFS. This may lead to surprising failures and undetermined behavior.
Starting local Bazel server and connecting to it...
WARNING: --enable_bzlmod is set, but no MODULE.bazel file was found at the workspace root. Bazel will create an empty MODULE.bazel file. Please consider migrating your external dependencies from WORKSPACE to MODULE.bazel. For more details, please refer to https://github.com/bazelbuild/bazel/issues/18958.
DEBUG: Rule 'com_google_protobuf' indicated that a canonical reproducible form can be obtained by modifying arguments commit = "a74f54b724bdc2fe0bfc271f4dc0ceb159805625" and dropping ["tag"]
DEBUG: Repository com_google_protobuf instantiated at:
  /data/nfs_home/hrong1/gematria/WORKSPACE:16:15: in <toplevel>
Repository rule git_repository defined at:
  /data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/bazel_tools/tools/build_defs/repo/git.bzl:189:33: in <toplevel>
ERROR: /data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/com_google_protobuf/python/BUILD.bazel:123:13: @@com_google_protobuf//python:aarch64_test_genrule: no such attribute 'exec_tools' in 'genrule' rule (did you mean 'executable'?)
ERROR: /data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/com_google_protobuf/python/BUILD.bazel:131:12: @@com_google_protobuf//python:x86_64_test_genrule: no such attribute 'exec_tools' in 'genrule' rule (did you mean 'executable'?)
ERROR: /data/nfs_home/hrong1/.cache/bazel/_bazel_hrong1/fe3956cd0667122177c09d0692bd5c86/external/com_google_protobuf/python/BUILD.bazel:17:11: errors encountered resolving select() keys for @@com_google_protobuf//python:protobuf_python
ERROR: Analysis of target '//gematria/proto:canonicalized_instruction_py_pb2' failed; build aborted: Analysis failed
INFO: Elapsed time: 123.120s, Critical Path: 0.03s
INFO: 1 process: 1 internal.
ERROR: Build did NOT complete successfully
FAILED:
    Fetching repository @@pybind11; Cloning tags/v2.10.3 of https://github.com/pybind/pybind11.git
    Fetching repository @@pybind11_abseil_repo; Cloning 1caf1890443e8e303bf88850d3c27d5422903168 of https://github.com/pybind/pybind11_abseil.git
    Fetching repository @@sonnet_repo; Cloning cd5b5fa48e15e4d020f744968f5209949ebe750f of https://github.com/deepmind/sonnet.git
    Fetching repository @@graph_nets_repo; Cloning adf25162ba21bb0ae176c35483a74fb0c9dff576 of https://github.com/deepmind/graph_nets.git
    Fetching repository @@rules_license~; starting
    Fetching repository @@protobuf~; starting
    Fetching repository @@rules_java~; starting
    Fetching repository @@apple_support~; starting

Does anyone have any idea? Thanks!

Implement benchmarking script

In order to construct large-scale BB datasets, we need a script that can perform these benchmarking runs, taking in annotated basic blocks from the annotation script (most likely in JSON), and then returning them with throughput information.

Parallelize memory annotations

The current script in ./gematria/datasets/convert_bhive_to_exegesis_inputs.cc runs sequentially. This is somewhat of a problem for using the Exegesis annotator, which isn't particularly fast. This can easily be parallelized as we don't care about the timings at all while running the annotations. This should be doable with some refactoring and use of LLVM's threading APIs.
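
The pattern described above (independent per-block work, results collected in input order) can be sketched as follows. This is only an illustration in Python, not the C++ change the issue proposes: `annotate_block` is a hypothetical stand-in for the real Exegesis annotation, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def annotate_block(block_hex):
    """Hypothetical stand-in for annotating one basic block.

    The real annotator would run the block and record the memory it
    touches; here we only record the block's size in bytes.
    """
    return {"Hex": block_hex, "NumBytes": len(block_hex) // 2}


def annotate_all(blocks, max_workers=4):
    """Annotates blocks concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs, so the
        # output stays aligned with the input CSV even when blocks finish
        # out of order.
        return list(pool.map(annotate_block, blocks))


if __name__ == "__main__":
    for annotation in annotate_all(["85c044897c2460", "3b31"]):
        print(annotation)
```

Because timings do not matter during annotation, the workers can share cores freely; only the output order needs to be preserved.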

Attempt a complete mlgo regalloc training using gematria as latency predictor

Main goals are to:

see what's missing.
fix what's missing so we have a complete testbed others can use (basically llvm-cm plugin)

We can start with @boomanaiden154's very simple decompression benchmark and then @virajbshah's cache missing benchmarks. It is totally fine if the models overfit initially.

TEST llvm-cm X86/multi_func.s FAILED

Hello, after building tflite, there is an error with llvm-cm:

(env) $ ninja check-llvm-tools-llvm-cm
[0/1] Running llvm-cm tests
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using yaml2obj: /home/hrong1/llvm-src/cmake-build/bin/yaml2obj
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using llvm-cm: /home/hrong1/llvm-src/cmake-build/bin/llvm-cm
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using split-file: /home/hrong1/llvm-src/cmake-build/bin/split-file
llvm-lit: /home/hrong1/llvm-src/llvm-project/llvm/utils/lit/lit/llvm/config.py:502: note: using llvm-mc: /home/hrong1/llvm-src/cmake-build/bin/llvm-mc
FAIL: llvm-cm :: X86/multi_func.s (11 of 11)
******************** TEST 'llvm-cm :: X86/multi_func.s' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/hrong1/llvm-src/cmake-build/bin/llvm-mc -o /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o --filetype=obj -triple=x86_64-unknown-linux-gnu /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# executed command: /home/hrong1/llvm-src/cmake-build/bin/llvm-mc -o /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o --filetype=obj -triple=x86_64-unknown-linux-gnu /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# RUN: at line 3
/home/hrong1/llvm-src/cmake-build/bin/llvm-cm /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o -csv=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/multi-func.csv -granite_model=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/gb-token-mit-2022_12_02.tflite -evaluator=granite | /home/hrong1/llvm-src/cmake-build/bin/FileCheck /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# executed command: /home/hrong1/llvm-src/cmake-build/bin/llvm-cm /home/hrong1/llvm-src/cmake-build/X86/Output/multi_func.s.tmp.o -csv=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/multi-func.csv -granite_model=/home/hrong1/gematria/llvm_cm/test/X86/Inputs/gb-token-mit-2022_12_02.tflite -evaluator=granite
# .---command stderr------------
# | Unexpected node token: 'RIP'
# `-----------------------------
# executed command: /home/hrong1/llvm-src/cmake-build/bin/FileCheck /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# .---command stderr------------
# | /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s:8:15: error: CHECK-NEXT: expected string not found in input
# | # CHECK-NEXT: Calculated Frequency: 8.342712e+03
# |               ^
# | <stdin>:1:11: note: scanning from here
# | <reverse>:
# |           ^
# | <stdin>:2:1: note: possible intended match here
# | Calculated Frequency: 8.342695e+03
# | ^
# |
# | Input file: <stdin>
# | Check file: /home/hrong1/gematria/llvm_cm/test/X86/multi_func.s
# |
# | -dump-input=help explains the following input dump.
# |
# | Input was:
# | <<<<<<
# |           1: <reverse>:
# | next:8'0               X~ error: no match found
# |           2: Calculated Frequency: 8.342695e+03
# | next:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:8'1     ?                                   possible intended match
# |           3: <tallestBillboard>:
# | next:8'0     ~~~~~~~~~~~~~~~~~~~~~
# |           4: Calculated Frequency: 2.928508e+05
# | next:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           5: <isMatch>:
# | next:8'0     ~~~~~~~~~~~~
# |           6: Calculated Frequency: 8.204262e+02
# | next:8'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           7: <bubbleSort>:
# | next:8'0     ~~~~~~~~~~~~~~~
# |           .
# |           .
# |           .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

********************
********************
Failed Tests (1):
  llvm-cm :: X86/multi_func.s


Testing Time: 0.59s

Total Discovered Tests: 11
  Passed: 10 (90.91%)
  Failed:  1 (9.09%)
FAILED: tools/gematria/llvm_cm/CMakeFiles/check-llvm-tools-llvm-cm /home/hrong1/llvm-src/cmake-build/tools/gematria/llvm_cm/CMakeFiles/check-llvm-tools-llvm-cm
cd /home/hrong1/llvm-src/cmake-build/tools/gematria/llvm_cm && /home/hrong1/gematria/env/bin/python3 /home/hrong1/llvm-src/cmake-build/./bin/llvm-lit -sv /home/hrong1/llvm-src/cmake-build/tools/gematria/llvm_cm

Too many open files error

Failed to annotate block: INTERNAL: Failed to create a pipe for interprocess communication between llvm-exegesis and the benchmarking subprocess: Too many open files

More investigation is needed. Probably an issue on the LLVM side, but opening here first in case there is some complicated interaction.
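
As a starting point for that investigation, the following sketch reports the file descriptor limit and the number of descriptors currently open in a process. It assumes a Unix system for the `resource` module and Linux for `/proc/self/fd`; it does not identify where descriptors leak, and the helper name `fd_usage` is made up for this example.

```python
import os
import resource


def fd_usage():
    """Returns the RLIMIT_NOFILE limits and the current open-fd count."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Each entry in /proc/self/fd is one open descriptor (Linux only).
        open_fds = len(os.listdir("/proc/self/fd"))
    except FileNotFoundError:
        open_fds = None  # /proc is not available on this system.
    return {"soft_limit": soft, "hard_limit": hard, "open_fds": open_fds}


if __name__ == "__main__":
    print(fd_usage())
```

Calling this periodically from the annotation loop would show whether the open-descriptor count grows steadily toward the soft limit, which would point to pipes not being closed between blocks.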

Write comparison script

It would be good to validate that the benchmarking numbers we're getting match previous results (like BHive and uica-eval) to ensure that we aren't doing anything egregiously wrong. To do this we need to do a few things:

Write a script (probably python) that can compare CSVs in the BHive format and identify (major) discrepancies.
Do a benchmarking run using our tooling against one of these datasets.
Run the comparison script, observe the results.
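
A minimal sketch of such a comparison script is shown below. It assumes the two-column `hex,throughput` CSV layout and flags blocks whose relative difference exceeds a threshold; the function names, the 10% default tolerance, and the measured values in the example are all made up for illustration.

```python
import csv
import io


def load_bhive_csv(stream):
    """Reads `hex,throughput` rows into a dict keyed by machine-code hex."""
    throughputs = {}
    for row in csv.reader(stream):
        if len(row) < 2:
            continue  # Skip blank or malformed lines.
        throughputs[row[0].strip().lower()] = float(row[1])
    return throughputs


def find_discrepancies(reference, measured, rel_tolerance=0.1):
    """Returns (hex, reference, measured, relative_diff) for large gaps.

    Only blocks present in both datasets are compared. The result is
    sorted with the largest relative difference first.
    """
    discrepancies = []
    for block_hex, ref_value in reference.items():
        if block_hex not in measured:
            continue
        new_value = measured[block_hex]
        rel_diff = abs(new_value - ref_value) / max(abs(ref_value), 1e-12)
        if rel_diff > rel_tolerance:
            discrepancies.append((block_hex, ref_value, new_value, rel_diff))
    discrepancies.sort(key=lambda item: item[3], reverse=True)
    return discrepancies


if __name__ == "__main__":
    reference = load_bhive_csv(io.StringIO("85c044897c2460,98.0\n3b31,45.0\n"))
    # Made-up measurements, for illustration only.
    measured = load_bhive_csv(io.StringIO("85c044897c2460,100.0\n3b31,90.0\n"))
    for entry in find_discrepancies(reference, measured):
        print(entry)
```

A relative threshold is used rather than an absolute one because throughputs span a wide range across blocks; the right tolerance would have to be chosen after looking at real data.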

Parallelize benchmarking

With the large scale of our datasets (potentially 10^8 BBs), we will need a reasonably fast way to benchmark basic blocks. Parallelizing this is an obvious first step. This needs a couple things implemented on the LLVM side:

Shared memory names (used for memory annotations) need a name that is also based on the thread ID rather than just the process ID.
There needs to be an option to pin a benchmarking process to a specific core within llvm-exegesis.

(There might be more on the llvm-exegesis side).

Then, we need to do the following:

Implement parallel benchmarking using LLVM threading primitives.
Validate that running on multiple threads doesn't impact results (using validation counters).
Ship it.
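
The core-pinning idea can be sketched on the driver side as follows. This is not the llvm-exegesis option the issue asks for: it only illustrates giving each benchmarking worker its own core, using Linux's `sched_setaffinity` through Python's `os` module. The helper names are hypothetical.

```python
import os


def assign_cores(num_workers, available=None):
    """Maps worker index -> core id, round-robin over the usable cores."""
    if available is None:
        if hasattr(os, "sched_getaffinity"):
            available = os.sched_getaffinity(0)  # Cores this process may use.
        else:
            available = range(os.cpu_count() or 1)
    cores = sorted(available)
    return [cores[i % len(cores)] for i in range(num_workers)]


def pin_current_process(core):
    """Pins the calling process to one core; returns False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})
    return True


if __name__ == "__main__":
    # Each worker process would call pin_current_process with its own core.
    print(assign_cores(4))
```

If there are more workers than cores, the round-robin assignment makes workers share cores, which would likely perturb measurements; capping the worker count at the number of available cores avoids that.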

tokens.txt

Hi,

I'm trying to follow the g3doc inference-api.md documentation, but when I run the command I'm missing the /tmp/tokens.txt file. Could you please let me know how to generate this file?

Thanks,
Z
