facebook / buck2

Build system, successor to Buck

Home Page: https://buck2.build/

License: Apache License 2.0

Rust 75.01% Python 1.67% Shell 0.04% CSS 0.03% HTML 0.01% JavaScript 0.19% Go 0.11% TypeScript 0.03% Starlark 20.87% Batchfile 0.02% Erlang 1.94% C++ 0.08% RenderScript 0.01%

buck2's Introduction


WARNING: This project is not yet polished. We are continuing to develop it in the open, but don't expect it to be suitable for most people until Mar/Apr/May 2023 (at which point we'll properly announce it). If you try and use it, you will probably have a bad time. If you are willing to work closely with us, please give it a go and let us know what is blocking you.

This repo contains the code for the Buck2 build system - the successor to the original Buck build system. To understand why it might be interesting, see this explainer. For the moment, we only test it on Linux, and don't recommend running benchmarks as features like the disk cache are not entirely implemented in the open source build.

Getting started

Building Buck2

To build Buck2 type cargo build --bin=buck2 --release from this directory and copy the resulting binary (probably target/release/buck2) to your $PATH. Typing buck2 --help should now work.

The build uses a prebuilt protoc binary from the protoc-bin-vendored crate. If these binaries do not work on your machine (for example, when building for NixOS), the path to the protoc binary and the protobuf include path can be specified via the BUCK2_BUILD_PROTOC and BUCK2_BUILD_PROTOC_INCLUDE environment variables.
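For example (the paths here are hypothetical stand-ins for wherever your protoc installation actually lives):

```shell
# Hypothetical paths: point these at your own protoc installation
# (e.g. one provided by Nix) before invoking cargo.
export BUCK2_BUILD_PROTOC=/usr/local/bin/protoc
export BUCK2_BUILD_PROTOC_INCLUDE=/usr/local/include

# Then build as usual:
# cargo build --bin=buck2 --release
```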

Building sample targets

FIXME(marwhal): This section needs to be made to work

If you cd examples/prelude and type buck2 build ... that will build a number of targets in a variety of languages. Doing so requires that python3 and clang are both on your $PATH.

Bootstrapping Buck2

To build Buck2 using Buck2:

  • Install reindeer, which is used to make Buck targets for Rust libraries.
  • Run reindeer --third-party-dir shim/third-party/rust vendor
  • Run reindeer --third-party-dir shim/third-party/rust buckify --stdout > shim/third-party/rust/BUCK
  • Run buck2 build :buck2

Note that the resulting binary will be compiled without optimisations or jemalloc, so we recommend using the Cargo-produced binary in further development.

Making your own project

A Buck2 project requires:

  • A .buckconfig file in the root which has a [repositories] section listing out interesting cells. We recommend copying from examples/prelude to ensure it contains the necessary fields.
  • A prelude directory, which should be produced with git submodule add https://github.com/facebook/buck2-prelude.git prelude
  • A toolchains directory, which specifies where to find the relevant toolchains. We recommend copying from examples/prelude to start, but you may wish to use alternative toolchains.
  • Some BUILD files that specify the targets specific to your project.
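A minimal .buckconfig along these lines might look like the following (an illustrative sketch only; copy the real one from examples/prelude, since the cell names and fields there are authoritative):

```ini
[repositories]
root = .
prelude = prelude
toolchains = toolchains
```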

Terminology conventions

  • A target, e.g. fbcode//buck2:buck2, is something a user defines that is an instance of a rule, which can be built.
  • A rule, e.g. cxx_library, is an implementation of how something is built.
  • Loading a TARGETS/BUCK file involves evaluating the Starlark and doing attribute coercion/resolution. It can be done with buck2 cquery fbcode//buck2:buck2 or buck2 cquery 'deps(fbcode//buck2:buck2)' to do it recursively.
  • Analysing a target involves running the associated rule to produce the providers. It can be done with buck2 audit providers fbcode//buck2:buck2.
  • Building a target involves demanding the artifacts from a provider (e.g. DefaultInfo). It can be done with buck2 build fbcode//buck2:buck2.

Coding conventions

Beyond the obvious (well-tested, easy to read) we prefer guidelines that are automatically enforced, e.g. through rustfmt, Clippy or the custom linter we have written. Some rules:

  • Use the utilities from Gazebo where they are useful, in particular, dupe.
  • Prefer to_owned to convert &str to String.
  • Qualify anyhow::Result rather than use anyhow::Result.
  • Most errors should be returned as anyhow::Result. Inspecting errors outside tests and the top-level error handler is strongly discouraged.
  • Most errors should be constructed with thiserror deriving enum values, not raw anyhow!.
  • We use the derivative library to derive the PartialEq and Hash traits when some fields should be ignored.
  • Prefer use crate::foo::bar over use super::bar or use crate::foo::*, apart from test modules which often have use super::* at the top.
  • Modules should either have submodules or types/functions/constants, but not both.
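As a dependency-free sketch of the error conventions above (in the real codebase the Display/Error impls are derived with thiserror and functions return anyhow::Result; EvalError and lookup here are illustrative names, not actual Buck2 code):

```rust
use std::fmt;

// Sketch only: the codebase would write `#[derive(thiserror::Error)]`
// with `#[error("Variable `{0}` not defined")]`; the traits are
// implemented by hand here to keep the example self-contained.
#[derive(Debug)]
enum EvalError {
    VariableNotDefined(String),
}

impl fmt::Display for EvalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Follows the error-message conventions: backticked name,
            // leading capital, no trailing period.
            EvalError::VariableNotDefined(name) => {
                write!(f, "Variable `{}` not defined", name)
            }
        }
    }
}

impl std::error::Error for EvalError {}

// In the codebase this would be `anyhow::Result<i64>`; note the use of
// `to_owned` to convert `&str` to `String`, per the conventions above.
fn lookup(name: &str) -> Result<i64, EvalError> {
    Err(EvalError::VariableNotDefined(name.to_owned()))
}
```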

Error messages

  • Names (of variables, targets, files, etc) should be quoted with backticks, e.g. Variable `x` not defined.
  • Lists should use square brackets, e.g. Available targets: [`aa`, `bb`].
  • Error messages should start with an upper case letter. Error messages should not end with a period.

License

Buck2 is both MIT and Apache License, Version 2.0 licensed, as found in the LICENSE-MIT and LICENSE-APACHE files.

buck2's People

Contributors

ahornby, alexbalo, aloiscochard, antonk52, arlyon, benhgreen, bigfootjon, bobyangyf, brandonthebuilder, c-ryan747, christycylee, facebook-github-bot, geralt-encore, grievejia, ianchilds, iguridi, jakobdegen, krallin, lebentle, maxovtsin, milend, ndmitchell, nuriamari, rmaz, shayne-fletcher, stepancheg, thegeorge, themarwhal, zertosh, zsol


buck2's Issues

buck tracks files it shouldn't in a version control directory (sapling)

I'm using Sapling with buck. And if I do

buck build ...
sl
buck build ...

The second buck build normally gives me something like:

File changed: root//src/hello/main.rs
File changed: root//.sl/runlog/3513vs2Gf9m4S2DA.lock
File changed: root//.sl/runlog/.tmpeqcGcc
939 additional file changes
...

because sl needs to talk with its daemon and do other things under the .sl/ directory.

This even goes further than that: if Sapling is in the middle of something like a rebase, it may make copies of .bzl files underneath .sl/ during that time, which then get picked up as part of the default cell for the project. This is really annoying. I've had several sl rebase commands fail due to conflicts, and then buck build ... picks up temporary files under .sl/ that are kept as a backup. So if something like a TARGETS file gets copied, buck build ... will fail in some spectacular and unpredictable fashion.

As a short quick fix, it would be nice if whatever logic exists for .git and buck-out to be ignored could be extended to a few other directories, like .sl/ and .hg/

In the long run, it might be nice to have something like a buckignore file for more "exotic" cases like this.

Define a protobuf toolchain

#28 introduces a custom rule to download a protobuf distribution appropriate for the current platform and ready for use by rules like rust_protobuf_library.
This concept could be extended to promote protobuf to a proper toolchain under prelude//toolchains similar to e.g. prelude//toolchains/cxx/zig.
Before implementing this we should research protobuf toolchains in the Bazel ecosystem and see which lessons can be drawn for Buck2. A useful resource may be this document by the Bazel rule author SIG.

building example fails with "Buck2 panicked and DICE may be responsible"

An attempt at building examples/prelude failed with a panic. Steps that caused it:

  • Checkout e7ab90e
  • $ cd examples/prelude
  • $ buck2 build ...
    =====================================================================
    WARNING: You are using Buck v2 compiled with `cargo`, not `buck`.
             Some operations may go slower and logging may be impaired.
    =====================================================================
    
    File changed: root//buck-out/v2/log/20220923-115200_7b50d74a-2e87-49b4-95f5-9613a98aeee1_events.proto.gz
    File changed: root//buck-out/v2/log
    24 additional file changes
    Buck2 panicked and DICE may be responsible. Please be patient as we try to dump DICE graph to `"/tmp/buck2-dumps/dice-dump-be0d66ae-cad5-498b-a48d-47f4d04d9239"`
    DICE graph dumped to `"/tmp/buck2-dumps/dice-dump-be0d66ae-cad5-498b-a48d-47f4d04d9239"`. DICE dumps can take up a lot of disk space, you should delete the dump after reporting.
    thread 'buck2-rt' panicked at 'a file/dir in the repo must have a parent, but `toolchains//` had none', buck2_common/src/dice/file_ops.rs:185:5
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    Build ID: 7b50d74a-2e87-49b4-95f5-9613a98aeee1
    Jobs completed: 0. Time elapsed: 0.0s.
    BUILD FAILED
    Command failed: Buck daemon event bus encountered an error
    
    Caused by:
        0: status: Unknown, message: "error reading a body from connection: broken pipe", details: [], metadata: MetadataMap { headers: {} }
        1: error reading a body from connection: broken pipe
        2: broken pipe
    

The corresponding DICE dump is here.
And the event-log is here.

The panic does not occur after a second run. Before the crash a buck2 build ... was run at the state of 8b4f7e0.

Toolchain for binutils sourcing

We currently have no mechanism for sourcing binutils such as ld, ar, nm, etc which are commonly used when building cxx projects.

Access absolute output path

Really liking buck2 so far, great work!

I've been trying to implement a proto_library rule in starlark but am stuck on a small thing - when creating the command I need access to the absolute directory of an artifact in order to pass it to protoc, i.e. what do I pass for ??? in the snippet below? If I use e.g. . then protoc just writes into my current git checkout.

cmd = cmd_args(["protoc", "--cpp_out=???"])

I've tried various combinations of

  • $(location) => does not get substituted
  • declaring an output with dir=True => "conflicts with the following output paths"

The closest thing I can find in the code is in genrule.bzl but also doesn't appear to work (nor would I expect it to work based on the Rust code)

"GEN_DIR": cmd_args("GEN_DIR_DEPRECATED"),  # ctx.relpath(ctx.output_root_dir(), srcs_path)

Protoc failed: --experimental_allow_proto3_optional was not set

On Ubuntu 22.04.1 LTS with protobuf-compiler installed via apt, protoc --version is libprotoc 3.12.4 and I get the following errors from cargo check:

error: failed to run custom build command for `buck2_test_proto v0.1.0 (buck2/buck2_test_proto)`
Caused by:
  process didn't exit successfully: `buck2/target/debug/build/buck2_test_proto-d224bbb82df65000/build-script-build` (exit status: 1)
  --- stdout
  cargo:rerun-if-env-changed=PROTOC
  --- stderr
  Error: Custom { kind: Other, error: "protoc failed: test.proto: This file contains proto3 optional fields, but --experimental_allow_proto3_optional was not set.\n" }

error: failed to run custom build command for `buck2_data v0.1.0 (buck2/buck2_data)`
Caused by:
  process didn't exit successfully: `buck2/target/debug/build/buck2_data-db49426b304377a7/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-env-changed=PROTOC
  --- stderr
  thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: Other, error: "protoc failed: data.proto: This file contains proto3 optional fields, but --experimental_allow_proto3_optional was not set.\n" }', buck2_data/build.rs:136:10
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I found that this patch makes the build succeed:

diff --git a/buck2_data/build.rs b/buck2_data/build.rs
index 08790b9a..acbc7c4e 100644
--- a/buck2_data/build.rs
+++ b/buck2_data/build.rs
@@ -16,6 +16,7 @@ fn main() -> io::Result<()> {
     println!("cargo:rerun-if-env-changed=PROTOC");
     buck2_protoc_dev::maybe_setup_protoc("../../..");
     tonic_build::configure()
+        .protoc_arg("--experimental_allow_proto3_optional")
         .type_attribute(
             "buck.data.BuckEvent.data",
             "#[allow(clippy::large_enum_variant)]",
diff --git a/buck2_test_proto/build.rs b/buck2_test_proto/build.rs
index 69f5b6fc..5401c498 100644
--- a/buck2_test_proto/build.rs
+++ b/buck2_test_proto/build.rs
@@ -6,7 +6,9 @@ fn main() -> io::Result<()> {
     // Tonic build uses PROTOC to determine the protoc path.
     println!("cargo:rerun-if-env-changed=PROTOC");
     buck2_protoc_dev::maybe_setup_protoc("../../..");
-    tonic_build::configure().compile(proto_files, &["."])?;
+    tonic_build::configure()
+        .protoc_arg("--experimental_allow_proto3_optional")
+        .compile(proto_files, &["."])?;
 
     // Tell Cargo that if the given file changes, to rerun this build script.
     for proto_file in proto_files {

Self-hosting open-source buck2

AFAIK the open source version of buck2 can currently only be built with cargo.
Making it self-hosting, i.e. to be able to build the open source buck2 with itself, could be a good milestone on the way to making it useful in open source use-cases.
This came up in discussion with @arlyon on how to build the open source version of buck2.

buck2_re_client server and config?

Hi - I can see a config section for buck2_re_client in the code, and have traced the various components like perform_cache_upload, but it's not clear to me what that cache protocol looks like. Is this code usable in the open source release?

Parameterized load() statements?

I have:

load("@bxl//hello.bxl", _hello_main = "main", _hello_args = "args")
hello = bxl(impl = _hello_main, cli_args = _hello_args)

load("@bxl//licenses.bxl", _licenses_main = "main", _licenses_args = "args")
license_check = bxl(impl = _licenses_main, cli_args = _licenses_args)

# ad infinitum...

I want:

files = {
    "hello": "@bxl//hello.bxl",
    # ...
}

for (name, path) in files.items():
    load(path, _main = "main", _args = "args")
    load_symbols({ name: bxl(impl = _main, cli_args = _args) })

But this currently doesn't work. I suspect there are pretty good reasons for this. But would something like this ever be achievable?

Part of the issue I think is just that the API for 'load' is awkward here, because I think it's supposed to introduce the symbols into the module context, not the local scope. But I really just want locally bound names here. So having something like e = load_single_expr(path, name) would be nice. But again I assume there are reasons for this.
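For contrast, the shape being asked for is an ordinary expression in languages with runtime module loading; here is a Python sketch using importlib, with stdlib modules standing in for the .bxl files (all names illustrative, not Starlark API):

```python
import importlib

# name -> module path, analogous to the `files` dict above;
# the stdlib `json` module stands in for a .bxl file.
files = {"json_tools": "json"}

loaded = {}
for name, path in files.items():
    mod = importlib.import_module(path)  # load() as a runtime expression
    loaded[name] = mod.dumps             # bind one symbol, like `main`
```

Starlark forbids exactly this: load() is a top-level statement that injects names into the module scope at a fixed point, which is part of what keeps evaluation hermetic and statically analyzable.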

libomnibus changing weak symbol to undefined?

I'm building a cxx_python_extension, but when I'm trying to import it I'm getting an undefined symbol:

ImportError: ...buck-out/v2/gen/root/291b6c3a26d6a3e9/__hello__/hello#link-tree/libomnibus.so: undefined symbol: _ITM_registerTMCloneTable

AFAICT this happens because libomnibus changes w to U. I think the nm command should probably exclude weak symbols (with --no-weak), but I don't understand whether this is by design or a bug?

example output from nm:

ls buck-out/v2/gen/root/291b6c3a26d6a3e9/__hello__/hello#link-tree/*.so | xargs -tn1 nm | rg _ITM_registerTMCloneTable
nm buck-out/v2/gen/root/291b6c3a26d6a3e9/__hello__/hello#link-tree/libc10.so
                 w _ITM_registerTMCloneTable
nm buck-out/v2/gen/root/291b6c3a26d6a3e9/__hello__/hello#link-tree/libomnibus.so
                 U _ITM_registerTMCloneTable
nm buck-out/v2/gen/root/291b6c3a26d6a3e9/__hello__/hello#link-tree/libtorch_cpu.so
                 w _ITM_registerTMCloneTable
nm buck-out/v2/gen/root/291b6c3a26d6a3e9/__hello__/hello#link-tree/libtorch_python.so
                 w _ITM_registerTMCloneTable

jemalloc probably won't work well on aarch64-linux

Leaving this here while I'm using the laptop, so that I don't forget it. Maybe something can be done, or not. But this will probably come back to bite someone eventually, I suspect.


jemalloc currently seems to be a dependency, serving as Buck2's global allocator. While I understand jemalloc is a big part of what makes Facebook tick, and it's excellent, there is a problem: jemalloc compiles the page size of the host operating system into the library, effectively making it part of its ABI. In other words, if you build jemalloc on a host with page size X, and then run it on an OS with page size Y, and X != Y, then things get bad; your program just crashes.

Until relatively recently, this wasn't a problem. Why? Because most systems collectively decided that 4096-byte pages are good enough (that's wrong, but not important.) So almost everything uses that, except for the fancy new Apple Silicon M-series, such as my M2 MBA. These systems exclusively use not 4k, but 16k pages. This page size is perfectly allowed by the architecture (4k, 16k, and 64k are all valid translation granules on aarch64) and 16k pages are a great choice for many platforms, especially client ones.

So the problem begins to crop up once people start building aarch64-linux binaries for their platforms, e.g. Arch Linux ARM or NixOS, which distribute aarch64 binaries. Until the advent of Apple Silicon, you could reasonably expect everything to use the same page size. But now we have this new, reasonably popular platform using 16k pages. There's a weird split happening here: most of the systems building packages for users are some weird ARM board (or VM) in a lab churning out binaries 24/7. They just need to run Linux and not catch fire. They aren't very fast, they typically have old CPUs, and they often run custom, hacked Linux kernels that barely work. But most developers or end users? They want good performance and lots of features, with a stable kernel. For ARM platforms, the only supported options they reasonably have today are Raspberry Pis, the Nvidia Jetson series, and now Apple Silicon. And Apple Silicon is, without comparison, the best bang for your buck and the highest performer. So users are increasingly likely to use one platform, while the systems churning out packages use another, incompatible one.

This isn't a theoretical concern; Asahi Linux users like myself still (somewhat often) run into broken software. jemalloc isn't the only thing that doesn't support non-4k pages easily, it's just one of the more notorious and easy-to-spot culprits, and it turns otherwise working packages into non-working ones: https://github.com/AsahiLinux/docs/wiki/Broken-Software

Right now, I'm building buck2 manually, so this isn't a concern. But it means my binaries aren't applicable to non-AS users, and vice versa.
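The mismatch is easy to observe: this small stdlib-Python sketch prints the page size of the running kernel, which is the value a jemalloc build would need to match (a jemalloc built on a host with a smaller page size will fail at startup on such a system):

```python
import os

# Page size of the running kernel: 4096 on most x86-64 Linux systems,
# 16384 on Apple Silicon machines.
page_size = os.sysconf("SC_PAGE_SIZE")
print(page_size)

# Sanity check: page sizes are always powers of two.
assert page_size > 0 and page_size & (page_size - 1) == 0
```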


So there are a few reasonable avenues of attack here:

  • Don't use a custom allocator at all, and rely on libc.
    • Probably not good; most libc's notoriously aim for "good" steady state performance, not peak performance under harsher conditions.
  • Turn off jemalloc only on aarch64
    • Maybe OK, though a weird incongruence.
  • Turn on jemalloc only when the user (e.g. internal FB builds) ask for it.
    • Maybe OK; at least you could make the argument y'all have enough money to support customized builds like this while the rest of us need something else.
    • You're already doing your own custom builds already, so maybe this isn't a big deal
  • Switch to another allocator, wholesale
    • Could also make it a configurable toggle
    • Making it a toggle is potentially a footgun though; it's the kind of "useless knob" that people only bang on once the other ones don't work and they're desperate. This makes it more likely to bitrot, for it to lag in testing and performance eval, etc.
    • I've had very good experience with mimalloc; much like jemalloc it also has an excellent design, fun codebase, and respectable author (Daan Leijen fan club.)
      • But I haven't confirmed it avoids this particular quirk of jemalloc's design. Maybe a dead end.
    • It would probably require a bunch of testing on a large codebase to see what kind of impact this change has. I suspect the FB codebase is a good place to try. ;)

I don't know which one of these is the best option.

UI request: a shorthand for 'buck bxl' invocations (or: arbitrary 'buck foobar' commands)

Here's a nifty one-liner I put behind the name bxl:

exec buck2 bxl "bxl//top.bxl:$1" -- "${@:2}"

This requires you to:

  • set bxl = path/to/some/dir in your root .buckconfig file [repositories] stanza, and then
  • export a bunch of top level bxl() definitions from bxl//top.bxl which can be invoked

Then you can just run the command bxl hello --some_arg 123 in order to run an exported bxl() definition named hello. Pretty nice! And because the command invocation uses the bxl// cell to locate top.bxl, the bxl command can work in any repository that defines that cell, and the files can be located anywhere.


So my question is: could something like this be supported in a better way? The reason is pretty simple: it's just a lot easier to type and remember!

Perhaps the best prior art here I can think of is git, which allows git foobar to invoke the separate git-foobar binary, assuming it's in $PATH and has an executable bit set. We don't need to copy this exactly 1-to-1, but it's a generalized solution to this problem, and in fact it's arguable that being able to do a buck2 foobar subcommand is useful for the same reasons. So maybe that's a better place to start, and the bxl scripts could be a more specialized case of this.
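That git-style fallback could be sketched in a few lines of shell; note this is hypothetical behavior being proposed, not something buck2 implements today:

```shell
# Dispatch `buck2 <sub> args...` to an external `buck2-<sub>` executable
# on $PATH when the subcommand is not built in, mirroring how git
# resolves `git foobar` to `git-foobar`.
dispatch() {
  sub="$1"
  shift
  if command -v "buck2-$sub" >/dev/null 2>&1; then
    "buck2-$sub" "$@"
  else
    echo "buck2: no such subcommand: $sub" >&2
    return 1
  fi
}
```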

Injecting module state inside BUILD files?

While reading the prelude I saw the oncall mechanism, but that isn't really relevant; it's the syntax that I like. Here's something that I was wondering if it could be made to work:

load("@prelude//rust.bzl", "rust_binary")

license("MIT OR Apache-2.0")

rust_binary(
    name = "main",
    file = "./main.rs",
    out = "hello",
)

The idea here is the license field specifies the license of all targets in the current BUILD module. But it could be arbitrary metadata in the module context, and the rust_binary rule — or any rule really — could look up this data and use it. This would require the ability for license to somehow inject state that rust_binary could read. I would also like it if this could be queryable via BXL.

Not sure what this would look like, though. But I'm wondering if something like this can be achieved today.

The workaround is fairly simple: just specify a license = "..." attribute on every target. So this is mostly a convenience, but the general mechanism could be used in other ways, perhaps.

Refer to external dependencies?

I'm trying to import

prebuilt_cxx_library(
    name = "spdlog",
    header_dirs = ["/nix/store/yjqxa9782hpl59cg17h5ma4d4l0zh0ac-spdlog-1.10.0-dev/include/"],
    exported_headers = glob(["**/*.h"]),
    header_only = True,
    visibility = ["PUBLIC"],
)

I get Error when treated as a path: expected a relative path but got an absolute path instead:

I also tried using a relative path like ../../../nix/store/... which worked in buck1 IIRC but get

  Error when treated as a target: Invalid absolute target pattern `../../../../nix/store/yjqxa9782hpl59cg17h5ma4d4l0zh0ac-spdlog-1.10.0-dev/include/` is not allowed: Expected a `:`, a trailing `/...` or the literal `...`.
  Error when treated as a path: expected a normalized path but got an un-normalized path instead: `../../../../nix/store/yjqxa9782hpl59cg17h5ma4d4l0zh0ac-

Is there a recommended way to include external dependencies?

Providing a hermetic python toolchain

For serious use-cases we would benefit from a system-independent python toolchain that can source an interpreter. At the basic level this involves downloading a copy of CPython for the current platform that can run code. An issue with this is that basic cpython depends on dynamic libraries such as libssl and libsqlite, so we need to either provide a mechanism for building those consistently (c / cpp compiler) or use an interpreter with static linking such as https://python-build-standalone.readthedocs.io

Potential learnings from bazel

  • dependencies are managed in a repo rule, meaning they are not using the same toolchain that bazel uses, but use the local toolchain
  • cross compilation is tough (broken)

buck2 Documentation

This is a general tracking issue patching holes in the buck2 documentation regarding background information, general use, rule writing, etc. Some high level goals:

  • API docs for the prelude
    • providers
    • rules
  • buck2 for bazel users
  • glossary
  • quick start with and without the prelude
  • developer environment setup w/ extensions
  • toolchains and toolchain rules

Can't build via documented `cargo` commands

With Ubuntu 20.04.5 LTS and rustup, cargo, protoc installed and on path, building via cargo with the supplied command:

cargo build --bin=buck2 --release

eventually gives error text of:

error: failed to run custom build command for `buck2_data v0.1.0 (/home/philip/work/buck2/buck2_data)`

Caused by:
  process didn't exit successfully: `/home/philip/work/buck2/target/release/build/buck2_data-aee8f8c0a5c9a583/build-script-build` (exit status: 1)
  --- stdout
  cargo:rerun-if-env-changed=PROTOC
  cargo:rerun-if-env-changed=PROTOC_INCLUDE
  cargo:rerun-if-env-changed=OUT_DIR
  cargo:rerun-if-changed=data.proto
  cargo:rerun-if-changed=data.proto
  cargo:rerun-if-changed=.

  --- stderr
  INFO: Not setting $PROTOC to "../../../third-party/protobuf/dotslash/protoc", as path does not exist
  INFO: Not setting $PROTOC_INCLUDE to "../../../third-party/protobuf/protobuf/src", as path does not exist
  Error: Custom { kind: Other, error: "protoc failed: Unknown flag: --experimental_allow_proto3_optional\n" }
warning: build failed, waiting for other jobs to finish...

buck2.build website doesn't redirect HTTP to HTTPS

buck2.build website doesn't redirect HTTP requests to HTTPS even though there is a valid HTTPS endpoint (valid TLS certificate).

$ curl -vvv http://buck2.build/
*   Trying 104.21.12.229:80...
* Connected to buck2.build (104.21.12.229) port 80 (#0)
> GET / HTTP/1.1
> Host: buck2.build
> User-Agent: curl/7.86.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Sun, 12 Feb 2023 12:41:29 GMT
< Content-Type: text/html; charset=utf-8
[...]

It can be fixed by enforcing HTTPS. Enforcing HTTPS for your GitHub Pages site.

Overriding a rule name (buck.type) to match its public, exported name?

Consider an API like the following for exposing rules to users:

__private_name = rule(...)

public_api = struct(
    rule01 = __private_name
)

The goal of an API like this is to just be easier to read and write; you don't need to know the public symbols coming out of a module and when applied consistently makes it a little easier to find things. It's nothing groundbreaking. Just me exploring the API design space.

So now when I write BUILD files I use public_api.rule01 to declare targets. Great.

But this falls down when querying target nodes. The buck.type of a rule is given by the name given to the global top-level rule, not based on what's exported from a module (which would be hard to understand/make sense of, in any case.) In other words, the following query fails:

buck cquery 'kind("public_api.rule01", ...)'

This works:

buck cquery 'kind("__private_name", ...)'

Which exposes the internal name from the module. This isn't necessarily specific to struct()-like APIs; a simple case like public_name = __private_name suffers from the same issue too, I think. The internal name is leaked through to the public user.

You can verify this with a cquery; in the following example the exported name of the rule for users is rust.binary, not __rust_binary:

~/src/buck2-nix.sl$ buck cquery src/hello: --output-all-attributes
Build ID: ab0ce60d-bea1-458b-b886-5c02c610306d
Jobs completed: 1. Time elapsed: 0.0s.
{
  "root//src/hello:main (prelude//platform:default-632fe5438d4aecc1)": {
    "buck.type": "__rust_binary",
    "buck.deps": [
      "prelude//toolchains/rust:rust-stable (prelude//platform:default-632fe5438d4aecc1)"
    ],
    "buck.package": "root//src/hello:TARGETS",
    "buck.oncall": null,
    "buck.target_configuration": "prelude//platform:default-632fe5438d4aecc1",
    "buck.execution_platform": "prelude//platform:default",
    "name": "main",
    "default_target_platform": null,
    "target_compatible_with": [],
    "compatible_with": [],
    "exec_compatible_with": [],
    "visibility": [],
    "tests": [],
    "_toolchain": "prelude//toolchains/rust:rust-stable (prelude//platform:default-632fe5438d4aecc1)",
    "file": "root//src/hello/main.rs",
    "out": "hello"
  }
}

So the question is: would it be possible to allow a rule to specify its buck.type name in some manner so that queries and rule uses can be consistent? Perhaps just another parameter to rule() that must be a constant string? If that could work?

It's worth noting this is mostly a convenience so that a rule user isn't confused by the leaked name. I can keep the nice struct()-like API and have the __private_names follow a similar naming logic so that "translating" between the two isn't hard. Not that big a deal.

I can imagine this might be too deeply woven into the implementation to easily fix.

Allow release version of Rust Analyzer to work on the code

It's not currently possible to use Rust Analyzer at the root of the repo. This holds back open-source collaboration.

This is because Rust Analyzer requires builds to use one version of the toolchain, and by default, that's the stable version. This repo isn't built with the stable toolchain.

It's possible via Settings to get the Analyzer to use a particular nightly toolchain version, but even that is not sufficient, as there are crates used that themselves specify a different nightly toolchain version to the main toolchain being used.

Suggested fix: converge the code on the stable version of the toolchain, or, failing that, on a particular nightly version, so that Rust Analyzer can work.

Providing a hermetic C/C++ toolchain

A self-contained C/C++ toolchain that doesn't assume pre-installed components on the system ensures a consistent build behavior across environments (CI or different developer machines). Traditionally, obtaining a self-contained C/C++ compiler distribution has been quite difficult. The closest commonly used one in the Bazel ecosystem is the LLVM toolchain created by Grail, however, this one still assumes a global or separately provided sysroot. An easier to use alternative is zig cc built on top of clang and already used for Bazel in bazel-zig-cc, which also has builtin support for cross-compilation.

Similar to #19

download_file cannot be used twice in the same rule because it cannot specify an identifier

The following example doesn't seem to work, though I think it should:

defs.bzl:

def __download_license_data_impl(ctx: "context") -> ["provider"]:
    base_url = lambda name: "https://github.com/spdx/license-list-data/blob/{}/json/{}.json".format(ctx.attrs.revision, name)
    def dl_json_file(name, sha1):
        out = ctx.actions.declare_output("{}.json".format(name))
        ctx.actions.download_file(out, base_url(name), sha1 = sha1)
        return out

    if len(ctx.attrs.sha1) != 2:
        fail("sha1 must be a list of two strings")

    if "licenses:" != ctx.attrs.sha1[0][:9]:
        fail("first sha1 hash must start with 'licenses:'")
    if "exceptions:" != ctx.attrs.sha1[1][:11]:
        fail("second sha1 hash must start with 'exceptions:'")

    licenses_sha1 = ctx.attrs.sha1[0][9:]
    exceptions_sha1 = ctx.attrs.sha1[1][11:]

    licenses_out = dl_json_file("licenses", licenses_sha1)
    exceptions_out = dl_json_file("exceptions", exceptions_sha1)

    return [
        DefaultInfo(default_outputs = [ licenses_out, exceptions_out ])
    ]

download_license_data = rule(
    impl = __download_license_data_impl,
    attrs = {
        "revision": attrs.string(),
        "sha1": attrs.list(attrs.string()),
    },
)

and BUILD:

load(":defs.bzl", "download_license_data")

download_license_data(
    name = "spdx_license_data",
    revision = "v3.19",
    sha1 = [
        "licenses:cd5e714801727eb3dce52908ed818b5f54ddd6ba",
        "exceptions:69ca85d10f0737ddd5fd6c68d047bfb0f0fc65a7",
    ],
)

yields:

When running analysis for `root//src/larry:spdx_license_data (prelude//platform:default-7b06d4530de034dc)`

Caused by:
    Analysis produced multiple actions with category `download_file` and no identifier. Add an identifier to these actions to disambiguate them

But download_file can't take an identifier!
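
Until download_file accepts an identifier, one workaround is to split the downloads into separate targets, so each analysis performs a single download_file action. A sketch (the rule and attribute names here are hypothetical, not from the prelude):

```starlark
# Sketch: one download per target. Each analysis then contains exactly
# one action in the `download_file` category, so no identifier is needed.
def __download_one_impl(ctx: "context") -> ["provider"]:
    out = ctx.actions.declare_output(ctx.attrs.out_name)
    ctx.actions.download_file(out, ctx.attrs.url, sha1 = ctx.attrs.sha1)
    return [DefaultInfo(default_outputs = [out])]

download_one = rule(
    impl = __download_one_impl,
    attrs = {
        "out_name": attrs.string(),
        "url": attrs.string(),
        "sha1": attrs.string(),
    },
)
```

A wrapper target can then depend on one download_one target per JSON file.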

How to use multiple anon targets to build a graph?

First off, congrats on the Buck2 source release. I know it isn't ready yet, but I've been patiently, eagerly awaiting it for a while, and I'm so far pleased with my initial testing of it. I'm very interested in the more dynamic class of build systems that Buck2 is part of, with my previous "world champion" title holder in this category being Shake. (Also: Hi Neil!) I'm naturally already using this in anger for nothing important, which is probably a bad idea, but I hope I can give some feedback.

Here is some background. I have this file called toolchains.bzl that contains a big list of hashes, and dependencies a hash has; effectively this is just a DAG encoded as a dict. I would like to "build" each of these hashes, in practice that means doing some work to download it, then creating a symlink pointing to it (really pointing to /nix/store, but that isn't wholly relevant.)

Conceptually, the DAG of hashes is kind of like a series of "source files", where each hash is a source file, and should be "compiled" (downloaded) before any dependent "sources" are compiled. And many different targets can share these "source files". For example, here is the reverse-topologically sorted DAG for the hash 5jfg0xr0nkii0jr7v19ri9zl9fnb8cx8-rust-default-1.65.0, which you can compute yourself from the above toolchains file:

sdsqayp3k5w5hqraa3bkp1bys613q7dc-libunistring-1.0
s0w6dz5ipv87n7fn808pmzgxa4hq4bil-libidn2-2.3.2
hsk71z8admvgykn7vzjy11dfnar9f4r1-glibc-2.35-163
x7h8sxz1cf5jrx1ixw5am4w300gbrjr1-cargo-1.65.0-x86_64-unknown-linux-gnu
n6mpg42fjx73y2kr1vl8ihj1ykmdhrbm-rustfmt-preview-1.65.0-x86_64-unknown-linux-gnu
nfgpn9av331q7zi1dl6d5qpir60y513s-bash-5.1-p16
k0wbm2panqbb0divlapqazbwlvcgv6m0-expand-response-params
2vqp383jfrsjb3yq0szzkirya257h1dp-gcc-11.3.0-lib
nwl7pzafadvagabksz61rg3b3cs58n9i-gmp-with-cxx-stage4-6.2.1
vv0xndc0ip83f72n0hz0wlcf3g8jhsjd-attr-2.5.1
6b882j01cn2s9xjfsxv44im4pm4b3jsr-acl-2.3.1
h48pjfgsjl75bm7f3nxcdcrqjkqwns7m-coreutils-9.1
lal84wf8mcz48srgfshj4ns1yadj1acs-zlib-1.2.13
92h8cksyz9gycda22dgbvvj2ksm01ca4-binutils-2.39
dj8gbkmgrkwndjghna8530hxavr7b5f4-linux-headers-6.0
2vbw0ga4hlxchc3hfb6443mv735h5gcp-glibc-2.35-163-bin
7p2s9z3hy317sdwfn0qc5r8qccgynlx1-glibc-2.35-163-dev
hz9w5kjpnwia847r4zvnd1dya6viqpz1-binutils-wrapper-2.39
gc7zr7wh575g1i5zs20lf3g45damwwbs-gcc-11.3.0
qga0k8h2dk8yszz1p4iz5p1awdq3ng4p-pcre-8.45
fnzj8zmxrq96vnigd0zc888qyys22jfv-gnugrep-3.7
k04h29hz6qs45pn0mzaqbyca63lrz2s0-gcc-wrapper-11.3.0
wrwx0zy8zblcsq8zwhdqbsxr2jv063fk-rustc-1.65.0-x86_64-unknown-linux-gnu
2s0sp14r5aaxhl0z16b99qcrrpfx7chi-clippy-preview-1.65.0-x86_64-unknown-linux-gnu
2dvg4fgb0lvsvr9i8qlljqj34pk2aydd-rust-docs-1.65.0-x86_64-unknown-linux-gnu
0fv17zk01p08zh6bi17m61zlfh93fcwj-rust-std-1.65.0-x86_64-unknown-linux-gnu
5jfg0xr0nkii0jr7v19ri9zl9fnb8cx8-rust-default-1.65.0

So you can read this list like so: to build the source input on line N, you must first build every source input on lines [0...(N-1)]. Exactly what you'd expect.

Problem 1: anon_targets example seems broken

So two different hashes may have a common ancestor or set of dependencies; glibc is a good example because almost every hash has it in its dependency tree. This seemed like a perfect use case for anonymous targets: they allow common work to be shared, introducing sharing that would be lost otherwise. In fact the example in that document is in some sense the same as this one; many "source files" are depended upon by multiple targets, but the targets don't know about the common structure between them. Therefore you can compile a single "source file" once, rather than N times, once per target.

But I simply can't get it to work, and because I'm new to Buck, I feel a bit lost on how to structure it. I think the problem is simply that the anon_targets function defined there doesn't work. I have specialized it in the __anon_nix_targets function below:

NixStoreOutputInfo = provider(fields = [ "path" ])

# this rule is run anonymously. its only job is to download a file and create a symlink to it as its sole output
def __nix_build_hash_0(ctx):
    out = ctx.actions.declare_output("{}".format(ctx.attrs.hash))
    storepath = "/nix/store/{}".format(ctx.attrs.hash)
    ctx.actions.run(cmd_args(
        ["nix", "build", "--out-link", out.as_output(), storepath]
    ), category = "nix")
    return [ DefaultInfo(default_outputs = [out]), NixStoreOutputInfo(path = out) ]

__nix_build_hash = rule(
    impl = __nix_build_hash_0,
    attrs = { "hash": attrs.string() },
)

# this builds many anonymous targets with the previous __nix_build_hash rule
def __anon_nix_targets(ctx, xs, k=None):
    def f(hs, ps):
        if len(hs) == 0:
            return k(ctx, ps) if k else ps
        else:
            return ctx.actions.anon_target(__nix_build_hash, hs[0]).map(
                lambda p: f(hs[1:], ps+[p])
            )
    return f(xs, [])

# this downloads a file, and symlinks it, but only after all the dependents are done
def __nix_build_toolchain_store_path_impl(ctx: "context"):
    hash = "5jfg0xr0nkii0jr7v19ri9zl9fnb8cx8-rust-default-1.65.0"
    deps = [
        # "sdsqayp3k5w5hqraa3bkp1bys613q7dc-libunistring-1.0",
        # "s0w6dz5ipv87n7fn808pmzgxa4hq4bil-libidn2-2.3.2",
        # "hsk71z8admvgykn7vzjy11dfnar9f4r1-glibc-2.35-163",
        # "x7h8sxz1cf5jrx1ixw5am4w300gbrjr1-cargo-1.65.0-x86_64-unknown-linux-gnu",
        # "n6mpg42fjx73y2kr1vl8ihj1ykmdhrbm-rustfmt-preview-1.65.0-x86_64-unknown-linux-gnu",
        # "nfgpn9av331q7zi1dl6d5qpir60y513s-bash-5.1-p16",
        # "k0wbm2panqbb0divlapqazbwlvcgv6m0-expand-response-params",
        # "2vqp383jfrsjb3yq0szzkirya257h1dp-gcc-11.3.0-lib",
        # "nwl7pzafadvagabksz61rg3b3cs58n9i-gmp-with-cxx-stage4-6.2.1",
        # "vv0xndc0ip83f72n0hz0wlcf3g8jhsjd-attr-2.5.1",
        # "6b882j01cn2s9xjfsxv44im4pm4b3jsr-acl-2.3.1",
        # "h48pjfgsjl75bm7f3nxcdcrqjkqwns7m-coreutils-9.1",
        # "lal84wf8mcz48srgfshj4ns1yadj1acs-zlib-1.2.13",
        # "92h8cksyz9gycda22dgbvvj2ksm01ca4-binutils-2.39",
        # "dj8gbkmgrkwndjghna8530hxavr7b5f4-linux-headers-6.0",
        # "2vbw0ga4hlxchc3hfb6443mv735h5gcp-glibc-2.35-163-bin",
        # "7p2s9z3hy317sdwfn0qc5r8qccgynlx1-glibc-2.35-163-dev",
        # "hz9w5kjpnwia847r4zvnd1dya6viqpz1-binutils-wrapper-2.39",
        # "gc7zr7wh575g1i5zs20lf3g45damwwbs-gcc-11.3.0",
        # "qga0k8h2dk8yszz1p4iz5p1awdq3ng4p-pcre-8.45",
        # "fnzj8zmxrq96vnigd0zc888qyys22jfv-gnugrep-3.7",
        # "k04h29hz6qs45pn0mzaqbyca63lrz2s0-gcc-wrapper-11.3.0",
        # "wrwx0zy8zblcsq8zwhdqbsxr2jv063fk-rustc-1.65.0-x86_64-unknown-linux-gnu",
        # "2s0sp14r5aaxhl0z16b99qcrrpfx7chi-clippy-preview-1.65.0-x86_64-unknown-linux-gnu",
        "2dvg4fgb0lvsvr9i8qlljqj34pk2aydd-rust-docs-1.65.0-x86_64-unknown-linux-gnu",
        "0fv17zk01p08zh6bi17m61zlfh93fcwj-rust-std-1.65.0-x86_64-unknown-linux-gnu",
    ]

    def k(ctx, ps):
        deps = [p[NixStoreOutputInfo].path for p in ps]
        out = ctx.actions.declare_output("{}".format(hash))
        storepath = "/nix/store/{}".format(hash)
        ctx.actions.run(cmd_args(
            ["nix", "build", "--out-link", out.as_output(), storepath]
        ).hidden(deps), category = "nix")
        return [ DefaultInfo(default_outputs = deps + [out]), NixStoreOutputInfo(path = out) ]

    return __anon_nix_targets(ctx, [{"hash": d} for d in deps], k)

__build_toolchain_store_path_rule = rule(impl = __nix_build_toolchain_store_path_impl, attrs = {})

If the list deps has more than one entry, this example fails for any target:

[email protected]:~/src/buck2-nix.sl$ buck clean; buck build src/nix-depgraph:
server shutdown
/home/austin/src/buck2-nix.sl/buck-out/v2/log
/home/austin/.buck/buckd/home/austin/src/buck2-nix.sl/v2
Initialization complete, running the server.
When running analysis for `root//src/nix-depgraph:rust-stable (<unspecified>)`

Caused by:
    expected a list of Provider objects, got promise()
Build ID: 5612be60-9b84-4657-97f6-64e049aedada
Jobs completed: 4. Time elapsed: 0.3s.
BUILD FAILED

However, if len(deps) == 1, i.e. you comment out the next-to-last line, then it works as expected. I think the problem might be that if there is a single promise in the list, Buck can figure it out, but if the list contains many promises, it simply can't? Or something?

So I've been banging my head on this for a day or so, and can't find any reasonable way to make this example work that's intuitive or obvious. I actually had to fix several syntax errors in the anon_targets example documentation when I took the code from it (e.g. it uses the invalid slicing syntax xs[1...] instead of the correct xs[1:]), so I suspect it may have just been a quick example. I can send a PR to fix that, perhaps.

But it always comes back to this exact error: expected list of Providers, but got promise(). Some advice here would be nice; I feel this is close to working, though.
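
A toy Python model of the promise chaining above (this is not the Buck2 API, just an illustration of the suspected failure mode): each .map wraps its result in a fresh promise, so the recursive f returns a promise whose payload is itself a promise, and a consumer expecting a plain list of providers sees promise() instead.

```python
class Promise:
    """Toy stand-in for an anon-target promise: a deferred value whose
    map() wraps the mapped result in another Promise."""

    def __init__(self, thunk):
        self._thunk = thunk

    def map(self, f):
        return Promise(lambda: f(self._thunk()))

    def resolve(self):
        # A framework that flattens nested promises would do this; the
        # error suggests Buck2's analysis layer (at this point) does not.
        v = self._thunk()
        while isinstance(v, Promise):
            v = v._thunk()
        return v


def anon_target(attrs):
    # Toy analogue of ctx.actions.anon_target: yields "providers".
    return Promise(lambda: {"providers": attrs})


def chain(targets):
    # Mirrors __anon_nix_targets: recursively map over the remaining
    # targets, accumulating resolved providers.
    def f(hs, ps):
        if not hs:
            return ps
        return anon_target(hs[0]).map(lambda p: f(hs[1:], ps + [p]))
    return f(targets, [])


result = chain([{"hash": "a"}, {"hash": "b"}])
# With two or more targets, `result` wraps another Promise rather than a
# plain list -- analogous to "expected a list of Provider objects, got
# promise()".
```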

Problem 2: dependencies of anonymous targets need to (recursively) form a graph

The other problem in the above example, which could be handled after the previous problem, is that the dependencies don't properly form a graph structure. If we have the reverse-toposorted list:

foo
foobar
foobarbaz
foobarbazqux

Then the above example correctly specifies foobarbazqux as having all preceding entries as dependencies. But this isn't recursive: foobarbaz doesn't specify that it needs foo and foobar; foobar doesn't specify it needs foo and so on.

In my above example this isn't strictly required for correctness, because the nix build command handles it automatically. But it does mean that the graph Buck sees isn't really "complete" or accurate, because it is missing the dependency structure between links.

So I don't really know the "best" way to structure this. I basically just need anonymous targets that are dependent on other anonymous targets, I guess. This is really a code structure question, and honestly I believe I'm overthinking it substantially, but I'd like some guidance to help ensure I'm doing things the "buck way" if at all possible.

It is worth noting (or emphasizing) that the graph of dependencies above, the toolchains.bzl file, is always "statically known" up front, and is auto-generated. I could change its shape (the dict structure) as much as I want if it makes things easier, but the list of dependencies is truly fully static, so this feels like something that should be very possible with Buck.
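
Since the whole DAG is statically known, the per-node recursive dependency sets could be expanded before any targets are declared, which would hand Buck the complete edge structure. A memoized Python sketch (dict shape again hypothetical):

```python
# Hypothetical miniature DAG: each key maps to its direct dependencies.
DEPS = {
    "foo": [],
    "foobar": ["foo"],
    "foobarbaz": ["foobar"],
    "foobarbazqux": ["foobarbaz"],
}

def transitive_deps(dag, node, memo=None):
    """Return the full recursive dependency set of `node`, visiting each
    node at most once via memoization."""
    if memo is None:
        memo = {}
    if node in memo:
        return memo[node]
    deps = set()
    for d in dag[node]:
        deps.add(d)
        deps |= transitive_deps(dag, d, memo)
    memo[node] = deps
    return deps
```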

Other than that...

The above two problems are my current issues that I think would be amazing to solve, but I'm all ears to alternative solutions. It's possible this is an "X/Y problem" situation, I suppose.

But other than that: I'm pretty excited about Buck2 and can't wait to see it stabilize! Sorry for the long post; I realize you said "you'll probably have a bad time" using the repository currently, but you wanted feedback, and I'm happy to provide it!


Side note 1

I notice that there are no users of anon_target anywhere in the current Buck prelude, and no examples or tests of it; so perhaps something silently broke or isn't ready to be used yet? I don't know how it's used inside Meta, to be fair, so perhaps something is simply out of date.

Side note 2

I am not using the default Buck prelude, but my own prelude designed around Nix, so it would be nice if any solutions were "free standing" like the above code and didn't require the existing one.

Side note 3

The anon_targets.md documentation notes that anon_targets could be a builtin with potentially more parallelism. While that's nice, I'd argue something like anon_targets should really be in the prelude itself, even if it isn't a builtin, exactly so that people aren't left to do the error-prone thing of copy/pasting it everywhere, as I did above, where I discovered several problems.

Cargo install's --git flag: "member of the wrong workspace"

Normally cargo install is able to install binaries from a subdirectory of a git repo. It will find all workspace roots, resolve the requested package name in each one, fail if there is an ambiguity, and otherwise install it.

In buck2's case it seems to go wrong because two different workspaces both claim to contain some of the same crates.

$ cargo install cli --git https://github.com/facebookincubator/buck2 --bin buck2
    Updating git repository `https://github.com/facebookincubator/buck2`
error: package `/home/david/.cargo/git/checkouts/buck2-4c0ac0340bde8e6a/c117530/allocative/allocative/Cargo.toml` is a member of the wrong workspace
expected: /home/david/.cargo/git/checkouts/buck2-4c0ac0340bde8e6a/c117530/Cargo.toml
actual:   /home/david/.cargo/git/checkouts/buck2-4c0ac0340bde8e6a/c117530/allocative/Cargo.toml

https://github.com/facebookincubator/buck2/blob/c117530548ddc3ec1e85b5af52e5fe678b32dff0/Cargo.toml#L3-L7

https://github.com/facebookincubator/buck2/blob/c117530548ddc3ec1e85b5af52e5fe678b32dff0/allocative/Cargo.toml#L1-L5

It's hard for me to tell whether this arrangement is intentional, or just an oversight. In any case, it leads to issues.
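
For reference, Cargo's workspace exclude key is the usual mechanism for telling a root workspace to leave a nested workspace alone; a sketch of what the root Cargo.toml could declare (whether that fits buck2's layout is for the maintainers to judge):

```toml
# Root Cargo.toml (sketch): keep the nested allocative workspace out of
# this workspace's membership so crate ownership is unambiguous.
[workspace]
exclude = ["allocative"]
```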

Support fallback of buck2.file_watcher to 'notify' if 'watchman' doesn't work?

As I noted in #57 (comment), I found a way to enable watchman support via the config option buck2.file_watcher. See the corresponding commit, in particular the highlighted lines: thoughtpolice/buck2-nix@e47fc8f#diff-d33e979799a45c7c51752e9c8d96a3e452015d1a40b1e4b6ec6a98e92c4d8430R92-R105 — I also recommend the commit message for a summary.

It's very handy. In short, I use direnv to automatically use systemd-run --user to launch a transient user service for watchman, then export an appropriate WATCHMAN_SOCK variable for the repository. Works great! Except...

I forgot to enable it in CI, which caused this failure:

[2022-12-22T15:45:13.628+00:00] 2022-12-22T15:45:13.626994Z  WARN buck2_server::file_watcher::watchman::core: Connecting to Watchman failed (will re-attempt): Error reconnecting to Watchman: While invoking the watchman CLI to discover the server connection details: reader error while deserializing, stderr=`2022-12-22T15:45:13,625: [watchman] while computing sockname: failed to create /usr/local/var/run/watchman/runner-state: No such file or directory
[2022-12-22T15:45:13.630+00:00] `
[2022-12-22T15:45:13.663+00:00] Build ID: 6d28aa89-1dee-49fd-80c2-b8a54e1c6396
[2022-12-22T15:45:13.665+00:00] Command failed: 
[2022-12-22T15:45:13.665+00:00] SyncableQueryHandler returned an error
[2022-12-22T15:45:13.665+00:00]
[2022-12-22T15:45:13.665+00:00] Caused by:
[2022-12-22T15:45:13.665+00:00]     0: No Watchman connection
[2022-12-22T15:45:13.665+00:00]     1: Error reconnecting to Watchman
[2022-12-22T15:45:13.665+00:00]     2: Error reconnecting to Watchman
[2022-12-22T15:45:13.665+00:00]     3: While invoking the watchman CLI to discover the server connection details: reader error while deserializing, stderr=`2022-12-22T15:45:13,633: [watchman] while computing sockname: failed to create /usr/local/var/run/watchman/runner-state: No such file or directory
[2022-12-22T15:45:13.665+00:00]        `
BUILD FAILED

So there's two things going on here:

  • I'm using a watchman binary from the repository, the Ubuntu 22.04 one. But I didn't install it with dpkg; I installed it with Nix, since you can't rely on the user having it. A consequence is that some directories, such as /usr/local/var/run/watchman/, don't exist, which causes spurious failures, since watchman binaries implicitly have a STATEDIR set to /usr/local at build time. And if there is no WATCHMAN_SOCK variable set, the watchman_client Rust library will query the CLI, so calls like watchman get-sock will fail, because they always probe the non-existent statedir; that's where part of the stack trace above comes from. I can work around this in some way with Nix, and will file a possible bug report with upstream watchman about it; I just wanted to clarify this since it appears in the above error.
    • Note that the transient systemd unit sets explicit paths for the statefile, logfile, and pidfile, the three variables watchman requires; if these are set, the implicit STATEDIR is not used.
    • This is why my setup sets WATCHMAN_SOCK instead: if it is set, the watchman_client library uses it above all else and doesn't need to query the CLI binary. So it "just works".
  • The build fails outright if the file_watcher can't be configured.

Would it be possible to optionally fall back to buck2.file_watcher=notify if watchman_client detects and captures a failure like the above? This would at least allow users to continue developing in the repository without a strange error if they don't enable watchman, even if notify then picks up file changes that watchman would otherwise ignore. I'm trying to make my repository 'plug and play', and it feels bad if you cd $src; buck build ... and it immediately fails like this without watchman.
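
For anyone hitting this today, the explicit opt-in that avoids the hard failure is the same buck2.file_watcher option, set in .buckconfig:

```ini
# .buckconfig
[buck2]
file_watcher = notify
```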

It should be possible to override protoc-bin-vendored

As of #60, buck2 no longer depends on an external protoc binary, but uses a binary vendored from @stepancheg's rust-protoc-bin-vendored.

While reducing friction is a noble goal I support fully, this had an unintended side effect: it breaks the ability to build buck2 with tools like Nix without small workarounds.

Problem: dynamically linked binaries require /usr/lib

The basic problem is that when you build something with Nix, paths like /usr are not available during the build process. Every dependency must be declared explicitly, and all dependencies are located within /nix/store. This includes libc.so.6, pthreads, and everything else.

[email protected]:~/src/rust-protoc-bin-vendored$ ldd protoc-bin-vendored-linux-x86_64/bin/protoc
        linux-vdso.so.1 (0x00007ffcf97b1000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f53ce17d000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f53ce096000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f53cde6e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f53ce18d000)

This means that this ELF binary requires /lib64/ld-linux-x86-64.so.2 to be available, but it won't be during the build phase. Therefore, during the buck2 build process, the attempt to invoke protoc will fail, because the binary cannot have its dependencies satisfied by the dynamic system linker ld.so. For comparison, here's what the Nix-ified output looks like:

[email protected]:~/src/rust-protoc-bin-vendored/ > ldd protoc-bin-vendored-linux-x86_64/bin/protoc
        linux-vdso.so.1 (0x00007ffeaf4ab000)
        libpthread.so.0 => /nix/store/4nlgxhb09sdr51nc9hdm8az5b08vzkgx-glibc-2.35-163/lib/libpthread.so.0 (0x00007f04dc75f000)
        libm.so.6 => /nix/store/4nlgxhb09sdr51nc9hdm8az5b08vzkgx-glibc-2.35-163/lib/libm.so.6 (0x00007f04dc67f000)
        libc.so.6 => /nix/store/4nlgxhb09sdr51nc9hdm8az5b08vzkgx-glibc-2.35-163/lib/libc.so.6 (0x00007f04dc400000)
        /lib64/ld-linux-x86-64.so.2 => /nix/store/4nlgxhb09sdr51nc9hdm8az5b08vzkgx-glibc-2.35-163/lib64/ld-linux-x86-64.so.2 (0x00007f04dc766000)

Can't we fix the binaries?

There is a fun tool called patchelf that could fix this for us; it is designed to handle this exact problem for Nix machines, and you can just patchelf binaries to get them to work. The binaries would work if we could patch the libc.so.6 paths. However, that would require injecting patchelf into the cargo build process for buck2, which is not exactly trivial and looked quite painful to pull off reliably.

The binaries could also be statically linked, but I don't know how much of a PITA that is.

Quick fix patch

The easiest fix I found was to set PROTOC and PROTOC_INCLUDE manually in the build environment to point at working copies of the compiler and its include files, and then comment out the code that uses protoc-bin-vendored.

The following patch was sufficient for me to keep working as if nothing happened:

commit e92d67e4568c5fa6bcfbbd7ee6e16b9f132114a9
Author: Austin Seipp <[email protected]>
Date:   Mon Jan 2 00:46:31 2023 -0600

    hack(protoc): do not use protoc-bin-vendored
    
    This package doesn't work under the Nix sandbox when using Cargo,
    because the vendored binaries can't easily be patchelf'd to look at the
    correct libc paths.
    
    Instead, just rely on PROTOC and PROTOC_INCLUDE being set manually.
    
    Signed-off-by: Austin Seipp <[email protected]>

diff --git a/app/buck2_protoc_dev/src/lib.rs b/app/buck2_protoc_dev/src/lib.rs
index 69980b9a..fecdde46 100644
--- a/app/buck2_protoc_dev/src/lib.rs
+++ b/app/buck2_protoc_dev/src/lib.rs
@@ -80,8 +80,8 @@ impl Builder {
 
     pub fn setup_protoc(self) -> Self {
         // It would be great if there were on the config rather than an env variables...
-        maybe_set_protoc();
-        maybe_set_protoc_include();
+        //maybe_set_protoc();
+        //maybe_set_protoc_include();
         self
     }

This is small enough that I'm happy to carry it for now, but a solution to support both would be nice, with vendoring being the default.
