Introduction

This is FRC Team 971's main code repository. There are README* files throughout the source tree documenting specifics for their respective folders.

Contributing

All development of AOS is done on our Gerrit instance at https://software.frc971.org/gerrit, with GitHub serving as a read-only mirror. We are happy to add external contributors. If you are interested, reach out to Austin Schuh or Stephan Massalt and we will help you get access. In case of disputes over whether a patch should be accepted, Austin has final say.

Submissions must be made under the terms of the following Developer Certificate of Origin.

By making a contribution to this project, I certify that:

    (a) The contribution was created in whole or in part by me and I
        have the right to submit it under the open source license
        indicated in the file; or

    (b) The contribution is based upon previous work that, to the best
        of my knowledge, is covered under an appropriate open source
        license and I have the right under that license to submit that
        work with modifications, whether created in whole or in part
        by me, under the same open source license (unless I am
        permitted to submit under a different license), as indicated
        in the file; or

    (c) The contribution was provided directly to me by some other
        person who certified (a), (b) or (c) and I have not modified
        it.

    (d) I understand and agree that this project and the contribution
        are public and that a record of the contribution (including all
        personal information I submit with it, including my sign-off) is
        maintained indefinitely and may be redistributed consistent with
        this project or the open source license(s) involved.

To do this, add the following to your commit message. Gerrit will enforce that all commits have been signed off.

Signed-off-by: Random J Developer <[email protected]>

Git has support for adding Signed-off-by lines via git commit -s, or you can set up a git commit hook to automatically sign off your commits. Stack Overflow has instructions for how to do this if you are interested.
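If you want to automate the sign-off, here is a minimal sketch of a prepare-commit-msg hook (the hook body is illustrative, adapted from the standard git documentation example, not the team's official setup):

cat > .git/hooks/prepare-commit-msg <<'EOF'
#!/bin/sh
# Append a Signed-off-by trailer for the configured git user, unless one
# is already present in the commit message.
NAME=$(git config user.name)
EMAIL=$(git config user.email)
git interpret-trailers --if-exists doNotAdd \
    --trailer "Signed-off-by: $NAME <$EMAIL>" --in-place "$1"
EOF
chmod +x .git/hooks/prepare-commit-msg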

Access to the code

The main central location for our code is our Gerrit server at https://software.frc971.org/gerrit. If you are on a platform not compatible with our codebase, follow the instructions here to set up access to the build server. To download a copy of the 971 code on your computer, follow these steps:

  1. Fill out the 971 system access request form to get a Gerrit account.
  • The form is pinned in the #coding channel in Slack.
  • You need to be fully signed up for the team, have turned in all the forms, and have passed the safety test.
  2. Wait for Stephan Massalt to set up the account and message you the credentials.
  3. When you log into Gerrit for the first time, please add your email address.
  4. Add your SSH key to Gerrit in order to be able to check out code.
  • If you don't already have an ssh key, you can create one using ssh-keygen -t ed25519. This will create a public/private key pair; the default location for your public key will be ~/.ssh/id_ed25519.pub
  • Log into Gerrit, go to Settings->SSH Keys, paste your public key into the New SSH Key text box, and click ADD NEW SSH KEY.
  5. Install git: sudo apt install git
  6. Go to the 971-Robot-Code project in Gerrit and run the command to download the 971-Robot-Code repository (a rough sketch is given below).
  • We recommend downloading the code via SSH using the clone with commit-msg hook command.
  • NOTE: Running with the option to clone with commit-msg hook will save you trouble later.
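The exact command comes from the Gerrit web UI; as a rough sketch (the SSH port and project path here are assumptions, so copy Gerrit's version rather than this one):

# Clone over SSH and install Gerrit's commit-msg hook (29418 is Gerrit's
# default SSH port; substitute your username and Gerrit's exact command).
git clone "ssh://USERNAME@software.frc971.org:29418/971-Robot-Code"
scp -p -P 29418 USERNAME@software.frc971.org:hooks/commit-msg 971-Robot-Code/.git/hooks/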

To learn more about git, open a terminal and run man git, or see git(1) (especially the NOTES section).

Prerequisites

The instructions below assume you have the following:

  1. A host computer with an appropriate OS or VM to compile the 971 code using Bazel.
     1. The currently supported operating system for building the code is amd64 Debian Buster.
     2. It is likely to work on any x86_64 GNU/Linux system (e.g., Ubuntu 20.04), but that's not at all well-tested.
  2. Your favorite text editor installed, e.g., vim, emacs
  3. Access to the 971-Robot-Code repository and have downloaded the source code
  4. The ability to ssh into target CPUs like the roboRIO and Raspberry Pi

Building the code

We use Bazel to build the code. Bazel has extensive docs, including a nice build encyclopedia reference, and does a great job with fast, correct incremental rebuilds.

There are a couple of options for building the code, given here: setting up your own computer, or using the frc971 build server.

Steps to set up a computer to build the code:

  1. Install any Bazel version:

    1. Check to see if the version of Linux you're running has an apt package for Bazel: apt-cache search bazel or just try sudo apt install bazel
    2. More likely, you'll need to install manually-- see here. We recommend using apt-key instead of gnupg in setting up the key:
      1. Step 1: Add Bazel distribution URI as a package source
        sudo apt install curl
        curl -fsSL https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
        echo "deb [arch=amd64] https://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
        
      2. Step 2: Install Bazel
        sudo apt update && sudo apt install bazel
        
  2. Install the required packages:

    sudo apt-get update
    sudo apt-get install python
  3. Change settings to allow Bazel's sandboxing to work-- follow the directions in doc/frc971.conf. For example, the commands to do this would be:

    1. sudo cp doc/frc971.conf /etc/sysctl.d/
    2. sudo sysctl --system
  4. In order to run certain tests, you need to give yourself permissions--follow the "Set up real-time niceties" section of aos/events/README.md.

Setting up access to a workspace on the build server

In order to use the build server, you'll first need to get ssh access set up. (NOTE: you don't need to do any of the other setup steps used for your own computer, since things like Bazel and Python are already installed on the build server.)

  1. Use ssh-keygen to create a public and private key.
# On Windows (important: use PowerShell!):
cd ~/.ssh
ssh-keygen -t ed25519 -f id_971_ed25519
chmod 600 id_971_ed25519
# On Linux and macOS:
ssh-keygen -t ed25519 -f ~/.ssh/id_971_ed25519
chmod 600 ~/.ssh/id_971_ed25519
  2. Complete the 971 system access request form for a buildserver account. WAIT for feedback, as Stephan needs to set up the account.
  • The form is pinned in the #coding channel in Slack.
  • When asked for your public SSH key, provide the contents of id_971_ed25519.pub:
cat ~/.ssh/id_971_ed25519.pub
# This output is your public SSH key for the request form.
  3. Once you hear back from Stephan, test SSH:
ssh [email protected] -p 2222 -i ~/.ssh/id_971_ed25519
  4. If that doesn't work, send the error message to #coding. If it does work, run the exit command and then start an SSH tunnel:
ssh [email protected] -p 2222 -i ~/.ssh/id_971_ed25519 -L 9971:127.0.0.1:3389
  5. Now that you have buildserver access set up, you can follow the vscode instructions to use the buildserver with vscode.
  6. For a graphical interface, you can use the Remote Desktop app in Windows. Enter 127.0.0.1:9971 for the computer name and use your Gerrit username. Once connected, accept the server certificate, enter the password you gave Stephan (either something unique or your Gerrit password), and select the Default panel config. You can exit Remote Desktop if you are comfortable with the raw command-line interface; for future logins, all you need to do is start the tunnel and then log in using the app. If you are developing a graphical application (e.g., the spline UI), you must run the build command inside the Remote Desktop session.
  7. Very important: in order to be able to commit, you need to configure your email address in git. To do this, run the following command, replacing <YOUR EMAIL> with the email that you are using for Gerrit:
git config --global user.email "<YOUR EMAIL>"

If there are any questions, post to the #coding Slack channel so that other people who may reach the same issue can refer back to that.

Bazel commands for building, testing, and deploying the code:

  • Build and test everything (on the host system and for the roborio target; note, this may take a while):
bazel test //...
bazel build --config=roborio -c opt //...
  • Build the code for a specific robot:
# For the roborio:
bazel build --config=roborio -c opt //y2020/...
# For the raspberry pi:
bazel build --config=armv7 -c opt //y2020/...
  • Configuring a roborio: Freshly imaged roboRIOs need to be configured to run the 971 code at startup. This is done by using the setup_roborio.sh script.
bazel run -c opt //frc971/config:setup_roborio -- roboRIO-XXX-frc.local
  • Download code to a robot:
# For the roborio
bazel run --config=roborio -c opt //y2020:download_stripped -- roboRIO-971-frc.local

This assumes the roborio is reachable at roboRIO-971-frc.local. If that does not work, you can try a static IP address like 10.9.71.2 (see the troubleshooting notes below).

# For the raspberry pi's
bazel run --config=armv7 -c opt //y2020:pi_download_stripped -- 10.9.71.101

NOTE:

  1. The Raspberry Pis require that you have your ssh key installed on them in order to copy code over (see the ssh-copy-id sketch at the end of this section)
  2. They are configured to use the IP addresses 10.X.71.Y, where X is 9, 79, 89, or 99 depending on the robot number (971, 7971, 8971, or 9971, respectively), and Y is 101, 102, etc for pi #1, #2, etc.
  • Downloading specific targets to the robot
    1. Generally if you want to update what's running on the robot, you can use the download_stripped (or pi_download_stripped) targets. These will rsync only the changed files, and so are pretty efficient.
    2. If you have a need to compile a specific module, you can build stripped versions of the individual modules by adding "_stripped" to the module name. For example, to build the calibration code (//y2020/vision:calibration) for the pi (armv7), run:
    bazel run --config=armv7 -c opt //y2020/vision:calibration_stripped
    You will then need to manually copy the resulting file over to the robot.
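As noted above, the pis need your public ssh key installed before the download targets can copy files over. One way to do that is ssh-copy-id; the username and address here are assumptions based on the alias example later in this document:

# Install your ssh key on robot 7971's pi #2 (adjust user/IP for your robot).
ssh-copy-id pi@10.79.71.102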

Code reviews

We want all code to have at least one other person look it over before it gets merged into the master branch. Gerrit has extensive documentation on starting reviews. There is also a good intro User Guide, an intro to working with Gerrit, and an overview of Gerrit workflows.

TL;DR: Make and commit your changes, do git push origin HEAD:refs/for/master, and then click on the provided link to add reviewers. If you just upload a change without adding any reviewers, it might sit around for a long time before anybody else notices it.

git-review can make the upload process simpler.
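A hedged sketch of the git-review flow (the package name is from Debian/Ubuntu; depending on your checkout, git-review may need a .gitreview file to find our Gerrit):

# Install git-review, then push the current branch to refs/for/master.
sudo apt install git-review
git review master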

Some other useful packages <TODO: Need to review these>

These aren't strictly necessary to build the code, but Michael found the additional tools provided by these packages useful when working with the code (as of May 13, 2018).

# Get some useful packages including git and subversion.
apt-get update
apt-get install git git-gui vim-gtk3
apt-get install vim-doc git-doc exim4-doc-html yapf
apt-get install bazel clang-format
apt-get install python avahi-daemon
# Install apt-file so that packages can be searched.
apt-get install apt-file
apt-file update
# Install sysstat so that you can tell how many resources are being used
#   during the compile.
apt-get install sysstat
# iostat is used to observe how hard the disk is being worked and other
#   performance metrics.
iostat -dx 1
# gitg is a graphical user interface for git.  I find it useful for
#   understanding the revision history of the repository and viewing
#   log messages and changes.
apt-get install gitg
# Also may want to install `buildifier` for formatting BUILD files.

Creating ssh aliases

It is also handy to alias logins to the Raspberry Pis by adding lines like this to your ~/.ssh/config file:

Host pi-7971-2
    User pi
    ForwardAgent yes
    HostName 10.79.71.102
    StrictHostKeyChecking no

or, for the roborio:

Host roborio-971
    User admin
    HostName 10.9.71.2
    StrictHostKeyChecking no

This allows you to use the alias to ping, ssh, or run commands like:

# Download code to robot #7971's raspberry pi #2
bazel run --config=armv7 -c opt //y2020:download_stripped -- pi-7971-2

Roborio Kernel Traces

Currently (as of 2020.02.26), top tends to produce misleading statistics. You can get more useful information about CPU usage by using kernel traces instead. Sample usage:

# Note that you will need to install the trace-cmd package on the roborio.
# This may not be a trivial task.
# Start the trace
trace-cmd start -e sched_switch -e workqueue
# Stop the trace
trace-cmd stop
# Save the trace to trace.dat
trace-cmd extract

You can then scp the trace.dat file to your computer and run kernelshark trace.dat (may require installing the kernelshark apt package).
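Concretely, that workflow might look like the following sketch (the roborio-971 alias is from the ssh-aliases section above; trace.dat ends up wherever you ran trace-cmd extract):

# Copy the trace off the roborio, then view it locally.
scp roborio-971:trace.dat .
sudo apt install kernelshark
kernelshark trace.dat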

Notes on troubleshooting network setup

If the roboRIO has been configured to use a static IP address like 10.9.71.2, set the laptop to have an IP address on the 10.9.71.x subnet with a netmask of 255.0.0.0. The .x must be different from the .2 of the roboRIO and from any other device on the network. The driver station uses .5 or .6, so avoid those. The radio uses .1 or .50, so avoid those too. A good choice might be in the 90-99 range. If you are at the school, disconnect from the student wireless network, or try setting your netmask to 255.255.255.0 if you want to be on both networks. The student wireless network is on a 10.?.?.? subnet, which can cause problems with connecting to the robot.
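On a Linux laptop, one way to do this is with the ip tool; a minimal sketch, assuming an interface named enp0s25 (check ip link for yours):

# Add a static address on the robot subnet (/8 is netmask 255.0.0.0).
sudo ip addr add 10.9.71.95/8 dev enp0s25
ping 10.9.71.2  # confirm the roboRIO is reachable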

If running Bazel on the download_stripped target does not work for the IP address you're using for the roborio or the raspberry pi, it probably means that the robot and laptop are on different subnets. They need to be on the same subnet for the laptop to connect to the robot. Connecting can be confirmed by using ping.

ping roboRIO-971-frc.local

or

ping 10.9.71.2

If this does not work, perhaps the roboRIO has not been configured to have a static IP address. Use a USB cable to connect from a Windows laptop to the roboRIO and use a web browser (Chrome is preferred, IE/Edge is not-- see this technical note) to configure the roboRIO to have a static IP address of 10.9.71.2. Browse to http://roborio-971-frc.local or http://172.22.11.2. Click on the "Ethernet" icon on the left, select "Static" for the "Configure IPv4 Address" option. Set the "IPv4 Address" to 10.9.71.2. Set the "Subnet Mask" to "255.0.0.0". Finally click on "Save" at the bottom of the screen. If you have trouble using an Ethernet cable, try using a USB cable (USB A->B).

Another option is to configure the laptop to have a link-local connection by using the "Network Settings" GUI. The laptop will then be on the same subnet, in the address range of 169.254.0.0 to 169.254.255.255. James thinks this will only work over Ethernet (i.e., not USB; he is not sure what will happen if you attempt this over USB), and only if the robot does not have a static IP address set and there is no DHCP server assigning an IP address to the roboRIO. James also notes that this implies the roboRIO will have a 169.254.x.x IP address, and that the only simple way to figure it out is to use mDNS.

LSP Setup for Rust

You can run bazel run //tools:gen_rust_project to generate a rust-project.json file which rust-analyzer will pick up. You will need to execute this rule periodically as it will be outdated whenever the BUILD files change.
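In practice, that is just rerunning the command whenever the BUILD structure changes:

# Regenerate rust-project.json so rust-analyzer stays in sync.
bazel run //tools:gen_rust_project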

Note that there's currently no way to tell rust-analyzer how to compile the code, so while it will give you completion support, go to definition, and other niceties, it won't show compilation errors or warnings at this point.

Other resources

  1. Intro to AOS, our robot Operating System
  2. Introductory example: ping/pong
  3. Example of logging

TODOs:

  1. Add more on networking setup and troubleshooting
  2. Move Roborio Kernel Traces out of here, maybe into documentation/
  3. Currently requires apt install libsigsegv2, but this should be temporary

Issues

Better timing report consumption

Create one or both of:

  • Command line tool to filter timing reports based on application name and display them (with various possibilities for how it is displayed--at a minimum just dumping the straight JSON).
  • Plot that breaks out timing report information in a useful manner.

AOS <-> ROS bridge

Create some sort of bridge to allow bridging ROS nodes to AOS, to accommodate people who want to be able to experiment with both.

Exactly how this will look is not entirely clear, but the goal should be to be relatively easy to get working, rather than focusing on getting every single possible feature working.

Year 2018 arm feedforward

Hey,

I'm Gabor from team 114, and we're looking at your arm feedforward code to try to implement it on our robot. It's in the y2018/control_loops/python/arm_trajectory.py file. Can you please explain a few things to us?

  1. What are the G1 and G2 constants? I think this is the gear ratio of the motors, but I'm not 100% sure.

The following questions are not essential to be answered, but if you have time, we'd appreciate if you do

  1. What is the difference between the K2 and K4 matrices? I understand what the other matrices do, but K2 and K4 are both multiplied by omega, which would imply that both are used to relate torque to omega. I don't understand why you would need two matrices for the same thing. Can you please clarify what the exact purpose of these matrices is?
  2. Can you please clarify how exactly the constant matrices are calculated and what the values in it are?

@AustinSchuh @platipus25 I'm pinging you because it seems that you are the contributors to this file, so I assume you know the implementation details.

Thanks in advance,
Gabor and team 114

JSON config should not accept timestamp_logger_nodes that differ from source_node

Guided by Austin-- I was looking at the following code out of y2022_roborio.json:

{ "name": "/drivetrain", "type": "frc971.control_loops.drivetrain.Output", "source_node": "roborio", "frequency": 400, "max_size": 80, "num_senders": 2, "logger": "LOCAL_AND_REMOTE_LOGGER", "logger_nodes": [ "imu" ], "destination_nodes": [ { "name": "imu", "priority": 5, "timestamp_logger": "LOCAL_AND_REMOTE_LOGGER", "timestamp_logger_nodes": [ "imu" ], "time_to_live": 5000000 } ] },
The timestamp_logger_nodes should be the source_node, "roborio", rather than "imu" as it currently is. Our JSON creation/merging shouldn't accept choices other than the source_node.

Dynamic flag-setting in starterd

Allow specifying flags as part of the starter RPC definition. Unsure exactly how this should manage override of flags specified in the AOS config. This is very helpful when wanting to start applications in the same environment as they would see under starterd but with slightly changed flags.

Faster DARE solver

I thought you might be interested that I contributed a new DARE solver to WPILib that's 2.8x faster than SLICOT, the solver you use now, based on roboRIO benchmarking of a 5-state, 2-input differential drive LQR problem.

The gist of the algorithm from DOI 10.1080/00207170410001714988 is:

#include <Eigen/Cholesky>
#include <Eigen/Core>
#include <Eigen/LU>

template <int States, int Inputs>
Eigen::Matrix<double, States, States> DARE(
    const Eigen::Matrix<double, States, States>& A,
    const Eigen::Matrix<double, States, Inputs>& B,
    const Eigen::Matrix<double, States, States>& Q,
    const Eigen::Matrix<double, Inputs, Inputs>& R) {
  // [1] E. K.-W. Chu, H.-Y. Fan, W.-W. Lin & C.-S. Wang
  //     "Structure-Preserving Algorithms for Periodic Discrete-Time
  //     Algebraic Riccati Equations",
  //     International Journal of Control, 77:8, 767-788, 2004.
  //     DOI: 10.1080/00207170410001714988
  //
  // Implements SDA algorithm on p. 5 of [1] (initial A, G, H are from (4)).
  using StateMatrix = Eigen::Matrix<double, States, States>;

  StateMatrix A_k = A;
  StateMatrix G_k = B * R.llt().solve(B.transpose());
  StateMatrix H_k;
  StateMatrix H_k1 = Q;

  do {
    H_k = H_k1;

    StateMatrix W = StateMatrix::Identity() + G_k * H_k;
    auto W_solver = W.lu();

    StateMatrix V_1 = W_solver.solve(A_k);

    // Solve V₂Wᵀ = Gₖ for V₂
    //
    // V₂Wᵀ = Gₖ
    // (V₂Wᵀ)ᵀ = Gₖᵀ
    // WV₂ᵀ = Gₖᵀ
    // V₂ᵀ = W.solve(Gₖᵀ)
    // V₂ = W.solve(Gₖᵀ)ᵀ
    StateMatrix V_2 = W_solver.solve(G_k.transpose()).transpose();

    G_k += A_k * V_2 * A_k.transpose();
    H_k1 = H_k + V_1.transpose() * H_k * A_k;
    A_k *= V_1;
  } while ((H_k1 - H_k).norm() > 1e-10 * H_k1.norm());

  return H_k1;
}

The preconditions necessary for convergence are:

  1. Q is symmetric positive semidefinite
  2. R is symmetric positive definite
  3. The (A, B) pair is stabilizable
  4. The (A, C) pair where Q = CᵀC is detectable

The paper proves convergence under weaker conditions, but it seems to involve solving a generalized eigenvalue problem with the QZ algorithm. SLICOT and Drake use that to solve the whole problem, so it seemed too expensive to bother attempting.

The precondition checks turned out to be 50-60% of the total algorithm runtime, so WPILib exposed a function that skips them if the user knows they'll be satisfied. This would be a good candidate for your Kalman filter error covariance init code, since a comment in there mentioned it didn't use the DARE solver because it had unnecessary checks.

Here's WPILib's impl, which supports static sizing (for performance) and dynamic sizing (for JNI) and throws exceptions on precondition violations.
https://github.com/wpilibsuite/allwpilib/blob/main/wpimath/src/main/native/include/frc/DARE.h

I'd recommend std::expected instead of exceptions for your use case.

Add config validator rule

Add a rule that creates a test to confirm that, for a given config:

  • Remote timestamp channels are all specified and specified correctly.
  • Logs for any given node are fully self-consistent (i.e., won't require --skip_missing_forwarding_entries)--note that this should be configurable, because you may only care about this for a few nodes.

JSON to flatbuffer parsing error messages are hard to see

People consistently seem to struggle with identifying failures associated with the JSON->flatbuffers code. Some of this may be that the messages tend to not stand out much in the program output. Part of it is also that failures of the code tend to show up as segfaults rather than some sort of more coherent error message.

Dropping sent-too-fast messages in LogReader makes unreadable logs

We currently drop replayed messages if they get sent too fast when replaying a log. This is itself somewhat dubious behavior, but it also creates an issue: if you create a log of the replay, you can get errors like

F0402 12:20:39.001713 8892 logfile_utils.cc:1506] Check failed: result.timestamp == monotonic_remote_time ({.boot=0, .time=160.591261903sec} vs. {.boot=0, .time=160.623109070sec}) : Queue index matches, but timestamp doesn't. Please investigate!

OOM killer leaks messages

If the OOM killer kills a process, all its handles get leaked. This is because the OOM killer doesn't run the robust futex cleanup to free the messages.

It actually looks like /proc/pid/stat has a start time for the thread. When opening the queue, we could look for any PIDs which are nonzero, see if they exist, and then check that the start time matches. If it does, it's pretty much guaranteed to be the same process, and is actually running.

Timestamp shared memory messages on observation/receipt

Currently, messages sent on shared memory channels are timestamped prior to actually being "sent" (https://github.com/frc971/971-Robot-Code/blob/master/aos/ipc_lib/lockless_queue.cc#L1028). While we should have the guarantees in place to ensure that messages do not get sent with out-of-order timestamps on a given channel, the current ordering does mean that a process listening on multiple channels could plausibly observe messages across channels out-of-order (which isn't supposed to happen).

If we instead figured out an appropriate way to have the first observer of a message timestamp it (this "observer" would be the first of any fetchers or watchers, as well as something that would run immediately after the send actually went through), then because of how the event processing happens on the listening side, it should no longer be possible to observe out of order events.

Avoid double-sending on replayed channels

Add a check to logger to ensure that applications in log replay aren't sending on channels that are also getting replayed (i.e., all the relevant channels from the log are remapped).

Should pretty much just be a matter of doing this TODO:

// TODO(james): Enable exclusive senders on LogReader to allow us to
// ensure we are remapping channels correctly.
event_loop_unique_ptr_ = node_event_loop_factory_->MakeEventLoop(
"log_reader", {NodeEventLoopFactory::CheckSentTooFast::kNo,
NodeEventLoopFactory::ExclusiveSenders::kNo});

and then seeing what blows up.

Sandbox escape: libpcre3

To reproduce:

$ bazel build //documentation/tutorials:create-a-new-autonomous
INFO: Analyzed target //documentation/tutorials:create-a-new-autonomous (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /home/steple/source/971-Robot-Code/documentation/tutorials/BUILD:14:13: Executing genrule //documentation/tutorials:create-a-new-autonomous failed: (Exit 127): bash failed: error executing command (from target //documentation/tutorials:create-a-new-autonomous) /bin/bash -c ... (remaining 1 argument skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
external/pandoc/usr/bin/pandoc: error while loading shared libraries: libpcre.so.3: cannot open shared object file: No such file or directory
Target //documentation/tutorials:create-a-new-autonomous failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.881s, Critical Path: 0.01s
INFO: 2 processes: 2 internal.
FAILED: Build did NOT complete successfully

sudo apt install libpcre3 will fix this on Ubuntu 23.10.

Ran out of signals

We have a watcher which blocks forever (different bug). This makes it so the event loop isn't able to process signals.

RT signals queue up. RLIMIT_SIGPENDING is the limit per process, which defaults to somewhere around 7k of them.

Once this happens, we are unable to wake up any process, and everything gets triggered off timers. We need a way to not accumulate signals forever.
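For reference, a quick way to inspect the per-process pending-signal limit on a Linux box (a sketch; the exact default varies by system):

# Show the max number of queued signals for this shell's process.
ulimit -i
# Or read it from the process limits directly.
grep 'pending signals' /proc/self/limits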

Support logging remote timestamps on third-party node

If we have a system with nodes a, b, and logger, and are only running a logger on logger, then currently it is not possible to log timestamps for messages sent between a and b. So if there is a channel that is forwarded from a to b and the logger, and you attempt to replay a log from the perspective of b, you will not get any messages replayed on that channel, since the logger didn't log the timestamps for the message arrivals on b and so doesn't know when the messages actually arrived :(.

This means being able to specify arbitrary nodes in the timestamp_logger_nodes field for a connection, see:

// If the corresponding delivery timestamps for this channel are logged
// remotely, which node should be responsible for logging the data. Note:
// for now, this can only be the source node. Empty implies the node this
// connection is connecting to (i.e. name).
timestamp_logger_nodes:[string] (id: 2);

irq_affinity should report top results

This would be incredibly handy for debugging what is happening when something goes wrong, to see what else is happening in the system. 1 Hz is plenty (as we dredge through /proc to see how things are going), or even lower.

We should also put the scheduler + affinity + priority in that report too, along with memory usage.

Proxy EventLoop

An EventLoop implementation which creates multiple EventLoops that are all scheduled via one underlying EventLoop would allow combining multiple applications in a single process. They must all be on the same node, and can communicate with each other and the outside world as normal. This would provide a similar API to SimulatedEventLoopFactory which can create new EventLoops on demand.

Some tricky things to keep in mind:

  • Make sure watchers and senders from all the proxied EventLoops work with each other
  • Multiple senders on the same channel in multiple proxied EventLoops. TimingReports end up doing this.
  • Timers and fetchers can mostly be used directly. Should tack on something in the name to help decipher TimingReports though (each one will be reported twice, once with a longer name in the proxy EventLoop and once with just the given name for the proxied EventLoop)

Build breaks when libbz2-dev is installed

cargo_raze__pcre fails to build. Digging in, the key lines in bazel-out/k8-opt/bin/external/cargo_raze__pcre/pcre_foreign_cc/CMake.log are:

-- Found BZip2: /usr/lib/x86_64-linux-gnu/libbz2.so (found version "1.0.8") 
-- Looking for BZ2_bzCompressInit
-- Looking for BZ2_bzCompressInit - not found
-- Found ZLIB: /dev/shm/bazel-sandbox.f856700afb979694adb03c9da2b7c0b9dacd7824b0655f8fe6ce9ad53a8e0492/linux-sandbox/1656/execroot/org_frc971/bazel-out/k8-opt/bin/external/cargo_raze__pcre/pcre.ext_build_deps/lib/libzlib.a (found version "1.2.11") 
-- Found Readline: /usr/include
-- Could not find OPTIONAL package Editline

We then explode with:

[ 83%] Building C object CMakeFiles/pcregrep.dir/pcregrep.c.o
/dev/shm/bazel-sandbox.f856700afb979694adb03c9da2b7c0b9dacd7824b0655f8fe6ce9ad53a8e0492/linux-sandbox/1656/execroot/org_frc971/external/cargo_raze__pcre/pcregrep.c:69:10: fatal error: 'bzlib.h' file not found
#include <bzlib.h>
         ^~~~~~~~~
1 error generated.
make[2]: *** [CMakeFiles/pcregrep.dir/build.make:76: CMakeFiles/pcregrep.dir/pcregrep.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:116: CMakeFiles/pcregrep.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Somehow, it looks like cmake is leaking outside the sandbox and finding things on the host when being built. Hmmm.

aos_dump doesn't respect maps

If you have /camera -> /pi1/camera on pi1, aos_dump /camera complains and doesn't respect the map.

This is a bit subtle, since we really would like aos_dump to be able to subscribe to any channel for debugging. But, it would also be nice to have aos_dump properly respect remaps.

Support configuring watchers

Sometimes it is helpful to only process the latest message with a watcher, or to disable "die when you get behind" behavior. Add configuration support to watchers to enable all this.

Better C++ flatbuffers API

The current flatbuffers C++ API requires building up messages inside-out. It should be possible to use the C++ stack for this instead, to allow the C++ code to build up messages from the outside in (which is often more natural).

More concretely, this means generating C++ classes for each flatbuffer struct which hold values for each field (and track which ones are set). Nested structs should be contained in their parent objects, because a major use case is calling functions which return a nested object, and there's no other easy place to store these objects. Then, these objects can do a depth-first traversal of the C++ object graph to actually write the flatbuffer (aka each object writes out its children, tracking the resulting offsets in local variables, then writes out itself and returns the offset to its parent).

I think writing the buffer should be done in an offset_t Write method (or similar), which is passed the FlatBufferBuilder. The top-level will normally be called via a templated wrapper type that holds onto the fbb, with a destructor which calls Write on the top-level object and then Finish, to keep everything in outside-in order.

An alternative would be holding the fbb in each object, and then having their destructors write it out, but that means more stack space and makes it impossible to decide to skip writing it out later (for example, build up a sub object and then realize it's not actually needed, without taking up any space in the final buffer). Doing it in the destructor also means the parent object has to keep track of where the offsets to all its children are coming from.

Need to think through handling arrays of primitives. These can be big and variable-sized, neither of which interacts well with putting them on the stack. At the same time, allocating them immediately unlike other objects makes for a confusing API. Maybe provide APIs for both?

Handling arrays of objects is tricky. ArrayWriter<T> StartArray(int max_size) would work well for many cases, with convenience void CreateArray(span<const T>) when the temporary storage is managed externally. However, there's no place to stash C++ pointers to the intermediate objects. The flatbuffers array needs to be placed after those objects in the buffer, and C++ pointers are larger than offsets on 64-bit platforms. Forcing the user to allocate that array externally goes against making this API nice and easy to use, but that's the best I can think of right now. void CreateArray(span<const T*>), with an extra level of indirection, could be handy but also looks like a big foot-gun with dangling references.

Do we need to manage shared subobjects? Writing them out redundantly is easy, but not helpful for space efficiency. Maybe use a bit in the bitmask to track whether it's been written out, and make a union for all the variable storage which gets overwritten to the offset it was written to?

This will increase stack usage, which may be undesirable. It's probably worth using a bitmask in the generated code to track which fields are set, rather than using std::optional or a separate bool for each one.

Copying sub-objects could become expensive. It should be possible to structure this so that RVO (return value optimization) constructs the sub-objects in place for the common case of a function returning an entire sub-object.

These classes end up looking similar to the existing TableT, but without storing data in the objects. Do we want to expose reading the fields (and checking if they're set) and/or building them from a const Table*?
