gdbstub's Introduction

gdbstub

An ergonomic, featureful, and easy-to-integrate implementation of the GDB Remote Serial Protocol in Rust, with no-compromises #![no_std] support.

gdbstub makes it easy to integrate powerful guest debugging support into your emulator / hypervisor / debugger / embedded project. By implementing just a few basic methods of the gdbstub::Target trait, you can have a rich GDB debugging session up and running in no time!

gdbstub's API makes extensive use of a technique called Inlineable Dyn Extension Traits (IDETs) to expose fine-grained, zero-cost control over enabled GDB protocol features without relying on compile-time feature flags. Aside from making it effortless to toggle enabled protocol features, IDETs also ensure that any unimplemented features are guaranteed to be dead-code-eliminated in release builds!

If you're looking for a quick snippet of example code to see what a featureful gdbstub integration might look like, check out examples/armv4t/gdb/mod.rs.

Why use gdbstub?

  • Excellent Ergonomics
    • Instead of simply exposing the underlying GDB protocol "warts and all", gdbstub tries to abstract away as many of the raw GDB protocol details from the user as possible.
      • Instead of having to dig through obscure XML files deep within the GDB codebase just to read/write CPU/architecture registers, gdbstub comes with a community-curated collection of built-in architecture definitions for most popular platforms!
      • Organizes GDB's countless optional protocol extensions into a coherent, understandable, and type-safe hierarchy of traits.
      • Automatically handles client/server protocol feature negotiation, without needing to micro-manage the specific qSupported packet response.
    • gdbstub makes extensive use of Rust's powerful type system + generics to enforce protocol invariants at compile time, minimizing the number of tricky protocol details end users have to worry about.
    • Using a novel technique called Inlineable Dyn Extension Traits (IDETs), gdbstub enables fine-grained control over active protocol extensions without relying on clunky cargo features or the use of unsafe code!
  • Easy to Integrate
    • gdbstub's API is designed to be a "drop in" solution when you want to add debugging support into a project, and shouldn't require any large refactoring effort to integrate into an existing project.
  • #![no_std] Ready & Size Optimized
    • gdbstub is a no_std-first library, where all protocol features are required to be no_std compatible.
    • gdbstub does not require any dynamic memory allocation, and can be configured to use fixed-size, pre-allocated buffers. This enables gdbstub to be used on even the most resource constrained, no-alloc platforms.
    • gdbstub is entirely panic-free in its most minimal configurations*, resulting in substantially smaller and more robust code.
    • gdbstub is transport-layer agnostic, and uses a basic Connection interface to communicate with the GDB client. As long as the target has some method of performing in-order, serial, byte-wise I/O (e.g: putchar/getchar over UART), it's possible to run gdbstub on it! (See the sketch just after this list.)
    • "You don't pay for what you don't use": All code related to parsing/handling protocol extensions is guaranteed to be dead-code-eliminated from an optimized binary if left unimplemented. See the Zero-overhead Protocol Extensions section below for more details.
    • gdbstub's minimal configuration has an incredibly low binary size + RAM overhead, enabling it to be used on even the most resource-constrained microcontrollers.
      • When compiled in release mode, using all the tricks outlined in min-sized-rust, a baseline gdbstub implementation can weigh in at less than 10kb of .text + .rodata! *
      • *Exact numbers vary by target platform, compiler version, and gdbstub revision. In mixed-language projects, cross-language LTO may be required (#101). Data was collected using the included example_no_std project compiled on x86_64.
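
As a sketch of the transport point above: assuming a hypothetical blocking Uart driver, a bare-bones Connection implementation might look something like the following. Trait paths and required methods vary a bit between gdbstub releases (and the Uart/UartError types are placeholders), so treat this as an outline rather than copy-paste code.

use gdbstub::conn::Connection;

// Hypothetical placeholder for a memory-mapped UART driver.
struct Uart;
#[derive(Debug)]
struct UartError;

impl Uart {
    fn putchar(&mut self, _byte: u8) -> Result<(), UartError> {
        // a real driver would write to the UART's TX register here
        Ok(())
    }
}

struct UartConnection {
    uart: Uart,
}

impl Connection for UartConnection {
    type Error = UartError;

    fn write(&mut self, byte: u8) -> Result<(), Self::Error> {
        // in-order, blocking, byte-wise output is all gdbstub needs
        self.uart.putchar(byte)
    }

    fn flush(&mut self) -> Result<(), Self::Error> {
        // nothing to flush for an unbuffered UART
        Ok(())
    }
}

In recent releases the read side lives on a companion ConnectionExt trait, which wraps the corresponding getchar routine in the same way.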

Can I Use gdbstub in Production?

Yes, as long as you don't mind some API churn until 1.0.0 is released.

Due to gdbstub's heavy use of Rust's type system in enforcing GDB protocol invariants at compile time, it's often been the case that implementing new GDB protocol features has required making some breaking API changes. While these changes are typically quite minor, they are nonetheless semver-breaking, and may require a code-change when moving between versions. Any particularly involved changes will typically be documented in a dedicated transition guide document.

That being said, gdbstub has already been integrated into many real-world projects since its initial 0.1 release, and empirical evidence suggests that it seems to be doing its job quite well! Thus far, most reported issues have been caused by improperly implemented Target and/or Arch implementations, while the core gdbstub library itself has proven to be reasonably bug-free.

See the Future Plans + Roadmap to 1.0.0 for more information on what features gdbstub still needs to implement before committing to API stability with version 1.0.0.

Debugging Features

The GDB Remote Serial Protocol is surprisingly complex, supporting advanced features such as remote file I/O, spawning new processes, "rewinding" program execution, and much, much more. Thankfully, most of these features are completely optional, and getting a basic debugging session up-and-running only requires implementing a few basic methods:

  • Base GDB Protocol
    • Read/Write memory
    • Read/Write registers
    • Enumerating threads

Yep, that's right! That's all it takes to get GDB connected!
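
To give a sense of the code involved, here is a rough sketch of a single-threaded Target implementation. The Emu type and its fields are hypothetical stand-ins for your own project's state, and the trait/method names follow the 0.6-era API, so double-check the signatures against the docs and the in-tree examples for the release you're actually using.

use gdbstub::target::ext::base::singlethread::SingleThreadBase;
use gdbstub::target::ext::base::BaseOps;
use gdbstub::target::{Target, TargetResult};
use gdbstub_arch::arm::reg::ArmCoreRegs;

// Hypothetical emulator state.
struct Emu {
    regs: ArmCoreRegs,
    mem: Vec<u8>,
}

impl Target for Emu {
    type Arch = gdbstub_arch::arm::Armv4t;
    type Error = &'static str;

    fn base_ops(&mut self) -> BaseOps<'_, Self::Arch, Self::Error> {
        BaseOps::SingleThread(self)
    }
}

impl SingleThreadBase for Emu {
    fn read_registers(&mut self, regs: &mut ArmCoreRegs) -> TargetResult<(), Self> {
        *regs = self.regs.clone();
        Ok(())
    }

    fn write_registers(&mut self, regs: &ArmCoreRegs) -> TargetResult<(), Self> {
        self.regs = regs.clone();
        Ok(())
    }

    fn read_addrs(&mut self, start_addr: u32, data: &mut [u8]) -> TargetResult<(), Self> {
        // bounds checking omitted for brevity
        data.copy_from_slice(&self.mem[start_addr as usize..][..data.len()]);
        Ok(())
    }

    fn write_addrs(&mut self, start_addr: u32, data: &[u8]) -> TargetResult<(), Self> {
        self.mem[start_addr as usize..][..data.len()].copy_from_slice(data);
        Ok(())
    }
}

Everything else (resuming, breakpoints, etc.) is layered on top via optional extension traits, as described in the sections below.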

Of course, most use-cases will want to support additional debugging features as well. At the moment, gdbstub implements the following GDB protocol extensions:

  • Automatic target architecture + feature configuration
  • Resume
    • Continue
    • Single Step
    • Range Step
    • Reverse Step/Continue
  • Breakpoints
    • Software Breakpoints
    • Hardware Breakpoints
    • Read/Write/Access Watchpoints (i.e: value breakpoints)
  • Extended Mode
    • Launch new processes
    • Attach to an existing process
    • Kill an existing process
    • Pass env vars + args to spawned processes
    • Change working directory
    • Enable/disable ASLR
  • Read Memory Map (info mem)
  • Read Section/Segment relocation offsets
  • Handle custom monitor Commands
    • Extend the GDB protocol with custom debug commands using GDB's monitor command!
  • Host I/O
    • Access the remote target's filesystem to read/write files
    • Can be used to automatically read the remote executable on attach (using ExecFile)
  • Read auxiliary vector (info auxv)
  • Extra thread info (info threads)
  • Extra library information (info sharedlibraries)

Note: GDB features are implemented on an as-needed basis by gdbstub's contributors. If there's a missing GDB feature that you'd like gdbstub to implement, please file an issue and/or open a PR!

For a full list of GDB remote features, check out the GDB Remote Configuration Docs for a table of GDB commands + their corresponding Remote Serial Protocol packets.

Zero-overhead Protocol Extensions

Using a technique called Inlineable Dyn Extension Traits (IDETs), gdbstub is able to leverage the Rust compiler's powerful optimization passes to ensure any unused features are dead-code-eliminated in release builds without having to rely on compile-time feature flags!

For example, if your target doesn't implement a custom GDB monitor command handler, the resulting binary won't include any code related to parsing / handling the underlying qRcmd packet!

If you're interested in the low-level technical details of how IDETs work, I've included a brief writeup in the documentation here.
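
As a concrete illustration, opting in to an IDET is just a matter of implementing one extra method on Target plus the corresponding extension trait. The snippet below sketches what enabling the monitor command extension might look like for the hypothetical Emu target from the earlier sketch (names follow the 0.6-era API; consult the docs for the release you're using):

use gdbstub::outputln;
use gdbstub::target::ext::monitor_cmd::{ConsoleOutput, MonitorCmd, MonitorCmdOps};

// Inside the existing `impl Target for Emu { ... }` block, return `Some(self)`
// to opt in to the extension. Leaving the method out (or returning None, the
// default) keeps all qRcmd-related code dead-code-eliminated:
//
//     fn support_monitor_cmd(&mut self) -> Option<MonitorCmdOps<'_, Self>> {
//         Some(self)
//     }

impl MonitorCmd for Emu {
    fn handle_monitor_cmd(
        &mut self,
        cmd: &[u8],
        mut out: ConsoleOutput<'_>,
    ) -> Result<(), Self::Error> {
        // `monitor ping` on the GDB client prints "pong!" back to its console
        if cmd == b"ping" {
            outputln!(out, "pong!");
        }
        Ok(())
    }
}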

Feature flags

By default, the std and alloc features are enabled.

When using gdbstub in #![no_std] contexts, make sure to set default-features = false.

  • alloc
    • Implement Connection for Box<dyn Connection>.
    • Log outgoing packets via log::trace! (uses a heap-allocated output buffer).
    • Provide built-in implementations for certain protocol features:
      • Use a heap-allocated packet buffer in GdbStub (if none is provided via GdbStubBuilder::with_packet_buffer).
      • (Monitor Command) Use a heap-allocated output buffer in ConsoleOutput.
  • std (implies alloc)
    • Implement Connection for TcpStream and UnixStream.
    • Implement std::error::Error for gdbstub::Error.
    • Add a TargetError::Io variant to simplify std::io::Error handling from Target methods.
  • paranoid_unsafe
    • Enables #![forbid(unsafe_code)] on the entire gdbstub crate, swapping out all instances of unsafe code for equivalent (albeit less-performant) alternatives. See the unsafe in gdbstub section below for more details.

Examples

Real-World Examples

While some of these projects may use older versions of gdbstub, they can nonetheless serve as useful examples of what a typical gdbstub integration might look like.

If you end up using gdbstub in your project, consider opening a PR and adding it to this list!

In-tree "Toy" Examples

These examples are built as part of the CI, and are guaranteed to be kept up to date with the latest version of gdbstub's API.

  • armv4t - ./examples/armv4t/
    • An incredibly simple ARMv4T-based system emulator with gdbstub support.
    • Implements (almost) all available target::ext features. This makes it a great resource when first implementing a new protocol extension!
  • armv4t_multicore - ./examples/armv4t_multicore/
    • A dual-core variation of the armv4t example.
    • Implements the core of gdbstub's multithread extensions API, but not much else.
  • example_no_std - ./example_no_std
    • An extremely minimal example which shows off how gdbstub can be used in a #![no_std] project.
    • Unlike the armv4t/armv4t_multicore examples, this project does not include a working emulator, and simply stubs all gdbstub functions.
    • Doubles as a test-bed for tracking gdbstub's approximate binary footprint (via the check_size.sh script), as well as validating certain dead-code-elimination optimizations.

unsafe in gdbstub

gdbstub limits its use of unsafe to a bare minimum, with all uses of unsafe required to have a corresponding // SAFETY comment as justification.

For those paranoid about trusting third-party unsafe code, gdbstub comes with an opt-in paranoid_unsafe feature, which enables #![forbid(unsafe_code)] on the entire gdbstub crate, swapping out all instances of unsafe code with equivalent (albeit less-performant) alternatives.

The following list exhaustively documents all uses of unsafe in gdbstub:

  • With default features

    • Don't emit provably unreachable panics
      • src/protocol/packet.rs: Methods in PacketBuf that index into the buffer using a stored sub-Range<usize>
      • src/protocol/common/hex.rs: decode_hex_buf
  • When the std feature is enabled:

    • src/connection/impls/unixstream.rs: An implementation of UnixStream::peek which uses libc::recv. Will be removed once rust-lang/rust#76923 stabilizes this feature in the stdlib.

Writing panic-free code

Ideally, the Rust compiler would have some way to opt-in to a strict "no-panic" mode. Unfortunately, at the time of writing (2022/04/24), no such mode exists. As such, the only way to avoid the Rust compiler + stdlib's implicit panics is by being very careful when writing code, and manually checking that those panicking paths get optimized out!

And when I say "manually checking", I mean checking generated asm output.

Why even go through this effort?

  • Panic infrastructure can be expensive, and when you're optimizing for embedded, no_std use-cases, panic infrastructure brings in hundreds of additional bytes into the final binary.
  • gdbstub can be used to implement low-level debuggers, and if the debugger itself panics, well... it's not like you can debug it all that easily!

As such, gdbstub promises to introduce zero additional panics into an existing project, subject to the following conditions:

  1. The binary is compiled in release mode
    • *subject to the specific rustc version being used (codegen and optimization vary between versions)
    • *different hardware architectures may be subject to different compiler optimizations
      • i.e: at this time, only x86 is actively tested to be panic-free
  2. gdbstub's paranoid_unsafe cargo feature is disabled
    • LLVM is unable to omit certain panic checks without requiring a bit of unsafe code
    • See the unsafe in gdbstub section for more details
  3. The Arch implementation being used doesn't include panicking code
    • Note: The arch implementations under gdbstub_arch are not guaranteed to be panic free!
    • If you do spot a panicking arch in gdbstub_arch, consider opening a PR to fix it

If you're using gdbstub in a no-panic project and have determined that gdbstub is at fault for introducing a panicking code path, please file an issue!

Future Plans + Roadmap to 1.0.0

While the vast majority of GDB protocol features (e.g: remote filesystem support, tracepoint packets, most query packets, etc...) should not require breaking API changes, the following features will most likely require at least some breaking API changes, and should therefore be implemented prior to 1.0.0.

Note that this is not an exhaustive list, and is subject to change.

  • Allow fine-grained control over target features via the Arch trait (#12)
  • Implement GDB's various high-level operating modes:
    • Single/Multi Thread debugging
    • Multiprocess Debugging (#124)
      • Requires adding a new target::ext::base::multiprocess API.
      • Note: gdbstub already implements multiprocess extensions "under-the-hood", and just hard-codes a fake PID, so this is mostly a matter of "putting in the work".
    • Extended Mode (target extended-remote)
    • Non-Stop Mode
  • Have a working example of gdbstub running in a "bare-metal" #![no_std] environment.

Additionally, while not strict blockers to 1.0.0, it would be good to explore these features as well:

  • Should gdbstub commit to a MSRV?
  • Remove lingering instances of RawRegId from gdbstub_arch (#29)
  • Exposing async/await interfaces (particularly wrt. handling GDB client interrupts) (#36)
  • How/if to support LLDB extensions (#99)
  • Supporting multi-arch debugging via a single target
    • e.g: debugging x86 and ARM processes on macOS
  • Proper handling of "nack" packets (for spotty connections) (#137)

License

gdbstub is free and open source! All code in this repository is dual-licensed under either:

  • The MIT License
  • The Apache License, Version 2.0

at your option. This means you can select the license you prefer! This dual-licensing approach is the de-facto standard in the Rust ecosystem and there are very good reasons to include both.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

gdbstub's People

Contributors

709924470, alexcrichton, atouchet, bet4it, daniel5151, drchat, fzyz999, geigerzaehler, gz, jamcleod, jameshageman, jawilk, keiichiw, mchesser, mkroening, mrk-its, pheki, ptosi, qwandor, sapir, sean-purcell, starfleetcadet75, thefaxman, thomashk0, tiwalun, xobs

gdbstub's Issues

Fix memory watchpoints

While gdbstub ostensibly reports that it supports memory watchpoints, I've never actually gotten them to work correctly when testing with GDB.

LLDB Compatibility

This is a meta-issue to discuss + track gdbstub's LLDB compatibility story.

Overview

LLDB uses the GDB Remote Serial Protocol for remote debugging, and for the most part, LLDB works "out of the box" with GDB server implementations. Unfortunately, there are some differences between how LLDB and GDB implement the Remote Serial Protocol, resulting in potential issues when mixing and matching servers/clients.

In some cases, LLDB has added new protocol extensions that are disjoint from (and therefore backwards compatible with) the base GDB protocol. The de-facto documentation for these LLDB specific extensions can be found here:

Unfortunately, there are some cases where LLDB happened to implement certain packets differently from GDB, resulting in breaking incompatibilities. The most pertinent example of this is wrt how LLDB handles vFile packets. This blog post does a good job providing an overview of the various LLDB-GDB RSP compatibility hazards: https://www.moritz.systems/blog/improving-gdb-protocol-compatibility-in-lldb/

Moreover, as mentioned in the blog post above, LLDB has actually made breaking changes to its RSP between versions! Unlike GDB, using a newer LLDB client with an older LLDB server is not supported!

Supporting LLDB clients in gdbstub

With these challenges in mind, what would be the best way to support LLDB clients in gdbstub?

For the most part, it should be pretty straightforward to support arbitrary LLDB extensions in gdbstub. After all, adding support for any particular LLDB packet should be no different from adding support for a new GDB packet.

Indeed, the bulk of the complexity in supporting LLDB clients is figuring out how to handle the inconsistencies between the LLDB and GDB RSPs.

  • Should gdbstub include some kind of configuration to target a specific LLDB protocol version?
  • Would it be feasible to somehow support all LLDB clients via some kind of version-detection mechanism?
  • Should LLDB extensions "bleed into" the existing APIs, or should LLDB extensions be clearly marked as "separate" from the main GDB protocol?
    • e.g: should the existing vFile:open API be updated to include LLDB-specific vFile:open flags?
  • Older versions of LLDB used a custom qRegisterInfo packet to fetch a target's register description, though newer versions of LLDB support parsing target.xml data. Should gdbstub attempt to "unify" qRegisterInfo with target.xml via the Arch trait, thereby enabling transparent support across all GDB and LLDB clients?

These are open questions, and ones that I don't have answers to just yet. I suspect the only way to answer these questions will be by someone putting in the legwork to try some ideas out, and seeing how things look.

[RFC] s/managed/bytes?

The bytes crate is widely used in networking code, and it could blend quite well with Packet and Command in gdbstub. bytes is actively maintained (compared to the managed crate, which hasn't seen many commits in about two years), supports no_std (>=0.5.5), and offers zero-copy operations; with bytes::BufMut, you can also push values directly, so there'd be no need for util/managed_vec.rs. Most importantly, since it is isomorphic to Arc<Vec<u8>>, using bytes would allow users to get rid of the annoying lifetime annotations (from slices) found in many places in src/protocol. What do you think?

Support Remote File I/O

See https://sourceware.org/gdb/onlinedocs/gdb/File_002dI_002fO-Overview.html#File_002dI_002fO-Overview

The File I/O remote protocol extension (short: File-I/O) allows the target to use the host’s file system and console I/O to perform various system calls. System calls on the target system are translated into a remote protocol packet to the host system, which then performs the needed actions and returns a response packet to the target system. This simulates file system operations even on targets that lack file systems.

Additionally, it would also be nice to implement Host I/O Packets at the same time (See https://sourceware.org/gdb/onlinedocs/gdb/Host-I_002fO-Packets.html#Host-I_002fO-Packets)

This has been implemented in #66. See update below...

The Host I/O packets allow GDB to perform I/O operations on the far side of a remote link. For example, Host I/O is used to upload and download files to a remote target with its own filesystem. Host I/O uses the same constant values and data structure layout as the target-initiated File-I/O protocol.


Skimming through the spec, it looks like implementing these extensions can be done in a totally backwards-compatible manner, simply by adding some new extension traits to Target, and shouldn't require any breaking API changes.

There's also some prior art from luser/rust-gdb-remote-protocol#60, which could be a good source of inspiration.


Update 8/20/2021: #66 has been merged, adding support for Host I/O packets!

In hindsight, this issue should probably have been split into two separate ones, as while there is some overlap between File I/O and Host I/O, the two are actually very different features.

Dynamic `Arch` selection

Extracted from a back-and-forth on #31 (comment)

Will likely tie into #12


With the current Arch architecture, it's actually pretty easy to "punch through" the type-safe abstractions by implementing an Arch that looks something like this:

// not actually tested, but something like this ought to work

pub enum PassthroughArch {}

impl Arch for PassthroughArch {
    type Usize = u64; // something reasonably big
    type Registers = PassthroughRegs;
    type RegId = PassthroughRegId;
    ...
}

pub struct PassthroughRegs {
    len: usize,
    buffer: [u8; MAX_SIZE], // a vec could also work
}

// push logic up a level into `Target`
impl Registers for PassthroughRegs {
    fn gdb_serialize(&self, mut write_byte: impl FnMut(Option<u8>)) {
        for byte in &self.buffer[..self.len] { write_byte(Some(*byte)) }
    }

    fn gdb_deserialize(&mut self, bytes: &[u8]) -> Result<(), ()> {
        self.buffer[..bytes.len()].copy_from_slice(bytes);
        self.len = bytes.len();
        Ok(())
    }
}

pub struct PassthroughRegId(usize);
impl RegId for PassthroughRegId { ... } // elided for brevity, should be straightforward

impl Target for YourEmu { // `YourEmu` being whatever type implements your target
    type Arch = PassthroughArch;

    // .... later on, in the appropriate IDET ... //

    fn read_registers(
        &mut self, 
        regs: &mut PassthroughRegs<YourTargetSpecificMetadata>
    ) -> TargetResult<(), Self> {
        // write data directly into `regs.buffer` + `regs.len`
        if self.cpu.in_foo_mode() { /* one way */  } else { /* other way */ }
        // can re-use existing `impl Registers` types from `gdbstub_arch`, 
        // calling `gdb_serialize` and `gdb_deserialize` directly...
        if using_arm {
            let mut core_regs = gdbstub_arch::arm::reg::ArmCoreRegs::default();
            core_regs.pc = self.pc;
            // ...
            let mut i = 0;
            core_regs.gdb_serialize(|b| { regs.buffer[i] = b.unwrap_or(0); i += 1; });
        }
    }

    // similarly for `write_registers`
}

This can then be combined with the recently added TargetDescriptionXmlOverride IDET to effectively create a runtime configurable Arch + Target (see #43)

While this will work in gdbstub right now, this approach does have a few downsides:

  • Bypasses one of gdbstub's most useful abstractions, forcing end users to care about the unfortunate details of how GDB de/serializes register data (e.g: sending data in the order defined by target.xml, input/output buffer management, etc...)
  • It is not zero-copy, as the PassthroughRegs buffer will be immediately copied into the "true" output buffer on every invocation. Mind you, neither is using a type Registers intermediary, but hey, linear copies are super fast, so it's probably fine.
  • The PassthroughRegs type is static, and as such, will have to include a buffer that's large enough to handle the "worst case" largest inbound/outbound payload. This can be mitigated by using a variable-length type as the backing storage (such as a Vec) when running in an std environment.
  • In the current version of gdbstub (0.4.5 at the time of writing), each call to {read,write}_registers results in a fresh instance of type Registers being allocated on the stack. The original intent behind this decision was to avoid having a permanent "static" Registers struct take up memory inside struct GdbStub, but honestly, it's not that important. Thankfully, this is an internal issue with a fairly easy fix, and I'm considering switching where the type is stored (in struct GdbStub vs. the stack) in the next minor/patch version of gdbstub.

There are a couple things that can be done to mitigate this issue:

  1. Provide a single "blessed" implementation of PassthroughArch, so that folks don't have to write all this boilerplate themselves
    • This is easy, and can be done as a first-time-contributor's PR, as it doesn't require digging into gdbstub's guts.
  2. Set aside some time to reconsider the underlying Arch API, and modify it to enable more elegant runtime configuration, while also retaining as many of the ergonomic features / abstraction layers of the current static-oriented API as possible.
    • This is significantly harder, and is something that I would probably want to tackle myself, as it may have sweeping changes across the codebase

Use GDB-defined Signal Numbers instead of `u8`

It has recently come to my attention that GDB actually defines its own set of signal constants for use in the RSP:

https://github.com/bminor/binutils-gdb/blob/master/include/gdb/signals.def

I just assumed it'd be using POSIX signal numbers, but nope. Admittedly, this makes a lot more sense, as the GDB RSP is platform agnostic, so trying to shoehorn POSIX onto all platforms probably wouldn't be the best idea.

Switching from using bare u8s to structured Signal enums will require a breaking change.
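
For illustration, the change being proposed is roughly of the following shape. This is a hypothetical sketch rather than gdbstub's actual API; the constants shown are just a handful of well-known values from GDB's signals.def.

/// Hypothetical sketch of a structured signal type (not gdbstub's actual API).
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Signal {
    SIGINT = 2,
    SIGILL = 4,
    SIGTRAP = 5,
    SIGABRT = 6,
    SIGSEGV = 11,
    // ...plus the rest of GDB's protocol-level signal constants
}

impl Signal {
    /// Convert a raw protocol byte into a structured signal.
    pub fn from_protocol_u8(raw: u8) -> Option<Self> {
        Some(match raw {
            2 => Signal::SIGINT,
            4 => Signal::SIGILL,
            5 => Signal::SIGTRAP,
            6 => Signal::SIGABRT,
            11 => Signal::SIGSEGV,
            _ => return None,
        })
    }
}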

Add documentation about adjusting PC to Target::resume

Maybe I'm misunderstanding, but when a "continue" command is issued from GDB, Target::resume will be called with an iterator containing ResumeAction::Continue over and over. For example, if I set a breakpoint and then continue (while also logging when a breakpoint is hit or a "continue" action is issued), I get the following output:

GDB input/output:

(gdb) target remote localhost:4444
Remote debugging using localhost:4444
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0xffffffff81a01fd0 in ?? ()
(gdb) b *0xffffffff81a01fd3
Breakpoint 1 at 0xffffffff81a01fd3
(gdb) c

(note only a single continue)

Debug server output:

starting pc: ffffffff81a01fd0
Add breakpoint 0xffffffff81a01fd3
Continue
break @ pc: ffffffff81a01fd3
Continue
...
break @ pc: ffffffff81a01fd3
Continue
break @ pc: ffffffff81a01fd3
Continue
...
break @ pc: ffffffff81a01fd3
Continue

So essentially, every time a breakpoint is hit, the sequence is:
Target::resume returns StopReason::SwBreak -> Target::resume is called again with ResumeAction::Continue (despite the user not saying to continue yet) -> execution continues until next breakpoint

This results in every breakpoint being instantly continue-d.

$ gdb --version
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1

Willing to work on and test a fix (if needed), just need some guidance.

Relicense gdbstub under dual MIT/Apache-2.0

Inspired by bevyengine/bevy#2373 and bevyengine/bevy#2509

When I released gdbstub 0.1, I didn't think too much about which license to use, and just picked one that seemed reasonable - namely, the MIT license. Over time, as I've gotten more familiar with crate maintainership and open source licensing, I've come to realize that it'd probably be a good idea to switch over to the dual MIT/Apache-2.0 license that most of the Rust ecosystem uses.

For more context around the Why behind this relicense, check out bevyengine/bevy#2373

This isn't urgent by any means, but I might as well get the ball rolling in case there are some contributors who are a bit slow to respond.

If you are mentioned in this issue, I need your help to make this happen

To agree to this relicense, please read the details in this issue, then leave a comment with the following message:

I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to choose either at their option.

If you disagree, please respond with your reasoning. Anyone who doesn't agree to the relicense will have any gdbstub contributions that qualify as "copyrightable" removed or re-implemented.

What will this look like?

After getting explicit approval from each contributor of copyrightable work (as not all contributions qualify for copyright, due to not being a "creative work", e.g. a typo fix), we will do the following:

  • Change the LICENSE file to describe the dual license. Move the MIT license to docs/LICENSE-MIT. Add the Apache-2.0 license to docs/LICENSE-APACHE
  • Update the gdbstub and gdbstub_arch Cargo.toml files to use the new "MIT OR Apache-2.0" license value
  • Add a License section to the main readme

Note: I do intend to keep the Copyright (c) <year> Daniel Prilik line in the licenses. Please let me know if you're strongly opposed to this.

Contributor checkoff

Contributors with "obsolete" changes

(no need for approval)

  • iburinoc

Contributors with "trivial" changes that are ok to keep

(no need for approval)

  • JamesHageman
  • pheki

Support reverse-execution `b` commands

Add support for the GDB reverse set of commands.
This corresponds to ReverseContinue+ and ReverseStep+ in the qSupported features.

This will likely warrant a new set of ops, such as a ReverseExecutionOps.

Split packet trace logging into separate feature

At the moment, incoming packets get logged via this log::trace! statement, and outgoing packets are logged via this log::trace! statement.

The latter log statement requires allocating an output buffer in the ResponseWriter to stash the outgoing packet, and is currently gated behind a cfg(feature = "std") flag.

Instead, this bit of tracing functionality ought to be split off into a separate "trace-pkt" feature, with a dependency on alloc (as opposed to std).

This oversight was uncovered while looking into #77

[Feedback] Overall API design

Did you end up using gdbstub in your project?
Or maybe you considered using it, but found some major flaw in the API?

In either case, please leave a comment letting me know what you think of gdbstub!

Also, if it's not "classified info", please share what kind of project you're integrating gdbstub into (e.g: emulation, virtualization, embedded, etc...).

  • If everything went smoothly, awesome! Positive feedback like that makes me more confident that the library is getting closer to a 1.0.0 release.
  • Hit a pain point? Let me know how you think the API could be improved!

Please don't comment about missing protocol features or bugs. Those should be filed as separate issues.


This tracking issue will remain open until gdbstub hits version 1.0.0.

Split `gdbstub::arch` into a separate crate

As gdbstub gets used by more and more projects, folks will undoubtedly continue to upstream more arch implementations, and uncover subtle bugs in existing implementations (e.g: #44)

By having arch implementations live in-tree alongside the rest of the gdbstub code, any arch-level breaking-changes force the entire gdbstub crate to release a new breaking version. This isn't ideal, as this will artificially bump up gdbstub's version number, even though the actual breaking changes themselves were incredibly minor.

Instead, it would be a good idea to break gdbstub::arch out into a separate crate gdbstub_arch that has gdbstub as a dependency, but not vice-versa. If this were the case, then gdbstub_arch's version number could be bumped with impunity as new arches / fixes come in.


The act of splitting the module out into a separate crate would be a breaking change, but it could be lumped in with other breaking changes to minimize the impact.

The gdbstub_arch crate would live in-tree, as having it as a separate git repo would be a bit overkill.

Support fine-grained control over Target Features

Aside from specifying the core architecture of the target, the Target Description XML is also used to specify various features of the architecture. As the docs say: "Features are currently used to describe available CPU registers and the types of their contents"

e.g: for x86, the core registers are defined under the org.gnu.gdb.i386.core feature, while the optional SSE registers are defined under the org.gnu.gdb.i386.sse feature.

Currently, the Arch trait isn't very well suited to describing features, requiring a separate implementation for each set of features. i.e: if an architecture has 3 different features, A, B, and C, each with their own registers, there would have to be a separate register struct for every possible combination of those features!

I'd like to preserve the current "described at compile time" approach to the Arch trait, and I think with a little bit of experimentation, it shouldn't be too hard to find some sort of ergonomic trait-based solution to the problem.


Some general ideas:

  • It might make sense to have the target_features_xml method accept a FnMut(&'static str) callback (instead of returning an Option<&'static str>), as that would allow building up an XML file in "chunks" based on which features are available. A rough sketch of this idea follows below.
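
A rough sketch of that callback-based idea, using a hypothetical trait and a made-up capability flag rather than the current Arch API:

/// Hypothetical sketch of the "build the XML in chunks" idea floated above.
trait TargetFeatures {
    /// Emit the target description one feature at a time.
    fn target_features_xml(&self, emit: &mut dyn FnMut(&'static str));
}

struct MyX86Target {
    has_sse: bool, // hypothetical capability flag
}

impl TargetFeatures for MyX86Target {
    fn target_features_xml(&self, emit: &mut dyn FnMut(&'static str)) {
        // the core registers are always present...
        emit(r#"<feature name="org.gnu.gdb.i386.core">...</feature>"#);
        // ...while optional features are appended only when the target has them
        if self.has_sse {
            emit(r#"<feature name="org.gnu.gdb.i386.sse">...</feature>"#);
        }
    }
}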

gdbstub_arch package doesn't contain LICENSE file

The gdbstub_arch-0.1.0 package distributed on crates.io doesn't contain a LICENSE file, even though its Cargo.toml says its license is MIT.

$ curl -L 'https://crates.io/api/v1/crates/gdbstub_arch/0.1.0/download' > /tmp/gdbstub_arch-0.1.0.tar.gz
$ tar xvf /tmp/gdbstub_arch-0.1.0.tar.gz
$ cat gdbstub_arch-0.1.0/LICENSE
cat: gdbstub_arch-0.1.0/LICENSE: No such file or directory

@daniel5151 Could you release the updated version (0.1.1?) with LICENSE included?

Support for m68k architecture (e.g: Sega Genesis)

I'm incredibly disappointed.

As a long time user of gdbstub I was shocked to see there is absolutely no support for the Sega Genesis. Just the other day I attempted to debug my play session of Disney's Aladdin for the Genesis and thought I was in an Orwellian nightmare when I found that not only could gdbstub not set breakpoints within the Aladdin game, but it failed to fulfill its API for any game on the Sega Genesis I tried it on.

Do you think you're so high and mighty as to exclude the Sega Genesis from the catalogue of supported hardware? That console could eat you for breakfast, lunch and dinner and still have room for dessert. That console could grind this paltry repo into a fine powder, rim a fine glass of 1998 Veuve Clicquot La Grande Dame with it and enjoy it to the crisp, pulchritudinous sounds of Street Fighter II: Champion Edition.

You can do better. As the maintainer of this library I am positively floored that you could be so inept. People like myself are counting on you to at least approximate competence. Yes, the library is open-source and yes, the library is free. Let me tell you something pal, have you heard of exposure? I've starred this on GitHub. Starred. Well sir after this gross exploitation of the sacred trust between open-source user and open-source maintainer I can assure you I will be revoking the star I gave you with gusto.

Set `TCP_NODELAY` in examples and documentation

Since the protocol was originally designed to run over serial connections, it involves sending very small packets (i.e. the single-byte step packet) and immediately waiting for the response. This behaves badly with the default TCP buffering settings (i.e. Nagle's algorithm), causing it (at least on some platforms) to have an unusable amount of latency.

Most gdbserver implementations I've seen set the TCP_NODELAY option to avoid this buffering. You can see this being done in gdbserver here:

https://github.com/bminor/binutils-gdb/blob/aea44f64c827932a2c67aeb2ee35a332696df8e8/gdbserver/remote-utils.cc#L155-L159

It might be useful for the documentation/examples to reference this, since I remember it being somewhat difficult to diagnose when I initially came across it (and I temporarily forgot about it when updating my code to use this crate, since I was following the examples).
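
For reference, setting the option from Rust is a one-liner on std::net::TcpStream; the address and surrounding setup below are just placeholders:

use std::net::{TcpListener, TcpStream};

fn accept_gdb_connection() -> std::io::Result<TcpStream> {
    let listener = TcpListener::bind("127.0.0.1:4444")?; // placeholder address
    let (stream, _addr) = listener.accept()?;
    // Disable Nagle's algorithm: the RSP exchanges many tiny request/response
    // packets, and batching them up adds very noticeable latency.
    stream.set_nodelay(true)?;
    Ok(stream)
}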

`qThreadExtraInfo` support

While porting probe-rs to use gdbstub I found two packets that would be helpful to respond to:

qC - this queries the current thread ID. Without a response, GDB works correctly, but the client emits a message that may be confusing to end users. Edit: qC is no longer needed after debugging an issue with ? in the thread.

qThreadExtraInfo - this allows the stub to provide a string of extra data about a thread. It's a free form string that is displayed on the client and can be useful for adding extra detailed status or thread names, for example.

Both of these would produce a better experience for the probe-rs logic and I'd be glad to take a shot at producing a PR that adds the logic. Before I do I wanted some feedback on how to approach this.

Both packets feel like appropriate functions on the MultiThreadBase trait. To avoid a break I could add them with default implementations - qC could return the first value from list_active_threads and qThreadExtraInfo could return a 0 byte string by default. This avoids a break, but it puts a bit more burden on the implementor to realize these are options for them to extend.

Any thoughts / feedback on this?
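
To make the default-method idea above concrete, it might look something like the following. This is a hypothetical sketch written outside of gdbstub's real traits, purely to show the backwards-compatible shape being proposed:

/// Hypothetical stand-in for gdbstub's thread ID type.
#[derive(Debug, Clone, Copy)]
struct Tid(usize);

/// Hypothetical sketch of the proposed extension to the multi-threaded base API.
trait ThreadExtraInfo {
    /// Write a human-readable description of `tid` into `buf`, returning the
    /// number of bytes written.
    ///
    /// Defaults to an empty string, so existing implementors keep compiling.
    fn thread_extra_info(&self, _tid: Tid, _buf: &mut [u8]) -> usize {
        0
    }
}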

Internal refactor to enable using 100% of PacketBuf as response buffer

As discussed in #69, we'll gradually want to move various callback-based APIs over to "here is a &mut [u8] buffer, please fill it with data" APIs.

Instead of allocating a whole new outgoing packet buffer for this, it'd be nice to reuse the existing PacketBuf. This could work, since packets that request data are almost always able to be parsed into fixed-size, lifetime-free structs, which would leave the packet buf available to be used as scratch space.

Unfortunately, the current implementation of the packet parser includes a "late decode" step, whereby target-dependent fields (such as memory addresses / breakpoint kinds / etc...) are parsed into &[u8] bufs in the packet parsing code, and are only converted into their concrete types later on, in the handler code (where the type of Target is known). This is an important property to maintain, as eventually, we'd want to support debugging multiple Target types at the same time (e.g: on macOS, a single gdbserver can debug both x86 and ARM code), and the only way to do this would be by having the packet parsing code be Target agnostic.

Instead, there should be some way of obtaining a reference to the entire, raw, underlying &mut [u8] PacketBuf after the late decode step has been completed, but this is harder than it seems. Getting the lifetimes to line up here will probably be tricky, and I suspect getting this working will require some real code-contortion, and possibly even a sprinkle of unsafe.


In the meantime, we'll be going with the approach used by the m packet, whereby the packet parsing code will "stash" a &mut [u8] pointing to the trailing unused bit of the buffer as part of the parsed struct.

This works, but is a bit wasteful (as not 100% of the packet buffer is being utilized), and also a bit annoying to implement on a per-packet basis. Nonetheless, the GDB RSP allows targets to return less than the requested amount of data without there necessarily being an error, specifically because certain implementations might be using different-sized buffers for incoming / outgoing data.

Given that "workaround" works pretty well, and that losing ~30 bytes of a ~4096 byte PacketBuf isn't particularly noticeable, getting to 100% efficiency isn't a super high priority, but it's still something to think about, and potentially implement at some point.

[arch][x86] Break GPRs & SRs into individual fields/variants

At the moment, the X86_64CoreRegs::regs field and X86[_64]CoreRegs::segments fields rely on an implicit ordering documented in their respective comments. It would be much better if explicit fields were used instead.

As pointed out in #34 (comment), there is no "one true" ordering for these registers, which could result in some nasty bugs where end-users de/serialize registers in the wrong order.

The corresponding X86[_64]CoreRegId enums would have to be changed as well.


This would be a breaking change.

Improve `GdbStubStateMachine` API Docs

To avoid blocking the release of 0.6 any longer, I've decided to leave this bit of the API sparsely documented for now.

While all the API types and methods have been documented, there isn't any good mod-level docs on how to use the various types. Instead, I've just pointed folks at various bits of real-world example code.

gdb: `mips64` is an unknown architecture

Thanks to this amazing library, I've been able to set up debugging for my emulator.
There are a few errors that seem out of place (and don't happen on the ARM example provided) that have to do with gdbstub_arch's MIPS64 XML.

Connecting through GDB normally (causes errors):

(gdb) target remote localhost:6464
Remote debugging using localhost:6464
warning: while parsing target description (at line 1): Target description specified unknown architecture "mips64"
warning: Could not load XML target description; ignoring
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
Remote 'g' packet reply is too long (expected 312 bytes, got 576 bytes): 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000400000a4ffffffff00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000003f00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000f01f00a4ffffffff0000000000000000000000000000000004004070000000000000000000000000000000000000000000000000000000000000000000000000400000a400000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000a000000000000

Connecting through GDB, setting arch (also causes errors):

(gdb) set arch mips
The target architecture is set to "mips".
(gdb) target remote localhost:6464
Remote debugging using localhost:6464
warning: while parsing target description (at line 1): Target description specified unknown architecture "mips64"
warning: Could not load XML target description; ignoring
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
warning: while parsing target description (at line 1): Target description specified unknown architecture "mips64"
warning: Could not load XML target description; ignoring
Remote 'g' packet reply is too long (expected 360 bytes, got 576 bytes): 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000400000a4ffffffff00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000003f00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000f01f00a4ffffffff0000000000000000000000000000000004004070000000000000000000000000000000000000000000000000000000000000000000000000400000a400000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000a000000000000

Connecting through GDB, setting arch and mips abi

(gdb) set arch mips
The target architecture is set to "mips".
(gdb) set mips abi n64
(gdb) target remote localhost:6464
Remote debugging using localhost:6464
warning: while parsing target description (at line 1): Target description specified unknown architecture "mips64"
warning: Could not load XML target description; ignoring
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
warning: while parsing target description (at line 1): Target description specified unknown architecture "mips64"
warning: Could not load XML target description; ignoring
...

Example for comparison (note I don't need to set anything to connect):

(gdb) target remote 127.0.0.1:9001
Remote debugging using 127.0.0.1:9001
Reading /test.elf from remote target...
warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
Reading /test.elf from remote target...
Reading symbols from target:/test.elf...
main () at test.c:1
1	int main() {
...

It seems there is no such architecture as mips64.
Here are all the valid architectures that can be set in GDB:

(gdb) set arch mips<TAB>
mips                 mips:4300            mips:gs464           mips:loongson_2e
mips:10000           mips:4400            mips:gs464e          mips:loongson_2f
mips:12000           mips:4600            mips:interaptiv-mr2  mips:micromips
mips:14000           mips:4650            mips:isa32           mips:mips5
mips:16              mips:5000            mips:isa32r2         mips:octeon
mips:16000           mips:5400            mips:isa32r3         mips:octeon+
mips:3000            mips:5500            mips:isa32r5         mips:octeon2
mips:3900            mips:5900            mips:isa32r6         mips:octeon3
mips:4000            mips:6000            mips:isa64           mips:sb1
mips:4010            mips:7000            mips:isa64r2         mips:xlr
mips:4100            mips:8000            mips:isa64r3
mips:4111            mips:9000            mips:isa64r5
mips:4120            mips:gs264e          mips:isa64r6

As for how to make it recognized as 64 bit, I don't really know. The docs seem to suggest you'd need to set the width inside org.gnu.gdb.mips.{cpu,cp1,fpu,dsp}

Some links that might help making a proper XML:
https://sourceware.org/gdb/onlinedocs/gdb/Target-Description-Format.html
https://sourceware.org/gdb/onlinedocs/gdb/MIPS-Features.html#MIPS-Features

Support `monitor` commands (`qRcmd` packet)

Should be pretty simple (just a single new packet + new trait method).

Something like this maybe?

fn handle_monitor_cmd(
    &mut self, 
    cmd: &str, 
    output: impl FnMut(&str)
) -> Option<Result<(), Self::Error>> { 
    None 
}

Add API for Bare-Metal OS Support

gdbstub makes it very easy to add a GDB server to any project. The current design uses long-polling to interact with the target. Conceptually, gdbstub sits between a network Connection and the Target being debugged. When the target is running, gdbstub is blocked.

An API for bare metal could live in an interrupt handler, such as a UART receive interrupt. Each time a character is received, it would be passed to gdbstub for collection and processing. This has several nice properties (a rough sketch of the idea follows the list below):

  1. It is easy to remove, allowing for feature gating -- no gdbstub means no debugger, and you simply have to remove the interrupt hook.
  2. It is very nonintrusive -- you only need to add code to the interrupt handler without needing to adjust the main loop at all.
  3. Because interrupt handlers run in an interrupt context, the system is necessarily in a stable state with interrupts disabled, meaning things won't be changing as you inspect them.
  4. Being in an interrupt handler allows you to debug even the kernel itself.
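
To visualize the proposal, the interrupt-driven flow might look roughly like this. Everything here (the GdbByteSink handle, its pump_byte method, and the uart_rx_irq handler) is a hypothetical placeholder used to show the shape of the idea, not an existing gdbstub API:

/// Hypothetical handle owning the in-progress GDB session state, consuming the
/// packet stream one byte at a time.
struct GdbByteSink { /* packet buffer, parser state, ... */ }

impl GdbByteSink {
    /// Feed a single received byte into the stub. Returns true once a complete
    /// packet has been received and handled.
    fn pump_byte(&mut self, _byte: u8) -> bool {
        // parsing / packet handling elided -- this only sketches the shape
        false
    }
}

/// Hypothetical UART receive-interrupt handler.
fn uart_rx_irq(gdb: &mut GdbByteSink, received_byte: u8) {
    // Interrupts are disabled here, so the rest of the system is quiescent
    // while the debugger inspects it -- exactly property 3 above.
    let _handled_full_packet = gdb.pump_byte(received_byte);
}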

GDB error: Reply contains invalid hex digit 84

If I step past a certain instruction without setting a breakpoint on it, I consistently get "invalid hex digit 84". I'm assuming that this instruction is relevant due to the register state at that moment (for a couple of reasons, one being that I got the same error in the past from returning an Err from memory reads). At minimum, I'm sure this is an issue whose cause is hard to track down, and it could maybe use some documentation on how to avoid it.

Let me know if I can provide any further information to help debug this.

PacketUnexpected with ThreadStopReason::GdbCtrlCInterrupt

I tried to support Ctrl+C (in GDB) by waiting for the serial line interrupt, and, if it arrives, providing a ThreadStopReason::GdbCtrlCInterrupt to the state machine. However, this gives me:

17017938078 [ERROR] - gdbstub::gdbstub_impl::state_machine: Unexpected interrupt packet!
17029278514 [ERROR] - nrk::arch::gdb: gdbstub internal error PacketUnexpected

This seems to be coming from here. The comment is strange because I'm doing a Ctrl+C while my code is spinning in an infinite loop. Anyway, I figured I'd ask, also because PacketUnexpected is treated as "always a bug"...

Explore + Document how to work with Big-Endian (and Bi-Endian) Targets

#25 (which added support for 32-bit PowerPC architectures) raised some interesting questions regarding how gdbstub should handle big-endian and bi-endian targets.

Namely, should the Registers::gdb_de/serialize methods be configurable based on the target's current endianness? i.e: plumb through some sort of Target API which queries the endianness of the target, and switches between to/from_be_bytes and to/from_le_bytes as appropriate?

Unfortunately, this isn't something I can easily test, since I don't have any bi-endian or big-endian targets to play with...
As such, I'm marking this issue as "help wanted," so that hopefully someone who does have experience with bi-endian systems could clarify the situation.

Implement missing arch-specific `RegId` implementations

Overview

#22 added support for register-level read/writes, and introduced a new RegId associated type to the existing Registers trait. This associated type is used to translate raw GDB register ids (i.e: an arch-dependent usize) into a structured human-readable enum identifying the register.

e.g:

/// 32-bit ARM core register identifier.
#[derive(Debug, Clone, Copy)]
pub enum ArmCoreRegId {
    /// General purpose registers (R0-R12)
    Gpr(u8),
    /// Stack Pointer (R13)
    Sp,
    /// Link Register (R14)
    Lr,
    /// Program Counter (R15)
    Pc,
    /// Floating point registers (F0-F7)
    Fpr(u8),
    /// Floating point status
    Fps,
    /// Current Program Status Register (cpsr)
    Cpsr,
}

impl RegId for ArmCoreRegId {
    fn from_raw_id(id: usize) -> Option<(Self, usize)> {
        match id {
            0..=12 => Some((Self::Gpr(id as u8), 4)),
            13 => Some((Self::Sp, 4)),
            14 => Some((Self::Lr, 4)),
            15 => Some((Self::Pc, 4)),
            16..=23 => Some((Self::Fpr(id as u8), 4)),
            25 => Some((Self::Cpsr, 4)),
            _ => None,
        }
    }
}

Unfortunately, this API was only added after several contributors had already upstreamed their Arch implementations. As a result, there are several arch implementations which are missing proper RegId enums.

As a stop-gap measure, affected Arch implementations have been modified to accept a RegIdImpl type parameter, which requires users to manually specify a RegId implementation. If none is available, users can also specify RegIdImpl = (), which uses a stubbed implementation that always returns None.

e.g:

pub enum PowerPcAltivec32<RegIdImpl: RegId> {
    #[doc(hidden)]
    _Marker(core::marker::PhantomData<RegIdImpl>),
}

impl<RegIdImpl: RegId> Arch for PowerPcAltivec32<RegIdImpl> {
    type Usize = u32;
    type Registers = reg::PowerPcCommonRegs;
    type RegId = RegIdImpl;
    // ...
}

Action Items

At the time of writing, the following Arch implementations are still missing proper RegId implementations:

  • Armv4t
  • Mips / Mips64
  • Msp430
  • PowerPcAltivec32
  • Riscv32 / Riscv64
  • x86 (i386)
  • x86_64

Whenever a RegId enum is upstreamed, the associated Arch's RegIdImpl parameter will be defaulted to the newly added enum. This will simplify the API without requiring an explicit breaking API change. Once all RegIdImpl have a default implementation, only a single breaking API change will be required to remove RegIdImpl entirely.

Please only contribute RegId implementations that have been reasonably tested in your own project!

As with all architecture-specific bits of functionality in gdbstub, it's up to the contributor to test whether or not the feature works as expected - all I can do is review the code for style and efficiency.

This issue is not a blocker, and I do not mind having Arch implementations with a RegIdImpl parameter lingering in the codebase.

0.2.2 semantic version is incompatible

Several methods in the Target trait, such as read_addrs, write_addrs, and resume, break semantic versioning upon updating to version 0.2.2.

This leads to weird bugs where crates using the 0.2 version of gdbstub compile just fine with an existing Cargo.lock file, but fail when running cargo package.

I'm not sure what would be the best move here, but the cargo docs would suggest bumping the minor version number. I suppose that, in addition, yanking 0.2.2 would be ideal.

Support GDB Agent Expressions [for conditional breakpoints, breakpoint commands, tracepoints, etc...]

See https://sourceware.org/gdb/current/onlinedocs/gdb/Agent-Expressions.html#Agent-Expressions

In some applications, it is not feasible for the debugger to interrupt the program’s execution long enough for the developer to learn anything helpful about its behavior. If the program’s correctness depends on its real-time behavior, delays introduced by a debugger might cause the program to fail, even when the code itself is correct. It is useful to be able to observe the program’s behavior without interrupting it.

When GDB is debugging a remote target, the GDB agent code running on the target computes the values of the expressions itself. To avoid having a full symbolic expression evaluator on the agent, GDB translates expressions in the source language into a simpler bytecode language, and then sends the bytecode to the agent; the agent then executes the bytecode, and records the values for GDB to retrieve later.

The bytecode language is simple; there are forty-odd opcodes, the bulk of which are the usual vocabulary of C operands (addition, subtraction, shifts, and so on) and various sizes of literals and memory reference operations. The bytecode interpreter operates strictly on machine-level values — various sizes of integers and floating point numbers — and requires no information about types or symbols; thus, the interpreter’s internal data structures are simple, and each bytecode requires only a few native machine instructions to implement it. The interpreter is small, and strict limits on the memory and time required to evaluate an expression are easy to determine, making it suitable for use by the debugging agent in real-time applications.

One notable application of Agent Expressions not included in this overview is that breakpoint packets support specifying conditions and/or commands as bytecode expressions to be executed directly on the device. This enables significantly faster conditional breakpoints, as the target does not have to stop execution, communicate with the remote GDB client, and wait for the client to execute the condition.


There are several considerations to keep in mind while working on this feature and designing its API:

  • The existing breakpoint packet parser must be augmented with zero cost support for parsing Agent Expressions when the appropriate feature is enabled.
    • This is weird, as most packets are not parsed differently based on which protocol features are being used.
  • The Agent Expression API must be written such that users should have control over how and where incoming bytecode expressions (which are of variable-length) are stored. gdbstub should parse packets with bytecode expressions and extract the bytecode expression slices, but should not decide how they are stored / retrieved.
    • This could be done by having the Agent API rely on a generic BytecodeStorage trait, which users provide the implementation of (see the trait sketch after this list).
    • gdbstub must include at least one "batteries included" implementation to make it as easy as possible to get up and running with the feature. e.g: if the std feature is enabled, provide a HashMap-based implementation.
  • gdbstub should not lock users into a single bytecode interpreter. Users should have the flexibility to decide how/when bytecode expressions are executed.
    • This could be done by having the Agent API rely on a generic BytecodeExecutor trait, which users provide the implementation of.
    • gdbstub should offer a "batteries included" interpreter for those that just want to get up and running quickly (but this wouldn't be a feature blocker)
    • e.g: users looking to get the absolute lowest overhead when running agent expressions should be able to use their own implementations. e.g: they might want to JIT incoming bytecode expressions into target-specific assembly which can then be executed as part of their breakpoint handler routines.
  • And lastly, as always, the API should be difficult to misuse, and zero-copy.
    • zero-copy in the sense that users should not be forced to copy bytecode expressions into the bytecode executor.
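
To make the storage/executor split above a bit more concrete, here is a minimal sketch of what such traits could look like. Everything in it is hypothetical (BytecodeStorage, BytecodeExecutor, and ExprId are not part of any existing gdbstub API), and a real design would need far more thought around lifetimes and error handling:

/// Hypothetical handle used to refer back to a stored expression.
pub type ExprId = u32;

/// Hypothetical user-provided storage for variable-length bytecode expressions.
/// gdbstub would hand over the raw expression bytes parsed out of the packet,
/// but never decide how (or where) they are kept.
pub trait BytecodeStorage {
    type Error;

    /// Store a bytecode expression for later execution, returning an id for it.
    fn store(&mut self, bytecode: &[u8]) -> Result<ExprId, Self::Error>;

    /// Retrieve a previously stored expression (zero-copy: a borrowed slice).
    fn get(&self, id: ExprId) -> Option<&[u8]>;

    /// Drop an expression, e.g. when its associated breakpoint is removed.
    fn remove(&mut self, id: ExprId);
}

/// Hypothetical user-provided evaluator: an interpreter, a JIT, etc.
pub trait BytecodeExecutor {
    type Error;

    /// Evaluate a bytecode expression, returning its integer result.
    fn execute(&mut self, bytecode: &[u8]) -> Result<u64, Self::Error>;
}

With a split like this, a std "batteries included" storage could simply wrap a HashMap<ExprId, Vec<u8>>, while no_std users could back it with a fixed-size arena.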

Some final things of note:

  • There's already an existing GDB Agent Bytecode interpreter written in Rust! @khuey's https://github.com/khuey/gdb-agent
    • This is excellent, as that means there's already an easy-to-use, off-the-shelf BytecodeExecutor implementation that can be used to validate the feature is working as intended!
    • It would still be nice if gdbstub included an "in-house" Agent Bytecode executor (with no_std support), but that's not a feature blocker
  • This feature would tie into any future Tracepoint Packet support, so it would be a good idea to at least skim through those docs and make sure the API is designed in a way that will "play nice" if tracepoint support is added later.

Add support for responding with library relocation offsets

For some targets, sections may be relocated from their base address. As a result, the stub may need to tell GDB the final section addresses to ensure that debug symbols are resolved correctly after relocation.

Depending on the target this can be done using several mechanisms:

  • For targets where library offsets are maintained externally (e.g. Windows) this can be done by responding to qXfer:library:read.

  • For System-V architectures, GDB is capable of extracting library offsets from memory if it knows the base address of the dynamic linker. The base address can be specified by either implementing the qOffsets command or by including an AT_BASE entry in the response to the more modern qXfer:auxv:read command. Alternatively, a target can implement qXfer:library-svr4:read; however, this may involve digging into the internals of the dynamic linker.

Currently, only the qOffsets command has been implemented (see: #30).
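
For reference, a rough sketch of what wiring up the existing qOffsets support looks like on the target side, assuming the section_offsets extension trait added in #30 (exact trait/method/accessor names vary between gdbstub releases, and MyTarget plus its fields are placeholders):

use gdbstub::arch::Arch;
use gdbstub::target::ext::section_offsets::{Offsets, SectionOffsets};

// `MyTarget` is assumed to already implement `Target`, and to expose this
// extension by returning `Some(self)` from the corresponding accessor on its
// `Target` impl (otherwise the qOffsets handler stays disabled).
impl SectionOffsets for MyTarget {
    fn get_section_offsets(
        &mut self,
    ) -> Result<Offsets<<Self::Arch as Arch>::Usize>, Self::Error> {
        // Report the post-relocation offsets so GDB can fix up its debug symbols.
        Ok(Offsets::Sections {
            text: self.text_offset,
            data: self.data_offset,
            bss: None,
        })
    }
}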


Original issue:

For targets that relocate an image before execution, the debug symbols in GDB will have the incorrect offsets. The qOffsets query allows GDB to query the stub for the text segment offset after relocation.

I implemented the functionality I needed here: mchesser@db347e0, however there appear to be two different methods of reporting the offset (I think one is relative, and one is absolute), and I'm not entirely sure whether I missed anything in the implementation.

Pass kind argument to add_hw_watchpoint

Currently, when we add a watchpoint, kind is ignored:

let addr =
    <T::Arch as Arch>::Usize::from_be_bytes(cmd.addr).ok_or(Error::TargetMismatch)?;
let kind =
    <T::Arch as Arch>::BreakpointKind::from_usize(cmd.kind).ok_or(Error::TargetMismatch)?;
let handler_status = match cmd_kind {
    CmdKind::Add => {
        use crate::target::ext::breakpoints::WatchKind::*;
        let supported = match cmd.type_ {
            0 => (ops.sw_breakpoint()).map(|op| op.add_sw_breakpoint(addr, kind)),
            1 => (ops.hw_breakpoint()).map(|op| op.add_hw_breakpoint(addr, kind)),
            2 => (ops.hw_watchpoint()).map(|op| op.add_hw_watchpoint(addr, Write)),
            3 => (ops.hw_watchpoint()).map(|op| op.add_hw_watchpoint(addr, Read)),
            4 => (ops.hw_watchpoint()).map(|op| op.add_hw_watchpoint(addr, ReadWrite)),
            // ...

But according to the GDB RSP docs, when setting a watchpoint, kind specifies the number of bytes to watch, which is certainly not meaningless:

‘Z2,addr,kind’
    Insert (‘Z2’) or remove (‘z2’) a write watchpoint at addr. The number of bytes to watch is specified by kind. 

We also need to pass kind as an argument to add_hw_watchpoint.
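
A minimal sketch of what the fixed trait could look like, modeled on the existing HwWatchpoint extension (names here are illustrative, not a settled API; the remove side changes symmetrically):

use gdbstub::arch::Arch;
use gdbstub::target::ext::breakpoints::WatchKind;
use gdbstub::target::{Target, TargetResult};

pub trait HwWatchpoint: Target {
    /// `len` carries the packet's `kind` field, i.e. the number of bytes to watch.
    fn add_hw_watchpoint(
        &mut self,
        addr: <Self::Arch as Arch>::Usize,
        len: <Self::Arch as Arch>::Usize,
        kind: WatchKind,
    ) -> TargetResult<bool, Self>;

    fn remove_hw_watchpoint(
        &mut self,
        addr: <Self::Arch as Arch>::Usize,
        len: <Self::Arch as Arch>::Usize,
        kind: WatchKind,
    ) -> TargetResult<bool, Self>;
}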

Explore binary overhead in mixed Rust + C/C++ projects

Hi
First of all thank you for this nice project.
I was looking at it with the goal of embedding it on a small ARM or RISC-V board and talking to it over USB/CDC.

  • Arm support : check
  • Riscv support : check
  • No_std : check
  • Simple to use API: check

So far it looked like a perfect match
But then I tried to build the no_std example on top of my project baseline, which is a mix of C++ and Rust. The base project is ~20 kB in binary size.

It built fine, except it consumed a metric ton of flash (I guess this is all relative :) ).
With a lazy setup I was around 600 kB; using all the tricks I know, it went down to ~200 kB.

I've looked into why: ~80 kB or so is gdbstub/_arch itself, and ~120 kB comes from pulling in dependencies from core, fmt, ...
In the example, calling gdb.incoming_data(&mut target, byte) is enough for the final executable to jump from 20 kB to 220 kB of binary size, as the code is no longer flagged as unused and removed.

So, a couple of questions:

  • Is this the expected size for the no_std version, or did I do something silly?
  • Do you have any hints on how to shrink it down?

Thank you again

Remove `SingleStepGdbBehavior::Unknown` variant

With #95, there are no longer any Arch implementations in gdbstub_arch that use SingleStepGdbBehavior::Unknown, and therefore this variant can be removed.

Doing so would be a breaking change, and would only land as part of the next breaking release of gdbstub (which at the time of writing would be gdbstub 0.7).

This is a tracking issue to make sure this variant gets removed at some point prior to 0.7's release.

Enabling `ConsoleOutput` in more places

At the moment, gdbstub only exposes ConsoleOutput in one place: as part of the handle_monitor_cmd interface. That said, this functionality of "print text from the target to the client terminal" isn't some special feature of the monitor command interface, and is also supported as a bog-standard Stop Reply Packet (namely, the O XX... packet).

Moreover, based on a very brief skim through the gdb client source code, it seems that the GDB client might be resilient enough to accept O packets at any time, not just as a response to qRcmd and as a stop reply packet! Note that this behavior is not formalized by the spec, but if it works, then it opens the door to some very cool functionality...

EDIT: of course, there's also the Host I/O packet interface, which enables writing to the host console via a standard write(1, buf, len) interface. Admittedly, this can only be used in the same context as regular O xx stop reply packets, so aside from the benefit of using less bandwidth (as it uses the binary data transfer protocol vs. the 2-char ascii per byte protocol), it's not that much better.

With these bits of info in mind, there are a few ways gdbstub could expose ConsoleOutput in more places:

1. Exposing ConsoleOutput as part of the current resume() interfaces.

This should be pretty straightforward, and could be implemented similar to the current monitor command ConsoleOutput (i.e: using a callback).

Note that this will require an API change to add the additional function parameter.

2. Exposing a "global" handle to send O XX... packets.

At the time of writing, I don't have a concrete idea of how to implement this (especially for no_std targets), but I was thinking the API could look something like this:

let connection: TcpStream = wait_for_gdb_connection(9001);
let mut debugger = GdbStub::new(connection);

// get an instance of the global `ConsoleOutput` handle
let console_out_handle = debugger.console_output_handle();
let mut target = MyTarget::new(console_out_handle)?;
target.set_console_out_handle(console_out_handle); // or maybe the target supports late-binding

match debugger.run(&mut target) { ... }

// ..
// later, in the target implementation...
// ..

impl MultiThreadOps for MyTarget {
    fn read_registers(
        &mut self,
        regs: &mut gdbstub_arch::arm::reg::ArmCoreRegs,
        tid: Tid,
    ) -> TargetResult<(), Self> {
        outputln!(self.console_out_handle, "reading regs for tid {:?}", tid);
        // ...
    }
}

IMPORTANT NOTE: because the underlying GDB protocol doesn't support sending console output packets at totally arbitrary points during execution (e.g: in the middle of writing out some other, longer packet), calling output! will require buffering + deferring output until such a time when it's reasonable to flush. In other words, the global output! interface will have to be non-blocking. This is in contrast to the qRcmd or resume implementations, which can output the data immediately.

Two final comments:

  • I have a strong suspicion that this feature will have to be std/alloc only, as my gut feeling is that it'll be incredibly tricky to get the lifetimes / locking right without the use of Arc (see the sketch after this list). That said, this is just a gut feeling, and it very well might be the case that once someone takes a crack at an implementation, a more obvious solution will present itself.
  • If a no_std implementation is possible, it may need to be feature-gated to avoid ballooning the binary size.
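
To illustrate that gut feeling, here is a hypothetical (std-only) buffered handle, with invented names throughout: the target queues bytes at any time, and the stub's main loop drains them into O XX... packets at a point where the protocol allows it.

use std::sync::{Arc, Mutex};

/// Hypothetical clonable console-output handle (std-only sketch).
#[derive(Clone, Default)]
pub struct ConsoleOutputHandle {
    buf: Arc<Mutex<Vec<u8>>>,
}

impl ConsoleOutputHandle {
    /// Non-blocking: just queues the bytes for later transmission.
    pub fn output(&self, msg: &str) {
        self.buf.lock().unwrap().extend_from_slice(msg.as_bytes());
    }

    /// Called by the stub's main loop at a safe point between packets,
    /// returning any pending bytes to be encoded as `O XX...` packets.
    pub fn drain(&self) -> Vec<u8> {
        std::mem::take(&mut *self.buf.lock().unwrap())
    }
}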

Missing register ids for xmm registers?

I'm getting a write_register request for a reg_id (57) that isn't really known by gdbstub. I suppose I can add it myself? I just can't really tell which register is supposed to be 57 here.

I found this by accident because my gdb sends a packet for setting reg 57/0x39, which is ignored by gdbstub (here with some added log statements):

28275229934 [TRACE] - gdbstub::protocol::recv_packet: <-- $P39=ffffffffffffffff#59
28279335634 [INFO ] - gdbstub::gdbstub_impl::ext::single_register_access: got P P { reg_id: 57, val: [255, 255, 255, 255, 255, 255, 255, 255] }
28285262982 [INFO ] - gdbstub::gdbstub_impl::ext::single_register_access: reg is None
28289378772 [INFO ] - gdbstub::gdbstub_impl::ext::single_register_access: empty pkt?
28293528816 [TRACE] - gdbstub::protocol::response_writer: --> $#00

Make single-stepping optional

A close reading of the GDB RSP docs suggests that single-stepping is actually an optional feature of the protocol. Emphasis mine:
https://sourceware.org/gdb/current/onlinedocs/gdb/Overview.html#Overview

At a minimum, a stub is required to support the ‘?’ command to tell GDB the reason for halting, ‘g’ and ‘G’ commands for register access, and the ‘m’ and ‘M’ commands for memory access. Stubs that only control single-threaded targets can implement run control with the ‘c’ (continue) command, and if the target architecture supports hardware-assisted single-stepping, the ‘s’ (step) command. Stubs that support multi-threading targets should support the ‘vCont’ command. All other commands are optional.

At first blush, it would seem that making single-stepping an optional feature is actually fairly straightforward: simply tweak the resume API to model single-stepping in a manner similar to other optional resume modes (such as optimized range stepping). It would require an API breaking change (removing the ResumeReason::Step{WithSignal} variants), but aside from that, it shouldn't be too tricky to implement.
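
As a rough illustration of that shape (trait and method names below are purely illustrative, not a settled API), single-stepping would become one more opt-in extension trait, and the stub would only advertise vCont;s support to the client when the target implements it:

use gdbstub::target::Target;

/// Illustrative sketch only: single-stepping as an opt-in extension rather
/// than a mandatory resume mode.
pub trait SingleStepSketch: Target {
    /// Step the target by a single instruction, optionally delivering a signal.
    fn step(&mut self, signal: Option<u8>) -> Result<(), Self::Error>;
}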


Unfortunately, making single-stepping optional comes with a hidden footgun: if a GDB user tries to single-step on a target that doesn't support native single-stepping, the GDB client will instead attempt to use temporary breakpoints to "emulate" single stepping. You can see this behavior by skimming through the source of gdb/infrun.c:resume_1.

The gdbstub documentation should make it very clear that supporting optimized single stepping is highly recommended, as it is significantly more efficient than relying on continue + temporary breakpoints. Moreover, if the target doesn't support single stepping or breakpoints, then calling step won't work at all.

Explore overhead of Trait Objects in Target API

Would it be possible to use generic parameters for the public trait methods?

fn resume(
    &mut self,
    actions: impl Iterator<Item = (TidSelector, ResumeAction)>,
    check_gdb_interrupt: impl FnMut() -> bool
) -> Result<(Tid, StopReason<<Self::Arch as Arch>::Usize>), Self::Error>

Personally, I find this easier to read, but it also allows consumers to use a monomorphised version of the function. I think this is useful for more constrained debugger implementations since it allows the generated code to be much smaller for calls like resume(Empty, || false).

I'm not sure why you'd need a Target trait object, but there's no reason you couldn't make one internally: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4e94a730ae674c63595bed600b64cec7

Unsupported X packets (write binary data)

I'm getting some

22363003360 [INFO ] - gdbstub::gdbstub_impl: Unknown command: Ok("X40003cf7e750,0:")

lines in my gdb session.

Apparently X is for:

write mem (binary) | Xaddr,length:XX... | addr is address, length is number of bytes, XX... is binary data.  The characters $, #, and 0x7d are escaped using 0x7d.

If you can confirm that this is really missing (and not me doing something stupid again), I'll give it a shot and implement it.

Explicitly support non-fatal invalid read/writes in `Target::read/write_addrs`

As per discussion in #15, the Target::read/write_addrs API doesn't support a clear mechanism to signal non-fatal invalid memory reads/writes.

A simple fix would be to modify the API to return a Result<bool, Self::Error>, where Ok(true) signals success, and Ok(false) signals an error (as is done elsewhere in the API, such as with update_sw_breakpoint).
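
A minimal sketch of that proposed signature for reads, wrapped in a placeholder trait so it stands alone (write_addrs would change in the same way):

use gdbstub::arch::Arch;
use gdbstub::target::Target;

// Sketch of the proposed change (not the actual trait definition): Ok(true)
// signals success, Ok(false) signals a non-fatal invalid access that gdbstub
// reports back to the client, and Err(_) remains reserved for fatal errors.
pub trait MemoryAccessSketch: Target {
    fn read_addrs(
        &mut self,
        start_addr: <Self::Arch as Arch>::Usize,
        data: &mut [u8],
    ) -> Result<bool, Self::Error>;
}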

Support `qGetTLSAddr`

I noticed that this crate does not support qGetTLSAddr for working with thread-local storage.

Although I do not currently need this myself, I thought it would be good to track this issue.
