kata-containers / cgroups-rs

This project forked from levex/cgroups-rs

Native Rust library for managing control groups under Linux

Home Page: https://crates.io/crates/cgroups-rs

License: Other

Rust 99.32% Shell 0.53% Makefile 0.15%

cgroups-rs's Introduction

Kata Containers

Welcome to Kata Containers!

This repository is the home of the Kata Containers code for the 2.0 and newer releases.

If you want to learn about Kata Containers, visit the main Kata Containers website.

Introduction

Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers, but provide the workload isolation and security advantages of VMs.

License

The code is licensed under the Apache 2.0 license. See the license file for further details.

Platform support

Kata Containers currently runs on 64-bit systems supporting the following technologies:

Architecture         Virtualization technology
x86_64, amd64        Intel VT-x, AMD SVM
aarch64 ("arm64")    ARM Hyp
ppc64le              IBM Power
s390x                IBM Z & LinuxONE SIE

Hardware requirements

The Kata Containers runtime provides a command to determine if your host system is capable of running and creating a Kata Container:

$ kata-runtime check

Notes:

  • This command runs a number of checks, including connecting to the network to determine if a newer release of Kata Containers is available on GitHub. If you do not wish this check to run, add the --no-network-checks option.

  • By default, only a brief success / failure message is printed. If more details are needed, the --verbose flag can be used to display the list of all the checks performed.

  • If the command is run as the root user, additional checks are run (including checking if another incompatible hypervisor is running). When running as root, network checks are automatically disabled.

Getting started

See the installation documentation.

Documentation

See the official documentation including:

Configuration

Kata Containers uses a single configuration file which contains a number of sections for various parts of the Kata Containers system including the runtime, the agent and the hypervisor.

Hypervisors

See the hypervisors document and the Hypervisor specific configuration details.

Community

To learn more about the project, its community and governance, see the community repository. This is the first place to go if you wish to contribute to the project.

Getting help

See the community section for ways to contact us.

Raising issues

Please raise an issue in this repository.

Note: If you are reporting a security issue, please follow the vulnerability reporting process

Developers

See the developer guide.

Components

Main components

The table below lists the core parts of the project:

Component      Type           Description
runtime        core           Main component run by a container manager and providing a containerd shimv2 runtime implementation.
runtime-rs     core           The Rust version of the runtime.
agent          core           Management process running inside the virtual machine / POD that sets up the container environment.
dragonball     core           An optional built-in VMM that brings an out-of-the-box Kata Containers experience with optimizations for container workloads.
documentation  documentation  Documentation common to all components (such as design and install documentation).
tests          tests          Excludes unit tests which live with the main code.

Additional components

The table below lists the remaining parts of the project:

Component          Type            Description
packaging          infrastructure  Scripts and metadata for producing packaged binaries (components, hypervisors, kernel and rootfs).
kernel             kernel          Linux kernel used by the hypervisor to boot the guest image. Patches are stored here.
osbuilder          infrastructure  Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor.
kata-debug         infrastructure  Utility tool to gather Kata Containers debug information from Kubernetes clusters.
agent-ctl          utility         Tool that provides low-level access for testing the agent.
kata-ctl           utility         Tool that provides advanced commands and debug facilities.
trace-forwarder    utility         Agent tracing helper.
runk               utility         Standard OCI container runtime based on the agent.
ci                 CI              Continuous Integration configuration files and scripts.
ocp-ci             CI              Continuous Integration configuration for the OpenShift pipelines.
katacontainers.io                  Source for the katacontainers.io site.
Webhook            utility         Example of a simple admission controller webhook to annotate pods with the Kata runtime class.

Packaging and releases

Kata Containers is now available natively for most distributions.

General tests

See the tests documentation.

Metrics tests

See the metrics documentation.

Glossary of Terms

See the glossary of terms related to Kata Containers.

cgroups-rs's People

Contributors

amitlevy, apokleos, bergwolf, burning1020, dcantah, dubek, flxo, fprasx, gkurz, herano, houstar, jakob-naucke, jmagnuson, jodh-intel, jongwu, justxuewei, kvasscn, levex, lifupan, liubin, mjerabek, mzweilz, nrxus, ordovicia, quanweizhou, rtzoeller, studychao, tecywiz121, tim-zhang, yaoyinnan

cgroups-rs's Issues

CGROUP2_SUPER_MAGIC missing on s390x

On s390x Linux:

error[E0425]: cannot find value `CGROUP2_SUPER_MAGIC` in module `statfs`
   --> src/hierarchies.rs:294:51
    |
294 |     fs_stat.unwrap().filesystem_type() == statfs::CGROUP2_SUPER_MAGIC
    |                                                   ^^^^^^^^^^^^^^^^^^^ not found in `statfs`
    |

This is a bug in libc+nix. The respective patches are now released. Will create a PR.

Update to nix 0.20.2

Describe the bug
A vulnerability has been reported to RustSec for the nix 0.20.0 crate. The affected function doesn't look to be used in this crate, however, using cargo audit will trip on the 0.20.0 dependency nonetheless.

Expected behavior
Running cargo audit without error.

Additional context
RustSec entry:

Crate:         nix
Version:       0.20.0
Title:         Out-of-bounds write in nix::unistd::getgrouplist
Date:          2021-09-27
ID:            RUSTSEC-2021-0119
URL:           https://rustsec.org/advisories/RUSTSEC-2021-0119
Solution:      Upgrade to ^0.20.2 OR ^0.21.2 OR ^0.22.2 OR >=0.23.0

Public access to path of memory controller

Which feature do you think can be improved?

This package is great insofar as it works on both cgroups v1 and v2. Thank you! However, it's a little slow for my needs. In particular, I am getting memory stats. Right now memory_stat() opens a whole bunch of files and does a whole bunch of parsing, much of which is not relevant to me. It's pretty slow (for my needs, this happens occasionally but on a critical code path).

How can it be improved?

As a workaround to API redesign/improvements, just making the controller path public (or open_path() public) would allow for me to open the two files I need, and just rely on the v1 vs v2 + controller mapping code in the package.

Additional Information

An API for getting e.g. memory info that only got specific attributes would be handy, right now I'm forced to load everything that fits in memory_stat() output. But that would be more work for you, and this change is just adding a 2-line method, say.
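The workaround requested above can be sketched as follows. This is a minimal illustration using only the standard library, assuming the caller already knows the cgroup directory; the `read_single_value` helper and its `cgroup_dir` parameter are hypothetical and not part of cgroups-rs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not part of cgroups-rs): read one numeric control
/// file, such as `memory.usage_in_bytes` (v1) or `memory.current` (v2),
/// from a cgroup directory the caller has already resolved.
pub fn read_single_value(cgroup_dir: &Path, file: &str) -> io::Result<u64> {
    let raw = fs::read_to_string(cgroup_dir.join(file))?;
    raw.trim()
        .parse::<u64>()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Only one file is opened per call, which is the point of the request: the v1/v2 path mapping would still come from the crate, while the caller picks the files it actually needs.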

readme example does not compile

fn main() {
    

use cgroups_rs::*;
use cgroups_rs::cgroup_builder::*;

// Acquire a handle for the cgroup hierarchy.
let hier = cgroups_rs::hierarchies::auto();

// Use the builder pattern (see the documentation to create the control group)
//
// This creates a control group named "example" in the V1 hierarchy.
    let cg: Cgroup = CgroupBuilder::new("example")
        .cpu()
        .shares(85)
        .done()
        .build(hier);

// Now `cg` is a control group that gets 85% of the CPU time relative to
// other control groups.

// Get a handle to the CPU controller.
let cpus: &cgroups_rs::cpu::CpuController = cg.controller_of().unwrap();
cpus.add_task(&CgroupPid::from(1234u64));

// [...]

// Finally, clean up and delete the control group.
cg.delete();

// Note that `Cgroup` does not implement `Drop` and therefore when the
// structure is dropped, the Cgroup will stay around. This is because, later
// you can then re-create the `Cgroup` using `load()`. We aren't too set on
// this behavior, so it might change in the future. Rest assured, it will be a
// major version change.

}
error[E0308]: mismatched types
  --> src/main.rs:13:22
   |
13 |       let cg: Cgroup = CgroupBuilder::new("example")
   |  _____________------___^
   | |             |
   | |             expected due to this
14 | |         .cpu()
15 | |         .shares(85)
16 | |         .done()
17 | |         .build(hier);
   | |____________________^ expected `Cgroup`, found `Result<Cgroup, Error>`
   |
   = note: expected struct `cgroups_rs::Cgroup`
                found enum `Result<cgroups_rs::Cgroup, cgroups_rs::error::Error>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `cg` due to previous error

cgroups-rs 0.3

Add native Tokio support for oom events of the memory controller

Is your feature request related to a problem? Please describe.

The functions events::notify_on_oom_v1 and events::notify_on_oom_v2 return a std::sync::mpsc::Receiver. Using the std channels in a tokio environment is cumbersome because it requires spawning a thread that blocks on Receiver::recv. There's an additional thread spawned inside those functions that blocks on reading the eventfd and sends the notification via the channel.

Describe the solution you'd like

Add a feature tokio that returns a Result<tokio::sync::mpsc::Receiver> or a Result<impl Stream<Item = Stream>> that integrates seamlessly into tokio applications. The implementation could use the tokio eventfd, which uses a non-blocking event_fd.

Additional context

The dependency list of cgroups-rs is very small which is nice. Adding tokio as a dependency must definitely be hidden behind a feature.

I can come up with a proposal PR if wanted. :-)

logging: Add some well-known log events at info log level

Kata provides both logging and tracing. Increasing the log level from the default "info" level to "debug" is ideal for problem determination, but may impact performance.

Sometimes it is useful to be able to pick out "key" points / events in the lifecycle of a container. Enabling debug logging would provide that but not in a lightweight fashion.

Alternatively, tracing could be enabled as that too can provide the key events. However, tracing is also not zero cost in terms of setup or use.

A compromise is to add a small number of "well known" log calls that log at the default "info" level. This allows the key events to be determined without enabling full debug or tracing.

Support discard for blkio

Now we use a fixed format to parse io service data, but it may change.

cgroups-rs/src/blkio.rs

Lines 103 to 107 in a45ecf0

match x {
[(major, minor, "Read", read_val), (_, _, "Write", write_val),
(_, _, "Sync", sync_val), (_, _, "Async", async_val),
(_, _, "Total", total_val)] =>
Some(IoService {

This should be implemented like runc to support different kernel versions.
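One way to avoid depending on a fixed list of operations is to key the counters by operation name. The sketch below is an illustration only (it is not runc's or this crate's implementation) and assumes input lines of the form `major:minor Op value`; `parse_io_service` and `DeviceOps` are hypothetical names.

```rust
use std::collections::HashMap;

/// Per-device counters keyed by operation name ("Read", "Write", "Discard", ...).
pub type DeviceOps = HashMap<(u64, u64), HashMap<String, u64>>;

/// Split one `major:minor Op value` line; returns `None` for any other shape.
fn parse_line(line: &str) -> Option<((u64, u64), &str, u64)> {
    let mut parts = line.split_whitespace();
    let (major, minor) = parts.next()?.split_once(':')?;
    let op = parts.next()?;
    let value = parts.next()?.parse::<u64>().ok()?;
    Some(((major.parse::<u64>().ok()?, minor.parse::<u64>().ok()?), op, value))
}

/// Parse a whole file. Lines that do not match (for example a trailing
/// `Total 12345` summary) are skipped, and unknown operations such as
/// `Discard` are simply stored under their own name.
pub fn parse_io_service(content: &str) -> DeviceOps {
    let mut out = DeviceOps::new();
    for (device, op, value) in content.lines().filter_map(parse_line) {
        out.entry(device).or_default().insert(op.to_string(), value);
    }
    out
}
```

Because nothing is matched positionally, a kernel that adds or omits an operation does not break parsing; callers look up the operations they care about and decide what a missing one means.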

Cgroup not being created on Ubuntu 20

The Cgroup builder does not seem to create the group on Ubuntu 20
code sample:

let cgs = CgroupBuilder::new(name)
        .cpu().shares(256).done()
        .memory().kernel_memory_limit(KMEM_LIMIT).memory_hard_limit(MEM_LIMIT).done()
        .pid().maximum_number_of_processes(MAX_PID).done()
        .blkio().weight(50).done()
        .build(Box::new(V2::new()));

Question: Read-only filesystem issue

When I tried to run the example code, the cgroups builder panicked with this error:

root@b728d2fea702:/test_rust/src# cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `/test_rust/target/debug/test_rust`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: FsError, cause: Some(Os { code: 30, kind: ReadOnlyFilesystem, message: "Read-only file system" }) }', src/main.rs:16:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I'm not sure if this is a problem with my Docker container (minimal Rust image). I also want to ask: where is the code that actually edits the Linux cgroups information? Thank you!

Reduce duplicated implementation of methods read_u64_from and read_string_from in many subsystem's files

Which feature do you think can be improved?
How about removing the duplicated methods, such as read_u64_from and read_string_from, that appear in many subsystem files?

How can it be improved?
The methods fn read_u64_from(file: File) -> Result<u64>{...} and fn read_string_from(mut file: File) {...} are reimplemented many times. In addition, read_i64_from(file: File) has the same core code as read_u64_from(file: File).
Reduce the duplicated code in the subsystem files by implementing these methods only in lib.rs.

Describe how specifically you think it could be improved.
Remove the duplicated implementations and implement them only in lib.rs.
Original:

fn read_u64_from(mut file: File) -> Result<u64> {
    let mut string = String::new();
    match file.read_to_string(&mut string) {
        Ok(_) => string
            .trim()
            .parse()
            .map_err(|e| Error::with_cause(ParseError, e)),
        Err(e) => Err(Error::with_cause(ReadFailed, e)),
    }
}

pub fn read_i64_from(mut file: File) -> Result<i64> {
    let mut string = String::new();
    match file.read_to_string(&mut string) {
        Ok(_) => string
            .trim()
            .parse()
            .map_err(|e| Error::with_cause(ParseError, e)),
        Err(e) => Err(Error::with_cause(ReadFailed, e)),
    }
}

fn read_string_from(mut file: File) -> Result<String> {
    let mut string = String::new();
    match file.read_to_string(&mut string) {
        Ok(_) => Ok(string.trim().to_string()),
        Err(e) => Err(Error::with_cause(ReadFailed, e)),
    }
}

After removing the duplicated code:

/// read and parse an u64 data
fn read_u64_from(file: File) -> Result<u64> {
    read_from::<u64>(file)
}

/// read and parse an i64 data
fn read_i64_from(file: File) -> Result<i64> {
    read_from::<i64>(file)
}

fn read_from<T>(mut file: File) -> Result<T>
    where T: FromStr, <T as FromStr>::Err: 'static + Send + Sync + std::error::Error {
    let mut string = String::new();
    match file.read_to_string(&mut string) {
        Ok(_) => string
            .trim()
            .parse::<T>()
            .map_err(|e| Error::with_cause(ParseError, e)),
        Err(e) => Err(Error::with_cause(ReadFailed, e)),
    }
}

fn read_string_from(mut file: File) -> Result<String> {
    let mut string = String::new();
    match file.read_to_string(&mut string) {
        Ok(_) => Ok(string.trim().to_string()),
        Err(e) => Err(Error::with_cause(ReadFailed, e)),
    }
}

Support rootless cgroup v2

Currently cgroups-rs manages cgroups in privileged mode. To support rootless containers, this library should ask for permission to manage cgroups in unprivileged mode.

A related implementation communicates with systemd, asks for the path to the rootless cgroup v2 hierarchy, and uses it as the base for relative paths:
github.com/opencontainers/runc/pull/2281

cgroup2: Support cgroup.kill

In 5.14+ there exists a cgroup.kill that will send a SIGKILL to every process in the tree if written to (with a "1"). This would be useful for kata to avoid the freezing -> manually sending SIGKILL -> thawing process it does currently (same thing runc does at the moment).
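The write described above can be sketched with the standard library alone. The `kill_cgroup` helper is hypothetical and not an existing cgroups-rs API; it assumes the caller passes the directory of a cgroup v2 group.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: ask the kernel to SIGKILL every process in the
/// cgroup at `cgroup_dir` by writing "1" to its `cgroup.kill` file.
/// The file is opened without `create`, so on kernels that lack
/// `cgroup.kill` the caller gets a `NotFound` error.
pub fn kill_cgroup(cgroup_dir: &Path) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .open(cgroup_dir.join("cgroup.kill"))?;
    file.write_all(b"1")
}
```

A caller could treat `NotFound` as the signal to fall back to the existing freeze, SIGKILL, thaw sequence on older kernels.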

Panic on reading memory stats

Describe the bug

The lib crashes on get_max_value("memory.max") like values (the backtrace is provided as follows).

The /proc/<pid>/cgroup of the process contains only 1 line:

0::/

The MVP

use procfs::{Meminfo, ProcError};
use cgroups_rs::memory::MemController;
use cgroups_rs::*;

pub fn flush_memory_usage() {
    let container_total = cgroup_node_mem_total()
        .map_err(|e| {
            println!("flush_memory_usage cgroup_node_mem_total: {}", e);
        })
        .unwrap_or_default();
        println!("container total mem: {}", container_total);

}

pub fn cgroup_node_mem_total() -> Result<u64, ProcError> {
            // Acquire a handle for the cgroup hierarchy.
            let hier = cgroups_rs::hierarchies::auto();
            let cg = Cgroup::load(hier, String::from(""));
            let mc: &MemController = cg.controller_of().unwrap();
            Ok(mc.memory_stat().stat.hierarchical_memory_limit as u64)
}

fn main() {
        flush_memory_usage();
}

The full log:

xxx@fancybox:~/workspace/cgtest$ RUST_BACKTRACE=1 cargo r
   Compiling cgtest v0.1.0 (/home/xueruini/workspace/cgtest)
    Finished dev [unoptimized + debuginfo] target(s) in 1.21s
     Running `target/debug/cgtest`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: ReadFailed("/sys/fs/cgroup/memory.high"), cause: Some(Os { code: 2, kind: NotFound, message: "No such file or directory" }) }', /home/xxxx/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/cgroups-rs-0.3.2/src/memory.rs:587:34
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/panicking.rs:64:14
   2: core::result::unwrap_failed
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/result.rs:1790:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/result.rs:1112:23
   4: cgroups_rs::memory::MemController::memory_stat_v2
             at /home/xxx/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/cgroups-rs-0.3.2/src/memory.rs:587:19
   5: cgroups_rs::memory::MemController::memory_stat
             at /home/xxx/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/cgroups-rs-0.3.2/src/memory.rs:621:20
   6: cgtest::cgroup_node_mem_total
             at ./src/main.rs:23:16
   7: cgtest::flush_memory_usage
             at ./src/main.rs:6:27
   8: cgtest::main
             at ./src/main.rs:27:2
   9: core::ops::function::FnOnce::call_once
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Expected behavior
get_mem() should return as expected instead of panic.

Additional context

I tried this on various environments, including wsl2+ubuntu 20.04, wsl2+ubuntu 22.04, VirtualBox 7.0 + ubuntu 22.04.
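A non-panicking read of such limit files could look like the sketch below. It is an illustration under assumptions, not the crate's code: `Limit` and `read_limit` are hypothetical names, and a missing file (as in the log above for /sys/fs/cgroup/memory.high) is reported as `Ok(None)` rather than unwrapped.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// A cgroup v2 limit value: either the literal `max` or a byte count.
#[derive(Debug, PartialEq, Eq)]
pub enum Limit {
    Max,
    Bytes(u64),
}

/// Hypothetical helper: read a limit file such as `memory.max` or
/// `memory.high`. A missing file yields `Ok(None)` instead of a panic;
/// other I/O errors and malformed contents are returned as errors.
pub fn read_limit(path: &Path) -> io::Result<Option<Limit>> {
    let raw = match fs::read_to_string(path) {
        Ok(contents) => contents,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(None),
        Err(e) => return Err(e),
    };
    match raw.trim() {
        "max" => Ok(Some(Limit::Max)),
        number => number
            .parse::<u64>()
            .map(|bytes| Some(Limit::Bytes(bytes)))
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
    }
}
```

With this shape the caller decides what an absent limit means, for example falling back to the host's total memory.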

Kernel memory limit is unsupported on Linux kernel >= 5.16

Describe the bug

As described first in kata-containers/kata-containers#4390 , Linux 5.16 merged a change which causes writes to cgroup's memory.kmem.limit_in_bytes to return -ENOTSUPP and do nothing.

This means that the unit tests fail with Linux 5.16 and cgroup v1.

To reproduce this, I installed an Ubuntu 22.04 VM (default kernel 5.15). I added systemd.unified_cgroup_hierarchy=0 to the kernel command-line. On 5.15 make test passes OK.

I then installed kernel 5.16.20 from Ubuntu (link). When I run the tests:

     Running tests/builder.rs (target/debug/deps/builder-4ff582e8201f8686)

running 7 tests
test test_blkio_res_build ... ignored
test test_devices_res_build ... ignored
thread 'test_memory_res_build' panicked at 'assertion failed: `(left == right)`
  left: `9223372036854771712`,
 right: `134217728`', tests/builder.rs:50:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test test_memory_res_build ... FAILED
test test_network_res_build ... ok
test test_cpu_res_build ... ok
test test_hugepages_res_build ... ok
test test_pid_res_build ... ok

failures:

failures:
    test_memory_res_build

test result: FAILED. 4 passed; 1 failed; 2 ignored; 0 measured; 0 filtered out; finished in 0.07s

Expected behavior
Unit tests pass.

It's a design question of whether to "swallow" the unsupported error from the OS and just do nothing, or keep the error (and fix the callers, such as kata agent). This is also related to the question in #76.

Additional context

  1. With cgroup v2 this doesn't happen because the tests skip this.
  2. I added a new test which attempts to call set_kmem_limit -- this indeed fails with OS error (95 unsupported).
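One possible middle ground for the design question above is to keep the error but make "unsupported" a distinct, non-fatal outcome. The sketch below is only an illustration: `Applied` and `classify_write` are hypothetical names, and the constant assumes the "OS error 95" reported in item 2 (EOPNOTSUPP on common Linux architectures).

```rust
use std::io;

/// Assumed errno value for "operation not supported", matching the
/// OS error 95 reported above; the value can differ on some architectures.
const EOPNOTSUPP: i32 = 95;

/// Outcome of trying to apply an optional limit.
#[derive(Debug, PartialEq, Eq)]
pub enum Applied {
    Yes,
    Unsupported,
}

/// Map the result of a control-file write so that "unsupported" is visible
/// to the caller without being treated as a hard failure. All other errors
/// are passed through unchanged.
pub fn classify_write(result: io::Result<()>) -> io::Result<Applied> {
    match result {
        Ok(()) => Ok(Applied::Yes),
        Err(e) if e.raw_os_error() == Some(EOPNOTSUPP) => Ok(Applied::Unsupported),
        Err(e) => Err(e),
    }
}
```

Callers such as a test can then skip the read-back assertion when the result is `Applied::Unsupported`, while real failures still surface.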

Huge pages failure with cgroups v2

When running with cgroups v2, the kata-agent binary from https://github.com/kata-containers/kata-containers ends up using cgroups v1 naming :

[pid   193] openat(AT_FDCWD, "/sys/fs/cgroup/crio/ee05a3ebc7aaf6232684454cade61e2d1b897fc126461f69959e613cb68c6d62/hugetlb.2MB.max", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 54
[pid   193] openat(AT_FDCWD, "/sys/fs/cgroup/crio/ee05a3ebc7aaf6232684454cade61e2d1b897fc126461f69959e613cb68c6d62/hugetlb.2MB.limit_in_bytes", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

This write + read sequence is this code in cgroups-rs:

            let _ = self.set_limit_in_bytes(&i.size, i.limit);
            if self.limit_in_bytes(&i.size)? != i.limit {
                return Err(Error::new(Other));
            }

set_limit_in_bytes() does have the bits to support v1 and v2, but limit_in_bytes() only knows about v1. This prevents Kata Containers from starting if the guest uses cgroups v2.
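The naming difference visible in the strace output above can be captured in one place so that the setter and the getter agree. This is a minimal sketch; `hugetlb_limit_file` is a hypothetical helper, not the crate's API.

```rust
/// Hypothetical helper: pick the hugetlb limit file for a page size such
/// as "2MB", using the v2 name (`.max`) or the v1 name (`.limit_in_bytes`).
pub fn hugetlb_limit_file(page_size: &str, is_v2: bool) -> String {
    if is_v2 {
        format!("hugetlb.{page_size}.max")
    } else {
        format!("hugetlb.{page_size}.limit_in_bytes")
    }
}
```

If both the write path and the read path call the same helper, the write-then-verify sequence shown above cannot open two different files.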

panic when parse memory stats with old linux kernel

Env

kernel version: 3.10.0
cgroup version: v1

Describe the bug

  1. Call the memory_stat() function in src/memory.rs.
  2. Execution goes into the parse_numa_stat() function.
  3. The first 4 attributes work well, but it panics at let hier_total_line = ls.next().unwrap();
  4. Checking the memory.numa_stat file under /sys/fs/cgroup, I couldn't find any entries starting with hierarchical_**, which may be caused by the old kernel.

Expected behavior
Return a default value when None is returned.

Additional context
Suggest changing unwrap() to unwrap_or().
This bug also happens when calling the parse_oom_control function.
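The suggested default-value behavior can be sketched by looking counters up by name instead of assuming a fixed line order. This is an illustration only, assuming lines that start with a `key=value` pair such as `total=10 N0=10`; `parse_totals` and `total_or_default` are hypothetical names.

```rust
use std::collections::HashMap;

/// Collect the leading `key=value` pair of every line, e.g. `total=42 N0=42`.
/// Lines without such a pair are ignored.
pub fn parse_totals(content: &str) -> HashMap<&str, u64> {
    content
        .lines()
        .filter_map(|line| {
            let (key, value) = line.split_whitespace().next()?.split_once('=')?;
            Some((key, value.parse::<u64>().ok()?))
        })
        .collect()
}

/// Look up a counter, falling back to 0 when the kernel did not report it
/// (as with the `hierarchical_*` entries on the old kernel described above).
pub fn total_or_default(totals: &HashMap<&str, u64>, key: &str) -> u64 {
    totals.get(key).copied().unwrap_or(0)
}
```

Whether 0 is the right default, or whether the field should instead be an `Option`, is a design choice this sketch does not settle.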

Create new release

I'm trying to update all the out of date crate versions in Kata.

Now that #67 has been merged, can we create a new release because...

  • good (latest nix):

    $ curl -o - -sL https://raw.githubusercontent.com/kata-containers/cgroups-rs/main/Cargo.toml | grep ^nix
    nix = "0.23.0"
    
  • bad (insecure version of nix - https://rustsec.org/advisories/RUSTSEC-2021-0119.html):

    $ cargo download -q cgroups-rs=0.2.7 | gunzip | tar -xvf - cgroups-rs-0.2.7/Cargo.toml -O | grep dependencies\.nix -A 1
    cgroups-rs-0.2.7/Cargo.toml
    [dependencies.nix]
    version = "0.20.2"

Error handling

Which feature do you think can be improved?
Error handling when performing fallible operations.

How can it be improved?
Don't silently ignore Results; propagate them to the crate user.

Additional Information

Some examples :

cgroups-rs/src/cgroup.rs

Lines 63 to 65 in 1df6e7a

fn create(&self) {
if self.hier.v2() {
let _ret = create_v2_cgroup(self.hier.root(), &self.path);

pub fn build(self, hier: Box<dyn Hierarchy>) -> Cgroup {
let cg = Cgroup::new(hier, self.name);
let _ret = cg.apply(&self.resources);
cg
}

Is there a rationale behind this design choice?
I'd like to propose a PR to implement error propagation, which could be implemented behind a feature flag for backwards compatibility.
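The difference between the two styles can be shown with a standard-library stand-in. The functions below are hypothetical and only mirror the `let _ret = ...` pattern quoted above; they are not the crate's code.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Mirrors the quoted pattern: the result is discarded, so the caller
/// cannot tell whether the directory was created.
pub fn create_ignoring_errors(path: &Path) {
    let _ret = fs::create_dir(path);
}

/// Propagating variant: the caller receives the `Result` and decides
/// whether to retry, log, or abort.
pub fn create_propagating_errors(path: &Path) -> io::Result<()> {
    fs::create_dir(path)
}
```

With the first function a read-only or missing parent directory goes unnoticed until a later operation fails; with the second the failure is reported at the call that caused it.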

Lift `'static` requirement on resource `attr` fields

Which feature do you think can be improved?

The key type of the additional attributes map of various resources like MemoryResources has the 'static requirement. This is nice for manual setup but really hard for configurations read from files and applied later on.

How can it be improved?

Either assign a dedicated lifetime to the structs that must be fulfilled or change the key type of the maps to String for the sake of simplicity.

Additional Information

Anything else to add? Just a big thank you for this crate ;-)
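The second option proposed above (String keys) can be sketched as follows. The struct names are hypothetical, and the assumption that the current map is shaped like `HashMap<&'static str, String>` is taken from the issue's description rather than verified against the crate.

```rust
use std::collections::HashMap;

/// Assumed current shape: keys must live for the whole program, which
/// works for literals but not for strings read from a file.
pub struct StaticAttrs {
    pub attrs: HashMap<&'static str, String>,
}

/// Proposed shape: owned keys, so values parsed at runtime fit directly.
pub struct OwnedAttrs {
    pub attrs: HashMap<String, String>,
}

/// Build attributes from `key=value` lines read at runtime.
/// Lines without `=` are ignored.
pub fn attrs_from_config(config: &str) -> OwnedAttrs {
    let attrs = config
        .lines()
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect();
    OwnedAttrs { attrs }
}
```

Owned keys cost one allocation per attribute, which is usually negligible next to the file writes that follow.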

cgroups::error::Error can't be shared between threads and can't be handled by anyhow

  |
167 |                 mem_controller.set_tcp_limit(kernel_tcp)?;
    |                                                         ^ `(dyn std::error::Error + std::marker::Send + 'static)` cannot be shared between threads safely
    |
    = help: the trait `std::marker::Sync` is not implemented for `(dyn std::error::Error + std::marker::Send + 'static)`
    = note: required because of the requirements on the impl of `std::marker::Sync` for `std::ptr::Unique<(dyn std::error::Error + std::marker::Send + 'static)>`
    = note: required because it appears within the type `std::boxed::Box<(dyn std::error::Error + std::marker::Send + 'static)>`
    = note: required because it appears within the type `std::option::Option<std::boxed::Box<(dyn std::error::Error + std::marker::Send + 'static)>>`
    = note: required because it appears within the type `cgroups::error::Error`
    = note: required because of the requirements on the impl of `std::convert::From<cgroups::error::Error>` for `anyhow::Error`
    = note: required by `std::convert::From::from`
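The compiler notes above point at a boxed cause that is `Send` but not `Sync`. A minimal sketch of an error type that satisfies both bounds is shown below; `CgroupError` is an illustrative stand-in, not the crate's actual definition.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Illustrative error type whose boxed cause is `Send + Sync`, which is
/// what conversions into `anyhow::Error` require of the source error.
#[derive(Debug)]
pub struct CgroupError {
    cause: Option<Box<dyn StdError + Send + Sync + 'static>>,
}

impl CgroupError {
    /// Wrap any thread-safe error as the cause.
    pub fn with_cause<E>(cause: E) -> Self
    where
        E: StdError + Send + Sync + 'static,
    {
        CgroupError { cause: Some(Box::new(cause)) }
    }
}

impl fmt::Display for CgroupError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.cause {
            Some(cause) => write!(f, "cgroup operation failed: {}", cause),
            None => write!(f, "cgroup operation failed"),
        }
    }
}

impl StdError for CgroupError {}

/// Compiles only when `T` can be sent to and shared between threads.
pub fn assert_send_sync<T: Send + Sync + 'static>() {}
```

Calling `assert_send_sync::<CgroupError>()` in a unit test turns the requirement into a compile-time check, so a regression is caught before users hit it with `?`.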

FreezerController not available for cgroups v2

Describe the bug
Although FreezerController supports cgroups v2, it is not included in Cgroup, because cgroups v2 do not have "freezer" as a controller, but it's implemented directly via cgroup.freeze (and thus "freezer" is not included in cgroup.controllers).

Expected behavior
FreezerController is supported also with cgroups v2.

Additional context
Observed on linux 5.10 and 5.14 (NixOS 21.05).
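Since cgroups v2 exposes freezing through the cgroup.freeze file rather than a named controller, the operation itself reduces to a file write. The sketch below uses only the standard library; `set_frozen` is a hypothetical helper, not an existing cgroups-rs API.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: freeze (`true`) or thaw (`false`) a cgroup v2
/// group by writing "1" or "0" to its `cgroup.freeze` file. The file is
/// opened without `create`, so a missing file is reported as an error.
pub fn set_frozen(cgroup_dir: &Path, frozen: bool) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .open(cgroup_dir.join("cgroup.freeze"))?;
    file.write_all(if frozen { b"1" } else { b"0" })
}
```

This does not depend on "freezer" appearing in cgroup.controllers, which is the mismatch the issue describes.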

Freezer controller "doesn't exist" in v2 cgroup

Describe the bug
Creating a cgroup with the freezer controller on v2 hierarchy fails with error
Error { kind: SpecifiedControllers, cause: None }

Expected behavior
It should succeed

Additional context
The freezer controller isn't mentioned in the cgroups file anymore, but is still available if accessing /sys/fs/cgroup/cgroup.freeze
I guess the API needs to be altered and can't be same for v1/v2?

        let hierarchy = cgroups_rs::hierarchies::auto();
        // get pids of all processes in the root cgroup
        let root_process_cgroup = cgroups_rs::Cgroup::load(hierarchy, "/proc/1/sys/fs/cgroup");
        let root_pids = root_process_cgroup.procs();

        let hierarchy = cgroups_rs::hierarchies::auto();
        // create new cgroup with freezer only, add this procs - this might exist from previous runs
        // of mirrord.
        let cgroup = cgroups_rs::Cgroup::new_with_specified_controllers(
            hierarchy,
            MIRRORD_CGROUP_PATH,
            Some(vec!["freezer".to_string()]),
        )?; // fails here
        for pid in root_pids {
            cgroup.add_task(pid)?;
        }
        let freezer_controller: &FreezerController = cgroup
            .controller_of()
            .ok_or(AgentError::PauseFailedCgroupFreezerNotFound)?;
        freezer_controller.freeze()?;

type i8 compile error in ARM aarch64

Description of problem

The kata-containers jenkins-ci-ARM-ubuntu-18-04 test reports an error in kata-containers/kata-containers#747

Error Message:
Compiling cgroups v0.1.1-alpha.0 (https://github.com/kata-containers/cgroups-rs?tag=0.1.1#3852d7c1)
error[E0308]: mismatched types
--> /home/jenkins/.cargo/git/checkouts/cgroups-rs-1340950e7d819bfb/3852d7c/src/lib.rs:769:34
|
769 | let _ = unsafe { libc::rmdir(p.as_ptr() as *const i8) };
| ^^^^^^^^^^^^^^^^^^^^^^^ expected u8, found i8
|
= note: expected raw pointer *const u8
found raw pointer *const i8

error: aborting due to previous error

For more information about this error, try rustc --explain E0308.
error: could not compile cgroups.

Log url: http://jenkins.katacontainers.io/job/kata-containers-2.0-ubuntu-ARM-PR/303/console

Expected result

Maybe libc::c_char should be used instead of i8.

Actual result

Compilation reports an error.

Further information

A similar issue: remacs/remacs#1393
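The portability point behind the `c_char` suggestion can be illustrated without calling into libc. In the sketch below, `c_string_len` is a hypothetical function that stands in for any C API taking `const char *`; the pointer is typed as `*const c_char`, which is `i8` on x86_64 Linux and `u8` on aarch64 Linux.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for passing a path to a C function: the pointer
/// is typed as `*const c_char`, so the same code compiles whether the
/// platform's `char` is signed or unsigned.
pub fn c_string_len(path: &CString) -> usize {
    // `CString::as_ptr` already returns `*const c_char`; no `as *const i8`
    // cast is needed.
    let ptr: *const c_char = path.as_ptr();
    // SAFETY: `ptr` comes from a live `CString`, so it is valid and
    // NUL-terminated for the duration of this call.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_bytes().len()
}
```

Hard-coding `*const i8` only matches platforms where `char` is signed, which is why the aarch64 build above fails.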

Include file path for I/O errors

If reading or writing a cgroup file fails, it would be better to include the file path in the error for debugging purposes.

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: unable to write to a control group file caused by: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }: unknown
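Attaching the path can be sketched with a small wrapper error. `PathError` and `write_control_file` are hypothetical names used for illustration, not the crate's types.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Illustrative error that records which file the failed operation touched.
#[derive(Debug)]
pub struct PathError {
    pub path: PathBuf,
    pub source: io::Error,
}

impl fmt::Display for PathError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.path.display(), self.source)
    }
}

impl std::error::Error for PathError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(&self.source)
    }
}

/// Write `value` to a control file, attaching the path to any failure.
pub fn write_control_file(path: &Path, value: &str) -> Result<(), PathError> {
    fs::write(path, value).map_err(|source| PathError {
        path: path.to_path_buf(),
        source,
    })
}
```

With this shape, a message like the one above would name the control file that rejected the write instead of only reporting "Invalid argument".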

Implement Drop (cg.delete() not removing cgroup)

Which feature do you think can be improved?

Can we implement Drop for the cgroup? I understand that this will be a major version change in the future, however this is something we are considering leveraging for Aurae.

How can it be improved?

Ensure that cg.delete() removes the cgroup from the system - OR - create a similar method with this behavior.

Additional Information

Pat yourself on the back for doing god's work and maintaining the cgroup crate in rust.

support adding process to subsystem subset optionally in cgroup v1

Is your feature request related to a problem? Please describe.
In some scenarios, we want to add Kata overhead processes and tasks only to certain subsystems in cgroup v1.
For example, if hugepages is applied to the VMM, we should put all threads into the sandbox resource controller, even though sandbox_cgroup_only is false. When the VMM starts up, it cannot ensure that only the vCPU thread uses the specified hugepages. So we should constrain the entire VMM process, not only the vCPU thread, in the hugetlb subsystem.

Describe the solution you'd like
To satisfy the requirement of moving a process and task to the specified cgroup subsystems while leaving it in another cgroup for the other subsystems, we should optionally add the process and task to a specified subset of subsystems.

read tasks hang when the cgroup dir removed after open

Describe the bug
read tasks hang when the cgroup dir removed after open

Expected behavior
should not hang

Additional context
current behavior

      /// Get the list of tasks that this controller has.
      fn tasks(&self) -> Vec<CgroupPid> {
          let mut file = "tasks";
          if self.is_v2() {
              file = "cgroup.procs";
          }
          self.open_path(file, false)
              .map(|file| {
                  let bf = BufReader::new(file);
                  let mut v = Vec::new();
                  for line in bf.lines().flatten() {
                      let n = line.trim().parse().unwrap_or(0u64);
                      v.push(n);
                  }
                  v.into_iter().map(CgroupPid::from).collect()
              })
              .unwrap_or_default()
      }

If the cgroup dir is removed after open_path, bf.lines() will return Some(Err(...)).
The loop for line in bf.lines().flatten() then hangs, because the reader keeps returning an error and flatten() skips each error without ever ending the loop.

Here is a simulation program. Running this demo always hangs:

use std::io::{BufRead, BufReader};

fn main() {
    let dir = "/sys/fs/cgroup/cpuset,cpu,cpuacct/line_test";
    let file_path = format!("{}/tasks", dir);

    // create dir
    std::fs::create_dir_all(dir).unwrap();

    let file = std::fs::File::open(&file_path).unwrap();

    std::fs::remove_dir(dir).unwrap();

    let bf = BufReader::new(file);
    for line in bf.lines().flatten() {
        println!("line: {}", line);
    }

    println!("end");
}
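
One possible way to avoid the hang, sketched under assumptions: stop at the first read error with map_while(Result::ok) instead of discarding errors with flatten(). FailingReader is a hypothetical test double that stands in for a tasks file whose cgroup was removed, and read_pids is an illustrative name, not the crate's actual fix.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical test double: serves `data` once, then fails on every
/// later read, the way a persistently failing file would.
struct FailingReader {
    data: Vec<u8>,
    served: bool,
}

impl Read for FailingReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.served {
            return Err(io::Error::new(io::ErrorKind::Other, "simulated read failure"));
        }
        self.served = true;
        let n = self.data.len().min(buf.len());
        buf[..n].copy_from_slice(&self.data[..n]);
        Ok(n)
    }
}

/// Collect PIDs line by line, ending the iteration at the first read error.
/// Unlike the original `tasks()`, unparsable lines are skipped here rather
/// than mapped to 0.
fn read_pids<R: Read>(reader: R) -> Vec<u64> {
    BufReader::new(reader)
        .lines()
        // map_while ends the iterator on the first Err; flatten() would
        // skip it and ask the reader again, forever.
        .map_while(Result::ok)
        .filter_map(|line| line.trim().parse::<u64>().ok())
        .collect()
}
```

With this shape, a reader that keeps failing yields the lines read so far and then returns, at the cost of silently dropping the error. Returning an io::Result from the function would be the alternative if callers need to see the failure.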

lint: Fix current clippy errors

error: the borrowed expression implements the required traits
   --> src/hierarchies.rs:179:30
    |
179 |         Cgroup::load(auto(), &parent_path)
    |                              ^^^^^^^^^^^^ help: change this to: `parent_path`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow
    = note: `-D clippy::needless-borrow` implied by `-D warnings`

error: the borrowed expression implements the required traits
   --> src/hierarchies.rs:261:30
    |
261 |         Cgroup::load(auto(), &parent_path)
    |                              ^^^^^^^^^^^^ help: change this to: `parent_path`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

The interface of set_cfs_quota takes a u64, but the unlimited value is represented as -1

Description of problem

According to https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html

A value of -1 for cpu.cfs_quota_us indicates that the group does not have any bandwidth restriction in place, such a group is described as an unconstrained bandwidth group. This represents the traditional work-conserving behavior for CFS.

and gives the default values as:

cpu.cfs_period_us=100ms
cpu.cfs_quota=-1

The value -1 cannot be passed using the existing interface.
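
One way the interface could express this, as a sketch only: a small enum that distinguishes "unlimited" from a concrete limit and knows how to render itself for cpu.cfs_quota_us. The type CfsQuota and its methods are hypothetical and not part of cgroups-rs. Treating every negative value as unlimited when parsing is an assumption of this sketch.

```rust
/// Hypothetical quota type: either no bandwidth limit or a limit in
/// microseconds per period.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CfsQuota {
    Unlimited,
    Micros(u64),
}

impl CfsQuota {
    /// The string that would be written to cpu.cfs_quota_us.
    fn to_file_value(self) -> String {
        match self {
            CfsQuota::Unlimited => "-1".to_string(),
            CfsQuota::Micros(us) => us.to_string(),
        }
    }

    /// Parse the content of cpu.cfs_quota_us.
    /// Assumption: any negative value means "unlimited".
    fn parse(s: &str) -> Option<CfsQuota> {
        let value: i64 = s.trim().parse().ok()?;
        if value < 0 {
            Some(CfsQuota::Unlimited)
        } else {
            Some(CfsQuota::Micros(value as u64))
        }
    }
}
```

A simpler alternative would be to change the parameter to i64, which matches the file format directly but lets callers pass meaningless negative values other than -1.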

Parsing of v2 blkio stat broken

Describe the bug

Parsing the stats of the blkio controller is broken. According to the "IO Interface Files" section of [1], the io.stat file has the format:

8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=35
8:0 rbytes=90430464 wbytes=299008000 rios=8950 wios=1252

but the parsing code in blk.rs expects a : instead of a =.

Expected behavior

Parse the correct values.

Additional context

The implementation could be easily optimised to avoid the String and Vec allocations by modifying get_value:

    fn get_value(s: &str) -> u64 {
        // Take the part after the first '=', or 0 if it is missing or not a number.
        s.split('=')
            .nth(1)
            .and_then(|s| s.parse::<u64>().ok())
            .unwrap_or_default()
    }

which is only used for the v2 parsing. The same applies to other places in the parsing code (at least in blkio).
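
As an illustration of the key=value format shown above, here is a sketch that parses one io.stat line without intermediate String allocations. The function name parse_io_stat_line and its return shape are hypothetical and do not reflect the crate's actual types. It assumes every value is an unsigned integer and rejects the line otherwise.

```rust
use std::collections::HashMap;

/// Hypothetical parser for one `io.stat` line such as
/// "8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=35".
/// Returns (major, minor, key -> value), or None if the line is malformed.
fn parse_io_stat_line(line: &str) -> Option<(u64, u64, HashMap<&str, u64>)> {
    let mut fields = line.split_whitespace();

    // The first field is the device number in MAJOR:MINOR form.
    let (major, minor) = fields.next()?.split_once(':')?;
    let major = major.parse::<u64>().ok()?;
    let minor = minor.parse::<u64>().ok()?;

    // The remaining fields are key=value pairs; keys borrow from `line`.
    let mut stats: HashMap<&str, u64> = HashMap::new();
    for field in fields {
        let (key, value) = field.split_once('=')?;
        stats.insert(key, value.parse::<u64>().ok()?);
    }
    Some((major, minor, stats))
}
```

Keeping the keys in a map rather than fixed struct fields means extra keys in a line are preserved instead of causing a parse failure.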

[1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt

Add optional serde support for statistics structs

Is your feature request related to a problem? Please describe.

Gathering statistics via cgroups is great. They cover hierarchies of processes without the need to scan a list of tasks, etc. The statistics contain a large number of values. Copying the values to a custom struct or merging them into a hashmap is manual work. Using the exact same structure and names as in the cgroup files has the advantage that the values are also documented.
The structs defined for e.g. memory or blkio do not derive Serde's Serialize and Deserialize, which would allow directly serialising stats generated with e.g. blkio::BlkIoController::blkio.

Describe the solution you'd like

Add an optional crate dependency serde with feature derive:

serde = { version = "1.0", features = ["derive"], optional = true }

Derive serde::Serialize and serde::Deserialize for the statistics structs if the feature serde is enabled, e.g.:

/// State of and statistics gathered by the kernel about the memory usage of the control group's
/// tasks.
#[derive(Debug)]
#[cfg_attr(feature="serde", derive(serde::Serialize, serde::Deserialize))]
pub struct Memory {
    /// How many times the limit has been hit.
    pub fail_cnt: u64,
    /// The limit in bytes of the memory usage of the control group's tasks.
    pub limit_in_bytes: i64,
    /// The current usage of memory by the control group's tasks.
    pub usage_in_bytes: u64,
...

See serde-rs/serde#1021.

Describe alternatives you've considered

Following the Serde guide "Deriving De/Serialize for type in a different crate".

Additional context

A PR should be almost non-invasive, because without the feature enabled there shouldn't be any change.
