lablup / backend.ai-jail
A programmable security sandbox for Backend.AI kernels
License: GNU Lesser General Public License v3.0
This is causing problems with compiling kernels.
TensorFlow code often spawns many threads, and the jail reports "too many" threads even though the actual number of threads is within the configured limit.
Potential solutions:
Still, TensorFlow seems to increase the number of threads when we call regressors repeatedly.
We need to find a good solution for this.
NOTE:
Even the following code spawns more threads than the number of CPU cores allocated to the container:
import tensorflow as tf  # TensorFlow 1.x API
# Limit both thread pools to a single thread and expose one CPU device.
config = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1,
                        allow_soft_placement=True, device_count={'CPU': 1})
session = tf.Session(config=config)
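One way to compare the jail's count with what the kernel sees is to list the tasks of the process directly. The sketch below assumes a Linux /proc filesystem. The helper names count_os_threads and spawn_idle_threads are invented for illustration and are not part of the jail.

```python
import os
import threading


def count_os_threads(pid="self"):
    """Count kernel-visible threads by listing /proc/<pid>/task (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/task"))


def spawn_idle_threads(n):
    """Start n threads that block until the returned event is set."""
    stop = threading.Event()
    threads = [threading.Thread(target=stop.wait) for _ in range(n)]
    for t in threads:
        t.start()
    return stop, threads
```

Calling count_os_threads() before and after creating a session would show how many threads the library really started, independent of the jail's own bookkeeping.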
So that we don't have to bake the jail binaries and the base kernel images every time we update the policies.
The configuration format should be easy to write and understand, so JSON/YAML/TOML would be good choices. We need to figure out which is best supported in the golang ecosystem.
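As a language-neutral illustration of loading policies at run time, here is a minimal sketch using JSON. The keys allowed_syscalls and max_child_procs are hypothetical and do not reflect the jail's actual policy schema. The jail itself is written in Go, so this only shows the shape of the idea.

```python
import json
from dataclasses import dataclass, field

# Hypothetical policy keys, chosen only for this example.
KNOWN_KEYS = {"allowed_syscalls", "max_child_procs"}


@dataclass
class Policy:
    allowed_syscalls: set = field(default_factory=set)
    max_child_procs: int = 32


def load_policy(text):
    """Parse a JSON policy document and reject keys we do not understand."""
    raw = json.loads(text)
    unknown = set(raw) - KNOWN_KEYS
    if unknown:
        raise ValueError(f"unknown policy keys: {sorted(unknown)}")
    return Policy(
        allowed_syscalls=set(raw.get("allowed_syscalls", [])),
        max_child_procs=int(raw.get("max_child_procs", 32)),
    )
```

Rejecting unknown keys is a design choice in this sketch: a typo in a security policy fails loudly instead of being silently ignored.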
Keywords for required prior knowledge:
To make it configurable, we should implement:
A recent investigation of unexpected jail failures by @tlqaksqhr finally identified the root cause: the combination of the docker-default apparmor profile and our jail's seccomp+ptrace.
(Yes, I thought apparmor was deprecated, but it is still in use!)
References:
Since apparmor simplifies some parts of our jail policy implementation, such as path-based access controls, let's combine its advantages with our jail.
- policy.yml to apparmor profile? Or, could we do the reverse (importing the docker-default apparmor profile to the base policy.yml)?
- policy.yml when starting containers, and unload the profile when containers terminate. (one profile per container)
- apparmor=unconfined security options when starting containers in the agents.
It is difficult to have language-by-language stdin overrides.
Could we just override the read() syscall with stdin (file descriptor zero)?
Things to test/consider:
- read() with fd zero really works
- dup(), dup2()? Is there any language that uses them? (i.e., should we keep track of such syscalls and target/returned file descriptors?)
- read() overriding -- maybe we need to check isatty() for the stdin fd?
It is non-trivial to manage outbound security rules using IP addresses, as many external websites rely on load balancers and volatile IP addresses on top of clouds.
Let's build a DNS server that provides transparent access to whitelisted domains (e.g., github.com) from user kernel sessions but returns "unresolved" results for other domains.
This would not be perfect, but it would provide a good starting point.
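The filtering decision such a DNS server would make can be sketched as follows. This is only the matching logic under assumed rules (case-insensitive, subdomains of a listed domain are allowed). The function names and the upstream callback are hypothetical, and no real DNS protocol handling is shown.

```python
def normalize(name):
    """Lower-case a DNS name and drop the trailing root dot, if any."""
    return name.rstrip(".").lower()


def is_allowed(qname, allowed_domains):
    """True if qname equals a listed domain or is a subdomain of one."""
    q = normalize(qname)
    for domain in allowed_domains:
        d = normalize(domain)
        if q == d or q.endswith("." + d):
            return True
    return False


def answer(qname, allowed_domains, upstream):
    """Forward allowed names to upstream; return None ("unresolved") otherwise."""
    return upstream(qname) if is_allowed(qname, allowed_domains) else None
```

Matching on the "." boundary matters: a plain suffix test would let notgithub.com pass as github.com.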
ref) https://docs.docker.com/engine/release-notes/ (20.10 series)
- seccomp: Whitelist clock_adjtime. CAP_SYS_TIME is still required for time adjustment moby/moby#40929
- seccomp: Add openat2 and faccessat2 to default seccomp profile moby/moby#41353
- seccomp: allow ‘rseq’ syscall in default seccomp profile moby/moby#41158
- seccomp: allow syscall membarrier moby/moby#40731
- seccomp: whitelist io-uring related system calls moby/moby#39415
- Fix seccomp profile for clone syscall moby/moby#39308
The current debug mode (enabled via the -debug flag) dumps too much information.
We should have a "watch" mode that transparently allows all system calls but logs the ones that the currently designated policy would block. This will be useful for updating our filter sets when we encounter a new application that does not work with Sorna jail but works well without it.
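The decision logic of such a "watch" mode could look roughly like this. It is a sketch with invented names (decide, the jail.watch logger) and a policy reduced to a set of allowed syscall names, which is a simplification of the real filter sets.

```python
import logging

logger = logging.getLogger("jail.watch")


def decide(syscall, allowed_syscalls, watch=False):
    """Return True if the syscall may proceed.

    In watch mode nothing is blocked, but every call the policy
    would have rejected is logged.
    """
    if syscall in allowed_syscalls:
        return True
    if watch:
        logger.warning("watch: policy would block %s", syscall)
        return True
    return False
```

Running a failing application once in watch mode would then yield exactly the list of syscalls to review for the filter set, without the volume of the debug output.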
#29 did a basic port to ARM64, but Linux on different architectures (like x86_64 and ARM64) can execute different syscalls for the same code, and Jail needs to take that into consideration.
At least one such case is known. For the following Python code:
import os
os.access('/', os.F_OK)
x86_64 executes access but ARM64 executes faccessat. Currently, access is checked for path but faccessat is not.
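One way to treat both syscalls uniformly is to describe, per syscall, where the pathname and the optional directory fd sit in the argument list, and then resolve them to one absolute path before the policy check. The sketch below assumes Linux, where AT_FDCWD is -100. The PATH_ARGS table, the checked_path helper and the fd_dirs mapping are hypothetical and are not taken from the jail's source.

```python
import os

AT_FDCWD = -100  # Linux value: "resolve relative to the current directory"

# Hypothetical table: (index of pathname argument, index of dirfd argument or None)
PATH_ARGS = {
    "access": (0, None),
    "faccessat": (1, 0),
    "faccessat2": (1, 0),
}


def checked_path(syscall, args, cwd, fd_dirs):
    """Return the absolute path that a path-based policy should inspect.

    fd_dirs maps open directory fds of the traced process to their paths.
    """
    path_idx, dirfd_idx = PATH_ARGS[syscall]
    path = args[path_idx]
    if os.path.isabs(path):
        return os.path.normpath(path)
    base = cwd
    if dirfd_idx is not None and args[dirfd_idx] != AT_FDCWD:
        base = fd_dirs[args[dirfd_idx]]
    return os.path.normpath(os.path.join(base, path))
```

With such a table, access('/') on x86_64 and faccessat(AT_FDCWD, '/') on ARM64 both reach the same path check.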
During the Rust rewrite, the functionality of the exec.LookPath function seems to have gone missing.
For example,
# The command below panics with a "python3 not found" error message.
$ target/debug/backendai-jail python3
# The command below works as expected.
$ target/debug/backendai-jail /usr/bin/python3
I'm not sure if this is intentional, but I think it could be useful to include this feature.
We could call the which command directly, or it may be better to use the which crate.
It seems the "which" lookup could be inserted here: https://github.com/lablup/backend.ai-jail/blob/main/src/jail.rs#L767
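For reference, the lookup behaviour being asked for can be sketched in a few lines. This is a Python illustration of the general PATH search rule (names containing a slash are used as-is, others are searched in each PATH entry), not the Rust code that would go into jail.rs. The function name look_path is made up here.

```python
import os


def look_path(name, path_env=None):
    """Resolve a command name to an executable path, or return None."""

    def is_executable(p):
        return os.path.isfile(p) and os.access(p, os.X_OK)

    # A name with a slash is a path already: no PATH search.
    if "/" in name:
        return name if is_executable(name) else None
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    for directory in path_env.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if is_executable(candidate):
            return candidate
    return None
```

With this rule, both python3 and /usr/bin/python3 from the example above would resolve to an executable path before the jail tries to spawn it.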
It seems we are using inconsistent golang versions in Dockerfile.builder-musllinux, Dockerfile.builder-manylinux and Dockerfile.
Building the development container by following the README fails with the following error:
#6 1.656 src/github.com/seccomp/libseccomp-golang/seccomp_internal.go:698: cannot use req.data.arch (type C.__u32) as type C.uint32_t in argument to archFromNative
------
executor failed running [/bin/sh -c go get github.com/seccomp/libseccomp-golang && go get github.com/fatih/color && go get github.com/gobwas/glob && go get gopkg.in/yaml.v2]: exit code: 2
make: *** [prepare-dev] Error 1
Since Dockerfile.builder-musllinux and Dockerfile.builder-manylinux use golang 1.11 while Dockerfile uses golang 1.8, I think the golang version in Dockerfile should be updated to 1.11.
Currently, if we need to hook, say, ioctl request 42, all ioctl requests are trapped to user space, because we use seccomp's add_rule method, like add_rule(ioctl).
We can do better. If we use seccomp's add_rule_conditional method instead, like add_rule_conditional(ioctl, arg2 == 42), only ioctl request 42 is trapped to user space, because the comparison is done in kernel space. This may improve performance.
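The difference can be illustrated with a toy model that counts how many calls reach the user-space tracer under each kind of rule. This does not call seccomp at all: the two functions only mimic the matching behaviour described above, and their names are invented. Note that the ioctl request number is the second argument of ioctl(fd, request, ...), which is index 1 when arguments are counted from zero.

```python
def traps_with_plain_rule(calls, syscall):
    """Model of add_rule(syscall): every matching call is trapped."""
    return [c for c in calls if c[0] == syscall]


def traps_with_conditional_rule(calls, syscall, arg_index, value):
    """Model of add_rule_conditional(syscall, arg == value):
    the argument comparison happens before anything is trapped."""
    return [c for c in calls if c[0] == syscall and c[1][arg_index] == value]


# Each call is (syscall name, argument tuple).
calls = [
    ("ioctl", (3, 42)),
    ("ioctl", (3, 7)),
    ("read", (0,)),
    ("ioctl", (4, 42)),
]
```

In this sample trace the plain rule traps all three ioctl calls, while the conditional rule traps only the two with request 42. The saving in a real workload depends on how often the other requests occur.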
Write minimal test cases, at least.