Simplest possible secure sandboxing (limited resources needed)

apparmor cgroups sandbox

I'm working on a project that implements distributed simulations: arbitrary code is executed on multiple nodes and the results are later collected and aggregated.

Each node is an instance of an Ubuntu Linux virtual machine and runs a master process which takes care of forwarding the code to be executed to a number of worker processes (1 for each core).

This question is about how to make sure that each worker operates in a sandboxed environment without resorting to the use of a virtual machine instance for each of them. The exact requirements for the workers are:

fs: no write permission, read-only permission limited to a single directory (and sub-folders)
net: only local communications allowed (IPC, TCP, whatever…)
mem: cap on memory usage (no swap memory), kill if over mem limit
cpu: only 1 core allowed, kill if over time limit

No other limitations should be imposed: the worker should be able to load dynamic libraries (from the read-only folder), spawn new threads or processes, call system functions, etc., but the limits must be inherited by the spawned / loaded entities and should apply in a sum-wise way (for instance, a worker must not be able to spawn two threads that use 800MB each if the memory limit for that worker is 1GB).
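To make the "inherited by spawned entities" requirement concrete for the time limit, here is a minimal sketch of what the master could do, assuming a POSIX system and Python's standard library. The function name `run_with_time_limit` is hypothetical. Note that it measures wall-clock time rather than CPU time, and a child that moves itself into a new session would escape the kill, so it illustrates the requirement rather than solving it.

```python
import os
import signal
import subprocess


def run_with_time_limit(argv, seconds):
    """Run argv and kill its whole process group after `seconds` of wall-clock time.

    Returns the exit code, or a negative signal number if the worker was killed.
    """
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new process group whose id equals its pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Kill the worker and every descendant still in its process group.
        os.killpg(proc.pid, signal.SIGKILL)
        return proc.wait()
```

Called as `run_with_time_limit(["./worker", "job"], 60)` (the command is a placeholder), this returns the worker's exit status, or `-9` if the group had to be killed.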

It goes without saying that there should be no way for the worker to raise its rights.

I spent considerable time reviewing the available alternatives (SELinux, AppArmor, cgroups, ulimit, Linux namespaces, LXC, Docker, …) for the simplest solution that satisfies my requirements, but my experience in the field is limited.

Current understanding: LXC and Docker are a bit on the heavy side for my use case and are not completely secure [1]. AppArmor is preferable to SELinux because it is easier to configure, so I would use it for the fs and net restrictions; cgroups are preferable to ulimit (which operates on a single process), so I would use them for the mem and cpu restrictions.

Is this the simplest way to achieve my goal? Could I use AppArmor or cgroups exclusively? Is there some obvious security hole in my model? The guideline should be "worker allowed to bring down itself but nothing else".

Best Answer

Yes, you can use cgroups and SELinux/AppArmor exclusively to monitor and control the arbitrary code that you will execute.

With cgroups, you can do the following:

Limit CPU core usage to 1 CPU with the cpuset subsystem
Set memory usage limits with the memory subsystem, tracking even the forks. See https://github.com/gsauthof/cgmemtime for an example.
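Prevent network access to anything that isn't on lo with the net_prio subsystem. (net_prio itself only assigns per-interface traffic priorities, so actually blocking traffic will likely need firewall rules on top of it.)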
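As a rough sketch of the cpuset and memory points above, assuming a cgroup v1 hierarchy mounted at /sys/fs/cgroup and swap accounting enabled in the kernel: the function names, the group name and the choice of memory node 0 are all made up for illustration. The first function only computes which control files to write, so it can be inspected without root; the second one actually writes them.

```python
from pathlib import Path

# Assumption: cgroup v1 controllers are mounted under this directory.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def worker_cgroup_settings(name, core, mem_limit_bytes):
    """Return {control file: value} for one worker, in the order to write them."""
    cpuset = CGROUP_ROOT / "cpuset" / name
    memory = CGROUP_ROOT / "memory" / name
    return {
        # Pin the group to a single core; cpuset.mems must also be set
        # before any task can join the group (node 0 is an assumption).
        cpuset / "cpuset.cpus": str(core),
        cpuset / "cpuset.mems": "0",
        # Cap RAM, then cap RAM+swap at the same value so no swap is usable.
        # The memsw file only exists when swap accounting is enabled.
        memory / "memory.limit_in_bytes": str(mem_limit_bytes),
        memory / "memory.memsw.limit_in_bytes": str(mem_limit_bytes),
    }


def apply_settings(settings, pid):
    """Create the groups, write the limits and move pid into them (needs root)."""
    for path, value in settings.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(value)
    for group in {path.parent for path in settings}:
        # Children forked later by pid stay in the same group, so the
        # limits apply to the sum of all of them.
        (group / "cgroup.procs").write_text(str(pid))
```

For example, `apply_settings(worker_cgroup_settings("worker0", 2, 1 << 30), pid)` would confine a worker to core 2 with a 1 GiB cap. When the group goes over the memory limit, the kernel's OOM killer picks a process inside that group.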

And with SELinux/AppArmor, you can limit the process's read/write access.

Note: I am unfamiliar with AppArmor, but it is a Mandatory Access Control (MAC) system, meaning that guarding reads and writes is its job.
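To give an idea of what such a MAC configuration might look like, here is a small sketch that generates an AppArmor-style profile granting read-only access to a single directory tree. The profile name, the directory and the function name are placeholders, and the rule syntax should be checked against the apparmor.d man page before relying on it. It rests on the assumption that a confined process is denied anything the profile does not list, so write access is simply never granted.

```python
def worker_profile(profile_name, read_only_dir):
    """Build the text of a profile that allows only reads below one directory."""
    root = read_only_dir.rstrip("/")
    lines = [
        f"profile {profile_name} {{",
        # Allow listing the directory itself.
        f"  {root}/ r,",
        # Allow reading everything below it; 'm' lets shared libraries
        # from that tree be mapped as executable.
        f"  {root}/** mr,",
        "}",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(worker_profile("sim-worker", "/srv/sim/"))
```

The generated text would still have to be loaded (for instance with `apparmor_parser`) and attached to the worker process, and a real profile will probably need further rules for whatever the worker legitimately has to execute.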

Using these systems is a matter of writing the proper configurations. Of course, all this is much easier said than done. So here are a few reference links to get you started:

Good Luck!
