SSH – Limit CPU and Memory Resources in Session

cgroups ssh

I have a server with 4 GPUs/128 cores where I use a workload manager (slurm) to manage resource allocation between users for computation.

The issue is that the scheduler server (for job submission) runs on the same machine(1), where users connect through ssh to submit jobs.

I would like to restrict the resource allocation (cpu/mem) for all users only inside their ssh session, so that they will not run heavy computations outside the scheduling system (but they can inside jobs).

I know I can use cgroups to implement such a limit, e.g.:

cgcreate -g memory,cpu:cpulimited
cgset -r cpu.shares=100 cpulimited # relative weight: ~10% of the default 1024, enforced only under contention
cgset -r memory.limit_in_bytes=$((10*1024*1024*1024)) cpulimited # limit to 10 GiB
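The arithmetic behind these two values can be sketched as follows. This is only an illustration: the helper names gib_to_bytes and cpu_share_percent are hypothetical, and the share calculation assumes cgroup v1 semantics, where cpu.shares is a relative weight (default 1024) that only matters when sibling groups compete for the CPU.

```shell
#!/bin/sh
# Convert GiB to bytes, the same arithmetic as $((10*1024*1024*1024)).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Approximate CPU percentage a group receives under full contention:
# its own shares divided by the sum of the shares of all competing groups.
# $1: this group's shares; remaining arguments: shares of the other groups.
cpu_share_percent() {
    mine=$1
    total=$1
    shift
    for s in "$@"; do
        total=$(( total + s ))
    done
    # Integer division, so the result is rounded down.
    echo $(( 100 * mine / total ))
}

gib_to_bytes 10             # prints 10737418240
cpu_share_percent 100 1024  # prints 8 (100 / 1124, rounded down)
```

So with a single competing group at the default weight, cpu.shares=100 gives slightly less than 10% of the CPU, and nothing is capped while the machine is otherwise idle.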

How can I ensure that every ssh session runs in this cpulimited cgroup? At the same time, when Slurm launches a job, the job must remain governed by the limits set by the scheduler (for any user), not by the ssh limit.

(1) I need a workload manager because I have several users and things can get chaotic without one; however, I don't have another machine that could act as the scheduling server (as on a standard cluster).

Best Answer

I found a solution to my problem, so I am sharing it.

I defined a cgroup customssh/limit using the cgconfig service from libcgroup. Here is an extract of my cgconfig.conf file:

group customssh/limit {
  perm {
    admin {
      uid=root;
      gid=root;
    }
    task {
      uid=myuser;
      gid=mygroup;
      fperm=775;
    }
  }
  cpu {
    cpu.shares=50;
  }
  memory {
    memory.limit_in_bytes="10G";
    memory.memsw.limit_in_bytes="10G";
    memory.soft_limit_in_bytes="2G";
  }
  devices {
    devices.deny="c 195:* rwm";
  }
}
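  • fperm=775 allows members of the group mygroup to attach tasks to this cgroup
  • I enforce limits on CPU and memory usage, and deny access to the GPUs with devices.deny="c 195:* rwm"; (character devices with major number 195, which the NVIDIA driver uses for /dev/nvidia*)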
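Once cgconfig has loaded this file, the applied values can be read back from the cgroup filesystem. The following is a minimal sketch, assuming a cgroup v1 layout mounted under /sys/fs/cgroup with one directory per controller; the helper name cgroup_value is hypothetical.

```shell
#!/bin/sh
# Print the value of one control file of a cgroup (cgroup v1 layout).
# $1: controller (e.g. memory), $2: cgroup path, $3: control file name,
# $4: mount root, assumed to be /sys/fs/cgroup when omitted.
cgroup_value() {
    cat "${4:-/sys/fs/cgroup}/$1/$2/$3"
}

# Example (requires the cgroup to exist on the machine):
#   cgroup_value memory customssh/limit memory.limit_in_bytes
#   cgroup_value cpu customssh/limit cpu.shares
```

The memory limit is reported in bytes, so a "10G" setting should read back as a plain number rather than the string from the config file.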

I created the file /etc/profile.d/ssh.sh, which is sourced on every login, with the following contents:

# check whether the user is in an ssh session
# (POSIX test syntax, since /etc/profile.d may be sourced by shells other than bash)
if [ -n "$SSH_CONNECTION" ]; then
    # PID of the login shell
    SESSIONPID=$$
    # attach the shell to the `customssh/limit` cgroup
    cgclassify -g memory,cpu,devices:customssh/limit "$SESSIONPID"
    # all processes started from this shell inherit the cgroup
fi
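Two things may be worth checking around this script: that a Slurm job which happens to start a login shell is not moved into the limited cgroup, and that an ssh shell really ended up in it. The sketch below is not part of my setup; the function names are hypothetical, using SLURM_JOB_ID (which Slurm sets inside jobs) as the "inside a job" signal is an assumption, and the membership check assumes the cgroup v1 format of /proc/PID/cgroup.

```shell
#!/bin/sh
# Return 0 only for an ssh session that is not running inside a Slurm job.
should_limit_session() {
    [ -n "${SSH_CONNECTION:-}" ] && [ -z "${SLURM_JOB_ID:-}" ]
}

# Return 0 if a cgroup membership file lists the given cgroup.
# Each line has the form "hierarchy-ID:controllers:/path" (cgroup v1).
# $1: cgroup path without leading slash; $2: file, defaults to this shell's.
in_cgroup() {
    awk -F: -v want="/$1" '$3 == want { found = 1 } END { exit !found }' \
        "${2:-/proc/$$/cgroup}"
}

# Possible use in /etc/profile.d/ssh.sh:
#   if should_limit_session; then
#       cgclassify -g memory,cpu,devices:customssh/limit $$
#   fi
# And from an ssh shell, to confirm the attachment:
#   in_cgroup customssh/limit && echo "limited"
```

The extra SLURM_JOB_ID test only matters if the job environment carries SSH_CONNECTION over from the submitting shell and the job starts a login shell; otherwise the original condition is sufficient.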