I am facing a problem where I have a fleet of servers which contain a lot of data. Each host runs many instances of a specific process, p1, which makes several scp connections to other hosts in parallel to fetch the data it has to process. This puts a lot of load on those hosts, and they frequently go down.
I am looking for ways through which I can limit the number of concurrent scp processes that can be run on a single host.
Most of the links pointed me to the `MaxStartups` and `MaxSessions` settings in `/etc/ssh/sshd_config`, which have more to do with limiting the number of ssh sessions that can be initiated at any given point.
Is there a specific config file for scp which can be used here? Or is there a way at the system level to limit the number of instances of a specific process/command that can run concurrently?
Best Answer
`scp` itself has no such feature. With GNU `parallel` you can use the `sem` command (from "semaphore") to arbitrarily limit concurrent processes. For all processes started with the same `--id`, `sem --jobs 50` applies a limit of 50 concurrent instances; an attempt to start a 51st process will wait (indefinitely) until one of the other processes exits. Add `--fg` to keep the process in the foreground (the default is to run it in the background, but this doesn't behave quite the same as a shell background process).

Note that the state is stored in `${HOME}/.parallel/`, so this won't work quite as hoped if you have multiple users using `scp`; you may need a lower limit for each user. (It should also be possible to override the `HOME` environment variable when invoking `sem`, make sure `umask` permits group write, and modify the permissions so they share state. I have not tested this heavily though, YMMV.) `parallel` requires only `perl` and a few standard modules.

You might also consider using
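As a sketch, the `sem` approach might look like this in whatever launches each transfer (the id `scp-fetch`, host names, and paths are illustrative, not from the original answer):

```shell
# Each worker wraps its transfer in `sem`: at most 50 scp processes sharing
# the id "scp-fetch" will run on this host at any one time; the 51st waits.
sem --id scp-fetch --jobs 50 --fg scp source-host:/data/chunk-01 /local/work/

# Optionally, block until every transfer started under this id has finished:
sem --id scp-fetch --wait
```

Because the limit is keyed on `--id`, every p1 instance on the host shares the same semaphore without any coordination beyond using the same id.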
`scp -l N`, where N is a bandwidth limit in Kbit/s; you can also select a specific cipher (for speed, depending on your required security) or disable compression (especially if the data is already compressed) to further reduce CPU impact.

For
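As a sketch (the numbers, cipher choice, and paths here are illustrative, not recommendations):

```shell
# Cap the transfer at ~10 Mbit/s (-l takes Kbit/s), pick a cheap cipher,
# and disable compression for data that is already compressed.
scp -l 10000 -c aes128-ctr -o Compression=no \
    source-host:/data/archive.tar.gz /local/work/
```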
`scp`, ssh is effectively a pipe, and an `scp` instance runs on each end (the receiving end runs with the undocumented `-t` option). Regarding `MaxSessions`: this won't help, because "sessions" are multiplexed over a single SSH connection. Despite copious misinformation to the contrary, `MaxSessions` limits only the multiplexing of sessions per TCP connection, not any other limit.

The PAM module
`pam_limits` supports limiting concurrent logins, so if OpenSSH is built with PAM and `UsePAM yes` is present in `sshd_config`, you can set limits by username, group membership (and more). You can then set a hard `maxlogins` limit in `/etc/security/limits.conf`. However, this counts all logins per user, not just new logins, and not just those made via `ssh` or `scp`, so you might run into trouble unless you have a dedicated `scp` user id. Once enabled, it will also apply to interactive ssh sessions. One way around this is to copy or symlink the `sshd` binary, calling it `sshd-scp`; then you can use a separate PAM configuration file, i.e. `/etc/pam.d/sshd-scp` (OpenSSH calls `pam_start()` with the "service name" set to that of the binary it was invoked as). You'll need to run this on a separate port (or IP), and using a separate `sshd_config` is probably a good idea too. If you implement this, `scp` will fail (exit code 254) when the limit is reached, so you'll have to deal with that in your transfer process.

(Other options include
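Pulling those pieces together, a rough sketch of the `sshd-scp` setup might look like the following (all names, the port, the account `scpuser`, and the limit of 50 are illustrative; paths and PAM stack contents vary by distribution):

```shell
# Create a second sshd whose PAM "service name" becomes sshd-scp.
ln -s /usr/sbin/sshd /usr/sbin/sshd-scp
cp /etc/pam.d/sshd /etc/pam.d/sshd-scp   # separate PAM stack; make sure it
                                         # includes pam_limits.so under "session"

# In /etc/security/limits.conf, cap concurrent logins for a dedicated account:
#   scpuser  hard  maxlogins  50

# Run it on its own port with its own config file.
/usr/sbin/sshd-scp -f /etc/ssh/sshd_config-scp -p 2222
```

Transfers would then target port 2222 as `scpuser`, and your transfer process has to retry or back off when `scp` exits with 254.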
`ionice` and `cpulimit`; these may cause `scp` sessions to time out or hang for long periods, causing more problems.)

The old school way of doing something similar is to use
`atd` and `batch`, but that doesn't offer tuning of concurrency: it queues jobs and starts them when the load is below a specific threshold. A newer variation on that is Task Spooler, which supports queueing and running jobs in a more configurable sequential/parallel way, with runtime reconfiguration supported (e.g. changing queued jobs and concurrency settings), though it offers no load- or CPU-related control itself.
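With Task Spooler (the binary is `ts`, packaged as `tsp` on Debian-based systems), the queueing approach might look like this (the slot count, host, and paths are illustrative):

```shell
tsp -S 4                                          # at most 4 jobs run at once
tsp scp source-host:/data/chunk-01 /local/work/   # enqueue a transfer
tsp -l                                            # inspect queued/running/finished jobs
```

Unlike `sem`, the jobs are detached from the caller and survive in the queue, which may or may not suit how p1 consumes the data.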