scp itself has no such feature. With GNU parallel you can use the sem command (from semaphore) to arbitrarily limit concurrent processes:

sem --id scp -j 50 scp ...
For all processes started with the same --id, this applies a limit of 50 concurrent instances. An attempt to start a 51st process will wait (indefinitely) until one of the other processes exits. Add --fg to keep the process in the foreground (the default is to run it in the background, but that doesn't behave quite the same as a shell background process).
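In practice each transfer might be wrapped in a loop, with the whole batch waited on at the end; a sketch, where host:/dest/ is a placeholder:

for f in *.tar; do
    sem --id scp -j 50 scp "$f" host:/dest/
done
sem --id scp --wait     # block until all queued transfers finish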
Note that the state is stored in ${HOME}/.parallel/, so this won't work quite as hoped if you have multiple users running scp; you may need a lower limit for each user. (It should also be possible to override the HOME environment variable when invoking sem, make sure umask permits group write, and modify the permissions so the users share state. I have not tested this heavily though, YMMV.)
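A rough sketch of that shared-state idea (untested, as noted; the directory and group names are hypothetical):

$ sudo mkdir -p /var/lib/sem-share
$ sudo chgrp scpusers /var/lib/sem-share
$ sudo chmod 2770 /var/lib/sem-share
$ umask 007
$ HOME=/var/lib/sem-share sem --id scp -j 50 scp ...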
parallel requires only perl and a few standard modules.
You might also consider using scp -l N, where N is a bandwidth limit in Kbit/s; selecting a specific cipher (for speed, depending on your security requirements); or disabling compression (especially if the data is already compressed) to further reduce CPU impact.
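For example (aes128-ctr is just one fast choice; cipher availability varies by OpenSSH version, and host:/dest/ is a placeholder):

$ scp -l 8192 -c aes128-ctr -o Compression=no big.tar.gz host:/dest/

This caps the transfer at roughly 8 Mbit/s and leaves compression off.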
For scp, ssh is effectively a pipe, and an scp instance runs on each end (the receiving end runs with the undocumented -t option). Regarding MaxSessions: this won't help, since "sessions" are multiplexed over a single SSH connection. Despite copious misinformation to the contrary, MaxSessions limits only the multiplexing of sessions per TCP connection, nothing else.
The PAM module pam_limits supports limiting concurrent logins, so if OpenSSH is built with PAM and UsePAM yes is present in sshd_config, you can set limits by username, group membership (and more). You can then set a hard maxlogins in /etc/security/limits.conf to limit the logins. However, this counts all logins per user, not just new logins, not just ssh, and not just scp, so you might run into trouble unless you have a dedicated scp user ID. Once enabled, it will also apply to interactive ssh sessions. One way around this is to copy or symlink the sshd binary, calling it sshd-scp; then you can use a separate PAM configuration file, i.e. /etc/pam.d/sshd-scp (OpenSSH calls pam_start() with the "service name" set to the name of the binary it was invoked as). You'll need to run this on a separate port (or IP address), and using a separate sshd_config is probably a good idea too.
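A minimal sketch of the corresponding limits.conf entry, assuming a dedicated account named scpuser (the name is hypothetical):

# /etc/security/limits.conf
scpuser  hard  maxlogins  50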
If you implement this, then scp will fail (exit code 254) when the limit is reached, so you'll have to deal with that in your transfer process.
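A retry sketch under that assumption ($f is the file to send; host:/dest/ and the 30-second back-off are placeholders):

while true; do
    scp "$f" host:/dest/ && break
    rc=$?
    [ "$rc" -eq 254 ] || exit "$rc"   # a real failure, give up
    sleep 30                          # login limit reached, retry later
done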
(Other options include ionice and cpulimit; these may cause scp sessions to time out or hang for long periods, causing more problems.)
The old-school way of doing something similar is to use atd and batch, but that doesn't offer tuning of concurrency: it queues and starts processes when the load is below a specific threshold. A newer variation on that is Task Spooler, which supports queueing and running jobs in a more configurable sequential/parallel way, with runtime reconfiguration supported (e.g. changing queued jobs and concurrency settings), though it offers no load- or CPU-related control itself.
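A quick Task Spooler illustration (the binary is ts, or tsp on Debian-based systems; host:/dest/ is a placeholder):

$ ts -S 4                        # allow 4 concurrent jobs
$ ts scp bigfile host:/dest/     # queue a transfer
$ ts                             # list the queue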
On a CentOS 7 test system, via

$ sudo rpm -Uvh https://packages.microsoft.com/config/rhel/7/packages-microsoft-prod.rpm
$ sudo yum install dotnet-sdk-2.1

which results in dotnet-sdk-2.1-2.1.400-1.x86_64 being installed, then with the test code
using System;
using System.Diagnostics;
using System.ComponentModel;

namespace myApp {
    class Program {
        static void Main(string[] args) {
            var process = new Process();
            process.EnableRaisingEvents = true; // to avoid [defunct] sh processes
            process.StartInfo.FileName = "/var/tmp/foo";
            process.StartInfo.Arguments = "";
            process.StartInfo.UseShellExecute = true;
            process.StartInfo.CreateNoWindow = true;
            process.Start();
            process.WaitForExit(10000);
            if (process.HasExited) {
                Console.WriteLine("Exit code: " + process.ExitCode);
            } else {
                Console.WriteLine("Child process still running after 10 seconds");
            }
        }
    }
}
and a shell script as /var/tmp/foo, strace stalls out and shows that /var/tmp/foo is run through xdg-open, which on my system does... I'm not sure what; it seems a needless complication.
$ strace -o foo -f dotnet run
Child process still running after 10 seconds
^C
$ grep /var/tmp/foo foo
25907 execve("/usr/bin/xdg-open", ["/usr/bin/xdg-open", "/var/tmp/foo"], [/* 37 vars */] <unfinished ...>
...
A simpler solution is to exec a program directly (which can in turn be a shell script that does what you want); for .NET this requires not using the shell:

process.StartInfo.UseShellExecute = false;
With this set, the strace shows that /var/tmp/foo is being run via a (much simpler) execve(2) call:
26268 stat("/var/tmp/foo", {st_mode=S_IFREG|0755, st_size=37, ...}) = 0
26268 access("/var/tmp/foo", X_OK) = 0
26275 execve("/var/tmp/foo", ["/var/tmp/foo"], [/* 37 vars */] <unfinished ...>
and that .NET refuses to exit:
$ strace -o foo -f dotnet run
Child process still running after 10 seconds
^C^C^C^C^C^C^C^C
because foo replaces itself with something that ignores most signals (though notably not USR2; and there is always KILL, but avoid using that!):
$ cat /var/tmp/foo
#!/bin/sh
exec /var/tmp/stayin-alive
$ cat /var/tmp/stayin-alive
#!/usr/bin/perl
use Sys::Syslog;

# catch (and merely log) the common termination signals
for my $s (qw(HUP INT QUIT PIPE ALRM TERM CHLD USR1)) {
    $SIG{$s} = \&shandle;
}
openlog( 'stayin-alive', 'ndelay,pid', LOG_USER );
while (1) {
    syslog LOG_NOTICE, "oh oh oh oh oh stayin alive";
    sleep 7;
}

sub shandle {
    syslog LOG_NOTICE, "nice try - @_";
}
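So something like the following (a sketch; pkill is assumed available) would still terminate it, since USR2 is not in the handled list:

$ pkill -USR2 -f stayin-alive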
daemonize

With a process that disassociates itself from the parent, and a shell script that runs a few commands (hopefully equivalent to your intended apt-get update; apt-get upgrade):
$ cat /var/tmp/a-few-things
#!/bin/sh
sleep 17 ; echo a >/var/tmp/output ; echo b >>/var/tmp/output
we can modify the .NET program to run /var/tmp/solitary /var/tmp/a-few-things:
process.StartInfo.FileName = "/var/tmp/solitary";
process.StartInfo.Arguments = "/var/tmp/a-few-things";
process.StartInfo.UseShellExecute = false;
which when run causes the .NET program to exit fairly quickly:

$ dotnet run
Exit code: 0

and, eventually, the /var/tmp/output file does contain the two lines written by a process that was not killed when the .NET program went away.
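If you do not have a purpose-built wrapper like solitary to hand, setsid(1) can approximate the disassociation (a sketch; the redirections keep the child off the parent's file descriptors):

$ setsid /var/tmp/a-few-things </dev/null >/dev/null 2>&1 &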
You should probably save the output from the APT commands somewhere, and may also need something so that two (or more!) updates do not try to run at the same time (one option is sketched after the script below), etc. This version does not stop for questions and ignores any TERM signal (INT may also need to be ignored).
#!/bin/sh
# keep going even if the parent session is torn down
trap '' TERM
set -e
apt-get --yes update
apt-get --yes upgrade
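For the overlapping-updates concern, flock(1) is one option (a sketch; the lock-file path is arbitrary):

$ flock -n /var/tmp/apt-upgrade.lock /var/tmp/a-few-things || echo "another update is already running"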
If your grep does not support -w:

or

or (clobbering the positional parameters)

Note that this is not atomic, so the count may be wrong if some processes die and others get spawned in the short span of time it takes to gather the data.
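For illustration, the positional-parameter trick could look something like this (a sketch, assuming the goal is to count running scp processes; pgrep -x gives the exact-name match that grep -w would have provided):

set -- $(pgrep -x scp)
echo "$#"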