Ssh – MaxStartups and MaxSessions configurations parameter for ssh connections

gnu-parallellinuxscpsshd

I am copying the files from machineB and machineC into machineA as I am running my below shell script on machineA.

If the files is not there in machineB then it should be there in machineC for sure so I will try copying the files from machineB first, if it is not there in machineB then I will try copying the same files from machineC.

I am copying the files in parallel using GNU Parallel library and it is working fine. Currently I am copying 10 files in parallel.

Below is my shell script which I have –

#!/bin/bash

export PRIMARY=/test01/primary
export SECONDARY=/test02/secondary
readonly FILERS_LOCATION=(machineB machineC)
export FILERS_LOCATION_1=${FILERS_LOCATION[0]}
export FILERS_LOCATION_2=${FILERS_LOCATION[1]}
PRIMARY_PARTITION=(550 274 2 546 278) # this will have more file numbers
SECONDARY_PARTITION=(1643 1103 1372 1096 1369 1568) # this will have more file numbers

export dir3=/testing/snapshot/20140103

find "$PRIMARY" -mindepth 1 -delete
find "$SECONDARY" -mindepth 1 -delete

do_Copy() {
  el=$1
  PRIMSEC=$2
  scp david@$FILERS_LOCATION_1:$dir3/new_weekly_2014_"$el"_200003_5.data $PRIMSEC/. || scp david@$FILERS_LOCATION_2:$dir3/new_weekly_2014_"$el"_200003_5.data $PRIMSEC/.
}
export -f do_Copy

parallel --retries 10 -j 10 do_Copy {} $PRIMARY ::: "${PRIMARY_PARTITION[@]}" &
parallel --retries 10 -j 10 do_Copy {} $SECONDARY ::: "${SECONDARY_PARTITION[@]}" &
wait    

echo "All files copied."

Problem Statement:-

With the above script at some point (not everytime) I am getting this exception –

ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host
ssh_exchange_identification: Connection closed by remote host

And I guess the error is typically caused by too many ssh/scp starting at the same time. That leads me to believe /etc/ssh/sshd_config:MaxStartups and MaxSessions is set too low.

But my question is on which server it is pretty low? machineB and machineC or machineA? And on what machines I need to increase the number?

On machineA this is what I can find and they all are commented out –

root@machineA:/home/david# grep MaxStartups /etc/ssh/sshd_config
#MaxStartups 10:30:60

root@machineA:/home/david# grep MaxSessions /etc/ssh/sshd_config

And on machineB and machineC this is what I can find –

[root@machineB ~]$ grep MaxStartups /etc/ssh/sshd_config
#MaxStartups 10

[root@machineB ~]$ grep MaxSessions /etc/ssh/sshd_config
#MaxSessions 10

Best Answer

If I understand this code correctly I believe this is your issue:

do_Copy() {
  el=$1
  PRIMSEC=$2
  scp david@$FILERS_LOCATION_1:$dir3/new_weekly_2014_"$el"_200003_5.data \
    $PRIMSEC/. || \
    scp david@$FILERS_LOCATION_2:$dir3/new_weekly_2014_"$el"_200003_5.data \
    $PRIMSEC/.
}
export -f do_Copy

parallel --retries 10 -j 10 do_Copy {} \
    $PRIMARY ::: "${PRIMARY_PARTITION[@]}" &
parallel --retries 10 -j 10 do_Copy {} \
    $SECONDARY ::: "${SECONDARY_PARTITION[@]}" &
wait    

You're running 20 scp's in parallel, but machines B & C can only handle 10 each:

#MaxStartups 10

I'd dial back those parallel lines to say 5 and see if that fixes your issue. If you want to increase the number of MaxStartups on machines B & C you could do that as well:

MaxStartups 15

And make sure to restart the sshd service on both B & C:

$ sudo service sshd restart

Confirming config file modifications

You can double check that they're working by running sshd in test mode via the -T switch.

$ sudo /usr/sbin/sshd -T | grep -i max
maxauthtries 6
maxsessions 10
clientalivecountmax 3
maxstartups 10:30:100
Related Question