SSH multiplexing gotchas
OpenSSH (the book) says this about SSH multiplexing:
Multiplexing is the ability to send more than one signal over a single line or connection. In OpenSSH, multiplexing can re-use an existing outgoing TCP connection for multiple concurrent SSH sessions to a remote SSH server, avoiding the overhead of creating a new TCP connection and reauthenticating each time.
For example, maybe you want to use the Docker CLI on
jump-host.example.org
, against a Docker daemon on
docker-host.example.org
. You follow
the Docker docs' suggestion
for setting up Docker over SSH, including multiplexing:
Host docker-host.example.org
ControlMaster auto
ControlPath ~/.ssh/control-%C
ControlPersist yes
After using this setup for a while, you notice that SSHing into
docker-host
fails occasionally with an error message like
"Session open refused by peer". You Google and discover that, by
default, OpenSSH has a limit of 10 multiplexed sessions per TCP
connection. Beyond that, sshd
on
docker-host
starts rejecting sessions.
You resolve this by increasing the MaxSessions
setting in
/etc/ssh/sshd_config
on docker-host
, to 100,
1,000, or 2,147,483,647 (the max value for a 32-bit signed integer, to
disable this limit entirely).
However, even after bumping MaxSessions
and restarting
the sshd
service, you still see the error message.
sshd -T
shows that sshd_config
is valid and
you set MaxSessions
correctly. What's going on?
The problem is, restarting sshd
doesn't kill existing SSH
sessions. On docker-host
, those existing sessions are
supported by existing sshd
instances. And those existing
processes use the old configuration from sshd_config
...
including MaxSessions 10
.
Fixing the immediate problem is easy. Just delete
~/.ssh/control-...
from jump-host
. But how
to avoid this next time you update sshd_config
?
Part of the problem is ControlPersist yes
. This means
that the master SSH session between jump-host
and
docker-host
, the one that all the multiplexed sessions
run on, stays open until you close it manually. Even between
sshd
service restarts.
So you change ControlPersist to a time, e.g. 60m
. That
way, if the master session lives for more than 60 minutes, it'll close
as soon as all its multiplexed sessions also close. Which should
happen... at some point? Probably? Fingers crossed.