watchfrr: force kill daemons on restart

Today, watchfrr stops a misbehaving daemon through frrcommon, which sends
it a SIGINT (kill -2). A stuck daemon (for example, one suffering from
thread starvation) never handles the SIGINT, so watchfrr keeps trying to
stop it indefinitely.

frrcommon now sends a SIGINT and, if the daemon is still running after
60 seconds, follows up with a SIGKILL.
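
The escalation described above can be sketched as a standalone helper. This is only an illustration of the pattern, not the frrcommon code itself: the function name `stop_with_timeout` and its second argument are hypothetical, while the signal numbers, the 100 ms polling interval, and the default budget of 600 ticks (60 seconds) mirror the diff in this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of frrcommon): ask a process to stop with
# SIGINT, then escalate to SIGKILL if it is still alive after a grace period.
#   $1 = pid of the process to stop
#   $2 = grace period in tenths of a second (assumed default: 600, i.e. 60 s)
stop_with_timeout() {
	pid=$1
	cnt=${2:-600}

	# Polite request first; errors are ignored, the checks below decide.
	kill -2 "$pid" 2>/dev/null

	# Poll every 100 ms until the process exits or the budget runs out.
	while kill -0 "$pid" 2>/dev/null; do
		sleep .1
		[ $(( cnt -= 1 )) -gt 0 ] || break
	done

	# Still alive: SIGKILL cannot be caught or ignored.
	if kill -0 "$pid" 2>/dev/null; then
		kill -9 "$pid" 2>/dev/null
		sleep .1
	fi

	# Return success only if the process is really gone.
	! kill -0 "$pid" 2>/dev/null
}
```

Calling `stop_with_timeout "$pid"` would wait up to 60 seconds before escalating, while `stop_with_timeout "$pid" 50` would shorten the grace period to 5 seconds. Like the original, the sketch relies on `sleep` accepting fractional seconds, which POSIX does not guarantee.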

Signed-off-by: Tuetuopay <tuetuopay@me.com>
Alexis Bauvin 2024-10-18 11:36:08 +02:00
parent 2a90c80f49
commit 7cf4d10423

@@ -214,11 +214,17 @@ daemon_stop() {
 	debug "kill -2 $pid"
 	kill -2 "$pid"
-	cnt=1200
+	cnt=600
 	while kill -0 "$pid" 2>/dev/null; do
 		sleep .1
 		[ $(( cnt -= 1 )) -gt 0 ] || break
 	done
+	if kill -0 "$pid" 2>/dev/null; then
+		[ "$2" = "--quiet" ] || log_failure_msg "Failed to stop $dmninst, sending SIGKILL"
+		debug "kill -9 $pid"
+		kill -9 "$pid"
+		sleep .1
+	fi
 	if kill -0 "$pid" 2>/dev/null; then
 		[ "$2" = "--quiet" ] || log_failure_msg "Failed to stop $dmninst, pid $pid still running"
 		still_running=1