frr/staticd
Donald Sharp da0f552f5d staticd: Fix crash because registering unknown vrf
With recent commit c1adc8f1d6, staticd has started to crash
approximately 1/10 of the time in the static_vrf topotest:

(gdb) bt
0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:44
1  __pthread_kill_internal (signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:78
2  __GI___pthread_kill (threadid=140400982256064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3  0x00007fb1a6442476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
4  0x00007fb1a6950823 in core_handler (signo=6, siginfo=0x7ffd6d832ff0, context=0x7ffd6d832ec0) at lib/sigevent.c:268
5  <signal handler called>
6  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:44
7  __pthread_kill_internal (signo=6, threadid=140400982256064) at ./nptl/pthread_kill.c:78
8  __GI___pthread_kill (threadid=140400982256064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
9  0x00007fb1a6442476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
10 0x00007fb1a64287f3 in __GI_abort () at ./stdlib/abort.c:79
11 0x00007fb1a699a422 in _zlog_assert_failed (xref=0x55f7dfd3dac0 <_xref.117>,
   extra=0x55f7dfd30c30 "BUG: NH %pFX registered but not in hashtable") at lib/zlog.c:789
12 0x000055f7dfd1201f in static_zebra_nht_register (nh=0x55f7fd2ecd80, reg=true) at staticd/static_zebra.c:333
13 0x000055f7dfd29c9d in static_install_nexthop (nh=0x55f7fd2ecd80) at staticd/static_routes.c:299
14 0x000055f7dfd2a126 in static_fixup_vrf (vrf=0x55f7fd2333a0, stable=0x55f7fd271030, afi=AFI_IP, safi=SAFI_UNICAST)
   at staticd/static_routes.c:441
15 0x000055f7dfd2a2be in static_fixup_vrf_ids (vrf=0x55f7fd2333a0) at staticd/static_routes.c:494
16 0x000055f7dfd15b53 in static_vrf_enable (vrf=0x55f7fd2333a0) at staticd/static_vrf.c:124
17 0x00007fb1a696ffa5 in vrf_enable (vrf=0x55f7fd2333a0) at lib/vrf.c:325
18 0x00007fb1a6991c87 in zclient_vrf_add (cmd=33, zclient=0x55f7fd29f740, length=76, vrf_id=8) at lib/zclient.c:2701
19 0x00007fb1a6996cba in zclient_read (thread=0x7ffd6d834230) at lib/zclient.c:4764
20 0x00007fb1a696bd9b in event_call (thread=0x7ffd6d834230) at lib/event.c:2019
21 0x00007fb1a68e1a3a in frr_run (master=0x55f7fd102e10) at lib/libfrr.c:1246
22 0x000055f7dfd1081e in main (argc=7, argv=0x7ffd6d834478, envp=0x7ffd6d8344b8) at staticd/static_main.c:193

Tracking this down, the crash happens because the nh believes that it is
already registered but the lookup fails, causing this assert.  Looking at
the code, static_fixup_vrf is changing the vrf_id.  I put a zlog_debug
right before the change of the nh vrf_id and noticed that the vrf id was
UNKNOWN.  So the code is attempting to register the nexthop into zebra
with an unknown vrf (which will be ignored).

Modify the code in the registration process to notice that the nh vrf_id
is still UNKNOWN and, as such, that nothing should be done.
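
The guard described above can be sketched roughly as follows. This is a
minimal standalone illustration, not the actual change to
static_zebra_nht_register: the struct, the function, the counter and the
SKETCH_VRF_UNKNOWN value are all hypothetical stand-ins, and the real code
tracks registrations in a hashtable rather than a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "unknown VRF" id; the value is an assumption. */
#define SKETCH_VRF_UNKNOWN UINT32_MAX

/* Hypothetical, simplified nexthop; not the real staticd nexthop struct. */
struct sketch_nexthop {
	uint32_t nh_vrf_id; /* VRF the nexthop resolves in */
	bool nh_registered; /* true once registered for tracking */
};

/* Counts the (un)register messages that would have been sent. */
static int sketch_msgs;

/*
 * Register (reg == true) or unregister (reg == false) a nexthop.
 * Returns true if a message would be sent, false if the call was skipped.
 */
static bool sketch_nht_register(struct sketch_nexthop *nh, bool reg)
{
	assert(nh != NULL);

	/* Guard: the VRF is still unknown, so there is nothing to do yet. */
	if (nh->nh_vrf_id == SKETCH_VRF_UNKNOWN)
		return false;

	/* The requested state already holds; avoid a duplicate message. */
	if (nh->nh_registered == reg)
		return false;

	nh->nh_registered = reg;
	sketch_msgs++;
	return true;
}
```

In this sketch a nexthop whose VRF is still unknown is left untouched, so
its registered flag cannot get out of sync with what was actually sent.
Once the VRF id is filled in, a later call registers it normally.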

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-22 18:13:15 -05:00
.gitignore staticd: Start the addition of a staticd 2018-07-29 12:37:24 -04:00
Makefile staticd: Start the addition of a staticd 2018-07-29 12:37:24 -04:00
static_bfd.c staticd: fix NB dependency hack 2024-02-02 00:57:59 +02:00
static_debug.c staticd: Add debug option for SRv6 2025-01-18 10:28:49 +00:00
static_debug.h staticd: Add debug option for SRv6 2025-01-18 10:28:49 +00:00
static_main.c staticd: Initialize/cleanup SRv6 2025-01-18 10:28:49 +00:00
static_nb.c staticd: fix botched staticd YANG for dst-src 2025-01-28 15:40:17 +01:00
static_nb.h staticd: fix botched staticd YANG for dst-src 2025-01-28 15:40:17 +01:00
static_nb_config.c staticd: fix botched staticd YANG for dst-src 2025-01-28 15:40:17 +01:00
static_nht.c staticd: fix NHT for dst-src routes 2025-01-28 15:40:17 +01:00
static_nht.h staticd: fix NHT for dst-src routes 2025-01-28 15:40:17 +01:00
static_routes.c staticd: Failed to register nexthop after networking restart 2025-02-14 12:12:11 -08:00
static_routes.h staticd: add a separate function for uninstalling nexthops 2024-02-04 22:28:10 +02:00
static_srv6.c staticd: Install SIDs when a dependent interface goes up/down 2025-01-18 10:28:49 +00:00
static_srv6.h staticd: Install SIDs when a dependent interface goes up/down 2025-01-18 10:28:49 +00:00
static_vrf.c staticd: fix botched staticd YANG for dst-src 2025-01-28 15:40:17 +01:00
static_vrf.h staticd: fix NB dependency hack 2024-02-02 00:57:59 +02:00
static_vty.c isisd, lib: add some codepoints usually shared with other vendors 2025-02-14 15:40:42 +01:00
static_vty.h mgmtd, staticd: output staticd configuration from mgmtd 2023-11-21 13:28:40 +02:00
static_zebra.c staticd: Fix crash because registering unknown vrf 2025-02-22 18:13:15 -05:00
static_zebra.h staticd: Request/Release SIDs to SID Manager 2025-01-18 10:28:49 +00:00
subdir.am staticd: Add infrastructure for SRv6 2025-01-18 10:28:49 +00:00