Commit graph

38382 commits

Author SHA1 Message Date
Christian Hopps 3d4dee5f0c tests: add another directory to search path for pylint
Some IDEs (e.g., emacs+lsp) run pylint from the root directory and so
we need to add `tests/topotests` so that `lib` and `munet` are found
by pylint when used in imports

Signed-off-by: Christian Hopps <chopps@labn.net>
2025-03-24 05:10:36 +00:00
Donatas Abraitis c288e5fbaf
Merge pull request #18399 from LabNConsulting/chopps/fix-unit-tests
2 unit-test fixes
2025-03-16 15:14:55 +01:00
Donatas Abraitis f2245941d8
Merge pull request #18384 from LabNConsulting/chopps/suppress-expected-libyang-error-log
lib: suppress libyang logs during expected error result
2025-03-16 15:12:59 +01:00
Donatas Abraitis f5a74fc91c
Merge pull request #18387 from Manpreet-k0/redo_import_check_crash
bgpd: Fixed crash upon bgp network import-check command
2025-03-15 18:35:04 +01:00
Donatas Abraitis 35cc716363
Merge pull request #18394 from donaldsharp/fpm_listener_output
zebra: add ability to specify output file with fpm_listener
2025-03-15 18:32:19 +01:00
Donatas Abraitis 0647a115cc
Merge pull request #18395 from donaldsharp/bgp_stream_copy_usage_removal
bgpd: Remove unnecessary stream_new/stream_copies in bgp_open_make
2025-03-15 18:31:41 +01:00
Donatas Abraitis 7444121f79
Merge pull request #18393 from LabNConsulting/aceelindem/ospf6-area-no-config-delete-stale
ospf6d: Disable and delete OSPFv3 areas that no longer have interfaces or configuration.
2025-03-15 14:15:58 +01:00
Christian Hopps bc3f7d9c07 tests: fix wrong callback function parameters in unit-test
Signed-off-by: Christian Hopps <chopps@labn.net>
2025-03-15 04:12:31 +00:00
Christian Hopps 986501029e tests: deal with configure overridden timestamp prec in unit test
Previously if you configured a different timestamp precision then
`make check` would fail as the non-default config is generated and
fails test_cli config file comparison.

Signed-off-by: Christian Hopps <chopps@labn.net>
2025-03-15 04:11:48 +00:00
Donald Sharp c9655e2893 bgpd: Remove unnecessary stream_new/stream_copies in bgp_open_make
The call into bgp_open_capability can return that it wrote more
than BGP_OPEN_NON_EXT_OPT_LEN bytes, in that case the open
part needs to be written again with ext_opt_params set to
true to allow extended parameters to be written thus keeping
the len < 255 bytes.  The code to do this was first creating
a new stream and then copying into it the stream, trying
to call bgp_open_capability() and if it succeeded recopying
the tmp stream back onto the original.

Let's change this around such that we save the current spot
in the stream of where we are writing and if the change does
not work reset the pointer and try again with the correct
parameter.  This removes the stream and multiple copies and
eventual free of the temporary stream.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-14 14:50:59 -04:00
Donald Sharp f0b2bc3b4c zebra: add ability to specify output file with fpm_listener
The fpm_listener didn't have the ability to specify the output
file location at all.  Modify the code to accept this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-14 13:24:19 -04:00
Acee Lindem 04994891fe ospf6d: Disable and delete OSPFv3 areas that no longer have interfaces or configuration.
This fix will delete an OSPFv3 area when all the interfaces and
        configuration (ranges, NSSA ranges, stub area, NSSA area, filter-list,
        import-list and export-list) have been removed. The changes provides
        a general solution to https://github.com/FRRouting/frr/issues/18324.

Signed-off-by: Acee Lindem <acee@lindem.com>
2025-03-14 16:02:28 +00:00
Jafar Al-Gharaibeh 7945af0200
Merge pull request #18360 from raja-rajasekar/rajasekarr/fix_explicit_sid_allocation
zebra: ensure proper return for failure for Sid allocation
2025-03-14 09:57:41 -05:00
Christian Hopps 4663c3ef82 lib: suppress libyang logs during expected error result.
Signed-off-by: Christian Hopps <chopps@labn.net>
2025-03-14 14:36:13 +00:00
Christian Hopps 48ceff2128 lib: northbound: also log debugs for new get callback
Signed-off-by: Christian Hopps <chopps@labn.net>
2025-03-14 14:36:13 +00:00
Donald Sharp dbecefb6c6
Merge pull request #18383 from LabNConsulting/chopps/fix-oper-state-list-query-bug
Fix bug with oper-state queries including list node
2025-03-14 10:33:16 -04:00
Donald Sharp e5848deedf
Merge pull request #18377 from kaffarell/master
isisd: fix bit flag collision in options field
2025-03-14 09:43:09 -04:00
Donald Sharp e7318ce845
Merge pull request #18388 from y-bharath14/srib-yang-v5
yang: Fixed pyang errors at frr-bgp-common.yang
2025-03-14 09:41:39 -04:00
Manpreet Kaur bc1008b970 bgpd: Fixed crash upon bgp network import-check command
BT:
```
3  <signal handler called>
4  0x00005616837546fc in bgp_static_update (bgp=bgp@entry=0x5616865eac50, p=0x561686639e40,
    bgp_static=0x561686639f50, afi=afi@entry=AFI_IP6, safi=safi@entry=SAFI_UNICAST) at ../bgpd/bgp_route.c:7232
5  0x0000561683754ad0 in bgp_static_add (bgp=0x5616865eac50) at ../bgpd/bgp_table.h:413
6  0x0000561683785e2e in no_bgp_network_import_check (self=<optimized out>, vty=0x5616865e04c0,
    argc=<optimized out>, argv=<optimized out>) at ../bgpd/bgp_vty.c:4609
7  0x00007fdbcc294820 in cmd_execute_command_real (vline=vline@entry=0x561686663000,
```

The program encountered a SEG FAULT when attempting to access pi->extra->vrfleak->bgp_orig because
pi->extra->vrfleak was NULL.
```
(gdb) p pi->extra->vrfleak
$1 = (struct bgp_path_info_extra_vrfleak *) 0x0
(gdb) p pi->extra->vrfleak->bgp_orig
Cannot access memory at address 0x8
```
Added NOT NULL check on pi->extra->vrfleak before accessing pi->extra->vrfleak->bgp_orig
to prevent the segmentation fault.

Signed-off-by: Manpreet Kaur <manpreetk@nvidia.com>
2025-03-14 05:40:16 -07:00
Y Bharath d7839f5ddd yang: Fixed pyang errors at frr-bgp-common.yang
Fixed pyang errors at frr-bgp-common.yang

Signed-off-by: y-bharath14 <y.bharath@samsung.com>
2025-03-14 14:31:52 +05:30
Christian Hopps 2586c4c3ed lib: make sure we update the darr_strlen from pruned string.
This fixes a bug when handling of queries which include list nodes in
the xpath.

Signed-off-by: Christian Hopps <chopps@labn.net>
2025-03-14 08:37:46 +00:00
Christian Hopps d58a8f473b lib: add darr_strlen_fixup() to update len based on NUL term
Signed-off-by: Christian Hopps <chopps@labn.net>
2025-03-14 08:37:46 +00:00
Donatas Abraitis 8982a81de1
Merge pull request #18380 from donaldsharp/non_peer_group
bgpd: Show bgp <afi> <safi> shouldn't display peers in groups
2025-03-14 07:30:52 +01:00
Donald Sharp 3fb72a03c3 bgpd: Show bgp <afi> <safi> shouldn't display peers in groups
The command `show bgp <afi> <safi>` has this output:

r1# show bgp ipv4 uni 10.0.0.0
BGP routing table entry for 10.0.0.0/32, version 1
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  r1-eth0 r1-eth1 r1-eth2 r1-eth3
  ....

It specifically states `Advertised to non peer-group peers:` yet
the code is not filtering those out.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-13 15:19:02 -04:00
Gabriel Goller 86faed512f isisd: fix bit flag collision in options field
Resolve conflict between F_ISIS_UNIT_TEST and ISIS_OPT_DUMMY_AS_LOOPBACK
which were both using the same bit value (0x01). This collision caused
unit test mode to be unintentionally enabled when DUMMY_AS_LOOPBACK was set.

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
2025-03-13 16:51:04 +01:00
Donatas Abraitis 2ab8cce2e1
Merge pull request #18366 from pguibert6WIND/community_limit_zero_add_test
Add Testing for community and Extended community match limit zero
2025-03-12 21:47:17 +01:00
Jafar Al-Gharaibeh ddf483c65c
Merge pull request #18367 from donaldsharp/static_nexthop_reinstall
staticd: Install known nexthops upon connection with zebra
2025-03-12 10:30:13 -05:00
Jafar Al-Gharaibeh e97b20576e
Merge pull request #18368 from donaldsharp/backout_bgp_if_up_down_changes
Revert "bgpd: upon if event, evaluate bnc with matching nexthop"
2025-03-12 10:29:11 -05:00
Donald Sharp 052aea624e Revert "bgpd: upon if event, evaluate bnc with matching nexthop"
This reverts commit 58592be577.

This commit is being reverted because of several issues:

a) tcpdump -i <any interface that bgp happens to use>

This command causes bgp to dump it's entire table to all
of it's peers again.  This is a huge problem in any type
of scaled environment *and* it is not unusual to have an
operator do this.

b) This commit appears to be attempting to solve the problem
with route leaking across vrf's using labels( or somesuch ).
Unfortunately we have absolutely no topotests that show the
behavior.  I am also unable to get any type of how to reproduce
the problem being solved by the commit.  I do know, though,
that the problem really stems from the fact that bgp has
decided to cheat and not create bnc's for route leaking.
Thus when a nexthop changes, bgp is not being notified.
This commit was being used as a hammer to solve the problem.

While I do agree backing out a bug fix for some operator
is less then ideal, I believe that since I cannot get the
operator to tell me the problem it solved and the fact
that sending large amounts of updates with just a simple
tcpdump command ( actually 2 one for tcpdump start and
one for finishing ) is more detrimental in my eyes at
this point in time.  Additionally the solution used
is the wrong one for the problem.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-12 09:01:52 -04:00
Donald Sharp 918a1f85c2 staticd: Install known nexthops upon connection with zebra
CI tests are showing cases where staticd is connecting to
zebra after config is read in and the nexthops are never
being registered w/ zebra:

2025/03/11 15:39:44 STATIC: [T83RR-8SM5G] staticd 10.4-dev starting: vty@2616
2025/03/11 15:39:45 STATIC: [GH3PB-C7X4Y] Static Route to 13.13.13.13/32 not installed currently because dependent config not fully available
2025/03/11 15:39:45 STATIC: [RHJK1-M5FAR] static_zebra_nht_register: Failure to send nexthop 1.1.1.2/32 for 11.11.11.11/32 to zebra
2025/03/11 15:39:45 STATIC: [M7Q4P-46WDR] vty[14]@> enable

Zebra shows connection time as:

2025/03/11 15:39:45.933343 ZEBRA: [V98V0-MTWPF] client 5 says hello and bids fair to announce only static routes vrf=0

As a result staticd never installs the route because it has no nexthop
tracking to say that the route could be installed.

Modify staticd on startup to go through it's nexthops and dump them to
zebra to allow the staticd state machine to get to work.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-12 08:30:43 -04:00
Mark Stapp 27953dd141
Merge pull request #18336 from routingrocks/rvaratharaj/bugfixmar
zebra: Fix neigh delete causing heap-use-after-free error
2025-03-12 08:09:29 -04:00
Rajesh Varatharaj 3060afc84d zebra: Fix neigh delete causing heap-use-after-free error
Issue:
Not freeing the neighbor n  within the same function can lead to
memory leak.
zebra_neigh_del_all() -> zebra_neigh_del() re lookup and free

Fix: not accessing n after its freed.
Directly free the neighbor entry (n) when its interface index matches
ifp->ifindex.

This fixes:
ERROR: AddressSanitizer: heap-use-after-free on address 0x6070001052e8 at pc 0x7f6bf7d09ddb bp 0x7ffd3366a000 sp 0x7ffd33669ff0
READ of size 8 at 0x6070001052e8 thread T0
    #0 0x7f6bf7d09dda in _rb_next lib/openbsd-tree.c:455
    #1 0x55f95a307261 in zebra_neigh_rb_head_RB_NEXT zebra/zebra_neigh.h:34
    #2 0x55f95a3082e9 in zebra_neigh_del_all zebra/zebra_neigh.c:162
    #3 0x55f95a121ee7 in zebra_interface_down_update zebra/redistribute.c:571
    #4 0x55f95a0f819d in if_down zebra/interface.c:1017
    #5 0x55f95a0fe168 in zebra_if_dplane_ifp_handling zebra/interface.c:2102
    #6 0x55f95a0ff10c in zebra_if_dplane_result zebra/interface.c:2241
    #7 0x55f95a27ce9c in rib_process_dplane_results zebra/zebra_rib.c:5015
    #8 0x7f6bf7da3ad9 in event_call lib/event.c:1984
    #9 0x7f6bf7c62141 in frr_run lib/libfrr.c:1246
    #10 0x55f95a11ca7f in main zebra/main.c:543
    #11 0x7f6bf7029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #12 0x7f6bf7029e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #13 0x55f95a0dd0b4 in _start (/usr/lib/frr/zebra+0x1a80b4)

Ticket: #18047

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
2025-03-11 13:41:40 -07:00
Mark Stapp d0cb3ad7cb
Merge pull request #16614 from louis-6wind/fix-otable-heap-after-free
zebra: fix table heap-after-free crash
2025-03-11 14:03:14 -04:00
Philippe Guibert ec703b0e08 topotests: check when extended community-limit is set to 0
Add a test to control that when the extended community limit is set
to 0, then only 2 BGP updates are received on R3.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-03-11 17:33:49 +01:00
Philippe Guibert c3038acf3c topotests: check when community-limit is set to 0
Add a test to control that when the community limit is set to 0, then
only 2 BGP updates are received on R3.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-03-11 17:31:22 +01:00
Russ White b1711c010f
Merge pull request #18362 from y-bharath14/srib-tests-v4
tests: Fixed NameError at bmpserver.py
2025-03-11 11:26:12 -04:00
Russ White aa0841e431
Merge pull request #18346 from donaldsharp/memory_leaks_bgp
Clean up some code and bad assumptions in zebra
2025-03-11 10:35:14 -04:00
Russ White 27a504fcef
Merge pull request #18342 from anlancs/ospfd-minor-change
ospfd: minor change for style
2025-03-11 10:33:41 -04:00
Y Bharath 2c17648015 tests: Fixed NameError at bmpserver.py
Fixed NameError at bmpserver.py

Signed-off-by: y-bharath14 <y.bharath@samsung.com>
2025-03-11 11:33:33 +05:30
Rajasekar Raja 5a63cf4c0d zebra: ensure proper return for failure for Sid allocation
The functions alloc_srv6_sid_func_explicit/dynamic expect to return bool
but we have places where we return a -1 or NULL which the caller is
assuming as a True/Valid and ending up allocating Sid

Without Fix:
2025/03/10 21:44:04.295350 ZEBRA: [XWV20-TGK70] alloc_srv6_sid_func_explicit: trying to allocate explicit SID function 65088 from block fcbb:bbbb::/32
2025/03/10 21:44:04.295351 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 10000 e 20000 wlib s 1000 ewlib s 30000 e 1000 SID_FUNC 65088
2025/03/10 21:44:04.295352 ZEBRA: [QGHMB-SWNFW] alloc_srv6_sid_func_explicit: function 65088 is outside ELIB [10000/20000] and EWLIB alloc ranges [30000/1000]
2025/03/10 21:44:04.295367 ZEBRA: [H0GZA-NNSWJ] get_srv6_sid_explicit: allocated explicit SRv6 SID fcbb:bbbb:1:fe40:: for context End.X nh6 2001::2
2025/03/10 21:44:04.295368 ZEBRA: [XBBYD-T1Q7P] srv6_manager_get_sid_internal: got new SRv6 SID for ctx End.X nh6 2001::2: sid_value=fcbb:bbbb:1:fe40:: (func=65088) (proto=4, instance=0, sessionId=0), notifying all clients

With Fix:
2025/03/10 22:04:25.052235 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 30000 e 31000 wlib s 31000 ewlib s 30000 e 31000 SID_FUNC 65056
2025/03/10 22:04:25.052236 ZEBRA: [YHMRC-EMYNX] alloc_srv6_sid_func_explicit: function 65056 is outside ELIB [30000/31000] and EWLIB alloc ranges [30000/31000]
2025/03/10 22:04:25.052254 ZEBRA: [XSG8X-Q2XJX] get_srv6_sid_explicit: invalid SM request arguments: failed to allocate SID function 65056 from block fcbb:bbbb::/32
2025/03/10 22:04:25.052257 ZEBRA: [YC52T-427SJ] srv6_manager_get_sid_internal: not got SRv6 SID for ctx End.DT6 vrf_id 4, sid_value=fcbb:bbbb:1:fe20::, locator_name=MAIN
root@rajasekarr:/tmp/topotests/static_srv6_sids.test_static_srv6_sids/r1#

Ticket :#
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2025-03-10 15:26:38 -07:00
Louis Scalbert c50fe21045 tests: zebra_rib, test vrf change
Test table ID move to a VRF and the removal of the VRF.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert c6afe42455 lib, tests, zebra: keep table routes at vrf disabling
At VRF disabling, keep the route entries that was associated to its
table ID but not to the VRF itself. Kernel flushes these entries so we
need to reinstall them.

To do so, add a flag to mean that a route entry is owned by a table ID
and not by a VRF. If the VRF associated to the table ID is deleted, the
route entry must not be deleted.

Update to tests with new flag. 2057 is in hexa 0x809, meaning that the
new flag has been to some prefix.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert 97c159e882 tests: zebra_rib, test vrf change
Test table ID move to a VRF and the removal of the VRF.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert 52a35e9592 zebra: fix vanished blackhole route
Fix vanished blackhole route when kernel routes are updated.

> root@router# echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> root@router# ip l add du0 type dummy
> root@router# ifconfig du0 192.168.0.1/24 up
> root@router# ip route add blackhole default table 100
> root@router# ip route show table 100
> blackhole default
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:05
> root@router# ip l add red type vrf table 100
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:16
> root@router# ip l set du0 master red
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> C>* 192.168.0.0/24 is directly connected, du0, weight 1, 00:00:02
> L>* 192.168.0.1/32 is directly connected, du0, weight 1, 00:00:02
> root@router# ip route show table 100
> blackhole default
> 192.168.0.0/24 dev du0 proto kernel scope link src 192.168.0.1
> local 192.168.0.1 dev du0 proto kernel scope host src 192.168.0.1
> broadcast 192.168.0.255 dev du0 proto kernel scope link src 192.168.0.1

Fixes: d528c02a20 ("zebra: Handle kernel routes appropriately")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert 5cde97678e zebra: fix removed default route at vrf enabling
When a routing table (RT) already has a default route before being
assigned to a VRF, the default route vanishes in zebra after the VRF
assignment.

> root@router:~# ip route add blackhole default table 100
> root@router:~# ip route show table 100
> blackhole default
> root@router:~# vtysh -c 'show ip route table 100'
> [...]
> VRF default table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), 00:00:05
> root@router:~# ip l add red type vrf table 100
> root@router:~# vtysh -c 'show ip route table 100'
> root@router:~#

Do not override the default route if it exists.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert fb8bf9cf59 zebra: remove vrf route entries at vrf disabling
This is the continuation of the previous commit.

When a VRF is deleted, the kernel retains only its own routing entries
in the former VRF table and removes all others.

This change ensures that routing entries created by FRR daemons are also
removed from the former zebra VRF table when the VRF is disabled.

To test:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip l add du0 type dummy
> ifconfig du0 192.168.0.1/24 up
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l set du0 master red
> vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254'
> vtysh -c 'show ip route table 100'
> sleep 0.1
> ip l del red
> sleep 0.1
> vtysh -c 'show ip route table 100'
> ip l add red type vrf table 100
> ip l set du0 master red
> vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254'
> vtysh -c 'show ip route table 100'
> sleep 0.1
> ip l del red
> sleep 0.1
> vtysh -c 'show ip route table 100'

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert 7395e399b1 zebra: fix table heap-after-free crash
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and creates crashes and memory
related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     #4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     #5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     #6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     #7 0x7fa32476689c in event_call lib/event.c:1996
>     #8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     #9 0x55b0e4e6c32a in main zebra/main.c:526
>     #10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     #11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     #4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     #5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     #6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     #7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     #8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     #9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     #10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     #11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     #12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     #13 0x7fa32476689c in event_call lib/event.c:1996
>     #14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     #15 0x55b0e4e6c32a in main zebra/main.c:526
>     #16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     #4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     #5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     #6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     #7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     #8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     #9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     #10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     #11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     #12 0x7fa32476689c in event_call lib/event.c:1996
>     #13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     #14 0x55b0e4e6c32a in main zebra/main.c:526
>     #15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Jafar Al-Gharaibeh 3f785c913d
Merge pull request #18348 from donaldsharp/topotest_startup_order
Topotest startup order
2025-03-08 21:52:16 -06:00
Donald Sharp 009f42dd5b tests: Have zebra startup look for the zserv.api socket
Ensure that the zserv.api socket is actually up and running
before moving onto other daemons after zebra.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 18:43:18 -05:00
Donald Sharp dd609bc069 tests: Allow mgmtd and zebra to fully come up before other daemons
Currently the topotest infrastructure is starting up daemons
in mgmtd,zebra, staticd then everything else.

The problem that is happening, under heavy load, is that
zebra may not be fully started and when a daemon attempts
to connect to it, it will not be able to connect.
Some of the daemons do not have great retry mechanisms at all.
In addition our normal systemctl startup scripts actually
wait a small amount of time for zebra to be ready before
moving onto the other daemons.

Let's make topotests startup a tiny bit more nuanced
and have mgmtd fully up before starting up zebra.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 18:43:18 -05:00