Commit graph

6213 commits

Author SHA1 Message Date
Donatas Abraitis 004c6c0260
Merge pull request #18692 from donaldsharp/event_cleanups
zebra: Save event pointer for rib sweeping
2025-04-19 03:40:27 +03:00
Donald Sharp 547894c087 zebra: Save event pointer for rib sweeping
The rib_sweep_route function when not doing graceful
restart does not attempt to save the event on the
t_rib_sweep pointer for shutdown.  Prevent any
weird shenanigans by allowing shutdown to clean
up the rib_sweep_route event.

Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
2025-04-18 17:44:39 -04:00
Mark Stapp 1ca756f315
Merge pull request #18497 from krishna-samy/show-metaq-counters
zebra: show command to display metaq info
2025-04-16 09:16:40 -04:00
Mark Stapp 21a32b010b
Merge pull request #18579 from krishna-samy/krishna/dplane_fpm_read
zebra: change fpm_read to batch the messages
2025-04-16 08:47:11 -04:00
Krishnasamy 7e8c18d0b0 zebra: change fpm_read to batch the messages
Make code changes in fpm_read to create a list of ctx and send it to
zebra for processing rather than sending individual ctx

Signed-off-by: Krishnasamy <krishnasamyr@nvidia.com>
2025-04-16 07:14:55 +00:00
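
Sketch for 7e8c18d0b0 (illustrative Python, not FRR code): the batching idea is to collect every parsed context into a list and hand the list over once, rather than making one hand-off per context. `parse` and `enqueue_batch` are hypothetical stand-ins for the real decode step and the hand-off to zebra.

```python
from typing import Callable, Iterable, List


def read_and_batch(messages: Iterable[bytes],
                   parse: Callable[[bytes], dict],
                   enqueue_batch: Callable[[List[dict]], None]) -> int:
    """Parse all pending messages, then hand them over in a single call.

    Hypothetical helper: `parse` and `enqueue_batch` are assumptions,
    not names taken from fpm_read.
    """
    batch = [parse(msg) for msg in messages]
    if batch:
        # One hand-off for the whole list instead of one per context.
        enqueue_batch(batch)
    return len(batch)
```

With three pending messages, `enqueue_batch` is called once with a three-element list; with none pending, it is not called at all.
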
Jafar Al-Gharaibeh 0dc71bcfca
Merge pull request #18641 from donaldsharp/fpm_listener_storage
zebra: Add ability to dump routes received from fpm_listener
2025-04-14 15:21:13 -05:00
Donald Sharp bd8ee74b49
Merge pull request #18645 from louis-6wind/fix-zebra-pbr-leak
zebra: fix pbr_iptable memory leak
2025-04-11 19:54:03 -04:00
Louis Scalbert 55ea74d630 zebra: clean pbr_iptable interface_name_list free
Clean up code related to pbr_iptable->interface_name_list free. This is
a cosmetic change.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-11 15:52:42 +02:00
Louis Scalbert 92cddedffd zebra: fix pbr_iptable memory leak
We are obviously deleting the wrong object.

> Direct leak of 40 byte(s) in 1 object(s) allocated from:
>     #0 0x7fcf718b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fcf7126f8dd in qcalloc lib/memory.c:105
>     #2 0x7fcf7124401a in list_new lib/linklist.c:49
>     #3 0x55771621d86d in pbr_iptable_alloc_intern zebra/zebra_pbr.c:1015
>     #4 0x7fcf71217d79 in hash_get lib/hash.c:147
>     #5 0x55771621dad3 in zebra_pbr_add_iptable zebra/zebra_pbr.c:1030
>     #6 0x55771614d00c in zread_iptable zebra/zapi_msg.c:4131
>     #7 0x55771614e586 in zserv_handle_commands zebra/zapi_msg.c:4424
>     #8 0x5577162dae2c in zserv_process_messages zebra/zserv.c:521
>     #9 0x7fcf7137798e in event_call lib/event.c:2011
>     #10 0x7fcf71242ff1 in frr_run lib/libfrr.c:1216
>     #11 0x5577160e4d6d in main zebra/main.c:540
>     #12 0x7fcf70c29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> Indirect leak of 24 byte(s) in 1 object(s) allocated from:
>     #0 0x7fcf718b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fcf7126f8dd in qcalloc lib/memory.c:105
>     #2 0x7fcf71244129 in listnode_new lib/linklist.c:71
>     #3 0x7fcf71244238 in listnode_add lib/linklist.c:92
>     #4 0x55771621d938 in pbr_iptable_alloc_intern zebra/zebra_pbr.c:1019
>     #5 0x7fcf71217d79 in hash_get lib/hash.c:147
>     #6 0x55771621dad3 in zebra_pbr_add_iptable zebra/zebra_pbr.c:1030
>     #7 0x55771614d00c in zread_iptable zebra/zapi_msg.c:4131
>     #8 0x55771614e586 in zserv_handle_commands zebra/zapi_msg.c:4424
>     #9 0x5577162dae2c in zserv_process_messages zebra/zserv.c:521
>     #10 0x7fcf7137798e in event_call lib/event.c:2011
>     #11 0x7fcf71242ff1 in frr_run lib/libfrr.c:1216
>     #12 0x5577160e4d6d in main zebra/main.c:540
>     #13 0x7fcf70c29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Fixes: f80ec7e3d6 ("zebra: handle iptable list of interfaces")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-11 15:52:30 +02:00
Louis Scalbert cd451ff4ef zebra: split up MTYPE_PBR_OBJ
Split up MTYPE_PBR_OBJ into dedicated MTYPE to clarify the memory
allocation and free.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-11 15:52:30 +02:00
Donald Sharp 6299a89371 zebra: Add ability to dump routes received from fpm_listener
The fpm_listener currently has no ability to store the list
of prefixes that it has received.  Modify the code to store
the prefixes in a typesafe RB Tree.  Additionally modify
the code such that when a SIGUSR1 is received to dump
the routes out.  If the operator specifies a -z <filename>
then write the routes to that file.  It will overwrite
the last version of the file written.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-04-10 20:01:39 -04:00
Donald Sharp ef580f0e80 zebra: modify fpm_listener to display data about nhgs
Currently the fpm_listener completely ignores NHGs.
Let's start dumping some data about the nexthop groups:

[2025-04-10 16:55:12.939235306] FPM message - Type: 1, Length 52
[2025-04-10 16:55:12.939254252] Nexthop Group ID: 9, Protocol: Zebra(11), Contains 1 nexthops, Family: 2, Scope: 0
[2025-04-10 16:55:12.939260564] FPM message - Type: 1, Length 52
[2025-04-10 16:55:12.939263990] Nexthop Group ID: 10, Protocol: Zebra(11), Contains 1 nexthops, Family: 2, Scope: 0
[2025-04-10 16:55:12.939268659] FPM message - Type: 1, Length 56
[2025-04-10 16:55:12.939271635] Nexthop Group ID: 8, Protocol: Zebra(11), Contains 2 nexthops, Family: 0, Scope: 0

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-04-10 17:14:38 -04:00
Donald Sharp 64a6a2e175 zebra: Fix shadow warning in irdp_packet.c
My compiler is complaining about irdp_sock
being a shadow variable.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-04-09 12:01:30 -04:00
Mark Stapp 7c98a27f3e zebra: clean up -Wshadow compiler warnings
Clean up variable-shadowing compiler warnings.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-04-08 14:41:27 -04:00
Russ White c312917988
Merge pull request #18450 from donaldsharp/bgp_packet_reads
Bgp packet reads conversion to a FIFO
2025-04-01 10:12:37 -04:00
Krishnasamy 751ae76648 zebra: show command to display metaq info
Display below info from metaq and sub queues
1. Current queue size
2. Max/Highwater size
3. Total number of events received so far

r1# sh zebra metaq
MetaQ Summary
Current Size    : 0
Max Size        : 9
Total           : 20
 |------------------------------------------------------------------|
 | SubQ                             | Current  | Max Size  | Total  |
 |----------------------------------+----------+-----------+--------|
 | NHG Objects                      | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | EVPN/VxLan Objects               | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Early Route Processing           | 0        | 8         | 11     |
 |----------------------------------+----------+-----------+--------|
 | Early Label Handling             | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Connected Routes                 | 0        | 6         | 9      |
 |----------------------------------+----------+-----------+--------|
 | Kernel Routes                    | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Static Routes                    | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | RIP/OSPF/ISIS/EIGRP/NHRP Routes  | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | BGP Routes                       | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Other Routes                     | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Graceful Restart                 | 0        | 0         | 0      |
 |------------------------------------------------------------------|

Signed-off-by: Krishnasamy <krishnasamyr@nvidia.com>
2025-04-01 09:32:46 +00:00
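
Sketch for 751ae76648 (illustrative Python, not FRR code): the three counters shown per queue (current size, max/high-water size, running total) could be maintained roughly as below. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueueCounters:
    current: int = 0   # entries queued right now
    max_size: int = 0  # high-water mark of `current`
    total: int = 0     # events enqueued since start

    def on_enqueue(self) -> None:
        self.current += 1
        self.total += 1
        # The high-water mark only ever moves up.
        self.max_size = max(self.max_size, self.current)

    def on_dequeue(self) -> None:
        if self.current > 0:
            self.current -= 1
```

Once a queue has drained, `current` is back at 0 while `max_size` and `total` keep their history, which matches the shape of the rows in the table above.
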
Donald Sharp f82682a3f9 zebra: Clean up memory associated with affinity maps
Zebra is using affinity maps but not cleaning up memory on shutdown.
BAD!

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-30 17:54:34 -04:00
Donald Sharp 937a9fb3e9 zebra: Limit reading packets when MetaQ is full
Currently Zebra just reads packets off the zapi
wire and stacks them up for later processing in
zebra.  When there is significant churn in the
network, the size of zebra can grow without bound
due to the MetaQ sizing constraints.  This shows
up in the number of nexthops in the system.
Reducing the number of packets serviced, so that
the MetaQ size is limited to the packets to
process, alleviates this problem.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-25 09:10:46 -04:00
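
Sketch for 937a9fb3e9 (illustrative Python, not FRR code): one way to read "limit the MetaQ size to the packets to process" is to shrink the per-pass read budget by whatever is already queued. The function and parameter names are hypothetical, and the actual policy in zebra may differ.

```python
def read_budget(packets_to_process: int, metaq_size: int) -> int:
    """Return how many packets may be read in this pass.

    Assumption: the budget is the configured per-pass limit minus the
    work already sitting in the MetaQ, never below zero.
    """
    return max(0, packets_to_process - metaq_size)
```

With an empty MetaQ the full limit is available. Once the queue holds at least `packets_to_process` entries, nothing more is read until it drains.
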
Mark Stapp 556d3c445d
Merge pull request #18359 from soumyar-roy/soumya/streamsize
zebra: zebra crash for zapi stream
2025-03-21 11:30:16 -04:00
Russ White 37fd451997
Merge pull request #18409 from donaldsharp/typesafe_zclient
Typesafe zclient
2025-03-20 12:48:47 -04:00
Soumya Roy 860c1e4450 zebra: reduce memory usage by streams when redistributing routes
This commit undoes 8c9b007a0c.
The stream lib has been modified to expand the stream if needed.
For zapi route encode, we now use an expandable stream.

Signed-off-by: Soumya Roy <souroy@nvidia.com>
2025-03-20 16:13:44 +00:00
Soumya Roy 6fe9092eb3 zebra: zebra crash for zapi stream
Issue:
If static route is created with a BGP route as nexthop, which
recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode
fails, as there is not enough memory allocated for stream. This causes
assert/core dump in zebra. Right now we allocate fixed memory
of ZEBRA_MAX_PACKET_SIZ size.

Fix:
1)Dynamically calculate required memory size for the stream
2)try to optimize memory usage

Testing:
No crash happens anymore with the fix
r1#
r1# sharp install routes 2100:cafe:: nexthop 2001:db8::1 1000
r1#

r2# conf
r2(config)# ipv6 route 2503:feca::100/128 2100:cafe::1
r2(config)# exit
r2#

Signed-off-by: Soumya Roy <souroy@nvidia.com>
2025-03-20 16:13:44 +00:00
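
Sketch for 6fe9092eb3 and 860c1e4450 (illustrative Python, not FRR code): the difference between a fixed-size stream and one that expands on demand. The class name and the doubling growth policy are assumptions; the commit messages only say that the stream is expanded if needed.

```python
class Stream:
    """Toy byte stream with an optional grow-on-demand mode."""

    def __init__(self, size: int, expandable: bool = False):
        self.buf = bytearray()
        self.size = size              # current capacity in bytes
        self.expandable = expandable

    def put(self, data: bytes) -> None:
        needed = len(self.buf) + len(data)
        if needed > self.size:
            if not self.expandable:
                # Fixed-size behaviour: the encode fails.
                raise OverflowError("stream too small")
            # Assumed policy: double the capacity until the write fits.
            while self.size < needed:
                self.size = max(1, self.size * 2)
        self.buf += data
```

A fixed stream of 4 bytes rejects a 5-byte write, while the expandable variant grows its capacity and accepts it.
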
Donald Sharp 9c273fad26 zebra: Add timestamp to output
It's interesting to know the time we received the route.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-19 13:47:01 -04:00
Donald Sharp 04d6adc94b zebra: Allow fpm_listener to reject all routes
Usage of `-r -f` with fpm_listener now causes all
routes to be rejected.

r1# sharp install routes 10.0.0.0 nexthop 192.168.44.5 5
r1# show ip route
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

IPv4 unicast VRF default:
D>o 10.0.0.0/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.1/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.2/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.3/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.4/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
C>* 192.168.44.0/24 is directly connected, r1-eth0, weight 1, 00:00:37
L>* 192.168.44.1/32 is directly connected, r1-eth0, weight 1, 00:00:37
r1#

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-19 13:43:47 -04:00
Donald Sharp 4d6f5c7e27 zebra: Rework the stale client list to a typesafe list
The stale client list was just a linked list, let's use
the typesafe list.

Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
2025-03-19 13:43:00 -04:00
Donald Sharp 24d293277f zebra: Convert the zrouter.client_list to a typesafe list
This list should just be a typesafe list.

Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
2025-03-19 13:27:36 -04:00
Russ White d5b864ebee
Merge pull request #18374 from raja-rajasekar/rajasekarr/nhg_intf_flap_issue
zebra: Fix reinstalling nexthops in NHGs upon interface flaps
2025-03-19 08:10:15 -04:00
Rajasekar Raja de168795ab zebra: Fix reinstalling nexthops in NHGs upon interface flaps
Trigger:
Imagine a route utilizing an NHG with six nexthops (Intf swp1-swp6).
If interfaces swp1-swp4 flap, the NHG remains the same but now only
references two nexthops (swp5-6) instead of all six. This behavior
occurs due to how NHGs with recursive nexthops are managed within Zebra.

In the scenario below, NHG 370 has all six nexthops installed in the
kernel. However, Zebra maintains a list of recursive NHGs that NHG 370
references i.e., Depends: (371), (372), (373) which are not directly
installed in the kernel.
- When an interface comes up, its nexthop and corresponding dependents
  are installed.
- These dependents (counterparts to 371-373) are non-recursive and
  are installed as well.
- However, when attempting to install the recursive ones in
  zebra_nhg_install_kernel(), they resolve to the already installed
  counterparts, resulting in a NO-OP.

Fix this by iterating over all dependents of the recursively resolved
NHGs and reinstalling them.

Trigger: Flap swp1 to swp4

Before Fix:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 370 proto bgp metric 20
ip -d next show
id 337 via 2000:1:0:1:0:f:0:9 dev swp6 scope link proto zebra
id 339 via 2000:1:0:1:0:e:0:9 dev swp5 scope link proto zebra
id 341 via 2000:1:0:1:0:8:0:8 dev swp4 scope link proto zebra
id 343 via 2000:1:0:1:0:7:0:8 dev swp3 scope link proto zebra
id 346 via 2000:1:0:1:0:1:0:7 dev swp2 scope link proto zebra
id 348 via 2000:1:0:1::7 dev swp1 scope link proto zebra
id 370 group 346/348/341/343/337/339 scope global proto zebra

After Trigger:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 370 proto bgp metric 20
root@leaf-11:mgmt:/var/home/cumulus# ip -d next show
id 337 via 2000:1:0:1:0:f:0:9 dev swp6 scope link proto zebra
id 339 via 2000:1:0:1:0:e:0:9 dev swp5 scope link proto zebra
id 370 group 337/339 scope global proto zebra

After Fix:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 432 proto bgp metric 20
ip -d next show
id 432 group 395/397/400/402/405/407 scope global proto zebra

After Trigger:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 432 proto bgp metric 20
root@leaf-11:mgmt:/var/home/cumulus# ip -d next show
id 432 group 395/397/400/402/405/407 scope global proto zebra

Ticket :#

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-18 12:21:42 -07:00
Russ White 4b6e0ba1a1
Merge pull request #18349 from donaldsharp/more_yang_state
More yang state
2025-03-18 11:02:28 -04:00
Dmytro Shytyi e6d08a89c7
zebra: add rtadv information output in vtysh json
Add to "show interface json" output multiple rtadv parameters.

if_dump_vty() calls => hook_call(zebra_if_extra_info, vty, ifp);

if_dump_vty_json() now makes the same call, with an additional parameter:
hook_call(zebra_if_extra_info, vty, json_if, ifp);

Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
2025-03-17 11:19:58 +01:00
Donatas Abraitis 35cc716363
Merge pull request #18394 from donaldsharp/fpm_listener_output
zebra: add ability to specify output file with fpm_listener
2025-03-15 18:32:19 +01:00
Donald Sharp f0b2bc3b4c zebra: add ability to specify output file with fpm_listener
The fpm_listener didn't have the ability to specify the output
file location at all.  Modify the code to accept this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-14 13:24:19 -04:00
Jafar Al-Gharaibeh 7945af0200
Merge pull request #18360 from raja-rajasekar/rajasekarr/fix_explicit_sid_allocation
zebra: ensure proper return for failure for Sid allocation
2025-03-14 09:57:41 -05:00
Mark Stapp 27953dd141
Merge pull request #18336 from routingrocks/rvaratharaj/bugfixmar
zebra: Fix neigh delete causing heap-use-after-free error
2025-03-12 08:09:29 -04:00
Rajesh Varatharaj 3060afc84d zebra: Fix neigh delete causing heap-use-after-free error
Issue:
Not freeing the neighbor n within the same function can lead to a
memory leak.
zebra_neigh_del_all() -> zebra_neigh_del() re-looks up the entry and frees it.

Fix: do not access n after it is freed.
Directly free the neighbor entry (n) when its interface index matches
ifp->ifindex.

This fixes:
ERROR: AddressSanitizer: heap-use-after-free on address 0x6070001052e8 at pc 0x7f6bf7d09ddb bp 0x7ffd3366a000 sp 0x7ffd33669ff0
READ of size 8 at 0x6070001052e8 thread T0
    #0 0x7f6bf7d09dda in _rb_next lib/openbsd-tree.c:455
    #1 0x55f95a307261 in zebra_neigh_rb_head_RB_NEXT zebra/zebra_neigh.h:34
    #2 0x55f95a3082e9 in zebra_neigh_del_all zebra/zebra_neigh.c:162
    #3 0x55f95a121ee7 in zebra_interface_down_update zebra/redistribute.c:571
    #4 0x55f95a0f819d in if_down zebra/interface.c:1017
    #5 0x55f95a0fe168 in zebra_if_dplane_ifp_handling zebra/interface.c:2102
    #6 0x55f95a0ff10c in zebra_if_dplane_result zebra/interface.c:2241
    #7 0x55f95a27ce9c in rib_process_dplane_results zebra/zebra_rib.c:5015
    #8 0x7f6bf7da3ad9 in event_call lib/event.c:1984
    #9 0x7f6bf7c62141 in frr_run lib/libfrr.c:1246
    #10 0x55f95a11ca7f in main zebra/main.c:543
    #11 0x7f6bf7029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #12 0x7f6bf7029e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #13 0x55f95a0dd0b4 in _start (/usr/lib/frr/zebra+0x1a80b4)

Ticket: #18047

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
2025-03-11 13:41:40 -07:00
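
Sketch for 3060afc84d (illustrative Python, not FRR code): the general pattern behind "do not access n after it is freed" is to save the successor before deleting the current entry. The real code walks an RB tree; this toy uses a singly linked list, and all names are hypothetical.

```python
class Node:
    def __init__(self, ifindex, nxt=None):
        self.ifindex = ifindex
        self.next = nxt


def delete_matching(head, ifindex):
    """Unlink every node whose ifindex matches; return the new head."""
    dummy = Node(None, head)
    prev, cur = dummy, head
    while cur is not None:
        nxt = cur.next          # save the successor before "freeing" cur
        if cur.ifindex == ifindex:
            prev.next = nxt
            cur.next = None     # stands in for free(): cur is not used again
        else:
            prev = cur
        cur = nxt               # advance using the saved pointer only
    return dummy.next
```

Reading `cur.next` after the delete would be the analogue of the `_rb_next` read on freed memory in the trace above.
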
Mark Stapp d0cb3ad7cb
Merge pull request #16614 from louis-6wind/fix-otable-heap-after-free
zebra: fix table heap-after-free crash
2025-03-11 14:03:14 -04:00
Rajasekar Raja 5a63cf4c0d zebra: ensure proper return for failure for Sid allocation
The functions alloc_srv6_sid_func_explicit/dynamic are expected to return
bool, but in some places we return -1 or NULL, which the caller treats
as true/valid and ends up allocating a SID.

Without Fix:
2025/03/10 21:44:04.295350 ZEBRA: [XWV20-TGK70] alloc_srv6_sid_func_explicit: trying to allocate explicit SID function 65088 from block fcbb:bbbb::/32
2025/03/10 21:44:04.295351 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 10000 e 20000 wlib s 1000 ewlib s 30000 e 1000 SID_FUNC 65088
2025/03/10 21:44:04.295352 ZEBRA: [QGHMB-SWNFW] alloc_srv6_sid_func_explicit: function 65088 is outside ELIB [10000/20000] and EWLIB alloc ranges [30000/1000]
2025/03/10 21:44:04.295367 ZEBRA: [H0GZA-NNSWJ] get_srv6_sid_explicit: allocated explicit SRv6 SID fcbb:bbbb:1:fe40:: for context End.X nh6 2001::2
2025/03/10 21:44:04.295368 ZEBRA: [XBBYD-T1Q7P] srv6_manager_get_sid_internal: got new SRv6 SID for ctx End.X nh6 2001::2: sid_value=fcbb:bbbb:1:fe40:: (func=65088) (proto=4, instance=0, sessionId=0), notifying all clients

With Fix:
2025/03/10 22:04:25.052235 ZEBRA: [MM61M-TQZNP] alloc_srv6_sid_func_explicit: elib s 30000 e 31000 wlib s 31000 ewlib s 30000 e 31000 SID_FUNC 65056
2025/03/10 22:04:25.052236 ZEBRA: [YHMRC-EMYNX] alloc_srv6_sid_func_explicit: function 65056 is outside ELIB [30000/31000] and EWLIB alloc ranges [30000/31000]
2025/03/10 22:04:25.052254 ZEBRA: [XSG8X-Q2XJX] get_srv6_sid_explicit: invalid SM request arguments: failed to allocate SID function 65056 from block fcbb:bbbb::/32
2025/03/10 22:04:25.052257 ZEBRA: [YC52T-427SJ] srv6_manager_get_sid_internal: not got SRv6 SID for ctx End.DT6 vrf_id 4, sid_value=fcbb:bbbb:1:fe20::, locator_name=MAIN
root@rajasekarr:/tmp/topotests/static_srv6_sids.test_static_srv6_sids/r1#

Ticket :#
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2025-03-10 15:26:38 -07:00
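
Sketch for 5a63cf4c0d (illustrative Python, not FRR code): why a -1 "error" return is dangerous when the caller tests the result as a boolean. Any non-zero value is truthy, so -1 reads as success. The function names are hypothetical; the range values come from the "Without Fix" log lines above.

```python
def alloc_ok_buggy(func: int, lo: int, hi: int):
    """Buggy shape: signals failure with -1, which is truthy."""
    if not lo <= func <= hi:
        return -1
    return True


def alloc_ok_fixed(func: int, lo: int, hi: int) -> bool:
    """Fixed shape: failure is a real False."""
    return lo <= func <= hi
```

A caller written as `if alloc_ok_buggy(65088, 10000, 20000): ...` proceeds to allocate even though the function is out of range; the fixed variant stops it.
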
Louis Scalbert c6afe42455 lib, tests, zebra: keep table routes at vrf disabling
At VRF disabling, keep the route entries that were associated with its
table ID but not with the VRF itself. The kernel flushes these entries, so
we need to reinstall them.

To do so, add a flag to mean that a route entry is owned by a table ID
and not by a VRF. If the VRF associated to the table ID is deleted, the
route entry must not be deleted.

Update the tests with the new flag. 2057 is 0x809 in hexadecimal, meaning
that the new flag has been added to some prefix.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
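
Sketch for c6afe42455 (illustrative Python, not FRR code): the decimal value 2057 seen in the tests can be broken into its individual flag bits as below. Which of these bits is the new table-ownership flag is not stated in the commit message, so no bit is named here.

```python
def set_bits(value: int) -> list:
    """Return the individual flag bits set in value, lowest first."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


# 2057 == 0x809 == 0x800 | 0x008 | 0x001
flags = [hex(bit) for bit in set_bits(2057)]
```
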
Louis Scalbert 52a35e9592 zebra: fix vanished blackhole route
Fix vanished blackhole route when kernel routes are updated.

> root@router# echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> root@router# ip l add du0 type dummy
> root@router# ifconfig du0 192.168.0.1/24 up
> root@router# ip route add blackhole default table 100
> root@router# ip route show table 100
> blackhole default
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:05
> root@router# ip l add red type vrf table 100
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:16
> root@router# ip l set du0 master red
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> C>* 192.168.0.0/24 is directly connected, du0, weight 1, 00:00:02
> L>* 192.168.0.1/32 is directly connected, du0, weight 1, 00:00:02
> root@router# ip route show table 100
> blackhole default
> 192.168.0.0/24 dev du0 proto kernel scope link src 192.168.0.1
> local 192.168.0.1 dev du0 proto kernel scope host src 192.168.0.1
> broadcast 192.168.0.255 dev du0 proto kernel scope link src 192.168.0.1

Fixes: d528c02a20 ("zebra: Handle kernel routes appropriately")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert 5cde97678e zebra: fix removed default route at vrf enabling
When a routing table (RT) already has a default route before being
assigned to a VRF, the default route vanishes in zebra after the VRF
assignment.

> root@router:~# ip route add blackhole default table 100
> root@router:~# ip route show table 100
> blackhole default
> root@router:~# vtysh -c 'show ip route table 100'
> [...]
> VRF default table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), 00:00:05
> root@router:~# ip l add red type vrf table 100
> root@router:~# vtysh -c 'show ip route table 100'
> root@router:~#

Do not override the default route if it exists.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert fb8bf9cf59 zebra: remove vrf route entries at vrf disabling
This is the continuation of the previous commit.

When a VRF is deleted, the kernel retains only its own routing entries
in the former VRF table and removes all others.

This change ensures that routing entries created by FRR daemons are also
removed from the former zebra VRF table when the VRF is disabled.

To test:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip l add du0 type dummy
> ifconfig du0 192.168.0.1/24 up
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l set du0 master red
> vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254'
> vtysh -c 'show ip route table 100'
> sleep 0.1
> ip l del red
> sleep 0.1
> vtysh -c 'show ip route table 100'
> ip l add red type vrf table 100
> ip l set du0 master red
> vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254'
> vtysh -c 'show ip route table 100'
> sleep 0.1
> ip l del red
> sleep 0.1
> vtysh -c 'show ip route table 100'

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert 7395e399b1 zebra: fix table heap-after-free crash
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and causes crashes and
memory-related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     #4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     #5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     #6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     #7 0x7fa32476689c in event_call lib/event.c:1996
>     #8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     #9 0x55b0e4e6c32a in main zebra/main.c:526
>     #10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     #11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     #4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     #5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     #6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     #7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     #8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     #9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     #10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     #11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     #12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     #13 0x7fa32476689c in event_call lib/event.c:1996
>     #14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     #15 0x55b0e4e6c32a in main zebra/main.c:526
>     #16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     #4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     #5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     #6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     #7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     #8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     #9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     #10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     #11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     #12 0x7fa32476689c in event_call lib/event.c:1996
>     #13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     #14 0x55b0e4e6c32a in main zebra/main.c:526
>     #15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
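
Sketch for 7395e399b1 (illustrative Python, not FRR code): the ownership rule described above, where a table is never released at VRF disable but handed back to the default VRF, and taken over again at VRF enable. The class and method names are hypothetical.

```python
DEFAULT_VRF = "default"


class TableRegistry:
    def __init__(self):
        self.owner = {}  # table_id -> name of the owning VRF

    def table_seen(self, table_id: int) -> None:
        # A table not bound to any VRF is parked under the default VRF.
        self.owner.setdefault(table_id, DEFAULT_VRF)

    def vrf_enable(self, vrf: str, table_id: int) -> None:
        # Take ownership even if the table already existed.
        self.owner[table_id] = vrf

    def vrf_disable(self, vrf: str, table_id: int) -> None:
        # Keep the table; only move it back to the default VRF.
        if self.owner.get(table_id) == vrf:
            self.owner[table_id] = DEFAULT_VRF
```

Following the reproduction steps, table 100 starts under the default VRF, moves to `red` on `ip l add red type vrf table 100`, and returns to the default VRF on `ip l del red` without being freed.
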
Donald Sharp 9bf22f603e zebra: Add mpls-forwarding to yang state model
The mpls-forwarding state was missing from the model;
add it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 22:24:42 -05:00
Donald Sharp 633ef005bd zebra: Don't use MTYPE_TMP for l2 vni data
Convert over from MTYPE_TMP to MTYPE_L2_VNI as the
data type.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 11:50:41 -05:00
Donald Sharp b648479cb4 zebra: Declutter zebra_vxlan_if_add_update_vni
This function has equivalent code on both sides
of an if statement.  Let's consolidate this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 11:48:05 -05:00
Donald Sharp 45e2f0fc6e zebra: malloc functions cannot fail
Let's try to remember that when using a malloc function
it can never fail and as such testing for NULL does
nothing.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-07 11:48:05 -05:00
Donatas Abraitis 26d1e5ce17
Merge pull request #18214 from soumyar-roy/soumya/ra514nei
zebra: Bring up 514 BGP neighbor sessions
2025-03-06 20:15:19 +02:00
Soumya Roy 6a75d33b5c zebra: Bring up 514 BGP neighbor sessions
Issue:
When 514 interfaces/neighbors are configured, a socket error,
"Cannot allocate memory", occurs when back-to-back V6 RA messages are
sent over the socket. This prevents an interface from learning its peer's
link-local address. The socket error occurs when we 1) try to join the
ICMPv6 all-routers multicast group back to back for all interfaces, or
2) send back-to-back RAs for all interfaces.

Fix:
1) For the ICMPv6 join case, check whether the interface has already joined
the all-routers group; if not, try to join. On failure, retry joining after
a random delay between 1 ms and ICMPV6_JOIN_TIMER_EXP_MS (100 ms).
2) For the RA case, batch the sending of RA messages using a wheel timer.
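
Sketch of the two fixes above (illustrative Python, not FRR code): a randomized retry delay for the multicast join, and spreading interfaces over timer-wheel slots so RAs go out in batches instead of all at once. The function names and the round-robin slot assignment are assumptions; only ICMPV6_JOIN_TIMER_EXP_MS and its 100 ms value come from the commit message.

```python
import random

ICMPV6_JOIN_TIMER_EXP_MS = 100  # upper bound named in the commit message


def join_retry_delay_ms(rng: random.Random) -> int:
    """Pick a retry delay between 1 ms and ICMPV6_JOIN_TIMER_EXP_MS."""
    return rng.randint(1, ICMPV6_JOIN_TIMER_EXP_MS)


def wheel_slots(interfaces, nslots: int):
    """Spread interfaces over nslots; each tick would service one slot.

    Assumption: plain round-robin placement, chosen for simplicity.
    """
    slots = [[] for _ in range(nslots)]
    for i, ifname in enumerate(interfaces):
        slots[i % nslots].append(ifname)
    return slots
```

With 514 interfaces and 10 slots, no tick handles more than 52 interfaces, so the socket never sees 514 back-to-back sends.
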

Testing:
Monitor BGP sessions by running the sh bgp summary command

Before fix:
r1# sh bgp summary

IPv4 Unicast Summary:
BGP router identifier 192.168.1.1, local AS number 1001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 515, using 12 MiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
r1-eth0         4       1002        89        90        0    0    0 00:07:10            0        0 N/A
r1-eth1         4       1002        89        90        0    0    0 00:07:10            0        0 N/A
r1-eth2         4       1002        89        90        0    0    0 00:07:10            0        0 N/A
r1-eth3         4       1002        89        90        0    0    0 00:07:10            0        0 N/A
r1-eth4         4       1002        89        90        0    0    0 00:07:10            0        0 N/A
r1-eth5         4       1002        89        90        0    0    0 00:07:10            0        0 N/A

…..<snip>...
r1-eth252       4       1002        31        29        0    0    0 00:02:08            0        0 N/A
r1-eth253       4       1002        31        29        0    0    0 00:02:08            0        0 N/A
r1-eth254       4       1002        31        29        0    0    0 00:02:08            0        0 N/A
r1-eth255       4       1002        31        29        0    0    0 00:02:08            0        0 N/A
r1-eth256       4          0         0         0        0    0    0    never         Idle        0 N/A
r1-eth257       4          0         0         0        0    0    0    never         Idle        0 N/A
r1-eth258       4          0         0         0        0    0    0    never         Idle        0 N/A
r1-eth259       4          0         0         0        0    0    0    never         Idle        0 N/A
r1-eth260       4          0         0         0        0    0    0    never         Idle        0 N/A
……..<snip>…..
r1-eth511       4          0         0         0        0    0    0    never         Idle        0 N/A
r1-eth512       4          0         0         0        0    0    0    never         Idle        0 N/A
r1-eth513       4          0         0         0        0    0    0    never         Idle        0 N/A
r1-eth514       4          0         0         0        0    0    0    never         Idle        0 N/A

After fix:
r1# show bgp summary

IPv4 Unicast Summary:
BGP router identifier 192.168.1.1, local AS number 1001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 515, using 12 MiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
r1-eth0         4       1002        87        87        0    0    0 00:07:04            0        0 N/A
r1-eth1         4       1002        87        87        0    0    0 00:07:04            0        0 N/A
r1-eth2         4       1002        87        87        0    0    0 00:07:04            0        0 N/A
r1-eth3         4       1002        64        67        0    0    0 00:05:09            0        0 N/A
r1-eth4         4       1002        87        87        0    0    0 00:07:04            0        0 N/A
r1-eth5         4       1002        87        87        0    0    0 00:07:04            0        0 N/A
r1-eth6         4       1002        67        70        0    0    0 00:05:22            0        0 N/A
r1-eth7         4       1002        87        87        0    0    0 00:07:04            0        0 N/A
r1-eth8         4       1002        87        87        0    0    0 00:07:04            0        0 N/A
....
r1-eth499       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth500       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth501       4       1002        19        22        0    0    0 00:01:21            0        0 N/A
r1-eth502       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth503       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth504       4       1002        20        23        0    0    0 00:01:30            0        0 N/A
r1-eth505       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth506       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth507       4       1002        22        25        0    0    0 00:01:39            0        0 N/A
r1-eth508       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth509       4       1002        17        20        0    0    0 00:01:13            0        0 N/A
r1-eth510       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth511       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth512       4       1002        19        22        0    0    0 00:01:22            0        0 N/A
r1-eth513       4       1002        43        43        0    0    0 00:03:22            0        0 N/A
r1-eth514       4       1002        43        43        0    0    0 00:03:22            0        0 N/A

Signed-off-by: Soumya Roy <souroy@nvidia.com>
2025-03-05 06:15:56 +00:00
Russ White 0b094a772c
Merge pull request #18253 from dksharp5/yang_zebra
Allow retrieval of v4/v6 forwarding state via NB
2025-03-04 09:25:24 -05:00
Mark Stapp b66145b8ca
Merge pull request #18030 from fdumontet6WIND/mem_alloc_stream
zebra: reduce memory usage by streams when redistributing routes
2025-03-03 11:09:47 -05:00