Commit graph

857 commits

Author SHA1 Message Date
Christopher Dziomba ec06de8403
zebra: use nexthop instead of route vrf_id for EVPN
Today zebra tracks EVPN vtep_ips in the L3-VNI of the route_entry.
Leaking routes in bgpd and using different RMACs on the remote side
for different L3-VNIs for a single VTEP leads to updates in the leak
destination VRF which is not desired. Zebra gets the source VRF in
vrf_id of the nexthops. This nexthop vrf_id is now used for adding/
updating/deleting the l3vni nexthop.

Signed-off-by: Christopher Dziomba <christopher.dziomba@telekom.de>
2025-04-26 20:10:37 +02:00
Russ White e525972513
Merge pull request #18158 from chdxD1/evpn-no-withdraw-for-update
Update EVPN prefix routes properly instead of withdraw/install
2025-04-22 10:54:57 -04:00
Mark Stapp 1ca756f315
Merge pull request #18497 from krishna-samy/show-metaq-counters
zebra: show command to display metaq info
2025-04-16 09:16:40 -04:00
Mark Stapp 7c98a27f3e zebra: clean up -Wshadow compiler warnings
Clean up variable-shadowing compiler warnings.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-04-08 14:41:27 -04:00
Christopher Dziomba 069a23da10
zebra: Add refcounts to evpn nexthop prefixes
With bgpd no longer sending withdraws for EVPN prefix routes
zebra needs to track the path/ref count of prefixes for an EVPN
nexthop. When a route is updated the count is first increased
with the new paths and then decreased with the paths of the
existing/"same" route entry.
However the same evpn route methods are used for EVPN MH as well,
where bgpd already tracks the references. It is expected that an
ADD operation for the respective A-D routes is handled as an
upsert, a DEL operation should really remove the respective A-D
reference on a next-hop. For this the old behaviour (no path/ref
counting in zebra) is preserved.

Signed-off-by: Christopher Dziomba <christopher.dziomba@telekom.de>
2025-04-01 20:32:50 +02:00
Krishnasamy 751ae76648 zebra: show command to display metaq info
Display below info from metaq and sub queues
1. Current queue size
2. Max/Highwater size
3. Total number of events received fo so far

r1# sh zebra metaq
MetaQ Summary
Current Size    : 0
Max Size        : 9
Total           : 20
 |------------------------------------------------------------------|
 | SubQ                             | Current  | Max Size  | Total  |
 |----------------------------------+----------+-----------+--------|
 | NHG Objects                      | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | EVPN/VxLan Objects               | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Early Route Processing           | 0        | 8         | 11     |
 |----------------------------------+----------+-----------+--------|
 | Early Label Handling             | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Connected Routes                 | 0        | 6         | 9      |
 |----------------------------------+----------+-----------+--------|
 | Kernel Routes                    | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Static Routes                    | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | RIP/OSPF/ISIS/EIGRP/NHRP Routes  | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | BGP Routes                       | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Other Routes                     | 0        | 0         | 0      |
 |----------------------------------+----------+-----------+--------|
 | Graceful Restart                 | 0        | 0         | 0      |
 |------------------------------------------------------------------|

Signed-off-by: Krishnasamy <krishnasamyr@nvidia.com>
2025-04-01 09:32:46 +00:00
Donald Sharp 937a9fb3e9 zebra: Limit reading packets when MetaQ is full
Currently Zebra is just reading packets off the zapi
wire and stacking them up for processing in zebra
in the future.  When there is significant churn
in the network the size of zebra can grow without
bounds due to the MetaQ sizing constraints.  This
ends up showing by the number of nexthops in the
system.  Reducing the number of packets serviced
to limit the metaQ size to the packets to process
allieviates this problem.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-25 09:10:46 -04:00
Louis Scalbert 52a35e9592 zebra: fix vanished blackhole route
Fix vanished blackhole route when kernel routes are updated.

> root@router# echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> root@router# ip l add du0 type dummy
> root@router# ifconfig du0 192.168.0.1/24 up
> root@router# ip route add blackhole default table 100
> root@router# ip route show table 100
> blackhole default
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:05
> root@router# ip l add red type vrf table 100
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> K>* 0.0.0.0/0 [0/0] unreachable (blackhole), weight 1, 00:00:16
> root@router# ip l set du0 master red
> root@router# vtysh -c 'show ip route table 100'
> [...]
> Table 100:
> C>* 192.168.0.0/24 is directly connected, du0, weight 1, 00:00:02
> L>* 192.168.0.1/32 is directly connected, du0, weight 1, 00:00:02
> root@router# ip route show table 100
> blackhole default
> 192.168.0.0/24 dev du0 proto kernel scope link src 192.168.0.1
> local 192.168.0.1 dev du0 proto kernel scope host src 192.168.0.1
> broadcast 192.168.0.255 dev du0 proto kernel scope link src 192.168.0.1

Fixes: d528c02a20 ("zebra: Handle kernel routes appropriately")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
Louis Scalbert fb8bf9cf59 zebra: remove vrf route entries at vrf disabling
This is the continuation of the previous commit.

When a VRF is deleted, the kernel retains only its own routing entries
in the former VRF table and removes all others.

This change ensures that routing entries created by FRR daemons are also
removed from the former zebra VRF table when the VRF is disabled.

To test:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip l add du0 type dummy
> ifconfig du0 192.168.0.1/24 up
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l set du0 master red
> vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254'
> vtysh -c 'show ip route table 100'
> sleep 0.1
> ip l del red
> sleep 0.1
> vtysh -c 'show ip route table 100'
> ip l add red type vrf table 100
> ip l set du0 master red
> vtysh -c 'configure' -c 'vrf red' -c 'ip route 10.0.0.0/24 192.168.0.254'
> vtysh -c 'show ip route table 100'
> sleep 0.1
> ip l del red
> sleep 0.1
> vtysh -c 'show ip route table 100'

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-03-10 09:54:18 +01:00
David Schweizer 1eef3a77e3
lib,zebra: Allow class E prefixes in RIB
Changes allow ipv4 class E addresses and prefixes in the 240.0.0.0/4
range to be configured on interfaces, imported from the kernel routing
table and redistributed as connected routes in zebra by default.

Changes also fix routes with class E prefixes in kernel routing table
getting rejected by zebra during early daemon startup.

Drivin this change in default behavior are cloud providers (with
customers still using obsolete ipv4 protocol, i.e. Azure, AWS) running
out of ip space and abusing class E for addressing instances (announced
via BGP) over tunneling connections back to customers on premise
infrastructure.

Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
2025-02-14 15:05:08 +01:00
Donald Sharp b2fc167978 zebra: Fix pass back of data from dplane through fpm pipe
A recent code change 29122bc9b8
changed the passing of data up the fpm from passing the
tableid and vrf to the sonic expected tableid contains
the vrfid.  This violates the assumptions in the code
that the netlink message passes up the tableid as the
tableid.  Additionally this code change did not modify
the rib_find_rn_from_ctx to actually properly decode
what could be passed up.  Let's just fix this and let
Sonic carry the patch as appropriate for themselves
since they are not the only users of dplane_fpm_nl.c

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 15:05:40 -05:00
Donald Sharp 67da971218
Merge pull request #17581 from mjstapp/fix_fpm_netlink
zebra: avoid race between FPM pthread and zebra main pthread in netlink encode/decode
2025-01-14 13:42:29 -05:00
Russ White c8ba5b09b3
Merge pull request #17544 from anlancs/zebra/fix-plug-interface
zebra: fix wrong nexthop status for kernel routes
2024-12-17 11:16:32 -05:00
anlan_cs 298bc623e7 zebra: don't uninstall kernel routes
After the nexthop check is fixed, zebra will wrongly uninstall the kernel routes
with inactive nexthop.

This commit would skip the uninstallation for kernel routes.

Signed-off-by: anlan_cs <anlan_cs@126.com>
2024-12-17 16:14:30 +08:00
Nathan Bahr 4250eae00d zebra,pimd,lib: Modify ZEBRA_NEXTHOP_LOOKUP_MRIB
Modified ZEBRA_NEXTHOP_LOOKUP_MRIB to include the SAFI from which to do the lookup.
This generalizes the API away from MRIB specifically and allows the user to decide how it should do lookups.
Rename ZEBRA_NEXTHOP_LOOKUP_MRIB to ZEBRA_NEXTHOP_LOOKUP now that it is more generalized.
This change is in preperation to remove multicast lookup mode completely from zebra.

Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
2024-12-12 13:50:31 +00:00
Donatas Abraitis 492750f8bc
Merge pull request #17638 from donaldsharp/zebra_metaq_stuff
zebra: Remove tests for allocation failure
2024-12-12 11:10:31 +02:00
Donald Sharp 5d8bf74f0a zebra: Remove tests for allocation failure
This cannot happen.  No need to test

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-12-11 11:43:48 -05:00
Donald Sharp da7393b8fd zebra: Fix another ships in the night issue with WFI
Effectively When bgp would send a route update down
to zebra and immediately after that a asic update
from the kernel was read.  Zebra would choose the
asic update and drop the bgp update leaving us in
a state where bgp was not used as the true source.

Modify the code so that in rib_multipath_nhe
we notice that we have an unprocessed route update
from bgp.  And if so just drop this kernel update
about an older version of the route since it is
no longer needed.

Ticket: 2722533
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-12-09 12:35:42 -05:00
Mark Stapp 9af5425a28 zebra: improve an rnode debug
Improve a debug when we create a new rib_dest by calling the debug
after setting up the dest.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2024-12-05 09:23:53 -05:00
Mark Stapp 29122bc9b8 zebra: refactor netlink route message parsing
Separate core netlink route message parsing into a new api that
uses a dplane ctx to hold the parsed attribute data. Use the new
api in two paths: the normal netlink update message parsing path,
and in the FPM plugin, which also uses netlink encoding. The FPM
route-notificatin code runs in its own pthread, and only needs
a subset of the route info that zebra ordinarily develops. This
change stops that pthread from accessing zebra's internal
data, such as vrfs and ifps, that are not thread-safe.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2024-12-05 09:23:53 -05:00
anlan_cs f536ca30f5 zebra: use macro for one check
Signed-off-by: anlan_cs <anlan_cs@126.com>
2024-12-05 21:20:05 +08:00
Russ White fe20f83286
Merge pull request #17326 from anlancs/fix/zebra-no-ifp-down
zebra: fix missing kernel routes
2024-11-05 10:20:36 -05:00
anlan_cs 44a82da405 zebra: fix missing kernel routes
The `rib_update_handle_kernel_route_down_possibility()` didn't consider
the kernel routes ( blackhole )  without interface.  When some other
interfaces are down, these kernel routes will be wrongly removed.

Signed-off-by: anlan_cs <anlan_cs@126.com>
2024-10-31 22:45:16 +08:00
Nathan Bahr be2a7ed6af zebra: Add ability to import alternate tables into the MRIB
Expanded the cli command to include an mrib flag for importing to
the main table MRIB instead of the main table URIB.
Piped through specifying the safi through the import table functions
rather than hardcoding to SAFI_UNICAST.
Import still only import routes from the URIB subtable, only added the
ability to import into the main table MRIB.

Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
2024-10-29 20:17:59 +00:00
Donald Sharp 3bff65abc7 zebra: When installing a mroute, allow it to flow
Currently the mroute code was not allowing the mroute
to be sent to the dataplane.  This leaves us with a
situation where the routes being installed where never
being set as installed and additionally nht against
the mrib would not work if the route came into existence
after the nexthop tracking was asked for.

Turns out all the pieces where there to let this work.
Modify the code to pass it to the dplane and to send
it back up as having worked.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-28 15:02:39 -04:00
Donald Sharp 811168ecc3 zebra: Add safi to some debugs
Trying to figure out what safi we are talking about is fun when
it is not put into the debugs.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-28 14:29:31 -04:00
anlan_cs 5829fea1b5 zebra: remove useless code
Signed-off-by: anlan_cs <anlan_cs@126.com>
2024-10-19 13:32:53 +08:00
Donald Sharp 28237d73ad zebra: Attempt to explain the rnh tracking code better
I got asked today what was going on in the rnh code.  I
had to take time off of what I was doing and rewrap my
head around this code, since it's been a long time.
As that this question may come up again in the future
I am trying to document this better so that someone
coming behind us will be able to read this and get
a better idea of what the algorithm is attempting
to do.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-15 12:42:17 -04:00
Mark Stapp 6c1bc51bbb
Merge pull request #16737 from raja-rajasekar/rajasekarr/vlan_to_dplane
zebra: vlan to dplane
2024-10-15 08:06:34 -04:00
Donald Sharp f50b1f7c22 zebra: Move pw status settting until after we get results
Currently the pw code sets the status of the pw for install
and uninstall immediately when notifying the dplane.  This
is incorrect in that we do not actually know the status at
this point in time.  When we get the result is when to set
the status.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-07 20:36:45 -04:00
Donna Sharp 8eb5f4f506 zebra: remove unused function rib_lookup_ipv4
Signed-off-by: Donna Sharp <dksharp5@gmail.com>
2024-10-06 18:53:11 -04:00
Rajasekar Raja aa4786642c zebra: vlan to dplane Offload from main
Trigger: Zebra core seen when we convert l2vni to l3vni and back

BackTrace:
/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(_zlog_assert_failed+0xe9) [0x7f4af96989d9]
/usr/lib/frr/zebra(zebra_vxlan_if_vni_up+0x250) [0x5561022ae030]
/usr/lib/frr/zebra(netlink_vlan_change+0x2f4) [0x5561021fd354]
/usr/lib/frr/zebra(netlink_parse_info+0xff) [0x55610220d37f]
/usr/lib/frr/zebra(+0xc264a) [0x55610220d64a]
/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(thread_call+0x7d) [0x7f4af967e96d]
/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(frr_run+0xe8) [0x7f4af9637588]
/usr/lib/frr/zebra(main+0x402) [0x5561021f4d32]
/lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x7f4af932624a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f4af9326305]
/usr/lib/frr/zebra(_start+0x21) [0x5561021f72f1]

Root Cause:
In working case,
 - We get a RTM_NEWLINK whose ctx is enqueued by zebra dplane and
   dequeued by zebra main and processed i.e.
   (102000 is deleted from vxlan99) before we handle RTM_NEWVLAN.
 - So in handling of NEWVLAN (vxlan99) we bail out since find with
   vlan id 703 does not exist.

root@leaf2:mgmt:/var/log/frr# cat ~/raja_logs/working/nocras.log  | grep "RTM_NEWLINK\|QUEUED\|vxlan99\|in thread"
2024/07/18 23:09:33.741105 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=616, seq=0, pid=0
2024/07/18 23:09:33.744061 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7f2244000cf0, op INTF_INSTALL, ifindex (65), result QUEUED
2024/07/18 23:09:33.767240 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=508, seq=0, pid=0
2024/07/18 23:09:33.767380 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7f2244000cf0, op INTF_INSTALL, ifindex (73), result QUEUED
2024/07/18 23:09:33.767389 ZEBRA: [NVFT0-HS1EX] INTF_INSTALL for vxlan99(73)
2024/07/18 23:09:33.767404 ZEBRA: [TQR2A-H2RFY] Vlan-Vni(1186:1186-6000002:6000002) update for VxLAN IF vxlan99(73)
2024/07/18 23:09:33.767422 ZEBRA: [TP4VP-XZ627] Del L2-VNI 102000 intf vxlan99(73)
2024/07/18 23:09:33.767858 ZEBRA: [QYXB9-6RNNK] RTM_NEWVLAN bridge IF vxlan99 NS 0
2024/07/18 23:09:33.767866 ZEBRA: [KKZGZ-8PCDW] Cannot find VNI for VID (703) IF vxlan99 for vlan state update >>>>BAIL OUT

In failure case,
 - The NEWVLAN is received first even before processing RTM_NEWLINK.
 - Since the vxlan id 102000 is not removed from the vxlan99,
   the find with vlan id 703 returns the 102000 one and we invoke
   zebra_vxlan_if_vni_up where the interfaces don't match and assert.

root@leaf2:mgmt:/var/log/frr# cat ~/raja_logs/noworking/crash.log | grep "RTM_NEWLINK\|QUEUED\|vxlan99\|in thread"
2024/07/18 22:26:43.829370 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=616, seq=0, pid=0
2024/07/18 22:26:43.829646 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7fe07c026d30, op INTF_INSTALL, ifindex (65), result QUEUED
2024/07/18 22:26:43.853930 ZEBRA: [QYXB9-6RNNK] RTM_NEWVLAN bridge IF vxlan99 NS 0
2024/07/18 22:26:43.853949 ZEBRA: [K61WJ-XQQ3X] Intf vxlan99(73) L2-VNI 102000 is UP >>> VLAN PROCESSED BEFORE INTF EVENT
2024/07/18 22:26:43.853951 ZEBRA: [SPV50-BX2RP] RAJA zevpn_vxlanif vxlan48 and ifp vxlan99
2024/07/18 22:26:43.854005 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=508, seq=0, pid=0
2024/07/18 22:26:43.854241 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=516, seq=0, pid=0
2024/07/18 22:26:43.854251 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=544, seq=0, pid=0
ZEBRA: in thread kernel_read scheduled from zebra/kernel_netlink.c:505 kernel_read()

Fix:
Similar to #13396, where link change
handling was offloaded to dplane, same is being done for vlan events.

Note: Prior to this change, zebra main thread was interested in the
RTNLGRP_BRVLAN. So all the kernel events pertaining to vlan was
handled by zebra main.

With this change change as well the handling of vlan events is still
with Zebra main. However we offload it via Dplane thread.

Ticket :#3878175

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-09-26 20:17:35 -07:00
Donald Sharp f02d76f0fd zebra: Attempt to reuse NHG after interface up and route reinstall
The previous commit modified zebra to reinstall the singleton
nexthops for a nexthop group when a interface event comes up.
Now let's modify zebra to attempt to reuse the nexthop group
when this happens and the upper level protocol resends the
route down with that.  Only match if the protocol is the same
as well as the instance and the nexthop groups would match.

Here is the new behavior:
eva(config)# do show ip route 9.9.9.9/32
Routing entry for 9.9.9.9/32
  Known via "static", distance 1, metric 0, best
  Last update 00:00:08 ago
  * 192.168.99.33, via dummy1, weight 1
  * 192.168.100.33, via dummy2, weight 1
  * 192.168.101.33, via dummy3, weight 1
  * 192.168.102.33, via dummy4, weight 1

eva(config)# do show ip route nexthop-group 9.9.9.9/32
% Unknown command: do show ip route nexthop-group 9.9.9.9/32
eva(config)# do show ip route 9.9.9.9/32 nexthop-group
Routing entry for 9.9.9.9/32
  Known via "static", distance 1, metric 0, best
  Last update 00:00:54 ago
  Nexthop Group ID: 57
  * 192.168.99.33, via dummy1, weight 1
  * 192.168.100.33, via dummy2, weight 1
  * 192.168.101.33, via dummy3, weight 1
  * 192.168.102.33, via dummy4, weight 1

eva(config)# exit
eva# conf
eva(config)# int dummy3
eva(config-if)# shut
eva(config-if)# no shut
eva(config-if)# do show ip route 9.9.9.9/32 nexthop-group
Routing entry for 9.9.9.9/32
  Known via "static", distance 1, metric 0, best
  Last update 00:00:08 ago
  Nexthop Group ID: 57
  * 192.168.99.33, via dummy1, weight 1
  * 192.168.100.33, via dummy2, weight 1
  * 192.168.101.33, via dummy3, weight 1
  * 192.168.102.33, via dummy4, weight 1

eva(config-if)# exit
eva(config)# exit
eva# exit
sharpd@eva ~/frr1 (master) [255]> ip nexthop show id 57
id 57 group 37/43/50/58 proto zebra
sharpd@eva ~/frr1 (master)> ip route show 9.9.9.9/32
9.9.9.9 nhid 57 proto 196 metric 20
	nexthop via 192.168.99.33 dev dummy1 weight 1
	nexthop via 192.168.100.33 dev dummy2 weight 1
	nexthop via 192.168.101.33 dev dummy3 weight 1
	nexthop via 192.168.102.33 dev dummy4 weight 1
sharpd@eva ~/frr1 (master)>

Notice that we now no longer are creating a bunch of new
nexthop groups.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-16 09:34:05 -04:00
Donald Sharp ce166ca789 zebra: Expose _route_entry_dump_nh so it can be used.
Expose this helper function so it can be used in zebra_nhg.c

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-16 09:34:05 -04:00
Donald Sharp e0437aba6d zebra: Add more vrf name to debugs
Trying to debug some cross vrf stuff in zebra and frankly
it's hard to grep the file for the routes you are interested
in.  Let's clean this up some and get a bit better
information for us developers

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-11 15:30:43 -04:00
Russ White add56c61dd
Merge pull request #15259 from dmytroshytyi-6WIND/nexthop_resolution
zebra: add LSP entry to nexthop via recursive (part 2)
2024-09-10 10:04:08 -04:00
Donald Sharp 98b11de9f6 zebra: Modify show zebra dplane providers to give more data
The show zebra dplane provider command was ommitting
the input and output queues to the dplane itself.
It would be nice to have this insight as well.

New output:
r1# show zebra dplane providers
dataplane Incoming Queue from Zebra: 100
Zebra dataplane providers:
  Kernel (1): in: 6, q: 0, q_max: 3, out: 6, q: 14, q_max: 3
  dplane_fpm_nl (2): in: 6, q: 10, q_max: 3, out: 6, q: 0, q_max: 3
dataplane Outgoing Queue to Zebra: 43
r1#

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-05 15:52:05 -04:00
Donald Sharp 0c72a78930 zebra: Allow for initial deny of installation of nhe's
Currently the FRR code will receive both kernel and
connected routes that do not actually have an underlying
nexthop group at all.  Zebra turns around and creates
a `matching` nexthop hash entry and installs it.
For connected routes, this will create 2 singleton
nexthops in the dplane per interface (v4 and v6).
For kernel routes it would just create 1 singleton
nexthop that might be used or not.

This is bad because the dplane has a limited amount
of space available for nexthop entries and if you
happen to have a large number of interfaces then
all of a sudden you have 2x(# of interfaces) singleton
nexthops.

Let's modify the code to delay creation of these singleton
nexthops until they have been used by something else in the
system.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-30 08:23:48 -04:00
Donald Sharp 8ad5643abe zebra: Convince SA that the ng will always be valid
There is a code path that could theoretically get you
to a point where the ng->nexthop is a NULL value.
Let's just make sure the SA system believes that
cannot happen anymore.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-29 18:10:30 -04:00
Donald Sharp 9bc0cd8241 zebra: Prevent accidental re memory leak in odd case
There exists a path in rib_add_multipath where if a decision
is made to not use the passed in re, we just drop the memory
instead of freeing it.  Let's free it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donald Sharp d528c02a20 zebra: Handle kernel routes appropriately
Current code intentionally ignores kernel routes.  Modify
zebra to allow these routes to be read in on linux.  Also
modify zebra to look to see if a route should be treated
as a connected and mark it as such.

Additionally this should properly handle some of the issues
being seen with NOPREFIXROUTE.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donald Sharp bdfccf69fa zebra: Expose rib_update_handle_vrf_all
This function will be used on interface down
events to allow for kernel routes to be cleaned
up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donald Sharp f450e1cda4 zebra: Make p and src_p const for rib_delete
The prefix'es p and src_p are not const.  Let's make
them so.  Useful to signal that we will not change this
data.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donatas Abraitis 1e288c9b55 zebra: Do not forget to free opaque data for route entry
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-13 18:00:30 +03:00
Donald Sharp e53fa582bc zebra: Fix removal of routes on MetaQ when client goes down
It is possible that right before an upper level protocol dies
or is killed routes would be installed into zebra.  These routes
could be on the Meta-Q for early route-processing.  Leaving us with
a situation where the client is removed, and all it's routes that are
in the rib at that time, and then after that the MetaQ is run and the
routes are reprocessed leaving routes from an upper level daemon
post daemon going away from zebra's perspective.  These routes will
be abandoned.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-07-30 08:34:06 -04:00
Philippe Guibert 731f74e35f zebra, topotests: do not set nexthop's FIB flag when DUPLICATE present
The bgp_duplicate_nexthop test installs routes with nexthop's
flags set to both DUPLICATE and FIB: this should not happen.

The DUPLICATE flag of a nexthop indicates this nexthop is already
used in the same nexthop-group, and there is no need to install it
twice in the system; having the FIB flag set indicates that the
nexthop is installed in the system. This is why both flags should
not be set on the same nexthop.

This case happens at installation time, but can also happen
at update time.
- Fix this by not setting the FIB flag value when the DUPLICATE
flag is present.
- Modify the bgp_duplicate_test to check that the FIB flag is not
present on duplicated nexthops.
- Modify the bgp_peer_type_multipath_relax test.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-07-08 15:42:02 +02:00
vivek b5682ffbf0 *: Add and use option for graceful (re)start
Add a new start option "-K" to libfrr to denote a graceful start,
and use it in zebra and bgpd.

zebra will use this option to denote a planned FRR graceful restart
(supporting only bgpd currently) to wait for a route sync completion
from bgpd before cleaning up old stale routes from the FIB. An optional
timer provides an upper-bounds for this cleanup.

bgpd will use this option to denote either a planned FRR graceful
restart or a bgpd-only graceful restart, and this will drive the BGP
GR restarting router procedures.

Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
2024-07-01 13:02:52 -07:00
Philippe Guibert 9f88ed98a4 zebra: relax mpls entries installation with labeled unicast entries
Until now, when a FEC entry is added in zebra, driven by the
reception of a BGP labeled unicast update, an LSP entry is
created. That LSP entry is resolved by using the route entry
which is also installed by BGP labeled unicast.

This route entry is not available when we face with i-bgp
peering session. I-BGP labeled sessions are used to establish
IP connectivity across separate IGPs.

The below dumps illustrate a 3 IGP topology. An attempt to create
connectivity between the north and the south machines is done.

The 3 separate IGPs are configured in Segment routing:
- north-east
- east-west
- west-south

We create BGP peerings between each endpoint of each IGP:
- iBGP between (north) and (east)
- iBGP between (east) and (west)
- iBGP between (west) and (south)

Before that patch, the FEC entries could not be resolved on the
east machine:

Before:
east-vm# show mpls fec
192.0.2.1/32
  Label: 18
  Client list: bgp(fd 48)
192.0.2.5/32
  Label: 17
  Client list: bgp(fd 48)
192.0.2.7/32
  Label: 19
  Client list: bgp(fd 48)

east-vm# show mpls table
 Inbound Label  Type        Nexthop      Outbound Label
 --------------------------------------------------------
 1011           SR (OSPF)   192.168.2.2  1011
 1022           SR (OSPF)   192.168.2.2  implicit-null
 11044          SR (IS-IS)  192.168.3.4  implicit-null
 11055          SR (IS-IS)  192.168.3.4  11055
 30000          SR (OSPF)   192.168.2.2  implicit-null
 30001          SR (OSPF)   192.168.2.2  implicit-null
 36000          SR (IS-IS)  192.168.3.4  implicit-null

east-vm# show ip route
[..]
B   192.0.2.1/32 [200/0] via 192.0.2.1 inactive, label implicit-null, weight 1, 00:17:45
O>* 192.0.2.1/32 [110/20] via 192.168.2.2, r3-eth0, label 1011, weight 1, 00:17:47
O>* 192.0.2.2/32 [110/10] via 192.168.2.2, r3-eth0, label implicit-null, weight 1, 00:17:47
O   192.0.2.3/32 [110/0] is directly connected, lo, weight 1, 00:17:57
C>* 192.0.2.3/32 is directly connected, lo, 00:18:03
I>* 192.0.2.4/32 [115/20] via 192.168.3.4, r3-eth1, label implicit-null, weight 1, 00:17:59
B   192.0.2.5/32 [200/0] via 192.0.2.5 inactive, label implicit-null, weight 1, 00:17:56
I>* 192.0.2.5/32 [115/30] via 192.168.3.4, r3-eth1, label 11055, weight 1, 00:17:58
B>  192.0.2.7/32 [200/0] via 192.0.2.5 (recursive), label 19, weight 1, 00:17:45
  *                        via 192.168.3.4, r3-eth1, label 11055/19, weight 1, 00:17:45
[..]

After command "mpls fec nexthop-resolution" was applied, the FEC
entries will resolve over any non BGP route that has a labeled path
selected.

east-vm# show mpls table
 Inbound Label  Type        Nexthop      Outbound Label
 --------------------------------------------------------
 17             SR (IS-IS)  192.168.3.4  11055
 18             SR (OSPF)   192.168.2.2  1011
 19             BGP         192.168.3.4  11055/19
 1011           SR (OSPF)   192.168.2.2  1011
 1022           SR (OSPF)   192.168.2.2  implicit-null
 11044          SR (IS-IS)  192.168.3.4  implicit-null
 11055          SR (IS-IS)  192.168.3.4  11055
 30000          SR (OSPF)   192.168.2.2  implicit-null
 30001          SR (OSPF)   192.168.2.2  implicit-null
 36000          SR (IS-IS)  192.168.3.4  implicit-null

Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-06-07 12:33:48 +02:00
Jafar Al-Gharaibeh 080054686f
Merge pull request #15216 from donaldsharp/zebra_opaque_mem_leak
zebra: Fix opaque memory leak in rare situation
2024-02-02 10:54:20 -06:00
Donald Sharp 8d5ed65e0d zebra: Cleanup dest assignment
dest was shadowing dest inside of an if statement additionally
both legs needed dest to be assigned.  Let's clean this up a
slight bit and use it appropriately

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-01-24 17:10:45 -05:00