Issue:
In a scaled setup, (where number of nets > BGP_CLEARING_BATCH_MAX_DESTS
for walk_batch_table_helper), when peer is shutdown, it is seen some
of the paths are not deleted, which are received from that peer.
Fix:
This is due to, in clear_batch_rib_helper, once walk_batch_table_helper
returns after BGP_CLEARING_BATCH_MAX_DESTS is reached, we just break
from inner loop for the afi/safi for loops. So during walk for next
afi/safi that 'ret' state is overwritten with new state. Also the
resume context is overwritten. This causes to lose the start point
for next walk, some nets are skipped forever. So they are not marked
for deletion anymore. To fix this, we immediately return from current
run. This will have resume state to be stored correctly, and next walk
will start from there.
Testing:
32 ecmp paths were received from the shutdown peer
Before fix:
show bgp ipv6 2052:52:1:167::/64
BGP routing table entry for 2052:52:1:167::/64, version 495
Paths: (246 available, best #127, table default)
Not advertised to any peer
<snip>
4200165500 4200165002
2021:21:51:101::2(spine-5) from spine-5(2021:21:51:101::2) (6.0.0.17)
(fe80::202:ff:fe00:55) (prefer-global)
Origin incomplete, valid, external, multipath
Last update: Fri Apr 4 17:25:05 2025
4200165500 4200165002
2021:21:11:116::2(spine-1) from spine-1(2021:21:11:116::2) (0.0.0.0)
(fe80::202:ff:fe00:3d) (prefer-global)<<<<path not deleted
Origin incomplete, valid, external
Last update: Fri Apr 4 17:25:05 2025
4200165500 4200165002
2021:21:11:115::2(spine-1) from spine-1(2021:21:11:115::2) (0.0.0.0)
(fe80::202:ff:fe00:3d) (prefer-global)<<<<path not deleted
Origin incomplete, valid, external
Last update: Fri Apr 4 17:25:05 2025
<snip>
32 paths are supposed to be withdrawn:
root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
256
root@leaf-1:mgmt# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
246<<should be 224, but showing 246, which is wrong
After fix:
32 paths are supposed to be withdrawn:
root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
256
root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
224<<<shows correctly
Signed-off-by: Soumya Roy <souroy@nvidia.com>
Indirect leak of 56 byte(s) in 1 object(s) allocated from:
0 0x7fdaf6cb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7fdaf683a480 in qcalloc lib/memory.c:106
2 0x7fdaf68dd706 in route_table_init_with_delegate lib/table.c:38
3 0x5649b22c05b0 in bgp_table_init bgpd/bgp_table.c:139
4 0x5649b2273da0 in bgp_static_set bgpd/bgp_route.c:7779
5 0x5649b21eba58 in vpnv4_network bgpd/bgp_mplsvpn.c:3244
6 0x7fdaf67b6d61 in cmd_execute_command_real lib/command.c:1003
7 0x7fdaf67b7080 in cmd_execute_command lib/command.c:1062
8 0x7fdaf67b75ac in cmd_execute lib/command.c:1228
9 0x7fdaf68ffb20 in vty_command lib/vty.c:626
10 0x7fdaf6900073 in vty_execute lib/vty.c:1389
11 0x7fdaf6903e24 in vtysh_read lib/vty.c:2408
12 0x7fdaf68f0222 in event_call lib/event.c:2019
13 0x7fdaf681b3c6 in frr_run lib/libfrr.c:1247
14 0x5649b211c903 in main bgpd/bgp_main.c:565
15 0x7fdaf630c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Table was being created but never deleted. Let's delete it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There are some tables not being freed upon shutdown. This
is happening because the table is being locked as dests
are being put on the metaQ. When in shutdown it was clearing
the MetaQ it was not unlocking the table
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When exporting a VPN SRv6 route, the path may not be considered valid if
the nexthop is not valid. This is the case when the 'nexthop vpn export'
command is used. The below example illustrates that the VPN path to
2001:1::/64 is not selected, as the expected nexthop to find in vrf10 is
the one configured:
> # show running-config
> router bgp 1 vrf vrf10
> address-family ipv6 unicast
> nexthop vpn export 2001::1
> # show bgp ipv6 vpn
> [..]
> Route Distinguisher: 1:10
> 2001:1::/64 2001::1@4 0 0 65001 i
> UN=2001::1 EC{99:99} label=16 sid=2001:db8:1:1:: sid_structure=[40,24,16,0] type=bgp, subtype=5
The analysis indicates that the 2001::1 nexthop is considered.
> 2025/03/20 21:47:53.751853 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0
> 2025/03/20 21:47:53.751855 BGP: [VWNP2-DNMFV] Found existing bnc 2001::1/128(0)(VRF vrf10) flags 0x82 ifindex 0 #paths 2 peer 0x0, resolved prefix UNK prefix
> 2025/03/20 21:47:53.751856 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is not valid (in VRF vrf10)
> 2025/03/20 21:47:53.751857 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change
Actually, to check the nexthop validity, only the source path in the VRF
has the correct nexthop. Fix this by reusing the source path information
instead of the current one.
> 2025/03/20 22:43:51.703521 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0
> 2025/03/20 22:43:51.703523 BGP: [VWNP2-DNMFV] Found existing bnc fe80::b812:37ff:fe13:d441/128(0)(VRF vrf10) flags 0x87 ifindex 0 #paths 2 peer 0x0, resolved prefix fe80::/64
> 2025/03/20 22:43:51.703525 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is valid (in VRF vrf10)
> 2025/03/20 22:43:51.703526 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Modify the batch clear code to be able to stop after processing
some of the work and to pick back up again. This will allow
the very expensive nature of the batch clearing to be spread out
and allow bgp to continue to be responsive.
Signed-off-by: Mark Stapp <mjs@cisco.com>
All assignments of the EVPN attributes (ESI and Gateway IP) are gated
behind the peer being set up for inbound soft-reconfiguration.
There are no actual reasons for this limitation, so let's perform the
EVPN attribute assignment no matter what when soft reconfiguration is
not enabled.
Fixes: 6e076ba523 ("bgpd: Fix for ain->attr corruption during path update")
Signed-off-by: Tuetuopay <tuetuopay@me.com>
The two memory leaks:
==387155== 744 (48 direct, 696 indirect) bytes in 1 blocks are definitely lost in loss record 222 of 262
==387155== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387155== by 0x4C1B982: json_object_new_object (in /usr/lib/x86_64-linux-gnu/libjson-c.so.5.1.0)
==387155== by 0x2E4146: peer_adj_routes (bgp_route.c:15245)
==387155== by 0x2E4F1A: show_ip_bgp_instance_neighbor_advertised_route_magic (bgp_route.c:15549)
==387155== by 0x2B982B: show_ip_bgp_instance_neighbor_advertised_route (bgp_route_clippy.c:722)
==387155== by 0x4915E6F: cmd_execute_command_real (command.c:1003)
==387155== by 0x4915FE8: cmd_execute_command (command.c:1062)
==387155== by 0x4916598: cmd_execute (command.c:1228)
==387155== by 0x49EB858: vty_command (vty.c:626)
==387155== by 0x49ED77C: vty_execute (vty.c:1389)
==387155== by 0x49EFFA7: vtysh_read (vty.c:2408)
==387155== by 0x49E4156: event_call (event.c:2019)
==387155== by 0x4958ABD: frr_run (libfrr.c:1247)
==387155== by 0x206A68: main (bgp_main.c:557)
==387155==
==387155== 2,976 (192 direct, 2,784 indirect) bytes in 4 blocks are definitely lost in loss record 240 of 262
==387155== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387155== by 0x4C1B982: json_object_new_object (in /usr/lib/x86_64-linux-gnu/libjson-c.so.5.1.0)
==387155== by 0x2E45CA: peer_adj_routes (bgp_route.c:15325)
==387155== by 0x2E4F1A: show_ip_bgp_instance_neighbor_advertised_route_magic (bgp_route.c:15549)
==387155== by 0x2B982B: show_ip_bgp_instance_neighbor_advertised_route (bgp_route_clippy.c:722)
==387155== by 0x4915E6F: cmd_execute_command_real (command.c:1003)
==387155== by 0x4915FE8: cmd_execute_command (command.c:1062)
==387155== by 0x4916598: cmd_execute (command.c:1228)
==387155== by 0x49EB858: vty_command (vty.c:626)
==387155== by 0x49ED77C: vty_execute (vty.c:1389)
==387155== by 0x49EFFA7: vtysh_read (vty.c:2408)
==387155== by 0x49E4156: event_call (event.c:2019)
==387155== by 0x4958ABD: frr_run (libfrr.c:1247)
==387155== by 0x206A68: main (bgp_main.c:557)
For the 1st one, if the operator issues a advertised-routes command, the
json_ar variable was never being freed.
For the 2nd one, if the operator issued a command where the
output_count_per_rd is 0, we need to free the json_routes value.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BT:
```
3 <signal handler called>
4 0x00005616837546fc in bgp_static_update (bgp=bgp@entry=0x5616865eac50, p=0x561686639e40,
bgp_static=0x561686639f50, afi=afi@entry=AFI_IP6, safi=safi@entry=SAFI_UNICAST) at ../bgpd/bgp_route.c:7232
5 0x0000561683754ad0 in bgp_static_add (bgp=0x5616865eac50) at ../bgpd/bgp_table.h:413
6 0x0000561683785e2e in no_bgp_network_import_check (self=<optimized out>, vty=0x5616865e04c0,
argc=<optimized out>, argv=<optimized out>) at ../bgpd/bgp_vty.c:4609
7 0x00007fdbcc294820 in cmd_execute_command_real (vline=vline@entry=0x561686663000,
```
The program encountered a SEG FAULT when attempting to access pi->extra->vrfleak->bgp_orig because
pi->extra->vrfleak was NULL.
```
(gdb) p pi->extra->vrfleak
$1 = (struct bgp_path_info_extra_vrfleak *) 0x0
(gdb) p pi->extra->vrfleak->bgp_orig
Cannot access memory at address 0x8
```
Added NOT NULL check on pi->extra->vrfleak before accessing pi->extra->vrfleak->bgp_orig
to prevent the segmentation fault.
Signed-off-by: Manpreet Kaur <manpreetk@nvidia.com>
The command `show bgp <afi> <safi>` has this output:
r1# show bgp ipv4 uni 10.0.0.0
BGP routing table entry for 10.0.0.0/32, version 1
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
r1-eth0 r1-eth1 r1-eth2 r1-eth3
....
It specifically states `Advertised to non peer-group peers:` yet
the code is not filtering those out.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When peer connections encounter errors, attempt to batch some
of the clearing processing that occurs. Add a new batch object,
add multiple peers to it, if possible. Do one rib walk for the
batch, rather than one walk per peer. Use a handler callback
per batch to check and remove peers' path-infos, rather than
a work-queue and callback per peer. The original clearing code
remains; it's used for single peers.
Signed-off-by: Mark Stapp <mjs@cisco.com>
Coverity rightly points out that the worse pointer
cannot be null in this section of code. Fix it.
Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
When unconfiguring an aggregated prefix, the VPN prefix is not
removed. Fix this by refreshing the VPN leak when the aggregated route
is or is not available.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the snippet of the code for dmed check condition
left out, which leads to an issue of selecting incorrect bestpath.
As an example:
During the bestpath selection local route looses to another path due
to dmed condition being hit.
The snippet of the logs:
2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV]
[2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path
27.0.0.7 flags Valid with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0 <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In bgp_show_table_rd(), the is_last argument is determined using the
expression "next == NULL" to check if the RD table is the last one. This
helps ensure proper JSON formatting.
However, if next is not NULL but is no longer associated with a BGP
table, the JSON output becomes malformed.
Updates the condition to also verify the existence of the next bgp_dest
table.
Fixes: 1ae44dfcba ("bgpd: unify 'show bgp' with RD with normal unicast bgp show")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Even if we have unnumbered peering, let's respect `no neighbor X capability link-local`
and disable it per-neighbor on demand.
Fixes: db853cc97e ("bgpd: Implement Link-Local Next Hop capability")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Memory is being leaked when processing the eoiu marker.
BGP is creating a dummy dest to contain the data but
it was never freed. As well as the eoiu info was
not being freed either.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When you have suppress-fib-pending turned on it is possible
to end up in a situation where the prefix is not withdrawn
from downstream peers.
Here is the timing that I believe is happening:
a) have 2 paths to a peer.
b) receive a withdrawal from 1 path, set BGP_NODE_FIB_INSTALL_PENDING
and send the route install to zebra.
c) receive a withdrawal from the other path.
d) At this point we have a dest->flags set BGP_NODE_FIB_INSTALL_PENDING
old_select the path_info going away, new_select is NULL
e) A bit further down we call group_announce_route() which calls
the code to see if we should advertise the path. It sees the
BGP_NODE_FIB_INSTALL_PENDING flag and says, nope.
f) the route is sent to zebra to withdraw, which unsets the
BGP_NODE_FIB_INSTALL_PENDING.
g) This function winds up and deletes the path_info. Dest now
has no path infos.
h) BGP receives the route install(from step b) and unsets the
BGP_NODE_FIB_INSTALL_PENDING flag
i) BGP receives the route removed from zebra (from step f) and
unsets the flag again.
We know if there is no new_select, let's go ahead and just
unset the PENDING flag to allow the withdrawal to go out
at the time when the second withdrawal is received.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
draft-ietf-idr-deprecate-as-set-confed-set-16 defines that we MUST NOT
advertise an aggregate prefix to the contributing ASes.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The json output of advertised-routes is incorrect, as there is a missing
brace with route-distinguisher:
observed with the bgp_vpnv4_noretain test:
> "bgpTableVersion":0,"bgpLocalRouterId":"192.0.2.1","defaultLocPrf":100,"localAS":65500,
> "advertisedRoutes": "192.0.2.1:1":{"rd":"192.0.2.1:1","10.101.0.0/24":{"prefix":"10.101.0.0/24",
expected:
> "bgpTableVersion":0,"bgpLocalRouterId":"192.0.2.1","defaultLocPrf":100,"localAS":65500,
> "advertisedRoutes": { "192.0.2.1:1":{"rd":"192.0.2.1:1","10.101.0.0/24":{"prefix":"10.101.0.0/24",
> ^
> missing brace
Fix this by adding the missing braces.
Fixes: 4838bac033 ("bgpd: neighbors received-routes/advertised-routes stringify changes")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When aggregate is used, the suppressed information is not displayed in
the json attributes of a given path. To illustrate, the dump of the
192.168.2.1/32 path in the bgp_aggregate_address_topo1 topotest:
> # show bgp ipv4
> [..]
> s> 192.168.2.1/32 10.0.0.2 0 65001 i
>
> # show bgp ipv4 detail
> [..]
> BGP routing table entry for 192.168.2.1/32, version 17
> Paths: (1 available, best #1, table default, vrf (null), Advertisements suppressed by an aggregate.)
> Not advertised to any peer
> 65001 <---- missing suppressed flag
> 10.0.0.2 from 10.0.0.2 (10.254.254.3)
> Origin IGP, valid, external, best (First path received)
> Last update: Fri Jan 24 13:11:41 2025
>
> # show bgp ipv4 detail json
> [..]
> ,"192.168.2.1/32": [{"aspath":{"string":"65001","segments":[{"type":"as-sequence","list":[65001]}],"length":1},"origin":"IGP","valid":true,"version":17,
> "bestpath":{"overall":true,"selectionReason":"First path received"}, <---- missing suppressed flag
> "lastUpdate":{"epoch":1737720700,"string":"Fri Jan 24 13:11:40 2025\n"},
> "nexthops":[{"ip":"10.0.0.2","afi":"ipv4","metric":0,"accessible":true,"used":true}],
> "peer":{"peerId":"10.0.0.2","routerId":"10.254.254.3","type":"external"}}]
Fix this by adding the json information.
> # show bgp ipv4 detail
> [..]
> BGP routing table entry for 192.168.2.1/32, version 17
> Paths: (1 available, best #1, table default, vrf (null), Advertisements suppressed by an aggregate.)
> Not advertised to any peer
> 65001, (suppressed)
> 10.0.0.2 from 10.0.0.2 (10.254.254.3)
> Origin IGP, valid, external, best (First path received)
> Last update: Fri Jan 24 13:11:41 2025
>
> # show bgp ipv4 detail json
> [..]
> ,"192.168.2.1/32": [{"aspath":{"string":"65001","segments":[{"type":"as-sequence","list":[65001]}],"length":1},"suppressed":true,"origin":"IGP","valid":true,"version":17,
> "bestpath":{"overall":true,"selectionReason":"First path received"},
> "lastUpdate":{"epoch":1737720991,"string":"Fri Jan 24 13:16:31 2025"},
> "nexthops":[{"ip":"10.0.0.2","afi":"ipv4","metric":0,"accessible":true,"used":true}],"peer":{"peerId":"10.0.0.2","routerId":"10.254.254.3","type":"external"}}]
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
EVPN imported routes AF is not AF_EVPN, in that case
the path info extra with EVPN is nver created.
In order to create evpn info under path extra, define
new api which does not specifically check for AF_EVPN
afi type.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
draft-uttaro-idr-bgp-oad says:
Extended communities which are non-transitive across an AS boundary MAY be
advertised over an EBGP-OAD session if allowed by explicit policy configuration.
If allowed, all the members of the OAD SHOULD be configured to use the same
criteria.
For example, the Origin Validation State Extended Community, defined as
non-transitive in [RFC8097], can be advertised to peers in the same OAD.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Some bgp evpn memory contexts are not freed at the end of the bgp
process.
> =================================================================
> ==1208677==ERROR: LeakSanitizer: detected memory leaks
>
> Direct leak of 96 byte(s) in 2 object(s) allocated from:
> #0 0x7f93ad4b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7f93ace77233 in qcalloc lib/memory.c:106
> #2 0x563bb68f4df1 in process_type5_route bgpd/bgp_evpn.c:5084
> #3 0x563bb68fb663 in bgp_nlri_parse_evpn bgpd/bgp_evpn.c:6302
> #4 0x563bb69ea2a9 in bgp_nlri_parse bgpd/bgp_packet.c:347
> #5 0x563bb69f7716 in bgp_update_receive bgpd/bgp_packet.c:2482
> #6 0x563bb6a04d3b in bgp_process_packet bgpd/bgp_packet.c:4091
> #7 0x7f93acf8082d in event_call lib/event.c:1996
> #8 0x7f93ace48931 in frr_run lib/libfrr.c:1232
> #9 0x563bb6880ae1 in main bgpd/bgp_main.c:557
> #10 0x7f93ac829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Actually, the bgp evpn context may noy be used if adj rib in is unused.
This may lead to memory leaks. Fix this by freeing the context in those
corner cases.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>