Commit graph

1916 commits

Author SHA1 Message Date
Donatas Abraitis 2f0b8ff1ea
Merge pull request #18624 from louis-6wind/remove-afi2family
bgpd: remove useless calls to afi2family
2025-04-10 17:14:08 +03:00
Mark Stapp 0e8e0c5fb9
Merge pull request #18594 from soumyar-roy/soumya/netwithdraw
bgpd: Paths not deleted received from shutdown peer
2025-04-10 10:07:32 -04:00
Soumya Roy d2bec7a691 bgpd: Paths, received from shutdown peer, not deleted
Issue:
In a scaled setup, (where number of nets > BGP_CLEARING_BATCH_MAX_DESTS
for walk_batch_table_helper), when peer is shutdown, it is seen some
of the paths are not deleted, which are received from that peer.

Fix:
This is due to, in clear_batch_rib_helper, once walk_batch_table_helper
returns after BGP_CLEARING_BATCH_MAX_DESTS is reached, we just break
from inner loop for the afi/safi for loops. So during walk for next
afi/safi that 'ret' state is overwritten with new state. Also the
resume context is overwritten. This causes to lose the start point
for next walk, some nets are skipped forever. So they are not marked
for deletion anymore. To fix this, we immediately return from current
run. This will have resume state to be stored correctly, and next walk
will start from there.

Testing:
32 ecmp paths were received from the shutdown peer
Before fix:
show bgp ipv6 2052:52:1:167::/64
BGP routing table entry for 2052:52:1:167::/64, version 495
Paths: (246 available, best #127, table default)
  Not advertised to any peer

<snip>
  4200165500 4200165002
    2021:21:51:101::2(spine-5) from spine-5(2021:21:51:101::2) (6.0.0.17)
    (fe80::202:ff:fe00:55) (prefer-global)
      Origin incomplete, valid, external, multipath
      Last update: Fri Apr  4 17:25:05 2025
  4200165500 4200165002
    2021:21:11:116::2(spine-1) from spine-1(2021:21:11:116::2) (0.0.0.0)
    (fe80::202:ff:fe00:3d) (prefer-global)<<<<path not deleted
      Origin incomplete, valid, external
      Last update: Fri Apr  4 17:25:05 2025
  4200165500 4200165002
    2021:21:11:115::2(spine-1) from spine-1(2021:21:11:115::2) (0.0.0.0)
    (fe80::202:ff:fe00:3d) (prefer-global)<<<<path not deleted
      Origin incomplete, valid, external
      Last update: Fri Apr  4 17:25:05 2025
<snip>

 32 paths are supposed to be withdrawn:
root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
256
root@leaf-1:mgmt# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
246<<should be 224, but showing 246, which is wrong
After fix:
 32 paths are supposed to be withdrawn:
root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
256
root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" | grep "prefer-global" | wc -l
224<<<shows correctly

Signed-off-by: Soumya Roy <souroy@nvidia.com>
2025-04-09 14:32:23 +00:00
Donald Sharp b2d8d9b37a bgpd: On shutdown free up table for static routes
Indirect leak of 56 byte(s) in 1 object(s) allocated from:
    0 0x7fdaf6cb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    1 0x7fdaf683a480 in qcalloc lib/memory.c:106
    2 0x7fdaf68dd706 in route_table_init_with_delegate lib/table.c:38
    3 0x5649b22c05b0 in bgp_table_init bgpd/bgp_table.c:139
    4 0x5649b2273da0 in bgp_static_set bgpd/bgp_route.c:7779
    5 0x5649b21eba58 in vpnv4_network bgpd/bgp_mplsvpn.c:3244
    6 0x7fdaf67b6d61 in cmd_execute_command_real lib/command.c:1003
    7 0x7fdaf67b7080 in cmd_execute_command lib/command.c:1062
    8 0x7fdaf67b75ac in cmd_execute lib/command.c:1228
    9 0x7fdaf68ffb20 in vty_command lib/vty.c:626
    10 0x7fdaf6900073 in vty_execute lib/vty.c:1389
    11 0x7fdaf6903e24 in vtysh_read lib/vty.c:2408
    12 0x7fdaf68f0222 in event_call lib/event.c:2019
    13 0x7fdaf681b3c6 in frr_run lib/libfrr.c:1247
    14 0x5649b211c903 in main bgpd/bgp_main.c:565
    15 0x7fdaf630c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Table was being created but never deleted.  Let's delete it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-04-09 09:28:31 -04:00
Louis Scalbert a0a0749568 bgpd: remove useless calls to afi2family
Remove useless calls to afi2family(). str2prefix() always sets the
prefix family.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-09 13:07:56 +02:00
Mark Stapp 660cbf5651 bgpd: clean up variable-shadowing compiler warnings
Clean up -Wshadow warnings in bgp.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-04-08 14:41:27 -04:00
Russ White ab67e5544e
Merge pull request #18396 from pguibert6WIND/srv6l3vpn_to_bgp_vrf_redistribute
Add BGP redistribution in SRv6 BGP
2025-04-03 08:25:32 -04:00
Louis Scalbert cbf27be5d9 bgpd: optimize attrhash_cmp calls
Only call attrhash_cmp when necessary.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-04-01 16:52:58 +02:00
Russ White c312917988
Merge pull request #18450 from donaldsharp/bgp_packet_reads
Bgp packet reads conversion to a FIFO
2025-04-01 10:12:37 -04:00
Donald Sharp c9d431d4db bgpd: On shutdown, unlock table when clearing the bgp metaQ
There are some tables not being freed upon shutdown.  This
is happening because the table is being locked as dests
are being put on the metaQ.  When in shutdown it was clearing
the MetaQ it was not unlocking the table

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-30 14:02:16 -04:00
Donald Sharp 83a92c926e bgpd: Delay processing MetaQ in some events
If the number of peers that are being handled on
the peer connection fifo is greater than 10, that
means we have some network event going on.  Let's
allow the packet processing to continue instead
of running the metaQ.  This has advantages because
everything else in BGP is only run after the metaQ
is run.  This includes best path processing,
installation of the route into zebra as well as
telling our peers about this change.  Why does
this matter?  It matters because if we are receiving
the same route multiple times we limit best path processing
to much fewer times and consequently we also limit
the number of times we send the route update out and
we install the route much fewer times as well.

Prior to this patch, with 512 peers and 5k routes.
CPU time for bgpd was 3:10, zebra was 3:28.  After
the patch CPU time for bgpd was 0:55 and zebra was
0:25.

Here are the prior `show event cpu`:
Event statistics for bgpd:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0         20.749     33144        0       395        1       396         0         0          0    T    (bgp_generate_updgrp_packets)
    0       9936.199      1818     5465     43588     5466     43589         0         0          0     E   bgp_handle_route_announcements_to_zebra
    0          0.220        84        2        20        3        20         0         0          0    T    update_subgroup_merge_check_thread_cb
    0          0.058         2       29        43       29        43         0         0          0     E   zclient_connect
    0      17297.733       466    37119     67428    37124     67429         0         0          0   W     zclient_flush_data
    1          0.134         7       19        40       20        42         0         0          0  R      vtysh_accept
    0        151.396      1067      141      1181      142      1189         0         0          0  R      vtysh_read
    0          0.297      1030        0        14        0        14         0         0          0    T    (bgp_routeadv_timer)
    0          0.001         1        1         1        2         2         0         0          0    T    bgp_sync_label_manager
    2          9.374       544       17       261       17       262         0         0          0  R      bgp_accept
    0          0.001         1        1         1        2         2         0         0          0    T    bgp_startup_timer_expire
    0          0.012         1       12        12       13        13         0         0          0     E   frr_config_read_in
    0          0.308         1      308       308      309       309         0         0          0    T    subgroup_coalesce_timer
    0          4.027       105       38        77       39        78         0         0          0    T    (bgp_start_timer)
    0     112206.442      1818    61719     84726    61727     84736         0         0          0    TE   work_queue_run
    0          0.345         1      345       345      346       346         0         0          0    T    bgp_config_finish
    0          0.710       620        1         6        1         9         0         0          0   W     bgp_connect_check
    2         39.420      8283        4       110        5       111         0         0          0  R      zclient_read
    0          0.052         1       52        52      578       578         0         0          0    T    bgp_start_label_manager
    0          0.452        87        5        90        5        90         0         0          0    T    bgp_announce_route_timer_expired
    0        185.837      3088       60       537       92     21705         0         0          0     E   bgp_event
    0      48719.671      4346    11210     78292    11215     78317         0         0          0     E   bgp_process_packet

Showing statistics for pthread BGP I/O thread
---------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0        321.915     28597       11        86       11       265         0         0          0   W     bgp_process_writes
  515        115.586     26954        4       121        4       128         0         0          0  R      bgp_process_reads

Event statistics for zebra:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0          0.109         2       54        62       55        63         0         0          0    T    timer_walk_start
    1          0.550        11       50       100       50       100         0         0          0  R      vtysh_accept
    0     112848.163      4441    25410    405489    25413    410127         0         0          0     E   zserv_process_messages
    0          0.007         1        7         7        7         7         0         0          0     E   frr_config_read_in
    0          0.005         1        5         5        5         5         0         0          0    T    rib_sweep_route
    1        573.589      4789      119      1567      120      1568         0         0          0    T    wheel_timer_thread
  347         30.848        97      318      1367      318      1366         0         0          0    T    zebra_nhg_timer
    0          0.005         1        5         5        6         6         0         0          0    T    zebra_evpn_mh_startup_delay_exp_cb
    0          5.404       521       10        38       10        70         0         0          0    T    timer_walk_continue
    1          1.669         9      185       219      186       219         0         0          0  R      zserv_accept
    1          0.174        18        9        53       10        53         0         0          0  R      msg_conn_read
    0          3.028       520        5        47        6        47         0         0          0    T    if_zebra_speed_update
    0          0.324       274        1         5        1         6         0         0          0   W     msg_conn_write
    1         24.661      2124       11       359       12       359         0         0          0  R      kernel_read
    0      73683.333      2964    24859    143223    24861    143239         0         0          0    TE   work_queue_run
    1         46.649      6789        6       424        7       424         0         0          0  R      rtadv_read
    0         52.661        85      619      2087      620      2088         0         0          0  R      vtysh_read
    0         42.660        18     2370     21694     2373     21695         0         0          0     E   msg_conn_proc_msgs
    0          0.034         1       34        34       35        35         0         0          0     E   msg_client_connect_timer
    0       2786.938      2300     1211     29456     1219     29555         0         0          0     E   rib_process_dplane_results

Showing statistics for pthread Zebra dplane thread
--------------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0       4875.670    200371       24       770       24       776         0         0          0     E   dplane_thread_loop
    0          0.059         1       59        59       76        76         0         0          0     E   dplane_incoming_request
    1          9.640       722       13      4510       15      5343         0         0          0  R      dplane_incoming_read

Here are the post `show event cpu` results:

Event statistics for bgpd:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0      21297.497      3565     5974     57912     5981     57913         0         0          0     E   bgp_process_packet
    0        149.742      1068      140      1109      140      1110         0         0          0  R      vtysh_read
    0          0.013         1       13        13       14        14         0         0          0     E   frr_config_read_in
    0          0.459        86        5       104        5       105         0         0          0    T    bgp_announce_route_timer_expired
    0          0.139        81        1        20        2        21         0         0          0    T    update_subgroup_merge_check_thread_cb
    0        405.889    291687        1       179        1       450         0         0          0    T    (bgp_generate_updgrp_packets)
    0          0.682       618        1         6        1         9         0         0          0   W     bgp_connect_check
    0          3.888       103       37        81       38        82         0         0          0    T    (bgp_start_timer)
    0          0.074         1       74        74      458       458         0         0          0    T    bgp_start_label_manager
    0          0.000         1        0         0        1         1         0         0          0    T    bgp_sync_label_manager
    0          0.121         3       40        54      100       141         0         0          0     E   bgp_process_conn_error
    0          0.060         2       30        49       30        50         0         0          0     E   zclient_connect
    0          0.354         1      354       354      355       355         0         0          0    T    bgp_config_finish
    0          0.283         1      283       283      284       284         0         0          0    T    subgroup_coalesce_timer
    0      29365.962      1805    16269     99445    16273     99454         0         0          0    TE   work_queue_run
    0        185.532      3097       59       497       94     26107         0         0          0     E   bgp_event
    1          0.290         8       36       151       37       158         0         0          0  R      vtysh_accept
    2          9.462       548       17       320       17       322         0         0          0  R      bgp_accept
    2         40.219      8283        4       128        5       128         0         0          0  R      zclient_read
    0          0.322      1031        0         4        0         5         0         0          0    T    (bgp_routeadv_timer)
    0        356.812       637      560      3007      560      3007         0         0          0     E   bgp_handle_route_announcements_to_zebra

Showing statistics for pthread BGP I/O thread
---------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
  515         62.965     14335        4       103        5       181         0         0          0  R      bgp_process_reads
    0       1986.041    219813        9       213        9       315         0         0          0   W     bgp_process_writes

Event statistics for zebra:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0          0.006         1        6         6        7         7         0         0          0     E   frr_config_read_in
    0       3673.365      2044     1797    259281     1800    261342         0         0          0     E   zserv_process_messages
    1        651.846      8041       81      1090       82      1233         0         0          0    T    wheel_timer_thread
    0         38.184        18     2121     21345     2122     21346         0         0          0     E   msg_conn_proc_msgs
    1          0.651        12       54       112       55       112         0         0          0  R      vtysh_accept
    0          0.102         2       51        55       51        56         0         0          0    T    timer_walk_start
    0        202.721      1577      128     29172      141     29226         0         0          0     E   rib_process_dplane_results
    1         41.650      6645        6       140        6       140         0         0          0  R      rtadv_read
    1         22.518      1969       11       106       12       154         0         0          0  R      kernel_read
    0          4.265        48       88      1465       89      1466         0         0          0  R      vtysh_read
    0       6099.851       650     9384     28313     9390     28314         0         0          0    TE   work_queue_run
    0          5.104       521        9        30       10        31         0         0          0    T    timer_walk_continue
    0          3.078       520        5        53        6        55         0         0          0    T    if_zebra_speed_update
    0          0.005         1        5         5        5         5         0         0          0    T    rib_sweep_route
    0          0.034         1       34        34       35        35         0         0          0     E   msg_client_connect_timer
    1          1.641         9      182       214      183       215         0         0          0  R      zserv_accept
    0          0.358       274        1         6        2         6         0         0          0   W     msg_conn_write
    1          0.159        18        8        54        9        54         0         0          0  R      msg_conn_read

Showing statistics for pthread Zebra dplane thread
--------------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0        301.404      7280       41      1878       41      1878         0         0          0     E   dplane_thread_loop
    0          0.048         1       48        48       49        49         0         0          0     E   dplane_incoming_request
    1          9.558       727       13      4659       14      5420         0         0          0  R      dplane_incoming_read

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-25 10:47:39 -04:00
Mark Stapp 39bb12299c bgpd: fix SA warnings in bgp clearing code
Fix a possible use-after-free in the recent bgp batch
clearing code, CID 1639091.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-03-25 09:32:14 -04:00
Russ White 7afe25744b
Merge pull request #18447 from donaldsharp/bgp_clear_batch
Bgp clear batch
2025-03-24 16:13:49 -04:00
Philippe Guibert 99acebcdc9 bgpd: fix check validity of a VPN SRv6 route with modified nexthop
When exporting a VPN SRv6 route, the path may not be considered valid if
the nexthop is not valid. This is the case when the 'nexthop vpn export'
command is used. The below example illustrates that the VPN path to
2001:1::/64 is not selected, as the expected nexthop to find in vrf10 is
the one configured:

> # show running-config
> router bgp 1 vrf vrf10
>  address-family ipv6 unicast
>   nexthop vpn export 2001::1

> # show bgp ipv6 vpn
> [..]
> Route Distinguisher: 1:10
>      2001:1::/64      2001::1@4                0             0 65001 i
>     UN=2001::1 EC{99:99} label=16 sid=2001:db8:1:1:: sid_structure=[40,24,16,0] type=bgp, subtype=5

The analysis indicates that the 2001::1 nexthop is considered.

> 2025/03/20 21:47:53.751853 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0
> 2025/03/20 21:47:53.751855 BGP: [VWNP2-DNMFV] Found existing bnc 2001::1/128(0)(VRF vrf10) flags 0x82 ifindex 0 #paths 2 peer 0x0, resolved prefix UNK prefix
> 2025/03/20 21:47:53.751856 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is not valid (in VRF vrf10)
> 2025/03/20 21:47:53.751857 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change

Actually, to check the nexthop validity, only the source path in the VRF
has the correct nexthop. Fix this by reusing the source path information
instead of the current one.

> 2025/03/20 22:43:51.703521 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0
> 2025/03/20 22:43:51.703523 BGP: [VWNP2-DNMFV] Found existing bnc fe80::b812:37ff:fe13:d441/128(0)(VRF vrf10) flags 0x87 ifindex 0 #paths 2 peer 0x0, resolved prefix fe80::/64
> 2025/03/20 22:43:51.703525 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is valid (in VRF vrf10)
> 2025/03/20 22:43:51.703526 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-03-24 09:17:01 +01:00
Donatas Abraitis 44c4743e08
Merge pull request #18378 from Tuetuopay/fix-route-map-gateway-ip
bgpd: fix `set evpn gateway-ip ipv[46]` route-map
2025-03-23 12:38:38 +02:00
Mark Stapp c527882012 bgpd: Allow batch clear to do partial work and continue later
Modify the batch clear code to be able to stop after processing
some of the work and to pick back up again.  This will allow
the very expensive nature of the batch clearing to be spread out
and allow bgp to continue to be responsive.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-03-20 09:33:52 -04:00
Tuetuopay 7320659f78 bgpd: fix evpn attributes being dropped on input
All assignments of the EVPN attributes (ESI and Gateway IP) are gated
behind the peer being set up for inbound soft-reconfiguration.

There are no actual reasons for this limitation, so let's perform the
EVPN attribute assignment no matter what when soft reconfiguration is
not enabled.

Fixes: 6e076ba523 ("bgpd: Fix for ain->attr corruption during path update")
Signed-off-by: Tuetuopay <tuetuopay@me.com>
2025-03-20 10:23:17 +01:00
Donald Sharp 9651b159cc bgpd: Fix leaked memory when showing some bgp routes
The two memory leaks:

==387155== 744 (48 direct, 696 indirect) bytes in 1 blocks are definitely lost in loss record 222 of 262
==387155==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387155==    by 0x4C1B982: json_object_new_object (in /usr/lib/x86_64-linux-gnu/libjson-c.so.5.1.0)
==387155==    by 0x2E4146: peer_adj_routes (bgp_route.c:15245)
==387155==    by 0x2E4F1A: show_ip_bgp_instance_neighbor_advertised_route_magic (bgp_route.c:15549)
==387155==    by 0x2B982B: show_ip_bgp_instance_neighbor_advertised_route (bgp_route_clippy.c:722)
==387155==    by 0x4915E6F: cmd_execute_command_real (command.c:1003)
==387155==    by 0x4915FE8: cmd_execute_command (command.c:1062)
==387155==    by 0x4916598: cmd_execute (command.c:1228)
==387155==    by 0x49EB858: vty_command (vty.c:626)
==387155==    by 0x49ED77C: vty_execute (vty.c:1389)
==387155==    by 0x49EFFA7: vtysh_read (vty.c:2408)
==387155==    by 0x49E4156: event_call (event.c:2019)
==387155==    by 0x4958ABD: frr_run (libfrr.c:1247)
==387155==    by 0x206A68: main (bgp_main.c:557)
==387155==
==387155== 2,976 (192 direct, 2,784 indirect) bytes in 4 blocks are definitely lost in loss record 240 of 262
==387155==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387155==    by 0x4C1B982: json_object_new_object (in /usr/lib/x86_64-linux-gnu/libjson-c.so.5.1.0)
==387155==    by 0x2E45CA: peer_adj_routes (bgp_route.c:15325)
==387155==    by 0x2E4F1A: show_ip_bgp_instance_neighbor_advertised_route_magic (bgp_route.c:15549)
==387155==    by 0x2B982B: show_ip_bgp_instance_neighbor_advertised_route (bgp_route_clippy.c:722)
==387155==    by 0x4915E6F: cmd_execute_command_real (command.c:1003)
==387155==    by 0x4915FE8: cmd_execute_command (command.c:1062)
==387155==    by 0x4916598: cmd_execute (command.c:1228)
==387155==    by 0x49EB858: vty_command (vty.c:626)
==387155==    by 0x49ED77C: vty_execute (vty.c:1389)
==387155==    by 0x49EFFA7: vtysh_read (vty.c:2408)
==387155==    by 0x49E4156: event_call (event.c:2019)
==387155==    by 0x4958ABD: frr_run (libfrr.c:1247)
==387155==    by 0x206A68: main (bgp_main.c:557)

For the 1st one, if the operator issues a advertised-routes command, the
json_ar variable was never being freed.

For the 2nd one, if the operator issued a command where the
output_count_per_rd is 0, we need to free the json_routes value.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-19 16:53:50 -04:00
Donatas Abraitis f47b2fb94a bgpd: Move stale Adj-RIB-Out paths removal to subgroup_process_announce_selected()
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-03-17 16:02:16 +02:00
Manpreet Kaur bc1008b970 bgpd: Fixed crash upon bgp network import-check command
BT:
```
3  <signal handler called>
4  0x00005616837546fc in bgp_static_update (bgp=bgp@entry=0x5616865eac50, p=0x561686639e40,
    bgp_static=0x561686639f50, afi=afi@entry=AFI_IP6, safi=safi@entry=SAFI_UNICAST) at ../bgpd/bgp_route.c:7232
5  0x0000561683754ad0 in bgp_static_add (bgp=0x5616865eac50) at ../bgpd/bgp_table.h:413
6  0x0000561683785e2e in no_bgp_network_import_check (self=<optimized out>, vty=0x5616865e04c0,
    argc=<optimized out>, argv=<optimized out>) at ../bgpd/bgp_vty.c:4609
7  0x00007fdbcc294820 in cmd_execute_command_real (vline=vline@entry=0x561686663000,
```

The program encountered a SEG FAULT when attempting to access pi->extra->vrfleak->bgp_orig because
pi->extra->vrfleak was NULL.
```
(gdb) p pi->extra->vrfleak
$1 = (struct bgp_path_info_extra_vrfleak *) 0x0
(gdb) p pi->extra->vrfleak->bgp_orig
Cannot access memory at address 0x8
```
Added NOT NULL check on pi->extra->vrfleak before accessing pi->extra->vrfleak->bgp_orig
to prevent the segmentation fault.

Signed-off-by: Manpreet Kaur <manpreetk@nvidia.com>
2025-03-14 05:40:16 -07:00
Donald Sharp 3fb72a03c3 bgpd: Show bgp <afi> <safi> shouldn't display peers in groups
The command `show bgp <afi> <safi>` has this output:

r1# show bgp ipv4 uni 10.0.0.0
BGP routing table entry for 10.0.0.0/32, version 1
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  r1-eth0 r1-eth1 r1-eth2 r1-eth3
  ....

It specifically states `Advertised to non peer-group peers:` yet
the code is not filtering those out.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-13 15:19:02 -04:00
Mark Stapp 58f924d287 bgpd: batch peer connection error clearing
When peer connections encounter errors, attempt to batch some
of the clearing processing that occurs. Add a new batch object,
add multiple peers to it, if possible. Do one rib walk for the
batch, rather than one walk per peer. Use a handler callback
per batch to check and remove peers' path-infos, rather than
a work-queue and callback per peer. The original clearing code
remains; it's used for single peers.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-03-12 12:42:06 -04:00
Donald Sharp 0758aa10a8 bgpd: Fix dead code in bgp_route.c #1637664
Coverity rightly points out that the worse pointer
cannot be null in this section of code.  Fix it.

Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
2025-03-06 09:59:19 -05:00
Russ White 6f48c7d785
Merge pull request #18301 from pguibert6WIND/vpn_prefix_aggregate_export_and_accept
Vpn prefix aggregate export and accept
2025-03-04 09:39:52 -05:00
Russ White 21dc0a4d16
Merge pull request #17961 from opensourcerouting/fix/bgp_reject_as_aggregate
bgpd: Do not advertise aggregate routes to contributing ASes
2025-03-04 09:18:38 -05:00
Philippe Guibert bf15730267 bgpd: fix syncs suppressed prefixes in VPN environments
By using the summary-only option for aggregated prefixes, the suppressed
prefixes are however exported as VPN prefixes, whereas they should not.

> r1# show bgp vrf vrf1 ipv4
> [..]
>  *>  172.31.1.0/24    0.0.0.0                  0         32768 ?
>  s>  172.31.1.1/32    0.0.0.0                  0         32768 ?
>  s>  172.31.1.2/32    0.0.0.0                  0         32768 ?
>  s>  172.31.1.3/32    0.0.0.0                  0         32768 ?
> [..]
> r1#
>
> r1# show bgp ipv4 vpn
> [..]
>  *>  172.31.1.0/24    0.0.0.0@4<               0         32768 ?
>     UN=0.0.0.0 EC{52:100} label=101 type=bgp, subtype=5
>  *>  172.31.1.1/32    0.0.0.0@4<               0         32768 ?
>     UN=0.0.0.0 EC{52:100} label=101 type=bgp, subtype=5
>  *>  172.31.1.2/32    0.0.0.0@4<               0         32768 ?
>     UN=0.0.0.0 EC{52:100} label=101 type=bgp, subtype=5
>  *>  172.31.1.3/32    0.0.0.0@4<               0         32768 ?
>     UN=0.0.0.0 EC{52:100} label=101 type=bgp, subtype=5
> [..]
> r1#

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-03-03 20:43:11 +01:00
Philippe Guibert 32088c43a8 bgpd: fix remove vpn aggregated prefix upon unconfiguration
When unconfiguring an aggregated prefix, the VPN prefix is not
removed. Fix this by refreshing the VPN leak when the aggregated route
is or is not available.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-03-03 20:43:11 +01:00
Donald Sharp 83ad94694b bgpd: remove dmed check not required in bestpath selection
As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the snippet of the code for dmed check condition
left out, which leads to an issue of selecting incorrect bestpath.

As an example:

During the bestpath selection local route looses to another path due
to dmed condition being hit.

The snippet of the logs:

2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV]
[2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path
27.0.0.7 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0   <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-20 14:28:15 -05:00
Donatas Abraitis d8ea27a188
Merge pull request #18122 from louis-6wind/bgp_cleanup_table-factorize
bgpd: factorize bgp_table_cleanup()
2025-02-14 23:08:51 +02:00
Donald Sharp b9ac2c7b2f
Merge pull request #18080 from opensourcerouting/fix/enable_ll_capability_if_using_unnumerred
bgpd: Some fixes/improvements for Link-Local Next Hop capability
2025-02-13 14:06:24 -05:00
Louis Scalbert 85784be2ec bgpd: remove unused afi arg from bgp_cleanup_table
Remove unused AFI argument from bgp_cleanup_table()

Fixes: ec6e09c271 ("bgpd: fix flushing ipv6 flowspec entries when peering stops")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-13 16:04:33 +01:00
Louis Scalbert b5014d371c bgpd: factorize bgp_table_cleanup()
Factorize bgp_table_cleanup(). Cosmetic change. It will help adding
AFI / SAFI in the future.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-13 15:59:06 +01:00
Louis Scalbert cf0269649c bgpd: fix incorrect json in bgp_show_table_rd
In bgp_show_table_rd(), the is_last argument is determined using the
expression "next == NULL" to check if the RD table is the last one. This
helps ensure proper JSON formatting.

However, if next is not NULL but is no longer associated with a BGP
table, the JSON output becomes malformed.

Updates the condition to also verify the existence of the next bgp_dest
table.

Fixes: 1ae44dfcba ("bgpd: unify 'show bgp' with RD with normal unicast bgp show")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2025-02-12 13:34:21 +01:00
Donatas Abraitis ca24f56a5c bgpd: Add an ability to disable link-local capability per peer
Even if we have unnumbered peering, let's respect `no neighbor X capability link-local`
and disable it per-neighbor on demand.

Fixes: db853cc97e ("bgpd: Implement Link-Local Next Hop capability")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-02-11 17:01:56 +02:00
Russ White b040b83c35
Merge pull request #17870 from opensourcerouting/fix/bgp_show_ip_bgp_cmd_internal
bgpd: Show internal data for BGP routes
2025-02-11 08:44:43 -05:00
Donatas Abraitis 28c337de18 bgpd: Add additional JSON field linkLocalOnly
This is to show if the NH address received as Link-Local Next Hop capability.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-02-11 13:13:40 +02:00
Donatas Abraitis 5f8ad4045c bgpd: Show scope as link-local if using LL Next Hop capability
Fixes: db853cc97e ("bgpd: Implement Link-Local Next Hop capability")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-02-11 13:13:39 +02:00
Russ White 2ef76a3350
Merge pull request #17871 from opensourcerouting/feature/bgp_link_local_capability
bgpd: Implement Link-Local Next Hop capability
2025-02-07 14:00:59 -05:00
Jafar Al-Gharaibeh 8d71ce9d7d
Merge pull request #18000 from donaldsharp/bgp_eoiu_mem_leak
bgpd: Fix up memory leak in processing eoiu marker
2025-02-04 23:20:42 -06:00
Russ White e6228f7880
Merge pull request #17896 from opensourcerouting/fix/bgp_oad_extended_communities
bgpd: Send non-transitive extended communities from/to OAD peers
2025-02-04 11:42:16 -05:00
Donald Sharp c6b7a993fb bgpd: Fix up memory leak in processing eoiu marker
Memory is being leaked when processing the eoiu marker.
BGP is creating a dummy dest to contain the data but
it was never freed.  As well as the eoiu info was
not being freed either.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-04 10:56:59 -05:00
Donald Sharp 4e8eda74ec bgpd: With suppress-fib-pending ensure withdrawal is sent
When you have suppress-fib-pending turned on it is possible
to end up in a situation where the prefix is not withdrawn
from downstream peers.

Here is the timing that I believe is happening:

a) have 2 paths to a peer.
b) receive a withdrawal from 1 path, set BGP_NODE_FIB_INSTALL_PENDING
   and send the route install to zebra.
c) receive a withdrawal from the other path.
d) At this point we have a dest->flags set BGP_NODE_FIB_INSTALL_PENDING
   old_select the path_info going away, new_select is NULL
e) A bit further down we call group_announce_route() which calls
   the code to see if we should advertise the path.  It sees the
   BGP_NODE_FIB_INSTALL_PENDING flag and says, nope.
f) the route is sent to zebra to withdraw, which unsets the
   BGP_NODE_FIB_INSTALL_PENDING.
g) This function winds up and deletes the path_info.  Dest now
   has no path infos.
h) BGP receives the route install(from step b) and unsets the
   BGP_NODE_FIB_INSTALL_PENDING flag
i) BGP receives the route removed from zebra (from step f) and
   unsets the flag again.

We know if there is no new_select, let's go ahead and just
unset the PENDING flag to allow the withdrawal to go out
at the time when the second withdrawal is received.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-31 18:53:30 -05:00
Donatas Abraitis 925b365a87 bgpd: Do not advertise aggregate routes to contributing ASes
draft-ietf-idr-deprecate-as-set-confed-set-16 defines that we MUST NOT
advertise an aggregate prefix to the contributing ASes.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-01-31 09:43:00 +02:00
Donatas Abraitis ee67699bd7
Merge pull request #17905 from pguibert6WIND/advertised_routes_incorrect_json
Advertised routes incorrect json
2025-01-27 23:32:33 +02:00
Philippe Guibert e78a049c49 bgpd: fix missing braces when dumping json vpn advertised-routes
The json output of advertised-routes is incorrect, as there is a missing
brace with route-distinguisher:

observed with the bgp_vpnv4_noretain test:
> "bgpTableVersion":0,"bgpLocalRouterId":"192.0.2.1","defaultLocPrf":100,"localAS":65500,
> "advertisedRoutes": "192.0.2.1:1":{"rd":"192.0.2.1:1","10.101.0.0/24":{"prefix":"10.101.0.0/24",

expected:
> "bgpTableVersion":0,"bgpLocalRouterId":"192.0.2.1","defaultLocPrf":100,"localAS":65500,
> "advertisedRoutes": { "192.0.2.1:1":{"rd":"192.0.2.1:1","10.101.0.0/24":{"prefix":"10.101.0.0/24",
>                     ^
>                     missing brace

Fix this by adding the missing braces.

Fixes: 4838bac033 ("bgpd: neighbors received-routes/advertised-routes stringify changes")

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-01-27 11:50:32 +01:00
Philippe Guibert c742817312 bgpd: fix add json attribute to reflect suppressed path
When aggregate is used, the suppressed information is not displayed in
the json attributes of a given path. To illustrate, the dump of the
192.168.2.1/32 path in the bgp_aggregate_address_topo1 topotest:

> # show bgp ipv4
> [..]
>  s> 192.168.2.1/32   10.0.0.2                               0 65001 i
>
> # show bgp ipv4 detail
> [..]
> BGP routing table entry for 192.168.2.1/32, version 17
> Paths: (1 available, best #1, table default, vrf (null), Advertisements suppressed by an aggregate.)
>   Not advertised to any peer
>   65001            <---- missing suppressed flag
>     10.0.0.2 from 10.0.0.2 (10.254.254.3)
>       Origin IGP, valid, external, best (First path received)
>       Last update: Fri Jan 24 13:11:41 2025
>
> # show bgp ipv4 detail json
> [..]
> ,"192.168.2.1/32": [{"aspath":{"string":"65001","segments":[{"type":"as-sequence","list":[65001]}],"length":1},"origin":"IGP","valid":true,"version":17,
> "bestpath":{"overall":true,"selectionReason":"First path received"},                <---- missing suppressed flag
> "lastUpdate":{"epoch":1737720700,"string":"Fri Jan 24 13:11:40 2025\n"},
> "nexthops":[{"ip":"10.0.0.2","afi":"ipv4","metric":0,"accessible":true,"used":true}],
> "peer":{"peerId":"10.0.0.2","routerId":"10.254.254.3","type":"external"}}]

Fix this by adding the json information.

> # show bgp ipv4 detail
> [..]
> BGP routing table entry for 192.168.2.1/32, version 17
> Paths: (1 available, best #1, table default, vrf (null), Advertisements suppressed by an aggregate.)
>   Not advertised to any peer
>   65001, (suppressed)
>     10.0.0.2 from 10.0.0.2 (10.254.254.3)
>       Origin IGP, valid, external, best (First path received)
>       Last update: Fri Jan 24 13:11:41 2025
>
> # show bgp ipv4 detail json
> [..]
> ,"192.168.2.1/32": [{"aspath":{"string":"65001","segments":[{"type":"as-sequence","list":[65001]}],"length":1},"suppressed":true,"origin":"IGP","valid":true,"version":17,
> "bestpath":{"overall":true,"selectionReason":"First path received"},
> "lastUpdate":{"epoch":1737720991,"string":"Fri Jan 24 13:16:31 2025"},
> "nexthops":[{"ip":"10.0.0.2","afi":"ipv4","metric":0,"accessible":true,"used":true}],"peer":{"peerId":"10.0.0.2","routerId":"10.254.254.3","type":"external"}}]

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-01-27 11:47:41 +01:00
Donatas Abraitis f19b843e9e
Merge pull request #17652 from pguibert6WIND/topotest_bgp_evpn_rt5
bgpd, tests: bgp_evpn_rt5, add test with match evpn vni command
2025-01-23 13:12:35 +02:00
Chirag Shah c358b92c9f bgpd: fix evpn path info get api
EVPN imported routes AF is not AF_EVPN, in that case
the path info extra with EVPN is nver created.
In order to create evpn info under path extra, define
new api which does not specifically check for AF_EVPN
afi type.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2025-01-21 15:17:20 -08:00
Donatas Abraitis f2759c46ce bgpd: Send non-transitive extended communities from/to OAD peers
draft-uttaro-idr-bgp-oad says:

Extended communities which are non-transitive across an AS boundary MAY be
advertised over an EBGP-OAD session if allowed by explicit policy configuration.
If allowed, all the members of the OAD SHOULD be configured to use the same
criteria.
For example, the Origin Validation State Extended Community, defined as
non-transitive in [RFC8097], can be advertised to peers in the same OAD.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-01-21 21:17:39 +02:00
Philippe Guibert 6855bf2232 bgpd: fix bgp evpn memory leaks when adj-rib-in is disabled
Some bgp evpn memory contexts are not freed at the end of the bgp
process.

> =================================================================
> ==1208677==ERROR: LeakSanitizer: detected memory leaks
>
> Direct leak of 96 byte(s) in 2 object(s) allocated from:
>     #0 0x7f93ad4b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7f93ace77233 in qcalloc lib/memory.c:106
>     #2 0x563bb68f4df1 in process_type5_route bgpd/bgp_evpn.c:5084
>     #3 0x563bb68fb663 in bgp_nlri_parse_evpn bgpd/bgp_evpn.c:6302
>     #4 0x563bb69ea2a9 in bgp_nlri_parse bgpd/bgp_packet.c:347
>     #5 0x563bb69f7716 in bgp_update_receive bgpd/bgp_packet.c:2482
>     #6 0x563bb6a04d3b in bgp_process_packet bgpd/bgp_packet.c:4091
>     #7 0x7f93acf8082d in event_call lib/event.c:1996
>     #8 0x7f93ace48931 in frr_run lib/libfrr.c:1232
>     #9 0x563bb6880ae1 in main bgpd/bgp_main.c:557
>     #10 0x7f93ac829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Actually, the bgp evpn context may noy be used if adj rib in is unused.
This may lead to memory leaks. Fix this by freeing the context in those
corner cases.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-01-21 13:48:36 +01:00