matthieu/frr - Forgejo: Beyond coding. We Forge.

Author	SHA1	Message	Date
Donald Sharp	5d4c7d2ece	Merge pull request #18675 from mjstapp/fix_clang_18_warnings lib,pimd,bgpd,bfdd: Fix clang 18 warnings	2025-04-17 18:38:21 -04:00
Mark Stapp	2d42318625	bfdd, bgpd: clean up clang warnings Clean up some clang compiler warnings. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-04-16 13:50:21 -04:00
David Lamparter	d46909e50f	bgpd: fix misused rfapi conditional bgpd/bgpd.c:8975:5: error: "ENABLE_BGP_VNC" is not defined, evaluates to 0 [-Werror=undef] 8975 \| #if ENABLE_BGP_VNC Fixes: FRRouting#18546 Fixes: `1629c05924` ("bgpd: rfapi: track outstanding rib and import timers, free mem at exit") Cc: G. Paul Ziemba <paulz@labn.net> Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2025-04-16 12:47:36 +02:00
Donatas Abraitis	7ef9394c33	Merge pull request #18653 from louis-6wind/fix-bgp-pbr-mem-leaks bgpd: fix pbr memory leaks	2025-04-15 10:07:14 +03:00
Mark Stapp	81b472bd79	lib,bgpd: clean up clang warnings Clean up a couple of clang compiler warnings (this was clang 18) Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-04-14 15:32:08 -04:00
Louis Scalbert	eb8281aeb8	bgpd: fix bgp_pbr_or_filter memory leaks Note that bgp_pbr_policyroute_add_from_zebra() and bgp_pbr_policyroute_remove_from_zebra() are only called from bgp_pbr_handle_entry(). > ==966967==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 40 byte(s) in 1 object(s) allocated from: > #0 0x7fd447ab4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 > #1 0x7fd44746f8dd in qcalloc lib/memory.c:105 > #2 0x7fd44744401a in list_new lib/linklist.c:49 > #3 0x560f8c094490 in bgp_pbr_handle_entry bgpd/bgp_pbr.c:2818 > #4 0x560f8c095993 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2941 > #5 0x560f8c2164f3 in bgp_zebra_announce bgpd/bgp_zebra.c:1618 > #6 0x560f8c0bd668 in bgp_process_main_one bgpd/bgp_route.c:3691 > #7 0x560f8c0be7fe in process_subq_other_route bgpd/bgp_route.c:3856 > #8 0x560f8c0bf280 in process_subq bgpd/bgp_route.c:3955 > #9 0x560f8c0bf320 in meta_queue_process bgpd/bgp_route.c:3980 > #10 0x7fd44759fdfc in work_queue_run lib/workqueue.c:282 > #11 0x7fd4475779b2 in event_call lib/event.c:2011 > #12 0x7fd447442ff1 in frr_run lib/libfrr.c:1216 > #13 0x560f8bef0a15 in main bgpd/bgp_main.c:545 > #14 0x7fd446e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 > > Direct leak of 40 byte(s) in 1 object(s) allocated from: > #0 0x7fd447ab4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 > #1 0x7fd44746f8dd in qcalloc lib/memory.c:105 > #2 0x7fd44744401a in list_new lib/linklist.c:49 > #3 0x560f8c09439d in bgp_pbr_handle_entry bgpd/bgp_pbr.c:2812 > #4 0x560f8c095993 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2941 > #5 0x560f8c2164f3 in bgp_zebra_announce bgpd/bgp_zebra.c:1618 > #6 0x560f8c0bd668 in bgp_process_main_one bgpd/bgp_route.c:3691 > #7 0x560f8c0be7fe in process_subq_other_route bgpd/bgp_route.c:3856 > #8 0x560f8c0bf280 in process_subq bgpd/bgp_route.c:3955 > #9 0x560f8c0bf320 in meta_queue_process bgpd/bgp_route.c:3980 > #10 0x7fd44759fdfc in work_queue_run lib/workqueue.c:282 > #11 0x7fd4475779b2 in event_call lib/event.c:2011 > #12 0x7fd447442ff1 in frr_run lib/libfrr.c:1216 > #13 0x560f8bef0a15 in main bgpd/bgp_main.c:545 > #14 0x7fd446e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 > > Direct leak of 4 byte(s) in 1 object(s) allocated from: > #0 0x7fd447ab4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 > #1 0x7fd44746f8dd in qcalloc lib/memory.c:105 > #2 0x560f8c080cec in bgp_pbr_extract_enumerate_unary bgpd/bgp_pbr.c:362 > #3 0x560f8c080f7e in bgp_pbr_extract_enumerate bgpd/bgp_pbr.c:400 > #4 0x560f8c094530 in bgp_pbr_handle_entry bgpd/bgp_pbr.c:2819 > #5 0x560f8c095993 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2941 > #6 0x560f8c2164f3 in bgp_zebra_announce bgpd/bgp_zebra.c:1618 > #7 0x560f8c0bd668 in bgp_process_main_one bgpd/bgp_route.c:3691 > #8 0x560f8c0be7fe in process_subq_other_route bgpd/bgp_route.c:3856 > #9 0x560f8c0bf280 in process_subq bgpd/bgp_route.c:3955 > #10 0x560f8c0bf320 in meta_queue_process bgpd/bgp_route.c:3980 > #11 0x7fd44759fdfc in work_queue_run lib/workqueue.c:282 > #12 0x7fd4475779b2 in event_call lib/event.c:2011 > #13 0x7fd447442ff1 in frr_run lib/libfrr.c:1216 > #14 0x560f8bef0a15 in main bgpd/bgp_main.c:545 > #15 0x7fd446e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 > > Direct leak of 4 byte(s) in 1 object(s) allocated from: > #0 0x7fd447ab4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 > #1 0x7fd44746f8dd in qcalloc lib/memory.c:105 > #2 0x560f8c080cec in bgp_pbr_extract_enumerate_unary bgpd/bgp_pbr.c:362 > #3 0x560f8c080f7e in bgp_pbr_extract_enumerate bgpd/bgp_pbr.c:400 > #4 0x560f8c09443d in bgp_pbr_handle_entry bgpd/bgp_pbr.c:2813 > #5 0x560f8c095993 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2941 > #6 0x560f8c2164f3 in bgp_zebra_announce bgpd/bgp_zebra.c:1618 > #7 0x560f8c0bd668 in bgp_process_main_one bgpd/bgp_route.c:3691 > #8 0x560f8c0be7fe in process_subq_other_route bgpd/bgp_route.c:3856 > #9 0x560f8c0bf280 in process_subq bgpd/bgp_route.c:3955 > #10 0x560f8c0bf320 in meta_queue_process bgpd/bgp_route.c:3980 > #11 0x7fd44759fdfc in work_queue_run lib/workqueue.c:282 > #12 0x7fd4475779b2 in event_call lib/event.c:2011 > #13 0x7fd447442ff1 in frr_run lib/libfrr.c:1216 > #14 0x560f8bef0a15 in main bgpd/bgp_main.c:545 > #15 0x7fd446e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2025-04-14 13:52:02 +02:00
Louis Scalbert	442d8bce36	bgpd: fix bgp_pbr_rule memory leak Fix bgp_pbr_rule memory leak. Found by code review. Fixes: `27e376d4e1` ("bgpd: an hash list of pbr iprule is created") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2025-04-14 13:09:36 +02:00
Louis Scalbert	8d9df5cf04	bgpd: fix bgp_pbr_match memory leak > Direct leak of 1144 byte(s) in 13 object(s) allocated from: > #0 0x7f3eedeb4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 > #1 0x7f3eed86f8dd in qcalloc lib/memory.c:105 > #2 0x55b32d236faf in bgp_pbr_match_alloc_intern bgpd/bgp_pbr.c:1074 > #3 0x7f3eed817d79 in hash_get lib/hash.c:147 > #4 0x55b32d242d9a in bgp_pbr_policyroute_add_to_zebra_unit bgpd/bgp_pbr.c:2486 > #5 0x55b32d244436 in bgp_pbr_policyroute_add_to_zebra bgpd/bgp_pbr.c:2672 > #6 0x55b32d245a05 in bgp_pbr_handle_entry bgpd/bgp_pbr.c:2843 > #7 0x55b32d246912 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2939 > #8 0x55b32d3c7472 in bgp_zebra_announce bgpd/bgp_zebra.c:1618 > #9 0x55b32d26e5e7 in bgp_process_main_one bgpd/bgp_route.c:3691 > #10 0x55b32d26f77d in process_subq_other_route bgpd/bgp_route.c:3856 > #11 0x55b32d2701ff in process_subq bgpd/bgp_route.c:3955 > #12 0x55b32d27029f in meta_queue_process bgpd/bgp_route.c:3980 > #13 0x7f3eed99fdd8 in work_queue_run lib/workqueue.c:282 > #14 0x7f3eed97798e in event_call lib/event.c:2011 > #15 0x7f3eed842ff1 in frr_run lib/libfrr.c:1216 > #16 0x55b32d0a1a15 in main bgpd/bgp_main.c:545 > #17 0x7f3eed229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Fixes: `d114b0d739` ("bgpd: inject policy route entry from bgp into zebra pbr entries.") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2025-04-14 13:09:35 +02:00
Louis Scalbert	520518c3ad	bgpd: fix bgp_pbr_match_entry memory leak > ==238132==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 160 byte(s) in 1 object(s) allocated from: > #0 0x7fd79f0b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 > #1 0x7fd79ea6f8dd in qcalloc lib/memory.c:105 > #2 0x5586b26995f9 in bgp_pbr_match_entry_alloc_intern bgpd/bgp_pbr.c:1155 > #3 0x7fd79ea17d79 in hash_get lib/hash.c:147 > #4 0x5586b26a551d in bgp_pbr_policyroute_add_to_zebra_unit bgpd/bgp_pbr.c:2522 > #5 0x5586b26a6436 in bgp_pbr_policyroute_add_to_zebra bgpd/bgp_pbr.c:2672 > #6 0x5586b26a8089 in bgp_pbr_handle_entry bgpd/bgp_pbr.c:2876 > #7 0x5586b26a8912 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2939 > #8 0x5586b2829472 in bgp_zebra_announce bgpd/bgp_zebra.c:1618 > #9 0x5586b282ab4b in bgp_zebra_announce_table bgpd/bgp_zebra.c:1766 > #10 0x5586b2824b99 in bgp_zebra_tm_connect bgpd/bgp_zebra.c:1091 > #11 0x7fd79eb7798e in event_call lib/event.c:2011 > #12 0x7fd79ea42ff1 in frr_run lib/libfrr.c:1216 > #13 0x5586b2503a15 in main bgpd/bgp_main.c:545 > #14 0x7fd79e429d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Fixes: `d114b0d739` ("bgpd: inject policy route entry from bgp into zebra pbr entries.") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2025-04-14 13:09:34 +02:00
Donald Sharp	dcf43ae009	bgpd: Prevent crash when issuing a show rpki connections When attempting to check rpki status and the connection has been turned off, let's check to see if we are connected before we ask the rpki subsystem, else we will get a crash in the rpki library. Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>	2025-04-12 16:59:56 -04:00
Donatas Abraitis	dc37bff8c0	Merge pull request #18376 from pguibert6WIND/show_bgp_neighbor_counter_tx bgpd: fix add prefix sent in 'show bgp neighbor'	2025-04-10 19:38:59 +03:00
Donatas Abraitis	cf351d6d05	Merge pull request #18611 from pguibert6WIND/bgp_usid bgpd: add usid behavior for bgp srv6 instructions	2025-04-10 19:34:44 +03:00
Donatas Abraitis	2f0b8ff1ea	Merge pull request #18624 from louis-6wind/remove-afi2family bgpd: remove useless calls to afi2family	2025-04-10 17:14:08 +03:00
Mark Stapp	0e8e0c5fb9	Merge pull request #18594 from soumyar-roy/soumya/netwithdraw bgpd: Paths not deleted received from shutdown peer	2025-04-10 10:07:32 -04:00
Philippe Guibert	a47a53b003	bgpd: fix add prefix sent in 'show bgp neighbor' The 'acceptedPrefixCounter' is available in 'show bgp neighbor json', but there is no equivalent when using the non json output. Add it. > # show bgp neighbor > [..] > Community attribute sent to this neighbor(all) > 0 accepted prefixes, 1 sent prefixes Fixes: `856ca177c4` ("Added json formating support to show-...-neighbors-... bgp commands.") Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2025-04-10 10:12:36 +02:00
Soumya Roy	d2bec7a691	bgpd: Paths, received from shutdown peer, not deleted Issue: In a scaled setup, (where number of nets > BGP_CLEARING_BATCH_MAX_DESTS for walk_batch_table_helper), when peer is shutdown, it is seen some of the paths are not deleted, which are received from that peer. Fix: This is due to, in clear_batch_rib_helper, once walk_batch_table_helper returns after BGP_CLEARING_BATCH_MAX_DESTS is reached, we just break from inner loop for the afi/safi for loops. So during walk for next afi/safi that 'ret' state is overwritten with new state. Also the resume context is overwritten. This causes to lose the start point for next walk, some nets are skipped forever. So they are not marked for deletion anymore. To fix this, we immediately return from current run. This will have resume state to be stored correctly, and next walk will start from there. Testing: 32 ecmp paths were received from the shutdown peer Before fix: show bgp ipv6 2052:52:1:167::/64 BGP routing table entry for 2052:52:1:167::/64, version 495 Paths: (246 available, best #127, table default) Not advertised to any peer <snip> 4200165500 4200165002 2021:21:51:101::2(spine-5) from spine-5(2021:21:51:101::2) (6.0.0.17) (fe80::202:ff:fe00:55) (prefer-global) Origin incomplete, valid, external, multipath Last update: Fri Apr 4 17:25:05 2025 4200165500 4200165002 2021:21:11:116::2(spine-1) from spine-1(2021:21:11:116::2) (0.0.0.0) (fe80::202:ff:fe00:3d) (prefer-global)<<<<path not deleted Origin incomplete, valid, external Last update: Fri Apr 4 17:25:05 2025 4200165500 4200165002 2021:21:11:115::2(spine-1) from spine-1(2021:21:11:115::2) (0.0.0.0) (fe80::202:ff:fe00:3d) (prefer-global)<<<<path not deleted Origin incomplete, valid, external Last update: Fri Apr 4 17:25:05 2025 <snip> 32 paths are supposed to be withdrawn: root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" \| grep "prefer-global" \| wc -l 256 root@leaf-1:mgmt# vtysh -c "show bgp ipv6 2052:52:1:167::/64" \| grep "prefer-global" \| wc -l 246<<should be 224, but showing 246, which is wrong After fix: 32 paths are supposed to be withdrawn: root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" \| grep "prefer-global" \| wc -l 256 root@leaf-1:mgmt:# vtysh -c "show bgp ipv6 2052:52:1:167::/64" \| grep "prefer-global" \| wc -l 224<<<shows correctly Signed-off-by: Soumya Roy <souroy@nvidia.com>	2025-04-09 14:32:23 +00:00
Donald Sharp	b2d8d9b37a	bgpd: On shutdown free up table for static routes Indirect leak of 56 byte(s) in 1 object(s) allocated from: 0 0x7fdaf6cb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77 1 0x7fdaf683a480 in qcalloc lib/memory.c:106 2 0x7fdaf68dd706 in route_table_init_with_delegate lib/table.c:38 3 0x5649b22c05b0 in bgp_table_init bgpd/bgp_table.c:139 4 0x5649b2273da0 in bgp_static_set bgpd/bgp_route.c:7779 5 0x5649b21eba58 in vpnv4_network bgpd/bgp_mplsvpn.c:3244 6 0x7fdaf67b6d61 in cmd_execute_command_real lib/command.c:1003 7 0x7fdaf67b7080 in cmd_execute_command lib/command.c:1062 8 0x7fdaf67b75ac in cmd_execute lib/command.c:1228 9 0x7fdaf68ffb20 in vty_command lib/vty.c:626 10 0x7fdaf6900073 in vty_execute lib/vty.c:1389 11 0x7fdaf6903e24 in vtysh_read lib/vty.c:2408 12 0x7fdaf68f0222 in event_call lib/event.c:2019 13 0x7fdaf681b3c6 in frr_run lib/libfrr.c:1247 14 0x5649b211c903 in main bgpd/bgp_main.c:565 15 0x7fdaf630c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Table was being created but never deleted. Let's delete it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-04-09 09:28:31 -04:00
Louis Scalbert	a0a0749568	bgpd: remove useless calls to afi2family Remove useless calls to afi2family(). str2prefix() always sets the prefix family. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2025-04-09 13:07:56 +02:00
David Lamparter	8418e57791	Merge pull request #17915 from mjstapp/compile_wshadow	2025-04-09 09:59:06 +02:00
Mark Stapp	660cbf5651	bgpd: clean up variable-shadowing compiler warnings Clean up -Wshadow warnings in bgp. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-04-08 14:41:27 -04:00
Donald Sharp	b18c309015	bgpd: On shutdown free up memory leak found by topotest This commit fixes two types of problems: a) Avoidance of cleaning up memory when a instance is hidden, thus causing it never to be freed on shutdown b) In some instances bgp_create is called 2 times for some code. We are double allocating memory and dropping it on the second allocation. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-04-08 11:47:50 -04:00
Philippe Guibert	2ced6d233f	bgpd: add usid behavior for bgp srv6 instructions Until now, BGP srv6 usid instructions were not really used. Add the support for this. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2025-04-08 16:16:06 +02:00
Donatas Abraitis	5e092d0e25	Merge pull request #18558 from spoignant-proton/master bgpd: flowspec: remove sizelimit check applied to the wrong length field (issue 18557)	2025-04-07 03:27:02 +03:00
Mark Stapp	259ffe1dfe	Merge pull request #18562 from opensourcerouting/fix/bfd_down_if_established bgpd: Treat the peer as not active due to BFD down only if established	2025-04-04 12:28:18 -04:00
Stephane Poignant	2cee5567bc	bgpd: flowspec: remove sizelimit check applied to the wrong length field (issue 18557) Section 4.1 of RFC8955 defines how the length field of flowspec NLRIs is encoded. The method use implies a maximum length of 4095 for a single flowspec NLRI. However, in bgp_flowspec.c, we check the length attribute of the bgp_nlri structure against this maximum value, which actually is the total length of all NLRI included in the considered MP_REACH_NLRI path attribute. Due to this confusion, frr would reject valid announces that contain many flowspec NLRIs, when their cummulative length exceeds 4095, and close the session. The proposed change removes that check entirely. Indeed, there is no need to check the length field of each invidual NLRI because the method employed make it impossible to encode a length greater than 4095. Signed-off-by: Stephane Poignant <stephane.poignant@proton.ch>	2025-04-04 13:29:02 +02:00
Mark Stapp	e0a97e5b85	Merge pull request #18546 from LabNConsulting/ziemba/250330-rfapi-mem-cleanup bgpd: rfapi: track outstanding rib and import timers, free mem at exit	2025-04-03 09:01:35 -04:00
Russ White	ab67e5544e	Merge pull request #18396 from pguibert6WIND/srv6l3vpn_to_bgp_vrf_redistribute Add BGP redistribution in SRv6 BGP	2025-04-03 08:25:32 -04:00
Rajesh Varatharaj	35129c88b4	bgpd: Skip EVPN MAC processing for non-EVPN peers Issue: "Processing EVPN MAC interface change on peer" log message is printed even when the peer didnt have EVPN address family. Fix: Process only if the peer is in EVPN address family Ticket: #17890 Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>	2025-04-02 11:48:42 -07:00
Donatas Abraitis	da4a7b0356	bgpd: Treat the peer as not active due to BFD down only if established If we have `neighbor X bfd` and BFD status is DOWN and/or ADMIN_DOWN, and BGP session is not yet established, we never allow the session to establish. Let's fix this regression that was in 10.2. Fixes: `1fb48f5` ("bgpd: Do not start BGP session if BFD profile is in shutdown state") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-04-02 17:24:09 +03:00
Louis Scalbert	cbf27be5d9	bgpd: optimize attrhash_cmp calls Only call attrhash_cmp when necessary. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2025-04-01 16:52:58 +02:00
Russ White	c312917988	Merge pull request #18450 from donaldsharp/bgp_packet_reads Bgp packet reads conversion to a FIFO	2025-04-01 10:12:37 -04:00
G. Paul Ziemba	1629c05924	bgpd: rfapi: track outstanding rib and import timers, free mem at exit While here, also make "VPN SAFI clear" test wait for clear result (tests/topotests/bgp_rfapi_basic_sanity{,_config2}) Original RFAPI code relied on the frr timer system to remember various allocations that were supposed to be freed at future times rather than manage a parallel database. However, if bgpd is terminated before the times expire, those pending allocations are marked as memory leaks, even though they wouldn't be leaks under normal operation. This change adds some hash tables to track these outstanding allocations that are associated with pending timers, and uses those tables to free the allocations when bgpd exits. Signed-off-by: G. Paul Ziemba <paulz@labn.net>	2025-03-31 08:45:33 -07:00
Donald Sharp	354aee8932	bgpd: Free memory associated with aspath_dup Fix this: ==3890443== 92 (48 direct, 44 indirect) bytes in 1 blocks are definitely lost in loss record 68 of 98 ==3890443== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==3890443== by 0x49737B3: qcalloc (memory.c:106) ==3890443== by 0x3EA63B: aspath_dup (bgp_aspath.c:703) ==3890443== by 0x2F5438: route_set_aspath_exclude (bgp_routemap.c:2604) ==3890443== by 0x49BC52A: route_map_apply_ext (routemap.c:2708) ==3890443== by 0x2C1069: bgp_input_modifier (bgp_route.c:1925) ==3890443== by 0x2C9F12: bgp_update (bgp_route.c:5205) ==3890443== by 0x2CF281: bgp_nlri_parse_ip (bgp_route.c:7271) ==3890443== by 0x2A28C7: bgp_nlri_parse (bgp_packet.c:338) ==3890443== by 0x2A7F5C: bgp_update_receive (bgp_packet.c:2448) ==3890443== by 0x2ACCA6: bgp_process_packet (bgp_packet.c:4046) ==3890443== by 0x49EB77C: event_call (event.c:2019) ==3890443== by 0x495FAD1: frr_run (libfrr.c:1247) ==3890443== by 0x208D6D: main (bgp_main.c:557) Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-30 17:54:34 -04:00
Donald Sharp	c9d431d4db	bgpd: On shutdown, unlock table when clearing the bgp metaQ There are some tables not being freed upon shutdown. This is happening because the table is being locked as dests are being put on the metaQ. When in shutdown it was clearing the MetaQ it was not unlocking the table Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-30 14:02:16 -04:00
Donald Sharp	06480c0c81	bgpd: When shutting down do not clear self peers Commit: `e0ae285eb8` Modified the fsm state machine to attempt to not clear routes on a peer that was not established. The peer should be not a peer self. We do not want to ever clear the peer self. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-30 14:02:16 -04:00
Russ White	565da4d471	Merge pull request #18498 from opensourcerouting/fix/keep_stale_routes_on_clear bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart	2025-03-26 14:02:52 -04:00
Donald Sharp	212a5379b0	Merge pull request #18502 from opensourcerouting/fix/mpls_withdraw_label bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000	2025-03-26 11:26:19 -04:00
Donatas Abraitis	94e2aadf71	bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000 RFC8277 says: The procedures in [RFC3107] for withdrawing the binding of a label or sequence of labels to a prefix are not specified clearly and correctly. => How to Explicitly Withdraw the Binding of a Label to a Prefix Suppose a BGP speaker has announced, on a given BGP session, the binding of a given label or sequence of labels to a given prefix. Suppose it now wishes to withdraw that binding. To do so, it may send a BGP UPDATE message with an MP_UNREACH_NLRI attribute. The NLRI field of this attribute is encoded as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| Length \| Compatibility \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| Prefix ~ ~ \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 4: NLRI for Withdrawal Upon transmission, the Compatibility field SHOULD be set to 0x800000. Upon reception, the value of the Compatibility field MUST be ignored. [RFC3107] also made it possible to withdraw a binding without specifying the label explicitly, by setting the Compatibility field to 0x800000. However, some implementations set it to 0x000000. In order to ensure backwards compatibility, it is RECOMMENDED by this document that the Compatibility field be set to 0x800000, but it is REQUIRED that it be ignored upon reception. In FRR case where a single label is used per-prefix, we should send 0x800000, and not 0x000000. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-03-26 10:30:52 +02:00
Donatas Abraitis	42b9d985cc	bgpd: Remove unused defines from bgp_label.h Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-03-26 08:50:06 +02:00
Donatas Abraitis	b7c657d4e0	bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart On receiving side we already did the job correctly, but the peer which initiates the clear does not retain the other's routes. This commit fixes that. Fixes: `20170775da` ("bgpd: Activate Graceful-Restart when receiving CEASE/HOLDTIME notifications") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-03-25 17:20:56 +02:00
Donald Sharp	83a92c926e	bgpd: Delay processing MetaQ in some events If the number of peers that are being handled on the peer connection fifo is greater than 10, that means we have some network event going on. Let's allow the packet processing to continue instead of running the metaQ. This has advantages because everything else in BGP is only run after the metaQ is run. This includes best path processing, installation of the route into zebra as well as telling our peers about this change. Why does this matter? It matters because if we are receiving the same route multiple times we limit best path processing to much fewer times and consequently we also limit the number of times we send the route update out and we install the route much fewer times as well. Prior to this patch, with 512 peers and 5k routes. CPU time for bgpd was 3:10, zebra was 3:28. After the patch CPU time for bgpd was 0:55 and zebra was 0:25. Here are the prior `show event cpu`: Event statistics for bgpd: Showing statistics for pthread default -------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 0 20.749 33144 0 395 1 396 0 0 0 T (bgp_generate_updgrp_packets) 0 9936.199 1818 5465 43588 5466 43589 0 0 0 E bgp_handle_route_announcements_to_zebra 0 0.220 84 2 20 3 20 0 0 0 T update_subgroup_merge_check_thread_cb 0 0.058 2 29 43 29 43 0 0 0 E zclient_connect 0 17297.733 466 37119 67428 37124 67429 0 0 0 W zclient_flush_data 1 0.134 7 19 40 20 42 0 0 0 R vtysh_accept 0 151.396 1067 141 1181 142 1189 0 0 0 R vtysh_read 0 0.297 1030 0 14 0 14 0 0 0 T (bgp_routeadv_timer) 0 0.001 1 1 1 2 2 0 0 0 T bgp_sync_label_manager 2 9.374 544 17 261 17 262 0 0 0 R bgp_accept 0 0.001 1 1 1 2 2 0 0 0 T bgp_startup_timer_expire 0 0.012 1 12 12 13 13 0 0 0 E frr_config_read_in 0 0.308 1 308 308 309 309 0 0 0 T subgroup_coalesce_timer 0 4.027 105 38 77 39 78 0 0 0 T (bgp_start_timer) 0 112206.442 1818 61719 84726 61727 84736 0 0 0 TE work_queue_run 0 0.345 1 345 345 346 346 0 0 0 T bgp_config_finish 0 0.710 620 1 6 1 9 0 0 0 W bgp_connect_check 2 39.420 8283 4 110 5 111 0 0 0 R zclient_read 0 0.052 1 52 52 578 578 0 0 0 T bgp_start_label_manager 0 0.452 87 5 90 5 90 0 0 0 T bgp_announce_route_timer_expired 0 185.837 3088 60 537 92 21705 0 0 0 E bgp_event 0 48719.671 4346 11210 78292 11215 78317 0 0 0 E bgp_process_packet Showing statistics for pthread BGP I/O thread --------------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 0 321.915 28597 11 86 11 265 0 0 0 W bgp_process_writes 515 115.586 26954 4 121 4 128 0 0 0 R bgp_process_reads Event statistics for zebra: Showing statistics for pthread default -------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 0 0.109 2 54 62 55 63 0 0 0 T timer_walk_start 1 0.550 11 50 100 50 100 0 0 0 R vtysh_accept 0 112848.163 4441 25410 405489 25413 410127 0 0 0 E zserv_process_messages 0 0.007 1 7 7 7 7 0 0 0 E frr_config_read_in 0 0.005 1 5 5 5 5 0 0 0 T rib_sweep_route 1 573.589 4789 119 1567 120 1568 0 0 0 T wheel_timer_thread 347 30.848 97 318 1367 318 1366 0 0 0 T zebra_nhg_timer 0 0.005 1 5 5 6 6 0 0 0 T zebra_evpn_mh_startup_delay_exp_cb 0 5.404 521 10 38 10 70 0 0 0 T timer_walk_continue 1 1.669 9 185 219 186 219 0 0 0 R zserv_accept 1 0.174 18 9 53 10 53 0 0 0 R msg_conn_read 0 3.028 520 5 47 6 47 0 0 0 T if_zebra_speed_update 0 0.324 274 1 5 1 6 0 0 0 W msg_conn_write 1 24.661 2124 11 359 12 359 0 0 0 R kernel_read 0 73683.333 2964 24859 143223 24861 143239 0 0 0 TE work_queue_run 1 46.649 6789 6 424 7 424 0 0 0 R rtadv_read 0 52.661 85 619 2087 620 2088 0 0 0 R vtysh_read 0 42.660 18 2370 21694 2373 21695 0 0 0 E msg_conn_proc_msgs 0 0.034 1 34 34 35 35 0 0 0 E msg_client_connect_timer 0 2786.938 2300 1211 29456 1219 29555 0 0 0 E rib_process_dplane_results Showing statistics for pthread Zebra dplane thread -------------------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 0 4875.670 200371 24 770 24 776 0 0 0 E dplane_thread_loop 0 0.059 1 59 59 76 76 0 0 0 E dplane_incoming_request 1 9.640 722 13 4510 15 5343 0 0 0 R dplane_incoming_read Here are the post `show event cpu` results: Event statistics for bgpd: Showing statistics for pthread default -------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 0 21297.497 3565 5974 57912 5981 57913 0 0 0 E bgp_process_packet 0 149.742 1068 140 1109 140 1110 0 0 0 R vtysh_read 0 0.013 1 13 13 14 14 0 0 0 E frr_config_read_in 0 0.459 86 5 104 5 105 0 0 0 T bgp_announce_route_timer_expired 0 0.139 81 1 20 2 21 0 0 0 T update_subgroup_merge_check_thread_cb 0 405.889 291687 1 179 1 450 0 0 0 T (bgp_generate_updgrp_packets) 0 0.682 618 1 6 1 9 0 0 0 W bgp_connect_check 0 3.888 103 37 81 38 82 0 0 0 T (bgp_start_timer) 0 0.074 1 74 74 458 458 0 0 0 T bgp_start_label_manager 0 0.000 1 0 0 1 1 0 0 0 T bgp_sync_label_manager 0 0.121 3 40 54 100 141 0 0 0 E bgp_process_conn_error 0 0.060 2 30 49 30 50 0 0 0 E zclient_connect 0 0.354 1 354 354 355 355 0 0 0 T bgp_config_finish 0 0.283 1 283 283 284 284 0 0 0 T subgroup_coalesce_timer 0 29365.962 1805 16269 99445 16273 99454 0 0 0 TE work_queue_run 0 185.532 3097 59 497 94 26107 0 0 0 E bgp_event 1 0.290 8 36 151 37 158 0 0 0 R vtysh_accept 2 9.462 548 17 320 17 322 0 0 0 R bgp_accept 2 40.219 8283 4 128 5 128 0 0 0 R zclient_read 0 0.322 1031 0 4 0 5 0 0 0 T (bgp_routeadv_timer) 0 356.812 637 560 3007 560 3007 0 0 0 E bgp_handle_route_announcements_to_zebra Showing statistics for pthread BGP I/O thread --------------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 515 62.965 14335 4 103 5 181 0 0 0 R bgp_process_reads 0 1986.041 219813 9 213 9 315 0 0 0 W bgp_process_writes Event statistics for zebra: Showing statistics for pthread default -------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 0 0.006 1 6 6 7 7 0 0 0 E frr_config_read_in 0 3673.365 2044 1797 259281 1800 261342 0 0 0 E zserv_process_messages 1 651.846 8041 81 1090 82 1233 0 0 0 T wheel_timer_thread 0 38.184 18 2121 21345 2122 21346 0 0 0 E msg_conn_proc_msgs 1 0.651 12 54 112 55 112 0 0 0 R vtysh_accept 0 0.102 2 51 55 51 56 0 0 0 T timer_walk_start 0 202.721 1577 128 29172 141 29226 0 0 0 E rib_process_dplane_results 1 41.650 6645 6 140 6 140 0 0 0 R rtadv_read 1 22.518 1969 11 106 12 154 0 0 0 R kernel_read 0 4.265 48 88 1465 89 1466 0 0 0 R vtysh_read 0 6099.851 650 9384 28313 9390 28314 0 0 0 TE work_queue_run 0 5.104 521 9 30 10 31 0 0 0 T timer_walk_continue 0 3.078 520 5 53 6 55 0 0 0 T if_zebra_speed_update 0 0.005 1 5 5 5 5 0 0 0 T rib_sweep_route 0 0.034 1 34 34 35 35 0 0 0 E msg_client_connect_timer 1 1.641 9 182 214 183 215 0 0 0 R zserv_accept 0 0.358 274 1 6 2 6 0 0 0 W msg_conn_write 1 0.159 18 8 54 9 54 0 0 0 R msg_conn_read Showing statistics for pthread Zebra dplane thread -------------------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Event 0 301.404 7280 41 1878 41 1878 0 0 0 E dplane_thread_loop 0 0.048 1 48 48 49 49 0 0 0 E dplane_incoming_request 1 9.558 727 13 4659 14 5420 0 0 0 R dplane_incoming_read Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-25 10:47:39 -04:00
Mark Stapp	39bb12299c	bgpd: fix SA warnings in bgp clearing code Fix a possible use-after-free in the recent bgp batch clearing code, CID 1639091. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-03-25 09:32:14 -04:00
Donald Sharp	12bf042c68	bgpd: Modify bgp to handle packet events in a FIFO Current behavor of BGP is to have a event per connection. Given that on startup of BGP with a high number of neighbors you end up with 2 * # of peers events that are being processed. Additionally once BGP has selected the connection this still only comes down to 512 events. This number of events is swamping the event system and in addition delaying any other work from being done in BGP at all because the the 512 events are always going to take precedence over everything else. The other main events are the handling of the metaQ(1 event), update group events( 1 per update group ) and the zebra batching event. These are being swamped. Modify the BGP code to have a FIFO of connections. As new data comes in to read, place the connection on the end of the FIFO. Have the bgp_process_packet handle up to 100 packets spread across the individual peers where each peer/connection is limited to the original quanta. During testing I noticed that withdrawal events at very very large scale are taking up to 40 seconds to process so I added a check for yielding to further limit the number of packets being processed. This change also allow for BGP to be interactive again on scale setups on initial convergence. Prior to this change any vtysh command entered would be delayed by 10's of seconds in my setup while BGP was doing other work. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-25 09:10:46 -04:00
Donatas Abraitis	45af7ea217	Merge pull request #18483 from donaldsharp/holdtime_mistake bgpd: Fix holdtime not working properly when busy	2025-03-25 09:38:09 +02:00
Donatas Abraitis	0a405f477d	Merge pull request #18484 from mjstapp/fix_evpn_rt_cli bgpd: fix handling of configured route-targets for l2vni, l3vni	2025-03-25 09:37:08 +02:00
Mark Stapp	2496bfecfe	bgpd: fix handling of configured RTs for l2vni, l3vni Test for existing explicit config as part of validation of route-target configuration: allow explicit config of generic/ default AS+VNI, for example, instead of rejecting it. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-03-24 16:53:32 -04:00
Russ White	7afe25744b	Merge pull request #18447 from donaldsharp/bgp_clear_batch Bgp clear batch	2025-03-24 16:13:49 -04:00
Philippe Guibert	56c9f1c566	bgpd: fix dereference of null pointer in bgp_nht Assuming attr is null, a dereference can happen in the function make_prefix(). Add the protection over attr before accessing the variable. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2025-03-24 20:59:18 +01:00
Donald Sharp	9a26a56c51	bgpd: Fix holdtime not working properly when busy Commit: `cc9f21da22` Modified the bgp_fsm code to dissallow the extension of the hold time when the system is under extremely heavy load. This was a attempt to remove the return code but it was too aggressive and messed up this bit of code. Put the behavior back that was introduced in: `d0874d195d` Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-24 15:55:01 -04:00
Philippe Guibert	99acebcdc9	bgpd: fix check validity of a VPN SRv6 route with modified nexthop When exporting a VPN SRv6 route, the path may not be considered valid if the nexthop is not valid. This is the case when the 'nexthop vpn export' command is used. The below example illustrates that the VPN path to 2001:1::/64 is not selected, as the expected nexthop to find in vrf10 is the one configured: > # show running-config > router bgp 1 vrf vrf10 > address-family ipv6 unicast > nexthop vpn export 2001::1 > # show bgp ipv6 vpn > [..] > Route Distinguisher: 1:10 > 2001:1::/64 2001::1@4 0 0 65001 i > UN=2001::1 EC{99:99} label=16 sid=2001:db8:1:1:: sid_structure=[40,24,16,0] type=bgp, subtype=5 The analysis indicates that the 2001::1 nexthop is considered. > 2025/03/20 21:47:53.751853 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0 > 2025/03/20 21:47:53.751855 BGP: [VWNP2-DNMFV] Found existing bnc 2001::1/128(0)(VRF vrf10) flags 0x82 ifindex 0 #paths 2 peer 0x0, resolved prefix UNK prefix > 2025/03/20 21:47:53.751856 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is not valid (in VRF vrf10) > 2025/03/20 21:47:53.751857 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change Actually, to check the nexthop validity, only the source path in the VRF has the correct nexthop. Fix this by reusing the source path information instead of the current one. > 2025/03/20 22:43:51.703521 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0 > 2025/03/20 22:43:51.703523 BGP: [VWNP2-DNMFV] Found existing bnc fe80::b812:37ff:fe13:d441/128(0)(VRF vrf10) flags 0x87 ifindex 0 #paths 2 peer 0x0, resolved prefix fe80::/64 > 2025/03/20 22:43:51.703525 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is valid (in VRF vrf10) > 2025/03/20 22:43:51.703526 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2025-03-24 09:17:01 +01:00

1 2 3 4 5 ...

8487 commits