matthieu/frr

Author	SHA1	Message	Date
Mark Stapp	660cbf5651	bgpd: clean up variable-shadowing compiler warnings Clean up -Wshadow warnings in bgp. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-04-08 14:41:27 -04:00
Russ White	c312917988	Merge pull request #18450 from donaldsharp/bgp_packet_reads Bgp packet reads conversion to a FIFO	2025-04-01 10:12:37 -04:00
Donatas Abraitis	b7c657d4e0	bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart On receiving side we already did the job correctly, but the peer which initiates the clear does not retain the other's routes. This commit fixes that. Fixes: `20170775da` ("bgpd: Activate Graceful-Restart when receiving CEASE/HOLDTIME notifications") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-03-25 17:20:56 +02:00
Donald Sharp	12bf042c68	bgpd: Modify bgp to handle packet events in a FIFO Current behavior of BGP is to have an event per connection. On startup of BGP with a high number of neighbors, you end up with 2 * # of peers events that are being processed. Additionally once BGP has selected the connection this still only comes down to 512 events. This number of events is swamping the event system and in addition delaying any other work from being done in BGP at all because the 512 events are always going to take precedence over everything else. The other main events are the handling of the metaQ (1 event), update group events (1 per update group) and the zebra batching event. These are being swamped. Modify the BGP code to have a FIFO of connections. As new data comes in to read, place the connection on the end of the FIFO. Have the bgp_process_packet handle up to 100 packets spread across the individual peers where each peer/connection is limited to the original quanta. During testing I noticed that withdrawal events at very very large scale are taking up to 40 seconds to process so I added a check for yielding to further limit the number of packets being processed. This change also allows BGP to be interactive again on scale setups on initial convergence. Prior to this change any vtysh command entered would be delayed by tens of seconds in my setup while BGP was doing other work. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-25 09:10:46 -04:00
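The FIFO scheme described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the bgpd code: `struct conn`, `struct conn_fifo`, `process_packets()` and the `PEER_QUANTA` value are hypothetical names and numbers chosen for the example; only the overall budget of 100 packets per run comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define PEER_QUANTA 10   /* hypothetical per-connection limit per visit */
#define TOTAL_BUDGET 100 /* packets handled per run, as stated in the commit message */

/* Hypothetical connection record; not the bgpd structure. */
struct conn {
	int pending;       /* packets waiting to be read */
	int processed;     /* packets handled so far */
	struct conn *next; /* FIFO link */
};

struct conn_fifo {
	struct conn *head;
	struct conn *tail;
};

/* Append a connection to the end of the FIFO. */
static void fifo_push(struct conn_fifo *f, struct conn *c)
{
	c->next = NULL;
	if (f->tail)
		f->tail->next = c;
	else
		f->head = c;
	f->tail = c;
}

/* Remove and return the connection at the front, or NULL if empty. */
static struct conn *fifo_pop(struct conn_fifo *f)
{
	struct conn *c = f->head;

	if (!c)
		return NULL;
	f->head = c->next;
	if (!f->head)
		f->tail = NULL;
	c->next = NULL;
	return c;
}

/* One event run: serve connections in FIFO order, at most PEER_QUANTA
 * packets per visit and TOTAL_BUDGET packets overall. A connection that
 * still has data goes back to the end of the FIFO.
 * Returns the number of packets handled in this run. */
static int process_packets(struct conn_fifo *f)
{
	int budget = TOTAL_BUDGET;
	struct conn *c;

	assert(f != NULL);
	while (budget > 0 && (c = fifo_pop(f)) != NULL) {
		int n = c->pending;

		if (n > PEER_QUANTA)
			n = PEER_QUANTA;
		if (n > budget)
			n = budget;
		c->pending -= n;
		c->processed += n;
		budget -= n;
		if (c->pending > 0)
			fifo_push(f, c);
	}
	return TOTAL_BUDGET - budget;
}
```

With a single event draining a shared FIFO, one busy peer cannot monopolise a run: it is served `PEER_QUANTA` packets, requeued, and the next connection gets its turn. The yield check mentioned in the commit message is not modelled here.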
Russ White	7afe25744b	Merge pull request #18447 from donaldsharp/bgp_clear_batch Bgp clear batch	2025-03-24 16:13:49 -04:00
Donald Sharp	c9655e2893	bgpd: Remove unnecessary stream_new/stream_copies in bgp_open_make The call into bgp_open_capability can return that it wrote more than BGP_OPEN_NON_EXT_OPT_LEN bytes, in that case the open part needs to be written again with ext_opt_params set to true to allow extended parameters to be written thus keeping the len < 255 bytes. The code to do this was first creating a new stream and then copying into it the stream, trying to call bgp_open_capability() and if it succeeded recopying the tmp stream back onto the original. Let's change this around such that we save the current spot in the stream of where we are writing and if the change does not work reset the pointer and try again with the correct parameter. This removes the stream and multiple copies and eventual free of the temporary stream. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-03-14 14:50:59 -04:00
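The "save the write position, rewind on failure" idea from the commit above might look like the sketch below. `struct buf`, `buf_put()`, `write_opt_params()` and `open_make()` are hypothetical stand-ins, `NON_EXT_OPT_LEN` is an assumed value standing in for `BGP_OPEN_NON_EXT_OPT_LEN`, and the header layout is simplified for illustration; it is not the on-the-wire OPEN encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NON_EXT_OPT_LEN 255 /* assumed stand-in for BGP_OPEN_NON_EXT_OPT_LEN */

/* Hypothetical minimal write buffer; not FRR's struct stream. */
struct buf {
	uint8_t data[4096];
	size_t endp; /* current write position */
};

static void buf_put(struct buf *b, const void *src, size_t len)
{
	assert(b->endp + len <= sizeof(b->data));
	memcpy(b->data + b->endp, src, len);
	b->endp += len;
}

/* Hypothetical writer: emits a length header (1 octet, or a 3-octet
 * extended form) followed by the capability bytes.
 * Returns the number of capability bytes written. */
static size_t write_opt_params(struct buf *b, const uint8_t *caps,
			       size_t caps_len, bool ext)
{
	if (ext) {
		uint8_t hdr[3] = { 0xFF, (uint8_t)(caps_len >> 8),
				   (uint8_t)caps_len };
		buf_put(b, hdr, sizeof(hdr));
	} else {
		uint8_t hdr = (uint8_t)caps_len; /* cannot represent > 255 */
		buf_put(b, &hdr, 1);
	}
	buf_put(b, caps, caps_len);
	return caps_len;
}

/* Try the short form first. If too much was written, rewind the write
 * position to the saved spot and write again in extended form.
 * No temporary buffer and no copies are needed.
 * Returns true if the extended form was used. */
static bool open_make(struct buf *b, const uint8_t *caps, size_t caps_len)
{
	size_t saved = b->endp;
	bool ext = false;

	if (write_opt_params(b, caps, caps_len, ext) > NON_EXT_OPT_LEN) {
		b->endp = saved; /* discard the failed attempt */
		ext = true;
		write_opt_params(b, caps, caps_len, ext);
	}
	return ext;
}
```

The failed attempt is simply overwritten by the retry, which is what removes the temporary stream and its copies.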
Mark Stapp	58f924d287	bgpd: batch peer connection error clearing When peer connections encounter errors, attempt to batch some of the clearing processing that occurs. Add a new batch object, add multiple peers to it, if possible. Do one rib walk for the batch, rather than one walk per peer. Use a handler callback per batch to check and remove peers' path-infos, rather than a work-queue and callback per peer. The original clearing code remains; it's used for single peers. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-03-12 12:42:06 -04:00
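A rough sketch of the batching idea in the commit above: several failed peers are collected into one batch, and a single walk over the table removes the paths of all of them. The names `struct clear_batch`, `struct path_info`, `batch_add_peer()` and `batch_clear()` and the flat array standing in for the RIB are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MAX 8 /* hypothetical batch capacity */

/* Hypothetical path entry: which peer it came from, and whether it was cleared. */
struct path_info {
	int peer_id;
	bool removed;
};

/* Hypothetical batch object holding the peers to clear together. */
struct clear_batch {
	int peer_ids[BATCH_MAX];
	size_t count;
};

/* Add a peer to the batch; returns false when the batch is full. */
static bool batch_add_peer(struct clear_batch *b, int peer_id)
{
	if (b->count >= BATCH_MAX)
		return false;
	b->peer_ids[b->count++] = peer_id;
	return true;
}

static bool batch_contains(const struct clear_batch *b, int peer_id)
{
	for (size_t i = 0; i < b->count; i++)
		if (b->peer_ids[i] == peer_id)
			return true;
	return false;
}

/* One walk over the table for the whole batch, instead of one walk per
 * peer. Returns the number of paths removed. */
static size_t batch_clear(const struct clear_batch *b, struct path_info *rib,
			  size_t n)
{
	size_t removed = 0;

	assert(b != NULL && rib != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!rib[i].removed && batch_contains(b, rib[i].peer_id)) {
			rib[i].removed = true;
			removed++;
		}
	}
	return removed;
}
```

When a batch is full, a caller would start a new batch or fall back to the single-peer path, which the commit says remains in place.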
Mark Stapp	6a5962e1f8	bgpd: Replace per-peer connection error with per-bgp Replace the per-peer connection error with a per-bgp event and a list. The io pthread enqueues peers per-bgp-instance, and the error-handling code can process multiple peers if there have been multiple failures. Signed-off-by: Mark Stapp <mjs@cisco.com>	2025-03-12 12:40:07 -04:00
Donald Sharp	2cd1d00dde	bgpd: Convert bgp_keepalive_send to use a connection The peer is going to eventually have an incoming and outgoing connection. Let's send the data based upon the connection not the peer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-02-28 10:28:50 -05:00
Donald Sharp	ffff1a1760	bgpd: Fix another crash in orf I was pointed at yet another crash in the orf code. I think it stems from basically the same problem as the last one. Let's just make sure that the orf_plist is handled appropriately. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-02-19 06:29:36 -05:00
Donald Sharp	3d43d7b789	bgpd: When removing the prefix list drop the pointer We are very very rarely seeing this crash: 0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789 1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334 2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440 3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808 4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861 5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223 6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892 7 0x7f36ba4ec239 in event_call lib/event.c:2019 8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295 9 0x55eff3e668b7 in main bgpd/bgp_main.c:557 10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360 12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30) 0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078) freed by thread T0 here: 0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52 1 0x7f36ba439bd7 in qfree lib/memory.c:131 2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156 3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247 4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516 5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841 6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069 7 0x7f36ba4ec239 in event_call lib/event.c:2019 8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295 9 0x55eff3e668b7 in main bgpd/bgp_main.c:557 10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 previously allocated by thread T0 here: 0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77 1 0x7f36ba4392e4 in qcalloc lib/memory.c:106 2 0x7f36ba48d0de in prefix_list_new 
lib/plist.c:150 3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186 4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204 5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479 6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920 7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069 8 0x7f36ba4ec239 in event_call lib/event.c:2019 9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295 10 0x55eff3e668b7 in main bgpd/bgp_main.c:557 11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Let's just stop trying to save the pointer around in the peer->orf_plist data structure. There are other design problems but at least lets stop the crash from possibly happening. Fixes: #18138 Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-02-14 07:58:01 -05:00
Donald Sharp	418d0064bc	Merge pull request #18068 from opensourcerouting/fix/coverity_link_local_capability bgpd: Do not check for capability length for Link-Local Next Hop capability	2025-02-12 09:35:16 -05:00
Philippe Guibert	2a143041f8	bgpd: fix loc-rib open message should use router-id When forging BMP open message, the BGP router-id of tx open message of the BMP LOC-RIB peer up message is always set to 0.0.0.0, whatever the configured value of 'bgp router-id'. Actually, when forging a peer up LOC-RIB message, the BGP router-id value should be taken from the main BGP instance, and not from the peer bgp identifier. Fix this by refreshing the router-id whenever a peer up loc-rib message should be sent. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2025-02-11 11:49:28 +01:00
Donatas Abraitis	d3741f8437	bgpd: Do not check for capability length for Link-Local Next Hop capability Capability's length is 0 and this is not needed to check if it's multiplied by X or there is a minimum length for that. Fixes: `db853cc97e` ("bgpd: Implement Link-Local Next Hop capability") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-02-08 13:01:53 +02:00
Russ White	2ef76a3350	Merge pull request #17871 from opensourcerouting/feature/bgp_link_local_capability bgpd: Implement Link-Local Next Hop capability	2025-02-07 14:00:59 -05:00
Russ White	f3b6651954	Merge pull request #17863 from opensourcerouting/fix/bgp_coverity_1617727 bgpd: Check if the peer really exists before sending dynamic capability	2025-01-28 10:35:57 -05:00
Donatas Abraitis	4338e21aa2	Revert "bgpd: Handle Addpath capability using dynamic capabilities" This reverts commit `05cf9d03b3`. TL;DR; Handling BGP AddPath capability is not trivial (possible) dynamically. When the sender is AddPath-capable and sends NLRIs encoded with AddPath ID, and at the same time the receiver sends AddPath capability "disable-addpath-rx" (flag update) via dynamic capabilities, both peers are out of sync about the AddPath state. The receiver already considers itself no longer AddPath-capable, hence it tries to parse NLRIs as non-AddPath, while they are actually encoded as AddPath. The AddPath capability itself (in the RFC) does not provide any backward-compatible mechanism to handle NLRIs if they come mixed (AddPath + non-AddPath). This explains why we have failures in our CI periodically. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-01-25 20:51:16 +02:00
Donatas Abraitis	db853cc97e	bgpd: Implement Link-Local Next Hop capability Related: https://datatracker.ietf.org/doc/html/draft-white-linklocal-capability TL;DR; use 16-byte next-hops for point-to-point (unnumbered) links instead of sending 32 bytes (::/LL, GUA/LL, LL/LL combinations). For backward compatibility we should still handle existing 32-byte next hops. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-01-17 16:48:32 +02:00
Donatas Abraitis	2df722262f	bgpd: Check if the peer really exists before sending dynamic capability CID: 1617727 Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-01-16 09:06:17 +02:00
Donatas Abraitis	d60320c6d2	bgpd: Handle ENHE capability via dynamic capability FRR supports dynamic capability, which is useful for exchanging capabilities without tearing down the session. Handling of the ENHE capability via dynamic capability was missing. Let's add it too. It was requested in Slack as a useful addition. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2025-01-14 22:46:53 +02:00
Donatas Abraitis	28e62b46ba	bgpd: Show prefix-related stats per neighbor E.g.: ``` Prefix statistics: Inbound filtered: 0 AS-PATH loop: 0 Originator loop: 0 Cluster loop: 0 Invalid next-hop: 0 Withdrawn: 0 Attributes discarded: 3 ``` JSON: ``` "prefixStats":{ "inboundFiltered":0, "aspathLoop":0, "originatorLoop":0, "clusterLoop":0, "invalidNextHop":0, "withdrawn":0, "attributesDiscarded":3 }, ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-12-30 12:26:19 +02:00
Donald Sharp	1baeb81632	bgpd: bgp_getsockname should use connection Let's use the connection associated with the peer instead. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-11-26 11:59:33 -05:00
Donald Sharp	5592aecefd	bgpd: Convert rcvd_attr_printed to a bool No need for an integer to store this, use a bool. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-31 10:35:01 -04:00
Donald Sharp	138935a5fd	bgpd: Fix wrong pthread event cancelling 0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 1 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 2 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 3 0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26 4 0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258 5 <signal handler called> 6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 7 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 8 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 9 0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79 11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470 14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355 15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610 16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152 17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639 18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111 19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, 
arg=0x76e3985ffaf0) at lib/hash.c:252 20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214 21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180 22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 (gdb) f 12 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 1428 assert(m->owner == pthread_self()); In this decode the attempt to cancel the connection's events from the wrong thread is causing the crash. Modify the code to create an event on the bm->master to cancel the events for the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-24 21:01:26 -04:00
Donald Sharp	b097a3188a	bgpd: Fix deadlock in bgp_keepalive and master pthreads (gdb) bt 0 futex_wait (private=0, expected=2, futex_word=0x5c438e9a98d8) at ../sysdeps/nptl/futex-internal.h:146 1 __GI___lll_lock_wait (futex=futex@entry=0x5c438e9a98d8, private=0) at ./nptl/lowlevellock.c:49 2 0x00007af16d698002 in lll_mutex_lock_optimized (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:48 3 ___pthread_mutex_lock (mutex=0x5c438e9a98d8) at ./nptl/pthread_mutex_lock.c:93 4 0x00005c4369c17e70 in _frr_mtx_lock (mutex=0x5c438e9a98d8, func=0x5c4369dc2750 <__func__.265> "bgp_notify_send_internal") at ./lib/frr_pthread.h:258 5 0x00005c4369c1a07a in bgp_notify_send_internal (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000', data=0x0, datalen=0, use_curr=true) at bgpd/bgp_packet.c:928 6 0x00005c4369c1a707 in bgp_notify_send (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_packet.c:1069 7 0x00005c4369bea422 in bgp_stop_with_notify (connection=0x5c438e9a98c0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1597 8 0x00005c4369c18480 in bgp_packet_add (connection=0x5c438e9a98c0, peer=0x5c438e9b6010, s=0x7af15c06bf70) at bgpd/bgp_packet.c:151 9 0x00005c4369c19816 in bgp_keepalive_send (peer=0x5c438e9b6010) at bgpd/bgp_packet.c:639 10 0x00005c4369bf01fd in peer_process (hb=0x5c438ed05520, arg=0x7af16bdffaf0) at bgpd/bgp_keepalives.c:111 11 0x00007af16dacd8e6 in hash_iterate (hash=0x7af15c000be0, func=0x5c4369bf005e <peer_process>, arg=0x7af16bdffaf0) at lib/hash.c:252 12 0x00005c4369bf0679 in bgp_keepalives_start (arg=0x5c438e0db110) at bgpd/bgp_keepalives.c:214 13 0x00007af16dac9932 in frr_pthread_inner (arg=0x5c438e0db110) at lib/frr_pthread.c:180 14 0x00007af16d694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 15 0x00007af16d726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 (gdb) The bgp keepalive pthread gets deadlocked with itself and consequently the bgp master pthread gets locked when it 
attempts to lock the peerhash_mtx, since it is also locked by the keepalive_pthread The keepalive pthread is locking the peerhash_mtx in bgp_keepalives_start. Next the connection->io_mtx mutex in bgp_keepalives_send is locked and then when it notices a problem it invokes bgp_stop_with_notify which relocks the same mutex ( and of course the relock causes it to get stuck on itself ). This generates a deadlock condition. Modify the code to only hold the connection->io_mtx as short as possible. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-10-24 21:01:26 -04:00
Maxence Younsi	035304c25a	bgpd: bmp loc-rib peer up/down for vrfs added bmp bgp peer for vrfs added peer up vrf in bmp peer up state added vrf state in bmpbgp added safe bmp_peer_sendall : bmp_peer_sendall_safe changed bgp_open_send to call new bgp_open_make bgp_open_make creates a bgp open packet, now used in bmp for peer up vrf added hook and call to bgp instance state vrf peer state is recomputed when interfaces (including vrf itf) go up / down and when it gets created or removed Link: `e48ba38070` Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Maxence Younsi <mx.yns@outlook.fr>	2024-10-11 15:14:12 +02:00
sri-mohan1	c853c8d13b	bgpd: changes for code maintainability these changes are for improving the code maintainability and readability Signed-off-by: sri-mohan1 <sri.mohan@samsung.com>	2024-10-10 23:23:20 +05:30
Donatas Abraitis	cadfa693d6	bgpd: Implement BGP dual-as feature This is helpful for migrations, etc. The neighbor is configured with: ``` router bgp 65000 neighbor X local-as 65001 no-prepend replace-as dual-as ``` Neighbor X can use either 65000, or 65001 to peer with. Closes: https://github.com/FRRouting/frr/issues/13928 Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-09-13 10:51:41 +03:00
Donatas Abraitis	19a85c68bf	bgpd: Do not send route-refresh if it wasn't negotiated in capabilities Fixes: `04dfcb14ff` ("bgpd: Deprecate Prestandard Route Refresh capability (128)") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-09-01 22:35:22 +03:00
Donatas Abraitis	1d181dfb98	bgpd: Clear previously allocated capabilities values before parsing a new OPEN Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-08-07 10:16:26 +03:00
Donatas Abraitis	b1b1c922a5	bgpd: Do not increment treat-as-withdraw counters if debug is enabled Increment only if we really treat the UPDATE as withdrawn. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-07-25 13:41:23 +03:00
Donatas Abraitis	e8169f9385	bgpd: Show extended parameters support for the OPEN messages We did that for the receiving side, but not for a sending side, let's fix it. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-07-19 05:20:04 +03:00
Donatas Abraitis	0dfe25697f	bgpd: Implement `neighbor X remote-as auto` In some cases (large scale) it's desired to avoid changing configurations, but let the BGP to automatically handle ASN changes. `auto` means the peering can be iBGP or eBGP. It will be automatically detected and adjusted from the OPEN message. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-07-04 14:42:19 +03:00
vivek	75040a0295	bgpd: Enhance OPEN Tx debug log Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>	2024-07-01 13:02:52 -07:00
vivek	c6ed1cc16d	bgpd: Refine restarter operation - R-bit & F-bit Introduce BGP-wide flags to denote if BGP has started gracefully and GR is in progress or not. Use this for setting of the R-bit in the GR capability, and not a timer which is set for any new instance creation. Mark graceful restart is complete when the deferred path selection has been done and route sync with zebra as well as deferred EOR advertisement has been initiated. Introduce a function to check on F-bit setting rather than just base it on configuration. Subsequent commits will extend these functionalities. Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>	2024-07-01 13:02:45 -07:00
Russ White	ed5628fef1	Merge pull request #16213 from opensourcerouting/fix/fqdn_capability_parsing_for_dynamic_capability bgpd: Check if we have really enough data before doing memcpy for FQDN capability	2024-06-24 16:38:58 -04:00
Russ White	f047430576	Merge pull request #16211 from opensourcerouting/fix/dynamic_software_version_sanity_check bgpd: Check if we have really enough data before doing memcpy for software version	2024-06-24 16:38:50 -04:00
Donatas Abraitis	b685ab5e1b	bgpd: Check if we have really enough data before doing memcpy for FQDN capability We advance data pointer (data++), but we do memcpy() with the length that is 1-byte over, which is technically heap overflow. ``` ==411461==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x50600011da1a at pc 0xc4f45a9786f0 bp 0xffffed1e2740 sp 0xffffed1e1f30 READ of size 4 at 0x50600011da1a thread T0 0 0xc4f45a9786ec in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x3586ec) (BuildId: e794c5f796eee20c8973d7efb9bf5735e54d44cd) 1 0xc4f45abf15f8 in bgp_dynamic_capability_fqdn /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3457:4 2 0xc4f45abdd408 in bgp_capability_msg_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3911:4 3 0xc4f45abdbeb4 in bgp_capability_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3980:9 4 0xc4f45abde2cc in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4109:11 5 0xc4f45a9b6110 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3 ``` Found by fuzzing. Reported-by: Iggy Frankovic <iggyfran@amazon.com> Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-06-13 08:12:10 +03:00
Donatas Abraitis	5d7af51c4f	bgpd: Check if we have really enough data before doing memcpy for software version If we receive CAPABILITY message (software-version), we SHOULD check if we really have enough data before doing memcpy(), that could also lead to buffer overflow. (data + len > end) is not enough, because after this check we do data++ and later memcpy(..., data, len). That means we have one more byte. Hit this through fuzzing by ``` 0 0xaaaaaadf872c in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x35872c) (BuildId: 9c6e455d0d9a20f5a4d2f035b443f50add9564d7) 1 0xaaaaab06bfbc in bgp_dynamic_capability_software_version /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3713:3 2 0xaaaaab05ccb4 in bgp_capability_msg_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3839:4 3 0xaaaaab05c074 in bgp_capability_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3980:9 4 0xaaaaab05e48c in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4109:11 5 0xaaaaaae36150 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3 ``` Hit this again by Iggy \m/ Reported-by: Iggy Frankovic <iggyfran@amazon.com> Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-06-12 22:54:45 +03:00
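The off-by-one described in the two commits above (checking `data + len > end` before the `data++`, then copying `len` bytes) can be illustrated with a small length-prefixed parser. This is a sketch under assumptions: `parse_len_prefixed()` is a hypothetical helper, not a bgpd function, and it only shows checking the remaining bytes after the length octet has been consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parse a string prefixed by a 1-octet length from [data, end).
 * Copies it NUL-terminated into out.
 * Returns the number of octets consumed, or -1 if the field is malformed. */
static int parse_len_prefixed(const uint8_t *data, const uint8_t *end,
			      char *out, size_t out_size)
{
	size_t len;

	assert(data != NULL && end != NULL && out != NULL);
	if (data >= end)
		return -1; /* no room for the length octet */

	len = *data;
	data++; /* consume the length octet before checking the payload */

	/* The check runs on the advanced pointer, so len == remaining + 1
	 * is rejected instead of reading one octet past the end. */
	if ((size_t)(end - data) < len)
		return -1;
	if (len >= out_size)
		return -1; /* keep room for the terminating NUL */

	memcpy(out, data, len);
	out[len] = '\0';
	return (int)(len + 1);
}
```

An input of `{4, 'f', 'r', 'r'}` passes a pre-increment `data + len > end` test (0 + 4 is not greater than 4) but is rejected here, because only three payload octets remain after the length octet.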
Donatas Abraitis	ae1f3a4851	bgpd: Keep last notification's state about hard reset When we receive a hard-reset notification, we always show whether it was a hard reset or not. For the sending side, we missed that. Let's display it too. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-06-11 15:51:21 +03:00
Donatas Abraitis	637ab53f75	bgpd: Send End-of-RIB not only if Graceful Restart capability is received Before we checked for received Graceful Restart capability, but that was also incorrect, because we SHOULD HAVE checked it per AFI/SAFI instead. https://datatracker.ietf.org/doc/html/rfc4724 says: Although the End-of-RIB marker is specified for the purpose of BGP graceful restart, it is noted that the generation of such a marker upon completion of the initial update would be useful for routing convergence in general, and thus the practice is recommended. Thus, it might be reasonable to send EoR regardless of whether the Graceful Restart capability is received or not from the peer. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-05-31 15:03:55 +03:00
Donatas Abraitis	0d079e01e5	bgpd: Check if FQDN capability length is in valid ranges If FQDN capability comes as dynamic capability we should check if the encoding is proper. Before this patch we returned an error if the hostname/domainname length check was > end. But technically, if the length is also == end, this is a malformed capability, because we use the data incorrectly after we check the length. This causes heap overflow (when compiled with address-sanitizer). Signed-off-by: Iggy Frankovic <iggyfran@amazon.com> Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-05-24 10:38:49 +03:00
Donatas Abraitis	150eb73054	bgpd: Send a notification if we receive CAPABILITY message if not expected Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-05-24 10:35:43 +03:00
Donatas Abraitis	048609103c	bgpd: Add sanity check for capability lengths before processing them This is for CAPABILITY messages, not for OPEN message capabilities. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-05-24 10:35:42 +03:00
Donald Sharp	b4c64b3a39	bgpd: Remove unused addition found in clang Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2024-04-25 14:45:32 -04:00
Donatas Abraitis	30a332dad8	bgpd: Fix errors handling for MP/GR capabilities as dynamic capability When receiving a MP/GR capability as dynamic capability, but malformed, do not forget to advance the pointer to avoid hitting infinity loop. After: ``` Mar 29 11:15:28 donatas-laptop bgpd[353550]: [GS0AQ-HKY0X] 127.0.0.1 rcv CAPABILITY Mar 29 11:15:28 donatas-laptop bgpd[353550]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 5, length 0 Mar 29 11:15:28 donatas-laptop bgpd[353550]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 0, length 0 Mar 29 11:15:28 donatas-laptop bgpd[353550]: [HFHDS-QT71N][EC 33554494] 127.0.0.1(donatas-pc): unrecognized capability code: 0 - ignored Mar 29 11:15:28 donatas-laptop bgpd[353550]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 0, code: 0, length 0 Mar 29 11:15:28 donatas-laptop bgpd[353550]: [HFHDS-QT71N][EC 33554494] 127.0.0.1(donatas-pc): unrecognized capability code: 0 - ignored Mar 29 11:15:28 donatas-laptop bgpd[353550]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 0, code: 0, length 0 Mar 29 11:15:28 donatas-laptop bgpd[353550]: [HFHDS-QT71N][EC 33554494] 127.0.0.1(donatas-pc): unrecognized capability code: 0 - ignored Mar 29 11:15:28 donatas-laptop bgpd[353550]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 0, code: 0, length 1 Mar 29 11:15:28 donatas-laptop bgpd[353550]: [HFHDS-QT71N][EC 33554494] 127.0.0.1(donatas-pc): unrecognized capability code: 0 - ignored Mar 29 11:15:28 donatas-laptop bgpd[353550]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:15:28 donatas-laptop bgpd[353550]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) ``` Before: ``` Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: 
[Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 
11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has 
action: 1, code: 1, length 10 Mar 29 11:14:54 donatas-laptop bgpd[347675]: [Z1DRQ-N6Z5F] 127.0.0.1(donatas-pc): Dynamic Capability MultiProtocol Extensions afi/safi invalid (bad-value/unicast) Mar 29 11:14:54 donatas-laptop bgpd[347675]: [JTVED-VGTQQ] 127.0.0.1(donatas-pc): CAPABILITY has action: 1, code: 1, length 10 ``` Reported-by: Iggy Frankovic <iggyfran@amazon.com> Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-03-30 16:15:08 +02:00
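The "Before" log above shows the same entry being parsed over and over; the fix is to advance the read pointer even when an entry is rejected. A minimal sketch of that loop shape follows. The 3-octet header (action, code, length) and the validity rule in `cap_valid()` are simplifications assumed for this example, and `parse_caps()` is a hypothetical function, not the bgpd parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_HDR_LEN 3 /* hypothetical header: action, code, length (1 octet each) */

struct parse_stats {
	int accepted;
	int rejected;
};

/* Hypothetical validity rule: only code 1 with a 4-octet value whose
 * leading 16-bit AFI is 1 or 2 is accepted. */
static bool cap_valid(uint8_t code, const uint8_t *val, uint8_t len)
{
	uint16_t afi;

	if (code != 1 || len != 4)
		return false;
	afi = (uint16_t)((val[0] << 8) | val[1]);
	return afi == 1 || afi == 2;
}

/* Walk all entries in [p, end). A rejected entry is skipped by moving
 * the pointer past it; without that step the loop would re-read the
 * same entry forever. */
static struct parse_stats parse_caps(const uint8_t *p, const uint8_t *end)
{
	struct parse_stats st = { 0, 0 };

	assert(p != NULL && end != NULL && p <= end);
	while ((size_t)(end - p) >= CAP_HDR_LEN) {
		uint8_t code = p[1];
		uint8_t len = p[2];
		const uint8_t *val = p + CAP_HDR_LEN;

		if ((size_t)(end - val) < len) {
			st.rejected++; /* truncated entry: stop parsing */
			break;
		}
		if (!cap_valid(code, val, len)) {
			st.rejected++;
			p = val + len; /* advance before continuing */
			continue;
		}
		st.accepted++;
		p = val + len;
	}
	return st;
}
```

Every path through the loop body either breaks or moves `p` forward by at least `CAP_HDR_LEN` octets, so the loop always terminates.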
Donatas Abraitis	081f6520ff	bgpd: Avoid padding for bgp_paths_limit_capability struct When sending the packets over the network (dynamic capability) it reports 6 bytes instead of 5 bytes, and causes some issues between little/big endian machines. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-03-14 09:50:49 +02:00
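The 6-versus-5-byte difference in the commit above comes from alignment padding. The sketch below shows it with a struct whose field names (`afi`, `safi`, `paths_limit`) are assumed for illustration, plus a hypothetical `encode_paths_limit()` that writes the fields octet by octet in network byte order, which sidesteps both padding and host endianness. `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field names are assumptions for this example. On common ABIs the
 * compiler inserts one padding octet after safi so that paths_limit
 * is 2-byte aligned. */
struct paths_limit_padded {
	uint16_t afi;
	uint8_t safi;
	uint16_t paths_limit;
};

/* Same fields without padding (GCC/Clang extension). */
struct paths_limit_packed {
	uint16_t afi;
	uint8_t safi;
	uint16_t paths_limit;
} __attribute__((packed));

#define PATHS_LIMIT_WIRE_LEN 5 /* 2 + 1 + 2 octets on the wire */

/* Hypothetical encoder: writes each field explicitly in network byte
 * order, so the result does not depend on struct layout or on the
 * host's endianness. Returns the number of octets written. */
static size_t encode_paths_limit(uint8_t out[PATHS_LIMIT_WIRE_LEN],
				 uint16_t afi, uint8_t safi, uint16_t limit)
{
	assert(out != NULL);
	out[0] = (uint8_t)(afi >> 8);
	out[1] = (uint8_t)(afi & 0xFF);
	out[2] = safi;
	out[3] = (uint8_t)(limit >> 8);
	out[4] = (uint8_t)(limit & 0xFF);
	return PATHS_LIMIT_WIRE_LEN;
}
```

Using `sizeof` on the padded struct as the advertised length is how one extra octet can end up on the wire; either packing the struct or computing the length from the field sizes avoids that.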
Donatas Abraitis	78757362f2	bgpd: Allow dynamically disable graceful-restart/long-lived graceful-restart If we enter `bgp graceful-restart-disable`, make sure we disable the capabilities. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-03-10 18:25:30 +02:00
Donatas Abraitis	77102e853e	bgpd: Unset advertised capabilities if capability is disabled When using dynamic capabilities, do not forget to unset advertised capabilities. Otherwise, it's kept as advertised. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-03-09 22:23:37 +02:00
Donatas Abraitis	4967bf6d72	bgpd: Send "Send Hold Timer Expired" on such events notification This is required by the current (latest/-02 draft). IANA has registered code 8 for "Send Hold Timer Expired" in the "BGP Error (Notification) Codes" sub-registry under the "Border Gateway Protocol (BGP) Parameters" registry. https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-sendholdtimer Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-02-29 15:37:53 +02:00


445 commits