Commit graph

400 commits

Author SHA1 Message Date
Russ White ab67e5544e
Merge pull request #18396 from pguibert6WIND/srv6l3vpn_to_bgp_vrf_redistribute
Add BGP redistribution in SRv6 BGP
2025-04-03 08:25:32 -04:00
Russ White c312917988
Merge pull request #18450 from donaldsharp/bgp_packet_reads
Bgp packet reads conversion to a FIFO
2025-04-01 10:12:37 -04:00
Donald Sharp 06480c0c81 bgpd: When shutting down do not clear self peers
Commit: e0ae285eb8

Modified the fsm state machine to attempt to not
clear routes on a peer that was not established.
The peer should be not a peer self.  We do not want
to ever clear the peer self.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-30 14:02:16 -04:00
Donald Sharp 12bf042c68 bgpd: Modify bgp to handle packet events in a FIFO
Current behavor of BGP is to have a event per connection.  Given
that on startup of BGP with a high number of neighbors you end
up with 2 * # of peers events that are being processed.  Additionally
once BGP has selected the connection this still only comes down
to 512 events.  This number of events is swamping the event system
and in addition delaying any other work from being done in BGP at
all because the the 512 events are always going to take precedence
over everything else.  The other main events are the handling
of the metaQ(1 event), update group events( 1 per update group )
and the zebra batching event.  These are being swamped.

Modify the BGP code to have a FIFO of connections.  As new data
comes in to read, place the connection on the end of the FIFO.
Have the bgp_process_packet handle up to 100 packets spread
across the individual peers where each peer/connection is limited
to the original quanta.  During testing I noticed that withdrawal
events at very very large scale are taking up to 40 seconds to process
so I added a check for yielding to further limit the number of packets
being processed.

This change also allow for BGP to be interactive again on scale
setups on initial convergence.  Prior to this change any vtysh
command entered would be delayed by 10's of seconds in my setup
while BGP was doing other work.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-25 09:10:46 -04:00
Donatas Abraitis 45af7ea217
Merge pull request #18483 from donaldsharp/holdtime_mistake
bgpd: Fix holdtime not working properly when busy
2025-03-25 09:38:09 +02:00
Russ White 7afe25744b
Merge pull request #18447 from donaldsharp/bgp_clear_batch
Bgp clear batch
2025-03-24 16:13:49 -04:00
Donald Sharp 9a26a56c51 bgpd: Fix holdtime not working properly when busy
Commit:  cc9f21da22

Modified the bgp_fsm code to dissallow the extension
of the hold time when the system is under extremely
heavy load.  This was a attempt to remove the return
code but it was too aggressive and messed up this bit
of code.

Put the behavior back that was introduced in:
d0874d195d

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-03-24 15:55:01 -04:00
Philippe Guibert 99acebcdc9 bgpd: fix check validity of a VPN SRv6 route with modified nexthop
When exporting a VPN SRv6 route, the path may not be considered valid if
the nexthop is not valid. This is the case when the 'nexthop vpn export'
command is used. The below example illustrates that the VPN path to
2001:1::/64 is not selected, as the expected nexthop to find in vrf10 is
the one configured:

> # show running-config
> router bgp 1 vrf vrf10
>  address-family ipv6 unicast
>   nexthop vpn export 2001::1

> # show bgp ipv6 vpn
> [..]
> Route Distinguisher: 1:10
>      2001:1::/64      2001::1@4                0             0 65001 i
>     UN=2001::1 EC{99:99} label=16 sid=2001:db8:1:1:: sid_structure=[40,24,16,0] type=bgp, subtype=5

The analysis indicates that the 2001::1 nexthop is considered.

> 2025/03/20 21:47:53.751853 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0
> 2025/03/20 21:47:53.751855 BGP: [VWNP2-DNMFV] Found existing bnc 2001::1/128(0)(VRF vrf10) flags 0x82 ifindex 0 #paths 2 peer 0x0, resolved prefix UNK prefix
> 2025/03/20 21:47:53.751856 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is not valid (in VRF vrf10)
> 2025/03/20 21:47:53.751857 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change

Actually, to check the nexthop validity, only the source path in the VRF
has the correct nexthop. Fix this by reusing the source path information
instead of the current one.

> 2025/03/20 22:43:51.703521 BGP: [RD1WY-YE9EC] leak_update: entry: leak-to=VRF default, p=2001:1::/64, type=10, sub_type=0
> 2025/03/20 22:43:51.703523 BGP: [VWNP2-DNMFV] Found existing bnc fe80::b812:37ff:fe13:d441/128(0)(VRF vrf10) flags 0x87 ifindex 0 #paths 2 peer 0x0, resolved prefix fe80::/64
> 2025/03/20 22:43:51.703525 BGP: [VWC2R-4REXZ] leak_update_nexthop_valid: 2001:1::/64 nexthop is valid (in VRF vrf10)
> 2025/03/20 22:43:51.703526 BGP: [HX87B-ZXWX9] leak_update: ->VRF default: 2001:1::/64: Found route, no change

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2025-03-24 09:17:01 +01:00
Donatas Abraitis ace4b8fe61 bgpd: Print the real reason why the peer is not accepted (incoming)
If it's suppressed due to BFD down or unspecified connection, we never know
the real reason and just say "no AF activated" which is misleading.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-03-17 14:52:42 +02:00
Mark Stapp 58f924d287 bgpd: batch peer connection error clearing
When peer connections encounter errors, attempt to batch some
of the clearing processing that occurs. Add a new batch object,
add multiple peers to it, if possible. Do one rib walk for the
batch, rather than one walk per peer. Use a handler callback
per batch to check and remove peers' path-infos, rather than
a work-queue and callback per peer. The original clearing code
remains; it's used for single peers.

Signed-off-by: Mark Stapp <mjs@cisco.com>
2025-03-12 12:42:06 -04:00
Donald Sharp 2cd1d00dde bgpd: Convert bgp_keepalive_send to use a connection
The peer is going to eventually have a incoming and
outgoing connection.  Let's send the data based
upon the connection not the peer.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-28 10:28:50 -05:00
Donald Sharp 543fc6dc56 bgpd: Add connection direction to debug logs
Currently the incoming and outgoing connections mix up their
logs and there is absolutely no way to tell which way is being
talked about when both are operating.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-02-28 10:28:50 -05:00
Jafar Al-Gharaibeh 92288c9069
Merge pull request #17865 from donaldsharp/coverity_2024_new_hotness
Coverity 2024 new hotness
2025-02-06 10:15:55 -06:00
Donatas Abraitis 739f2b566a bgpd: Do not start BGP session if BGP identifier is not set
If we have IPv6-only network and no IPv4 addresses at all, then by default
0.0.0.0 is created which is treated as malformed according to RFC 6286.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-01-29 23:03:06 +02:00
Donatas Abraitis 0702ddb3c9 bgpd: Do not show "Waiting for OPEN" as last reset
This is actually not reset, and should be ignored showing it as last reset
under `show bgp neighbor`.

Fixes: 1e91f1d119 ("bgpd: Update failed reason to distinguish some NHT scenario")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2025-01-19 11:07:59 +02:00
Donald Sharp f94ad538cf bgpd: Ensure ibuf count is protected by mutex
Grab the count of streams in ibuf when it is protected
by a mutex.  Since this data is written to it in another
pthread.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-17 10:16:48 -05:00
Donald Sharp 348c2dc3f8 bgpd: Only update peer connection information when needed
Currently bgp is repeatedly grabbing peer connection information.
This is a bit overkill.  There are two situations:

a) Opening a connection to the peer
   In this case, we know the remote port/address a priori and can get
   the local information by just asking the OS.
b) Peer opening a connection to us.
   In this case, we know the local port/address a priori and can get
   the remote information by just asking the OS.

Modify the code to just grab this data at the appropriate time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-10 10:07:11 -05:00
Donald Sharp 78fa9b6feb bgpd: su_remote and su_local are properties of the connection
su_local and su_remote in the peer can change based upon
if we are initiating the remote connection or receiving it.
As such we need to treat it as a property of the connection.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-10 10:07:11 -05:00
Donald Sharp 0e416ff157 bgpd: bgp_getsockanme is connection oriented
Let's make it so.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2025-01-10 10:06:16 -05:00
Jafar Al-Gharaibeh f78b1786a6
Merge pull request #17599 from opensourcerouting/fix/reduce_default_connect_timer
bgpd: Connect retry timer backoff
2024-12-18 16:26:37 -06:00
Donald Sharp 40c31bdf40 bgpd: When calling bgp_process, prevent infinite loop
If we have this construct:

for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
     ...
     bgp_process();
}

This can induce an infinite loop.  This happens because bgp_process
will move the unsorted items to the top of the list for handling,
as such it is necessary to hold the next pointer to the side
to actually look at each possible bgp_path_info.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-12-12 15:08:35 -05:00
Donatas Abraitis ab3535fbcf bgpd: Implement connect retry backoff
Instead of starting with a fairly high value of retry, let's try with a lower
and increase with a backoff to reach what was a default value (120s).

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-12-11 17:20:48 +02:00
Donald Sharp 7bf3f53e44 bgpd: peer_active is connection oriented, make it so
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-11-26 11:59:39 -05:00
Donald Sharp 1baeb81632 bgpd: bgp_getsockname should use connection
Let's use the connection associated with the peer
instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-11-26 11:59:33 -05:00
Donald Sharp 72f716ef28 bgpd: Modify bgp_connect_in_progress_update_connection to use connection
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-11-26 11:59:27 -05:00
Donald Sharp 2771431938 bgpd: Modify bgp_udpatesockname to pass in a connection
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-11-26 11:59:19 -05:00
Donald Sharp eacf923b00 bgpd: Fix pattern of usage in bgp_notify_config_change
if (BGP_IS_VALID_STATE_FOR_NOTIF(peer->connection->status))
        peer_notify_config_change(peer->connection);
else
        bgp_session_reset_safe(peer, &nnode);

Let's add a bool return to peer_notify_config_change of whether or
not it should call the peer session reset.  This simplifies
the code a bunch.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-11-26 11:59:18 -05:00
Donald Sharp ba0edb9545 bgpd: Add peer_notify_config_change() function
We have about a bajillion tests of if we can
notify the peer and then we send a config change
notification.  Let's just make a function that
does this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-11-26 11:58:23 -05:00
Donatas Abraitis 0a85b1ba04 bgpd: Fix graceful-restart for peer-groups
Slipped somehow that peer-groups with GR is just completely broken, but it was
working before.

Strikes again, that we MUST have more and more topotests.

Fixes: 15403f521a ("bgpd: Streamline GR config, act on change immediately")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-11-24 21:57:19 +02:00
Donald Sharp 2a94de8af2 bgpd: bgp_connect should return an enum connect_result
This function when it is run by bgp_start is expected
to return a `enum connect_result`.  But instead
the function returns a variety of values that are
not really being checked for.  Consolidate to a correct
choice.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-11-20 16:11:22 -05:00
Donatas Abraitis 29eafd32c5 bgpd: Do not try to uninstall BFD session if the peer is not established
Having something like:

```
 neighbor 192.168.1.222 ebgp-multihop 32
 neighbor 192.168.1.222 update-source 192.168.1.5
 neighbor 192.168.1.222 bfd
```

Won't work and the result is (empty):

```
$ show bfd peers
BFD Peers:
```

bgp_stop() is called in BGP FSM multiple times (even at startup) that
causes intermediate session interruption when update-source/ebgp-multihop
is triggered.

With this fix, the ordering does not matter and the BFD session's parameters
are updated correctly.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-11-11 16:49:22 +02:00
Donatas Abraitis 895d586a5f bgpd: Set LLGR stale routes for all the paths including addpath
Without this patch we set only the first path for the route (if multiple exist)
as LLGR stale and stop doing that for the rest of the paths, which is wrong.

Fixes: 1479ed2fb3 ("bgpd: Implement LLGR helper mode")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-11-07 14:05:36 +02:00
Donald Sharp 138935a5fd bgpd: Fix wrong pthread event cancelling
0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
1  __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
2  __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3  0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
4  0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258
5  <signal handler called>
6  __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44
7  __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78
8  __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
9  0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79
11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470
14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355
15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610
16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152
17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639
18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111
19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, arg=0x76e3985ffaf0) at lib/hash.c:252
20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214
21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180
22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) f 12
12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428
1428		assert(m->owner == pthread_self());

In this decode the attempt to cancel the connection's events from
the wrong thread is causing the crash.  Modify the code to create an
event on the bm->master to cancel the events for the connection.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-24 21:01:26 -04:00
David Lamparter 49cf311d46 *: clang-SA friendly switch-enum-return-string
clang-19's SA complains about unused initializers for this kind of
"switch (enum) { return string }" kind of code.  Use direct string
return values to avoid the issue.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2024-10-16 13:00:11 +02:00
Philippe Guibert 37702ca080 bgpd: fix 'nexthop_set failed' error message often displayed
The 'nexthop_set failed, resetting connection - intf' log message
is often seen when peering with BGP peers. This message has been
displayed by introducing a recent fix that extracts the IP/port
information of outgoing connections when peering is not yet
established.

Fix this by separating the update of the socket information from
the call to bgp_zebra_nexthop_set().

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-09-12 16:14:27 +02:00
Donald Sharp bb78f73fa6 bgpd: Reduce # of iterations when doing llgr
Code was scanning a table then identifying a prefix
that needed to be modified then calling code that
reran bestpath on the entire table again.

If you had multiple items that needed processing
you would end up scanning and setting the entire
table to be scanned multiple times.  No bueno.

a) We do not need to reprocess items that are not
being modified.

b) We do not need to walk the entire table multiple
times, we have the data that is needed already.

Modify the code to just call bgp_process on the
interesting nodes.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-09-06 10:39:41 -04:00
Donald Sharp 4344ac1d28 bgpd: global_gr_mode does not need to be set twice
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-22 13:32:20 -04:00
Donatas Abraitis 9c710eef0c bgpd: Use bgp_session_reset_safe() for GR update all peers
It might cause this use-after-free:

```
==6523==ERROR: AddressSanitizer: heap-use-after-free on address 0x60300058d720 at pc 0x55f3ab62ab1f bp 0x7ffe5b95a0d0 sp 0x7ffe5b95a0c8
READ of size 8 at 0x60300058d720 thread T0
    #0 0x55f3ab62ab1e in bgp_gr_update_mode_of_all_peers bgpd/bgp_fsm.c:2729
    #1 0x55f3ab62ab1e in bgp_gr_update_all bgpd/bgp_fsm.c:2779
    #2 0x55f3ab73557e in bgp_inst_gr_config_vty bgpd/bgp_vty.c:3037
    #3 0x55f3ab74db69 in bgp_graceful_restart bgpd/bgp_vty.c:3130
    #4 0x7fc5539a9584 in cmd_execute_command_real lib/command.c:1002
    #5 0x7fc5539a98a3 in cmd_execute_command lib/command.c:1061
    #6 0x7fc5539a9dcf in cmd_execute lib/command.c:1227
    #7 0x7fc553ae493f in vty_command lib/vty.c:616
    #8 0x7fc553ae4e92 in vty_execute lib/vty.c:1379
    #9 0x7fc553aedd34 in vtysh_read lib/vty.c:2374
    #10 0x7fc553ad8a64 in event_call lib/event.c:1995
    #11 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
    #12 0x55f3ab57b78d in main bgpd/bgp_main.c:555
    #13 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #14 0x7fc55342d304 in __libc_start_main_impl ../csu/libc-start.c:360
    #15 0x55f3ab5799a0 in _start (/usr/lib/frr/bgpd+0x2e19a0)

0x60300058d720 is located 16 bytes inside of 24-byte region [0x60300058d710,0x60300058d728)
freed by thread T0 here:
    #0 0x7fc553eb76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    #1 0x7fc553a2b713 in qfree lib/memory.c:130
    #2 0x7fc553a0e50d in listnode_free lib/linklist.c:81
    #3 0x7fc553a0e50d in list_delete_node lib/linklist.c:379
    #4 0x55f3ab7ae353 in peer_delete bgpd/bgpd.c:2796
    #5 0x55f3ab7ae91f in bgp_session_reset bgpd/bgpd.c:141
    #6 0x55f3ab62ab17 in bgp_gr_update_mode_of_all_peers bgpd/bgp_fsm.c:2752
    #7 0x55f3ab62ab17 in bgp_gr_update_all bgpd/bgp_fsm.c:2779
    #8 0x55f3ab73557e in bgp_inst_gr_config_vty bgpd/bgp_vty.c:3037
    #9 0x55f3ab74db69 in bgp_graceful_restart bgpd/bgp_vty.c:3130
    #10 0x7fc5539a9584 in cmd_execute_command_real lib/command.c:1002
    #11 0x7fc5539a98a3 in cmd_execute_command lib/command.c:1061
    #12 0x7fc5539a9dcf in cmd_execute lib/command.c:1227
    #13 0x7fc553ae493f in vty_command lib/vty.c:616
    #14 0x7fc553ae4e92 in vty_execute lib/vty.c:1379
    #15 0x7fc553aedd34 in vtysh_read lib/vty.c:2374
    #16 0x7fc553ad8a64 in event_call lib/event.c:1995
    #17 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
    #18 0x55f3ab57b78d in main bgpd/bgp_main.c:555
    #19 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

previously allocated by thread T0 here:
    #0 0x7fc553eb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x7fc553a2ae20 in qcalloc lib/memory.c:105
    #2 0x7fc553a0d056 in listnode_new lib/linklist.c:71
    #3 0x7fc553a0d85b in listnode_add_sort lib/linklist.c:197
    #4 0x55f3ab7baec4 in peer_create bgpd/bgpd.c:1996
    #5 0x55f3ab65be8b in bgp_accept bgpd/bgp_network.c:604
    #6 0x7fc553ad8a64 in event_call lib/event.c:1995
    #7 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
    #8 0x55f3ab57b78d in main bgpd/bgp_main.c:555
    #9 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-07-31 11:43:19 +03:00
Donatas Abraitis fa9bd07ae5 bgpd: Keep the last reset reason before we reset the peer
If we send a notification, there is no point setting the last_reset, because
bgp_notify_send() sets last_reset to PEER_DOWN_NOTIFY_SEND (almost everywhere).

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-07-25 13:22:27 +03:00
Donatas Abraitis 743b169384 bgpd: Set the last_reset if we change the password also
```
donatas.net(config-router)# do show ip bgp summary failed

IPv4 Unicast Summary:
BGP router identifier 1.1.1.1, local AS number 65001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 1, using 24 KiB of memory

Neighbor        EstdCnt DropCnt ResetTime Reason
127.0.0.1             2       2  00:02:02 Password config change (GoBGP/3.26.0)

Displayed neighbors 1
Total number of neighbors 1
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-07-25 13:06:46 +03:00
Donatas Abraitis 45f80de734 bgpd: Pass a connection struct directly for EVENT_OFF()
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-07-24 15:30:43 +03:00
vivek c6ed1cc16d bgpd: Refine restarter operation - R-bit & F-bit
Introduce BGP-wide flags to denote if BGP has started gracefully
and GR is in progress or not. Use this for setting of the R-bit in
the GR capability, and not a timer which is set for any new
instance creation. Mark graceful restart is complete when the
deferred path selection has been done and route sync with zebra as
well as deferred EOR advertisement has been initiated.

Introduce a function to check on F-bit setting rather than just
base it on configuration.

Subsequent commits will extend these functionalities.

Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
2024-07-01 13:02:45 -07:00
vivek 15403f521a bgpd: Streamline GR config, act on change immediately
Streamline the BGP graceful-restart configuration at the global and
peer level some more. Similar to many other neighbor capability
parameters like MP and ENHE, reset the session immediately upon a
change to the configuration. This will be more aligned with the
transactional UI model also and will not require a separate 'clear'
command to be executed.

Note: Peer-group graceful-restart configuration is not yet supported.

Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
2024-06-27 11:40:57 -07:00
Loïc Sang e0ae285eb8 bgpd: avoid clearing routes for peers that were never established
Under heavy system load with many peers in passive mode and a large
number of routes, bgpd can enter an infinite loop. This occurs while
processing timeout BGP_OPEN messages, which prevents it from accepting
new connections. The following log entries illustrate the issue:
>bgpd[6151]: [VX6SM-8YE5W][EC 33554460] 3.3.2.224: nexthop_set failed, resetting connection - intf 0x0
>bgpd[6151]: [P790V-THJKS][EC 100663299] bgp_open_receive: bgp_getsockname() failed for peer: 3.3.2.224
>bgpd[6151]: [HTQD2-0R1WR][EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 3.3.2.224
... repeating

The issue occurs when bgpd handles a massive number of routes in the RIB
while receiving numerous BGP_OPEN packets. If bgpd is overloaded, it
fails to process these packets promptly, leading the remote peer to
close the connection and resend BGP_OPEN packets.

When bgpd eventually starts processing these timeout BGP_OPEN packets,
it finds the TCP connection closed by the remote peer, resulting in
"bgp_stop()" being called. For each timeout peer, bgpd must iterate
through the routing table, which is time-consuming and causes new
incoming BGP_OPEN packets to timeout, perpetuating the infinite loop.

To address this issue, the code is modified to check if the peer has
been established at least once before calling "bgp_clear_route_all()".
This ensures that routes are only cleared for peers that had a
successful session, preventing unnecessary iterations over the routing
table for peers that never established a connection.

With this change, BGP_OPEN timeout messages may still occur, but in the
worst case, bgpd will stabilize. Before this patch, bgpd could enter a
loop where it was unable to accpet any new connections.

Signed-off-by: Loïc Sang <loic.sang@6wind.com>
2024-06-26 16:11:16 +02:00
Louis Scalbert e446308d76 bgpd: fix dynamic peer graceful restart race condition
bgp_llgr topotest sometimes fails at step 8:

> topo: STEP 8: 'Check if we can see 172.16.1.2/32 after R4 (dynamic peer) was killed'

R4 neighbor is deleted on R2 because it fails to re-connect:

> 14:33:40.128048 BGP: [HKWM3-ZC5QP] 192.168.3.1 fd -1 went from Established to Clearing
> 14:33:40.128154 BGP: [MJ1TJ-HEE3V] 192.168.3.1(r4) graceful restart timer expired
> 14:33:40.128158 BGP: [ZTA2J-YRKGY] 192.168.3.1(r4) graceful restart stalepath timer stopped
> 14:33:40.128162 BGP: [H917J-25EWN] 192.168.3.1(r4) Long-lived stale timer (IPv4 Unicast) started for 20 sec
> 14:33:40.128168 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 172.16.1.2/32
> 14:33:40.128220 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 192.168.3.0/24
> [...]
> 14:33:41.138869 BGP: [RGGAC-RJ6WG] 192.168.3.1 [Event] Connect failed 111(Connection refused)
> 14:33:41.138906 BGP: [ZWCSR-M7FG9] 192.168.3.1 [FSM] TCP_connection_open_failed (Connect->Active), fd 23
> 14:33:41.138912 BGP: [JA9RP-HSD1K] 192.168.3.1 (dynamic neighbor) deleted (bgp_connect_fail)
> 14:33:41.139126 BGP: [P98A2-2RDFE] 192.168.3.1(r4) graceful restart stalepath timer stopped

af8496af08 ("bgpd: Do not delete BGP dynamic peers if graceful restart
kicks in") forgot to modify bgp_connect_fail()

Do not delete the peer in bgp_connect_fail() if Non-Stop-Forwarding is
in progress.

Fixes: af8496af08 ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-05-16 15:19:11 +02:00
Russ White 827badc53c
Merge pull request #15883 from opensourcerouting/fix/bgpd_gr_fsm
bgpd: Apply NOOP when doing negative commands for GR operations
2024-05-07 09:56:51 -04:00
Donatas Abraitis 7b5595b61d bgpd: Print old/new states of graceful restart FSM
To better debug what's going on before/after.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-04-30 13:44:17 +03:00
Philippe Guibert f101108e3e bgpd: fix covery ID 1585206
The return value of bgp_getsockname() should always be
checked.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-04-29 15:44:24 +02:00
Philippe Guibert 78ce63952a bgpd: fix addressing information of non established outgoing sessions
When trying to connect to a BGP peer that does not respons, the 'show
bgp neighbors' command does not give any indication on the local and
remote addresses used:

> # show bgp neighbors
>  BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link
>   Local Role: undefined
>   Remote Role: undefined
>   BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1
>   BGP state = Connect
> [..]
>   Connections established 0; dropped 0
>   Last reset 00:00:04,   Waiting for peer OPEN (n/a)
>   Internal BGP neighbor may be up to 255 hops away.
> BGP Connect Retry Timer in Seconds: 120
> Next connect timer due in 117 seconds
> Read thread: off  Write thread: off  FD used: 27

The addressing information (address and port) are only available
when TCP session is established, whereas this information is present
at the system level:

> root@ubuntu2204:~# netstat -pan | grep 192.0.2.1
> tcp        0      0 192.0.2.1:179           192.0.2.150:38060       SYN_RECV    -
> tcp        0      1 192.0.2.1:46526         192.0.2.150:179         SYN_SENT    488310/bgpd

Add the display for outgoing BGP session, as the information in
the getsockname() API provides information for connected streams.
When getpeername() API does not give any information, use the peer
configuration (destination port is encoded in peer->port).

> # show bgp neighbors
> BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link
>   Local Role: undefined
>   Remote Role: undefined
>   BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1
>   BGP state = Connect
> [..]
>   Connections established 0; dropped 0
>   Last reset 00:00:16,   Waiting for peer OPEN (n/a)
> Local host: 192.0.2.1, Local port: 46084
> Foreign host: 192.0.2.150, Foreign port: 179

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-04-15 09:16:54 +02:00
Donatas Abraitis 4967bf6d72 bgpd: Send "Send Hold Timer Expired" on such events notification
This is required by the current (latest/-02 draft).

IANA has registered code 8 for "Send Hold Timer Expired" in the "BGP
Error (Notification) Codes" sub-registry under the "Border Gateway
Protocol (BGP) Parameters" registry.

https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-sendholdtimer

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-02-29 15:37:53 +02:00