frr/bgpd/bgp_fsm.c

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

3095 lines
93 KiB
C
Raw Normal View History

// SPDX-License-Identifier: GPL-2.0-or-later
2002-12-13 21:15:29 +01:00
/* BGP-4 Finite State Machine
* From RFC1771 [A Border Gateway Protocol 4 (BGP-4)]
* Copyright (C) 1996, 97, 98 Kunihiro Ishiguro
*/
2002-12-13 21:15:29 +01:00
#include <zebra.h>
#include "linklist.h"
#include "prefix.h"
#include "sockunion.h"
#include "frrevent.h"
2002-12-13 21:15:29 +01:00
#include "log.h"
#include "stream.h"
#include "ringbuf.h"
2002-12-13 21:15:29 +01:00
#include "memory.h"
#include "plist.h"
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
#include "workqueue.h"
#include "queue.h"
#include "filter.h"
#include "command.h"
#include "lib_errors.h"
#include "zclient.h"
#include "lib/json.h"
2002-12-13 21:15:29 +01:00
#include "bgpd/bgpd.h"
#include "bgpd/bgp_attr.h"
#include "bgpd/bgp_debug.h"
#include "bgpd/bgp_errors.h"
2002-12-13 21:15:29 +01:00
#include "bgpd/bgp_fsm.h"
#include "bgpd/bgp_packet.h"
#include "bgpd/bgp_network.h"
#include "bgpd/bgp_route.h"
#include "bgpd/bgp_dump.h"
#include "bgpd/bgp_open.h"
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
#include "bgpd/bgp_advertise.h"
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
#include "bgpd/bgp_community.h"
#include "bgpd/bgp_updgrp.h"
#include "bgpd/bgp_nht.h"
#include "bgpd/bgp_bfd.h"
#include "bgpd/bgp_memory.h"
#include "bgpd/bgp_keepalives.h"
#include "bgpd/bgp_io.h"
#include "bgpd/bgp_zebra.h"
#include "bgpd/bgp_vty.h"
DEFINE_HOOK(peer_backward_transition, (struct peer * peer), (peer));
DEFINE_HOOK(peer_status_changed, (struct peer * peer), (peer));
/* Definition of display strings corresponding to FSM events. This should be
* kept consistent with the events defined in bgpd.h
*/
static const char *const bgp_event_str[] = {
NULL,
"BGP_Start",
"BGP_Stop",
"TCP_connection_open",
"TCP_connection_open_w_delay",
"TCP_connection_closed",
"TCP_connection_open_failed",
"TCP_fatal_error",
"ConnectRetry_timer_expired",
"Hold_Timer_expired",
"KeepAlive_timer_expired",
"DelayOpen_timer_expired",
"Receive_OPEN_message",
"Receive_KEEPALIVE_message",
"Receive_UPDATE_message",
"Receive_NOTIFICATION_message",
"Clearing_Completed",
};
2002-12-13 21:15:29 +01:00
/* BGP FSM (finite state machine) has three types of functions. Type
one is thread functions. Type two is event functions. Type three
is FSM functions. Timer functions are set by bgp_timer_set
function. */
/* BGP event function. */
void bgp_event(struct event *event);
2002-12-13 21:15:29 +01:00
/* BGP thread functions. */
static void bgp_start_timer(struct event *event);
static void bgp_connect_timer(struct event *event);
static void bgp_holdtime_timer(struct event *event);
static void bgp_delayopen_timer(struct event *event);
2002-12-13 21:15:29 +01:00
/* Register peer with NHT */
int bgp_peer_reg_with_nht(struct peer *peer)
{
int connected = 0;
if (peer->sort == BGP_PEER_EBGP && peer->ttl == BGP_DEFAULT_TTL
&& !CHECK_FLAG(peer->flags, PEER_FLAG_DISABLE_CONNECTED_CHECK)
&& !CHECK_FLAG(peer->bgp->flags, BGP_FLAG_DISABLE_NH_CONNECTED_CHK))
connected = 1;
return bgp_find_or_add_nexthop(peer->bgp, peer->bgp,
family2afi(
peer->connection->su.sa.sa_family),
SAFI_UNICAST, NULL, peer, connected,
NULL);
}
static void peer_xfer_stats(struct peer *peer_dst, struct peer *peer_src)
{
/* Copy stats over. These are only the pre-established state stats */
peer_dst->open_in += peer_src->open_in;
peer_dst->open_out += peer_src->open_out;
peer_dst->keepalive_in += peer_src->keepalive_in;
peer_dst->keepalive_out += peer_src->keepalive_out;
peer_dst->notify_in += peer_src->notify_in;
peer_dst->notify_out += peer_src->notify_out;
peer_dst->dynamic_cap_in += peer_src->dynamic_cap_in;
peer_dst->dynamic_cap_out += peer_src->dynamic_cap_out;
}
static struct peer *peer_xfer_conn(struct peer *from_peer)
{
struct peer *peer;
afi_t afi;
safi_t safi;
enum bgp_fsm_events last_evt, last_maj_evt;
struct peer_connection *keeper, *going_away;
assert(from_peer != NULL);
/*
* Keeper is the connection that is staying around
*/
keeper = from_peer->connection;
peer = from_peer->doppelganger;
if (!peer || !CHECK_FLAG(peer->flags, PEER_FLAG_CONFIG_NODE))
return from_peer;
/*
* from_peer is pointing at the non config node and
* at this point peer is pointing at the CONFIG node
* peer ( non incoming connection ). The going_away pointer
* is the connection that is being placed on to
* the non Config node for deletion.
*/
going_away = peer->connection;
/*
* Let's check that we are not going to loose known configuration
* state based upon doppelganger rules.
*/
FOREACH_AFI_SAFI (afi, safi) {
if (from_peer->afc[afi][safi] != peer->afc[afi][safi]) {
flog_err(
EC_BGP_DOPPELGANGER_CONFIG,
"from_peer->afc[%d][%d] is not the same as what we are overwriting",
afi, safi);
return NULL;
}
}
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s: peer transfer %p fd %d -> %p fd %d)",
from_peer->host, from_peer, from_peer->connection->fd,
peer, peer->connection->fd);
bgp_writes_off(going_away);
bgp_reads_off(going_away);
bgp_writes_off(keeper);
bgp_reads_off(keeper);
/*
* Before exchanging FD remove doppelganger from
* keepalive peer hash. It could be possible conf peer
* fd is set to -1. If blocked on lock then keepalive
* thread can access peer pointer with fd -1.
*/
bgp_keepalives_off(keeper);
EVENT_OFF(going_away->t_routeadv);
EVENT_OFF(going_away->t_connect);
EVENT_OFF(going_away->t_delayopen);
EVENT_OFF(going_away->t_connect_check_r);
EVENT_OFF(going_away->t_connect_check_w);
bgpd: Fix wrong pthread event cancelling 0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 1 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 2 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 3 0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26 4 0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258 5 <signal handler called> 6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 7 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 8 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 9 0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79 11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470 14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355 15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610 16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152 17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639 18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111 19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, arg=0x76e3985ffaf0) at lib/hash.c:252 20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214 21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180 22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 (gdb) f 12 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 1428 assert(m->owner == pthread_self()); In this decode the attempt to cancel the connection's events from the wrong thread is causing the crash. Modify the code to create an event on the bm->master to cancel the events for the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-24 23:44:31 +02:00
EVENT_OFF(going_away->t_stop_with_notify);
EVENT_OFF(keeper->t_routeadv);
EVENT_OFF(keeper->t_connect);
EVENT_OFF(keeper->t_delayopen);
EVENT_OFF(keeper->t_connect_check_r);
EVENT_OFF(keeper->t_connect_check_w);
EVENT_OFF(keeper->t_process_packet);
/*
* At this point in time, it is possible that there are packets pending
* on various buffers. Those need to be transferred or dropped,
* otherwise we'll get spurious failures during session establishment.
*/
peer->connection = keeper;
keeper->peer = peer;
from_peer->connection = going_away;
going_away->peer = from_peer;
peer->as = from_peer->as;
peer->v_holdtime = from_peer->v_holdtime;
peer->v_keepalive = from_peer->v_keepalive;
peer->v_routeadv = from_peer->v_routeadv;
peer->v_delayopen = from_peer->v_delayopen;
peer->v_gr_restart = from_peer->v_gr_restart;
peer->cap = from_peer->cap;
peer->remote_role = from_peer->remote_role;
last_evt = peer->last_event;
last_maj_evt = peer->last_major_event;
peer->last_event = from_peer->last_event;
peer->last_major_event = from_peer->last_major_event;
from_peer->last_event = last_evt;
from_peer->last_major_event = last_maj_evt;
peer->remote_id = from_peer->remote_id;
bgpd: Add a new command to only show failed peerings In a data center, having 32-128 peers is not uncommon. In such a situation, to find a peer that has failed and why is several commands. This hinders both the automatability of failure detection and the ease/speed with which the reason can be found. To simplify this process of catching a failure and its cause quicker, this patch does the following: 1. Created a new function, bgp_show_failed_summary to display the failed summary output for JSON and vty 2. Created a new function to display the reset code/subcode. This is now used in the failed summary code and in the show neighbors code 3. Added a new variable failedPeers in all the JSON outputs, including the vanilla "show bgp summary" family. This lists the failed session count. 4. Display peer, dropped count, estd count, uptime and the reason for failure as the output of "show bgp summary failed" family of commands 5. Added three resset codes for the case where we're waiting for NHT, waiting for peer IPv6 addr, waiting for VRF to init. This also counts the case where only one peer has advertised an AFI/SAFI. The new command has the optional keyword "failed" added to the classical summary command. The changes affect only one existing output, that of "show [ip] bgp neighbors <nbr>". As we track the lack of NHT resolution for a peer or the lack of knowing a peer IPv6 addr, the output of that command will show a "waiting for NHT" etc. as the last reset reason. This patch includes update to the documentation too. Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
2019-08-31 18:24:49 +02:00
peer->last_reset = from_peer->last_reset;
peer->max_packet_size = from_peer->max_packet_size;
BGP_GR_ROUTER_DETECT_AND_SEND_CAPABILITY_TO_ZEBRA(peer->bgp,
peer->bgp->peer);
if (bgp_peer_gr_mode_get(peer) == PEER_DISABLE) {
UNSET_FLAG(peer->sflags, PEER_STATUS_NSF_MODE);
if (CHECK_FLAG(peer->sflags, PEER_STATUS_NSF_WAIT)) {
peer_nsf_stop(peer);
}
}
if (peer->hostname) {
XFREE(MTYPE_BGP_PEER_HOST, peer->hostname);
peer->hostname = NULL;
}
if (from_peer->hostname != NULL) {
peer->hostname = from_peer->hostname;
from_peer->hostname = NULL;
}
if (peer->domainname) {
XFREE(MTYPE_BGP_PEER_HOST, peer->domainname);
peer->domainname = NULL;
}
if (from_peer->domainname != NULL) {
peer->domainname = from_peer->domainname;
from_peer->domainname = NULL;
}
bgpd: Add BGP Software Version Capability Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability Tested with GoBGP: ``` % ./gobgp neighbor 192.168.10.124 BGP neighbor is 192.168.10.124, remote AS 65001 BGP version 4, remote router ID 200.200.200.202 BGP state = ESTABLISHED, up for 00:01:49 BGP OutQ = 0, Flops = 0 Hold time is 3, keepalive interval is 1 seconds Configured hold time is 90, keepalive interval is 30 seconds Neighbor capabilities: multiprotocol: ipv4-unicast: advertised and received ipv6-unicast: advertised route-refresh: advertised and received extended-nexthop: advertised Local: nlri: ipv4-unicast, nexthop: ipv6 UnknownCapability(6): received UnknownCapability(9): received graceful-restart: advertised and received Local: restart time 10 sec ipv6-unicast ipv4-unicast Remote: restart time 120 sec, notification flag set ipv4-unicast, forward flag set 4-octet-as: advertised and received add-path: received Remote: ipv4-unicast: receive enhanced-route-refresh: received long-lived-graceful-restart: advertised and received Local: ipv6-unicast, restart time 10 sec ipv4-unicast, restart time 20 sec Remote: ipv4-unicast, restart time 0 sec, forward flag set fqdn: advertised and received Local: name: donatas-pc, domain: Remote: name: spine1-debian-11, domain: software-version: advertised and received Local: GoBGP/3.10.0 Remote: FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt cisco-route-refresh: received Message statistics: ``` FRR side: ``` root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \ > jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion' "GoBGP/3.10.0" root@spine1-debian-11:~# ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-02-25 08:00:23 +01:00
if (peer->soft_version) {
XFREE(MTYPE_BGP_SOFT_VERSION, peer->soft_version);
peer->soft_version = NULL;
}
if (from_peer->soft_version) {
peer->soft_version = from_peer->soft_version;
from_peer->soft_version = NULL;
}
FOREACH_AFI_SAFI (afi, safi) {
peer->af_sflags[afi][safi] = from_peer->af_sflags[afi][safi];
peer->af_cap[afi][safi] = from_peer->af_cap[afi][safi];
peer->afc_nego[afi][safi] = from_peer->afc_nego[afi][safi];
peer->afc_adv[afi][safi] = from_peer->afc_adv[afi][safi];
peer->afc_recv[afi][safi] = from_peer->afc_recv[afi][safi];
peer->orf_plist[afi][safi] = from_peer->orf_plist[afi][safi];
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
peer->llgr[afi][safi] = from_peer->llgr[afi][safi];
peer->addpath_paths_limit[afi][safi] =
from_peer->addpath_paths_limit[afi][safi];
}
if (bgp_getsockname(keeper) < 0) {
flog_err(EC_LIB_SOCKET,
"%%bgp_getsockname() failed for %s peer %s fd %d (from_peer fd %d)",
(CHECK_FLAG(peer->sflags, PEER_STATUS_ACCEPT_PEER)
? "accept"
: ""),
peer->host, going_away->fd, keeper->fd);
BGP_EVENT_ADD(going_away, BGP_Stop);
BGP_EVENT_ADD(keeper, BGP_Stop);
return NULL;
}
if (going_away->status > Active) {
if (bgp_getsockname(going_away) < 0) {
flog_err(EC_LIB_SOCKET,
"%%bgp_getsockname() failed for %s from_peer %s fd %d (peer fd %d)",
(CHECK_FLAG(from_peer->sflags,
PEER_STATUS_ACCEPT_PEER)
? "accept"
: ""),
from_peer->host, going_away->fd, keeper->fd);
bgp_stop(going_away);
from_peer = NULL;
}
}
// Note: peer_xfer_stats() must be called with I/O turned OFF
if (from_peer)
peer_xfer_stats(peer, from_peer);
/* Register peer for NHT. This is to allow RAs to be enabled when
* needed, even on a passive connection.
*/
bgp_peer_reg_with_nht(peer);
if (from_peer)
bgp_replace_nexthop_by_peer(from_peer, peer);
bgp_reads_on(keeper);
bgp_writes_on(keeper);
event_add_event(bm->master, bgp_process_packet, keeper, 0,
&keeper->t_process_packet);
return (peer);
}
2002-12-13 21:15:29 +01:00
/* Hook function called after bgp event is occered. And vty's
neighbor command invoke this function after making neighbor
structure. */
void bgp_timer_set(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
afi_t afi;
safi_t safi;
struct peer *peer = connection->peer;
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
switch (connection->status) {
2002-12-13 21:15:29 +01:00
case Idle:
/* First entry point of peer's finite state machine. In Idle
status start timer is on unless peer is shutdown or peer is
inactive. All other timer must be turned off */
if (BGP_PEER_START_SUPPRESSED(peer) || peer_active(connection) != BGP_PEER_ACTIVE ||
peer->bgp->vrf_id == VRF_UNKNOWN) {
EVENT_OFF(connection->t_start);
2002-12-13 21:15:29 +01:00
} else {
BGP_TIMER_ON(connection->t_start, bgp_start_timer,
peer->v_start);
}
EVENT_OFF(connection->t_connect);
EVENT_OFF(connection->t_holdtime);
bgp_keepalives_off(connection);
EVENT_OFF(connection->t_routeadv);
EVENT_OFF(connection->t_delayopen);
break;
case Connect:
2002-12-13 21:15:29 +01:00
/* After start timer is expired, the peer moves to Connect
status. Make sure start timer is off and connect timer is
on. */
EVENT_OFF(connection->t_start);
if (CHECK_FLAG(peer->flags, PEER_FLAG_TIMER_DELAYOPEN))
BGP_TIMER_ON(connection->t_connect, bgp_connect_timer,
(peer->v_delayopen + peer->v_connect));
else
BGP_TIMER_ON(connection->t_connect, bgp_connect_timer,
peer->v_connect);
EVENT_OFF(connection->t_holdtime);
bgp_keepalives_off(connection);
EVENT_OFF(connection->t_routeadv);
2002-12-13 21:15:29 +01:00
break;
2002-12-13 21:15:29 +01:00
case Active:
/* Active is waiting connection from remote peer. And if
connect timer is expired, change status to Connect. */
EVENT_OFF(connection->t_start);
/* If peer is passive mode, do not set connect timer. */
if (CHECK_FLAG(peer->flags, PEER_FLAG_PASSIVE)
|| CHECK_FLAG(peer->sflags, PEER_STATUS_NSF_WAIT)) {
EVENT_OFF(connection->t_connect);
} else {
if (CHECK_FLAG(peer->flags, PEER_FLAG_TIMER_DELAYOPEN))
BGP_TIMER_ON(connection->t_connect,
bgp_connect_timer,
(peer->v_delayopen +
peer->v_connect));
else
BGP_TIMER_ON(connection->t_connect,
bgp_connect_timer, peer->v_connect);
}
EVENT_OFF(connection->t_holdtime);
bgp_keepalives_off(connection);
EVENT_OFF(connection->t_routeadv);
break;
2002-12-13 21:15:29 +01:00
case OpenSent:
/* OpenSent status. */
EVENT_OFF(connection->t_start);
EVENT_OFF(connection->t_connect);
2002-12-13 21:15:29 +01:00
if (peer->v_holdtime != 0) {
BGP_TIMER_ON(connection->t_holdtime, bgp_holdtime_timer,
peer->v_holdtime);
} else {
EVENT_OFF(connection->t_holdtime);
}
bgp_keepalives_off(connection);
EVENT_OFF(connection->t_routeadv);
EVENT_OFF(connection->t_delayopen);
break;
case OpenConfirm:
2002-12-13 21:15:29 +01:00
/* OpenConfirm status. */
EVENT_OFF(connection->t_start);
EVENT_OFF(connection->t_connect);
/*
* If the negotiated Hold Time value is zero, then the Hold Time
* timer and KeepAlive timers are not started.
* Additionally if a different hold timer has been negotiated
* than we must stop then start the timer again
*/
EVENT_OFF(connection->t_holdtime);
if (peer->v_holdtime == 0)
bgp_keepalives_off(connection);
else {
BGP_TIMER_ON(connection->t_holdtime, bgp_holdtime_timer,
peer->v_holdtime);
bgp_keepalives_on(connection);
}
EVENT_OFF(connection->t_routeadv);
EVENT_OFF(connection->t_delayopen);
2002-12-13 21:15:29 +01:00
break;
2002-12-13 21:15:29 +01:00
case Established:
/* In Established status start and connect timer is turned
off. */
EVENT_OFF(connection->t_start);
EVENT_OFF(connection->t_connect);
EVENT_OFF(connection->t_delayopen);
/*
* Same as OpenConfirm, if holdtime is zero then both holdtime
* and keepalive must be turned off.
* Additionally if a different hold timer has been negotiated
* then we must stop then start the timer again
*/
EVENT_OFF(connection->t_holdtime);
if (peer->v_holdtime == 0)
bgp_keepalives_off(connection);
else {
BGP_TIMER_ON(connection->t_holdtime, bgp_holdtime_timer,
peer->v_holdtime);
bgp_keepalives_on(connection);
}
2002-12-13 21:15:29 +01:00
break;
case Deleted:
EVENT_OFF(peer->connection->t_gr_restart);
EVENT_OFF(peer->connection->t_gr_stale);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
FOREACH_AFI_SAFI (afi, safi)
EVENT_OFF(peer->t_llgr_stale[afi][safi]);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
EVENT_OFF(peer->connection->t_pmax_restart);
EVENT_OFF(peer->t_refresh_stalepath);
fallthrough;
2002-12-13 21:15:29 +01:00
case Clearing:
EVENT_OFF(connection->t_start);
EVENT_OFF(connection->t_connect);
EVENT_OFF(connection->t_holdtime);
bgp_keepalives_off(connection);
EVENT_OFF(connection->t_routeadv);
EVENT_OFF(connection->t_delayopen);
break;
case BGP_STATUS_MAX:
flog_err(EC_LIB_DEVELOPMENT,
"BGP_STATUS_MAX while a legal state is not valid state for the FSM");
break;
2002-12-13 21:15:29 +01:00
}
}
/* BGP start timer. This function set BGP_Start event to thread value
and process event. */
static void bgp_start_timer(struct event *thread)
2002-12-13 21:15:29 +01:00
{
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
2002-12-13 21:15:29 +01:00
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Timer (start timer expire for %s).", peer->host,
bgp_peer_get_connection_direction(connection));
2002-12-13 21:15:29 +01:00
EVENT_VAL(thread) = BGP_Start;
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
bgp_event(thread); /* bgp_event unlocks peer */
2002-12-13 21:15:29 +01:00
}
/* BGP connect retry timer. */
static void bgp_connect_timer(struct event *thread)
2002-12-13 21:15:29 +01:00
{
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
/* stop the DelayOpenTimer if it is running */
EVENT_OFF(connection->t_delayopen);
assert(!connection->t_write);
assert(!connection->t_read);
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Timer (connect timer (%us) expire for %s)", peer->host,
peer->v_connect, bgp_peer_get_connection_direction(connection));
if (CHECK_FLAG(peer->sflags, PEER_STATUS_ACCEPT_PEER))
bgp_stop(connection);
else {
if (!peer->connect)
peer->v_connect = MIN(BGP_MAX_CONNECT_RETRY, peer->v_connect * 2);
EVENT_VAL(thread) = ConnectRetry_timer_expired;
bgp_event(thread); /* bgp_event unlocks peer */
}
2002-12-13 21:15:29 +01:00
}
/* BGP holdtime timer. */
static void bgp_holdtime_timer(struct event *thread)
2002-12-13 21:15:29 +01:00
{
atomic_size_t inq_count;
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
2002-12-13 21:15:29 +01:00
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Timer (holdtime timer expire for %s)", peer->host,
bgp_peer_get_connection_direction(connection));
2002-12-13 21:15:29 +01:00
/*
* Given that we do not have any expectation of ordering
* for handling packets from a peer -vs- handling
* the hold timer for a peer as that they are both
* events on the peer. If we have incoming
* data on the peers inq, let's give the system a chance
* to handle that data. This can be especially true
* for systems where we are heavily loaded for one
* reason or another.
*/
frr_with_mutex (&connection->io_mtx) {
inq_count = atomic_load_explicit(&connection->ibuf->count, memory_order_relaxed);
}
if (inq_count)
BGP_TIMER_ON(connection->t_holdtime, bgp_holdtime_timer,
peer->v_holdtime);
EVENT_VAL(thread) = Hold_Timer_expired;
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
bgp_event(thread); /* bgp_event unlocks peer */
2002-12-13 21:15:29 +01:00
}
void bgp_routeadv_timer(struct event *thread)
2002-12-13 21:15:29 +01:00
{
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
2002-12-13 21:15:29 +01:00
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Timer (routeadv timer expire for %s)", peer->host,
bgp_peer_get_connection_direction(connection));
2002-12-13 21:15:29 +01:00
peer->synctime = monotime(NULL);
2002-12-13 21:15:29 +01:00
event_add_timer_msec(bm->master, bgp_generate_updgrp_packets, connection,
0, &connection->t_generate_updgrp_packets);
/* MRAI timer will be started again when FIFO is built, no need to
* do it here.
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
*/
2002-12-13 21:15:29 +01:00
}
/* RFC 4271 DelayOpenTimer */
void bgp_delayopen_timer(struct event *thread)
{
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Timer (DelayOpentimer expire for %s)", peer->host,
bgp_peer_get_connection_direction(connection));
EVENT_VAL(thread) = DelayOpen_timer_expired;
bgp_event(thread); /* bgp_event unlocks peer */
}
/* BGP Peer Down Cause */
const char *const peer_down_str[] = {
"",
"Router ID changed",
"Remote AS changed",
"Local AS change",
"Cluster ID changed",
"Confederation identifier changed",
"Confederation peer changed",
"RR client config change",
"RS client config change",
"Update source change",
"Address family activated",
"Admin. shutdown",
"User reset",
"BGP Notification received",
"BGP Notification send",
"Peer closed the session",
"Neighbor deleted",
"Peer-group add member",
"Peer-group delete member",
"Capability changed",
"Passive config change",
"Multihop config change",
"NSF peer closed the session",
"Intf peering v6only config change",
"BFD down received",
"Interface down",
"Neighbor address lost",
"No path to specified Neighbor",
"Waiting for Peer IPv6 LLA",
"Waiting for VRF to be initialized",
"No AFI/SAFI activated for peer",
"AS Set config change",
"Waiting for peer OPEN",
"Reached received prefix count",
"Socket Error",
"Admin. shutdown (RTT)",
"Suppress Fib Turned On or Off",
"Password config change",
"Router ID is missing",
};
static void bgp_graceful_restart_timer_off(struct peer_connection *connection,
struct peer *peer)
{
afi_t afi;
safi_t safi;
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
FOREACH_AFI_SAFI (afi, safi)
if (CHECK_FLAG(peer->af_sflags[afi][safi],
PEER_STATUS_LLGR_WAIT))
return;
UNSET_FLAG(peer->sflags, PEER_STATUS_NSF_WAIT);
EVENT_OFF(connection->t_gr_stale);
bgpd: Do not delete BGP dynamic peers if graceful restart kicks in ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 *donatas-pc(192.168.10.17) 4 65002 8 12 0 0 0 00:01:35 2 14 N/A ``` Before shutting down 192.168.10.17: ``` ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Last update: Sat Jan 15 21:45:47 2022 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (Older Path) Last update: Sat Jan 15 21:25:19 2022 ``` After 192.168.10.17 is down: ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 donatas-pc(192.168.10.17) 4 65002 5 9 0 0 0 00:00:12 Active 0 N/A ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Community: llgr-stale Last update: Sat Jan 15 21:49:01 2022 Time until Long-lived stale route deleted: 16 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (First path received) Last update: Sat Jan 15 21:25:19 2022 ``` Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-01-15 22:16:15 +01:00
if (peer_dynamic_neighbor(peer) &&
!(CHECK_FLAG(peer->flags, PEER_FLAG_DELETE))) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s (dynamic neighbor) deleted (%s) for %s", __func__,
peer->host, bgp_peer_get_connection_direction(connection));
bgpd: Do not delete BGP dynamic peers if graceful restart kicks in ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 *donatas-pc(192.168.10.17) 4 65002 8 12 0 0 0 00:01:35 2 14 N/A ``` Before shutting down 192.168.10.17: ``` ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Last update: Sat Jan 15 21:45:47 2022 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (Older Path) Last update: Sat Jan 15 21:25:19 2022 ``` After 192.168.10.17 is down: ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 donatas-pc(192.168.10.17) 4 65002 5 9 0 0 0 00:00:12 Active 0 N/A ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Community: llgr-stale Last update: Sat Jan 15 21:49:01 2022 Time until Long-lived stale route deleted: 16 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (First path received) Last update: Sat Jan 15 21:25:19 2022 ``` Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-01-15 22:16:15 +01:00
peer_delete(peer);
}
bgp_timer_set(connection);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
}
static void bgp_llgr_stale_timer_expire(struct event *thread)
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
{
struct peer_af *paf;
struct peer *peer;
afi_t afi;
safi_t safi;
paf = EVENT_ARG(thread);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
peer = paf->peer;
afi = paf->afi;
safi = paf->safi;
/* If the timer for the "Long-lived Stale Time" expires before the
* session is re-established, the helper MUST delete all the
* stale routes from the neighbor that it is retaining.
*/
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP Long-lived stale timer (%s) expired for %s", peer,
get_afi_safi_str(afi, safi, false),
bgp_peer_get_connection_direction(peer->connection));
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
UNSET_FLAG(peer->af_sflags[afi][safi], PEER_STATUS_LLGR_WAIT);
bgp_clear_stale_route(peer, afi, safi);
bgp_graceful_restart_timer_off(peer->connection, peer);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
}
static void bgp_set_llgr_stale(struct peer *peer, afi_t afi, safi_t safi)
{
struct bgp_dest *dest;
struct bgp_path_info *pi, *next;
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
struct bgp_table *table;
struct attr attr;
if (safi == SAFI_MPLS_VPN || safi == SAFI_ENCAP || safi == SAFI_EVPN) {
for (dest = bgp_table_top(peer->bgp->rib[afi][safi]); dest;
dest = bgp_route_next(dest)) {
struct bgp_dest *rm;
table = bgp_dest_get_bgp_table_info(dest);
if (!table)
continue;
for (rm = bgp_table_top(table); rm;
rm = bgp_route_next(rm))
for (pi = bgp_dest_get_bgp_path_info(rm);
(pi != NULL) && (next = pi->next, 1); pi = next) {
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
if (pi->peer != peer)
continue;
if (bgp_attr_get_community(pi->attr) &&
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
community_include(
bgp_attr_get_community(
pi->attr),
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
COMMUNITY_NO_LLGR))
continue;
if (bgp_attr_get_community(pi->attr) &&
community_include(bgp_attr_get_community(pi->attr),
COMMUNITY_LLGR_STALE))
continue;
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
if (bgp_debug_neighbor_events(peer))
zlog_debug(
"%pBP Long-lived set stale community (LLGR_STALE) for: %pFX",
peer, &dest->rn->p);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
attr = *pi->attr;
bgp_attr_add_llgr_community(&attr);
pi->attr = bgp_attr_intern(&attr);
bgp_process(peer->bgp, rm, pi, afi,
safi);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
}
}
} else {
for (dest = bgp_table_top(peer->bgp->rib[afi][safi]); dest;
dest = bgp_route_next(dest))
for (pi = bgp_dest_get_bgp_path_info(dest);
(pi != NULL) && (next = pi->next, 1); pi = next) {
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
if (pi->peer != peer)
continue;
if (bgp_attr_get_community(pi->attr) &&
community_include(
bgp_attr_get_community(pi->attr),
COMMUNITY_NO_LLGR))
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
continue;
if (bgp_attr_get_community(pi->attr) &&
community_include(bgp_attr_get_community(pi->attr),
COMMUNITY_LLGR_STALE))
continue;
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
if (bgp_debug_neighbor_events(peer))
zlog_debug(
"%pBP Long-lived set stale community (LLGR_STALE) for: %pFX",
peer, &dest->rn->p);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
attr = *pi->attr;
bgp_attr_add_llgr_community(&attr);
pi->attr = bgp_attr_intern(&attr);
bgp_process(peer->bgp, dest, pi, afi, safi);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
}
}
}
static void bgp_graceful_restart_timer_expire(struct event *thread)
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
{
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
struct peer *tmp_peer;
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
struct listnode *node, *nnode;
struct peer_af *paf;
afi_t afi;
safi_t safi;
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP graceful restart timer expired and graceful restart stalepath timer stopped for %s",
peer, bgp_peer_get_connection_direction(connection));
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
FOREACH_AFI_SAFI (afi, safi) {
if (!peer->nsf[afi][safi])
continue;
/* Once the "Restart Time" period ends, the LLGR period is
* said to have begun and the following procedures MUST be
* performed:
*
* The helper router MUST start a timer for the
* "Long-lived Stale Time".
*
* The helper router MUST attach the LLGR_STALE community
* for the stale routes being retained. Note that this
* requirement implies that the routes would need to be
* readvertised, to disseminate the modified community.
*/
if (peer->llgr[afi][safi].stale_time) {
paf = peer_af_find(peer, afi, safi);
if (!paf)
continue;
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP Long-lived stale timer (%s) started for %d sec for %s",
peer, get_afi_safi_str(afi, safi, false),
peer->llgr[afi][safi].stale_time,
bgp_peer_get_connection_direction(connection));
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
SET_FLAG(peer->af_sflags[afi][safi],
PEER_STATUS_LLGR_WAIT);
bgp_set_llgr_stale(peer, afi, safi);
bgp_clear_stale_route(peer, afi, safi);
event_add_timer(bm->master, bgp_llgr_stale_timer_expire,
paf, peer->llgr[afi][safi].stale_time,
&peer->t_llgr_stale[afi][safi]);
bgpd: Implement LLGR helper mode Tested between GoBGP and FRR (this commit). ``` ┌───────────┐ ┌────────────┐ │ │ │ │ │ GoBGPD │ │ FRRouting │ │ (restart) │ │ │ │ │ │ │ └──────┬────┘ └───────┬────┘ │ │ │ │ │ │ │ ┌───────────┐ │ │ │ │ │ │ │ │ │ └─────┤ FRRouting ├────────┘ │ (helper) │ │ │ └───────────┘ // GoBGPD % cat /etc/gobgp/config.toml [global.config] as = 65002 router-id = "2.2.2.2" port = 179 [[neighbors]] [neighbors.config] peer-as = 65001 neighbor-address = "2a02:abc::123" [neighbors.graceful-restart.config] enabled = true restart-time = 3 long-lived-enabled = true [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv6-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 10 [[neighbors.afi-safis]] [neighbors.afi-safis.config] afi-safi-name = "ipv4-unicast" [neighbors.afi-safis.mp-graceful-restart.config] enabled = true [neighbors.afi-safis.long-lived-graceful-restart.config] enabled = true restart-time = 20 % ./gobgp global rib add -a ipv6 2001:db8:4::/64 % ./gobgp global rib add -a ipv6 2001:db8:5::/64 community 65535:7 % ./gobgp global rib add -a ipv4 100.100.100.100/32 % ./gobgp global rib add -a ipv4 100.100.100.200/32 community 65535:7 ``` 1. When killing GoBGPD, graceful restart timer starts in FRR helper router; 2. When GR timer expires in helper router: a) LLGR_STALE community is attached to routes to be retained; b) Clear stale routes that have NO_LLGR community attached; c) Start LLGR timer per AFI/SAFI; d) Recompute bestpath and reannounce routes to peers; d) When LLGR timer expires, clear all routes on particular AFI/SAFI. Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-12-20 22:03:09 +01:00
for (ALL_LIST_ELEMENTS(peer->bgp->peer, node, nnode,
tmp_peer))
bgp_announce_route(tmp_peer, afi, safi, false);
} else {
bgp_clear_stale_route(peer, afi, safi);
}
}
bgp_graceful_restart_timer_off(connection, peer);
}
static void bgp_graceful_stale_timer_expire(struct event *thread)
{
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
afi_t afi;
safi_t safi;
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP graceful restart stalepath timer expired for %s", peer,
bgp_peer_get_connection_direction(connection));
/* NSF delete stale route */
FOREACH_AFI_SAFI_NSF (afi, safi)
if (peer->nsf[afi][safi])
bgp_clear_stale_route(peer, afi, safi);
}
/* Selection deferral timer processing function */
static void bgp_graceful_deferral_timer_expire(struct event *thread)
{
struct afi_safi_info *info;
afi_t afi;
safi_t safi;
struct bgp *bgp;
info = EVENT_ARG(thread);
afi = info->afi;
safi = info->safi;
bgp = info->bgp;
if (BGP_DEBUG(update, UPDATE_OUT))
zlog_debug(
"afi %d, safi %d : graceful restart deferral timer expired",
afi, safi);
bgp->gr_info[afi][safi].eor_required = 0;
bgp->gr_info[afi][safi].eor_received = 0;
XFREE(MTYPE_TMP, info);
/* Best path selection */
bgp_best_path_select_defer(bgp, afi, safi);
}
static bool bgp_update_delay_applicable(struct bgp *bgp)
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
{
/* update_delay_over flag should be reset (set to 0) for any new
applicability of the update-delay during BGP process lifetime.
And it should be set after an occurence of the update-delay is
over)*/
if (!bgp->update_delay_over)
return true;
return false;
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
}
bool bgp_update_delay_active(struct bgp *bgp)
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
{
if (bgp->t_update_delay)
return true;
return false;
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
}
bool bgp_update_delay_configured(struct bgp *bgp)
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
{
if (bgp->v_update_delay)
return true;
return false;
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
}
/* Do the post-processing needed when bgp comes out of the read-only mode
on ending the update delay. */
void bgp_update_delay_end(struct bgp *bgp)
{
EVENT_OFF(bgp->t_update_delay);
EVENT_OFF(bgp->t_establish_wait);
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
/* Reset update-delay related state */
bgp->update_delay_over = 1;
bgp->established = 0;
bgp->restarted_peers = 0;
bgp->implicit_eors = 0;
bgp->explicit_eors = 0;
frr_timestamp(3, bgp->update_delay_end_time,
sizeof(bgp->update_delay_end_time));
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
/*
* Add an end-of-initial-update marker to the main process queues so
* that
* the route advertisement timer for the peers can be started. Also set
* the zebra and peer update hold flags. These flags are used to achieve
* three stages in the update-delay post processing:
* 1. Finish best-path selection for all the prefixes held on the
* queues.
* (routes in BGP are updated, and peers sync queues are populated
* too)
* 2. As the eoiu mark is reached in the bgp process routine, ship all
* the
* routes to zebra. With that zebra should see updates from BGP
* close
* to each other.
* 3. Unblock the peer update writes. With that peer update packing
* with
* the prefixes should be at its maximum.
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
*/
BGP: route-server will now use addpath...chop the _rsclient code Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Ticket: CM-8122 per draft-ietf-idr-ix-bgp-route-server-09: 2.3.2.2.2. BGP ADD-PATH Approach The [I-D.ietf-idr-add-paths] Internet draft proposes a different approach to multiple path propagation, by allowing a BGP speaker to forward multiple paths for the same prefix on a single BGP session. As [RFC4271] specifies that a BGP listener must implement an implicit withdraw when it receives an UPDATE message for a prefix which already exists in its Adj-RIB-In, this approach requires explicit support for the feature both on the route server and on its clients. If the ADD-PATH capability is negotiated bidirectionally between the route server and a route server client, and the route server client propagates multiple paths for the same prefix to the route server, then this could potentially cause the propagation of inactive, invalid or suboptimal paths to the route server, thereby causing loss of reachability to other route server clients. For this reason, ADD- PATH implementations on a route server should enforce send-only mode with the route server clients, which would result in negotiating receive-only mode from the client to the route server. This allows us to delete all of the following code: - All XXXX_rsclient() functions - peer->rib - BGP_TABLE_MAIN and BGP_TABLE_RSCLIENT - RMAP_IMPORT and RMAP_EXPORT
2015-11-10 16:29:12 +01:00
bgp_add_eoiu_mark(bgp);
bgp->main_zebra_update_hold = 1;
bgp->main_peers_update_hold = 1;
/*
* Resume the queue processing. This should trigger the event that would
* take care of processing any work that was queued during the read-only
* mode.
*/
work_queue_unplug(bgp->process_queue);
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
}
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
/**
* see bgp_fsm.h
*/
void bgp_start_routeadv(struct bgp *bgp)
{
struct listnode *node, *nnode;
struct peer *peer;
zlog_info("%s, update hold status %d", __func__,
BGP: route-server will now use addpath...chop the _rsclient code Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Ticket: CM-8122 per draft-ietf-idr-ix-bgp-route-server-09: 2.3.2.2.2. BGP ADD-PATH Approach The [I-D.ietf-idr-add-paths] Internet draft proposes a different approach to multiple path propagation, by allowing a BGP speaker to forward multiple paths for the same prefix on a single BGP session. As [RFC4271] specifies that a BGP listener must implement an implicit withdraw when it receives an UPDATE message for a prefix which already exists in its Adj-RIB-In, this approach requires explicit support for the feature both on the route server and on its clients. If the ADD-PATH capability is negotiated bidirectionally between the route server and a route server client, and the route server client propagates multiple paths for the same prefix to the route server, then this could potentially cause the propagation of inactive, invalid or suboptimal paths to the route server, thereby causing loss of reachability to other route server clients. For this reason, ADD- PATH implementations on a route server should enforce send-only mode with the route server clients, which would result in negotiating receive-only mode from the client to the route server. This allows us to delete all of the following code: - All XXXX_rsclient() functions - peer->rib - BGP_TABLE_MAIN and BGP_TABLE_RSCLIENT - RMAP_IMPORT and RMAP_EXPORT
2015-11-10 16:29:12 +01:00
bgp->main_peers_update_hold);
BGP: route-server will now use addpath...chop the _rsclient code Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Ticket: CM-8122 per draft-ietf-idr-ix-bgp-route-server-09: 2.3.2.2.2. BGP ADD-PATH Approach The [I-D.ietf-idr-add-paths] Internet draft proposes a different approach to multiple path propagation, by allowing a BGP speaker to forward multiple paths for the same prefix on a single BGP session. As [RFC4271] specifies that a BGP listener must implement an implicit withdraw when it receives an UPDATE message for a prefix which already exists in its Adj-RIB-In, this approach requires explicit support for the feature both on the route server and on its clients. If the ADD-PATH capability is negotiated bidirectionally between the route server and a route server client, and the route server client propagates multiple paths for the same prefix to the route server, then this could potentially cause the propagation of inactive, invalid or suboptimal paths to the route server, thereby causing loss of reachability to other route server clients. For this reason, ADD- PATH implementations on a route server should enforce send-only mode with the route server clients, which would result in negotiating receive-only mode from the client to the route server. This allows us to delete all of the following code: - All XXXX_rsclient() functions - peer->rib - BGP_TABLE_MAIN and BGP_TABLE_RSCLIENT - RMAP_IMPORT and RMAP_EXPORT
2015-11-10 16:29:12 +01:00
if (bgp->main_peers_update_hold)
return;
frr_timestamp(3, bgp->update_delay_peers_resume_time,
sizeof(bgp->update_delay_peers_resume_time));
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
struct peer_connection *connection = peer->connection;
if (!peer_established(connection))
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
continue;
EVENT_OFF(connection->t_routeadv);
BGP_TIMER_ON(connection->t_routeadv, bgp_routeadv_timer, 0);
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
}
}
/**
* see bgp_fsm.h
*/
void bgp_adjust_routeadv(struct peer *peer)
{
time_t nowtime = monotime(NULL);
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
double diff;
unsigned long remain;
struct peer_connection *connection = peer->connection;
/* Bypass checks for special case of MRAI being 0 */
if (peer->v_routeadv == 0) {
/* Stop existing timer, just in case it is running for a
* different
* duration and schedule write thread immediately.
*/
EVENT_OFF(connection->t_routeadv);
peer->synctime = monotime(NULL);
/* If suppress fib pending is enabled, route is advertised to
* peers when the status is received from the FIB. The delay
* is added to update group packet generate which will allow
* more routes to be sent in the update message
*/
BGP_UPDATE_GROUP_TIMER_ON(&connection->t_generate_updgrp_packets,
bgp_generate_updgrp_packets);
return;
}
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
/*
* CASE I:
* If the last update was written more than MRAI back, expire the timer
* instantly so that we can send the update out sooner.
*
* <------- MRAI --------->
* |-----------------|-----------------------|
* <------------- m ------------>
* ^ ^ ^
* | | |
* | | current time
* | timer start
* last write
*
* m > MRAI
*/
diff = difftime(nowtime, peer->last_update);
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
if (diff > (double)peer->v_routeadv) {
EVENT_OFF(connection->t_routeadv);
BGP_TIMER_ON(connection->t_routeadv, bgp_routeadv_timer, 0);
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
return;
}
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
/*
* CASE II:
* - Find when to expire the MRAI timer.
* If MRAI timer is not active, assume we can start it now.
*
* <------- MRAI --------->
* |------------|-----------------------|
* <-------- m ----------><----- r ----->
* ^ ^ ^
* | | |
* | | current time
* | timer start
* last write
*
* (MRAI - m) < r
*/
if (connection->t_routeadv)
remain = event_timer_remain_second(connection->t_routeadv);
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
else
remain = peer->v_routeadv;
diff = peer->v_routeadv - diff;
if (diff <= (double)remain) {
EVENT_OFF(connection->t_routeadv);
BGP_TIMER_ON(connection->t_routeadv, bgp_routeadv_timer, diff);
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
}
}
static bool bgp_maxmed_onstartup_applicable(struct bgp *bgp)
{
if (!bgp->maxmed_onstartup_over)
return true;
return false;
}
bool bgp_maxmed_onstartup_configured(struct bgp *bgp)
{
if (bgp->v_maxmed_onstartup != BGP_MAXMED_ONSTARTUP_UNCONFIGURED)
return true;
return false;
}
bool bgp_maxmed_onstartup_active(struct bgp *bgp)
{
if (bgp->t_maxmed_onstartup)
return true;
return false;
}
void bgp_maxmed_update(struct bgp *bgp)
{
uint8_t maxmed_active;
uint32_t maxmed_value;
if (bgp->v_maxmed_admin) {
maxmed_active = 1;
maxmed_value = bgp->maxmed_admin_value;
} else if (bgp->t_maxmed_onstartup) {
maxmed_active = 1;
maxmed_value = bgp->maxmed_onstartup_value;
} else {
maxmed_active = 0;
maxmed_value = BGP_MAXMED_VALUE_DEFAULT;
}
if (bgp->maxmed_active != maxmed_active
|| bgp->maxmed_value != maxmed_value) {
bgp->maxmed_active = maxmed_active;
bgp->maxmed_value = maxmed_value;
update_group_announce(bgp);
}
}
int bgp_fsm_error_subcode(int status)
{
int fsm_err_subcode = BGP_NOTIFY_FSM_ERR_SUBCODE_UNSPECIFIC;
switch (status) {
case OpenSent:
fsm_err_subcode = BGP_NOTIFY_FSM_ERR_SUBCODE_OPENSENT;
break;
case OpenConfirm:
fsm_err_subcode = BGP_NOTIFY_FSM_ERR_SUBCODE_OPENCONFIRM;
break;
case Established:
fsm_err_subcode = BGP_NOTIFY_FSM_ERR_SUBCODE_ESTABLISHED;
break;
default:
break;
}
return fsm_err_subcode;
}
/* The maxmed onstartup timer expiry callback. */
static void bgp_maxmed_onstartup_timer(struct event *thread)
{
struct bgp *bgp;
zlog_info("Max med on startup ended - timer expired.");
bgp = EVENT_ARG(thread);
EVENT_OFF(bgp->t_maxmed_onstartup);
bgp->maxmed_onstartup_over = 1;
bgp_maxmed_update(bgp);
}
static void bgp_maxmed_onstartup_begin(struct bgp *bgp)
{
/* Applicable only once in the process lifetime on the startup */
if (bgp->maxmed_onstartup_over)
return;
zlog_info("Begin maxmed onstartup mode - timer %d seconds",
bgp->v_maxmed_onstartup);
event_add_timer(bm->master, bgp_maxmed_onstartup_timer, bgp,
bgp->v_maxmed_onstartup, &bgp->t_maxmed_onstartup);
if (!bgp->v_maxmed_admin) {
bgp->maxmed_active = 1;
bgp->maxmed_value = bgp->maxmed_onstartup_value;
}
/* Route announce to all peers should happen after this in
* bgp_establish() */
}
static void bgp_maxmed_onstartup_process_status_change(struct peer *peer)
{
if (peer_established(peer->connection) && !peer->bgp->established) {
bgp_maxmed_onstartup_begin(peer->bgp);
}
}
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
/* The update delay timer expiry callback. */
static void bgp_update_delay_timer(struct event *thread)
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
{
struct bgp *bgp;
zlog_info("Update delay ended - timer expired.");
bgp = EVENT_ARG(thread);
EVENT_OFF(bgp->t_update_delay);
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
bgp_update_delay_end(bgp);
}
/* The establish wait timer expiry callback. */
static void bgp_establish_wait_timer(struct event *thread)
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
{
struct bgp *bgp;
zlog_info("Establish wait - timer expired.");
bgp = EVENT_ARG(thread);
EVENT_OFF(bgp->t_establish_wait);
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
bgp_check_update_delay(bgp);
}
/* Steps to begin the update delay:
- initialize queues if needed
- stop the queue processing
- start the timer */
static void bgp_update_delay_begin(struct bgp *bgp)
{
struct listnode *node, *nnode;
struct peer *peer;
/* Stop the processing of queued work. Enqueue shall continue */
work_queue_plug(bgp->process_queue);
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer))
peer->update_delay_over = 0;
/* Start the update-delay timer */
event_add_timer(bm->master, bgp_update_delay_timer, bgp,
bgp->v_update_delay, &bgp->t_update_delay);
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
if (bgp->v_establish_wait != bgp->v_update_delay)
event_add_timer(bm->master, bgp_establish_wait_timer, bgp,
bgp->v_establish_wait, &bgp->t_establish_wait);
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
frr_timestamp(3, bgp->update_delay_begin_time,
sizeof(bgp->update_delay_begin_time));
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
}
static void bgp_update_delay_process_status_change(struct peer *peer)
{
if (peer_established(peer->connection)) {
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
if (!peer->bgp->established++) {
bgp_update_delay_begin(peer->bgp);
zlog_info(
"Begin read-only mode - update-delay timer %d seconds",
peer->bgp->v_update_delay);
}
if (CHECK_FLAG(peer->cap, PEER_CAP_GRACEFUL_RESTART_R_BIT_RCV))
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
bgp_update_restarted_peers(peer);
}
if (peer->connection->ostatus == Established &&
bgp_update_delay_active(peer->bgp)) {
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
/* Adjust the update-delay state to account for this flap.
NOTE: Intentionally skipping adjusting implicit_eors or
explicit_eors
counters. Extra sanity check in bgp_check_update_delay()
should
be enough to take care of any additive discrepancy in bgp eor
counters */
peer->bgp->established--;
peer->update_delay_over = 0;
}
}
/* Called after event occurred, this function change status and reset
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
read/write and timer thread. */
void bgp_fsm_change_status(struct peer_connection *connection,
enum bgp_fsm_status status)
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
{
struct peer *peer = connection->peer;
struct bgp *bgp = peer->bgp;
uint32_t peer_count;
peer_count = bgp->established_peers;
if (status == Established) {
bgp->established_peers++;
/* Reset the retry timer if we already established */
if (peer->connect)
peer->v_connect = peer->connect;
else
peer->v_connect = peer->bgp->default_connect_retry;
} else if ((peer_established(connection)) && (status != Established))
bgp->established_peers--;
if (bgp_debug_neighbor_events(peer)) {
struct vrf *vrf = vrf_lookup_by_id(bgp->vrf_id);
zlog_debug("%s : vrf %s(%u), Status: %s established_peers %u for %s", __func__,
vrf ? vrf->name : "Unknown", bgp->vrf_id,
lookup_msg(bgp_status_msg, status, NULL), bgp->established_peers,
bgp_peer_get_connection_direction(connection));
}
/* Set to router ID to the value provided by RIB if there are no peers
* in the established state and peer count did not change
*/
if ((peer_count != bgp->established_peers) &&
(bgp->established_peers == 0))
bgp_router_id_zebra_bump(bgp->vrf_id, NULL);
/* Transition into Clearing or Deleted must /always/ clear all routes..
* (and must do so before actually changing into Deleted..
*/
bgpd: avoid clearing routes for peers that were never established Under heavy system load with many peers in passive mode and a large number of routes, bgpd can enter an infinite loop. This occurs while processing timeout BGP_OPEN messages, which prevents it from accepting new connections. The following log entries illustrate the issue: >bgpd[6151]: [VX6SM-8YE5W][EC 33554460] 3.3.2.224: nexthop_set failed, resetting connection - intf 0x0 >bgpd[6151]: [P790V-THJKS][EC 100663299] bgp_open_receive: bgp_getsockname() failed for peer: 3.3.2.224 >bgpd[6151]: [HTQD2-0R1WR][EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 3.3.2.224 ... repeating The issue occurs when bgpd handles a massive number of routes in the RIB while receiving numerous BGP_OPEN packets. If bgpd is overloaded, it fails to process these packets promptly, leading the remote peer to close the connection and resend BGP_OPEN packets. When bgpd eventually starts processing these timeout BGP_OPEN packets, it finds the TCP connection closed by the remote peer, resulting in "bgp_stop()" being called. For each timeout peer, bgpd must iterate through the routing table, which is time-consuming and causes new incoming BGP_OPEN packets to timeout, perpetuating the infinite loop. To address this issue, the code is modified to check if the peer has been established at least once before calling "bgp_clear_route_all()". This ensures that routes are only cleared for peers that had a successful session, preventing unnecessary iterations over the routing table for peers that never established a connection. With this change, BGP_OPEN timeout messages may still occur, but in the worst case, bgpd will stabilize. Before this patch, bgpd could enter a loop where it was unable to accpet any new connections. Signed-off-by: Loïc Sang <loic.sang@6wind.com>
2024-06-19 16:19:22 +02:00
if (status >= Clearing && (peer->established || peer == bgp->peer_self)) {
bgp_clear_route_all(peer);
/* If no route was queued for the clear-node processing,
* generate the
* completion event here. This is needed because if there are no
* routes
* to trigger the background clear-node thread, the event won't
* get
* generated and the peer would be stuck in Clearing. Note that
* this
* event is for the peer and helps the peer transition out of
* Clearing
* state; it should not be generated per (AFI,SAFI). The event
* is
* directly posted here without calling clear_node_complete() as
* we
* shouldn't do an extra unlock. This event will get processed
* after
* the state change that happens below, so peer will be in
* Clearing
* (or Deleted).
*/
if (!work_queue_is_scheduled(peer->clear_node_queue) &&
status != Deleted)
BGP_EVENT_ADD(connection, Clearing_Completed);
}
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
/* Preserve old status and change into new status. */
connection->ostatus = connection->status;
connection->status = status;
bgpd: Add `neighbor <neigh> shutdown rtt` command This would be useful in cases with lots of peers and shutdown them automatically if RTT goes above the specified limit. A host with 512 or more IPv6 addresses has a higher latency due to ipv6_addr_label(). This method tries to pick the best candidate address fo outgoing connection and literally increases processing latency. ``` Samples: 28 of event 'cycles', Event count (approx.): 22131542 Children Self Command Shared Object Symbol + 100.00% 0.00% ping6 [kernel.kallsyms] [k] entry_SYSCALL_64_fastpath + 100.00% 0.00% ping6 [unknown] [.] 0x0df0ad0b8047022a + 100.00% 0.00% ping6 libc-2.17.so [.] __sendto_nocancel + 100.00% 0.00% ping6 [kernel.kallsyms] [k] sys_sendto + 100.00% 0.00% ping6 [kernel.kallsyms] [k] SYSC_sendto + 100.00% 0.00% ping6 [kernel.kallsyms] [k] sock_sendmsg + 100.00% 0.00% ping6 [kernel.kallsyms] [k] inet_sendmsg + 100.00% 0.00% ping6 [kernel.kallsyms] [k] rawv6_sendmsg + 100.00% 0.00% ping6 [kernel.kallsyms] [k] ip6_dst_lookup_flow + 100.00% 0.00% ping6 [kernel.kallsyms] [k] ip6_dst_lookup_tail + 100.00% 0.00% ping6 [kernel.kallsyms] [k] ip6_route_get_saddr + 100.00% 0.00% ping6 [kernel.kallsyms] [k] ipv6_dev_get_saddr + 100.00% 0.00% ping6 [kernel.kallsyms] [k] __ipv6_dev_get_saddr + 100.00% 0.00% ping6 [kernel.kallsyms] [k] ipv6_get_saddr_eval + 100.00% 0.00% ping6 [kernel.kallsyms] [k] ipv6_addr_label + 100.00% 100.00% ping6 [kernel.kallsyms] [k] __ipv6_addr_label + 0.00% 0.00% ping6 [kernel.kallsyms] [k] schedule ``` This is how it works: ``` ~# vtysh -c 'show bgp neigh 192.168.0.2 json' | jq '."192.168.0.2".estimatedRttInMsecs' 9 ~# tc qdisc add dev eth1 root netem delay 120ms ~# vtysh -c 'show bgp neigh 192.168.0.2 json' | jq '."192.168.0.2".estimatedRttInMsecs' 89 ~# vtysh -c 'show bgp neigh 192.168.0.2 json' | jq '."192.168.0.2".estimatedRttInMsecs' null ~# vtysh -c 'show bgp neigh 192.168.0.2 json' | jq '."192.168.0.2".lastResetDueTo' "Admin. shutdown" ``` Warning message: bgpd[14807]: 192.168.0.2 shutdown due to high round-trip-time (200ms > 150ms) Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-07-25 08:10:56 +02:00
/* Reset received keepalives counter on every FSM change */
peer->rtt_keepalive_rcv = 0;
/* Fire backward transition hook if that's the case */
if (connection->ostatus == Established &&
connection->status != Established)
hook_call(peer_backward_transition, peer);
/* Save event that caused status change. */
peer->last_major_event = peer->cur_event;
/* Operations after status change */
hook_call(peer_status_changed, peer);
if (status == Established)
UNSET_FLAG(peer->sflags, PEER_STATUS_ACCEPT_PEER);
/* If max-med processing is applicable, do the necessary. */
if (status == Established) {
if (bgp_maxmed_onstartup_configured(peer->bgp)
&& bgp_maxmed_onstartup_applicable(peer->bgp))
bgp_maxmed_onstartup_process_status_change(peer);
else
peer->bgp->maxmed_onstartup_over = 1;
}
bgpd: bgpd-update-delay.patch COMMAND: 'update-delay <max-delay in seconds> [<establish-wait in seconds>]' DESCRIPTION: This feature is used to enable read-only mode on BGP process restart or when BGP process is cleared using 'clear ip bgp *'. When applicable, read-only mode would begin as soon as the first peer reaches Established state and a timer for <max-delay> seconds is started. During this mode BGP doesn't run any best-path or generate any updates to its peers. This mode continues until: 1. All the configured peers, except the shutdown peers, have sent explicit EOR (End-Of-RIB) or an implicit-EOR. The first keep-alive after BGP has reached Established is considered an implicit-EOR. If the <establish-wait> optional value is given, then BGP will wait for peers to reach establish from the begining of the update-delay till the establish-wait period is over, i.e. the minimum set of established peers for which EOR is expected would be peers established during the establish-wait window, not necessarily all the configured neighbors. 2. max-delay period is over. On hitting any of the above two conditions, BGP resumes the decision process and generates updates to its peers. Default <max-delay> is 0, i.e. the feature is off by default. This feature can be useful in reducing CPU/network used as BGP restarts/clears. Particularly useful in the topologies where BGP learns a prefix from many peers. Intermediate bestpaths are possible for the same prefix as peers get established and start receiving updates at different times. This feature should offer a value-add if the network has a high number of such prefixes. IMPLEMENTATION OBJECTIVES: Given this is an optional feature, minimized the code-churn. Used existing constructs wherever possible (existing queue-plug/unplug were used to achieve delay and resume of best-paths/update-generation). As a result, no new data-structure(s) had to be defined and allocated. When the feature is disabled, the new node is not exercised for the most part. Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Dinesh Dutt <ddutt@cumulusnetworks.com>
2015-05-20 02:40:33 +02:00
/* If update-delay processing is applicable, do the necessary. */
if (bgp_update_delay_configured(peer->bgp)
&& bgp_update_delay_applicable(peer->bgp))
bgp_update_delay_process_status_change(peer);
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s fd %d went from %s to %s for %s", peer->host, connection->fd,
lookup_msg(bgp_status_msg, connection->ostatus, NULL),
lookup_msg(bgp_status_msg, connection->status, NULL),
bgp_peer_get_connection_direction(connection));
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
}
/* Flush the event queue and ensure the peer is shut down */
static enum bgp_fsm_state_progress
bgp_clearing_completed(struct peer_connection *connection)
{
enum bgp_fsm_state_progress rc = bgp_stop(connection);
if (rc >= BGP_FSM_SUCCESS)
event_cancel_event_ready(bm->master, connection);
return rc;
}
2002-12-13 21:15:29 +01:00
/* Administrative BGP peer stop event. */
/* May be called multiple times for the same peer */
enum bgp_fsm_state_progress bgp_stop(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
afi_t afi;
safi_t safi;
char orf_name[BUFSIZ];
enum bgp_fsm_state_progress ret = BGP_FSM_SUCCESS;
struct peer *peer = connection->peer;
struct bgp *bgp = peer->bgp;
struct graceful_restart_info *gr_info = NULL;
peer->nsf_af_count = 0;
bgpd: Do not delete BGP dynamic peers if graceful restart kicks in ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 *donatas-pc(192.168.10.17) 4 65002 8 12 0 0 0 00:01:35 2 14 N/A ``` Before shutting down 192.168.10.17: ``` ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Last update: Sat Jan 15 21:45:47 2022 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (Older Path) Last update: Sat Jan 15 21:25:19 2022 ``` After 192.168.10.17 is down: ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 donatas-pc(192.168.10.17) 4 65002 5 9 0 0 0 00:00:12 Active 0 N/A ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Community: llgr-stale Last update: Sat Jan 15 21:49:01 2022 Time until Long-lived stale route deleted: 16 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (First path received) Last update: Sat Jan 15 21:25:19 2022 ``` Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-01-15 22:16:15 +01:00
if (peer_dynamic_neighbor_no_nsf(peer) &&
!(CHECK_FLAG(peer->flags, PEER_FLAG_DELETE))) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s (dynamic neighbor) deleted (%s) for %s", __func__,
peer->host, bgp_peer_get_connection_direction(connection));
peer_delete(peer);
return BGP_FSM_FAILURE_AND_DELETE;
}
/* Can't do this in Clearing; events are used for state transitions */
if (connection->status != Clearing) {
/* Delete all existing events of the peer */
event_cancel_event_ready(bm->master, connection);
}
2002-12-13 21:15:29 +01:00
/* Increment Dropped count. */
if (peer_established(connection)) {
2002-12-13 21:15:29 +01:00
peer->dropped++;
if (peer->bfd_config && (peer->last_reset == PEER_DOWN_UPDATE_SOURCE_CHANGE ||
peer->last_reset == PEER_DOWN_MULTIHOP_CHANGE))
bfd_sess_uninstall(peer->bfd_config->session);
/* Notify BGP conditional advertisement process */
peer->advmap_table_change = true;
/* bgp log-neighbor-changes of neighbor Down */
if (CHECK_FLAG(peer->bgp->flags,
BGP_FLAG_LOG_NEIGHBOR_CHANGES)) {
struct vrf *vrf = vrf_lookup_by_id(peer->bgp->vrf_id);
zlog_info(
"%%ADJCHANGE: neighbor %pBP in vrf %s Down %s",
peer,
vrf ? ((vrf->vrf_id != VRF_DEFAULT)
? vrf->name
: VRF_DEFAULT_NAME)
: "",
peer_down_str[(int)peer->last_reset]);
}
/* graceful restart */
if (connection->t_gr_stale) {
EVENT_OFF(connection->t_gr_stale);
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP graceful restart stalepath timer stopped for %s",
peer, bgp_peer_get_connection_direction(connection));
}
if (CHECK_FLAG(peer->sflags, PEER_STATUS_NSF_WAIT)) {
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer)) {
zlog_debug("%pBP graceful restart timer started for %d sec for %s",
peer, peer->v_gr_restart,
bgp_peer_get_connection_direction(connection));
zlog_debug("%pBP graceful restart stalepath timer started for %d sec for %s",
peer, peer->bgp->stalepath_time,
bgp_peer_get_connection_direction(connection));
}
BGP_TIMER_ON(connection->t_gr_restart,
bgp_graceful_restart_timer_expire,
peer->v_gr_restart);
BGP_TIMER_ON(connection->t_gr_stale,
bgp_graceful_stale_timer_expire,
peer->bgp->stalepath_time);
} else {
UNSET_FLAG(peer->sflags, PEER_STATUS_NSF_MODE);
FOREACH_AFI_SAFI_NSF (afi, safi)
peer->nsf[afi][safi] = 0;
}
/* Stop route-refresh stalepath timer */
if (peer->t_refresh_stalepath) {
EVENT_OFF(peer->t_refresh_stalepath);
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP route-refresh restart stalepath timer stopped for %s",
peer, bgp_peer_get_connection_direction(connection));
}
/* If peer reset before receiving EOR, decrement EOR count and
* cancel the selection deferral timer if there are no
* pending EOR messages to be received
*/
if (BGP_PEER_GRACEFUL_RESTART_CAPABLE(peer)) {
FOREACH_AFI_SAFI (afi, safi) {
if (!peer->afc_nego[afi][safi]
|| CHECK_FLAG(peer->af_sflags[afi][safi],
PEER_STATUS_EOR_RECEIVED))
continue;
gr_info = &bgp->gr_info[afi][safi];
if (!gr_info)
continue;
if (gr_info->eor_required)
gr_info->eor_required--;
if (BGP_DEBUG(update, UPDATE_OUT))
zlog_debug("peer %s, EOR_required %d for %s", peer->host,
gr_info->eor_required,
bgp_peer_get_connection_direction(connection));
/* There is no pending EOR message */
if (gr_info->eor_required == 0) {
if (gr_info->t_select_deferral) {
void *info = EVENT_ARG(
gr_info->t_select_deferral);
XFREE(MTYPE_TMP, info);
}
EVENT_OFF(gr_info->t_select_deferral);
gr_info->eor_received = 0;
}
}
}
/* set last reset time */
peer->resettime = peer->uptime = monotime(NULL);
if (BGP_DEBUG(update_groups, UPDATE_GROUPS))
zlog_debug("%s remove from all update group for %s", peer->host,
bgp_peer_get_connection_direction(connection));
update_group_remove_peer_afs(peer);
/* Reset peer synctime */
peer->synctime = 0;
}
/* stop keepalives */
bgp_keepalives_off(connection);
/* Stop read and write threads. */
bgp_writes_off(connection);
bgp_reads_off(connection);
EVENT_OFF(connection->t_connect_check_r);
EVENT_OFF(connection->t_connect_check_w);
bgpd: Fix wrong pthread event cancelling 0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 1 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 2 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 3 0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26 4 0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258 5 <signal handler called> 6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 7 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 8 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 9 0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79 11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470 14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355 15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610 16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152 17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639 18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111 19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, arg=0x76e3985ffaf0) at lib/hash.c:252 20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214 21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180 22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 (gdb) f 12 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 1428 assert(m->owner == pthread_self()); In this decode the attempt to cancel the connection's events from the wrong thread is causing the crash. Modify the code to create an event on the bm->master to cancel the events for the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-24 23:44:31 +02:00
EVENT_OFF(connection->t_stop_with_notify);
/* Stop all timers. */
EVENT_OFF(connection->t_start);
EVENT_OFF(connection->t_connect);
EVENT_OFF(connection->t_holdtime);
EVENT_OFF(connection->t_routeadv);
EVENT_OFF(connection->t_delayopen);
2002-12-13 21:15:29 +01:00
/* Clear input and output buffer. */
frr_with_mutex (&connection->io_mtx) {
if (connection->ibuf)
stream_fifo_clean(connection->ibuf);
if (connection->obuf)
stream_fifo_clean(connection->obuf);
if (connection->ibuf_work)
ringbuf_wipe(connection->ibuf_work);
if (peer->curr) {
stream_free(peer->curr);
peer->curr = NULL;
}
}
/* Close of file descriptor. */
if (connection->fd >= 0) {
close(connection->fd);
connection->fd = -1;
connection->dir = UNKNOWN;
}
bgpd: Clear capabilities field when resetting a bgp neighbor Currently, the following sequence of events between peers could result in erroneous capability reports on the peer with enabled dont-capability-negotiate option: - having some of the capabilities advertised to a bgp neighbor, - then disabling capability negotiation to that neighbor, - then resetting connection to it, - and no capabilities are actually sent to the neighbor, - but "show bgp neighbors" on the host still displays them as advertised to the neighbor. There are two possibilities for establishing a new connection - the established connection was initiated by us with bgp_start(), - the connection was initiated on the neighbor side and processed by us via bgp_accept() in bgp_network.c. The former case results in "show bgp neighbors" displaying only "received" in capabilities, as the peer's cap is initiated to zero in bgp_start(). In the latter case, if bgp_accept() happens before bgp_start() is called, then new peer capabilities are being transferred from its previous record before being zeroed in bgp_start(). This results in "show bgp neighbors" still displaying "advertised and received" in capabilities. Following the logic of a similar af_cap field clearing, treated correctly in both cases, we - reset peer's capability during bgp_stop() - don't pass it over to a new peer structure in bgp_accept(). This fix prevents transferring of the previous capabilities record to a new peer instance in arbitrary reconnect scenario. Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
2021-07-14 22:43:37 +02:00
/* Reset capabilities. */
peer->cap = 0;
/* Resetting neighbor role to the default value */
peer->remote_role = ROLE_UNDEFINED;
FOREACH_AFI_SAFI (afi, safi) {
/* Reset all negotiated variables */
peer->afc_nego[afi][safi] = 0;
peer->afc_adv[afi][safi] = 0;
peer->afc_recv[afi][safi] = 0;
/* peer address family capability flags*/
peer->af_cap[afi][safi] = 0;
/* peer address family status flags*/
peer->af_sflags[afi][safi] = 0;
/* Received ORF prefix-filter */
peer->orf_plist[afi][safi] = NULL;
if ((connection->status == OpenConfirm) ||
peer_established(connection)) {
/* ORF received prefix-filter pnt */
snprintf(orf_name, sizeof(orf_name), "%s.%d.%d",
peer->host, afi, safi);
prefix_bgp_orf_remove_all(afi, orf_name);
}
}
/* Reset keepalive and holdtime */
if (CHECK_FLAG(peer->flags, PEER_FLAG_TIMER)) {
peer->v_keepalive = peer->keepalive;
peer->v_holdtime = peer->holdtime;
} else {
peer->v_keepalive = peer->bgp->default_keepalive;
peer->v_holdtime = peer->bgp->default_holdtime;
}
/* Reset DelayOpenTime */
if (CHECK_FLAG(peer->flags, PEER_FLAG_TIMER_DELAYOPEN))
peer->v_delayopen = peer->delayopen;
else
peer->v_delayopen = peer->bgp->default_delayopen;
2002-12-13 21:15:29 +01:00
peer->update_time = 0;
if (!CHECK_FLAG(peer->flags, PEER_FLAG_CONFIG_NODE)
&& !(CHECK_FLAG(peer->flags, PEER_FLAG_DELETE))) {
peer_delete(peer);
ret = BGP_FSM_FAILURE_AND_DELETE;
} else {
bgp_peer_conf_if_to_su_update(connection);
}
return ret;
2002-12-13 21:15:29 +01:00
}
/* BGP peer is stoped by the error. */
static enum bgp_fsm_state_progress
bgp_stop_with_error(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
struct peer *peer = connection->peer;
2002-12-13 21:15:29 +01:00
/* Double start timer. */
peer->v_start *= 2;
2002-12-13 21:15:29 +01:00
/* Overflow check. */
if (peer->v_start >= (60 * 2))
peer->v_start = (60 * 2);
bgpd: Do not delete BGP dynamic peers if graceful restart kicks in ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 *donatas-pc(192.168.10.17) 4 65002 8 12 0 0 0 00:01:35 2 14 N/A ``` Before shutting down 192.168.10.17: ``` ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Last update: Sat Jan 15 21:45:47 2022 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (Older Path) Last update: Sat Jan 15 21:25:19 2022 ``` After 192.168.10.17 is down: ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 donatas-pc(192.168.10.17) 4 65002 5 9 0 0 0 00:00:12 Active 0 N/A ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Community: llgr-stale Last update: Sat Jan 15 21:49:01 2022 Time until Long-lived stale route deleted: 16 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (First path received) Last update: Sat Jan 15 21:25:19 2022 ``` Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-01-15 22:16:15 +01:00
if (peer_dynamic_neighbor_no_nsf(peer)) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s (dynamic neighbor) deleted (%s) for %s", __func__,
peer->host, bgp_peer_get_connection_direction(connection));
peer_delete(peer);
return BGP_FSM_FAILURE;
}
return bgp_stop(connection);
2002-12-13 21:15:29 +01:00
}
/* something went wrong, send notify and tear down */
enum bgp_fsm_state_progress
bgp_stop_with_notify(struct peer_connection *connection, uint8_t code,
uint8_t sub_code)
{
struct peer *peer = connection->peer;
/* Send notify to remote peer */
bgp_notify_send(connection, code, sub_code);
bgpd: Do not delete BGP dynamic peers if graceful restart kicks in ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 *donatas-pc(192.168.10.17) 4 65002 8 12 0 0 0 00:01:35 2 14 N/A ``` Before shutting down 192.168.10.17: ``` ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Last update: Sat Jan 15 21:45:47 2022 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (Older Path) Last update: Sat Jan 15 21:25:19 2022 ``` After 192.168.10.17 is down: ``` ~# vtysh -c 'show bgp ipv4 unicast summary' | grep 192.168.10.17 donatas-pc(192.168.10.17) 4 65002 5 9 0 0 0 00:00:12 Active 0 N/A ~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32' BGP routing table entry for 100.100.100.100/32, version 7 Paths: (2 available, best #2, table default) Advertised to non peer-group peers: home-spine1.donatas.net(192.168.0.2) 65002, (stale) 192.168.10.17 from donatas-pc(192.168.10.17) (0.0.0.0) Origin incomplete, valid, external Community: llgr-stale Last update: Sat Jan 15 21:49:01 2022 Time until Long-lived stale route deleted: 16 65001 192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (2.2.2.2) Origin incomplete, metric 0, valid, external, best (First path received) Last update: Sat Jan 15 21:25:19 2022 ``` Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-01-15 22:16:15 +01:00
if (peer_dynamic_neighbor_no_nsf(peer)) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s (dynamic neighbor) deleted (%s) for %s", __func__,
peer->host, bgp_peer_get_connection_direction(connection));
peer_delete(peer);
return BGP_FSM_FAILURE;
}
/* Clear start timer value to default. */
peer->v_start = BGP_INIT_START_TIMER;
return bgp_stop(connection);
}
/**
* Determines whether a TCP session has successfully established for a peer and
* events as appropriate.
*
* This function is called when setting up a new session. After connect() is
* called on the peer's socket (in bgp_start()), the fd is passed to poll()
* to wait for connection success or failure. When poll() returns, this
* function is called to evaluate the result.
*
* Due to differences in behavior of poll() on Linux and BSD - specifically,
* the value of .revents in the case of a closed connection - this function is
* scheduled both for a read and a write event. The write event is triggered
* when the connection is established. A read event is triggered when the
* connection is closed. Thus we need to cancel whichever one did not occur.
*/
static void bgp_connect_check(struct event *thread)
{
int status;
socklen_t slen;
int ret;
struct peer_connection *connection = EVENT_ARG(thread);
struct peer *peer = connection->peer;
assert(!CHECK_FLAG(connection->thread_flags, PEER_THREAD_READS_ON));
assert(!CHECK_FLAG(connection->thread_flags, PEER_THREAD_WRITES_ON));
assert(!connection->t_read);
assert(!connection->t_write);
EVENT_OFF(connection->t_connect_check_r);
EVENT_OFF(connection->t_connect_check_w);
/* Check file descriptor. */
slen = sizeof(status);
ret = getsockopt(connection->fd, SOL_SOCKET, SO_ERROR, (void *)&status,
&slen);
/* If getsockopt is fail, this is fatal error. */
if (ret < 0) {
zlog_err("can't get sockopt for nonblocking connect: %d(%s)",
errno, safe_strerror(errno));
BGP_EVENT_ADD(connection, TCP_fatal_error);
return;
}
/* When status is 0 then TCP connection is established. */
if (status == 0) {
if (CHECK_FLAG(peer->flags, PEER_FLAG_TIMER_DELAYOPEN))
BGP_EVENT_ADD(connection,
TCP_connection_open_w_delay);
else
BGP_EVENT_ADD(connection, TCP_connection_open);
return;
} else {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [Event] Connect failed %d(%s) for connection %s", peer->host,
status, safe_strerror(status),
bgp_peer_get_connection_direction(connection));
BGP_EVENT_ADD(connection, TCP_connection_open_failed);
return;
}
}
2002-12-13 21:15:29 +01:00
/* TCP connection open. Next we send open message to remote peer. And
add read thread for reading open message. */
static enum bgp_fsm_state_progress
bgp_connect_success(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
struct peer *peer = connection->peer;
if (connection->fd < 0) {
flog_err(EC_BGP_CONNECT, "%s peer's fd is negative value %d",
__func__, connection->fd);
return bgp_stop(connection);
2002-12-13 21:15:29 +01:00
}
if (bgp_getsockname(connection) < 0) {
flog_err_sys(EC_LIB_SOCKET,
"%s: bgp_getsockname(): failed for peer %s, fd %d",
__func__, peer->host, connection->fd);
bgp_notify_send(connection, BGP_NOTIFY_FSM_ERR,
bgp_fsm_error_subcode(connection->status));
bgp_writes_on(connection);
return BGP_FSM_FAILURE;
}
/*
* If we are doing nht for a peer that ls v6 LL based
* massage the event system to make things happy
*/
bgp_nht_interface_events(peer);
bgp_reads_on(connection);
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer)) {
if (!CHECK_FLAG(peer->sflags, PEER_STATUS_ACCEPT_PEER))
zlog_debug("%s open active, local address %pSU for %s", peer->host,
connection->su_local,
bgp_peer_get_connection_direction(connection));
else
zlog_debug("%s passive open for %s", peer->host,
bgp_peer_get_connection_direction(connection));
}
/* Send an open message */
bgp_open_send(connection);
return BGP_FSM_SUCCESS;
2002-12-13 21:15:29 +01:00
}
/* TCP connection open with RFC 4271 optional session attribute DelayOpen flag
* set.
*/
static enum bgp_fsm_state_progress
bgp_connect_success_w_delayopen(struct peer_connection *connection)
{
struct peer *peer = connection->peer;
if (connection->fd < 0) {
flog_err(EC_BGP_CONNECT, "%s: peer's fd is negative value %d",
__func__, connection->fd);
return bgp_stop(connection);
}
if (bgp_getsockname(connection) < 0) {
flog_err_sys(EC_LIB_SOCKET,
"%s: bgp_getsockname(): failed for peer %s, fd %d",
__func__, peer->host, connection->fd);
bgp_notify_send(connection, BGP_NOTIFY_FSM_ERR,
bgp_fsm_error_subcode(connection->status));
bgp_writes_on(connection);
return BGP_FSM_FAILURE;
}
/*
* If we are doing nht for a peer that ls v6 LL based
* massage the event system to make things happy
*/
bgp_nht_interface_events(peer);
bgp_reads_on(connection);
if (bgp_debug_neighbor_events(peer)) {
if (!CHECK_FLAG(peer->sflags, PEER_STATUS_ACCEPT_PEER))
zlog_debug("%s open active, local address %pSU for %s", peer->host,
connection->su_local,
bgp_peer_get_connection_direction(connection));
else
zlog_debug("%s passive open for %s", peer->host,
bgp_peer_get_connection_direction(connection));
}
/* set the DelayOpenTime to the inital value */
peer->v_delayopen = peer->delayopen;
/* Start the DelayOpenTimer if it is not already running */
if (!peer->connection->t_delayopen)
BGP_TIMER_ON(peer->connection->t_delayopen, bgp_delayopen_timer,
peer->v_delayopen);
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] BGP OPEN message delayed for %d seconds for connection %s",
peer->host, peer->delayopen,
bgp_peer_get_connection_direction(connection));
return BGP_FSM_SUCCESS;
}
2002-12-13 21:15:29 +01:00
/* TCP connect fail */
static enum bgp_fsm_state_progress
bgp_connect_fail(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
struct peer *peer = connection->peer;
bgpd: fix dynamic peer graceful restart race condition bgp_llgr topotest sometimes fails at step 8: > topo: STEP 8: 'Check if we can see 172.16.1.2/32 after R4 (dynamic peer) was killed' R4 neighbor is deleted on R2 because it fails to re-connect: > 14:33:40.128048 BGP: [HKWM3-ZC5QP] 192.168.3.1 fd -1 went from Established to Clearing > 14:33:40.128154 BGP: [MJ1TJ-HEE3V] 192.168.3.1(r4) graceful restart timer expired > 14:33:40.128158 BGP: [ZTA2J-YRKGY] 192.168.3.1(r4) graceful restart stalepath timer stopped > 14:33:40.128162 BGP: [H917J-25EWN] 192.168.3.1(r4) Long-lived stale timer (IPv4 Unicast) started for 20 sec > 14:33:40.128168 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 172.16.1.2/32 > 14:33:40.128220 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 192.168.3.0/24 > [...] > 14:33:41.138869 BGP: [RGGAC-RJ6WG] 192.168.3.1 [Event] Connect failed 111(Connection refused) > 14:33:41.138906 BGP: [ZWCSR-M7FG9] 192.168.3.1 [FSM] TCP_connection_open_failed (Connect->Active), fd 23 > 14:33:41.138912 BGP: [JA9RP-HSD1K] 192.168.3.1 (dynamic neighbor) deleted (bgp_connect_fail) > 14:33:41.139126 BGP: [P98A2-2RDFE] 192.168.3.1(r4) graceful restart stalepath timer stopped af8496af08 ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in") forgot to modify bgp_connect_fail() Do not delete the peer in bgp_connect_fail() if Non-Stop-Forwarding is in progress. Fixes: af8496af08 ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-05-16 15:08:09 +02:00
if (peer_dynamic_neighbor_no_nsf(peer)) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s (dynamic neighbor) deleted (%s) for %s", __func__,
peer->host, bgp_peer_get_connection_direction(connection));
peer_delete(peer);
return BGP_FSM_FAILURE_AND_DELETE;
}
/*
* If we are doing nht for a peer that ls v6 LL based
* massage the event system to make things happy
*/
bgp_nht_interface_events(peer);
return bgp_stop(connection);
2002-12-13 21:15:29 +01:00
}
bgpd: fix addressing information of non established outgoing sessions When trying to connect to a BGP peer that does not respons, the 'show bgp neighbors' command does not give any indication on the local and remote addresses used: > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:04, Waiting for peer OPEN (n/a) > Internal BGP neighbor may be up to 255 hops away. > BGP Connect Retry Timer in Seconds: 120 > Next connect timer due in 117 seconds > Read thread: off Write thread: off FD used: 27 The addressing information (address and port) are only available when TCP session is established, whereas this information is present at the system level: > root@ubuntu2204:~# netstat -pan | grep 192.0.2.1 > tcp 0 0 192.0.2.1:179 192.0.2.150:38060 SYN_RECV - > tcp 0 1 192.0.2.1:46526 192.0.2.150:179 SYN_SENT 488310/bgpd Add the display for outgoing BGP session, as the information in the getsockname() API provides information for connected streams. When getpeername() API does not give any information, use the peer configuration (destination port is encoded in peer->port). > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:16, Waiting for peer OPEN (n/a) > Local host: 192.0.2.1, Local port: 46084 > Foreign host: 192.0.2.150, Foreign port: 179 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-04-05 09:55:05 +02:00
/* after connect is called(), getpeername is able to return
* port and address on non established streams
*/
static void bgp_connect_in_progress_update_connection(struct peer_connection *connection)
bgpd: fix addressing information of non established outgoing sessions When trying to connect to a BGP peer that does not respons, the 'show bgp neighbors' command does not give any indication on the local and remote addresses used: > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:04, Waiting for peer OPEN (n/a) > Internal BGP neighbor may be up to 255 hops away. > BGP Connect Retry Timer in Seconds: 120 > Next connect timer due in 117 seconds > Read thread: off Write thread: off FD used: 27 The addressing information (address and port) are only available when TCP session is established, whereas this information is present at the system level: > root@ubuntu2204:~# netstat -pan | grep 192.0.2.1 > tcp 0 0 192.0.2.1:179 192.0.2.150:38060 SYN_RECV - > tcp 0 1 192.0.2.1:46526 192.0.2.150:179 SYN_SENT 488310/bgpd Add the display for outgoing BGP session, as the information in the getsockname() API provides information for connected streams. When getpeername() API does not give any information, use the peer configuration (destination port is encoded in peer->port). > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:16, Waiting for peer OPEN (n/a) > Local host: 192.0.2.1, Local port: 46084 > Foreign host: 192.0.2.150, Foreign port: 179 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-04-05 09:55:05 +02:00
{
struct peer *peer = connection->peer;
if (!connection->su_remote && !BGP_CONNECTION_SU_UNSPEC(connection)) {
/* if connect initiated, then dest port and dest addresses are well known */
connection->su_remote = sockunion_dup(&connection->su);
if (sockunion_family(connection->su_remote) == AF_INET)
connection->su_remote->sin.sin_port = htons(peer->port);
else if (sockunion_family(connection->su_remote) == AF_INET6)
connection->su_remote->sin6.sin6_port = htons(peer->port);
bgpd: fix addressing information of non established outgoing sessions When trying to connect to a BGP peer that does not respons, the 'show bgp neighbors' command does not give any indication on the local and remote addresses used: > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:04, Waiting for peer OPEN (n/a) > Internal BGP neighbor may be up to 255 hops away. > BGP Connect Retry Timer in Seconds: 120 > Next connect timer due in 117 seconds > Read thread: off Write thread: off FD used: 27 The addressing information (address and port) are only available when TCP session is established, whereas this information is present at the system level: > root@ubuntu2204:~# netstat -pan | grep 192.0.2.1 > tcp 0 0 192.0.2.1:179 192.0.2.150:38060 SYN_RECV - > tcp 0 1 192.0.2.1:46526 192.0.2.150:179 SYN_SENT 488310/bgpd Add the display for outgoing BGP session, as the information in the getsockname() API provides information for connected streams. When getpeername() API does not give any information, use the peer configuration (destination port is encoded in peer->port). > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:16, Waiting for peer OPEN (n/a) > Local host: 192.0.2.1, Local port: 46084 > Foreign host: 192.0.2.150, Foreign port: 179 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-04-05 09:55:05 +02:00
}
}
2002-12-13 21:15:29 +01:00
/* This function is the first starting point of all BGP connection. It
* try to connect to remote peer with non-blocking IO.
*/
static enum bgp_fsm_state_progress bgp_start(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
struct peer *peer = connection->peer;
enum connect_result status;
bgp_peer_conf_if_to_su_update(connection);
if (connection->su.sa.sa_family == AF_UNSPEC) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Unable to get neighbor's IP address, waiting... for %s",
peer->host, bgp_peer_get_connection_direction(connection));
bgpd: Add a new command to only show failed peerings In a data center, having 32-128 peers is not uncommon. In such a situation, to find a peer that has failed and why is several commands. This hinders both the automatability of failure detection and the ease/speed with which the reason can be found. To simplify this process of catching a failure and its cause quicker, this patch does the following: 1. Created a new function, bgp_show_failed_summary to display the failed summary output for JSON and vty 2. Created a new function to display the reset code/subcode. This is now used in the failed summary code and in the show neighbors code 3. Added a new variable failedPeers in all the JSON outputs, including the vanilla "show bgp summary" family. This lists the failed session count. 4. Display peer, dropped count, estd count, uptime and the reason for failure as the output of "show bgp summary failed" family of commands 5. Added three resset codes for the case where we're waiting for NHT, waiting for peer IPv6 addr, waiting for VRF to init. This also counts the case where only one peer has advertised an AFI/SAFI. The new command has the optional keyword "failed" added to the classical summary command. The changes affect only one existing output, that of "show [ip] bgp neighbors <nbr>". As we track the lack of NHT resolution for a peer or the lack of knowing a peer IPv6 addr, the output of that command will show a "waiting for NHT" etc. as the last reset reason. This patch includes update to the documentation too. Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
2019-08-31 18:24:49 +02:00
peer->last_reset = PEER_DOWN_NBR_ADDR;
return BGP_FSM_FAILURE;
}
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
if (BGP_PEER_START_SUPPRESSED(peer)) {
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
flog_err(EC_BGP_FSM,
"%s [FSM] Trying to start suppressed peer - this is never supposed to happen!",
peer->host);
if (CHECK_FLAG(peer->sflags, PEER_STATUS_RTT_SHUTDOWN))
peer->last_reset = PEER_DOWN_RTT_SHUTDOWN;
else if (CHECK_FLAG(peer->flags, PEER_FLAG_SHUTDOWN))
peer->last_reset = PEER_DOWN_USER_SHUTDOWN;
else if (CHECK_FLAG(peer->bgp->flags, BGP_FLAG_SHUTDOWN))
peer->last_reset = PEER_DOWN_USER_SHUTDOWN;
else if (CHECK_FLAG(peer->sflags, PEER_STATUS_PREFIX_OVERFLOW))
peer->last_reset = PEER_DOWN_PFX_COUNT;
return BGP_FSM_FAILURE;
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
}
/* Clear remote router-id. */
peer->remote_id.s_addr = INADDR_ANY;
/* Clear peer capability flag. */
peer->cap = 0;
if (peer->bgp->vrf_id == VRF_UNKNOWN) {
if (bgp_debug_neighbor_events(peer))
flog_err(
EC_BGP_FSM,
"%s [FSM] In a VRF that is not initialised yet",
peer->host);
bgpd: Add a new command to only show failed peerings In a data center, having 32-128 peers is not uncommon. In such a situation, to find a peer that has failed and why is several commands. This hinders both the automatability of failure detection and the ease/speed with which the reason can be found. To simplify this process of catching a failure and its cause quicker, this patch does the following: 1. Created a new function, bgp_show_failed_summary to display the failed summary output for JSON and vty 2. Created a new function to display the reset code/subcode. This is now used in the failed summary code and in the show neighbors code 3. Added a new variable failedPeers in all the JSON outputs, including the vanilla "show bgp summary" family. This lists the failed session count. 4. Display peer, dropped count, estd count, uptime and the reason for failure as the output of "show bgp summary failed" family of commands 5. Added three resset codes for the case where we're waiting for NHT, waiting for peer IPv6 addr, waiting for VRF to init. This also counts the case where only one peer has advertised an AFI/SAFI. The new command has the optional keyword "failed" added to the classical summary command. The changes affect only one existing output, that of "show [ip] bgp neighbors <nbr>". As we track the lack of NHT resolution for a peer or the lack of knowing a peer IPv6 addr, the output of that command will show a "waiting for NHT" etc. as the last reset reason. This patch includes update to the documentation too. Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
2019-08-31 18:24:49 +02:00
peer->last_reset = PEER_DOWN_VRF_UNINIT;
return BGP_FSM_FAILURE;
}
/* Register peer for NHT. If next hop is already resolved, proceed
* with connection setup, else wait.
*/
if (!bgp_peer_reg_with_nht(peer)) {
if (bgp_zebra_num_connects()) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Waiting for NHT, no path to neighbor present for %s",
peer->host,
bgp_peer_get_connection_direction(connection));
bgpd: Add a new command to only show failed peerings In a data center, having 32-128 peers is not uncommon. In such a situation, to find a peer that has failed and why is several commands. This hinders both the automatability of failure detection and the ease/speed with which the reason can be found. To simplify this process of catching a failure and its cause quicker, this patch does the following: 1. Created a new function, bgp_show_failed_summary to display the failed summary output for JSON and vty 2. Created a new function to display the reset code/subcode. This is now used in the failed summary code and in the show neighbors code 3. Added a new variable failedPeers in all the JSON outputs, including the vanilla "show bgp summary" family. This lists the failed session count. 4. Display peer, dropped count, estd count, uptime and the reason for failure as the output of "show bgp summary failed" family of commands 5. Added three resset codes for the case where we're waiting for NHT, waiting for peer IPv6 addr, waiting for VRF to init. This also counts the case where only one peer has advertised an AFI/SAFI. The new command has the optional keyword "failed" added to the classical summary command. The changes affect only one existing output, that of "show [ip] bgp neighbors <nbr>". As we track the lack of NHT resolution for a peer or the lack of knowing a peer IPv6 addr, the output of that command will show a "waiting for NHT" etc. as the last reset reason. This patch includes update to the documentation too. Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
2019-08-31 18:24:49 +02:00
peer->last_reset = PEER_DOWN_WAITING_NHT;
BGP_EVENT_ADD(connection, TCP_connection_open_failed);
return BGP_FSM_SUCCESS;
}
2002-12-13 21:15:29 +01:00
}
assert(!connection->t_write);
assert(!connection->t_read);
assert(!CHECK_FLAG(connection->thread_flags, PEER_THREAD_WRITES_ON));
assert(!CHECK_FLAG(connection->thread_flags, PEER_THREAD_READS_ON));
status = bgp_connect(connection);
switch (status) {
2002-12-13 21:15:29 +01:00
case connect_error:
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Connect error for %s", peer->host,
bgp_peer_get_connection_direction(connection));
BGP_EVENT_ADD(connection, TCP_connection_open_failed);
break;
2002-12-13 21:15:29 +01:00
case connect_success:
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Connect immediately success, fd %d for %s", peer->host,
connection->fd, bgp_peer_get_connection_direction(connection));
BGP_EVENT_ADD(connection, TCP_connection_open);
break;
2002-12-13 21:15:29 +01:00
case connect_in_progress:
/* To check nonblocking connect, we wait until socket is
readable or writable. */
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Non blocking connect waiting result, fd %d for %s",
peer->host, connection->fd,
bgp_peer_get_connection_direction(connection));
if (connection->fd < 0) {
flog_err(EC_BGP_FSM, "%s peer's fd is negative value %d",
__func__, peer->connection->fd);
return BGP_FSM_FAILURE;
}
bgp_connect_in_progress_update_connection(connection);
bgpd: fix addressing information of non established outgoing sessions When trying to connect to a BGP peer that does not respons, the 'show bgp neighbors' command does not give any indication on the local and remote addresses used: > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:04, Waiting for peer OPEN (n/a) > Internal BGP neighbor may be up to 255 hops away. > BGP Connect Retry Timer in Seconds: 120 > Next connect timer due in 117 seconds > Read thread: off Write thread: off FD used: 27 The addressing information (address and port) are only available when TCP session is established, whereas this information is present at the system level: > root@ubuntu2204:~# netstat -pan | grep 192.0.2.1 > tcp 0 0 192.0.2.1:179 192.0.2.150:38060 SYN_RECV - > tcp 0 1 192.0.2.1:46526 192.0.2.150:179 SYN_SENT 488310/bgpd Add the display for outgoing BGP session, as the information in the getsockname() API provides information for connected streams. When getpeername() API does not give any information, use the peer configuration (destination port is encoded in peer->port). > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:16, Waiting for peer OPEN (n/a) > Local host: 192.0.2.1, Local port: 46084 > Foreign host: 192.0.2.150, Foreign port: 179 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2024-04-05 09:55:05 +02:00
/*
* - when the socket becomes ready, poll() will signify POLLOUT
* - if it fails to connect, poll() will signify POLLHUP
* - POLLHUP is handled as a 'read' event by thread.c
*
* therefore, we schedule both a read and a write event with
* bgp_connect_check() as the handler for each and cancel the
* unused event in that function.
*/
event_add_read(bm->master, bgp_connect_check, connection,
connection->fd, &connection->t_connect_check_r);
event_add_write(bm->master, bgp_connect_check, connection,
connection->fd, &connection->t_connect_check_w);
2002-12-13 21:15:29 +01:00
break;
}
return BGP_FSM_SUCCESS;
2002-12-13 21:15:29 +01:00
}
/* Connect retry timer is expired when the peer status is Connect. */
static enum bgp_fsm_state_progress
bgp_reconnect(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
struct peer *peer = connection->peer;
enum bgp_fsm_state_progress ret;
ret = bgp_stop(connection);
if (ret < BGP_FSM_SUCCESS)
return ret;
/* Send graceful restart capabilty */
BGP_GR_ROUTER_DETECT_AND_SEND_CAPABILITY_TO_ZEBRA(peer->bgp,
peer->bgp->peer);
return bgp_start(connection);
2002-12-13 21:15:29 +01:00
}
static enum bgp_fsm_state_progress
bgp_fsm_open(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
/* If DelayOpen is active, we may still need to send an open message */
if ((connection->status == Connect) || (connection->status == Active))
bgp_open_send(connection);
2002-12-13 21:15:29 +01:00
/* Send keepalive and make keepalive timer */
bgp_keepalive_send(connection);
2002-12-13 21:15:29 +01:00
return BGP_FSM_SUCCESS;
2002-12-13 21:15:29 +01:00
}
/* FSM error, unexpected event. This is error of BGP connection. So cut the
peer and change to Idle status. */
static enum bgp_fsm_state_progress
bgp_fsm_event_error(struct peer_connection *connection)
{
struct peer *peer = connection->peer;
flog_err(EC_BGP_FSM, "%s [FSM] unexpected packet received in state %s",
peer->host,
lookup_msg(bgp_status_msg, connection->status, NULL));
return bgp_stop_with_notify(connection, BGP_NOTIFY_FSM_ERR,
bgp_fsm_error_subcode(connection->status));
}
2002-12-13 21:15:29 +01:00
/* Hold timer expire. This is error of BGP connection. So cut the
peer and change to Idle status. */
static enum bgp_fsm_state_progress
bgp_fsm_holdtime_expire(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
struct peer *peer = connection->peer;
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%s [FSM] Hold timer expire for %s", peer->host,
bgp_peer_get_connection_direction(connection));
2002-12-13 21:15:29 +01:00
/* RFC8538 updates RFC 4724 by defining an extension that permits
* the Graceful Restart procedures to be performed when the BGP
* speaker receives a BGP NOTIFICATION message or the Hold Time expires.
*/
if (peer_established(connection) &&
bgp_has_graceful_restart_notification(peer))
if (CHECK_FLAG(peer->sflags, PEER_STATUS_NSF_MODE))
SET_FLAG(peer->sflags, PEER_STATUS_NSF_WAIT);
return bgp_stop_with_notify(connection, BGP_NOTIFY_HOLD_ERR, 0);
2002-12-13 21:15:29 +01:00
}
/* RFC 4271 DelayOpenTimer_Expires event */
static enum bgp_fsm_state_progress
bgp_fsm_delayopen_timer_expire(struct peer_connection *connection)
{
/* Stop the DelayOpenTimer */
EVENT_OFF(connection->t_delayopen);
/* Send open message to peer */
bgp_open_send(connection);
/* Set the HoldTimer to a large value (4 minutes) */
connection->peer->v_holdtime = 245;
return BGP_FSM_SUCCESS;
}
/* Start the selection deferral timer thread for the specified AFI, SAFI */
static int bgp_start_deferral_timer(struct bgp *bgp, afi_t afi, safi_t safi,
struct graceful_restart_info *gr_info)
{
struct afi_safi_info *thread_info;
/* If the deferral timer is active, then increment eor count */
if (gr_info->t_select_deferral) {
gr_info->eor_required++;
return 0;
}
/* Start the deferral timer when the first peer enabled for the graceful
* restart is established
*/
if (gr_info->eor_required == 0) {
thread_info = XMALLOC(MTYPE_TMP, sizeof(struct afi_safi_info));
thread_info->afi = afi;
thread_info->safi = safi;
thread_info->bgp = bgp;
event_add_timer(bm->master, bgp_graceful_deferral_timer_expire,
thread_info, bgp->select_defer_time,
&gr_info->t_select_deferral);
}
gr_info->eor_required++;
/* Send message to RIB indicating route update pending */
if (gr_info->af_enabled == false) {
gr_info->af_enabled = true;
gr_info->route_sync = false;
bgp->gr_route_sync_pending = true;
bgp_zebra_update(bgp, afi, safi,
ZEBRA_CLIENT_ROUTE_UPDATE_PENDING);
}
if (BGP_DEBUG(update, UPDATE_OUT))
zlog_debug("Started the deferral timer for %s eor_required %d",
get_afi_safi_str(afi, safi, false),
gr_info->eor_required);
return 0;
}
/* Update the graceful restart information for the specified AFI, SAFI */
static int bgp_update_gr_info(struct peer *peer, afi_t afi, safi_t safi)
{
struct graceful_restart_info *gr_info;
struct bgp *bgp = peer->bgp;
int ret = 0;
if ((afi < AFI_IP) || (afi >= AFI_MAX)) {
if (BGP_DEBUG(update, UPDATE_OUT))
zlog_debug("%s : invalid afi %d", __func__, afi);
return -1;
}
if ((safi < SAFI_UNICAST) || (safi > SAFI_MPLS_VPN)) {
if (BGP_DEBUG(update, UPDATE_OUT))
zlog_debug("%s : invalid safi %d", __func__, safi);
return -1;
}
/* Restarting router */
if (BGP_PEER_GRACEFUL_RESTART_CAPABLE(peer)
&& BGP_PEER_RESTARTING_MODE(peer)) {
/* Check if the forwarding state is preserved */
if (bgp_gr_is_forwarding_preserved(bgp)) {
gr_info = &(bgp->gr_info[afi][safi]);
ret = bgp_start_deferral_timer(bgp, afi, safi, gr_info);
}
}
return ret;
}
/**
* Transition to Established state.
*
* Convert peer from stub to full fledged peer, set some timers, and generate
* initial updates.
*/
static enum bgp_fsm_state_progress
bgp_establish(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
afi_t afi;
safi_t safi;
int nsf_af_count = 0;
enum bgp_fsm_state_progress ret = BGP_FSM_SUCCESS;
struct peer *other;
int status;
struct peer *peer = connection->peer;
struct peer *orig = peer;
other = peer->doppelganger;
hash_release(peer->bgp->peerhash, peer);
if (other)
hash_release(peer->bgp->peerhash, other);
peer = peer_xfer_conn(peer);
if (!peer) {
flog_err(EC_BGP_CONNECT, "%%Neighbor failed in xfer_conn");
/*
* A failure of peer_xfer_conn but not putting the peers
* back in the hash ends up with a situation where incoming
* connections are rejected, as that the peer is not found
* when a lookup is done
*/
(void)hash_get(orig->bgp->peerhash, orig, hash_alloc_intern);
if (other)
(void)hash_get(other->bgp->peerhash, other,
hash_alloc_intern);
return BGP_FSM_FAILURE;
}
/*
* At this point the connections have been possibly swapped
* let's reset it.
*/
connection = peer->connection;
if (other == peer)
ret = BGP_FSM_SUCCESS_STATE_TRANSFER;
2002-12-13 21:15:29 +01:00
/* Reset capability open status flag. */
if (!CHECK_FLAG(peer->sflags, PEER_STATUS_CAPABILITY_OPEN))
SET_FLAG(peer->sflags, PEER_STATUS_CAPABILITY_OPEN);
2002-12-13 21:15:29 +01:00
/* Clear start timer value to default. */
peer->v_start = BGP_INIT_START_TIMER;
2002-12-13 21:15:29 +01:00
/* Increment established count. */
peer->established++;
bgp_fsm_change_status(connection, Established);
if (peer->last_reset == PEER_DOWN_WAITING_OPEN)
peer->last_reset = 0;
/* bgp log-neighbor-changes of neighbor Up */
if (CHECK_FLAG(peer->bgp->flags, BGP_FLAG_LOG_NEIGHBOR_CHANGES)) {
struct vrf *vrf = vrf_lookup_by_id(peer->bgp->vrf_id);
zlog_info("%%ADJCHANGE: neighbor %pBP in vrf %s Up", peer,
vrf ? ((vrf->vrf_id != VRF_DEFAULT)
? vrf->name
: VRF_DEFAULT_NAME)
: "");
}
/* assign update-group/subgroup */
update_group_adjust_peer_afs(peer);
/* graceful restart */
UNSET_FLAG(peer->sflags, PEER_STATUS_NSF_WAIT);
if (bgp_debug_neighbor_events(peer)) {
if (BGP_PEER_RESTARTING_MODE(peer))
zlog_debug("%pBP BGP_RESTARTING_MODE %s", peer,
bgp_peer_get_connection_direction(connection));
else if (BGP_PEER_HELPER_MODE(peer))
zlog_debug("%pBP BGP_HELPER_MODE %s", peer,
bgp_peer_get_connection_direction(connection));
}
FOREACH_AFI_SAFI_NSF (afi, safi) {
if (peer->afc_nego[afi][safi] &&
CHECK_FLAG(peer->cap, PEER_CAP_RESTART_ADV) &&
CHECK_FLAG(peer->af_cap[afi][safi],
PEER_CAP_RESTART_AF_RCV)) {
if (peer->nsf[afi][safi] &&
!CHECK_FLAG(peer->af_cap[afi][safi],
PEER_CAP_RESTART_AF_PRESERVE_RCV))
bgp_clear_stale_route(peer, afi, safi);
peer->nsf[afi][safi] = 1;
nsf_af_count++;
} else {
if (peer->nsf[afi][safi])
bgp_clear_stale_route(peer, afi, safi);
peer->nsf[afi][safi] = 0;
}
/* Update the graceful restart information */
if (peer->afc_nego[afi][safi]) {
if (!BGP_SELECT_DEFER_DISABLE(peer->bgp)) {
status = bgp_update_gr_info(peer, afi, safi);
if (status < 0)
zlog_err(
"Error in updating graceful restart for %s",
get_afi_safi_str(afi, safi,
false));
} else {
if (BGP_PEER_GRACEFUL_RESTART_CAPABLE(peer) &&
BGP_PEER_RESTARTING_MODE(peer) &&
bgp_gr_is_forwarding_preserved(peer->bgp))
peer->bgp->gr_info[afi][safi]
.eor_required++;
}
}
}
if (!CHECK_FLAG(peer->cap, PEER_CAP_RESTART_RCV)) {
if ((bgp_peer_gr_mode_get(peer) == PEER_GR)
|| ((bgp_peer_gr_mode_get(peer) == PEER_GLOBAL_INHERIT)
&& (bgp_global_gr_mode_get(peer->bgp) == GLOBAL_GR))) {
FOREACH_AFI_SAFI (afi, safi)
/* Send route processing complete
message to RIB */
bgp_zebra_update(
peer->bgp, afi, safi,
ZEBRA_CLIENT_ROUTE_UPDATE_COMPLETE);
}
} else {
/* Peer sends R-bit. In this case, we need to send
* ZEBRA_CLIENT_ROUTE_UPDATE_COMPLETE to Zebra. */
if (CHECK_FLAG(peer->cap,
PEER_CAP_GRACEFUL_RESTART_R_BIT_RCV)) {
FOREACH_AFI_SAFI (afi, safi)
/* Send route processing complete
message to RIB */
bgp_zebra_update(
peer->bgp, afi, safi,
ZEBRA_CLIENT_ROUTE_UPDATE_COMPLETE);
}
}
peer->nsf_af_count = nsf_af_count;
if (nsf_af_count)
SET_FLAG(peer->sflags, PEER_STATUS_NSF_MODE);
else {
UNSET_FLAG(peer->sflags, PEER_STATUS_NSF_MODE);
if (connection->t_gr_stale) {
EVENT_OFF(connection->t_gr_stale);
Overhual BGP debugs Summary of changes - added an option to enable keepalive debugs for a specific peer - added an option to enable inbound and/or outbound updates debugs for a specific peer - added an option to enable update debugs for a specific prefix - added an option to enable zebra debugs for a specific prefix - combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer. - merged "deb bgp filters" into "deb bgp update" - moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer. - Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols. - Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols. - Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing. The new parse tree for BGP debugging is: deb bgp as4 deb bgp as4 segment deb bgp keepalives [A.B.C.D|WORD|X:X::X:X] deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X] deb bgp nht deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X] deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M] deb bgp zebra deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
2015-05-20 02:58:12 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP graceful restart stalepath timer stopped for %s",
peer, bgp_peer_get_connection_direction(connection));
}
}
if (connection->t_gr_restart) {
EVENT_OFF(connection->t_gr_restart);
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP graceful restart timer stopped for %s", peer,
bgp_peer_get_connection_direction(connection));
}
2002-12-13 21:15:29 +01:00
/* Reset uptime, turn on keepalives, send current table. */
if (!peer->v_holdtime)
bgp_keepalives_on(connection);
peer->uptime = monotime(NULL);
bgpd: Stop LLGR timer when the connection is established When the connection goes up, the timer is not stopped and if we have a subsequent GR event we have an old timer which is not as we expect. Before: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 95 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:27:53 2022 Time until Long-lived stale route deleted: 23 <<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 08:43:29 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:43:30 2022 Time until Long-lived stale route deleted: 17 <<<<<<<<<<<<<<< ``` After: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 79 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (0.0.0.0) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:18 2022 Time until Long-lived stale route deleted: 24 <<<<<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 09:05:25 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:29 2022 Time until Long-lived stale route deleted: 29 <<<<<<<<<<<<<< ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-28 10:41:35 +02:00
/* Send route-refresh when ORF is enabled.
* Stop Long-lived Graceful Restart timers.
*/
FOREACH_AFI_SAFI (afi, safi) {
bgpd: Stop LLGR timer when the connection is established When the connection goes up, the timer is not stopped and if we have a subsequent GR event we have an old timer which is not as we expect. Before: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 95 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:27:53 2022 Time until Long-lived stale route deleted: 23 <<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 08:43:29 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:43:30 2022 Time until Long-lived stale route deleted: 17 <<<<<<<<<<<<<<< ``` After: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 79 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (0.0.0.0) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:18 2022 Time until Long-lived stale route deleted: 24 <<<<<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 09:05:25 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:29 2022 Time until Long-lived stale route deleted: 29 <<<<<<<<<<<<<< ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-28 10:41:35 +02:00
if (peer->t_llgr_stale[afi][safi]) {
EVENT_OFF(peer->t_llgr_stale[afi][safi]);
bgpd: Stop LLGR timer when the connection is established When the connection goes up, the timer is not stopped and if we have a subsequent GR event we have an old timer which is not as we expect. Before: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 95 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:27:53 2022 Time until Long-lived stale route deleted: 23 <<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 08:43:29 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:43:30 2022 Time until Long-lived stale route deleted: 17 <<<<<<<<<<<<<<< ``` After: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 79 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (0.0.0.0) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:18 2022 Time until Long-lived stale route deleted: 24 <<<<<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 09:05:25 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:29 2022 Time until Long-lived stale route deleted: 29 <<<<<<<<<<<<<< ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-28 10:41:35 +02:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP Long-lived stale timer stopped for afi/safi: %d/%d for %s",
peer, afi, safi,
bgp_peer_get_connection_direction(connection));
bgpd: Stop LLGR timer when the connection is established When the connection goes up, the timer is not stopped and if we have a subsequent GR event we have an old timer which is not as we expect. Before: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 95 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:27:53 2022 Time until Long-lived stale route deleted: 23 <<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 08:43:29 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 103 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 08:43:30 2022 Time until Long-lived stale route deleted: 17 <<<<<<<<<<<<<<< ``` After: ``` spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 79 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (0.0.0.0) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:18 2022 Time until Long-lived stale route deleted: 24 <<<<<<<<<<<<<<< spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default) Advertised to non peer-group peers: 192.168.0.1 65001 47583 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Last update: Mon Mar 28 09:05:25 2022 spine1-debian-11# sh ip bgp 192.168.100.1/32 BGP routing table entry for 192.168.100.1/32, version 87 Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart) Not advertised to any peer 65001 47583, (stale) 192.168.0.1 from 192.168.0.1 (100.100.200.100) Origin incomplete, valid, external, best (First path received) Community: llgr-stale Last update: Mon Mar 28 09:05:29 2022 Time until Long-lived stale route deleted: 29 <<<<<<<<<<<<<< ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-28 10:41:35 +02:00
}
if (CHECK_FLAG(peer->af_cap[afi][safi],
PEER_CAP_ORF_PREFIX_SM_ADV)) {
2002-12-13 21:15:29 +01:00
if (CHECK_FLAG(peer->af_cap[afi][safi],
PEER_CAP_ORF_PREFIX_RM_RCV))
bgp_route_refresh_send(
peer, afi, safi, ORF_TYPE_PREFIX,
REFRESH_IMMEDIATE, 0,
BGP_ROUTE_REFRESH_NORMAL);
}
}
2002-12-13 21:15:29 +01:00
/* First update is deferred until ORF or ROUTE-REFRESH is received */
FOREACH_AFI_SAFI (afi, safi) {
if (CHECK_FLAG(peer->af_cap[afi][safi],
PEER_CAP_ORF_PREFIX_RM_ADV))
2002-12-13 21:15:29 +01:00
if (CHECK_FLAG(peer->af_cap[afi][safi],
PEER_CAP_ORF_PREFIX_SM_RCV))
SET_FLAG(peer->af_sflags[afi][safi],
PEER_STATUS_ORF_WAIT_REFRESH);
}
bgp_announce_peer(peer);
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
/* Start the route advertisement timer to send updates to the peer - if
* BGP
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
* is not in read-only mode. If it is, the timer will be started at the
* end
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
* of read-only mode.
*/
bgpd: bgpd-mrai.patch BGP: Event-driven route announcement taking into account min route advertisement interval ISSUE BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec when a peer is established. From then on, the timer expires periodically based on the configured MRAI value (default: 30sec for EBGP, 5sec for IBGP). At the expiry, the write thread is triggered that takes the routes from peer's sync FIFO (adj-rib-out) and sends UPDATEs. This has a few drawbacks: (1) Delay in new route announcement: Even when the last UPDATE message was sent a while back, the next route change will necessarily have to wait for routeadv expiry (2) CPU usage: The timer is always armed. If the operator chooses to configure a lower value of MRAI (zero second is a preferred choice in many deployments) for better convergence, it leads to high CPU usage for BGP process, even at the times of no network churn. PATCH Make the route advertisement event-driven - When routes are added to peer's sync FIFO, check if the routeadv timer needs to be adjusted (or started). Conversely, do not arm the routeadv timer unconditionally. The patch also addresses route announcements during read-only mode (update-delay). During read-only mode operation, the routeadv timer is not started. When BGP comes out of read-only mode and all the routes are processed, the timer is started for all peers with zero expiry, so that the UPDATEs can be sent all at once. This leads to (near-)optimal UPDATE packing. Finally, the patch makes the "max # packets to write to peer socket at a time" configurable. Currently it is hard-coded to 10. The command is at the top router-bgp mode and is called "write-quanta <number>". It is a useful convergence parameter to tweak. Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
2015-05-20 02:40:37 +02:00
if (!bgp_update_delay_active(peer->bgp)) {
EVENT_OFF(peer->connection->t_routeadv);
BGP_TIMER_ON(peer->connection->t_routeadv, bgp_routeadv_timer,
0);
}
2002-12-13 21:15:29 +01:00
if (peer->doppelganger &&
(peer->doppelganger->connection->status != Deleted)) {
2002-12-13 21:15:29 +01:00
if (bgp_debug_neighbor_events(peer))
zlog_debug("[Event] Deleting stub connection for peer %s for %s", peer->host,
bgp_peer_get_connection_direction(peer->doppelganger->connection));
if (peer->doppelganger->connection->status > Active)
bgp_notify_send(peer->doppelganger->connection,
BGP_NOTIFY_CEASE,
2002-12-13 21:15:29 +01:00
BGP_NOTIFY_CEASE_COLLISION_RESOLUTION);
else
2002-12-13 21:15:29 +01:00
peer_delete(peer->doppelganger);
}
/*
* If we are replacing the old peer for a doppelganger
* then switch it around in the bgp->peerhash
* the doppelgangers su and this peer's su are the same
* so the hash_release is the same for either.
*/
(void)hash_get(peer->bgp->peerhash, peer, hash_alloc_intern);
/* Start BFD peer if not already running. */
if (peer->bfd_config)
bgp_peer_bfd_update_source(peer);
return ret;
2002-12-13 21:15:29 +01:00
}
/* Keepalive packet is received. */
static enum bgp_fsm_state_progress
bgp_fsm_keepalive(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
EVENT_OFF(connection->t_holdtime);
return BGP_FSM_SUCCESS;
2002-12-13 21:15:29 +01:00
}
/* Update packet is received. */
static enum bgp_fsm_state_progress
bgp_fsm_update(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
EVENT_OFF(connection->t_holdtime);
return BGP_FSM_SUCCESS;
2002-12-13 21:15:29 +01:00
}
/* This is empty event. */
static enum bgp_fsm_state_progress bgp_ignore(struct peer_connection *connection)
2002-12-13 21:15:29 +01:00
{
struct peer *peer = connection->peer;
flog_err(EC_BGP_FSM,
"%s [FSM] Ignoring event %s in state %s, prior events %s, %s, fd %d",
peer->host, bgp_event_str[peer->cur_event],
lookup_msg(bgp_status_msg, connection->status, NULL),
bgp_event_str[peer->last_event],
bgp_event_str[peer->last_major_event], connection->fd);
return BGP_FSM_SUCCESS;
2002-12-13 21:15:29 +01:00
}
/* This is to handle unexpected events.. */
static enum bgp_fsm_state_progress
bgp_fsm_exception(struct peer_connection *connection)
{
struct peer *peer = connection->peer;
flog_err(EC_BGP_FSM,
"%s [FSM] Unexpected event %s in state %s, prior events %s, %s, fd %d",
peer->host, bgp_event_str[peer->cur_event],
lookup_msg(bgp_status_msg, connection->status, NULL),
bgp_event_str[peer->last_event],
bgp_event_str[peer->last_major_event], connection->fd);
return bgp_stop(connection);
}
void bgp_fsm_nht_update(struct peer_connection *connection, struct peer *peer,
bool has_valid_nexthops)
{
if (!peer)
return;
switch (connection->status) {
case Idle:
if (has_valid_nexthops)
BGP_EVENT_ADD(connection, BGP_Start);
break;
case Connect:
if (!has_valid_nexthops) {
EVENT_OFF(connection->t_connect);
BGP_EVENT_ADD(connection, TCP_fatal_error);
}
break;
case Active:
if (has_valid_nexthops) {
EVENT_OFF(connection->t_connect);
BGP_EVENT_ADD(connection, ConnectRetry_timer_expired);
}
break;
case OpenSent:
case OpenConfirm:
case Established:
if (!has_valid_nexthops
&& (peer->gtsm_hops == BGP_GTSM_HOPS_CONNECTED
|| peer->bgp->fast_convergence))
BGP_EVENT_ADD(connection, TCP_fatal_error);
break;
case Clearing:
case Deleted:
case BGP_STATUS_MAX:
break;
}
}
2002-12-13 21:15:29 +01:00
/* Finite State Machine structure */
static const struct {
enum bgp_fsm_state_progress (*func)(struct peer_connection *);
enum bgp_fsm_status next_state;
2002-12-13 21:15:29 +01:00
} FSM[BGP_STATUS_MAX - 1][BGP_EVENTS_MAX - 1] = {
{
/* Idle state: In Idle state, all events other than BGP_Start is
ignored. With BGP_Start event, finite state machine calls
bgp_start(). */
{bgp_start, Connect}, /* BGP_Start */
{bgp_stop, Idle}, /* BGP_Stop */
{bgp_stop, Idle}, /* TCP_connection_open */
{bgp_stop, Idle}, /* TCP_connection_open_w_delay */
2002-12-13 21:15:29 +01:00
{bgp_stop, Idle}, /* TCP_connection_closed */
{bgp_ignore, Idle}, /* TCP_connection_open_failed */
{bgp_stop, Idle}, /* TCP_fatal_error */
{bgp_ignore, Idle}, /* ConnectRetry_timer_expired */
{bgp_ignore, Idle}, /* Hold_Timer_expired */
{bgp_ignore, Idle}, /* KeepAlive_timer_expired */
{bgp_ignore, Idle}, /* DelayOpen_timer_expired */
2002-12-13 21:15:29 +01:00
{bgp_ignore, Idle}, /* Receive_OPEN_message */
{bgp_ignore, Idle}, /* Receive_KEEPALIVE_message */
{bgp_ignore, Idle}, /* Receive_UPDATE_message */
{bgp_ignore, Idle}, /* Receive_NOTIFICATION_message */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
{bgp_ignore, Idle}, /* Clearing_Completed */
2002-12-13 21:15:29 +01:00
},
{
/* Connect */
{bgp_ignore, Connect}, /* BGP_Start */
{bgp_stop, Idle}, /* BGP_Stop */
{bgp_connect_success, OpenSent}, /* TCP_connection_open */
{bgp_connect_success_w_delayopen,
Connect}, /* TCP_connection_open_w_delay */
{bgp_stop, Idle}, /* TCP_connection_closed */
2002-12-13 21:15:29 +01:00
{bgp_connect_fail, Active}, /* TCP_connection_open_failed */
{bgp_connect_fail, Idle}, /* TCP_fatal_error */
{bgp_reconnect, Connect}, /* ConnectRetry_timer_expired */
{bgp_fsm_exception, Idle}, /* Hold_Timer_expired */
{bgp_fsm_exception, Idle}, /* KeepAlive_timer_expired */
{bgp_fsm_delayopen_timer_expire,
OpenSent}, /* DelayOpen_timer_expired */
{bgp_fsm_open, OpenConfirm}, /* Receive_OPEN_message */
{bgp_fsm_exception, Idle}, /* Receive_KEEPALIVE_message */
{bgp_fsm_exception, Idle}, /* Receive_UPDATE_message */
{bgp_stop, Idle}, /* Receive_NOTIFICATION_message */
{bgp_fsm_exception, Idle}, /* Clearing_Completed */
2002-12-13 21:15:29 +01:00
},
{
/* Active, */
{bgp_ignore, Active}, /* BGP_Start */
{bgp_stop, Idle}, /* BGP_Stop */
{bgp_connect_success, OpenSent}, /* TCP_connection_open */
{bgp_connect_success_w_delayopen,
Active}, /* TCP_connection_open_w_delay */
{bgp_stop, Idle}, /* TCP_connection_closed */
{bgp_ignore, Active}, /* TCP_connection_open_failed */
{bgp_fsm_exception, Idle}, /* TCP_fatal_error */
{bgp_start, Connect}, /* ConnectRetry_timer_expired */
{bgp_fsm_exception, Idle}, /* Hold_Timer_expired */
{bgp_fsm_exception, Idle}, /* KeepAlive_timer_expired */
{bgp_fsm_delayopen_timer_expire,
OpenSent}, /* DelayOpen_timer_expired */
{bgp_fsm_open, OpenConfirm}, /* Receive_OPEN_message */
{bgp_fsm_exception, Idle}, /* Receive_KEEPALIVE_message */
{bgp_fsm_exception, Idle}, /* Receive_UPDATE_message */
{bgp_fsm_exception, Idle}, /* Receive_NOTIFICATION_message */
{bgp_fsm_exception, Idle}, /* Clearing_Completed */
2002-12-13 21:15:29 +01:00
},
{
/* OpenSent, */
{bgp_ignore, OpenSent}, /* BGP_Start */
{bgp_stop, Idle}, /* BGP_Stop */
{bgp_stop, Active}, /* TCP_connection_open */
{bgp_fsm_exception, Idle}, /* TCP_connection_open_w_delay */
{bgp_stop, Active}, /* TCP_connection_closed */
{bgp_stop, Active}, /* TCP_connection_open_failed */
{bgp_stop, Active}, /* TCP_fatal_error */
{bgp_fsm_exception, Idle}, /* ConnectRetry_timer_expired */
2002-12-13 21:15:29 +01:00
{bgp_fsm_holdtime_expire, Idle}, /* Hold_Timer_expired */
{bgp_fsm_exception, Idle}, /* KeepAlive_timer_expired */
{bgp_fsm_exception, Idle}, /* DelayOpen_timer_expired */
2002-12-13 21:15:29 +01:00
{bgp_fsm_open, OpenConfirm}, /* Receive_OPEN_message */
{bgp_fsm_event_error, Idle}, /* Receive_KEEPALIVE_message */
{bgp_fsm_event_error, Idle}, /* Receive_UPDATE_message */
{bgp_fsm_event_error, Idle}, /* Receive_NOTIFICATION_message */
{bgp_fsm_exception, Idle}, /* Clearing_Completed */
2002-12-13 21:15:29 +01:00
},
{
/* OpenConfirm, */
{bgp_ignore, OpenConfirm}, /* BGP_Start */
{bgp_stop, Idle}, /* BGP_Stop */
{bgp_stop, Idle}, /* TCP_connection_open */
{bgp_fsm_exception, Idle}, /* TCP_connection_open_w_delay */
{bgp_stop, Idle}, /* TCP_connection_closed */
{bgp_stop, Idle}, /* TCP_connection_open_failed */
{bgp_stop, Idle}, /* TCP_fatal_error */
{bgp_fsm_exception, Idle}, /* ConnectRetry_timer_expired */
2002-12-13 21:15:29 +01:00
{bgp_fsm_holdtime_expire, Idle}, /* Hold_Timer_expired */
{bgp_ignore, OpenConfirm}, /* KeepAlive_timer_expired */
{bgp_fsm_exception, Idle}, /* DelayOpen_timer_expired */
{bgp_fsm_exception, Idle}, /* Receive_OPEN_message */
2002-12-13 21:15:29 +01:00
{bgp_establish, Established}, /* Receive_KEEPALIVE_message */
{bgp_fsm_exception, Idle}, /* Receive_UPDATE_message */
2002-12-13 21:15:29 +01:00
{bgp_stop_with_error, Idle}, /* Receive_NOTIFICATION_message */
{bgp_fsm_exception, Idle}, /* Clearing_Completed */
2002-12-13 21:15:29 +01:00
},
{
/* Established, */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
{bgp_ignore, Established}, /* BGP_Start */
{bgp_stop, Clearing}, /* BGP_Stop */
{bgp_stop, Clearing}, /* TCP_connection_open */
{bgp_fsm_exception, Idle}, /* TCP_connection_open_w_delay */
{bgp_stop, Clearing}, /* TCP_connection_closed */
{bgp_stop, Clearing}, /* TCP_connection_open_failed */
{bgp_stop, Clearing}, /* TCP_fatal_error */
{bgp_stop, Clearing}, /* ConnectRetry_timer_expired */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
{bgp_fsm_holdtime_expire, Clearing}, /* Hold_Timer_expired */
{bgp_ignore, Established}, /* KeepAlive_timer_expired */
{bgp_fsm_exception, Idle}, /* DelayOpen_timer_expired */
{bgp_stop, Clearing}, /* Receive_OPEN_message */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
{bgp_fsm_keepalive,
Established}, /* Receive_KEEPALIVE_message */
{bgp_fsm_update, Established}, /* Receive_UPDATE_message */
{bgp_stop_with_error,
Clearing}, /* Receive_NOTIFICATION_message */
{bgp_fsm_exception, Idle}, /* Clearing_Completed */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
},
{
/* Clearing, */
{bgp_ignore, Clearing}, /* BGP_Start */
{bgp_stop, Clearing}, /* BGP_Stop */
{bgp_stop, Clearing}, /* TCP_connection_open */
{bgp_stop, Clearing}, /* TCP_connection_open_w_delay */
{bgp_stop, Clearing}, /* TCP_connection_closed */
{bgp_stop, Clearing}, /* TCP_connection_open_failed */
{bgp_stop, Clearing}, /* TCP_fatal_error */
{bgp_stop, Clearing}, /* ConnectRetry_timer_expired */
{bgp_stop, Clearing}, /* Hold_Timer_expired */
{bgp_stop, Clearing}, /* KeepAlive_timer_expired */
{bgp_stop, Clearing}, /* DelayOpen_timer_expired */
{bgp_stop, Clearing}, /* Receive_OPEN_message */
{bgp_stop, Clearing}, /* Receive_KEEPALIVE_message */
{bgp_stop, Clearing}, /* Receive_UPDATE_message */
{bgp_stop, Clearing}, /* Receive_NOTIFICATION_message */
{bgp_clearing_completed, Idle}, /* Clearing_Completed */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
},
{
/* Deleted, */
{bgp_ignore, Deleted}, /* BGP_Start */
{bgp_ignore, Deleted}, /* BGP_Stop */
{bgp_ignore, Deleted}, /* TCP_connection_open */
{bgp_ignore, Deleted}, /* TCP_connection_open_w_delay */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
{bgp_ignore, Deleted}, /* TCP_connection_closed */
{bgp_ignore, Deleted}, /* TCP_connection_open_failed */
{bgp_ignore, Deleted}, /* TCP_fatal_error */
{bgp_ignore, Deleted}, /* ConnectRetry_timer_expired */
{bgp_ignore, Deleted}, /* Hold_Timer_expired */
{bgp_ignore, Deleted}, /* KeepAlive_timer_expired */
{bgp_ignore, Deleted}, /* DelayOpen_timer_expired */
[bgpd] Fix 0.99 shutdown regression, introduce Clearing and Deleted states 2006-09-14 Paul Jakma <paul.jakma@sun.com> * (general) Fix some niggly issues around 'shutdown' and clearing by adding a Clearing FSM wait-state and a hidden 'Deleted' FSM state, to allow deleted peers to 'cool off' and hit 0 references. This introduces a slow memory leak of struct peer, however that's more a testament to the fragility of the reference counting than a bug in this patch, cleanup of reference counting to fix this is to follow. * bgpd.h: Add Clearing, Deleted states and Clearing_Completed and event. * bgp_debug.c: (bgp_status_msg[]) Add strings for Clearing and Deleted. * bgp_fsm.h: Don't allow timer/event threads to set anything for Deleted peers. * bgp_fsm.c: (bgp_timer_set) Add Clearing and Deleted. Deleted needs to stop everything. (bgp_stop) Remove explicit fsm_change_status call, the general framework handles the transition. (bgp_start) Log a warning if a start is attempted on a peer that should stay down, trying to start a peer. (struct .. FSM) Add Clearing_Completed events, has little influence except when in state Clearing to signal wait-state can end. Add Clearing and Deleted states, former is a wait-state, latter is a placeholder state to allow peers to disappear quietly once refcounts settle. (bgp_event) Try reduce verbosity of FSM state-change debug, changes to same state are not interesting (Established->Established) Allow NULL action functions in FSM. * bgp_packet.c: (bgp_write) Use FSM events, rather than trying to twiddle directly with FSM state behind the back of FSM. (bgp_write_notify) ditto. (bgp_read) Remove the vague ACCEPT_PEER peer_unlock, or else this patch crashes, now it leaks instead. * bgp_route.c: (bgp_clear_node_complete) Clearing_Completed event, to end clearing. (bgp_clear_route) See extensive comments. * bgpd.c: (peer_free) should only be called while in Deleted, peer refcounting controls when peer_free is called. bgp_sync_delete should be here, not in peer_delete. (peer_delete) Initiate delete. Transition to Deleted state manually. When removing peer from indices that provide visibility of it, take great care to be idempotent wrt the reference counting of struct peer through those indices. Use bgp_timer_set, rather than replicating. Call to bgp_sync_delete isn't appropriate here, sync can be referenced while shutting down and finishing deletion. (peer_group_bind) Take care to be idempotent wrt list references indexing peers.
2006-09-14 04:58:49 +02:00
{bgp_ignore, Deleted}, /* Receive_OPEN_message */
{bgp_ignore, Deleted}, /* Receive_KEEPALIVE_message */
{bgp_ignore, Deleted}, /* Receive_UPDATE_message */
{bgp_ignore, Deleted}, /* Receive_NOTIFICATION_message */
{bgp_ignore, Deleted}, /* Clearing_Completed */
2002-12-13 21:15:29 +01:00
},
};
/* Execute event process. */
void bgp_event(struct event *thread)
2002-12-13 21:15:29 +01:00
{
struct peer_connection *connection = EVENT_ARG(thread);
enum bgp_fsm_events event;
struct peer *peer = connection->peer;
2002-12-13 21:15:29 +01:00
event = EVENT_VAL(thread);
2002-12-13 21:15:29 +01:00
peer_lock(peer);
bgp_event_update(connection, event);
peer_unlock(peer);
}
int bgp_event_update(struct peer_connection *connection,
enum bgp_fsm_events event)
{
enum bgp_fsm_status next;
enum bgp_fsm_state_progress ret = 0;
int fsm_result = FSM_PEER_NOOP;
int passive_conn = 0;
int dyn_nbr;
struct peer *peer = connection->peer;
passive_conn =
(CHECK_FLAG(peer->sflags, PEER_STATUS_ACCEPT_PEER)) ? 1 : 0;
dyn_nbr = peer_dynamic_neighbor(peer);
2002-12-13 21:15:29 +01:00
/* Logging this event. */
next = FSM[connection->status - 1][event - 1].next_state;
if (bgp_debug_neighbor_events(peer) && connection->status != next)
zlog_debug("%s [FSM] %s (%s->%s), fd %d for %s", peer->host, bgp_event_str[event],
lookup_msg(bgp_status_msg, connection->status, NULL),
lookup_msg(bgp_status_msg, next, NULL), connection->fd,
bgp_peer_get_connection_direction(connection));
peer->last_event = peer->cur_event;
peer->cur_event = event;
2002-12-13 21:15:29 +01:00
/* Call function. */
if (FSM[connection->status - 1][event - 1].func)
ret = (*(FSM[connection->status - 1][event - 1].func))(
connection);
switch (ret) {
case BGP_FSM_SUCCESS:
case BGP_FSM_SUCCESS_STATE_TRANSFER:
if (ret == BGP_FSM_SUCCESS_STATE_TRANSFER &&
next == Established) {
/* The case when doppelganger swap accurred in
bgp_establish.
Update the peer pointer accordingly */
fsm_result = FSM_PEER_TRANSFERRED;
}
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
/* If status is changed. */
if (next != connection->status) {
bgp_fsm_change_status(connection, next);
/*
* If we're going to ESTABLISHED then we executed a
* peer transfer. In this case we can either return
* FSM_PEER_TRANSITIONED or FSM_PEER_TRANSFERRED.
* Opting for TRANSFERRED since transfer implies
* session establishment.
*/
if (fsm_result != FSM_PEER_TRANSFERRED)
fsm_result = FSM_PEER_TRANSITIONED;
}
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
/* Make sure timer is set. */
bgp_timer_set(connection);
break;
case BGP_FSM_FAILURE:
/*
* If we got a return value of -1, that means there was an
* error, restart the FSM. Since bgp_stop() was called on the
* peer. only a few fields are safe to access here. In any case
* we need to indicate that the peer was stopped in the return
* code.
*/
if (!dyn_nbr && !passive_conn && peer->bgp &&
ret != BGP_FSM_FAILURE_AND_DELETE) {
flog_err(EC_BGP_FSM,
"%s [FSM] Failure handling event %s in state %s, prior events %s, %s, fd %d, last reset: %s",
peer->host, bgp_event_str[peer->cur_event],
lookup_msg(bgp_status_msg, connection->status,
NULL),
bgp_event_str[peer->last_event],
bgp_event_str[peer->last_major_event],
connection->fd,
peer_down_str[peer->last_reset]);
bgp_stop(connection);
bgp_fsm_change_status(connection, Idle);
bgp_timer_set(connection);
}
fsm_result = FSM_PEER_STOPPED;
break;
case BGP_FSM_FAILURE_AND_DELETE:
fsm_result = FSM_PEER_STOPPED;
break;
2005-06-01 Paul Jakma <paul.jakma@sun.com> * bgpd/(general) refcount struct peer and bgp_info, hence allowing us add work_queues for bgp_process. * bgpd/bgp_route.h: (struct bgp_info) Add 'lock' field for refcount. Add bgp_info_{lock,unlock} helper functions. Add bgp_info_{add,delete} helpers, to remove need for users managing locking/freeing of bgp_info and bgp_node's. * bgpd/bgp_table.h: (struct bgp_node) Add a flags field, and BGP_NODE_PROCESS_SCHEDULED to merge redundant processing of nodes. * bgpd/bgp_fsm.h: Make the ON/OFF/ADD/REMOVE macros lock and unlock peer reference as appropriate. * bgpd/bgp_damp.c: Remove its internal prototypes for bgp_info_delete/free. Just use bgp_info_delete. * bgpd/bgpd.h: (struct bgp_master) Add work_queue pointers. (struct peer) Add reference count 'lock' (peer_lock,peer_unlock) New helpers to take/release reference on struct peer. * bgpd/bgp_advertise.c: (general) Add peer and bgp_info refcounting and balance how references are taken and released. (bgp_advertise_free) release bgp_info reference, if appropriate (bgp_adj_out_free) unlock peer (bgp_advertise_clean) leave the adv references alone, or else call bgp_advertise_free cant unlock them. (bgp_adj_out_set) lock the peer on new adj's, leave the reference alone otherwise. lock the new bgp_info reference. (bgp_adj_in_set) lock the peer reference (bgp_adj_in_remove) and unlock it here (bgp_sync_delete) make hash_free on peer conditional, just in case. * bgpd/bgp_fsm.c: (general) document that the timers depend on bgp_event to release a peer reference. (bgp_fsm_change_status) moved up the file, unchanged. (bgp_stop) Decrement peer lock as many times as cancel_event canceled - shouldnt be needed but just in case. stream_fifo_clean of obuf made conditional, just in case. (bgp_event) always unlock the peer, regardless of return value of bgp_fsm_change_status. * bgpd/bgp_packet.c: (general) change several bgp_stop's to BGP_EVENT's. (bgp_read) Add a mysterious extra peer_unlock for ACCEPT_PEERs along with a comment on it. * bgpd/bgp_route.c: (general) Add refcounting of bgp_info, cleanup some of the resource management around bgp_info. Refcount peer. Add workqueues for bgp_process and clear_table. (bgp_info_new) make static (bgp_info_free) Ditto, and unlock the peer reference. (bgp_info_lock,bgp_info_unlock) new exported functions (bgp_info_add) Add a bgp_info to a bgp_node in correct fashion, taking care of reference counts. (bgp_info_delete) do the opposite of bgp_info_add. (bgp_process_rsclient) Converted into a work_queue work function. (bgp_process_main) ditto. (bgp_processq_del) process work queue item deconstructor (bgp_process_queue_init) process work queue init (bgp_process) call init function if required, set up queue item and add to queue, rather than calling process functions directly. (bgp_rib_remove) let bgp_info_delete manage bgp_info refcounts (bgp_rib_withdraw) ditto (bgp_update_rsclient) let bgp_info_add manage refcounts (bgp_update_main) ditto (bgp_clear_route_node) clear_node_queue work function, does per-node aspects of what bgp_clear_route_table did previously (bgp_clear_node_queue_del) clear_node_queue item delete function (bgp_clear_node_complete) clear_node_queue completion function, it unplugs the process queues, which have to be blocked while clear_node_queue is being processed to prevent a race. (bgp_clear_node_queue_init) init function for clear_node_queue work queues (bgp_clear_route_table) Sets up items onto a workqueue now, rather than clearing each node directly. Plugs both process queues to avoid potential race. (bgp_static_withdraw_rsclient) let bgp_info_{add,delete} manage bgp_info refcounts. (bgp_static_update_rsclient) ditto (bgp_static_update_main) ditto (bgp_static_update_vpnv4) ditto, remove unneeded cast. (bgp_static_withdraw) see bgp_static_withdraw_rsclient (bgp_static_withdraw_vpnv4) ditto (bgp_aggregate_{route,add,delete}) ditto (bgp_redistribute_{add,delete,withdraw}) ditto * bgpd/bgp_vty.c: (peer_rsclient_set_vty) lock rsclient list peer reference (peer_rsclient_unset_vty) ditto, but unlock same reference * bgpd/bgpd.c: (peer_free) handle frees of info to be kept for lifetime of struct peer. (peer_lock,peer_unlock) peer refcount helpers (peer_new) add initial refcounts (peer_create,peer_create_accept) lock peer as appropriate (peer_delete) unlock as appropriate, move out some free's to peer_free. (peer_group_bind,peer_group_unbind) peer refcounting as appropriate. (bgp_create) check CALLOC return value. (bgp_terminate) free workqueues too. * lib/memtypes.c: Add MTYPE_BGP_PROCESS_QUEUE and MTYPE_BGP_CLEAR_NODE_QUEUE
2005-06-01 13:17:05 +02:00
}
return fsm_result;
2002-12-13 21:15:29 +01:00
}
/* BGP GR Code */
static inline void
bgp_peer_inherit_global_gr_mode(struct peer *peer,
enum global_mode global_gr_mode)
{
switch (global_gr_mode) {
case GLOBAL_HELPER:
BGP_PEER_GR_HELPER_ENABLE(peer);
break;
case GLOBAL_GR:
BGP_PEER_GR_ENABLE(peer);
break;
case GLOBAL_DISABLE:
BGP_PEER_GR_DISABLE(peer);
break;
case GLOBAL_INVALID:
default:
zlog_err("Unexpected Global GR mode %d", global_gr_mode);
}
}
static void bgp_gr_update_mode_of_all_peers(struct bgp *bgp,
enum global_mode global_new_state)
{
struct peer *peer = {0};
struct listnode *node = {0};
struct listnode *nnode = {0};
enum peer_mode peer_old_state = PEER_INVALID;
struct peer_group *group;
struct peer *member;
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
if (!CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) {
peer_old_state = bgp_peer_gr_mode_get(peer);
if (peer_old_state != PEER_GLOBAL_INHERIT)
continue;
bgp_peer_inherit_global_gr_mode(peer, global_new_state);
bgp_peer_gr_flags_update(peer);
if (BGP_DEBUG(graceful_restart, GRACEFUL_RESTART))
zlog_debug("%pBP: Inherited Global GR mode, GR flags 0x%x peer flags 0x%" PRIx64
"...resetting session",
peer, peer->peer_gr_new_status_flag, peer->flags);
peer->last_reset = PEER_DOWN_CAPABILITY_CHANGE;
if (!peer_notify_config_change(peer->connection))
bgp_session_reset_safe(peer, &nnode);
} else {
group = peer->group;
for (ALL_LIST_ELEMENTS(group->peer, node, nnode, member)) {
peer_old_state = bgp_peer_gr_mode_get(member);
if (peer_old_state != PEER_GLOBAL_INHERIT)
continue;
bgp_peer_inherit_global_gr_mode(member, global_new_state);
bgp_peer_gr_flags_update(member);
if (BGP_DEBUG(graceful_restart, GRACEFUL_RESTART))
zlog_debug("%pBP: Inherited Global GR mode, GR flags 0x%x peer flags 0x%" PRIx64
"...resetting session",
member, member->peer_gr_new_status_flag,
member->flags);
member->last_reset = PEER_DOWN_CAPABILITY_CHANGE;
if (!peer_notify_config_change(member->connection))
bgp_session_reset(member);
}
}
}
}
int bgp_gr_update_all(struct bgp *bgp, enum global_gr_command global_gr_cmd)
{
enum global_mode global_new_state = GLOBAL_INVALID;
enum global_mode global_old_state = GLOBAL_INVALID;
global_old_state = bgp_global_gr_mode_get(bgp);
global_new_state = bgp->GLOBAL_GR_FSM[global_old_state][global_gr_cmd];
if (BGP_DEBUG(graceful_restart, GRACEFUL_RESTART))
zlog_debug("%s: Handle GR command %s, current GR state %s, new GR state %s",
bgp->name_pretty, print_global_gr_cmd(global_gr_cmd),
print_global_gr_mode(global_old_state),
print_global_gr_mode(global_new_state));
if (global_old_state == GLOBAL_INVALID)
return BGP_ERR_GR_OPERATION_FAILED;
if (global_new_state == GLOBAL_INVALID)
return BGP_ERR_GR_INVALID_CMD;
if (global_new_state == global_old_state)
return BGP_GR_NO_OPERATION;
/* Update global GR mode and process all peers in instance. */
bgp->global_gr_present_state = global_new_state;
bgp_gr_update_mode_of_all_peers(bgp, global_new_state);
return BGP_GR_SUCCESS;
}
const char *print_peer_gr_mode(enum peer_mode pr_mode)
{
switch (pr_mode) {
case PEER_HELPER:
return "PEER_HELPER";
case PEER_GR:
return "PEER_GR";
case PEER_DISABLE:
return "PEER_DISABLE";
case PEER_INVALID:
return "PEER_INVALID";
case PEER_GLOBAL_INHERIT:
return "PEER_GLOBAL_INHERIT";
}
return NULL;
}
const char *print_peer_gr_cmd(enum peer_gr_command pr_gr_cmd)
{
switch (pr_gr_cmd) {
case PEER_GR_CMD:
return "PEER_GR_CMD";
case NO_PEER_GR_CMD:
return "NO_PEER_GR_CMD";
case PEER_DISABLE_CMD:
return "PEER_DISABLE_GR_CMD";
case NO_PEER_DISABLE_CMD:
return "NO_PEER_DISABLE_GR_CMD";
case PEER_HELPER_CMD:
return "PEER_HELPER_CMD";
case NO_PEER_HELPER_CMD:
return "NO_PEER_HELPER_CMD";
}
return NULL;
}
const char *print_global_gr_mode(enum global_mode gl_mode)
{
switch (gl_mode) {
case GLOBAL_HELPER:
return "GLOBAL_HELPER";
case GLOBAL_GR:
return "GLOBAL_GR";
case GLOBAL_DISABLE:
return "GLOBAL_DISABLE";
case GLOBAL_INVALID:
return "GLOBAL_INVALID";
}
return "???";
}
const char *print_global_gr_cmd(enum global_gr_command gl_gr_cmd)
{
switch (gl_gr_cmd) {
case GLOBAL_GR_CMD:
return "GLOBAL_GR_CMD";
case NO_GLOBAL_GR_CMD:
return "NO_GLOBAL_GR_CMD";
case GLOBAL_DISABLE_CMD:
return "GLOBAL_DISABLE_CMD";
case NO_GLOBAL_DISABLE_CMD:
return "NO_GLOBAL_DISABLE_CMD";
}
return NULL;
}
enum global_mode bgp_global_gr_mode_get(struct bgp *bgp)
{
return bgp->global_gr_present_state;
}
enum peer_mode bgp_peer_gr_mode_get(struct peer *peer)
{
return peer->peer_gr_present_state;
}
int bgp_neighbor_graceful_restart(struct peer *peer,
enum peer_gr_command peer_gr_cmd)
{
enum peer_mode peer_new_state = PEER_INVALID;
enum peer_mode peer_old_state = PEER_INVALID;
struct bgp_peer_gr gr_fsm;
int result = BGP_GR_FAILURE;
peer_old_state = bgp_peer_gr_mode_get(peer);
gr_fsm = peer->PEER_GR_FSM[peer_old_state][peer_gr_cmd];
peer_new_state = gr_fsm.next_state;
if (BGP_DEBUG(graceful_restart, GRACEFUL_RESTART))
zlog_debug("%pBP: Handle GR command %s, current GR state %s, new GR state %s",
peer, print_peer_gr_cmd(peer_gr_cmd),
print_peer_gr_mode(peer_old_state),
print_peer_gr_mode(peer_new_state));
if (peer_old_state == PEER_INVALID)
return BGP_ERR_GR_OPERATION_FAILED;
if (peer_new_state == PEER_INVALID)
return BGP_ERR_GR_INVALID_CMD;
if (peer_new_state == peer_old_state)
return BGP_GR_NO_OPERATION;
result = gr_fsm.action_fun(peer, peer_old_state, peer_new_state);
return result;
}
static inline bool gr_mode_matches(enum peer_mode peer_gr_mode,
enum global_mode global_gr_mode)
{
if ((peer_gr_mode == PEER_HELPER && global_gr_mode == GLOBAL_HELPER) ||
(peer_gr_mode == PEER_GR && global_gr_mode == GLOBAL_GR) ||
(peer_gr_mode == PEER_DISABLE && global_gr_mode == GLOBAL_DISABLE))
return true;
return false;
}
unsigned int bgp_peer_gr_action(struct peer *peer, enum peer_mode old_state,
enum peer_mode new_state)
{
enum global_mode global_gr_mode;
bool session_reset = true;
struct peer_group *group;
struct peer *member;
struct listnode *node, *nnode;
if (old_state == new_state)
return BGP_GR_NO_OPERATION;
if ((old_state == PEER_INVALID) || (new_state == PEER_INVALID))
return BGP_ERR_GR_INVALID_CMD;
global_gr_mode = bgp_global_gr_mode_get(peer->bgp);
if ((old_state == PEER_GLOBAL_INHERIT) &&
(new_state != PEER_GLOBAL_INHERIT)) {
BGP_PEER_GR_GLOBAL_INHERIT_UNSET(peer);
if (gr_mode_matches(new_state, global_gr_mode))
/* Peer was inheriting the global state and
* its new state still is the same, so a
* session reset is not needed.
*/
session_reset = false;
} else if ((new_state == PEER_GLOBAL_INHERIT) &&
(old_state != PEER_GLOBAL_INHERIT)) {
BGP_PEER_GR_GLOBAL_INHERIT_SET(peer);
if (gr_mode_matches(old_state, global_gr_mode))
/* Peer is inheriting the global state and
* its old state was also the same, so a
* session reset is not needed.
*/
session_reset = false;
}
/* Ensure we move to the new state and update flags */
bgp_peer_move_to_gr_mode(peer, new_state);
if (session_reset) {
if (!CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) {
peer->last_reset = PEER_DOWN_CAPABILITY_CHANGE;
if (!peer_notify_config_change(peer->connection))
bgp_session_reset(peer);
} else {
group = peer->group;
for (ALL_LIST_ELEMENTS(group->peer, node, nnode, member)) {
member->last_reset = PEER_DOWN_CAPABILITY_CHANGE;
bgp_peer_move_to_gr_mode(member, new_state);
if (!peer_notify_config_change(member->connection))
bgp_session_reset(member);
}
}
}
return BGP_GR_SUCCESS;
}
void bgp_peer_move_to_gr_mode(struct peer *peer, enum peer_mode new_state)
{
enum global_mode global_gr_mode = bgp_global_gr_mode_get(peer->bgp);
enum peer_mode old_state = bgp_peer_gr_mode_get(peer);
switch (new_state) {
case PEER_HELPER:
BGP_PEER_GR_HELPER_ENABLE(peer);
break;
case PEER_GR:
BGP_PEER_GR_ENABLE(peer);
break;
case PEER_DISABLE:
BGP_PEER_GR_DISABLE(peer);
break;
case PEER_GLOBAL_INHERIT:
BGP_PEER_GR_GLOBAL_INHERIT_SET(peer);
bgp_peer_inherit_global_gr_mode(peer, global_gr_mode);
break;
case PEER_INVALID:
default:
zlog_err(
"[BGP_GR] Default switch mode ::: SOMETHING IS WRONG !!!");
break;
}
bgp_peer_gr_flags_update(peer);
peer->peer_gr_present_state = new_state;
if (BGP_DEBUG(graceful_restart, GRACEFUL_RESTART))
zlog_debug("%pBP: Peer GR mode changed from %s to %s, GR flags 0x%x peer flags 0x%" PRIx64,
peer, print_peer_gr_mode(old_state),
print_peer_gr_mode(new_state),
peer->peer_gr_new_status_flag, peer->flags);
}
void bgp_peer_gr_flags_update(struct peer *peer)
{
if (CHECK_FLAG(peer->peer_gr_new_status_flag,
PEER_GRACEFUL_RESTART_NEW_STATE_HELPER))
SET_FLAG(peer->flags, PEER_FLAG_GRACEFUL_RESTART_HELPER);
else
UNSET_FLAG(peer->flags, PEER_FLAG_GRACEFUL_RESTART_HELPER);
if (CHECK_FLAG(peer->peer_gr_new_status_flag,
PEER_GRACEFUL_RESTART_NEW_STATE_RESTART))
SET_FLAG(peer->flags, PEER_FLAG_GRACEFUL_RESTART);
else
UNSET_FLAG(peer->flags, PEER_FLAG_GRACEFUL_RESTART);
if (CHECK_FLAG(peer->peer_gr_new_status_flag,
PEER_GRACEFUL_RESTART_NEW_STATE_INHERIT))
SET_FLAG(peer->flags,
PEER_FLAG_GRACEFUL_RESTART_GLOBAL_INHERIT);
else
UNSET_FLAG(peer->flags,
PEER_FLAG_GRACEFUL_RESTART_GLOBAL_INHERIT);
if (BGP_DEBUG(graceful_restart, GRACEFUL_RESTART))
zlog_debug("%pBP: Peer flags updated to 0x%" PRIx64
", GR flags 0x%x, GR mode %s",
peer, peer->flags, peer->peer_gr_new_status_flag,
print_peer_gr_mode(bgp_peer_gr_mode_get(peer)));
/*
* If GR has been completely disabled for the peer and we were
* acting as the Helper for the peer (i.e., keeping stale routes
* and running the restart timer or stalepath timer), clear those
* states.
*/
if (!CHECK_FLAG(peer->flags, PEER_FLAG_GRACEFUL_RESTART) &&
!CHECK_FLAG(peer->flags, PEER_FLAG_GRACEFUL_RESTART_HELPER)) {
UNSET_FLAG(peer->sflags, PEER_STATUS_NSF_MODE);
if (CHECK_FLAG(peer->sflags, PEER_STATUS_NSF_WAIT)) {
if (bgp_debug_neighbor_events(peer))
zlog_debug("%pBP: GR disabled, stopping NSF and clearing stale routes",
peer);
peer_nsf_stop(peer);
}
}
}
bgpd: Fix wrong pthread event cancelling 0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 1 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 2 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 3 0x000076e399e42476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26 4 0x000076e39a34f950 in core_handler (signo=6, siginfo=0x76e3985fca30, context=0x76e3985fc900) at lib/sigevent.c:258 5 <signal handler called> 6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:44 7 __pthread_kill_internal (signo=6, threadid=130719886083648) at ./nptl/pthread_kill.c:78 8 __GI___pthread_kill (threadid=130719886083648, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 9 0x000076e399e42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 10 0x000076e399e287f3 in __GI_abort () at ./stdlib/abort.c:79 11 0x000076e39a39874b in _zlog_assert_failed (xref=0x76e39a46cca0 <_xref.27>, extra=0x0) at lib/zlog.c:789 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 13 0x000076e39a369ef6 in event_cancel_event_ready (m=0x5eda32df5e40, arg=0x5eda33afeed0) at lib/event.c:1470 14 0x00005eda0a94a5b3 in bgp_stop (connection=0x5eda33afeed0) at bgpd/bgp_fsm.c:1355 15 0x00005eda0a94b4ae in bgp_stop_with_notify (connection=0x5eda33afeed0, code=8 '\b', sub_code=0 '\000') at bgpd/bgp_fsm.c:1610 16 0x00005eda0a979498 in bgp_packet_add (connection=0x5eda33afeed0, peer=0x5eda33b11800, s=0x76e3880daf90) at bgpd/bgp_packet.c:152 17 0x00005eda0a97a80f in bgp_keepalive_send (peer=0x5eda33b11800) at bgpd/bgp_packet.c:639 18 0x00005eda0a9511fd in peer_process (hb=0x5eda33c9ab80, arg=0x76e3985ffaf0) at bgpd/bgp_keepalives.c:111 19 0x000076e39a2cd8e6 in hash_iterate (hash=0x76e388000be0, func=0x5eda0a95105e <peer_process>, arg=0x76e3985ffaf0) at lib/hash.c:252 20 0x00005eda0a951679 in bgp_keepalives_start (arg=0x5eda3306af80) at bgpd/bgp_keepalives.c:214 21 0x000076e39a2c9932 in frr_pthread_inner (arg=0x5eda3306af80) at lib/frr_pthread.c:180 22 0x000076e399e94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 23 0x000076e399f26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 (gdb) f 12 12 0x000076e39a369dde in cancel_event_helper (m=0x5eda32df5e40, arg=0x5eda33afeed0, flags=1) at lib/event.c:1428 1428 assert(m->owner == pthread_self()); In this decode the attempt to cancel the connection's events from the wrong thread is causing the crash. Modify the code to create an event on the bm->master to cancel the events for the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-10-24 23:44:31 +02:00
void bgp_event_stop_with_notify(struct event *event)
{
struct peer_connection *connection = EVENT_ARG(event);
bgp_stop_with_notify(connection, BGP_NOTIFY_SEND_HOLD_ERR, 0);
}