// SPDX-License-Identifier: GPL-2.0-or-later
/* zebra client
* Copyright (C) 1997, 98, 99 Kunihiro Ishiguro
* Copyright (c) 2023 LabN Consulting, L.L.C.
*/
#include <zebra.h>
#include "command.h"
#include "stream.h"
#include "network.h"
#include "prefix.h"
#include "log.h"
#include "sockunion.h"
#include "zclient.h"
#include "routemap.h"
#include "frrevent.h"
#include "queue.h"
#include "memory.h"
#include "lib/json.h"
#include "lib/bfd.h"
#include "lib/route_opaque.h"
#include "filter.h"
#include "mpls.h"
#include "vxlan.h"
#include "pbr.h"
#include "frrdistance.h"
#include "bgpd/bgpd.h"
#include "bgpd/bgp_route.h"
#include "bgpd/bgp_attr.h"
#include "bgpd/bgp_aspath.h"
#include "bgpd/bgp_nexthop.h"
#include "bgpd/bgp_zebra.h"
#include "bgpd/bgp_fsm.h"
#include "bgpd/bgp_debug.h"
#include "bgpd/bgp_errors.h"
#include "bgpd/bgp_mpath.h"
#include "bgpd/bgp_nexthop.h"
#include "bgpd/bgp_nht.h"
#include "bgpd/bgp_bfd.h"
#include "bgpd/bgp_label.h"
#ifdef ENABLE_BGP_VNC
#include "bgpd/rfapi/rfapi_backend.h"
#include "bgpd/rfapi/vnc_export_bgp.h"
#endif
#include "bgpd/bgp_evpn.h"
#include "bgpd/bgp_mplsvpn.h"
#include "bgpd/bgp_labelpool.h"
#include "bgpd/bgp_pbr.h"
#include "bgpd/bgp_evpn_private.h"
#include "bgpd/bgp_evpn_mh.h"
#include "bgpd/bgp_mac.h"
#include "bgpd/bgp_trace.h"
#include "bgpd/bgp_community.h"
#include "bgpd/bgp_lcommunity.h"
/* All information about zebra. */
struct zclient *zclient = NULL;
struct zclient *zclient_sync;
static bool bgp_zebra_label_manager_connect(void);
/* hook to indicate vrf status change for SNMP */
DEFINE_HOOK(bgp_vrf_status_changed, (struct bgp *bgp, struct interface *ifp),
(bgp, ifp));
DEFINE_MTYPE_STATIC(BGPD, BGP_IF_INFO, "BGP interface context");
/* Can we install into zebra? */
static inline bool bgp_install_info_to_zebra(struct bgp *bgp)
{
if (zclient->sock <= 0)
return false;
if (!IS_BGP_INST_KNOWN_TO_ZEBRA(bgp)) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug(
"%s: No zebra instance to talk to, not installing information",
__func__);
return false;
}
return true;
}
int zclient_num_connects;
/* Router-id update message from zebra. */
static int bgp_router_id_update(ZAPI_CALLBACK_ARGS)
{
struct prefix router_id;
zebra_router_id_update_read(zclient->ibuf, &router_id);
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Rx Router Id update VRF %u Id %pFX", vrf_id,
&router_id);
bgp_router_id_zebra_bump(vrf_id, &router_id);
return 0;
}
/* Set or clear interface on which unnumbered neighbor is configured. This
* would in turn cause BGP to initiate or turn off IPv6 RAs on this
* interface.
*/
static void bgp_update_interface_nbrs(struct bgp *bgp, struct interface *ifp,
struct interface *upd_ifp)
{
struct listnode *node, *nnode;
struct peer *peer;
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
if (peer->conf_if && (strcmp(peer->conf_if, ifp->name) == 0)) {
if (upd_ifp) {
peer->ifp = upd_ifp;
bgp_zebra_initiate_radv(bgp, peer);
} else {
bgp_zebra_terminate_radv(bgp, peer);
peer->ifp = upd_ifp;
}
}
}
}
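
/* Process a FEC update message from zebra. */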
static int bgp_read_fec_update(ZAPI_CALLBACK_ARGS)
{
bgp_parse_fec_update();
return 0;
}
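
/* Kick the FSM for any unnumbered peers configured on this
* interface that are not yet established.
*/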
static void bgp_start_interface_nbrs(struct bgp *bgp, struct interface *ifp)
{
struct listnode *node, *nnode;
struct peer *peer;
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
if (peer->conf_if && (strcmp(peer->conf_if, ifp->name) == 0) &&
!peer_established(peer->connection)) {
if (peer_active(peer->connection) == BGP_PEER_ACTIVE)
BGP_EVENT_ADD(peer->connection, BGP_Stop);
BGP_EVENT_ADD(peer->connection, BGP_Start);
}
}
}
static void bgp_nbr_connected_add(struct bgp *bgp, struct nbr_connected *ifc)
{
struct connected *connected;
struct interface *ifp;
struct prefix *p;
/* Kick-off the FSM for any relevant peers only if there is a
* valid local address on the interface.
*/
ifp = ifc->ifp;
frr_each (if_connected, ifp->connected, connected) {
p = connected->address;
if (p->family == AF_INET6
&& IN6_IS_ADDR_LINKLOCAL(&p->u.prefix6))
break;
}
if (!connected)
return;
bgp_start_interface_nbrs(bgp, ifp);
}
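
/* Bring down any unnumbered peers tied to this neighbor entry's
* interface; free the neighbor entry as well if asked to.
*/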
static void bgp_nbr_connected_delete(struct bgp *bgp, struct nbr_connected *ifc,
int del)
{
struct listnode *node, *nnode;
struct peer *peer;
struct interface *ifp;
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
if (peer->conf_if
&& (strcmp(peer->conf_if, ifc->ifp->name) == 0)) {
peer->last_reset = PEER_DOWN_NBR_ADDR_DEL;
BGP_EVENT_ADD(peer->connection, BGP_Stop);
}
}
/* Free neighbor also, if we're asked to. */
if (del) {
ifp = ifc->ifp;
listnode_delete(ifp->nbr_connected, ifc);
nbr_connected_free(ifc);
}
}
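
/* Handle zebra's notification that an interface was deleted. */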
static int bgp_ifp_destroy(struct interface *ifp)
{
struct bgp *bgp;
bgp = ifp->vrf->info;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Rx Intf del VRF %s IF %s", ifp->vrf->name,
ifp->name);
if (bgp) {
bgp_update_interface_nbrs(bgp, ifp, NULL);
hook_call(bgp_vrf_status_changed, bgp, ifp);
}
bgp_mac_del_mac_entry(ifp);
return 0;
}
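
/* Handle zebra's notification that an interface came up. */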
static int bgp_ifp_up(struct interface *ifp)
{
struct connected *c;
struct nbr_connected *nc;
struct listnode *node, *nnode;
struct bgp *bgp;
bgp = ifp->vrf->info;
bgp_mac_add_mac_entry(ifp);
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Rx Intf up VRF %s IF %s", ifp->vrf->name, ifp->name);
if (!bgp)
return 0;
frr_each (if_connected, ifp->connected, c)
bgp_connected_add(bgp, c);
for (ALL_LIST_ELEMENTS(ifp->nbr_connected, node, nnode, nc))
bgp_nbr_connected_add(bgp, nc);
hook_call(bgp_vrf_status_changed, bgp, ifp);
bgp_nht_ifp_up(ifp);
if (bgp_get_default() && if_is_vrf(ifp)) {
vpn_leak_zebra_vrf_label_update(bgp, AFI_IP);
vpn_leak_zebra_vrf_label_update(bgp, AFI_IP6);
vpn_leak_zebra_vrf_sid_update(bgp, AFI_IP);
vpn_leak_zebra_vrf_sid_update(bgp, AFI_IP6);
vpn_leak_postchange_all();
}
return 0;
}
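
/* Handle zebra's notification that an interface went down. */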
static int bgp_ifp_down(struct interface *ifp)
{
struct connected *c;
struct nbr_connected *nc;
struct listnode *node, *nnode;
struct bgp *bgp;
struct peer *peer;
bgp = ifp->vrf->info;
bgp_mac_del_mac_entry(ifp);
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Rx Intf down VRF %s IF %s", ifp->vrf->name,
ifp->name);
if (!bgp)
return 0;
frr_each (if_connected, ifp->connected, c)
bgp_connected_delete(bgp, c);
for (ALL_LIST_ELEMENTS(ifp->nbr_connected, node, nnode, nc))
bgp_nbr_connected_delete(bgp, nc, 1);
/* Fast external-failover */
if (!CHECK_FLAG(bgp->flags, BGP_FLAG_NO_FAST_EXT_FAILOVER)) {
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
/* Take down directly connected peers. */
if ((peer->ttl != BGP_DEFAULT_TTL)
&& (peer->gtsm_hops != BGP_GTSM_HOPS_CONNECTED))
continue;
if (ifp == peer->nexthop.ifp) {
BGP_EVENT_ADD(peer->connection, BGP_Stop);
peer->last_reset = PEER_DOWN_IF_DOWN;
}
}
}
hook_call(bgp_vrf_status_changed, bgp, ifp);
bgp_nht_ifp_down(ifp);
if (bgp_get_default() && if_is_vrf(ifp)) {
vpn_leak_zebra_vrf_label_withdraw(bgp, AFI_IP);
vpn_leak_zebra_vrf_label_withdraw(bgp, AFI_IP6);
vpn_leak_zebra_vrf_sid_withdraw(bgp, AFI_IP);
vpn_leak_zebra_vrf_sid_withdraw(bgp, AFI_IP6);
vpn_leak_postchange_all();
}
return 0;
}
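
/* Handle zebra's notification of an address being added on an
* interface.
*/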
static int bgp_interface_address_add(ZAPI_CALLBACK_ARGS)
{
struct connected *ifc;
struct bgp *bgp;
struct peer *peer;
struct prefix *addr;
struct listnode *node, *nnode;
afi_t afi;
safi_t safi;
bgp = bgp_lookup_by_vrf_id(vrf_id);
ifc = zebra_interface_address_read(cmd, zclient->ibuf, vrf_id);
if (ifc == NULL)
return 0;
if (bgp_debug_zebra(ifc->address))
zlog_debug("Rx Intf address add VRF %s IF %s addr %pFX",
ifc->ifp->vrf->name, ifc->ifp->name, ifc->address);
if (!bgp)
return 0;
if (if_is_operative(ifc->ifp)) {
bgp_connected_add(bgp, ifc);
/* If we have learnt of any neighbors on this interface,
* check to kick off any BGP interface-based neighbors,
* but only if this is a link-local address.
*/
if (IN6_IS_ADDR_LINKLOCAL(&ifc->address->u.prefix6)
&& !list_isempty(ifc->ifp->nbr_connected))
bgp_start_interface_nbrs(bgp, ifc->ifp);
else {
addr = ifc->address;
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
if (addr->family == AF_INET)
continue;
/*
* If the Peer's interface name matches the
* interface name for which BGP received the
* update and if the received interface address
* is a global v6 and if the peer is currently
* using a v4-mapped-v6 addr or a link local
* address, then copy the Rxed global v6 addr
* into peer's v6_global and send updates out
* with new nexthop addr.
*/
if ((peer->conf_if &&
(strcmp(peer->conf_if, ifc->ifp->name) ==
0)) &&
!IN6_IS_ADDR_LINKLOCAL(&addr->u.prefix6) &&
((IS_MAPPED_IPV6(
&peer->nexthop.v6_global)) ||
IN6_IS_ADDR_LINKLOCAL(
&peer->nexthop.v6_global))) {
if (bgp_debug_zebra(ifc->address)) {
zlog_debug(
"Update peer %pBP's current intf addr %pI6 and send updates",
peer,
&peer->nexthop
.v6_global);
}
memcpy(&peer->nexthop.v6_global,
&addr->u.prefix6,
IPV6_MAX_BYTELEN);
FOREACH_AFI_SAFI (afi, safi)
bgp_announce_route(peer, afi,
safi, true);
}
}
}
}
return 0;
}
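
/* Handle zebra's notification of an address being deleted from an
* interface.
*/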
static int bgp_interface_address_delete(ZAPI_CALLBACK_ARGS)
{
struct listnode *node, *nnode;
struct connected *ifc;
struct peer *peer;
struct bgp *bgp;
struct prefix *addr;
bgp = bgp_lookup_by_vrf_id(vrf_id);
ifc = zebra_interface_address_read(cmd, zclient->ibuf, vrf_id);
if (ifc == NULL)
return 0;
if (bgp_debug_zebra(ifc->address))
zlog_debug("Rx Intf address del VRF %s IF %s addr %pFX",
ifc->ifp->vrf->name, ifc->ifp->name, ifc->address);
if (bgp && if_is_operative(ifc->ifp)) {
bgp_connected_delete(bgp, ifc);
}
addr = ifc->address;
if (bgp) {
/*
* When we are using the v6 global as part of the peering
* nexthops and we are removing it, then we need to
* clear the peer data saved for that nexthop and
* cause a re-announcement of the route, since
* we do not want the peering to bounce.
*/
for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) {
afi_t afi;
safi_t safi;
if (addr->family == AF_INET)
continue;
if (!IN6_IS_ADDR_LINKLOCAL(&addr->u.prefix6) &&
memcmp(&peer->nexthop.v6_global, &addr->u.prefix6, IPV6_MAX_BYTELEN) ==
0) {
memset(&peer->nexthop.v6_global, 0, IPV6_MAX_BYTELEN);
FOREACH_AFI_SAFI (afi, safi)
bgp_announce_route(peer, afi, safi,
true);
}
}
}
connected_free(&ifc);
return 0;
}
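
/* Handle zebra's notification of a neighbor address being added on
* an interface.
*/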
static int bgp_interface_nbr_address_add(ZAPI_CALLBACK_ARGS)
{
struct nbr_connected *ifc = NULL;
struct bgp *bgp;
ifc = zebra_interface_nbr_address_read(cmd, zclient->ibuf, vrf_id);
if (ifc == NULL)
return 0;
if (bgp_debug_zebra(ifc->address))
zlog_debug("Rx Intf neighbor add VRF %s IF %s addr %pFX",
ifc->ifp->vrf->name, ifc->ifp->name, ifc->address);
if (if_is_operative(ifc->ifp)) {
bgp = bgp_lookup_by_vrf_id(vrf_id);
if (bgp)
bgp_nbr_connected_add(bgp, ifc);
}
return 0;
}
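
/* Handle zebra's notification of a neighbor address being deleted
* from an interface.
*/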
static int bgp_interface_nbr_address_delete(ZAPI_CALLBACK_ARGS)
{
struct nbr_connected *ifc = NULL;
struct bgp *bgp;
ifc = zebra_interface_nbr_address_read(cmd, zclient->ibuf, vrf_id);
if (ifc == NULL)
return 0;
if (bgp_debug_zebra(ifc->address))
zlog_debug("Rx Intf neighbor del VRF %s IF %s addr %pFX",
ifc->ifp->vrf->name, ifc->ifp->name, ifc->address);
if (if_is_operative(ifc->ifp)) {
bgp = bgp_lookup_by_vrf_id(vrf_id);
if (bgp)
bgp_nbr_connected_delete(bgp, ifc, 0);
}
nbr_connected_free(ifc);
return 0;
}
/* Zebra route add and delete treatment. */
static int zebra_read_route(ZAPI_CALLBACK_ARGS)
{
enum nexthop_types_t nhtype;
enum blackhole_type bhtype = BLACKHOLE_UNSPEC;
struct zapi_route api;
union g_addr nexthop = {};
ifindex_t ifindex;
int add, i;
struct bgp *bgp;
bgp = bgp_lookup_by_vrf_id(vrf_id);
if (!bgp)
return 0;
if (zapi_route_decode(zclient->ibuf, &api) < 0)
return -1;
/* we completely ignore srcdest routes for now. */
if (CHECK_FLAG(api.message, ZAPI_MESSAGE_SRCPFX))
return 0;
/* ignore link-local address. */
if (api.prefix.family == AF_INET6
&& IN6_IS_ADDR_LINKLOCAL(&api.prefix.u.prefix6))
return 0;
ifindex = api.nexthops[0].ifindex;
nhtype = api.nexthops[0].type;
/* api_nh structure has union of gate and bh_type */
if (nhtype == NEXTHOP_TYPE_BLACKHOLE) {
/* bh_type is only applicable if NEXTHOP_TYPE_BLACKHOLE */
bhtype = api.nexthops[0].bh_type;
} else
nexthop = api.nexthops[0].gate;
add = (cmd == ZEBRA_REDISTRIBUTE_ROUTE_ADD);
if (add) {
/*
* The ADD message is actually an UPDATE and there is no
* explicit DEL for a prior redistributed route, if any.
* So, perform an implicit DEL processing for the same
* redistributed route from any other source type.
*/
for (i = 0; i < ZEBRA_ROUTE_MAX; i++) {
if (i != api.type)
bgp_redistribute_delete(bgp, &api.prefix, i,
api.instance);
}
/* Now perform the add/update. */
bgp_redistribute_add(bgp, &api.prefix, &nexthop, ifindex,
nhtype, api.distance, bhtype, api.metric,
api.type, api.instance, api.tag);
} else {
bgp_redistribute_delete(bgp, &api.prefix, api.type,
api.instance);
}
if (bgp_debug_zebra(&api.prefix)) {
char buf[PREFIX_STRLEN];
if (add) {
inet_ntop(api.prefix.family, &nexthop, buf,
sizeof(buf));
zlog_debug("Rx route ADD %s %s[%d] %pFX nexthop %s (type %d if %u) metric %u distance %u tag %" ROUTE_TAG_PRI,
bgp->name_pretty,
zebra_route_string(api.type), api.instance,
&api.prefix, buf, nhtype, ifindex,
api.metric, api.distance, api.tag);
} else {
zlog_debug("Rx route DEL %s %s[%d] %pFX",
bgp->name_pretty,
zebra_route_string(api.type), api.instance,
&api.prefix);
}
}
return 0;
}
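
/* Find an interface with a connected prefix covering the given IPv4
* address.
*/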
struct interface *if_lookup_by_ipv4(struct in_addr *addr, vrf_id_t vrf_id)
{
struct vrf *vrf;
struct interface *ifp;
struct connected *connected;
struct prefix_ipv4 p;
struct prefix *cp;
vrf = vrf_lookup_by_id(vrf_id);
if (!vrf)
return NULL;
p.family = AF_INET;
p.prefix = *addr;
p.prefixlen = IPV4_MAX_BITLEN;
FOR_ALL_INTERFACES (vrf, ifp) {
frr_each (if_connected, ifp->connected, connected) {
cp = connected->address;
if (cp->family == AF_INET)
if (prefix_match(cp, (struct prefix *)&p))
return ifp;
}
}
return NULL;
}
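
/* Find the interface that owns exactly the given IPv4 address. */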
struct interface *if_lookup_by_ipv4_exact(struct in_addr *addr, vrf_id_t vrf_id)
{
struct vrf *vrf;
struct interface *ifp;
struct connected *connected;
struct prefix *cp;
vrf = vrf_lookup_by_id(vrf_id);
if (!vrf)
return NULL;
FOR_ALL_INTERFACES (vrf, ifp) {
frr_each (if_connected, ifp->connected, connected) {
cp = connected->address;
if (cp->family == AF_INET)
if (IPV4_ADDR_SAME(&cp->u.prefix4, addr))
return ifp;
}
}
return NULL;
}
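
/* Find an interface with a connected prefix covering the given IPv6
* address; for link-local prefixes the ifindex must also match.
*/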
struct interface *if_lookup_by_ipv6(struct in6_addr *addr, ifindex_t ifindex,
vrf_id_t vrf_id)
{
struct vrf *vrf;
struct interface *ifp;
struct connected *connected;
struct prefix_ipv6 p;
struct prefix *cp;
vrf = vrf_lookup_by_id(vrf_id);
if (!vrf)
return NULL;
p.family = AF_INET6;
p.prefix = *addr;
p.prefixlen = IPV6_MAX_BITLEN;
FOR_ALL_INTERFACES (vrf, ifp) {
frr_each (if_connected, ifp->connected, connected) {
cp = connected->address;
if (cp->family == AF_INET6)
if (prefix_match(cp, (struct prefix *)&p)) {
if (IN6_IS_ADDR_LINKLOCAL(
&cp->u.prefix6)) {
if (ifindex == ifp->ifindex)
return ifp;
} else
return ifp;
}
}
}
return NULL;
}
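
/* Find the interface that owns exactly the given IPv6 address; for
* link-local addresses the ifindex must also match.
*/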
struct interface *if_lookup_by_ipv6_exact(struct in6_addr *addr,
ifindex_t ifindex, vrf_id_t vrf_id)
{
struct vrf *vrf;
struct interface *ifp;
struct connected *connected;
struct prefix *cp;
vrf = vrf_lookup_by_id(vrf_id);
if (!vrf)
return NULL;
FOR_ALL_INTERFACES (vrf, ifp) {
frr_each (if_connected, ifp->connected, connected) {
cp = connected->address;
if (cp->family == AF_INET6)
if (IPV6_ADDR_SAME(&cp->u.prefix6, addr)) {
if (IN6_IS_ADDR_LINKLOCAL(
&cp->u.prefix6)) {
if (ifindex == ifp->ifindex)
return ifp;
} else
return ifp;
}
}
}
return NULL;
}
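
/* Fetch the first global IPv6 address on the interface. Returns 1 on
* success, 0 if none is found.
*/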
static int if_get_ipv6_global(struct interface *ifp, struct in6_addr *addr)
{
struct connected *connected;
struct prefix *cp;
frr_each (if_connected, ifp->connected, connected) {
cp = connected->address;
if (cp->family == AF_INET6)
if (!IN6_IS_ADDR_LINKLOCAL(&cp->u.prefix6)) {
memcpy(addr, &cp->u.prefix6, IPV6_MAX_BYTELEN);
return 1;
}
}
return 0;
}
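
/* Fetch the link-local IPv6 address on the interface, if any. */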
static bool if_get_ipv6_local(struct interface *ifp, struct in6_addr *addr)
{
struct connected *connected;
struct prefix *cp;
frr_each (if_connected, ifp->connected, connected) {
cp = connected->address;
if (cp->family == AF_INET6)
if (IN6_IS_ADDR_LINKLOCAL(&cp->u.prefix6)) {
memcpy(addr, &cp->u.prefix6, IPV6_MAX_BYTELEN);
return true;
}
}
return false;
}
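
/* Fetch the first non-martian IPv4 address on the interface. Returns
* 1 on success, 0 if none is found.
*/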
static int if_get_ipv4_address(struct interface *ifp, struct in_addr *addr)
{
struct connected *connected;
struct prefix *cp;
frr_each (if_connected, ifp->connected, connected) {
cp = connected->address;
if ((cp->family == AF_INET)
&& !ipv4_martian(&(cp->u.prefix4))) {
*addr = cp->u.prefix4;
return 1;
}
}
return 0;
}
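
/* Resolve the local nexthop (addresses and outgoing interface) for a
* peering from the session's local and remote addresses, and update
* peer->shared_network. Returns false if required local addressing
* (e.g. a v6 link-local on an unnumbered peering) is not available.
*/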
bool bgp_zebra_nexthop_set(union sockunion *local, union sockunion *remote,
struct bgp_nexthop *nexthop, struct peer *peer)
{
int ret = 0;
struct interface *ifp = NULL;
bool v6_ll_avail = true;
bool shared_network_original = peer->shared_network;
memset(nexthop, 0, sizeof(struct bgp_nexthop));
if (!local)
return false;
if (!remote)
return false;
if (local->sa.sa_family == AF_INET) {
nexthop->v4 = local->sin.sin_addr;
if (peer->update_if)
ifp = if_lookup_by_name(peer->update_if,
peer->bgp->vrf_id);
else
ifp = if_lookup_by_ipv4_exact(&local->sin.sin_addr,
peer->bgp->vrf_id);
}
if (local->sa.sa_family == AF_INET6) {
memcpy(&nexthop->v6_global, &local->sin6.sin6_addr, IPV6_MAX_BYTELEN);
if (IN6_IS_ADDR_LINKLOCAL(&local->sin6.sin6_addr)) {
if (peer->conf_if || peer->ifname)
ifp = if_lookup_by_name(peer->conf_if
? peer->conf_if
: peer->ifname,
peer->bgp->vrf_id);
else if (peer->update_if)
ifp = if_lookup_by_name(peer->update_if,
peer->bgp->vrf_id);
} else if (peer->update_if)
ifp = if_lookup_by_name(peer->update_if,
peer->bgp->vrf_id);
else
ifp = if_lookup_by_ipv6_exact(&local->sin6.sin6_addr,
local->sin6.sin6_scope_id,
peer->bgp->vrf_id);
}
/* Handle peerings via loopbacks. For instance, peer between
* 127.0.0.1 and 127.0.0.2. In short, allow peering with self
* via 127.0.0.0/8.
*/
if (!ifp && cmd_allow_reserved_ranges_get())
ifp = if_get_vrf_loopback(peer->bgp->vrf_id);
if (!ifp) {
/*
* BGP views do not currently get proper data
* from zebra (when attached) to be able to
* properly resolve nexthops, so give this
* instance type a pass.
*/
if (peer->bgp->inst_type == BGP_INSTANCE_TYPE_VIEW)
return true;
/*
* If we have no interface data but we have established
* some connection w/ zebra, then something has gone
* terribly, terribly wrong here, so say this failed.
* If we do not have any zebra connection, then not
* having an ifp pointer is ok.
*/
return zclient_num_connects ? false : true;
}
nexthop->ifp = ifp;
/* IPv4 connection, fetch and store IPv6 local address(es) if any. */
if (local->sa.sa_family == AF_INET) {
/* IPv6 nexthop */
ret = if_get_ipv6_global(ifp, &nexthop->v6_global);
if (!ret) {
/* There is no global nexthop. Use the link-local address
* as both the global and link-local nexthop. In this
* scenario, the expectation for interop is that the
* network admin would use a route-map to specify the
* global IPv6 nexthop.
*/
v6_ll_avail =
if_get_ipv6_local(ifp, &nexthop->v6_global);
memcpy(&nexthop->v6_local, &nexthop->v6_global,
IPV6_MAX_BYTELEN);
} else
v6_ll_avail =
if_get_ipv6_local(ifp, &nexthop->v6_local);
/*
* If we are a v4 connection and we are not doing unnumbered
* not having a v6 LL address is ok
*/
if (!v6_ll_avail && !peer->conf_if)
v6_ll_avail = true;
if (if_lookup_by_ipv4(&remote->sin.sin_addr, peer->bgp->vrf_id))
peer->shared_network = true;
else
peer->shared_network = false;
}
/* IPv6 connection, fetch and store IPv4 local address if any. */
if (local->sa.sa_family == AF_INET6) {
struct interface *direct = NULL;
/* IPv4 nexthop. */
ret = if_get_ipv4_address(ifp, &nexthop->v4);
if (!ret && peer->local_id.s_addr != INADDR_ANY)
nexthop->v4 = peer->local_id;
/* Global address */
if (!IN6_IS_ADDR_LINKLOCAL(&local->sin6.sin6_addr)) {
memcpy(&nexthop->v6_global, &local->sin6.sin6_addr,
IPV6_MAX_BYTELEN);
/* If directly connected set link-local address. */
direct = if_lookup_by_ipv6(&remote->sin6.sin6_addr,
remote->sin6.sin6_scope_id,
peer->bgp->vrf_id);
if (direct)
v6_ll_avail = if_get_ipv6_local(
ifp, &nexthop->v6_local);
/*
* It's fine to not have a v6 LL when using
* update-source loopback/vrf
*/
if (!v6_ll_avail && if_is_loopback(ifp))
v6_ll_avail = true;
else if (!v6_ll_avail) {
flog_warn(
EC_BGP_NO_LL_ADDRESS_AVAILABLE,
"Interface: %s does not have a v6 LL address associated with it, waiting until one is created for it",
ifp->name);
}
} else
/* Link-local address. */
{
ret = if_get_ipv6_global(ifp, &nexthop->v6_global);
/* If there is no global address, set the link-local
* address as global. I know this breaks the RFC
* specification...
* In this scenario, the expectation for interop is that
* the network admin would use a route-map to specify
* the global IPv6 nexthop.
*/
if (!ret)
memcpy(&nexthop->v6_global,
&local->sin6.sin6_addr,
IPV6_MAX_BYTELEN);
/* Always set the link-local address */
memcpy(&nexthop->v6_local, &local->sin6.sin6_addr,
IPV6_MAX_BYTELEN);
}
if (IN6_IS_ADDR_LINKLOCAL(&local->sin6.sin6_addr)
|| if_lookup_by_ipv6(&remote->sin6.sin6_addr,
remote->sin6.sin6_scope_id,
peer->bgp->vrf_id))
peer->shared_network = true;
else
peer->shared_network = false;
}
if (shared_network_original != peer->shared_network)
bgp_peer_bfd_update_source(peer);
/* KAME stack specific treatment. */
#ifdef KAME
if (IN6_IS_ADDR_LINKLOCAL(&nexthop->v6_global)
&& IN6_LINKLOCAL_IFINDEX(nexthop->v6_global)) {
SET_IN6_LINKLOCAL_IFINDEX(nexthop->v6_global, 0);
}
if (IN6_IS_ADDR_LINKLOCAL(&nexthop->v6_local)
&& IN6_LINKLOCAL_IFINDEX(nexthop->v6_local)) {
SET_IN6_LINKLOCAL_IFINDEX(nexthop->v6_local, 0);
}
#endif /* KAME */
/* If we have identified the local interface, there is no error for now.
*/
return v6_ll_avail;
}
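
/* Select the IPv6 nexthop (global or link-local) to use when
* installing this path via zebra; *ifindex is set when the chosen
* nexthop is a link-local address.
*/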
static struct in6_addr *
bgp_path_info_to_ipv6_nexthop(struct bgp_path_info *path, ifindex_t *ifindex)
{
struct in6_addr *nexthop = NULL;
/* Only global address nexthop exists. */
if (path->attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL
|| path->attr->mp_nexthop_len == BGP_ATTR_NHLEN_VPNV6_GLOBAL) {
nexthop = &path->attr->mp_nexthop_global;
if (IN6_IS_ADDR_LINKLOCAL(nexthop))
*ifindex = path->attr->nh_ifindex;
}
/* If both global and link-local address present. */
if (path->attr->mp_nexthop_len == BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL
|| path->attr->mp_nexthop_len
== BGP_ATTR_NHLEN_VPNV6_GLOBAL_AND_LL) {
/* Check if route-map is set to prefer global over link-local */
if (CHECK_FLAG(path->attr->nh_flags,
BGP_ATTR_NH_MP_PREFER_GLOBAL)) {
nexthop = &path->attr->mp_nexthop_global;
if (IN6_IS_ADDR_LINKLOCAL(nexthop))
*ifindex = path->attr->nh_ifindex;
} else {
/* Workaround for Cisco's nexthop bug. */
if (IN6_IS_ADDR_UNSPECIFIED(&path->attr->mp_nexthop_global) &&
path->peer->connection->su_remote &&
path->peer->connection->su_remote->sa.sa_family == AF_INET6) {
nexthop = &path->peer->connection->su_remote->sin6.sin6_addr;
if (IN6_IS_ADDR_LINKLOCAL(nexthop))
*ifindex = path->peer->nexthop.ifp
->ifindex;
} else {
nexthop = &path->attr->mp_nexthop_local;
if (IN6_IS_ADDR_LINKLOCAL(nexthop))
*ifindex = path->attr->nh_lla_ifindex;
}
}
}
return nexthop;
}
static bool bgp_table_map_apply(struct route_map *map, const struct prefix *p,
struct bgp_path_info *path)
{
route_map_result_t ret;
ret = route_map_apply(map, p, path);
bgp_attr_flush(path->attr);
if (ret != RMAP_DENYMATCH)
return true;
if (bgp_debug_zebra(p)) {
if (p->family == AF_INET) {
zlog_debug(
"Zebra rmap deny: IPv4 route %pFX nexthop %pI4",
p, &path->attr->nexthop);
}
if (p->family == AF_INET6) {
ifindex_t ifindex;
struct in6_addr *nexthop;
nexthop = bgp_path_info_to_ipv6_nexthop(path, &ifindex);
zlog_debug(
"Zebra rmap deny: IPv6 route %pFX nexthop %pI6",
p, nexthop);
}
}
return false;
}
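/*
 * Usage sketch (hypothetical caller names; mirrors the table-map handling
 * in bgp_zebra_announce_table() below). The route-map is applied to a
 * copy of the path's attributes so a "set" clause can never modify the
 * BGP RIB entry itself:
 *
 *     struct attr local_attr = *pi->attr;
 *     struct bgp_path_info local_info = *pi;
 *
 *     local_info.attr = &local_attr;
 *     is_add = bgp_table_map_apply(map, p, &local_info);
 *
 * A false return means the table-map denied the route, which must then be
 * withdrawn from zebra rather than installed.
 */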
static struct event *bgp_tm_thread_connect;
static bool bgp_tm_status_connected;
static bool bgp_tm_chunk_obtained;
#define BGP_FLOWSPEC_TABLE_CHUNK 100000
static uint32_t bgp_tm_min, bgp_tm_max, bgp_tm_chunk_size;
struct bgp *bgp_tm_bgp;
static void bgp_zebra_tm_connect(struct event *t)
{
struct zclient *zclient;
int delay = 10, ret = 0;
zclient = EVENT_ARG(t);
if (bgp_tm_status_connected && zclient->sock > 0)
delay = 60;
else {
bgp_tm_status_connected = false;
ret = tm_table_manager_connect(zclient);
}
if (ret < 0) {
zlog_err("Error connecting to table manager!");
bgp_tm_status_connected = false;
} else {
if (!bgp_tm_status_connected) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug(
"Connecting to table manager. Success");
}
bgp_tm_status_connected = true;
if (!bgp_tm_chunk_obtained) {
if (bgp_zebra_get_table_range(zclient, bgp_tm_chunk_size,
&bgp_tm_min,
&bgp_tm_max) >= 0) {
bgp_tm_chunk_obtained = true;
/* re-announce entries that could not be installed yet */
bgp_zebra_announce_table(bgp_tm_bgp, AFI_IP, SAFI_FLOWSPEC);
}
}
}
event_add_timer(bm->master, bgp_zebra_tm_connect, zclient, delay,
&bgp_tm_thread_connect);
}
bool bgp_zebra_tm_chunk_obtained(void)
{
return bgp_tm_chunk_obtained;
}
uint32_t bgp_zebra_tm_get_id(void)
{
static int table_id;
if (!bgp_tm_chunk_obtained)
return ++table_id;
return bgp_tm_min++;
}
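/*
 * Allocation sketch (illustrative numbers): ids are handed out
 * sequentially from the chunk once one has been granted, and from a
 * locally unique counter before that. E.g. with a granted range of
 * [10000 110000]:
 *
 *     bgp_zebra_tm_get_id();   returns 10000
 *     bgp_zebra_tm_get_id();   returns 10001
 *
 * and with no chunk obtained yet: 1, 2, 3, ...
 */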
void bgp_zebra_init_tm_connect(struct bgp *bgp)
{
int delay = 1;
/* if already set, do nothing */
if (bgp_tm_thread_connect != NULL)
return;
bgp_tm_status_connected = false;
bgp_tm_chunk_obtained = false;
bgp_tm_min = bgp_tm_max = 0;
bgp_tm_chunk_size = BGP_FLOWSPEC_TABLE_CHUNK;
bgp_tm_bgp = bgp;
event_add_timer(bm->master, bgp_zebra_tm_connect, zclient_sync, delay,
&bgp_tm_thread_connect);
}
int bgp_zebra_get_table_range(struct zclient *zc, uint32_t chunk_size,
uint32_t *start, uint32_t *end)
{
int ret;
if (!bgp_tm_status_connected)
return -1;
ret = tm_get_table_chunk(zc, chunk_size, start, end);
if (ret < 0) {
flog_err(EC_BGP_TABLE_CHUNK,
"BGP: Error getting table chunk %u", chunk_size);
return -1;
}
zlog_info("BGP: Table Manager returns range from chunk %u is [%u %u]",
chunk_size, *start, *end);
return 0;
}
static bool update_ipv4nh_for_route_install(int nh_othervrf, struct bgp *nh_bgp,
struct in_addr *nexthop,
struct attr *attr, bool is_evpn,
struct zapi_nexthop *api_nh)
{
struct bgp_route_evpn *bre = bgp_attr_get_evpn_overlay(attr);
api_nh->gate.ipv4 = *nexthop;
api_nh->vrf_id = nh_bgp->vrf_id;
/* Need to set fields appropriately for EVPN routes imported into
* a VRF (which are programmed as onlink on l3-vni SVI) as well as
* connected routes leaked into a VRF.
*/
if (attr->nh_type == NEXTHOP_TYPE_BLACKHOLE) {
api_nh->type = attr->nh_type;
api_nh->bh_type = attr->bh_type;
} else if (is_evpn) {
/*
* If the nexthop is EVPN overlay index gateway IP,
* treat the nexthop as NEXTHOP_TYPE_IPV4
* Else, mark the nexthop as onlink.
*/
if (bre && bre->type == OVERLAY_INDEX_GATEWAY_IP)
api_nh->type = NEXTHOP_TYPE_IPV4;
else {
api_nh->type = NEXTHOP_TYPE_IPV4_IFINDEX;
SET_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_EVPN);
SET_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_ONLINK);
api_nh->ifindex = nh_bgp->l3vni_svi_ifindex;
}
} else if (nh_othervrf && api_nh->gate.ipv4.s_addr == INADDR_ANY) {
api_nh->type = NEXTHOP_TYPE_IFINDEX;
api_nh->ifindex = attr->nh_ifindex;
} else
api_nh->type = NEXTHOP_TYPE_IPV4;
return true;
}
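/*
 * Outcome sketch (hypothetical values) for the cases handled above:
 *
 *     EVPN, no gateway-IP overlay  -> NEXTHOP_TYPE_IPV4_IFINDEX, flags
 *                                     EVPN|ONLINK, ifindex = l3vni_svi_ifindex
 *     EVPN, gateway-IP overlay     -> NEXTHOP_TYPE_IPV4
 *     leaked route, 0.0.0.0 gate   -> NEXTHOP_TYPE_IFINDEX,
 *                                     ifindex = attr->nh_ifindex
 *     everything else              -> NEXTHOP_TYPE_IPV4, gate = nexthop
 */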
static bool update_ipv6nh_for_route_install(int nh_othervrf, struct bgp *nh_bgp,
struct in6_addr *nexthop,
ifindex_t ifindex,
struct bgp_path_info *pi,
struct bgp_path_info *best_pi,
bool is_evpn,
struct zapi_nexthop *api_nh)
{
struct attr *attr;
struct bgp_route_evpn *bre;
attr = pi->attr;
api_nh->vrf_id = nh_bgp->vrf_id;
bre = bgp_attr_get_evpn_overlay(attr);
if (attr->nh_type == NEXTHOP_TYPE_BLACKHOLE) {
api_nh->type = attr->nh_type;
api_nh->bh_type = attr->bh_type;
} else if (is_evpn) {
/*
* If the nexthop is EVPN overlay index gateway IP,
* treat the nexthop as NEXTHOP_TYPE_IPV4
* Else, mark the nexthop as onlink.
*/
if (bre && bre->type == OVERLAY_INDEX_GATEWAY_IP)
api_nh->type = NEXTHOP_TYPE_IPV6;
else {
api_nh->type = NEXTHOP_TYPE_IPV6_IFINDEX;
SET_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_EVPN);
SET_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_ONLINK);
api_nh->ifindex = nh_bgp->l3vni_svi_ifindex;
}
} else if (nh_othervrf) {
if (IN6_IS_ADDR_UNSPECIFIED(nexthop)) {
api_nh->type = NEXTHOP_TYPE_IFINDEX;
api_nh->ifindex = attr->nh_ifindex;
} else if (IN6_IS_ADDR_LINKLOCAL(nexthop)) {
if (ifindex == 0)
return false;
api_nh->type = NEXTHOP_TYPE_IPV6_IFINDEX;
api_nh->ifindex = ifindex;
} else {
api_nh->type = NEXTHOP_TYPE_IPV6;
api_nh->ifindex = 0;
}
} else {
if (IN6_IS_ADDR_LINKLOCAL(nexthop)) {
if (pi == best_pi
&& attr->mp_nexthop_len
== BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL)
if (pi->peer->nexthop.ifp)
ifindex =
pi->peer->nexthop.ifp->ifindex;
if (!ifindex) {
if (pi->peer->conf_if) {
if (pi->peer->ifp)
ifindex = pi->peer->ifp->ifindex;
} else if (pi->peer->ifname)
ifindex = ifname2ifindex(
pi->peer->ifname,
pi->peer->bgp->vrf_id);
else if (pi->peer->nexthop.ifp)
ifindex =
pi->peer->nexthop.ifp->ifindex;
}
if (ifindex == 0)
return false;
api_nh->type = NEXTHOP_TYPE_IPV6_IFINDEX;
api_nh->ifindex = ifindex;
} else {
api_nh->type = NEXTHOP_TYPE_IPV6;
api_nh->ifindex = 0;
}
}
/* api_nh structure has union of gate and bh_type */
if (nexthop && api_nh->type != NEXTHOP_TYPE_BLACKHOLE)
api_nh->gate.ipv6 = *nexthop;
return true;
}
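/*
 * Outcome sketch (hypothetical values): a link-local nexthop is only
 * usable together with an interface, so the function refuses to hand
 * zebra an ambiguous gateway:
 *
 *     fe80::1, ifindex 3   -> NEXTHOP_TYPE_IPV6_IFINDEX, gate fe80::1, if 3
 *     fe80::1, ifindex 0   -> false (no interface could be derived)
 *     2001:db8::1          -> NEXTHOP_TYPE_IPV6, ifindex 0
 */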
static bool bgp_zebra_use_nhop_weighted(struct bgp *bgp, struct attr *attr,
uint64_t *nh_weight)
{
/* zero link-bandwidth and link-bandwidth not present are treated
* as the same situation.
*/
if (!attr->link_bw) {
/* the only situations should be if we're either told
* to skip or use default weight.
*/
if (bgp->lb_handling == BGP_LINK_BW_SKIP_MISSING)
return false;
*nh_weight = BGP_ZEBRA_DEFAULT_NHOP_WEIGHT;
} else
*nh_weight = attr->link_bw;
return true;
}
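/*
 * Worked example (illustrative): with weighted ECMP in effect and two
 * multipaths carrying link-bandwidth 10 and 40, zebra is given nexthop
 * weights 10 and 40, i.e. a 1:4 load-share. A path without the
 * link-bandwidth attribute either gets BGP_ZEBRA_DEFAULT_NHOP_WEIGHT or
 * is skipped entirely, depending on whether bgp->lb_handling is
 * BGP_LINK_BW_SKIP_MISSING.
 */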
static void bgp_zebra_announce_parse_nexthop(
struct bgp_path_info *info, const struct prefix *p, struct bgp *bgp,
struct zapi_route *api, unsigned int *valid_nh_count, afi_t afi,
safi_t safi, uint32_t *nhg_id, uint32_t *metric, route_tag_t *tag,
bool *allow_recursion)
{
struct zapi_nexthop *api_nh;
int nh_family;
struct bgp_path_info *mpinfo;
struct bgp *bgp_orig;
struct attr local_attr;
struct bgp_path_info local_info;
struct bgp_path_info *mpinfo_cp = &local_info;
mpls_label_t *labels;
uint8_t num_labels = 0;
mpls_label_t nh_label;
int nh_othervrf = 0;
bool nh_updated = false;
bool do_wt_ecmp;
uint32_t ttl = 0;
uint32_t bos = 0;
uint32_t exp = 0;
struct bgp_route_evpn *bre = NULL;
/* Determine if we're doing weighted ECMP or not */
do_wt_ecmp = bgp_path_info_mpath_chkwtd(bgp, info);
/*
* vrf leaking support (will have only one nexthop)
*/
if (info->extra && info->extra->vrfleak &&
info->extra->vrfleak->bgp_orig)
nh_othervrf = 1;
/* EVPN MAC-IP routes are installed with a L3 NHG id */
if (nhg_id && bgp_evpn_path_es_use_nhg(bgp, info, nhg_id)) {
mpinfo = NULL;
zapi_route_set_nhg_id(api, nhg_id);
} else {
mpinfo = info;
}
for (; mpinfo; mpinfo = bgp_path_info_mpath_next(mpinfo)) {
uint64_t nh_weight;
bool is_evpn;
bool is_parent_evpn;
if (*valid_nh_count >= multipath_num)
break;
*mpinfo_cp = *mpinfo;
nh_weight = 0;
/* Get nexthop address-family */
if (p->family == AF_INET &&
!BGP_ATTR_MP_NEXTHOP_LEN_IP6(mpinfo_cp->attr))
nh_family = AF_INET;
else if (p->family == AF_INET6 ||
(p->family == AF_INET &&
BGP_ATTR_MP_NEXTHOP_LEN_IP6(mpinfo_cp->attr)))
nh_family = AF_INET6;
else
continue;
/* If processing for weighted ECMP, determine the next hop's
* weight. Based on user setting, we may skip the next hop
* in some situations.
*/
if (do_wt_ecmp) {
if (!bgp_zebra_use_nhop_weighted(bgp, mpinfo->attr,
&nh_weight))
continue;
}
api_nh = &api->nexthops[*valid_nh_count];
api_nh->srte_color = bgp_attr_get_color(info->attr);
if (bgp_debug_zebra(&api->prefix)) {
if (BGP_PATH_INFO_NUM_LABELS(mpinfo)) {
zlog_debug("%s: p=%pFX, bgp_is_valid_label: %d",
__func__, p,
bgp_is_valid_label(
&mpinfo->extra->labels
->label[0]));
} else {
zlog_debug("%s: p=%pFX, no label", __func__, p);
}
}
if (bgp->table_map[afi][safi].name) {
/* Copy info and attributes, so the route-map
apply doesn't modify the BGP route info. */
local_attr = *mpinfo->attr;
mpinfo_cp->attr = &local_attr;
if (!bgp_table_map_apply(bgp->table_map[afi][safi].map,
p, mpinfo_cp))
continue;
/* metric/tag is only allowed to be
* overridden on 1st nexthop */
if (mpinfo == info) {
if (metric)
*metric = bgp_med_value(mpinfo_cp->attr, bgp);
if (tag)
*tag = mpinfo_cp->attr->tag;
}
}
BGP_ORIGINAL_UPDATE(bgp_orig, mpinfo, bgp);
is_parent_evpn = is_route_parent_evpn(mpinfo);
if (nh_family == AF_INET) {
nh_updated = update_ipv4nh_for_route_install(
nh_othervrf, bgp_orig,
&mpinfo_cp->attr->nexthop, mpinfo_cp->attr,
is_parent_evpn, api_nh);
} else {
ifindex_t ifindex = IFINDEX_INTERNAL;
struct in6_addr *nexthop;
nexthop = bgp_path_info_to_ipv6_nexthop(mpinfo_cp,
&ifindex);
if (!nexthop)
nh_updated = update_ipv4nh_for_route_install(
nh_othervrf, bgp_orig,
&mpinfo_cp->attr->nexthop,
mpinfo_cp->attr, is_parent_evpn,
api_nh);
else
nh_updated = update_ipv6nh_for_route_install(
nh_othervrf, bgp_orig, nexthop, ifindex,
mpinfo, info, is_parent_evpn, api_nh);
}
is_evpn = !!CHECK_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_EVPN);
bre = bgp_attr_get_evpn_overlay(mpinfo->attr);
/* Did we get proper nexthop info to update zebra? */
if (!nh_updated)
continue;
/* Allow recursion if it is a multipath group with both
* eBGP and iBGP paths.
*/
if (allow_recursion && !*allow_recursion &&
CHECK_FLAG(bgp->flags, BGP_FLAG_PEERTYPE_MULTIPATH_RELAX) &&
(mpinfo->peer->sort == BGP_PEER_IBGP ||
mpinfo->peer->sort == BGP_PEER_CONFED))
*allow_recursion = true;
num_labels = BGP_PATH_INFO_NUM_LABELS(mpinfo);
labels = num_labels ? mpinfo->extra->labels->label : NULL;
if (num_labels && (is_evpn || bgp_is_valid_label(&labels[0]))) {
enum lsp_types_t nh_label_type = ZEBRA_LSP_NONE;
if (is_evpn) {
nh_label = *bgp_evpn_path_info_labels_get_l3vni(
labels, num_labels);
nh_label_type = ZEBRA_LSP_EVPN;
} else {
mpls_lse_decode(labels[0], &nh_label, &ttl,
&exp, &bos);
}
SET_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_LABEL);
api_nh->label_num = 1;
api_nh->label_type = nh_label_type;
api_nh->labels[0] = nh_label;
}
if (is_evpn && !(bre && bre->type == OVERLAY_INDEX_GATEWAY_IP))
memcpy(&api_nh->rmac, &(mpinfo->attr->rmac),
sizeof(struct ethaddr));
api_nh->weight = nh_weight;
if (((mpinfo->attr->srv6_l3vpn &&
!sid_zero_ipv6(&mpinfo->attr->srv6_l3vpn->sid)) ||
(mpinfo->attr->srv6_vpn &&
!sid_zero_ipv6(&mpinfo->attr->srv6_vpn->sid))) &&
!is_evpn && bgp_is_valid_label(&labels[0])) {
struct in6_addr *sid_tmp =
mpinfo->attr->srv6_l3vpn
? (&mpinfo->attr->srv6_l3vpn->sid)
: (&mpinfo->attr->srv6_vpn->sid);
memcpy(&api_nh->seg6_segs[0], sid_tmp,
sizeof(api_nh->seg6_segs[0]));
if (mpinfo->attr->srv6_l3vpn &&
mpinfo->attr->srv6_l3vpn->transposition_len != 0) {
mpls_lse_decode(labels[0], &nh_label, &ttl,
&exp, &bos);
if (nh_label < MPLS_LABEL_UNRESERVED_MIN) {
if (bgp_debug_zebra(&api->prefix))
zlog_debug(
"skip invalid SRv6 routes: transposition scheme is used, but label is too small");
continue;
}
transpose_sid(&api_nh->seg6_segs[0], nh_label,
mpinfo->attr->srv6_l3vpn
->transposition_offset,
mpinfo->attr->srv6_l3vpn
->transposition_len);
}
api_nh->seg_num = 1;
SET_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_SEG6);
}
(*valid_nh_count)++;
}
}
static void bgp_debug_zebra_nh(struct zapi_route *api)
{
int i;
int nh_family;
char nh_buf[INET6_ADDRSTRLEN];
char eth_buf[ETHER_ADDR_STRLEN + 7] = { '\0' };
char buf1[ETHER_ADDR_STRLEN];
char label_buf[20];
char sid_buf[20];
char segs_buf[256];
struct zapi_nexthop *api_nh;
int count;
count = api->nexthop_num;
for (i = 0; i < count; i++) {
api_nh = &api->nexthops[i];
switch (api_nh->type) {
case NEXTHOP_TYPE_IFINDEX:
nh_buf[0] = '\0';
break;
case NEXTHOP_TYPE_IPV4:
case NEXTHOP_TYPE_IPV4_IFINDEX:
nh_family = AF_INET;
inet_ntop(nh_family, &api_nh->gate, nh_buf,
sizeof(nh_buf));
break;
case NEXTHOP_TYPE_IPV6:
case NEXTHOP_TYPE_IPV6_IFINDEX:
nh_family = AF_INET6;
inet_ntop(nh_family, &api_nh->gate, nh_buf,
sizeof(nh_buf));
break;
case NEXTHOP_TYPE_BLACKHOLE:
strlcpy(nh_buf, "blackhole", sizeof(nh_buf));
break;
default:
/* Note: new nexthop types must be handled here */
assert(0);
break;
}
label_buf[0] = '\0';
eth_buf[0] = '\0';
segs_buf[0] = '\0';
if (CHECK_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_LABEL) &&
!CHECK_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_EVPN))
snprintf(label_buf, sizeof(label_buf), "label %u",
api_nh->labels[0]);
if (CHECK_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_SEG6) &&
!CHECK_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_EVPN)) {
inet_ntop(AF_INET6, &api_nh->seg6_segs[0], sid_buf,
sizeof(sid_buf));
snprintf(segs_buf, sizeof(segs_buf), "segs %s", sid_buf);
}
if (CHECK_FLAG(api_nh->flags, ZAPI_NEXTHOP_FLAG_EVPN) &&
!is_zero_mac(&api_nh->rmac))
snprintf(eth_buf, sizeof(eth_buf), " RMAC %s",
prefix_mac2str(&api_nh->rmac, buf1,
sizeof(buf1)));
zlog_debug(" nhop [%d]: %s if %u VRF %u wt %" PRIu64
" %s %s %s",
i + 1, nh_buf, api_nh->ifindex, api_nh->vrf_id,
api_nh->weight, label_buf, segs_buf, eth_buf);
}
}
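/*
 * Sample of the debug line emitted above (hypothetical values):
 *
 *     nhop [1]: 192.0.2.1 if 3 VRF 0 wt 1 label 16
 *     nhop [2]: fe80::2 if 4 VRF 0 wt 1   RMAC 52:54:00:12:34:56
 */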
static enum zclient_send_status
bgp_zebra_announce_actual(struct bgp_dest *dest, struct bgp_path_info *info,
struct bgp *bgp)
{
struct bgp_path_info *bpi_ultimate;
struct zapi_route api = { 0 };
unsigned int valid_nh_count = 0;
bool allow_recursion = false;
uint8_t distance;
struct peer *peer;
uint32_t metric;
route_tag_t tag;
uint32_t nhg_id = 0;
struct bgp_table *table = bgp_dest_table(dest);
const struct prefix *p = bgp_dest_get_prefix(dest);
if (table->safi == SAFI_FLOWSPEC) {
bgp_pbr_update_entry(bgp, p, info, table->afi, table->safi,
true);
return ZCLIENT_SEND_SUCCESS;
}
/* Make Zebra API structure. */
api.vrf_id = bgp->vrf_id;
api.type = ZEBRA_ROUTE_BGP;
api.safi = table->safi;
api.prefix = *p;
SET_FLAG(api.message, ZAPI_MESSAGE_NEXTHOP);
peer = info->peer;
if (info->type == ZEBRA_ROUTE_BGP) {
bpi_ultimate = bgp_get_imported_bpi_ultimate(info);
peer = bpi_ultimate->peer;
}
tag = info->attr->tag;
if (peer->sort == BGP_PEER_IBGP || peer->sort == BGP_PEER_CONFED
|| info->sub_type == BGP_ROUTE_AGGREGATE) {
SET_FLAG(api.flags, ZEBRA_FLAG_IBGP);
SET_FLAG(api.flags, ZEBRA_FLAG_ALLOW_RECURSION);
}
if ((peer->sort == BGP_PEER_EBGP && peer->ttl != BGP_DEFAULT_TTL)
|| CHECK_FLAG(peer->flags, PEER_FLAG_DISABLE_CONNECTED_CHECK)
|| CHECK_FLAG(bgp->flags, BGP_FLAG_DISABLE_NH_CONNECTED_CHK))
allow_recursion = true;
if (info->attr->rmap_table_id) {
SET_FLAG(api.message, ZAPI_MESSAGE_TABLEID);
api.tableid = info->attr->rmap_table_id;
}
if (info->attr->srte_color)
SET_FLAG(api.message, ZAPI_MESSAGE_SRTE);
/* Metric is currently based on the best-path only */
metric = info->attr->med;
bgp_zebra_announce_parse_nexthop(info, p, bgp, &api, &valid_nh_count,
table->afi, table->safi, &nhg_id,
&metric, &tag, &allow_recursion);
if (CHECK_FLAG(bm->flags, BM_FLAG_SEND_EXTRA_DATA_TO_ZEBRA)) {
struct bgp_zebra_opaque bzo = {};
const char *reason =
bgp_path_selection_reason2str(dest->reason);
strlcpy(bzo.aspath, info->attr->aspath->str,
sizeof(bzo.aspath));
if (info->attr->flag & ATTR_FLAG_BIT(BGP_ATTR_COMMUNITIES))
strlcpy(bzo.community,
bgp_attr_get_community(info->attr)->str,
sizeof(bzo.community));
if (info->attr->flag
& ATTR_FLAG_BIT(BGP_ATTR_LARGE_COMMUNITIES))
strlcpy(bzo.lcommunity,
bgp_attr_get_lcommunity(info->attr)->str,
sizeof(bzo.lcommunity));
strlcpy(bzo.selection_reason, reason,
sizeof(bzo.selection_reason));
SET_FLAG(api.message, ZAPI_MESSAGE_OPAQUE);
api.opaque.length = MIN(sizeof(struct bgp_zebra_opaque),
ZAPI_MESSAGE_OPAQUE_LENGTH);
memcpy(api.opaque.data, &bzo, api.opaque.length);
}
if (allow_recursion)
SET_FLAG(api.flags, ZEBRA_FLAG_ALLOW_RECURSION);
/*
* When we create an aggregate route we must also
* install a Null0 route in the RIB, so overwrite
* what was written into api with a blackhole route
*/
if (info->sub_type == BGP_ROUTE_AGGREGATE)
zapi_route_set_blackhole(&api, BLACKHOLE_NULL);
else
api.nexthop_num = valid_nh_count;
SET_FLAG(api.message, ZAPI_MESSAGE_METRIC);
api.metric = metric;
if (tag) {
SET_FLAG(api.message, ZAPI_MESSAGE_TAG);
api.tag = tag;
}
distance = bgp_distance_apply(p, info, table->afi, table->safi, bgp);
if (distance) {
SET_FLAG(api.message, ZAPI_MESSAGE_DISTANCE);
api.distance = distance;
}
if (bgp_debug_zebra(p)) {
zlog_debug("Tx route add %s (table id %u) %pFX metric %u tag %" ROUTE_TAG_PRI
" count %d nhg %d",
bgp->name_pretty, api.tableid, &api.prefix,
api.metric, api.tag, api.nexthop_num, nhg_id);
bgp_debug_zebra_nh(&api);
zlog_debug("%s: %pFX: announcing to zebra (recursion %sset)",
__func__, p, (allow_recursion ? "" : "NOT "));
}
return zclient_route_send(ZEBRA_ROUTE_ADD, zclient, &api);
}
/* Announce all routes of a table to zebra */
void bgp_zebra_announce_table(struct bgp *bgp, afi_t afi, safi_t safi)
{
struct bgp_dest *dest;
struct bgp_table *table;
struct bgp_path_info *pi;
/* Don't try to install if we're not connected to Zebra or Zebra doesn't
* know of this instance.
*/
if (!bgp_install_info_to_zebra(bgp))
return;
table = bgp->rib[afi][safi];
if (!table)
return;
for (dest = bgp_table_top(table); dest; dest = bgp_route_next(dest))
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next)
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED) &&
(pi->type == ZEBRA_ROUTE_BGP && (pi->sub_type == BGP_ROUTE_NORMAL ||
pi->sub_type == BGP_ROUTE_IMPORTED))) {
bool is_add = true;
if (bgp->table_map[afi][safi].name) {
struct attr local_attr = *pi->attr;
struct bgp_path_info local_info = *pi;
local_info.attr = &local_attr;
is_add = bgp_table_map_apply(bgp->table_map[afi][safi].map,
bgp_dest_get_prefix(dest),
&local_info);
}
bgp_zebra_route_install(dest, pi, bgp, is_add, NULL, false);
}
}
/* Announce routes of any bgp subtype of a table to zebra */
void bgp_zebra_announce_table_all_subtypes(struct bgp *bgp, afi_t afi,
safi_t safi)
{
struct bgp_dest *dest;
struct bgp_table *table;
struct bgp_path_info *pi;
if (!bgp_install_info_to_zebra(bgp))
return;
table = bgp->rib[afi][safi];
if (!table)
return;
for (dest = bgp_table_top(table); dest; dest = bgp_route_next(dest))
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next)
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED) &&
pi->type == ZEBRA_ROUTE_BGP)
bgp_zebra_route_install(dest, pi, bgp, true,
NULL, false);
}
enum zclient_send_status bgp_zebra_withdraw_actual(struct bgp_dest *dest,
struct bgp_path_info *info,
struct bgp *bgp)
{
struct zapi_route api;
struct peer *peer;
struct bgp_table *table = bgp_dest_table(dest);
const struct prefix *p = bgp_dest_get_prefix(dest);
if (table->safi == SAFI_FLOWSPEC) {
peer = info->peer;
bgp_pbr_update_entry(peer->bgp, p, info, table->afi,
table->safi, false);
return ZCLIENT_SEND_SUCCESS;
}
memset(&api, 0, sizeof(api));
api.vrf_id = bgp->vrf_id;
api.type = ZEBRA_ROUTE_BGP;
api.safi = table->safi;
api.prefix = *p;
if (info->attr->rmap_table_id) {
SET_FLAG(api.message, ZAPI_MESSAGE_TABLEID);
api.tableid = info->attr->rmap_table_id;
}
if (bgp_debug_zebra(p))
zlog_debug("Tx route delete %s (table id %u) %pFX",
bgp->name_pretty, api.tableid, &api.prefix);
return zclient_route_send(ZEBRA_ROUTE_DELETE, zclient, &api);
}
/*
* Walk the new FIFO list one by one and invoke bgp_zebra_announce/withdraw
* to install/withdraw the routes to zebra.
*
* If status = ZCLIENT_SEND_SUCCESS (buffer empty), i.e. Zebra is free to
* receive more incoming data, then pick the next item on the list and
* continue processing.
*
* If status = ZCLIENT_SEND_BUFFERED (buffer pending), i.e. Zebra is busy,
* break and bail out of the function, because once zebra is free again a
* callback is triggered which in turn calls this same function to continue
* processing the items on the list.
*/
#define ZEBRA_ANNOUNCEMENTS_LIMIT 1000
static void bgp_handle_route_announcements_to_zebra(struct event *e)
{
bool is_evpn = false;
uint32_t count = 0;
struct bgp_dest *dest = NULL;
struct bgp_table *table = NULL;
enum zclient_send_status status = ZCLIENT_SEND_SUCCESS;
bool install;
const struct prefix_evpn *evp = NULL;
while (count < ZEBRA_ANNOUNCEMENTS_LIMIT) {
is_evpn = false;
dest = zebra_announce_pop(&bm->zebra_announce_head);
if (!dest)
break;
table = bgp_dest_table(dest);
install = CHECK_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_INSTALL);
if (table->afi == AFI_L2VPN && table->safi == SAFI_EVPN) {
is_evpn = true;
evp = (const struct prefix_evpn *)bgp_dest_get_prefix(
dest);
}
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("BGP %s%s route %pBD(%s) with dest %p and flags 0x%x to zebra",
install ? "announcing" : "withdrawing",
is_evpn ? " evpn" : " ", dest,
table->bgp->name_pretty, dest, dest->flags);
if (install) {
if (is_evpn)
status =
evpn_zebra_install(table->bgp,
dest->za_vpn,
(const struct prefix_evpn
*)
bgp_dest_get_prefix(
dest),
dest->za_bgp_pi);
else
status = bgp_zebra_announce_actual(dest,
dest->za_bgp_pi,
table->bgp);
UNSET_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_INSTALL);
} else {
if (is_evpn)
status = evpn_zebra_uninstall(
table->bgp, dest->za_vpn,
(const struct prefix_evpn *)
bgp_dest_get_prefix(dest),
dest->za_bgp_pi, false);
else
status = bgp_zebra_withdraw_actual(dest,
dest->za_bgp_pi,
table->bgp);
UNSET_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_DELETE);
}
if (is_evpn && status == ZCLIENT_SEND_FAILURE)
flog_err(EC_BGP_EVPN_FAIL,
"%s (%u): Failed to %s EVPN %pFX %s route in VNI %u",
vrf_id_to_name(table->bgp->vrf_id),
table->bgp->vrf_id,
install ? "install" : "uninstall", evp,
evp->prefix.route_type == BGP_EVPN_MAC_IP_ROUTE
? "MACIP"
: "IMET",
dest->za_vpn->vni);
bgp_path_info_unlock(dest->za_bgp_pi);
dest->za_bgp_pi = NULL;
dest->za_vpn = NULL;
bgp_dest_unlock_node(dest);
if (status == ZCLIENT_SEND_BUFFERED)
break;
count++;
}
if (status != ZCLIENT_SEND_BUFFERED &&
zebra_announce_count(&bm->zebra_announce_head))
event_add_event(bm->master,
bgp_handle_route_announcements_to_zebra, NULL,
0, &bm->t_bgp_zebra_route);
}
/*
* Callback function invoked when zclient_flush_data() receives a BUFFER_EMPTY,
* i.e. zebra is free to receive more incoming data.
*/
static void bgp_zebra_buffer_write_ready(void)
{
bgp_handle_route_announcements_to_zebra(NULL);
}
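/*
 * Resume-flow sketch: this callback is assumed to be registered as the
 * zclient write-ready hook when the zebra client is set up (elsewhere in
 * this file), along the lines of:
 *
 *     zclient->zebra_buffer_write_ready = bgp_zebra_buffer_write_ready;
 *
 * so that draining zebra's socket buffer re-enters
 * bgp_handle_route_announcements_to_zebra() and the FIFO walk resumes
 * where ZCLIENT_SEND_BUFFERED stopped it.
 */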
/*
* BGP is now keeping a list of dests with the dest having a pointer
* to the bgp_path_info that it will be working on.
* Here is the sequence of events that should happen:
*
* Current State New State Action
* ------------- --------- ------
* ---- Install Place dest on list, save pi, mark
* as going to be installed
* ---- Withdrawal Place dest on list, save pi, mark
* as going to be deleted
*
* Install Install Leave dest on list, release old pi,
* save new pi, mark as going to be
* installed
* Install Withdrawal Leave dest on list, release old pi,
* save new pi, mark as going to be
* withdrawn, remove install flag
*
* Withdrawal Install Leave dest on list, release old pi,
* save new pi, mark as going to be
* installed.
* Withdrawal Withdrawal Leave dest on list, release old pi,
* save new pi, mark as going to be
* withdrawn.
*/
void bgp_zebra_route_install(struct bgp_dest *dest, struct bgp_path_info *info,
struct bgp *bgp, bool install, struct bgpevpn *vpn,
bool is_sync)
{
bool is_evpn = false;
struct bgp_table *table = NULL;
table = bgp_dest_table(dest);
if (table && table->afi == AFI_L2VPN && table->safi == SAFI_EVPN)
is_evpn = true;
/*
* BGP is installing this route and bgp has been configured
* to suppress announcements until the route has been installed
* let's set the fact that we expect this route to be installed
*/
if (install) {
if (BGP_SUPPRESS_FIB_ENABLED(bgp))
SET_FLAG(dest->flags, BGP_NODE_FIB_INSTALL_PENDING);
if (bgp->main_zebra_update_hold && !is_evpn)
return;
} else {
UNSET_FLAG(dest->flags, BGP_NODE_FIB_INSTALL_PENDING);
}
/*
* Don't try to install if we're not connected to Zebra or Zebra doesn't
* know of this instance.
*/
if (!bgp_install_info_to_zebra(bgp) && !is_evpn)
return;
if (!CHECK_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_INSTALL) &&
!CHECK_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_DELETE)) {
zebra_announce_add_tail(&bm->zebra_announce_head, dest);
/*
* If neither flag is set, za_bgp_pi must not be set either; anything else is a bug
*/
assert(!dest->za_bgp_pi);
bgp_path_info_lock(info);
bgp_dest_lock_node(dest);
dest->za_bgp_pi = info;
} else if (CHECK_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_INSTALL)) {
assert(dest->za_bgp_pi);
bgp_path_info_unlock(dest->za_bgp_pi);
bgp_path_info_lock(info);
dest->za_bgp_pi = info;
} else if (CHECK_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_DELETE)) {
assert(dest->za_bgp_pi);
bgp_path_info_unlock(dest->za_bgp_pi);
bgp_path_info_lock(info);
dest->za_bgp_pi = info;
}
if (is_evpn) {
dest->za_vpn = vpn;
dest->za_is_sync = is_sync;
}
if (install) {
UNSET_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_DELETE);
SET_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_INSTALL);
} else {
UNSET_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_INSTALL);
SET_FLAG(dest->flags, BGP_NODE_SCHEDULE_FOR_DELETE);
}
event_add_event(bm->master, bgp_handle_route_announcements_to_zebra,
NULL, 0, &bm->t_bgp_zebra_route);
}
/* Withdraw all entries in a BGP instance's RIB table from Zebra */
void bgp_zebra_withdraw_table_all_subtypes(struct bgp *bgp, afi_t afi, safi_t safi)
{
struct bgp_dest *dest;
struct bgp_table *table;
struct bgp_path_info *pi;
if (!bgp_install_info_to_zebra(bgp))
return;
table = bgp->rib[afi][safi];
if (!table)
return;
for (dest = bgp_table_top(table); dest; dest = bgp_route_next(dest)) {
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED)
&& (pi->type == ZEBRA_ROUTE_BGP))
bgp_zebra_route_install(dest, pi, bgp, false,
NULL, false);
}
}
}
struct bgp_redist *bgp_redist_lookup(struct bgp *bgp, afi_t afi, uint8_t type,
unsigned short instance)
{
	struct list *red_list;
	struct listnode *node;
	struct bgp_redist *red;

	red_list = bgp->redist[afi][type];
	if (!red_list)
		return NULL;

	for (ALL_LIST_ELEMENTS_RO(red_list, node, red))
		if (red->instance == instance)
			return red;

	return NULL;
}
struct bgp_redist *bgp_redist_add(struct bgp *bgp, afi_t afi, uint8_t type,
unsigned short instance)
{
	struct list *red_list;
	struct bgp_redist *red;

	red = bgp_redist_lookup(bgp, afi, type, instance);
	if (red)
		return red;

	if (!bgp->redist[afi][type])
		bgp->redist[afi][type] = list_new();

	red_list = bgp->redist[afi][type];
	red = XCALLOC(MTYPE_BGP_REDIST, sizeof(struct bgp_redist));
	red->instance = instance;
	listnode_add(red_list, red);

	return red;
}
static void bgp_redist_del(struct bgp *bgp, afi_t afi, uint8_t type,
unsigned short instance)
{
	struct bgp_redist *red;

	red = bgp_redist_lookup(bgp, afi, type, instance);

	if (red) {
		listnode_delete(bgp->redist[afi][type], red);
		XFREE(MTYPE_BGP_REDIST, red);
		if (!bgp->redist[afi][type]->count)
			list_delete(&bgp->redist[afi][type]);
	}
}
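
/*
 * Illustrative sketch (not part of the original file): the three helpers
 * above form the lookup/add/del lifecycle of a redistribution entry. The
 * function below is hypothetical and only demonstrates the calling pattern.
 */
static void bgp_redist_lifecycle_sketch(struct bgp *bgp)
{
	struct bgp_redist *red;

	/* Create (or fetch) the entry for OSPF instance 1. */
	red = bgp_redist_add(bgp, AFI_IP, ZEBRA_ROUTE_OSPF, 1);

	/* A later lookup returns the same entry. */
	assert(red == bgp_redist_lookup(bgp, AFI_IP, ZEBRA_ROUTE_OSPF, 1));

	/* Deleting the last entry also frees the per-type list. */
	bgp_redist_del(bgp, AFI_IP, ZEBRA_ROUTE_OSPF, 1);
}
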
/* Other routes redistribution into BGP. */
int bgp_redistribute_set(struct bgp *bgp, afi_t afi, int type,
unsigned short instance, bool changed)
{
	/* If redistribute options have changed, call
	 * bgp_redistribute_unreg() to reset the option and withdraw
	 * the routes.
	 */
	if (changed)
		bgp_redistribute_unreg(bgp, afi, type, instance);

	/* Return if the redistribute flag is already set. */
	if (instance) {
		if (type == ZEBRA_ROUTE_TABLE_DIRECT) {
			/*
			 * When redistribution type is `table-direct` the
			 * instance means `table identification`.
			 *
			 * `table_id` supports 32-bit integers; however, since
			 * `instance` is being overloaded as `table_id`, only
			 * the first 65535 tables can be used.
			 *
			 * The ZAPI must also support `int`
			 * (see `zebra_redistribute_add`).
			 */
			struct redist_table_direct table = {
				.table_id = instance,
				.vrf_id = bgp->vrf_id,
			};

			if (redist_lookup_table_direct(&zclient->mi_redist[afi][type],
						       &table) != NULL)
				return CMD_WARNING;

			redist_add_table_direct(&zclient->mi_redist[afi][type],
						&table);
		} else {
			if (redist_check_instance(&zclient->mi_redist[afi][type],
						  instance))
				return CMD_WARNING;

			redist_add_instance(&zclient->mi_redist[afi][type],
					    instance);
		}
	} else {
		if (vrf_bitmap_check(&zclient->redist[afi][type], bgp->vrf_id))
			return CMD_WARNING;

#ifdef ENABLE_BGP_VNC
		if (EVPN_ENABLED(bgp) && type == ZEBRA_ROUTE_VNC_DIRECT) {
			vnc_export_bgp_enable(
				bgp, afi); /* only enables if mode bits cfg'd */
		}
#endif
		vrf_bitmap_set(&zclient->redist[afi][type], bgp->vrf_id);
	}

	/*
	 * Don't try to register if we're not connected to Zebra or Zebra
	 * doesn't know of this instance.
	 *
	 * When we come up later we'll resend if needed.
	 */
	if (!bgp_install_info_to_zebra(bgp))
		return CMD_SUCCESS;

	if (BGP_DEBUG(zebra, ZEBRA))
		zlog_debug("Tx redistribute add %s afi %d %s %d",
			   bgp->name_pretty, afi, zebra_route_string(type),
			   instance);

	/* Send distribute add message to zebra. */
	zebra_redistribute_send(ZEBRA_REDISTRIBUTE_ADD, zclient, afi, type,
				instance, bgp->vrf_id);

	return CMD_SUCCESS;
}
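
/*
 * Illustrative sketch (not part of the original file): for a configuration
 * such as "redistribute table-direct 100", the kernel table id travels in
 * the `instance` argument, so ids above 65535 cannot be expressed. The
 * wrapper below is hypothetical and only shows the calling pattern.
 */
static int bgp_redistribute_table_direct_sketch(struct bgp *bgp, afi_t afi,
						uint32_t table_id)
{
	/* `instance` is an unsigned short; reject ids it cannot carry. */
	if (table_id > 65535)
		return CMD_WARNING;

	bgp_redist_add(bgp, afi, ZEBRA_ROUTE_TABLE_DIRECT,
		       (unsigned short)table_id);
	return bgp_redistribute_set(bgp, afi, ZEBRA_ROUTE_TABLE_DIRECT,
				    (unsigned short)table_id, false);
}
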
int bgp_redistribute_resend(struct bgp *bgp, afi_t afi, int type,
unsigned short instance)
{
	/* Don't try to send if we're not connected to Zebra or Zebra doesn't
	 * know of this instance.
	 */
	if (!bgp_install_info_to_zebra(bgp))
		return -1;

	if (BGP_DEBUG(zebra, ZEBRA))
		zlog_debug("Tx redistribute del/add %s afi %d %s %d",
			   bgp->name_pretty, afi, zebra_route_string(type),
			   instance);

	/* Send distribute delete, then add, so zebra replays the routes. */
	zebra_redistribute_send(ZEBRA_REDISTRIBUTE_DELETE, zclient, afi, type,
				instance, bgp->vrf_id);
zebra_redistribute_send(ZEBRA_REDISTRIBUTE_ADD, zclient, afi, type,
instance, bgp->vrf_id);
return 0;
}
/* Redistribute with route-map specification. */
bool bgp_redistribute_rmap_set(struct bgp_redist *red, const char *name,
struct route_map *route_map)
{
if (red->rmap.name && (strcmp(red->rmap.name, name) == 0))
return false;
XFREE(MTYPE_ROUTE_MAP_NAME, red->rmap.name);
/* Decrement the count for existing routemap and
* increment the count for new route map.
*/
route_map_counter_decrement(red->rmap.map);
red->rmap.name = XSTRDUP(MTYPE_ROUTE_MAP_NAME, name);
red->rmap.map = route_map;
route_map_counter_increment(red->rmap.map);
return true;
}
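/*
 * Usage sketch (hypothetical caller, not code from this file): callers
 * normally fetch the bgp_redist entry first and refresh redistributed
 * routes only when the route-map actually changed, along the lines of:
 *
 *	struct route_map *rmap = route_map_lookup_by_name("RMAP-CONNECTED");
 *	struct bgp_redist *red;
 *
 *	red = bgp_redist_lookup(bgp, AFI_IP, ZEBRA_ROUTE_CONNECT, 0);
 *	if (red && bgp_redistribute_rmap_set(red, "RMAP-CONNECTED", rmap))
 *		bgp_redistribute_resend(bgp, AFI_IP, ZEBRA_ROUTE_CONNECT, 0);
 */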
/* Redistribute with metric specification. */
bool bgp_redistribute_metric_set(struct bgp *bgp, struct bgp_redist *red,
afi_t afi, int type, uint32_t metric)
{
struct bgp_dest *dest;
struct bgp_path_info *pi;
if (red->redist_metric_flag && red->redist_metric == metric)
return false;
red->redist_metric_flag = 1;
red->redist_metric = metric;
for (dest = bgp_table_top(bgp->rib[afi][SAFI_UNICAST]); dest;
dest = bgp_route_next(dest)) {
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
if (pi->sub_type == BGP_ROUTE_REDISTRIBUTE
&& pi->type == type
&& pi->instance == red->instance) {
struct attr *old_attr;
struct attr new_attr;
new_attr = *pi->attr;
new_attr.med = red->redist_metric;
old_attr = pi->attr;
pi->attr = bgp_attr_intern(&new_attr);
bgp_attr_unintern(&old_attr);
bgp_path_info_set_flag(dest, pi,
BGP_PATH_ATTR_CHANGED);
bgp_process(bgp, dest, pi, afi, SAFI_UNICAST);
}
}
}
return true;
}
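/*
 * Usage sketch (hypothetical caller): a metric change rewrites the MED on
 * every matching redistributed path and runs bgp_process() on it, so no
 * separate re-announce step is needed:
 *
 *	struct bgp_redist *red;
 *
 *	red = bgp_redist_lookup(bgp, AFI_IP, ZEBRA_ROUTE_CONNECT, 0);
 *	if (red)
 *		bgp_redistribute_metric_set(bgp, red, AFI_IP,
 *					    ZEBRA_ROUTE_CONNECT, 100);
 */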
/* Unset redistribution. */
int bgp_redistribute_unreg(struct bgp *bgp, afi_t afi, int type,
unsigned short instance)
{
struct bgp_redist *red;
red = bgp_redist_lookup(bgp, afi, type, instance);
if (!red)
return CMD_SUCCESS;
/* Return if this redistribution is not currently registered with zebra. */
if (instance) {
if (type == ZEBRA_ROUTE_TABLE_DIRECT) {
struct redist_table_direct table = {
.table_id = instance,
.vrf_id = bgp->vrf_id,
};
if (redist_lookup_table_direct(&zclient->mi_redist[afi][type], &table) ==
NULL)
return CMD_WARNING;
redist_del_table_direct(&zclient->mi_redist[afi][type], &table);
} else {
if (!redist_check_instance(&zclient->mi_redist[afi][type], instance))
return CMD_WARNING;
redist_del_instance(&zclient->mi_redist[afi][type], instance);
}
} else {
if (!vrf_bitmap_check(&zclient->redist[afi][type], bgp->vrf_id))
return CMD_WARNING;
vrf_bitmap_unset(&zclient->redist[afi][type], bgp->vrf_id);
}
if (bgp_install_info_to_zebra(bgp)) {
/* Send distribute delete message to zebra. */
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Tx redistribute del %s afi %d %s %d",
bgp->name_pretty, afi,
zebra_route_string(type), instance);
zebra_redistribute_send(ZEBRA_REDISTRIBUTE_DELETE, zclient, afi,
type, instance, bgp->vrf_id);
}
/* Withdraw redistributed routes from current BGP's routing table. */
bgp_redistribute_withdraw(bgp, afi, type, instance);
return CMD_SUCCESS;
}
/* Unset redistribution. */
static void _bgp_redistribute_unset(struct bgp *bgp, afi_t afi, int type,
unsigned short instance)
{
struct bgp_redist *red;
/*
 * The VNC and VPN->VRF checks must come before the bgp_redist lookup
 * because they operate within bgpd regardless of the zebra connection
 * status; the lookup fails when there is no zebra connection.
 */
#ifdef ENABLE_BGP_VNC
if (EVPN_ENABLED(bgp) && type == ZEBRA_ROUTE_VNC_DIRECT) {
vnc_export_bgp_disable(bgp, afi);
}
#endif
red = bgp_redist_lookup(bgp, afi, type, instance);
if (!red)
return;
bgp_redistribute_unreg(bgp, afi, type, instance);
/* Unset route-map. */
XFREE(MTYPE_ROUTE_MAP_NAME, red->rmap.name);
route_map_counter_decrement(red->rmap.map);
red->rmap.map = NULL;
/* Unset metric. */
red->redist_metric_flag = 0;
red->redist_metric = 0;
bgp_redist_del(bgp, afi, type, instance);
}
void bgp_redistribute_unset(struct bgp *bgp, afi_t afi, int type,
unsigned short instance)
{
struct listnode *node, *nnode;
struct bgp_redist *red;
if ((type != ZEBRA_ROUTE_TABLE && type != ZEBRA_ROUTE_TABLE_DIRECT) ||
instance != 0)
return _bgp_redistribute_unset(bgp, afi, type, instance);
/* walk over instance */
if (!bgp->redist[afi][type])
return;
for (ALL_LIST_ELEMENTS(bgp->redist[afi][type], node, nnode, red))
_bgp_redistribute_unset(bgp, afi, type, red->instance);
}
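/*
 * Wildcard sketch (hypothetical calls): for the table-based route types,
 * instance 0 means "every imported table", so
 *
 *	bgp_redistribute_unset(bgp, AFI_IP, ZEBRA_ROUTE_TABLE_DIRECT, 0);
 *
 * walks bgp->redist[afi][type] and unsets each instance in turn, while
 *
 *	bgp_redistribute_unset(bgp, AFI_IP, ZEBRA_ROUTE_TABLE_DIRECT, 100);
 *
 * removes only the import of kernel table 100.
 */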
void bgp_redistribute_redo(struct bgp *bgp)
{
afi_t afi;
int i;
struct list *red_list;
struct listnode *node;
struct bgp_redist *red;
for (afi = AFI_IP; afi < AFI_MAX; afi++) {
for (i = 0; i < ZEBRA_ROUTE_MAX; i++) {
red_list = bgp->redist[afi][i];
if (!red_list)
continue;
for (ALL_LIST_ELEMENTS_RO(red_list, node, red)) {
bgp_redistribute_resend(bgp, afi, i,
red->instance);
}
}
}
}
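/*
 * Call-site sketch (an assumption; the actual call sites live outside this
 * section): bgp_redistribute_redo() replays configured redistribution once
 * an instance's zebra binding (re)appears, e.g.:
 *
 *	bgp->vrf_id = vrf_id;		   /* VRF just became active */
 *	bgp_zebra_instance_register(bgp);
 *	bgp_redistribute_redo(bgp);	   /* re-send REDISTRIBUTE_ADDs */
 */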
void bgp_zclient_reset(void)
{
zclient_reset(zclient);
}
/* Register this instance with Zebra. Invoked upon connect (for
* default instance) and when other VRFs are learnt (or created and
* already learnt).
*/
void bgp_zebra_instance_register(struct bgp *bgp)
{
/* Don't try to register if we're not connected to Zebra */
if (!zclient || zclient->sock < 0)
return;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Registering %s", bgp->name_pretty);
/* Register for router-id, interfaces, redistributed routes. */
zclient_send_reg_requests(zclient, bgp->vrf_id);
/* For EVPN instance, register to learn about VNIs, if appropriate. */
if (bgp->advertise_all_vni)
bgp_zebra_advertise_all_vni(bgp, 1);
bgp_nht_register_nexthops(bgp);
/*
* Request SRv6 locator information from Zebra, if SRv6 is enabled
* and a locator is configured for this BGP instance.
*/
if (bgp->srv6_enabled && bgp->srv6_locator_name[0] != '\0' && !bgp->srv6_locator)
bgp_zebra_srv6_manager_get_locator(bgp->srv6_locator_name);
}
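/*
 * Lifecycle sketch (hypothetical handler): registration is keyed on the
 * VRF id being known and is mirrored by bgp_zebra_instance_deregister()
 * below:
 *
 *	if (bgp->vrf_id != VRF_UNKNOWN)
 *		bgp_zebra_instance_register(bgp);
 *	...
 *	if (IS_BGP_INST_KNOWN_TO_ZEBRA(bgp))
 *		bgp_zebra_instance_deregister(bgp);
 */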
/* Deregister this instance with Zebra. Invoked upon the instance
* being deleted (default or VRF) and it is already registered.
*/
void bgp_zebra_instance_deregister(struct bgp *bgp)
{
/* Don't try to deregister if we're not connected to Zebra */
if (zclient->sock < 0)
return;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Deregistering %s", bgp->name_pretty);
/* For EVPN instance, unregister learning about VNIs, if appropriate. */
if (bgp->advertise_all_vni)
bgp_zebra_advertise_all_vni(bgp, 0);
/* Deregister for router-id, interfaces, redistributed routes. */
zclient_send_dereg_requests(zclient, bgp->vrf_id);
}
void bgp_zebra_initiate_radv(struct bgp *bgp, struct peer *peer)
{
uint32_t ra_interval = BGP_UNNUM_DEFAULT_RA_INTERVAL;
if (CHECK_FLAG(bgp->flags, BGP_FLAG_IPV6_NO_AUTO_RA))
return;
/* Don't try to initiate if we're not connected to Zebra */
if (zclient->sock < 0)
return;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%u: Initiating RA for peer %s", bgp->vrf_id,
peer->host);
/*
 * For an unnumbered peer, peer->ifp is set: ask zebra directly to start
 * RAs on that interface. For a numbered peer with the extended next-hop
 * (enhe) capability, find the relevant interfaces and turn RAs on there.
 */
if (peer->ifp)
	zclient_send_interface_radv_req(zclient, bgp->vrf_id, peer->ifp, 1,
					ra_interval);
else
	bgp_nht_reg_enhe_cap_intfs(peer);
}
void bgp_zebra_terminate_radv(struct bgp *bgp, struct peer *peer)
{
/* Don't try to terminate if we're not connected to Zebra */
if (zclient->sock < 0)
return;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%u: Terminating RA for peer %s", bgp->vrf_id,
peer->host);
/*
 * For an unnumbered peer, peer->ifp is set: ask zebra directly to stop
 * RAs on that interface. For a numbered peer with the extended next-hop
 * (enhe) capability, find the relevant interfaces and turn RAs off there.
 */
if (peer->ifp)
	zclient_send_interface_radv_req(zclient, bgp->vrf_id, peer->ifp, 0,
					0);
else
	bgp_nht_dereg_enhe_cap_intfs(peer);
}
int bgp_zebra_advertise_subnet(struct bgp *bgp, int advertise, vni_t vni)
{
struct stream *s = NULL;
/* Check socket. */
if (!zclient || zclient->sock < 0)
return 0;
/* Don't try to register if Zebra doesn't know of this instance. */
if (!IS_BGP_INST_KNOWN_TO_ZEBRA(bgp)) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug(
"%s: No zebra instance to talk to, cannot advertise subnet",
__func__);
return 0;
}
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s, ZEBRA_ADVERTISE_SUBNET, bgp->vrf_id);
stream_putc(s, advertise);
stream_put3(s, vni);
stream_putw_at(s, 0, stream_get_endp(s));
return zclient_send_message(zclient);
}
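/*
 * The small senders here all share one ZAPI framing pattern (sketch with a
 * placeholder ZEBRA_FOO command): the header is written first, the payload
 * is appended in exactly the order zebra decodes it, and the length word
 * at offset 0 is back-patched last:
 *
 *	s = zclient->obuf;
 *	stream_reset(s);
 *	zclient_create_header(s, ZEBRA_FOO, bgp->vrf_id);
 *	...stream_putc()/stream_putl() payload fields...
 *	stream_putw_at(s, 0, stream_get_endp(s));
 *	return zclient_send_message(zclient);
 */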
int bgp_zebra_advertise_svi_macip(struct bgp *bgp, int advertise, vni_t vni)
{
struct stream *s = NULL;
/* Check socket. */
if (!zclient || zclient->sock < 0)
return 0;
/* Don't try to register if Zebra doesn't know of this instance. */
if (!IS_BGP_INST_KNOWN_TO_ZEBRA(bgp))
return 0;
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s, ZEBRA_ADVERTISE_SVI_MACIP, bgp->vrf_id);
stream_putc(s, advertise);
stream_putl(s, vni);
stream_putw_at(s, 0, stream_get_endp(s));
return zclient_send_message(zclient);
}
int bgp_zebra_advertise_gw_macip(struct bgp *bgp, int advertise, vni_t vni)
{
struct stream *s = NULL;
/* Check socket. */
if (!zclient || zclient->sock < 0)
return 0;
/* Don't try to register if Zebra doesn't know of this instance. */
if (!IS_BGP_INST_KNOWN_TO_ZEBRA(bgp)) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug(
"%s: No zebra instance to talk to, not installing gw_macip",
__func__);
return 0;
}
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s, ZEBRA_ADVERTISE_DEFAULT_GW, bgp->vrf_id);
stream_putc(s, advertise);
stream_putl(s, vni);
stream_putw_at(s, 0, stream_get_endp(s));
return zclient_send_message(zclient);
}
int bgp_zebra_vxlan_flood_control(struct bgp *bgp,
enum vxlan_flood_control flood_ctrl)
{
struct stream *s;
/* Check socket. */
if (!zclient || zclient->sock < 0)
return 0;
/* Don't try to register if Zebra doesn't know of this instance. */
if (!IS_BGP_INST_KNOWN_TO_ZEBRA(bgp)) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug(
"%s: No zebra instance to talk to, not installing all vni",
__func__);
return 0;
}
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s, ZEBRA_VXLAN_FLOOD_CONTROL, bgp->vrf_id);
stream_putc(s, flood_ctrl);
stream_putw_at(s, 0, stream_get_endp(s));
return zclient_send_message(zclient);
}
int bgp_zebra_advertise_all_vni(struct bgp *bgp, int advertise)
{
struct stream *s;
/* Check socket. */
if (!zclient || zclient->sock < 0)
return 0;
/* Don't try to register if Zebra doesn't know of this instance. */
if (!IS_BGP_INST_KNOWN_TO_ZEBRA(bgp))
return 0;
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s, ZEBRA_ADVERTISE_ALL_VNI, bgp->vrf_id);
stream_putc(s, advertise);
/* Also send the current BUM flood handling setting; it is only
 * meaningful when 'advertise' is set.
 */
stream_putc(s, bgp->vxlan_flood_ctrl);
stream_putw_at(s, 0, stream_get_endp(s));
return zclient_send_message(zclient);
}
int bgp_zebra_dup_addr_detection(struct bgp *bgp)
{
struct stream *s;
/* Check socket. */
if (!zclient || zclient->sock < 0)
return 0;
/* Don't try to register if Zebra doesn't know of this instance. */
if (!IS_BGP_INST_KNOWN_TO_ZEBRA(bgp))
return 0;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("dup addr detect %s max_moves %u time %u freeze %s freeze_time %u",
bgp->evpn_info->dup_addr_detect ?
"enable" : "disable",
bgp->evpn_info->dad_max_moves,
bgp->evpn_info->dad_time,
bgp->evpn_info->dad_freeze ?
"enable" : "disable",
bgp->evpn_info->dad_freeze_time);
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s, ZEBRA_DUPLICATE_ADDR_DETECTION,
bgp->vrf_id);
stream_putl(s, bgp->evpn_info->dup_addr_detect);
stream_putl(s, bgp->evpn_info->dad_time);
stream_putl(s, bgp->evpn_info->dad_max_moves);
stream_putl(s, bgp->evpn_info->dad_freeze);
stream_putl(s, bgp->evpn_info->dad_freeze_time);
stream_putw_at(s, 0, stream_get_endp(s));
return zclient_send_message(zclient);
}
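
/*
 * The ZEBRA_DUPLICATE_ADDR_DETECTION body encoded above is five 32-bit
 * values, in this order:
 *
 *   dup_addr_detect : enable/disable flag
 *   dad_time        : detection window, in seconds
 *   dad_max_moves   : MAC moves tolerated within the window
 *   dad_freeze      : freeze-on-detection flag
 *   dad_freeze_time : auto-recovery time in seconds (0 when the freeze
 *                     is permanent)
 */
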
static int rule_notify_owner(ZAPI_CALLBACK_ARGS)
{
uint32_t seqno, priority, unique;
enum zapi_rule_notify_owner note;
struct bgp_pbr_action *bgp_pbra;
struct bgp_pbr_rule *bgp_pbr = NULL;
char ifname[IFNAMSIZ + 1];
if (!zapi_rule_notify_decode(zclient->ibuf, &seqno, &priority, &unique,
ifname, &note))
return -1;
bgp_pbra = bgp_pbr_action_rule_lookup(vrf_id, unique);
if (!bgp_pbra) {
/* look in bgp pbr rule */
bgp_pbr = bgp_pbr_rule_lookup(vrf_id, unique);
if (!bgp_pbr && note != ZAPI_RULE_REMOVED) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Fail to look BGP rule (%u)",
__func__, unique);
return 0;
}
}
switch (note) {
case ZAPI_RULE_FAIL_INSTALL:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received RULE_FAIL_INSTALL", __func__);
if (bgp_pbra) {
bgp_pbra->installed = false;
bgp_pbra->install_in_progress = false;
} else {
bgp_pbr->installed = false;
bgp_pbr->install_in_progress = false;
}
break;
case ZAPI_RULE_INSTALLED:
if (bgp_pbra) {
bgp_pbra->installed = true;
bgp_pbra->install_in_progress = false;
} else {
struct bgp_path_info *path;
struct bgp_path_info_extra *extra;
bgp_pbr->installed = true;
bgp_pbr->install_in_progress = false;
bgp_pbr->action->refcnt++;
/* link bgp_path_info to bgp_pbr */
path = (struct bgp_path_info *)bgp_pbr->path;
extra = bgp_path_info_extra_get(path);
if (!extra->flowspec) {
extra->flowspec =
XCALLOC(MTYPE_BGP_ROUTE_EXTRA_FS,
sizeof(struct bgp_path_info_extra_fs));
extra->flowspec->bgp_fs_iprule = NULL;
extra->flowspec->bgp_fs_pbr = NULL;
}
listnode_add_force(&extra->flowspec->bgp_fs_iprule, bgp_pbr);
}
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received RULE_INSTALLED", __func__);
break;
case ZAPI_RULE_FAIL_REMOVE:
case ZAPI_RULE_REMOVED:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received RULE REMOVED", __func__);
break;
}
return 0;
}
static int ipset_notify_owner(ZAPI_CALLBACK_ARGS)
{
uint32_t unique;
enum zapi_ipset_notify_owner note;
struct bgp_pbr_match *bgp_pbim;
if (!zapi_ipset_notify_decode(zclient->ibuf,
&unique,
&note))
return -1;
bgp_pbim = bgp_pbr_match_ipset_lookup(vrf_id, unique);
if (!bgp_pbim) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Fail to look BGP match ( %u, ID %u)",
__func__, note, unique);
return 0;
}
switch (note) {
case ZAPI_IPSET_FAIL_INSTALL:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPSET_FAIL_INSTALL", __func__);
bgp_pbim->installed = false;
bgp_pbim->install_in_progress = false;
break;
case ZAPI_IPSET_INSTALLED:
bgp_pbim->installed = true;
bgp_pbim->install_in_progress = false;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPSET_INSTALLED", __func__);
break;
case ZAPI_IPSET_FAIL_REMOVE:
case ZAPI_IPSET_REMOVED:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPSET REMOVED", __func__);
break;
}
return 0;
}
static int ipset_entry_notify_owner(ZAPI_CALLBACK_ARGS)
{
uint32_t unique;
char ipset_name[ZEBRA_IPSET_NAME_SIZE];
enum zapi_ipset_entry_notify_owner note;
struct bgp_pbr_match_entry *bgp_pbime;
if (!zapi_ipset_entry_notify_decode(
zclient->ibuf,
&unique,
ipset_name,
&note))
return -1;
bgp_pbime = bgp_pbr_match_ipset_entry_lookup(vrf_id,
ipset_name,
unique);
if (!bgp_pbime) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug(
"%s: Fail to look BGP match entry (%u, ID %u)",
__func__, note, unique);
return 0;
}
switch (note) {
case ZAPI_IPSET_ENTRY_FAIL_INSTALL:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPSET_ENTRY_FAIL_INSTALL",
__func__);
bgp_pbime->installed = false;
bgp_pbime->install_in_progress = false;
break;
case ZAPI_IPSET_ENTRY_INSTALLED:
{
struct bgp_path_info *path;
struct bgp_path_info_extra *extra;
bgp_pbime->installed = true;
bgp_pbime->install_in_progress = false;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPSET_ENTRY_INSTALLED",
__func__);
/* link bgp_path_info to bpme */
path = (struct bgp_path_info *)bgp_pbime->path;
extra = bgp_path_info_extra_get(path);
if (!extra->flowspec) {
extra->flowspec =
XCALLOC(MTYPE_BGP_ROUTE_EXTRA_FS,
sizeof(struct bgp_path_info_extra_fs));
extra->flowspec->bgp_fs_iprule = NULL;
extra->flowspec->bgp_fs_pbr = NULL;
}
listnode_add_force(&extra->flowspec->bgp_fs_pbr, bgp_pbime);
}
break;
case ZAPI_IPSET_ENTRY_FAIL_REMOVE:
case ZAPI_IPSET_ENTRY_REMOVED:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPSET_ENTRY_REMOVED",
__func__);
break;
}
return 0;
}
static int iptable_notify_owner(ZAPI_CALLBACK_ARGS)
{
uint32_t unique;
enum zapi_iptable_notify_owner note;
struct bgp_pbr_match *bgpm;
if (!zapi_iptable_notify_decode(
zclient->ibuf,
&unique,
&note))
return -1;
bgpm = bgp_pbr_match_iptable_lookup(vrf_id, unique);
if (!bgpm) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Fail to look BGP iptable (%u %u)",
__func__, note, unique);
return 0;
}
switch (note) {
case ZAPI_IPTABLE_FAIL_INSTALL:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPTABLE_FAIL_INSTALL",
__func__);
bgpm->installed_in_iptable = false;
bgpm->install_iptable_in_progress = false;
break;
case ZAPI_IPTABLE_INSTALLED:
bgpm->installed_in_iptable = true;
bgpm->install_iptable_in_progress = false;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPTABLE_INSTALLED", __func__);
bgpm->action->refcnt++;
break;
case ZAPI_IPTABLE_FAIL_REMOVE:
case ZAPI_IPTABLE_REMOVED:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Received IPTABLE REMOVED", __func__);
break;
}
return 0;
}
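
/*
 * Note: as with ZAPI_RULE_INSTALLED in rule_notify_owner() above, the
 * reference on the bgp_pbr_action (refcnt++) is taken only once zebra
 * confirms installation; the matching release belongs to the PBR
 * teardown paths, not to this notification handler.
 */
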
/* Process route notification messages from RIB */
static int bgp_zebra_route_notify_owner(int command, struct zclient *zclient,
zebra_size_t length, vrf_id_t vrf_id)
{
struct prefix p;
enum zapi_route_notify_owner note;
uint32_t table_id;
afi_t afi;
safi_t safi;
struct bgp_dest *dest;
struct bgp *bgp;
struct bgp_path_info *pi, *new_select;
if (!zapi_route_notify_decode(zclient->ibuf, &p, &table_id, &note,
&afi, &safi)) {
zlog_err("%s : error in msg decode", __func__);
return -1;
}
/* Get the bgp instance */
bgp = bgp_lookup_by_vrf_id(vrf_id);
if (!bgp) {
flog_err(EC_BGP_INVALID_BGP_INSTANCE,
	 "%s: bgp instance not found, vrf %u", __func__, vrf_id);
return -1;
}
/* Find the bgp route node */
dest = bgp_safi_node_lookup(bgp->rib[afi][safi], safi, &p,
&bgp->vrf_prd);
if (!dest) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %pFX does not exist in the BGP table, nothing to do for %u",
__func__, &p, note);
return -1;
}
switch (note) {
case ZAPI_ROUTE_INSTALLED:
new_select = NULL;
/* Clear the flags so that route can be processed */
UNSET_FLAG(dest->flags, BGP_NODE_FIB_INSTALL_PENDING);
SET_FLAG(dest->flags, BGP_NODE_FIB_INSTALLED);
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("route %pBD : INSTALLED", dest);
/* Find the best route */
for (pi = dest->info; pi; pi = pi->next) {
/* Process aggregate route */
bgp_aggregate_increment(bgp, &p, pi, afi, safi);
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED))
new_select = pi;
}
/* Advertise the route */
if (new_select)
group_announce_route(bgp, afi, safi, dest, new_select);
else {
flog_err(EC_BGP_INVALID_ROUTE,
"selected route %pBD not found", dest);
bgp_dest_unlock_node(dest);
return -1;
}
break;
case ZAPI_ROUTE_REMOVED:
/* Route deleted from dataplane, reset the installed flag
* so that route can be reinstalled when client sends
* route add later
*/
UNSET_FLAG(dest->flags, BGP_NODE_FIB_INSTALLED);
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("route %pBD: Removed from Fib", dest);
break;
case ZAPI_ROUTE_FAIL_INSTALL:
new_select = NULL;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("route: %pBD Failed to Install into Fib",
dest);
UNSET_FLAG(dest->flags, BGP_NODE_FIB_INSTALL_PENDING);
UNSET_FLAG(dest->flags, BGP_NODE_FIB_INSTALLED);
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED))
new_select = pi;
}
if (new_select)
group_announce_route(bgp, afi, safi, dest, new_select);
/* Error will be logged by zebra module */
break;
case ZAPI_ROUTE_BETTER_ADMIN_WON:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("route: %pBD removed due to better admin won",
dest);
new_select = NULL;
UNSET_FLAG(dest->flags, BGP_NODE_FIB_INSTALL_PENDING);
UNSET_FLAG(dest->flags, BGP_NODE_FIB_INSTALLED);
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
bgp_aggregate_decrement(bgp, &p, pi, afi, safi);
if (CHECK_FLAG(pi->flags, BGP_PATH_SELECTED))
new_select = pi;
}
if (new_select)
group_announce_route(bgp, afi, safi, dest, new_select);
/* No action required */
break;
case ZAPI_ROUTE_REMOVE_FAIL:
zlog_warn("%s: Route %pBD failure to remove", __func__, dest);
break;
}
bgp_dest_unlock_node(dest);
return 0;
}
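
/*
 * Summary of the dest->flags handling above (a reading aid):
 *
 *   INSTALLED        : clear FIB_INSTALL_PENDING, set FIB_INSTALLED,
 *                      update aggregates, announce selected path
 *   REMOVED          : clear FIB_INSTALLED only
 *   FAIL_INSTALL     : clear both flags, announce selected path
 *   BETTER_ADMIN_WON : clear both flags, decrement aggregates,
 *                      announce selected path
 *   REMOVE_FAIL      : warn only
 */
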
/* This function is used to forge an ip rule,
 * - either for iptable/ipset, using a fwmark id
 * - or for a plain ip rule command
 */
static void bgp_encode_pbr_rule_action(struct stream *s,
struct bgp_pbr_action *pbra,
struct bgp_pbr_rule *pbr)
{
uint8_t fam = AF_INET;
struct pbr_rule r;
if (pbra->nh.type == NEXTHOP_TYPE_IPV6)
fam = AF_INET6;
/*
* Convert to canonical form
*/
memset(&r, 0, sizeof(r));
/* r.seq unused */
if (pbr)
r.priority = pbr->priority;
/* ruleno unused - priority change
* ruleno permits distinguishing various FS PBR entries
* - FS PBR entries based on ipset/iptables
* - FS PBR entries based on iprule
* the latter may contain default routing information injected by FS
*/
if (pbr)
r.unique = pbr->unique;
else
r.unique = pbra->unique;
r.family = fam;
/* filter */
if (pbr && pbr->flags & MATCH_IP_SRC_SET) {
SET_FLAG(r.filter.filter_bm, PBR_FILTER_SRC_IP);
r.filter.src_ip = pbr->src;
} else {
/* ??? */
r.filter.src_ip.family = fam;
}
if (pbr && pbr->flags & MATCH_IP_DST_SET) {
SET_FLAG(r.filter.filter_bm, PBR_FILTER_DST_IP);
r.filter.dst_ip = pbr->dst;
} else {
/* ??? */
r.filter.dst_ip.family = fam;
}
/* src_port, dst_port, pcp, dsfield not used */
if (!pbr) {
SET_FLAG(r.filter.filter_bm, PBR_FILTER_FWMARK);
r.filter.fwmark = pbra->fwmark;
}
SET_FLAG(r.action.flags, PBR_ACTION_TABLE); /* always valid */
r.action.table = pbra->table_id;
zapi_pbr_rule_encode(s, &r);
}
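
/*
 * Roughly, in "ip rule" terms, the encoder above produces one of two
 * shapes (illustrative only; fields simplified):
 *
 *   pbr == NULL (iptable/ipset path):
 *     ip rule add fwmark <pbra->fwmark> table <pbra->table_id>
 *
 *   pbr != NULL (flowspec ip-rule path):
 *     ip rule add priority <pbr->priority> [from <src>] [to <dst>]
 *                table <pbra->table_id>
 */
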
static void bgp_encode_pbr_ipset_match(struct stream *s,
struct bgp_pbr_match *pbim)
{
stream_putl(s, pbim->unique);
stream_putl(s, pbim->type);
stream_putc(s, pbim->family);
stream_put(s, pbim->ipset_name,
ZEBRA_IPSET_NAME_SIZE);
}
static void bgp_encode_pbr_ipset_entry_match(struct stream *s,
struct bgp_pbr_match_entry *pbime)
{
stream_putl(s, pbime->unique);
/* check that back pointer is not null */
stream_put(s, pbime->backpointer->ipset_name,
ZEBRA_IPSET_NAME_SIZE);
stream_putc(s, pbime->src.family);
stream_putc(s, pbime->src.prefixlen);
stream_put(s, &pbime->src.u.prefix, prefix_blen(&pbime->src));
stream_putc(s, pbime->dst.family);
stream_putc(s, pbime->dst.prefixlen);
stream_put(s, &pbime->dst.u.prefix, prefix_blen(&pbime->dst));
stream_putw(s, pbime->src_port_min);
stream_putw(s, pbime->src_port_max);
stream_putw(s, pbime->dst_port_min);
stream_putw(s, pbime->dst_port_max);
stream_putc(s, pbime->proto);
}
static void bgp_encode_pbr_iptable_match(struct stream *s,
struct bgp_pbr_action *bpa,
struct bgp_pbr_match *pbm)
{
stream_putl(s, pbm->unique2);
stream_putl(s, pbm->type);
stream_putl(s, pbm->flags);
/* TODO: correlate with what is contained
 * in bgp_pbr_action.
 * Currently only forward is supported.
 */
if (bpa->nh.type == NEXTHOP_TYPE_BLACKHOLE)
stream_putl(s, ZEBRA_IPTABLES_DROP);
else
stream_putl(s, ZEBRA_IPTABLES_FORWARD);
stream_putl(s, bpa->fwmark);
stream_put(s, pbm->ipset_name,
ZEBRA_IPSET_NAME_SIZE);
stream_putc(s, pbm->family);
stream_putw(s, pbm->pkt_len_min);
stream_putw(s, pbm->pkt_len_max);
stream_putw(s, pbm->tcp_flags);
stream_putw(s, pbm->tcp_mask_flags);
stream_putc(s, pbm->dscp_value);
stream_putc(s, pbm->fragment);
stream_putc(s, pbm->protocol);
stream_putw(s, pbm->flow_label);
}
/* BGP has established connection with Zebra. */
static void bgp_zebra_connected(struct zclient *zclient)
{
struct bgp *bgp;
zclient_num_connects++; /* increment even if not responding */
/* Send the client registration */
bfd_client_sendmsg(zclient, ZEBRA_BFD_CLIENT_REGISTER, VRF_DEFAULT);
/* At this point, we may or may not have BGP instances configured, but
* we're only interested in the default VRF (others wouldn't have learnt
* the VRF from Zebra yet.)
*/
bgp = bgp_get_default();
if (!bgp)
return;
bgp_zebra_instance_register(bgp);
/* TODO - What if we have peers and networks configured, do we have to
* kick-start them?
*/
BGP_GR_ROUTER_DETECT_AND_SEND_CAPABILITY_TO_ZEBRA(bgp, bgp->peer);
}
void bgp_zebra_process_remote_routes_for_l2vni(struct event *e)
{
/*
* If we have learnt and retained remote routes (VTEPs, MACs)
* for this VNI, install them.
*/
install_uninstall_routes_for_vni(NULL, NULL, true);
/*
 * If there are VNIs still pending to be processed, reschedule after
 * a small delay so that the CPU can be used for other purposes.
 */
if (zebra_l2_vni_count(&bm->zebra_l2_vni_head))
event_add_timer_msec(bm->master, bgp_zebra_process_remote_routes_for_l2vni, NULL,
20, &bm->t_bgp_zebra_l2_vni);
}
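
/*
 * The 20ms self re-arm above is deliberate: a burst of VNI (or, below,
 * L3VNI) events from zebra is drained across many short event-loop
 * passes instead of one long one, keeping bgpd responsive while the
 * FIFO is worked off.
 */
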
void bgp_zebra_process_remote_routes_for_l3vrf(struct event *e)
{
/*
 * Install/Uninstall all remote routes belonging to the L3VNI.
 *
 * NOTE:
 * - At this point it does not matter whether we call
 *   install_routes_for_vrf/uninstall_routes_for_vrf.
 * - Since we pass the struct bgp as NULL,
 *   * we iterate the bm FIFO list, and
 *   * the second argument (true) is likewise ignored and is
 *     derived from the BGP-VRF's flags for ADD/DELETE.
 */
install_uninstall_routes_for_vrf(NULL, true);
/*
 * If there are L3VNIs still pending to be processed, reschedule after
 * a small delay so that the CPU can be used for other purposes.
 */
if (zebra_l3_vni_count(&bm->zebra_l3_vni_head)) {
event_add_timer_msec(bm->master, bgp_zebra_process_remote_routes_for_l3vrf, NULL,
20, &bm->t_bgp_zebra_l3_vni);
}
}
static int bgp_zebra_process_local_es_add(ZAPI_CALLBACK_ARGS)
{
esi_t esi;
struct bgp *bgp = NULL;
struct stream *s = NULL;
char buf[ESI_STR_LEN];
struct in_addr originator_ip;
uint8_t active;
uint8_t bypass;
uint16_t df_pref;
bgp = bgp_lookup_by_vrf_id(vrf_id);
if (!bgp)
return 0;
s = zclient->ibuf;
stream_get(&esi, s, sizeof(esi_t));
originator_ip.s_addr = stream_get_ipv4(s);
active = stream_getc(s);
df_pref = stream_getw(s);
bypass = stream_getc(s);
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug(
"Rx add ESI %s originator-ip %pI4 active %u df_pref %u %s",
esi_to_str(&esi, buf, sizeof(buf)), &originator_ip,
active, df_pref, bypass ? "bypass" : "");
frrtrace(5, frr_bgp, evpn_mh_local_es_add_zrecv, &esi, originator_ip,
active, bypass, df_pref);
bgp_evpn_local_es_add(bgp, &esi, originator_ip, active, df_pref,
!!bypass);
return 0;
}
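
/*
 * For reference, the local-ES add message decoded above carries, after
 * the zclient header:
 *
 *   esi           : sizeof(esi_t) raw bytes
 *   originator-ip : 4 bytes (IPv4)
 *   active        : 1 byte
 *   df_pref       : 2 bytes
 *   bypass        : 1 byte
 */
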
static int bgp_zebra_process_local_es_del(ZAPI_CALLBACK_ARGS)
{
esi_t esi;
struct bgp *bgp = NULL;
struct stream *s = NULL;
char buf[ESI_STR_LEN];
memset(&esi, 0, sizeof(esi_t));
bgp = bgp_lookup_by_vrf_id(vrf_id);
if (!bgp)
return 0;
s = zclient->ibuf;
stream_get(&esi, s, sizeof(esi_t));
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Rx del ESI %s",
esi_to_str(&esi, buf, sizeof(buf)));
frrtrace(1, frr_bgp, evpn_mh_local_es_del_zrecv, &esi);
	bgp_evpn_local_es_del(bgp, &esi);
	return 0;
}
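
/*
 * Local ES-EVI (Ethernet Segment membership in an EVPN instance) add/del
 * from zebra. The message carries the ESI followed by the VNI, mirroring
 * the decode below; the event drives origination/withdrawal of the
 * corresponding EAD (Type-1) routes by the EVPN multihoming code.
 */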
static int bgp_zebra_process_local_es_evi(ZAPI_CALLBACK_ARGS)
{
	esi_t esi;
	vni_t vni;
	struct bgp *bgp;
	struct stream *s;
	char buf[ESI_STR_LEN];

	bgp = bgp_lookup_by_vrf_id(vrf_id);
	if (!bgp)
		return 0;

	s = zclient->ibuf;
	stream_get(&esi, s, sizeof(esi_t));
	vni = stream_getl(s);

	if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Rx %s ESI %s VNI %u",
(cmd == ZEBRA_VNI_ADD) ? "add" : "del",
esi_to_str(&esi, buf, sizeof(buf)), vni);

	if (cmd == ZEBRA_LOCAL_ES_EVI_ADD) {
		frrtrace(2, frr_bgp, evpn_mh_local_es_evi_add_zrecv, &esi, vni);

		bgp_evpn_local_es_evi_add(bgp, &esi, vni);
	} else {
		frrtrace(2, frr_bgp, evpn_mh_local_es_evi_del_zrecv, &esi, vni);

		bgp_evpn_local_es_evi_del(bgp, &esi, vni);
	}

	return 0;
}
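
/*
 * Local L3VNI add/del from zebra. An add carries, in decode order: the SVI
 * router MAC, the originator (VTEP) IP, the prefix-routes-only filter, the
 * SVI ifindex, the VRR MAC and the anycast-MAC flag. A delete carries only
 * the VNI; the VRF comes from the ZAPI header.
 */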
static int bgp_zebra_process_local_l3vni(ZAPI_CALLBACK_ARGS)
{
	int filter = 0;
	vni_t l3vni = 0;
	struct ethaddr svi_rmac, vrr_rmac = {.octet = {0} };
	struct in_addr originator_ip;
	struct stream *s;
	ifindex_t svi_ifindex;
	bool is_anycast_mac = false;

	memset(&svi_rmac, 0, sizeof(svi_rmac));
	memset(&originator_ip, 0, sizeof(originator_ip));
	s = zclient->ibuf;
	l3vni = stream_getl(s);
	if (cmd == ZEBRA_L3VNI_ADD) {
		stream_get(&svi_rmac, s, sizeof(struct ethaddr));
		originator_ip.s_addr = stream_get_ipv4(s);
		stream_get(&filter, s, sizeof(int));
		svi_ifindex = stream_getl(s);
		stream_get(&vrr_rmac, s, sizeof(struct ethaddr));
		is_anycast_mac = stream_getl(s);

		if (BGP_DEBUG(zebra, ZEBRA))
			zlog_debug("Rx L3VNI ADD VRF %s VNI %u Originator-IP %pI4 RMAC svi-mac %pEA vrr-mac %pEA filter %s svi-if %u",
				   vrf_id_to_name(vrf_id), l3vni,
				   &originator_ip, &svi_rmac, &vrr_rmac,
				   filter ? "prefix-routes-only" : "none",
				   svi_ifindex);

		frrtrace(8, frr_bgp, evpn_local_l3vni_add_zrecv, l3vni, vrf_id,
			 &svi_rmac, &vrr_rmac, filter, originator_ip,
			 svi_ifindex, is_anycast_mac);

		bgp_evpn_local_l3vni_add(l3vni, vrf_id, &svi_rmac, &vrr_rmac,
					 originator_ip, filter, svi_ifindex,
					 is_anycast_mac);
	} else {
		if (BGP_DEBUG(zebra, ZEBRA))
			zlog_debug("Rx L3VNI DEL VRF %s VNI %u",
				   vrf_id_to_name(vrf_id), l3vni);

		frrtrace(2, frr_bgp, evpn_local_l3vni_del_zrecv, l3vni, vrf_id);
		bgp_evpn_local_l3vni_del(l3vni, vrf_id);
	}

	return 0;
}
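
/*
 * Local L2 VNI add/del from zebra. An add carries the VTEP IP, tenant VRF,
 * multicast group and SVI ifindex; if the VTEP IP is unset (INADDR_ANY),
 * the BGP router-id is used as the VTEP IP instead.
 */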
static int bgp_zebra_process_local_vni(ZAPI_CALLBACK_ARGS)
{
	struct stream *s;
	vni_t vni;
	struct bgp *bgp;
	struct in_addr vtep_ip = {INADDR_ANY};
	vrf_id_t tenant_vrf_id = VRF_DEFAULT;
	struct in_addr mcast_grp = {INADDR_ANY};
	ifindex_t svi_ifindex = 0;

	s = zclient->ibuf;
	vni = stream_getl(s);
	if (cmd == ZEBRA_VNI_ADD) {
		vtep_ip.s_addr = stream_get_ipv4(s);
		stream_get(&tenant_vrf_id, s, sizeof(vrf_id_t));
		mcast_grp.s_addr = stream_get_ipv4(s);
		stream_get(&svi_ifindex, s, sizeof(ifindex_t));
	}

	bgp = bgp_lookup_by_vrf_id(vrf_id);
	if (!bgp)
		return 0;

	if (BGP_DEBUG(zebra, ZEBRA))
		zlog_debug(
			"Rx VNI %s VRF %s VNI %u tenant-vrf %s SVI ifindex %u",
			(cmd == ZEBRA_VNI_ADD) ? "add" : "del",
			vrf_id_to_name(vrf_id), vni,
			vrf_id_to_name(tenant_vrf_id), svi_ifindex);

	if (cmd == ZEBRA_VNI_ADD) {
		frrtrace(4, frr_bgp, evpn_local_vni_add_zrecv, vni, vtep_ip,
			 tenant_vrf_id, mcast_grp);

		return bgp_evpn_local_vni_add(
			bgp, vni,
			vtep_ip.s_addr != INADDR_ANY ? vtep_ip : bgp->router_id,
			tenant_vrf_id, mcast_grp, svi_ifindex);
	} else {
		frrtrace(1, frr_bgp, evpn_local_vni_del_zrecv, vni);

		return bgp_evpn_local_vni_del(bgp, vni);
	}
}
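
/*
 * Local MAC-IP add/del from zebra. The IP length on the wire may be 0
 * (MAC-only), 4 (IPv4) or 16 (IPv6); an add additionally carries the
 * flags, MM sequence number and ESI, a delete the neighbor state.
 */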
static int bgp_zebra_process_local_macip(ZAPI_CALLBACK_ARGS)
{
	struct stream *s;
	vni_t vni;
	struct bgp *bgp;
	struct ethaddr mac;
	struct ipaddr ip;
	int ipa_len;
	uint8_t flags = 0;
	uint32_t seqnum = 0;
	int state = 0;
	char buf2[ESI_STR_LEN];
	esi_t esi;

	memset(&ip, 0, sizeof(ip));
	s = zclient->ibuf;
	vni = stream_getl(s);
	stream_get(&mac.octet, s, ETH_ALEN);
	ipa_len = stream_getl(s);
	if (ipa_len != 0 && ipa_len != IPV4_MAX_BYTELEN
	    && ipa_len != IPV6_MAX_BYTELEN) {
		flog_err(EC_BGP_MACIP_LEN,
			 "%u:Recv MACIP %s with invalid IP addr length %d",
			 vrf_id, (cmd == ZEBRA_MACIP_ADD) ? "Add" : "Del",
			 ipa_len);
		return -1;
	}

	if (ipa_len) {
		ip.ipa_type =
			(ipa_len == IPV4_MAX_BYTELEN) ? IPADDR_V4 : IPADDR_V6;
		stream_get(&ip.ip.addr, s, ipa_len);
	}

	if (cmd == ZEBRA_MACIP_ADD) {
		flags = stream_getc(s);
		seqnum = stream_getl(s);
		stream_get(&esi, s, sizeof(esi_t));
	} else {
		state = stream_getl(s);
		memset(&esi, 0, sizeof(esi_t));
	}

	bgp = bgp_lookup_by_vrf_id(vrf_id);
	if (!bgp)
		return 0;

	if (BGP_DEBUG(zebra, ZEBRA))
		zlog_debug(
			"%u:Recv MACIP %s f 0x%x MAC %pEA IP %pIA VNI %u seq %u state %d ESI %s",
			vrf_id, (cmd == ZEBRA_MACIP_ADD) ? "Add" : "Del", flags,
			&mac, &ip, vni, seqnum, state,
			esi_to_str(&esi, buf2, sizeof(buf2)));

	if (cmd == ZEBRA_MACIP_ADD) {
		frrtrace(6, frr_bgp, evpn_local_macip_add_zrecv, vni, &mac, &ip,
			 flags, seqnum, &esi);

		return bgp_evpn_local_macip_add(bgp, vni, &mac, &ip,
						flags, seqnum, &esi);
	} else {
		frrtrace(4, frr_bgp, evpn_local_macip_del_zrecv, vni, &mac, &ip,
			 state);

		return bgp_evpn_local_macip_del(bgp, vni, &mac, &ip, state);
	}
}
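
/*
 * Local IP prefix add/del from zebra: advertise or withdraw the prefix as
 * an EVPN type-5 (IP prefix) route in the tenant VRF's BGP instance.
 */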
static int bgp_zebra_process_local_ip_prefix(ZAPI_CALLBACK_ARGS)
{
	struct stream *s = NULL;
	struct bgp *bgp_vrf = NULL;
	struct prefix p;

	memset(&p, 0, sizeof(p));
	s = zclient->ibuf;
	stream_get(&p, s, sizeof(struct prefix));

	bgp_vrf = bgp_lookup_by_vrf_id(vrf_id);
	if (!bgp_vrf)
		return 0;

	if (BGP_DEBUG(zebra, ZEBRA))
		zlog_debug("Recv prefix %pFX %s on vrf %s", &p,
			   (cmd == ZEBRA_IP_PREFIX_ROUTE_ADD) ? "ADD" : "DEL",
			   vrf_id_to_name(vrf_id));

	if (cmd == ZEBRA_IP_PREFIX_ROUTE_ADD) {
		if (p.family == AF_INET)
			bgp_evpn_advertise_type5_route(bgp_vrf, &p, NULL,
						       AFI_IP, SAFI_UNICAST);
		else
			bgp_evpn_advertise_type5_route(bgp_vrf, &p, NULL,
						       AFI_IP6, SAFI_UNICAST);
	} else {
		if (p.family == AF_INET)
			bgp_evpn_withdraw_type5_route(bgp_vrf, &p, AFI_IP,
						      SAFI_UNICAST);
		else
			bgp_evpn_withdraw_type5_route(bgp_vrf, &p, AFI_IP6,
						      SAFI_UNICAST);
	}

	return 0;
}

extern struct zebra_privs_t bgpd_privs;
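
/*
 * Interface-add callback from the zebra interface hooks. The interface MAC
 * is recorded in the local-MAC hash even when the VRF has no BGP instance;
 * when a loopback comes up and a default instance exists, the VRF's VPN
 * label/SID bindings used for route leaking are refreshed as well.
 */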
static int bgp_ifp_create(struct interface *ifp)
{
	struct bgp *bgp;

	if (BGP_DEBUG(zebra, ZEBRA))
		zlog_debug("Rx Intf add VRF %s IF %s", ifp->vrf->name,
			   ifp->name);

	/* We don't need to check for vrf->bgp link to add this local MAC
	 * to the hash table as the tenant VRF might not have the BGP instance.
	 */
	bgp_mac_add_mac_entry(ifp);

	bgp = ifp->vrf->info;
	if (!bgp)
		return 0;

	bgp_update_interface_nbrs(bgp, ifp, ifp);
	hook_call(bgp_vrf_status_changed, bgp, ifp);

	if (bgp_get_default() && if_is_loopback(ifp)) {
		vpn_leak_zebra_vrf_label_update(bgp, AFI_IP);
		vpn_leak_zebra_vrf_label_update(bgp, AFI_IP6);
		vpn_leak_zebra_vrf_sid_update(bgp, AFI_IP);
		vpn_leak_zebra_vrf_sid_update(bgp, AFI_IP6);
		vpn_leak_postchange_all();
	}

	return 0;
}
static int bgp_zebra_process_srv6_locator_chunk(ZAPI_CALLBACK_ARGS)
{
struct stream *s = NULL;
struct bgp *bgp = bgp_get_default();
struct listnode *node;
struct srv6_locator_chunk *c;
struct srv6_locator_chunk *chunk = srv6_locator_chunk_alloc();
s = zclient->ibuf;
zapi_srv6_locator_chunk_decode(s, chunk);
if (strcmp(bgp->srv6_locator_name, chunk->locator_name) != 0) {
zlog_err("%s: Locator name unmatch %s:%s", __func__,
bgp->srv6_locator_name, chunk->locator_name);
srv6_locator_chunk_free(&chunk);
return 0;
}
for (ALL_LIST_ELEMENTS_RO(bgp->srv6_locator_chunks, node, c)) {
if (!prefix_cmp(&c->prefix, &chunk->prefix)) {
srv6_locator_chunk_free(&chunk);
return 0;
}
}
listnode_add(bgp->srv6_locator_chunks, chunk);
vpn_leak_postchange_all();
return 0;
}
/**
* Internal function to process an SRv6 locator
*
* @param locator The locator to be processed
* @return 0 on success, -1 otherwise
*/
static int bgp_zebra_process_srv6_locator_internal(struct srv6_locator *locator)
{
struct bgp *bgp = bgp_get_default();
if (!bgp || !bgp->srv6_enabled || !locator)
return -1;
/*
* Check if the main BGP instance is configured to use the received
* locator
*/
if (strcmp(bgp->srv6_locator_name, locator->name) != 0) {
zlog_err("%s: SRv6 Locator name unmatch %s:%s", __func__,
bgp->srv6_locator_name, locator->name);
return 0;
}
zlog_info("%s: Received SRv6 locator %s %pFX, loc-block-len=%u, loc-node-len=%u func-len=%u, arg-len=%u",
__func__, locator->name, &locator->prefix,
locator->block_bits_length, locator->node_bits_length,
locator->function_bits_length, locator->argument_bits_length);
/* Store the locator in the main BGP instance */
bgp->srv6_locator = srv6_locator_alloc(locator->name);
srv6_locator_copy(bgp->srv6_locator, locator);
/*
* Process VPN-to-VRF and VRF-to-VPN leaks to advertise new locator
* and SIDs.
*/
vpn_leak_postchange_all();
return 0;
}
static int bgp_zebra_srv6_sid_notify(ZAPI_CALLBACK_ARGS)
{
struct bgp *bgp = bgp_get_default();
struct srv6_locator *locator;
struct srv6_sid_ctx ctx;
struct in6_addr sid_addr;
enum zapi_srv6_sid_notify note;
struct bgp *bgp_vrf;
struct vrf *vrf;
struct listnode *node, *nnode;
char buf[256];
struct in6_addr *tovpn_sid;
struct prefix_ipv6 tmp_prefix;
uint32_t sid_func;
bool found = false;
if (!bgp || !bgp->srv6_enabled)
return -1;
if (!bgp->srv6_locator) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: ignoring SRv6 SID notify: locator not set",
__func__);
return -1;
}
/* Decode the received notification message */
if (!zapi_srv6_sid_notify_decode(zclient->ibuf, &ctx, &sid_addr,
&sid_func, NULL, &note, NULL)) {
zlog_err("%s : error in msg decode", __func__);
return -1;
}
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: received SRv6 SID notify: ctx %s sid_value %pI6 %s",
__func__, srv6_sid_ctx2str(buf, sizeof(buf), &ctx),
&sid_addr, zapi_srv6_sid_notify2str(note));
/* Get the BGP instance for which the SID has been requested, if any */
for (ALL_LIST_ELEMENTS(bm->bgp, node, nnode, bgp_vrf)) {
vrf = vrf_lookup_by_id(bgp_vrf->vrf_id);
if (!vrf)
continue;
if (vrf->vrf_id == ctx.vrf_id) {
found = true;
break;
}
}
if (!found) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: ignoring SRv6 SID notify: No VRF suitable for received SID ctx %s sid_value %pI6",
__func__,
srv6_sid_ctx2str(buf, sizeof(buf), &ctx),
&sid_addr);
return -1;
}
/* Handle notification */
switch (note) {
case ZAPI_SRV6_SID_ALLOCATED:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("SRv6 SID %pI6 %s : ALLOCATED", &sid_addr,
srv6_sid_ctx2str(buf, sizeof(buf), &ctx));
/* Verify that the received SID belongs to the configured locator */
tmp_prefix.family = AF_INET6;
tmp_prefix.prefixlen = IPV6_MAX_BITLEN;
tmp_prefix.prefix = sid_addr;
if (!prefix_match((struct prefix *)&bgp->srv6_locator->prefix,
(struct prefix *)&tmp_prefix))
return -1;
/* Get label */
uint8_t func_len = bgp->srv6_locator->function_bits_length;
uint8_t shift_len = BGP_PREFIX_SID_SRV6_MAX_FUNCTION_LENGTH -
func_len;
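/* Transpose the SID Function into an MPLS label: the function bits
 * are left-aligned within the label field, e.g. with func_len = 16
 * the function value is shifted left by 4 bits
 */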
int label = sid_func << shift_len;
/* Un-export VPN to VRF routes */
vpn_leak_prechange(BGP_VPN_POLICY_DIR_TOVPN, AFI_IP, bgp,
bgp_vrf);
vpn_leak_prechange(BGP_VPN_POLICY_DIR_TOVPN, AFI_IP6, bgp,
bgp_vrf);
locator = srv6_locator_alloc(bgp->srv6_locator_name);
srv6_locator_copy(locator, bgp->srv6_locator);
/* Store SID, locator, and label */
tovpn_sid = XCALLOC(MTYPE_BGP_SRV6_SID, sizeof(struct in6_addr));
*tovpn_sid = sid_addr;
if (ctx.behavior == ZEBRA_SEG6_LOCAL_ACTION_END_DT6) {
XFREE(MTYPE_BGP_SRV6_SID,
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid);
srv6_locator_free(
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_locator);
sid_unregister(bgp,
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid);
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid = tovpn_sid;
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_locator = locator;
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_transpose_label =
label;
} else if (ctx.behavior == ZEBRA_SEG6_LOCAL_ACTION_END_DT4) {
XFREE(MTYPE_BGP_SRV6_SID,
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid);
srv6_locator_free(
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_locator);
sid_unregister(bgp,
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid);
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid = tovpn_sid;
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_locator = locator;
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_transpose_label =
label;
} else if (ctx.behavior == ZEBRA_SEG6_LOCAL_ACTION_END_DT46) {
XFREE(MTYPE_BGP_SRV6_SID, bgp_vrf->tovpn_sid);
srv6_locator_free(bgp_vrf->tovpn_sid_locator);
sid_unregister(bgp, bgp_vrf->tovpn_sid);
bgp_vrf->tovpn_sid = tovpn_sid;
bgp_vrf->tovpn_sid_locator = locator;
bgp_vrf->tovpn_sid_transpose_label = label;
} else {
srv6_locator_free(locator);
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Unsupported behavior. Not assigned SRv6 SID: %s %pI6",
srv6_sid_ctx2str(buf, sizeof(buf),
&ctx),
&sid_addr);
return -1;
}
/* Register the new SID */
sid_register(bgp, tovpn_sid, bgp->srv6_locator_name);
/* Export VPN to VRF routes */
vpn_leak_postchange_all();
break;
case ZAPI_SRV6_SID_RELEASED:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("SRv6 SID %pI6 %s: RELEASED", &sid_addr,
srv6_sid_ctx2str(buf, sizeof(buf), &ctx));
/* Un-export VPN to VRF routes */
vpn_leak_prechange(BGP_VPN_POLICY_DIR_TOVPN, AFI_IP, bgp,
bgp_vrf);
vpn_leak_prechange(BGP_VPN_POLICY_DIR_TOVPN, AFI_IP6, bgp,
bgp_vrf);
/* Remove SID, locator, and label */
if (ctx.behavior == ZEBRA_SEG6_LOCAL_ACTION_END_DT6) {
XFREE(MTYPE_BGP_SRV6_SID,
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid);
if (bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_locator) {
srv6_locator_free(bgp_vrf->vpn_policy[AFI_IP6]
.tovpn_sid_locator);
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_locator =
NULL;
}
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_transpose_label =
0;
/* Unregister the SID */
sid_unregister(bgp,
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid);
} else if (ctx.behavior == ZEBRA_SEG6_LOCAL_ACTION_END_DT4) {
XFREE(MTYPE_BGP_SRV6_SID,
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid);
if (bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_locator) {
srv6_locator_free(bgp_vrf->vpn_policy[AFI_IP]
.tovpn_sid_locator);
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_locator =
NULL;
}
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_transpose_label =
0;
/* Unregister the SID */
sid_unregister(bgp,
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid);
} else if (ctx.behavior == ZEBRA_SEG6_LOCAL_ACTION_END_DT46) {
XFREE(MTYPE_BGP_SRV6_SID, bgp_vrf->tovpn_sid);
if (bgp_vrf->tovpn_sid_locator) {
srv6_locator_free(bgp_vrf->tovpn_sid_locator);
bgp_vrf->tovpn_sid_locator = NULL;
}
bgp_vrf->tovpn_sid_transpose_label = 0;
/* Unregister the SID */
sid_unregister(bgp, bgp_vrf->tovpn_sid);
} else {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("Unsupported behavior. Not assigned SRv6 SID: %s %pI6",
srv6_sid_ctx2str(buf, sizeof(buf),
&ctx),
&sid_addr);
return -1;
}
/* Export VPN to VRF routes */
vpn_leak_postchange_all();
break;
case ZAPI_SRV6_SID_FAIL_ALLOC:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("SRv6 SID %pI6 %s: Failed to allocate",
&sid_addr,
srv6_sid_ctx2str(buf, sizeof(buf), &ctx));
/* Error will be logged by zebra module */
break;
case ZAPI_SRV6_SID_FAIL_RELEASE:
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: SRv6 SID %pI6 %s failure to release",
__func__, &sid_addr,
srv6_sid_ctx2str(buf, sizeof(buf), &ctx));
/* Error will be logged by zebra module */
break;
}
return 0;
}
static int bgp_zebra_process_srv6_locator_add(ZAPI_CALLBACK_ARGS)
{
struct srv6_locator loc = {};
struct bgp *bgp = bgp_get_default();
if (!bgp || !bgp->srv6_enabled)
return 0;
if (zapi_srv6_locator_decode(zclient->ibuf, &loc) < 0)
return -1;
return bgp_zebra_process_srv6_locator_internal(&loc);
}
static int bgp_zebra_process_srv6_locator_delete(ZAPI_CALLBACK_ARGS)
{
struct srv6_locator loc = {};
struct bgp *bgp = bgp_get_default();
struct listnode *node, *nnode;
struct srv6_locator_chunk *chunk;
struct srv6_locator *tovpn_sid_locator;
struct bgp_srv6_function *func;
struct bgp *bgp_vrf;
struct in6_addr *tovpn_sid;
struct prefix_ipv6 tmp_prefi;
if (!bgp)
return 0;
if (zapi_srv6_locator_decode(zclient->ibuf, &loc) < 0)
return -1;
/* clear SRv6 locator */
if (bgp->srv6_locator) {
srv6_locator_free(bgp->srv6_locator);
bgp->srv6_locator = NULL;
}
/* refresh chunks */
for (ALL_LIST_ELEMENTS(bgp->srv6_locator_chunks, node, nnode, chunk))
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&chunk->prefix)) {
listnode_delete(bgp->srv6_locator_chunks, chunk);
srv6_locator_chunk_free(&chunk);
}
/* refresh functions */
for (ALL_LIST_ELEMENTS(bgp->srv6_functions, node, nnode, func)) {
tmp_prefi.family = AF_INET6;
tmp_prefi.prefixlen = IPV6_MAX_BITLEN;
tmp_prefi.prefix = func->sid;
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&tmp_prefi)) {
listnode_delete(bgp->srv6_functions, func);
srv6_function_free(func);
}
}
/* refresh tovpn_sid */
for (ALL_LIST_ELEMENTS_RO(bm->bgp, node, bgp_vrf)) {
if (bgp_vrf->inst_type != BGP_INSTANCE_TYPE_VRF)
continue;
/* refresh vpnv4 tovpn_sid */
tovpn_sid = bgp_vrf->vpn_policy[AFI_IP].tovpn_sid;
if (tovpn_sid) {
tmp_prefi.family = AF_INET6;
tmp_prefi.prefixlen = IPV6_MAX_BITLEN;
tmp_prefi.prefix = *tovpn_sid;
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&tmp_prefi))
XFREE(MTYPE_BGP_SRV6_SID,
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid);
}
/* refresh vpnv6 tovpn_sid */
tovpn_sid = bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid;
if (tovpn_sid) {
tmp_prefi.family = AF_INET6;
tmp_prefi.prefixlen = IPV6_MAX_BITLEN;
tmp_prefi.prefix = *tovpn_sid;
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&tmp_prefi))
XFREE(MTYPE_BGP_SRV6_SID,
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid);
}
/* refresh per-vrf tovpn_sid */
tovpn_sid = bgp_vrf->tovpn_sid;
if (tovpn_sid) {
tmp_prefi.family = AF_INET6;
tmp_prefi.prefixlen = IPV6_MAX_BITLEN;
tmp_prefi.prefix = *tovpn_sid;
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&tmp_prefi))
XFREE(MTYPE_BGP_SRV6_SID, bgp_vrf->tovpn_sid);
}
}
vpn_leak_postchange_all();
/* refresh tovpn_sid_locator */
for (ALL_LIST_ELEMENTS_RO(bm->bgp, node, bgp_vrf)) {
if (bgp_vrf->inst_type != BGP_INSTANCE_TYPE_VRF)
continue;
/* refresh vpnv4 tovpn_sid_locator */
tovpn_sid_locator =
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_locator;
if (tovpn_sid_locator) {
tmp_prefi.family = AF_INET6;
tmp_prefi.prefixlen = IPV6_MAX_BITLEN;
tmp_prefi.prefix = tovpn_sid_locator->prefix.prefix;
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&tmp_prefi)) {
srv6_locator_free(bgp_vrf->vpn_policy[AFI_IP]
.tovpn_sid_locator);
bgp_vrf->vpn_policy[AFI_IP].tovpn_sid_locator =
NULL;
}
}
/* refresh vpnv6 tovpn_sid_locator */
tovpn_sid_locator =
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_locator;
if (tovpn_sid_locator) {
tmp_prefi.family = AF_INET6;
tmp_prefi.prefixlen = IPV6_MAX_BITLEN;
tmp_prefi.prefix = tovpn_sid_locator->prefix.prefix;
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&tmp_prefi)) {
srv6_locator_free(bgp_vrf->vpn_policy[AFI_IP6]
.tovpn_sid_locator);
bgp_vrf->vpn_policy[AFI_IP6].tovpn_sid_locator =
NULL;
}
}
/* refresh per-vrf tovpn_sid_locator */
tovpn_sid_locator = bgp_vrf->tovpn_sid_locator;
if (tovpn_sid_locator) {
tmp_prefi.family = AF_INET6;
tmp_prefi.prefixlen = IPV6_MAX_BITLEN;
tmp_prefi.prefix = tovpn_sid_locator->prefix.prefix;
if (prefix_match((struct prefix *)&loc.prefix,
(struct prefix *)&tmp_prefi)) {
srv6_locator_free(bgp_vrf->tovpn_sid_locator);
bgp_vrf->tovpn_sid_locator = NULL;
}
}
}
return 0;
}
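/* Dispatch table mapping the ZAPI message types received from zebra
 * to their bgpd handlers
 */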
static zclient_handler *const bgp_handlers[] = {
[ZEBRA_ROUTER_ID_UPDATE] = bgp_router_id_update,
[ZEBRA_INTERFACE_ADDRESS_ADD] = bgp_interface_address_add,
[ZEBRA_INTERFACE_ADDRESS_DELETE] = bgp_interface_address_delete,
[ZEBRA_INTERFACE_NBR_ADDRESS_ADD] = bgp_interface_nbr_address_add,
[ZEBRA_INTERFACE_NBR_ADDRESS_DELETE] = bgp_interface_nbr_address_delete,
[ZEBRA_REDISTRIBUTE_ROUTE_ADD] = zebra_read_route,
[ZEBRA_REDISTRIBUTE_ROUTE_DEL] = zebra_read_route,
[ZEBRA_FEC_UPDATE] = bgp_read_fec_update,
[ZEBRA_LOCAL_ES_ADD] = bgp_zebra_process_local_es_add,
[ZEBRA_LOCAL_ES_DEL] = bgp_zebra_process_local_es_del,
[ZEBRA_VNI_ADD] = bgp_zebra_process_local_vni,
[ZEBRA_LOCAL_ES_EVI_ADD] = bgp_zebra_process_local_es_evi,
[ZEBRA_LOCAL_ES_EVI_DEL] = bgp_zebra_process_local_es_evi,
[ZEBRA_VNI_DEL] = bgp_zebra_process_local_vni,
[ZEBRA_MACIP_ADD] = bgp_zebra_process_local_macip,
[ZEBRA_MACIP_DEL] = bgp_zebra_process_local_macip,
[ZEBRA_L3VNI_ADD] = bgp_zebra_process_local_l3vni,
[ZEBRA_L3VNI_DEL] = bgp_zebra_process_local_l3vni,
[ZEBRA_IP_PREFIX_ROUTE_ADD] = bgp_zebra_process_local_ip_prefix,
[ZEBRA_IP_PREFIX_ROUTE_DEL] = bgp_zebra_process_local_ip_prefix,
[ZEBRA_RULE_NOTIFY_OWNER] = rule_notify_owner,
[ZEBRA_IPSET_NOTIFY_OWNER] = ipset_notify_owner,
[ZEBRA_IPSET_ENTRY_NOTIFY_OWNER] = ipset_entry_notify_owner,
[ZEBRA_IPTABLE_NOTIFY_OWNER] = iptable_notify_owner,
[ZEBRA_ROUTE_NOTIFY_OWNER] = bgp_zebra_route_notify_owner,
[ZEBRA_SRV6_LOCATOR_ADD] = bgp_zebra_process_srv6_locator_add,
[ZEBRA_SRV6_LOCATOR_DELETE] = bgp_zebra_process_srv6_locator_delete,
[ZEBRA_SRV6_MANAGER_GET_LOCATOR_CHUNK] =
bgp_zebra_process_srv6_locator_chunk,
[ZEBRA_SRV6_SID_NOTIFY] = bgp_zebra_srv6_sid_notify,
};
static int bgp_if_new_hook(struct interface *ifp)
{
struct bgp_interface *iifp;
if (ifp->info)
return 0;
iifp = XCALLOC(MTYPE_BGP_IF_INFO, sizeof(struct bgp_interface));
ifp->info = iifp;
return 0;
}
static int bgp_if_delete_hook(struct interface *ifp)
{
XFREE(MTYPE_BGP_IF_INFO, ifp->info);
return 0;
}
void bgp_if_init(void)
{
/* Initialize Zebra interface data structure. */
hook_register_prio(if_add, 0, bgp_if_new_hook);
hook_register_prio(if_del, 0, bgp_if_delete_hook);
}
static bool bgp_zebra_label_manager_ready(void)
{
return (zclient_sync->sock > 0);
}
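/* Retry the label manager connection every second until it succeeds */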
static void bgp_start_label_manager(struct event *start)
{
if (!bgp_zebra_label_manager_ready() &&
!bgp_zebra_label_manager_connect())
event_add_timer(bm->master, bgp_start_label_manager, NULL, 1,
&bm->t_bgp_start_label_manager);
}
static bool bgp_zebra_label_manager_connect(void)
{
/* Connect to label manager. */
if (zclient_socket_connect(zclient_sync) < 0) {
zlog_warn("%s: failed connecting synchronous zclient!",
__func__);
return false;
}
/* make socket non-blocking */
set_nonblocking(zclient_sync->sock);
/* Send hello to notify zebra this is a synchronous client */
if (zclient_send_hello(zclient_sync) == ZCLIENT_SEND_FAILURE) {
zlog_warn("%s: failed sending hello for synchronous zclient!",
__func__);
close(zclient_sync->sock);
zclient_sync->sock = -1;
return false;
}
/* Connect to label manager */
if (lm_label_manager_connect(zclient_sync, 0) != 0) {
zlog_warn("%s: failed connecting to label manager!", __func__);
if (zclient_sync->sock > 0) {
close(zclient_sync->sock);
zclient_sync->sock = -1;
}
return false;
}
/* tell label pool that zebra is connected */
bgp_lp_event_zebra_up();
/* tell BGP L3VPN that label manager is available */
if (bgp_get_default())
vpn_leak_postchange_all();
return true;
}
static void bgp_zebra_capabilities(struct zclient_capabilities *cap)
{
bm->v6_with_v4_nexthops = cap->v6_with_v4_nexthop;
}
void bgp_zebra_init(struct event_loop *master, unsigned short instance)
{
zclient_num_connects = 0;
hook_register_prio(if_real, 0, bgp_ifp_create);
hook_register_prio(if_up, 0, bgp_ifp_up);
hook_register_prio(if_down, 0, bgp_ifp_down);
hook_register_prio(if_unreal, 0, bgp_ifp_destroy);
/* Set default values. */
zclient = zclient_new(master, &zclient_options_default, bgp_handlers,
array_size(bgp_handlers));
zclient_init(zclient, ZEBRA_ROUTE_BGP, 0, &bgpd_privs);
zclient->zebra_buffer_write_ready = bgp_zebra_buffer_write_ready;
zclient->zebra_connected = bgp_zebra_connected;
zclient->zebra_capabilities = bgp_zebra_capabilities;
zclient->nexthop_update = bgp_nexthop_update;
zclient->instance = instance;
/* Initialize special zclient for synchronous message exchanges. */
zclient_sync = zclient_new(master, &zclient_options_sync, NULL, 0);
zclient_sync->sock = -1;
zclient_sync->redist_default = ZEBRA_ROUTE_BGP;
zclient_sync->instance = instance;
zclient_sync->session_id = 1;
zclient_sync->privs = &bgpd_privs;
if (!bgp_zebra_label_manager_ready())
event_add_timer(master, bgp_start_label_manager, NULL, 1,
&bm->t_bgp_start_label_manager);
}
void bgp_zebra_destroy(void)
{
if (zclient == NULL)
return;
zclient_stop(zclient);
zclient_free(zclient);
zclient = NULL;
if (zclient_sync == NULL)
return;
zclient_stop(zclient_sync);
zclient_free(zclient_sync);
zclient_sync = NULL;
}
int bgp_zebra_num_connects(void)
{
return zclient_num_connects;
}
void bgp_send_pbr_rule_action(struct bgp_pbr_action *pbra,
struct bgp_pbr_rule *pbr,
bool install)
{
struct stream *s;
if (pbra->install_in_progress && !pbr)
return;
if (pbr && pbr->install_in_progress)
return;
if (BGP_DEBUG(zebra, ZEBRA)) {
if (pbr)
zlog_debug("%s: table %d (ip rule) %d", __func__,
pbra->table_id, install);
else
zlog_debug("%s: table %d fwmark %d %d", __func__,
pbra->table_id, pbra->fwmark, install);
}
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s,
install ? ZEBRA_RULE_ADD : ZEBRA_RULE_DELETE,
VRF_DEFAULT);
bgp_encode_pbr_rule_action(s, pbra, pbr);
if ((zclient_send_message(zclient) != ZCLIENT_SEND_FAILURE)
&& install) {
if (!pbr)
pbra->install_in_progress = true;
else
pbr->install_in_progress = true;
}
}
void bgp_send_pbr_ipset_match(struct bgp_pbr_match *pbrim, bool install)
{
struct stream *s;
if (pbrim->install_in_progress)
return;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: name %s type %d %d, ID %u", __func__,
pbrim->ipset_name, pbrim->type, install,
pbrim->unique);
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s,
install ? ZEBRA_IPSET_CREATE :
ZEBRA_IPSET_DESTROY,
VRF_DEFAULT);
stream_putl(s, 1); /* send one pbr action */
bgp_encode_pbr_ipset_match(s, pbrim);
stream_putw_at(s, 0, stream_get_endp(s));
if ((zclient_send_message(zclient) != ZCLIENT_SEND_FAILURE) && install)
pbrim->install_in_progress = true;
}
void bgp_send_pbr_ipset_entry_match(struct bgp_pbr_match_entry *pbrime,
bool install)
{
struct stream *s;
if (pbrime->install_in_progress)
return;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: name %s %d %d, ID %u", __func__,
pbrime->backpointer->ipset_name, pbrime->unique,
install, pbrime->unique);
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s,
install ? ZEBRA_IPSET_ENTRY_ADD :
ZEBRA_IPSET_ENTRY_DELETE,
VRF_DEFAULT);
stream_putl(s, 1); /* send one pbr action */
bgp_encode_pbr_ipset_entry_match(s, pbrime);
stream_putw_at(s, 0, stream_get_endp(s));
if ((zclient_send_message(zclient) != ZCLIENT_SEND_FAILURE) && install)
pbrime->install_in_progress = true;
}
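/* Encode the ifindex of each PBR-configured interface of the given
 * address family that resolves in the BGP instance's VRF
 */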
static void bgp_encode_pbr_interface_list(struct bgp *bgp, struct stream *s,
uint8_t family)
{
struct bgp_pbr_config *bgp_pbr_cfg = bgp->bgp_pbr_cfg;
struct bgp_pbr_interface_head *head;
struct bgp_pbr_interface *pbr_if;
struct interface *ifp;
if (!bgp_pbr_cfg)
return;
if (family == AF_INET)
head = &(bgp_pbr_cfg->ifaces_by_name_ipv4);
else
head = &(bgp_pbr_cfg->ifaces_by_name_ipv6);
RB_FOREACH (pbr_if, bgp_pbr_interface_head, head) {
ifp = if_lookup_by_name(pbr_if->name, bgp->vrf_id);
if (ifp)
stream_putl(s, ifp->ifindex);
}
}
static int bgp_pbr_get_ifnumber(struct bgp *bgp, uint8_t family)
{
struct bgp_pbr_config *bgp_pbr_cfg = bgp->bgp_pbr_cfg;
struct bgp_pbr_interface_head *head;
struct bgp_pbr_interface *pbr_if;
int cnt = 0;
if (!bgp_pbr_cfg)
return 0;
if (family == AF_INET)
head = &(bgp_pbr_cfg->ifaces_by_name_ipv4);
else
head = &(bgp_pbr_cfg->ifaces_by_name_ipv6);
RB_FOREACH (pbr_if, bgp_pbr_interface_head, head) {
if (if_lookup_by_name(pbr_if->name, bgp->vrf_id))
cnt++;
}
return cnt;
}
void bgp_send_pbr_iptable(struct bgp_pbr_action *pba,
struct bgp_pbr_match *pbm,
bool install)
{
struct stream *s;
int ret = 0;
int nb_interface;
if (pbm->install_iptable_in_progress)
return;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: name %s type %d mark %d %d, ID %u", __func__,
pbm->ipset_name, pbm->type, pba->fwmark, install,
pbm->unique2);
s = zclient->obuf;
stream_reset(s);
zclient_create_header(s,
install ? ZEBRA_IPTABLE_ADD :
ZEBRA_IPTABLE_DELETE,
VRF_DEFAULT);
bgp_encode_pbr_iptable_match(s, pba, pbm);
nb_interface = bgp_pbr_get_ifnumber(pba->bgp, pbm->family);
stream_putl(s, nb_interface);
if (nb_interface)
bgp_encode_pbr_interface_list(pba->bgp, s, pbm->family);
stream_putw_at(s, 0, stream_get_endp(s));
ret = zclient_send_message(zclient);
if (install) {
if (ret != ZCLIENT_SEND_FAILURE)
pba->refcnt++;
else
pbm->install_iptable_in_progress = true;
}
}
/* Inject into table <table_id> a default route:
 * - if a nexthop IP is present: via this nexthop
 * - if the VRF differs from the local one: via the matching VRF
 */
void bgp_zebra_announce_default(struct bgp *bgp, struct nexthop *nh,
afi_t afi, uint32_t table_id, bool announce)
{
struct zapi_nexthop *api_nh;
struct zapi_route api;
struct prefix p;
if (!nh || (nh->type != NEXTHOP_TYPE_IPV4
&& nh->type != NEXTHOP_TYPE_IPV6)
|| nh->vrf_id == VRF_UNKNOWN)
return;
/* In vrf-lite, no default route needs to be announced:
 * the VRF table id is used directly to divert traffic
 */
if (!vrf_is_backend_netns() && bgp->vrf_id != nh->vrf_id)
return;
memset(&p, 0, sizeof(p));
if (afi != AFI_IP && afi != AFI_IP6)
return;
p.family = afi2family(afi);
memset(&api, 0, sizeof(api));
api.vrf_id = bgp->vrf_id;
api.type = ZEBRA_ROUTE_BGP;
api.safi = SAFI_UNICAST;
api.prefix = p;
api.tableid = table_id;
api.nexthop_num = 1;
SET_FLAG(api.message, ZAPI_MESSAGE_TABLEID);
SET_FLAG(api.message, ZAPI_MESSAGE_NEXTHOP);
api_nh = &api.nexthops[0];
api.distance = ZEBRA_EBGP_DISTANCE_DEFAULT;
SET_FLAG(api.message, ZAPI_MESSAGE_DISTANCE);
api_nh->vrf_id = nh->vrf_id;
if (BGP_DEBUG(zebra, ZEBRA)) {
struct vrf *vrf;
vrf = vrf_lookup_by_id(nh->vrf_id);
zlog_debug("%s: %s default route to %pNHvv(%s) table %d",
bgp->name_pretty, announce ? "adding" : "withdrawing",
nh, VRF_LOGNAME(vrf), table_id);
}
/* redirect IP */
if (afi == AFI_IP && nh->gate.ipv4.s_addr != INADDR_ANY) {
api_nh->gate.ipv4 = nh->gate.ipv4;
api_nh->type = NEXTHOP_TYPE_IPV4;
} else if (afi == AFI_IP6 && memcmp(&nh->gate.ipv6, &in6addr_any,
sizeof(struct in6_addr))) {
memcpy(&api_nh->gate.ipv6, &nh->gate.ipv6,
sizeof(struct in6_addr));
api_nh->type = NEXTHOP_TYPE_IPV6;
} else if (nh->vrf_id != bgp->vrf_id) {
struct vrf *vrf;
struct interface *ifp;
vrf = vrf_lookup_by_id(nh->vrf_id);
if (!vrf)
return;
/* create default route with interface <VRF>
* with nexthop-vrf <VRF>
*/
ifp = if_lookup_by_name_vrf(vrf->name, vrf);
if (!ifp)
return;
api_nh->type = NEXTHOP_TYPE_IFINDEX;
api_nh->ifindex = ifp->ifindex;
}
zclient_route_send(announce ? ZEBRA_ROUTE_ADD : ZEBRA_ROUTE_DELETE,
zclient, &api);
}
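/* Usage sketch (illustrative values only): divert traffic of table 100
 * to the nexthop 192.0.2.1 in the local VRF:
 *
 *   struct nexthop nh = {};
 *
 *   nh.type = NEXTHOP_TYPE_IPV4;
 *   nh.vrf_id = bgp->vrf_id;
 *   inet_pton(AF_INET, "192.0.2.1", &nh.gate.ipv4);
 *   bgp_zebra_announce_default(bgp, &nh, AFI_IP, 100, true);
 */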
/* Send capabilities to RIB */
int bgp_zebra_send_capabilities(struct bgp *bgp, bool disable)
{
struct zapi_cap api;
int ret = BGP_GR_SUCCESS;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: Sending %sable for %s", __func__,
disable ? "dis" : "en", bgp->name_pretty);
if (zclient == NULL) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s zclient invalid", __func__,
bgp->name_pretty);
return BGP_GR_FAILURE;
}
/* Check if the client is connected */
if ((zclient->sock < 0) || (zclient->t_connect)) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s client not connected", __func__,
bgp->name_pretty);
return BGP_GR_FAILURE;
}
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s(%d): Sending GR capability %s to zebra",
bgp->name_pretty, bgp->vrf_id,
disable ? "disabled" : "enabled");
/* Build the capability message: either disable GR, or advertise the
 * GR capabilities along with the stale-route removal time
 */
memset(&api, 0, sizeof(api));
if (disable) {
api.cap = ZEBRA_CLIENT_GR_DISABLE;
api.vrf_id = bgp->vrf_id;
} else {
api.cap = ZEBRA_CLIENT_GR_CAPABILITIES;
api.stale_removal_time = bgp->rib_stale_time;
api.vrf_id = bgp->vrf_id;
}
if (zclient_capabilities_send(ZEBRA_CLIENT_CAPABILITIES, zclient, &api)
== ZCLIENT_SEND_FAILURE) {
zlog_err("%s(%d): Error sending GR capability to zebra",
bgp->name_pretty, bgp->vrf_id);
ret = BGP_GR_FAILURE;
} else {
if (disable)
bgp->present_zebra_gr_state = ZEBRA_GR_DISABLE;
else
bgp->present_zebra_gr_state = ZEBRA_GR_ENABLE;
ret = BGP_GR_SUCCESS;
}
return ret;
}
/* Send route update pending or completed status to RIB for the
* specific AFI, SAFI
*/
int bgp_zebra_update(struct bgp *bgp, afi_t afi, safi_t safi,
enum zserv_client_capabilities type)
{
struct zapi_cap api = {0};
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s afi: %u safi: %u Command %s", __func__,
bgp->name_pretty, afi, safi,
zserv_gr_client_cap_string(type));
if (zclient == NULL) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s zclient == NULL, invalid", __func__,
bgp->name_pretty);
return BGP_GR_FAILURE;
}
/* Check if the client is connected */
if ((zclient->sock < 0) || (zclient->t_connect)) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s client not connected", __func__,
bgp->name_pretty);
return BGP_GR_FAILURE;
}
api.afi = afi;
api.safi = safi;
api.vrf_id = bgp->vrf_id;
api.cap = type;
if (zclient_capabilities_send(ZEBRA_CLIENT_CAPABILITIES, zclient, &api)
== ZCLIENT_SEND_FAILURE) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s error sending capability", __func__,
bgp->name_pretty);
return BGP_GR_FAILURE;
}
return BGP_GR_SUCCESS;
}
/* Send RIB stale timer update */
int bgp_zebra_stale_timer_update(struct bgp *bgp)
{
struct zapi_cap api;
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s Timer Update to %u", __func__,
bgp->name_pretty, bgp->rib_stale_time);
if (zclient == NULL) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("zclient invalid");
return BGP_GR_FAILURE;
}
/* Check if the client is connected */
if ((zclient->sock < 0) || (zclient->t_connect)) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s client not connected", __func__,
bgp->name_pretty);
return BGP_GR_FAILURE;
}
memset(&api, 0, sizeof(api));
api.cap = ZEBRA_CLIENT_RIB_STALE_TIME;
api.stale_removal_time = bgp->rib_stale_time;
api.vrf_id = bgp->vrf_id;
if (zclient_capabilities_send(ZEBRA_CLIENT_CAPABILITIES, zclient, &api)
== ZCLIENT_SEND_FAILURE) {
if (BGP_DEBUG(zebra, ZEBRA))
zlog_debug("%s: %s error sending capability", __func__,
bgp->name_pretty);
return BGP_GR_FAILURE;
}
return BGP_GR_SUCCESS;
}
int bgp_zebra_srv6_manager_get_locator_chunk(const char *name)
{
return srv6_manager_get_locator_chunk(zclient, name);
}
int bgp_zebra_srv6_manager_release_locator_chunk(const char *name)
{
return srv6_manager_release_locator_chunk(zclient, name);
}
/**
* Ask the SRv6 Manager (zebra) about a specific locator
*
* @param name Locator name
* @return 0 on success, -1 otherwise
*/
int bgp_zebra_srv6_manager_get_locator(const char *name)
{
if (!name)
return -1;
/*
* Send the Get Locator request to the SRv6 Manager and return the
* result
*/
return srv6_manager_get_locator(zclient, name);
}
/**
* Ask the SRv6 Manager (zebra) to allocate a SID.
*
* Optionally, it is possible to provide an IPv6 address (sid_value parameter).
*
* When sid_value is provided, the SRv6 Manager allocates the requested SID
* address, if the request can be satisfied (explicit allocation).
*
* When sid_value is not provided, the SRv6 Manager allocates any available SID
* from the provided locator (dynamic allocation).
*
* @param ctx Context to be associated with the request SID
* @param sid_value IPv6 address to be associated with the requested SID (optional)
* @param locator_name Name of the locator from which the SID must be allocated
* @param sid_func Where to store the SID Function allocated by the SRv6 Manager
* @return True on success, false otherwise
*/
bool bgp_zebra_request_srv6_sid(const struct srv6_sid_ctx *ctx,
struct in6_addr *sid_value,
const char *locator_name, uint32_t *sid_func)
{
int ret;
if (!ctx || !locator_name)
return false;
/*
* Send the Get SRv6 SID request to the SRv6 Manager and check the
* result
*/
ret = srv6_manager_get_sid(zclient, ctx, sid_value, locator_name,
sid_func);
if (ret < 0) {
zlog_warn("%s: error getting SRv6 SID!", __func__);
return false;
}
return true;
}
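/* Usage sketch (illustrative values only): request a dynamically
 * allocated End.DT46 SID from a locator named "loc1"; the result is
 * delivered asynchronously via ZEBRA_SRV6_SID_NOTIFY and handled in
 * bgp_zebra_srv6_sid_notify():
 *
 *   struct srv6_sid_ctx ctx = {};
 *   uint32_t sid_func;
 *
 *   ctx.behavior = ZEBRA_SEG6_LOCAL_ACTION_END_DT46;
 *   ctx.vrf_id = bgp_vrf->vrf_id;
 *   bgp_zebra_request_srv6_sid(&ctx, NULL, "loc1", &sid_func);
 */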
/**
* Ask the SRv6 Manager (zebra) to release a previously allocated SID.
*
* This function is used to tell the SRv6 Manager that BGP no longer intends
* to use the SID.
*
* @param ctx Context to be associated with the SID to be released
*/
void bgp_zebra_release_srv6_sid(const struct srv6_sid_ctx *ctx)
{
int ret;
if (!ctx)
return;
/*
* Send the Release SRv6 SID request to the SRv6 Manager and check the
* result
*/
ret = srv6_manager_release_sid(zclient, ctx);
if (ret < 0) {
zlog_warn("%s: error releasing SRv6 SID!", __func__);
return;
}
}
void bgp_zebra_send_nexthop_label(int cmd, mpls_label_t label,
ifindex_t ifindex, vrf_id_t vrf_id,
enum lsp_types_t ltype, struct prefix *p,
uint8_t num_labels, mpls_label_t out_labels[])
{
struct zapi_labels zl = {};
struct zapi_nexthop *znh;
int i = 0;
zl.type = ltype;
zl.local_label = label;
zl.nexthop_num = 1;
znh = &zl.nexthops[0];
if (p->family == AF_INET)
IPV4_ADDR_COPY(&znh->gate.ipv4, &p->u.prefix4);
else
IPV6_ADDR_COPY(&znh->gate.ipv6, &p->u.prefix6);
if (ifindex == IFINDEX_INTERNAL)
znh->type = (p->family == AF_INET) ? NEXTHOP_TYPE_IPV4
: NEXTHOP_TYPE_IPV6;
else
znh->type = (p->family == AF_INET) ? NEXTHOP_TYPE_IPV4_IFINDEX
: NEXTHOP_TYPE_IPV6_IFINDEX;
znh->ifindex = ifindex;
znh->vrf_id = vrf_id;
if (num_labels == 0)
znh->label_num = 0;
else {
if (num_labels > MPLS_MAX_LABELS)
znh->label_num = MPLS_MAX_LABELS;
else
znh->label_num = num_labels;
for (i = 0; i < znh->label_num; i++)
znh->labels[i] = out_labels[i];
}
/* vrf_id is DEFAULT_VRF */
zebra_send_mpls_labels(zclient, cmd, &zl);
}
bool bgp_zebra_request_label_range(uint32_t base, uint32_t chunk_size,
bool label_auto)
{
int ret;
uint32_t start, end;
if (!zclient_sync || !bgp_zebra_label_manager_ready())
return false;
ret = lm_get_label_chunk(zclient_sync, 0, base, chunk_size, &start,
&end);
if (ret < 0) {
zlog_warn("%s: error getting label range!", __func__);
return false;
}
if (start > end || start < MPLS_LABEL_UNRESERVED_MIN ||
end > MPLS_LABEL_UNRESERVED_MAX) {
flog_err(EC_BGP_LM_ERROR, "%s: Invalid Label chunk: %u - %u",
__func__, start, end);
return false;
}
if (label_auto)
/* automatic labels are serviced by the bgp label pool
 * manager, which allocates label chunks into
 * pre-pools and needs to be notified when
 * new chunks become available
 */
bgp_lp_event_chunk(start, end);
return true;
}
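/* Usage sketch (illustrative values only): reserve a static range of
 * 128 labels starting at label 16000, managed by the caller rather
 * than by the label pool:
 *
 *   if (!bgp_zebra_request_label_range(16000, 128, false))
 *       zlog_warn("label range request failed");
 */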
void bgp_zebra_release_label_range(uint32_t start, uint32_t end)
{
int ret;
if (!zclient_sync || !bgp_zebra_label_manager_ready())
return;
ret = lm_release_label_chunk(zclient_sync, start, end);
if (ret < 0)
zlog_warn("%s: error releasing label range!", __func__);
}