frr/bgpd/bgp_nexthop.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

160 lines
5.4 KiB
C
Raw Permalink Normal View History

// SPDX-License-Identifier: GPL-2.0-or-later
2002-12-13 21:15:29 +01:00
/* BGP nexthop scan
* Copyright (C) 2000 Kunihiro Ishiguro
*/
2002-12-13 21:15:29 +01:00
#ifndef _QUAGGA_BGP_NEXTHOP_H
#define _QUAGGA_BGP_NEXTHOP_H
#include "if.h"
#include "queue.h"
#include "prefix.h"
#include "bgp_table.h"
#define NEXTHOP_FAMILY(nexthop_len) \
(((nexthop_len) == 4 || (nexthop_len) == 12 \
? AF_INET \
: ((nexthop_len) == 16 || (nexthop_len) == 24 \
|| (nexthop_len) == 32 \
|| (nexthop_len) == 48 \
? AF_INET6 \
: AF_UNSPEC)))
bgpd: add L3/L2VPN Virtual Network Control feature This feature adds an L3 & L2 VPN application that makes use of the VPN and Encap SAFIs. This code is currently used to support IETF NVO3 style operation. In NVO3 terminology it provides the Network Virtualization Authority (NVA) and the ability to import/export IP prefixes and MAC addresses from Network Virtualization Edges (NVEs). The code supports per-NVE tables. The NVE-NVA protocol used to communicate routing and Ethernet / Layer 2 (L2) forwarding information between NVAs and NVEs is referred to as the Remote Forwarder Protocol (RFP). OpenFlow is an example RFP. For general background on NVO3 and RFP concepts see [1]. For information on Openflow see [2]. RFPs are integrated with BGP via the RF API contained in the new "rfapi" BGP sub-directory. Currently, only a simple example RFP is included in Quagga. Developers may use this example as a starting point to integrate Quagga with an RFP of their choosing, e.g., OpenFlow. The RFAPI code also supports the ability import/export of routing information between VNC and customer edge routers (CEs) operating within a virtual network. Import/export may take place between BGP views or to the default zebera VRF. BGP, with IP VPNs and Tunnel Encapsulation, is used to distribute VPN information between NVAs. BGP based IP VPN support is defined in RFC4364, BGP/MPLS IP Virtual Private Networks (VPNs), and RFC4659, BGP-MPLS IP Virtual Private Network (VPN) Extension for IPv6 VPN . Use of both the Encapsulation Subsequent Address Family Identifier (SAFI) and the Tunnel Encapsulation Attribute, RFC5512, The BGP Encapsulation Subsequent Address Family Identifier (SAFI) and the BGP Tunnel Encapsulation Attribute, are supported. MAC address distribution does not follow any standard BGB encoding, although it was inspired by the early IETF EVPN concepts. The feature is conditionally compiled and disabled by default. Use the --enable-bgp-vnc configure option to enable. The majority of this code was authored by G. Paul Ziemba <paulz@labn.net>. [1] http://tools.ietf.org/html/draft-ietf-nvo3-nve-nva-cp-req [2] https://www.opennetworking.org/sdn-resources/technical-library Now includes changes needed to merge with cmaster-next.
2016-05-07 20:18:56 +02:00
#define BGP_MP_NEXTHOP_FAMILY NEXTHOP_FAMILY
PREDECL_RBTREE_UNIQ(bgp_nexthop_cache);
2002-12-13 21:15:29 +01:00
/* BGP nexthop cache value structure. */
struct bgp_nexthop_cache {
afi_t afi;
/* The ifindex of the outgoing interface *if* it's a v6 LL */
ifindex_t ifindex_ipv6_ll;
/* RB-tree entry. */
struct bgp_nexthop_cache_item entry;
2002-12-13 21:15:29 +01:00
/* IGP route's metric. */
uint32_t metric;
2002-12-13 21:15:29 +01:00
/* Nexthop number and nexthop linked list.*/
uint16_t nexthop_num;
/* This flag is set to TRUE for a bnc that is gateway IP overlay index
* nexthop.
*/
bool is_evpn_gwip_nexthop;
uint8_t change_flags;
#define BGP_NEXTHOP_CHANGED (1 << 0)
#define BGP_NEXTHOP_METRIC_CHANGED (1 << 1)
#define BGP_NEXTHOP_MACIP_CHANGED (1 << 2)
2002-12-13 21:15:29 +01:00
struct nexthop *nexthop;
time_t last_update;
uint16_t flags;
bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP When EVPN prefix route with a gateway IP overlay index is imported into the IP vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP. For this vrf route to be valid, following conditions must be met. - Gateway IP nexthop of this route should be L3 reachable, i.e., this route should be resolved in RIB. - A remote MAC/IP route should be present for the gateway IP address in the EVI(L2VPN table). To check for the first condition, gateway IP is registered with nht (nexthop tracking) to receive the reachability notifications for this IP from zebra RIB. If the gateway IP is reachable, zebra sends the reachability information (i.e., nexthop interface) for the gateway IP. This nexthop interface should be the SVI interface. Now, to find out type-2 route corresponding to the gateway IP, we need to fetch the VNI for the above SVI. To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with svi_ifindex as key. struct hash *vni_svi_hash; An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero. Using this hash, we obtain struct bgpevpn corresponding to the gateway IP. For gateway IP overlay index recursive lookup, once we find the correct EVI, we have to lookup its route table for a MAC/IP prefix. As we have to iterate the entire route table for every lookup, this lookup is expensive. We can optimize this lookup by adding all the remote IP addresses in a hash table. Following hash table is defined for this purpose in struct bgpevpn Struct hash *remote_ip_hash; When a MAC/IP route is installed in the EVI table, it is also added to remote_ip_hash. It is possible to have multiple MAC/IP routes with the same IP address because of host move scenarios. Thus, for every address addr in remote_ip_hash, we maintain list of all the MAC/IP routes having addr as their IP address. Following structure defines an address in remote_ip_hash. struct evpn_remote_ip { struct ipaddr addr; struct list *macip_path_list; }; A Boolean field is added to struct bgp_nexthop_cache to indicate that the nexthop is EVPN gateway IP overlay index. bool is_evpn_gwip_nexthop; A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache. This flag is set when the gateway IP is L3 reachable but not yet resolved by a MAC/IP route. Following table explains the combination of L3 and L2 reachability w.r.t. BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags * | MACIP resolved | MACIP unresolved *----------------|----------------|------------------ * L3 reachable | VALID = 1 | VALID = 0 * | INCOMPLETE = 0 | INCOMPLETE = 1 * ---------------|----------------|-------------------- * L3 unreachable | VALID = 0 | VALID = 0 * | INCOMPLETE = 0 | INCOMPLETE = 0 Procedure that we use to check if the gateway IP is resolvable by a MAC/IP route: - Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash. - Check if the gateway IP is present in remote_ip_hash in this EVI. When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route, unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag. Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-01-11 12:51:56 +01:00
/*
* If the nexthop is EVPN gateway IP NH, VALID flag is set only if the nexthop
* is RIB reachable as well as MAC/IP is present
*/
#define BGP_NEXTHOP_VALID (1 << 0)
#define BGP_NEXTHOP_REGISTERED (1 << 1)
#define BGP_NEXTHOP_CONNECTED (1 << 2)
#define BGP_NEXTHOP_PEER_NOTIFIED (1 << 3)
#define BGP_STATIC_ROUTE (1 << 4)
#define BGP_STATIC_ROUTE_EXACT_MATCH (1 << 5)
#define BGP_NEXTHOP_LABELED_VALID (1 << 6)
bgpd: Validate imported routes next-hop that is in a default VRF Without this patch: ``` r1# sh ip bgp vrf CUSTOMER-A BGP table version is 1, local router ID is 20.20.20.0, vrf id 4 Default local pref 100, local AS 65000 Status codes: s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath, i internal, r RIB-failure, S Stale, R Removed Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self Origin codes: i - IGP, e - EGP, ? - incomplete RPKI validation codes: V valid, I invalid, N Not found Network Next Hop Metric LocPrf Weight Path 192.168.2.0/24 192.168.179.5@0< 0 0 479 ? Displayed 1 routes and 1 total paths r1# ``` This is because the route is imported, next-hop is in a default VRF, and we should evaluate an ultimate path info, not the current path info. After: ``` r1# sh ip bgp vrf CUSTOMER-A BGP table version is 1, local router ID is 20.20.20.0, vrf id 4 Default local pref 100, local AS 65000 Status codes: s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath, i internal, r RIB-failure, S Stale, R Removed Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self Origin codes: i - IGP, e - EGP, ? - incomplete RPKI validation codes: V valid, I invalid, N Not found Network Next Hop Metric LocPrf Weight Path *> 192.168.2.0/24 192.168.179.5@0< 0 0 479 ? Displayed 1 routes and 1 total paths r1# ``` In both cases next-hop in cache table is valid: ``` r1# sh ip bgp nexthop Current BGP nexthop cache: 192.168.179.5 valid [IGP metric 0], #paths 2, peer 192.168.179.5 Resolved prefix 192.168.179.0/24 if r1-eth0 Last update: Thu Sep 5 11:24:37 2024 r1# ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-09-05 10:24:53 +02:00
#define BGP_NEXTHOP_ULTIMATE (1 << 7)
bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP When EVPN prefix route with a gateway IP overlay index is imported into the IP vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP. For this vrf route to be valid, following conditions must be met. - Gateway IP nexthop of this route should be L3 reachable, i.e., this route should be resolved in RIB. - A remote MAC/IP route should be present for the gateway IP address in the EVI(L2VPN table). To check for the first condition, gateway IP is registered with nht (nexthop tracking) to receive the reachability notifications for this IP from zebra RIB. If the gateway IP is reachable, zebra sends the reachability information (i.e., nexthop interface) for the gateway IP. This nexthop interface should be the SVI interface. Now, to find out type-2 route corresponding to the gateway IP, we need to fetch the VNI for the above SVI. To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with svi_ifindex as key. struct hash *vni_svi_hash; An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero. Using this hash, we obtain struct bgpevpn corresponding to the gateway IP. For gateway IP overlay index recursive lookup, once we find the correct EVI, we have to lookup its route table for a MAC/IP prefix. As we have to iterate the entire route table for every lookup, this lookup is expensive. We can optimize this lookup by adding all the remote IP addresses in a hash table. Following hash table is defined for this purpose in struct bgpevpn Struct hash *remote_ip_hash; When a MAC/IP route is installed in the EVI table, it is also added to remote_ip_hash. It is possible to have multiple MAC/IP routes with the same IP address because of host move scenarios. Thus, for every address addr in remote_ip_hash, we maintain list of all the MAC/IP routes having addr as their IP address. Following structure defines an address in remote_ip_hash. struct evpn_remote_ip { struct ipaddr addr; struct list *macip_path_list; }; A Boolean field is added to struct bgp_nexthop_cache to indicate that the nexthop is EVPN gateway IP overlay index. bool is_evpn_gwip_nexthop; A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache. This flag is set when the gateway IP is L3 reachable but not yet resolved by a MAC/IP route. Following table explains the combination of L3 and L2 reachability w.r.t. BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags * | MACIP resolved | MACIP unresolved *----------------|----------------|------------------ * L3 reachable | VALID = 1 | VALID = 0 * | INCOMPLETE = 0 | INCOMPLETE = 1 * ---------------|----------------|-------------------- * L3 unreachable | VALID = 0 | VALID = 0 * | INCOMPLETE = 0 | INCOMPLETE = 0 Procedure that we use to check if the gateway IP is resolvable by a MAC/IP route: - Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash. - Check if the gateway IP is present in remote_ip_hash in this EVI. When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route, unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag. Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-01-11 12:51:56 +01:00
/*
* This flag is added for EVPN gateway IP nexthops.
* If the nexthop is RIB reachable, but a MAC/IP is not yet
* resolved, this flag is set.
* Following table explains the combination of L3 and L2 reachability w.r.t.
* VALID and INCOMPLETE flags
*
* | MACIP resolved | MACIP unresolved
*----------------|----------------|------------------
* L3 reachable | VALID = 1 | VALID = 0
* | INCOMPLETE = 0 | INCOMPLETE = 1
* ---------------|----------------|--------------------
* L3 unreachable | VALID = 0 | VALID = 0
* | INCOMPLETE = 0 | INCOMPLETE = 0
*/
#define BGP_NEXTHOP_EVPN_INCOMPLETE (1 << 8)
bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP When EVPN prefix route with a gateway IP overlay index is imported into the IP vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP. For this vrf route to be valid, following conditions must be met. - Gateway IP nexthop of this route should be L3 reachable, i.e., this route should be resolved in RIB. - A remote MAC/IP route should be present for the gateway IP address in the EVI(L2VPN table). To check for the first condition, gateway IP is registered with nht (nexthop tracking) to receive the reachability notifications for this IP from zebra RIB. If the gateway IP is reachable, zebra sends the reachability information (i.e., nexthop interface) for the gateway IP. This nexthop interface should be the SVI interface. Now, to find out type-2 route corresponding to the gateway IP, we need to fetch the VNI for the above SVI. To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with svi_ifindex as key. struct hash *vni_svi_hash; An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero. Using this hash, we obtain struct bgpevpn corresponding to the gateway IP. For gateway IP overlay index recursive lookup, once we find the correct EVI, we have to lookup its route table for a MAC/IP prefix. As we have to iterate the entire route table for every lookup, this lookup is expensive. We can optimize this lookup by adding all the remote IP addresses in a hash table. Following hash table is defined for this purpose in struct bgpevpn Struct hash *remote_ip_hash; When a MAC/IP route is installed in the EVI table, it is also added to remote_ip_hash. It is possible to have multiple MAC/IP routes with the same IP address because of host move scenarios. Thus, for every address addr in remote_ip_hash, we maintain list of all the MAC/IP routes having addr as their IP address. Following structure defines an address in remote_ip_hash. struct evpn_remote_ip { struct ipaddr addr; struct list *macip_path_list; }; A Boolean field is added to struct bgp_nexthop_cache to indicate that the nexthop is EVPN gateway IP overlay index. bool is_evpn_gwip_nexthop; A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache. This flag is set when the gateway IP is L3 reachable but not yet resolved by a MAC/IP route. Following table explains the combination of L3 and L2 reachability w.r.t. BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags * | MACIP resolved | MACIP unresolved *----------------|----------------|------------------ * L3 reachable | VALID = 1 | VALID = 0 * | INCOMPLETE = 0 | INCOMPLETE = 1 * ---------------|----------------|-------------------- * L3 unreachable | VALID = 0 | VALID = 0 * | INCOMPLETE = 0 | INCOMPLETE = 0 Procedure that we use to check if the gateway IP is resolvable by a MAC/IP route: - Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash. - Check if the gateway IP is present in remote_ip_hash in this EVI. When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route, unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag. Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-01-11 12:51:56 +01:00
uint32_t srte_color;
/* Back pointer to the cache tree this entry belongs to. */
struct bgp_nexthop_cache_head *tree;
struct prefix prefix;
struct prefix resolved_prefix;
void *nht_info; /* In BGP, peer session */
LIST_HEAD(path_list, bgp_path_info) paths;
unsigned int path_count;
struct bgp *bgp;
2002-12-13 21:15:29 +01:00
};
extern int bgp_nexthop_cache_compare(const struct bgp_nexthop_cache *a,
const struct bgp_nexthop_cache *b);
DECLARE_RBTREE_UNIQ(bgp_nexthop_cache, struct bgp_nexthop_cache, entry,
bgp_nexthop_cache_compare);
bgpd: Ignore EVPN routes from CLAG peer when VNI comes up There are two parts to this commit: 1. create a database of self tunnel-ip for used in martian nexthop check In a CLAG setup, the tunnel-ip (VNI UP) notification comes before the clag-anycast-ip comes up in the system. This was causing our self next hop check to fail and we were instaling routes with martian nexthop in zebra. We need to keep this info in a seperate database for all local tunnel-ip. This database will be used in parallel with the self next hop database to martian nexthop checks. 2. When a local VNI comes up, update the tunnel-ip database and filter routes in the RD table if necessary In case of EVPN we might receive routes from clag peer before the clag-anycast ip and VNI is up on the system. We will store the routes in the RD table for later processing. When VNI comes UP, we loop thorugh all the routes and install them in zebra if required. However, we were missing the martian nexthop check in this code path. From now onwards, when a VNI comes UP, we will first update the tunnel-ip database We then loop through all the routes in RD table and apply martian next hop filter if required. Things not covered in this commit but are required: This processing is needed in general when an address becomes a connected address. We need to loop through all the routes in BGP and apply martian nexthop filter if necessary. This will be taken care in a seperate bug Ticket:CM-17271/CM-16911 Reviewed By: ccr-6542 Testing Done: Manual Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-08-17 08:19:58 +02:00
/* Own tunnel-ip address structure */
struct tip_addr {
struct in_addr addr;
int refcnt;
};
/* Forward declaration(s). */
struct peer;
struct update_subgroup;
struct bgp_dest;
struct attr;
#define BNC_FLAG_DUMP_SIZE 180
extern char *bgp_nexthop_dump_bnc_flags(struct bgp_nexthop_cache *bnc,
char *buf, size_t len);
extern char *bgp_nexthop_dump_bnc_change_flags(struct bgp_nexthop_cache *bnc,
char *buf, size_t len);
extern void bgp_connected_add(struct bgp *bgp, struct connected *c);
extern void bgp_connected_delete(struct bgp *bgp, struct connected *c);
extern bool bgp_subgrp_multiaccess_check_v4(struct in_addr nexthop,
struct update_subgroup *subgrp,
struct peer *exclude);
extern bool bgp_subgrp_multiaccess_check_v6(struct in6_addr nexthop,
struct update_subgroup *subgrp,
struct peer *exclude);
extern bool bgp_multiaccess_check_v4(struct in_addr nexthop, struct peer *peer);
extern bool bgp_multiaccess_check_v6(struct in6_addr nexthop,
struct peer *peer);
extern int bgp_config_write_scan_time(struct vty *);
extern bool bgp_nexthop_self(struct bgp *bgp, afi_t afi, uint8_t type,
uint8_t sub_type, struct attr *attr,
struct bgp_dest *dest);
extern struct bgp_nexthop_cache *bnc_new(struct bgp_nexthop_cache_head *tree,
struct prefix *prefix,
uint32_t srte_color,
ifindex_t ifindex);
extern bool bnc_existing_for_prefix(struct bgp_nexthop_cache *bnc);
extern void bnc_free(struct bgp_nexthop_cache *bnc);
extern struct bgp_nexthop_cache *bnc_find(struct bgp_nexthop_cache_head *tree,
struct prefix *prefix,
uint32_t srte_color,
ifindex_t ifindex);
extern void bnc_nexthop_free(struct bgp_nexthop_cache *bnc);
extern void bgp_scan_init(struct bgp *bgp);
extern void bgp_scan_finish(struct bgp *bgp);
extern void bgp_scan_vty_init(void);
extern void bgp_address_init(struct bgp *bgp);
extern void bgp_address_destroy(struct bgp *bgp);
extern bool bgp_tip_add(struct bgp *bgp, struct in_addr *tip);
bgpd: Ignore EVPN routes from CLAG peer when VNI comes up There are two parts to this commit: 1. create a database of self tunnel-ip for used in martian nexthop check In a CLAG setup, the tunnel-ip (VNI UP) notification comes before the clag-anycast-ip comes up in the system. This was causing our self next hop check to fail and we were instaling routes with martian nexthop in zebra. We need to keep this info in a seperate database for all local tunnel-ip. This database will be used in parallel with the self next hop database to martian nexthop checks. 2. When a local VNI comes up, update the tunnel-ip database and filter routes in the RD table if necessary In case of EVPN we might receive routes from clag peer before the clag-anycast ip and VNI is up on the system. We will store the routes in the RD table for later processing. When VNI comes UP, we loop thorugh all the routes and install them in zebra if required. However, we were missing the martian nexthop check in this code path. From now onwards, when a VNI comes UP, we will first update the tunnel-ip database We then loop through all the routes in RD table and apply martian next hop filter if required. Things not covered in this commit but are required: This processing is needed in general when an address becomes a connected address. We need to loop through all the routes in BGP and apply martian nexthop filter if necessary. This will be taken care in a seperate bug Ticket:CM-17271/CM-16911 Reviewed By: ccr-6542 Testing Done: Manual Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-08-17 08:19:58 +02:00
extern void bgp_tip_del(struct bgp *bgp, struct in_addr *tip);
extern void bgp_tip_hash_init(struct bgp *bgp);
extern void bgp_tip_hash_destroy(struct bgp *bgp);
extern void bgp_nexthop_show_address_hash(struct vty *vty, struct bgp *bgp);
#endif /* _QUAGGA_BGP_NEXTHOP_H */