The bgp labelpool code is grabbing the vpn policy data structure.
This vpn_policy has a pointer to the bgp data structure. If
an item placed on the bgp label pool workqueue happens to sit
there for a microsecond or so and the operator issues a
`no router bgp...` command that corresponds to the vpn_policy's
bgp pointer, then when the workqueue is run it will crash, because
the bgp pointer has been freed and something else now owns that memory.
Modify the labelpool code to store the vrf id associated
with the request on the workqueue. When the workqueue item is
processed, allow the request to continue if the vrf id still
resolves to a bgp pointer; otherwise drop it.
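A minimal sketch of the wakeup-side check, assuming FRR's `bgp_lookup_by_vrf_id()` helper and the lib/workqueue callback shape; the item layout and function name are illustrative, not the exact `bgp_labelpool.c` code:
```c
/* Illustrative only: resolve the stored vrf_id back to a bgp instance
 * when the workqueue item is finally processed. */
static wq_item_status lp_cbq_docallback(struct work_queue *wq, void *data)
{
	struct lp_cbq_item *lcb = data;   /* hypothetical item layout */
	struct bgp *bgp = bgp_lookup_by_vrf_id(lcb->vrf_id);

	if (!bgp)
		return WQ_SUCCESS;   /* instance is gone: silently drop the request */

	/* ... bgp is still valid here, run the original callback ... */
	return WQ_SUCCESS;
}
```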
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
For JSON output we don't need a trailing newline to be printed.
Before:
```
"lastUpdate":{"epoch":1734490463,"string":"Wed Dec 18 04:54:23 2024\n"
```
After:
```
"lastUpdate":{"epoch":1734678560,"string":"Fri Dec 20 09:09:20 2024"
```
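A minimal sketch of the kind of fix involved (not the literal patch): strip the newline that a ctime-style formatter appends before handing the string to the JSON helpers.
```c
/* Illustrative only: drop the trailing '\n' that ctime_r() appends,
 * before adding the string to the JSON object. 'epoch' and
 * 'json_last_update' are assumed to exist in scope. */
char timebuf[32];

ctime_r(&epoch, timebuf);
timebuf[strcspn(timebuf, "\n")] = '\0';
json_object_string_add(json_last_update, "string", timebuf);
```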
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
There are several problems with the bgp_sync_label_manager
function:
a) It is possible that a request in the lp->requests
fifo cannot be filled at this point in time, in which case
the lf will be leaked and never fulfilled.
b) bgp_sync_label_manager runs once a second regardless
of whether there is any work to do.
To fix (a), just add the request back to the requests
fifo and set the timer to pop again in the future.
To fix (b), start a 1-second timer every time something
is put into the request pool, and do not restart it once
all the work is done.
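A rough sketch of the re-arm pattern, using FRR's `event_add_timer()`; the surrounding names (`lp`, `lp_fifo_count()`, the event pointer) are assumptions for illustration:
```c
/* Illustrative only: service the request fifo, and re-arm the
 * 1-second timer only while requests remain outstanding. */
static struct event *t_bgp_sync_label_manager;

static void bgp_sync_label_manager(struct event *e)
{
	/* ... try to satisfy requests from lp->requests here ... */

	if (lp_fifo_count(&lp->requests))
		event_add_timer(bm->master, bgp_sync_label_manager, NULL, 1,
				&t_bgp_sync_label_manager);
}
```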
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If BGP starts with an l3vpn configuration, the 'pending' value
of the 'show bgp labelpool summary' command is set to 128,
whereas the 'pending' value is 0 if the l3vpn configuration is
applied afterwards.
with no config at startup:
> show bgp labelpool summary
> Labelpool Summary
> -----------------
> Ledger: 1
> InUse: 1
> Requests: 0
> LabelChunks: 1
> Pending: 0
> Reconnects: 1
with config at startup:
> show bgp labelpool summary
> Labelpool Summary
> -----------------
> Ledger: 1
> InUse: 1
> Requests: 0
> LabelChunks: 1
> Pending: 128
> Reconnects: 1
When BGP configuration is applied at startup, the label request fails,
because the zapi connection with zebra is not yet up. At the zebra
up event, the label request is done again, succeeds, decrements the
'pending_count' value in the 'bgp_lp_event_chunk()' function, then sets
the 'pending_count' value to the 'labels_needed' value.
This method was correct when label requests were asynchronous: the
'pending_count' value was first set, then decremented. With synchronous
label requests, the operations are swapped.
Fix this by incrementing 'pending_count' by the expected 'labels_needed'
value instead of overwriting it.
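A minimal sketch of the idea (field names assumed, not the literal patch):
```c
/* Illustrative only: accumulate into the counter rather than
 * overwriting it, so the decrement already done by
 * bgp_lp_event_chunk() is not lost. */
lp->pending_count += labels_needed;   /* was: lp->pending_count = labels_needed; */
```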
Fixes: 0043ebab99 ("bgpd: Use synchronous way to get labels from Zebra")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A label chunk is used by BGP for L3VPN or LU purposes,
by picking up labels from that chunk; but when those
labels are released, the label chunks themselves are never released.
The below configuration sequence shows that the label
chunks are not released.
> router bgp 65500
> bgp router-id 1.1.1.1
> !
> address-family ipv4 unicast
> label vpn export auto
> rd vpn export 55:1
> rt vpn both 55:1
> export vpn
> import vpn
> [..]
> no label vpn export auto
> [..]
> # show bgp labelpool summary
> [..]
> LabelChunks: 1
> Pending: 128
> [..]
The '128' value stands for the default label chunk size,
which is not released after unconfiguration.
Fix this by checking, after each label release, whether
the label chunk is still in use. If not, release it and
reset the 'next_chunksize' value to the default value.
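A rough sketch of the release path, assuming FRR's list helpers and `lm_release_label_chunk()`; the chunk field names and the `zclient_sync` handle are assumptions:
```c
/* Illustrative only: after a label is returned to the pool, give any
 * fully free chunk back to zebra's label manager. */
static void lp_release_unused_chunks(void)
{
	struct listnode *node, *nnode;
	struct lp_chunk *chunk;

	for (ALL_LIST_ELEMENTS(lp->chunks, node, nnode, chunk)) {
		uint32_t size = chunk->last - chunk->first + 1;

		if (chunk->nfree != size)
			continue;   /* some labels still handed out */

		lm_release_label_chunk(zclient_sync, chunk->first, chunk->last);
		list_delete_node(lp->chunks, node);
		/* free the chunk structure as appropriate */
	}
}
```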
Fixes: 955bfd984f ("bgpd: dynamic mpls label pool")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
BGP always asks zebra for a chunk of MPLS labels even if it doesn't need one.
Fix this by correcting the rounding-up formula for "labels_needed".
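A small sketch of the intended ceiling rounding (variable names assumed):
```c
/* Illustrative only: round up to whole chunks without requesting an
 * extra chunk when labels_needed is already a multiple of the chunk
 * size (or zero). */
uint32_t chunks_needed =
	(labels_needed + lp->next_chunksize - 1) / lp->next_chunksize;
uint32_t labels_to_request = chunks_needed * lp->next_chunksize;
```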
Fixes: 80853c2ec7 ("bgpd: improve labelpool performance at scale")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Today, when configuring BGP L3VPN mpls, the operator may
use the following command to hardset a label value:
> router bgp 65500 vrf vrf1
> address-family ipv4 unicast
> label vpn export <hardset_label_value>
Today, BGP uses this value without checks, leading to potential
conflicts with other control planes like LDP. For instance, if
LDP initiates with a label chunk of [16;72] and BGP also uses the
50 label value, a conflict arises.
The 'label manager' service in zebra oversees label allocations.
While all the control plane daemons use it, BGP doesn't when a
hardset label is in place.
This update fixes this problem. Now, when a hardset label is set for
l3vpn export, a request is made to the label manager for approval,
ensuring no conflicts with other daemons. However, this means some existing
BGP configurations might become non-operational if they conflict with
labels already allocated to another daemon, even if not actually used.
note: Labels below 16 are reserved and won't be checked for consistency
by the label manager.
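A minimal sketch of the approval request, assuming the synchronous `lm_get_label_chunk()` zapi helper; the variables are illustrative:
```c
/* Illustrative only: ask zebra's label manager to reserve exactly the
 * hardset label so a clash with another daemon is detected up front. */
uint32_t start, end;

if (lm_get_label_chunk(zclient_sync, 1 /* keep */, hardset_label,
		       1 /* chunk_size */, &start, &end) != 0
    || start != hardset_label)
	zlog_warn("%s: label %u could not be reserved from the label manager",
		  __func__, hardset_label);
```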
Fixes: ddb5b4880b ("bgpd: vpn-vrf route leaking")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Both the label manager and table manager zapi code send data requests via zapi
to zebra and then immediately listen for a response from zebra. The problem here
is, of course, that the listening code throws away any zapi command that is not
the one it is looking for.
ISIS, OSPF and PIM all have synchronous abilities via zapi, which they
implement through a special zapi connection to zebra. BGP needs to follow this
model as well. Additionally, for the new zclient_sync connection that is
created, a once-a-second timer should wake up and read any data on the socket,
to prevent problems from too much data accumulating in the socket.
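A rough sketch of the pattern against current FRR zclient/event APIs; the function and variable names are assumptions, not the exact bgpd code:
```c
/* Illustrative only: a dedicated synchronous zclient session, plus a
 * once-a-second timer that drains anything left on its socket. */
static struct zclient *zclient_sync;
static struct event *t_sync_read;

static void bgp_zebra_sync_read(struct event *e)
{
	/* read and discard any stale data pending on zclient_sync->sock,
	 * then re-arm the once-a-second timer */
	event_add_timer(bm->master, bgp_zebra_sync_read, NULL, 1, &t_sync_read);
}

static void bgp_zebra_sync_init(void)
{
	zclient_sync = zclient_new(bm->master, &zclient_options_default, NULL, 0);
	zclient_sync->session_id = 1;   /* separate session from the main zclient */
	zclient_socket_connect(zclient_sync);
	event_add_timer(bm->master, bgp_zebra_sync_read, NULL, 1, &t_sync_read);
}
```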
```
r3# sh bgp labelpool summary
Labelpool Summary
-----------------
Ledger: 3
InUse: 3
Requests: 0
LabelChunks: 1
Pending: 128
Reconnects: 1
r3# sh bgp labelpool inuse
Prefix Label
---------------------------
10.0.0.1/32 16
192.168.31.0/24 17
192.168.32.0/24 18
r3#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When advertising an mpls vpn entry with a new label,
the return traffic is redirected to the local machine,
but the MPLS traffic is dropped.
Add an MPLS entry to handle MPLS packets which carry
the new label value. The label is swapped to the original
label value from the mpls vpn next-hop entry; the packet is then
sent to the resolved next-hop of the original next-hop
from the mpls vpn next-hop entry.
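For illustration, the installed LSP entry behaves like the following iproute2 label-swap route (all values made up): the locally allocated label 105 is swapped to the original label 500 and forwarded to the resolved next-hop:
> ip -f mpls route add 105 as 500 via inet 192.0.2.45 dev r1-eth1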
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The label per nexthop attributes take 24 bytes per bgp path
entry on the AMD64 platform, and are only used for unicast paths.
The current patch-set introduces similar attributes, but they
will be used only for l3vpn paths. To save some memory in the
bgp_path_info structure in the next commit, do some changes.
Create an 'mplsvpn' union structure that will include either the
label per nexthop structs for ipv4 paths, or the l3vpn path
structures. The 'label_nexthop_cache' and the 'label_nh_thread'
attributes of the 'bgp_path_info' structure are moved into a
union under a new structure called 'bgp_mplsvpn_label_nh_blnc'.
The flags attribute of 'bgp_path_info' is increased from 16 bits
to 32 bits, and the BGP_PATH_MPLSVPN_LABEL_NH flag is added to
indicate the 'mplsvpn' usage.
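A sketch of the general shape of the change, with member names assumed for illustration:
```c
/* Illustrative only: a flag on bgp_path_info indicates which member
 * of the 'mplsvpn' union is in use. */
struct bgp_mplsvpn_label_nh_blnc {
	struct bgp_label_per_nexthop_cache *label_nexthop_cache;
	struct event *label_nh_thread;
};

struct bgp_path_info {
	/* ... */
	uint32_t flags;   /* widened from 16 to 32 bits */
	union {
		/* BGP_PATH_MPLSVPN_LABEL_NH unset: unicast per-nexthop label */
		struct bgp_mplsvpn_label_nh_blnc blnc;
		/* BGP_PATH_MPLSVPN_LABEL_NH set: l3vpn members (next commit) */
	} mplsvpn;
};
```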
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Redistributing mpls vpn prefixes with a new label
requires picking up unused MPLS local labels.
Today, there is an MPLS label pool which is owned
by the zebra daemon. BGP gets a chunk of labels
that is shared among the various uses inside the BGP
daemon. A label type attribute is used whenever
BGP needs a new label value.
The 'LP_TYPE_BGP_L3VPN_BIND' label type will be used
to allocate the MPLS labels that will be bound to
the original next-hop and label value of an L3VPN
update.
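A minimal sketch of how such a typed request might look, assuming the existing `bgp_lp_get()` entry point; the callback name and the `blnc` context pointer are illustrative:
```c
/* Illustrative only: request a dynamic label tagged with the new type;
 * the callback runs once the pool can hand out a label. */
static int bgp_mplsvpn_label_bind_cb(mpls_label_t label, void *labelid,
				     bool allocated)
{
	/* bind 'label' to the original next-hop/label of the l3vpn update */
	return 0;
}

/* somewhere on the l3vpn redistribution path: */
	bgp_lp_get(LP_TYPE_BGP_L3VPN_BIND, blnc, bgp_mplsvpn_label_bind_cb);
```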
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following command is made available to list the labels
allocated per-nexthop, along with the paths registered to each of them.
> # show bgp vrf vrf1 label-nexthop
> Current BGP label nexthop cache for IP, VRF vrf1
> 192.0.2.11, label 20 #paths 3
> if r1-eth1
> Last update: Mon Jan 16 18:52:11 2023
> 192.0.2.12, label 17 #paths 2
> if r1-eth1
> Last update: Mon Jan 16 18:52:08 2023
> 192.0.2.14, label 18 #paths 1
> if r1-eth1
> Last update: Mon Jan 16 18:52:07 2023
> 192.168.255.13, label 19 #paths 1
> if r1-eth2
> Last update: Mon Jan 16 18:52:10 2023
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
BGP MPLSVPN next hop label allocation was using only the next-hop
IP address. As MPLSVPN contexts rely on bnc contexts, the real
nexthop interface is known, and the LSP entry to install can apply
to the specific interface. To illustrate, the BGP service is able
to handle the following two iproute2 commands:
> ip -f mpls route add 105 via inet 192.0.2.45 dev r1-eth1
> ip -f mpls route add 105 via inet 192.0.2.46 dev r1-eth2
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit introduces the necessary structs and APIs to
create the cache entries that store the label information
associated with a given nexthop.
A hash table is created in each BGP instance for both
AFIs, IPv4 and IPv6, and is initialised.
An API is added to look up and/or create an entry based on a given
nexthop.
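A rough sketch of the lookup-or-create pattern on FRR's hash API; the structure and function names are assumptions for illustration:
```c
/* Illustrative only: create the entry on first use via hash_get()'s
 * allocator callback, otherwise return the existing one. */
static void *blnc_alloc(void *arg)
{
	struct bgp_label_per_nexthop_cache *lookup = arg;
	struct bgp_label_per_nexthop_cache *blnc;

	blnc = XCALLOC(MTYPE_TMP, sizeof(*blnc));
	prefix_copy(&blnc->nexthop, &lookup->nexthop);
	return blnc;
}

static struct bgp_label_per_nexthop_cache *
blnc_find_or_create(struct hash *table, const struct prefix *nexthop)
{
	struct bgp_label_per_nexthop_cache lookup = {};

	prefix_copy(&lookup.nexthop, nexthop);
	return hash_get(table, &lookup, blnc_alloc);
}
```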
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new label type is introduced: LP_TYPE_NEXTHOP. This new
label type will be used in the next commits to allocate labels
for a specific nexthop IP address.
The commit also adds vty and JSON output to display
the new label type and the associated label values.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the impression that
this event system is a pthread, when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Rather than running selected source files through the preprocessor and a
bunch of perl regex'ing to get the list of all DEFUNs, use the data
collected in frr.xref.
This not only eliminates issues we've been having with preprocessor
failures due to nonexistent header files, but is also much faster.
Where extract.pl would take 5s, this now finishes in 0.2s. And since
this is a non-parallelizable build step towards the end of the build
(dependent on a lot of other things being done already), the speedup is
actually noticeable.
Also files containing CLI no longer need to be listed in `vtysh_scan`
since the .xref data covers everything. `#ifndef VTYSH_EXTRACT_PL`
checks are equally obsolete.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
- double the size of each new chunk request from zebra
- use bitfields to track label allocations in a chunk (see the sketch after this list)
- When allocating:
- skip chunks with no free labels
- search biggest chunks first
- start search in chunk where last search ended
- Improve API documentation in comments (bgp_lp_get() and callback)
- Tweak formatting of "show bgp labelpool chunks"
- Add test features (compiled conditionally on BGP_LABELPOOL_ENABLE_TESTS)
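A small sketch of the bitfield-based per-chunk tracking, using FRR's lib/bitfield.h; the field and function names are assumptions:
```c
#include <assert.h>
#include "lib/bitfield.h"

/* Illustrative only: one bit per label in the chunk, plus a free
 * counter so allocation can skip exhausted chunks cheaply. */
struct lp_chunk {
	uint32_t first;             /* first label in the chunk */
	uint32_t last;              /* last label in the chunk */
	uint32_t nfree;             /* labels still available */
	bitfield_t allocated_map;   /* set bit => label handed out */
};

static uint32_t lp_chunk_alloc_one(struct lp_chunk *chunk)
{
	uint32_t index;

	assert(chunk->nfree);   /* caller skips full chunks */
	bf_assign_index(chunk->allocated_map, index);
	chunk->nfree--;
	return chunk->first + index;
}
```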
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
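For illustration, the C11 trick looks roughly like this (macro name invented):
```c
/* Illustrative only: the _Static_assert() at the end of the macro is a
 * declaration, so the user's trailing ';' is required and consumed
 * instead of producing a "spurious semicolon" warning. */
#define DEFINE_HANDLER(name)                                                   \
	static int name##_handler(void)                                        \
	{                                                                      \
		return 0;                                                      \
	}                                                                      \
	_Static_assert(1, "please use a semicolon after this macro")

DEFINE_HANDLER(foo);
```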
Signed-off-by: David Lamparter <equinox@diac24.net>
The check for the return code for zclient_send_get_label_chunk is
reversed and therefore the pending count does not get incremented
for each successful label chunk request.
This has the effect of requesting a 50-label chunk per label request
from BGP, i.e. we request 50 times the labels we require.
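A minimal sketch of the corrected check (surrounding names assumed):
```c
/* Illustrative only: count the request as pending only when the send
 * to zebra did not fail; the original condition was inverted. */
if (zclient_send_get_label_chunk(zclient, 0, LP_CHUNK_SIZE,
				 MPLS_LABEL_BASE_ANY)
    != ZCLIENT_SEND_FAILURE)
	lp->pending_count += LP_CHUNK_SIZE;
```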
Signed-off-by: Pat Ruddy <pat@voltanet.io>
To prepare for fixing an issue where labels do not get released back
to the labelpool when the route is deleted some refactoring is
necessary. There are 2 parts to this.
1. restructure the code to remove the circular nature of label
allocations via the labelpool and decouple the label type decision
from the notification of the FEC.
The code to notify the FEC association to zebra has been split out
into a separate function so that it can be called from the synchronous
path (for registration of index-based labels and de-registration of all
labels), and from the asynchronous path where we need to wait for a
callback from the labelpool code with a label allocation.
The decision about whether we are using an index-based label or an
allocated label is reflected in the state of the BGP_NODE_LABEL_REQUESTED
flag so the checks on the path_info in the labelpool callback code are
no longer required.
2. change the owner of a labelpool-allocated label from the path info
structure to the bgp_dest structure. This allows labels to be released
(in a subsequent commit) when the owner (bgp_dest) goes away.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
When the path info is queued on the work queue, it
is protected by a lock to avoid the rug being pulled from under it
whilst it resides on the queue. Add an unlock in the error case
where we do not queue the reference onto the workqueue.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
The `enum zclient_send_status` usage needs to be extended
throughout the code base to use the new states and
to fix up places where we tested against the return
value being non-zero.
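For context, the send status values are roughly as below; a buffered send is non-zero yet is not a failure, which is why testing for "non-zero" is wrong:
```c
/* Illustrative reminder of the tri-state result returned by the
 * zclient send path. */
enum zclient_send_status {
	ZCLIENT_SEND_FAILURE = -1,
	ZCLIENT_SEND_SUCCESS = 0,
	ZCLIENT_SEND_BUFFERED = 1,
};
```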
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
For SRGB, we need to support chunk requests starting at a
specific point in the label space, rather than just asking
for any sufficiently large chunk. To this purpose, we extend
the label manager api to request a chunk with a base value;
if the base is set to 0, the label manager will behave as it
currently does, i.e. fetching the first free chunk big enough
to satisfy the request.
Update all the existing calls to get chunks from the label
manager so that they use MPLS_LABEL_BASE_ANY as the base
for the requested chunk.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Under some conditions, the callback to get a label for
an LU bgp path could be called after the path had already
been freed. In this case we would be reading garbage
and potentially crash. Lock the path info before
queueing the callback, and unlock as the first step
of the callback, exiting gracefully if the path info
is now NULL.
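A minimal sketch of the lock-around-the-queue pattern, assuming bgpd's `bgp_path_info_lock()`/`bgp_path_info_unlock()` reference counting; the surrounding names are illustrative:
```c
/* Illustrative only: hold a reference while the label request sits in
 * the labelpool queue. */
	bgp_path_info_lock(pi);
	bgp_lp_get(LP_TYPE_BGP_LU, pi, lu_label_callback);

/* first step of the callback: */
	pi = bgp_path_info_unlock(pi);
	if (!pi)
		return 0;   /* path was freed while the request was queued */
```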
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Again, the FIFO_* stuff in lib/fifo.h is no different from a simple
unsorted list. Just use DECLARE_LIST here so we can get rid of FIFO_*.
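A small sketch of the typesafe-list replacement, with the type and field names assumed for illustration:
```c
#include "lib/typesafe.h"

/* Illustrative only: an unsorted typesafe list used as a FIFO. */
PREDECL_LIST(lp_fifo);

struct lp_fifo {
	struct lp_fifo_item fifo;   /* list linkage */
	/* ... payload ... */
};

DECLARE_LIST(lp_fifo, struct lp_fifo, fifo);

/* usage: lp_fifo_add_tail(&lp->requests, lf);
 *        lf = lp_fifo_pop(&lp->requests); */
```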
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>