Commit graph

37096 commits

Author SHA1 Message Date
Jafar Al-Gharaibeh e3b6a8c459 FRR Release 10.2.2
Changelog:

bgpd
    Allow bfd to work if peer known but interface address not yet
    Apply route-map for aggregate before attribute comparison
    Do not ignore auto generated vrf instances when deleting
    Do not start bgp session if bgp identifier is not set
    Do not try to uninstall bfd session if the peer is not established
    Don't reuse nexthop variable in loop/switch
    Fix a bug in peer_allowas_in_set()
    Fix add label support to evpn ad routes
    Fix bfd with update-source in peer-group
    Fix bgp label evpn cid 1636504
    Fix bgp orf prefix-list json prefix
    Fix bgp peer solo option
    Fix bgp vrf instance creation from implicit
    Fix crash in bgp_labelpool
    Fix crash in displaying json orf prefix-list
    Fix deadlock in bgp_keepalive and master pthreads
    Fix duplicate bgp instance created with unified config
    Fix for local interface mac cache issue in 'bgp mac hash' table
    Fix import vrf creates multiple bgp instances
    Fix incorrect json in bgp_show_table_rd
    Fix memory leak in bgp_aggregate_install()
    Fix route-distinguisher in vrf leak json cmd
    Fix static analyzer issues around bgp pointer
    Fix table-map option
    Fix vty output of evpn route-target as4
    Fix wrong pthread event cancelling
    Remove dmed check not required in bestpath selection
    Request srv6 locator after zebra connection
    Reset bgp session only if it was a real bfd down event
    Respect allowas-in value from the source vrf's peer
    Simplify bgp_evpn_process_rt1 with label
    Update source address for bfd session
    Use igpmetric in bgp_aigp_metric_total()
    When bgp notices a change to shared_network inform bfd of it
    When removing the prefix list drop the pointer
    With suppress-fib-pending ensure withdrawal is sent
    Revert: Handle addpath capability using dynamic capabilities
    Revert: Reinstall aggregated routes if using route-maps and it was changed

isisd
    Add helper function to request srv6 locator information
    Allow full `no` form for `domain-password` and `area-password`
    Correct edge insertion into ted
    Request srv6 locator after zebra connection
    Show correct level information for `show isis interface detail json`

lib
    Clean up nexthop hashing mess
    Crash handlers must be allowed on threads
    Fix false context information for srv6 route
    Guard against padding garbage in zapi read
    Nb: call child destroy cbs when yang container is deleted

mgmtd
    Prevent use after free

nhrpd
    Fix dont consider incomplete l2 entry

ospf6d
    Fix use after free of router in ospfv3 abr route calculation.

pbrd
    Initialize structs used in hash_lookup

pimd
    Always write cand-rp group config even when rp is inactive
    Close autorp socket when not needed
    During prefix-list update, behave as pim_upstream_notjoined state (conformance issue)
    Explicitly ensure the rp src is bsr
    Fix autorp group joins
    Fix bsr rps timing out
    Fix dr election race on startup
    Fix for data packet loss when fhr is lhr and rp
    Fix for fhr mroute taking longer to age out
    Fix memory leak and assign allocation type
    Fix pim vrf support (send register/register stop in vrf)
    Fix pim6 mld vrf support (use recvmsg() pktinfo)
    Fix vrf binding of autorp and mroute socket

tests
    Add a test that shows the v6 recursive nexthop problem
    Bgp_srv6_sid_reachability should give more time
    Bgp_srv6l3vpn_to_bgp_vrf3 needs more time
    Check if allow as-in works when importing between local vrfs

tools
    Add missing formats keyword to segment-routing in frr-reload
    Add missing rpki keyword to vrf in frr-reload
    Fix frr-reload for ebgp-multihop ttl reconfiguration.

zebra
    Ensure dplane does not send work back to master at wrong time
    Evpn svd hash avoid double free
    Fix leaked nhe
    Fix resetting valid flags for nhg dependents
    Guard against junk in nexthop->rmap_src
    Include resolving nexthops in nhg hash

Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
2025-03-09 23:58:21 -05:00
Donald Sharp aa86d77e45
Merge pull request #18333 from FRRouting/mergify/bp/stable/10.2/pr-18315
pimd: Fix PIM6 MLD VRF support (use recvmsg() pktinfo) (backport #18315)
2025-03-06 20:32:39 -05:00
Martin Buck 3802a0b709 pimd: Fix PIM6 MLD VRF support (use recvmsg() pktinfo)
When receiving MLD messages, prefer pktinfo over msghdr.msg_name for
determining the source interface. The latter is just the VRF master
interface in case of VRF and we need the true interface the packet was
received on instead.
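
The pktinfo lookup described above can be sketched as follows. This is a minimal illustration, assuming a Linux/glibc environment; `rx_ifindex_from_pktinfo()` is a hypothetical helper name and not the function used in pimd.

```c
#define _GNU_SOURCE /* exposes struct in6_pktinfo on glibc */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not FRR code: return the interface index carried in
 * the IPV6_PKTINFO control message of a received datagram, or 0 if the
 * message carries none. The kernel only attaches this control message when
 * IPV6_RECVPKTINFO has been enabled on the socket with setsockopt().
 */
static unsigned int rx_ifindex_from_pktinfo(struct msghdr *msg)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL);
	for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IPV6 &&
		    cmsg->cmsg_type == IPV6_PKTINFO) {
			struct in6_pktinfo pi;

			/* Copy out: CMSG_DATA() is not guaranteed to be aligned. */
			memcpy(&pi, CMSG_DATA(cmsg), sizeof(pi));
			return pi.ipi6_ifindex;
		}
	}
	return 0; /* no pktinfo present; caller may fall back to msg_name */
}
```

A caller would pass the `struct msghdr` filled in by `recvmsg()` and only fall back to the address in `msg_name` when the helper returns 0.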

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
(cherry picked from commit 374c8dc4db)
2025-03-06 18:43:38 +00:00
Donald Sharp 9ffd44779b
Merge pull request #18299 from FRRouting/mergify/bp/stable/10.2/pr-18294
isisd: Correct edge insertion into TED (backport #18294)
2025-03-03 10:35:07 -05:00
Olivier Dugeon 4e50885f61 isisd: Correct edge insertion into TED
Edges are not correctly linked to vertices during LSP processing. In function
lsp_to_edge_cb(), once the edge is created or updated from the LSP TLVs, the
code tries to link the edge to the destination vertex. If the reverse edge is
not found, the code tries to find a destination vertex to link to. But the
sys_id used for this operation corresponds to the source vertex. As a result,
the edge is attached as both source and destination of the same vertex. When
Traffic Engineering is stopped, the TED is deleted, which results in a double
free of the edge attributes. This causes a crash when attempting to free the
extended admin group a second time.

This patch removes the wrong code, which linked the edge to the source vertex twice.

Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
(cherry picked from commit 605fc1dd64)
2025-03-03 12:40:32 +00:00
Jafar Al-Gharaibeh 53c938b11c
Merge pull request #18280 from FRRouting/mergify/bp/stable/10.2/pr-18264
mgmtd: Prevent use after free (backport #18264)
2025-03-01 15:00:40 -06:00
Donald Sharp b4f3760f86 mgmtd: Prevent use after free
ci is picking up this use after free on occasion:

    ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x6030001d94a0
        0 0x7fab994b7f04 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:119
        1 0x7fab994264f6 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:131
        2 0x7fab994264f6 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1058
        3 0x7fab99039bcf in mt_count_free lib/memory.c:78
        4 0x7fab99039bcf in qfree lib/memory.c:130
        5 0x7fab98ff971a in hash_clean lib/hash.c:290
        6 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
        7 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
        8 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
        9 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
        10 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
        11 0x7fab990ea7a1 in event_fetch lib/event.c:1740
        12 0x7fab9901a24e in frr_run lib/libfrr.c:1245
        13 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        14 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        15 0x7fab98af9304 in __libc_start_main_impl ../csu/libc-start.c:360
        16 0x56110cd8dd30 in _start (/usr/lib/frr/mgmtd+0x3ad30)

    0x6030001d94a0 is located 0 bytes inside of 24-byte region [0x6030001d94a0,0x6030001d94b8)
    freed by thread T0 here:
        0 0x7fab994b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
        1 0x7fab99039bf0 in qfree lib/memory.c:131
        2 0x7fab98ff93e1 in hash_release lib/hash.c:227
        3 0x56110cdaabdc in mgmt_txn_unlock mgmtd/mgmt_txn.c:1931
        4 0x56110cdab049 in mgmt_txn_delete mgmtd/mgmt_txn.c:1841
        5 0x56110cdab0ce in mgmt_txn_hash_free mgmtd/mgmt_txn.c:1864
        6 0x7fab98ff970b in hash_clean lib/hash.c:288
        7 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
        8 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
        9 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
        10 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
        11 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
        12 0x7fab990ea7a1 in event_fetch lib/event.c:1740
        13 0x7fab9901a24e in frr_run lib/libfrr.c:1245
        14 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        15 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

    previously allocated by thread T0 here:
        0 0x7fab994b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
        1 0x7fab990392fd in qcalloc lib/memory.c:106
        2 0x7fab98ff8b4f in hash_get lib/hash.c:156
        3 0x56110cdb13ae in mgmt_txn_create_new mgmtd/mgmt_txn.c:1825
        4 0x56110cdb3b4d in mgmt_txn_notify_be_adapter_conn mgmtd/mgmt_txn.c:2212
        5 0x56110cd91178 in mgmt_be_adapter_conn_init mgmtd/mgmt_be_adapter.c:842
        6 0x7fab990ec6de in event_call lib/event.c:2019
        7 0x7fab9901a243 in frr_run lib/libfrr.c:1246
        8 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
        9 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

The only time that mgmt_txn_hash_free is called is in hash_clean.
There are other places that mgmt_txn_unlock/delete are called and
hash_release should be called.  Let's just notice when mgmtd is
being called from the hash_clean and not call hash_release (since
we know it is being released already)
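
The idea can be sketched with a toy table. This is an illustration under stated assumptions, not the mgmtd code: `table_clean()`, `table_release()` and `txn_delete()` are hypothetical stand-ins for hash_clean(), hash_release() and mgmt_txn_delete(), and the `in_table_clean` flag is an invented name for "we are being called from the clean path".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SLOTS 4

struct txn { int id; };
struct bucket { struct txn *data; };	/* the table owns each bucket */

static struct bucket *slots[SLOTS];
static int buckets_freed;
static int txns_freed;

/* Allocate a transaction and a bucket for it in the first free slot. */
static struct txn *txn_new(int id)
{
	struct txn *t = calloc(1, sizeof(*t));

	assert(t != NULL);
	t->id = id;
	for (int i = 0; i < SLOTS; i++) {
		if (slots[i] == NULL) {
			slots[i] = calloc(1, sizeof(*slots[i]));
			assert(slots[i] != NULL);
			slots[i]->data = t;
			return t;
		}
	}
	assert(!"table full");
	return NULL;
}

/* Remove one entry: frees the bucket, leaves the data to the caller. */
static void table_release(struct txn *t)
{
	for (int i = 0; i < SLOTS; i++) {
		if (slots[i] != NULL && slots[i]->data == t) {
			free(slots[i]);
			slots[i] = NULL;
			buckets_freed++;
			return;
		}
	}
}

/* Skip the release when the clean path is about to free the bucket itself. */
static void txn_delete(struct txn *t, bool in_table_clean)
{
	if (!in_table_clean)
		table_release(t);
	free(t);
	txns_freed++;
}

/* Tear down the whole table: free each entry's data, then its bucket. */
static void table_clean(void)
{
	for (int i = 0; i < SLOTS; i++) {
		if (slots[i] == NULL)
			continue;
		txn_delete(slots[i]->data, true);
		free(slots[i]);
		slots[i] = NULL;
		buckets_freed++;
	}
}
```

Calling `txn_delete(t, false)` from inside `table_clean()` would free the bucket once in `table_release()` and then touch it again in `table_clean()`, which is the pattern the sanitizer report above shows.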

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 62f35c7bdb)
2025-02-27 20:40:35 +00:00
Jafar Al-Gharaibeh 8f85bc932a
Merge pull request #18266 from FRRouting/mergify/bp/stable/10.2/pr-18254
ospf6d: Fix use after free of router in OSPFv3 ABR route calculation. (backport #18254)
2025-02-27 10:12:52 -06:00
Acee Lindem 0d7236c2d2 ospf6d: Fix use after free of router in OSPFv3 ABR route calculation.
This PR fixes FRR issue https://github.com/FRRouting/frr/issues/18040. The
OSPFv3 route is locked during the ABR calculation since there are
scenarios under which it is freed. The OSPFv3 ABR computation is
sub-optimal and this PR doesn't attempt to rework it.
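
Holding a lock (reference) across a computation can be sketched as below. This is a generic reference-counting illustration, not the ospf6d code: `struct route`, `route_lock()`, `route_unlock()` and `abr_calc()` are hypothetical names, and the `table_drops_route` parameter simulates the scenario in which the route is removed mid-calculation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct route {
	int refcount;
	int cost;
};

static int routes_freed;

/* New routes start with one reference, held by the routing table. */
static struct route *route_new(int cost)
{
	struct route *r = calloc(1, sizeof(*r));

	assert(r != NULL);
	r->refcount = 1;
	r->cost = cost;
	return r;
}

static struct route *route_lock(struct route *r)
{
	r->refcount++;
	return r;
}

/* Drop one reference; the route is freed when the last one goes away. */
static void route_unlock(struct route *r)
{
	assert(r->refcount > 0);
	if (--r->refcount == 0) {
		free(r);
		routes_freed++;
	}
}

/* Take our own reference so the route outlives a removal from the table. */
static int abr_calc(struct route *r, bool table_drops_route)
{
	int cost;

	route_lock(r);
	if (table_drops_route)
		route_unlock(r);	/* the table gives up its reference */
	cost = r->cost;			/* still valid: we hold a reference */
	route_unlock(r);
	return cost;
}
```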

Signed-off-by: Acee Lindem <acee@lindem.com>
(cherry picked from commit 06af50eace)
2025-02-26 17:48:49 +00:00
Donald Sharp b26f7fba92
Merge pull request #18251 from nabahr/pr-18225-10.2-backport-fixed
pim: Fix autorp group joins (backport #18225)
2025-02-25 10:21:16 -05:00
Donald Sharp 483b89751f
Merge pull request #18252 from nabahr/pr-18226-10.2-backport-fixed
pim: Fix vrf binding of autorp and mroute socket (backport #18226)
2025-02-25 10:20:46 -05:00
Jafar Al-Gharaibeh bedc596a5e
Merge pull request #18249 from FRRouting/mergify/bp/stable/10.2/pr-18216
pimd: Fix PIM VRF support (send register/register stop in VRF) (backport #18216)
2025-02-24 15:33:54 -06:00
Nathan Bahr 06956e881c pim: Fix vrf binding of autorp and mroute socket
Bind the autorp socket to the vrf device.
Also fixed mroute socket to use vrf_bind instead of directly
setting the socket option.

Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit 7e181a771c)

Fixed merge conflicts
2025-02-24 20:23:52 +00:00
Nathan Bahr bc29012863 pim: Fix autorp group joins
Group joining got broken when moving the autorp socket to open/close
as needed. This fixes it so autorp group joining is properly handled
as part of opening the socket.

Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
(cherry picked from commit d840560b74)

Fixed merge conflicts for backport
2025-02-24 20:02:54 +00:00
Martin Buck e1c0246dee pimd: Fix PIM VRF support (send register/register stop in VRF)
In 9461953914 and
8ebcc02328, transmission of PIM register and
register stop messages was changed to use a separate socket. However, that
socket is not bound to a possible VRF, so the messages were sent in the
default VRF instead. Call vrf_bind() once after socket creation and when the
VRF is ready to ensure transmission in the correct VRF. vrf_bind() handles
the non-VRF case (i.e. VRF_DEFAULT) automatically, so it may be called
unconditionally.

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
(cherry picked from commit 5a01011e0d)
2025-02-24 18:57:32 +00:00
Jafar Al-Gharaibeh 2539e67884
Merge pull request #18228 from FRRouting/mergify/bp/stable/10.2/pr-18210
bgpd: remove dmed check not required in bestpath selection (backport #18210)
2025-02-22 14:13:08 -06:00
Donald Sharp 41c3eb672e bgpd: remove dmed check not required in bestpath selection
As part of the upstream master commit (f3575f61c7 bgpd: Sort the
bgp_path_inf) the snippet of the code for dmed check condition
left out, which leads to an issue of selecting incorrect bestpath.

As an example:

During bestpath selection, the local route loses to another path because
the dmed condition is hit.

The snippet of the logs:

2025/02/20 03:06:20.131441 BGP: [JW7VP-K1YVV]
[2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path
27.0.0.7 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131445 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.7 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131452 BGP: [JW7VP-K1YVV] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): Comparing path 27.0.0.8 flags Valid  with path Static announcement flags Selected Valid Attr Changed Unsorted
2025/02/20 03:06:20.131456 BGP: [SYTDR-QV6X9] [2]:[0]:[48]:[00:92:00:00:00:10]: path 27.0.0.8 loses to path Static announcement as ES 03:44:38:39:ff:ff:02:00:00:01 is same and local
2025/02/20 03:06:20.131458 BGP: [WEWEC-8SE72] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): path Static announcement is the bestpath from AS 0   <<<< static is best
2025/02/20 03:06:20.131463 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.7 dmed
2025/02/20 03:06:20.131467 BGP: [Z3A78-GM3G5] bgp_best_selection: [2]:[0]:[48]:[00:92:00:00:00:10](VRF default) pi 27.0.0.8 dmed
2025/02/20 03:06:20.131471 BGP: [N6CTF-2RSKS] [2]:[0]:[48]:[00:92:00:00:00:10](VRF default): After path selection, newbest is path 27.0.0.7 oldbest was Static announce

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 83ad94694b)
2025-02-21 23:12:23 +00:00
Donald Sharp f526b3d8f0
Merge pull request #18208 from FRRouting/mergify/bp/stable/10.2/pr-17666
pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED sta… (backport #17666)
2025-02-20 16:19:23 -05:00
Donald Sharp b7f756c3d0
Merge pull request #18204 from FRRouting/mergify/bp/stable/10.2/pr-14227
pimd: Fix for data packet loss when FHR is LHR and RP (backport #14227)
2025-02-20 14:20:10 -05:00
Rajesh Varatharaj b06af550c5 pimd: During prefix-list update, behave as PIM_UPSTREAM_NOTJOINED state (conformance issue)
Issue:
If there are any changes to the prefix list, we perform a re-lookup to map the correct RP for the group.
This happens even if the S,G entry is PIM_UPSTREAM_NOTJOINED and on the FHR. In the case of IGMPv3, an S,G entry can be
created with no joins, so the re-lookup is not necessary.
https://www.rfc-editor.org/rfc/rfc4601#section-4.5.7 says this is a no-op in the NOTJOINED state.

Solution:
To solve this issue, stop RP mapping when the state is NOTJOINED.

Ticket: #3496931

Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 51f26d17da)
2025-02-20 18:42:03 +00:00
Rajesh Varatharaj a2c37893d4 pimd: Fix for data packet loss when FHR is LHR and RP
Topology:
A single router is acting as the First Hop Router (FHR), Last Hop Router (LHR), and RP.

RC and Issue:
When an upstream S,G is in join state, it sends a register message to the RP.
If the RP has the receiver, it sends a register stop message and switches to the shortest path.
When the register stop message is processed, it removes pimreg, moves to prune,
and starts the reg stop timer.

When the reg stop timer expires, PIM changes S,G state to Join Pending and sends out a NULL
register message to RP. RP receives it and fails to send Reg stop because SPT is not set at that point.

The problem is when the register stop timer pops and state is in Join Pending.
According to https://www.rfc-editor.org/rfc/rfc4601#section-4.4.1,
we need to put back the pimreg reg tunnel into the S,G mroute.
This causes data to be sent to the control plane and subsequently interrupts the line rate.

Fix:
If the router is FHR and RP to the group,
ignore SPT status and send out a register stop message back to the DR (in this context, the same router).

Ticket: #3506780

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajesh Varatharaj <rvaratharaj@nvidia.com>
(cherry picked from commit 8280257cc9)
2025-02-20 16:57:15 +00:00
Jafar Al-Gharaibeh 15d390c06c
Merge pull request #18200 from FRRouting/revert-18155-mergify/bp/stable/10.2/pr-18121
Revert "bgpd: release manual vpn label on instance deletion (backport #18121)"
2025-02-19 13:00:45 -06:00
Donald Sharp 95cd00448d
Revert "bgpd: release manual vpn label on instance deletion (backport #18121)"
2025-02-19 11:22:03 -05:00
Russ White 2dce35a397
Merge pull request #18155 from FRRouting/mergify/bp/stable/10.2/pr-18121
bgpd: release manual vpn label on instance deletion (backport #18121)
2025-02-18 10:28:10 -05:00
Jafar Al-Gharaibeh cdb2cc2f89
Merge pull request #18192 from FRRouting/mergify/bp/stable/10.2/pr-18082
lib: nb: call child destroy CBs when YANG container is deleted (backport #18082)
2025-02-17 23:14:27 -06:00
Christian Hopps 9bd9d4a6f7 lib: nb: call child destroy CBs when YANG container is deleted
Previously the code was only calling the child destroy callbacks if the target
deleted node was a non-presence container. We now add a flag to the callback
structure to instruct northbound to perform the recursive delete for code that
wishes for this to happen.

- Fix wrong relative path lookup in keychain destroy callback

Signed-off-by: Christian Hopps <chopps@labn.net>
(cherry picked from commit d03ecf4562)
2025-02-18 02:37:59 +00:00
Donatas Abraitis 2f10852e16
Merge pull request #18180 from FRRouting/mergify/bp/stable/10.2/pr-18178
isisd: Request SRv6 locator after zebra connection (backport #18178)
2025-02-16 18:21:59 +02:00
Donald Sharp 9d53f35143
Merge pull request #18184 from FRRouting/mergify/bp/stable/10.2/pr-18109
bgpd: fix vty output of evpn route-target AS4 (backport #18109)
2025-02-16 08:09:57 -05:00
Mark Stapp 857c987fa7 bgpd: fix vty output of evpn route-target AS4
evpn route-targets are decoded in  ... multiple places; at least
two have a bug where the AS4 form doesn't have its AS decoded.

Signed-off-by: Mark Stapp <mjs@cisco.com>
(cherry picked from commit 9943a08720)
2025-02-15 20:11:59 +00:00
Carmine Scarpitta cac6b14879 isisd: Request SRv6 locator after zebra connection
When SRv6 is enabled and an SRv6 locator is specified in the IS-IS
configuration, IS-IS may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:

```
2025/02/14 21:41:20 ISIS: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket
```

As a result, IS-IS is unable to obtain the locator information,
preventing SRv6 from working.

This commit fixes the issue by ensuring IS-IS requests SRv6 locator
information once the connection with zebra is successfully established.
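
The ordering problem can be sketched with a stub client. Everything below is a hypothetical model for illustration: `struct zclient_stub`, `request_locator()`, `request_all_locators()` and `on_zebra_connected()` are invented names, not the zclient or isisd API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stub for the connection to zebra; only tracks what the sketch needs. */
struct zclient_stub {
	bool connected;
	int requests_sent;
};

/* An area with an optional configured locator name (NULL if none). */
struct area {
	const char *locator;
};

/* Fails, like the "invalid zclient socket" error, when not yet connected. */
static int request_locator(struct zclient_stub *zc, const char *locator)
{
	assert(zc != NULL && locator != NULL);
	if (!zc->connected)
		return -1;
	zc->requests_sent++;
	return 0;
}

/* Walk all areas and request every configured locator; return successes. */
static int request_all_locators(struct zclient_stub *zc,
				const struct area *areas, size_t n)
{
	int ok = 0;

	for (size_t i = 0; i < n; i++) {
		if (areas[i].locator == NULL)
			continue;
		if (request_locator(zc, areas[i].locator) == 0)
			ok++;
	}
	return ok;
}

/* Connection callback: the first point at which requests can succeed. */
static void on_zebra_connected(struct zclient_stub *zc,
			       const struct area *areas, size_t n)
{
	zc->connected = true;
	request_all_locators(zc, areas, n);
}
```

Requests issued at configuration time fail while the stub is disconnected; issuing them again from the connection callback is what lets them go through.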

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit f02dba19d2)
2025-02-15 14:57:42 +00:00
Carmine Scarpitta 9c5d87ef84 isisd: Add helper function to request SRv6 locator information
This commit adds a function that iterates over all IS-IS areas and asks
the SRv6 Manager for information about the configured locators.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 0b76fb3c13)
2025-02-15 14:57:41 +00:00
Donald Sharp f3249ab686
Merge pull request #18167 from FRRouting/mergify/bp/stable/10.2/pr-18160
bgpd: When removing the prefix list drop the pointer (backport #18160)
2025-02-15 09:14:18 -05:00
Donald Sharp 7a85a239a3 bgpd: When removing the prefix list drop the pointer
We are very very rarely seeing this crash:

    0 0x7f36ba48e389 in prefix_list_apply_ext lib/plist.c:789
    1 0x55eff3fa4126 in subgroup_announce_check bgpd/bgp_route.c:2334
    2 0x55eff3fa858e in subgroup_process_announce_selected bgpd/bgp_route.c:3440
    3 0x55eff4016488 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:808
    4 0x55eff401664e in subgroup_announce_route bgpd/bgp_updgrp_adv.c:861
    5 0x55eff40111df in peer_af_announce_route bgpd/bgp_updgrp.c:2223
    6 0x55eff3f884cb in bgp_announce_route_timer_expired bgpd/bgp_route.c:5892
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    11 0x7f36b9e2d304 in __libc_start_main_impl ../csu/libc-start.c:360
    12 0x55eff3e64a30 in _start (/home/ci/cibuild.1407/frr-source/bgpd/.libs/bgpd+0x2fda30)
0x608000037038 is located 24 bytes inside of 88-byte region [0x608000037020,0x608000037078)
freed by thread T0 here:
    0 0x7f36ba8b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    1 0x7f36ba439bd7 in qfree lib/memory.c:131
    2 0x7f36ba48d3a3 in prefix_list_free lib/plist.c:156
    3 0x7f36ba48d3a3 in prefix_list_delete lib/plist.c:247
    4 0x7f36ba48fbef in prefix_bgp_orf_remove_all lib/plist.c:1516
    5 0x55eff3f679c4 in bgp_route_refresh_receive bgpd/bgp_packet.c:2841
    6 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    7 0x7f36ba4ec239 in event_call lib/event.c:2019
    8 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    9 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    10 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
    0 0x7f36ba8b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    1 0x7f36ba4392e4 in qcalloc lib/memory.c:106
    2 0x7f36ba48d0de in prefix_list_new lib/plist.c:150
    3 0x7f36ba48d0de in prefix_list_insert lib/plist.c:186
    4 0x7f36ba48d0de in prefix_list_get lib/plist.c:204
    5 0x7f36ba48f9df in prefix_bgp_orf_set lib/plist.c:1479
    6 0x55eff3f67ba6 in bgp_route_refresh_receive bgpd/bgp_packet.c:2920
    7 0x55eff3f70bab in bgp_process_packet bgpd/bgp_packet.c:4069
    8 0x7f36ba4ec239 in event_call lib/event.c:2019
    9 0x7f36ba41a22a in frr_run lib/libfrr.c:1295
    10 0x55eff3e668b7 in main bgpd/bgp_main.c:557
    11 0x7f36b9e2d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Let's just stop trying to keep the pointer around in the peer->orf_plist
data structure.  There are other design problems but at least let's
stop the crash from possibly happening.

Fixes: #18138
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 3d43d7b789)
2025-02-14 21:29:13 +00:00
Jafar Al-Gharaibeh 87b0521b24
Merge pull request #18147 from FRRouting/mergify/bp/stable/10.2/pr-18023
lib: fix false context information for SRv6 route (backport #18023)
2025-02-13 19:04:59 -06:00
Jafar Al-Gharaibeh 2c4aff11b1
Merge pull request #18144 from FRRouting/mergify/bp/stable/10.2/pr-18079
bgpd: Fix crash in bgp_labelpool (backport #18079)
2025-02-13 19:03:33 -06:00
Louis Scalbert dfadee9d02 bgpd: release manual vpn label on instance deletion
When a BGP instance with a manually assigned VPN label is deleted, the
label is not released from the Zebra label registry. As a result,
reapplying a configuration with the same manual label leads to VPN
prefix export failures.

For example, with the following configuration:

> router bgp 65000 vrf BLUE
>  address-family ipv4 unicast
>   label vpn export <int>

Release zebra label registry on unconfiguration.

Fixes: d162d5f6f5 ("bgpd: fix hardset l3vpn label available in mpls pool")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit d6363625c3)

# Conflicts:
#	bgpd/bgpd.c
2025-02-13 19:09:37 +00:00
Philippe Guibert 12f7120c07 lib: fix false context information for SRv6 route
The seg6local route dumped by 'show ipv6 route' suggests that the USP
flavor is supported, which is not the case. This information is
context information, and for End, the context information should be
empty.

> # show ipv6 route
> [..]
> I>* fc00:0:4::/128 [115/0] is directly connected, sr0, seg6local End USP, weight 1, 00:49:01

Fix this by suppressing the USP information from the output.

Fixes: e496b42030 ("bgpd: prefix-sid srv6 l3vpn service tlv")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 658bf0281d)
2025-02-13 17:58:35 +00:00
Donald Sharp 00e7b27af0 bgpd: Fix crash in bgp_labelpool
The bgp labelpool code is grabbing the vpn policy data structure.
This vpn_policy has a pointer to the bgp data structure.  If
an item placed on the bgp label pool workqueue happens to sit
there for the microsecond or so and the operator issues a
`no router bgp...` command that corresponds to the vpn_policy
bgp pointer, when the workqueue is run it will crash because
the bgp pointer is now freed and something else owns it.

Modify the labelpool code to store the vrf id associated
with the request on the workqueue.  When you wake up
if the vrf id still has a bgp pointer allow the request
to continue, else drop it.
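
Storing an identifier instead of a pointer can be sketched as follows. This is a simplified model, not the bgpd labelpool code: `struct bgp_instance`, `instances[]`, `instance_lookup()` and `process_label_request()` are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VRFS 8

struct bgp_instance {
	int vrf_id;
	int labels_granted;
};

/* Registry of live instances, indexed by vrf id; NULL means deleted. */
static struct bgp_instance *instances[MAX_VRFS];

static struct bgp_instance *instance_lookup(int vrf_id)
{
	if (vrf_id < 0 || vrf_id >= MAX_VRFS)
		return NULL;
	return instances[vrf_id];
}

/* The queued work item remembers the vrf id, never the instance pointer. */
struct label_request {
	int vrf_id;
};

/* Returns 1 if the request was processed, 0 if it was dropped. */
static int process_label_request(const struct label_request *req)
{
	struct bgp_instance *bgp;

	assert(req != NULL);
	bgp = instance_lookup(req->vrf_id);
	if (bgp == NULL)
		return 0;	/* instance went away while we were queued */
	bgp->labels_granted++;
	return 1;
}
```

Because the lookup happens when the work item runs, a deletion that happened while the item sat on the queue shows up as a NULL lookup rather than a dangling pointer.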

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 14eac319e8)
2025-02-13 17:45:18 +00:00
Jafar Al-Gharaibeh 3b97f50e31
Merge pull request #18134 from FRRouting/mergify/bp/stable/10.2/pr-18120
bgpd: fix incorrect JSON in bgp_show_table_rd (backport #18120)
2025-02-13 10:04:06 -06:00
Jafar Al-Gharaibeh d85c28e8f1
Merge pull request #18057 from FRRouting/mergify/bp/stable/10.2/pr-18048
pimd: fix DR election race on startup (backport #18048)
2025-02-12 22:28:24 -06:00
Louis Scalbert fb006377bd bgpd: fix incorrect json in bgp_show_table_rd
In bgp_show_table_rd(), the is_last argument is determined using the
expression "next == NULL" to check if the RD table is the last one. This
helps ensure proper JSON formatting.

However, if next is not NULL but is no longer associated with a BGP
table, the JSON output becomes malformed.

Update the condition to also verify the existence of the next bgp_dest
table.
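
The effect on JSON formatting can be sketched with a small list walk. This is an illustration only: `struct rd_entry`, `is_last_with_table()` and `render_json()` are hypothetical, and scanning all following entries is an assumption made for the sketch rather than a description of the actual change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct rd_entry {
	const char *rd;
	bool has_table;		/* false: entry exists but has no table */
	struct rd_entry *next;
};

/* An entry is last only if no later entry will actually be printed. */
static bool is_last_with_table(const struct rd_entry *e)
{
	for (const struct rd_entry *n = e->next; n != NULL; n = n->next)
		if (n->has_table)
			return false;
	return true;
}

/* Emit one JSON object per entry with a table, comma-separated. */
static void render_json(const struct rd_entry *head, char *out, size_t len)
{
	size_t used;

	assert(out != NULL && len > 0);
	used = (size_t)snprintf(out, len, "{");
	for (const struct rd_entry *e = head; e != NULL; e = e->next) {
		if (!e->has_table)
			continue;
		if (used < len)
			used += (size_t)snprintf(out + used, len - used,
						 "\"%s\":{}%s", e->rd,
						 is_last_with_table(e) ? "" : ",");
	}
	if (used < len)
		snprintf(out + used, len - used, "}");
}
```

With a plain `next == NULL` test, the second-to-last entry in a list whose final entry has no table would emit a trailing comma and produce malformed JSON.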

Fixes: 1ae44dfcba ("bgpd: unify 'show bgp' with RD with normal unicast bgp show")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit cf0269649c)
2025-02-12 20:10:58 +00:00
Donald Sharp 30c21862fd
Merge pull request #18076 from opensourcerouting/fix/bfd_backports_10.2
bgp/bfd backports for stable/10.2
2025-02-12 12:58:15 -05:00
Donald Sharp 219794cf22
Merge pull request #18124 from FRRouting/mergify/bp/stable/10.2/pr-18062
Cid 1636504 (backport #18062)
2025-02-12 12:28:34 -05:00
Philippe Guibert 8a64ccbf68 bgpd: fix bgp label evpn CID 1636504
The following static analysis can be seen :

> *** CID 1636504:    (ARRAY_VS_SINGLETON)
> /bgpd/bgp_evpn_mh.c: 1241 in bgp_evpn_type1_route_process()
> 1235            build_evpn_type1_prefix(&p, eth_tag, &esi, vtep_ip);
> 1236            /* Process the route. */
> 1237            if (attr) {
> 1238                    bgp_update(peer, (struct prefix *)&p, addpath_id, attr, afi, safi, ZEBRA_ROUTE_BGP,
> 1239                               BGP_ROUTE_NORMAL, &prd, &label, num_labels, 0, NULL);
> 1240            } else {
> >>>     CID 1636504:    (ARRAY_VS_SINGLETON)
> >>>     Passing "&label" to function "bgp_withdraw" which uses it as an array. This might corrupt or misinterpret adjacent memory locations.
> 1241                    bgp_withdraw(peer, (struct prefix *)&p, addpath_id, afi, safi, ZEBRA_ROUTE_BGP,
> 1242                                 BGP_ROUTE_NORMAL, &prd, &label, num_labels);
> 1243            }
> 1244            return 0;
> 1245     }
> 1246
> /bgpd/bgp_evpn_mh.c: 1238 in bgp_evpn_type1_route_process()
> 1232             * table
> 1233             */
> 1234            vtep_ip.s_addr = INADDR_ANY;
> 1235            build_evpn_type1_prefix(&p, eth_tag, &esi, vtep_ip);
> 1236            /* Process the route. */
> 1237            if (attr) {
> >>>     CID 1636504:    (ARRAY_VS_SINGLETON)
> >>>     Passing "&label" to function "bgp_update" which uses it as an array. This might corrupt or misinterpret adjacent memory locations.
> 1238                    bgp_update(peer, (struct prefix *)&p, addpath_id, attr, afi, safi, ZEBRA_ROUTE_BGP,
> 1239                               BGP_ROUTE_NORMAL, &prd, &label, num_labels, 0, NULL);
> 1240            } else {
> 1241                    bgp_withdraw(peer, (struct prefix *)&p, addpath_id, afi, safi, ZEBRA_ROUTE_BGP,
> 1242                                 BGP_ROUTE_NORMAL, &prd, &label, num_labels);
> 1243            }

Fix this by declaring a label array instead of a single label.
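
The pattern behind this warning can be sketched in isolation. The names below are hypothetical (`label_t`, `sum_labels()`, `process_one_label()`), and the array size of one is chosen for illustration; the actual size used in bgpd is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t label_t;	/* stand-in for the real label type */

/* The callee indexes its argument as an array of num_labels elements. */
static uint32_t sum_labels(const label_t *labels, size_t num_labels)
{
	uint32_t sum = 0;

	assert(labels != NULL || num_labels == 0);
	for (size_t i = 0; i < num_labels; i++)
		sum += labels[i];
	return sum;
}

/*
 * Declaring an array, even of one element, makes the storage match what
 * the callee expects, instead of passing the address of a lone scalar.
 */
static uint32_t process_one_label(label_t value)
{
	label_t label[1] = { value };

	return sum_labels(label, sizeof(label) / sizeof(label[0]));
}
```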

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit ba462af2e3)
2025-02-12 14:37:58 +00:00
Philippe Guibert 7a9abe777b bgpd: simplify bgp_evpn_process_rt1 with label
Remove the num_labels variable. The bgp_update() and
bgp_withdraw() functions will read the received message as including one
label or VNI value.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 82d28f137a)
2025-02-12 14:37:58 +00:00
Donald Sharp e9e72912a4
Merge pull request #18113 from FRRouting/mergify/bp/stable/10.2/pr-18078
nhrpd: fix dont consider incomplete L2 entry (backport #18078)
2025-02-12 08:16:56 -05:00
Donald Sharp b58aa2189c
Merge pull request #18116 from FRRouting/mergify/bp/stable/10.2/pr-18069
bgpd: Request SRv6 locator after zebra connection (backport #18069)
2025-02-12 08:14:35 -05:00
Carmine Scarpitta ea595f9a14 bgpd: Request SRv6 locator after zebra connection
When SRv6 is enabled and an SRv6 locator is specified in the BGP
configuration, BGP may attempt to request SRv6 locator information from
zebra before the connection is fully established. If this occurs, the
request fails with the following error:

```
2025/02/06 16:37:32 BGP: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket
```

As a result, BGP is unable to obtain the locator information,
preventing SRv6 VPN from working.

This commit fixes the issue by ensuring BGP requests SRv6 locator
information once the connection with zebra is successfully established.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
(cherry picked from commit 16640b615d)
2025-02-12 03:00:28 +00:00
Philippe Guibert a7d0e7b91c nhrpd: fix dont consider incomplete L2 entry
Sometimes, NHRP receives L2 information on a cache entry with the
0.0.0.0 IP address. NHRP considers it as valid and updates the binding
with the new IP address.

> Feb 09 20:09:54 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x2 cache used 0 type 4
> Feb 09 20:10:35 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:48 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: del-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: who-has 10.2.114.238 dev dmvpn1 lladdr (unspec) nud 0x1 cache used 1 type 4
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QVXNM-NVHEQ] Netlink: update binding for 10.2.114.238 dev dmvpn1 from c 162.251.180.10 peer.vc.nbma 162.251.180.10 to lladdr (unspec)
> Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x2 cache used 1 type 4
> Feb 09 20:11:30 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x4 cache used 1 type 4

Actually, the 0.0.0.0 IP address mentioned in the 'who-has' message is
wrong because the nud state value means that the entry is incomplete and
should not be handled as a valid entry. Instead of considering it, fix
this by invalidating the current binding. This step is necessary in
order to permit NHRP to trigger resolution requests again.
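
The check can be sketched as below. This is a simplified model, not the nhrpd code: `struct binding` and `handle_neigh_update()` are hypothetical, and the `NEIGH_NUD_*` constants are local copies of the Linux NUD_INCOMPLETE (0x01), NUD_REACHABLE (0x02) and NUD_STALE (0x04) values seen in the log lines above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local copies of the Linux neighbour states from <linux/neighbour.h>. */
enum {
	NEIGH_NUD_INCOMPLETE = 0x01,
	NEIGH_NUD_REACHABLE = 0x02,
	NEIGH_NUD_STALE = 0x04,
};

struct binding {
	bool valid;
	char nbma[16];	/* dotted-quad IPv4 address plus terminating NUL */
};

/*
 * An incomplete entry carries no usable link-layer address, so it
 * invalidates the binding instead of overwriting it; resolution can then
 * be triggered again. Any other state with an address updates the binding.
 */
static void handle_neigh_update(struct binding *b, uint16_t nud_state,
				const char *lladdr)
{
	assert(b != NULL);
	if ((nud_state & NEIGH_NUD_INCOMPLETE) || lladdr == NULL) {
		b->valid = false;
		b->nbma[0] = '\0';
		return;
	}
	snprintf(b->nbma, sizeof(b->nbma), "%s", lladdr);
	b->valid = true;
}
```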

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 3202323052)
2025-02-12 02:58:19 +00:00
Jafar Al-Gharaibeh 5a18943588
Merge pull request #18102 from FRRouting/mergify/bp/stable/10.2/pr-18060
lib: crash handlers must be allowed on threads (backport #18060)
2025-02-11 20:52:43 -06:00