Commit graph

12 commits

Author SHA1 Message Date
Rajasekar Raja aa4786642c zebra: vlan to dplane Offload from main
Trigger: Zebra core seen when we convert l2vni to l3vni and back

BackTrace:
/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(_zlog_assert_failed+0xe9) [0x7f4af96989d9]
/usr/lib/frr/zebra(zebra_vxlan_if_vni_up+0x250) [0x5561022ae030]
/usr/lib/frr/zebra(netlink_vlan_change+0x2f4) [0x5561021fd354]
/usr/lib/frr/zebra(netlink_parse_info+0xff) [0x55610220d37f]
/usr/lib/frr/zebra(+0xc264a) [0x55610220d64a]
/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(thread_call+0x7d) [0x7f4af967e96d]
/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(frr_run+0xe8) [0x7f4af9637588]
/usr/lib/frr/zebra(main+0x402) [0x5561021f4d32]
/lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x7f4af932624a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f4af9326305]
/usr/lib/frr/zebra(_start+0x21) [0x5561021f72f1]

Root Cause:
In working case,
 - We get a RTM_NEWLINK whose ctx is enqueued by zebra dplane and
   dequeued by zebra main and processed i.e.
   (102000 is deleted from vxlan99) before we handle RTM_NEWVLAN.
 - So in handling of NEWVLAN (vxlan99) we bail out since find with
   vlan id 703 does not exist.

root@leaf2:mgmt:/var/log/frr# cat ~/raja_logs/working/nocras.log  | grep "RTM_NEWLINK\|QUEUED\|vxlan99\|in thread"
2024/07/18 23:09:33.741105 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=616, seq=0, pid=0
2024/07/18 23:09:33.744061 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7f2244000cf0, op INTF_INSTALL, ifindex (65), result QUEUED
2024/07/18 23:09:33.767240 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=508, seq=0, pid=0
2024/07/18 23:09:33.767380 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7f2244000cf0, op INTF_INSTALL, ifindex (73), result QUEUED
2024/07/18 23:09:33.767389 ZEBRA: [NVFT0-HS1EX] INTF_INSTALL for vxlan99(73)
2024/07/18 23:09:33.767404 ZEBRA: [TQR2A-H2RFY] Vlan-Vni(1186:1186-6000002:6000002) update for VxLAN IF vxlan99(73)
2024/07/18 23:09:33.767422 ZEBRA: [TP4VP-XZ627] Del L2-VNI 102000 intf vxlan99(73)
2024/07/18 23:09:33.767858 ZEBRA: [QYXB9-6RNNK] RTM_NEWVLAN bridge IF vxlan99 NS 0
2024/07/18 23:09:33.767866 ZEBRA: [KKZGZ-8PCDW] Cannot find VNI for VID (703) IF vxlan99 for vlan state update >>>>BAIL OUT

In failure case,
 - The NEWVLAN is received first even before processing RTM_NEWLINK.
 - Since the vxlan id 102000 is not removed from the vxlan99,
   the find with vlan id 703 returns the 102000 one and we invoke
   zebra_vxlan_if_vni_up where the interfaces don't match and assert.

root@leaf2:mgmt:/var/log/frr# cat ~/raja_logs/noworking/crash.log | grep "RTM_NEWLINK\|QUEUED\|vxlan99\|in thread"
2024/07/18 22:26:43.829370 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=616, seq=0, pid=0
2024/07/18 22:26:43.829646 ZEBRA: [K8FXY-V65ZJ] Intf dplane ctx 0x7fe07c026d30, op INTF_INSTALL, ifindex (65), result QUEUED
2024/07/18 22:26:43.853930 ZEBRA: [QYXB9-6RNNK] RTM_NEWVLAN bridge IF vxlan99 NS 0
2024/07/18 22:26:43.853949 ZEBRA: [K61WJ-XQQ3X] Intf vxlan99(73) L2-VNI 102000 is UP >>> VLAN PROCESSED BEFORE INTF EVENT
2024/07/18 22:26:43.853951 ZEBRA: [SPV50-BX2RP] RAJA zevpn_vxlanif vxlan48 and ifp vxlan99
2024/07/18 22:26:43.854005 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=508, seq=0, pid=0
2024/07/18 22:26:43.854241 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=516, seq=0, pid=0
2024/07/18 22:26:43.854251 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=544, seq=0, pid=0
ZEBRA: in thread kernel_read scheduled from zebra/kernel_netlink.c:505 kernel_read()

Fix:
Similar to #13396, where link change
handling was offloaded to dplane, same is being done for vlan events.

Note: Prior to this change, zebra main thread was interested in the
RTNLGRP_BRVLAN. So all the kernel events pertaining to vlan was
handled by zebra main.

With this change change as well the handling of vlan events is still
with Zebra main. However we offload it via Dplane thread.

Ticket :#3878175

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-09-26 20:17:35 -07:00
Carmine Scarpitta 8d0b4745a1 zebra: Add code to set SRv6 encap source addr in dplane
Add a bunch of set functions and associated data structure in
zebra_dplane to allow the configuration of the source address for SRv6
encap in the data plane.

Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
2023-12-14 14:56:44 +01:00
Donald Sharp a22ab3f98f zebra: Fix missing break in switch
Recent Changes added the -Wimplicit-fallthrough flag
to FRR's compilation.  Implementor does not build with
lua support and as such this one was missed in the compilation

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-11-03 17:54:55 +00:00
Donald Sharp a014450441 zebra: Add code to get/set interface to pass up from dplane
1) Add a bunch of get/set functions and associated data
structure in zebra_dplane to allow the setting and retrieval
of interface netlink data up into the master pthread.

2) Add a bit of code to breakup startup into stages.  This is
because FRR currently has a mix of dplane and non dplane interactions
and the code needs to be paused before continuing on.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-07-05 13:03:14 -04:00
David Lamparter acddc0ed3c *: auto-convert to SPDX License IDs
Done with a combination of regex'ing and banging my head against a wall.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-02-09 14:09:11 +01:00
Donatas Abraitis 2272627571 zebra: Replace TC definitions for dplane
They were replaced, but forgot for `--enable-scripting`.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-23 17:35:33 +02:00
Stephen Worley 17e6ba2828 zebra: add TC handlers in script code
Add TC handlers in script code and move non-handled
code together.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-08-16 15:56:35 -04:00
David Lamparter 5b4f4e626f build: first header *must* be zebra.h or config.h
This has already been a requirement for Solaris, it is still a
requirement for some of the autoconf feature checks to work correctly,
and it will be a requirement for `-fms-extensions`.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-04-04 18:33:10 +02:00
Donald Sharp 105271792d zebra: Fixup lua with new dplane ops
Commit: 5d41413833 added 3 new dplane ops:

DPLANE_OP_INTF_INSTALL
DPLANE_OP_INTF_UPDATE
DPLANE_OP_INTF_DELETE

The build system does not build lua so zebra_script.c
was not updated.  Update of course!

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-16 15:10:54 -04:00
Mark Stapp 728f2017ae zebra: add dplane type for NETCONF data
Add a new dplane op for interface NETCONF data; add the new
enum value to several switch statements.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Donald Sharp cbefb650bc zebra: Recent Merge broke --enable-werror
Recent code broke upon compiling with --enable-dev-build
and --enable-werror.  Fix.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-10-27 08:53:43 -04:00
Donald Lee 4f7e32bafe zebra: Add encoders/decoders for zebra
Signed-off-by: Donald Lee <dlqs@gmx.com>
2021-10-20 00:56:00 +08:00