The rib_sweep_route function when not doing graceful
restart does not attempt to save the event on the
t_rib_sweep pointer for shutdown. Prevent any
weird shenanigans by allowing shutdown to clean
up the rib_sweep_route event.
Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
Current -n option is only for zebra and mgmtd. All other daemons receive
the VRF backend configuration from zebra upon connection to it. This
leads to a potential race condition - daemons need to know the backend
before they start reading their config, but they can be not connected to
zebra yet at this point. As the VRF backend cannot change during runtime,
let's introduce a new global -w option for setting netns backend, to
make sure that all daemons know their VRF backend immediately after
start.
The reason for introducing a new option instead of making -n global is
that ospfd already uses -n for another purposes.
Signed-off-by: Igor Ryzhov <idryzhov@gmail.com>
The backend type cannot be unknown. It is configured to VRF_LITE by
default in zebra anyway, so just init to VRF_LITE in the lib and remove
the UNKNOWN type.
Signed-off-by: Igor Ryzhov <idryzhov@gmail.com>
Separate zebra's ZAPI server socket handling into two phases:
an early phase that opens the socket, and a later phase that
starts listening for client connections.
Signed-off-by: Mark Stapp <mjs@cisco.com>
Currently zebra starts the graceful restart timer as well as
allows connections from clients before all data is read in
from the kernel as well as the possiblity of allowing client
connections before this happens as well.
Let's move the graceful restart timer start till after this is
done as well as not allowing client connections till then as well.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The following ASAN issue has been observed:
> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
> #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
> #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
> #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
> #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
> #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
> #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
> #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
> #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
> #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
> #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
> #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
> #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
> #12 0x7f26f275bae4 in route_table_free lib/table.c:111
> #13 0x7f26f275b749 in route_table_finish lib/table.c:46
> #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
> #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
> #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
> #17 0x7f26f2777108 in event_call lib/event.c:2011
> #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
> #19 0x55910c4f49cb in main zebra/main.c:531
> #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)
It happens with FRR using the kernel. During shutdown, the
namespace identifier is attempted to be obtained by zebra, in an
attempt to prepare zebra dataplane nexthop messages.
Fix this by accessing the ns structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add a new start option "-K" to libfrr to denote a graceful start,
and use it in zebra and bgpd.
zebra will use this option to denote a planned FRR graceful restart
(supporting only bgpd currently) to wait for a route sync completion
from bgpd before cleaning up old stale routes from the FIB. An optional
timer provides an upper-bounds for this cleanup.
bgpd will use this option to denote either a planned FRR graceful
restart or a bgpd-only graceful restart, and this will drive the BGP
GR restarting router procedures.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
If you had a situation where an operator turned on
ospfd with snmp but not ospf6d and agentx was configured
then you get into a situation where ospf6d would complain
that the config for agentx did not exist. Let's modify
the code to allow this situation to happen.
Fixes: #15896
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Split zebra's vrf_terminate() into disable() and delete() stages.
The former enqueues all events for the dplane thread.
Memory freeing is performed in the second stage.
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
clang-format doesn't understand FRR_DAEMON_INFO is a long macro where
laying out items semantically makes sense.
(Also use only one `FRR_DAEMON_INFO(` in isisd so editors don't get
confused with the mismatching `( ( )`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The adata variable was being leaked on shutdown since
it was calloc'ed. There is no need to make this dynamic
memory. Just choose a size and use that. Add a bit
of code to ensure that if it's not large enough,
it will just stop and the developer will fix it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
the zebra pseudo wire code was registering a callback
per vrf. These callbacks are not per vrf based. They
are vrf agnostic so this was a mistake. Modify the code
to on startup register once and on shutdown unregister once.
Finally rename the zebra_pw_init and zebra_pw_exit functions
to more properly reflect when they are called.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
a) Rename rib_init to zebra_rib_init() to better follow how
things are named
b) on shutdown cycle through the rib_dplane_q and free
up any contexts sitting in it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The headers associated with netlink code
really only belong in those that need it.
Move these headers out of lib/zebra.h and
into more appropriate places. bgp's usage
of the RT_TABLE_XXX defines are probably not
appropriate and will be cleaned up in future
commits.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When shutting down the main pthread was first closing
the sockets associated with the dplane pthread and
then telling it to shutdown the pthread at a later point
in time. This caused the dplane to crash because the nl
data has been freed already. Change the shutdown order
to stop the dplane pthread *and* then close the sockets.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
two things:
On shutdown cleanup any events associated with the update walker.
Also do not allow new events to be created.
Fixes this mem-leak:
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790:Direct leak of 8 byte(s) in 1 object(s) allocated from:
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #0 0x7f0dd0b08037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #1 0x7f0dd06c19f9 in qcalloc lib/memory.c:105
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #2 0x55b42fb605bc in rib_update_ctx_init zebra/zebra_rib.c:4383
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #3 0x55b42fb6088f in rib_update zebra/zebra_rib.c:4421
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #4 0x55b42fa00344 in netlink_link_change zebra/if_netlink.c:2221
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #5 0x55b42fa24622 in netlink_information_fetch zebra/kernel_netlink.c:399
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #6 0x55b42fa28c02 in netlink_parse_info zebra/kernel_netlink.c:1183
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #7 0x55b42fa24951 in kernel_read zebra/kernel_netlink.c:493
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #8 0x7f0dd0797f0c in event_call lib/event.c:1995
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #9 0x7f0dd0684fd9 in frr_run lib/libfrr.c:1185
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #10 0x55b42fa30caa in main zebra/main.c:465
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #11 0x7f0dd01b5d09 in __libc_start_main ../csu/libc-start.c:308
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add the affinity-map global command to zebra. The syntax is:
> affinity-map NAME bit-position (0-1023)
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Zebra has a shutdown setup where it asks the dplane to shutdown but can
still be processing data. This is especially true if something the dplane
is listening on receives data that will be processed by the main dplane thread
from netlink. When zebra_finalize is called it is possible that a bit
of data comes in before the zebra_dplane_shutdown() function is called
and the memory freed in ns_walk_func() causes the main dplane event
to crash when it cannot find the ns data anymore.
Reverse the order, stop the zebra dplane pthread and then free the
memory associated with the namespaces.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Instead of having global allow_delete move it to
where it belongs in the zrouter data structure.
Additionally show this data in `show zebra`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>