Embed nexthop-group, which is just a pointer, in the zebra
nexthop-hash-entry object, rather than mallocing one.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When we are receiving a kernel route, with an admin distance
of 255 we are not marking it as installed. This route
should be marked as installed.
New behavior:
K>* 4.5.7.0/24 [255/8192] via 192.168.209.1, enp0s8, 00:10:14
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We should be NULL checking the entire re->nhe struct, not
the group inside of it. When we get routes from the kernel
using a nexthop group (and future protocols) they will only
pass us an ID to use. Hence, this struct can (and will be)
NULL on first attach when only passed an ID.
There shouldn't be a situation where we have an re->nhe
and don't have an re->nhe->nhg anyway.
Before this patch you can easily make zebra crash by creating a
route in the kernel using a nexthop group and starting zebra.
`ip next add dev lo id 111`
`ip route add 1.1.1.1/32 nhid 111`
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Handle the special case where a route update contains
no installed nexthops - that means the route is not
installed.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The processing of dataplane route notifications was a little
off-target after the nexthop-group re-work. This should allow
notifications to work better.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Clean up the relationships between zebra's rib and nexthop-group
headers as prep for adding a nexthop-group pointer to the
route_entry.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
We cannot clear the NEXTHOP_FLAG_FIB nexthop flag
when sending routes to the dataplane anymore since
nexthops are now shared.
We were seeing a situation where if we delete a route
using a nexthop group that is still active with another
route, the fib flag was being unset by this code
path despite them still being valid fib nexthops with the
other route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were creating `other` tables in rib_del(), vty commands, and
dataplane return callback via the zebra_vrf_table_with_table_id()
API.
Seperate the API into only a lookup, never create
and added another with `get` in the name (following the standard
we use in other table APIs).
Then changed the rib_del(), rib_find_rn_from_ctx(), and show route
summary vty command to use the lookup API instead.
This was found via a crash where two different vrfs though they owned
the table. On delete, one free'd all the nodes, and then the other tried
to use them. It required specific timing of a VRF existing, going away,
and coming back again to cause the crash.
=23464== Invalid read of size 8
==23464== at 0x179EA4: rib_dest_from_rnode (rib.h:433)
==23464== by 0x17ACB1: zebra_vrf_delete (zebra_vrf.c:253)
==23464== by 0x48F3D45: vrf_delete (vrf.c:243)
==23464== by 0x48F4468: vrf_terminate (vrf.c:532)
==23464== by 0x13D8C5: sigint (main.c:172)
==23464== by 0x48DD25C: quagga_sigevent_process (sigevent.c:105)
==23464== by 0x48F0502: thread_fetch (thread.c:1417)
==23464== by 0x48AC82B: frr_run (libfrr.c:1023)
==23464== by 0x13DD02: main (main.c:483)
==23464== Address 0x5152788 is 104 bytes inside a block of size 112 free'd
==23464== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464== by 0x48B25B8: qfree (memory.c:129)
==23464== by 0x48EA335: route_node_destroy (table.c:500)
==23464== by 0x48E967F: route_node_free (table.c:90)
==23464== by 0x48E9742: route_table_free (table.c:124)
==23464== by 0x48E9599: route_table_finish (table.c:60)
==23464== by 0x170CEA: zebra_router_free_table (zebra_router.c:165)
==23464== by 0x170DB4: zebra_router_release_table (zebra_router.c:188)
==23464== by 0x17AAD2: zebra_vrf_disable (zebra_vrf.c:222)
==23464== by 0x48F3F0C: vrf_disable (vrf.c:313)
==23464== by 0x48F3CCF: vrf_delete (vrf.c:223)
==23464== by 0x48F4468: vrf_terminate (vrf.c:532)
==23464== Block was alloc'd at
==23464== at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464== by 0x48B24A2: qcalloc (memory.c:110)
==23464== by 0x48EA2FE: route_node_create (table.c:488)
==23464== by 0x48E95C7: route_node_new (table.c:66)
==23464== by 0x48E95E5: route_node_set (table.c:75)
==23464== by 0x48E9EA9: route_node_get (table.c:326)
==23464== by 0x48E1EDB: srcdest_rnode_get (srcdest_table.c:244)
==23464== by 0x16EA4B: rib_add_multipath (zebra_rib.c:2730)
==23464== by 0x1A5310: zread_route_add (zapi_msg.c:1592)
==23464== by 0x1A7B8E: zserv_handle_commands (zapi_msg.c:2579)
==23464== by 0x19D689: zserv_process_messages (zserv.c:523)
==23464== by 0x48F09F8: thread_call (thread.c:1599)
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Reduce the api for deleting nexthops and the containing
group to just one call rather than having a special case
and handling it separately.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Optimize the fib and notified nexthop group comparison algorithm
to assume ordering. There were some pretty serious performance hits with
this on high ecmp routes.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were waiting until install time to mark nexthops as duplicate.
Since they are immutable now and re-used, move this marking into
when they are actually created to save a bunch of cycles.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When we receive a route delete from the kernel and it
contains a nexthop object id, use that to match against
route gateways with instead of explicit nexthops.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
On restart, if we failed to remove any nexthop objects due
to a kill -9 or such event, sweep them if we aren't using them.
Add a proto field to handle this and remove the is_kernel bool.
Add a dupicate flag that indicates this nexthop group is only
present in our ID hashtable. It is a dupicate nexthop we received
from the kernel, therefore we cannot hash on it.
Make the idcounter globally accessible so that kernel updates
increment it as soon as we receive them, not when we handle them.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We need to handle refcnt differently if we ever start making
upper level protocols aware of nhg_hash_entry IDs.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Move the resolving and installing of a single nhg_hash_entry
into the install function itself, rather than letting zebra_rib
handle it.
Further, ensure depends are installed/queued before installing
a group. The ordering should be find here since only one thread
will call this API.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Move the installation of an nhe out of nexthop_active_update()
and into the rib install path. So, only install the nhe when
a route using it is being installed.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When hashing/creating the NHE, use the nexthops vrf as its
source of data. This is gotten directly from an interface
and should not come from a route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Only remove a route if the nexthop it is using is still installed.
If a nexthop object is removed from the kernel, all routes referencing
it will be removed from the kernel.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add the ability to recursively resolve nexthop group hash entries
and resolve them when sending to the kernel.
When copying over nexthops into an NHE, copy resolved info as well.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We will use a nhe context for dataplane interaction with
nextho group hash entries.
New nhe's from the kernel will be put into a group array
if they are a group and queued on the rib metaq to be processed
later.
New nhe's sent to the kernel will be set on the dataplane context
with approprate ID's in the group array if needed.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Check to make sure the route entry has a nexthop
group before we try to free after a table lookup
failure.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
A nexthop group should not have a VRF ID. Only individual
nexthops need to be using a VRF. Fixed this both kernel and
proto side.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Re-organize and expose the nhg_connected functions so that
it can be used outside zebra_nhg.c. And then abstract those
into zebra_nhg_depends_* and zebra_nhg_depenents_* functons.
Switch the ifp struct to use an RB tree for its dependents,
making use of the nhg_connected functions.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Create a nhg_depenents tree that will function as a way
to get back pointers for NHE's depending on it.
Abstract the RB nodes into nhg_connected for both depends and
dependents. This same struct is used for both.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Default the afi of the nexthop to the route entry using it.
If it turns out to be a group, update the afi to AFI_UNSPEC.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Update rib_add_multipath to use the reference count
increment function for nexthop group hash entries.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a helper function to allow us to check if two
nhg_hash_entry's dependency lists are equal.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Changed our alloc function to just copy the nhg and
nhg_depends. This makes the zebra_nhg_find code a
little bit cleaner, hopefully preventing bugs.
The only issue with this is that it makes us have to loop
over the nexthops in a group an extra time for the copies.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Simplify the code for nexthop hash entry creation. I made nexthop
hash entry creation expect the nexthop group and depends to always
be allocated before lookup. Before, it was only allocated if it had
dependencies. I think it makes the code a bit more readable to go
ahead an allocate even for single nexthops as well.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add functionality to read in a group from the kernel,
create a hash entry for it, and add its nexthops to
its dependency list.
Further, we create its nhg struct separtely from this,
copying over any nexthops it should reference directly
into it.
Thus, we have two types for representation of the nexthop group:
nhe->nhg_depends->[nhe, nhe, nhe]
nhe->nhg->nexthop->nexthop->nexthop
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add an interface pointer for an nexthop group hash entry
when we are getting a rib_add for a new route.
Also, add the interface index to the `show nexthop-group` command.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Make our route entry struct's re->ng nexthop group pointer
just point to the nhe->nhg nexthop hash entry nexthop group.
This will allow updates to the nexthop itself to propogate
to our routes immediately.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a parameter to the rib_add function so that it takes
a nexthop ID from the kernel if one is passed along
with the route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>