Commit graph

576 commits

Author SHA1 Message Date
Stephen Worley ace3bbba4b zebra: Don't clear nexthop fib flag on rib_install
We cannot clear the NEXTHOP_FLAG_FIB nexthop flag
when sending routes to the dataplane anymore since
nexthops are now shared.

We were seeing a situation where if we delete a route
using a nexthop group that is still active with another
route, the fib flag was being unset by this code
path despite them still being valid fib nexthops with the
other route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 01:24:39 -05:00
Stephen Worley c7c0b007a4 zebra: separate zebra_vrf_lookup_table_with_id()
We were creating `other` tables in rib_del(), vty commands, and
dataplane return callback via the zebra_vrf_table_with_table_id()
API.

Seperate the API into only a lookup, never create
and added another with `get` in the name (following the standard
we use in other table APIs).

Then changed the rib_del(), rib_find_rn_from_ctx(), and show route
summary vty command to use the lookup API instead.

This was found via a crash where two different vrfs though they owned
the table. On delete, one free'd all the nodes, and then the other tried
to use them. It required specific timing of a VRF existing, going away,
and coming back again to cause the crash.

=23464== Invalid read of size 8
==23464==    at 0x179EA4: rib_dest_from_rnode (rib.h:433)
==23464==    by 0x17ACB1: zebra_vrf_delete (zebra_vrf.c:253)
==23464==    by 0x48F3D45: vrf_delete (vrf.c:243)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==    by 0x13D8C5: sigint (main.c:172)
==23464==    by 0x48DD25C: quagga_sigevent_process (sigevent.c:105)
==23464==    by 0x48F0502: thread_fetch (thread.c:1417)
==23464==    by 0x48AC82B: frr_run (libfrr.c:1023)
==23464==    by 0x13DD02: main (main.c:483)
==23464==  Address 0x5152788 is 104 bytes inside a block of size 112 free'd
==23464==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B25B8: qfree (memory.c:129)
==23464==    by 0x48EA335: route_node_destroy (table.c:500)
==23464==    by 0x48E967F: route_node_free (table.c:90)
==23464==    by 0x48E9742: route_table_free (table.c:124)
==23464==    by 0x48E9599: route_table_finish (table.c:60)
==23464==    by 0x170CEA: zebra_router_free_table (zebra_router.c:165)
==23464==    by 0x170DB4: zebra_router_release_table (zebra_router.c:188)
==23464==    by 0x17AAD2: zebra_vrf_disable (zebra_vrf.c:222)
==23464==    by 0x48F3F0C: vrf_disable (vrf.c:313)
==23464==    by 0x48F3CCF: vrf_delete (vrf.c:223)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==  Block was alloc'd at
==23464==    at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B24A2: qcalloc (memory.c:110)
==23464==    by 0x48EA2FE: route_node_create (table.c:488)
==23464==    by 0x48E95C7: route_node_new (table.c:66)
==23464==    by 0x48E95E5: route_node_set (table.c:75)
==23464==    by 0x48E9EA9: route_node_get (table.c:326)
==23464==    by 0x48E1EDB: srcdest_rnode_get (srcdest_table.c:244)
==23464==    by 0x16EA4B: rib_add_multipath (zebra_rib.c:2730)
==23464==    by 0x1A5310: zread_route_add (zapi_msg.c:1592)
==23464==    by 0x1A7B8E: zserv_handle_commands (zapi_msg.c:2579)
==23464==    by 0x19D689: zserv_process_messages (zserv.c:523)
==23464==    by 0x48F09F8: thread_call (thread.c:1599)

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-01 16:06:19 -04:00
Stephen Worley d3a3513811 lib,pbrd,zebra: Use one api to delete nexthops/group
Reduce the api for deleting nexthops and the containing
group to just one call rather than having a special case
and handling it separately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley fec211ad95 zebra: Zebra nexthop group re-work checkpatch fixes
Checkpatch fixes for the zebra nexthop group re-work.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley 986a6617cc zebra: Optimize the fib/notified nexthop matching
Optimize the fib and notified nexthop group comparison algorithm
to assume ordering. There were some pretty serious performance hits with
this on high ecmp routes.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley 9ef49038d5 lib,zebra: Move nexthop dup marking into creation
We were waiting until install time to mark nexthops as duplicate.
Since they are immutable now and re-used, move this marking into
when they are actually created to save a bunch of cycles.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley bc541126e4 zebra: Use nexthop object id on route delete
When we receive a route delete from the kernel and it
contains a nexthop object id, use that to match against
route gateways with instead of explicit nexthops.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley 38e40db1c9 zebra: Sweep our nexthop objects out on restart
On restart, if we failed to remove any nexthop objects due
to a kill -9 or such event, sweep them if we aren't using them.
Add a proto field to handle this and remove the is_kernel bool.

Add a dupicate flag that indicates this nexthop group is only
present in our ID hashtable. It is a dupicate nexthop we received
from the kernel, therefore we cannot hash on it.

Make the idcounter globally accessible so that kernel updates
increment it as soon as we receive them, not when we handle them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley 4b87c90d58 zebra: TODO for handling upper level nhe_id passing
We need to handle refcnt differently if we ever start making
upper level protocols aware of nhg_hash_entry IDs.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley f429bd1b24 zebra: Move resolve/add depend install into api
Move the resolving and installing of a single nhg_hash_entry
into the install function itself, rather than letting zebra_rib
handle it.

Further, ensure depends are installed/queued before installing
a group. The ordering should be find here since only one thread
will call this API.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley 8dfbc65724 zebra: Install the nhe along with the route
Move the installation of an nhe out of nexthop_active_update()
and into the rib install path. So, only install the nhe when
a route using it is being installed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley 7f99772169 zebra: Use nexthop/interface vrf, not the routes
When hashing/creating the NHE, use the nexthops vrf as its
source of data. This is gotten directly from an interface
and should not come from a route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 6df591527f zebra: Remove route only if NHE is installed check
Only remove a route if the nexthop it is using is still installed.
If a nexthop object is removed from the kernel, all routes referencing
it will be removed from the kernel.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 144a1b34df zebra: Put NHE ref updating into a function
When the referenced NHE changes for a route_entry, use this function
to handle it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 98cda54a95 zebra: Add recursive functionality to NHE's
Add the ability to recursively resolve nexthop group hash entries
and resolve them when sending to the kernel.

When copying over nexthops into an NHE, copy resolved info as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley e22e80010e zebra: Use a nhe context dataplane and rib metaq
We will use a nhe context for dataplane interaction with
nextho group hash entries.

New nhe's from the kernel will be put into a group array
if they are a group and queued on the rib metaq to be processed
later.

New nhe's sent to the kernel will be set on the dataplane context
with approprate ID's in the group array if needed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 13dfea50e4 zebra: Check for nexthop group before free on fail
Check to make sure the route entry has a nexthop
group before we try to free after a table lookup
failure.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley bbb3940ed1 zebra: VRF ID should be null if a nexthop group
A nexthop group should not have a VRF ID. Only individual
nexthops need to be using a VRF. Fixed this both kernel and
proto side.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley fe593b781d zebra: Re-organize/expose nhg_connected
Re-organize and expose the nhg_connected functions so that
it can be used outside zebra_nhg.c. And then abstract those
into zebra_nhg_depends_* and zebra_nhg_depenents_* functons.

Switch the ifp struct to use an RB tree for its dependents,
making use of the nhg_connected functions.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley a15d4c0089 zebra: Abstract the RB nodes/add dependents tree
Create a nhg_depenents tree that will function as a way
to get back pointers for NHE's depending on it.

Abstract the RB nodes into nhg_connected for both depends and
dependents. This same struct is used for both.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 0c8215cbab zebra,lib: Refactor depends to RB tree
Refactor the depends to use an RB tree instead of a list.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley ab942eb285 zebra: Protocol side nhg_hash_entry afi fix
Default the afi of the nexthop to the route entry using it.
If it turns out to be a group, update the afi to AFI_UNSPEC.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 86946a2207 zebra: Update rib_add_multipath with increment nhe
Update rib_add_multipath to use the reference count
increment function for nexthop group hash entries.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 20822f9d2e zebra: Add equivalence function for nhg_depends
Add a helper function to allow us to check if two
nhg_hash_entry's dependency lists are equal.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 2d6cd1f007 zebra: Always copy nhg and depends on nhe alloc
Changed our alloc function to just copy the nhg and
nhg_depends. This makes the zebra_nhg_find code a
little bit cleaner, hopefully preventing bugs.

The only issue with this is that it makes us have to loop
over the nexthops in a group an extra time for the copies.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 8e401b251f zebra: Refactor nexthop group creation code to use allocated memory
Simplify the code for nexthop hash entry creation. I made nexthop
hash entry creation expect the nexthop group and depends to always
be allocated before lookup. Before, it was only allocated if it had
dependencies. I think it makes the code a bit more readable to go
ahead an allocate even for single nexthops as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 85f5e76175 zebra: Read in nexthop dependencies from the kernel
Add functionality to read in a group from the kernel,
create a hash entry for it, and add its nexthops to
its dependency list.

Further, we create its nhg struct separtely from this,
copying over any nexthops it should reference directly
into it.

Thus, we have two types for representation of the nexthop group:
	nhe->nhg_depends->[nhe, nhe, nhe]

	nhe->nhg->nexthop->nexthop->nexthop

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 77b76fc900 Revert "zebra: Remove afi field in nexthop hash entry"
This reverts commit be73fe9393aac58c7f4bdb5c8a98c24c6cda6d5d.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 2614bf8764 zebra: Add ifp to zebra-side rib_add
Add an interface pointer for an nexthop group hash entry
when we are getting a rib_add for a new route.

Also, add the interface index to the `show nexthop-group` command.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley bbb322f292 zebra: Make route entry nexthop groups point to our hash entry
Make our route entry struct's re->ng nexthop group pointer
just point to the nhe->nhg nexthop hash entry nexthop group.
This will allow updates to the nexthop itself to propogate
to our routes immediately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley 8032b71737 zebra: Update rib_add to take a nexthop ID
Add a parameter to the rib_add function so that it takes
a nexthop ID from the kernel if one is passed along
with the route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley 9865e1c393 zebra: Add calls to the nexthop context process result function
Added in case statements to handle finished dataplane contexts
and then handle them with the nexthop process result function.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley 8e76637969 zebra: Route entries use nexthop entry ID's instead of pointers
Switched the route entries to use ID's instead of pointers.
Perform lookups with the ID and then check if its null.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley d9f5b2f50f zebra: Add functionality to parse RTM_NEWNEXTHOP and RTM_DELNEXTHOP messages
Add the functionality to parse new nexthop group messages
from the kernel and insert them into the appropriate hash
tables. Parsing is done at startup between interface and
interface address lookup. Add functionality to parse
changes to nexthops we already have. Add functionality
to parse delete nexthop messages from the kernel and
remove them from our table.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Stephen Worley 8b5bdc8bdf zebra: Remove afi field in nexthop hash entry
I do not believe we should be hashing based on AFI
in for our upper level nexthop group entries. These
should be ambiguous with regards to  address families since
an ipv4 or ipv6 address can have the same interface
nexthop. This can be seen in NEXTHOP_TYPE_IFINDEX.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp 9a0d4dd39b zebra: Remove nexthop_active_num from route entry
The nexthop_active_num data structure is a property of the
nexthop group.  Move the keeping of this data to that.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp eecacedc3b zebra: Remove re->nexthop_num from re
The nexthop_num is not a function of the re.  It is owned
by the nexthop group.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp 6b46851168 zebra: Replace nexthop_group with pointer in route entry
In the route_entry we are keeping a non pointer based
nexthop group, switch the code to use a pointer for all
operations here and ensure we create and delete the memory.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp 22bcedb231 zebra: Add code to create/remove nexthop groups
Add some code to create/remove nexthop groups.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:35 -04:00
Stephen Worley 2a99ab95e6 zebra: Handle rib updates as a thread event
If we need to batch process the rib (all tables or specific
vrf), do so as a scheduled thread event rather than immediately
handling it. Further, add context to the events so that you
narrow down to certain route types you want to reprocess.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-18 16:59:25 -04:00
Donald Sharp a8100f512a
Merge pull request #5079 from mjstapp/fix_dplane_drop_at_shut
zebra: during shutdown processing, drop dplane results
2019-10-03 11:49:07 -04:00
Mark Stapp 2fc69f03d2 zebra: during shutdown processing, drop dplane results
Don't process dataplane results in zebra during shutdown (after
sigint has been seen). The dplane continues to run in order to
clean up, but zebra main just drops results.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-27 12:15:34 -04:00
dturlupov 873fde8cd9 zebra: if_is_loopback_or_vrf crash if if_lookup_by_index return NULL
Function if_lookup_by_index() can return NULL, but in if_is_loopback_or_vrf() we don't chech NULL and get next:

Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x48) [0x7fb5f704cf18]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(zlog_signal+0x378) [0x7fb5f704d728]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(+0x6b495) [0x7fb5f706b495]
Sep 2 07:44:34 XXX zebra[4616]: /lib64/libpthread.so.0(+0x123b0) [0x7fb5f6d573b0]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(if_is_loopback+0) [0x7fb5f7045160]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(if_is_loopback_or_vrf+0x11) [0x7fb5f7045191]
Sep 2 07:44:34 XXX zebra[4616]: /usr/sbin/zebra() [0x43b26d]
Sep 2 07:44:34 XXX zebra[4616]: /usr/sbin/zebra() [0x43db6f]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(work_queue_run+0xc8) [0x7fb5f7080de8]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(thread_call+0x47) [0x7fb5f7077d27]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(frr_run+0xd8) [0x7fb5f704b448]

Signed-off-by: Dmitrii Turlupov dturlupov@factor-ts.ru
2019-09-26 16:11:50 +03:00
Donald Sharp 17a8a5beec
Merge pull request #4972 from mjstapp/fix_notif_installed
zebra: route updates from dataplane need to check all nexthops
2019-09-24 14:31:58 -04:00
Donald Sharp 16296beaa5
Merge pull request #4731 from mjstapp/fix_redist_update
zebra: redistribute deletes when updating selected route
2019-09-18 19:43:43 -04:00
Mark Stapp 11260e7011 zebra: check all dplane nexthops when processing
When processing route updates from the dataplane, we were
terminating the checking of nexthops prematurely, and we could
miss meaningful changes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-17 10:50:36 -04:00
Mark Stapp cbb9ddad16 zebra: remove empty, unused internal api
Remove a leftover, empty nht api call from zebra.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-16 12:00:14 -04:00
Mark Stapp 40f321c000 zebra: revise redistribution delete to improve update case
When selecting a new best route, zebra sends a redist update
when the route is installed. There are cases where redist
clients may not see that redist add - clients who are not
subscribed to the new route type, e.g. In that case, attempt
to send a redist delete for the old/previous route type.

Revised the redist delete api to accomodate both cases;
also tightened up the const-ness of a few internal redist apis.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-12 08:51:05 -04:00
Mark Stapp 0bbd4ff442 zebra: move EVPN VTEP programming to dataplane
Move VTEP install/uninstall to the zebra dataplane. Remove
synch kernel-facing apis and helper functions.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-04 10:30:17 -04:00
Donald Sharp c9042b2890
Merge pull request #4877 from mjstapp/dplane_neighs
zebra: move evpn neighbors to dataplane
2019-09-04 10:23:31 -04:00