Commit graph

830 commits

Author SHA1 Message Date
Stephen Worley 50db3f2f1d zebra: handle zapi routes with NHG ID set
Add code to properly handle routes sent with NHG ID rather
than a nexthop_group.

For now, we separate this from backup nexthop handling since that
should probably be added to the nhg_proto_add calls.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-09-28 12:40:59 -04:00
Donald Sharp 9271987f1e zebra: When we get a rib deletion event be smarter
When we get a rib deletion event and we already have
that particular route node in the queue to be reprocessed,
just note that someone from kernel land has done us dirty
and allow it to be cleaned up by normal processing

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-28 14:45:59 -04:00
Donald Sharp 0aaa722883 zebra: When shutting down an interface immediately notify about rnh
Imagine a situation where a interface is bouncing up/down.
The interface comes up and daemons like pbr will get a nht
tracking callback for a connected interface up and will install
the routes down to zebra.  At this same time the interface can
go down.  But since zebra is busy handling route changes ( from pbr )
it has not read the netlink message and can get into a situation
where the route resolves properly and then we attempt to install
it into the kernel( which is rejected ).  If the interface
bounces back up fast at this point, the down then up netlink
message will be read and create two route entries off the connected
route node.  Zebra will then enqueue both route entries for future processing.

After this processing happens the down/up is collapsed into an up
and nexthop tracking sees no changes and does not inform any upper
level protocol( in this case pbr ) that nexthop tracking has changed.
So pbr still believes the nexthops are good but the routes are not
installed since pbr has taken no action.

Fix this by immediately running rnh when we signal a connected
route entry is scheduled for removal.  This should cause
upper level protocols to get a rnh notification for the small
amount of time that the connected route was bouncing around like
a madman.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-28 14:45:59 -04:00
Donald Sharp b96f64f76f zebra: When we fail, actually note the failure
During testing it was noticed that routes were considered
installed by zebra, but the kernel did not have the route.
Upon close debugging of the rib it was noticed that FRR
was turning a dplane_ctx_route_init into a success and
FRR was now in a bad state.

2020/08/26 17:55:53.897436 PBR: route_notify_owner: [0.0.0.0/0] Route Removed succeeded for table: 10012
2020/08/26 17:55:53.897572 ZEBRA: 0.0.0.0/0: uptime == 432033, type == 24, instance == 0, table == 10012
2020/08/26 17:55:53.897622 ZEBRA: rib_meta_queue_add: (0:10012):0.0.0.0/0: queued rn 0x5566b0ea7680 into sub-queue 5
2020/08/26 17:55:53.907637 ZEBRA: default(0:10012):0.0.0.0/0: Processing rn 0x5566b0ea7680
2020/08/26 17:55:53.907665 ZEBRA: default(0:10012):0.0.0.0/0: Examine re 0x5566b0d01200 (pbr) status 2 flags 1 dist 200 metric 0
2020/08/26 17:55:53.907702 ZEBRA: default(0:10012):0.0.0.0/0: After processing: old_selected 0x0 new_selected 0x5566b0d01200 old_fib 0x0 new_fib 0x5566b0d01200
2020/08/26 17:55:53.907713 ZEBRA: default(0:10012):0.0.0.0/0: Adding route rn 0x5566b0ea7680, re 0x5566b0d01200 (pbr)
2020/08/26 17:55:53.907879 ZEBRA: default(0:10012):0.0.0.0/0: rn 0x5566b0ea7680 dequeued from sub-queue 5
2020/08/26 17:55:53.907943 ZEBRA: netlink_route_multipath: RTM_NEWROUTE 0.0.0.0/0 vrf 0(10012)
2020/08/26 17:55:53.910756 ZEBRA: default(0:10012):0.0.0.0/0 Processing dplane result ctx 0x5566b0ea82f0, op ROUTE_INSTALL result SUCCESS
2020/08/26 17:55:53.910769 ZEBRA: update_from_ctx: default(0:10012):0.0.0.0/0: SELECTED, re 0x5566b0d01200
2020/08/26 17:55:53.910785 ZEBRA: default(0:10012):0.0.0.0/0 update_from_ctx(): no fib nhg
2020/08/26 17:55:53.910793 ZEBRA: default(0:10012):0.0.0.0/0 update_from_ctx(): rib nhg matched, changed 'true'
2020/08/26 17:55:53.910802 ZEBRA: (0:10012):0.0.0.0/0: Redist update re 0x5566b0d01200 (pbr), old 0x0 (None)
2020/08/26 17:55:53.910812 ZEBRA: Notifying Owner: 24 about prefix 0.0.0.0/0(10012) 2 vrf: 0
2020/08/26 17:55:53.910912 PBR: route_notify_owner: [0.0.0.0/0] Route installed succeeded for table: 10012
2020/08/26 17:55:55.400516 ZEBRA: RTM_DELROUTE 0.0.0.0/0 vrf default(0) table_id: 10012 metric: 20 Admin Distance: 0
2020/08/26 17:55:55.400527 ZEBRA: rib_delete: (0:10012):0.0.0.0/0: rn 0x5566b0ea7680, re 0x5566b0d01200 (pbr) was deleted from kernel, adding

We were receiving a notification from the kernel that the route was deleted and deciding
that we needed to reinstall it.  At that point in time when it got into the dplane
handlers to convert it to the dplane pthread, the dplane decided to drop the request
convert it too a success and not do anything.

This code change removes the conversion from this failure to success and
notifies the upper level about it.  After this change the default route
to table 10012 is now properly marked as rejected:

root@mlx-2700-07:mgmt:/var/log/frr# vtysh -c "show ip route table 10012"
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF default table 10012:
F>r 0.0.0.0/0 [200/0] via 172.168.1.164, isp2-uplink (vrf PUBLIC), weight 1, 00:24:48

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-26 21:51:54 -04:00
Mark Stapp f515871207 zebra: fix SA warning in rib_process()
Fix an SA warning about a possible NULL pointer deref in
rib_process().

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-08-21 09:39:02 -04:00
Donald Sharp c2c02b76bc zebra: Add table id to debug output
There are a bunch of places where the table id is not being outputed
in debug messages for routing changes.  Add in the table id we
are operating on.  This is especially useful for the case where
pbr is working.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-19 13:59:29 -04:00
Jakub Urbańczyk d68e74b41c lib, zebra: add support for sending ARP requests
We can make the Linux kernel send an ARP/NDP request by adding
a neighbour with the 'NUD_INCOMPLETE' state and the 'NTF_USE' flag.

This commit adds new dataplane operation as well as new zapi message
to allow other daemons send ARP/NDP requests.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-08-12 23:19:58 +02:00
Jakub Urbańczyk 86d5622362 zebra: remove "PENDING" dplane request state
This request state is redundant with new message batching interface.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-08-10 21:33:00 +02:00
Sebastien Merle 31f937fb43 lib, zebra: Add SR-TE policy infrastructure to zebra
For the sake of Segment Routing (SR) and Traffic Engineering (TE)
Policies there's a need for additional infrastructure within zebra.
The infrastructure in this PR is supposed to manage such policies
in terms of installing binding SIDs and LSPs. Also it is capable of
managing MPLS labels using the label manager, keeping track of
nexthops (for resolving labels) and notifying interested parties about
changes of a policy/LSP state. Further it enables a route map mechanism
for BGP and SR-TE colors such that learned BGP routes can be mapped
onto SR-TE Policies.

This PR does not introduce any usable features by now, it is just
infrastructure for other upcoming PRs which will introduce 'pathd',
a new SR-TE daemon.

Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
2020-08-07 11:08:49 +02:00
Renato Westphal 790953a387
Merge pull request #6765 from mjstapp/backup_nhg_netlink
lib,zebra: support multiple backup nexthops
2020-07-27 12:49:36 -03:00
Stephen Worley 55528234ea
Merge pull request #6753 from mjstapp/fix_zebra_backup_sa
zebra: fix SA warnings in backup nexthop code
2020-07-17 17:29:49 -04:00
Mark Stapp 7483dcbe29 zebra: add a route_entry flag for FIB-specific nexthops
Add a route_entry flag to indicate the presence of a fib
(installed) list of nexthops - more explicit and clearer.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-17 13:12:33 -04:00
Mark Stapp 474aebd939 lib,sharpd,zebra: initial support for multiple backup nexthops
Initial changes to support a nexthop with multiple backups. Lib
changes to hold a small array in each primary, zapi message
changes to support sending multiple backups, and daemon
changes to show commands to support multiple backups. The config
input for multiple backup indices is not present here.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-17 13:12:33 -04:00
Mark Stapp 43a9f66cd1 zebra: fix SA warnings in backup nexthop code
Fix a couple of recent SA warnings that came from backup
nexthop/nhlfe changes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-16 11:00:17 -04:00
David Lamparter 3efd0893d0 *: un-split strings across lines
Remove mid-string line breaks, cf. workflow doc:

  .. [#tool_style_conflicts] For example, lines over 80 characters are allowed
     for text strings to make it possible to search the code for them: please
     see `Linux kernel style (breaking long lines and strings)
     <https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
     and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.

Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```

Signed-off-by: David Lamparter <equinox@diac24.net>
2020-07-14 10:37:25 +02:00
Mark Stapp 9959f1daba zebra: improve logic handling backup nexthop installation
When handling a fib notification event that involves a route
with backup nexthops, be clearer about representing the
installed state of the backups: any installed backup will be
on a dedicated route_entry list.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Mark Stapp 4db01e7914 zebra: add fib nhg for backups, revise api
Add an nhg for the fib-installed backup nexthops; rename an
api to access the fib-installed nexthop nhg.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Jakub Urbańczyk 2f74a82a11 zebra: prepare data plane for batching
* Add new zebra_dplane_result to allow kernel updates not to return
   a result immediately.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-06-26 22:03:44 +02:00
Mark Stapp 9db35a5e6f zebra: improve route_entry comparison logic
Improve and centralize some logic used to a) compare two
route_entries, and b) to locate a route_entry that matches
a dplane context object that contains the results of a
fib update. We were not rigorous enough in checking routes'
properties, especially when examining connected routes where
we allow multiple route_entries.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-06-25 08:21:27 -04:00
Jakub Urbańczyk f62e5480ec zebra: convert ip rule installation to use dplane thread
* Implement new dataplane operations
 * Convert existing code to use dataplane context object
 * Modify function preparing netlink message to use dataplane
   context object

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-06-10 16:18:45 +02:00
Renato Westphal 30712725a0
Merge pull request #6480 from volta-networks/feat_pwstatus
ldpd: Relay data plane pseudowire status in LDP notification
2020-06-01 21:00:51 -03:00
Mark Stapp daaeaa2150 zebra: init dest's list of routes
Use the dlist init api on the zebra dest object's list
of routes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-06-01 14:46:32 -04:00
Karen Schoener fd563cc7f3 ldpd: Relay data plane pseudowire status in LDP notification
Provide a way for the data plane to indicate pseudowire
status (such as: not forwarding, AC failure).

On a data plane pseudowire install failure, data plane
sets the pseudowire status.
Zebra relays the pseudowire status to LDP.
LDP includes the pseudowire status in the LDP notification
to the LDP peer.

Signed-off-by: Karen Schoener <karen@voltanet.io>
2020-06-01 13:21:37 -04:00
Emanuele Di Pascale cd7108ba92 zebra: avoid using c++ keywords in headers
to make sure that c++ code can include them, avoid using reserved
keywords like 'delete' or 'new'.

Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
2020-05-14 16:42:47 +02:00
Donald Sharp 91e6f25bc0 zebra: remove typedef rib_update_event_t from system
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-08 08:10:49 -04:00
Donald Sharp 630d596249 zebra: Remove typedef rib_table_info_t from system
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-08 08:10:49 -04:00
Stephen Worley 090152ec9c
Merge pull request #5786 from mjstapp/fix_notif_empty_nhg
zebra: fix handling of failed route install via notification
2020-04-29 12:28:56 -04:00
Mark Stapp a126f12003 zebra: fix handling of failed route install via notification
An async route notification can indicate that installation
has failed, but the handling code wasn't dealing with that
possibility correctly.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-04-27 10:24:55 -04:00
Quentin Young 772270f3b6 *: sprintf -> snprintf
Replace sprintf with snprintf where straightforward to do so.

- sprintf's into local scope buffers of known size are replaced with the
  equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
  size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
  buffer followed by strlcat

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-04-20 19:14:33 -04:00
Jakub Urbańczyk bd47f3a3b4 zebra: Add vrf name and id to debugs
In some places we log the interface but not the vfr the
interface is in. In others we only output the vrf id, which
can be difficult for human to read. This commit makes zebra
debugs more vrf aware.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-04-12 21:03:29 +02:00
Donatas Abraitis c4efd0f423 *: Do not cast to the same type
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-04-08 17:15:06 +03:00
Stephen Worley ff82bbbb91
Merge pull request #5901 from mjstapp/backup_nh_prep
zebra, lib: Backup nexthop (path) prep work
2020-03-30 10:26:17 -04:00
David Lamparter 07ef3e34ae lib: prepare for plugin-based frr_format check
Signed-off-by: David Lamparter <equinox@diac24.net>
2020-03-29 10:45:46 +02:00
Mark Stapp 377e29f7e7 zebra: handle backup nexthops in nhe/nhgs
Include backup nexthops in nhe processing; connect incoming
zapi route data with updated rib/nhg apis; add more debugs in
nhg processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp 6d81b590a9 zebra: improve route debugging and add support for backups
Refactor the detailed route debugging so that the dump of nexthops
can be used for both normal/active nexthops and backups (if they
are present).

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp 1d48702ede zebra: add per-nexthop backup index
Use a backup index in a nexthop directly (if it has a backup
nexthop); revise the zebra nhe/nhg code; revise zapi route
decoding to match; revise the dataplane route datastructs.

Refactor some of the rib_add_multipath code to be prepared to
be called with an nhe, carrying nexthop and (possibly) backup
info together.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Donatas Abraitis 15569c58f8 *: Replace __PRETTY_FUNCTION__/__FUNCTION__ to __func__
Just keep the code cool.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-03-05 20:23:23 +02:00
Ruben Kerkhof 05267678eb zebra: fix typo in debug log message
Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
2020-03-04 16:08:18 +01:00
Mark Stapp c415d89528 zebra: Embed lib nexthop-group in zebra hash entry
Embed nexthop-group, which is just a pointer, in the zebra
nexthop-hash-entry object, rather than mallocing one.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-02-27 15:49:31 -05:00
Donald Sharp c479e75665 zebra: Add vrf name to debug output
The vrf id is insufficient of a discriminator in people's head
Give them what they need.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-02-14 08:41:42 -05:00
Quentin Young efa618369a
Merge pull request #5794 from mjstapp/remove_nexthop_matched_flag
lib,zebra: remove unused MATCHED nexthop flag
2020-02-12 11:29:22 -05:00
Mark Stapp 0641a955d7 lib,zebra: remove unused MATCHED nexthop flag
Remove an unused flag value from the nexthop struct.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-02-11 15:56:35 -05:00
Russ White 5bf7fe566d
Merge pull request #5722 from donaldsharp/kernel_routes
Kernel routes
2020-02-06 08:04:42 -05:00
Quentin Young b3ba5dc7fe *: don't null after XFREE; XFREE does this itself
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-02-03 11:22:13 -05:00
Donald Sharp 3332f4f0fb zebra: Kernel routes w/ AD were not being marked as installed
When we are receiving a kernel route, with an admin distance
of 255 we are not marking it as installed.  This route
should be marked as installed.

New behavior:
K>* 4.5.7.0/24 [255/8192] via 192.168.209.1, enp0s8, 00:10:14

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-23 17:17:01 -05:00
Mark Stapp 9287b4c50f zebra: route changes via notify path trigger nht and mpls
Changes to a route via the dataplane notify path should
trigger nht and mpls lsp processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-01-06 10:09:47 -05:00
Stephen Worley 84a89a8d2e zebra: null check re->nhe not re->nhe->nhg on attach
We should be NULL checking the entire re->nhe struct, not
the group inside of it. When we get routes from the kernel
using a nexthop group (and future protocols) they will only
pass us an ID to use. Hence, this struct can (and will be)
NULL on first attach when only passed an ID.

There shouldn't be a situation where we have an re->nhe
and don't have an re->nhe->nhg anyway.

Before this patch you can easily make zebra crash by creating a
route in the kernel using a nexthop group and starting zebra.

`ip next add dev lo id 111`
`ip route add 1.1.1.1/32 nhid 111`

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-12-16 16:37:14 -05:00
Mark Stapp 1f6a5aca26 zebra: handle route notification with no nexthops
Handle the special case where a route update contains
no installed nexthops - that means the route is not
installed.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-12 12:55:51 -05:00
Mark Stapp 4c0b5436f9 zebra: accept async notification for un-install
Handle an async notification when a route-update operation
uninstalls one route in favor of another.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-11 11:22:53 -05:00
Mark Stapp 1c30d64bb6 zebra: align dplane notify processing with nhg work
The processing of dataplane route notifications was a little
off-target after the nexthop-group re-work. This should allow
notifications to work better.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-09 16:19:14 -05:00
Donald Sharp e302caaa81
Merge pull request #5416 from mjstapp/re_nhe_pointer
lib,zebra: use shared nexthop-group in route_entry
2019-12-04 14:11:04 -05:00
Mark Stapp 0eb97b860d lib,zebra: use nhg_hash_entry pointer in route_entry
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-04 08:13:52 -05:00
David Lamparter 2b64873d24 *: generously apply const
const const const your boat, merrily down the stream...

Signed-off-by: David Lamparter <equinox@diac24.net>
2019-12-02 15:01:29 +01:00
Mark Stapp 5463ce26c3 zebra: clean up rib and nhg headers
Clean up the relationships between zebra's rib and nexthop-group
headers as prep for adding a nexthop-group pointer to the
route_entry.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-11-21 15:05:52 -05:00
Stephen Worley ace3bbba4b zebra: Don't clear nexthop fib flag on rib_install
We cannot clear the NEXTHOP_FLAG_FIB nexthop flag
when sending routes to the dataplane anymore since
nexthops are now shared.

We were seeing a situation where if we delete a route
using a nexthop group that is still active with another
route, the fib flag was being unset by this code
path despite them still being valid fib nexthops with the
other route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 01:24:39 -05:00
Stephen Worley c7c0b007a4 zebra: separate zebra_vrf_lookup_table_with_id()
We were creating `other` tables in rib_del(), vty commands, and
dataplane return callback via the zebra_vrf_table_with_table_id()
API.

Seperate the API into only a lookup, never create
and added another with `get` in the name (following the standard
we use in other table APIs).

Then changed the rib_del(), rib_find_rn_from_ctx(), and show route
summary vty command to use the lookup API instead.

This was found via a crash where two different vrfs though they owned
the table. On delete, one free'd all the nodes, and then the other tried
to use them. It required specific timing of a VRF existing, going away,
and coming back again to cause the crash.

=23464== Invalid read of size 8
==23464==    at 0x179EA4: rib_dest_from_rnode (rib.h:433)
==23464==    by 0x17ACB1: zebra_vrf_delete (zebra_vrf.c:253)
==23464==    by 0x48F3D45: vrf_delete (vrf.c:243)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==    by 0x13D8C5: sigint (main.c:172)
==23464==    by 0x48DD25C: quagga_sigevent_process (sigevent.c:105)
==23464==    by 0x48F0502: thread_fetch (thread.c:1417)
==23464==    by 0x48AC82B: frr_run (libfrr.c:1023)
==23464==    by 0x13DD02: main (main.c:483)
==23464==  Address 0x5152788 is 104 bytes inside a block of size 112 free'd
==23464==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B25B8: qfree (memory.c:129)
==23464==    by 0x48EA335: route_node_destroy (table.c:500)
==23464==    by 0x48E967F: route_node_free (table.c:90)
==23464==    by 0x48E9742: route_table_free (table.c:124)
==23464==    by 0x48E9599: route_table_finish (table.c:60)
==23464==    by 0x170CEA: zebra_router_free_table (zebra_router.c:165)
==23464==    by 0x170DB4: zebra_router_release_table (zebra_router.c:188)
==23464==    by 0x17AAD2: zebra_vrf_disable (zebra_vrf.c:222)
==23464==    by 0x48F3F0C: vrf_disable (vrf.c:313)
==23464==    by 0x48F3CCF: vrf_delete (vrf.c:223)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==  Block was alloc'd at
==23464==    at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B24A2: qcalloc (memory.c:110)
==23464==    by 0x48EA2FE: route_node_create (table.c:488)
==23464==    by 0x48E95C7: route_node_new (table.c:66)
==23464==    by 0x48E95E5: route_node_set (table.c:75)
==23464==    by 0x48E9EA9: route_node_get (table.c:326)
==23464==    by 0x48E1EDB: srcdest_rnode_get (srcdest_table.c:244)
==23464==    by 0x16EA4B: rib_add_multipath (zebra_rib.c:2730)
==23464==    by 0x1A5310: zread_route_add (zapi_msg.c:1592)
==23464==    by 0x1A7B8E: zserv_handle_commands (zapi_msg.c:2579)
==23464==    by 0x19D689: zserv_process_messages (zserv.c:523)
==23464==    by 0x48F09F8: thread_call (thread.c:1599)

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-01 16:06:19 -04:00
Stephen Worley d3a3513811 lib,pbrd,zebra: Use one api to delete nexthops/group
Reduce the api for deleting nexthops and the containing
group to just one call rather than having a special case
and handling it separately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley fec211ad95 zebra: Zebra nexthop group re-work checkpatch fixes
Checkpatch fixes for the zebra nexthop group re-work.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley 986a6617cc zebra: Optimize the fib/notified nexthop matching
Optimize the fib and notified nexthop group comparison algorithm
to assume ordering. There were some pretty serious performance hits with
this on high ecmp routes.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley 9ef49038d5 lib,zebra: Move nexthop dup marking into creation
We were waiting until install time to mark nexthops as duplicate.
Since they are immutable now and re-used, move this marking into
when they are actually created to save a bunch of cycles.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley bc541126e4 zebra: Use nexthop object id on route delete
When we receive a route delete from the kernel and it
contains a nexthop object id, use that to match against
route gateways with instead of explicit nexthops.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley 38e40db1c9 zebra: Sweep our nexthop objects out on restart
On restart, if we failed to remove any nexthop objects due
to a kill -9 or such event, sweep them if we aren't using them.
Add a proto field to handle this and remove the is_kernel bool.

Add a dupicate flag that indicates this nexthop group is only
present in our ID hashtable. It is a dupicate nexthop we received
from the kernel, therefore we cannot hash on it.

Make the idcounter globally accessible so that kernel updates
increment it as soon as we receive them, not when we handle them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley 4b87c90d58 zebra: TODO for handling upper level nhe_id passing
We need to handle refcnt differently if we ever start making
upper level protocols aware of nhg_hash_entry IDs.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley f429bd1b24 zebra: Move resolve/add depend install into api
Move the resolving and installing of a single nhg_hash_entry
into the install function itself, rather than letting zebra_rib
handle it.

Further, ensure depends are installed/queued before installing
a group. The ordering should be find here since only one thread
will call this API.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley 8dfbc65724 zebra: Install the nhe along with the route
Move the installation of an nhe out of nexthop_active_update()
and into the rib install path. So, only install the nhe when
a route using it is being installed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley 7f99772169 zebra: Use nexthop/interface vrf, not the routes
When hashing/creating the NHE, use the nexthops vrf as its
source of data. This is gotten directly from an interface
and should not come from a route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 6df591527f zebra: Remove route only if NHE is installed check
Only remove a route if the nexthop it is using is still installed.
If a nexthop object is removed from the kernel, all routes referencing
it will be removed from the kernel.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 144a1b34df zebra: Put NHE ref updating into a function
When the referenced NHE changes for a route_entry, use this function
to handle it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 98cda54a95 zebra: Add recursive functionality to NHE's
Add the ability to recursively resolve nexthop group hash entries
and resolve them when sending to the kernel.

When copying over nexthops into an NHE, copy resolved info as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley e22e80010e zebra: Use a nhe context dataplane and rib metaq
We will use a nhe context for dataplane interaction with
nextho group hash entries.

New nhe's from the kernel will be put into a group array
if they are a group and queued on the rib metaq to be processed
later.

New nhe's sent to the kernel will be set on the dataplane context
with approprate ID's in the group array if needed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley 13dfea50e4 zebra: Check for nexthop group before free on fail
Check to make sure the route entry has a nexthop
group before we try to free after a table lookup
failure.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley bbb3940ed1 zebra: VRF ID should be null if a nexthop group
A nexthop group should not have a VRF ID. Only individual
nexthops need to be using a VRF. Fixed this both kernel and
proto side.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley fe593b781d zebra: Re-organize/expose nhg_connected
Re-organize and expose the nhg_connected functions so that
it can be used outside zebra_nhg.c. And then abstract those
into zebra_nhg_depends_* and zebra_nhg_depenents_* functons.

Switch the ifp struct to use an RB tree for its dependents,
making use of the nhg_connected functions.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley a15d4c0089 zebra: Abstract the RB nodes/add dependents tree
Create a nhg_depenents tree that will function as a way
to get back pointers for NHE's depending on it.

Abstract the RB nodes into nhg_connected for both depends and
dependents. This same struct is used for both.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 0c8215cbab zebra,lib: Refactor depends to RB tree
Refactor the depends to use an RB tree instead of a list.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley ab942eb285 zebra: Protocol side nhg_hash_entry afi fix
Default the afi of the nexthop to the route entry using it.
If it turns out to be a group, update the afi to AFI_UNSPEC.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 86946a2207 zebra: Update rib_add_multipath with increment nhe
Update rib_add_multipath to use the reference count
increment function for nexthop group hash entries.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 20822f9d2e zebra: Add equivalence function for nhg_depends
Add a helper function to allow us to check if two
nhg_hash_entry's dependency lists are equal.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley 2d6cd1f007 zebra: Always copy nhg and depends on nhe alloc
Changed our alloc function to just copy the nhg and
nhg_depends. This makes the zebra_nhg_find code a
little bit cleaner, hopefully preventing bugs.

The only issue with this is that it makes us have to loop
over the nexthops in a group an extra time for the copies.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 8e401b251f zebra: Refactor nexthop group creation code to use allocated memory
Simplify the code for nexthop hash entry creation. I made nexthop
hash entry creation expect the nexthop group and depends to always
be allocated before lookup. Before, it was only allocated if it had
dependencies. I think it makes the code a bit more readable to go
ahead an allocate even for single nexthops as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 85f5e76175 zebra: Read in nexthop dependencies from the kernel
Add functionality to read in a group from the kernel,
create a hash entry for it, and add its nexthops to
its dependency list.

Further, we create its nhg struct separtely from this,
copying over any nexthops it should reference directly
into it.

Thus, we have two types for representation of the nexthop group:
	nhe->nhg_depends->[nhe, nhe, nhe]

	nhe->nhg->nexthop->nexthop->nexthop

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 77b76fc900 Revert "zebra: Remove afi field in nexthop hash entry"
This reverts commit be73fe9393aac58c7f4bdb5c8a98c24c6cda6d5d.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley 2614bf8764 zebra: Add ifp to zebra-side rib_add
Add an interface pointer for an nexthop group hash entry
when we are getting a rib_add for a new route.

Also, add the interface index to the `show nexthop-group` command.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley bbb322f292 zebra: Make route entry nexthop groups point to our hash entry
Make our route entry struct's re->ng nexthop group pointer
just point to the nhe->nhg nexthop hash entry nexthop group.
This will allow updates to the nexthop itself to propogate
to our routes immediately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley 8032b71737 zebra: Update rib_add to take a nexthop ID
Add a parameter to the rib_add function so that it takes
a nexthop ID from the kernel if one is passed along
with the route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley 9865e1c393 zebra: Add calls to the nexthop context process result function
Added in case statements to handle finished dataplane contexts
and then handle them with the nexthop process result function.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley 8e76637969 zebra: Route entries use nexthop entry ID's instead of pointers
Switched the route entries to use ID's instead of pointers.
Perform lookups with the ID and then check if its null.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley d9f5b2f50f zebra: Add functionality to parse RTM_NEWNEXTHOP and RTM_DELNEXTHOP messages
Add the functionality to parse new nexthop group messages
from the kernel and insert them into the appropriate hash
tables. Parsing is done at startup between interface and
interface address lookup. Add functionality to parse
changes to nexthops we already have. Add functionality
to parse delete nexthop messages from the kernel and
remove them from our table.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Stephen Worley 8b5bdc8bdf zebra: Remove afi field in nexthop hash entry
I do not believe we should be hashing based on AFI
in for our upper level nexthop group entries. These
should be ambiguous with regards to  address families since
an ipv4 or ipv6 address can have the same interface
nexthop. This can be seen in NEXTHOP_TYPE_IFINDEX.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp 9a0d4dd39b zebra: Remove nexthop_active_num from route entry
The nexthop_active_num data structure is a property of the
nexthop group.  Move the keeping of this data to that.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp eecacedc3b zebra: Remove re->nexthop_num from re
The nexthop_num is not a function of the re.  It is owned
by the nexthop group.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp 6b46851168 zebra: Replace nexthop_group with pointer in route entry
In the route_entry we are keeping a non pointer based
nexthop group, switch the code to use a pointer for all
operations here and ensure we create and delete the memory.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp 22bcedb231 zebra: Add code to create/remove nexthop groups
Add some code to create/remove nexthop groups.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:35 -04:00
Stephen Worley 2a99ab95e6 zebra: Handle rib updates as a thread event
If we need to batch process the rib (all tables or specific
vrf), do so as a scheduled thread event rather than immediately
handling it. Further, add context to the events so that you
narrow down to certain route types you want to reprocess.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-18 16:59:25 -04:00
Donald Sharp a8100f512a
Merge pull request #5079 from mjstapp/fix_dplane_drop_at_shut
zebra: during shutdown processing, drop dplane results
2019-10-03 11:49:07 -04:00
Mark Stapp 2fc69f03d2 zebra: during shutdown processing, drop dplane results
Don't process dataplane results in zebra during shutdown (after
sigint has been seen). The dplane continues to run in order to
clean up, but zebra main just drops results.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-27 12:15:34 -04:00
dturlupov 873fde8cd9 zebra: if_is_loopback_or_vrf crash if if_lookup_by_index return NULL
Function if_lookup_by_index() can return NULL, but in if_is_loopback_or_vrf() we don't chech NULL and get next:

Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x48) [0x7fb5f704cf18]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(zlog_signal+0x378) [0x7fb5f704d728]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(+0x6b495) [0x7fb5f706b495]
Sep 2 07:44:34 XXX zebra[4616]: /lib64/libpthread.so.0(+0x123b0) [0x7fb5f6d573b0]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(if_is_loopback+0) [0x7fb5f7045160]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(if_is_loopback_or_vrf+0x11) [0x7fb5f7045191]
Sep 2 07:44:34 XXX zebra[4616]: /usr/sbin/zebra() [0x43b26d]
Sep 2 07:44:34 XXX zebra[4616]: /usr/sbin/zebra() [0x43db6f]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(work_queue_run+0xc8) [0x7fb5f7080de8]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(thread_call+0x47) [0x7fb5f7077d27]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(frr_run+0xd8) [0x7fb5f704b448]

Signed-off-by: Dmitrii Turlupov dturlupov@factor-ts.ru
2019-09-26 16:11:50 +03:00
Donald Sharp 17a8a5beec
Merge pull request #4972 from mjstapp/fix_notif_installed
zebra: route updates from dataplane need to check all nexthops
2019-09-24 14:31:58 -04:00
Donald Sharp 16296beaa5
Merge pull request #4731 from mjstapp/fix_redist_update
zebra: redistribute deletes when updating selected route
2019-09-18 19:43:43 -04:00
Mark Stapp 11260e7011 zebra: check all dplane nexthops when processing
When processing route updates from the dataplane, we were
terminating the checking of nexthops prematurely, and we could
miss meaningful changes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-17 10:50:36 -04:00
Mark Stapp cbb9ddad16 zebra: remove empty, unused internal api
Remove a leftover, empty nht api call from zebra.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-16 12:00:14 -04:00
Mark Stapp 40f321c000 zebra: revise redistribution delete to improve update case
When selecting a new best route, zebra sends a redist update
when the route is installed. There are cases where redist
clients may not see that redist add - clients who are not
subscribed to the new route type, e.g. In that case, attempt
to send a redist delete for the old/previous route type.

Revised the redist delete api to accomodate both cases;
also tightened up the const-ness of a few internal redist apis.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-12 08:51:05 -04:00
Mark Stapp 0bbd4ff442 zebra: move EVPN VTEP programming to dataplane
Move VTEP install/uninstall to the zebra dataplane. Remove
synch kernel-facing apis and helper functions.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-04 10:30:17 -04:00
Donald Sharp c9042b2890
Merge pull request #4877 from mjstapp/dplane_neighs
zebra: move evpn neighbors to dataplane
2019-09-04 10:23:31 -04:00
David Lamparter 00dffa8cde lib: add frr_with_mutex() block-wrapper
frr_with_mutex(...) { ... } locks and automatically unlocks the listed
mutex(es) when the block is exited.  This adds a bit of safety against
forgetting the unlock in error paths & co. and makes the code a slight
bit more readable.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2019-09-03 17:15:17 +02:00
Mark Stapp 931fa60c09 zebra: Use dataplane for evpn neighbor changes
Move neighbor programming to the dataplane; remove
old apis; remove some ifdef'd use of direct netlink
code points, using neutral values outside of the netlink-
specific files.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-08-23 14:10:41 -04:00
Donald Sharp ad4c7e8d4e
Merge pull request #4778 from mjstapp/dplane_macs
zebra: use dataplane for evpn macs
2019-08-21 20:26:29 -04:00
Mark Stapp 272e89030e zebra: clear route QUEUED flag in async notification handler
Ensure that the route-entry QUEUED flag is cleared in the async
notification path, as it is in the normal results processing
code path.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-08-05 10:52:31 -04:00
Mark Stapp 036d93c047 zebra: use dataplane for vxlan remote mac programming
Move vxlan remote MAC install and uninstall to the async
dataplane.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-08-02 14:54:16 -04:00
Donald Sharp 228a811a2e zebra: Redistribution should be told about the old route
When we are sending a redistribute_update, pass the old_re in
so that if we still have it around we can update the calling protocol.

Test:

router ospf
  redistribute sharp
!

sharp install route 4.5.6.7 nexthop 192.168.201.1 1

Now add a `ip route 4.5.6.7/32 192.168.201.1`.
This causes zebra to replace the sharp route with the static route.
No update is sent to ospf and debug:
2019/08/01 19:02:38.271998 ZEBRA: 0:4.5.6.7/32: Redist update re 0x12fdbda0 (static), old 0x0 (None)

With fix:

2019/08/01 19:15:09.644499 ZEBRA: 0:4.5.6.7/32: Redist update re 0x1ba5bce0 (static), old 0x1beea4e0 (sharp)
2019/08/01 19:15:09.645462 OSPF: ospf_zebra_read_route: from client sharp: vrf_id 0, p 4.5.6.7/32

Ticket: CM-25847
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-08-01 19:09:59 -04:00
Russ White 3d07ec896e
Merge pull request #4746 from donaldsharp/zebra_rib_improvements
Zebra rib improvements
2019-07-30 11:11:41 -04:00
Donald Sharp 42fc558ee3 zebra, tests: Remove ROUTE_ENTRY_NEXTHOPS_CHANGED
The flag ROUTE_ENTRY_NEXTHOPS_CHANGED is only ever set or unset.
Since this flag is not used for anything useful, remove from system.

By changing this flag we have re-ordered `internalStatus' of json
output of zebra rib routes.  Go through and fix up tetsts to
use the new values.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-29 14:53:58 -04:00
Donald Sharp b5046a3c50 zebra: Remove repeated enqueueing of system routes for rethinking
The code as written before this code change point would enqueue
every system route type to be refigured when we have an
interface event.  I believe this was to originally handle bugs
in the way nexthop tracking was handled, mainly that if you keep
asking the question you'll eventually get the right answer.

Modify the code to not do this, we have fixed nexthop tracking
to not be so brain dead and to know when it needs to refigure
a route that it is tracking.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-29 11:39:06 -04:00
Don Slice 6d0ee6a0d4 zebra: skip queued entries when resolving nexthop
Problem reported where certain routes were not being passed on to
clients if they were operated on while still queued for kernel
installation.   Changed it to defer working on entries that were
queued to dplane so we could operate on them after getting an
answer back from kernel installatino.

Ticket: CM-25480
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-07-26 17:26:10 +00:00
Mark Stapp 7c7ef4a8c8 zebra: consolidate dataplane interface name and ifindex attrs
Move interface name and index to shared data struct, and remove
operation-specific values.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-07-25 14:01:22 -04:00
Donald Sharp cf05bc9424 zebra: Modify zebra to order nexthops received
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-09 10:44:41 -04:00
David Lamparter a5ddc34dc7
lib: Add a couple functions to nexthop_group.c (#4323)
lib: Add a couple functions to nexthop_group.c

Co-authored-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-03 14:42:35 +02:00
Sri Mohana Singamsetty 1c05eb4419
Merge pull request #4622 from donaldsharp/import_table_fix
Import table fix
2019-07-02 11:20:02 -07:00
Quentin Young 2951a7a4c2 *: s/TRUE/true/, s/FALSE/false/
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-07-01 17:26:05 +00:00
Stephen Worley 50d8965075 lib: Private api for nexthop_group manipulation
Add a file that exposes functions which modify nexthop groups.
Nexthop groups are techincally immutable but there are a
few special cases where we need direct access to add/remove
nexthops after the group has been made. This file provides a
way to expose those functions in a way that makes it clear
this is a private/hidden api.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-06-25 22:58:48 -04:00
Donald Sharp fe257ae733 zebra: Push VRF_DEFAULT outside of import table code
The import table code assumes that they will only work
in the default vrf.  This is ok, but we should push the
vrf_id and zvrf to be passed in instead of just using
VRF_DEFAULT.

This will allow us to fix a couple of things:

1) A bug in import where we are not creating the
route entry with the appropriate table so the imported
entry is showing up in the wrong spot.

2) In the future allow `ip import-table X` to become
vrf aware very easily.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-25 17:47:41 -04:00
Donald Sharp 82d7d0e28a zebra: Improve debugging when we can't find a route to delete
Improve debugging when we cannot find a route to delete
that we have been told to delete.

New output:

2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.7/32 doesn't exist in rib
2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.8/32 doesn't exist in rib

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-25 17:47:41 -04:00
Donald Sharp 53c16fbec0 zebra: Put route in debug dump of rib data
When dumping rib data about a route for `debug rib detail`
modify the dump command to display the prefix as part
of every line so that we can use a grep on the log
file.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-20 04:55:47 -04:00
Don Slice 739c9c90e7 zebra: resolve issue with rnh not evaluating nexhops correctly
Problem discovered in testing that occasionally when an interface
address was flushed, the corresponding route would be removed from
the kernel and zebra but remain in the bgp table and be advertised
to peers.  Discovered that when zebra_rib_evaluate_nexthops spun
thru the tree list of rns, if the timing and circumstances were
right, it would move elements and miss evaluating some.  Changed
from frr_each to frr_each_safe and the problem is now gone.

Ticket: CM-25301
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-06-19 07:06:32 -07:00
Donald Sharp 0a7be32866 zebra: Display a bit better debugging for rnh tracking
Add a expected count for the route node we will be processing
as part of nexthop resolution and modify the type to display
a useful string of what the type is instead of a number.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-18 15:47:10 -04:00
David Lamparter 731bea2844
Merge pull request #4417 from sworleys/Move-Multicast-Mode
zebra: Move multicast mode to being a property of the router
2019-06-03 15:59:48 +02:00
Sri Mohana Singamsetty 979dd989c4
Merge pull request #4413 from donaldsharp/bgp_distance_comes_closer
Bgp distance comes closer
2019-05-30 09:45:43 -07:00
Donald Sharp 526052fb6d zebra: Move multicast mode to being a property of the router
The multicast mode enum was a global static in zebra_rib.c
it does not belong there, it belongs in zebra_router, moving.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-29 15:25:33 -04:00
Quentin Young eab4a5c2d0 zebra: strcat -> strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-05-29 18:03:26 +00:00
Mark Stapp 827debeac2
Merge pull request #4326 from sworleys/Move-NH-Active-Functions
zebra: Move nexthop_active_XXX functions to zebra_nhg.c
2019-05-29 11:35:27 -04:00
Donald Sharp 94c08afe02
Merge pull request #4228 from mjstapp/dplane_notif
zebra: async notifications from the dataplane
2019-05-29 10:10:05 -04:00
Donald Sharp eea2ef0754 zebra: BGP always sends distance no need to double check
BGP always sends down the correct distance to use.  We do
not need rib_add_multipath to double check the code.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-29 08:57:11 -04:00
Stephen Worley ad28e79ac9 zebra: Move nexthop_active_XXX functions to zebra_nhg.c
Since these functions are not really rib processing problems
let's move them to zebra_nhg.c which is meant for processing of
nexthop groups.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-28 17:41:38 -04:00
Mark Stapp 188a00e014 zebra: generate updates from notifications
If an async notification changes a route that's current,
generate an update to keep the kernel in sync.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 13:41:37 -04:00
Renato Westphal f6fd430e44
Merge pull request #4322 from sworleys/Nexthop-Cmp
lib: Add nexthop_cmp
2019-05-28 11:32:44 -03:00
Mark Stapp 104e3ad95e zebra: mpls lsp async notifications
Add LSP notification event type; add a handler for LSP notifs;
dispatch to that handler.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:35:01 -04:00
Mark Stapp efe6c026a9 zebra: support route changes via dplane notifications
Allow route notifications to trigger route state changes,
such as installed -> not installed.

Clean up the fib-specific nexthop-group in a couple of
un-install paths.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:34:31 -04:00
Mark Stapp 941e261c97 zebra: share rib processing of updates and notifications
Use some common handling for both route update results
processing and dataplane notification processing. Use the
fib-specific nexthop-group if the update to a route results
in different nexthop status than the default rib-provided
nexthop-group.

Use the fib-specific nexthop-group, if present, to provide
the output of 'show ip fib'.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:34:21 -04:00
Mark Stapp ee5e8a4820 zebra: add a fib-specific nexthop-group
Add a fib-specific nhg, distinct from the nhg developed from
the route-owner / RIB information.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:27:42 -04:00
Mark Stapp 54818e3b01 zebra: begin dataplane notifications
Add dataplane route notification type; add handler for zebra
routes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:22:27 -04:00
Mark Stapp 5695d9ac5d zebra: set nexthop install state more accurately
When setting route nexthops' installation state based on a
dataplane context struct, unset the installed state if a
nexthop was not present in the dataplane context.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:22:21 -04:00
Mark Stapp fad4d69cd4 zebra: add api to locate route-node from dplane ctx
Create a helper api that locates a zebra route-node from info
in a dplane context struct. Moved code from the results handler
to make a more-general api that could be used in other paths.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:21:20 -04:00
Mark Stapp 78bf56b0b6 zebra: add api to update route from dplane ctx
Add an api to update the status of a route based on info
from a dplane context object. Use the api when processing
route update results from the dataplane.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:21:09 -04:00
Donald Sharp d4644d4196 zebra: Add kernel level graceful restart
<Initial Code from Praveen Chaudhary>

Add the a `--graceful_restart X` flag to zebra start that
now creates a timer that pops in X seconds and will go
through and remove all routes that are older than startup.

If graceful_restart is not specified then we will just pop
a timer that cleans everything up immediately.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-23 19:35:42 -04:00
Stephen Worley a5a2d802d7 lib,zebra,bgpd,pbrd: Compare nexthops without labels
Allow label ignoring when comparing nexthops. Specifically,
add another functon nexthop_same_no_labels() that shares
a path with nexthop_same() but doesn't check labels.

rib_delete() needs to ignore labels in this case.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-23 12:21:15 -04:00
Stephen Worley 78fba41bd8 lib,zebra,bgpd: Remove nexthop_same_no_recurse()
The functions nexthop_same() does not check the resolved
nexthops so I don't think this function is even needed
anymore.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-23 12:21:15 -04:00
Renato Westphal 81fddbe7ae *: rename new ForEach macros from the typesafe API
This is necessary to avoid a name collision with std::for_each
from C++.

Fixes the compilation of the gRPC northbound module.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-05-21 15:59:08 -03:00
Quentin Young 303b93cdee zebra: update zebra_rib for vrrp
VRRP doesn't install any routes, but should still have an array entry.
Also add a help string for VRRP to route_types.txt

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-05-17 00:27:08 +00:00
Russ White 1b072ce466
Merge pull request #4269 from donaldsharp/other_tables
zebra Other tables
2019-05-16 10:11:56 -04:00
Russ White cc25952f2a
Merge pull request #4327 from sworleys/Move-Multipath-Num
zebra: Move multipath_num into zrouter
2019-05-16 10:04:14 -04:00
Donald Sharp b3f2b59020 zebra: Move multipath_num into zrouter
The multipath_num variable is a property of zebra_router,
so move it there.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-14 14:15:18 -07:00
Mark Stapp 98124e2d6a
Merge pull request #4321 from sworleys/Ribsystem-Ribkernel
zebra: Make RIB_SYSTEM|KERNEL_ROUTE a property of rib.h
2019-05-14 09:29:08 -04:00
Donald Sharp 84340a15b4 zebra: Make RIB_SYSTEM|KERNEL_ROUTE a property of rib.h
These defines should be available from rib.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-13 13:11:49 -04:00
Donald Sharp 98572489ea zebra: Switch to using monotime(NULL) for re->uptime
The re->uptime usage of time(NULL) leaves it open to
timing changes from outside influence.  Switching
to monotime allows us to ensure that we have a timestamp
that is always increasing.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-11 01:44:42 -04:00
Renato Westphal 773fc72b1f
Merge pull request #4242 from donaldsharp/zebra_diet
Zebra diet
2019-05-10 08:29:59 -03:00
Donald Sharp d8612e6545 zebra: Track tables allocated by vrf and cleanup
For each table created by a vrf, keep track of it and
allow for proper cleanup on shutdown of that particular
table.  Cleanup client shutdown to only cleanup data
that the particular vrf owns.  Before we were cleaning
the same table 2 times.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-09 07:11:22 -04:00
Renato Westphal 95f092540e
Merge pull request #4256 from donaldsharp/zebra_table
doc, zebra: Remove "table X" command
2019-05-06 19:08:17 -03:00
Donald Sharp c447ad08b2 doc, zebra: Remove "table X" command
This command is broken and has been broken since the introduction
of vrf's.  Since no-one has complained it is safe to assume that
there is no call for this specialized linux command.  Remove
from the system with extreme prejudice.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-06 13:42:23 -04:00
Donald Sharp 8dc7a75918 zebra: Add some extra safety for route_info
The route_info[X].meta_q_map *must* be less than MQ_SIZE
or we will do some strange stuff, so assert on it at startup.

The distance in route_info is a uint8_t so let's keep the data
structure the same.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-03 05:05:19 -04:00
Donald Sharp 4bb55bbecc zebra: ifp must be a real pointer sometimes
The ifp pointer must be pointing at a real location
in memory since right above us in this loop we
return if it is.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-03 05:05:19 -04:00
Donald Sharp 045207e27c zebra: Use built in list handler for route entries now
The route entry code was using a custom linked list to handle
route entries.  Remove and replace with the new lib link list
code.  This reduces the size of the route entry by a further
8 bytes.

Observant people will notice that the current linked list
implementation is singly linked, while the Route Entry
is doubly linked.  I am not terribly concerned about this
change as that 1) we do not see a large number of route
entries per prefix( say 2 maybe 3 items ) and route entries
do not come and go that often.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-02 17:41:35 -04:00
Donald Sharp aa57abfbb5 zebra: Remove linked list and replace with new LIST
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node.  This was taking up
a bunch of memory.  Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:

Old:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  675 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  567 MiB
  Free small blocks:     39 MiB
  Free ordinary blocks:  69 MiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

New:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  574 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  536 MiB
  Free small blocks:     33 MiB
  Free ordinary blocks:  4600 KiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies.  This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-02 16:21:38 -04:00
Lou Berger e8b9ad5cdd
Revert "Zebra diet" 2019-05-02 06:54:59 -04:00
Donald Sharp 0a45d97472 zebra: Remove linked list and replace with new LIST
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node.  This was taking up
a bunch of memory.  Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:

Old:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  675 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  567 MiB
  Free small blocks:     39 MiB
  Free ordinary blocks:  69 MiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

New:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  574 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  536 MiB
  Free small blocks:     33 MiB
  Free ordinary blocks:  4600 KiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies.  This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-01 20:28:57 -04:00
Stephen Worley eaa2716dfb zebra: Check on startup route_info has all types
Add a function to check if the route_info array
has all types specified with data in it. Specifically,
test the 'key' attribute for non-zero data. Ignore
ZEBRA_ROUTE_SYSTEM as it should be zero key anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-01 15:32:18 -04:00
Mark Stapp 88351c8f6d
Merge pull request #4226 from sworleys/PBR-BFD-OF-route_info
zebra: Add PBR, BFD, OpenFabric to route_info
2019-05-01 11:22:54 -04:00
Lou Berger 31e944a8a7
Merge pull request #3045 from opensourcerouting/atoms
READY: lists/skiplists/rb-trees new API & sequence lock & atomic lists
2019-04-30 10:26:35 -04:00
Stephen Worley d6abd8b070 zebra: Comment to ensure types added to route_info
Add a comment to indicate that route types added to
Zebra, should also be present in the route_info array.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-30 10:07:45 -04:00
Stephen Worley eab7b6e371 zebra: Add OpenFabric to route_info array
Add OpenFabric to the route_info array for handling processing
of the OpenFabric route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:28:15 -04:00
Stephen Worley 42d96b73cb zebra: Add BFD to route_info array
Add BFD to the route_info array for handling processing
of the BFD route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:26:11 -04:00
Stephen Worley 9815665214 zebra: Add PBR to route_info array
Add PBR to the route_info array for handling processing
of the PBR route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:24:26 -04:00
Don Slice ade4a8868e zebra: resolve issue with protocol route-map not applied properly
Problem reported that route-maps applied to "ip protocol table bgp"
would not be invoked if the ip protocol table command was issued
after the bgp prefixes were installed.  Found that a recent change
improving how often nexthop_active_update runs missed causing this
filtering to be applied. This fix resolves that issue as well as
a couple of other places that were problematic with the recent
change.

Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-04-26 17:15:44 +00:00
Donald Sharp df38b099ee zebra: Update flag output for route entry dump
Update the nexthop flag output for the route entry dump to
include all possible flag states be output.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp 6883bf8d35 zebra: Run nexthop_active_check once
We currently run nexthop_active_check multiple times.  Make the
code run once and figure out state from that.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp 80ad04184f zebra: Double check is not necessary in nexthop_active_update
The nexthop_active_update command looks at each individual
nexthop and decides if it has changed.  If any nexthop
has changed we will set the re->status to ROUTE_ENTRY_CHANGED
and ROUTE_ENTRY_NEXTHOPS_CHANGED.

Additionally the test for old_nh_num != curr_active
makes no sense because suppose we have several events
we are processing at the same time and a total ecmp
of 16 but 14 are active at the start and 14 are active
at the end but different interfaces are up or down.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp dd50eeb115 lib, zebra: Remove unused flag
The NEXTHOP_FLAG_FILTERED went away when we started treating
static routes like every other route in the system.  This was
a special case for handling static route code that just didn't
get finished cleaning up.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp 99eabcec1a zebra: nexthop_active_update does not need set
We are effectively calling nexthop_active_update() on every
route entry being processed for installation at least 2 times.
This is a bit ridiculous.  We need to resolve the nexthops
when we know a route has changed in some manner, so do so.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Renato Westphal e412d3b8d9 lib: move zlog() prototype back to the public logging API
zlog() should be part of the public logging API as it's useful in
the cases where the logging priority isn't known at compile time
(i.e. it depends on a variable).

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-04-18 13:15:13 -03:00
David Lamparter 7e3a1ec742 lib: ZEBRA_NUM_OF -> array_size
The latter is widely used, e.g. in the Linux kernel.

Signed-off-by: David Lamparter <equinox@diac24.net>
2019-04-18 12:44:29 +02:00
Mark Stapp cf363e1bd8 zebra: dataplane notifications for system route changes
Add notifications from zebra to the dataplane subsystem when
kernel or connected routes change.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-04-10 16:07:01 -04:00
Mark Stapp f4c6e2a815 zebra: remove unused VRF_RIB_SCHEDULED flag
We don't use th vrf-level VRF_RIB_SCHEDULED flag any longer;
remove it and collapse the zebra_vrf flags' values.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-04-05 08:46:28 -04:00
Donald Sharp a1494c250c zebra: Modify lsp processing to be invoked as needed
LSP processing was a zvrf flag based upon a connected route
coming or going.  But this did not allow us to know
that we should do lsp processing other than after the meta-queue
processing was finished.

Eventually we moved meta-queue processing of do_nht_processing
to after the dataplane sent the main pthread some results.
This of course left us with a timing hole where if a connected
route came in and we received a data plane response *before*
the meta queue was processed we would not do the work as necessary.

Move the lsp processing to a flag off of the rib_dest_t. If it
is marked then we need to process lsps.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp 50872b0804 zebra: Add detailed debugging command for NHT tracking
Add a detailed debugging command for NHT tracking and add
the detailed output to the log about why we make some decisions
that we are.  I tried to model this like the rib processing
detailed debugs that we added a few months back.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp 699dae230d zebra: Modify NHT to occur when needed.
Currently nexthop tracking is performed for all nexthops that
are being tracked after a group of contexts are passed back
from the data plane for post install processing.

This is inefficient and leaves us sending nexthop tracking
changes at an accelerated pace, when we think we've changed
a route.  Additionally every route change will cause us
to relook at all nexthops we are tracking irrelevant if
they are possibly related to the route change or not.

Let's modify the code base to track the rnh's off of the rib
table's rn, `rib_dest_t`.  So after we process a node, install
it into the data plane, in rib_process_result we can
look at the `rib_dest_t` associated with the rn and see that
a nexthop depended on this route node.  If so, refigure it.

Additionally we will store rnh's that are not resolved on the
0.0.0.0/0 nexthop tracking list.  As such when a route node
changes we can quickly walk up the rib tree and notice that
it needs to be reprocessed as well.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp c86ba6c283 zebra: Add a base node for the zebra vrf tables
Add a default route_node for our routing tables.  This will allow us
to know that we can hang data off the default route for processing.

We will be hanging the nexthop tracking data structures off the rib_dest_t
so that we can know which nexthops we need to handle.  Effectively
nexthops that we are tracking that are unresolved will be stored on the
default route.  When something changes in the rib tree we can
work up the rn->parent pointer checking for nexthops we need to re-evaluate.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp 434434f704 zebra: Abstract the rib_dest_t creation
Abstract the creation of the rib_dest_t so that we can call it
from multiple places.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp 3cdba47a82 zebra: Modify code so that dplane is responsible for indicating success/fail of install
We have several route types KERNEL and CONNECT that are handled via special
case in the code.  This was causing a lot of work keeping the two different
classes of route types as special(SYSTEM OR NOT).  Put the dplane
in charge of the code that sets the bits for signalling route install/failure.

This greatly simplifies the code calling path and makes all route types
be handled exactly the same.  Additionaly code that we want to run
post data plane install can just work as per normal then, instead
of having to know we need to run it when we have a special type
of route.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com.
2019-03-27 16:19:28 -04:00
Donald Sharp 7a230a9d0c zebra: On route install/update failure correctly indicate in rib
When we get a route install failure from the kernel, actually
indicate in the rib the status of the routes.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp 9ef0c6ba87 zebra: Unset old_re as queued.
When switching routes from one route type to another actually
unset the old route as enqueued.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Quentin Young 73fb891892 Revert "Merge pull request #3982 from pacovn/Coverity_1479148_copy_paste"
This reverts commit 3a3704fe36, reversing
changes made to 5a3c6e736d.
2019-03-20 21:25:04 +00:00
F. Aragon 23fbacb455
zebra: copy-paste error (Coverity 1479148)
Signed-off-by: F. Aragon <paco@voltanet.io>
2019-03-20 16:45:32 +01:00
Donald Sharp b900245adc zebra: System routes sometimes can not be properly selected
System Routes if received over the netlink bus in a
specific pattern that causes an update operation for that
route in zebra can leave the dest->selected_fib pointer NULL,
while having the ZEBRA_FLAG_SELECTED flag set. Specifically
one way to achieve this is to do this:

`ip addr del 4.5.6.7/32 dev swp1 ; ip addr add 4.5.6.7/32 dev swp1 metric 9`

Why is this a big deal?
Because nexthop tracking is looking at ZEBRA_FLAG_SELECTED to
know if we can use a route, while nexthop active checking uses
dest->selected_fib.

So imagine we have bgp registering a nexthop. nexthop tracking in
the above case will be able to choose the 4.5.6.7/32 route
if that is what the nexthop is, due to the ZEBRA_FLAG_SELECTED being
properly set. BGP then allows the peers connection to come up and we
install routes with a 4.5.6.7 nexthop. The rib processing for route
installation will then look at the 4.5.6.7 route see no
dest->selected_fib and then start walking up the tree to resolve
the route. In our case we could easily hit the default route and be
unable to resolve the route. Which then becomes inactive in the
rib so we never attempt to install it.

This commit fixes this problem because when the rib_process decides
that we need to update the fib( ie replace old w/ new ), the
replacement with new was not setting the `dest->selected_fib` pointer
to the new route_entry, when the route was a system route.

Ticket: CM-24203
Signed-off-by: Donald Sharp <sharpd@cumulusnetworkscom>
2019-03-15 10:02:11 -04:00
David Lamparter d3b05897ed
Merge pull request #3869 from qlyoung/cocci-fixes
Assorted Coccinelle fixes
2019-03-06 15:54:44 +01:00
vivek 2b83602b24 *: Explicitly mark nexthop of EVPN-sourced routes as onlink
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface. Howver, in the model that
is supported in the implementation and commonly deployed, there is no
explicit Overlay IP address associated with the next hop in the tenant
VRF; the underlay IP is used if (since) the forwarding plane requires
a next hop IP. Therefore, the next hop has to be explicit flagged as
onlink to cause any next hop reachability checks in the forwarding plane
to be skipped.

https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.

Use existing mechanism to specify the nexthops as onlink when installing
these routes from bgpd to zebra and get rid of a special flag that was
introduced for EVPN-sourced routes. Also, use the onlink flag during next
hop validation in zebra and eliminate other special checks.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-27 12:54:24 +00:00
Quentin Young 9f2d035447 *: remove useless return variables
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-02-25 23:00:16 +00:00
Donald Sharp 5f27bcba2a zebra: Fix use after free in rib_process_result
Running zebra after commit 888756b208
in valgrind produces this item:

==17102== Invalid read of size 8
==17102==    at 0x44D84C: rib_dest_from_rnode (rib.h:375)
==17102==    by 0x4546ED: rib_process_result (zebra_rib.c:1904)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Address 0x83bd468 is 88 bytes inside a block of size 96 free'd
==17102==    at 0x4A35F54: free (vg_replace_malloc.c:530)
==17102==    by 0x4CCAC00: qfree (memory.c:129)
==17102==    by 0x4D03DC6: route_node_destroy (table.c:501)
==17102==    by 0x4D039EE: route_node_free (table.c:90)
==17102==    by 0x4D03971: route_node_delete (table.c:382)
==17102==    by 0x44D82A: route_unlock_node (table.h:256)
==17102==    by 0x454617: rib_process_result (zebra_rib.c:1882)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Block was alloc'd at
==17102==    at 0x4A36FF6: calloc (vg_replace_malloc.c:752)
==17102==    by 0x4CCAA2D: qcalloc (memory.c:110)
==17102==    by 0x4D03D88: route_node_create (table.c:489)
==17102==    by 0x4D0360F: route_node_new (table.c:65)
==17102==    by 0x4D034F8: route_node_set (table.c:74)
==17102==    by 0x4D03486: route_node_get (table.c:327)
==17102==    by 0x4CFB700: srcdest_rnode_get (srcdest_table.c:243)
==17102==    by 0x4545C1: rib_process_result (zebra_rib.c:1872)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==

This is happening because of this order of events:

1) Route is deleted in the main thread and scheduled for rib processing.
2) Rib garbage collection is run and we remove the route node since it
is no longer needed.
3) Data plane returns from the deletion in the kernel and we call
the srcdest_rnode_get function to get the prefix that was deleted.
This recreates a new route node.  This creates a route_node with
a lock count of 1, which we freed via the route_unlock_node call.
Then we continued to use the rn pointer.  Which leaves us with use
after frees.

The solution is, of course, to just move the unlock the node at the
end of the function if we have a route_node.

Fixes: #3854
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-23 20:03:48 -05:00
Mark Stapp 5c111895d6 zebra: unlock route-node in dplane results handler
Unlock the route-node struct we look up while processing
async dataplane results.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-21 16:15:14 -05:00
Mark Stapp 8263d1d0d9 zebra: use update semantics for routes consistently
Use 'update' semantics for route updates, to ensure that
netlink replace behavior works correctly.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-11 16:11:02 -05:00
David Lamparter b7777b57c4
Merge pull request #3722 from donaldsharp/static_recursive
Zebra fixes
2019-02-07 19:22:29 +01:00
Donald Sharp 4634d02cfd
Merge pull request #3684 from mjstapp/dplane_pw
zebra: async dataplane for pseudowires
2019-02-05 18:41:12 -05:00
Donald Sharp 6c47d39902 zebra: Fix multiple levels of static recursion
Allow the nexthop-check code to figure out recursive static routes
in a logical manner.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 15:21:26 -05:00
Donald Sharp 46a4e3455b zebra: NHT was being run at least 2 times and missreporting data
With the data plane changes that were made, we are now running
nexthop tracking 2 times.  Once at the end of meta-queue insertion
and once at the end of receiving a bunch of data from the dataplane.

The Addition of the data plane code caused flags to not be set
fully for the resolved routes( since we do not know the answer yet ),
This in turn caused the nexthop tracking run after the meta-queue
to think that the route was not `good`.  This would cause it to
tell all interested parties that there was no nexthop.

After the dataplane insertion we are also no running nht code.
This was re-figuring out the nexthop correctly and also
correctly reporting to interested parties that there was a path again.

Example:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:06:47
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:04:47
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:06:47
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:06:47
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:06:47
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:07:06
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:04
  *                  via 192.168.210.1, enp0s9, 00:00:04
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:07:06
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:07:06
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:07:06
donna.cumulusnetworks.com(config)#

Log files for sharp, which is watching 4.5.6.7:
2019/02/04 15:20:54.844288 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844820 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844836 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:20:54.844853 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

As you can see we have received an update with no nexthops( invalid route )
and a second update immediately after it with 2 nexthops.

What's the big deal you say?  Well we have code in other daemons that reacts
to not having a path for a nexthop.  In BGP this will cause us to tear
down the peer.  In staticd we'll remove the recursively resolved route.
In pim we'll remove all paths to the mroute.  This is not desirable.

The fix is to remove the meta-queue run of nexthop tracking.

While running after data plane notice of routes to handle is not ideal
we will be fixing this in the future with the nexthop group code, which
should know what nexthops are affected by a nexthop group change.

Fixed code debug code:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:46
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:46
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:46
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:46
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:59
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
  *                  via 192.168.210.1, enp0s9, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:59
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:59
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:59

2019/02/04 15:26:20.656395 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:20.656440 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688251 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:33.688322 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688329 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 09:17:02 -05:00
Donald Sharp 2561d12e5d zebra: Remove struct zebra_t
This structure is unused anymore and does not belong in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp ea45a4e7db zebra: Move the mq data structure to zrouter
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp 489a961429 zebra: Move ribq from zebrad to zrouter
The zrouter should own this data structure and it should not
be defined in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp b3d43ff471 zebra: Move rtm_table_default to zrouter
The zrouter should own this particular piece of data.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp 3801e7646c zebra: Move the master thread handler to the zrouter structure
The master thread handler is really part of the zrouter structure.
So let's move it over to that.  Eventually zserv.h will only be
used for zapi messages.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
David Lamparter 19b336d343
Merge pull request #3699 from donaldsharp/zebra_rib_debugs
Zebra Respect my authority
2019-01-31 01:51:52 +01:00
Donald Sharp b9f0e5ee24 zebra: On route update context is sometimes indeterminate in post-processing
When we get into rib_process_result and the operation we are handling
is DPLANE_OP_ROUTE_UPDATE *and* the route entry being looked at
is a route replace, we currently have no way to decode to the old_re
and the re due to how we have stored context.  As such they are the
same pointer.

As such the route replace for the same route type is causing the re
to set the installed flag and then immediately unset the installed
flag, leaving us in a state where the kernel has the route but
the rib thinks we are not installed.

Since the true old_re( the one being replaced by the update operation )
is going away( as that it zebra deletes the old one for us already )
this fix is not optimal but will get us moving forward.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-30 09:52:13 -05:00
Donald Sharp 058c16b7e2 zebra: Trust kernel and System routes
If we receive a valid message from the kernel that
is either a kernel or system route, we should trust
that the route is legit and just use it.

Old behavior:

K * 172.22.0.0/15 [0/0] via 172.22.2.254, eva_dummy1 inactive, 00:00:16

New Behavior:

K>* 172.22.0.0/15 [0/0] via 172.22.2.254, eva_dummy1, 00:02:35

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-29 21:45:02 -05:00
Donald Sharp 2da33d6b3a zebra: Convert route entry id number to string in debugs
The route entry being displayed in debugs was displaying
the originating route type as a number.  While numbers
are cool, I for one am not terribly interested in
memorizing them.  Modify the (type %d) to a (%s) to
just list the string type of the route.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-29 21:35:07 -05:00
Russ White 2538f1dad7
Merge pull request #3681 from donaldsharp/onlink
*: The onlink attribute should be owned by the nexthop not the route.
2019-01-29 10:09:44 -05:00
Donald Sharp fe85601c96 *: The onlink attribute should be owned by the nexthop not the route.
The onlink attribute was being passed from upper level protocols
as an attribute of the route *not* the individual nexthop.  When
we pass this data to the kernel, we treat the onlink as a attribute
of the nexthop.  This commit modifies the code base to allow
us to pass the ONLINK attribute as an attribute of the nexthop.

This commit also fixes static routes that have multiple nexthops
some onlink and some not.

ip route 4.5.6.7/32 192.168.41.1 eveth1 onlink
ip route 4.5.6.7/32 192.168.42.2

S>* 4.5.6.7/32 [1/0] via 192.168.41.1, eveth1 onlink, 00:03:04
  *                  via 192.168.42.2, eveth2, 00:03:04

sharpd@robot ~/frr2> sudo ip netns exec EVA ip route show
4.5.6.7 proto 196 metric 20
	nexthop via 192.168.41.1 dev eveth1 weight 1 onlink
	nexthop via 192.168.42.2 dev eveth2 weight 1

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-26 21:02:26 -05:00
Donald Sharp 60f98b236e zebra: Keep track of when routes are queued/dequeued from the dataplane
When we process the dataplane data, keep track of whether or not a route
is in transit or not.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-25 20:16:15 -05:00
Donald Sharp 677c1dd5cb zebra: Use ROUTE_ENTRY_INSTALLED as decision for route is installed
zebra is using NEXTHOP_FLAG_FIB as the basis of whether or not
a route_entry is installed.  This is problematic in that we plan
to separate out nexthop handling from route installation.  So modify
the code to keep track of whether or not a route_entry is installed/failed.

This basically means that every place we set/unset NEXTHOP_FLAG_FIB, we
actually also set/unset ROUTE_ENTRY_INSTALLED on the route_entry.
Additionally where we check for route installed via NEXTHOP_FLAG_FIB
switch over to checking if the route think's it is installed.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-25 20:16:15 -05:00
Mark Stapp 9bd9717bb2 zebra: add handler for pw install errors
Add handler for async error results from the dataplane for
pseudowire installation attempts.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-01-25 10:45:57 -05:00
Donald Sharp 313731ac92
Merge pull request #3630 from opensourcerouting/fix-show-import-check
zebra: fix the "show ip import-check" command
2019-01-22 20:10:56 -05:00
Mark Stapp d37f4d6c61 zebra: move LSP updates into dataplane subsystem
Start performing LSP updates through the async dataplane
subsystem. This is plumbed through for linux/netlink.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-01-22 13:56:48 -05:00
Renato Westphal 73bf60a06b zebra: consolidate how we indentify address-families in the NHT code
Favor usage of the afi_t enumeration to identify address-families
over using the classic AF_INET[6] constants for that. The choice to
use either of the two seems to be mostly arbitrary throughout our
code base, which leads to confusion and bugs like the one fixed by
commit 6f95d11a1. To address this problem, favor usage of the afi_t
enumeration whenever possible, since 1) it's an enumeration (helps
the compilers to catch some bugs), 2) has a safi_t sibling and 3)
can be used to index static arrays. AF_INET[6] should then be used
only when interfacing with the kernel or external libraries like
libc. The family2afi() and afi2family() functions can be used to
convert between the two different representations back and forth.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-01-21 13:26:36 -02:00
Donald Sharp 12e7fe3aa0 zebra: Add a switch statement for rib_process_after
Future commits are going to introduce more rigor in
state setting in the case of received results from
the data plane.  So let us move the DPLANE_OP_ROUTE_DELETE
state check to the same spot as the rest of the code that
is handling a particular operation.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-11 11:48:14 -05:00
Donald Sharp f52ed67796 zebra: Limit meta_queue insertion to one time.
Modify the meta_queue insertion such that we only enqueue
the route_node into one meta_queue instead of several.

Suppose we have multiple route_entries associated with
a particular node from rip, bgp, staticd.  If we receive a
route update from rip, we would enqueue the route_node into
the 1, 2, 3 meta-nodes.  Which means that we would run
the entire process of figuring out a route 3 times, while
nothing would change the second two times.

Modify the code to choose the lowest meta-queue and
install it into that one for processing.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-11 11:48:14 -05:00
Donald Sharp 9d5a82a5c2
Merge pull request #3526 from mjstapp/dplane_lists
zebra: pass lists of results from dataplane to zebra
2019-01-10 19:20:35 -05:00
Mark Stapp 4c206c8f74 zebra: pass lists of results from dataplane to zebra
Pass lists of results back to zebra from the dataplane subsystem
(and pthread). This helps reduce the lock/unlock cycles when
zebra is busy. Also remove a couple of typedefs that made their
way into the dataplane header file - those violate the FRR style
guidelines.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-01-10 13:24:13 -05:00
Donald Sharp 73547a754e zebra: Consolidate meta_queue_map into route_info
The route_info data structure already had a mapping of route type
to admin distance.  Consolidate the meta_queue_map information
into this route_info data structure.  This is to reduce the number
of places we need to remember to touch when adding a new routing
protocol.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-02 09:15:30 -05:00
Donald Sharp 85c3d6005e
Merge pull request #3464 from mjstapp/wq_event
libs,zebra: support timeout for workqueue retries, use for rib
2018-12-14 10:00:49 -05:00
Donald Sharp dba52387b7 zebra: On route removal failure return proper message
When a route removal failure happens return to the installing
protocol that the route deletion failed.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-13 20:00:33 -05:00
Mark Stapp 6dd7b84894 zebra: use a small retry timeout for the rib workqueue
In the zebra rib processing workqueue, set a small timeout
so that we will wait a short time if the queue into the
async dataplane is full. This helps avoid a situation where
the zebra main pthread constantly retries rib work without
giving the dataplane pthread a chance to make progress.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-12-13 14:15:27 -05:00
Russ White f4aaa03907
Merge pull request #3477 from donaldsharp/multipath_respect
zebra: Allow zebra to only mark up to multipath_num nexthops as ACTIVE
2018-12-13 10:41:26 -05:00
Russ White eefe8ab766
Merge pull request #3467 from donaldsharp/kernel_socket_cleanup
Kernel socket cleanup
2018-12-13 10:32:09 -05:00
Donald Sharp 220f0f4245 zebra: Allow zebra to only mark up to multipath_num nexthops as ACTIVE
NEXTHOP_FLAG_ACTIVE currently means that the nexthop is considered
good enough to be installed. With current ecmp restrictions this
translation from multipath_num is enforced in the data plane.
The problem with this is of course that every data plane now
becomes concerned about the multipath num and must enforce it
independently.  Currently *bsd does not honor multipath_num at
all and linux marks all nexthops as being installed even when
it honors a multipath_num that is less than the total.

This code change moves the multipath_num enforcement from a dataplane
decision to a zebra nexthop decision.  Thus dataplanes now can
just install those nexthops marked as NEXTHOP_FLAG_ACTIVE
without having to worry about multipath_num.

*BSD will now respect multipath_num and Linux now properly notes
which routes are actually installed or not:

sharpd@donna ~/f/t/topotests> ps -ef | grep frr
frr       6261  1556  0 09:12 ?        00:00:00 /usr/lib/frr/zebra -e 2 --daemon -A 127.0.0.1
frr       6279  1556  0 09:12 ?        00:00:00 /usr/lib/frr/staticd --daemon -A 127.0.0.1

donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

K>* 0.0.0.0/0 [0/106] via 10.0.2.2, enp0s3, 00:00:45
S>* 4.4.4.4/32 [1/0] via 10.0.2.1, enp0s3, 00:00:02
  *                  via 192.168.209.1, enp0s8, 00:00:02
                     via 192.168.210.1, enp0s9 inactive, 00:00:02
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:00:45
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:45
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:45
donna.cumulusnetworks.com(config)#

sharpd@donna ~/f/t/topotests> ip route show
default via 10.0.2.2 dev enp0s3 proto dhcp metric 106
4.4.4.4 proto 196 metric 20
	nexthop via 10.0.2.1 dev enp0s3 weight 1
	nexthop via 192.168.209.1 dev enp0s8 weight 1
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 106
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
192.168.209.0/24 dev enp0s8 proto kernel scope link src 192.168.209.2 metric 105
192.168.210.0/24 dev enp0s9 proto kernel scope link src 192.168.210.2 metric 103
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-13 09:21:26 -05:00
Donald Sharp c626d369fd zebra: Remove rib_lookup_ipv4_route
The rib_lookup_ipv4_route function is only used in a debug path.
Is only used for v4 and only checks to make sure that the rib
and fib are in sync( which is not needed/used/supported on other
platforms ).  So let's just remove it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-12 11:54:12 -05:00
Donald Sharp 0dddbf72ec zebra: Convert nexthop_active functions to use bool
The set value was only being used as a bool, formalize this
in the call chain.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-10 10:21:41 -05:00
Mark Stapp 68b375e059 zebra: revise dplane dequeue api
Change the dataplane context dequeue api used by zebra to make the
purpose a bit clearer.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-11-21 10:38:08 -05:00
Mark Stapp c831033fff zebra: dataplane provider enhancements
Limit the number of updates processed from the incoming queue;
add more stats. Fill out apis for dataplane providers; convert
route update processing to provider model; move dataplane
status enum

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-11-21 10:37:54 -05:00
Donald Sharp effcfaeb3d zebra: Carry onlink if set from resolving nexthop
When resolving a nexthop, carry the onlink flag if it
is set to the new nexthop.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-11-07 11:28:01 -05:00
Mark Stapp 258c07e4e2 zebra: only uninstall once, when closing rib table
When the rib code is informed that a table is closing/
going away, only try once to uninstall associated routes from
the fib/dataplane. The close path can be called multiple times
in some cases - zebra shutdown, e.g.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-30 09:41:55 -04:00
Donald Sharp 69c19e1def
Merge pull request #2946 from mjstapp/dplane_2
Zebra: async dataplane, phase 1
2018-10-28 16:10:45 -04:00
Mark Stapp 8b962e7759 zebra: rebase dataplane, align with master
Rebase and pick up dataplane changes on master, including
renamed structs and enums.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp 91f1681258 zebra: limit queued route updates
Impose a configurable limit on the number of route updates
that can be queued towards the dataplane subsystem.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp 2577906481 zebra: revise struct names to resolve review comments
Use standard type naming and remove use of typedef to resolve
some review comments.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp 14c8b173d2 zebra: remove old apis after new dplane work
Replaced or out-grew a few zebra internal apis during async
dataplane work; removing them.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp f183e380fa zebra: add handy res2str utility
Add a 2str utility for dplane result codes; use it in
a debug or two.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp 1bcea841b1 zebra: netlink fuzzing path correction
Correct use of netlink_parse_info() in the netlink fuzzing path.
Also clarify a couple of comments about pthreads.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp fe2c53d4ea zebra: Fix style issues
Clean up a couple of checkstyle reports in the dataplane
commit.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp 5af4b34689 zebra: ensure redist of system routes
We need a bit of special handling for system routes, which need
to be offered for redistribution even though they won't be
passing through the dplane system.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp 5709131cec zebra: resolve style issues in dplane commit
Resolve (most) style issues in the initial zebra dataplane
commit branch.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp 8cb41cd624 zebra: set SELECTED flag in rib_process
Set SELECTED re immediately in rib_process, without expecting
that fib install has completed. Remove premature redistribute
call also.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp 97f5b44182 zebra: use async dplane route updates
Enqueue updates to the dplane system; add a couple of stats.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp e5ac2adf17 zebra: wip: early version of dplane result handler
Early try at a result handler for async dplane route updates

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp 7cdb1a8445 zebra: start dataplane layer work
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.

Slide netlink_talk_info into place to remove dependency on core
zebra structs; add accessors for dplane context block

Start init of route context from zebra core re and rn structs;
start queueing and event handling for incoming route updates.

Expose netlink apis that don't rely on zebra core structs;
add parallel route-update code path using the dplane ctx;
simplest possible event loop to process queued route'
updates.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00