OSPF intra/inter area routes were previously marked to assure they
are re-installed after a fast link flap in the commit:
commit effee18744
Author: Donald Sharp <sharpd@nvidia.com>
Date: Mon May 24 13:45:29 2021 -0400
ospfd: Fix quick interface down up event handling in ospf
This commit extends this fix to OSPF AS External routes as well.
Signed-off-by: Acee <aceelindem@gmail.com>
In practical terms, unplanned GR refers to the act of recovering
from a software crash without affecting the forwarding plane.
Unplanned GR and Planned GR work virtually the same, except for the
following difference: on planned GR, the router sends the Grace-LSAs
*before* restarting, whereas in unplanned GR the router sends the
Grace-LSAs immediately *after* restarting.
For unplanned GR to work, ospf6d was modified to send a
ZEBRA_CLIENT_GR_CAPABILITIES message to zebra as soon as GR is
enabled. This causes zebra to freeze the OSPF routes in the RIB as
soon as the ospfd daemon dies, for as long as the configured grace
period (the defaults is 120 seconds). Similarly, ospfd now stores in
non-volatile memory that GR is enabled as soon as GR is configured.
Those two things are no longer done during the GR preparation phase,
which only happens for planned GRs.
Unplanned GR will only take effect when the daemon is killed
abruptly (e.g. SIGSEGV, SIGKILL), otherwise all OSPF routes will
be uninstalled while ospfd is exiting. Once ospfd starts, it will
check whether GR is enabled and enter in the GR mode if necessary,
sending Grace-LSAs out all operational interfaces.
One disadvantage of unplanned GR is that the neighboring routers
might time out their corresponding adjacencies if ospfd takes too
long to come back up. This is especially the case when short dead
intervals are used (or BFD). For this and other reasons, planned
GR should be preferred whenever possible.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When using route maps with external routes in OSPF as follows:
```
set metric +10
```
The current behavior is to use the default ospf metric as the base and then add
to 10 to it. The behavior isn't useful as-is. A value 30 (20 dfeault + 10) can
be set directly instead. the behavior is also not consistent with bgp. bgp does
use the rib metric in this case as the base. The current behavior also doesn't
allow the metric to accumulate when crossing different routing domains such as
vrfs causing the metric to reset every time the route enters a new vrf with a new
ospf network.
This PR changes the behavior such that the rib metric is used as a base for
ospf exteral routes when used with `set metric -/+`
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Description:
When changing the area from normal to NSSA, previous area's
ASBR router's type-5 also calculated and added to routing table along
with Type-7 lsas.
Made a change in route calculation such that it will not consider Type-5
lsas in calculation if it is originated from NSSA ASBR router.
These lsas will be age out at MAX age.
log:
frr(config-router)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/0] via 10.112.157.253, ens160, 00:32:47
C>* 10.112.156.0/23 is directly connected, ens160, 00:32:47
S>* 22.22.22.2/32 [1/0] is directly connected, ens192, weight 1, 00:20:03
O>* 33.33.33.0/24 [110/20] via 100.1.1.220, ens192, weight 1, 00:08:55
via 100.1.1.220, ens192, weight 1, 00:08:55
O 100.1.1.0/24 [110/10] is directly connected, ens192, weight 1, 00:21:32
C>* 100.1.1.0/24 is directly connected, ens192, 00:23:11
frr(config-router)# do show ip ospf route
============ OSPF network routing table ============
N 100.1.1.0/24 [10] area: 0.0.0.1
directly attached to ens192
============ OSPF router routing table =============
R 2.2.2.2 [10] area: 0.0.0.1, ASBR
via 100.1.1.220, ens192
============ OSPF external routing table ===========
N E2 33.33.33.0/24 [10/20] tag: 0
via 100.1.1.220, ens192
via 100.1.1.220, ens192
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
RFC 3623 specifies the Graceful Restart enhancement to the OSPF
routing protocol. This PR implements support for the restarting mode,
whereas the helper mode was implemented by #6811.
This work is based on #6782, which implemented the pre-restart part
and settled the foundations for the post-restart part (behavioral
changes, GR exit conditions, and on-exit actions).
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ospf`
EXEC-level command needs to be issued before restarting the ospfd
daemon (there's no specific requirement on how the daemon should
be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospfd is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The #if 0 code in ospfd, has not been compiled since at least
2012. If we are at least 9 years old at this point with no effort
to use or save, we should just get rid of it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Topology diagram:
-------------------------
+---+ A0 +---+
+R1 +------------+R2 |
+-+-+- +--++
| -- -- |
| -- A0 -- |
A0| ---- |
| ---- | A0
| -- -- |
| -- -- |
+-+-+- +-+-+
+R0 +-------------+R3 |
+---+ A0 +---+
Steps to reproduce:
--------------------------
1. Bring up the base config as per the topology
2. Configure OSPF on all the routers of the topology.
3. Configure 5 static routes from the same network on R0 , 5 static routes from different networks and redistribute in R0
4. Configure External Route summary in R0 to summarise 5 routes to one route.
5. Delete the configured summary
6. configure the summary again and delete static routes .
7. Add back static routes.
8. Configure new static route which is matching the configured summary.
9. Delete one of the static route.
10. Configure redistribute connected and configure ospf external summary address to summarise the connected routes.
11. Clear ospf process and check for any errors.
[New LWP 2346]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/ospfd'.
Program terminated with signal SIGABRT, Aborted.
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
0 0x00007f296f278428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
1 0x00007f296f27a02a in __GI_abort () at abort.c:89
2 0x00007f296fca4110 in core_handler (signo=11, siginfo=0x7ffcd52044f0, context=<optimized out>) at lib/sigevent.c:254
3 <signal handler called>
4 0x000055949b9dfdff in ospf_lsdb_lookup (lsdb=lsdb@entry=0x55949bfd3688, lsa=lsa@entry=0x55949bfe1290) at ospfd/ospf_lsdb.c:179
5 0x000055949ba28fbe in ospf_ls_retransmit_lookup (lsa=0x55949bfe1290, nbr=0x55949bfd3610) at ospfd/ospf_flood.c:918
6 ospf_ls_retransmit_delete_nbr_if (oi=oi@entry=0x55949bfd2590, lsa=lsa@entry=0x55949bfe1290) at ospfd/ospf_flood.c:932
7 0x000055949ba2916b in ospf_ls_retransmit_delete_nbr_if (lsa=0x55949bfe1290, oi=0x55949bfd2590) at ospfd/ospf_flood.c:928
8 ospf_ls_retransmit_delete_nbr_as (ospf=ospf@entry=0x55949bfbdb30, lsa=lsa@entry=0x55949bfe1290) at ospfd/ospf_flood.c:959
9 0x000055949b9dcd7e in ospf_discard_from_db (ospf=ospf@entry=0x55949bfbdb30, lsdb=<optimized out>, lsa=lsa@entry=0x55949bfe1630) at ospfd/ospf_lsa.c:2552
10 0x000055949b9df1b3 in ospf_maxage_lsa_remover (thread=0x55949bfde930) at ospfd/ospf_lsa.c:2848
11 0x00007f296fcb1770 in thread_call (thread=0x55949bfde930) at lib/thread.c:1557
12 0x00007f296fcb19d6 in funcname_thread_execute (m=0x55949be0a630, func=func@entry=0x55949b9df0a0 <ospf_maxage_lsa_remover>, arg=arg@entry=0x55949bfbdb30, val=val@entry=0,
funcname=funcname@entry=0x55949ba31b41 "ospf_maxage_lsa_remover", schedfrom=schedfrom@entry=0x55949ba31917 "ospfd/ospf_lsa.c", fromln=3364) at lib/thread.c:1628
13 0x000055949b9de90b in ospf_flush_self_originated_lsas_now (ospf=ospf@entry=0x55949bfbdb30) at ospfd/ospf_lsa.c:3364
14 0x000055949ba19a55 in ospf_process_refresh_data (ospf=0x55949bfbdb30, reset=reset@entry=true) at ospfd/ospfd.c:138
15 0x000055949ba1aeef in ospf_process_reset (ospf=<optimized out>) at ospfd/ospfd.c:206
16 0x000055949ba0729b in clear_ip_ospf_process_magic (self=<optimized out>, vty=<optimized out>, argc=<optimized out>, argv=<optimized out>, instance_str=<optimized out>,
instance=<optimized out>) at ospfd/ospf_vty.c:11930
17 clear_ip_ospf_process (self=<optimized out>, vty=0x55949bfe2600, argc=<optimized out>, argv=<optimized out>) at ./ospfd/ospf_vty_clippy.c:306
18 0x00007f296fc66523 in cmd_execute_command_real (vline=vline@entry=0x55949bfd2fb0, vty=vty@entry=0x55949bfe2600, cmd=cmd@entry=0x0, filter=FILTER_RELAXED) at lib/command.c:1060
19 0x00007f296fc6869a in cmd_execute_command (vline=vline@entry=0x55949bfd2fb0, vty=vty@entry=0x55949bfe2600, cmd=0x0, vtysh=vtysh@entry=0) at lib/command.c:1119
20 0x00007f296fc68817 in cmd_execute (vty=vty@entry=0x55949bfe2600, cmd=cmd@entry=0x55949bfe7f80 "clear ip ospf pro", matched=matched@entry=0x0, vtysh=vtysh@entry=0)
at lib/command.c:1275
21 0x00007f296fcb6c13 in vty_command (vty=vty@entry=0x55949bfe2600, buf=0x55949bfe7f80 "clear ip ospf pro") at lib/vty.c:514
22 0x00007f296fcb6ea6 in vty_execute (vty=vty@entry=0x55949bfe2600) at lib/vty.c:1281
23 0x00007f296fcb97f4 in vtysh_read (thread=<optimized out>) at lib/vty.c:2123
24 0x00007f296fcb1770 in thread_call (thread=thread@entry=0x7ffcd5209590) at lib/thread.c:1557
25 0x00007f296fc82e78 in frr_run (master=0x55949be0a630) at lib/libfrr.c:1026
26 0x000055949b9d0f07 in main (argc=1, argv=0x7ffcd52098b8) at ospfd/ospf_main.c:230
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Implement the below 2 CLIs to clear the current data in the process
and neighbor data structure.
1. clear ip ospf process
2. clear ip ospf neighbor
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Line break at the end of the message is implicit for zlog_* and flog_*,
don't put it in the string. Mid-message line breaks are currently
unsupported. (LF is "end of message" in syslog.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The problem is seen where speed mismatch caused ECMP route
not being reflected with correct number paths (NHs).
During cold boot, some interface speed updated by zebra as
part of one shot timer and triggers interface add to clients.
In this case, ospf already have created interface (bond interface),
but speed was not updated, trigger to do interface speed change
as part of interface add, which will trigger all Router LSA to
use updated speed into cost calculation.
Ticket:CM-22170
Testing Done:
Bring up CLOS config with Spine and leafs. Leaf have CLAG pair,
with same VRR ip address.
At spine one of the bond connecting to leaf node was having
higher speed than the paired device, With this fix, at spine (DUT)
bond interface speed is equal from all peer nodes.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Inform the .clang-format file about LSDB_LOOP and
put the proper indentation for this loop into the
code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
list_free is occassionally being used to delete the
list and accidently not deleting all the nodes.
We keep running across this usage pattern. Let's
remove the temptation and only allow list_delete
to handle list deletion.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Convert the list_delete(struct list *) function to use
struct list **. This is to allow the list pointer to be nulled.
I keep running into uses of this list_delete function where we
forget to set the returned pointer to NULL and attempt to use
it and then experience a crash, usually after the developer
has long since left the building.
Let's make the api explicit in it setting the list pointer
to null.
Cynical Prediction: This code will expose a attempt
to use the NULL'ed list pointer in some obscure bit
of code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The way thread.c is written, a caller who wishes to be able to cancel a
thread or avoid scheduling it twice must keep a reference to the thread.
Typically this is done with a long lived pointer whose value is checked
for null in order to know if the thread is currently scheduled. The
check-and-schedule idiom is so common that several wrapper macros in
thread.h existed solely to provide it.
This patch removes those macros and adds a new parameter to all
thread_add_* functions which is a pointer to the struct thread * to
store the result of a scheduling call. If the value passed is non-null,
the thread will only be scheduled if the value is null. This helps with
consistency.
A Coccinelle spatch has been used to transform code of the form:
if (t == NULL)
t = thread_add_* (...)
to the form
thread_add_* (..., &t)
The THREAD_ON macros have also been transformed to the underlying
thread.c calls.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Since we can't assume time_t to be long, int, or even long long, this
consistently uses %lld/long long (or %llu/unsigned long long in a few
cases) to print time_t/susecond_t values. This should fix a bunch of
warnings, on NetBSD in particular.
(Unfortunately, there seems to be no "PRId64" style printing macro for
time_t...)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit ef008d2f8dc8f7160d8a3d24a15f2fad79ef3242)
Compute and display SPF execution statistics
Detailed SPF statistics, all around time spent executing various pieces of SPF
such as the SPF algorithm itself, installing routes, pruning unreachable networks
etc.
Reason codes for firing up SPF are:
R - Router LSA, N - Network LSA, S - Summary LSA, ABR - ABR status change,
ASBR - ASBR Status Change, AS - ASBR Summary, M - MaxAge
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Reviewed-by: JR Rivers <jrrivers@cumulusnetworks.com>
Reviewed-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Reviewed-by: Ayan Banerjee <ayan@cumulusnetworks.com>
* ospf_ase.c: (ospf_ase_calculate_route) Fix compiler warning about eval
needing brackets.
(various) add defensive asserts.
* ospf_lsdb.c: (ospf_lsdb_add) add missing node unlock if same lsa already
was indexed.
(ospf_lsdb_delete) check it's actually the same as specified lsa before
deleting
(ospf_lsdb_lookup_by_id_next) fix another corner case - no result =>
don't go on.
* global: In struct ospf_path, change struct ospf_interface *oi to int
ifindex. It is unsafe to reference *oi as an ospf interface can be
deleted under your feet. Use a weak reference instead.