When "ip ospf network point-to-multipoint delayed-reflooding" is configured,
LSAs received on an OSPF P2MP network are not reflooded. Since LSA reflooding
would normally serve as an implied LSA acknowledgment, an explicit OSPF ack
should be sent to avoid retransmission by the neighbor which orginally flooded
the LSA on the P2MP network.
Signed-off-by: Acee Lindem <acee@lindem.com>
1. On P2MP interfaces, direct ack would include the same LSA multiple times
multiple packets are processed before the OSPF interfae direct LSA
acknowledgment event is processed. Now duplicates LSA in the same event
are suppressed.
2. On non-broadcast interfaces, direct acks for multiple neighbors would be
unicast to the same neighbor due to the multiple OSPF LS Update packets
being process prior to the OSPF interface direct ack event. Now, separate
direct acks are unicast to the neighbors requiring them.
3. The interface delayed acknowledgment timer runs would run continously
(every second as long as the interace is up). Now, the timer is set
when delayed acknowledgments are queued and all queued delayed
acknowledges are sent when it fires.
4. For non-broadcast interface delayed acknowledgments, the logic to send
to multiple neighbors wasn't working because the list was emptied while
building the packet for the first neighbor.
Signed-off-by: Acee Lindem <acee@lindem.com>
The current OSPF neighbor retransmission operates on a single per-neighbor
periodic timer that sends all LSAs on the list when it expires.
Additionally, since it skips the first retransmission of received LSAs so
that at least the retransmission interval (resulting in a delay of between
the retransmission interval and twice the interval. In environments where
the links are lossy on P2MP networks with "delay-reflood" configured (which
relies on neighbor retransmission in partial meshs), the implementation
is sub-optimal (to say the least).
This commit reimplements OSPF neighbor retransmission as follows:
1. A new data structure making use the application managed
typesafe.h doubly linked list implements an OSPF LSA
list where each node includes a timestamp.
2. The existing neighbor LS retransmission LSDB data structure
is augmented with a pointer to the list node on the LSA
list to faciliate O(1) removal when the LSA is acknowledged.
3. The neighbor LS retransmission timer is set to the expiration
timer of the LSA at the top of the list.
4. When the timer expires, LSAs are retransmitted that within
the window of the current time and a small delta (50 milli-secs
default). The LSAs that are retransmited are given an updated
retransmission time and moved to the end of the LSA list.
5. Configuration is added to set the "retransmission-window" to a
value other than 50 milliseconds.
6. Neighbor and interface LSA retransmission counters are added
to provide insight into the lossiness of the links. However,
these will increment quickly on non-fully meshed P2MP networks
with "delay-reflood" configured.
7. Added a topotest to exercise the implementation on a non-fully
meshed P2MP network with "delay-reflood" configured. The
alternative was to use existing mechanisms to instroduce loss
but these seem less determistic in a topotest.
Signed-off-by: Acee Lindem <acee@lindem.com>
This extends non-broadcast support to point-to-multipoint networks.
Neighbors will be explicitly configured and polled in lieu of multicast
dicovery. Toptotests and documentation updates are included.
Additionally, the ospf neighbor commands have been greatly simplified taking
advantage of DEFPY() capabilities.
The AllOSPFRouters (224.0.0.5) is still joined for non-broadcast networks
since it is joined for NBMA networks. It seems this could be removed but
it should done be in a separate commit.
Signed-off-by: Acee Lindem <acee@lindem.com>
Also:
- replace all /* fallthrough */ comments with portable fallthrough;
pseudo keyword to accomodate both gcc and clang
- add missing break; statements as required by older versions of gcc
- cleanup some code to remove unnecessary fallthrough
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add support for "[no] ip ospf capbility opaque" at the interface
level with the default being capability opaque enabled. The command
"no ip ospf capability opaque" will disable opaque LSA database
exchange and flooding on the interface. A change in configuration
will result in the interface being flapped to update our options
for neighbors but no attempt will be made to purge existing LSAs
as in dense topologies, these may received by neighbors through
different interfaces.
Topotests are added to test both the configuration and the LSA
opaque flooding suppression.
Signed-off-by: Acee <aceelindem@gmail.com>
When running all daemons with config for most of them, FRR has
sharpd@janelle:~/frr$ vtysh -c "show debug hashtable" | grep "VRF BIT HASH" | wc -l
3570
3570 hashes for bitmaps associated with the vrf. This is a very
large number of hashes. Let's do two things:
a) Reduce the created size of the actually created hashes to 2
instead of 32.
b) Delay generation of the hash *until* a set operation happens.
As that no hash directly implies a unset value if/when checked.
This reduces the number of hashes to 61 in my setup for normal
operation.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, delayed reflooding on P2MP interfaces for LSAs received
from neighbors on the interface is unconditionally (see commit
c706f0e32b). In some cases, this
change wasn't desirable and this feature makes delayed reflooding
configurable for P2MP interfaces via the CLI command:
"ip ospf network point-to-multipoint delay-reflood" in interface
submode.
Signed-off-by: Acee <aceelindem@gmail.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Description:
Code changes involves.
1. Count the no.of router LSAs received with DC options bit set,
supporting do not age(DNA).
2. If no of router LSAs received with DC bit set is equal to total
no of LSAs in the router lsdb, then all the routers in the
area support do not age processing.
3. Flood the self originated LSAs with DNA flag if all routers in the area
supports the feature.
4. Stop aging of the LSAs recived with DO_NOT_AGE bit set from
other routers.
5. Self originated DO_NOT_AGE lsas will still be aging in their own
database.
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
FRR implements a non-standard, but compatible approach for
sending update LSAs (it always send to 224.0.0.5) on P2MP
interfaces. This change makes it so acks are also sent to
224.0.0.5.
Since the acks are multicast, this allows an optimization
where we don't send back out the incoming P2MP interface
immediately allow time to rx multicast ack from neighbors
on the same net that rx'ed the original (multicast) update.
Signed-off-by: Lou Berger <lberger@labn.net>
Fix CI Failure test_ospf_type5_summary_tc45_p0
Problem Statement:
==================
Summarised LSA is not flushed in OSPFv2 in below scenario:
1. Configure summary-address in ospfv2
2. redistribute static and connected.
3. Check the LSAs are received on neighbor.
4. Now remove all OSPFv2 configs, so neighbor will still have the summarised LSA.
5. Configure router ospf with redistribute static and connected.
6. Check the DB, summarised LSA is present although the configuration is not present.
7. Now configure the summary-address and remove the configuration after sometime.
8. The summarised LSA will be still present.
RCA:
==================
When self originated LSA is received from the neighbor and that
LSA is summarised one, the LSA is refreshed but a flag is not set
due to which it was not able to remove it later.
Fix:
==================
Set the originated flag when refreshing summarised LSA.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Move `is_default_prefix` variations to `lib/prefix.h` and make the code
use the library version instead of implementing it again.
NOTE
----
The function was split into per family versions to cover all types.
Using `union prefixconstptr` is not possible due to static analyzer
warnings which cause CI to fail.
The specific cases that would cause this failure were:
- Caller used `struct prefix_ipv4` and called the generic function.
- `is_default_prefix` with signature using `const struct prefix *` or
`union prefixconstptr`.
The compiler would complain about reading bytes outside of the memory
bounds even though it did not take into account the `prefix->family`
part.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
RFC 3623 specifies the Graceful Restart enhancement to the OSPF
routing protocol. This PR implements support for the restarting mode,
whereas the helper mode was implemented by #6811.
This work is based on #6782, which implemented the pre-restart part
and settled the foundations for the post-restart part (behavioral
changes, GR exit conditions, and on-exit actions).
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ospf`
EXEC-level command needs to be issued before restarting the ospfd
daemon (there's no specific requirement on how the daemon should
be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospfd is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Remove previous log config
debug ospf graceful-restart helper
and just use
debug ospf graceful-restart
for everything related to OSPF GR.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>
Log the LSA advertising router in addition to the LSA type and
ID in the places where that information is necessary to uniquely
identify the LSA in the LSDB.
This is useful, for example, to know exactly which LSA has changed
when the router is exiting from the GR helper mode when a topology
change was detected.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Replaces some hard-coded function names with __func__,
adds some additional references to neighbor/interface,
and cleans up some debug strings to be more readable.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Description:
Apis for creating/deleting aggregate routes.
Origination of summary route on behalf of matched external routes.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
We test nbr->oi in a couple of places for null, but
in the majority of places of the nbr->oi data is being
used we just access it. Touch up code to trust this
assertion and make the code more consistent in others.
Found in Coverity.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit has:
The received packet path in ospf, had absolutely no debugs associated with
it. This makes it extremely hard to know when we receive packets for
consumption. Add some breadcrumbs to this end.
Large chunks of commands have no ability to debug what is happening
in what vrf. With ip overlap X vrf this becomes a bit of a problem
Add some breadcrumbs here.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We have a bunch of places that look for ORIGINAL_CODING. There is
nothing in our configure system to define this value and a quick
git blame shows this code as being original to the import a very
very long time ago. This is dead code, removing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
ospf->external[DEFAULT_ROUTE] and zclient->default_information don't
line up with each other; the former is only used for "originate always".
Fixes: #4237
Signed-off-by: David Lamparter <equinox@diac24.net>
This reverts commit a6b4e1fded.
This fix is wrong too. The zclient->redist & ->mi_redist arrays are
accessed past their size for any external route that is not 0.0.0.0/0.
Also, it is incorrect to check default_information for DEFAULT_ROUTE
since that's "originate always".
Signed-off-by: David Lamparter <equinox@diac24.net>
Some daemons like ospfd and isisd have the ability to advertise a
default route to their peers only if one exists in the RIB. This
is what the "default-information originate" commands do when used
without the "always" parameter.
For that to work, these daemons use the ZEBRA_REDISTRIBUTE_DEFAULT_ADD
message to request default route information to zebra. The problem
is that this message didn't have an AFI parameter, so a default route
from any address-family would satisfy the requests from both daemons
(e.g. ::/0 would trigger ospfd to advertise a default route to its
peers, and 0.0.0.0/0 would trigger isisd to advertise a default route
to its IPv6 peers).
Fix this by adding an AFI parameter to the
ZEBRA_REDISTRIBUTE_DEFAULT_{ADD,DELETE} messages and making the
corresponding code changes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Default route type is not considered while processing lsa
refresh timer expiry which intern makes it flushed from lsdb.
Signed-off-by: rgirada <rgirada@vmware.com>
In all but one instance we were following this pattern
with ospf_lsa_new:
ospf_lsa_new()
ospf_lsa_data_new()
so let's create a ospf_lsa_new_and_data to abstract
this bit of fun and cleanup all the places where
it assumes these function calls can fail.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>