When an ospf interface is not in the backbone area, but it receives a
packet from the backbone, no logs are generated for this mismatch.
However, the opposite scenario does generate logs.
Add a log for this case.
Signed-off-by: Loïc Sang <loic.sang@6wind.com>
1. On P2MP interfaces, direct ack would include the same LSA multiple times
multiple packets are processed before the OSPF interfae direct LSA
acknowledgment event is processed. Now duplicates LSA in the same event
are suppressed.
2. On non-broadcast interfaces, direct acks for multiple neighbors would be
unicast to the same neighbor due to the multiple OSPF LS Update packets
being process prior to the OSPF interface direct ack event. Now, separate
direct acks are unicast to the neighbors requiring them.
3. The interface delayed acknowledgment timer runs would run continously
(every second as long as the interace is up). Now, the timer is set
when delayed acknowledgments are queued and all queued delayed
acknowledges are sent when it fires.
4. For non-broadcast interface delayed acknowledgments, the logic to send
to multiple neighbors wasn't working because the list was emptied while
building the packet for the first neighbor.
Signed-off-by: Acee Lindem <acee@lindem.com>
The current OSPF neighbor retransmission operates on a single per-neighbor
periodic timer that sends all LSAs on the list when it expires.
Additionally, since it skips the first retransmission of received LSAs so
that at least the retransmission interval (resulting in a delay of between
the retransmission interval and twice the interval. In environments where
the links are lossy on P2MP networks with "delay-reflood" configured (which
relies on neighbor retransmission in partial meshs), the implementation
is sub-optimal (to say the least).
This commit reimplements OSPF neighbor retransmission as follows:
1. A new data structure making use the application managed
typesafe.h doubly linked list implements an OSPF LSA
list where each node includes a timestamp.
2. The existing neighbor LS retransmission LSDB data structure
is augmented with a pointer to the list node on the LSA
list to faciliate O(1) removal when the LSA is acknowledged.
3. The neighbor LS retransmission timer is set to the expiration
timer of the LSA at the top of the list.
4. When the timer expires, LSAs are retransmitted that within
the window of the current time and a small delta (50 milli-secs
default). The LSAs that are retransmited are given an updated
retransmission time and moved to the end of the LSA list.
5. Configuration is added to set the "retransmission-window" to a
value other than 50 milliseconds.
6. Neighbor and interface LSA retransmission counters are added
to provide insight into the lossiness of the links. However,
these will increment quickly on non-fully meshed P2MP networks
with "delay-reflood" configured.
7. Added a topotest to exercise the implementation on a non-fully
meshed P2MP network with "delay-reflood" configured. The
alternative was to use existing mechanisms to instroduce loss
but these seem less determistic in a topotest.
Signed-off-by: Acee Lindem <acee@lindem.com>
This commit adds the capabiity to filter OSPF neighbors using a
prefix-list with rules matching the neighbor's IP source address.
Configuration, filtering, immediate neighbor pruning, topo-tests,
and documentation are included. The command is:
ip ospf neighbor-filter <prefix-list> [A.B.C.D]
Signed-off-by: Acee Lindem <acee@lindem.com>
This extends non-broadcast support to point-to-multipoint networks.
Neighbors will be explicitly configured and polled in lieu of multicast
dicovery. Toptotests and documentation updates are included.
Additionally, the ospf neighbor commands have been greatly simplified taking
advantage of DEFPY() capabilities.
The AllOSPFRouters (224.0.0.5) is still joined for non-broadcast networks
since it is joined for NBMA networks. It seems this could be removed but
it should done be in a separate commit.
Signed-off-by: Acee Lindem <acee@lindem.com>
With this fix, OSPF LS Updates sent in response to OSPF LS Requests during the DB Exchange process will be sent as unicasts. Unless the timing of multiple database exchanges coincides, there is little chance that the LSAs in the LS Update are required by OSPF routers other than the one which elicited the LS Update.
This is somewhat ambigous in RFC 2328 and two errata have been filed for clarification:
https://www.rfc-editor.org/errata/eid7850https://www.rfc-editor.org/errata/eid7851
FRR OSPFv3 (ospf6d) already does it correctly - see ospf6_lsupdate_send_neighbor(struct event *thread). Also, if there is any doubt, one can refer to the C++ code at ospf.org (John Moy's seminal OSPF reference implementation).
Signed-off-by: Acee Lindem <acee@lindem.com>
Also:
- replace all /* fallthrough */ comments with portable fallthrough;
pseudo keyword to accomodate both gcc and clang
- add missing break; statements as required by older versions of gcc
- cleanup some code to remove unnecessary fallthrough
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This patch includes:
* Implementation of RFC 5709 support in OSPF. Using
openssl library and FRR key-chain,
one can use SHA1, SHA256, SHA384, SHA512 and
keyed-MD5( backward compatibility with RFC 2328) HMAC algs.
* Updating documentation of OSPF
* add topotests for new HMAC algorithms
Signed-off-by: Mahdi Varasteh <varasteh@amnesh.ir>
Add support for "[no] ip ospf capbility opaque" at the interface
level with the default being capability opaque enabled. The command
"no ip ospf capability opaque" will disable opaque LSA database
exchange and flooding on the interface. A change in configuration
will result in the interface being flapped to update our options
for neighbors but no attempt will be made to purge existing LSAs
as in dense topologies, these may received by neighbors through
different interfaces.
Topotests are added to test both the configuration and the LSA
opaque flooding suppression.
Signed-off-by: Acee <aceelindem@gmail.com>
1. Fix OSPF opaque LSA processing to preserve the stale opaque
LSAs in the Link State Database for 60 seconds consistent with
what is done for other LSA types.
2. Add a topotest that tests for cases where ospfd is restarted
and a stale OSPF opaque LSA exists in the OSPF routing domain
both when the LSA is purged and when the LSA is reoriginagted
with a more recent instance.
Signed-off-by: Acee <aceelindem@gmail.com>
In practical terms, unplanned GR refers to the act of recovering
from a software crash without affecting the forwarding plane.
Unplanned GR and Planned GR work virtually the same, except for the
following difference: on planned GR, the router sends the Grace-LSAs
*before* restarting, whereas in unplanned GR the router sends the
Grace-LSAs immediately *after* restarting.
For unplanned GR to work, ospf6d was modified to send a
ZEBRA_CLIENT_GR_CAPABILITIES message to zebra as soon as GR is
enabled. This causes zebra to freeze the OSPF routes in the RIB as
soon as the ospfd daemon dies, for as long as the configured grace
period (the defaults is 120 seconds). Similarly, ospfd now stores in
non-volatile memory that GR is enabled as soon as GR is configured.
Those two things are no longer done during the GR preparation phase,
which only happens for planned GRs.
Unplanned GR will only take effect when the daemon is killed
abruptly (e.g. SIGSEGV, SIGKILL), otherwise all OSPF routes will
be uninstalled while ospfd is exiting. Once ospfd starts, it will
check whether GR is enabled and enter in the GR mode if necessary,
sending Grace-LSAs out all operational interfaces.
One disadvantage of unplanned GR is that the neighboring routers
might time out their corresponding adjacencies if ospfd takes too
long to come back up. This is especially the case when short dead
intervals are used (or BFD). For this and other reasons, planned
GR should be preferred whenever possible.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add support for a write socket per interface, enabled by
default at the ospf instance level. An ospf instance-level
config allows this to be disabled, reverting to the older
behavior where a single per-instance socket is used for
sending and receiving packets.
Signed-off-by: Mark Stapp <mjs@labn.net>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The GR code should check for topology changes only upon the receipt
of Router-LSAs and Network-LSAs. Other LSAs types don't affect the
topology as far as a restarting router is concerned.
This optimization reduces unnecessary computations when the
restarting router receives thousands of inter-area LSAs or external
LSAs while coming back up.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
The changes involve setting DC bit on ospf hellos and
addition of new DO_NOT_AGE flag.
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
FRR implements a non-standard, but compatible approach for
sending update LSAs (it always send to 224.0.0.5) on P2MP
interfaces. This change makes it so acks are also sent to
224.0.0.5.
Since the acks are multicast, this allows an optimization
where we don't send back out the incoming P2MP interface
immediately allow time to rx multicast ack from neighbors
on the same net that rx'ed the original (multicast) update.
Signed-off-by: Lou Berger <lberger@labn.net>
ospf_write pulls an interface off the ospf->oi_write_q
then writes one packet and places it back on the queue,
keeping track of the first one sent. Then it will
stop sending packets even if we get back to the first
interface written too but before we have sent the full
pkt_count. I do not believe this makes a whole bunch
of sense and is very restrictive of how much data can
be sent especially if you have a limited number of peers
but large amounts of data. Why be so restrictive?
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Commit 3551ee9e90304 introduced a regression that causes GR to fail
under certain circumstances. In short, while ISM events should
be ignored while acting as a helper for a restarting router, the
DR/BDR fields of the neighbor structure should still be updated
while processing a Hello packet. If that isn't done, it can cause
the helper to elect the wrong DR while exiting from the helper mode,
leading to a situation where there are two DRs for the same network
segment (and a failed GR by consequence). Fix this.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
RFC 3623 says:
"If the restarting router determines that it was the Designated
Router on a given segment prior to the restart, it elects
itself as the Designated Router again. The restarting router
knows that it was the Designated Router if, while the
associated interface is in Waiting state, a Hello packet is
received from a neighbor listing the router as the Designated
Router".
Implement that logic when processing Hello messages to ensure DR
interfaces will preserve their DR status across a graceful restart.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Issue:
===================
OSPF neighbors are not going down even after 10 mins when
having a mismatch in hello and dead interval.
First neighbors are formed and then a mismatch in the interval
is created, it is observed that the neighbor is not going down.
Root Cause Analysis:
====================
The event HelloReceived defined in RFC 2328 was named as PacketReceived
and this event was scheduled whenever LS Update, LS Ack, LS Request,
DD description packet or Hello packet is received.
Although there is a mismatch in the Hello packet contents, the
event PacketReceived gets triggered due to LS Update received and the
dead timer gets reset and hence the neighbor was never going Down and
remains FULL.
Fix:
==================
As per RFC 2328, the HelloReceived needs to be triggered only when
valid OSPF Hello packet is received and not when other OSPF packets
are received. Modified the function name as well.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>