frr/doc/developer/ospf-ls-retrans.rst
Acee Lindem c494702929 ospfd: Improve OSPF neighbor retransmission list granularity and precision
The current OSPF neighbor retransmission operates on a single per-neighbor
periodic timer that sends all LSAs on the list when it expires.
Additionally, it skips the first retransmission of received LSAs so
that at least the retransmission interval has elapsed, resulting in a delay
of between one and two retransmission intervals. In environments where
the links are lossy on P2MP networks with "delay-reflood" configured (which
relies on neighbor retransmission in partial meshes), the implementation
is sub-optimal (to say the least).

This commit reimplements OSPF neighbor retransmission as follows:

   1. A new data structure making use of the application-managed
      typesafe.h doubly linked list implements an OSPF LSA
      list where each node includes a timestamp.
   2. The existing neighbor LS retransmission LSDB data structure
      is augmented with a pointer to the list node on the LSA
      list to facilitate O(1) removal when the LSA is acknowledged.
   3. The neighbor LS retransmission timer is set to the expiration
      time of the LSA at the head of the list.
   4. When the timer expires, LSAs that fall within the window of
      the current time plus a small delta (50 milliseconds by
      default) are retransmitted. The LSAs that are retransmitted are
      given an updated retransmission time and moved to the end of
      the LSA list.
   5. Configuration is added to set the "retransmission-window" to a
      value other than 50 milliseconds.
   6. Neighbor and interface LSA retransmission counters are added
      to provide insight into the lossiness of the links. However,
      these will increment quickly on non-fully meshed P2MP networks
      with "delay-reflood" configured.
   7. Added a topotest to exercise the implementation on a non-fully
      meshed P2MP network with "delay-reflood" configured. The
      alternative was to use existing mechanisms to introduce loss
      but these seem less deterministic in a topotest.

Signed-off-by: Acee Lindem <acee@lindem.com>
2024-06-20 15:31:07 +00:00


OSPF Neighbor Retransmission List
=================================

Overview
--------
OSPF neighbor link-state retransmission lists are implemented using
both a sparse Link State Database (LSDB) and a doubly-linked list.
Rather than the previous per-neighbor periodic timer, a per-neighbor
timer is set to the expiration time of the next scheduled LSA
retransmission.

Sparse Link State Database (LSDB)
---------------------------------
When an explicit or implied acknowledgment is received from a
neighbor in 2-way state or higher, the acknowledged LSA must be
removed from the neighbor's link state retransmission list. In order
to do this efficiently, a sparse LSDB is utilized. LSDB entries also
include a pointer to the corresponding list entry so that it may be
efficiently removed from the doubly-linked list.

The sparse LSDB is implemented using the OSPF functions in
ospf_lsdb.[c,h]. OSPF LSDBs are implemented as an array of route
tables (lib/table.[c,h]). What is unique about the LS retransmission
list LSDB is that each entry also has a pointer into the doubly-linked
list to facilitate fast deletions.

Doubly-Linked List
------------------
In addition to the sparse LSDB, LSAs on a neighbor LS retransmission
list are also maintained in a doubly-linked list in chronological
order, with the LSA scheduled for the next retransmission at the head
of the list.

The doubly-linked list is implemented using the dlist macros in
lib/typesafe.h.

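
The list behaviour described above can be sketched in plain C. This is
a minimal, self-contained stand-in and not the lib/typesafe.h dlist
API: the names ``rxmt_node``, ``rxmt_list``, ``rxmt_list_add_tail`` and
``rxmt_list_del`` are hypothetical and exist only for illustration. It
assumes times are absolute milliseconds.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical list node; one per LSA on the retransmission list. */
   struct rxmt_node {
       struct rxmt_node *prev, *next;
       int64_t expire_ms; /* absolute time of the next retransmission */
   };

   /* Hypothetical list head; head is the next LSA to retransmit. */
   struct rxmt_list {
       struct rxmt_node *head, *tail;
   };

   /* Append a node at the tail in O(1). */
   static void rxmt_list_add_tail(struct rxmt_list *l, struct rxmt_node *n)
   {
       assert(l != NULL && n != NULL);
       n->next = NULL;
       n->prev = l->tail;
       if (l->tail)
           l->tail->next = n;
       else
           l->head = n;
       l->tail = n;
   }

   /* Unlink a node in O(1) given only a pointer to it. */
   static void rxmt_list_del(struct rxmt_list *l, struct rxmt_node *n)
   {
       assert(l != NULL && n != NULL);
       if (n->prev)
           n->prev->next = n->next;
       else
           l->head = n->next;
       if (n->next)
           n->next->prev = n->prev;
       else
           l->tail = n->prev;
       n->prev = n->next = NULL;
   }

Because every new expiration time is the current time plus the same
retransmission interval, appending at the tail is enough to keep the
list in chronological order without a sorted insert.
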
LSA LS Retransmission List Addition
------------------------------------
When an LSA is added to a neighbor retransmission list, it is
added to both the sparse LSDB and the doubly-linked list, with a pointer
in the LSDB route-table node to the list entry. The LSA is added to
the tail of the list with its expiration time set to the current time
plus the retransmission interval. If the neighbor retransmission
timer is not set, it is set to expire at the expiration time of the
newly added LSA.

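
A rough sketch of the addition step follows. It is an illustration
under stated assumptions rather than the ospfd code: the types
``rxmt_entry`` and ``nbr_rxmt`` and the function ``nbr_rxmt_add`` are
hypothetical, the timer is modelled as a flag plus an absolute
millisecond value, and the sparse LSDB insertion is omitted.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical list entry for one LSA awaiting acknowledgment. */
   struct rxmt_entry {
       struct rxmt_entry *prev, *next;
       int64_t expire_ms;
   };

   /* Hypothetical per-neighbor retransmission state. */
   struct nbr_rxmt {
       struct rxmt_entry *head, *tail;
       bool timer_armed;
       int64_t timer_expire_ms;
       int64_t rxmt_interval_ms;
   };

   /* Add an entry at the tail and arm the timer if it is not running. */
   static void nbr_rxmt_add(struct nbr_rxmt *nbr, struct rxmt_entry *e,
                            int64_t now_ms)
   {
       assert(nbr != NULL && e != NULL);

       /* Expiration is the current time plus the retransmission interval. */
       e->expire_ms = now_ms + nbr->rxmt_interval_ms;

       e->next = NULL;
       e->prev = nbr->tail;
       if (nbr->tail)
           nbr->tail->next = e;
       else
           nbr->head = e;
       nbr->tail = e;

       /* A running timer already targets an earlier entry; leave it alone. */
       if (!nbr->timer_armed) {
           nbr->timer_armed = true;
           nbr->timer_expire_ms = e->expire_ms;
       }
   }

With a 5000 ms interval, an LSA added at time 1000 arms the timer for
6000; a second LSA added at time 1200 gets expiration 6200 and leaves
the timer untouched.
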
LSA LS Retransmission List Deletion
-----------------------------------
When an LSA is deleted from a neighbor retransmission list, it is
deleted from both the sparse LSDB and the doubly-linked list, with the
pointer in the LSDB route-table node used to efficiently delete the
entry from the list. If the LSA at the head of the list was removed,
then the neighbor retransmission timer is reset to the expiration of
the LSA now at the head of the list, or canceled if the list is empty.

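
The deletion step might look roughly like the sketch below. All names
are hypothetical: ``lsdb_entry`` stands in for the per-LSA data held in
the LSDB route-table node, and the LSDB lookup and removal themselves
are omitted. The timer is again modelled as a flag plus an absolute
millisecond value.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical list entry for one LSA awaiting acknowledgment. */
   struct rxmt_entry {
       struct rxmt_entry *prev, *next;
       int64_t expire_ms;
   };

   /* Hypothetical stand-in for the LSDB node data: a back-pointer. */
   struct lsdb_entry {
       struct rxmt_entry *list_node;
   };

   /* Hypothetical per-neighbor retransmission state. */
   struct nbr_rxmt {
       struct rxmt_entry *head, *tail;
       bool timer_armed;
       int64_t timer_expire_ms;
   };

   /* Remove an acknowledged LSA using the back-pointer, in O(1). */
   static void nbr_rxmt_delete(struct nbr_rxmt *nbr, struct lsdb_entry *le)
   {
       struct rxmt_entry *e;
       bool was_head;

       assert(nbr != NULL && le != NULL && le->list_node != NULL);
       e = le->list_node;
       was_head = (nbr->head == e);

       if (e->prev)
           e->prev->next = e->next;
       else
           nbr->head = e->next;
       if (e->next)
           e->next->prev = e->prev;
       else
           nbr->tail = e->prev;
       e->prev = e->next = NULL;
       le->list_node = NULL;

       /* Only removal of the head changes when the timer must fire. */
       if (!was_head)
           return;
       if (nbr->head) {
           nbr->timer_armed = true;
           nbr->timer_expire_ms = nbr->head->expire_ms;
       } else {
           nbr->timer_armed = false; /* list empty: cancel the timer */
       }
   }

The back-pointer is what avoids walking the list: the acknowledgment
path finds the LSDB entry by LSA key and unlinks the list node
directly.
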
Neighbor LS Retransmission List Expiration
------------------------------------------
When the neighbor retransmission timer expires, the LSA at the head of
the list and any others that expire within a configured window (e.g.,
50 milliseconds) are retransmitted. The LSAs that have been
retransmitted are removed from the list and re-added to the tail of the
list with a new expiration time that is retransmit-interval seconds in
the future.
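
The expiration handling can be sketched as below. This is an
illustration only: ``rxmt_entry``, ``nbr_rxmt`` and ``nbr_rxmt_expire``
are hypothetical names, the packet transmission is omitted, times are
absolute milliseconds, and re-arming the timer for the new head after
the pass is an assumption that follows from the timer always tracking
the head of the list.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical list entry for one LSA awaiting acknowledgment. */
   struct rxmt_entry {
       struct rxmt_entry *prev, *next;
       int64_t expire_ms;
   };

   /* Hypothetical per-neighbor retransmission state. */
   struct nbr_rxmt {
       struct rxmt_entry *head, *tail;
       bool timer_armed;
       int64_t timer_expire_ms;
       int64_t rxmt_interval_ms;
       int64_t window_ms; /* e.g. 50 */
   };

   /* Retransmit every entry due within the window; return the count. */
   static int nbr_rxmt_expire(struct nbr_rxmt *nbr, int64_t now_ms)
   {
       int64_t limit;
       struct rxmt_entry *stop, *e;
       int sent = 0;

       assert(nbr != NULL);
       limit = now_ms + nbr->window_ms;
       stop = nbr->tail; /* visit each entry at most once per pass */

       while ((e = nbr->head) != NULL && e->expire_ms <= limit) {
           bool last = (e == stop);

           /* The LSA would be retransmitted here (omitted). */
           sent++;

           /* Unlink from the head. */
           nbr->head = e->next;
           if (nbr->head)
               nbr->head->prev = NULL;
           else
               nbr->tail = NULL;

           /* Re-add at the tail with a fresh expiration time. */
           e->expire_ms = now_ms + nbr->rxmt_interval_ms;
           e->next = NULL;
           e->prev = nbr->tail;
           if (nbr->tail)
               nbr->tail->next = e;
           else
               nbr->head = e;
           nbr->tail = e;

           if (last)
               break;
       }

       /* Re-arm for the new head, or stop if nothing is pending. */
       nbr->timer_armed = (nbr->head != NULL);
       if (nbr->head)
           nbr->timer_expire_ms = nbr->head->expire_ms;
       return sent;
   }

For example, with a 5000 ms interval and a 50 ms window, a timer firing
at time 6000 over entries due at 6000, 6030 and 6200 retransmits the
first two, moves them behind the third with expiration 11000, and
re-arms the timer for 6200.
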