1. On P2MP interfaces, direct ack would include the same LSA multiple times
multiple packets are processed before the OSPF interfae direct LSA
acknowledgment event is processed. Now duplicates LSA in the same event
are suppressed.
2. On non-broadcast interfaces, direct acks for multiple neighbors would be
unicast to the same neighbor due to the multiple OSPF LS Update packets
being process prior to the OSPF interface direct ack event. Now, separate
direct acks are unicast to the neighbors requiring them.
3. The interface delayed acknowledgment timer runs would run continously
(every second as long as the interace is up). Now, the timer is set
when delayed acknowledgments are queued and all queued delayed
acknowledges are sent when it fires.
4. For non-broadcast interface delayed acknowledgments, the logic to send
to multiple neighbors wasn't working because the list was emptied while
building the packet for the first neighbor.
Signed-off-by: Acee Lindem <acee@lindem.com>
The current OSPF neighbor retransmission operates on a single per-neighbor
periodic timer that sends all LSAs on the list when it expires.
Additionally, since it skips the first retransmission of received LSAs so
that at least the retransmission interval (resulting in a delay of between
the retransmission interval and twice the interval. In environments where
the links are lossy on P2MP networks with "delay-reflood" configured (which
relies on neighbor retransmission in partial meshs), the implementation
is sub-optimal (to say the least).
This commit reimplements OSPF neighbor retransmission as follows:
1. A new data structure making use the application managed
typesafe.h doubly linked list implements an OSPF LSA
list where each node includes a timestamp.
2. The existing neighbor LS retransmission LSDB data structure
is augmented with a pointer to the list node on the LSA
list to faciliate O(1) removal when the LSA is acknowledged.
3. The neighbor LS retransmission timer is set to the expiration
timer of the LSA at the top of the list.
4. When the timer expires, LSAs are retransmitted that within
the window of the current time and a small delta (50 milli-secs
default). The LSAs that are retransmited are given an updated
retransmission time and moved to the end of the LSA list.
5. Configuration is added to set the "retransmission-window" to a
value other than 50 milliseconds.
6. Neighbor and interface LSA retransmission counters are added
to provide insight into the lossiness of the links. However,
these will increment quickly on non-fully meshed P2MP networks
with "delay-reflood" configured.
7. Added a topotest to exercise the implementation on a non-fully
meshed P2MP network with "delay-reflood" configured. The
alternative was to use existing mechanisms to instroduce loss
but these seem less determistic in a topotest.
Signed-off-by: Acee Lindem <acee@lindem.com>
ospfv3 shows this unconditionally, and ospfv2 does not show `ip ospf network ...` if the type of the interface matches the specified network.
Fixes: https://github.com/FRRouting/frr/issues/15817
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This commit adds the capabiity to filter OSPF neighbors using a
prefix-list with rules matching the neighbor's IP source address.
Configuration, filtering, immediate neighbor pruning, topo-tests,
and documentation are included. The command is:
ip ospf neighbor-filter <prefix-list> [A.B.C.D]
Signed-off-by: Acee Lindem <acee@lindem.com>
This extends non-broadcast support to point-to-multipoint networks.
Neighbors will be explicitly configured and polled in lieu of multicast
dicovery. Toptotests and documentation updates are included.
Additionally, the ospf neighbor commands have been greatly simplified taking
advantage of DEFPY() capabilities.
The AllOSPFRouters (224.0.0.5) is still joined for non-broadcast networks
since it is joined for NBMA networks. It seems this could be removed but
it should done be in a separate commit.
Signed-off-by: Acee Lindem <acee@lindem.com>
OSPF intra/inter area routes were previously marked to assure they
are re-installed after a fast link flap in the commit:
commit effee18744
Author: Donald Sharp <sharpd@nvidia.com>
Date: Mon May 24 13:45:29 2021 -0400
ospfd: Fix quick interface down up event handling in ospf
This commit extends this fix to OSPF AS External routes as well.
Signed-off-by: Acee <aceelindem@gmail.com>
1. When an OSPF interface is deleted, remove the references in link-local
LSA. Delete the LSA from the LSDB so that the callback has accessibily
to the interface prior to deletion.
2. Fix a double free for the opaque function table data structure.
3. Assure that the opaque per-type information and opaque function table
structures are removed at the same time since they have back pointers
to one another.
4. Add a topotest variation for the link-local opaque LSA crash.
Signed-off-by: Acee <aceelindem@gmail.com>
Replace `struct list *` with `DLIST(if_connected, ...)`.
NB: while converting this, I found multiple places using connected
prefixes assuming they were IPv4 without checking:
- vrrpd/vrrp.c: vrrp_socket()
- zebra/irdp_interface.c: irdp_get_prefix(), irdp_if_start(),
irdp_advert_off()
(these fixes are really hard to split off into separate commits as that
would require going back and reapplying the change but with the old list
handling)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
INTERFACE_NAMSIZ is just a redefine of IFNAMSIZ and IFNAMSIZ
is the standard for interface name length on all platforms
that FRR currently compiles on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
...so that multiple functions can be subscribed.
The create/destroy hooks are renamed to real/unreal because that's what
they *actually* signal.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Also:
- replace all /* fallthrough */ comments with portable fallthrough;
pseudo keyword to accomodate both gcc and clang
- add missing break; statements as required by older versions of gcc
- cleanup some code to remove unnecessary fallthrough
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This patch includes:
* Implementation of RFC 5709 support in OSPF. Using
openssl library and FRR key-chain,
one can use SHA1, SHA256, SHA384, SHA512 and
keyed-MD5( backward compatibility with RFC 2328) HMAC algs.
* Updating documentation of OSPF
* add topotests for new HMAC algorithms
Signed-off-by: Mahdi Varasteh <varasteh@amnesh.ir>
Some fixes for the per-interface write sockets: better align
opening and closing them with ospf config actions; set
read buffer to zero since these sockets are used only for
writing packets.
Signed-off-by: Mark Stapp <mjs@labn.net>
Add support for "[no] ip ospf capbility opaque" at the interface
level with the default being capability opaque enabled. The command
"no ip ospf capability opaque" will disable opaque LSA database
exchange and flooding on the interface. A change in configuration
will result in the interface being flapped to update our options
for neighbors but no attempt will be made to purge existing LSAs
as in dense topologies, these may received by neighbors through
different interfaces.
Topotests are added to test both the configuration and the LSA
opaque flooding suppression.
Signed-off-by: Acee <aceelindem@gmail.com>
interface link update event needs
to be handle properly in ospf interface
cache.
Example:
When vrf (interface) is created its default type
would be set to BROADCAST because ifp->status
is not set to VRF.
Subsequent link event sets ifp->status to vrf,
ospf interface update need to compare current type
to new default type which would be VRF (OSPF_IFTYPE_LOOPBACK).
Since ospf type param was created in first add event,
ifp vrf link event didn't update ospf type param which
leads to treat vrf as non loopback interface.
Ticket:#3459451
Testing Done:
Running config suppose to bypass rendering default
network broadcast for loopback/vrf types.
Before fix:
vrf vrf1
vni 4001
exit-vrf
!
interface vrf1
ip ospf network broadcast
exit
After fix: (interface vrf1 is not displayed).
vrf vrf1
vni 4001
exit-vrf
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Currently, delayed reflooding on P2MP interfaces for LSAs received
from neighbors on the interface is unconditionally (see commit
c706f0e32b). In some cases, this
change wasn't desirable and this feature makes delayed reflooding
configurable for P2MP interfaces via the CLI command:
"ip ospf network point-to-multipoint delay-reflood" in interface
submode.
Signed-off-by: Acee <aceelindem@gmail.com>
When setting an loopback's cost, set the value to 0, unless the operator
has assigned a value for the loopback's cost.
RFC states:
If the state of the interface is Loopback, add a Type 3
link (stub network) as long as this is not an interface
to an unnumbered point-to-point network. The Link ID
should be set to the IP interface address, the Link Data
set to the mask 0xffffffff (indicating a host route),
and the cost set to 0.
FRR is going to allow this to be overridden if the operator specifically
sets a value too.
Fixes: #13472
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This command makes unplanned GR more reliable by manipulating the
sending of Grace-LSAs and Hello packets for a certain amount of time,
increasing the chance that the neighboring routers are aware of
the ongoing graceful restart before resuming normal OSPF operation.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add support for a write socket per interface, enabled by
default at the ospf instance level. An ospf instance-level
config allows this to be disabled, reverting to the older
behavior where a single per-instance socket is used for
sending and receiving packets.
Signed-off-by: Mark Stapp <mjs@labn.net>
Whenever OSPF virtual-link is created, a virtual interface is
associated with it. Name of the virtual interface is derived by
combining "VLINK" string with the value of vlink_count, which is a global
variable.
Problem:
Consider a scenario where 2 virtual links A and B are created in OSPF with
virtual interfaces VLINK0 and VLINK1 respectively. When virtual-link A is unconfigured
and reconfigured, new interface name derived for it will be VLINK1, which is already
associated with virtual-link B. Due to this, both virtual-links A and B will
point to the same interface, VLINK1.
During FRR restart when signal handler is called, OSPF goes through all the virtual
links and deletes the interface(oi) associated with it. During the deletion of interface
for virtual-link B,it accesses the interface which was deleted already(which was deleted
during deletion of virual-link A) and whose fields were set to NULL. This
leads to OSPF crash.
Fixed it by not decrementing vlink_count during unconfig/deletion for virtual-link.
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Don't directly use `time()` for generating sequence numbers for two
reasons:
1. `time()` can go backwards (due to NTP or time adjustments)
2. Coverity Scan warns every time we truncate a `time_t` variable for
good reason (verify that we are Y2K38 ready).
Found by Coverity Scan (CID 1519812, 1519786, 1519783 and 1519772)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When forming a neighbor relationship on an interface, ospf is
currently evaluating unnumbered as highest priority, without
any consideration for if you have /32's and non /32's on the
interface. Effectively if I have something like this:
int foo0
ip address 192.168.119.1/24
!
router ospf
network 0.0.0.0/0 area 0
!
ospf will form a neighbor on foo0 if it exists. Now
suppose someone does this:
int foo0
ip address 192.168.120.1/32
This will create the unnumbered interface on foo0 and
the peering will come down immediately.
The problem here is that the original designers of the unnumbered
code for ospf didn't envision end operators mixing and matching
addresses on an interface like this ( for perfectly legitimate
reasons I might add ).
So if ospf has both numbered and unnumbered let's match against
the numbered first and then unnumbered. This solves the problem
Fixes: #6823
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Skip marking routes as changed in ospf_if_down if there's now
new_table present, which might be the case when the instance is
being finished
The backtrace for the core was:
raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
core_handler (signo=11, siginfo=0x7fffffffe170, context=<optimized out>) at lib/sigevent.c:262
<signal handler called>
route_top (table=0x0) at lib/table.c:401
ospf_if_down (oi=oi@entry=0x555555999090) at ospfd/ospf_interface.c:849
ospf_if_free (oi=0x555555999090) at ospfd/ospf_interface.c:339
ospf_finish_final (ospf=0x55555599c830) at ospfd/ospfd.c:749
ospf_deferred_shutdown_finish (ospf=0x55555599c830) at ospfd/ospfd.c:578
ospf_deferred_shutdown_check (ospf=<optimized out>) at ospfd/ospfd.c:627
ospf_finish (ospf=<optimized out>) at ospfd/ospfd.c:683
ospf_terminate () at ospfd/ospfd.c:653
sigint () at ospfd/ospf_main.c:109
quagga_sigevent_process () at lib/sigevent.c:130
thread_fetch (m=m@entry=0x5555556e45e0, fetch=fetch@entry=0x7fffffffe9b0) at lib/thread.c:1709
frr_run (master=0x5555556e45e0) at lib/libfrr.c:1174
main (argc=9, argv=0x7fffffffecb8) at ospfd/ospf_main.c:254
Signed-off-by: Tomi Salminen <tsalminen@forcepoint.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We should always treat the VRF interface as a loopback. Currently, this
is not the case, because in some old pre-VRF code we use if_is_loopback
instead of if_is_loopback_or_vrf. To avoid any future problems, the
proposal is to rename if_is_loopback_or_vrf to if_is_loopback and use it
everywhere. if_is_loopback is renamed to if_is_loopback_exact in case
it's ever needed, but currently it's not used anywhere.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Running ospf_topo_vrf1 leads us to this valgrind issue:
==2386518== Invalid read of size 8
==2386518== at 0x4971520: route_top (table.c:401)
==2386518== by 0x181F08: ospf_interface_bfd_apply (ospf_bfd.c:126)
==2386518== by 0x182069: ospf_interface_disable_bfd (ospf_bfd.c:158)
==2386518== by 0x18BF51: ospf_del_if_params (ospf_interface.c:557)
==2386518== by 0x18C584: ospf_if_delete_hook (ospf_interface.c:712)
==2386518== by 0x490CA0B: hook_call_if_del (if.c:61)
==2386518== by 0x490D1F3: if_delete_retain (if.c:286)
==2386518== by 0x490D337: if_delete (if.c:309)
==2386518== by 0x490CDED: if_destroy_via_zapi (if.c:200)
==2386518== by 0x49940A9: zclient_interface_delete (zclient.c:2237)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518== Address 0x5df39a0 is 0 bytes inside a block of size 56 free'd
==2386518== at 0x48399AB: free (vg_replace_malloc.c:538)
==2386518== by 0x492A03E: qfree (memory.c:141)
==2386518== by 0x4970C6F: route_table_free (table.c:141)
==2386518== by 0x4970A36: route_table_finish (table.c:61)
==2386518== by 0x18C543: ospf_if_delete_hook (ospf_interface.c:708)
==2386518== by 0x490CA0B: hook_call_if_del (if.c:61)
==2386518== by 0x490D1F3: if_delete_retain (if.c:286)
==2386518== by 0x490D337: if_delete (if.c:309)
==2386518== by 0x490CDED: if_destroy_via_zapi (if.c:200)
==2386518== by 0x49940A9: zclient_interface_delete (zclient.c:2237)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518== Block was alloc'd at
==2386518== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==2386518== by 0x4929EFC: qcalloc (memory.c:116)
==2386518== by 0x49709F8: route_table_init_with_delegate (table.c:53)
==2386518== by 0x49717F4: route_table_init (table.c:528)
==2386518== by 0x18C328: ospf_if_new_hook (ospf_interface.c:659)
==2386518== by 0x490C97D: hook_call_if_add (if.c:60)
==2386518== by 0x490CE85: if_create_name (if.c:223)
==2386518== by 0x490DF32: if_get_by_name (if.c:622)
==2386518== by 0x4993F73: zclient_interface_add (zclient.c:2186)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518==
Fix the ordering to do the individual node tree cleanup after we delete
the data we care about.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, we automatically delete an inactive VRF when its last
interface is deleted. This code introduces a couple of crashes because
of the following problems:
- vrf_delete is called before calling if_del hook, so daemons may try to
dereference an ifp->vrf pointer which is freed
- in if_terminate, we continue to use the VRF in the loop condition
after the last interface is deleted
This check is needed only when the interface is deleted by the user,
because if the interface is deleted by the system, VRF must still exist
in the system. Move the check to appropriate places to fix crashes.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
It allows FRR to read the interface config even when the necessary VRFs
are not yet created and interfaces are in "wrong" VRFs. Currently, such
config is rejected.
For VRF-lite backend, we don't care at all about the VRF of the inactive
interface. When the interface is created in the OS and becomes active,
we always use its actual VRF instead of the configured one. So there's
no need to reject the config.
For netns backend, we may have multiple interfaces with the same name in
different VRFs. So we care about the VRF of inactive interfaces. And we
must allow to preconfigure the interface in a VRF even before it is
moved to the corresponding netns. From now on, we allow to create
multiple configs for the same interface name in different VRFs and
the necessary config is applied once the OS interface is moved to the
corresponding netns.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There's a helper function to check whether the interface is loopback or
VRF - if_is_loopback_or_vrf. Let's use it whenever we need to check that.
There's no functional change in this commit.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Adding defensive code to the interface_link_params zebra callback
to check if the link params changed before taking action.
Signed-off-by: Karen Schoener <karen@voltanet.io>
When we get this sequence of events:
- zebra receives interface up, sends to ospf
- ospf receives intf up, processes( including neighbor formation and spf )
and sends route to zebra for installation.
- zebra receives route for processing, schedules it too happen in the future
- zebra receives interface down event, sends to ospf
- zebra processes route X and marks it inactive because nexthop
interface is down
- zebra receives interface up event, sends to ospf
- ospf receives both events and processes the change and decides
that nothing has changed so it does not send any route change for X to zebra.
At this point zebra has a route from ospf that is marked as inactive, while
ospf believes that the route should be installed properly.
Modify the code such that on an interface down event, ospf marks the routes
as changed if the ifindex is being used for a nexthop, so that when ospf
is deciding if routes have changed post spf that it can just automatically
send that route down again if it still exists.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>