zebra: Allow nhg's to be reused when multiple interfaces are going amuck

Currently, if multiple interfaces go down at nearly the same time,
zebra and the upper level protocol can pass like ships in the night,
and zebra then fails to reuse an existing nhg.

Imagine that you have a route with 4 way ecmp:
  nexthop A interface a
  nexthop B interface b
  nexthop C interface c
  nexthop D interface d

Suppose interface a goes down.  Zebra receives this event, marks
singleton nexthop A down, and then recurses up the tree to the
4 way ecmp nhg and marks nexthop A as inactive there.
Zebra then notifies the upper level protocol.

The upper level protocol recomputes the route and sends it
down with 3 way ecmp.  If interface b goes down in the meantime
and zebra handles that interface down event before the new
route is installed, then zebra_nhg_rib_compare_old_nhe will
not match the old and new nexthop groups, because the old
will be:

  nexthop A  <inactive>
  nexthop B  <inactive>
  nexthop C
  nexthop D

New will be:

  nexthop B  <inactive>
  nexthop C
  nexthop D

Currently zebra_nhg_nexthop_compare skips all inactive nexthops
on the old side, but it never skips nexthops that are inactive
on the new side.

Modify the code so that the new nexthop is also skipped when it
is the same nexthop as the inactive old one being looked at and
it is not active either.  This allows zebra to keep using the
same nhg in the above case.
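
The walk described above can be illustrated with a small standalone
model.  This is only a sketch, not the zebra implementation: struct
toy_nexthop, toy_same, toy_compare and toy_scenario are hypothetical
names invented here, nexthop identity is reduced to a single
character, and the handling of leftover entries at the end of either
list is an assumption.  The skip_new argument switches between the
old behaviour and the behaviour this change introduces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a nexthop: an id, an active flag, a link. */
struct toy_nexthop {
	char id;
	bool active;
	struct toy_nexthop *next;
};

/* Hypothetical stand-in for nexthop_same(): identity is just the id. */
static bool toy_same(const struct toy_nexthop *a, const struct toy_nexthop *b)
{
	return a->id == b->id;
}

/*
 * Walk both lists in step.  Inactive old nexthops are always skipped.
 * With skip_new set, an inactive new nexthop that is the same as the
 * old one being skipped is skipped as well.
 * Assumption: leftovers on either list at the end count as a mismatch.
 */
static bool toy_compare(const struct toy_nexthop *nhop,
			const struct toy_nexthop *old_nhop, bool skip_new)
{
	while (nhop && old_nhop) {
		if (!old_nhop->active) {
			if (skip_new && !nhop->active &&
			    toy_same(nhop, old_nhop))
				nhop = nhop->next;
			old_nhop = old_nhop->next;
			continue;
		}
		if (!toy_same(nhop, old_nhop))
			return false;
		nhop = nhop->next;
		old_nhop = old_nhop->next;
	}
	return nhop == NULL && old_nhop == NULL;
}

/* Old: A(inactive) B(inactive) C D.  New: B(inactive) C D. */
static bool toy_scenario(bool skip_new)
{
	struct toy_nexthop od = { 'D', true, NULL };
	struct toy_nexthop oc = { 'C', true, &od };
	struct toy_nexthop ob = { 'B', false, &oc };
	struct toy_nexthop oa = { 'A', false, &ob };
	struct toy_nexthop nd = { 'D', true, NULL };
	struct toy_nexthop nc = { 'C', true, &nd };
	struct toy_nexthop nb = { 'B', false, &nc };

	return toy_compare(&nb, &oa, skip_new);
}
```

In this model toy_scenario(false) fails to match, because the walk
reaches active old nexthop C while the new side is still parked on
inactive B.  toy_scenario(true) matches, since B is skipped on both
sides together.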

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp 2025-04-24 14:45:14 -04:00
parent dd42058c16
commit 53dfeab287

@@ -3049,6 +3049,14 @@ static bool zebra_nhg_nexthop_compare(const struct nexthop *nhop,
			if (IS_ZEBRA_DEBUG_NHG_DETAIL)
				zlog_debug("%s: %pRN Old is not active going to the next one",
					   __func__, rn);
			if (!CHECK_FLAG(nhop->flags, NEXTHOP_FLAG_ACTIVE) &&
			    nexthop_same(nhop, old_nhop)) {
				if (IS_ZEBRA_DEBUG_NHG_DETAIL)
					zlog_debug("%s: %pRN new is not active going to the next one",
						   __func__, rn);
				nhop = nhop->next;
			}
			old_nhop = old_nhop->next;
			continue;
		}