Track whether or not we have received an answer from
our query to do nexthop tracking. This allows us to
go straight to doing a synchronous query for our
RPF.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Start separating the tracking of a destination from the act
of looking it up. Conjoining these two concepts led
to a bunch of code that had to think about both problems, which
produced weird situations and code paths. Simplify the code by making
pim_ecmp_nexthop_search a static function and only ever
calling pim_ecmp_nexthop_lookup when we need to do an RPF().
pim_ecmp_nexthop_lookup will now attempt to find a stored pnc
and if it finds one it will report on the answer from it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The pim_resolve_upstream_nh function is no longer being used,
so remove it from the code base.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When an RP gets deleted, find all the (*, G) upstreams whose group belongs to
the deleted RP and release each upstream from pnc->upstream_hash in the function
pim_delete_tracked_nexthop().
Signed-off-by: Sarita Patra <saritap@vmware.com>
When the route to the RP gets modified, FRR receives a notification from
zebra and calls the function pim_resolve_upstream_nh() to compute the
nexthop and update the upstream->rpf structure.
Issue: When the RP becomes unreachable, FRR only uninstalls
the mroute from the kernel but does not update the upstream->rpf structure.
Fix: When FRR receives a notification from zebra saying the RP has become
unreachable, update the following fields:
1. Update the channel_oil incoming interface to MAXVIFS.
2. Un-install the mroute from the kernel.
3. Switch upstream state from JOINED to NOTJOINED.
4. Clear the nexthop information of the upstream.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When the route to the RP gets modified, FRR receives a notification from
zebra and calls the function pim_update_rp_nh() to compute the
new nexthop and update the source_nexthop information of
rp_info. This does not work when the RP becomes unreachable.
Fix: When FRR receives a notification from zebra saying the RP has become
unreachable, delete the source_nexthop information of rp_info.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When FRR receives an IGMP/PIM (*, G) join and the RP is not configured or not
reachable, we create a dummy upstream with the incoming interface
set to NULL and the upstream address set to INADDR_ANY.
Add upstream address and incoming interface validation where necessary,
before doing any operation on the upstream.
Signed-off-by: Sarita Patra <saritap@vmware.com>
pimd/pim_nht.c: In function ‘pim_ecmp_nexthop_search’:
pimd/pim_nht.c:523:17: error: ‘nbr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
nexthop->nbr = nbr;
(on gcc 5.4.0; this is the only warning with that version.)
Signed-off-by: David Lamparter <equinox@diac24.net>
Abstract the RPF change for upstream handling code so
that we do not have two copies of the code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
After we have decided what has changed as part of an update
we need to send the j/p messages to our peers.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Both pim_ecmp_nexthop_lookup and pim_ecmp_fib_lookup_if_vif_index
pass the address in twice. Make the function calls consistent
and just pass in the src once.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The pim_ecmp_fib_lookup_if_vif_index function does practically
the same work as pim_ecmp_nexthop_lookup. Refactor to
use that function so that we do not have more code
that must parse the results from zclient_lookup_nexthop.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are looking up an RPF with an ECMP path, there
are situations where we fail to find a path change
because we were not considering the actual number of neighbors
we have available to us at the start of the loop.
Example:
Suppose 2-way ECMP with a neighbor on each path. We have
multiple upstreams that are strewn across both paths.
If we lose a PIM neighbor on one of the paths we would
initiate a rescan of the upstreams. If the neighbor
we lost happened to be the last ECMP path we rescanned,
we would not successfully find a new path and would leave
the upstream stranded.
This code change looks at the number of available neighbors
that we have -vs- the number of paths we have and chooses
the smaller of the two for figuring out what to do.
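A minimal sketch of the "smaller of the two" idea, for illustration only. The helper names are hypothetical, and picking a candidate with a modulo over a hash value is an assumption made for this example; the commit message itself only states that the smaller of the neighbor count and the path count is used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many candidates we can really choose from. */
uint32_t demo_usable_paths(uint32_t num_paths, uint32_t num_nbrs)
{
	return num_nbrs < num_paths ? num_nbrs : num_paths;
}

/*
 * Hypothetical helper: map a hash value onto the usable candidates.
 * Returns an index in [0, usable) or -1 when nothing is usable.
 * The modulo selection is an assumption for this sketch.
 */
int demo_pick_candidate(uint32_t hash_val, uint32_t num_paths,
			uint32_t num_nbrs)
{
	uint32_t usable = demo_usable_paths(num_paths, num_nbrs);
	uint32_t idx;

	if (usable == 0)
		return -1;

	idx = hash_val % usable;
	/* The index can never exceed either count. */
	assert(idx < num_paths && idx < num_nbrs);
	return (int)idx;
}
```

With two paths but only one remaining neighbor, every hash value maps to the single usable candidate instead of possibly landing on the path whose neighbor was lost.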
There probably exist other failure scenarios as well that
I am missing here and quite frankly the current code muddies
the water between a RPF lookup failure -vs- a RPF lookup succeeded
and there are no paths. Further work is needed here imo.
Additionally this idea of a pim_ecmp_nexthop_lookup and
pim_ecmp_nexthop_search is bogus. They are the same function and
should be merged at some point in time.
Ticket: CM-21599
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fix a couple of problems in my 1st fix for PIM nexthops reachable via a
connected route:
- Use NEXTHOP_TYPE_IPV4_IFINDEX instead of NEXTHOP_TYPE_IPV4 since we add an
  IPv4 address to an already known ifindex.
- Assign nexthop_tab[num_ifindex].protocol_distance and .route_metric before
  incrementing num_ifindex.
- Revert the default: to individual switch case statement conversion in
  zclient_read_nexthop() as requested by donaldsharp in #2347.
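The ordering point about num_ifindex can be illustrated with a small sketch. The struct and function below are hypothetical simplifications, not the pimd definitions; they only show every field of a slot being filled before the count is advanced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one nexthop table entry. */
struct demo_nexthop_entry {
	int ifindex;
	uint8_t protocol_distance;
	uint32_t route_metric;
};

/* Append one entry; returns 0 on success, -1 if the table is full. */
int demo_nexthop_tab_add(struct demo_nexthop_entry *tab, size_t tab_size,
			 size_t *num_ifindex, int ifindex,
			 uint8_t distance, uint32_t metric)
{
	assert(tab != NULL && num_ifindex != NULL);

	if (*num_ifindex >= tab_size)
		return -1;

	/* Fill every field of the current slot first ... */
	tab[*num_ifindex].ifindex = ifindex;
	tab[*num_ifindex].protocol_distance = distance;
	tab[*num_ifindex].route_metric = metric;

	/* ... and only then advance the count. */
	++(*num_ifindex);
	return 0;
}
```

Incrementing first would write the distance and metric into the next, still unused slot and leave the entry just added without them.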
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
These commands were being accepted in all VRFs and
affecting the behavior of all VRFs globally, since they were
backed by global variables.
Modify the code to make these two commands work
on a per-VRF basis.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When sending a PIM join upwards on the RP-based tree, it may get dropped on
the last hop before the RP if the RP is reachable via a connected route
(i.e. there's no associated nexthop). pimd needs to put the nexthop IP
address into the PIM join payload and fails to do that if that route has a
nexthop of 0.0.0.0. So whenever we look up a route to determine the nexthop
or we receive a nexthop tracking update from Zebra, use the destination
address as the nexthop address for connected routes.
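As a rough sketch of the substitution described above (the helper name is hypothetical and this is not the actual pimd code): when the nexthop of a route is the unspecified address 0.0.0.0, the destination address is used in its place.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: for a connected route the nexthop is 0.0.0.0,
 * so fall back to the destination address as the nexthop address.
 */
void demo_fix_connected_nexthop(struct in_addr *nexthop,
				const struct in_addr *dest)
{
	assert(nexthop != NULL && dest != NULL);

	if (nexthop->s_addr == htonl(INADDR_ANY))
		*nexthop = *dest;
}
```

A nexthop that already carries a real address is left untouched, so the helper can be applied to every lookup result or tracking update.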
Fixes #2326.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Create a zapi_nexthop_update_decode function that both
pim and bgp use to decode the message from zebra.
There probably could be further optimizations but I opted
to keep the code as similar as possible to the
originals because they both make some assumptions about
code flow that I do not fully understand yet.
The real goal here is that I want to create a new
user of the nexthop tracking code from a higher level
daemon and I see no need to re-implement this damn
code again for a 3rd time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Abstract the code that sends the zapi message into zebra
for the turn on/off of nexthop tracking for a prefix.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A recent commit has shown that we were not consistent in our
handling of the vrf lookup. Adjust pim to do the right
thing with vrf lookup to be consistent and to make SA
(static analysis) happier.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This improves code readability and also future-proofs our codebase
against new changes in the data structure used to store interfaces.
The FOR_ALL_INTERFACES_ADDRESSES macro was also moved to lib/ but
for now only babeld is using it.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is an important optimization for users running FRR on systems with
a large number of interfaces (e.g. thousands of tunnels). Red-black
trees scale much better than sorted linked-lists and also store the
elements in an ordered way (contrary to hash tables).
This is a big patch but the interesting bits are all in lib/if.[ch].
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Convert the list_delete(struct list *) function to use
struct list **. This is to allow the list pointer to be nulled.
I keep running into uses of this list_delete function where we
forget to set the returned pointer to NULL and attempt to use
it and then experience a crash, usually after the developer
has long since left the building.
Let's make the API explicit about setting the list pointer
to NULL.
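The pointer-to-pointer pattern can be sketched like this. The demo_list type and its functions are hypothetical stand-ins written for this example, not the lib/linklist implementation; the point is only that the delete function receives the address of the caller's pointer and clears it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal list, for illustration only. */
struct demo_node {
	struct demo_node *next;
	void *data;
};

struct demo_list {
	struct demo_node *head;
	unsigned int count;
};

struct demo_list *demo_list_new(void)
{
	return calloc(1, sizeof(struct demo_list));
}

/* Push data at the head; returns 0 on success, -1 on allocation failure. */
int demo_list_add(struct demo_list *list, void *data)
{
	struct demo_node *node = malloc(sizeof(*node));

	if (node == NULL)
		return -1;
	node->data = data;
	node->next = list->head;
	list->head = node;
	list->count++;
	return 0;
}

/* Free the list and NULL the caller's pointer so it cannot be reused. */
void demo_list_delete(struct demo_list **list)
{
	struct demo_node *node, *next;

	assert(list != NULL);
	if (*list == NULL)
		return; /* already deleted: nothing to do */

	for (node = (*list)->head; node != NULL; node = next) {
		next = node->next;
		free(node);
	}
	free(*list);
	*list = NULL;
}
```

A caller writes demo_list_delete(&l), after which l is NULL and a second delete is a harmless no-op rather than a use-after-free.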
Cynical Prediction: This code will expose an attempt
to use the NULL'ed list pointer in some obscure bit
of code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This cleanup covers unset values and variables that are no longer used.
Regarding argv_find() in ospfd/ospf_vty.c:
the result is never NULL there, so just assign coststr = argv[idx]->arg;
This does three things:
1) When we get a RPF_FAILURE, remove the mroute associated
with it.
-> This way when the RPF comes back we can just add the
mroute in as part of the normal scanning process.
2) When we do an ecmp_nexthop_search, return 1 when we have found
something we can use.
3) Ignore output from pim_update_rp_nh
-> When we do an ecmp_nexthop_search, ignore the return
code and do not attempt to gather it up to return
to the calling function. It is just ignored,
and we were not taking into account the fact that
we were looking at multiple RPs.
Ticket: CM-17218
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The NHT upstream list at scale is horribly inefficient due to keeping
a sorted list of upstream entries. Attempting to find
the upstream and inserting it into the upstream_list
was consuming a large amount of CPU cycles.
Convert to a hash, allowing additions and deletions to effectively become
O(1) events.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we receive a new ECMP path and the old nexthop is still
valid, there existed some cases where we would continue looking
for a nexthop after we had already found it (and thus lose the
fact that we had found it).
Ticket: CM-16983
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ensure that displayed (S,G) output in logs is
consistent for all debugs. This will make it
easier to grep for interesting data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Be aware that we may not have pim configured on all interfaces when
we have a failure situation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the upstream_list, hash and wheel into 'struct pim_instance'.
Replace all uses of pimg with pim in pim_upstream.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>