Sw3# sh ip pim rp-info
RP address group/prefix-list OIF I am RP Source
20.0.0.2 225.1.1.1/32 ens192 no BSR
9.9.9.9 226.1.1.1/32 (Unknown) no BSR
30.0.0.100 229.1.1.5/32 ens192 no Static
Signed-off-by: Saravanan K <saravanank@vmware.com>
Start the separation of tracking a Destination from the act
of looking it up. The cojoining of these two concepts led
to a bunch of code that had to think about both problems leading
to weird situations and code paths. Simplify the code by making
pim_ecmp_nexthop_search a static function and we only ever
call pim_ecmp_nexthop_lookup when we need to do a RPF().
pim_ecmp_nexthop_lookup will now attempt to find a stored pnc
and if it finds one it will report on the answer from it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When creating new RP information from a `ip pim rp A.B.C.D/M A.B.C.D`
we should determine if we are the RP even if we can or cannot
determine if we have a path to the RP via RPF.
This is because we should determine if we are the RP based upon a
connected ip address match not whether or not we have a path to
the RPF. We would normally think this is not important but
RPF is inherently asynchronous and we can have a state where
we have registered for nht but have not received the actual
path back yet from zebra.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a new pim neighbor comes up, we need to go through and
mark nexthops that we have received from zebra for reachability
tracking so we can refigure stuff. If we pass in the new neighbor
we can limit the search to those nexthops out the interface that
the new neighbor has come up.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The pim_rp_check_is_my_ip_address function was checking to see
if we were the actual RP as well as the pim_register code
was doing the same thing. Remove the reduncancy a bit and
just make this function check for that we are the actual receiver
of this data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The interface column in pim was limited to 8 or 9 columns
all over the place in pim, fix the code up to allow interface
length to be up to 16 columns.
Ticket: CM-23083
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a RP gets deleted, find all the (*, G) upstream whose
group belongs to the deleted RP.
case 1: if the group belongs to any other rp, then call
pim_upstream_update() to update the upstream addr and rpf
information.
case 2: If no RP found for the group, then clear the pim
upstream address and rpf information.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When a new RP is configured, find all the (*, G) upstream whose
group belongs to the new RP and then update the upstream structure
with the below fields.
1. De-register for the old RP.
2. Set the upstream address as new RP
3. Register for the new RP.
4. Update the upstream rpf information and kernel multicast forwarding
cache(MFC), if the new RP is reachable.
Signed-off-by: Sarita Patra <saritap@vmware.com>
In this commit, we are creating a dummy upstream & dummy channel_oil
for (*, G) when RP is not configured or not reachable.
Dummy upstream: <upstream_addr = INADDR_ANY, rpf = Unknown>
Dummy channel oil: <iif = MAXVIFS>
Signed-off-by: Sarita Patra <saritap@vmware.com>
Issue: Configure "ip pim rp x.x.x.x 225.0.0.0/4".
Show running config shows "ip pim rp x.x.x.x 224.0.0.0/4"
This is mis-leading.
Root-cause: Internally 225.0.0.0/4 is getting converted to
224.0.0.0/4 group mask, since the prefix length is 4.
Fix: Restrict the user to configure inconsistent group address
mask by throughing a cli error "Inconsistent address and mask".
Signed-off-by: Sarita Patra <saritap@vmware.com>
While terminating pim instance, the memory allocated for pim nexthop
should be released before deallocating the memory of pim nexthop cache(pnc).
This resolves the memory leak detected in pnc->nexthop creation.
Signed-off-by: Sarita Patra <saritap@vmware.com>
There is no need to check for failure of a ALLOC call
as that any failure to do so will result in a assert
happening. So we can safely remove all of this code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In places where we do a pim_ecmp_nexthop_search, also
use pim_ecmp_nexthop_lookup instead of the single path
case of pim_nexthop_lookup.
This is in preparation of more serious surgery to fix
the weird api of pim_find_or_track_nexthop.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This improves code readability and also future-proofs our codebase
against new changes in the data structure used to store interfaces.
The FOR_ALL_INTERFACES_ADDRESSES macro was also moved to lib/ but
for now only babeld is using it.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is an important optimization for users running FRR on systems with
a large number of interfaces (e.g. thousands of tunnels). Red-black
trees scale much better than sorted linked-lists and also store the
elements in an ordered way (contrary to hash tables).
This is a big patch but the interesting bits are all in lib/if.[ch].
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Convert the list_delete(struct list *) function to use
struct list **. This is to allow the list pointer to be nulled.
I keep running into uses of this list_delete function where we
forget to set the returned pointer to NULL and attempt to use
it and then experience a crash, usually after the developer
has long since left the building.
Let's make the api explicit in it setting the list pointer
to null.
Cynical Prediction: This code will expose a attempt
to use the NULL'ed list pointer in some obscure bit
of code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
All the rp debugs were a mish-mash of TRACE or ZEBRA,
but the reality they were all focused on handling NHT
issues associated with the RP's. So let's create
a new debug 'debug pim nht rp' if you are having
issues with RP's.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This feature does this:
Add the ability to store the non-prefix static RP
entries into a table. Then to lookup the G to
find the RP in that table, finding the longest
prefix match across both prefix-lists and
static RP's.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are searching for a RP to use, amongst
many RP's and separate prefix-lists, Match on
the longest prefix specified to choose the
correct RP.
Example:
ip pim rp 4.3.2.1 prefix-list A
ip pim rp 4.3.2.2 prefix-list B
ip pim rp 4.3.2.3 prefix-list C
ip prefix-list A seq 5 permit 225.0.0.0/8
ip prefix-list B seq 5 permit 225.1.0.0/16
ip prefix-list C seq 5 permit 225.1.1.0/24
Old behavior: Group 225.1.1.14 comes in and
we need to find the RP to use, we would match
on the first prefix-list A( since we are searching
based on a sorted link list of RP address ) and
select 4.3.2.1 as our RP
New behavior: Group 225.1.1.14 comes in and
we need to find theRP to use, we now will
match on C( longest prefix match ) and select
4.3.2.3 as our RP.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) Error check return from setsockopt and sockets
2) Check return codes for str2prefix
3) Clean up some potential NULL References
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The NHT upstream list at scale is horribly inefficient due to keeping
a sorted list of upstream entries. The attempting to find
the upstream and the insertion of it into the upstream_list
was consuming a large amount of cpu cycles.
Convert to a hash, allow add/deletions to effectively become
O(1) events.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A bunch of functions had return values that were never
checked for ( and not needed ) and opposite return values
for proper calling function boolean logic.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) Create pim_instance.[ch] to allow us to handle the instance information there
2) Refactor some pim_rpf_ and some pim_rp commands into appropriate files and
appropriate includes.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This reverts commit c14777c6bf.
clang 5 is not widely available enough for people to indent with. This
is particularly problematic when rebasing/adjusting branches.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The secondary address comparison done to determine if we are
an RP for a specified address was comparing A.B.C.D/32 to A.B.C.D/0
because when we created the rp_info we were not setting the prefixlen
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
During processing of Join/Prune,
for a S,G entry, current state is SGRpt, when only *,G is
received, need to clear SGRpt and add/inherit the *,G OIF to S,G so
it can forward traffic to downstream where *,G is received.
Upon receiving SGRpt prune remove the inherited *,G OIF.
From, downstream router received *,G Prune along with SGRpt
prune. Avoid sending *,G and SGRpt Prune together.
Reset upstream_del reset ifchannel to NULL.
Testing Done:
Run failed smoke test of sending data packets, trigger SPT switchover,
*,G path received SGRpt later data traffic stopped S,G ages out from LHR, sends only
*,G join to upstream, verified S,G entry inherit the OIF.
Upon receiving SGRpt deletes inherited oif and retains in SGRpt state.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
During PIM Neighbor change/UP event, pim_scan_oil api
scans all channel oil to see any rpf impacted. Instead of
passing current upstream's RPF it passes current RPF as 0 and
does query to rib for nexhtop (without ECMP/Rebalance). This creates
inconsist RPF between Upstream and Channel oil.
In Channel Oil keep backward pointer to upstream DB and fetch up's
RPF and passed to channel_oil scan.
Decrement channel_oil ref_count in upstream_del when decrementing
up ref_count and it is not the last.
Created ECMP based FIB lookup API.
Testing Done:
Performed following testing on tester setup:
5 x LHR, 4 x MSDP Spines, 6 Sources each sending to 1023 groups from one of the spines.
Total send rate 8Mpps.
Test that caused problems was to reboot every device at the same time.
After fix performed 5 iterations of reboot devices and show no sign of the problem.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
During neighbor down event, all upstream entries rpf lookup may result
into nhop address with 0.0.0.0 and rpf interface info being NULL.
Put preventin check where rpf interface info is accessed.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In this patch, PIM nexthop tracking uses locally populated nexthop cached list
to determine ECMP based nexthop (w/ ECMP knob enabled), otherwise picks
the first nexthop as RPF.
Introduced '[no] ip pim ecmp' command to enable/disable PIM ECMP knob.
By default, PIM ECMP is disabled.
Intorudced '[no] ip pim ecmp rebalance' command to provide existing mcache
entry to switch new path based on hash chosen path.
Introduced, show command to display pim registered addresses and respective nexthops.
Introuduce, show command to find nexthop and out interface for (S,G) or (RP,G).
Re-Register an address with nexthop when Interface UP event received,
to ensure the PIM nexthop cache is updated (being PIM enabled).
During PIM neighbor UP, traverse all RPs and Upstreams nexthop and determine, if
any of nexthop's IPv4 address changes/resolves due to neigbor UP event.
Testing Done: Run various LHR, RP and FHR related cases to resolve RPF using
nexthop cache with ECMP knob disabled, performed interface/PIM neighbor flap events.
Executed pim-smoke with knob disabled.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
(cherry picked from commit cba4448178)
In this patch, PIM nexthop tracking uses locally populated nexthop cached list
to determine ECMP based nexthop (w/ ECMP knob enabled), otherwise picks
the first nexthop as RPF.
Introduced '[no] ip pim ecmp' command to enable/disable PIM ECMP knob.
By default, PIM ECMP is disabled.
Intorudced '[no] ip pim ecmp rebalance' command to provide existing mcache
entry to switch new path based on hash chosen path.
Introduced, show command to display pim registered addresses and respective nexthops.
Introuduce, show command to find nexthop and out interface for (S,G) or (RP,G).
Re-Register an address with nexthop when Interface UP event received,
to ensure the PIM nexthop cache is updated (being PIM enabled).
During PIM neighbor UP, traverse all RPs and Upstreams nexthop and determine, if
any of nexthop's IPv4 address changes/resolves due to neigbor UP event.
Testing Done: Run various LHR, RP and FHR related cases to resolve RPF using
nexthop cache with ECMP knob disabled, performed interface/PIM neighbor flap events.
Executed pim-smoke with knob disabled.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
It is impossible for the list->cmp function to
ever be handed NULL values as the arguments.
Clean up this in the code.
Additionally consolidate the exact same two functions
into 1 function.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are checking RP addresses and looking at the secondary
address. With the addition of the ability to handle v6 addresses
in the secondary list. Assuming that the secondary address
is a v4 address is a no go.
Convert to prefix_same.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add pim Nexthop tracking feature 1st part where, specific RP or Source address (unicast address)
register with Zebra. Once nexthop update received from Zebra for a given address, scan RP or upstream
entries impacted by the change in nexthop.
Reviewed By: CCR-5761, Donald Sharp <sharpd@cumulusnetworks.com>
Testing Done: Tested with multiple RPs and multiple *,G entries at LHR.
Add new Nexthop or remove one of the link towards RP and verify RP and upstream nexthop update.
similar test done at RP with multiple S,G entries to reach source.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When a new rp is entered, pim is looking at all rp's and failing the check if
any of the RP's have no path to the RP, instead of the one that was
just entered being wrong.
Ticket: CM-12623
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
When we modify the 224.0.0.0/4 rp address with a:
'ip pim rp A.B.C.D'
We need to let msdp know that this command was entered.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Anycast requires that the lo interface be associated with multiple
addresses. One is the anycast IP address (which is the same on all RPs
participating in RP redundancy) and the second is the unique IP address
that will be used as the router id by routing protocols.
To accomodate that we maintain a list of secondary addresses per-pim iface
and allow any of them to be the RP address. This lets the I_am_RP macro
succeed on anycast RPs.
Note that the support is limited i.e. we don't actually advertise a
secondary list to the neighbors. This is assuming the anycast IP will never
be used as a router id i.e. will never be an RPF neighbor.
Sample output:
==============
dell-s6000-04# sh ip pim interface lo
Interface : lo
State : up
Address : 100.1.1.1 (primary)
100.1.1.2
100.1.1.3
100.1.2.1
>>>>>>> SNIP >>>>>>>>>>>>>>>
dell-s6000-04# sh ip pim interface lo json
{
"lo":{
"name":"lo",
"state":"up",
"address":"100.1.1.1",
"index":1,
"lanDelayEnabled":true,
"secondaryAddressList":[
"100.1.1.2",
"100.1.1.3",
"100.1.2.1"
],
>>>>>>> SNIP >>>>>>>>>>>>>>>
dell-s6000-04#sh ip pim rp-info
RP address group/prefix-list OIF I am RP
100.1.2.1 224.0.0.0/4 lo yes
dell-s6000-04#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
If you specified the 'ip pim rp ...' after the
system has been configured it was not accepting
the new rp. This fixes that issue.
Ticket:CM-12623
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a 'show ip pim rp-info' is issued shortly
after a restart/start, pim will crash because
nexthop information has not been fully resolved
and the outgoing interface is NULL.
Ticket: CM-13567
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1. Added a new MSDP source reference flag for creating (S,G) entries
based on the SA-cache. The RFC recommends treating as SA like rxing
a (S, G) join (which is a bit different then treating like a traffic
stream).
2. SA-SPT is only setup if we are RP for the group and a corresponding
(*,G) exists with a non-empty OIL.
3. When an SA is moved we need to let the SPT live if it is active (this
change will come in a subsequent CL).
Testing done:
1. SA first; SPT setup whenever (*, G) comes around.
2. (*, G) first. As soon as SA is added SPT is setup.
3. (*, G) del with valid SA entries around.
Ticket: CM-13306
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
Pim sometimes needs the upstream rpf lookup to
only take into account if we have a nbr out
the selected interface or not. Move
the code for this to a better spot so
we can make a more intelligent decision
here.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The recent conversion of in_addr_t to struct prefix
caused a display issue in the 'show run' of the rp.
Ticket: CM-12752
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When I converted over to using 'struct prefix'
I broke the initialization of the rp.
In addition, I used the wrong AFI type
to switch on in pim_rpf.c
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
mrib_nexthop_addr and rpf_addr should be 'struct prefix'
so that we can safely handle unnumbered data from a nexthop
lookup in zebra
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
List the RP information we have configured.
Fix bug where we were not properly initializing some code
Ticket: CM-12617
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
I tried to be smart and skirt around rpf lookup if I knew
the incoming interface. This turns out to be not necessarily
a good thing because we can easily have asymetrical routing.
This fix removes the attempt to cache the ifp we received
the incoming packet on and just lets the lookup work like
it should.
Additionally it removes the weird hardcoding of the rpf
interface from the register stuff.
Ticket: CM-12530
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
reb
Allow the user to specify multiple rp commands.
'ip pim rp A.B.C.D' -> translates to 'ip pim rp A.G.C.D 224.0.0.0/24'
ip pim rp A.B.C.D A.B.C.D/M
First is the rp, second is the group with mask.
Groups and masks cannot be over each other except 224.0.0.0/24 which
is the fallback if used.
Ticket: CM-7860
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Refactor the qpim_rp into pim_rp.c so that the global data
is protected. This will allow us to easily add the group
ranges.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a register is received, forward the packet as appropriate.
This is the infrastructure to make this happen.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow pim to separate out the pim vif index from the ifindex.
This change will allow pim to work with up to 255(MAXVIFS)
interfaces, while also allowing the interface ifindex to
be whatever number it needs to be.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the kernel sends a NOCACHE message to
pim we were looking up the interface to
use for the incoming multicast packet
based upon the source. No need to do
that trust that the kernel has properly
identified it and use that.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we get a NOCACHE call from the kernel, there is
no point generating a mcast route for the packet if
there is no RP configured, or if PIM-SSM is configured
on the interface.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability for the node to determine if it is the RP or not.
Currently this only allows static RP's.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Receive a (*,G) route and send it upstream to the RP.
The RP at this time does not properly handle the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>