IGMP packets received from a source that does not match the subnet
of any configured addresses on the receive interface should be
ignored.
Also, find and use the correct IGMP socket object for the received
IGMP packet.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
Issue: when (*,G) has some receiver and directly connected source sends traffic,
new (S,G) entry created is not inheriting the oil from (*,g)
RCA: pim_mroute_msg_nocache haven't assume FHR have (*,g) member ports
Fix : Added inherit oil from parent from (*,g) receivers to get added
Signed-off-by: Saravanan K <saravanank@vmware.com>
Issue: When any interface is getting added/deleted in the outgoing
interface list, it calls pim_mroute_add() which is updating the
mroute_creation time without checking if the mroute is already
installed in the kernel.
Fix: Check if mroute is already installed, then dont refresh the
mroute_creation timer.
Signed-off-by: Sarita Patra <saritap@vmware.com>
There exists the possibility that a RP exists as a anycast
pair for a lan segment. As such one side may receive
the register and properly handle the registration mechanics.
The one that does not receive the register packets will still
get S,G state and WRVIFWHOLE upcalls across the lan. In
this case notice that we have not received the Registration
packets and prevent nexthop lookups.
Ticket: CM-27466
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the WRVIFWHOLE callback is made with a iifp of the pimreg
device we *know* that the packet is a PIM Register packet
( see net/ipv4/ipmr.c for kernel behavior ). As such
we know that we will shortly read the pim register packet
and handle it through those mechanics. There is nothing
to do here so we can move along.
Ticket: CM-27729
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A local membership is created on the vxlan termination device ipmr-lo. This
is done to -
1. Pull multicast vxlan tunnel traffic to the VTEP for termination by
triggering JoinDesired on the BUM multicast group.
2. Include the OIF in the mroute to signal to the dataplane component
that flow needs to be vxlan terminated.
Earlier we were overloading the PIM_UPSTREAM_FLAG_MASK_SRC_IGMP for
this local membership creation but that is creating confusion both in
the state machine and in the show outputs. To avoid that we use the
more apparent PIM_UPSTREAM_FLAG_MASK_SRC_VXLAN_TERM. With this change -
1. We get LHR functionality for VXLAN_TERM mroutes
2. OIF is populated with PIM_OIF_FLAG_PROTO_PIM only
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Kernel might not hand us a bad packet, but better safe than sorry here.
Validate the IP header length field. Also adds an additional check that
the packet length is sufficient for an IGMP packet, and a check that we
actually have enough for an ip header at all.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
SPT switchover handling is done by adding pimreg in the OIL of the (*, G)
entry on the LHR. This causes multicast data with that destination to be
sent to pimd as IGMPMSG_WHOLEPKT. These packets trigger creation of (S,G)
and also register forwarding. However register forwarding must only be done
if the router is also a FHR. That FHR check was missing causing strange
source registrations from multicast routers that were not directly
connected to the source.
Relevant logs from LHR -
PIM: pim_mroute_msg: pim kernel upcall WHOLEPKT type=3 ip_p=0 from fd=9 for (S,G)=(6.0.0.30,239.1.1.111) on pimreg vifi=0 size=98
PIM: Sending (6.0.0.30,239.1.1.111) Register Packet to 81.0.0.5
PIM: pim_register_send: Sending (6.0.0.30,239.1.1.111) Register Packet to 81.0.0.5 on swp2
And 6.0.0.30 is clearly not directly connected on that router -
root@tor-11:~# ip route |grep 6.0.0.30 -A2
6.0.0.30 proto ospf metric 20
nexthop via 6.0.0.22 dev swp1 weight 1 onlink
nexthop via 6.0.0.23 dev swp2 weight 1 onlink
root@tor-11:~#
Ticket: CM-24549
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is needed for two reasons -
1. The inherited OIL needs to be setup independent of the RPF interface
to allow correct computation of the JoinDesired macro.
2. The RPF interface is computed at the time of MFC programming so
it is not possible to permanently evict the OIF at that time oif_add
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This commit includes the following changes -
1. kat needs to be included when evaluting join desired on a (S,G)
entry.
2. there were cases where we were adding OIF based on joindesired
being true for unrelated reasons (on other OIFs). cleaned up those
cases.
3. make all calls to pim_upstream_switch conditional on the JoinDesired
macro.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If a source is being forwarded along the RPT it uses the parent (*,G)'s
IIF. When the parent's IIF changes all the children need to be updated
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
mfcc_parent for an (S, G) entry was being updated on any upstream RPF
change. With the change to use RPT for (S,G) in some cases we can no
longer do that. Instead the upstream entry's RPF neigbor is managed
separately form the channel_oil's mfcc_parent i.e. via NHT. And the
mfcc_parent is evaluated at the time of mroute programming.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
An (S,G) mroute can be created as a result of rpt prune. However that
entry needs to stay on the parent (*,G)'s tree (IIF) till a decision is
made to switch the source to the SPT.
The decision to stay on the RPT is made based on the SPTbit setting
according to - RFC7761, Section 4.2 “Data Packet Forwarding Rules”
However those rules are hard to achieve when hw acceleration i.e.
control and data planes are separate. So instead of relying on data
we make the decision of using SPT if we have decided to join the SPT -
Use_RPT(S,G) {
if (Joined(S,G) == TRUE // we have decided to join the SPT
OR Directly_Connected(S) == TRUE // source is directly connected
OR I_am_RP(G) == TRUE) // RP
//use_spt
return FALSE;
//use_rpt
return TRUE;
}
To make that change some re-org was needed -
1. pim static mroutes and dynamic (upstream mroutes) top level APIs
have been separated. This is to limit the state machine to dynamic
mroutes.
2. c_oil->oil.mfcc_parent is re-evaluated based on if we decided
to use the SPT or stay on the RPT.
3. upstream mroute re-eval is done when any of the criteria involved
in Use_RPT changes.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. This avoids the needs to re-run "muting" decisions.
2. Avoids the need to restore's pim OIL after fixup and send to kernel
(this is getting harder to manage).
In the future we need to also move the PIM maintained channel OIL from
an array of MAXVIFs to a simple DLL. This will be a significant
optimization in memory usage and preformance (OIL reads, copies etc).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When you turn on `debug igmp trace` we are seeing a bunch
of debugs associated with pim processing. This is because we were
using PIM_DEBUG_TRACE which is both `debug igmp trace` and `debug pim trace`
when tracing igmp code it would be nice to only see igmp work.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There are several places in the pim where we are mixing up
zlog_warn w/ zlog_debug and vice versa. If we are protecting
a zlog_warn w/ a debug is it really a warn? If we have an actual
error situation we should also warn about it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify the code to create an upstream reference whenever the code
creates an channel_oil via the pim_mroute.c code. This code also
starts a keep alive timer to clean up the reference if we do
nothing with it after the normal time.
I've left alone the source->channel_oil creation because these
are kept and tracked independently already.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify code base so that pim_upstream *always* creates a channel_oil
and as such we do not need to create it later or play other games.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a function that allows you to modify the channel oil's incoming
interface and to appropriately install/remove it from the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If we create a channel_oil ensure that all paths that
we can go down will create one. Future commits
can remove the (up->channel_oil) tests.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There is no need to check for ALLOC function failures
in the code base. If we cannot get more memory we
assert.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
While this is impossible, make the compilers a bit happier
for those of us having to use something old.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com
1. special handling of term device in orig mroutes -
The multicast-vxlan termination device ipmr-lo is added to the (*, G)
mroute -
(0.0.0.0, 239.1.1.100) Iif: uplink-1 Oifs: uplink-1 ipmr-lo
This means that it will be inherited into all the SG entries including the
origination mroute. However we cannot terminate the traffic we originate
so some special handling is needed to exclude the termination device
in the origination entries -
27.0.0.7, 239.1.1.100) Iif: lo Oifs: uplink-1
2. special handling of term device on the MLAG pair -
Both MLAG switches pull down BUM-MDT traffic but only one (the DF) can
terminate the traffic. The non-DF must not exclude the termination device
from the MFC to prevent dups to the overlay.
DF -
root@TORC11:~# ip mr |grep "(0.0.0.0, 239.1.1.100)"
(0.0.0.0, 239.1.1.100) Iif: uplink-1 Oifs: uplink-1 ipmr-lo State: resolved
root@TORC11:~#
non-DF -
root@TORC12:~# ip mr |grep "(0.0.0.0, 239.1.1.100)"
(0.0.0.0, 239.1.1.100) Iif: uplink-1 Oifs: uplink-1 State: resolved
root@TORC12:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In a MLAG setup both of the VTEPs can rx and reg-encapsulate BUM traffic
toward the RP. To prevent these duplicates we need a mechanism to disable
register encaps on specific mroutes.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The current reverse logic led to this code construct
if (!pim_nexthop_lookup(...)) {
//Do something successfull
}
This is backwards and will cause logic errors when people
use this code. Fix to use true/false for success/failure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Issue: (*,G) mroute uptime is not updated in mroute table,
after deleting and adding the RP
Root cause: When RP gets deleted or becomes not reachable, then
we un-install the entry from the kernel, but still maintains the
entry in the stack. When RP gets re-configured or becomes reachable,
"show ip mroute" shows the uptime, when the channel_oil gets created.
Fix: Introduce a new time mroute_creation in the channel_oil, gets
updated when mroute gets installed in the kernel.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When we are displaying S,G string data we already auto
display the string as (S,G) no need to have ((S,G)).
Cleanup some that were found during log look through.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When FRR receives IGMP/PIM (*, G) join and RP is not configured or not
reachable, then we are creating a dummy upstream with incoming interface
as NULL and upstream address as INADDR_ANY.
Added upstream address and incoming interface validation where it is necessary,
before doing any operation on the upstream.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Create a `struct pim_router` and move the thread master into it.
Future commits will further move global varaibles into the pim_router
structure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When sending a sockoption for MRT_INIT, *bsd requires that
the data passed in must be 1. While linux does not, the
code was sending in a positive value that was causing issues
on *bsd of protocol not supported.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When trying to run PIM on *bsd, the kernel expects to only
allow the pim kernel socket to work if we elevate priviledges.
So do so.
This commit gets us further in the startup of PIM on *bsd
but is not sufficient to get it fully started yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Suppose we have a bridge with a host and two routers attached
to it.
r1 r2
| |
--------
|
host
host is sending traffic.
r1 and r2 are pim neighbors and r2 is the DR.
Both r1 and r2 will receive data from the stream up the pim
kernel socket. r1 will notice that it is not the DR and
stop processing in pim. This code adds a bit more code to blackhole
the route when r1 detects it is not the DR in this scenario.
This is being done because the kernel is both keeping state and
sending data to the pim process to continue processing this.
Additionally if we happen to be running this on a asic, then
blackholing the route in the asic can save a significant amount
of cpu time handling this situation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The changes introduced in PR #1044 caused pim to notice
when a setsockopt call failed. The kicker here is that
this used to just work because we ignored the issue
pre. So VRF's need to BINDTODEVICE to get igmp callbacks
but the default vrf does not need to do so.
With the fix we now see IGMP group join:
root@dell-s6000-02 ~/frr# vtysh -c "show ip igmp group"
Interface Address Group Mode Timer Srcs V Uptime
br1 20.0.11.1 232.2.3.4 EXCL 00:04:14 1 3 00:00:05
Fixes: #1121
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) Error check return from setsockopt and sockets
2) Check return codes for str2prefix
3) Clean up some potential NULL References
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we receive a SGRPT Prune we were switching the upstream
to JOINED and immediately sending a join. This was not
the right thing to do.
This was happening because we were making decisions about the
new ifchannel before it was fully formed.
Rework ifchannel startup to provide enough information to
the pim upstream data structure to make the right decisions
Ticket: CM-16425
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Bind the pim kernel fd to the appropriate vrf, modify
the callback up into pim with the IGMP report to
retrieve the incoming interface and use that to
lookup the correct interface to use.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the upstream_list, hash and wheel into 'struct pim_instance'
Remove all pimg to pim in pim_upstream
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
As a transitory mechanism, if c_oil->pim is set, use that particular
pim instance, else use the default pimg.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are handling the thread read/writes for
a pim mroute socket, make it so that it can
be appropriately handled by the 'struct pim_instance'
instead of defaulting to the default VRF's
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change allows other non-linux platforms to be a bit
more forgiving if we ask for a very very large size.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Convert the socket fd to be owned by the pimg pointer as
well as the counters associated with the fd. This will
allow us to future proof our code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we add a thread pointer to thread_add_XXX functions
when the specified function is called, thread.c is setting
the thread pointer to NULL. This was causing pim to
liberally pull it's zassert grenade pin's.
Additionally clean up code to not set the NULL pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The way thread.c is written, a caller who wishes to be able to cancel a
thread or avoid scheduling it twice must keep a reference to the thread.
Typically this is done with a long lived pointer whose value is checked
for null in order to know if the thread is currently scheduled. The
check-and-schedule idiom is so common that several wrapper macros in
thread.h existed solely to provide it.
This patch removes those macros and adds a new parameter to all
thread_add_* functions which is a pointer to the struct thread * to
store the result of a scheduling call. If the value passed is non-null,
the thread will only be scheduled if the value is null. This helps with
consistency.
A Coccinelle spatch has been used to transform code of the form:
if (t == NULL)
t = thread_add_* (...)
to the form
thread_add_* (..., &t)
The THREAD_ON macros have also been transformed to the underlying
thread.c calls.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
In this patch, PIM nexthop tracking uses locally populated nexthop cached list
to determine ECMP based nexthop (w/ ECMP knob enabled), otherwise picks
the first nexthop as RPF.
Introduced '[no] ip pim ecmp' command to enable/disable PIM ECMP knob.
By default, PIM ECMP is disabled.
Intorudced '[no] ip pim ecmp rebalance' command to provide existing mcache
entry to switch new path based on hash chosen path.
Introduced, show command to display pim registered addresses and respective nexthops.
Introuduce, show command to find nexthop and out interface for (S,G) or (RP,G).
Re-Register an address with nexthop when Interface UP event received,
to ensure the PIM nexthop cache is updated (being PIM enabled).
During PIM neighbor UP, traverse all RPs and Upstreams nexthop and determine, if
any of nexthop's IPv4 address changes/resolves due to neigbor UP event.
Testing Done: Run various LHR, RP and FHR related cases to resolve RPF using
nexthop cache with ECMP knob disabled, performed interface/PIM neighbor flap events.
Executed pim-smoke with knob disabled.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When we have a *,G mroute that starts receiving any particular
S,G, we will get wholepkt callbacks due to the pimreg in the
OIL for the *,G.
So we need to do SPT switchover, but this can fail if we
do not have a path to the S( but we do to the RP!).
In this case fail gracefully.
Ticket: CM-15621
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This allows SPT switchover for S,G upon receipt of packets
on the LHR.
1) When we create a *,G from a IGMP Group Report, install
the *,G route with the pimreg device on the OIL.
2) When a packet hits the LHR that matches the *,G, we will
get a WHOLEPKT callback from the kernel and if we cannot
find the S,G, that means we have matched it on the LHR via
the *,G mroute. Create the S,G start the KAT and run
inherited_olist.
3) When the S,G times out, safely remove the S,G via
the KAT expiry
4) When the *,G is removed, remove any S,G associated
with it via the LHR flag.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There exists a common pattern in pim where we were setting
a variable to a value in the error case when we would no
longer need it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Interface type has been replaced with the SSM range config. And SSM
groups can now co-exists with ASM groups. I have left the pim ssm
per-interface cli control hidden. It now enables pim-sm with a warning.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-15344
Testing Done: pim-smoke
SSM groups (232/8 or user configured SSM range) can exist in the same
multicast network as ASM groups. For such groups all RPT related state
machine operations have to be skipped as defined by section 4.8 of
RFC4601 -
1. Source registration is skipped for SSM groups. For SSM groups mroute
is setup on the FHR when a new multicast flow is rxed; however source
registration (i.e. pimreg join) is skipped. This will let the ASIC black
hole the traffic till a valid OIL is added to the mroute.
2. (*,G) IGMP registrations are ignored for SSM groups.
Sample output:
=============
fhr# sh ip pim group-type
SSM group range : 232.0.0.0/8
fhr# sh ip pim group-type 232.1.1.1
Group type: SSM
fhr# sh ip pim group-type 239.1.1.1
Group type: ASM
fhr#
Sample config:
=============
fhr(config)# ip pim ssm prefix-list ssm-ranges
fhr(config)#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-15344
Testing Done:
1. SSM/ASM source-registration/igmp-joins.
2. On the fly multicast group type changes.
3. pim-smoke.
When an interface bounces and we receive a packet before
pim has a chance to fully bring the 'struct pim_usptream'
back up correctly, first check to see if we already have
an associated data structure before creating it again.
This removes a case where both the c_oil and up ref counts
were being incremented and never removed properly.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add pim Nexthop tracking feature 1st part where, specific RP or Source address (unicast address)
register with Zebra. Once nexthop update received from Zebra for a given address, scan RP or upstream
entries impacted by the change in nexthop.
Reviewed By: CCR-5761, Donald Sharp <sharpd@cumulusnetworks.com>
Testing Done: Tested with multiple RPs and multiple *,G entries at LHR.
Add new Nexthop or remove one of the link towards RP and verify RP and upstream nexthop update.
similar test done at RP with multiple S,G entries to reach source.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
With the separation of register-state and upstream-join-state we no
longer need an enumeration that covers both states. This commit includes
the following -
1. Defined new enumeration for reg state (this 1:1 with RFC4601).
2. Dropped JOIN_PENDING enum value from upstream join state. RFC4601
only define two values NOT_JOINED and JOINED for this state.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-14700
Testing Done: Verified register setup manually and ran pim-smoke