Add support for storing lookup modes as a sorted list.
The list is non-unique and sorts modes with both prefix lists < modes with one list < the global mode (no lists).
This way, when finding the right mode, a lookup is matched against a prefix-list-scoped mode before the global mode.
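A minimal sketch of that ordering, assuming a mode entry carries optional group/source prefix-list names (the struct shape and names here are hypothetical):

    #include <stddef.h>

    /* Hypothetical shape of a lookup-mode entry; the real fields in
     * pim_nht.c may differ. */
    struct pim_lookup_mode {
            char *grp_plist; /* group prefix-list name, NULL if unset */
            char *src_plist; /* source prefix-list name, NULL if unset */
    };

    /* Entries with more prefix lists sort first, so a list-scoped
     * mode is matched before the global (no-list) mode. */
    static int pim_lookup_mode_cmp(const struct pim_lookup_mode *a,
                                   const struct pim_lookup_mode *b)
    {
            int na = (a->grp_plist != NULL) + (a->src_plist != NULL);
            int nb = (b->grp_plist != NULL) + (b->src_plist != NULL);

            return nb - na; /* both lists < one list < global */
    }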
Also pass the group address into all lookups (both the NHT cache and synchronous lookups). Many call sites do not have a group address and use PIMADDR_ANY when no valid group is needed.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Refactor the next hop tracking in PIM to fully support the configured RPF lookup mode.
Moved many NHT-related functions to pim_nht.h/c.
NHT now tracks both the MRIB and URIB tables and makes nexthop decisions based on the configured lookup mode.
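Roughly, the decision looks like this (a sketch only; the enum values are modeled on zebra's multicast lookup modes and may not match pim_nht.h exactly):

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative only; see pim_nht.h/c for the real definitions. */
    enum pim_rpf_lookup_mode {
            MCAST_MIX_MRIB_FIRST, /* try MRIB first, fall back to URIB */
            MCAST_URIB_ONLY,
            MCAST_MRIB_ONLY,
            MCAST_MIX_DISTANCE,   /* lower admin distance wins */
            MCAST_MIX_PFXLEN,     /* longer prefix length wins */
    };

    /* Decide whether the MRIB result should be used, given what each
     * table returned for the tracked address. */
    static bool pim_nht_use_mrib(enum pim_rpf_lookup_mode mode,
                                 bool have_mrib, bool have_urib,
                                 uint8_t mrib_dist, uint8_t urib_dist,
                                 uint16_t mrib_len, uint16_t urib_len)
    {
            switch (mode) {
            case MCAST_MRIB_ONLY:
                    return true;
            case MCAST_URIB_ONLY:
                    return false;
            case MCAST_MIX_MRIB_FIRST:
                    return have_mrib;
            case MCAST_MIX_DISTANCE:
                    return have_mrib &&
                           (!have_urib || mrib_dist <= urib_dist);
            case MCAST_MIX_PFXLEN:
                    return have_mrib &&
                           (!have_urib || mrib_len >= urib_len);
            }
            return have_mrib;
    }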
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Add the `rpf-lookup-mode MODE` vty command under the router pim block, including northbound plumbing and config write. Actually using the mode is still pending.
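For example, a configuration might look like this (the mode keyword shown is illustrative and may not match the final syntax):

    router pim
     rpf-lookup-mode mrib-then-urib
    exit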
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Signal termination to functions so they can avoid accessing pointers
that may no longer be available.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Implement embedded RP support and configuration commands.
Embedded RP is disabled by default and can be globally enabled with the
command `embedded-rp` in the PIMv6 configuration node.
It supports the following options:
- Embedded RP maximum limit
- Embedded RP group filtering
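A configuration sketch, assuming syntax along these lines (the limit value and prefix-list name are made up):

    router pim6
     embedded-rp
     embedded-rp limit 10
     embedded-rp group-list EMBEDDED-RP-GROUPS
    exit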
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Perform AutoRP discovery and candidate RP announcements using the
AutoRP protocol.
The mapping agent is not yet implemented, but it is not required for
FRR to support AutoRP, since only one AutoRP mapping agent is needed
in the network.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
INTERFACE_NAMSIZ is just a redefinition of IFNAMSIZ, and IFNAMSIZ
is the standard for interface name length on all platforms
that FRR currently compiles on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the impression that
this event system is a pthread when it is not.
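For illustration, a typical call site before and after (pim_sock_read, ifp, and fd stand in for any handler and its arguments; the thread_add_read to event_add_read function rename is part of the same renaming effort):

    /* before */
    struct thread *t_read;
    thread_add_read(master, pim_sock_read, ifp, fd, &t_read);

    /* after */
    struct event *t_read;
    event_add_read(master, pim_sock_read, ifp, fd, &t_read);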
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Rename igmp_group_count in struct pim_instance
to gm_group_count, which will be used for both IGMP and MLD.
Signed-off-by: Sai Gomathi N <nsaigomathi@vmware.com>
The problem here is that when the same node is both FHR and RP,
the node keeps sending register packets and no register-stop
is ever sent.
This happens because the RP is the same node and no socket is
created on the loopback interface, so the packet is never sent
out and never received back on the same node; the register
receive path never runs, and hence no register-stop is sent.
Since register packets are unicast packets, it is better to
handle sending them via a separate register socket.
This fixes the problem described above.
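A minimal sketch of opening such a socket (error handling trimmed; the actual code lives in pimd's register send path):

    #include <sys/socket.h>
    #include <netinet/in.h>

    #ifndef IPPROTO_PIM
    #define IPPROTO_PIM 103
    #endif

    /* A dedicated raw socket used only for sending unicast register
     * packets towards the RP. A register whose destination is the
     * local node (FHR == RP) is then delivered locally and the
     * register receive path can run. */
    static int pim_reg_sock_open(void)
    {
            return socket(AF_INET, SOCK_RAW, IPPROTO_PIM);
    }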
Fixes: #11331
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Rename igmp_watermark_limit in struct pim_instance
to gm_watermark_limit, which will be used for both IGMP and MLD.
Signed-off-by: Sai Gomathi N <nsaigomathi@vmware.com>
Zebra can be set up to use a value that is less than MULTIPATH_NUM.
When pimd connects to zebra, zebra informs pim of the MULTIPATH_NUM
in use. Let's use that value when figuring out our multipath value.
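Roughly, the hookup looks like this (a sketch around the zclient capabilities callback; `router` is pimd's global instance and `multipath` is assumed to be the field this value lands in):

    #include "zclient.h" /* libfrr zebra client API */

    /* zebra announces its capabilities when the session comes up;
     * adopt its ECMP limit instead of assuming MULTIPATH_NUM. */
    static void pim_zebra_capabilities(struct zclient_capabilities *cap)
    {
            router->multipath = cap->ecmp;
    }

    /* hooked up at zclient init time:
     *   zclient->zebra_capabilities = pim_zebra_capabilities;
     */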
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pimd's include files are very interdependent. Let's chop that down a
bit to gain some flexibility.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
VRF creation can happen either from the cli or from
knowledge about the vrf learned from zebra.
In the case where we learn about the vrf from
the cli, the vrf id is UNKNOWN. Upon actual
creation of the vrf, lib/vrf.c touches up the vrf_id
and calls pim_vrf_enable to turn it on properly.
At this point in time we have a pim->vrf_id of
UNKNOWN and a vrf->vrf_id with the right value.
There is no point in duplicating this data, so
remove pim->vrf_id entirely and use vrf->vrf_id instead,
since we keep a copy of the pim->vrf pointer.
This removes some crashes where we expected
pim->vrf_id to be usable and it was not.
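The access-pattern change, in short (vrf_id_t is libfrr's vrf id type):

    /* before: a stale duplicated id could disagree with the vrf */
    vrf_id_t id = pim->vrf_id;

    /* after: always read through the kept vrf pointer */
    vrf_id_t id = pim->vrf->vrf_id;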
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This CLI allows the user to configure an IGMP group limit that
generates a watermark warning when reached.
Though a watermark may not make sense without setting a hard limit,
this implementation serves as a base for implementing such a limit in
the future and helps track a particular scale today.
Testing:
=======
ip igmp watermark-warn <10-60000>
On reaching the configured number of groups, pim will issue a warning:
2019/09/18 18:30:55 PIM: SCALE ALERT: igmp group count reached watermark limit: 210 (vrf: default)
Also added the group count and configured watermark limit to `show ip igmp groups [json]`:
<snip>
Sw3# sh ip igmp groups json
{
  "Total Groups":221,      <=====
  "Watermark limit":210,   <=========
  "ens224":{
    "name":"ens224",
    "state":"up",
    "address":"40.0.0.1",
    "index":6,
    "flagMulticast":true,
    "flagBroadcast":true,
    "lanDelayEnabled":true,
    "groups":[
      {
        "source":"40.0.0.1",
        "group":"225.1.1.122",
        "timer":"00:03:56",
        "sourcesCount":1,
        "version":2,
        "uptime":"00:00:24"
</snip>
<snip>
Sw3(config)# do sh ip igmp group
Total IGMP groups: 221
Watermark warn limit (Set): 210
Interface  Address   Group        Mode  Timer     Srcs  V  Uptime
ens224     40.0.0.1  225.1.1.122  ----  00:04:06  1     2  00:13:22
ens224     40.0.0.1  225.1.1.144  ----  00:04:02  1     2  00:13:22
ens224     40.0.0.1  225.1.1.57   ----  00:04:01  1     2  00:13:22
ens224     40.0.0.1  225.1.1.210  ----  00:04:06  1     2  00:13:22
</snip>
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: When more than 32 (MAXVIFS) interfaces are configured, the interfaces
configured after the 32nd get a vif index value of MAXVIFS.
This value is used as an index into the free-vif tracker array of size 32 (MAXVIFS).
So the channel oil list pointer, which is the next field in the pim structure, gets corrupted when the free vif is updated.
This corrupted pointer is then accessed during the rpf update, resulting in a crash.
Fix: Refrain from allocating the mcast interface structure and return a config error when more than MAXVIFS interfaces are configured.
The max-vif check is skipped for the vrf device and pimreg: the vrf device is the first interface and is not expected to fail, and pimreg has a reserved vif.
VxLAN tunnel termination device creation already has this check and throws a warning at max vif.
All other creations go through the CLI.
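A sketch of the added guard (names here are hypothetical; the real check sits in the interface creation path):

    #define MAXVIFS 32

    /* Reject new multicast interfaces once all VIFs are in use,
     * instead of indexing past the free-vif tracker array. The vrf
     * device and pimreg bypass the check: the vrf device is always
     * the first interface and pimreg has a reserved vif. */
    static int pim_vif_index_check(int next_vif_index, int is_vrf_device,
                                   int is_pimreg)
    {
            if (is_vrf_device || is_pimreg)
                    return 0;
            if (next_vif_index >= MAXVIFS)
                    return -1; /* caller returns a config error */
            return 0;
    }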
Signed-off-by: Saravanan K <saravanank@vmware.com>
When pim receives a register packet, we apply the
received source to the prefix list. If accepted, normal
processing continues. If denied, we send a register-stop
message to the source.
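A sketch of the check using the libfrr prefix-list API (the helper name here is made up):

    #include <stdbool.h>
    #include "plist.h" /* libfrr prefix-list API */

    /* Apply the configured prefix list to the register's source;
     * a deny result triggers a register-stop back to the source. */
    static bool pim_register_source_permitted(const char *plist_name,
                                              const struct prefix *src)
    {
            struct prefix_list *plist =
                    prefix_list_lookup(AFI_IP, plist_name);

            if (!plist)
                    return true; /* no list configured: accept */

            return prefix_list_apply(plist, src) == PREFIX_PERMIT;
    }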
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1. Upstream entries associated with tunnel termination mroutes are
synced to the MLAG peer via the local MLAG daemon.
2. These entries are installed in the peer switch (via an upstream
ref flag).
3. DF (Designated Forwarder) election is run per-upstream entry by both
the MLAG switches -
a. The switch with the lowest RPF cost is the DF winner
b. If both switches have the same RPF cost the MLAG role is
used as a tie breaker with the MLAG primary becoming the DF
winner.
4. The DF winner terminates the multicast traffic by adding the tunnel
termination device to the OIL. The non-DF suppresses the termination
device from the OIL.
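A minimal sketch of the per-upstream election in step 3 (names hypothetical):

    #include <stdbool.h>
    #include <stdint.h>

    /* Lowest RPF cost wins; on a tie the MLAG primary becomes DF. */
    static bool pim_mlag_local_is_df(uint32_t local_cost,
                                     uint32_t peer_cost,
                                     bool local_is_mlag_primary)
    {
            if (local_cost != peer_cost)
                    return local_cost < peer_cost;

            return local_is_mlag_primary;
    }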
Note: Before the PIM-MLAG interface was available, hidden config was
used to test the EVPN-PIM functionality with MLAG. I have removed the
code that persisted that config to avoid confusion; the hidden commands
are still available.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Convert the upstream_list and hash to an RB tree. Significant
time was being spent in listnode_add_sort; this change reduces
that time greatly.
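Illustratively, using the BSD-style RB macros from lib/openbsd-tree.h (member and type names here are made up):

    #include "openbsd-tree.h"

    struct pim_upstream {
            RB_ENTRY(pim_upstream) upstream_rb; /* embedded linkage */
            /* ... existing fields ... */
    };

    static int pim_upstream_compare(const struct pim_upstream *a,
                                    const struct pim_upstream *b);

    RB_HEAD(pim_upstream_rb, pim_upstream);
    RB_PROTOTYPE(pim_upstream_rb, pim_upstream, upstream_rb,
                 pim_upstream_compare);

    /* RB_INSERT() is O(log n) versus the O(n) walk that
     * listnode_add_sort() performed on every insert. */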
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The channel_oil_list and hash are taking significant
cpu at scale when adding to the sorted list. Replace
them with an RB tree, following the same pattern as the
upstream conversion above.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Whenever an FRR client wants to send data to another node
using the MLAG channel, it uses the mechanism below (see the
sketch after this list).
1. The client sends an MLAG registration to zebra with the messages
it is interested in receiving from the peer.
2. In response to this request, zebra opens a communication channel with
MLAG. In the Rx direction, zebra forwards only those messages the
client showed interest in during registration.
3. When the client is no longer interested in communicating with MLAG,
it posts a de-register to zebra.
4. If this is the last client interested in MLAG communication,
zebra closes the channel.
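Purely as an illustration of steps 1-4, with all names hypothetical (the real plumbing lives in zebra and lib/mlag.h):

    #include <stdint.h>

    static void mlag_channel_open(void)  { /* open zebra<->MLAG channel */ }
    static void mlag_channel_close(void) { /* close zebra<->MLAG channel */ }

    static unsigned int mlag_clients; /* registered clients */
    static uint32_t mlag_rx_mask;     /* union of client interests */

    /* Steps 1+2: a client registers the messages it wants from the
     * peer; the first registration opens the channel, and zebra
     * forwards only Rx messages matching the registered mask. */
    void zebra_mlag_client_register(uint32_t interest)
    {
            if (mlag_clients++ == 0)
                    mlag_channel_open();
            mlag_rx_mask |= interest;
    }

    /* Steps 3+4: when the last interested client de-registers,
     * zebra closes the channel. */
    void zebra_mlag_client_deregister(void)
    {
            if (mlag_clients > 0 && --mlag_clients == 0)
                    mlag_channel_close();
    }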
Why PIM needs MLAG communication
================================
1. In general, on LAN networks the elected DR sends the join towards
the multicast RP in the LHR case, and the register in the FHR case.
2. If the DR goes down, traffic re-converges only after
the new DR is elected, which can take time depending on the hold
timer used to detect the DR down.
3. This can be optimized by using the MLAG mechanism.
4. Traffic can also be forwarded more efficiently by knowing the cost
towards the RP via MLAG.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
Apart from the data structures, this adds BSM scope initialization and
deinitialization routines called during pim instance init and deinit,
plus makefile changes.
Signed-off-by: Saravanan K <saravanank@vmware.com>
These entries will be used over the subsequent commits for:
1. vxlan-tunnel-termination handling - set up the MDT to receive
VxLAN-encapsulated BUM traffic.
2. vxlan-tunnel-origination handling - register the local-vtep-ip as a
multicast source.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The pimg data structure is only used in one spot to send the default
vrf id to zebra upon startup. Add the default vrf id to the struct pim_router
data structure and remove the pimg pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Create a `struct pim_router` and move the thread master into it.
Future commits will further move global variables into the pim_router
structure.
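The rough shape (illustrative only; the real definition lives in pimd and grows in later commits):

    struct thread_master; /* libfrr event loop */

    /* global, non-per-vrf state lives here from now on */
    struct pim_router {
            struct thread_master *master; /* formerly a bare global */
    };

    extern struct pim_router *router;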
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
These commands were being accepted in all vrfs and
affected every vrf's behavior globally, since they were
backed by global variables.
Modify the code to make these two commands work
on a per-vrf basis.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We know which vrf we are in when we need to initiate a
rescan of the rpf cache, so pass it in and use that information.
This should help rescans at scale with several vrfs by cutting
out a lot of unnecessary work.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This feature adds the ability to store the non-prefix-list
static RP entries in a table, and then to look up G in that
table to find the RP, using the longest prefix match across
both prefix-lists and static RPs.
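A sketch of such a table using libfrr's route_table, which provides longest-prefix matching (the helper names are made up):

    #include "prefix.h"
    #include "table.h" /* libfrr routing table */

    struct rp_info; /* pim's static RP entry */

    static struct route_table *rp_table;

    void pim_rp_table_init(void)
    {
            rp_table = route_table_init();
    }

    /* Find the RP for group G via longest prefix match across the
     * stored entries. */
    struct rp_info *pim_rp_find(const struct prefix *group)
    {
            struct route_node *rn = route_node_match(rp_table, group);

            if (!rn)
                    return NULL;

            route_unlock_node(rn); /* match returns a locked node */
            return rn->info;
    }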
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>