The nl batch buffer was destroyed too early when a netns was
terminated. Now freeing the buffer later in router_terminate allows
netlink messages to be still processed.
Signed-off-by: Christopher Dziomba <christopher.dziomba@telekom.de>
If you have undefined behavior compilation checking gcc
starts to complain about a bunch of places that do not
have return's. When most of them actually do and we
have the assert's to prove it. I'm just doing this
to make the compiler happy for me, so I can continue
to do work.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Today zebra tracks EVPN vtep_ips in the L3-VNI of the route_entry.
Leaking routes in bgpd and using different RMACs on the remote side
for different L3-VNIs for a single VTEP leads to updates in the leak
destination VRF which is not desired. Zebra gets the source VRF in
vrf_id of the nexthops. This nexthop vrf_id is now used for adding/
updating/deleting the l3vni nexthop.
Signed-off-by: Christopher Dziomba <christopher.dziomba@telekom.de>
Problem:
Zebra crashed while going down. This happened because zebra was
trying to process a new client accept request after closing ZAPI's
listener socket zsock and setting it to -1.
Fix:
Skip rescheduling zserv_accept() and accepting a new client if global
ZAPI listener socket FD, zsock is already closed and set to -1.
Also use global ZAPI listener socket FD zsock in zserv_accept() instead
of using a copy of zsock.
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
Introduce the End.B6.Encaps in the rt_netlink code.
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When displaying `show nexthop-group rib` the afi
displayed for Nexthop Groups ( not singletons ) is
`bad-value` which while true to the specific of
the AFI it's not necessarily what we want to display
to the end operator. Add a wrapper function for
nhg's to do the right thing.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
zebra rmac has a nh_list which tracks the assigned VTEP IPs to RMACs.
It can also receive IPv6 encoded IPv4 addresses as VTEPs. Changing/
Installing the RMAC into the Kernel is only important when the IPv4
address changes. However because nh_list is a nodup list used to
track usage or RMACs by VTEP IPs, both IP addresses (IPv4 and IPv6
encoded IPv4) should be written into it, as both could be removed
in l3vni_rmac_nh_list_nh_delete independently.
Signed-off-by: Christopher Dziomba <christopher.dziomba@telekom.de>
The rib_sweep_route function when not doing graceful
restart does not attempt to save the event on the
t_rib_sweep pointer for shutdown. Prevent any
weird shenanigans by allowing shutdown to clean
up the rib_sweep_route event.
Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
Make code changes in fpm_read to create a list of ctx and send it to
zebra for processing rather than sending individual ctx
Signed-off-by: Krishnasamy <krishnasamyr@nvidia.com>
This tells hosts on the subnet if (and which) NAT64 prefix is in use.
Useful for things like xlat464, or local dns64.
Updated from the previous -03 draft implementation. (PLC field did not
exist before.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The fpm_listener currently has no ability to store the list
of prefixes that it has received. Modify the code to store
the prefixes in a typesafe RB Tree. Additionally modify
the code such that when a SIGUSR1 is received to dump
the routes out. If the operator specifies a -z <filename>
then write the routes to that file. It will overwrite
the last version of the file written.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
With bgpd no longer sending withdraws for EVPN prefix routes
zebra needs to track the path/ref count of prefixes for an EVPN
nexthop. When a route is updated the count is first increased
with the new paths and then decreased with the paths of the
existing/"same" route entry.
However the same evpn route methods are used for EVPN MH as well,
where bgpd already tracks the references. It is expected that an
ADD operation for the respective A-D routes is handled as an
upsert, a DEL operation should really remove the respective A-D
reference on a next-hop. For this the old behaviour (no path/ref
counting in zebra) is preserved.
Signed-off-by: Christopher Dziomba <christopher.dziomba@telekom.de>
Currently Zebra is just reading packets off the zapi
wire and stacking them up for processing in zebra
in the future. When there is significant churn
in the network the size of zebra can grow without
bounds due to the MetaQ sizing constraints. This
ends up showing by the number of nexthops in the
system. Reducing the number of packets serviced
to limit the metaQ size to the packets to process
allieviates this problem.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit undo 8c9b007a0c
stream lib has been modified to expand the stream if needed
Now for zapi route encode, we use expandable stream
Signed-off-by: Soumya Roy <souroy@nvidia.com>
Issue:
If static route is created with a BGP route as nexthop, which
recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode
fails, as there is not enough memory allocated for stream. This causes
assert/core dump in zebra. Right now we allocate fixed memory
of ZEBRA_MAX_PACKET_SIZ size.
Fix:
1)Dynamically calculate required memory size for the stream
2)try to optimize memory usage
Testing:
No crash happens anymore with the fix
zebra: zebra crash for zapi stream
Issue:
If static route is created with a BGP route as nexthop, which
recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode
fails, as there is not enough memory allocated for stream. This causes
assert/core dump in zebra. Right now we allocate fixed memory
of ZEBRA_MAX_PACKET_SIZ size.
Fix:
1)Dynamically calculate required memory size for the stream
2)try to optimize memory usage
Testing:
No crash happens anymore with the fix
r1#
r1# sharp install routes 2100:cafe:: nexthop 2001:db8::1 1000
r1#
r2# conf
r2(config)# ipv6 route 2503:feca::100/128 2100:cafe::1
r2(config)# exit
r2#
Signed-off-by: Soumya Roy <souroy@nvidia.com>
Now usage of `-r -f` with fpm_listener now causes all
routes to be rejected.
r1# sharp install routes 10.0.0.0 nexthop 192.168.44.5 5
r1# show ip route
Codes: K - kernel route, C - connected, L - local, S - static,
R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric, t - Table-Direct,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
IPv4 unicast VRF default:
D>o 10.0.0.0/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.1/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.2/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.3/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
D>o 10.0.0.4/32 [150/0] via 192.168.44.5, r1-eth0, weight 1, 00:00:02
C>* 192.168.44.0/24 is directly connected, r1-eth0, weight 1, 00:00:37
L>* 192.168.44.1/32 is directly connected, r1-eth0, weight 1, 00:00:37
r1#
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Trigger:
Imagine a route utilizing an NHG with six nexthops (Intf swp1-swp6).
If interfaces swp1-swp4 flaps, the NHG remains the same but now only
references two nexthops (swp5-6) instead of all six. This behavior
occurs due to how NHGs with recursive nexthops are managed within Zebra.
In the scenario below, NHG 370 has all six nexthops installed in the
kernel. However, Zebra maintains a list of recursive NHGs that NHG 370
references i.e., Depends: (371), (372), (373) which are not directly
installed in the kernel.
- When an interface comes up, its nexthop and corresponding dependents
are installed.
- These dependents (counterparts to 371-373) are non-recursive and
are installed as well.
- However, when attempting to install the recursive ones in
zebra_nhg_install_kernel(), they resolve to the already installed
counterparts, resulting in a NO-OP.
Fixing this by iterating all dependents of the recursively resolved
NHGs and reinstalling them.
Trigger: Flap swp1 to swp4
Before Fix:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 370 proto bgp metric 20
ip -d next show
id 337 via 2000:1:0:1:0:f:0:9 dev swp6 scope link proto zebra
id 339 via 2000:1:0:1:0:e:0:9 dev swp5 scope link proto zebra
id 341 via 2000:1:0:1:0:8:0:8 dev swp4 scope link proto zebra
id 343 via 2000:1:0:1:0:7:0:8 dev swp3 scope link proto zebra
id 346 via 2000:1:0:1:0:1:0:7 dev swp2 scope link proto zebra
id 348 via 2000:1:0:1::7 dev swp1 scope link proto zebra
id 370 group 346/348/341/343/337/339 scope global proto zebra
After Trigger:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 370 proto bgp metric 20
root@leaf-11:mgmt:/var/home/cumulus# ip -d next show
id 337 via 2000:1:0:1:0:f:0:9 dev swp6 scope link proto zebra
id 339 via 2000:1:0:1:0:e:0:9 dev swp5 scope link proto zebra
id 370 group 337/339 scope global proto zebra
After Fix:
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 432 proto bgp metric 20
ip -d next show
id 432 group 395/397/400/402/405/407 scope global proto zebra
After Trigger
root@leaf-11:mgmt:/var/home/cumulus# ip route show | grep 6.0.0.5
6.0.0.5 nhid 432 proto bgp metric 20
root@leaf-11:mgmt:/var/home/cumulus# ip -d next show
id 432 group 395/397/400/402/405/407 scope global proto zebra
Ticket :#
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>