Before it was setting SDIR, which is /usr/lib/frr, but the vtysh binary is put
under bindir (which is /usr/local by default). And running `/usr/lib/frr/frr reload`
failed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Features added in this commit:
1. Bringup/shutdown new management daemon 'mgmtd' along with FRR.
2. Support for Startup, Candidate and Running DBs.
3. Lock/Unlock DS feature using pthread lock.
4. Load config from a JSON file onto candidate DS.
5. Save config to a JSON file from running/candidate DS.
6. Dump candidate or running DS contents on the terminal or a file in
JSON/XML format.
7. Maintaining commit history (Full rollback support to be added in
future commits).
8. Addition of debug commands.
Co-authored-by: Yash Ranjan <ranjany@vmware.com>
Co-authored-by: Abhinay Ramesh <rabhinay@vmware.com>
Co-authored-by: Ujwal P <ujwalp@vmware.com>
Signed-off-by: Pushpasis Sarkar <pushpasis@gmail.com>
Check for value present in list before removing
as in certain python3 ValueError traceback is observed.
Traceback (most recent call last):
File "/usr/lib/frr/frr-reload.py",
line 2278, in <module>
(lines_to_add, lines_to_del, restart_frr)
= compare_context_objects(newconf, running)
File "/usr/lib/frr/frr-reload.py",
line 1933, in compare_context_objects
lines_to_add, lines_to_del
File "/usr/lib/frr/frr-reload.py",
line 1549, in ignore_delete_re_add_lines
lines_to_del.remove((ctx_keys, line))
ValueError: list.remove(x): x not in list
Ticket:#3389979
Issue:3389979
Testing Done:
With fix perform frr-relaod on frr.conf config where earlier
traceback was seen.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Using // style comments for the SPDX license identifier was kind of an
intentional choice to make it stand out as "directive-like" comment (and
also to constrain it to the one line.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The files converted in this commit either had some random misspelling or
formatting weirdness that made them escape automated replacement, or
have a particularly "weird" licensing setup (e.g. dual-licensed.)
This also marks a bunch of "public domain" files as SPDX License "NONE".
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Got `ERROR: Daemon babeld is not a valid option for 'show running-config'` when using `frr-reload.py --reload --daemon babeld`.
Adds `babeld` and `nhrpd` as valid daemons.
Signed-off-by: Yuxiang Zhu <vfreex@gmail.com>
agentx can't be disabled once enabled, so we should ignore it for frr-reload.py.
```
$ /usr/lib/frr/frr-reload.py --reload /etc/frr/bgpd.conf --bindir /usr/local/bin
"no agentx" we failed to remove this command
SNMP AgentX support cannot be disabled once enabled
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This reverts commit 2000ac4075.
There were concerns that ensuring zebra stopped last led to
problems with zebra's "-r" flag, so we'll revert that for the
time being and reconsider this area.
Signed-off-by: Mark Stapp <mjs@labn.net>
There might be use cases when this would make sense, for example
running FRR in a container as a designated user.
Signed-off-by: Michal Ruprich <mruprich@redhat.com>
The backslash in `grep -q '^declare \-a'` is not needed and
causes `grep: warning: stray \ before -` warning in grep-3.8.
Signed-off-by: Marius Tomaschewski <mt@suse.com>
PIMv6 Support Bundle commands are added in support_bundle_commands.conf file.
This will help in debugging PIMv6 test Failures.
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
It will be easier to maintain a single file instead of two separate.
Also, fixes the issue when the file (/var/log/frr/frr.log) is not created
after logrotate.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
touch + chown can have a gap between the commands (or the second failed).
This could lead to unexpected permissions (root, instead of frr) for some
.conf files or directories.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we add/modify community/large/ext lists without sequence numbers, and
doing frr-reload.py, then rules with sequence numbers (show running-config
always adds sequence numbers) will be deleted and new ones will be re-added.
This could lead to blackholing for some time.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
It already "looks" like a bitmask, but we currently can't flag a command
both YANG and HIDDEN at the same time. It really should be a bitmask.
Also clarify DEPRECATED behaviour (or the absence thereof.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The current service file configures restarts on-abnormal, which translates to "unclean signal", "timeout", or "watchdog". This patch updates it to always restart, as there's never really a time watchfrr should exit by itself at all.
Signed-off-by: Brian Rak <brak@vultr.com>
check_function_arguments_recurse() has received a new function argument
in GCC 12. Fill it in and add a compatibility wrapper.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The problem is that when we run watchfrr.sh/frrinit.sh, we get something like:
```
cat: '"/var/run/frr/staticd.pid"': No such file or directory
cat: '"/var/run/frr/babeld.pid"': No such file or directory
cat: '"/var/run/frr/zebra.pid"': No such file or directory
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2469a37f reversed the logic of the existence check for
/etc/frr/frr.conf breaking boot config loading, fix it.
Signed-off-by: Christian Hopps <chopps@labn.net>
When starting a daemon, print the full command run by the init script to
start it. This gives more information and is especially helpful when
debugging wrap commands.
Also add some more logs to vtysh_b to print the command used there,
log when we exit early because frr.conf doesn't exist, and simplify the
code path for creating the command to use.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
The explanation block for watchfrr_options was split into two blocks,
one explaining the --netns option and one making a vague statement that
the init script provides the list of daemons to start. The former can be
merged with the latter and the latter is more useful when stated as a
caveat for what you should actually use watchfrr_options for.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
We support 'wrap' variables in /etc/frr/daemons, but the explanation
given there doesn't touch on some of the subtleties of using these
variables.
The variables were designed for use with Valgrind, which has special
behavior when run with programs that daemonize; Valgrind will intercept
the fork()'d child process and run itself instead of the child. This
behavior allows it to follow the same forking semantics as the target
program.
For virtually every other wrapper, the wrap variables do not work as
demonstrated because the wrapper programs do not daemonize. If the
wrappers do not daemonize, they will block the init script. The examples
given with "perf" for example simply do not work, because perf remains
in the foreground even as it tracks forked children.
This patch adds an explanation of the behavior expected by the init
script and offers a solution for getting that behavior.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Fix daemon shutdown broken by f0cccaa6bf. Now
we still don't complain about missing PID files but actually stop the
running daemons.
The previous fix was broken because it passed a new "--all" option to
daemon_stop which wasn't handled properly (it assumed $1 contains the
daemon name when at that time it acutally contained the "--all" option).
Plus, "--all" wasn't actually necessary, because we already passed
"--reallyall" from all_stop to daemon_stop after the daemon name.
So remove "--all" again and simply check for "--reallyall" in $2. This
should *really* fix#11317.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Since 1686b1d486, we try to stop all daemons,
even those which are not (no longer) enabled in /etc/frr/daemons. But we
shouldn't complain about missing PID files for daemons which have never been
started and just silently ignore those.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
When a bgp neighbor removed from associated to peer-group,
the neighbor is fully deleted, subsequent deletion of any
configuration related to the neighbor leads to failure
in frr-reload.
Fix: In frr-reload lines to delete check if any neighbor with
peer-group removal line is present, if so then remove any
further config deletion associated the neighbor needs to removed
from the lines to delete.
Ticket:#3032234
Reviewed By:
Testing Done:
BEFORE FIX:
-----------
2022-04-08 20:03:32,734 INFO: Executed "router bgp 4200000005 no neighbor swp5 interface peer-group UNDERLAY"
2022-04-08 20:03:32,892 INFO: Failed to execute router bgp 4200000005 no neighbor swp5 password SSSS
2022-04-08 20:03:33,050 INFO: Failed to execute router bgp 4200000005 no neighbor swp5 password
2022-04-08 20:03:33,218 INFO: Failed to execute router bgp 4200000005 no neighbor swp5
2022-04-08 20:03:33,354 INFO: Failed to execute router bgp 4200000005 no neighbor
2022-04-08 20:03:33,520 INFO: Failed to execute router bgp 4200000005 no
2022-04-08 20:03:33,521 ERROR: "router bgp 4200000005 -- no" we failed to remove this command
2022-04-08 20:03:33,521 ERROR: % Specify remote-as or peer-group commands first
2022-04-08 20:03:33,691 INFO: Failed to execute router bgp 4200000005 no neighbor swp5 advertisement-interval 0
2022-04-08 20:03:33,853 INFO: Failed to execute router bgp 4200000005 no neighbor swp5 advertisement-interval
2022-04-08 20:03:34,015 INFO: Failed to execute router bgp 4200000005 no neighbor swp5
2022-04-08 20:03:34,145 INFO: Failed to execute router bgp 4200000005 no neighbor
2022-04-08 20:03:34,326 INFO: Failed to execute router bgp 4200000005 no
2022-04-08 20:03:34,327 ERROR: "router bgp 4200000005 -- no" we failed to remove this command
2022-04-08 20:03:34,327 ERROR: % Specify remote-as or peer-group commands first
AFTER FIX:
----------
delete of numbered neighbor:
2022-04-08 19:52:17,204 INFO: Executed "router bgp 4200000005 no
neighbor 1.2.3.4 peer-group UNDERLAY"
2022-04-08 19:52:17,205 INFO: /var/run/frr/reload-GRFX1M.txt content
delete of unnumbered neighbor:
2022-04-08 20:00:02,952 INFO: Executed "router bgp 4200000005 no
neighbor swp5 interface peer-group UNDERLAY"
2022-04-08 20:00:02,953 INFO: /var/run/frr/reload-722C3P.txt content
Signed-off-by: Chirag Shah <chirag@nvidia.com>
BGPd does not allow default instance deletion
in presence of bgp vrf instance;
frr-reload script fails if delete list contains
default instance followed by vrf instance.
Fix:
frr-reload scans lines_to_delete to look for
'router bgp' and 'router bgp vrf ...' line.
If both are present switch the order to delete
bgp vrf instance(s) than default instance at the end.
Testing Done:
Before:
INFO: Loading Config object from file /etc/frr/frr.conf
INFO: Loading Config object from vtysh show running
INFO: Failed to execute no router bgp 40201 <-- Failed to delete
INFO: Failed to execute no router bgp
INFO: Failed to execute no router
ERROR: "no router" we failed to remove this command
ERROR: % Cannot delete default BGP instance. Dependent VRF instances exist
INFO: Executed "no router bgp 40201 vrf bgp-test" <-- vrf instance deleted
INFO: Loading Config object from vtysh show running
After:
order of deletion switched
INFO: Loading Config object from file /etc/frr/frr.conf
INFO: Loading Config object from vtysh show running
INFO: Executed "no router bgp 40201 vrf bgp-test"
INFO: Executed "no router bgp 40201"
INFO: Loading Config object from vtysh show running
Signed-off-by: Chirag Shah <chirag@nvidia.com>
watchfrr and staticd do not require <1024 ports to be running, thus they can
start, but others fail.
We should allow only users with EUID=0 (sudo or root) running frrinit.sh.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
At the moment we set /var/log/frr permissions to 0750 (frr:frr), but the log
file is 0640 (root:adm) (unless logrotated) and that doesn't allow adm group
to even open the directory.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If a user enters:
/usr/lib/frr/frrinit.sh start
<modifies daemons file to remove a daemon from being used>
then calls:
/usr/lib/frr/frrinit.sh stop
The daemon(s) that are removed from the file are not stopped.
Apparently in shell scripting naming the variables the same
as the callee does not work as one would think. Just renaming
the variable to something else solved the problem.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
dc3bae68a2 added strings command, which is wrong.
It requires additional package to be installed on the system (binutils).
Let's just get use `tr`.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
There are singline-line commands inside `router bgp` that start with
`vnc ` or `bmp `. Those commands are currently treated as node-entering
commands. We need to specify such commands more precisely.
Fixes#10548.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
I think we don't care about this in release notes.
bgpd,pimd,isisd,nhrpd: Convert to vty_json() (origin/fix/vty_json)
ospf6d: Fix memory leak for `show ipv6 ospf6 zebra json` (origin/fix/zebra_ospf6d_json_leak)
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Problem:
During restarting frr.service, it throws annoying warnings:
Cannot stop bgpd(and others): pid file not found.
Root Cause:
During restarting process, systemd uses "stop", and watchfrr
uses "restart".
Yes, watchfrr using "restart" is to avoid systemd failing to stop.
But it should be quiet.
Fix:
During restarting service, suppress these warnings from watchfrr.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
checkpatch.pl has a hardcoded list of printf extensions supported... by
the Linux kernel. This happens to have covered the ones we have in FRR
so far, but `%pPA` isn't on the list and others may not be either.
Since we have the frr-format GCC plugin (and CI runs that on Debian 11)
we don't really need these checks in checkpatch.pl.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
On lower CPU with lots of static routes, it will cost more than 2
minutes.
2 minutes is the default timeout value, we can adjust it by configure:
./configure --with-service-timeout=<digit>
Signed-off-by: anlan_cs <anlan_cs@tom.com>
subprocess.check_call needs to be called with shell=True, else it will
interpret the string as a single path to execute instead of a command
with arguments:
Fixes the following error:
$ /usr/lib/frr/generate_support_bundle.py
Making backup of /var/log/frr/bgp_support_bundle.log
Traceback (most recent call last):
File "/usr/lib/frr/generate_support_bundle.py", line 89, in <module>
main()
File "/usr/lib/frr/generate_support_bundle.py", line 80, in main
stdout=open_with_backup(ofn),
File "/usr/lib/frr/generate_support_bundle.py", line 32, in open_with_backup
subprocess.check_call("mv {0} {0}.prev".format(path))
File "/usr/lib/python3.8/subprocess.py", line 359, in check_call
retcode = call(*popenargs, **kwargs)
File "/usr/lib/python3.8/subprocess.py", line 340, in call
with Popen(*popenargs, **kwargs) as p:
File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mv /var/log/frr/bgp_support_bundle.log /var/log/frr/bgp_support_bundle.log.prev'
Fixes: 5417cc2de ("tests: collect support bundle data in parallel, fix bugs")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
frr-reload triggers restart of service in case
it fails to parse new config file and conjunction with
running config contains 'router bgp' (default bgp instnace).
When frr-reload fails to parse new config file, it fails
to build newconfig context (empty object).
Instead of bailing out it compares against the running config
context. If the running config contains default bgp instance
it thinks new config is removing default bgp instance so it
triggers frr restart.
Fix is to to bail out reload script when it fails to parse
config file.
Ticket:#2861989
Reviewed By: MR-83
Testing Done:
router bgp 102 vrf RED
bgp router-id 2.2.2.2
neighbor underlay peer-group
neighbor underlay remote-as <---- Partial config
Before fix:
2021-12-02 02:43:16,987 ERROR: vtysh failed to process new
configuration: vtysh (mark file) exited with status 4:
b'line 79: % Command incomplete: neighbor underlay remote-as\n\n'
2021-12-02 02:43:17,145 INFO: Loading Config object from vtysh show
running
2021-12-02 02:43:17,362 INFO: "frr version 7.5+cl5.0.0u0" cannot be
removed
2021-12-02 02:43:17,362 INFO: "frr defaults datacenter" cannot be
removed
2021-12-02 02:43:17,362 INFO: "service integrated-vtysh-config" cannot
be removed
2021-12-02 02:43:17,363 INFO: "line vty" cannot be removed
2021-12-02 02:43:17,522 INFO: EVPN is enabled and default instance del
needed
2021-12-02 02:43:17,522 INFO: Restarting FRR <---- Restart frr
After fix:
Just throw Error and abort the script.
root@TORS1:mgmt:/home/cumulus# /usr/lib/frr/frr-reload.py --debug
--reload --stdout /etc/frr/frr.conf
2021-12-02 04:00:56,519 INFO: Called via "Namespace(bindir='/usr/bin',
confdir='/etc/frr', daemon='', debug=True, filename='/etc/frr/$
rr.conf', input=None, overwrite=False, pathspace=None, reload=True,
rundir='/var/run/frr', stdout=True, test=False, vty_socket=None)"
2021-12-02 04:00:56,520 INFO: Loading Config object from file
/etc/frr/frr.conf
2021-12-02 04:00:56,679 ERROR: vtysh failed to process new
configuration: vtysh (mark file) exited with status 4:
b'line 79: % Command incomplete: neighbor underlay remote-as\n\n'
root@TORS1:mgmt:/home/cumulus#
Signed-off-by: Chirag Shah <chirag@nvidia.com>
This utility script helps in generated formatted and consistent
change log including:
1- group logs per daemon
2- standarize daemon names (lowercase, end with d)
3- capitalize all log lines
4- no merge commits
caveat: comments are assumed to be in the form
daemon-name : message
Sample Output:
```
sharpd
Follow the practice on cli design for json output
Install route supports nexthop-seg6 (step3)
Install_routes_helper support zapi_route flags (step1)
snapcraft
Add missing dependency
Add pathd to frr snap daemons
Change base to ubuntu 18.04 and libyang 2.0.7
staticd
Convert typedef to enum
Fix distance processing
Fix late initialization of blackhole type
Output config using nb callbacks instead of operational data
```
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
Remove neighbor <> remote-as <> config line,
if the neighbor is part of the peer-group and
peer-group contains remote-as config.
Neighbors which are part of the peer-group
cannot override remote-as.
Fix:
Frr-reload needs to remote 'neighbor <> remote-as <>'
from lines_to_add if its already part of peer-group
and peer-group has remote-as config.
Testing Done:
Before:
Config snippet:
neighbor PEERS peer-group
neighbor PEERS remote-as external
neighbor PEERS timers 3 9
neighbor 10.2.1.1 remote-as external
neighbor 10.2.1.1 peer-group PEERS
neighbor 10.2.1.1 timers 3 9
neighbor 10.2.1.2 remote-as external
neighbor 10.2.1.2 peer-group PEERS
Frr-reload failure:
line 179: Failure to communicate[13] to bgpd, line: neighbor 10.2.1.1
remote-as external
% Peer-group member cannot override remote-as of peer-group
line 179: Failure to communicate[13] to bgpd, line: neighbor 10.2.1.2
remote-as external
% Peer-group member cannot override remote-as of peer-group
After:
frr-reload apply the config successfully.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
These aren't appropriate for use in FRR. Among other things, this
enables running checkpatch by calling it in a git working tree with
`tools/checkpatch.pl -g origin/master..`
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
For the purpose of allowing the space in `frr_each (`, copy the list of
iterators from .clang-format and wire it up appropriately.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Calling get_contexts() can't display as expected, it wrongly displays:
<__main__.Context object at 0x7fdee1d5ad50>
So make it display correct data by add __str__ in Context class.
Signed-off-by: anlan_cs <anlan_cs@tom.com>
This allows defining a CLI command like this:
`[no] some setting ![VALUE]`
with VALUE being optional for the "no" form, but required for the
positive form. It's just a `[...]` where the empty branch can only be
taken for commands starting with `no`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Currently, in frr-reload we:
- store a list of single-line context keywords which needs to be
frequently updated,
- have a separate "if" clause for every node and subnode we have in FRR.
Instead, we can store the tree of all known FRR nodes. This tree needs
to be updated whenever we add a new node, which is not frequent. And,
most importantly, it allows us to write node-agnostic code and save more
than 250 LOC.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
As discussed in the weekly meeting today, this is what we're trying to
work with for the time being.
(Date calculator included as a bonus goodie ;)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The pathspace folder in /var/run needs the x permission for the group too
Otherwise vtysh fails when running it with groups frrvty and frr:
$ vtysh -N gateway
% Can't open configuration file /etc/frr/gateway/vtysh.conf due to 'Permission denied'.
vtysh_connect(/var/run/frr/gateway/zebra.vty): stat = Permission denied
Signed-off-by: Steffen Neubauer <s.neubauer@syseleven.de>
- Remove incorrect requirement for `service integrated-vtysh-config`
when producing a delta.
- Add `--test-reset` option which suppresses non-parseable lines from the
produced delta
- Use new features in common_config.py
Signed-off-by: Christian Hopps <chopps@labn.net>
Speedup (large topo): OLD: ~6 minutes NEW: ~1 second.
(when paired with common_config.py changes)
- Collect each "proc" support in parallel
- For each "proc" only call vtysh once with all commands
Bug fixes:
- output was broken, a dump of python "repr" format of str.
Signed-off-by: Christian Hopps <chopps@labn.net>
When using frr-reload.py to modify a bgp neighbors route-map
the code was doing this:
a) deleting the previous route-map: `no neighbor XX route-map YY (in|out)`
b) Adding the new route-map back in `neighbor XX route-may ZZ (in|out)`
Now imagine that we have an outgoing route-map that we are changing
and the reload is large because of a large number of lines in frr.conf
Item (a) will happen. BGP will immediately start sending all local
routes. At some point in time in the future (b) will be applied.
This of course causes a withdraw but for a short amount of time we
are leaking unintended routes. This is bad for several reasons
not 1) route churn upstream, 2) we might influence traffic to go the
wrong way. 3) if upstream has a maximum-prefix command the routes
being sent might trip its circuitry and shutdown the peer entirely
not even allowing you to get to (b).
Ticket: #2589685
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem reported that frr-reload.py didn't handle the mac access-list
command correctly, causing reloads to fail. This fix adds the
support for the command as a single line context.
Signed-off-by: Don Slice <dslice@nvidia.com>