watchfrr is currently being started with $frr_global_options
This is problematic as that it has a entirely different cli
than the rest of the daemons and we have no plans to make
this equivalent.
Fixes: #18107
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Use consistent `e_somepath` names for expanded versions of `somepath`.
Also remove all paths from `config.h` and put them into
`lib/config_paths.h` - this is to make more obvious when someone is
doing something probably not quite properly structured.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When calling daemon_stop() with --quiet and e.g. the pidfile is empty,
it won't return early since while "$fail" is set, "$2" is "--quiet", so
the if condition isn't met and it will continue executing, resulting
in error messages in the log:
> Sep 14 14:48:33 localhost watchfrr[2085]: [YFT0P-5Q5YX] Forked background command [pid 2086]: /usr/lib/frr/watchfrr.sh restart all
> Sep 14 14:48:33 localhost frrinit.sh[2075]: /usr/lib/frr/frrcommon.sh: line 216: kill: `': not a pid or valid job spec
> Sep 14 14:48:33 localhost frrinit.sh[2075]: /usr/lib/frr/frrcommon.sh: line 216: kill: `': not a pid or valid job spec
> Sep 14 14:48:33 localhost frrinit.sh[2075]: /usr/lib/frr/frrcommon.sh: line 216: kill: `': not a pid or valid job spec
Fix this by moving the --quiet check into the block to log_failure_msg(),
and also add the check to all other invocations of log_*_msg() to make
--quiet properly suppress output.
Fixes: 19a99d89f0 ("tools: suppress unuseful warnings during restarting frr")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Before it was setting SDIR, which is /usr/lib/frr, but the vtysh binary is put
under bindir (which is /usr/local by default). And running `/usr/lib/frr/frr reload`
failed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Features added in this commit:
1. Bringup/shutdown new management daemon 'mgmtd' along with FRR.
2. Support for Startup, Candidate and Running DBs.
3. Lock/Unlock DS feature using pthread lock.
4. Load config from a JSON file onto candidate DS.
5. Save config to a JSON file from running/candidate DS.
6. Dump candidate or running DS contents on the terminal or a file in
JSON/XML format.
7. Maintaining commit history (Full rollback support to be added in
future commits).
8. Addition of debug commands.
Co-authored-by: Yash Ranjan <ranjany@vmware.com>
Co-authored-by: Abhinay Ramesh <rabhinay@vmware.com>
Co-authored-by: Ujwal P <ujwalp@vmware.com>
Signed-off-by: Pushpasis Sarkar <pushpasis@gmail.com>
This reverts commit 2000ac4075.
There were concerns that ensuring zebra stopped last led to
problems with zebra's "-r" flag, so we'll revert that for the
time being and reconsider this area.
Signed-off-by: Mark Stapp <mjs@labn.net>
There might be use cases when this would make sense, for example
running FRR in a container as a designated user.
Signed-off-by: Michal Ruprich <mruprich@redhat.com>
The backslash in `grep -q '^declare \-a'` is not needed and
causes `grep: warning: stray \ before -` warning in grep-3.8.
Signed-off-by: Marius Tomaschewski <mt@suse.com>
touch + chown can have a gap between the commands (or the second failed).
This could lead to unexpected permissions (root, instead of frr) for some
.conf files or directories.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The problem is that when we run watchfrr.sh/frrinit.sh, we get something like:
```
cat: '"/var/run/frr/staticd.pid"': No such file or directory
cat: '"/var/run/frr/babeld.pid"': No such file or directory
cat: '"/var/run/frr/zebra.pid"': No such file or directory
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2469a37f reversed the logic of the existence check for
/etc/frr/frr.conf breaking boot config loading, fix it.
Signed-off-by: Christian Hopps <chopps@labn.net>
When starting a daemon, print the full command run by the init script to
start it. This gives more information and is especially helpful when
debugging wrap commands.
Also add some more logs to vtysh_b to print the command used there,
log when we exit early because frr.conf doesn't exist, and simplify the
code path for creating the command to use.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Fix daemon shutdown broken by f0cccaa6bf. Now
we still don't complain about missing PID files but actually stop the
running daemons.
The previous fix was broken because it passed a new "--all" option to
daemon_stop which wasn't handled properly (it assumed $1 contains the
daemon name when at that time it acutally contained the "--all" option).
Plus, "--all" wasn't actually necessary, because we already passed
"--reallyall" from all_stop to daemon_stop after the daemon name.
So remove "--all" again and simply check for "--reallyall" in $2. This
should *really* fix#11317.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Since 1686b1d486, we try to stop all daemons,
even those which are not (no longer) enabled in /etc/frr/daemons. But we
shouldn't complain about missing PID files for daemons which have never been
started and just silently ignore those.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
watchfrr and staticd do not require <1024 ports to be running, thus they can
start, but others fail.
We should allow only users with EUID=0 (sudo or root) running frrinit.sh.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If a user enters:
/usr/lib/frr/frrinit.sh start
<modifies daemons file to remove a daemon from being used>
then calls:
/usr/lib/frr/frrinit.sh stop
The daemon(s) that are removed from the file are not stopped.
Apparently in shell scripting naming the variables the same
as the callee does not work as one would think. Just renaming
the variable to something else solved the problem.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem:
During restarting frr.service, it throws annoying warnings:
Cannot stop bgpd(and others): pid file not found.
Root Cause:
During restarting process, systemd uses "stop", and watchfrr
uses "restart".
Yes, watchfrr using "restart" is to avoid systemd failing to stop.
But it should be quiet.
Fix:
During restarting service, suppress these warnings from watchfrr.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The pathspace folder in /var/run needs the x permission for the group too
Otherwise vtysh fails when running it with groups frrvty and frr:
$ vtysh -N gateway
% Can't open configuration file /etc/frr/gateway/vtysh.conf due to 'Permission denied'.
vtysh_connect(/var/run/frr/gateway/zebra.vty): stat = Permission denied
Signed-off-by: Steffen Neubauer <s.neubauer@syseleven.de>
Currently pathd is missing from the deamon list in frrcommon
with this missing frrinit can't start pathd if it is added to
the deamon file. This commit adds it to the frrcommon deamon list
and updates the example deamon file.
Signed-off-by: Erik Kooistra <me@erikkooistra.nl>
As noted by Donald:
When FRR is starting all daemons (or restarting them all) FRR is reading
in the configuration 1 time for each daemon specified to run. This is
not a big deal if you have a very small configuration. But with large
configurations FRR is taking long enough that watchfrr is not
establishing connection to all the daemons and starting some over.
Modify the code so that vtysh is only read in at the end of a all
sequence. If we are restarting an individual daemon allow the read in of
the whole config.
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: David Lamparter <equinox@diac24.net>
In the case of some linux distros the /var/run dir is mounted
with tmpfs so in every reboot it's removed.
Then the frrcommon.sh will recreate it without 'x' perm
So no pid file cannot be created in /var/run/frr
Signed-off-by: Javier Garcia <rampxxxx@gmail.com>
Drop the `-n` (`--noerror`) flag from the `vtysh -b` invocation called by the
init script responsible for starting FRR. This ensures that errors in the
configuration file is propagated to the administrator, and prevents a node from
entering a production network while running an essentially undefined
configuration (a behaviour that I can personally attest to has the potential to
cause disastrous network outages - documented in more detail in Cumulus
Networks CS#12791).
Silently ignoring errors also leads to the rather odd behaviour that starting
FRR will ostensibly succeed, while reloading it immediately after - without
changing the configuration - will fail. This is due to the fact that the `-n`
flag is not used while reloading.
The use of the `-n` flag appears to have been introduced without any
explanation in commit 858aa29c68 by @donaldsharp.
Looking at the commit message, I suspect that it was not an intentional change.
It seems more likely to me that it was just meant to be used during testing and
development, but ended up being committed to master by accident.
Ticket:CM-28003
Signed-off-by: Tore Anderson <tore@fud.no>
This adds -N and --netns options to watchfrr, allowing it to start
daemons with -N and switching network namespaces respectively.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Original start/stop of FRR prior to David's rewrite in
PR 3507, when configuring multi-instance would
only start multi-instance (-1 -2 -3 -4...) or
just the daemon, not both. If you happened
to start a ospfd instance of 1 then both
the default and instance 1 would react to cli.
Do not allow this, put it back to original behavior
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This only applies for split-config; the init script would create an
empty config file with default permissions.
Reported-by: Robert Scheck <robert@fedoraproject.org>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Hopefully at some point we can get rid of the --enable-datacenter switch
and just have the init script do magic. Should already work for Cumulus
as it is.
NB: the profile name can't be baked into the package. The whole point
is to make the package profile-agnostic; in theory at some point the
exact same package files should work on both, say, a Cumulus switch and
a Linux software BGP DFZ router.
Signed-off-by: David Lamparter <equinox@diac24.net>
The "declare -p watchfrr_options" call is just to support backwards
compatibility. If it fails, silently ignore that.
Signed-off-by: David Lamparter <equinox@diac24.net>
TBH when I looked at watchfrr I didn't see any MI support and hence
assumed this just didn't work to begin with. However, it actually does
(transparently to watchfrr, by just using "ospfd-1" as daemon name.)
So, fix this up and make it work again.
(Also remove 2 extraneous \n in messages.)
Signed-off-by: David Lamparter <equinox@diac24.net>
There's no good reason to not have these options default to the
installation path of tools/watchfrr.sh. Doing so allows us to ditch
watchfrr_options from daemons/daemons.conf completely.
Fixes: #3652
Signed-off-by: David Lamparter <equinox@diac24.net>
If we try to monitor a nonexisting daemon in watchfrr, it will
(currently) forever wait at startup since the vty connection will never
come up. Just drop the daemon from the daemon list in such a case.
Signed-off-by: David Lamparter <equinox@diac24.net>
This separates the init script used for the system (and called in the
systemd unit file) from the script that watchfrr uses to control
daemons. Mixing these two caused the entire thing to become a rather
huge spaghetti mess.
Note that there is a behaviour change in that the new script always
starts zebra regardless of zebra_enable.
Side changes:
- Ubuntu 12.04 removed from backports since it doesn't work anyway
- zebra is always started regardless of zebra_enable. To disable FRR,
the entire init script should be disabled through policy.
- no-watchfrr operation is no longer supported by the scripts in the
Debian packages. (This is intentional.)
Signed-off-by: David Lamparter <equinox@diac24.net>