Having just completed a code audit during RCA, the fact that we have 2
different argument orders for the related datastore copying functions
was unnecessary and super confusing.
Fix this code-maintenance/comprehension mistake and move the newer mgmtd
copy routines to use the same arg order as the pre-existing underlying
northbound copy functions (i.e., use `copy(dst, src)`)
Signed-off-by: Christian Hopps <chopps@labn.net>
ci is picking up this use after free on occasion:
ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x6030001d94a0
0 0x7fab994b7f04 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:119
1 0x7fab994264f6 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:131
2 0x7fab994264f6 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1058
3 0x7fab99039bcf in mt_count_free lib/memory.c:78
4 0x7fab99039bcf in qfree lib/memory.c:130
5 0x7fab98ff971a in hash_clean lib/hash.c:290
6 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
7 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
8 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
9 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
10 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
11 0x7fab990ea7a1 in event_fetch lib/event.c:1740
12 0x7fab9901a24e in frr_run lib/libfrr.c:1245
13 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
14 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
15 0x7fab98af9304 in __libc_start_main_impl ../csu/libc-start.c:360
16 0x56110cd8dd30 in _start (/usr/lib/frr/mgmtd+0x3ad30)
0x6030001d94a0 is located 0 bytes inside of 24-byte region [0x6030001d94a0,0x6030001d94b8)
freed by thread T0 here:
0 0x7fab994b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
1 0x7fab99039bf0 in qfree lib/memory.c:131
2 0x7fab98ff93e1 in hash_release lib/hash.c:227
3 0x56110cdaabdc in mgmt_txn_unlock mgmtd/mgmt_txn.c:1931
4 0x56110cdab049 in mgmt_txn_delete mgmtd/mgmt_txn.c:1841
5 0x56110cdab0ce in mgmt_txn_hash_free mgmtd/mgmt_txn.c:1864
6 0x7fab98ff970b in hash_clean lib/hash.c:288
7 0x56110cdb0e7f in mgmt_txn_hash_destroy mgmtd/mgmt_txn.c:1881
8 0x56110cdb0e7f in mgmt_txn_destroy mgmtd/mgmt_txn.c:2013
9 0x56110cd8e5de in mgmt_terminate mgmtd/mgmt.c:91
10 0x56110cd8e003 in sigint mgmtd/mgmt_main.c:90
11 0x7fab990bf4b0 in frr_sigevent_process lib/sigevent.c:117
12 0x7fab990ea7a1 in event_fetch lib/event.c:1740
13 0x7fab9901a24e in frr_run lib/libfrr.c:1245
14 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
15 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
0 0x7fab994b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7fab990392fd in qcalloc lib/memory.c:106
2 0x7fab98ff8b4f in hash_get lib/hash.c:156
3 0x56110cdb13ae in mgmt_txn_create_new mgmtd/mgmt_txn.c:1825
4 0x56110cdb3b4d in mgmt_txn_notify_be_adapter_conn mgmtd/mgmt_txn.c:2212
5 0x56110cd91178 in mgmt_be_adapter_conn_init mgmtd/mgmt_be_adapter.c:842
6 0x7fab990ec6de in event_call lib/event.c:2019
7 0x7fab9901a243 in frr_run lib/libfrr.c:1246
8 0x56110cd8e21f in main mgmtd/mgmt_main.c:290
9 0x7fab98af9249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
The only time that mgmt_txn_hash_free is called is in hash_clean.
There are other places that mgmt_txn_unlock/delete are called and
hash_release should be called. Let's just notice when mgmtd is
being called from the hash_clean and not call hash_release (since
we know it is being released already)
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
- Fix memleak on multiple errstr returns (multiple clients) Allow the
- multiple clients to all return results and merge them (as with other
operations like get tree).
Signed-off-by: Christian Hopps <chopps@labn.net>
This operation basically implements support for RESTCONF operations. It
receives an xpath and a data tree in JSON/XML format, instead of a list
of (xpath, value) tuples as required by the current protobuf interface.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When FRR starts, after mgmtd is initialized, backend clients connect to
it and request their config. To supply the config, mgmtd creates a
configuration transaction. At the same time, `vtysh -b` tries to read
the startup config and configure mgmtd, which also creates a
configuration transaction. If these two actions happen at the exact same
time, there's a conflict between them, because only a single
configuration translaction is allowed. Because of that, vtysh fails and
the config is completely ignored.
When starting the config reading, vtysh locks candidate and running
datastores in mgmtd. This commit adds locking of running datastore when
initializing the backend client. It allows to retry locking on the vtysh
side and read the config only when the lock is aquired instead of
failing.
This change also prevents running datastore from being changed during
initialization of backend clients. This could lead to a desynchronized
state between mgmtd and backends.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When no data requests were sent to the backends, return immediately,
instead of waiting for a timeout. This can happen if backends providing
the requested data are not connected to mgmtd.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There are places, where we can receive an existing commit transaction.
If we don't check that the request already exists, it gets overwritten
and we start having problems with transaction refcounters. Forbid having
multiple configuration sessions simultaneously.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
If the transaction is not cleaned up immediately, it can be still
referenced by some threds. If it's a commit thread and it's executed
before the actual cleanup, mgmtd crashes because of the missing
commit_cfg_req.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The value should not be cleared after sending it to the first client,
otherwise other clients receive an empty string.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Like in RESTCONF GET request and NETCONF get-data request, make it
possible to request state-only, config-only, or all data.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
next_phase is always curr_phase + 1. It's not necessary to maintain it
separately. Also rename curr_phase to phase.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When determining the interested backend clients for a configuration
change, don't consider disconnected clients. This fixes a crash in
`mgmt_txn_send_be_txn_create` when trying to send data to a non-existing
adapter.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Replace operation removes the current data node configuration and sets
the provided value. As current northbound code works only with one
xpath at a time, the operation only makes sense to clear the config of
a container without deleting it itself. However, the next step is to
allow passing JSON-encoded complex values to northbound operations which
will make replace operation much more useful.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, there's a single operation type which doesn't return error
if the object doesn't exists. To be compatible with NETCONF/RESTCONF,
we should support differentiate between DELETE (fails when object
doesn't exist) and REMOVE (doesn't fail if the object doesn't exist).
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, there's no difference between CREATE and MODIFY operations.
To be compatible with NETCONF/RESTCONF, add new CREATE_EXCL operation
that throws an error if the configuration data already exists.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, nb_operation enum means two different things - edit operation
type (frontend part), and callback type (backend part). These types
overlap, but they are not identical. We need to add more operation
types to support NETCONF/RESTCONF integration, so it's better to have
separate enums to identify different entities.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add configure.ac tests for libyang functions, if not present supply the
functionality ourselves in yang.[ch]
Signed-off-by: Christian Hopps <chopps@labn.net>
Allow user to specify full YANG compatible XPath 1.0 predicates. This
allows for trimming results of generic queries using functions and other
non-key predicates from XPath 1.0
Signed-off-by: Christian Hopps <chopps@labn.net>
Adds the guts of the get-tree functionality that is called by or calls
the FE and BE code for get-tree processing.
Signed-off-by: Christian Hopps <chopps@labn.net>
The candidate yang tree should be validated before `nb_config_diff` is
called. `nb_config_diff` ignores all prohibited operations and can
provide an empty change list because of this. For example, if a user
deletes a mandatory node from the candidate datastore and tries to make
a commit, they'll receive the "No changes found to be committed!" error,
because such a change is ignored by `nb_config_diff`. Instead, mgmtd
should tell the user that their candidate datastore is not valid and
can't be commited.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
At this stage, we should send a SET_CFG reply, not COMMIT_CFG reply.
Commit request is not yet initialized.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>