matthieu/frr

Author	SHA1	Message	Date
Donald Sharp	aa736aa0aa	Revert "bgpd: Make keepalive pthread be connection based." This reverts commit `23bdaba147`.	2025-03-06 20:50:41 -05:00
Donald Sharp	23bdaba147	bgpd: Make keepalive pthread be connection based. Again instead of making the keepalives be peer based use the connection to make it happen. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-02-28 10:28:50 -05:00
Donald Sharp	2cd1d00dde	bgpd: Convert bgp_keepalive_send to use a connection The peer is going to eventually have a incoming and outgoing connection. Let's send the data based upon the connection not the peer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2025-02-28 10:28:50 -05:00
Donald Sharp	d1e7215da0	bgpd: make bgp_keepalives_on|off connection oriented The bgp_keepalives_on|off functions should use a peer_connection as a basis for it's operation. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	e16d030c65	*: Convert THREAD_XXX macros to EVENT_XXX macros Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-24 08:32:17 -04:00
Donald Sharp	de2754be3a	*: Convert thread_fetch and thread_call to event_fetch and event_call Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-24 08:32:17 -04:00
Donald Sharp	d8bc11a592	*: Add a hash_clean_and_free() function Add a hash_clean_and_free() function as well as convert the code to use it. This function also takes a double pointer to the hash to set it NULL. Also it cleanly does nothing if the pointer is NULL( as a bunch of code tested for ). Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-21 08:54:21 -04:00
David Lamparter	acddc0ed3c	*: auto-convert to SPDX License IDs Done with a combination of regex'ing and banging my head against a wall. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2023-02-09 14:09:11 +01:00
Russ White	6664d74505	Merge pull request #12641 from samanvithab/bgpd_crash bgpd: Fix crash during shutdown due to race condition	2023-01-17 09:40:05 -05:00
Samanvitha B Bhargav	8c9d306c8d	bgpd: Fix crash during shutdown due to race condition [New LWP 2524] [New LWP 2539] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `/opt/avi/bin/bgpd -f /run/frr/avi_ns3_bgpd.config -i /opt/avi/etc/avi_ns3_bgpd.'. Program terminated with signal SIGABRT, Aborted. [Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))] 0 0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6 [Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))] 0 0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6 1 0x00007f92acb17859 in abort () from /lib/x86_64-linux-gnu/libc.so.6 2 0x00007f92acb17729 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 3 0x00007f92acb28fd6 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6 4 0x00007f92accf2164 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 5 0x000055b46be1ef63 in bgp_keepalives_wake () at bgpd/bgp_keepalives.c:311 6 0x000055b46be1f111 in bgp_keepalives_stop (fpt=0x55b46cfacf20, result=<optimized out>) at bgpd/bgp_keepalives.c:323 7 0x00007f92acea9521 in frr_pthread_stop (fpt=0x55b46cfacf20, result=result@entry=0x0) at lib/frr_pthread.c:176 8 0x00007f92acea9586 in frr_pthread_stop_all () at lib/frr_pthread.c:188 9 0x000055b46bdde54a in bgp_pthreads_finish () at bgpd/bgpd.c:8150 10 0x000055b46bd696ca in bgp_exit (status=0) at bgpd/bgp_main.c:210 11 sigint () at bgpd/bgp_main.c:154 12 0x00007f92acecc1e9 in quagga_sigevent_process () at lib/sigevent.c:105 13 0x00007f92aced689a in thread_fetch (m=m@entry=0x55b46cf23540, fetch=fetch@entry=0x7fff95379238) at lib/thread.c:1487 14 0x00007f92aceb2681 in frr_run (master=0x55b46cf23540) at lib/libfrr.c:1010 15 0x000055b46bd676f4 in main (argc=11, argv=0x7fff953795a8) at bgpd/bgp_main.c:482 Root cause: This is due to race condition between main thread & keepalive thread during clean-up. This happens when the keepalive thread is processing a wake signal owning the mutex, when meanwhile the main thread tries to stop the keepalives thread. In main thread, the keepalive thread’s running bit (fpt->running) is set to false, without taking the mutex & then it blocks on mutex. Meanwhile, keepalive thread which owns the mutex sees that the running bit is false & executes bgp_keepalives_finish() which also frees up mutex. Main thread that is waiting on mutex with pthread_mutex_lock() will cause core while trying to access mutex. Fix: Take the lock in main thread while setting the fpt->running to false. Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>	2023-01-16 04:22:11 -08:00
Donald Sharp	19a713be1d	bgpd: Make bgp_keepalives.c not use MTYPE_TMP Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2022-12-05 12:17:00 -05:00
vivek	3ffec403e8	bgpd: Modify keepalive debug category Log keepalive timer expiry against 'debug bgp keepalive' instead of 'debug bgp neighbor-events'. Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>	2022-11-20 22:42:47 -05:00
Mark Stapp	85ba04f389	bgpd: release rcu lock in bgp keepalive pthread Don't hold the rcu lock in the bgp keepalive pthread: it blocks the rcu pthread and prevents log-file deletion. Signed-off-by: Mark Stapp <mstapp@nvidia.com>	2022-09-06 09:07:07 -04:00
Donald Sharp	cb1991af8c	*: frr_with_mutex change to follow our standard convert: frr_with_mutex(..) to: frr_with_mutex (..) To make all our code agree with what clang-format is going to produce Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2022-07-20 15:50:32 -04:00
Donatas Abraitis	6006b807b1	*: Properly use memset() when zeroing Wrong: memset(&a, 0, sizeof(struct ...)); Good: memset(&a, 0, sizeof(a)); Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-05-11 14:08:47 +03:00
anlan_cs	8e3aae66ce	*: remove the checking returned value for hash_get() Firstly, *keep no change* for `hash_get()` with NULL `alloc_func`. Only focus on cases with non-NULL `alloc_func` of `hash_get()`. Since `hash_get()` with non-NULL `alloc_func` parameter shall not fail, just ignore the returned value of it. The returned value must not be NULL. So in this case, remove the unnecessary checking NULL or not for the returned value and add `void` in front of it. Importantly, also keep no change for the two cases with non-NULL `alloc_func` - 1) Use `assert(<returned_data> == <searching_data>)` to ensure it is a created node, not a found node. Refer to `isis_vertex_queue_insert()` of isisd, there are many examples of this case in isid. 2) Use `<returned_data> != <searching_data>` to judge it is a found node, then free <searching_data>. Refer to `aspath_intern()` of bgpd, there are many examples of this case in bgpd. Here, <returned_data> is the returned value from `hash_get()`, and <searching_data> is the data, which is to be put into hash table. Signed-off-by: anlan_cs <vic.lan@pica8.com>	2022-05-03 00:41:48 +08:00
David Lamparter	2b64873d24	*: generously apply const const const const your boat, merrily down the stream... Signed-off-by: David Lamparter <equinox@diac24.net>	2019-12-02 15:01:29 +01:00
Quentin Young	bfc18a0205	bgpd: do not send keepalives when KA timer is 0 RFC4271 specifies behavior when the hold timer is sent to zero - we should not send keepalives or run a hold timer. But FRR, and other vendors, allow the keepalive timer to be set to zero with a nonzero hold timer. In this case we were sending keepalives constantly and maxing out a pthread to do so. Instead behave similarly to other vendors and do not send keepalives. Unsure what the utility of this is, but blasting keepalives is definitely the wrong thing to do. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2019-09-16 16:15:07 +00:00
David Lamparter	00dffa8cde	lib: add frr_with_mutex() block-wrapper frr_with_mutex(...) { ... } locks and automatically unlocks the listed mutex(es) when the block is exited. This adds a bit of safety against forgetting the unlock in error paths & co. and makes the code a slight bit more readable. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2019-09-03 17:15:17 +02:00
Quentin Young	d8b87afe7c	lib: hashing functions should take const arguments It doesn't make much sense for a hash function to modify its argument, so const the hash input. BGP does it in a couple places, those cast away the const. Not great but not any worse than it was. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2019-05-14 21:23:08 +00:00
Tim Bray	e3b78da875	*: Rename backet to bucket Presume typo from original author Signed-off-by: Tim Bray <tim@kooky.org>	2019-02-25 16:22:36 +00:00
Ruben Kerkhof	4d762f2607	Treewide: use ANSI function definitions Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>	2019-01-24 11:21:59 +01:00
Donald Sharp	c80bedb83b	lib, bgpd: Convert frr_pthread_set_name to only cause it to set os name of the thread The current invocation of frr_pthread_set_name was causing it reset the os_name. There is no need for this, we now always create the pthread appropriately to have both name and os_name. So convert this function to a simple call through of the pthread call now. Before(any of these changes): sharpd@robot ~/frr1> ps -L -p 16895 PID LWP TTY TIME CMD 16895 16895 ? 00:01:39 bgpd 16895 16896 ? 00:00:54 16895 16897 ? 00:00:07 bgpd_ka After: sharpd@donna ~/frr1> ps -L -p 1752 PID LWP TTY TIME CMD 1752 1752 ? 00:00:00 bgpd 1752 1753 ? 00:00:00 bgpd_io 1752 1754 ? 00:00:00 bgpd_ka Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>	2019-01-09 14:59:22 -05:00
Donald Sharp	74df8d6d9d	*: Replace hash_cmp function return value to a bool The ->hash_cmp and linked list ->cmp functions were sometimes being used interchangeably and this really is not a good thing. So let's modify the hash_cmp function pointer to return a boolean and convert everything to use the new syntax. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>	2018-10-19 13:14:45 -04:00
David Lamparter	1ac267a2d9	lib: remove frr_pthread->id All I can see is an unneccessary complication. If there's some purpose here it needs to be documented... Signed-off-by: David Lamparter <equinox@diac24.net>	2018-09-19 22:01:46 +02:00
Chirag Shah	57019528a0	*: pthread set name abstraction Testing Done: TOR#cat /proc/2670/task/2672/comm bgpd_ka TOR# ps H -C bgpd -o 'pid tid cmd comm' PID TID CMD COMMAND 2670 2670 /usr/lib/frr/bgpd -M snmp - bgpd 2670 2671 /usr/lib/frr/bgpd -M snmp - bgpd 2670 2672 /usr/lib/frr/bgpd -M snmp - bgpd_ka Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>	2018-08-29 15:41:54 -07:00
Donald Sharp	68ede9c401	bgpd: zlog_warn to assert for code that must be executed first In bgp_keepalives.c, it was noticed that we were ensuring that we called an intialization function first, but this is a development escape in that once this was fixed we never see it. So if a developer moves this assumption around, let's crash the program and lead them to this spot instead of silently ignoring the problem. Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>	2018-08-16 08:24:22 -04:00
Chirag Shah	a9198bc1e2	bgpd: add keepalive thread name Testing Done: Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>	2018-07-27 21:04:47 -07:00
Quentin Young	2ccf91b108	bgpd: fix incorrect keepalive timer evaluation Incorrect check for sentinel value effectively caused peers to sometimes use the keepalive timer value of other peers, which sometimes led to hold timer expiry. Signed-off-by: Quentin Young <qlyoung@qlyoung.net>	2018-02-21 12:15:17 -05:00
Quentin Young	096476ddb0	bgpd: check flags before attempting keepalive ops If a peer already has keepalives turned on when asking to turn them on, return immediately. Same thing for turning them off. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2018-01-24 17:47:17 -05:00
Quentin Young	a715eab3ce	bgpd: update pthreads to use lib changes Use the new threading facilities provided in lib/ to streamline the threads used in bgpd. In particular, all of the lifecycle code has been removed from the I/O thread and replaced with the default loop. Did not do the same to the keepalives thread as it is much smaller (doesn't need the event system). Also cleaned up some comments to match the style guide. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2018-01-24 15:30:55 -05:00
Quentin Young	934af4587f	bgpd: turn off keepalives when sending NOTIFY This is necessary because otherwise between the time we wipe the output buffer and the time we push the NOTIFY onto it, the KA generation thread could have pushed a KEEPALIVE in the middle. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:18:07 -05:00
Quentin Young	152456fe23	bgpd: rebase onto master Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:18:04 -05:00
Quentin Young	6ee8ea1cf9	bgpd: fix includes for bgp_keeaplives.c Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:18:02 -05:00
Quentin Young	5c0c651c0a	bgpd: restyle bgp_keepalives.[ch] And update copyright header. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:18:02 -05:00
Quentin Young	b72b6f4fc9	bgpd: rename peer_keepalives* --> bgp_keepalives* Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:17:59 -05:00
Quentin Young	424ab01d0f	bgpd: implement buffered reads * Move and modify all network input related code to bgp_io.c * Add a real input buffer to `struct peer` * Move connection initialization to its own thread.c task instead of piggybacking off of bgp_read() * Tons of little fixups Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch]. Changes made elsewhere are almost exclusively refactoring peer->ibuf to peer->curr since peer->ibuf is now the true FIFO packet input buffer while peer->curr represents the packet currently being processed by the main pthread. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:17:59 -05:00
Quentin Young	0ca8b79f38	bgpd: use new threading infra Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:17:59 -05:00
Quentin Young	bd8b71e405	bgpd: use hash table for bgp_keepalives.c Large numbers of peers makes insertion and removal time for a linked list non-negligible. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:17:59 -05:00
Quentin Young	419dfe6a70	bgpd: dynamically allocate synchronization primitives Changes all synchronization primitives to be dynamically allocated. This should help catch any subtle errors in pthread lifecycles. This change also pre-initializes synchronization primitives before threads begin to run, eliminating a potential race condition that probably would have caused a segfault on startup on a very fast box. Also changes mutex and condition variable allocations to use MTYPE_PTHREAD and updates tests to do the proper initializations. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:17:58 -05:00
Quentin Young	49507a6f6a	bgpd: remove unused `struct thread` from peer * Remove t_write * Remove t_keepalive These have been replaced by pthreads and are no longer needed. Since some code looks at these values to determine if the threads are scheduled, also add a new bitfield to store the same information. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:17:58 -05:00
Quentin Young	03014d48f4	bgpd: put BGP keepalives in a pthread This patch, in tandem with moving packet writes into a dedicated kernel thread, fixes session flaps caused by long-running internal operations starving the (old) userspace write thread. BGP keepalives are now produced by a kernel thread and placed onto the peer's output queue. These are then consumed by the write thread. Both of these tasks are concurrent with the rest of bgpd, obviating the session flaps described above. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>	2017-11-30 16:17:57 -05:00

42 commits
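
The locking rule behind commit 8c9d306c8d (take the mutex while setting the running flag to false) can be illustrated with a minimal sketch using plain POSIX threads. This is not FRR's code: `struct ka_worker`, `ka_worker_start`, `ka_worker_stop`, and `run_shutdown_demo` are hypothetical names, and unlike the keepalive thread described in that commit, the worker here never frees the mutex itself. The stopping thread destroys it only after `pthread_join()` returns, which is a simplification assumed for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the keepalive pthread state; not FRR's struct. */
struct ka_worker {
	pthread_t thread;
	pthread_mutex_t mtx;
	pthread_cond_t cond;
	bool running; /* guarded by mtx */
	bool exited;  /* set by the worker just before it returns */
};

/* Worker loop: sleeps on the condition variable until told to stop. */
static void *ka_worker_loop(void *arg)
{
	struct ka_worker *w = arg;

	pthread_mutex_lock(&w->mtx);
	while (w->running) {
		/* A real worker would generate keepalives here. */
		pthread_cond_wait(&w->cond, &w->mtx);
	}
	w->exited = true;
	pthread_mutex_unlock(&w->mtx);
	return NULL;
}

/* Initialize the primitives before the thread starts running. */
static int ka_worker_start(struct ka_worker *w)
{
	pthread_mutex_init(&w->mtx, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->running = true;
	w->exited = false;
	if (pthread_create(&w->thread, NULL, ka_worker_loop, w) != 0) {
		pthread_cond_destroy(&w->cond);
		pthread_mutex_destroy(&w->mtx);
		return -1;
	}
	return 0;
}

/* Stop the worker: the flag is written under the same mutex the
 * worker holds when it reads the flag, so the two cannot race. */
static int ka_worker_stop(struct ka_worker *w)
{
	pthread_mutex_lock(&w->mtx);
	w->running = false;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->mtx);

	if (pthread_join(w->thread, NULL) != 0)
		return -1;

	/* Assumption of this sketch: teardown happens only after join. */
	pthread_cond_destroy(&w->cond);
	pthread_mutex_destroy(&w->mtx);
	return w->exited ? 0 : -1;
}

/* Starts and cleanly stops one worker; returns 0 on success. */
int run_shutdown_demo(void)
{
	struct ka_worker w;

	if (ka_worker_start(&w) != 0)
		return -1;
	return ka_worker_stop(&w);
}
```

Because the flag is only read and written with the mutex held, the worker either sees `running == false` before it sleeps or is woken by the signal afterwards, so the stop request cannot be missed.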