openwrt/target/linux/ipq806x/patches-6.1/700-03-net-stmmac-improve-TX-timer-arm-logic.patch
John Audia 1e6c6a36f5 kernel: bump 6.1 to 6.1.68
Changelog: https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.68

Removed upstreamed:
	generic/backport-6.1/795-v6.6-12-r8152-Rename-RTL8152_UNPLUG-to-RTL8152_INACCESSIBLE.patch[1]

Manually rebased:
	mediatek/patches-6.1/100-dts-update-mt7622-rfb1.patch

All other patches automatically rebased.

1. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.1.68&id=3759e735562a31e44fee825498f05c06e64b25a8

Build system: x86/64
Build-tested: x86/64/AMD Cezanne
Run-tested: x86/64/AMD Cezanne

Signed-off-by: John Audia <therealgraysky@proton.me>
2023-12-19 14:12:25 +01:00

From cd40cd8b1ca4a6f531c6c3fd78b306e5014f9c04 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth@gmail.com>
Date: Mon, 18 Sep 2023 14:39:01 +0200
Subject: [PATCH 3/4] net: stmmac: improve TX timer arm logic

There is currently a problem with the TX timer being armed many more
times than necessary, causing a big performance regression on some
devices where rearming an hrtimer is expensive.

The TX timer is an old mechanism that predates the napi implementation
and the interrupt enable/disable handling. Because stmmac is very old
code, the TX timer was never re-evaluated after napi was introduced and
was simply kept, causing the performance regression. The regression
started to appear with kernel version 4.19 and commit 8fce33317023
("net: stmmac: Rework coalesce timer and fix multi-queue races"), where
the timer was reduced to 1ms, causing it to be armed 40 times more often
than before.

Decreasing the timer made the problem more visible and caused a
regression in the order of 600-700 Mbps on some devices (the regression
was noticed on ipq806x).

The root of the problem is that handling the hrtimer is expensive on
some targets, and recent kernels arm the timer far more often.

One proposed solution was to revert the hrtimer change and use
mod_timer, but that would still hide the real problem in the current
implementation.

To fix the regression, apply some additional logic and skip arming the
timer when it is not needed.

Arm the timer ONLY if a napi is not already scheduled. When a napi is
already scheduled, running the timer is redundant since the same
function (stmmac_tx_clean) will run in the napi TX poll. Also try to
cancel any pending timer if a napi is scheduled, to prevent a redundant
run of the TX cleanup.

With this new logic the original performance is restored while still
using the hrtimer.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
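
Note for readers: the effect of the new arming rule can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions, not driver code: struct sim_queue and every sim_* function
are hypothetical stand-ins, the hrtimer is reduced to a flag and two
counters, and the NAPI state is simply toggled by the caller. It counts
how often the expensive arm path runs under the old "always arm" rule
and under the "arm only when no napi is scheduled" rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for one TX queue; not the driver's types. */
struct sim_queue {
	bool napi_scheduled;		/* models napi_is_scheduled(napi) */
	bool timer_armed;		/* models a pending hrtimer */
	unsigned int arm_calls;		/* times the expensive arm path ran */
	unsigned int cancel_calls;	/* times a cancel was attempted */
};

/* Old rule: rearm the timer on every call. */
static void sim_tx_timer_arm_old(struct sim_queue *q, unsigned int coal_timer_us)
{
	if (!coal_timer_us)
		return;

	q->timer_armed = true;
	q->arm_calls++;
}

/*
 * New rule: arm only when no napi is scheduled, otherwise try to cancel.
 * Simplification: the cancel always succeeds in this model.
 */
static void sim_tx_timer_arm_new(struct sim_queue *q, unsigned int coal_timer_us)
{
	if (!coal_timer_us)
		return;

	if (!q->napi_scheduled) {
		q->timer_armed = true;
		q->arm_calls++;
	} else {
		q->timer_armed = false;
		q->cancel_calls++;
	}
}

/*
 * Issue 'calls' arm requests. NAPI is modelled as idle only on every
 * 'idle_every'-th request (an arbitrary, assumed traffic pattern).
 * Returns how many times the timer was actually armed.
 */
static unsigned int sim_count_arms(bool use_new, unsigned int calls,
				   unsigned int idle_every)
{
	struct sim_queue q = { 0 };
	unsigned int i;

	assert(idle_every != 0);

	for (i = 0; i < calls; i++) {
		q.napi_scheduled = (i % idle_every) != 0;
		if (use_new)
			sim_tx_timer_arm_new(&q, 1000);
		else
			sim_tx_timer_arm_old(&q, 1000);
	}

	return q.arm_calls;
}
```

In this model, sim_count_arms(false, 1000, 10) arms the timer on all
1000 requests, while sim_count_arms(true, 1000, 10) arms it only on the
100 requests where no napi is scheduled. The real saving depends on how
often a napi is already scheduled under load, which this model does not
try to predict.
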
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2976,13 +2976,25 @@ static void stmmac_tx_timer_arm(struct s
 {
 	struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue];
 	u32 tx_coal_timer = priv->tx_coal_timer[queue];
+	struct stmmac_channel *ch;
+	struct napi_struct *napi;
 
 	if (!tx_coal_timer)
 		return;
 
-	hrtimer_start(&tx_q->txtimer,
-		      STMMAC_COAL_TIMER(tx_coal_timer),
-		      HRTIMER_MODE_REL);
+	ch = &priv->channel[tx_q->queue_index];
+	napi = tx_q->xsk_pool ? &ch->rxtx_napi : &ch->tx_napi;
+
+	/* Arm timer only if napi is not already scheduled.
+	 * Try to cancel any timer if napi is scheduled, timer will be armed
+	 * again in the next scheduled napi.
+	 */
+	if (unlikely(!napi_is_scheduled(napi)))
+		hrtimer_start(&tx_q->txtimer,
+			      STMMAC_COAL_TIMER(tx_coal_timer),
+			      HRTIMER_MODE_REL);
+	else
+		hrtimer_try_to_cancel(&tx_q->txtimer);
 }
 
 /**