After commit e0d2c59ee995 ("genirq: Always limit the affinity to online
CPUs", 5.10) on Linux, the cpumask passed to irq_set_affinity of irqchip
driver is limited to online CPUs. When irq_do_set_affinity called from
otto timer driver with only one secondary CPU, that CPU is not marked as
online yet, filtered out by cpu_online_mask and fall to error path.
Then, fail to set affinity for that CPU and it leads to instability of
timer on secondary CPU(s).
At least, RTL839x system will be affected.
log:
[ 37.560020] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 37.638025] rcu: 1-...!: (0 ticks this GP) idle=6ac/0/0x0 softirq=0/0 fqs=1 (false positive?)
[ 37.752683] (detected by 0, t=6002 jiffies, g=-1179, q=26293)
[ 37.829510] Sending NMI from CPU 0 to CPUs 1:
[ 37.886857] NMI backtrace for cpu 1 skipped: idling at r4k_wait_irqoff+0x1c/0x24
[ 37.984801] rcu: rcu_sched kthread timer wakeup didn't happen for 5999 jiffies! g-1179 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 38.132743] rcu: Possible timer handling issue on cpu=1 timer-softirq=0
[ 38.221033] rcu: rcu_sched kthread starved for 6000 jiffies! g-1179 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[ 38.356336] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 38.474440] rcu: RCU grace-period kthread stack dump:
...
Replace to irq_force_affinity from irq_set_affinity and ignore
cpu_online_mask to fix the issue.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
Tested-by: Olliver Schinagl <oliver@schinagl.nl>
After replacing the R4K event timer and clock source with the new
Realtek Otto timer, performance for RTL839x devices was severely
impacted, as reported by Hiroshi.
Research by Markus showed that after commit 4657a5301e ("realtek:
avoid busy waiting for RTL839x PHY read/write"), the ethernet driver
could only update a phy once per timer interval, which also heavily
impacted boot time. On e.g. a Zyxel GS1900-48, this added around a
minute to the time to fully initialise the switch.
By marking the otto clocksource as continuous, the kernel enables it to
be used for high resolution timers. This allows readx_poll_timeout() to
sleep for less than one system timer interval, reducing system dead
time.
Link: https://github.com/openwrt/openwrt/issues/11117
Reported-by: INAGAKI Hiroshi <musashino.open@gmail.com>
Cc: Markus Stockhausen <markus.stockhausen@gmx.de>
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Tested-by: INAGAKI Hiroshi <musashino.open@gmail.com> # Panasonic Switch-M48eG PN28480K
Tested-by: Jan Hoffmann <jan@3e8.eu> # HPE 1920-8G, HPE 1920-48G
Now that we provide a clock driver for the Reltek SOCs the CPU frequency might
change on demand. This has direct visible effects during operation
- the CEVT 4K timer is no longer a stable clocksource
- after CPU frequencies changes time calculation works wrong
- sched_clock falls back to kernel default interval (100 Hz)
- timestamps in dmesg have only 2 digits left
[ 0.000000] sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps ...
[ 0.060000] pid_max: default: 32768 minimum: 301
[ 0.070000] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.070000] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.080000] dyndbg: Ignore empty _ddebug table in a CONFIG_DYNAMIC_DEBUG_CORE build
[ 0.090000] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, ...
Looking around where we can start the CEVT timer for RTL930X is a good basis.
Initially it was developed as a clocksource driver for the broken timer in that
specific SOC series. Afterwards it was shifted around to the CEVT location,
got SMP enablement and lost its clocksource feature. So we at least have
something to copy from. As the timers on these devices are well understood
the implementation follows this way:
- leave the RTL930X implementation as is
- provide a new driver for RTL83XX devices only
- swap RTL930X driver at a later time
Like the clock driver this patch contains a self contained module that is SOC
independet and already provides full support for the RTL838X, RTL839X and
RTL930X devices. Some of the new (or reestablished) features are:
- simplified initialization routines
- SMP setup with CPU hotplug framework
- derived from LXB clock speed
- supplied clocksource
- dedicated register functions for better readability
- documentation about some caveats
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
[remove unused header includes, remove old CONFIG_MIPS dependency, add
REALTEK_ prefix to driver symbol]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
The RTL9300 has a broken R4K MIPS timer interrupt, however, the
R4K clocksource works. We replace the RTL9300 timer with a
Clock Event Timer (CEVT), which is VSMP aware and can be instantiated
as part of brining a VSMTP cpu up instead of the R4K CEVT source.
For this we place the RTL9300 CEVT timer in arch/mips/kernel
together with other MIPS CEVT timers, initialize the SoC IRQs
from a modified smp-mt.c and instantiate each timer as part
of the MIPS time setup in arch/mips/include/asm/time.h instead
of the R4K CEVT, similarly as is done by other MIPS CEVT timers.
Signed-off-by: Birger Koblitz <git@birger-koblitz.de>
this patch copies the following files from 5.4 to 5.10:
- config-5.4 -> config-5.10
- files-5.4/ -> files-5.10/
- patches-5.4/ -> patches-5.10/
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
[rebase on change in files-5.4]
Signed-off-by: Adrian Schmutzler <freifunk@adrianschmutzler.de>