After commit e0d2c59ee995 ("genirq: Always limit the affinity to online
CPUs", 5.10) on Linux, the cpumask passed to irq_set_affinity of irqchip
driver is limited to online CPUs. When irq_do_set_affinity called from
otto timer driver with only one secondary CPU, that CPU is not marked as
online yet, filtered out by cpu_online_mask and fall to error path.
Then, fail to set affinity for that CPU and it leads to instability of
timer on secondary CPU(s).
At least, RTL839x system will be affected.
log:
[ 37.560020] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 37.638025] rcu: 1-...!: (0 ticks this GP) idle=6ac/0/0x0 softirq=0/0 fqs=1 (false positive?)
[ 37.752683] (detected by 0, t=6002 jiffies, g=-1179, q=26293)
[ 37.829510] Sending NMI from CPU 0 to CPUs 1:
[ 37.886857] NMI backtrace for cpu 1 skipped: idling at r4k_wait_irqoff+0x1c/0x24
[ 37.984801] rcu: rcu_sched kthread timer wakeup didn't happen for 5999 jiffies! g-1179 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 38.132743] rcu: Possible timer handling issue on cpu=1 timer-softirq=0
[ 38.221033] rcu: rcu_sched kthread starved for 6000 jiffies! g-1179 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[ 38.356336] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 38.474440] rcu: RCU grace-period kthread stack dump:
...
Replace to irq_force_affinity from irq_set_affinity and ignore
cpu_online_mask to fix the issue.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
Tested-by: Olliver Schinagl <oliver@schinagl.nl>
Upstream generic MIPS uses 0x80100000 and 0x80100400 for the LOADADDR
and ENTRY addresses. As we do not want to diverge from upstream and
patch upstream when not needed, adjust our addresses as well to be
future proof.
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
Tested-by: Jan Hoffmann <jan@3e8.eu> # HPE 1920-8G, HPE 1920-48G
The reset register on RTL93xx not merely have bits to execute
a reset of a hardware component, but also configuration bits for
reset procedures. Keep them during executing a reset.
Signed-off-by: Birger Koblitz <git@birger-koblitz.de>
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
[backport to 5.10 kernel]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
A full loop accessing all FDB entries can take several milliseconds
(on RTL839x about 20 ms), so give other kernel tasks a chance to run.
This is especially important for rtl83xx_port_fdb_dump which is itself
called in a loop for all ports by the kernel.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
These two functions are identical apart from writing different values to
the read/write bit. Create a new function rtl_table_exec to reduce code
duplication.
Also replace the unbounded busy-waiting loop. The new implementation may
sleep, but as the hardware typically responds before the first poll, any
callers doing many table accesses still need to make sure not to block
other kernel tasks themselves.
So far, polling timeout errors are only handled by logging an error, but
a return value is added to allow proper handling in the future.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
This function currently prints three messages for every switch port at
KERN_INFO level. This takes a considerable amount of time during bootup
and can even trigger an external watchdog.
Replace these log messages by a single one at KERN_DEBUG level.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
As learning for the CPU port is now disabled globally, the bit in the
TX header doesn't have any effect anymore. Remove it to make the header
consistent with the global configuration.
Originally, this change was intended to be applied before commit
eb456aedfe ("realtek: use assisted learning on CPU port"), which is
why the commit message incorrectly mentions that the TX header already
disables learning.
The reason for disabling learning on the CPU port in the first place is
that it doesn't work correctly when packets are trapped to the CPU and
then forwarded by the CPU to other ports. In that case, the switch would
incorrectly learn the CPU port as source. An example that triggered this
issue are Multicast Listener Reports and IGMP membership reports.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
should be add/delete or abbreviated add/del
Signed-off-by: Jan-Niklas Burfeind <git@aiyionpri.me>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The code in dsa.c:rtl83xx_port_enable() was trying to set
vlan_port_tag_sts_ctrl while dealing with differences between SoCs.
However, not only that register has a different address, the register
structure and even the 2-bit value semantic changes for each SoC.
The vlan_port_tag_sts_ctrl field was dropped and converted into a
vlan_port_keep_incoming_tag_set() function that abstracts the different
between SoCs. The macro referencing that register migrated to the SoC
specific c file as it will be privately used by each file.
All magic numbers were converted into macros using BITMASK and
FIELD_PREP.
The vlan_port_tag_sts_ctrl debugfs was dropped for now as it is already
broken for rtl93xx. The best place for SoC specific code might be in each
respective c file and not in if/else clauses.
The final result is:
rtl838x: set ITAG_STS=TAGGED, same as before
rtl839x: set ITAG_STS=TAGGED instead of IGR_P_ITAG_KEEP=0x1, fixing
forwarding of tagged packets
rtl930x: set EGR_ITAG_STS=TAGGED instead of IGR_P_ITAG=0x1, possibly
fixing forwarding of tagged packets
rtl931x: set EGR_ITAG_STS=TAGGED instead of OTPID_KEEP=0x1, possibly
fixing forwarding of tagged packets
Without (EGR_)ITAG_STS=TAGGED, at least for rtl839x, forwarded packets
will drop the vlan tag while packets from the CPU will still have the
correct tag.
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
The rtl930x speed status registers require 4 bits to indicate the speed
status. As such, we want to divide by 8. To make things consistent with
the rest of this code, use a bitshift however.
This bug probably won't affect many users yet, as there aren't many
rtl930x switches in the wild yet with more then 10 ports, and thus a
low-impact bugfix.
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
[also fix port field extraction]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
After replacing the R4K event timer and clock source with the new
Realtek Otto timer, performance for RTL839x devices was severely
impacted, as reported by Hiroshi.
Research by Markus showed that after commit 4657a5301e ("realtek:
avoid busy waiting for RTL839x PHY read/write"), the ethernet driver
could only update a phy once per timer interval, which also heavily
impacted boot time. On e.g. a Zyxel GS1900-48, this added around a
minute to the time to fully initialise the switch.
By marking the otto clocksource as continuous, the kernel enables it to
be used for high resolution timers. This allows readx_poll_timeout() to
sleep for less than one system timer interval, reducing system dead
time.
Link: https://github.com/openwrt/openwrt/issues/11117
Reported-by: INAGAKI Hiroshi <musashino.open@gmail.com>
Cc: Markus Stockhausen <markus.stockhausen@gmx.de>
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Tested-by: INAGAKI Hiroshi <musashino.open@gmail.com> # Panasonic Switch-M48eG PN28480K
Tested-by: Jan Hoffmann <jan@3e8.eu> # HPE 1920-8G, HPE 1920-48G
In rtl83xx_set_features we set bit 3 to enable, and bit 4 to disable
checksuming. Looking at rtl93xx_set_features we however see that for
both enable and disable the same bit is used (bit 4). This can't be
right, especially as bit 4 for rtl83xx seems to be Collision threshold
occupying 2 bits. Change this to make this more logical.
Fixes: 9e8d62e421 ("realtek: enable CRC offloading")
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
L2 learning on the CPU port is currently not consistently configured and
relies on the default configuration of the device. On RTL83xx, it is
disabled for packets transmitted with a TX header, as hardware learning
corrupts the forwarding table otherwise. As a result, unneeded flooding
of traffic for the CPU port can already happen on some devices now. It
is also likely that similar issues exist on RTL93xx, which doesn't have
a field to disable learning in the TX header.
To address this, disable hardware learning for the CPU port globally on
all devices. Instead, enable assisted learning to let DSA write FDB
entries to the switch.
For now, this does not sync local/bridge entries to the switch. However,
support for that was added in Linux 5.14, so the next switch to a newer
kernel version is going to fix this.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Initialize the data structure using memset to avoid the possibility of
writing garbage values to the hardware.
Always set a valid entry type, which should fix writing unicast entries
on RTL930x.
For unicast entries, set the is_static flag to prevent the switch from
aging them out.
Also set the rvid field for unicast entries. This is not strictly
necessary, as the switch fills it in automatically from a non-zero vid.
However, this makes the code consistent with multicast entry setup.
While at it, reorder the statements and fix some style issues (double
space, comma instead of semicolon at end of statement). Also remove the
unneeded priv parameter and debug print for the multicast entry setup
function.
Fixes: cde31976e3 ("realtek: Add support for Layer 2 Multicast")
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
The switches support different actions for incoming ethernet multicast
frames with Reserved Multicast Addresses (01-80-C2-00-00-{01-2F}). The
current code will set the 2-bit action field to FLOOD (0x3) for most
classes, but the highest bit is always unset for the relevant control
registers. This means the DROP (0x1) action being used for these
classes; whatever class the MSB happens to be in.
For RTL838x, this results in {20,23-2F} frames being dropped, instead of
flooding all ports. On other switch generations, {0F,1F,2F} frames are
dropped. This is inconsistent, and appears to be a mistake. Remove this
inconsistency by flooding all multicast frames with RMA addresses.
Signed-off-by: Sander Vanheule <sander@svanheule.net>
The multicast setup function rtl838x_eth_set_multicast_list() checks if
the current SoC is a RTL839x family device. However, the function is
only included in the RTL838x ops table, so this path should never be
taken, making this dead code. rtl839x_eth_set_multicast_list() is
already present in the RTL839x ops table, so it should be safe to remove
this branch.
While touching the code, also re-sort the functions to match sorting
elsewhere, with rtl838x coming before rtl839x.
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Currently several messages at KERN_INFO level are printed for every FDB
del/dump operation. This can cause a significant slowdown for example
while using "bridge fdb", and may even trigger a watchdog.
Remove most of these log messages, as the new L2 table debugfs node
should be a good replacement. Change the remaining messages to
KERN_DEBUG level.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Switch to a polling implementation similar to the one for RTL838x, to
allow other kernel tasks to run while waiting.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Provide some helpful information about the devicetree configuration of
our new driver
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
[correct compatible order in examples]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Now that we provide a clock driver for the Reltek SOCs the CPU frequency might
change on demand. This has direct visible effects during operation
- the CEVT 4K timer is no longer a stable clocksource
- after CPU frequencies changes time calculation works wrong
- sched_clock falls back to kernel default interval (100 Hz)
- timestamps in dmesg have only 2 digits left
[ 0.000000] sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps ...
[ 0.060000] pid_max: default: 32768 minimum: 301
[ 0.070000] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.070000] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.080000] dyndbg: Ignore empty _ddebug table in a CONFIG_DYNAMIC_DEBUG_CORE build
[ 0.090000] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, ...
Looking around where we can start the CEVT timer for RTL930X is a good basis.
Initially it was developed as a clocksource driver for the broken timer in that
specific SOC series. Afterwards it was shifted around to the CEVT location,
got SMP enablement and lost its clocksource feature. So we at least have
something to copy from. As the timers on these devices are well understood
the implementation follows this way:
- leave the RTL930X implementation as is
- provide a new driver for RTL83XX devices only
- swap RTL930X driver at a later time
Like the clock driver this patch contains a self contained module that is SOC
independet and already provides full support for the RTL838X, RTL839X and
RTL930X devices. Some of the new (or reestablished) features are:
- simplified initialization routines
- SMP setup with CPU hotplug framework
- derived from LXB clock speed
- supplied clocksource
- dedicated register functions for better readability
- documentation about some caveats
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
[remove unused header includes, remove old CONFIG_MIPS dependency, add
REALTEK_ prefix to driver symbol]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
In *_enable_learning() only address learning should be configured, so
remove enabling forwarding. Forwarding is configured by the respective
*_enable_flood() functions.
Clean up both functions for RTL838x and RTL839x, and fix the comment on
the number of entries.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
[squash RTL838x, RTL839x changes]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Those messages should be printed when entry was found (idx >= 0). Move
them to the right place to not print invalid entry indices.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
[amden commit message]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
The initial state of sds_mode in rtl9300_force_sds_mode() is null and it
will be configured in switch-case. So print message after it.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
[amend commit message]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Use the generic function of MIPS in Linux Kernel instead of open coding
our own initialisation.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
[amend commit message]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
The availabibity of probing CPC depends on CONFIG_MIPS_CPC symbol and it
will be checked in arch/mips/include/asm/mips-cpc.h. RTL9310 selects
this symbol, so the family check is redudant.
Furthermore, mips_cm_probe() is already called from setup_arch() in
mips/kernel/setup.c before prom_init(), and as such is not required.
Also move mips_cpc_probe() to run just before registering SMP ops.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
[squash SMP change commits, reword commit message]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
---
This patch only really has an impact on the rtl931x subtarget, which has
no devices. Noboby is currently set up to test these patches either, but
the end result is closer to MIPS_GENERIC, so I do not expect it to cause
issues.
RTL8231 and ethernet phys are not on the same bus, so separate the lock
to each own to cut off the unnecessary dependency.
Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
The workaround for an already-enabled R4K timer used a non-existent
macro CAUSE_DC. Fix compiling by using the actual macro CAUSEF_DC.
Fixes: b7aab19585 ("realtek: SMP handling of R4K timer interrupts")
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Until now there has been no good explanation why we mess with the R4K
timer on SMP. After extensive testing and looking at the SDK code it
becomes clear what it is all about.
When we disable the CEVT_R4K module (we will do with the new timer
driver) the R4K timer hardware still fires interrupts on the secondary
CPU. To get around this we have two options:
- Disable IRQ 7
- Stop the counter completely
This patch selects option two because this is the root of evil.. To be
on the safe side we will do it only in case the CEVT_R4K module is
disabled.
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
The scope of the SMP startup structure is wrong. It is created on the
stack and not as a global variable. This can lead to startup failures.
Fixes: 3f41360eb7 ("realtek: use upstream recommendation for CPU start")
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de
Don't overwrite AS_DPM and L2LEARNING flags when dest_port is >= 32.
Fixes: 1773264a0c ("realtek: correct egress frame port verification")
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Currently we fix interrupts/timers for the secondary CPU by patching
vsmp_init_secondary(). Get a little bit more generic and use the
upstream recommended way instead. Additionally avoid a check around
register_cps_smp_ops() because it does that itself.
See https://lkml.org/lkml/2022/9/12/522
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Due to an oversight we accidentally inverted the timeout check. This
patch corrects this.
Fixes: 9cec4a0ea4 ("realtek: Use built-in functionality for timeout loop")
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
[ wrap poll_timeout line to 80 char ]
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
In commit 81e3017609 ("realtek: clean up rtl838x MDIO busy wait loop")
a hand-crafted loop was created, that nearly exactly replicate the
iopoll's `read_poll_timeout` functionality.
Use that instead.
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
The previous fixup was incomplete, and the offsets for the
queue and crc_error cpu_tag bitfields were still wrong on
RTL839x.
Fixes: 545c6113c9 ("realtek: fix RTL838x receive tag decoding")
Suggested-by: Jan Hoffmann <jan@3e8.eu>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Commit dc9cc0d3e2 ("realtek: add QoS and rate control") replaced a
16 bit reserved field in the RTL83xx packet header with the initial
cpu_tag word, shifting the real cpu_tag fields by one. Adjusting for
this new shift was partially forgotten in the new RX tag decoders.
This caused the switch to block IGMP, effectively blocking IPv4
multicast.
The bug was partially fixed by commit 9d847244d9 ("realtek: fix
RTL839X receive tag decoding")
Fix on RTL838x too, including correct NIC_RX_REASON_SPECIAL_TRAP value.
Suggested-by: Jan Hoffmann <jan@3e8.eu>
Fixes: dc9cc0d3e2 ("realtek: add QoS and rate control")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Some devices have wrong/empty values in the PLL registers. Work
around that by reporting the default values.
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
When marking a switch port as disabled in the device tree, by using
'status = "disabled";', the switch driver fails on boot, causing a
restart:
CPU 0 Unable to handle kernel paging request at virtual address
00000000, epc == 802c3064, ra == 8022b4b4
[ ... ]
Call Trace:
[<802c3064>] strlen+0x0/0x2c
[<8022b4b4>] start_creating.part.0+0x78/0x194
[<8022bd3c>] debugfs_create_dir+0x44/0x1c0
[<80396dfc>] rtl838x_dbgfs_port_init+0x54/0x258
[<80397508>] rtl838x_dbgfs_init+0xe0/0x56c
This is caused by the DSA subsystem (mostly) ignoring the port, while
rtl83xx_mdio_probe() still extracts some details on this disabled port
from the device tree, resulting in the usage of a NULL pointer where a
port name is expected.
By not probing ignoring disabled ports, no attempt is made to create a
debugfs directory later. The device then boots as expected without the
disabled port.
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Add a new self-contained combined clock & platform driver that allows to
access the PLL hardware clocks of RTL83XX devices. Currently it provides
info about CPU, MEM and LXB clocks on RTL838X and RTL839X devices and
additionally allows to change the CPU clocks. Changing the clocks
multiple times on a DGS-1210-20 and a DGS-1210-52 already works well and
is multithreading safe on the RTL839X. Even a cpufreq initiated change
of the CPU clock works fine. Loading the driver will add some meaningful
logging.
[0.000000] rtl83xx-clk: initialized, CPU 500 MHz, MEM 300 MHz (8 Bit DDR3), LXB 200 MHz
[0.279456] rtl83xx-clk soc:clock-controller: rate setting enabled, CPU 325-600 MHz,
MEM 300-300 MHz, LXB 200-200 MHz, OVERCLOCK AT OWN RISK
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
[remove trailing whitespaces, C-style SPDX comments for ASM and headers]
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Platform startup still "guesses" the CPU clock speed by DT fixed values.
If possible take clock rates from a to be developed driver and align to
MIPS generic platfom initialization code. Pack old behaviour into a
fallback function. We might get rid of that some day.
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Don't use udelay to allow other kernel tasks to execute if the kernel
has been built without preemption. Also determine the timeout based on
jiffies instead of loop iterations.
This is especially important on devices containing a watchdog with a
short timeout. Without this change, the watchdog is not serviced during
PHY patching which can take multiple seconds.
Tested-by: Birger Koblitz <mail@birger-koblitz.de>
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Probe the SFP module during PHY initialization and implement
insertion/removal handlers to automatically configure the media type
of the respective port.
Suggested-by: Birger Koblitz <git@birger-koblitz.de>
Tested-by: Birger Koblitz <mail@birger-koblitz.de>
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Move RTL8214FC power configuration to newly created suspend and resume
methods. A media change now only results in power configuration if the
PHY is not suspended, to avoid powering up a port when the interface is
currently not up.
While at it, remove the rtl8380 prefix from function names, as this is
actually not SoC-specific.
Tested-by: Birger Koblitz <mail@birger-koblitz.de>
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Toggle power on the individual PHY instead of the package. Otherwise
a media change always toggles power on the first port, and not the one
that is being configured.
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Destination switch ports for outgoing frame can range from 0 to
CPU_PORT-1.
Refactor the code to only generate egress frame CPU headers when a valid
destination port number is available, and make the code a bit more
consistent between different switch generations. Change the dest_port
argument's type to 'unsigned int', since only positive values are valid.
This fixes the issue where egress frames on switch port 0 did not
receive a VLAN tag, because they are sent out without a CPU header.
Also fixes a potential issue with invalid (negative) egress port numbers
on RTL93xx switches.
Reported-by: Arınç ÜNAL <arinc.unal@xeront.com>
Suggested-by: Birger Koblitz <mail@birger-koblitz.de>
Tested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Sander Vanheule <sander@svanheule.net>