generic: 6.1: add patch fixing bugs with LED netdev trigger

Backport one patch merged upstream that prevent a deadlock for LED
netdev trigger and add a pending patch that fix kernel panic on
interface rename trigger notification with invalid dev.

Fixes: #14477
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
This commit is contained in:
Christian Marangi 2024-02-04 00:58:15 +01:00
parent 79c79d9809
commit 508f2dbfb3
No known key found for this signature in database
GPG Key ID: AC001D09ADBFEAD7
2 changed files with 222 additions and 0 deletions

View File

@ -0,0 +1,170 @@
From fe2b1226656afae56702d1d84c6900f6b67df297 Mon Sep 17 00:00:00 2001
From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Fri, 1 Dec 2023 11:23:22 +0100
Subject: [PATCH] leds: trigger: netdev: fix RTNL handling to prevent potential
deadlock
When working on LED support for r8169 I got the following lockdep
warning. Easiest way to prevent this scenario seems to be to take
the RTNL lock before the trigger_data lock in set_device_name().
======================================================
WARNING: possible circular locking dependency detected
6.7.0-rc2-next-20231124+ #2 Not tainted
------------------------------------------------------
bash/383 is trying to acquire lock:
ffff888103aa1c68 (&trigger_data->lock){+.+.}-{3:3}, at: netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
but task is already holding lock:
ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (rtnl_mutex){+.+.}-{3:3}:
__mutex_lock+0x9b/0xb50
mutex_lock_nested+0x16/0x20
rtnl_lock+0x12/0x20
set_device_name+0xa9/0x120 [ledtrig_netdev]
netdev_trig_activate+0x1a1/0x230 [ledtrig_netdev]
led_trigger_set+0x172/0x2c0
led_trigger_write+0xf1/0x140
sysfs_kf_bin_write+0x5d/0x80
kernfs_fop_write_iter+0x15d/0x210
vfs_write+0x1f0/0x510
ksys_write+0x6c/0xf0
__x64_sys_write+0x14/0x20
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6c/0x74
-> #0 (&trigger_data->lock){+.+.}-{3:3}:
__lock_acquire+0x1459/0x25a0
lock_acquire+0xc8/0x2d0
__mutex_lock+0x9b/0xb50
mutex_lock_nested+0x16/0x20
netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
call_netdevice_register_net_notifiers+0x5a/0x100
register_netdevice_notifier+0x85/0x120
netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
led_trigger_set+0x172/0x2c0
led_trigger_write+0xf1/0x140
sysfs_kf_bin_write+0x5d/0x80
kernfs_fop_write_iter+0x15d/0x210
vfs_write+0x1f0/0x510
ksys_write+0x6c/0xf0
__x64_sys_write+0x14/0x20
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6c/0x74
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(rtnl_mutex);
lock(&trigger_data->lock);
lock(rtnl_mutex);
lock(&trigger_data->lock);
*** DEADLOCK ***
8 locks held by bash/383:
#0: ffff888103ff33f0 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0
#1: ffff888103aa1e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210
#2: ffff8881036f1890 (kn->active#82){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210
#3: ffff888108e2c358 (&led_cdev->led_access){+.+.}-{3:3}, at: led_trigger_write+0x30/0x140
#4: ffffffff8cdd9e10 (triggers_list_lock){++++}-{3:3}, at: led_trigger_write+0x75/0x140
#5: ffff888108e2c270 (&led_cdev->trigger_lock){++++}-{3:3}, at: led_trigger_write+0xe3/0x140
#6: ffffffff8cdde3d0 (pernet_ops_rwsem){++++}-{3:3}, at: register_netdevice_notifier+0x1c/0x120
#7: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
stack backtrace:
CPU: 0 PID: 383 Comm: bash Not tainted 6.7.0-rc2-next-20231124+ #2
Hardware name: Default string Default string/Default string, BIOS ADLN.M6.SODIMM.ZB.CY.015 08/08/2023
Call Trace:
<TASK>
dump_stack_lvl+0x5c/0xd0
dump_stack+0x10/0x20
print_circular_bug+0x2dd/0x410
check_noncircular+0x131/0x150
__lock_acquire+0x1459/0x25a0
lock_acquire+0xc8/0x2d0
? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
__mutex_lock+0x9b/0xb50
? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
? __this_cpu_preempt_check+0x13/0x20
? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
? __cancel_work_timer+0x11c/0x1b0
? __mutex_lock+0x123/0xb50
mutex_lock_nested+0x16/0x20
? mutex_lock_nested+0x16/0x20
netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
call_netdevice_register_net_notifiers+0x5a/0x100
register_netdevice_notifier+0x85/0x120
netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
led_trigger_set+0x172/0x2c0
? preempt_count_add+0x49/0xc0
led_trigger_write+0xf1/0x140
sysfs_kf_bin_write+0x5d/0x80
kernfs_fop_write_iter+0x15d/0x210
vfs_write+0x1f0/0x510
ksys_write+0x6c/0xf0
__x64_sys_write+0x14/0x20
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6c/0x74
RIP: 0033:0x7f269055d034
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
RSP: 002b:00007ffddb7ef748 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f269055d034
RDX: 0000000000000007 RSI: 000055bf5f4af3c0 RDI: 0000000000000001
RBP: 000055bf5f4af3c0 R08: 0000000000000073 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000007
R13: 00007f26906325c0 R14: 00007f269062ff20 R15: 0000000000000000
</TASK>
Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode")
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/fb5c8294-2a10-4bf5-8f10-3d2b77d2757e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
drivers/leds/trigger/ledtrig-netdev.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/leds/trigger/ledtrig-netdev.c
+++ b/drivers/leds/trigger/ledtrig-netdev.c
@@ -235,6 +235,11 @@ static int set_device_name(struct led_ne
{
cancel_delayed_work_sync(&trigger_data->work);
+ /*
+ * Take RTNL lock before trigger_data lock to prevent potential
+ * deadlock with netdev notifier registration.
+ */
+ rtnl_lock();
mutex_lock(&trigger_data->lock);
if (trigger_data->net_dev) {
@@ -254,16 +259,14 @@ static int set_device_name(struct led_ne
trigger_data->carrier_link_up = false;
trigger_data->link_speed = SPEED_UNKNOWN;
trigger_data->duplex = DUPLEX_UNKNOWN;
- if (trigger_data->net_dev != NULL) {
- rtnl_lock();
+ if (trigger_data->net_dev)
get_device_state(trigger_data);
- rtnl_unlock();
- }
trigger_data->last_activity = 0;
set_baseline_state(trigger_data);
mutex_unlock(&trigger_data->lock);
+ rtnl_unlock();
return 0;
}

View File

@ -0,0 +1,52 @@
From 9e2a0a8bd7a392b5af6bf2cfa0b07d96d1fb6719 Mon Sep 17 00:00:00 2001
From: Christian Marangi <ansuelsmth@gmail.com>
Date: Sun, 4 Feb 2024 00:32:04 +0100
Subject: [PATCH] leds: trigger: netdev: Fix kernel panic on interface rename
trig notify
Commit d5e01266e7f5 ("leds: trigger: netdev: add additional specific link
speed mode") in the various changes, reworked the way to set the LINKUP
mode in commit cee4bd16c319 ("leds: trigger: netdev: Recheck
NETDEV_LED_MODE_LINKUP on dev rename") and moved it to a generic function.
This changed the logic where, in the previous implementation the dev
from the trigger event was used to check if the carrier was ok, but in
the new implementation with the generic function, the dev in
trigger_data is used instead.
This is problematic and cause a possible kernel panic due to the fact
that the dev in the trigger_data still reference the old one as the
new one (passed from the trigger event) still has to be hold and saved
in the trigger_data struct (done in the NETDEV_REGISTER case).
On calling of get_device_state(), an invalid net_dev is used and this
cause a kernel panic.
To handle this correctly, move the call to get_device_state() after the
new net_dev is correctly set in trigger_data (in the NETDEV_REGISTER
case) and correctly parse the new dev.
Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
drivers/leds/trigger/ledtrig-netdev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/leds/trigger/ledtrig-netdev.c
+++ b/drivers/leds/trigger/ledtrig-netdev.c
@@ -489,12 +489,12 @@ static int netdev_trig_notify(struct not
trigger_data->duplex = DUPLEX_UNKNOWN;
switch (evt) {
case NETDEV_CHANGENAME:
- get_device_state(trigger_data);
- fallthrough;
case NETDEV_REGISTER:
dev_put(trigger_data->net_dev);
dev_hold(dev);
trigger_data->net_dev = dev;
+ if (evt == NETDEV_CHANGENAME)
+ get_device_state(trigger_data);
break;
case NETDEV_UNREGISTER:
dev_put(trigger_data->net_dev);