openwrt/target/linux
Michał Kępień 626c84340d kernel: mtk_bmt: refactor to avoid deep recursion
A Linksys E8450 (mt7622) device running current master has recently
started crashing:

    [    0.562900] mtk-ecc 1100e000.ecc: probed
    [    0.570254] spi-nand spi2.0: Fidelix SPI NAND was found.
    [    0.575576] spi-nand spi2.0: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 64
    [    0.583780] mtk-snand 1100d000.spi: ECC strength: 4 bits per 512 bytes
    [    0.682930] Insufficient stack space to handle exception!
    [    0.682939] ESR: 0x0000000096000047 -- DABT (current EL)
    [    0.682946] FAR: 0xffffffc008c47fe0
    [    0.682948] Task stack:     [0xffffffc008c48000..0xffffffc008c4c000]
    [    0.682951] IRQ stack:      [0xffffffc008008000..0xffffffc00800c000]
    [    0.682954] Overflow stack: [0xffffff801feb00a0..0xffffff801feb10a0]
    [    0.682959] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G S                5.15.107 #0
    [    0.682966] Hardware name: Linksys E8450 (DT)
    [    0.682969] pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    0.682975] pc : dequeue_entity+0x0/0x250
    [    0.682988] lr : dequeue_task_fair+0x98/0x290
    [    0.682992] sp : ffffffc008c48030
    [    0.682994] x29: ffffffc008c48030 x28: 0000000000000001 x27: ffffff801feb6380
    [    0.683004] x26: 0000000000000001 x25: ffffff801feb6300 x24: ffffff8000068000
    [    0.683011] x23: 0000000000000001 x22: 0000000000000009 x21: 0000000000000000
    [    0.683017] x20: ffffff801feb6380 x19: ffffff8000068080 x18: 0000000017a740a6
    [    0.683024] x17: ffffffc008bae748 x16: ffffffc008bae6d8 x15: ffffffffffffffff
    [    0.683031] x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000f00000101
    [    0.683038] x11: 0000000000000449 x10: 0000000000000127 x9 : 0000000000000000
    [    0.683044] x8 : 0000000000000125 x7 : 0000000000116da1 x6 : 0000000000116da1
    [    0.683051] x5 : 00000000001165a1 x4 : ffffff801feb6e00 x3 : 0000000000000000
    [    0.683058] x2 : 0000000000000009 x1 : ffffff8000068080 x0 : ffffff801feb6380
    [    0.683066] Kernel panic - not syncing: kernel stack overflow
    [    0.683069] SMP: stopping secondary CPUs
    [    1.648361] SMP: failed to stop secondary CPUs 0-1
    [    1.648366] Kernel Offset: disabled
    [    1.648368] CPU features: 0x00003000,00000802
    [    1.648372] Memory Limit: none

Several factors contributed to this issue:

 1. The mtk_bmt driver recursively calls its scan_bmt() helper function
    during device initialization, while looking for a valid block
    mapping table (BMT).

 2. Commit fa4dc86e98 ("kernel: backport MEMREAD ioctl"):

      - increased the size of some stack-allocated structures (like
        struct mtd_oob_ops, used in bbt_nand_read(), which is indirectly
        called from scan_bmt()),

      - increased the stack usage of some functions (for example,
        spinand_mtd_read(), which is indirectly called from scan_bmt(),
        now uses an extra stack-allocated struct mtd_ecc_stats).

 3. OpenWrt currently compiles the kernel with the
    -fno-optimize-sibling-calls flag, which prevents tail-call
    optimization.

Collectively, all of these factors caused stack usage in the mtk_bmt
driver to grow excessively large, triggering stack overflows.
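
To put the numbers from the log above in perspective, the following is a
rough back-of-the-envelope sketch, not a measurement. The addresses and
flash geometry are copied from the log; the helper names are made up for
illustration, and the "one recursion level per flash block" scenario is a
hypothetical upper bound (the range actually scanned depends on the
driver).

```c
#include <stdint.h>

/* Task stack bounds and fault address, copied from the crash log. */
#define TASK_STACK_LO 0xffffffc008c48000ULL
#define TASK_STACK_HI 0xffffffc008c4c000ULL
#define FAULT_ADDRESS 0xffffffc008c47fe0ULL

/* Flash geometry, copied from the spi-nand probe message. */
#define FLASH_BYTES (128ULL * 1024 * 1024)
#define BLOCK_BYTES (128ULL * 1024)

/* Size of the kernel task stack in bytes. */
static uint64_t task_stack_bytes(void)
{
	return TASK_STACK_HI - TASK_STACK_LO;
}

/* How far below the lowest stack address the faulting access landed. */
static uint64_t fault_bytes_below_stack(void)
{
	return TASK_STACK_LO - FAULT_ADDRESS;
}

/* Number of erase blocks on the flash chip. */
static uint64_t flash_blocks(void)
{
	return FLASH_BYTES / BLOCK_BYTES;
}

/*
 * Stack budget per recursion level if the whole task stack were shared
 * evenly between `levels` nested calls (hypothetical scenario).
 */
static uint64_t stack_bytes_per_level(uint64_t levels)
{
	return task_stack_bytes() / levels;
}
```

With these inputs, the task stack is 16 KiB, the faulting address lies
32 bytes below it, and the chip has 1024 blocks. If recursion ever went
one level per block, each level would have to fit in 16 bytes, which is
less than a single stack-allocated structure of the kind mentioned above.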

Recursion is not really necessary in scan_bmt() as it simply iterates
over flash memory blocks in reverse order, looking for a valid BMT.
Refactor the logic contained in the scan_bmt() and read_bmt() functions
in target/linux/generic/files/drivers/mtd/nand/mtk_bmt_v2.c so that they
no longer recurse deeply, which also rules out the stack overflows that
such recursion may cause.
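
The general shape of such a refactoring can be sketched as follows. This
is a simplified illustration, not the code from mtk_bmt_v2.c: the
function names, the callback type and the block range are all
hypothetical, and the callback stands in for "read this block and check
whether it holds a valid BMT".

```c
#include <stdbool.h>

/* Hypothetical stand-in for reading a block and validating its BMT. */
typedef bool (*block_check_fn)(int block, void *ctx);

/*
 * Recursive form: every block that fails the check adds one more stack
 * frame, unless the compiler turns the tail call into a jump (which
 * -fno-optimize-sibling-calls prevents).
 */
static int scan_recursive(int block, int lowest, block_check_fn is_valid,
			  void *ctx)
{
	if (block < lowest)
		return -1;
	if (is_valid(block, ctx))
		return block;
	return scan_recursive(block - 1, lowest, is_valid, ctx);
}

/*
 * Iterative form: walks the same blocks in the same reverse order, but
 * stack usage stays constant no matter how many blocks are visited.
 */
static int scan_iterative(int block, int lowest, block_check_fn is_valid,
			  void *ctx)
{
	for (; block >= lowest; block--) {
		if (is_valid(block, ctx))
			return block;
	}
	return -1;
}

/* Example checker: *ctx is the only block index holding a valid table. */
static bool only_block_is_valid(int block, void *ctx)
{
	return block == *(const int *)ctx;
}
```

Both functions return the index of the first valid block found when
scanning downwards from `block` to `lowest`, or -1 if there is none, so
the loop is a drop-in replacement for the recursive helper in this
sketch.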

Link: https://lists.openwrt.org/pipermail/openwrt-devel/2023-April/040872.html
Signed-off-by: Michał Kępień <openwrt@kempniu.pl>
2023-04-29 12:58:48 +02:00
airoha kernel: bump 5.15 to 5.15.100 2023-03-18 12:52:17 +01:00
apm821xx kernel: bump 5.10 to 5.10.178 2023-04-22 01:15:03 +02:00
archs38 treewide: replace wpad-basic-wolfssl default 2023-02-04 02:35:03 +01:00
armvirt kernel: disable CONFIG_CPU_LITTLE_ENDIAN in generic config 2022-10-21 13:47:01 +02:00
at91 kernel: bump 5.10 to 5.10.178 2023-04-22 01:15:03 +02:00
ath25 treewide: replace wpad-basic-wolfssl default 2023-02-04 02:35:03 +01:00
ath79 ath79: create APBoot-compatible image for Aruba AP-175 2023-04-24 10:44:49 +02:00
bcm27xx kernel: bump 5.15 to 5.15.108 2023-04-22 01:10:24 +02:00
bcm47xx kernel: add bcma/ssb fallback SPROM support 2023-04-23 12:18:35 +02:00
bcm53xx kernel: add bcma/ssb fallback SPROM support 2023-04-23 12:18:35 +02:00
bcm63xx bcm63xx: kernel: power cycle the bcm6358 USB PLL 2023-03-04 20:09:49 +01:00
bcm4908 bcm4908: switch to Kernel 5.15 by default 2023-04-10 21:21:03 +02:00
bmips bmips: fix external interrupt controller 2023-04-27 15:39:09 +02:00
gemini gemini: add generic subtarget 2022-12-23 19:44:20 +01:00
generic kernel: mtk_bmt: refactor to avoid deep recursion 2023-04-29 12:58:48 +02:00
imx cypress-nvram: consolidate NVRAM packages 2022-11-16 20:14:13 +01:00
ipq40xx ipq40xx: convert GL-AP1300 to DSA 2023-04-24 18:32:26 +02:00
ipq806x kernel: bump 5.10 to 5.10.178 2023-04-22 01:15:03 +02:00
ipq807x kernel: backport NVMEM patch for U-Boot env data "ethaddr" cell 2023-04-06 12:21:29 +02:00
kirkwood kirkwood: fix Linksys upgrade, restore config step 2023-04-11 12:21:15 +02:00
lantiq lantiq: fix lzma-loader for Netgear DGN 3500(B) 2023-04-02 22:33:55 +02:00
layerscape kernel: bump 5.15 to 5.15.108 2023-04-22 01:10:24 +02:00
malta treewide: replace wpad-basic-wolfssl default 2023-02-04 02:35:03 +01:00
mediatek mediatek: remove mt753x driver 2023-04-29 10:25:43 +02:00
mpc85xx mpc85xx: refresh patches 2023-04-08 14:41:01 +02:00
mvebu kernel: bump 5.15 to 5.15.108 2023-04-22 01:10:24 +02:00
mxs mxs: switch default kernel to 5.15 2023-01-30 11:13:14 +01:00
octeon octeon: switch to Kernel 5.15 by default 2023-04-08 00:47:12 +02:00
octeontx kernel: bump 5.15 to 5.15.100 2023-03-18 12:52:17 +01:00
omap treewide: replace wpad-basic-wolfssl default 2023-02-04 02:35:03 +01:00
oxnas treewide: replace wpad-basic-wolfssl default 2023-02-04 02:35:03 +01:00
pistachio kernel: bump 5.10 to 5.10.173 2023-03-20 22:44:28 +01:00
qoriq treewide: remove label = "cpu" from DSA dt-binding 2023-02-26 22:22:48 +01:00
ramips ramips: reduce Archer AX23 / MR70X SPI-frequency 2023-04-27 22:26:58 +02:00
realtek kernel: bump 5.10 to 5.10.178 2023-04-22 01:15:03 +02:00
rockchip treewide: update NVMEM symbols 2023-01-07 01:30:31 +01:00
sunxi sunxi: enable CONFIG_NVMEM_SYSFS 2023-02-26 22:22:48 +01:00
tegra tegra: switch to Kernel 5.15 by default 2023-04-08 00:30:22 +02:00
uml treewide: replace wpad-basic-wolfssl default 2023-02-04 02:35:03 +01:00
x86 x86: fix deprecated CONFIG_MICROCODE_OLD_INTERACE 2023-03-20 22:44:20 +01:00
zynq zynq: remove kconfig for 5.10 2023-01-30 18:01:14 +08:00
Makefile build: fix issues with targets installed via feeds 2022-09-27 13:41:12 +02:00