mirror of
https://github.com/openwrt/openwrt.git
synced 2025-01-23 04:48:22 +00:00
98d325aaf8
Apparently, a few ipq40xx devices have sporadic problems when reading the flash over SPI. When that happens, the result of the faulty SPI read is cached and the read isn't re-attempted. Depending on when it happens, the router either panics and reboots or is left in a partially broken state (an application won't start). The data on the flash is alright.

This wasn't the case with OpenWrt on Linux < 5.x, but I wasn't able to work out which software change was responsible.

GitHub user karlpip created a patch for testing that disabled the cache entirely and added logs. Typically, only one or two SPI operations fail at a time:

[689200.631152] spi-nor spi0.0: SPI transfer failed: -110
[689200.631280] spi_master spi0: failed to transfer one message from queue
[689200.635369] jffs2: Write of 68 bytes at 0x00ffccf4 failed. returned -110, retlen 0
[689200.642014] jffs2: Not marking the space at 0x00ffccf4 as dirty because the flash driver returned retlen zero

Because reads aren't re-attempted, squashfs can't recover:

[3171844.279235] SQUASHFS error: Failed to read block 0x2bb912: -5
[3171844.279284] SQUASHFS error: Unable to read fragment cache entry [2bb912]
[3171844.283980] SQUASHFS error: Unable to read page, block 2bb912, size 14e6c
[3171844.291650] SQUASHFS error: Unable to read fragment cache entry [2bb912]
[3171844.297831] SQUASHFS error: Unable to read page, block 2bb912, size 14e6c

I assume there to be some kind of underlying electrical problem because, in my experience, this happens a lot more when PoE is used.

NoTengoBattery has made an in-depth investigation:
https://forum.openwrt.org/t/patch-squashfs-data-probably-corrupt/70480
and created a patch that evicts the page cache and retries reading:
https://github.com/NoTengoBattery/openwrt/blob/linksys-ea6350v3-mastertrack/target/linux/ipq40xx/patches-5.4/9996-fs_squashfs_improve_squashfs_error_resistance.patch

The patch also works well with the WPJ428, but NoTengoBattery didn't try to upstream it ("This is not the solution that should be used"). In 2020, I tried and failed to create a working patch that prevents faulty pages from being cached in the first place. Because I needed a solution, I backported "squashfs: add option to panic on errors" (10dde05b89980ef), which has since become available in OpenWrt.

The 'errors=panic' option has been tested on a fleet of several hundred WPJ428s over multiple years. Without this patch, devices regularly went into 'limbo' on reboot or update and required a manual reboot. Devices with this patch don't. I was initially concerned that the kernel panic would leave devices with genuinely corrupted data, but I haven't seen a case of actual corruption since (outside of people turning off the power during upgrades).

The WPJ428 is the only device I tested this patch on; others might also benefit.

Reviewed-by: Robert Marko <robimarko@gmail.com>
Signed-off-by: Leon M. Busch-George <leon@georgemail.eu>
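
The failure mode described in the commit message (a failed read is cached and never re-attempted, versus a reader that retries before caching) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `FlakyFlash` and `CachingReader` are hypothetical names invented for illustration, and the code models neither the kernel's squashfs cache nor either of the patches mentioned above.

```python
class FlakyFlash:
    """Hypothetical stand-in for an SPI flash whose reads fail on chosen attempts."""

    def __init__(self, blocks, failing_attempts):
        self.blocks = blocks
        self.failing_attempts = set(failing_attempts)
        self.attempts = 0  # total number of reads issued to the "hardware"

    def read(self, block):
        self.attempts += 1
        if self.attempts in self.failing_attempts:
            raise IOError(f"SPI transfer failed for block {block:#x}")
        return self.blocks[block]


class CachingReader:
    """Caches the outcome of each block read, including failures.

    With retries=0 a transient error is cached and repeated forever.
    With retries>0 the read is re-attempted before anything is cached.
    """

    def __init__(self, flash, retries=0):
        self.flash = flash
        self.retries = retries
        self.cache = {}  # block -> ("ok", data) or ("error", exception)

    def read(self, block):
        if block not in self.cache:
            self.cache[block] = self._fetch(block)
        status, value = self.cache[block]
        if status == "error":
            raise value
        return value

    def _fetch(self, block):
        last_error = None
        for _ in range(self.retries + 1):
            try:
                return ("ok", self.flash.read(block))
            except IOError as exc:
                last_error = exc
        return ("error", last_error)
```

In this model, a reader created with `retries=0` keeps raising the same error for a block whose first read failed, even though the flash would answer correctly on a second attempt. A reader created with `retries=1` recovers from a single transient failure.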
qcom-ipq40x9-dr40x9.dts
qcom-ipq4018-a42.dts
qcom-ipq4018-ap120c-ac.dts
qcom-ipq4018-cap-ac.dts
qcom-ipq4018-cs-w3-wd1200g-eup.dts
qcom-ipq4018-dap-2610.dts
qcom-ipq4018-ea6350v3.dts
qcom-ipq4018-eap1300.dts
qcom-ipq4018-ecw5211.dts
qcom-ipq4018-emd1.dts
qcom-ipq4018-emr3500.dts
qcom-ipq4018-ens620ext.dts
qcom-ipq4018-ex61x0v2.dtsi
qcom-ipq4018-ex6100v2.dts
qcom-ipq4018-ex6150v2.dts
qcom-ipq4018-fritzbox-4040.dts
qcom-ipq4018-gl-a1300.dts
qcom-ipq4018-gl-ap1300.dts
qcom-ipq4018-hap-ac2.dts
qcom-ipq4018-jalapeno.dts
qcom-ipq4018-jalapeno.dtsi
qcom-ipq4018-magic-2-wifi-next.dts
qcom-ipq4018-meshpoint-one.dts
qcom-ipq4018-mf287_common.dtsi
qcom-ipq4018-mf287.dts
qcom-ipq4018-mf287plus.dts
qcom-ipq4018-mf287pro.dts
qcom-ipq4018-nbg6617.dts
qcom-ipq4018-pa1200.dts
qcom-ipq4018-rt-ac58u.dts
qcom-ipq4018-rutx10.dts
qcom-ipq4018-rutx50.dts
qcom-ipq4018-rutx.dtsi
qcom-ipq4018-sxtsq-5-ac.dts
qcom-ipq4018-wac510.dts
qcom-ipq4018-wap-ac-lte.dts
qcom-ipq4018-wap-ac.dts
qcom-ipq4018-wap-ac.dtsi
qcom-ipq4018-wap-r-ac.dts
qcom-ipq4018-whw01.dts
qcom-ipq4018-wr-1.dts
qcom-ipq4018-wre6606.dts
qcom-ipq4018-wrtq-329acn.dts
qcom-ipq4019-a62.dts
qcom-ipq4019-cm520-79f.dts
qcom-ipq4019-e2600ac-c1.dts
qcom-ipq4019-e2600ac-c2.dts
qcom-ipq4019-e2600ac.dtsi
qcom-ipq4019-ea8300.dts
qcom-ipq4019-eap2200.dts
qcom-ipq4019-fritzbox-7530.dts
qcom-ipq4019-fritzrepeater-1200.dts
qcom-ipq4019-fritzrepeater-3000.dts
qcom-ipq4019-gl-b2200.dts
qcom-ipq4019-habanero-dvk.dts
qcom-ipq4019-hap-ac3-lte6-kit.dts
qcom-ipq4019-hap-ac3.dts
qcom-ipq4019-le1.dts
qcom-ipq4019-lhgg-60ad.dts
qcom-ipq4019-map-ac2200.dts
qcom-ipq4019-mf18a.dts
qcom-ipq4019-mf282plus.dts
qcom-ipq4019-mf286d.dts
qcom-ipq4019-mf289f.dts
qcom-ipq4019-mr8300.dts
qcom-ipq4019-ncp-hg100-cellular.dts
qcom-ipq4019-oap100.dts
qcom-ipq4019-orbi.dtsi
qcom-ipq4019-pa2200.dts
qcom-ipq4019-r619ac-64m.dts
qcom-ipq4019-r619ac-128m.dts
qcom-ipq4019-r619ac.dtsi
qcom-ipq4019-rbr40.dts
qcom-ipq4019-rbr50.dts
qcom-ipq4019-rbs40.dts
qcom-ipq4019-rbs50.dts
qcom-ipq4019-rt-ac42u.dts
qcom-ipq4019-rtl30vw.dts
qcom-ipq4019-srr60.dts
qcom-ipq4019-srs60.dts
qcom-ipq4019-u4019-32m.dts
qcom-ipq4019-u4019.dtsi
qcom-ipq4019-whw03v2.dts
qcom-ipq4019-wifi.dts
qcom-ipq4019-wpj419.dts
qcom-ipq4019-wtr-m2133hp.dts
qcom-ipq4019-x1pro.dts
qcom-ipq4019-x1pro.dtsi
qcom-ipq4019-xx8300.dtsi
qcom-ipq4028-wpj428.dts
qcom-ipq4029-ap-303.dts
qcom-ipq4029-ap-303h.dts
qcom-ipq4029-ap-365.dts
qcom-ipq4029-aruba-glenmorangie.dtsi
qcom-ipq4029-gl-b1300.dts
qcom-ipq4029-gl-s1300.dts
qcom-ipq4029-insect-common.dtsi
qcom-ipq4029-mr33.dts
qcom-ipq4029-mr74.dts
qcom-ipq4029-ws-ap3915i.dts