summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-03-30docs: sphinx/requirements: Limit jinja2<3.1Akira Yokosawa
jinja2 release 3.1.0 (March 24, 2022) broke Sphinx<4.0. This looks like the result of deprecating Python 3.6. It has been tested against Sphinx 4.3.0 and later. Setting an upper limit of <3.1 to junja2 can unbreak Sphinx<4.0 including Sphinx 2.4.4. Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: stable@vger.kernel.org # v5.15+ Link: https://lore.kernel.org/r/7dbff8a0-f4ff-34a0-71c7-1987baf471f9@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-03-30sfc: Avoid NULL pointer dereference on systems without numa awarenessMartin Habets
On such systems cpumask_of_node() returns NULL, which bitmap operations are not happy with. Fixes: c265b569a45f ("sfc: default config to 1 channel/core in local NUMA node only") Fixes: 09a99ab16c60 ("sfc: set affinity hints in local NUMA node only") Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Reviewed-by: Íñigo Huguet <ihuguet@redhat.com> Link: https://lore.kernel.org/r/164857006953.8140.3265568858101821256.stgit@palantir17.mph.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-30ptp: ocp: handle error from nvmem_device_findJonathan Lemon
nvmem_device_find returns a valid pointer or IS_ERR(). Handle this properly. Fixes: 0cfcdd1ebcfe ("ptp: ocp: add nvmem interface for accessing eeprom") Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/r/20220329160354.4035-1-jonathan.lemon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-30net: dsa: felix: fix possible NULL pointer dereferenceZheng Yongjun
As the possible failure of the allocation, kzalloc() may return NULL pointer. Therefore, it should be better to check the 'sgi' in order to prevent the dereference of NULL pointer. Fixes: 23ae3a7877718 ("net: dsa: felix: add stream gate settings for psfp"). Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220329090800.130106-1-zhengyongjun3@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-30drbd: fix potential silent data corruptionLars Ellenberg
Scenario: --------- bio chain generated by blk_queue_split(). Some split bio fails and propagates its error status to the "parent" bio. But then the (last part of the) parent bio itself completes without error. We would clobber the already recorded error status with BLK_STS_OK, causing silent data corruption. Reproducer: ----------- How to trigger this in the real world within seconds: DRBD on top of degraded parity raid, small stripe_cache_size, large read_ahead setting. Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED", umount and mount again, "reboot"). Cause significant read ahead. Large read ahead request is split by blk_queue_split(). Parts of the read ahead that are already in the stripe cache, or find an available stripe cache to use, can be serviced. Parts of the read ahead that would need "too much work", would need to wait for a "stripe_head" to become available, are rejected immediately. For larger read ahead requests that are split in many pieces, it is very likely that some "splits" will be serviced, but then the stripe cache is exhausted/busy, and the remaining ones will be rejected. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Cc: <stable@vger.kernel.org> # 4.13.x Link: https://lore.kernel.org/r/20220330185551.3553196-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-30MIPS: rb532: move GPIOD definition into C-filesJackie Liu
My kernel robot reports build error from drivers/iio/adc/da9150-gpadc.c, drivers/iio/adc/da9150-gpadc.c:254:13: error: ‘DA9150_GPADC_CHAN_0x08’ undeclared here (not in a function); did you mean ‘DA9150_GPADC_CHAN_TBAT’? 254 | .channel = DA9150_GPADC_CHAN_##_id, We define GPIOD in rb.h, in fact it should only be used in gpio.c, but it affects the driver da9150-gpadc.c which goes against the original intention of the design, just move it to its scope. Fixes: 1b432840d0a4 ("MIPS: RB532: GPIO register offsets are relative to GPIOBASE") Suggested-by: Jonathan Cameron <jic23@kernel.org> Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-03-30MIPS: lantiq: check the return value of kzalloc()Xiaoke Wang
kzalloc() is a memory allocation function which can return NULL when some internal memory errors happen. So it is better to check the return value of it to prevent potential wrong memory access or memory leak. Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-03-30mips: sgi-ip22: add a check for the return of kzalloc()Xiaoke Wang
kzalloc() is a memory allocation function which can return NULL when some internal memory errors happen. So it is better to check it to prevent potential wrong memory access. Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-03-30Merge tag 'pwm/for-5.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "This contains conversions of some more drivers to the atomic API as well as the addition of new chip support for some existing drivers. There are also various minor fixes and cleanups across the board, from drivers to device tree bindings" * tag 'pwm/for-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (45 commits) pwm: rcar: Simplify multiplication/shift logic dt-bindings: pwm: renesas,tpu: Do not require pwm-cells twice dt-bindings: pwm: tiehrpwm: Do not require pwm-cells twice dt-bindings: pwm: tiecap: Do not require pwm-cells twice dt-bindings: pwm: samsung: Do not require pwm-cells twice dt-bindings: pwm: intel,keembay: Do not require pwm-cells twice dt-bindings: pwm: brcm,bcm7038: Do not require pwm-cells twice dt-bindings: pwm: toshiba,visconti: Include generic PWM schema dt-bindings: pwm: renesas,pwm: Include generic PWM schema dt-bindings: pwm: sifive: Include generic PWM schema dt-bindings: pwm: rockchip: Include generic PWM schema dt-bindings: pwm: mxs: Include generic PWM schema dt-bindings: pwm: iqs620a: Include generic PWM schema dt-bindings: pwm: intel,lgm: Include generic PWM schema dt-bindings: pwm: imx: Include generic PWM schema dt-bindings: pwm: allwinner,sun4i-a10: Include generic PWM schema pwm: pwm-mediatek: Beautify error messages text pwm: pwm-mediatek: Allocate clk_pwms with devm_kmalloc_array pwm: pwm-mediatek: Simplify error handling with dev_err_probe() pwm: brcmstb: Remove useless locking ...
2022-03-30Merge tag 'regulator-fix-v5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of fixes for the rt4831 driver which fix features that didn't work due to incomplete description of the register configuration" * tag 'regulator-fix-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: rt4831: Add active_discharge_on to fix discharge API regulator: rt4831: Add bypass mask to fix set_bypass API work
2022-03-30Merge tag 'dmaengine-5.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "This time we have bunch of driver updates and some new device support. New support: - Document RZ/V2L and RZ/G2UL dma binding - TI AM62x k3-udma and k3-psil support Updates: - Yaml conversion for Mediatek uart apdma schema - Removal of DMA-32 fallback configuration for various drivers - imx-sdma updates for channel restart" * tag 'dmaengine-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (23 commits) dmaengine: hisi_dma: fix MSI allocate fail when reload hisi_dma dmaengine: dw-axi-dmac: cleanup comments dmaengine: fsl-dpaa2-qdma: Drop comma after SoC match table sentinel dt-bindings: dma: Convert mtk-uart-apdma to DT schema dmaengine: ppc4xx: Make use of the helper macro LIST_HEAD() dmaengine: idxd: Remove useless DMA-32 fallback configuration dmaengine: qcom_hidma: Remove useless DMA-32 fallback configuration dmaengine: sh: Kconfig: Add ARCH_R9A07G054 dependency for RZ_DMAC config option dmaengine: ti: k3-psil: Add AM62x PSIL and PDMA data dmaengine: ti: k3-udma: Add AM62x DMSS support dmaengine: ti: cleanup comments dmaengine: imx-sdma: clean up some inconsistent indenting dmaengine: Revert "dmaengine: shdma: Fix runtime PM imbalance on error" dmaengine: idxd: restore traffic class defaults after wq reset dmaengine: altera-msgdma: Remove useless DMA-32 fallback configuration dmaengine: stm32-dma: set dma_device max_sg_burst dmaengine: imx-sdma: fix cyclic buffer race condition dmaengine: imx-sdma: restart cyclic channel if needed dmaengine: iot: Remove useless DMA-32 fallback configuration dmaengine: ptdma: handle the cases based on DMA is complete ...
2022-03-30Merge tag 'rproc-v5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull remoteproc updates from Bjorn Andersson: "In the remoteproc core, it's now possible to mark the sysfs attributes read only on a per-instance basis, which is then used by the TI wkup M3 driver. Also, the rproc_shutdown() interface propagates errors to the caller and an array underflow is fixed in the debugfs interface. The rproc_da_to_va() API is moved to the public API to allow e.g. child rpmsg devices to acquire pointers to memory shared with the remote processor. The TI K3 R5F and DSP drivers gains support for attaching to instances already started by the bootloader, aka IPC-only mode. The Mediatek remoteproc driver gains support for the MT8186 SCP. The driver's probe function is reordered and moved to use the devres version of rproc_alloc() to save a few gotos. The driver's probe function is also transitioned to use dev_err_probe() to provide better debug support. Support for the Qualcomm SC7280 Wireless Subsystem (WPSS) is introduced. The Hexagon based remoteproc drivers gains support for voting for interconnect bandwidth during launch of the remote processor. The modem subsystem (MSS) driver gains support for probing the BAM-DMUX driver, which provides the network interface towards the modem on a set of older Qualcomm platforms. In addition a number a bug fixes are introduces in the Qualcomm drivers. Lastly Qualcomm ADSP DeviceTree binding is converted to YAML format, to allow validation of DeviceTree source files" * tag 'rproc-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (22 commits) remoteproc: qcom_q6v5_mss: Create platform device for BAM-DMUX remoteproc: qcom: q6v5_wpss: Add support for sc7280 WPSS dt-bindings: remoteproc: qcom: Add SC7280 WPSS support dt-bindings: remoteproc: qcom: adsp: Convert binding to YAML remoteproc: k3-dsp: Add support for IPC-only mode for all K3 DSPs remoteproc: k3-dsp: Refactor mbox request code in start remoteproc: k3-r5: Add support for IPC-only mode for all R5Fs remoteproc: k3-r5: Refactor mbox request code in start remoteproc: Change rproc_shutdown() to return a status remoteproc: qcom: q6v5: Add interconnect path proxy vote remoteproc: mediatek: Support mt8186 scp dt-bindings: remoteproc: mediatek: Add binding for mt8186 scp remoteproc: qcom_q6v5_mss: Fix some leaks in q6v5_alloc_memory_region remoteproc: qcom_wcnss: Add missing of_node_put() in wcnss_alloc_memory_region remoteproc: qcom: Fix missing of_node_put in adsp_alloc_memory_region remoteproc: move rproc_da_to_va declaration to remoteproc.h remoteproc: wkup_m3: Set sysfs_read_only flag remoteproc: Introduce sysfs_read_only flag remoteproc: Fix count check in rproc_coredump_write() remoteproc: mtk_scp: Use dev_err_probe() where possible ...
2022-03-30Merge tag 'hwlock-v5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull hwspinlock updates from Bjorn Andersson: "This updates sprd and srm32 drivers to use struct_size() instead of their open-coded equivalents. It also cleans up the omap dt-bindings example" * tag 'hwlock-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: hwspinlock: sprd: Use struct_size() helper in devm_kzalloc() hwspinlock: stm32: Use struct_size() helper in devm_kzalloc() dt-bindings: hwlock: omap: Remove redundant binding example
2022-03-30Merge tag 'rpmsg-v5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull rpmsg updates from Bjorn Andersson: "The major part of the rpmsg changes for v5.18 relates to improvements in the rpmsg char driver, which now allow automatically attaching to rpmsg channels as well as initiating new communication channels from the Linux side. The SMD driver is moved to arch_initcall with the purpose of registering root clocks earlier during boot. Also in the SMD driver, a workaround for the resource power management (RPM) channel is introduced to resolve an issue where both the RPM and Linux side waits for the other to close the communication established by the bootloader - this unblocks support for clocks and regulators on some older Qualcomm platforms" * tag 'rpmsg-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: rpmsg: ctrl: Introduce new RPMSG_CREATE/RELEASE_DEV_IOCTL controls rpmsg: char: Introduce the "rpmsg-raw" channel rpmsg: char: Add possibility to use default endpoint of the rpmsg device rpmsg: char: Refactor rpmsg_chrdev_eptdev_create function rpmsg: Update rpmsg_chrdev_register_device function rpmsg: Move the rpmsg control device from rpmsg_char to rpmsg_ctrl rpmsg: Create the rpmsg class in core instead of in rpmsg char rpmsg: char: Export eptdev create and destroy functions rpmsg: char: treat rpmsg_trysend() ENOMEM as EAGAIN rpmsg: qcom_smd: Fix redundant channel->registered assignment rpmsg: use struct_size over open coded arithmetic rpmsg: smd: allow opening rpm_requests even if already opened rpmsg: qcom_smd: Promote to arch_initcall
2022-03-30Merge tag 'i3c/for-5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux Pull i3c updates from Alexandre Belloni: - support dynamic addition of i2c devices * tag 'i3c/for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: i3c: fix uninitialized variable use in i2c setup i3c: support dynamically added i2c devices i3c: remove i2c board info from i2c_dev_desc
2022-03-30Merge tag 'clk-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "There's one large change in the core clk framework here. We change how clk_set_rate_range() works so that the frequency is re-evaulated each time the rate is changed. Previously we wouldn't let clk providers see a rate that was different if it was still within the range, which could be bad for power if the clk could run slower when a range expands. Now the clk provider can decide to do something differently when the constraints change. This broke Nvidia's clk driver so we had to wait for the fix for that to bake a little more in -next. The rate range patch series also introduced a kunit suite for the clk framework that we're going to extend in the next release. It already made it easy to find corner cases in the rate range patches so I'm excited to see it cover more clk code and increase our confidence in core framework patches in the future. I also added a kunit test for the basic clk gate code and that work will continue to cover more basic clk types: muxes, dividers, etc. Beyond the core code we have the usual set of clk driver updates and additions. Qualcomm again dominates the diffstat here with lots more SoCs being supported and i.MX follows afer that with a similar number of SoCs gaining clk drivers. Beyond those large additions there's drivers being modernized to use clk_parent_data so we can move away from global string names for all the clks in an SoC. Finally there's lots of little fixes all over the clk drivers for typos, warnings, and missing clks that aren't critical and get batched up waiting for the next merge window to open. Nothing super big stands out in the driver pile. Full details are below. Core: - Make clk_set_rate_range() re-evaluate the limits each time - Introduce various clk_set_rate_range() tests - Add clk_drop_range() to drop a previously set range New Drivers: - i.MXRT1050 clock driver and bindings - i.MX8DXL clock driver and bindings - i.MX93 clock driver and bindings - NCO blocks on Apple SoCs - Audio clks on StarFive JH7100 RISC-V SoC - Add support for the new Renesas RZ/V2L SoC - Qualcomm SDX65 A7 PLL - Qualcomm SM6350 GPU clks - Qualcomm SM6125, SM6350, QCS2290 display clks - Qualcomm MSM8226 multimedia clks Updates: - Kunit tests for clk-gate implementation - Terminate arrays with sentinels and make that clearer - Cleanup SPDX tags - Fix typos in comments - Mark mux table as const in clk-mux - Make the all_lists array const - Convert Cirrus Logic CS2000P driver to regmap, yamlify DT binding and add support for dynamic mode - Clock configuration on Microchip PolarFire SoCs - Free allocations on probe error in Mediatek clk driver - Modernize Mediatek clk driver by consolidating code - Add watchdog (WDT), I2C, and pin function controller (PFC) clocks on Renesas R-Car S4-8 - Improve the clocks for the Rockchip rk3568 display outputs (parenting, pll-rates) - Use of_device_get_match_data() instead of open-coding on Rockchip rk3568 - Reintroduce the expected fractional-divider behaviour that disappeared with the addition of CLK_FRAC_DIVIDER_POWER_OF_TWO_PS - Remove SYS PLL 1/2 clock gates for i.MX8M* - Remove AUDIO MCLK ROOT from i.MX7D - Add fracn gppll clock type used by i.MX93 - Add new composite clock for i.MX93 - Add missing media mipi phy ref clock for i.MX8MP - Fix off by one in imx_lpcg_parse_clks_from_dt() - Rework for the imx pll14xx - sama7g5: One low priority fix for GCLK of PDMC - Add DMA engine (SYS-DMAC) clocks on Renesas R-Car S4-8 - Add MOST (MediaLB I/F) clocks on Renesas R-Car E3 and D3 - Add CAN-FD clocks on Renesas R-Car V3U - Qualcomm SC8280XP RPMCC - Add some missing clks on Qualcomm MSM8992/MSM8994/MSM8998 SoCs - Rework Qualcomm GCC bindings and convert SDM845 camera bindig to YAML - Convert various Qualcomm drivers to use clk_parent_data - Remove test clocks from various Qualcomm drivers - Crypto engine clks on Qualcomm IPQ806x + more freqs for SDCC/NSS - Qualcomm SM8150 EMAC, PCIe, UFS GDSCs - Better pixel clk frequency support on Qualcomm RCG2 clks" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (227 commits) clk: zynq: Update the parameters to zynq_clk_register_periph_clk clk: zynq: trivial warning fix clk: Drop the rate range on clk_put() clk: test: Test clk_set_rate_range on orphan mux clk: Initialize orphan req_rate dt-bindings: clock: drop useless consumer example dt-bindings: clock: renesas: Make example 'clocks' parsable clk: qcom: gcc-msm8994: Fix gpll4 width dt-bindings: clock: fix dt_binding_check error for qcom,gcc-other.yaml clk: rs9: Add Renesas 9-series PCIe clock generator driver clk: fixed-factor: Introduce devm_clk_hw_register_fixed_factor_index() clk: visconti: prevent array overflow in visconti_clk_register_gates() dt-bindings: clk: rs9: Add Renesas 9-series I2C PCIe clock generator clk: sifive: Move all stuff into SoCs header files from C files clk: sifive: Add SoCs prefix in each SoCs-dependent data riscv: dts: Change the macro name of prci in each device node dt-bindings: change the macro name of prci in header files and example clk: sifive: duplicate the macro definitions for the time being clk: qcom: sm6125-gcc: fix typos in comments clk: ti: clkctrl: fix typos in comments ...
2022-03-30Merge tag 'libnvdimm-for-5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "The update for this cycle includes the deprecation of block-aperture mode and a new perf events interface for the papr_scm nvdimm driver. The perf events approach was acked by PeterZ. - Add perf support for nvdimm events, initially only for 'papr_scm' devices. - Deprecate the 'block aperture' support in libnvdimm, it only ever existed in the specification, not in shipping product" * tag 'libnvdimm-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm/blk: Fix title level MAINTAINERS: remove section LIBNVDIMM BLK: MMIO-APERTURE DRIVER powerpc/papr_scm: Fix build failure when drivers/nvdimm: Fix build failure when CONFIG_PERF_EVENTS is not set nvdimm/region: Delete nd_blk_region infrastructure ACPI: NFIT: Remove block aperture support nvdimm/namespace: Delete nd_namespace_blk nvdimm/namespace: Delete blk namespace consideration in shared paths nvdimm/blk: Delete the block-aperture window driver nvdimm/region: Fix default alignment for small regions docs: ABI: sysfs-bus-nvdimm: Document sysfs event format entries for nvdimm pmu powerpc/papr_scm: Add perf interface support drivers/nvdimm: Add perf interface to expose nvdimm performance stats drivers/nvdimm: Add nvdimm pmu structure
2022-03-30fs: fix an infinite loop in iomap_fiemapGuo Xuenan
when get fiemap starting from MAX_LFS_FILESIZE, (maxbytes - *len) < start will always true , then *len set zero. because of start offset is beyond file size, for erofs filesystem it will always return iomap.length with zero,iomap iterate will enter infinite loop. it is necessary cover this corner case to avoid this situation. ------------[ cut here ]------------ WARNING: CPU: 7 PID: 905 at fs/iomap/iter.c:35 iomap_iter+0x97f/0xc70 Modules linked in: xfs erofs CPU: 7 PID: 905 Comm: iomap Tainted: G W 5.17.0-rc8 #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:iomap_iter+0x97f/0xc70 Code: 85 a1 fc ff ff e8 71 be 9c ff 0f 1f 44 00 00 e9 92 fc ff ff e8 62 be 9c ff 0f 0b b8 fb ff ff ff e9 fc f8 ff ff e8 51 be 9c ff <0f> 0b e9 2b fc ff ff e8 45 be 9c ff 0f 0b e9 e1 fb ff ff e8 39 be RSP: 0018:ffff888060a37ab0 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff888060a37bb0 RCX: 0000000000000000 RDX: ffff88807e19a900 RSI: ffffffff81a7da7f RDI: ffff888060a37be0 RBP: 7fffffffffffffff R08: 0000000000000000 R09: ffff888060a37c20 R10: ffff888060a37c67 R11: ffffed100c146f8c R12: 7fffffffffffffff R13: 0000000000000000 R14: ffff888060a37bd8 R15: ffff888060a37c20 FS: 00007fd3cca01540(0000) GS:ffff888108780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020010820 CR3: 0000000054b92000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> iomap_fiemap+0x1c9/0x2f0 erofs_fiemap+0x64/0x90 [erofs] do_vfs_ioctl+0x40d/0x12e0 __x64_sys_ioctl+0xaa/0x1c0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> ---[ end trace 0000000000000000 ]--- watchdog: BUG: soft lockup - CPU#7 stuck for 26s! [iomap:905] Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> [djwong: fix some typos] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-30loop: fix ioctl calls using compat_loop_infoCarlos Llamas
Support for cryptoloop was deleted in commit 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer"), making the usage of loop_info->lo_encrypt_type obsolete. However, this member was also removed from the compat_loop_info definition and this breaks userspace ioctl calls for 32-bit binaries and CONFIG_COMPAT=y. This patch restores the compat_loop_info->lo_encrypt_type member and marks it obsolete as well as in the uapi header definitions. Fixes: 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220329201815.1347500-1-cmllamas@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-30PCI/doc: cleanup references to the legacy PCI DMA APIChristoph Hellwig
Mention the regular DMA API calls instead of the now removed PCI DMA API. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2022-03-30ksmbd: replace usage of found with dedicated list iterator variableJakob Koschel
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use a found variable and simply checking if the variable was set, can determine if the break/goto was hit. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-03-30ksmbd: Remove a redundant zeroing of memoryChristophe JAILLET
fill_transform_hdr() has only one caller that already clears tr_buf (it is kzalloc'ed). So there is no need to clear it another time here. Remove the superfluous memset() and add a comment to remind that the caller must clear the buffer. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-03-30MAINTAINERS: ksmbd: switch Sergey to reviewerNamjae Jeon
Sergey don't have the time to work ksmbd. He will continue to review ksmbd works at free time. This patch switches him from maintainer to reviewer. Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-03-30ksmbd: shorten experimental warning on loading the moduleSteve French
ksmbd is continuing to improve. Shorten the warning message logged the first time it is loaded to: "The ksmbd server is experimental" Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-03-30ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lockTakashi Iwai
syzbot caught a potential deadlock between the PCM runtime->buffer_mutex and the mm->mmap_lock. It was brought by the recent fix to cover the racy read/write and other ioctls, and in that commit, I overlooked a (hopefully only) corner case that may take the revert lock, namely, the OSS mmap. The OSS mmap operation exceptionally allows to re-configure the parameters inside the OSS mmap syscall, where mm->mmap_mutex is already held. Meanwhile, the copy_from/to_user calls at read/write operations also take the mm->mmap_lock internally, hence it may lead to a AB/BA deadlock. A similar problem was already seen in the past and we fixed it with a refcount (in commit b248371628aa). The former fix covered only the call paths with OSS read/write and OSS ioctls, while we need to cover the concurrent access via both ALSA and OSS APIs now. This patch addresses the problem above by replacing the buffer_mutex lock in the read/write operations with a refcount similar as we've used for OSS. The new field, runtime->buffer_accessing, keeps the number of concurrent read/write operations. Unlike the former buffer_mutex protection, this protects only around the copy_from/to_user() calls; the other codes are basically protected by the PCM stream lock. The refcount can be a negative, meaning blocked by the ioctls. If a negative value is seen, the read/write aborts with -EBUSY. In the ioctl side, OTOH, they check this refcount, too, and set to a negative value for blocking unless it's already being accessed. Reported-by: syzbot+6e5c88838328e99c7e1c@syzkaller.appspotmail.com Fixes: dca947d4d26d ("ALSA: pcm: Fix races among concurrent read/write and buffer changes") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/000000000000381a0d05db622a81@google.com Link: https://lore.kernel.org/r/20220330120903.4738-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-03-30Merge tag 'asoc-fix-v5.18' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v5.18 A few fixes that came in during the merge window, all fairly routine.
2022-03-30x86/fpu/xstate: Consolidate size calculationsThomas Gleixner
Use the offset calculation to do the size calculation which avoids yet another series of CPUID instructions for each invocation. [ Fix the FP/SSE only case which missed to take the xstate header into account, as Reported-by: kernel test robot <oliver.sang@intel.com> ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/87o81pgbp2.ffs@tglx
2022-03-30x86/fpu/xstate: Handle supervisor states in XSTATE permissionsThomas Gleixner
The size calculation in __xstate_request_perm() fails to take supervisor states into account because the permission bitmap is only relevant for user states. Up to 5.17 this does not matter because there are no supervisor states supported, but the (re-)enabling of ENQCMD makes them available. Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.681768598@linutronix.de
2022-03-30x86/fpu/xsave: Handle compacted offsets correctly with supervisor statesThomas Gleixner
So far the cached fixed compacted offsets worked, but with (re-)enabling of ENQCMD this does no longer work with KVM fpstate. KVM does not have supervisor features enabled for the guest FPU, which means that KVM has then a different XSAVE area layout than the host FPU state. This in turn breaks the copy from/to UABI functions when invoked for a guest state. Remove the pre-calculated compacted offsets and calculate the offset of each component at runtime based on the XCOMP_BV field in the XSAVE header. The runtime overhead is not interesting because these copy from/to UABI functions are not used in critical fast paths. KVM uses them to save and restore FPU state during migration. The host uses them for ptrace and for the slow path of 32bit signal handling. Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.627636809@linutronix.de
2022-03-30x86/fpu: Cache xfeature flags from CPUIDThomas Gleixner
In preparation for runtime calculation of XSAVE offsets cache the feature flags for each XSTATE component during feature enumeration via CPUID(0xD). EDX has two relevant bits: 0 Supervisor component 1 Feature storage must be 64 byte aligned These bits are currently only evaluated during init, but the alignment bit must be cached to make runtime calculation of XSAVE offsets efficient. Cache the full EDX content and use it for the existing alignment and supervisor checks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.573656209@linutronix.de
2022-03-30x86/fpu/xsave: Initialize offset/size cache earlyThomas Gleixner
Reading XSTATE feature information from CPUID over and over does not make sense. The information has to be cached anyway, so it can be done early. Prepare for runtime calculation of XSTATE offsets and allow consolidation of the size calculation functions in a later step. Rename the function while at it as it does not setup any features. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.519411939@linutronix.de
2022-03-30x86/fpu: Remove unused supervisor only offsetsThomas Gleixner
No users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220324134623.465066249@linutronix.de
2022-03-30ALSA: hda: Avoid unsol event during RPM suspendingMohan Kumar
There is a corner case with unsol event handling during codec runtime suspending state. When the codec runtime suspend call initiated, the codec->in_pm atomic variable would be 0, currently the codec runtime suspend function calls snd_hdac_enter_pm() which will just increments the codec->in_pm atomic variable. Consider unsol event happened just after this step and before snd_hdac_leave_pm() in the codec runtime suspend function. The snd_hdac_power_up_pm() in the unsol event flow in hdmi_present_sense_via_verbs() function would just increment the codec->in_pm atomic variable without calling pm_runtime_get_sync function. As codec runtime suspend flow is already in progress and in parallel unsol event is also accessing the codec verbs, as soon as codec suspend flow completes and clocks are switched off before completing the unsol event handling as both functions doesn't wait for each other. This will result in below errors [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching to polling mode: last cmd=0x505f2f57 [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5, last cmd=0x505f2f57 [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5, last cmd=0x505f2f57 To avoid this, the unsol event flow should not perform any codec verb related operations during RPM_SUSPENDING state. Signed-off-by: Mohan Kumar <mkumard@nvidia.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220329155940.26331-1-mkumard@nvidia.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-03-30ALSA: hda/realtek: Fix audio regression on Mi Notebook Pro 2020Kai-Heng Feng
Commit 5aec98913095 ("ALSA: hda/realtek - ALC236 headset MIC recording issue") is to solve recording issue met on AL236, by matching codec variant ALC269_TYPE_ALC257 and ALC269_TYPE_ALC256. This match can be too broad and Mi Notebook Pro 2020 is broken by the patch. Instead, use codec ID to be narrow down the scope, in order to make ALC256 unaffected. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215484 Fixes: 5aec98913095 ("ALSA: hda/realtek - ALC236 headset MIC recording issue") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Link: https://lore.kernel.org/r/20220330061335.1015533-1-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-03-29fs: fix fd table size alignment properlyLinus Torvalds
Jason Donenfeld reports that my commit 1c24a186398f ("fs: fd tables have to be multiples of BITS_PER_LONG") doesn't work, and the reason is an embarrassing brown-paper-bag bug. Yes, we want to align the number of fds to BITS_PER_LONG, and yes, the reason they might not be aligned is because the incoming 'max_fd' argument might not be aligned. But aligining the argument - while simple - will cause a "infinitely big" maxfd (eg NR_OPEN_MAX) to just overflow to zero. Which most definitely isn't what we want either. The obvious fix was always just to do the alignment last, but I had moved it earlier just to make the patch smaller and the code look simpler. Duh. It certainly made _me_ look simple. Fixes: 1c24a186398f ("fs: fd tables have to be multiples of BITS_PER_LONG") Reported-and-tested-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Fedor Pchelkin <aissur0002@gmail.com> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-30PCI: Remove the deprecated "pci-dma-compat.h" APIChristophe JAILLET
Now that all usages of the functions defined in "pci-dma-compat.h" have been removed, it is time to remove this file as well. In order not to break builds, move the "#include <linux/dma-mapping.h>" that was in "pci-dma-compat.h" into "include/linux/pci.h" Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-03-30crypto: x86/sm3 - Fixup SLSPeter Zijlstra
This missed the big asm update due to being merged through the crypto tree. Fixes: f94909ceb1ed ("x86: Prepare asm files for straight-line-speculation") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-03-29Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Alexei Starovoitov says: ==================== pull-request: bpf 2022-03-29 We've added 16 non-merge commits during the last 1 day(s) which contain a total of 24 files changed, 354 insertions(+), 187 deletions(-). The main changes are: 1) x86 specific bits of fprobe/rethook, from Masami and Peter. 2) ice/xsk fixes, from Maciej and Magnus. 3) Various small fixes, from Andrii, Yonghong, Geliang and others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Fix clang compilation errors ice: xsk: Fix indexing in ice_tx_xsk_pool() ice: xsk: Stop Rx processing when ntc catches ntu ice: xsk: Eliminate unnecessary loop iteration xsk: Do not write NULL in SW ring at allocation failure x86,kprobes: Fix optprobe trampoline to generate complete pt_regs x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs x86,rethook,kprobes: Replace kretprobe with rethook on x86 kprobes: Use rethook for kretprobe if possible bpftool: Fix generated code in codegen_asserts selftests/bpf: fix selftest after random: Urandom_read tracepoint removal bpf: Fix maximum permitted number of arguments check bpf: Sync comments for bpf_get_stack fprobe: Fix sparse warning for acccessing __rcu ftrace_hash fprobe: Fix smatch type mismatch warning bpf/bpftool: Add unprivileged_bpf_disabled check against value of 2 ==================== Link: https://lore.kernel.org/r/20220329234924.39053-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-29Merge tag 'nfs-for-5.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - Switch NFS to use readahead instead of the obsolete readpages. - Readdir fixes to improve cacheability of large directories when there are multiple readers and writers. - Readdir performance improvements when doing a seekdir() immediately after opening the directory (common when re-exporting NFS). - NFS swap improvements from Neil Brown. - Loosen up memory allocation to permit direct reclaim and write back in cases where there is no danger of deadlocking the writeback code or NFS swap. - Avoid sillyrename when the NFSv4 server claims to support the necessary features to recover the unlinked but open file after reboot. Bugfixes: - Patch from Olga to add a mount option to control NFSv4.1 session trunking discovery, and default it to being off. - Fix a lockup in nfs_do_recoalesce(). - Two fixes for list iterator variables being used when pointing to the list head. - Fix a kernel memory scribble when reading from a non-socket transport in /sys/kernel/sunrpc. - Fix a race where reconnecting to a server could leave the TCP socket stuck forever in the connecting state. - Patch from Neil to fix a shutdown race which can leave the SUNRPC transport timer primed after we free the struct xprt itself. - Patch from Xin Xiong to fix reference count leaks in the NFSv4.2 copy offload. - Sunrpc patch from Olga to avoid resending a task on an offlined transport. Cleanups: - Patches from Dave Wysochanski to clean up the fscache code" * tag 'nfs-for-5.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (91 commits) NFSv4/pNFS: Fix another issue with a list iterator pointing to the head NFS: Don't loop forever in nfs_do_recoalesce() SUNRPC: Don't return error values in sysfs read of closed files SUNRPC: Do not dereference non-socket transports in sysfs NFSv4.1: don't retry BIND_CONN_TO_SESSION on session error SUNRPC don't resend a task on an offlined transport NFS: replace usage of found with dedicated list iterator variable SUNRPC: avoid race between mod_timer() and del_timer_sync() pNFS/files: Ensure pNFS allocation modes are consistent with nfsiod pNFS/flexfiles: Ensure pNFS allocation modes are consistent with nfsiod NFSv4/pnfs: Ensure pNFS allocation modes are consistent with nfsiod NFS: Avoid writeback threads getting stuck in mempool_alloc() NFS: nfsiod should not block forever in mempool_alloc() SUNRPC: Make the rpciod and xprtiod slab allocation modes consistent SUNRPC: Fix unx_lookup_cred() allocation NFS: Fix memory allocation in rpc_alloc_task() NFS: Fix memory allocation in rpc_malloc() SUNRPC: Improve accuracy of socket ENOBUFS determination SUNRPC: Replace internal use of SOCKWQ_ASYNC_NOSPACE SUNRPC: Fix socket waits for write buffer space ...
2022-03-29xfs: drop async cache flushes from CIL commits.Dave Chinner
Jan Kara reported a performance regression in dbench that he bisected down to commit bad77c375e8d ("xfs: CIL checkpoint flushes caches unconditionally"). Whilst developing the journal flush/fua optimisations this cache was part of, it appeared to made a significant difference to performance. However, now that this patchset has settled and all the correctness issues fixed, there does not appear to be any significant performance benefit to asynchronous cache flushes. In fact, the opposite is true on some storage types and workloads, where additional cache flushes that can occur from fsync heavy workloads have measurable and significant impact on overall throughput. Local dbench testing shows little difference on dbench runs with sync vs async cache flushes on either fast or slow SSD storage, and no difference in streaming concurrent async transaction workloads like fs-mark. Fast NVME storage. From `dbench -t 30`, CIL scale: clients async sync BW Latency BW Latency 1 935.18 0.855 915.64 0.903 8 2404.51 6.873 2341.77 6.511 16 3003.42 6.460 2931.57 6.529 32 3697.23 7.939 3596.28 7.894 128 7237.43 15.495 7217.74 11.588 512 5079.24 90.587 5167.08 95.822 fsmark, 32 threads, create w/ 64 byte xattr w/32k logbsize create chown unlink async 1m41s 1m16s 2m03s sync 1m40s 1m19s 1m54s Slower SATA SSD storage: From `dbench -t 30`, CIL scale: clients async sync BW Latency BW Latency 1 78.59 15.792 83.78 10.729 8 367.88 92.067 404.63 59.943 16 564.51 72.524 602.71 76.089 32 831.66 105.984 870.26 110.482 128 1659.76 102.969 1624.73 91.356 512 2135.91 223.054 2603.07 161.160 fsmark, 16 threads, create w/32k logbsize create unlink async 5m06s 4m15s sync 5m00s 4m22s And on Jan's test machine: 5.18-rc8-vanilla 5.18-rc8-patched Amean 1 71.22 ( 0.00%) 64.94 * 8.81%* Amean 2 93.03 ( 0.00%) 84.80 * 8.85%* Amean 4 150.54 ( 0.00%) 137.51 * 8.66%* Amean 8 252.53 ( 0.00%) 242.24 * 4.08%* Amean 16 454.13 ( 0.00%) 439.08 * 3.31%* Amean 32 835.24 ( 0.00%) 829.74 * 0.66%* Amean 64 1740.59 ( 0.00%) 1686.73 * 3.09%* Performance and cache flush behaviour is restored to pre-regression levels. As such, we can now consider the async cache flush mechanism an unnecessary exercise in premature optimisation and hence we can now remove it and the infrastructure it requires completely. Fixes: bad77c375e8d ("xfs: CIL checkpoint flushes caches unconditionally") Reported-and-tested-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29xfs: shutdown during log recovery needs to mark the log shutdownDave Chinner
When a checkpoint writeback is run by log recovery, corruption propagated from the log can result in writeback verifiers failing and calling xfs_force_shutdown() from xfs_buf_delwri_submit_buffers(). This results in the mount being marked as shutdown, but the log does not get marked as shut down because: /* * If this happens during log recovery then we aren't using the runtime * log mechanisms yet so there's nothing to shut down. */ if (!log || xlog_in_recovery(log)) return false; If there are other buffers that then fail (say due to detecting the mount shutdown), they will now hang in xfs_do_force_shutdown() waiting for the log to shut down like this: __schedule+0x30d/0x9e0 schedule+0x55/0xd0 xfs_do_force_shutdown+0x1cd/0x200 ? init_wait_var_entry+0x50/0x50 xfs_buf_ioend+0x47e/0x530 __xfs_buf_submit+0xb0/0x240 xfs_buf_delwri_submit_buffers+0xfe/0x270 xfs_buf_delwri_submit+0x3a/0xc0 xlog_do_recovery_pass+0x474/0x7b0 ? do_raw_spin_unlock+0x30/0xb0 xlog_do_log_recovery+0x91/0x140 xlog_do_recover+0x38/0x1e0 xlog_recover+0xdd/0x170 xfs_log_mount+0x17e/0x2e0 xfs_mountfs+0x457/0x930 xfs_fs_fill_super+0x476/0x830 xlog_force_shutdown() always needs to mark the log as shut down, regardless of whether recovery is in progress or not, so that multiple calls to xfs_force_shutdown() during recovery don't end up waiting for the log to be shut down like this. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29xfs: xfs_trans_commit() path must check for log shutdownDave Chinner
If a shut races with xfs_trans_commit() and we have shut down the filesystem but not the log, we will still cancel the transaction. This can result in aborting dirty log items instead of committing and pinning them whilst the log is still running. Hence we can end up with dirty, unlogged metadata that isn't in the AIL in memory that can be flushed to disk via writeback clustering. This was discovered from a g/388 trace where an inode log item was having IO completed on it and it wasn't in the AIL, hence tripping asserts xfs_ail_check(). Inode cluster writeback started long after the filesystem shutdown started, and long after the transaction containing the dirty inode was aborted and the log item marked XFS_LI_ABORTED. The inode was seen as dirty and unpinned, so it was flushed. IO completion tried to remove the inode from the AIL, at which point stuff went bad: XFS (pmem1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xa3/0xf0 (fs/xfs/xfs_fsops.c:500). Shutting down filesystem. XFS: Assertion failed: in_ail, file: fs/xfs/xfs_trans_ail.c, line: 67 XFS (pmem1): Please unmount the filesystem and rectify the problem(s) Workqueue: xfs-buf/pmem1 xfs_buf_ioend_work RIP: 0010:assfail+0x27/0x2d Call Trace: <TASK> xfs_ail_check+0xa8/0x180 xfs_ail_delete_one+0x3b/0xf0 xfs_buf_inode_iodone+0x329/0x3f0 xfs_buf_ioend+0x1f8/0x530 xfs_buf_ioend_work+0x15/0x20 process_one_work+0x1ac/0x390 worker_thread+0x56/0x3c0 kthread+0xf6/0x120 ret_from_fork+0x1f/0x30 </TASK> xfs_trans_commit() needs to check log state for shutdown, not mount state. It cannot abort dirty log items while the log is still running as dirty items must remained pinned in memory until they are either committed to the journal or the log has shut down and they can be safely tossed away. Hence if the log has not shut down, the xfs_trans_commit() path must allow completed transactions to commit to the CIL and pin the dirty items even if a mount shutdown has started. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29xfs: xfs_do_force_shutdown needs to block racing shutdownsDave Chinner
When we call xfs_forced_shutdown(), the caller often expects the filesystem to be completely shut down when it returns. However, if we have racing xfs_forced_shutdown() calls, the first caller sets the mount shutdown flag then goes to shutdown the log. The second caller sees the mount shutdown flag and returns immediately - it does not wait for the log to be shut down. Unfortunately, xfs_forced_shutdown() is used in some places that expect it to completely shut down the filesystem before it returns (e.g. xfs_trans_log_inode()). As such, returning before the log has been shut down leaves us in a place where the transaction failed to complete correctly but we still call xfs_trans_commit(). This situation arises because xfs_trans_log_inode() does not return an error and instead calls xfs_force_shutdown() to ensure that the transaction being committed is aborted. Unfortunately, we have a race condition where xfs_trans_commit() needs to check xlog_is_shutdown() because it can't abort log items before the log is shut down, but it needs to use xfs_is_shutdown() because xfs_forced_shutdown() does not block waiting for the log to shut down. To fix this conundrum, first we make all calls to xfs_forced_shutdown() block until the log is also shut down. This means we can then safely use xfs_forced_shutdown() as a mechanism that ensures the currently running transaction will be aborted by xfs_trans_commit() regardless of the shutdown check it uses. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29xfs: log shutdown triggers should only shut down the logDave Chinner
We've got a mess on our hands. 1. xfs_trans_commit() cannot cancel transactions because the mount is shut down - that causes dirty, aborted, unlogged log items to sit unpinned in memory and potentially get written to disk before the log is shut down. Hence xfs_trans_commit() can only abort transactions when xlog_is_shutdown() is true. 2. xfs_force_shutdown() is used in places to cause the current modification to be aborted via xfs_trans_commit() because it may be impractical or impossible to cancel the transaction directly, and hence xfs_trans_commit() must cancel transactions when xfs_is_shutdown() is true in this situation. But we can't do that because of #1. 3. Log IO errors cause log shutdowns by calling xfs_force_shutdown() to shut down the mount and then the log from log IO completion. 4. xfs_force_shutdown() can result in a log force being issued, which has to wait for log IO completion before it will mark the log as shut down. If #3 races with some other shutdown trigger that runs a log force, we rely on xfs_force_shutdown() silently ignoring #3 and avoiding shutting down the log until the failed log force completes. 5. To ensure #2 always works, we have to ensure that xfs_force_shutdown() does not return until the the log is shut down. But in the case of #4, this will result in a deadlock because the log Io completion will block waiting for a log force to complete which is blocked waiting for log IO to complete.... So the very first thing we have to do here to untangle this mess is dissociate log shutdown triggers from mount shutdowns. We already have xlog_forced_shutdown, which will atomically transistion to the log a shutdown state. Due to internal asserts it cannot be called multiple times, but was done simply because the only place that could call it was xfs_do_force_shutdown() (i.e. the mount shutdown!) and that could only call it once and once only. So the first thing we do is remove the asserts. We then convert all the internal log shutdown triggers to call xlog_force_shutdown() directly instead of xfs_force_shutdown(). This allows the log shutdown triggers to shut down the log without needing to care about mount based shutdown constraints. This means we shut down the log independently of the mount and the mount may not notice this until it's next attempt to read or modify metadata. At that point (e.g. xfs_trans_commit()) it will see that the log is shutdown, error out and shutdown the mount. To ensure that all the unmount behaviours and asserts track correctly as a result of a log shutdown, propagate the shutdown up to the mount if it is not already set. This keeps the mount and log state in sync, and saves a huge amount of hassle where code fails because of a log shutdown but only checks for mount shutdowns and hence ends up doing the wrong thing. Cleaning up that mess is an exercise for another day. This enables us to address the other problems noted above in followup patches. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacksDave Chinner
Brian reported a null pointer dereference failure during unmount in xfs/006. He tracked the problem down to the AIL being torn down before a log shutdown had completed and removed all the items from the AIL. The failure occurred in this path while unmount was proceeding in another task: xfs_trans_ail_delete+0x102/0x130 [xfs] xfs_buf_item_done+0x22/0x30 [xfs] xfs_buf_ioend+0x73/0x4d0 [xfs] xfs_trans_committed_bulk+0x17e/0x2f0 [xfs] xlog_cil_committed+0x2a9/0x300 [xfs] xlog_cil_process_committed+0x69/0x80 [xfs] xlog_state_shutdown_callbacks+0xce/0xf0 [xfs] xlog_force_shutdown+0xdf/0x150 [xfs] xfs_do_force_shutdown+0x5f/0x150 [xfs] xlog_ioend_work+0x71/0x80 [xfs] process_one_work+0x1c5/0x390 worker_thread+0x30/0x350 kthread+0xd7/0x100 ret_from_fork+0x1f/0x30 This is processing an EIO error to a log write, and it's triggering a force shutdown. This causes the log to be shut down, and then it is running attached iclog callbacks from the shutdown context. That means the fs and log has already been marked as xfs_is_shutdown/xlog_is_shutdown and so high level code will abort (e.g. xfs_trans_commit(), xfs_log_force(), etc) with an error because of shutdown. The umount would have been blocked waiting for a log force completion inside xfs_log_cover() -> xfs_sync_sb(). The first thing for this situation to occur is for xfs_sync_sb() to exit without waiting for the iclog buffer to be comitted to disk. The above trace is the completion routine for the iclog buffer, and it is shutting down the filesystem. xlog_state_shutdown_callbacks() does this: { struct xlog_in_core *iclog; LIST_HEAD(cb_list); spin_lock(&log->l_icloglock); iclog = log->l_iclog; do { if (atomic_read(&iclog->ic_refcnt)) { /* Reference holder will re-run iclog callbacks. */ continue; } list_splice_init(&iclog->ic_callbacks, &cb_list); >>>>>> wake_up_all(&iclog->ic_write_wait); >>>>>> wake_up_all(&iclog->ic_force_wait); } while ((iclog = iclog->ic_next) != log->l_iclog); wake_up_all(&log->l_flush_wait); spin_unlock(&log->l_icloglock); >>>>>> xlog_cil_process_committed(&cb_list); } This wakes any thread waiting on IO completion of the iclog (in this case the umount log force) before shutdown processes all the pending callbacks. That means the xfs_sync_sb() waiting on a sync transaction in xfs_log_force() on iclog->ic_force_wait will get woken before the callbacks attached to that iclog are run. This results in xfs_sync_sb() returning an error, and so unmount unblocks and continues to run whilst the log shutdown is still in progress. Normally this is just fine because the force waiter has nothing to do with AIL operations. But in the case of this unmount path, the log force waiter goes on to tear down the AIL because the log is now shut down and so nothing ever blocks it again from the wait point in xfs_log_cover(). Hence it's a race to see who gets to the AIL first - the unmount code or xlog_cil_process_committed() killing the superblock buffer. To fix this, we just have to change the order of processing in xlog_state_shutdown_callbacks() to run the callbacks before it wakes any task waiting on completion of the iclog. Reported-by: Brian Foster <bfoster@redhat.com> Fixes: aad7272a9208 ("xfs: separate out log shutdown callback processing") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29xfs: shutdown in intent recovery has non-intent items in the AILDave Chinner
generic/388 triggered a failure in RUI recovery due to a corrupted btree record and the system then locked up hard due to a subsequent assert failure while holding a spinlock cancelling intents: XFS (pmem1): Corruption of in-memory data (0x8) detected at xfs_do_force_shutdown+0x1a/0x20 (fs/xfs/xfs_trans.c:964). Shutting down filesystem. XFS (pmem1): Please unmount the filesystem and rectify the problem(s) XFS: Assertion failed: !xlog_item_is_intent(lip), file: fs/xfs/xfs_log_recover.c, line: 2632 Call Trace: <TASK> xlog_recover_cancel_intents.isra.0+0xd1/0x120 xlog_recover_finish+0xb9/0x110 xfs_log_mount_finish+0x15a/0x1e0 xfs_mountfs+0x540/0x910 xfs_fs_fill_super+0x476/0x830 get_tree_bdev+0x171/0x270 ? xfs_init_fs_context+0x1e0/0x1e0 xfs_fs_get_tree+0x15/0x20 vfs_get_tree+0x24/0xc0 path_mount+0x304/0xba0 ? putname+0x55/0x60 __x64_sys_mount+0x108/0x140 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Essentially, there's dirty metadata in the AIL from intent recovery transactions, so when we go to cancel the remaining intents we assume that all objects after the first non-intent log item in the AIL are not intents. This is not true. Intent recovery can log new intents to continue the operations the original intent could not complete in a single transaction. The new intents are committed before they are deferred, which means if the CIL commits in the background they will get inserted into the AIL at the head. Hence if we shut down the filesystem while processing intent recovery, the AIL may have new intents active at the current head. Hence this check: /* * We're done when we see something other than an intent. * There should be no intents left in the AIL now. */ if (!xlog_item_is_intent(lip)) { #ifdef DEBUG for (; lip; lip = xfs_trans_ail_cursor_next(ailp, &cur)) ASSERT(!xlog_item_is_intent(lip)); #endif break; } in both xlog_recover_process_intents() and log_recover_cancel_intents() is simply not valid. It was valid back when we only had EFI/EFD intents and didn't chain intents, but it hasn't been valid ever since intent recovery could create and commit new intents. Given that crashing the mount task like this pretty much prevents diagnosing what went wrong that lead to the initial failure that triggered intent cancellation, just remove the checks altogether. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29xfs: aborting inodes on shutdown may need buffer lockDave Chinner
Most buffer io list operations are run with the bp->b_lock held, but xfs_iflush_abort() can be called without the buffer lock being held resulting in inodes being removed from the buffer list while other list operations are occurring. This causes problems with corrupted bp->b_io_list inode lists during filesystem shutdown, leading to traversals that never end, double removals from the AIL, etc. Fix this by passing the buffer to xfs_iflush_abort() if we have it locked. If the inode is attached to the buffer, we're going to have to remove it from the buffer list and we'd have to get the buffer off the inode log item to do that anyway. If we don't have a buffer passed in (e.g. from xfs_reclaim_inode()) then we can determine if the inode has a log item and if it is attached to a buffer before we do anything else. If it does have an attached buffer, we can lock it safely (because the inode has a reference to it) and then perform the inode abort. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29Merge tag 'jfs-5.18' of https://github.com/kleikamp/linux-shaggyLinus Torvalds
Pull jfs updates from Dave Kleikamp: "A couple bug fixes" * tag 'jfs-5.18' of https://github.com/kleikamp/linux-shaggy: jfs: prevent NULL deref in diFree jfs: fix divide error in dbNextAG
2022-03-29dt-bindings: net: qcom,ethqos: Document SM8150 SoC compatibleVinod Koul
SM8150 has an ethernet controller and it needs a different configuration, so add a new compatible for this. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Vinod Koul <vkoul@kernel.org> [bhsharma: Massage the commit log] Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Link: https://lore.kernel.org/r/20220325200731.1585554-1-bhupesh.sharma@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-29lib/test: use after free in register_test_dev_kmod()Dan Carpenter
The "test_dev" pointer is freed but then returned to the caller. Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>