linux.git - Linus' kernel tree

Age	Commit message	Author
2022-05-29	block: make bioset_exit() fully resilient against being called twice	Jens Axboe
	Most of bioset_exit() is fine being called twice, as it clears the various allocations etc. when they are freed. The exception is bio_alloc_cache_destroy(), which does not clear ->cache when it has freed it. This isn't necessarily a bug, but it can be if buggy users call the exit path more than once, or call it on a bioset that has only been memset() and never initialized. dm appears to be one such user. Fixes: be4d234d7aeb ("bio: add allocation cache abstraction") Link: https://lore.kernel.org/linux-block/YpK7m+14A+pZKs5k@casper.infradead.org/ Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
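The pattern described above, clearing a pointer after freeing it so that a repeated exit call becomes a no-op, can be sketched in plain userspace C. This is only an illustration: `struct pool`, `pool_cache_destroy()` and `pool_demo()` are hypothetical names, not the kernel's bioset code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure that owns an optional cache. */
struct pool {
    int *cache;   /* NULL when no cache is allocated */
};

/* Free the cache and clear the pointer so a repeated call is a no-op. */
static void pool_cache_destroy(struct pool *p)
{
    if (!p->cache)
        return;
    free(p->cache);
    p->cache = NULL;   /* without this, a second call would double-free */
}

/* Returns 0 if teardown is safe when repeated and on a zeroed struct. */
static int pool_demo(void)
{
    struct pool p;

    memset(&p, 0, sizeof(p));   /* never initialized, only zeroed */
    pool_cache_destroy(&p);     /* safe: cache is NULL */

    p.cache = malloc(16 * sizeof(*p.cache));
    if (!p.cache)
        return -1;
    pool_cache_destroy(&p);
    pool_cache_destroy(&p);     /* second call finds NULL and returns */
    return p.cache == NULL ? 0 : 1;
}
```

Calling `pool_demo()` exercises both cases named in the commit message: a structure that was only zeroed, and an exit path that runs twice.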
2022-05-29	Merge branch 'sfc-fixes'	David S. Miller
	Íñigo Huguet says: ==================== sfc: fix some efx_separate_tx_channels errors Trying to load the sfc driver with modparam efx_separate_tx_channels=1 resulted in errors during initialization and not being able to use the NIC. These patches fix a few bugs and make it work again. v2: * added Martin's patch instead of a previous one of mine. Mine solved some of the initialization errors, but Martin's solves them in all possible cases. * removed whitespace cleanup, as requested by Jakub ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-29	sfc: fix wrong tx channel offset with efx_separate_tx_channels	Íñigo Huguet
	tx_channel_offset is calculated in efx_allocate_msix_channels, but it is also calculated again in efx_set_channels because it was originally done there, and when efx_allocate_msix_channels was introduced it was forgotten to be removed from efx_set_channels. Moreover, the old calculation is wrong when using efx_separate_tx_channels because now we can have XDP channels after the TX channels, so n_channels - n_tx_channels doesn't point to the first TX channel. Remove the old calculation from efx_set_channels, and add the initialization of this variable if MSI or legacy interrupts are used, next to the initialization of the rest of the related variables, where it was missing. Fixes: 3990a8fffbda ("sfc: allocate channels for XDP tx queues") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-29	sfc: fix considering that all channels have TX queues	Martin Habets
	Normally, all channels have RX and TX queues, but this is not true if modparam efx_separate_tx_channels=1 is used. In that case, some channels only have RX queues and others only TX queues (or more precisely, they have them allocated, but not initialized). Fix efx_channel_has_tx_queues to return the correct value for this case too. Messages shown at probe time before the fix: sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0 ------------[ cut here ]------------ netdevice: ens6f0np0: failed to initialise TXQ -1 WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc] [...] stripped RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc] [...] stripped Call Trace: efx_init_tx_queue+0xaa/0xf0 [sfc] efx_start_channels+0x49/0x120 [sfc] efx_start_all+0x1f8/0x430 [sfc] efx_net_open+0x5a/0xe0 [sfc] __dev_open+0xd0/0x190 __dev_change_flags+0x1b3/0x220 dev_change_flags+0x21/0x60 [...] stripped Messages shown at remove time before the fix: sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues sfc 0000:03:00.0 ens6f0np0: failed to flush queues Fixes: 8700aff08984 ("sfc: fix channel allocation with brute force") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Tested-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-29	parisc: remove arch/parisc/nm	Masahiro Yamada
	Parisc overrides 'nm' with a shell script. I was hit by a false-positive error of $(NM) because this script returns the exit status of grep instead of ${CROSS_COMPILE}nm. (grep returns 1 if no lines were selected) I tried to fix it, but in the code review, Helge suggested to remove it entirely. [1] This script was added in 2003. [2] Presumably, it was a workaround for old toolchains (but even the parisc maintainer does not know the detail any more). Hopefully, recent tools should work fine. [1]: https://lore.kernel.org/all/1c12cd26-d8aa-4498-f4c0-29478b9578fe@gmx.de/ [2]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=36eaa6e4c0e0b6950136b956b72fd08155b92ca3 Suggested-by: Helge Deller <deller@gmx.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Helge Deller <deller@gmx.de>
2022-05-29	kbuild: do not create *.prelink.o for Clang LTO or IBT	Masahiro Yamada
	When CONFIG_LTO_CLANG=y, additional intermediate .prelink.o is created for each module. Also, objtool is postponed until LLVM IR is converted to ELF. CONFIG_X86_KERNEL_IBT works in a similar way to postpone objtool until objects are merged together. This commit stops generating .prelink.o, so the build flow will look similar with/without LTO. The following figures show how the LTO build currently works, and how this commit is changing it.

	Current build flow
	==================

	 [1] single-object module

	                                      $(LD)
	           $(CC)                     +objtool              $(LD)
	    foo.c --------------------> foo.o -----> foo.prelink.o -----> foo.ko
	                           (LLVM IR)          (ELF)       |      (ELF)
	                                                          |
	                                              foo.mod.o --/
	                                              (LLVM IR)

	 [2] multi-object module

	                                      $(LD)
	           $(CC)         $(AR)       +objtool               $(LD)
	    foo1.c -----> foo1.o -----> foo.o -----> foo.prelink.o -----> foo.ko
	                           |  (archive)       (ELF)       |      (ELF)
	    foo2.c -----> foo2.o --/                              |
	                (LLVM IR)                     foo.mod.o --/
	                                              (LLVM IR)

	 One confusion is that foo.o in multi-object module is an archive despite of its suffix.

	New build flow
	==============

	 [1] single-object module

	 Since there is only one object, there is no need to keep the LLVM IR. Use $(CC)+$(LD) to generate an ELF object in one build rule. When LTO is disabled, $(LD) is unneeded because $(CC) produces an ELF object.

	               $(CC)+$(LD)+objtool              $(LD)
	    foo.c ----------------------------> foo.o ---------> foo.ko
	                                        (ELF)     |      (ELF)
	                                                  |
	                                      foo.mod.o --/
	                                      (LLVM IR)

	 [2] multi-object module

	 Previously, $(AR) was used to combine LLVM IR files into an archive, but there was no technical reason to do so. Use $(LD) to merge them into a single ELF object.

	                               $(LD)
	             $(CC)            +objtool          $(LD)
	    foo1.c ---------> foo1.o ---------> foo.o ---------> foo.ko
	                                 |      (ELF)     |      (ELF)
	    foo2.c ---------> foo2.o ----/                |
	                   (LLVM IR)          foo.mod.o --/
	                                      (LLVM IR)

	Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64) Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
2022-05-29	kbuild: replace $(linked-object) with CONFIG options	Masahiro Yamada
	*.prelink.o is created when CONFIG_LTO_CLANG or CONFIG_X86_KERNEL_IBT is enabled. Replace $(linked-object) with $(CONFIG_LTO_CLANG)$(CONFIG_X86_KERNEL_IBT) so you will get a quick idea of when the --link option is passed. No functional change is intended. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14
2022-05-29	kbuild: do not try to parse *.cmd files for objects provided by compiler	Masahiro Yamada
	Guenter Roeck reported the build breakage for parisc and csky. I confirmed nios2 and openrisc are broken as well. The reason is that they borrow libgcc.a from the toolchains. For example, see this line in arch/parisc/Makefile: LIBGCC := $(shell $(CC) -print-libgcc-file-name) Some objects in libgcc.a are linked to vmlinux.o, but they do not have *.cmd files. Obviously, there is no EXPORT_SYMBOL in external objects. Ignore them. (Most of the architectures import library code into the kernel tree. Perhaps those 4 architectures can do similar, but I do not know how challenging it is.) Fixes: f292d875d0dc ("modpost: extract symbol versions from .cmd files") Link: https://lore.kernel.org/linux-kbuild/20220528224745.GA2501857@roeck-us.net/T/#mac65c20c71c3e272db0350ecfba53fcd8905b0a0 Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net>
2022-05-29	video: fbdev: omap: Add prototype for hwa742_update_window_async()	Helge Deller
	The symbol hwa742_update_window_async() is exported, but there is no prototype defined for it. That's why gcc complains: drivers-video-fbdev-omap-hwa742.c:warning:no-previous-prototype-for-hwa742_update_window_async Add the prototype, but I wonder if we couldn't drop exporting the symbol instead. Since omapfb_update_window_async() is exported the same way, are there any users outside of the tree? Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-29	erofs: update documentation	Gao Xiang
	- refine the filesystem overview for better description of recent new features like FSDAX and Fscache; - add the new `fsid' mount option; - fix some typos. Link: https://lore.kernel.org/r/20220527070133.77962-1-hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-29	erofs: fix crash when enable tracepoint cachefiles_prep_read	Xin Yin
	RIP: 0010:trace_event_raw_event_cachefiles_prep_read+0x88/0xe0 [cachefiles] Call Trace: <TASK> cachefiles_prepare_read+0x1d7/0x3a0 [cachefiles] erofs_fscache_read_folios+0x188/0x220 [erofs] erofs_fscache_meta_readpage+0x106/0x160 [erofs] do_read_cache_folio+0x42a/0x590 ? bdi_register_va.part.14+0x1a7/0x210 ? super_setup_bdi_name+0x76/0xe0 erofs_bread+0x5b/0x170 [erofs] erofs_fc_fill_super+0x12b/0xc50 [erofs] This tracepoint uses rreq->inode, should set it when allocating. Fixes: d435d53228dd ("erofs: change to use asynchronous io for fscache readpage/readahead") Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220527101800.22360-1-yinxin.x@bytedance.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-29	erofs: leave compressed inodes unsupported in fscache mode for now	Jeffle Xu
	erofs over fscache doesn't support the compressed layout yet. It will cause NULL crash if there are compressed inodes contained when working in fscache mode. So far in the erofs based container image distribution scenarios (RAFS v6), the compressed RAFS v6 images are downloaded and then decompressed on demand as an uncompressed erofs image. Then the erofs image is mounted in fscache mode for containers to use. IOWs, currently compressed data is decompressed on the userspace side instead and uncompressed erofs images will be finally cached. The fscache support for the compressed layout is still under development and it will be used for runtime decompression feature. Anyway, to avoid the potential crash, let's leave the compressed inodes unsupported in fscache mode until we support it later. Fixes: 1442b02b66ad ("erofs: implement fscache-based data read for non-inline layout") Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220526010344.118493-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-28	Merge tag 'input-for-v5.19-rc0' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - a new driver for the Azoteq IQS7222A/B/C capacitive touch controller - a new driver for Raspberry Pi Sense HAT joystick - sun4i-lradc-keys gained support of R329 and D1 variants, plus it can be now used as a wakeup source - pm8941-pwrkey can now properly handle PON GEN3 variants; the driver also implements software debouncing and has a workaround for missing key press events - assorted driver fixes and cleanups * tag 'input-for-v5.19-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (29 commits) Input: stmfts - do not leave device disabled in stmfts_input_open Input: gpio-keys - cancel delayed work only in case of GPIO Input: cypress_ps2 - fix typo in comment Input: vmmouse - disable vmmouse before entering suspend mode dt-bindings: google,cros-ec-keyb: Fixup bad compatible match Input: cros-ec-keyb - allow skipping keyboard registration dt-bindings: google,cros-ec-keyb: Introduce switches only compatible Input: psmouse-smbus - avoid flush_scheduled_work() usage Input: bcm-keypad - remove unneeded NULL check before clk_disable_unprepare Input: sparcspkr - fix refcount leak in bbc_beep_probe Input: sun4i-lradc-keys - add support for R329 and D1 Input: sun4i-lradc-keys - add optional clock/reset support dt-bindings: input: sun4i-lradc-keys: Add R329 and D1 compatibles Input: sun4i-lradc-keys - add wakeup support Input: pm8941-pwrkey - simulate missed key press events Input: pm8941-pwrkey - add software key press debouncing support Input: pm8941-pwrkey - add support for PON GEN3 base addresses Input: pm8941-pwrkey - fix error message Input: synaptics-rmi4 - remove unnecessary flush_workqueue() Input: ep93xx_keypad - use devm_platform_ioremap_resource() helper ...
2022-05-28	drm: fix EDID struct for old ARM OABI format	Linus Torvalds
	When building the kernel for arm with the "-mabi=apcs-gnu" option, gcc will force alignment of all structures and unions to a word boundary (see also STRUCTURE_SIZE_BOUNDARY and the "-mstructure-size-boundary=XX" option if you're a gcc person), even when the members of said structures do not want or need said alignment. This completely messes up the structure alignment of 'struct edid' on those targets, because even though all the embedded structures are marked with "__attribute__((packed))", the unions that contain them are not. This was exposed by commit f1e4c916f97f ("drm/edid: add EDID block count and size helpers"), but the bug is pre-existing. That commit just made the structure layout problem cause a build failure due to the addition of the BUILD_BUG_ON(sizeof(*edid) != EDID_LENGTH); sanity check in drivers/gpu/drm/drm_edid.c:edid_block_data(). This legacy union alignment should probably not be used in the first place, but we can fix the layout by adding the packed attribute to the union entries even when each member is already packed and it shouldn't matter in a sane build environment. You can see this issue with a trivial test program: union { struct { char c[5]; }; struct { char d; unsigned e; } __attribute__((packed)); } a = { "1234" }; where building this with a normal "gcc -S" will result in the expected 5-byte size of said union: .type a, @object .size a, 5 but with an ARM compiler and the old ABI: arm-linux-gnu-gcc -mabi=apcs-gnu -mfloat-abi=soft -S t.c you get .type a, %object .size a, 8 instead, because even though each member of the union is packed, the union itself still gets aligned. This was reported by Sudip for the spear3xx_defconfig target. 
Link: https://lore.kernel.org/lkml/YpCUzStDnSgQLNFN@debian/ Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
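The layout issue described in this commit message can be explored with a small C sketch. It follows the test program quoted in the message but gives the members names; the type names and the two size helpers are hypothetical. The attribute used is GCC's documented `__attribute__((packed))`, which Clang also accepts. The sketch assumes a 4-byte `unsigned`.

```c
#include <assert.h>
#include <stddef.h>

/* Both member structs are packed, as in the commit's test program. */
struct five_bytes { char c[5]; } __attribute__((packed));
struct char_uint  { char d; unsigned e; } __attribute__((packed));

/* Members packed, union itself not packed: its size depends on the ABI.
 * The commit message reports 8 bytes with -mabi=apcs-gnu on ARM. */
union members_packed {
    struct five_bytes a;
    struct char_uint  b;
};

/* Union packed as well: no trailing padding is added by the union. */
union fully_packed {
    struct five_bytes a;
    struct char_uint  b;
} __attribute__((packed));

static size_t members_packed_size(void)
{
    return sizeof(union members_packed);
}

static size_t fully_packed_size(void)
{
    return sizeof(union fully_packed);
}
```

On a common x86-64 toolchain both helpers return 5; the difference only shows up on ABIs that force a minimum structure alignment, which is why the fix is harmless in what the message calls a sane build environment.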
2022-05-28	net: enetc: Use pci_release_region() to release some resources	Christophe JAILLET
	Some resources are allocated using pci_request_region(). It is more straightforward to release them with pci_release_region(). Fixes: 231ece36f50d ("enetc: Add mdio bus driver for the PCIe MDIO endpoint") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-28	Merge tag 'hyperv-next-signed-20220528' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Harden hv_sock driver (Andrea Parri) - Harden Hyper-V PCI driver (Andrea Parri) - Fix multi-MSI for Hyper-V PCI driver (Jeffrey Hugo) - Fix Hyper-V PCI to reduce boot time (Dexuan Cui) - Remove code for long EOL'ed Hyper-V versions (Michael Kelley, Saurabh Sengar) - Fix balloon driver error handling (Shradha Gupta) - Fix a typo in vmbus driver (Julia Lawall) - Ignore vmbus IMC device (Michael Kelley) - Add a new error message to Hyper-V DRM driver (Saurabh Sengar) * tag 'hyperv-next-signed-20220528' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (28 commits) hv_balloon: Fix balloon_probe() and balloon_remove() error handling scsi: storvsc: Removing Pre Win8 related logic Drivers: hv: vmbus: fix typo in comment PCI: hv: Fix synchronization between channel callback and hv_pci_bus_exit() PCI: hv: Add validation for untrusted Hyper-V values PCI: hv: Fix interrupt mapping for multi-MSI PCI: hv: Reuse existing IRTE allocation in compose_msi_msg() drm/hyperv: Remove support for Hyper-V 2008 and 2008R2/Win7 video: hyperv_fb: Remove support for Hyper-V 2008 and 2008R2/Win7 scsi: storvsc: Remove support for Hyper-V 2008 and 2008R2/Win7 Drivers: hv: vmbus: Remove support for Hyper-V 2008 and Hyper-V 2008R2/Win7 x86/hyperv: Disable hardlockup detector by default in Hyper-V guests drm/hyperv: Add error message for fb size greater than allocated PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI Drivers: hv: vmbus: Refactor the ring-buffer iterator functions Drivers: hv: vmbus: Accept hv_sock offers in isolated guests hv_sock: Add validation for untrusted Hyper-V values hv_sock: Copy packets sent by Hyper-V out of the ring buffer hv_sock: Check hv_pkt_iter_first_raw()'s return value ...
2022-05-28	Merge tag 'powerpc-5.19-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later) - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later - Drop support for system call instruction emulation - Many other small features and fixes Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng. 
* tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/64: Include cache.h directly in paca.h powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set powerpc/xics: Include missing header powerpc/powernv/pci: Drop VF MPS fixup powerpc/fsl_book3e: Don't set rodata RO too early powerpc/microwatt: Add mmu bits to device tree powerpc/powernv/flash: Check OPAL flash calls exist before using powerpc/powermac: constify device_node in of_irq_parse_oldworld() powerpc/powermac: add missing g5_phy_disable_cpu1() declaration selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" powerpc: Enable the DAWR on POWER9 DD2.3 and above powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask powerpc: Fix all occurences of "the the" selftests/powerpc/pmu/ebb: remove fixed_instruction.S powerpc/platforms/83xx: Use of_device_get_match_data() powerpc/eeh: Drop redundant spinlock initialization powerpc/iommu: Add missing of_node_put in iommu_init_early_dart powerpc/pseries/vas: Call misc_deregister if sysfs init fails powerpc/papr_scm: Fix leaking nvdimm_events_map elements ...
2022-05-28	Merge tag 'pinctrl-v5.19-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control updates from Linus Walleij: "Pretty big this time. Mostly due to (nice) Renesas refactorings. Core changes: - New helpers from Andy such as for_each_gpiochip_node() affecting both GPIO and pin control, improving a bunch of drivers in the process. - Pulled in Marc Zyngier's work to make IRQ chips immutable, and started to apply fixups on top. New drivers: - New driver for Marvell MVEBU 98DX2530. - New driver for Mediatek MT8195. - Support Qualcomm PMX65 and PM6125. - New driver for Qualcomm SC7280 LPASS pin control. - New driver for Rockchip RK3588. - New driver for NXP Freescale i.MXRT1170. - New driver for Mediatek MT6795 Helio X10. Improvements: - Several Aspeed G6 cleanups and non-critical fixes. - Thorough refactoring of some of the ever improving Renesas drivers. - Clean up Mediatek MT8192 bindings a bit. - PWM output and clock monitoring in the Ocelot LAN966x driver. - Thorough refactoring and cleanup of the Ralink drivers such as RT2880, RT3883, RT305X, MT7620, MT7621, MT7628 splitting these into proper sub-drivers" * tag 'pinctrl-v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (161 commits) pinctrl: apple: Use a raw spinlock for the regmap pinctrl: berlin: bg4ct: Use devm_platform_*ioremap_resource() APIs pinctrl: intel: Fix kernel doc format, i.e. add return sections dt-bindings: pinctrl: qcom: Drop 'maxItems' on 'wakeup-parent' pinctrl: starfive: Make the irqchip immutable pinctrl: mediatek: Add pinctrl driver for MT6795 Helio X10 dt-bindings: pinctrl: Add MediaTek MT6795 pinctrl bindings pinctrl: freescale: Add i.MXRT1170 pinctrl driver support dt-bindings: pinctrl: add i.MXRT1170 pinctrl Documentation dt-bindings: pinctrl: rockchip: increase max amount of device functions dt-bindings: pinctrl: qcom,pmic-gpio: add 'gpio-reserved-ranges' dt-bindings: pinctrl: qcom,pmic-gpio: add 'input-disable' dt-bindings: pinctrl: qcom,pmic-gpio: describe gpio-line-names dt-bindings: pinctrl: qcom,pmic-gpio: fix matching pin config dt-bindings: pinctrl: qcom,pmic-gpio: document PM8150L and PMM8155AU pinctrl: qcom: spmi-gpio: Add pm6125 compatible dt-bindings: pinctrl: qcom-pmic-gpio: Add pm6125 compatible pinctrl: intel: Drop unused irqchip member in struct intel_pinctrl pinctrl: intel: make irq_chip immutable pinctrl: cherryview: Use GPIO chip pointer in chv_gpio_irq_mask_unmask() ...
2022-05-28	video: fbdev: vesafb: Fix a use-after-free due early fb_info cleanup	Javier Martinez Canillas
	Commit b3c9a924aab6 ("fbdev: vesafb: Cleanup fb_info in .fb_destroy rather than .remove") fixed a use-after-free error due to the vesafb driver freeing the fb_info in the .remove handler instead of doing it in .fb_destroy. This can happen if the .fb_destroy callback is executed after the .remove callback, since the former tries to access a pointer freed by the latter. But that change didn't take into account that another possible scenario is that .fb_destroy is called before the .remove callback, for example if no process has the fbdev chardev opened by the time the driver is removed. If that's the case, fb_info will be freed when unregister_framebuffer() is called, so the fb_info pointer accessed in vesafb_remove() after that is no longer valid. To prevent that, move the expression containing info->par so that it is evaluated before the unregister_framebuffer() function call. Fixes: b3c9a924aab6 ("fbdev: vesafb: Cleanup fb_info in .fb_destroy rather than .remove") Reported-by: Pascal Ernster <dri-devel@hardfalcon.net> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Pascal Ernster <dri-devel@hardfalcon.net> Signed-off-by: Helge Deller <deller@gmx.de>
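The ordering fix described above, reading everything needed from an object before calling a function that may free it, can be sketched in userspace C. All names here (`struct fb_info_like`, `unregister_fb()`, `remove_device()`) are hypothetical stand-ins and not the vesafb driver's code; the unregister helper always frees, which models the case where no process holds the device open.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical private data hanging off the info structure. */
struct par { int region_id; };
struct fb_info_like { struct par *par; };

/* Stand-in for an unregister call that drops the last reference,
 * so the info structure is freed before it returns. */
static void unregister_fb(struct fb_info_like *info)
{
    free(info->par);
    free(info);
}

/* Copy what is needed from info->par BEFORE unregistering. */
static int remove_device(struct fb_info_like *info)
{
    int region_id = info->par->region_id;   /* read while info is valid */

    unregister_fb(info);
    /* info and info->par must not be dereferenced past this point */
    return region_id;
}

/* Builds an object, removes it, and returns the value read in time. */
static int remove_demo(void)
{
    struct fb_info_like *info = malloc(sizeof(*info));

    if (!info)
        return -1;
    info->par = malloc(sizeof(*info->par));
    if (!info->par) {
        free(info);
        return -1;
    }
    info->par->region_id = 42;
    return remove_device(info);
}
```

Swapping the two statements in `remove_device()` would reproduce the bug class: the read would touch memory that has already been released.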
2022-05-28	Revert "crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE"	Jason A. Donenfeld
	This reverts commit 8bdc2a190105e862dfe7a4033f2fd385b7e58ae8. It got merged a bit prematurely and shortly after the kernel test robot and Sudip pointed out build failures: arm: imx_v6_v7_defconfig and multi_v7_defconfig mips: decstation_64_defconfig, decstation_defconfig, decstation_r4k_defconfig In file included from crypto/chacha20poly1305.c:13: include/crypto/poly1305.h:56:46: error: 'CONFIG_CRYPTO_LIB_POLY1305_RSIZE' undeclared here (not in a function); did you mean 'CONFIG_CRYPTO_POLY1305_MODULE'? 56 \| struct poly1305_key opaque_r[CONFIG_CRYPTO_LIB_POLY1305_RSIZE]; \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We could attempt to fix this by listing the dependencies piecemeal, but it's not as obvious as it looks: drivers like caam use this macro in headers even if there's no .o compiled in that makes use of it. So actually fixing this might require a bit more of a comprehensive approach, rather than whack-a-mole with hunting down which drivers use which headers which use this macro. Therefore, this commit just reverts the change, and maybe the problem can be visited on the next rainy day. Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Fixes: 8bdc2a190105 ("crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-28	bonding: NS target should accept link local address	Hangbin Liu
	When setting the bond NS target, we use bond_is_ip6_target_ok() to check whether the address is valid. The link local address was wrongly rejected in bond_changelink(): most of the time the user just sets the ARP/NS target to the gateway, and the IPv6 gateway is always a link local address when the user sets up the interface via SLAAC. So remove the link local addr check when setting the bond NS target. Fixes: 129e3c1bab24 ("bonding: add new option ns_ip6_target") Reported-by: Li Liang <liali@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-28	net: nfc: Directly use ida_alloc()/free()	keliu
	Use ida_alloc()/ida_free() instead of the deprecated ida_simple_get()/ida_simple_remove(). Signed-off-by: keliu <liuke94@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-28	ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function	Steven Rostedt (Google)
	If an unused weak function was traced, its call to fentry will still exist, which gets added into the __mcount_loc table. Ftrace will use kallsyms to retrieve the name for each location in __mcount_loc to display it in the available_filter_functions and used to enable functions via the name matching in set_ftrace_filter/notrace. Enabling these functions does nothing but enable an unused call to ftrace_caller. If a traced weak function is overridden, the symbol of the function would be used for it, which will either create duplicate names, or if the previous function was not traced, it would incorrectly be listed in available_filter_functions as a function that can be traced. This became an issue with BPF[1] as there is tooling that enables the direct callers via ftrace but then checks to see if the functions were actually enabled. One case was a function that was marked notrace, but was followed by an unused weak function that was traced. The unused function's call to fentry was added to the __mcount_loc section, and kallsyms retrieved the untraced function's symbol as the weak function was overridden. Since the untraced function would not get traced, the BPF check would detect this and fail. The real fix would be to fix kallsyms to not show addresses of weak functions as the function before it. But that would require adding code in the build to add function size to kallsyms so that it can know when the function ends instead of just using the start of the next known symbol. In the meantime, this is a workaround. Add a FTRACE_MCOUNT_MAX_OFFSET macro that, if defined, makes ftrace ignore any function whose call to fentry/mcount has an offset from the symbol that is greater than FTRACE_MCOUNT_MAX_OFFSET. If CONFIG_HAVE_FENTRY is defined for x86, define FTRACE_MCOUNT_MAX_OFFSET to zero (unless IBT is enabled), which will have ftrace ignore all locations that are not at the start of the function (or one after the ENDBR instruction). 
A worker thread is added at boot up to scan all the ftrace record entries, and will mark any that fail the FTRACE_MCOUNT_MAX_OFFSET test as disabled. They will still appear in the available_filter_functions file as: __ftrace_invalid_address___<invalid-offset> (showing the offset that caused it to be invalid). This is required for tools that use libtracefs (like trace-cmd does) that scan the available_filter_functions and enable set_ftrace_filter and set_ftrace_notrace using indexes of the function listed in the file (this is a speedup, as enabling thousands of files via names is an O(n^2) operation and can take minutes to complete, where the indexing takes less than a second). The invalid functions cannot be removed from available_filter_functions as the names there correspond to the ftrace records in the array that manages them (and the indexing depends on this). [1] https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/ Link: https://lkml.kernel.org/r/20220526141912.794c2786@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
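The offset test at the heart of this workaround can be sketched as a small predicate. This is a hedged illustration only: `MCOUNT_MAX_OFFSET` and `record_is_valid()` are hypothetical names, the value 0 mirrors the x86 case without IBT described in the message, and the real ftrace code resolves the symbol start through kallsyms rather than taking it as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit; 0 means the fentry call must sit exactly at
 * the start of the symbol, as described for x86 without IBT. */
#define MCOUNT_MAX_OFFSET 0u

/*
 * rec_ip:    address of the recorded fentry/mcount call site
 * sym_start: start address of the symbol the lookup returned for rec_ip
 *
 * A call site far past the start of the returned symbol most likely
 * belongs to an overridden weak function whose own symbol is gone.
 */
static bool record_is_valid(uintptr_t rec_ip, uintptr_t sym_start)
{
    if (rec_ip < sym_start)
        return false;
    return rec_ip - sym_start <= MCOUNT_MAX_OFFSET;
}
```

A record at `0x1000` inside a symbol starting at `0x1000` passes, while a record at `0x1040` attributed to the same symbol fails and would be marked disabled rather than removed, which keeps the index-based interface described above stable.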
2022-05-28	bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()	Coly Li
	The kworker routine update_writeback_rate() is scheduled to update the writeback rate every 5 seconds by default. Before calling __update_writeback_rate() to do the real job, semaphore dc->writeback_lock should be held by the kworker routine. At the same time, bcache writeback thread routine bch_writeback_thread() also needs to hold dc->writeback_lock before flushing dirty data back into the backing device. If the dirty data set is large, it might take a very long time for bch_writeback_thread() to scan all dirty buckets and release dc->writeback_lock. In such a case update_writeback_rate() can be starved for long enough that the kernel reports a soft lockup warning starting like: watchdog: BUG: soft lockup - CPU#246 stuck for 23s! [kworker/246:31:179713] Such a soft lockup condition is unnecessary, because after the writeback thread finishes its job and releases dc->writeback_lock, the kworker update_writeback_rate() may continue to work and everything is fine indeed. This patch avoids the unnecessary soft lockup by the following method, - Add new member to struct cached_dev - dc->rate_update_retry (0 by default) - In update_writeback_rate() call down_read_trylock(&dc->writeback_lock) first; if it fails then lock contention happens. - If dc->rate_update_retry <= BCH_WBRATE_UPDATE_MAX_SKIPS (15), don't acquire the lock and reschedule the kworker for the next try. - If dc->rate_update_retry > BCH_WBRATE_UPDATE_MAX_SKIPS, no retry anymore and call down_read(&dc->writeback_lock) to wait for the lock. By the above method, in the worst case update_writeback_rate() may retry for 1+ minutes before blocking on dc->writeback_lock by calling down_read(). For a 4TB cache device with 1TB dirty data, 90%+ of the unnecessary soft lockup warning messages can be avoided. When retrying to acquire dc->writeback_lock in update_writeback_rate(), of course the writeback rate cannot be updated. 
It is fair, because when the kworker is blocked on the lock contention of dc->writeback_lock, the writeback rate cannot be updated either. This change follows Jens Axboe's suggestion to a more clear and simple version. Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20220528124550.32834-2-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
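The retry policy described in the bcache commit above can be sketched as a small user-space decision function. This is only an illustration under stated assumptions, not the kernel patch: `wb_rate_next_action()`, `enum wb_action` and `MAX_SKIPS` are hypothetical names, the trylock result is passed in as a boolean instead of calling down_read_trylock(), and the exact point at which the counter is incremented is a guess.

```c
#include <stdbool.h>

/* Hypothetical stand-in for BCH_WBRATE_UPDATE_MAX_SKIPS (15 in the commit text). */
#define MAX_SKIPS 15u

/* What the periodic worker should do on this tick (names are illustrative). */
enum wb_action {
	WB_UPDATE_NOW,          /* trylock succeeded: update the rate */
	WB_SKIP_AND_RESCHEDULE, /* lock contended: skip this round, try again later */
	WB_BLOCK_THEN_UPDATE,   /* skipped too often: wait for the lock */
};

/*
 * Decide the next step from the trylock result and the retry counter.
 * `retry` models dc->rate_update_retry; when it is incremented is an
 * assumption of this sketch.
 */
enum wb_action wb_rate_next_action(bool trylock_ok, unsigned int *retry)
{
	if (trylock_ok) {
		*retry = 0;
		return WB_UPDATE_NOW;
	}
	if (*retry <= MAX_SKIPS) {
		(*retry)++;
		return WB_SKIP_AND_RESCHEDULE;
	}
	*retry = 0;
	return WB_BLOCK_THEN_UPDATE;
}
```

In this sketch a worker that always finds the lock contended skips 16 rounds before it blocks; at the default 5-second interval that is roughly 80 seconds, in line with the "1+ minutes" quoted in the commit message.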
2022-05-28	blk-mq: remove the done argument to blk_execute_rq_nowait	Christoph Hellwig
	Let the caller set it together with the end_io_data instead of passing a pointless argument. Note that the target code did in fact already set it and then just overrode it again by calling blk_execute_rq_nowait. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220524121530.943123-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-28	blk-mq: avoid a mess of casts for blk_end_sync_rq	Christoph Hellwig
	Instead of trying to cast a __bitwise 32-bit integer to a larger integer and then a pointer, just allow a struct with the blk_status_t and the completion on stack and set the end_io_data to that. Use the opportunity to move the code to where it belongs and drop rather confusing comments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220524121530.943123-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
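The pattern described in the commit above, passing a pointer to an on-stack struct through end_io_data instead of squeezing a status code through pointer casts, might look roughly like this in plain C. Everything here is a hypothetical user-space model: `struct fake_request`, `struct sync_wait`, `status_t` and `execute_sync()` are invented for illustration, and a boolean stands in for the kernel's struct completion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t status_t; /* hypothetical stand-in for blk_status_t */

/* Hypothetical, heavily reduced model of a request with a completion hook. */
struct fake_request {
	void (*end_io)(struct fake_request *rq, status_t status);
	void *end_io_data;
};

/* Lives on the submitter's stack; carries both the "done" signal and the status. */
struct sync_wait {
	bool done;    /* stand-in for a struct completion */
	status_t ret; /* status reported by the completion handler */
};

/* Completion handler: no integer/pointer casts, just a typed struct. */
void end_sync_rq(struct fake_request *rq, status_t status)
{
	struct sync_wait *wait = rq->end_io_data;

	wait->ret = status;
	wait->done = true;
}

/* Submit and "wait"; the completion is simulated inline in this sketch. */
status_t execute_sync(struct fake_request *rq, status_t simulated_status)
{
	struct sync_wait wait = { .done = false, .ret = 0 };
	status_t ret;

	rq->end_io = end_sync_rq;
	rq->end_io_data = &wait;

	/* A real driver would complete asynchronously and the caller would sleep. */
	rq->end_io(rq, simulated_status);

	ret = wait.done ? wait.ret : (status_t)-1;
	rq->end_io_data = NULL; /* do not leave a dangling stack pointer behind */
	return ret;
}
```

The design point is that the handler receives a properly typed object, so the status never has to round-trip through a pointer-sized integer.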
2022-05-28	blk-mq: remove __blk_execute_rq_nowait	Christoph Hellwig
	We don't want to plug for synchronous execution where we immediately wait for the request. Once that is done not a whole lot of code is shared, so just remove __blk_execute_rq_nowait. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220524121530.943123-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-28	block: use bio_queue_enter instead of blk_queue_enter in bio_poll	Christoph Hellwig
	We want to have a valid live gendisk to call ->poll and not just a request_queue, so call the right helper. Fixes: 3e08773c3841 ("block: switch polling to be bio based") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220523124302.526186-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-28	nfp: only report pause frame configuration for physical device	Yu Xiao
	Only report pause frame configuration for physical device. Logical ports of both PCI PF and PCI VF do not support it. Fixes: 9fdc5d85a8fe ("nfp: update ethtool reporting of pauseframe control") Signed-off-by: Yu Xiao <yu.xiao@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-28	net: dpaa: Convert to SPDX identifiers	Sean Anderson
	This converts these files to use SPDX identifiers instead of license text. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-28	tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd	Eric Dumazet
	syzbot got a new report [1] finally pointing to a very old bug, added in initial support for MTU probing. tcp_mtu_probe() has checks about starting an MTU probe if tcp_snd_cwnd(tp) >= 11. But nothing prevents tcp_snd_cwnd(tp) to be reduced later and before the MTU probe succeeds. This bug would lead to potential zero-divides. Debugging added in commit 40570375356c ("tcp: add accessors to read/set tp->snd_cwnd") has paid off :) While we are at it, address potential overflows in this code. [1] WARNING: CPU: 1 PID: 14132 at include/net/tcp.h:1219 tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712 Modules linked in: CPU: 1 PID: 14132 Comm: syz-executor.2 Not tainted 5.18.0-syzkaller-07857-gbabf0bb978e3 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_snd_cwnd_set include/net/tcp.h:1219 [inline] RIP: 0010:tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712 Code: 74 08 48 89 ef e8 da 80 17 f9 48 8b 45 00 65 48 ff 80 80 03 00 00 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 aa b0 c5 f8 <0f> 0b e9 16 fe ff ff 48 8b 4c 24 08 80 e1 07 38 c1 0f 8c c7 fc ff RSP: 0018:ffffc900079e70f8 EFLAGS: 00010287 RAX: ffffffff88c0f7f6 RBX: ffff8880756e7a80 RCX: 0000000000040000 RDX: ffffc9000c6c4000 RSI: 0000000000031f9e RDI: 0000000000031f9f RBP: 0000000000000000 R08: ffffffff88c0f606 R09: ffffc900079e7520 R10: ffffed101011226d R11: 1ffff1101011226c R12: 1ffff1100eadcf50 R13: ffff8880756e72c0 R14: 1ffff1100eadcf89 R15: dffffc0000000000 FS: 00007f643236e700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ab3f1e2a0 CR3: 0000000064fe7000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_clean_rtx_queue+0x223a/0x2da0 net/ipv4/tcp_input.c:3356 tcp_ack+0x1962/0x3c90 net/ipv4/tcp_input.c:3861 tcp_rcv_established+0x7c8/0x1ac0 
net/ipv4/tcp_input.c:5973 tcp_v6_do_rcv+0x57b/0x1210 net/ipv6/tcp_ipv6.c:1476 sk_backlog_rcv include/net/sock.h:1061 [inline] __release_sock+0x1d8/0x4c0 net/core/sock.c:2849 release_sock+0x5d/0x1c0 net/core/sock.c:3404 sk_stream_wait_memory+0x700/0xdc0 net/core/stream.c:145 tcp_sendmsg_locked+0x111d/0x3fc0 net/ipv4/tcp.c:1410 tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1448 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] __sys_sendto+0x439/0x5c0 net/socket.c:2119 __do_sys_sendto net/socket.c:2131 [inline] __se_sys_sendto net/socket.c:2127 [inline] __x64_sys_sendto+0xda/0xf0 net/socket.c:2127 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f6431289109 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f643236e168 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f643139c100 RCX: 00007f6431289109 RDX: 00000000d0d0c2ac RSI: 0000000020000080 RDI: 000000000000000a RBP: 00007f64312e308d R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff372533af R14: 00007f643236e300 R15: 0000000000022000 Fixes: 5d424d5a674f ("[TCP]: MTU probing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
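The arithmetic hazard behind the TCP commit above can be shown with a small standalone function. This is a sketch under assumptions, not the kernel fix: `cwnd_after_probe()` and its parameters are hypothetical names, and the idea that the window is rescaled by the ratio of the old MTU to the probe size is inferred from the commit text. The certain part is plain integer arithmetic: a small window multiplied by a ratio below one truncates to zero, and a 32-bit product can overflow.

```c
#include <stdint.h>

/*
 * Rescale a congestion window after the path MTU grows from old_mtu to
 * probe_size. Uses a 64-bit intermediate so the product cannot overflow,
 * and clamps the result to [1, UINT32_MAX] so it can never become zero.
 * All names are illustrative.
 */
uint32_t cwnd_after_probe(uint32_t cwnd, uint32_t old_mtu, uint32_t probe_size)
{
	uint64_t val;

	if (probe_size == 0)            /* nothing sensible to divide by */
		return cwnd ? cwnd : 1;

	val = (uint64_t)cwnd * old_mtu / probe_size;
	if (val < 1)                    /* 1 * 1000 / 1500 would truncate to 0 */
		val = 1;
	if (val > UINT32_MAX)
		val = UINT32_MAX;
	return (uint32_t)val;
}
```

For example, `cwnd_after_probe(1, 1000, 1500)` yields 1 where an unclamped 32-bit computation would yield 0.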
2022-05-28	net: phy: Directly use ida_alloc()/free()	Ke Liu
	Use ida_alloc()/ida_free() instead of deprecated ida_simple_get()/ida_simple_remove(). Signed-off-by: Ke Liu <liuke94@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-28	net/smc: fixes for converting from "struct smc_cdc_tx_pend **" to "struct ↵	Guangguan Wang
	smc_wr_tx_pend_priv *" "struct smc_cdc_tx_pend **" can not directly convert to "struct smc_wr_tx_pend_priv *". Fixes: 2bced6aefa3d ("net/smc: put slot when connection is killed") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-28	riscv: read-only pages should not be writable	Heinrich Schuchardt
	If EFI pages are marked as read-only, we should remove the _PAGE_WRITE flag. The current code overwrites an unused value. Fixes: b91540d52a08b ("RISC-V: Add EFI runtime services") Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Link: https://lore.kernel.org/r/20220528014132.91052-1-heinrich.schuchardt@canonical.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
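The kind of mistake the riscv commit above describes, a computed value that is overwritten before it is used, can be illustrated with two small functions. The flag values and function names below are hypothetical, and the "buggy" shape is only a guess at what such code might look like, not a copy of the kernel source.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this example. */
#define PAGE_READ  (UINT64_C(1) << 1)
#define PAGE_WRITE (UINT64_C(1) << 2)

/* Buggy shape: the second assignment restarts from pte, so the cleared
 * write bit computed on the first line is thrown away. */
uint64_t make_read_only_buggy(uint64_t pte)
{
	uint64_t val = pte & ~PAGE_WRITE;

	val = pte | PAGE_READ; /* overwrites the unused value above */
	return val;
}

/* Fixed shape: keep building on val so both changes survive. */
uint64_t make_read_only(uint64_t pte)
{
	uint64_t val = pte & ~PAGE_WRITE;

	val |= PAGE_READ;
	return val;
}
```

With a writable entry as input, the buggy variant still returns the write bit set, while the fixed variant returns a readable, non-writable entry.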
2022-05-28	pcmcia: Use platform_get_irq() to get the interrupt	Minghao Chi
	It is not recommended to use platform_get_resource(pdev, IORESOURCE_IRQ) for requesting IRQ's resources any more, as they can be not ready yet in case of DT-booting. platform_get_irq() instead is a recommended way for getting IRQ even if it was not retrieved earlier. It also makes code simpler because we're getting "int" value right away and no conversion from resource to int is required. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2022-05-28	MAINTAINERS: Update Xen maintainership	Boris Ostrovsky
	Due to time constraints I am stepping down as maintainer. I will stay as reviewer for x86 code (for which I create a separate category). Stefano is now maintainer for Xen hypervisor interface and Oleksandr has graciously agreed to become a reviewer. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Juergen Gross <jgross@suse.com> Acked-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Link: https://lore.kernel.org/r/1653674225-10447-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-27	Merge tag 'cxl-for-5.19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl updates from Dan Williams: "Compute Express Link (CXL) updates for this cycle. The highlight is new driver-core infrastructure and CXL subsystem changes for allowing lockdep to validate device_lock() usage. Thanks to PeterZ for setting me straight on the current capabilities of the lockdep API, and Greg acked it as well. On the CXL ACPI side this update adds support for CXL _OSC so that platform firmware knows that it is safe to still grant Linux native control of PCIe hotplug and error handling in the presence of CXL devices. A circular dependency problem was discovered between suspend and CXL memory for cases where the suspend image might be stored in CXL memory where that image also contains the PCI register state to restore to re-enable the device. Disable suspend for now until an architecture is defined to clarify that conflict. Lastly a collection of reworks, fixes, and cleanups to the CXL subsystem where support for snooping mailbox commands and properly handling the "mem_enable" flow are the highlights. Summary: - Add driver-core infrastructure for lockdep validation of device_lock(), and fixup a deadlock report that was previously hidden behind the 'lockdep no validate' policy. - Add CXL _OSC support for claiming native control of CXL hotplug and error handling. - Disable suspend in the presence of CXL memory unless and until a protocol is identified for restoring PCI device context from memory hosted on CXL PCI devices. - Add support for snooping CXL mailbox commands to protect against inopportune changes, like set-partition with the 'immediate' flag set. - Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC / 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL HDM Capability). 
- Miscellaneous cleanups and fixes from -next exposure" * tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits) cxl/port: Enable HDM Capability after validating DVSEC Ranges cxl/port: Reuse 'struct cxl_hdm' context for hdm init cxl/port: Move endpoint HDM Decoder Capability init to port driver cxl/pci: Drop @info argument to cxl_hdm_decode_init() cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init() cxl/mem: Skip range enumeration if mem_enable clear cxl/mem: Consolidate CXL DVSEC Range enumeration in the core cxl/pci: Move cxl_await_media_ready() to the core cxl/mem: Validate port connectivity before dvsec ranges cxl/mem: Fix cxl_mem_probe() error exit cxl/pci: Drop wait_for_valid() from cxl_await_media_ready() cxl/pci: Consolidate wait_for_media() and wait_for_media_ready() cxl/mem: Drop mem_enabled check from wait_for_media() nvdimm: Fix firmware activation deadlock scenarios device-core: Kill the lockdep_mutex nvdimm: Drop nd_device_lock() ACPI: NFIT: Drop nfit_device_lock() nvdimm: Replace lockdep_mutex with local lock classes cxl: Drop cxl_device_lock() cxl/acpi: Add root device lockdep validation ...
2022-05-27	nbd: use pr_err to output error message	Yu Kuai
	Instead of using the long printk(KERN_ERR "nbd: ...") to output error messages, define pr_fmt and use the short pr_err("") to do that. The replacement is done by using the following command: sed -i 's/printk(KERN_ERR "nbd: /pr_err("/g' \ drivers/block/nbd.c This patch also rewraps to 80 columns where possible. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220521073749.3146892-7-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27	nbd: fix possible overflow on 'first_minor' in nbd_dev_add()	Zhang Wensheng
	When 'index' is a big number, it may become negative when forced to 'int'. Then 'index << part_shift' might overflow to a positive value that is not greater than '0xfffff', and sysfs might complain about duplicate creation. Moving the 'index' check to the front fixes this. Fixes: b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices") Fixes: 940c264984fd ("nbd: fix possible overflow for 'first_minor' in nbd_dev_add()") Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220521073749.3146892-6-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
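The ordering problem in the nbd commit above, validating a shifted value after the shift has already wrapped, can be reproduced in user space. The sketch below is an assumption-laden model and not the driver code: `first_minor_for()`, `naive_first_minor()` and `MINOR_MAX` are hypothetical names, and unsigned 32-bit arithmetic is used so that the wraparound is well defined in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* The 20-bit limit quoted in the commit message. */
#define MINOR_MAX UINT32_C(0xfffff)

/* Naive order: shift first. A large index wraps to a small, valid-looking value. */
uint32_t naive_first_minor(uint32_t index, unsigned int part_shift)
{
	return index << part_shift;
}

/*
 * Safer order: range-check the index before shifting, so the shift itself
 * can never wrap. Returns false when no valid first minor exists.
 */
bool first_minor_for(int64_t index, unsigned int part_shift, uint32_t *first_minor)
{
	if (part_shift > 20)
		return false;
	if (index < 0 || index > (int64_t)(MINOR_MAX >> part_shift))
		return false;

	*first_minor = (uint32_t)index << part_shift;
	return true;
}
```

With a shift of 5, the index 0x08000001 wraps to 32 in the naive version, the same value a legitimate index of 1 produces, which is the sort of collision that would surface as a duplicate-creation complaint. The checked version rejects that index outright.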
2022-05-27	nbd: fix io hung while disconnecting device	Yu Kuai
	In our tests, "qemu-nbd" triggers an I/O hang: INFO: task qemu-nbd:11445 blocked for more than 368 seconds. Not tainted 5.18.0-rc3-next-20220422-00003-g2176915513ca #884 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:qemu-nbd state:D stack: 0 pid:11445 ppid: 1 flags:0x00000000 Call Trace: <TASK> __schedule+0x480/0x1050 ? _raw_spin_lock_irqsave+0x3e/0xb0 schedule+0x9c/0x1b0 blk_mq_freeze_queue_wait+0x9d/0xf0 ? ipi_rseq+0x70/0x70 blk_mq_freeze_queue+0x2b/0x40 nbd_add_socket+0x6b/0x270 [nbd] nbd_ioctl+0x383/0x510 [nbd] blkdev_ioctl+0x18e/0x3e0 __x64_sys_ioctl+0xac/0x120 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fd8ff706577 RSP: 002b:00007fd8fcdfebf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000040000000 RCX: 00007fd8ff706577 RDX: 000000000000000d RSI: 000000000000ab00 RDI: 000000000000000f RBP: 000000000000000f R08: 000000000000fbe8 R09: 000055fe497c62b0 R10: 00000002aff20000 R11: 0000000000000246 R12: 000000000000006d R13: 0000000000000000 R14: 00007ffe82dc5e70 R15: 00007fd8fcdff9c0 "qemu-nbd -d" will call ioctl 'NBD_DISCONNECT' first, however, the following message was found: block nbd0: Send disconnect failed -32 which indicates that something is wrong with the server. Then, "qemu-nbd -d" will call ioctl 'NBD_CLEAR_SOCK', however the ioctl can't clear requests after commit 2516ab1543fd("nbd: only clear the queue on device teardown"). And in the meantime, requests can't complete through timeout because nbd_xmit_timeout() will always return 'BLK_EH_RESET_TIMER', which means such requests will never be completed in this situation. Now that the flag 'NBD_CMD_INFLIGHT' can make sure requests won't complete multiple times, switch back to call nbd_clear_sock() in nbd_clear_sock_ioctl(), so that inflight requests can be cleared. 
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220521073749.3146892-5-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27	nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed	Yu Kuai
	Otherwise I/O will hang, because a request will only be completed if the cmd has the flag 'NBD_CMD_INFLIGHT'. Fixes: 07175cb1baf4 ("nbd: make sure request completion won't concurrent") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20220521073749.3146892-4-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27	nbd: fix race between nbd_alloc_config() and module removal	Yu Kuai
	When the nbd module is being removed, nbd_alloc_config() may be called concurrently by nbd_genl_connect(). try_module_get() will return false in that case, but nbd_alloc_config() doesn't handle it. The race may lead to the leak of nbd_config and its related resources (e.g, recv_workq) and an oops in nbd_read_stat() due to the unload of the nbd module as shown below: BUG: kernel NULL pointer dereference, address: 0000000000000040 Oops: 0000 [#1] SMP PTI CPU: 5 PID: 13840 Comm: kworker/u17:33 Not tainted 5.14.0+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: knbd16-recv recv_work [nbd] RIP: 0010:nbd_read_stat.cold+0x130/0x1a4 [nbd] Call Trace: recv_work+0x3b/0xb0 [nbd] process_one_work+0x1ed/0x390 worker_thread+0x4a/0x3d0 kthread+0x12a/0x150 ret_from_fork+0x22/0x30 Fix it by checking the return value of try_module_get() in nbd_alloc_config(). As nbd_alloc_config() may return ERR_PTR(-ENODEV), assign nbd->config only when nbd_alloc_config() succeeds to ensure the value of nbd->config is binary (valid or NULL). Also add a debug message to check the reference counter of nbd_config during module removal. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220521073749.3146892-3-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27	nbd: call genl_unregister_family() first in nbd_cleanup()	Yu Kuai
	Otherwise there may be a race between module removal and the handling of a netlink command, which can lead to the oops shown below: BUG: kernel NULL pointer dereference, address: 0000000000000098 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 31299 Comm: nbd-client Tainted: G E 5.14.0-rc4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:down_write+0x1a/0x50 Call Trace: start_creating+0x89/0x130 debugfs_create_dir+0x1b/0x130 nbd_start_device+0x13d/0x390 [nbd] nbd_genl_connect+0x42f/0x748 [nbd] genl_family_rcv_msg_doit.isra.0+0xec/0x150 genl_rcv_msg+0xe5/0x1e0 netlink_rcv_skb+0x55/0x100 genl_rcv+0x29/0x40 netlink_unicast+0x1a8/0x250 netlink_sendmsg+0x21b/0x430 ____sys_sendmsg+0x2a4/0x2d0 ___sys_sendmsg+0x81/0xc0 __sys_sendmsg+0x62/0xb0 __x64_sys_sendmsg+0x1f/0x30 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Modules linked in: nbd(E-) Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220521073749.3146892-2-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27	block: document BLK_STS_AGAIN usage	Hannes Reinecke
	BLK_STS_AGAIN should only be used if RQF_NOWAIT is set and the bio would block. So we'd better document that to avoid accidental misuse. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220524055631.85480-2-hare@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27	block: take destination bvec offsets into account in bio_copy_data_iter	Christoph Hellwig
	Apparently bcache can copy into bios that do not just contain fresh pages but can have offsets into the bio_vecs. Restore support for that in bio_copy_data_iter. Fixes: f8b679a070c5 ("block: rewrite bio_copy_data_iter to use bvec_kmap_local and memcpy_to_bvec") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220524143919.1155501-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-27	ksmbd: smbd: relax the count of sges required	Hyunchul Lee
	Remove the condition that the count of sges must be greater than or equal to SMB_DIRECT_MAX_SEND_SGES(8), because ksmbd needs sges only for SMB direct header, SMB2 transform header, SMB2 response, and optional payload. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Tom Talpey <tom@talpey.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-27	Merge branch 'net-ipa-fix-page-free-in-two-spots'	Jakub Kicinski
	Alex Elder says: ==================== net: ipa: fix page free in two spots When a receive buffer is not wrapped in an SKB and passed to the network stack, the (compound) page gets freed within the IPA driver. This is currently quite rare. The pages are freed using __free_pages(), but they should instead be freed using put_page(). This series fixes this, in two spots. These patches work for the current linus/master branch, but won't apply cleanly to earlier stable branches. (Nevertheless, the fix is a trivial substitution everywhere __free_pages() is called.) ==================== Link: https://lore.kernel.org/r/20220526152314.1405629-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-27	net: ipa: fix page free in ipa_endpoint_replenish_one()	Alex Elder
	Currently the (possibly compound) pages used for receive buffers are freed using __free_pages(). But according to this comment above the definition of that function, that's wrong: If you want to use the page's reference count to decide when to free the allocation, you should allocate a compound page, and use put_page() instead of __free_pages(). Convert the call to __free_pages() in ipa_endpoint_replenish_one() to use put_page() instead. Fixes: 6a606b90153b8 ("net: ipa: allocate transaction in replenish loop") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-27	net: ipa: fix page free in ipa_endpoint_trans_release()	Alex Elder
	Currently the (possibly compound) pages used for receive buffers are freed using __free_pages(). But according to this comment above the definition of that function, that's wrong: If you want to use the page's reference count to decide when to free the allocation, you should allocate a compound page, and use put_page() instead of __free_pages(). Convert the call to __free_pages() in ipa_endpoint_trans_release() to use put_page() instead. Fixes: ed23f02680caa ("net: ipa: define per-endpoint receive buffer size") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-27	dt-bindings: net: Update ADIN PHY maintainers	Alexandru Tachici
	Update the dt-bindings maintainers section. Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Link: https://lore.kernel.org/r/20220526141318.77146-1-alexandru.tachici@analog.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>