summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-04-28Merge tag 'platform-drivers-x86-v5.18-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Highlights: - asus-wmi bug-fixes - intel-sdsu bug-fixes - build (warning) fixes - couple of hw-id additions" * tag 'platform-drivers-x86-v5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel: pmc/core: change pmc_lpm_modes to static platform/x86/intel/sdsi: Fix bug in multi packet reads platform/x86/intel/sdsi: Poll on ready bit for writes platform/x86/intel/sdsi: Handle leaky bucket platform/x86: intel-uncore-freq: Prevent driver loading in guests platform/x86: gigabyte-wmi: added support for B660 GAMING X DDR4 motherboard platform/x86: dell-laptop: Add quirk entry for Latitude 7520 platform/x86: asus-wmi: Fix driver not binding when fan curve control probe fails platform/x86: asus-wmi: Potential buffer overflow in asus_wmi_evaluate_method_buf() tools/power/x86/intel-speed-select: fix build failure when using -Wl,--as-needed
2022-04-28Merge tag 'regulator-fix-v5.18-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "A minor fix for the DT binding documentation of the rt5190a driver" * tag 'regulator-fix-v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: dt-bindings: Revise the rt5190a buck/ldo description
2022-04-28tcp: fix F-RTO may not work correctly when receiving DSACKPengcheng Yang
Currently DSACK is regarded as a dupack, which may cause F-RTO to incorrectly enter "loss was real" when receiving DSACK. Packetdrill to demonstrate: // Enable F-RTO and TLP 0 `sysctl -q net.ipv4.tcp_frto=2` 0 `sysctl -q net.ipv4.tcp_early_retrans=3` 0 `sysctl -q net.ipv4.tcp_congestion_control=cubic` // Establish a connection +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 // RTT 10ms, RTO 210ms +.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <...> +.01 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 // Send 2 data segments +0 write(4, ..., 2000) = 2000 +0 > P. 1:2001(2000) ack 1 // TLP +.022 > P. 1001:2001(1000) ack 1 // Continue to send 8 data segments +0 write(4, ..., 10000) = 10000 +0 > P. 2001:10001(8000) ack 1 // RTO +.188 > . 1:1001(1000) ack 1 // The original data is acked and new data is sent(F-RTO step 2.b) +0 < . 1:1(0) ack 2001 win 257 +0 > P. 10001:12001(2000) ack 1 // D-SACK caused by TLP is regarded as a dupack, this results in // the incorrect judgment of "loss was real"(F-RTO step 3.a) +.022 < . 1:1(0) ack 2001 win 257 <sack 1001:2001,nop,nop> // Never-retransmitted data(3001:4001) are acked and // expect to switch to open state(F-RTO step 3.b) +0 < . 1:1(0) ack 4001 win 257 +0 %{ assert tcpi_ca_state == 0, tcpi_ca_state }% Fixes: e33099f96d99 ("tcp: implement RFC5682 F-RTO") Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1650967419-2150-1-git-send-email-yangpc@wangsu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix incorrect TCP connection tracking window reset for non-syn packets, from Florian Westphal. 2) Incorrect dependency on CONFIG_NFT_FLOW_OFFLOAD, from Volodymyr Mytnyk. 3) Fix nft_socket from the output path, from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_socket: only do sk lookups when indev is available netfilter: conntrack: fix udp offload timeout sysctl netfilter: nf_conntrack_tcp: re-init for syn packets only ==================== Link: https://lore.kernel.org/r/20220428142109.38726-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Merge tag 'gfs2-v5.18-rc4-fix2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: - No short reads or writes upon glock contention * tag 'gfs2-v5.18-rc4-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: No short reads or writes upon glock contention
2022-04-28Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"Dany Madden
This reverts commit 723ad916134784b317b72f3f6cf0f7ba774e5dae When client requests channel or ring size larger than what the server can support the server will cap the request to the supported max. So, the client would not be able to successfully request resources that exceed the server limit. Fixes: 723ad9161347 ("ibmvnic: Add ethtool private flag for driver-defined queue limits") Signed-off-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASKVladimir Oltean
The Time-Specified Departure feature is indeed mutually exclusive with TX IP checksumming in ENETC, but TX checksumming in itself is broken and was removed from this driver in commit 82728b91f124 ("enetc: Remove Tx checksumming offload code"). The blamed commit declared NETIF_F_HW_CSUM in dev->features to comply with software TSO's expectations, and still did the checksumming in software by calling skb_checksum_help(). So there isn't any restriction for the Time-Specified Departure feature. However, enetc_setup_tc_txtime() doesn't understand that, and blindly looks for NETIF_F_CSUM_MASK. Instead of checking for things which can literally never happen in the current code base, just remove the check and let the driver offload tc-etf qdiscs. Fixes: acede3c5dad5 ("net: enetc: declare NETIF_F_HW_CSUM and do it in software") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220427203017.1291634-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28ixgbe: ensure IPsec VF<->PF compatibilityLeon Romanovsky
The VF driver can forward any IPsec flags and such makes the function is not extendable and prone to backward/forward incompatibility. If new software runs on VF, it won't know that PF configured something completely different as it "knows" only XFRM_OFFLOAD_INBOUND flag. Fixes: eda0333ac293 ("ixgbe: add VF IPsec management") Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shannon Nelson <snelson@pensando.io> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220427173152.443102-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Merge tag 'xfs-5.18-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Dave Chinner: - define buffer bit flags as unsigned to fix gcc-5 + c11 warnings - remove redundant XFS fields from MAINTAINERS - fix inode buffer locking order regression * tag 'xfs-5.18-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: reorder iunlink remove operation in xfs_ifree MAINTAINERS: update IOMAP FILESYSTEM LIBRARY and XFS FILESYSTEM xfs: convert buffer flags to unsigned.
2022-04-28MAINTAINERS: Update BNXT entry with firmware filesFlorian Fainelli
There appears to be a maintainer gap for BNXT TEE firmware files which causes some patches to be missed. Update the entry for the BNXT Ethernet controller with its companion firmware files. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220427163606.126154-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Merge branch 'thermal-int340x'Rafael J. Wysocki
Merge a fix for the attr.show callback prototype in the int340x thermal driver (Kees Cook). * thermal-int340x: thermal: int340x: Fix attr.show callback prototype
2022-04-28Merge tag 'iio-fixes-for-5.18a' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio Pull set of IIO fixes for 5.18 from Jonathan Cameron: "1st set of IIO fixes for the 5.18 cycle ad3552r: - Fix a bug with error codes being stored in unsigned local variable. - Fix IS_ERR when value is either NULL or not rather than ERR_PTR ad5446 - Fix shifting of read_raw value. ad5592r - Fix missing return value being set for a fwnode property read. ad7280a - Wrong variable being used to set thresholds. admv8818 - Kconfig dependency fix. ak8975 - Missing regulator disable in error path. bmi160 - Disable regulators in an error path. dac5571 - Fix chip id detection for devices with OF bindings. inv_icm42600 - Handle a case of a missing I2C NACK during initially configuration. ltc2688 - Fix voltage scaling where integer part was written twice and decimal part not at all. scd4x - Handle error before using value. sx9310 - Device property parsing against indio_dev->dev.of_node which hasn't been set yet. sx9324 - Fix hardware gain related maths. - Wrong defaults for precharge internal resistance register." * tag 'iio-fixes-for-5.18a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio: imu: inv_icm42600: Fix I2C init possible nack iio: dac: ltc2688: fix voltage scale read iio:dac:ad3552r: Fix an IS_ERR() vs NULL check iio: sx9324: Fix default precharge internal resistance register iio: dac: ad5446: Fix read_raw not returning set value iio: magnetometer: ak8975: Fix the error handling in ak8975_power_on() iio:proximity:sx9324: Fix hardware gain read/write iio:proximity:sx_common: Fix device property parsing on DT systems iio: adc: ad7280a: Fix wrong variable used when setting thresholds. iio:filter:admv8818: select REGMAP_SPI for ADMV8818 iio: dac: ad5592r: Fix the missing return value. iio: dac: dac5571: Fix chip id detection for OF devices iio:imu:bmi160: disable regulator in error path iio: scd4x: check return of scd4x_write_and_fetch iio: dac: ad3552r: fix signedness bug in ad3552r_reset()
2022-04-28netfilter: nft_socket: only do sk lookups when indev is availableFlorian Westphal
Check if the incoming interface is available and NFT_BREAK in case neither skb->sk nor input device are set. Because nf_sk_lookup_slow*() assume packet headers are in the 'in' direction, use in postrouting is not going to yield a meaningful result. Same is true for the forward chain, so restrict the use to prerouting, input and output. Use in output work if a socket is already attached to the skb. Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching") Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-04-28Merge branch 'pm-cpuidle'Rafael J. Wysocki
Merge cpuidle fixes for 5.18-rc5: - Make intel_idle enable C1E promotion on all CPUs when C1E is preferred to C1 (Artem Bityutskiy). - Make C6 optimization on Sapphire Rapids added recently work as expected if both C1E and C1 are "preferred" (Artem Bityutskiy). * pm-cpuidle: intel_idle: Fix SPR C6 optimization intel_idle: Fix the 'preferred_cstates' module parameter
2022-04-28perf symbol: Remove arch__symbols__fixup_end()Namhyung Kim
Now the generic code can handle kallsyms fixup properly so no need to keep the arch-functions anymore. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Update symbols__fixup_end()Namhyung Kim
Now arch-specific functions all do the same thing. When it fixes the symbol address it needs to check the boundary between the kernel image and modules. For the last symbol in the previous region, it cannot know the exact size as it's discarded already. Thus it just uses a small page size (4096) and rounds it up like the last symbol. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Pass is_kallsyms to symbols__fixup_end()Namhyung Kim
The symbol fixup is necessary for symbols in kallsyms since they don't have size info. So we use the next symbol's address to calculate the size. Now it's also used for user binaries because sometimes they miss size for hand-written asm functions. There's a arch-specific function to handle kallsyms differently but currently it cannot distinguish kallsyms from others. Pass this information explicitly to handle it properly. Note that those arch functions will be moved to the generic function so I didn't added it to the arch-functions. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf test: Add perf_event_attr test for Arm SPETimothy Hayes
Adds a perf_event_attr test for Arm SPE in which the presence of physical addresses are checked when SPE unit is run with pa_enable=1. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220421165205.117662-4-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf arm-spe: Fix SPE events with phys addressesTimothy Hayes
This patch corrects a bug whereby SPE collection is invoked with pa_enable=1 but synthesized events fail to show physical addresses. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220421165205.117662-3-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf arm-spe: Fix addresses of synthesized SPE eventsTimothy Hayes
This patch corrects a bug whereby synthesized events from SPE samples are missing virtual addresses. Fixes: 54f7815efef7fad9 ("perf arm-spe: Fill address info for samples") Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: netdev@vger.kernel.org Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf intel-pt: Fix timeless decoding with perf.data directoryAdrian Hunter
Intel PT does not capture data in separate directories, so do not use separate directory processing because it doesn't work for timeless decoding. It also looks like it doesn't support one_mmap handling. Example: Before: # perf record --kcore -a -e intel_pt/tsc=0/k sleep 0.1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.799 MB perf.data ] # perf script --itrace=bep | head # After: # perf script --itrace=bep | head perf 21073 [000] psb: psb offs: 0 ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms]) perf 21073 [000] cbr: cbr: 45 freq: 4505 MHz (161%) ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: 0 [unknown] ([unknown]) => ffffffffaa68faf6 native_write_msr+0x6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa68faf8 native_write_msr+0x8 ([kernel.kallsyms]) => ffffffffaa61aab0 pt_config_start+0x60 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61aabd pt_config_start+0x6d ([kernel.kallsyms]) => ffffffffaa61b8ad pt_event_start+0x27d ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61b8bb pt_event_start+0x28b ([kernel.kallsyms]) => ffffffffaa61ba60 pt_event_add+0x40 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61ba76 pt_event_add+0x56 ([kernel.kallsyms]) => ffffffffaa880e86 event_sched_in+0xc6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880e9b event_sched_in+0xdb ([kernel.kallsyms]) => ffffffffaa880ea5 event_sched_in+0xe5 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880eba event_sched_in+0xfa ([kernel.kallsyms]) => ffffffffaa880f96 event_sched_in+0x1d6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880fc8 event_sched_in+0x208 ([kernel.kallsyms]) => ffffffffaa880ec0 event_sched_in+0x100 ([kernel.kallsyms]) Fixes: bb6be405c4a2a5 ("perf session: Load data directory files for analysis") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20220428093109.274641-1-adrian.hunter@intel.com Cc: Ian Rogers <irogers@google.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-kernel@vger.kernel.org
2022-04-28gfs2: No short reads or writes upon glock contentionAndreas Gruenbacher
Commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered I/O") changed gfs2_file_read_iter() and gfs2_file_buffered_write() to allow dropping the inode glock while faulting in user buffers. When the lock was dropped, a short result was returned to indicate that the operation was interrupted. As pointed out by Linus (see the link below), this behavior is broken and the operations should always re-acquire the inode glock and resume the operation instead. Link: https://lore.kernel.org/lkml/CAHk-=whaz-g_nOOoo8RRiWNjnv2R+h6_xk2F1J4TuSRxk1MtLw@mail.gmail.com/ Fixes: 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered I/O") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-04-28elf: Fix the arm64 MTE ELF segment name and valueCatalin Marinas
Unfortunately, the name/value choice for the MTE ELF segment type (PT_ARM_MEMTAG_MTE) was pretty poor: LOPROC+1 is already in use by PT_AARCH64_UNWIND, as defined in the AArch64 ELF ABI (https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst). Update the ELF segment type value to LOPROC+2 and also change the define to PT_AARCH64_MEMTAG_MTE to match the AArch64 ELF ABI namespace. The AArch64 ELF ABI document is updating accordingly (segment type not previously mentioned in the document). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 761b9b366cec ("elf: Introduce the ARM MTE ELF segment type") Cc: Will Deacon <will@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Machado <luis.machado@arm.com> Cc: Richard Earnshaw <Richard.Earnshaw@arm.com> Link: https://lore.kernel.org/r/20220425151833.2603830-1-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-04-28Merge tag 'for-net-2022-04-27' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix regression causing some HCI events to be discarded when they shouldn't. * tag 'for-net-2022-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted Bluetooth: hci_event: Fix creating hci_conn object on error status Bluetooth: hci_event: Fix checking for invalid handle on error status ==================== Link: https://lore.kernel.org/r/20220427234031.1257281-1-luiz.dentz@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-27scsi: target: pscsi: Set SCF_TREAT_READ_AS_NORMAL flag only if there is ↵David Jeffery
valid data With tape devices, the SCF_TREAT_READ_AS_NORMAL flag is used by the target subsystem to mark commands which have both data to return as well as sense data. But with pscsi, SCF_TREAT_READ_AS_NORMAL can be set even if there is no data to return. The SCF_TREAT_READ_AS_NORMAL flag causes the target core to call iscsit data-in callbacks even if there is no data, which iscsit does not support. This results in iscsit going into an error state requiring recovery and being unable to complete the command to the initiator. This issue can be resolved by fixing pscsi to only set SCF_TREAT_READ_AS_NORMAL if there is valid data to return alongside the sense data. Link: https://lore.kernel.org/r/20220427183250.291881-1-djeffery@redhat.com Fixes: bd81372065fa ("scsi: target: transport should handle st FM/EOM/ILI reads") Reported-by: Scott Hamilton <scott.hamilton@atos.net> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-04-27net: fec: add missing of_node_put() in fec_enet_init_stop_mode()Yang Yingliang
Put device node in error path in fec_enet_init_stop_mode(). Fixes: 8a448bf832af ("net: ethernet: fec: move GPR register offset and bit into DT") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220426125231.375688-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27bnx2x: fix napi API usage sequenceManish Chopra
While handling PCI errors (AER flow) driver tries to disable NAPI [napi_disable()] after NAPI is deleted [__netif_napi_del()] which causes unexpected system hang/crash. System message log shows the following: ======================================= [ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures. [ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)' [ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->error_detected(IO frozen) [ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports: 'need reset' [ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking bnx2x->error_detected(IO frozen) [ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports: 'need reset' [ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset' [ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow: [ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows: [ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow: [ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows: [ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset' [ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->slot_reset() [ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing... [ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset --> driver unload Uninterruptible tasks ===================== crash> ps | grep UN 213 2 11 c000000004c89e00 UN 0.0 0 0 [eehd] 215 2 0 c000000004c80000 UN 0.0 0 0 [kworker/0:2] 2196 1 28 c000000004504f00 UN 0.1 15936 11136 wickedd 4287 1 9 c00000020d076800 UN 0.0 4032 3008 agetty 4289 1 20 c00000020d056680 UN 0.0 7232 3840 agetty 32423 2 26 c00000020038c580 UN 0.0 0 0 [kworker/26:3] 32871 4241 27 c0000002609ddd00 UN 0.1 18624 11648 sshd 32920 10130 16 c00000027284a100 UN 0.1 48512 12608 sendmail 33092 32987 0 c000000205218b00 UN 0.1 48512 12608 sendmail 33154 4567 16 c000000260e51780 UN 0.1 48832 12864 pickup 33209 4241 36 c000000270cb6500 UN 0.1 18624 11712 sshd 33473 33283 0 c000000205211480 UN 0.1 48512 12672 sendmail 33531 4241 37 c00000023c902780 UN 0.1 18624 11648 sshd EEH handler hung while bnx2x sleeping and holding RTNL lock =========================================================== crash> bt 213 PID: 213 TASK: c000000004c89e00 CPU: 11 COMMAND: "eehd" #0 [c000000004d477e0] __schedule at c000000000c70808 #1 [c000000004d478b0] schedule at c000000000c70ee0 #2 [c000000004d478e0] schedule_timeout at c000000000c76dec #3 [c000000004d479c0] msleep at c0000000002120cc #4 [c000000004d479f0] napi_disable at c000000000a06448 ^^^^^^^^^^^^^^^^ #5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x] #6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x] #7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc #8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8 #9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64 And the sleeping source code ============================ crash> dis -ls c000000000a06448 FILE: ../net/core/dev.c LINE: 6702 6697 { 6698 might_sleep(); 6699 set_bit(NAPI_STATE_DISABLE, &n->state); 6700 6701 while (test_and_set_bit(NAPI_STATE_SCHED, &n->state)) * 6702 msleep(1); 6703 while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state)) 6704 msleep(1); 6705 6706 hrtimer_cancel(&n->timer); 6707 6708 clear_bit(NAPI_STATE_DISABLE, &n->state); 6709 } EEH calls into bnx2x twice based on the system log above, first through bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes the following call chains: bnx2x_io_error_detected() +-> bnx2x_eeh_nic_unload() +-> bnx2x_del_all_napi() +-> __netif_napi_del() bnx2x_io_slot_reset() +-> bnx2x_netif_stop() +-> bnx2x_napi_disable() +->napi_disable() Fix this by correcting the sequence of NAPI APIs usage, that is delete the NAPI after disabling it. Fixes: 7fa6f34081f1 ("bnx2x: AER revised") Reported-by: David Christensen <drc@linux.vnet.ibm.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com> Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220426153913.6966-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27tls: Skip tls_append_frag on zero copy sizeMaxim Mikityanskiy
Calling tls_append_frag when max_open_record_len == record->len might add an empty fragment to the TLS record if the call happens to be on the page boundary. Normally tls_append_frag coalesces the zero-sized fragment to the previous one, but not if it's on page boundary. If a resync happens then, the mlx5 driver posts dump WQEs in tx_post_resync_dump, and the empty fragment may become a data segment with byte_count == 0, which will confuse the NIC and lead to a CQE error. This commit fixes the described issue by skipping tls_append_frag on zero size to avoid adding empty fragments. The fix is not in the driver, because an empty fragment is hardly the desired behavior. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220426154949.159055-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf 2022-04-27 We've added 5 non-merge commits during the last 20 day(s) which contain a total of 6 files changed, 34 insertions(+), 12 deletions(-). The main changes are: 1) Fix xsk sockets when rx and tx are separately bound to the same umem, also fix xsk copy mode combined with busy poll, from Maciej Fijalkowski. 2) Fix BPF tunnel/collect_md helpers with bpf_xmit lwt hook usage which triggered a crash due to invalid metadata_dst access, from Eyal Birger. 3) Fix release of page pool in XDP live packet mode, from Toke Høiland-Jørgensen. 4) Fix potential NULL pointer dereference in kretprobes, from Adam Zabrocki. (Masami & Steven preferred this small fix to be routed via bpf tree given it's follow-up fix to Masami's rethook work that went via bpf earlier, too.) * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: xsk: Fix possible crash when multiple sockets are created kprobes: Fix KRETPROBES when CONFIG_KRETPROBE_ON_RETHOOK is set bpf, lwt: Fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook bpf: Fix release of page_pool in BPF_PROG_RUN in test runner xsk: Fix l2fwd for copy mode + busy poll combo ==================== Link: https://lore.kernel.org/r/20220427212748.9576-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27KVM: arm64: Inject exception on out-of-IPA-range translation faultMarc Zyngier
When taking a translation fault for an IPA that is outside of the range defined by the hypervisor (between the HW PARange and the IPA range), we stupidly treat it as an IO and forward the access to userspace. Of course, userspace can't do much with it, and things end badly. Arguably, the guest is braindead, but we should at least catch the case and inject an exception. Check the faulting IPA against: - the sanitised PARange: inject an address size fault - the IPA size: inject an abort Reported-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-04-27drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspendPrike Liang
Without MMHUB clock gating being enabled then MMHUB will not disconnect from DF and will result in DF C-state entry can't be accessed during S2idle suspend, and eventually s0ix entry will be blocked. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-27drm/amd/pm: fix the deadlock issue observed on SIEvan Quan
The adev->pm.mutx is already held at the beginning of amdgpu_dpm_compute_clocks/amdgpu_dpm_enable_uvd/amdgpu_dpm_enable_vce. But on their calling path, amdgpu_display_bandwidth_update will be called and thus its sub functions amdgpu_dpm_get_sclk/mclk. They will then try to acquire the same adev->pm.mutex and deadlock will occur. By placing amdgpu_display_bandwidth_update outside of adev->pm.mutex protection(considering logically they do not need such protection) and restructuring the call flow accordingly, we can eliminate the deadlock issue. This comes with no real logics change. Fixes: 3712e7a49459 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reported-by: Arthur Marsh <arthur.marsh@internode.on.net> Link: https://lore.kernel.org/all/9e689fea-6c69-f4b0-8dee-32c4cf7d8f9c@molgen.mpg.de/ BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1957 Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-27drm/amd/display: Fix memory leak in dcn21_clock_source_createMiaoqian Lin
When dcn20_clk_src_construct() fails, we need to release clk_src. Fixes: 6f4e6361c3ff ("drm/amd/display: Add Renoir resource (v2)") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-27drm/amdgpu: don't runtime suspend if there are displays attached (v3)Alex Deucher
We normally runtime suspend when there are displays attached if they are in the DPMS off state, however, if something wakes the GPU we send a hotplug event on resume (in case any displays were connected while the GPU was in suspend) which can cause userspace to light up the displays again soon after they were turned off. Prior to commit 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."), the driver took a runtime pm reference when the fbdev emulation was enabled because we didn't implement proper shadowing support for vram access when the device was off so the device never runtime suspended when there was a console bound. Once that commit landed, we now utilize the core fb helper implementation which properly handles the emulation, so runtime pm now suspends in cases where it did not before. Ultimately, we need to sort out why runtime suspend in not working in this case for some users, but this should restore similar behavior to before. v2: move check into runtime_suspend v3: wake ups -> wakeups in comment, retain pm_runtime behavior in runtime_idle callback Fixes: 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/ Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-04-27drm/amdkfd: CRIU add support for GWS queuesDavid Yat Sin
Add support to checkpoint/restore GWS (Global Wave Sync) queues. Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-27drm/amdkfd: Fix GWS queue countDavid Yat Sin
dqm->gws_queue_count and pdd->qpd.mapped_gws_queue need to be updated each time the queue gets evicted. Fixes: b8020b0304c8 ("drm/amdkfd: Enable over-subscription with >1 GWS queue") Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-27KVM/arm64: Don't emulate a PMU for 32-bit guests if feature not setAlexandru Elisei
kvm->arch.arm_pmu is set when userspace attempts to set the first PMU attribute. As certain attributes are mandatory, arm_pmu ends up always being set to a valid arm_pmu, otherwise KVM will refuse to run the VCPU. However, this only happens if the VCPU has the PMU feature. If the VCPU doesn't have the feature bit set, kvm->arch.arm_pmu will be left uninitialized and equal to NULL. KVM doesn't do ID register emulation for 32-bit guests and accesses to the PMU registers aren't gated by the pmu_visibility() function. This is done to prevent injecting unexpected undefined exceptions in guests which have detected the presence of a hardware PMU. But even though the VCPU feature is missing, KVM still attempts to emulate certain aspects of the PMU when PMU registers are accessed. This leads to a NULL pointer dereference like this one, which happens on an odroid-c4 board when running the kvm-unit-tests pmu-cycle-counter test with kvmtool and without the PMU feature being set: [ 454.402699] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000150 [ 454.405865] Mem abort info: [ 454.408596] ESR = 0x96000004 [ 454.411638] EC = 0x25: DABT (current EL), IL = 32 bits [ 454.416901] SET = 0, FnV = 0 [ 454.419909] EA = 0, S1PTW = 0 [ 454.423010] FSC = 0x04: level 0 translation fault [ 454.427841] Data abort info: [ 454.430687] ISV = 0, ISS = 0x00000004 [ 454.434484] CM = 0, WnR = 0 [ 454.437404] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000c924000 [ 454.443800] [0000000000000150] pgd=0000000000000000, p4d=0000000000000000 [ 454.450528] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 454.456036] Modules linked in: [ 454.459053] CPU: 1 PID: 267 Comm: kvm-vcpu-0 Not tainted 5.18.0-rc4 #113 [ 454.465697] Hardware name: Hardkernel ODROID-C4 (DT) [ 454.470612] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 454.477512] pc : kvm_pmu_event_mask.isra.0+0x14/0x74 [ 454.482427] lr : kvm_pmu_set_counter_event_type+0x2c/0x80 [ 454.487775] sp : ffff80000a9839c0 [ 454.491050] x29: ffff80000a9839c0 x28: ffff000000a83a00 x27: 0000000000000000 [ 454.498127] x26: 0000000000000000 x25: 0000000000000000 x24: ffff00000a510000 [ 454.505198] x23: ffff000000a83a00 x22: ffff000003b01000 x21: 0000000000000000 [ 454.512271] x20: 000000000000001f x19: 00000000000003ff x18: 0000000000000000 [ 454.519343] x17: 000000008003fe98 x16: 0000000000000000 x15: 0000000000000000 [ 454.526416] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 [ 454.533489] x11: 000000008003fdbc x10: 0000000000009d20 x9 : 000000000000001b [ 454.540561] x8 : 0000000000000000 x7 : 0000000000000d00 x6 : 0000000000009d00 [ 454.547633] x5 : 0000000000000037 x4 : 0000000000009d00 x3 : 0d09000000000000 [ 454.554705] x2 : 000000000000001f x1 : 0000000000000000 x0 : 0000000000000000 [ 454.561779] Call trace: [ 454.564191] kvm_pmu_event_mask.isra.0+0x14/0x74 [ 454.568764] kvm_pmu_set_counter_event_type+0x2c/0x80 [ 454.573766] access_pmu_evtyper+0x128/0x170 [ 454.577905] perform_access+0x34/0x80 [ 454.581527] kvm_handle_cp_32+0x13c/0x160 [ 454.585495] kvm_handle_cp15_32+0x1c/0x30 [ 454.589462] handle_exit+0x70/0x180 [ 454.592912] kvm_arch_vcpu_ioctl_run+0x1c4/0x5e0 [ 454.597485] kvm_vcpu_ioctl+0x23c/0x940 [ 454.601280] __arm64_sys_ioctl+0xa8/0xf0 [ 454.605160] invoke_syscall+0x48/0x114 [ 454.608869] el0_svc_common.constprop.0+0xd4/0xfc [ 454.613527] do_el0_svc+0x28/0x90 [ 454.616803] el0_svc+0x34/0xb0 [ 454.619822] el0t_64_sync_handler+0xa4/0x130 [ 454.624049] el0t_64_sync+0x18c/0x190 [ 454.627675] Code: a9be7bfd 910003fd f9000bf3 52807ff3 (b9415001) [ 454.633714] ---[ end trace 0000000000000000 ]--- In this particular case, Linux hasn't detected the presence of a hardware PMU because the PMU node is missing from the DTB, so userspace would have been unable to set the VCPU PMU feature even if it attempted it. What happens is that the 32-bit guest reads ID_DFR0, which advertises the presence of the PMU, and when it tries to program a counter, it triggers the NULL pointer dereference because kvm->arch.arm_pmu is NULL. kvm-arch.arm_pmu was introduced by commit 46b187821472 ("KVM: arm64: Keep a per-VM pointer to the default PMU"). Until that commit, this error would be triggered instead: [ 73.388140] ------------[ cut here ]------------ [ 73.388189] Unknown PMU version 0 [ 73.390420] WARNING: CPU: 1 PID: 264 at arch/arm64/kvm/pmu-emul.c:36 kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.399821] Modules linked in: [ 73.402835] CPU: 1 PID: 264 Comm: kvm-vcpu-0 Not tainted 5.17.0 #114 [ 73.409132] Hardware name: Hardkernel ODROID-C4 (DT) [ 73.414048] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 73.420948] pc : kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.425863] lr : kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.430779] sp : ffff80000a8db9b0 [ 73.434055] x29: ffff80000a8db9b0 x28: ffff000000dbaac0 x27: 0000000000000000 [ 73.441131] x26: ffff000000dbaac0 x25: 00000000c600000d x24: 0000000000180720 [ 73.448203] x23: ffff800009ffbe10 x22: ffff00000b612000 x21: 0000000000000000 [ 73.455276] x20: 000000000000001f x19: 0000000000000000 x18: ffffffffffffffff [ 73.462348] x17: 000000008003fe98 x16: 0000000000000000 x15: 0720072007200720 [ 73.469420] x14: 0720072007200720 x13: ffff800009d32488 x12: 00000000000004e6 [ 73.476493] x11: 00000000000001a2 x10: ffff800009d32488 x9 : ffff800009d32488 [ 73.483565] x8 : 00000000ffffefff x7 : ffff800009d8a488 x6 : ffff800009d8a488 [ 73.490638] x5 : ffff0000f461a9d8 x4 : 0000000000000000 x3 : 0000000000000001 [ 73.497710] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000000dbaac0 [ 73.504784] Call trace: [ 73.507195] kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.511768] kvm_pmu_set_counter_event_type+0x2c/0x80 [ 73.516770] access_pmu_evtyper+0x128/0x16c [ 73.520910] perform_access+0x34/0x80 [ 73.524532] kvm_handle_cp_32+0x13c/0x160 [ 73.528500] kvm_handle_cp15_32+0x1c/0x30 [ 73.532467] handle_exit+0x70/0x180 [ 73.535917] kvm_arch_vcpu_ioctl_run+0x20c/0x6e0 [ 73.540489] kvm_vcpu_ioctl+0x2b8/0x9e0 [ 73.544283] __arm64_sys_ioctl+0xa8/0xf0 [ 73.548165] invoke_syscall+0x48/0x114 [ 73.551874] el0_svc_common.constprop.0+0xd4/0xfc [ 73.556531] do_el0_svc+0x28/0x90 [ 73.559808] el0_svc+0x28/0x80 [ 73.562826] el0t_64_sync_handler+0xa4/0x130 [ 73.567054] el0t_64_sync+0x1a0/0x1a4 [ 73.570676] ---[ end trace 0000000000000000 ]--- [ 73.575382] kvm: pmu event creation failed -2 The root cause remains the same: kvm->arch.pmuver was never set to something sensible because the VCPU feature itself was never set. The odroid-c4 is somewhat of a special case, because Linux doesn't probe the PMU. But the above errors can easily be reproduced on any hardware, with or without a PMU driver, as long as userspace doesn't set the PMU feature. Work around the fact that KVM advertises a PMU even when the VCPU feature is not set by gating all PMU emulation on the feature. The guest can still access the registers without KVM injecting an undefined exception. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220425145530.723858-1-alexandru.elisei@arm.com
2022-04-27KVM: arm64: Handle host stage-2 faults from 32-bit EL0Will Deacon
When pKVM is enabled, host memory accesses are translated by an identity mapping at stage-2, which is populated lazily in response to synchronous exceptions from 64-bit EL1 and EL0. Extend this handling to cover exceptions originating from 32-bit EL0 as well. Although these are very unlikely to occur in practice, as the kernel typically ensures that user pages are initialised before mapping them in, drivers could still map previously untouched device pages into userspace and expect things to work rather than panic the system. Cc: Quentin Perret <qperret@google.com> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220427171332.13635-1-will@kernel.org
2022-04-27Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "Two patches. Subsystems affected by this patch series: mm/kasan and mm/debug" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: docs: vm/page_owner: use literal blocks for param description kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time
2022-04-27docs: vm/page_owner: use literal blocks for param descriptionAkira Yokosawa
Sphinx generates hard-to-read lists of parameters at the bottom of the page. Fix them by putting literal-block markers of "::" in front of them. Link: https://lkml.kernel.org/r/cfd3bcc0-b51d-0c68-c065-ca1c4c202447@gmail.com Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Fixes: 57f2b54a9379 ("Documentation/vm/page_owner.rst: update the documentation") Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn> Cc: Haowen Bai <baihaowen@meizu.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Alex Shi <seakeel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink ↵Zqiang
occur at same time kasan_quarantine_remove_cache() is called in kmem_cache_shrink()/ destroy(). The kasan_quarantine_remove_cache() call is protected by cpuslock in kmem_cache_destroy() to ensure serialization with kasan_cpu_offline(). However the kasan_quarantine_remove_cache() call is not protected by cpuslock in kmem_cache_shrink(). When a CPU is going offline and cache shrink occurs at same time, the cpu_quarantine may be corrupted by interrupt (per_cpu_remove_cache operation). So add a cpu_quarantine offline flags check in per_cpu_remove_cache(). [akpm@linux-foundation.org: add comment, per Zqiang] Link: https://lkml.kernel.org/r/20220414025925.2423818-1-qiang1.zhang@intel.com Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27Revert "block: inherit request start time from bio for BLK_CGROUP"Tejun Heo
This reverts commit 0006707723233cb2a9a23ca19fc3d0864835704c. It has a couple problems: * bio_issue_time() is stored in bio->bi_issue truncated to 51 bits. This overflows in slightly over 26 days. Setting rq->io_start_time_ns with it means that io duration calculation would yield >26days after 26 days of uptime. This, for example, confuses kyber making it cause high IO latencies. * rq->io_start_time_ns should record the time that the IO is issued to the device so that on-device latency can be measured. However, bio_issue_time() is set before the bio goes through the rq-qos controllers (wbt, iolatency, iocost), so when the bio gets throttled in any of the mechanisms, the measured latencies make no sense - on-device latencies end up higher than request-alloc-to-completion latencies. We'll need a smarter way to avoid calling ktime_get_ns() repeatedly back-to-back. For now, let's revert the commit. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v5.16+ Link: https://lore.kernel.org/r/YmmeOLfo5lzc+8yI@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-27intel_idle: Fix SPR C6 optimizationArtem Bityutskiy
The Sapphire Rapids (SPR) C6 optimization was added to the end of the 'spr_idle_state_table_update()' function. However, the function has a 'return' which may happen before the optimization has a chance to run. And this may prevent the optimization from happening. This is an unlikely scenario, but possible if user boots with, say, the 'intel_idle.preferred_cstates=6' kernel boot option. This patch fixes the issue by eliminating the problematic 'return' statement. Fixes: 3a9cf77b60dc ("intel_idle: add core C6 optimization for SPR") Suggested-by: Jan Beulich <jbeulich@suse.com> Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-27intel_idle: Fix the 'preferred_cstates' module parameterArtem Bityutskiy
Problem description. When user boots kernel up with the 'intel_idle.preferred_cstates=4' option, we enable C1E and disable C1 states on Sapphire Rapids Xeon (SPR). In order for C1E to work on SPR, we have to enable the C1E promotion bit on all CPUs. However, we enable it only on one CPU. Fix description. The 'intel_idle' driver already has the infrastructure for disabling C1E promotion on every CPU. This patch uses the same infrastructure for enabling C1E promotion on every CPU. It changes the boolean 'disable_promotion_to_c1e' variable to a tri-state 'c1e_promotion' variable. Tested on a 2-socket SPR system. I verified the following combinations: * C1E promotion enabled and disabled in BIOS. * Booted with and without the 'intel_idle.preferred_cstates=4' kernel argument. In all 4 cases C1E promotion was correctly set on all CPUs. Also tested on an old Broadwell system, just to make sure it does not cause a regression. C1E promotion was correctly disabled on that system, both C1 and C1E were exposed (as expected). Fixes: da0e58c038e6 ("intel_idle: add 'preferred_cstates' module argument") Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-27Merge tag 'cpufreq-arm-fixes-5.18-rc5' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq fixes for 5.18-rc5 from Viresh Kumar: "- Fix issues with the Qualcomm's cpufreq driver (Dmitry Baryshkov and Vladimir Zapolskiy). - Fix memory leak with the Sun501 driver (Xiaobing Luo)." * tag 'cpufreq-arm-fixes-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: qcom-cpufreq-hw: Clear dcvs interrupts cpufreq: fix memory leak in sun50i_cpufreq_nvmem_probe cpufreq: qcom-cpufreq-hw: Fix throttle frequency value on EPSS platforms cpufreq: qcom-hw: provide online/offline operations cpufreq: qcom-hw: fix the opp entries refcounting cpufreq: qcom-hw: fix the race between LMH worker and cpuhp cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
2022-04-27hex2bin: fix access beyond string endMikulas Patocka
If we pass too short string to "hex2bin" (and the string size without the terminating NUL character is even), "hex2bin" reads one byte after the terminating NUL character. This patch fixes it. Note that hex_to_bin returns -1 on error and hex2bin return -EINVAL on error - so we can't just return the variable "hi" or "lo" on error. This inconsistency may be fixed in the next merge window, but for the purpose of fixing this bug, we just preserve the existing behavior and return -1 and -EINVAL. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Fixes: b78049831ffe ("lib: add error checking to hex2bin") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27hex2bin: make the function hex_to_bin constant-timeMikulas Patocka
The function hex2bin is used to load cryptographic keys into device mapper targets dm-crypt and dm-integrity. It should take constant time independent on the processed data, so that concurrently running unprivileged code can't infer any information about the keys via microarchitectural convert channels. This patch changes the function hex_to_bin so that it contains no branches and no memory accesses. Note that this shouldn't cause performance degradation because the size of the new function is the same as the size of the old function (on x86-64) - and the new function causes no branch misprediction penalties. I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64 i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32 sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64 powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no branches in the generated code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27bus: fsl-mc-msi: Fix MSI descriptor mutex lock for msi_first_desc()Shin'ichiro Kawasaki
Commit e8604b1447b4 introduced a call to the helper function msi_first_desc(), which needs MSI descriptor mutex lock before call. However, the required mutex lock was not added. This results in lockdep assertion: WARNING: CPU: 4 PID: 119 at kernel/irq/msi.c:274 msi_first_desc+0xd0/0x10c msi_first_desc+0xd0/0x10c fsl_mc_msi_domain_alloc_irqs+0x7c/0xc0 fsl_mc_populate_irq_pool+0x80/0x3cc Fix this by adding the mutex lock and unlock around the function call. Fixes: e8604b1447b4 ("bus: fsl-mc-msi: Simplify MSI descriptor handling") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412075636.755454-1-shinichiro.kawasaki@wdc.com
2022-04-27kernfs: fix NULL dereferencing in kernfs_removeMinchan Kim
kernfs_remove supported NULL kernfs_node param to bail out but revent per-fs lock change introduced regression that dereferencing the param without NULL check so kernel goes crash. This patch checks the NULL kernfs_node in kernfs_remove and if so, just return. Quote from bug report by Jirka ``` The bug is triggered by running NAS Parallel benchmark suite on SuperMicro servers with 2x Xeon(R) Gold 6126 CPU. Here is the error log: [ 247.035564] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 247.036009] #PF: supervisor read access in kernel mode [ 247.036009] #PF: error_code(0x0000) - not-present page [ 247.036009] PGD 0 P4D 0 [ 247.036009] Oops: 0000 [#1] PREEMPT SMP PTI [ 247.058060] CPU: 1 PID: 6546 Comm: umount Not tainted 5.16.0393c3714081a53795bbff0e985d24146def6f57f+ #16 [ 247.058060] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 2.0b 03/07/2018 [ 247.058060] RIP: 0010:kernfs_remove+0x8/0x50 [ 247.058060] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 49 c7 c4 f4 ff ff ff eb b2 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41 54 55 <48> 8b 47 08 48 89 fd 48 85 c0 48 0f 44 c7 4c 8b 60 50 49 83 c4 60 [ 247.058060] RSP: 0018:ffffbbfa48a27e48 EFLAGS: 00010246 [ 247.058060] RAX: 0000000000000001 RBX: ffffffff89e31f98 RCX: 0000000080200018 [ 247.058060] RDX: 0000000080200019 RSI: fffff6760786c900 RDI: 0000000000000000 [ 247.058060] RBP: ffffffff89e31f98 R08: ffff926b61b24d00 R09: 0000000080200018 [ 247.122048] R10: ffff926b61b24d00 R11: ffff926a8040c000 R12: ffff927bd09a2000 [ 247.122048] R13: ffffffff89e31fa0 R14: dead000000000122 R15: dead000000000100 [ 247.122048] FS: 00007f01be0a8c40(0000) GS:ffff926fa8e40000(0000) knlGS:0000000000000000 [ 247.122048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 247.122048] CR2: 0000000000000008 CR3: 00000001145c6003 CR4: 00000000007706e0 [ 247.122048] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 247.122048] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 247.122048] PKRU: 55555554 [ 247.122048] Call Trace: [ 247.122048] <TASK> [ 247.122048] rdt_kill_sb+0x29d/0x350 [ 247.122048] deactivate_locked_super+0x36/0xa0 [ 247.122048] cleanup_mnt+0x131/0x190 [ 247.122048] task_work_run+0x5c/0x90 [ 247.122048] exit_to_user_mode_prepare+0x229/0x230 [ 247.122048] syscall_exit_to_user_mode+0x18/0x40 [ 247.122048] do_syscall_64+0x48/0x90 [ 247.122048] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 247.122048] RIP: 0033:0x7f01be2d735b ``` Link: https://bugzilla.kernel.org/show_bug.cgi?id=215696 Link: https://lore.kernel.org/lkml/CAE4VaGDZr_4wzRn2___eDYRtmdPaGGJdzu_LCSkJYuY9BEO3cw@mail.gmail.com/ Fixes: 393c3714081a (kernfs: switch global kernfs_rwsem lock to per-fs lock) Cc: stable@vger.kernel.org Reported-by: Jirka Hladky <jhladky@redhat.com> Tested-by: Jirka Hladky <jhladky@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Minchan Kim <minchan@kernel.org> Link: https://lore.kernel.org/r/20220427172152.3505364-1-minchan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27Merge tag 'zonefs-5.18-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fixes from Damien Le Moal: "Two fixes for rc5: - Fix inode initialization to make sure that the inode flags are all cleared. - Use zone reset operation instead of close to make sure that the zone of an empty sequential file in never in an active state after closing the file" * tag 'zonefs-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Fix management of open zones zonefs: Clear inode information flags on inode creation