Age | Commit message (Collapse) | Author |
|
Add new -z argument to specify max IOV size. By default, use
single large IOV.
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sendmsg() with a single iov becomes ITER_UBUF, sendmsg() with multiple
iovs becomes ITER_IOVEC. iter_iov_len does not return correct
value for UBUF, so teach to treat UBUF differently.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Fixes: bd61848900bf ("net: devmem: Implement TX path")
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Acked-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signal vt subsystem to redraw console when switching to dummycon
with deferred takeover enabled. Makes the console switch to fbcon
and displays the available output.
With deferred takeover enabled, dummycon acts as the placeholder
until the first output to the console happens. At that point, fbcon
takes over. If the output happens while dummycon is not active, it
cannot inform fbcon. This is the case if the vt subsystem runs in
graphics mode.
A typical graphical boot starts plymouth, a display manager and a
compositor; all while leaving out dummycon. Switching to a text-mode
console leaves the console with dummycon even if a getty terminal
has been started.
Returning true from dummycon's con_switch helper signals the vt
subsystem to redraw the screen. If there's output available dummycon's
con_putc{s} helpers trigger deferred takeover of fbcon, which sets a
display mode and displays the output. If no output is available,
dummycon remains active.
v2:
- make the comment slightly more verbose (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: Andrei Borzenkov <arvidjaar@gmail.com>
Closes: https://bugzilla.suse.com/show_bug.cgi?id=1242191
Tested-by: Andrei Borzenkov <arvidjaar@gmail.com>
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Fixes: 83d83bebf401 ("console/fbcon: Add support for deferred console takeover")
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: linux-fbdev@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.19+
Link: https://lore.kernel.org/r/20250520071418.8462-1-tzimmermann@suse.de
|
|
ksmbd accesses crc32 using normal function calls (as opposed to e.g.
the generic crypto infrastructure's name-based algorithm resolution), so
there is no need to declare a module softdep.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ksmbd_gen_sd_hash() does not support any other algorithm, so the
crypto_shash abstraction provides no value. Just use the SHA-256
library API instead, which is much simpler and easier to use.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
irq_fpu_usable() incorrectly returned true before the FPU is
initialized. The x86 CPU onlining code can call sha256() to checksum
AMD microcode images, before the FPU is initialized. Since sha256()
recently gained a kernel-mode FPU optimized code path, a crash occurred
in kernel_fpu_begin_mask() during hotplug CPU onlining.
(The crash did not occur during boot-time CPU onlining, since the
optimized sha256() code is not enabled until subsys_initcalls run.)
Fix this by making irq_fpu_usable() return false before fpu__init_cpu()
has run. To do this without adding any additional overhead to
irq_fpu_usable(), replace the existing per-CPU bool in_kernel_fpu with
kernel_fpu_allowed which tracks both initialization and usage rather
than just usage. The initial state is false; FPU initialization sets it
to true; kernel-mode FPU sections toggle it to false and then back to
true; and CPU offlining restores it to the initial state of false.
Fixes: 11d7956d526f ("crypto: x86/sha256 - implement library instead of shash")
Reported-by: Ayush Jain <Ayush.Jain3@amd.com>
Closes: https://lore.kernel.org/r/20250516112217.GBaCcf6Yoc6LkIIryP@fat_crate.local
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Ayush Jain <Ayush.Jain3@amd.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
This is kind of last-minute, but Al Viro reported that the new
FOP_DONTCACHE flag causes memory corruption due to use-after-free
issues.
This was triggered by commit 974c5e6139db ("xfs: flag as supporting
FOP_DONTCACHE"), but that is not the underlying bug - it is just the
first user of the flag.
Vlastimil Babka suspects the underlying problem stems from the
folio_end_writeback() logic introduced in commit fb7d3bc414939
("mm/filemap: drop streaming/uncached pages when writeback completes").
The most straightforward fix would be to just revert the commit that
exposed this, but Matthew Wilcox points out that other filesystems are
also starting to enable the FOP_DONTCACHE logic, so this instead
disables that bit globally for now.
The fix will hopefully end up being trivial and we can just re-enable
this logic after more testing, but until such a time we'll have to
disable the new FOP_DONTCACHE flag.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/all/20250525083209.GS2023217@ZenIV/
Triggered-by: 974c5e6139db ("xfs: flag as supporting FOP_DONTCACHE")
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The driver currently uses dev_err for messages that have a very low probability
of being read by the user as the error will probably never happen and the
systems with the RTC probably don't have any user able to read the message.
Moreover, the only user action after getting this message is the restart the
action so drop the level to dev_dbg.
Link: https://lore.kernel.org/r/20250525222153.1472917-1-alexandre.belloni@bootlin.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
The ocillator on the m41t62 (and other chips of this type) needs
a kickstart upon a failure; the RTC read routine will notice the
oscillator failure and fail reads. This is added in the RTC write
routine; this allows the system to know that the time in the RTC
is accurate. This is following the procedure described in section
3.11 of "https://www.st.com/resource/en/datasheet/m41t62.pdf"
Signed-off-by: A. Niyas Ahamed Mydeen <nmydeen@mvista.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Link: https://lore.kernel.org/r/20250402120546.336657-2-nmydeen@mvista.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
Add a RTC driver for NXP S32G2/S32G3 SoCs.
RTC tracks clock time during system suspend. It can be a wakeup source
for the S32G2/S32G3 SoC based boards.
The RTC module from S32G2/S32G3 is not battery-powered and it is not kept
alive during system reset.
Co-developed-by: Bogdan Hamciuc <bogdan.hamciuc@nxp.com>
Signed-off-by: Bogdan Hamciuc <bogdan.hamciuc@nxp.com>
Co-developed-by: Ghennadi Procopciuc <Ghennadi.Procopciuc@nxp.com>
Signed-off-by: Ghennadi Procopciuc <Ghennadi.Procopciuc@nxp.com>
Signed-off-by: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Tested-by: Enric Balletbo i Serra <eballetbo@kernel.org>
Link: https://lore.kernel.org/r/20250403103346.3064895-3-ciprianmarian.costea@oss.nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
RTC tracks clock time during system suspend and it is used as a wakeup
source on S32G2/S32G3 architecture.
RTC from S32G2/S32G3 is not battery-powered and it is not kept alive
during system reset.
Co-developed-by: Bogdan-Gabriel Roman <bogdan-gabriel.roman@nxp.com>
Signed-off-by: Bogdan-Gabriel Roman <bogdan-gabriel.roman@nxp.com>
Co-developed-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
Signed-off-by: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250403103346.3064895-2-ciprianmarian.costea@oss.nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
Rust 1.87 (released on 2025-05-15) compiles core library with edition
2024 instead of 2021 [1]. Ensure that the edition matches libcore's
expectation to avoid potential breakage.
[ J3m3 reported in Zulip [2] that the `rust-analyzer` target was
broken after this patch -- indeed, we need to avoid `core-cfgs`
since those are passed to the `rust-analyzer` target.
So, instead, I tweaked the patch to create a new `core-edition`
variable and explicitly mention the `--edition` flag instead of
reusing `core-cfg`s.
In addition, pass a new argument using this new variable to
`generate_rust_analyzer.py` so that we set the right edition there.
By the way, for future reference: the `filter-out` change is needed
for Rust < 1.87, since otherwise we would skip the `--edition=2021`
we just added, ending up with no edition flag, and thus the compiler
would default to the 2015 one.
[2] https://rust-for-linux.zulipchat.com/#narrow/channel/291565/topic/x/near/520206547
- Miguel ]
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/pull/138162 [1]
Reported-by: est31 <est31@protonmail.com>
Closes: https://github.com/Rust-for-Linux/linux/issues/1163
Signed-off-by: Gary Guo <gary@garyguo.net>
Link: https://lore.kernel.org/r/20250517085600.2857460-1-gary@garyguo.net
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Add missing Markdown code span.
This was found using the Clippy `doc_markdown` lint, which we may want
to enable.
Fixes: ad2907b4e308 ("rust: add dma coherent allocator abstraction")
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250324210359.1199574-6-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Add missing Markdown code spans and also convert them into intra-doc
links.
This was found using the Clippy `doc_markdown` lint, which we may want
to enable.
Fixes: e0020ba6cbcb ("rust: add PidNamespace")
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20250324210359.1199574-10-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
In particular:
- Add missing Markdown code spans.
- Improve title for `DeviceId`, adding a link to the struct in the
C side, rather than referring to `bindings::`.
- Convert `TODO` from documentation to a normal comment, and put code
in block.
This was found using the Clippy `doc_markdown` lint, which we may want
to enable.
Fixes: 1bd8b6b2c5d3 ("rust: pci: add basic PCI device / driver abstractions")
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250324210359.1199574-8-ojeda@kernel.org
[ Prefixed link text with `struct`. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Add missing Markdown code span.
This was found using the Clippy `doc_markdown` lint, which we may want
to enable.
Fixes: dd09538fb409 ("rust: alloc: implement `Cmalloc` in module allocator_test")
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250324210359.1199574-5-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Add missing Markdown code spans.
This was found using the Clippy `doc_markdown` lint, which we may want
to enable.
Fixes: b6a006e21b82 ("rust: alloc: introduce allocation flags")
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250324210359.1199574-4-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Convert `TODO` from documentation to a normal comment, and put code in
block.
This was found using the Clippy `doc_markdown` lint, which we may want
to enable.
Fixes: 683a63befc73 ("rust: platform: add basic platform device / driver abstractions")
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250324210359.1199574-9-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
When CONFIG_HWMON is built as a loadable module, the HSMP drivers
cannot be built-in:
ERROR: modpost: "hsmp_create_sensor" [drivers/platform/x86/amd/hsmp/amd_hsmp.ko] undefined!
ERROR: modpost: "hsmp_create_sensor" [drivers/platform/x86/amd/hsmp/hsmp_acpi.ko] undefined!
Enforce that through the usual Kconfig dependnecy trick.
Fixes: 92c025db52bb ("platform/x86/amd/hsmp: Report power via hwmon sensors")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250522144422.2824083-1-arnd@kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The patch "Refactor Ally suspend/resume" introduced an
acpi_s2idle_dev_ops for use with ROG Ally which caused a build error
if CONFIG_SUSPEND was not defined.
Signed-off-by: Luke Jones <luke@ljones.dev>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505090418.DaeaXe4i-lkp@intel.com/
Fixes: feea7bd6b02d ("platform/x86: asus-wmi: Refactor Ally suspend/resume")
Link: https://lore.kernel.org/r/20250524101343.57883-2-luke@ljones.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"22 hotfixes.
13 are cc:stable and the remainder address post-6.14 issues or aren't
considered necessary for -stable kernels. 19 are for MM"
* tag 'mm-hotfixes-stable-2025-05-25-00-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
mailmap: add Jarkko's employer email address
mm: fix copy_vma() error handling for hugetlb mappings
memcg: always call cond_resched() after fn()
mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios
mm: vmalloc: only zero-init on vrealloc shrink
mm: vmalloc: actually use the in-place vrealloc region
alloc_tag: allocate percpu counters for module tags dynamically
module: release codetag section when module load fails
mm/cma: make detection of highmem_start more robust
MAINTAINERS: add mm memory policy section
MAINTAINERS: add mm ksm section
kasan: avoid sleepable page allocation from atomic context
highmem: add folio_test_partial_kmap()
MAINTAINERS: add hung-task detector section
taskstats: fix struct taskstats breaks backward compatibility since version 15
mm/truncate: fix out-of-bounds when doing a right-aligned split
MAINTAINERS: add mm reclaim section
MAINTAINERS: update page allocator section
mm: fix VM_UFFD_MINOR == VM_SHADOW_STACK on USERFAULTFD=y && ARM64_GCS=y
mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled
...
|
|
Correct spelling of platforms, various, and initial.
As flagged by codespell.
Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As it's name suggests, parse_eeprom() parses EEPROM data.
This is done by reading data, 16 bits at a time as follows:
for (i = 0; i < 128; i++)
((__le16 *) sromdata)[i] = cpu_to_le16(read_eeprom(np, i));
sromdata is at the same memory location as psrom.
And the type of psrom is a pointer to struct t_SROM.
As can be seen in the loop above, data is stored in sromdata, and thus
psrom, as 16-bit little-endian values. However, the integer fields of
t_SROM are host byte order.
In the case of the led_mode field this results in a but which has been
addressed by commit e7e5ae71831c ("net: dlink: Correct endianness
handling of led_mode").
In the case of the remaining fields, which are updated by this patch,
I do not believe this does not result in any bugs. But it does seem
best to correctly annotate the endianness of integers.
Flagged by Sparse as:
.../dl2k.c:344:35: warning: restricted __le32 degrades to integer
Compile tested only.
No run-time change intended.
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The AF driver assigns reserved MCAM entries (for unicast, broadcast,
etc.) based on the NIXLF number. When a NIXLF is detached, these entries
are disabled.
For example,
PF NIXLF
--------------------
PF0 0
SDP-VF0 1
If the user unbinds both PF0 and SDP-VF0 interfaces and then binds them in
reverse order
PF NIXLF
---------------------
SDP-VF0 0
PF0 1
In this scenario, the PF0 unicast entry is getting corrupted because
the MCAM entry contains stale data (SDP-VF0 ucast data)
This patch resolves the issue by clearing the unicast MCAM entry during
NIXLF detach
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the calibration delay process, some resources are shared, so it's
better to move it after the parallel execution part. Thanks to the
patch optimizing CPU delay calibration, this change has no impact on
the boot time improvements gained from CPU parallel boot.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
config ECONET selects SERIAL_OF_PLATFORM and that depends on SERIAL_8250
so we need to select SERIAL_8250 directly.
Also do not enable DEBUG_ZBOOT unless DEBUG_KERNEL is set.
Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505211654.CBdIsoTq-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202505211451.WRjyf3a9-lkp@intel.com/
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
The driver currently checks if the user is querying VF RoCE statistics.
It will not send the query_roce_stats_ext HWRM command if it is for a
VF. But Thor2 VF can support extended statistics.
Allow query of extended stats for Thor2 VFs.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Shravya KN <shravya.k-n@broadcom.com>
Link: https://patch.msgid.link/20250523075952.1267827-1-kalesh-anakkur.purayil@broadcom.com
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Avoid using __le32 directly in trace events to fix sparse complains:
drivers/infiniband/hw/hns/./hns_roce_trace.h:48:1: sparse: sparse: cast to restricted __le32
drivers/infiniband/hw/hns/./hns_roce_trace.h:48:1: sparse: sparse: restricted __le32 degrades to integer
drivers/infiniband/hw/hns/./hns_roce_trace.h:48:1: sparse: sparse: restricted __le32 degrades to integer
drivers/infiniband/hw/hns/./hns_roce_trace.h:90:1: sparse: sparse: cast to restricted __le32
drivers/infiniband/hw/hns/./hns_roce_trace.h:90:1: sparse: sparse: restricted __le32 degrades to integer
drivers/infiniband/hw/hns/./hns_roce_trace.h:90:1: sparse: sparse: restricted __le32 degrades to integer
drivers/infiniband/hw/hns/./hns_roce_trace.h:173:1: sparse: sparse: cast to restricted __le32
drivers/infiniband/hw/hns/./hns_roce_trace.h:173:1: sparse: sparse: restricted __le32 degrades to integer
drivers/infiniband/hw/hns/./hns_roce_trace.h:173:1: sparse: sparse: restricted __le32 degrades to integer
Fixes: 6c98c8670806 ("RDMA/hns: Add trace for WQE dumping")
Fixes: 1e63e2f96613 ("RDMA/hns: Add trace for AEQE dumping")
Fixes: 6bd18dabf1c9 ("RDMA/hns: Add trace for CMDQ dumping")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505170327.TNOpreil-lkp@intel.com/
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250523023433.2171003-1-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
The following warning is reported by sparse tool:
drivers/infiniband/hw/mlx5/fs.c:1664:26: warning: array of flexible
structures
Avoid it by simply splitting array into two separate structs.
Link: https://patch.msgid.link/7b891b96a9fc053d01284c184d25ae98d35db2d4.1747827041.git.leon@kernel.org
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Drop ib_send_cm_mra parameters which are always constant. Remove branch
which is never taken. Adjust name to ib_prepare_cm_mra, which better
reflects its functionality - no MRA is actually sent. Adjust name of
related tracepoints. Push setting of the constant service timeout to
cm.c and drop IB_CM_MRA_FLAG_DELAY.
Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Reviewed-by: Sean Hefty <shefty@nvidia.com>
Link: https://patch.msgid.link/cdd2a237acf2b495c19ce02e4b1c42c41c6751c2.1747827207.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Drivers such as rxe, which use virtual DMA, must not call into the DMA
mapping core since they lack physical DMA capabilities. Otherwise, a NULL
pointer dereference is observed as shown below. This patch ensures the RDMA
core handles virtual and physical DMA paths appropriately.
This fixes the following kernel oops:
BUG: kernel NULL pointer dereference, address: 00000000000002fc
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1028eb067 P4D 1028eb067 PUD 105da0067 PMD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 3 UID: 1000 PID: 1854 Comm: python3 Tainted: G W 6.15.0-rc1+ #11 PREEMPT(voluntary)
Tainted: [W]=WARN
Hardware name: Trigkey Key N/Key N, BIOS KEYN101 09/02/2024
RIP: 0010:hmm_dma_map_alloc+0x25/0x100
Code: 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 d6 49 c1 e6 0c 41 55 41 54 53 49 39 ce 0f 82 c6 00 00 00 49 89 fc <f6> 87 fc 02 00 00 20 0f 84 af 00 00 00 49 89 f5 48 89 d3 49 89 cf
RSP: 0018:ffffd3d3420eb830 EFLAGS: 00010246
RAX: 0000000000001000 RBX: ffff8b727c7f7400 RCX: 0000000000001000
RDX: 0000000000000001 RSI: ffff8b727c7f74b0 RDI: 0000000000000000
RBP: ffffd3d3420eb858 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 00007262a622a000 R14: 0000000000001000 R15: ffff8b727c7f74b0
FS: 00007262a62a1080(0000) GS:ffff8b762ac3e000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002fc CR3: 000000010a1f0004 CR4: 0000000000f72ef0
PKRU: 55555554
Call Trace:
<TASK>
ib_init_umem_odp+0xb6/0x110 [ib_uverbs]
ib_umem_odp_get+0xf0/0x150 [ib_uverbs]
rxe_odp_mr_init_user+0x71/0x170 [rdma_rxe]
rxe_reg_user_mr+0x217/0x2e0 [rdma_rxe]
ib_uverbs_reg_mr+0x19e/0x2e0 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xd9/0x150 [ib_uverbs]
ib_uverbs_cmd_verbs+0xd19/0xee0 [ib_uverbs]
? mmap_region+0x63/0xd0
? __pfx_ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x10/0x10 [ib_uverbs]
ib_uverbs_ioctl+0xba/0x130 [ib_uverbs]
__x64_sys_ioctl+0xa4/0xe0
x64_sys_call+0x1178/0x2660
do_syscall_64+0x7e/0x170
? syscall_exit_to_user_mode+0x4e/0x250
? do_syscall_64+0x8a/0x170
? do_syscall_64+0x8a/0x170
? syscall_exit_to_user_mode+0x4e/0x250
? do_syscall_64+0x8a/0x170
? syscall_exit_to_user_mode+0x4e/0x250
? do_syscall_64+0x8a/0x170
? do_user_addr_fault+0x1d2/0x8d0
? irqentry_exit_to_user_mode+0x43/0x250
? irqentry_exit+0x43/0x50
? exc_page_fault+0x93/0x1d0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7262a6124ded
Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
RSP: 002b:00007fffd08c3960 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fffd08c39f0 RCX: 00007262a6124ded
RDX: 00007fffd08c3a10 RSI: 00000000c0181b01 RDI: 0000000000000007
RBP: 00007fffd08c39b0 R08: 0000000014107820 R09: 00007fffd08c3b44
R10: 000000000000000c R11: 0000000000000246 R12: 00007fffd08c3b44
R13: 000000000000000c R14: 00007fffd08c3b58 R15: 0000000014107960
</TASK>
Fixes: 1efe8c0670d6 ("RDMA/core: Convert UMEM ODP DMA mapping to caching IOVA and page linkage")
Closes: https://lore.kernel.org/all/3e8f343f-7d66-4f7a-9f08-3910623e322f@gmail.com/
Signed-off-by: Daisuke Matsuda <dskmtsd@gmail.com>
Link: https://patch.msgid.link/20250524144328.4361-1-dskmtsd@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Commit e77aff5528a18 ("binderfs: fix use-after-free in binder_devices")
addressed a use-after-free where devices could be released without first
being removed from the binder_devices list. However, there is a similar
path in binder_free_proc() that was missed:
==================================================================
BUG: KASAN: slab-use-after-free in binder_remove_device+0xd4/0x100
Write of size 8 at addr ffff0000c773b900 by task umount/467
CPU: 12 UID: 0 PID: 467 Comm: umount Not tainted 6.15.0-rc7-00138-g57483a362741 #9 PREEMPT
Hardware name: linux,dummy-virt (DT)
Call trace:
binder_remove_device+0xd4/0x100
binderfs_evict_inode+0x230/0x2f0
evict+0x25c/0x5dc
iput+0x304/0x480
dentry_unlink_inode+0x208/0x46c
__dentry_kill+0x154/0x530
[...]
Allocated by task 463:
__kmalloc_cache_noprof+0x13c/0x324
binderfs_binder_device_create.isra.0+0x138/0xa60
binder_ctl_ioctl+0x1ac/0x230
[...]
Freed by task 215:
kfree+0x184/0x31c
binder_proc_dec_tmpref+0x33c/0x4ac
binder_deferred_func+0xc10/0x1108
process_one_work+0x520/0xba4
[...]
==================================================================
Call binder_remove_device() within binder_free_proc() to ensure the
device is removed from the binder_devices list before being kfreed.
Cc: stable@vger.kernel.org
Fixes: 12d909cac1e1 ("binderfs: add new binder devices to binder_devices")
Reported-by: syzbot+4af454407ec393de51d6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4af454407ec393de51d6
Tested-by: syzbot+4af454407ec393de51d6@syzkaller.appspotmail.com
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20250524220758.915028-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The manually updated table of contents and section numbering are hard
to maintain.
Make changes similar to the following commits:
5e8f0ba38a4d ("docs/kbuild/makefiles: throw out the local table of contents")
1a4c1c9df72e ("docs/kbuild/makefiles: drop section numbering, use references")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Explicitly adding MODULE_IMPORT_NS("module:...") is not allowed.
Currently, this is only checked at run time. That is, when such a
module is loaded, an error message like the following is shown:
foo: module tries to import module namespace: module:bar
Obviously, checking this at compile time improves usability.
In such a case, modpost will report the following error at compile time:
ERROR: modpost: foo: explicitly importing namespace "module:bar" is not allowed.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
scripts/Makefile.lib is included by the following Makefiles:
scripts/Makefile.build
scripts/Makefile.modfinal
scripts/Makefile.package
scripts/Makefile.vmlinux
scripts/Makefile.vmlinux_o
However, the last four do not need to process Kbuild syntax such as
obj-*, lib-*, subdir-*, etc.
Move the relevant code to scripts/Makefile.build.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas.schier@linux.dev>
|
|
archscripts has nothing to do with headers_install.
Signed-off-by: Henrik Lindström <henrik@lxm.se>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Document the "byte_size" and "type_string" kABI stability rules.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Change the gendwarfksyms documentation to use proper chapter,
section, and subsection adornments instead of fragile section
numbers.
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
In rare situations where distributions must make significant
changes to otherwise opaque data structures that have
inadvertently been included in the published ABI, keeping
symbol versions stable using the existing kABI macros can
become tedious.
For example, Android decided to switch to a newer io_uring
implementation in the 5.10 GKI kernel "to resolve a huge number
of potential, and known, problems with the codebase," requiring
"horrible hacks" with genksyms:
"A number of the io_uring structures get used in other core
kernel structures, only as "opaque" pointers, so there is
not any real ABI breakage. But, due to the visibility of
the structures going away, the CRC values of many scheduler
variables and functions were changed."
-- https://r.android.com/2425293
While these specific changes probably could have been hidden
from gendwarfksyms using the existing kABI macros, this may not
always be the case.
Add a last resort kABI rule that allows distribution
maintainers to fully override a type string for a symbol or a
type. Also add a more informative error message in case we find
a non-existent type references when calculating versions.
Suggested-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
A data structure can be partially opaque to modules if its
allocation is handled by the core kernel, and modules only need
to access some of its members. In this situation, it's possible
to append new members to the structure without breaking the ABI,
as long as the layout for the original members remains unchanged.
For example, consider the following struct:
struct s {
unsigned long a;
void *p;
};
gendwarfksyms --stable --dump-dies produces the following type
expansion:
variable structure_type s {
member base_type long unsigned int byte_size(8) encoding(7) a
data_member_location(0) ,
member pointer_type {
base_type void
} byte_size(8) p data_member_location(8)
} byte_size(16)
To append new members, we can use the KABI_IGNORE() macro to
hide them from gendwarfksyms --stable:
struct s {
/* old members with unchanged layout */
unsigned long a;
void *p;
/* new members not accessed by modules */
KABI_IGNORE(0, unsigned long n);
};
However, we can't hide the fact that adding new members changes
the struct size, as seen in the updated type string:
variable structure_type s {
member base_type long unsigned int byte_size(8) encoding(7) a
data_member_location(0) ,
member pointer_type {
base_type void
} byte_size(8) p data_member_location(8)
} byte_size(24)
In order to support this use case, add a kABI rule that makes it
possible to override the byte_size attribute for types:
/*
* struct s allocation is handled by the kernel, so
* appending new members without changing the original
* layout won't break the ABI.
*/
KABI_BYTE_SIZE(s, 16);
This results in a type string that's unchanged from the original
and therefore, won't change versions for symbols that reference
the changed structure.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Reduce code duplication by moving kABI rule look-ups to separate
functions.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Helper macro to more easily limit the export of a symbol to a given
list of modules.
Eg:
EXPORT_SYMBOL_GPL_FOR_MODULES(preempt_notifier_inc, "kvm");
will limit the use of said function to kvm.ko, any other module trying
to use this symbol will refure to load (and get modpost build
failures).
Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Requested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Sean noted that scripts/Makefile.lib:name-fix-token rule will mangle
the module name with s/-/_/g.
Since this happens late in the build, only the kernel needs to bother
with this, the modpost tool still sees the original name.
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Instead of only accepting "module:${name}", extend it with a comma
separated list of module names and add tail glob support.
That is, something like: "module:foo-*,bar" is now possible.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Designate the "module:${modname}" symbol namespace to mean: 'only
export to the named module'.
Notably, explicit imports of anything in the "module:" space is
forbidden.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Slight cleanup by using a for() loop instead of while(). This makes it
clearer what is the iteration and what is the actual work done.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add the current employer email address to mailmap.
Link: https://lkml.kernel.org/r/20250523121105.15850-1-jarkko@kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Cc: Antonio Quartulli <antonio@openvpn.net>
Cc: Carlos Bilbao <carlos.bilbao@kernel.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If, during a mremap() operation for a hugetlb-backed memory mapping,
copy_vma() fails after the source vma has been duplicated and opened (ie.
vma_link() fails), the error is handled by closing the new vma. This
updates the hugetlbfs reservation counter of the reservation map which at
this point is referenced by both the source vma and the new copy. As a
result, once the new vma has been freed and copy_vma() returns, the
reservation counter for the source vma will be incorrect.
This patch addresses this corner case by clearing the hugetlb private page
reservation reference for the new vma and decrementing the reference
before closing the vma, so that vma_close() won't update the reservation
counter. This is also what copy_vma_and_data() does with the source vma
if copy_vma() succeeds, so a helper function has been added to do the
fixup in both functions.
The issue was reported by a private syzbot instance and can be reproduced
using the C reproducer in [1]. It's also a possible duplicate of public
syzbot report [2]. The WARNING report is:
============================================================
page_counter underflow: -1024 nr_pages=1024
WARNING: CPU: 0 PID: 3287 at mm/page_counter.c:61 page_counter_cancel+0xf6/0x120
Modules linked in:
CPU: 0 UID: 0 PID: 3287 Comm: repro__WARNING_ Not tainted 6.15.0-rc7+ #54 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
RIP: 0010:page_counter_cancel+0xf6/0x120
Code: ff 5b 41 5e 41 5f 5d c3 cc cc cc cc e8 f3 4f 8f ff c6 05 64 01 27 06 01 48 c7 c7 60 15 f8 85 48 89 de 4c 89 fa e8 2a a7 51 ff <0f> 0b e9 66 ff ff ff 44 89 f9 80 e1 07 38 c1 7c 9d 4c 81
RSP: 0018:ffffc900025df6a0 EFLAGS: 00010246
RAX: 2edfc409ebb44e00 RBX: fffffffffffffc00 RCX: ffff8880155f0000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff81c4a23c R09: 1ffff1100330482a
R10: dffffc0000000000 R11: ffffed100330482b R12: 0000000000000000
R13: ffff888058a882c0 R14: ffff888058a882c0 R15: 0000000000000400
FS: 0000000000000000(0000) GS:ffff88808fc53000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b33e0 CR3: 00000000076d6000 CR4: 00000000000006f0
Call Trace:
<TASK>
page_counter_uncharge+0x33/0x80
hugetlb_cgroup_uncharge_counter+0xcb/0x120
hugetlb_vm_op_close+0x579/0x960
? __pfx_hugetlb_vm_op_close+0x10/0x10
remove_vma+0x88/0x130
exit_mmap+0x71e/0xe00
? __pfx_exit_mmap+0x10/0x10
? __mutex_unlock_slowpath+0x22e/0x7f0
? __pfx_exit_aio+0x10/0x10
? __up_read+0x256/0x690
? uprobe_clear_state+0x274/0x290
? mm_update_next_owner+0xa9/0x810
__mmput+0xc9/0x370
exit_mm+0x203/0x2f0
? __pfx_exit_mm+0x10/0x10
? taskstats_exit+0x32b/0xa60
do_exit+0x921/0x2740
? do_raw_spin_lock+0x155/0x3b0
? __pfx_do_exit+0x10/0x10
? __pfx_do_raw_spin_lock+0x10/0x10
? _raw_spin_lock_irq+0xc5/0x100
do_group_exit+0x20c/0x2c0
get_signal+0x168c/0x1720
? __pfx_get_signal+0x10/0x10
? schedule+0x165/0x360
arch_do_signal_or_restart+0x8e/0x7d0
? __pfx_arch_do_signal_or_restart+0x10/0x10
? __pfx___se_sys_futex+0x10/0x10
syscall_exit_to_user_mode+0xb8/0x2c0
do_syscall_64+0x75/0x120
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x422dcd
Code: Unable to access opcode bytes at 0x422da3.
RSP: 002b:00007ff266cdb208 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00007ff266cdbcdc RCX: 0000000000422dcd
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000004c7bec
RBP: 00007ff266cdb220 R08: 203a6362696c6720 R09: 203a6362696c6720
R10: 0000200000c00000 R11: 0000000000000246 R12: ffffffffffffffd0
R13: 0000000000000002 R14: 00007ffe1cb5f520 R15: 00007ff266cbb000
</TASK>
============================================================
Link: https://lkml.kernel.org/r/20250523-warning_in_page_counter_cancel-v2-1-b6df1a8cfefd@igalia.com
Link: https://people.igalia.com/rcn/kernel_logs/20250422__WARNING_in_page_counter_cancel__repro.c [1]
Link: https://lore.kernel.org/all/67000a50.050a0220.49194.048d.GAE@google.com/ [2]
Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Florent Revest <revest@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|