Age | Commit message (Collapse) | Author |
|
It appears that stmmac is not the only hardware which requires a
software-driven verification state machine for the MAC Merge layer.
While on the one hand it's good to encourage hardware implementations,
on the other hand it's quite difficult to tolerate multiple drivers
implementing independently fairly non-trivial logic.
Extract the hardware-independent logic from stmmac into library code and
put it in ethtool. Name the state structure "mmsv" for MAC Merge
Software Verification. Let this expose an operations structure for
executing the hardware stuff: sync hardware with the tx_active boolean
(result of verification process), enable/disable the pMAC, send mPackets,
notify library of external events (reception of mPackets), as well as
link state changes.
Note that it is assumed that the external events are received in hardirq
context. If they are not, it is probably a good idea to disable hardirqs
when calling ethtool_mmsv_event_handle(), because the library does not
do so.
Also, the MM software verification process has no business with the
tx_min_frag_size, that is all the driver's to handle.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Tested-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
It looks like GPUs are used after shutdown is invoked.
Thus, breaking virtio gpu in the shutdown callback is not a good idea -
guest hangs attempting to finish console drawing, with these warnings:
[ 20.504464] WARNING: CPU: 0 PID: 568 at drivers/gpu/drm/virtio/virtgpu_vq.c:358 virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu]
[ 20.505685] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink vfat fat intel_rapl_msr intel_rapl_common intel_uncore_frequency_common nfit libnvdimm kvm_intel kvm rapl iTCO_wdt iTCO_vendor_support virtio_gpu virtio_dma_buf pcspkr drm_shmem_helper i2c_i801 drm_kms_helper lpc_ich i2c_smbus virtio_balloon joydev drm fuse xfs libcrc32c ahci libahci crct10dif_pclmul crc32_pclmul crc32c_intel libata virtio_net ghash_clmulni_intel net_failover virtio_blk failover serio_raw dm_mirror dm_region_hash dm_log dm_mod
[ 20.511847] CPU: 0 PID: 568 Comm: kworker/0:3 Kdump: loaded Tainted: G W ------- --- 5.14.0-578.6675_1757216455.el9.x86_64 #1
[ 20.513157] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20241117-3.el9 11/17/2024
[ 20.513918] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
[ 20.514626] RIP: 0010:virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu]
[ 20.515332] Code: 00 00 48 85 c0 74 0c 48 8b 78 08 48 89 ee e8 51 50 00 00 65 ff 0d 42 e3 74 3f 0f 85 69 ff ff ff 0f 1f 44 00 00 e9 5f ff ff ff <0f> 0b e9 3f ff ff ff 48 83 3c 24 00 74 0e 49 8b 7f 40 48 85 ff 74
[ 20.517272] RSP: 0018:ff34f0a8c0787ad8 EFLAGS: 00010282
[ 20.517820] RAX: 00000000fffffffb RBX: 0000000000000000 RCX: 0000000000000820
[ 20.518565] RDX: 0000000000000000 RSI: ff34f0a8c0787be0 RDI: ff218bef03a26300
[ 20.519308] RBP: ff218bef03a26300 R08: 0000000000000001 R09: ff218bef07224360
[ 20.520059] R10: 0000000000008dc0 R11: 0000000000000002 R12: ff218bef02630028
[ 20.520806] R13: ff218bef0263fb48 R14: ff218bef00cb8000 R15: ff218bef07224360
[ 20.521555] FS: 0000000000000000(0000) GS:ff218bef7ba00000(0000) knlGS:0000000000000000
[ 20.522397] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 20.522996] CR2: 000055ac4f7871c0 CR3: 000000010b9f2002 CR4: 0000000000771ef0
[ 20.523740] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 20.524477] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 20.525223] PKRU: 55555554
[ 20.525515] Call Trace:
[ 20.525777] <TASK>
[ 20.526003] ? show_trace_log_lvl+0x1c4/0x2df
[ 20.526464] ? show_trace_log_lvl+0x1c4/0x2df
[ 20.526925] ? virtio_gpu_queue_fenced_ctrl_buffer+0x82/0x2c0 [virtio_gpu]
[ 20.527643] ? virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu]
[ 20.528282] ? __warn+0x7e/0xd0
[ 20.528621] ? virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu]
[ 20.529256] ? report_bug+0x100/0x140
[ 20.529643] ? handle_bug+0x3c/0x70
[ 20.530010] ? exc_invalid_op+0x14/0x70
[ 20.530421] ? asm_exc_invalid_op+0x16/0x20
[ 20.530862] ? virtio_gpu_queue_ctrl_sgs+0x236/0x290 [virtio_gpu]
[ 20.531506] ? virtio_gpu_queue_ctrl_sgs+0x174/0x290 [virtio_gpu]
[ 20.532148] virtio_gpu_queue_fenced_ctrl_buffer+0x82/0x2c0 [virtio_gpu]
[ 20.532843] virtio_gpu_primary_plane_update+0x3e2/0x460 [virtio_gpu]
[ 20.533520] drm_atomic_helper_commit_planes+0x108/0x320 [drm_kms_helper]
[ 20.534233] drm_atomic_helper_commit_tail+0x45/0x80 [drm_kms_helper]
[ 20.534914] commit_tail+0xd2/0x130 [drm_kms_helper]
[ 20.535446] drm_atomic_helper_commit+0x11b/0x140 [drm_kms_helper]
[ 20.536097] drm_atomic_commit+0xa4/0xe0 [drm]
[ 20.536588] ? __pfx___drm_printfn_info+0x10/0x10 [drm]
[ 20.537162] drm_atomic_helper_dirtyfb+0x192/0x270 [drm_kms_helper]
[ 20.537823] drm_fbdev_shmem_helper_fb_dirty+0x43/0xa0 [drm_shmem_helper]
[ 20.538536] drm_fb_helper_damage_work+0x87/0x160 [drm_kms_helper]
[ 20.539188] process_one_work+0x194/0x380
[ 20.539612] worker_thread+0x2fe/0x410
[ 20.540007] ? __pfx_worker_thread+0x10/0x10
[ 20.540456] kthread+0xdd/0x100
[ 20.540791] ? __pfx_kthread+0x10/0x10
[ 20.541190] ret_from_fork+0x29/0x50
[ 20.541566] </TASK>
[ 20.541802] ---[ end trace 0000000000000000 ]---
It looks like the shutdown is called in the middle of console drawing, so
we should either wait for it to finish, or let drm handle the shutdown.
This patch implements this second option:
Add an option for drivers to bypass the common break+reset handling.
As DRM is careful to flush/synchronize outstanding buffers, it looks like
GPU can just have a NOP there.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Fixes: 8bd2fa086a04 ("virtio: break and reset virtio devices on device_shutdown()")
Cc: Eric Auger <eauger@redhat.com>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <8490dbeb6f79ed039e6c11d121002618972538a3.1744293540.git.mst@redhat.com>
|
|
Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
Recently in the discussion with David the idea of having
a common helper popped up. The helper converts the given
bits per word to bytes. The result will always be power-of-two
(e.g. for 37 bits it returns 8 bytes) or 0 for 0 input.
More details are in the respective code comment.
This mini-series introduces it and replaces current users
under drivers/spi and we expect more (and possibly some
lurking in other subsystems).
|
|
Vlastimil points out that commit a211c6550efc ("mm: page_alloc:
defrag_mode kswapd/kcompactd watermarks") switched kswapd from
zone_watermark_ok_safe() to the standard, percpu-cached version of reading
free pages, thus dropping the watermark safety precautions for systems
with high CPU counts (e.g. >212 cpus on 64G). Restore them.
Since zone_watermark_ok_safe() is no longer the right interface, and this
was the last caller of the function anyway, open-code the
zone_page_state_snapshot() conditional and delete the function.
Link: https://lkml.kernel.org/r/20250416135142.778933-2-hannes@cmpxchg.org
Fixes: a211c6550efc ("mm: page_alloc: defrag_mode kswapd/kcompactd watermarks")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
inode_to_wb() is used also for filesystems that don't support cgroup
writeback. For these filesystems inode->i_wb is stable during the
lifetime of the inode (it points to bdi->wb) and there's no need to hold
locks protecting the inode->i_wb dereference. Improve the warning in
inode_to_wb() to not trigger for these filesystems.
Link: https://lkml.kernel.org/r/20250412163914.3773459-3-agruenba@redhat.com
Fixes: aaa2cacf8184 ("writeback: add lockdep annotation to inode_to_wb()")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Alison reports an issue with fsdax when large extends end up using large
ZONE_DEVICE folios:
[ 417.796271] BUG: kernel NULL pointer dereference, address: 0000000000000b00
[ 417.796982] #PF: supervisor read access in kernel mode
[ 417.797540] #PF: error_code(0x0000) - not-present page
[ 417.798123] PGD 2a5c5067 P4D 2a5c5067 PUD 2a5c6067 PMD 0
[ 417.798690] Oops: Oops: 0000 [#1] SMP NOPTI
[ 417.799178] CPU: 5 UID: 0 PID: 1515 Comm: mmap Tainted: ...
[ 417.800150] Tainted: [O]=OOT_MODULE
[ 417.800583] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 417.801358] RIP: 0010:__lruvec_stat_mod_folio+0x7e/0x250
[ 417.801948] Code: ...
[ 417.803662] RSP: 0000:ffffc90002be3a08 EFLAGS: 00010206
[ 417.804234] RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000000002
[ 417.804984] RDX: ffffffff815652d7 RSI: 0000000000000000 RDI: ffffffff82a2beae
[ 417.805689] RBP: ffffc90002be3a28 R08: 0000000000000000 R09: 0000000000000000
[ 417.806384] R10: ffffea0007000040 R11: ffff888376ffe000 R12: 0000000000000001
[ 417.807099] R13: 0000000000000012 R14: ffff88807fe4ab40 R15: ffff888029210580
[ 417.807801] FS: 00007f339fa7a740(0000) GS:ffff8881fa9b9000(0000) knlGS:0000000000000000
[ 417.808570] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 417.809193] CR2: 0000000000000b00 CR3: 000000002a4f0004 CR4: 0000000000370ef0
[ 417.809925] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 417.810622] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 417.811353] Call Trace:
[ 417.811709] <TASK>
[ 417.812038] folio_add_file_rmap_ptes+0x143/0x230
[ 417.812566] insert_page_into_pte_locked+0x1ee/0x3c0
[ 417.813132] insert_page+0x78/0xf0
[ 417.813558] vmf_insert_page_mkwrite+0x55/0xa0
[ 417.814088] dax_fault_iter+0x484/0x7b0
[ 417.814542] dax_iomap_pte_fault+0x1ca/0x620
[ 417.815055] dax_iomap_fault+0x39/0x40
[ 417.815499] __xfs_write_fault+0x139/0x380
[ 417.815995] ? __handle_mm_fault+0x5e5/0x1a60
[ 417.816483] xfs_write_fault+0x41/0x50
[ 417.816966] xfs_filemap_fault+0x3b/0xe0
[ 417.817424] __do_fault+0x31/0x180
[ 417.817859] __handle_mm_fault+0xee1/0x1a60
[ 417.818325] ? debug_smp_processor_id+0x17/0x20
[ 417.818844] handle_mm_fault+0xe1/0x2b0
[...]
The issue is that when we split a large ZONE_DEVICE folio to order-0 ones,
we don't reset the order/_nr_pages. As folio->_nr_pages overlays
page[1]->memcg_data, once page[1] is a folio, it suddenly looks like it
has folio->memcg_data set. And we never manually initialize
folio->memcg_data in fsdax code, because we never expect it to be set at
all.
When __lruvec_stat_mod_folio() then stumbles over such a folio, it tries
to use folio->memcg_data (because it's non-NULL) but it does not actually
point at a memcg, resulting in the problem.
Alison also observed that these folios sometimes have "locked" set, which
is rather concerning (folios locked from the beginning ...). The reason
is that the order for large folios is stored in page[1]->flags, which
become the folio->flags of a new small folio.
Let's fix it by adding a folio helper to clear order/_nr_pages for
splitting purposes.
Maybe we should reinitialize other large folio flags / folio members as
well when splitting, because they might similarly cause harm once page[1]
becomes a folio? At least other flags in PAGE_FLAGS_SECOND should not be
set for fsdax, so at least page[1]->flags might be as expected with this
fix.
From a quick glimpse, initializing ->mapping, ->pgmap and ->share should
re-initialize most things from a previous page[1] used by large folios
that fsdax cares about. For example folio->private might not get
reinitialized, but maybe that's not relevant -- no traces of it's use in
fsdax code. Needs a closer look.
Another thing that should be considered in the future is performing
similar checks as we perform in free_tail_page_prepare()
-- checking pincount etc.
-- when freeing a large fsdax folio.
Link: https://lkml.kernel.org/r/20250410091020.119116-1-david@redhat.com
Fixes: 4996fc547f5b ("mm: let _folio_nr_pages overlay memcg_data in first tail page")
Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Alison Schofield <alison.schofield@intel.com>
Closes: https://lkml.kernel.org/r/Z_W9Oeg-D9FhImf3@aschofie-mobl2.lan
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: "Darrick J. Wong" <djwong@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When the last page in the zone is accepted, __accept_page() calls
static_branch_dec(). This function takes cpu_hotplug_lock, which can lead
to a deadlock if the allocation occurs during CPU bringup path as
_cpu_up() also takes the lock.
To prevent this deadlock, defer static_branch_dec() to a workqueue.
Call static_branch_dec() only when the workqueue is not yet initialized.
Workqueues are initialized before CPU bring up, so this will not conflict
with the first scenario.
Link: https://lkml.kernel.org/r/20250329171030.3942298-1-kirill.shutemov@linux.intel.com
Fixes: 55ad43e8ba0f ("mm: add a helper to accept page")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Srikanth Aithal <sraithal@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Ashish Kalra <ashish.kalra@amd.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: Thomas Lendacky <thomas.lendacky@amd.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Kuniyuki reports that the assert for netdev lock fires when
there are netdev event listeners (otherwise we skip the netlink
event generation).
Correct the locking when coming from the notifier.
The NETDEV_XDP_FEAT_CHANGE notifier is already fully locked,
it's the documentation that's incorrect.
Fixes: 99e44f39a8f7 ("netdev: depend on netdev->lock for xdp features")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Reported-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/20250410171019.62128-1-kuniyu@amazon.com
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250416030447.1077551-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Many drivers populate the stats buffer using C-String based APIs (e.g.
ethtool_sprintf() and ethtool_puts()), usually when building up the
list of stats individually (i.e. with a for() loop). This, however,
requires that the source strings be populated in such a way as to have
a terminating NUL byte in the source.
Other drivers populate the stats buffer directly using one big memcpy()
of an entire array of strings. No NUL termination is needed here, as the
bytes are being directly passed through. Yet others will build up the
stats buffer individually, but also use memcpy(). This, too, does not
need NUL termination of the source strings.
However, there are cases where the strings that populate the
source stats strings are exactly ETH_GSTRING_LEN long, and GCC
15's -Wunterminated-string-initialization option complains that the
trailing NUL byte has been truncated. This situation is fine only if the
driver is using the memcpy() approach. If the C-String APIs are used,
the destination string name will have its final byte truncated by the
required trailing NUL byte applied by the C-string API.
For drivers that are already using memcpy() but have initializers that
truncate the NUL terminator, mark their source strings as __nonstring to
silence the GCC warnings.
For drivers that have initializers that truncate the NUL terminator and
are using the C-String APIs, switch to memcpy() to avoid destination
string truncation and mark their source strings as __nonstring to silence
the GCC warnings. (Also introduce ethtool_cpy() as a helper to make this
an easy replacement).
Specifically the following warnings were investigated and addressed:
../drivers/net/ethernet/chelsio/cxgb/cxgb2.c:364:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization]
364 | "TxFramesAbortedDueToXSCollisions",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:165:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization]
165 | { ENETC_PM_R1523X(0), "MAC rx 1523 to max-octet packets" },
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:190:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization]
190 | { ENETC_PM_T1523X(0), "MAC tx 1523 to max-octet packets" },
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ethernet/google/gve/gve_ethtool.c:76:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization]
76 | "adminq_dcfg_device_resources_cnt", "adminq_set_driver_parameter_cnt",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:117:53: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization]
117 | STMMAC_STAT(ptp_rx_msg_type_pdelay_follow_up),
| ^
../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:46:12: note: in definition of macro 'STMMAC_STAT'
46 | { #m, sizeof_field(struct stmmac_extra_stats, m), \
| ^
../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:328:24: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization]
328 | .str = "a_mac_control_frames_transmitted",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:340:24: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization]
340 | .str = "a_pause_mac_ctrl_frames_received",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20250416010210.work.904-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc3).
No conflicts. Adjacent changes:
tools/net/ynl/pyynl/ynl_gen_c.py
4d07bbf2d456 ("tools: ynl-gen: don't declare loop iterator in place")
7e8ba0c7de2b ("tools: ynl: don't use genlmsghdr in classic netlink")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename get_ctx_arg_idx to bpf_ctx_arg_idx, and allow others to call it.
No functional change.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20250409214606.2000194-2-ameryhung@gmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. All are device-specific like quirks, new
IDs, and other safe (or rather boring) changes"
* tag 'sound-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
firmware: cs_dsp: test_bin_error: Fix uninitialized data used as fw version
ASoC: codecs: Add of_match_table for aw888081 driver
ASoC: fsl: fsl_qmc_audio: Reset audio data pointers on TRIGGER_START event
mailmap: Add entry for Srinivas Kandagatla
MAINTAINERS: use kernel.org alias
ASoC: cs42l43: Reset clamp override on jack removal
ALSA: hda/realtek - Fixed ASUS platform headset Mic issue
ALSA: hda/cirrus_scodec_test: Don't select dependencies
ALSA: azt2320: Replace deprecated strcpy() with strscpy()
ASoC: hdmi-codec: use RTD ID instead of DAI ID for ELD entry
ASoC: Intel: avs: Constrain path based on BE capabilities
ALSA: hda/tas2781: Remove unnecessary NULL check before release_firmware()
ASoC: Intel: avs: Fix null-ptr-deref in avs_component_probe()
ASoC: fsl_asrc_dma: get codec or cpu dai from backend
ASoC: qcom: Fix sc7280 lpass potential buffer overflow
ASoC: dwc: always enable/disable i2s irqs
ASoC: Intel: sof_sdw: Add quirk for Asus Zenbook S16
ASoC: codecs:lpass-wsa-macro: Fix logic of enabling vi channels
ASoC: codecs:lpass-wsa-macro: Fix vi feedback rate
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:
"Fixes:
- amd/pmf: Fix STT limits
- asus-laptop: Fix an uninitialized variable
- intel_pmc_ipc: Allow building without ACPI
- mlxbf-bootctl: Use sysfs_emit_at() in secure_boot_fuse_state_show()
- msi-wmi-platform: Add locking to workaround ACPI firmware bug
New HW support:
- alienware-wmi-wmax:
- Extended thermal control support to:
- Alienware Area-51m R2
- Alienware m16 R1
- Alienware m16 R2
- Dell G16 7630
- Dell G5 5505 SE
- G-Mode support to Alienware m16 R1
- x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data"
* tag 'platform-drivers-x86-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: msi-wmi-platform: Workaround a ACPI firmware bug
platform/x86: msi-wmi-platform: Rename "data" variable
platform/x86: alienware-wmi-wmax: Extend support to more laptops
platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1
platform/x86: amd: pmf: Fix STT limits
mlxbf-bootctl: use sysfs_emit_at() in secure_boot_fuse_state_show()
platform/x86: x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data
platform/x86: x86-android-tablets: Add "9v" to Vexia EDU ATLA 10 tablet symbols
asus-laptop: Fix an uninitialized variable
platform/x86: intel_pmc_ipc: add option to build without ACPI
|
|
This helper converts the given bits per word to bytes. The result
will always be power-of-two, e.g.,
=============== =================
Input (in bits) Output (in bytes)
=============== =================
5 1
9 2
21 4
37 8
=============== =================
It will return 0 for the 0 input.
There are a couple of cases in SPI that are using the same approach
and at least one more (in IIO) would benefit of it. Add a helper
for everyone.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250417152529.490582-2-andriy.shevchenko@linux.intel.com
Acked-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
In the review for iommu_copy_struct_to_user() helper, Matt pointed out that
a NULL pointer should be rejected prior to dereferencing it:
https://lore.kernel.org/all/86881827-8E2D-461C-BDA3-FA8FD14C343C@nvidia.com
And Alok pointed out a typo at the same time:
https://lore.kernel.org/all/480536af-6830-43ce-a327-adbd13dc3f1d@oracle.com
Since both issues were copied from iommu_copy_struct_from_user(), fix them
first in the current header.
Fixes: e9d36c07bb78 ("iommu: Add iommu_copy_struct_from_user helper")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Matthew R. Ochs <mochs@nvidia.com>
Link: https://lore.kernel.org/r/20250414191635.450472-1-nicolinc@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Generally drivers have a specific idea what their HW structure size should
be. In a lot of cases this is related to PAGE_SIZE, but not always. ARM64,
for example, allows a 4K IO page table size on a 64K CPU page table
system.
Currently we don't have any good support for sub page allocations, but
make the API accommodate this by accepting a sub page size from the caller
and rounding up internally.
This is done by moving away from order as the size input and using size:
size == 1 << (order + PAGE_SHIFT)
Following patches convert drivers away from using order and try to specify
allocation sizes independent of PAGE_SIZE.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/15-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This converts the remaining places using list of pages to the new API.
The Intel free path was shared with its gather path, so it is converted at
the same time.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/11-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
We want to get rid of struct page references outside the internal
allocator implementation. The free list has the driver open code something
like:
list_add_tail(&virt_to_page(ptr)->lru, freelist);
Move the above into a small inline and make the freelist into a wrapper
type 'struct iommu_pages_list' so that the compiler can help check all the
conversion.
This struct has also proven helpful in some future ideas to convert to a
singly linked list to get an extra pointer in the struct page, and to
signal that the pages should be freed with RCU.
Use a temporary _Generic so we don't need to rename the free function as
the patches progress.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When sending an skb over a socket using skb_send_sock_locked(),
it is currently not possible to specify any flag to be set in
msghdr->msg_flags.
However, we may want to pass flags the user may have specified,
like MSG_NOSIGNAL.
Extend __skb_send_sock() with a new argument 'flags' and add a
new interface named skb_send_sock_locked_with_flags().
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Link: https://patch.msgid.link/20250415-b4-ovpn-v26-12-577f6097b964@openvpn.net
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
{bdev,disk}_zone_capacity() takes block_device or gendisk and sector position
and returns the zone capacity of the corresponding zone.
With that, move disk_nr_zones() and blk_zone_plug_bio() to consolidate them in
the same #ifdef block.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently bdex_statx is only called from the very high-level
vfs_statx_path function, and thus bypassing it for in-kernel calls
to vfs_getattr or vfs_getattr_nosec.
This breaks querying the block ѕize of the underlying device in the
loop driver and also is a pitfall for any other new kernel caller.
Move the call into the lowest level helper to ensure all callers get
the right results.
Fixes: 2d985f8c6b91 ("vfs: support STATX_DIOALIGN on block devices")
Fixes: f4774e92aab8 ("loop: take the file system minimum dio alignment into account")
Reported-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/20250417064042.712140-1-hch@lst.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
When CONFIG_NEED_DMA_MAP_STATE is not defined, dma-mapping clients might
report unused data compilation warnings for dma_unmap_*() calls
arguments. Redefine macros for those calls to let compiler to notice that
it is okay when the provided arguments are not used.
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20250415075659.428549-1-m.szyprowski@samsung.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"31 hotfixes.
9 are cc:stable and the remainder address post-6.15 issues or aren't
considered necessary for -stable kernels.
22 patches are for MM, 9 are otherwise"
* tag 'mm-hotfixes-stable-2025-04-16-19-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits)
MAINTAINERS: update HUGETLB reviewers
mm: fix apply_to_existing_page_range()
selftests/mm: fix compiler -Wmaybe-uninitialized warning
alloc_tag: handle incomplete bulk allocations in vm_module_tags_populate
mailmap: add entry for Jean-Michel Hautbois
mm: (un)track_pfn_copy() fix + doc improvements
mm: fix filemap_get_folios_contig returning batches of identical folios
mm/hugetlb: add a line break at the end of the format string
selftests: mincore: fix tmpfs mincore test failure
mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
mm/cma: report base address of single range correctly
mm: page_alloc: speed up fallbacks in rmqueue_bulk()
kunit: slub: add module description
mm/kasan: add module decription
ucs2_string: add module description
zlib: add module description
fpga: tests: add module descriptions
samples/livepatch: add module descriptions
ASN.1: add module description
mm/vma: add give_up_on_oom option on modify/merge, use in uffd release
...
|
|
A subset of AMD Processors supporting Preferred Core Rankings also
feature the ability to dynamically switch these rankings at runtime to
bias load balancing towards or away from the LLC domain with larger
cache.
To support dynamically updating "sg->asym_prefer_cpu" without needing to
rebuild the sched domain, introduce sched_update_asym_prefer_cpu() which
recomutes the "asym_prefer_cpu" when the core-ranking of a CPU changes.
sched_update_asym_prefer_cpu() swaps the "sg->asym_prefer_cpu" with the
CPU whose ranking has changed if the new ranking is greater than that of
the "asym_prefer_cpu". If CPU whose ranking has changed is the current
"asym_prefer_cpu", it scans the CPUs of the sched groups to find the new
"asym_prefer_cpu" and sets it accordingly.
get_group() for non-overlapping sched domains returns the sched group
for the first CPU in the sched_group_span() which ensures all CPUs in
the group see the updated value of "asym_prefer_cpu".
Overlapping groups are allocated differently and will require moving the
"asym_prefer_cpu" to "sg->sgc" but since the current implementations do
not set "SD_ASYM_PACKING" at NUMA domains, skip additional
indirection and place a SCHED_WARN_ON() to alert any future users.
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250409053446.23367-3-kprateek.nayak@amd.com
|
|
Two members of the same or quite similar semantics is quite confusing
to begin with. Moreover, fwnode covers all possible firmware descriptions
that Linux kernel supports. Deprecate of_node in struct i2c_board_info,
so users will be warned and in the future there is a plan to convert
the users and remove it completely.
Tested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
lookup_or_create_module_kobject() is marked as static and __init,
to make it global drop static keyword.
Since this function can be called from non-init code, use __modinit
instead of __init, __modinit marker will make it __init if
CONFIG_MODULES is not defined.
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Shyam Saini <shyamsaini@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250227184930.34163-4-shyamsaini@linux.microsoft.com
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
|
|
Previously, device driver IPSec offload implementations would fall into
two categories:
1. Those that used xso.dev to determine the offload device.
2. Those that used xso.real_dev to determine the offload device.
The first category didn't work with bonding while the second did.
In a non-bonding setup the two pointers are the same.
This commit adds explicit pointers for the offload netdevice to
.xdo_dev_state_add() / .xdo_dev_state_delete() / .xdo_dev_state_free()
which eliminates the confusion and allows drivers from the first
category to work with bonding.
xso.real_dev now becomes a private pointer managed by the bonding
driver.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
If the bit CRYPTO_ALG_DUP_FIRST is set, an algorithm will be
duplicated by kmemdup before registration. This is inteded for
hardware-based algorithms that may be unplugged at will.
Do not use this if the algorithm data structure is embedded in a
bigger data structure. Perform the duplication in the driver
instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add a reqsize field to crypto_alg with the intention of replacing
the type-specific reqsize field currently used by ahash and acomp.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
These functions have been obsoleted by the type-specific init/exit
functions.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add helpers so that the ON_STACK request flag management is not
duplicated all over the place.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Request chaining requires the user to do too much book keeping.
Remove it from ahash.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The PTP_PEROUT_REQUEST2 ioctl has gained support for flags specifying
specific output behavior including PTP_PEROUT_ONE_SHOT,
PTP_PEROUT_DUTY_CYCLE, PTP_PEROUT_PHASE.
Driver authors are notorious for not checking the flags of the request.
This results in misinterpreting the request, generating an output signal
that does not match the requested value. It is anticipated that even more
flags will be added in the future, resulting in even more broken requests.
Expecting these issues to be caught during review or playing whack-a-mole
after the fact is not a great solution.
Instead, introduce the supported_perout_flags field in the ptp_clock_info
structure. Update the core character device logic to explicitly reject any
request which has a flag not on this list.
This ensures that drivers must 'opt in' to the flags they support. Drivers
which don't set the .supported_perout_flags field will not need to check
that unsupported flags aren't passed, as the core takes care of this.
Update the drivers which do support flags to set this new field.
Note the following driver files set n_per_out to a non-zero value but did
not check the flags at all:
• drivers/ptp/ptp_clockmatrix.c
• drivers/ptp/ptp_idt82p33.c
• drivers/ptp/ptp_fc3.c
• drivers/net/ethernet/ti/am65-cpts.c
• drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
• drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
• drivers/net/dsa/sja1105/sja1105_ptp.c
• drivers/net/ethernet/freescale/dpaa2/dpaa2-ptp.c
• drivers/net/ethernet/mscc/ocelot_vsc7514.c
• drivers/net/ethernet/intel/i40e/i40e_ptp.c
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20250414-jk-supported-perout-flags-v2-2-f6b17d15475c@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The PTP_EXTTS_REQUEST(2) ioctl has a flags field which specifies how the
external timestamp request should behave. This includes which edge of the
signal to timestamp, as well as a specialized "offset" mode. It is expected
that more flags will be added in the future.
Driver authors routinely do not check the flags, often accepting requests
with flags which they do not support. Even drivers which do check flags may
not be future-proofed to reject flags not yet defined. Thus, any future
flag additions often require manually updating drivers to reject these
flags.
This approach of hoping we catch flag checks during review, or playing
whack-a-mole after the fact is the wrong approach.
Introduce the "supported_extts_flags" field to the ptp_clock_info
structure. This field defines the set of flags the device actually
supports.
Update the core character device logic to check this field and reject
unsupported requests. Getting this right is somewhat tricky. First, to
avoid unnecessary repetition and make basic functionality work when
.supported_extts_flags is 0, the core always accepts the PTP_ENABLE_FEATURE
flag. This flag is used to set the 'on' parameter to the .enable function
and is thus always 'supported' by all drivers.
For backwards compatibility, the PTP_RISING_EDGE and PTP_FALLING_EDGE flags
are merely "hints" when using the old PTP_EXTTS_REQUEST ioctl, and are not
expected to be enforced. If the user issues PTP_EXTTS_REQUEST2, the
PTP_STRICT_FLAGS flag is added which is supposed to inform the driver to
strictly validate the flags and reject unsupported requests. To handle
this, first check if the driver reports PTP_STRICT_FLAGS support. If it
does not, then always allow the PTP_RISING_EDGE and PTP_FALLING_EDGE flags.
This keeps backwards compatibility with the original PTP_EXTTS_REQUEST
ioctl where these flags are not guaranteed to be honored.
This way, drivers which do not set the supported_extts_flags will continue
to accept requests for the original PTP_EXTTS_REQUEST ioctl. The core will
automatically reject requests with new flags, and correctly reject requests
with PTP_STRICT_FLAGS, where the driver is supposed to strictly validate
the flags.
Update the various drivers, refactoring their validation logic into the
.supported_extts_flags field. For consistency and readability,
PTP_ENABLE_FEATURE is not set in the supported flags list, and
PTP_EXTTS_EDGES is expanded to PTP_RISING_EDGE | PTP_FALLING_EDGE in all
cases.
Note the following driver files set n_ext_ts to a non-zero value but did
not check flags at all:
• drivers/net/ethernet/freescale/dpaa2/dpaa2-ptp.c
• drivers/net/ethernet/freescale/enetc/enetc_ptp.c
• drivers/net/ethernet/intel/i40e/i40e_ptp.c
• drivers/net/ethernet/marvell/octeontx2/nic/otx2_ptp.c
• drivers/net/ethernet/renesas/ravb_ptp.c
• drivers/net/ethernet/renesas/rtsn.c
• drivers/net/ethernet/renesas/rtsn.h
• drivers/net/ethernet/ti/am65-cpts.c
• drivers/net/ethernet/ti/cpts.h
• drivers/net/ethernet/ti/icssg/icss_iep.c
• drivers/net/ethernet/xscale/ptp_ixp46x.c
• drivers/net/phy/bcm-phy-ptp.c
• drivers/ptp/ptp_ocp.c
• drivers/ptp/ptp_pch.c
• drivers/ptp/ptp_qoriq.c
These drivers behavior does change slightly: they will now reject the
PTP_EXTTS_REQUEST2 ioctl, because they do not strictly validate their
flags. This also makes them no longer incorrectly accept PTP_EXT_OFFSET.
Also note that the renesas ravb driver does not support PTP_STRICT_FLAGS.
We could leave the .supported_extts_flags as 0, but I added the
PTP_RISING_EDGE | PTP_FALLING_EDGE since the driver previously manually
validated these flags. This is equivalent to 0 because the core will allow
these flags regardless unless PTP_STRICT_FLAGS is also set.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20250414-jk-supported-perout-flags-v2-1-f6b17d15475c@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
All users of this field have been migrated to bin_attrs_new.
It can now be constified.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250313-sysfs-const-bin_attr-final-v2-2-96284e1e88ce@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
All callback implementers have been moved to the const variant of the
callbacks. The signature of the original callbacks can now be changed.
Also remove the now unnecessary transition machinery inside __BIN_ATTR().
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250313-sysfs-const-bin_attr-final-v2-1-96284e1e88ce@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a note to the fwnode.h that the header should not be used
directly in the leaf drivers, they all should use the higher
level APIs and the respective headers.
The purpose of this note is to give guidance to driver writers
to avoid repeating a common mistake.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250408095229.1298005-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
AFAICS this function has never had a user.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/ab7b8094-2eea-4e82-a047-fd60117f220b@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename field 'counter' in 'mlxreg_core_hotplug_platform_data' to count.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20250412091843.33943-2-vadimp@nvidia.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Add helper functions to create a device on the auxiliary bus.
This is meant for fairly simple usage of the auxiliary bus, to avoid having
the same code repeated in the different drivers.
Suggested-by: Stephen Boyd <sboyd@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20250218-aux-device-create-helper-v4-1-c3d7dfdea2e6@baylibre.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Backmerging to get fixes from v6.15-rc2 into drm-misc-next.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
The audit code relies on the fact that kern_path_locked() returned a
path even for a negative dentry. If it doesn't find a valid dentry it
immediately calls:
audit_find_parent(d_backing_inode(parent_path.dentry));
which assumes that parent_path.dentry is still valid. But it isn't since
kern_path_locked() has been changed to path_put() also for a negative
dentry.
Fix this by adding a helper that implements the required audit semantics
and allows us to fix the immediate bleeding. We can find a unified
solution for this afterwards.
Link: https://lore.kernel.org/20250414-rennt-wimmeln-f186c3a780f1@brauner
Fixes: 1c3cb50b58c3 ("VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry")
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") introduced a
readl() from ENTRY_VECTOR_CTRL before the writel() to ENTRY_DATA.
This is correct, however some hardware, like the Sun Neptune chips, the NIU
module, will cause an error and/or fatal trap if any MSIX table entry is
read before the corresponding ENTRY_DATA field is written to.
Add an optional early writel() in msix_prepare_msi_desc().
Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
Signed-off-by: Jonathan Currier <dullfire@yahoo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241117234843.19236-2-dullfire@yahoo.com
|
|
plat_dat->eee_users_rate is now unused, so remove this member.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1u3Vuv-000E7y-9k@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 0e9c127729be ("ethtool: add interface to read Tx hardware
timestamping statistics") added documentation for timestamping
statistics, but added the detailed explanation for this method to
the get_ts_info() rather than get_ts_stats(). Move it to the correct
entry.
Cc: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1u3MTz-000Crx-IW@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When enabling DMA mapping in page_pool, pages are kept DMA mapped until
they are released from the pool, to avoid the overhead of re-mapping the
pages every time they are used. This causes resource leaks and/or
crashes when there are pages still outstanding while the device is torn
down, because page_pool will attempt an unmap through a non-existent DMA
device on the subsequent page return.
To fix this, implement a simple tracking of outstanding DMA-mapped pages
in page pool using an xarray. This was first suggested by Mina[0], and
turns out to be fairly straight forward: We simply store pointers to
pages directly in the xarray with xa_alloc() when they are first DMA
mapped, and remove them from the array on unmap. Then, when a page pool
is torn down, it can simply walk the xarray and unmap all pages still
present there before returning, which also allows us to get rid of the
get/put_device() calls in page_pool. Using xa_cmpxchg(), no additional
synchronisation is needed, as a page will only ever be unmapped once.
To avoid having to walk the entire xarray on unmap to find the page
reference, we stash the ID assigned by xa_alloc() into the page
structure itself, using the upper bits of the pp_magic field. This
requires a couple of defines to avoid conflicting with the
POINTER_POISON_DELTA define, but this is all evaluated at compile-time,
so does not affect run-time performance. The bitmap calculations in this
patch gives the following number of bits for different architectures:
- 23 bits on 32-bit architectures
- 21 bits on PPC64 (because of the definition of ILLEGAL_POINTER_VALUE)
- 32 bits on other 64-bit architectures
Stashing a value into the unused bits of pp_magic does have the effect
that it can make the value stored there lie outside the unmappable
range (as governed by the mmap_min_addr sysctl), for architectures that
don't define ILLEGAL_POINTER_VALUE. This means that if one of the
pointers that is aliased to the pp_magic field (such as page->lru.next)
is dereferenced while the page is owned by page_pool, that could lead to
a dereference into userspace, which is a security concern. The risk of
this is mitigated by the fact that (a) we always clear pp_magic before
releasing a page from page_pool, and (b) this would need a
use-after-free bug for struct page, which can have many other risks
since page->lru.next is used as a generic list pointer in multiple
places in the kernel. As such, with this patch we take the position that
this risk is negligible in practice. For more discussion, see[1].
Since all the tracking added in this patch is performed on DMA
map/unmap, no additional code is needed in the fast path, meaning the
performance overhead of this tracking is negligible there. A
micro-benchmark shows that the total overhead of the tracking itself is
about 400 ns (39 cycles(tsc) 395.218 ns; sum for both map and unmap[2]).
Since this cost is only paid on DMA map and unmap, it seems like an
acceptable cost to fix the late unmap issue. Further optimisation can
narrow the cases where this cost is paid (for instance by eliding the
tracking when DMA map/unmap is a no-op).
The extra memory needed to track the pages is neatly encapsulated inside
xarray, which uses the 'struct xa_node' structure to track items. This
structure is 576 bytes long, with slots for 64 items, meaning that a
full node occurs only 9 bytes of overhead per slot it tracks (in
practice, it probably won't be this efficient, but in any case it should
be an acceptable overhead).
[0] https://lore.kernel.org/all/CAHS8izPg7B5DwKfSuzz-iOop_YRbk3Sd6Y4rX7KBG9DcVJcyWg@mail.gmail.com/
[1] https://lore.kernel.org/r/20250320023202.GA25514@openwall.com
[2] https://lore.kernel.org/r/ae07144c-9295-4c9d-a400-153bb689fe9e@huawei.com
Reported-by: Yonglong Liu <liuyonglong@huawei.com>
Closes: https://lore.kernel.org/r/8743264a-9700-4227-a556-5f931c720211@huawei.com
Fixes: ff7d6b27f894 ("page_pool: refurbish version of page_pool code")
Suggested-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Jesper Dangaard Brouer <hawk@kernel.org>
Tested-by: Jesper Dangaard Brouer <hawk@kernel.org>
Tested-by: Qiuling Ren <qren@redhat.com>
Tested-by: Yuying Ma <yuma@redhat.com>
Tested-by: Yonglong Liu <liuyonglong@huawei.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20250409-page-pool-track-dma-v9-2-6a9ef2e0cba8@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since we are about to stash some more information into the pp_magic
field, let's move the magic signature checks into a pair of helper
functions so it can be changed in one place.
Reviewed-by: Mina Almasry <almasrymina@google.com>
Tested-by: Yonglong Liu <liuyonglong@huawei.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20250409-page-pool-track-dma-v9-1-6a9ef2e0cba8@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
netdevice reg_state was split into two 16 bit enums back in 2010
in commit a2835763e130 ("rtnetlink: handle rtnl_link netlink
notifications manually"). Since the split the fields have been
moved apart, and last year we converted reg_state to a normal
u8 in commit 4d42b37def70 ("net: convert dev->reg_state to u8").
rtnl_link_state being a 16 bitfield makes no sense. Convert it
to a single bool, it seems very unlikely after 15 years that
we'll need more values in it.
We could drop dev->rtnl_link_ops from the conditions but feels
like having it there more clearly points at the reason for this
hack.
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250410014246.780885-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|