path: root/io_uring
Age | Commit message | Author
2025-03-10 | io_uring/kbuf: enable bundles for incrementally consumed buffers | Jens Axboe
The original support for incrementally consumed buffers didn't allow it to be used with bundles, with the assumption being that incremental buffers are generally larger, and hence there's less of a need to support it. But that assumption may not be correct - it's perfectly viable to use smaller buffers with incremental consumption, and there may be valid reasons for an application or framework to do so. As there's really no need to explicitly disable bundles with incrementally consumed buffers, allow it. This actually makes the peek side cheaper and simpler, with the completion side basically the same, just needing to iterate for the consumed length. Reported-by: Norman Maurer <norman_maurer@apple.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
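The completion-side iteration "for the consumed length" can be pictured with a small userspace model. This is only a sketch under assumptions: `struct demo_buf` and `demo_consume()` are hypothetical names invented for illustration and are not the kernel's io_uring/kbuf.c code; the ring wrap-around and the offset reset on retirement are simplifications of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one provided buffer; not a kernel structure. */
struct demo_buf {
	size_t off;	/* bytes already consumed from this buffer */
	size_t len;	/* total size of the buffer */
};

/*
 * Consume `used` bytes starting at bufs[head]. A fully consumed buffer is
 * retired and head advances; a partially consumed buffer stays at head with
 * its offset bumped, so the next transfer continues where this one stopped.
 * Returns the new head index.
 */
static size_t demo_consume(struct demo_buf *bufs, size_t nr, size_t head,
			   size_t used)
{
	assert(nr > 0 && head < nr);

	while (used > 0) {
		struct demo_buf *b = &bufs[head];
		size_t avail, take;

		assert(b->off < b->len);	/* model invariant: head has room */
		avail = b->len - b->off;
		take = used < avail ? used : avail;

		b->off += take;
		used -= take;
		if (b->off < b->len)
			break;			/* still has room: stay on it */
		b->off = 0;			/* retired: reset for reuse in this model */
		head = (head + 1) % nr;
	}
	return head;
}
```

With three 4-byte buffers, a 6-byte bundle retires the first buffer and leaves the second one half consumed, which is the case the commit enables.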
2025-03-10 | Revert "io_uring/rsrc: simplify the bvec iter count calculation" | Keith Busch
This reverts commit 2a51c327d4a4a2eb62d67f4ea13a17efd0f25c5c. The kernel registered bvecs do use the iov_iter_advance() API, so we can't rely on this simplification anymore. Fixes: 27cb27b6d5ea40 ("io_uring: add support for kernel registered bvecs") Reported-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250310184825.569371-1-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10 | io_uring: rely on io_prep_reg_vec for iovec placement | Pavel Begunkov
All vectored reg buffer users should use io_import_reg_vec() for iovec imports. Since iovec placement is the function's responsibility and callers shouldn't know much about it, drop the offset parameter from io_prep_reg_vec() and calculate it inside. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/08ed87ca4bbc06724373b6ce06f36b703fe60c4e.1741457480.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10 | io_uring: introduce io_prep_reg_iovec() | Pavel Begunkov
iovecs that are turned into registered buffers are imported in a special way with an offset, so that later we can do an in place translation. Add a helper function taking care of it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7de2ecb9ed5efc3c5cf320232236966da5ad4ccc.1741457480.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10 | io_uring: unify STOP_MULTISHOT with IOU_OK | Pavel Begunkov
IOU_OK means that the request ownership is now handed back to core io_uring and it has to complete it using the result provided in req->cqe. Same is true for multishot and IOU_STOP_MULTISHOT. Rename it into IOU_COMPLETE to avoid confusion and use for both modes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e6a5b2edb0eb9558acb1c8f1db38ac45fee95491.1741453534.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10 | io_uring: return -EAGAIN to continue multishot | Pavel Begunkov
Multishot errors can be mapped 1:1 to normal errors, but they are not identical. This leads to a peculiar situation where all multishot requests have to check in what context they're run and return different codes. Unify them starting with the EAGAIN / IOU_ISSUE_SKIP_COMPLETE(EIOCBQUEUED) pair, which means that core io_uring still owns the request and it should be retried. In the multishot case it naturally just continues to poll; otherwise it might poll, use iowq or do any other kind of allowed blocking. Introduce IOU_RETRY aliased to -EAGAIN for that. Apart from obvious upsides, multishot can now also check for misuse of IOU_ISSUE_SKIP_COMPLETE. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/da117b79ce72ecc3ab488c744e29fae9ba54e23b.1741453534.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
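The unified convention described here can be sketched as a tiny dispatcher. Everything named `demo_*` / `DEMO_*` is hypothetical and only models the commit text; the one fact taken from the source is that the retry code is an alias of -EAGAIN. How the kernel core actually dispatches is not shown in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the return codes named in the commit text. */
enum demo_issue_ret {
	DEMO_COMPLETE = 0,		/* core completes using the stored result */
	DEMO_RETRY = -EAGAIN,		/* core still owns the request: retry it */
};

/* What the core does next in this model. */
enum demo_action {
	DEMO_ACT_POST_CQE,		/* post the completion */
	DEMO_ACT_KEEP_POLLING,		/* multishot: stay armed and poll again */
	DEMO_ACT_PUNT,			/* single shot: poll, iowq, or block */
};

/*
 * One return code for "retry" regardless of context: the handler no longer
 * needs to know whether it runs as multishot, only the core does.
 */
static enum demo_action demo_handle(int ret, bool multishot)
{
	assert(ret <= 0);	/* this model only knows 0 and negative codes */

	if (ret == DEMO_RETRY)
		return multishot ? DEMO_ACT_KEEP_POLLING : DEMO_ACT_PUNT;
	return DEMO_ACT_POST_CQE;
}
```

The design point is that the branch on `multishot` lives in one place in the core rather than in every opcode handler.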
2025-03-07 | Merge tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux | Linus Torvalds
Pull io_uring fix from Jens Axboe: "A single fix for a regression introduced in the 6.14 merge window, causing stalls/hangs with IOPOLL reads or writes" * tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux: io_uring/rw: ensure reissue path is correctly handled for IOPOLL
2025-03-07 | io_uring: Remove unused declaration io_alloc_async_data() | Yue Haibing
Commit ef623a647f42 ("io_uring: Move old async data allocation helper to header") left behind this unused declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20250305013454.3635021-1-yuehaibing@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring: cap cached iovec/bvec size | Pavel Begunkov
Bvecs can be large, put an arbitrary limit on the max vector size it can cache. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/823055fa6628daa24bbc9cd77c2da87e9a1e1e32.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
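A cap of this kind might look like the following userspace sketch. The names `struct demo_vec`, `demo_vec_recycle()` and the limit `DEMO_VEC_CACHE_MAX` are assumptions made up for illustration; the commit text only says the limit is arbitrary and does not state its value.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/uio.h>

#define DEMO_VEC_CACHE_MAX 128	/* arbitrary limit chosen for this sketch */

/* Hypothetical vector + size pair that a request may leave behind. */
struct demo_vec {
	struct iovec *iov;
	unsigned int nr;	/* number of allocated entries */
};

/*
 * Decide whether a vector may stay attached to a cached object.
 * Oversized vectors are freed so the cache never pins large allocations.
 * Returns 1 if the vector was kept, 0 if it was dropped.
 */
static int demo_vec_recycle(struct demo_vec *v)
{
	assert(v != NULL);

	if (v->nr > DEMO_VEC_CACHE_MAX) {
		free(v->iov);
		v->iov = NULL;
		v->nr = 0;
		return 0;
	}
	return 1;
}
```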
2025-03-07 | io_uring/net: implement vectored reg bufs for zctx | Pavel Begunkov
Add support for vectored registered buffers for send zc. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e484052875f862d2dca99f0f8c04407c1d51a1c1.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring/net: convert to struct iou_vec | Pavel Begunkov
Convert net.c to use struct iou_vec. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6437b57dabed44eca708c02e390529c7ed211c78.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring/net: pull vec alloc out of msghdr import | Pavel Begunkov
I'll need more control over iovec management, move io_net_import_vec() out of io_msg_copy_hdr(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9600ea6300f620e65d39da481c22605ddc898850.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring/net: combine msghdr copy | Pavel Begunkov
Call the compat version from inside of io_msg_copy_hdr() and don't duplicate it in callers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/25795660f7b31f9273911c99f495d9c2b169ecda.1741362889.git.asml.silence@gmail.com [axboe: fixup msg pointer vs variable braino in io_msg_copy_hdr()] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring/rw: defer reg buf vec import | Pavel Begunkov
Import registered buffers for vectored reads and writes later at issue time as we now do for other fixed ops. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e8491c976e4ab83a4e3dc428e9fe7555e59583b8.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring/rw: implement vectored registered rw | Pavel Begunkov
Implement registered buffer vectored reads and writes with new opcodes IORING_OP_WRITEV_FIXED and IORING_OP_READV_FIXED. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d7c89eb481e870f598edc91cc66ff4d1e4ae3788.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring: add infra for importing vectored reg buffers | Pavel Begunkov
Add io_import_reg_vec(), which will be responsible for importing vectored registered buffers. The function might reallocate the vector, but it'd try to do the conversion in place first, which is why it's required of the user to pad the iovec to the right border of the cache. Overlapping also depends on struct iovec being larger than bvec, which is not the case on e.g. 32 bit architectures. Don't try to complicate this case and make sure vectors never overlap, it'll be improved later. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/60bd246b1249476a6996407c1dbc38ef6febad14.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring: introduce struct iou_vec | Pavel Begunkov
I need a convenient way to pass around and work with an iovec+size pair, so put them into a structure and make use of it in rw.c. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d39fadafc9e9047b0a292e5be6db3cf2f48bb1f7.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | Merge branch 'for-6.15/io_uring-epoll-wait' into for-6.15/io_uring-reg-vec | Jens Axboe
* for-6.15/io_uring-epoll-wait: io_uring/epoll: add support for IORING_OP_EPOLL_WAIT io_uring/epoll: remove CONFIG_EPOLL guards eventpoll: add epoll_sendevents() helper eventpoll: abstract out ep_try_send_events() helper eventpoll: abstract out parameter sanity checking
2025-03-07 | Merge branch 'for-6.15/io_uring-rx-zc' into for-6.15/io_uring-reg-vec | Jens Axboe
* for-6.15/io_uring-rx-zc: (80 commits) io_uring/zcrx: add selftest case for recvzc with read limit io_uring/zcrx: add a read limit to recvzc requests io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap io_uring: Rename KConfig to Kconfig io_uring/zcrx: fix leaks on failed registration io_uring/zcrx: recheck ifq on shutdown io_uring/zcrx: add selftest net: add documentation for io_uring zcrx io_uring/zcrx: add copy fallback io_uring/zcrx: throttle receive requests io_uring/zcrx: set pp memory provider for an rx queue io_uring/zcrx: add io_recvzc request io_uring/zcrx: dma-map area for the device io_uring/zcrx: implement zerocopy receive pp memory provider io_uring/zcrx: grab a net device io_uring/zcrx: add io_zcrx_area io_uring/zcrx: add interface queue and refill queue net: add helpers for setting a memory provider on an rx queue net: page_pool: add memory provider helpers net: prepare for non devmem TCP memory providers ...
2025-03-06 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: net/ethtool/cabletest.c 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock") 637399bf7e77 ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device") No Adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-05 | io_uring/rw: ensure reissue path is correctly handled for IOPOLL | Jens Axboe
The IOPOLL path posts CQEs when the io_kiocb is marked as completed, so it cannot rely on the usual retry that non-IOPOLL requests do for read/write requests. If -EAGAIN is received and the request should be retried, go through the normal completion path and let the normal flush logic catch it and reissue it, like what is done for !IOPOLL reads or writes. Fixes: d803d123948f ("io_uring/rw: handle -EAGAIN retry at IO completion time") Reported-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/io-uring/2b43ccfa-644d-4a09-8f8f-39ad71810f41@oracle.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-05 | io_uring: introduce io_cache_free() helper | Caleb Sander Mateos
Add a helper function io_cache_free() that returns an allocation to an io_alloc_cache, falling back on kfree() if the io_alloc_cache is full. This is the inverse of io_cache_alloc(), which takes an allocation from an io_alloc_cache and falls back on kmalloc() if the cache is empty. Convert 4 callers to use the helper. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Suggested-by: Li Zetao <lizetao1@huawei.com> Link: https://lore.kernel.org/r/20250304194814.2346705-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
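The alloc/free symmetry described here can be illustrated with a userspace analogue, using malloc()/free() in place of kmalloc()/kfree(). This is a sketch, not the kernel helper: `struct demo_cache`, `demo_cache_alloc()`, `demo_cache_free()` and the capacity `DEMO_CACHE_MAX` are hypothetical names and values.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_CACHE_MAX 2	/* arbitrary capacity for the sketch */

/* Hypothetical fixed-capacity stack of recycled allocations. */
struct demo_cache {
	void *entries[DEMO_CACHE_MAX];
	unsigned int nr;	/* number of cached entries */
	size_t elem_size;	/* size of each allocation */
};

/* Take an object from the cache, falling back on malloc() if it is empty. */
static void *demo_cache_alloc(struct demo_cache *c)
{
	if (c->nr)
		return c->entries[--c->nr];
	return malloc(c->elem_size);
}

/* Return an object to the cache, falling back on free() if it is full. */
static void demo_cache_free(struct demo_cache *c, void *obj)
{
	assert(obj != NULL);

	if (c->nr < DEMO_CACHE_MAX) {
		c->entries[c->nr++] = obj;
		return;
	}
	free(obj);
}
```

Keeping both fallbacks inside the helpers means callers never need to know whether the cache was empty or full.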
2025-03-04 | io_uring/rsrc: skip NULL file/buffer checks in io_free_rsrc_node() | Caleb Sander Mateos
io_rsrc_node's of type IORING_RSRC_FILE always have a file attached immediately after they are allocated. IORING_RSRC_BUFFER nodes won't be returned from io_sqe_buffer_register()/io_buffer_register_bvec() until they have an io_mapped_ubuf attached. So remove the checks for a NULL file/buffer in io_free_rsrc_node(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-5-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04 | io_uring/rsrc: avoid NULL node check on io_sqe_buffer_register() failure | Caleb Sander Mateos
The done: label is only reachable if node is non-NULL. So don't bother checking, just call io_free_node(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-4-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04 | io_uring/rsrc: call io_free_node() on io_sqe_buffer_register() failure | Caleb Sander Mateos
io_sqe_buffer_register() currently calls io_put_rsrc_node() if it fails to fully set up the io_rsrc_node. io_put_rsrc_node() is more involved than necessary, since we already know the reference count will reach 0 and no io_mapped_ubuf has been attached to the node yet. So just call io_free_node() to release the node's memory. This also avoids the need to temporarily set the node's buf pointer to NULL. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-3-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04 | io_uring/rsrc: free io_rsrc_node using kfree() | Caleb Sander Mateos
io_rsrc_node_alloc() calls io_cache_alloc(), which uses kmalloc() to allocate the node. So it can be freed with kfree() instead of kvfree(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04 | io_uring/rsrc: split out io_free_node() helper | Caleb Sander Mateos
Split the freeing of the io_rsrc_node from io_free_rsrc_node(), for use with nodes that haven't been fully initialized. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228235916.670437-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04 | io_uring/rsrc: include io_uring_types.h in rsrc.h | Caleb Sander Mateos
io_uring/rsrc.h uses several types from include/linux/io_uring_types.h. Include io_uring_types.h explicitly in rsrc.h to avoid depending on users of rsrc.h including io_uring_types.h first. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Li Zetao <lizetao1@huawei.com> Link: https://lore.kernel.org/r/20250301183612.937529-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | io_uring/nop: use io_find_buf_node() | Caleb Sander Mateos
Call io_find_buf_node() to avoid duplicating it in io_nop(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250301001610.678223-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | io_uring/rsrc: declare io_find_buf_node() in header file | Caleb Sander Mateos
Declare io_find_buf_node() in io_uring/rsrc.h so it can be called from other files. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250301001610.678223-1-csander@purestorage.com [axboe: keep the inline for local hot path usage] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | io_uring/ublk: report error when unregister operation fails | Caleb Sander Mateos
Indicate to userspace applications if a UBLK_IO_UNREGISTER_IO_BUF command specifies an invalid buffer index by returning an error code. Return -EINVAL if no buffer is registered with the given index, and -EBUSY if the registered buffer is not a kernel bvec. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228231432.642417-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
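The two error cases can be modelled in a few lines. This is a userspace sketch only: `struct demo_slot` and `demo_unregister()` are hypothetical, and treating an out-of-range index the same as an empty slot is an assumption of this model rather than something the commit text states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical entry in a registered-buffer table. */
struct demo_slot {
	int registered;		/* nonzero if a buffer occupies this index */
	int is_kernel_bvec;	/* nonzero if it was registered by the kernel */
};

/*
 * Unregister the buffer at `index`.
 * Returns 0 on success, -EINVAL if nothing is registered there,
 * and -EBUSY if the registered buffer is not a kernel bvec.
 */
static int demo_unregister(struct demo_slot *table, unsigned int nr,
			   unsigned int index)
{
	assert(table != NULL);

	/* Assumption: an out-of-range index counts as "nothing registered". */
	if (index >= nr || !table[index].registered)
		return -EINVAL;
	if (!table[index].is_kernel_bvec)
		return -EBUSY;

	table[index].registered = 0;
	return 0;
}
```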
2025-02-28 | io_uring/uring_cmd: specify io_uring_cmd_import_fixed() pointer type | Caleb Sander Mateos
io_uring_cmd_import_fixed() takes a struct io_uring_cmd *, but the type of the ioucmd parameter is void *. Make the pointer type explicit so the compiler can type check it. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228221514.604350-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | io_uring/rsrc: use rq_data_dir() to compute bvec dir | Caleb Sander Mateos
The macro rq_data_dir() already computes a request's data direction. Use it in place of the if-else to set imu->dir. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228223057.615284-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | Merge tag 'io_uring-6.14-20250228' of git://git.kernel.dk/linux | Linus Torvalds
Pull io_uring fix from Jens Axboe: "Just a single fix headed for stable, ensuring that msg_control is properly saved in compat mode as well" * tag 'io_uring-6.14-20250228' of git://git.kernel.dk/linux: io_uring/net: save msg_control for compat
2025-02-28 | io_uring: cache nodes and mapped buffers | Keith Busch
Frequent alloc/free cycles on these is pretty costly. Use an io cache to more efficiently reuse these buffers. Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-7-kbusch@meta.com [axboe: fix imu leak] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | io_uring: add support for kernel registered bvecs | Keith Busch
Provide an interface for the kernel to leverage the existing pre-registered buffers that io_uring provides. User space can reference these later to achieve zero-copy IO. User space must register an empty fixed buffer table with io_uring in order for the kernel to make use of it. Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-5-kbusch@meta.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | io_uring/rw: move fixed buffer import to issue path | Keith Busch
Registered buffers may depend on a linked command, which makes the prep path too early to import. Move to the issue path when the node is actually needed like all the other users of fixed buffers. Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-3-kbusch@meta.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28 | io_uring/rw: move buffer_select outside generic prep | Keith Busch
Cleans up the generic rw prep to not require the do_import flag. Use a different prep function for callers that might need buffer select. Based-on-a-patch-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-2-kbusch@meta.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.14-rc5). Conflicts: drivers/net/ethernet/cadence/macb_main.c fa52f15c745c ("net: cadence: macb: Synchronize stats calculations") 75696dd0fd72 ("net: cadence: macb: Convert to get_stats64") https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au Adjacent changes: drivers/net/ethernet/intel/ice/ice_sriov.c 79990cf5e7ad ("ice: Fix deinitializing VF in error path") a203163274a4 ("ice: simplify VF MSI-X managing") net/ipv4/tcp.c 18912c520674 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace") 297d389e9e5b ("net: prefix devmem specific helpers") net/mptcp/subflow.c 8668860b0ad3 ("mptcp: reset when MPTCP opts are dropped after join") c3349a22c200 ("mptcp: consolidate subflow cleanup") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27 | io_uring/net: fix build warning for !CONFIG_COMPAT | Arnd Bergmann
A code rework resulted in an uninitialized return code when COMPAT mode is disabled: io_uring/net.c:722:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized] 722 | if (io_is_compat(req->ctx)) { | ^~~~~~~~~~~~~~~~~~~~~~ io_uring/net.c:736:15: note: uninitialized use occurs here 736 | if (unlikely(ret)) | ^~~ Since io_is_compat() turns into a compile-time 'false', the #ifdef here is completely unnecessary, and removing it avoids the warning. Fixes: 51e158d40589 ("io_uring/net: unify *mshot_prep calls with compat") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20250227132018.1111094-1-arnd@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring: rearrange opdef flags by use pattern | Pavel Begunkov
Keep all flags that we use in the generic req init path close together. That saves a load for x86 because apparently some compilers prefer reading single bytes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ef03b6ce4a0c2a5234cd4037fa07e9e4902dcc9e.1740602793.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring/net: extract iovec import into a helper | Pavel Begunkov
Deduplicate iovec imports between compat and !compat by introducing a helper function. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6a5f8c526f6732c4249a7fa0213b49e1a3ecccf0.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring/net: unify *mshot_prep calls with compat | Pavel Begunkov
Instead of duplicating an io_recvmsg_mshot_prep() call in the compat path, let the common code handle it. For that, copy necessary compat fields into struct user_msghdr. Note, it zeroes user_msghdr to be on the safe side as compat is not that interesting and overhead shouldn't be high. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/94e62386dec570f83b4a4270a46ac60bc415fb71.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring/net: derive iovec storage later | Pavel Begunkov
Don't read free_iov until right before we need it to import the iovec. The only place that uses it before that is provided buffer selection, but it only serves as temporary storage and iovec content is not reused afterwards, so use a local variable for that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8bfa7d74c33e37860a724f4e0e96660c25cd4c02.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring/net: verify msghdr before copying iovec | Pavel Begunkov
Normally, net/ would verify msghdr before importing iovec, for example see copy_msghdr_from_user(), which is further assumed by __copy_msghdr() validating msg->msg_iovlen. io_uring does it in reverse order, which is fine, but it'll be more convenient to flip it so that the iovec business is done at the end and eventually can be nicely pulled out of the msghdr parsing section and thought of as a separate step. That also makes structure accesses more localised, which should be better for caches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cd35dc1b48d4e6e31f59ae7304c037fbe8a3fd3d.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring/net: isolate msghdr copying code | Pavel Begunkov
The user access section in io_msg_copy_hdr() is overextended by covering selected buffers. It's hard to work with and prone to errors. Limit the section to msghdr import only, selected buffers will do a separate copy_from_user() call, and then move it into its own function. This should be fine, selected buffer single shots are not important, for multishots the overhead should be non-existent, and it's not that expensive overall. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d3eb1f81c8cfbea9f1aa57dab90c472d2aa6e371.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring/net: simplify compat selbuf iov parsing | Pavel Begunkov
Use copy_from_user() instead of open coded access_ok() + get_user(), that's simpler and we don't care about compat that much. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e51f9c323a3cd4ad7c8da656559bdf6237f052fb.1740569495.git.asml.silence@gmail.com [axboe: fold in bogus < 0 check for tmp_iov.iov_len] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | io_uring/net: remove unnecessary REQ_F_NEED_CLEANUP | Pavel Begunkov
REQ_F_NEED_CLEANUP in io_recvmsg_prep_setup() and in io_sendmsg_setup() are relics of the past and don't do anything useful; the flag should be, and is, set earlier on iovec and async_data allocation. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6aedc3141c1fc027128a4503656cfd686a6980ef.1740569495.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-27 | Merge branch 'io_uring-6.14' into for-6.15/io_uring | Jens Axboe
Merge mainline fixes into 6.15 branch, as upcoming patches depend on fixes that went into the 6.14 mainline branch. * io_uring-6.14: io_uring/net: save msg_control for compat io_uring/rw: clean up mshot forced sync mode io_uring/rw: move ki_complete init into prep io_uring/rw: don't directly use ki_complete io_uring/rw: forbid multishot async reads io_uring/rsrc: remove unused constants io_uring: fix spelling error in uapi io_uring.h io_uring: prevent opcode speculation io-wq: backoff when retrying worker creation
2025-02-27 | io_uring: combine buffer lookup and import | Pavel Begunkov
Registered buffers are currently imported in two steps: first we look up a rsrc node and then use it to set up the iterator. The first part is usually done at the prep stage, and import happens whenever it's needed. As we want to defer binding to a node so that it works with linked requests, combine both steps into a single helper. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250224213116.3509093-6-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>