linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-12-24	Merge tag 'nfsd-6.19-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "A set of NFSD fixes that arrived just a bit late for the 6.19 merge window. Regression fixes: - Mark variable __maybe_unused to avoid W=1 build break Stable fixes: - NFSv4 file creation neglects setting ACL - Clear TIME_DELEG in the suppattr_exclcreat bitmap - Clear SECLABEL in the suppattr_exclcreat bitmap - Fix memory leak in nfsd_create_serv error paths - Bound check rq_pages index in inline path - Return 0 on success from svc_rdma_copy_inline_range - Use rc_pageoff for memcpy byte offset - Avoid NULL deref on zero length gss_token in gss_read_proxy_verf" * tag 'nfsd-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: NFSv4 file creation neglects setting ACL NFSD: Clear TIME_DELEG in the suppattr_exclcreat bitmap NFSD: Clear SECLABEL in the suppattr_exclcreat bitmap nfsd: fix memory leak in nfsd_create_serv error paths nfsd: Mark variable __maybe_unused to avoid W=1 build break svcrdma: bound check rq_pages index in inline path svcrdma: return 0 on success from svc_rdma_copy_inline_range svcrdma: use rc_pageoff for memcpy byte offset SUNRPC: svcauth_gss: avoid NULL deref on zero length gss_token in gss_read_proxy_verf
2025-12-12	Merge tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds
	Pull NFS client updates from Trond Myklebust: "Bugfixes: - Fix 'nlink' attribute update races when unlinking a file - Add missing initialisers for the directory verifier in various places - Don't regress the NFSv4 open state due to misordered racing replies - Ensure the NFSv4.x callback server uses the correct transport connection - Fix potential use-after-free races when shutting down the NFSv4.x callback server - Fix a pNFS layout commit crash - Assorted fixes to ensure correct propagation of mount options when the client crosses a filesystem boundary and triggers the VFS automount code - More localio fixes Features and cleanups: - Add initial support for basic directory delegations - SunRPC back channel code cleanups" * tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits) NFSv4: Handle NFS4ERR_NOTSUPP errors for directory delegations nfs/localio: remove 61 byte hole from needless ____cacheline_aligned nfs/localio: remove alignment size checking in nfs_is_local_dio_possible NFS: Fix up the automount fs_context to use the correct cred NFS: Fix inheritance of the block sizes when automounting NFS: Automounted filesystems should inherit ro,noexec,nodev,sync flags Revert "nfs: ignore SB_RDONLY when mounting nfs" Revert "nfs: clear SB_RDONLY before getting superblock" Revert "nfs: ignore SB_RDONLY when remounting nfs" NFS: Add a module option to disable directory delegations NFS: Shortcut lookup revalidations if we have a directory delegation NFS: Request a directory delegation during RENAME NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK NFS: Add support for sending GDD_GETATTR NFSv4/pNFS: Clear NFS_INO_LAYOUTCOMMIT in pnfs_mark_layout_stateid_invalid NFSv4.1: protect destroying and nullifying bc_serv structure SUNRPC: new helper function for stopping backchannel server SUNRPC: cleanup common code in backchannel request NFSv4.1: pass transport for callback shutdown NFSv4: ensure the open stateid seqid doesn't go backwards ...
2025-12-08	svcrdma: bound check rq_pages index in inline path	Joshua Rogers
	svc_rdma_copy_inline_range indexed rqstp->rq_pages[rc_curpage] without verifying rc_curpage stays within the allocated page array. Add guards before the first use and after advancing to a new page. Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08	svcrdma: return 0 on success from svc_rdma_copy_inline_range	Joshua Rogers
	The function comment specifies 0 on success and -EINVAL on invalid parameters. Make the tail return 0 after a successful copy loop. Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08	svcrdma: use rc_pageoff for memcpy byte offset	Joshua Rogers
	svc_rdma_copy_inline_range added rc_curpage (page index) to the page base instead of the byte offset rc_pageoff. Use rc_pageoff so copies land within the current page. Found by ZeroPath (https://zeropath.com) Fixes: 8e122582680c ("svcrdma: Move svc_rdma_read_info::ri_pageno to struct svc_rdma_recv_ctxt") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-08	SUNRPC: svcauth_gss: avoid NULL deref on zero length gss_token in ↵	Joshua Rogers
	gss_read_proxy_verf A zero length gss_token results in pages == 0 and in_token->pages[0] is NULL. The code unconditionally evaluates page_address(in_token->pages[0]) for the initial memcpy, which can dereference NULL even when the copy length is 0. Guard the first memcpy so it only runs when length > 0. Fixes: 5866efa8cbfb ("SUNRPC: Fix svcauth_gss_proxy_init()") Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-06	Merge tag 'nfsd-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux	Linus Torvalds
	Pull nfsd updates from Chuck Lever: - Mike Snitzer's mechanism for disabling I/O caching introduced in v6.18 is extended to include using direct I/O. The goal is to further reduce the memory footprint consumed by NFS clients accessing large data sets via NFSD. - The NFSD community adopted a maintainer entry profile during this cycle. See Documentation/filesystems/nfs/nfsd-maintainer-entry-profile.rst - Work continues on hardening NFSD's implementation of the pNFS block layout type. This type enables pNFS clients to directly access the underlying block devices that contain an exported file system, reducing server overhead and increasing data throughput. - The remaining patches are clean-ups and minor optimizations. Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.19 NFSD development cycle. * tag 'nfsd-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (38 commits) NFSD: nfsd-io-modes: Separate lists NFSD: nfsd-io-modes: Wrap shell snippets in literal code blocks NFSD: Add toctree entry for NFSD IO modes docs NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst NFSD: Implement NFSD_IO_DIRECT for NFS WRITE NFSD: Make FILE_SYNC WRITEs comply with spec NFSD: Add trace point for SCSI fencing operation. NFSD: use correct reservation type in nfsd4_scsi_fence_client xdrgen: Don't generate unnecessary semicolon xdrgen: Fix union declarations NFSD: don't start nfsd if sv_permsocks is empty xdrgen: handle _XdrString in union encoder/decoder xdrgen: Fix the variable-length opaque field decoder template xdrgen: Make the xdrgen script location-independent xdrgen: Generalize/harden pathname construction lockd: don't allow locking on reexported NFSv2/3 MAINTAINERS: add a nfsd blocklayout reviewer nfsd: Use MD5 library instead of crypto_shash nfsd: stop pretending that we cache the SEQUENCE reply. NFS: nfsd-maintainer-entry-profile: Inline function name prefixes ...
2025-12-05	Merge tag 'pull-persistency' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull persistent dentry infrastructure and conversion from Al Viro: "Some filesystems use a kinda-sorta controlled dentry refcount leak to pin dentries of created objects in dcache (and undo it when removing those). A reference is grabbed and not released, but it's not actually _stored_ anywhere. That works, but it's hard to follow and verify; among other things, we have no way to tell _which_ of the increments is intended to be an unpaired one. Worse, on removal we need to decide whether the reference had already been dropped, which can be non-trivial if that removal is on umount and we need to figure out if this dentry is pinned due to e.g. unlink() not done. Usually that is handled by using kill_litter_super() as ->kill_sb(), but there are open-coded special cases of the same (consider e.g. /proc/self). Things get simpler if we introduce a new dentry flag (DCACHE_PERSISTENT) marking those "leaked" dentries. Having it set claims responsibility for +1 in refcount. The end result this series is aiming for: - get these unbalanced dget() and dput() replaced with new primitives that would, in addition to adjusting refcount, set and clear persistency flag. - instead of having kill_litter_super() mess with removing the remaining "leaked" references (e.g. for all tmpfs files that hadn't been removed prior to umount), have the regular shrink_dcache_for_umount() strip DCACHE_PERSISTENT of all dentries, dropping the corresponding reference if it had been set. After that kill_litter_super() becomes an equivalent of kill_anon_super(). Doing that in a single step is not feasible - it would affect too many places in too many filesystems. It has to be split into a series. This work has really started early in 2024; quite a few preliminary pieces have already gone into mainline. This chunk is finally getting to the meat of that stuff - infrastructure and most of the conversions to it. Some pieces are still sitting in the local branches, but the bulk of that stuff is here" * tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits) d_make_discardable(): warn if given a non-persistent dentry kill securityfs_recursive_remove() convert securityfs get rid of kill_litter_super() convert rust_binderfs convert nfsctl convert rpc_pipefs convert hypfs hypfs: swich hypfs_create_u64() to returning int hypfs: switch hypfs_create_str() to returning int hypfs: don't pin dentries twice convert gadgetfs gadgetfs: switch to simple_remove_by_name() convert functionfs functionfs: switch to simple_remove_by_name() functionfs: fix the open/removal races functionfs: need to cancel ->reset_work in ->kill_sb() functionfs: don't bother with ffs->ref in ffs_data_{opened,closed}() functionfs: don't abuse ffs_data_closed() on fs shutdown convert selinuxfs ...
2025-11-23	SUNRPC: new helper function for stopping backchannel server	Olga Kornievskaia
	Create a new backchannel function to stop the backchannel server and clear the bc_serv in transport protected under the bc_pa_lock. Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-11-23	SUNRPC: cleanup common code in backchannel request	Olga Kornievskaia
	Create a helper function for common code between rdma and tcp backchannel handling of the backchannel request. Make sure that access is protected by the bc_pa_lock lock. Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-11-17	convert rpc_pipefs	Al Viro
	Just use d_make_persistent() + dput() (and fold the latter into simple_finish_creating()) and that's it... NOTE: pipe->dentry is a borrowed reference - it does not contribute to dentry refcount. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-11-16	sunrpc: allocate a separate bvec array for socket sends	Jeff Layton
	svc_tcp_sendmsg() calls xdr_buf_to_bvec() with the second slot of rq_bvec as the start, but doesn't reduce the array length by one, which could lead to an array overrun. Also, rq_bvec is always rq_maxpages in length, which can be too short in some cases, since the TCP record marker consumes a slot. Fix both problems by adding a separate bvec array to the svc_sock that is specifically for sending. For TCP, make this array one slot longer than rq_maxpages, to account for the record marker. For UDP, only allocate as large an array as we need since it's limited to 64k of payload. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-11-16	SUNRPC: Improve "fragment too large" warning	Chuck Lever
	Including the client IP address that generated the overrun traffic seems like it would be helpful. The message now reads: kernel: svc: nfsd oversized RPC fragment (1064958 octets) from 100.64.0.11:45866 Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-11-16	svcrdma: Release transport resources synchronously	Chuck Lever
	NFSD has always supported added network listeners. The new netlink protocol now enables the removal of listeners. Olga noticed that if an RDMA listener is removed and immediately re-added, the deferred __svc_rdma_free() function might not have run yet, so some or all of the old listener's RDMA resources linger, which prevents a new listener on the same address from being created. Also, svc_xprt_free() does a module_put() just after calling ->xpo_free(). That means if there is deferred work going on, the module could be unloaded before that work is even started, resulting in a UAF. Neil asks: > What particular part of __svc_rdma_free() needs to run in order for a > subsequent registration to succeed? > Can that bit be run directory from svc_rdma_free() rather than be > delayed? > (I know almost nothing about rdma so forgive me if the answers to these > questions seems obvious) The reasons I can recall are: - Some of the transport tear-down work can sleep - Releasing a cm_id is tricky and can deadlock We might be able to mitigate the second issue with judicious application of transport reference counting. Reported-by: Olga Kornievskaia <okorniev@redhat.com> Closes: https://lore.kernel.org/linux-nfs/20250821204328.89218-1-okorniev@redhat.com/ Suggested-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-11-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.18-rc6). No conflicts, adjacent changes in: drivers/net/phy/micrel.c 96a9178a29a6 ("net: phy: micrel: lan8814 fix reset of the QSGMII interface") 61b7ade9ba8c ("net: phy: micrel: Add support for non PTP SKUs for lan8814") and a trivial one in tools/testing/selftests/drivers/net/Makefile. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-12	Merge tag 'nfsd-6.18-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Address recently reported issues or issues found at the recent NFS bake-a-thon held in Raleigh, NC. Issues reported with v6.18-rc: - Address a kernel build issue - Reorder SEQUENCE processing to avoid spurious NFS4ERR_SEQ_MISORDERED Issues that need expedient stable backports: - Close a refcount leak exposure - Report support for NFSv4.2 CLONE correctly - Fix oops during COPY_NOTIFY processing - Prevent rare crash after XDR encoding failure - Prevent crash due to confused or malicious NFSv4.1 client" * tag 'nfsd-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: Revert "SUNRPC: Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it" nfsd: ensure SEQUENCE replay sends a valid reply. NFSD: Never cache a COMPOUND when the SEQUENCE operation fails NFSD: Skip close replay processing if XDR encoding fails NFSD: free copynotify stateid in nfs4_free_ol_stateid() nfsd: add missing FATTR4_WORD2_CLONE_BLKSIZE from supported attributes nfsd: fix refcount leak in nfsd_set_fh_dentry()
2025-11-10	Revert "SUNRPC: Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it"	Chuck Lever
	Geert reports: > This is now commit d8e97cc476e33037 ("SUNRPC: Make RPCSEC_GSS_KRB5 > select CRYPTO instead of depending on it") in v6.18-rc1. > As RPCSEC_GSS_KRB5 defaults to "y", CRYPTO is now auto-enabled in > defconfigs that didn't enable it before. Revert while we work out a proper solution and then test it. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/linux-nfs/b97cea29-4ab7-4fb6-85ba-83f9830e524f@kernel.org/T/#t Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-11-04	net: Convert proto_ops connect() callbacks to use sockaddr_unsized	Kees Cook
	Update all struct proto_ops connect() callback function prototypes from "struct sockaddr " to "struct sockaddr_unsized " to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04	net: Convert proto_ops bind() callbacks to use sockaddr_unsized	Kees Cook
	Update all struct proto_ops bind() callback function prototypes from "struct sockaddr " to "struct sockaddr_unsized " to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-06	Merge tag 'nfsd-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux	Linus Torvalds
	Pull nfsd updates from Chuck Lever: "Mike Snitzer has prototyped a mechanism for disabling I/O caching in NFSD. This is introduced in v6.18 as an experimental feature. This enables scaling NFSD in /both/ directions: - NFS service can be supported on systems with small memory footprints, such as low-cost cloud instances - Large NFS workloads will be less likely to force the eviction of server-local activity, helping it avoid thrashing Jeff Layton contributed a number of fixes to the new attribute delegation implementation (based on a pending Internet RFC) that we hope will make attribute delegation reliable enough to enable by default, as it is on the Linux NFS client. The remaining patches in this pull request are clean-ups and minor optimizations. Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.18 NFSD development cycle" * tag 'nfsd-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (42 commits) nfsd: discard nfserr_dropit SUNRPC: Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it NFSD: Add io_cache_{read,write} controls to debugfs NFSD: Do the grace period check in ->proc_layoutget nfsd: delete unnecessary NULL check in __fh_verify() NFSD: Allow layoutcommit during grace period NFSD: Disallow layoutget during grace period sunrpc: fix "occurence"->"occurrence" nfsd: Don't force CRYPTO_LIB_SHA256 to be built-in nfsd: nfserr_jukebox in nlm_fopen should lead to a retry NFSD: Reduce DRC bucket size NFSD: Delay adding new entries to LRU SUNRPC: Move the svc_rpcb_cleanup() call sites NFS: Remove rpcbind cleanup for NFSv4.0 callback nfsd: unregister with rpcbind when deleting a transport NFSD: Drop redundant conversion to bool sunrpc: eliminate return pointer in svc_tcp_sendmsg() sunrpc: fix pr_notice in svc_tcp_sendto() to show correct length nfsd: decouple the xprtsec policy check from check_nfsd_access() NFSD: Fix destination buffer size in nfsd4_ssc_setup_dul() ...
2025-10-01	SUNRPC: Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it	Eric Biggers
	Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it. This unblocks the eventual removal of the selection of CRYPTO from NFSD_V4, which will no longer be needed by nfsd itself due to switching to the crypto library functions. But NFSD_V4 selects RPCSEC_GSS_KRB5, which still needs CRYPTO. It makes more sense for RPCSEC_GSS_KRB5 to select CRYPTO itself, like most other kconfig options that need CRYPTO do. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-30	sunrpc: unexport rpc_malloc() and rpc_free()	Jeff Layton
	These are not used outside of sunrpc code. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-09-23	SUNRPC: Update gssx_accept_sec_context() to use xdr_set_scratch_folio()	Anna Schumaker
	This was the last caller of xdr_set_scratch_page(), so I remove this function while I'm at it. Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-09-23	SUNRPC: Update svcxdr_init_decode() to call xdr_set_scratch_folio()	Anna Schumaker
	The only snag here is that __folio_alloc_node() doesn't handle NUMA_NO_NODE, so I also need to update svc_pool_map_get_node() to return numa_mem_id() instead. I arrived at this approach by looking at what other users of __folio_alloc_node() do for this case. Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-09-23	SUNRPC: Remove redundant __GFP_NOWARN	Qianfeng Rong
	GFP_NOWAIT already includes __GFP_NOWARN, so let's remove the redundant __GFP_NOWARN. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-09-23	SUNRPC: Move the svc_rpcb_cleanup() call sites	Chuck Lever
	Clean up: because svc_rpcb_cleanup() and svc_xprt_destroy_all() are always invoked in pairs, we can deduplicate code by moving the svc_rpcb_cleanup() call sites into svc_xprt_destroy_all(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-09-23	sunrpc: add a Kconfig option to redirect dfprintk() output to trace buffer	Jeff Layton
	We have a lot of old dprintk() call sites that aren't going anywhere anytime soon. At the same time, turning them up is a serious burden on the host due to the console locking overhead. Add a new Kconfig option that redirects dfprintk() output to the trace buffer. This is more efficient than logging to the console and allows for proper interleaving of dprintk and static tracepoint events. Since using trace_printk() causes scary warnings to pop at boot time, this new option defaults to "n". Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-09-21	sunrpc: fix "occurence"->"occurrence"	Xichao Zhao
	Trivial fix to spelling mistake in comment text. Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com> Reviewed-by: Joe Damato <joe@dama.to> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-21	SUNRPC: Move the svc_rpcb_cleanup() call sites	Chuck Lever
	Clean up: because svc_rpcb_cleanup() and svc_xprt_destroy_all() are always invoked in pairs, we can deduplicate code by moving the svc_rpcb_cleanup() call sites into svc_xprt_destroy_all(). Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-21	nfsd: unregister with rpcbind when deleting a transport	Olga Kornievskaia
	When a listener is added, a part of creation of transport also registers program/port with rpcbind. However, when the listener is removed, while transport goes away, rpcbind still has the entry for that port/type. When deleting the transport, unregister with rpcbind when appropriate. ---v2 created a new xpt_flag XPT_RPCB_UNREG to mark TCP and UDP transport and at xprt destroy send rpcbind unregister if flag set. Suggested-by: Chuck Lever <chuck.lever@oracle.com> Fixes: d093c9089260 ("nfsd: fix management of listener transports") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-21	sunrpc: eliminate return pointer in svc_tcp_sendmsg()	Jeff Layton
	Return a positive value if something was sent, or a negative error code. Eliminate the "err" variable in the only caller as well. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-21	sunrpc: fix pr_notice in svc_tcp_sendto() to show correct length	Jeff Layton
	This pr_notice() is confusing since it only prints xdr->len, which doesn't include the 4-byte record marker. That can make it sometimes look like the socket sent more than was requested if it's short by just a few bytes. Add sizeof(marker) to the size and fix the format accordingly. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-21	sunrpc: delay pc_release callback until after the reply is sent	Jeff Layton
	The server-side sunrpc code currently calls pc_release before sending the reply. Change svc_process and svc_process_bc to call pc_release after sending the reply instead. Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-21	sunrpc: fix null pointer dereference on zero-length checksum	Lei Lu
	In xdr_stream_decode_opaque_auth(), zero-length checksum.len causes checksum.data to be set to NULL. This triggers a NPD when accessing checksum.data in gss_krb5_verify_mic_v2(). This patch ensures that the value of checksum.len is not less than XDR_UNIT. Fixes: 0653028e8f1c ("SUNRPC: Convert gss_verify_header() to use xdr_stream") Cc: stable@kernel.org Signed-off-by: Lei Lu <llfamsec@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-09-06	SUNRPC: call xs_sock_process_cmsg for all cmsg	Justin Worrell
	xs_sock_recv_cmsg was failing to call xs_sock_process_cmsg for any cmsg type other than TLS_RECORD_TYPE_ALERT (TLS_RECORD_TYPE_DATA, and other values not handled.) Based on my reading of the previous commit (cc5d5908: sunrpc: fix client side handling of tls alerts), it looks like only iov_iter_revert should be conditional on TLS_RECORD_TYPE_ALERT (but that other cmsg types should still call xs_sock_process_cmsg). On my machine, I was unable to connect (over mtls) to an NFS share hosted on FreeBSD. With this patch applied, I am able to mount the share again. Fixes: cc5d59081fa2 ("sunrpc: fix client side handling of tls alerts") Signed-off-by: Justin Worrell <jworrell@gmail.com> Reviewed-and-tested-by: Scott Mayhew <smayhew@redhat.com> Link: https://lore.kernel.org/r/20250904211038.12874-3-jworrell@gmail.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-09-06	Revert "SUNRPC: Don't allow waiting for exiting tasks"	Trond Myklebust
	This reverts commit 14e41b16e8cb677bb440dca2edba8b041646c742. This patch breaks the LTP acct02 test, so let's revert and look for a better solution. Reported-by: Mark Brown <broonie@kernel.org> Reported-by: Harshvardhan Jha <harshvardhan.j.jha@oracle.com> Link: https://lore.kernel.org/linux-nfs/7d4d57b0-39a3-49f1-8ada-60364743e3b4@sirena.org.uk/ Cc: stable@vger.kernel.org # 6.15.x Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-11	Merge tag 'nfsd-6.17-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - A correctness fix for delegated timestamps - Address an NFSD shutdown hang when LOCALIO is in use - Prevent a remotely exploitable crasher when TLS is in use * tag 'nfsd-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: sunrpc: fix handling of server side tls alerts nfsd: avoid ref leak in nfsd_open_local_fh() nfsd: don't set the ctime on delegated atime updates
2025-08-09	Merge tag 'nfs-for-6.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds
	Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - don't inherit NFS filesystem capabilities when crossing from one filesystem to another Bugfixes: - NFS wakeup of __nfs_lookup_revalidate() needs memory barriers - NFS improve bounds checking in nfs_fh_to_dentry() - NFS Fix allocation errors when writing to a NFS file backed loopback device - NFSv4: More listxattr fixes - SUNRPC: fix client handling of TLS alerts - pNFS block/scsi layout fix for an uninitialised pointer dereference - pNFS block/scsi layout fixes for the extent encoding, stripe mapping, and disk offset overflows - pNFS layoutcommit work around for RPC size limitations - pNFS/flexfiles avoid looping when handling fatal errors after layoutget - localio: fix various race conditions Features and cleanups: - Add NFSv4 support for retrieving the btime - NFS: Allow folio migration for the case of mode == MIGRATE_SYNC - NFS: Support using a kernel keyring to store TLS certificates - NFSv4: Speed up delegation lookup using a hash table - Assorted cleanups to remove unused variables and struct fields - Assorted new tracepoints to improve debugging" * tag 'nfs-for-6.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (44 commits) NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh() NFS/localio: nfs_close_local_fh() fix check for file closed NFSv4: Remove duplicate lookups, capability probes and fsinfo calls NFS: Fix the setting of capabilities when automounting a new filesystem sunrpc: fix client side handling of tls alerts nfs/localio: use read_seqbegin() rather than read_seqbegin_or_lock() NFS: Fixup allocation flags for nfsiod's __GFP_NORETRY NFSv4.2: another fix for listxattr NFS: Fix filehandle bounds checking in nfs_fh_to_dentry() SUNRPC: Silence warnings about parameters not being described NFS: Clean up pnfs_put_layout_hdr()/pnfs_destroy_layout_final() NFS: Fix wakeup of __nfs_lookup_revalidate() in unblock_revalidate() NFS: use a hash table for delegation lookup NFS: track active delegations per-server NFS: move the delegation_watermark module parameter NFS: cleanup nfs_inode_reclaim_delegation NFS: cleanup error handling in nfs4_server_common_setup pNFS/flexfiles: don't attempt pnfs on fatal DS errors NFS: drop __exit from nfs_exit_keyring ...
2025-08-06	sunrpc: fix handling of server side tls alerts	Olga Kornievskaia
	Scott Mayhew discovered a security exploit in NFS over TLS in tls_alert_recv() due to its assumption it can read data from the msg iterator's kvec.. kTLS implementation splits TLS non-data record payload between the control message buffer (which includes the type such as TLS aler or TLS cipher change) and the rest of the payload (say TLS alert's level/description) which goes into the msg payload buffer. This patch proposes to rework how control messages are setup and used by sock_recvmsg(). If no control message structure is setup, kTLS layer will read and process TLS data record types. As soon as it encounters a TLS control message, it would return an error. At that point, NFS can setup a kvec backed msg buffer and read in the control message such as a TLS alert. Msg iterator can advance the kvec pointer as a part of the copy process thus we need to revert the iterator before calling into the tls_alert_recv. Reported-by: Scott Mayhew <smayhew@redhat.com> Fixes: 5e052dda121e ("SUNRPC: Recognize control messages in server-side TCP socket code") Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-08-03	sunrpc: fix client side handling of tls alerts	Olga Kornievskaia
	A security exploit was discovered in NFS over TLS in tls_alert_recv due to its assumption that there is valid data in the msghdr's iterator's kvec. Instead, this patch proposes the rework how control messages are setup and used by sock_recvmsg(). If no control message structure is setup, kTLS layer will read and process TLS data record types. As soon as it encounters a TLS control message, it would return an error. At that point, NFS can setup a kvec backed control buffer and read in the control message such as a TLS alert. Scott found that a msg iterator can advance the kvec pointer as a part of the copy process thus we need to revert the iterator before calling into the tls_alert_recv. Fixes: dea034b963c8 ("SUNRPC: Capture CMSG metadata on client-side receive") Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Suggested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Link: https://lore.kernel.org/r/20250731180058.4669-3-okorniev@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-28	Merge tag 'pull-rpc_pipefs' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull rpc_pipefs updates from Al Viro: "Massage rpc_pipefs to use saner primitives and clean up the APIs provided to the rest of the kernel" * tag 'pull-rpc_pipefs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: rpc_create_client_dir(): return 0 or -E... rpc_create_client_dir(): don't bother with rpc_populate() rpc_new_dir(): the last argument is always NULL rpc_pipe: expand the calls of rpc_mkdir_populate() rpc_gssd_dummy_populate(): don't bother with rpc_populate() rpc_mkpipe_dentry(): switch to simple_start_creating() rpc_pipe: saner primitive for creating regular files rpc_pipe: saner primitive for creating subdirectories rpc_pipe: don't overdo directory locking rpc_mkpipe_dentry(): saner calling conventions rpc_unlink(): saner calling conventions rpc_populate(): lift cleanup into callers rpc_unlink(): use simple_recursive_removal() rpc_{rmdir_,}depopulate(): use simple_recursive_removal() instead rpc_pipe: clean failure exits in fill_super new helper: simple_start_creating()
2025-07-28	Merge tag 'pull-dcache' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dentry d_flags updates from Al Viro: "The current exclusion rules for dentry->d_flags stores are rather unpleasant. The basic rules are simple: - stores to dentry->d_flags are OK under dentry->d_lock - stores to dentry->d_flags are OK in the dentry constructor, before becomes potentially visible to other threads Unfortunately, there's a couple of exceptions to that, and that's where the headache comes from. The main PITA comes from d_set_d_op(); that primitive sets ->d_op of dentry and adjusts the flags that correspond to presence of individual methods. It's very easy to misuse; existing uses _are_ safe, but proof of correctness is brittle. Use in __d_alloc() is safe (we are within a constructor), but we might as well precalculate the initial value of 'd_flags' when we set the default ->d_op for given superblock and set 'd_flags' directly instead of messing with that helper. The reasons why other uses are safe are bloody convoluted; I'm not going to reproduce it here. See [1] for gory details, if you care. The critical part is using d_set_d_op() only just prior to d_splice_alias(), which makes a combination of d_splice_alias() with setting ->d_op, etc a natural replacement primitive. Better yet, if we go that way, it's easy to take setting ->d_op and modifying 'd_flags' under ->d_lock, which eliminates the headache as far as 'd_flags' exclusion rules are concerned. Other exceptions are minor and easy to deal with. What this series does: - d_set_d_op() is no longer available; instead a new primitive (d_splice_alias_ops()) is provided, equivalent to combination of d_set_d_op() and d_splice_alias(). - new field of struct super_block - 's_d_flags'. This sets the default value of 'd_flags' to be used when allocating dentries on this filesystem. - new primitive for setting 's_d_op': set_default_d_op(). This replaces stores to 's_d_op' at mount time. All in-tree filesystems converted; out-of-tree ones will get caught by the compiler ('s_d_op' is renamed, so stores to it will be caught). 's_d_flags' is set by the same primitive to match the 's_d_op'. - a lot of filesystems had sb->s_d_op->d_delete equal to always_delete_dentry; that is equivalent to setting DCACHE_DONTCACHE in 'd_flags', so such filesystems can bloody well set that bit in 's_d_flags' and drop 'd_delete()' from dentry_operations. In quite a few cases that results in empty dentry_operations, which means that we can get rid of those. - kill simple_dentry_operations - not needed anymore - massage d_alloc_parallel() to get rid of the other exception wrt 'd_flags' stores - we can set DCACHE_PAR_LOOKUP as soon as we allocate the new dentry; no need to delay that until we commit to using the sucker. As the result, 'd_flags' stores are all either under ->d_lock or done before the dentry becomes visible in any shared data structures" Link: https://lore.kernel.org/all/20250224010624.GT1977892@ZenIV/ [1] * tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits) configfs: use DCACHE_DONTCACHE debugfs: use DCACHE_DONTCACHE efivarfs: use DCACHE_DONTCACHE instead of always_delete_dentry() 9p: don't bother with always_delete_dentry ramfs, hugetlbfs, mqueue: set DCACHE_DONTCACHE kill simple_dentry_operations devpts, sunrpc, hostfs: don't bother with ->d_op shmem: no dentry retention past the refcount reaching zero d_alloc_parallel(): set DCACHE_PAR_LOOKUP earlier make d_set_d_op() static simple_lookup(): just set DCACHE_DONTCACHE tracefs: Add d_delete to remove negative dentries set_default_d_op(): calculate the matching value for ->d_flags correct the set of flags forbidden at d_set_d_op() time split d_flags calculation out of d_set_d_op() new helper: set_default_d_op() fuse: no need for special dentry_operations for root dentry switch procfs from d_set_d_op() to d_splice_alias_ops() new helper: d_splice_alias_ops() procfs: kill ->proc_dops ...
2025-07-22	SUNRPC: Silence warnings about parameters not being described	Trond Myklebust
	Warning: net/sunrpc/auth_gss/gss_krb5_crypto.c:902 function parameter 'len' not described in 'krb5_etm_decrypt' Warning: net/sunrpc/auth_gss/gss_krb5_crypto.c:902 function parameter 'buf' not described in 'krb5_etm_decrypt' Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14	SUNRPC: Remove unused xdr functions	Dr. David Alan Gilbert
	Remove a bunch of unused xdr_decode functions: The last use of xdr_decode_netobj() was removed in 2021 by: commit 7cf96b6d0104 ("lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream") The last use of xdr_decode_string_inplace() was removed in 2021 by: commit 3049e974a7c7 ("lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream") The last use of xdr_stream_decode_opaque() was removed in 2024 by: commit fed8a17c61ff ("xdrgen: typedefs should use the built-in string and opaque functions") The functions xdr_stream_decode_string() and xdr_stream_decode_opaque_dup() were both added in 2018 by the commit 0e779aa70308 ("SUNRPC: Add helpers for decoding opaque and string types") but never used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250712233006.403226-1-linux@treblig.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14	sunrpc: make svc_tcp_sendmsg() take a signed sentp pointer	Jeff Layton
	The return value of sock_sendmsg() is signed, and svc_tcp_sendto() wants a signed value to return. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-07-14	sunrpc: return better error in svcauth_gss_accept() on alloc failure	Jeff Layton
	This ends up returning AUTH_BADCRED when memory allocation fails today. Fix it to return AUTH_FAILED, which better indicates a failure on the server. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-07-14	sunrpc: reset rq_accept_statp when starting a new RPC	Jeff Layton
	rq_accept_statp should point to the location of the accept_status in the reply. This field is not reset between RPCs so if svc_authenticate or pg_authenticate return SVC_DENIED without setting the pointer, it could result in the status being written to the wrong place. This pointer starts its lifetime as NULL. Reset it on every iteration so we get consistent behavior if this happens. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-07-14	sunrpc: remove SVC_SYSERR	Jeff Layton
	Nothing returns this error code. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-07-14	sunrpc: fix handling of unknown auth status codes	Jeff Layton
	In the case of an unknown error code from svc_authenticate or pg_authenticate, return AUTH_ERROR with a status of AUTH_FAILED. Also add the other auth_stat value from RFC 5531, and document all the status codes. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-07-14	sunrpc: new tracepoints around svc thread wakeups	Jeff Layton
	Convert the svc_wake_up tracepoint into svc_pool_thread_event class. Have it also record the pool id, and add new tracepoints for when the thread is already running and for when there are no idle threads. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>