summaryrefslogtreecommitdiff
path: root/Documentation/filesystems
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-01-28 14:23:46 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2025-01-28 14:23:46 -0800
commitb88fe2b5dd018c2b856fd6c32b82f25033e908d4 (patch)
tree54ccf64a07b4a3f06bf3d919bb9bf0a5e4a48dfe /Documentation/filesystems
parent3673f5be0ec4798089c2c014505e54fc361d3616 (diff)
parent6f56971841a178e99c502f4150fa28b9d699ed31 (diff)
Merge tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker: "New Features: - Enable using direct IO with localio - Added localio related tracepoints Bugfixes: - Sunrpc fixes for working with a very large cl_tasks list - Fix a possible buffer overflow in nfs_sysfs_link_rpc_client() - Fixes for handling reconnections with localio - Fix how the NFS_FSCACHE kconfig option interacts with NETFS_SUPPORT - Fix COPY_NOTIFY xdr_buf size calculations - pNFS/Flexfiles fix for retrying requesting a layout segment for reads - Sunrpc fix for retrying on EKEYEXPIRED error when the TGT is expired Cleanups: - Various other nfs & nfsd localio cleanups - Prepratory patches for async copy improvements that are under development - Make OFFLOAD_CANCEL, LAYOUTSTATS, and LAYOUTERR moveable to other xprts - Add netns inum and srcaddr to debugfs rpc_xprt info" * tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (28 commits) SUNRPC: do not retry on EKEYEXPIRED when user TGT ticket expired sunrpc: add netns inum and srcaddr to debugfs rpc_xprt info pnfs/flexfiles: retry getting layout segment for reads NFSv4.2: make LAYOUTSTATS and LAYOUTERROR MOVEABLE NFSv4.2: mark OFFLOAD_CANCEL MOVEABLE NFSv4.2: fix COPY_NOTIFY xdr buf size calculation NFS: Rename struct nfs4_offloadcancel_data NFS: Fix typo in OFFLOAD_CANCEL comment NFS: CB_OFFLOAD can return NFS4ERR_DELAY nfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it nfs: fix incorrect error handling in LOCALIO nfs: probe for LOCALIO when v3 client reconnects to server nfs: probe for LOCALIO when v4 client reconnects to server nfs/localio: remove redundant code and simplify LOCALIO enablement nfs_common: add nfs_localio trace events nfs_common: track all open nfsd_files per LOCALIO nfs_client nfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock nfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_ nfsd: update percpu_ref to manage references on nfsd_net ...
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/nfs/localio.rst104
1 files changed, 52 insertions, 52 deletions
diff --git a/Documentation/filesystems/nfs/localio.rst b/Documentation/filesystems/nfs/localio.rst
index bd1967e2eab3..79808b37d745 100644
--- a/Documentation/filesystems/nfs/localio.rst
+++ b/Documentation/filesystems/nfs/localio.rst
@@ -218,64 +218,30 @@ NFS Client and Server Interlock
===============================
LOCALIO provides the nfs_uuid_t object and associated interfaces to
-allow proper network namespace (net-ns) and NFSD object refcounting:
-
- We don't want to keep a long-term counted reference on each NFSD's
- net-ns in the client because that prevents a server container from
- completely shutting down.
-
- So we avoid taking a reference at all and rely on the per-cpu
- reference to the server (detailed below) being sufficient to keep
- the net-ns active. This involves allowing the NFSD's net-ns exit
- code to iterate all active clients and clear their ->net pointers
- (which are needed to find the per-cpu-refcount for the nfsd_serv).
-
- Details:
-
- - Embed nfs_uuid_t in nfs_client. nfs_uuid_t provides a list_head
- that can be used to find the client. It does add the 16-byte
- uuid_t to nfs_client so it is bigger than needed (given that
- uuid_t is only used during the initial NFS client and server
- LOCALIO handshake to determine if they are local to each other).
- If that is really a problem we can find a fix.
-
- - When the nfs server confirms that the uuid_t is local, it moves
- the nfs_uuid_t onto a per-net-ns list in NFSD's nfsd_net.
-
- - When each server's net-ns is shutting down - in a "pre_exit"
- handler, all these nfs_uuid_t have their ->net cleared. There is
- an rcu_synchronize() call between pre_exit() handlers and exit()
- handlers so any caller that sees nfs_uuid_t ->net as not NULL can
- safely manage the per-cpu-refcount for nfsd_serv.
-
- - The client's nfs_uuid_t is passed to nfsd_open_local_fh() so it
- can safely dereference ->net in a private rcu_read_lock() section
- to allow safe access to the associated nfsd_net and nfsd_serv.
-
-So LOCALIO required the introduction and use of NFSD's percpu_ref to
-interlock nfsd_destroy_serv() and nfsd_open_local_fh(), to ensure each
-nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh(), and
+allow proper network namespace (net-ns) and NFSD object refcounting.
+
+LOCALIO required the introduction and use of NFSD's percpu nfsd_net_ref
+to interlock nfsd_shutdown_net() and nfsd_open_local_fh(), to ensure
+each net-ns is not destroyed while in use by nfsd_open_local_fh(), and
warrants a more detailed explanation:
- nfsd_open_local_fh() uses nfsd_serv_try_get() before opening its
+ nfsd_open_local_fh() uses nfsd_net_try_get() before opening its
nfsd_file handle and then the caller (NFS client) must drop the
- reference for the nfsd_file and associated nn->nfsd_serv using
- nfs_file_put_local() once it has completed its IO.
+ reference for the nfsd_file and associated net-ns using
+ nfsd_file_put_local() once it has completed its IO.
This interlock working relies heavily on nfsd_open_local_fh() being
afforded the ability to safely deal with the possibility that the
NFSD's net-ns (and nfsd_net by association) may have been destroyed
- by nfsd_destroy_serv() via nfsd_shutdown_net() -- which is only
- possible given the nfs_uuid_t ->net pointer managemenet detailed
- above.
-
-All told, this elaborate interlock of the NFS client and server has been
-verified to fix an easy to hit crash that would occur if an NFSD
-instance running in a container, with a LOCALIO client mounted, is
-shutdown. Upon restart of the container and associated NFSD the client
-would go on to crash due to NULL pointer dereference that occurred due
-to the LOCALIO client's attempting to nfsd_open_local_fh(), using
-nn->nfsd_serv, without having a proper reference on nn->nfsd_serv.
+ by nfsd_destroy_serv() via nfsd_shutdown_net().
+
+This interlock of the NFS client and server has been verified to fix an
+easy to hit crash that would occur if an NFSD instance running in a
+container, with a LOCALIO client mounted, is shutdown. Upon restart of
+the container and associated NFSD, the client would go on to crash due
+to NULL pointer dereference that occurred due to the LOCALIO client's
+attempting to nfsd_open_local_fh() without having a proper reference on
+NFSD's net-ns.
NFS Client issues IO instead of Server
======================================
@@ -306,10 +272,26 @@ is issuing IO to the underlying local filesystem that it is sharing with
the NFS server. See: fs/nfs/localio.c:nfs_local_doio() and
fs/nfs/localio.c:nfs_local_commit().
+With normal NFS that makes use of RPC to issue IO to the server, if an
+application uses O_DIRECT the NFS client will bypass the pagecache but
+the NFS server will not. The NFS server's use of buffered IO affords
+applications to be less precise with their alignment when issuing IO to
+the NFS client. But if all applications properly align their IO, LOCALIO
+can be configured to use end-to-end O_DIRECT semantics from the NFS
+client to the underlying local filesystem, that it is sharing with
+the NFS server, by setting the 'localio_O_DIRECT_semantics' nfs module
+parameter to Y, e.g.:
+
+ echo Y > /sys/module/nfs/parameters/localio_O_DIRECT_semantics
+
+Once enabled, it will cause LOCALIO to use end-to-end O_DIRECT semantics
+(but again, this may cause IO to fail if applications do not properly
+align their IO).
+
Security
========
-Localio is only supported when UNIX-style authentication (AUTH_UNIX, aka
+LOCALIO is only supported when UNIX-style authentication (AUTH_UNIX, aka
AUTH_SYS) is used.
Care is taken to ensure the same NFS security mechanisms are used
@@ -324,6 +306,24 @@ client is afforded this same level of access (albeit in terms of the NFS
protocol via SUNRPC). No other namespaces (user, mount, etc) have been
altered or purposely extended from the server to the client.
+Module Parameters
+=================
+
+/sys/module/nfs/parameters/localio_enabled (bool)
+controls if LOCALIO is enabled, defaults to Y. If client and server are
+local but 'localio_enabled' is set to N then LOCALIO will not be used.
+
+/sys/module/nfs/parameters/localio_O_DIRECT_semantics (bool)
+controls if O_DIRECT extends down to the underlying filesystem, defaults
+to N. Application IO must be logical blocksize aligned, otherwise
+O_DIRECT will fail.
+
+/sys/module/nfsv3/parameters/nfs3_localio_probe_throttle (uint)
+controls if NFSv3 read and write IOs will trigger (re)enabling of
+LOCALIO every N (nfs3_localio_probe_throttle) IOs, defaults to 0
+(disabled). Must be power-of-2, admin keeps all the pieces if they
+misconfigure (too low a value or non-power-of-2).
+
Testing
=======