Age | Commit message (Collapse) | Author |
|
Instead of calling xchg() and unrcu_pointer() before
nfsd_file_put_local(), we now pass pointer to the __rcu pointer and call
xchg() and unrcu_pointer() inside that function.
Where unrcu_pointer() is currently called the internals of "struct
nfsd_file" are not known and that causes older compilers such as gcc-8
to complain.
In some cases we have a __kernel (aka normal) pointer not an __rcu
pointer so we need to cast it to __rcu first. This is strictly a
weakening so no information is lost. Somewhat surprisingly, this cast
is accepted by gcc-8.
This has the pleasing result that the cmpxchg() which sets ro_file and
rw_file, and also the xchg() which clears them, are both now in the nfsd
code.
Reported-by: Pali Rohár <pali@kernel.org>
Reported-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
nfs_uuid_put() and nfs_close_local_fh() can race if a "struct
nfs_file_localio" is released at the same time that nfsd calls
nfs_localio_invalidate_clients().
It is important that neither of these functions completes after the
other has started looking at a given nfs_file_localio and before it
finishes.
If nfs_uuid_put() exits while nfs_close_local_fh() is closing ro_file
and rw_file it could return to __nfd_file_cache_purge() while some files
are still referenced so the purge may not succeed.
If nfs_close_local_fh() exits while nfsd_uuid_put() is still closing the
files then the "struct nfs_file_localio" could be freed while
nfsd_uuid_put() is still looking at it. This side is currently handled
by copying the pointers out of ro_file and rw_file before deleting from
the list in nfsd_uuid. We need to preserve this while ensuring that
nfsd_uuid_put() does wait for nfs_close_local_fh().
This patch use nfl->uuid and nfl->list to provide the required
interlock.
nfs_uuid_put() removes the nfs_file_localio from the list, then drops
locks and puts the two files, then reclaims the spinlock and sets
->nfs_uuid to NULL.
nfs_close_local_fh() operates in the reverse order, setting ->nfs_uuid
to NULL, then closing the files, then unlinking from the list.
If nfs_uuid_put() finds that ->nfs_uuid is already NULL, it waits for
the nfs_file_localio to be removed from the list. If
nfs_close_local_fh() find that it has already been unlinked it waits for
->nfs_uuid to become NULL. This ensure that one of the two tries to
close the files, but they each waits for the other.
As nfs_uuid_put() is making the list empty, change from a
list_for_each_safe loop to a while that always takes the first entry.
This makes the intent more clear.
Also don't move the list to a temporary local list as this would defeat
the guarantees required for the interlock.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
nfs_close_local_fh() is called from two different places for quite
different use case.
It is called from nfs_uuid_put() when the nfs_uuid is being detached -
possibly because the nfs server is not longer serving that filesystem.
In this case there will always be an nfs_uuid and so rcu_read_lock() is
not needed.
It is also called when the nfs_file_localio is no longer needed. In
this case there may not be an active nfs_uuid.
These two can race, and handling the race properly while avoiding
excessive locking will require different handling on each side.
This patch prepares the way by opencoding nfs_close_local_fh() into
nfs_uuid_put(), then simplifying the code there as befits the context.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
The nfsd_localio_operations structure contains nfsd_file_get() to get a
reference to an nfsd_file. This is only used in one place, where
nfsd_open_local_fh() is also used.
This patch combines the two, calling nfsd_open_local_fh() passing a
pointer to where the nfsd_file pointer might be stored. If there is a
pointer there an nfsd_file_get() can get a reference, that reference is
returned. If not a new nfsd_file is acquired, stored at the pointer,
and returned. When we store a reference we also increase the refcount
on the net, as that refcount is decrements when we clear the stored
pointer.
We now get an extra reference *before* storing the new nfsd_file at the
given location. This avoids possible races with the nfsd_file being
freed before the final reference can be taken.
This patch moves the rcu_dereference() needed after fetching from
ro_file or rw_file into the nfsd code where the 'struct nfs_file' is
fully defined. This avoids an error reported by older versions of gcc
such as gcc-8 which complain about rcu_dereference() use in contexts
where the structure (which will supposedly be accessed) is not fully
defined.
Reported-by: Pali Rohár <pali@kernel.org>
Reported-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Having separate nfsd_file_put and nfsd_file_put_local in struct
nfsd_localio_operations doesn't make much sense. The difference is that
nfsd_file_put doesn't drop a reference to the nfs_net which is what
keeps nfsd from shutting down.
Currently, if nfsd tries to shutdown it will invalidate the files stored
in the list from the nfs_uuid and this will drop all references to the
nfsd net that the client holds. But the client could still hold some
references to nfsd_files for active IO. So nfsd might think is has
completely shut down local IO, but hasn't and has no way to wait for
those active IO requests to complete.
So this patch changes nfsd_file_get to nfsd_file_get_local and has it
increase the ref count on the nfsd net and it replaces all calls to
->nfsd_put_file to ->nfsd_put_file_local.
It also changes ->nfsd_open_local_fh to return with the refcount on the
net elevated precisely when a valid nfsd_file is returned.
This means that whenever the client holds a valid nfsd_file, there will
be an associated count on the nfsd net, and so the count can only reach
zero when all nfsd_files have been returned.
nfs_local_file_put() is changed to call nfs_to_nfsd_file_put_local()
instead of replacing calls to one with calls to the other because this
will help a later patch which changes nfs_to_nfsd_file_put_local() to
take an __rcu pointer while nfs_local_file_put() doesn't.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Rather than using nfs_uuid.lock to protect installing
a new ro_file or rw_file, change to use cmpxchg().
Removing the file already uses xchg() so this improves symmetry
and also makes the code a little simpler.
Also remove the optimisation of not taking the lock, and not removing
the nfs_file_localio from the linked list, when both ->ro_file and
->rw_file are already NULL. Given that ->nfs_uuid was not NULL, it is
extremely unlikely that neither ->ro_file or ->rw_file is NULL so
this optimisation can be of little value and it complicates
understanding of the code - why can the list_del_init() be skipped?
Finally, move the assignment of NULL to ->nfs_uuid until after
the last action on the nfs_file_localio (the list_del_init). As soon as
this is NULL a racing nfs_close_local_fh() can bypass all the locking
and go on to free the nfs_file_localio, so we must be certain to be
finished with it first.
Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
So, in order to avoid ending up with a flexible-array member in the
middle of other structs, we use the `struct_group_tagged()` helper
to create a new tagged `struct posix_acl_hdr`. This structure
groups together all the members of the flexible `struct posix_acl`
except the flexible array.
As a result, the array is effectively separated from the rest of the
members without modifying the memory layout of the flexible structure.
We then change the type of the middle struct member currently causing
trouble from `struct posix_acl` to `struct posix_acl_hdr`.
We also want to ensure that when new members need to be added to the
flexible structure, they are always included within the newly created
tagged struct. For this, we use `static_assert()`. This ensures that the
memory layout for both the flexible structure and the new tagged struct
is the same after any changes.
This approach avoids having to implement `struct posix_acl_hdr` as a
completely separate structure, thus preventing having to maintain two
independent but basically identical structures, closing the door to
potential bugs in the future.
We also use `container_of()` whenever we need to retrieve a pointer to
the flexible structure, through which we can access the flexible-array
member, if necessary.
So, with these changes, fix the following warning:
fs/nfs_common/nfsacl.c:45:26: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Anna Schumaker <anna.schumaker@oracle.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
nfs4_stat_to_errno() expects a NFSv4 error code as an argument and
returns a POSIX errno.
The problem is LOCALIO is passing nfs4_stat_to_errno() the POSIX errno
return values from filp->f_op->read_iter(), filp->f_op->write_iter()
and vfs_fsync_range().
So the POSIX errno that nfs_local_pgio_done() and
nfs_local_commit_done() are passing to nfs4_stat_to_errno() are
failing to match any NFSv4 error code, which results in
nfs4_stat_to_errno() defaulting to returning -EREMOTEIO. This causes
assertions in upper layers due to -EREMOTEIO not being a valid NFSv4
error code.
Fix this by updating nfs_local_pgio_done() and nfs_local_commit_done()
to use the new nfs_localio_errno_to_nfs4_stat() to map a POSIX errno
to an NFSv4 error code.
Care was taken to factor out nfs4_errtbl_common[] to avoid duplicating
the same NFS error to errno table. nfs4_errtbl_common[] is checked
first by both nfs4_stat_to_errno and nfs_localio_errno_to_nfs4_stat
before they check their own more specialized tables (nfs4_errtbl[] and
nfs4_errtbl_localio[] respectively).
While auditing the associated error mapping tables, the (ab)use of -1
for the last table entry was removed in favor of using ARRAY_SIZE to
iterate the nfs_errtbl[] and nfs4_errtbl[]. And 'errno_NFSERR_IO' was
removed because it caused needless obfuscation.
Fixes: 70ba381e1a431 ("nfs: add LOCALIO support")
Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Re-enabling NFSv3 LOCALIO is made more complex (than NFSv4) because v3
is stateless. As such, the hueristic used to identify a LOCALIO probe
point is more adhoc by nature: if/when NFSv3 client IO begins to
complete again in terms of normal RPC-based NFSv3 server IO, attempt
nfs_local_probe_async().
Care is taken to throttle the frequency of nfs_local_probe_async(),
otherwise there could be a flood of repeat calls to
nfs_local_probe_async().
The throttle is admin controlled using a new module parameter for
nfsv3, e.g.:
echo 512 > /sys/module/nfsv3/parameters/nfs3_localio_probe_throttle
Probe for NFSv3 LOCALIO every N IO requests (512 in this case). Must
be power-of-2, defaults to 0 (probing disabled).
On systems that expect to use LOCALIO with NFSv3 the admin should
configure the 'nfs3_localio_probe_throttle' module parameter.
This commit backfills module parameter documentation in localio.rst
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Remove nfs_local_enable and nfs_local_disable, instead use
nfs_localio_enable_client and nfs_localio_disable_client.
Discontinue use of the NFS_CS_LOCAL_IO bit in the nfs_client struct's
cl_flags to reflect that LOCALIO is enabled; instead just test if the
net member of the nfs_uuid_t struct is set.
Also remove NFS_CS_LOCAL_IO.
Lastly, remove trace_nfs_local_enable and trace_nfs_local_disable
because comparable traces are available from nfs_localio.ko.
Suggested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
The nfs_localio.ko now exposes /sys/kernel/tracing/events/nfs_localio
with nfs_localio_enable_client and nfs_localio_disable_client events.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
This tracking enables __nfsd_file_cache_purge() to call
nfs_localio_invalidate_clients(), upon shutdown or export change, to
nfs_close_local_fh() all open nfsd_files that are still cached by the
LOCALIO nfs clients associated with nfsd_net that is being shutdown.
Now that the client must track all open nfsd_files there was more work
than necessary being done with the global nfs_uuids_lock contended.
This manifested in various RCU issues, e.g.:
hrtimer: interrupt took 47969440 ns
rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
Use nfs_uuid->lock to protect all nfs_uuid_t members, instead of
nfs_uuids_lock, once nfs_uuid_is_local() adds the client to
nn->local_clients.
Also add 'local_clients_lock' to 'struct nfsd_net' to protect
nn->local_clients. And store a pointer to spinlock in the 'list_lock'
member of nfs_uuid_t so nfs_localio_disable_client() can use it to
avoid taking the global nfs_uuids_lock.
In combination, these split out locks eliminate the use of the single
nfslocalio.c global nfs_uuids_lock in the IO paths (open and close).
Also refactored associated fs/nfs_common/nfslocalio.c methods' locking
to reduce work performed with spinlocks held in general.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
This global spinlock protects all nfs_uuid_t relative to the global
nfs_uuids list. A later commit will split this global spinlock so
prepare by renaming this lock to reflect its intended narrow scope.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Also update Documentation/filesystems/nfs/localio.rst accordingly
and reduce the technical documentation debt that was previously
captured in that document.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
This commit switches from leaning heavily on NFSD's filecache (in
terms of GC'd nfsd_files) back to caching nfsd_files in the
client. A later commit will add the callback mechanism needed to
allow NFSD to force the NFS client to cleanup all cached nfsd_files.
Add nfs_fh_localio_init() and 'struct nfs_fh_localio' to cache opened
nfsd_file(s) (both a RO and RW nfsd_file is able to be opened and
cached for a given nfs_fh).
Update nfs_local_open_fh() to cache the nfsd_file once it is opened
using __nfs_local_open_fh().
Introduce nfs_close_local_fh() to clear the cached open nfsd_files and
call nfs_to_nfsd_file_put_local().
Refcounting is such that:
- nfs_local_open_fh() is paired with nfs_close_local_fh().
- __nfs_local_open_fh() is paired with nfs_to_nfsd_file_put_local().
- nfs_local_file_get() is paired with nfs_local_file_put().
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Remove cl_localio_lock from 'struct nfs_client' in favor of adding a
lock to the nfs_uuid_t struct (which is embedded in each nfs_client).
Push nfs_local_{enable,disable} implementation down to nfs_common.
Those methods now call nfs_localio_{enable,disable}_client.
This allows implementing nfs_localio_invalidate_clients in terms of
nfs_localio_disable_client.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Rename nfs_uuid_invalidate_one_client to nfs_localio_disable_client.
Rename nfs_uuid_invalidate_clients to nfs_localio_invalidate_clients.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Move holding the RCU from nfs_to_nfsd_file_put_local to
nfs_to_nfsd_net_put. It is the call to nfs_to->nfsd_serv_put that
requires the RCU anyway (the puts for nfsd_file and netns were
combined to avoid an extra indirect reference but that
micro-optimization isn't possible now).
This fixes xfstests generic/013 and it triggering:
"Voluntary context switch within RCU read-side critical section!"
[ 143.545738] Call Trace:
[ 143.546206] <TASK>
[ 143.546625] ? show_regs+0x6d/0x80
[ 143.547267] ? __warn+0x91/0x140
[ 143.547951] ? rcu_note_context_switch+0x496/0x5d0
[ 143.548856] ? report_bug+0x193/0x1a0
[ 143.549557] ? handle_bug+0x63/0xa0
[ 143.550214] ? exc_invalid_op+0x1d/0x80
[ 143.550938] ? asm_exc_invalid_op+0x1f/0x30
[ 143.551736] ? rcu_note_context_switch+0x496/0x5d0
[ 143.552634] ? wakeup_preempt+0x62/0x70
[ 143.553358] __schedule+0xaa/0x1380
[ 143.554025] ? _raw_spin_unlock_irqrestore+0x12/0x40
[ 143.554958] ? try_to_wake_up+0x1fe/0x6b0
[ 143.555715] ? wake_up_process+0x19/0x20
[ 143.556452] schedule+0x2e/0x120
[ 143.557066] schedule_preempt_disabled+0x19/0x30
[ 143.557933] rwsem_down_read_slowpath+0x24d/0x4a0
[ 143.558818] ? xfs_efi_item_format+0x50/0xc0 [xfs]
[ 143.559894] down_read+0x4e/0xb0
[ 143.560519] xlog_cil_commit+0x1b2/0xbc0 [xfs]
[ 143.561460] ? _raw_spin_unlock+0x12/0x30
[ 143.562212] ? xfs_inode_item_precommit+0xc7/0x220 [xfs]
[ 143.563309] ? xfs_trans_run_precommits+0x69/0xd0 [xfs]
[ 143.564394] __xfs_trans_commit+0xb5/0x330 [xfs]
[ 143.565367] xfs_trans_roll+0x48/0xc0 [xfs]
[ 143.566262] xfs_defer_trans_roll+0x57/0x100 [xfs]
[ 143.567278] xfs_defer_finish_noroll+0x27a/0x490 [xfs]
[ 143.568342] xfs_defer_finish+0x1a/0x80 [xfs]
[ 143.569267] xfs_bunmapi_range+0x4d/0xb0 [xfs]
[ 143.570208] xfs_itruncate_extents_flags+0x13d/0x230 [xfs]
[ 143.571353] xfs_free_eofblocks+0x12e/0x190 [xfs]
[ 143.572359] xfs_file_release+0x12d/0x140 [xfs]
[ 143.573324] __fput+0xe8/0x2d0
[ 143.573922] __fput_sync+0x1d/0x30
[ 143.574574] nfsd_filp_close+0x33/0x60 [nfsd]
[ 143.575430] nfsd_file_free+0x96/0x150 [nfsd]
[ 143.576274] nfsd_file_put+0xf7/0x1a0 [nfsd]
[ 143.577104] nfsd_file_put_local+0x18/0x30 [nfsd]
[ 143.578070] nfs_close_local_fh+0x101/0x110 [nfs_localio]
[ 143.579079] __put_nfs_open_context+0xc9/0x180 [nfs]
[ 143.580031] nfs_file_clear_open_context+0x4a/0x60 [nfs]
[ 143.581038] nfs_file_release+0x3e/0x60 [nfs]
[ 143.581879] __fput+0xe8/0x2d0
[ 143.582464] __fput_sync+0x1d/0x30
[ 143.583108] __x64_sys_close+0x41/0x80
[ 143.583823] x64_sys_call+0x189a/0x20d0
[ 143.584552] do_syscall_64+0x64/0x170
[ 143.585240] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 143.586185] RIP: 0033:0x7f3c5153efd7
Fixes: 65f2a5c36635 ("nfs_common: fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put()")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Fix the possibility of racing nfs_local_probe() resulting in:
list_add double add: new=ffff8b99707f9f58, prev=ffff8b99707f9f58, next=ffffffffc0f30000.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:35!
Add nfs_uuid_init() to properly initialize all nfs_uuid_t members
(particularly its list_head).
Switch to returning bool from nfs_uuid_begin(), returns false if
nfs_uuid_t is already in-use (its list_head is on a list). Update
nfs_local_probe() to return early if the nfs_client's cl_uuid
(nfs_uuid_t) is in-use.
Also, switch nfs_uuid_begin() from using list_add_tail_rcu() to
list_add_tail() -- rculist was used in an earlier version of the
localio code that had a lockless nfs_uuid_lookup interface.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Add nfs_to_nfsd_file_put_local() interface to fix race with nfsd
module unload. Similarly, use RCU around nfs_open_local_fh()'s error
path call to nfs_to->nfsd_serv_put(). Holding RCU ensures that NFS
will safely _call and return_ from its nfs_to calls into the NFSD
functions nfsd_file_put_local() and nfsd_serv_put().
Otherwise, if RCU isn't used then there is a narrow window when NFS's
reference for the nfsd_file and nfsd_serv are dropped and the NFSD
module could be unloaded, which could result in a crash from the
return instruction for either nfs_to->nfsd_file_put_local() or
nfs_to->nfsd_serv_put().
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Add client support for bypassing NFS for localhost reads, writes, and
commits. This is only useful when the client and the server are
running on the same host.
nfs_local_probe() is stubbed out, later commits will enable client and
server handshake via a Linux-only LOCALIO auxiliary RPC protocol.
This has dynamic binding with the nfsd module (via nfs_localio module
which is part of nfs_common). LOCALIO will only work if nfsd is
already loaded.
The "localio_enabled" nfs kernel module parameter can be used to
disable and enable the ability to use LOCALIO support.
CONFIG_NFS_LOCALIO enables NFS client support for LOCALIO.
Lastly, LOCALIO uses an nfsd_file to initiate all IO. To make proper
use of nfsd_file (and nfsd's filecache) its lifetime (duration before
nfsd_file_put is called) must extend until after commit, read and
write operations. So rather than immediately drop the nfsd_file
reference in nfs_local_open_fh(), that doesn't happen until
nfs_local_pgio_release() for read/write and not until
nfs_local_release_commit_data() for commit. The same applies to the
reference held on nfsd's nn->nfsd_serv. Both objects' lifetimes and
associated references are managed through calls to
nfs_to->nfsd_file_put_local().
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de> # nfs_open_local_fh
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
The next commit will introduce nfsd_open_local_fh() which returns an
nfsd_file structure. This commit exposes LOCALIO's required NFSD
symbols to the NFS client:
- Make nfsd_open_local_fh() symbol and other required NFSD symbols
available to NFS in a global 'nfs_to' nfsd_localio_operations
struct (global access suggested by Trond, nfsd_localio_operations
suggested by NeilBrown). The next commit will also introduce
nfsd_localio_ops_init() that init_nfsd() will call to initialize
'nfs_to'.
- Introduce nfsd_file_file() that provides access to nfsd_file's
backing file. Keeps nfsd_file structure opaque to NFS client (as
suggested by Jeff Layton).
- Introduce nfsd_file_put_local() that will put the reference to the
nfsd_file's associated nn->nfsd_serv and then put the reference to
the nfsd_file (as suggested by NeilBrown).
Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> # nfs_to
Suggested-by: NeilBrown <neilb@suse.de> # nfsd_localio_operations
Suggested-by: Jeff Layton <jlayton@kernel.org> # nfsd_file_file
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
fs/nfs_common/nfslocalio.c provides interfaces that enable an NFS
client to generate a nonce (single-use UUID) and associated nfs_uuid_t
struct, register it with nfs_common for subsequent lookup and
verification by the NFS server and if matched the NFS server populates
members in the nfs_uuid_t struct.
nfs_common's nfs_uuids list is the basis for localio enablement, as
such it has members that point to nfsd memory for direct use by the
client (e.g. 'net' is the server's network namespace, through it the
client can access nn->nfsd_serv).
This commit also provides the base nfs_uuid_t interfaces to allow
proper net namespace refcounting for the LOCALIO use case.
CONFIG_NFS_LOCALIO controls the nfs_common, NFS server and NFS client
enablement for LOCALIO. If both NFS_FS=m and NFSD=m then
NFS_COMMON_LOCALIO_SUPPORT=m and nfs_localio.ko is built (and provides
nfs_common's LOCALIO support).
# lsmod | grep nfs_localio
nfs_localio 12288 2 nfsd,nfs
sunrpc 745472 35 nfs_localio,nfsd,auth_rpcgss,lockd,nfsv3,nfs
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Co-developed-by: NeilBrown <neilb@suse.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Common nfs4_stat_to_errno() is used by fs/nfs/nfs4xdr.c and will be
used by fs/nfs/localio.c
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Common nfs_stat_to_errno() is used by both fs/nfs/nfs2xdr.c and
fs/nfs/nfs3xdr.c
Will also be used by fs/nfsd/localio.c
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
Fix the 'make W=1' warnings:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nfs_common/nfs_acl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nfs_common/grace.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nfs/nfs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nfs/nfsv2.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nfs/nfsv3.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nfs/nfsv4.o
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Since commit 8b41fc4454e ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.
So remove it in the files in this commit, none of which can be built as
modules.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
The file locking definitions have lived in fs.h since the dawn of time,
but they are only used by a small subset of the source files that
include it.
Move the file locking definitions to a new header file, and add the
appropriate #include directives to the source files that need them. By
doing this we trim down fs.h a bit and limit the amount of rebuilding
that has to be done when we make changes to the file locking APIs.
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Steve French <stfrench@microsoft.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
|
|
Fix a kerneldoc warning in nfs due to documentation for a parameter that
isn't present.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: Mauro Carvalho Chehab <mchehab@kernel.org>
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/163214005516.2945267.7000234432243167892.stgit@warthog.procyon.org.uk/ # rfc v1
Link: https://lore.kernel.org/r/163281899704.2790286.9177774252843775348.stgit@warthog.procyon.org.uk/ # rfc v2
|
|
Fix gcc W=1 warning:
fs/nfs_common/grace.c:91: warning: Function parameter or member 'net' not described in 'locks_in_grace'
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Currently NFSv4_2 SSC helper, nfs_ssc, incorrectly uses GRACE_PERIOD
as its config. Fix by adding new config NFS_V4_2_SSC_HELPER which
depends on NFS_V4_2 and is automatically selected when NFSD_V4 is
enabled. Also removed the file name from a comment in nfs_ssc.c.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
If the elem is deleted during be iterated on it, the iteration
process will fall into an endless loop.
kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [nfsd:17137]
PID: 17137 TASK: ffff8818d93c0000 CPU: 4 COMMAND: "nfsd"
[exception RIP: __state_in_grace+76]
RIP: ffffffffc00e817c RSP: ffff8818d3aefc98 RFLAGS: 00000246
RAX: ffff881dc0c38298 RBX: ffffffff81b03580 RCX: ffff881dc02c9f50
RDX: ffff881e3fce8500 RSI: 0000000000000001 RDI: ffffffff81b03580
RBP: ffff8818d3aefca0 R8: 0000000000000020 R9: ffff8818d3aefd40
R10: ffff88017fc03800 R11: ffff8818e83933c0 R12: ffff8818d3aefd40
R13: 0000000000000000 R14: ffff8818e8391068 R15: ffff8818fa6e4000
CS: 0010 SS: 0018
#0 [ffff8818d3aefc98] opens_in_grace at ffffffffc00e81e3 [grace]
#1 [ffff8818d3aefca8] nfs4_preprocess_stateid_op at ffffffffc02a3e6c [nfsd]
#2 [ffff8818d3aefd18] nfsd4_write at ffffffffc028ed5b [nfsd]
#3 [ffff8818d3aefd80] nfsd4_proc_compound at ffffffffc0290a0d [nfsd]
#4 [ffff8818d3aefdd0] nfsd_dispatch at ffffffffc027b800 [nfsd]
#5 [ffff8818d3aefe08] svc_process_common at ffffffffc02017f3 [sunrpc]
#6 [ffff8818d3aefe70] svc_process at ffffffffc0201ce3 [sunrpc]
#7 [ffff8818d3aefe98] nfsd at ffffffffc027b117 [nfsd]
#8 [ffff8818d3aefec8] kthread at ffffffff810b88c1
#9 [ffff8818d3aeff50] ret_from_fork at ffffffff816d1607
The troublemake elem:
crash> lock_manager ffff881dc0c38298
struct lock_manager {
list = {
next = 0xffff881dc0c38298,
prev = 0xffff881dc0c38298
},
block_opens = false
}
Fixes: c87fb4a378f9 ("lockd: NLM grace period shouldn't block NFSv4 opens")
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
NFS_FS=y as dependency of CONFIG_NFSD_V4_2_INTER_SSC still have
build errors and some configs with NFSD=m to get NFS4ERR_STALE
error when doing inter server copy.
Added ops table in nfs_common for knfsd to access NFS client modules.
Fixes: 3ac3711adb88 ("NFSD: Fix NFS server build errors")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.
[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
|
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have MODULE_LICENCE("GPL*") inside which was used in the initial
scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
restart_grace() uses hardcoded init_net.
It can cause to "list_add double add" in following scenario:
1) nfsd and lockd was started in several net namespaces
2) nfsd in init_net was stopped (lockd was not stopped because
it have users from another net namespaces)
3) lockd got signal, called restart_grace() -> set_grace_period()
and enabled lock_manager in hardcoded init_net.
4) nfsd in init_net is started again,
its lockd_up() calls set_grace_period() and tries to add
lock_manager into init_net 2nd time.
Jeff Layton suggest:
"Make it safe to call locks_start_grace multiple times on the same
lock_manager. If it's already on the global grace_list, then don't try
to add it again. (But we don't intentionally add twice, so for now we
WARN about that case.)
With this change, we also need to ensure that the nfsd4 lock manager
initializes the list before we call locks_start_grace. While we're at
it, move the rest of the nfsd_net initialization into
nfs4_state_create_net. I see no reason to have it spread over two
functions like it is today."
Suggested patch was updated to generate warning in described situation.
Suggested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Since __state_in_grace return only true/false, make it return bool
instead of int.
Same change for the two user of it, locks_in_grace/opens_in_grace
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Commit c87fb4a378f9 ("lockd: NLM grace period shouldn't block NFSv4 opens")
made the locks_in_grace() comment be in the wrong place.
This patch move this comment just at the right place.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
This fix the following warning
fs/nfs_common/grace.c:66:1: warning: no previous prototype for function '__state_in_grace' [-Wmissing-prototypes]
by adding the missing static.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Make struct pernet_operations::id unsigned.
There are 2 reasons to do so:
1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.
2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.
"int" being used as an array index needs to be sign-extended
to 64-bit before being used.
void f(long *p, int i)
{
g(p[i]);
}
roughly translates to
movsx rsi, esi
mov rdi, [rsi+...]
call g
MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.
Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:
static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}
And this function is used a lot, so those sign extensions add up.
Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):
add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]
However, overall balance is in negative direction:
add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function old new delta
nfsd4_lock 3886 3959 +73
tipc_link_build_proto_msg 1096 1140 +44
mac80211_hwsim_new_radio 2776 2808 +32
tipc_mon_rcv 1032 1058 +26
svcauth_gss_legacy_init 1413 1429 +16
tipc_bcbase_select_primary 379 392 +13
nfsd4_exchange_id 1247 1260 +13
nfsd4_setclientid_confirm 782 793 +11
...
put_client_renew_locked 494 480 -14
ip_set_sockfn_get 730 716 -14
geneve_sock_add 829 813 -16
nfsd4_sequence_done 721 703 -18
nlmclnt_lookup_host 708 686 -22
nfsd4_lockt 1085 1063 -22
nfs_get_client 1077 1050 -27
tcf_bpf_init 1106 1076 -30
nfsd4_encode_fattr 5997 5930 -67
Total: Before=154856051, After=154854321, chg -0.00%
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
NLM locks don't conflict with NFSv4 share reservations, so we're not
going to learn anything new by watiting for them.
They do conflict with NFSv4 locks and with delegations.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Currently, all of the grace period handling is part of lockd. Eventually
though we'd like to be able to build v4-only servers, at which point
we'll need to put all of this elsewhere.
Move the code itself into fs/nfs_common and have it build a grace.ko
module. Then, rejigger the Kconfig options so that both nfsd and lockd
enable it automatically.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
|
|
Fix checkpatch warnings:
"WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable"
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
- Use kuid_t and kgit in struct nfsacl_encode_desc.
- Convert from kuids and kgids when generating on the wire values.
- Convert on the wire values to kuids and kgids when read.
- Modify cmp_acl_entry to be type safe comparison on posix acls.
Only acls with type ACL_USER and ACL_GROUP can appear more
than once and as such need to compare more than their tag.
- The e_id field is being removed from posix acls so don't initialize it.
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
|
Remove pointless memset() in nfsacl_encode().
Thanks to Trond Myklebust <Trond.Myklebust@netapp.com> for pointing out
that it is not needed since posix_acl_init() will set everything
regardless..
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|