summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-09-05gfs2: Introduce new quota=quiet mount optionBob Peterson
This patch adds a new mount option quota=quiet which is the same as quota=on but it suppresses gfs2 quota error messages. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Add device name to gfs2_logd and gfs2_quotadAndreas Gruenbacher
Add the device name to the names of the gfs2_logd and gfs2_quotad kernel threads to allow for easier identification. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Rename "freeze_workqueue" to "gfs2_freeze"Andreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Rename "gfs_recovery" workqueue to "gfs2_recovery"Andreas Gruenbacher
Rename the "gfs_recovery" workqueue to "gfs2_recovery", and gfs_recovery_wq to gfs2_recovery_wq. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Fix withdraw raceAndreas Gruenbacher
Function gfs2_withdraw() tries to synchronize concurrent callers by atomically setting the SDF_WITHDRAWN flag in the first caller, setting the SDF_WITHDRAW_IN_PROG flag to indicate that a withdraw is in progress, performing the actual withdraw, and clearing the SDF_WITHDRAW_IN_PROG flag when done. All other callers wait for the SDF_WITHDRAW_IN_PROG flag to be cleared before returning. This leaves a small window in which callers can find the SDF_WITHDRAWN flag set before the SDF_WITHDRAW_IN_PROG flag has been set, causing them to return prematurely, before the withdraw has been completed. Fix that by setting the SDF_WITHDRAWN and SDF_WITHDRAW_IN_PROG flags atomically. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Sanitize kthread stoppingAndreas Gruenbacher
Immediately stop the logd and quotad kernel threads when a filesystem withdraw is detected: those threads aren't doing anything useful after a withdraw. (Depends on the extra logd and quotad task struct references held since commit 7a109f383fa3 ("gfs2: Fix asynchronous thread destruction").) In addition, check for kthread_should_stop() in the wait condition in gfs2_quotad() to stop immediately when kthread_stop() is called. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Switch to wait_event in gfs2_quotadAndreas Gruenbacher
In gfs2_quotad(), switch from an open-coded wait loop to wait_event_interruptible_timeout(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Fix asynchronous thread destructionAndreas Gruenbacher
The kernel threads are currently stopped and destroyed synchronously by gfs2_make_fs_ro() and gfs2_put_super(), and asynchronously by signal_our_withdraw(), with no synchronization, so the synchronous and asynchronous contexts can race with each other. First, when creating the kernel threads, take an extra task struct reference so that the task struct won't go away immediately when they terminate. This allows those kthreads to terminate immediately when they're done rather than hanging around as zombies until they are reaped by kthread_stop(). When kthread_stop() is called on a terminated kthread, it will return immediately. Second, in signal_our_withdraw(), once the SDF_JOURNAL_LIVE flag has been cleared, wake up the logd and quotad wait queues instead of stopping the logd and quotad kthreads. The kthreads are then expected to terminate automatically within short time, but if they cannot, they will not block the withdraw. For example, if a user process and one of the kthread decide to withdraw at the same time, only one of them will perform the actual withdraw and the other will wait for it to be done. If the kthread ends up being the one to wait, the withdrawing user process won't be able to stop it. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Stop using gfs2_make_fs_ro for withdrawAndreas Gruenbacher
[ 81.372851][ T5532] CPU: 1 PID: 5532 Comm: syz-executor.0 Not tainted 6.2.0-rc1-syzkaller-dirty #0 [ 81.382080][ T5532] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023 [ 81.392343][ T5532] Call Trace: [ 81.395654][ T5532] <TASK> [ 81.398603][ T5532] dump_stack_lvl+0x1b1/0x290 [ 81.418421][ T5532] gfs2_assert_warn_i+0x19a/0x2e0 [ 81.423480][ T5532] gfs2_quota_cleanup+0x4c6/0x6b0 [ 81.428611][ T5532] gfs2_make_fs_ro+0x517/0x610 [ 81.457802][ T5532] gfs2_withdraw+0x609/0x1540 [ 81.481452][ T5532] gfs2_inode_refresh+0xb2d/0xf60 [ 81.506658][ T5532] gfs2_instantiate+0x15e/0x220 [ 81.511504][ T5532] gfs2_glock_wait+0x1d9/0x2a0 [ 81.516352][ T5532] do_sync+0x485/0xc80 [ 81.554943][ T5532] gfs2_quota_sync+0x3da/0x8b0 [ 81.559738][ T5532] gfs2_sync_fs+0x49/0xb0 [ 81.564063][ T5532] sync_filesystem+0xe8/0x220 [ 81.568740][ T5532] generic_shutdown_super+0x6b/0x310 [ 81.574112][ T5532] kill_block_super+0x79/0xd0 [ 81.578779][ T5532] deactivate_locked_super+0xa7/0xf0 [ 81.584064][ T5532] cleanup_mnt+0x494/0x520 [ 81.593753][ T5532] task_work_run+0x243/0x300 [ 81.608837][ T5532] exit_to_user_mode_loop+0x124/0x150 [ 81.614232][ T5532] exit_to_user_mode_prepare+0xb2/0x140 [ 81.619820][ T5532] syscall_exit_to_user_mode+0x26/0x60 [ 81.625287][ T5532] do_syscall_64+0x49/0xb0 [ 81.629710][ T5532] entry_SYSCALL_64_after_hwframe+0x63/0xcd In this backtrace, gfs2_quota_sync() takes quota data references and then calls do_sync(). Function do_sync() encounters filesystem corruption and withdraws the filesystem, which (among other things) calls gfs2_quota_cleanup(). Function gfs2_quota_cleanup() wrongly assumes that nobody is holding any quota data references anymore, and destroys all quota data objects. When gfs2_quota_sync() then resumes and dereferences the quota data objects it is holding, those objects are no longer there. Function gfs2_quota_cleanup() deals with resource deallocation and can easily be delayed until gfs2_put_super() in the case of a filesystem withdraw. In fact, most of the other work gfs2_make_fs_ro() does is unnecessary during a withdraw as well, so change signal_our_withdraw() to skip gfs2_make_fs_ro() and perform the necessary steps directly instead. Thanks to Edward Adam Davis <eadavis@sina.com> for the initial patches. Link: https://lore.kernel.org/all/0000000000002b5e2405f14e860f@google.com Reported-by: syzbot+3f6a670108ce43356017@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Free quota data objects synchronouslyAndreas Gruenbacher
In gfs2_quota_cleanup(), wait for the quota data objects to be freed before returning. Otherwise, there is no guarantee that the quota data objects will be gone when their kmem cache is destroyed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Fix initial quota data refcountAndreas Gruenbacher
Fix the refcount of quota data objects created directly by gfs2_quota_init(): those are placed into the in-memory quota "database" for eventual syncing to the main quota file, but they are not actively held and should thus have an initial refcount of 0. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: No more quota complaints after withdrawAndreas Gruenbacher
Once a filesystem is withdrawn, don't complain about quota changes that can't be synced to the main quota file anymore. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Factor out duplicate quota data disposal codeAndreas Gruenbacher
Rename gfs2_qd_dispose() to gfs2_qd_dispose_list(). Move some code duplicated in gfs2_qd_dispose_list() and gfs2_quota_cleanup() into a new gfs2_qd_dispose() function. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Use gfs2_qd_dispose in gfs2_quota_cleanupAndreas Gruenbacher
Change gfs2_quota_cleanup() to move the quota data objects to dispose of on a dispose list and call gfs2_qd_dispose() on that list, like gfs2_qd_shrink_scan() does, instead of disposing of the quota data objects directly. This may look a bit pointless by itself, but it will make more sense in combination with a fix that follows. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Fix wrong quota shrinker return valueAndreas Gruenbacher
Function gfs2_qd_isolate must only return LRU_REMOVED when removing the item from the lru list; otherwise, the number of items on the list will go wrong. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Rename SDF_DEACTIVATING to SDF_KILLAndreas Gruenbacher
Rename the SDF_DEACTIVATING flag to SDF_KILL to make it more obvious that this relates to the kill_sb filesystem operation. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Rename sd_{ glock => kill }_waitAndreas Gruenbacher
Rename sd_glock_wait to sd_kill_wait: we'll use it for other things related to "killing" a filesystem on unmount soon (kill_sb). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Use qd_sbd more consequentlyBob Peterson
Before this patch many of the functions in quota.c got their superblock pointer, sdp, from the quota_data's glock pointer. That's silly because the qd already has its own pointer to the superblock (qd_sbd). This patch changes references to use that instead, eliminating a level of indirection. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: journal flush threshold fixes and cleanupAndreas Gruenbacher
Commit f07b35202148 ("GFS2: Made logd daemon take into account log demand") changed gfs2_ail_flush_reqd() and gfs2_jrnl_flush_reqd() to take sd_log_blks_needed into account, but the checks in gfs2_log_commit() were not updated correspondingly. Once that is fixed, gfs2_jrnl_flush_reqd() and gfs2_ail_flush_reqd() can be used in gfs2_log_commit(). Make those two helpers available to gfs2_log_commit() by defining them above gfs2_log_commit(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Fix logd wakeup on I/O errorAndreas Gruenbacher
When quotad detects an I/O error, it sets sd_log_error and then it wakes up logd to withdraw the filesystem. However, logd doesn't wake up when sd_log_error is set. Fix that. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: low-memory forced flush fixesAndreas Gruenbacher
First, function gfs2_ail_flush_reqd checks the SDF_FORCE_AIL_FLUSH flag to determine if an AIL flush should be forced in low-memory situations. However, it also immediately clears the flag, and when called repeatedly as in function gfs2_logd, the flag will be lost. Fix that by pulling the SDF_FORCE_AIL_FLUSH flag check out of gfs2_ail_flush_reqd. Second, function gfs2_writepages sets the SDF_FORCE_AIL_FLUSH flag whether or not enough pages were written. If enough pages could be written, flushing the AIL is unnecessary, though. Third, gfs2_writepages doesn't wake up logd after setting the SDF_FORCE_AIL_FLUSH flag, so it can take a long time for logd to react. It would be preferable to wake up logd, but that hurts the performance of some workloads and we don't quite understand why so far, so don't wake up logd so far. Fixes: b066a4eebd4f ("gfs2: forcibly flush ail to relieve memory pressure") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Switch to wait_event in gfs2_logdAndreas Gruenbacher
In gfs2_logd(), switch from an open-coded wait loop to wait_event_interruptible_timeout(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: conversion deadlock do_promote bypassBob Peterson
Consider the following case: 1. A glock is held in shared mode. 2. A process requests the glock in exclusive mode (rename). 3. Before the lock is granted, more processes (read / ls) request the glock in shared mode again. 4. gfs2 sends a request to dlm for the lock in exclusive mode because that holder is at the head of the queue. 5. Somehow the dlm request gets canceled, so dlm sends us back a response with state == LM_ST_SHARED and LM_OUT_CANCELED. So at that point, the glock is still held in shared mode. 6. finish_xmote gets called to process the response from dlm. It detects that the glock is not in the requested mode and no demote is in progress, so it moves the canceled holder to the tail of the queue and finds the new holder at the head of the queue. That holder is requesting the glock in shared mode. 7. finish_xmote calls do_xmote to transition the glock into shared mode, but the glock is already in shared mode and so do_xmote complains about that with: GLOCK_BUG_ON(gl, gl->gl_state == gl->gl_target); Instead, in finish_xmote, after moving the canceled holder to the tail of the queue, check if any new holders can be granted. Only call do_xmote to repeat the dlm request if the holder at the head of the queue is requesting the glock in a mode that is incompatible with the mode the glock is currently held in. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Remove LM_FLAG_PRIORITY flagAndreas Gruenbacher
The last user of this flag was removed in commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: do_promote cleanupAndreas Gruenbacher
Change function do_promote to return true on success, and false otherwise. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs: Don't use GFP_NOFS in gfs2_unstuff_dinodeAndreas Gruenbacher
Revert the rest of commit 220cca2a4f58 ("GFS2: Change truncate page allocation to be GFP_NOFS"): In gfs2_unstuff_dinode(), there is no need to carry out the page cache allocation under GFP_NOFS because inodes on the "regular" filesystem are never un-inlined under memory pressure, so switch back from find_or_create_page() to grab_cache_page() here as well. Inodes on the "metadata" filesystem can theoretically be un-inlined under memory pressure, but any page cache allocations in that context would happen in GFP_NOFS context because those inodes have inode->i_mapping->gfp_mask set to GFP_NOFS (see the previous patch). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: Use mapping->gfp_mask for metadata inodesAndreas Gruenbacher
Set mapping->gfp mask to GFP_NOFS for all metadata inodes so that allocating pages in the address space of those inodes won't call back into the filesystem. This allows to switch back from find_or_create_page() to grab_cache_page() in two places. Partially reverts commit 220cca2a4f58 ("GFS2: Change truncate page allocation to be GFP_NOFS"). Thanks to Dan Carpenter <dan.carpenter@linaro.org> for pointing out a Smatch static checker warning. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-05gfs2: increase usage of folio_next_index() helperMinjie Du
Simplify code pattern of 'folio->index + folio_nr_pages(folio)' by using the existing helper folio_next_index(). Signed-off-by: Minjie Du <duminjie@vivo.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-08-08Merge tag 'gfs2-v6.4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: - Fix a freeze consistency check in gfs2_trans_add_meta() - Don't use filemap_splice_read as it can cause deadlocks on gfs2 * tag 'gfs2-v6.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Don't use filemap_splice_read gfs2: Fix freeze consistency check in gfs2_trans_add_meta
2023-08-07Merge tag 'xsa432-6.5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen netback buffer overflow fix from Juergen Gross: "The fix for XSA-423 added logic to Linux'es netback driver to deal with a frontend splitting a packet in a way such that not all of the headers would come in one piece. Unfortunately the logic introduced there didn't account for the extreme case of the entire packet being split into as many pieces as permitted by the protocol, yet still being smaller than the area that's specially dealt with to keep all (possible) headers together. Such an unusual packet would therefore trigger a buffer overrun in the driver" * tag 'xsa432-6.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/netback: Fix buffer overrun triggered by unusual packet
2023-08-07Merge tag 'gds-for-linus-2023-08-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/gds fixes from Dave Hansen: "Mitigate Gather Data Sampling issue: - Add Base GDS mitigation - Support GDS_NO under KVM - Fix a documentation typo" * tag 'gds-for-linus-2023-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/x86: Fix backwards on/off logic about YMM support KVM: Add GDS_NO support to KVM x86/speculation: Add Kconfig option for GDS x86/speculation: Add force option to GDS mitigation x86/speculation: Add Gather Data Sampling mitigation
2023-08-07Merge tag 'x86_bugs_srso' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/srso fixes from Borislav Petkov: "Add a mitigation for the speculative RAS (Return Address Stack) overflow vulnerability on AMD processors. In short, this is yet another issue where userspace poisons a microarchitectural structure which can then be used to leak privileged information through a side channel" * tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/srso: Tie SBPB bit setting to microcode patch detection x86/srso: Add a forgotten NOENDBR annotation x86/srso: Fix return thunks in generated code x86/srso: Add IBPB on VMEXIT x86/srso: Add IBPB x86/srso: Add SRSO_NO support x86/srso: Add IBPB_BRTYPE support x86/srso: Add a Speculative RAS Overflow mitigation x86/bugs: Increase the x86 bugs vector size to two u32s
2023-08-07Merge tag 'wq-for-6.5-rc5-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: - The recently added cpu_intensive auto detection and warning mechanism was spuriously triggered on slow CPUs. While not causing serious issues, it's still a nuisance and can cause unintended concurrency management behaviors. Relax the threshold on machines with lower BogoMIPS. While BogoMIPS is not an accurate measure of performance by most measures, we don't have to be accurate and it has rough but strong enough correlation. - A correction in Kconfig help text * tag 'wq-for-6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is below 4000 workqueue: Fix cpu_intensive_thresh_us name in help text
2023-08-07Merge tag 'tpmdd-v6.5-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "A few more bug fixes" * tag 'tpmdd-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm/tpm_tis: Disable interrupts for Lenovo P620 devices tpm: Disable RNG for all AMD fTPMs sysctl: set variable key_sysctls storage-class-specifier to static tpm/tpm_tis: Disable interrupts for TUXEDO InfinityBook S 15/17 Gen7
2023-08-07tpm/tpm_tis: Disable interrupts for Lenovo P620 devicesJonathan McDowell
The Lenovo ThinkStation P620 suffers from an irq storm issue like various other Lenovo machines, so add an entry for it to tpm_tis_dmi_table and force polling. It is worth noting that 481c2d14627d (tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs) does not seem to fix the problem on this machine, but setting 'tpm_tis.interrupts=0' on the kernel command line does. [jarkko@kernel.org: truncated the commit ID in the description to 12 characters] Cc: stable@vger.kernel.org # v6.4+ Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Signed-off-by: Jonathan McDowell <noodles@meta.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07tpm: Disable RNG for all AMD fTPMsMario Limonciello
The TPM RNG functionality is not necessary for entropy when the CPU already supports the RDRAND instruction. The TPM RNG functionality was previously disabled on a subset of AMD fTPM series, but reports continue to show problems on some systems causing stutter root caused to TPM RNG functionality. Expand disabling TPM RNG use for all AMD fTPMs whether they have versions that claim to have fixed or not. To accomplish this, move the detection into part of the TPM CRB registration and add a flag indicating that the TPM should opt-out of registration to hwrng. Cc: stable@vger.kernel.org # 6.1.y+ Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources") Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs") Reported-by: daniil.stas@posteo.net Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719 Reported-by: bitlord0xff@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07sysctl: set variable key_sysctls storage-class-specifier to staticTom Rix
smatch reports security/keys/sysctl.c:12:18: warning: symbol 'key_sysctls' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07tpm/tpm_tis: Disable interrupts for TUXEDO InfinityBook S 15/17 Gen7Takashi Iwai
TUXEDO InfinityBook S 15/17 Gen7 suffers from an IRQ problem on tpm_tis like a few other laptops. Add an entry for the workaround. Cc: stable@vger.kernel.org Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Link: https://bugzilla.suse.com/show_bug.cgi?id=1213645 Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86: - Fix SEV race condition ARM: - Fixes for the configuration of SVE/SME traps when hVHE mode is in use - Allow use of pKVM on systems with FF-A implementations that are v1.0 compatible - Request/release percpu IRQs (arch timer, vGIC maintenance) correctly when pKVM is in use - Fix function prototype after __kvm_host_psci_cpu_entry() rename - Skip to the next instruction when emulating writes to TCR_EL1 on AmpereOne systems Selftests: - Fix missing include" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests/rseq: Fix build with undefined __weak KVM: SEV: remove ghcb variable declarations KVM: SEV: only access GHCB fields once KVM: SEV: snapshot the GHCB before accessing it KVM: arm64: Skip instruction after emulating write to TCR_EL1 KVM: arm64: fix __kvm_host_psci_cpu_entry() prototype KVM: arm64: Fix resetting SME trap values on reset for (h)VHE KVM: arm64: Fix resetting SVE trap values on reset for hVHE KVM: arm64: Use the appropriate feature trap register when activating traps KVM: arm64: Helper to write to appropriate feature trap register based on mode KVM: arm64: Disable SME traps for (h)VHE at setup KVM: arm64: Use the appropriate feature trap register for SVE at EL2 setup KVM: arm64: Factor out code for checking (h)VHE mode into a macro KVM: arm64: Rephrase percpu enable/disable tracking in terms of hyp KVM: arm64: Fix hardware enable/disable flows for pKVM KVM: arm64: Allow pKVM on v1.0 compatible FF-A implementations
2023-08-07Merge tag 'mmc-v6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - moxart: Fix big-endian conversion for SCR structure - sdhci-f-sdh30: Replace with sdhci_pltfm to fix PM support * tag 'mmc-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-f-sdh30: Replace with sdhci_pltfm mmc: moxart: read scr register without changing byte order
2023-08-07gfs2: Don't use filemap_splice_readBob Peterson
Starting with patch 2cb1e08985, gfs2 started using the new function filemap_splice_read rather than the old (and subsequently deleted) function generic_file_splice_read. filemap_splice_read works by taking references to a number of folios in the page cache and splicing those folios into a pipe. The folios are then read from the pipe and the folio references are dropped. This can take an arbitrary amount of time. We cannot allow that in gfs2 because those folio references will pin the inode glock to the node and prevent it from being demoted, which can lead to cluster-wide deadlocks. Instead, use copy_splice_read. (In addition, the old generic_file_splice_read called into ->read_iter, which called gfs2_file_read_iter, which took the inode glock during the operation. The new filemap_splice_read interface does not take the inode glock anymore. This is fixable, but it still wouldn't prevent cluster-wide deadlocks.) Fixes: 2cb1e08985e3 ("splice: Use filemap_splice_read() instead of generic_file_splice_read()") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-08-07gfs2: Fix freeze consistency check in gfs2_trans_add_metaAndreas Gruenbacher
Function gfs2_trans_add_meta() checks for the SDF_FROZEN flag to make sure that no buffers are added to a transaction while the filesystem is frozen. With the recent freeze/thaw rework, the SDF_FROZEN flag is cleared after thaw_super() is called, which is sufficient for serializing freeze/thaw. However, other filesystem operations started after thaw_super() may now be calling gfs2_trans_add_meta() before the SDF_FROZEN flag is cleared, which will trigger the SDF_FROZEN check in gfs2_trans_add_meta(). Fix that by checking the s_writers.frozen state instead. In addition, make sure not to call gfs2_assert_withdraw() with the sd_log_lock spin lock held. Check for a withdrawn filesystem before checking for a frozen filesystem, and don't pin/add buffers to the current transaction in case of a failure in either case. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2023-08-07x86/srso: Tie SBPB bit setting to microcode patch detectionBorislav Petkov (AMD)
The SBPB bit in MSR_IA32_PRED_CMD is supported only after a microcode patch has been applied so set X86_FEATURE_SBPB only then. Otherwise, guests would attempt to set that bit and #GP on the MSR write. While at it, make SMT detection more robust as some guests - depending on how and what CPUID leafs their report - lead to cpu_smt_control getting set to CPU_SMT_NOT_SUPPORTED but SRSO_NO should be set for any guest incarnation where one simply cannot do SMT, for whatever reason. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-08-06Linux 6.5-rc5Linus Torvalds
2023-08-06Merge tag 'v6.5-rc5.vfs.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix a wrong check for O_TMPFILE during RESOLVE_CACHED lookup - Clean up directory iterators and clarify file_needs_f_pos_lock() * tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: rely on ->iterate_shared to determine f_pos locking vfs: get rid of old '->iterate' directory operation proc: fix missing conversion to 'iterate_shared' open: make RESOLVE_CACHED correctly test for O_TMPFILE
2023-08-06fs: rely on ->iterate_shared to determine f_pos lockingChristian Brauner
Now that we removed ->iterate we don't need to check for either ->iterate or ->iterate_shared in file_needs_f_pos_lock(). Simply check for ->iterate_shared instead. This will tell us whether we need to unconditionally take the lock. Not just does it allow us to avoid checking f_inode's mode it also actually clearly shows that we're locking because of readdir. Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06vfs: get rid of old '->iterate' directory operationLinus Torvalds
All users now just use '->iterate_shared()', which only takes the directory inode lock for reading. Filesystems that never got convered to shared mode now instead use a wrapper that drops the lock, re-takes it in write mode, calls the old function, and then downgrades the lock back to read mode. This way the VFS layer and other callers no longer need to care about filesystems that never got converted to the modern era. The filesystems that use the new wrapper are ceph, coda, exfat, jfs, ntfs, ocfs2, overlayfs, and vboxsf. Honestly, several of them look like they really could just iterate their directories in shared mode and skip the wrapper entirely, but the point of this change is to not change semantics or fix filesystems that haven't been fixed in the last 7+ years, but to finally get rid of the dual iterators. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06proc: fix missing conversion to 'iterate_shared'Linus Torvalds
I'm looking at the directory handling due to the discussion about f_pos locking (see commit 797964253d35: "file: reinstate f_pos locking optimization for regular files"), and wanting to clean that up. And one source of ugliness is how we were supposed to move filesystems over to the '->iterate_shared()' function that only takes the inode lock for reading many many years ago, but several filesystems still use the bad old '->iterate()' that takes the inode lock for exclusive access. See commit 6192269444eb ("introduce a parallel variant of ->iterate()") that also added some documentation stating Old method is only used if the new one is absent; eventually it will be removed. Switch while you still can; the old one won't stay. and that was back in April 2016. Here we are, many years later, and the old version is still clearly sadly alive and well. Now, some of those old style iterators are probably just because the filesystem may end up having per-inode mutable data that it uses for iterating a directory, but at least one case is just a mistake. Al switched over most filesystems to use '->iterate_shared()' back when it was introduced. In particular, the /proc filesystem was converted as one of the first ones in commit f50752eaa0b0 ("switch all procfs directories ->iterate_shared()"). But then later one new user of '->iterate()' was then re-introduced by commit 6d9c939dbe4d ("procfs: add smack subdir to attrs"). And that's clearly not what we wanted, since that new case just uses the same 'proc_pident_readdir()' and 'proc_pident_lookup()' helper functions that other /proc pident directories use, and they are most definitely safe to use with the inode lock held shared. So just fix it. This still leaves a fair number of oddball filesystems using the old-style directory iterator (ceph, coda, exfat, jfs, ntfs, ocfs2, overlayfs, and vboxsf), but at least we don't have any remaining in the core filesystems. I'm going to add a wrapper function that just drops the read-lock and takes it as a write lock, so that we can clean up the core vfs layer and make all the ugly 'this filesystem needs exclusive inode locking' be just filesystem-internal warts. I just didn't want to make that conversion when we still had a core user left. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06open: make RESOLVE_CACHED correctly test for O_TMPFILEAleksa Sarai
O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old fast-path check for RESOLVE_CACHED would reject all users passing O_DIRECTORY with -EAGAIN, when in fact the intended test was to check for __O_TMPFILE. Cc: stable@vger.kernel.org # v5.12+ Fixes: 99668f618062 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-05Merge tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linuxLinus Torvalds
Pull rust fixes from Miguel Ojeda: - Allocator: prevent mis-aligned allocation - Types: delete 'ForeignOwnable::borrow_mut'. A sound replacement is planned for the merge window - Build: fix bindgen error with UBSAN_BOUNDS_STRICT * tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux: rust: fix bindgen build error with UBSAN_BOUNDS_STRICT rust: delete `ForeignOwnable::borrow_mut` rust: allocator: Prevent mis-aligned allocation