summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-06-21x86/platform/UV: Add adjustable set memory block size functionmike.travis@hpe.com
Add a new function to "adjust" the current fixed UV memory block size of 2GB so it can be changed to a different physical boundary. This is out of necessity so arch dependent code can accommodate specific BIOS requirements which can align these new PMEM modules at less than the default boundaries. A "set order" type of function was used to insure that the memory block size will be a power of two value without requiring a validity check. 64GB was chosen as the upper limit for memory block size values to accommodate upcoming 4PB systems which have 6 more bits of physical address space (46 becoming 52). Signed-off-by: Mike Travis <mike.travis@hpe.com> Reviewed-by: Andrew Banman <andrew.banman@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dan.j.williams@intel.com Cc: jgross@suse.com Cc: kirill.shutemov@linux.intel.com Cc: mhocko@suse.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/20180524201711.609546602@stormcage.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()Dan Williams
Mark Rutland noticed that GCC optimization passes have the potential to elide necessary invocations of the array_index_mask_nospec() instruction sequence, so mark the asm() volatile. Mark explains: "The volatile will inhibit *some* cases where the compiler could lift the array_index_nospec() call out of a branch, e.g. where there are multiple invocations of array_index_nospec() with the same arguments: if (idx < foo) { idx1 = array_idx_nospec(idx, foo) do_something(idx1); } < some other code > if (idx < foo) { idx2 = array_idx_nospec(idx, foo); do_something_else(idx2); } ... since the compiler can determine that the two invocations yield the same result, and reuse the first result (likely the same register as idx was in originally) for the second branch, effectively re-writing the above as: if (idx < foo) { idx = array_idx_nospec(idx, foo); do_something(idx); } < some other code > if (idx < foo) { do_something_else(idx); } ... if we don't take the first branch, then speculatively take the second, we lose the nospec protection. There's more info on volatile asm in the GCC docs: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile " Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: babdde2698d4 ("x86: Implement array_index_mask_nospec") Link: https://lkml.kernel.org/lkml/152838798950.14521.4893346294059739135.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21Merge branches 'acpi-soc' and 'acpi-processor'Rafael J. Wysocki
These are a stable-candidate suspend/resume fix of the ACPI driver for Intel SoCs (LPSS) and an inline stub fix for the ACPI processor driver. * acpi-soc: ACPI / LPSS: Avoid PM quirks on suspend and resume from S3 * acpi-processor: ACPI / processor: Finish making acpi_processor_ppc_has_changed() void
2018-06-21Merge branch 'pm-tools'Rafael J. Wysocki
These are turbostat utility updates for 4.18-rc2 including two fixes for recent regressions and some minor extensions. * pm-tools: tools/power turbostat: version 18.06.20 tools/power turbostat: add the missing command line switches tools/power turbostat: add single character tokens to help tools/power turbostat: alphabetize the help output tools/power turbostat: fix segfault on 'no node' machines tools/power turbostat: add optional APIC X2APIC columns tools/power turbostat: decode cpuid.1.HT tools/power turbostat: fix show/hide issues resulting from mis-merge
2018-06-21x86/pti: Don't report XenPV as vulnerableJiri Kosina
Xen PV domain kernel is not by design affected by meltdown as it's enforcing split CR3 itself. Let's not report such systems as "Vulnerable" in sysfs (we're also already forcing PTI to off in X86_HYPER_XEN_PV cases); the security of the system ultimately depends on presence of mitigation in the Hypervisor, which can't be easily detected from DomU; let's report that. Reported-and-tested-by: Mike Latimer <mlatimer@suse.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Juergen Gross <jgross@suse.com> Cc: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1806180959080.6203@cbobk.fhfr.pm [ Merge the user-visible string into a single line. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21Merge branches 'pm-core' and 'pm-opp'Rafael J. Wysocki
These are a PM core fix and an OPP framework fix for 4.18-rc2, both "stable" material. * pm-core: PM / core: Fix supplier device runtime PM usage counter imbalance * pm-opp: PM / OPP: Update voltage in case freq == old_freq
2018-06-21x86/build: Remove unnecessary preparation for purgatoryMasahiro Yamada
kexec-purgatory.c is properly generated when Kbuild descend into the arch/x86/purgatory/. Thus the 'archprepare' target is redundant. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/1529401422-28838-3-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21Revert "kexec/purgatory: Add clean-up for purgatory directory"Masahiro Yamada
Reverts the following commit: b0108f9e93d0 ("kexec: purgatory: add clean-up for purgatory directory") ... which incorrectly stated that the kexec-purgatory.c and purgatory.ro files were not removed after 'make mrproper'. In fact, they are. You can confirm it after reverting it. $ make mrproper $ touch arch/x86/purgatory/kexec-purgatory.c $ touch arch/x86/purgatory/purgatory.ro $ make mrproper CLEAN arch/x86/purgatory $ ls arch/x86/purgatory/ entry64.S Makefile purgatory.c setup-x86_64.S stack.S string.c This is obvious from the build system point of view. arch/x86/Makefile adds 'arch/x86' to core-y. Hence 'make clean' descends like this: arch/x86/Kbuild -> arch/x86/purgatory/Makefile Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/1529401422-28838-2-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21KVM: arm/arm64: add WARN_ON if size is not PAGE_SIZE aligned in ↵Jia He
unmap_stage2_range There is a panic in armv8a server(QDF2400) under memory pressure tests (start 20 guests and run memhog in the host). ---------------------------------begin-------------------------------- [35380.800950] BUG: Bad page state in process qemu-kvm pfn:dd0b6 [35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0 [35380.815024] flags: 0x1fffc00000000000() [35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000 [35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000 [35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0 [35380.815024] flags: 0x1fffc00000000000() [35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000 [35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000 [35380.834294] page dumped because: nonzero _refcount [...] --------------------------------end-------------------------------------- The root cause might be what was fixed at [1]. But from the KVM points of view, it would be better if the issue was caught earlier. If the size is not PAGE_SIZE aligned, unmap_stage2_range might unmap the wrong(more or less) page range. Hence it caused the "BUG: Bad page state" Let's WARN in that case, so that the issue is obvious. [1] https://lkml.org/lkml/2018/5/3/1042 Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: jia.he@hxt-semitech.com [maz: tidied up commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21rseq/cleanup: Do not abort rseq c.s. in child on fork()Mathieu Desnoyers
Considering that we explicitly forbid system calls in rseq critical sections, it is not valid to issue a fork or clone system call within a rseq critical section, so rseq_fork() is not required to restart an active rseq c.s. in the child process. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ben Maurer <bmaurer@fb.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Lameter <cl@linux.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Link: https://lore.kernel.org/lkml/20180619133230.4087-4-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytesMathieu Desnoyers
uapi/linux/rseq.h aligns 'struct rseq_cs' on 32 bytes. Satisfy this alignment requirement in its definition within the rseq-arm.h inline assembly as well. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ben Maurer <bmaurer@fb.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Lameter <cl@linux.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Link: https://lore.kernel.org/lkml/20180619133230.4087-3-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21rseq/selftests: Make run_param_test.sh executableMathieu Desnoyers
The executable bit of the run_param_test.sh script got lost in the merge. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ben Maurer <bmaurer@fb.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Lameter <cl@linux.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Link: https://lore.kernel.org/lkml/20180619133230.4087-2-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21x86/xen: Add call of speculative_store_bypass_ht_init() to PV pathsJuergen Gross
Commit: 1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD") ... added speculative_store_bypass_ht_init() to the per-CPU initialization sequence. speculative_store_bypass_ht_init() needs to be called on each CPU for PV guests, too. Reported-by: Brian Woods <brian.woods@amd.com> Tested-by: Brian Woods <brian.woods@amd.com> Signed-off-by: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: xen-devel@lists.xenproject.org Fixes: 1f50ddb4f4189243c05926b842dc1a0332195f31 ("x86/speculation: Handle HT correctly on AMD") Link: https://lore.kernel.org/lkml/20180621084331.21228-1-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21drm/bridge/sii8620: fix display of packed pixel modes in MHL2Maciej Purski
Currently packed pixel modes in MHL2 can't be displayed. The device automatically recognizes output format, so setting format other than RGB causes failure. Fix it by writing proper values to registers. Tested on MHL1 and MHL2 using various vendors' dongles both in DVI and HDMI mode. Signed-off-by: Maciej Purski <m.purski@samsung.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1516706239-9104-1-git-send-email-m.purski@samsung.com
2018-06-21KVM: arm64: Avoid mistaken attempts to save SVE state for vcpusDave Martin
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") uses fpsimd_save() to save the FPSIMD state for a vcpu when scheduling the vcpu out. However, currently current's value of TIF_SVE is restored before calling fpsimd_save() which means that fpsimd_save() may erroneously attempt to save SVE state from the vcpu. This enables current's vector state to be polluted with guest data. current->thread.sve_state may be unallocated or not large enough, so this can also trigger a NULL dereference or buffer overrun. Instead of this, TIF_SVE should be configured properly for the guest when calling fpsimd_save() with the vcpu context loaded. This patch ensures this by delaying restoration of current's TIF_SVE until after the call to fpsimd_save(). Fixes: e6b673b741ea ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21KVM: arm64/sve: Fix SVE trap restoration for non-current tasksDave Martin
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") attempts to restore the configuration of userspace SVE trapping via a call to fpsimd_bind_task_to_cpu(), but the logic for determining when to do this is not correct. The patch makes the errnoenous assumption that the only task that may try to enter userspace with the currently loaded FPSIMD/SVE register content is current. This may not be the case however: if some other user task T is scheduled on the CPU during the execution of the KVM run loop, and the vcpu does not try to use the registers in the meantime, then T's state may be left there intact. If T happens to be the next task to enter userspace on this CPU then the hooks for reloading the register state and configuring traps will be skipped. (Also, current never has SVE state at this point anyway and should always have the trap enabled, as a side-effect of the ioctl() syscall needed to reach the KVM run loop in the first place.) This patch instead restores the state of the EL0 trap from the state observed at the most recent vcpu_load(), ensuring that the trap is set correctly for the loaded context (if any). Fixes: e6b673b741ea ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21KVM: arm64: Don't mask softirq with IRQs disabled in vcpu_put()Dave Martin
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") introduces a specific helper kvm_arch_vcpu_put_fp() for saving the vcpu FPSIMD state during vcpu_put(). This function uses local_bh_disable()/_enable() to protect the FPSIMD context manipulation from interruption by softirqs. This approach is not correct, because vcpu_put() can be invoked either from the KVM host vcpu thread (when exiting the vcpu run loop), or via a preempt notifier. In the former case, only preemption is disabled. In the latter case, the function is called from inside __schedule(), which means that IRQs are disabled. Use of local_bh_disable()/_enable() with IRQs disabled is considerd an error, resulting in lockdep splats while running VMs if lockdep is enabled. This patch disables IRQs instead of attempting to disable softirqs, avoiding the problem of calling local_bh_enable() with IRQs disabled in the __schedule() path. This creates an additional interrupt blackout during vcpu run loop exit, but this is the rare case and the blackout latency is still less than that of __schedule(). Fixes: e6b673b741ea ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") Reported-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21arm64: Introduce sysreg_clear_set()Mark Rutland
Currently we have a couple of helpers to manipulate bits in particular sysregs: * config_sctlr_el1(u32 clear, u32 set) * change_cpacr(u64 val, u64 mask) The parameters of these differ in naming convention, order, and size, which is unfortunate. They also differ slightly in behaviour, as change_cpacr() skips the sysreg write if the bits are unchanged, which is a useful optimization when sysreg writes are expensive. Before we gain yet another sysreg manipulation function, let's unify these with a common helper, providing a consistent order for clear/set operands, and the write skipping behaviour from change_cpacr(). Code will be migrated to the new helper in subsequent patches. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21KVM: arm/arm64: Drop resource size check for GICV windowArd Biesheuvel
When booting a 64 KB pages kernel on a ACPI GICv3 system that implements support for v2 emulation, the following warning is produced GICV size 0x2000 not a multiple of page size 0x10000 and support for v2 emulation is disabled, preventing GICv2 VMs from being able to run on such hosts. The reason is that vgic_v3_probe() performs a sanity check on the size of the window (it should be a multiple of the page size), while the ACPI MADT parsing code hardcodes the size of the window to 8 KB. This makes sense, considering that ACPI does not bother to describe the size in the first place, under the assumption that platforms implementing ACPI will follow the architecture and not put anything else in the same 64 KB window. So let's just drop the sanity check altogether, and assume that the window is at least 64 KB in size. Fixes: 909777324588 ("KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21nvme-fc: release io queues to allow fast failJames Smart
Rather than leaving io queues quiesced after tearing down an association, restart them. This allows ios to be replayed, with fastfail ios terminating and non-fastfail getting into loops of retry. This follows rdma's lead. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimber.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-20nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.Doron Roberts-Kedes
If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will issue a disconnect from nbd_release if the device has no remaining bdev->bd_openers. Fix ret val so reconfigure with only setting the flag succeeds. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-21Merge branch 'drm-fixes-4.18' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes Bunch of amdgpu fixes mostly all going to stable. Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180620190021.2775-1-alexander.deucher@amd.com
2018-06-21Merge branch 'turbostat' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat utility changes for 4.18-rc2 from Len Brown. "This includes two regression fixes, plus a couple more random, but worthy, patches." * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: version 18.06.20 tools/power turbostat: add the missing command line switches tools/power turbostat: add single character tokens to help tools/power turbostat: alphabetize the help output tools/power turbostat: fix segfault on 'no node' machines tools/power turbostat: add optional APIC X2APIC columns tools/power turbostat: decode cpuid.1.HT tools/power turbostat: fix show/hide issues resulting from mis-merge
2018-06-21Documentation: intel_pstate: Fix typoRafael J. Wysocki
Fix a typo in the intel_pstate admin-guide documentation. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Here are eight fairly small fixes collected over the last two weeks. Regression and crashing bug fixes: - mlx4/5: Fixes for issues found from various checkers - A resource tracking and uverbs regression in the core code - qedr: NULL pointer regression found during testing - rxe: Various small bugs" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/rxe: Fix missing completion for mem_reg work requests RDMA/core: Save kernel caller name when creating CQ using ib_create_cq() IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()' RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM IB/mlx5: Fix return value check in flow_counters_set_data() IB/mlx5: Fix memory leak in mlx5_ib_create_flow IB/rxe: avoid double kfree skb
2018-06-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix crash on bpf_prog_load() errors, from Daniel Borkmann. 2) Fix ATM VCC memory accounting, from David Woodhouse. 3) fib6_info objects need RCU freeing, from Eric Dumazet. 4) Fix SO_BINDTODEVICE handling for TCP sockets, from David Ahern. 5) Fix clobbered error code in enic_open() failure path, from Govindarajulu Varadarajan. 6) Propagate dev_get_valid_name() error returns properly, from Li RongQing. 7) Fix suspend/resume in davinci_emac driver, from Bartosz Golaszewski. 8) Various act_ife fixes (recursive locking, IDR leaks, etc.) from Davide Caratti. 9) Fix buggy checksum handling in sungem driver, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits) ip: limit use of gso_size to udp stmmac: fix DMA channel hang in half-duplex mode net: stmmac: socfpga: add additional ocp reset line for Stratix10 net: sungem: fix rx checksum support bpfilter: ignore binary files bpfilter: fix build error net/usb/drivers: Remove useless hrtimer_active check net/sched: act_ife: preserve the action control in case of error net/sched: act_ife: fix recursive lock and idr leak net: ethernet: fix suspend/resume in davinci_emac net: propagate dev_get_valid_name return code enic: do not overwrite error code net/tcp: Fix socket lookups with SO_BINDTODEVICE ptp: replace getnstimeofday64() with ktime_get_real_ts64() net/ipv6: respect rcu grace period before freeing fib6_info net: net_failover: fix typo in net_failover_slave_register() ipvlan: use ETH_MAX_MTU as max mtu net: hamradio: use eth_broadcast_addr enic: initialize enic->rfs_h.lock in enic_probe MAINTAINERS: Add Sam as the maintainer for NCSI ...
2018-06-20block: sed-opal: Fix a couple off by one bugsDan Carpenter
resp->num is the number of tokens in resp->tok[]. It gets set in response_parse(). So if n == resp->num then we're reading beyond the end of the data. Fixes: 455a7b238cd6 ("block: Add Sed-opal library") Reviewed-by: Scott Bauer <scott.bauer@intel.com> Tested-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-20tools/power turbostat: version 18.06.20Len Brown
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20tools/power turbostat: add the missing command line switchesNathan Ciobanu
Document the missing command line tokens in the help() function. Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20tools/power turbostat: add single character tokens to helpNathan Ciobanu
Improve the help() output by adding the single character tokens (e.g -a). Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20tools/power turbostat: alphabetize the help outputNathan Ciobanu
Sort the command line arguments output of help() in alphabetical order in line with other linux tools. Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20tools/power turbostat: fix segfault on 'no node' machinesNathan Ciobanu
Running turbostat on machines that don't expose nodes in sysfs (no /sys/bus/node) causes a segfault or a -nan value diesplayed in the log. This is caused by physical_node_id being reported as -1 and logical_node_id being calculated as a negative number resulting in the new GET_THREAD/GET_CORE returning an incorrect address. Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20tools/power turbostat: add optional APIC X2APIC columnsLen Brown
Add APIC and X2APIC columns to the topology section. They are disabled-by-default -- enable like so: --debug or --enable APIC,X2APIC Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20tools/power turbostat: decode cpuid.1.HTLen Brown
eg. the "HT" here: CPUID(1): SSE3 MONITOR - EIST TM2 TSC MSR ACPI-TM HT TM Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20tools/power turbostat: fix show/hide issues resulting from mis-mergeLen Brown
The --show and --hide options failed on "Node", which was listed as "Node%". The --show and --hide options were generally fouled-up do due to come content merges that scrambled the list of column name indexes. Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-20blk-mq-debugfs: Off by one in blk_mq_rq_state_name()Dan Carpenter
If rq_state == ARRAY_SIZE() then we read one element beyond the end of the blk_mq_rq_state_name_array[] array. Fixes: ec6dcf63c55c ("blk-mq-debugfs: Show more request state information") Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-20nvmet: reset keep alive timer in controller enableMax Gurtuvoy
Controllers that are not yet enabled should not really enforce keep alive timeouts, but we still want to track a timeout and cleanup in case a host died before it enabled the controller. Hence, simply reset the keep alive timer when the controller is enabled. Suggested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-20nvme-rdma: don't override opts->queue_sizeSagi Grimberg
That is user argument, and theoretically controller limits can change over time (over reconnects/resets). Instead, use the sqsize controller attribute to check queue depth boundaries and use it to the tagset allocation. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-20nvme-rdma: Fix command completion race at error recoveryIsrael Rukshin
The race is between completing the request at error recovery work and rdma completions. If we cancel the request before getting the good rdma completion we get a NULL deref of the request MR at nvme_rdma_process_nvme_rsp(). When Canceling the request we return its mr to the mr pool (set mr to NULL) and also unmap its data. Canceling the requests while the rdma queues are active is not safe. Because rdma queues are active and we get good rdma completions that can use the mr pointer which may be NULL. Completing the request too soon may lead also to performing DMA to/from user buffers which might have been already unmapped. The commit fixes the race by draining the QP before starting the abort commands mechanism. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-20nvme-rdma: fix possible free of a non-allocated async event bufferSagi Grimberg
If nvme_rdma_configure_admin_queue fails before we allocated the async event buffer, we will falsly free it because nvme_rdma_free_queue is freeing it. Fix it by allocating the buffer right after nvme_rdma_alloc_queue and free it right before nvme_rdma_queue_free to maintain orderly reverse cleanup sequence. Reported-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-20nvme-rdma: fix possible double free condition when failing to create a ↵Sagi Grimberg
controller Failures after nvme_init_ctrl will defer resource cleanups to .free_ctrl when the reference is released, hence we should not free the controller queues for these failures. Fix that by moving controller queues allocation before controller initialization and correctly freeing them for failures before initialization and skip them for failures after initialization. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-20x86: Call fixup_exception() before notify_die() in math_error()Siarhei Liakh
fpu__drop() has an explicit fwait which under some conditions can trigger a fixable FPU exception while in kernel. Thus, we should attempt to fixup the exception first, and only call notify_die() if the fixup failed just like in do_general_protection(). The original call sequence incorrectly triggers KDB entry on debug kernels under particular FPU-intensive workloads. Andy noted, that this makes the whole conditional irq enable thing even more inconsistent, but fixing that it outside the scope of this. Signed-off-by: Siarhei Liakh <siarhei.liakh@concurrent-rt.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Borislav Petkov" <bpetkov@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/DM5PR11MB201156F1CAB2592B07C79A03B17D0@DM5PR11MB2011.namprd11.prod.outlook.com
2018-06-20locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMSWaiman Long
It was found that the use of up_read_non_owner() in NFS was causing the following warning when DEBUG_RWSEMS was configured. DEBUG_LOCKS_WARN_ON(sem->owner != ((struct task_struct *)(1UL << 0))) Looking into the rwsem.c file, it was discovered that the corresponding down_read_non_owner() function was not setting the owner field properly. This is fixed now, and the warning should be gone. Fixes: 5149cbac4235 ("locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Gavin Schenk <g.schenk@eckelmann.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-nfs@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1527168398-4291-1-git-send-email-longman@redhat.com
2018-06-20udf: Drop unused arguments of udf_delete_aext()Jan Kara
udf_delete_aext() uses its last two arguments only as local variables. Drop them. Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20udf: Provide function for calculating dir entry lengthJan Kara
Provide function for calculating directory entry length and use to reduce code duplication. Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20udf: Detect incorrect directory sizeJan Kara
Detect when a directory entry is (possibly partially) beyond directory size and return EIO in that case since it means the filesystem is corrupted. Otherwise directory operations can further corrupt the directory and possibly also oops the kernel. CC: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> CC: stable@vger.kernel.org Reported-and-tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20ext2: add warning when specifying nocheck optionChengguang Xu
The option nocheck(nocheck/check=none) is useless but considering backwards compatibility it's better to print warning for a while before completely remove from the code. This patch add proper warning message for option 'nocheck' and remove unnecessary comment/function declaration which is used for removed option 'check'. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20quota: Cleanup list iteration in dqcache_shrink_scan()Jan Kara
Use list_first_entry() and list_empty() instead of opencoded variants. Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20quota: reclaim least recently used dquotsGreg Thelen
The dquots in the free_dquots list are not reclaimed in LRU way. put_dquot_last() puts entries to the tail and dqcache_shrink_scan() frees from the tail. Free unreferenced dquots in LRU order because it seems more reasonable than freeing most recently used. Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20ACPI / processor: Finish making acpi_processor_ppc_has_changed() voidBrian Norris
Commit bca5f557dcea "ACPI / processor: Make acpi_processor_ppc_has_changed() void" changed one of the declarations of acpi_processor_ppc_has_changed() to return void, but the !CPU_FREQ version still returns int. Let's return void to be consistent. Fixes: bca5f557dcea "ACPI / processor: Make acpi_processor_ppc_has_changed() void" Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>