Age | Commit message (Collapse) | Author |
|
Fix trivial ICC_SRE_EL2 register spelling typo in booting.rst.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Will Deacon <will@kernel.org>
CC: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250610120935.852034-1-lpieralisi@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for the FWFT SBI extension, which is part of SBI 3.0 and a
dependency for many new SBI and ISA extensions
- Support for getrandom() in the VDSO
- Support for mseal
- Optimized routines for raid6 syndrome and recovery calculations
- kexec_file() supports loading Image-formatted kernel binaries
- Improvements to the instruction patching framework to allow for
atomic instruction patching, along with rules as to how systems need
to behave in order to function correctly
- Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
some SiFive vendor extensions
- Various fixes and cleanups, including: misaligned access handling,
perf symbol mangling, module loading, PUD THPs, and improved uaccess
routines
* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
riscv: uaccess: Only restore the CSR_STATUS SUM bit
RISC-V: vDSO: Wire up getrandom() vDSO implementation
riscv: enable mseal sysmap for RV64
raid6: Add RISC-V SIMD syndrome and recovery calculations
riscv: mm: Add support for Svinval extension
RISC-V: Documentation: Add enough title underlines to CMODX
riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
MAINTAINERS: Update Atish's email address
riscv: uaccess: do not do misaligned accesses in get/put_user()
riscv: process: use unsigned int instead of unsigned long for put_user()
riscv: make unsafe user copy routines use existing assembly routines
riscv: hwprobe: export Zabha extension
riscv: Make regs_irqs_disabled() more clear
perf symbols: Ignore mapping symbols on riscv
RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
riscv: module: Optimize PLT/GOT entry counting
riscv: Add support for PUD THP
riscv: xchg: Prefetch the destination word for sc.w
riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
riscv: Add support for Zicbop
...
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next
riscv patches for 6.16-rc1, part 2
* Performance improvements
- Add support for vdso getrandom
- Implement raid6 calculations using vectors
- Introduce svinval tlb invalidation
* Cleanup
- A bunch of deduplication of the macros we use for manipulating instructions
* Misc
- Introduce a kunit test for kprobes
- Add support for mseal as riscv fits the requirements (thanks to Lorenzo for making sure of that :))
[Palmer: There was a rebase between part 1 and part 2, so I've had to do
some more git surgery here... at least two rounds of surgery...]
* alex-pr-2: (866 commits)
RISC-V: vDSO: Wire up getrandom() vDSO implementation
riscv: enable mseal sysmap for RV64
raid6: Add RISC-V SIMD syndrome and recovery calculations
riscv: mm: Add support for Svinval extension
riscv: Add kprobes KUnit test
riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM
riscv: kprobes: Remove duplication of RV_EXTRACT_UTYPE_IMM
riscv: kprobes: Remove duplication of RV_EXTRACT_RD_REG
riscv: kprobes: Remove duplication of RVC_EXTRACT_BTYPE_IMM
riscv: kprobes: Remove duplication of RVC_EXTRACT_C2_RS1_REG
riscv: kproves: Remove duplication of RVC_EXTRACT_JTYPE_IMM
riscv: kprobes: Remove duplication of RV_EXTRACT_BTYPE_IMM
riscv: kprobes: Remove duplication of RV_EXTRACT_RS1_REG
riscv: kprobes: Remove duplication of RV_EXTRACT_JTYPE_IMM
riscv: kprobes: Move branch_funct3 to insn.h
riscv: kprobes: Move branch_rs2_idx to insn.h
Linux 6.15-rc6
Input: xpad - fix xpad_device sorting
Input: xpad - add support for several more controllers
Input: xpad - fix Share button on Xbox One controllers
...
|
|
This reports as a warning in linux-next along the lines of
Documentation/arch/riscv/cmodx.rst:14: WARNING: Title underline too short.
CMODX in the Kernel Space
--------------------- [docutils]
Documentation/arch/riscv/cmodx.rst:43: WARNING: Title underline too short.
CMODX in the User Space
--------------------- [docutils]
Documentation/arch/riscv/cmodx.rst:43: WARNING: Title underline too short.
CMODX in the User Space
--------------------- [docutils]
Link: https://lore.kernel.org/all/20250603154544.1602a8b5@canb.auug.org.au/
Fixes: 0e07200b2af6 ("riscv: Documentation: add a description about dynamic ftrace")
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20250603172856.49925-1-palmer@dabbelt.com
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next
riscv patches for 6.16-rc1
* Implement atomic patching support for ftrace which finally allows to
get rid of stop_machine().
* Support for kexec_file_load() syscall
* Improve module loading time by changing the algorithm that counts the
number of plt/got entries in a module.
* Zicbop is now used in the kernel to prefetch instructions
[Palmer: There's been two rounds of surgery on this one, so as a result
it's a bit different than the PR.]
* alex-pr: (734 commits)
riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
MAINTAINERS: Update Atish's email address
riscv: hwprobe: export Zabha extension
riscv: Make regs_irqs_disabled() more clear
perf symbols: Ignore mapping symbols on riscv
RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
riscv: module: Optimize PLT/GOT entry counting
riscv: Add support for PUD THP
riscv: xchg: Prefetch the destination word for sc.w
riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
riscv: Add support for Zicbop
riscv: Introduce Zicbop instructions
riscv/kexec_file: Fix comment in purgatory relocator
riscv: kexec_file: Support loading Image binary file
riscv: kexec_file: Split the loading of kernel and others
riscv: Documentation: add a description about dynamic ftrace
riscv: ftrace: support direct call using call_ops
riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
riscv: ftrace: support PREEMPT
riscv: add a data fence for CMODX in the kernel mode
...
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
|
|
Export Zabha through the hwprobe syscall.
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20250421141413.394444-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
|
|
Add a section in cmodx to describe how dynamic ftrace works on riscv,
limitations, and assumptions.
Signed-off-by: Andy Chiu <andybnac@gmail.com>
Link: https://lore.kernel.org/r/20250407180838.42877-12-andybnac@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
|
|
Pull OpenRISC updates from Stafford Horne:
"Just a few documentation updates from the community:
- Device tree documentation conversion from txt to yaml
- Documentation addition to help users getting started with initramfs
on OpenRISC
* tag 'for-linus' of https://github.com/openrisc/linux:
dt-bindings: interrupt-controller: Convert openrisc,ompic to DT schema
dt-bindings: interrupt-controller: Convert opencores,or1k-pic to DT schema
Documentation:openrisc: Add build instructions with initramfs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers updates from Ilpo Järvinen:
"The changes are mostly business as usual. Besides pdx86 changes, there
are a few power supply changes needed for related pdx86 features, move
of oxpec driver from hwmon (oxp-sensors) to pdx86, and one FW version
warning to hid-asus.
Highlights:
- alienware-wmi-wmax:
- Add HWMON support
- Add ABI and admin-guide documentation
- Expose GPIO debug methods through debug FS
- Support manual fan control and "custom" thermal profile
- amd/hsmp:
- Add sysfs files to show HSMP telemetry
- Report power readings and limits via hwmon
- amd/isp4: Add AMD ISP platform config for OV05C10
- asus-wmi:
- Refactor Ally suspend/resume to work better with older FW
- hid-asus: check ROG Ally MCU version and warn about old FW versions
- dasharo-acpi:
- Add driver for Dasharo devices supporting fans and temperatures
monitoring
- dell-ddv:
- Expose the battery health and manufacture date to userspace
using power supply extensions
- Implement the battery matching algorithm
- dell-pc:
- Improve error propagation
- Use faux device
- int3472:
- Add delays to avoid GPIO regulator spikes
- Add handshake pin support
- Make regulator supply name configurable and allow registering
more than 1 GPIO regulator
- Map mt9m114 powerdown pin to powerenable
- intel/pmc: Add separate SSRAM Telemetry driver
- intel-uncore-freq: Add attributes to show agent types and die ID
- ISST:
- Support SST-TF revision 2 (allows more cores per bucket)
- Support SST-PP revision 2 (fabric 1 frequencies)
- Remove unnecessary SST MSRs restore (the package retains MSRs
despite CPU offlining)
- mellanox: Add support for SN2201, SN4280, SN5610, and SN5640
- mellanox: mlxbf-pmc: Support additional PMC blocks
- oxpec:
- Add OneXFly variants
- Add support for charge limit, charge thresholds, and turbo LED
- Distinguish current X1 variants to avoid unwanted matching to
new variants
- Follow hwmon conventions
- Move from hwmon/oxp-sensors to platform/x86 to match the
enlarged scope
- power supply:
- Add inhibit-charge-awake (needed by oxpec)
- Add additional battery health status values ("blown fuse" and
"cell imbalance") (needed by dell-ddv)
- powerwell-ec: Add driver for Portwell EC supporting GPIO and watchdog
- thinkpad-acpi: Support camera shutter switch hotkey
- tuxedo: Add virtual LampArray for TUXEDO NB04 devices
- tools/power/x86/intel-speed-select:
- Support displaying SST-PP revision 2 fields
- Skip uncore frequency update on newer generations of CPUs
- Miscellaneous cleanups / refactoring / improvements"
* tag 'platform-drivers-x86-v6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (112 commits)
thermal/drivers/acerhdf: Constify struct thermal_zone_device_ops
platform/x86/amd/hsmp: fix building with CONFIG_HWMON=m
platform/x86: asus-wmi: fix build without CONFIG_SUSPEND
docs: ABI: Fix "aassociated" to "associated"
platform/x86: Add AMD ISP platform config for OV05C10
Documentation: admin-guide: pm: Add documentation for die_id
platform/x86/intel-uncore-freq: Add attributes to show die_id
platform/x86/intel: power-domains: Add interface to get Linux die ID
Documentation: admin-guide: pm: Add documentation for agent_types
platform/x86/intel-uncore-freq: Add attributes to show agent types
platform/x86/tuxedo: Prevent invalid Kconfig state
platform/x86: dell-ddv: Expose the battery health to userspace
platform/x86: dell-ddv: Expose the battery manufacture date to userspace
platform/x86: dell-ddv: Implement the battery matching algorithm
power: supply: core: Add additional health status values
platform/x86/amd/hsmp: acpi: Add sysfs files to display HSMP telemetry
platform/x86/amd/hsmp: Report power via hwmon sensors
platform/x86/amd/hsmp: Use a single DRIVER_VERSION for all hsmp modules
platform/mellanox: mlxreg-dpu: Fix smatch warnings
platform: mellanox: nvsw-sn2200: Fix .items in nvsw_sn2201_busbar_hotplug
...
|
|
Pull kvm updates from Paolo Bonzini:
"As far as x86 goes this pull request "only" includes TDX host support.
Quotes are appropriate because (at 6k lines and 100+ commits) it is
much bigger than the rest, which will come later this week and
consists mostly of bugfixes and selftests. s390 changes will also come
in the second batch.
ARM:
- Add large stage-2 mapping (THP) support for non-protected guests
when pKVM is enabled, clawing back some performance.
- Enable nested virtualisation support on systems that support it,
though it is disabled by default.
- Add UBSAN support to the standalone EL2 object used in nVHE/hVHE
and protected modes.
- Large rework of the way KVM tracks architecture features and links
them with the effects of control bits. While this has no functional
impact, it ensures correctness of emulation (the data is
automatically extracted from the published JSON files), and helps
dealing with the evolution of the architecture.
- Significant changes to the way pKVM tracks ownership of pages,
avoiding page table walks by storing the state in the hypervisor's
vmemmap. This in turn enables the THP support described above.
- New selftest checking the pKVM ownership transition rules
- Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
even if the host didn't have it.
- Fixes for the address translation emulation, which happened to be
rather buggy in some specific contexts.
- Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
from the number of counters exposed to a guest and addressing a
number of issues in the process.
- Add a new selftest for the SVE host state being corrupted by a
guest.
- Keep HCR_EL2.xMO set at all times for systems running with the
kernel at EL2, ensuring that the window for interrupts is slightly
bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
- Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
from a pretty bad case of TLB corruption unless accesses to HCR_EL2
are heavily synchronised.
- Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
tables in a human-friendly fashion.
- and the usual random cleanups.
LoongArch:
- Don't flush tlb if the host supports hardware page table walks.
- Add KVM selftests support.
RISC-V:
- Add vector registers to get-reg-list selftest
- VCPU reset related improvements
- Remove scounteren initialization from VCPU reset
- Support VCPU reset from userspace using set_mpstate() ioctl
x86:
- Initial support for TDX in KVM.
This finally makes it possible to use the TDX module to run
confidential guests on Intel processors. This is quite a large
series, including support for private page tables (managed by the
TDX module and mirrored in KVM for efficiency), forwarding some
TDVMCALLs to userspace, and handling several special VM exits from
the TDX module.
This has been in the works for literally years and it's not really
possible to describe everything here, so I'll defer to the various
merge commits up to and including commit 7bcf7246c42a ('Merge
branch 'kvm-tdx-finish-initial' into HEAD')"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (248 commits)
x86/tdx: mark tdh_vp_enter() as __flatten
Documentation: virt/kvm: remove unreferenced footnote
RISC-V: KVM: lock the correct mp_state during reset
KVM: arm64: Fix documentation for vgic_its_iter_next()
KVM: arm64: np-guest CMOs with PMD_SIZE fixmap
KVM: arm64: Stage-2 huge mappings for np-guests
KVM: arm64: Add a range to pkvm_mappings
KVM: arm64: Convert pkvm_mappings to interval tree
KVM: arm64: Add a range to __pkvm_host_test_clear_young_guest()
KVM: arm64: Add a range to __pkvm_host_wrprotect_guest()
KVM: arm64: Add a range to __pkvm_host_unshare_guest()
KVM: arm64: Add a range to __pkvm_host_share_guest()
KVM: arm64: Introduce for_each_hyp_page
KVM: arm64: Handle huge mappings for np-guest CMOs
KVM: arm64: nv: Release faulted-in VNCR page from mmu_lock critical section
KVM: arm64: nv: Handle TLBI S1E2 for VNCR invalidation with mmu_lock held
KVM: arm64: nv: Hold mmu_lock when invalidating VNCR SW-TLB before translating
RISC-V: KVM: add KVM_CAP_RISCV_MP_STATE_RESET
RISC-V: KVM: Remove scounteren initialization
KVM: RISC-V: remove unnecessary SBI reset state
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The headline feature is the re-enablement of support for Arm's
Scalable Matrix Extension (SME) thanks to a bumper crop of fixes
from Mark Rutland.
If matrices aren't your thing, then Ryan's page-table optimisation
work is much more interesting.
Summary:
ACPI, EFI and PSCI:
- Decouple Arm's "Software Delegated Exception Interface" (SDEI)
support from the ACPI GHES code so that it can be used by platforms
booted with device-tree
- Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
runtime calls
- Fix a node refcount imbalance in the PSCI device-tree code
CPU Features:
- Ensure register sanitisation is applied to fields in ID_AA64MMFR4
- Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
guests can reliably query the underlying CPU types from the VMM
- Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
to our context-switching, signal handling and ptrace code
Entry code:
- Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
selected
Memory management:
- Prevent BSS exports from being used by the early PI code
- Propagate level and stride information to the low-level TLB
invalidation routines when operating on hugetlb entries
- Use the page-table contiguous hint for vmap() mappings with
VM_ALLOW_HUGE_VMAP where possible
- Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
and hook this up on arm64 so that the trailing DSB (used to publish
the updates to the hardware walker) can be deferred until the end
of the mapping operation
- Extend mmap() randomisation for 52-bit virtual addresses (on par
with 48-bit addressing) and remove limited support for
randomisation of the linear map
Perf and PMUs:
- Add support for probing the CMN-S3 driver using ACPI
- Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers
Selftests:
- Fix FPSIMD and SME tests to align with the freshly re-enabled SME
support
- Fix default setting of the OUTPUT variable so that tests are
installed in the right location
vDSO:
- Replace raw counter access from inline assembly code with a call to
the the __arch_counter_get_cntvct() helper function
Miscellaneous:
- Add some missing header inclusions to the CCA headers
- Rework rendering of /proc/cpuinfo to follow the x86-approach and
avoid repeated buffer expansion (the user-visible format remains
identical)
- Remove redundant selection of CONFIG_CRC32
- Extend early error message when failing to map the device-tree
blob"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
arm64: cputype: Add cputype definition for HIP12
arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again
perf/arm-cmn: Add CMN S3 ACPI binding
arm64/boot: Disallow BSS exports to startup code
arm64/boot: Move global CPU override variables out of BSS
arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace
perf/arm-cmn: Initialise cmn->cpu earlier
kselftest/arm64: Set default OUTPUT path when undefined
arm64: Update comment regarding values in __boot_cpu_mode
arm64: mm: Drop redundant check in pmd_trans_huge()
arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1
arm64/mm: Permit lazy_mmu_mode to be nested
arm64/mm: Disable barrier batching in interrupt contexts
arm64/cpuinfo: only show one cpu's info in c_show()
arm64/mm: Batch barriers when updating kernel mappings
mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
arm64/mm: Support huge pte-mapped pages in vmap
mm/vmalloc: Gracefully unmap huge ptes
mm/vmalloc: Warn on improper use of vunmap_range()
arm64/mm: Hoist barriers out of set_ptes_anysz() loop
...
|
|
Pull documentation updates from Jonathan Corbet:
"A moderately busy cycle for documentation this time around:
- The most significant change is the replacement of the old
kernel-doc script (a monstrous collection of Perl regexes that
predates the Git era) with a Python reimplementation. That, too, is
a horrifying collection of regexes, but in a much cleaner and more
maintainable structure that integrates far better with the Sphinx
build system.
This change has been in linux-next for the full 6.15 cycle; the
small number of problems that turned up have been addressed,
seemingly to everybody's satisfaction. The Perl kernel-doc script
remains in tree (as scripts/kernel-doc.pl) and can be used with a
command-line option if need be. Unless some reason to keep it
around materializes, it will probably go away in 6.17.
Credit goes to Mauro Carvalho Chehab for doing all this work.
- Some RTLA documentation updates
- A handful of Chinese translations
- The usual collection of typo fixes, general updates, etc"
* tag 'docs-6.16' of git://git.lwn.net/linux: (85 commits)
Docs: doc-guide: update sphinx.rst Sphinx version number
docs: doc-guide: clarify latest theme usage
Documentation/scheduler: Fix typo in sched-stats domain field description
scripts: kernel-doc: prevent a KeyError when checking output
docs: kerneldoc.py: simplify exception handling logic
MAINTAINERS: update linux-doc entry to cover new Python scripts
docs: align with scripts/syscall.tbl migration
Documentation: NTB: Fix typo
Documentation: ioctl-number: Update table intro
docs: conf.py: drop backward support for old Sphinx versions
Docs: driver-api/basics: add kobject_event interfaces
Docs: relay: editing cleanups
docs: fix "incase" typo in coresight/panic.rst
Fix spelling error for 'parallel'
docs: admin-guide: fix typos in reporting-issues.rst
docs: dmaengine: add explanation for DMA_ASYNC_TX capability
Documentation: leds: improve readibility of multicolor doc
docs: fix typo in firmware-related section
docs: Makefile: Inherit PYTHONPYCACHEPREFIX setting as env variable
Documentation: ioctl-number: Update outdated submission info
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
"Carve out the resctrl filesystem-related code into fs/resctrl/ so that
multiple architectures can share the fs API for manipulating their
respective hw resource control implementation.
This is the second step in the work towards sharing the resctrl
filesystem interface, the next one being plugging ARM's MPAM into the
aforementioned fs API"
* tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
MAINTAINERS: Add reviewers for fs/resctrl
x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrl
x86/resctrl: Always initialise rid field in rdt_resources_all[]
x86/resctrl: Relax some asm #includes
x86/resctrl: Prefer alloc(sizeof(*foo)) idiom in rdt_init_fs_context()
x86/resctrl: Squelch whitespace anomalies in resctrl core code
x86/resctrl: Move pseudo lock prototypes to include/linux/resctrl.h
x86/resctrl: Fix types in resctrl_arch_mon_ctx_{alloc,free}() stubs
x86/resctrl: Move enum resctrl_event_id to resctrl.h
x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
fs/resctrl: Add boiler plate for external resctrl code
x86/resctrl: Add 'resctrl' to the title of the resctrl documentation
x86/resctrl: Split trace.h
x86/resctrl: Expand the width of domid by replacing mon_data_bits
x86/resctrl: Add end-marker to the resctrl_event_id enum
x86/resctrl: Move is_mba_sc() out of core.c
x86/resctrl: Drop __init/__exit on assorted symbols
x86/resctrl: Resctrl_exit() teardown resctrl but leave the mount point
x86/resctrl: Check all domains are offline in resctrl_exit()
x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
...
|
|
* for-next/sme-fixes: (35 commits)
arm64/fpsimd: Allow CONFIG_ARM64_SME to be selected
arm64/fpsimd: ptrace: Gracefully handle errors
arm64/fpsimd: ptrace: Mandate SVE payload for streaming-mode state
arm64/fpsimd: ptrace: Do not present register data for inactive mode
arm64/fpsimd: ptrace: Save task state before generating SVE header
arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state
arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data
arm64/fpsimd: Make clone() compatible with ZA lazy saving
arm64/fpsimd: Clear PSTATE.SM during clone()
arm64/fpsimd: Consistently preserve FPSIMD state during clone()
arm64/fpsimd: Remove redundant task->mm check
arm64/fpsimd: signal: Use SMSTOP behaviour in setup_return()
arm64/fpsimd: Add task_smstop_sm()
arm64/fpsimd: Factor out {sve,sme}_state_size() helpers
arm64/fpsimd: Clarify sve_sync_*() functions
arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE
arm64/fpsimd: signal: Consistently read FPSIMD context
arm64/fpsimd: signal: Mandate SVE payload for streaming-mode state
arm64/fpsimd: signal: Clear PSTATE.SM when restoring FPSIMD frame only
arm64/fpsimd: Do not discard modified SVE state
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core x86 updates from Ingo Molnar:
"Boot code changes:
- A large series of changes to reorganize the x86 boot code into a
better isolated and easier to maintain base of PIC early startup
code in arch/x86/boot/startup/, by Ard Biesheuvel.
Motivation & background:
| Since commit
|
| c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C")
|
| dated Jun 6 2017, we have been using C code on the boot path in a way
| that is not supported by the toolchain, i.e., to execute non-PIC C
| code from a mapping of memory that is different from the one provided
| to the linker. It should have been obvious at the time that this was a
| bad idea, given the need to sprinkle fixup_pointer() calls left and
| right to manipulate global variables (including non-pointer variables)
| without crashing.
|
| This C startup code has been expanding, and in particular, the SEV-SNP
| startup code has been expanding over the past couple of years, and
| grown many of these warts, where the C code needs to use special
| annotations or helpers to access global objects.
This tree includes the first phase of this work-in-progress x86
boot code reorganization.
Scalability enhancements and micro-optimizations:
- Improve code-patching scalability (Eric Dumazet)
- Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR (Andrew Cooper)
CPU features enumeration updates:
- Thorough reorganization and cleanup of CPUID parsing APIs (Ahmed S.
Darwish)
- Fix, refactor and clean up the cacheinfo code (Ahmed S. Darwish,
Thomas Gleixner)
- Update CPUID bitfields to x86-cpuid-db v2.3 (Ahmed S. Darwish)
Memory management changes:
- Allow temporary MMs when IRQs are on (Andy Lutomirski)
- Opt-in to IRQs-off activate_mm() (Andy Lutomirski)
- Simplify choose_new_asid() and generate better code (Borislav
Petkov)
- Simplify 32-bit PAE page table handling (Dave Hansen)
- Always use dynamic memory layout (Kirill A. Shutemov)
- Make SPARSEMEM_VMEMMAP the only memory model (Kirill A. Shutemov)
- Make 5-level paging support unconditional (Kirill A. Shutemov)
- Stop prefetching current->mm->mmap_lock on page faults (Mateusz
Guzik)
- Predict valid_user_address() returning true (Mateusz Guzik)
- Consolidate initmem_init() (Mike Rapoport)
FPU support and vector computing:
- Enable Intel APX support (Chang S. Bae)
- Reorgnize and clean up the xstate code (Chang S. Bae)
- Make task_struct::thread constant size (Ingo Molnar)
- Restore fpu_thread_struct_whitelist() to fix
CONFIG_HARDENED_USERCOPY=y (Kees Cook)
- Simplify the switch_fpu_prepare() + switch_fpu_finish() logic (Oleg
Nesterov)
- Always preserve non-user xfeatures/flags in __state_perm (Sean
Christopherson)
Microcode loader changes:
- Help users notice when running old Intel microcode (Dave Hansen)
- AMD: Do not return error when microcode update is not necessary
(Annie Li)
- AMD: Clean the cache if update did not load microcode (Boris
Ostrovsky)
Code patching (alternatives) changes:
- Simplify, reorganize and clean up the x86 text-patching code (Ingo
Molnar)
- Make smp_text_poke_batch_process() subsume
smp_text_poke_batch_finish() (Nikolay Borisov)
- Refactor the {,un}use_temporary_mm() code (Peter Zijlstra)
Debugging support:
- Add early IDT and GDT loading to debug relocate_kernel() bugs
(David Woodhouse)
- Print the reason for the last reset on modern AMD CPUs (Yazen
Ghannam)
- Add AMD Zen debugging document (Mario Limonciello)
- Fix opcode map (!REX2) superscript tags (Masami Hiramatsu)
- Stop decoding i64 instructions in x86-64 mode at opcode (Masami
Hiramatsu)
CPU bugs and bug mitigations:
- Remove X86_BUG_MMIO_UNKNOWN (Borislav Petkov)
- Fix SRSO reporting on Zen1/2 with SMT disabled (Borislav Petkov)
- Restructure and harmonize the various CPU bug mitigation methods
(David Kaplan)
- Fix spectre_v2 mitigation default on Intel (Pawan Gupta)
MSR API:
- Large MSR code and API cleanup (Xin Li)
- In-kernel MSR API type cleanups and renames (Ingo Molnar)
PKEYS:
- Simplify PKRU update in signal frame (Chang S. Bae)
NMI handling code:
- Clean up, refactor and simplify the NMI handling code (Sohil Mehta)
- Improve NMI duration console printouts (Sohil Mehta)
Paravirt guests interface:
- Restrict PARAVIRT_XXL to 64-bit only (Kirill A. Shutemov)
SEV support:
- Share the sev_secrets_pa value again (Tom Lendacky)
x86 platform changes:
- Introduce the <asm/amd/> header namespace (Ingo Molnar)
- i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to
<asm/amd/fch.h> (Mario Limonciello)
Fixes and cleanups:
- x86 assembly code cleanups and fixes (Uros Bizjak)
- Misc fixes and cleanups (Andi Kleen, Andy Lutomirski, Andy
Shevchenko, Ard Biesheuvel, Bagas Sanjaya, Baoquan He, Borislav
Petkov, Chang S. Bae, Chao Gao, Dan Williams, Dave Hansen, David
Kaplan, David Woodhouse, Eric Biggers, Ingo Molnar, Josh Poimboeuf,
Juergen Gross, Malaya Kumar Rout, Mario Limonciello, Nathan
Chancellor, Oleg Nesterov, Pawan Gupta, Peter Zijlstra, Shivank
Garg, Sohil Mehta, Thomas Gleixner, Uros Bizjak, Xin Li)"
* tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (331 commits)
x86/bugs: Fix spectre_v2 mitigation default on Intel
x86/bugs: Restructure ITS mitigation
x86/xen/msr: Fix uninitialized variable 'err'
x86/msr: Remove a superfluous inclusion of <asm/asm.h>
x86/paravirt: Restrict PARAVIRT_XXL to 64-bit only
x86/mm/64: Make 5-level paging support unconditional
x86/mm/64: Make SPARSEMEM_VMEMMAP the only memory model
x86/mm/64: Always use dynamic memory layout
x86/bugs: Fix indentation due to ITS merge
x86/cpuid: Rename hypervisor_cpuid_base()/for_each_possible_hypervisor_cpuid_base() to cpuid_base_hypervisor()/for_each_possible_cpuid_base_hypervisor()
x86/cpu/intel: Rename CPUID(0x2) descriptors iterator parameter
x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameter
x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2()
x86/cpuid: Rename have_cpuid_p() to cpuid_feature()
x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID header
x86/cpuid: Move CPUID(0x2) APIs into <cpuid/api.h>
x86/msr: Add rdmsrl_on_cpu() compatibility wrapper
x86/mm: Fix kernel-doc descriptions of various pgtable methods
x86/asm-offsets: Export certain 'struct cpuinfo_x86' fields for 64-bit asm use too
x86/boot: Defer initialization of VM space related global variables
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Madhavan Srinivasan:
- Support for dynamic preemption
- Migrate powerpc boards GPIO driver to new setter API
- Added new PMU for KVM host-wide measurement
- Enhancement to htmdump driver to support more functions
- Added character device for couple RTAS supported APIs
- Minor fixes and cleanup
Thanks to Amit Machhiwal, Athira Rajeev, Bagas Sanjaya, Bartosz
Golaszewski, Christophe Leroy, Eddie James, Gaurav Batra, Gautam
Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Jiri Slaby
(SUSE), Linus Walleij, Michal Suchanek, Naveen N Rao (AMD), Nilay
Shroff, Ricardo B. Marlière, Ritesh Harjani (IBM), Sathvika Vasireddy,
Shrikanth Hegde, Stephen Rothwell, Sourabh Jain, Thorsten Blum, Vaibhav
Jain, Venkat Rao Bagalkote, and Viktor Malik.
* tag 'powerpc-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (52 commits)
MAINTAINERS: powerpc: Remove myself as a reviewer
powerpc/iommu: Use str_disabled_enabled() helper
powerpc/powermac: Use str_enabled_disabled() and str_on_off() helpers
powerpc/mm/fault: Use str_write_read() helper function
powerpc: Replace strcpy() with strscpy() in proc_ppc64_init()
powerpc/pseries/iommu: Fix kmemleak in TCE table userspace view
powerpc/kernel: Fix ppc_save_regs inclusion in build
powerpc: Transliterate author name and remove FIXME
powerpc/pseries/htmdump: Include header file to get is_kvm_guest() definition
KVM: PPC: Book3S HV: Fix IRQ map warnings with XICS on pSeries KVM Guest
powerpc/8xx: Reduce alignment constraint for kernel memory
powerpc/boot: Fix build with gcc 15
powerpc/pseries/htmdump: Add documentation for H_HTM debugfs interface
powerpc/pseries/htmdump: Add htm capabilities support to htmdump module
powerpc/pseries/htmdump: Add htm flags support to htmdump module
powerpc/pseries/htmdump: Add htm setup support to htmdump module
powerpc/pseries/htmdump: Add htm info support to htmdump module
powerpc/pseries/htmdump: Add htm status support to htmdump module
powerpc/pseries/htmdump: Add htm start support to htmdump module
powerpc/pseries/htmdump: Add htm configure support to htmdump module
...
|
|
* kvm-arm64/misc-6.16:
: .
: Misc changes and improvements for 6.16:
:
: - Add a new selftest for the SVE host state being corrupted by a guest
:
: - Keep HCR_EL2.xMO set at all times for systems running with the kernel at EL2,
: ensuring that the window for interrupts is slightly bigger, and avoiding
: a pretty bad erratum on the AmpereOne HW
:
: - Replace a couple of open-coded on/off strings with str_on_off()
:
: - Get rid of the pKVM memblock sorting, which now appears to be superflous
:
: - Drop superflous clearing of ICH_LR_EOI in the LR when nesting
:
: - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers from
: a pretty bad case of TLB corruption unless accesses to HCR_EL2 are
: heavily synchronised
:
: - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS tables
: in a human-friendly fashion
: .
KVM: arm64: Fix documentation for vgic_its_iter_next()
KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables
arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
KVM: arm64: nv: Remove clearing of ICH_LR<n>.EOI if ICH_LR<n>.HW == 1
KVM: arm64: Drop sort_memblock_regions()
KVM: arm64: selftests: Add test for SVE host corruption
KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
KVM: arm64: Replace ternary flags with str_on_off() helper
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
On AmpereOne AC04, updates to HCR_EL2 can rarely corrupt simultaneous
translations for data addresses initiated by load/store instructions.
Only instruction initiated translations are vulnerable, not translations
from prefetches for example. A DSB before the store to HCR_EL2 is
sufficient to prevent older instructions from hitting the window for
corruption, and an ISB after is sufficient to prevent younger
instructions from hitting the window for corruption.
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250513184514.2678288-1-scott@os.amperecomputing.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Both Intel and AMD CPUs support 5-level paging, which is expected to
become more widely adopted in the future. All major x86 Linux
distributions have the feature enabled.
Remove CONFIG_X86_5LEVEL and related #ifdeffery for it to make it more readable.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250516123306.3812286-4-kirill.shutemov@linux.intel.com
|
|
Resctrl is a filesystem interface to hardware that provides cache
allocation policy and bandwidth control for groups of tasks or CPUs.
To support more than one architecture, resctrl needs to live in /fs/.
Move the code that is concerned with the filesystem interface to
/fs/resctrl.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Tested-by: Fenghua Yu <fenghuay@nvidia.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/20250515165855.31452-25-james.morse@arm.com
|
|
The resctrl documentation is titled "User Interface for Resource Control
feature".
Once the documentation follows the code in a move to the filesystem, this
appears in the list of filesystems, but doesn't contain the name of the
filesystem, making it hard to find.
Add 'resctrl' to the title.
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Tested-by: Fenghua Yu <fenghuay@nvidia.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/20250515165855.31452-15-james.morse@arm.com
|
|
Prepare to resolve conflicts with an upstream series of fixes that conflict
with pending x86 changes:
6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Make frequently fetched telemetry available via sysfs. These parameters
do not fit in hwmon sensor model, hence make them available via sysfs.
Create following sysfs files per acpi device node.
* c0_residency_input
* prochot_status
* smu_fw_version
* protocol_version
* ddr_max_bw(GB/s)
* ddr_utilised_bw_input(GB/s)
* ddr_utilised_bw_perc_input(%)
* mclk_input(MHz)
* fclk_input(MHz)
* clk_fmax(MHz)
* clk_fmin(MHz)
* cclk_freq_limit_input(MHz)
* pwr_current_active_freq_limit(MHz)
* pwr_current_active_freq_limit_source
Signed-off-by: Suma Hegde <suma.hegde@amd.com>
Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250506101542.200811-3-suma.hegde@amd.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Expose power reading and power limits via hwmon power sensors.
Signed-off-by: Suma Hegde <suma.hegde@amd.com>
Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Link: https://lore.kernel.org/r/20250506101542.200811-2-suma.hegde@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Document the support for matrix multiply accumulate instruction
from SiFive using RISCV_HWPROBE_VENDOR_EXT_XSFVFWMACCQQQ.
Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-12-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Document the support for SiFive vendor extensions for
FP32-to-int8 Ranged Clip Instructions using
RISCV_HWPROBE_VENDOR_EXT_XSFVFNRCLIPXFQF.
Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-8-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Document the support for sifive vendor extensions using the key
RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0 and two vendor extensions for SiFive
Int8 Matrix Multiplication Instructions using
RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCDOD and
RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCQOQ.
Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-4-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Currently, vec_set_vector_length() can manipulate a task into an invalid
state as a result of a prctl/ptrace syscall which changes the SVE/SME
vector length, resulting in several problems:
(1) When changing the SVE vector length, if the task initially has
PSTATE.ZA==1, and sve_alloc() fails to allocate memory, the task
will be left with PSTATE.ZA==1 and sve_state==NULL. This is not a
legitimate state, and could result in a subsequent null pointer
dereference.
(2) When changing the SVE vector length, if the task initially has
PSTATE.SM==1, the task will be left with PSTATE.SM==1 and
fp_type==FP_STATE_FPSIMD. Streaming mode state always needs to be
saved in SVE format, so this is not a legitimate state.
Attempting to restore this state may cause a task to erroneously
inherit stale streaming mode predicate registers and FFR contents,
behaving non-deterministically and potentially leaving information
from another task.
While in this state, reads of the NT_ARM_SSVE regset will indicate
that the registers are not stored in SVE format. For the NT_ARM_SSVE
regset specifically, debuggers interpret this as meaning that
PSTATE.SM==0.
(3) When changing the SME vector length, if the task initially has
PSTATE.SM==1, the lower 128 bits of task's streaming mode vector
state will be migrated to non-streaming mode, rather than these bits
being zeroed as is usually the case for changes to PSTATE.SM.
To fix the first issue, we can eagerly allocate the new sve_state and
sme_state before modifying the task. This makes it possible to handle
memory allocation failure without modifying the task state at all, and
removes the need to clear TIF_SVE and TIF_SME.
To fix the second issue, we either need to clear PSTATE.SM or not change
the saved fp_type. Given we're going to eagerly allocate sve_state and
sme_state, the simplest option is to preserve PSTATE.SM and the saves
fp_type, and consistently truncate the SVE state. This ensures that the
task always stays in a valid state, and by virtue of not exiting
streaming mode, this also sidesteps the third issue.
I believe these changes should not be problematic for realistic usage:
* When the SVE/SME vector length is changed via prctl(), syscall entry
will have cleared PSTATE.SM. Unless the task's state has been
manipulated via ptrace after entry, the task will have PSTATE.SM==0.
* When the SVE/SME vector length is changed via a write to the
NT_ARM_SVE or NT_ARM_SSVE regsets, PSTATE.SM will be forced
immediately after the length change, and new vector state will be
copied from userspace.
* When the SME vector length is changed via a write to the NT_ARM_ZA
regset, the (S)SVE state is clobbered today, so anyone who cares about
the specific state would need to install this after writing to the
NT_ARM_ZA regset.
As we need to free the old SVE state while TIF_SVE may still be set, we
cannot use sve_free(), and using kfree() directly makes it clear that
the free pairs with the subsequent assignment. As this leaves sve_free()
unused, I've removed the existing sve_free() and renamed __sve_free() to
mirror sme_free().
Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME")
Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Spickett <david.spickett@arm.com>
Cc: Luis Machado <luis.machado@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-16-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Linux is intended to be compatible with userspace written to Arm's
AAPCS64 procedure call standard [1,2]. For the Scalable Matrix Extension
(SME), AAPCS64 was extended with a "ZA lazy saving scheme", where SME's
ZA tile is lazily callee-saved and caller-restored. In this scheme,
TPIDR2_EL0 indicates whether the ZA tile is live or has been saved by
pointing to a "TPIDR2 block" in memory, which has a "za_save_buffer"
pointer. This scheme has been implemented in GCC and LLVM, with
necessary runtime support implemented in glibc and bionic.
AAPCS64 does not specify how the ZA lazy saving scheme is expected to
interact with thread creation mechanisms such as fork() and
pthread_create(), which would be implemented in terms of the Linux clone
syscall. The behaviour implemented by Linux and glibc/bionic doesn't
always compose safely, as explained below.
Currently the clone syscall is implemented such that PSTATE.ZA and the
ZA tile are always inherited by the new task, and TPIDR2_EL0 is
inherited unless the 'flags' argument includes CLONE_SETTLS,
in which case TPIDR2_EL0 is set to 0/NULL. This doesn't make much sense:
(a) TPIDR2_EL0 is part of the calling convention, and changes as control
is passed between functions. It is *NOT* used for thread local
storage, despite superficial similarity to TPIDR_EL0, which is is
used as the TLS register.
(b) TPIDR2_EL0 and PSTATE.ZA are tightly coupled in the procedure call
standard, and some combinations of states are illegal. In general,
manipulating the two independently is not guaranteed to be safe.
In practice, code which is compliant with the procedure call standard
may issue a clone syscall while in the "ZA dormant" state, where
PSTATE.ZA==1 and TPIDR2_EL0 is non-null and indicates that ZA needs to
be saved. This can cause a variety of problems, including:
* If the implementation of pthread_create() passes CLONE_SETTLS, the
new thread will start with PSTATE.ZA==1 and TPIDR2==NULL. Per the
procedure call standard this is not a legitimate state for most
functions. This can cause data corruption (e.g. as code may rely on
PSTATE.ZA being 0 to guarantee that an SMSTART ZA instruction will
zero the ZA tile contents), and may result in other undefined
behaviour.
* If the implementation of pthread_create() does not pass CLONE_SETTLS, the
new thread will start with PSTATE.ZA==1 and TPIDR2 pointing to a
TPIDR2 block on the parent thread's stack. This can result in a
variety of problems, e.g.
- The child may write back to the parent's za_save_buffer, corrupting
its contents.
- The child may read from the TPIDR2 block after the parent has reused
this memory for something else, and consequently the child may abort
or clobber arbitrary memory.
Ideally we'd require that userspace ensures that a task is in the "ZA
off" state (with PSTATE.ZA==0 and TPIDR2_EL0==NULL) prior to issuing a
clone syscall, and have the kernel force this state for new threads.
Unfortunately, contemporary C libraries do not do this, and simply
forcing this state within the implementation of clone would break
fork().
Instead, we can bodge around this by considering the CLONE_VM flag, and
manipulate PSTATE.ZA and TPIDR2_EL0 as a pair. CLONE_VM indicates that
the new task will run in the same address space as its parent, and in
that case it doesn't make sense to inherit a stale pointer to the
parent's TPIDR2 block:
* For fork(), CLONE_VM will not be set, and it is safe to inherit both
PSTATE.ZA and TPIDR2_EL0 as the new task will have its own copy of the
address space, and cannot clobber its parent's stack.
* For pthread_create() and vfork(), CLONE_VM will be set, and discarding
PSTATE.ZA and TPIDR2_EL0 for the new task doesn't break any existing
assumptions in userspace.
Implement this behaviour for clone(). We currently inherit PSTATE.ZA in
arch_dup_task_struct(), but this does not have access to the clone
flags, so move this logic under copy_thread(). Documentation is updated
to describe the new behaviour.
[1] https://github.com/ARM-software/abi-aa/releases/download/2025Q1/aapcs64.pdf
[2] https://github.com/ARM-software/abi-aa/blob/c51addc3dc03e73a016a1e4edf25440bcac76431/aapcs64/aapcs64.rst
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Richard Sandiford <richard.sandiford@arm.com>
Cc: Sander De Smalen <sander.desmalen@arm.com>
Cc: Tamas Petz <tamas.petz@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yury Khrustalev <yury.khrustalev@arm.com>
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Link: https://lore.kernel.org/r/20250508132644.1395904-14-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Mention how to include initramfs when building the kernel and
direct the reader to ramfs-rootfs-initramfs.rst documentation for more
details
Signed-off-by: Ann Yun <by.ann.yun@gmail.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
|
|
The following register contains bits that indicate the cause for the
previous reset.
PMx000000C0 (FCH::PM::S5_RESET_STATUS)
This is useful for debug. The reasons for reset are broken into 6 high level
categories. Decode it by category and print during boot.
Specifics within a category are split off into debugging documentation.
The register is accessed indirectly through a "PM" port in the FCH. Use
MMIO access in order to avoid restrictions with legacy port access.
Use a late_initcall() to ensure that MMIO has been set up before trying to
access the register.
This register was introduced with AMD Family 17h, so avoid access on older
families. There is no CPUID feature bit for this register.
[ bp: Simplify the reason dumping loop.
- merge a fix to not access an array element after the last one:
https://lore.kernel.org/r/20250505133609.83933-1-superm1@kernel.org
Reported-by: James Dutton <james.dutton@gmail.com>
]
[ mingo:
- Use consistent .rst formatting
- Fix 'Sleep' class field to 'ACPI-State'
- Standardize pin messages around the 'tripped' verbiage
- Remove reference to ring-buffer printing & simplify the wording
- Use curly braces for multi-line conditional statements ]
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250422234830.2840784-6-superm1@kernel.org
|
|
Documentation for HTM (Hardware Trace Macro) debugfs interface
and how it can be used to configure/control the HTM operations.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-10-atrajeev@linux.ibm.com
|
|
Debugging issues on AMD hardware can be challenging for users without
proper documentation and tools.
Introduce a document that includes techniques for debugging s2idle
issues. It will be expanded for debugging other issues later.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/20250422234830.2840784-2-superm1@kernel.org
|
|
The KVM PV ABI recently added a feature that allows the VM to discover
the set of physical CPU implementations, identified by a tuple of
{MIDR_EL1, REVIDR_EL1, AIDR_EL1}. Unlike other KVM PV features, the
expectation is that the VMM implements the hypercall instead of KVM as
it has the authoritative view of where the VM gets scheduled.
To do this the VMM needs to know the values of these registers on any
CPU in the system. While MIDR_EL1 and REVIDR_EL1 are already exposed,
AIDR_EL1 is not. Provide it in sysfs along with the other identification
registers.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20250403231626.3181116-1-oliver.upton@linux.dev
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Fix a spelling typo in fsgs.rst.
Signed-off-by: Adrian Bütler <buetlera123@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20250426122303.15905-1-buetlera123@gmail.com>
|
|
Linux is intended to be compatible with userspace written to Arm's
AAPCS64 procedure call standard [1,2]. For the Scalable Matrix Extension
(SME), AAPCS64 was extended with a "ZA lazy saving scheme", where SME's
ZA tile is lazily callee-saved and caller-restored. In this scheme,
TPIDR2_EL0 indicates whether the ZA tile is live or has been saved by
pointing to a "TPIDR2 block" in memory, which has a "za_save_buffer"
pointer. This scheme has been implemented in GCC and LLVM, with
necessary runtime support implemented in glibc.
AAPCS64 does not specify how the ZA lazy saving scheme is expected to
interact with signal handling, and the behaviour that AAPCS64 currently
recommends for (sig)setjmp() and (sig)longjmp() does not always compose
safely with signal handling, as explained below.
When Linux delivers a signal, it creates signal frames which contain the
original values of PSTATE.ZA, the ZA tile, and TPIDR_EL2. Between saving
the original state and entering the signal handler, Linux clears
PSTATE.ZA, but leaves TPIDR2_EL0 unchanged. Consequently a signal
handler can be entered with PSTATE.ZA=0 (meaning accesses to ZA will
trap), while TPIDR_EL0 is non-null (which may indicate that ZA needs to
be lazily saved, depending on the contents of the TPIDR2 block). While
in this state, libc and/or compiler runtime code, such as longjmp(), may
attempt to save ZA. As PSTATE.ZA=0, these accesses will trap, causing
the kernel to inject a SIGILL. Note that by virtue of lazy saving
occurring in libc and/or C runtime code, this can be triggered by
application/library code which is unaware of SME.
To avoid the problem above, the kernel must ensure that signal handlers
are entered with PSTATE.ZA and TPIDR2_EL0 configured in a manner which
complies with the ZA lazy saving scheme. Practically speaking, the only
choice is to enter signal handlers with PSTATE.ZA=0 and TPIDR2_EL0=NULL.
This change should not impact SME code which does not follow the ZA lazy
saving scheme (and hence does not use TPIDR2_EL0).
An alternative approach that was considered is to have the signal
handler inherit the original values of both PSTATE.ZA and TPIDR2_EL0,
relying on lazy save/restore sequences being idempotent and capable of
racing safely. This is not safe as signal handlers must be assumed to
have a "private ZA" interface, and therefore cannot be entered with
PSTATE.ZA=1 and TPIDR2_EL0=NULL, but it is legitimate for signals to be
taken from this state.
With the kernel fixed to clear TPIDR2_EL0, there are a couple of
remaining issues (largely masked by the first issue) that must be fixed
in userspace:
(1) When a (sig)setjmp() + (sig)longjmp() pair cross a signal boundary,
ZA state may be discarded when it needs to be preserved.
Currently, the ZA lazy saving scheme recommends that setjmp() does
not save ZA, and recommends that longjmp() is responsible for saving
ZA. A call to longjmp() in a signal handler will not have visibility
of ZA state that existed prior to entry to the signal, and when a
longjmp() is used to bypass a usual signal return, unsaved ZA state
will be discarded erroneously.
To fix this, it is necessary for setjmp() to eagerly save ZA state,
and for longjmp() to configure PSTATE.ZA=0 and TPIDR2_EL0=NULL. This
works regardless of whether a signal boundary is crossed.
(2) When a C++ exception is thrown and crosses a signal boundary before
it is caught, ZA state may be discarded when it needs to be
preserved.
AAPCS64 requires that exception handlers are entered with
PSTATE.{SM,ZA}={0,0} and TPIDR2_EL0=NULL, with exception unwind code
expected to perform any necessary save of ZA state.
Where it is necessary to perform an exception unwind across an
exception boundary, the unwind code must recover any necessary ZA
state (along with TPIDR2) from signal frames.
Fix the kernel as described above, with setup_return() clearing
TPIDR2_EL0 when delivering a signal. Folk on CC are working on fixes for
the remaining userspace issues, including updates/fixes to the AAPCS64
specification and glibc.
[1] https://github.com/ARM-software/abi-aa/releases/download/2025Q1/aapcs64.pdf
[2] https://github.com/ARM-software/abi-aa/blob/c51addc3dc03e73a016a1e4edf25440bcac76431/aapcs64/aapcs64.rst
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Fixes: 39e54499280f ("arm64/signal: Include TPIDR2 in the signal context")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Richard Sandiford <richard.sandiford@arm.com>
Cc: Sander De Smalen <sander.desmalen@arm.com>
Cc: Tamas Petz <tamas.petz@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yury Khrustalev <yury.khrustalev@arm.com>
Link: https://lore.kernel.org/r/20250417190113.3778111-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Pull OpenRISC updates from Stafford Horne:
- Support for cacheinfo API to expose OpenRISC cache info via sysfs,
this also translated to some cleanups to OpenRISC cache flush and
invalidate API's
- Documentation updates for new mailing list and toolchain binaries
* tag 'for-linus' of https://github.com/openrisc/linux:
Documentation: openrisc: Update toolchain binaries URL
Documentation: openrisc: Update mailing list
openrisc: Add cacheinfo support
openrisc: Introduce new utility functions to flush and invalidate caches
openrisc: Refactor struct cpuinfo_or1k to reduce duplication
|
|
The old development toolchain binaries were hosted in the or1k-gcc
development github repo release page. However, now that we have all
code upstream I cut releases from stable upstream tarballs. It does not
make sense to tag the or1k-gcc github repo releases for these stable
releases.
Update the toolchain binaries URL to point to where they are now hosted
on the or1k-toolchain-build github release page.
Signed-off-by: Stafford Horne <shorne@gmail.com>
|
|
The librecores.org mailing list was replaced with vger.kernel.org last
year after the old mail server went offline. Update the docs to reflect
the new list.
Signed-off-by: Stafford Horne <shorne@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for an issue where C instructions ended up in non-C builds, due
to some broken inline assembly in the KGDB breakpoint insertion code
- A fix to avoid spurious printk messages about misaligned access
performance probing
- A fix for a handful of issues with /proc/iomem's reserved region
handling
- A pair of fixes for module relocation processing
- A few build-time fixes
* tag 'riscv-for-linus-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: KGDB: Remove ".option norvc/.option rvc" for kgdb_compiled_break
riscv: KGDB: Do not inline arch_kgdb_breakpoint()
riscv: Avoid fortify warning in syscall_get_arguments()
riscv: Provide all alternative macros all the time
riscv: module: Allocate PLT entries for R_RISCV_PLT32
riscv: module: Fix out-of-bounds relocation access
riscv: Properly export reserved regions in /proc/iomem
riscv: Fix unaligned access info messages
riscv: Avoid fortify warning in syscall_get_arguments()
Documentation: riscv: Fix typo MIMPLID -> MIMPID
riscv: Use kvmalloc_array on relocation_hashtable
|
|
Update kvm-nested APIv2 documentation to include five new
Guest-State-Elements to fetch the hostwide counters. These counters are
per L1-Lpar and indicate the amount of Heap/Page-table memory allocated,
available and Page-table memory reclaimed for all L2-Guests active
instances
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-2-vaibhav@linux.ibm.com
|
|
The subsections already have numbering - no need for the letters too.
Zap the latter.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250409111435.GEZ_ZWmz3_lkP8S9Lb@fat_crate.local
|
|
Commit:
78ce84b9e0a5 ("x86/cpufeatures: Flip the /proc/cpuinfo appearance logic")
changed how CPU feature names should be specified. Update document to
reflect the same.
Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250409111341.GDZ_ZWZS4LckBcirLE@fat_crate.local
|
|
The macro that is really defined is RISCV_HWPROBE_KEY_MIMPID, not
RISCV_HWPROBE_KEY_MIMPLID (difference is the 'L').
Also, the riscv privileged specification names the register "mimpid", not
"mimplid".
Correct these typos.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20240925142532.31808-1-namcao@linutronix.de
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- The sub-architecture selection Kconfig system has been cleaned up,
the documentation has been improved, and various detections have been
fixed
- The vector-related extensions dependencies are now validated when
parsing from device tree and in the DT bindings
- Misaligned access probing can be overridden via a kernel command-line
parameter, along with various fixes to misalign access handling
- Support for relocatable !MMU kernels builds
- Support for hpge pfnmaps, which should improve TLB utilization
- Support for runtime constants, which improves the d_hash()
performance
- Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm
- Various fixes, including:
- We were missing a secondary mmu notifier call when flushing the
tlb which is required for IOMMU
- Fix ftrace panics by saving the registers as expected by ftrace
- Fix a couple of stimecmp usage related to cpu hotplug
- purgatory_start is now aligned as per the STVEC requirements
- A fix for hugetlb when calculating the size of non-present PTEs
* tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (65 commits)
riscv: Add norvc after .option arch in runtime const
riscv: Make sure toolchain supports zba before using zba instructions
riscv/purgatory: 4B align purgatory_start
riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
selftests: riscv: fix v_exec_initval_nolibc.c
riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
riscv: print hartid on bringup
riscv: Add norvc after .option arch in runtime const
riscv: Remove CONFIG_PAGE_OFFSET
riscv: Support CONFIG_RELOCATABLE on riscv32
asm-generic: Always define Elf_Rel and Elf_Rela
riscv: Support CONFIG_RELOCATABLE on NOMMU
riscv: Allow NOMMU kernels to access all of RAM
riscv: Remove duplicate CONFIG_PAGE_OFFSET definition
RISC-V: errata: Use medany for relocatable builds
dt-bindings: riscv: document vector crypto requirements
dt-bindings: riscv: add vector sub-extension dependencies
dt-bindings: riscv: d requires f
RISC-V: add f & d extension validation checks
RISC-V: add vector crypto extension validation checks
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
- Remove the IBM CAPI (cxl) driver
Thanks to Andrew Donnellan.
* tag 'powerpc-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
docs: Fix references to IBM CAPI (cxl) removal version
cxl: Remove driver
|
|
This merges in the removal of the IBM CAPI "cxl" driver.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- The series "Enable strict percpu address space checks" from Uros
Bizjak uses x86 named address space qualifiers to provide
compile-time checking of percpu area accesses.
This has caused a small amount of fallout - two or three issues were
reported. In all cases the calling code was found to be incorrect.
- The series "Some cleanup for memcg" from Chen Ridong implements some
relatively monir cleanups for the memcontrol code.
- The series "mm: fixes for device-exclusive entries (hmm)" from David
Hildenbrand fixes a boatload of issues which David found then using
device-exclusive PTE entries when THP is enabled. More work is
needed, but this makes thins better - our own HMM selftests now
succeed.
- The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed
remove the z3fold and zbud implementations. They have been deprecated
for half a year and nobody has complained.
- The series "mm: further simplify VMA merge operation" from Lorenzo
Stoakes implements numerous simplifications in this area. No runtime
effects are anticipated.
- The series "mm/madvise: remove redundant mmap_lock operations from
process_madvise()" from SeongJae Park rationalizes the locking in the
madvise() implementation. Performance gains of 20-25% were observed
in one MADV_DONTNEED microbenchmark.
- The series "Tiny cleanup and improvements about SWAP code" from
Baoquan He contains a number of touchups to issues which Baoquan
noticed when working on the swap code.
- The series "mm: kmemleak: Usability improvements" from Catalin
Marinas implements a couple of improvements to the kmemleak
user-visible output.
- The series "mm/damon/paddr: fix large folios access and schemes
handling" from Usama Arif provides a couple of fixes for DAMON's
handling of large folios.
- The series "mm/damon/core: fix wrong and/or useless damos_walk()
behaviors" from SeongJae Park fixes a few issues with the accuracy of
kdamond's walking of DAMON regions.
- The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo
Stoakes changes the interaction between framebuffer deferred-io and
core MM. No functional changes are anticipated - this is preparatory
work for the future removal of page structure fields.
- The series "mm/damon: add support for hugepage_size DAMOS filter"
from Usama Arif adds a DAMOS filter which permits the filtering by
huge page sizes.
- The series "mm: permit guard regions for file-backed/shmem mappings"
from Lorenzo Stoakes extends the guard region feature from its
present "anon mappings only" state. The feature now covers shmem and
file-backed mappings.
- The series "mm: batched unmap lazyfree large folios during
reclamation" from Barry Song cleans up and speeds up the unmapping
for pte-mapped large folios.
- The series "reimplement per-vma lock as a refcount" from Suren
Baghdasaryan puts the vm_lock back into the vma. Our reasons for
pulling it out were largely bogus and that change made the code more
messy. This patchset provides small (0-10%) improvements on one
microbenchmark.
- The series "Docs/mm/damon: misc DAMOS filters documentation fixes and
improves" from SeongJae Park does some maintenance work on the DAMON
docs.
- The series "hugetlb/CMA improvements for large systems" from Frank
van der Linden addresses a pile of issues which have been observed
when using CMA on large machines.
- The series "mm/damon: introduce DAMOS filter type for unmapped pages"
from SeongJae Park enables users of DMAON/DAMOS to filter my the
page's mapped/unmapped status.
- The series "zsmalloc/zram: there be preemption" from Sergey
Senozhatsky teaches zram to run its compression and decompression
operations preemptibly.
- The series "selftests/mm: Some cleanups from trying to run them" from
Brendan Jackman fixes a pile of unrelated issues which Brendan
encountered while runnimg our selftests.
- The series "fs/proc/task_mmu: add guard region bit to pagemap" from
Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to
determine whether a particular page is a guard page.
- The series "mm, swap: remove swap slot cache" from Kairui Song
removes the swap slot cache from the allocation path - it simply
wasn't being effective.
- The series "mm: cleanups for device-exclusive entries (hmm)" from
David Hildenbrand implements a number of unrelated cleanups in this
code.
- The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual
implements a number of preparatoty cleanups to the GENERIC_PTDUMP
Kconfig logic.
- The series "mm/damon: auto-tune aggregation interval" from SeongJae
Park implements a feedback-driven automatic tuning feature for
DAMON's aggregation interval tuning.
- The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in
powerpc, sparc and x86 lazy MMU implementations. Ryan did this in
preparation for implementing lazy mmu mode for arm64 to optimize
vmalloc.
- The series "mm/page_alloc: Some clarifications for migratetype
fallback" from Brendan Jackman reworks some commentary to make the
code easier to follow.
- The series "page_counter cleanup and size reduction" from Shakeel
Butt cleans up the page_counter code and fixes a size increase which
we accidentally added late last year.
- The series "Add a command line option that enables control of how
many threads should be used to allocate huge pages" from Thomas
Prescher does that. It allows the careful operator to significantly
reduce boot time by tuning the parallalization of huge page
initialization.
- The series "Fix calculations in trace_balance_dirty_pages() for cgwb"
from Tang Yizhou fixes the tracing output from the dirty page
balancing code.
- The series "mm/damon: make allow filters after reject filters useful
and intuitive" from SeongJae Park improves the handling of allow and
reject filters. Behaviour is made more consistent and the documention
is updated accordingly.
- The series "Switch zswap to object read/write APIs" from Yosry Ahmed
updates zswap to the new object read/write APIs and thus permits the
removal of some legacy code from zpool and zsmalloc.
- The series "Some trivial cleanups for shmem" from Baolin Wang does as
it claims.
- The series "fs/dax: Fix ZONE_DEVICE page reference counts" from
Alistair Popple regularizes the weird ZONE_DEVICE page refcount
handling in DAX, permittig the removal of a number of special-case
checks.
- The series "refactor mremap and fix bug" from Lorenzo Stoakes is a
preparatoty refactoring and cleanup of the mremap() code.
- The series "mm: MM owner tracking for large folios (!hugetlb) +
CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in
which we determine whether a large folio is known to be mapped
exclusively into a single MM.
- The series "mm/damon: add sysfs dirs for managing DAMOS filters based
on handling layers" from SeongJae Park adds a couple of new sysfs
directories to ease the management of DAMON/DAMOS filters.
- The series "arch, mm: reduce code duplication in mem_init()" from
Mike Rapoport consolidates many per-arch implementations of
mem_init() into code generic code, where that is practical.
- The series "mm/damon/sysfs: commit parameters online via
damon_call()" from SeongJae Park continues the cleaning up of sysfs
access to DAMON internal data.
- The series "mm: page_ext: Introduce new iteration API" from Luiz
Capitulino reworks the page_ext initialization to fix a boot-time
crash which was observed with an unusual combination of compile and
cmdline options.
- The series "Buddy allocator like (or non-uniform) folio split" from
Zi Yan reworks the code to split a folio into smaller folios. The
main benefit is lessened memory consumption: fewer post-split folios
are generated.
- The series "Minimize xa_node allocation during xarry split" from Zi
Yan reduces the number of xarray xa_nodes which are generated during
an xarray split.
- The series "drivers/base/memory: Two cleanups" from Gavin Shan
performs some maintenance work on the drivers/base/memory code.
- The series "Add tracepoints for lowmem reserves, watermarks and
totalreserve_pages" from Martin Liu adds some more tracepoints to the
page allocator code.
- The series "mm/madvise: cleanup requests validations and
classifications" from SeongJae Park cleans up some warts which
SeongJae observed during his earlier madvise work.
- The series "mm/hwpoison: Fix regressions in memory failure handling"
from Shuai Xue addresses two quite serious regressions which Shuai
has observed in the memory-failure implementation.
- The series "mm: reliable huge page allocator" from Johannes Weiner
makes huge page allocations cheaper and more reliable by reducing
fragmentation.
- The series "Minor memcg cleanups & prep for memdescs" from Matthew
Wilcox is preparatory work for the future implementation of memdescs.
- The series "track memory used by balloon drivers" from Nico Pache
introduces a way to track memory used by our various balloon drivers.
- The series "mm/damon: introduce DAMOS filter type for active pages"
from Nhat Pham permits users to filter for active/inactive pages,
separately for file and anon pages.
- The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia
separates the proactive reclaim statistics from the direct reclaim
statistics.
- The series "mm/vmscan: don't try to reclaim hwpoison folio" from
Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim
code.
* tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits)
mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex()
x86/mm: restore early initialization of high_memory for 32-bits
mm/vmscan: don't try to reclaim hwpoison folio
mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
cgroup: docs: add pswpin and pswpout items in cgroup v2 doc
mm: vmscan: split proactive reclaim statistics from direct reclaim statistics
selftests/mm: speed up split_huge_page_test
selftests/mm: uffd-unit-tests support for hugepages > 2M
docs/mm/damon/design: document active DAMOS filter type
mm/damon: implement a new DAMOS filter type for active pages
fs/dax: don't disassociate zero page entries
MM documentation: add "Unaccepted" meminfo entry
selftests/mm: add commentary about 9pfs bugs
fork: use __vmalloc_node() for stack allocation
docs/mm: Physical Memory: Populate the "Zones" section
xen: balloon: update the NR_BALLOON_PAGES state
hv_balloon: update the NR_BALLOON_PAGES state
balloon_compaction: update the NR_BALLOON_PAGES state
meminfo: add a per node counter for balloon drivers
mm: remove references to folio in __memcg_kmem_uncharge_page()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Madhavan Srinivasan:
- Remove support for IBM Cell Blades
- SMP support for microwatt platform
- Support for inline static calls on PPC32
- Enable pmu selftests for power11 platform
- Enable hardware trace macro (HTM) hcall support
- Support for limited address mode capability
- Changes to RMA size from 512 MB to 768 MB to handle fadump
- Misc fixes and cleanups
Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann,
Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom,
Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook,
Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani
(IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav
Jain, and Venkat Rao Bagalkote.
* tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits)
powerpc/kexec: fix physical address calculation in clear_utlb_entry()
crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD
powerpc: Fix 'intra_function_call not a direct call' warning
powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu'
KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests
powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7
powerpc/microwatt: Add SMP support
powerpc: Define config option for processors with broadcast TLBIE
powerpc/microwatt: Define an idle power-save function
powerpc/microwatt: Device-tree updates
powerpc/microwatt: Select COMMON_CLK in order to get the clock framework
net: toshiba: Remove reference to PPC_IBM_CELL_BLADE
net: spider_net: Remove powerpc Cell driver
cpufreq: ppc_cbe: Remove powerpc Cell driver
genirq: Remove IRQ_EDGE_EOI_HANDLER
docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR
powerpc: Remove UDBG_RTAS_CONSOLE
powerpc/io: Use standard barrier macros in io.c
powerpc/io: Rename _insw_ns() etc.
powerpc/io: Use generic raw accessors
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Continue Netlink conversions to per-namespace RTNL lock
(IPv4 routing, routing rules, routing next hops, ARP ioctls)
- Continue extending the use of netdev instance locks. As a driver
opt-in protect queue operations and (in due course) ethtool
operations with the instance lock and not RTNL lock.
- Support collecting TCP timestamps (data submitted, sent, acked) in
BPF, allowing for transparent (to the application) and lower
overhead tracking of TCP RPC performance.
- Tweak existing networking Rx zero-copy infra to support zero-copy
Rx via io_uring.
- Optimize MPTCP performance in single subflow mode by 29%.
- Enable GRO on packets which went thru XDP CPU redirect (were queued
for processing on a different CPU). Improving TCP stream
performance up to 2x.
- Improve performance of contended connect() by 200% by searching for
an available 4-tuple under RCU rather than a spin lock. Bring an
additional 229% improvement by tweaking hash distribution.
- Avoid unconditionally touching sk_tsflags on RX, improving
performance under UDP flood by as much as 10%.
- Avoid skb_clone() dance in ping_rcv() to improve performance under
ping flood.
- Avoid FIB lookup in netfilter if socket is available, 20% perf win.
- Rework network device creation (in-kernel) API to more clearly
identify network namespaces and their roles. There are up to 4
namespace roles but we used to have just 2 netns pointer arguments,
interpreted differently based on context.
- Use sysfs_break_active_protection() instead of trylock to avoid
deadlocks between unregistering objects and sysfs access.
- Add a new sysctl and sockopt for capping max retransmit timeout in
TCP.
- Support masking port and DSCP in routing rule matches.
- Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
- Support specifying at what time packet should be sent on AF_XDP
sockets.
- Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin
users.
- Add Netlink YAML spec for WiFi (nl80211) and conntrack.
- Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
which only need to be exported when IPv6 support is built as a
module.
- Age FDB entries based on Rx not Tx traffic in VxLAN, similar to
normal bridging.
- Allow users to specify source port range for GENEVE tunnels.
- netconsole: allow attaching kernel release, CPU ID and task name to
messages as metadata
Driver API:
- Continue rework / fixing of Energy Efficient Ethernet (EEE) across
the SW layers. Delegate the responsibilities to phylink where
possible. Improve its handling in phylib.
- Support symmetric OR-XOR RSS hashing algorithm.
- Support tracking and preserving IRQ affinity by NAPI itself.
- Support loopback mode speed selection for interface selftests.
Device drivers:
- Remove the IBM LCS driver for s390
- Remove the sb1000 cable modem driver
- Add support for SFP module access over SMBus
- Add MCTP transport driver for MCTP-over-USB
- Enable XDP metadata support in multiple drivers
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- add PCIe TLP Processing Hints (TPH) support for new AMD
platforms
- support dumping RoCE queue state for debug
- opt into instance locking
- Intel (100G, ice, idpf):
- ice: rework MSI-X IRQ management and distribution
- ice: support for E830 devices
- iavf: add support for Rx timestamping
- iavf: opt into instance locking
- nVidia/Mellanox:
- mlx4: use page pool memory allocator for Rx
- mlx5: support for one PTP device per hardware clock
- mlx5: support for 200Gbps per-lane link modes
- mlx5: move IPSec policy check after decryption
- AMD/Solarflare:
- support FW flashing via devlink
- Cisco (enic):
- use page pool memory allocator for Rx
- enable 32, 64 byte CQEs
- get max rx/tx ring size from the device
- Meta (fbnic):
- support flow steering and RSS configuration
- report queue stats
- support TCP segmentation
- support IRQ coalescing
- support ring size configuration
- Marvell/Cavium:
- support AF_XDP
- Wangxun:
- support for PTP clock and timestamping
- Huawei (hibmcge):
- checksum offload
- add more statistics
- Ethernet virtual:
- VirtIO net:
- aggressively suppress Tx completions, improve perf by 96%
with 1 CPU and 55% with 2 CPUs
- expose NAPI to IRQ mapping and persist NAPI settings
- Google (gve):
- support XDP in DQO RDA Queue Format
- opt into instance locking
- Microsoft vNIC:
- support BIG TCP
- Ethernet NICs consumer, and embedded:
- Synopsys (stmmac):
- cleanup Tx and Tx clock setting and other link-focused
cleanups
- enable SGMII and 2500BASEX mode switching for Intel platforms
- support Sophgo SG2044
- Broadcom switches (b53):
- support for BCM53101
- TI:
- iep: add perout configuration support
- icssg: support XDP
- Cadence (macb):
- implement BQL
- Xilinx (axinet):
- support dynamic IRQ moderation and changing coalescing at
runtime
- implement BQL
- report standard stats
- MediaTek:
- support phylink managed EEE
- Intel:
- igc: don't restart the interface on every XDP program change
- RealTek (r8169):
- support reading registers of internal PHYs directly
- increase max jumbo packet size on RTL8125/RTL8126
- Airoha:
- support for RISC-V NPU packet processing unit
- enable scatter-gather and support MTU up to 9kB
- Tehuti (tn40xx):
- support cards with TN4010 MAC and an Aquantia AQR105 PHY
- Ethernet PHYs:
- support for TJA1102S, TJA1121
- dp83tg720: add randomized polling intervals for link detection
- dp83822: support changing the transmit amplitude voltage
- support for LEDs on 88q2xxx
- CAN:
- canxl: support Remote Request Substitution bit access
- flexcan: add S32G2/S32G3 SoC
- WiFi:
- remove cooked monitor support
- strict mode for better AP testing
- basic EPCS support
- OMI RX bandwidth reduction support
- batman-adv: add support for jumbo frames
- WiFi drivers:
- RealTek (rtw88):
- support RTL8814AE and RTL8814AU
- RealTek (rtw89):
- switch using wiphy_lock and wiphy_work
- add BB context to manipulate two PHY as preparation of MLO
- improve BT-coexistence mechanism to play A2DP smoothly
- Intel (iwlwifi):
- add new iwlmld sub-driver for latest HW/FW combinations
- MediaTek (mt76):
- preparation for mt7996 Multi-Link Operation (MLO) support
- Qualcomm/Atheros (ath12k):
- continued work on MLO
- Silabs (wfx):
- Wake-on-WLAN support
- Bluetooth:
- add support for skb TX SND/COMPLETION timestamping
- hci_core: enable buffer flow control for SCO/eSCO
- coredump: log devcd dumps into the monitor
- Bluetooth drivers:
- intel: add support to configure TX power
- nxp: handle bootloader error during cmd5 and cmd7"
* tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits)
unix: fix up for "apparmor: add fine grained af_unix mediation"
mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
net: usb: asix: ax88772: Increase phy_name size
net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string
net: libwx: fix Tx L4 checksum
net: libwx: fix Tx descriptor content for some tunnel packets
atm: Fix NULL pointer dereference
net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards
net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card
net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus
net: phy: aquantia: add essential functions to aqr105 driver
net: phy: aquantia: search for firmware-name in fwnode
net: phy: aquantia: add probe function to aqr105 for firmware loading
net: phy: Add swnode support to mdiobus_scan
gve: add XDP DROP and PASS support for DQ
gve: update XDP allocation path support RX buffer posting
gve: merge packet buffer size fields
gve: update GQ RX to use buf_size
gve: introduce config-based allocation for XDP
gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics
...
|