Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It it apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the turbostat utility, extend the macros used for
defining device power management callbacks and add a diagnostic
message to the generic power domains code.
Specifics:
- Add an error message to be printed when a power domain marked as
"always on" is not actually on during initialization (Johan
Hovold).
- Extend macros used for defining power management callbacks to allow
conditional exporting of noirq and late/early suspend/resume PM
callbacks (Paul Cercueil).
- Update the turbostat utility:
- Add support for two new platforms (Zhang Rui).
- Adjust energy unit for Sapphire Rapids (Zhang Rui).
- Do not dump TRL if turbo is not supported (Artem Bityutskiy)"
* tag 'pm-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/power turbostat: version 2022.10.04
tools/power turbostat: Use standard Energy Unit for SPR Dram RAPL domain
tools/power turbostat: Do not dump TRL if turbo is not supported
tools/power turbostat: Add support for MeteorLake platforms
tools/power turbostat: Add support for RPL-S
PM: Improve EXPORT_*_DEV_PM_OPS macros
PM: domains: log failures to register always-on domains
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
"Just a few bug fixes this time"
* tag 'tpmdd-next-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
selftest: tpm2: Add Client.__del__() to close /dev/tpm* handle
security/keys: Remove inconsistent __user annotation
char: move from strlcpy with unused retval to strscpy
|
|
Pull bitmap updates from Yury Norov:
- Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES (Phil Auld)
- cleanup nr_cpu_ids vs nr_cpumask_bits mess (me)
This series cleans that mess and adds new config FORCE_NR_CPUS that
allows to optimize cpumask subsystem if the number of CPUs is known
at compile-time.
- optimize find_bit() functions (me)
Reworks find_bit() functions based on new FIND_{FIRST,NEXT}_BIT()
macros.
- add find_nth_bit() (me)
Adds find_nth_bit(), which is ~70 times faster than bitcounting with
for_each() loop:
for_each_set_bit(bit, mask, size)
if (n-- == 0)
return bit;
Also adds bitmap_weight_and() to let people replace this pattern:
tmp = bitmap_alloc(nbits);
bitmap_and(tmp, map1, map2, nbits);
weight = bitmap_weight(tmp, nbits);
bitmap_free(tmp);
with a single bitmap_weight_and() call.
- repair cpumask_check() (me)
After switching cpumask to use nr_cpu_ids, cpumask_check() started
generating many false-positive warnings. This series fixes it.
- Add for_each_cpu_andnot() and for_each_cpu_andnot() (Valentin
Schneider)
Extends the API with one more function and applies it in sched/core.
* tag 'bitmap-6.1-rc1' of https://github.com/norov/linux: (28 commits)
sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot()
lib/test_cpumask: Add for_each_cpu_and(not) tests
cpumask: Introduce for_each_cpu_andnot()
lib/find_bit: Introduce find_next_andnot_bit()
cpumask: fix checking valid cpu range
lib/bitmap: add tests for for_each() loops
lib/find: optimize for_each() macros
lib/bitmap: introduce for_each_set_bit_wrap() macro
lib/find_bit: add find_next{,_and}_bit_wrap
cpumask: switch for_each_cpu{,_not} to use for_each_bit()
net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
cpumask: add cpumask_nth_{,and,andnot}
lib/bitmap: remove bitmap_ord_to_pos
lib/bitmap: add tests for find_nth_bit()
lib: add find_nth{,_and,_andnot}_bit()
lib/bitmap: add bitmap_weight_and()
lib/bitmap: don't call __bitmap_weight() in kernel code
tools: sync find_bit() implementation
lib/find_bit: optimize find_next_bit() functions
lib/find_bit: create find_first_zero_bit_le()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
"Major changes:
- Changed location of tracing repo from personal git repo to:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
- Added Masami Hiramatsu as co-maintainer
- Updated MAINTAINERS file to separate out FTRACE as it is more than
just TRACING.
Minor changes:
- Added Mark Rutland as FTRACE reviewer
- Updated user_events to make it on its way to remove the BROKEN tag.
The changes should now be acceptable but will run it through a
cycle and hopefully we can remove the BROKEN tag next release.
- Added filtering to eprobes
- Added a delta time to the benchmark trace event
- Have the histogram and filter callbacks called via a switch
statement instead of indirect functions. This speeds it up to avoid
retpolines.
- Add a way to wake up ring buffer waiters waiting for the ring
buffer to fill up to its watermark.
- New ioctl() on the trace_pipe_raw file to wake up ring buffer
waiters.
- Wake up waiters when the ring buffer is disabled. A reader may
block when the ring buffer is disabled, but if it was blocked when
the ring buffer is disabled it should then wake up.
Fixes:
- Allow splice to read partially read ring buffer pages. This fixes
splice never moving forward.
- Fix inverted compare that made the "shortest" ring buffer wait
queue actually the longest.
- Fix a race in the ring buffer between resetting a page when a
writer goes to another page, and the reader.
- Fix ftrace accounting bug when function hooks are added at boot up
before the weak functions are set to "disabled".
- Fix bug that freed a user allocated snapshot buffer when enabling a
tracer.
- Fix possible recursive locks in osnoise tracer
- Fix recursive locking direct functions
- Other minor clean ups and fixes"
* tag 'trace-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (44 commits)
ftrace: Create separate entry in MAINTAINERS for function hooks
tracing: Update MAINTAINERS to reflect new tracing git repo
tracing: Do not free snapshot if tracer is on cmdline
ftrace: Still disable enabled records marked as disabled
tracing/user_events: Move pages/locks into groups to prepare for namespaces
tracing: Add Masami Hiramatsu as co-maintainer
tracing: Remove unused variable 'dups'
MAINTAINERS: add myself as a tracing reviewer
ring-buffer: Fix race between reset page and reading page
tracing/user_events: Update ABI documentation to align to bits vs bytes
tracing/user_events: Use bits vs bytes for enabled status page data
tracing/user_events: Use refcount instead of atomic for ref tracking
tracing/user_events: Ensure user provided strings are safely formatted
tracing/user_events: Use WRITE instead of READ for io vector import
tracing/user_events: Use NULL for strstr checks
tracing: Fix spelling mistake "preapre" -> "prepare"
tracing: Wake up waiters when tracing is disabled
tracing: Add ioctl() to force ring buffer waiters to wake up
tracing: Wake up ring buffer waiters on closing of the file
ring-buffer: Add ring_buffer_wake_waiters()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching updates from Petr Mladek:
- Fix race between fork and livepatch transition revert
- Add sysfs entry that shows "patched" state for each object (module)
that can be livepatched by the given livepatch
- Some clean up
* tag 'livepatching-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
selftests/livepatch: add sysfs test
livepatch: add sysfs entry "patched" for each klp_object
selftests/livepatch: normalize sysctl error message
livepatch: Add a missing newline character in klp_module_coming()
livepatch: fix race between fork and KLP transition
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- cpuset now support isolated cpus.partition type, which will enable
dynamic CPU isolation
- pids.peak added to remember the max number of pids used
- holes in cgroup namespace plugged
- internal cleanups
* tag 'cgroup-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (25 commits)
cgroup: use strscpy() is more robust and safer
iocost_monitor: reorder BlkgIterator
cgroup: simplify code in cgroup_apply_control
cgroup: Make cgroup_get_from_id() prettier
cgroup/cpuset: remove unreachable code
cgroup: Remove CFTYPE_PRESSURE
cgroup: Improve cftype add/rm error handling
kselftest/cgroup: Add cpuset v2 partition root state test
cgroup/cpuset: Update description of cpuset.cpus.partition in cgroup-v2.rst
cgroup/cpuset: Make partition invalid if cpumask change violates exclusivity rule
cgroup/cpuset: Relocate a code block in validate_change()
cgroup/cpuset: Show invalid partition reason string
cgroup/cpuset: Add a new isolated cpus.partition type
cgroup/cpuset: Relax constraints to partition & cpus changes
cgroup/cpuset: Allow no-task partition to have empty cpuset.cpus.effective
cgroup/cpuset: Miscellaneous cleanups & add helper functions
cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset
cgroup: add pids.peak interface for pids controller
cgroup: Remove data-race around cgrp_dfl_visible
cgroup: Fix build failure when CONFIG_SHRINKER_DEBUG
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Remove the "ANNOTATE_NOENDBR on ENDBR" warning: it's not really
useful and only found a non-bug false positive so far.
- Properly decode LOOP/LOOPE/LOOPNE, which were missing from the x86
decoder. Because these instructions are rather ineffective, they
never showed up in compiler output, but they are simple enough to
support, so add them for completeness.
- A bit more cross-arch preparatory work.
* tag 'objtool-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool,x86: Teach decode about LOOP* instructions
objtool: Remove "ANNOTATE_NOENDBR on ENDBR" warning
objtool: Use arch_jump_destination() in read_intra_function_calls()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- Disable preemption in rwsem_write_trylock()'s attempt to take the
rwsem, to avoid RT tasks hogging the CPU, which managed to preempt
this function after the owner has been cleared but before a new owner
is set. Also add debug checks to enforce this.
- Add __lockfunc to more slow path functions and add __sched to
semaphore functions.
- Mark spinlock APIs noinline when the respective CONFIG_INLINE_SPIN_*
toggles are disabled, to reduce LTO text size.
- Print more debug information when lockdep gets confused in
look_up_lock_class().
- Improve header file abuse checks.
- Misc cleanups
* tag 'locking-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Print more debug information - report name and key when look_up_lock_class() got confused
locking: Add __sched to semaphore functions
locking/rwsem: Disable preemption while trying for rwsem lock
locking: Detect includes rwlock.h outside of spinlock.h
locking: Add __lockfunc to slow path functions
locking/spinlocks: Mark spinlocks noinline when inline spinlocks are disabled
selftests: futex: Fix 'the the' typo in comment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Remove our now never-true definitions for pgd_huge() and p4d_leaf().
- Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit.
- Add support for syscall wrappers.
- Add support for KFENCE on 64-bit.
- Update 64-bit HV KVM to use the new guest state entry/exit accounting
API.
- Support execute-only memory when using the Radix MMU (P9 or later).
- Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests.
- Updates to our linker script to move more data into read-only
sections.
- Allow the VDSO to be randomised on 32-bit.
- Many other small features and fixes.
Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Christophe Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas,
Gaosheng Cui, Gustavo A. R. Silva, Haren Myneni, Hari Bathini, Jilin
Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent
Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan
Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali
Rohár, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool,
Shrikanth Hegde, Tyrel Datwyler, Wolfram Sang, ye xingchen, and Zheng
Yongjun.
* tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
KVM: PPC: Book3S HV: Fix stack frame regs marker
powerpc: Don't add __powerpc_ prefix to syscall entry points
powerpc/64s/interrupt: Fix stack frame regs marker
powerpc/64: Fix msr_check_and_set/clear MSR[EE] race
powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN
powerpc/pseries: Add firmware details to the hardware description
powerpc/powernv: Add opal details to the hardware description
powerpc: Add device-tree model to the hardware description
powerpc/64: Add logical PVR to the hardware description
powerpc: Add PVR & CPU name to hardware description
powerpc: Add hardware description string
powerpc/configs: Enable PPC_UV in powernv_defconfig
powerpc/configs: Update config files for removed/renamed symbols
powerpc/mm: Fix UBSAN warning reported on hugetlb
powerpc/mm: Always update max/min_low_pfn in mem_topology_setup()
powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range
powerpc: Drops STABS_DEBUG from linker scripts
powerpc/64s: Remove lost/old comment
powerpc/64s: Remove old STAB comment
powerpc: remove orphan systbl_chk.sh
...
|
|
Pull kvm updates from Paolo Bonzini:
"The first batch of KVM patches, mostly covering x86.
ARM:
- Account stage2 page table allocations in memory stats
x86:
- Account EPT/NPT arm64 page table allocations in memory stats
- Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR
accesses
- Drop eVMCS controls filtering for KVM on Hyper-V, all known
versions of Hyper-V now support eVMCS fields associated with
features that are enumerated to the guest
- Use KVM's sanitized VMCS config as the basis for the values of
nested VMX capabilities MSRs
- A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed a
longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed for
good
- A handful of fixes for memory leaks in error paths
- Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow
- Never write to memory from non-sleepable kvm_vcpu_check_block()
- Selftests refinements and cleanups
- Misc typo cleanups
Generic:
- remove KVM_REQ_UNHALT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
KVM: remove KVM_REQ_UNHALT
KVM: mips, x86: do not rely on KVM_REQ_UNHALT
KVM: x86: never write to memory from kvm_vcpu_check_block()
KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending
KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter
KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set
KVM: x86: lapic does not have to process INIT if it is blocked
KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific
KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed
KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
KVM: x86: make vendor code check for all nested events
mailmap: Update Oliver's email address
KVM: x86: Allow force_emulation_prefix to be written without a reload
KVM: selftests: Add an x86-only test to verify nested exception queueing
KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events()
KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior
KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the large set of char/misc and other small driver subsystem
changes for 6.1-rc1. Loads of different things in here:
- IIO driver updates, additions, and changes. Probably the largest
part of the diffstat
- habanalabs driver update with support for new hardware and
features, the second largest part of the diff.
- fpga subsystem driver updates and additions
- mhi subsystem updates
- Coresight driver updates
- gnss subsystem updates
- extcon driver updates
- icc subsystem updates
- fsi subsystem updates
- nvmem subsystem and driver updates
- misc driver updates
- speakup driver additions for new features
- lots of tiny driver updates and cleanups
All of these have been in the linux-next tree for a while with no
reported issues"
* tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (411 commits)
w1: Split memcpy() of struct cn_msg flexible array
spmi: pmic-arb: increase SPMI transaction timeout delay
spmi: pmic-arb: block access for invalid PMIC arbiter v5 SPMI writes
spmi: pmic-arb: correct duplicate APID to PPID mapping logic
spmi: pmic-arb: add support to dispatch interrupt based on IRQ status
spmi: pmic-arb: check apid against limits before calling irq handler
spmi: pmic-arb: do not ack and clear peripheral interrupts in cleanup_irq
spmi: pmic-arb: handle spurious interrupt
spmi: pmic-arb: add a print in cleanup_irq
drivers: spmi: Directly use ida_alloc()/free()
MAINTAINERS: add TI ECAP driver info
counter: ti-ecap-capture: capture driver support for ECAP
Documentation: ABI: sysfs-bus-counter: add frequency & num_overflows items
dt-bindings: counter: add ti,am62-ecap-capture.yaml
counter: Introduce the COUNTER_COMP_ARRAY component type
counter: Consolidate Counter extension sysfs attribute creation
counter: Introduce the Count capture component
counter: 104-quad-8: Add Signal polarity component
counter: Introduce the Signal polarity component
counter: interrupt-cnt: Implement watch_validate callback
...
|
|
Pull io_uring updates from Jens Axboe:
- Add supported for more directly managed task_work running.
This is beneficial for real world applications that end up issuing
lots of system calls as part of handling work. Normal task_work will
always execute as we transition in and out of the kernel, even for
"unrelated" system calls. It's more efficient to defer the handling
of io_uring's deferred work until the application wants it to be run,
generally in batches.
As part of ongoing work to write an io_uring network backend for
Thrift, this has been shown to greatly improve performance. (Dylan)
- Add IOPOLL support for passthrough (Kanchan)
- Improvements and fixes to the send zero-copy support (Pavel)
- Partial IO handling fixes (Pavel)
- CQE ordering fixes around CQ ring overflow (Pavel)
- Support sendto() for non-zc as well (Pavel)
- Support sendmsg for zerocopy (Pavel)
- Networking iov_iter fix (Stefan)
- Misc fixes and cleanups (Pavel, me)
* tag 'for-6.1/io_uring-2022-10-03' of git://git.kernel.dk/linux: (56 commits)
io_uring/net: fix notif cqe reordering
io_uring/net: don't update msg_name if not provided
io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL
io_uring/rw: defer fsnotify calls to task context
io_uring/net: fix fast_iov assignment in io_setup_async_msg()
io_uring/net: fix non-zc send with address
io_uring/net: don't skip notifs for failed requests
io_uring/rw: don't lose short results on io_setup_async_rw()
io_uring/rw: fix unexpected link breakage
io_uring/net: fix cleanup double free free_iov init
io_uring: fix CQE reordering
io_uring/net: fix UAF in io_sendrecv_fail()
selftest/net: adjust io_uring sendzc notif handling
io_uring: ensure local task_work marks task as running
io_uring/net: zerocopy sendmsg
io_uring/net: combine fail handlers
io_uring/net: rename io_sendzc()
io_uring/net: support non-zerocopy sendto
io_uring/net: refactor io_setup_async_addr
io_uring/net: don't lose partial send_zc on fail
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit updates from Shuah Khan:
"Several documentation fixes, UML related cleanups, and a feature to
enable/disable KUnit tests
This includes the change to rename all_test_uml.config, and use it for
'--alltests'. Note: if anyone was using all_tests_uml.config, this
change breaks them.
This change simplifies the usage and eliminates the need to type:
--kunitconfig=tools/testing/kunit/configs/all_tests_uml.config
A simple workaround to create a symlink to the new name can solve the
problem for anyone using all_tests_uml.config.
all_tests_uml.config should work across ~all architectures"
* tag 'linux-kselftest-kunit-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
Documentation: Kunit: Use full path to .kunitconfig
kunit: tool: rename all_test_uml.config, use it for --alltests
kunit: tool: remove UML specific options from all_tests_uml.config
lib: stackinit: update reference to kunit-tool
lib: overflow: update reference to kunit-tool
Documentation: KUnit: update links in the index page
Documentation: KUnit: add intro to the getting-started page
Documentation: KUnit: Reword start guide for selecting tests
Documentation: KUnit: add note about mrproper in start.rst
Documentation: KUnit: avoid repeating "kunit.py run" in start.rst
Documentation: KUnit: remove duplicated docs for kunit_tool
Documentation: Kunit: Add ref for other kinds of tests
Documentation: KUnit: Fix non-uml anchor
Documentation: Kunit: Fix inconsistent titles
Documentation: kunit: fix trivial typo
kunit: no longer call module_info(test, "Y") for kunit modules
kunit: add kunit.enable to enable/disable KUnit test
kunit: tool: make --raw_output=kunit (aka --raw_output) preserve leading spaces
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan:
"Fixes and new tests:
- Add an amd-pstate-ut test module, used by kselftest to unit test
amd-pstate functionality
- Fixes and cleanups to to cpu-hotplug to delete the fault injection
test code
- Improvements to vm test to use top_srcdir for builds"
* tag 'linux-kselftest-next-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
docs:kselftest: fix kselftest_module.h path of example module
cpufreq: amd-pstate: Add explanation for X86_AMD_PSTATE_UT
selftests/cpu-hotplug: Add log info when test success
selftests/cpu-hotplug: Reserve one cpu online at least
selftests/cpu-hotplug: Delete fault injection related code
selftests/cpu-hotplug: Use return instead of exit
selftests/cpu-hotplug: Correct log info
cpufreq: amd-pstate: modify type in argument 2 for filp_open
Documentation: amd-pstate: Add unit test introduction
selftests: amd-pstate: Add test trigger for amd-pstate driver
cpufreq: amd-pstate: Add test module for amd-pstate driver
cpufreq: amd-pstate: Expose struct amd_cpudata
selftests/vm: use top_srcdir instead of recomputing relative paths
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
vector granule register added to the user regs together with SVE perf
extensions documentation.
- SVE updates: add HWCAP for SVE EBF16, update the SVE ABI
documentation to match the actual kernel behaviour (zeroing the
registers on syscall rather than "zeroed or preserved" previously).
- More conversions to automatic system registers generation.
- vDSO: use self-synchronising virtual counter access in gettimeofday()
if the architecture supports it.
- arm64 stacktrace cleanups and improvements.
- arm64 atomics improvements: always inline assembly, remove LL/SC
trampolines.
- Improve the reporting of EL1 exceptions: rework BTI and FPAC
exception handling, better EL1 undefs reporting.
- Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
result.
- arm64 defconfig updates: build CoreSight as a module, enable options
necessary for docker, memory hotplug/hotremove, enable all PMUs
provided by Arm.
- arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
extensions).
- arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
unused function.
- kselftest updates for arm64: simple HWCAP validation, FP stress test
improvements, validation of ZA regs in signal handlers, include
larger SVE and SME vector lengths in signal tests, various cleanups.
- arm64 alternatives (code patching) improvements to robustness and
consistency: replace cpucap static branches with equivalent
alternatives, associate callback alternatives with a cpucap.
- Miscellaneous updates: optimise kprobe performance of patching
single-step slots, simplify uaccess_mask_ptr(), move MTE registers
initialisation to C, support huge vmalloc() mappings, run softirqs on
the per-CPU IRQ stack, compat (arm32) misalignment fixups for
multiword accesses.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
arm64/kprobe: Optimize the performance of patching single-step slot
arm64: defconfig: Add Coresight as module
kselftest/arm64: Handle EINTR while reading data from children
kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
kselftest/arm64: Don't repeat termination handler for fp-stress
ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
arm64/mm: fold check for KFENCE into can_set_direct_map()
arm64: ftrace: fix module PLTs with mcount
arm64: module: Remove unused plt_entry_is_initialized()
arm64: module: Make plt_equals_entry() static
arm64: fix the build with binutils 2.27
kselftest/arm64: Don't enable v8.5 for MTE selftest builds
arm64: uaccess: simplify uaccess_mask_ptr()
arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
kselftest/arm64: Fix typo in hwcap check
arm64: mte: move register initialization to C
arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
arm64/sve: Add Perf extensions documentation
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
- AMD Platform Management Framework (PMF) driver with AMT and QnQF
support
- AMD PMC: Improved logging for debugging s2idle issues
- Big refactor of the ACPI/x86 backlight handling, ensuring that we
only register 1 /sys/class/backlight device per LCD panel
- Microsoft Surface:
- Surface Laptop Go 2 support
- Surface Pro 8 HID sensor support
- Asus WMI:
- Lots of cleanups
- Support for TUF RGB keyboard backlight control
- Add support for ROG X13 tablet mode
- Siemens Simatic: IPC227G and IPC427G support
- Toshiba ACPI laptop driver: Fan hwmon and battery ECO mode support
- tools/power/x86/intel-speed-select: Various improvements
- Various cleanups
- Various small bugfixes
* tag 'platform-drivers-x86-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (153 commits)
platform/x86: use PLATFORM_DEVID_NONE instead of -1
platform/x86/amd: pmc: Dump idle mask during "check" stage instead
platform/x86/intel/wmi: thunderbolt: Use dev_groups callback
platform/x86/amd: pmc: remove CONFIG_DEBUG_FS checks
platform/surface: Split memcpy() of struct ssam_event flexible array
platform/x86: compal-laptop: Get rid of a few forward declarations
platform/x86: intel-uncore-freq: Use sysfs_emit() to instead of scnprintf()
platform/x86: dell-smbios-base: Use sysfs_emit()
platform/x86/amd/pmf: Remove unused power_delta instances
platform/x86/amd/pmf: install notify handler after acpi init
Documentation/ABI/testing/sysfs-amd-pmf: Add ABI doc for AMD PMF
platform/x86/amd/pmf: Add sysfs to toggle CnQF
platform/x86/amd/pmf: Add support for CnQF
platform/x86/amd: pmc: Fix build without debugfs
platform/x86: hp-wmi: Support touchpad on/off
platform/x86: int3472/discrete: Drop a forward declaration
platform/x86: toshiba_acpi: change turn_on_panel_on_resume to static
platform/x86: wmi: Drop forward declaration of static functions
platform/x86: toshiba_acpi: Remove duplicate include
platform/x86: msi-laptop: Change DMI match / alias strings to fix module autoloading
...
|
|
This kernel module is used for testing. It's safe to say M here.
It can also be built-in without X86_AMD_PSTATE enabled.
Currently, only tests for amd-pstate are supported. If X86_AMD_PSTATE
is set disabled, it can tell the users test can only run on amd-pstate
driver, please set X86_AMD_PSTATE enabled.
In the future, comparison tests will be added. It can set amd-pstate
disabled and set acpi-cpufreq enabled to run test cases, then compare
the test results.
Suggested-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Meng Li <li.meng@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Add log information when run full test successfully.
Signed-off-by: Zhao Gongyi <zhaogongyi@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Considering that we can not offline all cpus in any cases,
we need to reserve one cpu online when the test offline all
hotpluggable online cpus, otherwise the test will fail forever.
Fixes: d89dffa976bc ("fault-injection: add selftests for cpu and memory hotplug")
Signed-off-by: Zhao Gongyi <zhaogongyi@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Delete fault injection related code since the module has been deleted.
Signed-off-by: Zhao Gongyi <zhaogongyi@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Some cpus will be left in offline state when online
function exits in some error conditions. Use return
instead of exit to fix it.
Signed-off-by: Zhao Gongyi <zhaogongyi@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Correct the log info to match the test.
Signed-off-by: Zhao Gongyi <zhaogongyi@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Add amd-pstate test trigger in kselftest, it will load/unload
amd-pstate-ut module to test some cases etc.
Signed-off-by: Meng Li <li.meng@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
In various places both in t/t/s/v/Makefile as well as some of the test
sources, we were referring to headers or directories using some fairly
long relative paths.
Since we have a working top_srcdir variable though, which refers to the
root of the kernel tree, we can clean up all of these "up and over"
relative paths, just relying on the single variable instead.
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat changes for 6.1-rc1 from Len Brown:
"Add support for two new platforms, and two bug fixes on existing
platforms."
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2022.10.04
tools/power turbostat: Use standard Energy Unit for SPR Dram RAPL domain
tools/power turbostat: Do not dump TRL if turbo is not supported
tools/power turbostat: Add support for MeteorLake platforms
tools/power turbostat: Add support for RPL-S
|
|
|
|
The following output can bee seen when the test is executed:
test_flush_context (tpm2_tests.SpaceTest) ... \
/usr/lib64/python3.6/unittest/case.py:605: ResourceWarning: \
unclosed file <_io.FileIO name='/dev/tpmrm0' mode='rb+' closefd=True>
An instance of Client does not implicitly close /dev/tpm* handle, once it
gets destroyed. Close the file handle in the class destructor
Client.__del__().
Fixes: 6ea3dfe1e0732 ("selftests: add TPM 2.0 tests")
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Intel Xeon servers used to use a fixed energy resolution (15.3uj) for
Dram RAPL domain. But on SPR, Dram RAPL domain follows the standard
energy resolution as described in MSR_RAPL_POWER_UNIT.
Remove the SPR rapl_dram_energy_units quirk.
Fixes: e7af1ed3fa47 ("tools/power turbostat: Support additional CPU model numbers")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wang Wendy <wendy.wang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Do not dump turbo ratio limits if platform does not support turbo, because it
is confusing and the TRL MSRs may even include misleading information. And they
are not supposed to be relied on if turbo is not supported.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add turbostat support for MeteorLake platforms, which behave the same
as RaptorLake platforms.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add turbostat support for RAPTORLAKE_S platform, which behaves the same
as RAPTORLAKE and RAPTORLAKE_P platforms.
RPL-S 601/801 have different CPU ID than the Hybrid ADL-S platforms.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Introduce and use a single page frag cache for allocating small skb
heads, clawing back the 10-20% performance regression in UDP flood
test from previous fixes.
- Run packets which already went thru HW coalescing thru SW GRO. This
significantly improves TCP segment coalescing and simplifies
deployments as different workloads benefit from HW or SW GRO.
- Shrink the size of the base zero-copy send structure.
- Move TCP init under a new slow / sleepable version of DO_ONCE().
BPF:
- Add BPF-specific, any-context-safe memory allocator.
- Add helpers/kfuncs for PKCS#7 signature verification from BPF
programs.
- Define a new map type and related helpers for user space -> kernel
communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF).
- Allow targeting BPF iterators to loop through resources of one
task/thread.
- Add ability to call selected destructive functions. Expose
crash_kexec() to allow BPF to trigger a kernel dump. Use
CAP_SYS_BOOT check on the loading process to judge permissions.
- Enable BPF to collect custom hierarchical cgroup stats efficiently
by integrating with the rstat framework.
- Support struct arguments for trampoline based programs. Only
structs with size <= 16B and x86 are supported.
- Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping
sockets (instead of just TCP and UDP sockets).
- Add a helper for accessing CLOCK_TAI for time sensitive network
related programs.
- Support accessing network tunnel metadata's flags.
- Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open.
- Add support for writing to Netfilter's nf_conn:mark.
Protocols:
- WiFi: more Extremely High Throughput (EHT) and Multi-Link Operation
(MLO) work (802.11be, WiFi 7).
- vsock: improve support for SO_RCVLOWAT.
- SMC: support SO_REUSEPORT.
- Netlink: define and document how to use netlink in a "modern" way.
Support reporting missing attributes via extended ACK.
- IPSec: support collect metadata mode for xfrm interfaces.
- TCPv6: send consistent autoflowlabel in SYN_RECV state and RST
packets.
- TCP: introduce optional per-netns connection hash table to allow
better isolation between namespaces (opt-in, at the cost of memory
and cache pressure).
- MPTCP: support TCP_FASTOPEN_CONNECT.
- Add NEXT-C-SID support in Segment Routing (SRv6) End behavior.
- Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets.
- Open vSwitch:
- Allow specifying ifindex of new interfaces.
- Allow conntrack and metering in non-initial user namespace.
- TLS: support the Korean ARIA-GCM crypto algorithm.
- Remove DECnet support.
Driver API:
- Allow selecting the conduit interface used by each port in DSA
switches, at runtime.
- Ethernet Power Sourcing Equipment and Power Device support.
- Add tc-taprio support for queueMaxSDU parameter, i.e. setting per
traffic class max frame size for time-based packet schedules.
- Support PHY rate matching - adapting between differing host-side
and link-side speeds.
- Introduce QUSGMII PHY mode and 1000BASE-KX interface mode.
- Validate OF (device tree) nodes for DSA shared ports; make
phylink-related properties mandatory on DSA and CPU ports.
Enforcing more uniformity should allow transitioning to phylink.
- Require that flash component name used during update matches one of
the components for which version is reported by info_get().
- Remove "weight" argument from driver-facing NAPI API as much as
possible. It's one of those magic knobs which seemed like a good
idea at the time but is too indirect to use in practice.
- Support offload of TLS connections with 256 bit keys.
New hardware / drivers:
- Ethernet:
- Microchip KSZ9896 6-port Gigabit Ethernet Switch
- Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs
- Analog Devices ADIN1110 and ADIN2111 industrial single pair
Ethernet (10BASE-T1L) MAC+PHY.
- Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP).
- Ethernet SFPs / modules:
- RollBall / Hilink / Turris 10G copper SFPs
- HALNy GPON module
- WiFi:
- CYW43439 SDIO chipset (brcmfmac)
- CYW89459 PCIe chipset (brcmfmac)
- BCM4378 on Apple platforms (brcmfmac)
Drivers:
- CAN:
- gs_usb: HW timestamp support
- Ethernet PHYs:
- lan8814: cable diagnostics
- Ethernet NICs:
- Intel (100G):
- implement control of FCS/CRC stripping
- port splitting via devlink
- L2TPv3 filtering offload
- nVidia/Mellanox:
- tunnel offload for sub-functions
- MACSec offload, w/ Extended packet number and replay window
offload
- significantly restructure, and optimize the AF_XDP support,
align the behavior with other vendors
- Huawei:
- configuring DSCP map for traffic class selection
- querying standard FEC statistics
- querying SerDes lane number via ethtool
- Marvell/Cavium:
- egress priority flow control
- MACSec offload
- AMD/SolarFlare:
- PTP over IPv6 and raw Ethernet
- small / embedded:
- ax88772: convert to phylink (to support SFP cages)
- altera: tse: convert to phylink
- ftgmac100: support fixed link
- enetc: standard Ethtool counters
- macb: ZynqMP SGMII dynamic configuration support
- tsnep: support multi-queue and use page pool
- lan743x: Rx IP & TCP checksum offload
- igc: add xdp frags support to ndo_xdp_xmit
- Ethernet high-speed switches:
- Marvell (prestera):
- support SPAN port features (traffic mirroring)
- nexthop object offloading
- Microchip (sparx5):
- multicast forwarding offload
- QoS queuing offload (tc-mqprio, tc-tbf, tc-ets)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- support RGMII cmode
- NXP (felix):
- standardized ethtool counters
- Microchip (lan966x):
- QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets)
- traffic policing and mirroring
- link aggregation / bonding offload
- QUSGMII PHY mode support
- Qualcomm 802.11ax WiFi (ath11k):
- cold boot calibration support on WCN6750
- support to connect to a non-transmit MBSSID AP profile
- enable remain-on-channel support on WCN6750
- Wake-on-WLAN support for WCN6750
- support to provide transmit power from firmware via nl80211
- support to get power save duration for each client
- spectral scan support for 160 MHz
- MediaTek WiFi (mt76):
- WiFi-to-Ethernet bridging offload for MT7986 chips
- RealTek WiFi (rtw89):
- P2P support"
* tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1864 commits)
eth: pse: add missing static inlines
once: rename _SLOW to _SLEEPABLE
net: pse-pd: add regulator based PSE driver
dt-bindings: net: pse-dt: add bindings for regulator based PoDL PSE controller
ethtool: add interface to interact with Ethernet Power Equipment
net: mdiobus: search for PSE nodes by parsing PHY nodes.
net: mdiobus: fwnode_mdiobus_register_phy() rework error handling
net: add framework to support Ethernet PSE and PDs devices
dt-bindings: net: phy: add PoDL PSE property
net: marvell: prestera: Propagate nh state from hw to kernel
net: marvell: prestera: Add neighbour cache accounting
net: marvell: prestera: add stub handler neighbour events
net: marvell: prestera: Add heplers to interact with fib_notifier_info
net: marvell: prestera: Add length macros for prestera_ip_addr
net: marvell: prestera: add delayed wq and flush wq on deinit
net: marvell: prestera: Add strict cleanup of fib arbiter
net: marvell: prestera: Add cleanup of allocated fib_nodes
net: marvell: prestera: Add router nexthops ABI
eth: octeon: fix build after netif_napi_add() changes
net/mlx5: E-Switch, Return EBUSY if can't get mode lock
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Borislav Petkov:
- Print the CPU number at segfault time.
The number printed is not always accurate (preemption is enabled at
that time) but the print string contains "likely" and after a lot of
back'n'forth on this, this was the consensus that was reached. See
thread at [1].
- After a *lot* of testing and polishing, finally the clear_user()
improvements to inline REP; STOSB by default
Link: https://lore.kernel.org/r/5d62c1d0-7425-d5bb-ecb5-1dc3b4d7d245@intel.com [1]
* tag 'x86_cpu_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Print likely CPU at segfault time
x86/clear_user: Make it faster
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull LSM updates from Paul Moore:
"Seven patches for the LSM layer and we've got a mix of trivial and
significant patches. Highlights below, starting with the smaller bits
first so they don't get lost in the discussion of the larger items:
- Remove some redundant NULL pointer checks in the common LSM audit
code.
- Ratelimit the lockdown LSM's access denial messages.
With this change there is a chance that the last visible lockdown
message on the console is outdated/old, but it does help preserve
the initial series of lockdown denials that started the denial
message flood and my gut feeling is that these might be the more
valuable messages.
- Open userfaultfds as readonly instead of read/write.
While this code obviously lives outside the LSM, it does have a
noticeable impact on the LSMs with Ondrej explaining the situation
in the commit description. It is worth noting that this patch
languished on the VFS list for over a year without any comments
(objections or otherwise) so I took the liberty of pulling it into
the LSM tree after giving fair notice. It has been in linux-next
since the end of August without any noticeable problems.
- Add a LSM hook for user namespace creation, with implementations
for both the BPF LSM and SELinux.
Even though the changes are fairly small, this is the bulk of the
diffstat as we are also including BPF LSM selftests for the new
hook.
It's also the most contentious of the changes in this pull request
with Eric Biederman NACK'ing the LSM hook multiple times during its
development and discussion upstream. While I've never taken NACK's
lightly, I'm sending these patches to you because it is my belief
that they are of good quality, satisfy a long-standing need of
users and distros, and are in keeping with the existing nature of
the LSM layer and the Linux Kernel as a whole.
The patches in implement a LSM hook for user namespace creation
that allows for a granular approach, configurable at runtime, which
enables both monitoring and control of user namespaces. The general
consensus has been that this is far preferable to the other
solutions that have been adopted downstream including outright
removal from the kernel, disabling via system wide sysctls, or
various other out-of-tree mechanisms that users have been forced to
adopt since we haven't been able to provide them an upstream
solution for their requests. Eric has been steadfast in his
objections to this LSM hook, explaining that any restrictions on
the user namespace could have significant impact on userspace.
While there is the possibility of impacting userspace, it is
important to note that this solution only impacts userspace when it
is requested based on the runtime configuration supplied by the
distro/admin/user. Frederick (the pathset author), the LSM/security
community, and myself have tried to work with Eric during
development of this patchset to find a mutually acceptable
solution, but Eric's approach and unwillingness to engage in a
meaningful way have made this impossible. I have CC'd Eric directly
on this pull request so he has a chance to provide his side of the
story; there have been no objections outside of Eric's"
* tag 'lsm-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lockdown: ratelimit denial messages
userfaultfd: open userfaultfds with O_RDONLY
selinux: Implement userns_create hook
selftests/bpf: Add tests verifying bpf lsm userns_create hook
bpf-lsm: Make bpf_lsm_userns_create() sleepable
security, lsm: Introduce security_create_user_ns()
lsm: clean up redundant NULL pointer check
|
|
Merge in the left-over fixes before the net-next pull-request.
Conflicts:
drivers/net/ethernet/mediatek/mtk_ppe.c
ae3ed15da588 ("net: ethernet: mtk_eth_soc: fix state in __mtk_foe_entry_clear")
9d8cb4c096ab ("net: ethernet: mtk_eth_soc: add foe_entry_size to mtk_eth_soc")
https://lore.kernel.org/all/6cb6893b-4921-a068-4c30-1109795110bb@tessares.net/
kernel/bpf/helpers.c
8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF")
5679ff2f138f ("bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF")
8a67f2de9b1d ("bpf: expose bpf_strtol and bpf_strtoul to all program types")
https://lore.kernel.org/all/20221003201957.13149-1-daniel@iogearbox.net/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening updates from Kees Cook:
"Most of the collected changes here are fixes across the tree for
various hardening features (details noted below).
The most notable new feature here is the addition of the memcpy()
overflow warning (under CONFIG_FORTIFY_SOURCE), which is the next step
on the path to killing the common class of "trivially detectable"
buffer overflow conditions (i.e. on arrays with sizes known at compile
time) that have resulted in many exploitable vulnerabilities over the
years (e.g. BleedingTooth).
This feature is expected to still have some undiscovered false
positives. It's been in -next for a full development cycle and all the
reported false positives have been fixed in their respective trees.
All the known-bad code patterns we could find with Coccinelle are also
either fixed in their respective trees or in flight.
The commit message in commit 54d9469bc515 ("fortify: Add run-time WARN
for cross-field memcpy()") for the feature has extensive details, but
I'll repeat here that this is a warning _only_, and is not intended to
actually block overflows (yet). The many patches fixing array sizes
and struct members have been landing for several years now, and we're
finally able to turn this on to find any remaining stragglers.
Summary:
Various fixes across several hardening areas:
- loadpin: Fix verity target enforcement (Matthias Kaehlcke).
- zero-call-used-regs: Add missing clobbers in paravirt (Bill
Wendling).
- CFI: clean up sparc function pointer type mismatches (Bart Van
Assche).
- Clang: Adjust compiler flag detection for various Clang changes
(Sami Tolvanen, Kees Cook).
- fortify: Fix warnings in arch-specific code in sh, ARM, and xen.
Improvements to existing features:
- testing: improve overflow KUnit test, introduce fortify KUnit test,
add more coverage to LKDTM tests (Bart Van Assche, Kees Cook).
- overflow: Relax overflow type checking for wider utility.
New features:
- string: Introduce strtomem() and strtomem_pad() to fill a gap in
strncpy() replacement needs.
- um: Enable FORTIFY_SOURCE support.
- fortify: Enable run-time struct member memcpy() overflow warning"
* tag 'hardening-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (27 commits)
Makefile.extrawarn: Move -Wcast-function-type-strict to W=1
hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero
sparc: Unbreak the build
x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
x86/paravirt: clean up typos and grammaros
fortify: Convert to struct vs member helpers
fortify: Explicitly check bounds are compile-time constants
x86/entry: Work around Clang __bdos() bug
ARM: decompressor: Include .data.rel.ro.local
fortify: Adjust KUnit test for modular build
sh: machvec: Use char[] for section boundaries
kunit/memcpy: Avoid pathological compile-time string size
lib: Improve the is_signed_type() kunit test
LoadPin: Require file with verity root digests to have a header
dm: verity-loadpin: Only trust verity targets with enforcement
LoadPin: Fix Kconfig doc about format of file with verity digests
um: Enable FORTIFY_SOURCE
lkdtm: Update tests for memcpy() run-time warnings
fortify: Add run-time WARN for cross-field memcpy()
fortify: Use SIZE_MAX instead of (size_t)-1
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kcfi updates from Kees Cook:
"This replaces the prior support for Clang's standard Control Flow
Integrity (CFI) instrumentation, which has required a lot of special
conditions (e.g. LTO) and work-arounds.
The new implementation ("Kernel CFI") is specific to C, directly
designed for the Linux kernel, and takes advantage of architectural
features like x86's IBT. This series retains arm64 support and adds
x86 support.
GCC support is expected in the future[1], and additional "generic"
architectural support is expected soon[2].
Summary:
- treewide: Remove old CFI support details
- arm64: Replace Clang CFI support with Clang KCFI support
- x86: Introduce Clang KCFI support"
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [1]
Link: https://github.com/samitolvanen/llvm-project/commits/kcfi_generic [2]
* tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
x86: Add support for CONFIG_CFI_CLANG
x86/purgatory: Disable CFI
x86: Add types to indirectly called assembly functions
x86/tools/relocs: Ignore __kcfi_typeid_ relocations
kallsyms: Drop CONFIG_CFI_CLANG workarounds
objtool: Disable CFI warnings
objtool: Preserve special st_shndx indexes in elf_update_symbol
treewide: Drop __cficanonical
treewide: Drop WARN_ON_FUNCTION_MISMATCH
treewide: Drop function_nocfi
init: Drop __nocfi from __init
arm64: Drop unneeded __nocfi attributes
arm64: Add CFI error handling
arm64: Add types to indirect called assembly functions
psci: Fix the function type for psci_initcall_t
lkdtm: Emit an indirect call for CFI tests
cfi: Add type helper macros
cfi: Switch to -fsanitize=kcfi
cfi: Drop __CFI_ADDRESSABLE
cfi: Remove CONFIG_CFI_CLANG_SHADOW
...
|
|
Pull Rust introductory support from Kees Cook:
"The tree has a recent base, but has fundamentally been in linux-next
for a year and a half[1]. It's been updated based on feedback from the
Kernel Maintainer's Summit, and to gain recent Reviewed-by: tags.
Miguel is the primary maintainer, with me helping where needed/wanted.
Our plan is for the tree to switch to the standard non-rebasing
practice once this initial infrastructure series lands.
The contents are the absolute minimum to get Rust code building in the
kernel, with many more interfaces[2] (and drivers - NVMe[3], 9p[4], M1
GPU[5]) on the way.
The initial support of Rust-for-Linux comes in roughly 4 areas:
- Kernel internals (kallsyms expansion for Rust symbols, %pA format)
- Kbuild infrastructure (Rust build rules and support scripts)
- Rust crates and bindings for initial minimum viable build
- Rust kernel documentation and samples
Rust support has been in linux-next for a year and a half now, and the
short log doesn't do justice to the number of people who have
contributed both to the Linux kernel side but also to the upstream
Rust side to support the kernel's needs. Thanks to these 173 people,
and many more, who have been involved in all kinds of ways:
Miguel Ojeda, Wedson Almeida Filho, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Adam Bratschi-Kaye, Benno Lossin,
Maciej Falkowski, Finn Behrens, Sven Van Asbroeck, Asahi Lina, FUJITA
Tomonori, John Baublitz, Wei Liu, Geoffrey Thomas, Philip Herron,
Arthur Cohen, David Faust, Antoni Boucher, Philip Li, Yujie Liu,
Jonathan Corbet, Greg Kroah-Hartman, Paul E. McKenney, Josh Triplett,
Kent Overstreet, David Gow, Alice Ryhl, Robin Randhawa, Kees Cook,
Nick Desaulniers, Matthew Wilcox, Linus Walleij, Joe Perches, Michael
Ellerman, Petr Mladek, Masahiro Yamada, Arnaldo Carvalho de Melo,
Andrii Nakryiko, Konstantin Shelekhin, Rasmus Villemoes, Konstantin
Ryabitsev, Stephen Rothwell, Andy Shevchenko, Sergey Senozhatsky, John
Paul Adrian Glaubitz, David Laight, Nathan Chancellor, Jonathan
Cameron, Daniel Latypov, Shuah Khan, Brendan Higgins, Julia Lawall,
Laurent Pinchart, Geert Uytterhoeven, Akira Yokosawa, Pavel Machek,
David S. Miller, John Hawley, James Bottomley, Arnd Bergmann,
Christian Brauner, Dan Robertson, Nicholas Piggin, Zhouyi Zhou, Elena
Zannoni, Jose E. Marchesi, Leon Romanovsky, Will Deacon, Richard
Weinberger, Randy Dunlap, Paolo Bonzini, Roland Dreier, Mark Brown,
Sasha Levin, Ted Ts'o, Steven Rostedt, Jarkko Sakkinen, Michal
Kubecek, Marco Elver, Al Viro, Keith Busch, Johannes Berg, Jan Kara,
David Sterba, Connor Kuehl, Andy Lutomirski, Andrew Lunn, Alexandre
Belloni, Peter Zijlstra, Russell King, Eric W. Biederman, Willy
Tarreau, Christoph Hellwig, Emilio Cobos Álvarez, Christian Poveda,
Mark Rousskov, John Ericson, TennyZhuang, Xuanwo, Daniel Paoliello,
Manish Goregaokar, comex, Josh Stone, Stephan Sokolow, Philipp Krones,
Guillaume Gomez, Joshua Nelson, Mats Larsen, Marc Poulhiès, Samantha
Miller, Esteban Blanc, Martin Schmidt, Martin Rodriguez Reboredo,
Daniel Xu, Viresh Kumar, Bartosz Golaszewski, Vegard Nossum, Milan
Landaverde, Dariusz Sosnowski, Yuki Okushi, Matthew Bakhtiari, Wu
XiangCheng, Tiago Lam, Boris-Chengbiao Zhou, Sumera Priyadarsini,
Viktor Garske, Niklas Mohrin, Nándor István Krácser, Morgan Bartlett,
Miguel Cano, Léo Lanteri Thauvin, Julian Merkle, Andreas Reindl,
Jiapeng Chong, Fox Chen, Douglas Su, Antonio Terceiro, SeongJae Park,
Sergio González Collado, Ngo Iok Ui (Wu Yu Wei), Joshua Abraham,
Milan, Daniel Kolsoi, ahomescu, Manas, Luis Gerhorst, Li Hongyu,
Philipp Gesang, Russell Currey, Jalil David Salamé Messina, Jon Olson,
Raghvender, Angelos, Kaviraj Kanagaraj, Paul Römer, Sladyn Nunes,
Mauro Baladés, Hsiang-Cheng Yang, Abhik Jain, Hongyu Li, Sean Nash,
Yuheng Su, Peng Hao, Anhad Singh, Roel Kluin, Sara Saa, Geert
Stappers, Garrett LeSage, IFo Hancroft, and Linus Torvalds"
Link: https://lwn.net/Articles/849849/ [1]
Link: https://github.com/Rust-for-Linux/linux/commits/rust [2]
Link: https://github.com/metaspace/rust-linux/commit/d88c3744d6cbdf11767e08bad56cbfb67c4c96d0 [3]
Link: https://github.com/wedsonaf/linux/commit/9367032607f7670de0ba1537cf09ab0f4365a338 [4]
Link: https://github.com/AsahiLinux/linux/commits/gpu/rust-wip [5]
* tag 'rust-v6.1-rc1' of https://github.com/Rust-for-Linux/linux: (27 commits)
MAINTAINERS: Rust
samples: add first Rust examples
x86: enable initial Rust support
docs: add Rust documentation
Kbuild: add Rust support
rust: add `.rustfmt.toml`
scripts: add `is_rust_module.sh`
scripts: add `rust_is_available.sh`
scripts: add `generate_rust_target.rs`
scripts: add `generate_rust_analyzer.py`
scripts: decode_stacktrace: demangle Rust symbols
scripts: checkpatch: enable language-independent checks for Rust
scripts: checkpatch: diagnose uses of `%pA` in the C side as errors
vsprintf: add new `%pA` format specifier
rust: export generated symbols
rust: add `kernel` crate
rust: add `bindings` crate
rust: add `macros` crate
rust: add `compiler_builtins` crate
rust: adapt `alloc` crate to the kernel
...
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2022-10-03
We've added 10 non-merge commits during the last 23 day(s) which contain
a total of 14 files changed, 130 insertions(+), 69 deletions(-).
The main changes are:
1) Fix dynptr helper API to gate behind CAP_BPF given it was not intended
for unprivileged BPF programs, from Kumar Kartikeya Dwivedi.
2) Fix need_wakeup flag inheritance from umem buffer pool for shared xsk
sockets, from Jalal Mostafa.
3) Fix truncated last_member_type_id in btf_struct_resolve() which had a
wrong storage type, from Lorenz Bauer.
4) Fix xsk back-pressure mechanism on tx when amount of produced
descriptors to CQ is lower than what was grabbed from xsk tx ring,
from Maciej Fijalkowski.
5) Fix wrong cgroup attach flags being displayed to effective progs,
from Pu Lehui.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
xsk: Inherit need_wakeup flag for shared sockets
bpf: Gate dynptr API behind CAP_BPF
selftests/bpf: Adapt cgroup effective query uapi change
bpftool: Fix wrong cgroup attach flags being assigned to effective progs
bpf, cgroup: Reject prog_attach_flags array when effective query
bpf: Ensure correct locking around vulnerable function find_vpid()
bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
selftests/xsk: Add missing close() on netns fd
xsk: Fix backpressure mechanism on Tx
MAINTAINERS: Add include/linux/tnum.h to BPF CORE
====================
Link: https://lore.kernel.org/r/20221003201957.13149-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since 2d1c498072de ("mm: memcontrol: make swap tracking an integral part
of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config
option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP.
Update the sites accordingly and drop the symbol.
[ While touching the docs, remove two references to CONFIG_MEMCG_KMEM,
which hasn't been a user-visible symbol for over half a decade. ]
Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add :collapse mod to userfaultfd selftest. Currently this mod is only
valid for "shmem" test type, but could be used for other test types.
When provided, memory allocated by ->allocate_area() will be
hugepage-aligned enforced to be hugepage-sized. userfaultf_minor_test,
after the UFFD-registered mapping has been populated by UUFD minor fault
handler, attempt to MADV_COLLAPSE the UFFD-registered mapping to collapse
the memory into a pmd-mapped THP.
This test is meant to be a functional test of what occurs during
UFFD-driven live migration of VMs backed by huge tmpfs where, after a
hugepage-sized region has been successfully migrated (in native page-sized
chunks, to avoid latency of fetched a hugepage over the network), we want
to reclaim previous VM performance by remapping it at the PMD level.
Link: https://lkml.kernel.org/r/20220907144521.3115321-11-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-11-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This test tests that MADV_COLLAPSE acting on file/shmem memory for which
(1) the file extent mapping by the memory is already a huge page in the
page cache, and (2) the pmd mapping this memory in the target process is
none.
In practice, (1)+(2) is the state left over after khugepaged has
successfully collapsed file/shmem memory for a target VMA, but the memory
has not yet been refaulted. So, this test in-effect tests MADV_COLLAPSE
racing with khugepaged to collapse the memory first.
Link: https://lkml.kernel.org/r/20220907144521.3115321-10-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-10-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add memory operations for shmem (memfd) memory, and reuse existing tests
with the new memory operations.
Shmem tests can be called with "shmem" mem_type, and shmem tests are ran
with "all" mem_type as well.
Link: https://lkml.kernel.org/r/20220907144521.3115321-9-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-9-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add memory operations for file-backed and tmpfs memory. Call existing
tests with these new memory operations to test collapse functionality of
khugepaged and MADV_COLLAPSE on file-backed and tmpfs memory. Not all
tests are reusable; for example, collapse_swapin_single_pte() which checks
swap usage.
Refactor test arguments. Usage is now:
Usage: ./khugepaged <test type> [dir]
<test type> : <context>:<mem_type>
<context> : [all|khugepaged|madvise]
<mem_type> : [all|anon|file]
"file,all" mem_type requires [dir] argument
"file,all" mem_type requires kernel built with
CONFIG_READ_ONLY_THP_FOR_FS=y
if [dir] is a (sub)directory of a tmpfs mount, tmpfs must be
mounted with huge=madvise option for khugepaged tests to work
Refactor calling tests to make it clear what collapse context / memory
operations they support, but only invoke tests requested by user. Also
log what test is being ran, and with what context / memory, to make test
logs more human readable.
A new test file is created and deleted for every test to ensure no pages
remain in the page cache between tests (tests also may attempt to collapse
different amount of memory).
For file-backed memory where the file is stored on a block device, disable
/sys/block/<device>/queue/read_ahead_kb so that pages don't find their way
into the page cache without the tests faulting them in.
Add file and shmem wrappers to vm_utils check for file and shmem hugepages
in smaps.
[zokeefe@google.com: fix "add thp collapse file and tmpfs testing" for
tmpfs]
Link: https://lkml.kernel.org/r/20220913212517.3163701-1-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220907144521.3115321-8-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-8-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Modularize operations to setup, cleanup, fault, and check for huge pages,
for a given memory type. This allows reusing existing tests with
additional memory types by defining new memory operations. Following
patches will add file and shmem memory types.
Link: https://lkml.kernel.org/r/20220907144521.3115321-7-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-7-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
These files:
tools/testing/selftests/vm/vm_util.c
tools/testing/selftests/vm/khugepaged.c
Both contain logic to:
1) Determine hugepage size on current system
2) Read /proc/self/smaps to determine number of THPs at an address
Refactor selftests/vm/khugepaged.c to use the vm_util common helpers and
add it as a build dependency.
Since selftests/vm/khugepaged.c is the largest user of check_huge(),
change the signature of check_huge() to match selftests/vm/khugepaged.c's
useage: take an expected number of hugepages, and return a bool indicating
if the correct number of hugepages were found. Add a wrapper,
check_huge_anon(), in anticipation of checking smaps for file and shmem
hugepages.
Update existing callsites to use the new pattern / function.
Likewise, check_for_pattern() was duplicated, and it's a general enough
helper to include in vm_util helpers as well.
Link: https://lkml.kernel.org/r/20220907144521.3115321-6-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-6-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
MADV_COLLAPSE is a best-effort request that will set errno to an
actionable value if the request cannot be performed.
For example, if pages are not found on the LRU, or if they are currently
locked by something else, MADV_COLLAPSE will fail and set errno to EAGAIN
to inform callers that they may try again.
Since the khugepaged selftest is the first public use of MADV_COLLAPSE,
set a best practice of checking errno and retrying on EAGAIN.
Link: https://lkml.kernel.org/r/20220922184651.1016461-2-zokeefe@google.com
Fixes: 9330694de59f ("selftests/vm: add MADV_COLLAPSE collapse context to selftests")
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
KMSAN inserts API function calls in a lot of places (function entries and
exits, local variables, memory accesses), so they may get called from the
uaccess regions as well.
KMSAN API functions are used to update the metadata (shadow/origin pages)
for kernel memory accesses. The metadata pages for kernel pointers are
also located in the kernel memory, so touching them is not a problem. For
userspace pointers, no metadata is allocated.
If an API function is supposed to read or modify the metadata, it does so
for kernel pointers and ignores userspace pointers. If an API function is
supposed to return a pair of metadata pointers for the instrumentation to
use (like all __msan_metadata_ptr_for_TYPE_SIZE() functions do), it
returns the allocated metadata for kernel pointers and special dummy
buffers residing in the kernel memory for userspace pointers.
As a result, none of KMSAN API functions perform userspace accesses, but
since they might be called from UACCESS regions they use
user_access_save/restore().
Link: https://lkml.kernel.org/r/20220915150417.722975-32-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|