Age | Commit message (Collapse) | Author |
|
The zbud compressed pages allocator is rarely used, most users use
zsmalloc. zbud consumes much more memory (only stores 1 or 2 compressed
pages per physical page). The only advantage of zbud is a marginal
performance improvement that by no means justify the memory overhead.
Historically, zsmalloc had significantly worse latency than zbud and
z3fold but offered better memory savings. This is no longer the case as
shown by a simple recent analysis [1]. In a kernel build test on tmpfs in
a limited cgroup, zbud 2-3% less time than zsmalloc, but at the cost of
using ~32% more memory (1.5G vs 1.13G). The tradeoff does not make sense
for zbud in any practical scenario.
The only alleged advantage of zbud is not having the dependency on
CONFIG_MMU, but CONFIG_SWAP already depends on CONFIG_MMU anyway, and zbud
is only used by zswap.
Remove zbud after z3fold's removal, leaving zsmalloc as the one and only
zpool allocator. Leave the removal of the zpool API (and its associated
config options) to a followup cleanup after no more allocators show up.
Deprecating zbud for a few cycles before removing it was initially
proposed [2], like z3fold was marked as deprecated for 2 cycles [3].
However, Johannes rightfully pointed out that the 2 cycles is too short
for most downstream consumers, and z3fold was deprecated first only as a
courtesy anyway.
[1]https://lore.kernel.org/lkml/CAJD7tkbRF6od-2x_L8-A1QL3=2Ww13sCj4S3i4bNndqF+3+_Vg@mail.gmail.com/
[2]https://lore.kernel.org/lkml/Z5gdnSX5Lv-nfjQL@google.com/
[3]https://lore.kernel.org/lkml/20240904233343.933462-1-yosryahmed@google.com/
Link: https://lkml.kernel.org/r/20250129180633.3501650-3-yosry.ahmed@linux.dev
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We now print error messages in ext4_end_bio() when page writeback
encounters an error. If data_err=abort is set, the journal will also
be aborted in a kworker. This means that we now check all Buffer I/O
in all modes and decide whether to abort the journal based on the
data_err option. Therefore, we remove the ordered mode restriction
in the descriptions of data_err=abort and data_err=ignore.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250122110533.4116662-8-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Ammend missing "install" operation keyword after "apt-get", and fix
"build-essentials" to "build-essential".
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250306030708.8133-1-richard120310@gmail.com
|
|
pnp_remove_card() is currently unused, it has been since it was
added in 2003's BKrev: 3e6d3f19XSmESWEZnNEReEJOJW5SOw
pnp_unregister_protocol() is currently unused, it has been since
it was added in 2002's BKrev: 3df0cf6d4FVUKndhbfxjL7pksw5PGA
Remove them, and pnp_remove_card_device() and __pnp_remove_device()
which are now no longer used.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20250307214936.74504-1-linux@treblig.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
As explained in the commit 76f969e8948d8 ("cgroup: cgroup v2 freezer"),
the original freezer is imperfect, some users may unwittingly rely on it
when there exists the alternative of v2. Print a message when it happens
and explain that in the docs.
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The concept of per-memcg swappiness has never landed well in memcg for
cgroup v2. Add a message to users who use it on v1 hierarchy.
Decreased swappiness transforms to memory.swap.max=0 whereas
increased swappiness transforms into active memory.reclaim operation.
Link: https://lore.kernel.org/r/1577252208-32419-1-git-send-email-teawater@gmail.com/
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Use the specified $(LD) when building userprogs with Clang
- Pass the correct target triple when compile-testing UAPI headers
with Clang
- Fix pacman-pkg build error with KBUILD_OUTPUT
* tag 'kbuild-fixes-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: install-extmod-build: Fix build when specifying KBUILD_OUTPUT
docs: Kconfig: fix defconfig description
kbuild: hdrcheck: fix cross build with clang
kbuild: userprogs: use correct lld when linking through clang
|
|
Requesting a fwctl scope of access that includes mutating device debug
data will cause the kernel to be tainted. Changing the device operation
through things in the debug scope may cause the device to malfunction in
undefined ways. This should be reflected in the TAINT flags to help any
debuggers understand that something has been done.
Link: https://patch.msgid.link/r/4-v5-642aa0c94070+4447f-fwctl_jgg@nvidia.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
controllers
The cgroup v2 cpu controller has a limitation that if
CONFIG_RT_GROUP_SCHED is enabled, the cpu controller can be enabled only
if all the realtime processes are in the root cgroup. The other
controllers have no such restriction. They can be used for the resource
control of realtime processes irrespective of whether
CONFIG_RT_GROUP_SCHED is enabled or not.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Commit 2a86f6612164 ("kbuild: use KBUILD_DEFCONFIG as the fallback
for DEFCONFIG_LIST") removed arch/$ARCH/defconfig; however,
the document has not been updated to reflect this change yet.
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Change the default value of spectre v2 in user mode to respect the
CONFIG_MITIGATION_SPECTRE_V2 config option.
Currently, user mode spectre v2 is set to auto
(SPECTRE_V2_USER_CMD_AUTO) by default, even if
CONFIG_MITIGATION_SPECTRE_V2 is disabled.
Set the spectre_v2 value to auto (SPECTRE_V2_USER_CMD_AUTO) if the
Spectre v2 config (CONFIG_MITIGATION_SPECTRE_V2) is enabled, otherwise
set the value to none (SPECTRE_V2_USER_CMD_NONE).
Important to say the command line argument "spectre_v2_user" overwrites
the default value in both cases.
When CONFIG_MITIGATION_SPECTRE_V2 is not set, users have the flexibility
to opt-in for specific mitigations independently. In this scenario,
setting spectre_v2= will not enable spectre_v2_user=, and command line
options spectre_v2_user and spectre_v2 are independent when
CONFIG_MITIGATION_SPECTRE_V2=n.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Kaplan <David.Kaplan@amd.com>
Link: https://lore.kernel.org/r/20241031-x86_bugs_last_v2-v2-2-b7ff1dab840e@debian.org
|
|
HARDENED_USERCOPY defaults to on if enabled at compile time. Allow
hardened_usercopy= default to be set at compile time similar to
init_on_alloc= and init_on_free=. The intent is that hardening
options that can be disabled at runtime can set their default at
build time.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20250123221115.19722-3-mgorman@techsingularity.net
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
With the maximum amount of RAM now 4GB, there is very little point
to still have PTE pages in highmem. Drop this for simplification.
The only other architecture supporting HIGHPTE is 32-bit arm, and
once that feature is removed as well, the highpte logic can be
dropped from common code as well.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250226213714.4040853-8-arnd@kernel.org
|
|
HIGHMEM64G support was added in linux-2.3.25 to support (then)
high-end Pentium Pro and Pentium III Xeon servers with more than 4GB of
addressing, NUMA and PCI-X slots started appearing.
I have found no evidence of this ever being used in regular dual-socket
servers or consumer devices, all the users seem obsolete these days,
even by i386 standards:
- Support for NUMA servers (NUMA-Q, IBM x440, unisys) was already
removed ten years ago.
- 4+ socket non-NUMA servers based on Intel 450GX/450NX, HP F8 and
ServerWorks ServerSet/GrandChampion could theoretically still work
with 8GB, but these were exceptionally rare even 20 years ago and
would have usually been equipped with than the maximum amount of
RAM.
- Some SKUs of the Celeron D from 2004 had 64-bit mode fused off but
could still work in a Socket 775 mainboard designed for the later
Core 2 Duo and 8GB. Apparently most BIOSes at the time only allowed
64-bit CPUs.
- The rare Xeon LV "Sossaman" came on a few motherboards with
registered DDR2 memory support up to 16GB.
- In the early days of x86-64 hardware, there was sometimes the need
to run a 32-bit kernel to work around bugs in the hardware drivers,
or in the syscall emulation for 32-bit userspace. This likely still
works but there should never be a need for this any more.
PAE mode is still required to get access to the 'NX' bit on Atom
'Pentium M' and 'Core Duo' CPUs.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250226213714.4040853-6-arnd@kernel.org
|
|
The x86-32 kernel used to support multiple platforms with more than eight
logical CPUs, from the 1999-2003 timeframe: Sequent NUMA-Q, IBM Summit,
Unisys ES7000 and HP F8. Support for all except the latter was dropped
back in 2014, leaving only the F8 based DL740 and DL760 G2 machines in
this catery, with up to eight single-core Socket-603 Xeon-MP processors
with hyperthreading.
Like the already removed machines, the HP F8 servers at the time cost
upwards of $100k in typical configurations, but were quickly obsoleted
by their 64-bit Socket-604 cousins and the AMD Opteron.
Earlier servers with up to 8 Pentium Pro or Xeon processors remain
fully supported as they had no hyperthreading. Similarly, the more
common 4-socket Xeon-MP machines with hyperthreading using Intel
or ServerWorks chipsets continue to work without this, and all the
multi-core Xeon processors also run 64-bit kernels.
While the "bigsmp" support can also be used to run on later 64-bit
machines (including VM guests), it seems best to discourage that
and get any remaining users to update their kernels to 64-bit builds
on these. As a side-effect of this, there is also no more need to
support NUMA configurations on 32-bit x86, as all true 32-bit
NUMA platforms are already gone.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250226213714.4040853-3-arnd@kernel.org
|
|
Remove a reference to CBE_CPUFREQ_SPU_GOVERNOR which has been removed.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20241218105523.416573-20-mpe@ellerman.id.au
|
|
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it
indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate
SRSO.
Enable BpSpecReduce to mitigate SRSO across guest/host boundaries.
Switch back to enabling the bit when virtualization is enabled and to
clear the bit when virtualization is disabled because using a MSR slot
would clear the bit when the guest is exited and any training the guest
has done, would potentially influence the host kernel when execution
enters the kernel and hasn't VMRUN the guest yet.
More detail on the public thread in Link below.
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241202120416.6054-1-bp@kernel.org
|
|
Sometimes tracing is used to debug issues during the boot process. Since
the trace buffer has a limited amount of storage, it may be prudent to
disable tracing after the boot is finished, otherwise the critical
information may be overwritten. With this option, the main tracing buffer
will be turned off at the end of the boot process.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lore.kernel.org/20250208103017.48a7ec83@batman.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
This work had been triggered by a report that commit 0611a640e60a
("eventpoll: prefer kfree_rcu() in __ep_remove()") had caused the
critical-jOPS metric of the SPECjbb 2015 benchmark [1] to drop by around
50% even though it generally reduced kernel overhead. Indeed, it was
found during further investigation that the total interrupt rate while
running the SPECjbb workload had fallen as a result of that commit by
55% and the local timer interrupt rate had fallen by almost 80%.
That turned out to cause the menu cpuidle governor to select the deepest
idle state supplied by the cpuidle driver (intel_idle) much more often
which added significantly more idle state latency to the workload and
that led to the decrease of the critical-jOPS score.
Interestingly enough, this problem was not visible when the teo cpuidle
governor was used instead of menu, so it appeared to be specific to the
latter. CPU wakeup event statistics collected while running the
workload indicated that the menu governor was effectively ignoring non-
timer wakeup information and all of its idle state selection decisions
appeared to be based on timer wakeups only. Thus, it appeared that the
reduction of the local timer interrupt rate caused the governor to
predict a idle duration much more often while running the workload and
the deepest idle state was selected significantly more often as a result
of that.
A subsequent inspection of the get_typical_interval() function in the
menu governor indicated that it might return UINT_MAX too often which
then caused the governor's decisions to be based entirely on information
related to timers.
Generally speaking, UINT_MAX is returned by get_typical_interval() if it
cannot make a prediction based on the most recent idle intervals data
with sufficiently high confidence, but at least in some cases this means
that useful information is not taken into account at all which may lead
to significant idle state selection mistakes. Moreover, this is not
really unlikely to happen.
One issue with get_typical_interval() is that, when it eliminates
outliers from the sample set in an attempt to reduce the standard
deviation (and so improve the prediction confidence), it does that by
dropping high-end samples only, while samples at the low end of the set
are retained. However, the samples at the low end very well may be the
outliers and they should be eliminated from the sample set instead of
the high-end samples. Accordingly, the likelihood of making a
meaningful idle duration prediction can be improved by making it also
eliminate low-end samples if they are farther from the average than
high-end samples.
Another issue is that get_typical_interval() gives up after eliminating
1/4 of the samples if the standard deviation is still not as low as
desired (within 1/6 of the average or within 20 us if the average is
close to 0), but the remaining samples in the set still represent useful
information at that point and discarding them altogether may lead to
suboptimal idle state selection.
For instance, the largest idle duration value in the get_typical_interval()
data set is the maximum idle duration observed recently and it is likely
that the upcoming idle duration will not exceed it. Therefore, in the
absence of a better choice, this value can be used as an upper bound on
the target residency of the idle state to select.
* cpuidle-menu:
cpuidle: menu: Update documentation after get_typical_interval() changes
cpuidle: menu: Avoid discarding useful information
cpuidle: menu: Eliminate outliers on both ends of the sample set
cpuidle: menu: Tweak threshold use in get_typical_interval()
cpuidle: menu: Use one loop for average and variance computations
cpuidle: menu: Drop a redundant local variable
|
|
Since commit 18734958e9bf ("intel_idle: Use ACPI _CST for processor models
without C-state tables") the intel_idle driver has had the ability to use
the ACPI _CST to populate C-states when the processor model is not
recognized.
However, even when the processor model is recognized (native mode) there
are cases where it is useful to make the driver ignore the per-CPU idle
states in lieu of ACPI C-states (such as specific application performance).
Add a new 'no_native' module parameter to provide this functionality.
Signed-off-by: David Arcari <darcari@redhat.com>
Link: https://patch.msgid.link/20250220151120.1131122-1-darcari@redhat.com
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
[ rjw: Spell CPU in capitals ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The documentation of the menu cpuidle governor needs to be updated
to match the code behavior after some changes made recently.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/4998484.31r3eYUQgx@rjwysocki.net
[ rjw: More specific subject, two typos fixed in the changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The sorting of VMAs by size in commit 7d442a33bfe8 ("binfmt_elf: Dump
smaller VMAs first in ELF cores") breaks elfutils[1]. Instead, sort
based on the setting of the new sysctl, core_sort_vma, which defaults
to 0, no sorting.
Reported-by: Michael Stapelberg <michael@stapelberg.ch>
Closes: https://lore.kernel.org/all/20250218085407.61126-1-michael@stapelberg.de/ [1]
Fixes: 7d442a33bfe8 ("binfmt_elf: Dump smaller VMAs first in ELF cores")
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Fixing typos.
Signed-off-by: Maksimilijan Marosevic <maksimilijan.marosevic@proton.me>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250218193822.1031-1-maksimilijan.marosevic@proton.me
|
|
Document the (new) mgb4 GMSL modules variants.
Signed-off-by: Martin Tůma <martin.tuma@digiteqautomotive.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
Correct the erroneous spelling of 'responsible' in
Documentation/admin-guide/media/cec.rst. This fixes all errors
pointed out by codespell in the file.
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
Now that Documentation/ABI is processed by automarkup, let it
generate cross-references for the corresponding ABI file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/a655e770e1446f91088f579b79ae890a19771119.1739254867.git.mchehab+huawei@kernel.org
|
|
The introduction discussed stat file formats for very old kernel versions,
which obscured key information that readers may find useful. Additionally,
the example file contents and the reference to "15 fields" did not account
for the flush fields added in b6866318657 ("block: add iostat counters for
flush requests") [1].
Rewrite the introduction to focus only on the current kernel's disk I/O stat
file formats. Also, clean up wording for conciseness.
Link: https://lore.kernel.org/lkml/157433282607.7928.5202409984272248322.stgit@buzz/T/ [1]
Signed-off-by: David Reaver <me@davidreaver.com>
Link: https://lore.kernel.org/r/20250215180114.157948-1-me@davidreaver.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
The format description of reserve_mem uses [KNG] as units, rather than
[KMG].
Fix it.
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250218070845.3769520-1-rppt@kernel.org
|
|
Capacity-aware scheduling (CAS) is enabled by default by intel_pstate on
hybrid systems without SMT, but in some usage scenarios it may be more
attractive to place tasks for maximum CPU performance regardless of the
extra cost in terms of energy, which is the case on such systems when
CAS is not enabled, so introduce a command line option to forbid
intel_pstate to enable CAS.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by:Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/2781262.mvXUDI8C0e@rjwysocki.net
|
|
Currently the CPUFreq core exposes two sysfs attributes that can be used
to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
and scaling_cur_freq. Both provide slightly different view on the
subject and they do come with their own drawbacks.
cpuinfo_cur_freq provides higher precision though at a cost of being
rather expensive. Moreover, the information retrieved via this attribute
is somewhat short lived as frequency can change at any point of time
making it difficult to reason from.
scaling_cur_freq, on the other hand, tends to be less accurate but then
the actual level of precision (and source of information) varies between
architectures making it a bit ambiguous.
The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
distinct interface, exposing an average frequency of a given CPU(s), as
reported by the hardware, over a time frame spanning no more than a few
milliseconds. As it requires appropriate hardware support, this
interface is optional.
Note that under the hood, the new attribute relies on the information
provided by arch_freq_get_on_cpu, which, up to this point, has been
feeding data for scaling_cur_freq attribute, being the source of
ambiguity when it comes to interpretation. This has been amended by
restoring the intended behavior for scaling_cur_freq, with a new
dedicated config option to maintain status quo for those, who may need
it.
CC: Jonathan Corbet <corbet@lwn.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Dave Hansen <dave.hansen@linux.intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Phil Auld <pauld@redhat.com>
CC: x86@kernel.org
CC: linux-doc@vger.kernel.org
Signed-off-by: Beata Michalska <beata.michalska@arm.com>
Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20250131162439.3843071-3-beata.michalska@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The ABI documentation looks a little bit better if it starts
with the contents of the README is placed at the beginning.
Move it to the beginning of the ABI chapter. While here, improve
the README text and change the title that will be shown at the
html/pdf output to be coherent with both ABI file contents and
with the generated documentation output.
Suggested-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/20250211055809.1898623-1-mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Now that get_abi has gained support for filtering its output,
split ABI symbols from files at the html output.
That makes pages smaller and easier to navigate.
As an additional bonus, as it will paralelize files handling,
it gives an additional performance improvement.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/30e3cf2a8aeef23ca889de60a90f7de141e0dc0e.1739182025.git.mchehab+huawei@kernel.org
|
|
Right now, the logic parses ABI files on 4 steps, one for each
directory. While this is fine in principle, by doing that, not
all symbol cross-references will be created.
Change the logic to do the parsing only once in order to get
a global dictionary to be used when creating ABI cross-references.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/5205c53838b6ea25f4cdd4cc1e3d17c0141e75a6.1739182025.git.mchehab+huawei@kernel.org
|
|
Use the new script instead of the old one when generating ABI docs.
For now, execute it via exec. Future changes will better integrate it
by using the class defined there directly.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/e7fcb121c0612c94f6f54f0d742cd3a26a46cd7d.1739182025.git.mchehab+huawei@kernel.org
|
|
Such files are missing SPDX tags containing the licensing information.
Add them.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/8c7bfe676e7349ea2d1930bf918d54e27d15ae9e.1739182025.git.mchehab+huawei@kernel.org
|
|
Add a new driver for Samsung Galaxy Book series notebook devices with the
following features:
- Keyboard backlight control
- Battery hook for installing power supply extension to add charge
control end threshold
- Controller for Samsung's performance modes using the platform profile
interface
- Adds firmware-attributes to control various system features
- Handles various hotkeys and notifications
Signed-off-by: Joshua Grisham <josh@joshuagrisham.com>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20250201105450.193450-1-josh@joshuagrisham.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
This commit adds a test_boost_holdoff module parameter that tells the RCU
priority-boosting tests to wait for the specified number of seconds past
the start of the rcutorture test. This can be useful when rcutorture
is built into the kernel (as opposed to being modprobed), especially on
large systems where early start of RCU priority boosting can delay the
boot sequence, which adds a full CPU's worth of load onto the system.
This can in turn result in pointless stall warnings.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
Upstream calls it "GTK" now, see [1]
[1] https://mail.gnome.org/archives/gtk-devel-list/2019-February/msg00000.html
Signed-off-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250204040444.6299-1-diego.viola@gmail.com
|
|
This patch adds documentation for new option introduced in commit
4441686b24a1 ("dm-crypt: Allow to specify the integrity key size as option").
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
|
|
This patch adds documentation for new 'I' mode for dm-integrity
introduced in commit
fb0987682c62 ("dm-integrity: introduce the Inline mode").
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
|
|
This patch adds documentation for options introduced in commit
f811b83879fb ("dm-verity: introduce the options restart_on_error and panic_on_error").
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
|
|
Commit ae1f3db006b7 ("ata: ahci: do not enable LPM on external ports")
changed so that LPM is not enabled on external ports (hotplug-capable or
eSATA ports).
This is because hotplug and LPM are mutually exclusive, see 7.3.1 Hot Plug
Removal Detection and Power Management Interaction in AHCI 1.3.1.
This does require that firmware has set the appropate bits (HPCP or ESP)
in PxCMD (which is a per port register in the AHCI controller).
If the firmware has failed to mark a port as hotplug-capable or eSATA in
PxCMD, then there is currently not much a user can do.
If LPM is enabled on the port, hotplug insertions and removals will not be
detected on that port.
In order to allow a user to fix up broken firmware, add 'external' to the
libata.force kernel parameter.
libata.force can be specified either on the kernel command line, or as a
kernel module parameter.
For more information, see Documentation/admin-guide/kernel-parameters.txt.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250130133544.219297-4-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull KVM/arm64 updates from Will Deacon:
"New features:
- Support for non-protected guest in protected mode, achieving near
feature parity with the non-protected mode
- Support for the EL2 timers as part of the ongoing NV support
- Allow control of hardware tracing for nVHE/hVHE
Improvements, fixes and cleanups:
- Massive cleanup of the debug infrastructure, making it a bit less
awkward and definitely easier to maintain. This should pave the way
for further optimisations
- Complete rewrite of pKVM's fixed-feature infrastructure, aligning
it with the rest of KVM and making the code easier to follow
- Large simplification of pKVM's memory protection infrastructure
- Better handling of RES0/RES1 fields for memory-backed system
registers
- Add a workaround for Qualcomm's Snapdragon X CPUs, which suffer
from a pretty nasty timer bug
- Small collection of cleanups and low-impact fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
arm64/sysreg: Get rid of TRFCR_ELx SysregFields
KVM: arm64: nv: Fix doc header layout for timers
KVM: arm64: nv: Apply RESx settings to sysreg reset values
KVM: arm64: nv: Always evaluate HCR_EL2 using sanitising accessors
KVM: arm64: Fix selftests after sysreg field name update
coresight: Pass guest TRFCR value to KVM
KVM: arm64: Support trace filtering for guests
KVM: arm64: coresight: Give TRBE enabled state to KVM
coresight: trbe: Remove redundant disable call
arm64/sysreg/tools: Move TRFCR definitions to sysreg
tools: arm64: Update sysreg.h header files
KVM: arm64: Drop pkvm_mem_transition for host/hyp donations
KVM: arm64: Drop pkvm_mem_transition for host/hyp sharing
KVM: arm64: Drop pkvm_mem_transition for FF-A
KVM: arm64: Explicitly handle BRBE traps as UNDEFINED
KVM: arm64: vgic: Use str_enabled_disabled() in vgic_v3_probe()
arm64: kvm: Introduce nvhe stack size constants
KVM: arm64: Fix nVHE stacktrace VA bits mask
KVM: arm64: Fix FEAT_MTE in pKVM
Documentation: Update the behaviour of "kvm-arm.mode"
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"The various patchsets are summarized below. Plus of course many
indivudual patches which are described in their changelogs.
- "Allocate and free frozen pages" from Matthew Wilcox reorganizes
the page allocator so we end up with the ability to allocate and
free zero-refcount pages. So that callers (ie, slab) can avoid a
refcount inc & dec
- "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
use large folios other than PMD-sized ones
- "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
and fixes for this small built-in kernel selftest
- "mas_anode_descend() related cleanup" from Wei Yang tidies up part
of the mapletree code
- "mm: fix format issues and param types" from Keren Sun implements a
few minor code cleanups
- "simplify split calculation" from Wei Yang provides a few fixes and
a test for the mapletree code
- "mm/vma: make more mmap logic userland testable" from Lorenzo
Stoakes continues the work of moving vma-related code into the
(relatively) new mm/vma.c
- "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
Hildenbrand cleans up and rationalizes handling of gfp flags in the
page allocator
- "readahead: Reintroduce fix for improper RA window sizing" from Jan
Kara is a second attempt at fixing a readahead window sizing issue.
It should reduce the amount of unnecessary reading
- "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
addresses an issue where "huge" amounts of pte pagetables are
accumulated:
https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/
Qi's series addresses this windup by synchronously freeing PTE
memory within the context of madvise(MADV_DONTNEED)
- "selftest/mm: Remove warnings found by adding compiler flags" from
Muhammad Usama Anjum fixes some build warnings in the selftests
code when optional compiler warnings are enabled
- "mm: don't use __GFP_HARDWALL when migrating remote pages" from
David Hildenbrand tightens the allocator's observance of
__GFP_HARDWALL
- "pkeys kselftests improvements" from Kevin Brodsky implements
various fixes and cleanups in the MM selftests code, mainly
pertaining to the pkeys tests
- "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
estimate application working set size
- "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
provides some cleanups to memcg's hugetlb charging logic
- "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
removes the global swap cgroup lock. A speedup of 10% for a
tmpfs-based kernel build was demonstrated
- "zram: split page type read/write handling" from Sergey Senozhatsky
has several fixes and cleaups for zram in the area of
zram_write_page(). A watchdog softlockup warning was eliminated
- "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
Brodsky cleans up the pagetable destructor implementations. A rare
use-after-free race is fixed
- "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
simplifies and cleans up the debugging code in the VMA merging
logic
- "Account page tables at all levels" from Kevin Brodsky cleans up
and regularizes the pagetable ctor/dtor handling. This results in
improvements in accounting accuracy
- "mm/damon: replace most damon_callback usages in sysfs with new
core functions" from SeongJae Park cleans up and generalizes
DAMON's sysfs file interface logic
- "mm/damon: enable page level properties based monitoring" from
SeongJae Park increases the amount of information which is
presented in response to DAMOS actions
- "mm/damon: remove DAMON debugfs interface" from SeongJae Park
removes DAMON's long-deprecated debugfs interfaces. Thus the
migration to sysfs is completed
- "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
Peter Xu cleans up and generalizes the hugetlb reservation
accounting
- "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
removes a never-used feature of the alloc_pages_bulk() interface
- "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
extends DAMOS filters to support not only exclusion (rejecting),
but also inclusion (allowing) behavior
- "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
introduces a new memory descriptor for zswap.zpool that currently
overlaps with struct page for now. This is part of the effort to
reduce the size of struct page and to enable dynamic allocation of
memory descriptors
- "mm, swap: rework of swap allocator locks" from Kairui Song redoes
and simplifies the swap allocator locking. A speedup of 400% was
demonstrated for one workload. As was a 35% reduction for kernel
build time with swap-on-zram
- "mm: update mips to use do_mmap(), make mmap_region() internal"
from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
mmap_region() can be made MM-internal
- "mm/mglru: performance optimizations" from Yu Zhao fixes a few
MGLRU regressions and otherwise improves MGLRU performance
- "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
Park updates DAMON documentation
- "Cleanup for memfd_create()" from Isaac Manjarres does that thing
- "mm: hugetlb+THP folio and migration cleanups" from David
Hildenbrand provides various cleanups in the areas of hugetlb
folios, THP folios and migration
- "Uncached buffered IO" from Jens Axboe implements the new
RWF_DONTCACHE flag which provides synchronous dropbehind for
pagecache reading and writing. To permite userspace to address
issues with massive buildup of useless pagecache when
reading/writing fast devices
- "selftests/mm: virtual_address_range: Reduce memory" from Thomas
Weißschuh fixes and optimizes some of the MM selftests"
* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
mm/compaction: fix UBSAN shift-out-of-bounds warning
s390/mm: add missing ctor/dtor on page table upgrade
kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
tools: add VM_WARN_ON_VMG definition
mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
seqlock: add missing parameter documentation for raw_seqcount_try_begin()
mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
mm/page_alloc: remove the incorrect and misleading comment
zram: remove zcomp_stream_put() from write_incompressible_page()
mm: separate move/undo parts from migrate_pages_batch()
mm/kfence: use str_write_read() helper in get_access_type()
selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
selftests/mm: vm_util: split up /proc/self/smaps parsing
selftests/mm: virtual_address_range: unmap chunks after validation
selftests/mm: virtual_address_range: mmap() without PROT_WRITE
selftests/memfd/memfd_test: fix possible NULL pointer dereference
mm: add FGP_DONTCACHE folio creation flag
mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
...
|
|
Two of DAMON user-space tool (damo) commands that are used for examples on
DAMON getting started document, namely 'damo show' and 'damo report heats'
are deprecated[1,2], and replaced by new commands that provides same
functions with unified and simplified user interfaces. Also the example
output of 'damo show' is outdated. 'damo schemes' command is not
deprecated, but users are recommended to use 'damo start' or 'damo tune'
instead.
Update the examples to use the replacements, recommendations, and
up-to-date output formats.
[1] https://git.kernel.org/sj/damo/c/3272e0ac94ecc5e1
[2] https://git.kernel.org/sj/damo/c/da3ec66bbdd9e87d
Link: https://lkml.kernel.org/r/20250110185232.54907-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Honggyu Kim <honggyu.kim@sk.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yunjeong Mun <yunjeong.mun@sk.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
files on files hierarchy
DAMOS filter directory part of DAMON sysfs files hierarchy on the usage
document is wrong. 'memcg_path' file under the directory is wrongly
written as 'memcg_id'. Also the directory has 'addr_start', 'addr_end',
and 'target_idx' files, but the list is missing those. Fix the wrong name
and add missing files.
Link: https://lkml.kernel.org/r/20250110185232.54907-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Honggyu Kim <honggyu.kim@sk.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yunjeong Mun <yunjeong.mun@sk.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Update DAMON usage document for the newly added 'allow' sysfs file for
DAMOS filters.
Link: https://lkml.kernel.org/r/20250109175126.57878-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
design doc
DAMON usage document is describing some details about DAMOS filters, which
are also documented on the design doc. Deduplicate the details in favor
of the design doc.
Link: https://lkml.kernel.org/r/20250109175126.57878-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
It's time to remove DAMON debugfs interface, which has deprecated long
before in February 2023. Read the cover letter of this patch series for
more details.
Remove DAMON debugfs interface usage documentation, to avoid confusing
users with documents for an already removed thing.
Link: https://lkml.kernel.org/r/20250106191941.107070-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Alex Shi <alexs@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Hu Haowen <2023002089@link.tyut.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rae Moar <rmoar@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
region directories
Document the newly added DAMON sysfs interface file for per-scheme-tried
region's bytes that passed the operations set handling DAMOS filters.
Link: https://lkml.kernel.org/r/20250106193401.109161-16-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|