summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
AgeCommit message (Collapse)Author
2024-07-09gpio: virtuser: new virtual testing driver for the GPIO APIBartosz Golaszewski
The GPIO subsystem used to have a serious problem with undefined behavior and use-after-free bugs on hot-unplug of GPIO chips. This can be considered a corner-case by some as most GPIO controllers are enabled early in the boot process and live until the system goes down but most GPIO drivers do allow unbind over sysfs, many are loadable modules that can be (force) unloaded and there are also GPIO devices that can be dynamically detached, for instance CP2112 which is a USB GPIO expender. Bugs can be triggered both from user-space as well as by in-kernel users. We have the means of testing it from user-space via the character device but the issues manifest themselves differently in the kernel. This is a proposition of adding a new virtual driver - a configurable GPIO consumer that can be configured over configfs (similarly to gpio-sim) or described on the device-tree. This driver is aimed as a helper in spotting any regressions in hot-unplug handling in GPIOLIB. Link: https://lore.kernel.org/r/20240708142912.120570-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-07-04docs: mm: add enable_soft_offline sysctlJiaqi Yan
Add the documentation for soft offline behaviors / costs, and what the new enable_soft_offline sysctl is for. [jiaqiyan@google.com: fix kerneldoc warnings] Link: https://lkml.kernel.org/r/CACw3F52=GxTCDw-PqFh3-GDM-fo3GbhGdu0hedxYXOTT4TQSTg@mail.gmail.com [jiaqiyan@google.com: there are more blank lines needed] Link: https://lkml.kernel.org/r/CACw3F52_obAB742XeDRNun4BHBYtrxtbvp5NkUincXdaob0j1g@mail.gmail.com Link: https://lkml.kernel.org/r/20240626050818.2277273-5-jiaqiyan@google.com Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Frank van der Linden <fvdl@google.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lance Yang <ioworker0@gmail.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04mm: add swappiness= arg to memory.reclaimDan Schatzberg
Allow proactive reclaimers to submit an additional swappiness=<val> argument to memory.reclaim. This overrides the global or per-memcg swappiness setting for that reclaim attempt. For example: echo "2M swappiness=0" > /sys/fs/cgroup/memory.reclaim will perform reclaim on the rootcg with a swappiness setting of 0 (no swap) regardless of the vm.swappiness sysctl setting. Userspace proactive reclaimers use the memory.reclaim interface to trigger reclaim. The memory.reclaim interface does not allow for any way to effect the balance of file vs anon during proactive reclaim. The only approach is to adjust the vm.swappiness setting. However, there are a few reasons we look to control the balance of file vs anon during proactive reclaim, separately from reactive reclaim: * Swapout should be limited to manage SSD write endurance. In near-OOM situations we are fine with lots of swap-out to avoid OOMs. As these are typically rare events, they have relatively little impact on write endurance. However, proactive reclaim runs continuously and so its impact on SSD write endurance is more significant. Therefore it is desireable to control swap-out for proactive reclaim separately from reactive reclaim * Some userspace OOM killers like systemd-oomd[1] support OOM killing on swap exhaustion. This makes sense if the swap exhaustion is triggered due to reactive reclaim but less so if it is triggered due to proactive reclaim (e.g. one could see OOMs when free memory is ample but anon is just particularly cold). Therefore, it's desireable to have proactive reclaim reduce or stop swap-out before the threshold at which OOM killing occurs. In the case of Meta's Senpai proactive reclaimer, we adjust vm.swappiness before writes to memory.reclaim[2]. This has been in production for nearly two years and has addressed our needs to control proactive vs reactive reclaim behavior but is still not ideal for a number of reasons: * vm.swappiness is a global setting, adjusting it can race/interfere with other system administration that wishes to control vm.swappiness. In our case, we need to disable Senpai before adjusting vm.swappiness. * vm.swappiness is stateful - so a crash or restart of Senpai can leave a misconfigured setting. This requires some additional management to record the "desired" setting and ensure Senpai always adjusts to it. With this patch, we avoid these downsides of adjusting vm.swappiness globally. [1]https://www.freedesktop.org/software/systemd/man/latest/systemd-oomd.service.html [2]https://github.com/facebookincubator/oomd/blob/main/src/oomd/plugins/Senpai.cpp#L585-L598 Link: https://lkml.kernel.org/r/20240103164841.2800183-3-schatzberg.dan@gmail.com Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com> Suggested-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Chris Li <chrisl@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Yue Zhao <findns94@gmail.com> Cc: Zefan Li <lizefan.x@bytedance.com> Cc: Nhat Pham <nphamcs@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitterPaul E. McKenney
If a CPU is running either a userspace application or a guest OS in nohz_full mode, it is possible for a system call to occur just as an RCU grace period is starting. If that CPU also has the scheduling-clock tick enabled for any reason (such as a second runnable task), and if the system was booted with rcutree.use_softirq=0, then RCU can add insult to injury by awakening that CPU's rcuc kthread, resulting in yet another task and yet more OS jitter due to switching to that task, running it, and switching back. In addition, in the common case where that system call is not of excessively long duration, awakening the rcuc task is pointless. This pointlessness is due to the fact that the CPU will enter an extended quiescent state upon returning to the userspace application or guest OS. In this case, the rcuc kthread cannot do anything that the main RCU grace-period kthread cannot do on its behalf, at least if it is given a few additional milliseconds (for example, given the time duration specified by rcutree.jiffies_till_first_fqs, give or take scheduling delays). This commit therefore adds a rcutree.nohz_full_patience_delay kernel boot parameter that specifies the grace period age (in milliseconds, rounded to jiffies) before which RCU will refrain from awakening the rcuc kthread. Preliminary experimentation suggests a value of 1000, that is, one second. Increasing rcutree.nohz_full_patience_delay will increase grace-period latency and in turn increase memory footprint, so systems with constrained memory might choose a smaller value. Systems with less-aggressive OS-jitter requirements might choose the default value of zero, which keeps the traditional immediate-wakeup behavior, thus avoiding increases in grace-period latency. [ paulmck: Apply Leonardo Bras feedback. ] Link: https://lore.kernel.org/all/20240328171949.743211-1-leobras@redhat.com/ Reported-by: Leonardo Bras <leobras@redhat.com> Suggested-by: Leonardo Bras <leobras@redhat.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Leonardo Bras <leobras@redhat.com>
2024-07-04ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64Liu Wei
For varying privacy and security reasons, sometimes we would like to completely silence the _serial_ console, and only enable it when needed. But there are many existing systems that depend on this _serial_ console, so add acpi=nospcr to disable console in ACPI SPCR table as default _serial_ console. Signed-off-by: Liu Wei <liuwei09@cestc.cn> Suggested-by: Prarit Bhargava <prarit@redhat.com> Suggested-by: Will Deacon <will@kernel.org> Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Prarit Bhargava <prarit@redhat.com> Link: https://lore.kernel.org/r/20240625030504.58025-1-liuwei09@cestc.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-04Documentation: kernel-parameters: Add DEVNAME:0.0 format for serial portsTony Lindgren
Document the console option for DEVNAME:0.0 style addressing for serial ports. Suggested-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240703100615.118762-4-tony.lindgren@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-03vmalloc: modify the alloc_vmap_area() error message for better diagnosticsShubhang Kaushik OS
'vmap allocation for size %lu failed: use vmalloc=<size> to increase size' The above warning is seen in the kernel functionality for allocation of the restricted virtual memory range till exhaustion. This message is misleading because 'vmalloc=' is supported on arm32, x86 platforms and is not a valid kernel parameter on a number of other platforms (in particular its not supported on arm64, alpha, loongarch, arc, csky, hexagon, microblaze, mips, nios2, openrisc, parisc, m64k, powerpc, riscv, sh, um, xtensa, s390, sparc). With the update, the output gets modified to include the function parameters along with the start and end of the virtual memory range allowed. The warning message after fix on kernel version 6.10.0-rc1+: vmalloc_node_range for size 33619968 failed: Address range restricted between 0xffff800082640000 - 0xffff800084650000 Backtrace with the misleading error message: vmap allocation for size 33619968 failed: use vmalloc=<size> to increase size insmod: vmalloc error: size 33554432, vm_struct allocation failed, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0 CPU: 46 PID: 1977 Comm: insmod Tainted: G E 6.10.0-rc1+ #79 Hardware name: INGRASYS Yushan Server iSystem TEMP-S000141176+10/Yushan Motherboard, BIOS 2.10.20230517 (SCP: xxx) yyyy/mm/dd Call trace: dump_backtrace+0xa0/0x128 show_stack+0x20/0x38 dump_stack_lvl+0x78/0x90 dump_stack+0x18/0x28 warn_alloc+0x12c/0x1b8 __vmalloc_node_range_noprof+0x28c/0x7e0 custom_init+0xb4/0xfff8 [test_driver] do_one_initcall+0x60/0x290 do_init_module+0x68/0x250 load_module+0x236c/0x2428 init_module_from_file+0x8c/0xd8 __arm64_sys_finit_module+0x1b4/0x388 invoke_syscall+0x78/0x108 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x130 el0t_64_sync_handler+0x100/0x130 el0t_64_sync+0x190/0x198 [Shubhang@os.amperecomputing.com: v5] Link: https://lkml.kernel.org/r/CH2PR01MB5894B0182EA0B28DF2EFB916F5C72@CH2PR01MB5894.prod.exchangelabs.com Link: https://lkml.kernel.org/r/MN2PR01MB59025CC02D1D29516527A693F5C62@MN2PR01MB5902.prod.exchangelabs.com Signed-off-by: Shubhang Kaushik <shubhang@os.amperecomputing.com> Reviewed-by: Christoph Lameter (Ampere) <cl@linux.com> Cc: Christoph Lameter <cl@linux.com> Cc: Guo Ren <guoren@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Xiongwei Song <xiongwei.song@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03Docs/damon: document damos_migrate_{hot,cold}Honggyu Kim
This patch adds damon description for "migrate_hot" and "migrate_cold" actions for both usage and design documents as long as a new "target_nid" knob to set the migration target node. [sj@kernel.org: trivial fixups for DAMOS_MIGRATE_{HOT,COLD} documentation] Link: https://lkml.kernel.org/r/20240618213630.84846-2-sj@kernel.org Link: https://lkml.kernel.org/r/20240614030010.751-8-honggyu.kim@sk.com Signed-off-by: Honggyu Kim <honggyu.kim@sk.com> Signed-off-by: SeongJae Park <sj@kernel.org>Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Gregory Price <gregory.price@memverge.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Hyeongtak Ji <hyeongtak.ji@sk.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03Documentation/admin-guide/mm/pagemap.rst: drop "Using pagemap to do ↵David Hildenbrand
something useful" That example was added in 2008. In 2015, we restricted access to the PFNs in the pagemap to CAP_SYS_ADMIN, making that approach quite less usable. It's 2024 now, and using that racy and low-lewel mechanism to calculate the USS should not be considered a good example anymore. /proc/$pid/smaps and /proc/$pid/smaps_rollup can do a much better job without any of that low-level handling. Let's just drop that example. Link: https://lkml.kernel.org/r/20240607122357.115423-7-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Oscar Salvador <osalvador@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03mm: shmem: add mTHP counters for anonymous shmemBaolin Wang
Add mTHP counters for anonymous shmem. [baolin.wang@linux.alibaba.com: update Documentation/admin-guide/mm/transhuge.rst] Link: https://lkml.kernel.org/r/d86e2e7f-4141-432b-b2ba-c6691f36ef0b@linux.alibaba.com Link: https://lkml.kernel.org/r/4fd9e467d49ae4a747e428bcd821c7d13125ae67.1718090413.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Lance Yang <ioworker0@gmail.com> Cc: Barry Song <v-songbaohua@oppo.com> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Pankaj Raghav <p.raghav@samsung.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03mm: shmem: add multi-size THP sysfs interface for anonymous shmemBaolin Wang
To support the use of mTHP with anonymous shmem, add a new sysfs interface 'shmem_enabled' in the '/sys/kernel/mm/transparent_hugepage/hugepages-kB/' directory for each mTHP to control whether shmem is enabled for that mTHP, with a value similar to the top level 'shmem_enabled', which can be set to: "always", "inherit (to inherit the top level setting)", "within_size", "advise", "never". An 'inherit' option is added to ensure compatibility with these global settings, and the options 'force' and 'deny' are dropped, which are rather testing artifacts from the old ages. By default, PMD-sized hugepages have enabled="inherit" and all other hugepage sizes have enabled="never" for '/sys/kernel/mm/transparent_hugepage/hugepages-xxkB/shmem_enabled'. In addition, if top level value is 'force', then only PMD-sized hugepages have enabled="inherit", otherwise configuration will be failed and vice versa. That means now we will avoid using non-PMD sized THP to override the global huge allocation. [baolin.wang@linux.alibaba.com: fix transhuge.rst indentation] Link: https://lkml.kernel.org/r/b189d815-998b-4dfd-ba89-218ff51313f8@linux.alibaba.com [akpm@linux-foundation.org: reflow transhuge.rst addition to 80 cols] [baolin.wang@linux.alibaba.com: move huge_shmem_orders_lock under CONFIG_SYSFS] Link: https://lkml.kernel.org/r/eb34da66-7f12-44f3-a39e-2bcc90c33354@linux.alibaba.com [akpm@linux-foundation.org: huge_memory.c needs mm_types.h] Link: https://lkml.kernel.org/r/ffddfa8b3cb4266ff963099ab78cfd7184c57ac7.1718090413.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <v-songbaohua@oppo.com> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Pankaj Raghav <p.raghav@samsung.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03docs/admin-guide/mm: correct typo 'quired' to 'queried'Daniel Watson
Convert the word "quired" to the word "queried" which makes more sense in this context. Signed-off-by: Daniel Watson <ozzloy@each.do> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/878qymrjrg.fsf@trent-reznor
2024-07-03cgroup/misc: Introduce misc.peakXiu Jianfeng
Introduce misc.peak to record the historical maximum usage of the resource, as in some scenarios the value of misc.max could be adjusted based on the peak usage of the resource. Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-02cpufreq: docs: Add missing scaling_available_frequencies descriptionRaphael Gallais-Pou
Add a description of the scaling_available_frequencies attribute in sysfs to the documentation. Signed-off-by: Raphael Gallais-Pou <rgallaispou@gmail.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20240701171040.369030-1-rgallaispou@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-02x86/efi: Drop support for fake EFI memory mapsArd Biesheuvel
Between kexec and confidential VM support, handling the EFI memory maps correctly on x86 is already proving to be rather difficult (as opposed to other EFI architectures which manage to never modify the EFI memory map to begin with) EFI fake memory map support is essentially a development hack (for testing new support for the 'special purpose' and 'more reliable' EFI memory attributes) that leaked into production code. The regions marked in this manner are not actually recognized as such by the firmware itself or the EFI stub (and never have), and marking memory as 'more reliable' seems rather futile if the underlying memory is just ordinary RAM. Marking memory as 'special purpose' in this way is also dubious, but may be in use in production code nonetheless. However, the same should be achievable by using the memmap= command line option with the ! operator. EFI fake memmap support is not enabled by any of the major distros (Debian, Fedora, SUSE, Ubuntu) and does not exist on other architectures, so let's drop support for it. Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-01Merge 6.10-rc6 into usb-nextGreg Kroah-Hartman
We need the USB fixes in here as well for some follow-on patches. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-30Merge tag 'tty-6.10-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial / console fixes from Greg KH: "Here are a bunch of fixes/reverts for 6.10-rc6. Include in here are: - revert the bunch of tty/serial/console changes that landed in -rc1 that didn't quite work properly yet. Everyone agreed to just revert them for now and will work on making them better for a future release instead of trying to quick fix the existing changes this late in the release cycle - 8250 driver port count bugfix - Other tiny serial port bugfixes for reported issues All of these have been in linux-next this week with no reported issues" * tag 'tty-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "printk: Save console options for add_preferred_console_match()" Revert "printk: Don't try to parse DEVNAME:0.0 console options" Revert "printk: Flag register_console() if console is set on command line" Revert "serial: core: Add support for DEVNAME:0.0 style naming for kernel console" Revert "serial: core: Handle serial console options" Revert "serial: 8250: Add preferred console in serial8250_isa_init_ports()" Revert "Documentation: kernel-parameters: Add DEVNAME:0.0 format for serial ports" Revert "serial: 8250: Fix add preferred console for serial8250_isa_init_ports()" Revert "serial: core: Fix ifdef for serial base console functions" serial: bcm63xx-uart: fix tx after conversion to uart_port_tx_limited() serial: core: introduce uart_port_tx_limited_flags() Revert "serial: core: only stop transmit when HW fifo is empty" serial: imx: set receiver level before starting uart tty: mcf: MCF54418 has 10 UARTS serial: 8250_omap: Implementation of Errata i2310 tty: serial: 8250: Fix port count mismatch with the device
2024-06-29media: em28xx: Add support for MyGica UTV3Nils Rothaug
The MyGica UTV3 Analog USB2.0 TV Box is a USB video capture card that has analog TV, composite video, and FM radio inputs, an IR remote, and provides audio only as Line Out, but not over USB. Mine is prepared for an FM tuner, but not equipped with one. Support for FM radio is therefore missing. The device contains: - Empia EM2860 USB bridge - Philips SAA7113 video decoder - NXP TDA9801T demodulator - Tena TNF931D-DFDR1 tuner - ST HCF4052 demux, switches input audio to Line Out Signed-off-by: Nils Rothaug <nils.rothaug@gmx.de> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-29media: tuner-simple: Add support for Tena TNF931D-DFDR1Nils Rothaug
Tuner ranges were determined by USB capturing the vendor driver of a MyGica UTV3 video capture card. Signed-off-by: Nils Rothaug <nils.rothaug@gmx.de> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-28x86/bugs: Add 'spectre_bhi=vmexit' cmdline optionJosh Poimboeuf
In cloud environments it can be useful to *only* enable the vmexit mitigation and leave syscalls vulnerable. Add that as an option. This is similar to the old spectre_bhi=auto option which was removed with the following commit: 36d4fe147c87 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto") with the main difference being that this has a more descriptive name and is disabled by default. Mitigation switch requested by Maksim Davydov <davydov-max@yandex-team.ru>. [ bp: Massage. ] Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/r/2cbad706a6d5e1da2829e5e123d8d5c80330148c.1719381528.git.jpoimboe@kernel.org
2024-06-28x86/bugs: Remove duplicate Spectre cmdline option descriptionsJosh Poimboeuf
Duplicating the documentation of all the Spectre kernel cmdline options in two separate files is unwieldy and error-prone. Instead just add a reference to kernel-parameters.txt from spectre.rst. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Link: https://lore.kernel.org/r/450b5f4ffe891a8cc9736ec52b0c6f225bab3f4b.1719381528.git.jpoimboe@kernel.org
2024-06-28documentation: media: vivid: Update documentation on vivid loopback supportDorcas Anono Litunya
Modify section "Video and Sliced VBI Looping" in Documentation to explain the vivid loopback support for video across multiple vivid instances. Previous documentation is out-of-date as it was explaining looping in a single vivid instance only. Also, in "Some Future Improvements" the item "Add support to loop from a specific output to a specific input across vivid instances" can be dropped since that's now implemented. Signed-off-by: Dorcas Anono Litunya <anonolitunya@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-28media: vivid: loopback based on 'Connected To' controlsHans Verkuil
Instead of using hardwired video loopback limited to a single vivid instance, use the new 'Connected To' controls to only loopback if an HDMI or S-Video input is connected to another output, which can be in another vivid instance. Effectively this emulates connecting and disconnecting an HDMI/S-Video cable. The Loop Video control is dropped since it has now been replaced by the new 'Connected To' controls. The Display Present has also been dropped since it no longer fits. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Co-developed-by: Dorcas Anono Litunya <anonolitunya@gmail.com> Signed-off-by: Dorcas Anono Litunya <anonolitunya@gmail.com>
2024-06-28media: Documentation: vivid.rst: Remove documentation for Capture OverlayDorcas Anono Litunya
Modifying documentation to remove 'Capture Overlay section' as destructive capture overlay support was removed. See commit ccaa9d50ca73 ("media: vivid: drop overlay support") Signed-off-by: Dorcas Anono Litunya <anonolitunya@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-28media: Documentation: vivid.rst: add supports_requestsHans Verkuil
The module option supports_requests was not documented, add it. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-28media: Documentation: vivid.rst: drop "Video, VBI and RDS Looping"Hans Verkuil
Drop the "Video, VBI and RDS Looping" section, instead moving the Video/VBI info to section "Video and Sliced VBI looping" and the RDS info to section "Radio & RDS Looping". Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-28media: Documentation: vivid.rst: fix confusing section refsHans Verkuil
The documentation contained several instances of "section X" references, which no longer map to whatever X was. Replace these by the section titles. Also fix a single confusing typo in the "Radio & RDS Looping" section: "are regular frequency intervals" -> "at regular frequency intervals" Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-27Merge tag 'amd-pstate-v6.11-2024-06-26' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux Merge more amd-pstate driver updates for 6.11 from Mario Limonciello: "Add support for amd-pstate core performance boost support which allows controlling which CPU cores can operate above nominal frequencies for short periods of time." * tag 'amd-pstate-v6.11-2024-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: Documentation: cpufreq: amd-pstate: update doc for Per CPU boost control method cpufreq: amd-pstate: Cap the CPPC.max_perf to nominal_perf if CPB is off cpufreq: amd-pstate: initialize core precision boost state cpufreq: acpi: move MSR_K7_HWCR_CPB_DIS_BIT into msr-index.h
2024-06-27Merge back cpufreq material for v6.11.Rafael J. Wysocki
2024-06-27Documentation: Remove IA-64 from kernel-parametersThomas Huth
IA-64 has been removed from the tree, so we should also remove the corresponding kernel-parameters documentation now. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240627162458.387700-1-thuth@redhat.com
2024-06-27media: admin-guide: Document the Raspberry Pi PiSP BEJacopo Mondi
Add documentation for the PiSP Back End memory-to-memory ISP. Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-26docs: verify/bisect: Fix rendered version URLDiederik de Haas
When rendering the documentation, the 'html' file extension replaces the 'rst' file extension, not add it. So remove the 'rst' part of the URL. Signed-off-by: Diederik de Haas <didi.debian@cknow.org> Reviewed-by: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240620081355.11549-1-didi.debian@cknow.org
2024-06-26Documentation: cpufreq: amd-pstate: update doc for Per CPU boost control methodPerry Yuan
Updates the documentation in `amd-pstate.rst` to include information about the per CPU boost control feature. Users can now enable or disable the Core Performance Boost (CPB) feature on individual CPUs using the `boost` sysfs attribute. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20240626042733.3747-5-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-25Revert "Documentation: kernel-parameters: Add DEVNAME:0.0 format for serial ↵Greg Kroah-Hartman
ports" This reverts commit 5c3a766e9f057ee7a54b5d7addff7fab02676fea. Let's roll back all of the serial core and printk console changes that went into 6.10-rc1 as there still are problems with them that need to be sorted out. Link: https://lore.kernel.org/r/ZnpRozsdw6zbjqze@tlindgre-MOBL1 Reported-by: Petr Mladek <pmladek@suse.com> Reported-by: Tony Lindgren <tony@atomide.com> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-24media: Documentation: ipu6: Fix examples in ipu6-isys admin-guideSamuel Wein
Fix flags in X1 Yoga example. MEDIA_LNK_FL_DYNAMIC (0x4 in the link flag) was removed in V4 Intel IPU6 and IPU6 input system drivers. Added -V flag to media-ctl commands for X1 Yoga, lower-case v only makes it verbose upper-case V sets the format. Signed-off-by: Samuel Wein <sam@samwein.com> [Sakari Ailus: Align subject line, rewrap commit message.] Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-20Documentation: PM: amd-pstate: add guided mode to the Operation modePerry Yuan
the guided mode is also supported, so the operation mode should include that mode as well. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/a61d825ef71f6aacc8f1624fe9fb982b8446b5a7.1718811234.git.perry.yuan@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-19cgroup/cpuset: Make cpuset.cpus.exclusive independent of cpuset.cpusWaiman Long
The "cpuset.cpus.exclusive.effective" value is currently limited to a subset of its "cpuset.cpus". This makes the exclusive CPUs distribution hierarchy subsumed within the larger "cpuset.cpus" hierarchy. We have to decide on what CPUs are used locally and what CPUs can be passed down as exclusive CPUs down the hierarchy and combine them into "cpuset.cpus". The advantage of the current scheme is to have only one hierarchy to worry about. However, it make it harder to use as all the "cpuset.cpus" values have to be properly set along the way down to the designated remote partition root. It also makes it more cumbersome to find out what CPUs can be used locally. Make creation of remote partition simpler by breaking the dependency of "cpuset.cpus.exclusive" on "cpuset.cpus" and make them independent entities. Now we have two separate hierarchies - one for setting "cpuset.cpus.effective" and the other one for setting "cpuset.cpus.exclusive.effective". We may not need to set "cpuset.cpus" when we activate a partition root anymore. Also update Documentation/admin-guide/cgroup-v2.rst and cpuset.c comment to document this change. Suggested-by: Petr Malat <oss@malat.biz> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-06-19cgroup/cpuset: Delay setting of CS_CPU_EXCLUSIVE until valid partitionWaiman Long
The CS_CPU_EXCLUSIVE flag is currently set whenever cpuset.cpus.exclusive is set to make sure that the exclusivity test will be run to ensure its exclusiveness. At the same time, this flag can be changed whenever the partition root state is changed. For example, the CS_CPU_EXCLUSIVE flag will be reset whenever a partition root becomes invalid. This makes using CS_CPU_EXCLUSIVE to ensure exclusiveness a bit fragile. The current scheme also makes setting up a cpuset.cpus.exclusive hierarchy to enable remote partition harder as cpuset.cpus.exclusive cannot overlap with any cpuset.cpus of sibling cpusets if their cpuset.cpus.exclusive aren't set. Solve these issues by deferring the setting of CS_CPU_EXCLUSIVE flag until the cpuset become a valid partition root while adding new checks in validate_change() to ensure that cpuset.cpus.exclusive of sibling cpusets cannot overlap. An additional check is also added to validate_change() to make sure that cpuset.cpus of one cpuset cannot be a subset of cpuset.cpus.exclusive of a sibling cpuset to avoid the problem that none of those CPUs will be available when these exclusive CPUs are extracted out to a newly enabled partition root. The Documentation/admin-guide/cgroup-v2.rst file is updated to document the new constraints. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-06-19pstore/ramoops: Add ramoops.mem_name= command line optionSteven Rostedt (Google)
Add a method to find a region specified by reserve_mem=nn:align:name for ramoops. Adding a kernel command line parameter: reserve_mem=12M:4096:oops ramoops.mem_name=oops Will use the size and location defined by the memmap parameter where it finds the memory and labels it "oops". The "oops" in the ramoops option is used to search for it. This allows for arbitrary RAM to be used for ramoops if it is known that the memory is not cleared on kernel crashes or soft reboots. Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20240613155527.591647061@goodmis.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-06-19mm/memblock: Add "reserve_mem" to reserved named memory at boot upSteven Rostedt (Google)
In order to allow for requesting a memory region that can be used for things like pstore on multiple machines where the memory layout is not the same, add a new option to the kernel command line called "reserve_mem". The format is: reserve_mem=nn:align:name Where it will find nn amount of memory at the given alignment of align. The name field is to allow another subsystem to retrieve where the memory was found. For example: reserve_mem=12M:4096:oops ramoops.mem_name=oops Where ramoops.mem_name will tell ramoops that memory was reserved for it via the reserve_mem option and it can find it by calling: if (reserve_mem_find_by_name("oops", &start, &size)) { // start holds the start address and size holds the size given This is typically used for systems that do not wipe the RAM, and this command line will try to reserve the same physical memory on soft reboots. Note, it is not guaranteed to be the same location. For example, if KASLR places the kernel at the location of where the RAM reservation was from a previous boot, the new reservation will be at a different location. Any subsystem using this feature must add a way to verify that the contents of the physical memory is from a previous boot, as there may be cases where the memory will not be located at the same location. Not all systems may work either. There could be bit flips if the reboot goes through the BIOS. Using kexec to reboot the machine is likely to have better results in such cases. Link: https://lore.kernel.org/all/ZjJVnZUX3NZiGW6q@kernel.org/ Suggested-by: Mike Rapoport <rppt@kernel.org> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20240613155527.437020271@goodmis.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-06-18Merge tag 'v6.10-rc4' into usb-nextGreg Kroah-Hartman
We need the USB / Thunderbolt fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-17Documentation: Remove the unused "tp720" from kernel-parameters.txtThomas Huth
The "tp720" switch once belonged to the ps2esdi driver, but this driver has been removed a long time ago in 2008 in the commit 2af3e6017e53 ("The ps2esdi driver was marked as BROKEN more than two years ago due to being no longer working for some time.") already, so let's remove it from the documentation now, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240617073322.40679-1-thuth@redhat.com
2024-06-17Documentation: Remove the unused "topology_updates" from kernel-parameters.txtThomas Huth
The "topology_updates" switch has been removed four years ago in commit c30f931e891e ("powerpc/numa: remove ability to enable topology updates"), so let's remove this from the documentation, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240617060848.38937-1-thuth@redhat.com
2024-06-17Documentation: Remove unused "nps_mtm_hs_ctr" from kernel-parameters.txtThomas Huth
The "nps_mtm_hs_ctr" parameter has been removed in commit dd7c7ab01a04 ("ARC: [plat-eznps]: Drop support for EZChip NPS platform"). Remove it from the documentation now, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240614190804.602970-1-thuth@redhat.com
2024-06-17Documentation: Remove unused "spia_*" kernel parametersThomas Huth
The kernel module parameters "spia_io_base", "spia_fio_base", "spia_pedr" and "spia_peddr" have been removed via commit e377ca1e32f6 ("ARM: clps711x: p720t: Special driver for handling NAND memory is removed"). Time to remove them from the documentation now, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240614184041.601056-1-thuth@redhat.com
2024-06-17Documentation: Remove unused "mtdset=" from kernel-parameters.txtThomas Huth
The kernel parameter "mtdset" has been removed two years ago in commit 61b7f8920b17 ("ARM: s3c: remove all s3c24xx support") and thus should be removed from the documentation now, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240614182508.600113-1-thuth@redhat.com
2024-06-17Documentation: Remove the "rhash_entries=" from kernel-parameters.txtThomas Huth
"rhash_entries" belonged to the routing cache that has been removed in commit 89aef8921bfb ("ipv4: Delete routing cache."). Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240614092134.563082-1-thuth@redhat.com
2024-06-17Documentation: Remove "ltpc=" from the kernel-parameters.txtThomas Huth
The string "ltpc" cannot be found in the source code anymore. This kernel parameter likely belonged to the LocalTalk PC card module which has been removed in commit 03dcb90dbf62 ("net: appletalk: remove Apple/Farallon LocalTalk PC support"), so we should remove it from kernel-parameters.txt now, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240614084633.560069-1-thuth@redhat.com
2024-06-17Documentation: Add "S390" to the swiotlb kernel parameterThomas Huth
The "swiotlb" kernel parameter is used on s390 for protected virt since commit 64e1f0c531d1 ("s390/mm: force swiotlb for protected virtualization") and thus should be marked in kernel-parameters.txt accordingly. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20240614081438.553160-1-thuth@redhat.com
2024-06-15Revert "mm: init_mlocked_on_free_v3"David Hildenbrand
There was insufficient review and no agreement that this is the right approach. There are serious flaws with the implementation that make processes using mlock() not even work with simple fork() [1] and we get reliable crashes when rebooting. Further, simply because we might be unmapping a single PTE of a large mlocked folio, we shouldn't zero out the whole folio. ... especially because the code can also *corrupt* urelated memory because kernel_init_pages(page, folio_nr_pages(folio)); Could end up writing outside of the actual folio if we work with a tail page. Let's revert it. Once there is agreement that this is the right approach, the issues were fixed and there was reasonable review and proper testing, we can consider it again. [1] https://lkml.kernel.org/r/4da9da2f-73e4-45fd-b62f-a8a513314057@redhat.com Link: https://lkml.kernel.org/r/20240605091710.38961-1-david@redhat.com Fixes: ba42b524a040 ("mm: init_mlocked_on_free_v3") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: David Wang <00107082@163.com> Closes: https://lore.kernel.org/lkml/20240528151340.4282-1-00107082@163.com/ Reported-by: Lance Yang <ioworker0@gmail.com> Closes: https://lkml.kernel.org/r/20240601140917.43562-1-ioworker0@gmail.com Acked-by: Lance Yang <ioworker0@gmail.com> Cc: York Jasper Niebuhr <yjnworkstation@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>