summaryrefslogtreecommitdiff
path: root/Documentation
AgeCommit message (Collapse)Author
2025-05-13cxl: docs/linux/dax-driver documentationGregory Price
Add documentation on how the CXL driver interacts with the DAX driver. Signed-off-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20250512162134.3596150-12-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/linux/cxl-driver - add example configurationsGregory Price
Add 4 example configurations: - single device - cross-host-bridge interleave - intra-host-bridge-interleave - multi-level interleave Signed-off-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20250512162134.3596150-11-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/linux - add cxl-driver theory of operationGregory Price
Add docs for the CXL driver that explains the base devices, decoder types, region types, mailbox interfaces, and decoder programming. Signed-off-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20250512162134.3596150-10-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/linux - early boot configurationGregory Price
Document __init time configurations that affect CXL driver probe process and memory region configuration. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250512162134.3596150-9-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/linux - overviewGregory Price
Add type-3 device configuration overview that explains the probe process for a type-3 device from early-boot through memory-hotplug. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250512162134.3596150-8-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/platform/example-configs documentationGregory Price
Add example ACPI Table configurations for different sample platforms. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250512162134.3596150-7-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/platform/acpi reference documentationGregory Price
Add basic ACPI table information needed to understand the CXL driver probe process. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250512162134.3596150-6-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/platform/bios-and-efi documentationGregory Price
Add some docs on CXL configurations done in bios/efi that affect linux configuration - information vendors may care to consider. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250512162134.3596150-5-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs/devices - add cxl device and protocol referenceGregory Price
Add a simple device primer sufficient to understand the theory of operation documentation. Signed-off-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20250512162134.3596150-4-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: docs - access-coordinates doc fixupsGregory Price
Place the hierarchy diagram in access-coordinates.rst in a code block. Fix a few grammar issues. Suggested-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250512162134.3596150-3-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13cxl: update documentation structure in prep for new docsGregory Price
Restructure the cxl folder to make adding docs per-page cleaner. Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250512162134.3596150-2-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-13Documentation: scheduler: Changed lowercase acronyms to uppercaseJake Rice
Everywhere else in this doc, the dispatch queue acronym (DSQ) is uppercase. There were a couple places where the acronym was written in lowercase. I changed them to uppercase to make it homogeneous. Signed-off-by: Jake Rice <jake@jakerice.dev> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-13dt-bindings: display: msm: correct example in SM8350 MDSS schemaDmitry Baryshkov
Fix the interconnects in the example to follow the schema changes. Fixes: 60b8d3a2365a ("dt-bindings: display: msm: sm8350-mdss: Describe the CPU-CFG icc path") Reported-by: Rob Herring <robh@kernel.org> Closes: http://lore.kernel.org/r/CAL_JsqKr8Xd8uxFzE0YJTyD+V6N++VV8SX-GB5Xt0_BKkeoGUQ@mail.gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/651775/ Link: https://lore.kernel.org/r/20250505-sm8350-fix-example-v1-1-36d5d9ccba66@oss.qualcomm.com
2025-05-13docs: bpf: Fix bullet point formatting warningKhaled Elnaggar
Fix indentation for a bullet list item in bpf_iterators.rst. According to reStructuredText rules, bullet list item bodies must be consistently indented relative to the bullet. The indentation of the first line after the bullet determines the alignment for the rest of the item body. Reported by smatch: /linux/Documentation/bpf/bpf_iterators.rst:55: WARNING: Bullet list ends without a blank line; unexpected unindent. [docutils] Fixes: 7220eabff8cb ("bpf, docs: document open-coded BPF iterators") Signed-off-by: Khaled Elnaggar <khaledelnaggarlinux@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250513015901.475207-1-khaledelnaggarlinux@gmail.com
2025-05-13f2fs: fix 32-bits hexademical number in fault injection docChao Yu
FAULT_KMALLOC 0x000000001 There is one redundant '0' in 32-bits hexademical number of fault type, remove it. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-13dt-bindings: remoteproc: qcom,sm8150-pas: Add missing SC8180X compatibleKrzysztof Kozlowski
Commit 4b4ab93ddc5f ("dt-bindings: remoteproc: Consolidate SC8180X and SM8150 PAS files") moved SC8180X bindings from separate file into this one, but it forgot to add actual compatibles in top-level properties section making the entire binding un-selectable (no-op) for SC8180X PAS. Fixes: 4b4ab93ddc5f ("dt-bindings: remoteproc: Consolidate SC8180X and SM8150 PAS files") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20250428075243.44256-2-krzysztof.kozlowski@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-05-13dt-bindings: remoteproc: qcom,sm8350-pas: Add SC8280XPKonrad Dybcio
From the software POV, it matches the SM8350's implementation. Describe it as such, with a fallback. Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> # Lenovo X13s Link: https://lore.kernel.org/r/20250503-topic-8280_slpi-v1-1-9400a35574f7@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-05-13dt-bindings: cache: Convert marvell,tauros2-cache to DT schemaRob Herring (Arm)
Convert the Marvell Tauros2 Cache binding to DT schema. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-05-13dt-bindings: cache: Convert marvell,{feroceon,kirkwood}-cache to DT schemaRob Herring (Arm)
Convert the Marvell Feroceon/Kirkwood Cache binding to DT schema format. Use "marvell,kirkwood-cache" for the filename instead as that's only compatible used in a .dts upstream. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-05-13Documentation/gpu: Disambiguate SPI termBagas Sanjaya
Documentation/userspace-api/media/glossary.rst:170: WARNING: duplicate term description of SPI, other instance in gpu/amdgpu/amdgpu-glossary That's because SPI of amdgpu (Shader Processor Input) shares the same global glossary term as SPI of media subsystem (which is Serial Peripheral Interface Bus). Disambiguate the former from the latter to fix the warning. Note that adding context qualifiers in the term is strictly necessary in order to make Sphinx happy. Fixes: dd3d035a7838 ("Documentation/gpu: Add new entries to amdgpu glossary") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250509185845.60bf5e7b@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-13Merge tag 'ib-mfd-gpio-nvmem-v6.16' of ↵Bartosz Golaszewski
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into gpio/for-next Immutable branch between MFD, GPIO and NVMEM due for the v6.16 merge window
2025-05-13gpu: nova-core: define registers layout using helper macroAlexandre Courbot
Add the register!() macro, which defines a given register's layout and provide bit-field accessors with a way to convert them to a given type. This macro will allow us to make clear definitions of the registers and manipulate their fields safely. The long-term goal is to eventually move it to the kernel crate so it can be used by other drivers as well, but it was agreed to first land it into nova-core and make it mature there. To illustrate its usage, use it to define the layout for the Boot0 (renamed to NV_PMC_BOOT_0 to match OpenRM's naming scheme) and take advantage of its accessors. Suggested-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Link: https://lore.kernel.org/r/20250507-nova-frts-v3-5-fcb02749754d@nvidia.com [ Fix typo in commit message. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-05-13dt-bindings: pinctrl: qcom: correct gpio-ranges in examples for qcs8300Lijuan Gao
Correct the gpio-ranges in the QCS8300 TLMM pin controller example to include the UFS_RESET pin, which is expected to be wired to the reset pin of the primary UFS memory. This allows the UFS driver to toggle it. Fixes: 5778535972e2 ("dt-bindings: pinctrl: describe qcs8300-tlmm") Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Lijuan Gao <quic_lijuang@quicinc.com> Link: https://lore.kernel.org/20250506-correct_gpio_ranges-v3-2-49a7d292befa@quicinc.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-05-13dt-bindings: pinctrl: qcom: correct gpio-ranges in examples for qcs615Lijuan Gao
Correct the gpio-ranges in the QCS615 TLMM pin controller example to include the UFS_RESET pin, which is expected to be wired to the reset pin of the primary UFS memory. This allows the UFS driver to toggle it. Fixes: 55c487ea6084 ("dt-bindings: pinctrl: document the QCS615 Top Level Mode Multiplexer") Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Lijuan Gao <quic_lijuang@quicinc.com> Link: https://lore.kernel.org/20250506-correct_gpio_ranges-v3-1-49a7d292befa@quicinc.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-05-13cpufreq: intel_pstate: Document hybrid processor supportRafael J. Wysocki
Describe the support for hybrid processors in intel_pstate, including the CAS and EAS support, in the admin-guide documentation. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/1935040.CQOukoFCf9@rjwysocki.net
2025-05-13PM: EM: Documentation: Fix typos in example driver codeAtul Kumar Pant
Fix the API name to free the allocated table in the example driver code that modifies the EM. Also fix the passing of correct table when updating the cost. Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com> Link: https://patch.msgid.link/20250511071141.13237-1-atulpant.linux@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-13dt-bindings: mfd: stm32-lptimer: Add support for stm32mp25Fabrice Gasnier
Add a new stm32mp25 compatible to stm32-lptimer dt-bindings, to support STM32MP25 SoC. Some features has been updated or added to the low-power timer: - new capture compare channels - up to two PWM channels - PWM input capture - peripheral interconnect in stm32mp25 has been updated (new triggers). - registers/bits has been added or revisited (IER access). So introduce a new compatible to handle this diversity. Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Reviewed-by: "Rob Herring (Arm)" <robh@kernel.org> Link: https://lore.kernel.org/r/20250429125133.1574167-2-fabrice.gasnier@foss.st.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-05-13dt-bindings: arm: sunxi: Add Liontron H-A133L board nameAndre Przywara
The Liontron H-A133L is an industrial development board using the Allwinner A133 SoC. Add its compatible name to the list of valid board names. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250505164729.18175-3-andre.przywara@arm.com Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2025-05-13dt-bindings: vendor-prefixes: Add Liontron nameAndre Przywara
Liontron is a company based in Shenzen, China, making industrial development boards and embedded computers, mostly using Rockchip and Allwinner SoCs. Add their name to the list of vendors. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250505164729.18175-2-andre.przywara@arm.com Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2025-05-13net: enable driver support for netmem TXMina Almasry
Drivers need to make sure not to pass netmem dma-addrs to the dma-mapping API in order to support netmem TX. Add helpers and netmem_dma_*() helpers that enables special handling of netmem dma-addrs that drivers can use. Document in netmem.rst what drivers need to do to support netmem TX. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250508004830.4100853-7-almasrymina@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13net: add devmem TCP TX documentationMina Almasry
Add documentation outlining the usage and details of the devmem TCP TX API. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250508004830.4100853-6-almasrymina@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13net: devmem: TCP tx netlink apiStanislav Fomichev
Add bind-tx netlink call to attach dmabuf for TX; queue is not required, only ifindex and dmabuf fd for attachment. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250508004830.4100853-4-almasrymina@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13Merge commit 'its-for-linus-20250509-merge' into x86/core, to resolve conflictsIngo Molnar
Conflicts: Documentation/admin-guide/hw-vuln/index.rst arch/x86/include/asm/cpufeatures.h arch/x86/kernel/alternative.c arch/x86/kernel/cpu/bugs.c arch/x86/kernel/cpu/common.c drivers/base/cpu.c include/linux/cpu.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/platform' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/microcode' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13dt-bindings: soc: samsung: exynos-pmu: gs101: add google,pmu-intr-gen phandlePeter Griffin
gs101 requires access to the pmu interrupt generation register region which is exposed as a syscon. Update the exynos-pmu bindings documentation to reflect this. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Peter Griffin <peter.griffin@linaro.org> Link: https://lore.kernel.org/r/20250506-contrib-pg-cpu-hotplug-suspend2ram-fixes-v1-v4-2-9f64a2657316@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2025-05-13dt-bindings: soc: google: Add gs101-pmu-intr-gen binding documentationPeter Griffin
Add bindings documentation for the Power Management Unit (PMU) interrupt generator. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Peter Griffin <peter.griffin@linaro.org> Link: https://lore.kernel.org/r/20250506-contrib-pg-cpu-hotplug-suspend2ram-fixes-v1-v4-1-9f64a2657316@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2025-05-13Documentation: fix typo in root= kernel parameter descriptionPetr Vaněk
Fixes a typo in the root= parameter description, changing "this a a" to "this is a". Fixes: c0c1a7dcb6f5 ("init: move the nfs/cifs/ram special cases out of name_to_dev_t") Signed-off-by: Petr Vaněk <arkamar@atlas.cz> Link: https://lore.kernel.org/20250512110827.32530-1-arkamar@atlas.cz Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-12docs/mm/damon/design: fix spelling mistakeThushara.M.S
The word accuracy was misspelled as "accruracy". Signed-off-by: Thushara.M.S <thusharms@gmail.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12Documentation: KHO: add memblock bindingsMike Rapoport (Microsoft)
We introduced KHO into Linux: A framework that allows Linux to pass metadata and memory across kexec from Linux to Linux. KHO reuses fdt as file format and shares a lot of the same properties of firmware-to- Linux boot formats: It needs a stable, documented ABI that allows for forward and backward compatibility as well as versioning. As first user of KHO, we introduced memblock which can now preserve memory ranges reserved with reserve_mem command line options contents across kexec, so you can use the post-kexec kernel to read traces from the pre-kexec kernel. This patch adds memblock schemas similar to "device" device tree ones to a new kho bindings directory. This allows us to force contributors to document the data that moves across KHO kexecs and catch breaking change during review. Link: https://lkml.kernel.org/r/20250509074635.3187114-18-changyuanl@google.com Co-developed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Alexander Graf <graf@amazon.com> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Changyuan Lyu <changyuanl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anthony Yznaga <anthony.yznaga@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: Ben Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Gowans <jgowans@amazon.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pratyush Yadav <ptyadav@amazon.de> Cc: Rob Herring <robh@kernel.org> Cc: Saravana Kannan <saravanak@google.com> Cc: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Thomas Lendacky <thomas.lendacky@amd.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12Documentation: add documentation for KHOAlexander Graf
With KHO in place, let's add documentation that describes what it is and how to use it. Link: https://lkml.kernel.org/r/20250509074635.3187114-17-changyuanl@google.com Signed-off-by: Alexander Graf <graf@amazon.com> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Co-developed-by: Changyuan Lyu <changyuanl@google.com> Signed-off-by: Changyuan Lyu <changyuanl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anthony Yznaga <anthony.yznaga@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: Ben Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Gowans <jgowans@amazon.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pratyush Yadav <ptyadav@amazon.de> Cc: Rob Herring <robh@kernel.org> Cc: Saravana Kannan <saravanak@google.com> Cc: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Thomas Lendacky <thomas.lendacky@amd.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12mm: add max swappiness arg to lru_gen for anonymous memory onlyZhongkun He
The MGLRU already supports reclaiming only from anonymous memory via the /sys/kernel/debug/lru_gen interface. Now, memory.reclaim also supports the swappiness=max parameter to enable reclaiming solely from anonymous memory. To unify the semantics of proactive reclaiming from anonymous folios, the max parameter is introduced. [hezhongkun.hzk@bytedance.com: use strcmp instead of strncmp, if swappiness is not set, use the default value] Link: https://lkml.kernel.org/r/20250507071057.3184240-1-hezhongkun.hzk@bytedance.com [akpm@linux-foundation.org: tweak coding style] Link: https://lkml.kernel.org/r/65181f7745d657d664d833c26d8a94cae40538b9.1745225696.git.hezhongkun.hzk@bytedance.com Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com> Acked-by: Muchun Song <muchun.song@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12mm: add swappiness=max arg to memory.reclaim for only anon reclaimZhongkun He
Patch series "add max arg to swappiness in memory.reclaim and lru_gen", v4. This patchset adds max arg to swappiness in memory.reclaim and lru_gen for anon only proactive memory reclaim. With commit <68cd9050d871> ("mm: add swappiness= arg to memory.reclaim") we can submit an additional swappiness=<val> argument to memory.reclaim. It is very useful because we can dynamically adjust the reclamation ratio based on the anonymous folios and file folios of each cgroup. For example,when swappiness is set to 0, we only reclaim from file folios. But we can not relciam memory just from anon folios. This patchset introduces a new macro, SWAPPINESS_ANON_ONLY, defined as MAX_SWAPPINESS + 1, represent the max arg semantics. It specifically indicates that reclamation should occur only from anonymous pages. Patch 1 adds swappiness=max arg to memory.reclaim suggested-by: Yosry Ahmed Patch 2 add more comments for cache_trim_mode from Johannes Weiner in [1]. Patch 3 add max arg to lru_gen for proactive memory reclaim in MGLRU. The MGLRU already supports reclaiming exclusively from anonymous pages. This patch formalizes that behavior by introducing a max parameter to represent the corresponding semantics. Patch 4 using SWAPPINESS_ANON_ONLY in MGLRU Using SWAPPINESS_ANON_ONLY instead of MAX_SWAPPINESS + 1 to indicate reclaiming only from anonymous pages makes the code more readable and explicit Here is the previous discussion: https://lore.kernel.org/all/20250314033350.1156370-1-hezhongkun.hzk@bytedance.com/ https://lore.kernel.org/all/20250312094337.2296278-1-hezhongkun.hzk@bytedance.com/ https://lore.kernel.org/all/20250318135330.3358345-1-hezhongkun.hzk@bytedance.com/ This patch (of 4): With commit <68cd9050d871> ("mm: add swappiness= arg to memory.reclaim") we can submit an additional swappiness=<val> argument to memory.reclaim. It is very useful because we can dynamically adjust the reclamation ratio based on the anonymous folios and file folios of each cgroup. For example,when swappiness is set to 0, we only reclaim from file folios. However,we have also encountered a new issue: when swappiness is set to the MAX_SWAPPINESS, it may still only reclaim file folios. So, we hope to add a new arg 'swappiness=max' in memory.reclaim where proactive memory reclaim only reclaims from anonymous folios when swappiness is set to max. The swappiness semantics from a user perspective remain unchanged. For example, something like this: echo "2M swappiness=max" > /sys/fs/cgroup/memory.reclaim will perform reclaim on the rootcg with a swappiness setting of 'max' (a new mode) regardless of the file folios. Users have a more comprehensive view of the application's memory distribution because there are many metrics available. For example, if we find that a certain cgroup has a large number of inactive anon folios, we can reclaim only those and skip file folios, because with the zram/zswap, the IO tradeoff that cache_trim_mode or other file first logic is making doesn't hold - file refaults will cause IO, whereas anon decompression will not. With this patch, the swappiness argument of memory.reclaim has a new mode 'max', means reclaiming just from anonymous folios both in traditional LRU and MGLRU. Link: https://lkml.kernel.org/r/cover.1745225696.git.hezhongkun.hzk@bytedance.com Link: https://lore.kernel.org/all/20250314141833.GA1316033@cmpxchg.org/ [1] Link: https://lkml.kernel.org/r/519e12b9b1f8c31a01e228c8b4b91a2419684f77.1745225696.git.hezhongkun.hzk@bytedance.com Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com> Suggested-by: Yosry Ahmed <yosry.ahmed@linux.dev> Acked-by: Muchun Song <muchun.song@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12memcg-introduce-non-blocking-limit-setting-option-v3Shakeel Butt
add more explanation in doc and commit message on O_NONBLOCK side-effects (Johannes) Link: https://lkml.kernel.org/r/20250506232833.3109790-1-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Greg Thelen <gthelen@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Tejun Heo <tj@kernel.org> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12memcg: introduce non-blocking limit setting optionShakeel Butt
Setting the max and high limits can trigger synchronous reclaim and/or oom-kill if the usage is higher than the given limit. This behavior is fine for newly created cgroups but it can cause issues for the node controller while setting limits for existing cgroups. In our production multi-tenant and overcommitted environment, we are seeing priority inversion when the node controller dynamically adjusts the limits of running jobs of different priorities. Based on the system situation, the node controller may reduce the limits of lower priority jobs and increase the limits of higher priority jobs. However we are seeing node controller getting stuck for long period of time while reclaiming from lower priority jobs while setting their limits and also spends a lot of its own CPU. One of the workaround we are trying is to fork a new process which sets the limit of the lower priority job along with setting an alarm to get itself killed if it get stuck in the reclaim for lower priority job. However we are finding it very unreliable and costly. Either we need a good enough time buffer for the alarm to be delivered after setting limit and potentialy spend a lot of CPU in the reclaim or be unreliable in setting the limit for much shorter but cheaper (less reclaim) alarms. Let's introduce new limit setting option which does not trigger reclaim and/or oom-kill and let the processes in the target cgroup to trigger reclaim and/or throttling and/or oom-kill in their next charge request. This will make the node controller on multi-tenant overcommitted environment much more reliable. Explanation from Johannes on side-effects of O_NONBLOCK limit change: It's usually the allocating tasks inside the group bearing the cost of limit enforcement and reclaim. This allows a (privileged) updater from outside the group to keep that cost in there - instead of having to help, from a context that doesn't necessarily make sense. I suppose the tradeoff with that - and the reason why this was doing sync reclaim in the first place - is that, if the group is idle and not trying to allocate more, it can take indefinitely for the new limit to actually be met. It should be okay in most scenarios in practice. As the capacity is reallocated from group A to B, B will exert pressure on A once it tries to claim it and thereby shrink it down. If A is idle, that shouldn't be hard. If A is running, it's likely to fault/allocate soon-ish and then join the effort. It does leave a (malicious) corner case where A is just busy-hitting its memory to interfere with the clawback. This is comparable to reclaiming memory.low overage from the outside, though, which is an acceptable risk. Users of O_NONBLOCK just need to be aware. Link: https://lkml.kernel.org/r/20250419183545.1982187-1-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Thelen <gthelen@google.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12vmscan,cgroup: apply mems_effective to reclaimGregory Price
It is possible for a reclaimer to cause demotions of an lruvec belonging to a cgroup with cpuset.mems set to exclude some nodes. Attempt to apply this limitation based on the lruvec's memcg and prevent demotion. Notably, this may still allow demotion of shared libraries or any memory first instantiated in another cgroup. This means cpusets still cannot cannot guarantee complete isolation when demotion is enabled, and the docs have been updated to reflect this. This is useful for isolating workloads on a multi-tenant system from certain classes of memory more consistently - with the noted exceptions. Note on locking: The cgroup_get_e_css reference protects the css->effective_mems, and calls of this interface would be subject to the same race conditions associated with a non-atomic access to cs->effective_mems. So while this interface cannot make strong guarantees of correctness, it can therefore avoid taking a global or rcu_read_lock for performance. Link: https://lkml.kernel.org/r/20250424202806.52632-3-gourry@gourry.net Signed-off-by: Gregory Price <gourry@gourry.net> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev> Suggested-by: Waiman Long <longman@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Waiman Long <longman@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12Update Christoph's Email address and make it consistentChristoph Lameter (Ampere)
Use cl@gentwo.org throughout and remove the old email addresses. Link: https://lkml.kernel.org/r/8b962f57-4d98-cbb0-cd82-b6ba456733e8@gentwo.org Signed-off-by: Christoph Lameter <cl@gentwo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12Docs/ABI/damon: document nid fileSeongJae Park
Add a description of 'nid' file, which is optionally used for specific DAMOS quota goal metrics such as node_mem_{used,free}_bp on the DAMON sysfs ABI document. Link: https://lkml.kernel.org/r/20250420194030.75838-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Yunjeong Mun <yunjeong.mun@sk.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12Docs/admin-guide/mm/damon/usage: document 'nid' fileSeongJae Park
Add description of 'nid' file, which is optionally used for specific DAMOS quota goal metrics such as node_mem_{used,free}_bp on DAMON usage document. Link: https://lkml.kernel.org/r/20250420194030.75838-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Yunjeong Mun <yunjeong.mun@sk.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12Docs/mm/damon/design: document node_mem_{used,free}_bpSeongJae Park
Add description of DAMOS quota goal metrics for NUMA node utilization on the DAMON deesign document. Link: https://lkml.kernel.org/r/20250420194030.75838-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Yunjeong Mun <yunjeong.mun@sk.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>