summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r--Documentation/admin-guide/LSM/Smack.rst16
-rw-r--r--Documentation/admin-guide/LSM/ipe.rst17
-rw-r--r--Documentation/admin-guide/RAS/main.rst142
-rw-r--r--Documentation/admin-guide/bcache.rst13
-rw-r--r--Documentation/admin-guide/blockdev/zoned_loop.rst61
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst31
-rw-r--r--Documentation/admin-guide/efi-stub.rst3
-rw-r--r--Documentation/admin-guide/hw-vuln/l1d_flush.rst2
-rw-r--r--Documentation/admin-guide/hw-vuln/spectre.rst2
-rw-r--r--Documentation/admin-guide/kernel-parameters.rst97
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt118
-rw-r--r--Documentation/admin-guide/md.rst10
-rw-r--r--Documentation/admin-guide/media/mali-c55-graph.dot19
-rw-r--r--Documentation/admin-guide/media/mali-c55.rst413
-rw-r--r--Documentation/admin-guide/media/platform-cardlist.rst2
-rw-r--r--Documentation/admin-guide/media/rkcif-rk3568-vicap.dot8
-rw-r--r--Documentation/admin-guide/media/rkcif.rst79
-rw-r--r--Documentation/admin-guide/media/v4l-drivers.rst2
-rw-r--r--Documentation/admin-guide/pm/cpuidle.rst9
-rw-r--r--Documentation/admin-guide/pm/intel_pstate.rst133
-rw-r--r--Documentation/admin-guide/sysctl/net.rst29
-rw-r--r--Documentation/admin-guide/tainted-kernels.rst2
-rw-r--r--Documentation/admin-guide/thermal/index.rst1
-rw-r--r--Documentation/admin-guide/thermal/intel_thermal_throttle.rst91
-rw-r--r--Documentation/admin-guide/workload-tracing.rst10
25 files changed, 954 insertions, 356 deletions
diff --git a/Documentation/admin-guide/LSM/Smack.rst b/Documentation/admin-guide/LSM/Smack.rst
index 6d44f4fdbf59..c5ed775f2d10 100644
--- a/Documentation/admin-guide/LSM/Smack.rst
+++ b/Documentation/admin-guide/LSM/Smack.rst
@@ -601,10 +601,15 @@ specification.
Task Attribute
~~~~~~~~~~~~~~
-The Smack label of a process can be read from /proc/<pid>/attr/current. A
-process can read its own Smack label from /proc/self/attr/current. A
+The Smack label of a process can be read from ``/proc/<pid>/attr/current``. A
+process can read its own Smack label from ``/proc/self/attr/current``. A
privileged process can change its own Smack label by writing to
-/proc/self/attr/current but not the label of another process.
+``/proc/self/attr/current`` but not the label of another process.
+
+Format of writing is : only the label or the label followed by one of the
+3 trailers: ``\n`` (by common agreement for ``/proc/...`` interfaces),
+``\0`` (because some applications incorrectly include it),
+``\n\0`` (because we think some applications may incorrectly include it).
File Attribute
~~~~~~~~~~~~~~
@@ -696,6 +701,11 @@ sockets.
A privileged program may set this to match the label of another
task with which it hopes to communicate.
+UNIX domain socket (UDS) with a BSD address functions both as a file in a
+filesystem and as a socket. As a file, it carries the SMACK64 attribute. This
+attribute is not involved in Smack security enforcement and is immutably
+assigned the label "*".
+
Smack Netlabel Exceptions
~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst
index dc7088451f9d..a756d8158531 100644
--- a/Documentation/admin-guide/LSM/ipe.rst
+++ b/Documentation/admin-guide/LSM/ipe.rst
@@ -95,7 +95,20 @@ languages when these scripts are invoked by passing these program files
to the interpreter. This is because the way interpreters execute these
files; the scripts themselves are not evaluated as executable code
through one of IPE's hooks, but they are merely text files that are read
-(as opposed to compiled executables) [#interpreters]_.
+(as opposed to compiled executables). However, with the introduction of the
+``AT_EXECVE_CHECK`` flag (:doc:`AT_EXECVE_CHECK </userspace-api/check_exec>`),
+interpreters can use it to signal the kernel that a script file will be executed,
+and request the kernel to perform LSM security checks on it.
+
+IPE's EXECUTE operation enforcement differs between compiled executables and
+interpreted scripts: For compiled executables, enforcement is triggered
+automatically by the kernel during ``execve()``, ``execveat()``, ``mmap()``
+and ``mprotect()`` syscalls when loading executable content. For interpreted
+scripts, enforcement requires explicit interpreter integration using
+``execveat()`` with ``AT_EXECVE_CHECK`` flag. Unlike exec syscalls that IPE
+intercepts during the execution process, this mechanism needs the interpreter
+to take the initiative, and existing interpreters won't be automatically
+supported unless the signal call is added.
Threat Model
------------
@@ -806,8 +819,6 @@ A:
.. [#digest_cache_lsm] https://lore.kernel.org/lkml/20240415142436.2545003-1-roberto.sassu@huaweicloud.com/
-.. [#interpreters] There is `some interest in solving this issue <https://lore.kernel.org/lkml/20220321161557.495388-1-mic@digikod.net/>`_.
-
.. [#devdoc] Please see :doc:`the design docs </security/ipe>` for more on
this topic.
diff --git a/Documentation/admin-guide/RAS/main.rst b/Documentation/admin-guide/RAS/main.rst
index 447bfde509fb..5a45db32c49b 100644
--- a/Documentation/admin-guide/RAS/main.rst
+++ b/Documentation/admin-guide/RAS/main.rst
@@ -406,24 +406,8 @@ index of the MC::
|->mc2
....
-Under each ``mcX`` directory each ``csrowX`` is again represented by a
-``csrowX``, where ``X`` is the csrow index::
-
- .../mc/mc0/
- |
- |->csrow0
- |->csrow2
- |->csrow3
- ....
-
-Notice that there is no csrow1, which indicates that csrow0 is composed
-of a single ranked DIMMs. This should also apply in both Channels, in
-order to have dual-channel mode be operational. Since both csrow2 and
-csrow3 are populated, this indicates a dual ranked set of DIMMs for
-channels 0 and 1.
-
-Within each of the ``mcX`` and ``csrowX`` directories are several EDAC
-control and attribute files.
+Within each of the ``mcX`` directory are several EDAC control and
+attribute files.
``mcX`` directories
-------------------
@@ -569,7 +553,7 @@ this ``X`` memory module:
- Unbuffered-DDR
.. [#f5] On some systems, the memory controller doesn't have any logic
- to identify the memory module. On such systems, the directory is called ``rankX`` and works on a similar way as the ``csrowX`` directories.
+ to identify the memory module. On such systems, the directory is called ``rankX``.
On modern Intel memory controllers, the memory controller identifies the
memory modules directly. On such systems, the directory is called ``dimmX``.
@@ -577,126 +561,6 @@ this ``X`` memory module:
symlinks inside the sysfs mapping that are automatically created by
the sysfs subsystem. Currently, they serve no purpose.
-``csrowX`` directories
-----------------------
-
-When CONFIG_EDAC_LEGACY_SYSFS is enabled, sysfs will contain the ``csrowX``
-directories. As this API doesn't work properly for Rambus, FB-DIMMs and
-modern Intel Memory Controllers, this is being deprecated in favor of
-``dimmX`` directories.
-
-In the ``csrowX`` directories are EDAC control and attribute files for
-this ``X`` instance of csrow:
-
-
-- ``ue_count`` - Total Uncorrectable Errors count attribute file
-
- This attribute file displays the total count of uncorrectable
- errors that have occurred on this csrow. If panic_on_ue is set
- this counter will not have a chance to increment, since EDAC
- will panic the system.
-
-
-- ``ce_count`` - Total Correctable Errors count attribute file
-
- This attribute file displays the total count of correctable
- errors that have occurred on this csrow. This count is very
- important to examine. CEs provide early indications that a
- DIMM is beginning to fail. This count field should be
- monitored for non-zero values and report such information
- to the system administrator.
-
-
-- ``size_mb`` - Total memory managed by this csrow attribute file
-
- This attribute file displays, in count of megabytes, the memory
- that this csrow contains.
-
-
-- ``mem_type`` - Memory Type attribute file
-
- This attribute file will display what type of memory is currently
- on this csrow. Normally, either buffered or unbuffered memory.
- Examples:
-
- - Registered-DDR
- - Unbuffered-DDR
-
-
-- ``edac_mode`` - EDAC Mode of operation attribute file
-
- This attribute file will display what type of Error detection
- and correction is being utilized.
-
-
-- ``dev_type`` - Device type attribute file
-
- This attribute file will display what type of DRAM device is
- being utilized on this DIMM.
- Examples:
-
- - x1
- - x2
- - x4
- - x8
-
-
-- ``ch0_ce_count`` - Channel 0 CE Count attribute file
-
- This attribute file will display the count of CEs on this
- DIMM located in channel 0.
-
-
-- ``ch0_ue_count`` - Channel 0 UE Count attribute file
-
- This attribute file will display the count of UEs on this
- DIMM located in channel 0.
-
-
-- ``ch0_dimm_label`` - Channel 0 DIMM Label control file
-
-
- This control file allows this DIMM to have a label assigned
- to it. With this label in the module, when errors occur
- the output can provide the DIMM label in the system log.
- This becomes vital for panic events to isolate the
- cause of the UE event.
-
- DIMM Labels must be assigned after booting, with information
- that correctly identifies the physical slot with its
- silk screen label. This information is currently very
- motherboard specific and determination of this information
- must occur in userland at this time.
-
-
-- ``ch1_ce_count`` - Channel 1 CE Count attribute file
-
-
- This attribute file will display the count of CEs on this
- DIMM located in channel 1.
-
-
-- ``ch1_ue_count`` - Channel 1 UE Count attribute file
-
-
- This attribute file will display the count of UEs on this
- DIMM located in channel 0.
-
-
-- ``ch1_dimm_label`` - Channel 1 DIMM Label control file
-
- This control file allows this DIMM to have a label assigned
- to it. With this label in the module, when errors occur
- the output can provide the DIMM label in the system log.
- This becomes vital for panic events to isolate the
- cause of the UE event.
-
- DIMM Labels must be assigned after booting, with information
- that correctly identifies the physical slot with its
- silk screen label. This information is currently very
- motherboard specific and determination of this information
- must occur in userland at this time.
-
System Logging
--------------
diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst
index 6fdb495ac466..f71f349553e4 100644
--- a/Documentation/admin-guide/bcache.rst
+++ b/Documentation/admin-guide/bcache.rst
@@ -17,8 +17,7 @@ The latest bcache kernel code can be found from mainline Linux kernel:
It's designed around the performance characteristics of SSDs - it only allocates
in erase block sized buckets, and it uses a hybrid btree/log to track cached
extents (which can be anywhere from a single sector to the bucket size). It's
-designed to avoid random writes at all costs; it fills up an erase block
-sequentially, then issues a discard before reusing it.
+designed to avoid random writes at all costs.
Both writethrough and writeback caching are supported. Writeback defaults to
off, but can be switched on and off arbitrarily at runtime. Bcache goes to
@@ -618,19 +617,11 @@ bucket_size
cache_replacement_policy
One of either lru, fifo or random.
-discard
- Boolean; if on a discard/TRIM will be issued to each bucket before it is
- reused. Defaults to off, since SATA TRIM is an unqueued command (and thus
- slow).
-
freelist_percent
Size of the freelist as a percentage of nbuckets. Can be written to to
increase the number of buckets kept on the freelist, which lets you
artificially reduce the size of the cache at runtime. Mostly for testing
- purposes (i.e. testing how different size caches affect your hit rate), but
- since buckets are discarded when they move on to the freelist will also make
- the SSD's garbage collection easier by effectively giving it more reserved
- space.
+ purposes (i.e. testing how different size caches affect your hit rate).
io_errors
Number of errors that have occurred, decayed by io_error_halflife.
diff --git a/Documentation/admin-guide/blockdev/zoned_loop.rst b/Documentation/admin-guide/blockdev/zoned_loop.rst
index 64dcfde7450a..806adde664db 100644
--- a/Documentation/admin-guide/blockdev/zoned_loop.rst
+++ b/Documentation/admin-guide/blockdev/zoned_loop.rst
@@ -68,30 +68,43 @@ The options available for the add command can be listed by reading the
In more details, the options that can be used with the "add" command are as
follows.
-================ ===========================================================
-id Device number (the X in /dev/zloopX).
- Default: automatically assigned.
-capacity_mb Device total capacity in MiB. This is always rounded up to
- the nearest higher multiple of the zone size.
- Default: 16384 MiB (16 GiB).
-zone_size_mb Device zone size in MiB. Default: 256 MiB.
-zone_capacity_mb Device zone capacity (must always be equal to or lower than
- the zone size. Default: zone size.
-conv_zones Total number of conventioanl zones starting from sector 0.
- Default: 8.
-base_dir Path to the base directory where to create the directory
- containing the zone files of the device.
- Default=/var/local/zloop.
- The device directory containing the zone files is always
- named with the device ID. E.g. the default zone file
- directory for /dev/zloop0 is /var/local/zloop/0.
-nr_queues Number of I/O queues of the zoned block device. This value is
- always capped by the number of online CPUs
- Default: 1
-queue_depth Maximum I/O queue depth per I/O queue.
- Default: 64
-buffered_io Do buffered IOs instead of direct IOs (default: false)
-================ ===========================================================
+=================== =========================================================
+id Device number (the X in /dev/zloopX).
+ Default: automatically assigned.
+capacity_mb Device total capacity in MiB. This is always rounded up
+ to the nearest higher multiple of the zone size.
+ Default: 16384 MiB (16 GiB).
+zone_size_mb Device zone size in MiB. Default: 256 MiB.
+zone_capacity_mb Device zone capacity (must always be equal to or lower
+ than the zone size. Default: zone size.
+conv_zones Total number of conventioanl zones starting from
+ sector 0
+ Default: 8
+base_dir Path to the base directory where to create the directory
+ containing the zone files of the device.
+ Default=/var/local/zloop.
+ The device directory containing the zone files is always
+ named with the device ID. E.g. the default zone file
+ directory for /dev/zloop0 is /var/local/zloop/0.
+nr_queues Number of I/O queues of the zoned block device. This
+ value is always capped by the number of online CPUs
+ Default: 1
+queue_depth Maximum I/O queue depth per I/O queue.
+ Default: 64
+buffered_io Do buffered IOs instead of direct IOs (default: false)
+zone_append Enable or disable a zloop device native zone append
+ support.
+ Default: 1 (enabled).
+ If native zone append support is disabled, the block layer
+ will emulate this operation using regular write
+ operations.
+ordered_zone_append Enable zloop mitigation of zone append reordering.
+ Default: disabled.
+ This is useful for testing file systems file data mapping
+ (extents), as when enabled, this can significantly reduce
+ the number of data extents needed to for a file data
+ mapping.
+=================== =========================================================
3) Deleting a Zoned Device
--------------------------
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 0e6c67ac585a..4c072e85acdf 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -53,7 +53,8 @@ v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgrou
5-2. Memory
5-2-1. Memory Interface Files
5-2-2. Usage Guidelines
- 5-2-3. Memory Ownership
+ 5-2-3. Reclaim Protection
+ 5-2-4. Memory Ownership
5-3. IO
5-3-1. IO Interface Files
5-3-2. Writeback
@@ -1317,7 +1318,7 @@ PAGE_SIZE multiple when read back.
smaller overages.
Effective min boundary is limited by memory.min values of
- all ancestor cgroups. If there is memory.min overcommitment
+ ancestor cgroups. If there is memory.min overcommitment
(child cgroup or cgroups are requiring more protected memory
than parent will allow), then each child cgroup will get
the part of parent's protection proportional to its
@@ -1326,9 +1327,6 @@ PAGE_SIZE multiple when read back.
Putting more memory than generally available under this
protection is discouraged and may lead to constant OOMs.
- If a memory cgroup is not populated with processes,
- its memory.min is ignored.
-
memory.low
A read-write single value file which exists on non-root
cgroups. The default is "0".
@@ -1343,7 +1341,7 @@ PAGE_SIZE multiple when read back.
smaller overages.
Effective low boundary is limited by memory.low values of
- all ancestor cgroups. If there is memory.low overcommitment
+ ancestor cgroups. If there is memory.low overcommitment
(child cgroup or cgroups are requiring more protected memory
than parent will allow), then each child cgroup will get
the part of parent's protection proportional to its
@@ -1934,6 +1932,27 @@ memory - is necessary to determine whether a workload needs more
memory; unfortunately, memory pressure monitoring mechanism isn't
implemented yet.
+Reclaim Protection
+~~~~~~~~~~~~~~~~~~
+
+The protection configured with "memory.low" or "memory.min" applies relatively
+to the target of the reclaim (i.e. any of memory cgroup limits, proactive
+memory.reclaim or global reclaim apparently located in the root cgroup).
+The protection value configured for B applies unchanged to the reclaim
+targeting A (i.e. caused by competition with the sibling E)::
+
+ root - ... - A - B - C
+ \ ` D
+ ` E
+
+When the reclaim targets ancestors of A, the effective protection of B is
+capped by the protection value configured for A (and any other intermediate
+ancestors between A and the target).
+
+To express indifference about relative sibling protection, it is suggested to
+use memory_recursiveprot. Configuring all descendants of a parent with finite
+protection to "max" works but it may unnecessarily skew memory.events:low
+field.
Memory Ownership
~~~~~~~~~~~~~~~~
diff --git a/Documentation/admin-guide/efi-stub.rst b/Documentation/admin-guide/efi-stub.rst
index 090f3a185e18..f8e7407698bd 100644
--- a/Documentation/admin-guide/efi-stub.rst
+++ b/Documentation/admin-guide/efi-stub.rst
@@ -79,6 +79,9 @@ because the image we're executing is interpreted by the EFI shell,
which understands relative paths, whereas the rest of the command line
is passed to bzImage.efi.
+.. hint::
+ It is also possible to provide an initrd using a Linux-specific UEFI
+ protocol at boot time. See :ref:`pe-coff-entry-point` for details.
The "dtb=" option
-----------------
diff --git a/Documentation/admin-guide/hw-vuln/l1d_flush.rst b/Documentation/admin-guide/hw-vuln/l1d_flush.rst
index 210020bc3f56..35dc25159b28 100644
--- a/Documentation/admin-guide/hw-vuln/l1d_flush.rst
+++ b/Documentation/admin-guide/hw-vuln/l1d_flush.rst
@@ -31,7 +31,7 @@ specifically opt into the feature to enable it.
Mitigation
----------
-When PR_SET_L1D_FLUSH is enabled for a task a flush of the L1D cache is
+When PR_SPEC_L1D_FLUSH is enabled for a task a flush of the L1D cache is
performed when the task is scheduled out and the incoming task belongs to a
different process and therefore to a different address space.
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index 991f12adef8d..4bb8549bee82 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -406,7 +406,7 @@ The possible values in this file are:
- Single threaded indirect branch prediction (STIBP) status for protection
between different hyper threads. This feature can be controlled through
- prctl per process, or through kernel command line options. This is x86
+ prctl per process, or through kernel command line options. This is an x86
only feature. For more details see below.
==================== ========================================================
diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index 7bf8cc7df6b5..02a725536cc5 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -110,102 +110,7 @@ The parameters listed below are only valid if certain kernel build options
were enabled and if respective hardware is present. This list should be kept
in alphabetical order. The text in square brackets at the beginning
of each description states the restrictions within which a parameter
-is applicable::
-
- ACPI ACPI support is enabled.
- AGP AGP (Accelerated Graphics Port) is enabled.
- ALSA ALSA sound support is enabled.
- APIC APIC support is enabled.
- APM Advanced Power Management support is enabled.
- APPARMOR AppArmor support is enabled.
- ARM ARM architecture is enabled.
- ARM64 ARM64 architecture is enabled.
- AX25 Appropriate AX.25 support is enabled.
- CLK Common clock infrastructure is enabled.
- CMA Contiguous Memory Area support is enabled.
- DRM Direct Rendering Management support is enabled.
- DYNAMIC_DEBUG Build in debug messages and enable them at runtime
- EARLY Parameter processed too early to be embedded in initrd.
- EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
- EFI EFI Partitioning (GPT) is enabled
- EVM Extended Verification Module
- FB The frame buffer device is enabled.
- FTRACE Function tracing enabled.
- GCOV GCOV profiling is enabled.
- HIBERNATION HIBERNATION is enabled.
- HW Appropriate hardware is enabled.
- HYPER_V HYPERV support is enabled.
- IMA Integrity measurement architecture is enabled.
- IP_PNP IP DHCP, BOOTP, or RARP is enabled.
- IPV6 IPv6 support is enabled.
- ISAPNP ISA PnP code is enabled.
- ISDN Appropriate ISDN support is enabled.
- ISOL CPU Isolation is enabled.
- JOY Appropriate joystick support is enabled.
- KGDB Kernel debugger support is enabled.
- KVM Kernel Virtual Machine support is enabled.
- LIBATA Libata driver is enabled
- LOONGARCH LoongArch architecture is enabled.
- LOOP Loopback device support is enabled.
- LP Printer support is enabled.
- M68k M68k architecture is enabled.
- These options have more detailed description inside of
- Documentation/arch/m68k/kernel-options.rst.
- MDA MDA console support is enabled.
- MIPS MIPS architecture is enabled.
- MOUSE Appropriate mouse support is enabled.
- MSI Message Signaled Interrupts (PCI).
- MTD MTD (Memory Technology Device) support is enabled.
- NET Appropriate network support is enabled.
- NFS Appropriate NFS support is enabled.
- NUMA NUMA support is enabled.
- OF Devicetree is enabled.
- PARISC The PA-RISC architecture is enabled.
- PCI PCI bus support is enabled.
- PCIE PCI Express support is enabled.
- PCMCIA The PCMCIA subsystem is enabled.
- PNP Plug & Play support is enabled.
- PPC PowerPC architecture is enabled.
- PPT Parallel port support is enabled.
- PS2 Appropriate PS/2 support is enabled.
- PV_OPS A paravirtualized kernel is enabled.
- RAM RAM disk support is enabled.
- RDT Intel Resource Director Technology.
- RISCV RISCV architecture is enabled.
- S390 S390 architecture is enabled.
- SCSI Appropriate SCSI support is enabled.
- A lot of drivers have their options described inside
- the Documentation/scsi/ sub-directory.
- SDW SoundWire support is enabled.
- SECURITY Different security models are enabled.
- SELINUX SELinux support is enabled.
- SERIAL Serial support is enabled.
- SH SuperH architecture is enabled.
- SMP The kernel is an SMP kernel.
- SPARC Sparc architecture is enabled.
- SUSPEND System suspend states are enabled.
- SWSUSP Software suspend (hibernation) is enabled.
- TPM TPM drivers are enabled.
- UMS USB Mass Storage support is enabled.
- USB USB support is enabled.
- USBHID USB Human Interface Device support is enabled.
- V4L Video For Linux support is enabled.
- VGA The VGA console has been enabled.
- VMMIO Driver for memory mapped virtio devices is enabled.
- VT Virtual terminal support is enabled.
- WDT Watchdog support is enabled.
- X86-32 X86-32, aka i386 architecture is enabled.
- X86-64 X86-64 architecture is enabled.
- X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
- X86_UV SGI UV support is enabled.
- XEN Xen support is enabled
- XTENSA xtensa architecture is enabled.
-
-In addition, the following text indicates that the option::
-
- BOOT Is a boot loader parameter.
- BUGS= Relates to possible processor bugs on the said processor.
- KNL Is a kernel start-up parameter.
+is applicable.
Parameters denoted with BOOT are actually interpreted by the boot
loader, and have no meaning to the kernel directly.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6c42061ca20e..1e89d122f084 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1,3 +1,101 @@
+ ACPI ACPI support is enabled.
+ AGP AGP (Accelerated Graphics Port) is enabled.
+ ALSA ALSA sound support is enabled.
+ APIC APIC support is enabled.
+ APM Advanced Power Management support is enabled.
+ APPARMOR AppArmor support is enabled.
+ ARM ARM architecture is enabled.
+ ARM64 ARM64 architecture is enabled.
+ AX25 Appropriate AX.25 support is enabled.
+ CLK Common clock infrastructure is enabled.
+ CMA Contiguous Memory Area support is enabled.
+ DRM Direct Rendering Management support is enabled.
+ DYNAMIC_DEBUG Build in debug messages and enable them at runtime
+ EARLY Parameter processed too early to be embedded in initrd.
+ EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
+ EFI EFI Partitioning (GPT) is enabled
+ EVM Extended Verification Module
+ FB The frame buffer device is enabled.
+ FTRACE Function tracing enabled.
+ GCOV GCOV profiling is enabled.
+ HIBERNATION HIBERNATION is enabled.
+ HW Appropriate hardware is enabled.
+ HYPER_V HYPERV support is enabled.
+ IMA Integrity measurement architecture is enabled.
+ IP_PNP IP DHCP, BOOTP, or RARP is enabled.
+ IPV6 IPv6 support is enabled.
+ ISAPNP ISA PnP code is enabled.
+ ISDN Appropriate ISDN support is enabled.
+ ISOL CPU Isolation is enabled.
+ JOY Appropriate joystick support is enabled.
+ KGDB Kernel debugger support is enabled.
+ KVM Kernel Virtual Machine support is enabled.
+ LIBATA Libata driver is enabled
+ LOONGARCH LoongArch architecture is enabled.
+ LOOP Loopback device support is enabled.
+ LP Printer support is enabled.
+ M68k M68k architecture is enabled.
+ These options have more detailed description inside of
+ Documentation/arch/m68k/kernel-options.rst.
+ MDA MDA console support is enabled.
+ MIPS MIPS architecture is enabled.
+ MOUSE Appropriate mouse support is enabled.
+ MSI Message Signaled Interrupts (PCI).
+ MTD MTD (Memory Technology Device) support is enabled.
+ NET Appropriate network support is enabled.
+ NFS Appropriate NFS support is enabled.
+ NUMA NUMA support is enabled.
+ OF Devicetree is enabled.
+ PARISC The PA-RISC architecture is enabled.
+ PCI PCI bus support is enabled.
+ PCIE PCI Express support is enabled.
+ PCMCIA The PCMCIA subsystem is enabled.
+ PNP Plug & Play support is enabled.
+ PPC PowerPC architecture is enabled.
+ PPT Parallel port support is enabled.
+ PS2 Appropriate PS/2 support is enabled.
+ PV_OPS A paravirtualized kernel is enabled.
+ RAM RAM disk support is enabled.
+ RDT Intel Resource Director Technology.
+ RISCV RISCV architecture is enabled.
+ S390 S390 architecture is enabled.
+ SCSI Appropriate SCSI support is enabled.
+ A lot of drivers have their options described inside
+ the Documentation/scsi/ sub-directory.
+ SDW SoundWire support is enabled.
+ SECURITY Different security models are enabled.
+ SELINUX SELinux support is enabled.
+ SERIAL Serial support is enabled.
+ SH SuperH architecture is enabled.
+ SMP The kernel is an SMP kernel.
+ SPARC Sparc architecture is enabled.
+ SUSPEND System suspend states are enabled.
+ SWSUSP Software suspend (hibernation) is enabled.
+ TPM TPM drivers are enabled.
+ UMS USB Mass Storage support is enabled.
+ USB USB support is enabled.
+ USBHID USB Human Interface Device support is enabled.
+ V4L Video For Linux support is enabled.
+ VGA The VGA console has been enabled.
+ VMMIO Driver for memory mapped virtio devices is enabled.
+ VT Virtual terminal support is enabled.
+ WDT Watchdog support is enabled.
+ X86-32 X86-32, aka i386 architecture is enabled.
+ X86-64 X86-64 architecture is enabled.
+ X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
+ X86_UV SGI UV support is enabled.
+ XEN Xen support is enabled
+ XTENSA xtensa architecture is enabled.
+
+In addition, the following text indicates that the option
+
+ BOOT Is a boot loader parameter.
+ BUGS= Relates to possible processor bugs on the said processor.
+ KNL Is a kernel start-up parameter.
+
+
+Kernel parameters
+
accept_memory= [MM]
Format: { eager | lazy }
default: lazy
@@ -1907,6 +2005,16 @@
/sys/power/pm_test). Only available when CONFIG_PM_DEBUG
is set. Default value is 5.
+ hibernate_compression_threads=
+ [HIBERNATION]
+ Set the number of threads used for compressing or decompressing
+ hibernation images.
+
+ Format: <integer>
+ Default: 3
+ Minimum: 1
+ Example: hibernate_compression_threads=4
+
highmem=nn[KMG] [KNL,BOOT,EARLY] forces the highmem zone to have an exact
size of <nn>. This works even on boxes that have no
highmem otherwise. This also works to reduce highmem
@@ -6207,7 +6315,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
- mba, smba, bmec, abmc.
+ mba, smba, bmec, abmc, sdciae.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba
@@ -6404,7 +6512,7 @@
that don't.
off - no mitigation
- auto - automatically select a migitation
+ auto - automatically select a mitigation
auto,nosmt - automatically select a mitigation,
disabling SMT if necessary for
the full mitigation (only on Zen1
@@ -6500,6 +6608,10 @@
Memory area to be used by remote processor image,
managed by CMA.
+ rseq_debug= [KNL] Enable or disable restartable sequence
+ debug mode. Defaults to CONFIG_RSEQ_DEBUG_DEFAULT_ENABLE.
+ Format: <bool>
+
rt_group_sched= [KNL] Enable or disable SCHED_RR/FIFO group scheduling
when CONFIG_RT_GROUP_SCHED=y. Defaults to
!CONFIG_RT_GROUP_SCHED_DEFAULT_DISABLED.
@@ -7150,7 +7262,7 @@
limit. Default value is 8191 pools.
stacktrace [FTRACE]
- Enabled the stack tracer on boot up.
+ Enable the stack tracer on boot up.
stacktrace_filter=[function-list]
[FTRACE] Limit the functions that the stack tracer
diff --git a/Documentation/admin-guide/md.rst b/Documentation/admin-guide/md.rst
index deed823eab01..dc7eab191caa 100644
--- a/Documentation/admin-guide/md.rst
+++ b/Documentation/admin-guide/md.rst
@@ -238,6 +238,16 @@ All md devices contain:
the number of devices in a raid4/5/6, or to support external
metadata formats which mandate such clipping.
+ logical_block_size
+ Configure the array's logical block size in bytes. This attribute
+ is only supported for 1.x meta. Write the value before starting
+ array. The final array LBS uses the maximum between this
+ configuration and LBS of all combined devices. Note that
+ LBS cannot exceed PAGE_SIZE before RAID supports folio.
+ WARNING: Arrays created on new kernel cannot be assembled at old
+ kernel due to padding check, Set module parameter 'check_new_feature'
+ to false to bypass, but data loss may occur.
+
reshape_position
This is either ``none`` or a sector number within the devices of
the array where ``reshape`` is up to. If this is set, the three
diff --git a/Documentation/admin-guide/media/mali-c55-graph.dot b/Documentation/admin-guide/media/mali-c55-graph.dot
new file mode 100644
index 000000000000..0775ba42bf4c
--- /dev/null
+++ b/Documentation/admin-guide/media/mali-c55-graph.dot
@@ -0,0 +1,19 @@
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{} | mali-c55 tpg\n/dev/v4l-subdev0 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port0 -> n00000003:port0 [style=dashed]
+ n00000003 [label="{{<port0> 0} | mali-c55 isp\n/dev/v4l-subdev1 | {<port1> 1 | <port2> 2}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000003:port1 -> n00000007:port0 [style=bold]
+ n00000003:port2 -> n00000007:port2 [style=bold]
+ n00000003:port1 -> n0000000b:port0 [style=bold]
+ n00000007 [label="{{<port0> 0 | <port2> 2} | mali-c55 resizer fr\n/dev/v4l-subdev2 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000007:port1 -> n0000000e [style=bold]
+ n0000000b [label="{{<port0> 0} | mali-c55 resizer ds\n/dev/v4l-subdev3 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n0000000b:port1 -> n00000012 [style=bold]
+ n0000000e [label="mali-c55 fr\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n00000012 [label="mali-c55 ds\n/dev/video1", shape=box, style=filled, fillcolor=yellow]
+ n00000022 [label="{{<port0> 0} | csi2-rx\n/dev/v4l-subdev4 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000022:port1 -> n00000003:port0
+ n00000027 [label="{{} | imx415 1-001a\n/dev/v4l-subdev5 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000027:port0 -> n00000022:port0 [style=bold]
+} \ No newline at end of file
diff --git a/Documentation/admin-guide/media/mali-c55.rst b/Documentation/admin-guide/media/mali-c55.rst
new file mode 100644
index 000000000000..315f982000c4
--- /dev/null
+++ b/Documentation/admin-guide/media/mali-c55.rst
@@ -0,0 +1,413 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================================
+ARM Mali-C55 Image Signal Processor driver
+==========================================
+
+Introduction
+============
+
+This file documents the driver for ARM's Mali-C55 Image Signal Processor. The
+driver is located under drivers/media/platform/arm/mali-c55.
+
+The Mali-C55 ISP receives data in either raw Bayer format or RGB/YUV format from
+sensors through either a parallel interface or a memory bus before processing it
+and outputting it through an internal DMA engine. Two output pipelines are
+possible (though one may not be fitted, depending on the implementation). These
+are referred to as "Full resolution" and "Downscale", but the naming is historic
+and both pipes are capable of cropping/scaling operations. The full resolution
+pipe is also capable of outputting RAW data, bypassing much of the ISP's
+processing. The downscale pipe cannot output RAW data. An integrated test
+pattern generator can be used to drive the ISP and produce image data in the
+absence of a connected camera sensor. The driver module is named mali_c55, and
+is enabled through the CONFIG_VIDEO_MALI_C55 config option.
+
+The driver implements V4L2, Media Controller and V4L2 Subdevice interfaces and
+expects camera sensors connected to the ISP to have V4L2 subdevice interfaces.
+
+Mali-C55 ISP hardware
+=====================
+
+A high level functional view of the Mali-C55 ISP is presented below. The ISP
+takes input from either a live source or through a DMA engine for memory input,
+depending on the SoC integration.::
+
+ +---------+ +----------+ +--------+
+ | Sensor |--->| CSI-2 Rx | "Full Resolution" | DMA |
+ +---------+ +----------+ |\ Output +--->| Writer |
+ | | \ | +--------+
+ | | \ +----------+ +------+---> Streaming I/O
+ +------------+ +------->| | | | |
+ | | | |-->| Mali-C55 |--+
+ | DMA Reader |--------------->| | | ISP | |
+ | | | / | | | +---> Streaming I/O
+ +------------+ | / +----------+ | |
+ |/ +------+
+ | +--------+
+ +--->| DMA |
+ "Downscaled" | Writer |
+ Output +--------+
+
+Media Controller Topology
+=========================
+
+An example of the ISP's topology (as implemented in a system with an IMX415
+camera sensor and generic CSI-2 receiver) is below:
+
+
+.. kernel-figure:: mali-c55-graph.dot
+ :alt: mali-c55-graph.dot
+ :align: center
+
+The driver has 4 V4L2 subdevices:
+
+- `mali_c55 isp`: Responsible for configuring input crop and color space
+ conversion
+- `mali_c55 tpg`: The test pattern generator, emulating a camera sensor.
+- `mali_c55 resizer fr`: The Full-Resolution pipe resizer
+- `mali_c55 resizer ds`: The Downscale pipe resizer
+
+The driver has 3 V4L2 video devices:
+
+- `mali-c55 fr`: The full-resolution pipe's capture device
+- `mali-c55 ds`: The downscale pipe's capture device
+- `mali-c55 3a stats`: The 3A statistics capture device
+
+Frame sequences are synchronised across to two capture devices, meaning if one
+pipe is started later than the other the sequence numbers returned in its
+buffers will match those of the other pipe rather than starting from zero.
+
+Idiosyncrasies
+--------------
+
+**mali-c55 isp**
+The `mali-c55 isp` subdevice has a single sink pad to which all sources of data
+should be connected. The active source is selected by enabling the appropriate
+media link and disabling all others. The ISP has two source pads, reflecting the
+different paths through which it can internally route data. Tap points within
+the ISP allow users to divert data to avoid processing by some or all of the
+hardware's processing steps. The diagram below is intended only to highlight how
+the bypassing works and is not a true reflection of those processing steps; for
+a high-level functional block diagram see ARM's developer page for the
+ISP [3]_::
+
+ +--------------------------------------------------------------+
+ | Possible Internal ISP Data Routes |
+ | +------------+ +----------+ +------------+ |
+ +---+ | | | | | Colour | +---+
+ | 0 |--+-->| Processing |->| Demosaic |->| Space |--->| 1 |
+ +---+ | | | | | | Conversion | +---+
+ | | +------------+ +----------+ +------------+ |
+ | | +---+
+ | +---------------------------------------------------| 2 |
+ | +---+
+ | |
+ +--------------------------------------------------------------+
+
+
+.. flat-table::
+ :header-rows: 1
+
+ * - Pad
+ - Direction
+ - Purpose
+
+ * - 0
+ - sink
+ - Data input, connected to the TPG and camera sensors
+
+ * - 1
+ - source
+ - RGB/YUV data, connected to the FR and DS V4L2 subdevices
+
+ * - 2
+ - source
+ - RAW bayer data, connected to the FR V4L2 subdevices
+
+The ISP is limited to both input and output resolutions between 640x480 and
+8192x8192, and this is reflected in the ISP and resizer subdevice's .set_fmt()
+operations.
+
+**mali-c55 resizer fr**
+The `mali-c55 resizer fr` subdevice has two _sink_ pads to reflect the different
+insertion points in the hardware (either RAW or demosaiced data):
+
+.. flat-table::
+ :header-rows: 1
+
+ * - Pad
+ - Direction
+ - Purpose
+
+ * - 0
+ - sink
+ - Data input connected to the ISP's demosaiced stream.
+
+ * - 1
+ - source
+ - Data output connected to the capture video device
+
+ * - 2
+ - sink
+ - Data input connected to the ISP's raw data stream
+
+The data source in use is selected through the routing API; two routes each of a
+single stream are available:
+
+.. flat-table::
+ :header-rows: 1
+
+ * - Sink Pad
+ - Source Pad
+ - Purpose
+
+ * - 0
+ - 1
+ - Demosaiced data route
+
+ * - 2
+ - 1
+ - Raw data route
+
+
+If the demosaiced route is active then the FR pipe is only capable of output
+in RGB/YUV formats. If the raw route is active then the output reflects the
+input (which may be either Bayer or RGB/YUV data).
+
+Using the driver to capture video
+=================================
+
+Using the media controller APIs we can configure the input source and ISP to
+capture images in a variety of formats. In the examples below, configuring the
+media graph is done with the v4l-utils [1]_ package's media-ctl utility.
+Capturing the images is done with yavta [2]_.
+
+Configuring the input source
+----------------------------
+
+The first step is to set the input source that we wish by enabling the correct
+media link. Using the example topology above, we can select the TPG as follows:
+
+.. code-block:: none
+
+ media-ctl -l "'lte-csi2-rx':1->'mali-c55 isp':0[0]"
+ media-ctl -l "'mali-c55 tpg':0->'mali-c55 isp':0[1]"
+
+Configuring which video devices will stream data
+------------------------------------------------
+
+The driver will wait for all video devices to have their VIDIOC_STREAMON ioctl
+called before it tells the sensor to start streaming. To facilitate this we need
+to enable links to the video devices that we want to use. In the example below
+we enable the links to both of the image capture video devices
+
+.. code-block:: none
+
+ media-ctl -l "'mali-c55 resizer fr':1->'mali-c55 fr':0[1]"
+ media-ctl -l "'mali-c55 resizer ds':1->'mali-c55 ds':0[1]"
+
+Capturing bayer data from the source and processing to RGB/YUV
+--------------------------------------------------------------
+
+To capture 1920x1080 bayer data from the source and push it through the ISP's
+full processing pipeline, we configure the data formats appropriately on the
+source, ISP and resizer subdevices and set the FR resizer's routing to select
+processed data. The media bus format on the resizer's source pad will be either
+RGB121212_1X36 or YUV10_1X30, depending on whether you want to capture RGB or
+YUV. The ISP's debayering block outputs RGB data natively, setting the source
+pad format to YUV10_1X30 enables the colour space conversion block.
+
+In this example we target RGB565 output, so select RGB121212_1X36 as the resizer
+source pad's format:
+
+.. code-block:: none
+
+ # Set formats on the TPG and ISP
+ media-ctl -V "'mali-c55 tpg':0[fmt:SRGGB20_1X20/1920x1080]"
+ media-ctl -V "'mali-c55 isp':0[fmt:SRGGB20_1X20/1920x1080]"
+ media-ctl -V "'mali-c55 isp':1[fmt:SRGGB20_1X20/1920x1080]"
+
+ # Set routing on the FR resizer
+ media-ctl -R "'mali-c55 resizer fr'[0/0->1/0[1],2/0->1/0[0]]"
+
+ # Set format on the resizer, must be done AFTER the routing.
+ media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB121212_1X36/1920x1080]"
+
+The downscale output can also be used to stream data at the same time. In this
+case since only processed data can be captured through the downscale output no
+routing need be set:
+
+.. code-block:: none
+
+ # Set format on the resizer
+ media-ctl -V "'mali-c55 resizer ds':1[fmt:RGB121212_1X36/1920x1080]"
+
+Following which images can be captured from both the FR and DS output's video
+devices (simultaneously, if desired):
+
+.. code-block:: none
+
+ yavta -f RGB565 -s 1920x1080 -c10 /dev/video0
+ yavta -f RGB565 -s 1920x1080 -c10 /dev/video1
+
+Cropping the image
+~~~~~~~~~~~~~~~~~~
+
+Both the full resolution and downscale pipes can crop to a minimum resolution of
+640x480. To crop the image simply configure the resizer's sink pad's crop and
+compose rectangles and set the format on the video device:
+
+.. code-block:: none
+
+ media-ctl -V "'mali-c55 resizer fr':0[fmt:RGB121212_1X36/1920x1080 crop:(480,270)/640x480 compose:(0,0)/640x480]"
+ media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB121212_1X36/640x480]"
+ yavta -f RGB565 -s 640x480 -c10 /dev/video0
+
+Downscaling the image
+~~~~~~~~~~~~~~~~~~~~~
+
+Both the full resolution and downscale pipes can downscale the image by up to 8x
+provided the minimum 640x480 output resolution is adhered to. For the best image
+result the scaling ratio for each direction should be the same. To configure
+scaling we use the compose rectangle on the resizer's sink pad:
+
+.. code-block:: none
+
+ media-ctl -V "'mali-c55 resizer fr':0[fmt:RGB121212_1X36/1920x1080 crop:(0,0)/1920x1080 compose:(0,0)/640x480]"
+ media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB121212_1X36/640x480]"
+ yavta -f RGB565 -s 640x480 -c10 /dev/video0
+
+Capturing images in YUV formats
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If we need to output YUV data rather than RGB the color space conversion block
+needs to be active, which is achieved by setting MEDIA_BUS_FMT_YUV10_1X30 on the
+resizer's source pad. We can then configure a capture format like NV12 (here in
+its multi-planar variant)
+
+.. code-block:: none
+
+ media-ctl -V "'mali-c55 resizer fr':1[fmt:YUV10_1X30/1920x1080]"
+ yavta -f NV12M -s 1920x1080 -c10 /dev/video0
+
+Capturing RGB data from the source and processing it with the resizers
+----------------------------------------------------------------------
+
+The Mali-C55 ISP can work with sensors capable of outputting RGB data. In this
+case although none of the image quality blocks would be used it can still
+crop/scale the data in the usual way. For this reason RGB data input to the ISP
+still goes through the ISP subdevice's pad 1 to the resizer.
+
+To achieve this, the ISP's sink pad's format is set to
+MEDIA_BUS_FMT_RGB202020_1X60 - this reflects the format that data must be in to
+work with the ISP. Converting the camera sensor's output to that format is the
+responsibility of external hardware.
+
+In this example we ask the test pattern generator to give us RGB data instead of
+bayer.
+
+.. code-block:: none
+
+ media-ctl -V "'mali-c55 tpg':0[fmt:RGB202020_1X60/1920x1080]"
+ media-ctl -V "'mali-c55 isp':0[fmt:RGB202020_1X60/1920x1080]"
+
+Cropping or scaling the data can be done in exactly the same way as outlined
+earlier.
+
+Capturing raw data from the source and outputting it unmodified
+-----------------------------------------------------------------
+
+The ISP can additionally capture raw data from the source and output it on the
+full resolution pipe only, completely unmodified. In this case the downscale
+pipe can still process the data normally and be used at the same time.
+
+To configure raw bypass the FR resizer's subdevice's routing table needs to be
+configured, followed by formats in the appropriate places:
+
+.. code-block:: none
+
+ media-ctl -R "'mali-c55 resizer fr'[0/0->1/0[0],2/0->1/0[1]]"
+ media-ctl -V "'mali-c55 isp':0[fmt:RGB202020_1X60/1920x1080]"
+ media-ctl -V "'mali-c55 resizer fr':2[fmt:RGB202020_1X60/1920x1080]"
+ media-ctl -V "'mali-c55 resizer fr':1[fmt:RGB202020_1X60/1920x1080]"
+
+ # Set format on the video device and stream
+ yavta -f RGB565 -s 1920x1080 -c10 /dev/video0
+
+.. _mali-c55-3a-stats:
+
+Capturing ISP Statistics
+========================
+
+The ISP is capable of producing statistics for consumption by image processing
+algorithms running in userspace. These statistics can be captured by queueing
+buffers to the `mali-c55 3a stats` V4L2 Device whilst the ISP is streaming. Only
+the :ref:`V4L2_META_FMT_MALI_C55_STATS <v4l2-meta-fmt-mali-c55-stats>`
+format is supported, so no format-setting need be done:
+
+.. code-block:: none
+
+ # We assume the media graph has been configured to support RGB565 capture
+ # from the mali-c55 fr V4L2 Device, which is at /dev/video0. The statistics
+ # V4L2 device is at /dev/video3
+
+ yavta -f RGB565 -s 1920x1080 -c32 /dev/video0 && \
+ yavta -c10 -F /dev/video3
+
+The layout of the buffer is described by :c:type:`mali_c55_stats_buffer`,
+but broadly statistics are generated to support three image processing
+algorithms; AEXP (Auto-Exposure), AWB (Auto-White Balance) and AF (Auto-Focus).
+These stats can be drawn from various places in the Mali C55 ISP pipeline, known
+as "tap points". This high-level block diagram is intended to explain where in
+the processing flow the statistics can be drawn from::
+
+ +--> AEXP-2 +----> AEXP-1 +--> AF-0
+ | +----> AF-1 |
+ | | |
+ +---------+ | +--------------+ | +--------------+ |
+ | Input +-+-->+ Digital Gain +---+-->+ Black Level +---+---+
+ +---------+ +--------------+ +--------------+ |
+ +-----------------------------------------------------------------+
+ |
+ | +--------------+ +---------+ +----------------+
+ +-->| Sinter Noise +-+ White +--+--->| Lens Shading +--+---------------+
+ | Reduction | | Balance | | | | | |
+ +--------------+ +---------+ | +----------------+ | |
+ +---> AEXP-0 (A) +--> AEXP-0 (B) |
+ +--------------------------------------------------------------------------+
+ |
+ | +----------------+ +--------------+ +----------------+
+ +-->| Tone mapping +-+--->| Demosaicing +->+ Purple Fringe +-+-----------+
+ | | | +--------------+ | Correction | | |
+ +----------------+ +-> AEXP-IRIDIX +----------------+ +---> AWB-0 |
+ +----------------------------------------------------------------------------+
+ | +-------------+ +-------------+
+ +------------------->| Colour +---+--->| Output |
+ | Correction | | | Pipelines |
+ +-------------+ | +-------------+
+ +--> AWB-1
+
+By default all statistics are drawn from the 0th tap point for each algorithm;
+I.E. AEXP statistics from AEXP-0 (A), AWB statistics from AWB-0 and AF
+statistics from AF-0. This is configurable for AEXP and AWB statsistics through
+programming the ISP's parameters.
+
+.. _mali-c55-3a-params:
+
+Programming ISP Parameters
+==========================
+
+The ISP can be programmed with various parameters from userspace to apply to the
+hardware before and during video stream. This allows userspace to dynamically
+change values such as black level, white balance and lens shading gains and so
+on.
+
+The buffer format and how to populate it are described by the
+:ref:`V4L2_META_FMT_MALI_C55_PARAMS <v4l2-meta-fmt-mali-c55-params>` format,
+which should be set as the data format for the `mali-c55 3a params` video node.
+
+References
+==========
+.. [1] https://git.linuxtv.org/v4l-utils.git/
+.. [2] https://git.ideasonboard.org/yavta.git
+.. [3] https://developer.arm.com/Processors/Mali-C55
diff --git a/Documentation/admin-guide/media/platform-cardlist.rst b/Documentation/admin-guide/media/platform-cardlist.rst
index 1230ae4037ad..63f4b19c3628 100644
--- a/Documentation/admin-guide/media/platform-cardlist.rst
+++ b/Documentation/admin-guide/media/platform-cardlist.rst
@@ -18,8 +18,6 @@ am437x-vpfe TI AM437x VPFE
aspeed-video Aspeed AST2400 and AST2500
atmel-isc ATMEL Image Sensor Controller (ISC)
atmel-isi ATMEL Image Sensor Interface (ISI)
-c8sectpfe SDR platform devices
-c8sectpfe SDR platform devices
cafe_ccic Marvell 88ALP01 (Cafe) CMOS Camera Controller
cdns-csi2rx Cadence MIPI-CSI2 RX Controller
cdns-csi2tx Cadence MIPI-CSI2 TX Controller
diff --git a/Documentation/admin-guide/media/rkcif-rk3568-vicap.dot b/Documentation/admin-guide/media/rkcif-rk3568-vicap.dot
new file mode 100644
index 000000000000..3fac59335459
--- /dev/null
+++ b/Documentation/admin-guide/media/rkcif-rk3568-vicap.dot
@@ -0,0 +1,8 @@
+digraph board {
+ rankdir=TB
+ n00000001 [label="{{<port0> 0} | rkcif-dvp0\n/dev/v4l-subdev0 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000001:port1 -> n00000004
+ n00000004 [label="rkcif-dvp0-id0\n/dev/video0", shape=box, style=filled, fillcolor=yellow]
+ n00000025 [label="{{} | it6801 2-0048\n/dev/v4l-subdev1 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
+ n00000025:port0 -> n00000001:port0
+}
diff --git a/Documentation/admin-guide/media/rkcif.rst b/Documentation/admin-guide/media/rkcif.rst
new file mode 100644
index 000000000000..2558c121abc4
--- /dev/null
+++ b/Documentation/admin-guide/media/rkcif.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Rockchip Camera Interface (CIF)
+=========================================
+
+Introduction
+============
+
+The Rockchip Camera Interface (CIF) is featured in many Rockchip SoCs in
+different variants.
+The different variants are combinations of common building blocks, such as
+
+* INTERFACE blocks of different types, namely
+
+ * the Digital Video Port (DVP, a parallel data interface)
+ * the interface block for the MIPI CSI-2 receiver
+
+* CROP units
+
+* MIPI CSI-2 receiver (not available on all variants): This unit is referred
+ to as MIPI CSI HOST in the Rockchip documentation.
+ Technically, it is a separate hardware block, but it is strongly coupled to
+ the CIF and therefore included here.
+
+* MUX units (not available on all variants) that pass the video data to an
+ image signal processor (ISP)
+
+* SCALE units (not available on all variants)
+
+* DMA engines that transfer video data into system memory using a
+ double-buffering mechanism called ping-pong mode
+
+* Support for four streams per INTERFACE block (not available on all
+ variants), e.g., for MIPI CSI-2 Virtual Channels (VCs)
+
+This document describes the different variants of the CIF, their hardware
+layout, as well as their representation in the media controller centric rkcif
+device driver, which is located under drivers/media/platform/rockchip/rkcif.
+
+Variants
+========
+
+Rockchip PX30 Video Input Processor (VIP)
+-----------------------------------------
+
+The PX30 Video Input Processor (VIP) features a digital video port that accepts
+parallel video data or BT.656.
+Since these protocols do not feature multiple streams, the VIP has one DMA
+engine that transfers the input video data into system memory.
+
+The rkcif driver represents this hardware variant by exposing one V4L2 subdevice
+(the DVP INTERFACE/CROP block) and one V4L2 device (the DVP DMA engine).
+
+Rockchip RK3568 Video Capture (VICAP)
+-------------------------------------
+
+The RK3568 Video Capture (VICAP) unit features a digital video port and a MIPI
+CSI-2 receiver that can receive video data independently.
+The DVP accepts parallel video data, BT.656 and BT.1120.
+Since the BT.1120 protocol may feature more than one stream, the RK3568 VICAP
+DVP features four DMA engines that can capture different streams.
+Similarly, the RK3568 VICAP MIPI CSI-2 receiver features four DMA engines to
+handle different Virtual Channels (VCs).
+
+The rkcif driver represents this hardware variant by exposing up the following
+V4L2 subdevices:
+
+* rkcif-dvp0: INTERFACE/CROP block for the DVP
+
+and the following video devices:
+
+* rkcif-dvp0-id0: The support for multiple streams on the DVP is not yet
+ implemented, as it is hard to find test hardware. Thus, this video device
+ represents the first DMA engine of the RK3568 DVP.
+
+.. kernel-figure:: rkcif-rk3568-vicap.dot
+ :alt: Topology of the RK3568 Video Capture (VICAP) unit
+ :align: center
diff --git a/Documentation/admin-guide/media/v4l-drivers.rst b/Documentation/admin-guide/media/v4l-drivers.rst
index 3bac5165b134..393f83e8dc4d 100644
--- a/Documentation/admin-guide/media/v4l-drivers.rst
+++ b/Documentation/admin-guide/media/v4l-drivers.rst
@@ -19,12 +19,14 @@ Video4Linux (V4L) driver-specific documentation
ipu3
ipu6-isys
ivtv
+ mali-c55
mgb4
omap3isp
philips
qcom_camss
raspberrypi-pisp-be
rcar-fdp1
+ rkcif
rkisp1
raspberrypi-rp1-cfe
saa7134
diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
index 0c090b076224..be4c1120e3f0 100644
--- a/Documentation/admin-guide/pm/cpuidle.rst
+++ b/Documentation/admin-guide/pm/cpuidle.rst
@@ -580,6 +580,15 @@ the given CPU as the upper limit for the exit latency of the idle states that
they are allowed to select for that CPU. They should never select any idle
states with exit latency beyond that limit.
+While the above CPU QoS constraints apply to CPU idle time management, user
+space may also request a CPU system wakeup latency QoS limit, via the
+`cpu_wakeup_latency` file. This QoS constraint is respected when selecting a
+suitable idle state for the CPUs, while entering the system-wide suspend-to-idle
+sleep state, but also to the regular CPU idle time management.
+
+Note that, the management of the `cpu_wakeup_latency` file works according to
+the 'cpu_dma_latency' file from user space point of view. Moreover, the unit
+is also microseconds.
Idle States Control Via Kernel Command Line
===========================================
diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
index 26e702c7016e..fde967b0c2e0 100644
--- a/Documentation/admin-guide/pm/intel_pstate.rst
+++ b/Documentation/admin-guide/pm/intel_pstate.rst
@@ -48,8 +48,9 @@ only way to pass early-configuration-time parameters to it is via the kernel
command line. However, its configuration can be adjusted via ``sysfs`` to a
great extent. In some configurations it even is possible to unregister it via
``sysfs`` which allows another ``CPUFreq`` scaling driver to be loaded and
-registered (see `below <status_attr_>`_).
+registered (see :ref:`below <status_attr>`).
+.. _operation_modes:
Operation Modes
===============
@@ -62,6 +63,8 @@ a certain performance scaling algorithm. Which of them will be in effect
depends on what kernel command line options are used and on the capabilities of
the processor.
+.. _active_mode:
+
Active Mode
-----------
@@ -94,6 +97,8 @@ Which of the P-state selection algorithms is used by default depends on the
Namely, if that option is set, the ``performance`` algorithm will be used by
default, and the other one will be used by default if it is not set.
+.. _active_mode_hwp:
+
Active Mode With HWP
~~~~~~~~~~~~~~~~~~~~
@@ -123,7 +128,7 @@ Energy-Performance Bias (EPB) knob (otherwise), which means that the processor's
internal P-state selection logic is expected to focus entirely on performance.
This will override the EPP/EPB setting coming from the ``sysfs`` interface
-(see `Energy vs Performance Hints`_ below). Moreover, any attempts to change
+(see :ref:`energy_performance_hints` below). Moreover, any attempts to change
the EPP/EPB to a value different from 0 ("performance") via ``sysfs`` in this
configuration will be rejected.
@@ -192,6 +197,8 @@ This is the default P-state selection algorithm if the
:c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE` kernel configuration option
is not set.
+.. _passive_mode:
+
Passive Mode
------------
@@ -289,12 +296,12 @@ Unlike ``_PSS`` objects in the ACPI tables, ``intel_pstate`` always exposes
the entire range of available P-states, including the whole turbo range, to the
``CPUFreq`` core and (in the passive mode) to generic scaling governors. This
generally causes turbo P-states to be set more often when ``intel_pstate`` is
-used relative to ACPI-based CPU performance scaling (see `below <acpi-cpufreq_>`_
-for more information).
+used relative to ACPI-based CPU performance scaling (see
+:ref:`below <acpi-cpufreq>` for more information).
Moreover, since ``intel_pstate`` always knows what the real turbo threshold is
(even if the Configurable TDP feature is enabled in the processor), its
-``no_turbo`` attribute in ``sysfs`` (described `below <no_turbo_attr_>`_) should
+``no_turbo`` attribute in ``sysfs`` (described :ref:`below <no_turbo_attr>`) should
work as expected in all cases (that is, if set to disable turbo P-states, it
always should prevent ``intel_pstate`` from using them).
@@ -307,12 +314,12 @@ pieces of information on it to be known, including:
* The minimum supported P-state.
- * The maximum supported `non-turbo P-state <turbo_>`_.
+ * The maximum supported :ref:`non-turbo P-state <turbo>`.
* Whether or not turbo P-states are supported at all.
- * The maximum supported `one-core turbo P-state <turbo_>`_ (if turbo P-states
- are supported).
+ * The maximum supported :ref:`one-core turbo P-state <turbo>` (if turbo
+ P-states are supported).
* The scaling formula to translate the driver's internal representation
of P-states into frequencies and the other way around.
@@ -400,10 +407,10 @@ Energy-Aware Scheduling Support
If ``CONFIG_ENERGY_MODEL`` has been set during kernel configuration and
``intel_pstate`` runs on a hybrid processor without SMT, in addition to enabling
-`CAS <CAS_>`_ it registers an Energy Model for the processor. This allows the
+:ref:`CAS` it registers an Energy Model for the processor. This allows the
Energy-Aware Scheduling (EAS) support to be enabled in the CPU scheduler if
``schedutil`` is used as the ``CPUFreq`` governor which requires ``intel_pstate``
-to operate in the `passive mode <Passive Mode_>`_.
+to operate in the :ref:`passive mode <passive_mode>`.
The Energy Model registered by ``intel_pstate`` is artificial (that is, it is
based on abstract cost values and it does not include any real power numbers)
@@ -432,6 +439,8 @@ the ``energy_model`` directory in ``debugfs`` (typlically mounted on
User Space Interface in ``sysfs``
=================================
+.. _global_attributes:
+
Global Attributes
-----------------
@@ -444,8 +453,8 @@ argument is passed to the kernel in the command line.
``max_perf_pct``
Maximum P-state the driver is allowed to set in percent of the
- maximum supported performance level (the highest supported `turbo
- P-state <turbo_>`_).
+ maximum supported performance level (the highest supported :ref:`turbo
+ P-state <turbo>`).
This attribute will not be exposed if the
``intel_pstate=per_cpu_perf_limits`` argument is present in the kernel
@@ -453,8 +462,8 @@ argument is passed to the kernel in the command line.
``min_perf_pct``
Minimum P-state the driver is allowed to set in percent of the
- maximum supported performance level (the highest supported `turbo
- P-state <turbo_>`_).
+ maximum supported performance level (the highest supported :ref:`turbo
+ P-state <turbo>`).
This attribute will not be exposed if the
``intel_pstate=per_cpu_perf_limits`` argument is present in the kernel
@@ -463,18 +472,18 @@ argument is passed to the kernel in the command line.
``num_pstates``
Number of P-states supported by the processor (between 0 and 255
inclusive) including both turbo and non-turbo P-states (see
- `Turbo P-states Support`_).
+ :ref:`turbo`).
This attribute is present only if the value exposed by it is the same
for all of the CPUs in the system.
The value of this attribute is not affected by the ``no_turbo``
- setting described `below <no_turbo_attr_>`_.
+ setting described :ref:`below <no_turbo_attr>`.
This attribute is read-only.
``turbo_pct``
- Ratio of the `turbo range <turbo_>`_ size to the size of the entire
+ Ratio of the :ref:`turbo range <turbo>` size to the size of the entire
range of supported P-states, in percent.
This attribute is present only if the value exposed by it is the same
@@ -486,7 +495,7 @@ argument is passed to the kernel in the command line.
``no_turbo``
If set (equal to 1), the driver is not allowed to set any turbo P-states
- (see `Turbo P-states Support`_). If unset (equal to 0, which is the
+ (see :ref:`turbo`). If unset (equal to 0, which is the
default), turbo P-states can be set by the driver.
[Note that ``intel_pstate`` does not support the general ``boost``
attribute (supported by some other scaling drivers) which is replaced
@@ -495,11 +504,11 @@ argument is passed to the kernel in the command line.
This attribute does not affect the maximum supported frequency value
supplied to the ``CPUFreq`` core and exposed via the policy interface,
but it affects the maximum possible value of per-policy P-state limits
- (see `Interpretation of Policy Attributes`_ below for details).
+ (see :ref:`policy_attributes_interpretation` below for details).
``hwp_dynamic_boost``
This attribute is only present if ``intel_pstate`` works in the
- `active mode with the HWP feature enabled <Active Mode With HWP_>`_ in
+ :ref:`active mode with the HWP feature enabled <active_mode_hwp>` in
the processor. If set (equal to 1), it causes the minimum P-state limit
to be increased dynamically for a short time whenever a task previously
waiting on I/O is selected to run on a given logical CPU (the purpose
@@ -514,12 +523,12 @@ argument is passed to the kernel in the command line.
Operation mode of the driver: "active", "passive" or "off".
"active"
- The driver is functional and in the `active mode
- <Active Mode_>`_.
+ The driver is functional and in the :ref:`active mode
+ <active_mode>`.
"passive"
- The driver is functional and in the `passive mode
- <Passive Mode_>`_.
+ The driver is functional and in the :ref:`passive mode
+ <passive_mode>`.
"off"
The driver is not functional (it is not registered as a scaling
@@ -547,13 +556,15 @@ argument is passed to the kernel in the command line.
attribute to "1" enables the energy-efficiency optimizations and setting
to "0" disables them.
+.. _policy_attributes_interpretation:
+
Interpretation of Policy Attributes
-----------------------------------
The interpretation of some ``CPUFreq`` policy attributes described in
Documentation/admin-guide/pm/cpufreq.rst is special with ``intel_pstate``
as the current scaling driver and it generally depends on the driver's
-`operation mode <Operation Modes_>`_.
+:ref:`operation mode <operation_modes>`.
First of all, the values of the ``cpuinfo_max_freq``, ``cpuinfo_min_freq`` and
``scaling_cur_freq`` attributes are produced by applying a processor-specific
@@ -562,9 +573,10 @@ Also, the values of the ``scaling_max_freq`` and ``scaling_min_freq``
attributes are capped by the frequency corresponding to the maximum P-state that
the driver is allowed to set.
-If the ``no_turbo`` `global attribute <no_turbo_attr_>`_ is set, the driver is
-not allowed to use turbo P-states, so the maximum value of ``scaling_max_freq``
-and ``scaling_min_freq`` is limited to the maximum non-turbo P-state frequency.
+If the ``no_turbo`` :ref:`global attribute <no_turbo_attr>` is set, the driver
+is not allowed to use turbo P-states, so the maximum value of
+``scaling_max_freq`` and ``scaling_min_freq`` is limited to the maximum
+non-turbo P-state frequency.
Accordingly, setting ``no_turbo`` causes ``scaling_max_freq`` and
``scaling_min_freq`` to go down to that value if they were above it before.
However, the old values of ``scaling_max_freq`` and ``scaling_min_freq`` will be
@@ -576,7 +588,7 @@ and ``scaling_min_freq`` corresponds to the maximum supported turbo P-state,
which also is the value of ``cpuinfo_max_freq`` in either case.
Next, the following policy attributes have special meaning if
-``intel_pstate`` works in the `active mode <Active Mode_>`_:
+``intel_pstate`` works in the :ref:`active mode <active_mode>`:
``scaling_available_governors``
List of P-state selection algorithms provided by ``intel_pstate``.
@@ -597,20 +609,22 @@ processor:
Shows the base frequency of the CPU. Any frequency above this will be
in the turbo frequency range.
-The meaning of these attributes in the `passive mode <Passive Mode_>`_ is the
+The meaning of these attributes in the :ref:`passive mode <passive_mode>` is the
same as for other scaling drivers.
Additionally, the value of the ``scaling_driver`` attribute for ``intel_pstate``
depends on the operation mode of the driver. Namely, it is either
-"intel_pstate" (in the `active mode <Active Mode_>`_) or "intel_cpufreq" (in the
-`passive mode <Passive Mode_>`_).
+"intel_pstate" (in the :ref:`active mode <active_mode>`) or "intel_cpufreq"
+(in the :ref:`passive mode <passive_mode>`).
+
+.. _pstate_limits_coordination:
Coordination of P-State Limits
------------------------------
``intel_pstate`` allows P-state limits to be set in two ways: with the help of
-the ``max_perf_pct`` and ``min_perf_pct`` `global attributes
-<Global Attributes_>`_ or via the ``scaling_max_freq`` and ``scaling_min_freq``
+the ``max_perf_pct`` and ``min_perf_pct`` :ref:`global attributes
+<global_attributes>` or via the ``scaling_max_freq`` and ``scaling_min_freq``
``CPUFreq`` policy attributes. The coordination between those limits is based
on the following rules, regardless of the current operation mode of the driver:
@@ -632,17 +646,18 @@ on the following rules, regardless of the current operation mode of the driver:
3. The global and per-policy limits can be set independently.
-In the `active mode with the HWP feature enabled <Active Mode With HWP_>`_, the
+In the :ref:`active mode with the HWP feature enabled <active_mode_hwp>`, the
resulting effective values are written into hardware registers whenever the
limits change in order to request its internal P-state selection logic to always
set P-states within these limits. Otherwise, the limits are taken into account
-by scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
-every time before setting a new P-state for a CPU.
+by scaling governors (in the :ref:`passive mode <passive_mode>`) and by the
+driver every time before setting a new P-state for a CPU.
Additionally, if the ``intel_pstate=per_cpu_perf_limits`` command line argument
is passed to the kernel, ``max_perf_pct`` and ``min_perf_pct`` are not exposed
at all and the only way to set the limits is by using the policy attributes.
+.. _energy_performance_hints:
Energy vs Performance Hints
---------------------------
@@ -702,9 +717,9 @@ output.
On those systems each ``_PSS`` object returns a list of P-states supported by
the corresponding CPU which basically is a subset of the P-states range that can
be used by ``intel_pstate`` on the same system, with one exception: the whole
-`turbo range <turbo_>`_ is represented by one item in it (the topmost one). By
-convention, the frequency returned by ``_PSS`` for that item is greater by 1 MHz
-than the frequency of the highest non-turbo P-state listed by it, but the
+:ref:`turbo range <turbo>` is represented by one item in it (the topmost one).
+By convention, the frequency returned by ``_PSS`` for that item is greater by
+1 MHz than the frequency of the highest non-turbo P-state listed by it, but the
corresponding P-state representation (following the hardware specification)
returned for it matches the maximum supported turbo P-state (or is the
special value 255 meaning essentially "go as high as you can get").
@@ -730,18 +745,18 @@ benefit from running at turbo frequencies will be given non-turbo P-states
instead.
One more issue related to that may appear on systems supporting the
-`Configurable TDP feature <turbo_>`_ allowing the platform firmware to set the
-turbo threshold. Namely, if that is not coordinated with the lists of P-states
-returned by ``_PSS`` properly, there may be more than one item corresponding to
-a turbo P-state in those lists and there may be a problem with avoiding the
-turbo range (if desirable or necessary). Usually, to avoid using turbo
-P-states overall, ``acpi-cpufreq`` simply avoids using the topmost state listed
-by ``_PSS``, but that is not sufficient when there are other turbo P-states in
-the list returned by it.
+:ref:`Configurable TDP feature <turbo>` allowing the platform firmware to set
+the turbo threshold. Namely, if that is not coordinated with the lists of
+P-states returned by ``_PSS`` properly, there may be more than one item
+corresponding to a turbo P-state in those lists and there may be a problem with
+avoiding the turbo range (if desirable or necessary). Usually, to avoid using
+turbo P-states overall, ``acpi-cpufreq`` simply avoids using the topmost state
+listed by ``_PSS``, but that is not sufficient when there are other turbo
+P-states in the list returned by it.
Apart from the above, ``acpi-cpufreq`` works like ``intel_pstate`` in the
-`passive mode <Passive Mode_>`_, except that the number of P-states it can set
-is limited to the ones listed by the ACPI ``_PSS`` objects.
+:ref:`passive mode <passive_mode>`, except that the number of P-states it can
+set is limited to the ones listed by the ACPI ``_PSS`` objects.
Kernel Command Line Options for ``intel_pstate``
@@ -756,11 +771,11 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
processor is supported by it.
``active``
- Register ``intel_pstate`` in the `active mode <Active Mode_>`_ to start
- with.
+ Register ``intel_pstate`` in the :ref:`active mode <active_mode>` to
+ start with.
``passive``
- Register ``intel_pstate`` in the `passive mode <Passive Mode_>`_ to
+ Register ``intel_pstate`` in the :ref:`passive mode <passive_mode>` to
start with.
``force``
@@ -793,12 +808,12 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
and this option has no effect.
``per_cpu_perf_limits``
- Use per-logical-CPU P-State limits (see `Coordination of P-state
- Limits`_ for details).
+ Use per-logical-CPU P-State limits (see
+ :ref:`pstate_limits_coordination` for details).
``no_cas``
- Do not enable `capacity-aware scheduling <CAS_>`_ which is enabled by
- default on hybrid systems without SMT.
+ Do not enable :ref:`capacity-aware scheduling <CAS>` which is enabled
+ by default on hybrid systems without SMT.
Diagnostics and Tuning
======================
@@ -810,7 +825,7 @@ There are two static trace events that can be used for ``intel_pstate``
diagnostics. One of them is the ``cpu_frequency`` trace event generally used
by ``CPUFreq``, and the other one is the ``pstate_sample`` trace event specific
to ``intel_pstate``. Both of them are triggered by ``intel_pstate`` only if
-it works in the `active mode <Active Mode_>`_.
+it works in the :ref:`active mode <active_mode>`.
The following sequence of shell commands can be used to enable them and see
their output (if the kernel is generally configured to support event tracing)::
@@ -822,7 +837,7 @@ their output (if the kernel is generally configured to support event tracing)::
gnome-terminal--4510 [001] ..s. 1177.680733: pstate_sample: core_busy=107 scaled=94 from=26 to=26 mperf=1143818 aperf=1230607 tsc=29838618 freq=2474476
cat-5235 [002] ..s. 1177.681723: cpu_frequency: state=2900000 cpu_id=2
-If ``intel_pstate`` works in the `passive mode <Passive Mode_>`_, the
+If ``intel_pstate`` works in the :ref:`passive mode <passive_mode>`, the
``cpu_frequency`` trace event will be triggered either by the ``schedutil``
scaling governor (for the policies it is attached to), or by the ``CPUFreq``
core (for the policies with other scaling governors).
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 2ef50828aff1..369a738a6819 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -212,6 +212,14 @@ mem_pcpu_rsv
Per-cpu reserved forward alloc cache size in page units. Default 1MB per CPU.
+bypass_prot_mem
+---------------
+
+Skip charging socket buffers to the global per-protocol memory
+accounting controlled by net.ipv4.tcp_mem, net.ipv4.udp_mem, etc.
+
+Default: 0 (off)
+
rmem_default
------------
@@ -347,9 +355,9 @@ skb_defer_max
-------------
Max size (in skbs) of the per-cpu list of skbs being freed
-by the cpu which allocated them. Used by TCP stack so far.
+by the cpu which allocated them.
-Default: 64
+Default: 128
optmem_max
----------
@@ -406,6 +414,23 @@ to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
If set to 1 (default), hash rethink is performed on listening socket.
If set to 0, hash rethink is not performed.
+txq_reselection_ms
+------------------
+
+Controls how often (in ms) a busy connected flow can select another tx queue.
+
+A resection is desirable when/if user thread has migrated and XPS
+would select a different queue. Same can occur without XPS
+if the flow hash has changed.
+
+But switching txq can introduce reorders, especially if the
+old queue is under high pressure. Modern TCP stacks deal
+well with reorders if they happen not too often.
+
+To disable this feature, set the value to 0.
+
+Default : 1000
+
gro_normal_batch
----------------
diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
index a0cc017e4424..ed1f8f1e86c5 100644
--- a/Documentation/admin-guide/tainted-kernels.rst
+++ b/Documentation/admin-guide/tainted-kernels.rst
@@ -186,6 +186,6 @@ More detailed explanation for tainting
18) ``N`` if an in-kernel test, such as a KUnit test, has been run.
- 19) ``J`` if userpace opened /dev/fwctl/* and performed a FWTCL_RPC_DEBUG_WRITE
+ 19) ``J`` if userspace opened /dev/fwctl/* and performed a FWTCL_RPC_DEBUG_WRITE
to use the devices debugging features. Device debugging features could
cause the device to malfunction in undefined ways.
diff --git a/Documentation/admin-guide/thermal/index.rst b/Documentation/admin-guide/thermal/index.rst
index 193b7b01a87d..e48bc0a1951b 100644
--- a/Documentation/admin-guide/thermal/index.rst
+++ b/Documentation/admin-guide/thermal/index.rst
@@ -6,3 +6,4 @@ Thermal Subsystem
:maxdepth: 1
intel_powerclamp
+ intel_thermal_throttle
diff --git a/Documentation/admin-guide/thermal/intel_thermal_throttle.rst b/Documentation/admin-guide/thermal/intel_thermal_throttle.rst
new file mode 100644
index 000000000000..f4fbf9d5a4ec
--- /dev/null
+++ b/Documentation/admin-guide/thermal/intel_thermal_throttle.rst
@@ -0,0 +1,91 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+=======================================
+Intel thermal throttle events reporting
+=======================================
+
+:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
+
+Introduction
+------------
+
+Intel processors have built in automatic and adaptive thermal monitoring
+mechanisms that force the processor to reduce its power consumption in order
+to operate within predetermined temperature limits.
+
+Refer to section "THERMAL MONITORING AND PROTECTION" in the "Intel® 64 and
+IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B, 3C, & 3D):
+System Programming Guide" for more details.
+
+In general, there are two mechanisms to control the core temperature of the
+processor. They are called "Thermal Monitor 1 (TM1) and Thermal Monitor 2 (TM2)".
+
+The status of the temperature sensor that triggers the thermal monitor (TM1/TM2)
+is indicated through the "thermal status flag" and "thermal status log flag" in
+MSR_IA32_THERM_STATUS for core level and MSR_IA32_PACKAGE_THERM_STATUS for
+package level.
+
+Thermal Status flag, bit 0 — When set, indicates that the processor core
+temperature is currently at the trip temperature of the thermal monitor and that
+the processor power consumption is being reduced via either TM1 or TM2, depending
+on which is enabled. When clear, the flag indicates that the core temperature is
+below the thermal monitor trip temperature. This flag is read only.
+
+Thermal Status Log flag, bit 1 — When set, indicates that the thermal sensor has
+tripped since the last power-up or reset or since the last time that software
+cleared this flag. This flag is a sticky bit; once set it remains set until
+cleared by software or until a power-up or reset of the processor. The default
+state is clear.
+
+It is possible that when user reads MSR_IA32_THERM_STATUS or
+MSR_IA32_PACKAGE_THERM_STATUS, TM1/TM2 is not active. In this case,
+"Thermal Status flag" will read "0" and the "Thermal Status Log flag" will be set
+to show any previous "TM1/TM2" activation. But since it needs to be cleared by
+the software, it can't show the number of occurrences of "TM1/TM2" activations.
+
+Hence, Linux provides counters of how many times the "Thermal Status flag" was
+set. Also presents how long the "Thermal Status flag" was active in milliseconds.
+Using these counters, users can check if the performance was limited because of
+thermal events. It is recommended to read from sysfs instead of directly reading
+MSRs as the "Thermal Status Log flag" is reset by the driver to implement rate
+control.
+
+Sysfs Interface
+---------------
+
+Thermal throttling events are presented for each CPU under
+"/sys/devices/system/cpu/cpuX/thermal_throttle/", where "X" is the CPU number.
+
+All these counters are read-only. They can't be reset to 0. So, they can potentially
+overflow after reaching the maximum 64 bit unsigned integer.
+
+``core_throttle_count``
+ Shows the number of times "Thermal Status flag" changed from 0 to 1 for this
+ CPU since OS boot and thermal vector is initialized. This is a 64 bit counter.
+
+``package_throttle_count``
+ Shows the number of times "Thermal Status flag" changed from 0 to 1 for the
+ package containing this CPU since OS boot and thermal vector is initialized.
+ Package status is broadcast to all CPUs; all CPUs in the package increment
+ this count. This is a 64-bit counter.
+
+``core_throttle_max_time_ms``
+ Shows the maximum amount of time for which "Thermal Status flag" has been
+ set to 1 for this CPU at the core level since OS boot and thermal vector
+ is initialized.
+
+``package_throttle_max_time_ms``
+ Shows the maximum amount of time for which "Thermal Status flag" has been
+ set to 1 for the package containing this CPU since OS boot and thermal
+ vector is initialized.
+
+``core_throttle_total_time_ms``
+ Shows the cumulative time for which "Thermal Status flag" has been
+ set to 1 for this CPU for core level since OS boot and thermal vector
+ is initialized.
+
+``package_throttle_total_time_ms``
+ Shows the cumulative time for which "Thermal Status flag" has been set
+ to 1 for the package containing this CPU since OS boot and thermal vector
+ is initialized.
diff --git a/Documentation/admin-guide/workload-tracing.rst b/Documentation/admin-guide/workload-tracing.rst
index d6313890ee41..35963491b9f1 100644
--- a/Documentation/admin-guide/workload-tracing.rst
+++ b/Documentation/admin-guide/workload-tracing.rst
@@ -196,11 +196,11 @@ Let’s checkout the latest Linux repository and build cscope database::
cscope -R -p10 # builds cscope.out database before starting browse session
cscope -d -p10 # starts browse session on cscope.out database
-Note: Run "cscope -R -p10" to build the database and c"scope -d -p10" to
-enter into the browsing session. cscope by default cscope.out database.
-To get out of this mode press ctrl+d. -p option is used to specify the
-number of file path components to display. -p10 is optimal for browsing
-kernel sources.
+Note: Run "cscope -R -p10" to build the database and "cscope -d -p10" to
+enter into the browsing session. cscope by default uses the cscope.out
+database. To get out of this mode press ctrl+d. -p option is used to
+specify the number of file path components to display. -p10 is optimal
+for browsing kernel sources.
What is perf and how do we use it?
==================================