path: root/drivers/cpufreq
Age | Commit message | Author
2025-04-05treewide: Switch/rename to timer_delete[_sync]()Thomas Gleixner
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines. Conversion was done with coccinelle plus manual fixups where necessary. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-02Merge tag 'pm-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds
Pull power management fix from Rafael Wysocki: "Prevent cpufreq_update_limits() from crashing the kernel due to a NULL pointer dereference when it is called before registering a cpufreq driver, for instance as a result of a notification triggered by the platform firmware (Rafael Wysocki)" * tag 'pm-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Reference count policy in cpufreq_update_limits()
2025-04-01cpufreq: Reference count policy in cpufreq_update_limits()Rafael J. Wysocki
Since acpi_processor_notify() can be called before registering a cpufreq driver or even in cases when a cpufreq driver is not registered at all, cpufreq_update_limits() needs to check if a cpufreq driver is present and prevent it from being unregistered. For this purpose, make it call cpufreq_cpu_get() to obtain a cpufreq policy pointer for the given CPU and reference count the corresponding policy object, if present. Fixes: 5a25e3f7cc53 ("cpufreq: intel_pstate: Driver-specific handling of _PPC updates") Closes: https://lore.kernel.org/linux-acpi/Z-ShAR59cTow0KcR@mail-itl Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/1928789.tdWV9SEqCh@rjwysocki.net
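The get/check/put pattern described in this commit can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct policy`, `policy_table`, `policy_get()`, `policy_put()` and `update_limits()` are hypothetical stand-ins, not the kernel's `struct cpufreq_policy` or `cpufreq_cpu_get()`, and the plain integer counter is not safe for concurrent use the way the kernel's reference counting is.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a cpufreq policy object. */
struct policy {
    int refcount; /* number of active users */
    int updates;  /* how many times limits were refreshed */
};

/* A NULL entry models "no driver has registered a policy for this CPU". */
static struct policy *policy_table[NR_CPUS];

/* Return the policy with a reference held, or NULL if none exists. */
static struct policy *policy_get(unsigned int cpu)
{
    if (cpu >= NR_CPUS || !policy_table[cpu])
        return NULL;
    policy_table[cpu]->refcount++;
    return policy_table[cpu];
}

/* Drop the reference taken by policy_get(). */
static void policy_put(struct policy *p)
{
    p->refcount--;
}

/* Returns true if limits were updated, false if no policy is present. */
static bool update_limits(unsigned int cpu)
{
    struct policy *p = policy_get(cpu);

    if (!p)
        return false; /* notification arrived before registration */

    p->updates++;     /* safe: the reference keeps the object alive */
    policy_put(p);
    return true;
}
```

Holding the reference for the duration of the update is what would keep the object from disappearing underneath the caller; the early return covers the case where the notification fires with no driver registered.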
2025-03-27Merge tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds
Pull powerpc updates from Madhavan Srinivasan: - Remove support for IBM Cell Blades - SMP support for microwatt platform - Support for inline static calls on PPC32 - Enable pmu selftests for power11 platform - Enable hardware trace macro (HTM) hcall support - Support for limited address mode capability - Changes to RMA size from 512 MB to 768 MB to handle fadump - Misc fixes and cleanups Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann, Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom, Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook, Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani (IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav Jain, and Venkat Rao Bagalkote. * tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits) powerpc/kexec: fix physical address calculation in clear_utlb_entry() crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD powerpc: Fix 'intra_function_call not a direct call' warning powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu' KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7 powerpc/microwatt: Add SMP support powerpc: Define config option for processors with broadcast TLBIE powerpc/microwatt: Define an idle power-save function powerpc/microwatt: Device-tree updates powerpc/microwatt: Select COMMON_CLK in order to get the clock framework net: toshiba: Remove reference to PPC_IBM_CELL_BLADE net: spider_net: Remove powerpc Cell driver cpufreq: ppc_cbe: Remove powerpc Cell driver genirq: Remove IRQ_EDGE_EOI_HANDLER docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR powerpc: Remove UDBG_RTAS_CONSOLE powerpc/io: Use standard barrier macros in io.c powerpc/io: Rename _insw_ns() etc. powerpc/io: Use generic raw accessors ...
2025-03-25Merge tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds
Pull power management updates from Rafael Wysocki: "These are dominated by cpufreq updates which in turn are dominated by updates related to boost support in the core and drivers and amd-pstate driver optimizations. Apart from the above, there are some cpuidle updates including a rework of the most recent idle intervals handling in the venerable menu governor that leads to significant improvements in some performance benchmarks, as the governor is now more likely to predict a shorter idle duration in some cases, and there are updates of the core device power management code, mostly related to system suspend and resume, that should help to avoid potential issues arising when the drivers of devices depending on one another want to use different optimizations. There is also a usual collection of assorted fixes and cleanups, including removal of some unused code. Specifics: - Manage sysfs attributes and boost frequencies efficiently from cpufreq core to reduce boilerplate code in drivers (Viresh Kumar) - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider, Dhananjay Ugwekar, Imran Shaik, zuoqian) - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky Bai) - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski) - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng) - Optimize the amd-pstate driver to avoid cases where call paths end up calling the same writes multiple times and needlessly caching variables through code reorganization, locking overhaul and tracing adjustments (Mario Limonciello, Dhananjay Ugwekar) - Make it possible to avoid enabling capacity-aware scheduling (CAS) in the intel_pstate driver and relocate a check for out-of-band (OOB) platform handling in it to make it detect OOB before checking HWP availability (Rafael Wysocki) - Fix dbs_update() to avoid inadvertent conversions of negative integer values to unsigned int which causes CPU frequency selection to be inaccurate in some cases when the "conservative" cpufreq governor is in use (Jie Zhan) - Update the handling of the most recent idle intervals in the menu cpuidle governor to prevent useful information from being discarded by it in some cases and improve the prediction accuracy (Rafael Wysocki) - Make it possible to tell the intel_idle driver to ignore its built-in table of idle states for the given processor, clean up the handling of auto-demotion disabling on Baytrail and Cherrytrail chips in it, and update its MAINTAINERS entry (David Arcari, Artem Bityutskiy, Rafael Wysocki) - Make some cpuidle drivers use for_each_present_cpu() instead of for_each_possible_cpu() during initialization to avoid issues occurring when nosmp or maxcpus=0 are used (Jacky Bai) - Clean up the Energy Model handling code somewhat (Rafael Wysocki) - Use kfree_rcu() to simplify the handling of runtime Energy Model updates (Li RongQing) - Add an entry for the Energy Model framework to MAINTAINERS as properly maintained (Lukasz Luba) - Address RCU-related sparse warnings in the Energy Model code (Rafael Wysocki) - Remove ENERGY_MODEL dependency on SMP and allow it to be selected when DEVFREQ is set without CPUFREQ so it can be used on a wider range of systems (Jeson Gao) - Unify error handling during runtime suspend and runtime resume in the core to help drivers to implement more consistent runtime PM error handling (Rafael Wysocki) - Drop a redundant check from pm_runtime_force_resume() and rearrange documentation related to __pm_runtime_disable() (Rafael Wysocki) - Rework the handling of the "smart suspend" driver flag in the PM core to avoid issues that may occur when drivers using it depend on some other drivers and clean up the related PM core code (Rafael Wysocki, Colin Ian King) - Fix the handling of devices with the power.direct_complete flag set if device_suspend() returns an error for at least one device to avoid situations in which some of them may not be resumed (Rafael Wysocki) - Use mutex_trylock() in hibernate_compressor_param_set() to avoid a possible deadlock that may occur if the "compressor" hibernation module parameter is accessed during the registration of a new ieee80211 device (Lizhi Xu) - Suppress sleeping parent warning in device_pm_add() in the case when new children are added under a device with the power.direct_complete set after it has been processed by device_resume() (Xu Yang) - Remove needless return in three void functions related to system wakeup (Zijun Hu) - Replace deprecated kmap_atomic() with kmap_local_page() in the hibernation core code (David Reaver) - Remove unused helper functions related to system sleep (David Alan Gilbert) - Clean up s2idle_enter() so it does not lock and unlock CPU offline in vain and update comments in it (Ulf Hansson) - Clean up broken white space in dpm_wait_for_children() (Geert Uytterhoeven) - Update the cpupower utility to fix lib version-ing in it and memory leaks in error legs, remove hard-coded values, and implement CPU physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah Khan, Yiwei Lin, Zhongqiu Han)" * tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (139 commits) PM: sleep: Fix bit masking operation dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650 dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1 dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible cpufreq: Init cpufreq only for present CPUs PM: sleep: Fix handling devices with direct_complete set on errors cpuidle: Init cpuidle only for present CPUs PM: clk: Remove unused pm_clk_remove() PM: sleep: core: Fix indentation in dpm_wait_for_children() PM: s2idle: Extend comment in s2idle_enter() PM: s2idle: Drop redundant locks when entering s2idle PM: sleep: Remove unused pm_generic_ wrappers cpufreq: tegra186: Share policy per cluster cpupower: Make lib versioning scheme more obvious and fix version link PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL PM: EM: Address RCU-related sparse warnings cpupower: Implement CPU physical core querying pm: cpupower: remove hard-coded topology depth values pm: cpupower: Fix cmd_monitor() error legs to free cpu_topology ...
2025-03-25Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds
Pull arm64 updates from Catalin Marinas: "Nothing major this time around. Apart from the usual perf/PMU updates, some page table cleanups, the notable features are average CPU frequency based on the AMUv1 counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in the uaccess routines. Perf and PMUs: - Support for the 'Rainier' CPU PMU from Arm - Preparatory driver changes and cleanups that pave the way for BRBE support - Support for partial virtualisation of the Apple-M1 PMU - Support for the second event filter in Arm CSPMU designs - Minor fixes and cleanups (CMN and DWC PMUs) - Enable EL2 requirements for FEAT_PMUv3p9 Power, CPU topology: - Support for AMUv1-based average CPU frequency - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It adds a generic topology_is_primary_thread() function overridden by x86 and powerpc New(ish) features: - MOPS (memcpy/memset) support for the uaccess routines Security/confidential compute: - Fix the DMA address for devices used in Realms with Arm CCA. The CCA architecture uses the address bit to differentiate between shared and private addresses - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by default Memory management clean-ups: - Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs - Some minor page table accessor clean-ups - PIE/POE (permission indirection/overlay) helpers clean-up Kselftests: - MTE: skip hugetlb tests if MTE is not supported on such mappings and use correct naming for sync/async tag checking modes Miscellaneous: - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people request) - Sysreg updates for new register fields - CPU type info for some Qualcomm Kryo cores" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits) arm64: mm: Don't use %pK through printk perf/arm_cspmu: Fix missing io.h include arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists arm64: cputype: Add MIDR_CORTEX_A76AE arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list arm64/sysreg: Enforce whole word match for open/close tokens arm64/sysreg: Fix unbalanced closing block arm64: Kconfig: Enable HOTPLUG_SMT arm64: topology: Support SMT control on ACPI based system arch_topology: Support SMT control for OF based system cpu/SMT: Provide a default topology_is_primary_thread() arm64/mm: Define PTDESC_ORDER perf/arm_cspmu: Add PMEVFILT2R support perf/arm_cspmu: Generalise event filtering perf/arm_cspmu: Move register definitons to header arm64/kernel: Always use level 2 or higher for early mappings arm64/mm: Drop PXD_TABLE_BIT arm64/mm: Check pmd_table() in pmd_trans_huge() ...
2025-03-22Merge tag 'cpufreq-arm-updates-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pmRafael J. Wysocki
Merge ARM cpufreq updates for 6.15 from Viresh Kumar: "- manage sysfs attributes and boost frequencies efficiently from cpufreq core to reduce boilerplate code from drivers (Viresh Kumar). - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider, Dhananjay Ugwekar, Imran Shaik, and zuoqian). - Migrate to using for_each_present_cpu (Jacky Bai). - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski). - Use str_enable_disable() helper (Lifeng Zheng)." * tag 'cpufreq-arm-updates-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (59 commits) dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650 dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1 dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible cpufreq: Init cpufreq only for present CPUs cpufreq: tegra186: Share policy per cluster cpufreq: tegra194: Allow building for Tegra234 cpufreq: enable 1200Mhz clock speed for armada-37xx cpufreq: Remove cpufreq_enable_boost_support() cpufreq: staticize policy_has_boost_freq() cpufreq: qcom: Set .set_boost directly cpufreq: dt: Set .set_boost directly cpufreq: scmi: Set .set_boost directly cpufreq: powernv: Set .set_boost directly cpufreq: loongson: Set .set_boost directly cpufreq: apple: Set .set_boost directly cpufreq: Restrict enabling boost on policies with no boost frequencies cpufreq: cppc: Set policy->boost_supported cpufreq: amd: Set policy->boost_supported cpufreq: acpi: Set policy->boost_supported ...
2025-03-17cpufreq: Init cpufreq only for present CPUsJacky Bai
for_each_possible_cpu() is currently used to initialize cpufreq. However, in cpu_dev_register_generic(), for_each_present_cpu() is used to register CPU devices, which means the CPU devices are only registered for present CPUs and not all possible CPUs. With nosmp or maxcpus=0, only the boot CPU is present, which leads to cpufreq probe failure or deferred probing because no CPU device is available for CPUs that are not present. Change for_each_possible_cpu() to for_each_present_cpu() in the affected cpufreq drivers to ensure they only register cpufreq for CPUs that are actually present. Fixes: b0c69e1214bc ("drivers: base: Use present CPUs in GENERIC_CPU_DEVICES") Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jacky Bai <ping.bai@nxp.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
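A toy user-space model of the failure mode may help. It assumes CPU sets are plain bitmasks and that a CPU device exists only for present CPUs; `init_for_mask()` and `ERR_NO_DEVICE` are hypothetical names for illustration, not kernel interfaces.

```c
#include <stdint.h>

#define NR_CPUS 8
#define ERR_NO_DEVICE (-1) /* stand-in for a probe failure or deferral */

/*
 * Walk every CPU set in iter_mask and "register" it. A CPU device is
 * assumed to exist only when the matching bit in present_mask is set.
 * Returns the number of CPUs registered, or ERR_NO_DEVICE on the first
 * CPU that has no device.
 */
static int init_for_mask(uint32_t iter_mask, uint32_t present_mask)
{
    int registered = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(iter_mask & (1u << cpu)))
            continue;
        if (!(present_mask & (1u << cpu)))
            return ERR_NO_DEVICE; /* no device for a non-present CPU */
        registered++;
    }
    return registered;
}
```

With four possible CPUs (`0xF`) but only the boot CPU present (`0x1`), iterating the possible mask fails, while iterating the present mask registers exactly one CPU.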
2025-03-10cpufreq: tegra186: Share policy per clusterAaron Kling
This functionally brings tegra186 in line with tegra210 and tegra194, sharing a cpufreq policy between all cores in a cluster. Reviewed-by: Sumit Gupta <sumitg@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Aaron Kling <webgeek1234@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-03-06Merge back earlier cpufreq material for 6.15Rafael J. Wysocki
2025-03-06cpufreq/amd-pstate: Drop actions in amd_pstate_epp_cpu_offline()Mario Limonciello
When the CPU goes offline there is no need to change the CPPC request because the CPU will go into the deepest C-state it supports already. Actually changing the CPPC request when it goes offline messes up the cached values and can lead to the wrong values being restored when it comes back. Instead drop the actions and if the CPU comes back online let amd_pstate_epp_set_policy() restore it to expected values. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Stop caching EPPMario Limonciello
EPP values are cached in the cpudata structure per CPU. This is needless though because they are also cached in the CPPC request variable. Drop the separate cache for EPP values and always reference the CPPC request variable when needed. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Rework CPPC enablingMario Limonciello
The CPPC enable register is configured as "write once"; that is, any later writes have no effect. Because of this, all the cleanup paths that currently exist for CPPC disable are ineffective. Rework CPPC enable to only enable after all the CAP registers have been read to avoid enabling CPPC on CPUs with invalid _CPC or unpopulated MSRs. As the register is write once, remove all cleanup paths as well. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Drop debug statements for policy settingMario Limonciello
There are now trace events for all amd-pstate modes that output information right before programming the hardware. This makes the existing debug statements unnecessary overhead. Drop them. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writesMario Limonciello
On EPP-only writes, update the cached variable so that the min/max performance controls don't need to be updated again. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functionsMario Limonciello
The EPP tracing is done by the caller today, but this leaves out the information about whether the CPPC request has changed. Move it into the update_perf and set_epp functions and include information about whether the request has changed from the last one. amd_pstate_update_perf() and amd_pstate_set_epp() now require the policy as an argument instead of the cpudata. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Cache CPPC request in shared mem case tooMario Limonciello
To avoid a potentially redundant write in shmem_update_perf(), cache the request in the cppc_req_cached variable, which is normally used only for the MSR case. This adds symmetry to the code and potentially avoids extra writes. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masksMario Limonciello
Bitfield masks are easier to follow and less error prone. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
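As a rough illustration of the mask-based style, the sketch below packs and unpacks fields of a request word with user-space equivalents of the kernel's GENMASK()/FIELD_PREP()/FIELD_GET() helpers. The macro and function names (`MASK`, `field_prep`, `field_get`, `REQ_*`) are hypothetical, and the field layout shown is an assumption for illustration rather than a quote from the driver.

```c
#include <stdint.h>

/* Contiguous mask covering bits l..h inclusive (0 <= l <= h <= 63). */
#define MASK(h, l) ((~0ULL >> (63 - (h))) & (~0ULL << (l)))

/* Assumed layout of a CPPC request word, for illustration only. */
#define REQ_MAX_PERF MASK(7, 0)
#define REQ_MIN_PERF MASK(15, 8)
#define REQ_DES_PERF MASK(23, 16)
#define REQ_EPP      MASK(31, 24)

/* Position of the lowest set bit; mask must be non-zero. */
static unsigned int mask_shift(uint64_t mask)
{
    unsigned int shift = 0;

    while (!(mask & 1)) {
        mask >>= 1;
        shift++;
    }
    return shift;
}

/* Place val into the field described by mask. */
static uint64_t field_prep(uint64_t mask, uint64_t val)
{
    return (val << mask_shift(mask)) & mask;
}

/* Extract the field described by mask from reg. */
static uint64_t field_get(uint64_t mask, uint64_t reg)
{
    return (reg & mask) >> mask_shift(mask);
}
```

Each field is named once by its mask, so callers never repeat a shift count next to a width, which is where hand-written shift-and-mask macros tend to go wrong.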
2025-03-06cpufreq/amd-pstate-ut: Adjust variable scopeMario Limonciello
In amd_pstate_ut_check_freq() and amd_pstate_ut_check_perf() the cpudata variable is only needed in the scope of the for loop. Move it there. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate-ut: Run on all of the correct CPUsMario Limonciello
If a CPU is missing a policy or one has been offlined, then the unit test is skipped for the rest of the CPUs on the system. Instead, iterate over online CPUs and skip any missing policies so that the rest of them can still be tested. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate-ut: Drop SUCCESS and FAIL enumsMario Limonciello
The enums are effectively used as booleans and don't show the return value of the failing call. Instead of using enums, switch to returning the actual return code from the unit test. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate-ut: Allow lowest nonlinear and lowest to be the sameMario Limonciello
Several Ryzen AI processors support the exact same value for lowest nonlinear perf and lowest perf. Loosen up the unit tests to allow this scenario. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate-ut: Use _free macro to free put policyMario Limonciello
Using a scoped cleanup macro simplifies cleanup code. Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Drop `cppc_cap1_cached`Mario Limonciello
The `cppc_cap1_cached` variable isn't used at all, so there is no need to read it at initialization for each CPU. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Overhaul lockingMario Limonciello
amd_pstate_cpu_boost_update() and refresh_frequency_limits() both update the policy state and have nothing to do with the amd-pstate driver itself. A global "limits" lock doesn't make sense because each CPU can have policies changed independently. Each time a CPU changes values they will atomically be written to the per-CPU perf member. Drop per CPU locking cases. The remaining "global" driver lock is used to ensure that only one entity can change driver modes at a given time. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Move perf values into a unionMario Limonciello
By storing perf values in a union all the writes and reads can be done atomically, removing the need for some concurrency protections. While making this change, also drop the cached frequency values, using inline helpers to calculate them on demand from perf value. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
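The idea of packing several small values into one word so they can be read or written in a single access can be sketched as below. This is a user-space approximation using C11 atomics, which are swapped in here for whatever primitives the kernel driver uses; the union name, its member names and the helper functions are assumptions for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Six 8-bit perf values overlaid on one 64-bit word (hypothetical layout). */
union perf_values {
    struct {
        uint8_t highest_perf;
        uint8_t nominal_perf;
        uint8_t lowest_nonlinear_perf;
        uint8_t lowest_perf;
        uint8_t min_limit_perf;
        uint8_t max_limit_perf;
    };
    uint64_t val;
};

/* The shared copy is only ever accessed as a whole 64-bit word. */
static _Atomic uint64_t shared_perf;

/* Publish a complete set of values with one atomic store. */
static void publish_perf(union perf_values p)
{
    atomic_store(&shared_perf, p.val);
}

/* Take a consistent snapshot of all values with one atomic load. */
static union perf_values snapshot_perf(void)
{
    union perf_values p;

    p.val = atomic_load(&shared_perf);
    return p;
}
```

A reader that works on a snapshot can never observe a mix of old and new values, which is why a lock around the individual fields becomes unnecessary in this scheme. Callers would initialize a local copy with `.val = 0` before filling in fields so the unused bytes are defined.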
2025-03-06cpufreq/amd-pstate: Drop min and max cached frequenciesMario Limonciello
Use the perf_to_freq helpers to calculate this on the fly. As the members are no longer cached add an extra check into amd_pstate_epp_update_limit() to avoid unnecessary calls in amd_pstate_update_min_max_limit(). Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Show a warning when a CPU fails to setupMario Limonciello
I came across a system on which MSR_AMD_CPPC_CAP1 isn't populated for some CPUs. This is unexpected behavior that is most likely a BIOS bug. In the event it happens, I'd like users to report bugs so that it can be properly root-caused and fixed. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06cpufreq/amd-pstate: Invalidate cppc_req_cached during suspendMario Limonciello
During resume it's possible the firmware didn't restore the CPPC request MSR but the kernel thinks the values line up. This leads to incorrect performance after resume from suspend. To fix the issue invalidate the cached value at suspend. During resume use the saved values programmed as cached limits. Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reported-by: Miroslav Pavleski <miroslav@pavleski.net> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
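The interaction between a write-avoidance cache and suspend can be modeled as follows. This is a simplified sketch: `struct cpu_state`, the explicit `cache_valid` flag and the helper names are assumptions for illustration, and the driver may represent an invalid cache differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated per-CPU state. */
struct cpu_state {
    uint64_t hw_req;        /* value currently held by the "hardware" */
    uint64_t req_cached;    /* last value the driver believes it wrote */
    bool cache_valid;       /* false means the cache must not be trusted */
    unsigned int hw_writes; /* number of real hardware writes */
};

/* Write the request, skipping the write when the cache already matches. */
static void write_req(struct cpu_state *c, uint64_t val)
{
    if (c->cache_valid && c->req_cached == val)
        return; /* redundant write avoided */

    c->hw_req = val;
    c->hw_writes++;
    c->req_cached = val;
    c->cache_valid = true;
}

/* At suspend, stop trusting the cache. */
static void on_suspend(struct cpu_state *c)
{
    c->cache_valid = false;
}

/* Models firmware that does not restore the register across suspend. */
static void firmware_loses_state(struct cpu_state *c)
{
    c->hw_req = 0;
}
```

Without the invalidation step, the first write after resume would be skipped because the cached value still matches, leaving the hardware with whatever the firmware left behind.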
2025-03-06cpufreq/amd-pstate: Fix the clamping of perf valuesDhananjay Ugwekar
The clamping in freq_to_perf() is broken right now, as we first typecast (read: wrap around) the overflowing value into a u8 and then clamp it down. So, use a u32 to store the >255 value in certain edge cases and then clamp it down into a u8. Also, use an "explicit typecast + clamp" instead of just a "clamp_t" as the latter typecasts first and then clamps between the limits, which defeats our purpose. Fixes: 620136ced35a ("cpufreq/amd-pstate: Modularize perf<->freq conversion") Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250222033221.554976-1-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
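The ordering problem is easy to reproduce outside the kernel. The two helpers below are hypothetical user-space functions, not the driver's freq_to_perf(); they only contrast truncating before clamping with clamping before truncating.

```c
#include <stdint.h>

/* Buggy order: truncate to 8 bits first, then clamp the wrapped value. */
static uint8_t cast_then_clamp(uint32_t perf, uint8_t lo, uint8_t hi)
{
    uint8_t v = (uint8_t)perf; /* e.g. 300 wraps around to 44 */

    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/* Fixed order: clamp in the wide type, then narrow the in-range result. */
static uint8_t clamp_then_cast(uint32_t perf, uint8_t lo, uint8_t hi)
{
    if (perf < lo)
        perf = lo;
    if (perf > hi)
        perf = hi;
    return (uint8_t)perf;
}
```

For an overflowing input of 300 with limits 10 and 255, the first function returns 44 (300 modulo 256) while the second returns 255, which is the value a clamp is expected to produce.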
2025-03-04cpufreq: tegra194: Allow building for Tegra234Aaron Kling
Support was added for Tegra234 in the referenced commit, but the Kconfig was not updated to allow building for the arch. Fixes: 273bc890a2a8 ("cpufreq: tegra194: Add support for Tegra234") Signed-off-by: Aaron Kling <webgeek1234@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-27cpufreq: intel_pstate: Avoid SMP calls to get cpu-typePawan Gupta
Intel pstate driver relies on SMP calls to get the cpu-type of a given CPU. Remove the SMP calls and instead use the cached value of cpu-type which is more efficient. [ mingo: Forward ported it. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-2-2ae010f50370@linux.intel.com
2025-02-26cpufreq: ppc_cbe: Remove powerpc Cell driverMichael Ellerman
This driver can no longer be built since support for IBM Cell Blades was removed, in particular CBE_RAS. Remove the driver. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241218105523.416573-22-mpe@ellerman.id.au
2025-02-23cpufreq/amd-pstate: Remove the unncecessary driver_lock in amd_pstate_update_limitsDhananjay Ugwekar
There is no need to take a driver wide lock while updating the highest_perf value in the percpu cpudata struct. Hence remove it. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-13-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23cpufreq/amd-pstate: Use scope based cleanup for cpufreq_policy refsDhananjay Ugwekar
There have been instances in the past where the refcount decrement was missed while exiting a function. Use automatic scope based cleanup to avoid such errors. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-12-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
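Scope based cleanup can be approximated in user space with the `cleanup` variable attribute, a GCC/Clang extension. The sketch below is an assumption-laden illustration: `struct policy`, `policy_get()`, `policy_put_cleanup()` and `do_work()` are hypothetical and do not mirror the kernel's own cleanup macros.

```c
#include <stddef.h>

/* Hypothetical reference-counted object. */
struct policy {
    int refcount;
};

static struct policy the_policy;

/* Take a reference, or return NULL when no policy is available. */
static struct policy *policy_get(int available)
{
    if (!available)
        return NULL;
    the_policy.refcount++;
    return &the_policy;
}

/* Cleanup handler: receives a pointer to the variable going out of scope. */
static void policy_put_cleanup(struct policy **pp)
{
    if (*pp)
        (*pp)->refcount--;
}

/* Every return path drops the reference automatically. */
static int do_work(int available, int fail_early)
{
    struct policy *p __attribute__((cleanup(policy_put_cleanup))) =
        policy_get(available);

    if (!p)
        return -1;
    if (fail_early)
        return -2; /* no explicit put needed here */
    return 0;
}
```

Because the compiler inserts the cleanup call on every exit from the scope, adding a new early return later cannot reintroduce a missed decrement.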
2025-02-23cpufreq/amd-pstate: Add missing NULL ptr check in amd_pstate_updateDhananjay Ugwekar
Check if policy is NULL before dereferencing it in amd_pstate_update. Fixes: e8f555daacd3 ("cpufreq/amd-pstate: fix setting policy current frequency value") Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-11-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23cpufreq/amd-pstate: Remove the unnecessary cpufreq_update_policy callDhananjay Ugwekar
The update_limits callback is only called in two conditions. * When the preferred core rankings change. In which case, we just need to change the prefcore ranking in the cpudata struct. As there are no changes to any of the perf values, there is no need to call cpufreq_update_policy() * When the _PPC ACPI object changes, i.e. the highest allowed Pstate changes. The _PPC object is only used for a table based cpufreq driver like acpi-cpufreq, hence is irrelevant for CPPC based amd-pstate. Hence, the cpufreq_update_policy() call becomes unnecessary and can be removed. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-9-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23cpufreq/amd-pstate: Modularize perf<->freq conversionDhananjay Ugwekar
Delegate the perf<->frequency conversion to helper functions to reduce code duplication, and improve readability. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-8-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23cpufreq/amd-pstate: Convert all perf values to u8Dhananjay Ugwekar
All perf values are always within 0-255 range, hence convert their datatype to u8 everywhere. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-7-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23cpufreq/amd-pstate: Pass min/max_limit_perf as min/max_perf to amd_pstate_updateDhananjay Ugwekar
Currently, amd_pstate_update_freq passes the hardware perf limits as min/max_perf to amd_pstate_update, which eventually gets programmed into the min/max_perf fields of the CPPC_REQ register. Instead pass the effective perf limits i.e. min/max_limit_perf values to amd_pstate_update as min/max_perf. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-6-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23cpufreq/amd-pstate: Remove the redundant des_perf clamping in adjust_perfDhananjay Ugwekar
des_perf is later on clamped between min_perf and max_perf in amd_pstate_update. So, remove the redundant clamping from amd_pstate_adjust_perf. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-5-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
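To illustrate why the earlier clamp is redundant, here is a small sketch of a single clamping step. The function name `clamp_perf` is a hypothetical stand-in, not the driver's code; the point is only that clamping an already clamped value against the same bounds changes nothing.

```c
/* Clamp des into [min, max]; assumes min <= max. Hypothetical helper. */
static unsigned int clamp_perf(unsigned int des, unsigned int min,
			       unsigned int max)
{
	if (des < min)
		return min;
	if (des > max)
		return max;
	return des;
}
```

Because the operation is idempotent for fixed bounds, doing it once in the common update path is enough.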
2025-02-23cpufreq/amd-pstate: Modify the min_perf calculation in adjust_perf callbackDhananjay Ugwekar
Instead of setting a fixed floor at lowest_nonlinear_perf, use the min_limit_perf value, so that it gives the user the freedom to lower the floor further. There are two minimum frequency/perf limits that we need to consider in the adjust_perf callback. One provided by schedutil i.e. the sg_cpu->bw_min value passed in _min_perf arg, another is the effective value of min_freq_qos request that is updated in cpudata->min_limit_perf. Modify the code to use the bigger of these two values. Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20250205112523.201101-4-dhananjay.ugwekar@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
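The selection described above can be sketched as follows, assuming both inputs have already been converted to the same perf units. The function and parameter names are hypothetical and only illustrate the "bigger of the two" rule.

```c
/*
 * Pick the effective floor for a performance request.
 * sched_min_perf: floor requested by the scheduler governor (assumed to be
 *                 already expressed in perf units).
 * min_limit_perf: floor derived from the min frequency QoS request.
 */
static unsigned int effective_min_perf(unsigned int sched_min_perf,
				       unsigned int min_limit_perf)
{
	return sched_min_perf > min_limit_perf ? sched_min_perf
					       : min_limit_perf;
}
```

With this rule, lowering the QoS minimum lowers the floor only as far as the scheduler's own minimum allows, rather than stopping at a fixed lowest_nonlinear_perf value.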
2025-02-21cpufreq: intel_pstate: Relocate platform preference checkRafael J. Wysocki
Move the invocation of intel_pstate_platform_pwr_mgmt_exists() before checking whether or not HWP is enabled because it does not depend on any code running before it except for the vendor check and if CPU performance scaling is going to be carried out by the platform, all of the code that runs before that function (again, except for the vendor check) is redundant. This is not expected to alter any functionality except for the ordering of messages printed by intel_pstate_init() when it is going to return an error before attempting to register the driver. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/2776745.mvXUDI8C0e@rjwysocki.net
2025-02-20cpufreq: governor: Fix negative 'idle_time' handling in dbs_update()Jie Zhan
We observed an issue that the CPU frequency can't rise with a 100% CPU load when NOHZ is off and the 'conservative' governor is selected. 'idle_time' can be negative if it's obtained from get_cpu_idle_time_jiffy() when NOHZ is off. This was found and explained in commit 9485e4ca0b48 ("cpufreq: governor: Fix handling of special cases in dbs_update()"). However, commit 7592019634f8 ("cpufreq: governors: Fix long idle detection logic in load calculation") introduced a comparison between 'idle_time' and 'sampling_rate' to detect a long idle interval. While 'idle_time' is converted to int before comparison, it's actually promoted to unsigned again when compared with an unsigned 'sampling_rate'. Hence, this leads to wrong idle interval detection when it's in fact 100% busy and sets policy_dbs->idle_periods to a very large value. 'conservative' adjusts the frequency to minimum because of the large 'idle_periods', such that the frequency can't rise. 'ondemand' doesn't use policy_dbs->idle_periods so it fortunately avoids the issue. Correct negative 'idle_time' to 0 before any use of it in dbs_update(). Fixes: 7592019634f8 ("cpufreq: governors: Fix long idle detection logic in load calculation") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Link: https://patch.msgid.link/20250213035510.2402076-1-zhanjie9@hisilicon.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
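The conversion pitfall described above can be reproduced in isolation. The following is a simplified sketch, not the governor code: the function names are hypothetical, and the "more than twice the sampling rate" threshold is an assumption used only to have something to compare against.

```c
/*
 * Buggy variant: idle_time is an int, but comparing it with an unsigned
 * int converts it to unsigned first, so a negative value becomes a huge
 * positive one and the check wrongly reports a long idle interval.
 */
static int long_idle_buggy(int idle_time, unsigned int sampling_rate)
{
	return idle_time > 2 * sampling_rate;
}

/*
 * Fixed variant: correct a negative idle_time to 0 before any use, so
 * the later unsigned comparison sees a sane value.
 */
static int long_idle_fixed(int idle_time, unsigned int sampling_rate)
{
	if (idle_time < 0)
		idle_time = 0;

	return (unsigned int)idle_time > 2 * sampling_rate;
}
```

Calling both with an idle time of -1 and a sampling rate of 10000 shows the difference: the buggy variant reports a long idle period, the fixed one does not. Compilers typically flag the buggy comparison when sign-compare warnings are enabled.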
2025-02-19cpufreq: enable 1200Mhz clock speed for armada-37xxBenjamin Schneider
This frequency was disabled because of stability problems whose source could not be accurately identified[1]. After seven months of testing, the evidence points to an incorrectly configured bootloader as the source of the historical instability. Testing was performed on two A3720 devices with this frequency enabled and the ondemand policy in use. Marvell merged[2] changes to their bootloader source needed to address the stability issue. This driver should expose this frequency option to users. [1] https://github.com/torvalds/linux/commit/484f2b7c61b9ae58cc00c5127bcbcd9177af8dfe [2] https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell/pull/44 Signed-off-by: Benjamin Schneider <ben@bens.haus> Reviewed-by: Pali Rohár <pali@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-18cpufreq: intel_pstate: Make it possible to avoid enabling CASRafael J. Wysocki
Capacity-aware scheduling (CAS) is enabled by default by intel_pstate on hybrid systems without SMT, but in some usage scenarios it may be more attractive to place tasks for maximum CPU performance regardless of the extra cost in terms of energy, which is the case on such systems when CAS is not enabled, so introduce a command line option to forbid intel_pstate to enable CAS. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/2781262.mvXUDI8C0e@rjwysocki.net
2025-02-17cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entryBeata Michalska
Currently the CPUFreq core exposes two sysfs attributes that can be used to query current frequency of a given CPU(s): namely cpuinfo_cur_freq and scaling_cur_freq. Both provide slightly different view on the subject and they do come with their own drawbacks. cpuinfo_cur_freq provides higher precision though at a cost of being rather expensive. Moreover, the information retrieved via this attribute is somewhat short lived as frequency can change at any point of time making it difficult to reason from. scaling_cur_freq, on the other hand, tends to be less accurate but then the actual level of precision (and source of information) varies between architectures making it a bit ambiguous. The new attribute, cpuinfo_avg_freq, is intended to provide more stable, distinct interface, exposing an average frequency of a given CPU(s), as reported by the hardware, over a time frame spanning no more than a few milliseconds. As it requires appropriate hardware support, this interface is optional. Note that under the hood, the new attribute relies on the information provided by arch_freq_get_on_cpu, which, up to this point, has been feeding data for scaling_cur_freq attribute, being the source of ambiguity when it comes to interpretation. This has been amended by restoring the intended behavior for scaling_cur_freq, with a new dedicated config option to maintain status quo for those, who may need it. CC: Jonathan Corbet <corbet@lwn.net> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: Borislav Petkov <bp@alien8.de> CC: Dave Hansen <dave.hansen@linux.intel.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Phil Auld <pauld@redhat.com> CC: x86@kernel.org CC: linux-doc@vger.kernel.org Signed-off-by: Beata Michalska <beata.michalska@arm.com> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com> Reviewed-by: Sumit Gupta <sumitg@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20250131162439.3843071-3-beata.michalska@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-02-17cpufreq: Allow arch_freq_get_on_cpu to return an errorBeata Michalska
Allow arch_freq_get_on_cpu to return an error for cases when retrieving current CPU frequency is not possible, whether that being due to lack of required arch support or due to other circumstances when the current frequency cannot be determined at given point of time. Signed-off-by: Beata Michalska <beata.michalska@arm.com> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20250131162439.3843071-2-beata.michalska@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
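One way a caller might consume such a return value is sketched below. This is an illustration under stated assumptions: `fake_arch_freq_get()` is a hypothetical stand-in for the architecture hook, the choice of `-EOPNOTSUPP` as the error is an assumption, and the fallback policy is invented for the example.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the arch hook: returns a frequency in kHz on
 * success or a negative errno value when the frequency cannot be
 * determined. Here only CPU 0 is assumed to be supported.
 */
static int fake_arch_freq_get(int cpu)
{
	return cpu == 0 ? 2400000 : -EOPNOTSUPP;
}

/* Use the reported frequency when valid, otherwise fall back. */
static unsigned int cur_freq_or_fallback(int cpu, unsigned int fallback_khz)
{
	int freq = fake_arch_freq_get(cpu);

	return freq > 0 ? (unsigned int)freq : fallback_khz;
}
```

A signed return type lets the caller distinguish "not available" from a genuine reading instead of overloading 0 for both.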
2025-02-07cpufreq: Remove cpufreq_enable_boost_support()Viresh Kumar
Remove the now unused helper, cpufreq_enable_boost_support(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07cpufreq: staticize policy_has_boost_freq()Viresh Kumar
policy_has_boost_freq() isn't used outside of freq_table.c now, mark it static. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>