path: root/drivers/thermal
Age | Commit message | Author
2024-03-11 | thermal: Drop spaces before TABs | Geert Uytterhoeven
There is never a need to have a space before a TAB, but it hurts the eyes of vim users. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/480478a53fd42621e97b2db36e181903cc0f53e3.1708001426.git.geert+renesas@glider.be
2024-03-11 | thermal/drivers/mediatek: Fix control buffer enablement on MT7986 | Frank Wunderlich
Reading the thermal sensor on mt7986 devices returns an invalid temperature:
    bpi-r3 ~ # cat /sys/class/thermal/thermal_zone0/temp
    -274000
Fix this by adding missing members in mtk_thermal_data struct which were used in mtk_thermal_turn_on_buffer after commit 33140e668b10. Cc: stable@vger.kernel.org Fixes: 33140e668b10 ("thermal/drivers/mediatek: Control buffer enablement tweaks") Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20230907112018.52811-1-linux@fw-web.de
2024-03-11 | thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path | Christophe JAILLET
If devm_krealloc() fails, then 'efuse' is leaking. So free it to avoid a leak. Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/481d345233862d58c3c305855a93d0dbc2bbae7e.1706431063.git.christophe.jaillet@wanadoo.fr
2024-03-11 | Merge branch 'pm-em' | Rafael J. Wysocki
Merge Energy Model changes for 6.9-rc1: - Allow the Energy Model to be updated dynamically (Lukasz Luba). * pm-em: (24 commits) PM: EM: Fix nr_states warnings in static checks Documentation: EM: Update with runtime modification design PM: EM: Add em_dev_compute_costs() PM: EM: Remove old table PM: EM: Change debugfs configuration to use runtime EM table data drivers/thermal/devfreq_cooling: Use new Energy Model interface drivers/thermal/cpufreq_cooling: Use new Energy Model interface powercap/dtpm_devfreq: Use new Energy Model interface to get table powercap/dtpm_cpu: Use new Energy Model interface to get table PM: EM: Optimize em_cpu_energy() and remove division PM: EM: Support late CPUs booting and capacity adjustment PM: EM: Add performance field to struct em_perf_state and optimize PM: EM: Add em_perf_state_from_pd() to get performance states table PM: EM: Introduce em_dev_update_perf_domain() for EM updates PM: EM: Add functions for memory allocations for new EM tables PM: EM: Use runtime modified EM for CPUs energy estimation in EAS PM: EM: Introduce runtime modifiable table PM: EM: Split the allocation and initialization of the EM table PM: EM: Check if the get_cost() callback is present in em_compute_costs() PM: EM: Introduce em_compute_costs() ...
2024-03-07 | Merge branches 'thermal-core' and 'thermal-intel' | Rafael J. Wysocki
Merge thermal core changes and Intel thermal drivers changes for 6.9-rc1: - Store zone trips table and zone operations directly in struct thermal_zone_device (Rafael Wysocki). - Rework writable trip points handling (Rafael Wysocki). - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi). - Use thermal zone accessor functions in the int340x Intel thermal driver (Rafael Wysocki). - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver (Srinivas Pandruvada). * thermal-core: thermal: core: remove unnecessary check in trip_point_hyst_store() thermal: core: Remove excess empty line from a comment thermal: core: Eliminate writable trip points masks thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly wifi: iwlwifi: mvm: Set THERMAL_TRIP_FLAG_RW_TEMP directly mlxsw: core_thermal: Set THERMAL_TRIP_FLAG_RW_TEMP directly thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly thermal: core: Drop the .set_trip_hyst() thermal zone operation thermal: core: Add flags to struct thermal_trip thermal: core: Move initial num_trips assignment before memcpy() thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS thermal: intel: Adjust ops handling during thermal zone registration thermal: ACPI: Constify acpi_thermal_zone_ops thermal: core: Store zone ops in struct thermal_zone_device thermal: intel: Discard trip tables after zone registration thermal: ACPI: Discard trips table after zone registration thermal: core: Store zone trips table in struct thermal_zone_device * thermal-intel: thermal: intel: int340x_thermal: Use thermal zone accessor functions thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
2024-03-06 | thermal: core: remove unnecessary check in trip_point_hyst_store() | Dan Carpenter
This code was shuffled around a bit recently. We no longer need to check the value of "ret" because we know it's zero. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-05 | thermal: intel: int340x_thermal: Use thermal zone accessor functions | Rafael J. Wysocki
Make int340x_thermal use the dedicated accessor functions for the thermal zone device object address and the thermal zone type string. This is requisite for future thermal core improvements. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2024-03-05 | Merge thermal core changes for 6.9 to satisfy a dependency. | Rafael J. Wysocki
2024-03-05 | thermal: core: Remove excess empty line from a comment | Flavio Suligoi
The first and the third lines of the kerneldoc comment for: thermal_zone_device_set_polling() belong to the same sentence, so join them together. Signed-off-by: Flavio Suligoi <f.suligoi@asem.it> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-28 | thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID | Srinivas Pandruvada
Add Lunar Lake-M PCI ID for processor thermal device. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-27 | thermal: core: Eliminate writable trip points masks | Rafael J. Wysocki
All of the thermal_zone_device_register_with_trips() callers pass zero writable trip points masks to it, so drop the mask argument from that function and update all of its callers accordingly. This also removes the artificial trip points per zone limit of 32, related to using writable trip points masks. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 | thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly | Rafael J. Wysocki
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP to allow their temperature to be set from user space via sysfs instead of using a nonzero writable trips mask during thermal zone registration, so make the OF thermal code do that. No intentional functional impact. Note that this change is requisite for dropping the mask argument from thermal_zone_device_register_with_trips() going forward. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 | thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly | Rafael J. Wysocki
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP to allow their temperature to be set from user space via sysfs instead of using a nonzero writable trips mask during thermal zone registration, so make the imx thermal code do that. No intentional functional impact. Note that this change is requisite for dropping the mask argument from thermal_zone_device_register_with_trips() going forward. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 | thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly | Rafael J. Wysocki
Some Intel thermal drivers need/want the temperature of their trip points to be set by user space via sysfs and so they pass nonzero writable trip masks during thermal zone registration for this purpose. It is now possible to achieve the same result by setting the THERMAL_TRIP_FLAG_RW_TEMP trip flag directly, so modify the drivers in question to do that instead of using a nonzero writable trips mask. No intentional functional impact. Note that this change is requisite for dropping the mask argument from thermal_zone_device_register_with_trips() going forward. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 | thermal: core: Drop the .set_trip_hyst() thermal zone operation | Rafael J. Wysocki
None of the users of the thermal core provides a .set_trip_hyst() thermal zone operation, so drop that callback from struct thermal_zone_device_ops and update trip_point_hyst_store() accordingly. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 | thermal: core: Add flags to struct thermal_trip | Rafael J. Wysocki
In order to allow thermal zone creators to specify the writability of trip point temperature and hysteresis on a per-trip basis, add a flags field to struct thermal_trip and define flags to represent the desired trip properties. Also make thermal_zone_device_register_with_trips() set the THERMAL_TRIP_FLAG_RW_TEMP flag for all trips covered by the writable trips mask passed to it and modify the thermal sysfs code to look at the trip flags instead of using the writable trips mask directly or checking the presence of the .set_trip_hyst() zone callback. Additionally, make trip_point_temp_store() and trip_point_hyst_store() fail with an error code if the trip passed to one of them has THERMAL_TRIP_FLAG_RW_TEMP or THERMAL_TRIP_FLAG_RW_HYST, respectively, clear in its flags. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
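The per-trip writability check described in the commit above can be modeled in userspace as follows. This is only a sketch under stated assumptions: the flag values, the simplified `struct trip`, the function names and the `-EPERM` error code are all chosen for illustration and are not taken from the kernel sources.

```c
#include <errno.h>

/* Hypothetical flag values chosen for this sketch. */
#define TRIP_FLAG_RW_TEMP	(1u << 0)
#define TRIP_FLAG_RW_HYST	(1u << 1)

/* Simplified stand-in for struct thermal_trip. */
struct trip {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
	unsigned int flags;	/* per-trip writability flags */
};

/* Update the trip temperature only if the trip is flagged as writable. */
int trip_temp_store(struct trip *trip, int temperature)
{
	if (!(trip->flags & TRIP_FLAG_RW_TEMP))
		return -EPERM;	/* assumed error code */

	trip->temperature = temperature;
	return 0;
}

/* Same check for hysteresis, controlled by its own flag. */
int trip_hyst_store(struct trip *trip, int hysteresis)
{
	if (!(trip->flags & TRIP_FLAG_RW_HYST))
		return -EPERM;	/* assumed error code */

	trip->hysteresis = hysteresis;
	return 0;
}
```

Because each trip carries its own flags, temperature and hysteresis writability can be decided independently per trip, and no zone-wide bit mask is needed.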
2024-02-27 | thermal: core: Move initial num_trips assignment before memcpy() | Nathan Chancellor
When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain that supports __counted_by() (such as clang-18 and newer), there is a panic on boot:
    [ 2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
    [ 2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
    ...
    [ 3.039208] Call trace:
    [ 3.041643] __fortify_report+0x5c/0x74
    [ 3.045469] __fortify_panic+0x18/0x20
    [ 3.049209] thermal_zone_device_register_with_trips+0x4c8/0x4f8
This panic occurs because trips is counted by num_trips but num_trips is assigned after the call to memcpy(), so the fortify checks think the buffer size is zero because tz was allocated with kzalloc(). Move the num_trips assignment before the memcpy() to resolve the panic and ensure that the fortify checks work properly. Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
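The ordering requirement from the commit above can be illustrated with a small userspace sketch. It assumes simplified, hypothetical `struct trip` and `struct zone` types and uses `calloc()` in place of kzalloc(); plain userspace memcpy() does not perform the counted-by check, so the comment marks where the order matters.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct trip {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
};

struct zone {
	int num_trips;		/* element count of trips[] */
	struct trip trips[];	/* flexible array, sized by num_trips */
};

/*
 * Allocate a zone and copy the caller's trips table into it.
 * Returns NULL on allocation failure or if the arguments are invalid.
 */
struct zone *zone_alloc(const struct trip *trips, int num_trips)
{
	struct zone *z;

	if (trips == NULL || num_trips <= 0)
		return NULL;

	/* Zeroed allocation, so num_trips starts out as 0. */
	z = calloc(1, sizeof(*z) + (size_t)num_trips * sizeof(z->trips[0]));
	if (z == NULL)
		return NULL;

	/*
	 * Set the counter first: a bounds-checking memcpy() that derives
	 * the size of trips[] from num_trips would otherwise see a
	 * zero-sized destination buffer.
	 */
	z->num_trips = num_trips;
	memcpy(z->trips, trips, (size_t)num_trips * sizeof(z->trips[0]));

	return z;
}
```

The caller owns the returned zone and releases it with `free()`; the source table can be dropped right after the call because the zone holds its own copy.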
2024-02-23 | thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS | Rafael J. Wysocki
The only difference made by CONFIG_THERMAL_WRITABLE_TRIPS is whether or not the writable trips mask passed during thermal zone registration will take any effect, but whoever passes a non-zero writable trips mask to thermal_zone_device_register_with_trips() can be forgiven for thinking that it will always work. Moreover, some thermal drivers expect user space to set trip temperature values, so they select CONFIG_THERMAL_WRITABLE_TRIPS, possibly overriding a manual choice to unset it and going against the design purportedly allowing system integrators to decide on the writability of trip points for the given kernel build. It is also set in one platform's defconfig. Furthermore, CONFIG_THERMAL_WRITABLE_TRIPS only affects trip temperature, because trip hysteresis is writable as long as the thermal zone provides a callback to update it, regardless of the CONFIG_THERMAL_WRITABLE_TRIPS value. The above means that the symbol in question is used inconsistently and its purpose is at least moot, so remove it and always take the writable trip mask passed to thermal_zone_device_register_with_trips() into account. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 | thermal: intel: Adjust ops handling during thermal zone registration | Rafael J. Wysocki
Because thermal zone operations are now stored directly in struct thermal_zone_device, thermal zone creators can discard the operations structure after the zone registration is complete, or it can be made read-only. Accordingly, make int340x_thermal_zone_add() use a local variable to represent thermal zone operations, so it is freed automatically upon the function exit, and make the other Intel thermal drivers use const zone operations structures. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 | thermal: core: Store zone ops in struct thermal_zone_device | Rafael J. Wysocki
The current code requires thermal zone creators to pass pointers to writable ops structures to thermal_zone_device_register_with_trips() which needs to modify the target struct thermal_zone_device_ops object if the "critical" operation in it is NULL. Moreover, the callers of thermal_zone_device_register_with_trips() are required to hold on to the struct thermal_zone_device_ops object passed to it until the given thermal zone is unregistered. Both of these requirements are quite inconvenient, so modify struct thermal_zone_device to contain struct thermal_zone_device_ops as a field and make thermal_zone_device_register_with_trips() copy the contents of the struct thermal_zone_device_ops passed to it via a pointer (which can be const now) to that field. Also adjust the code using thermal zone ops accordingly and modify thermal_of_zone_register() to use a local ops variable during thermal zone registration so ops do not need to be freed in thermal_of_zone_unregister() any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
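A minimal userspace sketch of the copy-then-default pattern this commit describes is shown below. The types `struct zone_ops` and `struct zone`, and the functions `default_critical()` and `zone_set_ops()`, are hypothetical stand-ins invented for illustration; they are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical, reduced operations table. */
struct zone_ops {
	int (*get_temp)(int *temp);
	void (*critical)(void);
};

/* The zone embeds its own copy of the operations. */
struct zone {
	struct zone_ops ops;
};

/* Placeholder for a default critical-trip handler. */
void default_critical(void)
{
}

/*
 * Copy the caller's ops into the zone and fill in the default
 * "critical" handler in the copy. The caller's structure is never
 * written to, so it can be const or a short-lived local variable.
 */
void zone_set_ops(struct zone *z, const struct zone_ops *ops)
{
	z->ops = *ops;
	if (z->ops.critical == NULL)
		z->ops.critical = default_critical;
}
```

Since only the embedded copy is modified, the structure passed in stays untouched and does not need to outlive the call.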
2024-02-23 | thermal: intel: Discard trip tables after zone registration | Rafael J. Wysocki
Because the thermal core creates and uses its own copy of the trips table passed to thermal_zone_device_register_with_trips(), it is not necessary to hold on to a local copy of it any more after the given thermal zone has been registered. Accordingly, modify Intel thermal drivers to discard the trips tables passed to thermal_zone_device_register_with_trips() after thermal zone registration, for example by storing them in local variables which are automatically discarded when the zone registration is complete. Also make some additional code simplifications unlocked by the above changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 | thermal: core: Store zone trips table in struct thermal_zone_device | Rafael J. Wysocki
The current code expects thermal zone creators to pass a pointer to a writable trips table to thermal_zone_device_register_with_trips() and that trips table is then used by the thermal core going forward. Consequently, the callers of thermal_zone_device_register_with_trips() are required to hold on to the trips table passed to it until the given thermal zone is unregistered, at which point the trips table can be freed, but at the same time they are not expected to access that table directly. This is both error prone and confusing. To address it, turn the trips table pointer in struct thermal_zone_device into a flex array (counted by its num_trips field), allocate it during thermal zone device allocation and copy the contents of the trips table supplied by the zone creator (which can be const now) into it, which will allow the callers of thermal_zone_device_register_with_trips() to drop their trip tables right after the zone registration. This requires the imx thermal driver to be adjusted to store the new temperature in its internal trips table in imx_set_trip_temp(), because it will be separate from the core's trips table now and it has to be explicitly kept in sync with the latter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 | Merge branch 'thermal-core' | Rafael J. Wysocki
Merge thermal core changes for 6.9: - Minor fixes for thermal governors (Rafael J. Wysocki, Di Shen). - Trip point handling fixes for the iwlwifi wireless driver (Rafael J. Wysocki). - Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno). * thermal-tmp: thermal: gov_power_allocator: Avoid overwriting PID coefficients from setup time thermal: sysfs: Fix up white space in trip_point_temp_store() iwlwifi: mvm: Use for_each_thermal_trip() for walking trip points iwlwifi: mvm: Populate trip table before registering thermal zone iwlwifi: mvm: Drop unused fw_trips_index[] from iwl_mvm_thermal_device thermal: core: Change governor name to const char pointer thermal: gov_bang_bang: Fix possible cooling device state ping-pong thermal: gov_fair_share: Fix dependency on trip points ordering
2024-02-15 | x86/cpu/topology: Rename topology_max_die_per_package() | Thomas Gleixner
The plural of die is dies. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20240213210253.065874205@linutronix.de
2024-02-13 | powercap: intel_rapl: Fix locking in TPMI RAPL | Zhang Rui
The RAPL framework uses CPU hotplug locking to protect the rapl_packages list and rp->lead_cpu to guarantee that 1. the RAPL package device is not unprobed and freed 2. the cached rp->lead_cpu is always valid for operations like powercap sysfs accesses. Current RAPL APIs assume being called from CPU hotplug callbacks which hold the CPU hotplug lock, but TPMI RAPL driver invokes the APIs in the driver's .probe() function without acquiring the CPU hotplug lock. Fix the problem by providing both locked and lockless versions of RAPL APIs. Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.5+ <stable@vger.kernel.org> # 6.5+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 | thermal/intel: Fix intel_tcc_get_temp() to support negative CPU temperature | Zhang Rui
CPU temperature can be negative in some cases. Thus the negative CPU temperature should not be considered as a failure. Fix intel_tcc_get_temp() and its users to support negative CPU temperature. Fixes: a3c1f066e1c5 ("thermal/intel: Introduce Intel TCC library") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
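One common way to keep negative readings distinct from failures is to return a status code and pass the temperature back through a pointer. The sketch below shows that pattern in userspace; the function name `sensor_get_temp()`, its parameters and the chosen error codes are assumptions made for illustration and do not reproduce the driver's interface.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sensor read. The return value carries only the status
 * (0 or a negative errno); the temperature itself is stored in *temp,
 * so a negative temperature is never mistaken for an error.
 */
int sensor_get_temp(int tj_max, int offset, bool valid, int *temp)
{
	if (temp == NULL)
		return -EINVAL;

	if (!valid)
		return -EIO;	/* assumed error code for an invalid reading */

	/* The result may legitimately be below zero. */
	*temp = tj_max - offset;
	return 0;
}
```

Callers then test the return value for failure and read the temperature separately, instead of treating any negative result as an error.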
2024-02-12 | thermal: gov_power_allocator: Avoid overwriting PID coefficients from setup time | Di Shen
When the PID coefficients k_* are set via sysfs before the IPA algorithm is triggered, they are overwritten once IPA throttle() is called, and the configured values may differ from the ones estimated by the IPA internal algorithm. The overwrite may happen with a delay that depends on the thermal zone temperature: the temperature needs to cross the first trip point before the IPA algorithm starts operating. However, coefficients set at setup time should not be affected by or linked to any later operating phase, and their values must not be overwritten. This patch initializes params->sustainable_power when the governor binds to the thermal zone to avoid overwriting k_*. The basic functionality is not affected, as k_* can still be estimated if sustainable_power is modified. Signed-off-by: Di Shen <di.shen@unisoc.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 | thermal: sysfs: Fix up white space in trip_point_temp_store() | Rafael J. Wysocki
Remove an excess tab character from an otherwise empty code line. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
2024-02-08 | drivers/thermal/devfreq_cooling: Use new Energy Model interface | Lukasz Luba
The Energy Model framework supports runtime modification of the power values. Use the new EM table which is protected with RCU. Align the code so that this RCU read section is short. This change is not expected to alter the general functionality. Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-08 | drivers/thermal/cpufreq_cooling: Use new Energy Model interface | Lukasz Luba
The Energy Model framework supports runtime modification of the power values. Use the new EM table which is protected with RCU. Align the code so that this RCU read section is short. This change is not expected to alter the general functionality. Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-29 | thermal: gov_bang_bang: Fix possible cooling device state ping-pong | Rafael J. Wysocki
The current behavior of thermal_zone_trip_update() in the bang-bang thermal governor may be problematic for trip points with 0 hysteresis, because when the zone temperature reaches the trip temperature and stays there, it will then cause the cooling device to go "on" and "off" alternately, which is not desirable. Address this by requiring the zone temperature to actually fall below trip->temperature - trip->hysteresis for the cooling device to go off. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
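The switching rule described above can be sketched as a pure function. The name `bang_bang_next()` and its parameters are hypothetical; the sketch assumes the device turns on when the temperature reaches the trip temperature and, per the commit text, turns off only when the temperature is strictly below the trip temperature minus the hysteresis.

```c
#include <stdbool.h>

/*
 * Return the next on/off state of the cooling device.
 * All temperatures are in millidegrees Celsius.
 */
bool bang_bang_next(bool on, int temp, int trip_temp, int hysteresis)
{
	/* Turn on once the trip temperature is reached. */
	if (!on && temp >= trip_temp)
		return true;

	/* Turn off only after falling strictly below the lower threshold. */
	if (on && temp < trip_temp - hysteresis)
		return false;

	/* Otherwise keep the current state. */
	return on;
}
```

With zero hysteresis and a temperature sitting exactly at the trip point, the first condition turns the device on and the second never fires, so the state stays stable instead of alternating.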
2024-01-29 | thermal: gov_fair_share: Fix dependency on trip points ordering | Rafael J. Wysocki
The computation in the fair share governor's get_trip_level() function currently works under the assumption that the temperature ordering of trips[] in a thermal zone is ascending, which need not be the case. However, get_trip_level() can be made to work regardless of whether or not the trips table is ordered by temperature in any way, so change it accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
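One order-independent way to derive a level is to count the trips whose temperature lies below the zone temperature, since a count does not depend on how the table is sorted. The sketch below shows that idea only; `trips_exceeded()` is a hypothetical helper, and the exact value the governor returns is not stated in the commit message.

```c
#include <stddef.h>

/*
 * Count the trips whose temperature is strictly below the zone
 * temperature. The result is the same for any ordering of the table.
 */
int trips_exceeded(const int *trip_temps, size_t num_trips, int zone_temp)
{
	int count = 0;
	size_t i;

	for (i = 0; i < num_trips; i++) {
		if (trip_temps[i] < zone_temp)
			count++;
	}

	return count;
}
```

An approach that stops at the first trip above the zone temperature would give different answers for `{75000, 65000, 85000}` and `{65000, 75000, 85000}`, whereas the count is identical for both.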
2024-01-22 | thermal: intel: powerclamp: Remove dead code for target mwait value | Srinivas Pandruvada
After conversion of this driver to use powercap idle_inject core, this driver doesn't use target_mwait value. So remove dead code. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-19 | thermal: loongson2: Replace of_device.h with explicit includes | Rob Herring
The DT of_device.h and of_platform.h date back to the separate of_platform_bus_type before it was merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. of_device.h isn't needed, but mod_devicetable.h and property.h were implicitly included. Signed-off-by: Rob Herring <robh@kernel.org>
2024-01-17 | Merge tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds
Pull more thermal control updates from Rafael Wysocki: "These add support for debugfs-based diagnostics to the thermal core, simplify the thermal netlink API, fix system-wide PM support in the Intel HFI driver and clean up some code. Specifics: - Add debugfs-based diagnostics support to the thermal core (Daniel Lezcano, Dan Carpenter) - Fix a power allocator thermal governor issue preventing it from resetting cooling devices sometimes (Di Shen) - Simplify the thermal netlink API and clean up related code (Rafael J. Wysocki) - Make the Intel HFI driver support hibernation and deep suspend properly (Ricardo Neri)" * tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up() thermal: intel: hfi: Add syscore callbacks for system-wide PM thermal: gov_power_allocator: avoid inability to reset a cdev thermal: helpers: Rearrange thermal_cdev_set_cur_state() thermal: netlink: Rework notify API for cooling devices thermal: core: Use kstrdup_const() during cooling device registration thermal/debugfs: Add thermal debugfs information for mitigation episodes thermal/debugfs: Add thermal cooling device debugfs information thermal: netlink: Pass thermal zone pointer to notify routines thermal: netlink: Drop thermal_notify_tz_trip_add/delete() thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down() thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
2024-01-16 | Merge branches 'thermal-core' and 'thermal-intel' | Rafael J. Wysocki
Merge additional updates for 6.8-rc1 in the thermal core and in the Intel HFI thermal driver: - Add debugfs-based diagnostics support to the thermal core (Daniel Lezcano, Dan Carpenter). - Fix a power allocator thermal governor issue preventing it from resetting cooling devices sometimes (Di Shen). - Simplify the thermal netlink API and clean up related code (Rafael J. Wysocki). - Make the Intel HFI driver support hibernation and deep suspend properly (Ricardo Neri). * thermal-core: thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up() thermal: gov_power_allocator: avoid inability to reset a cdev thermal: helpers: Rearrange thermal_cdev_set_cur_state() thermal: netlink: Rework notify API for cooling devices thermal: core: Use kstrdup_const() during cooling device registration thermal/debugfs: Add thermal debugfs information for mitigation episodes thermal/debugfs: Add thermal cooling device debugfs information thermal: netlink: Pass thermal zone pointer to notify routines thermal: netlink: Drop thermal_notify_tz_trip_add/delete() thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down() thermal: netlink: Pass pointers to thermal_notify_tz_trip_change() * thermal-intel: thermal: intel: hfi: Add syscore callbacks for system-wide PM
2024-01-12 | thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up() | Dan Carpenter
Add a missing mutex_unlock(&thermal_dbg->lock) to this error path. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 | thermal: intel: hfi: Add syscore callbacks for system-wide PM | Ricardo Neri
The kernel allocates a memory buffer and provides its location to the hardware, which uses it to update the HFI table. This allocation occurs during boot and remains constant throughout runtime. When resuming from hibernation, the restore kernel allocates a second memory buffer and reprograms the HFI hardware with the new location as part of a normal boot. The location of the second memory buffer may differ from the one allocated by the image kernel. When the restore kernel transfers control to the image kernel, its HFI buffer becomes invalid, potentially leading to memory corruption if the hardware writes to it (the hardware continues to use the buffer from the restore kernel). It is also possible that the hardware "forgets" the address of the memory buffer when resuming from "deep" suspend. Memory corruption may also occur in such a scenario. To prevent the described memory corruption, disable HFI when preparing to suspend or hibernate. Enable it when resuming. Add syscore callbacks to handle the package of the boot CPU (packages of non-boot CPUs are handled via CPU offline). Syscore ops always run on the boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend and hibernation. Syscore ops only run in these cases. Cc: 6.1+ <stable@vger.kernel.org> # 6.1+ Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> [ rjw: Comment adjustment, subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 | thermal: gov_power_allocator: avoid inability to reset a cdev | Di Shen
Commit 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low") adds an update flag to avoid triggering a thermal event when there is no need, and the thermal cdev is updated once when the temperature is low. But when the trips are writable, and switch_on_temp is set to a higher value, the cooling device state may not be reset to 0, because last_temperature is smaller than switch_on_temp. For example:
    First: switch_on_temp=70 control_temp=85;
    Then userspace changes the trip_temp: switch_on_temp=45 control_temp=55 cur_temp=54
    Then userspace resets the trip_temp: switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54
At this time, the cooling device state should be reset to 0. However, because
    cur_temp(57) < switch_on_temp(70)
    last_temp(54) < switch_on_temp(70) ----> update = false,
update is false, so the cooling device state cannot be reset. Using the observation that tz->passive can also be regarded as the temperature status, set the update flag to the tz->passive value. When the temperature drops below switch_on for the first time, the states of cooling devices can be reset once, and tz->passive is updated to 0. In the next round, because tz->passive is 0, cdev->state will not be updated. By using the tz->passive value as the "update" flag, the issue above can be solved, and the cooling devices can be updated only once when the temperature is low. Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low") Cc: 5.13+ <stable@vger.kernel.org> # 5.13+ Suggested-by: Wei Wang <wvw@google.com> Signed-off-by: Di Shen <di.shen@unisoc.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
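The difference between the two ways of computing the "update" flag can be replayed with the numbers from the example in this commit message. This is a userspace sketch only: `struct zone_state` and both helper functions are hypothetical, and both helpers are meant for the case where the current temperature is below switch_on_temp.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the thermal zone state. */
struct zone_state {
	int temperature;	/* current reading */
	int last_temperature;	/* previous reading */
	int passive;		/* nonzero while the zone is being mitigated */
};

/* Previous behavior: derive "update" from the last temperature reading. */
bool need_update_old(const struct zone_state *z, int switch_on_temp)
{
	return z->last_temperature >= switch_on_temp;
}

/* Behavior after the fix: reuse tz->passive as the "update" flag. */
bool need_update_new(const struct zone_state *z)
{
	return z->passive != 0;
}
```

For the state from the example (cur_temp=57, last_temp=54, switch_on_temp=70, with the zone still marked passive from the earlier threshold of 45), the old check yields false and the cooling devices are never reset, while the new check yields true once and then false after passive is cleared.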
2024-01-12 | thermal: helpers: Rearrange thermal_cdev_set_cur_state() | Rafael J. Wysocki
Change the code layout in thermal_cdev_set_cur_state() so it returns early on errors which is more consistent with what happens elsewhere. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12 | thermal: netlink: Rework notify API for cooling devices | Rafael J. Wysocki
In analogy with some previous thermal netlink API changes, redefine thermal_notify_cdev_state_update(), thermal_notify_cdev_add() and thermal_notify_cdev_delete() to take a const cdev pointer as their first argument and let them extract the requisite information from there by themselves. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12thermal: core: Use kstrdup_const() during cooling device registrationChristophe JAILLET
Some *thermal_cooling_device_register() calls pass a string literal as the 'type' parameter, so kstrdup_const() can be used instead of kstrdup() to avoid a memory allocation in such cases. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12thermal/debugfs: Add thermal debugfs information for mitigation episodesDaniel Lezcano
The mitigation episodes are recorded. A mitigation episode happens when the first trip point is crossed on the way up and then on the way down. Other trip points can also be crossed during the episode and are accounted to it. The interesting information is the average temperature at the trip point, the undershot and the overshot. The standard deviation of the mitigated temperature will be added later.

The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way:

thermal/
 `-- thermal_zones
     |-- 0
     |   `-- mitigations
     `-- 1
         `-- mitigations

The content of the mitigations file has the following format:

,-Mitigation at 349988258us, duration=130136ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |   130136 |    68227 |    62500 |    75625 |
|    1 | passive |     75000 |      2000 |   104209 |    74857 |    71666 |    77500 |
,-Mitigation at 272451637us, duration=75000ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |    75000 |    68561 |    62500 |    75000 |
|    1 | passive |     75000 |      2000 |    60714 |    74820 |    70555 |    77500 |
,-Mitigation at 238184119us, duration=27316ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |    27316 |    73377 |    62500 |    75000 |
|    1 | passive |     75000 |      2000 |    19468 |    75284 |    69444 |    77500 |
,-Mitigation at 39863713us, duration=136196ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |   136196 |    73922 |    62500 |    75000 |
|    1 | passive |     75000 |      2000 |    91721 |    74386 |    69444 |    78125 |

More information for a better understanding of the thermal behavior will be added later. The idea is to give detailed statistics about the undershots and overshots, the temperature speed, etc. As all the information in a single file is too much, the idea would be to create a directory named with the mitigation timestamp where all data could be added.

Please note this code is immune to trip ordering but not to a trip temperature change while a mitigation is happening. However, this situation should be extremely rare, perhaps never happening, and we might ask ourselves whether something should be done in the core framework for other components first.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
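The avg/min/max columns shown above can be maintained without storing every sample. The following is a minimal sketch of one way to do that, assuming an incremental integer average; struct trip_stats_model and its helpers are hypothetical names, and nothing here claims to match how the debugfs code computes its figures.

```c
#include <limits.h>

/* Hypothetical per-trip accumulator for one mitigation episode. */
struct trip_stats_model {
	int count;	/* number of samples seen */
	int min;	/* lowest temperature seen */
	int max;	/* highest temperature seen */
	int avg;	/* running average, integer arithmetic */
};

static void trip_stats_init(struct trip_stats_model *s)
{
	s->count = 0;
	s->min = INT_MAX;
	s->max = INT_MIN;
	s->avg = 0;
}

/* Fold one temperature sample into the statistics. */
static void trip_stats_add(struct trip_stats_model *s, int temp)
{
	s->count++;

	if (temp < s->min)
		s->min = temp;
	if (temp > s->max)
		s->max = temp;

	/* Incremental mean: avg += (x - avg) / n, truncated by integer division. */
	s->avg += (temp - s->avg) / s->count;
}
```

Because of the integer division, the running average can drift slightly from the exact mean; that trade-off keeps the per-sample cost constant and the storage fixed.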
2024-01-12thermal/debugfs: Add thermal cooling device debugfs informationDaniel Lezcano
The thermal framework does not have any debug information except a sysfs stat which is a bit controversial. It allocates big chunks of memory for every cooling device with a high number of states, which on some production systems can amount to several megabytes of memory for just a portion of it. As sysfs is limited to a page size, the output is not usable with large data arrays and gets truncated.

The patch provides the same information as sysfs, except that the transitions are dynamically allocated, thus they won't show more events than the ones which actually occurred. There is no longer a size limitation, and it opens the field for more debugging information, which is what debugfs, not sysfs, is designed for.

The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way:

thermal/
 `-- cooling_devices
     |-- 0
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 1
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 2
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 3
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     `-- 4
         |-- clear
         |-- time_in_state_ms
         |-- total_trans
         `-- trans_table

The content of the files in the cooling devices directory is the same as the sysfs one except for the trans_table which has the following format:

Transition Hits
1->0       246
0->1       246
2->1       632
1->2       632
3->2       98
2->3       98

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
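The idea of allocating a record only for transitions that actually occurred, instead of a full states-by-states matrix, can be sketched as a small linked list in user-space C. The names (struct trans_model, trans_record(), trans_hits(), trans_free()) are hypothetical and the data structure is an assumption made for illustration, not the one used by the patch.

```c
#include <stdlib.h>

/* Hypothetical record for one observed "from->to" transition. */
struct trans_model {
	int from;
	int to;
	unsigned long hits;
	struct trans_model *next;
};

/* Count a transition, allocating its record on first occurrence. */
static int trans_record(struct trans_model **head, int from, int to)
{
	struct trans_model *t;

	for (t = *head; t; t = t->next) {
		if (t->from == from && t->to == to) {
			t->hits++;
			return 0;
		}
	}

	t = malloc(sizeof(*t));
	if (!t)
		return -1;

	t->from = from;
	t->to = to;
	t->hits = 1;
	t->next = *head;
	*head = t;

	return 0;
}

/* Look up the hit count; transitions never seen report 0. */
static unsigned long trans_hits(const struct trans_model *head, int from, int to)
{
	for (; head; head = head->next)
		if (head->from == from && head->to == to)
			return head->hits;

	return 0;
}

/* Release every record, similar in spirit to writing the "clear" file. */
static void trans_free(struct trans_model *head)
{
	while (head) {
		struct trans_model *next = head->next;

		free(head);
		head = next;
	}
}
```

Memory use then scales with the number of distinct transitions observed rather than with the square of the number of states, which is the property the changelog highlights.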
2024-01-09Merge tag 'thermal-6.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:

"These add support for the D1/T113s THS controller to the sun8i driver and a DT-based mechanism for platforms to indicate a preference to reboot (instead of shutting down) on crossing a critical trip point, fix issues, make other improvements (in the IPA governor, the Intel HFI driver, the exynos driver and the thermal netlink interface among other places) and clean up code.

One long-standing issue addressed here is that trip point crossing notifications sent to user space might be unreliable due to the incorrect handling of trip point hysteresis in the thermal core: multiple notifications might be sent for the same event or there might be events without any notification at all.

Specifics:

- Add dynamic thresholds for trip point crossing detection to prevent trip point crossing notifications from being sent at incorrect times or not at all in some cases (Rafael J. Wysocki)
- Fix synchronization issues related to the resume of thermal zones during a system-wide resume and allow thermal zones to be resumed concurrently (Rafael J. Wysocki)
- Modify the thermal zone unregistration to wait for the given zone to go away completely before returning to the caller and rework the sysfs interface for trip points on top of that (Rafael J. Wysocki)
- Fix a possible NULL pointer dereference in thermal zone registration error path (Rafael J. Wysocki)
- Clean up the IPA thermal governor and modify it (with the help of a new governor callback) to avoid allocating and freeing memory every time its throttling callback is invoked (Lukasz Luba)
- Make the IPA thermal governor handle thermal instance weight changes via sysfs correctly (Lukasz Luba)
- Update the thermal netlink code to avoid sending messages if there are no recipients (Stanislaw Gruszka)
- Convert Mediatek Thermal to the json-schema (Rafał Miłecki)
- Fix thermal DT bindings issue on Loongson (Binbin Zhou)
- Fix returning NULL instead of -ENODEV during thermal probe on Loongson (Binbin Zhou)
- Add thermal DT binding for tsens on the SM8650 platform (Neil Armstrong)
- Add reboot on the critical trip point crossing option feature (Fabio Estevam)
- Use DEFINE_SIMPLE_DEV_PM_OPS to define PM functions for thermal suspend/resume on AmLogic (Uwe Kleine-König)
- Add D1/T113s THS controller support to the Sun8i thermal control driver (Maxim Kiselev)
- Fix example in the thermal DT binding for QCom SPMI (Johan Hovold)
- Fix compilation warning in the tmon utility (Florian Eckert)
- Add support for interrupt-based thermal configuration on Exynos along with a set of related cleanups (Mateusz Majewski)
- Make the Intel HFI thermal driver enable an HFI instance (e.g. processor package) from its first online CPU and disable it when the last CPU in it goes offline (Ricardo Neri)
- Fix a kernel-doc warning and a spello in the cpuidle_cooling thermal driver (Randy Dunlap)
- Move the .get_temp() thermal zone callback presence check to the thermal zone registration code (Daniel Lezcano)
- Use the for_each_trip() macro for trip points table walks in a few places in the thermal core (Rafael J. Wysocki)
- Make all trip point updates (via sysfs as well as from the platform firmware) trigger trip change notifications (Rafael J. Wysocki)
- Drop redundant code from the thermal core and make one function in it take a const pointer argument (Rafael J. Wysocki)"

* tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
  thermal: trip: Constify thermal zone argument of thermal_zone_trip_id()
  thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
  thermal: intel: hfi: Enable an HFI instance from its first online CPU
  thermal: intel: hfi: Refactor enabling code into helper functions
  thermal/drivers/exynos: Use set_trips ops
  thermal/drivers/exynos: Use BIT wherever possible
  thermal/drivers/exynos: Split initialization of TMU and the thermal zone
  thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
  thermal/drivers/exynos: Simplify regulator (de)initialization
  thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
  thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts
  thermal/drivers/exynos: Drop id field
  thermal/drivers/exynos: Remove an unnecessary field description
  tools/thermal/tmon: Fix compilation warning for wrong format
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
  thermal/drivers/sun8i: Add D1/T113s THS controller support
  dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
  thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
  thermal: amlogic: Make amlogic_thermal_disable() return void
  ...
2024-01-09thermal: netlink: Pass thermal zone pointer to notify routinesRafael J. Wysocki
There are several routines in the thermal netlink API that take a thermal zone ID or a thermal zone type as their arguments, but from their callers' perspective it would be more convenient to pass a thermal zone pointer to them and let them extract the necessary data from the given thermal zone object by themselves. Modify the code accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09thermal: netlink: Drop thermal_notify_tz_trip_add/delete()Rafael J. Wysocki
Because thermal_notify_tz_trip_add/delete() are never used, drop them entirely along with the related code. The addition or removal of trip points is not supported by the thermal core and is unlikely to be supported in the future, so it is also unlikely that these functions will ever be needed. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()Rafael J. Wysocki
Instead of requiring the callers of thermal_notify_tz_trip_up/down() to provide specific values needed to populate struct param in them, make them extract those values from objects passed by the callers via const pointers. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
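The pattern described above, where a notify routine receives const pointers and extracts the values it needs itself, can be sketched in user-space C. Everything here is hypothetical (struct tz_model, struct trip_model, struct event_model and both helpers); in particular, deriving the trip index from the trip's position in the zone's trip array is an assumption made for the sake of the example.

```c
/* Hypothetical trip point description. */
struct trip_model {
	int temperature;
	int hysteresis;
};

/* Hypothetical thermal zone holding an array of trips. */
struct tz_model {
	int id;
	int temperature;
	const struct trip_model *trips;
	int num_trips;
};

/* Hypothetical event payload, filled by the notify routine itself. */
struct event_model {
	int tz_id;
	int trip_id;
	int temp;
};

/* Assumed for illustration: a trip's ID is its index in the zone's array. */
static int tz_model_trip_id(const struct tz_model *tz,
			    const struct trip_model *trip)
{
	return (int)(trip - tz->trips);
}

/* Callers pass objects; the routine pulls out the fields it needs. */
static struct event_model make_trip_event(const struct tz_model *tz,
					  const struct trip_model *trip)
{
	struct event_model ev = {
		.tz_id = tz->id,
		.trip_id = tz_model_trip_id(tz, trip),
		.temp = tz->temperature,
	};

	return ev;
}
```

Compared with passing three integers, the pointer-based signature keeps call sites short and makes it harder to pass values that belong to different zones by mistake; the const qualifiers document that the routine only reads from the objects.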
2024-01-09thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()Rafael J. Wysocki
Instead of requiring the caller of thermal_notify_tz_trip_change() to provide specific values needed to populate struct param in it, make it extract those values from objects passed to it by the caller via const pointers. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-05Merge branch 'thermal-intel'Rafael J. Wysocki
Merge changes in thermal control drivers for Intel platforms for 6.8-rc1:

- Make the Intel HFI thermal driver enable an HFI instance (e.g. processor package) from its first online CPU and disable it when the last CPU in it goes offline (Ricardo Neri).

* thermal-intel:
  thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
  thermal: intel: hfi: Enable an HFI instance from its first online CPU
  thermal: intel: hfi: Refactor enabling code into helper functions