summaryrefslogtreecommitdiff
path: root/drivers/thermal
AgeCommit message (Collapse)Author
2024-04-23thermal/drivers/mediatek/lvts_thermal: Provision for gt variable locationNicolas Pitre
The golden temperature calibration value in nvram is not always the 3rd byte. A future commit will prove this assumption wrong. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240402032729.2736685-11-nico@fluxnic.net
2024-04-23thermal/drivers/mediatek/lvts_thermal: Add MT8186 supportNicolas Pitre
Various values extracted from the vendor's kernel driver. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240402032729.2736685-9-nico@fluxnic.net
2024-04-23thermal/drivers/mediatek/lvts_thermal: Guard against efuse data buffer overflowNicolas Pitre
We don't want to silently fetch garbage past the actual buffer. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240402032729.2736685-6-nico@fluxnic.net
2024-04-23thermal/drivers/mediatek/lvts_thermal: Use offsets for every calibration byteNicolas Pitre
Current code assumes calibration values are always stored contiguously in host endian order. A future patch will prove this wrong. Let's specify the offset for each calibration byte instead. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240402032729.2736685-5-nico@fluxnic.net
2024-04-23thermal/drivers/mediatek/lvts_thermal: Remove .hw_tshut_tempNicolas Pitre
All the .hw_tshut_temp instances are initialized with the same value. Let's remove those and use a common definition instead. If ever a different value must be used in the future then an override parameter could be added back. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240402032729.2736685-4-nico@fluxnic.net
2024-04-23thermal/drivers/mediatek/lvts_thermal: Move commentNicolas Pitre
Move efuse data interpretation inside lvts_golden_temp_init() alongside the actual code retrieving wanted value. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240402032729.2736685-3-nico@fluxnic.net
2024-04-23thermal/drivers/mediatek/lvts_thermal: Retrieve all calibration bytesNicolas Pitre
Calibration values are 24-bit wide. Those values so far appear to span only 16 bits but let's not push our luck. Found while looking at the original Mediatek driver code. Signed-off-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240402032729.2736685-2-nico@fluxnic.net
2024-04-23thermal/drivers/k3_bandgap: Remove some unused fields in struct k3_bandgapChristophe JAILLET
In "struct k3_bandgap", the 'conf' field is unused. Remove it. Found with cppcheck, unusedStructMember. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/206d39bed9ec6df9b4d80b1fc064e499389fc7fc.1712687420.git.christophe.jaillet@wanadoo.fr
2024-04-23thermal/drivers/qcom: Remove some unused fields in struct qpnp_tm_chipChristophe JAILLET
In "struct qpnp_tm_chip", the 'prev_stage' field is unused. Remove it. Found with cppcheck, unusedStructMember. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/d1c3a3c455f485dae46290e3488daf1dcc1d355a.1712687589.git.christophe.jaillet@wanadoo.fr
2024-04-23thermal/drivers/tsens: Fix null pointer dereferenceAleksandr Mishin
compute_intercept_slope() is called from calibrate_8960() (in tsens-8960.c) as compute_intercept_slope(priv, p1, NULL, ONE_PT_CALIB) which lead to null pointer dereference (if DEBUG or DYNAMIC_DEBUG set). Fix this bug by adding null pointer check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: dfc1193d4dbd ("thermal/drivers/tsens: Replace custom 8960 apis with generic apis") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240411114021.12203-1-amishin@t-argos.ru
2024-04-23thermal/drivers/mediatek/lvts_thermal: Add coeff for mt8192Hsin-Te Yuan
In order for lvts_raw_to_temp to function properly on mt8192, temperature coefficients for mt8192 need to be added. Fixes: 288732242db4 ("thermal/drivers/mediatek/lvts_thermal: Add mt8192 support") Signed-off-by: Hsin-Te Yuan <yuanhsinte@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240416-lvts_thermal-v2-1-f8a36882cc53@chromium.org
2024-04-23thermal/drivers/rcar_gen3: Update temperature approximation calculationNiklas Söderlund
The initial driver used a formula to approximate the temperature and register values reversed engineered from an out-of-tree BSP driver. This was needed as the datasheet at the time did not contain any information on how to do this. Later Gen3 (Rev 2.30) and Gen4 (all) now contains this information. Update the approximation formula to use the datasheet's information instead of the reversed-engineered one. On an idle M3-N without fused calibration values for PTAT and THCODE the old formula reports, zone0: 52000 zone1: 53000 zone2: 52500 While the new formula under the same circumstances reports, zone0: 52500 zone1: 54000 zone2: 54000 Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240327133013.3982199-3-niklas.soderlund+renesas@ragnatech.se
2024-04-23thermal/drivers/rcar_gen3: Move Tj_T storage to shared private dataNiklas Söderlund
The calculated Tj_T constant is calculated from the PTAT data either read from the first TSC zone on the device if calibration data is fused, or from fallback values in the driver itself. The value calculated is shared among all TSC zones. Move the Tj_T constant to the shared private data structure instead of duplicating it in each TSC private data. This requires adding a pointer to the shared data to the TSC private data structure. This back pointer make it easier to further rework the temperature conversion logic. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240327133013.3982199-2-niklas.soderlund+renesas@ragnatech.se
2024-04-23thermal/drivers/amlogic: Support A1 SoC family Thermal Sensor controllerDmitry Rokosov
In comparison to other Amlogic chips, there is one key difference. The offset for the sec_ao base, also known as u_efuse_off, is special, while other aspects remain the same. Signed-off-by: Dmitry Rokosov <ddrokosov@salutedevices.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240328191322.17551-3-ddrokosov@salutedevices.com
2024-04-23thermal/drivers/tsens: Add suspend to RAM support for tsensPriyansh Jain
As part of suspend to RAM, tsens hardware will be turned off. While resume callback, re-initialize tsens hardware. Signed-off-by: Priyansh Jain <quic_priyjain@quicinc.com> Acked-by: Amit Kucheria <amitk@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240328050230.31770-1-quic_priyjain@quicinc.com
2024-04-23thermal/drivers/armada: Simplify name sanitizationRasmus Villemoes
Simplify the code by using the helper we have for doing exactly this. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240320104940.65031-1-linux@rasmusvillemoes.dk
2024-04-23thermal/drivers/qcom/lmh: Check for SCM availability at probeKonrad Dybcio
Up until now, the necessary scm availability check has not been performed, leading to possible null pointer dereferences (which did happen for me on RB1). Fix that. Fixes: 53bca371cdf7 ("thermal/drivers/qcom: Add support for LMh driver") Cc: <stable@vger.kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240308-topic-rb1_lmh-v2-2-bac3914b0fe3@linaro.org
2024-04-19thermal: core: Introduce .trip_crossed() callback for thermal governorsRafael J. Wysocki
Introduce a new thermal governor callback called .trip_crossed() that will be invoked whenever a trip point is crossed by the zone temperature, either on the way up or on the way down. The trip crossing direction information will be passed to it and if multiple trips are crossed in the same direction during one thermal zone update, the new callback will be invoked for them in temperature order, either ascending or descending, depending on the trip crossing direction. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-19Merge back earlier thermal control material for v6.10.Rafael J. Wysocki
2024-04-19thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up()Rafael J. Wysocki
The count field in struct trip_stats, representing the number of times the zone temperature was above the trip point, needs to be incremented in thermal_debug_tz_trip_up(), for two reasons. First, if a trip point is crossed on the way up for the first time, thermal_debug_update_temp() called from update_temperature() does not see it because it has not been added to trips_crossed[] array in the thermal zone's struct tz_debugfs object yet. Therefore, when thermal_debug_tz_trip_up() is called after that, the trip point's count value is 0, and the attempt to divide by it during the average temperature computation leads to a divide error which causes the kernel to crash. Setting the count to 1 before the division by incrementing it fixes this problem. Second, if a trip point is crossed on the way up, but it has been crossed on the way up already before, its count value needs to be incremented to make a record of the fact that the zone temperature is above the trip now. Without doing that, if the mitigations applied after crossing the trip cause the zone temperature to drop below its threshold, the count will not be updated for this episode at all and the average temperature in the trip statistics record will be somewhat higher than it should be. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Cc :6.8+ <stable@vger.kernel.org> # 6.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-15Merge branch 'thermal-intel' into thermalRafael J. Wysocki
* thermal-intel: thermal: intel: int340x_thermal: replace deprecated strncpy() with strscpy() thermal: intel: hfi: Enable HFI only when required thermal: netlink: Rename thermal_gnl_family thermal: netlink: Add genetlink bind/unbind notifications
2024-04-11treewide: Use sysfs_bin_attr_simple_read() helperLukas Wunner
Deduplicate ->read() callbacks of bin_attributes which are backed by a simple buffer in memory: Use the newly introduced sysfs_bin_attr_simple_read() helper instead, either by referencing it directly or by declaring such bin_attributes with BIN_ATTR_SIMPLE_RO() or BIN_ATTR_SIMPLE_ADMIN_RO(). Aside from a reduction of LoC, this shaves off a few bytes from vmlinux (304 bytes on an x86_64 allyesconfig). No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Zhi Wang <zhiwang@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/92ee0a0e83a5a3f3474845db6c8575297698933a.1712410202.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-08ACPI: DPTF: Add Lunar Lake supportSumeet Pawnikar
Add Lunar Lake ACPI IDs for DPTF devices. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-08thermal: core: Relocate the struct thermal_governor definitionRafael J. Wysocki
Notice that struct thermal_governor is only used by the thermal core and so move its definition to thermal_core.h. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08thermal: core: Sort trip point crossing notifications by temperatureRafael J. Wysocki
If multiple trip points are crossed in one go and the trips table in the thermal zone device object is not sorted, the corresponding trip point crossing notifications sent to user space will not be ordered either. Moreover, if the trips table is sorted by trip temperature in ascending order, the trip crossing notifications on the way up will be sent in that order too, but the trip crossing notifications on the way down will be sent in the reverse order. This is generally confusing and it is better to make the kernel send the notifications in the order of growing (on the way up) or falling (on the way down) trip temperature. To achieve that, instead of sending a trip crossing notification and recording a trip crossing event in the statistics right away from handle_thermal_trip(), put the trip in question on a list that will be sorted by __thermal_zone_device_update() after processing all of the trips and before sending the notifications and recording trip crossing events. Link: https://lore.kernel.org/linux-pm/20240306085428.88011-1-daniel.lezcano@linaro.org/ Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08thermal: core: Send trip crossing notifications at init time if neededRafael J. Wysocki
If a trip point is already exceeded by the zone temperature at the initialization time, no trip crossing notification is send regarding this even though mitigation should be started then. Address this by rearranging the code in handle_thermal_trip() to send a trip crossing notification for trip points already exceeded by the zone temperature initially which also allows to reduce its size by using the observation that the initialization and regular trip crossing on the way up become the same case then. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08thermal: core: Rewrite comments in handle_thermal_trip()Rafael J. Wysocki
Make the comments regarding trip crossing and threshold updates in handle_thermal_trip() slightly more clear. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-08thermal: core: Make struct thermal_zone_device definition internalRafael J. Wysocki
Move the definitions of struct thermal_trip_desc and struct thermal_zone_device to an internal header file in the thermal core, as they don't need to be accessible to any code other than the thermal core and so they don't need to be present in a global header. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-08thermal: core: Move threshold out of struct thermal_tripRafael J. Wysocki
The threshold field in struct thermal_trip is only used internally by the thermal core and it is better to prevent drivers from misusing it. It also takes some space unnecessarily in the trip tables passed by drivers to the core during thermal zone registration. For this reason, introduce struct thermal_trip_desc as a wrapper around struct thermal_trip, move the threshold field directly into it and make the thermal core store struct thermal_trip_desc objects in the internal thermal zone trip tables. Adjust all of the code using trip tables in the thermal core accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08thermal: gov_step_wise: Simplify checks related to passive tripsRafael J. Wysocki
Make it more clear from the code flow that the passive polling status updates only take place for passive trip points. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08thermal: gov_step_wise: Simplify get_target_state()Rafael J. Wysocki
The step-wise governor's get_target_state() function contains redundant braces, redundant parens and a redundant next_target local variable, so get rid of all that stuff. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-03thermal: gov_power_allocator: Allow binding without trip pointsNikita Travkin
IPA probe function was recently refactored to perform extra error checks and make sure the thermal zone has trip points necessary for the IPA operation. With this change, if a thermal zone is probed such that it has no trip points that IPA can use, IPA will fail and the TZ won't be created. This is the case if a platform defines a TZ without cooling devices and only with "hot"/"critical" trip points, often found on some Qualcomm devices [1]. Documentation across IPA code (notably get_governor_trips() kerneldoc) suggests that IPA is supposed to handle such TZ even if it won't actually do anything. This commit partially reverts the previous change to allow IPA to bind to such "empty" thermal zones. Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier") Link: arch/arm64/boot/dts/qcom/sc7180.dtsi#n4776 # [1] Signed-off-by: Nikita Travkin <nikita@trvn.ru> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-03thermal: gov_power_allocator: Allow binding without cooling devicesNikita Travkin
IPA was recently refactored to split out memory allocation into a separate funciton. That funciton was made to return -EINVAL if there is zero power_actors and thus no memory to allocate. This causes IPA to fail probing when the thermal zone has no attached cooling devices. Since cooling devices can attach after the thermal zone is created and the governer is attached to it, failing probe due to the lack of cooling devices is incorrect. Change the allocate_actors_buffer() to return success when there is no cooling devices present. Fixes: 912e97c67cc3 ("thermal: gov_power_allocator: Move memory allocation out of throttle()") Signed-off-by: Nikita Travkin <nikita@trvn.ru> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27thermal: devfreq_cooling: Fix perf state when calculate dfc res_utilYe Zhang
The issue occurs when the devfreq cooling device uses the EM power model and the get_real_power() callback is provided by the driver. The EM power table is sorted ascending,can't index the table by cooling device state,so convert cooling state to performance state by dfc->max_state - dfc->capped_state. Fixes: 615510fe13bd ("thermal: devfreq_cooling: remove old power model and use EM") Cc: 5.11+ <stable@vger.kernel.org> # 5.11+ Signed-off-by: Ye Zhang <ye.zhang@rock-chips.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27thermal: intel: int340x_thermal: replace deprecated strncpy() with strscpy()Justin Stitt
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. psvt->limit.string can only be 8 bytes so let's use the appropriate size macro ACPI_LIMIT_STR_MAX_LEN. Neither psvt->limit.string or psvt_user[i].limit.string requires the NUL-padding behavior that strncpy() provides as they have both been filled with NUL-bytes prior to the string operation. | memset(&psvt->limit, 0, sizeof(u64)); and | psvt_user = kzalloc(psvt_len, GFP_KERNEL); Let's use `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings # [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27thermal: intel: hfi: Enable HFI only when requiredStanislaw Gruszka
Enable and disable hardware feedback interface (HFI) when user space handler is present. For example, enable HFI, when intel-speed-select or Intel Low Power daemon is running and subscribing to thermal netlink events. When user space handlers exit or remove subscription for thermal netlink events, disable HFI. Summary of changes: - Register a thermal genetlink notifier - In the notifier, process THERMAL_NOTIFY_BIND and THERMAL_NOTIFY_UNBIND reason codes to count number of thermal event group netlink multicast clients. If thermal netlink group has any listener enable HFI on all packages. If there are no listener disable HFI on all packages. - When CPU is online, instead of blindly enabling HFI, check if the thermal netlink group has any listener. This will make sure that HFI is not enabled by default during boot time. - Actual processing to enable/disable matches what is done in suspend/resume callbacks. Create two functions hfi_enable_instance() and hfi_disable_instance(), which can be called from the netlink notifier callback and suspend/resume callbacks. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27thermal: netlink: Rename thermal_gnl_familyStanislaw Gruszka
Almost all thermal netlink structures use thermal_genl prefix. Change thermal_gnl_family name accordingly for consistency. No functional impact. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27thermal: netlink: Add genetlink bind/unbind notificationsStanislaw Gruszka
Introduce a new feature to the thermal netlink framework, enabling the registration of sub drivers to receive events via a notifier mechanism. Specifically, implement genetlink family bind and unbind callbacks to send BIND and UNBIND events. The primary purpose of this enhancement is to facilitate the tracking of user-space consumers by the Intel HFI driver. By leveraging these notifications, the driver can determine when consumers are present or absent. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-26Revert "thermal: core: Don't update trip points inside the hysteresis range"Daniel Lezcano
It has been reported the commit cf3986f8c01d3 introduced a regression when the temperature is wavering in the hysteresis region. The mitigation stops leading to an uncontrolled temperature increase until reaching the critical trip point. Here what happens: * 'throttle' is when the current temperature is greater than the trip point temperature * 'target' is the mitigation level * 'passive' is positive when there is a mitigation, zero otherwise * these values are computed in the step_wise governor Configuration: trip point 1: temp=95°C, hyst=5°C (passive) trip point 2: temp=115°C, hyst=0°C (critical) governor: step_wise 1. The temperature crosses the way up the trip point 1 at 95°C - trend=raising - throttle=1, target=1 - passive=1 - set_trips: low=90°C, high=115°C 2. The temperature decreases but stays in the hysteresis region at 93°C - trend=dropping - throttle=0, target=0 - passive=1 Before cf3986f8c01d3 - set_trips: low=90°C, high=95°C After cf3986f8c01d3 - set_trips: low=90°C, high=115°C 3. The temperature increases a bit but stays in the hysteresis region at 94°C (so below the trip point 1 temp 95°C) - trend=raising - throttle=0, target=0 - passive=1 Before cf3986f8c01d3 - set_trips: low=90°C, high=95°C After cf3986f8c01d3 - set_trips: low=90°C, high=115°C 4. The temperature decreases but stays in the hysteresis region at 93°C - trend=dropping - throttle=0, target=THERMAL_NO_TARGET - passive=0 Before cf3986f8c01d3 - set_trips: low=90°C, high=95°C After cf3986f8c01d3 - set_trips: low=90°C, high=115°C At this point, the 'passive' value is zero, there is no mitigation, the temperature is in the hysteresis region, the next trip point is 115°C. As 'passive' is zero, the timer to monitor the thermal zone is disabled. Consequently if the temperature continues to increase, no mitigation will happen and it will reach the 115°C trip point and reboot. Before the optimization, the high boundary would have been 95°C, thus triggering the mitigation again and rearming the polling timer. The optimization make sense but given the current implementation of the step_wise governor collaborating via this 'passive' flag with the core framework it can not work. From a higher perspective it seems like there is a problem between the governor which sets a variable to be used by the core framework. That sounds akward and it would make much more sense if the core framework controls the governor and not the opposite. But as the devil hides in the details, there are some subtilities to be addressed before. Elaborating those would be out of the scope this changelog. So let's stay simple and revert the change first to fixup all broken mobile platforms. This reverts commit cf3986f8c01d3 ("thermal: core: Don't update trip points inside the hysteresis range") and takes a conflict with commit 0c0c4740c9d26 ("0c0c4740c9d2 thermal: trip: Use for_each_trip() in __thermal_zone_set_trips()") in drivers/thermal/thermal_trip.c into account. Fixes: cf3986f8c01d3 ("thermal: core: Don't update trip points inside the hysteresis range") Reported-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Cc: 6.7+ <stable@vger.kernel.org> # 6.7+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-13Merge tag 'thermal-v6.9-rc1' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux Merge additional thermal control changes for 6.9-rc1 from Daniel Lezcano: "- Fix memory leak in the error path at probe time in the Mediatek LVTS driver (Christophe Jaillet) - Fix control buffer enablement regression on Meditek MT7896 (Frank Wunderlich) - Drop spaces before TABs in different places: thermal-of, ST drivers and Makefile (Geert Uytterhoeven) - Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary among several SoC versions (Fabio Estevam) - Add support for H616 THS controller for the Sun8i platforms. Note that this change relies on another change in the SoC specific code which is included in this branch (Martin Botka) - Don't fail probe due to zone registration failure because there is no trip points defined in the DT (Mark Brown) - Support variable TMU array size for new platforms (Peng Fan) - Adjust the DT binding for thermal-of and make the polling time not required and assume it is zero when not found in the DT (Konrad Dybcio) - Add r8a779h0 support in both the DT and the driver (Geert Uytterhoeven)" * tag 'thermal-v6.9-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: thermal/drivers/rcar_gen3: Add support for R-Car V4M dt-bindings: thermal: rcar-gen3-thermal: Add r8a779h0 support thermal/of: Assume polling-delay(-passive) 0 when absent dt-bindings: thermal-zones: Don't require polling-delay(-passive) thermal/drivers/qoriq: Fix getting tmu range thermal/drivers/sun8i: Don't fail probe due to zone registration failure thermal/drivers/sun8i: Add support for H616 THS controller thermal/drivers/sun8i: Add SRAM register access code thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors thermal/drivers/sun8i: Explain unknown H6 register value dt-bindings: thermal: sun8i: Add H616 THS controller soc: sunxi: sram: export register 0 for THS on H616 dt-bindings: thermal: qoriq-thermal: Adjust fsl,tmu-range min/maxItems thermal: Drop spaces before TABs thermal/drivers/mediatek: Fix control buffer enablement on MT7896 thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
2024-03-13Merge tag 'thermal-6.9-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These mostly change the thermal core in a few ways allowing thermal drivers to be simplified, in particular in their removal and failing probe handling parts that are notoriously prone to errors, and propagate the changes to several drivers. Apart from that, support for a new platform is added (Intel Lunar Lake-M), some bugs are fixed and some code is cleaned up, as usual. Specifics: - Store zone trips table and zone operations directly in struct thermal_zone_device (Rafael Wysocki) - Fix up flex array initialization during thermal zone device registration (Nathan Chancellor) - Rework writable trip points handling in the thermal core and several drivers (Rafael Wysocki) - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi) - Use thermal zone accessor functions in the int340x Intel thermal driver (Rafael Wysocki) - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver (Srinivas Pandruvada) - Minor fixes for thermal governors (Rafael Wysocki, Di Shen) - Trip point handling fixes for the iwlwifi wireless driver (Rafael Wysocki) - Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno)" * tag 'thermal-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits) thermal: core: remove unnecessary check in trip_point_hyst_store() thermal: intel: int340x_thermal: Use thermal zone accessor functions thermal: core: Remove excess empty line from a comment thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID thermal: core: Eliminate writable trip points masks thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly wifi: iwlwifi: mvm: Set THERMAL_TRIP_FLAG_RW_TEMP directly mlxsw: core_thermal: Set THERMAL_TRIP_FLAG_RW_TEMP directly thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly thermal: core: Drop the .set_trip_hyst() thermal zone operation thermal: core: Add flags to struct thermal_trip thermal: core: Move initial num_trips assignment before memcpy() thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS thermal: intel: Adjust ops handling during thermal zone registration thermal: ACPI: Constify acpi_thermal_zone_ops thermal: core: Store zone ops in struct thermal_zone_device thermal: intel: Discard trip tables after zone registration thermal: ACPI: Discard trips table after zone registration thermal: core: Store zone trips table in struct thermal_zone_device ...
2024-03-13Merge tag 'pm-6.9-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "From the functional perspective, the most significant change here is the addition of support for Energy Models that can be updated dynamically at run time. There is also the addition of LZ4 compression support for hibernation, the new preferred core support in amd-pstate, new platforms support in the Intel RAPL driver, new model-specific EPP handling in intel_pstate and more. Apart from that, the cpufreq default transition delay is reduced from 10 ms to 2 ms (along with some related adjustments), the system suspend statistics code undergoes a significant rework and there is a usual bunch of fixes and code cleanups all over. Specifics: - Allow the Energy Model to be updated dynamically (Lukasz Luba) - Add support for LZ4 compression algorithm to the hibernation image creation and loading code (Nikhil V) - Fix and clean up system suspend statistics collection (Rafael Wysocki) - Simplify device suspend and resume handling in the power management core code (Rafael Wysocki) - Fix PCI hibernation support description (Yiwei Lin) - Make hibernation take set_memory_ro() return values into account as appropriate (Christophe Leroy) - Set mem_sleep_current during kernel command line setup to avoid an ordering issue with handling it (Maulik Shah) - Fix wake IRQs handling when pm_runtime_force_suspend() is used as a driver's system suspend callback (Qingliang Li) - Simplify pm_runtime_get_if_active() usage and add a replacement for pm_runtime_put_autosuspend() (Sakari Ailus) - Add a tracepoint for runtime_status changes tracking (Vilas Bhat) - Fix section title markdown in the runtime PM documentation (Yiwei Lin) - Enable preferred core support in the amd-pstate cpufreq driver (Meng Li) - Fix min_perf assignment in amd_pstate_adjust_perf() and make the min/max limit perf values in amd-pstate always stay within the (highest perf, lowest perf) range (Tor Vic, Meng Li) - Allow intel_pstate to assign model-specific values to strings used in the EPP sysfs interface and make it do so on Meteor Lake (Srinivas Pandruvada) - Drop long-unused cpudata::prev_cummulative_iowait from the intel_pstate cpufreq driver (Jiri Slaby) - Prevent scaling_cur_freq from exceeding scaling_max_freq when the latter is an inefficient frequency (Shivnandan Kumar) - Change default transition delay in cpufreq to 2ms (Qais Yousef) - Remove references to 10ms minimum sampling rate from comments in the cpufreq code (Pierre Gondois) - Honour transition_latency over transition_delay_us in cpufreq (Qais Yousef) - Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar) - General enhancements / cleanups to ARM cpufreq drivers (tianyu2, Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia Belova) - Update cpufreq-dt-platdev to block/approve devices (Richard Acayan) - Make the SCMI cpufreq driver get a transition delay value from firmware (Pierre Gondois) - Prevent the haltpoll cpuidle governor from shrinking guest poll_limit_ns below grow_start (Parshuram Sangle) - Avoid potential overflow in integer multiplication when computing cpuidle state parameters (C Cheng) - Adjust MWAIT hint target C-state computation in the ACPI cpuidle driver and in intel_idle to return a correct value for C0 (He Rongguang) - Address multiple issues in the TPMI RAPL driver and add support for new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui) - Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel Lezcano) - Fix kernel-doc for dtpm_create_hierarchy() (Yang Li) - Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth Norway Ananda) - Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil) - Fix a couple of warnings in the OPP core code related to W=1 builds (Viresh Kumar) - Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh Kumar) - Extend dev_pm_opp_data with turbo support (Sibi Sankar) - dt-bindings: drop maxItems from inner items (David Heidelberg)" * tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (95 commits) dt-bindings: opp: drop maxItems from inner items OPP: debugfs: Fix warning around icc_get_name() OPP: debugfs: Fix warning with W=1 builds cpufreq: Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h OPP: Extend dev_pm_opp_data with turbo support Fix cpupower-frequency-info.1 man page typo cpufreq: scmi: Set transition_delay_us firmware: arm_scmi: Populate fast channel rate_limit firmware: arm_scmi: Populate perf commands rate_limit cpuidle: ACPI/intel: fix MWAIT hint target C-state computation PM: sleep: wakeirq: fix wake irq warning in system suspend powercap: dtpm: Fix kernel-doc for dtpm_create_hierarchy() function cpufreq: Don't unregister cpufreq cooling on CPU hotplug PM: suspend: Set mem_sleep_current during kernel command line setup cpufreq: Honour transition_latency over transition_delay_us cpufreq: Limit resolving a frequency to policy min/max Documentation: PM: Fix runtime_pm.rst markdown syntax cpufreq: amd-pstate: adjust min/max limit perf cpufreq: Remove references to 10ms min sampling rate cpufreq: intel_pstate: Update default EPPs for Meteor Lake ...
2024-03-11thermal/drivers/rcar_gen3: Add support for R-Car V4MGeert Uytterhoeven
Add support for the Thermal Sensor/Chip Internal Voltage Monitor/Core Voltage Monitor (THS/CIVM/CVM) on the Renesas R-Car V4M (R8A779H0) SoC. The conversion formulas for R-Car V4M are the same as for other R-Car Gen4 SoCs. Based on a patch in the BSP by Duy Nguyen. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/bd5b002a802c1e058e0048592f17862db1d04263.1709722342.git.geert+renesas@glider.be
2024-03-11thermal/of: Assume polling-delay(-passive) 0 when absentKonrad Dybcio
Currently, thermal zones associated with providers that have interrupts for signaling hot/critical trips are required to set a polling-delay of 0 to indicate no polling. This feels a bit backwards. Change the code such that "no polling delay" also means "no polling". Suggested-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240125-topic-thermal-v1-2-3c9d4dced138@linaro.org
2024-03-11thermal/drivers/qoriq: Fix getting tmu rangePeng Fan
TMU Version 1 has 4 TTRCRs, while TMU Version >=2 has 16 TTRCRs. So limit the len to 4 will report "invalid range data" for i.MX93. This patch drop the local array with allocated ttrcr array and able to support larger tmu ranges. Fixes: f12d60c81fce ("thermal/drivers/qoriq: Support version 2.1") Tested-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240226003657.3012880-1-peng.fan@oss.nxp.com
2024-03-11thermal/drivers/sun8i: Don't fail probe due to zone registration failureMark Brown
Currently the sun8i thermal driver will fail to probe if any of the thermal zones it is registering fails to register with the thermal core. Since we currently do not define any trip points for the GPU thermal zones on at least A64 or H5 this means that we have no thermal support on these platforms: [ 1.698703] thermal_sys: Failed to find 'trips' node [ 1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1 even though the main CPU thermal zone on both SoCs is fully configured. This does not seem ideal, while we may not be able to use all the zones it seems better to have those zones which are usable be operational. Instead just carry on registering zones if we get any non-deferral error, allowing use of those zones which are usable. This means that we also need to update the interrupt handler to not attempt to notify the core for events on zones which we have not registered, I didn't see an ability to mask individual interrupts and I would expect that interrupts would still be indicated in the ISR even if they were masked. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240123-thermal-sun8i-registration-v3-1-3e5771b1bbdd@kernel.org
2024-03-11thermal/drivers/sun8i: Add support for H616 THS controllerMartin Botka
Add support for the thermal sensor found in H616 SoCs, is the same as the H6 thermal sensor controller, but with four sensors. Also the registers readings are wrong, unless a bit in the first SYS_CFG register cleared, so set exercise the SRAM regmap to take care of that. Signed-off-by: Martin Botka <martin.botka@somainline.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240219153639.179814-7-andre.przywara@arm.com
2024-03-11thermal/drivers/sun8i: Add SRAM register access codeAndre Przywara
The Allwinner H616 SoC needs to clear a bit in one register in the SRAM controller, to report reasonable temperature values. On reset, bit 16 in register 0x3000000 is set, which leads to the driver reporting temperatures around 200C. Clearing this bit brings the values down to the expected range. The BSP code does a one-time write in U-Boot, with a comment just mentioning the effect on the THS, but offering no further explanation. To not rely on firmware to set things up for us, add code that queries the SRAM controller device via a DT phandle link, then clear just this single bit. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240219153639.179814-6-andre.przywara@arm.com
2024-03-11thermal/drivers/sun8i: Extend H6 calibration to support 4 sensorsMaksim Kiselev
The H616 SoC resembles the H6 thermal sensor controller, with a few changes like four sensors. Extend sun50i_h6_ths_calibrate() function to support calibration of these sensors. Co-developed-by: Martin Botka <martin.botka@somainline.org> Signed-off-by: Martin Botka <martin.botka@somainline.org> Signed-off-by: Maksim Kiselev <bigunclemax@gmail.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240219153639.179814-5-andre.przywara@arm.com
2024-03-11thermal/drivers/sun8i: Explain unknown H6 register valueAndre Przywara
So far we were ORing in some "unknown" value into the THS control register on the Allwinner H6. This part of the register is not explained in the H6 manual, but the H616 manual details those bits, and on closer inspection the THS IP blocks in both SoCs seem very close: - The BSP code for both SoCs writes the same values into THS_CTRL. - The reset values of at least the first three registers are the same. Replace the "unknown" value with its proper meaning: "acquire time", most probably the sample part of the sample & hold circuit of the ADC, according to its explanation in the H616 manual. No functional change, just a macro rename and adjustment. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240219153639.179814-4-andre.przywara@arm.com