summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-28drm/amd/display: enable DPG when disabling plane for phantom pipeSamson Tam
[Why] In disable_dangling_plane, for phantom pipes, we enable OTG so disable programming gets the double buffer update. But this causes an underflow to occur. [How] Enable DPG prior to enabling OTG. Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: remove empty dc_link.cWenjing Liu
[why] We kept an empty dc_link.c file due to external build dependency. Now the last build dependency has been removed. We can safely delete this file. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Correct DML calculation to align HW formulaPaul Hsieh
[Why] In 2560x1440@240p eDP panel, some use cases will enable MPC combine with RGB MPO then underflow happened. This case is not allowed from HW formula.  [How] Correct eDP, DP and DP2 output bpp calculation to align HW formula. Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amdgpu/vcn: fix compilation issue with legacy gccbobzhou
This patch is used to fix following compilation issue with legacy gcc error: ‘for’ loop initial declarations are only allowed in C99 mode for (int i = 0; i < adev->vcn.num_vcn_inst; ++i) { Signed-off-by: bobzhou <bob.zhou@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amdkfd: Implement DMA buf fd export from KFDFelix Kuehling
Exports a DMA buf fd of a given KFD buffer handle. This is intended for being able to import KFD BOs into GEM contexts to leverage the amdgpu_bo_va API for more flexible virtual address mappings. It will also be used for the new upstreamable RDMA solution coming to UCX and RCCL. The corresponding user mode change (Thunk API and kfdtest) is here: https://github.com/fxkamd/ROCT-Thunk-Interface/commits/fxkamd/dmabuf Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amdgpu: Generalize KFD dmabuf importFelix Kuehling
Use proper amdgpu_gem_prime_import function to handle all kinds of imports. Remember the dmabuf reference to enable proper multi-GPU attachment to multiple VMs without erroneously re-exporting the underlying BO multiple times. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: merge dc_link.h into dc.h and dc_types.hWenjing Liu
[why] Remove the need to include dc_link.h separately. dc.h should contain everything needed on DM side. [How] Merge dc_link.h into dc.h and dc_types.h so DM only needs to include dc.h to use all link public functions. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Update BW ALLOCATION Function declarationMustapha Ghaddar
[WHY & HOW] Update the declaration to give a better idea of what the function does. Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28Revert "drm/amd/display: Fix FreeSync active bit issue"Aric Cyr
This reverts commit 6cfb6df2d645c00513ecf17832928e08979fa953. [Why & How] Original change causes black screen. Revert until fix is available. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: DAL to program DISPCLK WDIVIDER if PMFW doesn'tAlvin Lee
[Why & How] - If for any reason PMFW fails to set the expected (or valid) DISPCLK WDIVIDER, then DAL will program DENTIST DISPCLK WDIVIDER to correct for this issue Reviewed-by: Samson Tam <Samson.Tam@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Extend Freesync over PCon support for more devicesSung Joon Kim
[why] More branch devices are able to support Freesync over PCon so include them in the list of supporting devices. [how] Add more compatible PCon devices in the whitelist for Freesync over Pcon. Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: update pixel format in DP hw sequenceYihan Zhu
[WHY] DP 420 formats do not light up because the pixel processing mode of the DP_FORMAT is misprogrammed [HOW] Added appropriate programming for DP pixel format Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Yihan Zhu <yihan.zhu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: populate subvp cmd info only for the top pipeAyush Gupta
[Why] System restart observed while changing the display resolution to 8k with extended mode. Sytem restart was caused by a page fault. [How] When the driver populates subvp info it did it for both the pipes using vblank which caused an outof bounds array access causing the page fault. added checks to allow the top pipe only to fix this issue. Co-authored-by: Ayush Gupta <ayush.gupta@amd.com> Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Ayush Gupta <ayush.gupta@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: dcn32/321 dsc_pg_control not executed properlyHersen Wu
[why] during boot up or resume from s3, hw default value of domain_power_forceon is 1. when program domain_power_gate to 1 to power down hw block, hw will not change to power off due to domain_power_forceon = 1. [how] enable_power_gating_plane(true) should be executed to set domain_power_forceon to 0 before dsc_pg_control. dsc_pg_control is already called by dcn3x_init_hw--> init_pipes--> dsc_pg_control. no need be programmed with dcn3x_init_hw one more time. to trigger dchub, dsc block power state change, need program dc_ip_request_cntl to notify hw block. Reviewed-by: Nevenko Stupar <Nevenko.Stupar@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Allocation at stream EnableMustapha Ghaddar
[WHY & HOW] After we allocate BW at plug, we will de-alloc and allocate only what stream needs at stream_enable() [HOW] Introduce bw allocation check at link_enable() for DPIA links Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28Revert "drm/amd/display: Do not set DRR on pipe commit"Aric Cyr
This reverts commit 4f1b5e739dfd1edde33329e3f376733a131fb1ff. [Why & How] Original change causes a regression. Revert until fix is available. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Updating Video Format Fall Back Policy.Jasdeep Dhillon
[WHY] Adding 1920x1080 as fail safe mode for Video Format Fall Back Policy. Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Jasdeep Dhillon <jdhillon@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Reduce CPU busy-waiting for long delaysAric Cyr
[WHY] udelay should not be used for long waits since it keeps CPU active, wasting power. [HOW] Use fsleep where acceptable to allow CPU cores to be parked by the scheduler. Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytesAlex Hung
[WHY] When PTEBufferSizeInRequests is zero, UBSAN reports the following warning because dml_log2 returns an unexpected negative value: shift exponent 4294966273 is too large for 32-bit type 'int' [HOW] In the case PTEBufferSizeInRequests is zero, skip the dml_log2() and assign the result directly. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Ext displays with dock can't recognized after resumeRyan Lin
[Why] Needs to set the default value of the LTTPR timeout after resume. [How] Set the default (3.2ms) timeout at resuming if the sink supports LTTPR Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Ryan Lin <tsung-hua.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amdgpu: fix ttm_bo calltrace warning in psp_hw_finiHoratio Zhang
The call trace occurs when the amdgpu is removed after the mode1 reset. During mode1 reset, from suspend to resume, there is no need to reinitialize the ta firmware buffer which caused the bo pin_count increase redundantly. [ 489.885525] Call Trace: [ 489.885525] <TASK> [ 489.885526] amdttm_bo_put+0x34/0x50 [amdttm] [ 489.885529] amdgpu_bo_free_kernel+0xe8/0x130 [amdgpu] [ 489.885620] psp_free_shared_bufs+0xb7/0x150 [amdgpu] [ 489.885720] psp_hw_fini+0xce/0x170 [amdgpu] [ 489.885815] amdgpu_device_fini_hw+0x2ff/0x413 [amdgpu] [ 489.885960] ? blocking_notifier_chain_unregister+0x56/0xb0 [ 489.885962] amdgpu_driver_unload_kms+0x51/0x60 [amdgpu] [ 489.886049] amdgpu_pci_remove+0x5a/0x140 [amdgpu] [ 489.886132] ? __pm_runtime_resume+0x60/0x90 [ 489.886134] pci_device_remove+0x3e/0xb0 [ 489.886135] __device_release_driver+0x1ab/0x2a0 [ 489.886137] driver_detach+0xf3/0x140 [ 489.886138] bus_remove_driver+0x6c/0xf0 [ 489.886140] driver_unregister+0x31/0x60 [ 489.886141] pci_unregister_driver+0x40/0x90 [ 489.886142] amdgpu_exit+0x15/0x451 [amdgpu] Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Signed-off-by: longlyao <Longlong.Yao@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amdgpu: remove unused variable ringTom Rix
building with gcc and W=1 reports drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:81:29: error: variable ‘ring’ set but not used [-Werror=unused-but-set-variable] 81 | struct amdgpu_ring *ring; | ^~~~ ring is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: fix dm irq error message in gpu recovertiancyin
[Why] Variable adev->crtc_irq.num_types was initialized as the value of adev->mode_info.num_crtc at early_init stage, later at hw_init stage, the num_crtc changed due to the display pipe harvest on some SKUs, but the num_types was not updated accordingly, that cause below error in gpu recover. *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3 *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3 [How] Defer the initialization of num_types to eliminate the error logs. Signed-off-by: tiancyin <tianci.yin@amd.com> Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd: Fix initialization for nbio 7.5.1Mario Limonciello
A mistake has been made in the BIOS for some ASICs with NBIO 7.5.1 where some NBIO registers aren't properly setup. Ensure that they're set during initialization. Tested-by: Richard Gong <richard.gong@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Format input and output CSC matrixHarry Wentland
Format the input and output CSC matrix so they look like 3x4 matrixes. This will make parsing them much easier and allows us to quickly spot potential mistakes. Signed-off-by: Harry Wentland <harry.wentland@amd.com> Cc: Pekka Paalanen <ppaalanen@gmail.com> Cc: Sebastian Wick <sebastian.wick@redhat.com> Cc: Vitaly.Prosyak@amd.com Cc: Joshua Ashton <joshua@froggi.es> Cc: dri-devel@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amd/display: Don't restrict bpc to 8 bpcHarry Wentland
This will let us pass the kms_hdr.bpc_switch IGT test. The reason the bpc restriction was required is historical. At one point in time we were not falling back to a lower bpc when we didn't have enough bandwidth for the maximum bpc reported by a display. This meant that we couldn't enable some high refresh modes unless we limitted the bpc. Starting with this patch the issue is fixed: commit cbd14ae7ea93 ("drm/amd/display: Fix incorrectly pruned modes with deep color") This patch implemented a fallback mechanism if mode validation failed at the max bpc. This means users now automatically get all modes that can be supported by at least 6 bpc. The driver will enable the mode with the highest possible bpc that is supported by the display. v2: - explain why this is no longer needed (Michel) - refer to commit that fixed bpc fallback (Michel) Signed-off-by: Harry Wentland <harry.wentland@amd.com> Cc: Pekka Paalanen <ppaalanen@gmail.com> Cc: Sebastian Wick <sebastian.wick@redhat.com> Cc: Vitaly.Prosyak@amd.com Cc: Joshua Ashton <joshua@froggi.es> Cc: dri-devel@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Cc: Michel Dänzer <michel.daenzer@mailbox.org> Reviewed-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28drm/amdgpu: Make umc_v8_10_convert_error_address static and remove unused ↵Candice Li
variable Fixes following warnings: warning: no previous prototype for 'umc_v8_10_convert_error_address' warning: variable 'channel_index' set but not used Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: added a sysfs interface for thermal throttlingKun Liu
implement apu_thermal_cap r/w callback for vangogh Signed-off-by: Kun Liu <Kun.Liu2@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: added a sysfs interface for thermal throttlingKun Liu
added a sysfs interface for thermal throttling, then userspace can get/update thermal limit Signed-off-by: Kun Liu <Kun.Liu2@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd/display: Remove unused local variables and functionArthur Grillo
Remove a couple of local variables that are only set but never used, also remove an static utility function that is never used in consequence of the variable removal. This decrease the number of -Wunused-but-set-variable warnings. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd/display: Remove unused local variablesArthur Grillo
Remove local variables that were just set but were never used. This decrease the number of -Wunused-but-set-variable warnings. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/radeon: Fix eDP for single-display iMac11,2Mark Hawrylak
Apple iMac11,2 (mid 2010) also with Radeon HD-4670 that has the same issue as iMac10,1 (late 2009) where the internal eDP panel stays dark on driver load. This patch treats iMac11,2 the same as iMac10,1, so the eDP panel stays active. Additional steps: Kernel boot parameter radeon.nomodeset=0 required to keep the eDP panel active. This patch is an extension of commit 564d8a2cf3ab ("drm/radeon: Fix eDP for single-display iMac10,1 (v2)") Link: https://lore.kernel.org/all/lsq.1507553064.833262317@decadent.org.uk/ Signed-off-by: Mark Hawrylak <mark.hawrylak@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdkfd: Make kobj_type structures constantThomas Weißschuh
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: make kobj_type structures constantThomas Weißschuh
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd/display: Modify mismatched function nameJiapeng Chong
No functional modification involved. drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_detection.c:1199: warning: expecting prototype for dc_link_detect_connection_type(). Prototype was for link_detect_connection_type() instead. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4103 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd/display: Pass proper parent for DM backlight device registrationHans de Goede
The parent for the backlight device should be the drm-connector object, not the PCI device. Userspace relies on this to be able to detect which backlight class device to use on hybrid gfx devices where there may be multiple native (raw) backlight devices registered. Specifically gnome-settings-daemon expects the parent device to have an "enabled" sysfs attribute (as drm_connector devices do) and tests that this returns "enabled" when read. This aligns the parent of the backlight device with i915, nouveau, radeon. Note that drivers/gpu/drm/amd/amdgpu/atombios_encoders.c also already uses the drm_connector as parent, only amdgpu_dm.c used the PCI device as parent before this change. Note this is marked as a RFC because I don't have hw to test, so this has only been compile tested! If someone can test this on actual hw which hits the changed code path that would be great. Link: https://gitlab.gnome.org/GNOME/gnome-settings-daemon/-/issues/730 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd/pm: downgrade log level upon SMU IF version mismatchGuchun Chen
SMU IF version mismatch as a warning message exists widely after asic production, however, due to this log level setting, such mismatch warning will be caught by automation test like IGT and reported as a fake error after checking. As such mismatch does not break anything, to reduce confusion, downgrade it from dev_warn to dev_info. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: Add ecc info query interface for umc v8_10Candice Li
Support ecc info query for umc v8_10. v2: Simplied by convert_error_address. v3: Remove unused variable and invalid checking. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: Add convert_error_address function for umc v8_10Candice Li
Add convert_error_address for umc v8_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: add bad_page_threshold check in ras_eeprom_check_errTao Zhou
bad_page_threshold controls page retirement behavior and it should be also checked. v2: simplify the condition of bad page handling path. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: change default behavior of bad_page_threshold parameterTao Zhou
Ignore ras umc bad page threshold by default, GPU initialization won't be stopped in this mode. v2: refine the description of bad_page_threshold. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: exclude duplicate pages from UMC RAS UE countTao Zhou
If a UMC bad page is reserved but not freed by an application, the application may trigger uncorrectable error repeatly by accessing the page. v2: add specific function to do the check. v3: remove duplicate pages, calculate new added bad page number. v4: reuse save_bad_pages to calculate new added bad page number. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: add umc retire unit elementTao Zhou
It records how many bad pages are retired in one uncorrectable error. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd/pm: no pptable resetup on runpm exitingEvan Quan
It is assumed the pptable used before runpm is same as the one used afterwards. Thus, we can reuse the stored copy and do not need to resetup the pptable again. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <feifei.xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd/pm: correct the baco state setting for ArmD3 scenarioEvan Quan
The check for baco support relies on the correct baco state. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <feifei.xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: add more fields into device info, caches sizes, etc.Marek Olšák
AMDGPU_IDS_FLAGS_CONFORMANT_TRUNC_COORD: important for conformance on gfx11 Other fields are exposed from IP discovery. enabled_rb_pipes_mask_hi is added for future chips, currently 0. Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21403 Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdkfd: Fix an illegal memory accessQu Huang
In the kfd_wait_on_events() function, the kfd_event_waiter structure is allocated by alloc_event_waiters(), but the event field of the waiter structure is not initialized; When copy_from_user() fails in the kfd_wait_on_events() function, it will enter exception handling to release the previously allocated memory of the waiter structure; Due to the event field of the waiters structure being accessed in the free_waiters() function, this results in illegal memory access and system crash, here is the crash log: localhost kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x185/0x1e0 localhost kernel: RSP: 0018:ffffaa53c362bd60 EFLAGS: 00010082 localhost kernel: RAX: ff3d3d6bff4007cb RBX: 0000000000000282 RCX: 00000000002c0000 localhost kernel: RDX: ffff9e855eeacb80 RSI: 000000000000279c RDI: ffffe7088f6a21d0 localhost kernel: RBP: ffffe7088f6a21d0 R08: 00000000002c0000 R09: ffffaa53c362be64 localhost kernel: R10: ffffaa53c362bbd8 R11: 0000000000000001 R12: 0000000000000002 localhost kernel: R13: ffff9e7ead15d600 R14: 0000000000000000 R15: ffff9e7ead15d698 localhost kernel: FS: 0000152a3d111700(0000) GS:ffff9e855ee80000(0000) knlGS:0000000000000000 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 localhost kernel: CR2: 0000152938000010 CR3: 000000044d7a4000 CR4: 00000000003506e0 localhost kernel: Call Trace: localhost kernel: _raw_spin_lock_irqsave+0x30/0x40 localhost kernel: remove_wait_queue+0x12/0x50 localhost kernel: kfd_wait_on_events+0x1b6/0x490 [hydcu] localhost kernel: ? ftrace_graph_caller+0xa0/0xa0 localhost kernel: kfd_ioctl+0x38c/0x4a0 [hydcu] localhost kernel: ? kfd_ioctl_set_trap_handler+0x70/0x70 [hydcu] localhost kernel: ? kfd_ioctl_create_queue+0x5a0/0x5a0 [hydcu] localhost kernel: ? ftrace_graph_caller+0xa0/0xa0 localhost kernel: __x64_sys_ioctl+0x8e/0xd0 localhost kernel: ? syscall_trace_enter.isra.18+0x143/0x1b0 localhost kernel: do_syscall_64+0x33/0x80 localhost kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 localhost kernel: RIP: 0033:0x152a4dff68d7 Allocate the structure with kcalloc, and remove redundant 0-initialization and a redundant loop condition check. Signed-off-by: Qu Huang <qu.huang@linux.dev> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu/vcn: set and use harvest configJane Jian
in early init to set harvest config if the vcn0/1 is disabled rather than hard-code the ring attributes as before did Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amd: Don't allow s0ix on APUs older than RavenMario Limonciello
APUs before Raven didn't support s0ix. As we just relieved some of the safety checks for s0ix to improve power consumption on APUs that support it but that are missing BIOS support a new blind spot was introduced that a user could "try" to run s0ix. Plug this hole so that if users try to run s0ix on anything older than Raven it will just skip suspend of the GPU. Fixes: cf488dcd0ab7 ("drm/amd: Allow s0ix without BIOS support") Suggested-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23drm/amdgpu: fix incorrect active rb bitmap for gfx11Hawking Zhang
GFX v11 changes RB_BACKEND_DISABLE related registers from per SA to global ones. The approach to query active rb bitmap needs to be changed accordingly. Query per SE setting returns wrong active RB bitmap especially in the case when some of SA are disabled. With the new approach, driver will generate the active rb bitmap based on active SA bitmap and global active RB bitmap. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>