summaryrefslogtreecommitdiff
path: root/drivers/crypto/intel/qat
AgeCommit message (Collapse)Author
2024-02-24crypto: qat - uninitialized variable in adf_hb_error_inject_write()Dan Carpenter
There are a few issues in this code. If *ppos is non-zero then the first part of the buffer is not initialized. We never initialize the last character of the buffer. The return is not checked so it's possible that none of the buffer is initialized. This is debugfs code which is root only and the impact of these bugs is very small. However, it's still worth fixing. To fix this: 1) Check that *ppos is zero. 2) Use copy_from_user() instead of simple_write_to_buffer(). 3) Explicitly add a NUL terminator. Fixes: e2b67859ab6e ("crypto: qat - add heartbeat error simulator") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17crypto: qat - resolve race condition during AER recoveryDamian Muszynski
During the PCI AER system's error recovery process, the kernel driver may encounter a race condition with freeing the reset_data structure's memory. If the device restart will take more than 10 seconds the function scheduling that restart will exit due to a timeout, and the reset_data structure will be freed. However, this data structure is used for completion notification after the restart is completed, which leads to a UAF bug. This results in a KFENCE bug notice. BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat] Use-after-free read at 0x00000000bc56fddf (in kfence-#142): adf_device_reset_worker+0x38/0xa0 [intel_qat] process_one_work+0x173/0x340 To resolve this race condition, the memory associated to the container of the work_struct is freed on the worker if the timeout expired, otherwise on the function that schedules the worker. The timeout detection can be done by checking if the caller is still waiting for completion or not by using completion_done() function. Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework") Cc: <stable@vger.kernel.org> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-17crypto: qat - change SLAs cleanup flow at shutdownDamian Muszynski
The implementation of the Rate Limiting (RL) feature includes the cleanup of all SLAs during device shutdown. For each SLA, the firmware is notified of the removal through an admin message, the data structures that take into account the budgets are updated and the memory is freed. However, this explicit cleanup is not necessary as (1) the device is reset, and the firmware state is lost and (2) all RL data structures are freed anyway. In addition, if the device is unresponsive, for example after a PCI AER error is detected, the admin interface might not be available. This might slow down the shutdown sequence and cause a timeout in the recovery flows which in turn makes the driver believe that the device is not recoverable. Fix by replacing the explicit SLAs removal with just a free of the SLA data structures. Fixes: d9fb8408376e ("crypto: qat - add rate limiting feature to qat_4xxx") Cc: <stable@vger.kernel.org> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - improve aer error reset handlingMun Chun Yep
Rework the AER reset and recovery flow to take into account root port integrated devices that gets reset between the error detected and the slot reset callbacks. In adf_error_detected() the devices is gracefully shut down. The worker threads are disabled, the error conditions are notified to listeners and through PFVF comms and finally the device is reset as part of adf_dev_down(). In adf_slot_reset(), the device is brought up again. If SRIOV VFs were enabled before reset, these are re-enabled and VFs are notified of restarting through PFVF comms. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - limit heartbeat notificationsFurong Zhou
When the driver detects an heartbeat failure, it starts the recovery flow. Set a limit so that the number of events is limited in case the heartbeat status is read too frequently. Signed-off-by: Furong Zhou <furong.zhou@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - add auto reset on errorDamian Muszynski
Expose the `auto_reset` sysfs attribute to configure the driver to reset the device when a fatal error is detected. When auto reset is enabled, the driver resets the device when it detects either an heartbeat failure or a fatal error through an interrupt. This patch is based on earlier work done by Shashank Gupta. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - add fatal error notificationMun Chun Yep
Notify a fatal error condition and optionally reset the device in the following cases: * if the device reports an uncorrectable fatal error through an interrupt * if the heartbeat feature detects that the device is not responding This patch is based on earlier work done by Shashank Gupta. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - re-enable sriov after pf resetMun Chun Yep
When a Physical Function (PF) is reset, SR-IOV gets disabled, making the associated Virtual Functions (VFs) unavailable. Even after reset and using pci_restore_state, VFs remain uncreated because the numvfs still at 0. Therefore, it's necessary to reconfigure SR-IOV to re-enable VFs. This commit introduces the ADF_SRIOV_ENABLED configuration flag to cache the SR-IOV enablement state. SR-IOV is only re-enabled if it was previously configured. This commit also introduces a dedicated workqueue without `WQ_MEM_RECLAIM` flag for enabling SR-IOV during Heartbeat and CPM error resets, preventing workqueue flushing warning. This patch is based on earlier work done by Shashank Gupta. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - update PFVF protocol for recoveryMun Chun Yep
Update the PFVF logic to handle restart and recovery. This adds the following functions: * adf_pf2vf_notify_fatal_error(): allows the PF to notify VFs that the device detected a fatal error and requires a reset. This sends to VF the event `ADF_PF2VF_MSGTYPE_FATAL_ERROR`. * adf_pf2vf_wait_for_restarting_complete(): allows the PF to wait for `ADF_VF2PF_MSGTYPE_RESTARTING_COMPLETE` events from active VFs before proceeding with a reset. * adf_pf2vf_notify_restarted(): enables the PF to notify VFs with an `ADF_PF2VF_MSGTYPE_RESTARTED` event after recovery, indicating that the device is back to normal. This prompts VF drivers switch back to use the accelerator for workload processing. These changes improve the communication and synchronization between PF and VF drivers during system restart and recovery processes. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - disable arbitration before resetFurong Zhou
Disable arbitration to avoid new requests to be processed before resetting a device. This is needed so that new requests are not fetched when an error is detected. Signed-off-by: Furong Zhou <furong.zhou@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - add fatal error notify methodFurong Zhou
Add error notify method to report a fatal error event to all the subsystems registered. In addition expose an API, adf_notify_fatal_error(), that allows to trigger a fatal error notification asynchronously in the context of a workqueue. This will be invoked when a fatal error is detected by the ISR or through Heartbeat. Signed-off-by: Furong Zhou <furong.zhou@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-02-09crypto: qat - add heartbeat error simulatorDamian Muszynski
Add a mechanism that allows to inject a heartbeat error for testing purposes. A new attribute `inject_error` is added to debugfs for each QAT device. Upon a write on this attribute, the driver will inject an error on the device which can then be detected by the heartbeat feature. Errors are breaking the device functionality thus they require a device reset in order to be recovered. This functionality is not compiled by default, to enable it CRYPTO_DEV_QAT_ERROR_INJECTION must be set. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-01-26crypto: qat - use kcalloc_node() instead of kzalloc_node()Erick Archer
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. So, use the purpose specific kcalloc_node() function instead of the argument count * size in the kzalloc_node() function. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/162 Signed-off-by: Erick Archer <erick.archer@gmx.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-01-26crypto: qat - avoid memcpy() overflow warningArnd Bergmann
The use of array_size() leads gcc to assume the memcpy() can have a larger limit than actually possible, which triggers a string fortification warning: In file included from include/linux/string.h:296, from include/linux/bitmap.h:12, from include/linux/cpumask.h:12, from include/linux/sched.h:16, from include/linux/delay.h:23, from include/linux/iopoll.h:12, from drivers/crypto/intel/qat/qat_common/adf_gen4_hw_data.c:3: In function 'fortify_memcpy_chk', inlined from 'adf_gen4_init_thd2arb_map' at drivers/crypto/intel/qat/qat_common/adf_gen4_hw_data.c:401:3: include/linux/fortify-string.h:579:4: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 579 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/fortify-string.h:588:4: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 588 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Add an explicit range check to avoid this. Fixes: 5da6a2d5353e ("crypto: qat - generate dynamically arbiter mappings") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-01-26crypto: qat - fix arbiter mapping generation algorithm for QAT 402xxDamian Muszynski
The commit "crypto: qat - generate dynamically arbiter mappings" introduced a regression on qat_402xx devices. This is reported when the driver probes the device, as indicated by the following error messages: 4xxx 0000:0b:00.0: enabling device (0140 -> 0142) 4xxx 0000:0b:00.0: Generate of the thread to arbiter map failed 4xxx 0000:0b:00.0: Direct firmware load for qat_402xx_mmp.bin failed with error -2 The root cause of this issue was the omission of a necessary function pointer required by the mapping algorithm during the implementation. Fix it by adding the missing function pointer. Fixes: 5da6a2d5353e ("crypto: qat - generate dynamically arbiter mappings") Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-29crypto: qat - generate dynamically arbiter mappingsDamian Muszynski
The thread-to-arbiter mapping describes which arbiter can assign jobs to an acceleration engine thread. The existing mappings are functionally correct, but hardcoded and not optimized. Replace the static mappings with an algorithm that generates optimal mappings, based on the loaded configuration. The logic has been made common so that it can be shared between all QAT GEN4 devices. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-29crypto: qat - add support for ring pair level telemetryLucas Segarra Fernandez
Expose through debugfs ring pair telemetry data for QAT GEN4 devices. This allows to gather metrics about the PCIe channel and device TLB for a selected ring pair. It is possible to monitor maximum 4 ring pairs at the time per device. For details, refer to debugfs-driver-qat_telemetry in Documentation/ABI. This patch is based on earlier work done by Wojciech Ziemba. Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-29crypto: qat - add support for device telemetryLucas Segarra Fernandez
Expose through debugfs device telemetry data for QAT GEN4 devices. This allows to gather metrics about the performance and the utilization of a device. In particular, statistics on (1) the utilization of the PCIe channel, (2) address translation, when SVA is enabled and (3) the internal engines for crypto and data compression. If telemetry is supported by the firmware, the driver allocates a DMA region and a circular buffer. When telemetry is enabled, through the `control` attribute in debugfs, the driver sends to the firmware, via the admin interface, the `TL_START` command. This triggers the device to periodically gather telemetry data from hardware registers and write it into the DMA memory region. The device writes into the shared region every second. The driver, every 500ms, snapshots the DMA shared region into the circular buffer. This is then used to compute basic metric (min/max/average) on each counter, every time the `device_data` attribute is queried. Telemetry counters are exposed through debugfs in the folder /sys/kernel/debug/qat_<device>_<BDF>/telemetry. For details, refer to debugfs-driver-qat_telemetry in Documentation/ABI. This patch is based on earlier work done by Wojciech Ziemba. Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-29crypto: qat - add admin msgs for telemetryLucas Segarra Fernandez
Extend the admin interface with two new public APIs to enable and disable the telemetry feature: adf_send_admin_tl_start() and adf_send_admin_tl_stop(). The first, sends to the firmware, through the ICP_QAT_FW_TL_START message, the IO address where the firmware will write telemetry metrics and a list of ring pairs (maximum 4) to be monitored. It returns the number of accelerators of each type supported by this hardware. After this message is sent, the firmware starts periodically reporting telemetry data using by writing into the dma buffer specified as input. The second, sends the admin message ICP_QAT_FW_TL_STOP which stops the reporting of telemetry data. This patch is based on earlier work done by Wojciech Ziemba. Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-29crypto: qat - include pci.h for GET_DEV()Lucas Segarra Fernandez
GET_DEV() macro expansion relies on struct pci_dev being defined. Include <linux/pci.h> at adf_accel_devices.h. Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-22crypto: qat - add support for 420xx devicesJie Wang
Add support for 420xx devices by including a new device driver that supports such devices, updates to the firmware loader and capabilities. Compared to 4xxx devices, 420xx devices have more acceleration engines (16 service engines and 1 admin) and support the wireless cipher algorithms ZUC and Snow 3G. Signed-off-by: Jie Wang <jie.wang@intel.com> Co-developed-by: Dong Xie <dong.xie@intel.com> Signed-off-by: Dong Xie <dong.xie@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-22crypto: qat - move fw config related structuresJie Wang
Relocate the structures adf_fw_objs and adf_fw_config from the file adf_4xxx_hw_data.c to the newly created adf_fw_config.h. These structures will be used by new device drivers. This does not introduce any functional change. Signed-off-by: Jie Wang <jie.wang@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-22crypto: qat - relocate portions of qat_4xxx codeJie Wang
Move logic that is common between QAT GEN4 accelerators to the qat_common folder. This includes addresses of CSRs, setters and configuration logic. When moved, functions and defines have been renamed from 4XXX to GEN4. Code specific to the device is moved to the file adf_gen4_hw_data.c. Code related to configuration is moved to the newly created adf_gen4_config.c. This does not introduce any functional change. Signed-off-by: Jie Wang <jie.wang@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-22crypto: qat - change signature of uof_get_num_objs()Jie Wang
Add accel_dev as parameter of the function uof_get_num_objs(). This is in preparation for the introduction of the QAT 420xx driver as it will allow to reconfigure the ae_mask when a configuration that does not require all AEs is loaded on the device. This does not introduce any functional change. Signed-off-by: Jie Wang <jie.wang@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-22crypto: qat - relocate and rename get_service_enabled()Jie Wang
Move the function get_service_enabled() from adf_4xxx_hw_data.c to adf_cfg_services.c and rename it as adf_get_service_enabled(). This function is not specific to the 4xxx and will be used by other QAT drivers. This does not introduce any functional change. Signed-off-by: Jie Wang <jie.wang@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-08crypto: qat - add NULL pointer checkGiovanni Cabiddu
There is a possibility that the function adf_devmgr_pci_to_accel_dev() might return a NULL pointer. Add a NULL pointer check in the function rp2srv_show(). Fixes: dbc8876dd873 ("crypto: qat - add rp2svc sysfs attribute") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: David Guckian <david.guckian@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-08crypto: qat - fix mutex ordering in adf_rlDamian Muszynski
If the function validate_user_input() returns an error, the error path attempts to unlock an unacquired mutex. Acquire the mutex before calling validate_user_input(). This is not strictly necessary but simplifies the code. Fixes: d9fb8408376e ("crypto: qat - add rate limiting feature to qat_4xxx") Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-08crypto: qat - fix error path in add_update_sla()Damian Muszynski
The input argument `sla_in` is a pointer to a structure that contains the parameters of the SLA which is being added or updated. If this pointer is NULL, the function should return an error as the data required for the algorithm is not available. By mistake, the logic jumps to the error path which dereferences the pointer. This results in a warnings reported by the static analyzer Smatch when executed without a database: drivers/crypto/intel/qat/qat_common/adf_rl.c:871 add_update_sla() error: we previously assumed 'sla_in' could be null (see line 812) This issue was not found in internal testing as the pointer cannot be NULL. The function add_update_sla() is only called (indirectly) by the rate limiting sysfs interface implementation in adf_sysfs_rl.c which ensures that the data structure is allocated and valid. This is also proven by the fact that Smatch executed with a database does not report such error. Fix it by returning with error if the pointer `sla_in` is NULL. Fixes: d9fb8408376e ("crypto: qat - add rate limiting feature to qat_4xxx") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-01crypto: qat - add sysfs_added flag for rate limitingDamian Muszynski
The qat_rl sysfs attribute group is registered within the adf_dev_start() function, alongside other driver components. If any of the functions preceding the group registration fails, the adf_dev_start() function returns, and the caller, to undo the operation, invokes adf_dev_stop() followed by adf_dev_shutdown(). However, the current flow lacks information about whether the registration of the qat_rl attribute group was successful or not. In cases where this condition is encountered, an error similar to the following might be reported: 4xxx 0000:6b:00.0: Starting device qat_dev0 4xxx 0000:6b:00.0: qat_dev0 started 9 acceleration engines 4xxx 0000:6b:00.0: Failed to send init message 4xxx 0000:6b:00.0: Failed to start device qat_dev0 sysfs group 'qat_rl' not found for kobject '0000:6b:00.0' ... sysfs_remove_groups+0x2d/0x50 adf_sysfs_rl_rm+0x44/0x70 [intel_qat] adf_rl_stop+0x2d/0xb0 [intel_qat] adf_dev_stop+0x33/0x1d0 [intel_qat] adf_dev_down+0xf1/0x150 [intel_qat] ... 4xxx 0000:6b:00.0: qat_dev0 stopped 9 acceleration engines 4xxx 0000:6b:00.0: Resetting device qat_dev0 To prevent attempting to remove attributes from a group that has not been added yet, a flag named 'sysfs_added' is introduced. This flag is set to true upon the successful registration of the attribute group. Fixes: d9fb8408376e ("crypto: qat - add rate limiting feature to qat_4xxx") Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-01crypto: qat - add sysfs_added flag for rasDamian Muszynski
The qat_ras sysfs attribute group is registered within the adf_dev_start() function, alongside other driver components. If any of the functions preceding the group registration fails, the adf_dev_start() function returns, and the caller, to undo the operation, invokes adf_dev_stop() followed by adf_dev_shutdown(). However, the current flow lacks information about whether the registration of the qat_ras attribute group was successful or not. In cases where this condition is encountered, an error similar to the following might be reported: 4xxx 0000:6b:00.0: Starting device qat_dev0 4xxx 0000:6b:00.0: qat_dev0 started 9 acceleration engines 4xxx 0000:6b:00.0: Failed to send init message 4xxx 0000:6b:00.0: Failed to start device qat_dev0 sysfs group 'qat_ras' not found for kobject '0000:6b:00.0' ... sysfs_remove_groups+0x29/0x50 adf_sysfs_stop_ras+0x4b/0x80 [intel_qat] adf_dev_stop+0x43/0x1d0 [intel_qat] adf_dev_down+0x4b/0x150 [intel_qat] ... 4xxx 0000:6b:00.0: qat_dev0 stopped 9 acceleration engines 4xxx 0000:6b:00.0: Resetting device qat_dev0 To prevent attempting to remove attributes from a group that has not been added yet, a flag named 'sysfs_added' is introduced. This flag is set to true upon the successful registration of the attribute group. Fixes: 532d7f6bc458 ("crypto: qat - add error counters") Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-11-17crypto: qat - prevent underflow in rp2srv_store()Dan Carpenter
The "ring" variable has an upper bounds check but nothing checks for negatives. This code uses kstrtouint() already and it was obviously intended to be declared as unsigned int. Make it so. Fixes: dbc8876dd873 ("crypto: qat - add rp2svc sysfs attribute") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - fix deadlock in backlog processingGiovanni Cabiddu
If a request has the flag CRYPTO_TFM_REQ_MAY_BACKLOG set, the function qat_alg_send_message_maybacklog(), enqueues it in a backlog list if either (1) there is already at least one request in the backlog list, or (2) the HW ring is nearly full or (3) the enqueue to the HW ring fails. If an interrupt occurs right before the lock in qat_alg_backlog_req() is taken and the backlog queue is being emptied, then there is no request in the HW queues that can trigger a subsequent interrupt that can clear the backlog queue. In addition subsequent requests are enqueued to the backlog list and not sent to the hardware. Fix it by holding the lock while taking the decision if the request needs to be included in the backlog queue or not. This synchronizes the flow with the interrupt handler that drains the backlog queue. For performance reasons, the logic has been changed to try to enqueue first without holding the lock. Fixes: 386823839732 ("crypto: qat - add backlog mechanism") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/all/af9581e2-58f9-cc19-428f-6f18f1f83d54@redhat.com/T/ Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - move adf_cfg_servicesGiovanni Cabiddu
The file adf_cfg_services.h cannot be included in header files since it instantiates the structure adf_cfg_services. Move that structure to its own file and export the symbol. This does not introduce any functional change. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add num_rps sysfs attributeCiunas Bennett
Add the attribute `num_rps` to the `qat` attribute group. This returns the number of ring pairs that a single device has. This allows to know the maximum value that can be set to the attribute `rp2svc`. Signed-off-by: Ciunas Bennett <ciunas.bennett@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add rp2svc sysfs attributeCiunas Bennett
Add the attribute `rp2svc` to the `qat` attribute group. This provides a way for a user to query a specific ring pair for the type of service that is currently configured for. When read, the service will be returned for the defined ring pair. When written to this value will be stored as the ring pair to return the service of. Signed-off-by: Ciunas Bennett <ciunas.bennett@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add rate limiting sysfs interfaceCiunas Bennett
Add an interface for the rate limiting feature which allows to add, remove and modify a QAT SLA (Service Level Agreement). This adds a new sysfs attribute group, `qat_rl`, which can be accessed from /sys/bus/pci/devices/<BUS:DEV:FUNCTION> with the following hierarchy: |-+ qat_rl |---- id (RW) # SLA identifier |---- cir (RW) # Committed Information Rate |---- pir (RW) # Peak Information Rate |---- srv (RW) # Service to be rate limited |---- rp (RW) (HEX) # Ring pairs to be rate limited |---- cap_rem (RW) # Remaining capability for a service |---- sla_op (WO) # Allows to perform an operation on an SLA The API works by setting the appropriate RW attributes and then issuing a command through the `sla_op`. For example, to create an SLA, a user needs to input the necessary data into the attributes cir, pir, srv and rp and then write into `sla_op` the command `add` to execute the operation. The API also provides `cap_rem` attribute to get information about the remaining device capability within a certain service which is required when setting an SLA. Signed-off-by: Ciunas Bennett <ciunas.bennett@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add rate limiting feature to qat_4xxxDamian Muszynski
The Rate Limiting (RL) feature allows to control the rate of requests that can be submitted on a ring pair (RP). This allows sharing a QAT device among multiple users while ensuring a guaranteed throughput. The driver provides a mechanism that allows users to set policies, that are programmed to the device. The device is then enforcing those policies. Configuration of RL is accomplished through entities called SLAs (Service Level Agreement). Each SLA object gets a unique identifier and defines the limitations for a single service across up to four ring pairs (RPs count allocated to a single VF). The rate is determined using two fields: * CIR (Committed Information Rate), i.e., the guaranteed rate. * PIR (Peak Information Rate), i.e., the maximum rate achievable when the device has available resources. The rate values are expressed in permille scale i.e. 0-1000. Ring pair selection is achieved by providing a 64-bit mask, where each bit corresponds to one of the ring pairs. This adds an interface and logic that allow to add, update, retrieve and remove an SLA. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add retrieval of fw capabilitiesDamian Muszynski
The QAT firmware provides a mechanism to retrieve its capabilities through the init admin interface. Add logic to retrieve the firmware capability mask from the firmware through the init/admin channel. This mask reports if the power management, telemetry and rate limiting features are supported. The fw capabilities are stored in the accel_dev structure and are used to detect if a certain feature is supported by the firmware loaded in the device. This is supported only by devices which have an admin AE. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add bits.h to icp_qat_hw.hDamian Muszynski
Some enums use the macro BIT. Include bits.h as it is missing. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - move admin apiGiovanni Cabiddu
The admin API is growing and deserves its own include. Move it from adf_common_drv.h to adf_admin.h. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - fix ring to service map for QAT GEN4Giovanni Cabiddu
The 4xxx drivers hardcode the ring to service mapping. However, when additional configurations where added to the driver, the mappings were not updated. This implies that an incorrect mapping might be reported through pfvf for certain configurations. Add an algorithm that computes the correct ring to service mapping based on the firmware loaded on the device. Fixes: 0cec19c761e5 ("crypto: qat - add support for compression for 4xxx") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - use masks for AE groupsGiovanni Cabiddu
The adf_fw_config structures hardcode a bit mask that represents the acceleration engines (AEs) where a certain firmware image will have to be loaded to. Remove the hardcoded masks and replace them with defines. This does not introduce any functional change. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - refactor fw config related functionsGiovanni Cabiddu
The logic that selects the correct adf_fw_config structure based on the configured service is replicated twice in the uof_get_name() and uof_get_ae_mask() functions. Refactor the code so that there is no replication. This does not introduce any functional change. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - count QAT GEN4 errorsShashank Gupta
Add logic to count correctable, non fatal and fatal error for QAT GEN4 devices. These counters are reported through sysfs attributes in the group qat_ras. Signed-off-by: Shashank Gupta <shashank.gupta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add error countersShashank Gupta
Introduce ras counters interface for counting QAT specific device errors and expose them through the newly created qat_ras sysfs group attribute. This adds the following attributes: - errors_correctable: number of correctable errors - errors_nonfatal: number of uncorrectable non fatal errors - errors_fatal: number of uncorrectable fatal errors - reset_error_counters: resets all counters These counters are initialized during device bring up and cleared during device shutdown and are applicable only to QAT GEN4 devices. Signed-off-by: Shashank Gupta <shashank.gupta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add handling of errors from ERRSOU3 for QAT GEN4Shashank Gupta
Add logic to detect, report and handle uncorrectable errors reported through the ERRSOU3 register in QAT GEN4 devices. Signed-off-by: Shashank Gupta <shashank.gupta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add adf_get_aram_base() helper functionShashank Gupta
Add the function adf_get_aram_base() which allows to return the base address of the aram bar. Signed-off-by: Shashank Gupta <shashank.gupta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add handling of compression related errors for QAT GEN4Shashank Gupta
Add logic to detect, report and handle correctable and uncorrectable errors related to the compression hardware. These are detected through the EXPRPSSMXLT, EXPRPSSMCPR and EXPRPSSMDCPR registers. Signed-off-by: Shashank Gupta <shashank.gupta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add handling of errors from ERRSOU2 for QAT GEN4Shashank Gupta
Add logic to detect, report and handle uncorrectable errors reported through the ERRSOU2 register in QAT GEN4 devices. Signed-off-by: Shashank Gupta <shashank.gupta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27crypto: qat - add reporting of errors from ERRSOU1 for QAT GEN4Shashank Gupta
Add logic to detect and report uncorrectable errors reported through the ERRSOU1 register in QAT GEN4 devices. This also introduces the adf_dev_err_mask structure as part of adf_hw_device_data which will allow to provide different error masks per device generation. Signed-off-by: Shashank Gupta <shashank.gupta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>