summaryrefslogtreecommitdiff
path: root/drivers/scsi/lpfc
AgeCommit message (Collapse)Author
2023-11-15scsi: lpfc: Refactor and clean up mailbox command memory freeJustin Tee
A lot of repeated clean up code exists when freeing mailbox commands in lpfc_mem_free_all(). Introduce a lpfc_mem_free_sli_mbox() helper routine to refactor the copy-paste code. Additionally, reinitialize the mailbox command structure context pointers to NULL in lpfc_sli4_mbox_cmd_free(). Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231031191224.150862-7-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15scsi: lpfc: Return early in lpfc_poll_eratt() when the driver is unloadingJustin Tee
Add a check in lpfc_poll_eratt() when the driver is unloading. There is no point to check for error attention events if the driver is rmmod'ed. If the driver is reloaded, as part of insmod initialization, then a fresh reset is always asserted to start clean and free of error attention events. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231031191224.150862-6-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15scsi: lpfc: Eliminate unnecessary relocking in lpfc_check_nlp_post_devloss()Justin Tee
In lpfc_check_nlp_post_devloss(), retaking of the ndlp lock in the if statement is useless because the very next line unlocks. Simply return to avoid relocking. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231031191224.150862-5-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15scsi: lpfc: Fix list_entry null check warning in lpfc_cmpl_els_plogi()Justin Tee
Smatch called out a warning for null checking a ptr that is assigned by list_entry(). list_entry() does not return null and, if the list is empty, can return an invalid ptr. Thus, the !psrp check does not execute properly. drivers/scsi/lpfc/lpfc_els.c:2133 lpfc_cmpl_els_plogi() warn: list_entry() does not return NULL 'prsp' Replace list_entry() with list_get_first(), which does a list_empty() check before returning the first entry. Fixes: a3c3c0a806f1 ("scsi: lpfc: Validate ELS LS_ACC completion payload") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-scsi/01b7568f-4ab4-4d56-bfa6-9ecc5fc261fe@moroto.mountain/ Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231031191224.150862-4-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15scsi: lpfc: Fix possible file string name overflow when updating firmwareJustin Tee
Because file_name and phba->ModelName are both declared a size 80 bytes, the extra ".grp" file extension could cause an overflow into file_name. Define a ELX_FW_NAME_SIZE macro with value 84. 84 incorporates the 4 extra characters from ".grp". file_name is changed to be declared as a char and initialized to zeros i.e. null chars. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231031191224.150862-3-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15scsi: lpfc: Correct maximum PCI function value for RAS fw loggingJustin Tee
Currently, the ras_fwlog_func sysfs parameter allows users to input a value greater than three when selecting a PCI function to enable RAS fw logging feature. The user's input is sanity checked in lpfc_sli4_ras_init(), but allowing an input greater than three doesn't make sense because the max number of ports per HBA is four. Change the allowable range from [0, 7] to [0, 3]. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231031191224.150862-2-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-02Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, megaraid_sas, lpfc, target, ibmvfc, scsi_debug) plus the usual assorted minor fixes and updates. The major change this time around is a prep patch for rethreading of the driver reset handler API not to take a scsi_cmd structure which starts to reduce various drivers' dependence on scsi_cmd in error handling" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (132 commits) scsi: ufs: core: Leave space for '\0' in utf8 desc string scsi: ufs: core: Conversion to bool not necessary scsi: ufs: core: Fix race between force complete and ISR scsi: megaraid: Fix up debug message in megaraid_abort_and_reset() scsi: aic79xx: Fix up NULL command in ahd_done() scsi: message: fusion: Initialize return value in mptfc_bus_reset() scsi: mpt3sas: Fix loop logic scsi: snic: Remove useless code in snic_dr_clean_pending_req() scsi: core: Add comment to target_destroy in scsi_host_template scsi: core: Clean up scsi_dev_queue_ready() scsi: pmcraid: Add missing scsi_device_put() in pmcraid_eh_target_reset_handler() scsi: target: core: Fix kernel-doc comment scsi: pmcraid: Fix kernel-doc comment scsi: core: Handle depopulation and restoration in progress scsi: ufs: core: Add support for parsing OPP scsi: ufs: core: Add OPP support for scaling clocks and regulators scsi: ufs: dt-bindings: common: Add OPP table scsi: scsi_debug: Add param to control sdev's allow_restart scsi: scsi_debug: Add debugfs interface to fail target reset scsi: scsi_debug: Add new error injection type: Reset LUN failed ...
2023-10-30Merge tag 'x86-core-2023-10-29-v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Thomas Gleixner: - Limit the hardcoded topology quirk for Hygon CPUs to those which have a model ID less than 4. The newer models have the topology CPUID leaf 0xB correctly implemented and are not affected. - Make SMT control more robust against enumeration failures SMT control was added to allow controlling SMT at boottime or runtime. The primary purpose was to provide a simple mechanism to disable SMT in the light of speculation attack vectors. It turned out that the code is sensible to enumeration failures and worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration which means the primary thread mask is not set up correctly. By chance a XEN/PV boot ends up with smp_num_siblings == 2, which makes the hotplug control stay at its default value "enabled". So the mask is never evaluated. The ongoing rework of the topology evaluation caused XEN/PV to end up with smp_num_siblings == 1, which sets the SMT control to "not supported" and the empty primary thread mask causes the hotplug core to deny the bringup of the APS. Make the decision logic more robust and take 'not supported' and 'not implemented' into account for the decision whether a CPU should be booted or not. - Fake primary thread mask for XEN/PV Pretend that all XEN/PV vCPUs are primary threads, which makes the usage of the primary thread mask valid on XEN/PV. That is consistent with because all of the topology information on XEN/PV is fake or even non-existent. - Encapsulate topology information in cpuinfo_x86 Move the randomly scattered topology data into a separate data structure for readability and as a preparatory step for the topology evaluation overhaul. - Consolidate APIC ID data type to u32 It's fixed width hardware data and not randomly u16, int, unsigned long or whatever developers decided to use. - Cure the abuse of cpuinfo for persisting logical IDs. Per CPU cpuinfo is used to persist the logical package and die IDs. That's really not the right place simply because cpuinfo is subject to be reinitialized when a CPU goes through an offline/online cycle. Use separate per CPU data for the persisting to enable the further topology management rework. It will be removed once the new topology management is in place. - Provide a debug interface for inspecting topology information Useful in general and extremly helpful for validating the topology management rework in terms of correctness or "bug" compatibility. * tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too x86/cpu: Provide debug interface x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids x86/apic: Use u32 for wakeup_secondary_cpu[_64]() x86/apic: Use u32 for [gs]et_apic_id() x86/apic: Use u32 for phys_pkg_id() x86/apic: Use u32 for cpu_present_to_apicid() x86/apic: Use u32 for check_apicid_used() x86/apic: Use u32 for APIC IDs in global data x86/apic: Use BAD_APICID consistently x86/cpu: Move cpu_l[l2]c_id into topology info x86/cpu: Move logical package and die IDs into topology info x86/cpu: Remove pointless evaluation of x86_coreid_bits x86/cpu: Move cu_id into topology info x86/cpu: Move cpu_core_id into topology info hwmon: (fam15h_power) Use topology_core_id() scsi: lpfc: Use topology_core_id() x86/cpu: Move cpu_die_id into topology info x86/cpu: Move phys_proc_id into topology info x86/cpu: Encapsulate topology information in cpuinfo_x86 ...
2023-10-13Merge patch series "lpfc: Update lpfc to revision 14.2.0.15"Martin K. Petersen
Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.15 This patch set contains error handling fixes, ELS bug fixes, and logging improvements. The patches were cut against Martin's 6.7/scsi-queue tree. Link: https://lore.kernel.org/r/20231009161812.97232-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13scsi: lpfc: Update lpfc version to 14.2.0.15Justin Tee
Update lpfc version to 14.2.0.15. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13scsi: lpfc: Introduce LOG_NODE_VERBOSE messaging flagJustin Tee
The preexisting LOG_NODE message flag frequently spams a subset of the same log messages during normal FC driver operations. When analyzing driver logs, this sometimes leads to difficulty in troubleshooting. Because LOG_IP log message flag is unused, convert it to a new LOG_NODE_VERBOSE flag. The LOG_NODE_VERBOSE shall specifically be used for diagnosing issues that require precise ndlp tracking detail. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13scsi: lpfc: Validate ELS LS_ACC completion payloadJustin Tee
A WCQE success completion status does not guarantee valid LS_ACC receipt for ELS commands. So, introduce a small helper routine that validates ELS LS_ACC frames in ELS cmpl routines. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13scsi: lpfc: Reject received PRLIs with only initiator fcn role for NPIV portsJustin Tee
Currently, NPIV ports send PRLI_ACC to all received unsolicited PRLI requests. For an NPIV port, there is no point to PRLI_ACC if the received PRLI request has the initiator function bit set and the target function bit unset. Modify the lpfc_rcv_prli_support_check() routine to send a PRLI_RJT in such cases. NPIV ports are expected to send PRLI_ACC only if the Target function bit is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13scsi: lpfc: Treat IOERR_SLI_DOWN I/O completion status the same as pci offlineJustin Tee
During receipt of a hardware error attention ACQE, IOERR_SLI_DOWN status is set by the driver for all outstanding I/Os. In such hardware error attention cases, we can treat the situation exactly the same as pci_channel_offline. Thus, add IOERR_SLI_DOWN status to the same category as pci_channel_offline handling in lpfc_nvme_io_cmd_cmpl. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13scsi: lpfc: Remove unnecessary zero return code assignment in ↵Justin Tee
lpfc_sli4_hba_setup In order to enter the !rc if statement block in question, rc had to have been zero to begin with. Thus, the rc = 0 assignment is unnecessary and can be removed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-10scsi: lpfc: Use topology_core_id()Thomas Gleixner
Use the provided topology helper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.446856860@linutronix.de
2023-10-10x86/cpu: Move phys_proc_id into topology infoThomas Gleixner
Rename it to pkg_id which is the terminology used in the kernel. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.329006989@linutronix.de
2023-09-13scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rportsJustin Tee
During rmmod, when dev_loss_tmo callback is called, an ndlp kref count is decremented twice. Once for SCSI transport registration and second to remove the initial node allocation kref. If there is also an NVMe transport registration, another reference count decrement is expected in lpfc_nvme_unregister_port(). Race conditions between the NVMe transport remoteport_delete and dev_loss_tmo callbacks sometimes results in premature ndlp object release resulting in use-after-free issues. Fix by not dropping the ndlp object in dev_loss_tmo callback with an outstanding NVMe transport registration. Inversely, mark the final NLP_DROPPED flag in lpfc_nvme_unregister_port when rmmod flag is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230908211923.37603-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmoJustin Tee
When a dev_loss_tmo event occurs, an ndlp lock is taken before checking nlp_flag for NLP_DROPPED. There is an attempt to restore the ndlp lock when exiting the if statement, but the nlp_put kref could be the final decrement causing a use-after-free memory access on a released ndlp object. Instead of trying to reacquire the ndlp lock after checking nlp_flag, just return after calling nlp_put. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230908211852.37576-1-justintee8345@gmail.com Reviewed-by: "Ewan D. Milne" <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()Jinjie Ruan
Since debugfs_create_file() returns ERR_PTR and never NULL, use IS_ERR() to check the return value. Fixes: 2fcbc569b9f5 ("scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI") Fixes: 4c47efc140fa ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures") Fixes: 6a828b0f6192 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues") Fixes: 95bfc6d8ad86 ("scsi: lpfc: Make FW logging dynamically configurable") Fixes: 9f77870870d8 ("scsi: lpfc: Add debugfs support for cm framework buffers") Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20230906030809.2847970-1-ruanjinjie@huawei.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-02Merge branch 'fixes' into miscJames Bottomley
2023-08-21scsi: lpfc: Do not abuse UUID APIs and LPFC_COMPRESS_VMID_SIZEAndy Shevchenko
The lpfc_vmid_host_uuid is not defined as uuid_t and its usage is not the same as for uuid_t operations (like exporting or importing). Hence replace call to uuid_is_null() by respective memchr_inv() without abusing casting. With that, replace LPFC_COMPRESS_VMID_SIZE with plain number and respective sizeof() to make code robust to changes in the future, if any. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230818155452.875781-1-andriy.shevchenko@linux.intel.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-07scsi: lpfc: Remove reftag check in DIF pathsJustin Tee
When preparing protection DIF I/O for DMA, the driver obtains reference tags from scsi_prot_ref_tag(). Previously, there was a wrong assumption that an all 0xffffffff value meant error and thus the driver failed the I/O. This patch removes the evaluation code and accepts whatever the upper layer returns. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230803211932.155745-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-07scsi: lpfc: Modify when a node should be put in device recovery mode during RSCNJustin Tee
Only nodes whose state is at least past a PLOGI issue and strictly less than a PRLI issue should be put into device recovery mode upon RSCN receipt. Previously, the allowance of LOGO and PRLI completion states did not make sense because those nodes should be allowed to flow through and marked as NPort dissappeared as is normally done. A follow up RSCN GID_FT would recover those nodes in such cases. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230804195546.157839-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Copyright updates for 14.2.0.14 patchesJustin Tee
Update copyrights to 2023 for files modified in the 14.2.0.14 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-13-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Update lpfc version to 14.2.0.14Justin Tee
Update lpfc version to 14.2.0.14 Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-12-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Clean up SLI-4 sysfs resource reportingJustin Tee
Currently, we have dated logic to work around the differences between SLI-4 and SLI-3 resource reporting through sysfs. Leave the SLI-3 path untouched, but for SLI4 path, retrieve resource values from the phba->sli4_hba->max_cfg_param structure. Max values are populated during ACQE events right after READ_CONFIG mbox cmd is sent. Instead of the dated subtraction logic, used resource calculation is directly fed into sysfs for display. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-11-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Refactor cpu affinity assignment pathsJustin Tee
During initialization, a lot of the same logic is used on MSI-X vector CPU affinity assignment. Create a lpfc_next_present_cpu() helper routine, and apply its usage for refactoring purposes. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-10-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Abort outstanding ELS cmds when mailbox timeout error is detectedJustin Tee
A mailbox timeout error usually indicates something has gone wrong, and a follow up reset of the HBA is a typical recovery mechanism. Introduce a MBX_TMO_ERR flag to detect such cases and have lpfc_els_flush_cmd abort ELS commands if the MBX_TMO_ERR flag condition was set. This ensures all of the registered SGL resources meant for ELS traffic are not leaked after an HBA reset. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Make fabric zone discovery more robust when handling unsolicited ↵Justin Tee
LOGO This patch provides better target rport recovery when a target rport is running in initiator mode to discover the fabric. Such a target will issue a LOGO before switching back to strict target mode and changes are made to recover the login. Log messages are also updated accordingly. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Set Establish Image Pair service parameter only for Target FunctionsJustin Tee
Previously, Establish Image Pair was set in all PRLI_ACC responses regardless if the received PRLI was from an initiator or target function. Specific target vendors that can operate in both initiator and target mode, may view the PRLI_ACC with Establish Image Pair set as an invalid service parameter when operating in initiator only mode. This causes discovery issues later when the target switches on its target mode function. Revise logic that determines an rport's role as an initiator or target and set the Establish Image Pair service parameter bit only if the Target Function bit is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Revise ndlp kref handling for dev_loss_tmo_callbk and lpfc_drop_nodeJustin Tee
The ndlp kref count implementation in lpfc_dev_loss_tmo_callbk() removes the initial node reference when a vport is unloading. When lpfc_cleanup() sends a DEVICE_RM event and is in NPR state, the driver calls lpfc_drop_node(). Subsequently, lpfc_drop_node() also removes an ndlp kref thinking it is the initial reference. This unintentionally introduces an extra kref decrement on the ndlp object. Fix by using the NLP_DROPPED node flag in lpfc_dev_loss_tmo_callbk() and lpfc_drop_node() to coordinate the removal of the initial node reference. In lpfc_dev_loss_tmo_callbk(), remove the SCSI transport reference provided the node is registered in the dev_loss context because the driver cannot call the SCSI transport in dev_loss context or afterwards. And, have lpfc_drop_node() not remove a reference if another thread is acting or has already acted on it. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Qualify ndlp discovery state when processing RSCNJustin Tee
Conditionalize when to put an ndlp into recovery mode when processing RSCNs. As long as an ndlp state is beyond a PLOGI issue and has been mapped to a transport layer before, the ndlp qualifies to be put into recovery mode. Otherwise, treat the ndlp rport normally through the discovery engine. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Remove extra ndlp kref decrement in FLOGI cmpl for loop topologyJustin Tee
In lpfc_cmpl_els_flogi(), the return out: label decrements the ndlp kref signaling that FLOGI processing on the ndlp is complete. In loop topology path, there is an unnecessary ndlp put because it also branches to the out: label. This also signals ndlp usage completion too soon. As such, remove the extra lpfc_nlp_put() when in loop topology. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Simplify fcp_abort transport callback log messageJustin Tee
The driver is reaching into a nvme_fc_cmd_iu ptr that belongs to the transport during an abort. This could cause an unintentional ptr dereference into memory that the driver does not own. Since the nvme_fc_cmd_iu ptr was for logging purposes only, simplify the log message such that the nvme_fc_cmd_iu reference is no longer needed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-23scsi: lpfc: Pull out fw diagnostic dump log message from driver's trace bufferJustin Tee
The firmware diagnostic dump log message does not need to be a part of the driver's log trace buffer because it is an expected user triggered event. Change LOG_TRACE_EVENT verbose flag to LOG_SLI. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230712180522.112722-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-11Merge branch '6.5/scsi-staging' into 6.5/scsi-fixesMartin K. Petersen
Pull in the currently staged SCSI fixes for 6.5. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-08Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull more SCSI updates from James Bottomley: "A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Remove unused function declaration scsi: target: docs: Remove tcm_mod_builder.py scsi: target: iblock: Quiet bool conversion warning with pr_preempt use scsi: dt-bindings: ufs: qcom: Fix ICE phandle scsi: core: Simplify scsi_cdl_check_cmd() scsi: isci: Fix comment typo scsi: smartpqi: Replace one-element arrays with flexible-array members scsi: target: tcmu: Replace strlcpy() with strscpy() scsi: ncr53c8xx: Replace strlcpy() with strscpy() scsi: lpfc: Fix lpfc_name struct packing
2023-07-05scsi: lpfc: Fix a possible data race in lpfc_unregister_fcf_rescan()Tuo Li
The variable phba->fcf.fcf_flag is often protected by the lock phba->hbalock() when is accessed. Here is an example in lpfc_unregister_fcf_rescan(): spin_lock_irq(&phba->hbalock); phba->fcf.fcf_flag |= FCF_INIT_DISC; spin_unlock_irq(&phba->hbalock); However, in the same function, phba->fcf.fcf_flag is assigned with 0 without holding the lock, and thus can cause a data race: phba->fcf.fcf_flag = 0; To fix this possible data race, a lock and unlock pair is added when accessing the variable phba->fcf.fcf_flag. Reported-by: BassCheck <bass@buaa.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Link: https://lore.kernel.org/r/20230630024748.1035993-1-islituo@gmail.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-30Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi, lpfc, qla2xxx). We have a couple of major core changes impacting other systems: - Command Duration Limits, which spills into block and ATA - block level Persistent Reservation Operations, which touches block, nvme, target and dm Both of these are added with merge commits containing a cover letter explaining what's going on" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits) scsi: core: Improve warning message in scsi_device_block() scsi: core: Replace scsi_target_block() with scsi_block_targets() scsi: core: Don't wait for quiesce in scsi_device_block() scsi: core: Don't wait for quiesce in scsi_stop_queue() scsi: core: Merge scsi_internal_device_block() and device_block() scsi: sg: Increase number of devices scsi: bsg: Increase number of devices scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue scsi: ufs: ufs-pci: Add support for Intel Arrow Lake scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT scsi: ufs: wb: Add explicit flush_threshold sysfs attribute scsi: ufs: ufs-qcom: Switch to the new ICE API scsi: ufs: dt-bindings: qcom: Add ICE phandle scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR scsi: ufs: core: Remove dedicated hwq for dev command scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes ...
2023-06-21scsi: lpfc: Fix lpfc_name struct packingArnd Bergmann
clang points out that the lpfc_name structure has an 8-byte alignment requirement on most architectures, but is embedded in a number of other structures that are forced to be only 1-byte aligned: drivers/scsi/lpfc/lpfc_hw.h:1516:30: error: field pe within 'struct lpfc_fdmi_reg_port_list' is less aligned than 'struct lpfc_fdmi_port_entry' and is usually due to 'struct lpfc_fdmi_reg_port_list' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] struct lpfc_fdmi_port_entry pe; drivers/scsi/lpfc/lpfc_hw.h:850:19: error: field portName within 'struct _ADISC' is less aligned than 'struct lpfc_name' and is usually due to 'struct _ADISC' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] drivers/scsi/lpfc/lpfc_hw.h:851:19: error: field nodeName within 'struct _ADISC' is less aligned than 'struct lpfc_name' and is usually due to 'struct _ADISC' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] drivers/scsi/lpfc/lpfc_hw.h:922:19: error: field portName within 'struct _RNID' is less aligned than 'struct lpfc_name' and is usually due to 'struct _RNID' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] drivers/scsi/lpfc/lpfc_hw.h:923:19: error: field nodeName within 'struct _RNID' is less aligned than 'struct lpfc_name' and is usually due to 'struct _RNID' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] From the git history, I can see that all the __packed annotations were done specifically to avoid introducing implicit padding around the lpfc_name instances, though this was probably the wrong approach. To improve this, only annotate the one uint64_t field inside of lpfc_name as packed, with an explicit 4-byte alignment, as is the default already on the 32-bit x86 ABI but not on most others. With this, the other __packed annotations can be removed again, as this avoids the incorrect padding. Two other structures change their layout as a result of this change: - struct _LOGO never gained a __packed annotation even though it has the same alignment problem as the others but is not used anywhere in the driver today. - struct serv_param similarly has this issue, and it is used, my guess is that this is only an internal structure rather than part of a binary interface, so the padding has no negative effect here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230616090705.2623408-1-arnd@kernel.org Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-14scsi: lpfc: Fix incorrect big endian type assignment in bsg loopback pathJustin Tee
The kernel test robot reported sparse warnings regarding incorrect type assignment for __be16 variables in bsg loopback path. Change the flagged lines to use the be16_to_cpu() and cpu_to_be16() macros appropriately. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230614175944.3577-1-justintee8345@gmail.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306110819.sDIKiGgg-lkp@intel.com/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-07scsi: lpfc: Avoid -Wstringop-overflow warningGustavo A. R. Silva
Prevent any potential integer wrapping issue, and avoid a -Wstringop-overflow warning by using the check_mul_overflow() helper. drivers/scsi/lpfc/lpfc.h: 837:#define LPFC_RAS_MIN_BUFF_POST_SIZE (256 * 1024) drivers/scsi/lpfc/lpfc_debugfs.c: 2266 size = LPFC_RAS_MIN_BUFF_POST_SIZE * phba->cfg_ras_fwlog_buffsize; this can wrap to negative if cfg_ras_fwlog_buffsize is large enough. And even when in practice this is not possible (due to phba->cfg_ras_fwlog_buffsize never being larger than 4[1]), the compiler is legitimately warning us about potentially buggy code. Fix the following warning seen under GCC-13: In function ‘lpfc_debugfs_ras_log_data’, inlined from ‘lpfc_debugfs_ras_log_open’ at drivers/scsi/lpfc/lpfc_debugfs.c:2271:15: drivers/scsi/lpfc/lpfc_debugfs.c:2210:25: warning: ‘memcpy’ specified bound between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=] 2210 | memcpy(buffer + copied, dmabuf->virt, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2211 | size - copied - 1); | ~~~~~~~~~~~~~~~~~~ Link: https://github.com/KSPP/linux/issues/305 Link: https://lore.kernel.org/linux-hardening/CABPRKS8zyzrbsWt4B5fp7kMowAZFiMLKg5kW26uELpg1cDKY3A@mail.gmail.com/ [1] Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/ZHkseX6TiFahvxJA@work Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-07scsi: lpfc: Use struct_size() helperJustin Tee
Prefer struct_size() over open-coded versions of idiom: sizeof(struct-with-flex-array) + sizeof(typeof-flex-array-elements) * count where count is the max number of items the flexible array is supposed to contain. Link: https://github.com/KSPP/linux/issues/160 Co-developed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230531223319.24328-1-justintee8345@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31scsi: lpfc: Fix incorrect big endian type assignments in FDMI and VMID pathsJustin Tee
The kernel test robot reported sparse warnings regarding the improper usage of beXX_to_cpu() macros. Change the flagged FDMI and VMID member variables to __beXX and redo the beXX_to_cpu() macros appropriately. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230530191405.21580-1-justintee8345@gmail.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202305261159.lTW5NYrv-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202305260751.NWFvhLY5-lkp@intel.com/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31Merge patch series "lpfc: Update lpfc to revision 14.2.0.13"Martin K. Petersen
Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.13 This patch set contains discovery bug fixes, firmware logging improvements, clean up of CQ handling, and statistics collection enhancements. The patches were cut against Martin's 6.5/scsi-queue tree. Link: https://lore.kernel.org/r/20230523183206.7728-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31scsi: lpfc: Copyright updates for 14.2.0.13 patchesJustin Tee
Update copyrights to 2023 for files modified in the 14.2.0.13 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-10-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31scsi: lpfc: Update lpfc version to 14.2.0.13Justin Tee
Update lpfc version to 14.2.0.13 Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31scsi: lpfc: Enhance congestion statistics collectionJustin Tee
Various improvements are made for collecting congestion statistics: - Pre-existing logic is replaced with use of an hrtimer for increased reporting accuracy. - Congestion timestamp information is reorganized into a single struct. - Common statistic collection logic is refactored into a helper routine. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-31scsi: lpfc: Clean up SLI-4 CQE status handlingJustin Tee
There is mishandling of SLI-4 CQE status values larger than what is allowed by the LPFC_IOCB_STATUS_MASK of 4 bits. The LPFC_IOCB_STATUS_MASK is a leftover SLI-3 construct and serves no purpose in SLI-4 path. Remove the LPFC_IOCB_STATUS_MASK and clean up general CQE status handling in SLI-4 completion paths. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230523183206.7728-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>