Age | Commit message (Collapse) | Author |
|
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-12-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
On bsg timeout, hardware_lock is used as part of search for the srb.
Instead, qpair lock should be used to iterate through different qpair.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-11-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
For fabric scan, current code uses switch scan opcode and flags as the
method to iterate through different commands to carry out the process.
This makes it hard to read. This patch convert those opcode and flags into
steps. In addition, this help reduce some duplicate code.
Consolidate routines that handle GPNFT & GNNFT.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-10-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Bios version was popluated for FDMI response. Systems with EFI would show
optrom version as 0. EFI version is populated here and BIOS version is
already displayed under FDMI_HBA_BOOT_BIOS_NAME.
Cc: stable@vger.kernel.org
Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
During vport delete, it is observed that during unload we hit a crash
because of stale entries in outstanding command array. For all these stale
I/O entries, eh_abort was issued and aborted (fast_fail_io = 2009h) but
I/Os could not complete while vport delete is in process of deleting.
BUG: kernel NULL pointer dereference, address: 000000000000001c
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
Workqueue: qla2xxx_wq qla_do_work [qla2xxx]
RIP: 0010:dma_direct_unmap_sg+0x51/0x1e0
RSP: 0018:ffffa1e1e150fc68 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000021 RCX: 0000000000000001
RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff8ce208a7a0d0
RBP: ffff8ce208a7a0d0 R08: 0000000000000000 R09: ffff8ce378aac9c8
R10: ffff8ce378aac8a0 R11: ffffa1e1e150f9d8 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8ce378aac9c8 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8d217f000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 0000002089acc000 CR4: 0000000000350ee0
Call Trace:
<TASK>
qla2xxx_qpair_sp_free_dma+0x417/0x4e0
? qla2xxx_qpair_sp_compl+0x10d/0x1a0
? qla2x00_status_entry+0x768/0x2830
? newidle_balance+0x2f0/0x430
? dequeue_entity+0x100/0x3c0
? qla24xx_process_response_queue+0x6a1/0x19e0
? __schedule+0x2d5/0x1140
? qla_do_work+0x47/0x60
? process_one_work+0x267/0x440
? process_one_work+0x440/0x440
? worker_thread+0x2d/0x3d0
? process_one_work+0x440/0x440
? kthread+0x156/0x180
? set_kthread_struct+0x50/0x50
? ret_from_fork+0x22/0x30
</TASK>
Send out async logout explicitly for all the ports during vport delete.
Cc: stable@vger.kernel.org
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
A crash was observed while performing NPIV and FW reset,
BUG: kernel NULL pointer dereference, address: 000000000000001c
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 1 PREEMPT_RT SMP NOPTI
RIP: 0010:dma_direct_unmap_sg+0x51/0x1e0
RSP: 0018:ffffc90026f47b88 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000021 RCX: 0000000000000002
RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff8881041130d0
RBP: ffff8881041130d0 R08: 0000000000000000 R09: 0000000000000034
R10: ffffc90026f47c48 R11: 0000000000000031 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8881565e4a20 R15: 0000000000000000
FS: 00007f4c69ed3d00(0000) GS:ffff889faac80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 0000000288a50002 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
? __die_body+0x1a/0x60
? page_fault_oops+0x16f/0x4a0
? do_user_addr_fault+0x174/0x7f0
? exc_page_fault+0x69/0x1a0
? asm_exc_page_fault+0x22/0x30
? dma_direct_unmap_sg+0x51/0x1e0
? preempt_count_sub+0x96/0xe0
qla2xxx_qpair_sp_free_dma+0x29f/0x3b0 [qla2xxx]
qla2xxx_qpair_sp_compl+0x60/0x80 [qla2xxx]
__qla2x00_abort_all_cmds+0xa2/0x450 [qla2xxx]
The command completion was done early while aborting the commands in driver
unload path but outside lock to avoid the WARN_ON condition of performing
dma_free_attr within the lock. However this caused race condition while
command completion via multiple paths causing system crash.
Hence complete the command early in unload path but within the lock to
avoid race condition.
Fixes: 0367076b0817 ("scsi: qla2xxx: Perform lockless command completion in abort path")
Cc: stable@vger.kernel.org
Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Link up failure is observed as a result of flash read failure. Current
code does not check flash read return code where it relies on FW checksum
to detect the problem.
Add check of flash read failure to detect the problem sooner.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/202406210815.rPDRDMBi-lkp@intel.com/
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Firmware only supports single DSDs in ELS Pass-through IOCB (0x53h), sg cnt
is decided by the SCSI ML. User is not aware of the cause of an acutal
error.
Return the appropriate return code that will be decoded by API and
application and proper error message will be displayed to user.
Fixes: 6e98016ca077 ("[SCSI] qla2xxx: Re-organized BSG interface specific code.")
Cc: stable@vger.kernel.org
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Init Control Block is dereferenced incorrectly. Correctly dereference ICB
Cc: stable@vger.kernel.org
Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The driver load failed with error message,
qla2xxx [0000:04:00.0]-ffff:0: register_localport failed: ret=ffffffef
and with a kernel crash,
BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
Workqueue: events_unbound qla_register_fcport_fn [qla2xxx]
RIP: 0010:nvme_fc_register_remoteport+0x16/0x430 [nvme_fc]
RSP: 0018:ffffaaa040eb3d98 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff9dfb46b78c00 RCX: 0000000000000000
RDX: ffff9dfb46b78da8 RSI: ffffaaa040eb3e08 RDI: 0000000000000000
RBP: ffff9dfb612a0a58 R08: ffffffffaf1d6270 R09: 3a34303a30303030
R10: 34303a303030305b R11: 2078787832616c71 R12: ffff9dfb46b78dd4
R13: ffff9dfb46b78c24 R14: ffff9dfb41525300 R15: ffff9dfb46b78da8
FS: 0000000000000000(0000) GS:ffff9dfc67c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000070 CR3: 000000018da10004 CR4: 00000000000206f0
Call Trace:
qla_nvme_register_remote+0xeb/0x1f0 [qla2xxx]
? qla2x00_dfs_create_rport+0x231/0x270 [qla2xxx]
qla2x00_update_fcport+0x2a1/0x3c0 [qla2xxx]
qla_register_fcport_fn+0x54/0xc0 [qla2xxx]
Exit the qla_nvme_register_remote() function when qla_nvme_register_hba()
fails and correctly validate nvme_local_port.
Cc: stable@vger.kernel.org
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The device does not come online when the target port is online. There were
multiple RSCNs indicating multiple devices were affected. Driver is in the
process of finishing a fabric scan. A new RSCN (device up) arrived at the
tail end of the last fabric scan. Driver mistakenly thinks the new RSCN is
being taken care of by the previous fabric scan, where this notification is
cleared and not acted on. The laser needs to be blinked again to get the
device to show up.
To prevent driver from accidentally clearing the RSCN notification, each
RSCN is given a generation value. A fabric scan will scan for that
generation(s). Any new RSCN arrive after the scan start will have a new
generation value. This will trigger another scan to get latest data. The
RSCN notification flag will be cleared when the scan is associate to that
generation.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202406210538.w875N70K-lkp@intel.com/
Fixes: bb2ca6b3f09a ("scsi: qla2xxx: Relogin during fabric disturbance")
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20240710171057.35066-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull in my fixes branch to resolve an mpi3mr merge conflict reported
by sfr.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"One core change that moves a disk start message to a location where it
will only be printed once instead of twice plus a couple of error
handling race fixes in the ufs driver"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Do not repeat the starting disk message
scsi: ufs: core: Fix ufshcd_abort_one racing issue
scsi: ufs: core: Fix ufshcd_clear_cmd racing issue
|
|
for-6.11/block
Pull NVMe updates from Keith:
"nvme updates for Linux 6.11
- Device initialization memory leak fixes (Keith)
- More constants defined (Weiwen)
- Target debugfs support (Hannes)
- PCIe subsystem reset enhancements (Keith)
- Queue-depth multipath policy (Redhat and PureStorage)
- Implement get_unique_id (Christoph)
- Authentication error fixes (Gaosheng)"
* tag 'nvme-6.11-2024-07-08' of git://git.infradead.org/nvme: (21 commits)
nvmet-auth: fix nvmet_auth hash error handling
nvme: implement ->get_unique_id
nvme-multipath: implement "queue-depth" iopolicy
nvme-multipath: prepare for "queue-depth" iopolicy
nvme-pci: do not directly handle subsys reset fallout
lpfc_nvmet: implement 'host_traddr'
nvme-fcloop: implement 'host_traddr'
nvmet-fc: implement host_traddr()
nvmet-rdma: implement host_traddr()
nvmet-tcp: implement host_traddr()
nvmet: add 'host_traddr' callback for debugfs
nvmet: add debugfs support
mailmap: add entry for Weiwen Hu
nvme: rename CDR/MORE/DNR to NVME_STATUS_*
nvme: fix status magic numbers
nvme: rename nvme_sc_to_pr_err to nvme_status_to_pr_err
nvme: split device add from initialization
nvme: fc: split controller bringup handling
nvme: rdma: split controller bringup handling
nvme: tcp: split controller bringup handling
...
|
|
Now that device mapper can handle resetting all zones of a mapped zoned
device using REQ_OP_ZONE_RESET_ALL, all zoned block device drivers
support this operation. With this, the request queue feature
BLK_FEAT_ZONE_RESETALL is not necessary and the emulation code in
blk-zone.c can be removed.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-5-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Sumit Saxena <sumit.saxena@broadcom.com> says:
This patch series contains the changes done in the driver to support
PCI error recovery. It is rework of older patch series from Ranjan
Kumar, see [1].
[1] https://lore.kernel.org/all/20231214205900.270488-1-ranjan.kumar@broadcom.com/
Link: https://lore.kernel.org/r/20240627101735.18286-1-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-4-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Prevent interaction with the hardware while the error recovery in progress.
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-3-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
PCI Error recovery support is required to recover the controller upon
detection of PCI errors. Add support for the PCI error recovery callback
handlers in mpi3mr driver.
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-2-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Justin Tee <justintee8345@gmail.com> says:
Update lpfc to revision 14.4.0.3
This patch set contains bug fixes related to discovery, submission of
mailbox commands, and proper endianness conversions.
The patches were cut against Martin's 6.11/scsi-queue tree.
Link: https://lore.kernel.org/r/20240628172011.25921-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update lpfc version to 14.4.0.3.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
On big endian architectures, it is possible to run into a memory out of
bounds pointer dereference when FCP targets are zoned.
In lpfc_prep_embed_io, the memcpy(ptr, fcp_cmnd, sgl->sge_len) is
referencing a little endian formatted sgl->sge_len value. So, the memcpy
can cause big endian systems to crash.
Redefine the *sgl ptr as a struct sli4_sge_le to make it clear that we are
referring to a little endian formatted data structure. And, update the
routine with proper le32_to_cpu macro usages.
Fixes: af20bb73ac25 ("scsi: lpfc: Add support for 32 byte CDBs")
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When setting trunk modes through sysfs, the SLI_CONFIG mailbox command's
command payload length is incorrectly hardcoded to 12 bytes. SLI_CONFIG's
payload length field should be specified large enough to encompass both the
submailbox command header and the submailbox request itself.
Thus, replace the hardcoded 12 bytes with a clearer calculation by way of
sizeof(struct lpfc_mbx_set_trunk_mode) - sizeof(struct lpfc_sli4_cfg_mhdr).
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The MBX_TIMEOUT return code is not handled in lpfc_get_sfp_info and the
routine unconditionally frees submitted mailbox commands regardless of
return status. The issue is that for MBX_TIMEOUT cases, when firmware
returns SFP information at a later time, that same mailbox memory region
references previously freed memory in its cmpl routine.
Fix by adding checks for the MBX_TIMEOUT return code. During mailbox
resource cleanup, check the mbox flag to make sure that the wait did not
timeout. If the MBOX_WAKE flag is not set, then do not free the resources
because it will be freed when firmware completes the mailbox at a later
time in its cmpl routine.
Also, increase the timeout from 30 to 60 seconds to accommodate boot
scripts requiring longer timeouts.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In rare cases when a fabric node is recovered after a link bounce and
before dev_loss_tmo callbk is reached, the driver may leave the fabric node
in an inconsistent state with the NLP_IN_DEV_LOSS flag perpetually set.
In lpfc_dev_loss_tmo_callbk, a check is added for a recovered fabric node.
If the node is recovered, then don't queue the lpfc_dev_loss_tmo_handler
work. In lpfc_dev_loss_tmo_handler, the path taken for the recovered fabric
nodes is updated to clear the NLP_IN_DEV_LOSS flag.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If previously in REG_LOGIN_ISSUE state, then remove the requirement that
PLOGI must have been received from the remote port before issuing a PRLI.
After GID_FT completes, it does not matter whether the driver itself sent a
PLOGI or received one. The fact that we're in REG_LOGIN_ISSUE state simply
means that the next state should be issuing the PRLI to continue discovery
of the remote port.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Certain vendor specific targets initially register with the fabric as an
initiator function first and then re-register as a target function
afterwards.
The timing of the target function re-registration can cause a race
condition such that the driver is stuck assuming the remote port as an
initiator function and never discovers the target's hosted LUNs.
Expand the nlp_state qualifier to also include NLP_STE_PRLI_ISSUE because
the state means that PRLI was issued but we have not quite reached
MAPPED_NODE state yet. If we received an RSCN in the PRLI_ISSUE state,
then we should restart discovery again by going into DEVICE_RECOVERY.
Fixes: dded1dc31aa4 ("scsi: lpfc: Modify when a node should be put in device recovery mode during RSCN")
Cc: <stable@vger.kernel.org> # v6.6+
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
During SLI port errata events, there should be no expectation that
submitted outstanding WQEs will return back CQEs. In these situations, the
driver should not rely on receiving CQEs from the SLI port to signal WQE
resource clean up.
Put an sli_flag LPFC_SLI_ACTIVE check in lpfc_els_flush_cmd() when walking
the txcmplq. The sli_flag check helps determine whether to issue an abort
or driver based cancel on outstanding WQEs. If !LPFC_SLI_ACTIVE, then
there's no point to issue anything to the SLI port. Instead, let the
driver based cancel logic clean up the submitted WQE resources.
Also, enhance some abort log messages that help with future debugging.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The SCSI disk message "Starting disk" to signal resuming of a suspended
disk is printed in both sd_resume() and sd_resume_common() which results
in this message being printed twice when resuming from e.g. autosuspend:
$ echo 5000 > /sys/block/sda/device/power/autosuspend_delay_ms
$ echo auto > /sys/block/sda/device/power/control
[ 4962.438293] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 4962.501121] sd 0:0:0:0: [sda] Stopping disk
$ echo on > /sys/block/sda/device/power/control
[ 4972.805851] sd 0:0:0:0: [sda] Starting disk
[ 4980.558806] sd 0:0:0:0: [sda] Starting disk
Fix this double print by removing the call to sd_printk() from sd_resume()
and moving the call to sd_printk() in sd_resume_common() earlier in the
function, before the check using sd_do_start_stop(). Doing so, the message
is printed once regardless if sd_resume_common() actually executes
sd_start_stop_device() (i.e. SCSI device case) or not (libsas and libata
managed ATA devices case).
Fixes: 0c76106cb975 ("scsi: sd: Fix TCG OPAL unlock on system resume")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20240701215326.128067-1-dlemoal@kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Reading the main config table occurs as a part of initialization in
pm80xx_chip_init(). Because of this it makes more sense to have it be a
part of the INIT logging.
Signed-off-by: Terrence Adams <tadamsjr@google.com>
Link: https://lore.kernel.org/r/20240627155924.2361370-3-tadamsjr@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
pm8001_phy_control() populates the enable_completion pointer with a stack
address, sends a PHY_LINK_RESET / PHY_HARD_RESET, waits 300 ms, and
returns. The problem arises when a phy control response comes late. After
300 ms the pm8001_phy_control() function returns and the passed
enable_completion stack address is no longer valid. Late phy control
response invokes complete() on a dangling enable_completion pointer which
leads to a kernel crash.
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Terrence Adams <tadamsjr@google.com>
Link: https://lore.kernel.org/r/20240627155924.2361370-2-tadamsjr@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The test for a possible shift overflow is not correct. Fix it by replacing
the '>' with a '>='.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20240627074827.13672-1-thenzl@redhat.com
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The ata_sas_port_alloc() wrapper mainly exists in order to export the
internal libata function which it wraps. The secondary reason is that
it initializes some ata_port struct members.
However, ata_sas_port_alloc() is only used in a single location,
sas_ata_init(), which already performs some ata_port struct member
initialization, so it does not make sense to spread this initialization
out over two separate locations.
Thus, remove the wrapper and instead export the libata function directly,
and move the libsas specific ata_port initialization to sas_ata_init(),
which already does some ata_port initialization.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240703184418.723066-19-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
The ata_sas_tport_add() and ata_sas_tport_delete() wrappers only exist in
order to export the internal libata functions which they wrap.
Remove the wrappers and instead export the libata functions directly.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240703184418.723066-12-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Split struct bio_integrity_payload and the related prototypes out of
bio.h into a separate bio-integrity.h header so that it is only pulled
in by the few places that need it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240702151047.1746127-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull in v6.10-rc6 to resolve a conflict for the integrity cleanups.
* tag 'v6.10-rc6': (778 commits)
Linux 6.10-rc6
ata: ahci: Clean up sysfs file on error
ata: libata-core: Fix double free on error
ata,scsi: libata-core: Do not leak memory for ata_port struct members
ata: libata-core: Fix null pointer dereference on error
x86-32: fix cmpxchg8b_emu build error with clang
x86: stop playing stack games in profile_pc()
i2c: testunit: discard write requests while old command is running
i2c: testunit: don't erase registers after STOP
tty: mxser: Remove __counted_by from mxser_board.ports[]
randomize_kstack: Remove non-functional per-arch entropy filtering
string: kunit: add missing MODULE_DESCRIPTION() macros
ata: libata-core: Add ATA_HORKAGE_NOLPM for all Crucial BX SSD1 models
MAINTAINERS: Update IOMMU tree location
tools/power turbostat: Add local build_bug.h header for snapshot target
tools/power turbostat: Fix unc freq columns not showing with '-q' or '-l'
tools/power turbostat: option '-n' is ambiguous
drm/drm_file: Fix pid refcounting race
kallsyms: rework symbol lookup return codes
gpiolib: cdev: Ignore reconfiguration without direction
...
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
'devmodel' hasn't actually been used since:
'commit 3275158fa52a ("parport: remove use of devmodel")'
and everyone now has it set to true and has been fixed up; remove
the flag.
(There are still comments all over about it)
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://lore.kernel.org/r/20240502154823.67235-4-linux@treblig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In the match() callback, the struct device_driver * should not be
changed, so change the function callback to be a const *. This is one
step of many towards making the driver core safe to have struct
device_driver in read-only memory.
Because the match() callback is in all busses, all busses are modified
to handle this properly. This does entail switching some container_of()
calls to container_of_const() to properly handle the constant *.
For some busses, like PCI and USB and HV, the const * is cast away in
the match callback as those busses do want to modify those structures at
this point in time (they have a local lock in the driver structure.)
That will have to be changed in the future if they wish to have their
struct device * in read-only-memory.
Cc: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: Alex Elder <elder@kernel.org>
Acked-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/2024070136-wrongdoer-busily-01e8@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"A couple of error leg problems, one affecting scsi_debug and the other
affecting pure SAS (i.e. not SATA) SCSI expanders"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed
scsi: scsi_debug: Fix create target debugfs failure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Niklas Cassel:
- Add NOLPM quirk for for all Crucial BX SSD1 models.
Considering that we now have had bug reports for 3 different BX SSD1
variants from Crucial with the same product name, make the quirk more
inclusive, to catch more device models from the same generation.
- Fix a trivial NULL pointer dereference in the error path for
ata_host_release().
- Create a ata_port_free(), so that we don't miss freeing ata_port
struct members when freeing a struct ata_port.
- Fix a trivial double free in the error path for ata_host_alloc().
- Ensure that we remove the libata "remapped NVMe device count" sysfs
entry on .probe() error.
* tag 'ata-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: ahci: Clean up sysfs file on error
ata: libata-core: Fix double free on error
ata,scsi: libata-core: Do not leak memory for ata_port struct members
ata: libata-core: Fix null pointer dereference on error
ata: libata-core: Add ATA_HORKAGE_NOLPM for all Crucial BX SSD1 models
|
|
libsas is currently not freeing all the struct ata_port struct members,
e.g. ncq_sense_buf for a driver supporting Command Duration Limits (CDL).
Add a function, ata_port_free(), that is used to free a ata_port,
including its struct members. It makes sense to keep the code related to
freeing a ata_port in its own function, which will also free all the
struct members of struct ata_port.
Fixes: 18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240629124210.181537-8-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
There were several instances of the string "assocat" in the kernel, which
should have been spelled "associat", with the various endings of -ive,
-ed, -ion, and sometimes beginnging with dis-.
Add to the spelling dictionary the corrections so that future instances
will be caught by checkpatch, and fix the instances found.
Originally noticed by accident with a 'git grep socat'.
Link: https://lkml.kernel.org/r/20240612001247.356867-1-jesse.brandeburg@intel.com
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Setting QUEUE_FLAG_NOMERGES was added in commit d1b01d14b7ba ("scsi:
mpt3sas: Set NVMe device queue depth as 128") without any explanation.
Drivers should second guess the block layer merge decisions, so remove
the flag.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240627124926.512662-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Setting QUEUE_FLAG_NOMERGES was added in commit 15dd03811d99dcf
("scsi: megaraid_sas: NVME Interface detection and prop settings")
without any explanation. Drivers should second guess the block
layer merge decisions, so remove the flag.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240627124926.512662-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Ranjan Kumar <ranjan.kumar@broadcom.com> says:
The controllers managed by mpi3mr driver requires system memory to
save hardware and firmware diagnostic information, this patch set
enhances the drivers to provide host memory to the controller for
diagnostic information. This patch set also provides driver changes
to push kernel messages into the diagnostic buffers reserved for the
driver, so that the information will be available as part of debug
data fetched from the controller. In addition, support for
configuring automatic diagnostic information is added in the driver.
Link: https://lore.kernel.org/r/20240626102646.14298-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update driver version to 8.9.1.0.50
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add interface for applications to manage the host diagnostic buffers and
update the automatic diag buffer capture triggers.
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add functions to process automatic diag triggers. If a condition defined in
the triggers is met, the driver will call appropriate controller functions
to save the diagnostic information.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405151955.BiAWI1SY-lkp@intel.com/
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
To be able to debug controller problems it is beneficial to allocate and
configure system/host memory buffers which can be used to capture hardware
and firmware diagnostic information.
Add functions required to allocate and post firmware and hardware
diagnostic buffers to the controller and to set up automatic diagnostic
capture triggers.
Captures will be triggered under the following circumstances:
1. Firmware is in FAULT state.
2. Admin commands time out.
3. Controller reset caused due to I/O timeout
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405151758.7xrJz6rp-lkp@intel.com/
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In function lpfc_xcvr_data_show, the memory allocation with kmalloc might
fail, thereby making rdp_context a null pointer. In the following context
and functions that use this pointer, there are dereferencing operations,
leading to null pointer dereference.
To fix this issue, a null pointer check should be added. If it is null,
use scnprintf to notify the user and return len.
Fixes: 479b0917e447 ("scsi: lpfc: Create a sysfs entry called lpfc_xcvr_data for transceiver info")
Signed-off-by: Huai-Yuan Liu <qq810974084@gmail.com>
Link: https://lore.kernel.org/r/20240621082545.449170-1-qq810974084@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|