summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2024-07-04scsi: sd: Do not repeat the starting disk messageDamien Le Moal
The SCSI disk message "Starting disk" to signal resuming of a suspended disk is printed in both sd_resume() and sd_resume_common() which results in this message being printed twice when resuming from e.g. autosuspend: $ echo 5000 > /sys/block/sda/device/power/autosuspend_delay_ms $ echo auto > /sys/block/sda/device/power/control [ 4962.438293] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 4962.501121] sd 0:0:0:0: [sda] Stopping disk $ echo on > /sys/block/sda/device/power/control [ 4972.805851] sd 0:0:0:0: [sda] Starting disk [ 4980.558806] sd 0:0:0:0: [sda] Starting disk Fix this double print by removing the call to sd_printk() from sd_resume() and moving the call to sd_printk() in sd_resume_common() earlier in the function, before the check using sd_do_start_stop(). Doing so, the message is printed once regardless if sd_resume_common() actually executes sd_start_stop_device() (i.e. SCSI device case) or not (libsas and libata managed ATA devices case). Fixes: 0c76106cb975 ("scsi: sd: Fix TCG OPAL unlock on system resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240701215326.128067-1-dlemoal@kernel.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04scsi: pm8001: Update log level when reading config tableTerrence Adams
Reading the main config table occurs as a part of initialization in pm80xx_chip_init(). Because of this it makes more sense to have it be a part of the INIT logging. Signed-off-by: Terrence Adams <tadamsjr@google.com> Link: https://lore.kernel.org/r/20240627155924.2361370-3-tadamsjr@google.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04scsi: pm80xx: Set phy->enable_completion only when we wait for itIgor Pylypiv
pm8001_phy_control() populates the enable_completion pointer with a stack address, sends a PHY_LINK_RESET / PHY_HARD_RESET, waits 300 ms, and returns. The problem arises when a phy control response comes late. After 300 ms the pm8001_phy_control() function returns and the passed enable_completion stack address is no longer valid. Late phy control response invokes complete() on a dangling enable_completion pointer which leads to a kernel crash. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Terrence Adams <tadamsjr@google.com> Link: https://lore.kernel.org/r/20240627155924.2361370-2-tadamsjr@google.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04scsi: mpi3mr: Correct a test in mpi3mr_sas_port_add()Tomas Henzl
The test for a possible shift overflow is not correct. Fix it by replacing the '>' with a '>='. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20240627074827.13672-1-thenzl@redhat.com Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04ata,scsi: Remove wrapper ata_sas_port_alloc()Niklas Cassel
The ata_sas_port_alloc() wrapper mainly exists in order to export the internal libata function which it wraps. The secondary reason is that it initializes some ata_port struct members. However, ata_sas_port_alloc() is only used in a single location, sas_ata_init(), which already performs some ata_port struct member initialization, so it does not make sense to spread this initialization out over two separate locations. Thus, remove the wrapper and instead export the libata function directly, and move the libsas specific ata_port initialization to sas_ata_init(), which already does some ata_port initialization. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240703184418.723066-19-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-04ata,scsi: Remove wrappers ata_sas_tport_{add,delete}()Niklas Cassel
The ata_sas_tport_add() and ata_sas_tport_delete() wrappers only exist in order to export the internal libata functions which they wrap. Remove the wrappers and instead export the libata functions directly. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240703184418.723066-12-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-03block: split integrity support out of bio.hChristoph Hellwig
Split struct bio_integrity_payload and the related prototypes out of bio.h into a separate bio-integrity.h header so that it is only pulled in by the few places that need it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240702151047.1746127-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-03Merge tag 'v6.10-rc6' into for-6.11/block-postJens Axboe
Pull in v6.10-rc6 to resolve a conflict for the integrity cleanups. * tag 'v6.10-rc6': (778 commits) Linux 6.10-rc6 ata: ahci: Clean up sysfs file on error ata: libata-core: Fix double free on error ata,scsi: libata-core: Do not leak memory for ata_port struct members ata: libata-core: Fix null pointer dereference on error x86-32: fix cmpxchg8b_emu build error with clang x86: stop playing stack games in profile_pc() i2c: testunit: discard write requests while old command is running i2c: testunit: don't erase registers after STOP tty: mxser: Remove __counted_by from mxser_board.ports[] randomize_kstack: Remove non-functional per-arch entropy filtering string: kunit: add missing MODULE_DESCRIPTION() macros ata: libata-core: Add ATA_HORKAGE_NOLPM for all Crucial BX SSD1 models MAINTAINERS: Update IOMMU tree location tools/power turbostat: Add local build_bug.h header for snapshot target tools/power turbostat: Fix unc freq columns not showing with '-q' or '-l' tools/power turbostat: option '-n' is ambiguous drm/drm_file: Fix pid refcounting race kallsyms: rework symbol lookup return codes gpiolib: cdev: Ignore reconfiguration without direction ... Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-03parport: Remove parport_driver.devmodelDr. David Alan Gilbert
'devmodel' hasn't actually been used since: 'commit 3275158fa52a ("parport: remove use of devmodel")' and everyone now has it set to true and has been fixed up; remove the flag. (There are still comments all over about it) Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/r/20240502154823.67235-4-linux@treblig.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-03driver core: have match() callback in struct bus_type take a const *Greg Kroah-Hartman
In the match() callback, the struct device_driver * should not be changed, so change the function callback to be a const *. This is one step of many towards making the driver core safe to have struct device_driver in read-only memory. Because the match() callback is in all busses, all busses are modified to handle this properly. This does entail switching some container_of() calls to container_of_const() to properly handle the constant *. For some busses, like PCI and USB and HV, the const * is cast away in the match callback as those busses do want to modify those structures at this point in time (they have a local lock in the driver structure.) That will have to be changed in the future if they wish to have their struct device * in read-only-memory. Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Alex Elder <elder@kernel.org> Acked-by: Sumit Garg <sumit.garg@linaro.org> Link: https://lore.kernel.org/r/2024070136-wrongdoer-busily-01e8@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-01Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "A couple of error leg problems, one affecting scsi_debug and the other affecting pure SAS (i.e. not SATA) SCSI expanders" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed scsi: scsi_debug: Fix create target debugfs failure
2024-06-30Merge tag 'ata-6.10-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fixes from Niklas Cassel: - Add NOLPM quirk for for all Crucial BX SSD1 models. Considering that we now have had bug reports for 3 different BX SSD1 variants from Crucial with the same product name, make the quirk more inclusive, to catch more device models from the same generation. - Fix a trivial NULL pointer dereference in the error path for ata_host_release(). - Create a ata_port_free(), so that we don't miss freeing ata_port struct members when freeing a struct ata_port. - Fix a trivial double free in the error path for ata_host_alloc(). - Ensure that we remove the libata "remapped NVMe device count" sysfs entry on .probe() error. * tag 'ata-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: ahci: Clean up sysfs file on error ata: libata-core: Fix double free on error ata,scsi: libata-core: Do not leak memory for ata_port struct members ata: libata-core: Fix null pointer dereference on error ata: libata-core: Add ATA_HORKAGE_NOLPM for all Crucial BX SSD1 models
2024-06-30ata,scsi: libata-core: Do not leak memory for ata_port struct membersNiklas Cassel
libsas is currently not freeing all the struct ata_port struct members, e.g. ncq_sense_buf for a driver supporting Command Duration Limits (CDL). Add a function, ata_port_free(), that is used to free a ata_port, including its struct members. It makes sense to keep the code related to freeing a ata_port in its own function, which will also free all the struct members of struct ata_port. Fixes: 18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD") Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240629124210.181537-8-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-06-28kernel-wide: fix spelling mistakes like "assocative" -> "associative"Jesse Brandeburg
There were several instances of the string "assocat" in the kernel, which should have been spelled "associat", with the various endings of -ive, -ed, -ion, and sometimes beginnging with dis-. Add to the spelling dictionary the corrections so that future instances will be caught by checkpatch, and fix the instances found. Originally noticed by accident with a 'git grep socat'. Link: https://lkml.kernel.org/r/20240612001247.356867-1-jesse.brandeburg@intel.com Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-28mpt3sas_scsih: don't set QUEUE_FLAG_NOMERGESChristoph Hellwig
Setting QUEUE_FLAG_NOMERGES was added in commit d1b01d14b7ba ("scsi: mpt3sas: Set NVMe device queue depth as 128") without any explanation. Drivers should second guess the block layer merge decisions, so remove the flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240627124926.512662-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-28megaraid_sas: don't set QUEUE_FLAG_NOMERGESChristoph Hellwig
Setting QUEUE_FLAG_NOMERGES was added in commit 15dd03811d99dcf ("scsi: megaraid_sas: NVME Interface detection and prop settings") without any explanation. Drivers should second guess the block layer merge decisions, so remove the flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240627124926.512662-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-26Merge patch series "mpi3mr: Host diag buffer support"Martin K. Petersen
Ranjan Kumar <ranjan.kumar@broadcom.com> says: The controllers managed by mpi3mr driver requires system memory to save hardware and firmware diagnostic information, this patch set enhances the drivers to provide host memory to the controller for diagnostic information. This patch set also provides driver changes to push kernel messages into the diagnostic buffers reserved for the driver, so that the information will be available as part of debug data fetched from the controller. In addition, support for configuring automatic diagnostic information is added in the driver. Link: https://lore.kernel.org/r/20240626102646.14298-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26scsi: mpi3mr: Update driver version to 8.9.1.0.50Ranjan Kumar
Update driver version to 8.9.1.0.50 Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26scsi: mpi3mr: Add ioctl support for HDBRanjan Kumar
Add interface for applications to manage the host diagnostic buffers and update the automatic diag buffer capture triggers. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26scsi: mpi3mr: Trigger supportRanjan Kumar
Add functions to process automatic diag triggers. If a condition defined in the triggers is met, the driver will call appropriate controller functions to save the diagnostic information. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405151955.BiAWI1SY-lkp@intel.com/ Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffersRanjan Kumar
To be able to debug controller problems it is beneficial to allocate and configure system/host memory buffers which can be used to capture hardware and firmware diagnostic information. Add functions required to allocate and post firmware and hardware diagnostic buffers to the controller and to set up automatic diagnostic capture triggers. Captures will be triggered under the following circumstances: 1. Firmware is in FAULT state. 2. Admin commands time out. 3. Controller reset caused due to I/O timeout Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405151758.7xrJz6rp-lkp@intel.com/ Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26scsi: lpfc: Fix a possible null pointer dereferenceHuai-Yuan Liu
In function lpfc_xcvr_data_show, the memory allocation with kmalloc might fail, thereby making rdp_context a null pointer. In the following context and functions that use this pointer, there are dereferencing operations, leading to null pointer dereference. To fix this issue, a null pointer check should be added. If it is null, use scnprintf to notify the user and return len. Fixes: 479b0917e447 ("scsi: lpfc: Create a sysfs entry called lpfc_xcvr_data for transceiver info") Signed-off-by: Huai-Yuan Liu <qq810974084@gmail.com> Link: https://lore.kernel.org/r/20240621082545.449170-1-qq810974084@gmail.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26scsi: libsas: Fix exp-attached device scan after probe failure scanned in ↵Xingui Yang
again after probe failed The expander phy will be treated as broadcast flutter in the next revalidation after the exp-attached end device probe failed, as follows: [78779.654026] sas: broadcast received: 0 [78779.654037] sas: REVALIDATING DOMAIN on port 0, pid:10 [78779.654680] sas: ex 500e004aaaaaaa1f phy05 change count has changed [78779.662977] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE) [78779.662986] sas: ex 500e004aaaaaaa1f phy05 new device attached [78779.663079] sas: ex 500e004aaaaaaa1f phy05:U:8 attached: 500e004aaaaaaa05 (stp) [78779.693542] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] found [78779.701155] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0 [78779.707864] sas: Enter sas_scsi_recover_host busy: 0 failed: 0 ... [78835.161307] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 [78835.171344] sas: sas_probe_sata: for exp-attached device 500e004aaaaaaa05 returned -19 [78835.180879] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] is gone [78835.187487] sas: broadcast received: 0 [78835.187504] sas: REVALIDATING DOMAIN on port 0, pid:10 [78835.188263] sas: ex 500e004aaaaaaa1f phy05 change count has changed [78835.195870] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE) [78835.195875] sas: ex 500e004aaaaaaa1f rediscovering phy05 [78835.196022] sas: ex 500e004aaaaaaa1f phy05:U:A attached: 500e004aaaaaaa05 (stp) [78835.196026] sas: ex 500e004aaaaaaa1f phy05 broadcast flutter [78835.197615] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0 The cause of the problem is that the related ex_phy's attached_sas_addr was not cleared after the end device probe failed, so reset it. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Link: https://lore.kernel.org/r/20240619091742.25465-1-yangxingui@huawei.com Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26block: move dma_pad_mask into queue_limitsChristoph Hellwig
dma_pad_mask is a queue_limits by all ways of looking at it, so move it there and set it through the atomic queue limits APIs. Add a little helper that takes the alignment and pad into account to simplify the code that is touched a bit. Note that there never was any need for the > check in blk_queue_update_dma_pad, this probably was just copy and paste from dma_update_dma_alignment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240626142637.300624-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-25scsi: scsi_debug: Fix create target debugfs failureMing Lei
Target debugfs entry is removed via async_schedule() which isn't drained when adding same name target, so failure of "Directory 'target11:0:0' with parent 'scsi_debug' already present!" can be triggered easily. Fix it by switching to domain async schedule, and draining it before adding new target debugfs entry. Cc: Wenchao Hao <haowenchao2@huawei.com> Fixes: f084fe52c640 ("scsi: scsi_debug: Add debugfs interface to fail target reset") Signed-off-by: Ming Lei <ming.lei@redhat.com> Acked-by: Wenchao Hao <haowenchao22@gmail.com> Link: https://lore.kernel.org/r/20240619013803.3008857-1-ming.lei@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-24lpfc_nvmet: implement 'host_traddr'Hannes Reinecke
Implement the 'host_traddr' callback to display the host transport address for nvmet debugfs. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-06-21Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two fixes: one in the ufs driver fixing an obvious memory leak and the other (with a core flag based update) trying to prevent USB crashes by stopping the core from issuing a request for the I/O Hints mode page" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: usb: uas: Do not query the IO Advice Hints Grouping mode page for USB/UAS devices scsi: core: Introduce the BLIST_SKIP_IO_HINTS flag scsi: ufs: core: Free memory allocated for model before reinit
2024-06-20scsi: scsi_debug: Atomic write supportJohn Garry
Add initial support for atomic writes. As is standard method, feed device properties via modules param, those being: - atomic_max_size_blks - atomic_alignment_blks - atomic_granularity_blks - atomic_max_size_with_boundary_blks - atomic_max_boundary_blks These just match sbc4r22 section 6.6.4 - Block limits VPD page. We just support ATOMIC WRITE (16). The major change in the driver is how we lock the device for RW accesses. Currently the driver uses a per-device lock for accessing device metadata and "media" data (calls to do_device_access()) atomically for the duration of the whole read/write command. This should not suit verifying atomic writes. Reason being that currently all reads/writes are atomic, so using atomic writes does not prove anything. Change device access model to basis that regular writes only atomic on a per-sector basis, while reads and atomic writes are fully atomic. As mentioned, since accessing metadata and device media is atomic, continue to have regular writes involving metadata - like discard or PI - as atomic. We can improve this later. Currently we only support model where overlapping going reads or writes wait for current access to complete before commencing an atomic write. This is described in 4.29.3.2 section of the SBC. However, we simplify, things and wait for all accesses to complete (when issuing an atomic write). Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20240620125359.2684798-10-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-20scsi: sd: Atomic write supportJohn Garry
Support is divided into two main areas: - reading VPD pages and setting sdev request_queue limits - support WRITE ATOMIC (16) command and tracing The relevant block limits VPD page need to be read to allow the block layer request_queue atomic write limits to be set. These VPD page limits are described in sbc4r22 section 6.6.4 - Block limits VPD page. There are five limits of interest: - MAXIMUM ATOMIC TRANSFER LENGTH - ATOMIC ALIGNMENT - ATOMIC TRANSFER LENGTH GRANULARITY - MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY - MAXIMUM ATOMIC BOUNDARY SIZE MAXIMUM ATOMIC TRANSFER LENGTH is the maximum length for a WRITE ATOMIC (16) command. It will not be greater than the device MAXIMUM TRANSFER LENGTH. ATOMIC ALIGNMENT and ATOMIC TRANSFER LENGTH GRANULARITY are the minimum alignment and length values for an atomic write in terms of logical blocks. Unlike NVMe, SCSI does not specify an LBA space boundary, but does specify a per-IO boundary granularity. The maximum boundary size is specified in MAXIMUM ATOMIC BOUNDARY SIZE. When used, this boundary value is set in the WRITE ATOMIC (16) ATOMIC BOUNDARY field - layout for the WRITE_ATOMIC_16 command can be found in sbc4r22 section 5.48. This boundary value is the granularity size at which the device may atomically write the data. A value of zero in WRITE ATOMIC (16) ATOMIC BOUNDARY field means that all data must be atomically written together. MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY is the maximum atomic write length if a non-zero boundary value is set. For atomic write support, the WRITE ATOMIC (16) boundary is not of much interest, as the block layer expects each request submitted to be executed atomically. However, the SCSI spec does leave itself open to a quirky scenario where MAXIMUM ATOMIC TRANSFER LENGTH is zero, yet MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY and MAXIMUM ATOMIC BOUNDARY SIZE are both non-zero. This case will be supported. To set the block layer request_queue atomic write capabilities, sanitize the VPD page limits and set limits as follows: - atomic_write_unit_min is derived from granularity and alignment values. If no granularity value is not set, use physical block size - atomic_write_unit_max is derived from MAXIMUM ATOMIC TRANSFER LENGTH. In the scenario where MAXIMUM ATOMIC TRANSFER LENGTH is zero and boundary limits are non-zero, use MAXIMUM ATOMIC BOUNDARY SIZE for atomic_write_unit_max. New flag scsi_disk.use_atomic_write_boundary is set for this scenario. - atomic_write_boundary_bytes is set to zero always SCSI also supports a WRITE ATOMIC (32) command, which is for type 2 protection enabled. This is not going to be supported now, so check for T10_PI_TYPE2_PROTECTION when setting any request_queue limits. To handle an atomic write request, add support for WRITE ATOMIC (16) command in handler sd_setup_atomic_cmnd(). Flag use_atomic_write_boundary is checked here for encoding ATOMIC BOUNDARY field. Trace info is also added for WRITE_ATOMIC_16 command. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20240620125359.2684798-9-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19block: move the bounce flag into the features fieldChristoph Hellwig
Move the bounce flag into the features field to reclaim a little bit of space. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-27-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19block: move the zone_resetall flag to queue_limitsChristoph Hellwig
Move the zone_resetall flag into the queue_limits feature field so that it can be set atomically with the queue frozen. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-24-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19block: move the zoned flag into the features fieldChristoph Hellwig
Move the zoned flags into the features field to reclaim a little bit of space. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-23-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19block: move the stable_writes flag to queue_limitsChristoph Hellwig
Move the stable_writes flag into the queue_limits feature field so that it can be set atomically with the queue frozen. The flag is now inherited by blk_stack_limits, which greatly simplifies the code in dm, and fixed md which previously did not pass on the flag set on lower devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19block: move the add_random flag to queue_limitsChristoph Hellwig
Move the add_random flag into the queue_limits feature field so that it can be set atomically with the queue frozen. Note that this also removes code from dm to clear the flag based on the underlying devices, which can't be reached as dm devices will always start out without the flag set. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19block: move the nonrot flag to queue_limitsChristoph Hellwig
Move the nonrot flag into the queue_limits feature field so that it can be set atomically with the queue frozen. Use the chance to switch to defaulting to non-rotational and require the driver to opt into rotational, which matches the polarity of the sysfs interface. For the z2ram, ps3vram, 2x memstick, ubiblock and dcssblk the new rotational flag is not set as they clearly are not rotational despite this being a behavior change. There are some other drivers that unconditionally set the rotational flag to keep the existing behavior as they arguably can be used on rotational devices even if that is probably not their main use today (e.g. virtio_blk and drbd). The flag is automatically inherited in blk_stack_limits matching the existing behavior in dm and md. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19block: move cache control settings out of queue->flagsChristoph Hellwig
Move the cache control settings into the queue_limits so that the flags can be set atomically with the device queue frozen. Add new features and flags field for the driver set flags, and internal (usually sysfs-controlled) flags in the block layer. Note that we'll eventually remove enough field from queue_limits to bring it back to the previous size. The disable flag is inverted compared to the previous meaning, which means it now survives a rescan, similar to the max_sectors and max_discard_sectors user limits. The FLUSH and FUA flags are now inherited by blk_stack_limits, which simplified the code in dm a lot, but also causes a slight behavior change in that dm-switch and dm-unstripe now advertise a write cache despite setting num_flush_bios to 0. The I/O path will handle this gracefully, but as far as I can tell the lack of num_flush_bios and thus flush support is a pre-existing data integrity bug in those targets that really needs fixing, after which a non-zero num_flush_bios should be required in dm for targets that map to underlying devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19sd: move zone limits setup out of sd_read_block_characteristicsChristoph Hellwig
Move a bit of code that sets up the zone flag and the write granularity into sd_zbc_read_zones to be with the rest of the zoned limits. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19sd: remove sd_is_zonedChristoph Hellwig
Since commit 7437bb73f087 ("block: remove support for the host aware zone model"), only ZBC devices expose a zoned access model. sd_is_zoned is used to check for that and thus return false for host aware devices. Replace the helper with the simple open coded TYPE_ZBC check to fix this. Fixes: 7437bb73f087 ("block: remove support for the host aware zone model") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20240617060532.127975-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three obvious driver fixes and two core fixes. The two core fixes are to disable Command Duration Limits by default to fix an inconsistency in SATA and some USB devices. The other is to change the default read size for block zero to follow the device preference (some USB bridges preferring 16 byte commands don't have a translation for READ(10) and thus don't scan properly)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: mpi3mr: Fix ATA NCQ priority support scsi: ufs: core: Quiesce request queues before checking pending cmds scsi: core: Disable CDL by default scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory scsi: sd: Use READ(16) when reading block zero on large capacity disks
2024-06-14block: move integrity information into queue_limitsChristoph Hellwig
Move the integrity information into the queue limits so that it can be set atomically with other queue limits, and that the sysfs changes to the read_verify and write_generate flags are properly synchronized. This also allows to provide a more useful helper to stack the integrity fields, although it still is separate from the main stacking function as not all stackable devices want to inherit the integrity settings. Even with that it greatly simplifies the code in md and dm. Note that the integrity field is moved as-is into the queue limits. While there are good arguments for removing the separate blk_integrity structure, this would cause a lot of churn and might better be done at a later time if desired. However the integrity field in the queue_limits structure is now unconditional so that various ifdefs can be avoided or replaced with IS_ENABLED(). Given that tiny size of it that seems like a worthwhile trade off. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240613084839.1044015-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14block: remove the blk_integrity_profile structureChristoph Hellwig
Block layer integrity configuration is a bit complex right now, as it indirects through operation vectors for a simple two-dimensional configuration: a) the checksum type of none, ip checksum, crc, crc64 b) the presence or absence of a reference tag Remove the integrity profile, and instead add a separate csum_type flag which replaces the existing ip-checksum field and a new flag that indicates the presence of the reference tag. This removes up to two layers of indirect calls, remove the need to offload the no-op verification of non-PI metadata to a workqueue and generally simplifies the code. The downside is that block/t10-pi.c now has to be built into the kernel when CONFIG_BLK_DEV_INTEGRITY is supported. Given that both nvme and SCSI require t10-pi.ko, it is loaded for all usual configurations that enabled CONFIG_BLK_DEV_INTEGRITY already, though. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240613084839.1044015-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14block: add special APIs for run-time disabling of discard and friendsChristoph Hellwig
A few drivers optimistically try to support discard, write zeroes and secure erase and disable the features from the I/O completion handler if the hardware can't support them. This disable can't be done using the atomic queue limits API because the I/O completion handlers can't take sleeping locks or freeze the queue. Keep the existing clearing of the relevant field to zero, but replace the old blk_queue_max_* APIs with new disable APIs that force the value to 0. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sr: convert to the atomic queue limits APIChristoph Hellwig
Assign all queue limits through a local queue_limits variable and queue_limits_commit_update so that we can't race updating them from multiple places, and free the queue when updating them so that in-progress I/O submissions don't see half-updated limits. Also use the chance to clean up variable names to standard ones. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sd: convert to the atomic queue limits APIChristoph Hellwig
Assign all queue limits through a local queue_limits variable and queue_limits_commit_update so that we can't race updating them from multiple places, and freeze the queue when updating them so that in-progress I/O submissions don't see half-updated limits. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sd: cleanup zoned queue limits initializationChristoph Hellwig
Consolidate setting zone-related queue limits in sd_zbc_read_zones instead of splitting them between sd_zbc_revalidate_zones and sd_zbc_read_zones, and move the early_zone_information initialization in sd_zbc_read_zones above setting up the queue limits. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sd: factor out a sd_discard_mode helperChristoph Hellwig
Split the logic to pick the right discard mode into a little helper to prepare for further changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sd: simplify the disable case in sd_config_discardChristoph Hellwig
Fall through to the main call to blk_queue_max_discard_sectors given that max_blocks has been initialized to zero above instead of duplicating the call. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sd: add a sd_disable_write_same helperChristoph Hellwig
Add helper to disable WRITE SAME when it is not supported and use it instead of sd_config_write_same in the I/O completion handler. This avoids touching more fields than required in the I/O completion handler and prepares for converting sd to use the atomic queue limits API. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sd: add a sd_disable_discard helperChristoph Hellwig
Add helper to disable discard when it is not supported and use it instead of sd_config_discard in the I/O completion handler. This avoids touching more fields than required in the I/O completion handler and prepares for converting sd to use the atomic queue limits API. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14sd: simplify the ZBC case in provisioning_mode_storeChristoph Hellwig
Don't reset the discard settings to no-op over and over when a user writes to the provisioning attribute as that is already the default mode for ZBC devices. In hindsight we should have made writing to the attribute fail for ZBC devices, but the code has probably been around for far too long to change this now. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240531074837.1648501-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>