summaryrefslogtreecommitdiff
path: root/drivers/cxl
AgeCommit message (Collapse)Author
2022-07-21cxl/port: Record parent dport when adding portsDan Williams
At the time that cxl_port instances are being created, cache the dport from the parent port that points to this new child port. This will be useful for region provisioning when walking the tree to calculate decoder targets, and saves rewalking the dport list after the fact to build this information. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-1-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/port: Record dport in endpoint referencesDan Williams
Recall that the primary role of the cxl_mem driver is to probe if the given endpoint is connected to a CXL port topology. In that process it walks its device ancestry to its PCI root port. If that root port is also a CXL root port then the probe process adds cxl_port object instances at switch in the path between to the root and the endpoint. As those cxl_port instances are added, or if a previous enumeration attempt already created the port, a 'struct cxl_ep' instance is registered with that port to track the endpoints interested in that port. At the time the cxl_ep is registered the downstream egress path from the port to the endpoint is known. Take the opportunity to record that information as it will be needed for dynamic programming of decoder targets during region provisioning. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784329944.1758207.15203961796832072116.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/hdm: Add support for allocating DPA to an endpoint decoderDan Williams
The region provisioning flow will roughly follow a sequence of: 1/ Allocate DPA to a set of decoders 2/ Allocate HPA to a region 3/ Associate decoders with a region and validate that the DPA allocations and topologies match the parameters of the region. For now, this change (step 1) arranges for DPA capacity to be allocated and deleted from non-committed decoders based on the decoder's mode / partition selection. Capacity is allocated from the lowest DPA in the partition and any 'pmem' allocation blocks out all remaining ram capacity in its 'skip' setting. DPA allocations are enforced in decoder instance order. I.e. decoder N + 1 always starts at a higher DPA than instance N, and deleting allocations must proceed from the highest-instance allocated decoder to the lowest. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784329399.1758207.16732038126938632700.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/hdm: Track next decoder to allocateDan Williams
The CXL specification enforces that endpoint decoders are committed in hw instance id order. In preparation for adding dynamic DPA allocation, record the hw instance id in endpoint decoders, and enforce allocations to occur in hw instance id order. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784328827.1758207.9627538529944559954.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/hdm: Add 'mode' attribute to decoder objectsDan Williams
Recall that the Device Physical Address (DPA) space of a CXL Memory Expander is potentially partitioned into a volatile and persistent portion. A decoder maps a Host Physical Address (HPA) range to a DPA range and that translation depends on the value of all previous (lower instance number) decoders before the current one. In preparation for allowing dynamic provisioning of regions, decoders need an ABI to indicate which DPA partition a decoder targets. This ABI needs to be prepared for the possibility that some other agent committed and locked a decoder that spans the partition boundary. Add 'decoderX.Y/mode' to endpoint decoders that indicates which partition 'ram' / 'pmem' the decoder targets, or 'mixed' if the decoder currently spans the partition boundary. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603881967.551046.6007594190951596439.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/hdm: Enumerate allocated DPADan Williams
In preparation for provisioning CXL regions, add accounting for the DPA space consumed by existing regions / decoders. Recall, a CXL region is a memory range comprised from one or more endpoint devices contributing a mapping of their DPA into HPA space through a decoder. Record the DPA ranges covered by committed decoders at initial probe of endpoint ports relative to a per-device resource tree of the DPA type (pmem or volatile-ram). The cxl_dpa_rwsem semaphore is introduced to globally synchronize DPA state across all endpoints and their decoders at once. The vast majority of DPA operations are reads as region creation is expected to be as rare as disk partitioning and volume creation. The device_lock() for this synchronization is specifically avoided for concern of entangling with sysfs attribute removal. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784327682.1758207.7914919426043855876.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/core: Define a 'struct cxl_endpoint_decoder'Dan Williams
Previously the target routing specifics of switch decoders and platform CXL window resource tracking of root decoders were factored out of 'struct cxl_decoder'. While switch decoders translate from SPA to downstream ports, endpoint decoders translate from SPA to DPA. This patch, 3 of 3, adds a 'struct cxl_endpoint_decoder' that tracks an endpoint-specific Device Physical Address (DPA) resource. For now this just defines ->dpa_res, a follow-on patch will handle requesting DPA resource ranges from a device-DPA resource tree. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784327088.1758207.15502834501671201192.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/core: Define a 'struct cxl_root_decoder'Dan Williams
Previously the target routing specifics of switch decoders were factored out of 'struct cxl_decoder' into 'struct cxl_switch_decoder'. This patch, 2 of 3, adds a 'struct cxl_root_decoder' as a superset of a switch decoder that also track the associated CXL window platform resource. Note that the reason the resource for a given root decoder needs to be looked up after the fact (i.e. after cxl_parse_cfmws() and add_cxl_resource()) is because add_cxl_resource() may have merged CXL windows in order to keep them at the top of the resource tree / decode hierarchy. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784326541.1758207.9915663937394448341.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/acpi: Track CXL resources in iomem_resourceDan Williams
Recall that CXL capable address ranges, on ACPI platforms, are published in the CEDT.CFMWS (CXL Early Discovery Table: CXL Fixed Memory Window Structures). These windows represent both the actively mapped capacity and the potential address space that can be dynamically assigned to a new CXL decode configuration (region / interleave-set). CXL endpoints like DDR DIMMs can be mapped at any physical address including 0 and legacy ranges. There is an expectation and requirement that the /proc/iomem interface and the iomem_resource tree in the kernel reflect the full set of platform address ranges. I.e. that every address range that platform firmware and bus drivers enumerate be reflected as an iomem_resource entry. The hard requirement to do this for CXL arises from the fact that facilities like CONFIG_DEVICE_PRIVATE expect to be able to treat empty iomem_resource ranges as free for software to use as proxy address space. Without CXL publishing its potential address ranges in iomem_resource, the CONFIG_DEVICE_PRIVATE mechanism may inadvertently steal capacity reserved for runtime provisioning of new CXL regions. So, iomem_resource needs to know about both active and potential CXL resource ranges. The active CXL resources might already be reflected in iomem_resource as "System RAM". insert_resource_expand_to_fit() handles re-parenting "System RAM" underneath a CXL window. The "_expand_to_fit()" behavior handles cases where a CXL window is not a strict superset of an existing entry in the iomem_resource tree. The "_expand_to_fit()" behavior is acceptable from the perspective of resource allocation. The expansion happens because a conflicting resource range is already populated, which means the resource boundary expansion does not result in any additional free CXL address space being made available. CXL address space allocation is always bounded by the orginal unexpanded address range. However, the potential for expansion does mean that something like walk_iomem_res_desc(IORES_DESC_CXL...) can only return fuzzy answers on corner case platforms that cause the resource tree to expand a CXL window resource over a range that is not decoded by CXL. This would be an odd platform configuration, but if it becomes a problem in practice the CXL subsytem could just publish an API that returns definitive answers. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/165784325943.1758207.5310344844375305118.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-21cxl/core: Define a 'struct cxl_switch_decoder'Dan Williams
Currently 'struct cxl_decoder' contains the superset of attributes needed for all decoder types. Before more type-specific attributes are added to the common definition, reorganize 'struct cxl_decoder' into type specific objects. This patch, the first of three, factors out a cxl_switch_decoder type. See the new kdoc for what a 'struct cxl_switch_decoder' represents in a CXL topology. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/165784325340.1758207.5064717153608954960.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19cxl/port: Read CDAT tableIra Weiny
The per-device CDAT data provides performance data that is relevant for mapping which CXL devices can participate in which CXL ranges by QTG (QoS Throttling Group) (per ECN: CXL 2.0 CEDT CFMWS & QTG_DSM) [1]. The QTG association specified in the ECN is advisory. Until the cxl_acpi driver grows support for invoking the QTG _DSM method the CDAT data is only of interest to userspace that may need it for debug purposes. Search the DOE mailboxes available, query CDAT data, cache the data and make it available via a sysfs binary attribute per endpoint at: /sys/bus/cxl/devices/endpointX/CDAT ...similar to other ACPI-structured table data in /sys/firmware/ACPI/tables. The CDAT is relative to 'struct cxl_port' objects since switches in addition to endpoints can host a CDAT instance. Switch CDAT support is not implemented. This does not support table updates at runtime. It will always provide whatever was there when first cached. It is also the case that table updates are not expected outside of explicit DPA address map affecting commands like Set Partition with the immediate flag set. Given that the driver does not support Set Partition with the immediate flag set there is no current need for update support. Link: https://www.computeexpresslink.org/spec-landing [1] Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> [djbw: drop in-kernel parsing infra for now, and other minor fixups] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220719205249.566684-7-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19cxl/pci: Create PCI DOE mailbox's for memory devicesIra Weiny
DOE mailbox objects will be needed for various mailbox communications with each memory device. Iterate each DOE mailbox capability and create PCI DOE mailbox objects as found. It is not anticipated that this is the final resting place for the iteration of the DOE devices. The support of switch ports will drive this code into the PCIe side. In this imagined architecture the CXL port driver would then query into the PCI device for the DOE mailbox array. For now creating the mailboxes in the CXL port is good enough for the endpoints. Later PCIe ports will need to support this to support switch ports more generically. Cc: Dan Williams <dan.j.williams@intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220719205249.566684-5-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-11cxl/pmem: Delete unused nvdimm attributeDan Williams
While there is a need to go from a LIBNVDIMM 'struct nvdimm' to a CXL 'struct cxl_nvdimm', there is no use case to go the other direction. Likely this is a leftover from an early version of the referenced commit before it implemented devm for releasing the created nvdimm. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-19-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10cxl/hdm: Initialize decoder type for memory expander devicesDan Williams
Unless and until accelerator (type-2) drivers start registering for CXL.mem mapping services from the CXL subsystem core, initialize idle HDM decoders to the "expander" type. I.e. the only CXL devices using the CXL core presently are those implementing the CXL 2.0 Type-3 memory expander device class code that the cxl_pci driver claims. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-6-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10cxl/port: Cache CXL host bridge dataDan Williams
Region creation has need for checking host-bridge connectivity when adding endpoints to regions. Record, at port creation time, the host-bridge to provide a useful shortcut from any location in the topology to the most-significant ancestor. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-4-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10tools/testing/cxl: Add partition supportDan Williams
In support of testing DPA allocation mechanisms in the CXL core, the cxl_test environment needs to support establishing and retrieving the 'pmem partition boundary. Replace the platform_device_add_resources() method for delineating DPA within an endpoint with an emulated DEV_SIZE amount of partitionable capacity. Set DEV_SIZE such that an endpoint has enough capacity to simultaneously participate in 8 distinct regions. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603887411.551046.13234212587991192347.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10cxl/mem: Add a debugfs version of 'iomem' for DPA, 'dpamem'Dan Williams
Dump the device-physical-address map for a CXL expander in /proc/iomem style format. E.g.: cat /sys/kernel/debug/cxl/mem1/dpamem 00000000-0fffffff : ram 10000000-1fffffff : pmem Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603885318.551046.8308248564880066726.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-10cxl/debug: Move debugfs init to cxl_core_init()Dan Williams
In preparation for a new cxl debugfs file, move 'cxl' directory establishment and teardown to the core and let subsequent init routines reference that setup. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603884654.551046.4962104601691723080.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl/hdm: Require all decoders to be enumeratedBen Widawsky
In preparation for region provisioning all device decoders need to be enumerated since DPA allocations are calculated by summing the capacities of all decoders in a set. I.e. the programming for decoder[N] depends on the state of decoder[N-1], so skipping over decoders that fail to initialize prevents accurate DPA accounting. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> [djbw: reword changelog] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603879664.551046.6863805202478861026.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl/mem: Convert partition-info to resourcesDan Williams
To date the per-device-partition DPA range information has only been used for enumeration purposes. In preparation for allocating regions from available DPA capacity, convert those ranges into DPA-type resource trees. With resources and the new add_dpa_res() helper some open coded end address calculations and debug prints can be cleaned. The 'cxlds->pmem_res' and 'cxlds->ram_res' resources are child resources of the total-device DPA space and they in turn will host DPA allocations from cxl_endpoint_decoder instances (tracked by cxled->dpa_res). Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603878921.551046.8127845916514734142.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl: Introduce cxl_to_{ways,granularity}Dan Williams
Interleave granularity and ways have CXL specification defined encodings. Promote the conversion helpers to a common header, and use them to replace other open-coded instances. Force caller to consider the error case of the conversion similarly to other conversion helpers like kstrto*(). Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603875016.551046.17236943065932132355.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl/core: Drop is_cxl_decoder()Dan Williams
This helper was only used to identify the object type for lockdep purposes. Now that lockdep support is done with explicit lock classes, this helper can be dropped. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603874340.551046.15491766127759244728.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl/core: Drop ->platform_res attribute for root decodersDan Williams
Root decoders are responsible for hosting the available host address space for endpoints and regions to claim. The tracking of that available capacity can be done in iomem_resource directly. As a result, root decoders no longer need to host their own resource tree. The current ->platform_res attribute was added prematurely. Otherwise, ->hpa_range fills the role of conveying the current decode range of the decoder. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603873619.551046.791596854070136223.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl/core: Rename ->decoder_range ->hpa_rangeDan Williams
In preparation for growing a ->dpa_range attribute for endpoint decoders, rename the current ->decoder_range to the more descriptive ->hpa_range. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603872867.551046.2170426227407458814.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl/hdm: Use local hdm variableBen Widawsky
Save a few characters and use the already initialized local variable. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603872171.551046.913207574344536475.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-09cxl/port: Keep port->uport valid for the entire life of a portDan Williams
The upcoming region provisioning implementation has a need to dereference port->uport during the port unregister flow. Specifically, endpoint decoders need to be able to lookup their corresponding memdev via port->uport. The existing ->dead flag was added for cases where the core was committed to tearing down the port, but needed to drop locks before calling device_unregister(). Reuse that flag to indicate to delete_endpoint() that it has no "release action" work to do as unregister_port() will handle it. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603871491.551046.6682199179541194356.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-28cxl/mbox: Fix missing variable payload checks in cmd size validationVishal Verma
The conversion of command sizes to unsigned missed a couple of checks against variable size payloads during command validation, which made all variable payload commands unconditionally fail. Add the checks back using the new CXL_VARIABLE_PAYLOAD scheme. Fixes: 26f89535a5bb ("cxl/mbox: Use type __u32 for mailbox payload sizes") Cc: <stable@vger.kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Reported-by: Abhi Cs <abhi.cs@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20220628220109.633564-1-vishal.l.verma@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-21cxl/mbox: Use __le32 in get,set_lsa mailbox structuresAlison Schofield
CXL specification defines these as little endian. Fixes: 60b8f17215de ("cxl/pmem: Translate NVDIMM label commands to CXL label commands") Reported-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20220225221456.1025635-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-21cxl/core: Use is_endpoint_decoderBen Widawsky
Save some characters and directly check decoder type rather than port type. There's no need to check if the port is an endpoint port since, by this point, cxl_endpoint_decoder_alloc() has a specified type. Reviewed by: Adam Manzanares <a.manzanares@samsung.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-21cxl: Fix cleanup of port devices on failure to probe driver.Jonathan Cameron
The device is created, and then there is a check if a driver succesfully bound to it. In event of failing the bind (e.g. failure in cxl_port_probe()) the device is left registered. When a bus rescan later occurs, fresh devices are created leading to a multiple device representing the same underlying hardware. Bad things may follow and at very least we have far too many devices. Fix by ensuring autoremove is registered if the device create succeeds, but doesn't depend on sucessful binding to a driver. Bug was observed as side effect of incorrect ownership in [PATCH v9 6/9] cxl/port: Read CDAT table but will result from any failure to in cxl_port_probe(). Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220609134519.11668-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-20cxl/port: Enable HDM Capability after validating DVSEC RangesDan Williams
CXL memory expanders that support the CXL 2.0 memory device class code include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC Range" mechanism originally defined in CXL 1.1. Both mechanisms depend on a "mem_enable" bit being set in configuration space before either mechanism activates. When the HDM Decoder Capability is enabled the CXL DVSEC Range settings are ignored. Previously, the cxl_mem driver was relying on platform-firmware to set "mem_enable". That is an invalid assumption as there is no requirement that platform-firmware sets the bit before the driver sees a device, especially in hot-plug scenarios. Additionally, ACPI-platforms that support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery Table). That table outlines the platform permissible address ranges for CXL operation. So, there is a need for the driver to set "mem_enable", and there is information available to determine the validity of the CXL DVSEC Ranges. Arrange for the driver to optionally enable the HDM Decoder Capability if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range configuration was invalid. Be careful to only disable memory decode if the kernel was the one to enable it. In other words, if CXL is backing all of kernel memory at boot the device needs to maintain "mem_enable" and "HDM Decoder enable" all the way up to handoff back to platform firmware (e.g. ACPI S5 state entry may require CXL memory to stay active). Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info") Cc: Dan Carpenter <dan.carpenter@oracle.com> [dan: fix early terminiation of range-allowed loop] Cc: Ariel Sibley <ariel.sibley@microchip.com> [ariel: Memory_size must be non-zero] Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165307136375.2499769.861793697156744166.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/port: Reuse 'struct cxl_hdm' context for hdm initDan Williams
The port driver maps component registers for port operations. Reuse that mapping for HDM Decoder Capability setup / enable. Move devm_cxl_setup_hdm() before cxl_hdm_decode_init() and plumb @cxlhdm through the hdm init helpers. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291691712.1426646.14336397551571515480.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/port: Move endpoint HDM Decoder Capability init to port driverDan Williams
The responsibility for establishing HDM Decoder Capability based operation is more closely tied to port enabling than memdev enabling which is concerned with port enumeration. This later enables reusing @cxlhdm for probing / controlling "global enable" for the HDM Decoder Capability. For now, just do the nominal move. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291691167.1426646.7936109077255288258.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/pci: Drop @info argument to cxl_hdm_decode_init()Dan Williams
Now that nothing external to cxl_hdm_decode_init() considers 'struct cxl_endpoint_dvec_info' move it internal to cxl_hdm_decode_init(). Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291690612.1426646.7866084245521113414.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()Dan Williams
In preparation for changing how the driver handles 'mem_enable' in the CXL DVSEC control register. Merge the contents of cxl_hdm_decode_init() into cxl_dvsec_ranges() and rename the combined function cxl_hdm_decode_init(). The possible cleanups and fixes that result from this merge are saved for a follow-on change. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165291690027.1426646.10249756632415633752.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/mem: Skip range enumeration if mem_enable clearDan Williams
When a device does not have mem_enable set then the current range settings are moot. Skip the enumeration and cause cxl_hdm_decode_init() to proceed directly to enable the HDM Decoder Capability. Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291689442.1426646.18012291761753694336.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/mem: Consolidate CXL DVSEC Range enumeration in the coreDan Williams
In preparation for fixing the setting of the 'mem_enabled' bit in CXL DVSEC Control register, move all CXL DVSEC range enumeration into the same source file. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291688886.1426646.15046138604010482084.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/pci: Move cxl_await_media_ready() to the coreDan Williams
Allow cxl_await_media_ready() to be mocked for testing purposes rather than carrying the maintenance burden of an indirect function call in the mainline driver. With the move cxl_await_media_ready() can no longer reuse the mailbox timeout override, so add a media_ready_timeout module parameter to the core to backfill. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291688340.1426646.4755627801983775011.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/mem: Validate port connectivity before dvsec rangesDan Williams
In preparation for validating DVSEC ranges against the platform declared CXL memory ranges (ACPI CFMWS) move port enumeration before the endpoint's decoder validation. Ultimately this logic will move to the port driver, but create a bisect point before that larger move. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291687749.1426646.18091538443879226995.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/mem: Fix cxl_mem_probe() error exitDan Williams
The addition of cxl_mem_active() broke error exit scenarios for cxl_mem_probe(). Return early rather than proceed with disabling suspend, and update the label name since it is no longer a terminal "out" label that exits the function. Fixes: 9ea4dcf49878 ("PM: CXL: Disable suspend") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291687176.1426646.15449254938752532784.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()Dan Williams
A check mem_info_valid already happens in __cxl_dvsec_ranges(). Rely on that instead of calling wait_for_valid again. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291686632.1426646.7479581732894574486.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()Dan Williams
Now that wait_for_media() does nothing supplemental to wait_for_media_ready() just promote wait_for_media_ready() to a common helper and drop wait_for_media(). Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291686046.1426646.4390664747934592185.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19cxl/mem: Drop mem_enabled check from wait_for_media()Dan Williams
Media ready is asserted by the device independent of whether mem_enabled was ever set. Drop this check to allow for dropping wait_for_media() in favor of ->wait_media_ready(). Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291685501.1426646.10372821863672431074.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28cxl: Drop cxl_device_lock()Dan Williams
Now that all CXL subsystem locking is validated with custom lock classes, there is no need for the custom usage of the lockdep_mutex. Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165055520383.3745911.53447786039115271.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28cxl/acpi: Add root device lockdep validationDan Williams
The CXL "root" device, ACPI0017, is an attach point for coordinating platform level CXL resources and is the parent device for a CXL port topology tree. As such it has distinct locking rules relative to other CXL subsystem objects, but because it is an ACPI device the lock class is established well before it is given to the cxl_acpi driver. However, the lockdep API does support changing the lock class "live" for situations like this. Add a device_lock_set_class() helper that a driver can use in ->probe() to set a custom lock class, and device_lock_reset_class() to return to the default "no validate" class before the custom lock class key goes out of scope after ->remove(). Note the helpers are all macros to support dead code elimination in the CONFIG_PROVE_LOCKING=n case, however device_set_lock_class() still needs #ifdef CONFIG_PROVE_LOCKING since lockdep_match_class() explicitly does not have a helper in the CONFIG_PROVE_LOCKING=n case (see comment in lockdep.h). The lockdep API needs 2 small tweaks to prevent "unused" warnings for the @key argument to lock_set_class(), and a new lock_set_novalidate_class() is added to supplement lockdep_set_novalidate_class() in the cases where the lock class is converted while the lock is held. Suggested-by: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165100081305.1528964.11138612430659737238.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28cxl: Replace lockdep_mutex with local lock classesDan Williams
In response to an attempt to expand dev->lockdep_mutex for device_lock() validation [1], Peter points out [2] that the lockdep API already has the ability to assign a dedicated lock class per subsystem device-type. Use lockdep_set_class() to override the default device_lock() '__lockdep_no_validate__' class for each CXL subsystem device-type. This enables lockdep to detect deadlocks and recursive locking within the device-driver core and the subsystem. The lockdep_set_class_and_subclass() API is used for port objects that recursively lock the 'cxl_port_key' class by hierarchical topology depth. Link: https://lore.kernel.org/r/164982968798.684294.15817853329823976469.stgit@dwillia2-desk3.amr.corp.intel.com [1] Link: https://lore.kernel.org/r/Ylf0dewci8myLvoW@hirez.programming.kicks-ass.net [2] Suggested-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165055519317.3745911.7342499516839702840.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28cxl/mbox: fix logical vs bitwise typoDan Carpenter
This should be bitwise & instead of &&. Fixes: 6179045ccc0c ("cxl/mbox: Block immediate mode in SET_PARTITION_INFO command") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YmpgkbbQ1Yxu36uO@kili Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-22cxl/mbox: Replace NULL check with IS_ERR() after vmemdup_user()Alison Schofield
vmemdup_user() returns an ERR_PTR() on failure. Use IS_ERR() to check the return value. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220407010915.1211258-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-22cxl/mbox: Use type __u32 for mailbox payload sizesAlison Schofield
Payload sizes for mailbox commands are expected to be positive values coming from userspace. The documentation correctly describes these as always unsigned values. The mailbox and send structures that support the mailbox commands however, use __s32 types for the payloads. Replace __s32 with __u32 in the mailbox and send command structures and update usages. Kernel users of the interface already block all negative values and there is no known ability for userspace to have grown a dependency on submitting negative values to the kernel. The known user of the IOCTL, the CXL command line interface (cxl-cli) already enforces positive size values. A Smatch warning of a signedness uncovered this issue. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20220414051246.1244575-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-22PM: CXL: Disable suspendDan Williams
The CXL specification claims S3 support at a hardware level, but at a system software level there are some missing pieces. Section 9.4 (CXL 2.0) rightly claims that "CXL mem adapters may need aux power to retain memory context across S3", but there is no enumeration mechanism for the OS to determine if a given adapter has that support. Moreover the save state and resume image for the system may inadvertantly end up in a CXL device that needs to be restored before the save state is recoverable. I.e. a circular dependency that is not resolvable without a third party save-area. Arrange for the cxl_mem driver to fail S3 attempts. This still nominaly allows for suspend, but requires unbinding all CXL memory devices before the suspend to ensure the typical DRAM flow is taken. The cxl_mem unbind flow is intended to also tear down all CXL memory regions associated with a given cxl_memdev. It is reasonable to assume that any device participating in a System RAM range published in the EFI memory map is covered by aux power and save-area outside the device itself. So this restriction can be minimized in the future once pre-existing region enumeration support arrives, and perhaps a spec update to clarify if the EFI memory map is sufficent for determining the range of devices managed by platform-firmware for S3 support. Per Rafael, if the CXL configuration prevents suspend then it should fail early before tasks are frozen, and mem_sleep should stop showing 'mem' as an option [1]. Effectively CXL augments the platform suspend ->valid() op since, for example, the ACPI ops are not aware of the CXL / PCI dependencies. Given the split role of platform firmware vs OS provisioned CXL memory it is up to the cxl_mem driver to determine if the CXL configuration has elements that platform firmware may not be prepared to restore. Link: https://lore.kernel.org/r/CAJZ5v0hGVN_=3iU8OLpHY3Ak35T5+JcBM-qs8SbojKrpd0VXsA@mail.gmail.com [1] Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <len.brown@intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/165066828317.3907920.5690432272182042556.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>