summaryrefslogtreecommitdiff
path: root/drivers/cxl
AgeCommit message (Collapse)Author
2025-04-02Merge tag 'cxl-for-6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: - Add support for Global Persistent Flush (GPF) - Cleanup of DPA partition metadata handling: - Remove the CXL_DECODER_MIXED enum that's not needed anymore - Introduce helpers to access resource and perf meta data - Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info' - Make cxl_dpa_alloc() DPA partition number agnostic - Remove cxl_decoder_mode - Cleanup partition size and perf helpers - Remove unused CXL partition values - Add logging support for CXL CPER endpoint and port protocol errors: - Prefix protocol error struct and function names with cxl_ - Move protocol error definitions and structures to a common location - Remove drivers/firmware/efi/cper_cxl.h to include/linux/cper.h - Add support in GHES to process CXL CPER protocol errors - Process CXL CPER protocol errors - Add trace logging for CXL PCIe port RAS errors - Remove redundant gp_port init - Add validation of cxl device serial number - CXL ABI documentation updates/fixups - A series that uses guard() to clean up open coded mutex lockings and remove gotos for error handling. - Some followup patches to support dirty shutdown accounting: - Add helper to retrieve DVSEC offset for dirty shutdown registers - Rename cxl_get_dirty_shutdown() to cxl_arm_dirty_shutdown() - Add support for dirty shutdown count via sysfs - cxl_test support for dirty shutdown - A series to support CXL mailbox Features commands. Mostly in preparation for CXL EDAC code to utilize the Features commands. It's also in preparation for CXL fwctl support to utilize the CXL Features. The commands include "Get Supported Features", "Get Feature", and "Set Feature". - A series to support extended linear cache support described by the ACPI HMAT table. The addition helps enumerate the cache and also provides additional RAS reporting support for configuration with extended linear cache. (and related fixes for the series). - An update to cxl_test to support a 3-way capable CFMWS - A documentation fix to remove unused "mixed mode" * tag 'cxl-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (39 commits) cxl/region: Fix the first aliased address miscalculation cxl/region: Quiet some dev_warn()s in extended linear cache setup cxl/Documentation: Remove 'mixed' from sysfs mode doc cxl: Fix warning from emitting resource_size_t as long long int on 32bit systems cxl/test: Define a CFMWS capable of a 3 way HB interleave cxl/mem: Do not return error if CONFIG_CXL_MCE unset tools/testing/cxl: Set Shutdown State support cxl/pmem: Export dirty shutdown count via sysfs cxl/pmem: Rename cxl_dirty_shutdown_state() cxl/pci: Introduce cxl_gpf_get_dvsec() cxl/pci: Support Global Persistent Flush (GPF) cxl: Document missing sysfs files cxl: Plug typos in ABI doc cxl/pmem: debug invalid serial number data cxl/cdat: Remove redundant gp_port initialization cxl/memdev: Remove unused partition values cxl/region: Drop goto pattern of construct_region() cxl/region: Drop goto pattern in cxl_dax_region_alloc() cxl/core: Use guard() to drop goto pattern of cxl_dpa_alloc() cxl/core: Use guard() to drop the goto pattern of cxl_dpa_free() ...
2025-04-01Merge tag 'driver-core-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updatesk from Greg KH: "Here is the big set of driver core updates for 6.15-rc1. Lots of stuff happened this development cycle, including: - kernfs scaling changes to make it even faster thanks to rcu - bin_attribute constify work in many subsystems - faux bus minor tweaks for the rust bindings - rust binding updates for driver core, pci, and platform busses, making more functionaliy available to rust drivers. These are all due to people actually trying to use the bindings that were in 6.14. - make Rafael and Danilo full co-maintainers of the driver core codebase - other minor fixes and updates" * tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits) rust: platform: require Send for Driver trait implementers rust: pci: require Send for Driver trait implementers rust: platform: impl Send + Sync for platform::Device rust: pci: impl Send + Sync for pci::Device rust: platform: fix unrestricted &mut platform::Device rust: pci: fix unrestricted &mut pci::Device rust: device: implement device context marker rust: pci: use to_result() in enable_device_mem() MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers rust/kernel/faux: mark Registration methods inline driver core: faux: only create the device if probe() succeeds rust/faux: Add missing parent argument to Registration::new() rust/faux: Drop #[repr(transparent)] from faux::Registration rust: io: fix devres test with new io accessor functions rust: io: rename `io::Io` accessors kernfs: Move dput() outside of the RCU section. efi: rci2: mark bin_attribute as __ro_after_init rapidio: constify 'struct bin_attribute' firmware: qemu_fw_cfg: constify 'struct bin_attribute' powerpc/perf/hv-24x7: Constify 'struct bin_attribute' ...
2025-03-20cxl/region: Fix the first aliased address miscalculationLi Ming
In extended linear cache(ELC) case, cxl_port_get_spa_cache_alias() helps to get the aliased address of a SPA, it considers the first address in CXL memory range is "region start + region cache size + 1", but it should be "region start + region cache size". So if a SPA is equal to "region start + region cache size", its aliased address should be "SPA - region cache size". Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250317070124.815028-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-17cxl: Add support to handle user feature commands for set featureDave Jiang
Add helper function to parse the user data from fwctl RPC ioctl and send the parsed input parameters to cxl_set_feature() call. Link: https://patch.msgid.link/r/20250307205648.1021626-6-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17cxl: Add support to handle user feature commands for get featureDave Jiang
Add helper function to parse the user data from fwctl RPC ioctl and send the parsed input parameters to cxl_get_feature() call. Link: https://patch.msgid.link/r/20250307205648.1021626-5-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17cxl: Add support for fwctl RPC command to enable CXL feature commandsDave Jiang
fwctl provides a fwctl_ops->fw_rpc() callback in order to issue ioctls to a device. The cxl fwctl driver will start by supporting the CXL Feature commands: Get Supported Features, Get Feature, and Set Feature. The fw_rpc() callback provides 'enum fwctl_rpc_scope' parameter where it indicates the security scope of the call. The Get Supported Features and Get Feature calls can be executed with the scope of FWCTL_RPC_CONFIGRATION. The Set Feature call is gated by the effects of the Feature reported by Get Supported Features call for the specific Feature. Only "Get Supported Features" is supported in this patch. Additional commands will be added in follow on patches. "Get Supported Features" will filter the Features that are exclusive to the kernel. The flag field of the Feature details will be cleared of the "Changeable" field and the "set feat size" will be set to 0 to indicate that the feature is not changeable. Link: https://patch.msgid.link/r/20250307205648.1021626-4-dave.jiang@intel.com Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17cxl: Add FWCTL support to CXLDave Jiang
Add fwctl support code to allow sending of CXL feature commands from userspace through as ioctls via FWCTL. Provide initial setup bits. The CXL PCI probe function will call devm_cxl_setup_fwctl() after the cxl_memdev has been enumerated in order to setup FWCTL char device under the cxl_memdev like the existing memdev char device for issuing CXL raw mailbox commands from userspace via ioctls. Link: https://patch.msgid.link/r/20250307205648.1021626-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-17Merge branch 'for-6.15/features' into cxl-for-nextDave Jiang
Add CXL Features support. Setup code for enabling in kernel usage of CXL Features. Expecting EDAC/RAS to utilize CXL Features in kernel for things such as memory sparing. Also prepartion for enabling of CXL FWCTL support to issue allowed Features from user space.
2025-03-14cxl/region: Quiet some dev_warn()s in extended linear cache setupAlison Schofield
Extended Linear Cache (ELC) setup code emits a dev_warn(), "Extended linear cache calculation failed." for issues found while setting up the ELC. For platforms without CONFIG_ACPI_HMAT, every auto region setup will emit the warning because the default !ACPI_HMAT return value is EOPNOTSUPP. Suppress it by skipping the warn for EOPNOTSUPP. Change the EOPNOTSUPP in the actual ELC failure path to ENXIO. Remove the check and enusing dev_warn() when region resource size is NULL. The endpoint decoders hpa_range used to create the resource is checked in init_hdm_decoder(), so it cannot be NULL here. For good measure, add the rc value to the dev_warn(). It will either be the -ENOENT returned by HMAT if the mem target is not found, or the -ENXIO from the region driver calculation. Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250306213700.2606304-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl: Fix warning from emitting resource_size_t as long long int on 32bit systemsDave Jiang
Reported by kernel test bot from an ARM build: drivers/cxl/core/region.c:3263:26: warning: format '%lld' expects argument of type 'long long int', but argument 3 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] On a 32bit system, resource_size_t is defined as 32bit long vs on a 64bit system it is defined as 64bit long. Use %pa format to deal with resource_size_t. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503010252.mIDhZ5kY-lkp@intel.com/ Fixes: 0ec9849b6333 ("acpi/hmat / cxl: Add extended linear cache support for CXL") Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250228204739.3849309-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/mem: Do not return error if CONFIG_CXL_MCE unsetLi Ming
CONFIG_CXL_MCE depends on CONFIG_MEMORY_FAILURE, if CONFIG_CXL_MCE is not set, devm_cxl_register_mce_notifier() will return an -EOPNOTSUPP, it will cause cxl_mem_state_create() failure , and then cxl pci device probing failed. In this case, it should not break cxl pci device probing. Add a checking in cxl_mem_state_create() to check if the returned value of devm_cxl_register_mce_notifier() is -EOPNOTSUPP, if yes, just output a warning log, do not let cxl_mem_state_create() return an error. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250227101848.388595-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14Merge branch 'for-6.15/extended-linear-cache' into cxl-for-next2Dave Jiang
Add support for Extended Linear Cache for CXL. Add enumeration support of the cache. Add MCE notification of the aliased memory address.
2025-03-14Merge branch 'for-6.15/dirty-shutdown' into cxl-for-next2Dave Jiang
Add support for Global Persistent Flush (GPF) and dirty shutdown accounting.
2025-03-14Merge branch 'for-6.15/guard_cleanups' into cxl-for-next2Dave Jiang
A series of CXL refactoring using scope based resource management to remove goto patterns on the cleanup paths.
2025-03-14cxl/pmem: Export dirty shutdown count via sysfsDavidlohr Bueso
Similar to how the acpi_nfit driver exports Optane dirty shutdown count, introduce: /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown Under the conditions that 1) dirty shutdown can be set, 2) Device GPF DVSEC exists, and 3) the count itself can be retrieved. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-4-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/pmem: Rename cxl_dirty_shutdown_state()Davidlohr Bueso
... to a better suited 'cxl_arm_dirty_shutdown()'. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-3-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/pci: Introduce cxl_gpf_get_dvsec()Davidlohr Bueso
Add a helper to fetch the port/device GPF dvsecs. This is currently only used for ports, but a later patch to export dirty count to users will make use of the device one. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-2-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/pci: Support Global Persistent Flush (GPF)Davidlohr Bueso
Add support for GPF flows. It is found that the CXL specification around this to be a bit too involved from the driver side. And while this should really all handled by the hardware, this patch takes things with a grain of salt. Upon respective port enumeration, both phase timeouts are set to a max of 20 seconds, which is the NMI watchdog default for lockup detection. The premise is that the kernel does not have enough information to set anything better than a max across the board and hope devices finish their GPF flows within the platform energy budget. Timeout detection is based on dirty Shutdown semantics. The driver will mark it as dirty, expecting that the device clear it upon a successful GPF event. The admin may consult the device Health and check the dirty shutdown counter to see if there was a problem with data integrity. [ davej: Explicitly set return to 0 in update_gpf_port_dvsec() ] [ davej: Add spec reference for 'struct cxl_mbox_set_shutdown_state_in ] [ davej: Fix 0-day reported issue ] Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250124233533.910535-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/pmem: debug invalid serial number dataYuquan Wang
In a nvdimm interleave-set each device with an invalid or zero serial number may cause pmem region initialization to fail, but in cxl case such device could still set cookies of nd_interleave_set and create a nvdimm pmem region. This adds the validation of serial number in cxl pmem region creation. The event of no serial number would cause to fail to set the cookie and pmem region. For cxl-test to work properly, always +1 on mock device's serial number. Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250219040029.515451-2-wangyuquan1236@phytium.com.cn Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/cdat: Remove redundant gp_port initializationLi Ming
gp_port is already pointed to the grandparent port during its definition, remove a redundant code to let gp_port point to the grandparent port again. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20250211062054.300108-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/memdev: Remove unused partition valuesIra Weiny
The next volatile and next persistent values are unused and are cluttering the cxl_memdev_state. Remove these values. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250206-cxl-cleanup-v1-1-9ddf26dd8433@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/region: Drop goto pattern of construct_region()Li Ming
Some operations need to be protected by the cxl_region_rwsem in construct_region(). Currently, construct_region() uses down_write() and up_write() for the cxl_region_rwsem locking, so there is a goto pattern after down_write() invoked to release cxl_region_rwsem. construct region() can be optimized to remove the goto pattern. The changes are creating a new function called __construct_region() which will include all checking and operations protected by the cxl_region_rwsem, and using guard(rwsem_write) to replace down_write() and up_write() in __construct_region(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221013205.126419-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/region: Drop goto pattern in cxl_dax_region_alloc()Li Ming
In cxl_dax_region_alloc(), there is a goto pattern to release the rwsem cxl_region_rwsem when the function returns, the down_read() and up_read can be replaced by a guard(rwsem_read) then the goto pattern can be removed. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-7-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/core: Use guard() to drop goto pattern of cxl_dpa_alloc()Li Ming
In cxl_dpa_alloc(), some checking and operations need to be protected by a rwsem called cxl_dpa_rwsem, so there is a goto pattern in cxl_dpa_alloc() to release the rwsem. The goto pattern can be optimized by using guard() to hold the rwsem. Creating a new function called __cxl_dpa_alloc() to include all checking and operations needed to be protected by cxl_dpa_rwsem. Using guard(rwsem_write()) to hold cxl_dpa_rwsem at the beginning of the new function. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-6-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/core: Use guard() to drop the goto pattern of cxl_dpa_free()Li Ming
cxl_dpa_free() has a goto pattern to call up_write() for cxl_dpa_rwsem, it can be removed by using a guard() to replace the down_write() and up_write(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-5-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/memdev: cxl_memdev_ioctl() cleanupLi Ming
In cxl_memdev_ioctl(), the down_read(&cxl_memdev_rwsem) and up_read(&cxl_memdev_rwsem) can be replaced by a guard(rwsem_read)(&cxl_memdev_rwsem), it helps to remove the open-coded up_read(&cxl_memdev_rwsem). Besides, the local var 'rc' can be also removed to make the code more cleaner. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-4-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/core: cxl_mem_sanitize() cleanupLi Ming
In cxl_mem_sanitize(), the down_read() and up_read() for cxl_region_rwsem can be simply replaced by a guard(rwsem_read), and the local variable 'rc' can be removed. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-3-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14cxl/core: Use guard() to replace open-coded down_read/write()Li Ming
Some down/up_read() and down/up_write() cases can be replaced by a guard() simply to drop explicit unlock invoked. It helps to align coding style with current CXL subsystem's. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-2-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14Merge branch 'for-6.15/fw-first-error-logging' into cxl-for-next2Dave Jiang
Add logging support for CXL CPER endpoint and port protocol errors. Including the 2 patches that was completed later. Link: https://lore.kernel.org/linux-cxl/20250123084421.127697-1-Smita.KoralahalliChannabasappa@amd.com/ Link: https://lore.kernel.org/linux-cxl/20250310223839.31342-1-Smita.KoralahalliChannabasappa@amd.com/
2025-03-14cxl/pci: Add trace logging for CXL PCIe Port RAS errorsSmita Koralahalli
The CXL drivers use kernel trace functions for logging endpoint and Restricted CXL host (RCH) Downstream Port RAS errors. Similar functionality is required for CXL Root Ports, CXL Downstream Switch Ports, and CXL Upstream Switch Ports. Introduce trace logging functions for both RAS correctable and uncorrectable errors specific to CXL PCIe Ports. Use them to trace FW-First Protocol errors. Co-developed-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250310223839.31342-3-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14acpi/ghes, cxl/pci: Process CXL CPER Protocol ErrorsSmita Koralahalli
When PCIe AER is in FW-First, OS should process CXL Protocol errors from CPER records. Introduce support for handling and logging CXL Protocol errors. The defined trace events cxl_aer_uncorrectable_error and cxl_aer_correctable_error trace native CXL AER endpoint errors. Reuse them to trace FW-First Protocol errors. Since the CXL code is required to be called from process context and GHES is in interrupt context, use workqueues for processing. Similar to CXL CPER event handling, use kfifo to handle errors as it simplifies queue processing by providing lock free fifo operations. Add the ability for the CXL sub-system to register a workqueue to process CXL CPER protocol errors. [DJ: return cxl_cper_register_prot_err_work() directly in cxl_ras_init()] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250310223839.31342-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26cxl: Add mce notifier to emit aliased address for extended linear cacheDave Jiang
Below is a setup with extended linear cache configuration with an example layout of memory region shown below presented as a single memory region consists of 256G memory where there's 128G of DRAM and 128G of CXL memory. The kernel sees a region of total 256G of system memory. 128G DRAM 128G CXL memory |-----------------------------------|-------------------------------------| Data resides in either DRAM or far memory (FM) with no replication. Hot data is swapped into DRAM by the hardware behind the scenes. When error is detected in one location, it is possible that error also resides in the aliased location. Therefore when a memory location that is flagged by MCE is part of the special region, the aliased memory location needs to be offlined as well. Add an mce notify callback to identify if the MCE address location is part of an extended linear cache region and handle accordingly. Added symbol export to set_mce_nospec() in x86 code in order to call set_mce_nospec() from the CXL MCE notify callback. Link: https://lore.kernel.org/linux-cxl/668333b17e4b2_5639294fd@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26cxl: Add extended linear cache address alias emission for cxl eventsDave Jiang
Add the aliased address of extended linear cache when emitting event trace for poison, DRAM and general media of CXL events. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26acpi/hmat / cxl: Add extended linear cache support for CXLDave Jiang
The current cxl region size only indicates the size of the CXL memory region without accounting for the extended linear cache size. Retrieve the cache size from HMAT and append that to the cxl region size for the cxl region range that matches the SRAT range that has extended linear cache enabled. The SRAT defines the whole memory range that includes the extended linear cache and the CXL memory region. The new HMAT ECN/ECR to the Memory Side Cache Information Structure defines the size of the extended linear cache size and matches to the SRAT Memory Affinity Structure by the memory proxmity domain. Add a helper to match the cxl range to the SRAT memory range in order to retrieve the cache size. There are several places that checks the cxl region range against the decoder range. Use new helper to check between the two ranges and address the new cache size. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26cxl: Setup exclusive CXL features that are reserved for the kernelDave Jiang
Certain features will be exclusively used by components such as in kernel RAS driver. Setup an exclusion list that can be used to detect if a feature is exclusive to the kernel. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-7-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26cxl/mbox: Add SET_FEATURE mailbox commandShiju Jose
Add support for SET_FEATURE mailbox command. CXL spec r3.2 section 8.2.9.6 describes optional device specific features. CXL devices supports features with changeable attributes. The settings of a feature can be optionally modified using Set Feature command. CXL spec r3.2 section 8.2.9.6.3 describes Set Feature command. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-6-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26cxl/mbox: Add GET_FEATURE mailbox commandShiju Jose
Add support for GET_FEATURE mailbox command. CXL spec r3.2 section 8.2.9.6 describes optional device specific features. The settings of a feature can be retrieved using Get Feature command. CXL spec r3.2 section 8.2.9.6.2 describes Get Feature command. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26cxl: Add Get Supported Features command for kernel usageDave Jiang
CXL spec r3.2 8.2.9.6.1 Get Supported Features (Opcode 0500h) The command retrieve the list of supported device-specific features (identified by UUID) and general information about each Feature. The driver will retrieve the Feature entries in order to make checks and provide information for the Get Feature and Set Feature command. One of the main piece of information retrieved are the effects a Set Feature command would have for a particular feature. The retrieved Feature entries are stored in the cxl_mailbox context. The setup of Features is initiated via devm_cxl_setup_features() during the pci probe function before the cxl_memdev is enumerated. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-26cxl: Enumerate feature commandsDave Jiang
Add feature commands enumeration code in order to detect and enumerate the 3 feature related commands "get supported features", "get feature", and "set feature". The enumeration will help determine whether the driver can issue any of the 3 commands to the device. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-24cxl: Refactor user ioctl command path from mds to mailboxDave Jiang
With 'struct cxl_mailbox' context introduced, the helper functions cxl_query_cmd() and cxl_send_cmd() can take a cxl_mailbox directly rather than a cxl_memdev parameter. Refactor to use cxl_mailbox directly. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250204220430.4146187-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-21cxl/port: Constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20250114-sysfs-const-bin_attr-cxl-v1-1-5afa23fe2a52@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-04cxl: Cleanup partition size and perf helpersDan Williams
Now that the 'struct cxl_dpa_partition' array contains both size and performance information, all paths that iterate over that information can use a loop rather than hard-code 'ram' and 'pmem' lookups. Remove, or reduce the scope of the temporary helpers that bridged the pre-'struct cxl_dpa_partition' state of the code to the post-'struct cxl_dpa_partition' state. - to_{ram,pmem}_perf(): scope reduced to just sysfs_emit + is_visible() helpers - to_{ram,pmem}_res(): fold into their only users cxl_{ram,pmem}_size() - cxl_ram_size(): scope reduced to ram_size_show() (Note, cxl_pmem_size() also used to gate nvdimm registration) In short, memdev sysfs ABI already made the promise that 0-sized partitions will show for memdevs, but that can be avoided for future partitions by using dynamic sysfs group visibility (new relative to when the partition ABI first shipped upstream). Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alejandro Lucero <alucerop@amd.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864307519.668823.10800104022426067621.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04cxl: Kill enum cxl_decoder_modeDan Williams
Now that the operational mode of DPA capacity (ram vs pmem... etc) is tracked in the partition, and no code paths have dependencies on the mode implying the partition index, the ambiguous 'enum cxl_decoder_mode' can be cleaned up, specifically this ambiguity on whether the operation mode implied anything about the partition order. Endpoint decoders simply reference their assigned partition where the operational mode can be retrieved as partition mode. With this in place PMEM can now be partition0 which happens today when the RAM capacity size is zero. Dynamic RAM can appear above PMEM when DCD arrives, etc. Code sequences that hard coded the "PMEM after RAM" assumption can now just iterate partitions and consult the partition mode after the fact. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864306972.668823.3327008645125276726.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04cxl: Make cxl_dpa_alloc() DPA partition number agnosticDan Williams
cxl_dpa_alloc() is a hard coded nest of assumptions around PMEM allocations being distinct from RAM allocations in specific ways when in practice the allocation rules are only relative to DPA partition index. The rules for cxl_dpa_alloc() are: - allocations can only come from 1 partition - if allocating at partition-index-N, all free space in partitions less than partition-index-N must be skipped over Use the new 'struct cxl_dpa_partition' array to support allocation with an arbitrary number of DPA partitions on the device. A follow-on patch can go further to cleanup 'enum cxl_decoder_mode' concept and supersede it with looking up the memory properties from partition metadata. Until then cxl_part_mode() temporarily bridges code that looks up partitions by @cxled->mode. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864306400.668823.12143134425285426523.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04cxl: Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info'Dan Williams
The pending efforts to add CXL Accelerator (type-2) device [1], and Dynamic Capacity (DCD) support [2], tripped on the no-longer-fit-for-purpose design in the CXL subsystem for tracking device-physical-address (DPA) metadata. Trip hazards include: - CXL Memory Devices need to consider a PMEM partition, but Accelerator devices with CXL.mem likely do not in the common case. - CXL Memory Devices enumerate DPA through Memory Device mailbox commands like Partition Info, Accelerators devices do not. - CXL Memory Devices that support DCD support more than 2 partitions. Some of the driver algorithms are awkward to expand to > 2 partition cases. - DPA performance data is a general capability that can be shared with accelerators, so tracking it in 'struct cxl_memdev_state' is no longer suitable. - Hardcoded assumptions around the PMEM partition always being index-1 if RAM is zero-sized or PMEM is zero sized. - 'enum cxl_decoder_mode' is sometimes a partition id and sometimes a memory property, it should be phased in favor of a partition id and the memory property comes from the partition info. Towards cleaning up those issues and allowing a smoother landing for the aforementioned pending efforts, introduce a 'struct cxl_dpa_partition' array to 'struct cxl_dev_state', and 'struct cxl_range_info' as a shared way for Memory Devices and Accelerators to initialize the DPA information in 'struct cxl_dev_state'. For now, split a new cxl_dpa_setup() from cxl_mem_create_range_info() to get the new data structure initialized, and cleanup some qos_class init. Follow on patches will go further to use the new data structure to cleanup algorithms that are better suited to loop over all possible partitions. cxl_dpa_setup() follows the locking expectations of mutating the device DPA map, and is suitable for Accelerator drivers to use. Accelerators likely only have one hardcoded 'ram' partition to convey to the cxl_core. Link: http://lore.kernel.org/20241230214445.27602-1-alejandro.lucero-palau@amd.com [1] Link: http://lore.kernel.org/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com [2] Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864305827.668823.13978794102080021276.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04cxl: Introduce to_{ram,pmem}_{res,perf}() helpersDan Williams
In preparation for consolidating all DPA partition information into an array of DPA metadata, introduce helpers that hide the layout of the current data. I.e. make the eventual replacement of ->ram_res, ->pmem_res, ->ram_perf, and ->pmem_perf with a new DPA metadata array a no-op for code paths that consume that information, and reduce the noise of follow-on patches. The end goal is to consolidate all DPA information in 'struct cxl_dev_state', but for now the helpers just make it appear that all DPA metadata is relative to @cxlds. As the conversion to generic partition metadata walking is completed, these helpers will naturally be eliminated, or reduced in scope. Cc: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864305238.668823.16553986866633608541.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04cxl: Remove the CXL_DECODER_MIXED mistakeDan Williams
CXL_DECODER_MIXED is a safety mechanism introduced for the case where platform firmware has programmed an endpoint decoder that straddles a DPA partition boundary. While the kernel is careful to only allocate DPA capacity within a single partition there is no guarantee that platform firmware, or anything that touched the device before the current kernel, gets that right. However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED designation because of the way it tracks partition boundaries. A request_resource() that spans ->ram_res and ->pmem_res fails with the following signature: __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation CXL_DECODER_MIXED is dead defensive programming after the driver has already given up on the device. It has never offered any protection in practice, just delete it. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Link: https://patch.msgid.link/173864304660.668823.17000888505587850279.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-29Merge tag 'cxl-for-6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: "A tweak to the HMAT output that was acked by Rafael, a prep patch for CXL type2 devices support that's coming soon, refactoring of the CXL regblock enumeration code, and a series of patches to update the event records to CXL spec r3.1: - Move HMAT printouts to pr_debug() - Add CXL type2 support to cxl_dvsec_rr_decode() in preparation for type2 support - A series that updates CXL event records to spec r3.1 and related changes - Refactoring of cxl_find_regblock_instance() to count regblocks" * tag 'cxl-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/regs: Refactor out functions to count regblocks of given type cxl/test: Update test code for event records to CXL spec rev 3.1 cxl/events: Update Memory Module Event Record to CXL spec rev 3.1 cxl/events: Update DRAM Event Record to CXL spec rev 3.1 cxl/events: Update General Media Event Record to CXL spec rev 3.1 cxl/events: Add Component Identifier formatting for CXL spec rev 3.1 cxl/events: Update Common Event Record to CXL spec rev 3.1 cxl/pci: Add CXL Type 1/2 support to cxl_dvsec_rr_decode() ACPI/HMAT: Move HMAT messages to pr_debug()
2025-01-22cxl/core/regs: Refactor out functions to count regblocks of given typeHuaisheng Ye
cxl_find_regblock_instance() counts the number of instances of a register block as a side effect of searching through all available register blocks. cxl_count_regblock() throws away that work and recounts all the register blocks by asking cxl_find_regblock_instance() to redo work it has already done until it finally returns an error, that is needlessly wasteful. Let cxl_count_regblock() leverage the counting that cxl_find_regblock_instance() already does by passing in a sentinel value (CXL_INSTANCES_COUNT) that triggers the count to be returned. [ davej: Updated to more concise commit log supplied by djbw ] [ davej: Fix up checkpatch formatting warnings ] Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250115152600.26482-2-huaisheng.ye@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-13cxl/events: Update Memory Module Event Record to CXL spec rev 3.1Shiju Jose
CXL spec 3.1 section 8.2.9.2.1.3 Table 8-47, Memory Module Event Record has updated with following new fields and new info for Device Event Type and Device Health Information fields. 1. Validity Flags 2. Component Identifier 3. Device Event Sub-Type Update the Memory Module event record and Memory Module trace event for the above spec changes. The new fields are inserted in logical places. Example trace print of cxl_memory_module trace event, cxl_memory_module: memdev=mem3 host=0000:0f:00.0 serial=3 log=Fatal : \ time=371709344709 uuid=fe927475-dd59-4339-a586-79bab113b774 len=128 \ flags='0x1' handle=2 related_handle=0 maint_op_class=0 \ maint_op_sub_class=0 : event_type='Temperature Change' \ event_sub_type='Unsupported Config Data' \ health_status='MAINTENANCE_NEEDED|REPLACEMENT_NEEDED' \ media_status='All Data Loss in Event of Power Loss' as_life_used=0x3 \ as_dev_temp=Normal as_cor_vol_err_cnt=Normal as_cor_per_err_cnt=Normal \ life_used=8 device_temp=3 dirty_shutdown_cnt=33 cor_vol_err_cnt=25 \ cor_per_err_cnt=45 validity_flags='COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='Resource ID' \ pldm_entity_id=0x00 pldm_resource_id=fc d2 7e 2f Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250111091756.1682-6-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>