summaryrefslogtreecommitdiff
path: root/drivers/base
AgeCommit message (Collapse)Author
2022-07-04arch_topology: Drop LLC identifier stash from the CPU topologySudeep Holla
Since the cacheinfo LLC information is used directly in arch_topology, there is no need to parse and store the LLC ID information only for ACPI systems in the CPU topology. Remove the redundant LLC ID from the generic CPU arch_topology information. Link: https://lore.kernel.org/r/20220704101605.1318280-13-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04arch_topology: Use the last level cache information from the cacheinfoSudeep Holla
The cacheinfo is now initialised early along with the CPU topology initialisation. Instead of relying on the LLC ID information parsed separately only with ACPI PPTT elsewhere, migrate to use the similar information from the cacheinfo. This is generic for both DT and ACPI systems. The ACPI LLC ID information parsed separately can now be removed from arch specific code. Link: https://lore.kernel.org/r/20220704101605.1318280-11-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04arch_topology: Add support to parse and detect cache attributesSudeep Holla
Currently ACPI populates just the minimum information about the last level cache from PPTT in order to feed the same to build sched_domains. Similar support for DT platforms is not present. In order to enable the same, the entire cache hierarchy information can be built as part of CPU topoplogy parsing both on ACPI and DT platforms. Note that this change builds the cacheinfo early even on ACPI systems, but the current mechanism of building llc_sibling mask remains unchanged. Link: https://lore.kernel.org/r/20220704101605.1318280-10-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readabilitySudeep Holla
The checks to skip the CPU itself or no cacheinfo case are implemented bit differently though the effect is exactly same. Just align the implementation in both cache_shared_cpu_map_{setup,remove} just for improved readability. No functional change. Link: https://lore.kernel.org/r/20220704101605.1318280-9-sudeep.holla@arm.com Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04cacheinfo: Use cache identifiers to check if the caches are shared if availableSudeep Holla
The cache identifiers is an optional property on most of the platforms. The presence of one must be indicated by the CACHE_ID valid bit in the attributes. We can use the cache identifiers provided by the firmware to check if any two cpus share the same cache instead of relying on the fw_token generated and set in the OS. Link: https://lore.kernel.org/r/20220704101605.1318280-8-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04cacheinfo: Allow early detection and population of cache attributesSudeep Holla
Some architecture/platforms may need to setup cache properties very early in the boot along with other cpu topologies so that all these information can be used to build sched_domains which is used by the scheduler. Allow detect_cache_attributes to be called quite early during the boot. Link: https://lore.kernel.org/r/20220704101605.1318280-7-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04cacheinfo: Add support to check if last level cache(LLC) is valid or sharedSudeep Holla
It is useful to have helper to check if the given two CPUs share last level cache. We can do that check by comparing fw_token or by comparing the cache ID. Currently we check just for fw_token as the cache ID is optional. This helper can be used to build the llc_sibling during arch specific topology parsing and feeding information to the sched_domains. This also helps to get rid of llc_id in the CPU topology as it is sort of duplicate information. Also add helper to check if the llc information in cacheinfo is valid or not. Link: https://lore.kernel.org/r/20220704101605.1318280-6-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04cacheinfo: Move cache_leaves_are_shared out of CONFIG_OFSudeep Holla
cache_leaves_are_shared is already used even with ACPI and PPTT. It checks if the cache leaves are the shared based on fw_token pointer. However it is defined conditionally only if CONFIG_OF is enabled which is wrong. Move the function cache_leaves_are_shared out of CONFIG_OF and keep it generic. It also handles the case where both OF and ACPI is not defined. Link: https://lore.kernel.org/r/20220704101605.1318280-5-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04cacheinfo: Add helper to access any cache index for a given CPUSudeep Holla
The cacheinfo for a given CPU at a given index is used at quite a few places by fetching the base point for index 0 using the helper per_cpu_cacheinfo(cpu) and offsetting it by the required index. Instead, add another helper to fetch the required pointer directly and use it to simplify and improve readability. Link: https://lore.kernel.org/r/20220704101605.1318280-4-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_nodeSudeep Holla
The of_cpu_device_node_get takes care of fetching the CPU'd device node either from cached cpu_dev->of_node if cpu_dev is initialised or uses of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet. Just use of_cpu_device_node_get instead of getting the cpu device first and then using cpu_dev->of_node for two reasons: 1. There is no other use of cpu_dev and can be simplified 2. It enabled the use detect_cache_attributes and hence cache_setup_of_node much earlier before the CPUs are registered as devices. Link: https://lore.kernel.org/r/20220704101605.1318280-3-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-01PM: runtime: Fix supplier device management during consumer probeRafael J. Wysocki
Because pm_runtime_get_suppliers() bumps up the rpm_active counter of each device link to a supplier of the given device in addition to bumping up the supplier's PM-runtime usage counter, a runtime suspend of the consumer device may case the latter to go down to 0 when pm_runtime_put_suppliers() is running on a remote CPU. If that happens after pm_runtime_put_suppliers() has released power.lock for the consumer device, and a runtime resume of that device takes place immediately after it, before pm_runtime_put() is called for the supplier, that pm_runtime_put() call may cause the supplier to be suspended even though the consumer is active. To prevent that from happening, modify pm_runtime_get_suppliers() to call pm_runtime_get_sync() for the given device's suppliers without touching the rpm_active counters of the involved device links Accordingly, modify pm_runtime_put_suppliers() to call pm_runtime_put() for the given device's suppliers without looking at the rpm_active counters of the device links at hand. [This is analogous to what happened before commit 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance").] Since pm_runtime_get_suppliers() sets supplier_preactivated for each device link where the supplier's PM-runtime usage counter has been incremented and pm_runtime_put_suppliers() calls pm_runtime_put() for the suppliers whose device links have supplier_preactivated set, the PM-runtime usage counter is balanced for each supplier and this is independent of the runtime suspend and resume of the consumer device. However, in case a device link with DL_FLAG_PM_RUNTIME set is dropped during the consumer device probe, so pm_runtime_get_suppliers() bumps up the supplier's PM-runtime usage counter, but it cannot be dropped by pm_runtime_put_suppliers(), make device_link_release_fn() take care of that. Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance") Reported-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
2022-07-01PM: runtime: Redefine pm_runtime_release_supplier()Rafael J. Wysocki
Instead of passing an extra bool argument to pm_runtime_release_supplier(), make its callers take care of triggering a runtime-suspend of the supplier device as needed. No expected functional impact. Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
2022-06-30regmap-irq cleanups and refactoringMark Brown
Merge series from Aidan MacDonald <aidanmacdonald.0x0@gmail.com>: This series is an attempt at cleaning up the regmap-irq API in order to simplify things and consolidate existing features, while at the same time generalizing it to support a wider range of hardware. There is a new system for IRQ type configuration, some tweaks to unmask registers so they're more intuitive and useful, and a new callback for calculating register addresses. There's also a few minor code cleanups in here. In v2 I've taken the approach of adding new features and deprecating existing ones rather than removing them aggressively. Warnings will be issued for any drivers that use deprecated features, but they'll otherwise continue to function normally. One important caveat: not all of these changes are tested beyond compile testing, since I don't have hardware to exercise all of the features.
2022-06-30regmap: cache: Add extra parameter check in regcache_initSchspa Shi
When num_reg_defaults > 0 but reg_defaults is NULL, there will be a NULL pointer exception. Current code has no such usage, but as additional hardening, also check this to prevent any chance of crashing. Signed-off-by: Schspa Shi <schspa@gmail.com> Link: https://lore.kernel.org/r/20220629130951.63040-1-schspa@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Deprecate the not_fixed_stride flagAidan MacDonald
This flag is a bit of a hack and the same thing can be accomplished using a custom ->get_irq_reg() callback. Add a warning to catch any use of the flag. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-13-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Add get_irq_reg() callbackAidan MacDonald
Replace the internal sub_irq_reg() function with a public callback that drivers can use when they have more complex register layouts. The default implementation is regmap_irq_get_irq_reg_linear(), used if the chip doesn't provide its own callback. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-12-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Fix inverted handling of unmask registersAidan MacDonald
To me "unmask" suggests that we write 1s to the register when an interrupt is enabled. This also makes sense because it's the opposite of what the "mask" register does (write 1s to disable an interrupt). But regmap-irq does the opposite: for a disabled interrupt, it writes 1s to "unmask" and 0s to "mask". This is surprising and deviates from the usual way mask registers are handled. Additionally, mask_invert didn't interact with unmask registers properly -- it caused them to be ignored entirely. Fix this by making mask and unmask registers orthogonal, using the following behavior: * Mask registers are written with 1s for disabled interrupts. * Unmask registers are written with 1s for enabled interrupts. This behavior supports both normal or inverted mask registers and separate set/clear registers via different combinations of mask_base/unmask_base. The old unmask register behavior is deprecated. Drivers need to opt-in to the new behavior by setting mask_unmask_non_inverted. Warnings are issued if the driver relies on deprecated behavior. Chips that only set one of mask_base/unmask_base don't have to use the mask_unmask_non_inverted flag because that use case was previously not supported. The mask_invert flag is also deprecated in favor of describing inverted mask registers as unmask registers. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-11-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Deprecate type registers and virtual registersAidan MacDonald
Config registers can be used to replace both type and virtual registers, so mark both features are deprecated and issue a warning if they're used. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-10-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Introduce config registers for irq typesAidan MacDonald
Config registers provide a more uniform approach to handling irq type registers. They are essentially an extension of the virtual registers used by the qcom-pm8008 driver. Config registers can be represented as a 2D array: config_base[0] reg0,0 reg0,1 reg0,2 reg0,3 config_base[1] reg1,0 reg1,1 reg1,2 reg1,3 config_base[2] reg2,0 reg2,1 reg2,2 reg2,3 There are 'num_config_bases' base registers, each of which is used to address 'num_config_regs' registers. The addresses are calculated in the same way as for other bases. It is assumed that an irq's type is controlled by one column of registers; that column is identified by the irq's 'type_reg_offset'. The set_type_config() callback is responsible for updating the config register contents. It receives an array of buffers (each represents a row of registers) and the index of the column to update, along with the 'struct regmap_irq' description and requested irq type. Buffered values are written to registers in regmap_irq_sync_unlock(). Note that the entire register contents are overwritten, which is a minor change in behavior from type registers via 'type_base'. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-9-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Refactor checks for status bulk read supportAidan MacDonald
There are several conditions that must be satisfied to support bulk read of status registers. Move the check into a function to avoid duplicating it in two places. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-8-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Remove mask_writeonly and regmap_irq_update_bits()Aidan MacDonald
Commit a71411dbf6c8 ("regmap: irq: add chip option mask_writeonly") introduced the mask_writeonly option, but it isn't used now and it appears it's never been used by any in-tree drivers. The motivation for the option is mentioned in the commit message, Some irq controllers have writeonly/multipurpose register layouts. In those cases we read invalid data back. [...] The option causes mask register updates to use regmap_write_bits() instead of regmap_update_bits(). However, regmap_write_bits() doesn't solve the reading invalid data problem. It's still a read-modify-write op like regmap_update_bits(). The difference is that 'update bits' will only write the new value if it is different from the current value, while 'write bits' will write the new value unconditionally, even if it's the same as the current value. This seems like a bit of a specialized use case and probably isn't that useful for regmap-irq, so let's just remove the option and go back to using an 'update bits' op for the mask registers. We can always add the option back if some driver ends up needing it in the future. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-7-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Remove inappropriate uses of regmap_irq_update_bits()Aidan MacDonald
regmap_irq_update_bits() is misnamed and should only be used for updating mask registers, since it checks the mask_writeonly flag. However, it was also used for updating wake and type registers. It's safe to replace these uses with regmap_update_bits() because there are no users of the mask_writeonly flag. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-6-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Remove an unnecessary restriction on type_in_maskAidan MacDonald
Check types_supported instead of checking type_rising/falling_val when using type_in_mask interrupts. This makes the intent clearer and allows a type_in_mask irq to support level or edge triggers, rather than only edge triggers. Update the documentation and comments to reflect the new behavior. This shouldn't affect existing drivers, because if they didn't set types_supported properly the type buffer wouldn't be updated. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-5-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Cleanup sizeof(...) use in memory allocationAidan MacDonald
Instead of mentioning unsigned int directly, use a sizeof(...) involving the buffer we're allocating to ensure the types don't get out of sync. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-4-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Remove unused type_reg_stride fieldAidan MacDonald
It appears that no chip ever required a nonzero type_reg_stride and commit 1066cfbdfa3f ("regmap-irq: Extend sub-irq to support non-fixed reg strides") broke support. Just remove the field. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-3-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap-irq: Convert bool bitfields to unsigned intAidan MacDonald
Use 'unsigned int' for bitfields for consistency with most other kernel code. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220623211420.918875-2-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-29regmap: Merge up fixesMark Brown
Needed for the regmap-irq rework.
2022-06-27driver core: fw_devlink: Allow firmware to mark devices as best effortSaravana Kannan
When firmware sets the FWNODE_FLAG_BEST_EFFORT flag for a fwnode, fw_devlink will do a best effort ordering for that device where it'll only enforce the probe/suspend/resume ordering of that device with suppliers that have drivers. The driver of that device can then decide if it wants to defer probe or probe without the suppliers. This will be useful for avoid probe delays of the console device that were caused by commit 71066545b48e ("driver core: Set fw_devlink.strict=1 by default"). Fixes: 71066545b48e ("driver core: Set fw_devlink.strict=1 by default") Reported-by: Sascha Hauer <sha@pengutronix.de> Reported-by: Peng Fan <peng.fan@nxp.com> Tested-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220623080344.783549-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-27driver core: fix potential deadlock in __driver_attachZhang Wensheng
In __driver_attach function, There are also AA deadlock problem, like the commit b232b02bf3c2 ("driver core: fix deadlock in __device_attach"). stack like commit b232b02bf3c2 ("driver core: fix deadlock in __device_attach"). list below: In __driver_attach function, The lock holding logic is as follows: ... __driver_attach if (driver_allows_async_probing(drv)) device_lock(dev) // get lock dev async_schedule_dev(__driver_attach_async_helper, dev); // func async_schedule_node async_schedule_node_domain(func) entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC); /* when fail or work limit, sync to execute func, but __driver_attach_async_helper will get lock dev as will, which will lead to A-A deadlock. */ if (!entry || atomic_read(&entry_count) > MAX_WORK) { func; else queue_work_node(node, system_unbound_wq, &entry->work) device_unlock(dev) As above show, when it is allowed to do async probes, because of out of memory or work limit, async work is not be allowed, to do sync execute instead. it will lead to A-A deadlock because of __driver_attach_async_helper getting lock dev. Reproduce: and it can be reproduce by make the condition (if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like below: [ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00004000 [ 370.788865] Call Trace: [ 370.789374] <TASK> [ 370.789841] __schedule+0x482/0x1050 [ 370.790613] schedule+0x92/0x1a0 [ 370.791290] schedule_preempt_disabled+0x2c/0x50 [ 370.792256] __mutex_lock.isra.0+0x757/0xec0 [ 370.793158] __mutex_lock_slowpath+0x1f/0x30 [ 370.794079] mutex_lock+0x50/0x60 [ 370.794795] __device_driver_lock+0x2f/0x70 [ 370.795677] ? driver_probe_device+0xd0/0xd0 [ 370.796576] __driver_attach_async_helper+0x1d/0xd0 [ 370.797318] ? driver_probe_device+0xd0/0xd0 [ 370.797957] async_schedule_node_domain+0xa5/0xc0 [ 370.798652] async_schedule_node+0x19/0x30 [ 370.799243] __driver_attach+0x246/0x290 [ 370.799828] ? driver_allows_async_probing+0xa0/0xa0 [ 370.800548] bus_for_each_dev+0x9d/0x130 [ 370.801132] driver_attach+0x22/0x30 [ 370.801666] bus_add_driver+0x290/0x340 [ 370.802246] driver_register+0x88/0x140 [ 370.802817] ? virtio_scsi_init+0x116/0x116 [ 370.803425] scsi_register_driver+0x1a/0x30 [ 370.804057] init_sd+0x184/0x226 [ 370.804533] do_one_initcall+0x71/0x3a0 [ 370.805107] kernel_init_freeable+0x39a/0x43a [ 370.805759] ? rest_init+0x150/0x150 [ 370.806283] kernel_init+0x26/0x230 [ 370.806799] ret_from_fork+0x1f/0x30 To fix the deadlock, move the async_schedule_dev outside device_lock, as we can see, in async_schedule_node_domain, the parameter of queue_work_node is system_unbound_wq, so it can accept concurrent operations. which will also not change the code logic, and will not lead to deadlock. Fixes: ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver") Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-27devtmpfs: fix the dangling pointer of global devtmpfsd threadYangxi Xiang
When the devtmpfs fails to mount, a dangling pointer still remains in global. Specifically, the err variable is passed by a pointer to the devtmpfsd. When the devtmpfsd exits, it sets the error and completes the setup_done. In this situation, the thread pointer is not set to null. After the devtmpfsd exited, the devtmpfs can wakes up the destroyed devtmpfsd thread by wake_up_process if a device change event comes. Signed-off-by: Yangxi Xiang <xyangxi5@gmail.com> Link: https://lore.kernel.org/r/20220627120409.11174-1-xyangxi5@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-27Revert "devcoredump: remove the useless gfp_t parameter in dev_coredumpv and ↵Greg Kroah-Hartman
dev_coredumpm" This reverts commit 77515ebaf01920e2db49e04672ef669a7c2907f2 as it causes build problems in linux-next. It needs to be reintroduced in a way that can allow the api to evolve and not require a "flag day" to catch all users. Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au Cc: Duoming Zhou <duoming@zju.edu.cn> Cc: Brian Norris <briannorris@chromium.org> Cc: Johannes Berg <johannes@sipsolutions.net> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-27regmap: Don't warn about cache only mode for devices with no cacheMark Brown
For devices with no cache it can make sense to use cache only mode as a mechanism for trapping writes to hardware which is inaccessible but since no cache is equivalent to cache bypass we force such devices into bypass mode. This means that our check that bypass and cache only mode aren't both enabled simultanously is less sensible for devices without a cache so relax it. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220622171723.1235749-1-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-27x86/bugs: Report AMD retbleed vulnerabilityAlexandre Chartre
Report that AMD x86 CPUs are vulnerable to the RETBleed (Arbitrary Speculative Code Execution with Return Instructions) attack. [peterz: add hygon] [kim: invert parity; fam15h] Co-developed-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-26Merge tag 'mm-hotfixes-stable-2022-06-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "Minor things, mainly - mailmap updates, MAINTAINERS updates, etc. Fixes for this merge window: - fix for a damon boot hang, from SeongJae - fix for a kfence warning splat, from Jason Donenfeld - fix for zero-pfn pinning, from Alex Williamson - fix for fallocate hole punch clearing, from Mike Kravetz Fixes for previous releases: - fix for a performance regression, from Marcelo - fix for a hwpoisining BUG from zhenwei pi" * tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: add entry for Christian Marangi mm/memory-failure: disable unpoison once hw error happens hugetlbfs: zero partial pages during fallocate hole punch mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py mm: re-allow pinning of zero pfns mm/kfence: select random number before taking raw lock MAINTAINERS: add maillist information for LoongArch MAINTAINERS: update MM tree references MAINTAINERS: update Abel Vesa's email MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer MAINTAINERS: add Miaohe Lin as a memory-failure reviewer mailmap: add alias for jarkko@profian.com mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal mm: lru_cache_disable: use synchronize_rcu_expedited mm/page_isolation.c: fix one kernel-doc comment
2022-06-24Merge tag 'regmap-fix-v5.19-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fixes from Mark Brown: "Two sets of fixes - one for things that were missed with the support for custom bulk I/O operations introduced in the last merge window, and another for some long standing issues with regmap-irq which affect a fairly small subset of devices" * tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap-irq: Fix offset/index mismatch in read_sub_irq_data() regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips regmap: Wire up regmap_config provided bulk write in missed functions regmap: Make regmap_noinc_read() return -ENOTSUPP if map->read isn't set regmap: Re-introduce bulk read support check in regmap_bulk_read()
2022-06-22regmap-irq: Fix offset/index mismatch in read_sub_irq_data()Aidan MacDonald
We need to divide the sub-irq status register offset by register stride to get an index for the status buffer to avoid an out of bounds write when the register stride is greater than 1. Fixes: a2d21848d921 ("regmap: regmap-irq: Add main status register support") Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220620200644.1961936-3-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-22regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chipsAidan MacDonald
When enabling a type_in_mask irq, the type_buf contents must be AND'd with the mask of the IRQ we're enabling to avoid enabling other IRQs by accident, which can happen if several type_in_mask irqs share a mask register. Fixes: bc998a730367 ("regmap: irq: handle HW using separate rising/falling edge interrupts") Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220620200644.1961936-2-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-21devcoredump: remove the useless gfp_t parameter in dev_coredumpv and ↵Duoming Zhou
dev_coredumpm The dev_coredumpv() and dev_coredumpm() could not be used in atomic context, because they call kvasprintf_const() and kstrdup() with GFP_KERNEL parameter. The process is shown below: dev_coredumpv(.., gfp_t gfp) dev_coredumpm(.., gfp_t gfp) dev_set_name kobject_set_name_vargs kvasprintf_const(GFP_KERNEL, ...); //may sleep kstrdup(s, GFP_KERNEL); //may sleep This patch removes gfp_t parameter of dev_coredumpv() and dev_coredumpm() and changes the gfp_t parameter of kzalloc() in dev_coredumpm() to GFP_KERNEL in order to show they could not be used in atomic context. Fixes: 833c95456a70 ("device coredump: add new device coredump class") Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Link: https://lore.kernel.org/r/df72af3b1862bac7d8e793d1f3931857d3779dfd.1654569290.git.duoming@zju.edu.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-20regmap: Wire up regmap_config provided bulk write in missed functionsJavier Martinez Canillas
There are some functions that were missed by commit d77e74561368 ("regmap: Add bulk read/write callbacks into regmap_config") when support to define bulk read/write callbacks in regmap_config was introduced. The regmap_bulk_write() and regmap_noinc_write() functions weren't changed to use the added map->write instead of the map->bus->write handler. Also, the regmap_can_raw_write() was not modified to take map->write into account. So will only return true if a bus with a .write callback is set. Fixes: d77e74561368 ("regmap: Add bulk read/write callbacks into regmap_config") Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Marek Vasut <marex@denx.de> Link: https://lore.kernel.org/r/20220616073435.1988219-4-javierm@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-20regmap: Make regmap_noinc_read() return -ENOTSUPP if map->read isn't setJavier Martinez Canillas
Before adding support to define bulk read/write callbacks in regmap_config by the commit d77e74561368 ("regmap: Add bulk read/write callbacks into regmap_config"), the regmap_noinc_read() function returned an errno early a map->bus->read callback wasn't set. But that commit dropped the check and now a call to _regmap_raw_read() is attempted even when bulk read operations are not supported. That function checks for map->read anyways but there's no point to continue if the read can't succeed. Also is a fragile assumption to make so is better to make it fail earlier. Fixes: d77e74561368 ("regmap: Add bulk read/write callbacks into regmap_config") Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Marek Vasut <marex@denx.de> Link: https://lore.kernel.org/r/20220616073435.1988219-3-javierm@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-20regmap: Re-introduce bulk read support check in regmap_bulk_read()Javier Martinez Canillas
Support for drivers to define bulk read/write callbacks in regmap_config was introduced by the commit d77e74561368 ("regmap: Add bulk read/write callbacks into regmap_config"), but this commit wrongly dropped a check in regmap_bulk_read() to determine whether bulk reads can be done or not. Before that commit, it was checked if map->bus was set. Now has to check if a map->read callback has been set. Fixes: d77e74561368 ("regmap: Add bulk read/write callbacks into regmap_config") Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Marek Vasut <marex@denx.de> Link: https://lore.kernel.org/r/20220616073435.1988219-2-javierm@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-17Merge tag 'fs_for_v5.19-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull writeback and ext2 fixes from Jan Kara: "A fix for writeback bug which prevented machines with kdevtmpfs from booting and also one small ext2 bugfix in IO error handling" * tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: init: Initialize noop_backing_dev_info early ext2: fix fs corruption when trying to remove a non-empty directory with IO error
2022-06-16mm/memory-failure: disable unpoison once hw error happenszhenwei pi
Currently unpoison_memory(unsigned long pfn) is designed for soft poison(hwpoison-inject) only. Since 17fae1294ad9d, the KPTE gets cleared on a x86 platform once hardware memory corrupts. Unpoisoning a hardware corrupted page puts page back buddy only, the kernel has a chance to access the page with *NOT PRESENT* KPTE. This leads BUG during accessing on the corrupted KPTE. Suggested by David&Naoya, disable unpoison mechanism when a real HW error happens to avoid BUG like this: Unpoison: Software-unpoisoned page 0x61234 BUG: unable to handle page fault for address: ffff888061234000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G M OE 5.18.0.bm.1-amd64 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ... RIP: 0010:clear_page_erms+0x7/0x10 Code: ... RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000 RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000 RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001 FS: 00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> prep_new_page+0x151/0x170 get_page_from_freelist+0xca0/0xe20 ? sysvec_apic_timer_interrupt+0xab/0xc0 ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 __alloc_pages+0x17e/0x340 __folio_alloc+0x17/0x40 vma_alloc_folio+0x84/0x280 __handle_mm_fault+0x8d4/0xeb0 handle_mm_fault+0xd5/0x2a0 do_user_addr_fault+0x1d0/0x680 ? kvm_read_and_reset_apf_flags+0x3b/0x50 exc_page_fault+0x78/0x170 asm_exc_page_fault+0x27/0x30 Link: https://lkml.kernel.org/r/20220615093209.259374-2-pizhenwei@bytedance.com Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support") Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned") Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> [5.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16init: Initialize noop_backing_dev_info earlyJan Kara
noop_backing_dev_info is used by superblocks of various pseudofilesystems such as kdevtmpfs. After commit 10e14073107d ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error") this broke because __mark_inode_dirty() started to access more fields from noop_backing_dev_info and this led to crashes inside locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty(). Fix the problem by initializing noop_backing_dev_info before the filesystems get mounted. Fixes: 10e14073107d ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error") Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-15Merge tag 'regmap-field-bit-helpers' of ↵Mark Brown
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into regmap-5.20 regmap: Add regmap_field helpers for simple bit operations Add simple bit operations for setting, clearing and testing specific bits with regmap_field.
2022-06-15regmap: provide regmap_field helpers for simple bit operationsLi Chen
We have set/clear/test operations for regmap, but not for regmap_field yet. So let's introduce regmap_field helpers too. In many instances regmap_field_update_bits() is used for simple bit setting and clearing. In these cases the last argument is redundant and we can hide it with a static inline function. This adds three new helpers for simple bit operations: set_bits, clear_bits and test_bits (the last one defined as a regular function). Signed-off-by: Li Chen <lchen@ambarella.com> Link: https://lore.kernel.org/r/180eef422c3.deae9cd960729.8518395646822099769@zohomail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-14Merge tag 'x86-bugs-2022-06-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 MMIO stale data fixes from Thomas Gleixner: "Yet another hw vulnerability with a software mitigation: Processor MMIO Stale Data. They are a class of MMIO-related weaknesses which can expose stale data by propagating it into core fill buffers. Data which can then be leaked using the usual speculative execution methods. Mitigations include this set along with microcode updates and are similar to MDS and TAA vulnerabilities: VERW now clears those buffers too" * tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation/mmio: Print SMT warning KVM: x86/speculation: Disable Fill buffer clear within guests x86/speculation/mmio: Reuse SRBDS mitigation for SBDS x86/speculation/srbds: Update SRBDS mitigation selection x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data x86/speculation/mmio: Enable CPU Fill buffer clearing on idle x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data x86/speculation: Add a common function for MD_CLEAR mitigation update x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug Documentation: Add documentation for Processor MMIO Stale Data
2022-06-10driver core: Introduce device_find_any_child() helperAndy Shevchenko
There are several places in the kernel where this kind of functionality is being used. Provide a generic helper for such cases. Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220610120219.18988-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10driver core: Delete driver_deferred_probe_check_state()Saravana Kannan
The function is no longer used. So delete it. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-10-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10driver core: Set fw_devlink.strict=1 by defaultSaravana Kannan
Now that deferred_probe_timeout is non-zero by default, fw_devlink will never permanently block the probing of devices. It'll try its best to probe the devices in the right order and then finally let devices probe even if their suppliers don't have any drivers. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20220601070707.3946847-8-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>