summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-12-15mm: use stack_depot for recording kmemleak's backtraceZhaoyang Huang
Using stack_depot to record kmemleak's backtrace which has been implemented on slub for reducing redundant information. [akpm@linux-foundation.org: fix build - remove now-unused __save_stack_trace()] [zhaoyang.huang@unisoc.com: v3] Link: https://lkml.kernel.org/r/1667101354-4669-1-git-send-email-zhaoyang.huang@unisoc.com [akpm@linux-foundation.org: fix v3 layout oddities] [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/1666864224-27541-1-git-send-email-zhaoyang.huang@unisoc.com Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: ke.wang <ke.wang@unisoc.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhaoyang Huang <huangzhaoyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15maple_tree: update copyright dates for test codeLiam Howlett
Add the span to the year of the development. Link: https://lkml.kernel.org/r/20221025173709.2718725-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15maple_tree: fix mas_find_rev() commentLiam Howlett
mas_find_rev() uses mas_prev_entry(), not mas_next_entry(), correct comment. Link: https://lkml.kernel.org/r/20221025173756.2719616-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15mm/gup_test: free memory allocated via kvcalloc() using kvfree()David Hildenbrand
We have to free via kvfree(), not via kfree(). Link: https://lkml.kernel.org/r/20221212182018.264900-1-david@redhat.com Fixes: c77369b437f9 ("mm/gup_test: start/stop/read functionality for PIN LONGTERM test") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15cifs: set correct tcon status after initial tree connectPaulo Alcantara
cifs_tcon::status wasn't correctly updated to TID_GOOD after initial tree connect thus staying at TID_NEW as long as it was connected. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-15Merge tag '6.2-rc-smb3-client-fixes-part1' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull cifs client updates from Steve French: - SMB3.1.1 POSIX Extensions fixes - remove use of generic_writepages() and ->cifs_writepage(), in favor of ->cifs_writepages() and ->migrate_folio() - memory management fixes - mount parm parsing fixes - minor cleanup fixes * tag '6.2-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: cifs: Remove duplicated include in cifsglob.h cifs: fix oops during encryption cifs: print warning when conflicting soft vs. hard mount options specified cifs: fix missing display of three mount options cifs: fix various whitespace errors in headers cifs: minor cleanup of some headers cifs: skip alloc when request has no pages cifs: remove ->writepage cifs: stop using generic_writepages cifs: wire up >migrate_folio cifs: Parse owner/group for stat in smb311 posix extensions cifs: Add "extbuf" and "extbuflen" args to smb2_compound_op() Fix path in cifs/usage.rst
2022-12-15Merge tag 'i2c-for-6.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "Core got a new helper 'i2c_client_get_device_id()', designware got some bigger updates, the rest is driver updates all over the place" * tag 'i2c-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (41 commits) i2c: ismt: Fix an out-of-bounds bug in ismt_access() i2c: mux: reg: check return value after calling platform_get_resource() i2c: xiic: Make sure to disable clock on .remove() i2c: hisi: Add support to get clock frequency from clock i2c: pxa-pci: fix missing pci_disable_device() on error in ce4100_i2c_probe i2c: slave-eeprom: Convert to i2c's .probe_new() i2c: mux: pca954x: Convert to i2c's .probe_new() drivers/i2c: use simple i2c probe i2c: mux: pca9541: switch to using .probe_new i2c: gpio: Fix potential unused warning for 'i2c_gpio_dt_ids' i2c: qcom-geni: add support for I2C Master Hub variant i2c: qcom-geni: add desc struct to prepare support for I2C Master Hub variant soc: qcom: geni-se: add support for I2C Master Hub wrapper variant soc: qcom: geni-se: add desc struct to specify clocks from device match data dt-bindings: i2c: qcom-geni: document I2C Master Hub serial I2C engine dt-bindings: qcom: geni-se: document I2C Master Hub wrapper variant dt-bindings: i2c: renesas,riic: Document RZ/Five SoC i2c: tegra: Set ACPI node as primary fwnode i2c: smbus: add DDR support for SPD i2c: /pasemi: PASemi I2C controller IRQ enablement ...
2022-12-15rtc: ds1742: use devm_platform_get_and_ioremap_resource()Minghao Chi
Convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Link: https://lore.kernel.org/r/202211220947194856561@zte.com.cn Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15rtc: mxc_v2: Add missing clk_disable_unprepare()GUO Zihua
The call to clk_disable_unprepare() is left out in the error handling of devm_rtc_allocate_device. Add it back. Fixes: 5490a1e018a4 ("rtc: mxc_v2: fix possible race condition") Signed-off-by: GUO Zihua <guozihua@huawei.com> Link: https://lore.kernel.org/r/20221122085046.21689-1-guozihua@huawei.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15rtc: rs5c313: correct some spelling mistakesZhang Jiaming
Change 'modifed' to 'modified'. Change 'Updata' to 'Update'. Change 'Initiatlize' to 'Initialize'. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Link: https://lore.kernel.org/r/20220622085344.23519-1-jiaming@nfschina.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15rtc: at91rm9200: Fix syntax errors in commentsXiang wangx
Delete the redundant word 'is'. Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com> Link: https://lore.kernel.org/r/20220605083515.9514-1-wangxiang@cdjrlc.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15rtc: remove duplicated words in commentsshaomin Deng
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com> Link: https://lore.kernel.org/r/20220808152354.3641-1-dengshaomin@cdjrlc.com Link: https://lore.kernel.org/r/20220808153454.6844-1-dengshaomin@cdjrlc.com Link: https://lore.kernel.org/r/20220808152822.5012-1-dengshaomin@cdjrlc.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15rtc: rv3028: Use IRQ flags obtained from device tree if availableWadim Egorov
Make the interrupt pin of the RV3028 usable with GPIO controllers without level type IRQs support, such as the TI Davinci GPIO controller. Therefore, allow the IRQ type to be passed from the device tree if available. Based on commit d4785b46345c ("rtc: pcf2127: use IRQ flags obtained from device tree if available") Signed-off-by: Wadim Egorov <w.egorov@phytec.de> Link: https://lore.kernel.org/r/20221208133605.4193907-1-w.egorov@phytec.de Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15rtc: ds1307: use sysfs_emit() to instead of scnprintf()ye xingchen
Follow the advice of the Documentation/filesystems/sysfs.rst and show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Link: https://lore.kernel.org/r/202212051134455911470@zte.com.cn Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15rtc: isl12026: drop obsolete dependency on COMPILE_TESTJean Delvare
Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), it is possible to test-build any driver which depends on OF on any architecture by explicitly selecting OF. Therefore depending on COMPILE_TEST as an alternative is no longer needed. It is actually better to always build such drivers with OF enabled, so that the test builds are closer to how each driver will actually be built on its intended target. Building them without OF may not test much as the compiler will optimize out potentially large parts of the code. In the worst case, this could even pop false positive warnings. Dropping COMPILE_TEST here improves the quality of our testing and avoids wasting time on non-existent issues. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20221124154359.039be06c@endymion.delvare Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2022-12-15igc: Set Qbv start_time and end_time to end_time if not being configured in GCLTan Tee Min
The default setting of end_time minus start_time is whole 1 second. Thus, if it's not being configured in any GCL entry then it will be staying at original 1 second. This patch is changing the start_time and end_time to be end_time as if setting zero will be having weird HW behavior where the gate will not be fully closed. Fixes: ec50a9d437f0 ("igc: Add support for taprio offloading") Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-12-15igc: recalculate Qbv end_time by considering cycle timeTan Tee Min
Qbv users can specify a cycle time that is not equal to the total GCL intervals. Hence, recalculation is necessary here to exclude the time interval that exceeds the cycle time. As those GCL which exceeds the cycle time will be truncated. According to IEEE Std. 802.1Q-2018 section 8.6.9.2, once the end of the list is reached, it will switch to the END_OF_CYCLE state and leave the gates in the same state until the next cycle is started. Fixes: ec50a9d437f0 ("igc: Add support for taprio offloading") Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-12-15igc: allow BaseTime 0 enrollment for QbvTan Tee Min
Introduce qbv_enable flag in igc_adapter struct to store the Qbv on/off. So this allow the BaseTime to enroll with zero value. Fixes: 61572d5f8f91 ("igc: Simplify TSN flags handling") Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-12-15igc: Add checking for basetime less than zeroMuhammad Husaini Zulkifli
Using the tc qdisc command, the user can set basetime to any value. Checking should be done on the driver's side to prevent registering basetime values that are less than zero. Fixes: ec50a9d437f0 ("igc: Add support for taprio offloading") Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-12-15igc: Use strict cycles for Qbv schedulingVinicius Costa Gomes
Configuring strict cycle mode in the controller forces more well behaved transmissions when taprio is offloaded. When set this strict_cycle and strict_end, transmission is not enabled if the whole packet cannot be completed before end of the Qbv cycle. Fixes: 82faa9b79950 ("igc: Add support for ETF offloading") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Aravindhan Gunasekaran <aravindhan.gunasekaran@intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-12-15igc: Enhance Qbv scheduling by using first flag bitVinicius Costa Gomes
The I225 hardware has a limitation that packets can only be scheduled in the [0, cycle-time] interval. So, scheduling a packet to the start of the next cycle doesn't usually work. To overcome this, we use the Transmit Descriptor first flag to indicates that a packet should be the first packet (from a queue) in a cycle according to the section 7.5.2.9.3.4 The First Packet on Each QBV Cycle in Intel Discrete I225/6 User Manual. But this only works if there was any packet from that queue during the current cycle, to avoid this issue, we issue an empty packet if that's not the case. Also require one more descriptor to be available, to take into account the empty packet that might be issued. Test Setup: Talker: Use l2_tai to generate the launchtime into packet load. Listener: Use timedump.c to compute the delta between packet arrival and LaunchTime packet payload. Test Result: Before: 1666000610127300000,1666000610127300096,96,621273 1666000610127400000,1666000610127400192,192,621274 1666000610127500000,1666000610127500032,32,621275 1666000610127600000,1666000610127600128,128,621276 1666000610127700000,1666000610127700224,224,621277 1666000610127800000,1666000610127800064,64,621278 1666000610127900000,1666000610127900160,160,621279 1666000610128000000,1666000610128000000,0,621280 1666000610128100000,1666000610128100096,96,621281 1666000610128200000,1666000610128200192,192,621282 1666000610128300000,1666000610128300032,32,621283 1666000610128400000,1666000610128301056,-98944,621284 1666000610128500000,1666000610128302080,-197920,621285 1666000610128600000,1666000610128302848,-297152,621286 1666000610128700000,1666000610128303872,-396128,621287 1666000610128800000,1666000610128304896,-495104,621288 1666000610128900000,1666000610128305664,-594336,621289 1666000610129000000,1666000610128306688,-693312,621290 1666000610129100000,1666000610128307712,-792288,621291 1666000610129200000,1666000610128308480,-891520,621292 1666000610129300000,1666000610128309504,-990496,621293 1666000610129400000,1666000610128310528,-1089472,621294 1666000610129500000,1666000610128311296,-1188704,621295 1666000610129600000,1666000610128312320,-1287680,621296 1666000610129700000,1666000610128313344,-1386656,621297 1666000610129800000,1666000610128314112,-1485888,621298 1666000610129900000,1666000610128315136,-1584864,621299 1666000610130000000,1666000610128316160,-1683840,621300 1666000610130100000,1666000610128316928,-1783072,621301 1666000610130200000,1666000610128317952,-1882048,621302 1666000610130300000,1666000610128318976,-1981024,621303 1666000610130400000,1666000610128319744,-2080256,621304 1666000610130500000,1666000610128320768,-2179232,621305 1666000610130600000,1666000610128321792,-2278208,621306 1666000610130700000,1666000610128322816,-2377184,621307 1666000610130800000,1666000610128323584,-2476416,621308 1666000610130900000,1666000610128324608,-2575392,621309 1666000610131000000,1666000610128325632,-2674368,621310 1666000610131100000,1666000610128326400,-2773600,621311 1666000610131200000,1666000610128327424,-2872576,621312 1666000610131300000,1666000610128328448,-2971552,621313 1666000610131400000,1666000610128329216,-3070784,621314 1666000610131500000,1666000610131500032,32,621315 1666000610131600000,1666000610131600128,128,621316 1666000610131700000,1666000610131700224,224,621317 After: 1666073510646200000,1666073510646200064,64,2676462 1666073510646300000,1666073510646300160,160,2676463 1666073510646400000,1666073510646400256,256,2676464 1666073510646500000,1666073510646500096,96,2676465 1666073510646600000,1666073510646600192,192,2676466 1666073510646700000,1666073510646700032,32,2676467 1666073510646800000,1666073510646800128,128,2676468 1666073510646900000,1666073510646900224,224,2676469 1666073510647000000,1666073510647000064,64,2676470 1666073510647100000,1666073510647100160,160,2676471 1666073510647200000,1666073510647200256,256,2676472 1666073510647300000,1666073510647300096,96,2676473 1666073510647400000,1666073510647400192,192,2676474 1666073510647500000,1666073510647500032,32,2676475 1666073510647600000,1666073510647600128,128,2676476 1666073510647700000,1666073510647700224,224,2676477 1666073510647800000,1666073510647800064,64,2676478 1666073510647900000,1666073510647900160,160,2676479 1666073510648000000,1666073510648000000,0,2676480 1666073510648100000,1666073510648100096,96,2676481 1666073510648200000,1666073510648200192,192,2676482 1666073510648300000,1666073510648300032,32,2676483 1666073510648400000,1666073510648400128,128,2676484 1666073510648500000,1666073510648500224,224,2676485 1666073510648600000,1666073510648600064,64,2676486 1666073510648700000,1666073510648700160,160,2676487 1666073510648800000,1666073510648800000,0,2676488 1666073510648900000,1666073510648900096,96,2676489 1666073510649000000,1666073510649000192,192,2676490 1666073510649100000,1666073510649100032,32,2676491 1666073510649200000,1666073510649200128,128,2676492 1666073510649300000,1666073510649300224,224,2676493 1666073510649400000,1666073510649400064,64,2676494 1666073510649500000,1666073510649500160,160,2676495 1666073510649600000,1666073510649600000,0,2676496 1666073510649700000,1666073510649700096,96,2676497 1666073510649800000,1666073510649800192,192,2676498 1666073510649900000,1666073510649900032,32,2676499 1666073510650000000,1666073510650000128,128,2676500 Fixes: 82faa9b79950 ("igc: Add support for ETF offloading") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Co-developed-by: Aravindhan Gunasekaran <aravindhan.gunasekaran@intel.com> Signed-off-by: Aravindhan Gunasekaran <aravindhan.gunasekaran@intel.com> Co-developed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Signed-off-by: Malli C <mallikarjuna.chilakala@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-12-15Merge tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Replace deprecated git://github.com link in MAINTAINERS (Palmer Dabbelt) - Simplify vfio/mlx5 with module_pci_driver() helper (Shang XiaoJing) - Drop unnecessary buffer from ACPI call (Rafael Mendonca) - Correct latent missing include issue in iova-bitmap and fix support for unaligned bitmaps. Follow-up with better fix through refactor (Joao Martins) - Rework ccw mdev driver to split private data from parent structure, better aligning with the mdev lifecycle and allowing us to remove a temporary workaround (Eric Farman) - Add an interface to get an estimated migration data size for a device, allowing userspace to make informed decisions, ex. more accurately predicting VM downtime (Yishai Hadas) - Fix minor typo in vfio/mlx5 array declaration (Yishai Hadas) - Simplify module and Kconfig through consolidating SPAPR/EEH code and config options and folding virqfd module into main vfio module (Jason Gunthorpe) - Fix error path from device_register() across all vfio mdev and sample drivers (Alex Williamson) - Define migration pre-copy interface and implement for vfio/mlx5 devices, allowing portions of the device state to be saved while the device continues operation, towards reducing the stop-copy state size (Jason Gunthorpe, Yishai Hadas, Shay Drory) - Implement pre-copy for hisi_acc devices (Shameer Kolothum) - Fixes to mdpy mdev driver remove path and error path on probe (Shang XiaoJing) - vfio/mlx5 fixes for incorrect return after copy_to_user() fault and incorrect buffer freeing (Dan Carpenter) * tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfio: (42 commits) vfio/mlx5: error pointer dereference in error handling vfio/mlx5: fix error code in mlx5vf_precopy_ioctl() samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe() hisi_acc_vfio_pci: Enable PRE_COPY flag hisi_acc_vfio_pci: Move the dev compatibility tests for early check hisi_acc_vfio_pci: Introduce support for PRE_COPY state transitions hisi_acc_vfio_pci: Add support for precopy IOCTL vfio/mlx5: Enable MIGRATION_PRE_COPY flag vfio/mlx5: Fallback to STOP_COPY upon specific PRE_COPY error vfio/mlx5: Introduce multiple loads vfio/mlx5: Consider temporary end of stream as part of PRE_COPY vfio/mlx5: Introduce vfio precopy ioctl implementation vfio/mlx5: Introduce SW headers for migration states vfio/mlx5: Introduce device transitions of PRE_COPY vfio/mlx5: Refactor to use queue based data chunks vfio/mlx5: Refactor migration file state vfio/mlx5: Refactor MKEY usage vfio/mlx5: Refactor PD usage vfio/mlx5: Enforce a single SAVE command at a time vfio: Extend the device migration protocol with PRE_COPY ...
2022-12-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM64: - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function x86: - Allow compiling out SMM support - Cleanup and documentation of SMM state save area format - Preserve interrupt shadow in SMM state save area - Respond to generic signals during slow page faults - Fixes and optimizations for the non-executable huge page errata fix. - Reprogram all performance counters on PMU filter change - Cleanups to Hyper-V emulation and tests - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) - Advertise several new Intel features - x86 Xen-for-KVM: - Allow the Xen runstate information to cross a page boundary - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured - Add support for 32-bit guests in SCHEDOP_poll - Notable x86 fixes and cleanups: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. - Advertise (on AMD) that the SMM_CTL MSR is not supported - Remove unnecessary exports Generic: - Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: - Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. - Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. - Introduce actual atomics for clear/set_bit() in selftests - Add support for pinning vCPUs in dirty_log_perf_test. - Rename the so called "perf_util" framework to "memstress". - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. - Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). - A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. - x86-specific selftest changes: - Clean up x86's page table management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. - Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: - Remove deleted ioctls from documentation - Clean up the docs for the x86 MSR filter. - Various fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits) KVM: x86: Add proper ReST tables for userspace MSR exits/flags KVM: selftests: Allocate ucall pool from MEM_REGION_DATA KVM: arm64: selftests: Align VA space allocator with TTBR0 KVM: arm64: Fix benign bug with incorrect use of VA_BITS KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: x86: Advertise that the SMM_CTL MSR is not supported KVM: x86: remove unnecessary exports KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" tools: KVM: selftests: Convert clear/set_bit() to actual atomics tools: Drop "atomic_" prefix from atomic test_and_set_bit() tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests perf tools: Use dedicated non-atomic clear/set bit helpers tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers KVM: arm64: selftests: Enable single-step without a "full" ucall() KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself KVM: Remove stale comment about KVM_REQ_UNHALT KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR KVM: Reference to kvm_userspace_memory_region in doc and comments KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl ...
2022-12-15io_uring: use call_rcu_hurry if signaling an eventfdDylan Yudaken
io_uring uses call_rcu in the case it needs to signal an eventfd as a result of an eventfd signal, since recursing eventfd signals are not allowed. This should be calling the new call_rcu_hurry API to not delay the signal. Signed-off-by: Dylan Yudaken <dylany@meta.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Paul E. McKenney <paulmck@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Link: https://lore.kernel.org/r/20221215184138.795576-1-dylany@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-15x86/mm: Ensure forced page table splittingDave Hansen
There are a few kernel users like kfence that require 4k pages to work correctly and do not support large mappings. They use set_memory_4k() to break down those large mappings. That, in turn relies on cpa_data->force_split option to indicate to set_memory code that it should split page tables regardless of whether the need to be. But, a recent change added an optimization which would return early if a set_memory request came in that did not change permissions. It did not consult ->force_split and would mistakenly optimize away the splitting that set_memory_4k() needs. This broke kfence. Skip the same-permission optimization when ->force_split is set. Fixes: 127960a05548 ("x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Marco Elver <elver@google.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/CA+G9fYuFxZTxkeS35VTZMXwQvohu73W3xbZ5NtjebsVvH6hCuA@mail.gmail.com/
2022-12-15x86/kasan: Populate shadow for shared chunk of the CPU entry areaSean Christopherson
Popuplate the shadow for the shared portion of the CPU entry area, i.e. the read-only IDT mapping, during KASAN initialization. A recent change modified KASAN to map the per-CPU areas on-demand, but forgot to keep a shadow for the common area that is shared amongst all CPUs. Map the common area in KASAN init instead of letting idt_map_in_cea() do the dirty work so that it Just Works in the unlikely event more shared data is shoved into the CPU entry area. The bug manifests as a not-present #PF when software attempts to lookup an IDT entry, e.g. when KVM is handling IRQs on Intel CPUs (KVM performs direct CALL to the IRQ handler to avoid the overhead of INTn): BUG: unable to handle page fault for address: fffffbc0000001d8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 16c03a067 P4D 16c03a067 PUD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 5 PID: 901 Comm: repro Tainted: G W 6.1.0-rc3+ #410 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kasan_check_range+0xdf/0x190 vmx_handle_exit_irqoff+0x152/0x290 [kvm_intel] vcpu_run+0x1d89/0x2bd0 [kvm] kvm_arch_vcpu_ioctl_run+0x3ce/0xa70 [kvm] kvm_vcpu_ioctl+0x349/0x900 [kvm] __x64_sys_ioctl+0xb8/0xf0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand") Reported-by: syzbot+8cdd16fd5a6c0565e227@syzkaller.appspotmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221110203504.1985010-6-seanjc@google.com
2022-12-15x86/kasan: Add helpers to align shadow addresses up and downSean Christopherson
Add helpers to dedup code for aligning shadow address up/down to page boundaries when translating an address to its shadow. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Link: https://lkml.kernel.org/r/20221110203504.1985010-5-seanjc@google.com
2022-12-15x86/kasan: Rename local CPU_ENTRY_AREA variables to shorten namesSean Christopherson
Rename the CPU entry area variables in kasan_init() to shorten their names, a future fix will reference the beginning of the per-CPU portion of the CPU entry area, and shadow_cpu_entry_per_cpu_begin is a bit much. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Link: https://lkml.kernel.org/r/20221110203504.1985010-4-seanjc@google.com
2022-12-15x86/mm: Populate KASAN shadow for entire per-CPU range of CPU entry areaSean Christopherson
Populate a KASAN shadow for the entire possible per-CPU range of the CPU entry area instead of requiring that each individual chunk map a shadow. Mapping shadows individually is error prone, e.g. the per-CPU GDT mapping was left behind, which can lead to not-present page faults during KASAN validation if the kernel performs a software lookup into the GDT. The DS buffer is also likely affected. The motivation for mapping the per-CPU areas on-demand was to avoid mapping the entire 512GiB range that's reserved for the CPU entry area, shaving a few bytes by not creating shadows for potentially unused memory was not a goal. The bug is most easily reproduced by doing a sigreturn with a garbage CS in the sigcontext, e.g. int main(void) { struct sigcontext regs; syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul); syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); memset(&regs, 0, sizeof(regs)); regs.cs = 0x1d0; syscall(__NR_rt_sigreturn); return 0; } to coerce the kernel into doing a GDT lookup to compute CS.base when reading the instruction bytes on the subsequent #GP to determine whether or not the #GP is something the kernel should handle, e.g. to fixup UMIP violations or to emulate CLI/STI for IOPL=3 applications. BUG: unable to handle page fault for address: fffffbc8379ace00 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 16c03a067 P4D 16c03a067 PUD 15b990067 PMD 15b98f067 PTE 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 3 PID: 851 Comm: r2 Not tainted 6.1.0-rc3-next-20221103+ #432 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kasan_check_range+0xdf/0x190 Call Trace: <TASK> get_desc+0xb0/0x1d0 insn_get_seg_base+0x104/0x270 insn_fetch_from_user+0x66/0x80 fixup_umip_exception+0xb1/0x530 exc_general_protection+0x181/0x210 asm_exc_general_protection+0x22/0x30 RIP: 0003:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0003:0000000000000000 EFLAGS: 00000202 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000001d0 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand") Reported-by: syzbot+ffb4f000dc2872c93f62@syzkaller.appspotmail.com Suggested-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Link: https://lkml.kernel.org/r/20221110203504.1985010-3-seanjc@google.com
2022-12-15x86/mm: Recompute physical address for every page of per-CPU CEA mappingSean Christopherson
Recompute the physical address for each per-CPU page in the CPU entry area, a recent commit inadvertantly modified cea_map_percpu_pages() such that every PTE is mapped to the physical address of the first page. Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Link: https://lkml.kernel.org/r/20221110203504.1985010-2-seanjc@google.com
2022-12-15x86/mm: Rename __change_page_attr_set_clr(.checkalias)Peter Zijlstra
Now that the checkalias functionality is taken by CPA_NO_CHECK_ALIAS rename the argument to better match is remaining purpose: primary, matching __change_page_attr(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221110125544.661001508%40infradead.org
2022-12-15x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()Peter Zijlstra
There is a cludge in change_page_attr_set_clr() that inhibits propagating NX changes to the aliases (directmap and highmap) -- this is a cludge twofold: - it also inhibits the primary checks in __change_page_attr(); - it hard depends on single bit changes. The introduction of set_memory_rox() triggered this last issue for clearing both _PAGE_RW and _PAGE_NX. Explicitly ignore _PAGE_NX in cpa_process_alias() instead. Fixes: b38994948567 ("x86/mm: Implement native set_memory_rox()") Reported-by: kernel test robot <oliver.sang@intel.com> Debugged-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221110125544.594991716%40infradead.org
2022-12-15x86/mm: Untangle __change_page_attr_set_clr(.checkalias)Peter Zijlstra
The .checkalias argument to __change_page_attr_set_clr() is overloaded and serves two different purposes: - it inhibits the call to cpa_process_alias() -- as suggested by the name; however, - it also serves as 'primary' indicator for __change_page_attr() ( which in turn also serves as a recursion terminator for cpa_process_alias() ). Untangle these by extending the use of CPA_NO_CHECK_ALIAS to all callsites that currently use .checkalias=0 for this purpose. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221110125544.527267183%40infradead.org
2022-12-15x86/mm: Add a few commentsPeter Zijlstra
It's a shame to hide useful comments in Changelogs, add some to the code. Shamelessly stolen from commit: c40a56a7818c ("x86/mm/init: Remove freed kernel image areas from alias mapping") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221110125544.460677011%40infradead.org
2022-12-15x86/mm: Fix CR3_ADDR_MASKKirill A. Shutemov
The mask must not include bits above physical address mask. These bits are reserved and can be used for other things. Bits 61 and 62 are used for Linear Address Masking. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Alexander Potapenko <glider@google.com> Link: https://lore.kernel.org/all/20221109165140.9137-2-kirill.shutemov%40linux.intel.com
2022-12-15x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macrosPasha Tatashin
Other architectures and the common mm/ use P*D_MASK, and P*D_SIZE. Remove the duplicated P*D_PAGE_MASK and P*D_PAGE_SIZE which are only used in x86/*. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/20220516185202.604654-1-tatashin@google.com
2022-12-15mm: Convert __HAVE_ARCH_P..P_GET to the new stylePeter Zijlstra
Since __HAVE_ARCH_* style guards have been depricated in favour of defining the function name onto itself, convert pxxp_get(). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/Y2EUEBlQXNgaJgoI@hirez.programming.kicks-ass.net
2022-12-15mm: Remove pointless barrier() after pmdp_get_lockless()Peter Zijlstra
pmdp_get_lockless() should itself imply any ordering required. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114425.298833095%40infradead.org
2022-12-15x86/mm/pae: Get rid of set_64bit()Peter Zijlstra
Recognise that set_64bit() is a special case of our previously introduced pxx_xchg64(), so use that and get rid of set_64bit(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114425.233481884%40infradead.org
2022-12-15x86_64: Remove pointless set_64bit() usagePeter Zijlstra
The use of set_64bit() in X86_64 only code is pretty pointless, seeing how it's a direct assignment. Remove all this nonsense. [nathanchance: unbreak irte] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114425.168036718%40infradead.org
2022-12-15x86/mm/pae: Be consistent with pXXp_get_and_clear()Peter Zijlstra
Given that ptep_get_and_clear() uses cmpxchg8b, and that should be by far the most common case, there's no point in having an optimized variant for pmd/pud. Introduce the pxx_xchg64() helper to implement the common logic once. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114425.103392961%40infradead.org
2022-12-15x86/mm/pae: Use WRITE_ONCE()Peter Zijlstra
Disallow write-tearing, that would be really unfortunate. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114425.038102604%40infradead.org
2022-12-15x86/mm/pae: Don't (ab)use atomic64Peter Zijlstra
PAE implies CX8, write readable code. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.971450128%40infradead.org
2022-12-15mm/gup: Fix the lockless PMD accessPeter Zijlstra
On architectures where the PTE/PMD is larger than the native word size (i386-PAE for example), READ_ONCE() can do the wrong thing. Use pmdp_get_lockless() just like we use ptep_get_lockless(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.906110403%40infradead.org
2022-12-15mm: Rename pmd_read_atomic()Peter Zijlstra
There's no point in having the identical routines for PTE/PMD have different names. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.841277397%40infradead.org
2022-12-15mm: Rename GUP_GET_PTE_LOW_HIGHPeter Zijlstra
Since it no longer applies to only PTEs, rename it to PXX. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.776404066%40infradead.org
2022-12-15mm: Fix pmd_read_atomic()Peter Zijlstra
AFAICT there's no reason to do anything different than what we do for PTEs. Make it so (also affects SH). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.711181252%40infradead.org
2022-12-15sh/mm: Make pmd_t similar to pte_tPeter Zijlstra
Just like 64bit pte_t, have a low/high split in pmd_t. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.645657294%40infradead.org
2022-12-15x86/mm/pae: Make pmd_t similar to pte_tPeter Zijlstra
Instead of mucking about with at least 2 different ways of fudging it, do the same thing we do for pte_t. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.580310787%40infradead.org
2022-12-15mm: Update ptep_get_lockless()'s commentPeter Zijlstra
Improve the comment. Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221022114424.515572025%40infradead.org