summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2025-04-29perf lock contention: Symbolize zone->lock using BTFNamhyung Kim
The struct zone is embedded in struct pglist_data which can be allocated for each NUMA node early in the boot process. As it's not a slab object nor a global lock, this was not symbolized. Since the zone->lock is often contended, it'd be nice if we can symbolize it. On NUMA systems, node_data array will have pointers for struct pglist_data. By following the pointer, it can calculate the address of each zone and its lock using BTF. On UMA, it can just use contig_page_data and its zones. The following example shows the zone lock contention at the end. $ sudo ./perf lock con -abl -E 5 -- ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.038 [sec] contended total wait max wait avg wait address symbol 5167 18.17 ms 10.27 us 3.52 us ffff953340052d00 &kmem_cache_node (spinlock) 38 11.75 ms 465.49 us 309.13 us ffff95334060c480 &sock_inode_cache (spinlock) 3916 10.13 ms 10.43 us 2.59 us ffff953342aecb40 &kmem_cache_node (spinlock) 2963 10.02 ms 13.75 us 3.38 us ffff9533d2344098 &kmalloc-rnd-08-2k (spinlock) 216 5.05 ms 99.49 us 23.39 us ffff9542bf7d65d0 zone_lock (spinlock) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20250401063055.7431-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29selftests/net: test tcp connection load balancingWillem de Bruijn
Verify that TCP connections use both routes when connecting multiple times to a remote service over a two nexthop multipath route. Use socat to create the connections. Use tc prio + tc filter to count routes taken, counting SYN packets across the two egress devices. Also verify that the saddr matches that of the device. To avoid flaky tests when testing inherently randomized behavior, set a low bar and pass if even a single SYN is observed on each device. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250424143549.669426-4-willemdebruijn.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-29selftests: ublk: fix UBLK_F_NEED_GET_DATAMing Lei
Commit 57e13a2e8cd2 ("selftests: ublk: support user recovery") starts to support UBLK_F_NEED_GET_DATA for covering recovery feature, however the ublk utility implementation isn't done correctly. Fix it by supporting UBLK_F_NEED_GET_DATA correctly. Also add test generic_07 for covering UBLK_F_NEED_GET_DATA. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 57e13a2e8cd2 ("selftests: ublk: support user recovery") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250429022941.1718671-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-28tools/Makefile: Add ynl targetJoe Damato
Add targets to build, clean, and install ynl headers, libynl.a, and python tooling. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250423204647.190784-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28selftests: tc-testing: Add TDC tests that exercise reentrant enqueue behaviourVictor Nogueira
Add 5 TDC tests that exercise the reentrant enqueue behaviour in drr, ets, qfq, and hfsc: - Test DRR's enqueue reentrant behaviour with netem (which caused a double list add) - Test ETS's enqueue reentrant behaviour with netem (which caused a double list add) - Test QFQ's enqueue reentrant behaviour with netem (which caused a double list add) - Test HFSC's enqueue reentrant behaviour with netem (which caused a UAF) - Test nested DRR's enqueue reentrant behaviour with netem (which caused a double list add) Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250425220710.3964791-6-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28selftests: net: add a virtio_net deadlock selftestBui Quang Minh
The selftest reproduces the deadlock scenario when binding/unbinding XDP program, XDP socket, rx ring resize on virtio_net interface. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20250425071018.36078-5-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28selftests: net: retry when bind returns EBUSY in xdp_helperBui Quang Minh
When binding the XDP socket, we may get EBUSY because the deferred destructor of XDP socket in previous test has not been executed yet. If that is the case, just sleep and retry some times. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20250425071018.36078-4-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28selftests: net: add flag to force zerocopy mode in xdp_helperBui Quang Minh
This commit adds an optional -z flag to xdp_helper. When this flag is provided, the XDP socket binding is forced to be in zerocopy mode. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20250425071018.36078-3-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28selftests: net: move xdp_helper to net/libBui Quang Minh
Move xdp_helper to net/lib to make it easier for other selftests to use the helper. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20250425071018.36078-2-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28perf test: Add perf trace summary testNamhyung Kim
$ sudo ./perf test -vv 'trace summary' 109: perf trace summary: --- start --- test child forked, pid 3501572 testing: perf trace -s -- true testing: perf trace -S -- true testing: perf trace -s --summary-mode=thread -- true testing: perf trace -S --summary-mode=total -- true testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true testing: perf trace -as --summary-mode=total --no-bpf-summary -- true testing: perf trace -as --summary-mode=thread --bpf-summary -- true testing: perf trace -as --summary-mode=total --bpf-summary -- true testing: perf trace -aS --summary-mode=total --bpf-summary -- true ---- end(0) ---- 109: perf trace summary : Ok Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20250326044001.3503432-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28perf trace: Implement syscall summary in BPFNamhyung Kim
When -s/--summary option is used, it doesn't need (augmented) arguments of syscalls. Let's skip the augmentation and load another small BPF program to collect the statistics in the kernel instead of copying the data to the ring-buffer to calculate the stats in userspace. This will be much more light-weight than the existing approach and remove any lost events. Let's add a new option --bpf-summary to control this behavior. I cannot make it default because there's no way to get e_machine in the BPF which is needed for detecting different ABIs like 32-bit compat mode. No functional changes intended except for no more LOST events. :) $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1 Summary of events: total, 6194 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 561 0 4530.843 0.000 8.076 520.941 18.75% futex 693 45 4317.231 0.000 6.230 500.077 21.98% poll 300 0 1040.109 0.000 3.467 120.928 17.02% clock_nanosleep 1 0 1000.172 1000.172 1000.172 1000.172 0.00% ppoll 360 0 872.386 0.001 2.423 253.275 41.91% epoll_pwait 14 0 384.349 0.001 27.453 380.002 98.79% pselect6 14 0 108.130 7.198 7.724 8.206 0.85% nanosleep 39 0 43.378 0.069 1.112 10.084 44.23% ... Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org [ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28Merge tag 'hyperv-fixes-signed-20250427' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Bug fixes for the Hyper-V driver and kvp_daemon * tag 'hyperv-fixes-signed-20250427' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: Fix bad ref to hv_synic_eventring_tail when CPU goes offline tools/hv: update route parsing in kvp daemon Drivers: hv: Fix bad pointer dereference in hv_get_partition_id
2025-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4Alexei Starovoitov
Cross-merge bpf and other fixes after downstream PRs. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-26Merge tag 'pci-v6.15-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fixes from Bjorn Helgaas: - When releasing a start-aligned resource, e.g., a bridge window, save start/end/flags for the next assignment attempt; fixes a v6.15-rc1 regression (Ilpo Järvinen) - Move set_pcie_speed.sh from TEST_PROGS to TEST_FILE; fixes a bwctrl selftest v6.15-rc1 regression (Ilpo Järvinen) - Add Manivannan Sadhasivam as maintainer of native host bridge and endpoint drivers (Manivannan Sadhasivam) - In endpoint test driver, defer IRQ allocation from .probe() until ioctl() to fix a regression on platforms where the Vendor/Device ID match doesn't include driver_data (Niklas Cassel) * tag 'pci-v6.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: misc: pci_endpoint_test: Defer IRQ allocation until ioctl(PCITEST_SET_IRQTYPE) MAINTAINERS: Move Manivannan Sadhasivam as PCI Native host bridge and endpoint maintainer selftests/pcie_bwctrl: Fix test progs list PCI: Restore assigned resources fully after release
2025-04-26Merge tag 'x86-urgent-2025-04-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Fix 32-bit kernel boot crash if passed physical memory with more than 32 address bits - Fix Xen PV crash - Work around build bug in certain limited build environments - Fix CTEST instruction decoding in insn_decoder_test * tag 'x86-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/insn: Fix CTEST instruction decoding x86/boot: Work around broken busybox 'truncate' tool x86/mm: Fix _pgd_alloc() for Xen PV mode x86/e820: Discard high memory that can't be addressed by 32-bit systems
2025-04-26Merge tag 'move-lib-kunit-v6.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kunit fix from Kees Cook: "A single fix for the kunit lib/tests/ relocation: - Ensure prime numbers tests are included in KUnit test runs (Mark Brown)" * tag 'move-lib-kunit-v6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lib: Ensure prime numbers tests are included in KUnit test runs
2025-04-25selftests: net: bridge_vlan_aware: test untagged/8021p-tagged with and ↵Vladimir Oltean
without PVID Recent discussions around commit ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") have sparked the question what happens with the DSA (and possibly other switchdev) data path when the bridge says that ports should have no PVID VLAN, but the 8021q module, as the result of a NETDEV_UP event, decides it should add VID 0 to the RX filter of those bridge ports. Do those bridge ports receive packets tagged with VID 0 or not, now? We don't know, there is no test. In the veth realm, this passes trivially, because veth is not VLAN filtering and this, the 8021q module lacks the instinct to add VID 0 in the first place. In the realm of VLAN filtering NICs with no switchdev offload, this should also pass, because the VLAN groups of the software bridge are consulted, where it can clearly be seen that a PVID is missing, even though the packet was initially accepted by the NIC. The test only poses a challenge for switchdev drivers, which usually have to program to hardware both VLANs from RX filtering, as well as from switchdev. Especially when a switchdev port joins a VLAN-aware bridge, it is unavoidable that it gains the NETIF_F_HW_VLAN_CTAG_FILTER feature, i.e. any 8021q uppers that the bridge port may have must also be committed to the RX filtering table of the interface. When a VLAN-tagged packet is physically received by the port, it is initially indistinguishable whether it will reach the bridge data path or the 8021q upper data path. That is rather the final step of the new tests that we introduce. We need to build context up to that stage, which means the following: - we need to test that 802.1p (VID 0) tagged traffic is received in the first place (on bridge ports with a valid PVID). This is the "8021p" test. - we need to test that the usual paths of reaching a configuration with no PVID on a bridge port are all covered and they all reach the same state. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250424223734.3096202-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-25io_uring/zcrx: selftests: add test case for rss ctxDavid Wei
RSS contexts are used to shard work across multiple queues for an application using io_uring zero copy receive. Add a test case checking that steering flows into an RSS context works. Until I add multi-thread support to the selftest binary, this test case only has 1 queue in the RSS context. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250425022049.3474590-4-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-25io_uring/zcrx: selftests: set hds_thresh to 0David Wei
Setting hds_thresh to 0 is required for queue reset. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250425022049.3474590-3-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-25io_uring/zcrx: selftests: switch to using defer() for cleanupDavid Wei
Switch to using defer() for putting the NIC back to the original state prior to running the selftest. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250425022049.3474590-2-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-25Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov: - Add namespace to BPF internal symbols (Alexei Starovoitov) - Fix possible endless loop in BPF map iteration (Brandon Kammerdiener) - Fix compilation failure for samples/bpf on LoongArch (Haoran Jiang) - Disable a part of sockmap_ktls test (Ihor Solodrai) - Correct typo in __clang_major__ macro (Peilin Ye) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Correct typo in __clang_major__ macro samples/bpf: Fix compilation failure for samples/bpf on LoongArch Fedora bpf: Add namespace to BPF internal symbols selftests/bpf: add test for softlock when modifying hashmap while iterating bpf: fix possible endless loop in BPF map iteration selftests/bpf: Mitigate sockmap_ktls disconnect_after_delete failure
2025-04-25selftests/bpf: Correct typo in __clang_major__ macroPeilin Ye
Make sure that CAN_USE_BPF_ST test (compute_live_registers/store) is enabled when __clang_major__ >= 18. Fixes: 2ea8f6a1cda7 ("selftests/bpf: test cases for compute_live_registers()") Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/20250425213712.1542077-1-yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-25Merge tag 'cxl-fixes-6.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dave Jiang: "The fixes address global persistent flush (GPF) changes and CXL Features support changes that went in the 6.15 merge window. And also a fix to an issue observed on CXL 1.1 platform during device enumeration. Summary: - Fix using the wrong GPF DVSEC location: - Fix caching of dport GPF DVSEC from the first endpoint - Ensure that the GPF phase timeout is only updated once by first endpoint - Drop is_port parameter for cxl_gpf_get_dvsec() - Fix the devm_* call host device for CXL fwctl setup - Set the out_len in Set Features failure case - Fix RCD initialization by skipping unneeded mem_en check" * tag 'cxl-fixes-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/regs.c: Skip Memory Space Enable check for RCD and RCH Ports cxl/feature: Update out_len in set feature failure case cxl: Fix devm host device for CXL fwctl initialization cxl/pci: Drop the parameter is_port of cxl_gpf_get_dvsec() cxl/pci: Update Port GPF timeout only when the first EP attaching cxl/core: Fix caching dport GPF DVSEC issue
2025-04-25Merge tag 'block-6.15-20250424' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix autoloading of drivers from stat*(2) - Fix losing read-ahead setting one suspend/resume, when a device is re-probed. - Fix race between setting the block size and page cache updates. Includes a helper that a coming XFS fix will use as well. - ublk cancelation fixes. - ublk selftest additions and fixes. - NVMe pull via Christoph: - fix an out-of-bounds access in nvmet_enable_port (Richard Weinberger) * tag 'block-6.15-20250424' of git://git.kernel.dk/linux: ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd ublk: call ublk_dispatch_req() for handling UBLK_U_IO_NEED_GET_DATA block: don't autoload drivers on blk-cgroup configuration block: don't autoload drivers on stat block: remove the backing_inode variable in bdev_statx block: move blkdev_{get,put} _no_open prototypes out of blkdev.h block: never reduce ra_pages in blk_apply_bdi_limits selftests: ublk: common: fix _get_disk_dev_t for pre-9.0 coreutils selftests: ublk: remove useless 'delay_us' from 'struct dev_ctx' selftests: ublk: fix recover test block: hoist block size validation code to a separate function block: fix race between set_blocksize and read paths nvmet: fix out-of-bounds access in nvmet_enable_port
2025-04-25Use thread-safe function pointer in libbpf_printJonathan Wiepert
This patch fixes a thread safety bug where libbpf_print uses the global variable storing the print function pointer rather than the local variable that had the print function set via __atomic_load_n. Fixes: f1cb927cdb62 ("libbpf: Ensure print callback usage is thread-safe") Signed-off-by: Jonathan Wiepert <jonathan.wiepert@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250424221457.793068-1-jonathan.wiepert@gmail.com
2025-04-25libbpf: Remove sample_period init in perf_bufferTao Chen
It seems that sample_period is not used in perf buffer. Actually, only wakeup_events are meaningful to enable events aggregation for wakeup notification. Remove sample_period setting code to avoid confusion. Fixes: fb84b8224655 ("libbpf: add perf buffer API") Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/bpf/20250423163901.2983689-1-chen.dylane@linux.dev
2025-04-25selftests/bpf: add test for softlock when modifying hashmap while iteratingBrandon Kammerdiener
Add test that modifies the map while it's being iterated in such a way that hangs the kernel thread unless the _safe fix is applied to bpf_for_each_hash_elem. Signed-off-by: Brandon Kammerdiener <brandon.kammerdiener@intel.com> Link: https://lore.kernel.org/r/20250424153246.141677-3-brandon.kammerdiener@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com>
2025-04-25perf vendor events arm64: Drop hip08 PublicDescription if same as ↵Junhao He
BriefDescription If BriefDescription and PublicDescription are the same, only BriefDescription is needed. It will be used for both long and short format outputs. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250418070812.3771441-3-hejunhao3@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf vendor events arm64: Fill up Desc field for Hisi hip08 hha pmuJunhao He
In the same PMU, when some JSON events have the "BriefDescription" field populated while others do not, the cmp_sevent() function will split these two types of events into separate groups. As a result, when using perf list to display events, the two types of events cannot be grouped together in the output. before patch: $ perf list pmu ... uncore hha: hisi_sccl1_hha2/sdir-hit/ hisi_sccl1_hha2/sdir-lookup/ ... uncore hha: edir-hit [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2] ... after patch: $ perf list pmu ... uncore hha: edir-hit [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2] sdir-hit [Count of The number of HHA S-Dir hit operations. Unit: hisi_sccl1_hha2] sdir-lookup [Count of the number of HHA S-Dir lookup operations. Unit: hisi_sccl1_hha2] ... Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250418070812.3771441-2-hejunhao3@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf bench evlist-open-close: Reduce scope of 2 variablesIan Rogers
Make 2 global variables local. Reduces ELF binary size by removing relocations. For a no flags build, the perf binary size is reduced by 4,144 bytes on x86-64. Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Hao Ge <gehao@kylinos.cn> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tengda Wu <wutengda@huaweicloud.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20250410173631.1713627-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf tests record: Cleanup improvementsIan Rogers
Remove the script output file. Add a trap debug message. Minor style consistency changes. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250410173631.1713627-2-irogers@google.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Tengda Wu <wutengda@huaweicloud.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Hao Ge <gehao@kylinos.cn> Cc: James Clark <james.clark@linaro.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: bpf@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf tests metric-only perf stat: Fix tests 84 and 86 s390Thomas Richter
On s390x KVM and z/VM machines the CPU Measurement Facility is not available. Events cycles and instructions do not exist. Running above tests on s390 KVM and z/VM guests always fail with this error: # ./perf test 84 86 84: perf stat JSON output linter : FAILED! 86: perf stat STD output linter : FAILED! # Root cause is command: # perf stat -j --metric-only -e instructions,cycles -- true {"metric-value" : "none"} # Which fails due to unsupported events and returns "none". Do not execute this test case on s390 KVM and z/VM machines. Output after: # ./perf test 84 86 84: perf stat JSON output linter : Ok 86: perf stat STD output linter : Ok # Fixes: 45a86d017adf4d6c ("perf test: Add --metric-only to perf stat output tests") Suggested-by: Heiko Carstens <hca@linux.ibm.com> Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133310.37452-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf tool_pmu: Fix aggregation on duration_timeIan Rogers
evsel__count_has_error() fails counters when the enabled or running time are 0. The duration_time event reads 0 when the cpu_map_idx != 0 to avoid aggregating time over CPUs. Change the enable and running time to always have a ratio of 100% so that evsel__count_has_error won't fail. Before: ``` $ sudo /tmp/perf/perf stat --per-core -a -M UNCORE_FREQ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 2,615,819,485 UNC_CLOCK.SOCKET # 2.61 UNCORE_FREQ S0-D0-C0 2 <not counted> duration_time 1.002111784 seconds time elapsed ``` After: ``` $ perf stat --per-core -a -M UNCORE_FREQ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 758,160,296 UNC_CLOCK.SOCKET # 0.76 UNCORE_FREQ S0-D0-C0 2 1,003,438,246 duration_time 1.002486017 seconds time elapsed ``` Note: the metric reads the value a different way and isn't impacted. Fixes: 240505b2d0adcdc8 ("perf tool_pmu: Factor tool events into their own PMU") Reported-by: Stephane Eranian <eranian@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250423050358.94310-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf session: Skip unsupported new event typesChun-Tse Shao
`perf report` currently halts with an error when encountering unsupported new event types (`event.type >= PERF_RECORD_HEADER_MAX`). This patch modifies the behavior to skip these samples and continue processing the remaining events. Additionally, stops reporting if the new event size is not 8-byte aligned. Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250414173921.2905822-1-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Allow custom output fields in hierarchy modeNamhyung Kim
Now it can handle multiple output fields and sort keys in separate levels, so it should be ok to use it in the hierarchy mode. This allows fully customized output format. $ perf report -F latency,comm,parallelism -H --stdio ... # Latency Command / Parallelism # ........... ..................... # 31.84% cc1 29.96% 5 1.24% 4 0.37% 6 0.26% 3 0.02% 2 24.68% as 22.39% 5 1.12% 2 0.98% 4 0.12% 3 0.07% 6 ... Committer testing: Before: $ perf report -F latency,comm,parallelism -H --stdio Error: --hierarchy and --fields options cannot be used together Usage: perf report [<options>] -F, --fields <key[,keys...]> output field(s): overhead latency overhead_sys overhead_us overhead_guest_sys overhead_guest_us overhead_children latency_children sample period weight1 weight2 weight3 <SNIP> -H, --hierarchy Show entries in a hierarchy $ After: $ perf report -F latency,comm,parallelism -H --stdio # Total Lost Samples: 0 # # Samples: 1K of event 'cycles:Pu' # Event count (approx.): 1581450138 # # Latency Command / Parallelism # ........... ..................... # 97.66% git 96.95% 1 0.55% 2 0.04% 5 0.03% 8 0.03% 4 0.02% 3 0.01% 9 0.01% 7 0.01% 6 0.01% 10 0.00% 12 2.34% git-remote-http 2.24% 1 0.07% 5 0.02% 2 0.00% 4 # # (Tip: To analyze particular parallelism levels, try: perf report --latency --parallelism=32-64) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Set levels in output_field_add()Namhyung Kim
It turns out that the output fields didn't consider the hierarchy mode and put all the fields in the same level. To support hierarchy, each non-output field should be in a separate level. Pass a pointer to level to output_field_add() and make it increase the level when it sees non-output fields. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Remove formats in hierarchy when cancel latencyNamhyung Kim
Likewise, it should remove latency output fields in hierarchy list. Pass evlist to perf_hpp__cancel_latency() to handle them properly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Remove formats in hierarchy when cancel childrenNamhyung Kim
This is to support hierarchy options with custom output fields. Currently perf_hpp__cancel_cumulate() only removes accumulated overhead and latency fields from the global perf_hpp_list. This is not used in the hierarchy mode because each evsel's hist has its own separate hpp_list. So it needs to remove the fields from the lists too. Pass evlist to the function so that it can iterate the evsels. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf record: Retirement latency cleanup in evsel__configIan Rogers
'perf record' will fail with retirement latency events as the open doesn't do a perf_event_open system call. Use evsel__config() to set up such events for recording by removing the flag and enabling sample weights - the sample weights containing the retirement latency. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf pmu-events: Add retirement latency to JSON events inside of perfIan Rogers
The updated Intel vendor events add retirement latency for graniterapids: https://lore.kernel.org/lkml/20250322063403.364981-14-irogers@google.com/ This change makes those values available within an alias/event within a PMU and saves them into the evsel at event parse time. When no TPEBS data is available the default values are substituted in for TMA metrics that are using retirement latency events - currently just those on graniterapids. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf stat: Add mean, min, max and last --tpebs-mode optionsIan Rogers
Add command line configuration option for how retirement latency events are combined. The default "mean" gives the average of retirement latency. "min" or "max" give the smallest or largest retirment latency times respectively. "last" uses the last retirment latency sample's time. Committer notes: Enclose parse_tpebs_mode() under HAVE_ARCH_X86_64_SUPPORT to match the ifdef block where it is used, fixing the build in systems like: 20 5.60 debian:experimental-x-mips : FAIL gcc version 14.2.0 (Debian 14.2.0-1) builtin-stat.c:2330:12: error: 'parse_tpebs_mode' defined but not used [-Werror=unused-function] 2330 | static int parse_tpebs_mode(const struct option *opt, const char *str, | ^~~~~~~~~~~~~~~~ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Use stats for retirement latency statisticsIan Rogers
struct stats provides access to mean, min and max. It also provides uniformity with statistics code used elsewhere in perf. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Don't close record on readIan Rogers
Factor sending record control fd code into its own function. Rather than killing the record process send it a ping when reading. Timeouts were witnessed if done too frequently, so only ping for the first tpebs events. Don't kill the record command send it a stop command. As close isn't reliably called also close on evsel__exit. Add extra checks on the record being terminated to avoid warnings. Adjust the locking as needed and incorporate extra -Wthread-safety checks. Check to do six 500ms poll timeouts when sending commands, rather than the larger 3000ms, to allow the record process terminating to be better witnessed. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Add mutex for tpebs_resultsIan Rogers
Ensure sample reader isn't racing with events being added/removed. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Add support for updating counts in evsel__tpebs_readIan Rogers
Rename to reflect evsel argument and for consistency with other tpebs functions. Update count from prev_raw_counts when available. Eventually this will allow inteval mode support. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Refactor tpebs_results listIan Rogers
evsel names and metric-ids are used for matching but this can be problematic, for example, multiple occurrences of the same retirement latency event become a single event for the record. Change the name of the record events so they are unique and reflect the evsel of the retirement latency event that opens them (the retirement latency event's evsel address is embedded within them). This allows an evsel based close to close the event when the retirement latency event is closed. This is important as 'perf stat' has an evlist and the session listen to the record events has an evlist, knowing which event should remove the tpebs_retire_lat can't be tied to an evlist list as there is more than 1, so closing which evlist should cause the tpebs to stop? Using the evsel and the last one out doing the tpebs_stop is cleaner. Committer notes: Fix the build on 32-bit systems by using unsigned long when converting pointers to integers instead of uint64_t. Fixes: 20 4.97 debian:experimental-x-mips : FAIL gcc version 14.2.0 (Debian 14.2.0-13) util/intel-tpebs.c: In function 'tpebs_retire_lat__find': util/intel-tpebs.c:377:21: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] 377 | if ((uint64_t)t->evsel == num) | ^ cc1: all warnings being treated as errors Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Ensure events are opened, factor out findingIan Rogers
Factor out finding an tpebs_retire_lat from an evsel. Don't blindly return when ignoring an open request, which happens after the first open request, ensure the event was started on a fork of perf record. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Inline get_perf_record_argsIan Rogers
Code is short enough to be inlined and there are no error cases when made inline. Make the implicit NULL pointer at the end of the argv explicit. Move the fixed number of arguments before the variable number of arguments. Correctly size the argv allocation and zero when feeing to avoid a dangling pointer. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Reduce scope of the tpebs_events_size variableIan Rogers
Moved to record argument computation rather than being global. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Move the cpumap_buf variable out of evsel__tpebs_open()Ian Rogers
The buffer holds the cpumap to pass to the 'perf record' command, so move it down to the 'perf record' function. Make this function an evsel function given the need for the evsel for the cpumap. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>