linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2023-10-09	bpf: Derive source IP addr via bpf_*_fib_lookup()	Martynas Pumputis
	Extend the bpf_fib_lookup() helper by making it to return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, p, sizeof(p), BPF_FIB_LOOKUP_SRC \| BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when a public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-09	selftests/bpf: Add testcase for async callback return value failure	David Vernet
	A previous commit updated the verifier to print an accurate failure message for when someone specifies a nonzero return value from an async callback. This adds a testcase for validating that the verifier emits the correct message in such a case. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231009161414.235829-2-void@manifault.com
2023-10-09	KVM: selftests: Test behavior of HWCR, a.k.a. MSR_K7_HWCR	Jim Mattson
	Verify the following behavior holds true for writes and reads of HWCR from host userspace: * Attempts to set bits 3, 6, or 8 are ignored * Bits 18 and 24 are the only bits that can be set * Any bit that can be set can also be cleared Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20230929230246.1954854-4-jmattson@google.com Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-10-09	tools/testing/cxl: Add 'sanitize notifier' support	Dan Williams
	Allow for cxl_test regression of the sanitize notifier. Reuse the core setup infrastructure, and trigger notifications upon any sanitize submission with a programmable notification delay. Cc: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-09	tools/testing/cxl: Make cxl_memdev_state available to other command emulation	Dan Williams
	Move @mds out of the event specific 'struct mock_event_store' and into the base 'struct cxl_mockmem_data' directly. This is in preparation for enabling cxl_test to exercise the notifier flow for 'sanitize' operation completion. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-09	bpftool: Align bpf_load_and_run_opts insns and data	Ian Rogers
	A C string lacks alignment so use aligned arrays to avoid potential alignment problems. Switch to using sizeof (less 1 for the \0 terminator) rather than a hardcode size constant. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20231007044439.25171-2-irogers@google.com
2023-10-09	bpftool: Align output skeleton ELF code	Ian Rogers
	libbpf accesses the ELF data requiring at least 8 byte alignment, however, the data is generated into a C string that doesn't guarantee alignment. Fix this by assigning to an aligned char array. Use sizeof on the array, less one for the \0 terminator, rather than generating a constant. Fixes: a6cc6b34b93e ("bpftool: Provide a helper method for accessing skeleton's embedded ELF data") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20231007044439.25171-1-irogers@google.com
2023-10-09	selftests/bpf: Test pinning bpf timer to a core	David Vernet
	Now that we support pinning a BPF timer to the current core, we should test it with some selftests. This patch adds two new testcases to the timer suite, which verifies that a BPF timer both with and without BPF_F_TIMER_ABS, can be pinned to the calling core with BPF_F_TIMER_CPU_PIN. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20231004162339.200702-3-void@manifault.com
2023-10-09	bpf: Add ability to pin bpf timer to calling CPU	David Vernet
	BPF supports creating high resolution timers using bpf_timer_* helper functions. Currently, only the BPF_F_TIMER_ABS flag is supported, which specifies that the timeout should be interpreted as absolute time. It would also be useful to be able to pin that timer to a core. For example, if you wanted to make a subset of cores run without timer interrupts, and only have the timer be invoked on a single core. This patch adds support for this with a new BPF_F_TIMER_CPU_PIN flag. When specified, the HRTIMER_MODE_PINNED flag is passed to hrtimer_start(). A subsequent patch will update selftests to validate. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20231004162339.200702-2-void@manifault.com
2023-10-06	selftests/bpf: Make seen_tc* variable tests more robust	Daniel Borkmann
	Martin reported that on his local dev machine the test_tc_chain_mixed() fails as "test_tc_chain_mixed:FAIL:seen_tc5 unexpected seen_tc5: actual 1 != expected 0" and others occasionally, too. However, when running in a more isolated setup (qemu in particular), it works fine for him. The reason is that there is a small race-window where seen_tc* could turn into true for various test cases when there is background traffic, e.g. after the asserts they often get reset. In such case when subsequent detach takes place, unrelated background traffic could have already flipped the bool to true beforehand. Add a small helper tc_skel_reset_all_seen() to reset all bools before we do the ping test. At this point, everything is set up as expected and therefore no race can occur. All tc_{opts,links} tests continue to pass after this change. Reported-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20231006220655.1653-7-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-06	selftests/bpf: Test query on empty mprog and pass revision into attach	Daniel Borkmann
	Add a new test case to query on an empty bpf_mprog and pass the revision directly into expected_revision for attachment to assert that this does succeed. ./test_progs -t tc_opts [ 1.406778] tsc: Refined TSC clocksource calibration: 3407.990 MHz [ 1.408863] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcaf6eb0, max_idle_ns: 440795321766 ns [ 1.412419] clocksource: Switched to clocksource tsc [ 1.428671] bpf_testmod: loading out-of-tree module taints kernel. [ 1.430260] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #252 tc_opts_after:OK #253 tc_opts_append:OK #254 tc_opts_basic:OK #255 tc_opts_before:OK #256 tc_opts_chain_classic:OK #257 tc_opts_chain_mixed:OK #258 tc_opts_delete_empty:OK #259 tc_opts_demixed:OK #260 tc_opts_detach:OK #261 tc_opts_detach_after:OK #262 tc_opts_detach_before:OK #263 tc_opts_dev_cleanup:OK #264 tc_opts_invalid:OK #265 tc_opts_max:OK #266 tc_opts_mixed:OK #267 tc_opts_prepend:OK #268 tc_opts_query:OK #269 tc_opts_query_attach:OK <--- (new test) #270 tc_opts_replace:OK #271 tc_opts_revision:OK Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20231006220655.1653-6-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-06	selftests/bpf: Adapt assert_mprog_count to always expect 0 count	Daniel Borkmann
	Simplify __assert_mprog_count() to remove the -ENOENT corner case as the bpf_prog_query() now returns 0 when no bpf_mprog is attached. This also allows to convert a few test cases from using raw __assert_mprog_count() over to plain assert_mprog_count() helper. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20231006220655.1653-5-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-06	selftests/bpf: Test bpf_mprog query API via libbpf and raw syscall	Daniel Borkmann
	Add a new test case which performs double query of the bpf_mprog through libbpf API, but also via raw bpf(2) syscall. This is testing to gather first the count and then in a subsequent probe the full information with the program array without clearing passed structs in between. # ./vmtest.sh -- ./test_progs -t tc_opts [...] ./test_progs -t tc_opts [ 1.398818] tsc: Refined TSC clocksource calibration: 3407.999 MHz [ 1.400263] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd336761, max_idle_ns: 440795243819 ns [ 1.402734] clocksource: Switched to clocksource tsc [ 1.426639] bpf_testmod: loading out-of-tree module taints kernel. [ 1.428112] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #252 tc_opts_after:OK #253 tc_opts_append:OK #254 tc_opts_basic:OK #255 tc_opts_before:OK #256 tc_opts_chain_classic:OK #257 tc_opts_chain_mixed:OK #258 tc_opts_delete_empty:OK #259 tc_opts_demixed:OK #260 tc_opts_detach:OK #261 tc_opts_detach_after:OK #262 tc_opts_detach_before:OK #263 tc_opts_dev_cleanup:OK #264 tc_opts_invalid:OK #265 tc_opts_max:OK #266 tc_opts_mixed:OK #267 tc_opts_prepend:OK #268 tc_opts_query:OK <--- (new test) #269 tc_opts_replace:OK #270 tc_opts_revision:OK Summary: 19/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20231006220655.1653-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-06	selftests: firmware: remove duplicate unneeded defines	Muhammad Usama Anjum
	These duplicate defines should automatically be picked up from kernel headers. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-06	selftests: core: remove duplicate defines	Muhammad Usama Anjum
	Remove duplicate defines which are already defined in kernel headers and re-definition isn't required. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-06	selftests: clone3: remove duplicate defines	Muhammad Usama Anjum
	Remove duplicate defines which are already included in kernel headers. MAX_PID_NS_LEVEL macro is used inside kernel only. It isn't exposed to userspace. So it is never defined in test application. Remove #ifndef in this case. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-06	selftests: capabilities: remove duplicate unneeded defines	Muhammad Usama Anjum
	These duplicate defines should automatically be picked up from kernel headers. Use KHDR_INCLUDES to add kernel header files. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-06	kselftest: vm: add tests for no-inherit memory-deny-write-execute	Florent Revest
	Add some tests to cover the new PR_MDWE_NO_INHERIT flag of the PR_SET_MDWE prctl. Check that: - it can't be set without PR_SET_MDWE - MDWE flags can't be unset - when set, PR_SET_MDWE doesn't propagate to children Link: https://lkml.kernel.org/r/20230828150858.393570-7-revest@chromium.org Signed-off-by: Florent Revest <revest@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-06	mm: add a NO_INHERIT flag to the PR_SET_MDWE prctl	Florent Revest
	This extends the current PR_SET_MDWE prctl arg with a bit to indicate that the process doesn't want MDWE protection to propagate to children. To implement this no-inherit mode, the tag in current->mm->flags must be absent from MMF_INIT_MASK. This means that the encoding for "MDWE but without inherit" is different in the prctl than in the mm flags. This leads to a bit of bit-mangling in the prctl implementation. Link: https://lkml.kernel.org/r/20230828150858.393570-6-revest@chromium.org Signed-off-by: Florent Revest <revest@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-06	mm: make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long	Florent Revest
	Defining a prctl flag as an int is a footgun because on a 64 bit machine and with a variadic implementation of prctl (like in musl and glibc), when used directly as a prctl argument, it can get casted to long with garbage upper bits which would result in unexpected behaviors. This patch changes the constant to an unsigned long to eliminate that possibilities. This does not break UAPI. I think that a stable backport would be "nice to have": to reduce the chances that users build binaries that could end up with garbage bits in their MDWE prctl arguments. We are not aware of anyone having yet encountered this corner case with MDWE prctls but a backport would reduce the likelihood it happens, since this sort of issues has happened with other prctls. But If this is perceived as a backporting burden, I suppose we could also live without a stable backport. Link: https://lkml.kernel.org/r/20230828150858.393570-5-revest@chromium.org Fixes: b507808ebce2 ("mm: implement memory-deny-write-execute as a prctl") Signed-off-by: Florent Revest <revest@chromium.org> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-06	kselftest: vm: check errnos in mdwe_test	Florent Revest
	Invalid prctls return a negative code and set errno. It's good practice to check that errno is set as expected. Link: https://lkml.kernel.org/r/20230828150858.393570-4-revest@chromium.org Signed-off-by: Florent Revest <revest@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-06	kselftest: vm: fix mdwe's mmap_FIXED test case	Florent Revest
	I checked with the original author, the mmap_FIXED test case wasn't properly tested and fails. Currently, it maps two consecutive (non overlapping) pages and expects the second mapping to be denied by MDWE but these two pages have nothing to do with each other so MDWE is actually out of the picture here. What the test actually intended to do was to remap a virtual address using MAP_FIXED. However, this operation unmaps the existing mapping and creates a new one so the va is backed by a new page and MDWE is again out of the picture, all remappings should succeed. This patch keeps the test case to make it clear that this situation is expected to work: MDWE shouldn't block a MAP_FIXED replacement. Link: https://lkml.kernel.org/r/20230828150858.393570-3-revest@chromium.org Fixes: 4cf1fe34fd18 ("kselftest: vm: add tests for memory-deny-write-execute") Signed-off-by: Florent Revest <revest@chromium.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Tested-by: Ryan Roberts <ryan.roberts@arm.com> Tested-by: Ayush Jain <ayush.jain3@amd.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-06	kselftest: vm: fix tabs/spaces inconsistency in the mdwe test	Florent Revest
	Patch series "MDWE without inheritance", v4. Joey recently introduced a Memory-Deny-Write-Executable (MDWE) prctl which tags current with a flag that prevents pages that were previously not executable from becoming executable. This tag always gets inherited by children tasks. (it's in MMF_INIT_MASK) At Google, we've been using a somewhat similar downstream patch for a few years now. To make the adoption of this feature easier, we've had it support a mode in which the W^X flag does not propagate to children. For example, this is handy if a C process which wants W^X protection suspects it could start children processes that would use a JIT. I'd like to align our features with the upstream prctl. This series proposes a new NO_INHERIT flag to the MDWE prctl to make this kind of adoption easier. It sets a different flag in current that is not in MMF_INIT_MASK and which does not propagate. As part of looking into MDWE, I also fixed a couple of things in the MDWE test. The background for this was discussed in these threads: v1: https://lore.kernel.org/all/66900d0ad42797a55259061f757beece@ispras.ru/ v2: https://lore.kernel.org/all/d7e3749c-a718-df94-92af-1cb0fecab772@redhat.com/ This patch (of 6): Fix tabs/spaces inconsistency in the mdwe test. Link: https://lkml.kernel.org/r/20230828150858.393570-1-revest@chromium.org Link: https://lkml.kernel.org/r/20230828150858.393570-2-revest@chromium.org Signed-off-by: Florent Revest <revest@chromium.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-06	selftests/bpf: Add pairs_redir_to_connected helper	Geliang Tang
	Extract duplicate code from these four functions unix_redir_to_connected() udp_redir_to_connected() inet_unix_redir_to_connected() unix_inet_redir_to_connected() to generate a new helper pairs_redir_to_connected(). Create the different socketpairs in these four functions, then pass the socketpairs info to the new common helper to do the connections. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Link: https://lore.kernel.org/r/54bb28dcf764e7d4227ab160883931d2173f4f3d.1696588133.git.geliang.tang@suse.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-06	selftests/bpf: Don't truncate #test/subtest field	Andrii Nakryiko
	We currently expect up to a three-digit number of tests and subtests, so: #999/999: some_test/some_subtest: ... Is the largest test/subtest we can see. If we happen to cross into 1000s, current logic will just truncate everything after 7th character. This patch fixes this truncate and allows to go way higher (up to 31 characters in total). We still nicely align test numbers: #60/66 core_reloc_btfgen/type_based___incompat:OK #60/67 core_reloc_btfgen/type_based___fn_wrong_args:OK #60/68 core_reloc_btfgen/type_id:OK #60/69 core_reloc_btfgen/type_id___missing_targets:OK #60/70 core_reloc_btfgen/enumval:OK Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231006175744.3136675-3-andrii@kernel.org
2023-10-06	selftests/bpf: Support building selftests in optimized -O2 mode	Andrii Nakryiko
	Add support for building selftests with -O2 level of optimization, which allows more compiler warnings detection (like lots of potentially uninitialized usage), but also is useful to have a faster-running test for some CPU-intensive tests. One can build optimized versions of libbpf and selftests by running: $ make RELEASE=1 There is a measurable speed up of about 10 seconds for me locally, though it's mostly capped by non-parallelized serial tests. User CPU time goes down by total 40 seconds, from 1m10s to 0m28s. Unoptimized build (-O0) ======================= Summary: 430/3544 PASSED, 25 SKIPPED, 4 FAILED real 1m59.937s user 1m10.877s sys 3m14.880s Optimized build (-O2) ===================== Summary: 425/3543 PASSED, 25 SKIPPED, 9 FAILED real 1m50.540s user 0m28.406s sys 3m13.198s Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231006175744.3136675-2-andrii@kernel.org
2023-10-06	selftests/bpf: Fix compiler warnings reported in -O2 mode	Andrii Nakryiko
	Fix a bunch of potentially unitialized variable usage warnings that are reported by GCC in -O2 mode. Also silence overzealous stringop-truncation class of warnings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231006175744.3136675-1-andrii@kernel.org
2023-10-06	tools: ynl-gen: use uapi header name for the header guard	Jakub Kicinski
	Chuck points out that we should use the uapi-header property when generating the guard. Otherwise we may generate the same guard as another file in the tree. Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-06	cxl/pci: Clarify devm host for memdev relative setup	Dan Williams
	It is all too easy to get confused about @dev usage in the CXL driver stack. Before adding a new cxl_pci_probe() setup operation that has a devm lifetime dependent on @cxlds->dev binding, but also references @cxlmd->dev, and prints messages, rework the devm_cxl_add_memdev() and cxl_memdev_setup_fw_upload() function signatures to make this distinction explicit. I.e. pass in the devm context as an @host argument rather than infer it from other objects. This is in preparation for adding a devm_cxl_sanitize_setup_notifier(). Note the whitespace fixup near the change of the devm_cxl_add_memdev() signature. That uncaught typo originated in the patch that added cxl_memdev_security_init(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-05	selftests/bpf: Enable CONFIG_VSOCKETS in config	Geliang Tang
	CONFIG_VSOCKETS is required by BPF selftests, otherwise we get errors like this: ./test_progs:socket_loopback_reuseport:386: socket: Address family not supported by protocol socket_loopback_reuseport:FAIL:386 ./test_progs:vsock_unix_redir_connectible:1496: vsock_socketpair_connectible() failed vsock_unix_redir_connectible:FAIL:1496 So this patch enables it in tools/testing/selftests/bpf/config. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Link: https://lore.kernel.org/r/472e73d285db2ea59aca9bbb95eb5d4048327588.1696490003.git.geliang.tang@suse.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-05	KVM: selftests: Zero-initialize entire test_result in memslot perf test	Sean Christopherson
	Zero-initialize the entire test_result structure used by memslot_perf_test instead of zeroing only the fields used to guard the pr_info() calls. gcc 13.2.0 is a bit overzealous and incorrectly thinks that rbestslottime's slot_runtime may be used uninitialized. In file included from memslot_perf_test.c:25: memslot_perf_test.c: In function ‘main’: include/test_util.h:31:22: error: ‘rbestslottime.slot_runtime.tv_nsec’ may be used uninitialized [-Werror=maybe-uninitialized] 31 \| #define pr_info(...) printf(__VA_ARGS__) \| ^~~~~~~~~~~~~~~~~~~ memslot_perf_test.c:1127:17: note: in expansion of macro ‘pr_info’ 1127 \| pr_info("Best slot setup time for the whole test area was %ld.%.9lds\n", \| ^~~~~~~ memslot_perf_test.c:1092:28: note: ‘rbestslottime.slot_runtime.tv_nsec’ was declared here 1092 \| struct test_result rbestslottime; \| ^~~~~~~~~~~~~ include/test_util.h:31:22: error: ‘rbestslottime.slot_runtime.tv_sec’ may be used uninitialized [-Werror=maybe-uninitialized] 31 \| #define pr_info(...) printf(__VA_ARGS__) \| ^~~~~~~~~~~~~~~~~~~ memslot_perf_test.c:1127:17: note: in expansion of macro ‘pr_info’ 1127 \| pr_info("Best slot setup time for the whole test area was %ld.%.9lds\n", \| ^~~~~~~ memslot_perf_test.c:1092:28: note: ‘rbestslottime.slot_runtime.tv_sec’ was declared here 1092 \| struct test_result rbestslottime; \| ^~~~~~~~~~~~~ That can't actually happen, at least not without the "result" structure in test_loop() also being used uninitialized, which gcc doesn't complain about, as writes to rbestslottime are all-or-nothing, i.e. slottimens can't be non-zero without slot_runtime being written. if (!data->mem_size && (!rbestslottime->slottimens \|\| result.slottimens < rbestslottime->slottimens)) *rbestslottime = result; Zero-initialize the structures to make gcc happy even though this is likely a compiler bug. The cost to do so is negligible, both in terms of code and runtime overhead. The only downside is that the compiler won't warn about legitimate usage of "uninitialized" data, e.g. the test could end up consuming zeros instead of useful data. However, given that the test is quite mature and unlikely to see substantial changes, the odds of introducing such bugs are relatively low, whereas being able to compile KVM selftests with -Werror detects issues on a regular basis. Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20231005002954.2887098-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-10-05	objtool: Remove max symbol name length limitation	Aaron Plattner
	If one of the symbols processed by read_symbols() happens to have a .cold variant with a name longer than objtool's MAX_NAME_LEN limit, the build fails. Avoid this problem by just using strndup() to copy the parent function's name, rather than strncpy()ing it onto the stack. Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Link: https://lore.kernel.org/r/41e94cfea1d9131b758dd637fecdeacd459d4584.1696355111.git.aplattner@nvidia.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-10-05	objtool: Propagate early errors	Aaron Plattner
	If objtool runs into a problem that causes it to exit early, the overall tool still returns a status code of 0, which causes the build to continue as if nothing went wrong. Note this only affects early errors, as later errors are still ignored by check(). Fixes: b51277eb9775 ("objtool: Ditch subcommands") Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Link: https://lore.kernel.org/r/cb6a28832d24b2ebfafd26da9abb95f874c83045.1696355111.git.aplattner@nvidia.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-10-05	selftests: timers: Convert nsleep-lat test to generate KTAP output	Mark Brown
	Currently the nsleep-lat test does not produce KTAP output but rather a custom format. This means that we only get a pass/fail for the suite, not for each individual test that the suite does. Convert to using the standard kselftest output functions which result in KTAP output being generated. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	selftests: timers: Convert posix_timers test to generate KTAP output	Mark Brown
	Currently the posix_timers test does not produce KTAP output but rather a custom format. This means that we only get a pass/fail for the suite, not for each individual test that the suite does. Convert to using the standard kselftest output functions which result in KTAP output being generated. As part of this fix the printing of diagnostics in the unlikely event that the pthread APIs fail, these were using perror() but the API functions directly return an error code instead of setting errno. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	selftests/exec: Convert execveat test to generate KTAP output	Mark Brown
	Currently the execveat test does not produce KTAP output but rather a custom format. This means that we only get a pass/fail for the suite, not for each individual test that the suite does. Convert to using the standard kselftest output functions which result in KTAP output being generated. The main trick with this is that, being an exec() related test, the program executes itself and returns specific exit codes to verify success meaning that we need to only use the top level kselftest header/summary functions when invoked directly rather than when run as part of a test. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	kselftest: Add a ksft_perror() helper	Mark Brown
	The standard library perror() function provides a convenient way to print an error message based on the current errno but this doesn't play nicely with KTAP output. Provide a helper which does an equivalent thing in a KTAP compatible format. nolibc doesn't have a strerror() and adding the table of strings required doesn't seem like a good fit for what it's trying to do so when we're using that only print the errno. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	selftests: static_keys: fix test name in messages	Javier Carrasco
	As a general rule, the name of the selftest is printed at the beginning of every message. Use "static_keys" (name of the test itself) consistently instead of mixing "static_key" and "static_keys" at the beginning of the messages in the test_static_keys script. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	selftests: uevent filtering: fix return on error in uevent_listener	Javier Carrasco
	The ret variable is used to check function return values and assigning values to it on error has no effect as it is an unused value. The current implementation uses an additional variable (fret) to return the error value, which in this case is unnecessary and lead to the above described misuse. There is no restriction in the current implementation to always return -1 on error and the actual negative error value can be returned safely without storing -1 in a specific variable. Simplify the error checking by using a single variable which always holds the returned value. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	selftests/dmabuf-heaps: add gitignore file	Javier Carrasco
	dmabuf-heaps builds a dmabuf-heap binary that can be ignored by git. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	selftests/tdx: add gitignore file	Javier Carrasco
	tdx builds a tdx_guest_test binary that can be ignored by git. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	selftests/user_events: add gitignore file	Javier Carrasco
	user_events builds a series of binaries that can be ignored by git. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-10-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR. No conflicts (or adjacent changes of note). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-05	Merge tag 'net-6.6-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, netfilter, BPF and WiFi. I didn't collect precise data but feels like we've got a lot of 6.5 fixes here. WiFi fixes are most user-awaited. Current release - regressions: - Bluetooth: fix hci_link_tx_to RCU lock usage Current release - new code bugs: - bpf: mprog: fix maximum program check on mprog attachment - eth: ti: icssg-prueth: fix signedness bug in prueth_init_tx_chns() Previous releases - regressions: - ipv6: tcp: add a missing nf_reset_ct() in 3WHS handling - vringh: don't use vringh_kiov_advance() in vringh_iov_xfer(), it doesn't handle zero length like we expected - wifi: - cfg80211: fix cqm_config access race, fix crashes with brcmfmac - iwlwifi: mvm: handle PS changes in vif_cfg_changed - mac80211: fix mesh id corruption on 32 bit systems - mt76: mt76x02: fix MT76x0 external LNA gain handling - Bluetooth: fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER - l2tp: fix handling of transhdrlen in __ip{,6}_append_data() - dsa: mv88e6xxx: avoid EEPROM timeout when EEPROM is absent - eth: stmmac: fix the incorrect parameter after refactoring Previous releases - always broken: - net: replace calls to sock->ops->connect() with kernel_connect(), prevent address rewrite in kernel_bind(); otherwise BPF hooks may modify arguments, unexpectedly to the caller - tcp: fix delayed ACKs when reads and writes align with MSS - bpf: - verifier: unconditionally reset backtrack_state masks on global func exit - s390: let arch_prepare_bpf_trampoline return program size, fix struct_ops offsets - sockmap: fix accounting of available bytes in presence of PEEKs - sockmap: reject sk_msg egress redirects to non-TCP sockets - ipv4/fib: send netlink notify when delete source address routes - ethtool: plca: fix width of reads when parsing netlink commands - netfilter: nft_payload: rebuild vlan header on h_proto access - Bluetooth: hci_codec: fix leaking memory of local_codecs - eth: intel: ice: always add legacy 32byte RXDID in supported_rxdids - eth: stmmac: - dwmac-stm32: fix resume on STM32 MCU - remove buggy and unneeded stmmac_poll_controller, depend on NAPI - ibmveth: always recompute TCP pseudo-header checksum, fix use of the driver with Open vSwitch - wifi: - rtw88: rtw8723d: fix MAC address offset in EEPROM - mt76: fix lock dependency problem for wed_lock - mwifiex: sanity check data reported by the device - iwlwifi: ensure ack flag is properly cleared - iwlwifi: mvm: fix a memory corruption due to bad pointer arithm - iwlwifi: mvm: fix incorrect usage of scan API Misc: - wifi: mac80211: work around Cisco AP 9115 VHT MPDU length" * tag 'net-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits) MAINTAINERS: update Matthieu's email address mptcp: userspace pm allow creating id 0 subflow mptcp: fix delegated action races net: stmmac: remove unneeded stmmac_poll_controller net: lan743x: also select PHYLIB net: ethernet: mediatek: disable irq before schedule napi net: mana: Fix oversized sge0 for GSO packets net: mana: Fix the tso_bytes calculation net: mana: Fix TX CQE error handling netlink: annotate data-races around sk->sk_err sctp: update hb timer immediately after users change hb_interval sctp: update transport state when processing a dupcook packet tcp: fix delayed ACKs for MSS boundary condition tcp: fix quick-ack counting to count actual ACKs of new data page_pool: fix documentation typos tipc: fix a potential deadlock on &tx->lock net: stmmac: dwmac-stm32: fix resume on STM32 MCU ipv4: Set offload_failed flag in fibmatch results netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure netfilter: nf_tables: Deduplicate nft_register_obj audit logs ...
2023-10-05	tools: iio: iio_generic_buffer ensure alignment	Matti Vaittinen
	The iio_generic_buffer can return garbage values when the total size of scan data is not a multiple of the largest element in the scan. This can be demonstrated by reading a scan, consisting, for example of one 4-byte and one 2-byte element, where the 4-byte element is first in the buffer. The IIO generic buffer code does not take into account the last two padding bytes that are needed to ensure that the 4-byte data for next scan is correctly aligned. Add the padding bytes required to align the next sample with the scan size. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Fixes: e58537ccce73 ("staging: iio: update example application.") Link: https://lore.kernel.org/r/ZRvlm4ktNLu+qmlf@dc78bmyyyyyyyyyyyyydt-3.rev.dnainternet.fi Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2023-10-05	Merge earlier changes in Intel thermal drivers for v6.7.	Rafael J. Wysocki

2023-10-04	tools/perf: Update call stack check in builtin-lock.c	Kajol Jain
	The perf test named "kernel lock contention analysis test" fails in powerpc system with below error: [command]# ./perf test 81 -vv 81: kernel lock contention analysis test : --- start --- test child forked, pid 2140 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf [Skip] No BPF support Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) Testing perf lock contention --callstack-filter (w/ unix_stream) [Fail] Recorded result should have a lock from unix_stream: test child finished with -1 ---- end ---- kernel lock contention analysis test: FAILED! The test is failing because we get an address entry with 0 in perf lock samples for powerpc, and code for lock contention option "--callstack-filter" will not check further entries after address 0. Below are some of the samples from test generated perf.data file, which have 0 address in the 2nd entry of callstack: -------- sched-messaging 3409 [001] 7152.904029: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN) c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms]) 0 [unknown] ([unknown]) c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms]) c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms]) c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms]) c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms]) c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms]) c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms]) sched-messaging 3408 [005] 7152.904036: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN) c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms]) 0 [unknown] ([unknown]) c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms]) c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms]) c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms]) c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms]) c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms]) c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms]) -------- Based on commit 20002ded4d93 ("perf_counter: powerpc: Add callchain support"), incase of powerpc, the callchain saved by kernel always includes first three entries as the NIP (next instruction pointer), LR (link register), and the contents of LR save area in the second stack frame. In certain scenarios its possible to have invalid kernel instruction addresses in either of LR or the second stack frame's LR. In that case, kernel will store the address as zer0. Hence, its possible to have 2nd or 3rd callstack entry as 0. As per the current code in match_callstack_filter function, we skip the callstack check incase we get 0 address. And hence the test case is failing in powerpc. Fix this issue by updating the check in match_callstack_filter function, to not skip callstack check if the 2nd or 3rd entry have 0 address for powerpc. Result in powerpc after patch changes: [command]# ./perf test 81 -vv 81: kernel lock contention analysis test : --- start --- test child forked, pid 4570 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf [Skip] No BPF support Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) [Skip] Could not find 'tasklist_lock' Testing perf lock contention --callstack-filter (w/ unix_stream) Testing perf lock contention --callstack-filter with task aggregation Testing perf lock contention CSV output [Skip] No BPF support test child finished with 0 ---- end ---- kernel lock contention analysis test: Ok Fixes: ebab291641be ("perf lock contention: Support filters for different aggregation") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Cc: maddy@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Link: https://lore.kernel.org/r/20231003092113.252380-1-kjain@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-04	tools/perf/tests: Fix object code reading to skip address that falls out of ↵	Athira Rajeev
	text section The testcase "Object code reading" fails in somecases for "fs_something" sub test as below: Reading object code for memory address: 0xc008000007f0142c File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko On file address is: 0x1114cc Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko objdump read too few bytes: 128 test child finished with -1 This can alo be reproduced when running perf record with workload that exercises fs_something() code. In the test setup, this is exercising xfs code since root is xfs. # perf record ./a.out # perf report -v \|grep "xfs.ko" 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc008000007de5efc B [k] xlog_cil_commit 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc008000007d5ae18 B [k] xfs_btree_key_offset 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc008000007e11fd4 B [k] 0x0000000000112074 Here addr "0xc008000007e11fd4" is not resolved. since this is a kernel module, its offset is from the DSO. Xfs module is loaded at 0xc008000007d00000 # cat /proc/modules \| grep xfs xfs 2228224 3 - Live 0xc008000007d00000 And size is 0x220000. So its loaded between 0xc008000007d00000 and 0xc008000007f20000. From objdump, text section is: text 0010f7bc 0000000000000000 0000000000000000 000000a0 2**4 Hence perf captured ip maps to 0x112074 which is: ( ip - start of module ) + a0 This offset 0x112074 falls out .text section which is up to 0x10f7bc In this case for module, the address 0xc008000007e11fd4 is pointing to stub instructions. This address range represents the module stubs which is allocated on module load and hence is not part of DSO offset. To address this issue in "object code reading", skip the sample if address falls out of text section and is within the module end. Use the "text_end" member of "struct dso" to do this check. To address this issue in "perf report", exploring an option of having stubs range as part of the /proc/kallsyms, so that perf report can resolve addresses in stubs range However this patch uses text_end to skip the stub range for Object code reading testcase. Reported-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel<disgoel@linux.ibm.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230928075213.84392-3-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-04	tools/perf: Add "is_kmod" to struct dso to check if it is kernel module	Athira Rajeev
	Update "struct dso" to include new member "is_kmod". This new field will determine if the file is a kernel module or not. To resolve the address from a sample, perf looks at the DSO maps. In case of address from a kernel module, there were some address found to be not resolved. This was observed while running perf test for "Object code reading". Though the ip falls beteen the start address of the loaded module (perf map->start ) and end address ( perf map->end), it was unresolved. This was happening because in some cases for kernel modules, address from sample points to stub instructions. To identify if the DSO is a kernel module, the new field "is_kmod" is added to "struct dso". Reported-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230928075213.84392-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-04	tools/perf: Add text_end to "struct dso" to save .text section size	Athira Rajeev
	Update "struct dso" to include new member "text_end". This new field will represent the offset for end of text section for a dso. For elf, this value is derived as: sh_size (Size of section in byes) + sh_offset (Section file offst) of the elf header for text. For bfd, this value is derived as: 1. For PE file, section->size + ( section->vma - dso->text_offset) 2. Other cases: section->filepos (file position) + section->size (size of section) To resolve the address from a sample, perf looks at the DSO maps. In case of address from a kernel module, there were some address found to be not resolved. This was observed while running perf test for "Object code reading". Though the ip falls beteen the start address of the loaded module (perf map->start ) and end address ( perf map->end), it was unresolved. Example: Reading object code for memory address: 0xc008000007f0142c File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko On file address is: 0x1114cc Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko objdump read too few bytes: 128 test child finished with -1 Here, module is loaded at: # cat /proc/modules \| grep xfs xfs 2228224 3 - Live 0xc008000007d00000 From objdump for xfs module, text section is: text 0010f7bc 0000000000000000 0000000000000000 000000a0 2**4 Here the offset for 0xc008000007f0142c ie 0x112074 falls out .text section which is up to 0x10f7bc. In this case for module, the address 0xc008000007e11fd4 is pointing to stub instructions. This address range represents the module stubs which is allocated on module load and hence is not part of DSO offset. To identify such address, which falls out of text section and within module end, added the new field "text_end" to "struct dso". Reported-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230928075213.84392-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>