summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2023-04-06selftests: xsk: Add test UNALIGNED_INV_DESC_4K1_FRAME_SIZEKal Conley
Add unaligned descriptor test for frame size of 4001. Using an odd frame size ensures that the end of the UMEM is not near a page boundary. This allows testing descriptors that staddle the end of the UMEM but not a page. This test used to fail without the previous commit ("xsk: Fix unaligned descriptor validation"). Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230405235920.7305-3-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-06selftests/bpf: fix xdp_redirect xdp-features selftest for veth driverLorenzo Bianconi
xdp-features supported by veth driver are no more static, but they depends on veth configuration (e.g. if GRO is enabled/disabled or TX/RX queue configuration). Take it into account in xdp_redirect xdp-features selftest for veth driver. Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/bc35455cfbb1d4f7f52536955ded81ad47d8dc54.1680777371.git.lorenzo@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-06selftests/net: fix typo in tcp_mmapEric Dumazet
kernel test robot reported the following warning: All warnings (new ones prefixed by >>): tcp_mmap.c: In function 'child_thread': >> tcp_mmap.c:211:61: warning: 'lu' may be used uninitialized in this function [-Wmaybe-uninitialized] 211 | zc.length = min(chunk_size, FILE_SZ - lu); We want to read FILE_SZ bytes, so the correct expression should be (FILE_SZ - total) Fixes: 5c5945dc695c ("selftests/net: Add SHA256 computation over data sent in tcp_mmap") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304042104.UFIuevBp-lkp@intel.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Xiaoyan Li <lixiaoyan@google.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230405071556.1019623-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-06selftests/clone3: fix number of tests in ksft_set_planTobias Klauser
Commit 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME") added an additional test, so the number passed to ksft_set_plan needs to be bumped accordingly. Also use ksft_finished() to print results and exit. This will catch future mismatches between ksft_set_plan() and the number of tests being run. Fixes: 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME") Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-04-05bpftool: Clean up _bpftool_once_attr() calls in bash completionQuentin Monnet
In bpftool's bash completion file, function _bpftool_once_attr() is able to process multiple arguments. There are a few locations where this function is called multiple times in a row, each time for a single argument; let's pass all arguments instead to minimize the number of function calls required for the completion. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-8-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05bpftool: Support printing opcodes and source file references in CFGQuentin Monnet
Add support for displaying opcodes or/and file references (filepath, line and column numbers) when dumping the control flow graphs of loaded BPF programs with bpftool. The filepaths in the records are absolute. To avoid blocks on the graph to get too wide, we truncate them when they get too long (but we always keep the entire file name). In the unlikely case where the resulting file name is ambiguous, it remains possible to get the full path with a regular dump (no CFG). Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-7-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05bpftool: Support "opcodes", "linum", "visual" simultaneouslyQuentin Monnet
When dumping a program, the keywords "opcodes" (for printing the raw opcodes), "linum" (for displaying the filename, line number, column number along with the source code), and "visual" (for generating the control flow graph for translated programs) are mutually exclusive. But there's no reason why they should be. Let's make it possible to pass several of them at once. The "file FILE" option, which makes bpftool output a binary image to a file, remains incompatible with the others. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-6-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05bpftool: Return an error on prog dumps if both CFG and JSON are requiredQuentin Monnet
We do not support JSON output for control flow graphs of programs with bpftool. So far, requiring both the CFG and JSON output would result in producing a null JSON object. It makes more sense to raise an error directly when parsing command line arguments and options, so that users know they won't get any output they might expect. If JSON is required for the graph, we leave it to Graphviz instead: # bpftool prog dump xlated <REF> visual | dot -Tjson Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-5-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05bpftool: Support inline annotations when dumping the CFG of a programQuentin Monnet
We support dumping the control flow graph of loaded programs to the DOT format with bpftool, but so far this feature wouldn't display the source code lines available through BTF along with the eBPF bytecode. Let's add support for these annotations, to make it easier to read the graph. In prog.c, we move the call to dump_xlated_cfg() in order to pass and use the full struct dump_data, instead of creating a minimal one in draw_bb_node(). We pass the pointer to this struct down to dump_xlated_for_graph() in xlated_dumper.c, where most of the logics is added. We deal with BTF mostly like we do for plain or JSON output, except that we cannot use a "nr_skip" value to skip a given number of linfo records (we don't process the BPF instructions linearly, and apart from the root of the graph we don't know how many records we should skip, so we just store the last linfo and make sure the new one we find is different before printing it). When printing the source instructions to the label of a DOT graph node, there are a few subtleties to address. We want some special newline markers, and there are some characters that we must escape. To deal with them, we introduce a new dedicated function btf_dump_linfo_dotlabel() in btf_dumper.c. We'll reuse this function in a later commit to format the filepath, line, and column references as well. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-4-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05bpftool: Fix bug for long instructions in program CFG dumpsQuentin Monnet
When dumping the control flow graphs for programs using the 16-byte long load instruction, we need to skip the second part of this instruction when looking for the next instruction to process. Otherwise, we end up printing "BUG_ld_00" from the kernel disassembler in the CFG. Fixes: efcef17a6d65 ("tools: bpftool: generate .dot graph from CFG information") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-3-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05bpftool: Fix documentation about line info display for prog dumpsQuentin Monnet
The documentation states that when line_info is available when dumping a program, the source line will be displayed "by default". There is no notion of "default" here: the line is always displayed if available, there is no way currently to turn it off. In the next sentence, the documentation states that if "linum" is used on the command line, the relevant filename, line, and column will be displayed "on top of the source line". This is incorrect, as they are currently displayed on the right side of the source line (or on top of the eBPF instruction, not the source). This commit fixes the documentation to address these points. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20230405132120.59886-2-quentin@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-05selftests/mm: set overcommit_policy as OVERCOMMIT_ALWAYSChaitanya S Prakash
The kernel's default behaviour is to obstruct the allocation of high virtual address as it handles memory overcommit in a heuristic manner. Setting the parameter as OVERCOMMIT_ALWAYS, ensures kernel isn't susceptible to the availability of a platform's physical memory when denying a memory allocation request. Link: https://lkml.kernel.org/r/20230323060121.1175830-4-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05selftests/mm: change NR_CHUNKS_HIGH for aarch64Chaitanya S Prakash
Although there is a provision for 52 bit VA on arm64 platform, it remains unutilised and higher addresses are not allocated. In order to accommodate 4PB [2^52] virtual address space where supported, NR_CHUNKS_HIGH is changed accordingly. Array holding addresses is changed from static allocation to dynamic allocation to accommodate its voluminous nature which otherwise might overflow the stack. Link: https://lkml.kernel.org/r/20230323060121.1175830-3-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05selftests/mm: change MAP_CHUNK_SIZEChaitanya S Prakash
Patch series "selftests: Fix virtual address range for arm64", v2. When the virtual address range selftest is run on arm64 and x86 platforms, it is observed that both the low and high VA range iterations are skipped when the MAP_CHUNK_SIZE is set to 16GB. The MAP_CHUNK_SIZE is changed to 1GB to resolve this issue, following which support for arm64 platform is added by changing the NR_CHUNKS_HIGH for aarch64 to accommodate up to 4PB of virtual address space allocation requests. Dynamic memory allocation of array holding addresses is introduced to prevent overflow of the stack. Finally, the overcommit_policy is set as OVERCOMMIT_ALWAYS to prevent the kernel from denying a memory allocation request based on a platform's physical memory availability. This patch (of 3): mmap() fails to allocate 16GB virtual space chunk, skipping both low and high VA range iterations. Hence, reduce MAP_CHUNK_SIZE to 1GB and update relevant macros as required. Link: https://lkml.kernel.org/r/20230323060121.1175830-1-chaitanyas.prakash@arm.com Link: https://lkml.kernel.org/r/20230323060121.1175830-2-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05mm: userfaultfd: add UFFDIO_CONTINUE_MODE_WP to install WP PTEsAxel Rasmussen
UFFDIO_COPY already has UFFDIO_COPY_MODE_WP, so when installing a new PTE to resolve a missing fault, one can install a write-protected one. This is useful when using UFFDIO_REGISTER_MODE_{MISSING,WP} in combination. This was motivated by testing HugeTLB HGM [1], and in particular its interaction with userfaultfd features. Existing userfaultfd code supports using WP and MINOR modes together (i.e. you can register an area with both enabled), but without this CONTINUE flag the combination is in practice unusable. So, add an analogous UFFDIO_CONTINUE_MODE_WP, which does the same thing as UFFDIO_COPY_MODE_WP, but for *minor* faults. Update the selftest to do some very basic exercising of the new flag. Update Documentation/ to describe how these flags are used (neither the COPY nor the new CONTINUE versions of this mode flag were described there before). [1]: https://patchwork.kernel.org/project/linux-mm/cover/20230218002819.1486479-1-jthoughton@google.com/ Link: https://lkml.kernel.org/r/20230314221250.682452-5-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nadav Amit <namit@vmware.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05mm, treewide: redefine MAX_ORDER sanelyKirill A. Shutemov
MAX_ORDER currently defined as number of orders page allocator supports: user can ask buddy allocator for page order between 0 and MAX_ORDER-1. This definition is counter-intuitive and lead to number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders user can ask from buddy allocator is 0..MAX_ORDER now. [kirill@shutemov.name: fix min() warning] Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box [akpm@linux-foundation.org: fix another min_t warning] [kirill@shutemov.name: fixups per Zi Yan] Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name [akpm@linux-foundation.org: fix underlining in docs] Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05selftests/mm: smoke test UFFD_FEATURE_WP_UNPOPULATEDPeter Xu
Enable it by default on the stress test, and add some smoke tests for the pte markers on anonymous. Link: https://lkml.kernel.org/r/20230309223711.823547-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Paul Gofman <pgofman@codeweavers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05selftests: net: rps_default_mask.sh: delete veth link specificallyHangbin Liu
When deleting the netns and recreating a new one while re-adding the veth interface, there is a small window of time during which the old veth interface has not yet been removed. This can cause the new addition to fail. To resolve this issue, we can either wait for a short while to ensure that the old veth interface is deleted, or we can specifically remove the veth interface. Before this patch: # ./rps_default_mask.sh empty rps_default_mask [ ok ] changing rps_default_mask dont affect existing devices [ ok ] changing rps_default_mask dont affect existing netns [ ok ] changing rps_default_mask affect newly created devices [ ok ] changing rps_default_mask don't affect newly child netns[II][ ok ] rps_default_mask is 0 by default in child netns [ ok ] RTNETLINK answers: File exists changing rps_default_mask in child ns don't affect the main one[ ok ] cat: /sys/class/net/vethC11an1/queues/rx-0/rps_cpus: No such file or directory changing rps_default_mask in child ns affects new childns devices./rps_default_mask.sh: line 36: [: -eq: unary operator expected [fail] expected 1 found changing rps_default_mask in child ns don't affect existing devices[ ok ] After this patch: # ./rps_default_mask.sh empty rps_default_mask [ ok ] changing rps_default_mask dont affect existing devices [ ok ] changing rps_default_mask dont affect existing netns [ ok ] changing rps_default_mask affect newly created devices [ ok ] changing rps_default_mask don't affect newly child netns[II][ ok ] rps_default_mask is 0 by default in child netns [ ok ] changing rps_default_mask in child ns don't affect the main one[ ok ] changing rps_default_mask in child ns affects new childns devices[ ok ] changing rps_default_mask in child ns don't affect existing devices[ ok ] Fixes: 3a7d84eae03b ("self-tests: more rps self tests") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20230404072411.879476-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-05maple_tree: fix write memory barrier of nodes once dead for RCU modeLiam R. Howlett
During the development of the maple tree, the strategy of freeing multiple nodes changed and, in the process, the pivots were reused to store pointers to dead nodes. To ensure the readers see accurate pivots, the writers need to mark the nodes as dead and call smp_wmb() to ensure any readers can identify the node as dead before using the pivot values. There were two places where the old method of marking the node as dead without smp_wmb() were being used, which resulted in RCU readers seeing the wrong pivot value before seeing the node was dead. Fix this race condition by using mte_set_node_dead() which has the smp_wmb() call to ensure the race is closed. Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being freed are marked as dead to ensure there are no other call paths besides the two updated paths. This is necessary for the RCU mode of the maple tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-6-surenb@google.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05selftests/bpf: Wait for receive in cg_storage_multi testYiFei Zhu
In some cases the loopback latency might be large enough, causing the assertion on invocations to be run before ingress prog getting executed. The assertion would fail and the test would flake. This can be reliably reproduced by arbitrarily increasing the loopback latency (thanks to [1]): tc qdisc add dev lo root handle 1: htb default 12 tc class add dev lo parent 1:1 classid 1:12 htb rate 20kbps ceil 20kbps tc qdisc add dev lo parent 1:12 netem delay 100ms Fix this by waiting on the receive end, instead of instantly returning to the assert. The call to read() will wait for the default SO_RCVTIMEO timeout of 3 seconds provided by start_server(). [1] https://gist.github.com/kstevens715/4598301 Reported-by: Martin KaFai Lau <martin.lau@linux.dev> Link: https://lore.kernel.org/bpf/9c5c8b7e-1d89-a3af-5400-14fde81f4429@linux.dev/ Fixes: 3573f384014f ("selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress") Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/20230405193354.1956209-1-zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05selftests: xsk: Deflakify STATS_RX_DROPPED testKal Conley
Fix flaky STATS_RX_DROPPED test. The receiver calls getsockopt after receiving the last (valid) packet which is not the final packet sent in the test (valid and invalid packets are sent in alternating fashion with the final packet being invalid). Since the last packet may or may not have been dropped already, both outcomes must be allowed. This issue could also be fixed by making sure the last packet sent is valid. This alternative is left as an exercise to the reader (or the benevolent maintainers of this file). This problem was quite visible on certain setups. On one machine this failure was observed 50% of the time. Also, remove a redundant assignment of pkt_stream->nb_pkts. This field is already initialized by __pkt_stream_alloc. Fixes: 27e934bec35b ("selftests: xsk: make stat tests not spin on getsockopt") Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403120400.31018-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05selftests: xsk: Disable IPv6 on VETH1Kal Conley
This change fixes flakiness in the BIDIRECTIONAL test: # [is_pkt_valid] expected length [60], got length [90] not ok 1 FAIL: SKB BUSY-POLL BIDIRECTIONAL When IPv6 is enabled, the interface will periodically send MLDv1 and MLDv2 packets. These packets can cause the BIDIRECTIONAL test to fail since it uses VETH0 for RX. For other tests, this was not a problem since they only receive on VETH1 and IPv6 was already disabled on VETH0. Fixes: a89052572ebb ("selftests/bpf: Xsk selftests framework") Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230405082905.6303-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05kunit: tool: Add support for SH under QEMUGeert Uytterhoeven
Add basic support to run SH under QEMU via kunit_tool using the virtualized r2d platform. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-04-05kunit: tool: Add support for overriding the QEMU serial portGeert Uytterhoeven
On some platforms, the console is not the first serial port. To make this work, the first serial port in QEMU must be set to "null". Add support for this by adding an optional "serial" parameter, which defaults to "stdio", and can be overridden by platform-specific configuration. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-04-05selftests: xsk: Add test case for packets at end of UMEMKal Conley
Add test case to testapp_invalid_desc for valid packets at the end of the UMEM. Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403145047.33065-3-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05selftests: xsk: Use correct UMEM size in testapp_invalid_descKal Conley
Avoid UMEM_SIZE macro in testapp_invalid_desc which is incorrect when the frame size is not XSK_UMEM__DEFAULT_FRAME_SIZE. Also remove the macro since it's no longer being used. Fixes: 909f0e28207c ("selftests: xsk: Add tests for 2K frame size") Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230403145047.33065-2-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05selftests: xsk: Add xskxceiver.h dependency to MakefileKal Conley
xskxceiver depends on xskxceiver.h so tell make about it. Signed-off-by: Kal Conley <kal.conley@dectris.com> Link: https://lore.kernel.org/r/20230403130151.31195-1-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-04-05Merge branches 'rcu/staging-core', 'rcu/staging-docs' and ↵Joel Fernandes (Google)
'rcu/staging-kfree', remote-tracking branches 'paul/srcu-cf.2023.04.04a', 'fbq/rcu/lockdep.2023.03.27a' and 'fbq/rcu/rcutorture.2023.03.20a' into rcu/staging
2023-04-05rcu: Remove CONFIG_SRCUPaul E. McKenney
Now that all references to CONFIG_SRCU have been removed, it is time to remove CONFIG_SRCU itself. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Reviewed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-04selftests/bpf: Add tracing tests for walking skb and req.Alexei Starovoitov
Add tracing tests for walking skb->sk and req->sk. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-9-alexei.starovoitov@gmail.com
2023-04-04selftests/bpf: Add RESOLVE_BTFIDS dependency to bpf_testmod.koIlya Leoshkevich
bpf_testmod.ko sometimes fails to build from a clean checkout: BTF [M] linux/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.ko /bin/sh: 1: linux-build//tools/build/resolve_btfids/resolve_btfids: not found The reason is that RESOLVE_BTFIDS may not yet be built. Fix by adding a dependency. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230403172935.1553022-1-iii@linux.ibm.com
2023-04-04iommufd/selftest: Cover domain unmap with huge pages and accessJason Gunthorpe
Inspired by the syzkaller reproducer check the batch carry path with a simple test. Link: https://lore.kernel.org/r/4-v1-ceab6a4d7d7a+94-iommufd_syz_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-04tools/virtio: fix typo in README instructionsRoss Zwisler
We need to have a unique chardev for each data path, else the chardevs will collide and qemu will die with this message: qemu-system-x86_64: -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0, id=channel1,name=trace-path-cpu0: Property 'virtserialport.chardev' can't take value 'charchannel0': Device 'charchannel0' is in use Signed-off-by: Ross Zwisler <zwisler@google.com> Message-Id: <20230215223350.2658616-7-zwisler@google.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-04Merge branch 'iommufd/for-rc' into for-nextJason Gunthorpe
The following selftest patch requires both the bug fixes and the improvements of the selftest framework. * iommufd/for-rc: iommufd: Do not corrupt the pfn list when doing batch carry iommufd: Fix unpinning of pages when an access is present iommufd: Check for uptr overflow Linux 6.3-rc5 Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-04vsock/test: update expected return valuesArseniy Krasnov
This updates expected return values for invalid buffer test. Now such values are returned from transport, not from af_vsock.c. Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-03Merge 6.3-rc5 into driver-core-nextGreg Kroah-Hartman
We need the fixes in here for testing, as well as the driver core changes for documentation updates to build on. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-01bpf: Remove now-defunct task kfuncsDavid Vernet
In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the 'refcount_t rcu_users' field was extracted out of a union with the 'struct rcu_head rcu' field. This allows us to safely perform a refcount_inc_not_zero() on task->rcu_users when acquiring a reference on a task struct. A prior patch leveraged this by making struct task_struct an RCU-protected object in the verifier, and by bpf_task_acquire() to use the task->rcu_users field for synchronization. Now that we can use RCU to protect tasks, we no longer need bpf_task_kptr_get(), or bpf_task_acquire_not_zero(). bpf_task_kptr_get() is truly completely unnecessary, as we can just use RCU to get the object. bpf_task_acquire_not_zero() is now equivalent to bpf_task_acquire(). In addition to these changes, this patch also updates the associated selftests to no longer use these kfuncs. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230331195733.699708-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01bpf: Make struct task_struct an RCU-safe typeDavid Vernet
struct task_struct objects are a bit interesting in terms of how their lifetime is protected by refcounts. task structs have two refcount fields: 1. refcount_t usage: Protects the memory backing the task struct. When this refcount drops to 0, the task is immediately freed, without waiting for an RCU grace period to elapse. This is the field that most callers in the kernel currently use to ensure that a task remains valid while it's being referenced, and is what's currently tracked with bpf_task_acquire() and bpf_task_release(). 2. refcount_t rcu_users: A refcount field which, when it drops to 0, schedules an RCU callback that drops a reference held on the 'usage' field above (which is acquired when the task is first created). This field therefore provides a form of RCU protection on the task by ensuring that at least one 'usage' refcount will be held until an RCU grace period has elapsed. The qualifier "a form of" is important here, as a task can remain valid after task->rcu_users has dropped to 0 and the subsequent RCU gp has elapsed. In terms of BPF, we want to use task->rcu_users to protect tasks that function as referenced kptrs, and to allow tasks stored as referenced kptrs in maps to be accessed with RCU protection. Let's first determine whether we can safely use task->rcu_users to protect tasks stored in maps. All of the bpf_task* kfuncs can only be called from tracepoint, struct_ops, or BPF_PROG_TYPE_SCHED_CLS, program types. For tracepoint and struct_ops programs, the struct task_struct passed to a program handler will always be trusted, so it will always be safe to call bpf_task_acquire() with any task passed to a program. Note, however, that we must update bpf_task_acquire() to be KF_RET_NULL, as it is possible that the task has exited by the time the program is invoked, even if the pointer is still currently valid because the main kernel holds a task->usage refcount. For BPF_PROG_TYPE_SCHED_CLS, tasks should never be passed as an argument to the any program handlers, so it should not be relevant. The second question is whether it's safe to use RCU to access a task that was acquired with bpf_task_acquire(), and stored in a map. Because bpf_task_acquire() now uses task->rcu_users, it follows that if the task is present in the map, that it must have had at least one task->rcu_users refcount by the time the current RCU cs was started. Therefore, it's safe to access that task until the end of the current RCU cs. With all that said, this patch makes struct task_struct is an RCU-protected object. In doing so, we also change bpf_task_acquire() to be KF_ACQUIRE | KF_RCU | KF_RET_NULL, and adjust any selftests as necessary. A subsequent patch will remove bpf_task_kptr_get(), and bpf_task_acquire_not_zero() respectively. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230331195733.699708-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01veristat: small fixed found in -O2 modeAndrii Nakryiko
Fix few potentially unitialized variables uses, found while building veristat.c in release (-O2) mode. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01veristat: avoid using kernel-internal headersAndrii Nakryiko
Drop linux/compiler.h include, which seems to be needed for ARRAY_SIZE macro only. Redefine own version of ARRAY_SIZE instead. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01veristat: improve version reportingAndrii Nakryiko
For packaging version of the tool is important, so add a simple way to specify veristat version for upstream mirror at Github. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01veristat: relicense veristat.c as dual GPL-2.0-only or BSD-2-Clause licensedAndrii Nakryiko
Dual-license veristat.c to dual GPL-2.0-only or BSD-2-Clause license. This is needed to mirror it to Github to make it convenient for distro packagers to package veristat as a separate package. Veristat grew into a useful tool by itself, and there are already a bunch of users relying on veristat as generic BPF loading and verification helper tool. So making it easy to packagers by providing Github mirror just like we do for bpftool and libbpf is the next step to get veristat into the hands of users. Apart from few typo fixes, I'm the sole contributor to veristat.c so far, so no extra Acks should be needed for relicensing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230331222405.3468634-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31selftests/bpf: Fix conflicts with built-in functions in ↵James Hilliard
bench_local_storage_create The fork function in gcc is considered a built in function due to being used by libgcov when building with gnu extensions. Rename fork to sched_process_fork to prevent this conflict. See details: https://github.com/gcc-mirror/gcc/commit/d1c38823924506d389ca58d02926ace21bdf82fa https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82457 Fixes the following error: In file included from progs/bench_local_storage_create.c:6: progs/bench_local_storage_create.c:43:14: error: conflicting types for built-in function 'fork'; expected 'int(void)' [-Werror=builtin-declaration-mismatch] 43 | int BPF_PROG(fork, struct task_struct *parent, struct task_struct *child) | ^~~~ Fixes: cbe9d93d58b1 ("selftests/bpf: Add bench for task storage creation") Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230331075848.1642814-1-james.hilliard1@gmail.com
2023-03-31selftests/bpf: Replace extract_build_id with read_build_idJiri Olsa
Replacing extract_build_id with read_build_id that parses out build id directly from elf without using readelf tool. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230331093157.1749137-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31selftests/bpf: Add read_build_id functionJiri Olsa
Adding read_build_id function that parses out build id from specified binary. It will replace extract_build_id and also be used in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230331093157.1749137-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31selftests/bpf: Add err.h headerJiri Olsa
Moving error macros from profiler.inc.h to new err.h header. It will be used in following changes. Also adding PTR_ERR macro that will be used in following changes. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230331093157.1749137-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-31selftests mount: Fix mount_setattr_test builds failedAnh Tuan Phan
When compiling selftests with target mount_setattr I encountered some errors with the below messages: mount_setattr_test.c: In function ‘mount_setattr_thread’: mount_setattr_test.c:343:16: error: variable ‘attr’ has initializer but incomplete type 343 | struct mount_attr attr = { | ^~~~~~~~~~ These errors might be because of linux/mount.h is not included. This patch resolves that issue. Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-03-30tools: ynl: ethtool testing toolStanislav Fomichev
This is what I've been using to see whether the spec makes sense. A small subset of getters (mostly the unprivileged ones) is implemented. Some setters (channels) also work. Setters for messages with bitmasks are not implemented. Initially I was trying to make this tool look 1:1 like real ethtool, but eventually gave up :-) Sample output: $ ./tools/net/ynl/ethtool enp0s31f6 Settings for enp0s31f6: Supported ports: [ TP ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full Supported pause frame use: no Supports auto-negotiation: yes Supported FEC modes: Not reported Speed: Unknown! Duplex: Unknown! (255) Auto-negotiation: on Port: Twisted Pair PHYAD: 2 Transceiver: Internal MDI-X: Unknown (auto) Current message level: drv probe link Link detected: no Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30tools: ynl: replace print with NlErrorStanislav Fomichev
Instead of dumping the error on the stdout, make the callee and opportunity to decide what to do with it. This is mostly for the ethtool testing. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30tools: ynl: support byte-order in cliStanislav Fomichev
Used by ethtool spec. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>