linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-11-12	perf trace: Beautify the 'level' argument of getsockopt	Arnaldo Carvalho de Melo
	# perf trace -e getsockopt 0.000 ( 0.006 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0 0.301 ( 0.003 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected) 2.215 ( 0.005 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0 2.422 ( 0.005 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected) 1001.308 ( 0.006 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0 1001.586 ( 0.003 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected) 1001.647 ( 0.002 ms): systemd-resolv/1121 getsockopt(fd: 23, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected) 1003.868 ( 0.010 ms): systemd-resolv/1121 getsockopt(fd: 21, level: 1, optname: 17, optval: 0x7ffee2c0c6cc, optlen: 0x7ffee2c0c6c8) = 0 1004.036 ( 0.006 ms): systemd-resolv/1121 getsockopt(fd: 22, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected) 1004.087 ( 0.002 ms): systemd-resolv/1121 getsockopt(fd: 23, level: IP, optname: 14, optval: 0x7ffee2c0c1a0, optlen: 0x7ffee2c0c1a4) = -1 ENOTCONN (Transport endpoint is not connected) ^C# So the simple straight STRARRAY method is not enough as SOL_SOCKET is '1' in most architectures but some use 0xffff (alpha, mips, parisc and sparc), so a followup patch will create a specialized scnprintf to cover that. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12	perf beauty socket: Add generator for socket level (SOL_*) string table	Arnaldo Carvalho de Melo
	$ tools/perf/trace/beauty/socket.sh static const char socket_ipproto[] = { [0] = "IP", [1] = "ICMP", <SNIP> [255] = "RAW", [262] = "MPTCP", }; static const char socket_level[] = { [0] = "IP", [6] = "TCP", [17] = "UDP", [41] = "IPV6", [58] = "ICMPV6", [132] = "SCTP", [136] = "UDPLITE", [255] = "RAW", [256] = "IPX", [257] = "AX25", [258] = "ATALK", [259] = "NETROM", [260] = "ROSE", [261] = "DECNET", [262] = "X25", [263] = "PACKET", [264] = "ATM", [265] = "AAL", [266] = "IRDA", [267] = "NETBEUI", [268] = "LLC", [269] = "DCCP", [270] = "NETLINK", [271] = "TIPC", [272] = "RXRPC", [273] = "PPPOL2TP", [274] = "BLUETOOTH", [275] = "PNPIPE", [276] = "RDS", [277] = "IUCV", [278] = "CAIF", [279] = "ALG", [280] = "NFC", [281] = "KCM", [282] = "TLS", [283] = "XDP", [284] = "MPTCP", [285] = "MCTP", }; $ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12	perf beauty socket: Sort the ipproto array entries	Arnaldo Carvalho de Melo
	Just tidying up the output for human consumption. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12	perf beauty socket: Rename 'regex' to 'ipproto_regex'	Arnaldo Carvalho de Melo
	Paving the way for more regexps to be used here. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12	perf beauty socket: Prep to receive more input header files	Arnaldo Carvalho de Melo
	Move from ternary like expression to an if block, this way we'll have just the extra lines for new files in the following patches. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12	perf beauty socket: Rename header_dir to uapi_header_dir	Arnaldo Carvalho de Melo
	Paving the way to pass more headers to be consumed, like tools/perf/trace/beauty/include/linux/socket.h in addition to the current tools/include/uapi/linux/in.h, to get the SOL_* defines. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12	perf beauty: Rename socket_ipproto.sh to socket.sh to hold more socket table ↵	Arnaldo Carvalho de Melo
	generators To avoid having to add new entries to tools/perf/Makefile.perf prep socket.sh so that it can generate other socket table generators, such as the upcoming SOL_ socket level one. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12	perf beauty: Make all sockaddr files use a common naming scheme	Arnaldo Carvalho de Melo
	The script that generates the tables was named 'socket.sh', which is confusing, rename it to sockaddr.sh and make sure the related Makefile.perf targets also use the 'sockaddr' namespace. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-11	selftests/bpf: Clarify llvm dependency with btf_tag selftest	Yonghong Song
	btf_tag selftest needs certain llvm versions (>= llvm14). Make it clear in the selftests README.rst file. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012651.1508549-1-yhs@fb.com
2021-11-11	selftests/bpf: Add a C test for btf_type_tag	Yonghong Song
	The following is the main btf_type_tag usage in the C test: #define __tag1 __attribute__((btf_type_tag("tag1"))) #define __tag2 __attribute__((btf_type_tag("tag2"))) struct btf_type_tag_test { int __tag1 * __tag1 __tag2 p; } g; The bpftool raw dump with related types: [4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED [11] STRUCT 'btf_type_tag_test' size=8 vlen=1 'p' type_id=14 bits_offset=0 [12] TYPE_TAG 'tag1' type_id=16 [13] TYPE_TAG 'tag2' type_id=12 [14] PTR '(anon)' type_id=13 [15] TYPE_TAG 'tag1' type_id=4 [16] PTR '(anon)' type_id=15 [17] VAR 'g' type_id=11, linkage=global With format C dump, we have struct btf_type_tag_test { int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag2"))) *p; }; The result C code is identical to the original definition except macro's are gone. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211112012646.1508231-1-yhs@fb.com
2021-11-11	selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c	Yonghong Song
	Rename progs/tag.c to progs/btf_decl_tag.c so we can introduce progs/btf_type_tag.c in the next patch. Also create a subtest for btf_decl_tag in prog_tests/btf_tag.c so we can introduce btf_type_tag subtest in the next patch. I also took opportunity to remove the check whether __has_attribute is defined or not in progs/btf_decl_tag.c since all recent clangs should already support this macro. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012641.1507144-1-yhs@fb.com
2021-11-11	selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication	Yonghong Song
	Add BTF_KIND_TYPE_TAG duplication unit tests. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211112012635.1506853-1-yhs@fb.com
2021-11-11	selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests	Yonghong Song
	Add BTF_KIND_TYPE_TAG unit tests. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012630.1506095-1-yhs@fb.com
2021-11-11	selftests/bpf: Test libbpf API function btf__add_type_tag()	Yonghong Song
	Add unit tests for btf__add_type_tag(). Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012625.1505748-1-yhs@fb.com
2021-11-11	bpftool: Support BTF_KIND_TYPE_TAG	Yonghong Song
	Add bpftool support for BTF_KIND_TYPE_TAG. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012620.1505506-1-yhs@fb.com
2021-11-11	libbpf: Support BTF_KIND_TYPE_TAG	Yonghong Song
	Add libbpf support for BTF_KIND_TYPE_TAG. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012614.1505315-1-yhs@fb.com
2021-11-11	bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes	Yonghong Song
	LLVM patches ([1] for clang, [2] and [3] for BPF backend) added support for btf_type_tag attributes. This patch added support for the kernel. The main motivation for btf_type_tag is to bring kernel annotations __user, __rcu etc. to btf. With such information available in btf, bpf verifier can detect mis-usages and reject the program. For example, for __user tagged pointer, developers can then use proper helper like bpf_probe_read_user() etc. to read the data. BTF_KIND_TYPE_TAG may also useful for other tracing facility where instead of to require user to specify kernel/user address type, the kernel can detect it by itself with btf. [1] https://reviews.llvm.org/D111199 [2] https://reviews.llvm.org/D113222 [3] https://reviews.llvm.org/D113496 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012609.1505032-1-yhs@fb.com
2021-11-11	bpftool: Update btf_dump__new() and perf_buffer__new_raw() calls	Andrii Nakryiko
	Use v1.0-compatible variants of btf_dump and perf_buffer "constructors". This is also a demonstration of reusing struct perf_buffer_raw_opts as OPTS-style option struct for new perf_buffer__new_raw() API. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-10-andrii@kernel.org
2021-11-11	tools/runqslower: Update perf_buffer__new() calls	Andrii Nakryiko
	Use v1.0+ compatible variant of perf_buffer__new() call to prepare for deprecation. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-9-andrii@kernel.org
2021-11-11	selftests/bpf: Update btf_dump__new() uses to v1.0+ variant	Andrii Nakryiko
	Update to-be-deprecated forms of btf_dump__new(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-8-andrii@kernel.org
2021-11-11	selftests/bpf: Migrate all deprecated perf_buffer uses	Andrii Nakryiko
	Migrate all old-style perf_buffer__new() and perf_buffer__new_raw() calls to new v1.0+ variants. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-7-andrii@kernel.org
2021-11-11	libbpf: Make perf_buffer__new() use OPTS-based interface	Andrii Nakryiko
	Add new variants of perf_buffer__new() and perf_buffer__new_raw() that use OPTS-based options for future extensibility ([0]). Given all the currently used API names are best fits, re-use them and use ___libbpf_override() approach and symbol versioning to preserve ABI and source code compatibility. struct perf_buffer_opts and struct perf_buffer_raw_opts are kept as well, but they are restructured such that they are OPTS-based when used with new APIs. For struct perf_buffer_raw_opts we keep few fields intact, so we have to also preserve the memory location of them both when used as OPTS and for legacy API variants. This is achieved with anonymous padding for OPTS "incarnation" of the struct. These pads can be eventually used for new options. [0] Closes: https://github.com/libbpf/libbpf/issues/311 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-6-andrii@kernel.org
2021-11-11	libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof	Andrii Nakryiko
	Change btf_dump__new() and corresponding struct btf_dump_ops structure to be extensible by using OPTS "framework" ([0]). Given we don't change the names, we use a similar approach as with bpf_prog_load(), but this time we ended up with two APIs with the same name and same number of arguments, so overloading based on number of arguments with ___libbpf_override() doesn't work. Instead, use "overloading" based on types. In this particular case, print callback has to be specified, so we detect which argument is a callback. If it's 4th (last) argument, old implementation of API is used by user code. If not, it must be 2nd, and thus new implementation is selected. The rest is handled by the same symbol versioning approach. btf_ext argument is dropped as it was never used and isn't necessary either. If in the future we'll need btf_ext, that will be added into OPTS-based struct btf_dump_opts. struct btf_dump_opts is reused for both old API and new APIs. ctx field is marked deprecated in v0.7+ and it's put at the same memory location as OPTS's sz field. Any user of new-style btf_dump__new() will have to set sz field and doesn't/shouldn't use ctx, as ctx is now passed along the callback as mandatory input argument, following the other APIs in libbpf that accept callbacks consistently. Again, this is quite ugly in implementation, but is done in the name of backwards compatibility and uniform and extensible future APIs (at the same time, sigh). And it will be gone in libbpf 1.0. [0] Closes: https://github.com/libbpf/libbpf/issues/283 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-5-andrii@kernel.org
2021-11-11	libbpf: Turn btf_dedup_opts into OPTS-based struct	Andrii Nakryiko
	btf__dedup() and struct btf_dedup_opts were added before we figured out OPTS mechanism. As such, btf_dedup_opts is non-extensible without breaking an ABI and potentially crashing user application. Unfortunately, btf__dedup() and btf_dedup_opts are short and succinct names that would be great to preserve and use going forward. So we use ___libbpf_override() macro approach, used previously for bpf_prog_load() API, to define a new btf__dedup() variant that accepts only struct btf * and struct btf_dedup_opts * arguments, and rename the old btf__dedup() implementation into btf__dedup_deprecated(). This keeps both source and binary compatibility with old and new applications. The biggest problem was struct btf_dedup_opts, which wasn't OPTS-based, and as such doesn't have `size_t sz;` as a first field. But btf__dedup() is a pretty rarely used API and I believe that the only currently known users (besides selftests) are libbpf's own bpf_linker and pahole. Neither use case actually uses options and just passes NULL. So instead of doing extra hacks, just rewrite struct btf_dedup_opts into OPTS-based one, move btf_ext argument into those opts (only bpf_linker needs to dedup btf_ext, so it's not a typical thing to specify), and drop never used `dont_resolve_fwds` option (it was never used anywhere, AFAIK, it makes BTF dedup much less useful and efficient). Just in case, for old implementation, btf__dedup_deprecated(), detect non-NULL options and error out with helpful message, to help users migrate, if there are any user playing with btf__dedup(). The last remaining piece is dedup_table_size, which is another anachronism from very early days of BTF dedup. Since then it has been reduced to the only valid value, 1, to request forced hash collisions. This is only used during testing. So instead introduce a bool flag to force collisions explicitly. This patch also adapts selftests to new btf__dedup() and btf_dedup_opts use to avoid selftests breakage. [0] Closes: https://github.com/libbpf/libbpf/issues/281 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-4-andrii@kernel.org
2021-11-11	selftests/bpf: Minor cleanups and normalization of Makefile	Andrii Nakryiko
	Few clean ups and single-line simplifications. Also split CLEAN command into multiple $(RM) invocations as it gets dangerously close to too long argument list. Make sure that -o <output.o> is used always as the last argument for saner verbose make output. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-3-andrii@kernel.org
2021-11-11	bpftool: Normalize compile rules to specify output file last	Andrii Nakryiko
	When dealing with verbose Makefile output, it's extremely confusing when compiler invocation commands don't specify -o <output.o> as the last argument. Normalize bpftool's Makefile to do just that, as most other BPF-related Makefiles are already doing that. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-2-andrii@kernel.org
2021-11-11	selftests/bpf: Fix bpf_prog_test_load() logic to pass extra log level	Andrii Nakryiko
	After recent refactoring bpf_prog_test_load(), used across multiple selftests, lost ability to specify extra log_level 1 or 2 (for -vv and -vvv, respectively). Fix that problem by using bpf_object__load_xattr() API that supports extra log_level flags. Also restore BPF_F_TEST_RND_HI32 prog_flags by utilizing new bpf_program__set_extra_flags() API. Fixes: f87c1930ac29 ("selftests/bpf: Merge test_stub.c into testing_helpers.c") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111051758.92283-3-andrii@kernel.org
2021-11-11	libbpf: Add ability to get/set per-program load flags	Andrii Nakryiko
	Add bpf_program__flags() API to retrieve prog_flags that will be (or were) supplied to BPF_PROG_LOAD command. Also add bpf_program__set_extra_flags() API to allow to set extra flags, in addition to those determined by program's SEC() definition. Such flags are logically OR'ed with libbpf-derived flags. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111051758.92283-2-andrii@kernel.org
2021-11-11	Merge tag 'net-5.16-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - bpf: do not reject when the stack read size is different from the tracked scalar size - net: fix premature exit from NAPI state polling in napi_disable() - riscv, bpf: fix RV32 broken build, and silence RV64 warning Current release - new code bugs: - net: fix possible NULL deref in sock_reserve_memory - amt: fix error return code in amt_init(); fix stopping the workqueue - ax88796c: use the correct ioctl callback Previous releases - always broken: - bpf: stop caching subprog index in the bpf_pseudo_func insn - security: fixups for the security hooks in sctp - nfc: add necessary privilege flags in netlink layer, limit operations to admin only - vsock: prevent unnecessary refcnt inc for non-blocking connect - net/smc: fix sk_refcnt underflow on link down and fallback - nfnetlink_queue: fix OOB when mac header was cleared - can: j1939: ignore invalid messages per standard - bpf, sockmap: - fix race in ingress receive verdict with redirect to self - fix incorrect sk_skb data_end access when src_reg = dst_reg - strparser, and tls are reusing qdisc_skb_cb and colliding - ethtool: fix ethtool msg len calculation for pause stats - vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to access an unregistering real_dev - udp6: make encap_rcv() bump the v6 not v4 stats - drv: prestera: add explicit padding to fix m68k build - drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge - drv: mvpp2: fix wrong SerDes reconfiguration order Misc & small latecomers: - ipvs: auto-load ipvs on genl access - mctp: sanity check the struct sockaddr_mctp padding fields - libfs: support RENAME_EXCHANGE in simple_rename() - avoid double accounting for pure zerocopy skbs" * tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits) selftests/net: udpgso_bench_rx: fix port argument net: wwan: iosm: fix compilation warning cxgb4: fix eeprom len when diagnostics not implemented net: fix premature exit from NAPI state polling in napi_disable() net/smc: fix sk_refcnt underflow on linkdown and fallback net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer() gve: fix unmatched u64_stats_update_end() net: ethernet: lantiq_etop: Fix compilation error selftests: forwarding: Fix packet matching in mirroring selftests vsock: prevent unnecessary refcnt inc for nonblocking connect net: marvell: mvpp2: Fix wrong SerDes reconfiguration order net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory net: stmmac: allow a tc-taprio base-time of zero selftests: net: test_vxlan_under_vrf: fix HV connectivity test net: hns3: allow configure ETS bandwidth of all TCs net: hns3: remove check VF uc mac exist when set by PF net: hns3: fix some mac statistics is always 0 in device version V2 net: hns3: fix kernel crash when unload VF while it is being reset net: hns3: sync rx ring head in echo common pull net: hns3: fix pfc packet number incorrect after querying pfc parameters ...
2021-11-11	Merge branch 'kvm-sev-move-context' into kvm-master	Paolo Bonzini
	Add support for AMD SEV and SEV-ES intra-host migration support. Intra host migration provides a low-cost mechanism for userspace VMM upgrades. In the common case for intra host migration, we can rely on the normal ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other confidential compute environments make most of this information opaque, and render KVM ioctls such as "KVM_GET_REGS" irrelevant. As a result, we need the ability to pass this opaque metadata from one VMM to the next. The easiest way to do this is to leave this data in the kernel, and transfer ownership of the metadata from one KVM VM (or vCPU) to the next. In-kernel hand off makes it possible to move any data that would be unsafe/impossible for the kernel to hand directly to userspace, and cannot be reproduced using data that can be handed to userspace. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11	selftest: KVM: Add intra host migration tests	Peter Gonda
	Adds testcases for intra host migration for SEV and SEV-ES. Also adds locking test to confirm no deadlock exists. Signed-off-by: Peter Gonda <pgonda@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-6-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11	selftest: KVM: Add open sev dev helper	Peter Gonda
	Refactors out open path support from open_kvm_dev_path_or_exit() and adds new helper for SEV device path. Signed-off-by: Peter Gonda <pgonda@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-5-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11	selftests/net: udpgso_bench_rx: fix port argument	Willem de Bruijn
	The below commit added optional support for passing a bind address. It configures the sockaddr bind arguments before parsing options and reconfigures on options -b and -4. This broke support for passing port (-p) on its own. Configure sockaddr after parsing all arguments. Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-11	static_call,x86: Robustify trampoline patching	Peter Zijlstra
	Add a few signature bytes after the static call trampoline and verify those bytes match before patching the trampoline. This avoids patching random other JMPs (such as CFI jump-table entries) instead. These bytes decode as: d: 53 push %rbx e: 43 54 rex.XB push %r12 And happen to spell "SCT". Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211030074758.GT174703@worktop.programming.kicks-ass.net
2021-11-10	selftests/bpf: Add tests for accessing ingress_ifindex in bpf_sk_lookup	Mark Pashmfouroush
	A new field was added to the bpf_sk_lookup data that users can access. Add tests that validate that the new ingress_ifindex field contains the right data. Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211110111016.5670-3-markpash@cloudflare.com
2021-11-10	bpf: Add ingress_ifindex to bpf_sk_lookup	Mark Pashmfouroush
	It may be helpful to have access to the ifindex during bpf socket lookup. An example may be to scope certain socket lookup logic to specific interfaces, i.e. an interface may be made exempt from custom lookup code. Add the ifindex of the arriving connection to the bpf_sk_lookup API. Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211110111016.5670-2-markpash@cloudflare.com
2021-11-10	Merge tag 'arm64-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Fix double-evaluation of 'pte' macro argument when using 52-bit PAs - Fix signedness of some MTE prctl PR_* constants - Fix kmemleak memory usage by skipping early pgtable allocations - Fix printing of CPU feature register strings - Remove redundant -nostdlib linker flag for vDSO binaries * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions arm64: Track no early_pgtable_alloc() for kmemleak arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long arm64: vdso: remove -nostdlib compiler flag arm64: arm64_ftr_reg->name may not be a human-readable string
2021-11-10	bpftool: Fix SPDX tag for Makefiles and .gitignore	Quentin Monnet
	Bpftool is dual-licensed under GPLv2 and BSD-2-Clause. In commit 907b22365115 ("tools: bpftool: dual license all files") we made sure that all its source files were indeed covered by the two licenses, and that they had the correct SPDX tags. However, bpftool's Makefile, the Makefile for its documentation, and the .gitignore file were skipped at the time (their GPL-2.0-only tag was added later). Let's update the tags. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Joe Stringer <joe@cilium.io> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/20211105221904.3536-1-quentin@isovalent.com
2021-11-10	selftests: forwarding: Fix packet matching in mirroring selftests	Petr Machata
	In commit 6de6e46d27ef ("cls_flower: Fix inability to match GRE/IPIP packets"), cls_flower was fixed to match an outer packet of a tunneled packet as would be expected, rather than dissecting to the inner packet and matching on that. This fix uncovered several issues in packet matching in mirroring selftests: - in mirror_gre_bridge_1d_vlan.sh and mirror_gre_vlan_bridge_1q.sh, the vlan_ethtype match is copied around as "ip", even as some of the tests are running over ip6gretap. This is fixed by using an "ipv6" for vlan_ethtype in the ip6gretap tests. - in mirror_gre_changes.sh, a filter to count GRE packets is set up to match TTL of 50. This used to trigger in the offloaded datapath, where the envelope TTL was matched, but not in the software datapath, which considered TTL of the inner packet. Now that both match consistently, all the packets were double-counted. This is fixed by marking the filter as skip_hw, leaving only the SW datapath component active. Fixes: 6de6e46d27ef ("cls_flower: Fix inability to match GRE/IPIP packets") Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-10	selftests: net: test_vxlan_under_vrf: fix HV connectivity test	Andrea Righi
	It looks like test_vxlan_under_vrf.sh is always failing to verify the connectivity test during the ping between the two simulated VMs. This is due to the fact that veth-hv in each VM should have a distinct MAC address. Fix by setting a unique MAC address on each simulated VM interface. Without this fix: $ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh Checking HV connectivity [ OK ] Check VM connectivity through VXLAN (underlay in the default VRF) [FAIL] With this fix applied: $ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh Checking HV connectivity [ OK ] Check VM connectivity through VXLAN (underlay in the default VRF) [ OK ] Check VM connectivity through VXLAN (underlay in a VRF) [FAIL] NOTE: the connectivity test with the underlay VRF is still failing; it seems that ARP requests are blocked at the simulated hypervisor level, probably due to some missing ARP forwarding rules. This requires more investigation (in the meantime we may consider to set that test as expected failure - XFAIL). Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-09	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	Jakub Kicinski
	Alexei Starovoitov says: ==================== pull-request: bpf 2021-11-09 We've added 7 non-merge commits during the last 3 day(s) which contain a total of 10 files changed, 174 insertions(+), 48 deletions(-). The main changes are: 1) Various sockmap fixes, from John and Jussi. 2) Fix out-of-bound issue with bpf_pseudo_func, from Martin. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding bpf, sockmap: Fix race in ingress receive verdict with redirect to self bpf, sockmap: Remove unhash handler for BPF sockmap usage bpf, sockmap: Use stricter sk state checks in sk_lookup_assign bpf: selftest: Trigger a DCE on the whole subprog bpf: Stop caching subprog index in the bpf_pseudo_func insn ==================== Link: https://lore.kernel.org/r/20211109215702.38350-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-09	libbpf: Compile using -std=gnu89	Kumar Kartikeya Dwivedi
	The minimum supported C standard version is C89, with use of GNU extensions, hence make sure to catch any instances that would break the build for this mode by passing -std=gnu89. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211105234243.390179-4-memxor@gmail.com
2021-11-09	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge more updates from Andrew Morton: "87 patches. Subsystems affected by this patch series: mm (pagecache and hugetlb), procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs, init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork, sysvfs, kcov, gdb, resource, selftests, and ipc" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits) ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL ipc: check checkpoint_restore_ns_capable() to modify C/R proc files selftests/kselftest/runner/run_one(): allow running non-executable files virtio-mem: disallow mapping virtio-mem memory via /dev/mem kernel/resource: disallow access to exclusive system RAM regions kernel/resource: clean up and optimize iomem_is_exclusive() scripts/gdb: handle split debug for vmlinux kcov: replace local_irq_save() with a local_lock_t kcov: avoid enable+disable interrupts if !in_task() kcov: allocate per-CPU memory on the relevant node Documentation/kcov: define `ip' in the example Documentation/kcov: include types.h in the example sysv: use BUILD_BUG_ON instead of runtime check kernel/fork.c: unshare(): use swap() to make code cleaner seq_file: fix passing wrong private data seq_file: move seq_escape() to a header signal: remove duplicate include in signal.h crash_dump: remove duplicate include in crash_dump.h crash_dump: fix boolreturn.cocci warning hfs/hfsplus: use WARN_ON for sanity check ...
2021-11-09	selftests/kselftest/runner/run_one(): allow running non-executable files	SeongJae Park
	When running a test program, 'run_one()' checks if the program has the execution permission and fails if it doesn't. However, it's easy to mistakenly lose the permissions, as some common tools like 'diff' don't support the permission change well[1]. Compared to that, making mistakes in the test program's path would only rare, as those are explicitly listed in 'TEST_PROGS'. Therefore, it might make more sense to resolve the situation on our own and run the program. For this reason, this commit makes the test program runner function still print the warning message but to try parsing the interpreter of the program and to explicitly run it with the interpreter, in this case. [1] https://lore.kernel.org/mm-commits/YRJisBs9AunccCD4@kroah.com/ Link: https://lkml.kernel.org/r/20210810164534.25902-1-sj38.park@gmail.com Signed-off-by: SeongJae Park <sjpark@amazon.de> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09	procfs: do not list TID 0 in /proc/<pid>/task	Florian Weimer
	If a task exits concurrently, task_pid_nr_ns may return 0. [akpm@linux-foundation.org: coding style tweaks] [adobriyan@gmail.com: test that /proc/*/task doesn't contain "0"] Link: https://lkml.kernel.org/r/YV88AnVzHxPafQ9o@localhost.localdomain Link: https://lkml.kernel.org/r/8735pn5dx7.fsf@oldenburg.str.redhat.com Signed-off-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-08	selftests/bpf: Add exception handling selftests for tp_bpf program	Alan Maguire
	Exception handling is triggered in BPF tracing programs when a NULL pointer is dereferenced; the exception handler zeroes the target register and execution of the BPF program progresses. To test exception handling then, we need to trigger a NULL pointer dereference for a field which should never be zero; if it is, the only explanation is the exception handler ran. task->task_works is the NULL pointer chosen (for a new task from fork() no work is associated), and the task_works->func field should not be zero if task_works is non-NULL. The test verifies that task_works and task_works->func are 0. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1636131046-5982-3-git-send-email-alan.maguire@oracle.com
2021-11-08	Merge tag 'cxl-for-5.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl updates from Dan Williams: "More preparation and plumbing work in the CXL subsystem. From an end user perspective the highlight here is lighting up the CXL Persistent Memory related commands (label read / write) with the generic ioctl() front-end in LIBNVDIMM. Otherwise, the ability to instantiate new persistent and volatile memory regions is still on track for v5.17. Summary: - Fix support for platforms that do not enumerate every ACPI0016 (CXL Host Bridge) in the CHBS (ACPI Host Bridge Structure). - Introduce a common pci_find_dvsec_capability() helper, clean up open coded implementations in various drivers. - Add 'cxl_test' for regression testing CXL subsystem ABIs. 'cxl_test' is a module built from tools/testing/cxl/ that mocks up a CXL topology to augment the nascent support for emulation of CXL devices in QEMU. - Convert libnvdimm to use the uuid API. - Complete the definition of CXL namespace labels in libnvdimm. - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL mailbox driver. Enable 'ndctl {read,write}-labels' for CXL. - Continue to sort and refactor functionality into distinct driver and core-infrastructure buckets. For example, mailbox handling is now a generic core capability consumed by the PCI and cxl_test drivers" * tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (34 commits) ocxl: Use pci core's DVSEC functionality cxl/pci: Use pci core's DVSEC functionality PCI: Add pci_find_dvsec_capability to find designated VSEC cxl/pci: Split cxl_pci_setup_regs() cxl/pci: Add @base to cxl_register_map cxl/pci: Make more use of cxl_register_map cxl/pci: Remove pci request/release regions cxl/pci: Fix NULL vs ERR_PTR confusion cxl/pci: Remove dev_dbg for unknown register blocks cxl/pci: Convert register block identifiers to an enum cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS cxl/pci: Disambiguate cxl_pci further from cxl_mem Documentation/cxl: Add bus internal docs cxl/core: Split decoder setup into alloc + add tools/testing/cxl: Introduce a mock memory device + driver cxl/mbox: Move command definitions to common location cxl/bus: Populate the target list at decoder create tools/testing/cxl: Introduce a mocked-up CXL port hierarchy cxl/pmem: Add support for multiple nvdimm-bridge objects cxl/pmem: Translate NVDIMM label commands to CXL label commands ...
2021-11-08	Add 'tools/perf/libbpf/' to ignored files	Linus Torvalds
	Commit 6b491a86b77c ("perf build: Install libbpf headers locally when building") installed copies of the libbpf headers into the build tree, causing unnecessary noise from 'git status' after a perf tools build. Add the 'libbpf/' subdirectory to the .gitignore file to silence it all again. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-08	Merge tag 'perf-tools-for-v5.16-2021-11-07-without-bpftool-fix' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools updates from Arnaldo Carvalho de Melo: "perf annotate: - Add riscv64 support. - Add fusion logic for AMD microarchs. perf record: - Add an option to control the synthesizing behavior: --synth <no\|all\|task\|mmap\|cgroup> core: - Allow controlling synthesizing PERF_RECORD_ metadata events during record. - perf.data reader prep work for multithreaded processing. - Fix missing exclude_{host,guest} setting in PMUs that don't support it and that were causing the feature detection code to disable it for all events, even the ones in PMUs that support it. - Fix the default use of precise events on AMD, that were always falling back to non-precise because perf_event_attr.exclude_guest=1 was set and IBS does not have filtering capability, refusing precise + exclude_guest. - Add bitfield_swap() to handle branch_stack endian issue. perf script: - Show binary offsets for userspace addresses in callchains. - Support instruction latency via new "ins_lat" selectable field. - Add dlfilter-show-cycles perf inject: - Add vmlinux and ignore-vmlinux arguments, similar to other tools. perf list: - Display PMU prefix for partially supported hybrid cache events. - Display hybrid PMU events with cpu type. perf stat: - Improve metrics documentation of data structures. - Fix memory leaks in the metric code. - Use NAN for missing event IDs. - Don't compute unused events. - Fix memory leak on error path. - Encode and use metric-id as a metric qualifier. - Allow metrics with no events. - Avoid events for an 'if' constant result. - Only add a referenced metric once. - Simplify metric_refs calculation. - Allow modifiers on metrics. perf test: - Add workload test of metric and metric groups. - Workload test of all PMUs. - vmlinux-kallsyms: Ignore hidden symbols. - Add pmu-event test for event described as "config=". - Verify more event members in pmu-events test. - Add endian test for struct branch_flags on the sample-parsing test. - Improve temp file cleanup in several tests. perf daemon: - Address MSAN warnings on send_cmd(). perf kmem: - Improve man page for record options perf srcline: - Use long-running addr2line per DSO, greatly speeding up the 'srcline' sort order. perf symbols: - Ignore $a/$d symbols for ARM modules. - Fix /proc/kcore access on 32 bit systems. Kernel UAPI copies: - Update copy of linux/socket.h with the kernel sources, no change in tooling output. libbpf: - Pull in bpf_program__get_prog_info_linear() from libbpf, too much specific to perf. - Deprecate bpf_map__resize() in favor of bpf_map_set_max_entries() - Install libbpf headers locally when building. - Bump minimum LLVM C++ std to GNU++14. libperf: - Use binary search in perf_cpu_map__idx() as array are sorted. libtracefs: - Enable libtracefs dynamic linking. libtraceevent: - Increase logging when verbose. Arch specific: * PowerPC: - Add support to expose instruction and data address registers as part of extended regs. Vendor events: * JSON parser: - Support ConfigCode to set the config= in PMUs - Make the JSON parser more conformant when in strict mode. * All JSON files: - Fix all remaining invalid JSON files. * ARM: - Syntax corrections in Neoverse N1 json. - Categorise the Neoverse V1 counters. - Add new armv8 PMU events. - Revise hip08 uncore events. Hardware tracing: * auxtrace: - Add missing Z option to ITRACE_HELP. - Add itrace A option to approximate IPC. - Add itrace d+o option to direct debug log to stdout. * Intel PT: - Add support for PERF_RECORD_AUX_OUTPUT_HW_ID - Support itrace A option to approximate IPC - Support itrace d+o option to direct debug log to stdout" * tag 'perf-tools-for-v5.16-2021-11-07-without-bpftool-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (120 commits) perf build: Install libbpf headers locally when building perf MANIFEST: Add bpftool files to allow building with BUILD_BPF_SKEL=1 perf metric: Fix memory leaks perf parse-event: Add init and exit to parse_event_error perf parse-events: Rename parse_events_error functions perf stat: Fix memory leak on error path perf tools: Use __BYTE_ORDER__ perf inject: Add vmlinux and ignore-vmlinux arguments perf tools: Check vmlinux/kallsyms arguments in all tools perf tools: Refactor out kernel symbol argument sanity checking perf symbols: Ignore $a/$d symbols for ARM modules perf evsel: Don't set exclude_guest by default perf evsel: Fix missing exclude_{host,guest} setting perf bpf: Add missing free to bpf_event__print_bpf_prog_info() perf beauty: Update copy of linux/socket.h with the kernel sources perf clang: Fixes for more recent LLVM/clang tools: Bump minimum LLVM C++ std to GNU++14 perf bpf: Pull in bpf_program__get_prog_info_linear() Revert "perf bench futex: Add support for 32-bit systems with 64-bit time_t" perf test sample-parsing: Add endian test for struct branch_flags ...
2021-11-08	selftests: nft_nat: Simplify port shadow notrack test	Phil Sutter
	The second rule in prerouting chain was probably a leftover: The router listens on veth0, so not tracking connections via that interface is sufficient. Likewise, the rule in output chain can be limited to that interface as well. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>