summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2023-08-09selftests: forwarding: ethtool_mm: Skip when MAC Merge is not supportedIdo Schimmel
MAC Merge cannot be tested with veth pairs, resulting in failures: # ./ethtool_mm.sh [...] TEST: Manual configuration with verification: swp1 to swp2 [FAIL] Verification did not succeed Fix by skipping the test when the interfaces do not support MAC Merge. Fixes: e6991384ace5 ("selftests: forwarding: add a test for MAC Merge layer") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-11-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: hw_stats_l3_gre: Skip when using veth pairsIdo Schimmel
Layer 3 hardware stats cannot be used when the underlying interfaces are veth pairs, resulting in failures: # ./hw_stats_l3_gre.sh TEST: ping gre flat [ OK ] TEST: Test rx packets: [FAIL] Traffic not reflected in the counter: 0 -> 0 TEST: Test tx packets: [FAIL] Traffic not reflected in the counter: 0 -> 0 Fix by skipping the test when used with veth pairs. Fixes: 813f97a26860 ("selftests: forwarding: Add a tunnel-based test for L3 HW stats") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-10-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: ethtool_extended_state: Skip when using veth pairsIdo Schimmel
Ethtool extended state cannot be tested with veth pairs, resulting in failures: # ./ethtool_extended_state.sh TEST: Autoneg, No partner detected [FAIL] Expected "Autoneg", got "Link detected: no" [...] Fix by skipping the test when used with veth pairs. Fixes: 7d10bcce98cd ("selftests: forwarding: Add tests for ethtool extended state") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: ethtool: Skip when using veth pairsIdo Schimmel
Auto-negotiation cannot be tested with veth pairs, resulting in failures: # ./ethtool.sh TEST: force of same speed autoneg off [FAIL] error in configuration. swp1 speed Not autoneg off [...] Fix by skipping the test when used with veth pairs. Fixes: 64916b57c0b1 ("selftests: forwarding: Add speed and auto-negotiation test") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-8-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Add a helper to skip test when using veth pairsIdo Schimmel
A handful of tests require physical loopbacks to be used instead of veth pairs. Add a helper that these tests will invoke in order to be skipped when executed with veth pairs. Fixes: 64916b57c0b1 ("selftests: forwarding: Add speed and auto-negotiation test") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-7-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Set default IPv6 traceroute utilityIdo Schimmel
The test uses the 'TROUTE6' environment variable to encode the name of the IPv6 traceroute utility. By default (without a configuration file), this variable is not set, resulting in failures: # ./ip6_forward_instats_vrf.sh TEST: ping6 [ OK ] TEST: Ip6InTooBigErrors [ OK ] TEST: Ip6InHdrErrors [FAIL] TEST: Ip6InAddrErrors [ OK ] TEST: Ip6InDiscards [ OK ] Fix by setting a default utility name and skip the test if the utility is not present. Fixes: 0857d6f8c759 ("ipv6: When forwarding count rx stats on the orig netdev") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/adc5e40d-d040-a65e-eb26-edf47dac5b02@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-6-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: bridge_mdb_max: Check iproute2 versionIdo Schimmel
The selftest relies on iproute2 changes present in version 6.3, but the test does not check for it, resulting in errors: # ./bridge_mdb_max.sh INFO: 802.1d tests TEST: cfg4: port: ngroups reporting [FAIL] Number of groups was null, now is null, but 5 expected TEST: ctl4: port: ngroups reporting [FAIL] Number of groups was null, now is null, but 5 expected TEST: cfg6: port: ngroups reporting [FAIL] Number of groups was null, now is null, but 5 expected [...] Fix by skipping the test if iproute2 is too old. Fixes: 3446dcd7df05 ("selftests: forwarding: bridge_mdb_max: Add a new selftest") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/6b04b2ba-2372-6f6b-3ac8-b7cba1cfae83@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-5-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: bridge_mdb: Check iproute2 versionIdo Schimmel
The selftest relies on iproute2 changes present in version 6.3, but the test does not check for it, resulting in error: # ./bridge_mdb.sh INFO: # Host entries configuration tests TEST: Common host entries configuration tests (IPv4) [FAIL] Managed to add IPv4 host entry with a filter mode TEST: Common host entries configuration tests (IPv6) [FAIL] Managed to add IPv6 host entry with a filter mode TEST: Common host entries configuration tests (L2) [FAIL] Managed to add L2 host entry with a filter mode INFO: # Port group entries configuration tests - (*, G) Command "replace" is unknown, try "bridge mdb help". [...] Fix by skipping the test if iproute2 is too old. Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/6b04b2ba-2372-6f6b-3ac8-b7cba1cfae83@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Switch off timeoutIdo Schimmel
The default timeout for selftests is 45 seconds, but it is not enough for forwarding selftests which can takes minutes to finish depending on the number of tests cases: # make -C tools/testing/selftests TARGETS=net/forwarding run_tests TAP version 13 1..102 # timeout set to 45 # selftests: net/forwarding: bridge_igmp.sh # TEST: IGMPv2 report 239.10.10.10 [ OK ] # TEST: IGMPv2 leave 239.10.10.10 [ OK ] # TEST: IGMPv3 report 239.10.10.10 is_include [ OK ] # TEST: IGMPv3 report 239.10.10.10 include -> allow [ OK ] # not ok 1 selftests: net/forwarding: bridge_igmp.sh # TIMEOUT 45 seconds Fix by switching off the timeout and setting it to 0. A similar change was done for BPF selftests in commit 6fc5916cc256 ("selftests: bpf: Switch off timeout"). Fixes: 81573b18f26d ("selftests/net/forwarding: add Makefile to install tests") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/8d149f8c-818e-d141-a0ce-a6bae606bc22@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09selftests: forwarding: Skip test when no interfaces are specifiedIdo Schimmel
As explained in [1], the forwarding selftests are meant to be run with either physical loopbacks or veth pairs. The interfaces are expected to be specified in a user-provided forwarding.config file or as command line arguments. By default, this file is not present and the tests fail: # make -C tools/testing/selftests TARGETS=net/forwarding run_tests [...] TAP version 13 1..102 # timeout set to 45 # selftests: net/forwarding: bridge_igmp.sh # Command line is not complete. Try option "help" # Failed to create netif not ok 1 selftests: net/forwarding: bridge_igmp.sh # exit=1 [...] Fix by skipping a test if interfaces are not provided either via the configuration file or command line arguments. # make -C tools/testing/selftests TARGETS=net/forwarding run_tests [...] TAP version 13 1..102 # timeout set to 45 # selftests: net/forwarding: bridge_igmp.sh # SKIP: Cannot create interface. Name not specified ok 1 selftests: net/forwarding: bridge_igmp.sh # SKIP [1] tools/testing/selftests/net/forwarding/README Fixes: 81573b18f26d ("selftests/net/forwarding: add Makefile to install tests") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Closes: https://lore.kernel.org/netdev/856d454e-f83c-20cf-e166-6dc06cbc1543@alu.unizg.hr/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230808141503.4060661-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09nexthop: Fix infinite nexthop bucket dump when using maximum nexthop IDIdo Schimmel
A netlink dump callback can return a positive number to signal that more information needs to be dumped or zero to signal that the dump is complete. In the second case, the core netlink code will append the NLMSG_DONE message to the skb in order to indicate to user space that the dump is complete. The nexthop bucket dump callback always returns a positive number if nexthop buckets were filled in the provided skb, even if the dump is complete. This means that a dump will span at least two recvmsg() calls as long as nexthop buckets are present. In the last recvmsg() call the dump callback will not fill in any nexthop buckets because the previous call indicated that the dump should restart from the last dumped nexthop ID plus one. # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id 10 group 1 type resilient buckets 2 # strace -e sendto,recvmsg -s 5 ip nexthop bucket sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOPBUCKET, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691396980, nlmsg_pid=0}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 128 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 128 id 10 index 0 idle_time 6.66 nhid 1 id 10 index 1 idle_time 6.66 nhid 1 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 20 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396980, nlmsg_pid=347}, 0], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 +++ exited with 0 +++ This behavior is both inefficient and buggy. If the last nexthop to be dumped had the maximum ID of 0xffffffff, then the dump will restart from 0 (0xffffffff + 1) and never end: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id $((2**32-1)) group 1 type resilient buckets 2 # ip nexthop bucket id 4294967295 index 0 idle_time 5.55 nhid 1 id 4294967295 index 1 idle_time 5.55 nhid 1 id 4294967295 index 0 idle_time 5.55 nhid 1 id 4294967295 index 1 idle_time 5.55 nhid 1 [...] Fix by adjusting the dump callback to return zero when the dump is complete. After the fix only one recvmsg() call is made and the NLMSG_DONE message is appended to the RTM_NEWNEXTHOPBUCKET responses: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip nexthop add id $((2**32-1)) group 1 type resilient buckets 2 # strace -e sendto,recvmsg -s 5 ip nexthop bucket sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOPBUCKET, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691396737, nlmsg_pid=0}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 148 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=64, nlmsg_type=RTM_NEWNEXTHOPBUCKET, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, {family=AF_UNSPEC, data="\x00\x00\x00\x00\x00"...}], [{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691396737, nlmsg_pid=350}, 0]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 148 id 4294967295 index 0 idle_time 6.61 nhid 1 id 4294967295 index 1 idle_time 6.61 nhid 1 +++ exited with 0 +++ Note that if the NLMSG_DONE message cannot be appended because of size limitations, then another recvmsg() will be needed, but the core netlink code will not invoke the dump callback and simply reply with a NLMSG_DONE message since it knows that the callback previously returned zero. Add a test that fails before the fix: # ./fib_nexthops.sh -t basic_res [...] TEST: Maximum nexthop ID dump [FAIL] [...] And passes after it: # ./fib_nexthops.sh -t basic_res [...] TEST: Maximum nexthop ID dump [ OK ] [...] Fixes: 8a1bbabb034d ("nexthop: Add netlink handlers for bucket dump") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230808075233.3337922-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09nexthop: Fix infinite nexthop dump when using maximum nexthop IDIdo Schimmel
A netlink dump callback can return a positive number to signal that more information needs to be dumped or zero to signal that the dump is complete. In the second case, the core netlink code will append the NLMSG_DONE message to the skb in order to indicate to user space that the dump is complete. The nexthop dump callback always returns a positive number if nexthops were filled in the provided skb, even if the dump is complete. This means that a dump will span at least two recvmsg() calls as long as nexthops are present. In the last recvmsg() call the dump callback will not fill in any nexthops because the previous call indicated that the dump should restart from the last dumped nexthop ID plus one. # ip nexthop add id 1 blackhole # strace -e sendto,recvmsg -s 5 ip nexthop sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOP, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691394315, nlmsg_pid=0}, {nh_family=AF_UNSPEC, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 36 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=36, nlmsg_type=RTM_NEWNEXTHOP, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394315, nlmsg_pid=343}, {nh_family=AF_INET, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}, [[{nla_len=8, nla_type=NHA_ID}, 1], {nla_len=4, nla_type=NHA_BLACKHOLE}]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 36 id 1 blackhole recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 20 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394315, nlmsg_pid=343}, 0], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 +++ exited with 0 +++ This behavior is both inefficient and buggy. If the last nexthop to be dumped had the maximum ID of 0xffffffff, then the dump will restart from 0 (0xffffffff + 1) and never end: # ip nexthop add id $((2**32-1)) blackhole # ip nexthop id 4294967295 blackhole id 4294967295 blackhole [...] Fix by adjusting the dump callback to return zero when the dump is complete. After the fix only one recvmsg() call is made and the NLMSG_DONE message is appended to the RTM_NEWNEXTHOP response: # ip nexthop add id $((2**32-1)) blackhole # strace -e sendto,recvmsg -s 5 ip nexthop sendto(3, [[{nlmsg_len=24, nlmsg_type=RTM_GETNEXTHOP, nlmsg_flags=NLM_F_REQUEST|NLM_F_DUMP, nlmsg_seq=1691394080, nlmsg_pid=0}, {nh_family=AF_UNSPEC, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}], {nlmsg_len=0, nlmsg_type=0 /* NLMSG_??? */, nlmsg_flags=0, nlmsg_seq=0, nlmsg_pid=0}], 152, 0, NULL, 0) = 152 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 56 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[[{nlmsg_len=36, nlmsg_type=RTM_NEWNEXTHOP, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394080, nlmsg_pid=342}, {nh_family=AF_INET, nh_scope=RT_SCOPE_UNIVERSE, nh_protocol=RTPROT_UNSPEC, nh_flags=0}, [[{nla_len=8, nla_type=NHA_ID}, 4294967295], {nla_len=4, nla_type=NHA_BLACKHOLE}]], [{nlmsg_len=20, nlmsg_type=NLMSG_DONE, nlmsg_flags=NLM_F_MULTI, nlmsg_seq=1691394080, nlmsg_pid=342}, 0]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 56 id 4294967295 blackhole +++ exited with 0 +++ Note that if the NLMSG_DONE message cannot be appended because of size limitations, then another recvmsg() will be needed, but the core netlink code will not invoke the dump callback and simply reply with a NLMSG_DONE message since it knows that the callback previously returned zero. Add a test that fails before the fix: # ./fib_nexthops.sh -t basic [...] TEST: Maximum nexthop ID dump [FAIL] [...] And passes after it: # ./fib_nexthops.sh -t basic [...] TEST: Maximum nexthop ID dump [ OK ] [...] Fixes: ab84be7e54fc ("net: Initial nexthop code") Reported-by: Petr Machata <petrm@nvidia.com> Closes: https://lore.kernel.org/netdev/87sf91enuf.fsf@nvidia.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230808075233.3337922-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09tools: ynl-gen: add missing empty line between policiesJakub Kicinski
We're missing empty line between policies. DPLL will need this. Tested-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20230808200907.1290647-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-09tools: ynl-gen: avoid rendering empty validate fieldJiri Pirko
When dont-validate flags are filtered out for do/dump op, the list may be empty. In that case, avoid rendering the validate field. Fixes: fa8ba3502ade ("ynl-gen-c.py: render netlink policies static for split ops") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230808090344.1368874-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-08selftests/bpf: relax expected log messages to allow emitting BPF_STEduard Zingerman
Update [1] to LLVM BPF backend seeks to enable generation of BPF_ST instruction when CPUv4 is selected. This affects expected log messages for the following selftests: - log_fixup/missing_map - spin_lock/lock_id_mapval_preserve - spin_lock/lock_id_innermapval_preserve Expected messages in these tests hard-code instruction numbers for BPF programs compiled from C. These instruction numbers change when BPF_ST is allowed because single BPF_ST instruction replaces a pair of BPF_MOV/BPF_STX instructions, e.g.: r1 = 42; *(u32 *)(r10 - 8) = r1; ---> *(u32 *)(r10 - 8) = 42; This commit updates expected log messages to avoid matching specific instruction numbers (program position still could be uniquely identified). [1] https://reviews.llvm.org/D140804 "[BPF] support for BPF_ST instruction in codegen" Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230808162755.392606-1-eddyz87@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-08selftests/bpf: remove duplicated functionsKui-Feng Lee
The file cgroup_tcp_skb.c contains redundant implementations of the similar functions (create_server_sock_v6(), connect_client_server_v6() and get_sock_port_v6()) found in network_helpers.c. Let's eliminate these duplicated functions. Changes from v1: - Remove get_sock_port_v6() as well. v1: https://lore.kernel.org/all/20230807193840.567962-1-thinker.li@gmail.com/ Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20230808162858.326871-1-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-08perf stat: Don't display zero tool countsIan Rogers
Andi reported (see link below) a regression when printing the 'duration_time' tool event, where it gets printed as "not counted" for most of the CPUs, fix it by skipping zero counts for tool events. Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Claire Jensen <cjense@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/all/ZMlrzcVrVi1lTDmn@tassilo/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-08tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes from these csets: 522b1d69219d8f08 ("x86/cpu/amd: Add a Zenbleed fix") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ Just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZND17H7BI4ariERn@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-08Revert "perf report: Append inlines to non-DWARF callchains"Arnaldo Carvalho de Melo
This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. The tests were made with a specific workload, further tests on a recently updated fedora 38 system with a system wide perf.data file shows 'perf report' taking excessive time resolving inlines in vmlinux, so lets revert this until a full investigation and improvement on the addr2line support code is made. Reported-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Artem Savkov <asavkov@redhat.com> Tested-by: Jesper Dangaard Brouer <hawk@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/ZMl8VyhdwhClTM5g@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-07selftests/bpf: Add bpf_get_func_ip test for uprobe inside functionJiri Olsa
Adding get_func_ip test for uprobe inside function that validates the get_func_ip helper returns correct probe address value. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230807085956.2344866-4-jolsa@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-07selftests/bpf: Add bpf_get_func_ip tests for uprobe on function entryJiri Olsa
Adding get_func_ip tests for uprobe on function entry that validates that bpf_get_func_ip returns proper values from both uprobe and return uprobe. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230807085956.2344866-3-jolsa@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-07bpf: Add support for bpf_get_func_ip helper for uprobe programJiri Olsa
Adding support for bpf_get_func_ip helper for uprobe program to return probed address for both uprobe and return uprobe. We discussed this in [1] and agreed that uprobe can have special use of bpf_get_func_ip helper that differs from kprobe. The kprobe bpf_get_func_ip returns: - address of the function if probe is attach on function entry for both kprobe and return kprobe - 0 if the probe is not attach on function entry The uprobe bpf_get_func_ip returns: - address of the probe for both uprobe and return uprobe The reason for this semantic change is that kernel can't really tell if the probe user space address is function entry. The uprobe program is actually kprobe type program attached as uprobe. One of the consequences of this design is that uprobes do not have its own set of helpers, but share them with kprobes. As we need different functionality for bpf_get_func_ip helper for uprobe, I'm adding the bool value to the bpf_trace_run_ctx, so the helper can detect that it's executed in uprobe context and call specific code. The is_uprobe bool is set as true in bpf_prog_run_array_sleepable, which is currently used only for executing bpf programs in uprobe. Renaming bpf_prog_run_array_sleepable to bpf_prog_run_array_uprobe to address that it's only used for uprobes and that it sets the run_ctx.is_uprobe as suggested by Yafang Shao. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> [1] https://lore.kernel.org/bpf/CAEf4BzZ=xLVkG5eurEuvLU79wAMtwho7ReR+XJAgwhFF4M-7Cg@mail.gmail.com/ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Viktor Malik <vmalik@redhat.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230807085956.2344866-2-jolsa@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-07Merge tag 'x86_bugs_srso' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/srso fixes from Borislav Petkov: "Add a mitigation for the speculative RAS (Return Address Stack) overflow vulnerability on AMD processors. In short, this is yet another issue where userspace poisons a microarchitectural structure which can then be used to leak privileged information through a side channel" * tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/srso: Tie SBPB bit setting to microcode patch detection x86/srso: Add a forgotten NOENDBR annotation x86/srso: Fix return thunks in generated code x86/srso: Add IBPB on VMEXIT x86/srso: Add IBPB x86/srso: Add SRSO_NO support x86/srso: Add IBPB_BRTYPE support x86/srso: Add a Speculative RAS Overflow mitigation x86/bugs: Increase the x86 bugs vector size to two u32s
2023-08-07selftests/bpf: Add a movsx selftest for sign-extension of R10Yonghong Song
A movsx selftest is added for sign-extension of frame pointer R10. The verification fails for both privileged and unprivileged prog runs. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230807175726.672394-1-yonghong.song@linux.dev Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86: - Fix SEV race condition ARM: - Fixes for the configuration of SVE/SME traps when hVHE mode is in use - Allow use of pKVM on systems with FF-A implementations that are v1.0 compatible - Request/release percpu IRQs (arch timer, vGIC maintenance) correctly when pKVM is in use - Fix function prototype after __kvm_host_psci_cpu_entry() rename - Skip to the next instruction when emulating writes to TCR_EL1 on AmpereOne systems Selftests: - Fix missing include" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests/rseq: Fix build with undefined __weak KVM: SEV: remove ghcb variable declarations KVM: SEV: only access GHCB fields once KVM: SEV: snapshot the GHCB before accessing it KVM: arm64: Skip instruction after emulating write to TCR_EL1 KVM: arm64: fix __kvm_host_psci_cpu_entry() prototype KVM: arm64: Fix resetting SME trap values on reset for (h)VHE KVM: arm64: Fix resetting SVE trap values on reset for hVHE KVM: arm64: Use the appropriate feature trap register when activating traps KVM: arm64: Helper to write to appropriate feature trap register based on mode KVM: arm64: Disable SME traps for (h)VHE at setup KVM: arm64: Use the appropriate feature trap register for SVE at EL2 setup KVM: arm64: Factor out code for checking (h)VHE mode into a macro KVM: arm64: Rephrase percpu enable/disable tracking in terms of hyp KVM: arm64: Fix hardware enable/disable flows for pKVM KVM: arm64: Allow pKVM on v1.0 compatible FF-A implementations
2023-08-04selftests: mptcp: join: fix 'implicit EP' testAndrea Claudi
mptcp_join 'implicit EP' test currently fails when using ip mptcp: $ ./mptcp_join.sh -iI <snip> 001 implicit EP creation[fail] expected '10.0.2.2 10.0.2.2 id 1 implicit' found '10.0.2.2 id 1 rawflags 10 ' Error: too many addresses or duplicate one: -22. ID change is prevented[fail] expected '10.0.2.2 10.0.2.2 id 1 implicit' found '10.0.2.2 id 1 rawflags 10 ' modif is allowed[fail] expected '10.0.2.2 10.0.2.2 id 1 signal' found '10.0.2.2 id 1 signal ' This happens because of two reasons: - iproute v6.3.0 does not support the implicit flag, fixed with iproute2-next commit 3a2535a41854 ("mptcp: add support for implicit flag") - pm_nl_check_endpoint wrongly expects the ip address to be repeated two times in iproute output, and does not account for a final whitespace in it. This fixes the issue trimming the whitespace in the output string and removing the double address in the expected string. Fixes: 69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case") Cc: stable@vger.kernel.org Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20230803-upstream-net-20230803-misc-fixes-6-5-v1-2-6671b1ab11cc@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04selftests: mptcp: join: fix 'delete and re-add' testAndrea Claudi
mptcp_join 'delete and re-add' test fails when using ip mptcp: $ ./mptcp_join.sh -iI <snip> 002 delete and re-add before delete[ ok ] mptcp_info subflows=1 [ ok ] Error: argument "ADDRESS" is wrong: invalid for non-zero id address after delete[fail] got 2:2 subflows expected 1 This happens because endpoint delete includes an ip address while id is not 0, contrary to what is indicated in the ip mptcp man page: "When used with the delete id operation, an IFADDR is only included when the ID is 0." This fixes the issue using the $addr variable in pm_nl_del_endpoint() only when id is 0. Fixes: 34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers") Cc: stable@vger.kernel.org Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20230803-upstream-net-20230803-misc-fixes-6-5-v1-1-6671b1ab11cc@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04selftests: net: test vxlan pmtu exceptions with tcpFlorian Westphal
TCP might get stuck if a nonlinear skb exceeds the path MTU, icmp error contains an incorrect icmp checksum in that case. Extend the existing test for vxlan to also send at least 1MB worth of data via TCP in addition to the existing 'large icmp packet adds route exception'. On my test VM this fails due to 0-size output file without "tunnels: fix kasan splat when generating ipv4 pmtu error". Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20230803152653.29535-3-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04Merge tag 'hyperv-fixes-signed-20230804' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Fix a bug in a python script for Hyper-V (Ani Sinha) - Workaround a bug in Hyper-V when IBT is enabled (Michael Kelley) - Fix an issue parsing MP table when Linux runs in VTL2 (Saurabh Sengar) - Several cleanup patches (Nischala Yelchuri, Kameron Carr, YueHaibing, ZhiHu) * tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Remove unused extern declaration vmbus_ontimer() x86/hyperv: add noop functions to x86_init mpparse functions vmbus_testing: fix wrong python syntax for integer value comparison x86/hyperv: fix a warning in mshyperv.h x86/hyperv: Disable IBT when hypercall page lacks ENDBR instruction x86/hyperv: Improve code for referencing hyperv_pcpu_input_arg Drivers: hv: Change hv_free_hyperv_page() to take void * argument
2023-08-04Merge tag 'riscv-for-linus-6.5-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A pair of fixes for build-related failures in the selftests - A fix for a sparse warning in acpi_os_ioremap() - A fix to restore the kernel PA offset in vmcoreinfo, to fix crash handling * tag 'riscv-for-linus-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: Documentation: kdump: Add va_kernel_pa_offset for RISCV64 riscv: Export va_kernel_pa_offset in vmcoreinfo RISC-V: ACPI: Fix acpi_os_ioremap to return iomem address selftests: riscv: Fix compilation error with vstate_exec_nolibc.c selftests/riscv: fix potential build failure during the "emit_tests" step
2023-08-04selftests/rseq: Fix build with undefined __weakMark Brown
Commit 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") which is now in Linus' tree introduced uses of __weak but did nothing to ensure that a definition is provided for it resulting in build failures for the rseq tests: rseq.c:41:1: error: unknown type name '__weak' __weak ptrdiff_t __rseq_offset; ^ rseq.c:41:17: error: expected ';' after top level declarator __weak ptrdiff_t __rseq_offset; ^ ; rseq.c:42:1: error: unknown type name '__weak' __weak unsigned int __rseq_size; ^ rseq.c:43:1: error: unknown type name '__weak' __weak unsigned int __rseq_flags; Fix this by using the definition from tools/include compiler.h. Fixes: 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") Signed-off-by: Mark Brown <broonie@kernel.org> Message-Id: <20230804-kselftest-rseq-build-v1-1-015830b66aa9@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-04libbpf: Use local includes inside the librarySergey Kacheev
In our monrepo, we try to minimize special processing when importing (aka vendor) third-party source code. Ideally, we try to import directly from the repositories with the code without changing it, we try to stick to the source code dependency instead of the artifact dependency. In the current situation, a patch has to be made for libbpf to fix the includes in bpf headers so that they work directly from libbpf/src. Signed-off-by: Sergey Kacheev <s.kacheev@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/CAJVhQqUg6OKq6CpVJP5ng04Dg+z=igevPpmuxTqhsR3dKvd9+Q@mail.gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-04netlink: specs: devlink: add info-get dump opJiri Pirko
Add missing dump op for info-get command and re-generate related devlink-user.[ch] code. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-10-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04ynl-gen-c.py: render netlink policies static for split opsJiri Pirko
When policies are rendered for split ops, they are consumed in the same file. No need to expose them for user outside, make them static. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-5-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04ynl-gen-c.py: allow directional model for kernel modeJiri Pirko
Directional model limitation is only applicable for uapi mode. For kernel mode, the code is generated correctly using right cmd values for do/dump requests. Lift the limitation. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-4-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04ynl-gen-c.py: filter rendering of validate field values for split opsJiri Pirko
For split ops, do and dump has different meaningful values in validate field. Fix the rendering to allow the values per op type as follows: do: strict dump: dump, strict-dump Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230803111340.1074067-3-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04selftests: cgroup: fix test_kmem_basic false positivesJohannes Weiner
This test fails routinely in our prod testing environment, and I can reproduce it locally as well. The test allocates dcache inside a cgroup, then drops the memory limit and checks that usage drops correspondingly. The reason it fails is because dentries are freed with an RCU delay - a debugging sleep shows that usage drops as expected shortly after. Insert a 1s sleep after dropping the limit. This should be good enough, assuming that machines running those tests are otherwise not very busy. Link: https://lkml.kernel.org/r/20230801135632.1768830-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-04selftests: mm: ksm: fix incorrect evaluation of parameterAyush Jain
A missing break in kms_tests leads to kselftest hang when the parameter -s is used. In current code flow because of missing break in -s, -t parses args spilled from -s and as -t accepts only valid values as 0,1 so any arg in -s >1 or <0, gets in ksm_test failure This went undetected since, before the addition of option -t, the next case -M would immediately break out of the switch statement but that is no longer the case Add the missing break statement. ----Before---- ./ksm_tests -H -s 100 Invalid merge type ----After---- ./ksm_tests -H -s 100 Number of normal pages: 0 Number of huge pages: 50 Total size: 100 MiB Total time: 0.401732682 s Average speed: 248.922 MiB/s Link: https://lkml.kernel.org/r/20230728163952.4634-1-ayush.jain3@amd.com Fixes: 07115fcc15b4 ("selftests/mm: add new selftests for KSM") Signed-off-by: Ayush Jain <ayush.jain3@amd.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Stefan Roesch <shr@devkernel.io> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-04radix tree test suite: fix incorrect allocation size for pthreadsColin Ian King
Currently the pthread allocation for each array item is based on the size of a pthread_t pointer and should be the size of the pthread_t structure, so the allocation is under-allocating the correct size. Fix this by using the size of each element in the pthreads array. Static analysis cppcheck reported: tools/testing/radix-tree/regression1.c:180:2: warning: Size of pointer 'threads' used instead of size of its data. [pointerSize] Link: https://lkml.kernel.org/r/20230727160930.632674-1-colin.i.king@gmail.com Fixes: 1366c37ed84b ("radix tree test harness") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-04selftests/bpf: fix the incorrect verification of port numbers.Kui-Feng Lee
Check port numbers before calling htons(). According to Dan Carpenter's report, Smatch identified incorrect port number checks. It is expected that the returned port number is an integer, with negative numbers indicating errors. However, the value was mistakenly verified after being translated by htons(). Major changes from v1: - Move the variable 'port' to the same line of 'err'. Fixes: 539c7e67aa4a ("selftests/bpf: Verify that the cgroup_skb filters receive expected packets.") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/bpf/cafd6585-d5a2-4096-b94f-7556f5aa7737@moroto.mountain/ Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20230804165831.173627-1-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-04selftests/bpf: Add test for detachment on empty mprog entryDaniel Borkmann
Add a detachment test case with miniq present to assert that with and without the miniq we get the same error. # ./test_progs -t tc_opts #244 tc_opts_after:OK #245 tc_opts_append:OK #246 tc_opts_basic:OK #247 tc_opts_before:OK #248 tc_opts_chain_classic:OK #249 tc_opts_delete_empty:OK #250 tc_opts_demixed:OK #251 tc_opts_detach:OK #252 tc_opts_detach_after:OK #253 tc_opts_detach_before:OK #254 tc_opts_dev_cleanup:OK #255 tc_opts_invalid:OK #256 tc_opts_mixed:OK #257 tc_opts_prepend:OK #258 tc_opts_replace:OK #259 tc_opts_revision:OK Summary: 16/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20230804131112.11012-2-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-04Merge tag 'counter-fixes-for-6.5b' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/wbg/counter into char-misc-linus William writes: Second set of Counter fixes for 6.5 The I8254 Kconfig entry is repositioned to resolve a misplacement causing the "Counter support" submenu items to disappear in menuconfig. The tools/counter/Makefile clean recipe is adjusted to replace rmdir with an equivalent set of rm to prevent failure if someone tries to clean the counter directory without building it first. * tag 'counter-fixes-for-6.5b' of git://git.kernel.org/pub/scm/linux/kernel/git/wbg/counter: tools/counter: Makefile: Replace rmdir by rm to avoid make,clean failure counter: Fix menuconfig "Counter support" submenu entries disappearance
2023-08-03Merge tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix segfault in the powerpc specific arch_skip_callchain_idx function. The patch doing the reference count init/exit that went into 6.5 missed this function. - Fix regression reading the arm64 PMU cpu slots in sysfs, a patch removing some code duplication ended up duplicating the /sysfs prefix for these files. - Fix grouping of events related to topdown, addressing a regression on the CSV output produced by 'perf stat' noticed on the downstream tool toplev. - Fix the uprobe_from_different_cu 'perf test' entry, it is failing when gcc isn't available, so we need to check that and skip the test if it is not installed. * tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf test parse-events: Test complex name has required event format perf pmus: Create placholder regardless of scanning core_only perf test uprobe_from_different_cu: Skip if there is no gcc perf parse-events: Only move force grouped evsels when sorting perf parse-events: When fixing group leaders always set the leader perf parse-events: Extra care around force grouped events perf callchain powerpc: Fix addr location init during arch_skip_callchain_idx function perf pmu arm64: Fix reading the PMU cpu slots in sysfs
2023-08-03Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2023-08-03 We've added 54 non-merge commits during the last 10 day(s) which contain a total of 84 files changed, 4026 insertions(+), 562 deletions(-). The main changes are: 1) Add SO_REUSEPORT support for TC bpf_sk_assign from Lorenz Bauer, Daniel Borkmann 2) Support new insns from cpu v4 from Yonghong Song 3) Non-atomically allocate freelist during prefill from YiFei Zhu 4) Support defragmenting IPv(4|6) packets in BPF from Daniel Xu 5) Add tracepoint to xdp attaching failure from Leon Hwang 6) struct netdev_rx_queue and xdp.h reshuffling to reduce rebuild time from Jakub Kicinski * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits) net: invert the netdevice.h vs xdp.h dependency net: move struct netdev_rx_queue out of netdevice.h eth: add missing xdp.h includes in drivers selftests/bpf: Add testcase for xdp attaching failure tracepoint bpf, xdp: Add tracepoint to xdp attaching failure selftests/bpf: fix static assert compilation issue for test_cls_*.c bpf: fix bpf_probe_read_kernel prototype mismatch riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework libbpf: fix typos in Makefile tracing: bpf: use struct trace_entry in struct syscall_tp_t bpf, devmap: Remove unused dtab field from bpf_dtab_netdev bpf, cpumap: Remove unused cmap field from bpf_cpu_map_entry netfilter: bpf: Only define get_proto_defrag_hook() if necessary bpf: Fix an array-index-out-of-bounds issue in disasm.c net: remove duplicate INDIRECT_CALLABLE_DECLARE of udp[6]_ehashfn docs/bpf: Fix malformed documentation bpf: selftests: Add defrag selftests bpf: selftests: Support custom type and proto for client sockets bpf: selftests: Support not connecting client socket netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link ... ==================== Link: https://lore.kernel.org/r/20230803174845.825419-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: net/dsa/port.c 9945c1fb03a3 ("net: dsa: fix older DSA drivers using phylink") a88dd7538461 ("net: dsa: remove legacy_pre_march2020 detection") https://lore.kernel.org/all/20230731102254.2c9868ca@canb.auug.org.au/ net/xdp/xsk.c 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark") b7f72a30e9ac ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path") https://lore.kernel.org/all/20230731102631.39988412@canb.auug.org.au/ drivers/net/ethernet/broadcom/bnxt/bnxt.c 37b61cda9c16 ("bnxt: don't handle XDP in netpoll") 2b56b3d99241 ("eth: bnxt: handle invalid Tx completions more gracefully") https://lore.kernel.org/all/20230801101708.1dc7faac@canb.auug.org.au/ Adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c 62da08331f1a ("net/mlx5e: Set proper IPsec source port in L4 selector") fbd517549c32 ("net/mlx5e: Add function to get IPsec offload namespace") drivers/net/ethernet/sfc/selftest.c 55c1528f9b97 ("sfc: fix field-spanning memcpy in selftest") ae9d445cd41f ("sfc: Miscellaneous comment removals") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03Merge tag 'net-6.5-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and wireless. Nothing scary here. Feels like the first wave of regressions from v6.5 is addressed - one outstanding fix still to come in TLS for the sendpage rework. Current release - regressions: - udp: fix __ip_append_data()'s handling of MSG_SPLICE_PAGES - dsa: fix older DSA drivers using phylink Previous releases - regressions: - gro: fix misuse of CB in udp socket lookup - mlx5: unregister devlink params in case interface is down - Revert "wifi: ath11k: Enable threaded NAPI" Previous releases - always broken: - sched: cls_u32: fix match key mis-addressing - sched: bind logic fixes for cls_fw, cls_u32 and cls_route - add bound checks to a number of places which hand-parse netlink - bpf: disable preemption in perf_event_output helpers code - qed: fix scheduling in a tasklet while getting stats - avoid using APIs which are not hardirq-safe in couple of drivers, when we may be in a hard IRQ (netconsole) - wifi: cfg80211: fix return value in scan logic, avoid page allocator warning - wifi: mt76: mt7615: do not advertise 5 GHz on first PHY of MT7615D (DBDC) Misc: - drop handful of inactive maintainers, put some new in place" * tag 'net-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (98 commits) MAINTAINERS: update TUN/TAP maintainers test/vsock: remove vsock_perf executable on `make clean` tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen tcp_metrics: annotate data-races around tm->tcpm_net tcp_metrics: annotate data-races around tm->tcpm_vals[] tcp_metrics: annotate data-races around tm->tcpm_lock tcp_metrics: annotate data-races around tm->tcpm_stamp tcp_metrics: fix addr_same() helper prestera: fix fallback to previous version on same major version udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES net/mlx5e: Set proper IPsec source port in L4 selector net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio net/mlx5: fs_core: Make find_closest_ft more generic wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1() vxlan: Fix nexthop hash size ip6mr: Fix skb_under_panic in ip6mr_cache_report() s390/qeth: Don't call dev_close/dev_open (DOWN/UP) net: tap_open(): set sk_uid from current_fsuid() net: tun_chr_open(): set sk_uid from current_fsuid() net: dcb: choose correct policy to parse DCB_ATTR_BCN ...
2023-08-03test/vsock: remove vsock_perf executable on `make clean`Stefano Garzarella
We forgot to add vsock_perf to the rm command in the `clean` target, so now we have a left over after `make clean` in tools/testing/vsock. Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility") Cc: AVKrasnov@sberdevices.ru Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Link: https://lore.kernel.org/r/20230803085454.30897-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03selftests: openvswitch: add ct-nat test case with ipv4Aaron Conole
Building on the previous work, add a very simplistic NAT case using ipv4. This just tests dnat transformation Signed-off-by: Aaron Conole <aconole@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-03selftests: openvswitch: add basic ct test case parsingAaron Conole
Forwarding via ct() action is an important use case for openvswitch, but generally would require using a full ovs-vswitchd to get working. Add a ct action parser for basic ct test case. Signed-off-by: Aaron Conole <aconole@redhat.com> Reviewed-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-03selftests: openvswitch: add a test for ipv4 forwardingAaron Conole
This is a simple ipv4 bidirectional connectivity test. Signed-off-by: Aaron Conole <aconole@redhat.com> Reviewed-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>