summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-12-19selftests/bpf: add testcase to verifier_bounds.c for BPF_JNEMenglong Dong
Add testcase for the logic that the verifier tracks the BPF_JNE for regs. The assembly function "reg_not_equal_const()" and "reg_equal_const" that we add is exactly converted from the following case: u32 a = bpf_get_prandom_u32(); u64 b = 0; a %= 8; /* the "a > 0" here will be optimized to "a != 0" */ if (a > 0) { /* now the range of a should be [1, 7] */ bpf_skb_store_bytes(skb, 0, &b, a, 0); } Signed-off-by: Menglong Dong <menglong8.dong@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231219134800.1550388-5-menglong8.dong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-19selftests/bpf: activate the OP_NE logic in range_cond()Menglong Dong
The edge range checking for the registers is supported by the verifier now, so we can activate the extended logic in tools/testing/selftests/bpf/prog_tests/reg_bounds.c/range_cond() to test such logic. Besides, I added some cases to the "crafted_cases" array for this logic. These cases are mainly used to test the edge of the src reg and dst reg. All reg bounds testings has passed in the SLOW_TESTS mode: $ export SLOW_TESTS=1 && ./test_progs -t reg_bounds -j Summary: 65/18959832 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Menglong Dong <menglong8.dong@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231219134800.1550388-4-menglong8.dong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-19selftests/bpf: remove reduplicated s32 casting in "crafted_cases"Menglong Dong
The "S32_MIN" is already defined with s32 casting, so there is no need to do it again. Signed-off-by: Menglong Dong <menglong8.dong@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231219134800.1550388-3-menglong8.dong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-19bpf: make the verifier tracks the "not equal" for regsMenglong Dong
We can derive some new information for BPF_JNE in regs_refine_cond_op(). Take following code for example: /* The type of "a" is u32 */ if (a > 0 && a < 100) { /* the range of the register for a is [0, 99], not [1, 99], * and will cause the following error: * * invalid zero-sized read * * as a can be 0. */ bpf_skb_store_bytes(skb, xx, xx, a, 0); } In the code above, "a > 0" will be compiled to "jmp xxx if a == 0". In the TRUE branch, the dst_reg will be marked as known to 0. However, in the fallthrough(FALSE) branch, the dst_reg will not be handled, which makes the [min, max] for a is [0, 99], not [1, 99]. For BPF_JNE, we can reduce the range of the dst reg if the src reg is a const and is exactly the edge of the dst reg. Signed-off-by: Menglong Dong <menglong8.dong@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20231219134800.1550388-2-menglong8.dong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-19Merge tag 'for-netdev' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-12-19 Hi David, hi Jakub, hi Paolo, hi Eric, The following pull-request contains BPF updates for your *net-next* tree. We've added 2 non-merge commits during the last 1 day(s) which contain a total of 40 files changed, 642 insertions(+), 2926 deletions(-). The main changes are: 1) Revert all of BPF token-related patches for now as per list discussion [0], from Andrii Nakryiko. [0] https://lore.kernel.org/bpf/CAHk-=wg7JuFYwGy=GOMbRCtOL+jwSQsdUaBsRWkDVYbxipbM5A@mail.gmail.com 2) Fix a syzbot-reported use-after-free read in nla_find() triggered from bpf_skb_get_nlattr_nest() helper, from Jakub Kicinski. bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: Revert BPF token-related functionality bpf: Use nla_ok() instead of checking nla_len directly ==================== Link: https://lore.kernel.org/r/20231219170359.11035-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19Revert BPF token-related functionalityAndrii Nakryiko
This patch includes the following revert (one conflicting BPF FS patch and three token patch sets, represented by merge commits): - revert 0f5d5454c723 "Merge branch 'bpf-fs-mount-options-parsing-follow-ups'"; - revert 750e785796bb "bpf: Support uid and gid when mounting bpffs"; - revert 733763285acf "Merge branch 'bpf-token-support-in-libbpf-s-bpf-object'"; - revert c35919dcce28 "Merge branch 'bpf-token-and-bpf-fs-based-delegation'". Link: https://lore.kernel.org/bpf/CAHk-=wg7JuFYwGy=GOMbRCtOL+jwSQsdUaBsRWkDVYbxipbM5A@mail.gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-12-19Merge branch 'devlink-introduce-notifications-filtering'Paolo Abeni
Jiri Pirko says: ==================== devlink: introduce notifications filtering From: Jiri Pirko <jiri@nvidia.com> Currently the user listening on a socket for devlink notifications gets always all messages for all existing devlink instances and objects, even if he is interested only in one of those. That may cause unnecessary overhead on setups with thousands of instances present. User is currently able to narrow down the devlink objects replies to dump commands by specifying select attributes. Allow similar approach for notifications providing user a new notify-filter-set command to select attributes with values the notification message has to match. In that case, it is delivered to the socket. Note that the filtering is done per-socket, so multiple users may specify different selection of attributes with values. This patchset initially introduces support for following attributes: DEVLINK_ATTR_BUS_NAME DEVLINK_ATTR_DEV_NAME DEVLINK_ATTR_PORT_INDEX Patches #1 - #4 are preparations in devlink code, patch #3 is an optimization done on the way. Patches #5 - #7 are preparations in netlink and generic netlink code. Patch #8 is the main one in this set implementing of the notify-filter-set command and the actual per-socket filtering. Patch #9 extends the infrastructure allowing to filter according to a port index. Example: $ devlink mon port pci/0000:08:00.0/32768 [port,new] pci/0000:08:00.0/32768: type notset flavour pcisf controller 0 pfnum 0 sfnum 107 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable [port,new] pci/0000:08:00.0/32768: type eth flavour pcisf controller 0 pfnum 0 sfnum 107 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable [port,new] pci/0000:08:00.0/32768: type eth netdev eth3 flavour pcisf controller 0 pfnum 0 sfnum 107 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable [port,new] pci/0000:08:00.0/32768: type eth netdev eth3 flavour pcisf controller 0 pfnum 0 sfnum 107 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable [port,new] pci/0000:08:00.0/32768: type eth flavour pcisf controller 0 pfnum 0 sfnum 107 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable [port,new] pci/0000:08:00.0/32768: type notset flavour pcisf controller 0 pfnum 0 sfnum 107 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable [port,del] pci/0000:08:00.0/32768: type notset flavour pcisf controller 0 pfnum 0 sfnum 107 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable ==================== Link: https://lore.kernel.org/r/20231216123001.1293639-1-jiri@resnulli.us Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19devlink: extend multicast filtering by port indexJiri Pirko
Expose the previously introduced notification multicast messages filtering infrastructure and allow the user to select messages using port index. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19devlink: add a command to set notification filter and use it for multicastsJiri Pirko
Currently the user listening on a socket for devlink notifications gets always all messages for all existing instances, even if he is interested only in one of those. That may cause unnecessary overhead on setups with thousands of instances present. User is currently able to narrow down the devlink objects replies to dump commands by specifying select attributes. Allow similar approach for notifications. Introduce a new devlink NOTIFY_FILTER_SET which the user passes the select attributes. Store these per-socket and use them for filtering messages during multicast send. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19genetlink: introduce helpers to do filtered multicastJiri Pirko
Currently it is possible for netlink kernel user to pass custom filter function to broadcast send function netlink_broadcast_filtered(). However, this is not exposed to multicast send and to generic netlink users. Extend the api and introduce a netlink helper nlmsg_multicast_filtered() and a generic netlink helper genlmsg_multicast_netns_filtered() to allow generic netlink families to specify filter function while sending multicast messages. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19netlink: introduce typedef for filter functionJiri Pirko
Make the code using filter function a bit nicer by consolidating the filter function arguments using typedef. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19genetlink: introduce per-sock family private storageJiri Pirko
Introduce an xarray for Generic netlink family to store per-socket private. Initialize this xarray only if family uses per-socket privs. Introduce genl_sk_priv_get() to get the socket priv pointer for a family and initialize it in case it does not exist. Introduce __genl_sk_priv_get() to obtain socket priv pointer for a family under RCU read lock. Allow family to specify the priv size, init() and destroy() callbacks. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19devlink: introduce a helper for netlink multicast sendJiri Pirko
Introduce a helper devlink_nl_notify_send() so each object notification function does not have to call genlmsg_multicast_netns() with the same arguments. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19devlink: send notifications only if there are listenersJiri Pirko
Introduce devlink_nl_notify_need() helper and using it to check at the beginning of notification functions to avoid overhead of composing notification messages in case nobody listens. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19devlink: introduce __devl_is_registered() helper and use it instead of ↵Jiri Pirko
xa_get_mark() Introduce __devl_is_registered() which does not assert on devlink instance lock and use it in notifications which may be called without devlink instance lock held. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19devlink: use devl_is_registered() helper instead xa_get_mark()Jiri Pirko
Instead of checking the xarray mark directly using xa_get_mark() helper use devl_is_registered() helper which wraps it up. Note that there are couple more users of xa_get_mark() left which are going to be handled by the next patch. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19bpf: Use nla_ok() instead of checking nla_len directlyJakub Kicinski
nla_len may also be too short to be sane, in which case after recent changes nla_len() will return a wrapped value. Fixes: 172db56d90d2 ("netlink: Return unsigned value for nla_len()") Reported-by: syzbot+f43a23b6e622797c7a28@syzkaller.appspotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/bpf/20231218231904.260440-1-kuba@kernel.org
2023-12-19Merge branch 'add-pf-vf-mailbox-support'Paolo Abeni
Shinas Rasheed says: ==================== add PF-VF mailbox support This patchset aims to add PF-VF mailbox support, its related version support, and relevant control net support for immediate functionalities such as firmware notifications to VF. Changes: V6: - Fixed 1/4 patch to apply to top of net-next merged with net fixes V5: https://lore.kernel.org/all/20231214164536.2670006-1-srasheed@marvell.com/ - Refactored patches to cut out redundant changes in 1/4 patch. V4: https://lore.kernel.org/all/20231213035816.2656851-1-srasheed@marvell.com/ - Included tag [1/4] in subject of first patch of series which was lost in V3 V3: https://lore.kernel.org/all/20231211063355.2630028-1-srasheed@marvell.com/ - Corrected error cleanup logic for PF-VF mbox setup - Removed double inclusion of types.h header file in octep_pfvf_mbox.c V2: https://lore.kernel.org/all/20231209081450.2613561-1-srasheed@marvell.com/ - Removed unused variable in PATCH 1/4 V1: https://lore.kernel.org/all/20231208070352.2606192-1-srasheed@marvell.com/ ==================== Link: https://lore.kernel.org/r/20231215181425.2681426-1-srasheed@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19octeon_ep: support firmware notifications for VFsShinas Rasheed
Notifications from firmware to vf has to pass through PF control mbox and via PF-VF mailboxes. The notifications have to be parsed out from the control mbox and passed to the PF-VF mailbox in order to reach the corresponding VF. Version compatibility should also be checked before messages are passed to the mailboxes. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19octeon_ep: control net framework to support VF offloadsShinas Rasheed
Inquire firmware on supported offloads, as well as convey offloads enabled dynamically to firmware for the VFs. Implement control net API to support the same. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19octeon_ep: PF-VF mailbox version supportShinas Rasheed
Add PF-VF mailbox initial version support Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-19octeon_ep: add PF-VF mailbox communicationShinas Rasheed
Implement mailbox communication between PF and VFs. PF-VF mailbox is used for all control commands from VF to PF and asynchronous notification messages from PF to VF. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-18Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2023-12-18 This PR is larger than usual and contains changes in various parts of the kernel. The main changes are: 1) Fix kCFI bugs in BPF, from Peter Zijlstra. End result: all forms of indirect calls from BPF into kernel and from kernel into BPF work with CFI enabled. This allows BPF to work with CONFIG_FINEIBT=y. 2) Introduce BPF token object, from Andrii Nakryiko. It adds an ability to delegate a subset of BPF features from privileged daemon (e.g., systemd) through special mount options for userns-bound BPF FS to a trusted unprivileged application. The design accommodates suggestions from Christian Brauner and Paul Moore. Example: $ sudo mkdir -p /sys/fs/bpf/token $ sudo mount -t bpf bpffs /sys/fs/bpf/token \ -o delegate_cmds=prog_load:MAP_CREATE \ -o delegate_progs=kprobe \ -o delegate_attachs=xdp 3) Various verifier improvements and fixes, from Andrii Nakryiko, Andrei Matei. - Complete precision tracking support for register spills - Fix verification of possibly-zero-sized stack accesses - Fix access to uninit stack slots - Track aligned STACK_ZERO cases as imprecise spilled registers. It improves the verifier "instructions processed" metric from single digit to 50-60% for some programs. - Fix verifier retval logic 4) Support for VLAN tag in XDP hints, from Larysa Zaremba. 5) Allocate BPF trampoline via bpf_prog_pack mechanism, from Song Liu. End result: better memory utilization and lower I$ miss for calls to BPF via BPF trampoline. 6) Fix race between BPF prog accessing inner map and parallel delete, from Hou Tao. 7) Add bpf_xdp_get_xfrm_state() kfunc, from Daniel Xu. It allows BPF interact with IPSEC infra. The intent is to support software RSS (via XDP) for the upcoming ipsec pcpu work. Experiments on AWS demonstrate single tunnel pcpu ipsec reaching line rate on 100G ENA nics. 8) Expand bpf_cgrp_storage to support cgroup1 non-attach, from Yafang Shao. 9) BPF file verification via fsverity, from Song Liu. It allows BPF progs get fsverity digest. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (164 commits) bpf: Ensure precise is reset to false in __mark_reg_const_zero() selftests/bpf: Add more uprobe multi fail tests bpf: Fail uprobe multi link with negative offset selftests/bpf: Test the release of map btf s390/bpf: Fix indirect trampoline generation selftests/bpf: Temporarily disable dummy_struct_ops test on s390 x86/cfi,bpf: Fix bpf_exception_cb() signature bpf: Fix dtor CFI cfi: Add CFI_NOSEAL() x86/cfi,bpf: Fix bpf_struct_ops CFI x86/cfi,bpf: Fix bpf_callback_t CFI x86/cfi,bpf: Fix BPF JIT call cfi: Flip headers selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment bpf: Limit the number of kprobes when attaching program to multiple kprobes bpf: Limit the number of uprobes when attaching program to multiple uprobes bpf: xdp: Register generic_kfunc_set with XDP programs selftests/bpf: utilize string values for delegate_xxx mount options ... ==================== Link: https://lore.kernel.org/r/20231219000520.34178-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18Merge tag 'wireless-next-2023-12-18' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.8 The second features pull request for v6.8. A bigger one this time with changes both to stack and drivers. We have a new Wifi band RFI (WBRF) mitigation feature for which we pulled an immutable branch shared with other subsystems. And, as always, other new features and bug fixes all over. Major changes: cfg80211/mac80211 * AMD ACPI based Wifi band RFI (WBRF) mitigation feature * Basic Service Set (BSS) usage reporting * TID to link mapping support * mac80211 hardware flag to disallow puncturing iwlwifi * new debugfs file fw_dbg_clear mt76 * NVMEM EEPROM improvements * mt7996 Extremely High Throughpu (EHT) improvements * mt7996 Wireless Ethernet Dispatcher (WED) support * mt7996 36-bit DMA support ath12k * support one MSI vector * WCN7850: support AP mode * tag 'wireless-next-2023-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (207 commits) wifi: mt76: mt7996: Use DECLARE_FLEX_ARRAY() and fix -Warray-bounds warnings wifi: ath11k: workaround too long expansion sparse warnings Revert "wifi: ath12k: use ATH12K_PCI_IRQ_DP_OFFSET for DP IRQ" wifi: rt2x00: remove useless code in rt2x00queue_create_tx_descriptor() wifi: rtw89: only reset BB/RF for existing WiFi 6 chips while starting up wifi: rtw89: add DBCC H2C to notify firmware the status wifi: rtw89: mac: add suffix _ax to MAC functions wifi: rtw89: mac: add flags to check if CMAC and DMAC are enabled wifi: rtw89: 8922a: add power on/off functions wifi: rtw89: add XTAL SI for WiFi 7 chips wifi: rtw89: phy: print out RFK log with formatted string wifi: rtw89: parse and print out RFK log from C2H events wifi: rtw89: add C2H event handlers of RFK log and report wifi: rtw89: load RFK log format string from firmware file wifi: rtw89: fw: add version field to BB MCU firmware element wifi: rtw89: fw: load TX power track tables from fw_element wifi: mwifiex: configure BSSID consistently when starting AP wifi: mwifiex: add extra delay for firmware ready wifi: mac80211: sta_info.c: fix sentence grammar wifi: mac80211: rx.c: fix sentence grammar ... ==================== Link: https://lore.kernel.org/r/20231218163900.C031DC433C9@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18bpf: Ensure precise is reset to false in __mark_reg_const_zero()Andrii Nakryiko
It is safe to always start with imprecise SCALAR_VALUE register. Previously __mark_reg_const_zero() relied on caller to reset precise mark, but it's very error prone and we already missed it in a few places. So instead make __mark_reg_const_zero() reset precision always, as it's a safe default for SCALAR_VALUE. Explanation is basically the same as for why we are resetting (or rather not setting) precision in current state. If necessary, precision propagation will set it to precise correctly. As such, also remove a big comment about forward precision propagation in mark_reg_stack_read() and avoid unnecessarily setting precision to true after reading from STACK_ZERO stack. Again, precision propagation will correctly handle this, if that SCALAR_VALUE register will ever be needed to be precise. Reported-by: Maxim Mikityanskiy <maxtram95@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231218173601.53047-1-andrii@kernel.org
2023-12-18Merge branch 'tools-net-ynl-add-sub-message-support-to-ynl'Jakub Kicinski
Donald Hunter says: ==================== tools/net/ynl: Add 'sub-message' support to ynl This patchset adds a 'sub-message' attribute type to the netlink-raw schema and implements it in ynl. This provides support for kind-specific options attributes as used in rt_link and tc raw netlink families. A description of the new 'sub-message' attribute type and the corresponding sub-message definitions is provided in patch 3. The patchset includes updates to the rt_link spec and a new tc spec that make use of the new 'sub-message' attribute type. As mentioned in patch 4, encode support is not yet implemented in ynl and support for sub-message selectors at a different nest level from the key attribute is not yet supported. I plan to work on these in follow-up patches. Patches 1 is code cleanup in ynl Patches 2-4 add sub-message support to the schema and ynl with documentation updates. Patch 5 adds binary and pad support to structs in netlink-raw. Patches 6-8 contain specs that use the sub-message attribute type. Patches 9-13 update ynl-gen-rst and its make target ==================== Link: https://lore.kernel.org/r/20231215093720.18774-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18tools/net/ynl-gen-rst: Remove extra indentation from generated docsDonald Hunter
The output from ynl-gen-rst.py has extra indentation that causes extra <blockquote> elements to be generated in the HTML output. Reduce the indentation so that sphinx doesn't generate unnecessary <blockquote> elements. Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-14-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18tools/net/ynl-gen-rst: Remove bold from attribute-set headingsDonald Hunter
The generated .rst for attribute-sets currently uses a sub-sub-heading for each attribute, with the attribute name in bold. This makes attributes stand out more than the attribute-set sub-headings they are part of. Remove the bold markup from attribute sub-sub-headings. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-13-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18tools/net/ynl-gen-rst: Sort the index of generated netlink specsDonald Hunter
The index of netlink specs was being generated unsorted. Sort the output before generating the index entries. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-12-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18tools/net/ynl-gen-rst: Add sub-messages to generated docsDonald Hunter
Add a section for sub-messages to the generated .rst files. Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-11-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18doc/netlink: Regenerate netlink .rst files if ynl-gen-rst changesDonald Hunter
Add ynl-gen-rst.py to the dependencies for the netlink .rst files in the doc Makefile so that the docs get regenerated if the ynl-gen-rst.py script is modified. Use $(Q) to honour V=1 in the rules that run ynl-gen-rst.py Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-10-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18doc/netlink/specs: Add a spec for tcDonald Hunter
This is a work-in-progress spec for tc that covers: - most of the qdiscs - the flower classifier - new, del, get for qdisc, chain, class and filter Notable omissions: - most of the stats attrs are left as binary blobs - notifications are not yet implemented Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-9-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18doc/netlink/specs: use pad in structs in rt_linkDonald Hunter
The rt_link spec was using pad1, pad2 attributes in structs which appears in the ynl output. Replace this with the 'pad' type which doesn't pollute the output. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-8-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18doc/netlink/specs: Add sub-message type to rt_link familyDonald Hunter
Start using sub-message selectors in the rt_link spec for the link-specific 'data' and 'slave-data' attributes. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-7-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18tools/net/ynl: Add binary and pad support to structs for tcDonald Hunter
The tc netlink-raw family needs binary and pad types for several qopt C structs. Add support for them to ynl. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-6-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18tools/net/ynl: Add 'sub-message' attribute decoding to ynlDonald Hunter
Implement the 'sub-message' attribute type in ynl. Encode support is not yet implemented. Support for sub-message selectors at a different nest level from the key attribute is not yet supported. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-5-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18doc/netlink: Document the sub-message format for netlink-rawDonald Hunter
Document the spec format used by netlink-raw families like rt and tc. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-4-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18doc/netlink: Add sub-message support to netlink-rawDonald Hunter
Add a 'sub-message' attribute type with a selector that supports polymorphic attribute formats for raw netlink families like tc. A sub-message attribute uses the value of another attribute as a selector key to choose the right sub-message format. For example if the following attribute has already been decoded: { "kind": "gre" } and we encounter the following attribute spec: - name: data type: sub-message sub-message: linkinfo-data-msg selector: kind Then we look for a sub-message definition called 'linkinfo-data-msg' and use the value of the 'kind' attribute i.e. 'gre' as the key to choose the correct format for the sub-message: sub-messages: name: linkinfo-data-msg formats: - value: bridge attribute-set: linkinfo-bridge-attrs - value: gre attribute-set: linkinfo-gre-attrs - value: geneve attribute-set: linkinfo-geneve-attrs This would decode the attribute value as a sub-message with the attribute-set called 'linkinfo-gre-attrs' as the attribute space. A sub-message can have an optional 'fixed-header' followed by zero or more attributes from an attribute-set. For example the following 'tc-options-msg' sub-message defines message formats that use a mixture of fixed-header, attribute-set or both together: sub-messages: - name: tc-options-msg formats: - value: bfifo fixed-header: tc-fifo-qopt - value: cake attribute-set: tc-cake-attrs - value: netem fixed-header: tc-netem-qopt attribute-set: tc-netem-attrs Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-3-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18tools/net/ynl: Use consistent array index expression formattingDonald Hunter
Use expression formatting that conforms to the python style guide. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18Merge branch 'bpf-add-check-for-negative-uprobe-multi-offset'Andrii Nakryiko
Jiri Olsa says: ==================== bpf: Add check for negative uprobe multi offset hi, adding the check for negative offset for uprobe multi link. v2 changes: - add more failure checks [Alan] - move the offset retrieval/check up in the loop to be done earlier [Song] thanks, jirka --- ==================== Link: https://lore.kernel.org/r/20231217215538.3361991-1-jolsa@kernel.org Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-12-18selftests/bpf: Add more uprobe multi fail testsJiri Olsa
We fail to create uprobe if we pass negative offset. Add more tests validating kernel-side error checking code. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20231217215538.3361991-3-jolsa@kernel.org
2023-12-18bpf: Fail uprobe multi link with negative offsetJiri Olsa
Currently the __uprobe_register will return 0 (success) when called with negative offset. The reason is that the call to register_for_each_vma and then build_map_info won't return error for negative offset. They just won't do anything - no matching vma is found so there's no registered breakpoint for the uprobe. I don't think we can change the behaviour of __uprobe_register and fail for negative uprobe offset, because apps might depend on that already. But I think we can still make the change and check for it on bpf multi link syscall level. Also moving the __get_user call and check for the offsets to the top of loop, to fail early without extra __get_user calls for ref_ctr_offset and cookie arrays. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20231217215538.3361991-2-jolsa@kernel.org
2023-12-18selftests/bpf: Test the release of map btfHou Tao
When there is bpf_list_head or bpf_rb_root field in map value, the free of map btf and the free of map value may run concurrently and there may be use-after-free problem, so add two test cases to demonstrate it. And the use-after-free problem can been easily reproduced by using bpf_next tree and a KASAN-enabled kernel. The first test case tests the racing between the free of map btf and the free of array map. It constructs the racing by releasing the array map in the end after other ref-counter of map btf has been released. To delay the free of array map and make it be invoked after btf_free_rcu() is invoked, it stresses system_unbound_wq by closing multiple percpu array maps before it closes the array map. The second case tests the racing between the free of map btf and the free of inner map. Beside using the similar method as the first one does, it uses bpf_map_delete_elem() to delete the inner map and to defer the release of inner map after one RCU grace period. The reason for using two skeletons is to prevent the release of outer map and inner map in map_in_map_btf.c interfering the release of bpf map in normal_map_btf.c. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20231216035510.4030605-1-houtao@huaweicloud.com
2023-12-18s390/bpf: Fix indirect trampoline generationAlexei Starovoitov
The func_addr used to be NULL for indirect trampolines used by struct_ops. Now func_addr is a valid function pointer. Hence use BPF_TRAMP_F_INDIRECT flag to detect such condition. Fixes: 2cd3e3772e41 ("x86/cfi,bpf: Fix bpf_struct_ops CFI") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20231216004549.78355-1-alexei.starovoitov@gmail.com
2023-12-18Merge branch 'rtnl-rcu'David S. Miller
Pedro Tammela says: ==================== net: rtnl: introduce rcu_replace_pointer_rtnl Introduce the rcu_replace_pointer_rtnl helper to lockdep check rtnl lock rcu replacements, alongside the already existing helpers. Patch 2 uses the new helper in the rtnl_unregister_* functions. Originally this change was part of the P4TC series, as it's a recurrent pattern there, but since it has a use case in mainline we are pushing it separately. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-18net: rtnl: use rcu_replace_pointer_rtnl in rtnl_unregister_*Pedro Tammela
With the introduction of the rcu_replace_pointer_rtnl helper, cleanup the rtnl_unregister_* functions to use the helper instead of open coding it. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-18net: rtnl: introduce rcu_replace_pointer_rtnlJamal Hadi Salim
Introduce the rcu_replace_pointer_rtnl helper to lockdep check rtnl lock rcu replacements, alongside the already existing helpers. This is a quality of life helper so instead of using: rcu_replace_pointer(rp, p, lockdep_rtnl_is_held()) .. or the open coded.. rtnl_dereference() / rcu_assign_pointer() .. or the lazy check version .. rcu_replace_pointer(rp, p, 1) Use: rcu_replace_pointer_rtnl(rp, p) Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-17Merge branch 'phy-ackage-addr-mmd-apis'David S. Miller
Christian Marangi says: ==================== net: phy: add PHY package base addr + mmd APIs This small series is required for the upcoming qca807x PHY that will make use of PHY package mmd API and the new implementation with read/write based on base addr. The MMD PHY package patch currently has no use but it will be used in the upcoming patch and it does complete what a PHY package may require in addition to basic read/write to setup global PHY address. (Changelog for all the revision is present in the single patch) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-17net: phy: add support for PHY package MMD read/writeChristian Marangi
Some PHY in PHY package may require to read/write MMD regs to correctly configure the PHY package. Add support for these additional required function in both lock and no lock variant. It's assumed that the entire PHY package is either C22 or C45. We use C22 or C45 way of writing/reading to mmd regs based on the passed phydev whether it's C22 or C45. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-17net: phy: restructure __phy_write/read_mmd to helper and phydev userChristian Marangi
Restructure phy_write_mmd and phy_read_mmd to implement generic helper for direct mdiobus access for mmd and use these helper for phydev user. This is needed in preparation of PHY package API that requires generic access to the mdiobus and are deatched from phydev struct but instead access them based on PHY package base_addr and offsets. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>