summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2023-03-07Merge branch 'main' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Restore ctnetlink zero mark in events and dump, from Ivan Delalande. 2) Fix deadlock due to missing disabled bh in tproxy, from Florian Westphal. 3) Safer maximum chain load in conntrack, from Eric Dumazet. * 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: conntrack: adopt safer max chain length netfilter: tproxy: fix deadlock due to missing BH disable netfilter: ctnetlink: revert to dumping mark regardless of event type ==================== Link: https://lore.kernel.org/r/20230307100424.2037-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-07wifi: nl80211: Add support for randomizing TA of auth and deauth framesVeerendranath Jakkam
Add support to use a random local address in authentication and deauthentication frames sent to unassociated peer when the driver supports. The driver needs to configure receive behavior to accept frames with random transmit address specified in TX path authentication frames during the time of the frame exchange is pending and such frames need to be acknowledged similarly to frames sent to the local permanent address when this random address functionality is used. This capability allows use of randomized transmit address for PASN authentication frames to improve privacy of WLAN clients. Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com> Link: https://lore.kernel.org/r/20230112012415.167556-2-quic_vjakkam@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add LDPC related flags in ieee80211_bss_confRyder Lee
This is utilized to pass LDPC configurations from user space (i.e. hostapd) to driver. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Link: https://lore.kernel.org/r/1de696aaa34efd77a926eb657b8c0fda05aaa177.1676628065.git.ryder.lee@mediatek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add EHT MU-MIMO related flags in ieee80211_bss_confRyder Lee
Similar to VHT/HE. This is utilized to pass MU-MIMO configurations from user space (i.e. hostapd) to driver. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Link: https://lore.kernel.org/r/8d9966c4c1e77cb1ade77d42bdc49905609192e9.1676628065.git.ryder.lee@mediatek.com [move into combined if statement, reset on !eht] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: introduce ieee80211_refresh_tx_agg_session_timer()Ryder Lee
This allows low level drivers to refresh the tx agg session timer, based on querying stats from the firmware usually. Especially for some mt76 devices support .net_fill_forward_path would bypass mac80211, which leads to tx BA session timeout clients that set a timeout in their AddBA response to our request, even if our request is without a timeout. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Link: https://lore.kernel.org/r/7c3f72eac1c34921cd84a462e60d71e125862152.1676616450.git.ryder.lee@mediatek.com [slightly clarify commit message, add note about RCU] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add support for driver adding radiotap TLVsMordechay Goodstein
The new TLV format enables adding TLVs after the fixed fields in radiotap, as part of the radiotap header. Support this and move vendor data to the TLV format, allowing a reuse of the RX_FLAG_RADIOTAP_VENDOR_DATA as the new RX_FLAG_RADIOTAP_TLV_AT_END flag. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.b18fd5da8477.I576400ec40a7b35ef97a3b09a99b3a49e9174786@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07netfilter: conntrack: adopt safer max chain lengthEric Dumazet
Customers using GKE 1.25 and 1.26 are facing conntrack issues root caused to commit c9c3b6811f74 ("netfilter: conntrack: make max chain length random"). Even if we assume Uniform Hashing, a bucket often reachs 8 chained items while the load factor of the hash table is smaller than 0.5 With a limit of 16, we reach load factors of 3. With a limit of 32, we reach load factors of 11. With a limit of 40, we reach load factors of 15. With a limit of 50, we reach load factors of 24. This patch changes MIN_CHAINLEN to 50, to minimize risks. Ideally, we could in the future add a cushion based on expected load factor (2 * nf_conntrack_max / nf_conntrack_buckets), because some setups might expect unusual values. Fixes: c9c3b6811f74 ("netfilter: conntrack: make max chain length random") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-03-07wifi: mac80211: fix ieee80211_link_set_associated() typeJohannes Berg
The return type here should be u64 for the flags, even if it doesn't matter right now because it doesn't return any flags that don't fit into u32. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.d67ccae57d60.Ia4768e547ba8b1deb2b84ce3bbfbe216d5bfff6a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: simplify reasoning about EHT capa handlingJohannes Berg
Given the code in cfg80211, EHT capa cannot be non-NULL when HE capa is NULL, but it's easier to reason about it if both are checked and the compiler will likely integrate the check with the previous one for HE capa anyway. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.7413d50d23bc.I6fef7484721be9bd5364f64921fc5e9168495f62@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: mlme: remove pointless sta checkJohannes Berg
We already exited the function if sta ended up NULL, so just remove the extra check. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.4cbac9cfd03a.I21ec81c96d246afdabc2b0807d3856e6b1182cb7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add netdev per-link debugfs data and driver hookBenjamin Berg
This adds the infrastructure to have netdev specific per-link data both for mac80211 and the driver in debugfs. For the driver, a new callback is added which is only used if MLO is supported. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.fb4c947e4df8.I69b3516ddf4c8a7501b395f652d6063444ecad63@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: remove SMPS from AP debugfsBenjamin Berg
The spatial multiplexing power save feature does not apply to AP mode. Remove it from debugfs in this case. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.01b167027dd5.Iee69f2e4df98581f259ef2c76309b940b20174be@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add pointer from bss_conf to vifBenjamin Berg
While often not needed, this considerably simplifies going from a link specific bss_config to the vif. This helps with e.g. creating link specific debugfs entries inside drivers. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.46f701a10ed5.I20390b2a8165ff222d66585915689206ea93222b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: warn only once on AP probeJohannes Berg
We should perhaps support this API for MLO, but it's not clear that it makes sense, in any case then we'd have to update it to probe the correct BSS. For now, if it happens, warn only once so that we don't get flooded with messages if the driver misbehaves and calls this. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.1c8499b6fbe6.I1a76a2be3b42ff93904870ac069f0319507adc23@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: cfg80211/mac80211: report link ID on control port RXJohannes Berg
For control port RX, report the link ID for MLO. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.fe06dfc3791b.Iddcab94789cafe336417be406072ce8a6312fc2d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add support for set_hw_timestamp commandAvraham Stern
Support the set_hw_timestamp callback for enabling and disabling HW timestamping if the low level driver supports it. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.700ded7badde.Ib2f7c228256ce313a04d3d9f9ecc6c7b9aa602bb@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: nl80211: add a command to enable/disable HW timestampingAvraham Stern
Add a command to enable and disable HW timestamping of TM and FTM frames. HW timestamping can be enabled for a specific mac address or for all addresses. The low level driver will indicate how many peers HW timestamping can be enabled concurrently, and this information will be passed to userspace. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.05678d7b1c17.Iccc08869ea8156f1c71a3111a47f86dd56234bd0@changeid [switch to needing netdev UP, minor edits] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: wireless: cleanup unused function parametersMordechay Goodstein
In the past ftype was used for deciding about 6G DUP beacon, but the logic was removed and ftype is not needed anymore. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.98d4761b809b.I255f5ecd77cb24fcf2f1641bb5833ea2d121296e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: wireless: correct primary channel validation on 6 GHzMordechay Goodstein
The check that beacon primary channel is in the range of 80 MHz (abs < 80) is invalid for 320 MHz since duplicate beacon transmit means that the AP transmits it on all the 20 MHz sub-channels: 9.4.2.249 HE Operation element - ... AP transmits Beacon frames in non-HT duplicate PPDU with a TXVECTOR parameter CH_BANDWIDTH value that is up to the BSS bandwidth. So in case of 320 MHz the DUP beacon can be in upper 160 for primary channel in the lower 160 giving possibly an absolute range of over 80 MHz. Also this check is redundant alltogether, if AP has a wrong primary channel in the beacon it's a faulty AP, and we would fail in next steps to connect. While at it, fix the frequency comparison to no longer compare between KHz and MHz, which was introduced by commit 7f599aeccbd2 ("cfg80211: Use the HE operation IE to determine a 6GHz BSS channel"). Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.314faf725255.I5e27251ac558297553b590d3917a7b6d1aae0e74@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: wireless: return primary channel regardless of DUPMordechay Goodstein
Currently in case DUP bit is not set we don't return the primary channel for 6 GHz Band, but the spec says that the DUP bit has no effect on this field: 9.4.2.249 HE Operation element: The Duplicate Beacon subfield is set to 1 if the AP transmits Beacon frames in non-HT duplicate PPDU with a TXVECTOR parameter CH_BANDWIDTH value that is up to the BSS bandwidth and is set to 0 otherwise. So remove the condition for returning primary channel based on DUP. Since the caller code already marks the signal as invalid in case the indicated frequency is not the tuned frequency, there's no need to additionally handle this case here since that's already true for duplicated beacons on the non-primary channel(s). Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.66d7f05f7d11.I5e0add054f72ede95611391b99804c61c40cc959@changeid [clarify commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: allow beacon protection HW offloadJohannes Berg
In case of beacon protection, check if the key was offloaded to the hardware and in that case set control.hw_key so that the encryption function will see it and only do the needed steps that aren't done in hardware. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.b2becd9a22fb.I6c0b9c50c6a481128ba912a11cb7afc92c4b6da7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: check key taint for beacon protectionJohannes Berg
This will likely never happen, but for completeness check the key taint flag before using a key for beacon protection. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.cf2c3fee6f1f.I2f19b3e04e31c99bed3c9dc71935bf513b2cd177@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: clear all bits that relate rtap fields on skbMordechay Goodstein
Since we remove radiotap from skb data, clear all RX_FLAG_X related info that indicate info on the skb data. Also we need to do it only once so remove the clear from cooked_monitor. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.74d3efe19eae.Ie17a35864d2e120f9858516a2e3d3047d83cf805@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: adjust scan cancel comment/checkJohannes Berg
Instead of the comment about holding RTNL, which is now wrong, add a proper lockdep assertion for the wiphy mutex. Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.84352e46f342.Id90fef8c581cebe19cb30274340cf43885d55c74@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-06Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-03-06 We've added 85 non-merge commits during the last 13 day(s) which contain a total of 131 files changed, 7102 insertions(+), 1792 deletions(-). The main changes are: 1) Add skb and XDP typed dynptrs which allow BPF programs for more ergonomic and less brittle iteration through data and variable-sized accesses, from Joanne Koong. 2) Bigger batch of BPF verifier improvements to prepare for upcoming BPF open-coded iterators allowing for less restrictive looping capabilities, from Andrii Nakryiko. 3) Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF programs to NULL-check before passing such pointers into kfunc, from Alexei Starovoitov. 4) Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in local storage maps, from Kumar Kartikeya Dwivedi. 5) Add BPF verifier support for ST instructions in convert_ctx_access() which will help new -mcpu=v4 clang flag to start emitting them, from Eduard Zingerman. 6) Make uprobe attachment Android APK aware by supporting attachment to functions inside ELF objects contained in APKs via function names, from Daniel Müller. 7) Add a new flag BPF_F_TIMER_ABS flag for bpf_timer_start() helper to start the timer with absolute expiration value instead of relative one, from Tero Kristo. 8) Add a new kfunc bpf_cgroup_from_id() to look up cgroups via id, from Tejun Heo. 9) Extend libbpf to support users manually attaching kprobes/uprobes in the legacy/perf/link mode, from Menglong Dong. 10) Implement workarounds in the mips BPF JIT for DADDI/R4000, from Jiaxun Yang. 11) Enable mixing bpf2bpf and tailcalls for the loongarch BPF JIT, from Hengqi Chen. 12) Extend BPF instruction set doc with describing the encoding of BPF instructions in terms of how bytes are stored under big/little endian, from Jose E. Marchesi. 13) Follow-up to enable kfunc support for riscv BPF JIT, from Pu Lehui. 14) Fix bpf_xdp_query() backwards compatibility on old kernels, from Yonghong Song. 15) Fix BPF selftest cross compilation with CLANG_CROSS_FLAGS, from Florent Revest. 16) Improve bpf_cpumask_ma to only allocate one bpf_mem_cache, from Hou Tao. 17) Fix BPF verifier's check_subprogs to not unnecessarily mark a subprogram with has_tail_call, from Ilya Leoshkevich. 18) Fix arm syscall regs spec in libbpf's bpf_tracing.h, from Puranjay Mohan. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits) selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode selftests/bpf: Split test_attach_probe into multi subtests libbpf: Add support to set kprobe/uprobe attach mode tools/resolve_btfids: Add /libsubcmd to .gitignore bpf: add support for fixed-size memory pointer returns for kfuncs bpf: generalize dynptr_get_spi to be usable for iters bpf: mark PTR_TO_MEM as non-null register type bpf: move kfunc_call_arg_meta higher in the file bpf: ensure that r0 is marked scratched after any function call bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper bpf: clean up visit_insn()'s instruction processing selftests/bpf: adjust log_fixup's buffer size for proper truncation bpf: honor env->test_state_freq flag in is_state_visited() selftests/bpf: enhance align selftest's expected log matching bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER} bpf: improve stack slot state printing selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access() selftests/bpf: test if pointer type is tracked for BPF_ST_MEM bpf: allow ctx writes using BPF_ST_MEM instruction bpf: Use separate RCU callbacks for freeing selem ... ==================== Link: https://lore.kernel.org/r/20230307004346.27578-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-06Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-03-06 We've added 8 non-merge commits during the last 7 day(s) which contain a total of 9 files changed, 64 insertions(+), 18 deletions(-). The main changes are: 1) Fix BTF resolver for DATASEC sections when a VAR points at a modifier, that is, keep resolving such instances instead of bailing out, from Lorenz Bauer. 2) Fix BPF test framework with regards to xdp_frame info misplacement in the "live packet" code, from Alexander Lobakin. 3) Fix an infinite loop in BPF sockmap code for TCP/UDP/AF_UNIX, from Liu Jian. 4) Fix a build error for riscv BPF JIT under PERF_EVENTS=n, from Randy Dunlap. 5) Several BPF doc fixes with either broken links or external instead of internal doc links, from Bagas Sanjaya. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: check that modifier resolves after pointer btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES bpf, doc: Link to submitting-patches.rst for general patch submission info bpf, doc: Do not link to docs.kernel.org for kselftest link bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser() riscv, bpf: Fix patch_text implicit declaration bpf, docs: Fix link to BTF doc ==================== Link: https://lore.kernel.org/r/20230306215944.11981-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-06net: tls: fix device-offloaded sendpage straddling recordsJakub Kicinski
Adrien reports that incorrect data is transmitted when a single page straddles multiple records. We would transmit the same data in all iterations of the loop. Reported-by: Adrien Moulin <amoulin@corp.free.fr> Link: https://lore.kernel.org/all/61481278.42813558.1677845235112.JavaMail.zimbra@corp.free.fr Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()") Tested-by: Adrien Moulin <amoulin@corp.free.fr> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Link: https://lore.kernel.org/r/20230304192610.3818098-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-06bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMESAlexander Lobakin
&xdp_buff and &xdp_frame are bound in a way that xdp_buff->data_hard_start == xdp_frame It's always the case and e.g. xdp_convert_buff_to_frame() relies on this. IOW, the following: for (u32 i = 0; i < 0xdead; i++) { xdpf = xdp_convert_buff_to_frame(&xdp); xdp_convert_frame_to_buff(xdpf, &xdp); } shouldn't ever modify @xdpf's contents or the pointer itself. However, "live packet" code wrongly treats &xdp_frame as part of its context placed *before* the data_hard_start. With such flow, data_hard_start is sizeof(*xdpf) off to the right and no longer points to the XDP frame. Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several places and praying that there are no more miscalcs left somewhere in the code, unionize ::frm with ::data in a flex array, so that both starts pointing to the actual data_hard_start and the XDP frame actually starts being a part of it, i.e. a part of the headroom, not the context. A nice side effect is that the maximum frame size for this mode gets increased by 40 bytes, as xdp_buff::frame_sz includes everything from data_hard_start (-> includes xdpf already) to the end of XDP/skb shared info. Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it hardcoded for 64 bit && 4k pages, it can be made more flexible later on. Minor: align `&head->data` with how `head->frm` is assigned for consistency. Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for clarity. (was found while testing XDP traffic generator on ice, which calls xdp_convert_frame_to_buff() for each XDP frame) Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN") Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20230224163607.2994755-1-aleksander.lobakin@intel.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-06netfilter: tproxy: fix deadlock due to missing BH disableFlorian Westphal
The xtables packet traverser performs an unconditional local_bh_disable(), but the nf_tables evaluation loop does not. Functions that are called from either xtables or nftables must assume that they can be called in process context. inet_twsk_deschedule_put() assumes that no softirq interrupt can occur. If tproxy is used from nf_tables its possible that we'll deadlock trying to aquire a lock already held in process context. Add a small helper that takes care of this and use it. Link: https://lore.kernel.org/netfilter-devel/401bd6ed-314a-a196-1cdc-e13c720cc8f2@balasys.hu/ Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support") Reported-and-tested-by: Major Dávid <major.david@balasys.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-03-06netfilter: ctnetlink: revert to dumping mark regardless of event typeIvan Delalande
It seems that change was unintentional, we have userspace code that needs the mark while listening for events like REPLY, DESTROY, etc. Also include 0-marks in requested dumps, as they were before that fix. Fixes: 1feeae071507 ("netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark") Signed-off-by: Ivan Delalande <colona@arista.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-03-03bpf: allow ctx writes using BPF_ST_MEM instructionEduard Zingerman
Lift verifier restriction to use BPF_ST_MEM instructions to write to context data structures. This requires the following changes: - verifier.c:do_check() for BPF_ST updated to: - no longer forbid writes to registers of type PTR_TO_CTX; - track dst_reg type in the env->insn_aux_data[...].ptr_type field (same way it is done for BPF_STX and BPF_LDX instructions). - verifier.c:convert_ctx_access() and various callbacks invoked by it are updated to handled BPF_ST instruction alongside BPF_STX. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230304011247.566040-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-03bpf: Introduce kptr_rcu.Alexei Starovoitov
The life time of certain kernel structures like 'struct cgroup' is protected by RCU. Hence it's safe to dereference them directly from __kptr tagged pointers in bpf maps. The resulting pointer is MEM_RCU and can be passed to kfuncs that expect KF_RCU. Derefrence of other kptr-s returns PTR_UNTRUSTED. For example: struct map_value { struct cgroup __kptr *cgrp; }; SEC("tp_btf/cgroup_mkdir") int BPF_PROG(test_cgrp_get_ancestors, struct cgroup *cgrp_arg, const char *path) { struct cgroup *cg, *cg2; cg = bpf_cgroup_acquire(cgrp_arg); // cg is PTR_TRUSTED and ref_obj_id > 0 bpf_kptr_xchg(&v->cgrp, cg); cg2 = v->cgrp; // This is new feature introduced by this patch. // cg2 is PTR_MAYBE_NULL | MEM_RCU. // When cg2 != NULL, it's a valid cgroup, but its percpu_ref could be zero if (cg2) bpf_cgroup_ancestor(cg2, level); // safe to do. } Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230303041446.3630-4-alexei.starovoitov@gmail.com
2023-03-03bpf, sockmap: Fix an infinite loop error when len is 0 in ↵Liu Jian
tcp_bpf_recvmsg_parser() When the buffer length of the recvmsg system call is 0, we got the flollowing soft lockup problem: watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149] CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:remove_wait_queue+0xb/0xc0 Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20 RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768 RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040 RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7 R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800 R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0 FS: 00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_msg_wait_data+0x279/0x2f0 tcp_bpf_recvmsg_parser+0x3c6/0x490 inet_recvmsg+0x280/0x290 sock_recvmsg+0xfc/0x120 ____sys_recvmsg+0x160/0x3d0 ___sys_recvmsg+0xf0/0x180 __sys_recvmsg+0xea/0x1a0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc The logic in tcp_bpf_recvmsg_parser is as follows: msg_bytes_ready: copied = sk_msg_recvmsg(sk, psock, msg, len, flags); if (!copied) { wait data; goto msg_bytes_ready; } In this case, "copied" always is 0, the infinite loop occurs. According to the Linux system call man page, 0 should be returned in this case. Therefore, in tcp_bpf_recvmsg_parser(), if the length is 0, directly return. Also modify several other functions with the same problem. Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap") Fixes: 9825d866ce0d ("af_unix: Implement unix_dgram_bpf_recvmsg()") Fixes: c5d2177a72a1 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self") Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com
2023-03-02bpf: Make bpf_get_current_[ancestor_]cgroup_id() available for all program typesTejun Heo
These helpers are safe to call from any context and there's no reason to restrict access to them. Remove them from bpf_trace and filter lists and add to bpf_base_func_proto() under perfmon_capable(). v2: After consulting with Andrii, relocated in bpf_base_func_proto() so that they require bpf_capable() but not perfomon_capable() as it doesn't read from or affect others on the system. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/ZAD8QyoszMZiTzBY@slm.duckdns.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-02Merge tag 'ieee802154-for-net-2023-03-02' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan Stefan Schmidt says: ==================== ieee802154 for net 2023-03-02 Two small fixes this time. Alexander Aring fixed a potential negative array access in the ca8210 driver. Miquel Raynal fixed a crash that could have been triggered through the extended netlink API for 802154. This only came in this merge window. Found by syzkaller. * tag 'ieee802154-for-net-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan: ieee802154: Prevent user from crashing the host ca8210: fix mac_len negative array access ==================== Link: https://lore.kernel.org/r/20230302153032.1312755-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-02net: caif: Fix use-after-free in cfusbl_device_notify()Shigeru Yoshida
syzbot reported use-after-free in cfusbl_device_notify() [1]. This causes a stack trace like below: BUG: KASAN: use-after-free in cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138 Read of size 8 at addr ffff88807ac4e6f0 by task kworker/u4:6/1214 CPU: 0 PID: 1214 Comm: kworker/u4:6 Not tainted 5.19.0-rc3-syzkaller-00146-g92f20ff72066 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0xeb/0x467 mm/kasan/report.c:313 print_report mm/kasan/report.c:429 [inline] kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491 cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138 notifier_call_chain+0xb5/0x200 kernel/notifier.c:87 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1945 call_netdevice_notifiers_extack net/core/dev.c:1983 [inline] call_netdevice_notifiers net/core/dev.c:1997 [inline] netdev_wait_allrefs_any net/core/dev.c:10227 [inline] netdev_run_todo+0xbc0/0x10f0 net/core/dev.c:10341 default_device_exit_batch+0x44e/0x590 net/core/dev.c:11334 ops_exit_list+0x125/0x170 net/core/net_namespace.c:167 cleanup_net+0x4ea/0xb00 net/core/net_namespace.c:594 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302 </TASK> When unregistering a net device, unregister_netdevice_many_notify() sets the device's reg_state to NETREG_UNREGISTERING, calls notifiers with NETDEV_UNREGISTER, and adds the device to the todo list. Later on, devices in the todo list are processed by netdev_run_todo(). netdev_run_todo() waits devices' reference count become 1 while rebdoadcasting NETDEV_UNREGISTER notification. When cfusbl_device_notify() is called with NETDEV_UNREGISTER multiple times, the parent device might be freed. This could cause UAF. Processing NETDEV_UNREGISTER multiple times also causes inbalance of reference count for the module. This patch fixes the issue by accepting only first NETDEV_UNREGISTER notification. Fixes: 7ad65bf68d70 ("caif: Add support for CAIF over CDC NCM USB interface") CC: sjur.brandeland@stericsson.com <sjur.brandeland@stericsson.com> Reported-by: syzbot+b563d33852b893653a9e@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=c3bfd8e2450adab3bffe4d80821fbbced600407f [1] Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Link: https://lore.kernel.org/r/20230301163913.391304-1-syoshida@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-02ieee802154: Prevent user from crashing the hostMiquel Raynal
Avoid crashing the machine by checking info->attrs[NL802154_ATTR_SCAN_TYPE] presence before de-referencing it, which was the primary intend of the blamed patch. Reported-by: Sanan Hasanov <sanan.hasanov@Knights.ucf.edu> Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: a0b6106672b5 ("ieee802154: Convert scan error messages to extack") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20230301154450.547716-1-miquel.raynal@bootlin.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2023-03-02net: use indirect calls helpers for sk_exit_memory_pressure()Brian Vazquez
Florian reported a regression and sent a patch with the following changelog: <quote> There is a noticeable tcp performance regression (loopback or cross-netns), seen with iperf3 -Z (sendfile mode) when generic retpolines are needed. With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave memory pressure happen much more often. For TCP indirect calls are used. We can't remove the if-set-return short-circuit check in tcp_enter_memory_pressure because there are callers other than sk_enter_memory_pressure. Doing a check in the sk wrapper too reduces the indirect calls enough to recover some performance. Before, 0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver After: 0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver "iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns. </quote> It seems we forgot to upstream this indirect call mitigation we had for years, lets do this instead. [edumazet] - It seems we forgot to upstream this indirect call mitigation we had for years, let's do this instead. - Changed to INDIRECT_CALL_INET_1() to avoid bots reports. Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible") Reported-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/netdev/20230227152741.4a53634b@kernel.org/T/ Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230301133247.2346111-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfPaolo Abeni
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix bogus error report in selftests/netfilter/nft_nat.sh, from Hangbin Liu. 2) Initialize last and quota expressions from template when expr_ops::clone is called, otherwise, states are not restored accordingly when loading a dynamic set with elements using these two expressions. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_quota: copy content when cloning expression netfilter: nft_last: copy content when cloning expression selftests: nft_nat: ensuring the listening side is up before starting the client ==================== Link: https://lore.kernel.org/r/20230301222021.154670-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-01net: tls: avoid hanging tasks on the tx_lockJakub Kicinski
syzbot sent a hung task report and Eric explains that adversarial receiver may keep RWIN at 0 for a long time, so we are not guaranteed to make forward progress. Thread which took tx_lock and went to sleep may not release tx_lock for hours. Use interruptible sleep where possible and reschedule the work if it can't take the lock. Testing: existing selftest passes Reported-by: syzbot+9c0268252b8ef967c62e@syzkaller.appspotmail.com Fixes: 79ffe6087e91 ("net/tls: add a TX lock") Link: https://lore.kernel.org/all/000000000000e412e905f5b46201@google.com/ Cc: stable@vger.kernel.org # wait 4 weeks Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230301002857.2101894-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-01net: tls: fix possible race condition between do_tls_getsockopt_conf() and ↵Hangyu Hua
do_tls_setsockopt_conf() ctx->crypto_send.info is not protected by lock_sock in do_tls_getsockopt_conf(). A race condition between do_tls_getsockopt_conf() and error paths of do_tls_setsockopt_conf() may lead to a use-after-free or null-deref. More discussion: https://lore.kernel.org/all/Y/ht6gQL+u6fj3dG@hog/ Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Link: https://lore.kernel.org/r/20230228023344.9623-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-01Merge tag 'nfsd-6.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Make new GSS Kerberos Kunit tests work on non-x86 platforms * tag 'nfsd-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Properly terminate test case arrays SUNRPC: Let Kunit tests run with some enctypes compiled out
2023-03-01bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwrJoanne Koong
Two new kfuncs are added, bpf_dynptr_slice and bpf_dynptr_slice_rdwr. The user must pass in a buffer to store the contents of the data slice if a direct pointer to the data cannot be obtained. For skb and xdp type dynptrs, these two APIs are the only way to obtain a data slice. However, for other types of dynptrs, there is no difference between bpf_dynptr_slice(_rdwr) and bpf_dynptr_data. For skb type dynptrs, the data is copied into the user provided buffer if any of the data is not in the linear portion of the skb. For xdp type dynptrs, the data is copied into the user provided buffer if the data is between xdp frags. If the skb is cloned and a call to bpf_dynptr_data_rdwr is made, then the skb will be uncloned (see bpf_unclone_prologue()). Please note that any bpf_dynptr_write() automatically invalidates any prior data slices of the skb dynptr. This is because the skb may be cloned or may need to pull its paged buffer into the head. As such, any bpf_dynptr_write() will automatically have its prior data slices invalidated, even if the write is to data in the skb head of an uncloned skb. Please note as well that any other helper calls that change the underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data slices of the skb dynptr as well, for the same reasons. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20230301154953.641654-10-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-01bpf: Add xdp dynptrsJoanne Koong
Add xdp dynptrs, which are dynptrs whose underlying pointer points to a xdp_buff. The dynptr acts on xdp data. xdp dynptrs have two main benefits. One is that they allow operations on sizes that are not statically known at compile-time (eg variable-sized accesses). Another is that parsing the packet data through dynptrs (instead of through direct access of xdp->data and xdp->data_end) can be more ergonomic and less brittle (eg does not need manual if checking for being within bounds of data_end). For reads and writes on the dynptr, this includes reading/writing from/to and across fragments. Data slices through the bpf_dynptr_data API are not supported; instead bpf_dynptr_slice() and bpf_dynptr_slice_rdwr() should be used. For examples of how xdp dynptrs can be used, please see the attached selftests. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20230301154953.641654-9-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-01bpf: Add skb dynptrsJoanne Koong
Add skb dynptrs, which are dynptrs whose underlying pointer points to a skb. The dynptr acts on skb data. skb dynptrs have two main benefits. One is that they allow operations on sizes that are not statically known at compile-time (eg variable-sized accesses). Another is that parsing the packet data through dynptrs (instead of through direct access of skb->data and skb->data_end) can be more ergonomic and less brittle (eg does not need manual if checking for being within bounds of data_end). For bpf prog types that don't support writes on skb data, the dynptr is read-only (bpf_dynptr_write() will return an error) For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write() interfaces, reading and writing from/to data in the head as well as from/to non-linear paged buffers is supported. Data slices through the bpf_dynptr_data API are not supported; instead bpf_dynptr_slice() and bpf_dynptr_slice_rdwr() (added in subsequent commit) should be used. For examples of how skb dynptrs can be used, please see the attached selftests. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20230301154953.641654-8-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-01Merge tag '9p-6.3-for-linus-part1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: - some fixes and cleanup setting up for a larger set of performance patches I've been working on - a contributed fixes relating to 9p/rdma - some contributed fixes relating to 9p/xen * tag '9p-6.3-for-linus-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: fix error reporting in v9fs_dir_release net/9p: fix bug in client create for .L 9p/rdma: unmap receive dma buffer in rdma_request()/post_recv() 9p/xen: fix connection sequence 9p/xen: fix version parsing fs/9p: Expand setup of writeback cache to all levels net/9p: Adjust maximum MSIZE to account for p9 header
2023-03-01netfilter: nft_quota: copy content when cloning expressionPablo Neira Ayuso
If the ruleset contains consumed quota, restore them accordingly. Otherwise, listing after restoration shows never used items. Restore the user-defined quota and flags too. Fixes: ed0a0c60f0e5 ("netfilter: nft_quota: move stateful fields out of expression data") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-03-01netfilter: nft_last: copy content when cloning expressionPablo Neira Ayuso
If the ruleset contains last timestamps, restore them accordingly. Otherwise, listing after restoration shows never used items. Fixes: 33a24de37e81 ("netfilter: nft_last: move stateful fields out of expression data") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-03-01net/sched: flower: fix fl_change() error recovery pathEric Dumazet
The two "goto errout;" paths in fl_change() became wrong after cited commit. Indeed we only must not call __fl_put() until the net pointer has been set in tcf_exts_init_ex() This is a minimal fix. We might in the future validate TCA_FLOWER_FLAGS before we allocate @fnew. BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:72 [inline] BUG: KASAN: null-ptr-deref in atomic_read include/linux/atomic/atomic-instrumented.h:27 [inline] BUG: KASAN: null-ptr-deref in refcount_read include/linux/refcount.h:147 [inline] BUG: KASAN: null-ptr-deref in __refcount_add_not_zero include/linux/refcount.h:152 [inline] BUG: KASAN: null-ptr-deref in __refcount_inc_not_zero include/linux/refcount.h:227 [inline] BUG: KASAN: null-ptr-deref in refcount_inc_not_zero include/linux/refcount.h:245 [inline] BUG: KASAN: null-ptr-deref in maybe_get_net include/net/net_namespace.h:269 [inline] BUG: KASAN: null-ptr-deref in tcf_exts_get_net include/net/pkt_cls.h:260 [inline] BUG: KASAN: null-ptr-deref in __fl_put net/sched/cls_flower.c:513 [inline] BUG: KASAN: null-ptr-deref in __fl_put+0x13e/0x3b0 net/sched/cls_flower.c:508 Read of size 4 at addr 000000000000014c by task syz-executor548/5082 CPU: 0 PID: 5082 Comm: syz-executor548 Not tainted 6.2.0-syzkaller-05251-g5b7c4cabbb65 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_report mm/kasan/report.c:420 [inline] kasan_report+0xec/0x130 mm/kasan/report.c:517 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0x141/0x190 mm/kasan/generic.c:189 instrument_atomic_read include/linux/instrumented.h:72 [inline] atomic_read include/linux/atomic/atomic-instrumented.h:27 [inline] refcount_read include/linux/refcount.h:147 [inline] __refcount_add_not_zero include/linux/refcount.h:152 [inline] __refcount_inc_not_zero include/linux/refcount.h:227 [inline] refcount_inc_not_zero include/linux/refcount.h:245 [inline] maybe_get_net include/net/net_namespace.h:269 [inline] tcf_exts_get_net include/net/pkt_cls.h:260 [inline] __fl_put net/sched/cls_flower.c:513 [inline] __fl_put+0x13e/0x3b0 net/sched/cls_flower.c:508 fl_change+0x101b/0x4ab0 net/sched/cls_flower.c:2341 tc_new_tfilter+0x97c/0x2290 net/sched/cls_api.c:2310 rtnetlink_rcv_msg+0x996/0xd50 net/core/rtnetlink.c:6165 netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2574 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x925/0xe30 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:722 [inline] sock_sendmsg+0xde/0x190 net/socket.c:745 ____sys_sendmsg+0x334/0x900 net/socket.c:2504 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2558 __sys_sendmmsg+0x18f/0x460 net/socket.c:2644 __do_sys_sendmmsg net/socket.c:2673 [inline] __se_sys_sendmmsg net/socket.c:2670 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2670 Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier") Reported-by: syzbot+baabf3efa7c1e57d28b2@syzkaller.appspotmail.com Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paul Blakey <paulb@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-01ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping()Eric Dumazet
ila_xlat_nl_cmd_get_mapping() generates an empty skb, triggerring a recent sanity check [1]. Instead, return an error code, so that user space can get it. [1] skb_assert_len WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 skb_assert_len include/linux/skbuff.h:2527 [inline] WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 Modules linked in: CPU: 0 PID: 5923 Comm: syz-executor269 Not tainted 6.2.0-syzkaller-18300-g2ebd1fbb946d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_assert_len include/linux/skbuff.h:2527 [inline] pc : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 lr : skb_assert_len include/linux/skbuff.h:2527 [inline] lr : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 sp : ffff80001e0d6c40 x29: ffff80001e0d6e60 x28: dfff800000000000 x27: ffff0000c86328c0 x26: dfff800000000000 x25: ffff0000c8632990 x24: ffff0000c8632a00 x23: 0000000000000000 x22: 1fffe000190c6542 x21: ffff0000c8632a10 x20: ffff0000c8632a00 x19: ffff80001856e000 x18: ffff80001e0d5fc0 x17: 0000000000000000 x16: ffff80001235d16c x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000001 x11: ff80800008353a30 x10: 0000000000000000 x9 : 21567eaf25bfb600 x8 : 21567eaf25bfb600 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff80001e0d6558 x4 : ffff800015c74760 x3 : ffff800008596744 x2 : 0000000000000001 x1 : 0000000100000000 x0 : 000000000000000e Call trace: skb_assert_len include/linux/skbuff.h:2527 [inline] __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 dev_queue_xmit include/linux/netdevice.h:3033 [inline] __netlink_deliver_tap_skb net/netlink/af_netlink.c:307 [inline] __netlink_deliver_tap+0x45c/0x6f8 net/netlink/af_netlink.c:325 netlink_deliver_tap+0xf4/0x174 net/netlink/af_netlink.c:338 __netlink_sendskb net/netlink/af_netlink.c:1283 [inline] netlink_sendskb+0x6c/0x154 net/netlink/af_netlink.c:1292 netlink_unicast+0x334/0x8d4 net/netlink/af_netlink.c:1380 nlmsg_unicast include/net/netlink.h:1099 [inline] genlmsg_unicast include/net/genetlink.h:433 [inline] genlmsg_reply include/net/genetlink.h:443 [inline] ila_xlat_nl_cmd_get_mapping+0x620/0x7d0 net/ipv6/ila/ila_xlat.c:493 genl_family_rcv_msg_doit net/netlink/genetlink.c:968 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline] genl_rcv_msg+0x938/0xc1c net/netlink/genetlink.c:1065 netlink_rcv_skb+0x214/0x3c4 net/netlink/af_netlink.c:2574 genl_rcv+0x38/0x50 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x660/0x8d4 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x800/0xae0 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x558/0x844 net/socket.c:2479 ___sys_sendmsg net/socket.c:2533 [inline] __sys_sendmsg+0x26c/0x33c net/socket.c:2562 __do_sys_sendmsg net/socket.c:2571 [inline] __se_sys_sendmsg net/socket.c:2569 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2569 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193 el0_svc+0x58/0x168 arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 irq event stamp: 136484 hardirqs last enabled at (136483): [<ffff800008350244>] __up_console_sem+0x60/0xb4 kernel/printk/printk.c:345 hardirqs last disabled at (136484): [<ffff800012358d60>] el1_dbg+0x24/0x80 arch/arm64/kernel/entry-common.c:405 softirqs last enabled at (136418): [<ffff800008020ea8>] softirq_handle_end kernel/softirq.c:414 [inline] softirqs last enabled at (136418): [<ffff800008020ea8>] __do_softirq+0xd4c/0xfa4 kernel/softirq.c:600 softirqs last disabled at (136371): [<ffff80000802b4a4>] ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:80 ---[ end trace 0000000000000000 ]--- skb len=0 headroom=0 headlen=0 tailroom=192 mac=(0,0) net=(0,-1) trans=-1 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) hash(0x0 sw=0 l4=0) proto=0x0010 pkttype=6 iif=0 dev name=nlmon0 feat=0x0000000000005861 Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>