diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-11-02 06:20:58 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-11-02 06:20:58 -0700 |
commit | fc02cb2b37fe2cbf1d3334b9f0f0eab9431766c4 (patch) | |
tree | 93b16bc48fdc3be4a1adccbf4c7de92a5e8440e1 /net/ipv4/tcp_output.c | |
parent | bfc484fe6abba4b89ec9330e0e68778e2a9856b2 (diff) | |
parent | 84882cf72cd774cf16fd338bdbf00f69ac9f9194 (diff) |
Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace to
work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction, by
exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP buffer
pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we can
add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements to
CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies"
* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
Revert "net: avoid double accounting for pure zerocopy skbs"
selftests: net: add arp_ndisc_evict_nocarrier
net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
net: arp: introduce arp_evict_nocarrier sysctl parameter
libbpf: Deprecate AF_XDP support
kbuild: Unify options for BTF generation for vmlinux and modules
selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
net: avoid double accounting for pure zerocopy skbs
tcp: rename sk_wmem_free_skb
netdevsim: fix uninit value in nsim_drv_configure_vfs()
selftests/bpf: Fix also no-alu32 strobemeta selftest
bpf: Add missing map_delete_elem method to bloom filter map
selftests/bpf: Add bloom map success test for userspace calls
bpf: Add alignment padding for "map_extra" + consolidate holes
bpf: Bloom filter map naming fixups
selftests/bpf: Add test cases for struct_ops prog
bpf: Add dummy BPF STRUCT_OPS for test purpose
...
Diffstat (limited to 'net/ipv4/tcp_output.c')
-rw-r--r-- | net/ipv4/tcp_output.c | 39 |
1 files changed, 10 insertions, 29 deletions
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 6d72f3ea48c4..6fbbf1558033 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -394,7 +394,6 @@ static void tcp_init_nondata_skb(struct sk_buff *skb, u32 seq, u8 flags) skb->ip_summed = CHECKSUM_PARTIAL; TCP_SKB_CB(skb)->tcp_flags = flags; - TCP_SKB_CB(skb)->sacked = 0; tcp_skb_pcount_set(skb, 1); @@ -1256,8 +1255,6 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache); skb->skb_mstamp_ns = tp->tcp_wstamp_ns; if (clone_it) { - TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq - - tp->snd_una; oskb = skb; tcp_skb_tsorted_save(oskb) { @@ -1566,7 +1563,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue, return -ENOMEM; /* Get a new skb... force flag on. */ - buff = sk_stream_alloc_skb(sk, nsize, gfp, true); + buff = tcp_stream_alloc_skb(sk, nsize, gfp, true); if (!buff) return -ENOMEM; /* We'll just try again later. */ skb_copy_decrypted(buff, skb); @@ -1592,8 +1589,6 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue, skb_split(skb, buff, len); - buff->ip_summed = CHECKSUM_PARTIAL; - buff->tstamp = skb->tstamp; tcp_fragment_tstamp(skb, buff); @@ -1678,7 +1673,6 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len) delta_truesize = __pskb_trim_head(skb, len); TCP_SKB_CB(skb)->seq += len; - skb->ip_summed = CHECKSUM_PARTIAL; if (delta_truesize) { skb->truesize -= delta_truesize; @@ -2123,7 +2117,7 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len, return tcp_fragment(sk, TCP_FRAG_IN_WRITE_QUEUE, skb, len, mss_now, gfp); - buff = sk_stream_alloc_skb(sk, 0, gfp, true); + buff = tcp_stream_alloc_skb(sk, 0, gfp, true); if (unlikely(!buff)) return -ENOMEM; skb_copy_decrypted(buff, skb); @@ -2144,12 +2138,8 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len, TCP_SKB_CB(skb)->tcp_flags = flags & ~(TCPHDR_FIN | TCPHDR_PSH); TCP_SKB_CB(buff)->tcp_flags = flags; - /* This packet was never sent out yet, so no SACK bits. */ - TCP_SKB_CB(buff)->sacked = 0; - tcp_skb_fragment_eor(skb, buff); - buff->ip_summed = CHECKSUM_PARTIAL; skb_split(skb, buff, len); tcp_fragment_tstamp(skb, buff); @@ -2390,7 +2380,7 @@ static int tcp_mtu_probe(struct sock *sk) return -1; /* We're allowed to probe. Build it now. */ - nskb = sk_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false); + nskb = tcp_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false); if (!nskb) return -1; sk_wmem_queued_add(sk, nskb->truesize); @@ -2403,9 +2393,6 @@ static int tcp_mtu_probe(struct sock *sk) TCP_SKB_CB(nskb)->seq = TCP_SKB_CB(skb)->seq; TCP_SKB_CB(nskb)->end_seq = TCP_SKB_CB(skb)->seq + probe_size; TCP_SKB_CB(nskb)->tcp_flags = TCPHDR_ACK; - TCP_SKB_CB(nskb)->sacked = 0; - nskb->csum = 0; - nskb->ip_summed = CHECKSUM_PARTIAL; tcp_insert_write_queue_before(nskb, skb, sk); tcp_highest_sack_replace(sk, skb, nskb); @@ -2425,7 +2412,7 @@ static int tcp_mtu_probe(struct sock *sk) TCP_SKB_CB(nskb)->eor = TCP_SKB_CB(skb)->eor; tcp_skb_collapse_tstamp(nskb, skb); tcp_unlink_write_queue(skb, sk); - sk_wmem_free_skb(sk, skb); + tcp_wmem_free_skb(sk, skb); } else { TCP_SKB_CB(nskb)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags & ~(TCPHDR_FIN|TCPHDR_PSH); @@ -2969,8 +2956,7 @@ u32 __tcp_select_window(struct sock *sk) icsk->icsk_ack.quick = 0; if (tcp_under_memory_pressure(sk)) - tp->rcv_ssthresh = min(tp->rcv_ssthresh, - 4U * tp->advmss); + tcp_adjust_rcv_ssthresh(sk); /* free_space might become our new window, make sure we don't * increase it due to wscale. @@ -3048,13 +3034,9 @@ static bool tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb) BUG_ON(tcp_skb_pcount(skb) != 1 || tcp_skb_pcount(next_skb) != 1); - if (next_skb_size) { - if (next_skb_size <= skb_availroom(skb)) - skb_copy_bits(next_skb, 0, skb_put(skb, next_skb_size), - next_skb_size); - else if (!tcp_skb_shift(skb, next_skb, 1, next_skb_size)) - return false; - } + if (next_skb_size && !tcp_skb_shift(skb, next_skb, 1, next_skb_size)) + return false; + tcp_highest_sack_replace(sk, next_skb, skb); /* Update sequence range on original skb. */ @@ -3757,10 +3739,9 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn) /* limit to order-0 allocations */ space = min_t(size_t, space, SKB_MAX_HEAD(MAX_TCP_HEADER)); - syn_data = sk_stream_alloc_skb(sk, space, sk->sk_allocation, false); + syn_data = tcp_stream_alloc_skb(sk, space, sk->sk_allocation, false); if (!syn_data) goto fallback; - syn_data->ip_summed = CHECKSUM_PARTIAL; memcpy(syn_data->cb, syn->cb, sizeof(syn->cb)); if (space) { int copied = copy_from_iter(skb_put(syn_data, space), space, @@ -3838,7 +3819,7 @@ int tcp_connect(struct sock *sk) return 0; } - buff = sk_stream_alloc_skb(sk, 0, sk->sk_allocation, true); + buff = tcp_stream_alloc_skb(sk, 0, sk->sk_allocation, true); if (unlikely(!buff)) return -ENOBUFS; |