linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-06-10	net: renesas: rswitch: Use napi_gro_receive() in RX	Yoshihiro Shimoda
	This hardware can receive multiple frames so that using napi_gro_receive() instead of netif_receive_skb() gets good performance of RX. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-10	Merge branch 'sfc-tc-encap-actions-offload'	Jakub Kicinski
	Edward Cree says: ==================== sfc: TC encap actions offload This series adds support for offloading TC tunnel_key set actions to the EF100 driver, supporting VxLAN and GENEVE tunnels over IPv4 or IPv6. ==================== Link: https://lore.kernel.org/r/cover.1686240142.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	sfc: generate encap headers for TC offload	Edward Cree
	Support constructing VxLAN and GENEVE headers, on either IPv4 or IPv6, using the neighbouring information obtained in encap->neigh to populate the Ethernet header. Note that the ef100 hardware does not insert UDP checksums when performing encap, so for IPv6 the remote endpoint will need to be configured with udp6zerocsumrx or equivalent. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	sfc: neighbour lookup for TC encap action offload	Edward Cree
	For each neighbour we're interested in, create a struct efx_neigh_binder object which has a list of all the encap_actions using it. When we receive a neighbouring update (through the netevent notifier), find the corresponding efx_neigh_binder and update all its users. Since the actual generation of encap headers is still only a stub, the resulting rules still get left on fallback actions. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	sfc: MAE functions to create/update/delete encap headers	Edward Cree
	Besides the raw header data, also pass the tunnel type, so that the hardware knows it needs to update the IP Total Length and UDP Length fields (and corresponding checksums) for each packet. Also, populate the ENCAP_HEADER_ID field in efx_mae_alloc_action_set() with the fw_id returned from efx_mae_allocate_encap_md(). Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	sfc: add function to atomically update a rule in the MAE	Edward Cree
	efx_mae_update_rule() changes the action-set-list attached to an MAE flow rule in the Action Rule Table. We will use this when neighbouring updates change encap actions. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	sfc: some plumbing towards TC encap action offload	Edward Cree
	Create software objects to manage the metadata for encap actions that can be attached to TC rules. However, since we don't yet have the neighbouring information (needed to generate the Ethernet header), all rules with encap actions are marked as "unready" and thus insert the fallback action into hardware rather than actually offloading the encapsulation action. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	sfc: add fallback action-set-lists for TC offload	Edward Cree
	When offloading a TC encap action, the action information for the hardware might not be "ready": if there's currently no neighbour entry available for the destination address, we can't construct the Ethernet header to prepend to the packet. In this case, we still offload the flow rule, but with its action-set-list ID pointing at a "fallback" action which simply delivers the packet to its default destination (as though no flow rule had matched), thus allowing software TC to handle it. Later, when we receive a neighbouring update that allows us to construct the encap header, the rule will become "ready" and we will update its action-set-list ID in hardware to point at the actual offloaded actions. This patch sets up these fallback ASLs, but does not yet use them. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	net: move gso declarations and functions to their own files	Eric Dumazet
	Move declarations into include/net/gso.h and code into net/core/gso.c Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stanislav Fomichev <sdf@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230608191738.3947077-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	Merge branch 'mptcp-unify-pm-interfaces'	Jakub Kicinski
	Matthieu Baerts says: ==================== mptcp: unify PM interfaces These patches from Geliang better isolate the two MPTCP path-managers by avoiding calling userspace PM functions from the in-kernel PM. Instead, new functions declared in pm.c directly dispatch to the right PM. In addition to have a clearer code, this also avoids a bit of duplicated checks. This is a refactoring, there is no behaviour change intended here. ==================== Link: https://lore.kernel.org/r/20230608-upstream-net-next-20230608-mptcp-unify-pm-interfaces-v1-0-b301717c9ff5@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	mptcp: unify pm set_flags interfaces	Geliang Tang
	This patch unifies the three PM set_flags() interfaces: mptcp_pm_nl_set_flags() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_set_flags() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_set_flags() in mptcp/pm.c based on whether token is NULL or not. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	mptcp: unify pm get_flags_and_ifindex_by_id	Geliang Tang
	This patch unifies the three PM get_flags_and_ifindex_by_id() interfaces: mptcp_pm_nl_get_flags_and_ifindex_by_id() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_get_flags_and_ifindex_by_id() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_get_flags_and_ifindex_by_id() in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	mptcp: unify pm get_local_id interfaces	Geliang Tang
	This patch unifies the three PM get_local_id() interfaces: mptcp_pm_nl_get_local_id() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_get_local_id() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_get_local_id() in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not. Also put together the declarations of these three functions in protocol.h. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10	mptcp: export local_address	Geliang Tang
	Rename local_address() with "mptcp_" prefix and export it in protocol.h. This function will be re-used in the common PM code (pm.c) in the following commit. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	Merge tag 'wireless-next-2023-06-09' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.5 The second pull request for v6.5. We have support for three new Realtek chipsets, all from different generations. Shows how active Realtek development is right now, even older generations are being worked on. Note: We merged wireless into wireless-next to avoid complex conflicts between the trees. Major changes: rtl8xxxu - RTL8192FU support rtw89 - RTL8851BE support rtw88 - RTL8723DS support ath11k - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode iwlwifi - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature cfg80211/mac80211 - more Multi-Link Operation (MLO) support such as hardware restart - fixes for a potential work/mutex deadlock and with it beginnings of the previously discussed locking simplifications * tag 'wireless-next-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (162 commits) wifi: rtlwifi: remove misused flag from HAL data wifi: rtlwifi: remove unused dualmac control leftovers wifi: rtlwifi: remove unused timer and related code wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown wifi: rsi: Do not configure WoWlan in shutdown hook if not enabled wifi: brcmfmac: Detect corner error case earlier with log wifi: rtw89: 8852c: update RF radio A/B parameters to R63 wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (3 of 3) wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (2 of 3) wifi: rtw89: 8852c: update TX power tables to R63 with 6 GHz power type (1 of 3) wifi: rtw89: process regulatory for 6 GHz power type wifi: rtw89: regd: update regulatory map to R64-R40 wifi: rtw89: regd: judge 6 GHz according to chip and BIOS wifi: rtw89: refine clearing supported bands to check 2/5 GHz first wifi: rtw89: 8851b: configure CRASH_TRIGGER feature for 8851B wifi: rtw89: set TX power without precondition during setting channel wifi: rtw89: debug: txpwr table access only valid page according to chip wifi: rtw89: 8851b: enable hw_scan support wifi: cfg80211: move scan done work to wiphy work wifi: cfg80211: move sched scan stop to wiphy work ... ==================== Link: https://lore.kernel.org/r/87bkhohkbg.fsf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	Merge branch 'tools-ynl-gen-code-gen-improvements-before-ethtool'	Jakub Kicinski
	Jakub Kicinski says: ==================== tools: ynl-gen: code gen improvements before ethtool I was going to post ethtool but I couldn't stand the ugliness of the if conditions which were previously generated. So I cleaned that up and improved a number of other things ethtool will benefit from. ==================== Link: https://lore.kernel.org/r/20230608211200.1247213-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: support / skip pads on the way to kernel	Jakub Kicinski
	Kernel does not have padding requirements for 64b attrs. We can ignore pad attrs. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: don't pass op_name to RenderInfo	Jakub Kicinski
	The op_name argument is barely used and identical to op.name in all cases. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: support code gen for events	Jakub Kicinski
	Netlink specs support both events and notifications (former can define their own message contents). Plug in missing code to generate types, parsers and include events into notification tables. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: sanitize notification tracking	Jakub Kicinski
	Don't modify the raw dicts (as loaded from YAML) to pretend that the notify attributes also exist on the ops. This makes the code easier to follow. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl: regen: stop generating common notification handlers	Jakub Kicinski
	Remove unused notification handlers. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: stop generating common notification handlers	Jakub Kicinski
	Common notification handler was supposed to be a way for the user to parse the notifications from a socket synchronously. I don't think we'll end up using it, ynl_ntf_check() works for all known use cases. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl: regen: regenerate the if ladders	Jakub Kicinski
	Renegate the code to combine } and else and use tmp variable to store type. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: get attr type outside of if()	Jakub Kicinski
	Reading attr type with mnl_attr_get_type() for each condition leads to most conditions being longer than 80 chars. Avoid this by reading the type to a variable on the stack. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: combine else with closing bracket	Jakub Kicinski
	Code gen currently prints: } else if (... This is really ugly. Fix it by delaying printing of closing brackets in anticipation of else coming along. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: complete the C keyword list	Jakub Kicinski
	C keywords need to be avoided when naming things. Complete the list (ethtool has at least one thing called "auto"). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl: regen: cleanup user space header includes	Jakub Kicinski
	Remove unnecessary includes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl-gen: cleanup user space header includes	Jakub Kicinski
	Bots started screaming that we're including stdlib.h twice. While at it move string.h into a common spot and drop stdio.h which we don't need. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5464 Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5466 Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5467 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	Revert "tools: ynl: Remove duplicated include in handshake-user.c"	Jakub Kicinski
	This reverts commit e7c5433c5aaab52ddd5448967a9a5db94a3939cc. Commit e7c5433c5aaa ("tools: ynl: Remove duplicated include in handshake-user.c") was applied too hastily. It changes an auto-generated file, and there's already a proper fix on the list. Link: https://lore.kernel.org/all/ZIMPLYi%2FxRih+DlC@nanopsycho/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09	tools: ynl: Remove duplicated include in handshake-user.c	Yang Li
	./tools/net/ynl/generated/handshake-user.c: stdlib.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5464 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	Merge branch 'broadcom-phy-led-brightness'	David S. Miller
	Florian Fainelli says: ==================== LED brightness support for Broadcom PHYs This patch series adds support for controlling the LED brightness on Broadcom PHYs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	net: phy: broadcom: Add support for setting LED brightness	Florian Fainelli
	Broadcom PHYs have two LEDs selector registers which allow us to control the LED assignment, including how to turn them on/off. Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	net: phy: broadcom: Rename LED registers	Florian Fainelli
	These registers are common to most PHYs and are not specific to the BCM5482, renamed the constants accordingly, no functional change. Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	Merge branch 'net-ncsi-refactoring-for-GMA-cmd'	David S. Miller
	Ivan Mikhaylov says: ==================== net/ncsi: refactoring for GMA command Make one GMA function for all manufacturers, change ndo_set_mac_address to dev_set_mac_address for notifiying net layer about MAC change which ndo_set_mac_address doesn't do. Changes from v1: 1. delete ftgmac100.txt changes about mac-address-increment 2. add convert to yaml from ftgmac100.txt 3. add mac-address-increment option for ethernet-controller.yaml Changes from v2: 1. remove DT changes from series, will be done in another one ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	net/ncsi: change from ndo_set_mac_address to dev_set_mac_address	Ivan Mikhaylov
	Change ndo_set_mac_address to dev_set_mac_address because dev_set_mac_address provides a way to notify network layer about MAC change. In other case, services may not aware about MAC change and keep using old one which set from network adapter driver. As example, DHCP client from systemd do not update MAC address without notification from net subsystem which leads to the problem with acquiring the right address from DHCP server. Fixes: cb10c7c0dfd9e ("net/ncsi: Add NCSI Broadcom OEM command") Cc: stable@vger.kernel.org # v6.0+ 2f38e84 net/ncsi: make one oem_gma function for all mfr id Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	net/ncsi: make one oem_gma function for all mfr id	Ivan Mikhaylov
	Make the one Get Mac Address function for all manufacturers and change this call in handlers accordingly. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	usbnet: ipheth: update Kconfig description	Foster Snowhill
	This module has for a long time not been limited to iPhone <= 3GS. Update description to match the actual state of the driver. Remove dead link from 2010, instead reference an existing userspace iOS device pairing implementation as part of libimobiledevice. Signed-off-by: Foster Snowhill <forst@pen.gy> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	usbnet: ipheth: add CDC NCM support	Foster Snowhill
	Recent iOS releases support CDC NCM encapsulation on RX. This mode is the default on macOS and Windows. In this mode, an iOS device may include one or more Ethernet frames inside a single URB. Freshly booted iOS devices start in legacy mode, but are put into NCM mode by the official Apple driver. When reconnecting such a device from a macOS/Windows machine to a Linux host, the device stays in NCM mode, making it unusable with the legacy ipheth driver code. To correctly support such a device, the driver has to either support the NCM mode too, or put the device back into legacy mode. To match the behaviour of the macOS/Windows driver, and since there is no documented control command to revert to legacy mode, implement NCM support. The device is attempted to be put into NCM mode by default, and falls back to legacy mode if the attempt fails. Signed-off-by: Foster Snowhill <forst@pen.gy> Tested-by: Georgi Valkov <gvalkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	usbnet: ipheth: transmit URBs without trailing padding	Foster Snowhill
	The behaviour of the official iOS tethering driver on macOS is to not transmit any trailing padding at the end of URBs. This is applicable to both NCM and legacy modes, including older devices. Adapt the driver to not include trailing padding in TX URBs, matching the behaviour of the official macOS driver. Signed-off-by: Foster Snowhill <forst@pen.gy> Tested-by: Georgi Valkov <gvalkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09	usbnet: ipheth: fix risk of NULL pointer deallocation	Georgi Valkov
	The cleanup precedure in ipheth_probe will attempt to free a NULL pointer in dev->ctrl_buf if the memory allocation for this buffer is not successful. While kfree ignores NULL pointers, and the existing code is safe, it is a better design to rearrange the goto labels and avoid this. Signed-off-by: Georgi Valkov <gvalkov@gmail.com> Signed-off-by: Foster Snowhill <forst@pen.gy> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-08	Merge branch ↵	Jakub Kicinski
	'splice-net-rewrite-splice-to-socket-fix-splice_f_more-and-handle-msg_splice_pages-in-af_tls' David Howells says: ==================== splice, net: Rewrite splice-to-socket, fix SPLICE_F_MORE and handle MSG_SPLICE_PAGES in AF_TLS Here are patches to do the following: (1) Block MSG_SENDPAGE_* flags from leaking into ->sendmsg() from userspace, whilst allowing splice_to_socket() to pass them in. (2) Allow MSG_SPLICE_PAGES to be passed into tls_*_sendmsg(). Until support is added, it will be ignored and a splice-driven sendmsg() will be treated like a normal sendmsg(). TCP, UDP, AF_UNIX and Chelsio-TLS already handle the flag in net-next. (3) Replace a chain of functions to splice-to-sendpage with a single function to splice via sendmsg() with MSG_SPLICE_PAGES. This allows a bunch of pages to be spliced from a pipe in a single call using a bio_vec[] and pushes the main processing loop down into the bowels of the protocol driver rather than repeatedly calling in with a page at a time. (4) Provide a ->splice_eof() op[2] that allows splice to signal to its output that the input observed a premature EOF and that the caller didn't flag SPLICE_F_MORE, thereby allowing a corked socket to be flushed. This attempts to maintain the current behaviour. It is also not called if we didn't manage to read any data and so didn't called the actor function. This needs routing though several layers to get it down to the network protocol. [!] Note that I chose not to pass in any flags - I'm not sure it's particularly useful to pass in the splice flags; I also elected not to return any error code - though we might actually want to do that. (5) Provide tls_{device,sw}_splice_eof() to flush a pending TLS record if there is one. (6) Provide splice_eof() for UDP, TCP, Chelsio-TLS and AF_KCM. AF_UNIX doesn't seem to pay attention to the MSG_MORE or MSG_SENDPAGE_NOTLAST flags. (7) Alter the behaviour of sendfile() and fix SPLICE_F_MORE/MSG_MORE signalling[1] such SPLICE_F_MORE is always signalled until we have read sufficient data to finish the request. If we get a zero-length before we've managed to splice sufficient data, we now leave the socket expecting more data and leave it to userspace to deal with it. (8) Make AF_TLS handle the MSG_SPLICE_PAGES internal sendmsg flag. MSG_SPLICE_PAGES is an internal hint that tells the protocol that it should splice the pages supplied if it can. Its sendpage implementations are then turned into wrappers around that. Link: https://lore.kernel.org/r/499791.1685485603@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ [2] Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6 # part 1 Link: https://lore.kernel.org/r/20230524153311.3625329-1-dhowells@redhat.com/ # v1 ==================== Link: https://lore.kernel.org/r/20230607181920.2294972-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	tls/device: Convert tls_device_sendpage() to use MSG_SPLICE_PAGES	David Howells
	Convert tls_device_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. With that, the tls_iter_offset union is no longer necessary and can be replaced with an iov_iter pointer and the zc_page argument to tls_push_data() can also be removed. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	tls/device: Support MSG_SPLICE_PAGES	David Howells
	Make TLS's device sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	tls/sw: Convert tls_sw_sendpage() to use MSG_SPLICE_PAGES	David Howells
	Convert tls_sw_sendpage() and tls_sw_sendpage_locked() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. [!] Note that tls_sw_sendpage_locked() appears to have the wrong locking upstream. I think the caller will only hold the socket lock, but it should hold tls_ctx->tx_lock too. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	tls/sw: Support MSG_SPLICE_PAGES	David Howells
	Make TLS's sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	splice, net: Fix SPLICE_F_MORE signalling in splice_direct_to_actor()	David Howells
	splice_direct_to_actor() doesn't manage SPLICE_F_MORE correctly[1] - and, as a result, it incorrectly signals/fails to signal MSG_MORE when splicing to a socket. The problem I'm seeing happens when a short splice occurs because we got a short read due to hitting the EOF on a file: as the length read (read_len) is less than the remaining size to be spliced (len), SPLICE_F_MORE (and thus MSG_MORE) is set. The issue is that, for the moment, we have no way to know why the short read occurred and so can't make a good decision on whether we should keep MSG_MORE set. MSG_SENDPAGE_NOTLAST was added to work around this, but that is also set incorrectly under some circumstances - for example if a short read fills a single pipe_buffer, but the next read would return more (seqfile can do this). This was observed with the multi_chunk_sendfile tests in the tls kselftest program. Some of those tests would hang and time out when the last chunk of file was less than the sendfile request size: build/kselftest/net/tls -r tls.12_aes_gcm.multi_chunk_sendfile This has been observed before[2] and worked around in AF_TLS[3]. Fix this by making splice_direct_to_actor() always signal SPLICE_F_MORE if we haven't yet hit the requested operation size. SPLICE_F_MORE remains signalled if the user passed it in to splice() but otherwise gets cleared when we've read sufficient data to fulfill the request. If, however, we get a premature EOF from ->splice_read(), have sent at least one byte and SPLICE_F_MORE was not set by the caller, ->splice_eof() will be invoked. Signed-off-by: David Howells <dhowells@redhat.com> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Jens Axboe <axboe@kernel.dk> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Matthew Wilcox <willy@infradead.org> cc: Jan Kara <jack@suse.cz> cc: Jeff Layton <jlayton@kernel.org> cc: David Hildenbrand <david@redhat.com> cc: Christian Brauner <brauner@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/499791.1685485603@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/1591392508-14592-1-git-send-email-pooja.trivedi@stackpath.com/ [2] Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d452d48b9f8b1a7f8152d33ef52cfd7fe1735b0a [3] Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	kcm: Use splice_eof() to flush	David Howells
	Allow splice to undo the effects of MSG_MORE after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Cong Wang <cong.wang@bytedance.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	chelsio/chtls: Use splice_eof() to flush	David Howells
	Allow splice to end a Chelsio TLS record after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Ayush Sawal <ayush.sawal@chelsio.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	ipv4, ipv6: Use splice_eof() to flush	David Howells
	Allow splice to undo the effects of MSG_MORE after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. For UDP, a pending packet will not be emitted if the socket is closed before it is flushed; with this change, it be flushed by ->splice_eof(). For TCP, it's not clear that MSG_MORE is actually effective. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Kuniyuki Iwashima <kuniyu@amazon.com> cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08	tls/device: Use splice_eof() to flush	David Howells
	Allow splice to end a TLS record after prematurely ending a splice/sendfile due to getting an EOF condition (->splice_read() returned 0) after splice had called TLS with a sendmsg() with MSG_MORE set when the user didn't set MSG_MORE. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>