summaryrefslogtreecommitdiff
path: root/include/uapi
AgeCommit message (Collapse)Author
2025-01-16net: mdio: add definition for clock stop capable bitRussell King (Oracle)
Add a definition for the clock stop capable bit in the PCS MMD. This bit indicates whether the MAC is able to stop the transmit xMII clock while it is signalling LPI. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E1tYADb-0014PV-6T@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16PCI: Add TLP Prefix reading to pcie_read_tlp_log()Ilpo Järvinen
pcie_read_tlp_log() handles only 4 Header Log DWORDs but TLP Prefix Log (PCIe r6.1 secs 7.8.4.12 & 7.9.14.13) may also be present. Generalize pcie_read_tlp_log() and struct pcie_tlp_log to also handle TLP Prefix Log. The relevant registers are formatted identically in AER and DPC Capability, but has these variations: a) The offsets of TLP Prefix Log registers vary. b) DPC RP PIO TLP Prefix Log register can be < 4 DWORDs. c) AER TLP Prefix Log Present (PCIe r6.1 sec 7.8.4.7) can indicate Prefix Log is not present. Therefore callers must pass the offset of the TLP Prefix Log register and the entire length to pcie_read_tlp_log() to be able to read the correct number of TLP Prefix DWORDs from the correct offset. Link: https://lore.kernel.org/r/20250114170840.1633-8-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash ternary fix from https://lore.kernel.org/r/20250116172019.88116-1-colin.i.king@gmail.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-01-15net: ethtool: add support for configuring hds-threshTaehee Yoo
The hds-thresh option configures the threshold value of the header-data-split. If a received packet size is larger than this threshold value, a packet will be split into header and payload. The header indicates TCP and UDP header, but it depends on driver spec. The bnxt_en driver supports HDS(Header-Data-Split) configuration at FW level, affecting TCP and UDP too. So, If hds-thresh is set, it affects UDP and TCP packets. Example: # ethtool -G <interface name> hds-thresh <value> # ethtool -G enp14s0f0np0 tcp-data-split on hds-thresh 256 # ethtool -g enp14s0f0np0 Ring parameters for enp14s0f0np0: Pre-set maximums: ... HDS thresh: 1023 Current hardware settings: ... TCP data split: on HDS thresh: 256 The default/min/max values are not defined in the ethtool so the drivers should define themself. The 0 value means that all TCP/UDP packets' header and payload will be split. Tested-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250114142852.3364986-3-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15Merge tag 'loongarch-kvm-6.14' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.14 1. Clear LLBCTL if secondary mmu mapping changed. 2. Add hypercall service support for usermode VMM. This is a really small changeset, because the Chinese New Year (Spring Festival) is coming. Happy New Year!
2025-01-15Input: allocate keycode for phone linkingIllia Ostapyshyn
The F11 key on the new Lenovo Thinkpad T14 Gen 5, T16 Gen 3, and P14s Gen 5 laptops includes a symbol showing a smartphone and a laptop chained together. According to the user manual, it starts the Microsoft Phone Link software used to connect to Android/iOS devices and relay messages/calls or sync data. As there are no suitable keycodes for this action, introduce a new one. Signed-off-by: Illia Ostapyshyn <illia@yshyn.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/20241114173930.44983-2-illia@yshyn.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-01-14PCI: Store number of supported End-End TLP PrefixesIlpo Järvinen
eetlp_prefix_path in the struct pci_dev tells if End-End TLP Prefixes are supported by the path or not, and the value is only calculated if CONFIG_PCI_PASID is set. The Max End-End TLP Prefixes field in the Device Capabilities Register 2 also tells how many (1-4) End-End TLP Prefixes are supported (PCIe r6.2 sec 7.5.3.15). The number of supported End-End Prefixes is useful for reading correct number of DWORDs from TLP Prefix Log register in AER capability (PCIe r6.2 sec 7.8.4.12). Replace eetlp_prefix_path with eetlp_prefix_max and determine the number of supported End-End Prefixes regardless of CONFIG_PCI_PASID so that an upcoming commit generalizing TLP Prefix Log register reading does not have to read extra DWORDs for End-End Prefixes that never will be there. The value stored into eetlp_prefix_max is directly derived from device's Max End-End TLP Prefixes and does not consider limitations imposed by bridges or the Root Port beyond supported/not supported flags. This is intentional for two reasons: 1) PCIe r6.2 spec sections 2.2.10.4 & 6.2.4.4 indicate that a TLP is malformed only if the number of prefixes exceed the number of Max End-End TLP Prefixes, which seems to be the case even if the device could never receive that many prefixes due to smaller maximum imposed by a bridge or the Root Port. If TLP parsing is later added, this distinction is significant in interpreting what is logged by the TLP Prefix Log registers and the value matching to the Malformed TLP threshold is going to be more useful. 2) TLP Prefix handling happens autonomously on a low layer and the value in eetlp_prefix_max is not programmed anywhere by the kernel (i.e., there is no limiter OS can control to prevent sending more than N TLP Prefixes). Link: https://lore.kernel.org/r/20250114170840.1633-7-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-01-14tcp: add LINUX_MIB_PAWS_OLD_ACK SNMP counterEric Dumazet
Prior patch in the series added TCP_RFC7323_PAWS_ACK drop reason. This patch adds the corresponding SNMP counter, for folks using nstat instead of tracing for TCP diagnostics. nstat -az | grep PAWSOldAck Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Tested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250113135558.3180360-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14ALSA: rawmidi: Make tied_device=0 as default / unknownTakashi Iwai
In the original change, rawmidi_info.tied_device showed -1 for the unknown or untied device. But this would require the user-space to check the protocol version and judge the value conditionally, which is rather error-prone. Instead, set the tied_device = 0 to be default as unknown, and indicate the real device with the offset 1, for achieving more backward compatibility. Suggested-by: Jaroslav Kysela <perex@perex.cz> Link: https://patch.msgid.link/20250114104711.19197-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-01-14Merge tag 'nf-next-25-01-11' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains a small batch of Netfilter/IPVS updates for net-next: 1) Remove unused genmask parameter in nf_tables_addchain() 2) Speed up reads from /proc/net/ip_vs_conn, from Florian Westphal. 3) Skip empty buckets in hashlimit to avoid atomic operations that results in false positive reports by syzbot with lockdep enabled, patch from Eric Dumazet. 4) Add conntrack event timestamps available via ctnetlink, from Florian Westphal. netfilter pull request 25-01-11 * tag 'nf-next-25-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: conntrack: add conntrack event timestamp netfilter: xt_hashlimit: htable_selective_cleanup() optimization ipvs: speed up reads from ip_vs_conn proc file netfilter: nf_tables: remove the genmask parameter ==================== Link: https://patch.msgid.link/20250111230800.67349-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14net: ethtool: add support for structured PHY statisticsJakub Kicinski
Introduce a new way to report PHY statistics in a structured and standardized format using the netlink API. This new method does not replace the old driver-specific stats, which can still be accessed with `ethtool -S <eth name>`. The structured stats are available with `ethtool -S <eth name> --all-groups`. This new method makes it easier to diagnose problems by organizing stats in a consistent and documented way. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-13md: reintroduce md-linearYu Kuai
THe md-linear is removed by commit 849d18e27be9 ("md: Remove deprecated CONFIG_MD_LINEAR") because it has been marked as deprecated for a long time. However, md-linear is used widely for underlying disks with different size, sadly we didn't know this until now, and it's true useful to create partitions and assemble multiple raid and then append one to the other. People have to use dm-linear in this case now, however, they will prefer to minimize the number of involved modules. Fixes: 849d18e27be9 ("md: Remove deprecated CONFIG_MD_LINEAR") Cc: stable@vger.kernel.org Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Coly Li <colyli@kernel.org> Acked-by: Mike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/r/20250102112841.1227111-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>
2025-01-13wifi: cfg80211: Add support for controlling EPCSIlan Peer
Add support for configuring Emergency Preparedness Communication Services (EPCS) for station mode. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250102161730.ea54ac94445c.I11d750188bc0871e13e86146a3b5cc048d853e69@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13wifi: cfg80211: Add support for dynamic addition/removal of linksIlan Peer
Add support for requesting dynamic addition/removal of links to the current MLO association. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250102161730.cef23352f2a2.I79c849974c494cb1cbf9e1b22a5d2d37395ff5ac@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13wifi: nl80211: permit userspace to pass supported selectorsBenjamin Berg
Currently the SAE_H2E selector already exists, which needs to be implemented by the SME. As new such selectors might be added in the future, add a feature to permit userspace to report a selector as supported. If not given, the kernel should assume that userspace does support SAE_H2E. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250101070249.fe67b871cc39.Ieb98390328927e998e612345a58b6dbc00b0e3a2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-13Merge 6.13-rc4 into char-misc-nextGreg Kroah-Hartman
We need the IIO fixes in here as well, and it resolves a merge conflict in: drivers/iio/adc/ti-ads1119.c Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-13Merge 6.13-rc7 into usb-nextGreg Kroah-Hartman
We need the USB fixes in here as well for testing. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-12delayacct: add delay min to record delay peakWang Yaxin
Delay accounting can now calculate the average delay of processes, detect the overall system load, and also record the 'delay max' to identify potential abnormal delays. However, 'delay min' can help us identify another useful delay peak. By comparing the difference between 'delay max' and 'delay min', we can understand the optimization space for latency, providing a reference for the optimization of latency performance. Use case ========= bash-4.4# ./getdelays -d -t 242 print delayacct stats ON TGID 242 CPU count real total virtual total delay total delay average delay max delay min 39 156000000 156576579 2111069 0.054ms 0.212296ms 0.031307ms IO count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms SWAP count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms RECLAIM count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms THRASHING count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms COMPACT count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms WPCOPY count delay total delay average delay max delay min 156 11215873 0.072ms 0.207403ms 0.033913ms IRQ count delay total delay average delay max delay min 0 0 0.000ms 0.000000ms 0.000000ms Link: https://lkml.kernel.org/r/20241220173105906EOdsPhzjMLYNJJBqgz1ga@zte.com.cn Co-developed-by: Wang Yong <wang.yong12@zte.com.cn> Signed-off-by: Wang Yong <wang.yong12@zte.com.cn> Co-developed-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn> Co-developed-by: Kun Jiang <jiang.kun2@zte.com.cn> Signed-off-by: Kun Jiang <jiang.kun2@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Peilin He <he.peilin@zte.com.cn> Cc: tuqiang <tu.qiang35@zte.com.cn> Cc: ye xingchen <ye.xingchen@zte.com.cn> Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-12delayacct: add delay max to record delay peakWang Yaxin
Introduce the use cases of delay max, which can help quickly detect potential abnormal delays in the system and record the types and specific details of delay spikes. Problem ======== Delay accounting can track the average delay of processes to show system workload. However, when a process experiences a significant delay, maybe a delay spike, which adversely affects performance, getdelays can only display the average system delay over a period of time. Yet, average delay is unhelpful for diagnosing delay peak. It is not even possible to determine which type of delay has spiked, as this information might be masked by the average delay. Solution ========= the 'delay max' can display delay peak since the system's startup, which can record potential abnormal delays over time, including the type of delay and the maximum delay. This is helpful for quickly identifying crash caused by delay. Use case ========= bash# ./getdelays -d -p 244 print delayacct stats ON PID 244 CPU count real total virtual total delay total delay average delay max 68 192000000 213676651 705643 0.010ms 0.306381ms IO count delay total delay average delay max 0 0 0.000ms 0.000000ms SWAP count delay total delay average delay max 0 0 0.000ms 0.000000ms RECLAIM count delay total delay average delay max 0 0 0.000ms 0.000000ms THRASHING count delay total delay average delay max 0 0 0.000ms 0.000000ms COMPACT count delay total delay average delay max 0 0 0.000ms 0.000000ms WPCOPY count delay total delay average delay max 235 15648284 0.067ms 0.263842ms IRQ count delay total delay average delay max 0 0 0.000ms 0.000000ms [wang.yaxin@zte.com.cn: update docs and fix some spelling errors] Link: https://lkml.kernel.org/r/20241213192700771XKZ8H30OtHSeziGqRVMs0@zte.com.cn Link: https://lkml.kernel.org/r/20241203164848805CS62CQPQWG9GLdQj2_BxS@zte.com.cn Co-developed-by: Wang Yong <wang.yong12@zte.com.cn> Signed-off-by: Wang Yong <wang.yong12@zte.com.cn> Co-developed-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Co-developed-by: Wang Yaxin <wang.yaxin@zte.com.cn> Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn> Signed-off-by: Kun Jiang <jiang.kun2@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Peilin He <he.peilin@zte.com.cn> Cc: tuqiang <tu.qiang35@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: ye xingchen <ye.xingchen@zte.com.cn> Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13Merge tag 'drm-msm-next-2025-01-07' of gitlab.freedesktop.org:drm/msm into ↵Dave Airlie
drm-next Updates for v6.14 MDSS: - properly described UBWC registers - added SM6150 (aka QCS615) support MDP4: - several small fixes DPU: - added SM6150 (aka QCS615) support - enabled wide planes if virtual planes are enabled (by using two SSPPs for a single plane) - fixed modes filtering for platforms w/o 3DMux - fixed DSPP DSPP_2 / _3 links on several platforms - corrected DSPP definitions on SDM670 - added CWB hardware blocks support - added VBIF to DPU snapshots - dropped struct dpu_rm_requirements DP: - reworked DP audio support DSI: - added SM6150 (aka QCS615) support GPU: - Print GMU core fw version - GMU bandwidth voting for a740 and a750 - Expose uche trap base via uapi - UAPI error reporting Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGsutUu4ff6OpXNXxqf1xaV0rV6oV23VXNRiF0_OEfe72Q@mail.gmail.com
2025-01-12ALSA: seq: Notify UMP EP and FB changesTakashi Iwai
So far we notify the sequencer client and port changes upon UMP FB changes, but those aren't really corresponding to the UMP updates. e.g. when a FB info gets updated, it's not notified but done only when some of sequencer port attribute is changed. This is no ideal behavior. This patch adds the two new sequencer event types for notifying the UMP EP and FB changes via the announce port. The new event takes snd_seq_ev_ump_notify type data, which is compatible with snd_seq_addr (where the port number is replaced with the block number). The events are sent when the EP and FB info gets updated explicitly via ioctl, or the backend UMP receives the corresponding UMP messages. The sequencer protocol version is bumped to 1.0.5 along with it. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250110155943.31578-9-tiwai@suse.de
2025-01-12ALSA: rawmidi: Bump protocol version to 2.0.5Takashi Iwai
Bump the protocol version to 2.0.5, as we extended the rawmidi ABI for the new tied_device info and the substream inactive flag. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250110155943.31578-4-tiwai@suse.de
2025-01-12ALSA: rawmidi: Show substream activity in info ioctlTakashi Iwai
The UMP legacy rawmidi may turn on/off the substream dynamically depending on the UMP Function Block information. So far, there was no direct way to know whether the substream is disabled (inactive) or not; at most one can take a look at the substream name string or try to open and get -ENODEV. This patch extends the rawmidi info ioctl to show the current inactive state of the given substream. When the selected substream is inactive, info flags field contains the new bit flag SNDRV_RAWMIDI_INFO_STREAM_INACTIVE. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250110155943.31578-3-tiwai@suse.de
2025-01-12ALSA: rawmidi: Expose the tied device number in info ioctlTakashi Iwai
The UMP legacy rawmidi is derived from the UMP rawmidi, but currently there is no way to know which device is involved in other side. This patch extends the rawmidi info ioctl to show the tied device number. As default it stores -1, indicating that no tied device. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250110155943.31578-2-tiwai@suse.de
2025-01-10io_uring: expose read/write attribute capabilityAnuj Gupta
After commit 9a213d3b80c0, we can pass additional attributes along with read/write. However, userspace doesn't know that. Add a new feature flag IORING_FEAT_RW_ATTR, to notify the userspace that the kernel has this ability. Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Tested-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20241205062109.1788-1-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-10Merge tag 'ipsec-next-2025-01-09' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== ipsec-next-2025-01-09 1) Implement the AGGFRAG protocol and basic IP-TFS (RFC9347) functionality. From Christian Hopps. 2) Support ESN context update to hardware for TX. From Jianbo Liu. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-10Merge tag 'v6.13-rc6' into drm-nextDave Airlie
This backmerges Linux 6.13-rc6 this is need for the newer pulls. Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-01-09fs: add STATX_DIO_READ_ALIGNChristoph Hellwig
Add a separate dio read align field, as many out of place write file systems can easily do reads aligned to the device sector size, but require bigger alignment for writes. This is usually papered over by falling back to buffered I/O for smaller writes and doing read-modify-write cycles, but performance for this sucks, so applications benefit from knowing the actual write alignment. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250109083109.1441561-3-hch@lst.de Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: reformat the statx definitionChristoph Hellwig
The comments after the declaration are becoming rather unreadable with long enough comments. Move them into lines of their own. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250109083109.1441561-2-hch@lst.de Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09netfilter: conntrack: add conntrack event timestampFlorian Westphal
Nadia Pinaeva writes: I am working on a tool that allows collecting network performance metrics by using conntrack events. Start time of a conntrack entry is used to evaluate seen_reply latency, therefore the sooner it is timestamped, the better the precision is. In particular, when using this tool to compare the performance of the same feature implemented using iptables/nftables/OVS it is crucial to have the entry timestamped earlier to see any difference. At this time, conntrack events can only get timestamped at recv time in userspace, so there can be some delay between the event being generated and the userspace process consuming the message. There is sys/net/netfilter/nf_conntrack_timestamp, which adds a 64bit timestamp (ns resolution) that records start and stop times, but its not suited for this either, start time is the 'hashtable insertion time', not 'conntrack allocation time'. There is concern that moving the start-time moment to conntrack allocation will add overhead in case of flooding, where conntrack entries are allocated and released right away without getting inserted into the hashtable. Also, even if this was changed it would not with events other than new (start time) and destroy (stop time). Pablo suggested to add new CTA_TIMESTAMP_EVENT, this adds this feature. The timestamp is recorded in case both events are requested and the sys/net/netfilter/nf_conntrack_timestamp toggle is enabled. Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com> Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-09netlink: add IPv6 anycast join/leave notificationsYuyang Huang
This change introduces a mechanism for notifying userspace applications about changes to IPv6 anycast addresses via netlink. It includes: * Addition and deletion of IPv6 anycast addresses are reported using RTM_NEWANYCAST and RTM_DELANYCAST. * A new netlink group (RTNLGRP_IPV6_ACADDR) for subscribing to these notifications. This enables user space applications(e.g. ip monitor) to efficiently track anycast addresses through netlink messages, improving metrics collection and system monitoring. It also unlocks the potential for advanced anycast management in user space, such as hardware offload control and fine grained network control. Cc: Maciej Żenczykowski <maze@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Yuyang Huang <yuyanghuang@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250107114355.1766086-1-yuyanghuang@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-09Merge tag 'drm-xe-next-2025-01-07' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - OA new property: 'unblock after N reports' (Ashutosh) i915 display Changes: - UHBR rates for Thunderbolt (Kahola) Driver Changes: - IRQ related fixes and improvements (Ilia) - Revert some changes that break a mesa debug tool (John) - Fix migration issues (Nirmoy) - Enable GuC's WA_DUAL_QUEUE for newer platforms (Daniele) - Move shrink test out of xe_bo (Nirmoy) - SRIOV PF: Use correct function to check LMEM provisioning (Michal) - Fix a false-positive "Missing outer runtime PM protection" warning (Rodrigo) - Make GSCCS disabling message less alarming (Daniele) - Fix DG1 power gate sequence (Rodrigo) - Xe files fixes (Lucas) - Fix a potential TP_printk UAF (Thomas) - OA Fixes (Umesh) - Fix tlb invalidation when wedging (Lucas) - Documentation fix (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z31579j3V3XCPFaK@intel.com
2025-01-08ntsync: Introduce alertable waits.Elizabeth Figura
NT waits can optionally be made "alertable". This is a special channel for thread wakeup that is mildly similar to SIGIO. A thread has an internal single bit of "alerted" state, and if a thread is alerted while an alertable wait, the wait will return a special value, consume the "alerted" state, and will not consume any of its objects. Alerts are implemented using events; the user-space NT emulator is expected to create an internal ntsync event for each thread and pass that event to wait functions. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-16-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_EVENT_READ.Elizabeth Figura
This corresponds to the NT syscall NtQueryEvent(). This returns the signaled state of the event and whether it is manual-reset. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-15-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_MUTEX_READ.Elizabeth Figura
This corresponds to the NT syscall NtQueryMutant(). This returns the recursion count, owner, and abandoned state of the mutex. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-14-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_SEM_READ.Elizabeth Figura
This corresponds to the NT syscall NtQuerySemaphore(). This returns the current count and maximum count of the semaphore. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-13-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.Elizabeth Figura
This corresponds to the NT syscall NtPulseEvent(). This wakes up any waiters as if the event had been set, but does not set the event, instead resetting it if it had been signalled. Thus, for a manual-reset event, all waiters are woken, whereas for an auto-reset event, at most one waiter is woken. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-12-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_EVENT_RESET.Elizabeth Figura
This corresponds to the NT syscall NtResetEvent(). This sets the event to the unsignaled state, and returns its previous state. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-11-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_EVENT_SET.Elizabeth Figura
This corresponds to the NT syscall NtSetEvent(). This sets the event to the signaled state, and returns its previous state. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-10-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.Elizabeth Figura
This correspond to the NT syscall NtCreateEvent(). An NT event holds a single bit of state denoting whether it is signaled or unsignaled. There are two types of events: manual-reset and automatic-reset. When an automatic-reset event is acquired via a wait function, its state is reset to unsignaled. Manual-reset events are not affected by wait functions. Whether the event is manual-reset, and its initial state, are specified at creation time. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-9-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.Elizabeth Figura
This does not correspond to any NT syscall. Rather, when a thread dies, it should be called by the NT emulator for each mutex, with the TID of the dying thread. NT mutexes are robust (in the pthread sense). When an NT thread dies, any mutexes it owned are immediately released. Acquisition of those mutexes by other threads will return a special value indicating that the mutex was abandoned, like EOWNERDEAD returned from pthread_mutex_lock(), and EOWNERDEAD is indeed used here for that purpose. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-8-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.Elizabeth Figura
This corresponds to the NT syscall NtReleaseMutant(). This syscall decrements the mutex's recursion count by one, and returns the previous value. If the mutex is not owned by the current task, the function instead fails and returns -EPERM. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-7-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.Elizabeth Figura
This corresponds to the NT syscall NtCreateMutant(). An NT mutex is recursive, with a 32-bit recursion counter. When acquired via NtWaitForMultipleObjects(), the recursion counter is incremented by one. The OS records the thread which acquired it. The OS records the thread which acquired it. However, in order to keep this driver self-contained, the owning thread ID is managed by user-space, and passed as a parameter to all relevant ioctls. The initial owner and recursion count, if any, are specified when the mutex is created. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-6-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_WAIT_ALL.Elizabeth Figura
This is similar to NTSYNC_IOC_WAIT_ANY, but waits until all of the objects are simultaneously signaled, and then acquires all of them as a single atomic operation. Because acquisition of multiple objects is atomic, some complex locking is required. We cannot simply spin-lock multiple objects simultaneously, as that may disable preëmption for a problematically long time. Instead, modifying any object which may be involved in a wait-all operation takes a device-wide sleeping mutex, "wait_all_lock", instead of the normal object spinlock. Because wait-for-all is a rare operation, in order to optimize wait-for-any, this lock is only taken when necessary. "all_hint" is used to mark objects which are involved in a wait-for-all operation, and if an object is not, only its spinlock is taken. The locking scheme used here was written by Peter Zijlstra. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-5-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Introduce NTSYNC_IOC_WAIT_ANY.Elizabeth Figura
This corresponds to part of the functionality of the NT syscall NtWaitForMultipleObjects(). Specifically, it implements the behaviour where the third argument (wait_any) is TRUE, and it does not handle alertable waits. Those features have been split out into separate patches to ease review. This patch therefore implements the wait/wake infrastructure which comprises the core of ntsync's functionality. NTSYNC_IOC_WAIT_ANY is a vectored wait function similar to poll(). Unlike poll(), it "consumes" objects when they are signaled. For semaphores, this means decreasing one from the internal counter. At most one object can be consumed by this function. This wait/wake model is fundamentally different from that used anywhere else in the kernel, and for that reason ntsync does not use any existing infrastructure, such as futexes, kernel mutexes or semaphores, or wait_event(). Up to 64 objects can be waited on at once. As soon as one is signaled, the object with the lowest index is consumed, and that index is returned via the "index" field. A timeout is supported. The timeout is passed as a u64 nanosecond value, which represents absolute time measured against either the MONOTONIC or REALTIME clock (controlled by the flags argument). If U64_MAX is passed, the ioctl waits indefinitely. This ioctl validates that all objects belong to the relevant device. This is not necessary for any technical reason related to NTSYNC_IOC_WAIT_ANY, but will be necessary for NTSYNC_IOC_WAIT_ALL introduced in the following patch. Some padding fields are added for alignment and for fields which will be added in future patches (split out to ease review). Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-4-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Rename NTSYNC_IOC_SEM_POST to NTSYNC_IOC_SEM_RELEASE.Elizabeth Figura
Use the more common "release" terminology, which is also the term used by NT, instead of "post" (which is used by POSIX). Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-3-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ntsync: Return the fd from NTSYNC_IOC_CREATE_SEM.Elizabeth Figura
Simplify the user API a bit by returning the fd as return value from the ioctl instead of through the argument pointer. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-2-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08drivers pps: add PPS generators supportRodolfo Giometti
Sometimes one needs to be able not only to catch PPS signals but to produce them also. For example, running a distributed simulation, which requires computers' clock to be synchronized very tightly. This patch adds PPS generators class in order to have a well-defined interface for these devices. Signed-off-by: Rodolfo Giometti <giometti@enneenne.com> Link: https://lore.kernel.org/r/20241108073115.759039-2-giometti@enneenne.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08ASoC: Merge up v6.13-rc6Mark Brown
This helps several of my boards in CI.
2025-01-08vduse: relicense under GPL-2.0 OR BSD-3-ClauseYongji Xie
Dual-license the vduse kernel header file to dual GPL-2.0 OR BSD-3-Clause license to make it possible to ship it with DPDK (under BSD-3-Clause) for older distros. Signed-off-by: Yongji Xie <xieyongji@bytedance.com> Message-Id: <20241119074238.38299-1-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-07Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2025-01-07 We've added 7 non-merge commits during the last 32 day(s) which contain a total of 11 files changed, 190 insertions(+), 103 deletions(-). The main changes are: 1) Migrate the test_xdp_meta.sh BPF selftest into test_progs framework, from Bastien Curutchet. 2) Add ability to configure head/tailroom for netkit devices, from Daniel Borkmann. 3) Fixes and improvements to the xdp_hw_metadata selftest, from Song Yoong Siang. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Extend netkit tests to validate set {head,tail}room netkit: Add add netkit {head,tail}room to rt_link.yaml netkit: Allow for configuring needed_{head,tail}room selftests/bpf: Migrate test_xdp_meta.sh into xdp_context_test_run.c selftests/bpf: test_xdp_meta: Rename BPF sections selftests/bpf: Enable Tx hwtstamp in xdp_hw_metadata selftests/bpf: Actuate tx_metadata_len in xdp_hw_metadata ==================== Link: https://patch.msgid.link/20250107130908.143644-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>