Age | Commit message (Collapse) | Author |
|
This block is used in (at least) T1024 and T1040, including their
variants like T1023 etc.
Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update FMan binding documentation with the newly added workaround for
erratum A-009885.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Once an MDIO read transaction is initiated, we must read back the data
register within 16 MDC cycles after the transaction completes. Outside
of this window, reads may return corrupt data.
Therefore, disable local interrupts in the critical section, to
maximize the probability that we can satisfy this requirement.
Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clang static analysis reports this issue
ocelot_flower.c:563:8: warning: 1st function call argument
is an uninitialized value
!is_zero_ether_addr(match.mask->dst)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The variable match is used before it is set. So move the
block.
Fixes: 75944fda1dfe ("net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On a setup with KSZ9131 and MACB drivers it happens on suspend path, from
time to time, that the PHY interrupt arrives after PHY and MACB were
suspended (PHY via genphy_suspend(), MACB via macb_suspend()). In this
case the phy_read() at the beginning of kszphy_handle_interrupt() will
fail (as MACB driver is suspended at this time) leading to phy_error()
being called and a stack trace being displayed on console. To solve this
.suspend/.resume functions for all KSZ devices implementing
.handle_interrupt were replaced with kszphy_suspend()/kszphy_resume()
which disable/enable interrupt before/after calling
genphy_suspend()/genphy_resume().
The fix has been adapted for all KSZ devices which implements
.handle_interrupt but it has been tested only on KSZ9131.
Fixes: 59ca4e58b917 ("net: phy: micrel: implement generic .handle_interrupt() callback")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both versions of the CPSW driver declare a CPSW_HEADROOM_NA macro that
takes NET_IP_ALIGN into account, but fail to use it appropriately when
storing incoming packets in memory. This results in the IPv4 source and
destination addresses to appear misaligned in memory, which causes
aligment faults that need to be fixed up in software.
So let's switch from CPSW_HEADROOM to CPSW_HEADROOM_NA where needed.
This gets rid of any alignment faults on the RX path on a Beaglebone
White.
Fixes: 9ed4050c0d75 ("net: ethernet: ti: cpsw: add XDP support")
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Syzbot detected a NULL pointer dereference of nfc_llcp_sock->dev pointer
(which is a 'struct nfc_dev *') with calls to llcp_sock_sendmsg() after
a failed llcp_sock_bind(). The message being sent is a SOCK_DGRAM.
KASAN report:
BUG: KASAN: null-ptr-deref in nfc_alloc_send_skb+0x2d/0xc0
Read of size 4 at addr 00000000000005c8 by task llcp_sock_nfc_a/899
CPU: 5 PID: 899 Comm: llcp_sock_nfc_a Not tainted 5.16.0-rc6-next-20211224-00001-gc6437fbf18b0 #125
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x45/0x59
? nfc_alloc_send_skb+0x2d/0xc0
__kasan_report.cold+0x117/0x11c
? mark_lock+0x480/0x4f0
? nfc_alloc_send_skb+0x2d/0xc0
kasan_report+0x38/0x50
nfc_alloc_send_skb+0x2d/0xc0
nfc_llcp_send_ui_frame+0x18c/0x2a0
? nfc_llcp_send_i_frame+0x230/0x230
? __local_bh_enable_ip+0x86/0xe0
? llcp_sock_connect+0x470/0x470
? llcp_sock_connect+0x470/0x470
sock_sendmsg+0x8e/0xa0
____sys_sendmsg+0x253/0x3f0
...
The issue was visible only with multiple simultaneous calls to bind() and
sendmsg(), which resulted in most of the bind() calls to fail. The
bind() was failing on checking if there is available WKS/SDP/SAP
(respective bit in 'struct nfc_llcp_local' fields). When there was no
available WKS/SDP/SAP, the bind returned error but the sendmsg() to such
socket was able to trigger mentioned NULL pointer dereference of
nfc_llcp_sock->dev.
The code looks simply racy and currently it protects several paths
against race with checks for (!nfc_llcp_sock->local) which is NULL-ified
in error paths of bind(). The llcp_sock_sendmsg() did not have such
check but called function nfc_llcp_send_ui_frame() had, although not
protected with lock_sock().
Therefore the race could look like (same socket is used all the time):
CPU0 CPU1
==== ====
llcp_sock_bind()
- lock_sock()
- success
- release_sock()
- return 0
llcp_sock_sendmsg()
- lock_sock()
- release_sock()
llcp_sock_bind(), same socket
- lock_sock()
- error
- nfc_llcp_send_ui_frame()
- if (!llcp_sock->local)
- llcp_sock->local = NULL
- nfc_put_device(dev)
- dereference llcp_sock->dev
- release_sock()
- return -ERRNO
The nfc_llcp_send_ui_frame() checked llcp_sock->local outside of the
lock, which is racy and ineffective check. Instead, its caller
llcp_sock_sendmsg(), should perform the check inside lock_sock().
Reported-and-tested-by: syzbot+7f23bcddf626e0593a39@syzkaller.appspotmail.com
Fixes: b874dec21d1c ("NFC: Implement LLCP connection less Tx path")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Robert Hancock says:
====================
Xilinx axienet fixes
Various fixes for the Xilinx AXI Ethernet driver.
Changed since v2:
-added Reviewed-by tags, added some explanation to commit
messages, no code changes
Changed since v1:
-corrected a Fixes tag to point to mainline commit
-split up reset changes into 3 patches
-added ratelimit on netdev_warn in TX busy case
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With previous changes to make the driver handle the TX ring size more
correctly, the default TX ring size of 64 appears to significantly
bottleneck TX performance to around 600 Mbps on a 1 Gbps link on ZynqMP.
Increasing this to 128 seems to bring performance up to near line rate and
shouldn't cause excess bufferbloat (this driver doesn't yet support modern
byte-based queue management).
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Network driver documentation indicates we should be avoiding returning
NETDEV_TX_BUSY from ndo_start_xmit in normal cases, since it requires
the packets to be requeued. Instead the queue should be stopped after
a packet is added to the TX ring when there may not be enough room for an
additional one. Also, when TX ring entries are completed, we should only
wake the queue if we know there is room for another full maximally
fragmented packet.
Print a warning if there is insufficient space at the start of start_xmit,
since this should no longer happen.
Combined with increasing the default TX ring size (in a subsequent
patch), this appears to recover the TX performance lost by previous changes
to actually manage the TX ring state properly.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The check for the number of available TX ring slots was off by 1 since a
slot is required for the skb header as well as each fragment. This could
result in overwriting a TX ring slot that was still in use.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The check for whether a TX ring slot was available was incorrect,
since a slot which had been loaded with transmit data but the device had
not started transmitting would be treated as available, potentially
causing non-transmitted slots to be overwritten. The control field in
the descriptor should be checked, rather than the status field (which may
only be updated when the device completes the entry).
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver will not work properly if the TX ring size is set to below
MAX_SKB_FRAGS + 1 since it needs to hold at least one full maximally
fragmented packet in the TX ring. Limit setting the ring size to below
this value.
Fixes: 8b09ca823ffb4 ("net: axienet: Make RX/TX ring sizes configurable")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This driver was missing some required memory barriers:
Use dma_rmb to ensure we see all updates to the descriptor after we see
that an entry has been completed.
Use wmb and rmb to avoid stale descriptor status between the TX path and
TX complete IRQ path.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In some cases where the Xilinx Ethernet core was used in 1000Base-X or
SGMII modes, which use the internal PCS/PMA PHY, and the MGT
transceiver clock source for the PCS was not running at the time the
FPGA logic was loaded, the core would come up in a state where the
PCS could not be found on the MDIO bus. To fix this, the Ethernet core
(including the PCS) should be reset after enabling the clocks, prior to
attempting to access the PCS using of_mdio_find_device.
Fixes: 1a02556086fc (net: axienet: Properly handle PCS/PMA PHY for 1000BaseX mode)
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When resetting the device, wait for the PhyRstCmplt bit to be set
in the interrupt status register before continuing initialization, to
ensure that the core is actually ready. When using an external PHY, this
also ensures we do not start trying to access the PHY while it is still
in reset. The PHY reset is initiated by the core reset which is
triggered just above, but remains asserted for 5ms after the core is
reset according to the documentation.
The MgtRdy bit could also be waited for, but unfortunately when using
7-series devices, the bit does not appear to work as documented (it
seems to behave as some sort of link state indication and not just an
indication the transceiver is ready) so it can't really be relied on for
this purpose.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The previous timeout of 1ms was too short to handle some cases where the
core is reset just after the input clocks were started, which will
be introduced in an upcoming patch. Increase the timeout to 50ms. Also
simplify the reset timeout checking to use read_poll_timeout.
Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've tried to address some performance issues in
f2fs_checkpoint and direct IO flows. Also, there was a work to enhance
the page cache management used for compression. Other than them, we've
done typical work including sysfs, code clean-ups, tracepoint, sanity
check, in addition to bug fixes on corner cases.
Enhancements:
- use iomap for direct IO
- try to avoid lock contention to improve f2fs_ckpt speed
- avoid unnecessary memory allocation in compression flow
- POSIX_FADV_DONTNEED drops the page cache containing compression
pages
- add some sysfs entries (gc_urgent_high_remaining, pending_discard)
Bug fixes:
- try not to expose unwritten blocks to user by DIO (this was added
to avoid merge conflict; another patch is coming to address other
missing case)
- relax minor error condition for file pinning feature used in
Android OTA
- fix potential deadlock case in compression flow
- should not truncate any block on pinned file
In addition, we've done some code clean-ups and tracepoint/sanity
check improvement"
* tag 'f2fs-for-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits)
f2fs: do not allow partial truncation on pinned file
f2fs: remove redunant invalidate compress pages
f2fs: Simplify bool conversion
f2fs: don't drop compressed page cache in .{invalidate,release}page
f2fs: fix to reserve space for IO align feature
f2fs: fix to check available space of CP area correctly in update_ckpt_flags()
f2fs: support fault injection to f2fs_trylock_op()
f2fs: clean up __find_inline_xattr() with __find_xattr()
f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr()
f2fs: do not bother checkpoint by f2fs_get_node_info
f2fs: avoid down_write on nat_tree_lock during checkpoint
f2fs: compress: fix potential deadlock of compress file
f2fs: avoid EINVAL by SBI_NEED_FSCK when pinning a file
f2fs: add gc_urgent_high_remaining sysfs node
f2fs: fix to do sanity check in is_alive()
f2fs: fix to avoid panic in is_alive() if metadata is inconsistent
f2fs: fix to do sanity check on inode type during garbage collection
f2fs: avoid duplicate call of mark_inode_dirty
f2fs: show number of pending discard commands
f2fs: support POSIX_FADV_DONTNEED drop compressed page cache
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"tracing/scripts: Possible uninitialized variable
The 0day bot discovered a possible uninitialized path in the scripts
that sort the mcount sections at build time. Just needed to initialize
that variable"
* tag 'trace-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
script/sorttable: Fix some initialization problems
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for the DA9063 as used on the HiFive Unmatched.
- Support for relative extables, which puts us in line with other
architectures and save some space in vmlinux.
- A handful of kexec fixes/improvements, including the ability to run
crash kernels from PCI-addressable memory on the HiFive Unmatched.
- Support for the SBI SRST extension, which allows systems that do not
have an explicit driver in Linux to reboot.
- A handful of fixes and cleanups, including to the defconfigs and
device trees.
* tag 'riscv-for-linus-5.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
RISC-V: Use SBI SRST extension when available
riscv: mm: fix wrong phys_ram_base value for RV64
RISC-V: Use common riscv_cpuid_to_hartid_mask() for both SMP=y and SMP=n
riscv: head: remove useless __PAGE_ALIGNED_BSS and .balign
riscv: errata: alternative: mark vendor_patch_func __initdata
riscv: head: make secondary_start_common() static
riscv: remove cpu_stop()
riscv: try to allocate crashkern region from 32bit addressible memory
riscv: use hart id instead of cpu id on machine_kexec
riscv: Don't use va_pa_offset on kdump
riscv: dts: sifive: fu540-c000: Fix PLIC node
riscv: dts: sifive: fu540-c000: Drop bogus soc node compatible values
riscv: dts: sifive: Group tuples in register properties
riscv: dts: sifive: Group tuples in interrupt properties
riscv: dts: microchip: mpfs: Group tuples in interrupt properties
riscv: dts: microchip: mpfs: Fix clock controller node
riscv: dts: microchip: mpfs: Fix reference clock node
riscv: dts: microchip: mpfs: Fix PLIC node
riscv: dts: microchip: mpfs: Drop empty chosen node
riscv: dts: canaan: Group tuples in interrupt properties
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add new kconfig target 'make mod2noconfig', which will be useful to
speed up the build and test iteration.
- Raise the minimum supported version of LLVM to 11.0.0
- Refactor certs/Makefile
- Change the format of include/config/auto.conf to stop double-quoting
string type CONFIG options.
- Fix ARCH=sh builds in dash
- Separate compression macros for general purposes (cmd_bzip2 etc.) and
the ones for decompressors (cmd_bzip2_with_size etc.)
- Misc Makefile cleanups
* tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
kbuild: add cmd_file_size
arch: decompressor: remove useless vmlinux.bin.all-y
kbuild: rename cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}
kbuild: drop $(size_append) from cmd_zstd
sh: rename suffix-y to suffix_y
doc: kbuild: fix default in `imply` table
microblaze: use built-in function to get CPU_{MAJOR,MINOR,REV}
certs: move scripts/extract-cert to certs/
kbuild: do not quote string values in include/config/auto.conf
kbuild: do not include include/config/auto.conf from shell scripts
certs: simplify $(srctree)/ handling and remove config_filename macro
kbuild: stop using config_filename in scripts/Makefile.modsign
certs: remove misleading comments about GCC PR
certs: refactor file cleaning
certs: remove unneeded -I$(srctree) option for system_certificates.o
certs: unify duplicated cmd_extract_certs and improve the log
certs: use $< and $@ to simplify the key generation rule
kbuild: remove headers_check stub
kbuild: move headers_check.pl to usr/include/
certs: use if_changed to re-generate the key when the key type is changed
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:
- Some Kconfig changes resulted in BIG_KEYS being unselectable, which
Justin sent a patch to fix.
- Geert pointed out that moving to BLAKE2s bloated vmlinux on little
machines, like m68k, so we now compensate for this.
- Numerous style and house cleaning fixes, meant to have a cleaner base
for future changes.
* 'random-5.17-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: simplify arithmetic function flow in account()
random: selectively clang-format where it makes sense
random: access input_pool_data directly rather than through pointer
random: cleanup fractional entropy shift constants
random: prepend remaining pool constants with POOL_
random: de-duplicate INPUT_POOL constants
random: remove unused OUTPUT_POOL constants
random: rather than entropy_store abstraction, use global
random: remove unused extract_entropy() reserved argument
random: remove incomplete last_data logic
random: cleanup integer types
random: cleanup poolinfo abstraction
random: fix typo in comments
lib/crypto: sha1: re-roll loops to reduce code size
lib/crypto: blake2s: move hmac construction into wireguard
lib/crypto: add prompts back to crypto libraries
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull hwspinlock updates from Bjorn Andersson:
"This contains a change to the stm32 hwspinlock driver to ensure that
the hardware is operational even without CONFIG_PM"
* tag 'hwlock-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
hwspinlock: stm32: enable clock at probe
|
|
There's an unneeded and almost empty wireless section in MAINTAINERS, seems to
be leftovers from commit 0e324cf640fb ("MAINTAINERS: changes for wireless"). I
don't see any need for that so let's remove it.
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220117181958.3509-2-kvalo@kernel.org
|
|
For easier maintenance we have decided to create common wireless and
wireless-next trees for all wireless patches. Old mac80211 and wireless-drivers
trees will not be used anymore.
While at it, add a wiki link to wireless drivers section and a patchwork link
to 802.11, mac80211 and rfkill sections. Also use https in patchwork links.
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220117181958.3509-1-kvalo@kernel.org
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2022-01-19
We've added 12 non-merge commits during the last 8 day(s) which contain
a total of 12 files changed, 262 insertions(+), 64 deletions(-).
The main changes are:
1) Various verifier fixes mainly around register offset handling when
passed to helper functions, from Daniel Borkmann.
2) Fix XDP BPF link handling to assert program type,
from Toke Høiland-Jørgensen.
3) Fix regression in mount parameter handling for BPF fs,
from Yafang Shao.
4) Fix incorrect integer literal when marking scratched stack slots
in verifier, from Christy Lee.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, selftests: Add ringbuf memory type confusion test
bpf, selftests: Add various ringbuf tests with invalid offset
bpf: Fix ringbuf memory type confusion when passing to helpers
bpf: Fix out of bounds access for ringbuf helpers
bpf: Generally fix helper register offset check
bpf: Mark PTR_TO_FUNC register initially with zero offset
bpf: Generalize check_ctx_reg for reuse with other types
bpf: Fix incorrect integer literal used for marking scratched stack.
bpf/selftests: Add check for updating XDP bpf_link with wrong program type
bpf/selftests: convert xdp_link test to ASSERT_* macros
xdp: check prog type before updating BPF link
bpf: Fix mount source show for bpffs
====================
Link: https://lore.kernel.org/r/20220119011825.9082-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add two tests, one which asserts that ring buffer memory can be passed to
other helpers for populating its entry area, and another one where verifier
rejects different type of memory passed to bpf_ringbuf_submit().
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
|
|
Assert that the verifier is rejecting invalid offsets on the ringbuf entries:
# ./test_verifier | grep ring
#947/u ringbuf: invalid reservation offset 1 OK
#947/p ringbuf: invalid reservation offset 1 OK
#948/u ringbuf: invalid reservation offset 2 OK
#948/p ringbuf: invalid reservation offset 2 OK
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
|
|
The bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument, and thus both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
While the non-NULL memory from bpf_ringbuf_reserve() can be passed to other
helpers, the two sinks (bpf_ringbuf_submit(), bpf_ringbuf_discard()) right now
only enforce a register type of PTR_TO_MEM.
This can lead to potential type confusion since it would allow other PTR_TO_MEM
memory to be passed into the two sinks which did not come from bpf_ringbuf_reserve().
Add a new MEM_ALLOC composable type attribute for PTR_TO_MEM, and enforce that:
- bpf_ringbuf_reserve() returns NULL or PTR_TO_MEM | MEM_ALLOC
- bpf_ringbuf_submit() and bpf_ringbuf_discard() only take PTR_TO_MEM | MEM_ALLOC
but not plain PTR_TO_MEM arguments via ARG_PTR_TO_ALLOC_MEM
- however, other helpers might treat PTR_TO_MEM | MEM_ALLOC as plain PTR_TO_MEM
to populate the memory area when they use ARG_PTR_TO_{UNINIT_,}MEM in their
func proto description
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
|
|
Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument. They both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
Meaning, after a NULL check in the code, the verifier will promote the register
type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
latter could have an offset.
The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
or bpf_ringbuf_discard() but with the original offset given it will then read
out the struct bpf_ringbuf_hdr mapping.
The verifier missed to enforce a zero offset, so that out of bounds access
can be triggered which could be used to escalate privileges if unprivileged
BPF was enabled (disabled by default in kernel).
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: <tr3e.wang@gmail.com> (SecCoder Security Lab)
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
|
|
Right now the assertion on check_ptr_off_reg() is only enforced for register
types PTR_TO_CTX (and open coded also for PTR_TO_BTF_ID), however, this is
insufficient since many other PTR_TO_* register types such as PTR_TO_FUNC do
not handle/expect register offsets when passed to helper functions.
Given this can slip-through easily when adding new types, make this an explicit
allow-list and reject all other current and future types by default if this is
encountered.
Also, extend check_ptr_off_reg() to handle PTR_TO_BTF_ID as well instead of
duplicating it. For PTR_TO_BTF_ID, reg->off is used for BTF to match expected
BTF ids if struct offset is used. This part still needs to be allowed, but the
dynamic off from the tnum must be rejected.
Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
|
|
Similar as with other pointer types where we use ldimm64, clear the register
content to zero first, and then populate the PTR_TO_FUNC type and subprogno
number. Currently this is not done, and leads to reuse of stale register
tracking data.
Given for special ldimm64 cases we always clear the register offset, make it
common for all cases, so it won't be forgotten in future.
Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
|
|
Generalize the check_ctx_reg() helper function into a more generic named one
so that it can be reused for other register types as well to check whether
their offset is non-zero. No functional change.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
|
|
elf_mcount_loc and mcount_sort_thread definitions are not
initialized immediately within the function, which can cause
the judgment logic to use uninitialized values when the
initialization logic of subsequent code fails.
Link: https://lkml.kernel.org/r/20211212113358.34208-2-yinan@linux.alibaba.com
Link: https://lkml.kernel.org/r/20220118065241.42364-1-yinan@linux.alibaba.com
Fixes: 72b3942a173c ("scripts: ftrace - move the sort-processing in ftrace_init")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Yinan Liu <yinan@linux.alibaba.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
When under stress, cleanup_net() can have to dismantle
netns in big numbers. ops_exit_list() currently calls
many helpers [1] that have no schedule point, and we can
end up with soft lockups, particularly on hosts
with many cpus.
Even for moderate amount of netns processed by cleanup_net()
this patch avoids latency spikes.
[1] Some of these helpers like fib_sync_up() and fib_sync_down_dev()
are very slow because net/ipv4/fib_semantics.c uses host-wide hash tables,
and ifindex is used as the only input of two hash functions.
ifindexes tend to be the same for all netns (lo.ifindex==1 per instance)
This will be fixed in a separate patch.
Fixes: 72ad937abd0a ("net: Add support for batching network namespace cleanups")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that have_bytes is never modified, we can simplify this function.
First, we move the check for negative entropy_count to be first. That
ensures that subsequent reads of this will be non-negative. Then,
have_bytes and ibytes can be folded into their one use site in the
min_t() function.
Suggested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This is an old driver that has seen a lot of different eras of kernel
coding style. In an effort to make it easier to code for, unify the
coding style around the current norm, by accepting some of -- but
certainly not all of -- the suggestions from clang-format. This should
remove ambiguity in coding style, especially with regards to spacing,
when code is being changed or amended. Consequently it also makes code
review easier on the eyes, following one uniform style rather than
several.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This gets rid of another abstraction we no longer need. It would be nice
if we could instead make pool an array rather than a pointer, but the
latent entropy plugin won't be able to do its magic in that case. So
instead we put all accesses to the input pool's actual data through the
input_pool_data array directly.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The entropy estimator is calculated in terms of 1/8 bits, which means
there are various constants where things are shifted by 3. Move these
into our pool info enum with the other relevant constants. While we're
at it, move an English assertion about sizes into a proper BUILD_BUG_ON
so that the compiler can ensure this invariant.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The other pool constants are prepended with POOL_, but not these last
ones. Rename them. This will then let us move them into the enum in the
following commit.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
We already had the POOL_* constants, so deduplicate the older INPUT_POOL
ones. As well, fold EXTRACT_SIZE into the poolinfo enum, since it's
related.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
We no longer have an output pool. Rather, we have just a wakeup bits
threshold for /dev/random reads, presumably so that processes don't
hang. This value, random_write_wakeup_bits, is configurable anyway. So
all the no longer usefully named OUTPUT_POOL constants were doing was
setting a reasonable default for random_write_wakeup_bits. This commit
gets rid of the constants and just puts it all in the default value of
random_write_wakeup_bits.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Originally, the RNG used several pools, so having things abstracted out
over a generic entropy_store object made sense. These days, there's only
one input pool, and then an uneven mix of usage via the abstraction and
usage via &input_pool. Rather than this uneasy mixture, just get rid of
the abstraction entirely and have things always use the global. This
simplifies the code and makes reading it a bit easier.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This argument is always set to zero, as a result of us not caring about
keeping a certain amount reserved in the pool these days. So just remove
it and cleanup the function signatures.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
There were a few things added under the "if (fips_enabled)" banner,
which never really got completed, and the FIPS people anyway are
choosing a different direction. Rather than keep around this halfbaked
code, get rid of it so that we can focus on a single design of the RNG
rather than two designs.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Rather than using the userspace type, __uXX, switch to using uXX. And
rather than using variously chosen `char *` or `unsigned char *`, use
`u8 *` uniformly for things that aren't strings, in the case where we
are doing byte-by-byte traversal.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Now that we're only using one polynomial, we can cleanup its
representation into constants, instead of passing around pointers
dynamically to select different polynomials. This improves the codegen
and makes the code a bit more straightforward.
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
s/or/for
Signed-off-by: Schspa Shi <schspa@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
With SHA-1 no longer being used for anything performance oriented, and
also soon to be phased out entirely, we can make up for the space added
by unrolled BLAKE2s by simply re-rolling SHA-1. Since SHA-1 is so much
more complex, re-rolling it more or less takes care of the code size
added by BLAKE2s. And eventually, hopefully we'll see SHA-1 removed
entirely from most small kernel builds.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Basically nobody should use blake2s in an HMAC construction; it already
has a keyed variant. But unfortunately for historical reasons, Noise,
used by WireGuard, uses HKDF quite strictly, which means we have to use
this. Because this really shouldn't be used by others, this commit moves
it into wireguard's noise.c locally, so that kernels that aren't using
WireGuard don't get this superfluous code baked in. On m68k systems,
this shaves off ~314 bytes.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|