Age | Commit message (Collapse) | Author |
|
Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask"), a successful call to sched_setaffinity() should always save
the user requested cpu affinity mask in a task's user_cpus_ptr. However,
when the given cpu mask is the same as the current one, user_cpus_ptr
is not updated. Fix this by saving the user mask in this case too.
Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230203181849.221943-1-longman@redhat.com
|
|
Tetsuo-San noted that commit f5d39b020809 ("freezer,sched: Rewrite
core freezer logic") broke call_usermodehelper_exec() for the KILLABLE
case.
Specifically it was missed that the second, unconditional,
wait_for_completion() was not optional and ensures the on-stack
completion is unused before going out-of-scope.
Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
Reported-by: syzbot+6cd18e123583550cf469@syzkaller.appspotmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Debugged-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Y90ar35uKQoUrLEK@hirez.programming.kicks-ass.net
|
|
The only caller to get_kernel_pages() [shm_get_kernel_pages()] has been
updated to not need it.
Remove get_kernel_pages().
Cc: Mel Gorman <mgorman@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Andrew Morton <akpm@linux-foudation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
The kernel pages used by shm_get_kernel_pages() are allocated using
GFP_KERNEL through the following call stack:
trusted_instantiate()
trusted_payload_alloc() -> GFP_KERNEL
<trusted key op>
tee_shm_register_kernel_buf()
register_shm_helper()
shm_get_kernel_pages()
Where <trusted key op> is one of:
trusted_key_unseal()
trusted_key_get_random()
trusted_key_seal()
Because the pages can't be from highmem get_kernel_pages() boils down to
a get_page() call.
Remove the get_kernel_pages() call and open code the get_page().
In case a highmem page does slip through warn on once for a kmap'ed
address.
Cc: Jens Wiklander <jens.wiklander@linaro.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
The kernel pages used by shm_get_kernel_pages() are allocated using
GFP_KERNEL through the following call stack:
trusted_instantiate()
trusted_payload_alloc() -> GFP_KERNEL
<trusted key op>
tee_shm_register_kernel_buf()
register_shm_helper()
shm_get_kernel_pages()
Where <trusted key op> is one of:
trusted_key_unseal()
trusted_key_get_random()
trusted_key_seal()
Remove the vmalloc page support from shm_get_kernel_pages(). Replace
with a warn on once.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
is_kmap_addr() is only looking at the kmap() address range which may
cause check_heap_object() to miss checking an overflow on a
kmap_local_page() page.
Add a check for the kmap_local_page() address range to is_kmap_addr().
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Andrew Morton <akpm@linux-foudation.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
pci_msix_alloc_irq_at() and pci_msix_free_irq() are not declared when
CONFIG_PCI_MSI is disabled.
Users of these two calls do not yet exist but when users do appear (shown
below is an attempt to use the new API in vfio-pci) the following errors
will be encountered when compiling with CONFIG_PCI_MSI disabled:
drivers/vfio/pci/vfio_pci_intrs.c:461:4: error: implicit declaration of\
function 'pci_msix_free_irq' is invalid in C99\
[-Werror,-Wimplicit-function-declaration]
pci_msix_free_irq(pdev, msix_map);
^
drivers/vfio/pci/vfio_pci_intrs.c:511:15: error: implicit declaration of\
function 'pci_msix_alloc_irq_at' is invalid in C99\
[-Werror,-Wimplicit-function-declaration]
msix_map = pci_msix_alloc_irq_at(pdev, vector, NULL);
Provide definitions for pci_msix_alloc_irq_at() and pci_msix_free_irq() in
preparation for users that need to compile when CONFIG_PCI_MSI is
disabled.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 34026364df8e ("PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/158e40e1cfcfc58ae30ecb2bbfaf86e5bba7a1ef.1675978686.git.reinette.chatre@intel.com
|
|
In bnxt_reserve_rings(), there is logic to check that the number of TX
rings reserved is enough to cover all the mqprio TCs, but it fails to
account for the TX XDP rings. So the check will always fail if there
are mqprio TCs and TX XDP rings. As a result, the driver always fails
to initialize after the XDP program is attached and the device will be
brought down. A subsequent ifconfig up will also fail because the
number of TX rings is set to an inconsistent number. Fix the check to
properly account for TX XDP rings. If the check fails, set the number
of TX rings back to a consistent number after calling netdev_reset_tc().
Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.")
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When converting net_device_stats to rtnl_link_stats64 sign extension
is triggered on ILP32 machines as 6c1c509778 changed the previous
"ulong -> u64" conversion to "long -> u64" by accessing the
net_device_stats fields through a (signed) atomic_long_t.
This causes for example the received bytes counter to jump to 16EiB after
having received 2^31 bytes. Casting the atomic value to "unsigned long"
beforehand converting it into u64 avoids this.
Fixes: 6c1c5097781f ("net: add atomic_long_t to net_device_stats fields")
Signed-off-by: Felix Riemann <felix.riemann@sma.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot reported that act_len in kalmia_send_init_packet() is
uninitialized when passing it to the first usb_bulk_msg error path. Jiri
Pirko noted that it's pointless to pass it in the error path, and that
the value that would be printed in the second error path would be the
value of act_len from the first call to usb_bulk_msg.[1]
With this in mind, let's just not pass act_len to the usb_bulk_msg error
paths.
1: https://lore.kernel.org/lkml/Y9pY61y1nwTuzMOa@nanopsycho/
Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730")
Reported-and-tested-by: syzbot+cd80c5ef5121bfe85b55@syzkaller.appspotmail.com
Signed-off-by: Miko Larsson <mikoxyzzz@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
old_meter needs to be free after it is detached regardless of whether
the new meter is successfully attached.
Fixes: c7c4c44c9a95 ("net: openvswitch: expand the meters supported number")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since x->encap of pfkey_msg2xfrm_state() is not
initialized to 0, kernel heap data can be leaked.
Fix with kzalloc() to prevent this.
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some TPM 2.0 devices have support for additional commands which are not
part of the TPM 2.0 specifications.
These commands are identified with bit 29 of the 32 bits command codes.
Contrarily to other fields of the TPMA_CC spec structure used to list
available commands, the Vendor flag also has to be present in the
command code itself (TPM_CC) when called.
Add this flag to tpm_find_cc() mask to prevent blocking vendor command
codes that can actually be supported by the underlying TPM device.
Signed-off-by: Julien Gomes <julien@arista.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Some platforms may desire to pass the event log up to Linux in the
form of a reserved memory region. In particular, this is desirable
for embedded systems or baseboard management controllers (BMCs)
booting with U-Boot. IBM OpenBMC BMCs will be the first user.
Add support for the reserved memory in the TPM core to find the
region and map it.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
[jarkko: removed spurious dev_info()'s from tpm_read_log_memory_region()]
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
[yang: return -ENOMEM when devm_memremap() fails]
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Since the bios event log is freed in the device release function,
let devres handle the deallocation. This will allow other memory
allocation/mapping functions to be used for the bios event log.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
.probe_new() doesn't get the i2c_device_id * parameter, so determine
that explicitly in the probe function.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
When support for ECDSA keys was added, constraints for data & signature
sizes were never updated. This makes it impossible to use such keys via
keyctl API from userspace.
Update constraint on max_data_size to 64 bytes in order to support
SHA512-based signatures. Also update the signature length constraints
per ECDSA signature encoding described in RFC 5480.
Fixes: 299f561a6693 ("x509: Add support for parsing x509 certs with ECDSA keys")
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
When the same key is blacklisted repeatedly logging at pr_err() level is
excessive as no functionality is impaired.
When these duplicates are provided by buggy firmware there is nothing
the user can do to fix the situation.
Instead of spamming the bootlog with errors we use a warning that can
still be seen by OEMs when testing their firmware.
Link: https://lore.kernel.org/all/c8c65713-5cda-43ad-8018-20f2e32e4432@t-8ch.de/
Link: https://lore.kernel.org/all/20221104014704.3469-1-linux@weissschuh.net/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
key_create() works like key_create_or_update() but does not allow
updating an existing key, instead returning ERR_PTR(-EEXIST).
key_create() will be used by the blacklist keyring which should not
create duplicate entries or update existing entries.
Instead a dedicated message with appropriate severity will be logged.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
One common situation triggering this log statement are duplicate hashes
reported by the system firmware.
These duplicates should be removed from the firmware.
Without logging the blacklisted hash triggering the issue however the users
can not report it properly to the firmware vendors and the firmware vendors
can not easily see which specific hash is duplicated.
While changing the log message also use the dedicated ERR_PTR format
placeholder for the returned error value.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Pluton is an integrated security processor present in some recent Ryzen
parts. If it's enabled, it presents two devices - an MSFT0101 ACPI device
that's broadly an implementation of a Command Response Buffer TPM2, and an
MSFT0200 ACPI device whose functionality I haven't examined in detail yet.
This patch only attempts to add support for the TPM device.
There's a few things that need to be handled here. The first is that the
TPM2 ACPI table uses a previously undefined start method identifier. The
table format appears to include 16 bytes of startup data, which corresponds
to one 64-bit address for a start message and one 64-bit address for a
completion response. The second is that the ACPI tables on the Thinkpad Z13
I'm testing this on don't define any memory windows in _CRS (or, more
accurately, there are two empty memory windows). This check doesn't seem
strictly necessary, so I've skipped that.
Finally, it seems like chip needs to be explicitly asked to transition into
ready status on every command. Failing to do this means that if two
commands are sent in succession without an idle/ready transition in
between, everything will appear to work fine but the response is simply the
original command. I'm working without any docs here, so I'm not sure if
this is actually the required behaviour or if I'm missing something
somewhere else, but doing this results in the chip working reliably.
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
GPIO2 PIN use for output. Mask Dir and Data need to assign for 0x4. Not 0x3.
This fixed was for Lenovo Desktop(0x17aa1056). GPIO2 use for AMP enable.
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/8d02bb9ac8134f878cd08607fdf088fd@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The selftest code is built into the x509_key_parser module, and depends
on the pkcs7_message_parser module, which in turn has a dependency on
the key parser, creating a dependency loop and a resulting link
failure when the pkcs7 code is a loadable module:
ld: crypto/asymmetric_keys/selftest.o: in function `fips_signature_selftest':
crypto/asymmetric_keys/selftest.c:205: undefined reference to `pkcs7_parse_message'
ld: crypto/asymmetric_keys/selftest.c:209: undefined reference to `pkcs7_supply_detached_data'
ld: crypto/asymmetric_keys/selftest.c:211: undefined reference to `pkcs7_verify'
ld: crypto/asymmetric_keys/selftest.c:215: undefined reference to `pkcs7_validate_trust'
ld: crypto/asymmetric_keys/selftest.c:219: undefined reference to `pkcs7_free_message'
Avoid this by only allowing the selftest to be enabled when either
both parts are loadable modules, or both are built-in.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Yet another device which needs a quirk:
nvme nvme1: globally duplicate IDs for nsid 1
nvme nvme1: VID:DID 10ec:5763 model:ADATA SX6000PNP firmware:V9002s94
Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1207827
Reported-by: Gustavo Freitas <freitasmgustavo@gmail.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
trace_define_field_ext() is not used outside of trace_events.c, it should
be static.
Link: https://lore.kernel.org/oe-kbuild-all/202302130750.679RaRog-lkp@intel.com/
Fixes: b6c7abd1c28a ("tracing: Fix TASK_COMM_LEN in trace event format file")
Reported-by: Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
|
|
Both Rich Felker and Yoshinori Sato haven't done any work on arch/sh
for a while. As I have been maintaining Debian's sh4 port since 2014,
I am interested to keep the architecture alive.
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
"Fix showing of TASK_COMM_LEN instead of its value
The TASK_COMM_LEN was converted from a macro into an enum so that BTF
would have access to it. But this unfortunately caused TASK_COMM_LEN
to display in the format fields of trace events, as they are created
by the TRACE_EVENT() macro and such, macros convert to their values,
where as enums do not.
To handle this, instead of using the field itself to be display, save
the value of the array size as another field in the trace_event_fields
structure, and use that instead.
Not only does this fix the issue, but also converts the other trace
events that have this same problem (but were not breaking tooling).
With this change, the original work around b3bc8547d3be6 ("tracing:
Have TRACE_DEFINE_ENUM affect trace event types as well") could be
reverted (but that should be done in the merge window)"
* tag 'trace-v6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix TASK_COMM_LEN in trace event format file
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- one more fix for a tree-log 'write time corruption' report, update
the last dir index directly and don't keep in the log context
- do VFS-level inode lock around FIEMAP to prevent a deadlock with
concurrent fsync, the extent-level lock is not sufficient
- don't cache a single-device filesystem device to avoid cases when a
loop device is reformatted and the entry gets stale
* tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: free device in btrfs_close_devices for a single device filesystem
btrfs: lock the inode in shared mode before starting fiemap
btrfs: simplify update of last_dir_index_offset when logging a directory
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are 2 small USB driver fixes that resolve some reported
regressions and one new device quirk. Specifically these are:
- new quirk for Alcor Link AK9563 smartcard reader
- revert of u_ether gadget change in 6.2-rc1 that caused problems
- typec pin probe fix
All of these have been in linux-next with no reported problems"
* tag 'usb-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: core: add quirk for Alcor Link AK9563 smartcard reader
usb: typec: altmodes/displayport: Fix probe pin assign check
Revert "usb: gadget: u_ether: Do not make UDC parent of the net device"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
"A fix from Darren to widen the SMBIOS match for detecting Ampere Altra
machines with problematic firmware. In the mean time, we are working
on a more precise check, but this is still work in progress"
* tag 'efi-fixes-for-v6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
arm64: efi: Force the use of SetVirtualAddressMap() on eMAG and Altra Max machines
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix interrupt exit race with security mitigation switching.
- Don't select ARCH_WANTS_NO_INSTR until warnings are fixed.
- Build fix for CONFIG_NUMA=n.
Thanks to Nicholas Piggin, Randy Dunlap, and Sachin Sant.
* tag 'powerpc-6.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switch
powerpc/kexec_file: fix implicit decl error
powerpc: Don't select ARCH_WANTS_NO_INSTR
|
|
When we upgraded our kernel, we started seeing some page corruption like
the following consistently:
BUG: Bad page state in process ganesha.nfsd pfn:1304ca
page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x1304ca
flags: 0x17ffffc0000000()
raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000
raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P B O 5.10.158-1.nutanix.20221209.el7.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
Call Trace:
dump_stack+0x74/0x96
bad_page.cold+0x63/0x94
check_new_page_bad+0x6d/0x80
rmqueue+0x46e/0x970
get_page_from_freelist+0xcb/0x3f0
? _cond_resched+0x19/0x40
__alloc_pages_nodemask+0x164/0x300
alloc_pages_current+0x87/0xf0
skb_page_frag_refill+0x84/0x110
...
Sometimes, it would also show up as corruption in the free list pointer
and cause crashes.
After bisecting the issue, we found the issue started from commit
e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages"):
if (put_page_testzero(page))
free_the_page(page, order);
else if (!PageHead(page))
while (order-- > 0)
free_the_page(page + (1 << order), order);
So the problem is the check PageHead is racy because at this point we
already dropped our reference to the page. So even if we came in with
compound page, the page can already be freed and PageHead can return
false and we will end up freeing all the tail pages causing double free.
Fixes: e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages")
Link: https://lore.kernel.org/lkml/BYAPR02MB448855960A9656EEA81141FC94D99@BYAPR02MB4488.namprd02.prod.outlook.com/
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
After commit 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN"),
the content of the format file under
/sys/kernel/tracing/events/task/task_newtask was changed from
field:char comm[16]; offset:12; size:16; signed:0;
to
field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0;
John reported that this change breaks older versions of perfetto.
Then Mathieu pointed out that this behavioral change was caused by the
use of __stringify(_len), which happens to work on macros, but not on enum
labels. And he also gave the suggestion on how to fix it:
:One possible solution to make this more robust would be to extend
:struct trace_event_fields with one more field that indicates the length
:of an array as an actual integer, without storing it in its stringified
:form in the type, and do the formatting in f_show where it belongs.
The result as follows after this change,
$ cat /sys/kernel/tracing/events/task/task_newtask/format
field:char comm[16]; offset:12; size:16; signed:0;
Link: https://lore.kernel.org/lkml/Y+QaZtz55LIirsUO@google.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230210155921.4610-1-laoar.shao@gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230212151303.12353-1-laoar.shao@gmail.com
Cc: stable@vger.kernel.org
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kajetan Puchalski <kajetan.puchalski@arm.com>
CC: Qais Yousef <qyousef@layalina.io>
Fixes: 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN")
Reported-by: John Stultz <jstultz@google.com>
Debugged-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of hopefully final fixes for spi: one driver specific fix for
an issue with very large transfers and a fix for an issue with the
locking fixes in spidev merged earlier this release cycle which was
missed"
* tag 'spi-fix-v6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spidev: fix a recursive locking error
spi: dw: Fix wrong FIFO level setting for long xfers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Fix a kprobes bug, plus add a new Intel model number to the upstream
<asm/intel-family.h> header for drivers to use"
* tag 'x86-urgent-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Add Lunar Lake M
x86/kprobes: Fix 1 byte conditional jump target
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Ingo Molnar:
"Fix an rtmutex missed-wakeup bug"
* tag 'locking-urgent-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rtmutex: Ensure that the top waiter is always woken up
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fixes from Dan Williams:
"Two fixups for CXL (Compute Express Link) in presence of passthrough
decoders.
This primarily helps developers using the QEMU CXL emulation, but with
the impending arrival of CXL switches these types of topologies will
be of interest to end users.
- Fix a crash when shutting down regions in the presence of
passthrough decoders
- Fix region creation to understand passthrough decoders instead of
the narrower definition of passthrough ports"
* tag 'cxl-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/region: Fix passthrough-decoder detection
cxl/region: Fix null pointer dereference for resetting decoder
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A fix for an issue that could causes users to inadvertantly reserve
too much capacity when debugging the KMSAN and persistent memory
namespace, a lockdep fix, and a kernel-doc build warning:
- Resolve the conflict between KMSAN and NVDIMM with respect to
reserving pmem namespace / volume capacity for larger sizeof(struct
page)
- Fix a lockdep warning in the the NFIT code
- Fix a kernel-doc build warning"
* tag 'libnvdimm-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE
ACPI: NFIT: fix a potential deadlock during NFIT teardown
dax: super.c: fix kernel-doc bad line warning
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock revert from Mike Rapoport:
"Revert 'mm: Always release pages to the buddy allocator in
memblock_free_late()'
The pages being freed by memblock_free_late() have already been
initialized, but if they are in the deferred init range,
__free_one_page() might access nearby uninitialized pages when trying
to coalesce buddies, which will cause a crash.
A proper fix will be more involved so revert this change for the time
being"
* tag 'fixes-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
|
|
The nfs4_file table is global, so shutting it down when a containerized
nfsd is shut down is wrong and can lead to double-frees. Tear down the
nfs4_file_rhltable in nfs4_state_shutdown instead of
nfs4_state_shutdown_net.
Fixes: d47b295e8d76 ("NFSD: Use rhashtable for managing nfs4_file objects")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017
Reported-by: JianHong Yin <jiyin@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Commit f2bd1c5ae2cb ("ALSA: hda: Fix page fault in
snd_hda_codec_shutdown()") relocated initialization of several codec
device fields. Due to differences between codec_exec_verb() and
snd_hdac_bus_exec_bus() in how they handle VERB execution - the latter
does not touch PM - assigning ->exec_verb to codec_exec_verb() causes PM
to be engaged before it is configured for the device. Configuration of
PM for the ASoC HDAudio sound card is done with snd_hda_set_power_save()
during skl_hda_audio_probe() whereas the assignment happens early, in
snd_hda_codec_device_init().
Revert to previous behavior to avoid problems caused by too early PM
manipulation.
Suggested-by: Jason Montleon <jmontleo@redhat.com>
Link: https://lore.kernel.org/regressions/CALFERdzKUodLsm6=Ub3g2+PxpNpPtPq3bGBLbff=eZr9_S=YVA@mail.gmail.com
Fixes: f2bd1c5ae2cb ("ALSA: hda: Fix page fault in snd_hda_codec_shutdown()")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/r/20230210165541.3543604-1-cezary.rojewski@intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Kuniyuki Iwashima says:
====================
sk->sk_forward_alloc fixes.
The first patch fixes a negative sk_forward_alloc by adding
sk_rmem_schedule() before skb_set_owner_r(), and second patch
removes an unnecessary WARN_ON_ONCE().
v2: https://lore.kernel.org/netdev/20230209013329.87879-1-kuniyu@amazon.com/
v1: https://lore.kernel.org/netdev/20230207183718.54520-1-kuniyu@amazon.com/
====================
Link: https://lore.kernel.org/r/20230210002202.81442-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Christoph Paasch reported that commit b5fc29233d28 ("inet6: Remove
inet6_destroy_sock() in sk->sk_prot->destroy().") started triggering
WARN_ON_ONCE(sk->sk_forward_alloc) in sk_stream_kill_queues(). [0 - 2]
Also, we can reproduce it by a program in [3].
In the commit, we delay freeing ipv6_pinfo.pktoptions from sk->destroy()
to sk->sk_destruct(), so sk->sk_forward_alloc is no longer zero in
inet_csk_destroy_sock().
The same check has been in inet_sock_destruct() from at least v2.6,
we can just remove the WARN_ON_ONCE(). However, among the users of
sk_stream_kill_queues(), only CAIF is not calling inet_sock_destruct().
Thus, we add the same WARN_ON_ONCE() to caif_sock_destructor().
[0]: https://lore.kernel.org/netdev/39725AB4-88F1-41B3-B07F-949C5CAEFF4F@icloud.com/
[1]: https://github.com/multipath-tcp/mptcp_net-next/issues/341
[2]:
WARNING: CPU: 0 PID: 3232 at net/core/stream.c:212 sk_stream_kill_queues+0x2f9/0x3e0
Modules linked in:
CPU: 0 PID: 3232 Comm: syz-executor.0 Not tainted 6.2.0-rc5ab24eb4698afbe147b424149c529e2a43ec24eb5 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:sk_stream_kill_queues+0x2f9/0x3e0
Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ec 00 00 00 8b ab 08 01 00 00 e9 60 ff ff ff e8 d0 5f b6 fe 0f 0b eb 97 e8 c7 5f b6 fe <0f> 0b eb a0 e8 be 5f b6 fe 0f 0b e9 6a fe ff ff e8 02 07 e3 fe e9
RSP: 0018:ffff88810570fc68 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888101f38f40 RSI: ffffffff8285e529 RDI: 0000000000000005
RBP: 0000000000000ce0 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000ce0 R11: 0000000000000001 R12: ffff8881009e9488
R13: ffffffff84af2cc0 R14: 0000000000000000 R15: ffff8881009e9458
FS: 00007f7fdfbd5800(0000) GS:ffff88811b600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32923000 CR3: 00000001062fc006 CR4: 0000000000170ef0
Call Trace:
<TASK>
inet_csk_destroy_sock+0x1a1/0x320
__tcp_close+0xab6/0xe90
tcp_close+0x30/0xc0
inet_release+0xe9/0x1f0
inet6_release+0x4c/0x70
__sock_release+0xd2/0x280
sock_close+0x15/0x20
__fput+0x252/0xa20
task_work_run+0x169/0x250
exit_to_user_mode_prepare+0x113/0x120
syscall_exit_to_user_mode+0x1d/0x40
do_syscall_64+0x48/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f7fdf7ae28d
Code: c1 20 00 00 75 10 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee fb ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 37 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00000000007dfbb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f7fdf7ae28d
RDX: 0000000000000000 RSI: ffffffffffffffff RDI: 0000000000000003
RBP: 0000000000000000 R08: 000000007f338e0f R09: 0000000000000e0f
R10: 000000007f338e13 R11: 0000000000000293 R12: 00007f7fdefff000
R13: 00007f7fdefffcd8 R14: 00007f7fdefffce0 R15: 00007f7fdefffcd8
</TASK>
[3]: https://lore.kernel.org/netdev/20230208004245.83497-1-kuniyu@amazon.com/
Fixes: b5fc29233d28 ("inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy().")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Christoph Paasch <christophpaasch@icloud.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet pointed out [0] that when we call skb_set_owner_r()
for ipv6_pinfo.pktoptions, sk_rmem_schedule() has not been called,
resulting in a negative sk_forward_alloc.
We add a new helper which clones a skb and sets its owner only
when sk_rmem_schedule() succeeds.
Note that we move skb_set_owner_r() forward in (dccp|tcp)_v6_do_rcv()
because tcp_send_synack() can make sk_forward_alloc negative before
ipv6_opt_accepted() in the crossed SYN-ACK or self-connect() cases.
[0]: https://lore.kernel.org/netdev/CANn89iK9oc20Jdi_41jb9URdF210r7d1Y-+uypbMSbOfY6jqrg@mail.gmail.com/
Fixes: 323fbd0edf3f ("net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv()")
Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The result of nlmsg_find_attr() 'br_spec' is dereferenced in
nla_for_each_nested(), but it can take NULL value in nla_find() function,
which will result in an error.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 51616018dd1b ("i40e: Add support for getlink, setlink ndo ops")
Signed-off-by: Natalia Petrova <n.petrova@fintech.ru>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230209172833.3596034-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|