summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-05drivers/base/memory: fix trying offlining memory blocks with memory holes on ↵David Hildenbrand
aarch64 offline_pages() properly checks for memory holes and bails out. However, we do a page_zone(pfn_to_page(start_pfn)) before calling offline_pages() when offlining a memory block. We should not unconditionally call page_zone(pfn_to_page(start_pfn)) on aarch64 in offlining code, otherwise we can trigger a BUG when hitting a memory hole: kernel BUG at include/linux/mm.h:1383! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: loop processor efivarfs ip_tables x_tables ext4 mbcache jbd2 dm_mod igb nvme i2c_algo_bit mlx5_core i2c_core nvme_core firmware_class CPU: 13 PID: 1694 Comm: ranbug Not tainted 5.12.0-next-20210524+ #4 Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020 pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) pc : memory_subsys_offline+0x1f8/0x250 lr : memory_subsys_offline+0x1f8/0x250 Call trace: memory_subsys_offline+0x1f8/0x250 device_offline+0x154/0x1d8 online_store+0xa4/0x118 dev_attr_store+0x44/0x78 sysfs_kf_write+0xe8/0x138 kernfs_fop_write_iter+0x26c/0x3d0 new_sync_write+0x2bc/0x4f8 vfs_write+0x718/0xc88 ksys_write+0xf8/0x1e0 __arm64_sys_write+0x74/0xa8 invoke_syscall.constprop.0+0x78/0x1e8 do_el0_svc+0xe4/0x298 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb8 el0_sync+0x178/0x180 Kernel panic - not syncing: Oops - BUG: Fatal exception SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x00000251,20000846 Memory Limit: none If nr_vmemmap_pages is set, we know that we are dealing with hotplugged memory that doesn't have any holes. So call page_zone(pfn_to_page(start_pfn)) only when really necessary -- when nr_vmemmap_pages is set and we actually adjust the present pages. Link: https://lkml.kernel.org/r/20210526075226.5572-1-david@redhat.com Fixes: a08a2ae34613 ("mm,memory_hotplug: allocate memmap from the added memory range") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Qian Cai (QUIC) <quic_qiancai@quicinc.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mike Rapoport <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-05mm/page_alloc: fix counting of free pages after take off from buddyDing Hui
Recently we found that there is a lot MemFree left in /proc/meminfo after do a lot of pages soft offline, it's not quite correct. Before Oscar's rework of soft offline for free pages [1], if we soft offline free pages, these pages are left in buddy with HWPoison flag, and NR_FREE_PAGES is not updated immediately. So the difference between NR_FREE_PAGES and real number of available free pages is also even big at the beginning. However, with the workload running, when we catch HWPoison page in any alloc functions subsequently, we will remove it from buddy, meanwhile update the NR_FREE_PAGES and try again, so the NR_FREE_PAGES will get more and more closer to the real number of available free pages. (regardless of unpoison_memory()) Now, for offline free pages, after a successful call take_page_off_buddy(), the page is no longer belong to buddy allocator, and will not be used any more, but we missed accounting NR_FREE_PAGES in this situation, and there is no chance to be updated later. Do update in take_page_off_buddy() like rmqueue() does, but avoid double counting if some one already set_migratetype_isolate() on the page. [1]: commit 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages") Link: https://lkml.kernel.org/r/20210526075247.11130-1-dinghui@sangfor.com.cn Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-05mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests()Gerald Schaefer
In pmd/pud_advanced_tests(), the vaddr is aligned up to the next pmd/pud entry, and so it does not match the given pmdp/pudp and (aligned down) pfn any more. For s390, this results in memory corruption, because the IDTE instruction used e.g. in xxx_get_and_clear() will take the vaddr for some calculations, in combination with the given pmdp. It will then end up with a wrong table origin, ending on ...ff8, and some of those wrongly set low-order bits will also select a wrong pagetable level for the index addition. IDTE could therefore invalidate (or 0x20) something outside of the page tables, depending on the wrongly picked index, which in turn depends on the random vaddr. As result, we sometimes see "BUG task_struct (Not tainted): Padding overwritten" on s390, where one 0x5a padding value got overwritten with 0x7a. Fix this by aligning down, similar to how the pmd/pud_aligned pfns are calculated. Link: https://lkml.kernel.org/r/20210525130043.186290-2-gerald.schaefer@linux.ibm.com Fixes: a5c3b9ffb0f40 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers") Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: <stable@vger.kernel.org> [5.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-05pid: take a reference when initializing `cad_pid`Mark Rutland
During boot, kernel_init_freeable() initializes `cad_pid` to the init task's struct pid. Later on, we may change `cad_pid` via a sysctl, and when this happens proc_do_cad_pid() will increment the refcount on the new pid via get_pid(), and will decrement the refcount on the old pid via put_pid(). As we never called get_pid() when we initialized `cad_pid`, we decrement a reference we never incremented, can therefore free the init task's struct pid early. As there can be dangling references to the struct pid, we can later encounter a use-after-free (e.g. when delivering signals). This was spotted when fuzzing v5.13-rc3 with Syzkaller, but seems to have been around since the conversion of `cad_pid` to struct pid in commit 9ec52099e4b8 ("[PATCH] replace cad_pid by a struct pid") from the pre-KASAN stone age of v2.6.19. Fix this by getting a reference to the init task's struct pid when we assign it to `cad_pid`. Full KASAN splat below. ================================================================== BUG: KASAN: use-after-free in ns_of_pid include/linux/pid.h:153 [inline] BUG: KASAN: use-after-free in task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509 Read of size 4 at addr ffff23794dda0004 by task syz-executor.0/273 CPU: 1 PID: 273 Comm: syz-executor.0 Not tainted 5.12.0-00001-g9aef892b2d15 #1 Hardware name: linux,dummy-virt (DT) Call trace: ns_of_pid include/linux/pid.h:153 [inline] task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509 do_notify_parent+0x308/0xe60 kernel/signal.c:1950 exit_notify kernel/exit.c:682 [inline] do_exit+0x2334/0x2bd0 kernel/exit.c:845 do_group_exit+0x108/0x2c8 kernel/exit.c:922 get_signal+0x4e4/0x2a88 kernel/signal.c:2781 do_signal arch/arm64/kernel/signal.c:882 [inline] do_notify_resume+0x300/0x970 arch/arm64/kernel/signal.c:936 work_pending+0xc/0x2dc Allocated by task 0: slab_post_alloc_hook+0x50/0x5c0 mm/slab.h:516 slab_alloc_node mm/slub.c:2907 [inline] slab_alloc mm/slub.c:2915 [inline] kmem_cache_alloc+0x1f4/0x4c0 mm/slub.c:2920 alloc_pid+0xdc/0xc00 kernel/pid.c:180 copy_process+0x2794/0x5e18 kernel/fork.c:2129 kernel_clone+0x194/0x13c8 kernel/fork.c:2500 kernel_thread+0xd4/0x110 kernel/fork.c:2552 rest_init+0x44/0x4a0 init/main.c:687 arch_call_rest_init+0x1c/0x28 start_kernel+0x520/0x554 init/main.c:1064 0x0 Freed by task 270: slab_free_hook mm/slub.c:1562 [inline] slab_free_freelist_hook+0x98/0x260 mm/slub.c:1600 slab_free mm/slub.c:3161 [inline] kmem_cache_free+0x224/0x8e0 mm/slub.c:3177 put_pid.part.4+0xe0/0x1a8 kernel/pid.c:114 put_pid+0x30/0x48 kernel/pid.c:109 proc_do_cad_pid+0x190/0x1b0 kernel/sysctl.c:1401 proc_sys_call_handler+0x338/0x4b0 fs/proc/proc_sysctl.c:591 proc_sys_write+0x34/0x48 fs/proc/proc_sysctl.c:617 call_write_iter include/linux/fs.h:1977 [inline] new_sync_write+0x3ac/0x510 fs/read_write.c:518 vfs_write fs/read_write.c:605 [inline] vfs_write+0x9c4/0x1018 fs/read_write.c:585 ksys_write+0x124/0x240 fs/read_write.c:658 __do_sys_write fs/read_write.c:670 [inline] __se_sys_write fs/read_write.c:667 [inline] __arm64_sys_write+0x78/0xb0 fs/read_write.c:667 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall arch/arm64/kernel/syscall.c:49 [inline] el0_svc_common.constprop.1+0x16c/0x388 arch/arm64/kernel/syscall.c:129 do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:168 el0_svc+0x28/0x38 arch/arm64/kernel/entry-common.c:416 el0_sync_handler+0x134/0x180 arch/arm64/kernel/entry-common.c:432 el0_sync+0x154/0x180 arch/arm64/kernel/entry.S:701 The buggy address belongs to the object at ffff23794dda0000 which belongs to the cache pid of size 224 The buggy address is located 4 bytes inside of 224-byte region [ffff23794dda0000, ffff23794dda00e0) The buggy address belongs to the page: page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4dda0 head:(____ptrval____) order:1 compound_mapcount:0 flags: 0x3fffc0000010200(slab|head) raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff23794d40d080 raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff23794dd9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff23794dd9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff23794dda0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff23794dda0080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff23794dda0100: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 ================================================================== Link: https://lkml.kernel.org/r/20210524172230.38715-1-mark.rutland@arm.com Fixes: 9ec52099e4b8678a ("[PATCH] replace cad_pid by a struct pid") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Christian Brauner <christian@brauner.io> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-05kfence: use TASK_IDLE when awaiting allocationMarco Elver
Since wait_event() uses TASK_UNINTERRUPTIBLE by default, waiting for an allocation counts towards load. However, for KFENCE, this does not make any sense, since there is no busy work we're awaiting. Instead, use TASK_IDLE via wait_event_idle() to not count towards load. BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1185565 Link: https://lkml.kernel.org/r/20210521083209.3740269-1-elver@google.com Fixes: 407f1d8c1b5f ("kfence: await for allocation using wait_event") Signed-off-by: Marco Elver <elver@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Hillf Danton <hdanton@sina.com> Cc: <stable@vger.kernel.org> [5.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-05Revert "MIPS: make userspace mapping young by default"Thomas Bogendoerfer
This reverts commit f685a533a7fab35c5d069dcd663f59c8e4171a75. The MIPS cache flush logic needs to know whether the mapping was already established to decide how to flush caches. This is done by checking the valid bit in the PTE. The commit above breaks this logic by setting the valid in the PTE in new mappings, which causes kernel crashes. Link: https://lkml.kernel.org/r/20210526094335.92948-1-tsbogend@alpha.franken.de Fixes: f685a533a7f ("MIPS: make userspace mapping young by default") Reported-by: Zhou Yanjie <zhouyanjie@wanyeetech.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Huang Pei <huangpei@loongson.cn> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-05ALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun()Takashi Sakamoto
In the workqueue to queue wake-up event, isochronous context is not processed, thus it's useless to check context for the workqueue to switch status of runtime for PCM substream to XRUN. On the other hand, in software IRQ context of 1394 OHCI, it's needed. This commit fixes the bug introduced when tasklet was replaced with workqueue. Cc: <stable@vger.kernel.org> Fixes: 2b3d2987d800 ("ALSA: firewire: Replace tasklet with work") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20210605091054.68866-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-05ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8Jeremy Szu
The HP EliteBook 840 Aero G8 using ALC285 codec which using 0x04 to control mute LED and 0x01 to control micmute LED. In the other hand, there is no output from right channel of speaker. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210605082539.41797-3-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-05ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 ↵Jeremy Szu
1040 G8 The HP EliteBook x360 1040 G8 using ALC285 codec which using 0x04 to control mute LED and 0x01 to control micmute LED. In the other hand, there is no output from right channel of speaker. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210605082539.41797-2-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-05ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2Jeremy Szu
The HP Elite Dragonfly G2 using ALC285 codec which using 0x04 to control mute LED and 0x01 to control micmute LED. In the other hand, there is no output from right channel of speaker. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210605082539.41797-1-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-04Merge tag 'net-5.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from bpf, wireless, netfilter and wireguard trees. The bpf vs lockdown+audit fix is the most notable. Things haven't slowed down just yet, both in terms of regressions in current release and largish fixes for older code, but we usually see a slowdown only after -rc5. Current release - regressions: - virtio-net: fix page faults and crashes when XDP is enabled - mlx5e: fix HW timestamping with CQE compression, and make sure they are only allowed to coexist with capable devices - stmmac: - fix kernel panic due to NULL pointer dereference of mdio_bus_data - fix double clk unprepare when no PHY device is connected Current release - new code bugs: - mt76: a few fixes for the recent MT7921 devices and runtime power management Previous releases - regressions: - ice: - track AF_XDP ZC enabled queues in bitmap to fix copy mode Tx - fix allowing VF to request more/less queues via virtchnl - correct supported and advertised autoneg by using PHY capabilities - allow all LLDP packets from PF to Tx - kbuild: quote OBJCOPY var to avoid a pahole call break the build Previous releases - always broken: - bpf, lockdown, audit: fix buggy SELinux lockdown permission checks - mt76: address the recent FragAttack vulnerabilities not covered by generic fixes - ipv6: fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions - Bluetooth: - fix the erroneous flush_work() order, to avoid double free - use correct lock to prevent UAF of hdev object - nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect - ieee802154: multiple fixes to error checking and return values - igb: fix XDP with PTP enabled - intel: add correct exception tracing for XDP - tls: fix use-after-free when TLS offload device goes down and back up - ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service - netfilter: nft_ct: skip expectations for confirmed conntrack - mptcp: fix falling back to TCP in presence of out of order packets early in connection lifetime - wireguard: switch from O(n) to a O(1) algorithm for maintaining peers, fixing stalls and a large memory leak in the process Misc: - devlink: correct VIRTUAL port to not have phys_port attributes - Bluetooth: fix VIRTIO_ID_BT assigned number - net: return the correct errno code ENOBUF -> ENOMEM - wireguard: - peer: allocate in kmem_cache saving 25% on peer memory - do not use -O3" * tag 'net-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) cxgb4: avoid link re-train during TC-MQPRIO configuration sch_htb: fix refcount leak in htb_parent_to_leaf_offload wireguard: allowedips: free empty intermediate nodes when removing single node wireguard: allowedips: allocate nodes in kmem_cache wireguard: allowedips: remove nodes in O(1) wireguard: allowedips: initialize list head in selftest wireguard: peer: allocate in kmem_cache wireguard: use synchronize_net rather than synchronize_rcu wireguard: do not use -O3 wireguard: selftests: make sure rp_filter is disabled on vethc wireguard: selftests: remove old conntrack kconfig value virtchnl: Add missing padding to virtchnl_proto_hdrs ice: Allow all LLDP packets from PF to Tx ice: report supported and advertised autoneg using PHY capabilities ice: handle the VF VSI rebuild failure ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared ice: Fix allowing VF to request more/less queues via virtchnl virtio-net: fix for skb_over_panic inside big mode ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions fib: Return the correct errno code ...
2021-06-04Merge tag 'perf-tools-fixes-for-v5.13-2021-06-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix NULL pointer dereference in 'perf probe' when handling DW_AT_const_value when looking for a variable, which is valid. - Fix for capability querying of perf_event_attr.cgroup support in older kernels. - Add missing cloning of evsel->use_config_name. - Honor event config name on --no-merge in 'perf stat'. - Fix some memory leaks found using ASAN. - Fix the perf entry for perf_event_attr setup with make LIBPFM4=1 on s390 z/VM. - Update MIPS UAPI perf_regs.h file. - Fix 'perf stat' BPF counter load return check. * tag 'perf-tools-fixes-for-v5.13-2021-06-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf env: Fix memory leak of bpf_prog_info_linear member perf symbol-elf: Fix memory leak by freeing sdt_note.args perf stat: Honor event config name on --no-merge perf evsel: Add missing cloning of evsel->use_config_name perf test: Test 17 fails with make LIBPFM4=1 on s390 z/VM perf stat: Fix error return code in bperf__load() perf record: Move probing cgroup sampling support perf probe: Fix NULL pointer dereference in convert_variable_location() perf tools: Copy uapi/asm/perf_regs.h from the kernel for MIPS
2021-06-04Merge tag 'pci-v5.13-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix MSIs for platforms with "msi-map" device-tree property, which we broke in v5.13-rc1 (Jean-Philippe Brucker) - Add Krzysztof Wilczyński as PCI reviewer (Lorenzo Pieralisi) * tag 'pci-v5.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/MSI: Fix MSIs for generic hosts that use device-tree's "msi-map" MAINTAINERS: Add Krzysztof as PCI host/endpoint controllers reviewer
2021-06-04cxgb4: avoid link re-train during TC-MQPRIO configurationRahul Lakkireddy
When configuring TC-MQPRIO offload, only turn off netdev carrier and don't bring physical link down in hardware. Otherwise, when the physical link is brought up again after configuration, it gets re-trained and stalls ongoing traffic. Also, when firmware is no longer accessible or crashed, avoid sending FLOWC and waiting for reply that will never come. Fix following hung_task_timeout_secs trace seen in these cases. INFO: task tc:20807 blocked for more than 122 seconds. Tainted: G S 5.13.0-rc3+ #122 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:tc state:D stack:14768 pid:20807 ppid: 19366 flags:0x00000000 Call Trace: __schedule+0x27b/0x6a0 schedule+0x37/0xa0 schedule_preempt_disabled+0x5/0x10 __mutex_lock.isra.14+0x2a0/0x4a0 ? netlink_lookup+0x120/0x1a0 ? rtnl_fill_ifinfo+0x10f0/0x10f0 __netlink_dump_start+0x70/0x250 rtnetlink_rcv_msg+0x28b/0x380 ? rtnl_fill_ifinfo+0x10f0/0x10f0 ? rtnl_calcit.isra.42+0x120/0x120 netlink_rcv_skb+0x4b/0xf0 netlink_unicast+0x1a0/0x280 netlink_sendmsg+0x216/0x440 sock_sendmsg+0x56/0x60 __sys_sendto+0xe9/0x150 ? handle_mm_fault+0x6d/0x1b0 ? do_user_addr_fault+0x1c5/0x620 __x64_sys_sendto+0x1f/0x30 do_syscall_64+0x3c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f7f73218321 RSP: 002b:00007ffd19626208 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055b7c0a8b240 RCX: 00007f7f73218321 RDX: 0000000000000028 RSI: 00007ffd19626210 RDI: 0000000000000003 RBP: 000055b7c08680ff R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000055b7c085f5f6 R13: 000055b7c085f60a R14: 00007ffd19636470 R15: 00007ffd196262a0 Fixes: b1396c2bd675 ("cxgb4: parse and configure TC-MQPRIO offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04sch_htb: fix refcount leak in htb_parent_to_leaf_offloadYunjian Wang
The commit ae81feb7338c ("sch_htb: fix null pointer dereference on a null new_q") fixes a NULL pointer dereference bug, but it is not correct. Because htb_graft_helper properly handles the case when new_q is NULL, and after the previous patch by skipping this call which creates an inconsistency : dev_queue->qdisc will still point to the old qdisc, but cl->parent->leaf.q will point to the new one (which will be noop_qdisc, because new_q was NULL). The code is based on an assumption that these two pointers are the same, so it can lead to refcount leaks. The correct fix is to add a NULL pointer check to protect qdisc_refcount_inc inside htb_parent_to_leaf_offload. Fixes: ae81feb7338c ("sch_htb: fix null pointer dereference on a null new_q") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Suggested-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-06-04 This series contains updates to virtchnl header file and ice driver. Brett fixes VF being unable to request a different number of queues then allocated and adds clearing of VF_MBX_ATQLEN register for VF reset. Haiyue handles error of rebuilding VF VSI during reset. Paul fixes reporting of autoneg to use the PHY capabilities. Dave allows LLDP packets without priority of TC_PRIO_CONTROL to be transmitted. Geert Uytterhoeven adds explicit padding to virtchnl_proto_hdrs structure in the virtchnl header file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04Merge branch 'wireguard-fixes'David S. Miller
Jason A. Donenfeld says: ==================== wireguard fixes for 5.13-rc5 Here are bug fixes to WireGuard for 5.13-rc5: 1-2,6) These are small, trivial tweaks to our test harness. 3) Linus thinks -O3 is still dangerous to enable. The code gen wasn't so much different with -O2 either. 4) We were accidentally calling synchronize_rcu instead of synchronize_net while holding the rtnl_lock, resulting in some rather large stalls that hit production machines. 5) Peer allocation was wasting literally hundreds of megabytes on real world deployments, due to oddly sized large objects not fitting nicely into a kmalloc slab. 7-9) We move from an insanely expensive O(n) algorithm to a fast O(1) algorithm, and cleanup a massive memory leak in the process, in which allowed ips churn would leave danging nodes hanging around without cleanup until the interface was removed. The O(1) algorithm eliminates packet stalls and high latency issues, in addition to bringing operations that took as much as 10 minutes down to less than a second. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: allowedips: free empty intermediate nodes when removing single nodeJason A. Donenfeld
When removing single nodes, it's possible that that node's parent is an empty intermediate node, in which case, it too should be removed. Otherwise the trie fills up and never is fully emptied, leading to gradual memory leaks over time for tries that are modified often. There was originally code to do this, but was removed during refactoring in 2016 and never reworked. Now that we have proper parent pointers from the previous commits, we can implement this properly. In order to reduce branching and expensive comparisons, we want to keep the double pointer for parent assignment (which lets us easily chain up to the root), but we still need to actually get the parent's base address. So encode the bit number into the last two bits of the pointer, and pack and unpack it as needed. This is a little bit clumsy but is the fastest and less memory wasteful of the compromises. Note that we align the root struct here to a minimum of 4, because it's embedded into a larger struct, and we're relying on having the bottom two bits for our flag, which would only be 16-bit aligned on m68k. The existing macro-based helpers were a bit unwieldy for adding the bit packing to, so this commit replaces them with safer and clearer ordinary functions. We add a test to the randomized/fuzzer part of the selftests, to free the randomized tries by-peer, refuzz it, and repeat, until it's supposed to be empty, and then then see if that actually resulted in the whole thing being emptied. That combined with kmemcheck should hopefully make sure this commit is doing what it should. Along the way this resulted in various other cleanups of the tests and fixes for recent graphviz. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: allowedips: allocate nodes in kmem_cacheJason A. Donenfeld
The previous commit moved from O(n) to O(1) for removal, but in the process introduced an additional pointer member to a struct that increased the size from 60 to 68 bytes, putting nodes in the 128-byte slab. With deployed systems having as many as 2 million nodes, this represents a significant doubling in memory usage (128 MiB -> 256 MiB). Fix this by using our own kmem_cache, that's sized exactly right. This also makes wireguard's memory usage more transparent in tools like slabtop and /proc/slabinfo. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Matthew Wilcox <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: allowedips: remove nodes in O(1)Jason A. Donenfeld
Previously, deleting peers would require traversing the entire trie in order to rebalance nodes and safely free them. This meant that removing 1000 peers from a trie with a half million nodes would take an extremely long time, during which we're holding the rtnl lock. Large-scale users were reporting 200ms latencies added to the networking stack as a whole every time their userspace software would queue up significant removals. That's a serious situation. This commit fixes that by maintaining a double pointer to the parent's bit pointer for each node, and then using the already existing node list belonging to each peer to go directly to the node, fix up its pointers, and free it with RCU. This means removal is O(1) instead of O(n), and we don't use gobs of stack. The removal algorithm has the same downside as the code that it fixes: it won't collapse needlessly long runs of fillers. We can enhance that in the future if it ever becomes a problem. This commit documents that limitation with a TODO comment in code, a small but meaningful improvement over the prior situation. Currently the biggest flaw, which the next commit addresses, is that because this increases the node size on 64-bit machines from 60 bytes to 68 bytes. 60 rounds up to 64, but 68 rounds up to 128. So we wind up using twice as much memory per node, because of power-of-two allocations, which is a big bummer. We'll need to figure something out there. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: allowedips: initialize list head in selftestJason A. Donenfeld
The randomized trie tests weren't initializing the dummy peer list head, resulting in a NULL pointer dereference when used. Fix this by initializing it in the randomized trie test, just like we do for the static unit test. While we're at it, all of the other strings like this have the word "self-test", so add it to the missing place here. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: peer: allocate in kmem_cacheJason A. Donenfeld
With deployments having upwards of 600k peers now, this somewhat heavy structure could benefit from more fine-grained allocations. Specifically, instead of using a 2048-byte slab for a 1544-byte object, we can now use 1544-byte objects directly, thus saving almost 25% per-peer, or with 600k peers, that's a savings of 303 MiB. This also makes wireguard's memory usage more transparent in tools like slabtop and /proc/slabinfo. Fixes: 8b5553ace83c ("wireguard: queueing: get rid of per-peer ring buffers") Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Matthew Wilcox <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: use synchronize_net rather than synchronize_rcuJason A. Donenfeld
Many of the synchronization points are sometimes called under the rtnl lock, which means we should use synchronize_net rather than synchronize_rcu. Under the hood, this expands to using the expedited flavor of function in the event that rtnl is held, in order to not stall other concurrent changes. This fixes some very, very long delays when removing multiple peers at once, which would cause some operations to take several minutes. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: do not use -O3Jason A. Donenfeld
Apparently, various versions of gcc have O3-related miscompiles. Looking at the difference between -O2 and -O3 for gcc 11 doesn't indicate miscompiles, but the difference also doesn't seem so significant for performance that it's worth risking. Link: https://lore.kernel.org/lkml/CAHk-=wjuoGyxDhAF8SsrTkN0-YfCx7E6jUN3ikC_tn2AKWTTsA@mail.gmail.com/ Link: https://lore.kernel.org/lkml/CAHmME9otB5Wwxp7H8bR_i2uH2esEMvoBMC8uEXBMH9p0q1s6Bw@mail.gmail.com/ Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: selftests: make sure rp_filter is disabled on vethcJason A. Donenfeld
Some distros may enable strict rp_filter by default, which will prevent vethc from receiving the packets with an unrouteable reverse path address. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04wireguard: selftests: remove old conntrack kconfig valueJason A. Donenfeld
On recent kernels, this config symbol is no longer used. Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04i2c: qcom-geni: Suspend and resume the bus during SYSTEM_SLEEP_PM opsRoja Rani Yarubandi
Mark bus as suspended during system suspend to block the future transfers. Implement geni_i2c_resume_noirq() to resume the bus. Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-06-04i2c: qcom-geni: Add shutdown callback for i2cRoja Rani Yarubandi
If the hardware is still accessing memory after SMMU translation is disabled (as part of smmu shutdown callback), then the IOVAs (I/O virtual address) which it was using will go on the bus as the physical addresses which will result in unknown crashes like NoC/interconnect errors. So, implement shutdown callback for i2c driver to suspend the bus during system "reboot" or "shutdown". Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-06-04platform/mellanox: mlxreg-hotplug: Revert "move to use request_irq by ↵Mykola Kostenok
IRQF_NO_AUTOEN flag" It causes mlxreg-hotplug probing failure: request_threaded_irq() returns -EINVAL due to true value of condition: ((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN)) after flag "IRQF_NO_AUTOEN" has been added to: err = devm_request_irq(&pdev->dev, priv->irq, mlxreg_hotplug_irq_handler, IRQF_TRIGGER_FALLING | IRQF_SHARED | IRQF_NO_AUTOEN, "mlxreg-hotplug", priv); This reverts commit bee3ecfed0fc ("platform/mellanox: mlxreg-hotplug: move to use request_irq by IRQF_NO_AUTOEN flag"). Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com> Acked-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20210603172827.2599908-1-c_mykolak@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-04platform/surface: dtx: Add missing mutex_destroy() call in failure pathMaximilian Luz
When we fail to open the device file due to DTX being shut down, the mutex is initialized but never destroyed. We are destroying it when releasing the file, so add the missing call in the failure path as well. Fixes: 1d609992832e ("platform/surface: Add DTX driver") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20210604132540.533036-1-luzmaximilian@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-04Merge tag 'sound-5.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A couple of small fixes are found in the ALSA core side at this time; a fix in the new LED handling code and a long-standing (and likely no one would notice) ioctl bug. The rest are usual HD-audio fixes, mostly device-specific quirks but also one major regression fix that was introduced in 5.13" * tag 'sound-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda: update the power_state during the direct-complete ALSA: timer: Fix master timer notification ALSA: control led: fix memory leak in snd_ctl_led_register ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx ALSA: hda/cirrus: Set Initial DMIC volume to -26 dB ALSA: hda: Fix a regression in Capture Switch mixer read ALSA: hda: Add AlderLake-M PCI ID
2021-06-04x86/sev: Check SME/SEV support in CPUID firstPu Wen
The first two bits of the CPUID leaf 0x8000001F EAX indicate whether SEV or SME is supported, respectively. It's better to check whether SEV or SME is actually supported before accessing the MSR_AMD64_SEV to check whether SEV or SME is enabled. This is both a bare-metal issue and a guest/VM issue. Since the first generation Hygon Dhyana CPU doesn't support the MSR_AMD64_SEV, reading that MSR results in a #GP - either directly from hardware in the bare-metal case or via the hypervisor (because the RDMSR is actually intercepted) in the guest/VM case, resulting in a failed boot. And since this is very early in the boot phase, rdmsrl_safe()/native_read_msr_safe() can't be used. So check the CPUID bits first, before accessing the MSR. [ tlendacky: Expand and improve commit message. ] [ bp: Massage commit message. ] Fixes: eab696d8e8b9 ("x86/sev: Do not require Hypervisor CPUID bit for SEV guests") Signed-off-by: Pu Wen <puwen@hygon.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@vger.kernel.org> # v5.10+ Link: https://lkml.kernel.org/r/20210602070207.2480-1-puwen@hygon.cn
2021-06-04Merge tag 'drm-fixes-2021-06-04-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Two big regression reverts in here, one for fbdev and one i915. Otherwise it's mostly amdgpu display fixes, and tegra fixes. fb: - revert broken fb_defio patch amdgpu: - Display fixes - FRU EEPROM error handling fix - RAS fix - PSP fix - Releasing pinned BO fix i915: - Revert conversion to io_mapping_map_user() which lead to BUG_ON() - Fix check for error valued returns in a selftest tegra: - SOR power domain race condition fix - build warning fix - runtime pm ref leak fix - modifier fix" * tag 'drm-fixes-2021-06-04-1' of git://anongit.freedesktop.org/drm/drm: amd/display: convert DRM_DEBUG_ATOMIC to drm_dbg_atomic drm/amdgpu: make sure we unpin the UVD BO drm/amd/amdgpu:save psp ring wptr to avoid attack drm/amd/display: Fix potential memory leak in DMUB hw_init drm/amdgpu: Don't query CE and UE errors drm/amd/display: Fix overlay validation by considering cursors drm/amdgpu: refine amdgpu_fru_get_product_info drm/amdgpu: add judgement for dc support drm/amd/display: Fix GPU scaling regression by FS video support drm/amd/display: Allow bandwidth validation for 0 streams. Revert "i915: use io_mapping_map_user" drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() Revert "fb_defio: Remove custom address_space_operations" drm/tegra: Correct DRM_FORMAT_MOD_NVIDIA_SECTOR_LAYOUT drm/tegra: sor: Fix AUX device reference leak drm/tegra: Get ref for DP AUX channel, not its ddc adapter drm/tegra: Fix shift overflow in tegra_shared_plane_atomic_update drm/tegra: sor: Fully initialize SOR before registration gpu: host1x: Split up client initalization and registration drm/tegra: sor: Do not leak runtime PM reference
2021-06-04virtchnl: Add missing padding to virtchnl_proto_hdrsGeert Uytterhoeven
On m68k (Coldfire M547x): CC drivers/net/ethernet/intel/i40e/i40e_main.o In file included from drivers/net/ethernet/intel/i40e/i40e_prototype.h:9, from drivers/net/ethernet/intel/i40e/i40e.h:41, from drivers/net/ethernet/intel/i40e/i40e_main.c:12: include/linux/avf/virtchnl.h:153:36: warning: division by zero [-Wdiv-by-zero] 153 | { virtchnl_static_assert_##X = (n)/((sizeof(struct X) == (n)) ? 1 : 0) } | ^ include/linux/avf/virtchnl.h:844:1: note: in expansion of macro ‘VIRTCHNL_CHECK_STRUCT_LEN’ 844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs); | ^~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/avf/virtchnl.h:844:33: error: enumerator value for ‘virtchnl_static_assert_virtchnl_proto_hdrs’ is not an integer constant 844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs); | ^~~~~~~~~~~~~~~~~~~ On m68k, integers are aligned on addresses that are multiples of two, not four, bytes. Hence the size of a structure containing integers may not be divisible by 4. Fix this by adding explicit padding. Fixes: 1f7ea1cd6a374842 ("ice: Enable FDIR Configure for AVF") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-04ice: Allow all LLDP packets from PF to TxDave Ertman
Currently in the ice driver, the check whether to allow a LLDP packet to egress the interface from the PF_VSI is being based on the SKB's priority field. It checks to see if the packets priority is equal to TC_PRIO_CONTROL. Injected LLDP packets do not always meet this condition. SCAPY defaults to a sk_buff->protocol value of ETH_P_ALL (0x0003) and does not set the priority field. There will be other injection methods (even ones used by end users) that will not correctly configure the socket so that SKB fields are correctly populated. Then ethernet header has to have to correct value for the protocol though. Add a check to also allow packets whose ethhdr->h_proto matches ETH_P_LLDP (0x88CC). Fixes: 0c3a6101ff2d ("ice: Allow egress control packets from PF_VSI") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-04ice: report supported and advertised autoneg using PHY capabilitiesPaul Greenwalt
Ethtool incorrectly reported supported and advertised auto-negotiation settings for a backplane PHY image which did not support auto-negotiation. This can occur when using media or PHY type for reporting ethtool supported and advertised auto-negotiation settings. Remove setting supported and advertised auto-negotiation settings based on PHY type in ice_phy_type_to_ethtool(), and MAC type in ice_get_link_ksettings(). Ethtool supported and advertised auto-negotiation settings should be based on the PHY image using the AQ command get PHY capabilities with media. Add setting supported and advertised auto-negotiation settings based get PHY capabilities with media in ice_get_link_ksettings(). Fixes: 48cb27f2fd18 ("ice: Implement handlers for ethtool PHY/link operations") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-04ice: handle the VF VSI rebuild failureHaiyue Wang
VSI rebuild can be failed for LAN queue config, then the VF's VSI will be NULL, the VF reset should be stopped with the VF entering into the disable state. Fixes: 12bb018c538c ("ice: Refactor VF reset") Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-04ice: Fix VFR issues for AVF drivers that expect ATQLEN clearedBrett Creeley
Some AVF drivers expect the VF_MBX_ATQLEN register to be cleared for any type of VFR/VFLR. Fix this by clearing the VF_MBX_ATQLEN register at the same time as VF_MBX_ARQLEN. Fixes: 82ba01282cf8 ("ice: clear VF ARQLEN register on reset") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-04ice: Fix allowing VF to request more/less queues via virtchnlBrett Creeley
Commit 12bb018c538c ("ice: Refactor VF reset") caused a regression that removes the ability for a VF to request a different amount of queues via VIRTCHNL_OP_REQUEST_QUEUES. This prevents VF drivers to either increase or decrease the number of queue pairs they are allocated. Fix this by using the variable vf->num_req_qs when determining the vf->num_vf_qs during VF VSI creation. Fixes: 12bb018c538c ("ice: Refactor VF reset") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-04perf env: Fix memory leak of bpf_prog_info_linear memberRiccardo Mancini
ASan reported a memory leak caused by info_linear not being deallocated. The info_linear was allocated during in perf_event__synthesize_one_bpf_prog(). This patch adds the corresponding free() when bpf_prog_info_node is freed in perf_env__purge_bpf(). $ sudo ./perf record -- sleep 5 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ] ================================================================= ==297735==ERROR: LeakSanitizer: detected memory leaks Direct leak of 7688 byte(s) in 19 object(s) allocated from: #0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f) #1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16 #2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16 #3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9 #4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8 #5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8 #6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8 #7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 #8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 #9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2 #10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3 #11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16 Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20210602224024.300485-1-rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04x86/fault: Don't send SIGSEGV twice on SEGV_PKUERRJiashuo Liang
__bad_area_nosemaphore() calls both force_sig_pkuerr() and force_sig_fault() when handling SEGV_PKUERR. This does not cause problems because the second signal is filtered by the legacy_queue() check in __send_signal() because in both cases, the signal is SIGSEGV, the second one seeing that the first one is already pending. This causes the kernel to do unnecessary work so send the signal only once for SEGV_PKUERR. [ bp: Massage commit message. ] Fixes: 9db812dbb29d ("signal/x86: Call force_sig_pkuerr from __bad_area_nosemaphore") Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Jiashuo Liang <liangjs@pku.edu.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lkml.kernel.org/r/20210601085203.40214-1-liangjs@pku.edu.cn
2021-06-04perf symbol-elf: Fix memory leak by freeing sdt_note.argsRiccardo Mancini
Reported by ASan. Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Fabian Hemmer <copy@copy.sh> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Remi Bernon <rbernon@codeweavers.com> Cc: Jiri Slaby <jirislaby@kernel.org> Link: http://lore.kernel.org/lkml/20210602220833.285226-1-rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04perf stat: Honor event config name on --no-mergeNamhyung Kim
If user gave an event name explicitly, it should be displayed in the output as is. But with --no-merge option it adds a pmu name at the end so might confuse users. Actually this is true for hybrid pmus, I think we should do the same for others. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210602212241.2175005-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04perf evsel: Add missing cloning of evsel->use_config_nameNamhyung Kim
The evsel__clone() should copy all fields in the evsel which are set during the event parsing. But it missed the use_config_name field. Fixes: 12279429d862 ("perf stat: Uniquify hybrid event name") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210602212241.2175005-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04ASoC: rt5682: Fix the fast discharge for headset unplugging in soundwire modeOder Chiou
Based on ("5a15cd7fce20b1fd4aece6a0240e2b58cd6a225d"), the setting also should be set in soundwire mode. Signed-off-by: Oder Chiou <oder_chiou@realtek.com> Link: https://lore.kernel.org/r/20210604063150.29925-1-oder_chiou@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-04btrfs: promote debugging asserts to full-fledged checks in validate_superNikolay Borisov
Syzbot managed to trigger this assert while performing its fuzzing. Turns out it's better to have those asserts turned into full-fledged checks so that in case buggy btrfs images are mounted the users gets an error and mounting is stopped. Alternatively with CONFIG_BTRFS_ASSERT disabled such image would have been erroneously allowed to be mounted. Reported-by: syzbot+a6bf271c02e4fe66b4e4@syzkaller.appspotmail.com CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add uuids to the messages ] Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04btrfs: return value from btrfs_mark_extent_written() in case of errorRitesh Harjani
We always return 0 even in case of an error in btrfs_mark_extent_written(). Fix it to return proper error value in case of a failure. All callers handle it. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04btrfs: zoned: fix zone number to sector/physical calculationNaohiro Aota
In btrfs_get_dev_zone_info(), we have "u32 sb_zone" and calculate "sector_t sector" by shifting it. But, this "sector" is calculated in 32bit, leading it to be 0 for the 2nd superblock copy. Since zone number is u32, shifting it to sector (sector_t) or physical address (u64) can easily trigger a missing cast bug like this. This commit introduces helpers to convert zone number to sector/LBA, so we won't fall into the same pitfall again. Reported-by: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> Fixes: 12659251ca5d ("btrfs: implement log-structured superblock for ZONED mode") CC: stable@vger.kernel.org # 5.11+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04btrfs: do not write supers if we have an fs errorJosef Bacik
Error injection testing uncovered a pretty severe problem where we could end up committing a super that pointed to the wrong tree roots, resulting in transid mismatch errors. The way we commit the transaction is we update the super copy with the current generations and bytenrs of the important roots, and then copy that into our super_for_commit. Then we allow transactions to continue again, we write out the dirty pages for the transaction, and then we write the super. If the write out fails we'll bail and skip writing the supers. However since we've allowed a new transaction to start, we can have a log attempting to sync at this point, which would be blocked on fs_info->tree_log_mutex. Once the commit fails we're allowed to do the log tree commit, which uses super_for_commit, which now points at fs tree's that were not written out. Fix this by checking BTRFS_FS_STATE_ERROR once we acquire the tree_log_mutex. This way if the transaction commit fails we're sure to see this bit set and we can skip writing the super out. This patch fixes this specific transid mismatch error I was seeing with this particular error path. CC: stable@vger.kernel.org # 5.12+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04Merge tag 'drm/tegra/for-5.13-rc5' of ↵Dave Airlie
ssh://git.freedesktop.org/git/tegra/linux into drm-fixes drm/tegra: Fixes for v5.13-rc5 The most important change here fixes a race condition that causes either HDA or (more frequently) display to malfunction because they race for enabling the SOR power domain at probe time. Other than that, there's a couple of build warnings for issues introduced in v5.13 as well as some minor fixes, such as reference leak plugs. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210603144624.788861-1-thierry.reding@gmail.com