Age | Commit message | Author |
|
Merge two commits removing unused helper functions related to power
management (David Alan Gilbert).
* pm-misc:
PM: clk: Remove unused pm_clk_remove()
PM: clk: remove unused of_pm_clk_add_clk()
|
|
Merge updates related to system sleep for 6.15-rc1 including fixes,
cleanups and a rework of the "smart suspend" driver flag handling to
avoid issues that may occur when drivers using it depend on some other
drivers:
- Rework the handling of the "smart suspend" driver flag in the PM core
to avoid issues that may occur when drivers using it depend on some
other drivers and clean up the related PM core code (Rafael Wysocki,
Colin Ian King).
- Fix the handling of devices with the power.direct_complete flag set
if device_suspend() returns an error for at least one device to avoid
situations in which some of them may not be resumed (Rafael Wysocki).
- Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
possible deadlock that may occur if the "compressor" hibernation
module parameter is accessed during the registration of a new
ieee80211 device (Lizhi Xu).
- Suppress sleeping parent warning in device_pm_add() in the case when
new children are added under a device with the power.direct_complete
set after it has been processed by device_resume() (Xu Yang).
- Remove needless return in three void functions related to system
wakeup (Zijun Hu).
- Replace deprecated kmap_atomic() with kmap_local_page() in the
hibernation core code (David Reaver).
- Remove unused helper functions related to system sleep (David Alan
Gilbert).
- Clean up s2idle_enter() so it does not lock and unlock CPU offline
in vain and update comments in it (Ulf Hansson).
- Clean up broken white space in dpm_wait_for_children() (Geert
Uytterhoeven).
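The mutex_trylock() item above follows a common pattern: a setter that may be called while another path holds the lock backs off instead of waiting. A minimal userspace sketch of that pattern is shown here, using pthread_mutex_trylock() in place of the kernel's mutex_trylock(). The names compressor_param_set(), system_lock and compressor, the buffer size and the returned error codes are all assumptions made for illustration, not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical lock that some other, slower path may be holding. */
static pthread_mutex_t system_lock = PTHREAD_MUTEX_INITIALIZER;
static char compressor[16] = "lzo";

/* Hypothetical setter: fail with -EBUSY instead of blocking on the lock. */
int compressor_param_set(const char *val)
{
	assert(val != NULL);
	if (strlen(val) >= sizeof(compressor))
		return -EINVAL;
	/* pthread_mutex_trylock() returns 0 only if the lock was acquired. */
	if (pthread_mutex_trylock(&system_lock) != 0)
		return -EBUSY;
	strcpy(compressor, val);	/* length was checked above */
	pthread_mutex_unlock(&system_lock);
	return 0;
}
```

The caller sees a transient error and can retry, which is usually preferable to a lock-ordering deadlock.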
* pm-sleep:
PM: sleep: Fix bit masking operation
PM: sleep: Fix handling devices with direct_complete set on errors
PM: sleep: core: Fix indentation in dpm_wait_for_children()
PM: s2idle: Extend comment in s2idle_enter()
PM: s2idle: Drop redundant locks when entering s2idle
PM: sleep: Remove unused pm_generic_ wrappers
PM: sleep: Rearrange dpm_async_fn() and async state clearing
PM: sleep: Rename power.async_in_progress to power.work_in_progress
PM: core: Tweak pm_runtime_block_if_disabled() return value
PM: runtime: Convert pm_runtime_blocked() to static inline
PM: sleep: Update power.smart_suspend under PM spinlock
PM: sleep: Adjust check before setting power.must_resume
PM: wakeup: Remove needless return in three void APIs
PM: sleep: Suppress sleeping parent warning in special case
PM: hibernate: Avoid deadlock in hibernate_compressor_param_set()
PM: sleep: Avoid unnecessary checks in device_prepare_smart_suspend()
PM: sleep: Use DPM_FLAG_SMART_SUSPEND conditionally
PM: runtime: Introduce pm_runtime_blocked()
PM: Block enabling of runtime PM during system suspend
PM: hibernate: Replace deprecated kmap_atomic() with kmap_local_page()
|
|
Merge Energy Model handling code updates and updates of the runtime PM
core code for 6.15-rc1:
- Clean up the Energy Model handling code somewhat (Rafael Wysocki).
- Use kfree_rcu() to simplify the handling of runtime Energy Model
updates (Li RongQing).
- Add an entry for the Energy Model framework to MAINTAINERS as
properly maintained (Lukasz Luba).
- Address RCU-related sparse warnings in the Energy Model code (Rafael
Wysocki).
- Remove ENERGY_MODEL dependency on SMP and allow it to be selected
when DEVFREQ is set without CPUFREQ so it can be used on a wider
range of systems (Jeson Gao).
- Unify error handling during runtime suspend and runtime resume in the
core to help drivers to implement more consistent runtime PM error
handling (Rafael Wysocki).
- Drop a redundant check from pm_runtime_force_resume() and rearrange
documentation related to __pm_runtime_disable() (Rafael Wysocki).
* pm-em:
PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL
PM: EM: Address RCU-related sparse warnings
PM: EM: Consify two parameters of em_dev_register_perf_domain()
MAINTAINERS: Add Energy Model framework as properly maintained
PM: EM: use kfree_rcu() to simplify the code
PM: EM: Slightly reduce em_check_capacity_update() overhead
PM: EM: Drop unused parameter from em_adjust_new_capacity()
* pm-runtime:
PM: runtime: Unify error handling during suspend and resume
PM: runtime: Drop status check from pm_runtime_force_resume()
PM: Rearrange documentation related to __pm_runtime_disable()
|
|
Merge cpufreq updates for 6.15-rc1:
- Manage sysfs attributes and boost frequencies efficiently from
cpufreq core to reduce boilerplate code from drivers (Viresh Kumar).
- Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
Dhananjay Ugwekar, Imran Shaik, and zuoqian).
- Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
Bai).
- cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).
- Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng).
- Optimize the amd-pstate driver to avoid cases where call paths end
up calling the same writes multiple times and needlessly caching
variables through code reorganization, locking overhaul and tracing
adjustments (Mario Limonciello, Dhananjay Ugwekar).
- Make it possible to avoid enabling capacity-aware scheduling (CAS) in
the intel_pstate driver and relocate a check for out-of-band (OOB)
platform handling in it to make it detect OOB before checking HWP
availability (Rafael Wysocki).
- Fix dbs_update() to avoid inadvertent conversions of negative integer
values to unsigned int which causes CPU frequency selection to be
inaccurate in some cases when the "conservative" cpufreq governor is
in use (Jie Zhan).
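The class of bug behind the dbs_update() item can be reproduced in a few lines. The sketch below is not the governor code: the function names, the load formula and the clamp-to-zero remedy are assumptions chosen only to show how a negative int silently turns into a huge unsigned int.

```c
#include <assert.h>

/* Hypothetical load estimate in percent; idle_delta may be negative. */
unsigned int load_buggy(unsigned int elapsed, int idle_delta)
{
	unsigned int idle = idle_delta;	/* -5 becomes UINT_MAX - 4 */

	assert(elapsed > 0);
	if (idle >= elapsed)		/* a busy CPU now looks fully idle */
		return 0;
	return 100 * (elapsed - idle) / elapsed;
}

/* One possible remedy: clamp while the value is still signed. */
unsigned int load_fixed(unsigned int elapsed, int idle_delta)
{
	unsigned int idle = idle_delta < 0 ? 0 : (unsigned int)idle_delta;

	assert(elapsed > 0);
	if (idle >= elapsed)
		return 0;
	return 100 * (elapsed - idle) / elapsed;
}
```

With elapsed = 100 and idle_delta = -5, the first variant reports no load at all while the second reports full load, which is the kind of inaccuracy a frequency governor would act on.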
* pm-cpufreq: (91 commits)
dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
cpufreq: Init cpufreq only for present CPUs
cpufreq: tegra186: Share policy per cluster
cpufreq/amd-pstate: Drop actions in amd_pstate_epp_cpu_offline()
cpufreq/amd-pstate: Stop caching EPP
cpufreq/amd-pstate: Rework CPPC enabling
cpufreq/amd-pstate: Drop debug statements for policy setting
cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes
cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions
cpufreq/amd-pstate: Cache CPPC request in shared mem case too
cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
cpufreq/amd-pstate-ut: Adjust variable scope
cpufreq/amd-pstate-ut: Run on all of the correct CPUs
cpufreq/amd-pstate-ut: Drop SUCCESS and FAIL enums
cpufreq/amd-pstate-ut: Allow lowest nonlinear and lowest to be the same
cpufreq/amd-pstate-ut: Use _free macro to free put policy
cpufreq/amd-pstate: Drop `cppc_cap1_cached`
...
|
|
Merge an ACPI CPPC update, ACPI platform-profile driver updates, an ACPI
APEI update and a MAINTAINERS update related to ACPI for 6.15-rc1:
- Add a missing header file include to the x86 arch CPPC code (Mario
Limonciello).
- Rework the sysfs attributes implementation in the ACPI platform-profile
driver and improve the unregistration code in it (Nathan Chancellor,
Kurt Borja).
- Prevent the ACPI HED driver from being built as a module and change
its initcall level to subsys_initcall to avoid initialization ordering
issues related to it (Xiaofei Tan).
- Update a maintainer email address in the ACPI PMIC entry in
MAINTAINERS (Mika Westerberg).
* acpi-x86:
x86/ACPI: CPPC: Add missing include
* acpi-platform-profile:
ACPI: platform_profile: Improve platform_profile_unregister()
ACPI: platform-profile: Fix CFI violation when accessing sysfs files
* acpi-apei:
ACPI: HED: Always initialize before evged
* acpi-misc:
MAINTAINERS: Use my kernel.org address for ACPI PMIC work
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge ARM cpufreq updates for 6.15 from Viresh Kumar:
"- manage sysfs attributes and boost frequencies efficiently from cpufreq
core to reduce boilerplate code from drivers (Viresh Kumar).
- Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
Dhananjay Ugwekar, Imran Shaik, and zuoqian).
- Migrate to using for_each_present_cpu (Jacky Bai).
- cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).
- Use str_enable_disable() helper (Lifeng Zheng)."
* tag 'cpufreq-arm-updates-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (59 commits)
dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
cpufreq: Init cpufreq only for present CPUs
cpufreq: tegra186: Share policy per cluster
cpufreq: tegra194: Allow building for Tegra234
cpufreq: enable 1200Mhz clock speed for armada-37xx
cpufreq: Remove cpufreq_enable_boost_support()
cpufreq: staticize policy_has_boost_freq()
cpufreq: qcom: Set .set_boost directly
cpufreq: dt: Set .set_boost directly
cpufreq: scmi: Set .set_boost directly
cpufreq: powernv: Set .set_boost directly
cpufreq: loongson: Set .set_boost directly
cpufreq: apple: Set .set_boost directly
cpufreq: Restrict enabling boost on policies with no boost frequencies
cpufreq: cppc: Set policy->boost_supported
cpufreq: amd: Set policy->boost_supported
cpufreq: acpi: Set policy->boost_supported
...
|
|
Once a key's reference count has been reduced to 0, the garbage collector
thread may destroy it at any time and so key_put() is not allowed to touch
the key after that point. The most key_put() is normally allowed to do is
to touch key_gc_work as that's a static global variable.
However, in an effort to speed up the reclamation of quota, this is now
done in key_put() once the key's usage is reduced to 0 - but now the code
is looking at the key after the deadline, which is forbidden.
Fix this by using a flag to indicate that a key can be gc'd now rather than
looking at the key's refcount in the garbage collector.
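A rough userspace sketch of the flag-based handoff described above, written with C11 atomics. The structure layout, the final_put flag name and the helper names are hypothetical; the point is only that the collector tests a flag that is set as the very last access key_put() makes to the key, rather than testing the refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct key {
	atomic_int usage;	/* reference count */
	atomic_bool final_put;	/* hypothetical "may be collected now" flag */
	long quota_bytes;
};

static atomic_long quota_in_use;
static atomic_bool gc_scheduled;	/* stands in for queueing key_gc_work */

void key_put_sketch(struct key *key)
{
	assert(key != NULL);
	if (atomic_fetch_sub(&key->usage, 1) == 1) {
		/* usage is 0, but the flag is still clear, so the key stays valid. */
		atomic_fetch_sub(&quota_in_use, key->quota_bytes);
		/* Last touch of the key: after this store it may be freed. */
		atomic_store_explicit(&key->final_put, true, memory_order_release);
		atomic_store(&gc_scheduled, true);	/* global, always safe */
	}
}

/* Collector side: decide on the flag, not on the refcount. */
bool key_can_be_collected(struct key *key)
{
	return atomic_load_explicit(&key->final_put, memory_order_acquire);
}
```

The release/acquire pair ensures the quota update is visible before the collector is allowed to free the key.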
Fixes: 9578e327b2b4 ("keys: update key quotas in key_put()")
Reported-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/673b6aec.050a0220.87769.004a.GAE@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: syzbot+6105ffc1ded71d194d6d@syzkaller.appspotmail.com
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Patch series "resource: Split and use DEFINE_RES*() macros", v2.
Replace open coded variants of DEFINE_RES*() macros.
Note that there are many more candidates across the kernel, and even in
resource.c; however, the latter contains some non-trivial leftovers. That's
why the examples cover only straightforward conversions.
This patch (of 4):
In some cases it would be useful to supply a predefined descriptor of the
resource. For this, introduce the DEFINE_RES_NAMED_DESC() macro.
While at it, provide DEFINE_RES() that takes only start, size,
and flags.
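As an illustration of how such a macro family can be layered, here is a userspace mock-up. The struct below is a simplified stand-in for the kernel's struct resource, EX_DESC_NONE and resource_size_sketch() are hypothetical, and the macro bodies are assumptions about the general shape rather than copies of the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct resource. */
struct resource {
	unsigned long long start;
	unsigned long long end;
	const char *name;
	unsigned long flags;
	unsigned long desc;
};

#define EX_DESC_NONE 0UL	/* hypothetical "no descriptor" value */

/* Most general form: the caller supplies every field, including desc. */
#define DEFINE_RES_NAMED_DESC(_start, _size, _name, _flags, _desc)	\
	(struct resource) {						\
		.start = (_start),					\
		.end = (_start) + (_size) - 1,				\
		.name = (_name),					\
		.flags = (_flags),					\
		.desc = (_desc),					\
	}

/* Shorthand that takes only start, size, and flags. */
#define DEFINE_RES(_start, _size, _flags)				\
	DEFINE_RES_NAMED_DESC(_start, _size, NULL, _flags, EX_DESC_NONE)

/* Hypothetical helper: number of bytes covered by the resource. */
unsigned long long resource_size_sketch(const struct resource *res)
{
	assert(res != NULL);
	return res->end - res->start + 1;
}
```

Defining the narrower macro in terms of the wider one keeps the end-address arithmetic in a single place.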
Link: https://lkml.kernel.org/r/20250317181412.1560630-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20250317181412.1560630-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "hung_task: Dump the blocking task stacktrace", v4.
The hung_task detector is very useful for detecting lockups. However,
since it only dumps the blocked (uninterruptible sleep) processes, it is
not enough to identify the root cause of a lockup.
For example, if a process holds a mutex and sleeps waiting for an event in
the interruptible state for a long time, the other processes will wait on
the mutex in the uninterruptible state. In this case, the waiter processes
are dumped, but the blocker process is not shown because it is sleeping in
the interruptible state.
This adds a feature to dump the blocker task which holds a mutex
when detecting a hung task. e.g.
INFO: task cat:115 blocked for more than 122 seconds.
Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:cat state:D stack:13432 pid:115 tgid:115 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_preempt_disabled+0x54/0xa0
schedule+0xb7/0x140
? __mutex_lock+0x51b/0xa60
? __mutex_lock+0x51b/0xa60
schedule_preempt_disabled+0x54/0xa0
__mutex_lock+0x51b/0xa60
read_dummy+0x23/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
task:cat state:S stack:13432 pid:114 tgid:114 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_timeout+0xa8/0x120
schedule+0xb7/0x140
schedule_timeout+0xa8/0x120
? __pfx_process_timeout+0x10/0x10
msleep_interruptible+0x3e/0x60
read_dummy+0x2d/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
TBD: We can extend this feature to cover other locks like rwsem and
rt_mutex, but rwsem requires dumping all the tasks which acquire and wait
on that rwsem. We can follow the waiter link, but the output will be a bit
different compared with the mutex case.
This patch (of 2):
The "hung_task" detector shows a task that has slept uninterruptibly for a
long time, but most often it's blocked on a mutex acquired by another task.
Without dumping such a task, investigating the root cause of the hung task
problem is very difficult.
This introduces task_struct::blocker_mutex to point to the mutex which
this task is waiting for. Since the mutex has "owner" information, we can
find the owner task and dump it with hung tasks.
Note: the owner can be changed while dumping the owner task, so
this is "likely" the owner of the mutex.
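The lookup described above amounts to following two pointers. The sketch below models it in userspace; struct task, struct mutex_sketch and likely_blocker() are hypothetical stand-ins, and the real code additionally has to cope with the owner changing concurrently.

```c
#include <assert.h>
#include <stddef.h>

struct task;

struct mutex_sketch {
	struct task *owner;	/* NULL when unlocked */
};

struct task {
	int pid;
	struct mutex_sketch *blocker_mutex;	/* mutex being waited on, or NULL */
};

/* Return the task that likely owns the mutex the hung task waits for. */
struct task *likely_blocker(const struct task *hung)
{
	assert(hung != NULL);
	if (!hung->blocker_mutex)
		return NULL;	/* not blocked on a mutex */
	/* Only a snapshot: the owner may change right after this read. */
	return hung->blocker_mutex->owner;
}
```

A waiter would set blocker_mutex before going to sleep and clear it after acquiring the lock, so the detector can start from the hung task and reach the owner.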
With this change, the hung task report shows the blocker task's info as below:
INFO: task cat:115 blocked for more than 122 seconds.
Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:cat state:D stack:13432 pid:115 tgid:115 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_preempt_disabled+0x54/0xa0
schedule+0xb7/0x140
? __mutex_lock+0x51b/0xa60
? __mutex_lock+0x51b/0xa60
schedule_preempt_disabled+0x54/0xa0
__mutex_lock+0x51b/0xa60
read_dummy+0x23/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
task:cat state:S stack:13432 pid:114 tgid:114 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_timeout+0xa8/0x120
schedule+0xb7/0x140
schedule_timeout+0xa8/0x120
? __pfx_process_timeout+0x10/0x10
msleep_interruptible+0x3e/0x60
read_dummy+0x2d/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
[akpm@linux-foundation.org: implement debug_show_blocker() in C rather than in CPP]
Link: https://lkml.kernel.org/r/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com
Link: https://lkml.kernel.org/r/174046695384.2194069.16796289525958195643.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Lance Yang <ioworker0@gmail.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm/vmscan: don't try to reclaim hwpoison folio".
Fix a bug during memory reclaim if a folio is hwpoisoned.
This patch (of 2):
Introduce helper folio_contain_hwpoisoned_page() to check if the entire
folio is hwpoisoned or it contains hwpoisoned pages.
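The check the helper performs can be pictured with a small userspace model. The struct and its three booleans are assumptions standing in for folio flags, and folio_contains_poison() is a hypothetical name, not the kernel helper itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the folio state relevant to the check. */
struct folio_sketch {
	bool hwpoison;		/* the folio as a whole is marked poisoned */
	bool large;		/* the folio spans more than one page */
	bool has_hwpoisoned;	/* some page inside a large folio is poisoned */
};

/* True if the whole folio or any page inside it is hwpoisoned. */
bool folio_contains_poison(const struct folio_sketch *folio)
{
	assert(folio != NULL);
	return folio->hwpoison || (folio->large && folio->has_hwpoisoned);
}
```

A caller in the reclaim path could then skip any folio for which this returns true, covering both cases with one test.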
Link: https://lkml.kernel.org/r/20250318083939.987651-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20250318083939.987651-2-tujinjiang@huawei.com
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Adding Proactive Memory Reclaim Statistics".
These two patches are related to proactive memory reclaim.
Patch 1 splits proactive reclaim statistics from direct reclaim counters
and introduces new counters: pgsteal_proactive, pgdemote_proactive,
and pgscan_proactive.
Patch 2 adds pswpin and pswpout items to the cgroup-v2 documentation.
This patch (of 2):
In proactive memory reclaim scenarios, it is necessary to accurately track
proactive reclaim statistics to dynamically adjust the frequency and
amount of memory being reclaimed proactively. Currently, proactive
reclaim is included in direct reclaim statistics, which can make these
direct reclaim statistics misleading.
Therefore, separate proactive reclaim memory from the direct reclaim
counters by introducing new counters: pgsteal_proactive,
pgdemote_proactive, and pgscan_proactive, to avoid confusion with direct
reclaim.
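The effect of the split can be sketched as per-source counters. Everything below (the enum, the struct and account_reclaim()) is a hypothetical model rather than the vmstat code; it only shows that proactive reclaim gets its own slot instead of being added to the direct reclaim one.

```c
#include <assert.h>

/* Hypothetical set of reclaim sources, each with its own counters. */
enum reclaim_source {
	RECLAIM_KSWAPD,
	RECLAIM_DIRECT,
	RECLAIM_PROACTIVE,
	RECLAIM_NR_SOURCES
};

struct reclaim_stats {
	unsigned long pgscan[RECLAIM_NR_SOURCES];
	unsigned long pgsteal[RECLAIM_NR_SOURCES];
};

/* Charge scanned and reclaimed pages to the source that triggered them. */
void account_reclaim(struct reclaim_stats *st, enum reclaim_source src,
		     unsigned long scanned, unsigned long stolen)
{
	assert(st && src < RECLAIM_NR_SOURCES);
	st->pgscan[src] += scanned;
	st->pgsteal[src] += stolen;
}
```

A userspace controller can then read the proactive slot alone when tuning how often and how much to reclaim.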
Link: https://lkml.kernel.org/r/20250318075833.90615-1-jiahao.kernel@gmail.com
Link: https://lkml.kernel.org/r/20250318075833.90615-2-jiahao.kernel@gmail.com
Signed-off-by: Hao Jia <jiahao1@lixiang.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm/damon: introduce DAMOS filter type for active pages".
The memory reclaim algorithm categorizes pages into active and inactive
lists, separately for file and anon pages. The system's performance
relies heavily on the (relative and absolute) accuracy of this
categorization.
This patch series adds a new DAMOS filter for pages' activeness, giving us
visibility into the access frequency of the pages on each list. This
insight can help us diagnose issues with the active-inactive balancing
dynamics, and make decisions to optimize reclaim efficiency and memory
utilization.
For instance, we might decide to enable DAMON_LRU_SORT, if we find that
there are pages on the active list that are infrequently accessed, or less
frequently accessed than pages on the inactive list.
This patch (of 2):
Implement a DAMOS filter type for active pages in the DAMON kernel API, and
add support for it to the physical address space DAMON operations set
(paddr).
Link: https://lkml.kernel.org/r/20250318183029.2062917-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20250318183029.2062917-2-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Suggested-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "track memory used by balloon drivers", v2.
This series introduces a way to track memory used by balloon drivers.
Add a NR_BALLOON_PAGES counter to track how many pages are reclaimed by
the balloon drivers. First add the accounting, then update the balloon
drivers (virtio, Hyper-V, VMware, pseries-cmm, and Xen) to maintain this
counter. The virtio, VMware, and pseries-cmm balloon drivers utilize the
balloon_compaction interface to allocate and free balloon pages. Other
balloon drivers will have to maintain this counter manually.
This makes the information visible in memory reporting interfaces like
/proc/meminfo, show_mem, and OOM reporting.
This provides admins visibility into their VM balloon sizes without
requiring different virtualization tooling. Furthermore, this information
is helpful when debugging an OOM inside a VM.
This patch (of 4):
Add NR_BALLOON_PAGES counter to track memory used by balloon drivers and
expose it through /proc/meminfo and other memory reporting interfaces.
[npache@redhat.com: document Balloon Meminfo entry]
Link: https://lkml.kernel.org/r/a0315ccf-f244-460e-8643-fd7388724fe5@redhat.com
Link: https://lkml.kernel.org/r/20250314213757.244258-1-npache@redhat.com
Link: https://lkml.kernel.org/r/20250314213757.244258-2-npache@redhat.com
Signed-off-by: Nico Pache <npache@redhat.com>
Cc: Alexander Atanasov <alexander.atanasov@virtuozzo.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There's no need to check which kind of pointer is in the memcg_data field,
all we actually care about is whether it's zero or not. Saves 70 bytes in
workingset_activation() with the Debian config.
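To see why a plain non-zero test is enough, consider a tagged-pointer field. The sketch below is a userspace model: the tag values and function names are hypothetical, and it assumes that a tag bit is never set without an accompanying pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_MASK 0x3UL	/* hypothetical low bits encoding the pointer kind */

/* Longer route: strip the tag bits, then test the pointer part. */
bool charged_by_decoding(uintptr_t memcg_data)
{
	uintptr_t ptr = memcg_data & ~(uintptr_t)TAG_MASK;

	return ptr != 0;
}

/* Shorter route: any non-zero value means something is attached. */
bool charged_by_nonzero(uintptr_t memcg_data)
{
	return memcg_data != 0;
}

/* Debug helper: both routes agree when tags never appear without a pointer. */
void check_equivalence(uintptr_t memcg_data)
{
	assert(charged_by_decoding(memcg_data) == charged_by_nonzero(memcg_data));
}
```

Skipping the decode saves a mask and a branch on every call, which is where code-size reductions of this kind tend to come from.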
Link: https://lkml.kernel.org/r/20250314133617.138071-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The last argument to split_page_memcg() is now always 0, so remove it,
effectively reverting commit b8791381d7ed.
Link: https://lkml.kernel.org/r/20250314133617.138071-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Minor memcg cleanups & prep for memdescs", v2.
Separate the handling of accounted folios and GFP_ACCOUNT pages to make
the code easier to understand. For more detail, see
https://lore.kernel.org/linux-mm/Z9LwTOudOlCGny3f@casper.infradead.org/
This patch (of 5):
Folios always use memcg_data to refer to the mem_cgroup while pages
allocated with GFP_ACCOUNT have a pointer to the obj_cgroup. Since the
caller already knows what it has, split the function into two and then we
don't need to check.
Move the assignment of split folio memcg_data to the point where we set up
the other parts of the new folio. That leaves folio_split_memcg_refs()
just handling the memcg accounting.
Link: https://lkml.kernel.org/r/20250314133617.138071-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20250314133617.138071-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The pds_fwctl driver doesn't know what RPC operations are available
in the firmware, so it also doesn't know what scope they might have. The
userland utility supplies the firmware "endpoint" and "operation" id values
and this driver queries the firmware for endpoints and their available
operations. The operation descriptions include the scope information
which the driver uses for scope testing.
Link: https://patch.msgid.link/r/20250320194412.67983-6-shannon.nelson@amd.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Initial files for adding a new fwctl driver for the AMD/Pensando PDS
devices. This sets up a simple auxiliary_bus driver that registers
with fwctl subsystem. It expects that a pds_core device has set up
the auxiliary_device pds_core.fwctl.
Link: https://patch.msgid.link/r/20250320194412.67983-5-shannon.nelson@amd.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
hwspin_lock_get_id() has been unused since the original 2011
commit bd9a4c7df256 ("drivers: hwspinlock: add framework").
Remove it and the corresponding docs.
Note that the of_hwspin_lock_get_id() version is still in use,
so leave that alone.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20241215022023.181435-1-linux@treblig.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
devm_hwspin_lock_request() was added by 2018's
commit 4f1acd758b08 ("hwspinlock: Add devm_xxx() APIs to request/free
hwlock"); however, it has never been used. Everyone uses the
devm_hwspin_lock_request_specific() call instead.
Remove it.
Similarly, the non-devm variant isn't used.
Remove it, and the documentation referring to it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20241027205445.239108-1-linux@treblig.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
SIOCBRDELIF is passed to dev_ioctl() first and later forwarded to
br_ioctl_call(), which causes unnecessary RTNL dance and the splat
below [0] under RTNL pressure.
Let's say Thread A is trying to detach a device from a bridge and
Thread B is trying to remove the bridge.
In dev_ioctl(), Thread A bumps the bridge device's refcnt by
netdev_hold() and releases RTNL because the following br_ioctl_call()
also re-acquires RTNL.
In the race window, Thread B could acquire RTNL and try to remove
the bridge device. Then, rtnl_unlock() by Thread B will release RTNL
and wait for netdev_put() by Thread A.
Thread A, however, must hold RTNL after the unlock in dev_ifsioc(),
which may take long under RTNL pressure, resulting in the splat by
Thread B.
Thread A (SIOCBRDELIF)           Thread B (SIOCBRDELBR)
----------------------           ----------------------
sock_ioctl                       sock_ioctl
`- sock_do_ioctl                 `- br_ioctl_call
   `- dev_ioctl                     `- br_ioctl_stub
      |- rtnl_lock                     |
      |- dev_ifsioc                    '
      '  |- dev = __dev_get_by_name(...)
         |- netdev_hold(dev, ...)      .
     /   |- rtnl_unlock  ------.       |
     |   |- br_ioctl_call      `--->   |- rtnl_lock
Race |   |  `- br_ioctl_stub           |- br_del_bridge
Window   |     |                       |  |- dev = __dev_get_by_name(...)
     |   |     |  May take long        |  `- br_dev_delete(dev, ...)
     |   |     |  under RTNL pressure  |     `- unregister_netdevice_queue(dev, ...)
     |   |     |               |       `- rtnl_unlock
     \   |     |- rtnl_lock  <-'          `- netdev_run_todo
         |     |- ...                        `- netdev_run_todo
         |     `- rtnl_unlock                   |- __rtnl_unlock
         |                                      |- netdev_wait_allrefs_any
         |- netdev_put(dev, ...) <--------------'
                                             Wait refcnt decrement
                                             and log splat below
To avoid blocking SIOCBRDELBR unnecessarily, let's not call
dev_ioctl() for SIOCBRADDIF and SIOCBRDELIF.
In the dev_ioctl() path, we do the following:
1. Copy struct ifreq by get_user_ifreq in sock_do_ioctl()
2. Check CAP_NET_ADMIN in dev_ioctl()
3. Call dev_load() in dev_ioctl()
4. Fetch the master dev from ifr.ifr_name in dev_ifsioc()
3. can be done by request_module() in br_ioctl_call(), so we move
1., 2., and 4. to br_ioctl_stub().
Note that 2. is also checked later in add_del_if(), but it's better
performed before RTNL.
SIOCBRADDIF and SIOCBRDELIF have been processed in dev_ioctl() since
the pre-git era, and there seems to be no specific reason to process
them there.
[0]:
unregister_netdevice: waiting for wpan3 to become free. Usage count = 2
ref_tracker: wpan3@ffff8880662d8608 has 1/1 users at
__netdev_tracker_alloc include/linux/netdevice.h:4282 [inline]
netdev_hold include/linux/netdevice.h:4311 [inline]
dev_ifsioc+0xc6a/0x1160 net/core/dev_ioctl.c:624
dev_ioctl+0x255/0x10c0 net/core/dev_ioctl.c:826
sock_do_ioctl+0x1ca/0x260 net/socket.c:1213
sock_ioctl+0x23a/0x6c0 net/socket.c:1318
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl fs/ioctl.c:892 [inline]
__x64_sys_ioctl+0x1a4/0x210 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcb/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 893b19587534 ("net: bridge: fix ioctl locking")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Reported-by: yan kang <kangyan91@outlook.com>
Reported-by: yue sun <samsun1006219@gmail.com>
Closes: https://lore.kernel.org/netdev/SY8P300MB0421225D54EB92762AE8F0F2A1D32@SY8P300MB0421.AUSP300.PROD.OUTLOOK.COM/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250316192851.19781-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
So far s390 does not allow mmap() of PCI resources to user-space via the
usual mechanisms, though it does use it for RDMA. For the PCI sysfs
resource files and /proc/bus/pci it defines neither HAVE_PCI_MMAP nor
ARCH_GENERIC_PCI_MMAP_RESOURCE. For vfio-pci s390 previously relied on
disabled VFIO_PCI_MMAP and now relies on setting pdev->non_mappable_bars
for all devices.
This is partly because access to mapped PCI resources from user-space
requires special PCI load/store memory-I/O (MIO) instructions, or the
special MMIO syscalls when these are not available. Still, such access is
possible and useful not just for RDMA; in fact, not being able to mmap() PCI
resources has previously caused extra work when testing devices.
One thing that doesn't work with PCI resources mapped to user-space though
is the s390 specific virtual ISM device. Not only because the BAR size of
256 TiB prevents mapping the whole BAR but also because access requires use
of the legacy PCI instructions which are not accessible to user-space on
systems with the newer MIO PCI instructions.
Now with the pdev->non_mappable_bars flag ISM can be excluded from mapping
its resources while making this functionality available for all other PCI
devices. To this end introduce a minimal implementation of PCI_QUIRKS and
use that to set pdev->non_mappable_bars for ISM devices only. Then also set
ARCH_GENERIC_PCI_MMAP_RESOURCE to take advantage of the generic
implementation of pci_mmap_resource_range() enabling only the newer sysfs
mmap() interface. This follows the recommendation in
Documentation/PCI/sysfs-pci.rst.
Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-3-c5c0f1d26efd@linux.ibm.com
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The ability to map PCI resources to user-space is controlled by global
defines. For vfio there is VFIO_PCI_MMAP which is only disabled on s390 and
controls mapping of PCI resources using vfio-pci with a fallback option via
the pread()/pwrite() interface.
For the PCI core there is ARCH_GENERIC_PCI_MMAP_RESOURCE which enables a
generic implementation for mapping PCI resources plus the newer sysfs
interface. Then there is HAVE_PCI_MMAP which can be used with custom
definitions of pci_mmap_resource_range() and the historical /proc/bus/pci
interface. Both mechanisms are all or nothing.
For s390 mapping PCI resources is possible and useful for testing and
certain applications such as QEMU's vfio-pci based user-space NVMe driver.
For certain devices, however, access to PCI resources via mappings to
user-space is not possible and these must be excluded from the general PCI
resource mapping mechanisms.
Introduce pdev->non_mappable_bars to indicate that a PCI device's BARs can
not be accessed via mappings to user-space. In the future this enables
per-device restrictions of PCI resource mapping.
For now, set this flag for all PCI devices on s390 in line with the
existing, general disable of PCI resource mapping. As s390 is the only user
of the VFIO_PCI_MMAP Kconfig option, this can already be replaced with a
check of this new flag. Also add similar checks in the other code protected
by HAVE_PCI_MMAP or ARCH_GENERIC_PCI_MMAP_RESOURCE in preparation for
enabling these for supported devices.
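The per-device check can be pictured with a small standalone model. This
is a sketch only: struct pci_dev_model stands in for struct pci_dev, and
both the helper name pci_mmap_allowed() and the -EINVAL return value are
assumptions, not taken from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct pci_dev; only the new flag is modeled. */
struct pci_dev_model {
	bool non_mappable_bars;
};

/* Hypothetical gate run before any BAR is mapped to user-space:
 * devices that set the flag are refused, all others pass. */
static int pci_mmap_allowed(const struct pci_dev_model *pdev)
{
	assert(pdev != NULL);
	return pdev->non_mappable_bars ? -EINVAL : 0;
}
```

Because the decision is read from the device rather than from a
build-time define, it can differ from one device to the next.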
Link: https://lore.kernel.org/lkml/20250212132808.08dcf03c.alex.williamson@redhat.com/
Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-2-c5c0f1d26efd@linux.ibm.com
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
io_uring_cmd_import_fixed_vec() is a cmd helper around vectored
registered buffer import functions, which caches the memory under
the hood. The lifetime of the vector and hence the iterator is bound to
the request. Furthermore, the user is not allowed to call it multiple
times for a single request.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/97487a80dec3fb8cf8aeedf1f9026ef6d503fe4b.1742579999.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Propagate the NFS_MOUNT_NETUNREACH_FATAL flag to work with the pNFS
flexfiles client. In these circumstances, the client needs to treat the
ENETDOWN and ENETUNREACH errors as fatal, and should abandon the
attempted I/O.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Propagate the NFS_MOUNT_NETUNREACH_FATAL flag to work with the generic
NFS client. If the flag is set, the client will receive ENETDOWN and
ENETUNREACH errors from the RPC layer, and is expected to treat them as
being fatal.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
|
|
If the NFS client was initially created in a container, and that
container is torn down, there is usually no possibility to go back and
destroy any NFS clients that are hung because their virtual network
devices have been unlinked.
Add a flag that tells the NFS client that in these circumstances, it
should treat ENETDOWN and ENETUNREACH errors as fatal to the NFS client.
The option defaults to being on when the mount happens from inside a net
namespace that is not "init_net".
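A standalone model of the intended behaviour, as a sketch: the two
function names below are invented for illustration, and the real client
works on mount flags and RPC task status rather than plain booleans.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Model: with the flag set, ENETDOWN and ENETUNREACH end the I/O;
 * without it, no error is treated as fatal by this check. */
static bool nfs_net_error_is_fatal(int err, bool netunreach_fatal)
{
	if (!netunreach_fatal)
		return false;
	return err == -ENETDOWN || err == -ENETUNREACH;
}

/* Model of the default: on unless the mount comes from init_net. */
static bool netunreach_fatal_default(bool mounted_from_init_net)
{
	return !mounted_from_init_net;
}
```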
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Writing to this file will clone the 'main' xprt of an xprt_switch and
add it to be used as an additional connection.
--
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
v3: Replace call to xprt_iter_get_xprt() with xprt_iter_get_next()
Link: https://lore.kernel.org/r/20250207204225.594002-5-anna@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
There are certain users that wish to force the NFS client to choose
READDIRPLUS over READDIR for a particular mount. Update the "rdirplus" mount
option to optionally accept values. For "rdirplus=force", the NFS client
will always attempt to use READDIRPLUS. The setting of "rdirplus=none" is
aliased to the existing "nordirplus".
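The accepted spellings can be summarised with a small standalone parser.
This is an illustrative sketch, not the kernel's fs_context option
parsing; the enum and function names are made up.

```c
#include <assert.h>
#include <string.h>

enum rdirplus_mode {
	RDIRPLUS_DEFAULT,	/* plain "rdirplus": client decides */
	RDIRPLUS_NONE,		/* "rdirplus=none" or "nordirplus" */
	RDIRPLUS_FORCE,		/* "rdirplus=force" */
	RDIRPLUS_INVALID,	/* anything else */
};

/* Map one mount option string to the mode it selects. */
static enum rdirplus_mode parse_rdirplus(const char *opt)
{
	assert(opt != NULL);
	if (!strcmp(opt, "rdirplus"))
		return RDIRPLUS_DEFAULT;
	if (!strcmp(opt, "rdirplus=force"))
		return RDIRPLUS_FORCE;
	if (!strcmp(opt, "rdirplus=none") || !strcmp(opt, "nordirplus"))
		return RDIRPLUS_NONE;
	return RDIRPLUS_INVALID;
}
```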
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/c4cf0de4c8be0930b91bc74bee310d289781cd3b.1741885071.git.bcodding@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
The 'comp' compression API has been superseded by the acomp API, which
is a bit more cumbersome to use, but ultimately more flexible when it
comes to hardware implementations.
Now that all the users and implementations have been removed, let's
remove the core plumbing of the 'comp' API as well.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add ACOMP_REQUEST_ALLOC which is a wrapper around acomp_request_alloc
that falls back to a synchronous stack request if the allocation
fails.
Also add ACOMP_REQUEST_ON_STACK which stores the request on the stack
only.
The request should be freed with acomp_request_free.
Finally add acomp_request_alloc_extra which gives the user extra
memory to use in conjunction with the request.
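The fallback pattern can be modeled in plain C. This is a sketch under
stated assumptions: struct req_model, req_alloc_or_stack() and req_free()
are hypothetical stand-ins and do not reproduce the signatures of the
acomp macros; the simulate_failure argument exists only to exercise the
fallback path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for a request; on_stack records where the storage lives. */
struct req_model {
	bool on_stack;
};

/* Try the heap first; if that fails, use the caller's stack storage. */
static struct req_model *req_alloc_or_stack(struct req_model *stack_req,
					    bool simulate_failure)
{
	struct req_model *req = simulate_failure ? NULL : malloc(sizeof(*req));

	assert(stack_req != NULL);
	if (!req) {
		req = stack_req;
		req->on_stack = true;
	} else {
		req->on_stack = false;
	}
	return req;
}

/* One free routine for both cases: only heap requests are released. */
static void req_free(struct req_model *req)
{
	if (!req->on_stack)
		free(req);
}
```

A single free routine that accepts both kinds of request keeps callers
from having to track which path the allocation took.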
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add kmap_local support to the scatterlist iterator. Use it for
all the helper functions in lib/scatterlist.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
fsl_mc_allocator_driver_exit() was added explicitly by
commit 1e8ac83b6caf ("bus: fsl-mc: add fsl_mc_allocator cleanup function")
but was never used.
Remove it.
fsl_mc_portal_reset() was added in 2015 by
commit 197f4d6a4a00 ("staging: fsl-mc: fsl-mc object allocator driver")
but was never used.
Remove it.
fsl_mc_portal_reset() was the only caller of dpmcp_reset().
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/r/20241115152055.279732-1-linux@treblig.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
|
|
Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both
MSI and MSI-X entries globally, regardless of the IRQ chip they are using.
Only Xen sets the pci_msi_ignore_mask when routing physical interrupts over
event channels, to prevent PCI code from attempting to toggle the mask bit,
as it's Xen that controls the bit.
However, the pci_msi_ignore_mask being global will affect devices that use
MSI interrupts but are not routing those interrupts over event channels
(not using the Xen pIRQ chip). One example is devices behind a VMD PCI
bridge. In that scenario the VMD bridge configures MSI(-X) using the
normal IRQ chip (the pIRQ one in the Xen case), and devices behind the
bridge configure the MSI entries using indexes into the VMD bridge MSI
table. The VMD bridge then demultiplexes such interrupts and delivers to
the destination device(s). Having pci_msi_ignore_mask set in that scenario
prevents (un)masking of MSI entries for devices behind the VMD bridge.
Move the signaling of no entry masking into the MSI domain flags, as that
allows setting it on a per-domain basis. Set it for the Xen MSI domain
that uses the pIRQ chip, while leaving it unset for the rest of the
cases.
Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
with Xen dropping usage the variable is unneeded.
This fixes using devices behind a VMD bridge on Xen PV hardware domains.
Although devices behind a VMD bridge are not known to Xen, that doesn't mean
Linux cannot use them. With the usage of VMD_FEAT_CAN_BYPASS_MSI_REMAP
inhibited and the pci_msi_ignore_mask bodge removed, devices behind a VMD
bridge do work fine when used from a Linux Xen hardware domain. That's the
whole point of the series.
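The difference between the global variable and a per-domain flag can be
shown with a small standalone model. This is a sketch: the flag name
MODEL_MSI_FLAG_NO_MASK and the struct are invented for illustration and
are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MSI_FLAG_NO_MASK (1u << 0)	/* hypothetical flag bit */

/* Stand-in for an MSI domain; only its flags are modeled. */
struct msi_domain_model {
	unsigned int flags;
};

/* Masking is decided per domain, so one domain opting out does not
 * affect entries that belong to any other domain. */
static bool msi_may_toggle_mask(const struct msi_domain_model *d)
{
	assert(d != NULL);
	return !(d->flags & MODEL_MSI_FLAG_NO_MASK);
}
```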
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Message-ID: <20250219092059.90850-4-roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
This patch adds struct_ops context information to struct bpf_prog_aux.
This context information will be used in the kfunc filter.
Currently the added context information includes struct_ops member
offset and a pointer to struct bpf_struct_ops.
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://patch.msgid.link/20250319215358.2287371-2-ameryhung@gmail.com
|
|
Add a fabrics option 'concat' to request secure channel concatenation as
specified in the NVMe Base Specification v2.1, section 8.3.4.3: Secure Channel
Concatenation.
When secure channel concatenation is enabled a 'generated PSK' is inserted
into the keyring such that it's available after reset.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Add a function to refresh a generated PSK in the specified keyring.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Add a function to derive the TLS PSK as specified in TP8018.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Add a function to calculate the PSK digest as specified in TP8018.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Add a function to generate a NVMe PSK from the shared credentials
negotiated by DH-HMAC-CHAP.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Add support for a new fwctl-based auxiliary_device for creating a
channel for fwctl support into the AMD/Pensando DSC.
Link: https://patch.msgid.link/r/20250320194412.67983-4-shannon.nelson@amd.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
For some reason, cardbus related io/mem size declarations are in
linux/pci.h, whereas non-cardbus sizes are already in pci/pci.h.
Move them all into one place in pci/pci.h.
Link: https://lore.kernel.org/r/20250311174701.3586-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
pci_setup_bridge() is only used within setup-bus.c. Therefore, make it a
static function.
Link: https://lore.kernel.org/r/20250311174701.3586-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Neither pci_reassign_bridge_resources() nor pci_reassign_resource() is used
outside of the PCI subsystem. They seem to be naturally static functions
but since resource fitting/assignment is split between setup-bus.c and
setup-res.c, they fall into different sides of the divide and need to be
declared.
Move the declarations of pci_reassign_bridge_resources() and
pci_reassign_resource() into pci/pci.h to keep them internal to the PCI
subsystem.
Link: https://lore.kernel.org/r/20250311174701.3586-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
pci_rescan_bus_bridge_resize() is only used by code inside the PCI subsystem.
The comment also falsely advertises it to be for hotplug drivers, yet the
only caller is a sysfs store function. Move the function declaration
into pci/pci.h.
Link: https://lore.kernel.org/r/20250311174701.3586-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Since commit 4e1a7df45480 ("cpumask: Add enabled cpumask
for present CPUs that can be brought online") introduced
cpu_enabled_mask, the comment line describing the mask
has been slightly out of alignment with the adjacent
lines.
Fix this by removing a single space character.
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
|
|
The union vmpacket_largest_possible_header and several structs have not
been used for a long time afaict - remove them.
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20250311091634.494888-2-thorsten.blum@linux.dev
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250311091634.494888-2-thorsten.blum@linux.dev>
|
|
Cross-merge networking fixes after downstream PR (net-6.14-rc8).
Conflict:
tools/testing/selftests/net/Makefile
03544faad761 ("selftest: net: add proc_net_pktgen")
3ed61b8938c6 ("selftests: net: test for lwtunnel dst ref loops")
tools/testing/selftests/net/config:
85cb3711acb8 ("selftests: net: Add test cases for link and peer netns")
3ed61b8938c6 ("selftests: net: test for lwtunnel dst ref loops")
Adjacent commits:
tools/testing/selftests/net/Makefile
c935af429ec2 ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY")
355d940f4d5a ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The immediate issue being fixed here is a nVMX bug where KVM fails to
detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI).
However, checking for a pending interrupt accesses the legacy PIC, and
x86's kvm_arch_destroy_vm() currently frees the PIC before destroying
vCPUs, i.e. checking for IRQs during the forced nested VM-Exit results
in a NULL pointer deref; that's a prerequisite for the nVMX fix.
The remaining patches attempt to bring a bit of sanity to x86's VM
teardown code, which has accumulated a lot of cruft over the years. E.g.
KVM currently unloads each vCPU's MMUs in a separate operation from
destroying vCPUs, all because when guest SMP support was added, KVM had a
kludgy MMU teardown flow that broke when a VM had more than one vCPU.
And that oddity lived on, for 18 years...
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.15
- Nested virtualization support for VGICv3, giving the nested
hypervisor control of the VGIC hardware when running an L2 VM
- Removal of 'late' nested virtualization feature register masking,
making the supported feature set directly visible to userspace
- Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
- Paravirtual interface for discovering the set of CPU implementations
where a VM may run, addressing a longstanding issue of guest CPU
errata awareness in big-little systems and cross-implementation VM
migration
- Userspace control of the registers responsible for identifying a
particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
allowing VMs to be migrated cross-implementation
- pKVM updates, including support for tracking stage-2 page table
allocations in the protected hypervisor in the 'SecPageTable' stat
- Fixes to vPMU, ensuring that userspace updates to the vPMU after
KVM_RUN are reflected into the backing perf events
|