Age | Commit message (Collapse) | Author |
|
The name is Core Performance Boost (CPB) for the cpuid flag. Correct
cpuid caps flag to use this name (instead of CBP).
Signed-off-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
The msr_pstate union struct named fam17h_bits is misleading since
this is the struct to use for all families >= 0x17, not just
for family 0x17. Rename the bits structs to be 'pstate' (for pre
family 17h CPUs) and 'pstatedef' (for CPUs since fam 17h) to align
closer with PPR/BDKG (1) naming.
There are no functional changes as part of this update.
1: AMD Processor Programming Reference (PPR) and BIOS and
Kernel Developer's Guide (BKDG) available at:
http://developer.amd.com/resources/developer-guides-manuals
Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Reviewed-by: skhan@linuxfoundation.org
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Fix bug in handling bpf_testmod unloading that will cause test_progs exiting
prematurely if bpf_testmod unloading failed. This is especially problematic
when running a subset of test_progs that doesn't require root permissions and
doesn't rely on bpf_testmod, yet will fail immediately due to exit(1) in
unload_bpf_testmod().
Fixes: 9f7fa225894c ("selftests/bpf: Add bpf_testmod kernel module for testing")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210126065019.1268027-1-andrii@kernel.org
|
|
There exists many build warnings when make M=samples/bpf on the Loongson
platform, this issue is MIPS related, x86 compiles just fine.
Here are some warnings:
CC samples/bpf/ibumad_user.o
samples/bpf/ibumad_user.c: In function ‘dump_counts’:
samples/bpf/ibumad_user.c:46:24: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf("0x%02x : %llu\n", key, value);
~~~^ ~~~~~
%lu
CC samples/bpf/offwaketime_user.o
samples/bpf/offwaketime_user.c: In function ‘print_ksym’:
samples/bpf/offwaketime_user.c:34:17: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf("%s/%llx;", sym->name, addr);
~~~^ ~~~~
%lx
samples/bpf/offwaketime_user.c: In function ‘print_stack’:
samples/bpf/offwaketime_user.c:68:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf(";%s %lld\n", key->waker, count);
~~~^ ~~~~~
%ld
MIPS needs __SANE_USERSPACE_TYPES__ before <linux/types.h> to select
'int-ll64.h' in arch/mips/include/uapi/asm/types.h, then it can avoid
build warnings when printing __u64 with %llu, %llx or %lld.
The header tools/include/linux/types.h defines __SANE_USERSPACE_TYPES__,
it seems that we can include <linux/types.h> in the source files which
have build warnings, but it has no effect due to actually it includes
usr/include/linux/types.h instead of tools/include/linux/types.h, the
problem is that "usr/include" is preferred first than "tools/include"
in samples/bpf/Makefile, that sounds like a ugly hack to -Itools/include
before -Iusr/include.
So define __SANE_USERSPACE_TYPES__ for MIPS in samples/bpf/Makefile
is proper, if add "TPROGS_CFLAGS += -D__SANE_USERSPACE_TYPES__" in
samples/bpf/Makefile, it appears the following error:
Auto-detecting system features:
... libelf: [ on ]
... zlib: [ on ]
... bpf: [ OFF ]
BPF API too old
make[3]: *** [Makefile:293: bpfdep] Error 1
make[2]: *** [Makefile:156: all] Error 2
With #ifndef __SANE_USERSPACE_TYPES__ in tools/include/linux/types.h,
the above error has gone and this ifndef change does not hurt other
compilations.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/1611551146-14052-1-git-send-email-yangtiezhu@loongson.cn
|
|
Update struct bpf_perf_event_data with the addr field to match the
tools headers with the kernel headers.
Signed-off-by: Florian Lehner <dev@der-flo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210123185221.23946-1-dev@der-flo.net
|
|
There is no need to cast to void * when the argument is void *. Avoid
cluttering of code.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-13-bjorn.topel@gmail.com
|
|
Use calloc instead of malloc where it makes sense, and avoid C++-style
void *-cast.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-12-bjorn.topel@gmail.com
|
|
The data variable is only used locally. Instead of using the heap,
stick to using the stack.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-11-bjorn.topel@gmail.com
|
|
Use C89 rules for variable definition.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-10-bjorn.topel@gmail.com
|
|
Instead of casting from void *, let us use the actual type in
gen_udp_hdr().
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-9-bjorn.topel@gmail.com
|
|
Instead of casting from void *, let us use the actual type in
init_iface_config().
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-8-bjorn.topel@gmail.com
|
|
Let us use a local variable in nsswitchthread(), so we can remove a
lot of casting for better readability.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-7-bjorn.topel@gmail.com
|
|
Introduce a local variable to get rid of lot of casting. Move common
code outside the if/else-clause.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-6-bjorn.topel@gmail.com
|
|
The allocated entry is immediately overwritten by an assignment. Fix
that.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-5-bjorn.topel@gmail.com
|
|
Silence three checkpatch style warnings.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-4-bjorn.topel@gmail.com
|
|
The enums undef and bidi are not used. Remove them.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-3-bjorn.topel@gmail.com
|
|
Instead of passing void * all over the place, let us pass the actual
type (ifobject) and remove the void-ptr-to-type-ptr casting.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210122154725.22140-2-bjorn.topel@gmail.com
|
|
Add detection for kernel version, and adapt the BPF program based on
kernel support. This way, users will get the best possible performance
from the BPF program.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210122105351.11751-4-bjorn.topel@gmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:
- Adjust objtool to handle a recent binutils change to not generate
unused symbols anymore.
- Revert the fail-the-build-on-fatal-errors objtool strategy for now
due to the ever-increasing matrix of supported toolchains/plugins and
them causing too many such fatal errors currently.
- Do not add empty symbols to objdump's rbtree to accommodate clang
removing section symbols.
* tag 'objtool_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Don't fail on missing symbol table
objtool: Don't fail the kernel build on fatal errors
objtool: Don't add empty symbols to the rbtree
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix a bad interaction between the scv handling and the fallback L1D
flush, which could lead to user register corruption. Only affects
people using scv (~no one) on machines with old firmware that are
missing the L1D flush.
- Two small selftest fixes.
Thanks to Eirik Fuller, Libor Pechacek, Nicholas Piggin, Sandipan Das,
and Tulio Magno Quites Machado Filho.
* tag 'powerpc-5.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: fix scv entry fallback flush vs interrupt
selftests/powerpc: Only test lwm/stmw on big endian
selftests/powerpc: Fix exit status of pkey tests
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah :
"Five fixes to the kunit tool and documentation from Daniel Latypov and
David Gow"
* tag 'linux-kselftest-kunit-fixes-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: tool: move kunitconfig parsing into __init__, make it optional
kunit: tool: fix minor typing issue with None status
kunit: tool: surface and address more typing issues
Documentation: kunit: include example of a parameterized test
kunit: tool: Fix spelling of "diagnostic" in kunit_parser
|
|
Query the maximum number of supported physical ports using devlink-resource
and test that this number can be reached by splitting each of the
splittable ports to its width. Test that an error is returned in case
the maximum number is exceeded.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
HTB doesn't scale well because of contention on a single lock, and it
also consumes CPU. This patch adds support for offloading HTB to
hardware that supports hierarchical rate limiting.
In the offload mode, HTB passes control commands to the driver using
ndo_setup_tc. The driver has to replicate the whole hierarchy of classes
and their settings (rate, ceil) in the NIC. Every modification of the
HTB tree caused by the admin results in ndo_setup_tc being called.
After this setup, the HTB algorithm is done completely in the NIC. An SQ
(send queue) is created for every leaf class and attached to the
hierarchy, so that the NIC can calculate and obey aggregated rate
limits, too. In the future, it can be changed, so that multiple SQs will
back a single leaf class.
ndo_select_queue is responsible for selecting the right queue that
serves the traffic class of each packet.
The data path works as follows: a packet is classified by clsact, the
driver selects a hardware queue according to its class, and the packet
is enqueued into this queue's qdisc.
This solution addresses two main problems of scaling HTB:
1. Contention by flow classification. Currently the filters are attached
to the HTB instance as follows:
# tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80
classid 1:10
It's possible to move classification to clsact egress hook, which is
thread-safe and lock-free:
# tc filter add dev eth0 egress protocol ip flower dst_port 80
action skbedit priority 1:10
This way classification still happens in software, but the lock
contention is eliminated, and it happens before selecting the TX queue,
allowing the driver to translate the class to the corresponding hardware
queue in ndo_select_queue.
Note that this is already compatible with non-offloaded HTB and doesn't
require changes to the kernel nor iproute2.
2. Contention by handling packets. HTB is not multi-queue, it attaches
to a whole net device, and handling of all packets takes the same lock.
When HTB is offloaded, it registers itself as a multi-queue qdisc,
similarly to mq: HTB is attached to the netdev, and each queue has its
own qdisc.
Some features of HTB may be not supported by some particular hardware,
for example, the maximum number of classes may be limited, the
granularity of rate and ceil parameters may be different, etc. - so, the
offload is not enabled by default, a new parameter is used to enable it:
# tc qdisc replace dev eth0 root handle 1: htb offload
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'kfree_rcu.2021.01.04a', 'mmdumpobj.2021.01.22a', 'nocb.2021.01.06a', 'rt.2021.01.04a', 'stall.2021.01.06a', 'torture.2021.01.12a' and 'tortureall.2021.01.06a' into HEAD
doc.2021.01.06a: Documentation updates.
fixes.2021.01.04b: Miscellaneous fixes.
kfree_rcu.2021.01.04a: kfree_rcu() updates.
mmdumpobj.2021.01.22a: Dump allocation point for memory blocks.
nocb.2021.01.06a: RCU callback offload updates and cblist segment lengths.
rt.2021.01.04a: Real-time updates.
stall.2021.01.06a: RCU CPU stall warning updates.
torture.2021.01.12a: Torture-test updates and polling SRCU grace-period API.
tortureall.2021.01.06a: Torture-test script updates.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tools fixes from Arnaldo Carvalho de Melo:
- Fix id index used in Intel PT for heterogeneous systems
- Fix overrun issue in 'perf script' for dynamically-allocated PMU type
number
- Fix 'perf stat' metrics containing the 'duration_time' synthetic
event
- Fix system PMU 'perf stat' metrics
* tag 'perf-tools-fixes-v5.11-2-2021-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf script: Fix overrun issue for dynamically-allocated PMU type number
perf metricgroup: Fix system PMU metrics
perf metricgroup: Fix for metrics containing duration_time
perf evlist: Fix id index for heterogeneous systems
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
"A small collection of bug-fixes and model-specific quirks"
* tag 'platform-drivers-x86-v5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: thinkpad_acpi: Add P53/73 firmware to fan_quirk_table for dual fan control
platform/x86: hp-wmi: Don't log a warning on HPWMI_RET_UNKNOWN_COMMAND errors
platform/x86: intel-vbtn: Drop HP Stream x360 Convertible PC 11 from allow-list
platform/x86: ideapad-laptop: Disable touchpad_switch for ELAN0634
platform/x86: amd-pmc: Fix CONFIG_DEBUG_FS check
platform/x86: thinkpad_acpi: correct palmsensor error checking
platform/x86: intel-vbtn: Support for tablet mode on Dell Inspiron 7352
platform/x86: touchscreen_dmi: Add swap-x-y quirk for Goodix touchscreen on Estar Beauty HD tablet
platform/x86: i2c-multi-instantiate: Don't create platform device for INT3515 ACPI nodes
platform/surface: SURFACE_PLATFORMS should depend on ACPI
platform/surface: surface_gpe: Fix non-PM_SLEEP build warnings
tools/power/x86/intel-speed-select: Set higher of cpuinfo_max_freq or base_frequency
tools/power/x86/intel-speed-select: Set scaling_max_freq to base_frequency
|
|
This affects all ACPICA source code modules.
ACPICA commit c570953c914437e621dd5f160f26ddf352e0d2f4
Link: https://github.com/acpica/acpica/commit/c570953c
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Change 'exeeds' to 'exceeds'.
Signed-off-by: Junlin Yang <yangjunlin@yulong.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210121122309.1501-1-angkery@163.com
|
|
For very large ELF objects (with many sections), we could
get special value SHN_XINDEX (65535) for elf object's string
table index - e_shstrndx.
Call elf_getshdrstrndx to get the proper string table index,
instead of reading it directly from ELF header.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210121202203.9346-4-jolsa@kernel.org
|
|
Thanks to a recent binutils change which doesn't generate unused
symbols, it's now possible for thunk_64.o be completely empty without
CONFIG_PREEMPTION: no text, no data, no symbols.
We could edit the Makefile to only build that file when
CONFIG_PREEMPTION is enabled, but that will likely create confusion
if/when the thunks end up getting used by some other code again.
Just ignore it and move on.
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1254
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
|
This is basically a revert of commit 644592d32837 ("objtool: Fail the
kernel build on fatal errors").
That change turned out to be more trouble than it's worth. Failing the
build is an extreme measure which sometimes gets too much attention and
blocks CI build testing.
These fatal-type warnings aren't yet as rare as we'd hope, due to the
ever-increasing matrix of supported toolchains/plugins and their
fast-changing nature as of late.
Also, there are more people (and bots) looking for objtool warnings than
ever before, so even non-fatal warnings aren't likely to be ignored for
long.
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
|
When unpacking the event which is from dynamic PMU, the array
output[OUTPUT_TYPE_MAX] may be overrun. For example, type number of SKL
uncore_imc is 10, but OUTPUT_TYPE_MAX is 7 now (OUTPUT_TYPE_MAX =
PERF_TYPE_MAX + 1).
/* In builtin-script.c */
process_event()
{
unsigned int type = output_type(attr->type);
if (output[type].fields == 0)
return;
}
output[10] is overrun.
Create a type OUTPUT_TYPE_OTHER for dynamic PMU events, then
output_type(attr->type) will return OUTPUT_TYPE_OTHER here.
Note that if PERF_TYPE_MAX ever changed, then there would be a conflict
between old perf.data files that had a dynamicaliy allocated PMU number
that would then be the same as a fixed PERF_TYPE.
Example:
# perf record --switch-events -C 0 -e "{cpu-clock,uncore_imc/data_reads/,uncore_imc/data_writes/}:SD" -a -- sleep 1
# perf script
Before:
swapper 0 [000] 1479253.987551: 277766 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.987797: 246709 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988127: 329883 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988273: 146393 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988523: 249977 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988877: 354090 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.989023: 145940 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.989383: 359856 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.989523: 140082 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
After:
swapper 0 [000] 1397040.402011: 272384 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1397040.402011: 5396 uncore_imc/data_reads/:
swapper 0 [000] 1397040.402011: 967 uncore_imc/data_writes/:
swapper 0 [000] 1397040.402259: 249153 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1397040.402259: 7231 uncore_imc/data_reads/:
swapper 0 [000] 1397040.402259: 1297 uncore_imc/data_writes/:
swapper 0 [000] 1397040.402508: 249108 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1397040.402508: 5333 uncore_imc/data_reads/:
swapper 0 [000] 1397040.402508: 1008 uncore_imc/data_writes/:
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: https://lore.kernel.org/r/20201209005828.21302-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Joakim reports that getting "perf stat" for multiple system PMU metrics
segfaults:
$ perf stat -a -I 1000 -M imx8mm_ddr_write.all,imx8mm_ddr_write.all
Segmentation fault
$
While the same works without issue for a single metric.
The logic in metricgroup__add_metric_sys_event_iter() is broken, in that
add_metric() @m argument should be NULL for each new metric. Fix by not
passing a holder for that, and rather make local in
metricgroup__add_metric_sys_event_iter().
Fixes: be335ec28efa ("perf metricgroup: Support adding metrics for system PMUs")
Reported-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611050655-44020-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Metrics containing duration_time cause a segfault:
$ perf stat -v -M L1D_Cache_Fill_BW sleep 1
Using CPUID GenuineIntel-6-3D-4
metric expr 64 * l1d.replacement / 1000000000 / duration_time for L1D_Cache_Fill_BW
found event duration_time
found event l1d.replacement
adding {l1d.replacement}:W,duration_time
l1d.replacement -> cpu/umask=0x1,(null)=0x1e8483,event=0x51/
Segmentation fault
$
In commit c2337d67199a1ea1 ("perf metricgroup: Fix metrics using aliases
covering multiple PMUs"), the logic in find_evsel_group() when iter'ing
events was changed to not only select events in same group, but also for
aliased PMUs.
Checking whether events were for aliased PMUs was done by comparing the
event PMU name. This was not safe for duration_time event, which has no
associated PMU (and no PMU name), so fix by checking if the event PMU name
is set also.
Committer testing:
Reproduced the bug, then, on a:
$ grep -m1 ^'model name' /proc/cpuinfo
model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
$
We now get:
$ perf stat -M L1D_Cache_Fill_BW sleep 1
Performance counter stats for 'sleep 1':
4,141 l1d.replacement:u
1,001,285,107 ns duration_time:u
1.001285107 seconds time elapsed
0.000000000 seconds user
0.001119000 seconds sys
$
Detais from -v:
Using CPUID GenuineIntel-6-8E-A
metric expr 64 * l1d.replacement / 1000000000 / duration_time for L1D_Cache_Fill_BW
found event duration_time
found event l1d.replacement
adding {l1d.replacement}:W,duration_time
l1d.replacement -> cpu/(null)=0x1e8483,umask=0x1,event=0x51/
Control descriptor is not initialized
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel and hypervisor samples
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel and hypervisor samples
l1d.replacement:u: 4592 612201 612201
duration_time:u: 1001478621 1001478621 1001478621
Fixes: c2337d67199a1ea1 ("perf metricgroup: Fix metrics using aliases covering multiple PMUs")
Reported-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611159518-226883-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_evlist__set_sid_idx() updates perf_sample_id with the evlist map
index, CPU number and TID. It is passed indexes to the evsel's cpu and
thread maps, but references the evlist's maps instead. That results in
using incorrect CPU numbers on heterogeneous systems. Fix it by using
evsel maps.
The id index (PERF_RECORD_ID_INDEX) is used by AUX area tracing when in
sampling mode. Having an incorrect CPU number causes the trace data to
be attributed to the wrong CPU, and can result in decoder errors because
the trace data is then associated with the wrong process.
Committer notes:
Keep the class prefix convention in the function name, switching from
perf_evlist__set_sid_idx() to perf_evsel__set_sid_idx().
Fixes: 3c659eedada2fbf9 ("perf tools: Add id index")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210121125446.11287-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- rework the character device code to avoid a frame size warning
- fix printk format issues in gpio-tools
- warn on redefinition of the to_irq callback in core gpiolib code
- fix PWM period calculation in gpio-mvebu
- make gpio-sifive Kconfig entry consistent with other drivers
- fix a build issue in gpio-tegra
* tag 'gpio-fixes-for-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: tegra: Add missing dependencies
gpio: sifive: select IRQ_DOMAIN_HIERARCHY rather than depend on it
gpio: mvebu: fix pwm .get_state period calculation
gpiolib: add a warning on gpiochip->to_irq defined
tools: gpio: fix %llu warning in gpio-watch.c
tools: gpio: fix %llu warning in gpio-event-mon.c
gpiolib: cdev: fix frame size warning in gpio_ioctl()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2021-01-21
1) Fix a rare panic on SMP systems when packet reordering
happens between anti replay check and update.
From Shmulik Ladkani.
2) Fix disable_xfrm sysctl when used on xfrm interfaces.
From Eyal Birger.
3) Fix a race in PF_KEY when the availability of crypto
algorithms is set. From Cong Wang.
4) Fix a return value override in the xfrm policy selftests.
From Po-Hsu Lin.
5) Fix an integer wraparound in xfrm_policy_addr_delta.
From Visa Hankala.
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: Fix wraparound in xfrm_policy_addr_delta()
selftests: xfrm: fix test return value override issue in xfrm_policy.sh
af_key: relax availability checks for skb size calculation
xfrm: fix disable_xfrm sysctl when used on xfrm interfaces
xfrm: Fix oops in xfrm_replay_advance_bmp
====================
Link: https://lore.kernel.org/r/20210121121558.621339-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The -lgcc command-line argument is placed poorly in the build options,
which can result in build failures, for exapmle, on ARM when uidiv()
is required. This commit therefore places the -lgcc argument after the
source files.
Fixes: b94ec36896da ("rcutorture: Make use of nolibc when available")
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The documentation header in the nolibc.h file provides an example command
line, but it places the -lgcc argument before the source files, which
can fail with libgcc.a (e.g. on ARM when uidiv is needed). This commit
therefore moves the -lgcc to the end of the command line, hopefully
before this example leaks into makefiles. This is a port of nolibc's
upstream commit b5e282089223 to the Linux kernel.
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
definitions
Some syscalls can be implemented from different __NR_* variants. For
example, sys_dup2() can be implemented based on __NR_dup3 or __NR_dup2.
In this case it is useful to mention both alternatives in error messages
when neither are detected. This information will help the user search for
the right one (e.g __NR_dup3) instead of just the fallback (__NR_dup2)
which might not exist on the platform.
This is a port of nolibc's upstream commit a21080d2ba41 to the Linux
kernel.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/lkml/20210120145447.GC77728@C02TD0UTHF1T.local/
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The __ARCH_WANT_* definitions were added in order to support aarch64
when it was missing some syscall definitions (including __NR_dup2,
__NR_fork, and __NR_getpgrp), but these __ARCH_WANT_* definitions were
actually wrong because these syscalls do not exist on this platform.
Defining these resulted in exposing invalid definitions, resulting in
failures on aarch64.
The missing syscalls were since implemented based on the newer ones
(__NR_dup3, __NR_clone, __NR_getpgid) so these incorrect __ARCH_WANT_*
definitions are no longer needed.
Thanks to Mark Rutland for spotting this incorrect analysis and
explaining why it was wrong.
This is a port of nolibc's upstream commit 00b1b0d9b2a4 to the Linux
kernel.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/lkml/20210119153147.GA5083@paulmck-ThinkPad-P72
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The definitions of timeval(), timespec() and timezone() conflict with
linux/time.h when building, so this commit takes them directly from
linux/time.h. This is a port of nolibc's upstream commit dc45f5426b0c
to the Linux kernel.
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Some architectures like arm64 do not implement poll() and have to use
ppoll() instead. This commit therefore makes poll() use ppoll() when
available. This is a port of nolibc's upstream commit 800f75c13ede to
the Linux kernel.
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Some archs such as arm64 do not have fork() and have to use clone()
instead. This commit therefore makes fork() use clone() when
available. This requires including signal.h to get the definition of
SIGCHLD. This is a port of nolibc's upstream commit d2dc42fd6149 to
the Linux kernel.
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The getpgrp() syscall is not implemented on arm64, so this commit instead
uses getpgid(0) when getpgrp() is not available. This is a port of
nolibc's upstream commit 2379f25073f9 to the Linux kernel.
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
A recent boot failure on 5.4-rc3 on arm64 revealed that sys_dup2()
is not available and that only sys_dup3() is implemented. This commit
detects this and falls back to sys_dup3() when available. This is a
port of nolibc's upstream commit fd5272ec2c66 to the Linux kernel.
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit adds the dup() function, which was omitted when sys_dup()
was defined. This is a port of nolibc's upstream commit 47cc42a79c92
to the Linux kernel.
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc")
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE.
We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom
call in do_tcp_getsockopt using the on-stack data. This removes
3% overhead for locking/unlocking the socket.
Without this patch:
3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt
|
--3.30%--__cgroup_bpf_run_filter_getsockopt
|
--0.81%--__kmalloc
With the patch applied:
0.52% 0.12% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt_kern
Note, exporting uapi/tcp.h requires removing netinet/tcp.h
from test_progs.h because those headers have confliciting
definitions.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-2-sdf@google.com
|
|
llvm patch https://reviews.llvm.org/D84002 permitted
to emit empty rodata datasec if the elf .rodata section
contains read-only data from local variables. These
local variables will be not emitted as BTF_KIND_VARs
since llvm converted these local variables as
static variables with private linkage without debuginfo
types. Such an empty rodata datasec will make
skeleton code generation easy since for skeleton
a rodata struct will be generated if there is a
.rodata elf section. The existence of a rodata
btf datasec is also consistent with the existence
of a rodata map created by libbpf.
The btf with such an empty rodata datasec will fail
in the kernel though as kernel will reject a datasec
with zero vlen and zero size. For example, for the below code,
int sys_enter(void *ctx)
{
int fmt[6] = {1, 2, 3, 4, 5, 6};
int dst[6];
bpf_probe_read(dst, sizeof(dst), fmt);
return 0;
}
We got the below btf (bpftool btf dump ./test.o):
[1] PTR '(anon)' type_id=0
[2] FUNC_PROTO '(anon)' ret_type_id=3 vlen=1
'ctx' type_id=1
[3] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[4] FUNC 'sys_enter' type_id=2 linkage=global
[5] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
[6] ARRAY '(anon)' type_id=5 index_type_id=7 nr_elems=4
[7] INT '__ARRAY_SIZE_TYPE__' size=4 bits_offset=0 nr_bits=32 encoding=(none)
[8] VAR '_license' type_id=6, linkage=global-alloc
[9] DATASEC '.rodata' size=0 vlen=0
[10] DATASEC 'license' size=0 vlen=1
type_id=8 offset=0 size=4
When loading the ./test.o to the kernel with bpftool,
we see the following error:
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
...
[6] ARRAY (anon) type_id=5 index_type_id=7 nr_elems=4
[7] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
[8] VAR _license type_id=6 linkage=1
[9] DATASEC .rodata size=24 vlen=0 vlen == 0
libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.
Basically, libbpf changed .rodata datasec size to 24 since elf .rodata
section size is 24. The kernel then rejected the BTF since vlen = 0.
Note that the above kernel verifier failure can be worked around with
changing local variable "fmt" to a static or global, optionally const, variable.
This patch permits a datasec with vlen = 0 in kernel.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210119153519.3901963-1-yhs@fb.com
|
|
Reuse module_attach infrastructure to add a new bare tracepoint to check
we can attach to it as a raw tracepoint.
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210119122237.2426878-3-qais.yousef@arm.com
|