summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-02-05ixgbe: remove magic constant in ixgbe_reset_hw_82599()Jiri Kosina
ixgbe_reset_hw_82599() resets the value of hw->mac.num_rar_entries to pre-defined value of 128. Let's get rid of that hardcoded literal, and use IXGBE_82599_RAR_ENTRIES instead, the same way the normal initialization path does. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-05igc: Fix code redundancySasha Neftin
Remove redundant igc_check_for_link_base code and replace it with an igc_check_for_copper_link method. Fix duplication of IGC_ADVTXD_PAYLEN_SHIFT mask declaration. Remove obsolete IGC_SCVPC register definition. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-05docs/networking: fix formatting of Intel drivers documentationMike Rapoport
The documentation of Intel drivers is missing the heading adornment for document titles. This causes the generated html to have TOC entries from these documents to appear as top level TOC entries: * Linux* Base Driver for Intel(R) Ethernet Network Connection * Contents * Identifying Your Adapter * Command Line Parameters * AutoNeg * Duplex ... Add overline heading adornment to document titles. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-05igc: Remove unreachable code from igc_phy.c fileSasha Neftin
Address community comment. Remove the unreachable code leads to the static checker warning. PHY functionality will be added later per demand. Reported by Dan Carpenter. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-05e1000e: Exclude device from suspend direct complete optimizationKai-Heng Feng
e1000e sets different WoL settings in system suspend callback and runtime suspend callback. The suspend direct complete optimization leaves e1000e in runtime suspended state with wrong WoL setting during system suspend. To fix this, we need to disable suspend direct complete optimization to let e1000e always use suspend callback to set correct WoL during system suspend. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-05net: marvell: mvpp2: fix lack of link interruptsRussell King
Sven Auhagen reports that if he changes a SFP+ module for a SFP module on the Macchiatobin Single Shot, the link does not come back up. For Sven, it is as easy as: - Insert a SFP+ module connected, and use ping6 to verify link is up. - Remove SFP+ module - Insert SFP 1000base-X module use ping6 to verify link is up: Link up event did not trigger and the link is down but that doesn't show the problem for me. Locally, this has been reproduced by: - Boot with no modules. - Insert SFP+ module, confirm link is up. - Replace module with 25000base-X module. Confirm link is up. - Set remote end down, link is reported as dropped at both ends. - Set remote end up, link is reported up at remote end, but not local end due to lack of link interrupt. Fix this by setting up both GMAC and XLG interrupts for port 0, but only unmasking the appropriate interrupt according to the current mode set in the mac_config() method. However, only do the mask/unmask dance when we are really changing the link mode to avoid missing any link interrupts. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-05net: marvell: mvpp2: use phy_interface_mode_is_8023z() helperRussell King
Use the phy_interface_mode_is_8023z() helper for detecting interface modes that use 802.3z serial encoding. This is equivalent to testing for both 1000base-X and 2500base-X. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-05Merge branch 'nixge-Fixed-link-support'David S. Miller
Moritz Fischer says: ==================== nixge: Fixed-link support This series adds fixed-link support to nixge. The first patch corrects the binding to correctly reflect hardware that does not come with MDIO cores instantiated. The second patch adds fixed link support to the driver. The third patch updates the binding document with the now optional (formerly required) phy-handle property and references the fixed-link docs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-05dt-bindings: net: Add fixed-link supportMoritz Fischer
Update device-tree binding with fixed-link support. With fixed-link support the formerly required property 'phy-handle' is now optional if 'fixed-link' child is present. Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-05net: nixge: Add support for fixed-link configurationsMoritz Fischer
Add support for fixed-link configurations to nixge driver. Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-05net: nixge: Make mdio child node optionalMoritz Fischer
Make MDIO child optional and only instantiate the MDIO bus if the child is actually present. There are currently no (in-tree) users of this binding; all (out-of-tree) users use overlays that get shipped together with the FPGA images that contain the IP. This will significantly increase maintainabilty of future revisions of this IP. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-05Merge branch 'bpf-riscv-jit'Daniel Borkmann
Björn Töpel says: ==================== This v2 series adds an RV64G BPF JIT to the kernel. At the moment the RISC-V Linux port does not support CONFIG_HAVE_KPROBES (Patrick Stählin sent out an RFC last year), which means that CONFIG_BPF_EVENTS is not supported. Thus, no tests involving BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_PERF_EVENT, BPF_PROG_TYPE_KPROBE and BPF_PROG_TYPE_RAW_TRACEPOINT passes. The implementation does not support "far branching" (>4KiB). Test results: # modprobe test_bpf test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed] # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier ... Summary: 761 PASSED, 507 SKIPPED, 2 FAILED Note that "test_verifier" was run with one build with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and one without, otherwise many of the the tests that require unaligned access were skipped. CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.*unknown align' 0 No CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.*unknown align' 59 The two failing test_verifier tests are: "ld_abs: vlan + abs, test 1" "ld_abs: jump around ld_abs" This is due to that "far branching" involved in those tests. All tests where done on QEMU emulator version 3.1.50 (v3.1.0-688-g8ae951fbc106). I'll test it on real hardware, when I get access to it. I'm routing this patch via bpf-next/netdev mailing list (after a conversation with Palmer at FOSDEM), mainly because the other JITs went that path. Again, thanks for all the comments! Cheers, Björn v1 -> v2: * Added JMP32 support. (Daniel) * Add RISC-V to Documentation/sysctl/net.txt. (Daniel) * Fixed seen_call() asymmetry. (Daniel) * Fixed broken bpf_flush_icache() range. (Daniel) * Added alignment annotations to some selftests. RFCv1 -> v1: * Cleaned up the Kconfig and net/Makefile. (Christoph) * Removed the entry-stub and squashed the build/config changes to be part of the JIT implementation. (Christoph) * Simplified the register tracking code. (Daniel) * Removed unused macros. (Daniel) * Added myself as maintainer and updated documentation. (Daniel) * Removed HAVE_EFFICIENT_UNALIGNED_ACCESS. (Christoph, Palmer) * Added tail-calls and cleaned up the code. ==================== Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05selftests/bpf: add "any alignment" annotation for some testsBjörn Töpel
RISC-V does, in-general, not have "efficient unaligned access". When testing the RISC-V BPF JIT, some selftests failed in the verification due to misaligned access. Annotate these tests with the F_NEEDS_EFFICIENT_UNALIGNED_ACCESS flag. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05bpf, doc: add RISC-V JIT to BPF documentationBjörn Töpel
Update Documentation/networking/filter.txt and Documentation/sysctl/net.txt to mention RISC-V. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05MAINTAINERS: add RISC-V BPF JIT maintainerBjörn Töpel
Add Björn Töpel as RISC-V BPF JIT maintainer. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05bpf, riscv: add BPF JIT for RV64GBjörn Töpel
This commit adds a BPF JIT for RV64G. The JIT is a two-pass JIT, and has a dynamic prolog/epilogue (similar to the MIPS64 BPF JIT) instead of static ones (e.g. x86_64). At the moment the RISC-V Linux port does not support CONFIG_HAVE_KPROBES, which means that CONFIG_BPF_EVENTS is not supported. Thus, no tests involving BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_PERF_EVENT, BPF_PROG_TYPE_KPROBE and BPF_PROG_TYPE_RAW_TRACEPOINT passes. The implementation does not support "far branching" (>4KiB). Test results: # modprobe test_bpf test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed] # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier ... Summary: 761 PASSED, 507 SKIPPED, 2 FAILED Note that "test_verifier" was run with one build with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and one without, otherwise many of the the tests that require unaligned access were skipped. CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.*unknown align' 0 No CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS: # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled # ./test_verifier | grep -c 'NOTE.*unknown align' 59 The two failing test_verifier tests are: "ld_abs: vlan + abs, test 1" "ld_abs: jump around ld_abs" This is due to that "far branching" involved in those tests. All tests where done on QEMU (QEMU emulator version 3.1.50 (v3.1.0-688-g8ae951fbc106)). Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05Merge branch 'bpf-btf-dedup'Daniel Borkmann
Andrii Nakryiko says: ==================== This patch series adds BTF deduplication algorithm to libbpf. This algorithm allows to take BTF type information containing duplicate per-compilation unit information and reduce it to equivalent set of BTF types with no duplication without loss of information. It also deduplicates strings and removes those strings that are not referenced from any BTF type (and line information in .BTF.ext section, if any). Algorithm also resolves struct/union forward declarations into concrete BTF types across multiple compilation units to facilitate better deduplication ratio. If undesired, this resolution can be disabled through specifying corresponding options. When applied to BTF data emitted by pahole's DWARF->BTF converter, it reduces the overall size of .BTF section by about 65x, from about 112MB to 1.75MB, leaving only 29247 out of initial 3073497 BTF type descriptors. Algorithm with minor differences and preliminary results before FUNC/FUNC_PROTO support is also described more verbosely at: https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html v1->v2: - rebase on latest bpf-next - err_log/elog -> pr_debug - btf__dedup, btf__get_strings, btf__get_nr_types listed under 0.0.2 version ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05selftests/btf: add initial BTF dedup testsAndrii Nakryiko
This patch sets up a new kind of tests (BTF dedup tests) and tests few aspects of BTF dedup algorithm. More complete set of tests will come in follow up patches. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05btf: add BTF types deduplication algorithmAndrii Nakryiko
This patch implements BTF types deduplication algorithm. It allows to greatly compress typical output of pahole's DWARF-to-BTF conversion or LLVM's compilation output by detecting and collapsing identical types emitted in isolation per compilation unit. Algorithm also resolves struct/union forward declarations into concrete BTF types representing referenced struct/union. If undesired, this resolution can be disabled through specifying corresponding options. Algorithm itself and its application to Linux kernel's BTF types is described in details at: https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05btf: extract BTF type size calculationAndrii Nakryiko
This pre-patch extracts calculation of amount of space taken by BTF type descriptor for later reuse by btf_dedup functionality. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-04net: phy: fixed-phy: Drop GPIO from fixed_phy_add()Linus Walleij
All users of the fixed_phy_add() pass -1 as GPIO number to the fixed phy driver, and all users of fixed_phy_register() pass -1 as GPIO number as well, except for the device tree MDIO bus. Any new users should create a proper device and pass the GPIO as a descriptor associated with the device so delete the GPIO argument from the calls and drop the code looking requesting a GPIO in fixed_phy_add(). In fixed phy_register(), investigate the "fixed-link" node and pick the GPIO descriptor from "link-gpios" if this property exists. Move the corresponding code out of of_mdio.c as the fixed phy code anyways requires OF to be in use. Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04net/mlx5: Fix code style issue in mlx driverTonghao Zhang
Add the tab before '}' and keep the code style consistent. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04libbpf: fix libbpf_printStanislav Fomichev
With the recent print rework we now have the following problem: pr_{warning,info,debug} expand to __pr which calls libbpf_print. libbpf_print does va_start and calls __libbpf_pr with va_list argument. In __base_pr we again do va_start. Because the next argument is a va_list, we don't get correct pointer to the argument (and print noting in my case, I don't know why it doesn't crash tbh). Fix this by changing libbpf_print_fn_t signature to accept va_list and remove unneeded calls to va_start in the existing users. Alternatively, this can we solved by exporting __libbpf_pr and changing __pr macro to (and killing libbpf_print): { if (__libbpf_pr) __libbpf_pr(level, "libbpf: " fmt, ##__VA_ARGS__) } Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04rds: rdma: update rdma transport for tosSantosh Shilimkar
For RDMA transports, RDS TOS is an extension of IB QoS(Annex A13) to provide clients the ability to segregate traffic flows for different type of data. RDMA CM abstract it for ULPs using rdma_set_service_type(). Internally, each traffic flow is represented by a connection with all of its independent resources like that of a normal connection, and is differentiated by service type. In other words, there can be multiple qp connections between an IP pair and each supports a unique service type. The feature has been added from RDSv4.1 onwards and supports rolling upgrades. RDMA connection metadata also carries the tos information to set up SL on end to end context. The original code was developed by Bang Nguyen in downstream kernel back in 2.6.32 kernel days and it has evolved over period of time. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
2019-02-04rds: add transport specific tos_map hookSantosh Shilimkar
RDMA transport maps user tos to underline virtual lanes(VL) for IB or DSCP values. RDMA CM transport abstract thats for RDS. TCP transport makes use of default priority 0 and maps all user tos values to it. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
2019-02-04rds: add type of service(tos) infrastructureSantosh Shilimkar
RDS Service type (TOS) is user-defined and needs to be configured via RDS IOCTL interface. It must be set before initiating any traffic and once set the TOS can not be changed. All out-going traffic from the socket will be associated with its TOS. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
2019-02-04rds: rdma: add consumer rejectSantosh Shilimkar
For legacy protocol version incompatibility with non linux RDS, consumer reject reason being used to convey it to peer. But the choice of reject reason value as '1' was really poor. Anyway for interoperability reasons with shipping products, it needs to be supported. For any future versions, properly encoded reject reason should to be used. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
2019-02-04rds: make v3.1 as compat versionSantosh Shilimkar
Mark RDSv3.1 as compat version and add v4.1 version macro's. Subsequent patches enable TOS(Type of Service) feature which is tied with v4.1 for RDMA transport. Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes] Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
2019-02-04Merge branch 'sh_eth-implement-simple-RX-checksum-offload'David S. Miller
Sergei Shtylyov says: ==================== sh_eth: implement simple RX checksum offload Here's a set of 7 patches against DaveM's 'net-next.git' repo. I'm implemeting the simple RX checksum offload (like was done for the 'ravb' driver by Simon Horman); it has been only tested on the R8A7740 and R8A77980 SoCs, the other SoCs should just work (according to their manuals)... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04sh_eth: offload RX checksum on SH7763Sergei Shtylyov
The SH7763 SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MACs... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04sh_eth: offload RX checksum on SH7734Sergei Shtylyov
The SH7734 SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MACs... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04sh_eth: offload RX checksum on R8A77980Sergei Shtylyov
The R-Car V3H (R8A77980) SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MAC... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04sh_eth: offload RX checksum on R8A7740Sergei Shtylyov
The R-Mobile A1 (R8A7740) SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MAC... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04sh_eth: offload RX checksum on R7S72100Sergei Shtylyov
The RZ/A1H (R7S721000) SoC manual describes the Ether MAC's RX checksum offload the same way as it's implemented in the EtherAVB MACs... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04sh_eth: RX checksum offload supportSergei Shtylyov
Add support for the RX checksum offload. This is enabled by default and may be disabled and re-enabled using 'ethtool': # ethtool -K eth0 rx off # ethtool -K eth0 rx on Some Ether MACs provide a simple checksumming scheme which appears to be completely compatible with CHECKSUM_COMPLETE: sum of all packet data after the L2 header is appended to packet data; this may be trivially read by the driver and used to update the skb accordingly. The same checksumming scheme is implemented in the EtherAVB MACs and now supported by the 'ravb' driver. In terms of performance, throughput is close to gigabit line rate with the RX checksum offload both enabled and disabled. The 'perf' output, however, appears to indicate that significantly less time is spent in do_csum() -- this is as expected. Test results with RX checksum offload enabled: ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 TCP MAERTS TEST to 192.168.2.4 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 10.01 933.93 [ perf record: Woken up 8 times to write data ] [ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ] ~/netperf-2.2pl4# perf report Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763 Overhead Command Shared Object Symbol 9.44% netperf [kernel.kallsyms] [k] __arch_copy_to_user 7.75% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq 6.31% swapper [kernel.kallsyms] [k] default_idle_call 5.89% swapper [kernel.kallsyms] [k] arch_cpu_idle 4.37% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit 4.02% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq 2.52% netperf [kernel.kallsyms] [k] preempt_count_sub 1.81% netperf [kernel.kallsyms] [k] tcp_recvmsg 1.80% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irqres 1.78% netperf [kernel.kallsyms] [k] preempt_count_add 1.36% netperf [kernel.kallsyms] [k] __tcp_transmit_skb 1.20% netperf [kernel.kallsyms] [k] __local_bh_enable_ip 1.10% netperf [kernel.kallsyms] [k] sh_eth_start_xmit Test results with RX checksum offload disabled: ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 TCP MAERTS TEST to 192.168.2.4 Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131072 16384 16384 10.01 932.04 [ perf record: Woken up 14 times to write data ] [ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ] ~/netperf-2.2pl4# perf report Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796 Overhead Command Shared Object Symbol 7.00% swapper [kernel.kallsyms] [k] do_csum 3.94% swapper [kernel.kallsyms] [k] sh_eth_poll 3.83% ksoftirqd/0 [kernel.kallsyms] [k] do_csum 3.23% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq 2.87% netperf [kernel.kallsyms] [k] __arch_copy_to_user 2.86% swapper [kernel.kallsyms] [k] arch_cpu_idle 2.13% swapper [kernel.kallsyms] [k] default_idle_call 2.12% ksoftirqd/0 [kernel.kallsyms] [k] sh_eth_poll 2.02% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 1.84% swapper [kernel.kallsyms] [k] __softirqentry_text_start 1.64% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit 1.53% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq 1.32% netperf [kernel.kallsyms] [k] preempt_count_sub 1.27% swapper [kernel.kallsyms] [k] __pi___inval_dcache_area 1.22% swapper [kernel.kallsyms] [k] check_preemption_disabled 1.01% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore The above results collected on the R-Car V3H Starter Kit board. Based on the commit 4d86d3818627 ("ravb: RX checksum offload")... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04sh_eth: rename sh_eth_cpu_data::hw_checksumSergei Shtylyov
Commit 62e04b7e0e3c ("sh_eth: rename 'sh_eth_cpu_data::hw_crc'") renamed the field to 'hw_checksum' for the Ether DMAC "intelligent checksum", however some Ether MACs implement a simpler checksumming scheme, so that name now seems misleading. Rename that field to 'csmr' as the "intelligent checksum" is always controlled by the CSMR register. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-04Merge branch 'libbpf-btf_ext'Alexei Starovoitov
Yonghong Song says: ==================== This patch set exposed a few functions in libbpf. All these newly added API functions are helpful for JIT based bpf compilation where .BTF and .BTF.ext are available as in-memory data blobs. Patch #1 exposed several btf_ext__* API functions which are used to handle .BTF.ext ELF sections. Patch #2 refactored the function bpf_map_find_btf_info() and exposed API function btf__get_map_kv_tids() to retrieve the map key/value type id's generated by bpf program through BPF_ANNOTATE_KV_PAIR macro. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04tools/bpf: implement libbpf btf__get_map_kv_tids() API functionYonghong Song
Currently, to get map key/value type id's, the macro BPF_ANNOTATE_KV_PAIR(<map_name>, <key_type>, <value_type>) needs to be defined in the bpf program for the corresponding map. During program/map loading time, the local static function bpf_map_find_btf_info() in libbpf.c is implemented to retrieve the key/value type ids given the map name. The patch refactored function bpf_map_find_btf_info() to create an API btf__get_map_kv_tids() which includes the bulk of implementation for the original function. The API btf__get_map_kv_tids() can be used by bcc, a JIT based bpf compilation system, which uses the same BPF_ANNOTATE_KV_PAIR to record map key/value types. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04tools/bpf: expose functions btf_ext__* as API functionsYonghong Song
The following set of functions, which manipulates .BTF.ext section, are exposed as API functions: . btf_ext__new . btf_ext__free . btf_ext__reloc_func_info . btf_ext__reloc_line_info . btf_ext__func_info_rec_size . btf_ext__line_info_rec_size These functions are useful for JIT based bpf codegen, e.g., bcc, to manipulate in-memory .BTF.ext sections. The signature of function btf_ext__reloc_func_info() is also changed to be the same as its definition in btf.c. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04selftests/bpf: use localhost in tcp_{server,client}.pyStanislav Fomichev
Bind and connect to localhost. There is no reason for this test to use non-localhost interface. This lets us run this test in a network namespace. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-04s390: bpf: fix JMP32 code-genHeiko Carstens
Commit 626a5f66da0d19 ("s390: bpf: implement jitting of JMP32") added JMP32 code-gen support for s390. However it triggers the warning below due to some unusual gotos in the original s390 bpf jit code. Add a couple of additional "is_jmp32" initializations to fix this. Also fix the wrong opcode for the "llilf" instruction that was introduced with the same commit. arch/s390/net/bpf_jit_comp.c: In function 'bpf_jit_insn': arch/s390/net/bpf_jit_comp.c:248:55: warning: 'is_jmp32' may be used uninitialized in this function [-Wmaybe-uninitialized] _EMIT6(op1 | reg(b1, b2) << 16 | (rel & 0xffff), op2 | mask); \ ^ arch/s390/net/bpf_jit_comp.c:1211:8: note: 'is_jmp32' was declared here bool is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32; Fixes: 626a5f66da0d19 ("s390: bpf: implement jitting of JMP32") Cc: Jiong Wang <jiong.wang@netronome.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04Merge branch 'change-libbpf-print-api'Alexei Starovoitov
Yonghong Song says: ==================== These are patches responding to my comments for Magnus's patch (https://patchwork.ozlabs.org/patch/1032848/). The goal is to make pr_* macros available to other C files than libbpf.c, and to simplify API function libbpf_set_print(). Specifically, Patch #1 used global functions to facilitate pr_* macros in the header files so they are available in different C files. Patch #2 removes the global function libbpf_print_level_available() which is added in Patch 1. Patch #3 simplified libbpf_set_print() which takes only one print function with a debug level argument among others. Changelogs: v3 -> v4: . rename libbpf internal header util.h to libbpf_util.h . rename libbpf internal function libbpf_debug_print() to libbpf_print() v2 -> v3: . bailed out earlier in libbpf_debug_print() if __libbpf_pr is NULL . added missing LIBBPF_DEBUG level check in libbpf.c __base_pr(). v1 -> v2: . Renamed global function libbpf_dprint() to libbpf_debug_print() to be more expressive. . Removed libbpf_dprint_level_available() as it is used only once in btf.c and we can remove it by optimizing for common cases. ==================== Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04tools/bpf: simplify libbpf API function libbpf_set_print()Yonghong Song
Currently, the libbpf API function libbpf_set_print() takes three function pointer parameters for warning, info and debug printout respectively. This patch changes the API to have just one function pointer parameter and the function pointer has one additional parameter "debugging level". So if in the future, if the debug level is increased, the function signature won't change. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04tools/bpf: print out btf log at LIBBPF_WARN levelYonghong Song
Currently, the btf log is allocated and printed out in case of error at LIBBPF_DEBUG level. Such logs from kernel are very important for debugging. For example, bpf syscall BPF_PROG_LOAD command can get verifier logs back to user space. In function load_program() of libbpf.c, the log buffer is allocated unconditionally and printed out at pr_warning() level. Let us do the similar thing here for btf. Allocate buffer unconditionally and print out error logs at pr_warning() level. This can reduce one global function and optimize for common situations where pr_warning() is activated either by default or by user supplied debug output function. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04tools/bpf: move libbpf pr_* debug print functions to headersYonghong Song
A global function libbpf_print, which is invisible outside the shared library, is defined to print based on levels. The pr_warning, pr_info and pr_debug macros are moved into the newly created header common.h. So any .c file including common.h can use these macros directly. Currently btf__new and btf_ext__new API has an argument getting __pr_debug function pointer into btf.c so the debugging information can be printed there. This patch removed this parameter from btf__new and btf_ext__new and directly using pr_debug in btf.c. Another global function libbpf_print_level_available, also invisible outside the shared library, can test whether a particular level debug printing is available or not. It is used in btf.c to test whether DEBUG level debug printing is availabl or not, based on which the log buffer will be allocated when loading btf to the kernel. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04ath9k: eeprom: Use scnprintf instead of snprintfKees Cook
Change snprintf to scnprintf. There are generally two cases where using snprintf causes problems. 1) Uses of size += snprintf(buf, SIZE - size, fmt, ...) In this case, if snprintf would have written more characters than what the buffer size (SIZE) is, then size will end up larger than SIZE. In later uses of snprintf, SIZE - size will result in a negative number, leading to problems. Note that size might already be too large by using size = snprintf before the code reaches a case of size += snprintf. 2) If size is ultimately used as a length parameter for a copy back to user space, then it will potentially allow for a buffer overflow and information disclosure when size is greater than SIZE. When the size is used to index the buffer directly, we can have memory corruption. This also means when size = snprintf... is used, it may also cause problems since size may become large. Copying to userspace is mitigated by the HARDENED_USERCOPY kernel configuration. The solution to these issues is to use scnprintf which returns the number of characters actually written to the buffer, so the size variable will never exceed SIZE. Cc: Willy Tarreau <w@1wt.eu> Cc: Silvio Cesare <silvio.cesare@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-04ath10k: Add support for extended HTT aggr msg supportGovind Singh
HTT aggr message parameter in HL2.0 fw are different in comparison to legacy fw version. Fill correct HTT aggr msg parameter for targets using HL2.0 firmware. Signed-off-by: Govind Singh <govinds@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-04ath10k: fix S5 power consumption issue for QCA9377Yu Wang
After system entering S5 (shut down but system still providing power to QCA9377) on Ubuntu platform, power consumption of QCA9377 is 69mA, which is too high. The root cause is pci_soft_reset is not set for QCA9377 during pci probe. To fix this issue, set 'pci_soft_reset' to 'th10k_pci_warm_reset', and then the power consumption drops to a normal value(10mA). Verified on Dell Ubuntu platform with firmware: WLAN.TF.1.0-00002-QCATFSWPZ-5 Signed-off-by: Yu Wang <yyuwang@codeaurora.org> Signed-off-by: Yu Wang <yyuwang@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-04ath10k: Set DMA address mask to 35 bit for WCN3990Rakesh Pillai
WCN3990 is a 37-bit target but can address memory range only upto 35 bits. The 36th bit is used to control the smmu/iommu translation and the 37th bit is used by the internal bus masters to access the wifi subsystem internal SRAM. With the DMA mask set to 37i-bit, the host driver can get 37-bit dma address, which leads to incorrect address access in the target. Hence the host driver can used addresses upto 35-bit for WCN3990. Fix the dma mask for wcn3990 to 35-bit, instead of 37-bit. Tested HW: WCN3990 Tested FW: WLAN.HL.2.0-01188-QCAHLSWMTPLZ-1 Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Rakesh Pillai <pillair@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-02-04iwlwifi: implement BISR HW workaround for 22260 devicesJohannes Berg
There's a small hardware bug in 22260 devices which thus require a few more delays during initialization. Implement this workaround. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>