linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-04-25	ALSA: seq: oss: Hardening for potential Spectre v1	Takashi Iwai
	As Smatch recently suggested, a few places in OSS sequencer codes may expand the array directly from the user-space value with speculation, namely there are a significant amount of references to either info->ch[] or dp->synths[] array: sound/core/seq/oss/seq_oss_event.c:315 note_on_event() warn: potential spectre issue 'info->ch' (local cap) sound/core/seq/oss/seq_oss_event.c:362 note_off_event() warn: potential spectre issue 'info->ch' (local cap) sound/core/seq/oss/seq_oss_synth.c:470 snd_seq_oss_synth_load_patch() warn: potential spectre issue 'dp->synths' (local cap) sound/core/seq/oss/seq_oss_event.c:293 note_on_event() warn: potential spectre issue 'dp->synths' sound/core/seq/oss/seq_oss_event.c:353 note_off_event() warn: potential spectre issue 'dp->synths' sound/core/seq/oss/seq_oss_synth.c:506 snd_seq_oss_synth_sysex() warn: potential spectre issue 'dp->synths' sound/core/seq/oss/seq_oss_synth.c:580 snd_seq_oss_synth_ioctl() warn: potential spectre issue 'dp->synths' Although all these seem doing only the first load without further reference, we may want to stay in a safer side, so hardening with array_index_nospec() would still make sense. We may put array_index_nospec() at each place, but here we take a different approach: - For dp->synths[], change the helpers to retrieve seq_oss_synthinfo pointer directly instead of the array expansion at each place - For info->ch[], harden in a normal way, as there are only a couple of places As a result, the existing helper, snd_seq_oss_synth_is_valid() is replaced with snd_seq_oss_synth_info(). Also, we cover MIDI device where a similar array expansion is done, too, although it wasn't reported by Smatch. BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2 Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-04-25	ALSA: seq: oss: Fix unbalanced use lock for synth MIDI device	Takashi Iwai
	When get_synthdev() is called for a MIDI device, it returns the fixed midi_synth_dev without the use refcounting. OTOH, the caller is supposed to unreference unconditionally after the usage, so this would lead to unbalanced refcount. This patch corrects the behavior and keep up the refcount balance also for the MIDI synth device. Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-04-25	ALSA: hda/realtek - Update ALC255 depop optimize	Kailang Yang
	Add ALC255 its own depop functions for alc_init and alc_shutup. Assign it to ALC256 usage. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-04-25	ALSA: hda/realtek - Add some fixes for ALC233	Kailang Yang
	Fill COEF to change EAPD to verb control. Assigned codec type. This is an additional fix over 92f974df3460 ("ALSA: hda/realtek - New vendor ID for ALC233"). [ More notes: according to Kailang, the chip is 10ec:0235 bonding for ALC233b, which is equivalent with ALC255. It's only used for Lenovo. The chip needs no alc_process_coef_fw() for headset unlike ALC255. ] Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-04-25	Merge branch 'bpf-optimize-neg-sums'	Daniel Borkmann
	Jakub Kicinski says: ==================== This set adds an optimization run to the NFP jit to turn ADD and SUB instructions with negative immediate into the opposite operation with a positive immediate. NFP can fit small immediates into the instructions but it can't ever fit negative immediates. Addition of small negative immediates is quite common in BPF programs for stack address calculations, therefore this optimization gives us non-negligible savings in instruction count (up to 4%). ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	nfp: bpf: optimize comparisons to negative constants	Jakub Kicinski
	Comparison instruction requires a subtraction. If the constant is negative we are more likely to fit it into a NFP instruction directly if we change the sign and use addition. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	nfp: bpf: tabularize generations of compare operations	Jakub Kicinski
	There are quite a few compare instructions now, use a table to translate BPF instruction code to NFP instruction parameters instead of parameterizing helpers. This saves LOC and makes future extensions easier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	nfp: bpf: optimize add/sub of a negative constant	Jakub Kicinski
	NFP instruction set can fit small immediates into the instruction. Negative integers, however, will never fit because they will have highest bit set. If we swap the ALU op between ADD and SUB and negate the constant we have a better chance of fitting small negative integers into the instruction itself and saving one or two cycles. immed[gprB_21, 0xfffffffc] alu[gprA_4, gprA_4, +, gprB_21], gpr_wrboth immed[gprB_21, 0xffffffff] alu[gprA_5, gprA_5, +carry, gprB_21], gpr_wrboth now becomes: alu[gprA_4, gprA_4, -, 4], gpr_wrboth alu[gprA_5, gprA_5, -carry, 0], gpr_wrboth Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	nfp: bpf: remove double space	Jakub Kicinski
	Whitespace cleanup - remove double space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	bpf: clear the ip_tunnel_info.	William Tu
	The percpu metadata_dst might carry the stale ip_tunnel_info and cause incorrect behavior. When mixing tests using ipv4/ipv6 bpf vxlan and geneve tunnel, the ipv6 tunnel info incorrectly uses ipv4's src ip addr as its ipv6 src address, because the previous tunnel info does not clean up. The patch zeros the fields in ip_tunnel_info. Signed-off-by: William Tu <u9012063@gmail.com> Reported-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	drm/i915/fbdev: Enable late fbdev initial configuration	José Roberto de Souza
	If the initial fbdev configuration (intel_fbdev_initial_config()) runs and there still no sink connected it will cause drm_fb_helper_initial_config() to return 0 as no error happened (but internally the return is -EAGAIN). Because no framebuffer was allocated, when a sink is connected intel_fbdev_output_poll_changed() will not execute drm_fb_helper_hotplug_event() that would trigger another try to do the initial fbdev configuration. So here allowing drm_fb_helper_hotplug_event() to be executed when there is no framebuffer allocated and fbdev was not set up yet. This issue also happens when a MST DP sink is connected since boot, as the MST topology is discovered in parallel if intel_fbdev_initial_config() is executed before the first sink MST is discovered it will cause this same issue. This is a follow-up patch of https://patchwork.freedesktop.org/patch/196089/ Changes from v1: - not creating a dump framebuffer anymore, instead just allowing drm_fb_helper_hotplug_event() to execute when fbdev is not setup yet. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104158 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104425 Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: frederik <frederik.schwan@linux.com> # 4.15.17 Tested-by: Ian Pilcher <arequipeno@gmail.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180418234158.9388-1-jose.souza@intel.com (cherry picked from commit df9e6521749ab33cde306e8a4350b0ac7889220a) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-04-25	drm/i915: Use ktime on wait_for	Mika Kuoppala
	We use jiffies to determine when wait expires. However Imre did find out that jiffies can and will do a >1 increments on certain situations [1]. When this happens in a wait_for loop, we return timeout errorneously much earlier than what the real wallclock would say. We can't afford our waits to timeout prematurely. Discard jiffies and change to ktime to detect timeouts. v2: added bugzilla entry (Imre), added stable (Chris) Reported-by: Imre Deak <imre.deak@intel.com> References: https://lkml.org/lkml/2018/4/18/798 [1] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105771 Cc: Imre Deak <imre.deak@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180423113754.28424-1-mika.kuoppala@linux.intel.com (cherry picked from commit 3085982c6b45d7d22f76e3aa018affbc143a7370) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-04-25	random: rate limit unseeded randomness warnings	Theodore Ts'o
	On systems without sufficient boot randomness, no point spamming dmesg. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2018-04-25	ALSA: pcm: Change return type to vm_fault_t	Souptick Joarder
	Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Commit 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-04-25	ALSA: usx2y: Change return type to vm_fault_t	Souptick Joarder
	Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Commit 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-04-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller

2018-04-25	rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops	Nicholas Piggin
	The OPAL RTC driver does not sleep in case it gets OPAL_BUSY or OPAL_BUSY_EVENT from firmware, which causes large scheduling latencies, up to 50 seconds have been observed here when RTC stops responding (BMC reboot can do it). Fix this by converting it to the standard form OPAL_BUSY loop that sleeps. Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-24	drm/amdgpu: set COMPUTE_PGM_RSRC1 for SGPR/VGPR clearing shaders	Nicolai Hähnle
	Otherwise, the SQ may skip some of the register writes, or shader waves may be allocated where we don't expect them, so that as a result we don't actually reset all of the register SRAMs. This can lead to spurious ECC errors later on if a shader uses an uninitialized register. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2018-04-24	Merge branch 'userns-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns bug fix from Eric Biederman: "Just a small fix to properly set the return code on error" * 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: commoncap: Handle memory allocation failure.
2018-04-25	bpf: reduce runtime of test_sockmap tests	John Fastabend
	When test_sockmap was running outside of selftests and was not being run by build bots it was reasonable to spend significant amount of time running various tests. The number of tests is high because many different I/O iterators are run. However, now that test_sockmap is part of selftests rather than iterate through all I/O sides only test a minimal set of min/max values along with a few "normal" I/O ops. Also remove the long running tests. They can be run from other test frameworks on a regular cadence. This significanly reduces runtime of test_sockmap. Before: $ time sudo ./test_sockmap > /dev/null real 4m47.521s user 0m0.370s sys 0m3.131s After: $ time sudo ./test_sockmap > /dev/null real 0m0.514s user 0m0.104s sys 0m0.430s The CLI is still available for users that want to test the long running tests that do the larger send/recv tests. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	Merge branch 'bpf-sockmap-selftests'	Daniel Borkmann
	John Fastabend says: ==================== This series moves ./samples/sockmap into BPF selftests. There are a few good reasons to do this. First, by pushing this into selftests the tests will be run automatically. Second, sockmap was not really a sample of anything anymore, but rather a large set of tests. Note: There are three recent fixes outstanding against bpf branch that can be detected occasionally by the automated tests here. https://patchwork.ozlabs.org/patch/903138/ https://patchwork.ozlabs.org/patch/903139/ https://patchwork.ozlabs.org/patch/903140/ ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	bpf: sockmap, remove samples program	John Fastabend
	The BPF sample sockmap is redundant now that equivelant tests exist in the BPF selftests. Lets remove this sample and only keep the selftest version that will be run as part of the selftest suite. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	bpf: sockmap, add selftests	John Fastabend
	This adds a new test program test_sockmap which is the old sample sockmap program. By moving the sample program here we can now run it as part of the self tests suite. To support this a populate_progs() routine is added to load programs and maps which was previously done with load_bpf_file(). This is needed because self test libs do not provide a similar routine. Also we now use the cgroup_helpers routines to manage cgroup use instead of manually creating one and supplying it to the CLI. Notice we keep the CLI around though because it is useful for dbg and specialized testing. To run use ./test_sockmap and the result should be, Summary 660 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	bpf: sockmap, add a set of tests to run by default	John Fastabend
	If no options are passed to sockmap after this patch we run a set of tests using various options and sendmsg/sendpage sizes. This replaces the sockmap_test.sh script. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	bpf: sockmap, code sockmap_test in C	John Fastabend
	By moving sockmap_test from shell script into C we can run it directly from selftests, but we can also push the input/output around in proper structures. However, keep the CLI options around because they are useful for debugging when a paticular pattern of msghdr or sockmap options trips up the sockmap code path. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25	tools/bpf: remove test_sock_addr from TEST_GEN_PROGS	Yonghong Song
	Since test_sock_addr is not supposed to run by itself, remove it from TEST_GEN_PROGS and add it to TEST_GEN_PROGS_EXTENDED. This way, run_tests will not run test_sock_addr. The corresponding test to run is test_sock_addr.sh. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	selftests: bpf: update .gitignore with missing file	Anders Roxell
	Fixes: c0fa1b6c3efc ("bpf: btf: Add BTF tests") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: 1) Fix rtnl deadlock in ipvs, from Julian Anastasov. 2) s390 qeth fixes from Julian Wiedmann (control IO completion stalls, bad MAC address update sequence, request side races on command IO timeouts). 3) Handle seq_file overflow properly in l2tp, from Guillaume Nault. 4) Fix VLAN priority mappings in cpsw driver, from Ivan Khoronzhuk. 5) Packet scheduler ife action fixes (malformed TLV lengths, etc.) from Alexander Aring. 6) Fix out of bounds access in tcp md5 option parser, from Jann Horn. 7) Missing netlink attribute policies in rtm_ipv6_policy table, from Eric Dumazet. 8) Missing socket address length checks in l2tp and pppoe connect, from Guillaume Nault. 9) Fix netconsole over team and bonding, from Xin Long. 10) Fix race with AF_PACKET socket state bitfields, from Willem de Bruijn. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits) ice: Fix insufficient memory issue in ice_aq_manage_mac_read sfc: ARFS filter IDs net: ethtool: Add missing kernel doc for FEC parameters packet: fix bitfield update race ice: Do not check INTEVENT bit for OICR interrupts ice: Fix incorrect comment for action type ice: Fix initialization for num_nodes_added igb: Fix the transmission mode of queue 0 for Qav mode ixgbevf: ensure xdp_ring resources are free'd on error exit team: fix netconsole setup over team amd-xgbe: Only use the SFP supported transceiver signals amd-xgbe: Improve KR auto-negotiation and training amd-xgbe: Add pre/post auto-negotiation phy hooks pppoe: check sockaddr length in pppoe_connect() l2tp: check sockaddr length in pppol2tp_connect() net: phy: marvell: clear wol event before setting it ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave tcp: don't read out-of-bounds opsize ibmvnic: Clean actual number of RX or TX pools ...
2018-04-24	ACPI / video: Only default only_lcd to true on Win8-ready _desktops_	Hans de Goede
	Commit 5928c281524f (ACPI / video: Default lcd_only to true on Win8-ready and newer machines) made only_lcd default to true on all machines where acpi_osi_is_win8() returns true, including laptops. The purpose of this is to avoid the bogus / non-working acpi backlight interface which many newer BIOS-es define on desktop machines. But this is causing a regression on some laptops, specifically on the Dell XPS 13 2013 model, which does not have the LCD flag set for its fully functional ACPI backlight interface. Rather then DMI quirking our way out of this, this commits changes the logic for setting only_lcd to true, to only do this on machines with a desktop (or server) dmi chassis-type. Note that we cannot simply only check the chassis-type and not register the backlight interface based on that as there are some laptops and tablets which have their chassis-type set to "3" aka desktop. Hopefully the combination of checking the LCD flag, but only on devices with a desktop(ish) chassis-type will avoid the needs for DMI quirks for this, or at least limit the amount of DMI quirks which we need to a minimum. Fixes: 5928c281524f (ACPI / video: Default lcd_only to true on Win8-ready and newer machines) Reported-and-tested-by: James Hogan <jhogan@kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-24	Merge branch 'bpf-map-val-as-key'	Daniel Borkmann
	Paul Chaignon says: ==================== Currently, helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE can only access stack and packet memory. This patchset allows these helpers to directly access map values by passing registers of type PTR_TO_MAP_VALUE. The first patch changes the verifier; the second adds new test cases. The first three versions of this patchset were sent on the iovisor-dev mailing list only. Changelogs: Changes in v5: - Refactor using check_helper_mem_access. Changes in v4: - Rebase. Changes in v3: - Bug fixes. - Negative test cases. Changes in v2: - Additional test cases for adjusted maps. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	tools/bpf: add verifier tests for accesses to map values	Paul Chaignon
	This patch adds new test cases for accesses to map values from map helpers. Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	bpf: allow map helpers access to map values directly	Paul Chaignon
	Helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE can only access stack and packet memory. Allow these helpers to directly access map values by passing registers of type PTR_TO_MAP_VALUE. This change removes the need for an extra copy to the stack when using a map value to perform a second map lookup, as in the following: struct bpf_map_def SEC("maps") infobyreq = { .type = BPF_MAP_TYPE_HASHMAP, .key_size = sizeof(struct request ), .value_size = sizeof(struct info_t), .max_entries = 1024, }; struct bpf_map_def SEC("maps") counts = { .type = BPF_MAP_TYPE_HASHMAP, .key_size = sizeof(struct info_t), .value_size = sizeof(u64), .max_entries = 1024, }; SEC("kprobe/blk_account_io_start") int bpf_blk_account_io_start(struct pt_regs ctx) { struct info_t info = bpf_map_lookup_elem(&infobyreq, &ctx->di); u64 count = bpf_map_lookup_elem(&counts, info); (*count)++; } Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	Merge branch 'bpf-xfrm-states'	Daniel Borkmann
	Eyal Birger says: ==================== This patchset adds support for fetching XFRM state information from an eBPF program called from TC. The first patch introduces a helper for fetching an XFRM state from the skb's secpath. The XFRM state is modeled using a new virtual struct which contains the SPI, peer address, and reqid values of the state; This struct can be extended in the future to provide additional state information. The second patch adds a test example in test_tunnel_bpf.sh. The sample validates the correct extraction of state information by the eBPF program. v3: - Kept SPI and peer IPv4 address in state in network byte order following suggestion from Alexei Starovoitov v2: - Fixed two comments by Daniel Borkmann: - disallow reserved flags in helper call - avoid compiling in helper code when CONFIG_XFRM is off ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	samples/bpf: extend test_tunnel_bpf.sh with xfrm state test	Eyal Birger
	Add a test for fetching xfrm state parameters from a tc program running on ingress. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	bpf: add helper for getting xfrm states	Eyal Birger
	This commit introduces a helper which allows fetching xfrm state parameters by eBPF programs attached to TC. Prototype: bpf_skb_get_xfrm_state(skb, index, xfrm_state, size, flags) skb: pointer to skb index: the index in the skb xfrm_state secpath array xfrm_state: pointer to 'struct bpf_xfrm_state' size: size of 'struct bpf_xfrm_state' flags: reserved for future extensions The helper returns 0 on success. Non zero if no xfrm state at the index is found - or non exists at all. struct bpf_xfrm_state currently includes the SPI, peer IPv4/IPv6 address and the reqid; it can be further extended by adding elements to its end - indicating the populated fields by the 'size' argument - keeping backwards compatibility. Typical usage: struct bpf_xfrm_state x = {}; bpf_skb_get_xfrm_state(skb, 0, &x, sizeof(x), 0); ... Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24	liquidio: Swap VF representor Tx and Rx statistics	Srinivas Jampala
	Swap VF representor tx and rx interface statistics since it is a virtual switchdev port and tx for VM should be rx for VF representor and vice-versa. Signed-off-by: Srinivas Jampala <srinivasa.jampala@cavium.com> Acked-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24	net/ipv6: fix LOCKDEP issue in rt6_remove_exception_rt()	Eric Dumazet
	rt6_remove_exception_rt() is called under rcu_read_lock() only. We lock rt6_exception_lock a bit later, so we do not hold rt6_exception_lock yet. Fixes: 8a14e46f1402 ("net/ipv6: Fix missing rcu dereferences on from") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: David Ahern <dsahern@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24	Merge branch '1GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-04-24 This series contains fixes to ixgbevf, igb and ice drivers. Colin Ian King fixes the return value on error for the new XDP support that went into ixgbevf for 4.17. Vinicius provides a fix for queue 0 for igb, which was not receiving all the credits it needed when QAV mode was enabled. Anirudh provides several fixes for the new ice driver, starting with properly initializing num_nodes_added to zero. Fixed up a code comment to better reflect what is really going on in the code. Fixed how to detect if an OICR interrupt has occurred to a more reliable method. Md Fahad fixes the ice driver to allocate the right amount of memory when reading and storing the devices MAC addresses. The device can have up to 2 MAC addresses (LAN and WoL), while WoL is currently not supported, we need to ensure it can be properly handled when support is added. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24	net/tls: remove redundant second null check on sgout	Colin Ian King
	A duplicated null check on sgout is redundant as it is known to be already true because of the identical earlier check. Remove it. Detected by cppcheck: net/tls/tls_sw.c:696: (warning) Identical inner 'if' condition is always true. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24	fsl/fman_port: remove redundant check on port->rev_info.major	Colin Ian King
	The check port->rev_info.major >= 6 is being performed twice, thus the inner second check is always true and is redundant, hence it can be removed. Detected by cppcheck. drivers/net/ethernet/freescale/fman/fman_port.c:1394]: (warning) Identical inner 'if' condition is always true. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24	ice: Fix insufficient memory issue in ice_aq_manage_mac_read	Md Fahad Iqbal Polash
	For the MAC read operation, the device can return up to two (LAN and WoL) MAC addresses. Without access to adequate memory, the device will return an error. Fixed this by allocating the right amount of memory. Also, logic to detect and copy the LAN MAC address into the port_info structure has been added. Note that the WoL MAC address is ignored currently as the WoL feature isn't supported yet. Fixes: dc49c7723676 ("ice: Get MAC/PHY/link info and scheduler topology") Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24	perf stat: Fix duplicate PMU name for interval print	Kan Liang
	PMU name is printed repeatedly for interval print, for example: perf stat --no-merge -e 'unc_m_clockticks' -a -I 1000 # time counts unit events 1.001053069 243,702,144 unc_m_clockticks [uncore_imc_4] 1.001053069 244,268,304 unc_m_clockticks [uncore_imc_2] 1.001053069 244,427,386 unc_m_clockticks [uncore_imc_0] 1.001053069 244,583,760 unc_m_clockticks [uncore_imc_5] 1.001053069 244,738,971 unc_m_clockticks [uncore_imc_3] 1.001053069 244,880,309 unc_m_clockticks [uncore_imc_1] 2.002024821 240,818,200 unc_m_clockticks [uncore_imc_4] [uncore_imc_4] 2.002024821 240,767,812 unc_m_clockticks [uncore_imc_2] [uncore_imc_2] 2.002024821 240,764,215 unc_m_clockticks [uncore_imc_0] [uncore_imc_0] 2.002024821 240,759,504 unc_m_clockticks [uncore_imc_5] [uncore_imc_5] 2.002024821 240,755,992 unc_m_clockticks [uncore_imc_3] [uncore_imc_3] 2.002024821 240,750,403 unc_m_clockticks [uncore_imc_1] [uncore_imc_1] For each print, the PMU name is unconditionally appended to the counter->name. Need to check the counter->name first. If the PMU name is already appended, do nothing. Committer notes: Add and use perf_evsel->uniquified_name bool instead of doing the more expensive strstr(event->name, pmu->name). Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: 8c5421c016a4 ("perf pmu: Display pmu name when printing unmerged events in stat") Link: http://lkml.kernel.org/r/1524594014-79243-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-24	perf evsel: Only fall back group read for leader	Kan Liang
	Perf doesn't support mixed events from different PMUs (except software event) in a group. The perf stat should output <not counted>/<not supported> for all events, but it doesn't. For example, perf stat -e '{cycles,uncore_imc_5/umask=0xF,event=0x4/,instructions}' <not counted> cycles <not supported> uncore_imc_5/umask=0xF,event=0x4/ 1,024,300 instructions If perf fails to open an event, it doesn't error out directly. It will disable some features and retry, until the event is opened or all features are disabled. The disabled features will not be re-enabled. The group read is one of these features. For the example as above, the IMC event and the leader event "cycles" are from different PMUs. Opening the IMC event must fail. The group read feature must be disabled for IMC event and the followed event "instructions". The "instructions" event has the same PMU as the leader "cycles". It can be opened successfully. Since the group read feature has been disabled, the "instructions" event will be read as a single event, which definitely has a value. The group read fallback is still useful for the case which kernel doesn't support group read. It is good enough to be handled only by the leader. For the fallback request from members, it must be caused by an error. The fallback only breaks the semantics of group. Limit the group read fallback only for the leader. Committer testing: On a broadwell t450s notebook: Before: # perf stat -e '{cycles,unc_cbo_cache_lookup.read_i,instructions}' sleep 1 Performance counter stats for 'sleep 1': <not counted> cycles <not supported> unc_cbo_cache_lookup.read_i 818,206 instructions 1.003170887 seconds time elapsed Some events weren't counted. Try disabling the NMI watchdog: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog After: # perf stat -e '{cycles,unc_cbo_cache_lookup.read_i,instructions}' sleep 1 Performance counter stats for 'sleep 1': <not counted> cycles <not supported> unc_cbo_cache_lookup.read_i <not counted> instructions 1.001380511 seconds time elapsed Some events weren't counted. Try disabling the NMI watchdog: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog # Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: 82bf311e15d2 ("perf stat: Use group read for event groups") Link: http://lkml.kernel.org/r/1524594014-79243-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-24	perf stat: Print out hint for mixed PMU group error	Kan Liang
	Perf doesn't support mixed events from different PMUs (except software event) in a group. For this case, only "<not counted>" or "<not supported>" are printed out. There is no hint which guides users to fix the issue. Checking the PMU type of events to determine if they are from the same PMU. There may be false alarm for the checking. E.g. the core PMU has different PMU type. But it should not happen often. The false alarm can also be tolerated, because: - It only happens on error path. - It just provides a possible solution for the issue. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1524594014-79243-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-24	perf pmu: Fix core PMU alias list for X86 platform	Kan Liang
	When counting uncore event with alias, core event is mistakenly involved, for example: perf stat --no-merge -e "unc_m_cas_count.all" -C0 sleep 1 Performance counter stats for 'CPU(s) 0': 0 unc_m_cas_count.all [uncore_imc_4] 0 unc_m_cas_count.all [uncore_imc_2] 0 unc_m_cas_count.all [uncore_imc_0] 153,640 unc_m_cas_count.all [cpu] 0 unc_m_cas_count.all [uncore_imc_5] 25,026 unc_m_cas_count.all [uncore_imc_3] 0 unc_m_cas_count.all [uncore_imc_1] 1.001447890 seconds time elapsed The reason is that current implementation doesn't check PMU name of a event when adding its alias into the alias list for core PMU. The uncore event aliases are mistakenly added. This bug was introduced in: commit 14b22ae028de ("perf pmu: Add helper function is_pmu_core to detect PMU CORE devices") Checking the PMU name for all PMUs on X86 and other architectures except ARM. There is no behavior change for ARM. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: 14b22ae028de ("perf pmu: Add helper function is_pmu_core to detect PMU CORE devices") Link: http://lkml.kernel.org/r/1524594014-79243-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-24	virtio_balloon: add array of stat names	Michael S. Tsirkin
	Jason Wang points out that it's very hard for users to build an array of stat names. The naive thing is to use VIRTIO_BALLOON_S_NR but that breaks if we add more stats - as done e.g. recently by commit 6c64fe7f2 ("virtio_balloon: export hugetlb page allocation counts"). Let's add an array of reasonably readable names. Fixes: 6c64fe7f2 ("virtio_balloon: export hugetlb page allocation counts") Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jonathan Helman <jonathan.helman@oracle.com>
2018-04-24	arm64: support __int128 with clang	Jason A. Donenfeld
	Commit fb8722735f50 ("arm64: support __int128 on gcc 5+") added support for arm64 __int128 with gcc with a version-conditional, but neglected to enable this for clang, which in fact appears to support aarch64 __int128. This commit therefore enables it if the compiler is clang, using the same type of makefile conditional used elsewhere in the tree. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-24	arm64: only advance singlestep for user instruction traps	Mark Rutland
	Our arm64_skip_faulting_instruction() helper advances the userspace singlestep state machine, but this is also called by the kernel BRK handler, as used for WARN(). Thus, if we happen to hit a WARN() while the user singlestep state machine is in the active-no-pending state, we'll advance to the active-pending state without having executed a user instruction, and will take a step exception earlier than expected when we return to userspace. Let's fix this by only advancing the state machine when skipping a user instruction. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-24	arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrp	Kim Phillips
	Commit a257e02579e ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419") introduced a function whose name ends with "_veneer". This clashes with commit bd8b22d2888e ("Kbuild: kallsyms: ignore veneers emitted by the ARM linker"), which removes symbols ending in "_veneer" from kallsyms. The problem was manifested as 'perf test -vvvvv vmlinux' failed, correctly claiming the symbol 'module_emit_adrp_veneer' was present in vmlinux, but not in kallsyms. ... ERR : 0xffff00000809aa58: module_emit_adrp_veneer not on kallsyms ... test child finished with -1 ---- end ---- vmlinux symtab matches kallsyms: FAILED! Fix the problem by renaming module_emit_adrp_veneer to module_emit_veneer_for_adrp. Now the test passes. Fixes: a257e02579e ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419") Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-24	arm64: ptrace: remove addr_limit manipulation	Mark Rutland
	We transiently switch to KERNEL_DS in compat_ptrace_gethbpregs() and compat_ptrace_sethbpregs(), but in either case this is pointless as we don't perform any uaccess during this window. let's rip out the redundant addr_limit manipulation. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>