Age / Commit message / Author
2015-07-29net: thunderx: Add PCI driver shutdown routineSunil Goutham
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Fix crash when changing rss with mutliple traffic flowsSunil Goutham
This fixes a crash when changing RSS with multiple traffic flows. During interface teardown, disable the tx queues only after all NAPI threads are done. Otherwise the tx queues might be woken up inside NAPI if any CQE_TX are processed. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Set watchdog timeout valueSunil Goutham
If a txq (SQ) remains in the stopped state after this timeout, it is considered stuck and the interface is reinitialized. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Wakeup TXQ only if CQE_TX are processedSunil Goutham
Previously the TXQ was woken up whenever NAPI ran, irrespective of whether any CQE_TX were processed. Added 'txq_stop' and 'txq_wake' counters to aid in debugging any future issues. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
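The wake-up rule from this commit can be sketched in user-space C as follows. This is only an illustration, not the driver's code: the struct, the `poll_cq()` and `txq_stop_queue()` helpers, and the way the counters are updated are assumptions. Only the counter names 'txq_stop' and 'txq_wake' come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical completion-queue entry types. */
enum cqe_type { CQE_RX, CQE_TX };

/* Hypothetical per-queue state; only the counter names come from the commit. */
struct txq_state {
    bool stopped;
    unsigned long txq_stop;  /* number of times the queue was stopped */
    unsigned long txq_wake;  /* number of times the queue was woken */
};

/* Stop the queue, e.g. because the send queue ran out of descriptors. */
static void txq_stop_queue(struct txq_state *q)
{
    if (!q->stopped) {
        q->stopped = true;
        q->txq_stop++;
    }
}

/*
 * Process one batch of CQEs. The queue is woken only if at least one
 * CQE_TX was processed, since only tx completions free send-queue space.
 * Returns the number of tx completions seen.
 */
static size_t poll_cq(struct txq_state *q, const enum cqe_type *cqes, size_t n)
{
    size_t tx_done = 0;

    for (size_t i = 0; i < n; i++)
        if (cqes[i] == CQE_TX)
            tx_done++;

    if (tx_done > 0 && q->stopped) {
        q->stopped = false;
        q->txq_wake++;
    }
    return tx_done;
}
```

A poll that sees only receive completions leaves a stopped queue stopped. The first poll that sees a CQE_TX wakes it and bumps `txq_wake`.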
2015-07-29net: thunderx: Suppress alloc_pages() failure warningsSunil Goutham
Suppress the standard alloc_pages() warnings. Some kernel configs limit the allocation size, so the network driver's allocation may fail. Do not emit a kernel warning in this case; instead just print a one-line message that the network driver could not be loaded since the buffer could not be allocated. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Fix TSO packet statisticSunil Goutham
Fix TSO packets not being counted. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Fix memory leak when changing queue countSunil Goutham
Fix for memory leak when changing queue/channel count via ethtool Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Fix RQ_DROP miscalculationSunil Goutham
With the earlier configured value, a sufficient number of CQEs was not being reserved for transmitted packets. Hence, under heavy incoming traffic load, receive notifications take up most of the CQ and transmit notifications are lost, resulting in tx skbs not being freed. Eventually the SQ becomes full and is stopped, and the watchdog timer kicks in. After this fix, receive notifications will not take more than half of the CQ, reserving the rest for transmit notifications. Also changed the CQ and SQ sizes from 16k to 4k. This is also because receive notifications take the first half of the CQ under heavy load, and the time NAPI takes to clear transmit notifications increases with larger queue sizes, which again results in the SQ being stopped. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
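The reservation described above comes down to simple arithmetic, sketched here in user-space C. The helper names and the occupancy check are hypothetical and do not reflect how the hardware register is actually programmed. Only the 4k queue size and the "no more than half" rule come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define CQ_LEN 4096u  /* 4k entries, the size the commit switches to */

/*
 * Hypothetical check: a receive notification may be queued only while
 * fewer than half of the CQ entries are in use.
 */
static bool rx_cqe_allowed(uint32_t cq_len, uint32_t cqes_in_use)
{
    return cqes_in_use < cq_len / 2;
}

/* Entries that therefore always remain available for transmit notifications. */
static uint32_t tx_reserved(uint32_t cq_len)
{
    return cq_len - cq_len / 2;
}
```

With a 4096-entry CQ, receive traffic is cut off once 2048 entries are occupied, so the remaining 2048 entries stay available for transmit completions.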
2015-07-29net: thunderx: Fix memory leak while tearing down interfaceSunil Goutham
Fixed 'tso_hdrs' memory not being freed properly. Also fixed SQ skbuff maintenance issues. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Fix data integrity issues with LDWBSunil Goutham
Switch back to LDD transactions from LDWB. While transmitting packets with LDWB transactions, data integrity issues are seen very frequently, hence the switch back to LDD. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29bnx2x: add vlan filtering offloadYuval Mintz
Current driver always uses vlan-promisc mode, i.e., it receives both tagged and untagged traffic and lets the network stack drop packets tagged with unrequested vlan tags. This patch implements vlan-filtering offload in the driver - Unless explicitly configured to promisc mode, only untagged packets or packets tagged with requested vlans would reach the Rx flow. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29Merge branch 'mlx5e-next'David S. Miller
Amir Vadai says: ==================== net/mlx5e: Driver update 29-Jul-2015 This patchset contains bug fixes and code cleanup patches to the ConnectX-4 Ethernet driver. Patchset was applied and tested over commit 8c1a91f ("Merge branch 'mlx4-802.1ad-accel'") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net/mlx5e: Remove the mlx5e_update_priv_params() functionAchiad Shochat
It was used to update netdev priv parameters that require stopping and re-opening the device in a generic way - it got the new parameters and did: ndo_stop(), copy new parameters into current parameters, ndo_open(). We chose to remove it for two reasons: 1) It requires additional instance of struct mlx5e_params on the stack and looking forward we expect this struct to grow. 2) Sometimes we want to do additional operations (besides just updating the priv parameters) while the netdev is stopped. For example, updating netdev->mtu @mlx5e_change_mtu() should be done while the netdev is stopped (done in this commit). Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net/mlx5e: Introduce create/destroy RSS indir table access functionsAchiad Shochat
Introduce access functions to create/destroy the RSS indirection table and use them in the Ethernet driver. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net/mlx5e: Do not use netdev_err() before the netdev is registeredAchiad Shochat
Since it is un-named at this time. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net/mlx5e: Avoid redundant de-referenceAchiad Shochat
Use the already defined rq pointer directly. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net/mlx5e: Remove redundant assignment of sq->user_indexAchiad Shochat
It is not needed by the mlx5 Eth driver since it has a CQ per RQ/SQ. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net/mlx5e: Remove redundant field mlx5e_priv->num_tcAchiad Shochat
This field already exists under the mlx5e_params struct Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net/mlx5e: Use hard-coded 4K page size for RQ/SQ/CQAchiad Shochat
The page size of the device's RQ/SQ/CQ objects is defined in 4K units regardless of the system page size. Thus using Linux's PAGE_SHIFT macro yields a wrong device configuration on systems where PAGE_SHIFT != 12. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
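The effect of mixing the two page sizes can be illustrated with a small user-space sketch. It assumes the device field holds the log2 of the buffer page size relative to a base page. The macro name `DEVICE_PAGE_SHIFT` and the helper are hypothetical, not the driver's identifiers.

```c
#include <stdint.h>

/* The device counts pages in 4 KiB units, independent of the host. */
#define DEVICE_PAGE_SHIFT 12u

/*
 * Hypothetical helper: log2 of the buffer page size relative to a base
 * page of (1 << base_shift) bytes. The device expects base_shift == 12.
 */
static uint32_t log_page_size_field(uint32_t buf_page_shift, uint32_t base_shift)
{
    return buf_page_shift - base_shift;
}
```

On a host with 64 KiB pages (PAGE_SHIFT of 16), a buffer made of 64 KiB pages should be described to the device as 2^4 units of 4 KiB. Subtracting the host's PAGE_SHIFT instead yields 0, which the device would read as 4 KiB pages.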
2015-07-29net/mlx5_core: Check the return value of mlx5_command_exec()Haggai Abramonvsky
mlx5_cmd_exec() might fail - need to check return value. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29openvswitch: Re-add CONFIG_OPENVSWITCH_VXLANThomas Graf
This re-adds the config option CONFIG_OPENVSWITCH_VXLAN to avoid a hard dependency of OVS on VXLAN. It moves the VXLAN config compat code to vport-vxlan.c and allows compilation as a module. Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device") Fixes: 2661371ace96 ("openvswitch: fix compilation when vxlan is a module") Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29ipv6: flush nd cache on IFF_NOARP changeEric Dumazet
This patch is the IPv6 equivalent of commit 6c8b4e3ff81b ("arp: flush arp cache on IFF_NOARP change"). Without it, we keep buggy neighbours in the cache, with destination MAC address equal to our own MAC address.
Tested:
  tcpdump -i eth0 -s 0 ip6 -n -e &
  ip link set dev eth0 arp off
  ping6 remote // sends buggy frames
  ip link set dev eth0 arp on
  ping6 remote // should work once kernel is patched
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mario Fanelli <mariofanelli@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: pktgen: Remove unused 'allocated_skbs' fieldBogdan Hamciuc
Field pktgen_dev.allocated_skbs had been written to, but never read from. The number of allocated skbs can be deduced anyway, from the total number of sent packets and the 'clone_skb' param. Signed-off-by: Bogdan Hamciuc <bogdan.hamciuc@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: pktgen: Observe needed_headroom of the deviceBogdan Hamciuc
Allocate enough space so as not to force the outgoing net device to do skb_realloc_headroom(). Signed-off-by: Bogdan Hamciuc <bogdan.hamciuc@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29lwtunnel: Make lwtun_encaps[] staticThomas Graf
Any external user should use the registration API instead of accessing this directly. Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29gianfar: Fix warnings when built on 64-bitScott Wood
As part of defconfig consolidation using fragments, we'd like to be able to have the same drivers enabled on 32-bit and 64-bit. Gianfar happens to only exist on 32-bit systems, and when building the resulting 64-bit kernel warnings were produced. A couple of the warnings are trivial, but the rfbptr code has deeper issues. It uses the virtual address as the DMA address, which again, happens to work in the environments where this driver is currently used, but is not the right thing to do. Fixes: 45b679c9a3cc ("gianfar: Implement PAUSE frame generation support") Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29Merge branch 'sk_txhash'David S. Miller
Tom Herbert says: ==================== net: Initialize sk_hash to random value and reset for failing cnxs This patch set implements a common function to simply set sk_txhash to a random number instead of going through the trouble to call flow dissector. From dst_negative_advice we now reset the sk_txhash in hopes of finding a better ECMP path through the network. Changing sk_txhash affects: - IPv6 flow label and UDP source port which affect ECMP in the network - Local ECMP route selection (pending changes to use sk_txhash) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: Recompute sk_txhash on negative routing adviceTom Herbert
When a connection is failing a transport protocol calls dst_negative_advice to try to get a better route. This patch includes changing the sk_txhash in that function. This provides a rudimentary method to try to find a different path in the network since sk_txhash affects ECMP on the local host and through the network (via flow labels or UDP source port in encapsulation). Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: Set sk_txhash from a random numberTom Herbert
This patch creates sk_set_txhash and eliminates the protocol-specific inet_set_txhash and ip6_set_txhash. sk_set_txhash simply sets a random number instead of performing flow dissection. sk_set_txhash is also allowed to be called multiple times for the same socket; we'll need this when redoing the hash for negative routing advice. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
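A user-space sketch of the idea follows: assign a random transmit hash, and allow it to be re-rolled later. Everything here is illustrative. The struct, the xorshift generator standing in for the kernel's random source, and the choice to avoid a zero hash are assumptions, not the kernel's implementation.

```c
#include <stdint.h>

/* Hypothetical stand-in for the socket field that holds the tx hash. */
struct sock_sketch {
    uint32_t txhash;
};

/* Small xorshift PRNG used only to keep this sketch self-contained. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Set the hash from a random number, with no flow dissection. Zero is
 * avoided on the assumption that it means "no hash". Calling this again
 * on the same socket simply picks a new value.
 */
static void set_txhash(struct sock_sketch *sk, uint32_t *rng_state)
{
    uint32_t h = xorshift32(rng_state);

    sk->txhash = h ? h : 1;
}
```

Calling `set_txhash()` a second time on the same socket models the re-hash on negative routing advice: the new value may steer the flow onto a different ECMP path.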
2015-07-29hwmon: (nct7802) Fix integer overflow seen when writing voltage limitsGuenter Roeck
Writing a large value into a voltage limit attribute can result in an overflow due to an auto-conversion from unsigned long to unsigned int. Cc: Constantine Shulyupin <const@MakeLinux.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Guenter Roeck <linux@roeck-us.net>
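The class of bug described here, a silent narrowing conversion, can be reproduced in a few lines of user-space C. The sketch uses fixed-width types in place of unsigned long and unsigned int so that it behaves the same on every platform. The helper names and the limit of 4095 are hypothetical and unrelated to the chip's real register ranges.

```c
#include <stdint.h>

/* Narrowing without a range check: the upper 32 bits are silently dropped. */
static uint32_t narrow_unchecked(uint64_t value)
{
    return (uint32_t)value;
}

/* Clamp to the supported maximum first, then narrow. */
static uint32_t narrow_clamped(uint64_t value, uint32_t max)
{
    return value > max ? max : (uint32_t)value;
}
```

A user-supplied value of 0x100000001 becomes 1 when narrowed unchecked, so a very large limit would be stored as a very small one. Clamping before the conversion keeps the result at the maximum instead.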
2015-07-29hwmon: (nct7904) Rename pwm attributes to match hwmon ABIGuenter Roeck
pwm attributes have well defined names, which should be used. Cc: Vadim V. Vlasov <vvlasov@dev.rtsoft.ru> Cc: stable@vger.kernel.org #v4.1+ Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2015-07-30Merge branch 'msm-fixes-4.2' of git://people.freedesktop.org/~robclark/linux into drm-fixesDave Airlie
Fix for nasty crash on mdp4 in disable path, fix for dma-buf export, smb leak on mdp5 which could result in intermittent modeset fails, and don't let interrupted system call disturb atomic commit once we are past the point of no return.
* 'msm-fixes-4.2' of git://people.freedesktop.org/~robclark/linux:
  drm/msm/mdp5: release SMB (shared memory blocks) in various cases
  drm/msm: change to uninterruptible wait in atomic commit
  drm/msm: mdp4: Fix drm_framebuffer dereference crash
  drm/msm: fix msm_gem_prime_get_sg_table()
2015-07-30Merge branch 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux into drm-fixesDave Airlie
Radeon and amdgpu fixes for 4.2. The audio fix ended up being more invasive than I would have liked, but this should finally fix up the last of the regressions since DP audio support was added.
* 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: add new parameter to seperate map and unmap
  drm/amdgpu: hdp_flush is not needed for inside IB
  drm/amdgpu: different emit_ib for gfx and compute
  drm/amdgpu: information leak in amdgpu_info_ioctl()
  drm/amdgpu: clean up init sequence for failures
  drm/radeon/combios: add some validation of lvds values
  drm/radeon: rework audio modeset to handle non-audio hdmi features
  drm/radeon: rework audio detect (v4)
  drm/amdgpu: Drop drm/ prefix for including drm.h in amdgpu_drm.h
  drm/radeon: Drop drm/ prefix for including drm.h in radeon_drm.h
2015-07-29Merge branch 'netcp-fixes'David S. Miller
Murali Karicheri says: ==================== net: netcp: bug fixes for dynamic module support This series fixes a few bugs to allow the keystone netcp modules to be dynamically loaded and removed. Currently it allows the following sequence to be run multiple times:
  insmod cpsw_ale.ko
  insmod davinci_mdio.ko
  insmod keystone_netcp.ko
  insmod keystone_netcp_ethss.ko
  ifup eth0
  ifup eth1
  ping <hosts on eth0>
  ping <hosts on eth1>
  ifdown eth1
  ifdown eth0
  rmmod keystone_netcp_ethss.ko
  rmmod keystone_netcp.ko
  rmmod davinci_mdio.ko
  rmmod cpsw_ale.ko
==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: netcp: ethss: cleanup gbe_probe() and gbe_remove() functionsKaricheri, Muralidharan
This patch cleans up the error handling code to use goto labels properly. In some cases, the code unnecessarily uses goto instead of just returning the error code. The code also makes explicit calls to devm_* APIs on error, which is not necessary. gbe_remove() makes similar calls, which are also unnecessary. Also fix a few checkpatch warnings. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: netcp: ethss: fix up incorrect use of list apiKaricheri, Muralidharan
The code seems to assume that first_sec_slave() returns NULL when the list is empty, and relies on that to break the loop, which is incorrect. Fix the code by using list_empty(). Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
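The pitfall is easy to reproduce with a user-space imitation of the kernel's circular doubly linked list. The sketch below is not the driver's code. `struct slave`, `first_slave()`, and the small list helpers are hypothetical stand-ins for the real list API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal imitation of a circular doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* An empty list is one whose head points back to itself, not to NULL. */
static bool list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

static void list_add_to_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Hypothetical list member. */
struct slave {
    int id;
    struct list_head node;
};

/*
 * Return the first slave, or NULL if the list is empty. The explicit
 * emptiness check is required: head->next is never NULL in a circular list.
 */
static struct slave *first_slave(struct list_head *head)
{
    if (list_is_empty(head))
        return NULL;
    return (struct slave *)((char *)head->next - offsetof(struct slave, node));
}
```

A loop of the form "while there is a first entry" terminates only if the emptiness test is explicit, because the head's `next` pointer refers back to the head itself when no entries remain.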
2015-07-29net: netcp: fix cleanup interface list in netcp_remove()Karicheri, Muralidharan
Currently, if the user runs rmmod keystone_netcp.ko, the following warning is seen:
  [ 59.035891] ------------[ cut here ]------------
  [ 59.040535] WARNING: CPU: 2 PID: 1619 at drivers/net/ethernet/ti/netcp_core.c:2127 netcp_remove)
This is because the interface list is not cleaned up in netcp_remove. This patch fixes this. Also fix some checkpatch related warnings. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29Merge tag 'pm+acpi-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds
Pull power management and ACPI fixes from Rafael Wysocki: "These fix three regressions, two recent ones (cpufreq core and ACPI device power management) and one introduced during the 4.1 cycle (intel_pstate). Specifics:
- Fix a recently introduced issue in the cpufreq core causing it to attempt to create duplicate symbolic links to the policy directory in sysfs for CPUs that are offline when the cpufreq driver is being registered (Rafael J Wysocki)
- Fix a recently introduced problem in the ACPI device power management core code causing it to store an incorrect value in the device object's power.state field in some cases which in turn leads to attempts to turn power resources off while they should still be on going forward (Mika Westerberg)
- Fix an intel_pstate driver issue introduced during the 4.1 cycle which leads to kernel panics on boot on Knights Landing chips due to incomplete support for them in that driver (Lukasz Anaczkowski)"
* tag 'pm+acpi-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Avoid attempts to create duplicate symbolic links
  ACPI / PM: Use target_state to set the device power state
  intel_pstate: Add get_scaling cpu_defaults param to Knights Landing
2015-07-29Merge tag 'dm-4.2-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dmLinus Torvalds
Pull device mapper fixes from Mike Snitzer:
- fix DM thinp to consistently return -ENOSPC when out of data space
- fix a logic bug in the DM cache smq policy's creation error path
- revert a DM cache 4.2-rc3 change that reduced writeback efficiency
- fix a hang on DM cache device destruction due to improper prealloc_used accounting introduced in 4.2-rc3
- update URL for dm-crypt wiki page
* tag 'dm-4.2-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: fix device destroy hang due to improper prealloc_used accounting
  Revert "dm cache: do not wake_worker() in free_migration()"
  dm crypt: update wiki page URL
  dm cache policy smq: fix alloc_bitset check that always evaluates as false
  dm thin: return -ENOSPC when erroring retry list due to out of data space
2015-07-29Merge branch 'thunderx_octeon_mdio'David S. Miller
Radha Mohan Chintakuntla says: ==================== Add MDIO support to ThunderX NIC driver This patch series adds MDIO support to ThunderX NIC driver by making use of existing mdio-octeon driver. In the process modified the mdio-octeon driver to work on both Octeon and ThunderX platforms. * From v1: - Removed default selection in Kconfig for MDIO_OCTEON - Replace uint64 with u64 as suggested by David Daney ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: thunderx: Select CONFIG_MDIO_OCTEON for ThunderX NICRadha Mohan Chintakuntla
The CONFIG_MDIO_OCTEON is required so that the ThunderX NIC driver can talk to the PHY drivers. Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: mdio-octeon: Fix octeon_mdiobus_probe function for return valuesRadha Mohan Chintakuntla
This patch fixes a possible crash in the octeon_mdiobus_probe function if the return values are not handled properly. Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com> Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: mdio-octeon: Modify driver to work on both ThunderX and OcteonRadha Mohan Chintakuntla
This patch modifies the mdio-octeon driver to work on both ThunderX and Octeon SoCs from Cavium Inc. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29net: netcp: Fixes efuse mac addr swap on k2e and k2lWingMan Kwok
On some of the K2E and K2L platforms, the two DWORDs in efuse occupied by the pre-programmed MAC address for slave port 1 are swapped. To work around this issue, this patch adds a new define NETCP_EFUSE_ADDR_SWAP (2), which signifies the occurrence of such swapping so that the driver can take proper action. The flag can be enabled in the dts binding as efuse-mac = <2> under the corresponding netcp interface node. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29ebpf, x86: fix general protection fault when tail call is invokedDaniel Borkmann
With eBPF JIT compiler enabled on x86_64, I was able to reliably trigger the following general protection fault out of an eBPF program with a simple tail call, f.e. tracex5 (or a stripped down version of it):

  [ 927.097918] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
  [...]
  [ 927.100870] task: ffff8801f228b780 ti: ffff880016a64000 task.ti: ffff880016a64000
  [ 927.102096] RIP: 0010:[<ffffffffa002440d>] [<ffffffffa002440d>] 0xffffffffa002440d
  [ 927.103390] RSP: 0018:ffff880016a67a68 EFLAGS: 00010006
  [ 927.104683] RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000001
  [ 927.105921] RDX: 0000000000000000 RSI: ffff88014e438000 RDI: ffff880016a67e00
  [ 927.107137] RBP: ffff880016a67c90 R08: 0000000000000000 R09: 0000000000000001
  [ 927.108351] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880016a67e00
  [ 927.109567] R13: 0000000000000000 R14: ffff88026500e460 R15: ffff880220a81520
  [ 927.110787] FS: 00007fe7d5c1f740(0000) GS:ffff880265000000(0000) knlGS:0000000000000000
  [ 927.112021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 927.113255] CR2: 0000003e7bbb91a0 CR3: 000000006e04b000 CR4: 00000000001407e0
  [ 927.114500] Stack:
  [ 927.115737] ffffc90008cdb000 ffff880016a67e00 ffff88026500e460 ffff880220a81520
  [ 927.117005] 0000000100000000 000000000000001b ffff880016a67aa8 ffffffff8106c548
  [ 927.118276] 00007ffcdaf22e58 0000000000000000 0000000000000000 ffff880016a67ff0
  [ 927.119543] Call Trace:
  [ 927.120797] [<ffffffff8106c548>] ? lookup_address+0x28/0x30
  [ 927.122058] [<ffffffff8113d176>] ? __module_text_address+0x16/0x70
  [ 927.123314] [<ffffffff8117bf0e>] ? is_ftrace_trampoline+0x3e/0x70
  [ 927.124562] [<ffffffff810c1a0f>] ? __kernel_text_address+0x5f/0x80
  [ 927.125806] [<ffffffff8102086f>] ? print_context_stack+0x7f/0xf0
  [ 927.127033] [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
  [ 927.128254] [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
  [ 927.129461] [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
  [ 927.130654] [<ffffffff8119ee4a>] trace_call_bpf+0x8a/0x140
  [ 927.131837] [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
  [ 927.133015] [<ffffffff8119f008>] kprobe_perf_func+0x28/0x220
  [ 927.134195] [<ffffffff811a1668>] kprobe_dispatcher+0x38/0x60
  [ 927.135367] [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
  [ 927.136523] [<ffffffff81061400>] kprobe_ftrace_handler+0xf0/0x150
  [ 927.137666] [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
  [ 927.138802] [<ffffffff8117950c>] ftrace_ops_recurs_func+0x5c/0xb0
  [ 927.139934] [<ffffffffa022b0d5>] 0xffffffffa022b0d5
  [ 927.141066] [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
  [ 927.142199] [<ffffffff81174b95>] seccomp_phase1+0x5/0x230
  [ 927.143323] [<ffffffff8102c0a4>] syscall_trace_enter_phase1+0xc4/0x150
  [ 927.144450] [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
  [ 927.145572] [<ffffffff8102c0a4>] ? syscall_trace_enter_phase1+0xc4/0x150
  [ 927.146666] [<ffffffff817f9a9f>] tracesys+0xd/0x44
  [ 927.147723] Code: 48 8b 46 10 48 39 d0 76 2c 8b 85 fc fd ff ff 83 f8 20 77 21 83 c0 01 89 85 fc fd ff ff 48 8d 44 d6 80 48 8b 00 48 83 f8 00 74 0a <48> 8b 40 20 48 83 c0 33 ff e0 48 89 d8 48 8b 9d d8 fd ff ff 4c
  [ 927.150046] RIP [<ffffffffa002440d>] 0xffffffffa002440d

The code section with the instruction that traps points into the eBPF JIT image of the root program (the one invoking the tail call instruction). Using bpf_jit_disasm -o on the eBPF root program image:

  [...]
  4e: mov -0x204(%rbp),%eax
      8b 85 fc fd ff ff
  54: cmp $0x20,%eax <--- if (tail_call_cnt > MAX_TAIL_CALL_CNT)
      83 f8 20
  57: ja 0x000000000000007a
      77 21
  59: add $0x1,%eax <--- tail_call_cnt++
      83 c0 01
  5c: mov %eax,-0x204(%rbp)
      89 85 fc fd ff ff
  62: lea -0x80(%rsi,%rdx,8),%rax <--- prog = array->prog[index]
      48 8d 44 d6 80
  67: mov (%rax),%rax
      48 8b 00
  6a: cmp $0x0,%rax <--- check for NULL
      48 83 f8 00
  6e: je 0x000000000000007a
      74 0a
  70: mov 0x20(%rax),%rax <--- GPF triggered here! fetch of bpf_func
      48 8b 40 20 [ matches <48> 8b 40 20 ... from above ]
  74: add $0x33,%rax <--- prologue skip of new prog
      48 83 c0 33
  78: jmpq *%rax <--- jump to new prog insns
      ff e0
  [...]

The problem is that rax has 5a5a5a5a5a5a5a5a, which suggests a tail call jump to map slot 0 is pointing to a poisoned page. The issue is the following: lea instruction has a wrong offset, i.e. it should be ...

  lea 0x80(%rsi,%rdx,8),%rax

... but it actually seems to be ...

  lea -0x80(%rsi,%rdx,8),%rax

... where 0x80 is offsetof(struct bpf_array, prog), thus the offset needs to be positive instead of negative. Disassembling the interpreter, we btw similarly do:

  [...]
  c88: lea 0x80(%rax,%rdx,8),%rax <--- prog = array->prog[index]
       48 8d 84 d0 80 00 00 00
  c90: add $0x1,%r13d
       41 83 c5 01
  c94: mov (%rax),%rax
       48 8b 00
  [...]

Now the other interesting fact is that this panic triggers only when things like CONFIG_LOCKDEP are being used. In that case offsetof(struct bpf_array, prog) starts at offset 0x80 and in non-CONFIG_LOCKDEP case at offset 0x50. Reason is that the work_struct inside struct bpf_map grows by 48 bytes in my case due to the lockdep_map member (which also has CONFIG_LOCK_STAT enabled members). Changing the emitter to always use the 4 byte displacement in the lea instruction fixes the panic on my side. It increases the tail call instruction emission by 3 more bytes, but it should cover us from various combinations (and perhaps other future increases on related structures). After patch, disassembly:

  [...]
  9e: lea 0x80(%rsi,%rdx,8),%rax <--- CONFIG_LOCKDEP/CONFIG_LOCK_STAT
      48 8d 84 d6 80 00 00 00
  a6: mov (%rax),%rax
      48 8b 00
  [...]

  [...]
  9e: lea 0x50(%rsi,%rdx,8),%rax <--- No CONFIG_LOCKDEP
      48 8d 84 d6 50 00 00 00
  a6: mov (%rax),%rax
      48 8b 00
  [...]

Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
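The sign flip follows from how x86 encodes displacements: a 1-byte displacement is sign-extended, so it can only represent values from -128 to 127. The user-space sketch below shows what happens when an offset is squeezed into one byte. The helper names are hypothetical and this is not the JIT emitter's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* A displacement fits the 1-byte encoding only if it lies in [-128, 127]. */
static bool fits_in_disp8(int32_t disp)
{
    return disp >= -128 && disp <= 127;
}

/*
 * What the CPU computes if only the low byte of disp is emitted as an
 * 8-bit displacement: the byte is sign-extended to the full width.
 */
static int32_t decode_disp8(int32_t disp)
{
    uint8_t byte = (uint8_t)(disp & 0xff);

    return byte >= 0x80 ? (int32_t)byte - 256 : (int32_t)byte;
}
```

An offset of 0x50 survives the 1-byte encoding unchanged, which matches the report that the non-CONFIG_LOCKDEP layout did not crash. An offset of 0x80 decodes as -0x80, which is the negative displacement seen in the faulting lea. Always emitting the 4-byte form sidesteps the range limit at the cost of three extra bytes.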
2015-07-29tipc: fix bug in broadcast synch message create functionJon Maloy
In commit d999297c3dbbe7fdd832f7fa4ec84301e170b3e6 ("tipc: reduce locking scope during packet reception") we introduced a new function tipc_build_bcast_sync_msg(), which carries initial synchronization data between two nodes at first contact and at re-contact. In this function, we neglected to add the synchronization data, with the effect that the broadcast link endpoints will fail to synchronize correctly at re-contact between a running and a restarted node. All other cases work as intended. With this commit, we fix this bug. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29packet: remove handling of tx_ring from prb_shutdown_retire_blk_timer()Tobias Klauser
Follow e8e85cc5eb57 ("packet: remove handling of tx_ring") and remove the tx_ring parameter from prb_shutdown_retire_blk_timer() as it is only called with tx_ring = 0. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29bridge: mdb: fix delmdb state in the notificationNikolay Aleksandrov
Since mdb states were introduced, when deleting an entry the state was left as it was set in the delete request from the user, which leads to the following output when doing a monitor (for example):

  $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
  $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 temp
                                            ^^^

Note the "temp" state in the delete notification, which is wrong since the entry was permanent: the state in a delete is always reported as "temp" regardless of the real state of the entry. After this patch:

  $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
  $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent

There's one important note to make here: the state is actually not matched when doing a delete, so one can delete a permanent entry by stating "temp" at the end of the command. I've chosen this fix in order not to break user-space tools which rely on this (incorrect) behaviour. So to give an example after this patch and using the wrong state:

  $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
  $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 temp
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent

Note the state of the entry that got deleted is correct in the notification. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29Merge branch 's390-bpf-push-pop'David S. Miller
Michael Holzheu says: ==================== s390/bpf: recache skb->data/hlen for skb_vlan_push/pop Here is the s390 backend for Alexei's patch 4e10df9a60d9 ("bpf: introduce bpf_skb_vlan_push/pop() helpers") plus two bugfixes and two minor improvements. The first patch "s390/bpf: clear correct BPF accumulator register" will also go upstream via Martin's "fixes" branch. * v2: Integrated suggestions from Joe Perches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-29s390/bpf: recache skb->data/hlen for skb_vlan_push/popMichael Holzheu
Allow eBPF programs attached to TC qdiscs to call skb_vlan_push/pop via helper functions. These functions may change skb->data/hlen. This data is cached by the s390 JIT to improve the performance of ld_abs/ld_ind instructions. Therefore, after a change, we have to reload the data. When skb_vlan_push/pop are used, the prologue stores the SKB pointer on the stack, and it is restored after the BPF_JMP_CALL to skb_vlan_push/pop. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>