summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-07-01Merge tag 'usb-4.7-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB and PHY fixes from Greg KH: "Here are a number of small USB and PHY driver fixes for 4.7-rc6. Nothing major here, all are described in the shortlog below. All have been in linux-next with no reported issues" * tag 'usb-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: don't free bandwidth_mutex too early USB: EHCI: declare hostpc register as zero-length array phy-sun4i-usb: Fix irq free conditions to match request conditions phy: bcm-ns-usb2: checking the wrong variable phy-sun4i-usb: fix missing __iomem * phy: phy-sun4i-usb: Fix optional gpios failing probe phy: rockchip-dp: fix return value check in rockchip_dp_phy_probe() phy: rcar-gen3-usb2: fix unexpected repeat interrupts of VBUS change usb: common: otg-fsm: add license to usb-otg-fsm
2016-07-01Merge tag 'iommu-fixes-v4.7-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "Three fixes: - Fix use of smp_processor_id() in preemptible code in the IOVA allocation code. This got introduced with the scalability improvements in this release cycle. - A VT-d fix for out-of-bounds access of the iommu->domains array. The bug showed during suspend/resume. - AMD IOMMU fix to print the correct device id in the ACPI parsing code" * tag 'iommu-fixes-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Initialize devid variable before using it iommu/vt-d: Fix overflow of iommu->domains array iommu/iova: Disable preemption around use of this_cpu_ptr()
2016-07-01Merge remote-tracking branches 'regulator/fix/anatop' and ↵Mark Brown
'regulator/fix/max77620' into regulator-linus
2016-07-01Merge remote-tracking branches 'asoc/fix/rcar', 'asoc/fix/rt5670' and ↵Mark Brown
'asoc/fix/wm8940' into asoc-linus
2016-07-01Merge remote-tracking branches 'asoc/fix/ak4613', 'asoc/fix/arizona', ↵Mark Brown
'asoc/fix/cx20442', 'asoc/fix/davinci', 'asoc/fix/fsl-ssi' and 'asoc/fix/hdmi' into asoc-linus
2016-07-01Merge remote-tracking branch 'asoc/fix/rt5645' into asoc-linusMark Brown
2016-07-01Merge remote-tracking branch 'asoc/fix/intel' into asoc-linusMark Brown
2016-07-01netfilter: Remove references to obsolete CONFIG_IP_ROUTE_FWMARKMoritz Sichert
This option was removed in commit 47dcf0cb1005 ("[NET]: Rethink mark field in struct flowi"). Signed-off-by: Moritz Sichert <moritz+linux@sichert.me> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01etherdevice.h & bridge: netfilter: Add and use ether_addr_equal_maskedJoe Perches
There are code duplications of a masked ethernet address comparison here so make it a separate function instead. Miscellanea: o Neaten alignment of FWINV macro uses to make it clearer for the reader Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01netfilter: x_tables: simplify ip{6}table_mangle_hook()Pablo Neira Ayuso
No need for a special case to handle NF_INET_POST_ROUTING, this is basically the same handling as for prerouting, input, forward. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01locks: use file_inode()Miklos Szeredi
(Another one for the f_path debacle.) ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask. The reason is that generic_add_lease() used filp->f_path.dentry->inode while all the others use file_inode(). This makes a difference for files opened on overlayfs since the former will point to the overlay inode the latter to the underlying inode. So generic_add_lease() added the lease to the overlay inode and generic_delete_lease() removed it from the underlying inode. When the file was released the lease remained on the overlay inode's lock list, resulting in use after free. Reported-by: Eryu Guan <eguan@redhat.com> Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-01Merge branch 'mlx5-fixes'David S. Miller
Saeed Mahameed says: ==================== Mellanox 100G mlx5 resiliency and xmit path fixes This series provides two set of fixes to the mlx5 driver: - Resiliency fixes for reset flow and internal pci errors - xmit path fixes Please consider queuing those patches for -stable (4.6). Reset flow fixes for core driver: - Add more commands to the list of error simulated commands when pci errors occur - Avoid calling sleeping function by the health poll thread - Fix incorrect page count when in internal error - Fix timeout in wait vital for VFs - Deadlock fix and Timeout handling in commands interface Reset flow and resiliency fixes for mlx5e netdev driver: - Handle RQ flush in error cases - Implement ndo_tx_timeout callback - Timeout if SQ doesn't flush during close - Log link state changes - Validate BW weight values of ETS xmit path fixes: - Fix wrong fallback assumption in select queue callback - Account for all L2 headers when copying headers into inline segment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5e: Log link state changesShaker Daibes
Add Link UP/Down prints to kernel log when link state changes Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5e: Validate BW weight values of ETSRana Shahout
Valid weight assigned to ETS TClass values are 1-100 Fixes: 08fb1dacdd76 ('net/mlx5e: Support DCBNL IEEE ETS') Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5e: Fix select queue callbackRana Shahout
The default fallback function used by mlx5e select queue can return any TX queues in range [0..dev->num_real_tx_queues). The current implementation assumes that the fallback function returns a number in the range [0.. number of channels). Actually dev->num_real_tx_queues = (number of channels) * dev->num_tc; which is more than the expected range if num_tc is configured and could lead to crashes. To fix this we test if num_tc is not configured we can safely return the fallback suggestion, if not we will reciprocal_scale the fallback result and normalize it to the desired range. Fixes: 08fb1dacdd76 ('net/mlx5e: Support DCBNL IEEE ETS') Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reported-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5e: Copy all L2 headers into inline segmentMatthew Finlay
ConnectX4-Lx uses an inline wqe mode that currently defaults to requiring the entire L2 header be included in the wqe. This patch fixes mlx5e_get_inline_hdr_size() to account for all L2 headers (VLAN, QinQ, etc) using skb_network_offset(skb). Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files") Signed-off-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5e: Handle RQ flush in error casesDaniel Jurgens
Add a timeout to avoid an infinite loop waiting for RQ's to flush. This occurs during AER/EEH and will also happen if the device stops posting completions due to internal error or reset, or if moving the RQ to the error state fails. Also cleanup posted receive resources when closing the RQ. Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5e: Implement ndo_tx_timeout callbackDaniel Jurgens
Add callback to handle TX timeouts. Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5e: Timeout if SQ doesn't flush during closeDaniel Jurgens
Avoid an infinite loop by timing out waiting for the SQ to flush. Also clean up the TX descriptors if that happens. Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5: Add timeout handle to commands with callbackMohamad Haj Yahia
The current implementation does not handle timeout in case of command with callback request, and this can lead to deadlock if the command doesn't get fw response. Add delayed callback timeout work before posting the command to fw. In case of real fw command completion we will cancel the delayed work. In case of fw command timeout the callback timeout handler will be called and it will simulate fw completion with timeout error. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5: Fix potential deadlock in command mode changeMohamad Haj Yahia
Call command completion handler in case of timeout when working in interrupts mode. Avoid flushing the commands workqueue after acquiring the semaphores to prevent a potential deadlock. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5: Fix wait_vital for VFs and remove fixed sleepDaniel Jurgens
The device ID for VFs is in a different location than PFs. This results in the poll always timing out for VFs. There's no good way to read the VF device ID without using the PF's configuration space. Switch to waiting for the health poll to start incrementing. Also remove the 1s sleep at the beginning. fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5: Fix incorrect page count when in internal errorDaniel Jurgens
Change page cleanup flow when in internal error to properly decrement the page counts when reclaiming pages. The prevents timing out waiting for extra pages that were actually cleaned up previously. fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5: Avoid calling sleeping function by the health poll threadMohamad Haj Yahia
In internal error state the health poll thread will eventually call synchronize_irq() (to safely trigger command completions) which might sleep, so we are calling sleeping function from atomic context which is invalid. Here we move trigger_cmd_completions(dev) to enter error state which is the earliest stage in error state handling. This way we won't need to wait for next health poll to trigger command completions and will solve the scheduling while atomic issue. mlx5_enter_error_state can be called from two contexts, protect it with dev->intf_state_lock Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/mlx5: Fix teardown errors that happen in pci error handlerMohamad Haj Yahia
In case of internal error state we will simulate the commands status through the return value translation function, but we need to simulate all the teardown fw commands as successful so we will not have fw command failure prints. This also fix memory leaks that happen because we skip teardown stages due to failed fw commands. Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01bonding: prevent out of bound accessesEric Dumazet
ether_addr_equal_64bits() requires some care about its arguments, namely that 8 bytes might be read, even if last 2 byte values are not used. KASan detected a violation with null_mac_addr and lacpdu_mcast_addr in bond_3ad.c Same problem with mac_bcast[] and mac_v6_allmcast[] in bond_alb.c : Although the 8-byte alignment was there, KASan would detect out of bound accesses. Fixes: 815117adaf5b ("bonding: use ether_addr_equal_unaligned for bond addr compare") Fixes: bb54e58929f3 ("bonding: Verify RX LACPDU has proper dest mac-addr") Fixes: 885a136c52a8 ("bonding: use compare_ether_addr_64bits() in ALB") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01ASoC: rt5645: fix reg-2f default value.Bard Liao
The default value of reg-2f in codec rt5650 is 0x5002, not 0x1002. Signed-off-by: Bard Liao <bardliao@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-07-01net: mvneta: fix open() error cleanupRussell King - ARM Linux
If mvneta_mdio_probe() fails, a kernel warning is triggered due to missing cleanup in the error path. Add the necessary cleanup. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 281 at kernel/irq/manage.c:1814 __free_percpu_irq+0xfc/0x130 percpu IRQ 38 still enabled on CPU0! Modules linked in: bnep bluetooth xhci_plat_hcd xhci_hcd marvell_cesa armada_thermal des_generic ehci_orion mcp3021 spi_orion sfp mdio_i2c evbug fuse CPU: 1 PID: 281 Comm: connmand Not tainted 4.7.0-rc2+ #53 Hardware name: Marvell Armada 380/385 (Device Tree) Backtrace: [<c0013488>] (dump_backtrace) from [<c00137d0>] (show_stack+0x18/0x1c) r6:60010093 r5:ffffffff r4:00000000 r3:dc8ba500 [<c00137b8>] (show_stack) from [<c02c6fe0>] (dump_stack+0xa4/0xdc) [<c02c6f3c>] (dump_stack) from [<c002d4ec>] (__warn+0xd8/0x104) r6:c081e6a0 r5:00000000 r4:edfe5d50 r3:dc8ba500 [<c002d414>] (__warn) from [<c002d5d0>] (warn_slowpath_fmt+0x40/0x48) r10:a0010013 r8:c09356f8 r7:00000026 r6:ef11a260 r5:edd7b980 r4:ef11a200 [<c002d594>] (warn_slowpath_fmt) from [<c008c8e0>] (__free_percpu_irq+0xfc/0x130) r3:00000026 r2:c081e7ac [<c008c7e4>] (__free_percpu_irq) from [<c008c95c>] (free_percpu_irq+0x48/0x74) r10:00008914 r8:00000000 r7:ffffffed r6:c09356f8 r5:00000026 r4:ef11a200 [<c008c914>] (free_percpu_irq) from [<c043dd70>] (mvneta_open+0x118/0x134) r6:ffffffed r5:ef01e640 r4:ef01e000 r3:ef01e000 [<c043dc58>] (mvneta_open) from [<c055f5b4>] (__dev_open+0xa4/0x108) r7:ef01e030 r6:c06ff3d8 r5:ffff9003 r4:ef01e000 [<c055f510>] (__dev_open) from [<c055f844>] (__dev_change_flags+0x94/0x150) r7:00001002 r6:00000001 r5:ffff9003 r4:ef01e000 [<c055f7b0>] (__dev_change_flags) from [<c055f938>] (dev_change_flags+0x20/0x50) r8:00000000 r7:c09334c8 r6:00001002 r5:00000148 r4:ef01e000 r3:00008914 [<c055f918>] (dev_change_flags) from [<c05de044>] (devinet_ioctl+0x6f4/0x7e0) r8:00000000 r7:c09334c8 r6:00000000 r5:ee87200c r4:00000000 r3:00008914 [<c05dd950>] (devinet_ioctl) from [<c05e0168>] (inet_ioctl+0x1b8/0x1c8) r10:beb4499c r9:edfe4000 r8:ecf13280 r7:c096cf00 r6:beb4499c r5:eef7c240 r4:00008914 [<c05dffb0>] (inet_ioctl) from [<c053c898>] (sock_ioctl+0x78/0x300) [<c053c820>] (sock_ioctl) from [<c0155ecc>] (do_vfs_ioctl+0x98/0xa60) r7:00000011 r6:00008914 r5:00000011 r4:c01568d0 [<c0155e34>] (do_vfs_ioctl) from [<c01568d0>] (SyS_ioctl+0x3c/0x60) r10:00000000 r9:edfe4000 r8:beb4499c r7:00000011 r6:00008914 r5:ecf13280 r4:ecf13280 [<c0156894>] (SyS_ioctl) from [<c000fe60>] (ret_fast_syscall+0x0/0x1c) r8:c0010004 r7:00000036 r6:00000011 r5:000a2978 r4:00000000 r3:00009003 ---[ end trace 711f625d5b04b3a7 ]--- Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Tested-by: Jon Nettleton <jon@solid-run.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01atm: horizon: Use setup_timerAmitoj Kaur Chawla
Convert a call to init_timer and accompanying intializations of the timer's data and function fields to a call to setup_timer. The Coccinelle semantic patch that fixes this problem is as follows: @@ expression t,d,f,e1; identifier x1; statement S1; @@ ( -t.data = d; | -t.function = f; | -init_timer(&t); +setup_timer(&t,f,d); | -init_timer_on_stack(&t); +setup_timer_on_stack(&t,f,d); ) <... when != S1 t.x1 = e1; ...> Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01r8152: clear LINK_OFF_WAKE_EN after autoresumehayeswang
LINK_OFF_WAKE_EN should be cleared after autoresume, otherwise after system suspend, the system would wake up when linking off occurs. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01usb: dwc3: st: Use explicit reset_control_get_exclusive() APILee Jones
We're making all reset line users specify whether their lines are shared with other IP or they operate them exclusively. In this case the line is exclusively used only by this IP, so use the *_exclusive() API accordingly. Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-07-01phy: phy-stih407-usb: Use explicit reset_control_get_exclusive() APILee Jones
We're making all reset line users specify whether their lines are shared with other IP or they operate them exclusively. In this case the line is exclusively used only by this IP, so use the *_exclusive() API accordingly. Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-07-01phy: miphy28lp: Inform the reset framework that our reset line may be sharedLee Jones
On the STiH410 B2120 development board the MiPHY28lp shares its reset line with the Synopsys DWC3 SuperSpeed (SS) USB 3.0 Dual-Role-Device (DRD). New functionality in the reset subsystems forces consumers to be explicit when requesting shared/exclusive reset lines. Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-07-01Merge branch 'qed-next'David S. Miller
Manish Chopra says: ==================== qede: Enhancements This patch series have few small fastpath features support and code refactoring. Note - regarding get/set tunable configuration via ethtool Surprisingly, there is NO ethtool application support for such configuration given that we have kernel support. Do let us know if we need to add support for that in user ethtool. Please consider applying this series to "net-next". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01qede: Bump up driver version to 8.10.1.20Manish Chopra
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01qede: Add get/set rx copy break tunable supportManish Chopra
Signed-off-by: Manish <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01qede: Utilize xmit_moreManish Chopra
This patch uses xmit_more optimization to reduce number of TX doorbells write per packet. Signed-off-by: Manish <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01qede: qede_poll refactoringManish Chopra
This patch cleanups qede_poll() routine a bit and allows qede_poll() to do single iteration to handle TX completion [As under heavy TX load qede_poll() might run for indefinite time in the while(1) loop for TX completion processing and cause CPU stuck]. Signed-off-by: Manish <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01qede: Add support for handling IP fragmented packets.Manish Chopra
When handling IP fragmented packets with csum in their transport header, the csum isn't changed as part of the fragmentation. As a result, the packet containing the transport headers would have the correct csum of the original packet, but one that mismatches the actual packet that passes on the wire. As a result, on receive path HW would give an indication that the packet has incorrect csum, which would cause qede to discard the incoming packet. Since HW also delivers a notification of IP fragments, change driver behavior to pass such incoming packets to stack and let it make the decision whether it needs to be dropped. Signed-off-by: Manish <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01Merge branch 'tun-skb_array'David S. Miller
Jason Wang says: ==================== switch to use tx skb array in tun This series tries to switch to use skb array in tun. This is used to eliminate the spinlock contention between producer and consumer. The conversion was straightforward: just introdce a tx skb array and use it instead of sk_receive_queue. A minor issue is to keep the tx_queue_len behaviour, since tun used to use it for the length of sk_receive_queue. This is done through: - add the ability to resize multiple rings at once to avoid handling partial resize failure for mutiple rings. - add the support for zero length ring. - introduce a notifier which was triggered when tx_queue_len was changed for a netdev. - resize all queues during the tx_queue_len changing. Tests shows about 15% improvement on guest rx pps: Before: ~1300000pps After : ~1500000pps Changes from V3: - fix kbuild warnings - call NETDEV_CHANGE_TX_QUEUE_LEN on IFLA_TXQLEN Changes from V2: - add multiple rings resizing support for ptr_ring/skb_array - add zero length ring support - introdce a NETDEV_CHANGE_TX_QUEUE_LEN - drop new flags Changes from V1: - switch to use skb array instead of a customized circular buffer - add non-blocking support - rename .peek to .peek_len - drop lockless peeking since test show very minor improvement ==================== Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-from-altitude: 34697 feet. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01tun: switch to use skb array for txJason Wang
We used to queue tx packets in sk_receive_queue, this is less efficient since it requires spinlocks to synchronize between producer and consumer. This patch tries to address this by: - switch from sk_receive_queue to a skb_array, and resize it when tx_queue_len was changed. - introduce a new proto_ops peek_len which was used for peeking the skb length. - implement a tun version of peek_len for vhost_net to use and convert vhost_net to use peek_len if possible. Pktgen test shows about 15.3% improvement on guest receiving pps for small buffers: Before: ~1300000pps After : ~1500000pps Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net: introduce NETDEV_CHANGE_TX_QUEUE_LENJason Wang
This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this will be triggered when tx_queue_len. It could be used by net device who want to do some processing at that time. An example is tun who may want to resize tx array when tx_queue_len is changed. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01skb_array: add wrappers for resizingJason Wang
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01ptr_ring: support resizing multiple queuesMichael S. Tsirkin
Sometimes, we need support resizing multiple queues at once. This is because it was not easy to recover to recover from a partial failure of multiple queues resizing. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01skb_array: minor tweakJason Wang
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01ptr_ring: support zero length ringJason Wang
Sometimes, we need zero length ring. But current code will crash since we don't do any check before accessing the ring. This patch fixes this. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01Merge branch 'sch_hfsc-fixes-cleanups'David S. Miller
Michal Soltys says: ==================== HFSC patches, part 1 It's revised version of part of the patches I submitted really, really long time ago (back then I asked Patrick to ignore them as I found some issues shortly after submitting). Anyway this is the first set with very simple fixes/changes though some of them relatively subtle (I tried to do very exhaustive commit messages explaining what and why with those). The patches are against net-next tree. The second set will be heavier - or rather with more complex explanations, among those I have: - a fix to subtle issue introduced in http://permalink.gmane.org/gmane.linux.kernel.commits.2-4/8281 along with simplifying related stuff - update times to 96 bits (which allows to "just" use 32 bit shifts and improves curve definition accuracy at more extreme low/high speeds) - add curve "merging" instead of just selecting in convex case (computations mirror those from concave intersection) But these are eventually for later. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/sched/sch_hfsc.c: anchor virtual curve at proper vt in hfsc_change_fsc()Michal Soltys
cl->cl_vt alone is relative only to the current backlog period, while the curve operates on cumulative virtual time. This patch adds missing cl->cl_vtoff. Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/sched/sch_hfsc.c: go passive after vt updateMichal Soltys
When a class is going passive, it should update its cl_vt first to be consistent with the last dequeue operation. Otherwise its cl_vt will be one packet behind and parent's cvtmax might not be updated as well. One possible side effect is if some class goes passive and subsequently goes active /without/ its parent going passive - with cl_vt lagging one packet behind - comparison made in init_vf() will be affected (same period). Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net/sched/sch_hfsc.c: remove leftover dlist and droplistMichal Soltys
This is update to: commit a09ceb0e08140a ("sched: remove qdisc->drop") That commit removed qdisc->drop, but left alone dlist and droplist that no longer serve any meaningful purpose. Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: David S. Miller <davem@davemloft.net>