summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-10net/dim: use struct net_dim_sample as arg to net_dimAndy Gospodarek
Simplify the arguments net_dim() by formatting them into a struct net_dim_sample before calling the function. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Suggested-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net/mlx5e: Move dynamic interrupt coalescing code to include/linuxAndy Gospodarek
This move allows drivers to add private structure elements to track the number of packets, bytes, and interrupts events per ring. A driver also defines a workqueue handler to act on this collected data once per poll and modify the coalescing parameters per ring. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net/mlx5e: Change Mellanox references in DIM codeAndy Gospodarek
Change all appropriate mlx5_am* and MLX5_AM* references to net_dim and NET_DIM, respectively, in code that handles dynamic interrupt moderation. Also change all references from 'am' to 'dim' when used as local variables and add generic profile references. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net/mlx5e: Move generic functions to new fileAndy Gospodarek
These functions were identified as ones that could be made generic and used by multiple drivers. Most of the contents of en_rx_am.c are moved to net_dim.c. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net/mlx5e: Move AM logic enumsAndy Gospodarek
More movement to help make this code more generic. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net/mlx5e: Remove rq references in mlx5e_rx_amAndy Gospodarek
This makes mlx5e_am_sample more generic so that it can be called easily from a driver that does not use the same data structure to store these values in a single structure. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net/mlx5e: Move interrupt moderation forward declarationsAndy Gospodarek
Move these to newly created file to prepare to move these functions to a library. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net/mlx5e: Move interrupt moderation structs to new fileAndy Gospodarek
Create new header file to prepare to move code that handles irq moderation to a library that lives in a header file. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Acked-by: Tal Gilboa <talgi@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10Merge branch 'ipv6-Add-support-for-non-equal-cost-multipath'David S. Miller
Ido Schimmel says: ==================== ipv6: Add support for non-equal-cost multipath This set aims to add support for IPv6 non-equal-cost multipath routes. The first three patches convert multipath selection to use the hash-threshold method (RFC 2992) instead of modulo-N. The same method is employed by the IPv4 routing code since commit 0e884c78ee19 ("ipv4: L3 hash-based multipath"). Unlike modulo-N, with hash-threshold only the flows near the region boundaries are affected when a nexthop is added or removed. In addition, it allows us to easily add support for non-equal-cost multipath in the last patch by sizing the different regions according to the provided weights. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10ipv6: Add support for non-equal-cost multipathIdo Schimmel
The use of hash-threshold instead of modulo-N makes it trivial to add support for non-equal-cost multipath. Instead of dividing the multipath hash function's output space equally between the nexthops, each nexthop is assigned a region size which is proportional to its weight. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10ipv6: Use hash-threshold instead of modulo-NIdo Schimmel
Now that each nexthop stores its region boundary in the multipath hash function's output space, we can use hash-threshold instead of modulo-N in multipath selection. This reduces the number of checks we need to perform during lookup, as dead and linkdown nexthops are assigned a negative region boundary. In addition, in contrast to modulo-N, only flows near region boundaries are affected when a nexthop is added or removed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10ipv6: Use a 31-bit multipath hashIdo Schimmel
The hash thresholds assigned to IPv6 nexthops are in the range of [-1, 2^31 - 1], where a negative value is assigned to nexthops that should not be considered during multipath selection. Therefore, in a similar fashion to IPv4, we need to use the upper 31-bits of the multipath hash for multipath selection. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10ipv6: Calculate hash thresholds for IPv6 nexthopsIdo Schimmel
Before we convert IPv6 to use hash-threshold instead of modulo-N, we first need each nexthop to store its region boundary in the hash function's output space. The boundary is calculated by dividing the output space equally between the different active nexthops. That is, nexthops that are not dead or linkdown. The boundaries are rebalanced whenever a nexthop is added or removed to a multipath route and whenever a nexthop becomes active or inactive. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10vhost_net: batch used ring update in rxJason Wang
This patch tries to batched used ring update during RX. This is pretty fit for the case when guest is much faster (e.g dpdk based backend). In this case, used ring is almost empty: - we may get serious cache line misses/contending on both used ring and used idx. - at most 1 packet could be dequeued at one time, batching in guest does not make much effect. Update used ring in a batch can help since guest won't access the used ring until used idx was advanced for several descriptors and since we advance used ring for every N packets, guest will only need to access used idx for every N packet since it can cache the used idx. To have a better interaction for both batch dequeuing and dpdk batching, VHOST_RX_BATCH was used as the maximum number of descriptors that could be batched. Test were done between two machines with 2.40GHz Intel(R) Xeon(R) CPU E5-2630 connected back to back through ixgbe. Traffic were generated on one remote ixgbe through MoonGen and measure the RX pps through testpmd in guest when do xdp_redirect_map from local ixgbe to tap. RX pps were increased from 3.05 Mpps to 4.00 Mpps (about 31% improvement). One possible concern for this is the implications for TCP (especially latency sensitive workload). Result[1] does not show obvious changes for most of the netperf test (RR, TX, and RX). And we do get some improvements for RX on some specific size. Guest RX: size/sessions/+thu%/+normalize% 64/ 1/ +2%/ +2% 64/ 2/ +2%/ -1% 64/ 4/ +1%/ +1% 64/ 8/ 0%/ 0% 256/ 1/ +6%/ -3% 256/ 2/ -3%/ +2% 256/ 4/ +11%/ +11% 256/ 8/ 0%/ 0% 512/ 1/ +4%/ 0% 512/ 2/ +2%/ +2% 512/ 4/ 0%/ -1% 512/ 8/ -8%/ -8% 1024/ 1/ -7%/ -17% 1024/ 2/ -8%/ -7% 1024/ 4/ +1%/ 0% 1024/ 8/ 0%/ 0% 2048/ 1/ +30%/ +14% 2048/ 2/ +46%/ +40% 2048/ 4/ 0%/ 0% 2048/ 8/ 0%/ 0% 4096/ 1/ +23%/ +22% 4096/ 2/ +26%/ +23% 4096/ 4/ 0%/ +1% 4096/ 8/ 0%/ 0% 16384/ 1/ -2%/ -3% 16384/ 2/ +1%/ -4% 16384/ 4/ -1%/ -3% 16384/ 8/ 0%/ -1% 65535/ 1/ +15%/ +7% 65535/ 2/ +4%/ +7% 65535/ 4/ 0%/ +1% 65535/ 8/ 0%/ 0% TCP_RR: size/sessions/+thu%/+normalize% 1/ 1/ 0%/ +1% 1/ 25/ +2%/ +1% 1/ 50/ +4%/ +1% 64/ 1/ 0%/ -4% 64/ 25/ +2%/ +1% 64/ 50/ 0%/ -1% 256/ 1/ 0%/ 0% 256/ 25/ 0%/ 0% 256/ 50/ +4%/ +2% Guest TX: size/sessions/+thu%/+normalize% 64/ 1/ +4%/ -2% 64/ 2/ -6%/ -5% 64/ 4/ +3%/ +6% 64/ 8/ 0%/ +3% 256/ 1/ +15%/ +16% 256/ 2/ +11%/ +12% 256/ 4/ +1%/ 0% 256/ 8/ +5%/ +5% 512/ 1/ -1%/ -6% 512/ 2/ 0%/ -8% 512/ 4/ -2%/ +4% 512/ 8/ +6%/ +9% 1024/ 1/ +3%/ +1% 1024/ 2/ +3%/ +9% 1024/ 4/ 0%/ +7% 1024/ 8/ 0%/ +7% 2048/ 1/ +8%/ +2% 2048/ 2/ +3%/ -1% 2048/ 4/ -1%/ +11% 2048/ 8/ +3%/ +9% 4096/ 1/ +8%/ +8% 4096/ 2/ 0%/ -7% 4096/ 4/ +4%/ +4% 4096/ 8/ +2%/ +5% 16384/ 1/ -3%/ +1% 16384/ 2/ -1%/ -12% 16384/ 4/ -1%/ +5% 16384/ 8/ 0%/ +1% 65535/ 1/ 0%/ -3% 65535/ 2/ +5%/ +16% 65535/ 4/ +1%/ +2% 65535/ 8/ +1%/ -1% Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10Merge tag 'mlx5-updates-2018-01-08' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2018-01-08 Four patches from Or that add Hairpin support to mlx5: =========================================================== From: Or Gerlitz <ogerlitz@mellanox.com> We refer the ability of NIC HW to fwd packet received on one port to the other port (also from a port to itself) as hairpin. The application API is based on ingress tc/flower rules set on the NIC with the mirred redirect action. Other actions can apply to packets during the redirect. Hairpin allows to offload the data-path of various SW DDoS gateways, load-balancers, etc to HW. Packets go through all the required processing in HW (header re-write, encap/decap, push/pop vlan) and then forwarded, CPU stays at practically zero usage. HW Flow counters are used by the control plane for monitoring and accounting. Hairpin is implemented by pairing a receive queue (RQ) to send queue (SQ). All the flows that share <recv NIC, mirred NIC> are redirected through the same hairpin pair. Currently, only header-rewrite is supported as a packet modification action. I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing this functionality on HW simulator, before it was avail in the FW so the driver code could be tested early. =========================================================== From Feras three patches that provide very small changes that allow IPoIB to support RX timestamping for child interfaces, simply by hooking the mlx5e timestamping PTP ioctl to IPoIB child interface netdev profile. One patch from Gal to fix a spilling mistake. Two patches from Eugenia adds drop counters to VF statistics to be reported as part of VF statistics in netlink (iproute2) and implemented them in mlx5 eswitch. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10Merge branch 'hns3-next'David S. Miller
Peng Li says: ==================== code improvements in HNS3 driver This patchset fixes 2 comments for community review. [patch 1/2] reverts "net: hns3: Add packet statistics of netdev" reported by Jakub Kicinski and David Miller. [patch 2/2] reports the function type the same line with hns3_nic_get_stats64, reported by Andrew Lunn. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net: hns3: report the function type the same line with hns3_nic_get_stats64Peng Li
The function type should be on the same line with the function name, or it may cause display error if a patch edit the function. There is am example following: https://www.spinics.net/lists/netdev/msg476141.html Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10Revert "net: hns3: Add packet statistics of netdev"Peng Li
This reverts commit 8491000754796c838a0081c267f9dd54ad2ccba3. It is duplicate to add statistics of netdev for ethtool -S. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10Merge branch 'Socionext-Synquacer-NETSEC-driver'David S. Miller
Jassi Brar says: ==================== Socionext Synquacer NETSEC driver Changes since v5 # Removed helper macros # Removed 'inline' qualifier # Changed multiline empty comment to single line # Added 'clock-names' property in DT binding example # Ignore 'clock-names' property in driver until f/ws in the wild are upgraded or we support instance that take in more than one clock. # Rebased the patchset onto net-next Changes since v4 # Fixed ucode indexing as a word, instead of byte # Removed redundant clocks, keep only phy rate reference clock and expect it to be 'phy_ref_clk' Changes since v3 # Discard 'socionext,snq-mdio', and simply use 'mdio' subnode. # Use ioremap on ucode region as well, instead of memremap. Changes since v2 # Use 'mdio' subnode in DT bindings. # Use phy_interface_mode_is_rgmii(), instead of open coding the check. # Use readl/b with eeprom_base pointer. # Unregister mdio bus upon failure in probe. Changes since v1 # Switched from using memremap to ioremap # Implemented ndo_do_ioctl callback # Defined optional 'dma-coherent' DT property ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10MAINTAINERS: Add entry for Socionext ethernet driverJassi Brar
Add entry for the Socionext Netsec controller driver and DT bindings. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10net: socionext: Add Synquacer NetSec driverJassi Brar
This driver adds support for Socionext "netsec" IP Gigabit Ethernet + PHY IP used in the Synquacer SC2A11 SoC. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10dt-bindings: net: Add DT bindings for Socionext NetsecJassi Brar
This patch adds documentation for Device-Tree bindings for the Socionext NetSec Controller driver. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10Merge branch '10GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2018-01-09 This series contains updates to ixgbe and ixgbevf only. Emil fixes an issue with "wake on LAN"(WoL) where we need to ensure we enable the reception of multicast packets so that WoL works for IPv6 magic packets. Cleaned up code no longer needed with the update to adaptive ITR. Paul update the driver to advertise the highest capable link speed when a module gets inserted. Also extended the displaying of firmware version to include the iSCSI and OEM block in the EEPROM to better identify firmware versions/images. Tonghao Zhang cleans up a code comment that no longer applies since InterruptThrottleRate has been removed from the driver. Alex fixes SR-IOV and MACVLAN offload interaction, where the MACVLAN offload was incorrectly configuring several filters with the wrong pool value which resulted in MACLVAN interfaces not being able to receive traffic that had to pass over the physical interface. Fixed transmit hangs and dropped receive frames when the number of VFs changed. Added support for RSS on MACVLAN pools for X550 devices. Fixed up the MACVLAN limitations so we can now support 63 offloaded devices. Cleaned up MACVLAN code that is no longer needed with the recent changes and fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10Bluetooth: btbcm: Fix sleep mode struct orderingLukas Wunner
According to the documentation for Laird SD40 radio modules (which use the BCM4329 chipset), the order of the Enable_BREAK_To_Host and Pulsed_HOST_WAKE parameters in the sleep mode struct is reversed vis-à-vis our struct declaration. See page 46 of this PDF: http://cdn.lairdtech.com/home/brandworld/files/Application%20Note%20-%2040%20Series%20Bluetooth.pdf The documentation is dated Oct 2015, so fairly recent, making it appear more likely that the documentation is correct and our code is wrong. Amend our code to be in congruence with the documentation. Cc: Sue White <sue.white@lairdtech.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Sleep instead of spinningLukas Wunner
The driver calls mdelay(15) in the ->suspend, ->resume, ->runtime_suspend and ->runtime_resume hook, however spinning for such a long period of time is discouraged as per Documentation/timers/timers-howto.txt. The use of mdelay() seems unnecessary, it is allowed to sleep in the system sleep and runtime PM hooks (with the exception of ->suspend_noirq and ->resume_noirq) and the driver itself also does not rely on a non-sleeping ->runtime_resume as the only place where a synchronous resume is performed, in bcm_dequeue(), is called from a work item in hci_ldisc.c and hci_serdev.c. So replace the mdelay(15) with msleep(15). Note that the delay is inserted after asserting or deasserting the device wake pin, but in bcm_gpio_set_power() that pin is asserted or deasserted *without* observing a delay. It is thus unclear if the delay is necessary at all. It is likewise unclear why it is exactly 15 ms, the commit introducing it, 118612fb9165 ("Bluetooth: hci_bcm: Add suspend/resume PM functions"), does not provide a rationale. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Suggested-and-reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Silence IRQ printkLukas Wunner
The host wake IRQ is optional, but if none is found, "BCM irq: -22" is logged which may irritate users. This is really a debug message, so use dev_dbg() instead of dev_info(). If users are interested in the IRQ, they can always consult /proc/interrupts. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Support Apple GPIO handlingLukas Wunner
Enable Bluetooth on the following Macs which provide custom ACPI methods to toggle the GPIOs for device wake and shutdown instead of accessing the pins directly: MacBook8,1 2015 12" MacBook9,1 2016 12" MacBook10,1 2017 12" MacBookPro13,1 2016 13" MacBookPro13,2 2016 13" with Touch Bar MacBookPro13,3 2016 15" with Touch Bar MacBookPro14,1 2017 13" MacBookPro14,2 2017 13" with Touch Bar MacBookPro14,3 2017 15" with Touch Bar On the MacBook8,1 Bluetooth is muxed with a second device (a debug port on the SSD) under the control of PCH GPIO 36. Because serdev cannot deal with multiple slaves yet, it is currently necessary to patch the DSDT and remove the SSDC device. The custom ACPI methods are called: BTLP (Low Power) takes one argument, toggles device wake GPIO BTPU (Power Up) tells SMC to drive shutdown GPIO high BTPD (Power Down) tells SMC to drive shutdown GPIO low BTRS (Reset) calls BTPD followed by BTPU BTRB unknown, not present on all MacBooks Search for the BTLP, BTPU and BTPD methods on ->probe and cache them in struct bcm_device if the machine is a Mac. Additionally, set the init_speed based on a custom device property provided by Apple in lieu of _CRS resources. The Broadcom UART's speed is fixed on Apple Macs: Any attempt to change it results in Bluetooth status code 0x0c and bcm_set_baudrate() thus always returns -EBUSY. By setting only the init_speed and leaving oper_speed at zero, we can achieve that the host UART's speed is adjusted but the Broadcom UART's speed is left as is. The host wake pin goes into the SMC which handles it independently of the OS, so there's no IRQ for it. Thanks to Ronald Tschalär who did extensive debugging and testing of this patch and contributed fixes. ACPI snippet containing the custom methods and device properties (taken from a MacBook8,1): Method (BTLP, 1, Serialized) { If (LEqual (Arg0, 0x00)) { Store (0x01, GD54) /* set PCH GPIO 54 direction to input */ } If (LEqual (Arg0, 0x01)) { Store (0x00, GD54) /* set PCH GPIO 54 direction to output */ Store (0x00, GP54) /* set PCH GPIO 54 value to low */ } } Method (BTPU, 0, Serialized) { Store (0x01, \_SB.PCI0.LPCB.EC.BTPC) Sleep (0x0A) } Method (BTPD, 0, Serialized) { Store (0x00, \_SB.PCI0.LPCB.EC.BTPC) Sleep (0x0A) } Method (BTRS, 0, Serialized) { BTPD () BTPU () } Method (_DSM, 4, NotSerialized) // _DSM: Device-Specific Method { If (LEqual (Arg0, ToUUID ("a0b5b7c6-1318-441c-b0c9-fe695eaf949b"))) { Store (Package (0x08) { "baud", Buffer (0x08) { 0xC0, 0xC6, 0x2D, 0x00, 0x00, 0x00, 0x00, 0x00 }, "parity", Buffer (0x08) { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, "dataBits", Buffer (0x08) { 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, "stopBits", Buffer (0x08) { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 } }, Local0) DTGP (Arg0, Arg1, Arg2, Arg3, RefOf (Local0)) Return (Local0) } Return (0x00) } Link: https://github.com/Dunedan/mbp-2016-linux/issues/29 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=110901 Reported-by: Leif Liddy <leif.liddy@gmail.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Hans de Goede <hdegoede@redhat.com> Tested-by: Max Shavrick <mxms@me.com> [MacBook8,1] Tested-by: Leif Liddy <leif.liddy@gmail.com> [MacBook9,1] Tested-by: Daniel Roschka <danielroschka@phoenitydawn.de> [MacBookPro13,2] Tested-by: Ronald Tschalär <ronald@innovation.ch> [MacBookPro13,3] Tested-by: Peter Y. Chuang <peteryuchuang@gmail.com> [MacBookPro14,1] Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ronald Tschalär <ronald@innovation.ch> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Handle errors properlyLukas Wunner
A significant portion of this driver lacks error handling. As a first step, add error paths to bcm_gpio_set_power(), bcm_open(), bcm_close(), bcm_suspend_device(), bcm_resume_device(), bcm_resume(), bcm_probe() and bcm_serdev_probe(). (I've also scrutinized bcm_suspend() but think it's fine as is.) Those are all the functions accessing the device wake and shutdown GPIO. On Apple Macs the pins are accessed through ACPI methods, which may fail for various reasons, hence proper error handling is necessary. Non-Macs access the pins directly, which may fail as well but the GPIO core does not yet pass back errors to consumers. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Cc: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Add callbacks to toggle GPIOsLukas Wunner
MacBooks provides custom ACPI methods to toggle the GPIOs for device wake and shutdown instead of accessing the pins directly. Prepare for their support by adding callbacks to toggle the GPIOs, which on non-Macs do nothing more but call gpiod_set_value(). No functional change intended. Suggested-and-reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Document struct bcm_deviceLukas Wunner
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Invalidate IRQ on request failureLukas Wunner
If devm_request_irq() fails, the driver bails out of bcm_request_irq() but continues to ->setup the device (because the IRQ is optional). The driver subsequently calls devm_free_irq(), enable_irq_wake() and disable_irq_wake() on the IRQ even though requesting it failed. Avoid by invalidating the IRQ on request failure. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Fix unbalanced pm_runtime_disable()Lukas Wunner
On ->setup, pm_runtime_enable() is only called if a valid IRQ was found, but on ->close(), pm_runtime_disable() is called unconditionally. Disablement of runtime PM is recorded in a counter, so every pm_runtime_disable() needs to be balanced. Fix it. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reported-and-reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Fix race on closeLukas Wunner
Upon ->close, the driver powers the Bluetooth controller down, deasserts the device wake pin, updates the runtime PM status to "suspended" and finally frees the IRQ. Because the IRQ is freed last, a runtime resume can take place after the controller was powered down. The impact is not grave, the worst thing that can happen is that the device wake pin is reasserted (should have no effect while the regulator is off) and that setting the runtime PM status to "suspended" does not reflect reality. Still, it's wrong, so free the IRQ first. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Clean up unnecessary #ifdefLukas Wunner
pm_runtime_disable() and pm_runtime_set_suspended() are replaced with empty inlines if CONFIG_PM is disabled, so there's no need to #ifdef them. device_init_wakeup() is likewise replaced with an inline, though it's not empty, but it and devm_free_irq() can be made conditional on IS_ENABLED(CONFIG_PM), which is preferable to #ifdef as per section 20 of Documentation/process/coding-style.rst. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Validate IRQ before using itRonald Tschalär
The ->close, ->suspend and ->resume hooks assume presence of a valid IRQ if the device is wakeup capable. However it's entirely possible that wakeup was enabled by some other entity besides this driver and in this case the user will get a WARN splat if no valid IRQ was found. Avoid by checking if the IRQ is valid, i.e. > 0. Case in point: On recent MacBook Pros, the Bluetooth device lacks an IRQ (because host wakeup is handled by the SMC, independently of the operating system), but it does possess a _PRW method (which specifies the SMC's GPE as wake event). The ACPI core therefore automatically marks the physical Bluetooth device wakeup capable upon binding it to its ACPI companion: device_set_wakeup_capable+0x96/0xb0 acpi_bind_one+0x28a/0x310 acpi_platform_notify+0x20/0xa0 device_add+0x215/0x690 serdev_device_add+0x57/0xf0 acpi_serdev_add_device+0xc9/0x110 acpi_ns_walk_namespace+0x131/0x280 acpi_walk_namespace+0xf5/0x13d serdev_controller_add+0x6f/0x110 serdev_tty_port_register+0x98/0xf0 tty_port_register_device_attr_serdev+0x3a/0x70 uart_add_one_port+0x268/0x500 serial8250_register_8250_port+0x32e/0x490 dw8250_probe+0x46c/0x720 platform_drv_probe+0x35/0x90 driver_probe_device+0x300/0x450 bus_for_each_drv+0x67/0xb0 __device_attach+0xde/0x160 bus_probe_device+0x9c/0xb0 device_add+0x448/0x690 platform_device_add+0x10e/0x260 mfd_add_device+0x392/0x4c0 mfd_add_devices+0xb1/0x110 intel_lpss_probe+0x2a9/0x610 [intel_lpss] intel_lpss_pci_probe+0x7a/0xa8 [intel_lpss_pci] Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ronald Tschalär <ronald@innovation.ch> [lukas: fix up ->suspend and ->resume as well, add commit message] Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-10Bluetooth: hci_bcm: Mandate presence of shutdown and device wake GPIOLukas Wunner
Commit 0395ffc1ee05 ("Bluetooth: hci_bcm: Add PM for BCM devices") amended this driver to request a shutdown and device wake GPIO on probe, but mandated that only one of them need to be present: /* Make sure at-least one of the GPIO is defined and that * a name is specified for this instance */ if ((!dev->device_wakeup && !dev->shutdown) || !dev->name) { dev_err(&pdev->dev, "invalid platform data\n"); return -EINVAL; } However the same commit added a call to bcm_gpio_set_power() to the ->probe hook, which unconditionally accesses *both* GPIOs. Luckily, the resulting NULL pointer deref was never reported, suggesting there's no machine where either GPIO is missing. Commit 8a92056837fd ("Bluetooth: hci_bcm: Add (runtime)pm support to the serdev driver") removed the check whether at least one of the GPIOs is present without specifying a reason. Because commit 62aaefa7d038 ("Bluetooth: hci_bcm: improve use of gpios API") refactored the driver to use devm_gpiod_get_optional() instead of devm_gpiod_get(), one is now tempted to believe that the driver doesn't require *any* of the two GPIOs. Which is wrong, the driver still requires both GPIOs to avoid a NULL pointer deref. To this end, establish the status quo ante and request the GPIOs with devm_gpiod_get() again. Bail out of ->probe if either of them is missing. Oddly enough, whereas bcm_gpio_set_power() accesses the device wake pin unconditionally, bcm_suspend_device() and bcm_resume_device() do check for its presence before accessing it. Those checks are superfluous, so remove them. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-01-09Merge branch 'r8169-improve-runtime-pm'David S. Miller
Heiner Kallweit says: ==================== r8169: improve runtime pm On my system with two network ports I found that runtime PM didn't suspend the unused port. Therefore I checked runtime pm in this driver in somewhat more detail and this series improves runtime pm in general and solves the mentioned issue. Tested on a system with RTL8168evl (MAC version 34). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09r8169: improve runtime pm in general and suspend unused portsHeiner Kallweit
So far rpm doesn't cover cases like unused ports which are never brought up. If they are active at probe time they remain in this state. Included in this patch: - Let the idle notification check whether we can suspend and let it schedule the suspend. This way we don't need to have calls to pm_schedule_suspend in different places. - At the end of rtl_open and rtl_init_one send an idle notification to allow suspending if the link is down. If a cable is plugged in aneg is finished before the suspend timer expires and the suspend request is cancelled. - Change rtl8169_runtime_suspend to power down the chip if the interface is down. Successfully tested on a RTL8168evl (mac version 34). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09r8169: improve runtime pm in rtl8169_check_link_statusHeiner Kallweit
This patch partially reverts commit e4fbce740f07 "r8169: Fix runtime power management" from 2010. At that time the suspend delay was 100ms and therefore suspending happened during initial aneg. Currently suspend delay is 5s, so suspend starts after aneg and the issue doesn't exist any longer. On my system aneg takes almost 3s, to be on the safe side let's increase the suspend delay to 10s. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09r8169: remove unneeded rpm ops in rtl_shutdownHeiner Kallweit
This patch reverts commit 2a15cd2ff488 "r8169: runtime resume before shutdown" from 2012. Few months after this change the underlying issue was solved in the PCI core with commit 3ff2de9ba1a2 "PCI/PM: Resume device before shutdown". Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09Merge branch 'tipc-improvements-to-group-messaging'David S. Miller
Jon Maloy says: ==================== tipc: improvements to group messaging We make a number of simplifications and improvements to the group messaging service. They aim at readability/maintainability of the code as well as scalability. The series is based on commit f9c935db8086 ("tipc: fix problems with multipoint-to-point flow control) which has been applied to 'net' but not yet to 'net-next'. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: improve poll() for group member socketJon Maloy
The current criteria for returning POLLOUT from a group member socket is too simplistic. It basically returns POLLOUT as soon as the group has external destinations, something obviously leading to a lot of spinning during destination congestion situations. At the same time, the internal congestion handling is unnecessarily complex. We now change this as follows. - We introduce an 'open' flag in struct tipc_group. This flag is used only to help poll() get the setting of POLLOUT right, and *not* for congeston handling as such. This means that a user can choose to ignore an EAGAIN for a destination and go on sending messages to other destinations in the group if he wants to. - The flag is set to false every time we return EAGAIN on a send call. - The flag is set to true every time any member, i.e., not necessarily the member that caused EAGAIN, is removed from the small_win list. - We remove the group member 'usr_pending' flag. The size of the send window and presence in the 'small_win' list is sufficient criteria for recognizing congestion. This solution seems to be a reasonable compromise between 'anycast', which is normally not waiting for POLLOUT for a specific destination, and the other three send modes, which are. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: improve groupcast scope handlingJon Maloy
When a member joins a group, it also indicates a binding scope. This makes it possible to create both node local groups, invisible to other nodes, as well as cluster global groups, visible everywhere. In order to avoid that different members end up having permanently differing views of group size and memberhip, we must inhibit locally and globally bound members from joining the same group. We do this by using the binding scope as an additional separator between groups. I.e., a member must ignore all membership events from sockets using a different scope than itself, and all lookups for message destinations must require an exact match between the message's lookup scope and the potential target's binding scope. Apart from making it possible to create local groups using the same identity on different nodes, a side effect of this is that it now also becomes possible to create a cluster global group with the same identity across the same nodes, without interfering with the local groups. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: add option to suppress PUBLISH events for pre-existing publicationsJon Maloy
Currently, when a user is subscribing for binding table publications, he will receive a PUBLISH event for all already existing matching items in the binding table. However, a group socket making a subscriptions doesn't need this initial status update from the binding table, because it has already scanned it during the join operation. Worse, the multiplicatory effect of issuing mutual events for dozens or hundreds group members within a short time frame put a heavy load on the topology server, with the end result that scale out operations on a big group tend to take much longer than needed. We now add a new filter option, TIPC_SUB_NO_STATUS, for topology server subscriptions, so that this initial avalanche of events is suppressed. This change, along with the previous commit, significantly improves the range and speed of group scale out operations. We keep the new option internal for the tipc driver, at least for now. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: send out join messages as soon as new member is discoveredJon Maloy
When a socket is joining a group, we look up in the binding table to find if there are already other members of the group present. This is used for being able to return EAGAIN instead of EHOSTUNREACH if the user proceeds directly to a send attempt. However, the information in the binding table can be used to directly set the created member in state MBR_PUBLISHED and send a JOIN message to the peer, instead of waiting for a topology PUBLISH event to do this. When there are many members in a group, the propagation time for such events can be significant, and we can save time during the join operation if we use the initial lookup result fully. In this commit, we eliminate the member state MBR_DISCOVERED which has been the result of the initial lookup, and do instead go directly to MBR_PUBLISHED, which initiates the setup. After this change, the tipc_member FSM looks as follows: +-----------+ ---->| PUBLISHED |-----------------------------------------------+ PUB- +-----------+ LEAVE/WITHRAW | LISH |JOIN | | +-------------------------------------------+ | | | LEAVE/WITHDRAW | | | | +------------+ | | | | +----------->| PENDING |---------+ | | | | |msg/maxactv +-+---+------+ LEAVE/ | | | | | | | | WITHDRAW | | | | | | +----------+ | | | | | | | |revert/maxactv| | | | | | | V V V V V | +----------+ msg +------------+ +-----------+ +-->| JOINED |------>| ACTIVE |------>| LEAVING |---> | +----------+ +--- -+------+ LEAVE/+-----------+DOWN | A A | WITHDRAW A A A EVT | | | |RECLAIM | | | | | |REMIT V | | | | | |== adv +------------+ | | | | | +---------| RECLAIMING |--------+ | | | | +-----+------+ LEAVE/ | | | | |REMIT WITHDRAW | | | | |< adv | | | |msg/ V LEAVE/ | | | |adv==ADV_IDLE+------------+ WITHDRAW | | | +-------------| REMITTED |------------+ | | +------------+ | |PUBLISH | JOIN +-----------+ LEAVE/WITHDRAW | ---->| JOINING |-----------------------------------------------+ +-----------+ Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: simplify group LEAVE sequenceJon Maloy
After the changes in the previous commit the group LEAVE sequence can be simplified. We now let the arrival of a LEAVE message unconditionally issue a group DOWN event to the user. When a topology WITHDRAW event is received, the member, if it still there, is set to state LEAVING, but we only issue a group DOWN event when the link to the peer node is gone, so that no LEAVE message is to be expected. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: create group member event messages when they are neededJon Maloy
In the current implementation, a group socket receiving topology events about other members just converts the topology event message into a group event message and stores it until it reaches the right state to issue it to the user. This complicates the code unnecessarily, and becomes impractical when we in the coming commits will need to create and issue membership events independently. In this commit, we change this so that we just notice the type and origin of the incoming topology event, and then drop the buffer. Only when it is time to actually send a group event to the user do we explicitly create a new message and send it upwards. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: adjustment to group member FSMJon Maloy
Analysis reveals that the member state MBR_QURANTINED in reality is unnecessary, and can be replaced by the state MBR_JOINING at all occurrencs. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: let group member stay in JOINED mode if unable to reclaimJon Maloy
We handle a corner case in the function tipc_group_update_rcv_win(). During extreme pessure it might happen that a message receiver has all its active senders in RECLAIMING or REMITTED mode, meaning that there is nobody to reclaim advertisements from if an additional sender tries to go active. Currently we just set the new sender to ACTIVE anyway, hence at least theoretically opening up for a receiver queue overflow by exceeding the MAX_ACTIVE limit. The correct solution to this is to instead add the member to the pending queue, while letting the oldest member in that queue revert to JOINED state. In this commit we refactor the code for handling message arrival from a JOINED member, both to make it more comprehensible and to cover the case described above. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09tipc: a couple of cleanupsJon Maloy
- We remove the 'reclaiming' member list in struct tipc_group, since it doesn't serve any purpose. - We simplify the GRP_REMIT_MSG branch of tipc_group_protocol_rcv(). Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>