summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-10-11MAINTAINERS: update Johannes Berg's entriesJohannes Berg
Update my MAINTAINERS file entries to list all the right files. Since I'm also the de-facto wireless extensions maintainer, there's little point in excluding those. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-10openvswitch: add ct_clear actionEric Garver
This adds a ct_clear action for clearing conntrack state. ct_clear is currently implemented in OVS userspace, but is not backed by an action in the kernel datapath. This is useful for flows that may modify a packet tuple after a ct lookup has already occurred. Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net: dst: move cpu inside ifdef to avoid compilation warningJakub Kicinski
If CONFIG_DST_CACHE is not selected cpu variable will be unused and we will see a compilation warning. Move it under the ifdef. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: d66f2b91f95b ("bpf: don't rely on the verifier lock for metadata_dst allocation") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2017-10-10 This series contains updates to e1000e and igb. Benjamin Poirier provides several fixes for e1000e, starting with a correction to the return status which was always returning success even if it was not successful. Fixed code comments to reflect the actual code behavior. Fixed the conditional test for the correct return value. Fixed a potential race condition reported by Lennart Sorensen, where the single flag get_link_status is used to signal two different states. Sasha fixes a buffer overrun for i219 devices, where the chipset had reduced the round-trip latency for the LAN controller DMA accesses which in some high performance cases caused a buffer overrun while processing the DMA transactions. Willem de Bruijn changes the default behavior of e1000e to use the burst mode settings by default unless the user specifies the receive interrupt delay (RxIntDelay). Florian Fainelli updates the driver to differentiate between when e1000e_put_txbuf() is called from normal reclamation or when a DMA mapping failure to make the driver more "drop monitor friendly". Christophe JAILLET fixes a potential NULL pointer dereference by properly returning -ENOMEM on memory allocation failures. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10rtnetlink: bridge: use ext_ack instead of printkFlorian Westphal
We can now piggyback error strings to userspace via extended acks rather than using printk. Before: bridge fdb add 01:02:03:04:05:06 dev br0 vlan 4095 RTNETLINK answers: Invalid argument After: bridge fdb add 01:02:03:04:05:06 dev br0 vlan 4095 Error: invalid vlan id. v3: drop 'RTM_' prefixes, suggested by David Ahern, they are not useful, the add/del in bridge command line is enough. Also reword error in response to malformed/bad vlan id attribute size. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10selftests: rtnetlink: test RTM_GETNETCONFFlorian Westphal
exercise RTM_GETNETCONF call path for unspec, inet and inet6 families, they are DOIT_UNLOCKED candidates. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10Merge branch 'mlx4_en-num-of-rings'David S. Miller
Tariq Toukan says: ==================== mlx4_en num of rings This patchset from Inbar contains changes to rings control to the mlx4 Eth driver. Patches 1 and 2 limit the number of rings to the number of CPUs. Patch 3 removes a limitation in logic of default number of RX rings. Series generated against net-next commit: 812b5ca7d376 Add a driver for Renesas uPD60620 and uPD60620A PHYs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net/mlx4_en: Increase number of default RX ringsInbar Karmy
Remove limitation of netif_get_num_default_rss_queues() from logic of RX rings default number. Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net/mlx4_en: Limit the number of RX ringsInbar Karmy
Limit the number of RX rings by the number of cores in the system. Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net/mlx4_en: Limit the number of TX ringsInbar Karmy
Limit the number of TX rings per UP by the number of cores in the system. Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10Merge branch 'hnx3-rxnfc'David S. Miller
Lipeng says: ==================== Support set_ringparam and {set|get}_rxnfc ethtool commands 1, Patch [1/5,2/5] add support for ethtool ops set_ringparam (ethtool -G) and fix related bug. 2, Patch [3/5,4/5, 5/5] add support for ethtool ops set_rxnfc/get_rxnfc (-n/-N) and fix related bug. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net: hns3: fix the ring count for ETHTOOL_GRXRINGSLipeng
This patch fix the ring count for ETHTOOL_GRXRINGS. Ring count not TC size should be return for command "ethtool -n ethx". Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net: hns3: add support for ETHTOOL_GRXFHLipeng
This patch add support for ethtool's ETHTOOL_GRXFH in hns3_get_rxnfc(). Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net: hns3: add support for set_rxnfcLipeng
This patch supports the ethtool's set_rxnfc(). Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net: hns3: add support for set_ringparamLipeng
This patch supports the ethtool's set_ringparam(). Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net: hns3: fixes the ring index in hns3_fini_ringLipeng
This patch fixes the ring index in hns3_fini_ring. Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10cxgb4: add new T5 pci device id'sGanesh Goudar
Add 0x50aa and 0x50ab T5 device id's. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10cxgb4: Add support for new flash partsGanesh Goudar
Add support for new flash parts identification, and also cleanup the flash Part identifying and decoding code. Based on the original work of Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10net/core: Fix BUG to BUG_ON conditionals.Tim Hansen
Fix BUG() calls to use BUG_ON(conditional) macros. This was found using make coccicheck M=net/core on linux next tag next-2017092 Signed-off-by: Tim Hansen <devtimhansen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10Merge branch ↵David S. Miller
'bpf-get-rid-of-global-verifier-state-and-reuse-instruction-printer' Jakub Kicinski says: ==================== bpf: get rid of global verifier state and reuse instruction printer This set started off as simple extraction of eBPF verifier's instruction printer into a separate file but evolved into removal of global state. The purpose of moving instruction printing code is to be able to reuse it from the bpftool. As far as the global verifier lock goes, this set removes the global variables relating to the log buffer, makes the one-time init done by bpf_get_skb_set_tunnel_proto() not depend on any external locking, and performs verifier log writeback as data is produced removing the need for allocating a potentially large temporary buffer. The final step of actually removing the verifier lock is left to someone more competent and self-confident :) Note that struct bpf_verifier_env is just 40B under two pages now, we should probably switch to vzalloc() when it's expanded again... v2: - add a selftest; - use env buffer and flush on every print (Alexei); - handle kernel log allocation failures (Daniel); - put the env log members into a struct (Daniel). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10bpf: write back the verifier log buffer as it gets filledJakub Kicinski
Verifier log buffer can be quite large (up to 16MB currently). As Eric Dumazet points out if we allow multiple verification requests to proceed simultaneously, malicious user may use the verifier as a way of allocating large amounts of unswappable memory to OOM the host. Switch to a strategy of allocating a smaller buffer (1024B) and writing it out into the user buffer after every print. While at it remove the old BUG_ON(). This is in preparation of the global verifier lock removal. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10bpf: don't rely on the verifier lock for metadata_dst allocationJakub Kicinski
bpf_skb_set_tunnel_*() functions require allocation of per-cpu metadata_dst. The allocation happens upon verification of the first program using those helpers. In preparation for removing the verifier lock, use cmpxchg() to make sure we only allocate the metadata_dsts once. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10tools: bpftool: use the kernel's instruction printerJakub Kicinski
Compile the instruction printer from kernel/bpf and use it for disassembling "translated" eBPF code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10bpf: move instruction printing into a separate fileJakub Kicinski
Separate the instruction printing into a standalone source file. This way sneaky code from tools/ can compile it in directly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10bpf: move global verifier log into verifier environmentJakub Kicinski
The biggest piece of global state protected by the verifier lock is the verifier_log. Move that log to struct bpf_verifier_env. struct bpf_verifier_env has to be passed now to all invocations of verbose(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10bpf: encapsulate verifier log state into a structureJakub Kicinski
Put the loose log_* variables into a structure. This will make it simpler to remove the global verifier state in following patches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10selftests/bpf: add a test for verifier logsJakub Kicinski
Add a test for verifier log handling. Check bad attr combinations but focus on cases when log is truncated. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10ipv6: fix incorrect bitwise operator used on rt6i_flagsColin Ian King
The use of the | operator always leads to true which looks rather suspect to me. Fix this by using & instead to just check the RTF_CACHE entry bit. Detected by CoverityScan, CID#1457734, #1457747 ("Wrong operator used") Fixes: 35732d01fe31 ("ipv6: introduce a hash table to store dst cache") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10ipv6: fix dereference of rt6_ex before null check errorColin Ian King
Currently rt6_ex is being dereferenced before it is null checked hence there is a possible null dereference bug. Fix this by only dereferencing rt6_ex after it has been null checked. Detected by CoverityScan, CID#1457749 ("Dereference before null check") Fixes: 81eb8447daae ("ipv6: take care of rt6_stats") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10igb: check memory allocation failureChristophe JAILLET
Check memory allocation failures and return -ENOMEM in such cases, as already done for other memory allocations in this function. This avoids NULL pointers dereference. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Aaron Brown <aaron.f.brown@intel.com Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: Be drop monitor friendlyFlorian Fainelli
e1000e_put_txbuf() can be called from normal reclamation path as well as when a DMA mapping failure, so we need to differentiate these two cases when freeing SKBs to be drop monitor friendly. e1000e_tx_hwtstamp_work() and e1000_remove() are processing TX timestamped SKBs and those should not be accounted as drops either. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: apply burst mode settings only on defaultWillem de Bruijn
Devices that support FLAG2_DMA_BURST have different default values for RDTR and RADV. Apply burst mode default settings only when no explicit value was passed at module load. The RDTR default is zero. If the module is loaded for low latency operation with RxIntDelay=0, do not override this value with a burst default of 32. Move the decision to apply burst values earlier, where explicitly initialized module variables can be distinguished from defaults. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: fix buffer overrun while the I219 is processing DMA transactionsSasha Neftin
Intel® 100/200 Series Chipset platforms reduced the round-trip latency for the LAN Controller DMA accesses, causing in some high performance cases a buffer overrun while the I219 LAN Connected Device is processing the DMA transactions. I219LM and I219V devices can fall into unrecovered Tx hang under very stressfully UDP traffic and multiple reconnection of Ethernet cable. This Tx hang of the LAN Controller is only recovered if the system is rebooted. Slightly slow down DMA access by reducing the number of outstanding requests. This workaround could have an impact on TCP traffic performance on the platform. Disabling TSO eliminates performance loss for TCP traffic without a noticeable impact on CPU performance. Please, refer to I218/I219 specification update: https://www.intel.com/content/www/us/en/embedded/products/networking/ ethernet-connection-i218-family-documentation.html Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: Avoid receiver overrun interrupt burstsBenjamin Poirier
When e1000e_poll() is not fast enough to keep up with incoming traffic, the adapter (when operating in msix mode) raises the Other interrupt to signal Receiver Overrun. This is a double problem because 1) at the moment e1000_msix_other() assumes that it is only called in case of Link Status Change and 2) if the condition persists, the interrupt is repeatedly raised again in quick succession. Ideally we would configure the Other interrupt to not be raised in case of receiver overrun but this doesn't seem possible on this adapter. Instead, we handle the first part of the problem by reverting to the practice of reading ICR in the other interrupt handler, like before commit 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt"). Thanks to commit 0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts anymore. We handle the second part of the problem by not re-enabling the Other interrupt right away when there is overrun. Instead, we wait until traffic subsides, napi polling mode is exited and interrupts are re-enabled. Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: Separate signaling for link check/link upBenjamin Poirier
Lennart reported the following race condition: \ e1000_watchdog_task \ e1000e_has_link \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link /* link is up */ mac->get_link_status = false; /* interrupt */ \ e1000_msix_other hw->mac.get_link_status = true; link_active = !hw->mac.get_link_status /* link_active is false, wrongly */ This problem arises because the single flag get_link_status is used to signal two different states: link status needs checking and link status is down. Avoid the problem by using the return value of .check_for_link to signal the link status to e1000e_has_link(). Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: Fix return value testBenjamin Poirier
All the helpers return -E1000_ERR_PHY. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: Fix wrong comment related to link detectionBenjamin Poirier
Reading e1000e_check_for_copper_link() shows that get_link_status is set to false after link has been detected. Therefore, it stays TRUE until then. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-10e1000e: Fix error path in link detectionBenjamin Poirier
In case of error from e1e_rphy(), the loop will exit early and "success" will be set to true erroneously. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09Add a driver for Renesas uPD60620 and uPD60620A PHYsBernd Edlinger
Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09vhost_net: do not stall on zerocopy depletionWillem de Bruijn
Vhost-net has a hard limit on the number of zerocopy skbs in flight. When reached, transmission stalls. Stalls cause latency, as well as head-of-line blocking of other flows that do not use zerocopy. Instead of stalling, revert to copy-based transmission. Tested by sending two udp flows from guest to host, one with payload of VHOST_GOODCOPY_LEN, the other too small for zerocopy (1B). The large flow is redirected to a netem instance with 1MBps rate limit and deep 1000 entry queue. modprobe ifb ip link set dev ifb0 up tc qdisc add dev ifb0 root netem limit 1000 rate 1MBit tc qdisc add dev tap0 ingress tc filter add dev tap0 parent ffff: protocol ip \ u32 match ip dport 8000 0xffff \ action mirred egress redirect dev ifb0 Before the delay, both flows process around 80K pps. With the delay, before this patch, both process around 400. After this patch, the large flow is still rate limited, while the small reverts to its original rate. See also discussion in the first link, below. Without rate limiting, {1, 10, 100}x TCP_STREAM tests continued to send at 100% zerocopy. The limit in vhost_exceeds_maxpend must be carefully chosen. With vq->num >> 1, the flows remain correlated. This value happens to correspond to VHOST_MAX_PENDING for vq->num == 256. Allow smaller fractions and ensure correctness also for much smaller values of vq->num, by testing the min() of both explicitly. See also the discussion in the second link below. Changes v1 -> v2 - replaced min with typed min_t - avoid unnecessary whitespace change Link:http://lkml.kernel.org/r/CAF=yD-+Wk9sc9dXMUq1+x_hh=3ThTXa6BnZkygP3tgVpjbp93g@mail.gmail.com Link:http://lkml.kernel.org/r/20170819064129.27272-1-den@klaipeden.com Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09openvswitch: Add erspan tunnel support.William Tu
Add erspan netlink interface for OVS. Signed-off-by: William Tu <u9012063@gmail.com> Cc: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09once: switch to new jump label APIEric Biggers
Switch the DO_ONCE() macro from the deprecated jump label API to the new one. The new one is more readable, and for DO_ONCE() it also makes the generated code more icache-friendly: now the one-time initialization code is placed out-of-line at the jump target, rather than at the inline fallthrough case. Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2017-10-09ipv6: use rcu_dereference_bh() in ipv6_route_seq_next()Wei Wang
This patch replaces rcu_deference() with rcu_dereference_bh() in ipv6_route_seq_next() to avoid the following warning: [ 19.431685] WARNING: suspicious RCU usage [ 19.433451] 4.14.0-rc3-00914-g66f5d6c #118 Not tainted [ 19.435509] ----------------------------- [ 19.437267] net/ipv6/ip6_fib.c:2259 suspicious rcu_dereference_check() usage! [ 19.440790] [ 19.440790] other info that might help us debug this: [ 19.440790] [ 19.444734] [ 19.444734] rcu_scheduler_active = 2, debug_locks = 1 [ 19.447757] 2 locks held by odhcpd/3720: [ 19.449480] #0: (&p->lock){+.+.}, at: [<ffffffffb1231f7d>] seq_read+0x3c/0x333 [ 19.452720] #1: (rcu_read_lock_bh){....}, at: [<ffffffffb1d2b984>] ipv6_route_seq_start+0x5/0xfd [ 19.456323] [ 19.456323] stack backtrace: [ 19.458812] CPU: 0 PID: 3720 Comm: odhcpd Not tainted 4.14.0-rc3-00914-g66f5d6c #118 [ 19.462042] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 19.465414] Call Trace: [ 19.466788] dump_stack+0x86/0xc0 [ 19.468358] lockdep_rcu_suspicious+0xea/0xf3 [ 19.470183] ipv6_route_seq_next+0x71/0x164 [ 19.471963] seq_read+0x244/0x333 [ 19.473522] proc_reg_read+0x48/0x67 [ 19.475152] ? proc_reg_write+0x67/0x67 [ 19.476862] __vfs_read+0x26/0x10b [ 19.478463] ? __might_fault+0x37/0x84 [ 19.480148] vfs_read+0xba/0x146 [ 19.481690] SyS_read+0x51/0x8e [ 19.483197] do_int80_syscall_32+0x66/0x15a [ 19.484969] entry_INT80_compat+0x32/0x50 [ 19.486707] RIP: 0023:0xf7f0be8e [ 19.488244] RSP: 002b:00000000ffa75d04 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 19.491431] RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 0000000008056068 [ 19.493886] RDX: 0000000000001000 RSI: 0000000008056008 RDI: 0000000000001000 [ 19.496331] RBP: 00000000000001ff R08: 0000000000000000 R09: 0000000000000000 [ 19.498768] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 19.501217] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table") Reported-by: Xiaolong Ye <xiaolong.ye@intel.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09Merge branch 'ppc-bundle' (bundle from Michael Ellerman)Linus Torvalds
Merge powerpc transactional memory fixes from Michael Ellerman: "I figured I'd still send you the commits using a bundle to make sure it works in case I need to do it again in future" This fixes transactional memory state restore for powerpc. * bundle'd patches from Michael Ellerman: powerpc/tm: Fix illegal TM state in signal handler powerpc/64s: Use emergency stack for kernel TM Bad Thing program checks
2017-10-09Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-09 This series contains updates to i40e and i40evf only. Jake fixes missed flag conversion from u64 to u32. Fixes a deafult ITR value issue where the driver defaults to an ITR value of half the expected value (in terms of minimum microseconds between interrupts). So fix this by changing the default values to be calculated using the ITR_REG_TO_USEC() macro which indicates that we are converting from the register units into microseconds. Updates the drivers to bump the tail in increments of 8 and double the number of descriptors we will bundle into one tail bump when receiving. With the recent kernel support for enabling XPS and QoS at the same time, we no longer need to worry about the number of traffic classes when enabling XPS. Lihong converts the use of hash_for_each() to hash_for_each_safe() to safely remove a hash entry. Adds a check for the return value for find_first_bit() in the case that it returns the size passed to search. Alan fixes a bug in which filters are erroneously removed if they are removed and then added again. So make sure that when adding a filter, if we find it already existed in our list, make sure it is not marked to be removed. Jayaprakash adds the retrying of PHY reads when the I2C is busy for a maximum period of 500ms. Rami fixes code comment typo. Stefano Brivio simplifies the code by removing the use of a local return code variable and simply return the results of the read function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09Merge branch '10GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2017-10-09 This series contains updates to ixgbe only. Emil fixes an issue where the semaphore bits could be stuck after a reset or a crash, by adding the clearing of software resource bits in the software/firmware synchronization register. Added error checks when we attempt to identify and initialize the PHY to prevent a crash. Fixed a few issues in the logic of ixgbe_clean_test_rings() which was exposed by a previous commit that was causing a crash in ethtool diagnostics. Bhumika Goyal fixes a couple of instances which were overlooked when we made ixgbe_mac_operations constant. Shannon Nelson fixes an issue to restore normal operations after the last MACVLAN offload is removed, otherwise we get stuck in a single queue operations. The infamous Jesper Dangaard Brouer adds a counter which counts the number of times the recycle fails and the real page allocator is invoked. Alex updates the adaptive ITR algorithm to better support the needs of the network. This attempt to make it so that our ITR algorithm will try to prevent either starving a socket buffer for memory in the case of transmit, or overrunning an receive socket buffer on receive. We should function better with new features like XDP which can handle small packets at high rates without needing to lock us into NAPI polling mode. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix object leak on IPSEC offload failure, from Steffen Klassert. 2) Fix range checks in ipset address range addition operations, from Jozsef Kadlecsik. 3) Fix pernet ops unregistration order in ipset, from Florian Westphal. 4) Add missing netlink attribute policy for nl80211 packet pattern attrs, from Peng Xu. 5) Fix PPP device destruction race, from Guillaume Nault. 6) Write marks get lost when BPF verifier processes R1=R2 register assignments, causing incorrect liveness information and less state pruning. Fix from Alexei Starovoitov. 7) Fix blockhole routes so that they are marked dead and therefore not cached in sockets, otherwise IPSEC stops working. From Steffen Klassert. 8) Fix broadcast handling of UDP socket early demux, from Paolo Abeni. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits) cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan net: thunderx: mark expected switch fall-throughs in nicvf_main() udp: fix bcast packet reception netlink: do not set cb_running if dump's start() errs ipv4: Fix traffic triggered IPsec connections. ipv6: Fix traffic triggered IPsec connections. ixgbe: incorrect XDP ring accounting in ethtool tx_frame param net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag Revert commit 1a8b6d76dc5b ("net:add one common config...") ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register ixgbe: Return error when getting PHY address if PHY access is not supported netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1' netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook tipc: Unclone message at secondary destination lookup tipc: correct initialization of skb list gso: fix payload length when gso_size is zero mlxsw: spectrum_router: Avoid expensive lookup during route removal bpf: fix liveness marking doc: Fix typo "8023.ad" in bonding documentation ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real ...
2017-10-09cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwanAleksander Morgado
The u-blox TOBY-L2 is a LTE Cat 4 module with HSPA+ and 2G fallback. This module allows switching to different USB profiles with the 'AT+UUSBCONF' command, and provides a ECM network interface when the 'AT+UUSBCONF=2' profile is selected. The u-blox SARA-U2 is a HSPA module with 2G fallback. The default USB configuration includes a ECM network interface. Both these modules are controlled via AT commands through one of the TTYs exposed. Connecting these modules may be done just by activating the desired PDP context with 'AT+CGACT=1,<cid>' and then running DHCP on the ECM interface. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09i40e: Avoid some useless variables and initializers in NVM functionsStefano Brivio
Fixes: 09f79fd49d94 ("i40e: avoid NVM acquire deadlock during NVM update") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>