summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2017-02-08ipv6: addrconf: fix generation of new temporary addressesMarcus Huewe
Under some circumstances it is possible that no new temporary addresses will be generated. For instance, addrconf_prefix_rcv_add_addr() indirectly calls ipv6_create_tempaddr(), which creates a tentative temporary address and starts dad. Next, addrconf_prefix_rcv_add_addr() indirectly calls addrconf_verify_rtnl(). Now, assume that the previously created temporary address has the least preferred lifetime among all existing addresses and is still tentative (that is, dad is still running). Hence, the next run of addrconf_verify_rtnl() is performed when the preferred lifetime of the temporary address ends. If dad succeeds before the next run, the temporary address becomes deprecated during the next run, but no new temporary address is generated. In order to fix this, schedule the next addrconf_verify_rtnl() run slightly before the temporary address becomes deprecated, if dad succeeded. Signed-off-by: Marcus Huewe <suse-tux@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-08netfilter: nft_exthdr: add TCP option matchingManuel Messner
This patch implements the kernel side of the TCP option patch. Signed-off-by: Manuel Messner <mm@skelett.io> Reviewed-by: Florian Westphal <fw@strlen.de> Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nft_ct: add zone id set supportFlorian Westphal
zones allow tracking multiple connections sharing identical tuples, this is needed e.g. when tracking distinct vlans with overlapping ip addresses (conntrack is l2 agnostic). Thus the zone has to be set before the packet is picked up by the connection tracker. This is done by means of 'conntrack templates' which are conntrack structures used solely to pass this info from one netfilter hook to the next. The iptables CT target instantiates these connection tracking templates once per rule, i.e. the template is fixed/tied to particular zone, can be read-only and therefore be re-used by as many skbs simultaneously as needed. We can't follow this model because we want to take the zone id from an sreg at rule eval time so we could e.g. fill in the zone id from the packets vlan id or a e.g. nftables key : value maps. To avoid cost of per packet alloc/free of the template, use a percpu template 'scratch' object and use the refcount to detect the (unlikely) case where the template is still attached to another skb (i.e., previous skb was nfqueued ...). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nft_ct: prepare for key-dependent error unwindFlorian Westphal
Next patch will add ZONE_ID set support which will need similar error unwind (put operation) as conntrack labels. Prepare for this: remove the 'label_got' boolean in favor of a switch statement that can be extended in next patch. As we already have that in the set_destroy function place that in a separate function and call it from the set init function. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nft_ct: add zone id get supportFlorian Westphal
Just like with counters the direction attribute is optional. We set priv->dir to MAX unconditionally to avoid duplicating the assignment for all keys with optional direction. For keys where direction is mandatory, existing code already returns an error. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: add bitmap set typePablo Neira Ayuso
This patch adds a new bitmap set type. This bitmap uses two bits to represent one element. These two bits determine the element state in the current and the future generation that fits into the nf_tables commit protocol. When dumping elements back to userspace, the two bits are expanded into a struct nft_set_ext object. If no NFTA_SET_DESC_SIZE is specified, the existing automatic set backend selection prefers bitmap over hash in case of keys whose size is <= 16 bit. If the set size is know, the bitmap set type is selected if with 16 bit kets and more than 390 elements in the set, otherwise the hash table set implementation is used. For 8 bit keys, the bitmap consumes 66 bytes. For 16 bit keys, the bitmap takes 16388 bytes. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: add space notation to setsPablo Neira Ayuso
The space notation allows us to classify the set backend implementation based on the amount of required memory. This provides an order of the set representation scalability in terms of memory. The size field is still left in place so use this if the userspace provides no explicit number of elements, so we cannot calculate the real memory that this set needs. This also helps us break ties in the set backend selection routine, eg. two backend implementations provide the same performance. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: rename struct nft_set_estimate class fieldPablo Neira Ayuso
Use lookup as field name instead, to prepare the introduction of the memory class in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: add flush field to struct nft_set_iterPablo Neira Ayuso
This provides context to walk callback iterator, thus, we know if the walk happens from the set flush path. This is required by the new bitmap set type coming in a follow up patch which has no real struct nft_set_ext, so it has to allocate it based on the two bit compact element representation. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: rename deactivate_one() to flush()Pablo Neira Ayuso
Although semantics are similar to deactivate() with no implicit element lookup, this is only called from the set flush path, so better rename this to flush(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: use struct nft_set_iter in set element flushPablo Neira Ayuso
Instead of struct nft_set_dump_args, remove unnecessary wrapper structure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nf_tables: pass netns to set->ops->remove()Pablo Neira Ayuso
This new parameter is required by the new bitmap set type that comes in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08netfilter: nft_exthdr: Add support for existence checkPhil Sutter
If NFT_EXTHDR_F_PRESENT is set, exthdr will not copy any header field data into *dest, but instead set it to 1 if the header is found and 0 otherwise. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08cfg80211: Pass new RSSI level in CQM RSSI notificationAndrzej Zaborowski
Update the drivers to pass the RSSI level as a cfg80211_cqm_rssi_notify parameter and pass this value to userspace in a new nl80211 attribute. This helps both userspace and also helps in the implementation of the multiple RSSI thresholds CQM mechanism. Note for marvell/mwifiex I pass 0 for the RSSI value because the new RSSI value is not available to the driver at the time of the cfg80211_cqm_rssi_notify call, but the driver queries the new value immediately after that, so it is actually available just a moment later if we wanted to defer caling cfg80211_cqm_rssi_notify until that moment. Without this, the new cfg80211 code (patch 3) will call .get_station which will send a duplicate HostCmd_CMD_RSSI_INFO command to the hardware. Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08mac80211: Pass new RSSI level in CQM RSSI notificationAndrzej Zaborowski
Extend ieee80211_cqm_rssi_notify with a rssi_level parameter so that this information can be passed to netlink clients in the next patch, if available. Most drivers will have this value at hand. wl1251 receives events from the firmware that only tell it whether latest measurement is above or below threshold so we don't pass any value at this time (parameter is 0). Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08nl80211: fix validation of scheduled scan info for wowlan netdetectArend Van Spriel
For wowlan netdetect a separate limit is defined for the number of matchsets. Currently, this limit is ignored and the regular limit for scheduled scan matchsets, ie. struct wiphy::max_match_sets, is used for the net-detect case as well. Cc: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08nl80211: add HT/VHT capabilities to AP parametersJohannes Berg
For the benefit of drivers that rebuild IEs in firmware, parse the IEs for HT/VHT capabilities and the respective membership selector in the (extended) supported rates. This avoids duplicating the same code into all drivers that need this information. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08cfg80211: make rdev assignment clearer in nl80211_testmode_dump()Luca Coelho
Avoid assigning rdev to NULL when we already have it and getting it again from the wiphy index, by moving this code to relevant if block. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08mac80211: check for allocation failure in debugfs codeDan Carpenter
kmalloc() can fail. Also let's move the allocation out of the declaration block so it's easier to read. Fixes: 4a5eccaa9350 ("mac80211: Show pending txqlen in debugfs.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08mac80211: aes-cmac: switch to shash CMAC driverArd Biesheuvel
Instead of open coding the CMAC algorithm in the mac80211 driver using byte wide xors and calls into the crypto layer for each block of data, instantiate a cmac(aes) synchronous hash and pass all the data into it directly. This does not only simplify the code, it also allows the use of more efficient and more secure implementations, especially on platforms where SIMD ciphers have a considerable setup cost. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08mac80211: fils_aead: Use crypto api CMAC shash rather than bare cipherArd Biesheuvel
Switch the FILS AEAD code to use a cmac(aes) shash instantiated by the crypto API rather than reusing the open coded implementation in aes_cmac_vector(). This makes the code more understandable, and allows platforms to implement cmac(aes) in a more secure (*) and efficient way than is typically possible when using the AES cipher directly. So replace the crypto_cipher by a crypto_shash, and update the aes_s2v() routine to call the shash interface directly. * In particular, the generic table based AES implementation is sensitive to known-plaintext timing attacks on the key, to which AES based MAC algorithms are especially vulnerable, given that their plaintext is not usually secret. Time invariant alternatives are available (e.g., based on SIMD algorithms), but may incur a setup cost that is prohibitive when operating on a single block at a time, which is why they don't usually expose the cipher API. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-08cfg80211 debugfs: Cleanup some checkpatch issuesPichugin Dmitry
This fixes the checkpatch.pl warnings: * Macros should not use a trailing semicolon. * Spaces required around that '='. * Symbolic permissions 'S_IRUGO' are not preferred. Signed-off-by: Dmitriy Pichugin <smokeman85@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The conflict was an interaction between a bug fix in the netvsc driver in 'net' and an optimization of the RX path in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Load correct firmware in rtl8192ce wireless driver, from Jurij Smakov. 2) Fix leak of tx_ring and tx_cq due to overwriting in mlx4 driver, from Martin KaFai Lau. 3) Need to reference count PHY driver module when it is attached, from Mao Wenan. 4) Don't do zero length vzalloc() in ethtool register dump, from Stanislaw Gruszka. 5) Defer net_disable_timestamp() to a workqueue to get out of locking issues, from Eric Dumazet. 6) We cannot drop the SKB dst when IP options refer to them, fix also from Eric Dumazet. 7) Incorrect packet header offset calculations in ip6_gre, again from Eric Dumazet. 8) Missing tcp_v6_restore_cb() causes use-after-free, from Eric too. 9) tcp_splice_read() can get into an infinite loop with URG, and hey it's from Eric once more. 10) vnet_hdr_sz can change asynchronously, so read it once during decision making in macvtap and tun, from Willem de Bruijn. 11) Can't use kernel stack for DMA transfers in USB networking drivers, from Ben Hutchings. 12) Handle csum errors properly in UDP by calling the proper destructor, from Eric Dumazet. 13) For non-deterministic softirq run when scheduling NAPI from a workqueue in mlx4, from Benjamin Poirier. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits) sctp: check af before verify address in sctp_addr_id2transport sctp: avoid BUG_ON on sctp_wait_for_sndbuf mlx4: Invoke softirqs after napi_reschedule udp: properly cope with csum errors catc: Use heap buffer for memory size test catc: Combine failure cleanup code in catc_probe() rtl8150: Use heap buffers for all register access pegasus: Use heap buffers for all register access macvtap: read vnet_hdr_size once tun: read vnet_hdr_sz once tcp: avoid infinite loop in tcp_splice_read() hns: avoid stack overflow with CONFIG_KASAN ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches ipv6: tcp: add a missing tcp_v6_restore_cb() nl80211: Fix mesh HT operation check mac80211: Fix adding of mesh vendor IEs mac80211: Allocate a sync skcipher explicitly for FILS AEAD mac80211: Fix FILS AEAD protection in Association Request frame ip6_gre: fix ip6gre_err() invalid reads netlabel: out of bound access in cipso_v4_validate() ...
2017-02-07bridge: avoid unnecessary read of jiffiesstephen hemminger
Jiffies is volatile so read it once. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07bridge: remove unnecessary check for vtbegin in br_fill_vlan_tinfo_rangeRoopa Prabhu
vtbegin should not be NULL in this function, Its already checked by the caller. this should silence the below smatch complaint: net/bridge/br_netlink_tunnel.c:144 br_fill_vlan_tinfo_range() error: we previously assumed 'vtbegin' could be null (see line 130) net/bridge/br_netlink_tunnel.c 129 130 if (vtbegin && vtend && (vtend->vid - vtbegin->vid) > 0) { ^^^^^^^ Check for NULL. Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support") Reported-By: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07sctp: check af before verify address in sctp_addr_id2transportXin Long
Commit 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc") invoked sctp_verify_addr to verify the addr. But it didn't check af variable beforehand, once users pass an address with family = 0 through sockopt, sctp_get_af_specific will return NULL and NULL pointer dereference will be caused by af->sockaddr_len. This patch is to fix it by returning NULL if af variable is NULL. Fixes: 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07bridge: tunnel: fix attribute checks in br_parse_vlan_tunnel_infoNikolay Aleksandrov
These checks should go after the attributes have been parsed otherwise we're using tb uninitialized. Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: bridge: remove redundant check to see if err is setColin Ian King
The error check on err is redundant as it is being checked previously each time it has been updated. Remove this redundant check. Detected with CoverityScan, CID#140030("Logically dead code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: dsa: Do not clobber PHY link outside of state machineFlorian Fainelli
Calling phy_read_status() means that we may call into genphy_read_status() which in turn will use genphy_update_link() which can make changes to phydev->link outside of the state machine's state transitions. This is an invalid behavior that is now caught as of 811a919135b9 ("phy state machine: failsafe leave invalid RUNNING state") Reported-by: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: pending_confirm is not used anymoreJulian Anastasov
When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. As last step, we can remove the pending_confirm flag. Reported-by: YueHaibing <yuehaibing@huawei.com> Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.") Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TPJulian Anastasov
When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. The datagram protocols can use MSG_CONFIRM to confirm the neighbour. When used with MSG_PROBE we do not reach the code where neighbour is confirmed, so we have to do the same slow lookup by using the dst_confirm_neigh() helper. When MSG_PROBE is not used, ip_append_data/ip6_append_data will set the skb flag dst_pending_confirm. Reported-by: YueHaibing <yuehaibing@huawei.com> Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.") Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: add confirm_neigh method to dst_opsJulian Anastasov
Add confirm_neigh method to dst_ops and use it from IPv4 and IPv6 to lookup and confirm the neighbour. Its usage via the new helper dst_confirm_neigh() should be restricted to MSG_PROBE users for performance reasons. For XFRM prefer the last tunnel address, if present. With help from Steffen Klassert. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07tcp: replace dst_confirm with sk_dst_confirmJulian Anastasov
When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. Use the new sk_dst_confirm() helper to propagate the indication from received packets to sock_confirm_neigh(). Reported-by: YueHaibing <yuehaibing@huawei.com> Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.") Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.") Tested-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07sctp: add dst_pending_confirm flagJulian Anastasov
Add new transport flag to allow sockets to confirm neighbour. When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. The flag is propagated from transport to every packet. It is reset when cached dst is reset. Reported-by: YueHaibing <yuehaibing@huawei.com> Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.") Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: add dst_pending_confirm flag to skbuffJulian Anastasov
Add new skbuff flag to allow protocols to confirm neighbour. When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. Add sock_confirm_neigh() helper to confirm the neighbour and use it for IPv4, IPv6 and VRF before dst_neigh_output. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07sock: add sk_dst_pending_confirm flagJulian Anastasov
Add new sock flag to allow sockets to confirm neighbour. When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. As not all call paths lock the socket use full word for the flag. Add sk_dst_confirm as replacement for dst_confirm when called for received packets. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07sctp: avoid BUG_ON on sctp_wait_for_sndbufMarcelo Ricardo Leitner
Alexander Popov reported that an application may trigger a BUG_ON in sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is waiting on it to queue more data and meanwhile another thread peels off the association being used by the first thread. This patch replaces the BUG_ON call with a proper error handling. It will return -EPIPE to the original sendmsg call, similarly to what would have been done if the association wasn't found in the first place. Acked-by: Alexander Popov <alex.popov@linux.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07ipv6: sr: fix non static symbol warningsWei Yongjun
Fixes the following sparse warnings: net/ipv6/seg6_iptunnel.c:58:5: warning: symbol 'nla_put_srh' was not declared. Should it be static? net/ipv6/seg6_iptunnel.c:238:5: warning: symbol 'seg6_input' was not declared. Should it be static? net/ipv6/seg6_iptunnel.c:254:5: warning: symbol 'seg6_output' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net/sched: act_mirred: remove duplicated include from act_mirred.cWei Yongjun
Remove duplicated include. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07udp: properly cope with csum errorsEric Dumazet
Dmitry reported that UDP sockets being destroyed would trigger the WARN_ON(atomic_read(&sk->sk_rmem_alloc)); in inet_sock_destruct() It turns out we do not properly destroy skb(s) that have wrong UDP checksum. Thanks again to syzkaller team. Fixes : 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: dsa: Add support for platform dataFlorian Fainelli
Allow drivers to use the new DSA API with platform data. Most of the code in net/dsa/dsa2.c does not rely so much on device_nodes and can get the same information from platform_data instead. We purposely do not support distributed configurations with platform data, so drivers should be providing a pointer to a 'struct dsa_chip_data' structure if they wish to communicate per-port layout. Multiple CPUs port could potentially be supported and dsa_chip_data is extended to receive up to one reference to an upstream network device per port described by a dsa_chip_data structure. dsa_dev_to_net_device() increments the network device's reference count, so we intentionally call dev_put() to be consistent with the DT-enabled path, until we have a generic notifier based solution. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07net: dsa: Rename and export dev_to_net_device()Florian Fainelli
In preparation for using this function in net/dsa/dsa2.c, rename the function to make its scope DSA specific, and export it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07mac80211: add back lost debugfs filesJohannes Berg
Somehow these files were never present or lost, but the code is there and they seem somewhat useful, so add them back. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-02-06bridge: fdb: write to used and updated at most once per jiffyNikolay Aleksandrov
Writing once per jiffy is enough to limit the bridge's false sharing. After this change the bridge doesn't show up in the local load HitM stats. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: move write-heavy fdb members in their own cache lineNikolay Aleksandrov
Fdb's used and updated fields are written to on every packet forward and packet receive respectively. Thus if we are receiving packets from a particular fdb, they'll cause false-sharing with everyone who has looked it up (even if it didn't match, since mac/vid share cache line!). The "used" field is even worse since it is updated on every packet forward to that fdb, thus the standard config where X ports use a single gateway results in 100% fdb false-sharing. Note that this patch does not prevent the last scenario, but it makes it better for other bridge participants which are not using that fdb (and are only doing lookups over it). The point is with this move we make sure that only communicating parties get the false-sharing, in a later patch we'll show how to avoid that too. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: move to workqueue gcNikolay Aleksandrov
Move the fdb garbage collector to a workqueue which fires at least 10 milliseconds apart and cleans chain by chain allowing for other tasks to run in the meantime. When having thousands of fdbs the system is much more responsive. Most importantly remove the need to check if the matched entry has expired in __br_fdb_get that causes false-sharing and is completely unnecessary if we cleanup entries, at worst we'll get 10ms of traffic for that entry before it gets deleted. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06bridge: modify bridge and port to have often accessed fields in one cache lineNikolay Aleksandrov
Move around net_bridge so the vlan fields are in the beginning since they're checked on every packet even if vlan filtering is disabled. For the port move flags & vlan group to the beginning, so they're in the same cache line with the port's state (both flags and state are checked on each packet). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06tcp: avoid infinite loop in tcp_splice_read()Eric Dumazet
Splicing from TCP socket is vulnerable when a packet with URG flag is received and stored into receive queue. __tcp_splice_read() returns 0, and sk_wait_data() immediately returns since there is the problematic skb in queue. This is a nice way to burn cpu (aka infinite loop) and trigger soft lockups. Again, this gem was found by syzkaller tool. Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06net: dsa: introduce bridge notifierVivien Didelot
A slave device will now notify the switch fabric once its port is bridged or unbridged, instead of calling directly its switch operations. This code allows propagating cross-chip bridging events in the fabric. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>