summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-09Merge branch 'hns3-fixes-for-configuration-lost-problems'David S. Miller
Peng Li says: ==================== fixes for configuration lost problems This patchset refactors some functions and some bugs in order to fix the configuration loss problem when resetting and setting channel number. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: fix for coal configuation lost when setting the channelYunsheng Lin
This patch fixes the coalesce configuation lost problem when setting the channel number by restoring all vectors's coalesce configuation to vector 0's, because all vectors belonging to the same netdev have the same coalesce configuation for now. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: refactor the coalesce related structYunsheng Lin
This patch refoctors the coalesce related struct by introducing the hns3_enet_coalesce struct, in order to fix the coalesce configuation lost problem when changing the channel number. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: fix for coalesce configuration lost during resetYunsheng Lin
Coalesce configuration will be set to default value by hns3_nic_init_vector_data during reset, which causes the coalesce configuration loss problem. This patch fixes it by setting the default value in hns3_nic_alloc_vector_data, which will not be called in the reset process. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: refactor the get/put_vector functionYunsheng Lin
There is a get_vector function, which allocate the vectors for a client, but there is not a put_vector to free the vector. This patch introduces the put_vector function in order to fix the coalesce configuration lost problem during reset process. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: fix for use-after-free when setting ring parameterYunsheng Lin
In hns3_set_ringparam, hns3_uninit_all_ring frees the memory pointed by priv->ring_data[i].ring, and hns3_change_all_ring_bd_num use that pointer without mallocing, which will cause a use-after-free problem. The patch fixes it by not freeing the memory in hns3_uninit_all_ring, and uses hns3_put_ring_config to free it when necessary. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: fix for pause configuration lost during resetYunsheng Lin
Pause configuration will be set to default value by hclge_tm_schd_init during reset, which causes the RSS configuration loss problem. This patch fixes it by calling hclge_tm_init_hw during reset process , which will set the pause configuration to default value. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: fix for RSS configuration loss problem during resetYunsheng Lin
RSS configuration will be set to default value by hclge_rss_init_hw during reset, which causes the RSS configuration loss problem. This patch fixes it by setting the default value in hclge_rss_init_cfg function, which will not be called in the reset process. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: refactor the hclge_get/set_rss_tuple functionYunsheng Lin
This patch refactors the hclge_get/set_rss_tuple function in order to fix the rss configuration loss problem during reset process. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: hns3: refactor the hclge_get/set_rss functionYunsheng Lin
This patch refactors the hclge_get/set_rss function in order to fix the rss configuration loss problem during reset process. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09Merge branch 'sched-action-events'David S. Miller
Roman Mashak says: ==================== Fix event generation for actions batch Add/Delete mode When adding or deleting a batch of entries, the kernel sends upto TCA_ACT_MAX_PRIO entries in an event to user space. However it does not consider that the action sizes may vary and require different skb sizes. For example : % cat tc-batch.sh TC="sudo /mnt/iproute2.git/tc/tc" $TC actions flush action gact for i in `seq 1 $1`; do cmd="action pass index $i " args=$args$cmd done $TC actions add $args % % ./tc-batch.sh 32 Error: Failed to fill netlink attributes while adding TC action. We have an error talking to the kernel % This patchset introduces new callback in tc_action_ops, which calculates the action size, and passes size to tcf_add_notify()/tcf_del_notify(). The patch fixes act_gact, and the rest of actions will be updated in the follow-up patches. v3: Fixed tcf_action_fill_size() to return shared attrs length when action ->get_fill_size() isn't implemented. v2: Restructured patches to make them bisectable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: implement get_fill_size routine in act_gactRoman Mashak
Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: calculate add/delete event message sizeRoman Mashak
Introduce routines to calculate size of the shared tc netlink attributes and the full message size including netlink header and tc service header. Update add/delete action logic to have the size for event messages, the size is passed to tcf_add_notify() and tcf_del_notify() where the notification message is being allocated and constructed. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: add new tc_action_ops callbackRoman Mashak
Add a new callback in tc_action_ops, it will be needed by the tc actions to compute its size when a ADD/DELETE notification message is constructed. This routine has to take into account optional/variable size TLVs specific per action. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: update Add/Delete action API with new argumentRoman Mashak
Introduce a new function argument to carry total attributes size for correct allocation of skb in event messages. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: do not create fallback tunnels for non-default namespacesEric Dumazet
fallback tunnels (like tunl0, gre0, gretap0, erspan0, sit0, ip6tnl0, ip6gre0) are automatically created when the corresponding module is loaded. These tunnels are also automatically created when a new network namespace is created, at a great cost. In many cases, netns are used for isolation purposes, and these extra network devices are a waste of resources. We are using thousands of netns per host, and hit the netns creation/delete bottleneck a lot. (Many thanks to Kirill for recent work on this) Add a new sysctl so that we can opt-out from this automatic creation. Note that these tunnels are still created for the initial namespace, to be the least intrusive for typical setups. Tested: lpk43:~# cat add_del_unshare.sh for i in `seq 1 40` do (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) & done wait lpk43:~# echo 0 >/proc/sys/net/core/fb_tunnels_only_for_init_net lpk43:~# time ./add_del_unshare.sh real 0m37.521s user 0m0.886s sys 7m7.084s lpk43:~# echo 1 >/proc/sys/net/core/fb_tunnels_only_for_init_net lpk43:~# time ./add_del_unshare.sh real 0m4.761s user 0m0.851s sys 1m8.343s lpk43:~# Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09tools: tc-testing: Can pause just before post-suiteBrenda J. Butler
With option -P, the test script will pause just before the post_suite functions are called. This allows the tester to inspect the system before it is torn down. Signed-off-by: Brenda J. Butler <bjb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09tools: tc-testing: Can refer to $TESTID in test specBrenda J. Butler
When processing the commands in the test cases, substitute the test id for $TESTID. This helps to make more flexible tests. For example, the testid can be given as a command line argument. As an example, if we wish to save the test output to a file named for the test case, we can write in the test case: "cmdUnderTest": "some test command | tee -a $TESTID.out" Signed-off-by: Brenda J. Butler <bjb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: dsa: mv88e6xxx: Fix irq free'ingAndrew Lunn
Call the common irq free function, rather than going recursive and blowing away the stack, followed by the machine. Fixes: 294d711ee8c0 ("net: dsa: mv88e6xxx: Poll when no interrupt defined") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09tc-testing: add csum testsRoman Mashak
Signed-off-by: Roman Mashak <mrv@mojatatu.com> Tested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: usb: asix88179_178a: de-duplicate codeAlexander Kurz
Remove the duplicated code for asix88179_178a bind and reset methods. Signed-off-by: Alexander Kurz <akurz@blala.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: usb: asix88179_178a: set permanent address once onlyAlexander Kurz
The permanent address of asix88179_178a devices is read at probe time and should not be overwritten later. Otherwise it may be overwritten unintentionally with a configured address. Signed-off-by: Alexander Kurz <akurz@blala.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08Merge branch 'ntuple-filters-with-RSS'David S. Miller
Edward Cree says: ==================== ntuple filters with RSS This series introduces the ability to mark an ethtool steering filter to use RSS spreading, and the ability to create and configure multiple RSS contexts with different indirection tables, hash keys, and hash fields. An implementation for the sfc driver (for 7000-series and later SFC NICs) is included in patch 2/2. The anticipated use case of this feature is for steering traffic destined for a container (or virtual machine) to the subset of CPUs on which processes in the container (or the VM's vCPUs) are bound, while retaining the scalability of RSS spreading from the viewpoint inside the container. The use of both a base queue number (ring_cookie) and indirection table is intended to allow re-use of a single RSS context to target multiple sets of CPUs. For instance, if an 8-core system is hosting three containers on CPUs [1,2], [3,4] and [6,7], then a single RSS context with an equal-weight [0,1] indirection table could be used to target all three containers by setting ring_cookie to 1, 3 and 6 on the respective filters. v2: Initialised ctx in efx_ef10_filter_insert() to avoid (false positive) gcc warning. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08sfc: support RSS spreading of ethtool ntuple filtersEdward Cree
Use a linked list to associate user-facing context IDs with FW-facing context IDs (since the latter can change after an MC reset). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: ethtool: extend RXNFC API to support RSS spreading of filter matchesEdward Cree
We use a two-step process to configure a filter with RSS spreading. First, the RSS context is allocated and configured using ETHTOOL_SRSSH; this returns an identifier (rss_context) which can then be passed to subsequent invocations of ETHTOOL_SRXCLSRLINS to specify that the offset from the RSS indirection table lookup should be added to the queue number (ring_cookie) when delivering the packet. Drivers for devices which can only use the indirection table entry directly (not add it to a base queue number) should reject rule insertions combining RSS with a nonzero ring_cookie. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08rds: rds_info_from_znotifier() can be statickbuild test robot
Fixes: 9426bbc6de99 ("rds: use list structure to track information for zerocopy completion notification") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08rds: rds_message_zcopy_from_user() can be statickbuild test robot
Fixes: d40a126b16ea ("rds: refactor zcopy code into rds_message_zcopy_from_user") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net/ncsi: unlock on error in ncsi_set_interface_nl()Dan Carpenter
There are two error paths which are missing unlocks in this function. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net/ncsi: use kfree_skb() instead of kfree()Dan Carpenter
We're supposed to use kfree_skb() to free these sk_buffs. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08liquidio: avoid doing useless workPrasad Kanneganti
Avoid doing useless work by making sure that the response_list is not empty before scheduling work to process it. Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08liquidio: Resolved mbox read issue while reading more than one 64bit dataIntiyaz Basha
Corrected length check when data received in the mbox is more than one 64 bit data value Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08Merge tag 'mlx5-updates-2018-02-28-2' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-updates-2018-02-28-2 (IPSec-2) This series follows our previous one to lay out the foundations for IPSec in user-space and extend current kernel netdev IPSec support. As noted in our previous pull request cover letter "mlx5-updates-2018-02-28-1 (IPSec-1)", the IPSec mechanism will be supported through our flow steering mechanism. Therefore, we need to change the initialization order. Furthermore, IPsec is also supported in both egress and ingress. Since our current flow steering is egress only, we add an empty (only implemented through FPGA steering ops) egress namespace to handle that case. We also implement the required flow steering callbacks and logic in our FPGA driver. We extend the FPGA support for ESN and modifying a xfrm too. Therefore, we add support for some new FPGA command interface that supports them. The other required bits are added too. The new features and requirements are advertised via cap bits. Last but not least, we revise our driver's accel_esp API. This API will be shared between our netdev and IB driver, so we need to have all the required functionality from both worlds. Regards, Aviad and Matan ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08Merge branch 'ibmvnic-Clean-up-net-close-and-fix-reset-bug'David S. Miller
Thomas Falcon says: ==================== ibmvnic: Clean up net close and fix reset bug This patch set cleans up and reorganizes the driver's net_device close function and leverages that to fix up a bug that can occur during some device resets. Some reset cases require the backing adapter to be disabled before continuing, but other cases, such as during a device failover or partition migration, do not require this step. Since the device will not be initialized at this stage and its command-processing queue is closed, do not send the request to disable the device as it could result in an error or timeout disrupting the reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08ibmvnic: Do not disable device during failover or partition migrationThomas Falcon
During a device failover or partition migration reset, it is not necessary to disable the backing adapter since it should not be running yet and its Command-Response Queue is closed. Sending device commands during this time could result in an error or timeout disrupting the reset process. In these cases, just halt transmissions, clean up resources, and continue with reset. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08ibmvnic: Reorganize device closeThomas Falcon
Introduce a function to halt network operations and clean up any unused or outstanding socket buffers. Then, during device close, disable backing adapter before halting all queues and performing cleanup. This ensures all backing device operations will be stopped before the driver cleans up shared resources. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08ibmvnic: Clean up device closeThomas Falcon
Remove some dead code now that RX pools are being cleaned. This was included to wait until any pending RX queue interrupts are processed, but NAPI polling should be disabled by this point. Another minor change is to use the net device parameter for any print functions instead of accessing it from the adapter structure. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08openvswitch: fix vport packet length check.William Tu
When sending a packet to a tunnel device, the dev's hard_header_len could be larger than the skb->len in function packet_length(). In the case of ip6gretap/erspan, hard_header_len = LL_MAX_HEADER + t_hlen, which is around 180, and an ARP packet sent to this tunnel has skb->len = 42. This causes the 'unsign int length' to become super large because it is negative value, causing the later ovs_vport_send to drop it due to over-mtu size. The patch fixes it by setting it to 0. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08Merge branch 'pernet-convert-part5'David S. Miller
Kirill Tkhai says: ==================== Converting pernet_operations (part #5) this series continues to review and to convert pernet_operations to make them possible to be executed in parallel for several net namespaces in the same time. There are mostly netfilter operations (and they should be the last netfilter's), also there are two patches touching pktgen and xfrm. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convet ipv6_net_opsKirill Tkhai
These pernet_operations are similar to ipv4_net_ops. They are safe to be async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert ipv4_net_opsKirill Tkhai
These pernet_operations register and unregister bunch of nf_conntrack_l4proto. Exit method unregisters related sysctl, init method calls init_net and get_net_proto. The whole builtin_l4proto4 array has pretty simple init_net and get_net_proto methods. The first one register sysctl table, the second one is just RO memory dereference. So, these pernet_operations are safe to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_security_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::iptable_security table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_raw_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::iptable_raw table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_nat_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::nat_table table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_mangle_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::iptable_mangle table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert arptable_filter_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::arptable_filter. Another net/pernet_operations do not send arp packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert pg_net_opsKirill Tkhai
These pernet_operations create per-net pktgen threads and /proc entries. These pernet subsys looks closed in itself, and there are no pernet_operations outside this file, which are interested in the threads. Init and/or exit methods look safe to be executed in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_queue_net_opsKirill Tkhai
These pernet_operations register and unregister net::nf::queue_handler and /proc entry. The handler is accessed only under RCU, so this looks safe to convert them. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_log_net_opsKirill Tkhai
These pernet_operations create and destroy /proc entries. Also, exit method unsets nfulnl_logger. The logger is not set by default, and it becomes bound via userspace request. So, they look safe to be made async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert cttimeout_opsKirill Tkhai
These pernet_operations also look closed in themself. Exit method touch only per-net structures, so it's safe to execute them for several net namespaces in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_acct_opsKirill Tkhai
These pernet_operations look closed in themself, and there are no other users of net::nfnl_acct_list outside. They are safe to be executed for several net namespaces in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>