Age | Commit message (Collapse) | Author |
|
The handlers for xfrm_tunnel are always invoked with rcu read lock
already.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The gre function pointers for receive and error handling are
always called (from gre.c) with rcu_read_lock already held.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
GRE driver incorrectly uses zero as a flag value. Zero is a perfectly
valid value for key, and the tunnel should match packets with no key only
with tunnels created without key, and vice versa.
This is a slightly visible change since previously it might be possible to
construct a working tunnel that sent key 0 and received only because
of the key wildcard of zero. I.e the sender sent key of zero, but tunnel
was defined without key.
Note: using gre key 0 requires iproute2 utilities v3.2 or later.
The original utility code was broken as well.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case of error, the function genlmsg_put() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test.
dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
If time allows, I'd appreciate if you can take the following fix
for the xt_limit match.
As Jan indicates, random things may occur while using the xt_limit
match due to use of uninitialized memory.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
nfc_llcp_socket_release is calling lock_sock/release_sock while holding
write lock for rwlock. Use bh_lock/unlock_sock instead.
BUG: sleeping function called from invalid context at net/core/sock.c:2138
in_atomic(): 1, irqs_disabled(): 0, pid: 56, name: kworker/1:1
4 locks held by kworker/1:1/56:
Pid: 56, comm: kworker/1:1 Not tainted 3.5.0-999-nfc+ #7
Call Trace:
[<ffffffff810952c5>] __might_sleep+0x145/0x200
[<ffffffff815d7686>] lock_sock_nested+0x36/0xa0
[<ffffffff81731569>] ? _raw_write_lock+0x49/0x50
[<ffffffffa04aa100>] ? nfc_llcp_socket_release+0x30/0x200 [nfc]
[<ffffffffa04aa122>] nfc_llcp_socket_release+0x52/0x200 [nfc]
[<ffffffffa04ab9f0>] nfc_llcp_mac_is_down+0x20/0x30 [nfc]
[<ffffffffa04a6fea>] nfc_dep_link_down+0xaa/0xf0 [nfc]
[<ffffffffa04a9bb5>] nfc_llcp_timeout_work+0x15/0x20 [nfc]
[<ffffffff810825f7>] process_one_work+0x197/0x7c0
[<ffffffff81082596>] ? process_one_work+0x136/0x7c0
[<ffffffff8172fbc9>] ? __schedule+0x419/0x9c0
[<ffffffffa04a9ba0>] ? nfc_llcp_build_gb+0x1b0/0x1b0 [nfc]
[<ffffffff81083090>] worker_thread+0x190/0x4c0
[<ffffffff81082f00>] ? rescuer_thread+0x2a0/0x2a0
[<ffffffff81088d1e>] kthread+0xae/0xc0
[<ffffffff810caafd>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff8173acc4>] kernel_thread_helper+0x4/0x10
[<ffffffff81732174>] ? retint_restore_args+0x13/0x13
[<ffffffff81088c70>] ? flush_kthread_worker+0x150/0x150
[<ffffffff8173acc0>] ? gs_change+0x13/0x13
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
netlink_register_notifier requires notify functions to not sleep.
nfc_stop_poll locks device mutex and must not be called from notifier.
Create workqueue that will handle this for all devices.
BUG: sleeping function called from invalid context at kernel/mutex.c:269
in_atomic(): 0, irqs_disabled(): 0, pid: 4497, name: neard
1 lock held by neard/4497:
Pid: 4497, comm: neard Not tainted 3.5.0-999-nfc+ #5
Call Trace:
[<ffffffff810952c5>] __might_sleep+0x145/0x200
[<ffffffff81743dde>] mutex_lock_nested+0x2e/0x50
[<ffffffff816ffd19>] nfc_stop_poll+0x39/0xb0
[<ffffffff81700a17>] nfc_genl_rcv_nl_event+0x77/0xc0
[<ffffffff8174aa8c>] notifier_call_chain+0x5c/0x120
[<ffffffff8174abd6>] __atomic_notifier_call_chain+0x86/0x140
[<ffffffff8174ab50>] ? notifier_call_chain+0x120/0x120
[<ffffffff815e1347>] ? skb_dequeue+0x67/0x90
[<ffffffff8174aca6>] atomic_notifier_call_chain+0x16/0x20
[<ffffffff8162119a>] netlink_release+0x24a/0x280
[<ffffffff815d7aa8>] sock_release+0x28/0xa0
[<ffffffff815d7be7>] sock_close+0x17/0x30
[<ffffffff811b2a7c>] __fput+0xcc/0x250
[<ffffffff811b2c0e>] ____fput+0xe/0x10
[<ffffffff81085009>] task_work_run+0x69/0x90
[<ffffffff8101b951>] do_notify_resume+0x81/0xd0
[<ffffffff8174ef22>] int_signal+0x12/0x17
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This is used when CONFIG_NFC_SHDLC is disabled.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This adds support for socket of type SOCK_RAW to LLCP.
sk_buff are copied and sent to raw sockets with a 2 bytes extra header:
The first byte header contains the nfc adapter index.
The second one contains flags:
- 0x01 - Direction (0=RX, 1=TX)
- 0x02-0x80 - Reserved
A raw socket has to be explicitly bound to a nfc adapter. This is achieved
by specifying the adapter index to be bound to in the dev_idx field of the
sockaddr_nfc_llcp struct passed to bind().
Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
If rwlock is dynamically allocated but statically initialized it is
missing proper lockdep annotation.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 3352, comm: neard Not tainted 3.5.0-999-nfc+ #2
Call Trace:
[<ffffffff810c8526>] __lock_acquire+0x8f6/0x1bf0
[<ffffffff81739045>] ? printk+0x4d/0x4f
[<ffffffff810c9eed>] lock_acquire+0x9d/0x220
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81746724>] _raw_read_lock+0x44/0x60
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81702bfe>] nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff817034a7>] nfc_llcp_get_sdp_ssap+0xa7/0x1b0
[<ffffffff81706353>] llcp_sock_bind+0x173/0x210
[<ffffffff815d9c94>] sys_bind+0xe4/0x100
[<ffffffff8139209e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8174ea69>] system_call_fastpath+0x16/0x1b
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
simplifies a bunch of callers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
iterates through the opened files in given descriptor table,
calling a supplied function; we stop once non-zero is returned.
Callback gets struct file *, descriptor number and const void *
argument passed to iterator. It is called with files->file_lock
held, so it is not allowed to block.
tty_io, netprio_cgroup and selinux flush_unauthorized_files()
converted to its use.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Both modular callers of sock_map_fd() had been buggy; sctp one leaks
descriptor and file if copy_to_user() fails, 9p one shouldn't be
exposing file in the descriptor table at all.
Switch both to sock_alloc_file(), export it, unexport sock_map_fd() and
make it static.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when
a running state is saved to userspace and then reinstated from there.
Make sure that private xt_limit area is initialized with correct values.
Otherwise, random matchings due to use of uninitialized memory.
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Pull more networking fixes from David Miller:
1) Eric Dumazet discovered and fixed what turned out to be a family of
bugs. These functions were using pskb_may_pull() which might need
to reallocate the linear SKB data buffer, but the callers were not
expecting this possibility. The callers have cached pointers to the
packet header areas, and would need to reload them if we were to
continue using pskb_may_pull().
So they could end up reading garbage.
It's easier to just change these RAW4/RAW6/MIP6 routines to use
skb_header_pointer() instead of pskb_may_pull(), which won't modify
the linear SKB data area.
2) Dave Jone's syscall spammer caught a case where a non-TCP socket can
call down into the TCP keepalive code. The case basically involves
creating a raw socket with sk_protocol == IPPROTO_TCP, then calling
setsockopt(sock_fd, SO_KEEPALIVE, ...)
Fixed by Eric Dumazet.
3) Bluetooth devices do not get configured properly while being powered
on, resulting in always using legacy pairing instead of SSP. Fix
from Andrzej Kaczmarek.
4) Bluetooth cancels delayed work erroneously, put stricter checks in
place. From Andrei Emeltchenko.
5) Fix deadlock between cfg80211_mutex and reg_regdb_search_mutex in
cfg80211, from Luis R. Rodriguez.
6) Fix interrupt double release in iwlwifi, from Emmanuel Grumbach.
7) Missing module license in bcm87xx driver, from Peter Huewe.
8) Team driver can lose port changed events when adding devices to a
team, fix from Jiri Pirko.
9) Fix endless loop when trying ot unregister PPPOE device in zombie
state, from Xiaodong Xu.
10) batman-adv layer needs to set MAC address of software device
earlier, otherwise we call tt_local_add with it uninitialized.
11) Fix handling of KSZ8021 PHYs, it's matched currently by KS8051 but
that doesn't program the device properly. From Marek Vasut.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
ipv6: mip6: fix mip6_mh_filter()
ipv6: raw: fix icmpv6_filter()
net: guard tcp_set_keepalive() to tcp sockets
phy/micrel: Add missing header to micrel_phy.h
phy/micrel: Rename KS80xx to KSZ80xx
phy/micrel: Implement support for KSZ8021
batman-adv: Fix symmetry check / route flapping in multi interface setups
batman-adv: Fix change mac address of soft iface.
pppoe: drop PPPOX_ZOMBIEs in pppoe_release
team: send port changed when added
ipv4: raw: fix icmp_filter()
net/phy/bcm87xx: Add MODULE_LICENSE("GPL") to GPL driver
iwlwifi: don't double free the interrupt in failure path
cfg80211: fix possible circular lock on reg_regdb_search()
Bluetooth: Fix not removing power_off delayed work
Bluetooth: Fix freeing uninitialized delayed works
Bluetooth: mgmt: Fix enabling LE while powered off
Bluetooth: mgmt: Fix enabling SSP while powered off
|
|
mip6_mh_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.
Use skb_header_pointer() instead of pskb_may_pull()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
|
|
Included fixes:
- fix the behaviour of batman-adv in case of virtual interface MAC change event
- fix symmetric link check in neighbour selection
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The commit 5e953778a2aab04929a5e7b69f53dc26e39b079e ("ipconfig: add nameserver
IPs to kernel-parameter ip=") introduces ic_nameservers_predef() that defined
only for BOOTP. However it is used by ip_auto_config_setup() as well. This
patch moves it outside of #ifdef BOOTP.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Christoph Fritz <chf.fritz@googlemail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
icmpv6_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.
Use skb_header_pointer() instead of pskb_may_pull() and
change the prototype to make clear both sk and skb are const.
Also, if icmpv6 header cannot be found, do not deliver the packet,
as we do in IPv4.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current regulatory code on cfg80211 performs a check to
see if a regulatory rule belongs to an IEEE band so that if
a Country IE is received and no rules are specified for a
band (which is allowed by IEEE) those bands are left intact.
The current band check assumes a rule is bound to a band
if the rule's start or end frequency is less than 2 GHz
apart from the center of frequency being inspected.
In order to support 60 GHz for 802.11ad we need to increase
this to account for the channel spacing of 2160 MHz whereby
a channel somewhere in the middle of a regulatory rule may
be more than 2 GHz apart from either the beginning or
end of the frequency rule.
Without a fix for this even though channels 1-3 are allowed world
wide on the rule (57240 - 63720 @ 2160), channel 2 at 60480 MHz
will end up getting disabled given that it is 3240 MHz from
both the frequency rule start and end frequency. Fix this by
using 2 GHz separation assumption for the 2.4 and 5 GHz bands
but for 60 GHz use a 10 GHz separation before assuming a rule
is not part of the band.
Since we have no 802.11ad drivers yet merged this change has
no impact to existing Linux upstream device drivers.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Commit 5640f7685831 ("net: use a per task frag allocator")
accidentally contained an unrelated change to net/ipv4/raw.c,
later committed (without the pr_err() debugging bits) in
net tree as commit ab43ed8b749 (ipv4: raw: fix icmp_filter())
This patch reverts this glitch, noticed by Stephen Rothwell.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:
====================
Please pull this last(?) batch of fixes intended for 3.6...
For the Bluetooth bits, Gustavo says this:
"Here goes probably my last update to 3.6. It includes the two patches
you were ok last week(from Andrzej Kaczmarek), those are critical
ones, and two other fixes one for a system crash and the other for
a missing lockdep annotation."
The referenced fixes from Andrzej prevent attempts to configure devices
that are powered-off.
Along with the Bluetooth fixes, there are a couple of 802.11 fixes.
Emmanuel Grumbach gives us an iwlwifi fix to prevent releasing an
interrupt twice. Luis R. Rodriguez provides a fix for a possible
circular lock dependency in the cfg80211 regulatory enforcement code.
All of these have been in linux-next for a few days. I hope they are
not too late to make the 3.6 release!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull two ceph fixes from Sage Weil:
"The first fixes a leak in the rbd setup error path, and the second
fixes a more serious problem with mismatched kmap/kunmap that surfaced
after the recent refactoring work."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: only kunmap kmapped pages
rbd: drop dev reference on error in rbd_open()
|
|
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
During processing incoming RSET frame chip, possibly due to
its internal timout, can retrnasmit an another RSET which
is next queued for processing in shdlc layer.
In case when we accept processed RSET skip those remaining on
the rcv queue until chip will send it's first S or I frame.
This will mean the chip completed connection as well.
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
As queue_work() does not guarantee immediate execution of sm_work it
can happen in crossover RSET usecase that connect timer will constantly
change the shdlc state from NEGOTIATING to CONNECTING before shdlc has
chance to handle incoming frame.
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com>
Acked-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
After fixing the LLC Makefile, we no longer need those exports.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
The previous shdlc HCI driver and its header are removed from the tree.
PN544 now registers directly with HCI and passes the name of the llc it
requires (shdlc).
HCI instantiation now allocates the required llc instance. The llc is
started when the HCI device is brought up.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This is used by HCI drivers such as the one for the pn544 which require
communications between HCI and the chip to use shdlc.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This is a passthrough llc. It can be used by HCI drivers that don't
need link layer control. HCI will then write directly to the driver, and
driver will deliver incoming frames directly to HCI without any
processing.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
The LLC layer manages modules that control the link layer protocol (such
as shdlc) between HCI and an HCI driver. The driver must simply specify
the required llc when it registers with HCI.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This enables the completion callback to be called from a different
context, preventing a possible deadlock if the callback resulted in the
invocation of a nested call to the currently locked nfc_dev.
This is also more in line with the im_transceive nfc_ops for NFC Core or
NCI drivers which already behave asynchronously.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This method initiates execution of an HCI cmd. Result will be delivered
through an asynchronous callback.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Make it match the data_exchange_cb_t so that it can be used directly in
the implementation of an asynchronous hci_transceive
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Driver must handle its data added to the frame, so at this point
removeing control field of shdlc frame is enough.
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com>
Acked-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Checksum is specific for a chip spcification and it varies
(in size and type) between different hardware. It should be
handled in the driver then.
Moreover, shdlc spec doesn't mention crc as a part of the frame.
Update pn544_hci driver as well.
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com>
Acked-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
nfc_llcp_build_tlv() malloced the memory and should be free in
nfc_llcp_build_gb() after used, and the same in the error handling
case, otherwise it will cause memory leak.
spatch with a semantic match is used to found this problem.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
NFC is using a number of custom ordered workqueues w/ WQ_MEM_RECLAIM.
WQ_MEM_RECLAIM is unnecessary unless NFC is gonna be used as transport
for storage device, and all use cases match one work item to one
ordered workqueue - IOW, there's no actual ordering going on at all
and using system_nrt_wq gives the same behavior.
There's nothing to be gained by using custom workqueues. Use
system_nrt_wq instead and drop all the custom ones.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This patch remove the repeated code for checking llcp_sock &
llcp_sock->dev against NULL.
Signed-off-by: Syam Sidhardhan <s.syam@samsung.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
During NFC-DEP target activation, store the remote
general bytes to be used later in dep_link_up.
When dep_link_up is called, activate the NFC-DEP target,
and forward the remote general bytes.
When dep_link_down is called, deactivate the target.
Signed-off-by: Ilan Elias <ilane@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Signed-off-by: Ilan Elias <ilane@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
If initiator protocol is NFC-DEP, set the local general bytes
in nci_start_poll.
Signed-off-by: Ilan Elias <ilane@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Its possible to use RAW sockets to get a crash in
tcp_set_keepalive() / sk_reset_timer()
Fix is to make sure socket is a SOCK_STREAM one.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SKF_AD_ALU_XOR_X has been added a while ago, but as an 'ancillary'
operation that is invoked through a negative offset in K within BPF
load operations. Since BPF_MOD has recently been added, BPF_XOR should
also be part of the common ALU operations. Removing SKF_AD_ALU_XOR_X
might not be an option since this is exposed to user space.
Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We currently use a per socket order-0 page cache for tcp_sendmsg()
operations.
This page is used to build fragments for skbs.
Its done to increase probability of coalescing small write() into
single segments in skbs still in write queue (not yet sent)
But it wastes a lot of memory for applications handling many mostly
idle sockets, since each socket holds one page in sk->sk_sndmsg_page
Its also quite inefficient to build TSO 64KB packets, because we need
about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit
page allocator more than wanted.
This patch adds a per task frag allocator and uses bigger pages,
if available. An automatic fallback is done in case of memory pressure.
(up to 32768 bytes per frag, thats order-3 pages on x86)
This increases TCP stream performance by 20% on loopback device,
but also benefits on other network devices, since 8x less frags are
mapped on transmit and unmapped on tx completion. Alexander Duyck
mentioned a probable performance win on systems with IOMMU enabled.
Its possible some SG enabled hardware cant cope with bigger fragments,
but their ndo_start_xmit() should already handle this, splitting a
fragment in sub fragments, since some arches have PAGE_SIZE=65536
Successfully tested on various ethernet devices.
(ixgbe, igb, bnx2x, tg3, mellanox mlx4)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|