summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-02-17ipv4: fix counter in_slow_totDuan Jiong
since commit 89aef8921bf("ipv4: Delete routing cache."), the counter in_slow_tot can't work correctly. The counter in_slow_tot increase by one when fib_lookup() return successfully in ip_route_input_slow(), but actually the dst struct maybe not be created and cached, so we can increase in_slow_tot after the dst struct is created. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "We have some patches fixing up ACL support issues from Zheng and Guangliang and a mount option to enable/disable this support. (These fixes were somewhat delayed by the Chinese holiday.) There is also a small fix for cached readdir handling when directories are fragmented" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: fix __dcache_readdir() ceph: add acl, noacl options for cephfs mount ceph: make ceph_forget_all_cached_acls() static inline ceph: add missing init_acl() for mkdir() and atomic_open() ceph: fix ceph_set_acl() ceph: fix ceph_removexattr() ceph: remove xattr when null value is given to setxattr() ceph: properly handle XATTR_CREATE and XATTR_REPLACE
2014-02-17Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull CIFS fixes from Steve French: "Three cifs fixes, the most important fixing the problem with passing bogus pointers with writev (CVE-2014-0069). Two additional cifs fixes are still in review (including the fix for an append problem which Al also discovered)" * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Fix too big maxBuf size for SMB3 mounts cifs: ensure that uncached writes handle unmapped areas correctly [CIFS] Fix cifsacl mounts over smb2 to not call cifs
2014-02-17FS-Cache: Handle removal of unadded object to the fscache_object_list rb treeDavid Howells
When FS-Cache allocates an object, the following sequence of events can occur: -->fscache_alloc_object() -->cachefiles_alloc_object() [via cache->ops->alloc_object] <--[returns new object] -->fscache_attach_object() <--[failed] -->cachefiles_put_object() [via cache->ops->put_object] -->fscache_object_destroy() -->fscache_objlist_remove() -->rb_erase() to remove the object from fscache_object_list. resulting in a crash in the rbtree code. The problem is that the object is only added to fscache_object_list on the success path of fscache_attach_object() where it calls fscache_objlist_add(). So if fscache_attach_object() fails, the object won't have been added to the objlist rbtree. We do, however, unconditionally try to remove the object from the tree. Thanks to NeilBrown for finding this and suggesting this solution. Reported-by: NeilBrown <neilb@suse.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: (a customer of) NeilBrown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-17reiserfs: fix utterly brain-damaged indentation.Dave Jones
This has been this way for years, and every time I stumble across it I lose my lunch. After coming across it for the nth time in the Coverity results, I had to overcome the bystander effect and do something about it. This ignores the 79 column limit in favor of making it look like C instead of gibberish. The correct thing to do here would be to lose some of the indentation by breaking this function up into several smaller ones. I might do that at some point if I have the stomach to look at this again. (Also some of those overlong ternary operations would likely be more readable as regular if's) Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-17irtty-sir.c: Do not set_termios() on irtty_close()Tommie Gannert
Issuing set_termios() from irtty_close() causes kernel Oops for unplugged usb-serial devices. Since no other tty_ldisc calls set_termios() on close and no tty driver seem to check if tty->device_data is NULL or not on entry to set_termios(), the only solution I can come up with is to remove the irtty_stop_receiver() call, which only updates termios. Signed-off-by: Tommie Gannert <tommie@gannert.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2014-02-17Merge tag 'dma-buf-for-3.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf Pull dma-buf fix from Sumit Semwal: "Just some debugfs output updates. There's another patch related to dma-buf, but it'll get upstreamed via Greg KH's pull request" * tag 'dma-buf-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf: dma-buf: update debugfs output
2014-02-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32 Pull AVR32 fixes from Hans-Christian Egtvedt. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32: avr32: add generic vga.h to Kbuild avr32: add generic ioremap_wc() definition in io.h avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 use avr32: fix missing module.h causing build failure in mimc200/fram.c
2014-02-17ceph: fix __dcache_readdir()Yan, Zheng
If directory is fragmented, readdir() read its dirfrags one by one. After reading all dirfrags, the corresponding dentries are sorted in (frag_t, off) order in the dcache. If dentries of a directory are all cached, __dcache_readdir() can use the cached dentries to satisfy readdir syscall. But when checking if a given dentry is after the position of readdir, __dcache_readdir() compares numerical value of frag_t directly. This is wrong, it should use ceph_frag_compare(). Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-02-17ceph: add acl, noacl options for cephfs mountSage Weil
Make the 'acl' option dependent on having ACL support compiled in. Make the 'noacl' option work even without it so that one can always ask it to be off and not error out on mount when it is not supported. Signed-off-by: Guangliang Zhao <lucienchao@gmail.com> Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-17ceph: make ceph_forget_all_cached_acls() static inlineGuangliang Zhao
Signed-off-by: Guangliang Zhao <lucienchao@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Sage Weil <sage@inktank.com>
2014-02-17ceph: add missing init_acl() for mkdir() and atomic_open()Yan, Zheng
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-02-17ceph: fix ceph_set_acl()Yan, Zheng
If acl is equivalent to file mode permission bits, ceph_set_acl() needs to remove any existing acl xattr. Use __ceph_setxattr() to handle both setting and removing acl xattr cases, it doesn't return -ENODATA when there is no acl xattr. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-02-17ceph: fix ceph_removexattr()Yan, Zheng
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-02-17ceph: remove xattr when null value is given to setxattr()Yan, Zheng
For the setxattr request, introduce a new flag CEPH_XATTR_REMOVE to distinguish null value case from the zero-length value case. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-02-17ceph: properly handle XATTR_CREATE and XATTR_REPLACEYan, Zheng
return -EEXIST if XATTR_CREATE is set and xattr alread exists. return -ENODATA if XATTR_REPLACE is set but xattr does not exist. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-02-17Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "Here are some more powerpc fixes for 3.14 The main one is a nasty issue with the NUMA balancing support which requires a small generic change and the addition of a new accessor to set _PAGE_NUMA. Both have been reviewed and acked by Mel and Rik. The changelog should have plenty of details but basically, without this fix, we get random user segfaults and/or corruptions due to missing TLB/hash flushes. Aneesh series of 3 patches fixes it. We have some vDSO vs. perf fixes from Anton, some small EEH fixes from Gavin, a ppc32 regression vs the stack overflow detector, and a fix for the way we handle PCIe host bridge speed settings on pseries (which is needed for proper operations of AMD graphics cards on Power8)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/eeh: Disable EEH on reboot powerpc/eeh: Cleanup on eeh_subsystem_enabled powerpc/powernv: Rework EEH reset powerpc: Use unstripped VDSO image for more accurate profiling data powerpc: Link VDSOs at 0x0 mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit mm: Dirty accountable change only apply to non prot numa case powerpc/mm: Add new "set" flag argument to pte/pmd update function powerpc/pseries: Add Gen3 definitions for PCIE link speed powerpc/pseries: Fix regression on PCI link speed powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack
2014-02-17printk: fix syslog() overflowing user bufferLinus Torvalds
This is not a buffer overflow in the traditional sense: we don't overflow any *kernel* buffers, but we do mis-count the amount of data we copy back to user space for the SYSLOG_ACTION_READ_ALL case. In particular, if the user buffer is too small to hold everything, and *if* there is a continuation line at just the right place, we can end up giving the user more data than he asked for. The reason is that we first count up the number of bytes all the log records contains, then we walk the records again until we've skipped the records at the beginning that won't fit, and then we walk the rest of the records and copy them to the user space buffer. And in between that "skip the initial records that won't fit" and the "copy the records that *will* fit to user space", we reset the 'prev' variable that contained the record information for the last record not copied. That meant that when we started copying to user space, we now had a different character count than what we had originally calculated in the first record walk-through. The fix is to simply not clear the 'prev' flags value (in both cases where we had the same logic: syslog_print_all and kmsg_dump_get_buffer: the latter is used for pstore-like dumping) Reported-and-tested-by: Debabrata Banerjee <dbanerje@akamai.com> Acked-by: Kay Sievers <kay@vrfy.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-17HID: hyperv: make sure input buffer is big enoughDavid Herrmann
We need at least HID_MAX_BUFFER_SIZE (4096) bytes as input buffer. HID core depends on this as it requires every input report to be at least as big as advertised. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-17HID: Bluetooth: hidp: make sure input buffers are big enoughDavid Herrmann
HID core expects the input buffers to be at least of size 4096 (HID_MAX_BUFFER_SIZE). Other sizes will result in buffer-overflows if an input-report is smaller than advertised. We could, like i2c, compute the biggest report-size instead of using HID_MAX_BUFFER_SIZE, but this will blow up if report-descriptors are changed after ->start() has been called. So lets be safe and just use the biggest buffer we have. Note that this adds an additional copy to the HIDP input path. If there is a way to make sure the skb-buf is big enough, we should use that instead. The best way would be to make hid-core honor the @size argument, though, that sounds easier than it is. So lets just fix the buffer-overflows for now and afterwards look for a faster way for all transport drivers. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-17bonding: 802.3ad: make aggregator_identifier bond-privateJiri Bohac
aggregator_identifier is used to assign unique aggregator identifiers to aggregators of a bond during device enslaving. aggregator_identifier is currently a global variable that is zeroed in bond_3ad_initialize(). This sequence will lead to duplicate aggregator identifiers for eth1 and eth3: create bond0 change bond0 mode to 802.3ad enslave eth0 to bond0 //eth0 gets agg id 1 enslave eth1 to bond0 //eth1 gets agg id 2 create bond1 change bond1 mode to 802.3ad enslave eth2 to bond1 //aggregator_identifier is reset to 0 //eth2 gets agg id 1 enslave eth3 to bond0 //eth3 gets agg id 2 Fix this by making aggregator_identifier private to the bond. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17usbnet: remove generic hard_header_len checkEmil Goode
This patch removes a generic hard_header_len check from the usbnet module that is causing dropped packages under certain circumstances for devices that send rx packets that cross urb boundaries. One example is the AX88772B which occasionally send rx packets that cross urb boundaries where the remaining partial packet is sent with no hardware header. When the buffer with a partial packet is of less number of octets than the value of hard_header_len the buffer is discarded by the usbnet module. With AX88772B this can be reproduced by using ping with a packet size between 1965-1976. The bug has been reported here: https://bugzilla.kernel.org/show_bug.cgi?id=29082 This patch introduces the following changes: - Removes the generic hard_header_len check in the rx_complete function in the usbnet module. - Introduces a ETH_HLEN check for skbs that are not cloned from within a rx_fixup callback. - For safety a hard_header_len check is added to each rx_fixup callback function that could be affected by this change. These extra checks could possibly be removed by someone who has the hardware to test. - Removes a call to dev_kfree_skb_any() and instead utilizes the dev->done list to queue skbs for cleanup. The changes place full responsibility on the rx_fixup callback functions that clone skbs to only pass valid skbs to the usbnet_skb_return function. Signed-off-by: Emil Goode <emilgoode@gmail.com> Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17gre: add link local route when local addr is anyNicolas Dichtel
This bug was reported by Steinar H. Gunderson and was introduced by commit f7cb8886335d ("sit/gre6: don't try to add the same route two times"). root@morgental:~# ip tunnel add foo mode gre remote 1.2.3.4 ttl 64 root@morgental:~# ip link set foo up mtu 1468 root@morgental:~# ip -6 route show dev foo fe80::/64 proto kernel metric 256 but after the above commit, no such route shows up. There is no link local route because dev->dev_addr is 0 (because local ipv4 address is 0), hence no link local address is configured. In this scenario, the link local address is added manually: 'ip -6 addr add fe80::1 dev foo' and because prefix is /128, no link local route is added by the kernel. Even if the right things to do is to add the link local address with a /64 prefix, we need to restore the previous behavior to avoid breaking userpace. Reported-by: Steinar H. Gunderson <sesse@samfundet.no> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17batman-adv: fix potential kernel paging error for unicast transmissionsAntonio Quartulli
batadv_send_skb_prepare_unicast(_4addr) might reallocate the skb's data. If it does then our ethhdr pointer is not valid anymore in batadv_send_skb_unicast(), resulting in a kernel paging error. Fixing this by refetching the ethhdr pointer after the potential reallocation. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2014-02-17batman-adv: avoid double free when orig_node initialization failsAntonio Quartulli
In the failure path of the orig_node initialization routine the orig_node->bat_iv.bcast_own field is free'd twice: first in batadv_iv_ogm_orig_get() and then later in batadv_orig_node_free_rcu(). Fix it by removing the kfree in batadv_iv_ogm_orig_get(). Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17batman-adv: free skb on TVLV parsing successAntonio Quartulli
When the TVLV parsing routine succeed the skb is left untouched thus leading to a memory leak. Fix this by consuming the skb in case of success. Introduced by ef26157747d42254453f6b3ac2bd8bd3c53339c3 ("batman-adv: tvlv - basic infrastructure") Reported-by: Russel Senior <russell@personaltelco.net> Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Tested-by: Russell Senior <russell@personaltelco.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17batman-adv: fix TT CRC computation by ensuring byte orderAntonio Quartulli
When computing the CRC on a 2byte variable the order of the bytes obviously alters the final result. This means that computing the CRC over the same value on two archs having different endianess leads to different numbers. The global and local translation table CRC computation routine makes this mistake while processing the clients VIDs. The result is a continuous CRC mismatching between nodes having different endianess. Fix this by converting the VID to Network Order before processing it. This guarantees that every node uses the same byte order. Introduced by 7ea7b4a142758deaf46c1af0ca9ceca6dd55138b ("batman-adv: make the TT CRC logic VLAN specific") Reported-by: Russel Senior <russell@personaltelco.net> Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Tested-by: Russell Senior <russell@personaltelco.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17batman-adv: fix potential orig_node reference leakSimon Wunderlich
Since batadv_orig_node_new() sets the refcount to two, assuming that the calling function will use a reference for putting the orig_node into a hash or similar, both references must be freed if initialization of the orig_node fails. Otherwise that object may be leaked in that error case. Reported-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2014-02-17batman-adv: avoid potential race condition when adding a new neighbourAntonio Quartulli
When adding a new neighbour it is important to atomically perform the following: - check if the neighbour already exists - append the neighbour to the proper list If the two operations are not performed in an atomic context it is possible that two concurrent insertions add the same neighbour twice. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17batman-adv: properly check pskb_may_pull return valueAntonio Quartulli
pskb_may_pull() returns 1 on success and 0 in case of failure, therefore checking for the return value being negative does not make sense at all. This way if the function fails we will probably read beyond the current skb data buffer. Fix this by doing the proper check. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17batman-adv: release vlan object after checking the CRCAntonio Quartulli
There is a refcounter unbalance in the CRC checking routine invoked on OGM reception. A vlan object is retrieved (thus its refcounter is increased by one) but it is never properly released. This leads to a memleak because the vlan object will never be free'd. Fix this by releasing the vlan object after having read the CRC. Reported-by: Russell Senior <russell@personaltelco.net> Reported-by: Daniel <daniel@makrotopia.org> Reported-by: cmsv <cmsv@wirelesspt.net> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17batman-adv: fix TT-TVLV parsing on OGM receptionAntonio Quartulli
When accessing a TT-TVLV container in the OGM RX path the variable pointing to the list of changes to apply is altered by mistake. This makes the TT component read data at the wrong position in the OGM packet buffer. Fix it by removing the bogus pointer alteration. Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17batman-adv: fix soft-interface MTU computationAntonio Quartulli
The current MTU computation always returns a value smaller than 1500bytes even if the real interfaces have an MTU large enough to compensate the batman-adv overhead. Fix the computation by properly returning the highest admitted value. Introduced by a19d3d85e1b854e4a483a55d740a42458085560d ("batman-adv: limit local translation table max size") Reported-by: Russell Senior <russell@personaltelco.net> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-02-17HID: hid-sensor-hub: quirk for STM Sensor hubArchana Patni
Added STM sensor hub vendor id in HID_SENSOR_HUB_ENUM_QUIRK to fix report descriptors. These devices uses old FW which uses logical 0 as minimum. In these, HID reports are not using proper collection classes. So we need to fix report descriptors,for such devices. This will not have any impact, if the FW uses logical 1 as minimum. We look for usage id for "power and report state", and modify logical minimum value to 1. This is a follow-up patch to commit id 875e36f8. Signed-off-by: Archana Patni <archana.patni@linux.intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-17avr32: add generic vga.h to KbuildChen Gang
Need add generic "vga.h", or can not pass building for allmodconfig, the related error: CC [M] drivers/gpu/drm/drm_irq.o In file included from include/linux/vgaarb.h:34, from drivers/gpu/drm/drm_irq.c:42: include/video/vga.h:22:21: error: asm/vga.h: No such file or directory Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Hans-Christian Egtvedt <hegtvedt@cisco.com>
2014-02-17avr32: add generic ioremap_wc() definition in io.hChen Gang
Need generic ioremap_wc(), or can not pass compiling with allmodconfig, the related error: CC [M] drivers/gpu/drm/drm_bufs.o drivers/gpu/drm/drm_bufs.c: In function 'drm_addmap_core': drivers/gpu/drm/drm_bufs.c:217: error: implicit declaration of function 'ioremap_wc' drivers/gpu/drm/drm_bufs.c:218: warning: assignment makes pointer from integer without a cast Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Hans-Christian Egtvedt <hegtvedt@cisco.com>
2014-02-17avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 useChen Gang
For avr32 cross compiler, do not define '__linux__' internally, so it will cause issue with allmodconfig. The related error: CC [M] fs/coda/psdev.o In file included from include/linux/coda.h:64, from fs/coda/psdev.c:45: include/uapi/linux/coda.h:221: error: expected specifier-qualifier-list before 'u_quad_t' The related toolchain version (which only download, not re-compile): [root@gchen linux-next]# /upstream/toolchain/download/avr32-gnu-toolchain-linux_x86/bin/avr32-gcc -v Using built-in specs. Target: avr32 Configured with: /data2/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/src/gcc/configure --target=avr32 --host=i686-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-languages=c,c++ --disable-nls --disable-libssp --disable-libstdcxx-pch --with-dwarf2 --enable-version-specific-runtime-libs --disable-shared --enable-doc --with-mpfr-lib=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/lib --with-mpfr-include=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/include --with-gmp=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --with-mpc=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-__cxa_atexit --disable-shared --with-newlib --with-pkgversion=AVR_32_bit_GNU_Toolchain_3.4.2_435 --with-bugurl=http://www .atmel.com/avr Thread model: single gcc version 4.4.7 (AVR_32_bit_GNU_Toolchain_3.4.2_435) Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Hans-Christian Egtvedt <hegtvedt@cisco.com> Cc: stable@vger.kernel.org
2014-02-17avr32: fix missing module.h causing build failure in mimc200/fram.cPaul Gortmaker
Causing this: In file included from arch/avr32/boards/mimc200/fram.c:13: include/linux/miscdevice.h:51: error: field 'list' has incomplete type include/linux/miscdevice.h:55: error: expected specifier-qualifier-list before 'mode_t' arch/avr32/boards/mimc200/fram.c:42: error: 'THIS_MODULE' undeclared here (not in a function) Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: stable@vger.kernel.org
2014-02-17packet: check for ndo_select_queue during queue selectionDaniel Borkmann
Mathias reported that on an AMD Geode LX embedded board (ALiX) with ath9k driver PACKET_QDISC_BYPASS, introduced in commit d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option"), triggers a WARN_ON() coming from the driver itself via 066dae93bdf ("ath9k: rework tx queue selection and fix queue stopping/waking"). The reason why this happened is that ndo_select_queue() call is not invoked from direct xmit path i.e. for ieee80211 subsystem that sets queue and TID (similar to 802.1d tag) which is being put into the frame through 802.11e (WMM, QoS). If that is not set, pending frame counter for e.g. ath9k can get messed up. So the WARN_ON() in ath9k is absolutely legitimate. Generally, the hw queue selection in ieee80211 depends on the type of traffic, and priorities are set according to ieee80211_ac_numbers mapping; working in a similar way as DiffServ only on a lower layer, so that the AP can favour frames that have "real-time" requirements like voice or video data frames. Therefore, check for presence of ndo_select_queue() in netdev ops and, if available, invoke it with a fallback handler to __packet_pick_tx_queue(), so that driver such as bnx2x, ixgbe, or mlx4 can still select a hw queue for transmission in relation to the current CPU while e.g. ieee80211 subsystem can make their own choices. Reported-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17netdevice: move netdev_cap_txqueue for shared usage to headerDaniel Borkmann
In order to allow users to invoke netdev_cap_txqueue, it needs to be moved into netdevice.h header file. While at it, also add kernel doc header to document the API. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17netdevice: add queue selection fallback handler for ndo_select_queueDaniel Borkmann
Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17drivers/net: tulip_remove_one needs to call pci_disable_device()Ingo Molnar
Otherwise the device is not completely shut down. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17net: sctp: Fix a_rwnd/rwnd management to reflect real state of the ↵Matija Glavinic Pecotic
receiver's buffer Implementation of (a)rwnd calculation might lead to severe performance issues and associations completely stalling. These problems are described and solution is proposed which improves lksctp's robustness in congestion state. 1) Sudden drop of a_rwnd and incomplete window recovery afterwards Data accounted in sctp_assoc_rwnd_decrease takes only payload size (sctp data), but size of sk_buff, which is blamed against receiver buffer, is not accounted in rwnd. Theoretically, this should not be the problem as actual size of buffer is double the amount requested on the socket (SO_RECVBUF). Problem here is that this will have bad scaling for data which is less then sizeof sk_buff. E.g. in 4G (LTE) networks, link interfacing radio side will have a large portion of traffic of this size (less then 100B). An example of sudden drop and incomplete window recovery is given below. Node B exhibits problematic behavior. Node A initiates association and B is configured to advertise rwnd of 10000. A sends messages of size 43B (size of typical sctp message in 4G (LTE) network). On B data is left in buffer by not reading socket in userspace. Lets examine when we will hit pressure state and declare rwnd to be 0 for scenario with above stated parameters (rwnd == 10000, chunk size == 43, each chunk is sent in separate sctp packet) Logic is implemented in sctp_assoc_rwnd_decrease: socket_buffer (see below) is maximum size which can be held in socket buffer (sk_rcvbuf). current_alloced is amount of data currently allocated (rx_count) A simple expression is given for which it will be examined after how many packets for above stated parameters we enter pressure state: We start by condition which has to be met in order to enter pressure state: socket_buffer < currently_alloced; currently_alloced is represented as size of sctp packets received so far and not yet delivered to userspace. x is the number of chunks/packets (since there is no bundling, and each chunk is delivered in separate packet, we can observe each chunk also as sctp packet, and what is important here, having its own sk_buff): socket_buffer < x*each_sctp_packet; each_sctp_packet is sctp chunk size + sizeof(struct sk_buff). socket_buffer is twice the amount of initially requested size of socket buffer, which is in case of sctp, twice the a_rwnd requested: 2*rwnd < x*(payload+sizeof(struc sk_buff)); sizeof(struct sk_buff) is 190 (3.13.0-rc4+). Above is stated that rwnd is 10000 and each payload size is 43 20000 < x(43+190); x > 20000/233; x ~> 84; After ~84 messages, pressure state is entered and 0 rwnd is advertised while received 84*43B ~= 3612B sctp data. This is why external observer notices sudden drop from 6474 to 0, as it will be now shown in example: IP A.34340 > B.12345: sctp (1) [INIT] [init tag: 1875509148] [rwnd: 81920] [OS: 10] [MIS: 65535] [init TSN: 1096057017] IP B.12345 > A.34340: sctp (1) [INIT ACK] [init tag: 3198966556] [rwnd: 10000] [OS: 10] [MIS: 10] [init TSN: 902132839] IP A.34340 > B.12345: sctp (1) [COOKIE ECHO] IP B.12345 > A.34340: sctp (1) [COOKIE ACK] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057017] [SID: 0] [SSEQ 0] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057017] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057018] [SID: 0] [SSEQ 1] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057018] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057019] [SID: 0] [SSEQ 2] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057019] [a_rwnd 9914] [#gap acks 0] [#dup tsns 0] <...> IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057098] [SID: 0] [SSEQ 81] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057098] [a_rwnd 6517] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057099] [SID: 0] [SSEQ 82] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057099] [a_rwnd 6474] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057100] [SID: 0] [SSEQ 83] [PPID 0x18] --> Sudden drop IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] At this point, rwnd_press stores current rwnd value so it can be later restored in sctp_assoc_rwnd_increase. This however doesn't happen as condition to start slowly increasing rwnd until rwnd_press is returned to rwnd is never met. This condition is not met since rwnd, after it hit 0, must first reach rwnd_press by adding amount which is read from userspace. Let us observe values in above example. Initial a_rwnd is 10000, pressure was hit when rwnd was ~6500 and the amount of actual sctp data currently waiting to be delivered to userspace is ~3500. When userspace starts to read, sctp_assoc_rwnd_increase will be blamed only for sctp data, which is ~3500. Condition is never met, and when userspace reads all data, rwnd stays on 3569. IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 1505] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 3010] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057101] [SID: 0] [SSEQ 84] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057101] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0] --> At this point userspace read everything, rwnd recovered only to 3569 IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057102] [SID: 0] [SSEQ 85] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057102] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0] Reproduction is straight forward, it is enough for sender to send packets of size less then sizeof(struct sk_buff) and receiver keeping them in its buffers. 2) Minute size window for associations sharing the same socket buffer In case multiple associations share the same socket, and same socket buffer (sctp.rcvbuf_policy == 0), different scenarios exist in which congestion on one of the associations can permanently drop rwnd of other association(s). Situation will be typically observed as one association suddenly having rwnd dropped to size of last packet received and never recovering beyond that point. Different scenarios will lead to it, but all have in common that one of the associations (let it be association from 1)) nearly depleted socket buffer, and the other association blames socket buffer just for the amount enough to start the pressure. This association will enter pressure state, set rwnd_press and announce 0 rwnd. When data is read by userspace, similar situation as in 1) will occur, rwnd will increase just for the size read by userspace but rwnd_press will be high enough so that association doesn't have enough credit to reach rwnd_press and restore to previous state. This case is special case of 1), being worse as there is, in the worst case, only one packet in buffer for which size rwnd will be increased. Consequence is association which has very low maximum rwnd ('minute size', in our case down to 43B - size of packet which caused pressure) and as such unusable. Scenario happened in the field and labs frequently after congestion state (link breaks, different probabilities of packet drop, packet reordering) and with scenario 1) preceding. Here is given a deterministic scenario for reproduction: >From node A establish two associations on the same socket, with rcvbuf_policy being set to share one common buffer (sctp.rcvbuf_policy == 0). On association 1 repeat scenario from 1), that is, bring it down to 0 and restore up. Observe scenario 1). Use small payload size (here we use 43). Once rwnd is 'recovered', bring it down close to 0, as in just one more packet would close it. This has as a consequence that association number 2 is able to receive (at least) one more packet which will bring it in pressure state. E.g. if association 2 had rwnd of 10000, packet received was 43, and we enter at this point into pressure, rwnd_press will have 9957. Once payload is delivered to userspace, rwnd will increase for 43, but conditions to restore rwnd to original state, just as in 1), will never be satisfied. --> Association 1, between A.y and B.12345 IP A.55915 > B.12345: sctp (1) [INIT] [init tag: 836880897] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 4032536569] IP B.12345 > A.55915: sctp (1) [INIT ACK] [init tag: 2873310749] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3799315613] IP A.55915 > B.12345: sctp (1) [COOKIE ECHO] IP B.12345 > A.55915: sctp (1) [COOKIE ACK] --> Association 2, between A.z and B.12346 IP A.55915 > B.12346: sctp (1) [INIT] [init tag: 534798321] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 2099285173] IP B.12346 > A.55915: sctp (1) [INIT ACK] [init tag: 516668823] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3676403240] IP A.55915 > B.12346: sctp (1) [COOKIE ECHO] IP B.12346 > A.55915: sctp (1) [COOKIE ACK] --> Deplete socket buffer by sending messages of size 43B over association 1 IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315613] [SID: 0] [SSEQ 0] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315613] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] <...> IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315696] [a_rwnd 6388] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315697] [SID: 0] [SSEQ 84] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315697] [a_rwnd 6345] [#gap acks 0] [#dup tsns 0] --> Sudden drop on 1 IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315698] [SID: 0] [SSEQ 85] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315698] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Here userspace read, rwnd 'recovered' to 3698, now deplete again using association 1 so there is place in buffer for only one more packet IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315799] [SID: 0] [SSEQ 186] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315799] [a_rwnd 86] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315800] [SID: 0] [SSEQ 187] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 43] [#gap acks 0] [#dup tsns 0] --> Socket buffer is almost depleted, but there is space for one more packet, send them over association 2, size 43B IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403240] [SID: 0] [SSEQ 0] [PPID 0x18] IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403240] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Immediate drop IP A.60995 > B.12346: sctp (1) [SACK] [cum ack 387491510] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Read everything from the socket, both association recover up to maximum rwnd they are capable of reaching, note that association 1 recovered up to 3698, and association 2 recovered only to 43 IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 1548] [#gap acks 0] [#dup tsns 0] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 3053] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315801] [SID: 0] [SSEQ 188] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315801] [a_rwnd 3698] [#gap acks 0] [#dup tsns 0] IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403241] [SID: 0] [SSEQ 1] [PPID 0x18] IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403241] [a_rwnd 43] [#gap acks 0] [#dup tsns 0] A careful reader might wonder why it is necessary to reproduce 1) prior reproduction of 2). It is simply easier to observe when to send packet over association 2 which will push association into the pressure state. Proposed solution: Both problems share the same root cause, and that is improper scaling of socket buffer with rwnd. Solution in which sizeof(sk_buff) is taken into concern while calculating rwnd is not possible due to fact that there is no linear relationship between amount of data blamed in increase/decrease with IP packet in which payload arrived. Even in case such solution would be followed, complexity of the code would increase. Due to nature of current rwnd handling, slow increase (in sctp_assoc_rwnd_increase) of rwnd after pressure state is entered is rationale, but it gives false representation to the sender of current buffer space. Furthermore, it implements additional congestion control mechanism which is defined on implementation, and not on standard basis. Proposed solution simplifies whole algorithm having on mind definition from rfc: o Receiver Window (rwnd): This gives the sender an indication of the space available in the receiver's inbound buffer. Core of the proposed solution is given with these lines: sctp_assoc_rwnd_update: if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0) asoc->rwnd = (asoc->base.sk->sk_rcvbuf - rx_count) >> 1; else asoc->rwnd = 0; We advertise to sender (half of) actual space we have. Half is in the braces depending whether you would like to observe size of socket buffer as SO_RECVBUF or twice the amount, i.e. size is the one visible from userspace, that is, from kernelspace. In this way sender is given with good approximation of our buffer space, regardless of the buffer policy - we always advertise what we have. Proposed solution fixes described problems and removes necessity for rwnd restoration algorithm. Finally, as proposed solution is simplification, some lines of code, along with some bytes in struct sctp_association are saved. Version 2 of the patch addressed comments from Vlad. Name of the function is set to be more descriptive, and two parts of code are changed, in one removing the superfluous call to sctp_assoc_rwnd_update since call would not result in update of rwnd, and the other being reordering of the code in a way that call to sctp_assoc_rwnd_update updates rwnd. Version 3 corrected change introduced in v2 in a way that existing function is not reordered/copied in line, but it is correctly called. Thanks Vlad for suggesting. Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nsn.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-16ipv4: distinguish EHOSTUNREACH from the ENETUNREACHDuan Jiong
since commit 251da413("ipv4: Cache ip_error() routes even when not forwarding."), the counter IPSTATS_MIB_INADDRERRORS can't work correctly, because the value of err was always set to ENETUNREACH. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-16hyperv: Fix the carrier status settingHaiyang Zhang
Without this patch, the "cat /sys/class/net/ethN/operstate" shows "unknown", and "ethtool ethN" shows "Link detected: yes", when VM boots up with or without vNIC connected. This patch fixed the problem. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-16dccp: re-enable debug macroGerrit Renker
dccp tfrc: revert This reverts 6aee49c558de ("dccp: make local variable static") since the variable tfrc_debug is referenced by the tfrc_pr_debug(fmt, ...) macro when TFRC debugging is enabled. If it is enabled, use of the macro produces a compilation error. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-16ext4: don't leave i_crtime.tv_sec uninitializedTheodore Ts'o
If the i_crtime field is not present in the inode, don't leave the field uninitialized. Fixes: ef7f38359 ("ext4: Add nanosecond timestamps") Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Tested-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-02-17powerpc/eeh: Disable EEH on rebootGavin Shan
We possiblly detect EEH errors during reboot, particularly in kexec path, but it's impossible for device drivers and EEH core to handle or recover them properly. The patch registers one reboot notifier for EEH and disable EEH subsystem during reboot. That means the EEH errors is going to be cleared by hardware reset or second kernel during early stage of PCI probe. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-17powerpc/eeh: Cleanup on eeh_subsystem_enabledGavin Shan
The patch cleans up variable eeh_subsystem_enabled so that we needn't refer the variable directly from external. Instead, we will use function eeh_enabled() and eeh_set_enable() to operate the variable. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>