summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-06-12ARM: OMAP4: hwmod data: add DSS data backTomi Valkeinen
Commit 3b9b10151c6838af52244cec4af41a938bb5b7ec (ARM: OMAP4: hwmod data: Clean up the data file) removes hwmod data for omap4, including DSS data. DSS has not yet been converted to DT, so the hwmod data is still needed. This patch adds back the DSS parts of the hwmod data. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2013-06-12HID: multitouch: prevent memleak with the allocated nameBenjamin Tissoires
mt_free_input_name() was never called during .remove(): hid_hw_stop() removes the hid_input items in hdev->inputs, and so the list is therefore empty after the call. In the end, we never free the special names that has been allocated during .probe(). Restore the original name before freeing it to avoid acessing already freed pointer. This fixes a regression introduced by 49a5a827a ("HID: multitouch: append " Pen" to the name of the stylus input") Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-06-12team: fix checks in team_get_first_port_txable_rcu()Jiri Pirko
should be checked if "cur" is txable, not "port". Introduced by commit 6e88e1357c "team: use function team_port_txable() for determing enabled and up port" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12team: move add to port list before port enablementJiri Pirko
team_port_enable() adds port to port_hashlist. Reader sees port in team_get_port_by_index_rcu() and returns it, but team_get_first_port_txable_rcu() tries to go through port_list, where the port is not inserted yet -> NULL pointer dereference. Fix this by reordering port_list and port_hashlist insertion. Panic is easily triggeable when txing packets and adding/removing port in a loop. Introduced by commit 3d249d4c "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12team: check return value of team_get_port_by_index_rcu() for NULLJiri Pirko
team_get_port_by_index_rcu() might return NULL due to race between port removal and skb tx path. Panic is easily triggeable when txing packets and adding/removing port in a loop. introduced by commit 3d249d4ca "net: introduce ethernet teaming device" and commit 753f993911b "team: introduce random mode" (for random mode) Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12tuntap: set SOCK_ZEROCOPY flag during openJason Wang
Commit 54f968d6efdbf7dec36faa44fc11f01b0e4d1990 (tuntap: move socket to tun_file) forgets to set SOCK_ZEROCOPY flag, which will prevent vhost_net from doing zercopy w/ tap. This patch fixes this by setting it during file open. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11Merge branch 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvmeLinus Torvalds
Pull NVMe fixes from Matthew Wilcox. * 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme: NVMe: Add MSI support NVMe: Use dma_set_mask() correctly Return the result from user admin command IOCTL even in case of failure NVMe: Do not cancel command multiple times NVMe: fix error return code in nvme_submit_bio_queue() NVMe: check for integer overflow in nvme_map_user_pages() MAINTAINERS: update NVM EXPRESS DRIVER file list NVMe: Fix a signedness bug in nvme_trans_modesel_get_mp NVMe: Remove redundant version.h header include
2013-06-11Merge tag 'fixes-3.10-4' of git://git.infradead.org/users/jcooper/linux into ↵Olof Johansson
fixes From Jason Cooper, mvebu fixes for v3.10 round 4: - mvebu - fix PCIe ranges property so NOR flash is visible - kirkwood - fix identification of 88f6282 so MPPs can be set correctly * tag 'fixes-3.10-4' of git://git.infradead.org/users/jcooper/linux: arm: mvebu: armada-xp-{gp,openblocks-ax3-4}: specify PCIe range ARM: Kirkwood: handle mv88f6282 cpu in __kirkwood_variant(). Signed-off-by: Olof Johansson <olof@lixom.net>
2013-06-11usb: chipidea: fix id change handlingAlexander Shishkin
Re-enable chipidea irq even if there's no role changing to do. This is a problem since b183c19f ("USB: chipidea: re-order irq handling to avoid unhandled irqs"); when it manifests, chipidea irq gets disabled for good. Cc: stable@vger.kernel.org # v3.7 Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-11usb: chipidea: fix no transceiver caseAlexander Shishkin
Since usb phy code does return ERR_PTR() values, make sure that we don't end up dereferencing them. This is a problem, for example, on platforms that don't register a phy for chipidea since b7fa5c2a ("usb: phy: return -ENXIO when PHY layer isn't enabled"). Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-11Merge branch 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm bugfixes from Gleb Natapov: "There is one more fix for MIPS KVM ABI here, MIPS and PPC build breakage fixes and a couple of PPC bug fixes" * 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit() kvm/ppc/booke: Hold srcu lock when calling gfn functions kvm/ppc/booke64: Disable e6500 support kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REG kvm: Add definition of KVM_REG_MIPS KVM: add kvm_para_available to asm-generic/kvm_para.h
2013-06-11tracing: Fix outputting formats of x86-tsc and counter when use trace_clockYoshihiro YUNOMAE
Outputting formats of x86-tsc and counter should be a raw format, but after applying the patch(2b6080f28c7cc3efc8625ab71495aae89aeb63a0), the format was changed to nanosec. This is because the global variable trace_clock_id was used. When we use multiple buffers, clock_id of each sub-buffer should be used. Then, this patch uses tr->clock_id instead of the global variable trace_clock_id. [ Basically, this fixes a regression where the multibuffer code changed the trace_clock file to update tr->clock_id but the traces still use the old global trace_clock_id variable, negating the file's effect. The global trace_clock_id variable is obsolete and removed. - SR ] Link: http://lkml.kernel.org/r/20130423013239.22334.7394.stgit@yunodevel Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-06-11netlink: fix error propagation in netlink_mmap()Patrick McHardy
Return the error if something went wrong instead of unconditionally returning 0. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11net: sctp: fix NULL pointer dereference in socket destructionDaniel Borkmann
While stress testing sctp sockets, I hit the following panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] PGD 7cead067 PUD 7ce76067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: sctp(F) libcrc32c(F) [...] CPU: 7 PID: 2950 Comm: acc Tainted: GF 3.10.0-rc2+ #1 Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011 task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000 RIP: 0010:[<ffffffffa0490c4e>] [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] RSP: 0018:ffff88007b569e08 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200 RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000 RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00 FS: 00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e 0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e Call Trace: [<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp] [<ffffffff8145b60e>] sk_common_release+0x1e/0xf0 [<ffffffff814df36e>] inet_create+0x2ae/0x350 [<ffffffff81455a6f>] __sock_create+0x11f/0x240 [<ffffffff81455bf0>] sock_create+0x30/0x40 [<ffffffff8145696c>] SyS_socket+0x4c/0xc0 [<ffffffff815403be>] ? do_page_fault+0xe/0x10 [<ffffffff8153cb32>] ? page_fault+0x22/0x30 [<ffffffff81544e02>] system_call_fastpath+0x16/0x1b Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48> 8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48 RIP [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] RSP <ffff88007b569e08> CR2: 0000000000000020 ---[ end trace e0d71ec1108c1dd9 ]--- I did not hit this with the lksctp-tools functional tests, but with a small, multi-threaded test program, that heavily allocates, binds, listens and waits in accept on sctp sockets, and then randomly kills some of them (no need for an actual client in this case to hit this). Then, again, allocating, binding, etc, and then killing child processes. This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable'' is set. The cause for that is actually very simple: in sctp_endpoint_init() we enter the path of sctp_auth_init_hmacs(). There, we try to allocate our crypto transforms through crypto_alloc_hash(). In our scenario, it then can happen that crypto_alloc_hash() fails with -EINTR from crypto_larval_wait(), thus we bail out and release the socket via sk_common_release(), sctp_destroy_sock() and hit the NULL pointer dereference as soon as we try to access members in the endpoint during sctp_endpoint_free(), since endpoint at that time is still NULL. Now, if we have that case, we do not need to do any cleanup work and just leave the destruction handler. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11vhost: fix ubuf_info cleanupMichael S. Tsirkin
vhost_net_clear_ubuf_info didn't clear ubuf_info after kfree, this could trigger double free. Fix this and simplify this code to make it more robust: make sure ubuf info is always freed through vhost_net_clear_ubuf_info. Reported-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11vhost: check owner before we overwrite ubuf_infoMichael S. Tsirkin
If device has an owner, we shouldn't touch ubuf_info since it might be in use. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11qmi_wwan/cdc_ether: let qmi_wwan handle the Huawei E1820Bjørn Mork
Another QMI speaking Qualcomm based device, which should be driven by qmi_wwan, while cdc_ether should ignore it. Like on other Huawei devices, the wwan function can appear either as a single vendor specific interface or as a CDC ECM class function using separate control and data interfaces. The ECM control interface protocol is 0xff, likely in an attempt to indicate that vendor specific management is required. In addition to the near standard CDC class, Huawei also add vendor specific AT management commands to their firmwares. This is probably an attempt to support non-Windows systems using standard class drivers. Unfortunately, this part of the firmware is often buggy. Linux is much better off using whatever native vendor specific management protocol the device offers, and Windows uses, whenever possible. This means QMI in the case of Qualcomm based devices. The E1820 has been verified to work fine with QMI. Matching on interface number is necessary to distiguish the wwan function from serial functions in the single interface mode, as both function types will have class/subclass/function set to ff/ff/ff. The control interface number does not change in CDC ECM mode, so the interface number matching rule is sufficient to handle both modes. The cdc_ether blacklist entry is only relevant in CDC ECM mode, but using a similar interface number based rule helps document this as a transfer from one driver to another. Other Huawei 02/06/ff devices are left with the cdc_ether driver because we do not know whether they are based on Qualcomm chips. The Huawei specific AT command management is known to be somewhat hardware independent, and their usage of these class codes may also be independent of the modem hardware. Reported-by: Graham Inggs <graham.inggs@uct.ac.za> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11Merge tag 'drm-intel-fixes-2013-06-11' of ↵Dave Airlie
git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Daniel writes: Just tiny regression fixes here: - Two fixes to fix sdvo hotplug which broke in the hpd storm detection work. - One fix to patch-up the sdvo lvds regression fixer from the last pull - we need to prefer the vbt mode over edid modes. * tag 'drm-intel-fixes-2013-06-11' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: prefer VBT modes for SVDO-LVDS over EDID drm/i915: Enable hotplug interrupts after querying hw capabilities. drm/i915: Fix hotplug interrupt enabling for SDVOC
2013-06-11sh_eth: fix result of sh_eth_check_reset() on timeoutSergei Shtylyov
When the first loop in sh_eth_check_reset() runs to its end, 'cnt' is 0, so the following check for 'cnt < 0' fails to catch the timeout. Fix the condition in this check, so that the timeout is actually reported. While at it, fix the grammar in the failure message... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11net/ti davinci_mdio: don't hold a spin lock while calling pm_runtimeSebastian Siewior
was playing with suspend and run into this: |BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:891 |in_atomic(): 1, irqs_disabled(): 0, pid: 1963, name: bash |6 locks held by bash/1963: |CPU: 0 PID: 1963 Comm: bash Not tainted 3.10.0-rc4+ #50 |[<c0014fdc>] (unwind_backtrace+0x0/0xf8) from [<c0011da4>] (show_stack+0x10/0x14) |[<c0011da4>] (show_stack+0x10/0x14) from [<c02e8680>] (__pm_runtime_idle+0xa4/0xac) |[<c02e8680>] (__pm_runtime_idle+0xa4/0xac) from [<c0341158>] (davinci_mdio_suspend+0x6c/0x9c) |[<c0341158>] (davinci_mdio_suspend+0x6c/0x9c) from [<c02e0628>] (platform_pm_suspend+0x2c/0x54) |[<c02e0628>] (platform_pm_suspend+0x2c/0x54) from [<c02e52bc>] (dpm_run_callback.isra.3+0x2c/0x64) |[<c02e52bc>] (dpm_run_callback.isra.3+0x2c/0x64) from [<c02e57e4>] (__device_suspend+0x100/0x22c) |[<c02e57e4>] (__device_suspend+0x100/0x22c) from [<c02e67e8>] (dpm_suspend+0x68/0x230) |[<c02e67e8>] (dpm_suspend+0x68/0x230) from [<c0072a20>] (suspend_devices_and_enter+0x68/0x350) |[<c0072a20>] (suspend_devices_and_enter+0x68/0x350) from [<c0072f18>] (pm_suspend+0x210/0x24c) |[<c0072f18>] (pm_suspend+0x210/0x24c) from [<c0071c74>] (state_store+0x6c/0xbc) |[<c0071c74>] (state_store+0x6c/0xbc) from [<c02714dc>] (kobj_attr_store+0x14/0x20) |[<c02714dc>] (kobj_attr_store+0x14/0x20) from [<c01341a0>] (sysfs_write_file+0x16c/0x19c) |[<c01341a0>] (sysfs_write_file+0x16c/0x19c) from [<c00ddfe4>] (vfs_write+0xb4/0x190) |[<c00ddfe4>] (vfs_write+0xb4/0x190) from [<c00de3a4>] (SyS_write+0x3c/0x70) |[<c00de3a4>] (SyS_write+0x3c/0x70) from [<c000e2c0>] (ret_fast_syscall+0x0/0x48) I don't see a reason why the pm_runtime call must be under the lock. Further I don't understand why this is a spinlock and not mutex. Cc: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()Scott Wood
EE is hard-disabled on entry to kvmppc_handle_exit(), so call hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled is unset. Without this, we get warnings such as arch/powerpc/kernel/time.c:300, and sometimes host kernel hangs. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11kvm/ppc/booke: Hold srcu lock when calling gfn functionsScott Wood
KVM core expects arch code to acquire the srcu lock when calling gfn_to_memslot and similar functions. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11kvm/ppc/booke64: Disable e6500 supportScott Wood
The previous patch made 64-bit booke KVM build again, but Altivec support is still not complete, and we can't prevent the guest from turning on Altivec (which can corrupt host state until state save/restore is implemented). Disable e6500 on KVM until this is fixed. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakageMihai Caraman
Interrupt numbers defined for Book3E follows IVORs definition. Align BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this rule which also fixes the build breakage. IVORs 32 and 33 are shared so reflect this in the interrupts naming. This fixes a build break for 64-bit booke KVM. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11ARM: SAMSUNG: pm: Adjust for pinctrl- and DT-enabled platformsTomasz Figa
This patch makes legacy code on suspend/resume path being executed conditionally, on non-DT platforms only, to fix suspend/resume of DT-enabled systems, for which the code is inappropriate. Signed-off-by: Tomasz Figa <t.figa@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> [olof: add #include <linux/of.h>] Signed-off-by: Olof Johansson <olof@lixom.net>
2013-06-11mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REGDavid Daney
The API requires that the GET_ONE_REG and SET_ONE_REG ioctls have this extra information encoded in the register identifiers. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11kvm: Add definition of KVM_REG_MIPSDavid Daney
We use 0x7000000000000000ULL as 0x6000000000000000ULL is reserved for ARM64. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11ARM: prima2: fix incorrect panic usageHaojian Zhuang
In prima2, some functions of checking DT is registered in initcall level. If it doesn't match the compatible name of sirf, kernel will panic. It blocks the usage of multiplatform on other verndor. The error message is in below. Knic - not syncing: unable to find compatible pwrc node in dtb CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc3-00006-gd7f26ea-dirty #86 [<c0013adc>] (unwind_backtrace+0x0/0xf8) from [<c0011430>] (show_stack+0x10/0x1) [<c0011430>] (show_stack+0x10/0x14) from [<c026f724>] (panic+0x90/0x1e8) [<c026f724>] (panic+0x90/0x1e8) from [<c03267fc>] (sirfsoc_of_pwrc_init+0x24/0x) [<c03267fc>] (sirfsoc_of_pwrc_init+0x24/0x58) from [<c0320864>] (do_one_initcal) [<c0320864>] (do_one_initcall+0x90/0x150) from [<c0320a20>] (kernel_init_freeab) [<c0320a20>] (kernel_init_freeable+0xfc/0x1c4) from [<c026b9e8>] (kernel_init+0) [<c026b9e8>] (kernel_init+0x8/0xe4) from [<c000e158>] (ret_from_fork+0x14/0x3c) Signen-off-by: Haojian Zhuang <haojian.zhuang@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2013-06-10sock_diag: fix filter code sent to userspaceNicolas Dichtel
Filters need to be translated to real BPF code for userland, like SO_GETFILTER. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10Fix lockup related to stop_machine being stuck in __do_softirq.Ben Greear
The stop machine logic can lock up if all but one of the migration threads make it through the disable-irq step and the one remaining thread gets stuck in __do_softirq. The reason __do_softirq can hang is that it has a bail-out based on jiffies timeout, but in the lockup case, jiffies itself is not incremented. To work around this, re-add the max_restart counter in __do_irq and stop processing irqs after 10 restarts. Thanks to Tejun Heo and Rusty Russell and others for helping me track this down. This was introduced in 3.9 by commit c10d73671ad3 ("softirq: reduce latencies"). It may be worth looking into ath9k to see if it has issues with its irq handler at a later date. The hang stack traces look something like this: ------------[ cut here ]------------ WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7() Watchdog detected hard LOCKUP on cpu 2 Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11 Call Trace: <NMI> warn_slowpath_common+0x85/0x9f warn_slowpath_fmt+0x46/0x48 watchdog_overflow_callback+0x9c/0xa7 __perf_event_overflow+0x137/0x1cb perf_event_overflow+0x14/0x16 intel_pmu_handle_irq+0x2dc/0x359 perf_event_nmi_handler+0x19/0x1b nmi_handle+0x7f/0xc2 do_nmi+0xbc/0x304 end_repeat_nmi+0x1e/0x2e <<EOE>> cpu_stopper_thread+0xae/0x162 smpboot_thread_fn+0x258/0x260 kthread+0xc7/0xcf ret_from_fork+0x7c/0xb0 ---[ end trace 4947dfa9b0a4cec3 ]--- BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17] Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] irq event stamp: 835637905 hardirqs last enabled at (835637904): __do_softirq+0x9f/0x257 hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80 softirqs last enabled at (5654720): __do_softirq+0x1ff/0x257 softirqs last disabled at (5654725): irq_exit+0x5f/0xbb CPU 1 Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M. RIP: tasklet_hi_action+0xf0/0xf0 Process migration/1 Call Trace: <IRQ> __do_softirq+0x117/0x257 irq_exit+0x5f/0xbb smp_apic_timer_interrupt+0x8a/0x98 apic_timer_interrupt+0x72/0x80 <EOI> printk+0x4d/0x4f stop_machine_cpu_stop+0x22c/0x274 cpu_stopper_thread+0xae/0x162 smpboot_thread_fn+0x258/0x260 kthread+0xc7/0xcf ret_from_fork+0x7c/0xb0 Signed-off-by: Ben Greear <greearb@candelatech.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Pekka Riikonen <priikone@iki.fi> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-10Merge tag '9p-3.10-bug-fix-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull net/9p bug fix from Eric Van Hensbergen: "zero copy error fix" * tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: Handle error in zero copy request correctly for 9p2000.u
2013-06-11Merge branch 'gma500-fixes' of git://github.com/patjak/drm-gma500 into drm-fixesDave Airlie
Patrik writes: Two fixes for memory leaks split into Cedarview and Poulsbo versions, and a fix for properly setting the pipe base when using fbdev. It's on my todo-list to start unifying the chips since they are very similar, but until then I'd like to split them up in case there are side-effects on Cedarview that I cannot currently test. airled: Verified pull from github matches what I expected. * 'gma500-fixes' of git://github.com/patjak/drm-gma500: drm/gma500/cdv: Fix cursor gem obj referencing on cdv drm/gma500/psb: Fix cursor gem obj referencing on psb drm/gma500/cdv: Unpin framebuffer on crtc disable drm/gma500/psb: Unpin framebuffer on crtc disable drm/gma500: Add fb gtt offset to fb base
2013-06-10tuntap: fix a possible race between queue selection and changing queuesJason Wang
Complier may generate codes that re-read the tun->numqueues during tun_select_queue(). This may be a race if vlan->numqueues were changed in the same time and can lead unexpected result (e.g. very huge value). We need prevent the compiler from generating such codes by adding an ACCESS_ONCE() to make sure tun->numqueues were only read once. Bug were introduced by commit c8d68e6be1c3b242f1c598595830890b65cea64a (tuntap: multiqueue support). Reported-by: Michael S. Tsirkin <mst@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10vhost_net: clear msg.control for non-zerocopy case during txJason Wang
When we decide not use zero-copy, msg.control should be set to NULL otherwise macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs wrongly. Bug were introduced by commit cedb9bdce099206290a2bdd02ce47a7b253b6a84 (vhost-net: skip head management if no outstanding). This solves the following warnings: WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]() Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun] CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566 Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011 ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48 ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0 ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58 Call Trace: [<ffffffff81796b73>] dump_stack+0x19/0x1e [<ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20 [<ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net] [<ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net] [<ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffff81061f46>] kthread+0xc6/0xd0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff817a1aec>] ret_from_fork+0x7c/0xb0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10Modify UEFI anti-bricking codeMatthew Garrett
This patch reworks the UEFI anti-bricking code, including an effective reversion of cc5a080c and 31ff2f20. It turns out that calling QueryVariableInfo() from boot services results in some firmware implementations jumping to physical addresses even after entering virtual mode, so until we have 1:1 mappings for UEFI runtime space this isn't going to work so well. Reverting these gets us back to the situation where we'd refuse to create variables on some systems because they classify deleted variables as "used" until the firmware triggers a garbage collection run, which they won't do until they reach a lower threshold. This results in it being impossible to install a bootloader, which is unhelpful. Feedback from Samsung indicates that the firmware doesn't need more than 5KB of storage space for its own purposes, so that seems like a reasonable threshold. However, there's still no guarantee that a platform will attempt garbage collection merely because it drops below this threshold. It seems that this is often only triggered if an attempt to write generates a genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to create a variable larger than the remaining space. This should fail, but if it somehow succeeds we can then immediately delete it. I've tested this on the UEFI machines I have available, but I don't have a Samsung and so can't verify that it avoids the bricking problem. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ] Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-06-10rcu: Fix deadlock with CPU hotplug, RCU GP init, and timer migrationPaul E. McKenney
In Steven Rostedt's words: > I've been debugging the last couple of days why my tests have been > locking up. One of my tracing tests, runs all available tracers. The > lockup always happened with the mmiotrace, which is used to trace > interactions between priority drivers and the kernel. But to do this > easily, when the tracer gets registered, it disables all but the boot > CPUs. The lockup always happened after it got done disabling the CPUs. > > Then I decided to try this: > > while :; do > for i in 1 2 3; do > echo 0 > /sys/devices/system/cpu/cpu$i/online > done > for i in 1 2 3; do > echo 1 > /sys/devices/system/cpu/cpu$i/online > done > done > > Well, sure enough, that locked up too, with the same users. Doing a > sysrq-w (showing all blocked tasks): > > [ 2991.344562] task PC stack pid father > [ 2991.344562] rcu_preempt D ffff88007986fdf8 0 10 2 0x00000000 > [ 2991.344562] ffff88007986fc98 0000000000000002 ffff88007986fc48 0000000000000908 > [ 2991.344562] ffff88007986c280 ffff88007986ffd8 ffff88007986ffd8 00000000001d3c80 > [ 2991.344562] ffff880079248a40 ffff88007986c280 0000000000000000 00000000fffd4295 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81541750>] schedule_timeout+0xbc/0xf9 > [ 2991.344562] [<ffffffff8154bec0>] ? ftrace_call+0x5/0x2f > [ 2991.344562] [<ffffffff81049513>] ? cascade+0xa8/0xa8 > [ 2991.344562] [<ffffffff815417ab>] schedule_timeout_uninterruptible+0x1e/0x20 > [ 2991.344562] [<ffffffff810c980c>] rcu_gp_kthread+0x502/0x94b > [ 2991.344562] [<ffffffff81062791>] ? __init_waitqueue_head+0x50/0x50 > [ 2991.344562] [<ffffffff810c930a>] ? rcu_gp_fqs+0x64/0x64 > [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9 > [ 2991.344562] [<ffffffff81091e31>] ? lock_release_holdtime.part.23+0x4e/0x55 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] kworker/0:1 D ffffffff81a30680 0 47 2 0x00000000 > [ 2991.344562] Workqueue: events cpuset_hotplug_workfn > [ 2991.344562] ffff880078dbbb58 0000000000000002 0000000000000006 00000000000000d8 > [ 2991.344562] ffff880078db8100 ffff880078dbbfd8 ffff880078dbbfd8 00000000001d3c80 > [ 2991.344562] ffff8800779ca5c0 ffff880078db8100 ffffffff81541fcf 0000000000000000 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24 > [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40 > [ 2991.344562] [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff810af7e6>] rebuild_sched_domains_locked+0x6e/0x3a8 > [ 2991.344562] [<ffffffff810b0ec6>] rebuild_sched_domains+0x1c/0x2a > [ 2991.344562] [<ffffffff810b109b>] cpuset_hotplug_workfn+0x1c7/0x1d3 > [ 2991.344562] [<ffffffff810b0ed9>] ? cpuset_hotplug_workfn+0x5/0x1d3 > [ 2991.344562] [<ffffffff81058e07>] process_one_work+0x2d4/0x4d1 > [ 2991.344562] [<ffffffff81058d3a>] ? process_one_work+0x207/0x4d1 > [ 2991.344562] [<ffffffff8105964c>] worker_thread+0x2e7/0x3b5 > [ 2991.344562] [<ffffffff81059365>] ? rescuer_thread+0x332/0x332 > [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] bash D ffffffff81a4aa80 0 2618 2612 0x10000000 > [ 2991.344562] ffff8800379abb58 0000000000000002 0000000000000006 0000000000000c2c > [ 2991.344562] ffff880077fea140 ffff8800379abfd8 ffff8800379abfd8 00000000001d3c80 > [ 2991.344562] ffff8800779ca5c0 ffff880077fea140 ffffffff81541fcf 0000000000000000 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24 > [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40 > [ 2991.344562] [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff81091c99>] ? __lock_is_held+0x32/0x53 > [ 2991.344562] [<ffffffff81548912>] notifier_call_chain+0x6b/0x98 > [ 2991.344562] [<ffffffff810671fd>] __raw_notifier_call_chain+0xe/0x10 > [ 2991.344562] [<ffffffff8103cf64>] __cpu_notify+0x20/0x32 > [ 2991.344562] [<ffffffff8103cf8d>] cpu_notify_nofail+0x17/0x36 > [ 2991.344562] [<ffffffff815225de>] _cpu_down+0x154/0x259 > [ 2991.344562] [<ffffffff81522710>] cpu_down+0x2d/0x3a > [ 2991.344562] [<ffffffff81526351>] store_online+0x4e/0xe7 > [ 2991.344562] [<ffffffff8134d764>] dev_attr_store+0x20/0x22 > [ 2991.344562] [<ffffffff811b3c5f>] sysfs_write_file+0x108/0x144 > [ 2991.344562] [<ffffffff8114c5ef>] vfs_write+0xfd/0x158 > [ 2991.344562] [<ffffffff8114c928>] SyS_write+0x5c/0x83 > [ 2991.344562] [<ffffffff8154c494>] tracesys+0xdd/0xe2 > > As well as held locks: > > [ 3034.728033] Showing all locks held in the system: > [ 3034.728033] 1 lock held by rcu_preempt/10: > [ 3034.728033] #0: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff810c9471>] rcu_gp_kthread+0x167/0x94b > [ 3034.728033] 4 locks held by kworker/0:1/47: > [ 3034.728033] #0: (events){.+.+.+}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1 > [ 3034.728033] #1: (cpuset_hotplug_work){+.+.+.}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1 > [ 3034.728033] #2: (cpuset_mutex){+.+.+.}, at: [<ffffffff810b0ec1>] rebuild_sched_domains+0x17/0x2a > [ 3034.728033] #3: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50 > [ 3034.728033] 1 lock held by mingetty/2563: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2565: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2569: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2572: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2575: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 7 locks held by bash/2618: > [ 3034.728033] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff8114bc3f>] file_start_write+0x2a/0x2c > [ 3034.728033] #1: (&buffer->mutex#2){+.+.+.}, at: [<ffffffff811b3b93>] sysfs_write_file+0x3c/0x144 > [ 3034.728033] #2: (s_active#54){.+.+.+}, at: [<ffffffff811b3c3e>] sysfs_write_file+0xe7/0x144 > [ 3034.728033] #3: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff810217c2>] cpu_hotplug_driver_lock+0x17/0x19 > [ 3034.728033] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8103d196>] cpu_maps_update_begin+0x17/0x19 > [ 3034.728033] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103cfd8>] cpu_hotplug_begin+0x2c/0x6d > [ 3034.728033] #6: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e > [ 3034.728033] 1 lock held by bash/2980: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > > Things looked a little weird. Also, this is a deadlock that lockdep did > not catch. But what we have here does not look like a circular lock > issue: > > Bash is blocked in rcu_cpu_notify(): > > 1961 /* Exclude any attempts to start a new grace period. */ > 1962 mutex_lock(&rsp->onoff_mutex); > > > kworker is blocked in get_online_cpus(), which makes sense as we are > currently taking down a CPU. > > But rcu_preempt is not blocked on anything. It is simply sleeping in > rcu_gp_kthread (really rcu_gp_init) here: > > 1453 #ifdef CONFIG_PROVE_RCU_DELAY > 1454 if ((prandom_u32() % (rcu_num_nodes * 8)) == 0 && > 1455 system_state == SYSTEM_RUNNING) > 1456 schedule_timeout_uninterruptible(2); > 1457 #endif /* #ifdef CONFIG_PROVE_RCU_DELAY */ > > And it does this while holding the onoff_mutex that bash is waiting for. > > Doing a function trace, it showed me where it happened: > > [ 125.940066] rcu_pree-10 3.... 28384115273: schedule_timeout_uninterruptible <-rcu_gp_kthread > [...] > [ 125.940066] rcu_pree-10 3d..3 28384202439: sched_switch: prev_comm=rcu_preempt prev_pid=10 prev_prio=120 prev_state=D ==> next_comm=watchdog/3 next_pid=38 next_prio=120 > > The watchdog ran, and then: > > [ 125.940066] watchdog-38 3d..3 28384692863: sched_switch: prev_comm=watchdog/3 prev_pid=38 prev_prio=120 prev_state=P ==> next_comm=modprobe next_pid=2848 next_prio=118 > > Not sure what modprobe was doing, but shortly after that: > > [ 125.940066] modprobe-2848 3d..3 28385041749: sched_switch: prev_comm=modprobe prev_pid=2848 prev_prio=118 prev_state=R+ ==> next_comm=migration/3 next_pid=40 next_prio=0 > > Where the migration thread took down the CPU: > > [ 125.940066] migratio-40 3d..3 28389148276: sched_switch: prev_comm=migration/3 prev_pid=40 prev_prio=0 prev_state=P ==> next_comm=swapper/3 next_pid=0 next_prio=120 > > which finally did: > > [ 125.940066] <idle>-0 3...1 28389282142: arch_cpu_idle_dead <-cpu_startup_entry > [ 125.940066] <idle>-0 3...1 28389282548: native_play_dead <-arch_cpu_idle_dead > [ 125.940066] <idle>-0 3...1 28389282924: play_dead_common <-native_play_dead > [ 125.940066] <idle>-0 3...1 28389283468: idle_task_exit <-play_dead_common > [ 125.940066] <idle>-0 3...1 28389284644: amd_e400_remove_cpu <-play_dead_common > > > CPU 3 is now offline, the rcu_preempt thread that ran on CPU 3 is still > doing a schedule_timeout_uninterruptible() and it registered it's > timeout to the timer base for CPU 3. You would think that it would get > migrated right? The issue here is that the timer migration happens at > the CPU notifier for CPU_DEAD. The problem is that the rcu notifier for > CPU_DOWN is blocked waiting for the onoff_mutex to be released, which is > held by the thread that just put itself into a uninterruptible sleep, > that wont wake up until the CPU_DEAD notifier of the timer > infrastructure is called, which wont happen until the rcu notifier > finishes. Here's our deadlock! This commit breaks this deadlock cycle by substituting a shorter udelay() for the previous schedule_timeout_uninterruptible(), while at the same time increasing the probability of the delay. This maintains the intensity of the testing. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Steven Rostedt <rostedt@goodmis.org>
2013-06-10rcu: Don't call wakeup() with rcu_node structure ->lock heldSteven Rostedt
This commit fixes a lockdep-detected deadlock by moving a wake_up() call out from a rnp->lock critical section. Please see below for the long version of this story. On Tue, 2013-05-28 at 16:13 -0400, Dave Jones wrote: > [12572.705832] ====================================================== > [12572.750317] [ INFO: possible circular locking dependency detected ] > [12572.796978] 3.10.0-rc3+ #39 Not tainted > [12572.833381] ------------------------------------------------------- > [12572.862233] trinity-child17/31341 is trying to acquire lock: > [12572.870390] (rcu_node_0){..-.-.}, at: [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12572.878859] > but task is already holding lock: > [12572.894894] (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0 > [12572.903381] > which lock already depends on the new lock. > > [12572.927541] > the existing dependency chain (in reverse order) is: > [12572.943736] > -> #4 (&ctx->lock){-.-...}: > [12572.960032] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12572.968337] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12572.976633] [<ffffffff8113c987>] __perf_event_task_sched_out+0x2e7/0x5e0 > [12572.984969] [<ffffffff81088953>] perf_event_task_sched_out+0x93/0xa0 > [12572.993326] [<ffffffff816ea0bf>] __schedule+0x2cf/0x9c0 > [12573.001652] [<ffffffff816eacfe>] schedule_user+0x2e/0x70 > [12573.009998] [<ffffffff816ecd64>] retint_careful+0x12/0x2e > [12573.018321] > -> #3 (&rq->lock){-.-.-.}: > [12573.034628] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.042930] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.051248] [<ffffffff8108e6a7>] wake_up_new_task+0xb7/0x260 > [12573.059579] [<ffffffff810492f5>] do_fork+0x105/0x470 > [12573.067880] [<ffffffff81049686>] kernel_thread+0x26/0x30 > [12573.076202] [<ffffffff816cee63>] rest_init+0x23/0x140 > [12573.084508] [<ffffffff81ed8e1f>] start_kernel+0x3f1/0x3fe > [12573.092852] [<ffffffff81ed856f>] x86_64_start_reservations+0x2a/0x2c > [12573.101233] [<ffffffff81ed863d>] x86_64_start_kernel+0xcc/0xcf > [12573.109528] > -> #2 (&p->pi_lock){-.-.-.}: > [12573.125675] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.133829] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90 > [12573.141964] [<ffffffff8108e881>] try_to_wake_up+0x31/0x320 > [12573.150065] [<ffffffff8108ebe2>] default_wake_function+0x12/0x20 > [12573.158151] [<ffffffff8107bbf8>] autoremove_wake_function+0x18/0x40 > [12573.166195] [<ffffffff81085398>] __wake_up_common+0x58/0x90 > [12573.174215] [<ffffffff81086909>] __wake_up+0x39/0x50 > [12573.182146] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50 > [12573.190119] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0 > [12573.198023] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930 > [12573.205860] [<ffffffff8107a91d>] kthread+0xed/0x100 > [12573.213656] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0 > [12573.221379] > -> #1 (&rsp->gp_wq){..-.-.}: > [12573.236329] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.243783] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90 > [12573.251178] [<ffffffff810868f3>] __wake_up+0x23/0x50 > [12573.258505] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50 > [12573.265891] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0 > [12573.273248] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930 > [12573.280564] [<ffffffff8107a91d>] kthread+0xed/0x100 > [12573.287807] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0 Notice the above call chain. rcu_start_future_gp() is called with the rnp->lock held. Then it calls rcu_start_gp_advance, which does a wakeup. You can't do wakeups while holding the rnp->lock, as that would mean that you could not do a rcu_read_unlock() while holding the rq lock, or any lock that was taken while holding the rq lock. This is because... (See below). > [12573.295067] > -> #0 (rcu_node_0){..-.-.}: > [12573.309293] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0 > [12573.316568] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.323825] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.331081] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12573.338377] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0 > [12573.345648] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0 > [12573.352942] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0 > [12573.360211] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0 > [12573.367514] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10 > [12573.374816] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2 Notice the above trace. perf took its own ctx->lock, which can be taken while holding the rq lock. While holding this lock, it did a rcu_read_unlock(). The perf_lock_task_context() basically looks like: rcu_read_lock(); raw_spin_lock(ctx->lock); rcu_read_unlock(); Now, what looks to have happened, is that we scheduled after taking that first rcu_read_lock() but before taking the spin lock. When we scheduled back in and took the ctx->lock, the following rcu_read_unlock() triggered the "special" code. The rcu_read_unlock_special() takes the rnp->lock, which gives us a possible deadlock scenario. CPU0 CPU1 CPU2 ---- ---- ---- rcu_nocb_kthread() lock(rq->lock); lock(ctx->lock); lock(rnp->lock); wake_up(); lock(rq->lock); rcu_read_unlock(); rcu_read_unlock_special(); lock(rnp->lock); lock(ctx->lock); **** DEADLOCK **** > [12573.382068] > other info that might help us debug this: > > [12573.403229] Chain exists of: > rcu_node_0 --> &rq->lock --> &ctx->lock > > [12573.424471] Possible unsafe locking scenario: > > [12573.438499] CPU0 CPU1 > [12573.445599] ---- ---- > [12573.452691] lock(&ctx->lock); > [12573.459799] lock(&rq->lock); > [12573.467010] lock(&ctx->lock); > [12573.474192] lock(rcu_node_0); > [12573.481262] > *** DEADLOCK *** > > [12573.501931] 1 lock held by trinity-child17/31341: > [12573.508990] #0: (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0 > [12573.516475] > stack backtrace: > [12573.530395] CPU: 1 PID: 31341 Comm: trinity-child17 Not tainted 3.10.0-rc3+ #39 > [12573.545357] ffffffff825b4f90 ffff880219f1dbc0 ffffffff816e375b ffff880219f1dc00 > [12573.552868] ffffffff816dfa5d ffff880219f1dc50 ffff88023ce4d1f8 ffff88023ce4ca40 > [12573.560353] 0000000000000001 0000000000000001 ffff88023ce4d1f8 ffff880219f1dcc0 > [12573.567856] Call Trace: > [12573.575011] [<ffffffff816e375b>] dump_stack+0x19/0x1b > [12573.582284] [<ffffffff816dfa5d>] print_circular_bug+0x200/0x20f > [12573.589637] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0 > [12573.596982] [<ffffffff810918f5>] ? sched_clock_cpu+0xb5/0x100 > [12573.604344] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0 > [12573.611652] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0 > [12573.619030] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80 > [12573.626331] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0 > [12573.633671] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0 > [12573.640992] [<ffffffff811390ed>] ? perf_lock_task_context+0x7d/0x2d0 > [12573.648330] [<ffffffff810b429e>] ? put_lock_stats.isra.29+0xe/0x40 > [12573.655662] [<ffffffff813095a0>] ? delay_tsc+0x90/0xe0 > [12573.662964] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0 > [12573.670276] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0 > [12573.677622] [<ffffffff81139070>] ? __perf_event_enable+0x370/0x370 > [12573.684981] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0 > [12573.692358] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0 > [12573.699753] [<ffffffff8108cd9d>] ? get_parent_ip+0xd/0x50 > [12573.707135] [<ffffffff810b71fd>] ? trace_hardirqs_on_caller+0xfd/0x1c0 > [12573.714599] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10 > [12573.721996] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2 This commit delays the wakeup via irq_work(), which is what perf and ftrace use to perform wakeups in critical sections. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-06-10trace: Allow idle-safe tracepoints to be called from irqPaul E. McKenney
__DECLARE_TRACE_RCU() currently creates an _rcuidle() tracepoint which may safely be invoked from what RCU considers to be an idle CPU. However, these _rcuidle() tracepoints may -not- be invoked from the handler of an irq taken from idle, because rcu_idle_enter() zeroes RCU's nesting-level counter, so that the rcu_irq_exit() returning to idle will trigger a WARN_ON_ONCE(). This commit therefore substitutes rcu_irq_enter() for rcu_idle_exit() and rcu_irq_exit() for rcu_idle_enter() in order to make the _rcuidle() tracepoints usable from irq handlers as well as from process context. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2013-06-10Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== The following patchset contains four fixes for Netfilter and one fix for IPVS, they are: * Fix data leak to user-space via getsockopt IP_VS_SO_GET_DESTS, from Dan Carpenter. * Fix xt_TCPMSS if no TCP MSS is specified in syn packets, to avoid the violation of RFC879, from Phil Oester. * Fix incomplete dump of objects via nfnetlink_acct and nfnetlink_cttimeout, from myself. * Fix missing HW protocol in packets passed to user-space via NFQUEUE, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10Merge tag 'spi-v3.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few nasty issues, particularly a race with the interrupt controller in the xilinx driver, together with a couple of more minor fixes and a much needed move of the mailing list away from sourceforge." * tag 'spi-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: hspi: fixup long delay time spi: spi-xilinx: Remove ISR race condition spi: topcliff-pch: fix error return code in pch_spi_probe() spi: topcliff-pch: Pass correct pointer to free_irq() spi: Move mailing list to vger
2013-06-10Merge tag 'stable/for-linus-3.10-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two bug-fixes for regressions: - xen/tmem stopped working after a certain combination of modprobe/swapon was used - cpu online/offlining would trigger WARN_ON." * tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init set it. xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU.
2013-06-10Merge tag 'regmap-v3.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fixes from Mark Brown: "The biggest fix here is Lars-Peter's fix for custom locking callbacks which is pretty localised but important for those devices that use the feature. Otherwise we've got a couple of fairly small cleanups which would have been sent sooner were it not for letting Lars-Peter's patch soak for a while" * tag 'regmap-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: rbtree: Fixed node range check on sync regmap: regcache: Fixup locking for custom lock callbacks regmap: debugfs: Check return value of regmap_write()
2013-06-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto fixes from Herbert Xu: "This fixes a build problem in sahara and temporarily disables two new optimisations because of performance regressions until a permanent fix is ready" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: sahara - fix building as module crypto: blowfish - disable AVX2 implementation crypto: twofish - disable AVX2 implementation
2013-06-10USB: pl2303: fix device initialisation at openJohan Hovold
Do not use uninitialised termios data to determine when to configure the device at open. This also prevents stack data from leaking to userspace in the OOM error path. Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-10USB: spcp8x5: fix device initialisation at openJohan Hovold
Do not use uninitialised termios data to determine when to configure the device at open. Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-10USB: f81232: fix device initialisation at openJohan Hovold
Do not use uninitialised termios data to determine when to configure the device at open. This also prevents stack data from leaking to userspace. Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-10MIPS: ftrace: Add missing CONFIG_DYNAMIC_FTRACEMarkos Chandras
arch_ftrace_update_code and ftrace_modify_all_code are only available if CONFIG_DYNAMIC_FTRACE is selected. Fixes the following build problem on MIPS randconfig: arch/mips/kernel/ftrace.c: In function 'arch_ftrace_update_code': arch/mips/kernel/ftrace.c:31:2: error: implicit declaration of function 'ftrace_modify_all_code' [-Werror=implicit-function-declaration] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5435/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-06-10MIPS: include: mmu_context.h: Replace VIRTUALIZATION with KVMMarkos Chandras
The kvm_* symbols are only available if KVM is selected. Fixes the following linking problem on a randconfig: arch/mips/built-in.o: In function `local_flush_tlb_mm': (.text+0x18a94): undefined reference to `kvm_local_flush_tlb_all' arch/mips/built-in.o: In function `local_flush_tlb_range': (.text+0x18d0c): undefined reference to `kvm_local_flush_tlb_all' kernel/built-in.o: In function `__schedule': core.c:(.sched.text+0x2a00): undefined reference to `kvm_local_flush_tlb_all' mm/built-in.o: In function `use_mm': (.text+0x30214): undefined reference to `kvm_local_flush_tlb_all' fs/built-in.o: In function `flush_old_exec': (.text+0xf0a0): undefined reference to `kvm_local_flush_tlb_all' make: *** [vmlinux] Error 1 Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5437/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-06-10MIPS: Alchemy: fix wait functionManuel Lauss
Only an interrupt can wake the core from 'wait', enable interrupts locally before executing 'wait'. [ralf@linux-mips.org: This leave the race between an interrupt that's setting TIF_NEED_RESCHEd and entering the WAIT status. but at least it's going to bring Alchemy back from the dead, so I'm going to apply this patch.] Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: Linux-MIPS <linux-mips@linux-mips.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/5408/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-06-10xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init ↵Konrad Rzeszutek Wilk
set it. Commit 10a7a0771399a57a297fca9615450dbb3f88081a ("xen: tmem: enable Xen tmem shim to be built/loaded as a module") allows the tmem module to be loaded any time. For this work the frontswap API had to be able to asynchronously to call tmem_frontswap_init before or after the swap image had been set. That was added in git commit 905cd0e1bf9ffe82d6906a01fd974ea0f70be97a ("mm: frontswap: lazy initialization to allow tmem backends to build/run as modules"). Which means we could do this (The common case): modprobe tmem [so calls frontswap_register_ops, no ->init] modifies tmem_frontswap_poolid = -1 swapon /dev/xvda1 [__frontswap_init, calls -> init, tmem_frontswap_poolid is < 0 so tmem hypercall done] Or the failing one: swapon /dev/xvda1 [calls __frontswap_init, sets the need_init bitmap] modprobe tmem [calls frontswap_register_ops, -->init calls, finds out tmem_frontswap_poolid is 0, does not make a hypercall. Later in the module_init, sets tmem_frontswap_poolid=-1] Which meant that in the failing case we would not call the hypercall to initialize the pool and never be able to make any frontswap backend calls. Moving the frontswap_register_ops after setting the tmem_frontswap_poolid fixes it. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Bob Liu <bob.liu@oracle.com>