Age  Commit message  Author
2013-05-01  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input  Linus Torvalds
Pull input updates from Dmitry Torokhov:
 "Assorted fixes and cleanups to the existing drivers plus a new driver for IMS Passenger Control Unit device they use for their in-flight entertainment system."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (44 commits)
  Input: trackpoint - Optimize trackpoint init to use power-on reset
  Input: apbps2 - convert to devm_ioremap_resource()
  Input: ALPS - use %ph to print buffers
  ARM - shmobile: Armadillo800EVA: Move st1232 reset pin handling
  Input: st1232 - add reset pin handling
  Input: st1232 - convert to devm_* infrastructure
  Input: MT - handle semi-mt devices in core
  Input: adxl34x - use spi_get_drvdata()
  Input: ad7877 - use spi_get_drvdata() and spi_set_drvdata()
  Input: ads7846 - use spi_get_drvdata() and spi_set_drvdata()
  Input: ims-pcu - fix a memory leak on error
  Input: sysrq - supplement reset sequence with timeout functionality
  Input: tegra-kbc - support for defining row/columns based on SoC
  Input: imx_keypad - switch to using managed resources
  Input: arc_ps2 - add support for device tree
  Input: mma8450 - fix signed 12bits to 32bits conversion
  Input: eeti_ts - remove redundant null check
  Input: edt-ft5x06 - remove redundant null check before kfree
  Input: ad714x - add CONFIG_PM_SLEEP to suspend/resume functions
  Input: adxl34x - add CONFIG_PM_SLEEP to suspend/resume functions
  ...
2013-05-01  xfs: fix da node magic number mismatches  Dave Chinner
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-05-01  xfs: Remote attr validation fixes and optimisations  Dave Chinner
- optimise the calculation for the number of blocks in a remote xattr.
- check attribute length against MAX_XATTR_SIZE, not MAXPATHLEN
- whitespace fixes
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-05-01  af_unix: fix a fatal race with bit fields  Eric Dumazet
Using bit fields is dangerous on ppc64/sparc64, as the compiler [1] uses 64bit instructions to manipulate them. If the 64bit word includes any atomic_t or spinlock_t, we can lose critical concurrent changes.
This is happening in af_unix, where unix_sk(sk)->gc_candidate/gc_maybe_cycle/lock share the same 64bit word. This leads to fatal deadlock, as one/several cpus spin forever on a spinlock that will never be available again.
A safer way would be to use a long to store flags. This way we are sure the compiler/arch won't do bad things. As we own the unix_gc_lock spinlock when clearing or setting bits, we can use the non atomic __set_bit()/__clear_bit(). recursion_level can share the same 64bit location with the spinlock, as it is set only with this spinlock held.
[1] bug fixed in gcc-4.8.0 : http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
Reported-by: Ambrose Feinstein <ambrose@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>
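The idea in the message above (one full flags word, bit numbers, and non-atomic helpers that are only called with a lock held) can be sketched in user-space C. This is a minimal illustration, not the kernel's code: `struct demo_sock`, the bit names and the `demo_*` helpers are hypothetical stand-ins for the real structure and for the kernel's `__set_bit()`/`__clear_bit()`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical bit numbers inside the flags word. */
enum {
	GC_CANDIDATE   = 0,
	GC_MAYBE_CYCLE = 1,
};

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the socket structure: the flags get a whole
 * word of their own instead of sharing one with a lock as bit fields. */
struct demo_sock {
	unsigned long gc_flags;
};

/* Non-atomic helpers: callers must hold the lock that protects the word. */
static void demo_set_bit(unsigned int nr, unsigned long *word)
{
	assert(nr < BITS_PER_ULONG);
	*word |= 1UL << nr;
}

static void demo_clear_bit(unsigned int nr, unsigned long *word)
{
	assert(nr < BITS_PER_ULONG);
	*word &= ~(1UL << nr);
}

static bool demo_test_bit(unsigned int nr, const unsigned long *word)
{
	assert(nr < BITS_PER_ULONG);
	return (*word >> nr) & 1UL;
}
```

Because every update is a read-modify-write of the flags word alone, a neighbouring lock or counter is never rewritten as a side effect, which is the property the bit-field layout could not guarantee.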
2013-05-01  Merge branch 'bnx2x'  David S. Miller
Yuval Mintz says:
====================
This fixes 2 small bugs - one which may cause an unnecessary link flap, and the other is a small memory leak when unloading while cnic is not loaded.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01  bnx2x: Prevent memory leak when cnic is absent  Yuval Mintz
The bnx2x driver allocates searcher T2 tables, but during unload that memory is released only if the cnic is loaded. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01  bnx2x: correct reading of speed capabilities  Yaniv Rosner
When the bnx2x driver reads the port configuration, mask irrelevant bits. Without this change, the unintended bits may cause the driver to needlessly toggle the link, as a comparison in the link flap avoidance flow will show that the old link did not advertise the same capabilities and thus cannot be retained. Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01  net: sctp: attribute printl with __printf for gcc fmt checks  Daniel Borkmann
Let GCC check for format string errors in sctp's probe printl function. This patch fixes the warning when compiled with W=1:
  net/sctp/probe.c:73:2: warning: function might be possible candidate for 'gnu_printf' format attribute [-Wmissing-format-attribute]
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
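For readers unfamiliar with the annotation, the kernel's `__printf(a, b)` is a shorthand for the compiler's `format(printf, a, b)` function attribute. The sketch below shows the attribute on a small variadic helper; `demo_printl` and its buffer-based signature are hypothetical and do not match sctp's printl.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging helper. The attribute tells GCC/Clang that
 * argument 3 is a printf-style format string and that the values it
 * consumes start at argument 4, so mismatches are diagnosed at build time. */
__attribute__((format(printf, 3, 4)))
static int demo_printl(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, len, fmt, args);
	va_end(args);
	return n;
}
```

With the attribute in place, a call such as `demo_printl(buf, sizeof(buf), "%s", 42)` draws a format warning instead of misbehaving at run time.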
2013-05-01  netlink: kconfig: move mmap i/o into netlink kconfig  Daniel Borkmann
Currently, in menuconfig, Netlink's new mmaped IO is the very first entry under the ``Networking support'' item and comes even before ``Networking options'':
  [ ] Netlink: mmaped IO
  Networking options --->
  ...
Let's move this into ``Networking options'' under netlink's Kconfig, since this might be more appropriate. Introduced by commit ccdfcc398 (``netlink: mmaped netlink: ring setup''). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01  netpoll: convert mutex into a semaphore  Neil Horman
Bart Van Assche recently reported a warning to me:
  <IRQ>
  [<ffffffff8103d79f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff8103d7fa>] warn_slowpath_null+0x1a/0x20
  [<ffffffff814761dd>] mutex_trylock+0x16d/0x180
  [<ffffffff813968c9>] netpoll_poll_dev+0x49/0xc30
  [<ffffffff8136a2d2>] ? __alloc_skb+0x82/0x2a0
  [<ffffffff81397715>] netpoll_send_skb_on_dev+0x265/0x410
  [<ffffffff81397c5a>] netpoll_send_udp+0x28a/0x3a0
  [<ffffffffa0541843>] ? write_msg+0x53/0x110 [netconsole]
  [<ffffffffa05418bf>] write_msg+0xcf/0x110 [netconsole]
  [<ffffffff8103eba1>] call_console_drivers.constprop.17+0xa1/0x1c0
  [<ffffffff8103fb76>] console_unlock+0x2d6/0x450
  [<ffffffff8104011e>] vprintk_emit+0x1ee/0x510
  [<ffffffff8146f9f6>] printk+0x4d/0x4f
  [<ffffffffa0004f1d>] scsi_print_command+0x7d/0xe0 [scsi_mod]
This resulted from my commit ca99ca14c which introduced a mutex_trylock operation in a path that could execute in interrupt context. When mutex debugging is enabled, the above warns the user when we are in fact executing in interrupt context.
After some discussion, it seems that a semaphore is the proper mechanism to use here. While mutexes are defined to be unusable in interrupt context, no such condition exists for semaphores (save for the fact that the non blocking api calls, like up and down_trylock must be used when in irq context).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Bart Van Assche <bvanassche@acm.org> CC: Bart Van Assche <bvanassche@acm.org> CC: David Miller <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
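The non-blocking usage described above (take the semaphore only if it is free, never sleep, release with an up) can be illustrated outside the kernel. The following is a user-space sketch built on C11 atomics, not the kernel's semaphore implementation; `struct demo_sem`, `demo_down_trylock()` and `demo_up()` are hypothetical names, and unlike the kernel's `down_trylock()`, which returns 0 on success, this sketch returns a bool.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counting semaphore; count > 0 means it can be taken. */
struct demo_sem {
	atomic_int count;
};

/* Try to take the semaphore without ever sleeping.
 * Returns true on success, false if it is currently unavailable. */
static bool demo_down_trylock(struct demo_sem *sem)
{
	int old = atomic_load(&sem->count);

	while (old > 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&sem->count, &old, old - 1))
			return true;
	}
	return false;
}

/* Release the semaphore; this never blocks either. */
static void demo_up(struct demo_sem *sem)
{
	atomic_fetch_add(&sem->count, 1);
}
```

A caller that cannot sleep simply skips its work when `demo_down_trylock()` fails, which mirrors the try-and-back-off pattern the message describes for the poll path.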
2013-05-01  netlink: Fix skb ref counting.  Pravin B Shelar
Commit f9c2288837ba072b21dba955f04a4c97eaa77b1e (netlink: implement memory mapped recvmsg) incremented the skb->users ref count twice for a dump op, which does not look right. The following patch fixes that. CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01  Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile  Linus Torvalds
Pull tile arch changes from Chris Metcalf:
 "These are some minor new feature work and other changes that didn't merit getting pushed up after the 3.9 merge window closed. There should be a lot more activity in the 3.11 merge window"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch/tile: Fix syscall return value passed to tracepoint
  tile: comment assumption about __insn_mtspr for <asm/irqflags.h>
  tile: ns2cycles should use __raw_get_cpu_var
  arch: remove KCORE_ELF again [tile]
  tile: remove two outdated Kconfig entries
  tile: support atomic64_dec_if_positive()
  tile: support TIF_SYSCALL_TRACEPOINT; select HAVE_SYSCALL_TRACEPOINTS
  tile: Add definition of NR_syscalls
  tile: move declaration of sys_call_table to <asm/syscall.h>
  arch/tile: Enable HAVE_ARCH_TRACEHOOK
  arch/tile: Call tracehook_report_syscall_{entry,exit} in syscall trace
2013-05-01  init: Do not warn on non-zero initcall return  Steven Rostedt
Commit f91eb62f71b3 ("init: scream bloody murder if interrupts are enabled too early") added three new warnings. The first two seemed reasonable, but the third included a warning when an initcall returned non-zero. Although the third WARN() also covers an imbalanced preempt disable or disabled irqs, it shouldn't warn when an initcall merely returns non-zero. In fact, according to Linus, it shouldn't print at all, as it only prints with initcall_debug set, and that already shows enough information to fix things. Link: http://lkml.kernel.org/r/CA+55aFzaBC5SFi7=F2mfm+KWY5qTsBmOqgbbs8E+LUS8JK-sBg@mail.gmail.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  net_sched: act_ipt forward compat with xtables  Jamal Hadi Salim
Deal with changes in newer xtables while maintaining backward compatibility. Thanks to Jan Engelhardt for suggestions. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01  Merge branch 'topic/omap3isp' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media  Linus Torvalds
Pull omap3isp clk support from Mauro Carvalho Chehab:
 "This patch was sent separately as it depends on a merge from the clock framework, which you merged in commit 362ed48dee50"
* 'topic/omap3isp' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] omap3isp: Use the common clock framework
2013-05-01  jfs: fix a couple races  Dave Kleikamp
This patch fixes races uncovered by xfstests testcase 068. One race is the result of jfs_sync() trying to write a sync point to the journal after it has been frozen (or possibly in the process). Since freezing syncs the journal, there is no need to write a sync point so we simply want to return. The second involves jfs_write_inode() being called on a deleted inode. It calls jfs_flush_journal which is held up by the jfs_commit thread doing the final iput on the same deleted inode, which itself is waiting for the I_SYNC flag to be cleared. jfs_write_inode need not do anything when i_nlink is zero, which is the easy fix. Reported-by: Michael L. Semon <mlsemon35@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2013-05-01  Merge branch 'next' into for-linus  Dmitry Torokhov
Prepare first set of updates for 3.10 merge window.
2013-05-01  Merge branch 'ipc-scalability'  Linus Torvalds
Merge IPC cleanup and scalability patches from Andrew Morton.
This cleans up many of the oddities in the IPC code, uses the list iterator helpers, splits out locking and adds per-semaphore locks for greater scalability of the IPC semaphore code.
Most normal user-level locking by now uses futexes (ie pthreads, but also a lot of specialized locks), but SysV IPC semaphores are apparently still used in some big applications, either for portability reasons, or because they offer tracking and undo (and you don't need to have a special shared memory area for them).
Our IPC semaphore scalability was pitiful. We used to lock much too big ranges, and we used to have a single ipc lock per ipc semaphore array. Most loads never cared, but some do. There are some numbers in the individual commits.
* ipc-scalability:
  ipc: sysv shared memory limited to 8TiB
  ipc/msg.c: use list_for_each_entry_[safe] for list traversing
  ipc,sem: fine grained locking for semtimedop
  ipc,sem: have only one list in struct sem_queue
  ipc,sem: open code and rename sem_lock
  ipc,sem: do not hold ipc lock more than necessary
  ipc: introduce lockless pre_down ipcctl
  ipc: introduce obtaining a lockless ipc object
  ipc: remove bogus lock comment for ipc_checkid
  ipc/msgutil.c: use linux/uaccess.h
  ipc: refactor msg list search into separate function
  ipc: simplify msg list search
  ipc: implement MSG_COPY as a new receive mode
  ipc: remove msg handling from queue scan
  ipc: set EFAULT as default error in load_msg()
  ipc: tighten msg copy loops
  ipc: separate msg allocation from userspace copy
  ipc: clamp with min()
2013-05-01  ipc: sysv shared memory limited to 8TiB  Robin Holt
While running an application that tried to put data into half of memory using shmget(), we found that a shmall value below 8EiB-8TiB prevented us from using anything more than 8TiB. Setting kernel.shmall greater than 8EiB-8TiB made the job work.
In the newseg() function, ns->shm_tot, which at 8TiB is INT_MAX, is involved in this check:
ipc/shm.c:
  static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
  {
  ...
          int numpages = (size + PAGE_SIZE -1) >> PAGE_SHIFT;
  ...
          if (ns->shm_tot + numpages > ns->shm_ctlall)
                  return -ENOSPC;
[akpm@linux-foundation.org: make ipc/shm.c:newseg()'s numpages size_t, not int]
Signed-off-by: Robin Holt <holt@sgi.com> Reported-by: Alex Thorlton <athorlton@sgi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
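The arithmetic behind the 8TiB limit can be checked with a short program. This is a sketch under the assumption of 4 KiB pages and a 32-bit int; the `DEMO_*` macros and `demo_*` functions are hypothetical and only mirror the page-count expression quoted above.

```c
#include <limits.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12                          /* assumes 4 KiB pages */
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)

/* Number of pages needed for a segment of `size` bytes, kept in 64 bits. */
static uint64_t demo_numpages(uint64_t size)
{
	return (size + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

/* True if that page count would not fit in a signed int. */
static int demo_overflows_int(uint64_t size)
{
	return demo_numpages(size) > (uint64_t)INT_MAX;
}
```

For an 8TiB segment the count is 2^43 / 2^12 = 2^31 pages, one more than a 32-bit INT_MAX, which is why a wider type is needed for the page count.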
2013-05-01  ipc/msg.c: use list_for_each_entry_[safe] for list traversing  Nikola Pajkovsky
The ipc/msg.c code does its list operations by hand and it open-codes the accesses, instead of using list_for_each_entry_[safe]. Signed-off-by: Nikola Pajkovsky <npajkovs@redhat.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
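To show the difference between an open-coded walk and the iterator macro, here is a user-space sketch. The `list_head`, `container_of` and `list_for_each_entry` definitions are simplified stand-ins written for this example (the real ones live in the kernel's list headers), and `struct demo_msg` is a hypothetical message type, not the one in ipc/msg.c.

```c
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel's list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                           \
	for (pos = container_of((head)->next, __typeof__(*pos), member);    \
	     &pos->member != (head);                                         \
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Hypothetical message type queued on a list. */
struct demo_msg {
	long type;
	struct list_head m_list;
};

/* Open-coded traversal: walk raw list_head pointers and convert by hand. */
static long sum_types_open_coded(struct list_head *head)
{
	long sum = 0;
	struct list_head *tmp;

	for (tmp = head->next; tmp != head; tmp = tmp->next) {
		struct demo_msg *msg = container_of(tmp, struct demo_msg, m_list);
		sum += msg->type;
	}
	return sum;
}

/* The same walk using the iterator macro. */
static long sum_types_iterator(struct list_head *head)
{
	long sum = 0;
	struct demo_msg *msg;

	list_for_each_entry(msg, head, m_list)
		sum += msg->type;
	return sum;
}
```

Both functions visit the same entries; the macro version simply hides the pointer conversion and the termination test.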
2013-05-01  ipc,sem: fine grained locking for semtimedop  Rik van Riel
Introduce finer grained locking for semtimedop, to handle the common case of a program wanting to manipulate one semaphore from an array with multiple semaphores.
If the call is a semop manipulating just one semaphore in an array with multiple semaphores, only take the lock for that semaphore itself.
If the call needs to manipulate multiple semaphores, or another caller is in a transaction that manipulates multiple semaphores, the sem_array lock is taken, as well as all the locks for the individual semaphores.
On a 24 CPU system, performance numbers with the semop-multi test with N threads and N semaphores, look like this:
         vanilla  Davidlohr's  Davidlohr's +   Davidlohr's +
threads           patches      rwlock patches  v3 patches
10       610652   726325       1783589         2142206
20       341570   365699       1520453         1977878
30       288102   307037       1498167         2037995
40       290714   305955       1612665         2256484
50       288620   312890       1733453         2650292
60       289987   306043       1649360         2388008
70       291298   306347       1723167         2717486
80       290948   305662       1729545         2763582
90       290996   306680       1736021         2757524
100      292243   306700       1773700         3059159
[davidlohr.bueso@hp.com: do not call sem_lock when bogus sma]
[davidlohr.bueso@hp.com: make refcounter atomic]
Signed-off-by: Rik van Riel <riel@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Chegu Vinod <chegu_vinod@hp.com> Cc: Jason Low <jason.low2@hp.com> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Tested-by: Emmanuel Benisty <benisty.e@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc,sem: have only one list in struct sem_queue  Rik van Riel
Having only one list in struct sem_queue, and only queueing simple semaphore operations on the list for the semaphore involved, allows us to introduce finer grained locking for semtimedop. Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Chegu Vinod <chegu_vinod@hp.com> Cc: Emmanuel Benisty <benisty.e@gmail.com> Cc: Jason Low <jason.low2@hp.com> Cc: Michel Lespinasse <walken@google.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc,sem: open code and rename sem_lock  Rik van Riel
Rename sem_lock() to sem_obtain_lock(), so we can introduce a sem_lock() later that only locks the sem_array and does nothing else. Open code the locking from ipc_lock() in sem_obtain_lock() so we can introduce finer grained locking for the sem_array in the next patch. [akpm@linux-foundation.org: propagate the ipc_obtain_object() errno out of sem_obtain_lock()] Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Chegu Vinod <chegu_vinod@hp.com> Cc: Emmanuel Benisty <benisty.e@gmail.com> Cc: Jason Low <jason.low2@hp.com> Cc: Michel Lespinasse <walken@google.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc,sem: do not hold ipc lock more than necessary  Davidlohr Bueso
Instead of holding the ipc lock for permissions and security checks, among others, only acquire it when necessary.
Some numbers....
1) With Rik's semop-multi.c microbenchmark we can see the following results:
Baseline (3.9-rc1):
cpus 4, threads: 256, semaphores: 128, test duration: 30 secs
total operations: 151452270, ops/sec 5048409
+ 59.40% a.out [kernel.kallsyms] [k] _raw_spin_lock
+  6.14% a.out [kernel.kallsyms] [k] sys_semtimedop
+  3.84% a.out [kernel.kallsyms] [k] avc_has_perm_flags
+  3.64% a.out [kernel.kallsyms] [k] __audit_syscall_exit
+  2.06% a.out [kernel.kallsyms] [k] copy_user_enhanced_fast_string
+  1.86% a.out [kernel.kallsyms] [k] ipc_lock
With this patchset:
cpus 4, threads: 256, semaphores: 128, test duration: 30 secs
total operations: 273156400, ops/sec 9105213
+ 18.54% a.out [kernel.kallsyms] [k] _raw_spin_lock
+ 11.72% a.out [kernel.kallsyms] [k] sys_semtimedop
+  7.70% a.out [kernel.kallsyms] [k] ipc_has_perm.isra.21
+  6.58% a.out [kernel.kallsyms] [k] avc_has_perm_flags
+  6.54% a.out [kernel.kallsyms] [k] __audit_syscall_exit
+  4.71% a.out [kernel.kallsyms] [k] ipc_obtain_object_check
2) While on an Oracle swingbench DSS (data mining) workload the improvements are not as exciting as with Rik's benchmark, we can see some positive numbers. For an 8 socket machine the following are the percentages of %sys time incurred in the ipc lock:
Baseline (3.9-rc1):
100 swingbench users: 8,74%
400 swingbench users: 21,86%
800 swingbench users: 84,35%
With this patchset:
100 swingbench users: 8,11%
400 swingbench users: 19,93%
800 swingbench users: 77,69%
[riel@redhat.com: fix two locking bugs]
[sasha.levin@oracle.com: prevent releasing RCU read lock twice in semctl_main]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Reviewed-by: Chegu Vinod <chegu_vinod@hp.com> Acked-by: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jason Low <jason.low2@hp.com> Cc: Emmanuel Benisty <benisty.e@gmail.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: introduce lockless pre_down ipcctl  Davidlohr Bueso
Various forms of ipc use ipcctl_pre_down() to retrieve an ipc object and check permissions, mostly for IPC_RMID and IPC_SET commands. Introduce ipcctl_pre_down_nolock(), a lockless version of this function. The locking version is retained, yet modified to call the nolock version without affecting its semantics, thus transparent to all ipc callers. Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chegu Vinod <chegu_vinod@hp.com> Cc: Emmanuel Benisty <benisty.e@gmail.com> Cc: Jason Low <jason.low2@hp.com> Cc: Michel Lespinasse <walken@google.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
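The layering described above, where the locking function becomes a thin wrapper around a lockless lookup, can be sketched as follows. This is an illustration only: `struct demo_obj`, the `locked` flag standing in for a spinlock, and both `demo_pre_down*` functions are hypothetical and do not reproduce the signatures or error handling of `ipcctl_pre_down()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object; `locked` stands in for a real per-object lock. */
struct demo_obj {
	int id;
	bool may_modify;
	bool locked;
};

/* Lookup and permission check only; the object lock is not taken. */
static struct demo_obj *demo_pre_down_nolock(struct demo_obj *table,
					     size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].id == id)
			return table[i].may_modify ? &table[i] : NULL;
	return NULL;
}

/* Locking variant kept for existing callers: same lookup, but on success
 * the object is returned with its lock held. */
static struct demo_obj *demo_pre_down(struct demo_obj *table,
				      size_t n, int id)
{
	struct demo_obj *obj = demo_pre_down_nolock(table, n, id);

	if (obj)
		obj->locked = true;
	return obj;
}
```

Existing callers keep calling the locking variant unchanged, while new callers that only need the lookup and permission check can use the lockless one and take the lock later, if at all.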
2013-05-01  ipc: introduce obtaining a lockless ipc object  Davidlohr Bueso
Through ipc_lock() and therefore ipc_lock_check() we currently return the locked ipc object. This is not necessary for all situations and can, therefore, cause unnecessary ipc lock contention. Introduce analogous ipc_obtain_object() and ipc_obtain_object_check() functions that only look up and return the ipc object. Both these functions must be called within the RCU read critical section. [akpm@linux-foundation.org: propagate the ipc_obtain_object() errno from ipc_lock()] Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Reviewed-by: Chegu Vinod <chegu_vinod@hp.com> Acked-by: Michel Lespinasse <walken@google.com> Cc: Emmanuel Benisty <benisty.e@gmail.com> Cc: Jason Low <jason.low2@hp.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: remove bogus lock comment for ipc_checkid  Davidlohr Bueso
This series makes the sysv semaphore code more scalable, by reducing the time the semaphore lock is held, and making the locking more scalable for semaphore arrays with multiple semaphores.
The first four patches were written by Davidlohr Bueso, and reduce the hold time of the semaphore lock.
The last three patches change the sysv semaphore code locking to be more fine grained, providing a performance boost when multiple semaphores in a semaphore array are being manipulated simultaneously.
On a 24 CPU system, performance numbers with the semop-multi test with N threads and N semaphores, look like this:
         vanilla  Davidlohr's  Davidlohr's +   Davidlohr's +
threads           patches      rwlock patches  v3 patches
10       610652   726325       1783589         2142206
20       341570   365699       1520453         1977878
30       288102   307037       1498167         2037995
40       290714   305955       1612665         2256484
50       288620   312890       1733453         2650292
60       289987   306043       1649360         2388008
70       291298   306347       1723167         2717486
80       290948   305662       1729545         2763582
90       290996   306680       1736021         2757524
100      292243   306700       1773700         3059159
This patch:
There is no reason to be holding the ipc lock while reading ipcp->seq, hence remove misleading comment. Also simplify the return value for the function.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Chegu Vinod <chegu_vinod@hp.com> Cc: Emmanuel Benisty <benisty.e@gmail.com> Cc: Jason Low <jason.low2@hp.com> Cc: Michel Lespinasse <walken@google.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc/msgutil.c: use linux/uaccess.h  HoSung Jung
Signed-off-by: HoSung Jung <rain6557@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: refactor msg list search into separate function  Peter Hurley
[fengguang.wu@intel.com: find_msg can be static] Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: simplify msg list search  Peter Hurley
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: implement MSG_COPY as a new receive mode  Peter Hurley
Teach the helper routines about MSG_COPY so that msgtyp is preserved as the message number to copy. The security functions affected by this change were audited and no additional changes are necessary. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: remove msg handling from queue scan  Peter Hurley
In preparation for refactoring the queue scan into a separate function, relocate msg copying. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: set EFAULT as default error in load_msg()  Peter Hurley
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: tighten msg copy loops  Peter Hurley
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: separate msg allocation from userspace copy  Peter Hurley
Separating msg allocation enables single-block vmalloc allocation instead. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  ipc: clamp with min()  Peter Hurley
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4  Linus Torvalds
Pull ext4 updates from Ted Ts'o:
 "Mostly performance and bug fixes, plus some cleanups. The one new feature this merge window is a new ioctl EXT4_IOC_SWAP_BOOT which allows installation of a hidden inode designed for boot loaders."
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (50 commits)
  ext4: fix type-widening bug in inode table readahead code
  ext4: add check for inodes_count overflow in new resize ioctl
  ext4: fix Kconfig documentation for CONFIG_EXT4_DEBUG
  ext4: fix online resizing for ext3-compat file systems
  jbd2: trace when lock_buffer in do_get_write_access takes a long time
  ext4: mark metadata blocks using bh flags
  buffer: add BH_Prio and BH_Meta flags
  ext4: mark all metadata I/O with REQ_META
  ext4: fix readdir error in case inline_data+^dir_index.
  ext4: fix readdir error in the case of inline_data+dir_index
  jbd2: use kmem_cache_zalloc instead of kmem_cache_alloc/memset
  ext4: mext_insert_extents should update extent block checksum
  ext4: move quota initialization out of inode allocation transaction
  ext4: reserve xattr index for Rich ACL support
  jbd2: reduce journal_head size
  ext4: clear buffer_uninit flag when submitting IO
  ext4: use io_end for multiple bios
  ext4: make ext4_bio_write_page() use BH_Async_Write flags
  ext4: Use kstrtoul() instead of parse_strtoul()
  ext4: defragmentation code cleanup
  ...
2013-05-01  Merge tag 'tag-for-linus-3.10' of git://git.linaro.org/people/sumitsemwal/linux-dma-buf  Linus Torvalds
Pull dma-buf updates from Sumit Semwal:
 "Added debugfs support to dma-buf"
* tag 'tag-for-linus-3.10' of git://git.linaro.org/people/sumitsemwal/linux-dma-buf:
  dma-buf: Add debugfs support
  dma-buf: replace dma_buf_export() with dma_buf_export_named()
2013-05-01  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel  Linus Torvalds
Pull Hexagon fixes from Richard Kuo:
 "Changes for the Hexagon architecture (and one touching OpenRISC). They include various fixes to make use of additional arch features and cleanups. The largest functional change is a cleanup of the signal and event return paths"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel: (32 commits)
  Hexagon: add v4 CS regs to core copyout macro
  Hexagon: use correct translation for VMALLOC_START
  Hexagon: use correct translations for DMA mappings
  Hexagon: fix return value for notify_resume case in do_work_pending
  Hexagon: fix signal number for user mem faults
  Hexagon: remove two Kconfig entries
  arch: remove CONFIG_GENERIC_FIND_NEXT_BIT again
  Hexagon: update copyright dates
  Hexagon: add translation types for __vmnewmap
  Hexagon: fix signal.c compile error
  Hexagon: break up user fn/arg register setting
  Hexagon: use generic sys_fork, sys_vfork, and sys_clone
  Hexagon: fix psp/sp macro
  Hexagon: fix up int enable/disable at ret_from_fork
  Hexagon: add IOMEM and _relaxed IO macros
  Hexagon: switch to using the device type for IO mappings
  Hexagon: don't print info for offline CPU's
  Hexagon: add support for single-stepping (v4+)
  Hexagon: use correct work mask when checking for more work
  Hexagon: add support for additional exceptions
  ...
2013-05-01  tty: fix up atime/mtime mess, take three  Linus Torvalds
We first tried to avoid updating atime/mtime entirely (commit b0de59b5733d: "TTY: do not update atime/mtime on read/write"), and then limited it to only update it occasionally (commit 37b7f3c76595: "TTY: fix atime/mtime regression"), but it turns out that this was both insufficient and overkill. It was insufficient because we let people attach to the shared ptmx node to see activity without even reading atime/mtime, and it was overkill because the "only once a minute" means that you can't really tell an idle person from an active one with 'w'. So this tries to fix the problem properly. It marks the shared ptmx node as un-notifiable, and it lowers the "only once a minute" to a few seconds instead - still long enough that you can't time individual keystrokes, but short enough that you can tell whether somebody is active or not. Reported-by: Simon Kirby <sim@hostway.ca> Acked-by: Jiri Slaby <jslaby@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-01  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal  Linus Torvalds
Pull compat cleanup from Al Viro:
 "Mostly about syscall wrappers this time; there will be another pile with patches in the same general area from various people, but I'd rather push those after both that and vfs.git pile are in."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  syscalls.h: slightly reduce the jungles of macros
  get rid of union semop in sys_semctl(2) arguments
  make do_mremap() static
  sparc: no need to sign-extend in sync_file_range() wrapper
  ppc compat wrappers for add_key(2) and request_key(2) are pointless
  x86: trim sys_ia32.h
  x86: sys32_kill and sys32_mprotect are pointless
  get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC
  merge compat sys_ipc instances
  consolidate compat lookup_dcookie()
  convert vmsplice to COMPAT_SYSCALL_DEFINE
  switch getrusage() to COMPAT_SYSCALL_DEFINE
  switch epoll_pwait to COMPAT_SYSCALL_DEFINE
  convert sendfile{,64} to COMPAT_SYSCALL_DEFINE
  switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE
  make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect
  make HAVE_SYSCALL_WRAPPERS unconditional
  consolidate cond_syscall and SYSCALL_ALIAS declarations
  teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long
  get rid of duplicate logics in __SC_....[1-6] definitions
2013-05-01dma-buf: Add debugfs supportSumit Semwal
Add debugfs support to make it easier to print debug information about the dma-buf buffers. Cc: Dave Airlie <airlied@redhat.com> [minor fixes on init and warning fix] Cc: Dan Carpenter <dan.carpenter@oracle.com> [remove double unlock in fail case] Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2013-05-01dma-buf: replace dma_buf_export() with dma_buf_export_named()Sumit Semwal
For debugging purposes, it is useful to have a name-string added while exporting buffers. Hence, dma_buf_export() is replaced with dma_buf_export_named(), which additionally takes 'exp_name' as a parameter. For backward compatibility, and for lazy exporters who don't wish to name themselves, a #define dma_buf_export() is also made available, which adds a __FILE__ instead of 'exp_name'. Cc: Daniel Vetter <daniel.vetter@ffwll.ch> [Thanks for the idea!] Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
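A reduced, user-space sketch of the compatibility macro described above. `struct buf`, `buf_export_named()` and `buf_export()` are hypothetical stand-ins with a shortened argument list; they are not the kernel's dma-buf definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for an exported buffer; not the kernel's struct dma_buf. */
struct buf {
    size_t size;
    const char *exp_name;   /* name of the exporter, for debug output */
};

/* The named variant: the exporter says who it is. */
static struct buf buf_export_named(size_t size, const char *exp_name)
{
    struct buf b = { size, exp_name };
    return b;
}

/*
 * Backward-compatible wrapper: callers that do not pass a name are
 * identified by the source file in which the macro is expanded.
 */
#define buf_export(size) buf_export_named((size), __FILE__)
```

Because `__FILE__` is expanded at the call site, existing callers of the short form need no changes and still show up under a recognisable name in debug output.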
2013-05-01vhost: fix error handling in RESET_OWNER ioctlMichael S. Tsirkin
RESET_OWNER ioctl would leave the fd in a bad state if memory allocation failed: device is stopped but owner is not reset. Make state changes after allocating memory, such that a failed ioctl has no effect. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
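The "allocate first, then change state" ordering can be illustrated with a small user-space sketch. `struct dev`, `dev_reset_owner()` and `always_fail()` are hypothetical names made up for this example; the real vhost structures and ioctl path are not shown.

```c
#include <stdlib.h>

/* Hypothetical device state; not the vhost structures. */
struct dev {
    int running;
    int has_owner;
    void *memory;
};

/* Allocator that always fails, used only to exercise the error path. */
static void *always_fail(size_t n)
{
    (void)n;
    return NULL;
}

/* Returns 0 on success, -1 if allocation fails; on failure *d is untouched. */
static int dev_reset_owner(struct dev *d, void *(*alloc)(size_t))
{
    void *fresh = alloc(64);    /* the step that can fail goes first */

    if (!fresh)
        return -1;

    /* Nothing below can fail, so the state change is all-or-nothing. */
    d->running = 0;
    d->has_owner = 0;
    free(d->memory);
    d->memory = fresh;
    return 0;
}
```

Calling `dev_reset_owner(&d, always_fail)` returns -1 and leaves `d` exactly as it was, which is the property the fix is after.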
2013-05-01tcm_vhost: remove virtio-net.h dependencyMichael S. Tsirkin
vhost.h only has generic bits now, so we can drop the virtio-net.h include in tcm_vhost. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-01vhost: move per-vq net specific fields out to netMichael S. Tsirkin
This will remove the need for vhost scsi to pull in virtio-net.h. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-01tcm_vhost: document inflight ref-counting useMichael S. Tsirkin
Add more comments so we remember not to break it next time we change things. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-01vhost: move vhost-net zerocopy fields to net.cAsias He
On top of 'vhost: Allow device specific fields per vq', we can move device specific fields to device virt queue from vhost virt queue. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-01tcm_vhost: Wait for pending requests in vhost_scsi_flush()Asias He
Unlike tcm_vhost_evt requests, tcm_vhost_cmd requests are passed to the target core system, so flushing the virt queue does not guarantee that all pending requests have finished. This patch takes a reference for every tcm_vhost_cmd request so that vhost_scsi_flush() waits until all pending requests issued before the flush operation have finished. This is useful when we call vhost_scsi_clear_endpoint() to stop tcm_vhost. No new requests will be passed to the target core system because we clear the endpoint by setting vs_tpg to NULL, and we wait for all the old requests. Together these guarantee that no requests are leaked and that existing requests are completed. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
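A single-threaded sketch of the reference-counting idea, under the assumption that the tracker starts with one reference held by the queue itself and that flush drops that reference before waiting. `struct inflight` and its helpers are hypothetical; real code would need atomic counters and a blocking wait instead of a plain flag.

```c
/* Hypothetical tracker for requests issued before a flush. */
struct inflight {
    int refs;   /* 1 for the queue itself, plus 1 per pending request */
    int done;   /* set once the last reference has been dropped */
};

static void inflight_init(struct inflight *f)
{
    f->refs = 1;
    f->done = 0;
}

/* Called when a request is handed to the backend. */
static void inflight_get(struct inflight *f)
{
    f->refs++;
}

/* Called when a request completes, and once by flush for the initial reference. */
static void inflight_put(struct inflight *f)
{
    if (--f->refs == 0)
        f->done = 1;
}
```

With two requests pending, `done` stays 0 after flush drops the initial reference and becomes 1 only after both requests have called `inflight_put()`.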
2013-05-01vhost: Allow device specific fields per vqAsias He
This is useful for any device that wants device specific fields per vq. For example, tcm_vhost wants a per vq field to track requests which are in flight on the vq. Also, on top of this we can add patches to move things like ubufs from vhost.h out to net.c. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
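One common way to give each device its own per-queue fields is to embed the generic queue in a device-specific wrapper and recover the wrapper from the embedded member. The sketch below assumes that approach; `struct vq`, `struct scsi_vq` and `to_scsi_vq()` are hypothetical, much-reduced stand-ins, not the vhost definitions.

```c
#include <stddef.h>

/* Generic queue shared by all devices (hypothetical, much reduced). */
struct vq {
    int index;
};

/* Device-specific wrapper: the generic queue plus one private field. */
struct scsi_vq {
    struct vq vq;
    int inflight;   /* requests currently in flight on this queue */
};

/* Recover the wrapper from a pointer to the embedded generic queue. */
static struct scsi_vq *to_scsi_vq(struct vq *vq)
{
    return (struct scsi_vq *)((char *)vq - offsetof(struct scsi_vq, vq));
}
```

Generic code keeps passing `struct vq *` around, while device code calls `to_scsi_vq()` to reach its own fields, so the generic header does not need to know about any device.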