summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2014-02-20tracing/uprobes: Move argument fetching to uprobe_dispatcher()Namhyung Kim
A single uprobe event might serve different users like ftrace and perf. And this is especially important for upcoming multi buffer support. But in this case it'll fetch (same) data from userspace multiple times. So move it to the beginning of the dispatcher function and reuse it for each users. Link: http://lkml.kernel.org/r/1389946120-19610-3-git-send-email-namhyung@kernel.org Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing/uprobes: Rename uprobe_{trace,perf}_print() functionsNamhyung Kim
The uprobe_{trace,perf}_print functions are misnomers since what they do is not printing. There's also a real print function named print_uprobe_event() so they'll only increase confusion IMHO. Rename them with double underscores to follow convention of kprobe. Link: http://lkml.kernel.org/r/1389946120-19610-2-git-send-email-namhyung@kernel.org Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20ftrace: Allow for function tracing instance to filter functionsSteven Rostedt (Red Hat)
Create a "set_ftrace_filter" and "set_ftrace_notrace" files in the instance directories to let users filter of functions to trace for the given instance. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20ftrace: Pass in global_ops for use with filtering filesSteven Rostedt (Red Hat)
In preparation for having the function tracing instances be able to filter on functions, the generic filter functions must first be converted to take in the global_ops as a parameter. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20ftrace: Allow instances to use function tracingSteven Rostedt (Red Hat)
Allow instances (sub-buffers) to enable function tracing. Each instance will have its own function tracing capability. For now, instances will not have function stack tracing, or will they be able to pick and choose what functions they can trace. Picking and choosing their own functions will come later. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Convert tracer->enabled to counterSteven Rostedt (Red Hat)
As tracers will soon be used by instances, the tracer enabled field needs to be converted to a counter instead of a boolean. This counter is protected by the trace_types_lock mutex. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Disable tracers before deletion of instanceSteven Rostedt (Red Hat)
When an instance is about to be deleted, make sure the tracer is set to nop. If it isn't reset the tracer and set it to the nop tracer, otherwise memory leaks and bad pointers may result. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20ftrace: Copy ops private to global_ops privateSteven Rostedt (Red Hat)
If global_ops function is being called directly, instead of the global_ops list function, set the global_ops private to be the same as the ops private that's being called directly. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Only let top level have option filesSteven Rostedt (Red Hat)
Currently, only the top level instance can have tracing options. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Set up infrastructure to allow tracers for instancesSteven Rostedt (Red Hat)
Currently the tracers (function, function_graph, irqsoff, etc) can only be used by the top level tracing directory (not for instances). This sets up the infrastructure to allow instances to be able to run a separate tracer apart from the what the top level tracing is doing. As tracers need to adapt for being used by instances, the tracers must flag if they can be used by instances or not. Currently only the 'nop' tracer can be used by all instances. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Pass trace_array to flag_changed callbackSteven Rostedt (Red Hat)
As options (flags) may affect instances instead of being global the flag_changed() callbacks need to receive the trace_array descriptor of the instance they will be modifying. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Pass trace_array to set_flag callbackSteven Rostedt (Red Hat)
As options (flags) may affect instances instead of being global the set_flag() callbacks need to receive the trace_array descriptor of the instance they will be modifying. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20Merge branch 'master' into for-nextJiri Kosina
2014-02-19genirq: Update the a comment typoChuansheng Liu
Change the comment "chasnge" to "change". Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Link: http://lkml.kernel.org/r/1392020037-5484-2-git-send-email-chuansheng.liu@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-02-19genirq: Provide irq_wake_thread()Thomas Gleixner
In course of the sdhci/sdio discussion with Russell about killing the sdio kthread hackery we discovered the need to be able to wake an interrupt thread from software. The rationale for this is, that sdio hardware can lack proper interrupt support for certain features. So the driver needs to poll the status registers, but at the same time it needs to be woken up by an hardware interrupt. To be able to get rid of the home brewn kthread construct of sdio we need a way to wake an irq thread independent of an actual hardware interrupt. Provide an irq_wake_thread() function which wakes up the thread which is associated to a given dev_id. This allows sdio to invoke the irq thread from the hardware irq handler via the IRQ_WAKE_THREAD return value and provides a possibility to wake it via a timer for the polling scenarios. That allows to simplify the sdio logic significantly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Chris Ball <chris@printf.net> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140215003823.772565780@linutronix.de
2014-02-19genirq: Provide synchronize_hardirq()Thomas Gleixner
synchronize_irq() waits for hard irq and threaded handlers to complete before returning. For some special cases we only need to make sure that the hard interrupt part of the irq line is not in progress when we disabled the - possibly shared - interrupt at the device level. A proper use case for this was provided by Russell. The sdhci driver requires some irq triggered functions to be run in thread context. The current implementation of the thread context is a sdio private kthread construct, which has quite some shortcomings. These can be avoided when the thread is directly associated to the device interrupt via the generic threaded irq infrastructure. Though there is a corner case related to run time power management where one side disables the device interrupts at the device level and needs to make sure, that an already running hard interrupt handler has completed before proceeding further. Though that hard interrupt handler might wake the associated thread, which in turn can request the runtime PM to reenable the device. Using synchronize_irq() leads to an immediate deadlock of the irq thread waiting for the PM lock and the synchronize_irq() waiting for the irq thread to complete. Due to the fact that it is sufficient for this case to ensure that no hard irq handler is executing a new function which avoids the check for the thread is required. Add a function, which just monitors the hard irq parts and ignores the threaded handlers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Russell King <linux@arm.linux.org.uk> Cc: Chris Ball <chris@printf.net> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140215003823.653236081@linutronix.de
2014-02-19sched_clock: Prevent callers from seeing half-updated dataStephen Boyd
The generic sched_clock registration function was previously done lockless, due to the fact that it was expected to be called only once. However, now there are systems that may register multiple sched_clock sources, for which the lack of locking has casued problems: If two sched_clock sources are registered we may end up in a situation where a call to sched_clock() may be accessing the epoch cycle count for the old counter and the cycle count for the new counter. This can lead to confusing results where sched_clock() values jump and then are reset to 0 (due to the way the registration function forces the epoch_ns to be 0). Fix this by reorganizing the registration function to hold the seqlock for as short a time as possible while we update the clock_data structure for a new counter. We also put any accumulated time into epoch_ns instead of resetting the time to 0 so that the clock doesn't reset after each successful registration. [jstultz: Added extra context to the commit message] Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Josh Cartwright <joshc@codeaurora.org> Link: http://lkml.kernel.org/r/1392662736-7803-2-git-send-email-john.stultz@linaro.org Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-02-19user_namespace.c: Remove duplicated word in commentBrian Campbell
Signed-off-by: Brian Campbell <brian.campbell@editshare.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-19treewide: Fix typo in Documentation/DocBookMasanari Iida
This patch fix spelling typo in Documentation/DocBook. It is because .html and .xml files are generated by make htmldocs, I have to fix a typo within the source files. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-18cgroup: update cgroup_enable_task_cg_lists() to grab siglockTejun Heo
Currently, there's nothing preventing cgroup_enable_task_cg_lists() from missing set PF_EXITING and race against cgroup_exit(). Depending on the timing, cgroup_exit() may finish with the task still linked on css_set leading to list corruption. Fix it by grabbing siglock in cgroup_enable_task_cg_lists() so that PF_EXITING is guaranteed to be visible. This whole on-demand cg_list optimization is extremely fragile and has ample possibility to lead to bugs which can cause things like once-a-year oops during boot. I'm wondering whether the better approach would be just adding "cgroup_disable=all" handling which disables the whole cgroup rather than tempting fate with this on-demand craziness. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Cc: stable@vger.kernel.org
2014-02-18cgroup: add a validation check to cgroup_add_cftyps()Li Zefan
Fengguang reported this bug: BUG: unable to handle kernel NULL pointer dereference at 0000003c IP: [<cc90b4ad>] cgroup_cfts_commit+0x27/0x1c1 ... Call Trace: [<cc9d1129>] ? kmem_cache_alloc_trace+0x33f/0x3b7 [<cc90c6fc>] cgroup_add_cftypes+0x8f/0xca [<cd78b646>] cgroup_init+0x6a/0x26a [<cd764d7d>] start_kernel+0x4d7/0x57a [<cd7642ef>] i386_start_kernel+0x92/0x96 This happens in a corner case. If CGROUP_SCHED=y but CFS_BANDWIDTH=n && FAIR_GROUP_SCHED=n && RT_GROUP_SCHED=n, we have: cpu_files[] = { { } /* terminate */ } When we pass cpu_files to cgroup_apply_cftypes(), as cpu_files[0].ss is NULL, we'll access NULL pointer. The bug was introduced by commit de00ffa56ea3132c6013fc8f07133b8a1014cf53 ("cgroup: make cgroup_subsys->base_cftypes use cgroup_add_cftypes()"). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-02-18workqueue: ensure @task is valid across kthread_stop()Lai Jiangshan
When a kworker should die, the kworkre is notified through WORKER_DIE flag instead of kthread_should_stop(). This, IIRC, is primarily to keep the test synchronized inside worker_pool lock. WORKER_DIE is first set while holding pool->lock, the lock is dropped and kthread_stop() is called. Unfortunately, this means that there's a slight chance that the target kworker may see WORKER_DIE before kthread_stop() finishes and exits and frees the target task before or during kthread_stop(). Fix it by pinning the target task before setting WORKER_DIE and putting it after kthread_stop() is done. tj: Improved patch description and comment. Moved pinning above WORKER_DIE for better signify what it's protecting. CC: stable@vger.kernel.org Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-02-18inotify: Fix reporting of cookies for inotify eventsJan Kara
My rework of handling of notification events (namely commit 7053aee26a35 "fsnotify: do not share events between notification groups") broke sending of cookies with inotify events. We didn't propagate the value passed to fsnotify() properly and passed 4 uninitialized bytes to userspace instead (so it is also an information leak). Sadly I didn't notice this during my testing because inotify cookies aren't used very much and LTP inotify tests ignore them. Fix the problem by passing the cookie value properly. Fixes: 7053aee26a3548ebaba046ae2e52396ccf56ac6c Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz>
2014-02-17rcu: Optimize RCU_FAST_NO_HZ for RCU_NOCB_CPU_ALLPaul E. McKenney
If CONFIG_RCU_NOCB_CPU_ALL=y, then no CPU will ever have RCU callbacks because these callbacks will instead be handled by the rcuo kthreads. However, the current version of RCU_FAST_NO_HZ nevertheless checks for RCU callbacks. This commit therefore creates static inline implementations of rcu_prepare_for_idle() and rcu_cleanup_after_idle() that are no-ops when CONFIG_RCU_NOCB_CPU_ALL=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17rcu: Optimize rcu_needs_cpu() for RCU_NOCB_CPU_ALLPaul E. McKenney
If CONFIG_RCU_NOCB_CPU_ALL=y, then rcu_needs_cpu() will always return false, however, the current version nevertheless checks for RCU callbacks. This commit therefore creates a static inline implementation of rcu_needs_cpu() that unconditionally returns false when CONFIG_RCU_NOCB_CPU_ALL=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17rcu: Optimize rcu_is_nocb_cpu() for RCU_NOCB_CPU_ALLPaul E. McKenney
If CONFIG_RCU_NOCB_CPU_ALL=y, then rcu_is_nocb_cpu() will always return true, however, the current version nevertheless checks rcu_nocb_mask. This commit therefore creates a static inline implementation of rcu_is_nocb_cpu() that unconditionally returns true when CONFIG_RCU_NOCB_CPU_ALL=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17rcu: Move SRCU grace period work to power efficient workqueueShaibal Dutta
For better use of CPU idle time, allow the scheduler to select the CPU on which the SRCU grace period work would be scheduled. This improves idle residency time and conserves power. This functionality is enabled when CONFIG_WQ_POWER_EFFICIENT is selected. Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Shaibal Dutta <shaibal.dutta@broadcom.com> [zoran.markovic@linaro.org: Rebased to latest kernel version. Added commit message. Fixed code alignment.] Signed-off-by: Zoran Markovic <zoran.markovic@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17rcu: Disambiguate CONFIG_RCU_NOCB_CPUsPaul Bolle
This commit fixes a grammar issue in the rcu_nohz_full_cpu() comment header, so that it is clear that the plural is CPUs not Kconfig options. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17rcu: Remove ACCESS_ONCE() from jiffiesPaul E. McKenney
Because jiffies is one of a very few variables marked "volatile", there is no need to use ACCESS_ONCE() when accessing it. This commit therefore removes the redundant ACCESS_ONCE() wrappers. Reported by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17rcu: Stop tracking FSF's postal addressPaul E. McKenney
All of the RCU source files have the usual GPL header, which contains a long-obsolete postal address for FSF. To avoid the need to track the FSF office's movements, this commit substitutes the URL where GPL may be found. Reported-by: Greg KH <gregkh@linuxfoundation.org> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17rcu: Add ACCESS_ONCE() to ->n_force_qs_lh accessesPaul E. McKenney
The ->n_force_qs_lh field is accessed without the benefit of any synchronization, so this commit adds the needed ACCESS_ONCE() wrappers. Yes, increments to ->n_force_qs_lh can be lost, but contention should be low and the field is strictly statistical in nature, so this is not a problem. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17printk: fix syslog() overflowing user bufferLinus Torvalds
This is not a buffer overflow in the traditional sense: we don't overflow any *kernel* buffers, but we do mis-count the amount of data we copy back to user space for the SYSLOG_ACTION_READ_ALL case. In particular, if the user buffer is too small to hold everything, and *if* there is a continuation line at just the right place, we can end up giving the user more data than he asked for. The reason is that we first count up the number of bytes all the log records contains, then we walk the records again until we've skipped the records at the beginning that won't fit, and then we walk the rest of the records and copy them to the user space buffer. And in between that "skip the initial records that won't fit" and the "copy the records that *will* fit to user space", we reset the 'prev' variable that contained the record information for the last record not copied. That meant that when we started copying to user space, we now had a different character count than what we had originally calculated in the first record walk-through. The fix is to simply not clear the 'prev' flags value (in both cases where we had the same logic: syslog_print_all and kmsg_dump_get_buffer: the latter is used for pstore-like dumping) Reported-and-tested-by: Debabrata Banerjee <dbanerje@akamai.com> Acked-by: Kay Sievers <kay@vrfy.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-15Merge branches 'irq-urgent-for-linus' and 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq update from Thomas Gleixner: "Fix from the urgent branch: a trivial oneliner adding the missing Kconfig dependency curing build failures which have been discovered by several build robots. The update in the irq-core branch provides a new function in the irq/devres code, which is a prerequisite for driver developers to get rid of boilerplate code all over the place. Not a bugfix, but it has zero impact on the current kernel due to the lack of users. It's simpler to provide the infrastructure to interested parties via your tree than fulfilling the wishlist of driver maintainers on which particular commit or tag this should be based on" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Add devm_request_any_context_irq()
2014-02-15Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "The following trilogy of patches brings you: - fix for a long standing math overflow issue with HZ < 60 - an onliner fix for a corner case in the dreaded tick broadcast mechanism affecting a certain range of AMD machines which are infested with the infamous automagic C1E power control misfeature - a fix for one of the ARM platforms which allows the kernel to proceed and boot instead of stupidly panicing for no good reason. The patch is slightly larger than necessary, but it's less ugly than the alternative 5 liner" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick: Clear broadcast pending bit when switching to oneshot clocksource: Kona: Print warning rather than panic time: Fix overflow when HZ is smaller than 60
2014-02-14nohz: ensure users are aware boot CPU is not NO_HZ_FULLPaul Gortmaker
This bit of information is in the Kconfig help text: "Note the boot CPU will still be kept outside the range to handle the timekeeping duty." However neither the variable NO_HZ_FULL_ALL, or the prompt convey this important detail, so lets add it to the prompt to make it more explicitly obvious to the average user. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1391711781-7466-1-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-02-14timer: Spare IPI when deferrable timer is queued on idle remote targetsViresh Kumar
When a timer is enqueued or modified on a remote target, the latter is expected to see and handle this timer on its next tick. However if the target is idle and CONFIG_NO_HZ_IDLE=y, the CPU may be sleeping tickless and the timer may be ignored. wake_up_nohz_cpu() takes care of that by setting TIF_NEED_RESCHED and sending an IPI to idle targets so that the tick is reevaluated on the idle loop through the tick_nohz_idle_*() APIs. Now this is all performed regardless of the power properties of the timer. If the timer is deferrable, idle targets don't need to be woken up. Only the next buzy tick needs to care about it, and no IPI kick is needed for that to happen. So lets spare the IPI on idle targets when the timer is deferrable. Meanwhile we keep the current behaviour on full dynticks targets. We can spare IPIs on idle full dynticks targets as well but some tricky races against idle_cpu() must be dealt all along to make sure that the timer is well handled after idle exit. We can deal with that later since NO_HZ_FULL already has more important powersaving issues. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/CAKohpomMZ0TAN2e6N76_g4ZRzxd5vZ1XfuZfxrP7GMxfTNiLVw@mail.gmail.com Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-02-14cgroup: fix memory leak in cgroup_mount()Li Zefan
We should free the memory allocated in parse_cgroupfs_options() before calling this function again. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-02-14cgroup: fix locking in cgroupstats_build()Li Zefan
css_set_lock has been converted to css_set_rwsem, and rwsem can't nest inside rcu_read_lock. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-02-13lto: Disable LTO for sys_niAndi Kleen
The assembler alias code in cond_syscall does not work when compiled for LTO. Just disable LTO for that file. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391846481-31491-6-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13lto: Handle LTO common symbols in module loaderJoe Mario
Here is the workaround I made for having the kernel not reject modules built with -flto. The clean solution would be to get the compiler to not emit the symbol. Or if it has to emit the symbol, then emit it as initialized data but put it into a comdat/linkonce section. Minor tweaks by AK over Joe's patch. Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391846481-31491-5-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage: Make trace_hardirqs_on/off_caller visibleAndi Kleen
These functions are called from assembler, and thus need to be __visible. Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-12-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage Make __stack_chk_failed and memcmp visibleAndi Kleen
In LTO symbols implicitely referenced by the compiler need to be visible. Earlier these symbols were visible implicitely from being exported, but we disabled implicit visibility fo EXPORTs when modules are disabled to improve code size. So now these symbols have to be marked visible explicitely. Do this for __stack_chk_fail (with stack protector) and memcmp. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-10-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage: Mark rwsem functions that can be called from assembler asmlinkageAndi Kleen
Mark the rwsem functions that can be called from assembler asmlinkage. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-9-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage: Make main_extable_sort_needed visibleAndi Kleen
main_extable_sort_needed is used by the build system and needs to be a normal ELF symbol. Make it visible so that LTO does not remove or mangle it. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-8-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage, mutex: Mark __visibleAndi Kleen
Various kernel/mutex.c functions can be called from inline assembler, so they should be all global and __visible. Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-7-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage: Make trace_hardirq visibleAndi Kleen
Can be called from assembler code. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-6-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage: Make lockdep_sys_exit asmlinkageAndi Kleen
lockdep_sys_exit can be called from assembler code, so make it asmlinkage. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-5-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13asmlinkage: Make jiffies visibleAndi Kleen
Jiffies is referenced by the linker script, so it has to be visible. Handled both the generic and the x86 version. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1391845930-28580-3-git-send-email-ak@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13cgroup: fix coccinelle warningsFengguang Wu
kernel/cgroup.c:2256:1-3: WARNING: PTR_RET can be used Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Generated by: coccinelle/api/ptr_ret.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-02-13tick: Clear broadcast pending bit when switching to oneshotThomas Gleixner
AMD systems which use the C1E workaround in the amd_e400_idle routine trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU. The reason is that the idle routine of those AMD systems switches the cpu into forced broadcast mode early on before the newly brought up CPU can switch over to high resolution / NOHZ mode. The timer related CPU1 bringup looks like this: clockevent_register_device(local_apic); tick_setup(local_apic); ... idle() tick_broadcast_on_off(FORCE); tick_broadcast_oneshot_control(ENTER) cpumask_set(cpu, broadcast_oneshot_mask); halt(); Now the broadcast interrupt on CPU0 sets CPU1 in the broadcast_pending_mask and wakes CPU1. So CPU1 continues: local_apic_timer_interrupt() tick_handle_periodic(); softirq() tick_init_highres(); cpumask_clr(cpu, broadcast_oneshot_mask); tick_broadcast_oneshot_control(ENTER) WARN_ON(cpumask_test(cpu, broadcast_pending_mask); So while we remove CPU1 from the broadcast_oneshot_mask when we switch over to highres mode, we do not clear the pending bit, which then triggers the warning when we go back to idle. The reason why this is only visible on C1E affected AMD systems is that the other machines enter the deep sleep states via acpi_idle/intel_idle and exit the broadcast mode before executing the remote triggered local_apic_timer_interrupt. So the pending bit is already cleared when the switch over to highres mode is clearing the oneshot mask. The solution is simple: Clear the pending bit together with the mask bit when we switch over to highres mode. Stanislaw came up independently with the same patch by enforcing the C1E workaround and debugging the fallout. I picked mine, because mine has a changelog :) Reported-by: poma <pomidorabelisima@gmail.com> Debugged-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Olaf Hering <olaf@aepfle.de> Cc: Dave Jones <davej@redhat.com> Cc: Justin M. Forbes <jforbes@redhat.com> Cc: Josh Boyer <jwboyer@redhat.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402111434180.21991@ionos.tec.linutronix.de Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Thomas Gleixner <tglx@linutronix.de>