summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-02-06sched/fair: Do not migrate if the prev_cpu is idleMel Gorman
wake_affine_idle() prefers to move a task to the current CPU if the wakeup is due to an interrupt. The expectation is that the interrupt data is cache hot and relevant to the waking task as well as avoiding a search. However, there is no way to determine if there was cache hot data on the previous CPU that may exceed the interrupt data. Furthermore, round-robin delivery of interrupts can migrate tasks around a socket where each CPU is under-utilised. This can interact badly with cpufreq which makes decisions based on per-cpu data. It has been observed on machines with HWP that p-states are not boosted to their maximum levels even though the workload is latency and throughput sensitive. This patch uses the previous CPU for the task if it's idle and cache-affine with the current CPU even if the current CPU is idle due to the wakup being related to the interrupt. This reduces migrations at the cost of the interrupt data not being cache hot when the task wakes. A variety of workloads were tested on various machines and no adverse impact was noticed that was outside noise. dbench on ext4 on UMA showed roughly 10% reduction in the number of CPU migrations and it is a case where interrupts are frequent for IO competions. In most cases, the difference in performance is quite small but variability is often reduced. For example, this is the result for pgbench running on a UMA machine with different numbers of clients. 4.15.0-rc9 4.15.0-rc9 baseline waprev-v1 Hmean 1 22096.28 ( 0.00%) 22734.86 ( 2.89%) Hmean 4 74633.42 ( 0.00%) 75496.77 ( 1.16%) Hmean 7 115017.50 ( 0.00%) 113030.81 ( -1.73%) Hmean 12 126209.63 ( 0.00%) 126613.40 ( 0.32%) Hmean 16 131886.91 ( 0.00%) 130844.35 ( -0.79%) Stddev 1 636.38 ( 0.00%) 417.11 ( 34.46%) Stddev 4 614.64 ( 0.00%) 583.24 ( 5.11%) Stddev 7 542.46 ( 0.00%) 435.45 ( 19.73%) Stddev 12 173.93 ( 0.00%) 171.50 ( 1.40%) Stddev 16 671.42 ( 0.00%) 680.30 ( -1.32%) CoeffVar 1 2.88 ( 0.00%) 1.83 ( 36.26%) Note that the different in performance is marginal but for low utilisation, there is less variability. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180130104555.4125-4-mgorman@techsingularity.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06sched/fair: Restructure wake_affine*() to return a CPU idMel Gorman
This is a preparation patch that has wake_affine*() return a CPU ID instead of a boolean. The intent is to allow the wake_affine() helpers to be avoided if a decision is already made. This patch has no functional change. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180130104555.4125-3-mgorman@techsingularity.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06sched/fair: Remove unnecessary parameters from wake_affine_idle()Mel Gorman
wake_affine_idle() takes parameters it never uses so clean it up. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180130104555.4125-2-mgorman@techsingularity.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06sched/rt: Make update_curr_rt() more accurateWen Yang
rq->clock_task may be updated between the two calls of rq_clock_task() in update_curr_rt(). Calling rq_clock_task() only once makes it more accurate and efficient, taking update_curr() as reference. Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: zhong.weidong@zte.com.cn Link: http://lkml.kernel.org/r/1517800721-42092-1-git-send-email-wen.yang99@zte.com.cn Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06sched/rt: Up the root domain ref count when passing it around via IPIsSteven Rostedt (VMware)
When issuing an IPI RT push, where an IPI is sent to each CPU that has more than one RT task scheduled on it, it references the root domain's rto_mask, that contains all the CPUs within the root domain that has more than one RT task in the runable state. The problem is, after the IPIs are initiated, the rq->lock is released. This means that the root domain that is associated to the run queue could be freed while the IPIs are going around. Add a sched_get_rd() and a sched_put_rd() that will increment and decrement the root domain's ref count respectively. This way when initiating the IPIs, the scheduler will up the root domain's ref count before releasing the rq->lock, ensuring that the root domain does not go away until the IPI round is complete. Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 4bdced5c9a292 ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtNA@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()Steven Rostedt (VMware)
When the rto_push_irq_work_func() is called, it looks at the RT overloaded bitmask in the root domain via the runqueue (rq->rd). The problem is that during CPU up and down, nothing here stops rq->rd from changing between taking the rq->rd->rto_lock and releasing it. That means the lock that is released is not the same lock that was taken. Instead of using this_rq()->rd to get the root domain, as the irq work is part of the root domain, we can simply get the root domain from the irq work that is passed to the routine: container_of(work, struct root_domain, rto_push_work) This keeps the root domain consistent. Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 4bdced5c9a292 ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtNA@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06sched/core: Optimize update_stats_*()Peter Zijlstra
These functions are already gated by schedstats_enabled(), there is no point in then issuing another static_branch for every individual update in them. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06sched/core: Optimize ttwu_stat()Peter Zijlstra
The whole of ttwu_stat() is guarded by a single schedstat_enabled(), there is absolutely no point in then issuing another static_branch for every single schedstat_inc() in there. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06drm/amdgpu: re-enable CGCG on CZ and disable on STShirish S
The CGCG feature on Stoney is causing GFX related issues such as freezes and blank outs. Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Arindam Nath <arindam.nath@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-05ibmvnic: fix empty firmware version and errors cleanupDesnes Augusto Nunes do Rosario
This patch makes sure that the firmware version is never NULL. Moreover, it also performs some cleanup on the error messages. Fixes: a107311d7fdf ("ibmvnic: fix firmware version when no firmware level has been provided by the VIOS server") Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-05sctp: fix dst refcnt leak in sctp_v4_get_dstTommi Rantala
Fix dst reference count leak in sctp_v4_get_dst() introduced in commit 410f03831 ("sctp: add routing output fallback"): When walking the address_list, successive ip_route_output_key() calls may return the same rt->dst with the reference incremented on each call. The code would not decrement the dst refcount when the dst pointer was identical from the previous iteration, causing the dst refcnt leak. Testcase: ip netns add TEST ip netns exec TEST ip link set lo up ip link add dummy0 type dummy ip link add dummy1 type dummy ip link add dummy2 type dummy ip link set dev dummy0 netns TEST ip link set dev dummy1 netns TEST ip link set dev dummy2 netns TEST ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0 ip netns exec TEST ip link set dummy0 up ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1 ip netns exec TEST ip link set dummy1 up ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2 ip netns exec TEST ip link set dummy2 up ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3 ip netns del TEST In 4.4 and 4.9 kernels this results to: [ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1 ... Fixes: 410f03831 ("sctp: add routing output fallback") Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses") Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-05sctp: fix dst refcnt leak in sctp_v6_get_dst()Alexey Kodanev
When going through the bind address list in sctp_v6_get_dst() and the previously found address is better ('matchlen > bmatchlen'), the code continues to the next iteration without releasing currently held destination. Fix it by releasing 'bdst' before continue to the next iteration, and instead of introducing one more '!IS_ERR(bdst)' check for dst_release(), move the already existed one right after ip6_dst_lookup_flow(), i.e. we shouldn't proceed further if we get an error for the route lookup. Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-05gcc-plugins: Use dynamic initializersKees Cook
GCC 8 changed the order of some fields and is very picky about ordering in static initializers, so instead just move to dynamic initializers, and drop the redundant already-zero field assignments. Suggested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Kees Cook <keescook@chromium.org>
2018-02-05gcc-plugins: Add include required by GCC release 8valdis.kletnieks@vt.edu
GCC requires another #include to get the gcc-plugins to build cleanly. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Kees Cook <keescook@chromium.org>
2018-02-05nfsd4: don't set lock stateid's sc_type to CLOSEDJ. Bruce Fields
There's no point I can see to stp->st_stid.sc_type = NFS4_CLOSED_STID; given release_lock_stateid immediately sets sc_type to 0. That set of sc_type to 0 should be enough to prevent it being used where we don't want it to be; NFS4_CLOSED_STID should only be needed for actual open stateid's that are actually closed. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-05nfsd: Detect unhashed stids in nfsd4_verify_open_stid()Trond Myklebust
The state of the stid is guaranteed by 2 locks: - The nfs4_client 'cl_lock' spinlock - The nfs4_ol_stateid 'st_mutex' mutex so it is quite possible for the stid to be unhashed after lookup, but before calling nfsd4_lock_ol_stateid(). So we do need to check for a zero value for 'sc_type' in nfsd4_verify_open_stid(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Checuk Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Fixes: 659aefb68eca "nfsd: Ensure we don't recognise lock stateids..." Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-05sunrpc: remove dead code in svc_sock_setbufsizeChristoph Hellwig
Setting values in struct sock directly is the usual method. Remove the long dead code using set_fs() and the related comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-05Merge tag 'xfs-4.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull more xfs updates from Darrick Wong: "As promised, here's a (much smaller) second pull request for the second week of the merge cycle. This time around we have a couple patches shutting off unsupported fs configurations, and a couple of cleanups. Last, we turn off EXPERIMENTAL for the reverse mapping btree, since the primary downstream user of that information (online fsck) is now upstream and I haven't seen any major failures in a few kernel releases. Summary: - Print scrub build status in the xfs build info. - Explicitly call out the remaining two scenarios where we don't support reflink and never have. - Remove EXPERIMENTAL tag from reverse mapping btree!" * tag 'xfs-4.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: remove experimental tag for reverse mapping xfs: don't allow reflink + realtime filesystems xfs: don't allow DAX on reflink filesystems xfs: add scrub to XFS_BUILD_OPTIONS xfs: fix u32 type usage in sb validation function
2018-02-05Merge tag 'perf-urgent-for-mingo-4.16-20180205' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix 'period' and 'freq' handling for 'perf record', also related: add Add PERF_SAMPLE_PERIOD into PEBS_FREERUNNING_FLAGS in the x86 perf kernel driver (Jiri Olsa) - Fix 'perf trace -i perf.data' callgraph handling (Ravi Bangoria) - Synchronize tooling headers for asound, s390 and powerpc KVM, sched and x86 features (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05Merge branch 'overlayfs-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: "This work from Amir adds NFS export capability to overlayfs. NFS exporting an overlay filesystem is a challange because we want to keep track of any copy-up of a file or directory between encoding the file handle and decoding it. This is achieved by indexing copied up objects by lower layer file handle. The index is already used for hard links, this patchset extends the use to NFS file handle decoding" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (51 commits) ovl: check ERR_PTR() return value from ovl_encode_fh() ovl: fix regression in fsnotify of overlay merge dir ovl: wire up NFS export operations ovl: lookup indexed ancestor of lower dir ovl: lookup connected ancestor of dir in inode cache ovl: hash non-indexed dir by upper inode for NFS export ovl: decode pure lower dir file handles ovl: decode indexed dir file handles ovl: decode lower file handles of unlinked but open files ovl: decode indexed non-dir file handles ovl: decode lower non-dir file handles ovl: encode lower file handles ovl: copy up before encoding non-connectable dir file handle ovl: encode non-indexed upper file handles ovl: decode connected upper dir file handles ovl: decode pure upper file handles ovl: encode pure upper file handles ovl: document NFS export vfs: factor out helpers d_instantiate_anon() and d_alloc_anon() ovl: store 'has_upper' and 'opaque' as bit flags ...
2018-02-05membarrier/selftest: Test private expedited sync core commandMathieu Desnoyers
Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE and MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE commands. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alice Ferrazzi <alice.ferrazzi@gmail.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Elder <paul.elder@pitt.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-12-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05membarrier/arm64: Provide core serializing commandMathieu Desnoyers
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-11-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05membarrier/x86: Provide core serializing commandMathieu Desnoyers
There are two places where core serialization is needed by membarrier: 1) When returning from the membarrier IPI, 2) After scheduler updates curr to a thread with a different mm, before going back to user-space, since the curr->mm is used by membarrier to check whether it needs to send an IPI to that CPU. x86-32 uses IRET as return from interrupt, and both IRET and SYSEXIT to go back to user-space. The IRET instruction is core serializing, but not SYSEXIT. x86-64 uses IRET as return from interrupt, which takes care of the IPI. However, it can return to user-space through either SYSRETL (compat code), SYSRETQ, or IRET. Given that SYSRET{L,Q} is not core serializing, we rely instead on write_cr3() performed by switch_mm() to provide core serialization after changing the current mm, and deal with the special case of kthread -> uthread (temporarily keeping current mm into active_mm) by adding a sync_core() in that specific case. Use the new sync_core_before_usermode() to guarantee this. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-10-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05membarrier: Provide core serializing command, *_SYNC_COREMathieu Desnoyers
Provide core serializing membarrier command to support memory reclaim by JIT. Each architecture needs to explicitly opt into that support by documenting in their architecture code how they provide the core serializing instructions required when returning from the membarrier IPI, and after the scheduler has updated the curr->mm pointer (before going back to user-space). They should then select ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on their architecture. Architectures selecting this feature need to either document that they issue core serializing instructions when returning to user-space, or implement their architecture-specific sync_core_before_usermode(). Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-9-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05lockin/x86: Implement sync_core_before_usermode()Mathieu Desnoyers
Ensure that a core serializing instruction is issued before returning to user-mode. x86 implements return to user-space through sysexit, sysrel, and sysretq, which are not core serializing. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-8-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05locking: Introduce sync_core_before_usermode()Mathieu Desnoyers
Introduce an architecture function that ensures the current CPU issues a core serializing instruction before returning to usermode. This is needed for the membarrier "sync_core" command. Architectures defining the sync_core_before_usermode() static inline need to select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-7-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05membarrier/selftest: Test global expedited commandMathieu Desnoyers
Test the new MEMBARRIER_CMD_GLOBAL_EXPEDITED and MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED commands. Adapt to the MEMBARRIER_CMD_SHARED -> MEMBARRIER_CMD_GLOBAL rename. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alice Ferrazzi <alice.ferrazzi@gmail.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Elder <paul.elder@pitt.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-6-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05membarrier: Provide GLOBAL_EXPEDITED commandMathieu Desnoyers
Allow expedited membarrier to be used for data shared between processes through shared memory. Processes wishing to receive the membarriers register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. Those which want to issue membarrier invoke MEMBARRIER_CMD_GLOBAL_EXPEDITED. This allows extremely simple kernel-level implementation: we have almost everything we need with the PRIVATE_EXPEDITED barrier code. All we need to do is to add a flag in the mm_struct that will be used to check whether we need to send the IPI to the current thread of each CPU. There is a slight downside to this approach compared to targeting specific shared memory users: when performing a membarrier operation, all registered "global" receivers will get the barrier, even if they don't share a memory mapping with the sender issuing MEMBARRIER_CMD_GLOBAL_EXPEDITED. This registration approach seems to fit the requirement of not disturbing processes that really deeply care about real-time: they simply should not register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. In order to align the membarrier command names, the "MEMBARRIER_CMD_SHARED" command is renamed to "MEMBARRIER_CMD_GLOBAL", keeping an alias of MEMBARRIER_CMD_SHARED to MEMBARRIER_CMD_GLOBAL for UAPI header backward compatibility. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-5-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05membarrier: Document scheduler barrier requirementsMathieu Desnoyers
Document the membarrier requirement on having a full memory barrier in __schedule() after coming from user-space, before storing to rq->curr. It is provided by smp_mb__after_spinlock() in __schedule(). Document that membarrier requires a full barrier on transition from kernel thread to userspace thread. We currently have an implicit barrier from atomic_dec_and_test() in mmdrop() that ensures this. The x86 switch_mm_irqs_off() full barrier is currently provided by many cpumask update operations as well as write_cr3(). Document that write_cr3() provides this barrier. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-4-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05powerpc, membarrier: Skip memory barrier in switch_mm()Mathieu Desnoyers
Allow PowerPC to skip the full memory barrier in switch_mm(), and only issue the barrier when scheduling into a task belonging to a process that has registered to use expedited private. Threads targeting the same VM but which belong to different thread groups is a tricky case. It has a few consequences: It turns out that we cannot rely on get_nr_threads(p) to count the number of threads using a VM. We can use (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1) instead to skip the synchronize_sched() for cases where the VM only has a single user, and that user only has a single thread. It also turns out that we cannot use for_each_thread() to set thread flags in all threads using a VM, as it only iterates on the thread group. Therefore, test the membarrier state variable directly rather than relying on thread flags. This means membarrier_register_private_expedited() needs to set the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows private expedited membarrier commands to succeed. membarrier_arch_switch_mm() now tests for the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20180129202020.8515-3-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-05membarrier/selftest: Test private expedited commandMathieu Desnoyers
Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. Add checks expecting specific error values on system calls expected to fail. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Alice Ferrazzi <alice.ferrazzi@gmail.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Avi Kivity <avi@scylladb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dave Watson <davejwatson@fb.com> Cc: David Sehr <sehr@google.com> Cc: Greg Hackmann <ghackmann@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maged Michael <maged.michael@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Elder <paul.elder@pitt.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Link: http://lkml.kernel.org/r/20180129202020.8515-2-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06Merge branch 'linux-4.16' of git://github.com/skeggsb/linux into drm-nextDave Airlie
- initial kepler clock gating support - atomic gamma handling fixes - support for gp108 "secure boot" (enables acceleration, finally) * 'linux-4.16' of git://github.com/skeggsb/linux: drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning drm/nouveau/mmu: Fix trailing semicolon drm/nouveau: Introduce NvPmEnableGating option drm/nouveau: Add support for SLCG for Kepler2 drm/nouveau: Add support for BLCG on Kepler2 drm/nouveau: Add support for BLCG on Kepler1 drm/nouveau: Add support for basic clockgating on Kepler1 drm/nouveau/kms/nv50: fix handling of gamma since atomic conversion drm/nouveau/kms/nv50: use INTERPOLATE_257_UNITY_RANGE LUT on newer chipsets drm/nouveau/kms/nv50: use "low res" lut for indexed mode drm/nouveau/kms/nv50: prepare for double-buffered LUTs drm/nouveau/bo: add helper functions for handling pinned+mapped buffers drm/nouveau/fbcon: add module parameter to select bits-per-pixel drm/nouveau/secboot/gp108: implement on top of acr_r370 drm/nouveau/secboot/r370: implement support for booting LS SEC2 ucode drm/nouveau/secboot/r370: move a bunch of r375 stuff to a new implementation drm/nouveau: nouveau: use correct string length drm/nouveau/drm/nouveau/mmu: fix odd_ptr_err.cocci warnings drm/nouveau/pmu/fuc: don't use movw directly anymore
2018-02-06Merge tag 'drm-intel-next-fixes-2018-02-01' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Fixes for GPU hangs and other bugs around hangcheck and result; Fix for regression on suspend case with vgaswitcheroo; Fixes for eDP and HDMI blank screens Fix for protecting WC allocation to avoid overflow on page vec; Cleanup around unpublished GLK firmware blobs, and other small fixes. This also contains GVT pull request mostly with regression fixes on vGPU display dmabuf, mmio switch and other misc changes. * tag 'drm-intel-next-fixes-2018-02-01' of git://anongit.freedesktop.org/drm/drm-intel: (21 commits) drm/i915/ppgtt: Pin page directories before allocation drm/i915: Always run hangcheck while the GPU is busy Revert "drm/i915: mark all device info struct with __initconst" drm/i915/edp: Do not do link training fallback or prune modes on EDP drm/i915: Check for fused or unused pipes drm/i915: Protect WC stash allocation against direct reclaim drm/i915: Only attempt to scan the requested number of shrinker slabs drm/i915: Always call to intel_display_set_init_power() in resume_early. drm/i915/gvt: cancel scheduler timer when no vGPU exists drm/i915/gvt: cancel virtual vblank timer when no vGPU exists drm/i915/gvt: Keep obj->dma_buf link NULL during exporting drm/i915/pmu: Reconstruct active state on starting busy-stats drm/i915: Stop getting the fault address from RING_FAULT_REG drm/i915/guc: Add uc_fini_wq in gem_init unwind path drm/i915: Fix using BIT_ULL() vs. BIT() for power domain masks drm/i915: Try EDID bitbanging on HDMI after failed read drm/i915/glk: Disable Guc and HuC on GLK drm/i915/gvt: Do not use I915_NUM_ENGINES to iterate over the mocs regs array drm/i915/gvt: validate gfn before set shadow page entry drm/i915/gvt: add PLANE_KEYMAX regs to mmio track list ...
2018-02-05Merge tag 'rproc-v4.16' of git://github.com/andersson/remoteprocLinus Torvalds
Pull remoteproc updates from Bjorn Andersson: "This contains a few bug fixes and a cleanup up of the resource-table handling in the framework, which removes the need for drivers with no resource table to provide a fake one" * tag 'rproc-v4.16' of git://github.com/andersson/remoteproc: remoteproc: Reset table_ptr on stop remoteproc: Drop dangling find_rsc_table dummies remoteproc: Move resource table load logic to find remoteproc: Don't handle empty resource table remoteproc: Merge rproc_ops and rproc_fw_ops remoteproc: Clone rproc_ops in rproc_alloc() remoteproc: Cache resource table size remoteproc: Remove depricated crash completion virtio_remoteproc: correct put_device virtio_device.dev
2018-02-05Merge tag 'rpmsg-v4.16' of git://github.com/andersson/remoteprocLinus Torvalds
Pull rpmsg updates from Bjorn Andersson: "This fixes a few issues found in the SMD and GLINK drivers and corrects the handling of SMD channels that are found in an (previously) unexpected state" * tag 'rpmsg-v4.16' of git://github.com/andersson/remoteproc: rpmsg: smd: Fix double unlock in __qcom_smd_send() rpmsg: glink: Fix missing mutex_init() in qcom_glink_alloc_channel() rpmsg: smd: Don't hold the tx lock during wait rpmsg: smd: Fail send on a closed channel rpmsg: smd: Wake up all waiters rpmsg: smd: Create device for all channels rpmsg: smd: Perform handshake during open rpmsg: glink: smem: Ensure ordering during tx drivers: rpmsg: remove duplicate includes remoteproc: qcom: Use PTR_ERR_OR_ZERO() in glink prob
2018-02-05Merge tag 'mmc-v4.16-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - renesas_sdhi: Fix build error in case NO_DMA=y - sdhci: Implement a bounce buffer to address throughput regressions * tag 'mmc-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: MMC_SDHI_{SYS,INTERNAL}_DMAC should depend on HAS_DMA mmc: sdhci: Implement an SDHCI-specific bounce buffer
2018-02-05Merge tag 'pwm/for-4.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "The Meson PWM controller driver gains support for the AXG series and a minor bug is fixed for the STMPE driver. To round things off, the class is now set for PWM channels exported via sysfs which allows non-root access, provided that the system has been configured accordingly" * tag 'pwm/for-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: pwm: meson: Add clock source configuration for Meson-AXG dt-bindings: pwm: Update bindings for the Meson-AXG pwm: stmpe: Fix wrong register offset for hwpwm=2 case pwm: Set class for exported channels in sysfs
2018-02-05net: mediatek: Explicitly include pinctrl headersThierry Reding
The Mediatek ethernet driver fails to build after commit 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") because it relies on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the device.h header implicitly. Include these headers explicitly to avoid the build failure. Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-05mmc: meson-gx-mmc: Explicitly include pinctr/consumer.hThierry Reding
The Meson GX MMC driver fails to build after commit 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") because it relies on the pinctrl/consumer.h being pulled in by the device.h header implicitly. Include the header explicitly to avoid the build failure. Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-05drm/rockchip: lvds: Explicitly include pinctrl headersThierry Reding
The Rockchip LVDS driver fails to build after commit 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") because it relies on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the device.h header implicitly. Include these headers explicitly to avoid the build failure. Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-05pinctrl: files should directly include apis they useStephen Rothwell
Fixes: 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-05perf tools: Add trace/beauty/generated/ into .gitignoreRavi Bangoria
No functionality changes. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20180130053053.13214-2-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-05perf trace: Fix call-graph outputRavi Bangoria
Recently, Arnaldo fixed global vs event specific --max-stack usage with commit bd3dda9ab0fb ("perf trace: Allow overriding global --max-stack per event"). This commit is having a regression when we don't use --max-stack at all with perf trace. Ex, $ ./perf trace record -g ls $ ./perf trace -i perf.data 0.076 ( 0.002 ms): ls/9109 brk( 0.196 ( 0.008 ms): ls/9109 access(filename: 0x9f998b70, mode: R 0.209 ( 0.031 ms): ls/9109 open(filename: 0x9f998978, flags: CLOEXEC This is missing call-traces. After patch: $ ./perf trace -i perf.data 0.076 ( 0.002 ms): ls/9109 brk( do_syscall_trace_leave ([kernel.kallsyms]) [0] ([unknown]) syscall_exit_work ([kernel.kallsyms]) brk (/usr/lib64/ld-2.17.so) _dl_sysdep_start (/usr/lib64/ld-2.17.so) _dl_start_final (/usr/lib64/ld-2.17.so) _dl_start (/usr/lib64/ld-2.17.so) _start (/usr/lib64/ld-2.17.so) 0.196 ( 0.008 ms): ls/9109 access(filename: 0x9f998b70, mode: R do_syscall_trace_leave ([kernel.kallsyms]) [0] ([unknown]) Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: bd3dda9ab0fb ("perf trace: Allow overriding global --max-stack per event") Link: http://lkml.kernel.org/r/20180130053053.13214-3-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-05x86/events/intel/ds: Add PERF_SAMPLE_PERIOD into PEBS_FREERUNNING_FLAGSJiri Olsa
Stephane reported that we don't support period for enabling large PEBS data, which there's no reason for. Adding PERF_SAMPLE_PERIOD into freerunning flags. Tested it with: # perf record -e cycles:P -c 100 --no-timestamp -C 0 --period Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Kan Liang <kan.liang@intel.com> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180201083812.11359-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-05net/mlx5: increase async EQ to avoid EQ overrunMax Gurtovoy
Currently the async EQ has 256 entries only. It might not be big enough for the SW to handle all the needed pending events. For example, in case of many QPs (let's say 1024) connected to a SRQ created using NVMeOF target and the target goes down, the FW will raise 1024 "last WQE reached" events and may cause EQ overrun. Increase the EQ to more reasonable size, that beyond it the FW should be able to delay the event and raise it later on using internal backpressure mechanism. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-02-05mlx5: fix mlx5_get_vector_affinity to start from completion vector 0Sagi Grimberg
The consumers of this routine expects the affinity map of of vector index relative to the first completion vector. The upper layers are not aware of internal/private completion vectors that mlx5 allocates for its own usage. Hence, return the affinity map of vector index relative to the first completion vector. Fixes: 05e0cc84e00c ("net/mlx5: Fix get vector affinity helper function") Reported-by: Logan Gunthorpe <logang@deltatee.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Cc: <stable@vger.kernel.org> # v4.15 Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-02-05RDMA/hns: Fix the endian problem for hnsoulijun
The hip06 and hip08 run on a little endian ARM, it needs to revise the annotations to indicate that the HW uses little endian data in the various DMA buffers, and flow the necessary swaps throughout. The imm_data use big endian mode. The cpu_to_le32/le32_to_cpu swaps are no-op for this, which makes the only substantive change the handling of imm_data which is now mandatory swapped. This also keep match with the userspace hns driver and resolve the warning by sparse. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-02-05perf record: Fix period option handlingJiri Olsa
Stephan reported we don't unset PERIOD sample type when --no-period is specified. Adding the unset check and reset PERIOD if --no-period is specified. Committer notes: Check the sample_type, it shouldn't have PERF_SAMPLE_PERIOD there when --no-period is used. Before: # perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # After: [root@jouet ~]# perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (17 samples) ] [root@jouet ~]# perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [root@jouet ~]# Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180201083812.11359-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-05perf evsel: Fix period/freq terms setupJiri Olsa
Stephane reported that we don't set properly PERIOD sample type for events with period term defined. Before: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME|PERIOD, ... After: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME, ... Setting PERIOD sample type based on period term setup. Committer note: When we use -c or a period=N term in the event definition, then we don't need to ask the kernel, for this event, via perf_event_attr.sample_type |= PERF_SAMPLE_PERIOD, to put the event period in each sample for this event, as we know it already, it is in perf_event_attr.sample_period. Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180201083812.11359-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-05dwc-xlgmac: remove Jie Deng as co-maintainerJie Deng
Jose Abreu is working on this driver and I will leave Synopsys soon. Thus it does not seem appropriate for me to be a co-maintainer anymore. Signed-off-by: Jie Deng <jiedeng@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>