summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2018-04-02Merge tag 'arch-removal' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pul removal of obsolete architecture ports from Arnd Bergmann: "This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users. In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees. [ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ] The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases. After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
2018-04-02Merge branch 'x86-dma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 dma mapping updates from Ingo Molnar: "This tree, by Christoph Hellwig, switches over the x86 architecture to the generic dma-direct and swiotlb code, and also unifies more of the dma-direct code between architectures. The now unused x86-only primitives are removed" * 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPS dma/swiotlb: Remove swiotlb_{alloc,free}_coherent() dma/direct: Handle force decryption for DMA coherent buffers in common code dma/direct: Handle the memory encryption bit in common code dma/swiotlb: Remove swiotlb_set_mem_attributes() set_memory.h: Provide set_memory_{en,de}crypted() stubs x86/dma: Remove dma_alloc_coherent_gfp_flags() iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent() iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}() x86/dma/amd_gart: Use dma_direct_{alloc,free}() x86/dma/amd_gart: Look at dev->coherent_dma_mask instead of GFP_DMA x86/dma: Use generic swiotlb_ops x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y) x86/dma: Remove dma_alloc_coherent_mask()
2018-04-02Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in the locking subsystem in this cycle were: - Add the Linux Kernel Memory Consistency Model (LKMM) subsystem, which is an an array of tools in tools/memory-model/ that formally describe the Linux memory coherency model (a.k.a. Documentation/memory-barriers.txt), and also produce 'litmus tests' in form of kernel code which can be directly executed and tested. Here's a high level background article about an earlier version of this work on LWN.net: https://lwn.net/Articles/718628/ The design principles: "There is reason to believe that Documentation/memory-barriers.txt could use some help, and a major purpose of this patch is to provide that help in the form of a design-time tool that can produce all valid executions of a small fragment of concurrent Linux-kernel code, which is called a "litmus test". This tool's functionality is roughly similar to a full state-space search. Please note that this is a design-time tool, not useful for regression testing. However, we hope that the underlying Linux-kernel memory model will be incorporated into other tools capable of analyzing large bodies of code for regression-testing purposes." [...] "A second tool is klitmus7, which converts litmus tests to loadable kernel modules for direct testing. As with herd7, the klitmus7 code is freely available from http://diy.inria.fr/sources/index.html (and via "git" at https://github.com/herd/herdtools7)" [...] Credits go to: "This patch was the result of a most excellent collaboration founded by Jade Alglave and also including Alan Stern, Andrea Parri, and Luc Maranget." ... and to the gents listed in the MAINTAINERS entry: LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM) M: Alan Stern <stern@rowland.harvard.edu> M: Andrea Parri <parri.andrea@gmail.com> M: Will Deacon <will.deacon@arm.com> M: Peter Zijlstra <peterz@infradead.org> M: Boqun Feng <boqun.feng@gmail.com> M: Nicholas Piggin <npiggin@gmail.com> M: David Howells <dhowells@redhat.com> M: Jade Alglave <j.alglave@ucl.ac.uk> M: Luc Maranget <luc.maranget@inria.fr> M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> The LKMM project already found several bugs in Linux locking primitives and improved the understanding and the documentation of the Linux memory model all around. - Add KASAN instrumentation to atomic APIs (Dmitry Vyukov) - Add RWSEM API debugging and reorganize the lock debugging Kconfig (Waiman Long) - ... misc cleanups and other smaller changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) locking/Kconfig: Restructure the lock debugging menu locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches lockdep: Make the lock debug output more useful locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() locking/atomic, asm-generic, x86: Add comments for atomic instrumentation locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference() tools/memory-model: Add documentation of new litmus test tools/memory-model: Remove mention of docker/gentoo image locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more locking/lockdep: Show unadorned pointers mutex: Drop linkage.h from mutex.h tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference tools/memory-model: Convert underscores to hyphens tools/memory-model: Add a S lock-based external-view litmus test tools/memory-model: Add required herd7 version to README file ...
2018-04-02Merge branch 'core-debugobjects-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull debugobjects updates from Ingo Molnar: "Misc improvements: - add better instrumentation/debugging - optimize the freeing logic improve performance" * 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: debugobjects: Avoid another unused variable warning debugobjects: Fix debug_objects_freed accounting debugobjects: Use global free list in __debug_check_no_obj_freed() debugobjects: Use global free list in free_object() debugobjects: Add global free list and the counter debugobjects: Export max loops counter
2018-03-31locking/Kconfig: Restructure the lock debugging menuWaiman Long
Two config options in the lock debugging menu that are probably the most frequently used, as far as I am concerned, is the PROVE_LOCKING and LOCK_STAT. From a UI perspective, they should be front and center. So these two options are now moved to the top of the lock debugging menu. The DEBUG_WW_MUTEX_SLOWPATH option is also added to the PROVE_LOCKING umbrella. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1522445280-7767-4-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readableWaiman Long
There are a couples of lock debugging Kconfig options that depends on the following support options: - TRACE_IRQFLAGS_SUPPORT - STACKTRACE_SUPPORT - LOCKDEP_SUPPORT That makes those lock debugging options harder to read and understand. So a new LOCK_DEBUGGING_SUPPORT option is added that is equivalent to the above three options together. That makes the Kconfig.debug file more readable. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1522445280-7767-3-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatchesWaiman Long
For a rwsem, locking can either be exclusive or shared. The corresponding exclusive or shared unlock must be used. Otherwise, the protected data structures may get corrupted or the lock may be in an inconsistent state. In order to detect such anomaly, a new configuration option DEBUG_RWSEMS is added which can be enabled to look for such mismatches and print warnings that that happens. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1522445280-7767-2-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-28dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrsChristoph Hellwig
Revert the clearing of __GFP_ZERO in dma_alloc_attrs and move it to dma_direct_alloc for now. While most common architectures always zero dma cohereny allocations (and x86 did so since day one) this is not documented and at least arc and s390 do not zero without the explicit __GFP_ZERO argument. Fixes: 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in common code") Reported-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com> Cc: iommu@lists.linux-foundation.org Link: https://lkml.kernel.org/r/20180328133535.17302-2-hch@lst.de
2018-03-26raid: remove tile specific raid6 implementationArnd Bergmann
The Tile architecture is getting removed, so we no longer need this either. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26treewide: simplify Kconfig dependencies for removed archsArnd Bergmann
A lot of Kconfig symbols have architecture specific dependencies. In those cases that depend on architectures we have already removed, they can be omitted. Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-24Merge branch 'linus' into x86/dma, to resolve a conflict with upstreamIngo Molnar
Conflicts: arch/x86/mm/init_64.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-23swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPSChristoph Hellwig
Otherwise this causes unused symbol warnings for configs that build swiotlb.c only for use by xen-swiotlb.c and that don't otherwise select CONFIG_DMA_DIRECT_OPS, which is possible on arm. Fixes: 16e73adbca76 ("dma/swiotlb: Remove swiotlb_{alloc,free}_coherent()") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: iommu@lists.linux-foundation.org Cc: konrad.wilk@oracle.com Link: https://lkml.kernel.org/r/20180323174930.17767-1-hch@lst.de
2018-03-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, thp: do not cause memcg oom for thp mm/vmscan: wake up flushers for legacy cgroups too Revert "mm: page_alloc: skip over regions of invalid pfns where possible" mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink() mm/thp: do not wait for lock_page() in deferred_split_scan() mm/khugepaged.c: convert VM_BUG_ON() to collapse fail x86/mm: implement free pmd/pte page interfaces mm/vmalloc: add interfaces to free unmapped page table h8300: remove extraneous __BIG_ENDIAN definition hugetlbfs: check for pgoff value overflow lockdep: fix fs_reclaim warning MAINTAINERS: update Mark Fasheh's e-mail mm/mempolicy.c: avoid use uninitialized preferred_node
2018-03-22mm/vmalloc: add interfaces to free unmapped page tableToshi Kani
On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may create pud/pmd mappings. A kernel panic was observed on arm64 systems with Cortex-A75 in the following steps as described by Hanjun Guo. 1. ioremap a 4K size, valid page table will build, 2. iounmap it, pte0 will set to 0; 3. ioremap the same address with 2M size, pgd/pmd is unchanged, then set the a new value for pmd; 4. pte0 is leaked; 5. CPU may meet exception because the old pmd is still in TLB, which will lead to kernel panic. This panic is not reproducible on x86. INVLPG, called from iounmap, purges all levels of entries associated with purged address on x86. x86 still has memory leak. The patch changes the ioremap path to free unmapped page table(s) since doing so in the unmap path has the following issues: - The iounmap() path is shared with vunmap(). Since vmap() only supports pte mappings, making vunmap() to free a pte page is an overhead for regular vmap users as they do not need a pte page freed up. - Checking if all entries in a pte page are cleared in the unmap path is racy, and serializing this check is expensive. - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges. Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB purge. Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which clear a given pud/pmd entry and free up a page for the lower level entries. This patch implements their stub functions on x86 and arm64, which work as workaround. [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub] Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings") Reported-by: Lei Li <lious.lilei@hisilicon.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Wang Xuefeng <wxf.wang@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Always validate XFRM esn replay attribute, from Florian Westphal. 2) Fix RCU read lock imbalance in xfrm_get_tos(), from Xin Long. 3) Don't try to get firmware dump if not loaded in iwlwifi, from Shaul Triebitz. 4) Fix BPF helpers to deal with SCTP GSO SKBs properly, from Daniel Axtens. 5) Fix some interrupt handling issues in e1000e driver, from Benjamin Poitier. 6) Use strlcpy() in several ethtool get_strings methods, from Florian Fainelli. 7) Fix rhlist dup insertion, from Paul Blakey. 8) Fix SKB leak in netem packet scheduler, from Alexey Kodanev. 9) Fix driver unload crash when link is up in smsc911x, from Jeremy Linton. 10) Purge out invalid socket types in l2tp_tunnel_create(), from Eric Dumazet. 11) Need to purge the write queue when TCP connections are aborted, otherwise userspace using MSG_ZEROCOPY can't close the fd. From Soheil Hassas Yeganeh. 12) Fix double free in error path of team driver, from Arkadi Sharshevsky. 13) Filter fixes for hv_netvsc driver, from Stephen Hemminger. 14) Fix non-linear packet access in ipv6 ndisc code, from Lorenzo Bianconi. 15) Properly filter out unsupported feature flags in macvlan driver, from Shannon Nelson. 16) Don't request loading the diag module for a protocol if the protocol itself is not even registered. From Xin Long. 17) If datagram connect fails in ipv6, make sure the socket state is consistent afterwards. From Paolo Abeni. 18) Use after free in qed driver, from Dan Carpenter. 19) If received ipv4 PMTU is less than the min pmtu, lock the mtu in the entry. From Sabrina Dubroca. 20) Fix sleep in atomic in tg3 driver, from Jonathan Toppins. 21) Fix vlan in vlan untagging in some situations, from Toshiaki Makita. 22) Fix double SKB free in genlmsg_mcast(). From Nicolas Dichtel. 23) Fix NULL derefs in error paths of tcf_*_init(), from Davide Caratti. 24) Unbalanced PM runtime calls in FEC driver, from Florian Fainelli. 25) Memory leak in gemini driver, from Igor Pylypiv. 26) IDR leaks in error paths of tcf_*_init() functions, from Davide Caratti. 27) Need to use GFP_ATOMIC in seg6_build_state(), from David Lebrun. 28) Missing dev_put() in error path of macsec_newlink(), from Dan Carpenter. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (201 commits) macsec: missing dev_put() on error in macsec_newlink() net: dsa: Fix functional dsa-loop dependency on FIXED_PHY hv_netvsc: common detach logic hv_netvsc: change GPAD teardown order on older versions hv_netvsc: use RCU to fix concurrent rx and queue changes hv_netvsc: disable NAPI before channel close net/ipv6: Handle onlink flag with multipath routes ppp: avoid loop in xmit recursion detection code ipv6: sr: fix NULL pointer dereference when setting encap source address ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state net: aquantia: driver version bump net: aquantia: Implement pci shutdown callback net: aquantia: Allow live mac address changes net: aquantia: Add tx clean budget and valid budget handling logic net: aquantia: Change inefficient wait loop on fw data reads net: aquantia: Fix a regression with reset on old firmware net: aquantia: Fix hardware reset when SPI may rarely hangup s390/qeth: on channel error, reject further cmd requests s390/qeth: lock read device while queueing next buffer s390/qeth: when thread completes, wake up all waiters ...
2018-03-20test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other archesThadeu Lima de Souza Cascardo
Function bpf_fill_maxinsns11 is designed to not be able to be JITed on x86_64. So, it fails when CONFIG_BPF_JIT_ALWAYS_ON=y, and commit 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y") makes sure that failure is detected on that case. However, it does not fail on other architectures, which have a different JIT compiler design. So, test_bpf has started to fail to load on those. After this fix, test_bpf loads fine on both x86_64 and ppc64el. Fixes: 09584b406742 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-20dma/swiotlb: Remove swiotlb_{alloc,free}_coherent()Christoph Hellwig
Unused now that everyone uses swiotlb_{alloc,free}(). Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-15-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20dma/direct: Handle force decryption for DMA coherent buffers in common codeChristoph Hellwig
With that in place the generic DMA-direct routines can be used to allocate non-encrypted bounce buffers, and the x86 SEV case can use the generic swiotlb ops including nice features such as using CMA allocations. Note that I'm not too happy about using sev_active() in DMA-direct, but I couldn't come up with a good enough name for a wrapper to make it worth adding. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-14-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20dma/direct: Handle the memory encryption bit in common codeChristoph Hellwig
Give the basic phys_to_dma() and dma_to_phys() helpers a __-prefix and add the memory encryption mask to the non-prefixed versions. Use the __-prefixed versions directly instead of clearing the mask again in various places. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-13-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20dma/swiotlb: Remove swiotlb_set_mem_attributes()Christoph Hellwig
Now that set_memory_decrypted() is always available we can just call it directly. Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jon Mason <jdmason@kudzu.us> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Muli Ben-Yehuda <mulix@mulix.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20180319103826.12853-12-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-19Merge branch 'for-4.16-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu fixes from Tejun Heo: "Late percpu pull request for v4.16-rc6. - percpu allocator pool replenishing no longer triggers OOM or warning messages. Also, the alloc interface now understands __GFP_NORETRY and __GFP_NOWARN. This is to allow avoiding OOMs from userland triggered actions like bpf map creation. Also added cond_resched() in alloc loop. - perpcu allocation now can be interrupted by kill sigs to avoid deadlocking OOM killer. - Added Dennis Zhou as a co-maintainer. He has rewritten the area map allocator, understands most of the code base and has been responsive for all bug reports" * 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn() percpu: include linux/sched.h for cond_resched() percpu: add a schedule point in pcpu_balance_workfn() percpu: allow select gfp to be passed to underlying allocators percpu: add __GFP_NORETRY semantics to the percpu balancing path percpu: match chunk allocator declarations with definitions percpu: add Dennis Zhou as a percpu co-maintainer
2018-03-19percpu_ref: Update doc to dissuade users from depending on internal RCU ↵Tejun Heo
grace periods percpu_ref internally uses sched-RCU to implement the percpu -> atomic mode switching and the documentation suggested that this could be depended upon. This doesn't seem like a good idea. * percpu_ref uses sched-RCU which has different grace periods regular RCU. Users may combine percpu_ref with regular RCU usage and incorrectly believe that regular RCU grace periods are performed by percpu_ref. This can lead to, for example, use-after-free due to premature freeing. * percpu_ref has a grace period when switching from percpu to atomic mode. It doesn't have one between the last put and release. This distinction is subtle and can lead to surprising bugs. * percpu_ref allows starting in and switching to atomic mode manually for debugging and other purposes. This means that there may not be any grace periods from kill to release. This patch makes it clear that the grace periods are percpu_ref's internal implementation detail and can't be depended upon by the users. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-14btree: avoid variable-length allocationsJoern Engel
geo->keylen cannot be larger than 4. So we might as well make fixed-size allocations. Given the one remaining user, geo->keylen cannot even be larger than 1. Logfs used to have 64bit and 128bit keys, tcm_qla2xxx only has 32bit keys. But let's not break the code if we don't have to. Signed-off-by: Joern Engel <joern@purestorage.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-14debugobjects: Avoid another unused variable warningArnd Bergmann
debug_objects_maxchecked is only updated in __debug_check_no_obj_freed(), and only read in debug_objects_maxchecked, unfortunately both of these are optional and depend on different Kconfig symbols. When both CONFIG_DEBUG_OBJECTS_FREE and CONFIG_DEBUG_FS are disabled this warning is emitted: lib/debugobjects.c:56:14: error: 'debug_objects_maxchecked' defined but not used [-Werror=unused-variable] Rather than trying to add more complex #ifdef protections, mark the variable as __maybe_unused so it can be silently dropped when usused. Fixes: bd9dcd046509 ("debugobjects: Export max loops counter") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20180313131857.158876-1-arnd@arndb.de
2018-03-09lib/test_kmod.c: fix limit check on number of test devices createdLuis R. Rodriguez
As reported by Dan the parentheses is in the wrong place, and since unlikely() call returns either 0 or 1 it's never less than zero. The second issue is that signed integer overflows like "INT_MAX + 1" are undefined behavior. Since num_test_devs represents the number of devices, we want to stop prior to hitting the max, and not rely on the wrap arround at all. So just cap at num_test_devs + 1, prior to assigning a new device. Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()Kees Cook
Commit b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") changed the ordering of fixups, and did not take into account the case of x86 processing non-WARN() and non-BUG() exceptions. This would lead to output of a false BUG line with no other information. In the case of a refcount exception, it would be immediately followed by the refcount WARN(), producing very strange double-"cut here": lkdtm: attempting bad refcount_inc() overflow ------------[ cut here ]------------ Kernel BUG at 0000000065f29de5 [verbose debug info unavailable] ------------[ cut here ]------------ refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0 WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4 ... In the prior ordering, exceptions were searched first: do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str, ... if (fixup_exception(regs, trapnr)) return 0; - if (fixup_bug(regs, trapnr)) - return 0; - As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account needing to search the exception list first, since that had already happened. So, instead of searching the exception list twice (once in is_valid_bugaddr() and then again in fixup_exception()), just add a simple sanity check to report_bug() that will immediately bail out if a BUG() (or WARN()) entry is not found. Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast Fixes: b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09bug: use %pB in BUG and stack protector failureKees Cook
The BUG and stack protector reports were still using a raw %p. This changes it to %pB for more meaningful output. Link: http://lkml.kernel.org/r/20180301225704.GA34198@beast Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Weinberger <richard.weinberger@gmail.com>, Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09mn10300: Remove the architectureDavid Howells
Remove the MN10300 arch as the hardware is defunct. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com> cc: Masahiro Yamada <yamada.masahiro@socionext.com> cc: linux-am33-list@redhat.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-07Merge tag 'metag_remove_2' of ↵Arnd Bergmann
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag into asm-generic Remove metag architecture These patches remove the metag architecture and tightly dependent drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4 based metag toolchain we have been using is hitting compiler bugs, so now seems a good time to drop it altogether. * tag 'metag_remove_2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: i2c: img-scb: Drop METAG dependency media: img-ir: Drop METAG dependency watchdog: imgpdc: Drop METAG dependency MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE tty: Remove metag DA TTY and console driver clocksource: Remove metag generic timer driver irqchip: Remove metag irqchip drivers Drop a bunch of metag references docs: Remove remaining references to metag docs: Remove metag docs metag: Remove arch/metag/ Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-07test_rhashtable: add test case for rhltable with duplicate objectsPaul Blakey
Tries to insert duplicates in the middle of bucket's chain: bucket 1: [[val 21 (tid=1)]] -> [[ val 1 (tid=2), val 1 (tid=0) ]] Reuses tid to distinguish the elements insertion order. Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07rhashtable: Fix rhlist duplicates insertionPaul Blakey
When inserting duplicate objects (those with the same key), current rhlist implementation messes up the chain pointers by updating the bucket pointer instead of prev next pointer to the newly inserted node. This causes missing elements on removal and travesal. Fix that by properly updating pprev pointer to point to the correct rhash_head next pointer. Issue: 1241076 Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7 Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface') Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use an appropriate TSQ pacing shift in mac80211, from Toke Høiland-Jørgensen. 2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk in ip6_route_me_harder, from Eric Dumazet. 3) Fix several shutdown races and similar other problems in l2tp, from James Chapman. 4) Handle missing XDP flush properly in tuntap, for real this time. From Jason Wang. 5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel Borkmann. 6) Fix phy_resume() locking, from Andrew Lunn. 7) IFLA_MTU values are ignored on newlink for some tunnel types, fix from Xin Long. 8) Revert F-RTO middle box workarounds, they only handle one dimension of the problem. From Yuchung Cheng. 9) Fix socket refcounting in RDS, from Ka-Cheong Poon. 10) Don't allow ppp unit registration to an unregistered channel, from Guillaume Nault. 11) Various hv_netvsc fixes from Stephen Hemminger. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits) hv_netvsc: propagate rx filters to VF hv_netvsc: filter multicast/broadcast hv_netvsc: defer queue selection to VF hv_netvsc: use napi_schedule_irqoff hv_netvsc: fix race in napi poll when rescheduling hv_netvsc: cancel subchannel setup before halting device hv_netvsc: fix error unwind handling if vmbus_open fails hv_netvsc: only wake transmit queue if link is up hv_netvsc: avoid retry on send during shutdown virtio-net: re enable XDP_REDIRECT for mergeable buffer ppp: prevent unregistered channels from connecting to PPP units tc-testing: skbmod: fix match value of ethertype mlxsw: spectrum_switchdev: Check success of FDB add operation net: make skb_gso_*_seglen functions private net: xfrm: use skb_gso_validate_network_len() to check gso sizes net: sched: tbf: handle GSO_BY_FRAGS case in enqueue net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len rds: Incorrect reference counting in TCP socket creation net: ethtool: don't ignore return from driver get_fecparam method vrf: check forwarding on the original netdevice when generating ICMP dest unreachable ...
2018-03-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-02-28 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Add schedule points and reduce the number of loop iterations the test_bpf kernel module is performing in order to not hog the CPU for too long, from Eric. 2) Fix an out of bounds access in tail calls in the ppc64 BPF JIT compiler, from Daniel. 3) Fix a crash on arm64 on unaligned BPF xadd operations that could be triggered via interpreter and JIT, from Daniel. Please not that once you merge net into net-next at some point, there is a minor merge conflict in test_verifier.c since test cases had been added at the end in both trees. Resolution is trivial: keep all the test cases from both trees. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28Merge tag 'dma-mapping-4.16-3' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fix from Christoph Hellwig: "A single fix for a memory leak regression in the dma-debug code" * tag 'dma-mapping-4.16-3' of git://git.infradead.org/users/hch/dma-mapping: dma-debug: fix memory leak in debug_dma_alloc_coherent
2018-02-28test_bpf: reduce MAX_TESTRUNSEric Dumazet
For tests that are using the maximal number of BPF instruction, each run takes 20 usec. Looping 10,000 times on them totals 200 ms, which is bad when the loop is not preemptible. test_bpf: #264 BPF_MAXINSNS: Call heavy transformations jited:1 19248 18548 PASS test_bpf: #269 BPF_MAXINSNS: ld_abs+get_processor_id jited:1 20896 PASS Lets divide by ten the number of iterations, so that max latency is 20ms. We could use need_resched() to break the loop earlier if we believe 20 ms is too much. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-26test_bpf: add a schedule pointEric Dumazet
test_bpf() is taking 1.6 seconds nowadays, it is time to add a schedule point in it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-26idr: Fix handling of IDs above INT_MAXMatthew Wilcox
Khalid reported that the kernel selftests are currently failing: selftests: test_bpf.sh ======================================== test_bpf: [FAIL] not ok 1..8 selftests: test_bpf.sh [FAIL] He bisected it to 6ce711f2750031d12cec91384ac5cfa0a485b60a ("idr: Make 1-based IDRs more efficient"). The root cause is doing a signed comparison in idr_alloc_u32() instead of an unsigned comparison. I went looking for any similar problems and found a couple (which would each result in the failure to warn in two situations that aren't supposed to happen). I knocked up a few test-cases to prove that I was right and added them to the test-suite. Reported-by: Khalid Aziz <khalid.aziz@oracle.com> Tested-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk fixlet from Petr Mladek: "People expect to see the real pointer value for %px. Let's substitute '(null)' only for the other %p? format modifiers that need to deference the pointer" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: vsprintf: avoid misleading "(null)" for %px
2018-02-23Drop a bunch of metag referencesJames Hogan
Now that arch/metag/ has been removed, drop a bunch of metag references in various codes across the whole tree: - VM_GROWSUP and __VM_ARCH_SPECIFIC_1. - MT_METAG_* ELF note types. - METAG Kconfig dependencies (FRAME_POINTER) and ranges (MAX_STACK_SIZE_MB). - metag cases in tools (checkstack.pl, recordmcount.c, perf). Signed-off-by: James Hogan <jhogan@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linux-mm@kvack.org Cc: linux-metag@vger.kernel.org
2018-02-22dma-debug: fix memory leak in debug_dma_alloc_coherentMiles Chen
Marty reported a memory leakage introduced by commit 3aaabbf1c39e ("lib/dma-debug.c: fix incorrect pfn calculation"). Fix it by checking the virtual address before allocating the entry. This patch also use virt_addr_valid() instead of virt_to_page() to check if a virtual address is linear. Fixes: 3aaabbf1 ("lib/dma-debug.c: fix incorrect pfn calculation") Reported-by: Marty Faltesek <mfaltesek@google.com> Signed-off-by: Miles Chen <miles.chen@mediatek.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-02-22debugobjects: Fix debug_objects_freed accountingArnd Bergmann
The removal of the batched object freeing has caused the debug_objects_freed to become read-only, and the reading is inside an ifdef, so gcc warns that it is completely unused without CONFIG_DEBUG_FS: lib/debugobjects.c:71:14: error: 'debug_objects_freed' defined but not used [-Werror=unused-variable] Assuming we are still interested in this number, this adds back code to keep track of the freed objects. Fixes: 636e1970fd7d ("debugobjects: Use global free list in free_object()") Suggested-by: Waiman Long <longman@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Yang Shi <yang.shi@linux.alibaba.com> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20180222155335.1647466-1-arnd@arndb.de
2018-02-21lib/Kconfig.debug: enable RUNTIME_TESTING_MENUAnders Roxell
Commit d3deafaa8b5c ("lib/: make RUNTIME_TESTS a menuconfig to ease disabling it all") causes a regression when using runtime tests due to it defaults RUNTIME_TESTING_MENU to not set. Link: http://lkml.kernel.org/r/20180214133015.10090-1-anders.roxell@linaro.org Fixes: d3deafaa8b5c ("lib/: make RUNTIME_TESTS a menuconfig to easedisabling it all") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Cc: Vincent Legoll <vincent.legoll@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21ida: do zeroing in ida_pre_get()Rasmus Villemoes
As far as I can tell, the only place the per-cpu ida_bitmap is populated is in ida_pre_get. The pre-allocated element is stolen in two places in ida_get_new_above, in both cases immediately followed by a memset(0). Since ida_get_new_above is called with locks held, do the zeroing in ida_pre_get, or rather let kmalloc() do it. Also, apparently gcc generates ~44 bytes of code to do a memset(, 0, 128): $ scripts/bloat-o-meter vmlinux.{0,1} add/remove: 0/0 grow/shrink: 2/1 up/down: 5/-88 (-83) Function old new delta ida_pre_get 115 119 +4 vermagic 27 28 +1 ida_get_new_above 715 627 -88 Link: http://lkml.kernel.org/r/20180108225634.15340-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: Eric Biggers <ebiggers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-13debugobjects: Use global free list in __debug_check_no_obj_freed()Yang Shi
__debug_check_no_obj_freed() iterates over the to be freed memory region in chunks and iterates over the corresponding hash bucket list for each chunk. This can accumulate to hundred thousands of checked objects. In the worst case this can trigger the soft lockup detector: NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s! CPU: 15 PID: 110342 Comm: stress-ng-getde Call Trace: [<ffffffff8141177e>] debug_check_no_obj_freed+0x13e/0x220 [<ffffffff811f8751>] __free_pages_ok+0x1f1/0x5c0 [<ffffffff811fa785>] __free_pages+0x25/0x40 [<ffffffff812638db>] __free_slab+0x19b/0x270 [<ffffffff812639e9>] discard_slab+0x39/0x50 [<ffffffff812679f7>] __slab_free+0x207/0x270 [<ffffffff81269966>] ___cache_free+0xa6/0xb0 [<ffffffff8126c267>] qlist_free_all+0x47/0x80 [<ffffffff8126c5a9>] quarantine_reduce+0x159/0x190 [<ffffffff8126b3bf>] kasan_kmalloc+0xaf/0xc0 [<ffffffff8126b8a2>] kasan_slab_alloc+0x12/0x20 [<ffffffff81265e8a>] kmem_cache_alloc+0xfa/0x360 [<ffffffff812abc8f>] ? getname_flags+0x4f/0x1f0 [<ffffffff812abc8f>] getname_flags+0x4f/0x1f0 [<ffffffff812abe42>] getname+0x12/0x20 [<ffffffff81298da9>] do_sys_open+0xf9/0x210 [<ffffffff81298ede>] SyS_open+0x1e/0x20 [<ffffffff817d6e01>] entry_SYSCALL_64_fastpath+0x1f/0xc2 The code path might be called in either atomic or non-atomic context, but in_atomic() can't tell if the current context is atomic or not on a PREEMPT=n kernel, so cond_resched() can't be used to prevent the softlockup. Utilize the global free list to shorten the loop execution time. [ tglx: Massaged changelog ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: longman@redhat.com Link: https://lkml.kernel.org/r/1517872708-24207-5-git-send-email-yang.shi@linux.alibaba.com
2018-02-13debugobjects: Use global free list in free_object()Yang Shi
The newly added global free list allows to avoid lengthy pool_list iterations in free_obj_work() by putting objects either into the pool list when the fill level of the pool is below the maximum or by putting them on the global free list immediately. As the pool is now guaranteed to never exceed the maximum fill level this allows to remove the batch removal from pool list in free_obj_work(). Split free_object() into two parts, so the actual queueing function can be reused without invoking schedule_work() on every invocation. [ tglx: Remove the batch removal from pool list and massage changelog ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: longman@redhat.com Link: https://lkml.kernel.org/r/1517872708-24207-4-git-send-email-yang.shi@linux.alibaba.com
2018-02-13debugobjects: Add global free list and the counterYang Shi
free_object() adds objects to the pool list and schedules work when the pool list is larger than the pool size. The worker handles the actual kfree() of the object by iterating the pool list until the pool size is below the maximum pool size again. To iterate the pool list, pool_lock has to be held and the objects which should be freed() need to be put into temporary storage so pool_lock can be dropped for the actual kmem_cache_free() invocation. That's a pointless and expensive exercise if there is a large number of objects to free. In such a case its better to evaulate the fill level of the pool in free_objects() and queue the object to free either in the pool list or if it's full on a separate global free list. The worker can then do the following simpler operation: - Move objects back from the global free list to the pool list if the pool list is not longer full. - Remove the remaining objects in a single list move operation from the global free list and do the kmem_cache_free() operation lockless from the temporary list head. In fill_pool() the global free list is checked as well to avoid real allocations from the kmem cache. Add the necessary list head and a counter for the number of objects on the global free list and export that counter via sysfs: max_chain :79 max_loops :8147 warnings :0 fixups :0 pool_free :1697 pool_min_free :346 pool_used :15356 pool_max_used :23933 on_free_list :39 objs_allocated:32617 objs_freed :16588 Nothing queues objects on the global free list yet. This happens in a follow up change. [ tglx: Simplified implementation and massaged changelog ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: longman@redhat.com Link: https://lkml.kernel.org/r/1517872708-24207-3-git-send-email-yang.shi@linux.alibaba.com
2018-02-13debugobjects: Export max loops counterYang Shi
__debug_check_no_obj_freed() can be an expensive operation depending on the size of memory freed. It already exports the maximum chain walk length via debugfs, but this only records the maximum of a single memory chunk. Though there is no information about the total number of objects inspected for a __debug_check_no_obj_freed() operation, which might be significantly larger when a huge memory region is freed. Aggregate the number of objects inspected for a single invocation of __debug_check_no_obj_freed() and export it via sysfs. The resulting output of /sys/kernel/debug/debug_objects/stats looks like: max_chain :121 max_checked :543267 warnings :0 fixups :0 pool_free :1764 pool_min_free :341 pool_used :86438 pool_max_used :268887 objs_allocated:6068254 objs_freed :5981076 [ tglx: Renamed the variable to max_checked and adjusted changelog ] Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: longman@redhat.com Link: https://lkml.kernel.org/r/1517872708-24207-2-git-send-email-yang.shi@linux.alibaba.com
2018-02-12dma-direct: comment the dma_direct_free calling conventionChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-02-12dma-direct: mark as is_physChristoph Hellwig
Various PCI_DMA_BUS_IS_PHYS implementations rely on this flag to make proper decisions for block and networking addressability. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-02-09Merge tag 'kbuild-v4.16-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: "Makefile changes: - enable unused-variable warning that was wrongly disabled for clang Kconfig changes: - warn about blank 'help' and fix existing instances - fix 'choice' behavior to not write out invisible symbols - fix misc weirdness Coccinell changes: - fix false positive of free after managed memory alloc detection - improve performance of NULL dereference detection" * tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits) kconfig: remove const qualifier from sym_expand_string_value() kconfig: add xrealloc() helper kconfig: send error messages to stderr kconfig: echo stdin to stdout if either is redirected kconfig: remove check_stdin() kconfig: remove 'config*' pattern from .gitignnore kconfig: show '?' prompt even if no help text is available kconfig: do not write choice values when their dependency becomes n coccinelle: deref_null: avoid useless computation coccinelle: devm_free: reduce false positives kbuild: clang: disable unused variable warnings only when constant kconfig: Warn if help text is blank nios2: kconfig: Remove blank help text arm: vt8500: kconfig: Remove blank help text MIPS: kconfig: Remove blank help text MIPS: BCM63XX: kconfig: Remove blank help text lib/Kconfig.debug: Remove blank help text Staging: rtl8192e: kconfig: Remove blank help text Staging: rtl8192u: kconfig: Remove blank help text mmc: kconfig: Remove blank help text ...