git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-07-23	x86/mm/cpa: Add missing comment in populate_pdg()	Andy Lutomirski
	In commit: 21cbc2822aa1 ("x86/mm/cpa: Unbreak populate_pgd(): stop trying to deallocate failed PUDs") I intended to add this comment, but I failed at using git. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/242baf8612394f4e31216f96d13c4d2e9b90d1b7.1469293159.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-23	x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs	Andy Lutomirski
	Valdis Kletnieks bisected a boot failure back to this recent commit: 360cb4d15567 ("x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated") I broke the case where a PUD table got allocated -- populate_pud() would wander off a pgd_none entry and get lost. I'm not sure how this survived my testing. Fix the original issue in a much simpler way. The problem was that, if we allocated a PUD table, failed to populate it, and freed it, another CPU could potentially keep using the PGD entry we installed (either by copying it via vmalloc_fault or by speculatively caching it). There's a straightforward fix: simply leave the top-level entry in place if this happens. This can't waste any significant amount of memory -- there are at most 256 entries like this systemwide and, as a practical matter, if we hit this failure path repeatedly, we're likely to reuse the same page anyway. For context, this is a reversion with this hunk added in: if (ret < 0) { + /* + * Leave the PUD page in place in case some other CPU or thread + * already found it, but remove any useless entries we just + * added to it. + */ - unmap_pgd_range(cpa->pgd, addr, + unmap_pud_range(pgd_entry, addr, addr + (cpa->numpages << PAGE_SHIFT)); return ret; } This effectively open-codes what the now-deleted unmap_pgd_range() function used to do except that unmap_pgd_range() used to try to free the page as well. Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Mike Krinkin <krinkin.m.u@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/21cbc2822aa18aa812c0215f4231dbf5f65afa7f.1469249789.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-23	arm64: KVM: VHE: Context switch MDSCR_EL1	Marc Zyngier
	The kprobe enablement work has uncovered that changes made by a guest to MDSCR_EL1 were propagated to the host when VHE was enabled, leading to unexpected exception being delivered. Moving this register to the list of registers that are always context-switched fixes the issue. Fixes: 9c6c35683286 ("arm64: KVM: VHE: Split save/restore of registers shared between guest and host") Cc: stable@vger.kernel.org #4.6 Reported-by: Tirumalesh Chalamarla <Tirumalesh.Chalamarla@cavium.com> Tested-by: Tirumalesh Chalamarla <Tirumalesh.Chalamarla@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-07-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: 1) Fix memory leak in nftables, from Liping Zhang. 2) Need to check result of vlan_insert_tag() in batman-adv otherwise we risk NULL skb derefs, from Sven Eckelmann. 3) Check for dev_alloc_skb() failures in cfg80211, from Gregory Greenman. 4) Handle properly when we have ppp_unregister_channel() happening in parallel with ppp_connect_channel(), from WANG Cong. 5) Fix DCCP deadlock, from Eric Dumazet. 6) Bail out properly in UDP if sk_filter() truncates the packet to be smaller than even the space that the protocol headers need. From Michal Kubecek. 7) Similarly for rose, dccp, and sctp, from Willem de Bruijn. 8) Make TCP challenge ACKs less predictable, from Eric Dumazet. 9) Fix infinite loop in bgmac_dma_tx_add() from Florian Fainelli. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits) packet: propagate sock_cmsg_send() error net/mlx5e: Fix del vxlan port command buffer memset packet: fix second argument of sock_tx_timestamp() net: switchdev: change ageing_time type to clock_t Update maintainer for EHEA driver. net/mlx4_en: Add resilience in low memory systems net/mlx4_en: Move filters cleanup to a proper location sctp: load transport header after sk_filter net/sched/sch_htb: clamp xstats tokens to fit into 32-bit int net: cavium: liquidio: Avoid dma_unmap_single on uninitialized ndata net: nb8800: Fix SKB leak in nb8800_receive() et131x: Fix logical vs bitwise check in et131x_tx_timeout() vlan: use a valid default mtu value for vlan over macsec net: bgmac: Fix infinite loop in bgmac_dma_tx_add() mlxsw: spectrum: Prevent invalid ingress buffer mapping mlxsw: spectrum: Prevent overwrite of DCB capability fields mlxsw: spectrum: Don't emit errors when PFC is disabled mlxsw: spectrum: Indicate support for autonegotiation mlxsw: spectrum: Force link training according to admin state r8152: add MODULE_VERSION ...
2016-07-23	Merge branch 'overlayfs-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "This contains a fix for a potential crash/corruption issue and another where the suid/sgid bits weren't cleared on write" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: verify upper dentry in ovl_remove_and_whiteout() ovl: Copy up underlying inode's ->i_mode to overlay inode ovl: handle ATTR_KILL*
2016-07-23	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge misc fixes from Andrew Morton: "Five fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: pps: do not crash when failed to register tools/vm/slabinfo: fix an unintentional printf testing/radix-tree: fix a macro expansion bug radix-tree: fix radix_tree_iter_retry() for tagged iterators. mm: memcontrol: fix cgroup creation failure after many small jobs
2016-07-23	Merge tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of ↵	Linus Torvalds
	git://people.freedesktop.org/~airlied/linux Pull intel kabylake drm fixes from Dave Airlie: "As mentioned Intel has gathered all the Kabylake fixes from -next, which we've enabled in 4.7 for the first time, these are pretty much limited in scope to only affects kabylake, which is hw that isn't shipping yet. So I'm mostly okay with it going in now. If we don't land this, it might be a good idea to disable kabylake support in 4.7 before we ship" * tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of git://people.freedesktop.org/~airlied/linux: (28 commits) drm/i915/kbl: Introduce the first official DMC for Kabylake. drm/i915: Introduce Kabypoint PCH for Kabylake H/DT. drm/i915/gen9: implement WaConextSwitchWithConcurrentTLBInvalidate drm/i915/gen9: Add WaFbcHighMemBwCorruptionAvoidance drm/i195/fbc: Add WaFbcNukeOnHostModify drm/i915/gen9: Add WaFbcWakeMemOn drm/i915/gen9: Add WaFbcTurnOffFbcWatermark drm/i915/kbl: Add WaClearSlmSpaceAtContextSwitch drm/i915/gen9: Add WaEnableChickenDCPR drm/i915/kbl: Add WaDisableSbeCacheDispatchPortSharing drm/i915/kbl: Add WaDisableGafsUnitClkGating drm/i915/kbl: Add WaForGAMHang drm/i915: Add WaInsertDummyPushConstP for bxt and kbl drm/i915/kbl: Add WaDisableDynamicCreditSharing drm/i915/kbl: Add WaDisableGamClockGating drm/i915/gen9: Enable must set chicken bits in config0 reg drm/i915/kbl: Add WaDisableLSQCROPERFforOCL drm/i915/kbl: Add WaDisableSDEUnitClockGating drm/i915/kbl: Add WaDisableFenceDestinationToSLM for A0 drm/i915/kbl: Add WaEnableGapsTsvCreditFix ...
2016-07-23	Merge tag 'drm-fixes-for-v4.7-rc8-intel' of ↵	Linus Torvalds
	git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Two i915 regression fixes. Intel have submitted some Kabylake fixes I'll send separately, since this is the first kernel with kabylake support and they don't go much outside that area I think they should be fine" * tag 'drm-fixes-for-v4.7-rc8-intel' of git://people.freedesktop.org/~airlied/linux: drm/i915: add missing condition for committing planes on crtc drm/i915: Treat eDP as always connected, again
2016-07-23	Merge tag 'm68k-for-v4.8-tag1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k upddates from Geert Uytterhoeven: - assorted spelling fixes - defconfig updates * tag 'm68k-for-v4.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k/defconfig: Update defconfigs for v4.7-rc2 m68k: Assorted spelling fixes
2016-07-23	Merge tag 'armsoc-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "A handful of fixes before final release: Marvell Armada: - One to fix a typo in the devicetree specifying memory ranges for the crypto engine - Two to deal with marking PCI and device-memory as strongly ordered to avoid hardware deadlocks, in particular when enabling above crypto driver. - Compile fix for PM Allwinner: - DT clock fixes to deal with u-boot-enabled framebuffer (simplefb). - Make R8 (C.H.I.P. SoC) inherit system compatibility from A13 to make clocks register proper. Tegra: - Fix SD card voltage setting on the Tegra3 Beaver dev board Misc: - Two maintainers updates for STM32 and STi platforms" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: tegra: beaver: Allow SD card voltage to be changed MAINTAINERS: update STi maintainer list MAINTAINERS: update STM32 maintainers list ARM: mvebu: compile pm code conditionally ARM: dts: sun7i: Fix pll3x2 and pll7x2 not having a parent clock ARM: dts: sunxi: Add pll3 to simplefb nodes clocks lists ARM: dts: armada-38x: fix MBUS_ID for crypto SRAM on Armada 385 Linksys ARM: mvebu: map PCI I/O regions strongly ordered ARM: mvebu: fix HW I/O coherency related deadlocks ARM: sunxi/dt: make the CHIP inherit from allwinner,sun5i-a13
2016-07-23	Merge branch 'linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a sporadic build failure in the qat driver as well as a memory corruption bug in rsa-pkcs1pad" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct crypto: qat - make qat_asym_algs.o depend on asn1 headers
2016-07-23	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key handling fixes from James Morris: "Quoting David Howells: Here are three miscellaneous fixes: (1) Fix a panic in some debugging code in PKCS#7. This can only happen by explicitly inserting a #define DEBUG into the code. (2) Fix the calculation of the digest length in the PE file parser. This causes a failure where there should be a success. (3) Fix the case where an X.509 cert can be added as an asymmetric key to a trusted keyring with no trust restriction if no AKID is supplied. Bugs (1) and (2) aren't particularly problematic, but (3) allows a security check to be bypassed. Happily, this is a recent regression and never made it into a released kernel" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: KEYS: Fix for erroneous trust of incorrectly signed X.509 certs pefile: Fix the failure of calculation for digest PKCS#7: Fix panic when referring to the empty AKID when DEBUG defined
2016-07-23	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "A few more fixes for the input subsystem: - restore naming for tsc2005 touchscreens as some userspace match on it - fix out of bound access in legacy keyboard driver - fixup in RMI4 driver Everything is tagged for stable as well" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: tsc200x - report proper input_dev name tty/vt/keyboard: fix OOB access in do_compute_shiftstate() Input: synaptics-rmi4 - fix maximum size check for F12 control register 8
2016-07-23	Merge branch 'libnvdimm-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fix from Dan Williams: "This contains a regression fix for a problem that was introduced in v4.7-rc6. In 4.7-rc1 we introduced auto-probing for the ACPI DSM (device- specific-method) format that the platform firmware implements for nvdimm devices. We initially fixed a regression in probing the QEMU DSM implementation by making acpi_check_dsm() tolerant of the way QEMU reports the "0 DSMs supported" condition. However, that broke HPE platforms since that tolerance caused the driver to mistakenly match the 1-zero-byte response those platforms give to "unknown" commands. Instead, we simply make the driver tolerant of not finding any supported DSMs. This has been tested to work with both QEMU and HPE platforms. This commit has appeared in a -next release with no reported issues" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nfit: make DIMM DSMs optional
2016-07-23	Merge tag 'gpio-v4.7-6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fix from Linus Walleij: "Compile problem fix for Tegra, Sorry to send this in the last minute but Ingo says this build failure is very prominent so I'm not going to wait for v4.7 before sending it. It is a case of COMPILE_TEST causing more problems than it solves and I'm already swearing about me shooting myself in the foot with that gun :(" * tag 'gpio-v4.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: tegra: don't auto-enable for COMPILE_TEST
2016-07-23	Merge tag 'clk-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Michael Turquette: "Fix a bug in the at91 clk driver, two compile time warnings in sunxi clk drivers, and one bug in a sunxi clk driver introduced in the 4.7 merge window" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: at91: fix clk_programmable_set_parent() clk: sunxi: remove unused variable clk: sunxi: display: Add per-clock flags clk: sunxi: tcon-ch1: Do not return a negative error in get_parent
2016-07-23	Merge branch 'for-4.7-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fix from Tejun Heo: "Another fallout from max_sectors bump a couple years ago. The lite-on optical drive times out on large requests" * 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: LITE-ON CX1-JB256-HP needs lower max_sectors
2016-07-23	Merge tag 'mmc-v4.7-rc7' of git://git.linaro.org/people/ulf.hansson/mmc	Linus Torvalds
	Pull MMC fixes from Ulf Hansson: "Here are a few late mmc fixes intended for v4.7 final. MMC core: - Fix eMMC packed command header endianness - Fix free of uninitialized buffer for mmc ioctl MMC host: - pxamci: Fix potential oops in ->probe()" * tag 'mmc-v4.7-rc7' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: pxamci: fix potential oops mmc: block: fix packed command header endianness mmc: block: fix free of uninitialized 'idata->buf'
2016-07-23	Merge tag 'sound-4.7-fix2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "No surprise, just a few small fixes: a couple of changes are seen in the core part, and both of them are rather for unusual error paths. The rest are the regular HD-audio fixes and one USB-audio regression fix" * tag 'sound-4.7-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Fix quirks code is not called ALSA: hda: add AMD Stoney PCI ID with proper driver caps ALSA: hda - fix use-after-free after module unload ALSA: pcm: Free chmap at PCM free callback, too ALSA: ctl: Stop notification after disconnection ALSA: hda/realtek - add new pin definition in alc225 pin quirk table
2016-07-23	Merge branch 'for-linus' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull NVMe fix from Jens Axboe: "Late addition here, it's basically a revert of a patch that was added in this merge window, but has proven to cause problems. This is swapping out the RCU based namespace protection with a good old mutex instead" * 'for-linus' of git://git.kernel.dk/linux-block: nvme: Remove RCU namespace protection
2016-07-23	pps: do not crash when failed to register	Jiri Slaby
	With this command sequence: modprobe plip modprobe pps_parport rmmod pps_parport the partport_pps modules causes this crash: BUG: unable to handle kernel NULL pointer dereference at (null) IP: parport_detach+0x1d/0x60 [pps_parport] Oops: 0000 [#1] SMP ... Call Trace: parport_unregister_driver+0x65/0xc0 [parport] SyS_delete_module+0x187/0x210 The sequence that builds up to this is: 1) plip is loaded and takes the parport device for exclusive use: plip0: Parallel port at 0x378, using IRQ 7. 2) pps_parport then fails to grab the device: pps_parport: parallel port PPS client parport0: cannot grant exclusive access for device pps_parport pps_parport: couldn't register with parport0 3) rmmod of pps_parport is then killed because it tries to access pardev->name, but pardev (taken from port->cad) is NULL. So add a check for NULL in the test there too. Link: http://lkml.kernel.org/r/20160714115245.12651-1-jslaby@suse.cz Signed-off-by: Jiri Slaby <jslaby@suse.cz> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23	tools/vm/slabinfo: fix an unintentional printf	Dan Carpenter
	The curly braces are missing here so we print stuff unintentionally. Fixes: 9da4714a2d44 ('slub: slabinfo update for cmpxchg handling') Link: http://lkml.kernel.org/r/20160715211243.GE19522@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Laura Abbott <labbott@fedoraproject.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23	testing/radix-tree: fix a macro expansion bug	Dan Carpenter
	There are no parentheses around this macro and it causes a problem when we do: index = rand() % THRASH_SIZE; Link: http://lkml.kernel.org/r/20160715210953.GC19522@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23	radix-tree: fix radix_tree_iter_retry() for tagged iterators.	Andrey Ryabinin
	radix_tree_iter_retry() resets slot to NULL, but it doesn't reset tags. Then NULL slot and non-zero iter.tags passed to radix_tree_next_slot() leading to crash: RIP: radix_tree_next_slot include/linux/radix-tree.h:473 find_get_pages_tag+0x334/0x930 mm/filemap.c:1452 .... Call Trace: pagevec_lookup_tag+0x3a/0x80 mm/swap.c:960 mpage_prepare_extent_to_map+0x321/0xa90 fs/ext4/inode.c:2516 ext4_writepages+0x10be/0x2b20 fs/ext4/inode.c:2736 do_writepages+0x97/0x100 mm/page-writeback.c:2364 __filemap_fdatawrite_range+0x248/0x2e0 mm/filemap.c:300 filemap_write_and_wait_range+0x121/0x1b0 mm/filemap.c:490 ext4_sync_file+0x34d/0xdb0 fs/ext4/fsync.c:115 vfs_fsync_range+0x10a/0x250 fs/sync.c:195 vfs_fsync fs/sync.c:209 do_fsync+0x42/0x70 fs/sync.c:219 SYSC_fdatasync fs/sync.c:232 SyS_fdatasync+0x19/0x20 fs/sync.c:230 entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207 We must reset iterator's tags to bail out from radix_tree_next_slot() and go to the slow-path in radix_tree_next_chunk(). Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup") Link: http://lkml.kernel.org/r/1468495196-10604-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23	mm: memcontrol: fix cgroup creation failure after many small jobs	Johannes Weiner
	The memory controller has quite a bit of state that usually outlives the cgroup and pins its CSS until said state disappears. At the same time it imposes a 16-bit limit on the CSS ID space to economically store IDs in the wild. Consequently, when we use cgroups to contain frequent but small and short-lived jobs that leave behind some page cache, we quickly run into the 64k limitations of outstanding CSSs. Creating a new cgroup fails with -ENOSPC while there are only a few, or even no user-visible cgroups in existence. Although pinning CSSs past cgroup removal is common, there are only two instances that actually need an ID after a cgroup is deleted: cache shadow entries and swapout records. Cache shadow entries reference the ID weakly and can deal with the CSS having disappeared when it's looked up later. They pose no hurdle. Swap-out records do need to pin the css to hierarchically attribute swapins after the cgroup has been deleted; though the only pages that remain swapped out after offlining are tmpfs/shmem pages. And those references are under the user's control, so they are manageable. This patch introduces a private 16-bit memcg ID and switches swap and cache shadow entries over to using that. This ID can then be recycled after offlining when the CSS remains pinned only by objects that don't specifically need it. This script demonstrates the problem by faulting one cache page in a new cgroup and deleting it again: set -e mkdir -p pages for x in `seq 128000`; do [ $((x % 1000)) -eq 0 ] && echo $x mkdir /cgroup/foo echo $$ >/cgroup/foo/cgroup.procs echo trex >pages/$x echo $$ >/cgroup/cgroup.procs rmdir /cgroup/foo done When run on an unpatched kernel, we eventually run out of possible IDs even though there are no visible cgroups: [root@ham ~]# ./cssidstress.sh [...] 65000 mkdir: cannot create directory '/cgroup/foo': No space left on device After this patch, the IDs get released upon cgroup destruction and the cache and css objects get released once memory reclaim kicks in. [hannes@cmpxchg.org: init the IDR] Link: http://lkml.kernel.org/r/20160621154601.GA22431@cmpxchg.org Fixes: b2052564e66d ("mm: memcontrol: continue cache reclaim from offlined groups") Link: http://lkml.kernel.org/r/20160617162516.GD19084@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: John Garcia <john.garcia@mesosphere.io> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Nikolay Borisov <kernel@kyup.com> Cc: <stable@vger.kernel.org> [3.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-22	doc-rst: kernel-doc: fix handling of address_space tags	Mauro Carvalho Chehab
	The RST cpp:function handler is very pedantic: it doesn't allow any macros like __user on it: Documentation/media/kapi/dtv-core.rst:28: WARNING: Error when parsing function declaration. If the function has no return type: Error in declarator or parameters and qualifiers Invalid definition: Expecting "(" in parameters_and_qualifiers. [error at 8] ssize_t dvb_ringbuffer_pkt_read_user (struct dvb_ringbuffer * rbuf, size_t idx, int offset, u8 __user * buf, size_t len) --------^ If the function has a return type: Error in declarator or parameters and qualifiers If pointer to member declarator: Invalid definition: Expected '::' in pointer to member (function). [error at 37] ssize_t dvb_ringbuffer_pkt_read_user (struct dvb_ringbuffer * rbuf, size_t idx, int offset, u8 __user * buf, size_t len) -------------------------------------^ If declarator-id: Invalid definition: Expecting "," or ")" in parameters_and_qualifiers, got "". [error at 102] ssize_t dvb_ringbuffer_pkt_read_user (struct dvb_ringbuffer rbuf, size_t idx, int offset, u8 __user * buf, size_t len) ------------------------------------------------------------------------------------------------------^ So, we have to remove it from the function prototype. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-07-22	tools build: Fix objtool build with ARCH=x86_64	Josh Poimboeuf
	The objtool build fails in a cross-compiled environment on a non-x86 host with "ARCH=x86_64": tools/objtool/objtool-in.o: In function `decode_instructions': tools/objtool/builtin-check.c:276: undefined reference to `arch_decode_instruction' We could override the ARCH environment variable and change it back to x86, similar to what the objtool Makefile was doing before; but it's tricky to override environment variables consistently. Instead, take a similar approach used by the Linux top-level Makefile and introduce a SRCARCH Makefile variable which evaluates to "x86" when ARCH is either "x86_64" or "x86". Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160722191920.ej62fnspnqurbaa7@treble Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-22	objtool: Always use host headers	Arnaldo Carvalho de Melo
	From a conversation with Josh: From http://lkml.kernel.org/r/20160722034118.guckaniobf3f7czc@treble : It needs to be compiled with the host (powerpc) compiler, but then it needs to disassemble target (x86) files. ---- So use HOSTARCH instead of ARCH. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20160722034118.guckaniobf3f7czc@treble Link: http://lkml.kernel.org/n/tip-le1m1yzxnfpt3msbblu40nm8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-22	objtool: Use tools/scripts/Makefile.arch to get ARCH and HOSTARCH	Arnaldo Carvalho de Melo
	objtool's Makefile was setting up ARCH but fixing up just the x86_64 -> x86, using Makefile.arch will do the necessary fixups for all arches. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-hbq0bbh03u2b722vozcyql31@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-22	tools build: Add HOSTARCH Makefile variable	Arnaldo Carvalho de Melo
	For tools that needs to be always compiled with the host headers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-907q32k2nep6q670dkxypmu6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-22	perf tests kmod-path: Fix build on ubuntu:16.04-x-armhf	Arnaldo Carvalho de Melo
	Cross building it on Ubuntu 16.04 to ARM ends up showing we get the free() prototype by luck in other environments, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-0ktfgmmyhcfw8ondka2013f3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-22	gpio: tegra: don't auto-enable for COMPILE_TEST	Arnd Bergmann
	I stumbled over a build error with COMPILE_TEST and CONFIG_OF disabled: drivers/gpio/gpio-tegra.c: In function 'tegra_gpio_probe': drivers/gpio/gpio-tegra.c:603:9: error: 'struct gpio_chip' has no member named 'of_node' The problem is that the newly added GPIO_TEGRA Kconfig symbol does not have a dependency on CONFIG_OF. However, there is another problem here as the driver gets enabled unconditionally whenever COMPILE_TEST is set. This fixes both problems, by making the symbol user-visible when COMPILE_TEST is set and default-enabled for ARCH_TEGRA=y. As a side-effect, it is now possible to compile-test a Tegra kernel with GPIO support disabled, which is harmless. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 4dd4dd1d2120 ("gpio: tegra: Allow compile test") Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-07-22	libceph: apply new_state before new_up_client on incrementals	Ilya Dryomov
	Currently, osd_weight and osd_state fields are updated in the encoding order. This is wrong, because an incremental map may look like e.g. new_up_client: { osd=6, addr=... } # set osd_state and addr new_state: { osd=6, xorstate=EXISTS } # clear osd_state Suppose osd6's current osd_state is EXISTS (i.e. osd6 is down). After applying new_up_client, osd_state is changed to EXISTS \| UP. Carrying on with the new_state update, we flip EXISTS and leave osd6 in a weird "!EXISTS but UP" state. A non-existent OSD is considered down by the mapping code 2087 for (i = 0; i < pg->pg_temp.len; i++) { 2088 if (ceph_osd_is_down(osdmap, pg->pg_temp.osds[i])) { 2089 if (ceph_can_shift_osds(pi)) 2090 continue; 2091 2092 temp->osds[temp->size++] = CRUSH_ITEM_NONE; and so requests get directed to the second OSD in the set instead of the first, resulting in OSD-side errors like: [WRN] : client.4239 192.168.122.21:0/2444980242 misdirected client.4239.1:2827 pg 2.5df899f2 to osd.4 not [1,4,6] in e680/680 and hung rbds on the client: [ 493.566367] rbd: rbd0: write 400000 at 11cc00000 (0) [ 493.566805] rbd: rbd0: result -6 xferred 400000 [ 493.567011] blk_update_request: I/O error, dev rbd0, sector 9330688 The fix is to decouple application from the decoding and: - apply new_weight first - apply new_state before new_up_client - twiddle osd_state flags if marking in - clear out some of the state if osd is destroyed Fixes: http://tracker.ceph.com/issues/14901 Cc: stable@vger.kernel.org # 3.15+: 6dd74e44dc1d: libceph: set 'exists' flag for newly up osd Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-07-22	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Herbert Xu
	Merge the crypto tree to resolve conflict in rsa-pkcs1pad.
2016-07-22	crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct	Herbert Xu
	To allow for child request context the struct akcipher_request child_req needs to be at the end of the structure. Cc: stable@vger.kernel.org Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-22	x86/boot: Simplify EBDA-vs-BIOS reservation logic	Andy Lutomirski
	Both the intent and the effect of reserve_bios_regions() is simple: reserve the range from the apparent BIOS start (suitably filtered) through 1MB and, if the EBDA start address is sensible, extend that reservation downward to cover the EBDA as well. The code is overcomplicated, though, and contains head-scratchers like: if (ebda_start < BIOS_START_MIN) ebda_start = BIOS_START_MAX; That snipped is trying to say "if ebda_start < BIOS_START_MIN, ignore it". Simplify it: reorder the code so that it makes sense. This should have no functional effect under any circumstances. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/ef89c0c761be20ead8bd9a3275743e6259b6092a.1469135598.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-22	x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does	Andy Lutomirski
	It doesn't just control probing for the EBDA -- it controls whether we detect and reserve the <1MB BIOS regions in general. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/55bd591115498440d461857a7b64f349a5d911f3.1469135598.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-22	ovl: verify upper dentry in ovl_remove_and_whiteout()	Maxim Patlasov
	The upper dentry may become stale before we call ovl_lock_rename_workdir. For example, someone could (mistakenly or maliciously) manually unlink(2) it directly from upperdir. To ensure it is not stale, let's lookup it after ovl_lock_rename_workdir and and check if it matches the upper dentry. Essentially, it is the same problem and similar solution as in commit 11f3710417d0 ("ovl: verify upper dentry before unlink and rename"). Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-07-22	packet: propagate sock_cmsg_send() error	Soheil Hassas Yeganeh
	sock_cmsg_send() can return different error codes and not only -EINVAL, and we should properly propagate them. Fixes: c14ac9451c34 ("sock: enable timestamping using control messages") Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-21	Documentation: dtb: xgene: Add hwmon dts binding documentation	hotran
	This patch adds the APM X-Gene hwmon device tree node documentation. Signed-off-by: Hoan Tran <hotran@apm.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-07-21	GFS2: Fix gfs2_replay_incr_blk for multiple journal sizes	Bob Peterson
	Before this patch, if you used gfs2_jadd to add new journals of a size smaller than the existing journals, replaying those new journals would withdraw. That's because function gfs2_replay_incr_blk was using the number of journal blocks (jd_block) from the superblock's journal pointer. In other words, "My journal's max size" rather than "the journal we're replaying's size." This patch changes the function to use the size of the pertinent journal rather than always using the journal we happen to be using. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2016-07-21	x86/fpu: Do not BUG_ON() in early FPU code	Dave Hansen
	I don't think it is really possible to have a system where CPUID enumerates support for XSAVE but that it does not have FP/SSE (they are "legacy" features and always present). But, I did manage to hit this case in qemu when I enabled its somewhat shaky XSAVE support. The bummer is that the FPU is set up before we parse the command-line or have any console support including earlyprintk. That turned what should have been an easy thing to debug in to a bit more of an odyssey. So a BUG() here is worthless. All it does it guarantee that if/when we hit this case we have an empty console. So, remove the BUG() and try to limp along by disabling XSAVE and trying to continue. Add a comment on why we are doing this, and also add a common "out_disable" path for leaving fpu__init_system_xstate(). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-21	perf tools: Add AVX-512 instructions to the new instructions test	Adrian Hunter
	Previous patches added support for Intel's AVX-512 instructions to the kernel and perf tools instruction decoders. AVX-512 instructions are documented in Intel Architecture Instruction Set Extensions Programming Reference (February 2016). Add a representative set of instructions to perf's "new instructions" test. e.g. perf test "new instructions" Or to view a particular instruction: perf test -v "new instructions" 2>&1 \| grep vbroadcasti64x4 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: X86 ML <x86@kernel.org> Link: http://lkml.kernel.org/r/1469003437-32706-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-21	perf tools: Add AVX-512 support to the instruction decoder used by Intel PT	Adrian Hunter
	Add support for Intel's AVX-512 instructions to perf tools instruction decoder used by Intel PT. The kernel's instruction decoder was updated in a previous patch. AVX-512 instructions are documented in Intel Architecture Instruction Set Extensions Programming Reference (February 2016). AVX-512 instructions are identified by a EVEX prefix which, for the purpose of instruction decoding, can be treated as though it were a 4-byte VEX prefix. Existing instructions which can now accept an EVEX prefix need not be further annotated in the op code map (x86-opcode-map.txt). In the case of new instructions, the op code map is updated accordingly. Also add associated Mask Instructions that are used to manipulate mask registers used in AVX-512 instructions. A representative set of instructions is added to the perf tools new instructions test in a subsequent patch. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: X86 ML <x86@kernel.org> Link: http://lkml.kernel.org/r/1469003437-32706-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-21	x86/insn: Add AVX-512 support to the instruction decoder	Adrian Hunter
	Add support for Intel's AVX-512 instructions to the instruction decoder. AVX-512 instructions are documented in Intel Architecture Instruction Set Extensions Programming Reference (February 2016). AVX-512 instructions are identified by a EVEX prefix which, for the purpose of instruction decoding, can be treated as though it were a 4-byte VEX prefix. Existing instructions which can now accept an EVEX prefix need not be further annotated in the op code map (x86-opcode-map.txt). In the case of new instructions, the op code map is updated accordingly. Also add associated Mask Instructions that are used to manipulate mask registers used in AVX-512 instructions. The 'perf tools' instruction decoder is updated in a subsequent patch. And a representative set of instructions is added to the perf tools new instructions test in a subsequent patch. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: X86 ML <x86@kernel.org> Link: http://lkml.kernel.org/r/1469003437-32706-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-21	x86/boot: Reorganize and clean up the BIOS area reservation code	Ingo Molnar
	So the reserve_ebda_region() code has accumulated a number of problems over the years that make it really difficult to read and understand: - The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks... - 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it super confusing to read. - It does not help at all that we have various memory range markers, half of which are 'start of range', half of which are 'end of range' - but this crucial property is not obvious in the naming at all ... gave me a headache trying to understand all this. - Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is obvious, all values here are addresses!), while it does not highlight that it's the _start_ of the EBDA region ... - 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value that is a pointer to a value, not a memory range address! - The function name itself is a misnomer: it says 'reserve_ebda_region()' while its main purpose is to reserve all the firmware ROM typically between 640K and 1MB, while the 'EBDA' part is only a small part of that ... - Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this too should be about whether to reserve firmware areas in the paravirt case. - In fact thinking about this as 'end of RAM' is confusing: what this function really wants to reserve is firmware data and code areas! Once the thinking is inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure 'reserved area' notion everything becomes a lot clearer. To improve all this rewrite the whole code (without changing the logic): - Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start' and propagate this concept through all the variable names and constants. BIOS_RAM_SIZE_KB_PTR // was: BIOS_LOWMEM_KILOBYTES BIOS_START_MIN // was: INSANE_CUTOFF ebda_start // was: ebda_addr bios_start // was: lowmem BIOS_START_MAX // was: LOWMEM_CAP - Then clean up the name of the function itself by renaming it to reserve_bios_regions() and renaming the ::ebda_search paravirt flag to ::reserve_bios_regions. - Fix up all the comments (fix typos), harmonize and simplify their formulation and remove comments that become unnecessary due to the much better naming all around. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-21	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Herbert Xu
	Merge the crypto tree to resolve conflict in qat Makefile.
2016-07-21	crypto: qat - make qat_asym_algs.o depend on asn1 headers	Jan Stancek
	Parallel build can sporadically fail because asn1 headers may not be built yet by the time qat_asym_algs.o is compiled: drivers/crypto/qat/qat_common/qat_asym_algs.c:55:32: fatal error: qat_rsapubkey-asn1.h: No such file or directory #include "qat_rsapubkey-asn1.h" Cc: stable@vger.kernel.org Signed-off-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-20	block: Fix front merge check	Damien Le Moal
	For a front merge, the maximum number of sectors of the request must be checked against the front merge BIO sector, not the current sector of the request. Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-20	block: do not merge requests without consulting with io scheduler	Tahsin Erdogan
	Before merging a bio into an existing request, io scheduler is called to get its approval first. However, the requests that come from a plug flush may get merged by block layer without consulting with io scheduler. In case of CFQ, this can cause fairness problems. For instance, if a request gets merged into a low weight cgroup's request, high weight cgroup now will depend on low weight cgroup to get scheduled. If high weigt cgroup needs that io request to complete before submitting more requests, then it will also lose its timeslice. Following script demonstrates the problem. Group g1 has a low weight, g2 and g3 have equal high weights but g2's requests are adjacent to g1's requests so they are subject to merging. Due to these merges, g2 gets poor disk time allocation. cat > cfq-merge-repro.sh << "EOF" #!/bin/bash set -e IO_ROOT=/mnt-cgroup/io mkdir -p $IO_ROOT if ! mount \| grep -qw $IO_ROOT; then mount -t cgroup none -oblkio $IO_ROOT fi cd $IO_ROOT for i in g1 g2 g3; do if [ -d $i ]; then rmdir $i fi done mkdir g1 && echo 10 > g1/blkio.weight mkdir g2 && echo 495 > g2/blkio.weight mkdir g3 && echo 495 > g3/blkio.weight RUNTIME=10 (echo $BASHPID > g1/cgroup.procs && fio --readonly --name name1 --filename /dev/sdb \ --rw read --size 64k --bs 64k --time_based \ --runtime=$RUNTIME --offset=0k &> /dev/null)& (echo $BASHPID > g2/cgroup.procs && fio --readonly --name name1 --filename /dev/sdb \ --rw read --size 64k --bs 64k --time_based \ --runtime=$RUNTIME --offset=64k &> /dev/null)& (echo $BASHPID > g3/cgroup.procs && fio --readonly --name name1 --filename /dev/sdb \ --rw read --size 64k --bs 64k --time_based \ --runtime=$RUNTIME --offset=256k &> /dev/null)& sleep $((RUNTIME+1)) for i in g1 g2 g3; do echo ---- $i ---- cat $i/blkio.time done EOF # ./cfq-merge-repro.sh ---- g1 ---- 8:16 162 ---- g2 ---- 8:16 165 ---- g3 ---- 8:16 686 After applying the patch: # ./cfq-merge-repro.sh ---- g1 ---- 8:16 90 ---- g2 ---- 8:16 445 ---- g3 ---- 8:16 471 Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>