summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-05-26libceph: wait_request_timeout()Ilya Dryomov
The unwatch timeout is currently implemented in rbd. With watch/unwatch code moving into libceph, we are going to need a ceph_osdc_wait_request() variant with a timeout. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: request_init() and request_release_checks()Ilya Dryomov
These are going to be used by request_reinit() code. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: a major OSD client updateIlya Dryomov
This is a major sync up, up to ~Jewel. The highlights are: - per-session request trees (vs a global per-client tree) - per-session locking (vs a global per-client rwlock) - homeless OSD session - no ad-hoc global per-client lists - support for pool quotas - foundation for watch/notify v2 support - foundation for map check (pool deletion detection) support The switchover is incomplete: lingering requests can be setup and teared down but aren't ever reestablished. This functionality is restored with the introduction of the new lingering infrastructure (ceph_osd_linger_request, linger_work, etc) in a later commit. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: protect osdc->osd_lru list with a spinlockIlya Dryomov
OSD client is getting moved from the big per-client lock to a set of per-session locks. The big rwlock would only be held for read most of the time, so a global osdc->osd_lru needs additional protection. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: allocate ceph_osd with GFP_NOFAILIlya Dryomov
create_osd() is called way too deep in the stack to be able to error out in a sane way; a failing create_osd() just messes everything up. The current req_notarget list solution is broken - the list is never traversed as it's not entirely clear when to do it, I guess. If we were to start traversing it at regular intervals and retrying each request, we wouldn't be far off from what __GFP_NOFAIL is doing, so allocate OSD sessions with __GFP_NOFAIL, at least until we come up with a better fix. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: osd_init() and osd_cleanup()Ilya Dryomov
These are going to be used by homeless OSD sessions code. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: handle_one_map()Ilya Dryomov
Separate osdmap handling from decoding and iterating over a bag of maps in a fresh MOSDMap message. This sets up the scene for the updated OSD client. Of particular importance here is the addition of pi->was_full, which can be used to answer "did this pool go full -> not-full in this map?". This is the key bit for supporting pool quotas. We won't be able to downgrade map_sem for much longer, so drop downgrade_write(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-25Merge branch 'work.iov_iter' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs iov_iter regression fix from Al Viro: "Fix for braino in 'fold checks into iterate_and_advance()'" * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: do "fold checks into iterate_and_advance()" right
2016-05-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr regression fixes from Al Viro. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: make xattr_resolve_handlers() safe to use with NULL ->s_xattr xattr: Fail with -EINVAL for NULL attribute names
2016-05-25Merge tag 'acpi-4.7-rc1-more' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Additional ACPI update for v4.7-rc1 Just one fix for incorrect async_synchronize_cookie() usage in the ACPI battery driver (Chris Wilson)" * tag 'acpi-4.7-rc1-more' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / battery: Correctly serialise with the pending async probe
2016-05-26libceph: allocate dummy osdmap in ceph_osdc_init()Ilya Dryomov
This leads to a simpler osdmap handling code, particularly when dealing with pi->was_full, which is introduced in a later commit. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: schedule tick from ceph_osdc_init()Ilya Dryomov
Both homeless OSD sessions and watch/notify v2, introduced in later commits, require periodic ticks which don't depend on ->num_requests. Schedule the initial tick from ceph_osdc_init() and reschedule from handle_timeout() unconditionally. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: move schedule_delayed_work() in ceph_osdc_init()Ilya Dryomov
ceph_osdc_stop() isn't called if ceph_osdc_init() fails, so we end up with handle_osds_timeout() running on invalid memory if any one of the allocations fails. Call schedule_delayed_work() after everything is setup, just before returning. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: redo callbacks and factor out MOSDOpReply decodingIlya Dryomov
If you specify ACK | ONDISK and set ->r_unsafe_callback, both ->r_callback and ->r_unsafe_callback(true) are called on ack. This is very confusing. Redo this so that only one of them is called: ->r_unsafe_callback(true), on ack ->r_unsafe_callback(false), on commit or ->r_callback, on ack|commit Decode everything in decode_MOSDOpReply() to reduce clutter. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: drop msg argument from ceph_osdc_callback_tIlya Dryomov
finish_read(), its only user, uses it to get to hdr.data_len, which is what ->r_result is set to on success. This gains us the ability to safely call callbacks from contexts other than reply, e.g. map check. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: switch to calc_target(), part 2Ilya Dryomov
The crux of this is getting rid of ceph_osdc_build_request(), so that MOSDOp can be encoded not before but after calc_target() calculates the actual target. Encoding now happens within ceph_osdc_start_request(). Also nuked is the accompanying bunch of pointers into the encoded buffer that was used to update fields on each send - instead, the entire front is re-encoded. If we want to support target->name_len != base->name_len in the future, there is no other way, because oid is surrounded by other fields in the encoded buffer. Encoding OSD ops and adding data items to the request message were mixed together in osd_req_encode_op(). While we want to re-encode OSD ops, we don't want to add duplicate data items to the message when resending, so all call to ceph_osdc_msg_data_add() are factored out into a new setup_request_data(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: switch to calc_target(), part 1Ilya Dryomov
Replace __calc_request_pg() and most of __map_request() with calc_target() and start using req->r_t. ceph_osdc_build_request() however still encodes base_oid, because it's called before calc_target() is and target_oid is empty at that point in time; a printf in osdc_show() also shows base_oid. This is fixed in "libceph: switch to calc_target(), part 2". Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: introduce ceph_osd_request_target, calc_target()Ilya Dryomov
Introduce ceph_osd_request_target, containing all mapping-related fields of ceph_osd_request and calc_target() for calculating mappings and populating it. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: pi->min_size, pi->last_force_request_resendIlya Dryomov
Add and decode pi->min_size and pi->last_force_request_resend. These are going to be used by calc_target(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: make pgid_cmp() globalIlya Dryomov
calc_target() code is going to need to know how to compare PGs. Take lhs and rhs pgid by const * while at it. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: rename ceph_calc_pg_primary()Ilya Dryomov
Rename ceph_calc_pg_primary() to ceph_pg_to_acting_primary() to emphasise that it returns acting primary. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: ceph_osds, ceph_pg_to_up_acting_osds()Ilya Dryomov
Knowning just acting set isn't enough, we need to be able to record up set as well to detect interval changes. This means returning (up[], up_len, up_primary, acting[], acting_len, acting_primary) and passing it around. Introduce and switch to ceph_osds to help with that. Rename ceph_calc_pg_acting() to ceph_pg_to_up_acting_osds() and return both up and acting sets from it. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: rename ceph_oloc_oid_to_pg()Ilya Dryomov
Rename ceph_oloc_oid_to_pg() to ceph_object_locator_to_pg(). Emphasise that returned is raw PG and return -ENOENT instead of -EIO if the pool doesn't exist. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: fix ceph_eversion encodingIlya Dryomov
eversion_t is version+epoch in userspace and is encoded in that order. ceph_eversion is defined as epoch+version in rados.h, yet we memcpy it in __send_request(). Reoder ceph_eversion fields. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: DEFINE_RB_FUNCS macroIlya Dryomov
Given struct foo { u64 id; struct rb_node bar_node; }; generate insert_bar(), erase_bar() and lookup_bar() functions with DEFINE_RB_FUNCS(bar, struct foo, id, bar_node) The key is assumed to be an integer (u64, int, etc), compared with < and >. nodefld has to be initialized with RB_CLEAR_NODE(). Start using it for MDS, MON and OSD requests and OSD sessions. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: open-code remove_{all,old}_osds()Ilya Dryomov
They are called only once, from ceph_osdc_stop() and handle_osds_timeout() respectively. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: nuke unused fields and functionsIlya Dryomov
Either unused or useless: osdmap->mkfs_epoch osd->o_marked_for_keepalive monc->num_generic_requests osdc->map_waiters osdc->last_requested_map osdc->timeout_tid osd_req_op_cls_response_data() osdmap_apply_incremental() @msgr arg Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26rbd: use header_oid instead of header_nameIlya Dryomov
Switch to ceph_object_id and use ceph_oid_aprintf() instead of a bare const char *. This reduces noise in rbd_dev_header_name(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: variable-sized ceph_object_idIlya Dryomov
Currently ceph_object_id can hold object names of up to 100 (CEPH_MAX_OID_NAME_LEN) characters. This is enough for all use cases, expect one - long rbd image names: - a format 1 header is named "<imgname>.rbd" - an object that points to a format 2 header is named "rbd_id.<imgname>" We operate on these potentially long-named objects during rbd map, and, for format 1 images, during header refresh. (A format 2 header name is a small system-generated string.) Lift this 100 character limit by making ceph_object_id be able to point to an externally-allocated string. Apart from being able to work with almost arbitrarily-long named objects, this allows us to reduce the size of ceph_object_id from >100 bytes to 64 bytes. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: change how osd_op_reply message size is calculatedIlya Dryomov
For a message pool message, preallocate a page, just like we do for osd_op. For a normal message, take ceph_object_id into account and don't bother subtracting CEPH_OSD_SLAB_OPS ceph_osd_ops. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: move message allocation out of ceph_osdc_alloc_request()Ilya Dryomov
The size of ->r_request and ->r_reply messages depends on the size of the object name (ceph_object_id), while the size of ceph_osd_request is fixed. Move message allocation into a separate function that would have to be called after ceph_object_id and ceph_object_locator (which is also going to become variable in size with RADOS namespaces) have been filled in: req = ceph_osdc_alloc_request(...); <fill in req->r_base_oid> <fill in req->r_base_oloc> ceph_osdc_alloc_messages(req); Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: grab snapc in ceph_osdc_alloc_request()Ilya Dryomov
ceph_osdc_build_request() is going away. Grab snapc and initialize ->r_snapid in ceph_osdc_alloc_request(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: make ceph_osdc_put_request() accept NULLIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26rbd: get/put img_request in rbd_img_request_submit()Ilya Dryomov
By the time we get to checking for_each_obj_request_safe(img_request) terminating condition, all obj_requests may be complete and img_request ref, that rbd_img_request_submit() takes away from its caller, may be put. Moving the next_obj_request cursor is then a use-after-free on img_request. It's totally benign, as the value that's read is never used, but I think it's still worth fixing. Cc: Alex Elder <elder@linaro.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-25Merge tag 'pm-4.7-rc1-more' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These are two stable-candidate fixes (PM core, cpuidle) and a bunch of cpufreq cleanups. Specifics: - Stable-candidate cpuidle fix to make it check the right variable when deciding whether or not to enable interrupts on the local CPU so as to avoid enabling iterrupts too early in some cases if the system has both coupled and per-core idle states (Daniel Lezcano). - Stable-candidate PM core fix to make it handle failures at the "late suspend" stage of device suspend consistently for all devices regardless of whether or not async suspend/resume is enabled for them (Rafael Wysocki). - Cleanups in the cpufreq core, the schedutil governor and the intel_pstate driver (Rafael Wysocki, Pankaj Gupta, Viresh Kumar)" * tag 'pm-4.7-rc1-more' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / sleep: Handle failures in device_suspend_late() consistently cpufreq: schedutil: Improve prints messages with pr_fmt cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter() cpufreq: simplified goto out in cpufreq_register_driver() cpufreq: governor: CPUFREQ_GOV_STOP never fails cpufreq: governor: CPUFREQ_GOV_POLICY_EXIT never fails intel_pstate: Simplify conditional in intel_pstate_set_policy()
2016-05-25do "fold checks into iterate_and_advance()" rightAl Viro
the only case when we should skip the iterate_and_advance() guts is when nothing's left in the iterator, _not_ just when requested amount is 0. Said guts will do nothing in the latter case anyway; the problem we tried to deal with in the aforementioned commit is that when there's nothing left *and* the amount requested is 0, we might end up deferencing one iovec too many; the value we fetch from there is discarded in that case, but theoretically it might oops if the iovec array ends exactly at the end of page with the next page not mapped. Bailing out on zero size requested had an unexpected side effect - zero-length segment in the beginning of iovec array ended up throwing do_loop_readv_writev() into infinite spin; we do not advance past the empty segment at all. Reproducer is trivial: echo '#include <sys/uio.h>' >a.c echo 'main() {char c; struct iovec v[] = {{&c,0},{&c,1}}; readv(0,v,2);}' >>a.c cc a.c && ./a.out </proc/uptime which should end up with the process not hanging. Probably ought to go into LTP or xfstests... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-25make xattr_resolve_handlers() safe to use with NULL ->s_xattrAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-25xattr: Fail with -EINVAL for NULL attribute namesAndreas Gruenbacher
Commit 98e9cb57 improved the xattr name checks in xattr_resolve_name but didn't update the NULL attribute name check appropriately, so NULL attribute names lead to NULL pointer dereferences. Turn that into -EINVAL results instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> fs/xattr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-25Merge branch 'acpi-battery'Rafael J. Wysocki
* acpi-battery: ACPI / battery: Correctly serialise with the pending async probe
2016-05-25Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'pm-core'Rafael J. Wysocki
* pm-cpufreq: cpufreq: schedutil: Improve prints messages with pr_fmt cpufreq: simplified goto out in cpufreq_register_driver() cpufreq: governor: CPUFREQ_GOV_STOP never fails cpufreq: governor: CPUFREQ_GOV_POLICY_EXIT never fails intel_pstate: Simplify conditional in intel_pstate_set_policy() * pm-cpuidle: cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter() * pm-core: PM / sleep: Handle failures in device_suspend_late() consistently
2016-05-25Merge tag 'pwm/for-4.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "This set of changes introduces an atomic API to the PWM subsystem. This is influenced by the DRM atomic API that was introduced a while back, though it is obviously a lot simpler. The fundamental idea remains the same, though: drivers provide a single callback to implement the atomic configuration of a PWM channel. As a side-effect the PWM subsystem gains the ability for initial state retrieval, so that the logical state mirrors that of the hardware. Many use-cases don't care about this, but for others it is essential. These new features require changes in all users, which these patches take care of. The core is transitioned to use the atomic callback if available and provides a fallback mechanism for other drivers. Changes to transition users and drivers to the atomic API are postponed to v4.8" * tag 'pwm/for-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (30 commits) pwm: Add information about polarity, duty cycle and period to debugfs pwm: Switch to the atomic API pwm: Update documentation pwm: Add core infrastructure to allow atomic updates pwm: Add hardware readout infrastructure pwm: Move the enabled/disabled info into pwm_state pwm: Introduce the pwm_state concept pwm: Keep PWM state in sync with hardware state ARM: Explicitly apply PWM config extracted from pwm_args drm: i915: Explicitly apply PWM config extracted from pwm_args input: misc: pwm-beeper: Explicitly apply PWM config extracted from pwm_args input: misc: max8997: Explicitly apply PWM config extracted from pwm_args backlight: lm3630a: explicitly apply PWM config extracted from pwm_args backlight: lp855x: Explicitly apply PWM config extracted from pwm_args backlight: lp8788: Explicitly apply PWM config extracted from pwm_args backlight: pwm_bl: Use pwm_get_args() where appropriate fbdev: ssd1307fb: Use pwm_get_args() where appropriate regulator: pwm: Use pwm_get_args() where appropriate leds: pwm: Use pwm_get_args() where appropriate input: misc: max77693: Use pwm_get_args() where appropriate ...
2016-05-25nfs/flexfiles: Helper function to detect FF_FLAGS_NO_READ_IOTom Haynes
The mds can inform the client not to use the IOMODE_RW layout segment for doing READs. I.e., it is basically a IOMODE_WRITE layout segment. It would do this to not interfere with the WRITEs. Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-25Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds
Pull watchdog updates from Wim Van Sebroeck: - add support for Fintek F81865 Super-IO chip - add support for watchdogs (RWDT and SWDT) found on RCar Gen3 based SoCs from Renesas - octeon: Handle the FROZEN hot plug notifier actions - f71808e_wdt fixes and cleanups - some small improvements in code and documentation * git://www.linux-watchdog.org/linux-watchdog: MAINTAINERS: Add file patterns for watchdog device tree bindings Documentation: Add ebc-c384_wdt watchdog-parameters.txt entry watchdog: shwdt: Use setup_timer() watchdog: cpwd: Use setup_timer() arm64: defconfig: enable Renesas Watchdog Timer watchdog: renesas-wdt: add driver watchdog: remove error message when unable to allocate watchdog device watchdog: f71808e_wdt: Fix WDTMOUT_STS register read watchdog: f71808e_wdt: Fix typo watchdog: f71808e_wdt: Add F81865 support watchdog: sp5100_tco: properly check for new register layouts watchdog: core: Fix circular locking dependency watchdog: core: fix trivial typo in a comment watchdog: hpwdt: Adjust documentation to match latest kernel module parameters. watchdog: imx2_wdt: add external reset support via dt prop watchdog: octeon: Handle the FROZEN hot plug notifier actions. watchdog: qcom: Report reboot reason
2016-05-25nfs: avoid race that crashes nfs_init_commitWeston Andros Adamson
Since the patch "NFS: Allow multiple commit requests in flight per file" we can run multiple simultaneous commits on the same inode. This introduced a race over collecting pages to commit that made it possible to call nfs_init_commit() with an empty list - which causes crashes like the one below. The fix is to catch this race and avoid calling nfs_init_commit and initiate_commit when there is no work to do. Here is the crash: [600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 [600522.078475] IP: [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0 [600522.078972] Oops: 0000 [#1] SMP [600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3 [600522.081380] vmw_pvscsi ata_generic pata_acpi [600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1 [600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 [600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000 [600522.083378] RIP: 0010:[<ffffffffa0479e72>] [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.083973] RSP: 0018:ffff88042ae87438 EFLAGS: 00010246 [600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588 [600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40 [600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0 [600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0 [600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840 [600522.087484] FS: 00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000 [600522.088070] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0 [600522.089327] Stack: [600522.089926] 0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa [600522.090549] 0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930 [600522.091169] ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840 [600522.091789] Call Trace: [600522.092420] [<ffffffffa04f09fa>] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4] [600522.093052] [<ffffffffa0563d80>] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles] [600522.093696] [<ffffffffa0562645>] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles] [600522.094359] [<ffffffffa047bc78>] nfs_generic_commit_list+0xe8/0x120 [nfs] [600522.095032] [<ffffffffa047bd6a>] nfs_commit_inode+0xba/0x110 [nfs] [600522.095719] [<ffffffffa046ac54>] nfs_release_page+0x44/0xd0 [nfs] [600522.096410] [<ffffffff811a8122>] try_to_release_page+0x32/0x50 [600522.097109] [<ffffffff811bd4f1>] shrink_page_list+0x961/0xb30 [600522.097812] [<ffffffff811bdced>] shrink_inactive_list+0x1cd/0x550 [600522.098530] [<ffffffff811bea65>] shrink_lruvec+0x635/0x840 [600522.099250] [<ffffffff811bed60>] shrink_zone+0xf0/0x2f0 [600522.099974] [<ffffffff811bf312>] do_try_to_free_pages+0x192/0x470 [600522.100709] [<ffffffff811bf6ca>] try_to_free_pages+0xda/0x170 [600522.101464] [<ffffffff811b2198>] __alloc_pages_nodemask+0x588/0x970 [600522.102235] [<ffffffff811fbbd5>] alloc_pages_vma+0xb5/0x230 [600522.103000] [<ffffffff813a1589>] ? cpumask_any_but+0x39/0x50 [600522.103774] [<ffffffff811d6115>] wp_page_copy.isra.55+0x95/0x490 [600522.104558] [<ffffffff810e3438>] ? __wake_up+0x48/0x60 [600522.105357] [<ffffffff811d7d3b>] do_wp_page+0xab/0x4f0 [600522.106137] [<ffffffff810a1bbb>] ? release_task+0x36b/0x470 [600522.106902] [<ffffffff8126dbd7>] ? eventfd_ctx_read+0x67/0x1c0 [600522.107659] [<ffffffff811da2a8>] handle_mm_fault+0xc78/0x1900 [600522.108431] [<ffffffff81067ef1>] __do_page_fault+0x181/0x420 [600522.109173] [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280 [600522.109893] [<ffffffff810681c0>] do_page_fault+0x30/0x80 [600522.110594] [<ffffffff81024f36>] ? syscall_trace_leave+0xc6/0x120 [600522.111288] [<ffffffff81790a58>] page_fault+0x28/0x30 [600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce <48> 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08 [600522.113343] RIP [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.114003] RSP <ffff88042ae87438> [600522.114636] CR2: 0000000000000040 Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file) CC: stable@vger.kernel.org Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-25Merge tag 'vfio-v4.7-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Hide INTx on certain known broken devices (Alex Williamson) - Additional backdoor reset detection (Alex Williamson) - Remove unused iommudata reference (Alexey Kardashevskiy) - Use cfg_size to avoid probing extended config space (Alexey Kardashevskiy) * tag 'vfio-v4.7-rc1' of git://github.com/awilliam/linux-vfio: vfio_pci: Test for extended capabilities if config space > 256 bytes vfio_iommu_spapr_tce: Remove unneeded iommu_group_get_iommudata vfio/pci: Add test for BAR restore vfio/pci: Hide broken INTx support from user
2016-05-25Merge tag 'drm-4.7-rc1-headers-fix' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull header warning fix from Dave Airlie: "Here is the C++ guards warning fix from Arnd" [ Background: there are 'extern "C" { }' guards in include/uapi for the GPU headers. They should arguably be wrapped somehow, but as it is they caused checkpatch to warn because it would trigger on the 'extern' and think it's exporting a function or variable from the kernel to user space. This just fixes checkpatch. Whether we wrap the C++ guards some way in the future will be an independent issue. ] * tag 'drm-4.7-rc1-headers-fix' of git://people.freedesktop.org/~airlied/linux: headers_check: don't warn about c++ guards
2016-05-25Merge branch 'parisc-4.7-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: - Add native high-resolution timing code for sched_clock() and other timing functions based on the processor internal cr16 cycle counters - Add syscall tracepoint support - Add regset support - Speed up get_user() and put_user() functions - Updated futex.h to match generic implementation (John David Anglin) - A few smaller ftrace build fixes - Fixed thuge-gen kernel self test to utilize architectured MAP_HUGETLB value - Added parisc architecture to seccomp_bpf kernel self test - Various typo fixes (Andrea Gelmini) * 'parisc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Whitespace cleanups in unistd.h parisc: Use long jump to reach ftrace_return_to_handler() parisc: Fix typo in fpudispatch.c parisc: Fix typos in eisa_eeprom.h parisc: Fix typo in ldcw.h parisc: Fix typo in pdc.h parisc: Update futex.h to match generic implementation parisc: Merge ftrace C-helper and assembler functions into .text.hot section selftests/thuge-gen: Use platform specific MAP_HUGETLB value parisc: Add native high-resolution sched_clock() implementation parisc: Add ARCH_TRACEHOOK and regset support parisc: Add 64bit get_user() and put_user() for 32bit kernel parisc: Simplify and speed up get_user() and put_user() parisc: Add syscall tracepoint support
2016-05-25NFS: checking for NULL instead of IS_ERR() in nfs_commit_file()Dan Carpenter
nfs_create_request() doesn't return NULL, it returns error pointers. Fixes: 67911c8f18b5 ('NFS: Add nfs_commit_file()') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-25parisc: Whitespace cleanups in unistd.hHelge Deller
Clean up whitespaces and mark unused syscalls as such. Signed-off-by: Helge Deller <deller@gmx.de>
2016-05-25sched/core: Fix remote wakeupsPeter Zijlstra
Commit: b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration") ... introduced a bug: Mike Galbraith found that it introduced a performance regression, while Paul E. McKenney reported lost wakeups and bisected it to this commit. The reason is that I mis-read ttwu_queue() such that I assumed any wakeup that got a remote queue must have had the task migrated. Since this is not so; we need to transfer this information between queueing the wakeup and actually doing the wakeup. Use a new task_struct::sched_flag for this, we already write to sched_contributes_to_load in the wakeup path so this is a hot and modified cacheline. Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com> Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Hunter <ahh@google.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ben Segall <bsegall@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Turner <pjt@google.com> Cc: Pavan Kondeti <pkondeti@codeaurora.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: byungchul.park@lge.com Fixes: b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration") Link: http://lkml.kernel.org/r/20160523091907.GD15728@worktop.ger.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>