linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2012-07-17	rpmsg: fix dependency on initialization order	Federico Fuga
	When rpmsg drivers are built into the kernel, they must not initialize before the rpmsg bus does, otherwise they'd trigger a BUG() in drivers/base/driver.c line 169 (driver_register()). To fix that, and to stop depending on arbitrary linkage ordering of those built-in rpmsg drivers, we make the rpmsg bus initialize at subsys_initcall. Cc: stable <stable@vger.kernel.org> Signed-off-by: Federico Fuga <fuga@studiofuga.com> [ohad: rewrite the commit log] Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
2012-07-17	regulator: tps65910: set input_supply on desc unconditionally	Laxman Dewangan
	Set the supply_name in the regulator descriptor unconditionally and make this parameter as required parameter in the device node for successfully registration of the regulator. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-07-17	regulator: palmas: Fix calcuating selector in palmas_map_voltage_smps	Axel Lin
	The logic of calculating selector in palmas_map_voltage_smps() does not match the logic to list voltage in palmas_list_voltage_smps(). We use below equation to calculate voltage when selector > 0: voltage = (0.49V + (selector * 0.01V)) * RANGE RANGE is either x1 or x2 So we need to take into account with the multiplier set in VSEL register when calculating selector in palmas_map_voltage_smps() Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Graeme Gregory <gg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-07-17	ipvs: fix oops in ip_vs_dst_event on rmmod	Julian Anastasov
	After commit 39f618b4fd95ae243d940ec64c961009c74e3333 (3.4) "ipvs: reset ipvs pointer in netns" we can oops in ip_vs_dst_event on rmmod ip_vs because ip_vs_control_cleanup is called after the ipvs_core_ops subsys is unregistered and net->ipvs is NULL. Fix it by exiting early from ip_vs_dst_event if ipvs is NULL. It is safe because all services and dests for the net are already freed. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-07-17	ipvs: fix oops on NAT reply in br_nf context	Lin Ming
	IPVS should not reset skb->nf_bridge in FORWARD hook by calling nf_reset for NAT replies. It triggers oops in br_nf_forward_finish. [ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [ 579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112 [ 579.781792] PGD 218f9067 PUD 0 [ 579.781865] Oops: 0000 [#1] SMP [ 579.781945] CPU 0 [ 579.781983] Modules linked in: [ 579.782047] [ 579.782080] [ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8 [ 579.782300] RIP: 0010:[<ffffffff817b1ca5>] [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112 [ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287 [ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a [ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00 [ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90 [ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02 [ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000 [ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70 [ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b [ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0 [ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760) [ 579.783919] Stack: [ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00 [ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7 [ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0 [ 579.784477] Call Trace: [ 579.784523] <IRQ> [ 579.784562] [ 579.784603] [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8 [ 579.784707] [<ffffffff81704b58>] nf_iterate+0x47/0x7d [ 579.784797] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae [ 579.784906] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102 [ 579.784995] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae [ 579.785175] [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b [ 579.785179] [<ffffffff817ac417>] __br_forward+0x97/0xa2 [ 579.785179] [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257 [ 579.785179] [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb [ 579.785179] [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1 [ 579.785179] [<ffffffff81704b58>] nf_iterate+0x47/0x7d [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44 [ 579.785179] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102 [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44 [ 579.785179] [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54 [ 579.785179] [<ffffffff817ad62a>] br_handle_frame+0x213/0x229 [ 579.785179] [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257 [ 579.785179] [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1 [ 579.785179] [<ffffffff816e69fc>] process_backlog+0x99/0x1e2 [ 579.785179] [<ffffffff816e6800>] net_rx_action+0xdf/0x242 [ 579.785179] [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0 [ 579.785179] [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c [ 579.785179] [<ffffffff8188812c>] call_softirq+0x1c/0x30 The steps to reproduce as follow, 1. On Host1, setup brige br0(192.168.1.106) 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd 3. Start IPVS service on Host1 ipvsadm -A -t 192.168.1.106:80 -s rr ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m 4. Run apache benchmark on Host2(192.168.1.101) ab -n 1000 http://192.168.1.106/ ip_vs_reply4 ip_vs_out handle_response ip_vs_notrack nf_reset() { skb->nf_bridge = NULL; } Actually, IPVS wants in this case just to replace nfct with untracked version. So replace the nf_reset(skb) call in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call. Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-07-17	ixgbevf: Fix panic when loading driver	Alexander Duyck
	This patch addresses a kernel panic seen when setting up the interface. Specifically we see a NULL pointer dereference on the Tx descriptor cleanup path when enabling interrupts. This change corrects that so it cannot occur. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17	sh: pfc: pinctrl legacy group support.	Paul Mundt
	This follows the function support by simply doing 1 pin per group encapsulation in order to keep with legacy behaviour. This will be built on incrementally as SoCs define their own pin groups. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-07-17	sh: pfc: Ignore pinmux GPIOs with invalid enum IDs.	Paul Mundt
	If we encounter invalid entries in the pinmux GPIO range, make sure we've still got a dummy pin definition but don't otherwise map it. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-07-16	ax25: Fix missing break	Alan Cox
	At least there seems to be no reason to disallow ROSE sockets when NETROM is loaded. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17	sh: pfc: Export pinctrl binding init symbol.	Paul Mundt
	symbol_request() requires the registration symbol to be exported, make sure it is. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-07-16	MAINTAINERS: reflect actual changes in IEEE 802.15.4 maintainership	Dmitry Eremin-Solenikov
	As the life flows, developers priorities shifts a bit. Reflect actual changes in the maintainership of IEEE 802.15.4 code: Sergey mostly stopped cared about this piece of code. Most of the work recently was done by Alexander, so put him to the MAINTAINERS file to reflect his status and to ease the life of respective patches. Also add new net/mac802154/ directory to the list of maintained files. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net Jeff Kirsher says: ==================== This series contains fixes to e1000e. ... Bruce Allan (1): e1000e: fix test for PHY being accessible on 82577/8/9 and I217 Tushar Dave (1): e1000e: Correct link check logic for 82571 serdes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17	sh: pfc: Error out on pinctrl init resolution failure.	Paul Mundt
	pinctrl support is required for correct operation, failure to locate the init routine is fatal. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-07-16	caif: Fix access to freed pernet memory	Sjur Brændeland
	unregister_netdevice_notifier() must be called before unregister_pernet_subsys() to avoid accessing already freed pernet memory. This fixes the following oops when doing rmmod: Call Trace: [<ffffffffa0f802bd>] caif_device_notify+0x4d/0x5a0 [caif] [<ffffffff81552ba9>] unregister_netdevice_notifier+0xb9/0x100 [<ffffffffa0f86dcc>] caif_device_exit+0x1c/0x250 [caif] [<ffffffff810e7734>] sys_delete_module+0x1a4/0x300 [<ffffffff810da82d>] ? trace_hardirqs_on_caller+0x15d/0x1e0 [<ffffffff813517de>] ? trace_hardirqs_on_thunk+0x3a/0x3 [<ffffffff81696bad>] system_call_fastpath+0x1a/0x1f RIP [<ffffffffa0f7f561>] caif_get+0x51/0xb0 [caif] Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16	net: cgroup: fix access the unallocated memory in netprio cgroup	Gao feng
	there are some out of bound accesses in netprio cgroup. now before accessing the dev->priomap.priomap array,we only check if the dev->priomap exist.and because we don't want to see additional bound checkings in fast path, so we should make sure that dev->priomap is null or array size of dev->priomap.priomap is equal to max_prioidx + 1; so in write_priomap logic,we should call extend_netdev_table when dev->priomap is null and dev->priomap.priomap_len < max_len. and in cgrp_create->update_netdev_tables logic,we should call extend_netdev_table only when dev->priomap exist and dev->priomap.priomap_len < max_len. and it's not needed to call update_netdev_tables in write_priomap, we can only allocate the net device's priomap which we change through net_prio.ifpriomap. this patch also add a return value for update_netdev_tables & extend_netdev_table, so when new_priomap is allocated failed, write_priomap will stop to access the priomap,and return -ENOMEM back to the userspace to tell the user what happend. Change From v3: 1. add rtnl protect when reading max_prioidx in write_priomap. 2. only call extend_netdev_table when map->priomap_len < max_len, this will make sure array size of dev->map->priomap always bigger than any prioidx. 3. add a function write_update_netdev_table to make codes clear. Change From v2: 1. protect extend_netdev_table by RTNL. 2. when extend_netdev_table failed,call dev_put to reduce device's refcount. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16	ixgbevf: Prevent RX/TX statistics getting reset to zero	Narendra K
	The commit 4197aa7bb81877ebb06e4f2cc1b5fea2da23a7bd implements 64 bit per ring statistics. But the driver resets the 'total_bytes' and 'total_packets' from RX and TX rings in the RX and TX interrupt handlers to zero. This results in statistics being lost and user space reporting RX and TX statistics as zero. This patch addresses the issue by preventing the resetting of RX and TX ring statistics to zero. Signed-off-by: Narendra K <narendra_k@dell.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16	sctp: Fix list corruption resulting from freeing an association on a list	Neil Horman
	A few days ago Dave Jones reported this oops: [22766.294255] general protection fault: 0000 [#1] PREEMPT SMP [22766.295376] CPU 0 [22766.295384] Modules linked in: [22766.387137] ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90 ffff880147c03a74 [22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000 [22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000, [22766.387137] Stack: [22766.387140] ffff880147c03a10 [22766.387140] ffffffffa169f2b6 [22766.387140] ffff88013ed95728 [22766.387143] 0000000000000002 [22766.387143] 0000000000000000 [22766.387143] ffff880003fad062 [22766.387144] ffff88013c120000 [22766.387144] [22766.387145] Call Trace: [22766.387145] <IRQ> [22766.387150] [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0 [sctp] [22766.387154] [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp] [22766.387157] [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp] [22766.387161] [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0 [22766.387163] [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210 [22766.387166] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 [22766.387168] [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0 [22766.387169] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 [22766.387171] [<ffffffff81590a07>] ip_local_deliver+0x47/0x80 [22766.387172] [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680 [22766.387174] [<ffffffff81590c54>] ip_rcv+0x214/0x320 [22766.387176] [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910 [22766.387178] [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910 [22766.387180] [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40 [22766.387182] [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0 [22766.387183] [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440 [22766.387185] [<ffffffff81559280>] napi_skb_finish+0x70/0xa0 [22766.387187] [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130 [22766.387218] [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e] [22766.387242] [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e] [22766.387266] [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e] [22766.387268] [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0 [22766.387270] [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130 [22766.387273] [<ffffffff810734d0>] __do_softirq+0xe0/0x420 [22766.387275] [<ffffffff8169826c>] call_softirq+0x1c/0x30 [22766.387278] [<ffffffff8101db15>] do_softirq+0xd5/0x110 [22766.387279] [<ffffffff81073bc5>] irq_exit+0xd5/0xe0 [22766.387281] [<ffffffff81698b03>] do_IRQ+0x63/0xd0 [22766.387283] [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f [22766.387283] <EOI> [22766.387284] [22766.387285] [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b [22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00 48 89 fb 49 89 f5 66 c1 c0 08 66 39 46 02 [22766.387307] [22766.387307] RIP [22766.387311] [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp] [22766.387311] RSP <ffff880147c039b0> [22766.387142] ffffffffa16ab120 [22766.599537] ---[ end trace 3f6dae82e37b17f5 ]--- [22766.601221] Kernel panic - not syncing: Fatal exception in interrupt It appears from his analysis and some staring at the code that this is likely occuring because an association is getting freed while still on the sctp_assoc_hashtable. As a result, we get a gpf when traversing the hashtable while a freed node corrupts part of the list. Nominally I would think that an mibalanced refcount was responsible for this, but I can't seem to find any obvious imbalance. What I did note however was that the two places where we create an association using sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths which free a newly created association after calling sctp_primitive_ASSOCIATE. sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to the aforementioned hash table. the sctp command interpreter that process side effects has not way to unwind previously processed commands, so freeing the association from the __sctp_connect or sctp_sendmsg error path would lead to a freed association remaining on this hash table. I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init, which allows us to proerly use hlist_unhashed to check if the node is on a hashlist safely during a delete. That in turn alows us to safely call sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths before freeing them, regardles of what the associations state is on the hash list. I noted, while I was doing this, that the __sctp_unhash_endpoint was using hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes pointers to make that function work properly, so I fixed that up in a simmilar fashion. I attempted to test this using a virtual guest running the SCTP_RR test from netperf in a loop while running the trinity fuzzer, both in a loop. I wasn't able to recreate the problem prior to this fix, nor was I able to trigger the failure after (neither of which I suppose is suprising). Given the trace above however, I think its likely that this is what we hit. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: davej@redhat.com CC: davej@redhat.com CC: "David S. Miller" <davem@davemloft.net> CC: Vlad Yasevich <vyasevich@gmail.com> CC: Sridhar Samudrala <sri@us.ibm.com> CC: linux-sctp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16	cifs: always update the inode cache with the results from a FIND_*	Jeff Layton
	When we get back a FIND_FIRST/NEXT result, we have some info about the dentry that we use to instantiate a new inode. We were ignoring and discarding that info when we had an existing dentry in the cache. Fix this by updating the inode in place when we find an existing dentry and the uniqueid is the same. Cc: <stable@vger.kernel.org> # .31.x Reported-and-Tested-by: Andrew Bartlett <abartlet@samba.org> Reported-by: Bill Robertson <bill_robertson@debortoli.com.au> Reported-by: Dion Edwards <dion_edwards@debortoli.com.au> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16	cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps	Jeff Layton
	Jian found that when he ran fsx on a 32 bit arch with a large wsize the process and one of the bdi writeback kthreads would sometimes deadlock with a stack trace like this: crash> bt PID: 2789 TASK: f02edaa0 CPU: 3 COMMAND: "fsx" #0 [eed63cbc] schedule at c083c5b3 #1 [eed63d80] kmap_high at c0500ec8 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs] #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs] #4 [eed63e50] do_writepages at c04f3e32 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a #6 [eed63ea4] filemap_fdatawrite at c04e1b3e #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs] #8 [eed63ecc] do_sync_write at c052d202 #9 [eed63f74] vfs_write at c052d4ee #10 [eed63f94] sys_write at c052df4c #11 [eed63fb0] ia32_sysenter_target at c0409a98 EAX: 00000004 EBX: 00000003 ECX: abd73b73 EDX: 012a65c6 DS: 007b ESI: 012a65c6 ES: 007b EDI: 00000000 SS: 007b ESP: bf8db178 EBP: bf8db1f8 GS: 0033 CS: 0073 EIP: 40000424 ERR: 00000004 EFLAGS: 00000246 Each task would kmap part of its address array before getting stuck, but not enough to actually issue the write. This patch fixes this by serializing the marshal_iov operations for async reads and writes. The idea here is to ensure that cifs aggressively tries to populate a request before attempting to fulfill another one. As soon as all of the pages are kmapped for a request, then we can unlock and allow another one to proceed. There's no need to do this serialization on non-CONFIG_HIGHMEM arches however, so optimize all of this out when CONFIG_HIGHMEM isn't set. Cc: <stable@vger.kernel.org> Reported-by: Jian Li <jiali@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16	cifs: on CONFIG_HIGHMEM machines, limit the rsize/wsize to the kmap space	Jeff Layton
	We currently rely on being able to kmap all of the pages in an async read or write request. If you're on a machine that has CONFIG_HIGHMEM set then that kmap space is limited, sometimes to as low as 512 slots. With 512 slots, we can only support up to a 2M r/wsize, and that's assuming that we can get our greedy little hands on all of them. There are other users however, so it's possible we'll end up stuck with a size that large. Since we can't handle a rsize or wsize larger than that currently, cap those options at the number of kmap slots we have. We could consider capping it even lower, but we currently default to a max of 1M. Might as well allow those luddites on 32 bit arches enough rope to hang themselves. A more robust fix would be to teach the send and receive routines how to contend with an array of pages so we don't need to marshal up a kvec array at all. That's a fairly significant overhaul though, so we'll need this limit in place until that's ready. Cc: <stable@vger.kernel.org> Reported-by: Jian Li <jiali@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16	Initialise mid_q_entry before putting it on the pending queue	Sachin Prabhu
	A user reported a crash in cifs_demultiplex_thread() caused by an incorrectly set mid_q_entry->callback() function. It appears that the callback assignment made in cifs_call_async() was not flushed back to memory suggesting that a memory barrier was required here. Changing the code to make sure that the mid_q_entry structure was completely initialised before it was added to the pending queue fixes the problem. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-07-16	target: Check number of unmap descriptors against our limit	Roland Dreier
	Fail UNMAP commands that have more than our reported limit on unmap descriptors. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Fix possible integer underflow in UNMAP emulation	Roland Dreier
	It's possible for an initiator to send us an UNMAP command with a descriptor that is less than 8 bytes; in that case it's really bad for us to set an unsigned int to that value, subtract 8 from it, and then use that as a limit for our loop (since the value will wrap around to a huge positive value). Fix this by making size be signed and only looping if size >= 16 (ie if we have at least a full descriptor available). Also remove offset as an obfuscated name for the constant 8. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Fix reading of data length fields for UNMAP commands	Roland Dreier
	The UNMAP DATA LENGTH and UNMAP BLOCK DESCRIPTOR DATA LENGTH fields are in the unmap descriptor (the payload transferred to our data out buffer), not in the CDB itself. Read them from the correct place in target_emulated_unmap. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Add range checking to UNMAP emulation	Roland Dreier
	When processing an UNMAP command, we need to make sure that the number of blocks we're asked to UNMAP does not exceed our reported maximum number of blocks per UNMAP, and that the range of blocks we're unmapping doesn't go past the end of the device. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Add generation of LOGICAL BLOCK ADDRESS OUT OF RANGE	Roland Dreier
	Many SCSI commands are defined to return a CHECK CONDITION / ILLEGAL REQUEST with ASC set to LOGICAL BLOCK ADDRESS OUT OF RANGE if the initiator sends a command that accesses a too-big LBA. Add an enum value and case entries so that target code can return this status. Signed-off-by: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Make unnecessarily global se_dev_align_max_sectors() static	Roland Dreier
	Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Remove se_session.sess_wait_list	Roland Dreier
	Since we set se_session.sess_tearing_down and stop new commands from being added to se_session.sess_cmd_list before we wait for commands to finish when freeing a session, there's no need for a separate sess_wait_list -- if we let new commands be added to sess_cmd_list after setting sess_tearing_down, that would be a bug that breaks the logic of waiting in-flight commands. Also rename target_splice_sess_cmd_list() to target_sess_cmd_list_set_waiting(), since we are no longer splicing onto a separate list. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	qla2xxx: Remove racy, now-redundant check of sess_tearing_down	Roland Dreier
	Now that target_submit_cmd() / target_get_sess_cmd() check sess_tearing_down before adding commands to the list, we no longer need the check in qlt_do_work(). In fact this check is racy anyway (and that race is what inspired the change to add the check of sess_tearing_down to the target core). Cc: Chad Dupuis <chad.dupuis@qlogic.com> Cc: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Check sess_tearing_down in target_get_sess_cmd()	Roland Dreier
	Target core code assumes that target_splice_sess_cmd_list() has set sess_tearing_down and moved the list of pending commands to sess_wait_list, no more commands will be added to the session; if any are added, nothing keeps the se_session from being freed while the command is still in flight, which e.g. leads to use-after-free of se_cmd->se_sess in target_release_cmd_kref(). To enforce this invariant, put a check of sess_tearing_down inside of sess_cmd_lock in target_get_sess_cmd(); any checks before this are racy and can lead to the use-after-free described above. For example, the qla_target check in qlt_do_work() checks sess_tearing_down from work thread context but then drops all locks before calling target_submit_cmd() (as it must, since that is a sleeping function). However, since no locks are held, anything can happen with respect to the session it has looked up -- although it does correctly get sess_kref within its lock, so the memory won't be freed while target_submit_cmd() is actually running, nothing stops eg an ACL from being dropped and calling ->shutdown_session() (which calls into target_splice_sess_cmd_list()) before we get to target_get_sess_cmd(). Once this happens, the se_session memory can be freed as soon as target_submit_cmd() returns and qlt_do_work() drops its reference, even though we've just added a command to sess_cmd_list. To prevent this use-after-free, check sess_tearing_down inside of sess_cmd_lock right before target_get_sess_cmd() adds a command to sess_cmd_list; this is synchronized with target_splice_sess_cmd_list() so that every command is either waited for or not added to the queue. (nab: Keep target_submit_cmd() returning void for now..) Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	sbp-target: Consolidate duplicated error path code in sbp_handle_command()	Roland Dreier
	Cc: Chris Boot <bootc@bootc.net> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-16	target: Un-export target_get_sess_cmd()	Roland Dreier
	There are no in-tree users of target_get_sess_cmd() outside of target_core_transport.c. Any new code should use the higher-level target_submit_cmd() interface. So let's un-export target_get_sess_cmd() and make it static to the one file where it's actually used. (nab: Fix up minor fuzz to for-next) Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	qla2xxx: Get rid of redundant qla_tgt_sess.tearing_down	Roland Dreier
	The only place that sets qla_tgt_sess.tearing_down calls target_splice_sess_cmd_list() immediately afterwards, without dropping the lock it holds. That function sets se_session.sess_tearing_down, so we can get rid of the qla_target-specific flag, and in the one place that looks at the qla_tgt_sess.tearing_down flag just test se_session.sess_tearing_down instead. Cc: Chad Dupuis <chad.dupuis@qlogic.com> Cc: Arun Easi <arun.easi@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Make core_disable_device_list_for_node use pre-refactoring lock ordering	Nicholas Bellinger
	So after kicking around commit 547ac4c9c90 around a bit more, a tcm_qla2xxx LUN unlink OP has generated the following warning: [ 50.386625] qla2xxx [0000:07:00.0]-00af:0: Performing ISP error recovery - ha=ffff880263774000. [ 70.572988] qla2xxx [0000:07:00.0]-8038:0: Cable is unplugged... [ 126.527531] ------------[ cut here ]------------ [ 126.532677] WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x41/0x8c() [ 126.540433] Hardware name: S5520HC [ 126.544248] Modules linked in: tcm_vhost ib_srpt ib_cm ib_sa ib_mad ib_core tcm_qla2xxx tcm_loop tcm_fc libfc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin dm_multipath scsi_dh loop i2c_i801 kvm_intel kvm crc32c_intel i2c_core microcode joydev button iomemory_vsl(O) pcspkr ext3 jbd uhci_hcd lpfc ata_piix libata ehci_hcd qla2xxx mlx4_core scsi_transport_fc scsi_tgt igb [last unloaded: scsi_wait_scan] [ 126.595567] Pid: 3283, comm: unlink Tainted: G O 3.5.0-rc2+ #33 [ 126.603128] Call Trace: [ 126.605853] [<ffffffff81026b91>] ? warn_slowpath_common+0x78/0x8c [ 126.612737] [<ffffffff8102c342>] ? local_bh_enable_ip+0x41/0x8c [ 126.619433] [<ffffffffa03582a2>] ? core_disable_device_list_for_node+0x70/0xe3 [target_core_mod] [ 126.629323] [<ffffffffa035849f>] ? core_clear_lun_from_tpg+0x88/0xeb [target_core_mod] [ 126.638244] [<ffffffffa0362ec1>] ? core_tpg_post_dellun+0x17/0x48 [target_core_mod] [ 126.646873] [<ffffffffa03575ee>] ? core_dev_del_lun+0x26/0x8c [target_core_mod] [ 126.655114] [<ffffffff810bcbd1>] ? dput+0x27/0x154 [ 126.660549] [<ffffffffa0359aa0>] ? target_fabric_port_unlink+0x3b/0x41 [target_core_mod] [ 126.669661] [<ffffffffa034a698>] ? configfs_unlink+0xfc/0x14a [configfs] [ 126.677224] [<ffffffff810b5979>] ? vfs_unlink+0x58/0xb7 [ 126.683141] [<ffffffff810b6ef3>] ? do_unlinkat+0xbb/0x142 [ 126.689253] [<ffffffff81330c75>] ? page_fault+0x25/0x30 [ 126.695170] [<ffffffff81335df9>] ? system_call_fastpath+0x16/0x1b [ 126.702053] ---[ end trace 2f8e5b0a9ec797ef ]--- [ 126.756336] qla2xxx [0000:07:00.0]-00af:0: Performing ISP error recovery - ha=ffff880263774000. [ 146.942414] qla2xxx [0000:07:00.0]-8038:0: Cable is unplugged... So this warning triggered because device_list disable logic is now holding nacl->device_list_lock w/ spin_lock_irqsave before obtaining port->sep_alua_lock with only spin_lock_bh.. The original disable logic obtains *deve ahead of dropping the entry from deve->alua_port_list and then obtains ->device_list_lock to do the remaining work. Also, I'm pretty sure this particular warning is being generated by a demo-mode session in tcm_qla2xxx, and not by explicit NodeACL MappedLUNs. The Initiator MappedLUNs are already protected by a seperate configfs symlink reference back se_lun->lun_group, and the demo-mode se_node_acl (and associated ->device_list[]) is released during se_portal_group->tpg_group shutdown. The following patch drops the extra functional change to disable logic in commit 547ac4c9c90 Cc: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: refactor core_update_device_list_for_node()	Andy Grover
	Code was almost entirely divided based on value of bool param "enable". Split it into two functions. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Eliminate else using boolean logic	Andy Grover
	Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Misc retval cleanups	Andy Grover
	Bubble-up retval from iscsi_update_param_value() and iscsit_ta_authentication(). Other very small retval cleanups. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Remove hba param from core_dev_add_lun	Andy Grover
	Only used in a debugprint, and function signature is cleaner now. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: Remove unneeded double parentheses	Andy Grover
	Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: replace the processing thread with a TMR work queue	Christoph Hellwig
	The last functionality of the target processing thread is offloading possibly long running task management requests from the submitter context. To keep TMR semantics the same we need a single threaded ordered queue, which can be provided by a per-device workqueue with the right flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: remove transport_generic_handle_cdb_map	Christoph Hellwig
	Remove this command submission path which is not used by any in-tree driver. This also removes the now unused new_cmd_map fabtric method, which a few drivers implemented despite never calling transport_generic_handle_cdb_map. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: simply fabric driver queue full processing	Christoph Hellwig
	There is no need to schedule the delayed processing in a workqueue that offloads it to the target processing thread. Instead execute it directly from the workqueue. There will be a lot of future work in this area, which I'd likfe to defer for now as it is not nessecary for getting rid of the target processing thread. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: remove transport_generic_handle_data	Christoph Hellwig
	Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	tcm_fc: Offload WRITE I/O backend submission to tpg workqueue	Christoph Hellwig
	Defer the write processing to the internal to be able to use target_execute_cmd. I'm not even entirely sure the calling code requires this due to the convoluted structure in libfc, but let's be safe for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Mark Rustad <mark.d.rustad@intel.com> Cc: Kiran Patil <Kiran.patil@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	tcm_qla2xxx: Offload WRITE I/O backend submission to tcm_qla2xxx wq	Christoph Hellwig
	Defer the whole tcm_qla2xxx_handle_data call instead of just the error path to the qla2xxx-internal workqueue. Also remove the useless lock around the CMD_T_ABORTED check. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Cc: tcm-qla2xxx@qlogic.com Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	srpt: use target_execute_cmd for WRITEs in srpt_handle_rdma_comp	Christoph Hellwig
	srpt_handle_rdma_comp is called from kthread context and thus can execute target_execute_cmd directly. srpt_abort_cmd sets the CMD_T_LUN_STOP flag directly, and thus the abuse of transport_generic_handle_data can be replaced with an opencoded variant of that code path. I'm still not happy about a fabric driver poking into target core internals like this, but let's defer the bigger architecture changes for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	iscsit: use target_execute_cmd for WRITEs	Christoph Hellwig
	All three callers of transport_generic_handle_data are from user context and can use target_execute_cmd directly to handle the backend I/O submission of WRITE I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: merge transport_generic_write_pending into transport_generic_new_cmd	Christoph Hellwig
	Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: call transport_check_aborted_status from target_execute_cmd	Christoph Hellwig
	When we call target_execute_cmd for write commands the command has been on the state list before an abort might have come in before target_execute_cmd. Call transport_check_aborted_status to deal with this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-07-16	target: remove transport_generic_process_write	Christoph Hellwig
	Just call target_execute_cmd directly. Also, convert loopback, sbp, usb-gadget to use the newly exported target_execute_cmd(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>