summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/cxgb4
AgeCommit message (Collapse)Author
2016-10-07IB/cxgb4: Move user vendor structuresLeon Romanovsky
This patch moves cxgb4 vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libcxgb4) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/{core,hw}: Add constant for node_descYuval Shaia
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07iw_cxgb4: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set, replaces deprecated create_singlethread_workqueue(). This is the identity conversion. The workqueue "workq" queues work item &skb_work. It has been identity converted. WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2016-09-19chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ↵Hariprasad Shenai
ULD's Allocate resources dynamically to cxgb4's Upper layer driver's(ULD) like cxgbit, iw_cxgb4 and cxgb4i. Allocate resources when they register with cxgb4 driver and free them while unregistering. All the queues and the interrupts for them will be allocated during ULD probe only and freed during remove. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_rx_data_ack()Varun Prakash
Add cxgb_mk_rx_data_ack() to remove duplicate code to form CPL_RX_DATA_ACK hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_abort_rpl()Varun Prakash
Add cxgb_mk_abort_rpl() to remove duplicate code to form CPL_ABORT_RPL hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_abort_req()Varun Prakash
Add cxgb_mk_abort_req() to remove duplicate code to form CPL_ABORT_REQ hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb, iw_cxgb4, cxgbit: add cxgb_mk_close_con_req()Varun Prakash
Add cxgb_mk_close_con_req() to remove duplicate code to form CPL_CLOSE_CON_REQ hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_mk_tid_release()Varun Prakash
Add cxgb_mk_tid_release() to remove duplicate code to form CPL_TID_RELEASE hardware command. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_compute_wscale()Varun Prakash
Add cxgb_compute_wscale() in libcxgb_cm.h to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_best_mtu()Varun Prakash
Add cxgb_best_mtu() in libcxgb_cm.h to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_is_neg_adv()Varun Prakash
Add cxgb_is_neg_adv() in libcxgb_cm.h to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_find_route6()Varun Prakash
Add cxgb_find_route6() in libcxgb_cm.c to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_find_route()Varun Prakash
Add cxgb_find_route() in libcxgb_cm.c to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15libcxgb,iw_cxgb4,cxgbit: add cxgb_get_4tuple()Varun Prakash
Add cxgb_get_4tuple() in libcxgb_cm.c to remove it's duplicate definitions from cxgb4/cm.c and cxgbit/cxgbit_cm.c. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A set of fixes for the current series in the realm of block. Like the previous pull request, the meat of it are fixes for the nvme fabrics/target code. Outside of that, just one fix from Gabriel for not doing a queue suspend if we didn't get the admin queue setup in the first place" * 'for-linus' of git://git.kernel.dk/linux-block: nvme-rdma: add back dependency on CONFIG_BLOCK nvme-rdma: fix null pointer dereference on req->mr nvme-rdma: use ib_client API to detect device removal nvme-rdma: add DELETING queue flag nvme/quirk: Add a delay before checking device ready for memblaze device nvme: Don't suspend admin queue that wasn't created nvme-rdma: destroy nvme queue rdma resources on connect failure nvme_rdma: keep a ref on the ctrl during delete/flush iw_cxgb4: block module unload until all ep resources are released iw_cxgb4: call dev_put() on l2t allocation failure
2016-09-13Merge branch 'nvmf-4.8-rc' of git://git.infradead.org/nvme-fabrics into ↵Jens Axboe
for-linus Sagi writes: Here we have: - Kconfig dependencies fix from Arnd - nvme-rdma device removal fixes from Steve - possible bad deref fix from Colin
2016-09-04iw_cxgb4: block module unload until all ep resources are releasedSteve Wise
Otherwise an endpoint can be still closing down causing a touch after free crash. Also WARN_ON if ulps have failed to destroy various resources during device removal. Fixes: ad61a4c7a9b7 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") Reviewed-by: Sagi Grimberg <sagi@grimbrg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2016-09-04iw_cxgb4: call dev_put() on l2t allocation failureSteve Wise
Reviewed-by: Sagi Grimberg <sagi@grimbrg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2016-09-02IB/cxgb4: Make _free_qp static to silence build warningBaoyou Xie
We get 1 warning when build kernel with W=1: drivers/infiniband/hw/cxgb4/qp.c:686:6: warning: no previous prototype for '_free_qp' [-Wmissing-prototypes] In fact, this function is only used in the file in which it is declared and don't need a declaration, but can be made static. so this patch marks it 'static'. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-23iw_cxgb4: Fix cxgb4 arm CQ logic w/IB_CQ_REPORT_MISSED_EVENTSBharat Potnuri
Current cxgb4 arm CQ logic ignores IB_CQ_REPORT_MISSED_EVENTS for request completion notification on a CQ. Due to this ib_poll_handler() assumes all events polled and avoids further iopoll scheduling. This patch adds logic to cxgb4 ib_req_notify_cq() handler to check if CQ is not empty and return accordingly. Based on the return value of ib_req_notify_cq() handler, ib_poll_handler() will schedule a run of iopoll handler. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-22iw_cxgb4: use the MPA initiator's IRD if < our ORDSteve Wise
The i40iw initiator sends an MPA-request with ird=16 and ord=16. The cxgb4 responder sends an MPA-reply with ord = 32 causing i40iw to terminate due to insufficient resources. The logic to reduce the ORD to <= peer's IRD was wrong. Reported-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-22iw_cxgb4: limit IRD/ORD advertised to ULP by device max.Steve Wise
The i40iw initiator sends an MPA-request with ird = 63, ord = 63. The cxgb4 responder sends a RST. Since the inbound ord=63 and it exceeds the max_ird/c4iw_max_read_depth (=32 default), chelsio decides to abort. Instead, cxgb4 should adjust the ord/ird down before presenting it to the ULP. Reported-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-03Merge branches 'cxgb4' and 'mlx5' into k.o/for-4.8Doug Ledford
2016-08-02iw_cxgb4: don't block in destroy_qp awaiting the last derefSteve Wise
Blocking in c4iw_destroy_qp() causes a deadlock when apps destroy a qp or disconnect a cm_id from their cm event handler function. There is no need to block here anyway, so just replace the refcnt atomic with a kref object and free the memory on the last put. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02iw_cxgb4: explicitly move the qp to ERROR state during flushSteve Wise
This forces the connection to abort if the application failed to disconnect before flushing. This is aligned with how the common flush services work. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02iw_cxgb4: stop MPA_REPLY timer when disconnectingSteve Wise
There exists a race where the application can setup a connection and then disconnect it before iw_cxgb4 processes the fw4_ack message. For passive side connections, the fw4_ack message is used to know when to stop the ep timer for MPA_REPLY messages. If the application disconnects before the fw4_ack is handled then c4iw_ep_disconnect() needs to clean up the timer state and stop the timer before restarting it for the disconnect timer. Failure to do this results in a "timer already started" message and a premature stopping of the disconnect timer. Fixes: e4b76a2 ("RDMA/iw_cxgb4: stop_ep_timer() after MPA negotiation") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02RDMA/iw_cxgb4: Use kfree_skb instead of kfreeHariprasad S
The commit 0f8ab0b6e91b4d53 ("RDMA/iw_cxgb4: Low resource fixes for Memory registration") from Jun 10, 2016, leads to the following static checker warning: drivers/infiniband/hw/cxgb4/mem.c:612 c4iw_alloc_mw() error: use kfree_skb() here instead of kfree(mhp->dereg_skb) Also fixes skb leak in c4iw_dealloc_mw Fixes: 0f8ab0b6e91b4d53 ("RDMA/iw_cxgb4: Low resource fixes for Memory registration") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23Merge branches 'cxgb4-4.8', 'mlx5-4.8' and 'fw-version' into k.o/for-4.8Doug Ledford
2016-06-23IB/cxgb4: Support device FW version stringIra Weiny
And remove sysfs fw_ver in favor of the core. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Low resource fixes for Completion queueHariprasad S
Pre-allocate buffers to deallocate completion queue, so that completion queue is deallocated during RDMA termination when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Low resource fixes for Memory registrationHariprasad S
Pre-allocate buffers for deregistering memory region and memory window during RDMA connection close, when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Low resource fixes for connection managerHariprasad S
Pre-allocate buffers for sending various control messages to close connection, abort connection, etc so that we gracefully handle connections when system is running out of memory. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: Add missing error codes for act open cmdHariprasad S
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: clean up c4iw_reject_cr()Hariprasad S
Get rid of unneeded code, and refactor things a bit. For MPA version 0 we abort the connection. For > 0, we attempt to send an MPA_START/REJECT Reply, and then disconnect gracefully. If the send of the MPA message fails, then we abort the connection. We can ignore c4iw_ep_disconnect() errors here because it will clean up the endpoint if there are failures. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: allocate enough space for debugfs "qps" dumpHariprasad S
With IPv6 addresses, the "qps" debugfs is running out of space and truncating the output. Bump the required size accordingly. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23RDMA/iw_cxgb4: only read markers_enabled mod param onceHariprasad S
markers_enabled should be read only once during MPA negotiation. The present code does read markers_enabled twice during negotiation which results in setting wrong recv/xmit markers if the markers_enabled is changed in the middle of negotiation. With this change the markers_enabled is read only once during MPA negotiation. recv markers are set based on markers enabled module parameter and xmit markers are set based on markers flag from the MPA_START_REQ/MPA_START_REP. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26IB/core: Make device counter infrastructure dynamicChristoph Lameter
In practice, each RDMA device has a unique set of counters that the hardware implements. Having a central set of counters that they must all adhere to is limiting and causes many useful counters to not be available. Therefore we create a dynamic counter registration infrastructure. The driver must implement a stats structure allocation routine, in which the driver must place the directory name it wants, a list of names for all of the counters, an array of u64 counters themselves, plus a few generic configuration options. We then implement a core routine to create a sysfs file for each of the named stats elements, and a core routine to retrieve the stats when any of the sysfs attribute files are read. To avoid excessive beating on the stats generation routine in the drivers, the core code also caches the stats for a short period of time so that someone attempting to read all of the stats in a given device's directory will not result in a stats generation call per file read. Future work will attempt to standardize just the shared stats elements, and possibly add a method to get the stats via netlink in addition to sysfs. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [ Add caching, make structure names more informative, add i40iw support, other significant rewrites from the original patch ]
2016-05-13Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into ↵Doug Ledford
k.o/for-4.7
2016-05-13iw_cxgb4: Convert a __force castBart Van Assche
__force casts should be avoided if there is a better alternative. Hence modify the comparison of s_addr with INADDR_ANY such that the __force cast is no longer necessary. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Vipul Pandya <vipul@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Add arp failure handlers to send_mpa_reply/reject()Hariprasad S
These handlers when called print error message to the kernel log, but the actual handling is done by _c4iw_free_ep() and process_timeout(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Always wake up waiter in c4iw_peer_abort_intr()Hariprasad S
Currently c4iw_peer_abort_intr() does not wake up the waiter if the endpoint state indicates we're using MPAv2 and we're currently trying to connect. This was introduced with commit 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") However, this original fix is flawed because it introduces a race that can cause a deadlock of the iwarp stack. Here is the race: ->local side sets up an active offload connection. ->local side sends MPA_START request. ->peer sends MPA_START response. ->local side ingress cpl thread begins processing the MPA_START response, but before it changes the state from MPA_REQ_SENT to FPDU_MODE: ->peer sends a RST which results in a ABORT_REQ_RSS. This triggers peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the stack will re-attempt the connection using MPAv1. ->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT WR. ->So the cpl thread is stuck waiting for a reply and cannot process the ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a halt because no more ingress cpls are processed by the stack... The correct fix for the issue is to always do the wake up in c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect(). Fixes: 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle ret value of process_mpa_reply() in rx_dataHariprasad S
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: atomic find and reference for listening endpointsHariprasad S
Add get_ep_from_stid() which will atomically find and reference the endpoint struct if found. This avoids touch-after-free races between threads destroying listening endpoints and the CPL processing thread processing an incoming PASS_ACCEPT_REQ CPL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle ULP accept/reject during ABORTINGHariprasad S
c4iw_reject() and c4iw_accept() need to handle the case where the endpoint has timed out and is in the middle of ABORTING the connection. Here is the flow that causes the BUG_ON() to fire on the server side: 1) offload connection setup and endpoint timer started 2) MPA_START request received from peer, CONNECT_REQUEST passed to ULP 3) endpoint timer fires, and process_timeout() aborts the connection, this moves the endpoint state to ABORTING until HW sends up the ABORT_RPL_RSS. 4) application exits closing the CONNECT_REQUEST cm_id. The IWCM calls c4iw_reject_cr() to destroy this connection request. 5) WHAMO: BUG_ON() because the state is ABORTING. The fix is to change c4iw_reject_cr() and c4iw_accept_cr() to fail the operation if the state is not in MPA_REQ_RCVD vs in DEAD. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Release ep for for FPDU_MODE and MPA_REQ_RCVD in process_timeoutHariprasad S
ARP failure may also happen when ep in FPDU_MODE and these failures need to be handled by process_timeout(). process_timeout() also has to handle case MPA_REQ_RCVD, setting abort to 1, leading to ep resource release. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Free skb in case of arp failure in _c4iw_free_ep()Hariprasad S
Arp failure for send_mpa_reply/reject() is handled by freeing the mpa_skb in c4iw_free_ep() before releasing ep. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: atomically lookup ep and get a referenceHariprasad S
There is a race between ULP threads calling c4iw_ep_disconnect() via c4iw_modify_rc_qp() and the ingress CPL thread where the ULP thread can free the endpoint just after the ingress CPL thread finds the ep pointer in the tid table. To avoid this, we now use the hwtid_idr table for lookups instead of the LLD tid table so we can lock around insert, remove, and lookup+get_ep to avoid the race. The CPL handlers now will either find the ep ptr and have a ref on it, or not find it and they can discard the CPL. Callers of get_ep_from_tid() will have a ref on the ep if found, and thus must deref when they are done. Negative advice in peer_abort_intr() need to dereference the ep. therefore peer_abort() is scheduled to dereference the ep later. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13RDMA/iw_cxgb4: Handle return value of c4iw_ofld_send() in abort_arp_failure()Hariprasad S
In abort_arp_failure(), the return value from c4iw_ofld_send() is ignored and thus if the CPL isn't sent, the endpoint is stuck and never gets aborted. Failure of c4iw_ofld_send() is treated as fatal error, and the ep resources are released in a safer context through process_work(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>