Whamcloud - gitweb
LU-7650 o2iblnd: Put back work queue check previously removed 81/22281/3
author Doug Oucharek <doug.s.oucharek@intel.com>
Fri, 2 Sep 2016 01:15:09 +0000 (18:15 -0700)
committer Oleg Drokin <oleg.drokin@intel.com>
Sat, 10 Sep 2016 03:23:34 +0000 (03:23 +0000)
The previous patch, http://review.whamcloud.com/21304/, removed
a check needed until LU-5718 is properly addressed.  With
the check, LU-5718 results in an error message and a lost
RDMA operation.  Without it, we have memory corruption and
a crash (much harder to debug).

Putting the check back in case LU-5718 is not fixed soon.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I2efcc4e60a80794b38174da707d2a7fc27f81b6a
Reviewed-on: http://review.whamcloud.com/22281
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
lnet/klnds/o2iblnd/o2iblnd_cb.c

index 2a4e92a..5f13912 100644
@@ -1094,6 +1094,17 @@ kiblnd_init_rdma(kib_conn_t *conn, kib_tx_t *tx, int type,
                         break;
                 }
 
+               if (tx->tx_nwrq >= IBLND_MAX_RDMA_FRAGS) {
+                       CERROR("RDMA has too many fragments for peer %s (%d), "
+                              "src idx/frags: %d/%d dst idx/frags: %d/%d\n",
+                              libcfs_nid2str(conn->ibc_peer->ibp_nid),
+                              IBLND_MAX_RDMA_FRAGS,
+                              srcidx, srcrd->rd_nfrags,
+                              dstidx, dstrd->rd_nfrags);
+                       rc = -EMSGSIZE;
+                       break;
+               }
+
                 wrknob = MIN(MIN(kiblnd_rd_frag_size(srcrd, srcidx),
                                  kiblnd_rd_frag_size(dstrd, dstidx)), resid);
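
The trade-off described in the commit message (a reported error and a lost RDMA operation instead of silent memory corruption) can be illustrated with a standalone sketch. This is not the o2iblnd code: `sketch_tx`, `sketch_init_rdma`, and `SKETCH_MAX_FRAGS` are hypothetical names, the limit of 4 is an arbitrary stand-in for `IBLND_MAX_RDMA_FRAGS`, and the sketch returns directly where the patch sets `rc` and breaks out of the loop. It only shows the general pattern of checking the work request count before each write into a fixed-size array.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for IBLND_MAX_RDMA_FRAGS; the value is arbitrary. */
#define SKETCH_MAX_FRAGS 4

/* Hypothetical, heavily simplified transmit descriptor. */
struct sketch_tx {
        int    nwrq;                       /* work requests queued so far */
        size_t wrq_len[SKETCH_MAX_FRAGS];  /* fixed-size work request array */
};

/*
 * Split `resid` bytes into work requests of at most `frag_size` bytes each.
 * Returns the number of bytes queued, -EINVAL for a zero fragment size, or
 * -EMSGSIZE when the transfer would need more work requests than fit.
 */
long sketch_init_rdma(struct sketch_tx *tx, size_t resid, size_t frag_size)
{
        long queued = 0;

        if (frag_size == 0)
                return -EINVAL;

        tx->nwrq = 0;
        while (resid > 0) {
                size_t n;

                /* Check before writing: without this test the store below
                 * would run past the end of wrq_len[] and corrupt memory. */
                if (tx->nwrq >= SKETCH_MAX_FRAGS)
                        return -EMSGSIZE;

                n = resid < frag_size ? resid : frag_size;
                tx->wrq_len[tx->nwrq++] = n;
                resid -= n;
                queued += (long)n;
        }
        return queued;
}
```

With these assumed values, a 4096-byte transfer in 1024-byte fragments fits exactly, while a 4097-byte transfer fails with `-EMSGSIZE` before the fifth array slot is touched. The caller loses the operation but gets an error it can log.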