LU-8752 lnet: Stop MLX5 triggering a dump_cqe 06/24306/2
author Doug Oucharek <doug.s.oucharek@intel.com>
Mon, 12 Dec 2016 17:31:37 +0000 (09:31 -0800)
committer Oleg Drokin <oleg.drokin@intel.com>
Wed, 18 Jan 2017 18:59:00 +0000 (18:59 +0000)
We have found that MLX5 will trigger a dump_cqe if we don't
invalidate the rkey on a newly allocated MR for FastReg usage.

This fix just tags the MR as invalid on its creation if we are
using FastReg and that will force it to do an invalidate of the
rkey on first usage.

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: Id0de4f799d70c58011746520b20c8cfca6a29e6a
Reviewed-on: https://review.whamcloud.com/24306
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
lnet/klnds/o2iblnd/o2iblnd.c

index e919008..ee5a01f 100644
@@ -1536,7 +1536,10 @@ static int kiblnd_alloc_freg_pool(kib_fmr_poolset_t *fps, kib_fmr_pool_t *fpo)
                        goto out_middle;
                }
 
-               frd->frd_valid = true;
+               /* There appears to be a bug in MLX5 code where you must
+                * invalidate the rkey of a new FastReg pool before first
+                * using it. Thus, I am marking the FRD invalid here. */
+               frd->frd_valid = false;
 
                list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list);
                fpo->fast_reg.fpo_pool_size++;
 
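
The mechanism described in the commit message (a descriptor created as invalid forces an rkey invalidate on first use) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `struct toy_frd`, `toy_frd_init` and `toy_frd_map` are hypothetical names invented for illustration, not the o2iblnd structures or the verbs API, and the model only counts invalidate requests rather than posting real work requests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a FastReg descriptor (hypothetical, not kernel code). */
struct toy_frd {
	bool     valid;        /* plays the role of frd_valid in the patch */
	unsigned invalidates;  /* number of invalidate requests queued so far */
};

/* Pool allocation: start_valid = true models the old behaviour,
 * start_valid = false models the behaviour after this patch. */
static void toy_frd_init(struct toy_frd *frd, bool start_valid)
{
	assert(frd != NULL);
	frd->valid = start_valid;
	frd->invalidates = 0;
}

/* Mapping step (assumed behaviour): a descriptor that is not marked
 * valid gets an invalidate queued before it is registered. Later state
 * transitions are deliberately left out of this sketch. */
static unsigned toy_frd_map(struct toy_frd *frd)
{
	assert(frd != NULL);
	if (!frd->valid)
		frd->invalidates++;
	return frd->invalidates;
}
```

In this model, a descriptor initialised with `start_valid = true` reaches its first mapping with no invalidate queued, which corresponds to the case the commit message associates with the MLX5 dump_cqe. Initialising with `start_valid = false` queues exactly one invalidate on first use.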