LU-17034 quota: lqeg_arr memmory corruption
Fix memory corruption caused by accessing memory
out of array lqeg_arr. It could happen when at least
one of OSTs has index larger than the whole number
of OSTs. For example, if the system has 4 OSTs with
indexes 0001, 0002, 00c9, 00ca. This issue more often
corrupted bucket_table in obd_uuid_hash or obd_nid_hash
causing to crash rhashtable code. However, it could
be the reason of other panics depending on the type
of corrupted neighbour memory region.
This patch adds an lge_idx field to each lqe global entry
to store index of the OST. It is needed to map OST index
to the array index to avoid out-of-bound array access.
This patch also add locking to protect lqe_glbl_data in
qmt_set_revoke and qmt_clear_lgeg_arr_nu. This was
forgotten in
50ff4d1da6.
This patch begins to store all connected MDTs in the quota
global pool. Thus handling MDTs beginning from this patch
is the same with OSTs stored in the global pool. It is the
1st step to introduce MDT pools.
Add conf-sanity_33c that reproduces mentioned memory
corruption without the fix.
Fixes:
50ff4d1da6 ("LU-16772 quota: protect lqe_glbl_data in qmt_site_recalc_cb")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Id6e4bcde09d9f32726d69f711eedb82729a2266e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52094
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>