LU-15364 ldlm: Kernel oops when stripe on Arm64 multiple MDTs 22/45922/5
author Kevin Zhao <kevin.zhao@linaro.org>
Wed, 22 Dec 2021 01:53:27 +0000 (09:53 +0800)
committer Oleg Drokin <green@whamcloud.com>
Tue, 18 Jan 2022 09:07:43 +0000 (09:07 +0000)
commit e2ac5f28c06a34318c9eb2c741ffbf47eea4690d
tree 1caacbcae4494f5316df5f1ed75fc6772b6341c5
parent 2169aed82a32df47be9aef2f249178ede6c7fadd
LU-15364 ldlm: Kernel oops when stripe on Arm64 multiple MDTs

In a setup with multiple MDTs, the `set_bit` operation must be
atomic. On the Arm64 platform, atomic operations rely on exclusive
access, which requires the target address to be aligned[1]. That is
why the crash is reported at __ll_sc_atomic64_or+0x4: the
instruction there is LDXR, which loads the value from the address
exclusively.
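
As a rough userspace illustration of how a `set_bit`-style helper
reduces to an atomic OR on a 64-bit word, the sketch below uses C11
atomics in place of the kernel bitops. The names `sketch_set_bit` and
`sketch_test_bit` are hypothetical and are not kernel or Lustre APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_BITS_PER_WORD 64

/* The sketch assumes each bitmap word is exactly 64 bits wide. */
static_assert(sizeof(uint64_t) * 8 == SKETCH_BITS_PER_WORD,
	      "bitmap word must be 64 bits");

/* Hypothetical helper: atomically set bit nr in a bitmap of 64-bit words. */
static inline void sketch_set_bit(unsigned int nr, _Atomic uint64_t *words)
{
	/* Atomic read-modify-write: OR the mask into the containing word. */
	atomic_fetch_or(&words[nr / SKETCH_BITS_PER_WORD],
			(uint64_t)1 << (nr % SKETCH_BITS_PER_WORD));
}

/* Hypothetical helper: return 1 if bit nr is set, 0 otherwise. */
static inline int sketch_test_bit(unsigned int nr, _Atomic uint64_t *words)
{
	return (int)((atomic_load(&words[nr / SKETCH_BITS_PER_WORD]) >>
		      (nr % SKETCH_BITS_PER_WORD)) & 1);
}
```

On Arm64 a 64-bit atomic OR of this kind is typically implemented
with an exclusive load/store pair or an equivalent atomic
instruction, so the word it operates on must be suitably aligned.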

The atomic64 operations require a 64-bit aligned address, but the
struct member ha_map is only 4-byte aligned; that is the root
cause. The error code of this crash is ESR = 0x96000021, which
indicates an alignment fault[2].
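
The alignment problem can be sketched in plain C. The two structs
below are hypothetical and are not the actual Lustre definitions; they
only show how a bitmap member placed after a 32-bit field lands on a
4-byte offset, and how an explicit alignment request moves it to an
8-byte offset. Whether the real change uses this exact mechanism is
not stated here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the bitmap follows a 32-bit counter, so its
 * offset within the struct is only a multiple of 4. */
struct sketch_unaligned {
	uint32_t count;
	uint32_t map[2];
};

/* Same fields, but the bitmap is explicitly aligned to 8 bytes;
 * the compiler inserts 4 bytes of padding after count. */
struct sketch_aligned {
	uint32_t count;
	alignas(8) uint32_t map[2];
};

/* A 64-bit atomic access needs an offset that is a multiple of 8. */
static inline int offset_ok_for_atomic64(size_t offset)
{
	return offset % 8 == 0;
}

static_assert(offsetof(struct sketch_unaligned, map) == 4,
	      "bitmap sits at a 4-byte offset without explicit alignment");
static_assert(offsetof(struct sketch_aligned, map) == 8,
	      "bitmap sits at an 8-byte offset with alignas(8)");
```

With the first layout, a 64-bit exclusive load on `map` would target
an address that is not 8-byte aligned, which matches the kind of fault
described above.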

1. https://developer.arm.com/documentation/den0024/a/ch05s01s02
2. https://developer.arm.com/documentation/ddi0595/2021-06/AArch64-Registers/ESR-EL1--Exception-Syndrome-Register--EL1-

Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
Change-Id: I3cc6d7347f05680ab55f00538e91886f006deb5d
Reviewed-on: https://review.whamcloud.com/45922
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/include/lustre_dlm.h