LU-6142 lfsck: Fix style issues for lfsck_striped_dir.c This patch fixes issues reported by checkpatch for file lustre/lfsck/lfsck_striped_dir.c Test-Parameters: trivial Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: I6469e5973a5ee33c408ced48bb9ab162307fdf07 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54214 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-6142 lfsck: Fix style issues for lfsck_namespace.c This patch fixes issues reported by checkpatch for file lustre/lfsck/lfsck_namespace.c Test-Parameters: trivial Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: Ie415d9ace24adaa845a4298499128b2766dc66aa Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54213 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-6142 lfsck: Fix style issues for lfsck_engine.c This patch fixes issues reported by checkpatch for file lustre/lfsck/lfsck_engine.c Test-Parameters: trivial Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: Icf9941210e7e403088ac9216de38f8c49f52e72e Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54212 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-6142 lfsck: Fix style issues under lustre/lfsck This patch fixes issues reported by checkpatch for all files under folder lustre/lfsck Test-Parameters: trivial testlist=sanity-lfsck Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: I7fed1e66f82c691d94198390ad89e91db9bfcdea Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54165 Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-17280 scrub: skip dir stripes with OI After a fresh mount and LFSCK start, all directory stripes are added to the inconsistent list, so scrub prints the LFSCK message "inconsistent OI FID...fixed" for every stripe. Let's check the FID to OI mapping before adding a stripe to the inconsistent list. Also fix additional debug for scrub. HPE-bug-id: LUS-11777 Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com> Change-Id: I869f1cf71eb6c10f386a3f388a38032c73d2b41a Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53078 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com> Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-9859 libcfs: refactor libcfs initialization. Many lustre modules depend on libcfs having initialized properly, but do not explicitly check that it did. When lustre is built as discrete modules, this does not cause a problem because if the libcfs module fails initialization, the other modules don't even get loaded. When lustre is compiled into the kernel, all module_init() routines get run, so they need to check that the required initialization succeeded. This patch splits out the initialization of libcfs into a new libcfs_setup(), and has all modules call that. The misc_register() call is kept separate as it does not allocate any resources and if it fails, it fails hard, so there is no point in retrying. Other set-up allocates resources and so is best delayed until they are needed, and can be worth retrying. Ideally, the initialization would happen at mount time (or similar) rather than at load time. Doing this requires each module to check dependencies when they are activated rather than when they are loaded. Achieving that is a much larger job that would have to progress in stages. For now, this change ensures that if some initialization in libcfs fails, other modules will fail-safe. Linux-commit: 64bf0b1a079d61e9e059b9dc7a58e064c7d994ae Change-Id: I6b5ecdba0defc6e033f78d8fc2b9be9e26c7f720 Signed-off-by: Mr. NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52700 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-16796 lfsck: Change lfsck_assistant_object to use kref This patch changes struct lfsck_assistant_object to use kref(refcount_t) instead of atomic_t Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: I763a44d2c74f758da5a137c6673f8dfd2ef6dc0a Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52811 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Neil Brown <neilb@suse.de> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
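The kref pattern this commit moves to can be sketched in userspace as follows. This is only an illustration, not the Lustre code: the kernel's real API is kref_init(), kref_get() and kref_put() from <linux/kref.h>, while every sketch_* name below is a hypothetical stand-in, and the counter is a plain integer, so the demo assumes a single thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct kref. */
struct sketch_kref {
	unsigned int refcount;
};

void sketch_kref_init(struct sketch_kref *k)
{
	k->refcount = 1;	/* the creator holds the first reference */
}

void sketch_kref_get(struct sketch_kref *k)
{
	k->refcount++;
}

/* Drop one reference; run release() on the last put. Returns 1 if released. */
int sketch_kref_put(struct sketch_kref *k,
		    void (*release)(struct sketch_kref *))
{
	if (--k->refcount == 0) {
		release(k);
		return 1;
	}
	return 0;
}

/* Hypothetical object embedding the reference counter. */
struct sketch_object {
	int id;
	struct sketch_kref ref;
};

int sketch_released;	/* counts how many objects were freed */

void sketch_object_release(struct sketch_kref *k)
{
	/* Recover the containing object from the embedded counter. */
	struct sketch_object *obj = (struct sketch_object *)
		((char *)k - offsetof(struct sketch_object, ref));

	free(obj);
	sketch_released++;
}

/* Take two references, drop both; only the second put frees the object. */
int sketch_demo(void)
{
	struct sketch_object *obj = malloc(sizeof(*obj));
	int first, second;

	assert(obj != NULL);
	obj->id = 1;
	sketch_kref_init(&obj->ref);
	sketch_kref_get(&obj->ref);
	first = sketch_kref_put(&obj->ref, sketch_object_release);
	second = sketch_kref_put(&obj->ref, sketch_object_release);
	return first * 10 + second;
}
```

The point of the pattern is that the release callback is tied to the final put, so callers cannot forget the "was that the last reference" check that an open-coded atomic_t counter requires.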
LU-17010 lfsck: don't dump stack repeatedly If there are transactions started with LFSCK in dry-run mode, don't dump the stack repeatedly, as this can spam the console logs and significantly hurt performance. Test-Parameters: trivial testlist=sanity-lfsck Fixes: 0c1ae1cb9c ("LU-13124 scrub: check for multiple linked file") Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Change-Id: I0b0d64911453dc8ab947e284656311b5d0300c1e Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52356 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com> Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10499 pcc: use foreign layout for PCCRO on server side This patch adds the code for using foreign layout for PCCRO on the server side (LOD|MDD|MDT layers). Signed-off-by: Qian Yingjin <qian@ddn.com> Change-Id: I48467be9fef54bd05432528b685241aa53978d24 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51375 Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-10026 csdc: DoM pattern could be a combined value DoM pattern is LOV_PATTERN_MDT for now, and in the future it could be combined with LOV_PATTERN_COMPRESS to represent a compressed DoM component. Fix a minor glitch for lov_getstripe_old code path (in ll_lov_getstripe_ea_info), which intends to return the last component stripe info but the commit abf04e7ea3 omits to correctly set the last component stripe info before using it. Fixes: abf04e7ea3 ("LU-14337 lov: return valid stripe_count/size for PFL files") Signed-off-by: Bobi Jam <bobijam@whamcloud.com> Change-Id: Id0779c30c004b6979f88bf96b7b7b74a8b8c26e4 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51978 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
LU-17010 lfsck: don't create trans in dryrun mode In LFSCK, the LFSCK transaction should not be created in dryrun mode, which is related to the following patch, Fixes: 0c1ae1cb9c19 ("LU-13124 scrub: check for multiple linked file") Change-Id: Id543bc3c0e300c1cc14b670d724ebcacac3bf71b Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51849 Reviewed-by: Oleg Drokin <green@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-15527 dne: refactor commit-on-sharing for DNE Commit-on-sharing for DNE is different from the original commit-on-sharing: * the original commit-on-sharing is to eliminate dependency between operations from different clients. * while commit-on-sharing for DNE is to eliminate dependency between operations handled by different MDTs, so that upon multiple MDT failures, an operation replay won't fail because its dependent operation is not replayed by another MDT yet. The current CoS for DNE implementation checks dependency in the MDT layer, and it decides by checking whether the current operation is a distributed transaction; if so, it will trigger CoS upon conflicting locks. Actually this may miss some cases that should trigger CoS (even a local transaction should trigger CoS if it depends on a distributed transaction), and on the other hand it may trigger extra CoS because if two operations are handled by the same MDT, the dependency is ensured because they will always be replayed by transaction number. And to avoid mixing the code of two different CoS, the following changes are made: * add new ldlm lock mode LCK_TXN. On DNE system, downgrade PW/EX locks to this mode after transaction stop. * add li_initiator_id in struct ldlm_inodebits, which is the index of the MDT where the lock is enqueued, i.e. where the operation is handled. If another operation handled by a different MDT requests a conflicting PW|EX mode lock against this TXN mode lock, it will trigger commit to ensure the dependent operation is committed to disk (NB, it doesn't trigger commit on all involved MDTs, but only the MDT where the conflict happens, which is enough to allow replay to succeed). * remove LDLM_FL_COS_INCOMPAT and LDLM_FL_COS_ENABLED. * MDT layer doesn't need to check such dependency any more, since the lock itself knows. * updated sanityn 33c, 33d and 33e since fewer CoS are triggered now.
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com> Change-Id: Ib0149fcdc0178afd2c6894d211480f3c6c9284a0 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46641 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-16893 libcfs: Remove force_sig usage from lfsck The lfsck pool of kernel threads uses force_sig() to signal the worker threads to stop. A signal is used here as the lfsck workers may be waiting in various, and possibly nested, states. As force_sig() has been removed let us simply enable SIGINT to be passed to the worker threads using send_sig(). Test-parameters: testlist=sanity-lfsck,lfsck-performance HPE-bug-id: LUS-11670 Fixes: db9f9543ec ("LU-12634 libcfs: force_sig() removed task parameter") Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com> Change-Id: Ibf6a67f43687960b3eff9cb9a7c7dc8b1be1da63 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/51470 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com> Reviewed-by: Neil Brown <neilb@suse.de>
LU-6142 lustre: use list_first/last_entry() for list heads This patch changes list_entry(foo.next, ...) to list_first_entry(&foo, ...) and list_entry(foo.prev, ...) to list_last_entry(&foo, ...) in cases where 'foo' is a list head - not a list member. Test-Parameters: trivial Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I22b1278f5b481ce3074db3e59d37d9148016aed5 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50828 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com>
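The substitution described above can be illustrated with a small userspace sketch. The macros mirror the shape of the kernel's list_entry(), list_first_entry() and list_last_entry() from <linux/list.h>, but they are re-declared here only so the example compiles outside the kernel; the sketch_* helpers and struct sketch_item are hypothetical names, not Lustre code.

```c
#include <stddef.h>

/* Minimal userspace copy of the kernel's doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_first_entry(head, type, member) \
	list_entry((head)->next, type, member)
#define list_last_entry(head, type, member) \
	list_entry((head)->prev, type, member)

void sketch_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void sketch_list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Hypothetical list member type. */
struct sketch_item {
	int value;
	struct list_head link;
};

/*
 * Build a three-item list, then read both ends. Returns 1 when the old
 * spelling and the new spelling yield the same first element.
 */
int sketch_ends(int *first, int *last)
{
	struct list_head head;
	struct sketch_item a = { .value = 10 };
	struct sketch_item b = { .value = 20 };
	struct sketch_item c = { .value = 30 };
	int old_first;

	sketch_list_init(&head);
	sketch_list_add_tail(&a.link, &head);
	sketch_list_add_tail(&b.link, &head);
	sketch_list_add_tail(&c.link, &head);

	/* Old style: open-coded access through head.next. */
	old_first = list_entry(head.next, struct sketch_item, link)->value;

	/* New style: the intent (first or last element) is in the name. */
	*first = list_first_entry(&head, struct sketch_item, link)->value;
	*last = list_last_entry(&head, struct sketch_item, link)->value;

	return old_first == *first;
}
```

Both spellings compute the same pointer; the newer helpers only make it explicit that 'head' is a list head rather than a member. As with the kernel macros, neither form is valid on an empty list.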
LU-16518 misc: fix clang build errors Fix several format security errors by explicitly giving the format to the affected functions. A write test in badareaio attempts to write more than 2,147,479,552 bytes. The write will never write that much, so reduce the size of the write to make the test useful. Explicitly cast ll_nfs_get_name_filldir as a filldir_t and NR_WRITEBACK as a zone_stat_item. This silences some implicit cast errors. These casts can likely be removed when older kernel support is dropped. Refactor some code to avoid strncat, which was being used incorrectly anyway. Adjust some variables to use more appropriate types. Inline some functions which are only sometimes used. Remove a LASSERTF that will never trigger, since u32 is always smaller than IDIF_MAX_OID. Signed-off-by: Timothy Day <timday@amazon.com> Change-Id: I3962611de7d012e544636248353c072c9f9c9830 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50332 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
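The "format security" class of error mentioned above arises when a string that is only data is passed in the format position of a printf-style function. A minimal userspace sketch of the fix, using the standard snprintf() and a hypothetical helper name sketch_log() (not a Lustre function), might look like this:

```c
#include <stdio.h>

/*
 * Copy msg into buf. msg is treated purely as data, never as a format
 * string, so any '%' it contains is copied literally.
 * Returns the length snprintf() reports for the formatted string.
 */
int sketch_log(char *buf, size_t len, const char *msg)
{
	/*
	 * The form that -Wformat-security warns about would be:
	 *     snprintf(buf, len, msg);
	 * Giving the format explicitly removes the warning and the risk.
	 */
	return snprintf(buf, len, "%s", msg);
}
```

With the explicit "%s", a message such as "100%d" is stored unchanged instead of being interpreted as a conversion that reads a missing argument.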
LU-12610 misc: remove OBD_ -> CFS_ macros Remove OBD macros that are simply redefinitions of CFS macros. Test-Parameters: trivial Signed-off-by: Timothy Day <timday@amazon.com> Signed-off-by: Ben Evans <beevans@whamcloud.com> Change-Id: I15fe8aa22cb0203bed102a35361f4854ddaabecb Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50809 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Neil Brown <neilb@suse.de> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-16826 lfsck: init rec_fid before declare_insert lfsck_namespace_repair_dangling() doesn't init the record buffer properly before calling dt_declare_insert() for the case of local agent creation. Test-parameters: trivial testlist=sanity-lfsck HPE-bug-id: LUS-11609 Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com> Change-Id: Ibd0a44217e9ebcf469f7a817651e63214c218974 Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com> Reviewed-by: Shaun Tancheff <stancheff@cray.com> Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50980 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-6142 all: use list_first_entry() where appropriate. Lustre already uses list_first_entry() in many places, but it is not consistent. Let's make it consistent. The patch was generated with sed -i 's/list_entry(\([^,]*\)->next,/list_first_entry(\1,/' `git grep -l 'list_entry(.*->next' lustre/ lnet/ libcfs/ ` followed by some manual cleanup of indents, and adding list_first_entry() to libcfs/include/libcfs/util/list.h Test-Parameters: trivial Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: Id646fba1faf40282e66ede07c88c8db5ffadc211 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50826 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-16717 mdt: resume dir migration with bad_type LFSCK may set hash type to "none,bad_type" upon migration failure; set it back to "fnv_1a_64,migrating,bad_type,fixed" to allow migration resumption. fnv_1a_64 is set because it's the default hash type, and since we don't know the hash type in the original migration command, just try with it. LFSCK just adds the "bad_type" flag on such a directory, so that such migration can always be resumed in the future. Add sanity 230z. Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com> Change-Id: I19606aefcb9115e6724843785aea89a1c380e23f Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50797 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-15300 mdt: refresh LOVEA with LL granted This change tries to fix three problems: 1) mdt_reint_open() fetches LOVEA before the layout lock is taken. This may race with another process changing the layout and may result in a stale layout returned with a granted layout lock, so re-fetch LOVEA once the layout lock is granted. 2) lov_layout_change() should not apply old layouts which can get through when MDS doesn't take the layout lock. 3) LFSCK shouldn't ignore the layout version stored on MDS, to avoid a situation where the version degrades compared to the client's copy. This patch misses an optimization and can result in a number of useless calls to OSD to fetch LOVEA. To be fixed in a followup patch. Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com> Change-Id: Idee1101d152ab09947faf6d75574a8761a7690a5 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46413 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Zhenyu Xu <bobijam@hotmail.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>