37 Commits (master)

Author SHA1 Message Date
Vitaliy Filippov f20564b44b Fix 32-bit build warnings (99.9% in printf) 2024-02-22 12:22:16 +03:00
Vitaliy Filippov b127da40f7 Add a FIXME about incomplete PGs 2024-02-11 13:42:51 +03:00
Vitaliy Filippov 3ad16b9a1a Fix auto_scrubs not starting because of < vs <= =))
2023-06-17 17:32:21 +03:00
Vitaliy Filippov aa5dacc7a9 Do not start EC PGs without at least pg_data_size connections to old OSDs from each set
2023-06-17 02:16:30 +03:00
Vitaliy Filippov fa90b5a4e7 Schedule automatic scrubs correctly (not just after previous scrub) 2023-05-20 23:20:09 +03:00
Vitaliy Filippov c3bd26193d Implement PG scrub runner 2023-05-20 23:19:39 +03:00
Vitaliy Filippov a6d846863b Add min/max stripe and limit to OP_LIST 2023-05-20 23:19:39 +03:00
Vitaliy Filippov 0538a484b3 Add corrupted object state 2023-05-20 23:19:39 +03:00
Vitaliy Filippov 46462da45e Preload own PG history updates to fix PG state loop possibly applying the old metadata version 2023-04-23 01:50:30 +03:00
Vitaliy Filippov d06ed2b0e7 Implement online config update 2023-03-26 19:21:50 +03:00
Vitaliy Filippov 2c8241b7db Remove PG "peered" state 2023-02-21 01:30:42 +03:00
Vitaliy Filippov e950c024d3 Do not sync peer OSDs before listing
Sync before listing was added to wait for all PG writes possibly left in queue
from the previous master to finish before listing it

But in fact it may block the cluster when EC is used and some unstable writes
are left in the queue - they block journal flushing, rollback/stabilize is
required to unblock them, but rollback/stabilize may only happen after PG is
peered. But peering needs listings, listings are requested only after sync, and
sync itself waits for currently blocked writes waiting in the queue
2023-01-03 00:05:45 +03:00
Vitaliy Filippov 67019f5b02 Make OSD sort & sanitize PG history items 2023-01-01 23:17:42 +03:00
Vitaliy Filippov 998e24adf8 Add a new recovery_pg_switch setting to mix all PGs during recovery 2022-12-30 02:03:33 +03:00
Vitaliy Filippov 8669998e5e Fix discard_list_subop() for local ops 2022-12-17 17:54:13 +03:00
Vitaliy Filippov 472bce58ab Fix rare crash in osd_peering due to callback ordering 2022-12-12 00:27:05 +03:00
Vitaliy Filippov ae99ee6266 Rename base64.{cpp,h} to str_util 2022-07-31 01:12:37 +03:00
Vitaliy Filippov 2bdf415eb3 Fix unknown OSD numbers on error 2022-05-28 00:51:14 +03:00
Vitaliy Filippov 7cbfdff41a Replace some throws with force_stop 2022-02-20 00:21:19 +03:00
Vitaliy Filippov 951272f27f Try to process PG one after another 2022-02-19 19:25:55 +03:00
Vitaliy Filippov abaec2008c Fix OSDs missing misplaced recovery 2022-02-11 01:00:24 +03:00
Vitaliy Filippov 61ebed144a Fix OSDs possibly dying with "map::at" errors when other OSDs are stopped 2022-02-09 10:35:29 +03:00
Vitaliy Filippov 8dc1ffb13b Try to connect with PG peers before deciding it's incomplete :)
I already attempted to fix it in 0.6.11, but it happened so that the fix was
only partial :)
2022-01-23 19:19:26 +03:00
Vitaliy Filippov 31b9c683ee Fix flushing of unclean objects
This was preventing OSD failover when there were some unclean objects.
Bug was introduced in aa436027c8
2022-01-23 00:45:11 +03:00
Vitaliy Filippov dd74c5ce1b Fix OSDs marking PGs incomplete instead of trying to connect with peers 2021-12-14 01:57:51 +03:00
Vitaliy Filippov aa436027c8 Report pg/history from OSD on every degraded activation
Required to prevent data loss due to activation of an OSD with older data
when PG OSD set change doesn't occur. I.e. fixes the simplest case:
- Run 2 OSDs with 1 PG
- Start writing into the PG
- Stop OSD 2
- Stop OSD 1
- Start OSD 2

After this change the PG will refuse to start after the last step.
2021-11-13 22:39:17 +03:00
Vitaliy Filippov 57e2c503f7 Rename osd_t::c_cli to msgr 2021-04-17 16:32:09 +03:00
Vitaliy Filippov 97efb9e299 Do not crash on PG re-peering events when operations are in progress 2021-04-07 11:06:31 +03:00
Vitaliy Filippov 435045751d Delete objects only after a SYNC during rebalance in the non-immediate_commit mode
Previously OSDs could commit deletes before writes during recovery or rebalance
in the "lazy fsync" (immediate_commit=off) mode which could result in lost objects
2021-03-16 12:48:26 +03:00
Vitaliy Filippov b44f49aab2 Ignore zero OSDs in history osd_sets 2021-03-12 12:40:15 +03:00
Vitaliy Filippov bd178ac20f Fix history osd_set check - local OSD is always available! 2021-03-09 02:18:18 +03:00
Vitaliy Filippov 21e7686037 Fix possible "assertion failed: pg.inflight >= 0" error during PG stop 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 30d1ccd43e Fix an infinite loop when discarding list operations during stop_pg() 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 8bdd6d8d78 Reset PG state when stopping them 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 09b3e4e789 Fix OSDs being unable to stop PGs that are 'peering', not 'active'
This was sometimes leading to incorrect misplaced and degraded object count statistics
2021-03-08 17:04:10 +03:00
Vitaliy Filippov 46e79f3306 Wait for PGs to become clean before stopping them 2021-02-28 19:36:59 +03:00
Vitaliy Filippov bf9a175efc Move C/C++ sources to src subdirectory 2021-02-25 23:59:03 +03:00