Commit Graph

93 Commits (master)

Author SHA1 Message Date
Vitaliy Filippov 9ad6822353 Release 1.5.0
Test / test_rm (push) Successful in 14s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m36s Details
Test / test_snapshot_down (push) Successful in 31s Details
Test / test_snapshot_down_ec (push) Successful in 30s Details
Test / test_splitbrain (push) Successful in 24s Details
Test / test_snapshot_chain (push) Successful in 2m20s Details
Test / test_snapshot_chain_ec (push) Successful in 3m5s Details
Test / test_rebalance_verify_imm (push) Successful in 5m11s Details
Test / test_rebalance_verify (push) Successful in 5m55s Details
Test / test_switch_primary (push) Successful in 33s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 4m26s Details
Test / test_write (push) Successful in 54s Details
Test / test_write_xor (push) Successful in 57s Details
Test / test_write_no_same (push) Successful in 19s Details
Test / test_rebalance_verify_ec (push) Successful in 7m21s Details
Test / test_heal_pg_size_2 (push) Successful in 4m36s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m33s Details
Test / test_heal_ec (push) Successful in 6m15s Details
Test / test_heal_csum_32k_dj (push) Successful in 6m31s Details
Test / test_heal_csum_32k (push) Successful in 6m29s Details
Test / test_heal_csum_4k_dmj (push) Successful in 6m15s Details
Test / test_scrub_zero_osd_2 (push) Successful in 1m16s Details
Test / test_scrub (push) Successful in 1m18s Details
Test / test_scrub_xor (push) Successful in 1m13s Details
Test / test_heal_csum_4k_dj (push) Successful in 7m10s Details
Test / test_scrub_ec (push) Successful in 56s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 59s Details
Test / test_heal_csum_4k (push) Successful in 6m2s Details
Test / test_scrub_pg_size_3 (push) Successful in 2m11s Details
Test / test_nfs (push) Successful in 11s Details
After half a year of hard work, VitastorFS is finally here ! :-)

New features:
- VitastorFS, a full-featured clustered (read-write-many) file system.
  Documentation: [VitastorFS](docs/usage/nfs.en.md)
- Embedded key-value database implementation based on Parallel Optimistic B-Tree
  algorithm and used for the metadata of VitastorFS
- Pool management commands in vitastor-cli (create-pool, list-pools, rm-pool, modify-pool).
  Thanks MIND Software (https://mindsw.io) for their contribution!
  [Documentation](docs/usage/cli.en.md#create-pool)

Bug fixes:
- Fix a very rare "infinite loop" in the client library
- Fix a rare OSD hang on during start when zeroing out bad metadata entries left from the previous run
2024-03-16 15:35:10 +03:00
Vitaliy Filippov 38b8963330 Release 1.4.8
Test / test_rm (push) Successful in 19s Details
Test / test_move_reappear (push) Successful in 26s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m40s Details
Test / test_snapshot_down (push) Successful in 31s Details
Test / test_snapshot_down_ec (push) Successful in 34s Details
Test / test_splitbrain (push) Successful in 27s Details
Test / test_snapshot_chain (push) Successful in 2m18s Details
Test / test_snapshot_chain_ec (push) Successful in 2m59s Details
Test / test_rebalance_verify_imm (push) Successful in 5m32s Details
Test / test_rebalance_verify (push) Successful in 6m11s Details
Test / test_switch_primary (push) Successful in 41s Details
Test / test_write (push) Successful in 45s Details
Test / test_write_no_same (push) Successful in 23s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 5m2s Details
Test / test_write_xor (push) Successful in 55s Details
Test / test_rebalance_verify_ec (push) Successful in 6m22s Details
Test / test_heal_pg_size_2 (push) Successful in 5m41s Details
Test / test_heal_csum_32k_dmj (push) Successful in 5m59s Details
Test / test_heal_csum_32k_dj (push) Successful in 7m19s Details
Test / test_heal_csum_32k (push) Successful in 7m17s Details
Test / test_heal_csum_4k_dmj (push) Successful in 7m14s Details
Test / test_scrub (push) Successful in 1m12s Details
Test / test_heal_ec (push) Successful in 9m2s Details
Test / test_scrub_xor (push) Successful in 56s Details
Test / test_scrub_zero_osd_2 (push) Successful in 1m8s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 2m1s Details
Test / test_heal_csum_4k_dj (push) Successful in 4m45s Details
Test / test_scrub_pg_size_3 (push) Successful in 2m31s Details
Test / test_heal_csum_4k (push) Successful in 4m54s Details
Test / test_scrub_ec (push) Successful in 46s Details
- Do not use \r if output is not a terminal (should fix unexpected job output in proxmox)
- Fix rm/rm-data error return code, add --down-ok option to bypass the error
- Add EIO retry timeout and allow to disable these retries, rename up_wait_retry_interval to client_retry_interval
- Add ubuntu jammy build
- Wait for blockstore initialisation before starting OSD (prevent timeouts when init takes time)
- Fix a rare use-after-free in automatic sync after delete in blockstore
2024-02-29 09:58:34 +03:00
Zibort Cloud 02d1f16bbd Add ubuntu jammy build
PR #62 #62

I accept Vitastor CLA agreement: https://git.yourcmc.ru/vitalif/vitastor/src/branch/master/CLA-en.md
2024-02-28 11:43:54 +03:00
Vitaliy Filippov 5e934264cf Release 1.4.7
- Fix another old "BUG: Attempt to overwrite used offset" in a very simple
  case: bs=4k rw=write iodepth=16 from OSD start; add this case to tests
- Fix a rare crash with "unexpected state during flush: 0x51" possible with
  EC since 1.4.2 during rebalance and OSD outages
- Fix a rare write stall with EC & immediate_commit=none caused by sync
  operations reserving unneeded space in the journal
- Fix 32-bit build warnings, most in printf/scanf format strings
2024-02-22 12:45:52 +03:00
Vitaliy Filippov f87964861d Release 1.4.6
Test / test_snapshot_ec (push) Successful in 29s Details
Test / test_rm (push) Successful in 18s Details
Test / test_move_reappear (push) Successful in 26s Details
Test / test_snapshot_down (push) Successful in 28s Details
Test / test_snapshot_down_ec (push) Successful in 32s Details
Test / test_splitbrain (push) Successful in 23s Details
Test / test_snapshot_chain (push) Successful in 2m3s Details
Test / test_snapshot_chain_ec (push) Successful in 2m46s Details
Test / test_rebalance_verify_imm (push) Successful in 3m1s Details
Test / test_rebalance_verify (push) Successful in 3m30s Details
Test / test_switch_primary (push) Successful in 38s Details
Test / test_write (push) Successful in 32s Details
Test / test_write_no_same (push) Successful in 17s Details
Test / test_write_xor (push) Successful in 38s Details
Test / test_rebalance_verify_ec (push) Successful in 4m38s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 3m57s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m14s Details
Test / test_heal_csum_32k_dmj (push) Successful in 5m21s Details
Test / test_heal_csum_32k (push) Successful in 5m45s Details
Test / test_heal_csum_4k_dmj (push) Successful in 5m27s Details
Test / test_scrub (push) Successful in 1m30s Details
Test / test_heal_csum_4k_dj (push) Successful in 5m26s Details
Test / test_scrub_zero_osd_2 (push) Successful in 38s Details
Test / test_scrub_xor (push) Successful in 40s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m8s Details
Test / test_scrub_ec (push) Successful in 1m5s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m49s Details
Test / test_heal_csum_4k (push) Successful in 5m41s Details
Test / test_heal_ec (push) Successful in 4m11s Details
Test / test_heal_pg_size_2 (push) Successful in 4m22s Details
Unwavering stabilization of 1.4.x, continued :-)

- Include the accidentally lost part of 1.4.5 journal trimming fix
- Fix a possible OSD crash with "BUG: Attempt to overwrite used offset"
  which was probably present for long time, but became apparent after
  fixing flapping tests in CI
- Fix remaining flapping tests in CI. It was the first time when tests
  actually passed without retries :-)
2024-02-20 17:01:26 +03:00
Vitaliy Filippov f882c7dd87 Release 1.4.5
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m23s Details
Test / test_rm (push) Successful in 15s Details
Test / test_move_reappear (push) Successful in 21s Details
Test / test_snapshot_down (push) Successful in 26s Details
Test / test_snapshot_down_ec (push) Successful in 30s Details
Test / test_splitbrain (push) Successful in 29s Details
Test / test_snapshot_chain (push) Successful in 2m17s Details
Test / test_snapshot_chain_ec (push) Successful in 3m14s Details
Test / test_rebalance_verify_imm (push) Successful in 3m24s Details
Test / test_rebalance_verify (push) Successful in 3m59s Details
Test / test_switch_primary (push) Successful in 35s Details
Test / test_write_xor (push) Successful in 32s Details
Test / test_write_no_same (push) Successful in 13s Details
Test / test_rebalance_verify_ec (push) Successful in 3m46s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 3m13s Details
Test / test_heal_pg_size_2 (push) Successful in 3m52s Details
Test / test_heal_ec (push) Successful in 5m25s Details
Test / test_heal_csum_32k_dj (push) Successful in 4m24s Details
Test / test_heal_csum_4k_dmj (push) Successful in 4m23s Details
Test / test_heal_csum_4k_dj (push) Successful in 4m17s Details
Test / test_scrub (push) Successful in 38s Details
Test / test_scrub_zero_osd_2 (push) Successful in 29s Details
Test / test_scrub_xor (push) Successful in 30s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 43s Details
Test / test_scrub_ec (push) Successful in 32s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m46s Details
Test / test_heal_csum_4k (push) Successful in 4m4s Details
Test / test_write (push) Successful in 1m38s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m5s Details
Test / test_heal_csum_32k (push) Successful in 4m15s Details
- Fix a write stall caused by incorrect journal trimming introduced in 1.4.4 :)
- Fix PGs sometimes hanging in "starting" state on mass OSD restarts
- Fix a rare crash with "map::at" during OSD pings
- Use new defaults for non-capacitor (desktop) SSDs - improves T1Q256 random write from ~6k iops to ~45k iops
- Make journal_trim_interval configurable
2024-02-16 10:13:33 +03:00
Vitaliy Filippov c777a0041a Release 1.4.4
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m23s Details
Test / test_move_reappear (push) Successful in 21s Details
Test / test_rm (push) Successful in 16s Details
Test / test_snapshot_down (push) Successful in 30s Details
Test / test_snapshot_down_ec (push) Successful in 30s Details
Test / test_splitbrain (push) Successful in 25s Details
Test / test_snapshot_chain (push) Successful in 2m18s Details
Test / test_snapshot_chain_ec (push) Successful in 3m13s Details
Test / test_rebalance_verify_imm (push) Successful in 3m8s Details
Test / test_rebalance_verify (push) Successful in 3m41s Details
Test / test_switch_primary (push) Successful in 36s Details
Test / test_write (push) Successful in 40s Details
Test / test_write_no_same (push) Successful in 16s Details
Test / test_write_xor (push) Successful in 39s Details
Test / test_rebalance_verify_ec (push) Successful in 4m56s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 4m21s Details
Test / test_heal_pg_size_2 (push) Successful in 4m15s Details
Test / test_heal_ec (push) Successful in 5m1s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m32s Details
Test / test_heal_csum_32k (push) Successful in 5m38s Details
Test / test_heal_csum_4k_dmj (push) Successful in 5m43s Details
Test / test_scrub (push) Successful in 1m31s Details
Test / test_scrub_zero_osd_2 (push) Successful in 1m17s Details
Test / test_heal_csum_4k_dj (push) Successful in 5m57s Details
Test / test_scrub_xor (push) Successful in 30s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m7s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 41s Details
Test / test_scrub_ec (push) Successful in 24s Details
Test / test_heal_csum_32k_dmj (push) Successful in 3m56s Details
Test / test_heal_csum_4k (push) Successful in 3m16s Details
A couple of fixes for EC pools

- Fix a segfault possible on partial EC overwrite in 1234 -> 5030 rebalance scenario
- Fix two problems leading to EC pools stalling on rebalance & parallel sudden stops
  of OSDs, for example during a sudden poweroff of a host:
  - Recovery auto-tuning (1.4.0 feature) could apply too large delays and stall
    the EC journal - fixed by limiting delays with a new recovery_tune_sleep_cutoff_us
    parameter (10 seconds by default) and applying recovery pauses before write
    operations, not after them, to not occupy space in the journal for long time
  - Dynamic journal space reservation (1.3.0 feature) wasn't accounting new writes
    when checking the limit so OSDs could still fill the journal fully and stall -
    fixed by including new writes into the limit
- Print etcd dbSize instead of dbSizeInUse in status
2024-02-11 16:23:08 +03:00
Vitaliy Filippov 27e9f244ec Release 1.4.3
Test / test_move_reappear (push) Successful in 22s Details
Test / test_rm (push) Successful in 15s Details
Test / test_snapshot_down (push) Successful in 36s Details
Test / test_snapshot_down_ec (push) Successful in 30s Details
Test / test_interrupted_rebalance (push) Successful in 5m3s Details
Test / test_splitbrain (push) Successful in 20s Details
Test / test_snapshot_chain (push) Successful in 3m1s Details
Test / test_snapshot_chain_ec (push) Successful in 3m13s Details
Test / test_rebalance_verify_imm (push) Successful in 3m0s Details
Test / test_rebalance_verify (push) Successful in 3m29s Details
Test / test_switch_primary (push) Successful in 37s Details
Test / test_write (push) Successful in 44s Details
Test / test_write_xor (push) Successful in 39s Details
Test / test_write_no_same (push) Successful in 16s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 4m13s Details
Test / test_rebalance_verify_ec (push) Successful in 5m31s Details
Test / test_heal_ec (push) Successful in 4m54s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m25s Details
Test / test_heal_csum_32k (push) Successful in 6m8s Details
Test / test_heal_csum_4k_dmj (push) Successful in 6m17s Details
Test / test_scrub (push) Successful in 1m8s Details
Test / test_scrub_zero_osd_2 (push) Successful in 55s Details
Test / test_scrub_xor (push) Successful in 45s Details
Test / test_heal_csum_4k_dj (push) Successful in 6m22s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m11s Details
Test / test_scrub_ec (push) Successful in 46s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m39s Details
Test / test_heal_csum_4k (push) Successful in 6m8s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m15s Details
Test / test_heal_pg_size_2 (push) Successful in 4m41s Details
Hotfix for hotfix O:-)

- "Write stall fix" was incomplete and EC write stalls could
  continue even on 1.4.2. Now they're finally fixed O:-)
- Make monitor ignore statistics of stopped OSDs. Previously if you stopped all
  OSDs the last total I/O numbers would remain the same indefinitely
2024-02-09 00:29:31 +03:00
Vitaliy Filippov 016115c0d4 Release 1.4.2
Test / test_rm (push) Successful in 16s Details
Test / test_move_reappear (push) Successful in 20s Details
Test / test_snapshot_down (push) Successful in 25s Details
Test / test_snapshot_down_ec (push) Successful in 39s Details
Test / test_interrupted_rebalance (push) Successful in 4m52s Details
Test / test_splitbrain (push) Successful in 20s Details
Test / test_snapshot_chain (push) Successful in 3m11s Details
Test / test_rebalance_verify_imm (push) Successful in 3m16s Details
Test / test_rebalance_verify (push) Successful in 3m45s Details
Test / test_switch_primary (push) Successful in 36s Details
Test / test_write (push) Successful in 40s Details
Test / test_write_xor (push) Successful in 40s Details
Test / test_write_no_same (push) Successful in 17s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 3m8s Details
Test / test_rebalance_verify_ec (push) Successful in 5m57s Details
Test / test_heal_pg_size_2 (push) Successful in 4m22s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m20s Details
Test / test_heal_ec (push) Successful in 5m54s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m24s Details
Test / test_heal_csum_32k (push) Successful in 6m3s Details
Test / test_heal_csum_4k_dmj (push) Successful in 5m54s Details
Test / test_scrub_zero_osd_2 (push) Successful in 53s Details
Test / test_scrub (push) Successful in 55s Details
Test / test_heal_csum_4k_dj (push) Successful in 6m14s Details
Test / test_scrub_xor (push) Successful in 1m1s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m50s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 57s Details
Test / test_scrub_ec (push) Successful in 52s Details
Test / test_heal_csum_4k (push) Successful in 5m47s Details
Test / test_snapshot_chain_ec (push) Successful in 1m24s Details
- Log to systemd by default
- Fix excessive autosyncs after every operation with disabled immediate_commit (introduced in 1.1.0)
- Fix a possible write stall with EC due to the lack of OSD wakeup after stabilizing previous writes
- Change sync operation semantics as a final fix to possible write stalls with EC and disabled immediate_commit
- Sync after deleting data in CLI rm / rm-data if immediate_commit is disabled
- Fix OSDs ignoring syncs & autosyncs for delete operations
- Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools)
- Speed up monitor failover - change default etcd_mon_ttl from 30 to 5 seconds
- Speed up operation retries - change default up_wait_retry_interval to 50 ms
- Add patch for libvirt 9.10
2024-02-04 02:23:49 +03:00
Vitaliy Filippov d27524f441 Add patch for libvirt 9.10 2024-01-25 01:09:12 +03:00
Vitaliy Filippov ba55f91409 Release 1.4.1
Test / test_move_reappear (push) Successful in 22s Details
Test / test_snapshot_chain (push) Successful in 1m27s Details
Test / test_interrupted_rebalance_ec (push) Successful in 4m41s Details
Test / test_snapshot_down (push) Successful in 25s Details
Test / test_snapshot_chain_ec (push) Successful in 2m0s Details
Test / test_splitbrain (push) Successful in 18s Details
Test / test_snapshot_down_ec (push) Successful in 25s Details
Test / test_rebalance_verify_ec (push) Failing after 2m21s Details
Test / test_rebalance_verify_imm (push) Successful in 2m30s Details
Test / test_switch_primary (push) Successful in 39s Details
Test / test_write (push) Successful in 35s Details
Test / test_interrupted_rebalance (push) Failing after 10m8s Details
Test / test_write_xor (push) Successful in 36s Details
Test / test_write_no_same (push) Successful in 17s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 4m4s Details
Test / test_heal_pg_size_2 (push) Successful in 3m55s Details
Test / test_rebalance_verify (push) Successful in 8m31s Details
Test / test_heal_ec (push) Successful in 5m9s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m27s Details
Test / test_heal_csum_32k (push) Successful in 5m42s Details
Test / test_heal_csum_32k_dj (push) Successful in 6m1s Details
Test / test_scrub (push) Successful in 59s Details
Test / test_scrub_zero_osd_2 (push) Successful in 38s Details
Test / test_heal_csum_4k_dmj (push) Successful in 7m5s Details
Test / test_scrub_xor (push) Successful in 58s Details
Test / test_heal_csum_4k_dj (push) Successful in 6m25s Details
Test / test_scrub_ec (push) Failing after 42s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m32s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m38s Details
Test / test_heal_csum_4k (push) Successful in 5m38s Details
- Fix a monitor crash on primary OSD switching introduced in 1.4.0
- Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree
- Fix a realloc memory leak in theory possible with too large listings (OSD_OP_LIST)
2024-01-18 02:31:42 +03:00
Vitaliy Filippov 5280d1d561 Release 1.4.0
Test / test_snapshot (push) Successful in 26s Details
Test / test_snapshot_ec (push) Successful in 26s Details
Test / test_rm (push) Successful in 16s Details
Test / test_move_reappear (push) Successful in 24s Details
Test / test_snapshot_down (push) Successful in 26s Details
Test / test_snapshot_down_ec (push) Successful in 30s Details
Test / test_splitbrain (push) Successful in 28s Details
Test / test_snapshot_chain (push) Successful in 2m41s Details
Test / test_rebalance_verify_imm (push) Successful in 2m48s Details
Test / test_rebalance_verify (push) Successful in 3m28s Details
Test / test_write (push) Successful in 47s Details
Test / test_write_no_same (push) Successful in 14s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 3m5s Details
Test / test_rebalance_verify_ec (push) Successful in 3m41s Details
Test / test_heal_pg_size_2 (push) Successful in 3m45s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m52s Details
Test / test_heal_ec (push) Successful in 5m11s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m42s Details
Test / test_heal_csum_32k (push) Successful in 5m56s Details
Test / test_scrub (push) Successful in 1m25s Details
Test / test_scrub_zero_osd_2 (push) Successful in 1m18s Details
Test / test_scrub_xor (push) Successful in 42s Details
Test / test_heal_csum_4k_dmj (push) Successful in 6m49s Details
Test / test_heal_csum_4k_dj (push) Successful in 6m32s Details
Test / test_heal_csum_4k (push) Successful in 5m31s Details
Test / test_scrub_ec (push) Successful in 50s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m2s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m5s Details
Test / test_snapshot_chain_ec (push) Successful in 1m21s Details
Test / test_write_xor (push) Successful in 36s Details
New features:
- Intelligent recovery/rebalance speed auto-tuning to reduce its impact on clients (see README -> Features)
- Auto-restoration of dead VDUSE daemons in CSI plugin
- Add vitastor-disk update-sb command
- Update QEMU for Debian Bookworm to 8.1 and use it for CSI plugin

Bug fixes:
- Fix pools SOMETIMES staying inactive after stopping a node due to OSDs not reacting
  to PG state changes caused by incorrect full reload of state from etcd on reconnection
- Make monitors retry pool configuration changes quickier which fixes them being unable
  to apply changes when an ongoing rebalance is quickly making a lot of PGs clean
- Fix CSI plugin not accepting array of strings as etcd address in /etc/vitastor/vitastor.conf
- Allow multiple interfaces with the same IP address, for "simple routed" full mesh network
- Do not ignore loopback addresses for OSD network (to make ECMP setups with frr possible)
- Fix a rare client crash during OSD reconnections
- Only treat data partitions as existing OSDs in vitastor-disk prepare
- Remove etcd parameter from default command examples
- Fix reported free space sometimes changing non-immediately after deletion of data from OSDs
- Fix a possible OSD crash on print_slow when bs_op is NULL
- Use the same etcd_ws_keepalive_interval in mon as in OSD
- Fix mon not using values from config when /config/global is not present
- Remove pve-storage-portal-dns-list format for vitastor_etcd_address
- Parse log_level in cluster_client
- Fix vitastor-nbd image existence check not working because of non-zeroed inode_watch fields
- Do not warn on EPIPE in client unless log_level is raised explicitly
- Fix incorrect error in CSI when searching for the device in /sys
- Remove 2 last prints to stdout in etcd_state_client
- Fix a possible OSD crash when checking corrupted journal entries
2024-01-12 01:28:33 +03:00
Vitaliy Filippov 95631773b6 Remove pve-storage-portal-dns-list format for vitastor_etcd_address 2023-12-20 02:22:06 +03:00
Vitaliy Filippov a1c7cc3d8d Release 1.3.1
Test / test_interrupted_rebalance_ec (push) Successful in 1m46s Details
Test / test_move_reappear (push) Successful in 21s Details
Test / test_rm (push) Successful in 15s Details
Test / test_snapshot_ec (push) Successful in 35s Details
Test / test_snapshot_down (push) Successful in 30s Details
Test / test_snapshot_down_ec (push) Successful in 31s Details
Test / test_splitbrain (push) Successful in 23s Details
Test / test_snapshot_chain (push) Successful in 2m22s Details
Test / test_snapshot_chain_ec (push) Successful in 2m59s Details
Test / test_rebalance_verify_imm (push) Successful in 3m3s Details
Test / test_rebalance_verify (push) Successful in 3m47s Details
Test / test_write (push) Successful in 44s Details
Test / test_write_no_same (push) Successful in 13s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 3m36s Details
Test / test_rebalance_verify_ec (push) Successful in 4m20s Details
Test / test_heal_pg_size_2 (push) Successful in 3m43s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m45s Details
Test / test_heal_ec (push) Successful in 6m22s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m51s Details
Test / test_heal_csum_32k (push) Successful in 6m2s Details
Test / test_scrub (push) Successful in 1m14s Details
Test / test_scrub_zero_osd_2 (push) Successful in 1m19s Details
Test / test_heal_csum_4k_dmj (push) Successful in 5m54s Details
Test / test_scrub_xor (push) Successful in 1m1s Details
Test / test_heal_csum_4k_dj (push) Successful in 5m59s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m54s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m2s Details
Test / test_scrub_ec (push) Successful in 34s Details
Test / test_heal_csum_4k (push) Successful in 6m0s Details
Test / test_write_xor (push) Successful in 32s Details
Hotfix to 1.3.0 - new "journal space reservation" had a bug which
caused OSDs to crash with EC and without immediate_commit.
2023-12-04 18:35:09 +03:00
Vitaliy Filippov 7972502eaf Release 1.3.0
Test / test_rm (push) Successful in 12s Details
Test / test_snapshot_chain (push) Successful in 1m1s Details
Test / test_snapshot_down (push) Successful in 19s Details
Test / test_splitbrain (push) Successful in 12s Details
Test / test_snapshot_down_ec (push) Failing after 3m10s Details
Test / test_rebalance_verify (push) Successful in 2m45s Details
Test / test_rebalance_verify_imm (push) Successful in 2m17s Details
Test / test_write (push) Successful in 1m11s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 2m41s Details
Test / test_write_no_same (push) Successful in 12s Details
Test / test_write_xor (push) Failing after 3m6s Details
Test / test_rebalance_verify_ec (push) Failing after 5m27s Details
Test / test_heal_pg_size_2 (push) Failing after 3m7s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m36s Details
Test / test_heal_csum_32k_dj (push) Failing after 4m53s Details
Test / test_heal_csum_32k (push) Failing after 5m27s Details
Test / test_heal_ec (push) Failing after 10m15s Details
Test / test_heal_csum_4k_dmj (push) Successful in 5m14s Details
Test / test_scrub (push) Successful in 1m11s Details
Test / test_heal_csum_4k_dj (push) Successful in 5m15s Details
Test / test_scrub_zero_osd_2 (push) Successful in 56s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m4s Details
Test / test_heal_csum_4k (push) Failing after 5m31s Details
Test / test_scrub_xor (push) Failing after 3m17s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Failing after 3m6s Details
Test / test_change_pg_count_ec (push) Failing after 3m5s Details
Test / test_snapshot_ec (push) Failing after 3m5s Details
Test / test_scrub_ec (push) Failing after 3m5s Details
Test / test_snapshot_chain_ec (push) Failing after 3m5s Details
Test / test_interrupted_rebalance_ec (push) Failing after 10m5s Details
New features:
- RDMA without ODP - much faster and all cards are now supported, not just Mellanox
- VDUSE in CSI - faster, more stable and can even recover after CSI pod restart!
- Reserve journal space for stabilize requests dynamically to prevent stalls under load with EC
- Raise default NBD timeout from 30 to 300 seconds and allow to take it from /etc/vitastor/vitastor.conf
- Remove explicit etcdUrl/etcdPrefix K8S storage class parameter support to prevent
  etcd migration issues for volumes created with these parameters
- Support QEMU 8.1 and pve-qemu 8.1

Bug fixes:
- Fix RDMA connection (and thus memory) leak
- Fix rare crashes under load due to incorrect io_uring queue size tracking
- Fix monitor statistics aggregation in case of empty /osd/stats keys
- Fix crash on unknown long argument to vitastor-disk
- Allow trailing comma in JSONs again
- Fix crash on attempts to dump a long listing of objects "to stabilize" or "to rollback" in a slow op
2023-12-04 02:36:43 +03:00
Vitaliy Filippov d65512bd80 Add patches for QEMU 8.1 2023-12-04 01:56:17 +03:00
Vitaliy Filippov 5524dbdab7 Release 1.2.0
Test / test_snapshot_ec (push) Successful in 25s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m18s Details
Test / test_rm (push) Successful in 15s Details
Test / test_snapshot_down (push) Successful in 22s Details
Test / test_snapshot_down_ec (push) Successful in 23s Details
Test / test_splitbrain (push) Successful in 18s Details
Test / test_snapshot_chain (push) Successful in 2m13s Details
Test / test_snapshot_chain_ec (push) Successful in 2m57s Details
Test / test_rebalance_verify_imm (push) Successful in 2m51s Details
Test / test_write (push) Successful in 38s Details
Test / test_rebalance_verify (push) Successful in 3m39s Details
Test / test_write_no_same (push) Successful in 12s Details
Test / test_rebalance_verify_ec (push) Successful in 3m56s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 3m6s Details
Test / test_heal_pg_size_2 (push) Successful in 3m43s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m35s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m44s Details
Test / test_heal_csum_32k (push) Successful in 5m50s Details
Test / test_heal_csum_4k_dmj (push) Successful in 5m44s Details
Test / test_scrub_zero_osd_2 (push) Successful in 57s Details
Test / test_scrub (push) Successful in 1m0s Details
Test / test_scrub_xor (push) Successful in 1m5s Details
Test / test_heal_csum_4k_dj (push) Successful in 5m9s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m38s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 54s Details
Test / test_scrub_ec (push) Successful in 52s Details
Test / test_heal_csum_4k (push) Successful in 5m8s Details
Test / test_heal_ec (push) Successful in 3m17s Details
Test / test_write_xor (push) Successful in 35s Details
Test / test_move_reappear (push) Failing after 48s Details
New features:

- Implement CSI volume expansion
- Implement CSI volume snapshots
- CSI driver now requires Kubernetes >= 1.20

Bug fixes:

- Important bug fix for EC: fix EC n+k, k>=2 read recovery in ISA-L version returning
  incorrect data when reading at least the second chunk out of multiple missing chunks
  without reading the first one. All users of EC n+k, k>=2 should upgrade as soon as
  possible, and upgrade should be conducted with downtime: first stop all clients
  (VMs/containers), then all OSDs, then upgrade and restart everything.
- Fix unstable statistics aggregation in monitor (affecting vitastor-cli status and df)
- Make udev not wait for OSDs to start during boot
- Do not report negative numbers of offline PGs in vitastor-cli status when changing PG count
- Report both old and new PG counts in vitastor-cli df when changing it
- Fix OSDs sometimes not starting with "The code only supports journal versions 1 and 2,
  but it is 2 on disk" error after upgrading from pre-1.0 versions and letting OSDs run
  for some time
- Fix monitors sometimes returning old PG count back after OSD configuration changes
- Make monitor PG changes more stable and timeout errors less probable
2023-11-05 01:48:57 +03:00
Vitaliy Filippov 8222e3c77d Release 1.1.0
Test / test_interrupted_rebalance_ec (push) Successful in 1m49s Details
Test / test_snapshot_ec (push) Successful in 38s Details
Test / test_rm (push) Successful in 15s Details
Test / test_snapshot_down (push) Successful in 23s Details
Test / test_move_reappear (push) Failing after 49s Details
Test / test_snapshot_down_ec (push) Successful in 23s Details
Test / test_splitbrain (push) Successful in 22s Details
Test / test_snapshot_chain (push) Successful in 2m25s Details
Test / test_snapshot_chain_ec (push) Successful in 3m5s Details
Test / test_rebalance_verify_imm (push) Successful in 2m51s Details
Test / test_write (push) Successful in 34s Details
Test / test_rebalance_verify (push) Successful in 3m38s Details
Test / test_write_no_same (push) Successful in 14s Details
Test / test_write_xor (push) Successful in 50s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 4m3s Details
Test / test_rebalance_verify_ec (push) Successful in 5m0s Details
Test / test_heal_pg_size_2 (push) Successful in 4m2s Details
Test / test_heal_ec (push) Successful in 4m49s Details
Test / test_heal_csum_32k_dmj (push) Successful in 5m27s Details
Test / test_heal_csum_32k_dj (push) Successful in 5m44s Details
Test / test_heal_csum_32k (push) Successful in 6m57s Details
Test / test_heal_csum_4k_dmj (push) Successful in 6m50s Details
Test / test_scrub (push) Successful in 1m12s Details
Test / test_scrub_xor (push) Successful in 48s Details
Test / test_scrub_zero_osd_2 (push) Successful in 54s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m14s Details
Test / test_heal_csum_4k_dj (push) Successful in 6m32s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m38s Details
Test / test_heal_csum_4k (push) Successful in 6m20s Details
Test / test_scrub_ec (push) Successful in 27s Details
New features:

- Implement [client writeback cache](docs/config/client.en.md#client_enable_writeback)
- Add the third I/O mode: [O_DIRECT|O_SYNC](docs/config/osd.en.md#data_io) (good for Optane)
- Reduce load on etcd by splitting OSD lease and statistics reporting intervals:
  [etcd_stats_interval](docs/config/osd.en.md#etcd_stats_interval) (default 30 sec)
- Make MON automatically filter OSDs by layout (block_size/immediate_commit/bitmap_granularity)
  to prevent "refusing to start PGs of this pool" errors on misconfiguration
- Support running fio benchmarks on systems without io_uring
- Make QEMU driver compatible with QEMU 8.1
- Document usage of [vhost-user-blk](docs/usage/qemu.en.md#vhost-user-blk)

Bug fixes:

- Fix resizing disks in QEMU driver (for example, in Proxmox)
- Fix "unexpected result" in Proxmox driver by making CLI flush output on exit
- Remove unneeded block_size mismatch warnings on pools without matching PGs
- Fix possible segfault in vitastor-cli ls -l (usually with deleted pools)
- Fix QEMU driver compatibility with systems without io_uring
- Fix monitor eating 100% CPU when etcd is down (caused by infinite retries)
- Fix potential incorrect write processing with snapshots (not caught in tests
  but could probably lead to client hangs)
- Fix buffer insertion in cluster_client (not caught in tests but could
  probably lead to incorrect writes in rare cases)
- Fix rare OSD crash during sync operation processing
- Fix a reenterability issue in cluster_client not reproducible in QEMU/fio,
  but reproducible with the currently developed K/V database implementation
- Fix deletion of the first modified object - OSDs could crash if you modified
  the same object a lot of times, then deleted it, and then modified it again
- Fix the fio_sec_osd test tool
2023-10-28 00:33:06 +03:00
Vitaliy Filippov 6acf562e01 Release 1.0.0
New features:

- Data and metadata checksums!
  - Metadata checksums are always used with new disk format
  - Data checksums can be turned on with --data_csum_type crc32c for new OSDs
  - Checksum block size can be configured
  - inmemory_metadata now also affects keeping checksums in memory
- Linux page cache I/O caching support which can be enabled separately for
  data, metadata (including checksums) and journal (O_SYNC instead of O_DIRECT)
- Details [here](https://git.yourcmc.ru/vitalif/vitastor/src/branch/master/docs/config/layout-osd.en.md#data_csum_type)
- Backwards compatibility is preserved, you can use new OSDs with old disks

Release also includes bug fixes from [0.9.6](https://git.yourcmc.ru/vitalif/vitastor/releases/tag/v0.9.6).

0.9.6 is moved to "-oldstable" repositories and will be available for some additional time.
2023-07-29 18:57:19 +03:00
Vitaliy Filippov e651c93a90 Release 0.9.6
Test / test_interrupted_rebalance (push) Successful in 1m59s Details
Test / test_interrupted_rebalance_imm (push) Successful in 3m41s Details
Test / test_interrupted_rebalance_ec (push) Successful in 1m53s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m29s Details
Test / test_failure_domain (push) Successful in 50s Details
Test / test_snapshot (push) Successful in 45s Details
Test / test_snapshot_ec (push) Successful in 23s Details
Test / test_minsize_1 (push) Successful in 15s Details
Test / test_move_reappear (push) Successful in 18s Details
Test / test_rm (push) Successful in 15s Details
Test / test_snapshot_chain (push) Successful in 2m23s Details
Test / test_snapshot_chain_ec (push) Successful in 3m1s Details
Test / test_snapshot_down (push) Successful in 31s Details
Test / test_snapshot_down_ec (push) Successful in 32s Details
Test / test_splitbrain (push) Successful in 19s Details
Test / test_rebalance_verify (push) Successful in 3m34s Details
Test / test_rebalance_verify_imm (push) Successful in 3m31s Details
Test / test_rebalance_verify_ec (push) Successful in 5m14s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 5m14s Details
Test / test_write (push) Successful in 44s Details
Test / test_write_xor (push) Successful in 54s Details
Test / test_write_no_same (push) Successful in 15s Details
Test / test_heal_pg_size_2 (push) Successful in 4m38s Details
Test / test_heal_ec (push) Successful in 3m56s Details
Test / test_scrub (push) Successful in 36s Details
Test / test_scrub_zero_osd_2 (push) Successful in 35s Details
Test / test_scrub_xor (push) Successful in 31s Details
Test / test_scrub_pg_size_3 (push) Successful in 46s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 25s Details
Test / test_scrub_ec (push) Successful in 24s Details
- Fix vitastor-disk partition zeroing (sometimes it was writing garbage instead of zeroes)
- Fix incorrect EC space statistics in `vitastor-cli status`
- Several bug fixes for NFS:
  - Add . and .. in NFS directory listings
  - Return FILE_SYNC from NFS writes if immediate_commit is enabled
  - Return the same "verifier" in NFS COMMIT as in NFS WRITE
  - Make parallel NFS extending writes work correctly, without conflicts
  - Handle parallel NFS extending writes without imposing extra load on etcd
- Support UTF-8 in vitastor-cli table output
- Also allow "0" and "no" as false for inmemory_metadata and inmemory_journal
- Use HDD defaults for HDD-only in automatic `vitastor-disk prepare` mode
2023-07-29 10:54:00 +03:00
Vitaliy Filippov 10a5fd6abb Release 0.9.5
Test / test_etcd_fail (push) Successful in 50s Details
Test / test_interrupted_rebalance (push) Successful in 2m27s Details
Test / test_interrupted_rebalance_imm (push) Successful in 1m39s Details
Test / test_interrupted_rebalance_ec (push) Successful in 1m49s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m29s Details
Test / test_failure_domain (push) Successful in 19s Details
Test / test_snapshot (push) Successful in 31s Details
Test / test_snapshot_ec (push) Successful in 26s Details
Test / test_minsize_1 (push) Successful in 15s Details
Test / test_rm (push) Successful in 18s Details
Test / test_snapshot_chain (push) Successful in 1m49s Details
Test / test_snapshot_chain_ec (push) Successful in 2m51s Details
Test / test_snapshot_down (push) Successful in 26s Details
Test / test_snapshot_down_ec (push) Successful in 24s Details
Test / test_splitbrain (push) Successful in 18s Details
Test / test_rebalance_verify (push) Successful in 3m8s Details
Test / test_rebalance_verify_imm (push) Successful in 3m13s Details
Test / test_rebalance_verify_ec (push) Successful in 3m36s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 5m59s Details
Test / test_write (push) Successful in 48s Details
Test / test_write_xor (push) Successful in 37s Details
Test / test_write_no_same (push) Successful in 14s Details
Test / test_heal_pg_size_2 (push) Successful in 3m43s Details
Test / test_heal_ec (push) Successful in 4m6s Details
Test / test_scrub (push) Successful in 35s Details
Test / test_scrub_zero_osd_2 (push) Successful in 34s Details
Test / test_scrub_xor (push) Successful in 42s Details
Test / test_scrub_pg_size_3 (push) Successful in 52s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 33s Details
Test / test_scrub_ec (push) Successful in 28s Details
A hotfix to 0.9.4 containing only one bugfix: 100% CPU usage in the new QEMU
driver caused by the lack of eventfd reset on io_uring event handling :)
2023-07-21 00:04:41 +03:00
Vitaliy Filippov 1c10430ae1 Release 0.9.4
Test / test_interrupted_rebalance (push) Successful in 1m54s Details
Test / test_interrupted_rebalance_imm (push) Successful in 2m4s Details
Test / test_interrupted_rebalance_ec (push) Successful in 1m40s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m25s Details
Test / test_failure_domain (push) Successful in 15s Details
Test / test_snapshot (push) Successful in 25s Details
Test / test_snapshot_ec (push) Successful in 20s Details
Test / test_minsize_1 (push) Successful in 13s Details
Test / test_move_reappear (push) Successful in 16s Details
Test / test_rm (push) Successful in 13s Details
Test / test_snapshot_chain (push) Successful in 1m56s Details
Test / test_snapshot_chain_ec (push) Successful in 2m33s Details
Test / test_snapshot_down (push) Successful in 23s Details
Test / test_snapshot_down_ec (push) Successful in 22s Details
Test / test_splitbrain (push) Successful in 16s Details
Test / test_rebalance_verify (push) Successful in 3m3s Details
Test / test_rebalance_verify_imm (push) Successful in 3m2s Details
Test / test_rebalance_verify_ec (push) Successful in 3m13s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 8m35s Details
Test / test_write (push) Successful in 33s Details
Test / test_write_xor (push) Successful in 40s Details
Test / test_write_no_same (push) Successful in 15s Details
Test / test_heal_pg_size_2 (push) Successful in 4m25s Details
Test / test_heal_ec (push) Successful in 3m9s Details
Test / test_scrub (push) Successful in 1m0s Details
Test / test_scrub_zero_osd_2 (push) Successful in 46s Details
Test / test_scrub_xor (push) Successful in 1m1s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m55s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m25s Details
Test / test_scrub_ec (push) Successful in 52s Details
- Improve QEMU driver performance by integrating io_uring in it (up to 1.5x total iops improvement)
- Fix QEMU driver deadlocks which started to reproduce in qemu-img after iothread fixes
- Fix `vitastor-cli status` reporting more etcds than actually exists (fix etcd address duplication in config on reload)
- Fix `vitastor-cli ls` crashing on inodes in non-existing pools
- Delete old garbage /pool/stats/ keys for non-existing (deleted) pools
- Reduce memory usage of etcds initialized by make-etcd script
- Fix OSDs almost always crashing on etcd restart due to "revisions were compacted" (support reloading state from etcd)
- Fix a crash and a stall possible mostly in HDD setups with small journal and big (512k, 900k) random writes
- Add notes about HDDs to documentation. You are officially allowed to use HDD-only Vitastor with HGST/Toshiba/EXOS :)
2023-07-19 02:50:30 +03:00
Vitaliy Filippov c18e92273e Copy qemu 5.1 -> 5.2 patch for convenience 2023-07-18 23:37:53 +03:00
Vitaliy Filippov b963f2fd93 Add QEMU 2.12 patch (basically the same as 3.1) 2023-07-18 23:37:06 +03:00
Vitaliy Filippov a612cdca47 Release 0.9.3
- Add patch for libvirt 9.0
- Add support for Proxmox VE 8.0
- Fix compatibility of the QEMU driver with iothread (QEMU rebuilds are coming)
- Fix vitastor-cli rm-data/rm/merge hanging when some OSDs are down.
  Allow deletions in unclean cluster at the cost of some data possibly
  "reappearing" when those OSDs start back. In that case you can just repeat
  the deletion request using rm-data.
- A bunch of bug fixes for snapshots:
  - Fix snapshot reads often not working at all with snapshot chain size > 2
  - Fix optimized snapshot data merge (children to parent)
  - Fix updating of image name index key during optimized merge
  - Fix auto-selection preventing the use of optimized merge with only 1 snapshot
  - Fix incorrect CAS retries during snapshot merge
  - Fix snapshot merge progress reporting
- Fix primary_read bitmap buffers use-after-free which could lead to
  incorrect allocation map reads
- Remove /usr/local/bin path from make-etcd
- Some documentation fixes
2023-07-01 00:25:58 +03:00
Vitaliy Filippov 10e2e6a7c8 Add a patch for pve-qemu 8.0 2023-06-24 01:33:52 +03:00
Vitaliy Filippov 7ae5b0e368 Add patch for libvirt 9.0 2023-06-19 01:07:08 +03:00
Vitaliy Filippov 926be372fd Release 0.9.2
Test / test_change_pg_count (push) Successful in 35s Details
Test / test_change_pg_count_ec (push) Successful in 34s Details
Test / test_change_pg_size (push) Successful in 9s Details
Test / test_create_nomaxid (push) Successful in 7s Details
Test / test_etcd_fail (push) Successful in 1m1s Details
Test / test_failure_domain (push) Successful in 9s Details
Test / test_interrupted_rebalance (push) Successful in 1m26s Details
Test / test_interrupted_rebalance_imm (push) Successful in 2m15s Details
Test / test_interrupted_rebalance_ec (push) Successful in 1m31s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m32s Details
Test / test_minsize_1 (push) Successful in 13s Details
Test / test_rebalance_verify (push) Successful in 2m49s Details
Test / test_rebalance_verify_imm (push) Successful in 2m45s Details
Test / test_rebalance_verify_ec (push) Successful in 3m9s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 3m3s Details
Test / test_rm (push) Successful in 12s Details
Test / test_snapshot (push) Successful in 24s Details
Test / test_snapshot_ec (push) Successful in 35s Details
Test / test_splitbrain (push) Successful in 21s Details
Test / test_write (push) Successful in 1m10s Details
Test / test_write_xor (push) Successful in 2m23s Details
Test / test_write_no_same (push) Successful in 19s Details
Test / test_heal_pg_size_2 (push) Successful in 3m58s Details
Test / test_heal_ec (push) Successful in 4m8s Details
Test / test_scrub (push) Successful in 1m3s Details
Test / test_scrub_zero_osd_2 (push) Successful in 36s Details
Test / test_scrub_xor (push) Successful in 34s Details
Test / test_scrub_pg_size_3 (push) Successful in 51s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 32s Details
Test / test_scrub_ec (push) Successful in 43s Details
- Measure and report scrub I/O statistics in vitastor-cli status
- Make aggregated statistics in vitastor-cli status much smoother
  (first derive, then sum instead of first summing and then deriving)
- Fix an old rare bug leading to journal corruption
  (try to use scrub if you think you're affected...)
- Do not start EC PGs without at least <data chunks> OSDs in each old set
  (prevents spurious read errors with EC during reconnections/restarts)
- Fix failed assert(!scrub_list_op) on OSD restart with pending scrubs
- Fix future planned scrubs not starting because of incorrect time comparison
- Build packages for Debian 12 (Bookworm)
2023-06-18 19:44:33 +03:00
Vitaliy Filippov bdd48e4cf1 Release 0.9.1
Test / test_change_pg_count (push) Successful in 35s Details
Test / test_change_pg_count_ec (push) Successful in 32s Details
Test / test_change_pg_size (push) Successful in 8s Details
Test / test_create_nomaxid (push) Successful in 6s Details
Test / test_etcd_fail (push) Successful in 49s Details
Test / test_failure_domain (push) Successful in 9s Details
Test / test_interrupted_rebalance (push) Successful in 1m4s Details
Test / test_interrupted_rebalance_imm (push) Successful in 57s Details
Test / test_interrupted_rebalance_ec (push) Successful in 1m2s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 53s Details
Test / test_minsize_1 (push) Successful in 14s Details
Test / test_move_reappear (push) Successful in 16s Details
Test / test_rebalance_verify (push) Successful in 1m45s Details
Test / test_rebalance_verify_imm (push) Successful in 1m41s Details
Test / test_rebalance_verify_ec (push) Successful in 1m53s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 1m48s Details
Test / test_rm (push) Successful in 10s Details
Test / test_snapshot (push) Successful in 14s Details
Test / test_snapshot_ec (push) Successful in 16s Details
Test / test_splitbrain (push) Successful in 14s Details
Test / test_write (push) Successful in 49s Details
Test / test_write_xor (push) Successful in 1m1s Details
Test / test_write_no_same (push) Successful in 11s Details
Test / test_heal_pg_size_2 (push) Successful in 3m12s Details
Test / test_scrub (push) Successful in 28s Details
Test / test_scrub_zero_osd_2 (push) Successful in 22s Details
Test / test_scrub_xor (push) Successful in 34s Details
Test / test_scrub_pg_size_3 (push) Successful in 25s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 26s Details
Test / test_scrub_ec (push) Successful in 24s Details
- Fix "Client XX command out of sync" messages sometimes happening on OSD reconnections
- Fix a bug where EC reads parallel with writes to the same object failed with -ERANGE error
- Slightly reduce the amount of metadata writes during journal flushing
- Correctly unmap NBD volumes when Proxmox forces map_volume use (with SWTPM and maybe some other cases)
2023-06-10 11:42:49 +03:00
Vitaliy Filippov af8c3411cd Correctly unmap NBD when Proxmox forces map_volume use (with SWTPM and maybe something else) 2023-06-08 01:31:49 +03:00
Vitaliy Filippov 3b4cf29e65 Release 0.9.0
New features:
- Scrubbing! Check documentation: [auto_scrub](src/branch/master/docs/config/osd.en.md#auto_scrub)
- Document online-updatable configuration parameters

Bug fixes:
- Fix NaN during PG optimisation if there are nonexisting OSDs in node_placement
- Fix monitor crash on pool deletion
- Clear journal_device and meta_device before initialising the next OSD in automatic mode
- Sync unsynced deletes before overwriting them with a lower version
  (reproducted mostly/only after scrubbing)
2023-05-21 15:07:14 +03:00
Vitaliy Filippov 5a9e1ede52 Release 0.8.9
Test / buildenv (push) Successful in 9s Details
Test / build (push) Successful in 2m31s Details
Test / test_cas (push) Successful in 12s Details
Test / make_test (push) Successful in 33s Details
Test / test_change_pg_size (push) Successful in 19s Details
Test / test_change_pg_count (push) Successful in 55s Details
Test / test_create_nomaxid (push) Successful in 21s Details
Test / test_change_pg_count_ec (push) Successful in 58s Details
Test / test_failure_domain (push) Successful in 13s Details
Test / test_etcd_fail (push) Successful in 1m4s Details
Test / test_interrupted_rebalance (push) Successful in 1m13s Details
Test / test_interrupted_rebalance_imm (push) Successful in 1m7s Details
Test / test_add_osd (push) Successful in 2m59s Details
Test / test_move_reappear (push) Successful in 24s Details
Test / test_interrupted_rebalance_ec (push) Successful in 1m22s Details
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m1s Details
Test / test_rebalance_verify (push) Successful in 2m12s Details
Test / test_minsize_1 (push) Successful in 15s Details
Test / test_rebalance_verify_imm (push) Successful in 2m4s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 2m9s Details
Test / test_rm (push) Successful in 17s Details
Test / test_snapshot (push) Successful in 23s Details
Test / test_rebalance_verify_ec (push) Successful in 2m31s Details
Test / test_splitbrain (push) Successful in 23s Details
Test / test_snapshot_ec (push) Successful in 30s Details
Test / test_write_no_same (push) Successful in 16s Details
Test / test_write (push) Successful in 53s Details
Test / test_write_xor (push) Successful in 1m19s Details
Test / test_heal_pg_size_2 (push) Successful in 4m30s Details
Test / test_heal_ec (push) Successful in 4m32s Details
- The tests are now stable and run in a CI system based on Gitea CI
- The release includes final bug fixes for EC:
  - Implement missing EC recovery of allocation bitmap when built with ISA-L
  - Fix broken snapshot export with EC (allocation bitmap reads were giving incorrect results previously)
- Also fixed bugs manifesting under heavy load:
  - Fix monitor possibly applying incorrect PG history on retries
  - Fix monitor incorrectly changing PG count when last_clean_pgs contains less PGs than the new number
  - Allow writes to wait for free space again, but now correctly (previously dropped in 0.8.2)
  - Fix a rare segfault in client (handle client stop during incoming stream handling in 1 more place)
  - Make monitor correctly handle etcd connection errors - it could die instead of connecting to another etcd
  - Fix OSD rarely being unable to report PG states after a PG was taken over by another OSD
- Fixed return code for incomplete EC objects (now EIO) and made cluster client retry this error
- Made other small changes for tests: timeouts, nice/ionice for etcd, waiting conditions, NBD device checks and so on
2023-05-14 01:25:09 +03:00
Vitaliy Filippov ab615849d6 Release 0.8.8
- Fix vitastor-cli rm/rm-data broken in 0.8.6 (missing messenger initialization)
- Prepare OSD read handler for upcoming version with scrub - allow "secondary reads" to return errors
- Fix OSDs re-peering PGs infinitely with a big number of PGs (reproduced in test_add_osd)
- Fix another variant of flusher sync-waiting stall (reproduced in test_write)
- Fix other tests in tests/ (will add them to Gitea CI soon)
- Add patches for QEMU 6.2-8.0
- Fix QEMU driver compatibility with QEMU 8.0
- Build packages for RHEL 9 clones (based on AlmaLinux 9)
2023-04-28 11:22:00 +03:00
Vitaliy Filippov 7d6bf84a3e Add scripts/meson-buildoptions.sh to QEMU patches 2023-04-28 01:43:22 +03:00
Vitaliy Filippov 0d9e10cf96 Add patches for QEMU 6.2-8.0 2023-04-25 11:20:21 +03:00
Vitaliy Filippov 7e958afeda Release 0.8.7
This release includes a bunch of important bugfixes for erasure-coded setups
with disabled immediate_commit. After these fixes, "test_heal" OSD killing test
now passes fine with EC:

- Fix cluster write stalls with "Error while doing flush on OSD xx: -16 (Device or resource busy)"
  in OSD logs possible in EC setups with disabled immediate_commit by selectively
  syncing nonsynced objects on STABILIZE/ROLLBACK (https://github.com/vitalif/vitastor/issues/51)
- Fix other EC + disabled immediate_commit problems:
  - Fix "opcode=5 retval=-2" errors happening on SYNC retries
  - Fix non-working "pagination" during PG dirty object flushing
  - Fix write operations not continued correctly after dirty object flushing
- Fix incorrect parity read-modify-write calculation when writing into a lost chunk
- Fix OSDs losing left_on_dead PG state of non-clean PGs and thus not removing junk data in the cluster
- Fix a small memory leak caused by bad indexing of EC recovery matrices
- Fix a rare use-after-free in cluster_client caused by a reenterability issue
- Fix vitastor-cli create command syntax in the CSI driver
- Allow to start OSDs without local store for tests
- Fix memory allocation error in disk_tool_meta for non-standard metadata block sizes
- Fix delete operations received before loading pool metadata crashing OSDs with "null pointer exception"
- Improve "theoretical performance" Russian documentation

New features:

- Implement online configuration update for some parameters. Documentation is coming soon :)
2023-04-11 02:11:57 +03:00
Vitaliy Filippov 8810eae8fb Release 0.8.6
Important fixes:

- Fix possibly incorrect EC parity chunk updates with EC n+k, k > 1 and when
  the first parity chunk is missing

Minor fixes and improvements:

- Fix incorrect EC free space statistics in vitastor-cli df output
- Speedup vitastor-cli startup in clusters with RDMA
- Remove unused PG "peered" state (previously used to update PG epoch)
- Use sfdisk with just --json in vitastor-disk (--dump --json isn't needed)
- Allow trailing comma in sfdisk output (fixes sfdisk 2.36 compatibility)
- Slightly improve RDMA send/receive code
- Reduce RDMA memory consumption by default (rdma_max_recv/send = 16/8)
- Use vitastor-cli instead of direct etcd interaction in the CSI driver
2023-02-28 11:18:48 +03:00
Vitaliy Filippov d125fb1f30 Release 0.8.5
- Fix a possible "double free" bug in the client library happening on OSD restart
- Fix a possible write hang on PG history update when only epoch is changed
- Fix incorrect systemd target "local.target" in mon/make-etcd
- Allow "content" option in PVE storage plugin to allow to enable containers
- Build client library without tcmalloc which fixes "attempt to free invalid pointer"
  errors when, for example, trying to run QEMU with both Vitastor and Ceph RBD disks
2023-01-25 01:43:49 +03:00
Vitaliy Filippov c96bcae74b Allow "content" option in PVE storage plugin to allow to enable containers 2023-01-16 18:14:45 +03:00
Vitaliy Filippov 81fc8bb94c Release 0.8.4
New features:
- Implement QCOW2 image/snapshot export via qemu-img (bdrv_co_block_status in the driver)
- Remove OSDs from PG history during `vitastor-cli rm-osd` to prevent `left_on_dead` PG states after deletion
- Add a new recovery_pg_switch setting to mix all PGs during recovery, to almost
  fully reduce the probability of ENOSPC during rebalance
- Introduce partial ENOSPC ("OSD is full") handling - now ENOSPC doesn't turn
  into cascades of crashes
- Add migration support to Proxmox VE Vitastor driver
- Track last_clean_pgs on a per-pool basis thus reducing data movement in a cluster
  with pools remaining unclean/degraded for a long time

Bug fixes:
- Fix a bug where monitor could generate degraded PGs if one of the hosts had no OSDs
- Fix a bug where monitor could skip PG redistribution with a lot of OSDs in cluster
- Report PG history synchronously on the first write, which improves PG consistency
  and availability at the same time, because history now gets reported correctly
  and doesn't get reported without the need for it
- Fix possible write and recovery stalls which could happen in a cluster with both EC and replicated pools
- Make OSD and monitors sanitize & deduplicate PG history items in etcd
- Fix non-working OSD peer config safety check
- Fix a rare journal flush stall where flushing wasn't activated with full journal, but with empty flush queue
- Fix builds without ISA-L (jerasure-only) crashing with EC N+K, K>=2 due to the lack of 16-byte buffer alignment
- Fix a possible crash for EC N+K, K>=2 when calculating a parity chunk with previous parity chunk missing
- Fix a bug where vitastor-disk purge with suppressed warnings didn't work
2023-01-13 23:59:54 +03:00
Vitaliy Filippov 3e280f2f08 Mark vitastor as shared storage in PVE driver 2023-01-13 01:36:30 +03:00
Vitaliy Filippov fa90f287da Release 0.8.3
- Implement a new "vitastor-disk purge" command to remove OSDs with safety checks
- Implement a new "vitastor-cli rm-osd" command to only remove OSD metadata from etcd
- Fix a bug where the monitor could ignore OSD removal and other /osd/stats key changes
- Fix a bug where garbage could be returned when reading objects being written at the same time
- Fix a rare write stall where journal space could be not reclaimed where there
  were no new operations in the flush queue
- Fix a rare peering stall caused by a previous long listing operations queues limiting attempt
- Fix total object count statistic in OSD on object creation
- Add missing offset&len into vitastor-disk dump-journal for big_writes, fix JSON format
- Make vitastor-cli print help on missing command
- Make vitastor-cli translate all '-' to '_' in CLI options
2022-12-27 02:40:55 +03:00
Vitaliy Filippov 5ef8bed75f Release 0.8.2
- Fix QEMU driver compatibility with QEMU 7.0 and < 2.9
- Add patches for pve-qemu-kvm 7.1 (PVE 7.3) and pve-qemu-kvm 6.2 (PVE 7.2)
- Fix Proxmox driver location in the pve-storage-vitastor package
- Disable HDD autodetection in non-hybrid mode
- Explicitly warn about a buggy kernels on -EAGAIN in io_uring
- Final fix for the lack of zeroing out of old metadata entries
  (do not crash with "big_write journal_entry was allocated over another object"
  in some cases after an unclean OSD shutdown)
- Wait for data writes before fsyncing data if data fsync is enabled
- Never try to wait for free space inside blockstore thus stalling OSDs
- Fix a rare crash in osd_peering due to callback ordering
- Fix a rare duplication of ping & op message IDs
- Fix a rare use-after-free during pings
- Add --force to vitastor-disk read-sb
- Make vitastor-disk dump metadata object IDs in hex, add forgotten commas
- Fix vitastor-disk SCSI disk cache check
2022-12-17 17:54:13 +03:00
Vitaliy Filippov 3f35744052 Fix compatibility with QEMU aio_set_fd_handler signatures in 7.0 and < 2.9 2022-12-15 19:17:17 +03:00
Vitaliy Filippov 1364009931 Add patches for pve-qemu-kvm 7.1 (PVE 7.3) and pve-qemu-kvm 6.2 (PVE 7.2) 2022-12-14 19:01:36 +03:00
Vitaliy Filippov d7e30b8353 Fix pve-storage-vitastor filename 2022-12-14 16:41:35 +03:00
Vitaliy Filippov 8fdf30b21f Release 0.8.1
- Remove an additional data copy operation when flushing journal (should
  slightly increase write performance)
- Fix a bug where new writes in the inmemory_journal=false mode could overwrite
  the data currently read by a parallel read operation
- Fix degraded parity writes for EC N+K when K>1 where the bug could also lead
  to an "assertion failed" error
- Fix missing journal space check for "big" writes which could lead to
  "prefill_single_journal_entry(): assertion failed..." error in OSD
- Fix possible "assertion failed: next->prev_wait >= 0" in client in rare cases
- Fix missing "len" field in vitastor-disk write-journal big_writes
- Fix possible crash of a full OSD (ENOSPC)
- Fix CSI build scripts to include newest packages every time
- Fix CSI endpoint in the liveness probe manifest
2022-11-20 11:44:09 +03:00
Vitaliy Filippov 11ec9ad874 Release 0.8.0
- Implement automatic OSD activation via udev and simple on-disk superblock storage
- Add a new `vitastor-disk` tool and merge all disk-related functionality there.
  Now it can prepare new OSD disks, upgrade plain old systemd units to the new scheme,
  resize OSD data area, manage OSD services by disk paths, manage superblocks,
  automatically check and disable disk cache, dump and write back journal and metadata.
- Add a documentation section about `vitastor-disk` (read it if you want details!)
- Install systemd services during package installation instead of the older method
  of manually creating them via separate shell scripts
- Add a new `make-etcd` script that reuses /etc/vitastor/vitastor.conf to configure etcd
- Allow to configure block_size, bitmap_granularity and immediate_commit per-pool
- Fix "fatal error: tried to overwrite non-zero metadata entry" which was possible
  in some cases after unclean OSD shutdown (caused by old metadata entries not being zeroed)
2022-09-05 13:51:20 +03:00
huy 3b7c6dcac2 Fix volume creation from snapshots in Cinder driver 2022-06-06 15:46:13 +03:00
Vitaliy Filippov 101592bbff Release 0.7.1
- Add ISA-L erasure code implementation, now used automatically instead of jerasure when available
- Fix listings sending too many parallel requests to OSDs
- Fix rm-data crashing with --wait-list
- Remove empty inodes from statistics and `ls` output, after <inode_vanish_time> seconds after deletion
- Make monitor delete pool statistics when the pool is deleted and thus remove them from `df` output
- Log multiple etcd addresses in OSD logs correctly
- Fix true/false parsing in json configs like no_recovery/no_rebalance
- Show no_recovery, no_rebalance, readonly flags in status
2022-06-05 00:07:24 +03:00