16 Commits (a1f2f19489e427aca96a82c77c76e76804b9338a)

Author SHA1 Message Date
Vitaliy Filippov ec90fe6ec1 Release 0.5.13
Another followup to 0.5.11
2021-04-09 12:10:16 +03:00
Vitaliy Filippov 59fbcef734 Release 0.5.12
Fix qemu driver broken in 0.5.11 :)
2021-04-08 15:47:18 +03:00
Vitaliy Filippov 462650134e Release 0.5.11
Another bunch of fixes, including important ones. Now OSDs are stable in SSD+HDD
configurations and everything is mostly ready for the merge of master branch.

Features:

- Add min_flusher_count configuration (good for HDDs)
- Shuffle PGs for better data device utilisation
- Make OSDs benefit from the immediate_commit=small setting if it's applicable

Bug fixes:

- Rework client code to fix write ordering during operation replay
- Rework error handling code so OSDs don't crash in reaction to a crash of their peer OSDs
- Fix several block layer problems related to the journal, some of which
  were leading to double allocations of the same block during journal replay
- Fix monitors crashing during the removal of OSD keys from etcd
- Fix data fsyncs being incorrectly disabled when only disable_journal_fsync was set
- Always zero out unused part of request/reply headers
- Fix some theoretically possible read/write ordering issues
- Don't try to "recover" misplaced objects if it would make them degraded
- Fix heartbeats sometimes preventing OSDs from establishing connections
2021-04-08 01:18:46 +03:00
Vitaliy Filippov 7e6e1a5a82 Release 0.5.10
The version seems to be stable after this bunch of fixes :)

- Fix delete & write operation ordering during rebalance to not lose objects in the immediate_commit=off mode
- Fix a possible crash caused by very high iodepths
- Re-distribute PG primaries over OSDs that come up after a short downtime
- Allow specifying etcd URLs for OSDs with http://, do not die with a strange error if the -etcd option is missing for fio
- Fix a journal flushing deadlock which sometimes occurred in the immediate_commit=off mode
- Fix a bug where OSDs could hang if the data device filled up
- Fix an allocator bug where the last (n%64) data device blocks could not be allocated
- Fix monitor crash that occurred on removal of some etcd keys
- Fix a bug where PGs could remain incomplete due to incorrect PG history with just zeroes in osd_sets
2021-03-16 12:48:26 +03:00
Vitaliy Filippov 036555638e Release 0.5.9
- Fix two monitor bugs which led to objects being "logically lost" (physically
  present on some secondary OSDs while the primary doesn't know about them) after multiple
  interrupted rebalancings
- Implement "no_recovery" and "no_rebalance" flags
2021-03-11 00:39:10 +03:00
Vitaliy Filippov 19e47a0279 Release 0.5.8
- Add heartbeats (fixes failover in case of network issues or offline nodes)
- Fix a bug where a PG could incorrectly become listed as 'incomplete' if historical osd_sets
  included a set with the PG's primary OSD as the only alive one
- Use osd_out_time = 10 minutes by default instead of 30 minutes
- Make monitors stick to a single selected etcd URL on start and not try to select random ones
  on every request - this was leading to etcd interaction errors when some etcds were unavailable
2020-03-09 02:38:17 +03:00
Vitaliy Filippov 88a03f4e98 Release 0.5.7
- Fix multiple bugs leading to OSDs sometimes being unable to correctly activate PGs
  when a lot of PG peering events occurred in a small amount of time
- Fix a bug where OSDs could list incomplete object versions during peering. The bug
  manifested with "local rollback operation failed" messages in OSD logs
- Fix a bug where misplaced chunks for degraded and incomplete objects were not removed
  from extra OSDs during recovery
- Fix incorrect PG history configuration resulting in OSDs being unable to find some
  of the objects after a PG count change
- Simplify block layer write ordering logic
- Avoid extra data movement when a lot of OSDs are first stopped for a long time and then restarted
- Fix incorrect degraded & misplaced object statistics after a completed rebalance
- Fix incorrect usage of pg_minsize instead of the minimal possible object chunk count in EC pools
2021-03-08 23:37:02 +03:00
Vitaliy Filippov ab90ed747f Release 0.5.6
- Fix operation statistics
- Fix a rebalance hang introduced in 0.5.5
- Test PG count changes with actual data moving
- Fix a possible 'unexpected pg state: 0' error during PG count change
2021-03-01 16:26:04 +03:00
Vitaliy Filippov bb2d9a3afe Release 0.5.5
- Transition to CMake build system
- Fix Monitor being unable to change PG sizes
- Fix PG optimizer not using some OSDs in some cases
- Fix inability to change PG count online
- Improve journal flusher performance
- Add a slightly better systemd unit generator
- Use w=8 with jerasure (breaking change for EC pools)
2021-02-26 01:59:18 +03:00
Vitaliy Filippov e21b14b72c Fix rpm specs for building with CMake 2021-02-26 01:59:18 +03:00
Vitaliy Filippov b9e7d31aa1 Release v0.5.4
- Fix a rare hang, more or less reproducible with very slow drives
- Fix a hang with the no_same_sector_overwrites mode
2021-02-24 01:40:30 +03:00
Vitaliy Filippov ca0a11ec85 Release 0.5.3 2021-02-03 00:38:57 +03:00
Vitaliy Filippov 9d80bd2d98 Build with jerasure, split some build scripts 2020-12-05 19:02:23 +03:00
Vitaliy Filippov 5596ad8997 Use custom QEMU build for CentOS 7 2020-12-04 11:47:05 +03:00
Vitaliy Filippov 59c29b0cee Fix RPATH for CentOS builds, add additional repos into the CentOS installation instructions 2020-12-04 11:47:04 +03:00
Vitaliy Filippov b56f8820ec Container packaging for Debian 11 Bullseye, CentOS 7 and CentOS 8 2020-11-10 00:02:53 +03:00