Commit Graph

705 Commits (0b41ffc08d337dc48c5357bec14352f40bcf4a90)

Author SHA1 Message Date
Vitaliy Filippov 0b41ffc08d Release 0.6.0
Warning: upgrading from 0.5.x is currently not supported!
Please create an issue if you really need upgrade capability.

New features:
- Snapshots and Copy-on-Write clones
- Inode (image) names
- Inode I/O and space statistics
- Write throttling for smoothing random write workloads in SSD+HDD configurations
2021-04-11 00:49:18 +03:00
Vitaliy Filippov 64eeb79051 Prevent 0.6.x OSDs from talking to 0.5.x
The new protocol is almost compatible - it has bitmaps, but also it has
a "bitmap_length" field. It's not hard to make 0.5-0.6 OSDs and clients
compatible, but for now I just assume nobody needs it.

If I'm wrong and anybody requests to upgrade their production 0.5.x system
to 0.6.x I'll fix it.
2021-04-10 22:26:17 +03:00
Vitaliy Filippov 2a02f3c4c7 Add metadata superblock and check it on start
Refuse to start if the superblock is missing or bad version;
zero out the metadata area when initializing superblock.
2021-04-10 22:26:17 +03:00
Vitaliy Filippov f684d9101a Refuse to start with old journal version 2021-04-10 17:44:12 +03:00
Vitaliy Filippov c72fddd714 Notes about master/0.5.x 2021-04-10 17:44:12 +03:00
Vitaliy Filippov a1f2f19489 Do not increment inode statistics if the object already exists 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 82c1a7ec67 Fix statistics reporting, split inode number into pool & inode 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 2ab423d4ef Implement journaled write throttling for the SSD+HDD case 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 4694811eab Add microsecond accuracy to set_timer 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6b988de17d Remove timerfd_interval 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 37efdc2a83 Fix bitmap_set for replicated pools 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 591cad09c9 Fix bitmaps for objects larger than 128K 2021-04-10 17:44:12 +03:00
Vitaliy Filippov b907ad50aa Oops, forgot to add external bitmaps to blockstore in some places 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 7308d6a6c0 Note about etcd 3.4.15 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 5f5b6ef150 Enable chained reads in the client 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 38a3df4a0e Implement chained (optimized) read in the primary OSD code 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6950b8e3a0 Watch inode metadata revisions 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 0cea3576fb Add "read bitmaps" operation to secondary OSD protocol 2021-04-10 17:44:12 +03:00
Vitaliy Filippov f01eea07d3 Add simplified interface to read blockstore bitmaps synchronously 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 2c2f08aca2 Shorten some structure names 2021-04-10 17:44:12 +03:00
Vitaliy Filippov d6524670e1 Introduce data distribution locality 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 879ecfa74d Fix wording 2021-04-10 17:44:12 +03:00
Vitaliy Filippov aea2d19d35 Change Telegram chat link 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 04f86dc00b Fix Russian README for CMake build 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 7aeb2cbac7 Capture all by value in qemu_proxy 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 519f081006 Add LICENSE 2021-04-10 17:44:12 +03:00
Vitaliy Filippov e50f703e1d Add Russian version of the README 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 2612d3198a Introduce image names and metadata storage in etcd
Each inode has: image name, parent inode number & pool, size and readonly flag

Snapshots are created by switching image name to a different inode number
while using the older inode as parent.
2021-04-10 17:44:12 +03:00
Vitaliy Filippov ab39ce2bbb Use clean_entry_bitmap_size instead of entry_attr_size back because of changed bitmap handling 2021-04-10 17:44:12 +03:00
Vitaliy Filippov d0c2e31312 Add a test for snapshots, fix bugs. Now the test passes 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 9038d42327 Fix several snapshot I/O bugs 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 691f066055 Actual snapshot support (untested) 2021-04-10 17:44:12 +03:00
Vitaliy Filippov ffe1cd4c79 Report inode I/O statistics, aggregate it in the monitor 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 4ae1b84c67 Report inode space usage statistics to etcd, aggregate it in the monitor 2021-04-10 17:44:12 +03:00
Vitaliy Filippov c35963967f Add inode space usage statistics tracking to blockstore 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 0aa2dd2890 Send bitmaps with primary-reads, actually read bitmaps for READ ops 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6bf88883ac Allocate bitmaps along with stripes to avoid memory fragmentation 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 004f265393 Remove cryptic bitmap inlining from bs_op_t and osd_op_t, use bitmap in primary OSD code 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 860ac24762 Add "external" bitmap support to the secondary OSD protocol 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6107a4d07b Add "external" bitmap support to blockstore 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 95c29b9dc3 Add "external" bitmap support to osd_rmw 2021-04-10 17:44:12 +03:00
Vitaliy Filippov d99407dcec Check QEMU block-vitastor.so during the test 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6909807068 Allow to start the OSD just to flush the journal completely 2021-04-10 17:44:12 +03:00
Vitaliy Filippov ec90fe6ec1 Release 0.5.13
Another followup to 0.5.11
2021-04-09 12:10:16 +03:00
Vitaliy Filippov 18c72f4835 Correct reenterability fix (now verified with a test)
It's rather funny but 0.5.12 has to be re-published again
2021-04-09 12:10:16 +03:00
Vitaliy Filippov 59fbcef734 Release 0.5.12
Fix qemu driver broken in 0.5.11 :)
2021-04-08 15:47:18 +03:00
Vitaliy Filippov 40b7c21fb1 Followup to 307c1731c1 - fix mark_stable 2021-04-08 15:47:18 +03:00
Vitaliy Filippov efb3678606 Fix qemu-img broken in 0.5.11
Caused by the lack of reenterability of the main cluster_client function
2021-04-08 14:59:20 +03:00
Vitaliy Filippov 462650134e Release 0.5.11
Another bunch of fixes, including important ones. Now OSDs are stable in SSD+HDD
configurations and everything is mostly ready for the merge of master branch.

Features:

- Add min_flusher_count configuration (good for HDDs)
- Shuffle PGs for better data device utilisation
- Make OSDs benefit from the immediate_commit=small setting if it's applicable

Bug fixes:

- Rework client code to fix write ordering during operation replay
- Rework error handling code so OSDs don't crash in reaction to a crash of their peer OSDs
- Fix several block layer problems related to the journal, some of which
  were leading to double allocations of the same block during journal replay
- Fix monitors crashing during the removal of OSD keys from etcd
- Fix data fsyncs being incorrectly disabled when only disable_journal_fsync was set
- Always zero out unused part of request/reply headers
- Fix some theoretically possible read/write ordering issues
- Don't try to "recover" misplaced objects if it would make them degraded
- Fix heartbeats sometimes preventing OSD to establish connections
2021-04-08 01:18:46 +03:00
Vitaliy Filippov 8d87e32175 Fix msgr_op.h includes 2021-04-08 01:18:46 +03:00