Ceph performance

138 bytes added, 11:00, 27 July 2019
== Bluestore vs Filestore ==
 
TODO: This section lacks random read performance comparisons.
Bluestore is the «new» storage layer of Ceph. All presentations and documents say it’s better in all ways, which in fact seems reasonable for something «new».
Bluestore is also more feature-rich: it has checksums, compression, erasure-coded overwrites and virtual clones. Checksums allow 2x-replicated pools to self-heal better, erasure-coded overwrites make EC usable for RBD and CephFS, and virtual clones make VMs run faster after taking a snapshot.
In HDD-only (or bad-SSD-only) setups Bluestore is also 2x faster than Filestore for random writes. This is again because it can do 1 commit per write, at least if you apply this patch: https://github.com/ceph/ceph/pull/26909 and turn bluefs_preextend_wal_files on. In fact it’s OK to say that Bluestore’s deferred write implementation is really optimal for transactional writes to slow drives.
However, if you switch to faster drives, Bluestore’s random writes don’t appear to be much better than Filestore’s. This shows up differently in HDD+SSD and All-Flash setups, both of which are certainly very popular.
=== HDD for data + SSD for journal ===
Filestore writes everything to the journal and only starts to flush it to the data device when the journal fills up to the configured percentage. This is very convenient because it makes the journal act as a «temporary buffer» that absorbs random write bursts.
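The buffering behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not Filestore's actual code: the class `ToyJournal` and its `capacity` and `flush_percent` parameters are hypothetical names invented for this example, and the real flush is gradual rather than all-at-once.

```python
class ToyJournal:
    """Toy model: writes land in a journal and reach the data device
    only after the journal fills up to a configured percentage."""

    def __init__(self, capacity, flush_percent):
        self.capacity = capacity            # journal size, in blocks (hypothetical unit)
        self.flush_percent = flush_percent  # fill level that triggers a flush
        self.pending = []                   # blocks sitting in the journal
        self.flushed = []                   # blocks written to the data device

    def write(self, block):
        # Every write goes to the journal first.
        self.pending.append(block)
        # Flushing starts only once the fill threshold is reached.
        if len(self.pending) * 100 >= self.capacity * self.flush_percent:
            self.flush()

    def flush(self):
        # Simplification: move everything to the data device at once.
        self.flushed.extend(self.pending)
        self.pending.clear()


journal = ToyJournal(capacity=10, flush_percent=50)
for block in range(4):
    journal.write(block)
# A burst of 4 writes is absorbed: nothing has touched the data device yet.
print(len(journal.pending), len(journal.flushed))
```

A burst smaller than the threshold never touches the slow device in this model, which is the «temporary buffer» effect the paragraph describes.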
But it’s still a shame that the increase is only 5-10 % for that amount of architectural effort.
=== RAM usage ===
Another thing to note is that Bluestore uses a lot more RAM. This is because it uses RocksDB for all metadata, additionally caches some of the metadata by itself and also tries to cache some data blocks to compensate for the lack of page cache usage. The general rule of thumb is 1GB per 1TB of storage, but not less than 2GB per OSD.
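As a quick worked example of the rule of thumb above, the following sketch estimates RAM for a host from its OSD sizes. The function names `bluestore_ram_gb` and `host_ram_gb` are made up for this illustration, and the result is only the rough estimate the rule gives, not a tuned value.

```python
def bluestore_ram_gb(osd_size_tb):
    """Rule-of-thumb RAM estimate for one Bluestore OSD, in GB."""
    # 1 GB of RAM per 1 TB of storage, but not less than 2 GB per OSD.
    return max(2.0, 1.0 * osd_size_tb)


def host_ram_gb(osd_sizes_tb):
    """Sum the per-OSD estimates for all OSDs on one host."""
    return sum(bluestore_ram_gb(size) for size in osd_sizes_tb)


# Three OSDs of 1 TB, 4 TB and 8 TB: 2 + 4 + 8 GB.
print(host_ram_gb([1, 4, 8]))  # 14.0
```

Note that the 1 TB OSD is counted at the 2 GB floor, so hosts with many small drives need proportionally more RAM than the per-terabyte figure alone suggests.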
=== About the sizing of block.db ===
