*: Reading from an empty RBD image is very fast :) so pre-fill it before testing (see the pre-fill sketch after this list).
*: Run tests from node(s) where your actual RBD users will reside. The results are usually slightly better when you run tests from a separate physical server.
*# fio -ioengine=rbd -direct=1 -name=test -bs=4M -iodepth=16 -rw=write -pool=rpool_hdd -runtime=60 -rbdname=testimg
*# fio -ioengine=rbd -direct=1 -name=test -bs=4k -iodepth=1 -rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg
*# fio -ioengine=rbd -direct=1 -name=test -bs=4k -iodepth=128 -rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg
* The same from inside a VM or through the kernel RBD driver (krbd); see the mapping sketch after this list:
*# fio -ioengine=libaio -direct=1 -name=test -bs=4M -iodepth=16 -rw=write -runtime=60 -filename=/dev/rbdX
*# fio -ioengine=libaio -direct=1 -sync=1 -name=test -bs=4k -iodepth=1 -rw=randwrite -runtime=60 -filename=/dev/rbdX
*# fio -ioengine=libaio -direct=1 -name=test -bs=4k -iodepth=128 -rw=randwrite -runtime=60 -filename=/dev/rbdX
*: Don't miss the added -sync=1 option in the latency test. It is added on purpose, to match the ioengine=rbd tests: ioengine=rbd has no concept of sync, everything is always «sync» with it. Overall this write pattern (transactional single-threaded write) corresponds to a DBMS.
*: Note that regardless of the supposed overhead of moving data in and out of the kernel, the kernel client is actually faster.
* ceph-gobench
*: Or https://github.com/vitalif/ceph-bench. The original idea comes from «Mark’s bench» from the Russian Ceph chat ([https://github.com/socketpair/ceph-bench the original outdated tool was here]). Both use a non-replicated Ceph pool (size=1), create several 4 MB objects (16 by default) on each separate OSD and do random single-threaded 4 KB writes to randomly selected objects within one OSD. This mimics random writes to RBD and makes it possible to identify problematic OSDs by benchmarking them separately.
*: To create the non-replicated benchmark pool, use {{Cmd|ceph osd pool create bench 128 replicated; ceph osd pool set bench size 1; ceph osd pool set bench min_size 1}}. Just note that 128 (the PG count) should be enough for all OSDs to get at least one PG each; see the check after this list.
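A minimal pre-fill sketch, assuming the same pool and image names as in the tests above (rpool_hdd/testimg): one sequential large-block pass over the whole image allocates all of its objects, so the later tests measure real I/O instead of reads of nonexistent data. Without a -runtime limit, fio stops when it reaches the end of the image.

 # Pre-fill the whole test image with one sequential large-block pass
 fio -ioengine=rbd -direct=1 -name=fill -bs=4M -iodepth=16 -rw=write -pool=rpool_hdd -rbdname=testimg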
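For the krbd tests the image has to be mapped to a /dev/rbdX device first. A sketch, again assuming rpool_hdd/testimg; rbd map prints the resulting device path:

 # Map the image through the kernel RBD driver (prints e.g. /dev/rbd0)
 rbd map rpool_hdd/testimg
 # ...run the libaio tests above against the printed device...
 # Unmap when done
 rbd unmap rpool_hdd/testimg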
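A quick sketch of checking that every OSD got at least one PG of the bench pool, assuming a Ceph release where ceph pg ls-by-pool supports JSON output with a pg_stats array (recent releases do) and jq is installed:

 # Count bench pool PGs per OSD; with size=1 the acting primary is the only OSD of each PG
 ceph pg ls-by-pool bench -f json | jq -r '.pg_stats[].acting_primary' | sort -n | uniq -c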
== Why is it so slow ==