Ceph performance

* CPU is the main bottleneck for Ceph running on good SSDs.
* As Nick Fisk said in his presentation, Ceph is a Software-Defined Storage '''and every piece of Ceph «Software»''' will run faster with every GHz of CPU clock speed.
* Server CPUs often have NUMA (Non-Uniform Memory Access), which means that some CPU cores don’t have direct access to the RAM and/or part of the hardware and have to forward requests to other cores (a quick topology check is sketched at the end of this section).
* You should try to use CPUs with higher clock speed and without NUMA to maximize performance. That is, a slower CPU with more cores is probably worse than a faster CPU with a smaller number of cores…
* …but within reason, as one Bluestore OSD can eat up to ~6 cores under full load.
* «Clock speed» means nominal, not Turbo Boost speed, because Turbo Boost is only beneficial for single-threaded workloads.
* CPU pinning recommendations (taskset) are mostly outdated, because Ceph OSDs are multi-threaded. At least 4 threads are active during writes, so you’ll only slow your OSDs down if you allocate fewer than 4-6 cores for each of them.
* There are 2 parameters responsible for OSD worker thread count: osd_op_num_shards and osd_op_num_threads_per_shard…
* …But trying to tune them is pointless: the default configuration (1x5 for HDDs and 2x8 for SSDs) is optimal. The problem is that all worker threads still serialize writes into a single kv_sync_thread, and the whole scheme only scales up to ~6 worker threads. (A sketch of how to check the values your OSDs actually run with is at the end of this section.)
* There is one thing that decreases latency 2-3 times at once: disabling all CPU power-saving features (see the verification sketch at the end of this section).
** <tt>cpupower idle-set -D 1</tt> disables C-States (or you can pass <tt>processor.max_cstate=1 intel_idle.max_cstate=0</tt> on the kernel command line)
** <tt>for i in $(seq 0 $((`nproc`-1))); do cpufreq-set -c $i -g performance; done</tt> disables frequency scaling.
* With power-saving disabled the CPU heats up like a GTX, but you get 2-3 times more iops.
* High CPU requirements are one of the reasons NOT to use Ceph in a «hyperconverged setup», that is, a setup in which storage and compute nodes are combined.
* You can also disable all hardware vulnerability mitigations: <tt>noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier</tt> (or just <tt>mitigations=off</tt> for newer kernels)
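A quick way to see whether a node is NUMA and how cores map to memory domains (as mentioned in the NUMA bullet above). <tt>lscpu</tt> ships with util-linux; <tt>numactl</tt> usually comes from a separate package:

<pre>
# Show the number of NUMA nodes and which CPU cores belong to each of them
lscpu | grep -i numa
# More detail: per-node memory sizes and inter-node distances
numactl --hardware
</pre>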
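To check which shard/thread values your OSDs actually run with (the defaults mentioned above), the OSD admin socket can be queried. <tt>osd.0</tt> is just an example id; run this on the host where that OSD lives:

<pre>
# Print the effective worker thread settings of a running OSD
ceph daemon osd.0 config show | grep osd_op_num
</pre>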
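A sketch of how to verify that the power-saving tweaks above actually took effect (<tt>cpupower</tt> is usually packaged as linux-tools or kernel-tools):

<pre>
# C-states: the deep idle states should be listed as disabled
cpupower idle-info
# Frequency scaling: the governor should be "performance" on every core
cpupower frequency-info
grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
</pre>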
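The <tt>cpupower</tt>/<tt>cpufreq-set</tt> calls are not persistent across reboots; a common way to make the kernel command-line options above (the C-state limits and <tt>mitigations=off</tt>) permanent is via GRUB. This is only a sketch for a Debian/Ubuntu-style layout; RHEL-like systems use <tt>grub2-mkconfig</tt> instead of <tt>update-grub</tt>:

<pre>
# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="processor.max_cstate=1 intel_idle.max_cstate=0 mitigations=off"

# regenerate the bootloader config and reboot
update-grub
</pre>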