Simplified distributed block storage with strong consistency, like in Ceph

OSD Disk Layout Parameters

These parameters apply to OSDs. They are fixed when the OSD drive is initialized and can't be changed afterwards without losing data.

data_device

  • Type: string

Path to the block device to use for data. It's highly recommended to use stable paths for all device names: /dev/disk/by-partuuid/xxx... instead of just /dev/sda or /dev/nvme0n1, because the latter names may be assigned to a different drive after a server restart. Files can also be used instead of block devices, but this is implemented only for testing purposes and not for production.
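To find the stable name of a partition, you can look up which symlinks in /dev/disk/by-partuuid point to it. The following is a minimal sketch, not part of Vitastor; the function name `stable_aliases` and its `by_dir` parameter are made up for illustration.

```python
import os


def stable_aliases(device, by_dir="/dev/disk/by-partuuid"):
    """Return the symlinks under by_dir that resolve to the given device node."""
    target = os.path.realpath(device)
    aliases = []
    for name in sorted(os.listdir(by_dir)):
        link = os.path.join(by_dir, name)
        # Each entry in by_dir is a symlink to a kernel device name
        if os.path.realpath(link) == target:
            aliases.append(link)
    return aliases
```

Calling it with a kernel name such as /dev/sda1 returns the matching by-partuuid paths, if any exist, which can then be used as data_device.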

meta_device

  • Type: string

Path to the block device to use for the metadata. Metadata must be on a fast SSD or performance will suffer. If this option is skipped, data_device is used for the metadata.

journal_device

  • Type: string

Path to the block device to use for the journal. Journal must be on a fast SSD or performance will suffer. If this option is skipped, meta_device is used for the journal, and if it's also empty, journal is put on data_device. It's almost always fine to put metadata and journal on the same device; in this case you only need to set meta_device.
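The fallback order of the three device options can be summarized in a short sketch. This is an illustration of the rules described above, not Vitastor's actual code; the function name `resolve_devices` and the use of a plain dict for the configuration are assumptions.

```python
def resolve_devices(config):
    """Return the effective (data, meta, journal) device paths for an OSD config dict."""
    data = config["data_device"]
    # meta_device falls back to data_device when it is skipped
    meta = config.get("meta_device") or data
    # journal_device falls back to meta_device, which may itself have fallen back
    journal = config.get("journal_device") or meta
    return data, meta, journal
```

With only data_device and meta_device set, the journal ends up on the metadata device, which is the "you only need to set meta_device" case.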

journal_offset

  • Type: integer
  • Default: 0

Offset on the device in bytes where the journal is stored.

journal_size

  • Type: integer

Journal size in bytes. By default, all available space between journal_offset and data_offset, meta_offset or the end of the journal device is used. Large journals aren't needed in SSD-only setups; 32 MB is always enough. In SSD+HDD setups it is beneficial to use larger journals (for example, 1 GB) and enable throttle_small_writes.
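One way to read the default rule is: the journal extends from journal_offset to the start of the next area on the same device, or to the end of the device if nothing follows it. The sketch below follows that reading; it is an assumption about the behaviour, not Vitastor's code, and `default_journal_size` and its parameters are hypothetical names.

```python
def default_journal_size(journal_offset, device_size, other_offsets=()):
    """Size in bytes of the journal when journal_size is not set.

    other_offsets holds the offsets of the areas (metadata and/or data)
    that live on the same device as the journal.
    """
    # The journal ends at the nearest area starting after it, or at the device end
    ends = [o for o in other_offsets if o > journal_offset] + [device_size]
    return min(ends) - journal_offset


MIB = 1024 * 1024  # treating "MB" as 2**20 bytes is an assumption here

# Journal at 0, metadata at 32 MiB, data at 1 GiB, all on one 10 GiB device
print(default_journal_size(0, 10 * 1024 * MIB, [32 * MIB, 1024 * MIB]) // MIB)  # prints 32
```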

meta_offset

  • Type: integer
  • Default: 0

Offset on the device in bytes where the metadata area is stored. Set it explicitly if you colocate metadata with journal or data, so that the areas do not overlap.

data_offset

  • Type: integer
  • Default: 0

Offset on the device in bytes where the data area is stored. Set it explicitly if you colocate data with journal or metadata, so that the areas do not overlap.

data_size

  • Type: integer

Data area size in bytes. By default, the whole data device up to the end will be used for the data area, but you can restrict it if you want to use a smaller part. Note that there is no option to set metadata area size - it's derived from the data area size.
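When several areas share one device, their offsets and sizes must not overlap. A small sanity check could look like the sketch below. It is not part of Vitastor: `check_layout` is a hypothetical helper, and because the metadata area size is derived by Vitastor itself, the metadata size passed in here is only a placeholder that you would have to obtain elsewhere.

```python
def check_layout(areas, device_size):
    """Verify that areas on one device do not overlap or exceed the device.

    areas is a list of (name, offset, size) tuples, all in bytes.
    Returns the area names ordered by offset, or raises ValueError.
    """
    order = []
    prev_name, prev_end = None, 0
    for name, offset, size in sorted(areas, key=lambda a: a[1]):
        if offset < prev_end:
            raise ValueError(f"{name} overlaps {prev_name}")
        if offset + size > device_size:
            raise ValueError(f"{name} extends past the end of the device")
        order.append(name)
        prev_name, prev_end = name, offset + size
    return order
```

For example, a journal at offset 0, metadata right after it and data after the metadata pass the check, while a metadata offset that falls inside the journal raises an error.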

meta_block_size

  • Type: integer
  • Default: 4096

Physical block size of the metadata device. 4096 for most current HDDs and SSDs.

journal_block_size

  • Type: integer
  • Default: 4096

Physical block size of the journal device. Must be a multiple of disk_alignment. 4096 for most current HDDs and SSDs.

disable_data_fsync

  • Type: boolean
  • Default: false

Do not issue fsyncs to the data device, i.e. do not flush its cache. Safe ONLY if your data device has write-through cache. If you disable the cache yourself using hdparm or scsi_disk/cache_type, make sure that the cache disable command is run every time before starting Vitastor OSD, for example, in the systemd unit. See also the immediate_commit option for instructions on disabling the cache and how to benefit from it.

disable_meta_fsync

  • Type: boolean
  • Default: false

Same as disable_data_fsync, but for the metadata device. If the metadata device is not set, or if the data device is used for the metadata, the option is ignored and the disable_data_fsync value is used instead.

disable_journal_fsync

  • Type: boolean
  • Default: false

Same as disable_data_fsync, but for the journal device. If the journal device is not set, or if the metadata device is used for the journal, the option is ignored and the disable_meta_fsync value is used instead. If the same device is used for data, metadata and journal, the option is also ignored and the disable_data_fsync value is used instead.
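The way the three fsync options override each other can be written down as a short sketch. It only restates the rules above and is not Vitastor's implementation; `effective_fsync_flags` is a hypothetical name, and comparing device paths as plain strings is a simplification.

```python
def effective_fsync_flags(config):
    """Return which fsyncs are effectively disabled for data, meta and journal."""
    data = config["data_device"]
    meta = config.get("meta_device") or data
    journal = config.get("journal_device") or meta

    data_off = config.get("disable_data_fsync", False)
    # Metadata on the data device: disable_meta_fsync is ignored
    if meta == data:
        meta_off = data_off
    else:
        meta_off = config.get("disable_meta_fsync", False)
    # Journal on the metadata device: disable_journal_fsync is ignored.
    # If all three share one device this resolves to disable_data_fsync.
    if journal == meta:
        journal_off = meta_off
    else:
        journal_off = config.get("disable_journal_fsync", False)
    return {"data": data_off, "meta": meta_off, "journal": journal_off}
```

The case where the journal is explicitly placed on the data device while metadata is on a separate device is not described above, so the sketch simply honours disable_journal_fsync there.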

disable_device_lock

  • Type: boolean
  • Default: false

Do not lock data, metadata and journal block devices exclusively with flock(). This is not recommended, but you can use it if you want to run multiple OSDs on a single device with different offsets, without using partitions.
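To see what the exclusive lock protects against, here is a small illustration of flock() semantics using Python's fcntl module. It is not Vitastor's code, and `try_lock_exclusive` is a hypothetical helper: it shows that a second opener of the same file cannot take the lock while the first one holds it.

```python
import fcntl


def try_lock_exclusive(fileobj):
    """Try to take an exclusive, non-blocking flock(); return True on success."""
    try:
        fcntl.flock(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Another open file description already holds the lock
        return False
```

With the default setting, a second OSD pointed at the same device would hit the equivalent of the `False` branch; disable_device_lock skips the locking step entirely.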

disk_alignment

  • Type: integer
  • Default: 4096

Required physical disk write alignment. Most current SSD and HDD drives use 4 KB physical sectors even if they report 512 byte logical sector size, so 4 KB is a good default setting.

Note, however, that physical sector size also affects write amplification (WA), because with block devices it's impossible to write anything smaller than a block. So, when Vitastor has to write a single metadata entry that's only about 32 bytes in size, it actually has to write the whole 4 KB sector.

Because of this it can actually be beneficial to use SSDs which work well with 512 byte sectors and use 512 byte disk_alignment, journal_block_size and meta_block_size. But the only SSD that may fit into this category is Intel Optane (probably, not tested yet).
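The effect of sector size on write amplification is simple arithmetic. The sketch below uses the approximate 32-byte entry size mentioned above; `write_amplification` is a hypothetical helper, not a Vitastor function.

```python
def write_amplification(entry_size, block_size):
    """Bytes physically written divided by bytes logically changed."""
    # Writes are rounded up to whole blocks
    blocks = -(-entry_size // block_size)  # ceiling division
    return blocks * block_size / entry_size


print(write_amplification(32, 4096))  # prints 128.0
print(write_amplification(32, 512))   # prints 16.0
```

So a single small metadata update costs 8 times fewer written bytes with 512-byte blocks than with 4 KB blocks, provided the drive really handles 512-byte writes well.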

Clients don't need to be aware of disk_alignment, so it's not required to put a modified value into etcd key /vitastor/config/global.