Vitastor cannot tolerate network error #2

Closed
opened 9 months ago by DongliSi · 4 comments

Hi, my Vitastor cluster has 3 nodes. When the network of one of the nodes is shut down, the entire cluster stops working.

For example, shutting down the network on node3 causes node1 and node2 to stop responding to client requests.

This problem is 100% reproducible.

Have you tested this situation?

The following is my environment and configuration information:

vitastor version: 0.5.5
etcd Version: 3.4.14

etcdctl --endpoints http://172.16.7.3:2379 put /vitastor/config/global '{"immediate_commit":"all"}'

etcdctl --endpoints http://172.16.7.3:2379 put /vitastor/config/pools '{"1":{"name":"testpool","scheme":"replicated","pg_size":2,"pg_minsize":1,"pg_count":48,"failure_domain":"host"}}'

Command parameters of one of the nodes:

etcd --name node2-ssd --initial-advertise-peer-urls http://172.16.7.4:2380 --listen-peer-urls http://172.16.7.4:2380 --listen-client-urls http://172.16.7.4:2379,http://127.0.0.1:2379 --advertise-client-urls http://172.16.7.4:2379 --initial-cluster-token vitastor --initial-cluster node1-ssd=http://172.16.7.3:2380,node2-ssd=http://172.16.7.4:2380,node3-ssd=http://172.16.7.5:2380 --initial-cluster-state new --max-txn-ops=100000 --auto-compaction-retention=10 --auto-compaction-mode=revision

vitastor-osd --etcd_address 172.16.7.4:2379/v3 --bind_address 172.16.7.4 --osd_num 3 --disable_data_fsync 1 --immediate_commit all --disk_alignment 4096 --journal_block_size 4096 --meta_block_size 4096 --journal_sector_buffer_count 1024 --journal_offset 0 --meta_offset 16777216 --data_offset 138870784 --data_size 429496729600 --data_device /dev/disk/by-id/ata-Samsung_SSD_860_EVO_500GB_S3Z3NB1KB15171L

vitastor-osd --etcd_address 172.16.7.4:2379/v3 --bind_address 172.16.7.4 --osd_num 4 --disable_data_fsync 1 --immediate_commit all --disk_alignment 4096 --journal_block_size 4096 --meta_block_size 4096 --journal_sector_buffer_count 1024 --journal_offset 0 --meta_offset 16777216 --data_offset 138870784 --data_size 429496729600 --data_device /dev/disk/by-id/ata-Samsung_SSD_860_EVO_500GB_S3Z3NB1KB15085V

node /ovpdatastore/pkg/vitastor/mon/mon-main.js --etcd_url "http://172.16.7.4:2379" --etcd_prefix "/vitastor" --etcd_start_timeout 5

DongliSi changed title from Vitastor cannot tolerate network abnormalities to Vitastor cannot tolerate network error 9 months ago
Owner

Hi. First I wanted to tell you a lot of things, including that I just released 0.5.7 and so on, but then I realized you're talking about the lack of TCP timeouts.

So yes, current versions of Vitastor don't use timeouts and don't detect dead connections... The Linux defaults for `net.ipv4.tcp_keepalive_{time,probes,intvl}` are 7200, 9 and 75, so connections only die after roughly 2 hours of inactivity, which is of course unacceptable :))).

I had thought about it, but put it off for some reason :-)). I'll implement timeouts in the next few days, ok.

Owner

OK, try v0.5.8, it has heartbeats. Packages are updated :-)

vitalif closed this issue 9 months ago
Owner

A small correction: I found another bug which may in some cases result in lost objects (not physically lost, but impossible to find because of an incorrect PG configuration), so I'll fix it and release v0.5.9 ))

Owner

OK, now you can test 0.5.9 :)
