Ext4 vs XFS

= Feature difference =

* Ext4 supports big cluster sizes (up to 256 MB) with -O bigalloc, while XFS only supports cluster sizes of 512 bytes to 4 KB (see the mkfs sketch below)
* XFS supports fully dynamic inode allocation, i.e. you'll never run out of inodes, and at the same time you don't need to waste disk space by reserving it for inodes
* Ext4 does NOT support changing the inode count without reformatting the filesystem, even with resize2fs; by default, 1/64 of the disk space is reserved for inodes (!!!)
*: Changing the inode count isn't hard in theory: (1) move data blocks out of the way if they need to be reserved for inodes, (2) change inode numbers in all directory entries, (3) overwrite/move the inode bitmaps and tables. But it's not implemented :-(
* XFS does NOT support shrinking a filesystem at all (you can only grow it)
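A minimal sketch of how those allocation options look in practice; the device names, mount points and the 64 KB cluster size are placeholders, not settings used in the benchmarks below:

<code-bash>
# ext4 with bigalloc: 64 KB clusters (-C is in bytes); -i sets bytes-per-inode,
# i.e. how much space gets reserved for the statically sized inode table
mkfs.ext4 -O bigalloc -C 65536 -i 1048576 /dev/sdX1

# XFS: inodes are allocated dynamically, nothing comparable to reserve up front
mkfs.xfs -b size=4096 /dev/sdX2

# after mounting, compare inode capacity/usage on both filesystems
df -i /mnt/ext4 /mnt/xfs
</code-bash>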
  
= Benchmarks =

== Operations with kernel 3.10 source tree ==

* HDD: WD Scorpio Black 2.5" 750GB 7200rpm
* Kernel: 3.12.3 (Debian 3.12.3-1~exp1)

Copy the kernel source from an SSD to the tested FS and then sync, with a warm page cache (i.e. the test is not read-bound); a timing sketch is given after the results:
* xfs, 1 parallel copy: 12.348s
* xfs, 4 parallel copies: 65.883s
* ext4, 1 parallel copy: 7.662s
* ext4, 4 parallel copies: 33.876s
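The exact commands aren't given in the article; a sketch of what such a timed run could look like (the source and target paths are assumptions):

<code-bash>
# 4 parallel copies of the kernel tree from the SSD to the tested FS,
# timed together with the final sync so that write-back is included
time (
    for i in 1 2 3 4; do
        cp -a /ssd/linux-3.10 /mnt/test/copy-$i &
    done
    wait
    sync
)
</code-bash>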
 
tar 3 copies of the kernel source from the tested FS to /dev/null (basically just read and discard) after 'echo 3 > /proc/sys/vm/drop_caches':
* xfs: real 26.815s, user 0.936s, sys 1.556s
* ext4: real 5.509s, user 0.584s, sys 0.872s (almost 5 times faster!)

rm the 3 kernel source copies and sync, again after 'echo 3 > /proc/sys/vm/drop_caches' (both passes are sketched below):
* xfs: real 7.244s, user 0.148s, sys 2.748s
* ext4: real 8.993s, user 0.108s, sys 2.664s
(oh, xfs is in fact faster in this test!)
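Roughly what the read and delete passes look like; the paths are placeholders, only the drop_caches step is quoted from the article:

<code-bash>
# drop clean caches so the reads actually hit the disk
echo 3 > /proc/sys/vm/drop_caches
# read pass: stream all 3 copies through tar and discard the data
time tar cf - /mnt/test/copy-1 /mnt/test/copy-2 /mnt/test/copy-3 > /dev/null

# delete pass: remove the copies and wait until the metadata reaches the disk
echo 3 > /proc/sys/vm/drop_caches
time sh -c 'rm -rf /mnt/test/copy-1 /mnt/test/copy-2 /mnt/test/copy-3; sync'
</code-bash>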
  
 
== FS-Mark 3.3, creating 1M files ==

* HDD: WD Scorpio Black 2.5" 750GB 7200rpm
* Kernel: 3.12.3
* fs_mark is a write-only test and it does fsync(), so there should be no skew caused by the page cache (an example invocation is sketched below)
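The fs_mark command line isn't listed in the article; a plausible invocation for a run of this shape (directory, file size and thread count are assumptions) might be:

<code-bash>
# aiming at ~1M files total: 4 threads, 250000 files each (assuming -n is per thread),
# 4 KB files, fsync before close (-S 1)
fs_mark -d /mnt/test/fsmark -t 4 -n 250000 -s 4096 -S 1
</code-bash>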
 
== sysbench random read/write 16K in 8M files ==

* HDD: WD Scorpio Black 2.5" 750GB 7200rpm
* Kernel: 3.12.3
* sysbench was run with O_DIRECT, so the page cache should again have no impact (an invocation sketch follows the list).
* It's not really a filesystem benchmark at all! It measures raw disk performance, because it keeps ALL prepared files open during the test. It only shows us that neither ext4 nor XFS slows down direct access to the underlying device (which is also good, of course)…
* Probably because of the above, the filesystems don't differ: the results are exactly the same for 1x 1GB file and 128x 8MB files… and very similar for 3072x 16KB files (next test below).
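The exact sysbench command line isn't given either; a sketch in the old 0.4-style syntax that matches the 128x 8MB case (the mount point and run length are assumptions):

<code-bash>
cd /mnt/test
sysbench --test=fileio --file-num=128 --file-total-size=1G \
    --file-block-size=16K --file-test-mode=rndrw --file-extra-flags=direct prepare
sysbench --test=fileio --file-num=128 --file-total-size=1G \
    --file-block-size=16K --file-test-mode=rndrw --file-extra-flags=direct \
    --max-time=60 --max-requests=0 run
sysbench --test=fileio cleanup
</code-bash>

The 16 KB-files test below would be the same run with --file-num=3072 and --file-total-size=48M.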
  
 
(plot: XFS results by thread count: 1 → 148.13, 2 → 160.01, 4 → 193.00, 8 → 205.62, 16 → 208.25, 32 → 203.88, 64 → 204.35; the ext4 curve is essentially the same)
 
== sysbench random read/write 16K in 16K files ==

* HDD: WD Scorpio Black 2.5" 750GB 7200rpm
* Kernel: 3.12.3
  
 
== [http://sourceforge.net/projects/filebench/ filebench] fileserver, dirty_ratio=1% ==

* HDD: WD VelociRaptor WD6000HLHX, 10000rpm
* Kernel: 3.10.11 (Debian 3.10-3-amd64)
* The fileserver test reads the whole file, appends to it and rewrites the whole file, on 10000 files in X threads
* filebench fails to run the fileserver test with O_DIRECT, so I tried to "disable" the page cache using dirty_ratio=1% and ran the tests like this:

<code-bash>
echo 1 > /proc/sys/vm/dirty_ratio
echo 0 > /proc/sys/vm/dirty_bytes
echo 0 > /proc/sys/kernel/randomize_va_space
for i in 1 2 4 8 16 32 50 64; do
    echo
    echo "== $i threads =="
    echo
    echo 1 > /proc/sys/vm/drop_caches
    sync
    filebench <<EOF
load fileserver
set \$dir=/media/sdd
set \$nthreads=$i
run 30
EOF
done
echo 20 > /proc/sys/vm/dirty_ratio
</code-bash>
  
{|
|-
|
(plot: ops/s (more is better) vs. threads, 1 to 64)
|
<plot>
set xrange [1:64]
set logscale x
set xtics (1, 2, 4, 8, 16, 32, 50, 64)
set yrange [0:90]
set xlabel 'threads'
set ylabel 'MB/s (more is better)'
set xzeroaxis
set grid ytics
set style fill solid 1.0 noborder
set boxwidth 0.7 relative
plot 'xfs.dat' using 1:2 title 'XFS' with linespoints, 'ext4.dat' using 1:2 title 'ext4' with linespoints
DATASET xfs
1.0  44.3
2.0  41.9
4.0  41.8
8.0  43.6
16.0  40.0
32.0  34.9
50.0  27.3
64.0  24.5
ENDDATASET
DATASET ext4
1.0  71.8
2.0  55.3
4.0  54.5
8.0  54.3
16.0  47.5
32.0  36.1
50.0  30.5
64.0  26.5
ENDDATASET
</plot>
|}

== [http://sourceforge.net/projects/filebench/ filebench] fileserver, dirty_ratio=20% ==

* HDD: WD VelociRaptor WD6000HLHX, 10000rpm
* Kernel: 3.10.11 (Debian 3.10-3-amd64)
* The same test, but run with the default dirty_ratio setting of 20%. It's clearly visible that the system was using the page cache extensively: ext4 consistently got an unrealistically high result in the single-threaded run...

{|
|-
|
<plot>
set xrange [1:64]
set logscale x
set xtics (1, 2, 4, 8, 16, 32, 50, 64)
set yrange [0:20000]
set xlabel 'threads'
set ylabel 'ops/s (more is better)'
set xzeroaxis
set grid ytics
set style fill solid 1.0 noborder
set boxwidth 0.7 relative
plot 'xfs.dat' using 1:2 title 'XFS' with linespoints, 'ext4.dat' using 1:2 title 'ext4' with linespoints
DATASET xfs
1.0  7755
2.0  3796
4.0  3333
8.0  3320
16.0  3559
32.0  3653
50.0  2650
64.0  1671
ENDDATASET
DATASET ext4
1.0  16415
2.0  4136
4.0  4108
8.0  4026
16.0  3570
32.0  3147
50.0  2632
64.0  2778
ENDDATASET
</plot>
|
<plot>
set xrange [1:64]
set logscale x
set xtics (1, 2, 4, 8, 16, 32, 50, 64)
set yrange [0:400]
set xlabel 'threads'
set ylabel 'MB/s (more is better)'
set xzeroaxis
set grid ytics
set style fill solid 1.0 noborder
set boxwidth 0.7 relative
plot 'xfs.dat' using 1:2 title 'XFS' with linespoints, 'ext4.dat' using 1:2 title 'ext4' with linespoints
DATASET xfs
1.0  182
2.0  88.7
4.0  78.5
8.0  78.3
16.0  83.8
32.0  85.6
50.0  62.0
64.0  39.2
ENDDATASET
DATASET ext4
1.0  382.9
2.0  96.8
4.0  97.5
8.0  94.9
16.0  83.8
32.0  73.3
50.0  61.5
64.0  64.2
ENDDATASET
</plot>
|}
