du does a depth-first traversal of the given tree. By default, it prints one line per directory in that tree, showing the inclusive disk usage of each (that is, each directory's size includes everything beneath it):
$ du ~
4 /home/bob/Videos
40 /home/bob/.cache/abrt
43284 /home/bob/.cache/mozilla/firefox
43288 /home/bob/.cache/mozilla
12 /home/bob/.cache/imsettings
48340 /home/bob/.cache
4 /home/bob/Documents
48348 /home/bob
If given the -a option, it will additionally show the size of every file. With the -s option, it will show just the total size of each argument file or directory tree.
$ du -s ~
48348 /home/bob
$ du -s ~/*
4 /home/bob/Videos
4 /home/bob/Documents
So, when you ran
$ du -b ~ | wc -l
15041
$ du -b ~ | sort -n | head -n 15040 | cut -f 1 | \
perl -ne 'BEGIN{$i=0;} $i+=$_; END{print $i.qq|\n|;}'
12735983847
you were summing up the size of everything under your home directory, multiple times in fact, because the size reported on each line is inclusive of all subdirectories. And because you omitted the final line of du's output, which would be the line for /home/steven itself, the sum didn't include the size of any of the regular files at the top level of your home directory, so it missed your very large .xsession-errors file.
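To see the double counting concretely, here is a made-up toy tree (the paths and sizes below are purely illustrative, not from your system):
$ du ~/demo
8	/home/bob/demo/sub
16	/home/bob/demo
$ du ~/demo | cut -f 1 | paste -sd+ | bc
24
The 16 reported for /home/bob/demo already includes the 8 from sub, so the naive sum of 24 overstates the real usage.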
And when you ran
du -sb ~ returns 91296460205, but the sum of du -sb ~/* is only 1690166532
your du -sb ~/* output didn't include any files or directories in your home directory whose names begin with a dot (such as that .xsession-errors file).
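If you want the glob to cover the dot entries as well, one option in bash is the dotglob shell option; a quick sketch:
$ shopt -s dotglob          # make * also match names beginning with .
$ du -sb ~/* | cut -f 1 | paste -sd+ | bc
The total should now come much closer to du -sb ~ (the small remaining difference is the size of the ~ directory entry itself).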
Both du ~ | tail -1 and du -s ~ should do a reasonable job of showing your home directory's disk usage (not including deleted-but-open files, of course), but if you want to sum up all the file sizes without relying on du, you can do something like this (assuming a modern find that supports the -printf action with the %s directive to show the size in bytes):
find ~ -type f -printf '%s\n' | perl -ne 'BEGIN{$i=0;} $i+=$_; END{print $i.qq|\n|;}'
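Or, if you prefer awk over perl, an equivalent one-liner:
find ~ -type f -printf '%s\n' | awk '{ s += $1 } END { print s }'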
While Marco's answer explained all the details correctly, I just want to focus on your last question/summary:
Is it a good idea to set up SSD + HDD in same pool, or is there a better way to optimize my pair of drives for both speed and capacity?
ZFS is a file system designed for large arrays with many smaller disks. Although it is quite flexible, I think it is suboptimal for your current situation and goal, for the following reasons:
- ZFS does not reshuffle already-written data. What you are looking for is called a hybrid drive; Apple's Fusion Drive, for example, fuses multiple disks together and automatically selects the storage location for every block based on access history (moving data when there is no load on the system or when a block is rewritten). With ZFS you get none of that, neither automatically nor manually: your data stays where it was written initially (or is marked for deletion).
- With just a single disk, you give up redundancy and self-healing. You still detect errors, but you do not use the full capabilities of the system.
- Putting both disks in the same pool means an even higher chance of data loss (this is RAID0, after all) or corruption; additionally, your performance will be subpar because of the different drive sizes and drive speeds.
- HDD+SLOG+L2ARC is a bit better, but you need a very good SSD (better two separate ones, as Marco said, though a single NVMe SSD is a good if expensive compromise), and most of the space on it is wasted: 2 to 4 GB are enough for the ZIL, and a large L2ARC only helps if your RAM is full, yet it needs a fair amount of RAM itself. This leads to a sort of catch-22: if you want to use an L2ARC, you need more RAM, but with more RAM you can often just skip the L2ARC, because the RAM alone is enough. Remember, only blocks are stored, so you do not need as much as you would assume from looking at plain files. (If you go this route anyway, see the sketch after this list.)
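The commands for the SLOG+L2ARC layout are simple enough. This is only a sketch: the pool name tank and the partition paths are placeholders, and you would partition the SSD beforehand (e.g. with parted):
$ zpool add tank log /dev/disk/by-id/ata-MySSD-part1      # small partition, 2-4 GB, as SLOG
$ zpool add tank cache /dev/disk/by-id/ata-MySSD-part2    # the rest as L2ARC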
Now, what are the alternatives?
- You could split the disks into two pools: one for the system, one for the data (see the sketch after this list). This way you get no automatic rebalancing and no redundancy, but a clean system which can be extended easily and which has no RAID0 problems.
- Buy a second large HDD and make a mirror, using the SSD as you outlined: this removes the problem of differently sized and differently fast disks, gives you redundancy, and keeps the SSD flexible.
- Buy n SSDs and do RAIDZ1/2/3. Smaller SSDs are pretty cheap nowadays and do not suffer from slow rebuild times, making RAIDZ1 interesting again.
- Use another file system or volume manager with hybrid capabilities, and put ZFS on top if needed. This is not considered optimal, but neither is working with two single-disk vdevs in one pool... at least you get exactly what you want, plus some of ZFS's nice features (snapshots etc.) on top, though I wouldn't count on stellar performance.
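For the first alternative, a minimal sketch of the two-pool setup (pool names and device paths here are placeholders; substitute your real /dev/disk/by-id/ paths):
$ zpool create syspool /dev/disk/by-id/ata-MySSD     # SSD as its own single-disk pool for the system
$ zpool create datapool /dev/disk/by-id/ata-MyHDD    # HDD as a separate pool for bulk data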
If the filesystem is ext4, there are reserved blocks, mostly there so that root can still operate when the disk fills up and to help avoid fragmentation; they are available only to the root user. This particular setting can be changed live using tune2fs (not all settings can be changed like this while the filesystem is mounted):
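A sketch of checking the current reservation first, with /dev/sda1 standing in for your actual device:
$ tune2fs -l /dev/sda1 | grep -i 'reserved block count'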
So if you want to lower the reservation to 1% (~2 GB here), thereby getting access to ~8 GB of formerly reserved space, you can do this:
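Again with /dev/sda1 as a placeholder for your device:
$ sudo tune2fs -m 1 /dev/sda1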
Note: the -m option actually accepts a decimal number as its parameter. You can use -m 0.1 to reserve only about ~200 MB (and get access to most of those previously unavailable 10 GB). You can also use the -r option instead to specify the reservation directly in blocks. It's probably not advised to have 0 reserved blocks.
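For example (device path again a placeholder; 51200 blocks of 4 KiB is roughly 200 MB):
$ sudo tune2fs -m 0.1 /dev/sda1     # reserve ~0.1% of the filesystem
$ sudo tune2fs -r 51200 /dev/sda1   # or reserve an exact number of blocks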