August 14, 2020

Self built home private cloud NAS — disk management system

By admin

Disk management system

Previously, I ran the Gen8 directly on CentOS, with LVM for logical volume management and nearly all software running in Docker. That setup ran continuously for a year without any problems.

This time, however, I am going to use OpenMediaVault (OMV) as the operating system. First, unlike FreeNAS, which is built on FreeBSD, OpenMediaVault is a Debian-based Linux system, so I can keep tinkering with it in the future, and the data disks can later be read on any other Linux machine. Second, its installation footprint is very small, so I plan to use all four SATA ports for data disks and install the operating system on a USB flash drive. Once everything is installed, I make a copy of the flash drive; if the drive fails, the only cost is buying a new one and restoring the copy. OpenMediaVault also has a dedicated flash memory plug-in that reduces reads and writes to the root file system to protect the flash drive.

If the budget allows, use hard disks designed for NAS, such as Western Digital Red or Seagate IronWolf. These disks are built to stay stable under continuous (24x7) operation. In my system I used Seagate SkyHawk surveillance disks instead, mainly because they are cheap and can also run for long periods. They spin at only 5900 rpm, which effectively reduces disk read and write noise.

For the disk file system, I also gave up RAID and LVM and adopted mergerfs + SnapRAID. I feel this is currently the most suitable disk and file management combination for a home NAS.

MergerFS

First, mergerfs is a union file system, and what it achieves is similar to LVM: the data on multiple hard disks appears as a single directory. The difference is that it is a user-space file system and does not stripe data across the disks. Each disk keeps the directories and files of its original file system. If you take a disk out and connect it to another machine, you can read its data without any logical volume configuration. Similar file systems include aufs and mhddfs, but many tests show that mergerfs is more efficient and offers more configuration options.

Suppose disk A has a “movie” directory and disk B has a “download” directory. After merging the two disks into one directory with mergerfs, you see both the “movie” and “download” folders. Under the default policy (epmfs: among the disks where the directory already exists, use the one with the most free space), data written to the movie directory goes to disk A first, and downloaded files go to disk B first. This way, disks that are rarely used can run less or even spin down. This policy suits BitTorrent downloading well: the download traffic stays on the disk that holds the download directory, which protects the disk where the movies are stored. For the other policies, see the documentation in the mergerfs GitHub repository and choose according to your own needs.
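
The epmfs selection described above can be sketched in a few lines of Python. This is only a simplified model of the policy, not mergerfs code: the `Branch` class, the function name, and the free-space figures are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One physical disk in the pool (names and sizes are hypothetical)."""
    name: str
    free_gb: float
    dirs: set = field(default_factory=set)  # directories already present on this disk


def pick_branch_epmfs(branches, parent_dir):
    """Return the branch a new file under parent_dir would be written to.

    Existing path: only branches that already contain parent_dir qualify.
    Most free space: among those, the branch with the most free space wins.
    """
    candidates = [b for b in branches if parent_dir in b.dirs]
    if not candidates:
        return None  # simplification: no disk holds this directory
    return max(candidates, key=lambda b: b.free_gb)


# Disk A holds "movie", disk B holds "download" (sizes are invented).
disk_a = Branch("a", free_gb=500, dirs={"movie"})
disk_b = Branch("b", free_gb=900, dirs={"download"})
```

With these two disks, `pick_branch_epmfs([disk_a, disk_b], "movie")` picks disk A even though disk B has more free space, because only disk A already contains the directory.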

The main disadvantage of mergerfs is that it does not stripe data across disks. Each file is stored whole on a single disk, so disk space cannot always be fully used. For example, take a 30 GB Blu-ray image: even though disk A and disk B each have 20 GB free, the pool cannot store this file, because mergerfs keeps files intact and cannot split it into two 15 GB pieces. However, with disks of several terabytes this situation rarely comes up. My suggestion is to start thinking about adding a new disk when disk utilization reaches about 80%.
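
The arithmetic behind this limitation is easy to check. The following is a minimal sketch with hypothetical function names; the 80% threshold is the rule of thumb from the paragraph above, not a mergerfs setting.

```python
def fits_in_pool(file_gb, free_gb_per_disk):
    """A file fits only if a single disk can hold it whole (no striping)."""
    return any(free >= file_gb for free in free_gb_per_disk)


def needs_new_disk(used_gb, total_gb, threshold=0.8):
    """Rule of thumb: plan for a new disk at about 80% utilization."""
    return used_gb / total_gb >= threshold
```

For the example above, `fits_in_pool(30, [20, 20])` is false even though the pool has 40 GB free in total, while a 15 GB file would fit on either disk.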

In mergerfs, I built a pool with the mount point /pool on top of the data hard disks. All shared files live in this directory, and any hard disk added later also shows up here. When a program reads or writes in this directory, mergerfs handles the request in real time and places the data in the right directory on the right disk. In my tests, read and write throughput reached nearly 100 MB/s at about 15% CPU utilization.
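
On a plain Linux system, a pool like this is usually declared with one line in /etc/fstab; on OMV the plug-in manages the mount for you. As a rough illustration, the helper below assembles such a line. The branch paths and the helper itself are hypothetical, and the option list is only an assumed, commonly seen starting point, so check the mergerfs documentation before using it.

```python
def mergerfs_fstab_line(branches, mount_point="/pool", create_policy="epmfs"):
    """Build an /etc/fstab entry that merges the given branch directories.

    branches: mount points of the individual data disks (hypothetical paths).
    create_policy: mergerfs policy used when creating new files.
    """
    source = ":".join(branches)  # mergerfs takes a colon-separated branch list
    options = f"defaults,allow_other,category.create={create_policy}"
    return f"{source} {mount_point} fuse.mergerfs {options} 0 0"
```

For example, `mergerfs_fstab_line(["/srv/disk1", "/srv/disk2"])` produces a single line that mounts both disks under /pool.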

SnapRAID

Unlike ZFS, mergerfs has no redundancy to protect data. SnapRAID can fill this gap as a software RAID. It works much like RAID 5 and needs an additional hard disk to store parity data. This parity disk must be at least as large as the largest data disk. The data disks also store a small amount of bookkeeping information called “content”. In other words, SnapRAID takes up some extra space to provide redundancy.
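
To show the idea behind RAID 5 style parity, here is a toy sketch using XOR over byte strings. It is not how SnapRAID is implemented internally; the function names are made up, and each "disk" is just a short byte string.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def pad(block, size):
    """Pad a block with zero bytes so every block has the same length."""
    return block.ljust(size, b"\x00")


def compute_parity(data_blocks):
    """Parity is as long as the largest data block."""
    size = max(len(b) for b in data_blocks)
    return xor_blocks([pad(b, size) for b in data_blocks])


def recover_block(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus parity."""
    size = len(parity)
    return xor_blocks([pad(b, size) for b in surviving_blocks] + [parity])
```

Because shorter blocks are padded up to the longest one, the parity is always as long as the largest block, which mirrors why the parity disk must be at least as large as the largest data disk. A recovered block comes back zero-padded to that same length.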

Why do I recommend SnapRAID over hardware RAID and ZFS? First of all, SnapRAID does not require all disks to be the same size, and it does not care which file system each disk uses.

As its name suggests, SnapRAID builds its redundancy from snapshots: parity is generated only when a sync operation runs, typically on a schedule. Unlike hardware RAID, this design does not need every disk running on each write just to keep parity in sync in real time. The trade-off is that files changed since the last sync are not yet protected. And if a repair fails, the data on the whole disk is not lost; only individual files cannot be recovered.

The disadvantages of SnapRAID are also obvious. It is not suitable for systems with large numbers of small, frequently changing files. It is better suited to home NAS workloads such as photo and movie storage: many large files that rarely change.

To set up SnapRAID, add every disk pooled under mergerfs to its data disk list. The dedicated parity disk should not be added to mergerfs; it holds only parity and redundancy information.
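
The layout described above maps onto a handful of lines in the SnapRAID configuration file. The sketch below generates such a file from a list of mount points. The paths, disk names (d1, d2, ...) and file names are hypothetical, and on OMV the plug-in normally writes this file for you.

```python
def snapraid_conf(parity_mount, data_mounts):
    """Generate a minimal snapraid.conf as text.

    parity_mount: mount point of the dedicated parity disk (not in mergerfs).
    data_mounts: mount points of the data disks that mergerfs pools together.
    """
    lines = [f"parity {parity_mount}/snapraid.parity"]
    # Keep a copy of the "content" file on every data disk.
    for mount in data_mounts:
        lines.append(f"content {mount}/snapraid.content")
    # Register each data disk under a short name.
    for i, mount in enumerate(data_mounts, start=1):
        lines.append(f"data d{i} {mount}/")
    return "\n".join(lines) + "\n"
```

Calling `snapraid_conf("/srv/parity", ["/srv/disk1", "/srv/disk2"])` yields one parity line, two content lines and two data lines.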

Unlike mergerfs, which works in near real time, SnapRAID needs regular maintenance, and data should not be written while parity is being generated. You can trigger a sync manually in the OMV web management interface, which updates the parity and redundancy data. Alternatively, create a scheduled task that runs the sync every night, when nothing is being written.
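
Outside the OMV interface, the nightly task could be an ordinary cron entry that runs `snapraid sync`. The helper below only builds that crontab line; the 03:00 default is an arbitrary assumption, and the helper name is made up.

```python
def nightly_sync_cron(hour=3, minute=0, command="snapraid sync"):
    """Build a user crontab line that runs the sync once a day.

    The default time (03:00) is only an example; pick an hour when
    nothing writes to the pool.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    return f"{minute} {hour} * * * {command}"
```

The returned line can be pasted into `crontab -e` for a user that is allowed to run SnapRAID.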