How to create Software RAID 1 on Fresh NVMe Drives in CentOS/RHEL

Let’s say you just installed two NVMe drives. That means you currently have the following devices on your system:

/dev/nvme0n1
/dev/nvme1n1

Now, to use RAID 1 on these devices, you first need to partition them. If your drives are smaller than 2TB, you can use an msdos label with fdisk, but I prefer gpt with parted, so I will partition the disks using parted.

Open the disk nvme0n1 using parted

parted /dev/nvme0n1

Now, set the label to gpt

mklabel gpt

Now, create the primary partition

mkpart primary 0TB 1.9TB

Assuming 1.9TB is the size of your drive.

Run the same process for nvme1n1 as well. This will create one partition on each device, which will look like the following:

/dev/nvme0n1p1
/dev/nvme1n1p1
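
If you prefer to script the partitioning instead of typing into parted’s interactive shell, the same steps can be run non-interactively. This is just a sketch; using percentages avoids hardcoding the drive size and also keeps the partitions aligned:

parted -s /dev/nvme0n1 mklabel gpt mkpart primary 0% 100%
parted -s /dev/nvme1n1 mklabel gpt mkpart primary 0% 100%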

Now you can create the RAID using the mdadm command as follows:

mdadm --create /dev/md201 --level=mirror --raid-devices=2 /dev/nvme0n1p1 /dev/nvme1n1p1

If you see “mdadm: command not found”, you can install mdadm using the following:

yum install mdadm -y

Once done, you can check on your RAID using the following command:

[root@bd3 ~]# cat /proc/mdstat
Personalities : [raid1]
md301 : active raid1 sdd1[1] sdc1[0]
      976628736 blocks super 1.2 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

md201 : active raid1 nvme1n1p1[1] nvme0n1p1[0]
      1875240960 blocks super 1.2 [2/2] [UU]
      bitmap: 2/14 pages [8KB], 65536KB chunk

md124 : active raid1 sda5[0] sdb5[1]
      1843209216 blocks super 1.2 [2/2] [UU]
      bitmap: 4/14 pages [16KB], 65536KB chunk

md125 : active raid1 sda2[0] sdb2[1]
      4193280 blocks super 1.2 [2/2] [UU]

md126 : active raid1 sdb3[1] sda3[0]
      1047552 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 sda1[0] sdb1[1]
      104856576 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
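
For a closer look at a specific array (sync progress, member state, UUID), you can also ask mdadm directly:

mdadm --detail /dev/md201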

Here are a few key points about software RAID:

  1. It is better not to use RAID 10 with software RAID. If the RAID configuration is lost, it is hard to know which drives mdadm set up as stripes and which as mirrors. As a rule of thumb, stick to RAID 1 with software RAID.
  2. RAID 1 in mdadm can serve read requests from both devices in parallel: one request reads from one device while another request reads from the other. This gives roughly double the read throughput when parallel threads are running. It still carries the write penalty of writing the data to two devices.
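
With the array created, a typical follow-up is to record it in mdadm.conf so it assembles on boot, and to put a filesystem on it. A minimal sketch, assuming ext4 and an example mount point of /data:

mdadm --detail --scan >> /etc/mdadm.conf
mkfs.ext4 /dev/md201
mkdir -p /data
mount /dev/md201 /data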

How to Speed Up Software RAID (mdadm) Resync

mdadm is the software RAID tool used on Linux systems. One key problem with software RAID is that resync is utterly slow compared with the speed of modern drives (SSD or NVMe). The kernel applies the same default resync speed limits regardless of the drive type you have. To view the default values, you may run the following:

[root@172 ~]# sysctl dev.raid.speed_limit_min
dev.raid.speed_limit_min = 1000
[root@172 ~]# sysctl dev.raid.speed_limit_max
dev.raid.speed_limit_max = 200000

As you can see, the minimum value is 1000 and the maximum is 200000 (both in KB/s per device). Although the maximum allows up to roughly 200MB/s, the minimum is so low that md keeps the resync speed well below what your drives can actually deliver whenever there is other I/O. To speed things up, we want to raise these numbers. To change them, you may run something like the following:

sysctl -w dev.raid.speed_limit_min=500000
sysctl -w dev.raid.speed_limit_max=5000000

Once done, you should see the resync speed go up immediately:

[root@172 ~]# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sdb5[1] sda5[0]
      1916378112 blocks super 1.2 [2/2] [UU]
      [===============>.....]  resync = 76.2% (1461787904/1916378112) finish=27.4min speed=276182K/sec
      bitmap: 5/15 pages [20KB], 65536KB chunk

md0 : active raid1 sdb1[1] sda1[0]
      1046528 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md2 : active raid1 sda2[0] sdb2[1]
      78576640 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md1 : active raid1 sdb3[1] sda3[0]
      4189184 blocks super 1.2 [2/2] [UU]

unused devices: <none>

One thing to keep in mind: if you set the values too high, like we did, it can put significant load on your system. If the load becomes unmanageable, decrease the minimum to something like 50000-100000.
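
If your kernel exposes md’s per-array sysfs knobs, you can also tune a single array instead of the whole system; writing the string “system” back to these files restores the global default. A sketch, using the md201 array from above:

echo 50000 > /sys/block/md201/md/sync_speed_min
echo 500000 > /sys/block/md201/md/sync_speed_max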

Making The Sysctl Value Permanent

As we set these kernel variables at runtime, they will revert to the defaults once the server is rebooted. If you want the values to persist, you need to put them in the /etc/sysctl.conf file. To make them persist, open the sysctl.conf file:

nano /etc/sysctl.conf

Add the following lines at the end of the file:

dev.raid.speed_limit_min = 500000
dev.raid.speed_limit_max = 5000000

Save the file, and run the following command:

sysctl -p

This applies the values immediately and keeps them in place after a reboot.
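
To verify the values after a reboot:

sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max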

How to: Set up a server for R1Soft CDP backup

We at Mellowhost have been using R1Soft CDP backup for the last 8 years. R1Soft has been a great backup tool even though it is immensely resource-hungry. At different times we have had to work through different situations to run our backup servers efficiently. After all the hiccups with backup nodes, we ended up with 3 backup servers of 3 different configurations:

  1. backup1 = It contains a 12TB file system on a RAID 0 array. It copies data to a BTRFS-compressed drive once a week to keep the data safe in case the RAID 0 dies. This server uses RAID 0 for faster drive verification and block scanning by R1Soft. It hosts servers that require frequent backups and can sustain the loss of a week of data (less important data). As the server performs very fast thanks to RAID 0, we can run multiple R1Soft threads at a time, including disk safe verification and block scans.
  2. backup2 = It contains a 30TB file system on a hardware RAID 6 array. This is used for hosting our VPS backups. It is a seriously large server that keeps backups for our enterprise VPS clients.
  3. backup3 = It contains a 16TB file system on a hardware RAID 10 array. This server is hosted in an East Coast US location. It is our off-network backup server and also keeps backups for our East Coast servers.

One of the key factors in designing a backup server is its size and location. Keep in mind that CDP 3 takes more space than CDP 2 for reasons unknown to us, even though it is still a differential backup solution, not just an incremental one. The location of the server matters because of network speed. If you host your backup server far from the source network, the initial replication can take a very long time; due to latency it may fail to come anywhere near 1Gbps even if both networks support it. For example, if you are backing up your data at 1MBps, it would take 12.13 days to complete a backup of 1TB of data [calculation: (((1024 x 1024) / 60) / 60) / 24 = 12.13 days]. A 100Mbps port can give you up to roughly 10MBps, while a 1Gbps network can give you 50MBps or more.

So why does the speed matter? Finishing your initial backup in 13 days doesn’t mean every backup will take that long; the second backup takes far less time because it only needs to upload the differences. That is true! But the problem comes when you need to do a bare metal restore. If your server requires disaster recovery, you would then need 13 days to restore it to its original state, and your customers won’t sit around for 13 days. While creating backups, it is important to think about disaster recovery too: how fast you can restore the backup is a key concern when designing a disaster recovery solution.
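
If you want to sanity-check estimates like this for your own numbers, a quick shell calculation works (a sketch using bc; swap in your own data size and transfer rate):

# 1TB expressed in MB, transferred at 1MBps, converted to days
echo "scale=2; (1024 * 1024) / 1 / 60 / 60 / 24" | bc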

I always recommend choosing a 1Gbps network with latency below 2ms if you want a good disaster recovery solution. That can guarantee a faster bare metal restore when needed.

The second key factor in building an R1Soft backup server is choosing the RAID level. If you are thinking of putting R1Soft backups on a non-RAID setup, I think you should drop the idea. RAID isn’t only used to keep your data safe; it can also be used for performance. Striping (RAID 0 or similar) is a must for an R1Soft server. Otherwise, you will regularly see a lot of stalled processes doing ‘disk safe verification’, ‘block scan’ and so on, backups failing to stay up to date, and processes being cancelled as duplicates (the old one taking too long to complete). It is better not to choose RAID 5. I haven’t tried RAID 5 in particular, but I have used RAID-Z on a ZFS file system, which was seriously slow for my workload. I later switched that server to RAID 0 with BTRFS compression for the weekly copy, which tremendously improved R1Soft performance. Later still, we built more backup servers with hardware RAID write-back cache and a battery backup unit to get more performance when creating and restoring backups. These servers have performed tremendously well with R1Soft and can fairly be called good disaster recovery nodes.

Last, I recommend you understand that backup isn’t just keeping a copy of your online data. It is important to design a disaster recovery solution, not just create backups. If you simply want backups, you probably don’t need R1Soft or any high-end servers; plain rsync would work fine. But to create a ‘disaster recovery’ solution, you need high-level planning, good hardware and good cost estimation. If you fall short on any of these, you will probably fail to create a disaster recovery solution that actually ‘works’.

Misconception about Server Specification!

“Oh! Damn! That server seems pretty cheap, giving me 12GB RAM and dual Intel Xeon 5430s, so let’s go and purchase it” – it is a big misconception in the current web hosting industry to judge a server specification by its CPU and RAM size alone. I have been spending a lot of time on WebHostingTalk and some other hosting forums lately and found people asking the same question every day: which server is right for me? Here I will go through why users’ intuition so often misleads them and how to judge a proper server specification.
