Server Boots to Grub – OVH Servers – How to Fix

Error Details

After a yum update, you saw that the kernel was updated, and you restarted the server to boot into the new kernel. But the server never came back online. Once you open the KVM or Serial Console (SOL) of the system, you find it has dropped to a ‘grub>’ console instead of booting from disk. How do you fix the system now?

Solution Intro

This specific issue can appear on any Linux server, for many reasons. But if you are running a server at OVH and have faced this situation, the route I am going to show should take you to the destination. Note that in many other cases of a similar situation, you may end up fixing grub with the same solution.

What and How the Problem Happened

OVH has an interesting boot strategy. They route everything through network PXE, even when the server is not a ‘netboot’ one but boots from local drives. For this to work, PXE needs to receive the latest grub details pushed once a kernel is updated. This is one reason why OVH also supplies a custom kernel from a custom repo. If you are using the stock kernel, you might end up in a situation where the latest grub hasn’t been pushed to PXE and your system fails to boot from its drives. It then drops you into the network’s ‘grub’ console.

How to Fix the Problem

Now one thing is clear: after you completed a kernel update, your grub is broken because the latest boot code is not available to the booting system. You can follow a regular Grub 2 repair method to fix the situation. A couple of things to remember. First, as your system’s grub is failing to load, you have to use an independent rescue kernel to do the repair; this could come from a personal network repository or from a rescue disk provided by your datacenter, as OVH has. Second, if you are using CentOS 7 or Ubuntu on a UEFI system with mdadm (Linux software RAID), it is highly likely that your /boot/efi sits on a non-RAID partition, usually the first partition of the first drive. You can always verify this from your fstab file.
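For illustration, the /boot/efi line in fstab usually looks something like the following sketch (the UUID and device here are made up; yours will differ):

# illustrative /mnt/etc/fstab entry; your UUID or device path will differ
UUID=ABCD-1234  /boot/efi  vfat  defaults  0  0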

So the first job is to boot your system into the rescue disk/CD/kernel. I assume you have done that without difficulty. Once done, mount your partitions first. In OVH’s case, the rescue system assembles mdadm automatically. In my case, the root array was /dev/md2.

mount /dev/md2 /mnt
# check what partition is used for /boot/efi
nano /mnt/etc/fstab
# in my case, it is /dev/nvme0n1p1 (it is an NVMe SSD, and the first partition is used for EFI storage)
mount /dev/nvme0n1p1 /mnt/boot/efi

Once the partitions are mounted successfully, you may chroot into the system. Before chrooting, bind mount dev, proc and sys into the corresponding /mnt directories:

mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys

If all of these go well, we can now chroot into the system:

chroot /mnt

Now you have successfully changed the root directory of the rescue kernel to the original drive’s root. All you need to do is remake the grub config, which will regenerate the grub.cfg file and sync the boot code:

# we know grub.cfg is available in /boot/grub2/grub.cfg
grub2-mkconfig -o /boot/grub2/grub.cfg
# once this finishes, we must make sure grub is also installed on both disks; in my case these are /dev/nvme0n1 and /dev/nvme1n1
grub2-install /dev/nvme0n1
grub2-install /dev/nvme1n1

If you see the response ‘No error reported’, then you are good to go. You may now reboot the system back to its hard disk and watch grub load the latest kernel from the original drive. Remember, for safety, you should unmount all the partitions first, to avoid any data loss from the OS page cache:

# exit from chroot
exit
# unmount dev, proc, sys, /mnt/boot/efi, /mnt (the bind mounts live under /mnt)
umount /mnt/dev
umount /mnt/proc
umount /mnt/sys
umount /mnt/boot/efi
umount /mnt

Happy troubleshooting!

How to Copy a Disk Over SSH?

First, let’s see why we would need to copy a disk over SSH at all. One reason is, of course, backup. For example, if you have a VM on an LVM volume and want to keep block-level backups, you would create a snapshot of the LVM volume and then copy the disk as an image to your backup server. The other reason is much the same but serves a different purpose: what if you want to migrate a VM that lives on an LVM volume, as a raw file, to another server? Or onto an LVM volume on another server? For those cases, this technique is pretty awesome.

Copy The Disk to a RAW Image

First, let’s learn how we can copy a disk as a raw image.

For example, you are sitting on your backup server, and you want to copy a disk, an LVM volume, or a single partition of a disk from a remote server, keeping it as an image file. You can run dd as follows from your backup server:

ssh [email protected] "dd if=/dev/vg0/v1092-kdkdksjuekksq" | dd of=/backup/v1092.raw

In our case, 10.10.10.10 is the IP of the server where the partition/disk currently resides. We are copying an LVM volume named /dev/vg0/v1092-kdkdksjuekksq; just replace this with your desired LVM volume. You may also do this for a whole disk like /dev/sda, in which case the command becomes:

ssh [email protected] "dd if=/dev/sda" | dd of=/backup/v1092.raw

Now that you have copied the disk/partition, you can inspect what it holds by mounting it and checking the data inside. To learn how to mount a raw disk image, you may check the following:

How to Mount raw VM disk images (KVM/Xen/VMware)

Copy The Disk To Another Remote Disk over SSH

You may either copy the disk directly to a secondary disk on the backup/migration server, or dump the image you copied earlier onto another disk/partition of the same size or bigger. To copy a disk /dev/sda from another server directly onto a secondary disk /dev/sdb on the backup server, do the following:

ssh [email protected] "dd if=/dev/sda" | dd of=/dev/sdb

This backs up /dev/sda from 10.10.10.10 and restores it to the /dev/sdb drive on the server that ran the command.

To dump the image file we copied in the earlier example onto /dev/sdb, do the following:

dd if=/backup/v1092.raw of=/dev/sdb

Hope this helps.

How to Install Let’s Encrypt in Cpanel

Let’s Encrypt is a popular way to get free SSL for your website. Cpanel ships with the free Sectigo SSL service, which works through a request-and-pooling system. If you would rather have certificates issued immediately, without a queue-based approach, Let’s Encrypt is the alternative.

There are two ways you may install Let’s Encrypt in Cpanel.

1. Using Cpanel Plugin

The first option uses the plugin created by Cpanel. Log in to your server as root:

ssh root@server_ip

Then run the following to install Let’s Encrypt in your Cpanel system:

/usr/local/cpanel/scripts/install_lets_encrypt_autossl_provider

It might take a couple of minutes, then it should install Let’s Encrypt as a provider in AutoSSL.

Now go to WHM >> Manage AutoSSL and select Let’s Encrypt as the provider instead of the Sectigo Cpanel default. You need to accept the agreement terms under the Let’s Encrypt selection, and you can create the Let’s Encrypt account with the same tool.

Once done, your new SSLs will be issued by Let’s Encrypt through the Cpanel AutoSSL plugin.

2. Using FleetSSL

There is a third-party tool that existed before Cpanel provided a plugin for Let’s Encrypt: FleetSSL. One key benefit of FleetSSL is that it lets Cpanel end users issue and renew SSL certificates from Cpanel themselves. One key drawback is that it is not free of charge: it comes with a $30 one-time fee. Most hosting providers would not mind, though, as it is a nice addition to the end-user feature set from a provider’s point of view.

You may check for details here:

https://letsencrypt-for-cpanel.com/

Once you complete installing Let’s Encrypt, you may use it for different Cpanel services such as webmail, Cpanel, WHM, calendars and the MTA services.

How to Fix zmconfigd failed in Zimbra – Starting zmconfigd…failed.

Sometimes, when you restart Zimbra, you see that zmconfigd does not start or reports failure. You may also see the zmconfigd service not running in the Zimbra admin panel. There are a couple of common reasons why zmconfigd fails to start.

Disable IPv6

One reason zmconfigd fails to start is IPv6: for some reason it fails to route over IPv6 and will not start. A quick solution is to disable IPv6 and restart zmconfigd. You may do this like the following:

#Edit your sysctl.conf file
nano /etc/sysctl.conf

# paste the following inside the file
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

# Save the file, and update sysctl in realtime
sysctl -p

# now try to restart zmconfigd
su - zimbra
zmconfigdctl restart

Now you can check the zmconfigd status with the following, to know whether it is running:

[root@mailapp ~]# cat /opt/zimbra/log/zmconfigd.pid
19722

If it returns a PID, zmconfigd is most likely running.
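To be extra sure the pid file is not stale, you can check that PID against the process table. A small sketch:

# prints the process line only if the PID from the pid file is alive
ps -p $(cat /opt/zimbra/log/zmconfigd.pid) -o pid,cmd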

Netcat is not installed

Another reason for the error could be that nc is not installed on your system. Zimbra’s zmconfigd depends on the netcat package, which is available as nmap-ncat on CentOS systems. You may run the following to install it:

yum install nc
# or
yum install nmap-ncat

[ERROR] Can’t open and lock privilege tables: Table ‘mysql.servers’ doesn’t exist in engine – Resolution

There are times you may see the following error on your MySQL/MariaDB based Cpanel server:

[ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist in engine

The issue is most likely that your InnoDB tablespace got corrupted, and hence some tables under the mysql database are locked out, as some of them use the InnoDB storage engine. One visible symptom: if you try to add a user to a database in the Cpanel MySQL Databases section, it no longer adds the user or shows the green notification. Instead it just stops.

The best way to fix this properly is to restore the ‘mysql’ database, or just the ‘servers’ table, from your backup. If you don’t have one, you may recreate the ‘servers’ table using the following SQL statement:

CREATE TABLE `servers` (
`Server_name` char(64) NOT NULL,
`Host` char(64) NOT NULL,
`Db` char(64) NOT NULL,
`Username` char(64) NOT NULL,
`Password` char(64) NOT NULL,
`Port` int(4) DEFAULT NULL,
`Socket` char(64) DEFAULT NULL,
`Wrapper` char(64) NOT NULL,
`Owner` char(64) NOT NULL,
PRIMARY KEY (`Server_name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

You may need to drop the remnants of the old table first, as in the sketch below. If you can’t do that either, then there is only one way left: uninstall your MariaDB installation and let Cpanel/WHM reinstall it for you.
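A minimal sketch of that drop (it may itself fail if the data dictionary is too damaged, in which case continue with the reinstall route below):

USE mysql;
-- remove the corrupted remnants before recreating the table
DROP TABLE IF EXISTS servers;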

Get a Backup First:

cp -Rf /var/lib/mysql/mysql /root/
rm -Rf /var/lib/mysql/mysql

Uninstall MariaDB:

yum remove MariaDB*

Now you may install the latest MariaDB from WHM >> MariaDB/MySQL Upgrade and proceed accordingly. This should give you a fresh install with a fresh ‘mysql’ database. It will not alter your other data files, so your other databases should be fine.

One thing to remember: after a fresh mysql installation on top of the old data files, the authorizations will be missing. You will have to recreate the database users manually to fill the privilege tables back up.
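As a sketch, recreating one user and re-granting access looks like the following (the user, password and database names here are hypothetical; on a Cpanel server, prefer re-adding users through the Cpanel MySQL Databases UI so its internal mapping stays in sync):

-- hypothetical names; replace with your actual user and database
CREATE USER 'myuser'@'localhost' IDENTIFIED BY 'strong_password';
GRANT ALL PRIVILEGES ON mydb.* TO 'myuser'@'localhost';
FLUSH PRIVILEGES;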

How to Recover Innodb Table when ib_logfile / ibdata is/are crashed/deleted/lost without backup

If you are here, you have probably panicked the same way I did around 12 years back. I lost ib_logfile0, ib_logfile1 and ibdata1 all at once on a server that heavily used InnoDB tables. Today I had to recover vital data from the same situation, on a request from someone with no backups, and thought it better to keep this as a document for the future.

One key reason to use InnoDB tables instead of MyISAM is write performance: InnoDB consistently outperforms MyISAM on writes thanks to its more efficient buffering. But this also makes InnoDB more vulnerable to crashes. InnoDB keeps some sensitive data in three specific files, and losing them also loses the mapping the engine needs to recognize InnoDB table structure and data.

Who can follow this technique?

You can, if you have lost any of ib_logfile0, ib_logfile1, ibdata1, or all of them, but still have the database folder intact with its .frm and .ibd files (which you will, if you only deleted the log files or the data file accidentally), and you have NOT disabled the ‘innodb_file_per_table’ option in your mysql configuration. This option is enabled by default, unless you explicitly disabled it to increase performance. A suggestion: only do that if you keep real-time backups of your databases; otherwise it is better left enabled.

What is ‘innodb_file_per_table’?

Primarily, InnoDB stores and uses tablespace data from the system tablespace. As that creates a single point of failure in ibdata and the log files, InnoDB by default also stores each table’s tablespace in the table’s own data file, the .ibd file. That means, if I lose the ibdata/logfile mappings, I can still use the .ibd file to restore my tablespace and redo the schema-to-data mapping, provided I allowed InnoDB to keep this information in the table’s own .ibd file. You may read more about the parameter in the MySQL documentation:

File-Per-Table Tablespaces at dev.mysql.com

How to Recover an Innodb Table from database files only?

There are two steps to this process: first, identify and recover the table schema from the .frm file; second, import the tablespace from the .ibd file and introduce it to the InnoDB engine’s system tablespace.

First Step First: How to get the schema from .frm files?

First, you must install the mysql-utilities tools to get access to the mysqlfrm tool.
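On CentOS, the install is usually a one-liner, though the package source varies (mysql-utilities shipped via the MySQL community repos; treat this as a sketch):

yum install mysql-utilities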

Once that is done, you have two ways to read .frm files with mysqlfrm. My favorite is the ‘--diagnostic’ mode. To use it, run the following:

mysqlfrm --diagnostic /var/lib/mysql/your_database/assets.frm

I am assuming your database name is ‘your_database’ and the table you are trying to recover is ‘assets’. The above command returns the ‘CREATE TABLE’ schema you need. First create a new database, then run that schema on the SQL console to create the table in the new database.

Second Step: Get your data and mapping back from .ibd to system tablespace

Once the new database has the table, it will also have a fresh .frm and .ibd file. What we need to do is: first make InnoDB forget the .ibd file it just created, then sync in the .ibd file from our collapsed database, and finally make the InnoDB engine recognize the tablespace backed up inside that .ibd file and store and use it from the system tablespace. That sounds complex, and it is a little. No worry, let’s do it.

Run the following command first to make it forget the .ibd file it has just created:

alter table assets discard tablespace;

Remember, our table name is ‘assets’; if yours differs, replace it accordingly. What this has done is remove the assets.ibd file that was created in the /var/lib/mysql/new_database/ folder, as we asked InnoDB to forget the existing .ibd file. Now we need to copy the backup/old .ibd file to this location with the correct permissions. I use rsync to make sure the permissions remain intact:

rsync -vrplogDtH /var/lib/mysql/your_database/assets.ibd /var/lib/mysql/new_database/

Once this is done, the .ibd file in place contains a backup of our original tablespace. We only need to make mysql and InnoDB recognize it. To achieve this, run the following from the SQL console:

alter table assets import tablespace;

If it throws a warning about not being able to find the .cfg file, you can ignore it; a .cfg file is not essential for recognizing the tablespace.

If everything runs well, you should see your rows are back: InnoDB has now fetched your tablespace data from the .ibd file into the system tablespace and can recognize the mapping to your data. Voila! All you need now is to repeat the process for all of your InnoDB tables and recover the whole database.
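To repeat the process over many tables, a small loop helps. This is a minimal sketch only: it assumes every table has already been created in new_database from its mysqlfrm schema, and that mysql can authenticate via ~/.my.cnf; adjust the names and paths to your case.

#!/bin/bash
# for each .ibd of the damaged database: discard the fresh tablespace,
# copy the old .ibd over with permissions intact, then re-import it
for f in /var/lib/mysql/your_database/*.ibd; do
    t=$(basename "$f" .ibd)
    mysql new_database -e "ALTER TABLE \`$t\` DISCARD TABLESPACE;"
    rsync -vrplogDtH "$f" /var/lib/mysql/new_database/
    mysql new_database -e "ALTER TABLE \`$t\` IMPORT TABLESPACE;"
done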

How To Run a Command in All OpenVZ Containers

You can run a single command in a container using the following:

vzctl exec 201 service httpd status

To list all the VZ containers:

vzlist -a

The other way? Yes, there is one. The VZ list is stored in the file /proc/vz/veinfo, and with a bit of shell we can use it to run a command in each container, as follows:

for i in `cat /proc/vz/veinfo | awk '{print $1}'|egrep -v '^0$'`; \
do echo "Container $i"; vzctl exec $i <your command goes here>; done

An example can be the following:

for i in `cat /proc/vz/veinfo | awk '{print $1}'|egrep -v '^0$'`; \
do echo "Container $i"; vzctl exec $i service httpd status; done

This should show the httpd status for every container.

How To Send Email From an IP without Authentication – Cpanel/WHM

Since antirelayed was removed by the Cpanel team from the latest Cpanel, this situation might arise for some people; at least it did for me. I had a trusted IP that sends mail through the server without authentication. Now, how do you allow this with the latest Cpanel/WHM?

Well, Cpanel still keeps a facility called ‘alwaysrelay’. It existed alongside antirelayed: antirelayed used to allow relaying from an IP without authentication for a specific period of time, while ‘alwaysrelay’ allows relaying all the time.

All you need to do is add the IP, on a new line, to the following file:

/etc/alwaysrelay

and restart Exim:

service exim restart
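As a one-shot sketch (203.0.113.10 is a made-up example IP; use your trusted sender’s address):

# append the trusted IP on its own line, then restart Exim
echo "203.0.113.10" >> /etc/alwaysrelay
service exim restart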

That should be it. Remember, you might still see Exim reporting that cpaneleximscanner found your email to be spam. In that case, go to WHM >> Service Configuration >> Exim Configuration Manager and set the following option to ‘Off’: Scan outgoing messages for spam and reject based on the Apache SpamAssassin™ internal spam_score setting

and Save. Now when you test again, it should work.

How To: Restore Zimbra Quarantined Email by Clam AKA Heuristics.Encrypted.PDF Release Point

Zimbra Mail Server automatically quarantines emails that are flagged by the ClamAV antivirus scan on receipt. Instead of delivering the original email with its attachment to the recipient’s inbox, it delivers a virus-detected notice with the following kind of error message:

Virus (Heuristics.Encrypted.PDF) in mail to YOU

Virus Alert
Our content checker found
virus: Heuristics.Encrypted.PDF

by Zimbra

It actually means the original mail is now quarantined. Zimbra maintains a virus quarantine email account that is not normally visible in the ‘Manage Accounts’ list of the Zimbra admin panel; you can find it by searching for ‘virus’ in the admin panel’s ‘Search’ box. In a quarantine situation, Zimbra pushes the mail to this quarantine account instead of the original recipient.

Now, to get the mail delivered to the original recipient, we first need to find the quarantine email account, get the message file, and then inject the mail into the LMTP pipe, which bypasses any scanning. Here are the steps:

# First get to the zimbra user
$ su - zimbra

# Get the email account that is used to store virus detected mails
$ zmprov gcf zimbraAmavisQuarantineAccount
zimbraAmavisQuarantineAccount: [email protected]

# [email protected] this should be our quarantine email account, now we need to get the quarantine account's mailbox id
$ zmprov gmi [email protected]
mailboxId: 73
quotaUsed: 644183

# Mailbox id here for the quarantine account is 73. Now go to the message storage of this id using the following command: cd /opt/zimbra/store/0/<mailboxId>/msg/0
$ cd /opt/zimbra/store/0/73/msg/0

# list the messages
$ ls *

These are your quarantined emails. Suppose the affected recipient is ‘[email protected]’. To search for the messages destined for that account, you can use the following:

$ grep -l [email protected] *
281-1216.msg
300-1400.msg
301-1476.msg

This should return all the emails that were quarantined for the above user.

Now the question is how to get these emails delivered to the designated user while bypassing the antivirus/antispam tools. To do that, you inject the mail into the LMTP pipe, using the ‘zmlmtpinject’ command as follows:

$ zmlmtpinject -r [email protected] -s [email protected] 281-1216.msg

Remember to change [email protected] to the original recipient. [email protected] is the rewritten sender for this delivery, and ‘281-1216.msg’ is the file name of the original email you found with the grep command. Each command injects a single message, so you would need to repeat this for every quarantined email.
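If one recipient has many quarantined files, a small loop saves typing. A minimal sketch reusing the grep from above (run it from the same msg/0 directory, as the zimbra user):

# inject every quarantined message found for this recipient
for f in $(grep -l [email protected] *); do
    zmlmtpinject -r [email protected] -s [email protected] "$f"
done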

How To: Send Email Alert on Different HAProxy Status or Based on HAProxy Stats

HAProxy is a great, simple load balancing tool written in C, with Lua scripting support in recent releases. It is extremely efficient as a software load balancer and highly configurable as well. On the other hand, HAProxy lacks programmable automated monitoring tools. It has a ‘mailers’ directive, but only in newer releases; the default CentOS 7 repo ships HAProxy 1.5, which has no mailer alert support. And even where mailers are available, they offer few configuration options and no real programmability.

That is why I thought of triggering code from HAProxy stats. This can be done in many ways; in my case, I did it with a per-minute cron. If you want it faster, say every 5 seconds, you would run it as a daemon instead, which isn’t rocket science; it should be easy and short. The whole idea is to show you how to build programmable third-party tooling by fetching data from the HAProxy socket and triggering monitors from it.
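For reference, the per-minute cron entry can be as simple as the following (assuming you save the script shown later in this post as /root/haproxy_alert.sh, a path I am making up here):

* * * * * /root/haproxy_alert.sh >/dev/null 2>&1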

HAProxy Stats through Unix Socket

First, we need to enable HAProxy stats over a socket. To turn on stats through a unix socket, put the following line in the global section of your haproxy.cfg file:

stats socket /var/lib/haproxy/stats

An example global section would look like the following:

global
    log         127.0.0.1 local2     #Log configuration

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     40000
    user        haproxy             #Haproxy running under user and group "haproxy"
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

Note how the stats socket line is placed, and fit it into your own cfg file accordingly.

Once this is done, restart HAProxy and it will start serving stats through the socket. The output is essentially a CSV version of the HAProxy stats page, so the values come in comma-separated format. To understand the HAProxy stats page and what each value means, you can visit the following:

https://www.haproxy.com/blog/exploring-the-haproxy-stats-page/

Now, how do we read the unix socket from bash? There is a tool called ‘socat’ (short for ‘socket cat’) that can read data from a unix socket; you may read more details about ‘socat’ here:

https://linux.die.net/man/1/socat

If socat is not available in your CentOS yum repos, you can get it from epel-release:

yum install epel-release
yum install socat

Once ‘socat’ is available on your system, you can use it to redirect the IO and print the stats as follows:

echo "show stat" | socat unix-connect:/var/lib/haproxy/stats stdio

As you can see, you can retrieve the whole HAProxy stats in CSV format, so you can easily manipulate the data from a shell script. I have created a basic shell script that checks the status of the HAProxy backends and sends an email alert using ‘ssmtp’. Remember, ssmtp is a highly configurable mail tool, and you can set up SMTP authentication with it as well. You could equally use another tool such as Sendmail, or curl against an email API like Sendgrid; the possibilities are endless. And since the data hits the socket as soon as HAProxy generates the event, this can be just as responsive as HAProxy’s built-in facilities like ‘mailers’.

#!/bin/bash

cd /root/
rm -f haproxy_stat.txt
# dump the stats CSV and keep only the rows for our backend, app-main
echo "show stat" | socat unix-connect:/var/lib/haproxy/stats stdio | grep app-main > /root/haproxy_stat.txt

SEND_EMAIL=0
MESSAGE=""

while IFS= read -r line
do
    # CSV fields: 2 = service name, 18 = status, 34 = session rate
    APP=$(echo "$line" | cut -d"," -f2)
    STAT=$(echo "$line" | cut -d"," -f18)
    SESSION=$(echo "$line" | cut -d"," -f34)
    if [ "$STAT" != "UP" ]; then
        SEND_EMAIL=1
        MESSAGE+="$APP $STAT $SESSION
"
    fi
done < /root/haproxy_stat.txt

if [ $SEND_EMAIL -eq 1 ]; then
    echo -e "Subject: Haproxy Instance Down \n\n$MESSAGE" | sudo ssmtp -vvv [email protected]
fi

The data I am interested in is each backend’s status, its name, and its session rate. So if the load balancer sees any backend down, the script triggers an email. You can use the same approach to catch anything in HAProxy: a frontend attack, the delivery behavior of your load balancer, and so on. Since you can now pull data straight from HAProxy into your own ‘programming’ console, you can program it any way you want. Hope this helps somebody! For any help, shoot a comment!