Troubleshoot – Page 2 – Mellowhost Blog

Troubleshoot: Server IP address could not be found

I had a client ticket today, with the following screenshot:

The error says ‘server IP address could not be found’. This error means that DNS resolution failed. There are three possible causes:

The client hasn’t updated the DNS nameservers for the domain.
The host’s DNS server is down.
The client’s DNS resolver isn’t working.

To check whether the client has set the proper DNS nameservers, you can use intodns.com. It will also tell you whether the host’s DNS is down. If both are fine, check whether you can load other domains over your Internet connection; if not, the problem lies with the local DNS resolver on your desktop or at the ISP. In my case, the client had failed to update the nameservers of the domain. All that was required was to update the domain’s nameservers to the server’s.
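If you prefer the command line to intodns.com, the same nameserver check can be sketched with dig. This is only a rough sketch: the function names are my own, and example.com and ns1/ns2.example.net are placeholders for the real domain and the nameservers your host gave you.

```shell
# Nameservers the resolver currently returns for a domain:
# one per line, lower-cased, without the trailing dot.
delegated_ns() {
    dig +short NS "$1" | sed 's/\.$//' | tr 'A-Z' 'a-z' | sort
}

# Succeeds only if every expected nameserver is in the delegation.
# usage: ns_matches DOMAIN NS1 [NS2 ...]
ns_matches() {
    domain=$1; shift
    actual=$(delegated_ns "$domain")
    for ns in "$@"; do
        printf '%s\n' "$actual" | grep -qxF "$ns" || return 1
    done
}

# Example (placeholder names):
#   ns_matches example.com ns1.example.net ns2.example.net \
#       && echo "nameservers are set" \
#       || echo "client still has to update the nameservers"
```

If delegated_ns prints nothing at all, either the domain has no delegation yet or your own resolver is the one that is failing, which is the third case in the list above.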

Troubleshoot: fatal: open lock file /var/lib/postfix/master.lock: unable to set exclusive lock

Error Message & Trace details:

One of my customers came with an error saying that Postfix on his server wasn’t working. The server was running CentOS 7, and the system reported the postfix service as inactive, meaning not running, although I could see that the mail queue was still running. The error returned while restarting or checking the status was the following:

# service postfix status
Redirecting to /bin/systemctl status postfix.service
● postfix.service - Postfix Mail Transport Agent
   Loaded: loaded (/usr/lib/systemd/system/postfix.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2018-01-09 04:04:05 UTC; 1s ago
  Process: 9201 ExecStart=/usr/sbin/postfix start (code=exited, status=1/FAILURE)
  Process: 9197 ExecStartPre=/usr/libexec/postfix/chroot-update (code=exited, status=0/SUCCESS)
  Process: 9194 ExecStartPre=/usr/libexec/postfix/aliasesdb (code=exited, status=0/SUCCESS)
 Main PID: 1358 (code=killed, signal=TERM)

Jan 09 04:04:03 twin7.hifrank.biz systemd[1]: Starting Postfix Mail Transport Agent...
Jan 09 04:04:03 twin7.hifrank.biz postfix/master[9273]: fatal: open lock file /var/lib/postfix/master.lock: unable to set exclusive lock: Resource tempo...vailable
Jan 09 04:04:04 twin7.hifrank.biz postfix/master[9272]: fatal: daemon initialization failure
Jan 09 04:04:05 twin7.hifrank.biz postfix/postfix-script[9274]: fatal: mail system startup failed
Jan 09 04:04:05 twin7.hifrank.biz systemd[1]: postfix.service: control process exited, code=exited status=1
Jan 09 04:04:05 twin7.hifrank.biz systemd[1]: Failed to start Postfix Mail Transport Agent.
Jan 09 04:04:05 twin7.hifrank.biz systemd[1]: Unit postfix.service entered failed state.
Jan 09 04:04:05 twin7.hifrank.biz systemd[1]: postfix.service failed.

How to fix:

The error to note here is the following:

fatal: open lock file /var/lib/postfix/master.lock

I first killed the smtp and smtpd processes run by Postfix:

# killall -9 smtp
# killall -9 smtpd

But that didn’t solve the problem. I then used the fuser command to check which process holds the lock file:

# fuser /var/lib/postfix/master.lock
/var/lib/postfix/master.lock: 18698

Then we check process 18698 and kill the responsible process:

# ps -axwww|grep 18698
9333 pts/0 S+ 0:00 grep --color=auto 18698
18698 ? Ss 4:28 /usr/libexec/postfix/master -w
# killall -9 /usr/libexec/postfix/master
or
# kill -9 18698

Once the process is killed, you can start Postfix:

# service postfix start
# service postfix status|grep Active
Redirecting to /bin/systemctl status postfix.service
Active: active (running) since Tue 2018-01-09 04:15:50 UTC; 4min 45s ago
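The fuser, ps and kill steps above can be rolled into a small helper. This is a sketch, not what I ran on the day: the function names are made up, and unlike the plain kill -9 above it tries a normal SIGTERM first and only falls back to SIGKILL if the process is still alive.

```shell
# Print the PIDs holding a file open, one per line.
# fuser writes the PIDs to stdout and everything else to stderr.
lock_holders() {
    fuser "$1" 2>/dev/null | tr -s ' ' '\n' | grep -E '^[0-9]+$'
}

# Ask each holder to exit with SIGTERM; use SIGKILL only if it is
# still alive after a short wait.
release_lock() {
    for pid in $(lock_holders "$1"); do
        ps -o pid=,args= -p "$pid"      # show what is about to be killed
        kill "$pid" 2>/dev/null
        sleep 2
        if kill -0 "$pid" 2>/dev/null; then
            kill -9 "$pid"
        fi
    done
    return 0
}

# Usage (as root):
#   release_lock /var/lib/postfix/master.lock
#   service postfix start
```

Printing the command line with ps before killing is deliberate: if the holder turns out to be something other than /usr/libexec/postfix/master, you want to see that before it is gone.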

Troubleshoot: killall command not found in centos 7

Problem:

If you are using a CentOS 7 minimal installation and try to kill a process by name using the ‘killall’ command, you are most likely going to see this error:

# killall -9 php
killall: command not found

The error appears because killall belongs to the psmisc package, which the CentOS 7 minimal installation does not include. pkill, from the procps-ng package, is available by default. pkill is more versatile, but it can also kill processes by name just as killall does.

How to kill process by name in Centos 7

You can use pkill. pkill is a simple command. Its syntax is as follows:

# pkill processname

For example, if you want to kill all the php processes, run:

# pkill php

Note: pkill treats the name as a pattern, so this kills every process whose name matches php (php-fpm as well as php, for example). To list the processes that pkill is going to kill, use pgrep as follows:

# pgrep -l php
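Because pkill matches a pattern, it is worth comparing the pattern match with an exact-name match before sending any signal. A small sketch (preview_kill is a name I made up; -l and -x are standard pgrep/pkill options):

```shell
# Show what a pattern match and an exact-name match would hit.
# usage: preview_kill NAME
preview_kill() {
    echo "pkill $1 would signal:"
    pgrep -l "$1"
    echo "pkill -x $1 would signal:"
    pgrep -l -x "$1"
}

# Example:
#   preview_kill php
#   pkill -x php      # only processes named exactly "php"
```

If the two lists differ and you only meant the exact name, use pkill -x instead of plain pkill.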

How to use killall in Centos 7

If you would rather not use pkill and want to keep using killall on CentOS 7, that is also possible. killall is part of the psmisc yum package. All you have to do is install psmisc on your system using yum:

# yum install psmisc
# killall -9 php

Troubleshooting: Imunify360 database is corrupt. Application cannot run with corrupt database

Error Message:

# service imunify360 start
Starting imunify360: WARNING [+ 3743ms] defence360agent.utils.check_db|DatabaseError detected: database disk image is malformed
WARNING [+ 3766ms] defence360agent.cli.subparsers.common.server|Imunify360 database is corrupt. Application cannot run with corrupt database. Please, contact Imunify360 support team at https://cloudlinux.zendesk.com

Detail Information & Explanation:

If you are using Imunify360, an application firewall for Linux servers by the CloudLinux team, you might run into an error saying that the database is corrupt. You might first see an ‘Imunify360 is not started’ error in the WHM panel and then end up with the error message above. Imunify360 uses an SQLite database, located at ‘/var/imunify360/imunify360.db’. This database file is checked every time Imunify360 tries to start, and if the database is malformed, the service will not start. Fortunately, Imunify360 comes with tools to check this database and recover it if it is corrupted.

How to Fix:

First, run a database integrity check. This can be done using the following:

imunify360-agent checkdb

(From Imunify360 Doc: checkdb – Check database integrity)

Once done, you can use ‘migratedb’ to repair and restore the database if it is corrupted.

imunify360-agent migratedb

(From Imunify360 Doc: migratedb – Check and repair database if it is corrupted.)

If migratedb fails, the only way to recover this is to reinstall imunify360.
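Before letting any repair tool touch the file, I would keep a copy of the corrupt database, in case support asks for it or the repair makes things worse. This is my own precaution rather than something from the Imunify360 docs, and backup_file is a hypothetical helper name:

```shell
# Copy a file next to itself with a timestamp suffix and print the
# path of the copy.
backup_file() {
    src=$1
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage (as root), before attempting the repair:
#   backup_file /var/imunify360/imunify360.db
#   imunify360-agent checkdb
#   imunify360-agent migratedb
```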

Linux: Assertion failed on job for iptables.service.

If you are using CentOS 7, RHEL 7 or any of their variants, you are probably using ‘firewalld’ by default. However, if you are an iptables fan like me, who likes its simplicity and how easily it can be manipulated compared with a full-fledged firewall, then you have probably disabled firewalld on your CentOS 7 instance and are using iptables. There are a couple of servers where I use runtime iptables rules for postrouting and masquerading. These rules are generated dynamically by my scripts instead of being loaded from the sysconfig file:

/etc/sysconfig/iptables

This file is generated upon running the iptables save command:

service iptables save

which I rarely do.

Error Details

That is why I don’t have an /etc/sysconfig/iptables file on those servers, and a common error I see while restarting iptables on those systems is the following:

# systemctl restart iptables.service
Assertion failed on job for iptables.service.

How to Fix The Error

The error appears because /etc/sysconfig/iptables either doesn’t exist or contains no rules. You can ignore the error, as iptables will still run. To get rid of the error, first make sure you have some iptables rules loaded on your system by listing them:

iptables -S

And then, run:

service iptables save

Once done, restarting iptables shouldn’t show the error any longer.
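The two steps can be put in a script so that the save only happens when it is needed. A minimal sketch with made-up function names; note that it counts rules with iptables-save rather than iptables -S, because iptables -S on its own only lists the filter table, while my runtime rules live in the nat table:

```shell
RULES_FILE=/etc/sysconfig/iptables

# Succeeds when the saved-rules file is missing or empty.
rules_file_missing() {
    [ ! -s "$1" ]
}

# Count rule lines ("-A ...") in iptables-save style output on stdin.
count_rules() {
    grep -c '^-A'
}

# Save the running rules only if nothing is saved yet and there is
# at least one rule loaded in any table.
save_if_needed() {
    if rules_file_missing "$RULES_FILE" &&
       [ "$(iptables-save | count_rules)" -gt 0 ]; then
        service iptables save
    fi
}

# Usage (as root):
#   save_if_needed && systemctl restart iptables.service
```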

SMTP Error: 550 Please turn on SMTP Authentication in your mail client – IP is not permitted to relay through this server without authentication

We had a customer complaining about a commonly seen error of the following type:

550 Please turn on SMTP Authentication in your mail client. mail-pf0-f172.google.com [209.85.192.172]:38632 is not permitted to relay through this server without authentication.

Diagnostic-Code: smtp; 550-Please turn on SMTP Authentication in your mail client.
550-mail-pf0-f172.google.com [209.85.192.172]:38632 is not permitted to relay
550 through this server without authentication.

reason: 550-Please turn on SMTP Authentication in your mail client.
550-mout.kundenserver.de [212.227.17.24]:49392 is not permitted to relay
550 through this server without authentication.

These are all basically the same error. It is a common one, and the solution usually looks simple: enabling ‘SMTP Authentication’ in Outlook or whichever mail client is in use should solve the problem. Interestingly, though, the client was smart and hadn’t made any mistake with SMTP authentication. The error was actually showing up when someone tried to send mail to him (with his server as the receiving SMTP server). We then dug into the error further.

There is something we need to remember. SMTP delivery is not only authenticated with a username and password; it also goes through DNS-based authentication checks. If your DKIM/DomainKeys/SPF/DMARC records do not match what the mail server advertises, the mail can be denied with the same type of error (error code 550). We then realized that the customer’s account had been transferred earlier from a different server and the old DomainKeys records were still in its DNS zone file. As DomainKeys are RSA keys generated per server, it is important to regenerate the keys after a server change. Otherwise, the check against the old key in DNS can trigger the 550 error from the receiving relay. We deleted the old DomainKeys records, generated new ones for the customer, and the error went away.
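To see which keys the outside world actually gets from DNS, you can query the records directly and compare them with what the current server has generated. A rough sketch with my own function names; the selector ‘default’ in the usage lines is an assumption, so check the s= tag in the DKIM-Signature header of a real message for yours, and example.com is a placeholder:

```shell
# DNS name under which a DKIM public key is published.
# usage: dkim_name SELECTOR DOMAIN
dkim_name() {
    printf '%s._domainkey.%s\n' "$1" "$2"
}

# Print the DKIM TXT record currently published for a selector.
# usage: published_dkim SELECTOR DOMAIN
published_dkim() {
    dig +short TXT "$(dkim_name "$1" "$2")"
}

# Print the SPF record (the TXT record containing v=spf1).
# usage: published_spf DOMAIN
published_spf() {
    dig +short TXT "$1" | grep -i 'v=spf1'
}

# Example (placeholder domain, assumed selector):
#   published_dkim default example.com
#   published_spf example.com
```

If the p= value printed by published_dkim differs from the public key on the new server, the zone is still carrying the old server’s key.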

phpMyAdmin Coming Blank in Cpanel

One of our customers reported an issue with phpMyAdmin earlier today. He was getting a blank phpMyAdmin page that only said “Welcome to phpMyAdmin”.

Once I hopped onto SSH and checked the cPanel error log file located at

/usr/local/cpanel/logs/error_log

I observed the following error:

PHP Fatal error: require_once(): Failed opening required './libraries/display_select_lang.lib.php' (include_path='/usr/local/cpanel/3rdparty/php/56/lib/php:.') in /usr/local/cpanel/base/3rdparty/phpMyAdmin/libraries/plugins/auth/AuthenticationCpanel.php on line 147

The error was peculiar because display_select_lang.lib.php wasn’t present in any of the other cPanel phpMyAdmin source trees I searched. Then I realized that the error is raised from “AuthenticationCpanel.php”, which usually means that cPanel authentication against MySQL has failed: the cPanel password wasn’t synced with MySQL.

Go to WHM >> Password Modification and select the user. If WHM shows the ‘Sync with MySQL Password’ option, the MySQL password is out of date with respect to cPanel and requires syncing (NB: if the password doesn’t require syncing, this option won’t be there). Reset the password with the option to sync the new password with MySQL checked. That should restore phpMyAdmin.

What is Kondemand? Why do I see a lot of Kondemand process in my process list?

Question: What is Kondemand? Why do I see a lot of Kondemand process in my process list?

Answer: kondemand is the kernel thread used for automatic CPU frequency scaling on multi-core Linux systems. It automatically lowers the CPU clock speed to reduce power usage when the CPU is not busy. This is controlled through the scaling_governor setting available on Linux. To see whether your scaling_governor is set to ‘ondemand’, you may use the following command:

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

If your CPUs show the ‘ondemand’ scaling governor, then the kondemand kernel thread is active and will reduce your CPU clock speed on the fly to save power. You can change this setting to ‘performance’ on the fly using the following small piece of shell code:

for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do [ -f "$CPUFREQ" ] || continue; echo -n performance > "$CPUFREQ"; done

There is a Linux service called cpuspeed, which can set your scaling governor back to ondemand after a reboot. You may stop it and disable it at boot:

service cpuspeed stop
chkconfig cpuspeed off

You can check that your CPU speed is back to its full clock through the proc filesystem (look at the ‘cpu MHz’ lines):

cat /proc/cpuinfo
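On a box with many cores the raw cat output gets long, so a per-governor count is easier to read. A small sketch; governor_summary is a name I made up, and the optional argument only exists so that the function can be pointed at a different directory than the real sysfs path:

```shell
# Count CPUs per scaling governor under a sysfs-style CPU directory.
# usage: governor_summary [DIR]   (default: /sys/devices/system/cpu)
governor_summary() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu*/cpufreq/scaling_governor; do
        if [ -f "$f" ]; then
            cat "$f"
        fi
    done | sort | uniq -c
}

# Usage:
#   governor_summary
#   grep 'cpu MHz' /proc/cpuinfo
```

After switching to performance, every CPU should be counted under a single ‘performance’ line; anything still listed as ondemand was either skipped or has been reset by cpuspeed.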

[Tue Dec 19 20:40:07.097202 2017] [lsapi:error] [pid 532140:tid 139848266454784] [client IP.IP.IP.IP:25021] mod_lsapi: [host domain.com] [req GET / HTTP/1.1] Could not connect to lsphp backend: connect to lsphp refused: 111 (possibly memory limit for LVE ID 1789 too small), referer: domain.com

If you are running CloudLinux, CageFS and LSAPI with cPanel, you are probably familiar with this error. If the error appears for only one or two sites, it is probably because the user is hitting the virtual memory (VM) or physical memory (PM) limit you have set through CloudLinux. But if the error appears for all sites, it is because CageFS is failing with the suexec permissions for some reason.

[Tue Dec 19 20:40:07.097202 2017] [lsapi:error] [pid 532140:tid 139848266454784] [client *:25021] mod_lsapi: [host *] [req GET / HTTP/1.1] Could not connect to lsphp backend: connect to lsphp refused: 111 (possibly memory limit for LVE ID 1789 too small), referer: http://*/

One way to solve the problem is to remount all the users. Sometimes that doesn’t work, and you may need to reinitialize CageFS:

cagefsctl --remount-all
cagefsctl -r

I have seen times when nothing else works but reinstalling CageFS does the trick. If that doesn’t help either, you may try disabling the virtual memory limit from the CloudLinux LVE Manager to see if that fixes the problem. CloudLinux also has a known virtual memory 503 error issue with LSAPI.

error: Unable to create cgroup for vm**: No such file or directory

The error can appear on any type of KVM installation, such as Virtualizor, SolusVM or Proxmox. You may face the same error if you are simply using virt-manager to manage your KVM installation. The error appears when you try to start or create the VM from its XML file:

[root@vps8 addvs]# virsh create /etc/libvirt/qemu/v1015.xml
error: Failed to create domain from /etc/libvirt/qemu/v1015.xml
error: Unable to create cgroup for v1015: No such file or directory

It appears because, for some reason, CentOS starts unmounting the cgroups, which breaks libvirt. The easy way to fix this is to restart libvirtd:

service libvirtd restart

The error is more common in CentOS 7 than CentOS 6, as systemd is known to have the bug:

https://bugzilla.redhat.com/show_bug.cgi?id=678555
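To confirm that missing cgroup mounts really are the cause before restarting anything, you can count the cgroup entries in the mount table. A small sketch; cgroup_mount_count is a made-up name, and the optional argument just lets you point it at a saved copy of the mount table:

```shell
# Count cgroup mounts in a mounts table (default: /proc/mounts).
# The third field of each line is the filesystem type.
cgroup_mount_count() {
    awk '$3 == "cgroup" || $3 == "cgroup2" { n++ } END { print n + 0 }' \
        "${1:-/proc/mounts}"
}

# Usage (as root):
#   if [ "$(cgroup_mount_count)" -eq 0 ]; then
#       echo "no cgroups mounted, restarting libvirtd"
#       service libvirtd restart
#   fi
```

A count of zero matches the situation described above. If cgroups are mounted and the error still shows up, the cause is probably something else and a restart alone may not help.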