How to Access Redis over TLS/SSL

If you are trying to purchase Managed Redis Database, from companies like digitalocean, then, you would get access to those Redis servers, only over TLS/SSL support. Unfortunately, by default redis-cli does not ship with TLS support, hence, you need to either use Tunnel to access redis instance through redis-cli or use different tool for your purpose.

Access Your Redis Instance using Python

If you are developing your application using python, and using a managed redis database, then, you would have to make the redis connection over TLS/SSL. This can be done by setting the ‘SSL’ to True, in the Redis constructor. Here is an example:

import redis

r = redis.Redis(host='db-redis-sfo2-89862-do-user-4233327-0.b.db.ondigitalocean.com', port=25061, password='abcdhjnmjtxeupp', decode_responses=True, ssl=True)

As you can see, I have set the ssl to True at the end, to set the connection over TLS/SSL.

Access Your Redis from Command Line using Redli

If you want to have access to your redis instance through a command line tool, then using redli is my first line choice. Configuring stunnel with redis-cli is also possible, but it would be another topic of discussion here.

Let’s see, how can we install redli:

Redli is a tool developed by IBM with TLS/SSL support. It is written in Golang. You may download a version for Linux from the IBMCloud Github repo and start using it:

~ wget https://github.com/IBM-Cloud/redli/releases/download/v0.4.4/redli_0.4.4_linux_amd64.tar.gz
~ tar -xvzf redli_0.4.4_linux_amd64.tar.gz
~ cp redli /usr/bin/
~ chmod +x /usr/bin/redli

Now, you may use redli to connect to your Redis database with TLS enabled as following:

redli --tls -h db-redis-sfo2-89862-do-user-4233327-0.b.db.ondigitalocean.com -p 25061 -a abcdhjnmjtxeupp

See how ‘–tls’ is enabled to access redis over TLS here. Hope this helps.

How to Restore MySQL Database Using SSH

This would be a simple step by step post for the help of my customers. You require two tools for this.

i) putty – ssh client
ii) WinSCP – File transfer client over SSH (You may use FTP or File manager as well)

Once you download putty, open it, and use your server’s IP in the Host Name section:

Now, click on open. In the prompt, type ‘root’ as login as and enter. Then type the password and enter. If the password for the root is right, it will login to the command shell prompt or else it will put you in the authentication error console.

Now, open WinSCP to login to your server using the same manner of SSH. Once done, transfer your sql file to the home directory of root.

Login to your server using WinSCP
WinSCP Transfer

To transfer the file, navigate the sql file from your desktop on the left side and then drag it to the right side of WinSCP window. That will start the transfer. Remember the path of the destination on the server. Path is shown in the red marked area in the image.

Once the file is transferred, come back to putty, and change directory to the path you had noted in WinSCP, in our case, it is /root/

cd /root/
mysql -uusername -ppassword your_database_name < your_sql_file_name.sql

Remember the following. Your mysql username should go right after ‘-u’ without any space and the mysql password should be the same without space. Replace the ‘your_database_name’ with your original database name in the server and ‘your_sql_file_name.sql’ with the exact sql file name that you had uploaded.

Now if you allow sometime, the command should complete after the restore. If it returns any error, you would need to attend them to do a complete restore.

How To Compare Metrics within Identical Time of Today with Yesterday (or More) in Metabase / SQL / Metabase Trend Query

Scenario Explanation

Suppose, you have a live Ecommerce system. What you want to see, is how well your revenue is performing comparing to yesterday. But if you are comparing with the total revenue of yesterday, then this won’t be an efficient metric until you reach the end of the day. In real scenario, what you would like, is to compare it with the identical time it is happening right now. For example, if it is 7.19 PM now, then you want to compare either of the following two and show the trend:

i) Revenues within 00:00 – 7:00 PM yesterday to revenues within 00:00 – 7:00 PM today or
ii) Revenues within 00:00 – 8:00 PM yesterday to revenues within 00:00 – 8:00 PM today (remember, it is 7.19 PM, so the counter and comparison will continue to flow when you get new order on this case)

Both of the way of seeing trend is useful, depends on your choice, but if you are comparing the metrics with the full of yesterday and the half way through to today, this might not be as useful.

How To Do This In Meatabase?

Please remember, I would be doing this in Metabase. Metabase is an open source data analytical and visualization tool written in Java. It is simple, fast and efficient tool for all square use cases in data analytics. I am using a ‘Reformatted’ real time MySQL database as the data source, hence the query can be used in plain MySQL databases to see results.

Note: We can’t do this with the simple query of Metabase, hence, we have to use the Native query support of Metabase. If your Administrator has limited your access to Metabase to Simple & Custom query, then you can not do this until your Administrator allows you so.

Building The Query

Let’s assume, our orders lie in ‘orders’ table, order creation dates are given in ‘date_created’ column, along with order status and the total value of the order is stored separately under status and total column.

First, we write the sum of the total column with the mysql aggregation function ‘sum’ and set a variable ‘date_created’ based the on the date_created column of orders table as following:

SELECT date(`orders`.`date_created`) AS `date_created`, sum(`orders`.`total`) AS `sum`
FROM `orders`

Now, if you want to use the ‘Trend’ in Metabase, then the query must have ‘Group By’ a timeseries based on which it shows the ‘Trend’ metrics. And we know, that our group by timeseries going to be ‘date_created’. We also know that our conditions of selecting the rows, going to be under the ‘WHERE’ statement, so we will keep it as blank for now. So the query, goes as following now:

SELECT date(`orders`.`date_created`) AS `date_created`, sum(`orders`.`total`) AS `sum`
FROM `orders`
WHERE (
# Our conditions will follow here    
    )
    
GROUP BY date(`orders`.`date_created`)
ORDER BY date(`orders`.`date_created`) ASC

We are done with the basic. Now, let’s fill up the WHERE statement. First, you may want to filter out certain ‘status’ of the orders in case you have many. You first, fill that up as following:

SELECT date(`orders`.`date_created`) AS `date_created`, sum(`orders`.`total`) AS `sum`
FROM `orders`
WHERE (

        (`orders`.`status` = 'completed'
        OR `orders`.`status` = 'confirmed' OR `orders`.`status` = 'processing')
        )

    )
    
GROUP BY date(`orders`.`date_created`)
ORDER BY date(`orders`.`date_created`) ASC

In my case, we are filtering if the order status is either completed, confirmed or processing. We are discarding other statuses like on-hold, or pending or abandoned cart.

Now, let’s think for a moment. We have to compare two time series, that means, we have to first, create an identical time differences. Let me give an example for better understanding. If we are trying to find out, the revenues between 00:00 – 20:00 yesterday and 00:00 – 20:00 today, then we need to have two conditions in WHERE statement for each of them, and they should select all the rows from both conditions, which necessarily means, it would be a ‘OR’ statement. For both cases, we need to select the order statuses, that would necessarily mean something like the following:

SELECT date(`orders`.`date_created`) AS `date_created`, sum(`orders`.`total`) AS `sum`
FROM `orders`
WHERE (

        (`orders`.`status` = 'completed'
        OR `orders`.`status` = 'confirmed' OR `orders`.`status` = 'processing')
        )
Or
        (`orders`.`status` = 'completed'
        OR `orders`.`status` = 'confirmed' OR `orders`.`status` = 'processing')
        )

    )
    
GROUP BY date(`orders`.`date_created`)
ORDER BY date(`orders`.`date_created`) ASC

Does the above make sense? If not, soon enough it would be, let’s move on to the next step. Now, how can you align the time to match the hour we have now in SQL query? MySQL curdate() returns the current date, which is today at 00:00 So, if I know the difference of hours left for today, and subtract that from curdate(), that would give us the hour it is now with the date of yesterday. Here is the example:

Say: curdate() = 10-6-2020 00:00
now() = 10-6-2020 20:45


This is 20th hour, so if we subtract 20 from 24, it would be 4. Now, we subtract 4 from curdate():

10-6-2020 00:00 - 4 Hour
09-6-2020 20:00

which basically is the same hour we are on, of today 10-6-2020 20:00 hour. Simple, isn’t it?

Now, how to write this in SQL query? Let’s see

First, we get 09-6-2020 00:00 Hour, this can be done as following in MySQL with negative Interval of 1 Day from current date.

date_add(curdate(), INTERVAL -1 DAY)

Next, we get 09-6-2020 20:00 Hour using the following:

date_add(curdate(), INTERVAL -(24 - hour(now())) HOUR)

See, how I am calculating the Interval Hour, I am using the MySQL function 'hour()' on another function 'now()' to get the hour we are on, which is giving me 20, then we are subtracting from 24 as 24 hours a day, the result is giving me 4 here. So the output going to be 20:00 hour. 

Now, for today, we first need 10-6-2020 00:00 Hour, which is simply:

curdate()

then, the target time is basically the now() function:

now()

We now have all 4 parts of it, here is the completed statement going to be only for the time selections:

For Yesterday:

str_to_date(date_format(`orders`.`date_created`, '%Y-%m-%d %H'), '%Y-%m-%d %H') 
 BETWEEN str_to_date(date_format(date_add(curdate(), INTERVAL -1 DAY), '%Y-%m-%d %H'), '%Y-%m-%d %H') AND str_to_date(date_format(date_add(curdate(), INTERVAL -(24 - hour(now())) HOUR), '%Y-%m-%d %H'), '%Y-%m-%d %H')

For Today:

str_to_date(date_format(`orders`.`date_created`, '%Y-%m-%d %H'), '%Y-%m-%d %H') 
BETWEEN str_to_date(date_format(curdate(), '%Y-%m-%d %H'), '%Y-%m-%d %H') AND str_to_date(date_format(now(), '%Y-%m-%d %H'), '%Y-%m-%d %H')

By adding the above to our basic and simple status selection statements:

SELECT date(`orders`.`date_created`) AS `date_created`, sum(`orders`.`total`) AS `sum`
FROM `orders`
WHERE (

        (`orders`.`status` = 'completed'
        OR `orders`.`status` = 'confirmed' OR `orders`.`status` = 'processing' OR `orders`.`status` = 'on-hold')
        AND str_to_date(date_format(`orders`.`date_created`, '%Y-%m-%d %H'), '%Y-%m-%d %H') 
        BETWEEN str_to_date(date_format(date_add(curdate(), INTERVAL -1 DAY), '%Y-%m-%d %H'), '%Y-%m-%d %H') AND str_to_date(date_format(date_add(curdate(), INTERVAL -(24 - hour(now())) HOUR), '%Y-%m-%d %H'), '%Y-%m-%d %H')
    
    OR
    
        (`orders`.`status` = 'completed'
        OR `orders`.`status` = 'confirmed' OR `orders`.`status` = 'processing' OR `orders`.`status` = 'on-hold')
        AND str_to_date(date_format(`orders`.`date_created`, '%Y-%m-%d %H'), '%Y-%m-%d %H') 
        BETWEEN str_to_date(date_format(curdate(), '%Y-%m-%d %H'), '%Y-%m-%d %H') AND str_to_date(date_format(now(), '%Y-%m-%d %H'), '%Y-%m-%d %H')
    
    )
    
GROUP BY date(`orders`.`date_created`)
ORDER BY date(`orders`.`date_created`) ASC

Simple, huh? Indeed, it is. If you run this query in phpmyadmin, or your mysql console, you will see the revenue amounts from the yesterday till the hour we are running through at this moment of today. But in Metabase, you can get a beautiful ‘Trend’ number metric with it. Just go To Visualization >> Trends to show you the Trend Number Metric for this as following:

It is cool to see how good or bad you are doing by this hour than yesterday.

Can You Use Elasticsearch as a Primary Database?

Well, this is an interesting topic. Earlier today, I answered the same question in a Elasticsearch Community Group in Facebook, thought to keep this documented as well.

Primarily, if you are aware of how Elasticsearch is storing data (The document like), you might think, it is a full fledged NoSQL database, but you need to know, it is not. If you have looked at several NoSQL databases, you might already be thinking, there is probably no standard definition of NoSQL databases, which is partially true, again, there still is definition of database management systems, where it doesn’t fit. Does that mean, we can’t use ES as the primary data store? No, that’s doesn’t mean that. You certainly can, but there are use cases. Let’s look at couple of points before we conclude.

Database Transactions

ES has no transaction. Even the algorithm that ES implements, Lucene Search, which has transaction in the original design, but ES doesn’t have any transaction. When there is no transaction, the first thing you need to remember, it has no rollback facility. That also means, every operation is atomic by design, and there is no way to cancel, abort or revert them. This also means, you have no locking standard in the system. If you are doing parallel writes, you need to be very careful about the writes here because ES is not giving you any guarantee to parallel writes.

ES Caching Strategy

Elasticsearch is a ‘near real’ time search engine. It is not a ‘real’ time search engine. It has a caching standard, that refreshes every 1 second. This is ‘by design’ of Elasticsearch. So, what’s the problem with it? The problem is, you may get tricked by the ES cache if you are immediately asking for a read for a quick write operation and it is within 1 second. But as most of the cases, ES is not written much often (Since most of the people do not use it as primary storage), most of the people thinks it is real time. But it is not. It’s cache can trick you in intense cases, where writes won’t be immediately available to you for read.

User Management

ES user management is no where near to a full fledged DBMS, only because ES doesn’t need to. They have never announced themselves as a NoSQL DBMS, hence they do not require to add the functionality in full square. They only implements the facilities that ES Ecosystems require.

So … ?

So, what does all these mean to you? If you have an extremely large read featured application, where updates are very rarely or occassionaly done, then you can definitely go with ES as your primary document storage engine. But if you have a write oriented application, requires complex user management or parallel threads require write access, then you better be choosing a standard DBMS for your business and use ES as secondary data store for your analytics and searches.