Dealing with hot keys


Memcached keys can get hot. I mean really hot, to the point that they can hinder your server performance. Below is a screenshot of an app in New Relic:

[New Relic screenshot]

Of course the application ends up in flames as the key gets too hot to handle


So how does one solve the issue of hot keys?

Identify the hot keys using MCTOP

MCTOP (memcached top) is a Ruby-based gem that helps a sysadmin identify the hot keys. Usually hot keys are keys that are large in size, so filter the list by key size and then identify the largest keys.

Move the data to CDN

If the size is above 1MB and the data doesn’t change frequently (aka master data), then you can consider putting it on a CDN (Content Delivery Network). A CDN is a cost-effective way to quickly bring content to the masses, removing one huge bottleneck in the system.

Have duplicates of the key in a cache farm

Another way is to use McRouter or Twemproxy to help you manage the hot key. They can run multiple cache servers with duplicated cache data. Remember, replication is the key to high availability.

Break down the data into multiple smaller key-values

Instead of doing a select * and putting the whole table into one key, one can consider having a <table>_<id> key with the <row> array as the value. It helps as the size of each value is smaller.
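
As a rough sketch of what that looks like in PHP (my own illustration, not from the original post; it assumes the PHP memcached extension, and the key name and the fetchUserFromDb() helper are made up):

<?php
// Cache each row under its own small key instead of one giant "users" key.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

function getUser(Memcached $mc, $id) {
    $key = 'users_' . $id;            // the <table>_<id> key
    $row = $mc->get($key);
    if ($row === false) {             // cache miss: load just this one row
        $row = fetchUserFromDb($id);  // e.g. SELECT * FROM users WHERE id = ?
        $mc->set($key, $row, 300);    // small value, 5 minute TTL
    }
    return $row;
}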

Use replication groups in AWS ElastiCache

If you use AWS ElastiCache, replication groups help to reduce the bandwidth bottleneck that is associated with hot keys, as the same request need not always go to one node but can be spread across multiple nodes.


How to DIY your own MySQL sharding


I have seen all sorts of deployments, from the very worst to the best. A lot of the poor deployments are due to database scalability not being factored into the architecture at the beginning. Hence when there is a sudden surge of users, the database is unable to scale well, causing issues for the application as a whole.

Hence database sharding is important. Why would you want to shard your database?

  1. You don’t want downtime, or you want downtime to be as minimal as possible.
  2. Smaller chunks of data are easier to back up or replicate.
  3. Queries can be faster as the query load is divided across different hosts.

Question: what if your old application uses MySQL and you or the management do not want to enlist the help of dbShards to manage your data? And you think that MongoDB is unstable (yes, don’t laugh, I have met such people before). You will have to do it on your own.

Here I list some deployments that I have seen. You can decide for yourself which suits you best.

1) Table sharding within the database

Say you have a table called users, and the number of rows expands exponentially. What I have seen some legacy systems do is create multiple tables with _<number> appended to the table name, like users_1, users_2 and so on. So let’s say you have 300K users; you break them up into 100K users per table.


I strongly do not recommend this approach. The only advantage that I see is that you can easily archive old data by dumping the oldest table. Other than that, the downsides are many. You have to modify your queries based on the primary key (if the id is 100001, you have to work out which users_<n> table to query instead of just querying users), and it does nothing to divide the query load, since everything still sits on the same database host.
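
To make the pain concrete, here is a rough sketch (my own, not from the original post) of what every lookup ends up looking like under this scheme, assuming 100K rows per table and an existing PDO connection in $pdo:

<?php
// Every query has to compute the table suffix before it can run.
function userTableFor($id, $rowsPerTable = 100000) {
    return 'users_' . (int) ceil($id / $rowsPerTable);   // id 100001 -> users_2
}

$table = userTableFor(100001);
$stmt  = $pdo->prepare("SELECT * FROM {$table} WHERE id = ?");
$stmt->execute([100001]);
$user = $stmt->fetch(PDO::FETCH_ASSOC);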

2) Different databases on different hosts, containing different tables

Say you have db1 on host 10.0.0.1 containing the users table plus user-related tables with the foreign key user_id, and db2 on host 10.0.0.2 containing the characters table. Different hosts host different databases, which hold different groups of tables.


I don’t recommend this setup either. The reason is that you cannot do JOIN queries across tables that are located on different hosts. You will end up querying one db to get the data, then using the id to query for data on the other db, and chances are you will have a lot of foreach loops everywhere. It will stress your application server.

3) Multiple hosts with the same database schema

Say you have 2 hosts, 10.0.0.1 and 10.0.0.2, and both hosts have a database called db. The names do not clash as they are hosted on different servers. Both have users and characters tables, but the users table is sharded (which means user id 1001 is on one host and user id 2001 is on the other), while the characters data is master data, identical on both databases.


I would recommend this setup. The reasons are that you do not need to change your queries based on the id, as your table names and db names are the same, and you can do cross-table join queries within the shard. You just need to overcome a few challenges with this setup:

i) You need to classify which tables are sharded tables and which tables are master tables. Usually the rule of thumb is that tables that constantly have new records added are sharded tables, like users when people keep signing up. From there, tables that are related to that table and contain its primary key as a foreign key need to be sharded as well. As for global tables, you need to make sure that the data is the same across all the shards so that cross-table queries are possible.

ii) You need to determine a mechanism for how to shard the data, which means auto-increment is a no-no for the table you want to shard. You may want to put in logic where the first digit of the user id determines which shard the data lives in, so id 1001 is in the first shard and 2001 is in the second shard. You may also want to think about how to do round-robin sharding to make sure that the amount of data in both database shards grows evenly and one shard does not outgrow the other.
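
Here is a minimal sketch (my own, not from the post) of what such a routing mechanism could look like in PHP, assuming the two-shard setup above where the leading digit of the user id picks the shard; hosts and credentials are placeholders:

<?php
$shards = [
    1 => ['host' => '10.0.0.1', 'db' => 'db'],
    2 => ['host' => '10.0.0.2', 'db' => 'db'],
];

// 1001 -> shard 1, 2001 -> shard 2
function shardForUserId($id, array $shards) {
    $firstDigit = (int) substr((string) $id, 0, 1);
    return $shards[$firstDigit];
}

// Same schema on every shard, so only the host changes.
function connectionForUser($id, array $shards) {
    $shard = shardForUserId($id, $shards);
    $dsn   = "mysql:host={$shard['host']};dbname={$shard['db']}";
    return new PDO($dsn, 'app_user', 'secret');   // placeholder credentials
}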

Also check out the sharding options that your framework offers, whether it is Octopus for RoR or whatever CakePHP provides.

Database Sharding : dbShards vs. Amazon AWS RDS


A wonderful blog post on database sharding and its tradeoffs:

Is This Thing On?

A friend was recently asking about our backend database systems.  Our systems are able to successfully handle high-volume transactional traffic through our API coming from various customers, having vastly different spiking patterns,  including traffic from a site that’s in the top-100 list for highest traffic on the net.   Don’t get me wrong; I don’t want to sound overly impressed with what our little team has been able to accomplish, we’re not perfect by any means and we’re not talking about Google or Facebook traffic levels.  But serving requests to over one million unique users in an hour, and doing 50K database queries per second isn’t trivial, either.
I responded to my friend along the following lines:
  1. If you’re going with an RDBMS, MySQL is the right, best choice in my opinion.  It’s worked very well for us over the years.
  2. Since you’re going the standard SQL route:
    1. If…


Top 5 worst coding practices


In normal literature, it is easy to read and difficult to write. In programming, it is the reverse: easy to write, difficult to read. So it is no wonder that in the programming world we keep producing monkeys. In literature, poorly written books get trashed openly. Poorly written code becomes technical debt in the software, buried deep in the system and forgotten by people due to turnover.

I will highlight some of the worst coding that I have seen so far:

1) Loop to nowhere

while (true){
   .........
}

If you want to write a process that never ends, don’t do this. I would rather you use upstart to daemonize the script instead. A loop to nowhere is no different from the dreaded GOTO.

2) Mixing Linux commands with app script

$check = exec("ps aux | grep somePhpProcess | wc -l");
if (empty($check))
   exec("/usr/sbin/php somePhpProcess.php")

This is a script that I saw in a cron job. Yes, a CRON JOB! Why don’t you use monit instead? It is a lot cleaner, and I am not a fan of PHP’s exec(). I stay clear of it like leprosy.

3) Loop through array just to find one record

public static function getRecordMem($id) {
    $listAll = self::getListMem();
    foreach ($listAll as $record) {
        if ($record->getKeyword() == $id) {
            return $record;
        }
    }
}

Hello, there is such a thing called an SQL query. Please use that! And don’t store the whole table’s records in memory; you are going to get latency issues.
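
For contrast, here is a hedged sketch (mine, not the post’s) of doing it with a single parameterised query; the DSN, credentials, table and column names are made up for illustration:

<?php
$pdo = new PDO('mysql:host=127.0.0.1;dbname=app', 'app_user', 'secret');

// Fetch only the record you need instead of scanning a cached copy of the table.
function getRecord(PDO $pdo, $keyword) {
    $stmt = $pdo->prepare('SELECT * FROM records WHERE keyword = ? LIMIT 1');
    $stmt->execute([$keyword]);
    return $stmt->fetch(PDO::FETCH_ASSOC);   // false if nothing matches
}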

4) the “If-Else” burger

if ($i == 1) {
   $j = 1;
} else if ($i == 2) {
   $j = 2;
} else if .... {
   ....
} else {
   $j = 100;
}

Burgers make you fat. This script is totally redundant when a simple $j = $i does the trick for you.

5) Return of the Null

public Employee getByName(String name) {
  int id = database.find(name);
  if (id == 0) { // assuming find() returns 0 when nothing matches
    return null;
  }
  return new Employee(id);
}

I know Star Wars is back, but the Force is not with the null. Returning null from functions is a bad idea. Don’t even think about it.

So in conclusion, do the industry a favour: start writing clean code, and think through how to optimally deploy your app.

Adding more configurations in CentOS memcached


Recently I have been working on a lot of memcached stuff. I have set up a 4 node MCRouter cluster on AWS (screw ElastiCache: when you encounter a hot key that gets read 2000 requests per second, it slows down your system a lot).

I encountered a problem. By default memcached has a memcached.conf in the etc folder, but if you install it on CentOS via "yum install memcached", there is no such file. So how do you add additional parameters to tweak your memcached?

What you can do is this: there is a file at /etc/sysconfig/memcached. Add your parameters on the last line. For example, if you want additional threads, add OPTIONS="-t 16" on the last line, restart memcached (service memcached restart), and you will get what you want.
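
For reference, after the change the file might look something like this (the other values are the stock CentOS defaults as far as I recall, so treat them as an assumption and double-check your own file):

PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS="-t 16"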

$ ps -ef | grep memcached
495      10152     1  0 13:19 ?        00:00:00 memcached -d -p 11211 -u memcached -m 64 -c 1024 -P /var/run/memcached/memcached.pid -t 16
webapp   10200  9483  0 13:29 pts/0    00:00:00 grep memcached

HA jargon survival 101


Hi! Finally in the mood to post something not so “technical”. Recently I have met people who got “smoked” by others who speak technical jargon in a way that makes it hard to assess technical competency, so I have come up with a short HA jargon survival guide.

Here goes…

Scalability (from Wikipedia) is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged in order to accommodate that growth.

Yes, so when people say that such-and-such a system can scale up or down, it means it doesn’t matter whether 100 users or 100K users use the system: the profit margin is more or less the same, and the user experience is more or less the same.

Scalability is NOT

  • Reliability. Saying scalability is reliability is like saying I can pass IPPT whether my weight is 60KG or 90KG. You know it is not true. If your system is buggy for one user, it will be buggy for 100 users.
  • High Availability. As said, if I expand from 60KG to 90KG, I may not be able to run as fast, so I can’t be available for you.

Sharding is the partitioning of data into smaller, faster, more easily managed chunks out of a whole. Sharding is an important component of scalability, as it helps systems allocate resources horizontally.

Sharding is NOT

  • Replication. People always mix up the two. If Voldemort could replicate his horcruxes, maybe he would truly achieve immortality. When the data of one of the shards is lost, it is gone forever if not replicated.


Replication is the copying of data into several different data stores. It helps to ensure high availability by storing data in different places, so if an outage happens in one place, the data can be recovered.

Replication is NOT

  • Backup. To some extent people mix the two up because they think storing data in a different place is like a backup. It is like keeping extra money under the pillow for emergency use, but replication involves real-time MIRRORING of data. What happens when the master data gets corrupted? Yes, the replica gets corrupted as well. So a (hot) backup is replication without the real-time data mirroring; a backup has some time lag, even if it is only one second.


Auto Scaling is adding additional servers behind the load balancer to handle additional traffic.

Auto Scaling is NOT

  • Associated with databases. Data GROWS with time. Your data doesn’t shrink when your users drop off from your product. It is true that I can lose weight from 90KG to 60KG, but I can’t lose my age from 33 back to 18. So if somebody says that he/she can set up a DB with autoscale capability, that claim is a hoax.

That is all for today, folks. Tech stuff is normally quite serious, but this is my first serious attempt to add humour into geek stuff. Hope you guys like it.


Create your own mcrouter cluster


Recently I have been trying to help a fellow worker resolve a memcached hot key issue. Yeah, memcached can be a bottleneck if there are too many requests bombarding the same key. For that, I recommended Facebook’s mcrouter to him.

As we all know, there is no such thing as a memcached cluster. Memcached servers operate as silos, and they don’t talk to each other. Most of the time you need another server to act as a puppet master for the memcached nodes, whether it is for sharding or replication.

Mcrouter allows users of memcached to shard AND replicate, which beats Redis. So the following are the steps to set up a memcached cluster.

1) Set up 3 instances: one for mcrouter, the other 2 for memcached

2) SSH into the mcrouter instance

3) Do a git clone

git clone git@github.com:facebook/mcrouter.git

4) Install mcrouter

./mcrouter/mcrouter/scripts/install_ubuntu_14.04.sh /home/$USER/mcrouter-install/ -j4

5) Test whether mcrouter is working

webapp@ip-10-11-13-50:~$ ~/mcrouter-install/install/bin/mcrouter --help
mcrouter 1.0
usage: /home/webapp/mcrouter-install/install/bin/mcrouter [options] -p port(s) -f config

libmcrouter options:

And below that it will list all the options you can use.

6) Create a config file that contains the IP of the other 2 memcached servers

{
   "pools": {
      "A": {
         "servers": [
           "<ip_address_1>:11211",
           "<ip_address_2>:11211"
         ]
      }
   },
   "route": {
     "type": "OperationSelectorRoute",
     "operation_policies": {
       "add": "AllSyncRoute|Pool|A",
       "delete": "AllSyncRoute|Pool|A",
       "get": "RandomRoute|Pool|A",
       "set": "AllSyncRoute|Pool|A"
     }
   }
 }

From there you have a pool called A, containing 2 IPs, and port 11211 is used to talk to the servers. With this route config, writes (add, delete, set) go to all servers in the pool synchronously, while gets are spread randomly across the replicas, which is what tames a hot key.

7) Prep the other 2 memcached servers. Run "sudo apt-get install memcached" or "sudo yum install memcached"

8) Run the command

sudo ~/mcrouter-install/install/bin/mcrouter --config-file=mcrouter.json -p 11211 &

And you will see the memcached router talking to the 2 memcached nodes.
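
On the application side nothing changes except the endpoint: mcrouter speaks the standard memcached protocol, so you just point your existing client at the mcrouter instance. A minimal PHP sketch (my own, with a placeholder IP):

<?php
$mc = new Memcached();
$mc->addServer('<mcrouter_ip>', 11211);

$mc->set('hot_key', 'some value');   // AllSyncRoute: written to both memcached nodes
$value = $mc->get('hot_key');        // RandomRoute: read from one of the replicas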

Also sharing with you the result of a load test of 2000 clients per second on this architecture. The result looks very good.

[Screenshot: load test results]

For more information on mcrouter, check out the mcrouter documentation on GitHub.

Getting GCE instance to transfer log files to gcloud storage


We developers have a love/hate relationship with logs. Recently I struggled a lot to get them transferred automatically via a cron job to the bucket that I want, but I finally managed to do it thanks to Google Support.

The following are the steps:

1) Prepare the bash script

#!/bin/bash
set -x
/usr/local/bin/gsutil -m mv <location of log files> gs://<Your bucket name>/<folder>

2) Prepare your crontab

PATH=/sbin:/bin:/usr/sbin:/usr/bin:/home/<your username>/google-cloud-sdk/bin
HOME=/home/<your username>
BOTO_CONFIG="/home/<your username>/.config/gcloud/legacy_credentials/<folder>/.boto"

# For details see man 4 crontabs

# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed

* * * * *  <your username

Do note that you need to set PATH, HOME and BOTO_CONFIG in your crontab; if not, it will not work. Make sure the paths are correct.

3) Make sure that your instances have access to Google Storage


And you should be able to see your files landing in your desired Google bucket.

Setup HA Redis using Redis Sentinel part 2

Standard

Sentinels are really badass creatures. Whether they are the Marvel Sentinels or the Matrix Sentinels, they do a good job of protecting their evil masters until the saviour does them in.

In this second part of the high availability setup, I will show how to set up Redis Sentinel.

Step 1: Do steps 2 to 5 from the part 1 post

Step 2: Copy src/redis-sentinel to /usr/local/sbin

Step 3: Copy sentinel.conf to /etc/redis-sentinel.conf

Step 4: Edit /etc/redis-sentinel.conf. Search for sentinel monitor and replace 127.0.0.1 with the redis master’s internal IP.

Do take note that the last number on the sentinel monitor line is 2. This is the quorum: it means at least 2 sentinels need to agree that the master redis is down before they fail over to a slave. With 3 sentinels you can set it anywhere between 1 and 3.
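
For reference, the resulting line would look something like this (“mymaster” is the default master name from the sample sentinel.conf; the IP is a placeholder):

sentinel monitor mymaster <redis master internal IP> 6379 2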

Step 5: add the following lines to the end of the redis-sentinel.conf. Save the file

daemonize yes
logfile "/var/log/sentinel.log"

Step 6: Create /etc/init.d/redis-sentinel file and paste the following in:

#!/bin/sh
# chkconfig: 2345 95 20
# description: Simple Redis Sentinel init.d script conceived to work on Linux systems
# processname: redis-sentinel
# 

REDISPORT=26379    # sentinel listens on 26379 by default, not the redis port 6379
EXEC=/usr/local/sbin/redis-sentinel
CLIEXEC=/usr/local/sbin/redis-cli

PIDFILE=/var/run/redis-sentinel.pid
CONF="/etc/redis-sentinel.conf"

case "$1" in
    start)
        if [ -f $PIDFILE ]
        then
                echo "$PIDFILE exists, process is already running or crashed"
        else
                echo "Starting Redis Sentinel server..."
                $EXEC $CONF
        fi
        ;;
    stop)
        if [ ! -f $PIDFILE ]
        then
                echo "$PIDFILE does not exist, process is not running"
        else
                PID=$(cat $PIDFILE)
                echo "Stopping ..."
                $CLIEXEC -p $REDISPORT shutdown
                while [ -x /proc/${PID} ]
                do
                    echo "Waiting for Redis Sentinel to shutdown ..."
                    sleep 1
                done
                echo "Redis Sentinel stopped"
        fi
        ;;
    *)
        echo "Please use start or stop as first argument"
        ;;
esac

Step 7: Set up redis-sentinel proper

chmod +x /etc/init.d/redis-sentinel
chkconfig --add redis-sentinel
chkconfig --level 3 redis-sentinel on

Step 8: Repeat all these steps on the remaining 2 sentinels

Step 9: Run /etc/init.d/redis-sentinel start on all of them

So your redis sentinel setup is ready for use! Do take note that it operates slightly differently from a normal standalone redis. You need to get the master address from sentinel: sentinel will tell you the IP of the redis instance to use, as sketched below.
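
Here is a rough sketch (my own, not from the post) of that client-side flow, assuming the phpredis extension with rawCommand() support and the default master name “mymaster”; the sentinel IP is a placeholder:

<?php
$sentinel = new Redis();
$sentinel->connect('<sentinel_ip>', 26379);   // sentinels listen on 26379

// Ask sentinel who the current master is; returns [ip, port]
$addr = $sentinel->rawCommand('SENTINEL', 'get-master-addr-by-name', 'mymaster');

// Then connect to that master for the actual reads and writes
$redis = new Redis();
$redis->connect($addr[0], (int) $addr[1]);
$redis->set('greeting', 'hello from the current master');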

Happy developing!

Setup HA Redis using Redis Sentinel part 1


In this blog post I will walk through setting up an HA redis cluster using Redis Sentinel. Just like the NDB cluster one, it is LONG, so I am breaking it up into 2 blog posts.

Part 1 focuses on setting up a master and a slave. Part 2 focuses on setting up the sentinels.

Step 1: Set up 5 instances. I will name them as:

redis-master
redis-slave
redis-sentinel-1
redis-sentinel-2
redis-sentinel-3

Step 2: Install make and gcc on all five instances

yum install make gcc

Step 3: Download the redis source code on both redis-master and redis-slave

wget http://download.redis.io/releases/redis-2.8.17.tar.gz

Step 4: SSH into redis-master and extract the tar file

tar -zxvf redis-2.8.17.tar.gz

Step 5: cd into the folder and make

cd redis-2.8.17
make
yum install -y tcl
make test

Step 6: Copy redis-server and redis-cli into /usr/local/sbin, and copy your main redis files to /var/lib

cp src/redis-server /usr/local/sbin
cp src/redis-cli /usr/local/sbin
cp -R /opt/redis-2.8.17 /var/lib
cp redis.conf /etc/

Step 8a: Change the daemonize setting in redis.conf from no to yes

Step 8b: Set logfile "<your logfile location>" in redis.conf

BTW, redis has a very nice looking log file:

[28412] 06 Jan 12:57:54.385 * Increased maximum number of open files to 10032 (it was originally set to 1024).
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 2.8.17 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 28412
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
[28412] 06 Jan 12:57:54.386 # Server started, Redis version 2.8.17

Step 9: Copy the following code into /etc/init.d/redis

#!/bin/sh
REDISPORT=6379
EXEC=/usr/local/sbin/redis-server
CLIEXEC=/usr/local/sbin/redis-cli

PIDFILE=/var/run/redis.pid
CONF="/etc/redis/redis.conf"

case "$1" in
    start)
        if [ -f $PIDFILE ]
        then
                echo "$PIDFILE exists, process is already running or crashed"
        else
                echo "Starting Redis server..."
                $EXEC $CONF
        fi
        ;;
    stop)
        if [ ! -f $PIDFILE ]
        then
                echo "$PIDFILE does not exist, process is not running"
        else
                PID=$(cat $PIDFILE)
                echo "Stopping ..."
                $CLIEXEC -p $REDISPORT shutdown
                while [ -x /proc/${PID} ]
                do
                    echo "Waiting for Redis to shutdown ..."
                    sleep 1
                done
                echo "Redis stopped"
        fi
        ;;
    *)
        echo "Please use start or stop as first argument"
        ;;
esac

Step 10: Run Redis with the command /etc/init.d/redis start

Step 11: Check that redis is running by running “redis-cli ping”. You should get a PONG answer.

Step 12: Image / Snapshot the master disk and use it to create a slave instance

Step 13: Set the slaveof setting in redis.conf on the slave instance

slaveof <masterip> <masterport>

That is all it takes to set up a master and slave redis. Part 2 covers setting up the sentinels.