Thursday, December 5, 2013

Using Couchbase NoSQL Database as a caching layer for our BizTalk Processes - Part 2

Previous Parts:

Before I start, I would like to recommend the book "Couchbase Essentials" by John Zablocki.
The book covers many important Couchbase Server topics, and also explains how to use it from .NET.
You can purchase the book at: http://bit.ly/1wtEbwl

"Failover" a node in Couchbase


The Couchbase NoSQL concept is to have as many servers as possible in one Couchbase cluster.
The Couchbase engine spreads the cluster data across the cluster nodes.

If one node (= server) goes down, part of the data is unavailable until one of the following happens:
1. The failed node is up again.
2. "Failover" is performed on the failed node.

Failing over a node removes it from the cluster, and the other nodes then serve its data from memory by promoting the replica documents of the failed node to "active" status. (Replica documents are placed on the disk of each of the cluster nodes.)
Replica documents get updated when data changes, and are used as a backup for the in-memory data of each node.
Rule of thumb: You can only "Failover" as many nodes as you have replicas.
For instance: If a cluster has 10 servers and 1 server is down, roughly 10% of the data is unavailable until that node is failed over (manually or automatically) or comes back up.
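
The arithmetic and the rule of thumb above can be written down as a small sketch. It assumes the data is spread evenly across the nodes, and the class and method names are hypothetical (they are not part of Couchbase):

```csharp
using System;

// Hypothetical helper, not a Couchbase API: it only restates the rule of thumb above.
public static class FailoverMath
{
    // Fraction of the data that is unavailable while failedNodes nodes are down
    // and not yet failed over. Assumes an even spread of data across all nodes.
    public static double UnavailableFraction(int totalNodes, int failedNodes)
    {
        if (totalNodes <= 0)
            throw new ArgumentOutOfRangeException("totalNodes");
        if (failedNodes < 0 || failedNodes > totalNodes)
            throw new ArgumentOutOfRangeException("failedNodes");

        return (double)failedNodes / totalNodes;
    }

    // Rule of thumb: you can only fail over as many nodes as you have replicas.
    public static bool CanFailOver(int failedNodes, int replicaCount)
    {
        return failedNodes <= replicaCount;
    }
}
```

With 10 nodes and 1 failure, UnavailableFraction(10, 1) gives 0.1. With only 2 nodes, a single failure already gives 0.5, which is the situation described in the next section.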

Scenario Architecture


Our BizTalk environment contains only 2 servers.
The decision was to keep them in separate clusters, for two reasons:
1. If one of those nodes is down, 50% of the cache is unavailable. That is too much.
2. Automatic "Failover" takes 30 seconds (which is too long for us), and can only be configured on clusters that contain 3 nodes or more.

Each of those servers hosts its own Couchbase cluster (= cache), and each cluster backs up the other.

For instance:

If Server A (= cluster A) goes down, Server B is independent, and all requests coming from the process infrastructure on Server B go to the Server B cache.
Moreover, if the Couchbase Windows service on Server A is down for some reason (= the NoSQL database is down), all requests from the process infrastructure on Server A are routed to Server B (= cluster B). After a configured period of time, requests are handed back to Server A.

To implement the backup functionality between clusters, I needed to write a cache access layer built on top of the Couchbase Client Library.
To keep the cache data identical on both clusters, I've configured a uni-directional XDCR (Cross Datacenter Replication) from Server A to Server B (which means that only Server A loads data from SQL).

To Summarize:



Implement Cache Access with Cluster Backup functionality


First, I downloaded the Couchbase .NET Client Library from http://www.couchbase.com/communities/net/getting-started

I've extracted 3 DLLs from the downloaded zip:
1. Couchbase.dll - The Couchbase client DLL, used to access the Couchbase NoSQL database.
2. Enyim.Caching.dll - The Enyim Memcached client library. The Couchbase .NET client is built on top of it.
3. Newtonsoft.Json.dll - Json.NET is a popular high-performance JSON framework for .NET. I use it to convert JSON documents (from the cache) to a .NET Dictionary object.

I've added those 3 DLLs to my project, and wrote the "CouchbaseClientManager" class:

using System;
using System.Configuration;
using System.Timers;
using Couchbase;

public class CouchbaseClientManager
{
    private const string DEFAULT_CLIENT = "couchbase"; // Config section of the local cluster
    private const string BACKUP_CLIENT = "backup";     // Config section of the other cluster
    private const string TEST_KEY = "test";            // Key stored in each cluster for the connectivity check

    private readonly Timer timer;          // Counts down the period in which the backup replaces the default
    private volatile string currentClient; // Config section used to create client instances

    public CouchbaseClientManager()
    {
        this.currentClient = DEFAULT_CLIENT;
        this.timer = new Timer(TimeSpan.Parse(ConfigurationManager.AppSettings["backupTime"]).TotalMilliseconds);
        this.timer.AutoReset = false; // Fire once per failover instead of periodically

        // Subscribe once here; subscribing inside Get would add another handler on every failover.
        // When the timer is up, set the current client back to the default client.
        this.timer.Elapsed += (sender, arguments) => this.SetCurrentClient(DEFAULT_CLIENT);
    }

    public T Get<T>(string key)
    {
        if (this.currentClient == DEFAULT_CLIENT)
        {
            if (this.CheckConnectivity(DEFAULT_CLIENT, TEST_KEY))
            {
                return this.Get<T>(DEFAULT_CLIENT, key);
            }

            // Default cluster is not answering: switch to the backup and start the countdown
            this.SetCurrentClient(BACKUP_CLIENT);
            this.timer.Start();
        }

        // Either currentClient was already BACKUP_CLIENT, or we have just switched to it
        return this.Get<T>(BACKUP_CLIENT, key);
    }

    private T Get<T>(string clientType, string key)
    {
        // clientType is the name of the config section that describes the cluster
        using (var client = new CouchbaseClient(clientType))
        {
            return client.Get<T>(key);
        }
    }

    private void SetCurrentClient(string clientType)
    {
        this.currentClient = clientType;
    }

    private bool CheckConnectivity(string clientType, string key)
    {
        // The cluster is considered reachable if the test key returns a non-empty value
        return !String.IsNullOrEmpty(this.Get<string>(clientType, key));
    }
}

A few things about the code above:

1. The code reads its configuration from the config file (default and backup cluster configurations, and the backup time period).
2. Getting a cache row as JSON is done by: client.Get<T>(key);
3. The connectivity check is done using a "test" key I've placed in each cluster.
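
To try the switch-to-backup logic without a running Couchbase server, here is a minimal sketch of the same idea. It is not the class above: the two caches are plain delegates, and instead of a System.Timers.Timer it compares timestamps from an injected clock, which makes the behavior easy to step through. All names in it (FallbackReader, backupUntil and so on) are hypothetical.

```csharp
using System;

// Hypothetical stand-in for the manager above. The caches are delegates,
// so the sketch runs without Couchbase.
public class FallbackReader
{
    private const string TestKey = "test"; // Same idea as the "test" key stored in each cluster

    private readonly Func<string, string> primary; // Reads from the default cluster
    private readonly Func<string, string> backup;  // Reads from the backup cluster
    private readonly TimeSpan backupTime;          // How long to stay on the backup
    private readonly Func<DateTime> clock;         // Injected clock (an assumption, used instead of a timer)
    private DateTime backupUntil = DateTime.MinValue;

    public FallbackReader(Func<string, string> primary, Func<string, string> backup,
                          TimeSpan backupTime, Func<DateTime> clock)
    {
        this.primary = primary;
        this.backup = backup;
        this.backupTime = backupTime;
        this.clock = clock;
    }

    public string Get(string key)
    {
        // While the backup window is open, do not touch the primary at all
        if (this.clock() < this.backupUntil)
            return this.backup(key);

        // An empty answer for the test key means the primary is considered down
        if (String.IsNullOrEmpty(this.primary(TestKey)))
        {
            this.backupUntil = this.clock() + this.backupTime;
            return this.backup(key);
        }

        return this.primary(key);
    }
}
```

The design trade-off: a timestamp check needs no timer thread and no event handler, while the timer version flips the current client back even when no request arrives.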

Here is a flow chart of the code:



Let's take a look at the Operations class, which provides a layer for getting data from the cache.

using System.Collections;
using System.Collections.Generic;
using System.Configuration;
using System.IO;
using System.Linq;
using System.Web.Script.Serialization;

public class Operations
{
    // Singleton instance, created when the type is initialized
    private static readonly Operations instance = new Operations();

    private string urlCacheConfig;
    private List<SyncEntity> syncEntities;
    private CacheConfig.Operations cacheConfigOperations;

    private CouchbaseClientManager couchbaseClientManager;

    // Explicit static constructor tells the C# compiler not to mark the type as beforefieldinit
    static Operations()
    {
    }

    private Operations()
    {
        this.urlCacheConfig = ConfigurationManager.AppSettings["cacheConfigFile"];
        this.cacheConfigOperations = new CacheConfig.Operations(this.urlCacheConfig);
        LoadCacheEntities();
        // Reload the entities whenever the cache config file changes
        this.cacheConfigOperations.fileConfigChanged += new FileSystemEventHandler(fileConfigChanged);
        this.couchbaseClientManager = new CouchbaseClientManager();
    }

    public static Operations Instance
    {
        get { return instance; }
    }

    private void LoadCacheEntities()
    {
        this.syncEntities = this.cacheConfigOperations.GetCacheItems();
    }

    private void fileConfigChanged(object sender, FileSystemEventArgs e)
    {
        LoadCacheEntities();
    }

    public IDictionary GetDictionary(string table, string key)
    {
        SyncEntity tableSyncEntity = this.syncEntities.FirstOrDefault(se => se.tableName.Equals(table));
        if (tableSyncEntity == null)
            return null; // The table is not configured for caching

        // Cache key format: <table>_<key column>_<key value>
        string column = tableSyncEntity.keys[0];
        string fullKey = table + "_" + column + "_" + key;

        string row = this.couchbaseClientManager.Get<string>(fullKey);
        if (string.IsNullOrEmpty(row))
            return null; // Nothing cached under that key; Deserialize would throw on null input

        // The cached row is a JSON document; convert it to a dictionary
        JavaScriptSerializer js = new JavaScriptSerializer();
        return js.Deserialize<Dictionary<string, string>>(row);
    }
}

The Operations class keeps the entities (List<SyncEntity>) from the config file in memory.
If the config file is changed, Operations loads it again. The BizTalk process infrastructure uses the "GetDictionary" method to get a row from the cache.
Each table has a constant key column, and we needed to get data by that column's value.

For instance:


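If we have defined "Name" as the key column, then running GetDictionary("Products","Phone"); would return a key-value dictionary that contains the cached row of the "Products" table whose "Name" value is "Phone".

The method parameters are:
table - Cached table name.
key - Value of the key column.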
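
To make the lookup concrete, the cache key that GetDictionary builds for this call can be reproduced with a tiny sketch. The concatenation mirrors the fullKey line in the Operations class; the CacheKey helper itself is hypothetical and exists only for illustration:

```csharp
// Hypothetical helper that mirrors how GetDictionary composes its cache key:
// <table>_<key column>_<key value>
public static class CacheKey
{
    public static string Build(string table, string column, string key)
    {
        return table + "_" + column + "_" + key;
    }
}
```

For the example above, CacheKey.Build("Products", "Name", "Phone") yields "Products_Name_Phone", which is the document key looked up in Couchbase.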


The client configuration file looks like this one, with minor changes (like server and database names etc.):

<?xml version="1.0"?>
<configuration>

  <configSections>   
    <section name="couchbase" type="Couchbase.Configuration.CouchbaseClientSection, Couchbase, Version=1.2.6.0, Culture=neutral, PublicKeyToken=12b9c6b5a9ec94c3"/>
    <section name="backup" type="Couchbase.Configuration.CouchbaseClientSection, Couchbase, Version=1.2.6.0, Culture=neutral, PublicKeyToken=12b9c6b5a9ec94c3"/>
  </configSections>

  <connectionStrings>
    <add name="connectionString" connectionString="Persist Security Info=False;Integrated Security=True;Initial Catalog=DBName;server=SQLServerAddress" />
  </connectionStrings>

  <appSettings>
    <add key="cacheConfigFile" value="C:\CouchbaseCache\BTSCachingTasksConfiguration.xml"/>
    <add key="backupTime" value="00:01:00"/>
  </appSettings>

  <couchbase>
    <servers bucket="default" bucketPassword="private">
      <add uri="http://currentServer:8091/pools/default"/>
    </servers>
  </couchbase>

  <backup>
    <servers bucket="default" bucketPassword="private">
      <add uri="http://otherServer:8091/pools/default"/>
    </servers>
  </backup>

  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
  </startup>

</configuration>

Here are some remarks regarding the config file:

1. BTSCachingTasksConfiguration.xml contains the table names and key columns to cache.

For instance:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfCachingTask>
  <entity>
   <connectionKey>NotRelevant</connectionKey>  
   <tableName>Products</tableName>
   <keys>Name</keys>
   <timer>12:00:00</timer>
  </entity>
</ArrayOfCachingTask>

As you can see, the cached table name is "Products", and the key column is "Name".
"timer" is the interval at which that table is loaded again (in order to refresh it in the cache).
That load isn't done by this client library. It's done by a Windows service I wrote that runs in the background.

2. The connection string to the SQL database is used when BTSCachingTasksConfiguration.xml is changed and the data gets loaded again from SQL to Couchbase NoSQL (the cache). The load process is done by the CacheConfig.Operations class.

3. BizTalk works only with signed DLLs placed in the GAC, therefore I put all relevant assemblies there.
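
The CacheConfig.Operations class is not shown in this post, so the following is only a sketch of how a file shaped like the BTSCachingTasksConfiguration.xml example in remark 1 could be read with LINQ to XML. The CachingTask and CachingTaskReader names are hypothetical, and the sketch assumes a single key column per entity:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical in-memory form of one <entity> element
public class CachingTask
{
    public string TableName { get; set; }
    public string KeyColumn { get; set; }
    public TimeSpan RefreshInterval { get; set; }
}

public static class CachingTaskReader
{
    // Parses XML shaped like the BTSCachingTasksConfiguration.xml example.
    // Throws if an <entity> has no <timer> element, because TimeSpan.Parse rejects null.
    public static List<CachingTask> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        return doc.Root
            .Elements("entity")
            .Select(e => new CachingTask
            {
                TableName = (string)e.Element("tableName"),
                KeyColumn = (string)e.Element("keys"),
                RefreshInterval = TimeSpan.Parse((string)e.Element("timer"))
            })
            .ToList();
    }
}
```

For the example file, Parse returns one task with TableName "Products", KeyColumn "Name" and a RefreshInterval of 12 hours.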


What's next?


Part 3 will cover a Windows service I wrote to initialize the cache and refresh it at a configured time interval. I will also go into detail about the Couchbase Administration Console and Map-Reduce functions.

