Last week we released VMware vFabric Postgres 9.2. If you havent tried it out, do try it out it only takes 5 minutes if you have any VMware virtualization platform (vSphere ESXi, vCenter Server, VMware Workstation, VMware Fusion or VMware Player (free)).


This week as I prepare my slides and resources for the VMware Partner Exchange boot camp on Feb 28 (Thursday) at 8:30AM   titled:

CAS1503 - High Availability, Replication and Read Scaling with Virtualized Postgres

I realized that I need to first introduce the typical use cases in a datacenter which helps as the backdrop for the presentation.


Currently databases are considered to be the most battle tested platforms in any data center. As enterprises move towards virtualization, production databases are generally the last applications to be virtualized. PostgreSQL is no exception. Many enterprises do the "Fork-Lift" upgrade whenever change a platform. Today we introduce ways specially with VMware vFabric Postgres (vPostgres) 9.2 which is based on PostgreSQL 9.2 just as concrete example of how Fork-Lift upgrade are things of the past and "Hybrid" existence is the current preferred way.


So lets consider that an enterprise A has some production installations of PostgreSQL databases running on physical machines using Linux 64-bit. What would be the right way to even think of virtualize that workload.


STEP 1: Begin with Physical Database to Virtual Database Replication


vPostgres 9.2 uses the PostgreSQL 9.2 core with the same database file format. It also uses the same streaming replication of PostgreSQL 9.2 allowing streaming replication to vPostgres 9.2  to work (as long as the minor verions - third digit also matches on both sides)

PostgreSQL 9.2 (physical) -> vPostgres 9.2 (virtual)


Also vPostgres 9.2 ships Linux RPM packages for RHEL 6.1 or greater, SLES 11 SP2 so if you have an existing PostgreSQL 9.2 database files then it can be even installed on RHEL on physical instances to replicate to vPostgres on a virtual instance.

vPostgres 9.2 (physical) -> vPostgres 9.2 (virtual)



The obvious questions are (a) Why should I use Replication? and (b) Why should I use Virtualization?


Lets answer them seperately first


(a) Why should I use Replication?


Postgres replication gives two main benefit - HA/DR and Read-Scaling

     HA/DR - High Availability and Disaster Recovery is acheived  by having a Hot Standby version of the production database always available to take over when "stuff" happens. Postgres gives various options of achieving SLAs depending on priorities (response time or data super guarantee) using synchronous streaming replication or asynchronous replication.


Also when organizations typically use Hot Standby, they do not want the standby server to be idle. They also want to be offload read-only queries to some extent to the standby server thus freeing up write contentions in the master (specially in asynchronous mode) and thus allowing the master database to handle more write requests. In fact many online retailers end up using more than 1 read servers since most of their users browse a lot compared to actually using their credit cards.  In such scenario it is frequently observed to be using more than 1 read servers and that too located closer to the application servers using the read data.


(b) Why use I use Virtualization for Replications?


Ideally a database replica slave server resources should match the master database server resource specification. This is needed to make sure that the replica can always catch up with the transactions happening on the master database. However always doing that is not economical. One would want the flexibility to set the same resources but if required be able to tweak the resources for running some end of the month job and then get back to the regular resource allocation. Plus Virtualization allows a better platform to even test out HA/DR without completely wasting the replica.


(Well the real question is why VMware Virtualization. :-)

VMware vSphere 5.1 allows a virtual machine to be as big as 64 vCPUs with 255GB-1TB of RAM for the virtual machine. This allows your database replica to always catch up to your Master. Also using linked clones it is easier to test out DR testing in promoting the slave to a new master.


Further using VMware vFabric Postgres 9.2 Virtual Appliance it is even easier to just tweak the resource settings and then it automatically recalculates the resources with the appliance to make sure the vPostgres database is always running at optimum values for those resource settings.


Also PostgreSQL 9.2 now allows cascading replication which means you could have


PostgreSQL 9.2 or vPostgres 9.2 (physical)  -> vPostgres 9.2 (sync or asnyc replication)  ->  vPostgres 9.2 Read Slaves



STEP II: Migrating to Virtual Database to Virtual Database Replication


Once STEP I has shown enough operational evidence of running Postgres database in virtual environment or if a hardware failure hits the physical host, then vPostgres 9.2 could be promoted instantly to master using a command line operation and now it will be completely in a Virtual Database to Virtual Database Replication.