This document outlines usage for the replication administration tool. Using the admin tool, administrators can add or remove nodes in a replication cluster, alter the node metadata for some or all of the nodes in a replication cluster, and view the existing configuration. Currently, the admin tool allows administrators to perform the following tasks:
- Create a new Cluster Configuration Database (CCDB) with no configuration data
- Create a new replication cluster from existing databases
- Insert new configuration data into the CCDB, augmenting existing data
- Update existing configuration data in the CCDB, changing existing data
- Completely refresh the CCDB with new data, replacing existing data
- Display the current Cluster configuration node topology with all node metadata (e.g. the database connection URL and the replicated tables)
The admin tool uses a command line user interface and takes the following options.
-data FILE: Cluster configuration XML data file. This option also causes the replication schema to be installed on every node specified in FILE (if not already installed), and attaches the replication triggers to all replicated tables.
-initnodeschema: Used with the -data option. Installs the replication schema on each node and adds triggers to the node's replicated tables. Applies to both master and slave nodes in a cluster.
-initsnapshots MASTER, SLAVE, NONE: Used with the -data option. Initializes each slave node's snapshot status using either the existing master for the node's cluster, or the slave itself. Can be one of MASTER, SLAVE, or NONE. MASTER examines the master database for the slave node's cluster and uses the latest row from the snapshotlog. This option is only appropriate if no updates have occurred on the master since the slave was created. SLAVE assumes that the slave node being initialized is a transactional backup of an existing master. The master may have been updated since the transactional backup was created, which is fine. NONE does nothing. This is the default. Use this option if the node being initialized was created from a transactional backup of an existing slave.
-list: Lists metadata for master and slave nodes for all clusters at URL.
-loadschema: When specified, the configuration database schema is applied to URL.
-operation INSERT, UPDATE, CLEAN_INSERT, DELETE: Can be one of INSERT, UPDATE, DELETE or CLEAN_INSERT. INSERT inserts data into the database, leaving existing rows as they are. UPDATE updates existing rows in the database. CLEAN_INSERT removes any existing data in the database, replacing it with the new data provided. DELETE deletes existing rows in the database. This option is only used in conjunction with the -data option.
Configuration database password
-url URL: Configuration database connection URL
Prints this message
Configuration database username
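The option descriptions above imply a few rules: -operation, -initnodeschema and -initsnapshots are only meaningful together with -data, and -operation and -initsnapshots accept a fixed set of values. The following is a minimal sketch of a helper that checks those rules before a command line is assembled. The function name build_admin_args and its parameters are hypothetical and not part of the admin tool; the sketch only builds an argument list and does not run anything.

```python
# Hypothetical helper: validates option combinations described in this
# document and returns an argument list for admin.sh. It does not execute it.
VALID_OPERATIONS = {"INSERT", "UPDATE", "CLEAN_INSERT", "DELETE"}
VALID_SNAPSHOT_MODES = {"MASTER", "SLAVE", "NONE"}


def build_admin_args(url, data_file=None, operation=None, load_schema=False,
                     init_node_schema=False, init_snapshots=None,
                     list_nodes=False):
    # These three options are documented as being used with -data.
    needs_data = (operation is not None or init_node_schema
                  or init_snapshots is not None)
    if needs_data and data_file is None:
        raise ValueError(
            "-operation, -initnodeschema and -initsnapshots require -data")
    if operation is not None and operation not in VALID_OPERATIONS:
        raise ValueError("unknown operation: %s" % operation)
    if init_snapshots is not None and init_snapshots not in VALID_SNAPSHOT_MODES:
        raise ValueError("unknown snapshot mode: %s" % init_snapshots)

    args = ["./admin.sh", "-url", url]
    if load_schema:
        args.append("-loadschema")
    if data_file is not None:
        args += ["-data", data_file]
    if init_node_schema:
        args.append("-initnodeschema")
    if operation is not None:
        args += ["-operation", operation]
    if init_snapshots is not None:
        args += ["-initsnapshots", init_snapshots]
    if list_nodes:
        args.append("-list")
    return args


if __name__ == "__main__":
    print(" ".join(build_admin_args(
        "jdbc:postgresql://localhost:5432/bruce_config?user=lanceball",
        data_file="sample/config.xml", operation="INSERT")))
```

Catching a bad combination before the tool connects to the CCDB is the only point of the sketch; the real tool may perform its own checks.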
With these options in mind, let's look at each of the tasks mentioned earlier and see how to accomplish them. To follow along at home, you'll need a recent build of the replication system, a Java 1.5 JDK, a GCC compiler, and a penchant for adventure.
As noted, you'll need a few things before you can do any of this.
- Replication - Download the latest release from http://svn.yfdirect.net:8080/cruisecontrol/buildresults/bruce-sandbox.
- GCC - Once you have the replication system, you'll need to install the PostgreSQL extensions on the replication database(s) you are using. From the downloaded package, cd into the csrc directory and type "sudo make install". (TODO: This could use more explanation)
- JDK 1.5 - You'll need this to run the admin tool and the replication daemon. For best results, create an environment variable called JAVA_HOME that points to the top level directory of your JDK installation.
Creating a New Cluster Configuration Database
In this use case we simply want to create a new CCDB with no replication clusters configured. This is a very basic task that administrators may want to perform before loading any cluster configuration data into the database. A new database won't let you do anything until you have some configuration data in it, but we'll get to that in time. For now, let's see how to create the new database.
Change into the bin directory and type the following command.
./admin.sh -url <URL> -loadschema
Replace URL with the JDBC URL for your configuration database. Voila! You now have a cluster configuration database installed. As mentioned, this doesn't get you very far. So let's look next at creating a replication cluster from scratch using existing databases.
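For readers unsure what the URL should look like, here is a minimal sketch that assembles a PostgreSQL JDBC URL in the same shape as the ones used elsewhere in this document (jdbc:postgresql://host:port/database?user=name). The helper name postgres_jdbc_url is hypothetical, and the host, port, database and user values are only examples.

```python
from urllib.parse import quote


def postgres_jdbc_url(host, port, database, user=None):
    """Build a PostgreSQL JDBC URL; database and user are percent-encoded."""
    url = "jdbc:postgresql://%s:%d/%s" % (host, port, quote(database, safe=""))
    if user is not None:
        url += "?user=" + quote(user, safe="")
    return url


if __name__ == "__main__":
    # prints jdbc:postgresql://localhost:5432/bruce_config?user=lanceball
    print(postgres_jdbc_url("localhost", 5432, "bruce_config", "lanceball"))
```

When passing such a URL on the command line, quoting it in the shell is a sensible precaution because of the ? character.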
Create a New Replication Cluster From Existing Databases
This command allows a user to create a completely new replication cluster from one or more existing databases. First let's look at the command line.
./admin.sh -url <URL> -loadschema -data <DATA_FILE> -initnodeschema -operation CLEAN_INSERT -initsnapshots MASTER
In this case, admin.sh will make a connection to URL, load the configuration schema onto that database to create a CCDB, wipe out any existing configuration that may have been there before, and load <DATA_FILE> into the CCDB. The -initnodeschema option installs the replication schema and triggers on each node, and -initsnapshots MASTER initializes each slave's snapshot status from the master of its cluster. We've already covered the -loadschema option, so I won't mention that except to say that if you performed the -loadschema command in our first example, you can leave this option out.
The data file that the admin tool uses is a simple XML file that describes a data set. There is an example in sample/config.xml in the replication distribution. A small sample of that file looks like so:
<dataset>
  <table name="yf_node">
    <column>id</column>
    <column>available</column>
    <column>name</column>
    <column>uri</column>
    <column>includetable</column>
    <row>
      <value>1</value>
      <value>true</value>
      <value>Cluster 0 - Primary master</value>
      <value>jdbc:postgresql://localhost:5432/bruce_master?user=lball</value>
      <value>replication_test\..+</value>
    </row>
    <!-- More rows and other tables follow -->
  </table>
</dataset>
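If you would rather generate a data file than edit one by hand, the structure shown above (a dataset element containing table elements, each with column declarations followed by rows of values) can be produced with a few lines of Python. This is a minimal sketch using only the standard library. The function name build_dataset and the output file name my_config.xml are hypothetical; only the yf_node table and its columns are taken from the sample, and a real configuration contains other tables not shown here.

```python
import xml.etree.ElementTree as ET


def build_dataset(tables):
    """Build a <dataset> element.

    tables is a list of (table_name, column_names, rows) tuples, where each
    row is a sequence of values in the same order as column_names.
    """
    dataset = ET.Element("dataset")
    for name, columns, rows in tables:
        table = ET.SubElement(dataset, "table", name=name)
        for column in columns:
            ET.SubElement(table, "column").text = column
        for row in rows:
            if len(row) != len(columns):
                raise ValueError("row has %d values, expected %d"
                                 % (len(row), len(columns)))
            row_element = ET.SubElement(table, "row")
            for value in row:
                # Values are written as text, so pass "true" rather than True.
                ET.SubElement(row_element, "value").text = str(value)
    return dataset


if __name__ == "__main__":
    columns = ["id", "available", "name", "uri", "includetable"]
    rows = [["1", "true", "Cluster 0 - Primary master",
             "jdbc:postgresql://localhost:5432/bruce_master?user=lball",
             r"replication_test\..+"]]
    root = build_dataset([("yf_node", columns, rows)])
    # my_config.xml is an example file name.
    ET.ElementTree(root).write("my_config.xml", encoding="utf-8")
```

The row length check guards against the easiest mistake in this format: values are matched to columns by position only.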
As you can see, the format is straightforward. You declare the tables and columns, and then add rows with values for each column. Using this sample, you can get a replication cluster up and running quickly. Simply alter the database URLs and includetable regular expressions in the data file to point to your database and tables. Then run the command above and start up the replication daemon.
Setting up a replication cluster in this way makes some assumptions, however. The system assumes that all of the slaves are identical to the master in the replication cluster you have configured. If they are not, replication will still work, but any data not in a slave when the admin tool is run will remain only in the master. The slaves will simply begin replicating from that point forward.
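Which -initsnapshots value fits a given slave depends on how that slave was created. The rules from the option description can be written down as a small decision function. This is a sketch only: the function name choose_initsnapshots and the labels "master_backup" and "slave_backup" are hypothetical, and the function encodes nothing beyond what the option description states.

```python
def choose_initsnapshots(created_from, master_updated_since=False):
    """Suggest an -initsnapshots value for a slave node.

    created_from: "master_backup" if the slave is a transactional backup of
    an existing master, "slave_backup" if it was created from a transactional
    backup of an existing slave.
    master_updated_since: True if the master has been updated since the
    slave was created.
    """
    if created_from == "slave_backup":
        # NONE (the default) is documented for backups of an existing slave.
        return "NONE"
    if created_from == "master_backup":
        # MASTER is only appropriate if the master has not changed since.
        return "SLAVE" if master_updated_since else "MASTER"
    raise ValueError("unknown origin: %s" % created_from)


if __name__ == "__main__":
    print(choose_initsnapshots("master_backup", master_updated_since=True))
```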
Insert new configuration data into the CCDB, augmenting existing data
This use case is very similar to creating a new replication cluster from scratch as previously demonstrated. In this case, however, you just want to add configuration for a new cluster or new nodes, so you don't need the -loadschema option, and you'll want to change CLEAN_INSERT to INSERT. The command now looks like this:
./admin.sh -url <URL> -data <DATA_FILE> -operation INSERT
Update existing configuration data in the CCDB, changing existing data
To update existing configuration data with new values, simply change the operation to UPDATE.
./admin.sh -url <URL> -data <DATA_FILE> -operation UPDATE
Completely refresh the CCDB with new data, replacing existing data
If you think about it, this is just like starting from scratch. You are clearing out the existing configuration and then installing a new one. The command is similar to the one shown above for starting from scratch.
./admin.sh -url <URL> -loadschema -data <DATA_FILE> -operation CLEAN_INSERT
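To make the difference between the operations concrete, here is a small in-memory model of how each one changes a table, following the descriptions given for the -operation option. It is an illustration, not the tool's implementation. The function name apply_operation is hypothetical, rows are assumed to be identified by a key such as the id column, and the handling of conflicts (an INSERT for a key that already exists, an UPDATE for a key that does not) is an assumption: the sketch silently skips them, whereas the real tool may report an error.

```python
def apply_operation(existing, incoming, operation):
    """Return the table contents after applying incoming rows.

    existing and incoming map a row key (e.g. id) to the row's data.
    """
    if operation == "INSERT":
        # Add new rows, leaving existing rows as they are.
        result = dict(existing)
        for key, row in incoming.items():
            result.setdefault(key, row)
        return result
    if operation == "UPDATE":
        # Change only rows that already exist.
        return {key: incoming.get(key, row) for key, row in existing.items()}
    if operation == "CLEAN_INSERT":
        # Remove everything, then load the new data.
        return dict(incoming)
    if operation == "DELETE":
        # Remove the rows named in the data set.
        return {key: row for key, row in existing.items()
                if key not in incoming}
    raise ValueError("unknown operation: %s" % operation)


if __name__ == "__main__":
    existing = {1: "Primary master", 2: "Slave Uno"}
    incoming = {2: "Slave Uno (renamed)", 3: "Slave Dos"}
    for op in ("INSERT", "UPDATE", "CLEAN_INSERT", "DELETE"):
        print(op, apply_operation(existing, incoming, op))
```

In this model only CLEAN_INSERT discards rows that are absent from the data file, which is why it is the operation used for a complete refresh.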
Display the current Cluster configuration node topology with all node metadata
This command makes no changes to the CCDB at all. Instead, its purpose is to help administrators keep up with the tables being replicated by each node in the system. The command and some sample output look like this:
mercury:bin lanceball$ ./admin.sh -url jdbc:postgresql://localhost:5432/bruce_config?user=lanceball -list
Name: Cluster 0 - Slave Dos
As you can see, the output displays the cluster name "ClusterOne", the master for the cluster, and each slave. It displays the user-specified metadata for each node, such as the URL and the includetable regular expression. In addition, if a connection can be made to the node, the database is examined and a list of tables that will be replicated by the system is displayed. In the example above, the sample/config.xml file was used to create a new replication cluster from scratch. Each node in the cluster has a schema called replication_test and a table called replicate_this which will be replicated. Note that the admin tool will not create the tables you want to replicate. If you are using the example data file, you'll want to add a replication_test schema and a replication_test.replicate_this table to each node in the cluster.
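Before editing an includetable expression, it can help to check which table names it would select. The sketch below tests the sample expression replication_test\..+ against a few schema-qualified names. Two assumptions are made here and are not confirmed by this document: that the expression is matched against the whole schema.table name, and that Python's regular expression syntax behaves like the tool's for simple patterns such as this one. The function name replicated_tables and the table names other than replication_test.replicate_this are hypothetical.

```python
import re


def replicated_tables(include_pattern, qualified_tables):
    """Return the schema.table names fully matched by include_pattern."""
    regex = re.compile(include_pattern)
    return [name for name in qualified_tables if regex.fullmatch(name)]


if __name__ == "__main__":
    tables = [
        "replication_test.replicate_this",
        "public.users",              # different schema, not selected
        "replication_testx",         # no "." after the schema name
    ]
    # prints ['replication_test.replicate_this']
    print(replicated_tables(r"replication_test\..+", tables))
```

The escaped dot matters: without the backslash, the dot would match any character rather than the separator between schema and table.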