Starting with my “CentOS 6 + EMS” setup, I need to obtain a decent FT/HA demo setup for test purposes.
My clients are often trying to improve EMS availability in case of a disaster or a simple software error. Not because EMS itself is unstable, but mostly to manage the risks related to MOM rollover (or FT), HA and DR. TIBCO does provide guidelines on the subject, but no precise recommendation on which combination of OS and distributed FS to use. Additionally, TIBCO experts usually push for file stores over DB stores. They state that DB stores should not be considered, for performance reasons, even though a DB store strikes me as the simplest possible FT setup.
Many clients are then encouraged to create a software or hardware file-sharing system to allow multi-site FT. This article describes how to implement such a solution FOR TEST PURPOSES using a FREE SOFTWARE distributed filesystem (NFS4 on CentOS 6). The other popular option is to go with specialized HARDWARE SAN solutions. For budget reasons, I wondered whether I could create a setup that performs well enough without going in that expensive direction.
According to the official EMS documentation, a distributed EMS fault tolerance back-end file system must satisfy four criteria: write order, synchronous write persistence, distributed file locking and unique write ownership.
Would CentOS 6 + NFS4 comply with all these rules? I honestly don’t know. TIBCO does not provide a precise list of free software or hardware storage solutions that strictly follow these guidelines. They point out that NFS4 could comply in some instances, if all of the above principles are respected.
DISCLAIMER : If TIBCO cannot certify an OS/FS pairing for EMS FT, I certainly can’t either. PLEASE CONSIDER THE FOLLOWING HOW-TO AS A SUGGESTION OF TESTING CONFIGURATION. PLEASE DO NOT APPLY THIS SOLUTION IN A PRODUCTION ENVIRONMENT WITHOUT PROPER TESTING.
Side-line editorial : I believe the future is full of easily distributed MOMs, and that this kind of difficult setup will disappear over time. Currently, some MOMs (IBM MQ, RabbitMQ) don’t make FT that hard to implement (but do not support WAN connections). In contrast, some MOM FT setups are similar to the EMS one (ActiveMQ).
To summarize, I aim to create something like this:
As written above, start by going through my CentOS/EMS how-to and my CentOS firewall how-to.
Then, I suggest following an article like this one from cyberciti.biz, or this one from digitalocean.com, to set up NFS4.
On my VM image, here is what it looked like:
yum install nfs-utils nfs4-acl-tools portmap
Ok, nothing to do then… moving along to the stores sharing.
Let’s edit /etc/exports and add this:
vim /etc/exports
# add this line to allow all hosts in the VirtualBox subnet (assuming your address is in 10.0.2.*, like mine)
/opt/cfgtibco/tibco/cfgmgmt/ems/ 10.0.2.0/24(rw,sync,no_root_squash)
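If you prefer to script this step instead of editing the file by hand, here is a minimal sketch. The helper name add_export is my own invention (it is not part of nfs-utils), and it only appends the line when the directory is not already listed, so it can be run several times safely.

```shell
# add_export FILE DIR CLIENTS OPTIONS
# Hypothetical helper: appends "DIR CLIENTS(OPTIONS)" to FILE unless DIR
# is already the first field of an existing line.
add_export() {
    file=$1; dir=$2; clients=$3; opts=$4
    if [ -f "$file" ] && awk -v d="$dir" '$1 == d { found = 1 } END { exit !found }' "$file"; then
        return 0    # already exported, nothing to do
    fi
    printf '%s %s(%s)\n' "$dir" "$clients" "$opts" >> "$file"
}

# Possible usage, as root (paths and subnet are the ones used in this article):
#   add_export /etc/exports /opt/cfgtibco/tibco/cfgmgmt/ems/ 10.0.2.0/24 rw,sync,no_root_squash
#   exportfs -ra    # re-read /etc/exports
#   exportfs -v     # list what is currently exported
```

The comparison on the directory is an exact string match, so the path must be written the same way each time (with or without the trailing slash).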
Turn on the relevant services:
chkconfig nfs on
service rpcbind start
service nfs start
Just to be on the safe side, I explicitly disable NFS2 and NFS3, as mentioned here. In file /etc/sysconfig/nfs, I added (uncommented):
MOUNTD_NFS_V2="no"
MOUNTD_NFS_V3="no"
RPCNFSDARGS="-N 2 -N 3"
…and restarted
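To double-check that only NFS4 is still being served after the restart, the kernel NFS server reports its enabled versions in /proc/fs/nfsd/versions (something like "-2 -3 +4 +4.1"). The small function below is a sketch of how that string could be checked; only_nfs4 is a name I made up, and I am assuming the file has that "+N / -N" format on your kernel.

```shell
# only_nfs4 "VERSIONS"
# Hypothetical helper: succeeds when the given versions string enables at
# least one NFSv4 variant and enables neither v2 nor v3.
only_nfs4() {
    v4=1                          # becomes 0 once a "+4" token is seen
    for tok in $1; do
        case $tok in
            +2|+3)   return 1 ;;  # an old protocol version is still on
            +4|+4.*) v4=0 ;;
        esac
    done
    return $v4
}

# Possible usage on the "Primary" VM, while the nfs service is running:
#   only_nfs4 "$(cat /proc/fs/nfsd/versions)" && echo "NFSv4 only"
```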
Now, let’s open the pertinent ports. If you refer to my firewall how-to, you know I use the iptables-restore command to load my config. Edit /root/iptables-save.txt and add a line like this one:
-A INPUT -p tcp -m tcp --dport 2049 -j ACCEPT
Then, we validate and save:
iptables-restore < /root/iptables-save.txt
iptables -L
/sbin/service iptables save
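One thing that can bite here: iptables evaluates rules in order, so the ACCEPT line for port 2049 is useless if it sits after a catch-all REJECT or DROP rule in the INPUT chain. The sketch below checks a saved rules file for that. The function name is hypothetical, and the check is deliberately simplistic (it only looks at plain "-A INPUT" lines and treats any REJECT/DROP without a --dport as a catch-all).

```shell
# port_rule_effective FILE PORT
# Hypothetical helper: succeeds when FILE contains an INPUT ACCEPT rule for
# the given destination port that appears before any catch-all REJECT/DROP.
port_rule_effective() {
    file=$1; port=$2
    awk -v p="$port" '
        /^-A INPUT / && /-j (REJECT|DROP)/ && !/--dport/ { blocked = 1 }
        /^-A INPUT / && /-j ACCEPT/ && $0 ~ ("--dport " p "( |$)") {
            if (!blocked) ok = 1
        }
        END { exit !ok }
    ' "$file"
}

# Possible usage, before running iptables-restore:
#   port_rule_effective /root/iptables-save.txt 2049 || echo "check rule order"
```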
The “Primary” machine is almost ready… the EMS server is still working (just start it to validate), and it is using the file stores locally, without NFS. We must modify the configuration slightly to allow the “Secondary” machine to share the JSON configuration file.
Note : This is really the big revolution of the new EMS configuration file format (JSON). In the past, each machine in the cluster had an almost identical “tibemsd.conf” set of files (some could be shared, but usually not the main one). This made no sense, since the vast majority of the data was the same for every machine in the cluster. Now the configuration file is shared, and only includes slight differences when using FT. (EMS User Guide, “Configuring Fault Tolerance in Central Administration”, page 539)
I use the EMS CA interface to change the relevant configuration:
To create the EMS “Secondary” machine, I suggest (all steps detailed below):
Before cloning, I built this script (tibemsd64-2.sh in EMS “bin” folder), and exposed it as a shortcut on the desktop:
cd /opt/tibco/ems/8.1/bin
./tibemsd64 -config "/opt/cfgtibco/tibco/cfgmgmt/ems/data/tibemsd.json" -secondary
Here is the shortcut command:
gnome-terminal -e "gnome-terminal -e /opt/tibco/ems/8.1/bin/tibemsd64-2.sh"
Reference : (EMS User Guide, “Starting Fault Tolerant Server Pairs”, page 109)
When I first started my VMs, I noticed they had the SAME IP address (10.0.2.15)… this is where I realized that I needed to switch the networking mode of the VMs from “NAT” to “NAT Network”.
As simple as that, but now sadly the IPs have changed ! I end up with 10.0.2.4 and 10.0.2.5.
Another side-effect… The port-forwarding settings (tutorial here) of each separate VM are lost; they have to be re-implemented on the NAT network itself. As such:
I have to adjust the JSON file by hand, since the EMS server and EMSCA processes won’t start without proper listeners.
Here is my updated JSON file:
{ "acls": [], "bridges": [], "channels": [], "durables": [], "emsca": { "advanced": [], "appliance_options": { "store_paths": [] }, "emsca_listens": [{ "url": "tcp://10.0.2.4:7222" }, { "url": "tcp://10.0.2.5:7222" }] }, "factories": [{ "jndinames": [], "name": "ConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [] }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "generic", "url": "tcp://7222" }, { "jndinames": [], "name": "FTConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [] }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "generic", "url": "tcp://localhost:7222,tcp://localhost:7224" }, { "jndinames": [], "name": "SSLConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [], "ssl_verify_host": false }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "generic", "url": "ssl://7243" }, { "jndinames": [], "name": "GenericConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [] }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "generic", "url": "tcp://7222" }, { "jndinames": [], "name": "TopicConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [] }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "topic", "url": "tcp://7222" }, { "jndinames": [], "name": "QueueConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [] }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "queue", "url": "tcp://7222" }, { "jndinames": [], "name": "FTTopicConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [] }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "topic", "url": "tcp://localhost:7222,tcp://localhost:7224" }, { "jndinames": [], "name": "FTQueueConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [] }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "queue", "url": "tcp://localhost:7222,tcp://localhost:7224" }, { "jndinames": [], "name": "SSLQueueConnectionFactory", 
"ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [], "ssl_verify_host": false }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "queue", "url": "ssl://7243" }, { "jndinames": [], "name": "SSLTopicConnectionFactory", "ssl": { "ssl_issuer_list": [], "ssl_trusted_list": [], "ssl_verify_host": false }, "ssl_issuer_list": [], "ssl_trusted_list": [], "type": "topic", "url": "ssl://7243" }], "groups": [{ "description": "Administrators", "members": [{ "name": "admin" }], "name": "$admin" }], "model_version": "1.0", "queues": [{ "name": ">" }, { "name": "sample" }, { "name": "queue.sample" }], "routes": [{ "name": "EMS-SERVER2", "selectors": [], "url": "tcp://7022" }], "stores": [{ "file": "meta.db", "file_crc": false, "mode": "async", "name": "$sys.meta", "type": "file" }, { "file": "async-msgs.db", "file_crc": false, "mode": "async", "name": "$sys.nonfailsafe", "type": "file" }, { "file": "sync-msgs.db", "file_crc": false, "mode": "sync", "name": "$sys.failsafe", "type": "file" }], "tibemsd": { "authorization": false, "console_trace": null, "detailed_statistics": "NONE", "flow_control": false, "ft_activation": null, "ft_active": null, "ft_heartbeat": null, "ft_reconnect_timeout": null, "ft_ssl": { "ssl_ciphers": null, "ssl_expected_hostname": null, "ssl_identity": null, "ssl_issuer_list": [], "ssl_password": null, "ssl_private_key": null, "ssl_trusted_list": [], "ssl_verify_host": null, "ssl_verify_hostname": null }, "jre_options": [], "log_trace": null, "logfile": "/opt/cfgtibco/tibco/cfgmgmt/ems/data/datastore/logfile", "logfile_max_size": null, "max_connections": 0, "max_msg_memory": "512MB", "max_stat_memory": "64MB", "msg_swapping": true, "multicast": false, "password": null, "primary_listens": [{ "ft_active": true, "url": "tcp://10.0.2.4:7222" }], "rate_interval": 3, "routing": false, "secondary_listens": [{ "url": "tcp://10.0.2.5:7222", "ft_active": true }], "server": "EMS-SERVER", "server_rate_interval": 1, "ssl": { "ssl_cert_user_specname": 
"CERTIFICATE_USER", "ssl_dh_size": null, "ssl_issuer_list": [], "ssl_password": null, "ssl_rand_egd": null, "ssl_require_client_cert": null, "ssl_server_ciphers": null, "ssl_server_identity": null, "ssl_server_key": null, "ssl_trusted_list": [], "ssl_use_cert_username": null }, "statistics": true, "statistics_cleanup_interval": 30, "store": "/opt/cfgtibco/tibco/cfgmgmt/ems/data/datastore", "tibrv_transports": null, "track_correlation_ids": null, "track_message_ids": null }, "tibrvcm": [], "topics": [{ "name": ">" }, { "exporttransport": "RV", "name": "topic.sample.exported" }, { "importtransport": "RV", "name": "topic.sample.imported" }, { "name": "sample" }, { "name": "topic.sample" }], "transports": [{ "daemon": null, "name": "RV", "network": null, "service": null, "type": "tibrv" }], "users": [{ "description": "Administrator", "name": "admin", "password": null }, { "description": "Main Server", "name": "EMS-SERVER", "password": null }, { "description": "Route Server", "name": "EMS-SERVER2", "password": null }] }
On the “Secondary” VM :
rm -Rf /opt/cfgtibco/tibco/cfgmgmt/ems
mkdir /opt/cfgtibco/tibco/cfgmgmt/ems
# as root, we mount the "root"
mount -t nfs4 -v 10.0.2.4:/opt/cfgtibco/tibco/cfgmgmt/ems /opt/cfgtibco/tibco/cfgmgmt/ems
To make the mount automatic at boot, add this line to /etc/fstab:
10.0.2.4:/opt/cfgtibco/tibco/cfgmgmt/ems /opt/cfgtibco/tibco/cfgmgmt/ems nfs defaults 0 0
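Since the “Secondary” server must never start against an empty local folder, it may be worth checking that the share is really mounted as NFS4 before launching it. Here is a rough sketch that reads /proc/mounts; the function name is mine, and I am assuming the filesystem type shows up there as "nfs4", which is what I would expect on CentOS 6.

```shell
# is_nfs4_mounted MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: succeeds when MOUNTPOINT is listed with filesystem
# type "nfs4" in MOUNTS_FILE (defaults to /proc/mounts).
is_nfs4_mounted() {
    mp=$1; mounts=${2:-/proc/mounts}
    awk -v mp="$mp" '$2 == mp && $3 == "nfs4" { found = 1 } END { exit !found }' "$mounts"
}

# Possible usage at the top of a start script on the "Secondary" VM:
#   is_nfs4_mounted /opt/cfgtibco/tibco/cfgmgmt/ems || {
#       echo "EMS share is not mounted, not starting tibemsd" >&2
#       exit 1
#   }
```

The mount point has to be given without a trailing slash, because that is how it appears in /proc/mounts.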
The one thing remaining is to update the GEMS configuration (see first tutorial):
Then, to be sure, let’s restart both VMs, validate the automatic NFS link, and start both servers before starting GEMS:
I hope this testing rig will be useful to you !