Troubleshooting JGroups & Multicast IP
Although the JGroups manual implies that you should start looking for a new job if you cannot figure out what is wrong with JGroups and multicast IP, I find that in the real world we systems administrators support many different types of "clusters": Oracle RAC clusters, GFS clusters, VCS clusters, Microsoft clusters, hardware clusters, load-balancing clusters with mod_jk or mod_proxy, and so on. A JGroups cluster is simply one more type of cluster we support, and its default configuration uses multicast IP, which is not something we run into every day.
I've already written a couple of short posts on JGroups problems (see JBoss: Clustered Node Startup Failures and JBoss: Overlooked Solution for JGroups-Related Startup Errors), but I thought a short troubleshooting guide might be helpful. This post consolidates those two posts and adds some recently discovered information. The HOWTO leans towards running JGroups services under JBoss on a Unix-like platform but, for the most part, the information in it applies to Windows as well. I've not had much opportunity to work with JGroups over a WAN or across multiple VLANs, so if that is your setup, this post may not be all that helpful. But if you are having problems with a JGroups cluster over a Cisco LAN and all your servers are on the same network segment (a fairly common deployment configuration), this HOWTO is for you!
Let me know if you've come across other causes of multicast IP failures with JBoss/JGroups and I'll add them to the HOWTO.
Here is the link to the new doc => HOWTO: Troubleshoot JGroups and Multicast IP Issues