HA - Clustering ColdFusion Part 1 - Installing CF

Published By: Mike Brunt on Apr 2, 2008 at 8:01 PM

This will be the first post in a series relating to clustering ColdFusion.  In this first series of posts we will be looking at clustering CF at a software level using ColdFusion 8 Enterprise.  Hopefully later on, we can move to a Hardware-Software set-up with examples.

I mentioned in a previous post that what I will detail is drawn from my experience of either creating clusters for clients or working on existing clusters.  There are no doubt other ways to do this.

Firstly, I always create what I call a "master instance", typically the first instance which is created from a multiple-instance install.  Here are some important steps from that...

As the install progresses select "Multiserver configuration".

Install Multi Server

At the point where you are asked to select a web server, select the "Built-in webserver".  We will use this to run CF Admin, and will eventually use the wsconfig utility to connect to our Production web server...

Select Built In Webserver

We let the installation complete successfully, and at this point we have one ColdFusion instance (cfusion) and a JRun instance (admin).  We do not need the JRun admin instance, so we can go into Windows Services and set it to manual start.

Next we take a look at the ColdFusion Administrator GUI on the single instance created during the install.  The thing to note about this instance is the bottom section of the left navigation pane, "ENTERPRISE MANAGER"; within that section there are two subsections, "Instance Manager" and "Cluster Manager".  This section and its subsections will not be present in the instances we create.  This is why I consider this first instance created during the install of CF8 to be a master instance.  Its job from now on will be to manage the cluster.

Next we will create our first instance: go into ENTERPRISE MANAGER > Instance Manager to create it.  Give this instance whatever name you wish; I tend to make it meaningful to the web site it will support and then number the instances {instance_name_1}, {instance_name_2} etc...

Instance Manager

Repeat these steps to create your second instance; once again, you use the master cfusion instance to do that.  Once this is complete you will have two instances in Instance Manager...

Two instances

At this point we have installed ColdFusion and created two new instances; the next step is to create a cluster for these to sit in.

Before we do that, it is a good idea to connect each instance individually to your Production web server, just to make sure that each one is fully functional on its own...

wsconfig

Once we have verified that all instances function as expected/needed we are ready to move on to clustering them, which will be the subject of the second article in this series.
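
One way to run that check repeatedly is a small script that requests a page from each instance and reports which ones answer. The sketch below is only an illustration in Python and is not part of the ColdFusion tooling; the instance names and URLs are hypothetical, so replace them with the names you chose in Instance Manager and the addresses at which each instance is actually reachable.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Hypothetical instance names and URLs: replace them with the names you gave
# your instances and the addresses at which each one is reachable.
INSTANCE_URLS = {
    "mysite_1": "http://localhost:8301/",
    "mysite_2": "http://localhost:8302/",
}


def http_status(url, timeout=5):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout and similar.
        return None


def check_instances(instance_urls, fetch=http_status):
    """Map each instance name to True if its URL answered with HTTP 200."""
    return {name: fetch(url) == 200 for name, url in instance_urls.items()}
```

Calling `check_instances(INSTANCE_URLS)` returns a dictionary of instance name to `True` or `False`. The `fetch` parameter exists only so the check can be exercised without a live server.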

30 Comments

Mike, thanks for starting this series of posts. There's a definite lack of material out there on clustering, and it's something of personal interest right now. I'm looking forward to the posts to come.

Cheers,

David

Posted By: David O Malley on Apr 3, 2008

Great article. I am interested in finding out how many instances you can run per processor/memory/server.

Until recently we have run all of our servers with ColdFusion Standard, and we max out when running 100 websites per server: CF needs to be restarted often, whereas with only 50 sites we never have a problem. JRun seems to use only 1 core of the quad-core processor and we set CF to use the maximum memory of 1GB; the other 3GB just sits there unused.

All the sites use a common set of CFCs / Objects inside a fusebox framework.

I have just installed Enterprise and I am wondering if I should set up 4 instances, all with 100 sites each, or should I set up 400 instances, and then another 400 to cluster them all on the same box?

The more websites we can fit in a 1u rack space the better, but things need to be stable and reliable.

Posted By: Jason Andres on Apr 3, 2008

@david, thanks for taking the time to comment and for your kind words. One of my motivations for this was just as you say the lack of articles covering this subject. Back in the early days of MX 6.1 Brandon Purcell and Frank DeRienzo put some very detailed information out there but not much was posted after that.

Posted By: Mike Brunt on Apr 3, 2008

@jason, same to you; thank you for the comments. I don't know of any way to dedicate an instance to (N) processor(s). I have seen the sort of behavior you mention though regarding one CPU taking most of the load. I have actually seen that with SQL Server too.

When we consider multiple CF instances with a single jvm.config file, every instance will take whatever the heap size is, so if it is set to start at 1GB, 4 instances will take 4GB. So that is the only thing to watch. I would certainly go for multiple instances of CF, and if you need to be more fine-tuned about the heap you can create multiple config files in a single JVM or even multiple JVMs. I will be covering those things in future articles.
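
As a rough illustration of that arithmetic, here is a minimal Python sketch. The function names and the idea of reserving some memory for the operating system are my own assumptions for the example; it also counts heap only, whereas a real JVM uses some memory beyond its heap.

```python
def total_heap_mb(instance_count, heap_per_instance_mb):
    """Heap claimed when every instance shares one heap setting."""
    return instance_count * heap_per_instance_mb


def max_instances(physical_ram_mb, heap_per_instance_mb, reserved_mb=0):
    """Instances that fit in RAM after setting reserved_mb aside.

    reserved_mb is a hypothetical allowance for the OS and other services.
    Only heap is counted here; a JVM also needs memory outside the heap.
    """
    usable_mb = physical_ram_mb - reserved_mb
    return max(usable_mb // heap_per_instance_mb, 0)
```

With a 1GB (1024MB) heap, `total_heap_mb(4, 1024)` gives 4096, which is the 4GB mentioned above, and `max_instances(4096, 1024, reserved_mb=1024)` gives 3.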

Posted By: Mike Brunt on Apr 3, 2008

Can you cover the same topic for Solaris? Thanks

Posted By: Mark Ireland on Apr 8, 2008

Just wanted to quickly say thanks to Mike and Alagad for such a great article, and am really looking forward to the future installments.

Posted By: DaveG on Apr 10, 2008

@Daveg, thanks for your kind comments they are very much appreciated. There is a lot more to come on this subject and I will apologize now if this is somewhat slow coming out.

Posted By: Mike Brunt on Apr 10, 2008

Can someone confirm that JRun clustering is meant to be done vertically (only on a single physical box)? I have two physical web servers fronted by a hardware load balancer and another two physical boxes running CF8 multiserver. I have been unable to find documentation on how to cluster horizontally so that either web server can route requests to either of the physical CF8 servers. Can anyone offer insight on this before I lay out $$$ for Adobe support?

Posted By: MikeR on Apr 11, 2008

@MikeR yes this can be done but it is not easy and you are right there is a great lacking in documentation. Before starting there really needs to be a detailed plan and this tends to be very application-site specific so it is difficult to state here how exactly to go about this. I am not sure if Adobe would have the expertise to help you.

Does anyone else have any other thoughts?

Posted By: Mike Brunt on Apr 11, 2008

At work we have three physical boxes, each with a VM container for CF8 and the software loadbalancer just roundrobins with sticky sessions

Posted By: Mark Ireland on Apr 12, 2008

@Mark thanks for your comments and in my field work I am encountering virtual machines more and more often. Do you use this set-up for QA-Testing or Production and if Production do you have any thoughts on performance?

Posted By: Mike Brunt on Apr 12, 2008

@MikeR, at politico.com I set up 4 boxes of multi-instance CF (3-4 instances each), all under a hardware load balancer.

So, if any instance fails, no problem. If any box fails, no problem.

Posted By: Xung on Apr 30, 2008

Got a bunch of questions for Xung:
- The four boxes are identical?
- Are you using distributed web servers w/ the JRun connector or are you using the built-in web server (JWS)?

And some questions in general:
- What protocol does JRun use to communicate on the proxy port (50010)?

I'm guessing that your setup is HLB -> 4 CF/JWS servers. We were shooting for HLB -> 2 Web Servers (Apache) -> 2 CF servers. Since we can't get the JRun connector/clustering/failover to work properly in this configuration we are putting another HLB between our web and CF servers, using mod_proxy to forward .cfm requests from Apache to the second HLB (sticky sessions all the way through), and are forgetting about clustering/session replication. Final configuration looks like:

External HLB -> 2 Web Servers (Apache) -> Internal HLB -> 2 CF servers

Posted By: MikeR on Apr 30, 2008

Yeah, 4 boxes, almost identical. Two are exactly the same and the others much the same, but it doesn't really matter.

I set it up with the Web Server Configuration Tool. We have IIS and I use that tool to connect.

I am not sure why you would need so many LBs. We just have one (and later on another for backup). The bottom line is that unless everything dies, the site will always be up and running.

Man, if you can get one of these dual quads from Dell with 16GB of RAM you can run like 5-10 instances of CF. Then all you need is 2 boxes!

Posted By: Xung on Apr 30, 2008

So you have IIS+CF running on the same box and replicate that configuration four times?

Do you use clustering or session replication at all? If so, are you able to do so across physical boxes? In other words, if you have boxes A, B, C, and D, and box A goes down, can any of your other servers handle the request from the HLB or is the session lost?

Posted By: MikeR on Apr 30, 2008

Yes, IIS and CF are on the same box, but I do the Web Server Configuration just once. For a box, all CF instances serve from one file location. There is no redundancy. I am not sure if that is what you are referring to. There should be minimum work and moving parts (such as files).

We chose not to use session variables, using domain cookies instead, since we have www and dyn subdomains sitting in different geographical locations. But that option is doable. With sticky sessions you are telling the server to keep the user on the same box. If the box dies then I think you will lose the session. If an instance dies it's OK.

Is that your problem? Are you trying to set things up so that all boxes share the same sessions, so that if a box dies you maintain state? You might be able to, since you can have remote servers within a cluster. Sorry, I am not experienced with this specific method, but it sounds like memcached might need to come into play.
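
To make the sticky-session behaviour described above concrete, here is a toy Python model. It is a sketch under simple assumptions, not how any particular load balancer or JRun is implemented: the class name, the box names and the in-memory "hits" counter are all hypothetical.

```python
class StickyRoundRobin:
    """Toy balancer: round robin for new sessions, sticky afterwards."""

    def __init__(self, node_names):
        # node name -> {session id -> session data held in that node's memory}
        self.sessions = {name: {} for name in node_names}
        self.order = list(node_names)
        self.sticky = {}  # session id -> node it was last pinned to
        self._turn = 0

    def route(self, session_id):
        node = self.sticky.get(session_id)
        if node not in self.sessions:
            # New session, or its node has gone: take the next node in turn.
            if not self.order:
                raise RuntimeError("no nodes left to serve the request")
            node = self.order[self._turn % len(self.order)]
            self._turn += 1
            self.sticky[session_id] = node
        return node

    def request(self, session_id):
        """Serve one request; return (node, hit count held on that node)."""
        node = self.route(session_id)
        data = self.sessions[node].setdefault(session_id, {"hits": 0})
        data["hits"] += 1
        return node, data["hits"]

    def fail(self, node):
        """Simulate a box dying: its in-memory sessions disappear with it."""
        del self.sessions[node]
        self.order.remove(node)
```

With nodes `boxA` and `boxB`, two requests for session `s1` both land on `boxA` and its counter reaches 2. After `fail("boxA")`, the next request for `s1` is served by `boxB` with the counter back at 1, which is the lost session described above.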

Posted By: Xung on Apr 30, 2008

@MikeR and Xung, firstly I want to thank you both for posting comments here; this will surely be helpful to the community. I have been working with clustering with CF since Allaire acquired Bright Tiger and launched Cluster Cats. One thing that I have realized is that Cluster Cats was far more powerful, as it supported pure software clustering without a hardware clustering device (HLB), among other reasons. I have found setting up clustering via CF Admin to be quirky and unpredictable and I am working through a detailed blog piece, which will show how to do this manually via the files that should be written to via CF Admin. As a side note, since Allaire-Macromedia dispensed with Cluster Cats we are using J2EE clustering. It is good to know this, as there is a good deal of documentation out there on this.

Posted By: Mike Brunt on Apr 30, 2008

It appears that most of the documentation out there refers to load-balanced clusters and CF instance (software) redundancy. How does ColdFusion handle clustering and failover between two boxes that are in a server cluster? Do I need to install ColdFusion on both boxes and then register the instance on box B as a remote instance in the CF Admin of box A? Does it realize it is being installed in a cluster, much like SQL Server, and replicate its settings across both servers during the install, or do I have to install it on both boxes as I just alluded to? My question may have been answered in the above comments, but I was just searching for some clarification on this matter. Thanks in advance.

Posted By: Tim on May 6, 2008

@Tim, thanks for your comment-question. In order to try to give you a reasonable answer can you let me know what your current server set up is and also what you are actually wanting to achieve?

Posted By: Mike Brunt on May 6, 2008

@MikeB - Thanks for your willingness to help me with my problem/questions. I will try to explain our server setup. We are running two 2-CPU Windows Enterprise servers in a server cluster formation (i.e. when one fails it transfers the network resources/services over to the other server). These servers talk to a common storage array for creating RAID 5 containers where our SQL Server databases, web files and customer files will be stored respectively (only one server can own the storage disks at a time). We have the cluster working nicely, with IIS and SQL Server 2005 both failing over using manual failover as well as when pulling the network cables to force a failover (it takes about 10-20 seconds to come back up). Now it is time to add our web application layer and we need this to fail over in much the same manner. We are searching for the best installation route and what is needed for ColdFusion to fail over gracefully without interruption of a session beyond the downtime for the failover routine to bring the resources back online on the other server. I have only ever installed ColdFusion using the stand-alone web server, therefore any installation suggestions on how best to fail ColdFusion over from one server to the other would be appreciated. Do we install ColdFusion physically on both boxes? Or is it smart like SQL Server, installing its parameters on both boxes so that changes to one will be reflected on both servers? Do we install CF on both but manage CF instances with the Enterprise Manager in one location? This question may be answered by an answer to the previous question, but what about adding data sources? For example, do we have to add them in the CF Administrator in both locations? I'm sure all of this is beyond a simple answer, but I would welcome any advice and/or resources that may be of assistance to us, as I have found the LiveDocs and CFWACK books to focus more on load-balanced solutions with physical tiering of these layers. Thanks in advance.

Posted By: Tim on May 6, 2008

@Tim sorry for the delay in replying, I think this comment thread could be really useful to the community. I will address some of your points as follows...
My first point-question is are you using a hardware clustering device, if so please state the make-model etc. I suspect, though, that you may be using Windows 2003 software (NLB) clustering, can you confirm this or not, as the case may be?

Posted By: Mike Brunt on May 8, 2008

@Mike - no worries, thanks for your thoughts on this matter. I need to clarify that this is not a load-balanced cluster. It is a Windows 2003 Enterprise cluster set up as a "server cluster" (sometimes referred to as a failover cluster) (2 Dell PowerEdge 1950 Rack Servers each running an instance of SQL Server 2005 and IIS connected via a SAN Drive -MD3000). We use the Windows Cluster Administrator, where we created a group which we titled "IIS group". This group has various items such as common disk drives, IIS, SQL Server, Cluster IP address, website name etc., which all have their various dependencies and can only be owned by one web server at a time. These services are monitored by the Cluster Administrator and if any are not functioning properly it will trigger the failover to the second box. Each of these services must be brought online on the second box in the opposite order to that in which they were taken offline on box 1.

With ColdFusion installed in the multiserver configuration on top of JRun, it appears we need to install ColdFusion on both boxes and create instances of CF from an EAR/WAR file: one instance on Box 1 and one instance on Box 2. Then we create a CF cluster using the Enterprise Manager by giving the cluster a name and choosing a cluster algorithm. We then need to enable session replication and enable J2EE session variables on both instances.

Once the two instances are in a CF cluster, we need to run the JRun Web Connector. This is where (we assume) we put in our Cluster IP address. Since in our configuration only one web server owns all the network resources at a time, JRun will point to that Cluster IP address and determine which instance of IIS is serving up the web requests. Essentially we have a CF cluster with instances spread across box 1 and box 2; these CF instances are talking to each other and, based on the JRun host defined in the web connector, will send all CF requests to that Cluster IP address, which will then determine which server owns the network resources required to serve up the web request.

In our Cluster Administrator we simply have to add jrunsvc.exe as a generic service that we want the Cluster Administrator to monitor. That way, if JRun is not functioning it can trigger the failover much like a SQL Server or IIS failure. We are going to install CF today, and hopefully defining the Cluster IP address in the JRun Web Connector wizard will allow the instance of CF on Box 2 to talk to IIS on Box 1 when it owns the network resources and vice versa (CF on Box 1 talking to IIS on Box 2 when it owns the service).

This post is rather long but I thought I would share some of our conclusions, and I would appreciate any rebuttal of any misunderstandings we may have about this process. Mike, hopefully this helps you understand our "server cluster" configuration a bit better.

Posted By: Tim on May 8, 2008

The last time I created a JRun cluster was on CFMX 6.1. I referenced Brandon Purcell's blog a lot. There, we had 2 CFMX 6.1 servers (with IIS on each) and we used the JRun in-memory session replication. We fronted the entire setup with a software load balancer. Over the course of about 2 years, we had 6-8 outages related to session replication failing between the cluster nodes. We decided to back out of clustering, and we went to a purely load-balanced solution from then on.

Have there been many improvements in clustering in ColdFusion 7 or 8? If so, I might consider it again.

For now, the tradeoff that a few sessions might get lost when we do maintenance on a given node is acceptable for us.

Posted By: Damon on May 8, 2008

@Tim, what you are doing is very interesting and I wish you luck with it; you are correct that ColdFusion will need to be on both Windows servers. I want to comment adequately on the level of detail you give here and, as I did before, I am going to make some assumptions: that you are running the Windows O/S-level clustering in Active-Passive mode. By the way, even though you are only running in a fail-over mode I would still class that as clustering. Here are my comments/questions; your details are enclosed in quotation marks.

"2 Dell PowerEdge 1950 Rack Servers each running an instance of SQL Server 2005 and IIS connected via a SAN Drive -MD3000" Am I correct in assuming these servers currently have both IIS and SQL Server running on them? By the way, what edition of SQL Server do you have, Standard or Enterprise?

With regard to the SAN drive, is all web site code on there? I assume the SQL Server engine is on the two Windows 2003 servers and the logs and databases themselves are on the SAN? Just a point on the databases and logs: the ideal set-up is RAID 1 or 10 for logs and tempdb and RAID 5 or 6 for the databases. Also, are you using Fibre Channel to connect to the SAN and, if so, what level of redundancy do you have? My concern with a SAN is that it can still be a single point of failure if connectivity is not fully redundant.

If my assumptions above are correct, clustering at the ColdFusion level need only be vertical because only one set of resources will be serving production at one time. The caveat to that is in relation to Session variables. ColdFusion-JRun uses the J2EE "buddy" system to replicate variables between cluster members in a cluster. If you go the simpler vertical route (all ColdFusion cluster members on the same physical server) then if that whole system fails and is failed over to the second standby server the ColdFusion sessions will not be there. In order to cover that contingency my suggestion would be to have horizontal clustering, like this...

This may be too late, but if not, make the first install of ColdFusion on each server a "Master" instance as I detail above; do not use an external web server at that point (IIS, Apache etc.). You will have one master instance on each server. At that point create a minimum of two CF instances on each server and give them all unique, meaningful names. If you want load balancing on the Active server, you will need to create a minimum of 4 instances on each server. Let's keep it simple for now and say two on each server. Using the same master instances of CF, create a cluster on each server; once again, the clusters must have unique names. To each cluster add one local instance and one instance from the other server (remote). Typically I would choose Round Robin with Sticky Sessions. However, as only one instance in the cluster will be serving content (Active-Passive), use Weighted Round Robin and give the local cluster member the higher weighting of the two; also enable Session Replication. Connect IIS on each box to the local cluster using the web server configuration tool.
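
To show what weighting the local member more heavily means in practice, here is a generic weighted round robin sketch in Python. It illustrates the general technique only; I am not claiming this is how JRun schedules requests internally, and the member names and the 3:1 weights are hypothetical.

```python
from itertools import cycle, islice


def weighted_round_robin(weights):
    """Return an endless iterator of member names.

    Each member appears `weight` times per cycle, in the order given.
    """
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    if not schedule:
        raise ValueError("at least one member needs a positive weight")
    return cycle(schedule)


def preview(weights, count):
    """Return the first `count` picks, handy for eyeballing a weighting."""
    return list(islice(weighted_round_robin(weights), count))
```

For example, `preview({"site_local_1": 3, "site_remote_1": 1}, 8)` picks the local member six times and the remote member twice.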

Above all and before pushing live, load test everything to make sure fail-over works and sessions are replicating.
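
The difference between vertical and horizontal clustering for replicated sessions can be reasoned about with a very simplified model in which each instance keeps a copy of its sessions on one "buddy" instance. The Python sketch below is only that model, under my own assumptions; it is not JRun's replication code, and the instance, box and session names are hypothetical.

```python
def surviving_sessions(placement, buddies, sessions, failed_box):
    """Return the ids of sessions still available after failed_box goes down.

    placement: instance name -> physical box it runs on
    buddies:   instance name -> instance holding a replica of its sessions
    sessions:  session id -> instance holding the primary copy
    """
    alive = {inst for inst, box in placement.items() if box != failed_box}
    survivors = set()
    for session_id, primary in sessions.items():
        # A session survives if its primary or its replica is still running.
        if primary in alive or buddies.get(primary) in alive:
            survivors.add(session_id)
    return survivors


BUDDIES = {"site_1": "site_2", "site_2": "site_1"}
SESSIONS = {"s1": "site_1", "s2": "site_2"}

# Vertical: both cluster members on the same physical server.
VERTICAL = {"site_1": "box1", "site_2": "box1"}
# Horizontal: one cluster member on each physical server.
HORIZONTAL = {"site_1": "box1", "site_2": "box2"}
```

In this model, losing `box1` leaves no sessions with the vertical placement, while with the horizontal placement both sessions survive because each has a copy on `box2`.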

I have made several assumptions here, which I hope are correct. Please let us know how you go on Tim.

Posted By: Mike Brunt on May 8, 2008

@Damon, thanks for your comments here too. I actually did quite a bit of clustering in CFMX 6.1 and 7 and of course more recently on CF8. In some ways, I preferred the 6.1 set-ups because they were largely manual (non-GUI), except for the JRun part. In CF8, in particular, I have seen the CF Admin Enterprise Manager GUI behave in a quirky manner. As to your question regarding replicating sessions: in all honesty, I have not been around for long periods after setting this up, so I would not have seen the sometimes unpredictable results you saw. I have found it to be difficult to get up and running sometimes, particularly where CF and the web server are on different servers.

For the record, I will be looking closely at this as a member of the OpenBlueDragon Steering Committee.

Posted By: Mike Brunt on May 8, 2008

Mike Brunt Typing...I moved this here as I thought it fits in with this thread...
Hi Mike. I was waiting to read the rest of your series on setting up CF8 clusters but we took the plunge and did it anyway.

The first problem we had was upgrading from CF7 to CF8.01. I ran the installer assuming it would remove CF7 first or just overwrite it. Wrong! Despite specifying the existing installation directories, it was totally unaware of CF7 and we ended up with "500" errors and both CF7 and CF8 shown in Add/Remove Programs (on Win2003).

We restored the web server from the last nightly backup and started over again, this time uninstalling CF7 first and rebooting. Frustratingly the instance we created for clustering was still showing as a service and many files in c:\jrun4 were still there. So the lesson learnt (after restoring the backup again) was to go into CF Admin and delete the instances and cluster first before uninstalling CF because the uninstaller is blissfully unaware of any instances you've created.

The uninstaller appeared to leave references to CF instances in IIS's configuration. Perhaps we should have used the CF Web Config Tool to delete its associations with IIS before uninstalling CF7?

We eventually managed to get CF8.01 installed and working in a 2 server cluster. Windows NLB remained in place to handle IIS load balancing. Weirdly a session would bounce from server A to server B and back again each time we refresh the page in the browser. Round Robin and Sticky Sessions are used so this behaviour shouldn't be happening and wasn't happening in our previous CF7 cluster.

While our application web pages are served quickly the CF Admin is very slow to respond.

Any thoughts on any of the above issues would be great thanks, and I hope some of our experience with upgrading CF in a cluster is useful to someone.

Posted By: Gary Fenton on May 11, 2008

Posted By: Gary Fenton on May 12, 2008

@Gary thanks for this informative comment on what you did. I will give you my opinion on what I would have probably done, with comments.

Upgrading ColdFusion - I always archive then uninstall previous versions of ColdFusion before upgrading. I found it to be best practice. I also check for stray Windows Services and also any keys in the Windows Registry and remove anything referring to ColdFusion or JRun.

Yes, as you point out, I also make sure there are no web sites still connected to CF before the uninstall by running the web server configuration tool (wsconfig) before uninstalling.

An important question: how did you set up that two-instance cluster? Were both cluster members on one physical server (vertical clustering) or were the two cluster members set up one on each physical server (horizontal clustering)? Apart from the version difference (CFMX7 to CF8.01), is anything at all different from your previous cluster set-up?

How are you serving CF Admin, is it through the internal JWS web server or IIS?

Posted By: Gary Fenton on May 12, 2008

@Mike, thanks for your response. You should write an e-book on clustering as there's so little out there to guide people and troubleshoot. I say an e-book because some people may need to obtain the info urgently and you deserve some recompense. :-)

I forgot to look for leftover registry keys. I'll note that for the CF9 upgrade. :-)

We have 2 physical servers running 1 multi-instance each with horizontal clustering (Windows NLB decides which IIS server gets the request). We can't think of anything that's different from this CF8 setup to our previous CF7 setup. CF Admin is run through IIS - I stopped/disabled the Jrun admin service.

Posted By: Gary Fenton on May 13, 2008

@Gary, thank you for your kind comments and suggestion. I forgot to ask one more question: is there only one cluster of two instances? If there is more than one cluster, it is important that they have different names.

Posted By: Mike Brunt on May 13, 2008

@Gary some more questions, sorry I missed these. Is Round Robin with Sticky Sessions enabled both in NLB and the ColdFusion cluster? Also when you say bounced from Server A to Server B, do you mean the physical servers or the ColdFusion instances?

Posted By: Mike Brunt on May 13, 2008
