A Slightly More Resilient JSESSIONID Persistence iRule
One of the nice features built into mod_jk is an automatic retry of a request if a worker is down. It will not necessarily do this by default, but it is a configurable option that a lot of people use in case an upstream Tomcat, JBoss, or GlassFish instance times out on a socket connection attempt.
The standard iRule for JSESSIONID persistence that many of us use does not necessarily have this level of resiliency. Unfortunately, servers and JVMs crash, and the health check (monitor) intervals that we set on our BIG-IPs may not catch an error quickly enough for higher volume sites. Taking the standard http_20_sec monitor built into the BIG-IP as an example, one would need to wait up to 61 seconds before the BIG-IP disables the server failing that check. On a high volume site, 61 seconds' worth of errors when a node or service fails would result in a lot of unhappy users.
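To put a number on that window, here is a rough sketch in plain Tcl (it runs in tclsh, not on the BIG-IP). It assumes http_20_sec uses a 20-second interval and that its timeout follows the usual BIG-IP guideline of three times the interval plus one second. The proc names, the request rate of 500 per second, and the pool size of 4 are hypothetical values chosen only for illustration.

```tcl
# Assumption: timeout = 3 * interval + 1, the usual BIG-IP monitor guideline.
proc monitor_timeout {interval} {
    return [expr {3 * $interval + 1}]
}

# Hypothetical helper: upper bound on requests sent to one dead member before
# the monitor marks it down, assuming traffic is spread evenly across the pool.
proc failed_requests {req_per_sec members interval} {
    return [expr {$req_per_sec * [monitor_timeout $interval] / $members}]
}

puts [monitor_timeout 20]          ;# 61
puts [failed_requests 500 4 20]    ;# 7625
```

Under those assumed numbers, a single failed member could return errors for several thousand requests before the monitor reacts.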
The following iRule will resend a request to another load-balanced node if the selected node is not responding. Please note that this rule also handles the case where a JSESSIONID cookie is set but empty. (In my system administration travels I've found that a quick and dirty way for a developer to log a user out of a Java web application is to clear the session ID, so the rule needs to cope with that.)
when CLIENT_ACCEPTED {
    # Remember the default pool and start a per-connection failure counter
    set mypool [LB::server pool]
    set lb_fails 0
}

when HTTP_REQUEST {
    # Check if the JSESSIONID cookie is present in the request and has a non-null value
    if { [HTTP::cookie "JSESSIONID"] ne "" } {
        # Persist on the JSESSIONID cookie value for X seconds
        persist uie [HTTP::cookie "JSESSIONID"] 2700
    } else {
        # Cookie wasn't set or didn't have a value, so check for the session ID in the URI
        # Note: findstr is case-sensitive; adjust the search string if your container
        # writes the URI parameter as ;jsessionid= in lowercase
        set jsess [findstr [HTTP::uri] "JSESSIONID" 11 ";"]
        if { $jsess ne "" } {
            # Persist on the JSESSIONID URI value for X seconds
            persist uie $jsess 2700
        }
    }
}

when HTTP_RESPONSE {
    # Check if the JSESSIONID cookie is present in the response and has a non-null value
    # (surrounding double quotes are stripped before the comparison)
    if { [string map {\" ""} [HTTP::cookie "JSESSIONID"]] ne "" } {
        log local0. "JSessionID in Response: [HTTP::cookie "JSESSIONID"]"
        log local0. "Set-Cookie: [HTTP::header values Set-Cookie]"
        # Persist on the JSESSIONID cookie value for X seconds
        persist add uie [HTTP::cookie "JSESSIONID"] 2700
    }
}

when LB_FAILED {
    # Count the failure so the retries stop once every active member has been tried
    incr lb_fails
    if { $lb_fails < [active_members $mypool] } {
        LB::mode rr
        LB::reselect pool $mypool
    }
}
The new additions to this rule are the when CLIENT_ACCEPTED section and the when LB_FAILED section. These are also standard iRule code examples used when a load-balanced selection fails. The iRule simply selects another node using a round-robin algorithm when a connection failure occurs. A server whose JVM is offline typically answers the BIG-IP's connection attempt with a connection reset, which is why this iRule works: the reset arrives much sooner than the health monitors can disable the node.
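While testing, it can help to see the reselection happen. The following is a sketch of a variant of the two sections described above, not a replacement for the rule itself: it counts each failure, logs the retry, and answers with a 503 once the attempts are used up. The log message wording and the 503 fallback are my own illustrative choices and assume that a plain error page is an acceptable last resort for the site.

```tcl
when CLIENT_ACCEPTED {
    set mypool [LB::server pool]
    set lb_fails 0
}

when LB_FAILED {
    # Count this failure before deciding whether another attempt is worthwhile
    incr lb_fails
    if { $lb_fails < [active_members $mypool] } {
        log local0. "LB_FAILED #$lb_fails for [IP::client_addr], reselecting from $mypool"
        LB::mode rr
        LB::reselect pool $mypool
    } else {
        # Assumption: a plain 503 is an acceptable last resort once retries run out
        HTTP::respond 503 content "Service temporarily unavailable"
    }
}
```

Capping the retries at the number of active members keeps a request from cycling indefinitely when the whole pool is refusing connections.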
Next up on my to-do list for a more resilient JSESSIONID persistence iRule is a mechanism to handle those cases where a JVM is paused due to long garbage collection times. As written, this rule will continue to route requests to application servers that are hung or paused, because they still accept connections. The solution there might be to lower the accept count or listen queue size on the container's HTTP connector instead of trying to fix this programmatically, but the development teams might have some concerns with that.