New Location

My website has moved to http://www.jasonwhaley.com. Please visit there for the latest and only remain here for legacy content.

Tuesday, November 18, 2008

Bad Appserver, No XSD!

This past Thursday I ran into quite a snafu with the application server bundled with Day Communique (CQ).

I was in the midst of a late-night deployment involving other, unrelated applications. Nothing involving CQ or our code running in CQ was to be touched - only a separate web application that happens to be housed in the CQ application server. As such, this was supposed to be a relatively painless deployment.

During some downtime in the middle of the deployment, I peeked into some of my non-work IRC channels, including ##java on irc.freenode.net - a channel dedicated to Java programming that is normally raucous and condescending by day but rather calm by night. While trying to help folks there, we quickly discovered that java.sun.com, the source of all downloads and documentation related to Java development, was unresponsive to HTTP requests.

Then at 4AM a nightly backup of one production server and the authoring application (the part that allows content editing) kicked off. Since we knew this would happen mid-deployment, that production server had already been taken out of the production rotation. It did not come back online, however, which meant we had no failover for the server that was not being backed up. While a couple of engineers and QA folks started pounding at the other apps, I frantically searched the logs but couldn't figure out what had happened to this one server. About three minutes later I inspected the Author - it too had failed to restart. None of the usual application logs were even being written. "What the hell" was pretty much the only thought going through my mind - nothing on this server had even changed!

A colleague and I then took a quick look at the actual application server's log and noticed these lovely entries:
18.11.2008 01:09:02 *WARN * servletengine: Unable to locate internal resource: /resources/xsd/web-app_2_5.xsd
18.11.2008 01:09:02 *WARN * servletengine: Entity unresolved: publicId="null", systemId="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
18.11.2008 01:09:07 *WARN * servletengine: Unable to locate internal resource: /resources/xsd/javaee_5.xsd
18.11.2008 01:09:07 *WARN * servletengine: Entity unresolved: publicId="null", systemId="http://java.sun.com/xml/ns/javaee/javaee_5.xsd"
18.11.2008 01:09:07 *WARN * servletengine: Unable to locate internal resource: /resources/xsd/javaee_web_services_client_1_2.xsd
18.11.2008 01:09:07 *WARN * servletengine: Entity unresolved: publicId="null", systemId="http://java.sun.com/xml/ns/javaee/javaee_web_services_client_1_2.xsd"
18.11.2008 01:09:07 *WARN * servletengine: Unable to locate internal resource: /resources/xsd/jsp_2_1.xsd
18.11.2008 01:09:07 *WARN * servletengine: Entity unresolved: publicId="null", systemId="http://java.sun.com/xml/ns/javaee/jsp_2_1.xsd"
18.11.2008 01:09:07 *WARN * servletengine: Unable to locate internal resource: /resources/xsd/javaee_5.xsd
18.11.2008 01:09:07 *WARN * servletengine: Entity unresolved: publicId="null", systemId="http://java.sun.com/xml/ns/javaee/javaee_5.xsd"
Laughter ensued - we had an application server, in production, hosting a critical app, that failed to start on a fairly vanilla installation. It didn't bundle local copies of the XSDs referenced in the declarations of typical XML files (not bundling them is a no-no), and it couldn't fetch them from java.sun.com for XML parsing. Epic fail.
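
For the curious, here is a rough sketch of what "bundling the XSD locally" can look like with a plain SAX EntityResolver. I have no idea how CQ's servlet engine resolves schemas internally, so treat this as an illustration under assumptions rather than a fix for CQ: the class name LocalSchemaResolver and the helper localPathFor are made up, and the only details taken from the log above are the java.sun.com URL prefix and the /resources/xsd/ path.

```java
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical resolver: serves Java EE schemas from copies bundled on the
// classpath so that parsing a web.xml never depends on java.sun.com being up.
final class LocalSchemaResolver implements EntityResolver {

    // Remote location used in the deployment descriptors (seen in the log).
    private static final String REMOTE_PREFIX = "http://java.sun.com/xml/ns/javaee/";

    // Assumed classpath location of the bundled copies (path taken from the log).
    private static final String LOCAL_PREFIX = "/resources/xsd/";

    // Maps a remote schema URL to the classpath path of its bundled copy,
    // or returns null when the URL is not one of the Java EE schemas.
    public static String localPathFor(String systemId) {
        if (systemId == null || !systemId.startsWith(REMOTE_PREFIX)) {
            return null;
        }
        return LOCAL_PREFIX + systemId.substring(REMOTE_PREFIX.length());
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws SAXException {
        String path = localPathFor(systemId);
        if (path == null) {
            // Not a schema we bundle: returning null tells the parser to use
            // its default resolution for this entity.
            return null;
        }
        InputStream in = LocalSchemaResolver.class.getResourceAsStream(path);
        if (in == null) {
            // Fail fast on a packaging mistake instead of silently going to the network.
            throw new SAXException("No bundled copy of " + systemId + " at " + path);
        }
        InputSource source = new InputSource(in);
        source.setSystemId(systemId); // keep the original id so relative imports still resolve
        return source;
    }
}
```

A parser would pick this up through something like XMLReader.setEntityResolver(new LocalSchemaResolver()) or DocumentBuilder.setEntityResolver(...). With the schemas on the classpath, a missing file becomes a packaging error you find at build time, not a 4AM outage caused by someone else's web server.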

Thankfully, within the next 30 minutes java.sun.com became responsive again and the production server and its Authoring counterpart could be restarted. This isn't the first time something with Communique has made me scratch my head and say WTF, and it won't be the last. The sad part here is that Communique is the lone commercial, non-open-source product that we use internally, and we pay good money for it. Yet we are subject to things like this...
