The background of this posting is easily understood: IBM/Lotus announce a new site, and it turns out the site is unreachable, because its DNS settings (that’s the stuff that makes your browser know where to go) are badly broken. To add insult to injury, once their DNS settings are fixed (a trivial change which takes IBM/Lotus several days to make), the next bit of tubing breaks: the actual access to the site is denied with an error.
There is nothing breathtaking about screwing up a site – it happens all the time, and it happens to all of us who work in such areas. But: when we are told something doesn’t work, we unbreak it. Quickly. Very.Bloody.Quickly! And one thing is very important: we don’t advertise a launch before we’ve tested a site. Never. Ever.
An approximate chronology (in CET) of the events that led to a site not working:
-
Tuesday, October 6th.
- Ed Brill (Director, Product Management, IBM Lotus software) announces that developer.lotus.com is now live.
- Volker Weber asks who can resolve developer.lotus.com, and I point out it cannot work. Stefan Rubner agrees.
- Hajo Schulz of Heise answers as well.
- I write a post about the situation.
- Ed Brill is “hearing it might not be resolving for some”.
- Volker directs a tweet to Ed Brill, giving him a hint.
- Somebody else says that OpenDNS doesn’t resolve the site at all.
- Stefan Rubner writes a post.
-
Wednesday, October 7th.
- IBM/Lotus start messing about with TTL (time to live) on their DNS record. The resource record itself is still impossibly broken.
- Chris Linfoot joins in. He correctly diagnoses that, if the DNS record were valid, the developer.lotus.com site would simply perform an HTTP redirect.
- Volker clearly says it.
- Alex retweets.
- Some juicy comments on Stefan’s post.
- Another tweet regarding “unfound site”.
- And again, asking if site is AWOL.
-
Thursday, October 8th.
- Stefan is watching their DNS, and he indicates a change in the TTL. The data itself is still screwed up.
-
Friday, October 9th.
- I reply that IBM/Lotus has changed TTL back to 300.
- I start using a computer for what it is for – to check IBM/Lotus’ DNS automatically with Nagios, with my joke plugin.
- Volker calls out Groundhog day.
- The ComputerWoche retweets Volker’s message and others follow up.
- Volker points out that Google’s Chrome browser has particular recommendations for http://developer.lotus.com/.
-
Saturday, October 10th.
- Chris gives us a wakeup call: the goofed-up CNAME DNS resource record has turned into an A (Address) record with a different value: a new IP address.
- Evidently, the new IP address is live and points to an IBM proxy. I write about the exciting news, because the proxy is borked – it doesn’t do what it is supposed to.
- Volker tweets.
- I quietly update my Nagios joke plugin to check for developer.lotus.com not returning an HTTP 403 status code.
-
Sunday, October 11th.
- Sunday. IBM/Lotus doesn’t seem to work on Sundays.
- At 19:20 CET the 403 error is still deployed. In.Full.Glory.
-
Monday, October 12th.
- Monday. IBM/Lotus doesn’t seem to work on Mondays.
- At 20:30 CET the 403 error remains online. As Volker said earlier: Groundhog Day.
- Oh, and Stefan is right: on the Web site it says “Action: No action is required”. :-)
-
Tuesday, October 13th.
- No change. The site http://developer.lotus.com/ is still broken. Even the error-page has errors on it:
IBM is a very large organization, and there are undoubtedly very many expert technicians who work for them, but they certainly were not available to set up the trivial infrastructure required to get this working.
Zero. Frank has quit. Update: at 17:30 CET, Nagios tells me the site is up.
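
For the curious, here is roughly what a check like the Nagios one mentioned in the chronology could look like. This is not my actual joke plugin, just a minimal Python sketch under a few assumptions: the function names (`resolve`, `http_status`, `evaluate`, `main`) are made up for illustration, the name is looked up through the system resolver (so it only sees whether addresses come back, not whether the record is a CNAME or an A record), and an HTTP 403 is treated as CRITICAL. The exit codes follow the usual Nagios plugin convention of 0 for OK, 1 for WARNING and 2 for CRITICAL.

```python
"""Nagios-style check sketch: does a host resolve, and does it answer without a 403?"""
import http.client
import socket
import urllib.error
import urllib.request

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def resolve(host):
    """Return the sorted IPv4 addresses for host, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(host, 80, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no usable response arrives.

    urlopen follows redirects, so this is the status of the final page.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx answers still carry a status code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return None


def evaluate(addresses, status):
    """Map the DNS and HTTP results to an (exit_code, message) pair."""
    if not addresses:
        return CRITICAL, "CRITICAL: name does not resolve"
    if status is None:
        return CRITICAL, "CRITICAL: resolves to %s but no HTTP response" % ", ".join(addresses)
    if status == 403:
        return CRITICAL, "CRITICAL: HTTP 403 Forbidden"
    if status >= 400:
        return WARNING, "WARNING: HTTP %d" % status
    return OK, "OK: HTTP %d from %s" % (status, ", ".join(addresses))


def main(host):
    """Run both checks for host, print one status line, return the exit code."""
    addresses = resolve(host)
    status = http_status("http://%s/" % host) if addresses else None
    code, message = evaluate(addresses, status)
    print(message)
    return code
```

To let Nagios run it, the file would end with something like `sys.exit(main("developer.lotus.com"))` (plus an `import sys`), since Nagios reads the exit code and the first line of output. Keeping `evaluate` separate from the network calls means the decision logic can be tried out without touching anybody's DNS.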