Years ago, SD Magazine used to feature a section in the October issue on software horror stories. In honor of one of my favorite (since departed) publications, I’d like to continue the Halloween tradition by reliving some of the worst screwups and dumbest decisions made in the career of yours truly. Since I moved back to Austin, I’ve started meeting some of my friends from a previous job in what I’d put forward as the worst software organization I’ve ever been associated with. They’ve been kind enough to remind me of some of these episodes.
- What do you mean, backup? I started my career as an engineer in Houston, but promptly got into project automation, writing shadow IT apps for my old engineering team. My first job was writing a tool that could troubleshoot an extremely problematic data exchange. I solved the problem more or less by writing a system that worked in parallel, but worked correctly, and compared my results to the actual results of the real system. It was a complete success. My first programming triumph. The engineering teams wanted access to the correlation reports themselves, so I then added an ASP application that would allow users on the engineering projects to kick off the data comparisons. On my box. On my puny Compaq workstation running underneath my desk. Which was so overworked by my little shadow IT application that it threw a head through the hard drive. The hard drive that had the only copy of the code.
- Sev 1 Trouble Tickets? I’ve got your ticket right here. On my very first big project, and as a lead no less, I decided that it would be a cool thing if we automatically sent a trouble ticket to support whenever our application logged an error. Sounds good, except we didn’t really specify which errors sent the tickets, or how many times. On the first day it rolled out, in a day that was otherwise very successful, we experienced a minor configuration flaw that caused the application to write an exception record to a table in the database.
  - Which caused an After Update trigger to fire.
  - Which created and posted a trouble ticket to support in a stored procedure.
  - With the dreaded Sev 1, OMG the factories are down!, kiss your raise goodbye level.
  - Every 5-10 seconds or so as the polling thingie encountered the exact same innocuous problem.
  - About 300 times until we fixed the configuration.
- Integration by Hook or Crook. Of everything I’ve ever done, the one thing that embarrasses me more than anything is my first big integration project. We needed to integrate with an externally facing B2B server. I was offered a choice by my buddy who was in charge of the B2B infrastructure. We could either do it with MQ Series, which neither one of us had ever used, or we could just use a queuing table in my Oracle database with nice simple mechanisms that we all understood. I naturally said that I’d man up and do MQ Series. We were already using the built-in job scheduling in Oracle and Java Stored Procedures, so I just assumed that we’d use JMS inside the Oracle database. All the samples looked very simple, but it turned out that we had to be running the Oracle database under a certain configuration, and the DBAs were adamant that they weren’t able to effect that change late in the game. For a variety of reasons (ignorance, technical limitations, and haste), we ended up:
  - Having a stored procedure called from Oracle’s job scheduler on a polling interval…
  - Do an HTTP post to an ASP page on our website…
  - Which would gather up the correct data from our database and post to yet another web page on another server (because, surprise, we couldn’t make DCOM work consistently)…
  - Where that ASP page would call a COM component that could finally put the data into an MQ Series queue.
  - Of course, we had good instrumentation and distributed transactions throughout – NOT!
  - It was just a little bit buggy in production.
- SetConnectionString.asp. I have a really bad taste in my mouth from all of the times I’ve had to interact with centralized architecture teams. Nothing will freeze my blood in fear more than hearing the phrase “you have to use our standard framework for Logging / Persistence / Security / Etc. from our architecture team.” I’ve had that feeling reinforced at 6-7 different places, but it started with a simple tool we had to use from our architecture team to store database connection strings. It was kind of a pre-DPAPI implementation where the data was encrypted in the registry using some sort of NT user information as the key. Basically, the only NT user that could read or write to the registry settings was the currently logged in user. Great for security, but the registry settings got corrupted every couple of days. There wasn’t a useful tool for setting the connection string in the first place, so I wrote an ASP page that would let you type the connection string into a textbox, then submit the data to the server, where the connection string would be written into the registry under the correct user name. Because this was one of those incompetent shops with horrendous configuration management policies, we inevitably had to play the “is it pointed at the right database?” games. I naturally changed my SetConnectionString.asp page to show you the current connection string. I’m sure you know what happened next. That page managed to get deployed to production. A couple years after I left, they ran some sort of security scan that crawled the website and tried to access each page automatically to verify that it was secured. Somehow, this security scan submitted the form with a blank connection string and effectively broke the application. The mission critical, shut the factory down application.
- Workflow? Nobody has ever done Workflow before like me. One of the things we needed to do was provide some workflow capabilities for supply chain dispute resolution between us and our supply chain partners. As one of the architects on the project, I was tasked with determining a strategy for this functionality. I naturally decided that we should code it ourselves, because, you know, there’s only a half billion existing 3rd party workflow tools out there. I delivered a working solution, more or less on time – at the cost of my sanity, mental health, and the respect of my development team. At its peak the system had 70+ metadata tables that could be, or had to be, configured to make the system work. Let’s just say that the system was a bit over-engineered.
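
For what it’s worth, the ticket flood in the Sev 1 story is the kind of thing a small throttle can guard against: send one ticket per distinct error, then stay quiet for a while. What follows is only a minimal sketch in Python, not anything from the original system (which used a database trigger and a stored procedure). The names `TicketThrottle`, `should_send`, `simulate`, the `"config-error"` key, and the 15 minute window are all hypothetical, chosen for illustration.

```python
import time


class TicketThrottle:
    """Suppress repeat tickets for the same error within a time window."""

    def __init__(self, window_seconds=900.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be simulated
        self._last_sent = {}  # error key -> time the last ticket went out

    def should_send(self, error_key):
        """Return True if a ticket for this error should go out now."""
        now = self.clock()
        last = self._last_sent.get(error_key)
        if last is not None and now - last < self.window_seconds:
            return False  # same error, still inside the window: stay quiet
        self._last_sent[error_key] = now
        return True


def simulate(poll_interval=5.0, polls=300, window_seconds=900.0):
    """Count tickets sent when the same error recurs on every poll."""
    fake_now = [0.0]  # hypothetical fake clock, advanced by hand
    throttle = TicketThrottle(window_seconds, clock=lambda: fake_now[0])
    sent = 0
    for _ in range(polls):
        if throttle.should_send("config-error"):
            sent += 1
        fake_now[0] += poll_interval
    return sent
```

With a poll every 5 seconds and a 15 minute window, `simulate()` reports 2 tickets for 300 occurrences of the same error, rather than 300. A real version would also want to decide which errors deserve a Sev 1 at all, which is a policy question the throttle does not answer.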
Next week I’ll have more fun talking about the ones that I only observed, plus another post about the horror stories inflicted upon me by someone else – usually one of those centralized architect types.