What Is It, Really?

“What is it, really?”

Four words. Those four words are like a thought grenade. The ignition point for a boondoggle.

It starts with smart software engineers. Smart and bored.

They’re using a software library or tool. Something with multiple open source and commercial solutions. Good solutions, really. Lots of customers. And the engineers answer their own question with a gross oversimplification.

What is a name server, really? It’s basically a lookup table.

What is an object cache, really? It’s… well, it’s another lookup table.

What’s an ORM, really? It’s a map between SQL result sets and object fields.

Everything is just a map! And those are all real examples. I’ve seen companies that:

  • ignored OpenLDAP, Netscape Directory Server, and Active Directory to write their own name server
  • ditched Ehcache to write their own cache, which crashed the app under any serious load, and nobody knew why (multiple companies have done this)
  • ditched Hibernate to write all their queries by hand, seemingly unaware that they could specify SQL to override Hibernate for the 20% of queries that needed optimization. And since they supported MSSQL, Oracle, and MySQL, they wrote their DAOs 3 times.
  • wrote their own version of Struts with some extra features; then they were stuck on a proprietary Struts 1 long after Struts 2, Spring MVC, etc. came out.
  • wrote their own terribly designed version of portlets/JSF/etc. that nobody in the company understood after the creator left (and even he was shaky on what he had done)

Some of these examples are old, but I bet you have stories of recent ones. I’d love to hear them (just keep it classy).

I’ll admit, often it’s less boredom than intimidation. Sometimes the question arises because the library or framework you’re using doesn’t do what you need it to. So you request the feature, and the maintainers respond, “That sounds great! The source is over there, let us know when you’ve added it.”

And you don’t even look at the source. I mean, it’s gotta be crazy complex. It already does so much. You’re not sure where to start. The developer contribution guide is scant and/or years old.

So you start rationalizing. You’re using just a small part of this thing. How hard would it be to recreate that? You’d understand all that code because you wrote it. And you could add that extra feature you needed.

But you’re vastly underestimating the problem. To start with, the corner cases. I remember a story from Jamie Zawinski about the Netscape/Mozilla rewrite [1]. A couple devs were reimplementing the FTP functionality. They had taken a few weeks and had a question about an edge case. He helped them, but the real issue was that the original code was gnarly because it had taken its authors 6 months to find and handle all the edge cases. And the new devs were ignoring the original code because it looked icky.

Similarly, I remember warnings from the Hibernate guys about how freaking hard ORM is. So many edge cases, brain-racking design decisions, and endless debates. If you are thinking of whipping out your own, then you don’t realize how hard it is.

Then there are multithreading issues. You discover you need concurrency, but haven’t been burned by it enough to have internalized the best practices. So there will definitely be bugs. App-crashing bugs.
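To make the concurrency trap concrete, here is a minimal Java sketch of the kind of thing a homegrown cache gets wrong. Everything in it is made up for illustration (the SketchCache class, the host name, the returned address); it is not how Ehcache or any real library works. It only shows one hazard, the check-then-act race, and one standard-library way around it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache, for illustration only.
final class SketchCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SketchCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // The tempting version on a plain HashMap is check-then-act:
    //   V v = map.get(key);
    //   if (v == null) { v = loader.apply(key); map.put(key, v); }
    // Two threads can both see null and both load, and unsynchronized
    // writes to a HashMap can corrupt its internal state.
    V get(K key) {
        // computeIfAbsent applies the loader atomically, at most once per key.
        return entries.computeIfAbsent(key, loader);
    }

    // Starts several threads that all ask for the same key and
    // returns how many times the loader actually ran.
    static int demoLoadCount(int threads) {
        AtomicInteger loads = new AtomicInteger();
        SketchCache<String, String> cache = new SketchCache<>(host -> {
            loads.incrementAndGet();
            return "10.0.0.1"; // made-up value standing in for a slow lookup
        });
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> cache.get("db.internal"));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loader calls: " + demoLoadCount(8));
    }
}
```

And this covers only one race. Eviction, expiry, memory limits, and invalidation are still unsolved, which is roughly the point.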

Enterprise software companies seem particularly prone to all this. Perhaps because the sales division loves proprietary tools and lock-in. And this gives them more of that.

What I am NOT saying:

I am not saying don’t innovate. Or that you can’t improve things or come up with better products.

If you want to create a new open source competitor, go for it. A number of ORMs came before and after Hibernate, both open source and commercial. That’s great.

If you think you can build a product and sell it, even though there’s competition, go for it.

If you need a small piece of a bloated dependency, and you can knock this out with unit tests pretty quickly, then OK.

Are you brilliant, working with other brilliant folks who will vet this idea? And it’s for something of massive scale, like Google, FB, Amazon, or MS would need? Then I understand.

What I am saying is that building a one-off of a sizeable, complex component, for just your project, will waste tons of money and become a huge regret for all involved. And it’s always done because of ignorance.

There is another way. It doesn’t involve you, and that might be a huge selling point.

As a manager, if I have the budget for a new, complex subsystem, I have the money to go to the maintainers of the project causing you grief and say, “Hey, if you agree this is a good idea, how much would I have to pay you to implement it? Are there committers who are available and want to be paid fairly to make this better?” Almost certainly yes. Maybe there is commercial support. The work would be blessed in advance and fast-tracked for review.

At a minimum, you can hire them to write a proper contribution guide and code walkthrough so your devs don’t crap their pants at the prospect of contributing.

I have a feeling that in projects like Linux, this happens fairly often. And it has got to be cheaper and cause fewer problems in the long run. But when it comes to developer frameworks and libraries, reinventing the wheel seems like too much fun to pass up.

If you liked this, you’ll appreciate What’s the Developer Experience?

Thanks to Dave Ford and Kiran Manur for hilarious, head-shaking discussions about this. And to Joel Spolsky for probably writing about this 15 years ago.

  1. I’m almost certainly misremembering this, but some fine young cannibal will correct me. I couldn’t find a reference, so it was probably in a book. Coders at Work?
