Creative Commons License
This work is licensed under a Creative Commons Attribution - Noncommercial - No Derivative Works 3.0 United States License.



















My thoughts on best practices in software architecture and development as a whole (with an emphasis on Java/J2EE).

Thursday, June 15, 2006

Why documentation matters - intent and abstraction

OK, so I know I'm swimming against the tide on this one, so let's just say it: "I'm in favor of documentation for software". Eek! Yowza!
Now I don't mean docs for users (although those are important). I mean MS Word documents with nice box and UML diagrams for developers - not just javadoc.

I'd say of all the developers I've worked with, probably 80% dislike (if not disdain) documentation and probably 70% don't even see a need for it. Maybe a little javadoc but hey the signature
"public void doStuff(Object o) throws Exception"
is 100% clear right? WRONG!
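
To make that concrete, here is a rough sketch of what the bare signature can't tell you. The class name `AuditLog`, the rules about null and the "in-memory only" behavior are all invented for illustration; the point is only that the intent lives in the comment, not in the signature.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical example: the class name and the rules below are invented
 * to show what a bare signature cannot tell a reader.
 */
final class AuditLog {
    private final List<String> entries = new ArrayList<>();

    /**
     * Appends one entry to the in-memory audit log.
     *
     * Intent: callers use this to record that something happened, not to
     * persist it; nothing is written to disk here.
     *
     * @param o the event to record; its toString() is what gets stored
     * @throws IllegalArgumentException if o is null (a null event is a caller bug)
     */
    public void doStuff(Object o) {
        if (o == null) {
            throw new IllegalArgumentException("event must not be null");
        }
        entries.add(o.toString());
    }

    /** Number of entries recorded so far. */
    public int size() {
        return entries.size();
    }
}
```

Same method name, same single argument, but now a reader knows what counts as valid input, what fails and what the method is *supposed* to do.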

For managers the percentages are a little better (60%/50% like documentation), but not by much, and mostly it's because they want to know exactly what's going on and *NEED* documentation.

I see several aspects of documentation as very important and useful to developers primarily and to the enterprise at large secondarily:

1) New Developers / Testers on your team
So you show some 250 kLOC to a new senior engineer, give them a sparse functional spec and ask them to go code some new feature!?!?! Personally, I think the learning curve will be huge.
Which components do they need to change?
Which components do they need to ignore?
Which interfaces do they need to extend?
How do they test it?
What standards need to be followed - syntax, threading, resources etc.?

Now I'm not talking about creating a 500 page CMM tome - just, say, a 20-40 page document describing layers, the high level architecture and what components do what.

I mean it's easier to read a 20 page architecture / design doc than it is to read 250 kLOC!

2) Intent
OK so the code is self-documenting and debugged, right? So I don't need documentation? Well, no! If you've read Knuth you know that very few programs are provably correct (in the real world, hardly any beyond a few hundred lines). So your code does what you want it to do? That's great, but does it do what it is *SUPPOSED* to do?
Well maybe you know (or will after more testing) but will anyone else?
More importantly, what *YOU* think you need to implement and what your tech lead / architect needs you to implement can be very different.
Documentation can help elucidate the intent of your program.

3) Clarification

This is similar to #2 but the audience is different - it's YOU - the developer! That's why writing is useful - putting stuff down on paper removes some of the "abstract-ness" and "fuzziness" and makes things more concrete. That is, it makes things seem more clearly right, clearly wrong or clearly incomplete. OK, writing can also make clear things fuzzy, but it helps, excuse the pun, keep folks on the same page.

4) Abstraction
Remember 7+/-2 - Miller's magic number - the rough limit of short-term memory, the number of "chunks" you can think about at once? Even OOP is a nod to that - it's easier to think about discrete objects than about the entirety of the myriad attributes and methods.

But say we have 15-20+ objects (and probably more) how do you begin to think about those?
At some point with systems bigger than 30-40 classes, a document has to be put together to describe the overall architecture/design at a fairly high level (layers, packages etc.) with the intent of each.

So I think documentation is important and I'm always wary when I see companies, teams or processes that "just don't DO" documentation because they consider it a waste of time.
Usually it's a bad smell - a sign or symptom of more serious issues.

Side note
Computer languages keep moving closer and closer to everyday English (compare Assembly to C to Java). Why? Because the code is easier to write and read. Why does code need to be easy to write *AND* read!?!?!? - So you can be sure it's doing what it's bloody well supposed to! :-) I know very few people who truly write self-documenting code (they can read it sure, but can you?) so a high level description is warranted.

So what documents do I prefer and suggest people write? Well, there are two:
#1 - The Architecture and Design document
Describe the layers, the objects, what patterns you follow and most importantly the intent - what are the layers, objects and patterns *SUPPOSED* to do. Also for decisions out of the "norm" - document *WHY*, not what. For example:
Q: Why do you limit the connection pool to 3 connections?
A: The legacy database tends to deadlock otherwise.

So if MAX_CONNECTIONS=3 in the code that's fine - but if a new developer is on the team and they have a performance issue - you *KNOW* the first thing they're gonna do is bump that to 5 or 10 and waste their time and your time doing something that won't work.
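
Here is a minimal sketch of how that *WHY* might also sit right next to the constant. Only `MAX_CONNECTIONS` and the deadlock reason come from the example above; the class name `LegacyDbPoolSettings` and the `effectivePoolSize` helper are made up for illustration.

```java
/**
 * Hypothetical settings holder. Only MAX_CONNECTIONS and its reason come
 * from the Q and A above; every other name here is an assumption.
 */
final class LegacyDbPoolSettings {
    /**
     * WHY 3: the legacy database tends to deadlock with more concurrent
     * connections. Raising this to chase a performance problem will not
     * help; read the Architecture and Design document before changing it.
     */
    static final int MAX_CONNECTIONS = 3;

    private LegacyDbPoolSettings() {
        // Constants and static helpers only; not meant to be instantiated.
    }

    /** Clamps a requested pool size to the documented limit. */
    static int effectivePoolSize(int requested) {
        if (requested < 1) {
            throw new IllegalArgumentException("pool size must be at least 1");
        }
        return Math.min(requested, MAX_CONNECTIONS);
    }
}
```

The comment won't stop anyone from editing the number, but it does mean the new developer finds the reason in the same place they find the setting.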

#2 - The Cookbook
OK, now that a developer knows what each piece does (and has probably forgotten half of it by the time they reach the end) - what do they actually do with that knowledge?
What are the common extension / maintenance points and how does the developer get it done? What are good examples and what are the potholes to watch for? What tests should they run?
A cookbook can answer that.

Again I'm not a guy who *LOVES* to write nothing but documentation (think CMM - yikes!) but a 20 page document with a few UML diagrams for your 250 kLOC can be banged out in 2-3 days.

CommonCounterArg1: But applications change quite a bit over time . . .
So code changes over time and documentation starts to "rot"? Sure, that happens, but every man-month of developer effort probably means only a page or two of changes to the docs - and if you are changing more than that, then something was probably wrong with your original requirements or design. Overall I think the gain outweighs the cost.

CommonCounterArg2: But XP abhors "Big Design Upfront" . . .
Sure it does, and that probably works for a crack team of rock-solid, highly experienced developers, but most teams I've seen are a mix of skills and capabilities with 15-25% turnover a year. On a team of 16 developers that means 2-4 new people *EVERY* year - just how many times do you want to explain the build process, the classes, where everything is?

As a nod to tendencies (and painful experience) I'm a big fan of diagrams - UML and others - if I see a page of nothing but text I'll tune out. So again when I say 20-30 pages - you're free to add diagrams liberally throughout as long as they help explain *what* it does, what it is *supposed* to do and *why*.

-Frank

3 Comments:

Blogger Frank Kelly said...

Funny - a few days after I published this, one of my favorite blog discoveries of the past few weeks has discussed this
http://www.codinghorror.com/blog/archives/000616.html

In particular, there's a quote from Joel Spolsky they cite -
"The difference between a tolerable programmer and a great programmer is not how many programming languages they know, and it's not whether they prefer Python or Java. It's whether they can communicate their ideas. By persuading other people, they get leverage. By writing clear comments and technical specs, they let other programmers understand their code, which means other programmers can use and work with their code instead of rewriting it. Absent this, their code is worthless"

I agree 100% with this

6/21/2006 6:31 PM

 
Blogger Per Olesen said...

I tend to think you are right. But one really has to be focused on keeping it short.

Personally, I'm not that big a fan of UML. Maybe a simple class diagram for documenting the domain model.

Focus when documenting should be on describing the more stable parts of the system: Domain model and overall architectural choices. This will also keep changes over time down, as these parts do not change very often (or at least, they are supposed to not change very often) and at the same time are the most important ones to document.

10/23/2006 2:32 AM

 
Blogger Ryan Cooper said...

I do agree that a small amount of documentation is usually quite valuable. However, I agree so strongly with using documentation to answer the question "Why?" rather than "What?" that I would alter the focus of the four main points you make.

In other words, to get new devs/testers up to speed, have them work one-on-one with experienced devs/testers. You can't ask a document questions; you can ask an experienced team member questions. Greater communication bandwidth = shorter ramp-up time. I can't tell you how much time I've seen wasted by new devs trying to figure out a 20 page design document when they could have gotten the same information from a half-hour chat with a senior dev. It's good to have some high-level doc to keep everyone on the same page w.r.t the 'whys' of the architecture, but not the 'whats'. Also, having a one-on-one conversation will probably take less time and be more enjoyable to the senior dev than writing an architecture document.

Likewise, intent and clarification are best achieved foremost by clear unit tests. Unit tests make great documentation (especially if they are developed in a test-first manner). Again, traditional documentation plays a supporting role here, emphasizing 'whys', not 'whats'.

Abstraction is best tackled with good OO design (see Domain Driven Design by Eric Evans), although this of course won't work for all situations. Our designs are never perfect, and traditional documentation is great for expressing why things are in the current state, and what design direction we want to move in.

In all these cases, traditional documentation should be a last line of defense, not a first line of defense.

Of course, all the things I've mentioned above count on the fact that developers think of the code/tests themselves as documentation, and focus on making them as clear and self-explanatory as possible. This leaves a smaller cognitive gap for traditional documentation to fill. When this gap is small, it makes it much easier for said documentation to stay current and useful.

10/23/2006 1:04 PM

 
