DISCLAIMER: In this post, I express my own, somewhat controversial, views about Doxia and APT. These are solely my own views and you should not assume that they represent an official statement from Sonatype.
This Maven Book is created using Maven. Everything you see is produced using Springer’s plugin: http://code.google.com/p/docbkx-tools/. I don’t pass in all of the configuration variables via the pom.xml. We have two stylesheets, html_chunk.xsl and xslfo.xsl, which help us produce a nice-looking PDF and web site from the book. In the next part of this series, next week, I’m going to start blogging about the Maven project we use to manage the book. By then, I’m going to try to have a Maven archetype ready for people who want to produce a book with Maven. I might even put a chapter in the book about using Maven to create a book (recursion). Ultimately, I’d like to help start a few projects that will make it easier for people to write books using the same technologies we use to publish this book. We need to get more developers writing good content; there are too many technical books being written by people who haven’t had a day of real coding experience.
We use DocBook. The original idea was to use APT, but once I started working on the book, I insisted on DocBook to the surprise of many people involved with the effort. In this post, I explain why I think DocBook is the best choice for writing a book.
I’m going to spoil the party for APT lovers. APT is impossible. I’m convinced that APT is the reason why most Maven documentation and many Maven sites are terrible things to try to read. I’d encourage anyone trying to use the Maven Site Plugin to dump it and start using the XSite plugin. Don’t be afraid of HTML and markup; Maven sites would, in general, look more natural and simpler if you didn’t have to suffer through all the canned copy and left navigation menus. It is impossible to use APT to write a book with styled cross-references, a good index, and appropriate inline styles. The same goes for the ability to differentiate between a listing of source code and a numbered example, for variable lists, and for the differences between a chapter, a part, and a preface. All of these things come with DocBook.
That being said, DocBook tools are a terrible curse. The editor I use is XMLMind. Not only is XMLMind not free, it is about as usable as Emacs on a keyboard that has a broken Meta key. But you get used to it, and you learn to be productive. It takes a year: you initially swear it off, but then you come back to it and admit defeat by purchasing it and learning how to convince it to cooperate. In two years’ time, you’ll start to respect XMLMind and you might even start to customize some of the key bindings. In other words, it ain’t easy. But this brings me to my next point…
…writing a book is not easy
When a bunch of developers (I still consider myself a developer, not a writer) get together and decide to write a book, there’s this underlying tension. A developer’s job is tough enough; they don’t want the writing process to start siphoning already scarce time off of the development cycle. The initial reaction is to choose some technology like APT because it is easier to write simple things with simple markup. This works for a while: you’ll write a few chapters and you might even start to develop innovative little plugins to include source code, etc. But as the content grows in size, and you start getting ready for print production, you’ll start to think about things like:
- Cross-References
- A large book without cross-references is about as useless as it gets. If I’m in Chapter 5 Section 3 and I want to reference Chapter 1 Section 1.2, and I don’t have a way to exactly specify an element in a document, what happens when I move a chapter around or when I want to insert a section before Section 3? Sure, I can develop a facility within some Doxia engine to allow me to reference a section of a document, but then I’ll want to do things like customize the text of the reference. Maybe half the time, I’ll want to say “See Section 15.1 for more info”, but just as often I might want to say “See Section 1.5 Aggregating Stuff for more info”. The point here is that cross-references are increasingly important for PDF, HTML, and print output alike, and the only way to equal what comes out of the box with DocBook is to add more hacks to APT and customize the engine that reads it.
- Inline Styles
- This is probably the one thing that throws most developers-turned-writers into a tailspin: the idea that every command, classname, code reference, and variable reference has to have a different inline style. This takes most people a few weeks to get the hang of, but once you start doing this, you’ll start to realize that it is essential to making readable technical content. Pick up any O’Reilly book, and you’ll notice that it contains a heavy amount of inline styling: classnames are in a fixed font, and they are differentiated from commands on the command-line. We don’t just do this because we like to be fancy; we do this because it is a subtle hint to the reader that eases comprehension. It is also something that requires different markup elements in the book’s source. There are classname, methodname, varname, and code elements in DocBook to handle this. Not so in APT. Because APT is solely focused on presentation, you can’t embed semantic meaning within it. You can’t say, “this is a classname”; instead, in APT you say, “make this italic” or “make this bold”.
- Print Production
- The publisher I’ve worked with formats the book in DocBook before they send it off to the presses. There’s a lengthy production process during which the book is converted to DocBook (if it isn’t already in DocBook) and someone goes through and makes sure that the book has all the right inline styles. Then someone goes through and marks up all the index terms (indexing is an arduous and mind-melting experience BTW). I prefer to produce a product that doesn’t require too much manual futzing after I deliver it. I understand that the production dudes need to tweak the content a bit, but I prefer the idea that my stuff doesn’t have to go through some sort of filter before it gets to the real content. More on this in later parts of this series…
- Formatting for Print/Web/PDF
- Sure, I understand that I can get some APT stuff to spit out a PDF and a web page. But can I tell it what section level I want it to descend to when computing the contents of a table of contents? Can I put a watermark on the output and put a disclaimer in the header of the preface to signify that the output is an alpha release (something I need to do)? Can I generate endnotes? How about footnotes? I could go on and on and on about things DocBook can do that APT can’t. It all really boils down to tools and the fact that, with DocBook, I’m capturing more than just syntax: DocBook is semantic, and there are a whole host of tools out there that let me convert that source into good-looking output. I’m sure someone is going to comment that all of this is possible with some sort of customized Doxia plugin (see above: I think Doxia should be thrown overboard).
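To make the cross-reference and inline-style points above concrete, here is a minimal sketch of what the markup can look like. It assumes DocBook 4-style id attributes (DocBook 5 would use xml:id) and the xrefstyle attribute as interpreted by the standard DocBook XSL stylesheets. Every id, title, class name, and sentence in it is made up for illustration and is not taken from the book.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical chapter: all ids, titles, and names are illustrative only. -->
<chapter id="ch-multimodule">
  <title>Multi-module Projects</title>

  <section id="sect-aggregating">
    <title>Aggregating Stuff</title>
    <para>
      <!-- An index term is marked where the concept is discussed. -->
      <indexterm><primary>aggregation</primary></indexterm>
      Run <command>mvn install</command> from the directory that holds the
      parent <filename>pom.xml</filename>.
    </para>
  </section>

  <section id="sect-inline">
    <title>Inline Markup</title>
    <para>
      <!-- Semantic inline elements: the stylesheet decides how each looks. -->
      The <classname>SimpleParser</classname> class has a
      <methodname>parse</methodname> method that reads its
      <varname>input</varname> argument.<footnote><para>A footnote is just
      another element in the source.</para></footnote>
    </para>

    <!-- Reference by id, so moving or renumbering sections does not break it. -->
    <para>
      See <xref linkend="sect-aggregating" xrefstyle="select: label"/>
      for more info.
    </para>

    <!-- Same target, but asking for the number plus the section title. -->
    <para>
      See <xref linkend="sect-aggregating" xrefstyle="select: label title"/>
      for more info.
    </para>
  </section>
</chapter>
```

The exact text generated for each xref (punctuation, quoting, whether sections are numbered at all) depends on the stylesheet and its parameters, so treat the two variants as the general idea rather than a guaranteed rendering.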
You could hack up APT so much that it closely approximates DocBook. You could muck around with the various Maven plugins involved in the process to make it easy to include code samples and snippets... or you could use the tools and technologies which already exist. I’m no big fan of reinvention, so for me, the solution was to use DocBook. Furthermore (ugh), hacking APT to the point where it supported a feature set similar to DocBook would’ve meant making APT more like DocBook. By definition, I don’t think you can write a book which requires this much semantic stuff in a wiki-like format without making the wiki-like format more trouble than it is worth.
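As an example of leaning on tools that already exist, the table-of-contents depth and draft watermark questions from the list above are handled by parameters in the standard DocBook XSL stylesheets. The following is a rough sketch of a customization layer for the FO (PDF) output, not the stylesheet this book actually uses. The import URL is the generic DocBook XSL location (a Maven plugin may expect a different import URI), and the watermark image path is a hypothetical placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Start from the stock FO stylesheet and override only what we need. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Number sections so cross-references can print labels like "Section 1.5". -->
  <xsl:param name="section.autolabel" select="1"/>

  <!-- How many section levels the table of contents descends to. -->
  <xsl:param name="toc.section.depth" select="2"/>

  <!-- Mark the output as a draft and put a watermark behind each page. -->
  <xsl:param name="draft.mode" select="'yes'"/>
  <!-- Hypothetical image path, shown only for illustration. -->
  <xsl:param name="draft.watermark.image" select="'images/alpha-watermark.png'"/>

</xsl:stylesheet>
```

A parallel layer for the chunked HTML output would import the HTML stylesheet instead and set its own parameters.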
…stop trying to make it easy…
Even when I wrote Jakarta Commons Cookbook in Word, it was far from easy. There was an ultra-nifty (but very complex and unstable) set of VB macros which were used to manage cross-references and inline styles. There was a whole host of keyboard shortcuts, etc. For a 400-page book, I had to split the document into per-chapter DOC files and have every document open in Word in order for cross-references to render properly. It wasn’t an uncommon experience for Word to just blow up and refuse to respond. That was about four years ago; in the intervening years there have been various efforts to simplify the process and move to different tool platforms.
Books have been written in OpenOffice. (And, yes, there are books that have been written in APT.) A few people have tried to write books using collaborative web applications. There has been this persistent idea that people could collaborate on a Wiki and produce a great book, etc….
For me, the most difficult part about writing a book isn’t the technology used to write it. From a wiki-like markup to etching every word onto a stone tablet, for me the most difficult part of writing is the process itself. I use a difficult tool, XMLMind, to write, but I spend most of my time writing, and rewriting, and rewriting, and rewriting, and proofreading, and rewriting, and rewriting…
And writing about technology for a tech audience isn’t easy. I guess what I’m trying to say is: stop using tool selection as an excuse to procrastinate and get down to the business of writing. Writing isn’t easy; in fact, it is just as difficult as (maybe a little more difficult than) writing code. Don’t shy away from using professional writing tools even if they are not easy. Writing a book isn’t easy; it’ll drive you crazy. I promise.