XML Is Here to Stay, Part Deux (Attack of the Small Publisher)
In my last post, I wrote (heck, I guaranteed) that XML wasn’t going anywhere. I’m usually not such a big trash talker, but I firmly believe this, mostly because you can use XML to future-proof content, and because any structured tagging you put in your content can still be leveraged even if XML goes away. Which it won’t (I know, nice English).
In the comments section to that post, Thad McIlroy posed an excellent question that, I think, deserves its own post. This post is an answer to his question:
“What do you advise your colleagues in smaller publishing companies who wish to fully embrace XML and its associated technological challenges?”
OK, a lot to discuss here, and I am grateful for the opportunity to opine on this topic. My recommendation can be broken down into the following points, which I describe further below:
1. Start small.
2. Get help.
3. Embrace standards if your content fits.
4. Understand that everything has a version number.
Start Small
One of my favorite expressions is “Rome wasn’t tagged in a day.” That is basically XML geek speak for the idea that you don’t have to tag EVERY piece of content in your organization to get started with XML. You can start with one title or a family of like titles, and see how it goes, learning the whole time about what works and what doesn’t work.
Speaking of favorite XML expressions, there was another relevant old saying from SGML times that went: “The first edition will always lose money.” Given the investment in document analysis, DTD writing and various levels of support required for an XML implementation, your first implementation will probably not make money. But the second and third will. And anything else you want to do to repurpose that content (other output formats, combination products, licensing, etc.) will most certainly be done more cost effectively, and may even be made financially possible because of XML.
This is an important point, because there always has to be a balance between the investment you are putting into an XML implementation and the payoff on the back end. If you want a huge payoff, you had better put in the proper amount of effort. This is just as true for large companies as it is for small, but I’ve lost track of the number of times I’ve seen this ignored. With a small company, limited resources for an XML effort are often what ensures that things start small. OK, so maybe that’s circular logic, but I think the point stands: starting small makes sense for small companies and big companies, and I always feel that building incrementally with an eye toward constant improvement is a better approach than the “Big Bang” method.
Get Help
This is one of my favorite parts about the XML community, and it’s not just because they are my people. In general, people within the XML community are more than happy to help, answer questions, etc. Probably because of the nature of an open, standards-based community, there are a ton of resources ready, willing and able to answer questions about implementation, large or small. I’m not going to list names here because I would probably slight a friend of mine and I don’t want to get that, “Hey, why didn’t you mention me?” e-mail. But they’re out there.
Now, there certainly are companies and individuals who make their living consulting on XML implementations, and if the budget can afford these resources, there are some really good ones. The bad news is that there are also some bad ones (like in any industry). The best advice is to talk to people in circumstances similar to yours and get personal recommendations. This is very much a “word-of-mouth” industry, and that’s for good reason.
Embrace Standards if Your Content Fits
Starting from scratch is always more difficult, and if you’re a small publisher getting started in XML, it really doesn’t make much sense. Isaac Newton had it right with the whole “standing on the shoulders of giants” thing, didn’t he? This is a well-paved road by now in XML land, and as a result there are a number of standards (DTDs, frameworks, etc.) that can help. Using the “get help” advice previously mentioned can also help with this, but do NOT feel like you have to re-invent the wheel in order to get going with XML. The smart use of previously existing standards is always going to pay benefits, and then if you need to customize to fit your specific requirements, you are that much further along.
I feel compelled in this section to put a plug in for a Working Group I’m chairing on behalf of the Book Industry Study Group around standardization of book content models. We’re just getting started, but I think the Working Group has huge potential. One of our end goals is to help with this exact problem—helping people make sense of what is out there in terms of existing standards, and figuring out how to get started. It’s a subset of BISG’s Digital Standards Committee, and you can follow the progress on the BISG website.
That being said, there are a few XML standards out there that can certainly help publishers—DocBook, ePub, the National Library of Medicine (NLM) family of DTDs for STM among them—and one of these may be appropriate for getting started. But even if one of these standard DTDs or frameworks is not a complete fit right out of the box, you will learn a great deal by the simple act of trying. And it's a place to start.
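For the hands-on crowd, here is a minimal sketch in Python of what "trying" a standard might look like. It assumes a tiny DocBook 5 style chapter (the sample content, the `to_html` and `to_plain_text` function names, and the choice of Python's standard `xml.etree.ElementTree` module are my own illustrations, not anything a particular publisher uses) and produces two outputs from the same tagged source:

```python
import xml.etree.ElementTree as ET
from html import escape

# DocBook 5 elements live in this namespace.
NS = {"db": "http://docbook.org/ns/docbook"}

# Hypothetical sample chapter, made up for illustration.
SAMPLE = """<chapter xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Start Small</title>
  <para>Tag one title first.</para>
  <para>Learn what works, then expand.</para>
</chapter>"""


def _title_and_paras(xml_text):
    """Pull the chapter title and paragraph text out of the tagged source."""
    root = ET.fromstring(xml_text)
    title = root.findtext("db:title", default="", namespaces=NS)
    paras = [p.text or "" for p in root.findall("db:para", NS)]
    return title, paras


def to_html(xml_text):
    """Render the chapter as a very simple HTML fragment."""
    title, paras = _title_and_paras(xml_text)
    lines = [f"<h1>{escape(title, quote=False)}</h1>"]
    lines += [f"<p>{escape(p, quote=False)}</p>" for p in paras]
    return "\n".join(lines)


def to_plain_text(xml_text):
    """Render the same chapter as plain text with an upper-case title."""
    title, paras = _title_and_paras(xml_text)
    return "\n\n".join([title.upper()] + paras)


if __name__ == "__main__":
    print(to_html(SAMPLE))
    print()
    print(to_plain_text(SAMPLE))
```

Nothing fancy, and a real workflow would handle far more than titles and paragraphs, but it shows the basic payoff: one tagged source, more than one output.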
Understand that Everything Has a Version Number
This one is actually stolen from a frolleague (friend/colleague) from my Elsevier days, and it goes nicely with the “start small” advice up top. Do you know anything in technology that doesn’t change? I didn’t think so. So don’t expect that what you are doing from an XML standpoint is not going to evolve over time.
The great thing about XML is it puts the control in the hands of the content creators, and part of that control is the ability to keep up with the times/changes. I’m not talking about changes to content itself, but changes to your DTDs or your framework/systems you use to create, house and distribute XML. All of these things change over time, and that’s ok! Trying to do something once and then leave it alone forever is a nice goal, but it’s just not realistic in today’s publishing world.
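One small way to plan for that change is to record which version of a content model each file was tagged against, and check it before processing. The sketch below is an assumption-laden illustration in Python: it relies on the DocBook 5 convention of a `version` attribute on the root element, and the `SUPPORTED_VERSIONS` set, the sample documents and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the content-model versions our current tools understand.
SUPPORTED_VERSIONS = {"5.1"}

# Hypothetical sample files tagged at different times.
CURRENT_DOC = '<book xmlns="http://docbook.org/ns/docbook" version="5.1"/>'
OLDER_DOC = '<book xmlns="http://docbook.org/ns/docbook" version="5.0"/>'
UNVERSIONED_DOC = '<book xmlns="http://docbook.org/ns/docbook"/>'


def content_model_version(xml_text):
    """Return the root element's version attribute, or None if it is absent."""
    return ET.fromstring(xml_text).get("version")


def needs_migration(xml_text):
    """True when the file was tagged against a version we no longer target."""
    return content_model_version(xml_text) not in SUPPORTED_VERSIONS


if __name__ == "__main__":
    for name, doc in [("current", CURRENT_DOC),
                      ("older", OLDER_DOC),
                      ("unversioned", UNVERSIONED_DOC)]:
        print(name, content_model_version(doc), needs_migration(doc))
```

The point is not the code itself but the habit: if every file says which version it follows, evolving your DTD or framework becomes a managed migration instead of a surprise.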
Small publishers, I think, have an advantage and disadvantage in this area. A disadvantage because it makes the investment conversation up front tougher if you know it’s going to evolve (but realism is a good thing, is it not?), and an advantage because, for the most part, smaller organizations seem to have a higher change tolerance than larger ones. The advice on this front is about understanding that things will change, and communicating that to the decision-makers within the organization, especially a small one. Having this conversation up front will help set expectations and lead to better understanding throughout any implementation.
OK, so that’s it. Four bullet points, and I’ve tried to stay pretty general. Thanks again to Thad for such a great, thought-provoking question. As Billy Crystal says in his great cameo in "The Princess Bride": “Have fun storming the castle!”
Jabin White is Vice President of Content Management for ITHAKA, an organization committed to helping the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways. ITHAKA provides several services to the academic community, including JSTOR and Portico, which increase access to scholarly materials and ensure their preservation for future generations.
With a heavy background in XML theory and practice, White has spent most of his career evangelizing the benefits of markup languages and related technologies, including content management, workflow enhancements and authoring tools.
Prior to joining ITHAKA, White served as Director of Strategic Content at Wolters Kluwer Health's Professional & Education (P&E) Division, Vice President, STM Sales for Scope eKnowledge Center, and VP of Product Development at Silverchair, Inc., a leading developer of information solutions for health care publishers.
He also spent five years as Executive Director of Electronic Production at Elsevier, serving the Health Sciences Division. White started in health sciences publishing as an editorial assistant at Current Medicine and has held digital publishing positions at Mosby, Lippincott Williams & Wilkins and Unbound Medicine. He is a graduate of Wake Forest University with a BA in history and has a Master of Business Administration from Pennsylvania State University.