Going mobile is more than just big pages fitting on tiny screens. Mobile requires content to be organized so that it can be quickly served up in bite-sized chunks into any of the myriad forms that users might want, because today’s consumers want their content their way, on their devices, and in their timeframes.
Publishers have always been heavily invested in their content, but only recently have needed to think about providing it in alternative forms for delivery across devices, channels, and applications -- and going mobile. For many publishers today XML (Extensible Markup Language) is the key to transforming content into multiple formats, and thereby monetize it and future-proof it -- even for applications that haven’t been thought of yet.
Evolution of the Content Paradigm
Content is no longer linear. For millennia, after graduating from stone tablets, scribes kept written material in scrolls, and it was read linearly from beginning to end. Finding something in the middle wasn’t so easy, nor was marking it for future reference. As time went on, codices (or books) made it easier to refer to things, and to look things up a page at a time. You could then even put in bookmarks. But it was still a mostly linear reading experience. Even dense reference books, never read from cover to cover, needed indexes to point readers to the right page.
The dawn of the digital age introduced new devices for reading content, and computer screens co-opted the scroll to allow you to continue reading long-form content. Content was still primarily linear in the 20th century, with hints of what was to come. Only in the past few years, with the introduction of multiple reading devices, did the page paradigm, the linear-only consumption model, prove insufficient. Device screens forced publishers to reorganize what was on the page, and eventually rethink what constitutes a “page.” And as reading devices became more sophisticated, so did readers, who now regularly pick and choose the pieces of content they want to read today, what they want to mark for later, and what they want to keep for a while. That old page and book paradigm doesn’t really allow for curated, bookmarked, cached, and linked content.
Readers today still purchase print books to read linearly. But they also want digital content to use completely differently -- and much of that reading is on mobile devices. Readers pull together chapters from multiple books, articles across journals related to a single author, car repair instructions related to the model they purchased, and down to just the one repair procedure they need. They may even pull in YouTube videos and other external content. Publishers only recently started thinking in terms of this multi-purpose, multimedia content that users request.
This transformation, not destruction, of publishing has the potential to revolutionize and raise it to new heights, and new revenue levels. Freed from just print, the new economies of electronic “publishing” have already transformed many areas of the publishing industry, well beyond the obvious ebook editions. With the cost of publishing content greatly reduced, and the new ease to distribute globally, publishers can now find viable audiences for materials that were previously too obscure or specialized. The combination of digital publishing, syndication, and licensing opportunities gives publishers the chance to republish materials that were either given up for dead, or didn’t have audiences at the time of original publication. Making your legacy content findable can be quite profitable.
XML Provides the Avenue to Future-Proofing Content
Making content modular, with chunks that can be assembled in myriad ways by both publishers and readers, requires XML. Other approaches exist, but they don’t provide the flexibility and standardization needed to optimize content for a multi-device world.
XML files separate the content from the formatting. They contain content elements a reader needs without referencing a “page.” Instead, the content is “marked up” with tags that describe the various attributes of the contents -- titles, subtitles, paragraphs, lists, links to images, tables, and so on. An XML file may not even indicate where pages start and end, since pages aren’t relevant for many devices.
Even complex content elements, such as detailed tables and math and chemistry formulas, can be coded using XML, which extends capabilities even further. XML elements don’t need to be linear, they can be small sections of text, lists of instructions, standard inserts, modules of text, captions, etc. A database defines how to reassemble the pieces later on, in different orders, depending on the use, combined with style sheets and templates that define layouts according to match. The result? You can, for example, easily produce an automobile manual to address only your particular automobile model, without any references to other models in the line. You can serve up bite-sized chunks of content on a mobile device, and go to the next chunk based on what the user wants. The opportunities are endless.
Layout is separated from the content, so that you can produce many different formats on the fly, from the same content. Style sheets identify formatting according to the device that accesses the content, and the context in which the content is accessed, providing opportunities to manage content in a more modular fashion. Tools to transform the content on the fly create new opportunities to deliver in one way for one audience, and in a completely different way to another audience.
Publishers can easily take advantage of constantly improving capabilities to describe complex elements, such as formulas or graphical representations when the content is stored in XML. The scientist pulling down complex math on to his mobile device will be able to reflow it to fit properly on his small screen, rather than having to pan about an image of the formula. One publisher, The MIT Press, now converts computer science textbooks into reflowable ebooks, making it easier to view the many highly complex charts, graphs, and tables on digital devices and ereaders. The images below show outputs of the user interface for MITCogNet’s available content structures along with a complex summary map output to a mobile device.
The Optical Society of America (OSA) converted its entire 750,000 page archive going back to 1917 into XML, allowing the organization to reach ready markets for content that people hadn’t been able to get to for years. They also found new products in its older content. OSA has an extraordinary image backfile containing almost 1,000,000 images, and by converting the collection, created an independent product. An example of the OSA’s robust image library, along with its advanced search feature, is provided below. They are now able to publish special collections based on specific scientist, topics, and any special collections their customers might want.
XML provides freedom for publishers -- to find new ways to generate revenue after investing in content. Freed from the linear flow of traditional books, you can slice and dice, and reuse pieces in an infinite number of ways. You can publish chapters of books, reformulated chapters, or excerpts. You can explore subscription models, serializing content, and other ways to deliver without relying solely on the book paradigm. And you can retain and reuse that content in new applications that are as-yet unknown, and to devices without additional investments in conversion. Going mobile using XML is a sure route to publishing success.
Related story: German Publisher Bastei Lübbe to Launch International Subscription Platform
Mark Gross, president of Data Conversion Laboratory (DCL), is an authority on XML implementation and document conversion. Prior to joining DCL in 1981, Gross was with the consulting practice of Arthur Young & Co. He has a B.S. degree in Engineering from Columbia University and an MBA from New York University. He also has taught at the New York University Graduate School of Business, the New School, and Pace University. He is a frequent speaker on the topic of automated conversions to XML and SGML.