Welcome to the Metadata Millennium: A Complete Overview of What Metadata Can Do for Publishers

Getting Granular: Using Metadata To Reinvent Content
I mentioned that publishers still tend to think of their books as products, but increasingly they're coming to think of them as resources. Some publishers are getting added value out of their content by "slicing and dicing": taking portions from various books and creating a new product. (An example: taking all the potato recipes out of a list of already-published cookbooks and creating a potato cookbook.) Textbook publishers have been doing this for a long time, creating "coursepacks." Another technique is subsetting: getting mileage out of a big book by publishing parts of it as smaller books. Some books are beginning to be sold by the chapter, or to have chapters issued as short ebooks. Reference, technical, and STM publishers get value by aggregating content into an online subject-specific portal and selling access by subscription, or even by selling the content "by the chunk."
Metadata is needed to make this work. First of all, the books or other resources need to be marked up consistently; that's where sound XML practices come in. But good markup at a granular level isn't all you need; you also need metadata. Much metadata, of course, is relevant at the title level: that is, it applies to everything in the book. But to use metadata effectively for subdividing content, it's important to have IDs in the XML at the granular level. These granular IDs (at the chapter, section, and sometimes even the paragraph level) let you locate the chunks and point to them. You can then associate metadata with those IDs. That lets you maintain the metadata separately from the documents themselves, which is essential given that metadata will evolve over time.
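To make the idea of granular IDs concrete, here is a minimal sketch in Python. The markup, the ID scheme, and the subject terms are all invented for illustration; they are not taken from any particular publisher's workflow. It shows metadata kept in a separate lookup keyed by chunk ID, then used to pull the potato recipes out of a marked-up cookbook.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook markup: every chunk carries its own id attribute.
BOOK_XML = """
<book id="cookbook-01">
  <chapter id="cookbook-01-ch03">
    <title>Side Dishes</title>
    <section id="cookbook-01-ch03-s1"><title>Potato Gratin</title></section>
    <section id="cookbook-01-ch03-s2"><title>Glazed Carrots</title></section>
  </chapter>
</book>
"""

# Metadata maintained outside the document, keyed by chunk id.
# The subject terms are illustrative, not a real classification scheme.
CHUNK_METADATA = {
    "cookbook-01-ch03-s1": {"subjects": ["potatoes", "side dishes"]},
    "cookbook-01-ch03-s2": {"subjects": ["carrots", "side dishes"]},
}


def index_by_id(root):
    """Map every id attribute in the tree to its element."""
    return {el.get("id"): el for el in root.iter() if el.get("id")}


def chunks_about(subject, index, metadata):
    """Return (id, element) pairs whose external metadata lists the subject."""
    return [
        (chunk_id, index[chunk_id])
        for chunk_id, record in metadata.items()
        if subject in record["subjects"] and chunk_id in index
    ]


root = ET.fromstring(BOOK_XML)
index = index_by_id(root)
potato_chunks = chunks_about("potatoes", index, CHUNK_METADATA)
for chunk_id, element in potato_chunks:
    print(chunk_id, element.findtext("title"))
```

Because the subject terms live outside the XML, they can be revised or extended without touching the book files; only the IDs have to stay stable.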
Publishers also often need to embed metadata in their publications, so that a chapter can carry subject classifications for only the subjects covered in that chapter, or a section of a textbook can carry metadata about the learning objectives associated with it. There are many ways to do this in XML, but I want to focus on two specific ones from the Open Web Platform: microdata and RDF (both of which, with the new EPUB 3.0.1 specification, will also be available in EPUBs).
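As a rough illustration of embedded, chapter-level metadata, the Python sketch below reads microdata attributes from a small XHTML fragment. The fragment, the choice of the schema.org Chapter type, and the "about" and "teaches" properties are my assumptions for the example, not a recommended vocabulary. The reader is also deliberately simplified: it ignores nested items and itemref.

```python
import xml.etree.ElementTree as ET

# Hypothetical textbook chapter with microdata attributes embedded in the markup.
# The itemtype and itemprop names are illustrative choices.
CHAPTER_XHTML = """
<section id="ch05" itemscope="itemscope" itemtype="https://schema.org/Chapter">
  <h1 itemprop="name">Photosynthesis</h1>
  <meta itemprop="about" content="Plant biology"/>
  <meta itemprop="teaches" content="Explain how plants convert light into chemical energy"/>
  <p>Chapter text goes here.</p>
</section>
"""


def read_microdata(item):
    """Collect itemprop name/value pairs found inside one itemscope element.

    Simplified: uses the content attribute when present, otherwise the
    element's text. Nested items and itemref are not handled.
    """
    properties = {}
    for el in item.iter():
        name = el.get("itemprop")
        if name is None:
            continue
        value = el.get("content") or "".join(el.itertext()).strip()
        properties.setdefault(name, []).append(value)
    return properties


chapter = ET.fromstring(CHAPTER_XHTML)
chapter_metadata = read_microdata(chapter)
print(chapter.get("id"), chapter_metadata)
```

Here the subject and the learning objective travel with the chapter itself, so they remain attached if the chapter is sold or reused on its own.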





