ONIX 3.0 Raises Standard for Ebook Metadata
Over the past decade, ONIX for Books has become all but ubiquitous -- a lingua franca of metadata for book publishing in North America, Europe, and increasingly in the Asia-Pacific region. The XML-based standard provides a common language for communication between publishers, retailers, and various intermediaries. ONIX allows book and ebook publishers to create and manage a single body of rich metadata about their products, and to exchange it with their customers in a coherent, unambiguous, and largely automated manner. As a result, ONIX reduces costs across the whole supply chain and contributes to an overall improvement in metadata quality through benchmarking and certification programs.
Yet dramatic changes in the publishing landscape in the past ten years mean that the most widely implemented version of ONIX is not well-fitted to today's global book business. ONIX 2.1 -- as used by almost all North American implementers of ONIX -- has proved remarkably resilient given it was designed in the days when reading an ebook meant toting your Palm Pilot or navigating a PDF on Windows 97. In contrast, the latest version of ONIX, version 3.0, is designed for this decade.
At the beginning of 2012, the international committee that steers ONIX forward affirmed its confidence in ONIX 3.0 and agreed on a "sunset date" for ONIX 2.1 at the end of 2014. This gave publishers and retailers three years' notice before support for ONIX 2.1 would be phased out, so that budgeting and technical development could be orderly and unhurried. Yet the rate of adoption of ONIX 3.0 in North America is still relatively slow, and any organization without a clear migration plan now has some catching up to do.
Following is an explanation of why ONIX 3.0 is better suited to our times and why publishers and retailers should switch to the updated standard sooner rather than later. [Editor's Note: For clarification on any of the terminology used here, turn back to the Metadata Glossary on page 15.]
ONIX In A Nutshell
There are three elements to the ONIX framework. First, there is an underlying abstract view of the nature of products and titles, and the role and scope of identifiers and metadata in the commercial world. This is barely visible to publishers or retailers that exchange ONIX data, but informs the overall design of the scheme.
Second, there is an XML-based message format. This is the ONIX that many are familiar with:
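<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole>
  <NameIdentifier>
    <NameIDType>16</NameIDType>
    <IDValue>0000000121479135</IDValue>
  </NameIdentifier>
  <NamesBeforeKey>Maj</NamesBeforeKey>
  <KeyNames>Sjöwall</KeyNames>
  <BiographicalNote>Maj Sjöwall är född i Stockholm 1935. Hon är mest känd för de tio Martin Beck-romaner hon skrev tillsammans med sin make Per Wahlöö.</BiographicalNote>
</Contributor>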
It looks challenging, but it isn't really intended for human consumption. The XML message is for machine-to-machine communication and includes a structured set of data elements for the identification and description of books and ebooks as products -- elements like those shown in the extract above. (It's important to understand that ONIX is a means of communication, not a database. Certainly it has implications for the way that publishers and retailers store and manage their product information, but it doesn't specify the database structure they should use. In general, avoid linking your database or application design too closely to a particular standard.)
Third, there's a large group of controlled vocabularies -- "codelists" in ONIX parlance -- describing aspects like the role of a contributor ("A01" above means "written by"), or the physical or electronic form of a product, in a language-independent way. So a code like BB for the physical form of a book means "hardcover" in U.S. English and "hardback" in U.K. English, but equally it means gebundene Ausgabe in German, and it can be understood even if the entire ONIX message is in Swedish like the extract above. This language independence has contributed greatly to the international adoption of ONIX, as has the fact that it is a relatively open standard and free of charge for anyone to use.
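As a rough illustration of how a recipient might turn codes into labels for its own audience, here is a minimal Python sketch. The LABELS table is a tiny, hand-written subset based only on the examples in this article (real codelists are far longer), the function name label_for is hypothetical, and the XML fragment is simplified rather than a complete, valid ONIX record.

```python
import xml.etree.ElementTree as ET

# Illustrative subset only: real ONIX codelists hold many more entries,
# and these human-readable labels are paraphrased from the article.
LABELS = {
    "ContributorRole": {
        "A01": {"en-US": "written by", "en-GB": "written by"},
    },
    "ProductForm": {
        "BB": {
            "en-US": "hardcover",
            "en-GB": "hardback",
            "de": "gebundene Ausgabe",
        },
    },
}


def label_for(element: ET.Element, locale: str) -> str:
    """Return a locale-specific label for a coded element.

    Falls back to the raw code when no label is known, so the
    code itself always survives the round trip.
    """
    code = (element.text or "").strip()
    by_locale = LABELS.get(element.tag, {}).get(code, {})
    return by_locale.get(locale, code)


# Simplified fragment (assumption): element nesting is flattened for brevity.
FRAGMENT = """
<Product>
  <ProductForm>BB</ProductForm>
  <Contributor>
    <ContributorRole>A01</ContributorRole>
    <KeyNames>Sjöwall</KeyNames>
  </Contributor>
</Product>
"""

product = ET.fromstring(FRAGMENT)
form = product.find("ProductForm")
role = product.find("Contributor/ContributorRole")

for locale in ("en-US", "en-GB", "de"):
    print(locale, "->", label_for(form, locale))
print("role ->", label_for(role, "en-US"))
```

The point of the design is that only the code travels in the message; each recipient chooses its own wording at display time.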
ONIX 2.1 vs. ONIX 3.0
ONIX 2.1 and ONIX 3.0 are different, but many overestimate the degree of that difference. The underlying abstract view has not changed. Most codelists are shared between the versions. The update doesn't necessarily imply radical change in any application or database used to manage product metadata. And while the extract above is ONIX 3.0, anyone familiar with ONIX 2.1 might not even have noticed the differences. There are two differences visible in the example above: <NameIdentifier> used to be <PersonNameIdentifier>, and in 2.1, it would have occurred after the name rather than before. But not all the differences are as trivial as this, and for an organization using ONIX, there are clearly some development costs associated with migrating to the newer version.
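To make the scale of that particular difference concrete, the following Python sketch upgrades a single 2.1-style contributor to the 3.0 layout. It covers only the rename and reordering described above, on the assumption that the identifier composite is called PersonNameIdentifier (with PersonNameIDType inside it) in 2.1 and NameIdentifier (with NameIDType) in 3.0. The function name upgrade_contributor and the NAME_PARTS subset are hypothetical, and a real migration would involve much more than this.

```python
import xml.etree.ElementTree as ET

# Only the renames discussed in the article; not a full 2.1-to-3.0 mapping.
RENAMES = {
    "PersonNameIdentifier": "NameIdentifier",
    "PersonNameIDType": "NameIDType",
}

# Name elements the identifier should precede in 3.0 (illustrative subset).
NAME_PARTS = {
    "PersonName", "PersonNameInverted", "TitlesBeforeNames",
    "NamesBeforeKey", "PrefixToKey", "KeyNames",
}


def upgrade_contributor(contributor: ET.Element) -> ET.Element:
    """Rename the identifier composite and move it ahead of the name."""
    for element in contributor.iter():
        element.tag = RENAMES.get(element.tag, element.tag)

    # Detach the identifiers, then re-insert them before the first name part.
    identifiers = contributor.findall("NameIdentifier")
    for identifier in identifiers:
        contributor.remove(identifier)

    children = list(contributor)
    insert_at = next(
        (i for i, child in enumerate(children) if child.tag in NAME_PARTS),
        len(children),
    )
    for offset, identifier in enumerate(identifiers):
        contributor.insert(insert_at + offset, identifier)
    return contributor


# A 2.1-style contributor: the identifier follows the name.
OLD_STYLE = """
<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole>
  <NamesBeforeKey>Maj</NamesBeforeKey>
  <KeyNames>Sjöwall</KeyNames>
  <PersonNameIdentifier>
    <PersonNameIDType>16</PersonNameIDType>
    <IDValue>0000000121479135</IDValue>
  </PersonNameIdentifier>
</Contributor>
"""

upgraded = upgrade_contributor(ET.fromstring(OLD_STYLE))
print([child.tag for child in upgraded])
```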
So what are the benefits of version 3.0? First, there was a spring-cleaning based on nearly a decade of experience. ONIX 2.1 contains many XML data elements that were inherited from previous versions of ONIX. They have been superseded by better ways of expressing the information, but the old method still exists in parallel. For example, an ISBN can be carried in a data structure called <ProductIdentifier>. But for backward compatibility purposes, ONIX 2.1 also has a dedicated and inflexible data element for ISBNs that it inherited from ONIX 1.0. (Confusingly, the <ISBN> element is solely for obsolete ISBN-10s.) ONIX 3.0 sacrifices some compatibility with 2.1 and sweeps away the older elements, which increases clarity and makes the standard simpler to implement, more modular, and more flexible.
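The cost of carrying two parallel methods shows up in the code a recipient has to write. The Python sketch below is one way a 2.1 recipient might cope: it looks first for the general-purpose identifier composite, assumed here to be ProductIdentifier with ProductIDType 15 for an ISBN-13, and only then falls back to the legacy ISBN element, converting its ISBN-10 to an ISBN-13. The function names are hypothetical, the fragments are simplified, and the ISBN is a commonly used example value rather than a record from any real feed.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13 by prefixing 978 and recomputing
    the check digit (weights alternate 1 and 3)."""
    core = "978" + isbn10[:9]  # the old ISBN-10 check digit is discarded
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)
        for index, digit in enumerate(core)
    )
    return core + str((10 - total % 10) % 10)


def find_isbn13(product: ET.Element) -> Optional[str]:
    """Prefer the identifier composite; fall back to the legacy element."""
    for identifier in product.findall("ProductIdentifier"):
        if identifier.findtext("ProductIDType") == "15":  # 15 = ISBN-13
            return identifier.findtext("IDValue")
    legacy = product.findtext("ISBN")  # 2.1 only: carries an ISBN-10
    if legacy:
        return isbn10_to_isbn13(legacy.strip())
    return None


LEGACY_STYLE = "<Product><ISBN>0306406152</ISBN></Product>"
COMPOSITE_STYLE = """
<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780306406157</IDValue>
  </ProductIdentifier>
</Product>
"""

print(find_isbn13(ET.fromstring(LEGACY_STYLE)))
print(find_isbn13(ET.fromstring(COMPOSITE_STYLE)))
```

With the legacy element gone in 3.0, the fallback branch and the conversion step are no longer needed.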
The next change lies in the treatment of digital products. In ONIX 2.1, these are something of an afterthought, treated as exceptions rather than mainstream products responsible for perhaps a quarter of the revenue of a typical publisher. ONIX 2.1 is a product of its time -- but its time was 2001. In version 3.0, ebooks and other digital products are treated much like any other product, albeit with some new metadata relating to file formats, usage rights, and licensing, which are all much more important today.
There are a number of changes that arise from the increasingly global nature of the book trade. In 2.1, only minimal information about rights was required -- "Is this book for sale in Canada?" -- yet this doesn't meet the business needs of retailers who operate globally. Version 3.0 requires much more comprehensive information (even if this is simply a list of countries for which the rights position is unknown). The description of complex international distribution and supply arrangements is considerably clearer in 3.0. And while ONIX has always worked in any language, 3.0 enables sophisticated multi-language, multi-script metadata, so the description of a book that is in Spanish could be given in both English and Spanish in parallel.
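A global retailer's question is therefore less "Is this book for sale in Canada?" and more "What is the stated rights position in each country I sell to?" The Python sketch below shows one way to answer it, assuming a 3.0-style SalesRights composite with a SalesRightsType code and a Territory listing countries. The STATUS table is a small subset of the relevant codelist with paraphrased labels, the function names are hypothetical, and the fragment is illustrative rather than taken from a real feed.

```python
import xml.etree.ElementTree as ET

# Subset of sales rights type codes, with paraphrased labels (assumption).
STATUS = {
    "00": "unknown",
    "01": "for sale (exclusive)",
    "02": "for sale (non-exclusive)",
    "03": "not for sale",
}


def codes(territory: ET.Element, tag: str) -> list:
    """Split a space-separated list of country or region codes."""
    return (territory.findtext(tag) or "").split()


def rights_status(publishing_detail: ET.Element, country: str) -> str:
    """Return the first stated rights position that covers the country."""
    for rights in publishing_detail.findall("SalesRights"):
        territory = rights.find("Territory")
        if territory is None:
            continue
        named = country in codes(territory, "CountriesIncluded")
        worldwide = (
            "WORLD" in codes(territory, "RegionsIncluded")
            and country not in codes(territory, "CountriesExcluded")
        )
        if named or worldwide:
            return STATUS.get(rights.findtext("SalesRightsType"), "other")
    return "unstated"


FRAGMENT = """
<PublishingDetail>
  <SalesRights>
    <SalesRightsType>01</SalesRightsType>
    <Territory><CountriesIncluded>US CA</CountriesIncluded></Territory>
  </SalesRights>
  <SalesRights>
    <SalesRightsType>00</SalesRightsType>
    <Territory><CountriesIncluded>MX BR</CountriesIncluded></Territory>
  </SalesRights>
</PublishingDetail>
"""

detail = ET.fromstring(FRAGMENT)
for country in ("CA", "MX", "SE"):
    print(country, "->", rights_status(detail, country))
```

Note the difference between "unknown" (the sender has said it does not know) and "unstated" (the sender has said nothing), which is the distinction the more comprehensive 3.0 requirement is meant to surface.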
Over time, ONIX 2.1 "idioms" developed separately in each country that adopted it widely. This resulted in small but critical country-to-country differences in recommended practices promoted by trade organizations like the Book Industry Study Group (BISG), BookNet Canada (BNC), and Book Industry Communication (BIC) in the U.K. A recent revision of the BISG guidelines, carried out jointly with BNC, has improved the alignment significantly, but ONIX 3.0 doesn't carry this baggage -- improvements to the documentation and a global best practice guide from EDItEUR seek to ensure such variations simply don't arise.
Other changes? Some simplifications in the way ONIX 3.0 treats sets and series. Greater flexibility in provision of marketing collateral. And for those concerned with the sheer volume of data that needs to be communicated and processed, ONIX 3.0 includes "block updating" that radically reduces the size of data updates.
More generally, an organization migrating from 2.1 to 3.0 will likely re-evaluate its business processes, and perhaps take the opportunity to re-engineer and streamline, taking its chance to send or receive richer, more accurate, more timely metadata -- all of which lead directly to improved sales.
The Switch
So if the benefits of ONIX 3.0 are so clear, why is it not already in use by every publisher and retailer? Some of the reasons for the slow pace of migration among North American publishers are obvious. First, a kind of network effect: the very ubiquity of the older version means that there's immense inertia to overcome. In countries where ONIX is less prevalent, adoption of the new version is noticeably more rapid. Second, there is an understandable reluctance to invest in technical updates when there is little direct or short-term benefit. The lack of simple backward compatibility means that some investment is needed, and the business case for adopting ONIX 3.0 de novo is much more clear-cut than that for upgrading to ONIX 3.0, given that version 2.1 is often viewed as "good enough." And third, there has perhaps been some lack of leadership. The other meaning of the word "standard" is the flag at the head of an army, yet trade organizations, major publishers, system vendors, and major recipients of ONIX data have not leapt forward to guide the way.
That third factor is changing. But it underlines that standards development is essentially a social process rather than a technical one. The data exchange standards that underpin the supply chain inch forward by agreement and consensus between business partners.
Organizations adopting ONIX as a part of their business process are in effect "investors" in the scheme and need reassurance about its stability -- and confidence in its continued development to meet their newest business needs. The overall roadmap for ONIX is drawn by the ONIX International Steering Committee convened by EDItEUR and comprising representatives from national user groups in each country. BISG facilitates the U.S. ONIX national group, BookNet Canada and BTLF the Canadian groups, and the committee is currently chaired by a representative of the French national group. It was this steering committee that set the sunset date to focus attention on the mid-term need to migrate, and any organization without a clear plan now has some catching up to do.
ONIX 2.1 support will be reduced and all new developments -- for example new codelist entries -- will be 3.0-only from the beginning of 2015. Of course, 2.1 will not just "stop working" at that point. But there is a risk in delaying migration -- that of being inflexible and unable to make use of more sophisticated metadata and marketing collateral, and missing business opportunities with new entrants who choose to use only 3.0.
Graham Bell (graham@editeur.org) is chief data architect at EDItEUR, the trade standards organization for the global book, ebook, and serials supply chains. EDItEUR is a not-for-profit, membership-supported organization, based in London, but with around 100 members from many countries around the world. In addition to ONIX for Books, EDItEUR manages Thema and the International ISBN Agency.