
Structuring XML Documents
ISBN: 0-13-642299-3
Author: David Megginson
Series: Charles Goldfarb Series
David Megginson is a senior architect for Microstar Software Ltd. of
Ontario, Canada. Microstar is a leading integrator of large SGML and XML
document management, publishing, and production systems. Microstar is also well
known for its Near & Far Designer technology, which is the leading
CASE tool for the design of SGML/XML document type definitions. Megginson has
seven years of experience in structured document design, first as an
academic and then as a professional consultant.
Pages: 420 + CD-ROM
Intended Audience:
The intended audience of Structuring XML Documents includes those
responsible for developing the document design. It is not intended for authors
or for software developers, but rather is aimed at document designers. Although
XML is prominent in the title and on the cover, the book is not intended for
XML beginners who wish to learn more about XML. In fact, the design principles
presented in this book can apply to XML, SGML, or any other syntax. It is the
design process itself that is the focus of the book.
In specifying the intended audience of Structuring XML Documents,
Megginson makes several important assumptions:
- The reader knows SGML / XML syntax
- The reader knows how to write a basic DTD
- The reader already has identified the users as well as the project
requirements
- Book-oriented DTDs are the focus of the book (such as tech manuals or
legislative documents)
- Database-oriented DTDs are not specifically within the scope of the text
If these assumptions of Structuring XML Documents do not match your
profile, another text will be more appropriate for you. On the other hand, if
you are responsible for creating the document design for your project, whether
it is a structured authoring project, an SGML project, or an XML project, then
this text will provide an excellent starting point for you.
Summary of the Book:
Structuring XML Documents is divided into four parts, followed by back
matter that includes an extensive "Index of Element Types and Attributes"
and a CD-ROM.
The first part of Structuring XML Documents provides the general
background required for mastery of the advanced topics presented in the
following parts of the book. It includes a review of DTD syntax. Again, this is
a review and not a beginner's text. If you are not already familiar with SGML,
it will not provide adequate background. However, if you are just a bit
"rusty," this portion of Structuring XML Documents provides an excellent
refresher. This part of the book also introduces five industry-specific DTDs,
which serve as a basis for the discussion of document design throughout the
remainder of the book. The DTDs highlighted in this text are:
- ISO 12083: (journals and books)
- DocBook: (software documentation)
- Text Encoding Initiative (TEI): (research-oriented materials)
- MIL-STD-38784: (technical documentation)
- Hypertext Markup Language (HTML): (Web documents)
The second part of the book guides readers to review their goals and
requirements against the five selected model DTDs. It is the author's
philosophy (and mine as well) that if an industry-standard DTD can be used
directly or adapted, the project will move along much more quickly and surely
than if a new DTD were developed from the ground up. It is much more costly for
an organization to develop a unique DTD. It also means that the organization
cannot take advantage of the lessons learned by those SGML pioneers who worked
many long years to develop the model DTDs featured in this text. It is true
that meeting an organization's functional requirements in an SGML/XML design is
of utmost importance. But I agree with Megginson that many good foundations
exist and that most goals can be met by working from an existing model.
Part three of the book focuses on advanced issues in DTD maintenance and
design. In particular, it highlights issues such as the repetition and
omissibility of elements, as well as interchange considerations for document
fragments. #CURRENT and SUBDOC are just two of the advanced features discussed
in these chapters. Perhaps the most useful portion of part three is the
discussion of how to customize each of the five model DTDs. Since customizing
an industry-standard DTD is the preferred strategy in most instances, tips on
how to customize each of the prominent models will be very useful to document
architects.
Part four of Structuring XML Documents shows the reader how
architectural forms (from the HyTime standard) can be used in DTD design.
The author first introduces architectural forms, then gives basic examples of
their use, and finally presents more advanced uses. This part of the book
provides some mechanisms for dealing with
design situations which may not be resolved in any other way. My concern here
is the degree of available tool support for architectural forms. Since designs
must be implemented, the elegance gained by using advanced DTD design strategies
may suffer when faced with the realities of system implementation.
The back matter of Structuring XML Documents is a complete index of
all elements and attributes from the five model DTDs. I suppose the idea here
is that you can pick standard tag names from any of the standard DTDs to
construct your own DTD. While this is interesting, I question how valid a
mix-and-match approach really can be. In my experience, designers want to begin
with a standard model, prune the model, and then add tags with specific meaning
for their own environment.
The CD-ROM includes a number of XML parsers. It is important to note that
most of the text in Structuring XML Documents appears to focus on SGML.
Discussions of advanced topics such as #CURRENT, which is not supported in
XML, reinforce my impression that this is primarily an SGML text. Very little
mention of XML can be found in the book. I would have expected, for example,
some chapters presenting a strategy for converting each of the model SGML DTDs
to an XML DTD. I suppose the title says XML rather than SGML because of the
inclusion of XML parsers on the CD-ROM and because XML is currently a "hot"
topic.
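To make the #CURRENT point concrete, here is a rough sketch (mine, not taken
from the book) of one small step in moving an SGML DTD toward XML: scanning the
attribute list declarations for default keywords that XML 1.0 does not allow.
The function name, the sample DTD, and the simple regular expression are all
assumptions made for illustration. In particular, the pattern assumes that no
declaration contains a ">" inside a quoted value, and it is not a real DTD
parser.

```python
import re

# SGML-only attribute default keywords. XML 1.0 allows only
# #REQUIRED, #IMPLIED, #FIXED, or a literal default value.
SGML_ONLY_DEFAULTS = ("#CURRENT", "#CONREF")


def find_sgml_only_defaults(dtd_text):
    """Return (element_name, keyword) pairs for ATTLIST declarations
    that use an SGML-only default keyword.

    Hypothetical helper for illustration; it uses a plain text scan,
    so a quoted literal containing one of the keywords would also match.
    """
    hits = []
    # Match each <!ATTLIST name ... > declaration (no nested '>' assumed).
    for match in re.finditer(r"<!ATTLIST\s+(\S+)([^>]*)>", dtd_text):
        element, body = match.group(1), match.group(2)
        for keyword in SGML_ONLY_DEFAULTS:
            if keyword in body.upper():
                hits.append((element, keyword))
    return hits


# Made-up sample declarations, not drawn from any of the five model DTDs.
SAMPLE_DTD = """
<!ELEMENT para (#PCDATA)>
<!ATTLIST para security (public | secret) #CURRENT>
<!ATTLIST note type CDATA #IMPLIED>
"""

print(find_sgml_only_defaults(SAMPLE_DTD))  # [('para', '#CURRENT')]
```

Each reported pair marks an attribute whose default would have to be redesigned,
for instance as #IMPLIED or #REQUIRED, before the DTD could be used with an XML
parser. A real conversion involves much more than this (inclusions, exclusions,
tag omission, and so on).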
Final Recommendation:
Structuring XML Documents is an excellent text for beginning
document designers. Although the book focuses on SGML and does not have
significant XML-specific content, it will clearly be useful to SGML and XML
designers alike. I particularly agree with and recommend Megginson's
development strategy to begin with a standard model DTD and customize that for
individual use. I also believe it is great to have all the most commonly used
DTD models in a single text where the designer can easily compare and contrast
them. The tips on how to customize each model are invaluable. I will
certainly try the XML parsers included with the book as I prepare upcoming
issues of XML Files for the GCA.