Markup language update

What do you bet that the 1990s will be known in the next millennium as the
Internet decade?


Among the many things the Internet has changed forever is the role of publishing in the
federal government, especially publishing on the Web. And to make documents
Web-ready, you need markup languages.


The oldest of these, Standard Generalized Markup Language, is not a markup language by
itself; it is a metalanguage used for defining markup languages, primarily for
electronic information encoding and interchange. The International Organization for
Standardization (ISO) in Geneva defines the SGML standard in ISO 8879:1986.


In the federal government, SGML is best known as part of the Defense Department’s
Continuous Acquisition and Lifecycle Support initiative. The SGML portion of CALS is
Mil-M-28001. CALS was intended to reduce the cost of supporting and maintaining military
equipment by standardizing information storage formats for weapons systems, which often
have 20-year lifecycles.


SGML specifies a standard method for describing a document’s structure as a
hierarchical, or nested, model with logical, predictable elements, which are marked in the
document with tags.


A Document Type Definition (DTD) defines the document’s structure, along with
rules for the relationships between document elements. Angle brackets contain the element
tags and typically appear in pairs at the beginning and end of elements, for example:


<par>This is a paragraph in SGML.</par>
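

The pairing rule can be checked mechanically. The following is a minimal sketch in Python, not a real SGML parser: it assumes the fragment also happens to be well-formed XML (true of this example, though SGML in general allows some tags to be omitted), and `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses under XML rules (every start tag closed)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

good = "<par>This is a paragraph in SGML.</par>"
bad = "<par>This paragraph is never closed."  # missing </par>

print(is_well_formed(good))
print(is_well_formed(bad))
```

Python's standard library parser checks only this kind of pairing; it does not validate a document against a DTD.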


SGML authoring tools help generate valid tags by putting the tags from the current DTD
into a menu.


An interactive parser, often included in an authoring tool, can both verify the
correctness of the overall document and restrict the tags entered to only those that are
valid according to the rules of the current DTD. A batch parser can only check the
correctness of the overall document.
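

The difference between the two kinds of parser can be sketched with a toy content model. This is an illustration only: the `CONTENT_MODEL` table, its element names and both function names are assumptions standing in for a real DTD, which would also constrain element order and how often each element may occur.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for a DTD: each element maps to the children it may contain.
# The element names are hypothetical.
CONTENT_MODEL = {
    "manual": {"title", "chapter"},
    "chapter": {"title", "par"},
    "title": set(),
    "par": set(),
}

def valid_tags(context: str) -> list[str]:
    """Interactive use: the tags an editor could offer inside `context`."""
    return sorted(CONTENT_MODEL.get(context, set()))

def validate(root: ET.Element) -> list[str]:
    """Batch use: walk a finished document and list every violation found."""
    errors = []
    if root.tag not in CONTENT_MODEL:
        errors.append(f"unknown element <{root.tag}>")
    for parent in root.iter():
        allowed = CONTENT_MODEL.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                errors.append(f"<{child.tag}> not allowed inside <{parent.tag}>")
    return errors

good_doc = ET.fromstring(
    "<manual><title>Pump</title>"
    "<chapter><title>Setup</title><par>Text</par></chapter></manual>"
)
bad_doc = ET.fromstring("<manual><par>Loose text</par></manual>")

print(valid_tags("chapter"))   # what a tag menu would show inside a chapter
print(validate(good_doc))
print(validate(bad_doc))
```

An interactive parser in effect calls something like `valid_tags` at every cursor position, so an invalid tag can never be entered; a batch parser runs something like `validate` after the fact.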


A formatter reads an SGML document as well as its DTD and style, or Formatting Output
Specification Instance (FOSI), and produces pages or other formatted output; doing this
correctly is usually a multipass process because of the complexity and sequential nature
of SGML.


Some SGML authoring tools can format single pages for document preview and do visual
editing, even though they don’t generate documents for printing. SGML publishing
tools generally can lay out, compose, paginate and print entire documents, including
generated text such as tables of contents and indexes. SGML publishing is appropriate for
large, structured documents such as manuals and parts catalogs that can be formatted
automatically; it is usually inappropriate for small custom-designed documents such as
brochures.


The CALS initiative goes beyond requiring SGML tags—it requires a particular
Mil-Spec DTD, and a particular style of formatted output, called the output specification.
An individual document style created from the output specification determines how each
element in a document will be rendered on the screen and to a printer.
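

The separation between tagged structure and rendering style can be sketched in a few lines. This is not a FOSI or the CALS output specification; the `STYLE` table, the element names and the `render` function are hypothetical, chosen only to show one style rule being applied per element.

```python
import xml.etree.ElementTree as ET

# Hypothetical style rules: element name -> function that renders its text.
STYLE = {
    "title": lambda text: text.upper(),
    "par": lambda text: "    " + text,
    "note": lambda text: "NOTE: " + text,
}

def render(root: ET.Element) -> str:
    """Render each element that has a style rule; skip pure containers."""
    lines = []
    for elem in root.iter():
        rule = STYLE.get(elem.tag)
        text = (elem.text or "").strip()
        if rule and text:
            lines.append(rule(text))
    return "\n".join(lines)

doc = ET.fromstring(
    "<section><title>Safety</title>"
    "<par>Wear gloves.</par><note>Check seals.</note></section>"
)
print(render(doc))
```

Swapping in a different `STYLE` table changes the output without touching the document, which is the point of keeping style outside the markup.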


Another formatting style standard, also blessed by the ISO, is the Document Style
Semantics and Specification Language. DSSSL has not caught on commercially because of its
complexity. A new formatting style standard, Extensible Style Sheet Language (XSL), the
style sheet component within Extensible Markup Language (XML), is still under development.


Several trade organizations have developed their own, industry-specific DTDs, with
mixed success. The auto industry has managed to standardize, for example, but the
publishing industry’s standard DTD has not been widely adopted, largely because it
lacks several essential features.


Developing a custom DTD is a technical activity resembling programming that is usually
done by specialists or consultants.


XML, a subset of SGML, has a simplified grammar and does not require a DTD. It was
designed for large-scale electronic publishing and is useful for exchanging structured
documents.


XML proponents hope that it will become the new standard for the exchange of a variety
of data on the Web as well as within and between companies and agencies. XML tools, XML
parsers for browsers and XML systems are starting to appear.


You can use XML for the sort of information usually kept in a database. For example,
the following XML describes a customer record:


<customer-details id="AcPharm39156">
  <name>"Acme Pharmaceuticals Co."</name>
  <address country="US">
    <street>7301 Smokey Boulevard</street>
    <city>Smallville</city>
    <state>Indiana</state>
    <postal>94571</postal>
  </address>
</customer-details>
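

Because the record is structured, a program can read it field by field. The sketch below uses Python's standard `xml.etree.ElementTree` module on the customer record above; the `CUSTOMER_XML` and `record` names are assumptions for illustration, and the attribute values are written with straight quotation marks, as an XML parser requires.

```python
import xml.etree.ElementTree as ET

CUSTOMER_XML = """
<customer-details id="AcPharm39156">
  <name>"Acme Pharmaceuticals Co."</name>
  <address country="US">
    <street>7301 Smokey Boulevard</street>
    <city>Smallville</city>
    <state>Indiana</state>
    <postal>94571</postal>
  </address>
</customer-details>
"""

root = ET.fromstring(CUSTOMER_XML.strip())
address = root.find("address")

# Flatten the record into the kind of row a database table might hold.
record = {
    "id": root.get("id"),               # attribute on the root element
    "name": root.findtext("name"),      # text of a child element
    "country": address.get("country"),
    "street": address.findtext("street"),
    "city": address.findtext("city"),
    "state": address.findtext("state"),
    "postal": address.findtext("postal"),
}

for field, value in record.items():
    print(f"{field}: {value}")
```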


Hypertext Markup Language is a markup language written in SGML for use on the Web. Many
agencies and companies publish their electronic information in HTML on the Web. Web-based
electronic commerce, quickly becoming important in the private sector, is of increasing
interest to agencies.


The current World Wide Web Consortium standard is HTML 4.0, but many browsers can only
read older versions of HTML, and other browsers support proprietary tags that have not
been accepted into the standard. The next generation of HTML proposed for consideration by
the consortium will implement HTML as a set of XML tags, allowing for a graceful
transition to the more powerful XML language.


HTML authoring tools range from simple text editors to sophisticated visual Web design
systems.


Web site management tools, such as Microsoft FrontPage, can, besides authoring pages,
automatically generate navigation links between pages in a site, standardize styles across
pages and maintain links when page names change.


Many authoring tools include limited support for client-side scripts in JavaScript or
VBScript. Specialized tools, which are not discussed in this guide, help Web programmers
develop server-side Web applications and dynamically incorporate database records in Web
pages.


Because HTML is an application of SGML, an SGML authoring tool can be used to edit
HTML, given an HTML DTD. Because most HTML-specific authoring tools support scripting,
uploading and link maintenance as well as tag and content creation, they are better suited
than SGML authoring tools to creating Web sites.


This Buyers Guide lists SGML and XML authoring and formatting tools available in the
United States, plus a small sample of HTML authoring and conversion tools.


Agencies use HTML primarily for Web sites—both internal intranet sites and public
sites on the Internet.


XML has not yet really arrived in government, although it is likely to be in use within
the next year as, for instance, a way to automatically connect Web sites to SGML documents
and databases.


SGML’s highly structured nature makes it suitable for creating searchable CD-ROMs.


Government, especially DOD, uses SGML for document storage and publishing. SGML
publishing works well within the CALS framework, where large documents with relatively
simple standard formats are the rule. Several limits crop up in document formatting,
however.


Some SGML formatting engines have trouble generating acceptable multicolumn layouts,
wrapping text around illustrations and correctly formatting CALS tables and equations.
None of these is a problem in conventional desktop publishing environments, but such
environments don’t address the long-term stability issues that prompted creation of
SGML.


Even editing tables and equations can be problematic. Compared with table and equation
edit functions in desktop publishing programs, those in SGML authoring programs often seem
primitive. Given the large number of tags and attributes generated by the editors and the
number of SGML table formatting standards, however, one can easily understand the problems
developers have faced and overcome.


SGML documents are often put to multiple purposes: The same document may be destined
for a print publication, a CD-ROM and a Web site.


This can present a problem. Many more SGML tags are required for a good searchable
CD-ROM than are needed for a printed document, in which excess tags can add significantly
to the document’s development cost and cause maintenance headaches later.


Some systems can automatically convert SGML documents to Web pages and other viewable
documents. The quality may vary, however, and it may be necessary to hand-tune HTML pages
each time they are generated to attain the highest quality Web pages—not a viable
option when thousands of pages are involved.
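

An automatic conversion of this kind amounts to mapping each structural element onto an HTML tag. The following is a deliberately simple sketch, assuming an XML-style source document; the `TAG_MAP` table, the element names and the `to_html` function are hypothetical, and commercial converters are considerably more elaborate.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from document elements to HTML tags.
TAG_MAP = {"manual": "body", "title": "h1", "par": "p"}

def to_html(elem: ET.Element) -> ET.Element:
    """Rebuild the tree with HTML tag names; unmapped elements become <div>."""
    out = ET.Element(TAG_MAP.get(elem.tag, "div"))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_html(child))
    return out

source = ET.fromstring(
    "<manual><title>Pump Guide</title><par>Prime the pump.</par></manual>"
)
html = ET.tostring(to_html(source), encoding="unicode")
print(html)
```

The fallback to `<div>` hints at why quality varies: any element the mapping does not anticipate is rendered generically, and fixing that by hand after every run does not scale.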


Some systems offer a way to view SGML documents directly from a Web browser, using a
plug-in viewer.


One common problem is that external contributors and editors don’t have SGML
authoring systems and cannot deal with the SGML tags in text documents.


As a result, tags can be lost in the revision process and must be re-entered when
revisions are merged into the master document. Some agencies deal with the problem by
having external contributors do revisions on paper, and the internal publications
department type the changes directly into the SGML system.


Some SGML authoring systems cannot deal with partial or invalid documents. So even if
an agency’s field office or contractor can do SGML editing, interchange problems may
still exist.


Contributors to a document may be restricted to seeing only part of a document, either
for security reasons or to avoid multiple revisions being made to the same document
section. To allow such multiple levels of access, it is sometimes necessary to extract
subdocuments from the master document and create a full set of context structure tags to
make valid documents for the working DTD. Some systems do this more easily than others.
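

Extracting a subdocument with its context tags can be sketched as follows. This is an illustration under assumptions, not how any particular product works: the document, its element names and the `extract_subdocument` function are hypothetical, and the sketch copies only the chain of ancestor tags, whereas a real DTD might also require sibling elements such as a title.

```python
import copy
import xml.etree.ElementTree as ET

def extract_subdocument(root: ET.Element, path: list[str]) -> ET.Element:
    """Copy the element reached via `path`, wrapped in empty copies of its ancestors."""
    chain = [root]
    for step in path:
        found = chain[-1].find(step)
        if found is None:
            raise ValueError(f"no element matches {step!r}")
        chain.append(found)
    if len(chain) == 1:          # empty path: the whole document was requested
        return copy.deepcopy(root)
    new_root = ET.Element(root.tag, dict(root.attrib))
    current = new_root
    for ancestor in chain[1:-1]:  # recreate the context tags, without their content
        current = ET.SubElement(current, ancestor.tag, dict(ancestor.attrib))
    current.append(copy.deepcopy(chain[-1]))
    return new_root

master = ET.fromstring(
    '<manual><title>Pump</title>'
    '<chapter id="c1"><par>Intro</par></chapter>'
    '<chapter id="c2">'
    '<section id="s1"><par>Restricted text</par></section>'
    '<section id="s2"><par>Other text</par></section>'
    '</chapter></manual>'
)

sub = extract_subdocument(master, ["chapter[@id='c2']", "section[@id='s1']"])
print(ET.tostring(sub, encoding="unicode"))
```

A contributor given `sub` sees only section s1 but still edits inside the same manual and chapter context, so the revised section can be merged back into the master document.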


Some systems also track revisions better than others do. If documents you’ll be
creating and publishing typically have long lifecycles with frequent revisions, be sure to
check a package’s document comparison and revision marking capabilities.
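

Document comparison, at its simplest, means reporting which lines differ between two revisions. The sketch below uses Python's standard `difflib` module on two hypothetical revisions; the sample text and the `revision_marks` function are assumptions, and real SGML tools compare document structure rather than raw lines.

```python
import difflib

old_revision = [
    "<par>Torque the bolt to 40 ft-lb.</par>",
    "<par>Replace the seal yearly.</par>",
]
new_revision = [
    "<par>Torque the bolt to 45 ft-lb.</par>",
    "<par>Replace the seal yearly.</par>",
    "<par>Inspect the gasket.</par>",
]

def revision_marks(old_lines: list[str], new_lines: list[str]) -> list[str]:
    """Return only the changed lines, prefixed '- ' (removed) or '+ ' (added)."""
    return [
        line
        for line in difflib.ndiff(old_lines, new_lines)
        if line.startswith(("- ", "+ "))
    ]

for mark in revision_marks(old_revision, new_revision):
    print(mark)
```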


When publishing Standard Generalized Markup Language documents, you must
choose between formatting from pure SGML markup and formatting with additional filtering
information.


ArborText’s Adept Publisher, available for Unix and Microsoft Windows NT 4.0,
takes the former route and generates PostScript directly from a document’s SGML,
Document Type Definition and Formatting Output Specification Instance using a multipass,
rule-based engine. The trade-off is you get more automatic publishing but less control
over the end product.


Adept Publisher is an appropriate printing engine for large, simply formatted
documents, including most of the Defense Department’s Continuous Acquisition and
Lifecycle Support documents. It handles complex index and cross-reference structures and
does a good job with revision marking.


Adept Publisher includes all the functions of Adept Editor, a highly configurable
authoring system for pure native SGML and Extensible Markup Language.


In its default view, Adept displays two editable panes, a document map and an edit
view. The edit view has most of the features of desktop word processors, and it deals with
SGML tags in several ways.


You can specify tags to be viewed or hidden in the document. A quick-tag entry menu
helps you to create only those tags valid in the current context. Even when tags are
hidden, empty tags are displayed and highlighted to let you fill in the missing tag
contents. When searching a document, you can find text inside specific tags if
you wish.


When dragging an element within a document, cursor cues indicate whether you can move
the element, where to drop the element, where the context would change the tags and where
the element would be invalid.


An external equation editor pops up when you add or edit an equation; it includes a
palette of equation symbols.


Adept takes its commands from menus, toolbars, dialogs and a command line. You can
customize menus, toolbars and dialogs. You can automate publishing via Adept Command
Language and compiled .dll files that tie into Adept’s object model.


An add-on product for developers simplifies extensive customization and
automation.


Adept does a good job of handling partial documents and invalid tags, and of
importing and exporting XML, and it integrates with six document management systems.


Adept Publisher sells for $2,350; Adept Editor sells for $1,350. More details on both
products are available at www.arbortext.com.


Contact ArborText Inc. of Waltham, Mass., at 781-529-1000 or 734-997-0200.


Martin Heller is a software developer, consultant and writer in Andover, Mass.




