Get ready for the next big thing: XML

Most Web developers know Extensible Markup Language looms large in their future, but few know whether it will be next month or next year.

By Shawn P. McCarthy
Special to GCN

Hypertext Markup Language will be used for a long time because it does basic file presentation so well. XML becomes necessary only when you need better access to, or control of, data embedded in office files and databases.

Like HTML, XML is a streamlined version of the Standard Generalized Markup Language, which makes it possible to use and display information in different ways by defining its structure and elements. The International Organization for Standardization's SGML specification is posted on the Web at www.iso.ch/cate/d16387.html.

XML is designed specifically for Web presentation. Its big advantage is that groups of developers can collaborate using their own customized tags to exploit functions that aren't possible with HTML.

As XML evolves, professional groups will establish specific XML tag sets to use in education, commerce, science and other fields. The sets likely will evolve in much the same way as OFX, the Open Financial Exchange format used by the banking industry.

Although XML is for presentation, think of it as a data format, not a document format. Straight XML documents are already common on the Web, but the language's great power lies in generating documents on the fly from databases.

XML has two remarkably different purposes. It can locate specific types of data embedded in documents, and it also can generate temporary documents (Web pages) from databases. Many sites have begun storing their information in databases and generating pages dynamically on demand.

XML can easily personalize content views if the data sources are properly tagged. The pages convert to straight HTML after the necessary data is culled; that's how leading Web search engines produce their customizable start pages. Visitors do not need an XML-capable browser to see the data, which passes behind the scenes as XML and changes into HTML only for display.

Conceptually, XML is a trio of specifications:

• The XML 1.0 recommendation explains the syntax of the metalanguage.

• The XML Linking Language and XPointer are the World Wide Web Consortium's working drafts that describe ways to link relationships between documents.

• The Extensible Style Language, now a W3C Note, describes how to render XML using different style sheets for various types of display devices.

XML also can issue commands. If you encounter a
Extensible Markup Language's tag sets will give Web developers greater access to databases











www.iso.ch/cate/d16387.html





Flexible format



An XML Glossary
Attribute: A named property, attached to an element, that can be assigned a value. A hyperlink's target and an embedded image's source are specified as attributes.

CDF: Channel Definition Format, an XML-based format used for push technology.

DTD: Document Type Definition, a set of rules governing the tags in an XML document, set at the top of the document.

DSSSL: Document Style Semantics and Specification Language, an SGML style sheet standard.

Element: The keyword that starts an element type declaration in a DTD.

Entity: Phrase or character that represents text or data stored elsewhere.

Parser: A program that reads an XML document and checks that it is well-formed; a validating parser also checks the document against its DTD.

Style sheets: Can be associated with an XML document to control information display.

Well-formed: An XML document whose open and close tags match and are nested correctly, and whose entities and attributes are properly declared.

XLL: Extensible Linking Language, the linking standard for XML.

XSL: Extensible Style Language, the style standard for XML.
















tag when browsing with Microsoft Internet Explorer, it starts a function that lets you update installed software.

The main page for the W3C's XML efforts is at www.w3.org/XML/Activity. XML only recently became a W3C recommendation and is not yet an official standard.

Internet Explorer 5.0 so far is the only browser that understands XML elements, based on the draft specification. The parts Microsoft adopted for Explorer will likely be part of the official XML standard. Netscape Communications Corp. is taking a wait-and-see attitude and likely will not release an XML-ready Communicator 5 until the specification becomes official.

Here's how XML makes documents readable by users and by browsers and other software programs. Say you want to create a document about some machine parts stored in a warehouse. In the HTML world, you would start with a document that looked something like this:



Machine Parts



Left-handed widgets



Then you would add more lines of description to produce a basic Web page. If your colleagues later wanted to put the information into a report or add it to a database, they would gather the page, strip out the font and alignment tags, and then reformat the information.
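The extraction chore described above can be sketched in Python. The HTML string is a hypothetical page in the style the article describes, not a reproduction of its listing, and `TextOnly` and `strip_tags` are illustrative names.

```python
from html.parser import HTMLParser

# Hypothetical page: presentation tags wrapped around the actual data.
PAGE = """<html><head><title>Machine Parts</title></head>
<body><h1 align="center"><font size="5">Left-handed widgets</font></h1></body></html>"""


class TextOnly(HTMLParser):
    """Collects the text between tags and discards the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only runs between tags
            self.chunks.append(text)


def strip_tags(html_text):
    """Return the page's text fragments in document order."""
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()
    return parser.chunks


print(strip_tags(PAGE))
```

The text survives, but nothing in the result says which string is a title and which is a part name. That missing information is what descriptive XML tags supply.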

Now here's how the same document might look in XML:

Bob Smith



Machine parts

Left-handed widgets



In XML, tags can be invented to describe data types. Anyone searching for an occurrence of left-handed widgets within a recognized tag would have a good chance of finding the widget entry.
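A tag-aware search of this kind might look like the following minimal sketch. The tag names (`document`, `author`, `title`, `part`) are assumptions made for illustration; only the values come from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names wrapped around the article's sample values.
DOC = """<document>
  <author>Bob Smith</author>
  <title>Machine parts</title>
  <part>Left-handed widgets</part>
</document>"""


def find_in_tag(xml_text, tag, phrase):
    """Return the text of every <tag> element that mentions the phrase."""
    root = ET.fromstring(xml_text)
    return [
        el.text
        for el in root.iter(tag)
        if el.text and phrase.lower() in el.text.lower()
    ]


print(find_in_tag(DOC, "part", "left-handed widgets"))    # match inside <part>
print(find_in_tag(DOC, "author", "left-handed widgets"))  # no match in <author>
```

Restricting the search to one tag is what separates a part named "widget" from an author or title that merely contains the word.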

Once tags are generally recognized, software can deal with them automatically. It's simple to tell a program to look at a specific directory and pull in the contents of the tags from all documents within the directory. Then the contents can be imported into a database field, output to other documents, updated for reinsertion in the original document or held for other uses such as building new pages.
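As a rough sketch of that directory sweep, assuming the documents are stored as .xml files and the tag of interest is a hypothetical `part` element:

```python
import glob
import os
import tempfile
import xml.etree.ElementTree as ET


def collect_tag(directory, tag):
    """Gather the text of every <tag> element in the directory's .xml files."""
    values = []
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        root = ET.parse(path).getroot()
        values.extend(el.text for el in root.iter(tag) if el.text)
    return values


# Demonstration with two throwaway files; the names and contents are made up.
with tempfile.TemporaryDirectory() as folder:
    for name, part in [("a.xml", "Left-handed widget"), ("b.xml", "Cotter pin")]:
        with open(os.path.join(folder, name), "w", encoding="utf-8") as fh:
            fh.write(f"<document><part>{part}</part></document>")
    print(collect_tag(folder, "part"))
```

The returned list could then be written to a database field or fed into a new page.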

If every bit of information is properly tagged, you can pull all of it into a database. At that point you no longer need to maintain the original document, just the database.
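The round trip from tagged document to database and back can be illustrated with Python's built-in SQLite module. The table layout, the function names and the tag names here are assumptions chosen to keep the sketch small; a real system would likely use a richer schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical tag names around the article's sample values.
RECORD = """<document>
  <author>Bob Smith</author>
  <title>Machine parts</title>
  <part>Left-handed widget</part>
</document>"""


def load_into_db(conn, xml_text):
    """Store each child element of the root as a (name, value) row."""
    conn.execute("CREATE TABLE IF NOT EXISTS fields (name TEXT, value TEXT)")
    root = ET.fromstring(xml_text)
    conn.executemany(
        "INSERT INTO fields (name, value) VALUES (?, ?)",
        [(child.tag, child.text or "") for child in root],
    )
    conn.commit()


def rebuild_document(conn, root_tag="document"):
    """Regenerate an XML document from the stored rows alone."""
    root = ET.Element(root_tag)
    for name, value in conn.execute("SELECT name, value FROM fields ORDER BY rowid"):
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
load_into_db(conn, RECORD)
print(rebuild_document(conn))
```

Because the document can be regenerated from the rows, the database becomes the copy that has to be maintained.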

Get together

But you cannot keep adding new tag names, especially if you share data with other offices. How would they know what your tags meant? That's why groups have gotten together to develop standardized tag sets.

Given the appropriate tags, you can stack an enormous amount of data into an XML document. Anything becomes a data field just by tagging it, including the document itself. Take a look at this XML document:





Bob Smith



Machine parts

Left-handed widget

Roto Tiller

Garden

Detailed description

XYZ 186



Remove the cotter pin. Remove old widget. Install new widget. Replace cotter pin.









It looks like HTML, but it has no presentation data. That comes from another source, such as a style sheet. It's like having your word processor import addresses or names via mail merge rather than typing and formatting them directly in the document. Many systems merge XML and style data back into a presentational language such as HTML for easier reading.
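The merge step can be sketched as follows. This is not XSL: the `STYLE` dictionary is a simplified stand-in for a style sheet, and the tag names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data document: no presentation information at all.
DATA = """<document>
  <title>Machine parts</title>
  <part>Left-handed widget</part>
  <instructions>Remove the cotter pin. Remove old widget.</instructions>
</document>"""

# Stand-in for a style sheet: which HTML element presents each data tag.
STYLE = {"title": "h1", "part": "h2", "instructions": "p"}


def render_html(xml_text, style):
    """Wrap the text of each styled data element in its presentation tag."""
    root = ET.fromstring(xml_text)
    lines = ["<html><body>"]
    for child in root:
        html_tag = style.get(child.tag)
        if html_tag:  # elements without a style rule are not displayed
            lines.append(f"<{html_tag}>{escape(child.text or '')}</{html_tag}>")
    lines.append("</body></html>")
    return "\n".join(lines)


print(render_html(DATA, STYLE))
```

Swapping in a different `STYLE` mapping changes the page without touching the data, which is the point of keeping the two apart.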

The downside of XML's flexibility is that it is less forgiving than HTML. Browsers ignore HTML commands they fail to understand. If items aren't properly nested, it's no big deal to the browsers. But in XML, an improperly formatted file creates a fatal error. Applications will refuse to process the file.

That means a document must be what XML experts call well-formed to work right. It has to be ready for a computer program to read, and thus ready to be used in multiple ways for network delivery.

In a well-formed document:

• All begin tags and end tags match up.

• Empty tags use the special XML syntax, in which a slash precedes the closing angle bracket.

• All the attribute values are properly quoted.

• All the entities, or reusable data chunks, are declared.

Checking for code errors across thousands of documents is tough, so XML users turn to automated tools such as the Lark parser. An online demo of Lark at xml.com/xml/pub/tools/ruwf/check.html can check whether your document is well-formed.
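The same kind of check can be run locally. The sketch below uses the parser in Python's standard library rather than Lark, and the sample fragments and their tag names are made up to exercise each rule in the list.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# One hypothetical fragment per rule.
SAMPLES = {
    "matched tags": "<part><name>widget</name></part>",
    "mismatched tags": "<part><name>widget</part></name>",
    "empty tag": "<part><instock/></part>",
    "unquoted attribute": "<part id=186>widget</part>",
    "undeclared entity": "<part>&widget;</part>",
}

for label, text in SAMPLES.items():
    ok, _ = is_well_formed(text)
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

The parser stops at the first error it meets, which matches the fatal-error behavior described earlier: a file is either well-formed or it is not processed.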

XML designers recognized that document authors sometimes omit important information or include extraneous text. The document type definition, or DTD, makes sure that XML coding will do what was intended.

For the parts file above, a DTD might work like this:

A declaration of the form <!DOCTYPE name [text]>. The text entered inside the brackets would represent the DTD for the document, and name identifies its root element. The root element contains all other elements.

A declaration of the form <!ELEMENT name EMPTY>. This simply says to expect a standalone tag.

A declaration of the form <!ELEMENT name (child1, child2)>. This defines the tag. Within the parentheses are additional sets of tags. They must appear inside the tags, in the same order.

An XML document can have an internal or an external DTD. It must be external if the DTD applies to multiple XML files.

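The elements can get more complex. For example, in a declaration such as <!ELEMENT author (#PCDATA)>, the #PCDATA term means parsed character data: plain text rather than binary information such as an image. You could designate "author" as the author's name; a photo would have to be referenced from the document rather than typed in as content.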

The DTD checks to confirm that items within the tags follow its rules. For details about how DTDs are constructed, visit www.w3.org/TR/REC-xml#dt-doctype. But an XML document need not have a DTD to function. If the document is well-formed, it requires no special rules to tell a browser or other device how to read it.
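To make the idea concrete, here is a small sketch with an internal DTD. The element names are hypothetical. Python's standard-library parser reads the DTD but does not validate against it, so the ordering rule is restated by hand in `EXPECTED_CHILDREN`; a validating parser would enforce it directly from the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD: the root must hold author, title, part in order.
DOC = """<!DOCTYPE document [
  <!ELEMENT document (author, title, part)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT part (#PCDATA)>
]>
<document>
  <author>Bob Smith</author>
  <title>Machine parts</title>
  <part>Left-handed widget</part>
</document>"""

# The content model of <document>, restated by hand for this sketch.
EXPECTED_CHILDREN = ["author", "title", "part"]


def follows_sequence(xml_text, expected):
    """Check that the root's children match the expected names, in order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == expected


print(follows_sequence(DOC, EXPECTED_CHILDREN))
```

A document that put the title before the author would still be well-formed, but it would fail this check, just as it would fail validation against the DTD.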

Every XML parser can tell whether a document is well-formed; a validating parser also checks the document against its DTD. To do a quick, simple check, save a document with an .xml extension, then view it in Internet Explorer 5.0, which will show whether anything is incomplete.

The key to writing successful XML is to do a great deal of advance planning. Decide how documents will be stored and served, how databases will be accessed, what tag sets will be used, and how they will nest so that the resulting documents are not only well-formed but also make sense to readers.

Decide whether you will need a DTD. If so, should it be internal or external, and how should it be structured? Don't worry about style sheets until you have everything else in place.

Above all, learn what others in your agency are doing about tag set creation. Because the government shares so much information, it needs a governmentwide tag set.

Then set up some experiments with a few dozen documents. Check the resources in this article to get started.

You can read the full XML specification at www.w3.org/TR/REC-xml.