| << 18.1.2- Examples of Markup Languages | Chapter18 | 18.1.4- The Data Revolution >> |
What is XML?
XML got the name Extensible Markup Language because it is not a fixed format like HTML. While HTML has a fixed set of tags that the author can use, XML users can create their own tags (or use those created by others, if applicable) so that they actually describe the content of the element. So, let's dive straight in and look at an example.
At its simplest level XML is just a way of marking up data so that it is self-describing. What do we mean by this? Well, as a publisher Wrox makes details about their books available in HTML over the web. For example, in HTML we might display details about this book like so:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<HTML>
<HEAD>
<TITLE>Beginning ASP 3.0</TITLE>
</HEAD>
<BODY>
<H1>Beginning ASP 3.0</H1>
<H3>ISBN 1-861003-38-2</H3>
<H4>Authors</H4><H4>Brian Francis, Chris Ullman, Dave Sussman, John Kauffman, Jon Duckett, Juan Llibre</H4>
<P>US $49.99<BR>
<P>ASP is a powerful technology for dynamically creating web site content. Learn how to create exciting pages that are tailored to your audience. Enhance your web/intranet presence with powerful web applications.</P>
</BODY>
</HTML>
That's all you need to do if you want to put information about a book on a Web page. It will look something like this:
[Screenshot: the book details page displayed in a web browser]
So, when we are building our web pages we have a lot of this data in HTML. As an up-and-coming ASP developer you are probably using script and generating content dynamically by now. However, you still have a lot of information marked up just for display on the web.
But the tags (or markup) don't give you any information about what you are displaying. There is no way that you can tell, from the tags, that you are displaying information about a book. With XML, however, you can create your own tags; they can actually describe whatever content you are marking up and hence the term 'self-describing data'.
So, how could we mark up the information in a more logical way, using XML, so that we know what we have in the file? In the following Try It Out, we will create our first XML document that mimics the data held in the above HTML example.
Try It Out – My First XML Document
All you need to create an XML document is a simple text editor; something like Notepad will do just fine for our first example.
1. Fire up your text editor and type in the following, which we will call books.xml. Make sure you type it in exactly as shown, since XML is case sensitive and spaces must be in the correct positions:
<?xml version="1.0"?>
<books>
<book>
<title>Beginning ASP 3.0</title>
<ISBN>1-861003-38-2</ISBN>
<authors>
<author_name>Brian Francis</author_name>
<author_name>Chris Ullman</author_name>
<author_name>Dave Sussman</author_name>
<author_name>John Kauffman</author_name>
<author_name>Jon Duckett</author_name>
<author_name>Juan Llibre</author_name>
</authors>
<description> ASP is a powerful technology for dynamically creating web site content. Learn how to create exciting pages that are tailored to your audience. Enhance your web/intranet presence with powerful web applications.</description>
<price US="$49.99"/>
</book>
</books>
While Notepad will do just fine as an XML editor, there are free alternatives that offer much more. For example, Notepad2 adds line numbering and custom coloring schemes. It comes with a built-in scheme for XML (choose View, Syntax Scheme, XML Document), which color-codes the XML from this example. While it doesn't have the XML-specific features of an expensive XML editor, it does help you visually identify the tags and elements in the markup.
[Screenshot: books.xml in Notepad2 with the XML syntax scheme applied]
2. Save the file as books.xml to any folder you want on your hard drive. It could be with the other directories on your web server, or it could be completely separate. I have placed all my XML files in a folder called XMLfiles, a subfolder of the BegASPFiles directory.
3. To open your XML file in Internet Explorer, select Open from the File menu and browse to it, or type in the URL. Here is how our XML version of the book details is displayed when we open it in IE5.
[Screenshot: books.xml displayed in Internet Explorer 5]
Both IE 6 and IE 7 retain the ability to view XML. You can download the latest version of IE from the Microsoft web site at http://www.microsoft.com/ie. Firefox also displays XML.
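If you would rather check your typing without a browser, any XML parser will tell you whether the file is well-formed. The following is only a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is used elsewhere in this chapter). The helper name is_well_formed is our own invention, and the markup is a trimmed copy of books.xml placed inline so that the sketch is self-contained.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of books.xml (shorter description, same structure).
BOOKS_XML = """<?xml version="1.0"?>
<books>
<book>
<title>Beginning ASP 3.0</title>
<ISBN>1-861003-38-2</ISBN>
<authors>
<author_name>Brian Francis</author_name>
<author_name>Chris Ullman</author_name>
<author_name>Dave Sussman</author_name>
<author_name>John Kauffman</author_name>
<author_name>Jon Duckett</author_name>
<author_name>Juan Llibre</author_name>
</authors>
<description>ASP is a powerful technology for dynamically creating web site content.</description>
<price US="$49.99"/>
</book>
</books>"""


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The document as typed should parse cleanly.
print(is_well_formed(BOOKS_XML))

# Dropping the closing </books> tag leaves the root element unclosed.
print(is_well_formed(BOOKS_XML.replace("</books>", "")))
```

To check a file saved on disk instead, ET.parse("books.xml") reads and parses it in one step and raises the same ParseError if the markup is broken.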
How It Works
Let's go through our code step-by-step and take a look at exactly what is happening:
<?xml version="1.0"?>
This is the XML declaration, which opens the document's prolog. It tells the receiving application that it is getting an XML document compatible with version 1.0 of the XML specification. Note that xml is in lowercase and that there is no white space between the question mark and the opening xml.
All XML documents must have a single outermost element whose opening and closing tags enclose all the other content; <books> is ours, as we have a file containing data about books:
<books>
...
</books>
This is known as the root element. Nearly all XML tags must have a corresponding closing tag; unlike HTML, you cannot leave out end tags and expect your application to accept the document. The only exception is an empty element, which has no content. An example of this in HTML would be an <IMG> tag. In XML, an empty element written as a single tag must have a slash before the closing delimiter, such as <tag attribute="value" />.
Note that XML, unlike HTML, is case sensitive, so <BOOK>, <Book> and <book> would be treated as three different tags.
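These rules are easy to try out with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module, which is not part of this chapter's toolset, and the helper name parses is hypothetical. It feeds a few small fragments to the parser and reports whether each one is accepted.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the parser accepts xml_text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Every opening tag has a matching closing tag.
print(parses("<book><title>Beginning ASP 3.0</title></book>"))     # True

# The </title> end tag is missing, which HTML might forgive but XML does not.
print(parses("<book><title>Beginning ASP 3.0</book>"))             # False

# <Book> and </book> differ in case, so they do not match.
print(parses("<Book><title>Beginning ASP 3.0</title></book>"))     # False

# An empty element can be written with the closing slash...
print(parses('<book><price US="$49.99"/></book>'))                 # True

# ...or as an opening tag followed immediately by its closing tag.
print(parses('<book><price US="$49.99"></price></book>'))          # True
```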
Within this we are describing data about a specific book, so we use an opening tag that explains what will be contained by the tag. Here we are using <book>.
<books>
<book>
...
</book>
</books>
In the same way that we made sensible opening and closing tags for the document using the <books> element, we use similarly descriptive tags to mark up some more details, this time the title of the book and its ISBN number (the one that is just above the bar code on the back of the book). These go inside the opening and closing <book> tags.
<title>Beginning ASP 3.0</title>
<ISBN>1-861003-38-2</ISBN>
As there are several authors on this book we put the list of authors in nested elements. We start with an opening <authors> tag and then nest inside this an <author_name> tag for each author. Again these go between the opening and closing <book> tags. In this example we put them under the ISBN.
<authors>
<author_name>Brian Francis</author_name>
<author_name>Chris Ullman</author_name>
<author_name>Dave Sussman</author_name>
<author_name>John Kauffman</author_name>
<author_name>Jon Duckett</author_name>
<author_name>Juan Llibre</author_name>
</authors>
This is a very important aspect of XML as data, because it allows us to create hierarchical data records. This structure would not fit easily into the row and column model of relational database tables (without the use of linked tables), whereas it fits fine in our text file.
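To see what the hierarchy buys us, here is a rough sketch of how a program might walk it. It assumes Python's standard xml.etree.ElementTree module, which this chapter does not otherwise use, and the variable names are our own. The markup is a trimmed copy of books.xml.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of books.xml containing only the title and the authors.
BOOKS_XML = """<books>
<book>
<title>Beginning ASP 3.0</title>
<authors>
<author_name>Brian Francis</author_name>
<author_name>Chris Ullman</author_name>
<author_name>Dave Sussman</author_name>
<author_name>John Kauffman</author_name>
<author_name>Jon Duckett</author_name>
<author_name>Juan Llibre</author_name>
</authors>
</book>
</books>"""

root = ET.fromstring(BOOKS_XML)

# Each <book> owns its own <authors> list, so one record can hold any
# number of authors without needing a second, linked table.
for book in root.findall("book"):
    title = book.findtext("title")
    authors = [a.text for a in book.findall("authors/author_name")]
    print(title)
    for name in authors:
        print("  " + name)
```

The path "authors/author_name" simply follows the nesting: from the <book> element, step into <authors>, then collect every <author_name> inside it.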
Next, we added the description of the book. Here we are using an element called <description>, although it could equally be something like <precis>, <details> or <synopsis>.
<description> ASP is a powerful technology for dynamically creating web site content. Learn how to create exciting pages that are tailored to your audience. Enhance your web/intranet presence with powerful web applications.</description>
We then added the price of the book to the document. Here you can see that we are using an empty element tag, with the closing slash in the tag. The currency and amount are actually held within the US attribute:
<price US="$49.99"/>
</book>
Note that, in XML, all attribute values must be contained in quotes.
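As a small illustration of how an application might read that attribute, and of what happens when the quotes are left out, consider the following sketch. It assumes Python's standard xml.etree.ElementTree module, which is our choice for illustration rather than anything this chapter relies on, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The empty <price> element carries its data in the US attribute.
price = ET.fromstring('<price US="$49.99"/>')
us_price = price.get("US")
print(us_price)        # $49.99
print(price.text)      # None, because an empty element has no content

# Leaving the quotes off the attribute value makes the markup ill-formed.
try:
    ET.fromstring("<price US=$49.99/>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False
print(unquoted_ok)     # False
```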
And that's all there is to it. You have just created your first XML document. It is plain text, its tags describe their content, and it is easily human-readable.
From this alone, you can tell that we are now talking about a book. The tags that meant little in our HTML version, such as <H3> and <P>, are gone. In our HTML example, the ISBN number of the book was held in <H3> tags. When we mark up the data about a book using XML, the ISBN number is in tags that are called <ISBN>. Now, this seems a lot more logical and it is simple to see what we are talking about.
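Because the tag names describe the content, a program can use them directly as field names. The following is only a sketch under assumptions: it uses Python's standard xml.etree.ElementTree module, the function name text_fields is hypothetical, and the markup is a cut-down copy of our book record with a single author.

```python
import xml.etree.ElementTree as ET

# A cut-down copy of the <book> record from books.xml.
BOOK_XML = """<book>
<title>Beginning ASP 3.0</title>
<ISBN>1-861003-38-2</ISBN>
<authors>
<author_name>Brian Francis</author_name>
</authors>
<price US="$49.99"/>
</book>"""


def text_fields(element):
    """Map each child tag that holds only text to that text (hypothetical helper)."""
    return {
        child.tag: child.text.strip()
        for child in element
        if len(child) == 0 and child.text  # skip nested and empty elements
    }


book = ET.fromstring(BOOK_XML)
fields = text_fields(book)

# No knowledge of the page layout is needed: we ask for the ISBN by name.
print(fields["ISBN"])
print(fields["title"])
```

With the HTML version, the same program would have had to know that the ISBN happened to be in the first <H3> on the page, which says nothing about what the text means.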
But what about how it looks in a web browser? Browsers understand tags such as <H3>, referring to a level 3 heading, but we cannot expect a browser to understand tags that we are making ourselves. As XML is such a new technology, browsers are only just starting to support it; Internet Explorer 5 was the first browser to offer full support for the XML specification. For this reason, we will use Internet Explorer 5 for the examples in this chapter.
As it is, our XML version is not as attractive to look at as the HTML version but, as we said, HTML is a language specifically for displaying data on the web. XML is just a way of marking up your data. It is still possible to make our XML documents more attractive using a style sheet. In fact, there are several reasons why you might want to keep your data separate from styling rules. We shall look at these in a moment.
Technologies surrounding XML
As you might have guessed by now, if XML is just a way of describing your data, then there are several other things that you need to learn in order to use XML in the same way that we use HTML. These include:
- Schemas to define what these tags we are making up mean
- Style sheets for presenting the XML in an attractive way
- Linking rules, since XML has no built-in hyperlinking mechanism
But don't let this put you off. The advantages of using XML in certain situations easily outweigh the disadvantages of having to learn these new specifications. Before we learn some of the rules for writing XML and using the associated technologies, let's consider why it is so important.