Wednesday, 21 December 2011

Lab 7

Quick questions:

1.    People who prepare XML documents sometimes put part of the document in a CDATA section.

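    a) Why would they do that?

A CDATA (character data) section holds text that the XML parser must not interpret as markup. The parser still passes the text on to the application, but characters such as < and & inside the section are treated as literal characters. This is useful for content such as code fragments, which would otherwise need a lot of escaping.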
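As a small illustration, here is a sketch of a code fragment wrapped in a CDATA section. The element name codeSample and the line of code are made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- codeSample is a hypothetical element name used only for this example -->
<codeSample><![CDATA[
if (a < b && b < c) { swap(a, c); }
]]></codeSample>
```

Without the CDATA section, the < and & characters in this line would make the document not well-formed.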


    b) How is the CDATA section indicated?

A CDATA section starts with "<![CDATA[" and ends with "]]>", with the text placed between the two delimiters. The text itself cannot contain the sequence "]]>".


    c) If CDATA sections hadn’t been invented, would there be any other way to achieve the same effect?

Yes. Each special character can be escaped individually with a predefined entity reference, for example &lt; for < and &amp; for &. The parser then delivers exactly the same text to the application; the only cost is that the source becomes harder to read and write.

Comments, which start with "<!--" and end with "-->", are not a substitute. The parser does skip their contents, but a comment is not part of the document's character data, so an application cannot rely on receiving the text.
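Escaping with the predefined entity references &lt; and &amp; can be sketched as follows. The element name codeSample and the line of code are made up for this example; a parser should hand the application the text with literal < and & characters, just as it would for a CDATA section.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- codeSample is a hypothetical element name used only for this example -->
<codeSample>
if (a &lt; b &amp;&amp; b &lt; c) { swap(a, c); }
</codeSample>
```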


  2. What is a parser and what does it have to do with validity?

An XML parser reads an XML document and makes its elements, attributes and text available to an application, for example by building an XML DOM object from which data can be read and extracted.

Every parser checks that the document is well-formed. A validating parser additionally checks the document against its DTD, to ensure that the elements and attributes appear in the structure the DTD allows; a document that passes this check is valid.
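The difference between a document that merely parses and one that is valid can be seen in a small sketch. The names note, to and body are invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- note, to and body are hypothetical names used only for this example -->
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<!-- Well-formed: every tag is closed and properly nested.
     Not valid: the DTD requires <to> before <body>, but <to> is missing. -->
<note>
  <body>Lab moved to Thursday.</body>
</note>
```

A non-validating parser should accept this document, while a validating parser should report the missing to element.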


  3. You write a .dtd file to accompany a class of XML documents. You want one of the elements, with the tag <trinity>, to appear exactly three times within the document element of every document in this class. Is it possible for the .dtd file to specify this?

Yes. The element trinity is declared only once, but the content model of the root element lists it three times, so it must occur exactly three times.

For example, taking this XML:

<theRootElement>
            <trinity>1</trinity>
            <trinity>2</trinity>
            <trinity>3</trinity>
</theRootElement>

The DTD, written here as an internal subset, would be as follows:

<!DOCTYPE theRootElement [
            <!ELEMENT theRootElement (trinity, trinity, trinity)>
            <!ELEMENT trinity (#PCDATA)>
]>

The occurrence indicators * and + are not used for trinity, because the element must occur exactly three times and these indicators allow a variable number of occurrences.
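DTD content models have no counted repetition, so fixed or bounded counts are spelled out by repeating the element name. The declarations below are a sketch of a few variations; the container names exactlyThree, atLeastThree and twoOrThree are hypothetical and exist only to show the three content models side by side.

```xml
<!-- Hypothetical container elements, one per counting rule -->
<!ELEMENT exactlyThree (trinity, trinity, trinity)>
<!ELEMENT atLeastThree (trinity, trinity, trinity+)>
<!ELEMENT twoOrThree   (trinity, trinity, trinity?)>
<!ELEMENT trinity (#PCDATA)>
```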


Longer questions:
1.        The following is one of the documents that featured in last week’s exercises. As mentioned before, this is to be “Chapter 2: Volcanic winter” in a book.
a)      Write a suitable prolog for this document.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter SYSTEM "chptr.dtd">


b)      Write a .dtd file to act as the Document Type Definition for this document. Or modify the one you wrote last week, if you wrote one.

<!-- chptr.dtd: an external .dtd file holds only the declarations, with no DOCTYPE wrapper -->
<!ELEMENT chapter (title, material)>
    <!ATTLIST chapter number CDATA #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT material (paragraph | poem)+>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT poem (poemVerse+)>
    <!ELEMENT poemVerse (#PCDATA)>



c)      Put tags into the document. Obviously, there must be a document element. But also, the poem needs special treatment (because of the way it will be displayed) and, in fact, each line of the poem needs special treatment (you can spot the places where the lines start, by the capital letters). The mention of the poets at Geneva needs to be identified, because it will feature in the index, and so do the pyroclastic flows and Mount Tambora and Sumbawa and the year without a summer and the famines.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter SYSTEM "chptr.dtd">
<chapter number="2">
    <title>Volcanic Winter</title>
    <material>
        <paragraph>
A volcanic winter is very bad news. The worst eruption in recorded history happened at Mount Tambora in 1815. It killed about 71 000 people locally, mainly because the pyroclastic flows killed everyone on the island of Sumbawa and the tsunamis drowned the neighbouring islands, but also because the ash blanketed many other islands and killed the vegetation. It also put about 160 cubic kilometres of dust and ash, and about 150 million tons of sulphuric acid mist, into the sky, which started a volcanic winter throughout the northern hemisphere. The next year was the year without a summer. No spring, no summer – it stayed dark and cold all the year round. This had its upside. In due course, all that ash and mist in the upper atmosphere made for some lovely sunsets, and Turner was inspired to paint this. The Lakeland poets took a holiday at Lake Geneva, and the weather was so horrible that Lord Byron was inspired to write this.
        </paragraph>
        <poem>
            <poemVerse>The bright sun was extinguish'd, and the stars</poemVerse>
            <poemVerse>Did wander darkling in the eternal space,</poemVerse>
            <poemVerse>Rayless, and pathless, and the icy earth</poemVerse>
            <poemVerse>Swung blind and blackening in the moonless air;</poemVerse>
            <poemVerse>Morn came and went – and came, and brought no day.</poemVerse>
        </poem>
        <paragraph>
Mary Shelley was inspired to write Frankenstein. The downside was that there were famines throughout Europe, India, China and North America, and perhaps 200 000 people died of starvation in Europe alone.
        </paragraph>
    </material>
</chapter>




2. This chapter obviously needs some pictures. You have available the following, and you decide to include them in the chapter, at appropriate places:
-       a picture of Sumbawa, after the volcanic eruption. It’s in a file sumbawa.jpg. Caption: “Sumbawa, after the volcanic eruption”.
-       a picture of Lake Geneva, in 1816. It’s in a file Geneva1816.jpg. Caption: “Lake Geneva, during the summer of 1816”.
-       a picture of Mary Shelley. It’s in a file MaryShelley.jpg. Caption: “Mary Shelley, author of Frankenstein”.
Amend your two files so that they can cope with these pictures and captions.


DTD:


<!-- chptr.dtd: an external .dtd file holds only the declarations, with no DOCTYPE wrapper -->
<!ELEMENT chapter (title, material)>
    <!ATTLIST chapter number CDATA #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT material (paragraph | poem | image)+>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT poem (poemVerse+)>
    <!ELEMENT poemVerse (#PCDATA)>

<!NOTATION JPG SYSTEM "image/jpeg">
<!ENTITY sumbawa SYSTEM "sumbawa.jpg" NDATA JPG>
<!ENTITY lakeGeneva SYSTEM "Geneva1816.jpg" NDATA JPG>
<!ENTITY maryShelley SYSTEM "MaryShelley.jpg" NDATA JPG>

<!ELEMENT image EMPTY>
<!ATTLIST image
    source ENTITY #REQUIRED
    caption CDATA #REQUIRED
>



XML:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter SYSTEM "chptr.dtd">
<chapter number="2">
    <title>Volcanic Winter</title>
    <material>
        <paragraph>
A volcanic winter is very bad news. The worst eruption in recorded history happened at Mount Tambora in 1815. It killed about 71 000 people locally, mainly because the pyroclastic flows killed everyone on the island of Sumbawa and the tsunamis drowned the neighbouring islands, but also because the ash blanketed many other islands and killed the vegetation. It also put about 160 cubic kilometres of dust and ash, and about 150 million tons of sulphuric acid mist, into the sky, which started a volcanic winter throughout the northern hemisphere.
        </paragraph>
        <image source="sumbawa" caption="Sumbawa, after the volcanic eruption"/>
        <paragraph>
The next year was the year without a summer. No spring, no summer – it stayed dark and cold all the year round. This had its upside. In due course, all that ash and mist in the upper atmosphere made for some lovely sunsets, and Turner was inspired to paint this. The Lakeland poets took a holiday at Lake Geneva, and the weather was so horrible that Lord Byron was inspired to write this.
        </paragraph>
        <image source="lakeGeneva" caption="Lake Geneva, during the summer of 1816"/>
        <poem>
            <poemVerse>The bright sun was extinguish'd, and the stars</poemVerse>
            <poemVerse>Did wander darkling in the eternal space,</poemVerse>
            <poemVerse>Rayless, and pathless, and the icy earth</poemVerse>
            <poemVerse>Swung blind and blackening in the moonless air;</poemVerse>
            <poemVerse>Morn came and went – and came, and brought no day.</poemVerse>
        </poem>
        <image source="maryShelley" caption="Mary Shelley, author of Frankenstein"/>
        <paragraph>
Mary Shelley was inspired to write Frankenstein. The downside was that there were famines throughout Europe, India, China and North America, and perhaps 200 000 people died of starvation in Europe alone.
        </paragraph>
    </material>
</chapter>

Thursday, 15 December 2011

Lab 6

Quick questions:

1.    What exactly does a DTD do in XML?
A DTD defines which elements and attributes may appear in an XML document, and in what order and nesting, so that documents can be validated against it. It may also contain other declarations, such as entity definitions.

2. You’ve written an XML document, with the XML declaration <?xml version="1.0"?> at the start. You realise that the text contains some Arabic characters. Which of the following should you do:

a)        change the XML declaration to <?xml version="1.0" encoding="ISO-8859-6"?>
b)        change the XML declaration to <?xml version="1.0" encoding="UTF-8"?>
c)        do nothing: the declaration is fine as it is.

The answer is c), provided the file is actually saved as UTF-8. When no encoding is specified, the XML parser assumes UTF-8 (or UTF-16 if the file starts with a byte order mark), and Unicode includes all the Arabic characters. Option b) has the same effect and simply makes the encoding explicit. Option a) would only be right if the file were saved in ISO-8859-6.
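A minimal sketch of a document containing Arabic text, assuming the file is saved as UTF-8. The element name greeting is invented for this example, and the encoding is stated explicitly here, although a UTF-8 file should parse the same way without it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- greeting is a hypothetical element name; the file must be saved as UTF-8 -->
<greeting>مرحبا</greeting>
```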

3. Can you use a binary graphics file in an XML document?

Yes, it is possible if the following steps are taken:
A) The file is declared as an external entity.
B) The entity declaration uses NDATA followed by a notation name, which marks the entity as unparsed.
C) The notation is declared.
D) An attribute of type ENTITY is declared for the element that will refer to the graphic.
E) The entity name is used as the value of that attribute wherever the graphic is wanted.
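The steps above can be sketched in one small document with an internal DTD. The names gallery, picture, logo and logo.jpg are hypothetical and chosen only for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE gallery [
  <!-- Step C: declare the notation that names the file format -->
  <!NOTATION JPG SYSTEM "image/jpeg">
  <!-- Steps A and B: an external entity, marked as unparsed by NDATA -->
  <!ENTITY logo SYSTEM "logo.jpg" NDATA JPG>
  <!ELEMENT gallery (picture+)>
  <!ELEMENT picture EMPTY>
  <!-- Step D: an attribute of type ENTITY to carry the reference -->
  <!ATTLIST picture source ENTITY #REQUIRED>
]>
<gallery>
  <!-- Step E: the entity name, without & or ;, is the attribute value -->
  <picture source="logo"/>
</gallery>
```

The parser does not read the image file itself; it only reports the entity and its notation to the application, which decides how to display the graphic.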


Longer questions:
1.        The following is the document element (root element) of an XML document.
This question is exactly the same as question 3 in my previous blog post; please refer to it.

2. I decide to produce a book called “Toba: the worst volcanic eruption of all”. I ask 3 colleagues to write three text files entitled:
“Chapter 1: The mystery of Lake Toba’s origins”.
“Chapter 2: Volcanic winter”.
“Chapter 3: What Toba did to the human race”.
All three text files are placed into a folder c:\bookproject\chapters on the hard drive on my computer. I insert <text> at the start of each file, and </text> at the end. I name the three files chap1.xml, chap2.xml, and chap3.xml respectively. I draw up the title page, title page verso and contents page of the book like this:
Toba: the worst volcanic eruption of all
John
Jack
Jill
Joe
STC Press
Malta
Copyright © 2010
STC Press

Published by STC Press Ltd., Malta
ISBN: 978-0-596-52722-0
Contents
Chapter 1: The mystery of Lake Toba’s origins
Chapter 2: Volcanic winter
Chapter 3: What Toba did to the human race
 Then I construct an XML document that encompasses the whole book.
(a)    Provide this XML document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
    <titlePage>
        <title>Toba: The worst volcanic eruption of all</title>
        <authors>
            <author>John</author>
            <author>Jack</author>
            <author>Jill</author>
            <author>Joe</author>
        </authors>
        <publisher>&pblshr;</publisher>
        <country>Malta</country>
    </titlePage>

    <verso>
        <copyright>Copyright © 2010 &pblshr;</copyright>
        <publishedBy>Published by STC Press Ltd., Malta</publishedBy>
        <ISBN>ISBN: 978-0-596-52722-0</ISBN>
    </verso>

    <contents>
        <chName number="1" title="The mystery of Lake Toba's origins"/>
        <chName number="2" title="Volcanic winter"/>
        <chName number="3" title="What Toba did to the human race"/>
    </contents>

    <chapters>
        <chapter number="1" title="The mystery of Lake Toba's origins">&chapter1;</chapter>
        <chapter number="2" title="Volcanic winter">&chapter2;</chapter>
        <chapter number="3" title="What Toba did to the human race">&chapter3;</chapter>
    </chapters>

</book>

(b)   Provide the accompanying .dtd file


<?xml version="1.0" encoding="UTF-8"?>
 
    <!ELEMENT book (titlePage, verso, contents, chapters)>

    <!ELEMENT titlePage (title, authors, publisher, country)>
        <!ELEMENT title (#PCDATA)>
        <!ELEMENT authors (author+)>
            <!ELEMENT author (#PCDATA)>
        <!ELEMENT publisher (#PCDATA)>
        <!ELEMENT country (#PCDATA)>
      
    <!ELEMENT verso (copyright, publishedBy ,ISBN)>
        <!ELEMENT copyright (#PCDATA)>
        <!ELEMENT publishedBy (#PCDATA)>
        <!ELEMENT ISBN (#PCDATA)>

    <!ELEMENT contents (chName+)>
        <!ELEMENT chName EMPTY>
        <!ATTLIST chName number CDATA #REQUIRED title CDATA #REQUIRED>

    <!ELEMENT chapters (chapter+)>
        <!ELEMENT chapter (text)>
            <!ELEMENT text (#PCDATA)>
        <!ATTLIST chapter number CDATA #REQUIRED title CDATA #REQUIRED>


<!ENTITY pblshr "STC Press">
<!ENTITY chapter1 SYSTEM "file:///c:/bookproject/chapters/chap1.xml">
<!ENTITY chapter2 SYSTEM "file:///c:/bookproject/chapters/chap2.xml">
<!ENTITY chapter3 SYSTEM "file:///c:/bookproject/chapters/chap3.xml">