XML - University at Albany, SUNY

Introduction to XML

What is XML
  • It stands for eXtensible Markup Language
  • Tag-based syntax, very much like HTML
  • Allows us to make our own tags, hence "extensible"
  • Extensibility is its major benefit
  • Foundation for Web 2.0 applications such as RSS, AJAX, XHTML, and many others

What is XML
  • Designed to describe data, not display it
  • Tags are not predefined as they are in HTML
  • W3C standard
  • XML doesn't do anything; it is not programming per se
  • HTML is all about looks, whereas XML is all about brains

History of XML
  • XML was introduced in 1998
  • Evolved out of SGML, or Standard Generalized Markup Language
  • SGML has roots that date back to the 1960s
  • SGML grew out of the work of 3 people, most notably Charles Goldfarb
  • Goldfarb, a Harvard Law graduate, wanted a way to share documents and coined the term "markup language"

XML History cont.
  • Developed by an 11-member group, with 150+ additional consultants, over 3 years
  • In 1998, the W3C released XML 1.0
  • The last new version was XML 1.1, first published in 2004 (second edition in 2006)
  • There's talk of XML 2.0, but no serious plans yet

Before XML there was HTML
  • Gore Bill, or High Performance Computing Act of 1991
  • HTML, or HyperText Markup Language, developed in 1991 by Tim Berners-Lee, inventor of the WWW
  • Not widely used until HTML 2.0 was released in 1995
  • HTML 4.0 followed in 1997 (revised 1998); HTML5 was in draft in 2011 and became a W3C Recommendation in 2014
  • HTML simplified SGML so that non-experts could mark up documents
  • HTML made the Internet revolution possible
  • HTML too successful!

Landscape of Markup Languages

DocBook
  • A schema developed in 1991
  • Used to write books, especially technical information
  • Created and used before the advent of XML
  • Created and maintained in part by O'Reilly
  • Uses SGML and XML
  • Has some compatibility issues with XML tools
  • DocBook is on the decline

HTML
  • Optimized for WWW
  • Allows non-standard markup (i.e., sloppy syntax)
  • Content & format not separated
  • Fixed set of tags, not extensible
vs. XML
  • Separates content from format
  • Enforces clean markup
  • Meaningful, self-describing syntax
  • Tags can be user created since XML is extensible

HTML Lacks Meaning
  • What does a number inside an HTML tag mean?
  • Weight of an automobile?
  • Number of students enrolled at UAlbany?
  • A zip code for an Albany address?
  • Appearance is the primary goal of HTML
  • It cannot separate content from presentation!

XHTML
  • Helped fix some of the problems found with HTML
  • More restrictive
  • Documents must be well formed, i.e., XHTML elements must be properly nested
  • No open elements, i.e., no opening tag without a closing tag
  • Uses a standard XML parser, which is not as flexible as an HTML parser
  • Element names must be in lowercase

XHTML cont.
  • XHTML is still merely for presentation, not data
  • XHTML is not a replacement for XML
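The well-formedness rules that XHTML inherits from XML (proper nesting, no unclosed elements) can be checked with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the four markup fragments are invented for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a standard XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical XHTML-style fragments, made up for this example.
properly_nested = "<p><b>bold <i>and italic</i></b></p>"
badly_nested = "<p><b>bold <i>and italic</b></i></p>"    # closing tags cross
unclosed = "<p>line one<br>line two</p>"                 # <br> is never closed
closed_empty = "<p>line one<br />line two</p>"           # <br /> closes itself

for fragment in (properly_nested, badly_nested, unclosed, closed_empty):
    print(is_well_formed(fragment), fragment)
```

An HTML parser in a browser would typically render all four fragments; the XML parser accepts only the first and the last, which is the stricter behavior the XHTML slide describes.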

So Why Don't We Use SGML?
  • Lack of web browser support, because all of its development occurred before the WWW
  • Too complicated to implement, because there are too many options
  • Little support for style sheets, and no agreed-upon standard exists for presenting SGML data

XML Advantages over SGML
  • A simplified subset of SGML (ISO 8879)
  • Very powerful and easy to implement
  • Small enough for Web browsers
  • Internationalized from the beginning
  • Unicode for both content and markup
  • Not a language but a meta-language
  • Designed to support the definition of an unlimited number of languages for specific industries: "Write once, parse anywhere"

Unicode
  • An international encoding standard for use with different languages and scripts, by which nearly every possible international character (letter, digit, or symbol) is assigned a unique numeric value that applies across different platforms and programs
  • XML is written in Unicode
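As a small illustration of "Unicode for both content and markup", the sketch below parses a document whose element names and text are written in non-Latin scripts. It uses Python's standard `xml.etree.ElementTree`; the document itself is a made-up example, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both the tags and the text use non-Latin scripts.
document = (
    "<книга>"
    "<название>Война и мир</название>"
    "<著者>トルストイ</著者>"
    "</книга>"
)

root = ET.fromstring(document)
names = [child.tag for child in root]    # element names are Unicode strings
texts = [child.text for child in root]   # so is the content

# Serializing to UTF-8 bytes and parsing again preserves every character.
encoded = ET.tostring(root, encoding="utf-8")
round_trip = ET.fromstring(encoded)

print(names)
print(texts)
```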

XML is Self Describing
Using XML tags we can:
  • Make relationships between data explicit
  • Prevent data loss due to the inability to know what the data means
  • Give domain-specific professions a way to mark up their information how they want it

Interoperability
  • XML can be used by a wide variety of platforms
  • Most major programming languages provide support
  • XML is a reliable, well-documented, and open standard

Markup Languages (MLs)
  • A markup language is a modern system for annotating a document in a way that is syntactically distinguishable from the text.*
  • Hundreds of MLs exist
  • Most developed in the past 20 years use XML
  • XML is a general-purpose markup language used to create domain-based MLs
  • W3C writes standards for XML and leaves ML development to individuals and authorities within domains
  • Can you think of a markup language?

*http://en.wikipedia.org/wiki/Markup_language (accessed 6/26/2012)
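To make the idea of a self-describing, domain-based vocabulary concrete, here is a minimal sketch in Python. The tag names (`campus`, `students_enrolled`, `zip_code`, and so on) and the values are invented for illustration and do not belong to any real markup language. The same digits appear twice, and only the tags tell a reader or a program which meaning applies.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary and made-up values, for illustration only.
record = """
<campus>
  <name>UAlbany</name>
  <students_enrolled>12203</students_enrolled>
  <address>
    <city>Albany</city>
    <zip_code>12203</zip_code>
  </address>
</campus>
"""

root = ET.fromstring(record)

# Collect every element whose text is the ambiguous number.
meanings = {
    element.tag: element.text
    for element in root.iter()
    if element.text and element.text.strip() == "12203"
}

# Nesting makes the relationship explicit: the zip code belongs to the address.
zip_code = root.find("address/zip_code").text

print(meanings)
print(zip_code)
```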

Markup Languages
  • MAchine Readable Cataloging (MARC): not XML, but MARC XML is!
  • Sports Markup Language (SportsML)
  • Mathematical Markup Language (MathML)
  • Green Building XML (gbXML)
  • Text Encoding Initiative (TEI)
  • Encoded Archival Description (EAD)

XML powers Web 2.0+
  • It's the backbone of the Semantic Web
  • Integral to Linked Data
  • Integral to Web 2.0 and 3.0
  • RSS feeds
  • AJAX functionality like auto suggest
  • Computer-generated reasoning

How might we think of XML?
  • XML is like a mixture of MS Word, a database, and HTML

XML Separates Content, Structure & Presentation

XML is like a Database
It also allows us to do things:
  • We can query it (as SQL queries a database; XML has XPath and XQuery for this)
  • Sort it
  • Update, add, & delete data

XML structures information
  • Like a database, XML structures our information
  • Because it structures data, it can be treated like a database, but it's more than that
  • XML also allows us to create narrative information in much the same way as a word processor
  • It's a hybrid, being both data and document centric

XML is data & document centric
  • XML can represent small pieces of information or data in a highly structured manner
  • Provides ways to reuse content
  • It can also represent documents in a highly structured manner
  • Popular in publishing
  • Text encoding initiatives

Hierarchical Data Representation

Root, Parents, Children, and Siblings too!
  • Library is the root node
  • Books node is child of Library node
  • Books node is parent of Title node
  • Books is a sibling of Books
Relationship & Node diagram: Library at the top with two Books nodes beneath it; each Books node has a Title node and an Author node.

XML is Human Readable
  • XML files can be read and inspected by a human and a computer!
  • Database, word processor, or spreadsheet files (pretty much most applications) are binary files
  • Binary files are only meaningful to computers
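The Library hierarchy and the database-like operations described above (query, sort, update, add, delete) can be sketched together in Python with the standard `xml.etree.ElementTree` module. The node names Library, Books, Title, and Author follow the slides; the titles and authors are placeholder values invented for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder catalog; only the node names come from the slides.
library = ET.fromstring("""
<Library>
  <Books><Title>Title B</Title><Author>Author Two</Author></Books>
  <Books><Title>Title A</Title><Author>Author One</Author></Books>
</Library>
""")

# Root, children, siblings: the Books nodes are children of Library
# and siblings of each other; Title and Author are children of Books.
root_name = library.tag
books = library.findall("Books")
child_names = [child.tag for child in books[0]]

# Query: titles written by a given author.
titles_by_author_one = [
    b.findtext("Title") for b in books if b.findtext("Author") == "Author One"
]

# Sort: reorder the Books children by title.
library[:] = sorted(library, key=lambda b: b.findtext("Title"))
sorted_titles = [b.findtext("Title") for b in library]

# Add: append a new Books node with its own children.
new_book = ET.SubElement(library, "Books")
ET.SubElement(new_book, "Title").text = "Title C"
ET.SubElement(new_book, "Author").text = "Author Three"

# Update: change the author of the first book.
library[0].find("Author").text = "Author One (revised)"

# Delete: remove the second book.
library.remove(library[1])
remaining_titles = [b.findtext("Title") for b in library]

print(sorted_titles, remaining_titles)
```

The query here is written as a plain Python filter; larger XML collections are more commonly queried with XPath or XQuery expressions.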

XML documents should be human legible and reasonably clear.
  • The meaning of an XML document should be more or less apparent from the tags
  • The value Terrier wrapped in a descriptive tag: GOOD
  • The value Terrier wrapped in a cryptic tag: POOR

XML is not Tersely Written
  • Since XML documents are plain text, they don't take much memory
  • So there's no point using cryptic abbreviations
  • Wrap a value such as Mark in a spelled-out tag name, not an abbreviated one
  • XML is supposed to be verbose

XML Documents shall be easy to create.
  • Many specialized editors are available, but you can write perfectly good XML with just a text editor
  • We will be using Notepad++

A Typical XML Architecture
  • Content (the XML document)
  • Format (the stylesheet)
  • Definition (the DTD)

How Data is Served to Web
Diagram: one XML Document feeds four XSL stylesheets, each of which produces its own HTML Document.
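The separation of content from format shown in the diagram can be sketched as follows. Note the assumption: Python's standard library has no XSLT processor, so two plain Python functions stand in for XSL stylesheets here. This is not XSLT itself, only an illustration of one XML document being rendered into several HTML documents. The function names and the sample content are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content document (the "XML Document" in the diagram).
content = ET.fromstring(
    "<Library>"
    "<Books><Title>Title A</Title><Author>Author One</Author></Books>"
    "<Books><Title>Title B</Title><Author>Author Two</Author></Books>"
    "</Library>"
)


def as_bulleted_list(library):
    """Stand-in for one stylesheet: render the books as an HTML list."""
    items = "".join(
        f"<li>{escape(b.findtext('Title'))} by {escape(b.findtext('Author'))}</li>"
        for b in library.findall("Books")
    )
    return f"<ul>{items}</ul>"


def as_table(library):
    """Stand-in for another stylesheet: render the books as an HTML table."""
    rows = "".join(
        f"<tr><td>{escape(b.findtext('Title'))}</td>"
        f"<td>{escape(b.findtext('Author'))}</td></tr>"
        for b in library.findall("Books")
    )
    return f"<table>{rows}</table>"


# Same content, two presentations; the XML document is never modified.
list_page = as_bulleted_list(content)
table_page = as_table(content)

print(list_page)
print(table_page)
```

In a real deployment an XSLT processor would apply an actual XSL stylesheet to the XML document; the point of the sketch is only that the presentation can change without touching the content.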

Sustainable Standard
  • Libraries, archives, banks, etc. rely on XML as a long-term standard to store their data
  • Human readable
  • Standards driven
  • Lightweight
  • Well supported
  • Open

IST 538 Fundamentals of XML
