Tuesday, November 2, 2010

Setting DTD standards for HTML/XML documents

The DOCTYPE declaration associates an XML or HTML file with its DTD (document type definition) file.

The DOCTYPE declaration is placed at the top of the XML/HTML file, before the root element.

In an XML file, the DOCTYPE declaration goes right below the XML declaration (<?xml version="1.0" encoding="utf-8"?>), as the second statement in the file. Keep in mind that the XML declaration is optional; when it is omitted, the DOCTYPE declaration comes first.
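
The ordering rule for XML can be checked with a small Python sketch using the standard library's xml.dom.minidom parser. The "note" element and "note.dtd" file are made-up names for illustration only, and the parser does not fetch the DTD.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical pieces: "note" and "note.dtd" are illustrative names only.
XML_DECL = '<?xml version="1.0" encoding="utf-8"?>'
DOCTYPE = '<!DOCTYPE note SYSTEM "note.dtd">'
ROOT = '<note/>'


def parses(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


# XML declaration first, DOCTYPE second.
declaration_first = parses(XML_DECL + "\n" + DOCTYPE + "\n" + ROOT)
# No XML declaration, so the DOCTYPE comes first.
doctype_only = parses(DOCTYPE + "\n" + ROOT)
# DOCTYPE placed above the XML declaration.
doctype_above_declaration = parses(DOCTYPE + "\n" + XML_DECL + "\n" + ROOT)

print(declaration_first, doctype_only, doctype_above_declaration)
```

The first two layouts should be accepted, while the third should be rejected because the XML declaration, when present, has to be the very first thing in the file.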

In an HTML file, the DOCTYPE declaration is placed at the top of the file, as the first statement.

DOCTYPE syntax

<!DOCTYPE
root_element_name
PUBLIC||SYSTEM
"[Registration]//[Organization]//[Type] [Label]//[Language]"
"dtd_file_link">

Example:
<!DOCTYPE application PUBLIC '-//Sun Microsystems, Inc.//DTD J2EE Application 1.3//EN' 'http://java.sun.com/dtd/application_1_3.dtd'>
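
To see how a parser reads the parts of this declaration, here is a minimal Python sketch using the standard library's xml.dom.minidom. The bare <application/> root element is added only so that the text forms a complete document; it is an assumption for illustration, not part of the original example. The parser records the identifiers and does not download the DTD.

```python
from xml.dom import minidom

# The DOCTYPE from the example above, followed by a minimal root element
# (the empty <application/> element is added here for illustration).
document = (
    "<!DOCTYPE application PUBLIC "
    "'-//Sun Microsystems, Inc.//DTD J2EE Application 1.3//EN' "
    "'http://java.sun.com/dtd/application_1_3.dtd'>"
    "<application/>"
)

doctype = minidom.parseString(document).doctype

print(doctype.name)      # root element name
print(doctype.publicId)  # public identifier string
print(doctype.systemId)  # DTD link
```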

root_element_name - the name of the root element of the HTML/XML document

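PUBLIC||SYSTEM - PUBLIC is used when the DTD has a public identifier (the "[Registration]//..." string described below); it is followed by the "dtd link", typically an http link to a DTD published on the net. SYSTEM is used when the DTD is identified by its "dtd link" alone, with no public identifier; this is common for a DTD kept in a local source, addressed via a file:/// link or a relative path.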

Quotes can be double or single, but each quoted string must open and close with the same kind of quote.

"[Registration]//[Organization]//[Type] [Label]//[Language]"
[Registration] - Indicated by either a plus ("+") or minus ("-"). A plus symbol indicates that the organization name that follows is ISO-registered. A minus sign indicates the organization name is not registered. The IETF and W3C are not registered ISO organizations and thus use a "-".
[Organization] - This is the "OwnerID" - a unique label indicating the name of the entity or organization responsible for the creation and/or maintenance of the artifact (DTD, etc.) being referenced by the DOCTYPE. The IETF and W3C are the two originating organizations of the official HTML/XHTML DTDs.
[Type] - This is the "Public Text Class" - the type of object being referenced. There are many different keywords possible here, but in the case of an HTML/XHTML DTD, it is "DTD" - a Document Type Definition.
[Label] - This is the "Public Text Description" - a unique descriptive name for the public text (DTD) being referenced. If the public text changes for any reason, a new Public Text Description string should be created for it.
[Language] - This is the "Public Text Language": the natural language in which the referenced object is written. It is written as an ISO 639 language code (uppercase, two letters). The official HTML/XHTML DTDs are written in English ("EN").
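
The five fields above can be pulled apart with a few lines of Python. This is only a sketch: split_public_id and the PublicId tuple are hypothetical names introduced here, and the helper assumes the four-part "//"-separated form described above (it does not handle other forms of public identifier).

```python
from typing import NamedTuple


class PublicId(NamedTuple):
    registration: str
    organization: str
    text_class: str  # the [Type] field, e.g. "DTD"
    label: str
    language: str


def split_public_id(public_id: str) -> PublicId:
    """Split a public identifier into its five fields (hypothetical helper)."""
    parts = [part.strip() for part in public_id.split("//")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 '//'-separated parts, got {len(parts)}")
    registration, organization, description, language = parts
    # The first word of the third part is the type; the rest is the label.
    text_class, _, label = description.partition(" ")
    return PublicId(registration, organization, text_class, label, language)


fields = split_public_id("-//Sun Microsystems, Inc.//DTD J2EE Application 1.3//EN")
print(fields)
```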

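dtd_file_link - the link to the DTD file (the system identifier). In an HTML document it is optional after a public identifier; in an XML document it is required. Without it, an HTML document is identified by its public identifier alone and follows the corresponding standard given by the W3C.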
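
As a rough illustration of the SYSTEM form and of the dtd_file_link in XML, the Python sketch below again uses the standard library's xml.dom.minidom. The "application.dtd" path and the "-//Example//DTD Sample//EN" identifier are made-up values for illustration; the parser does not open the DTD file.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical local DTD: "application.dtd" is an illustrative relative path.
system_document = '<!DOCTYPE application SYSTEM "application.dtd"><application/>'
doctype = minidom.parseString(system_document).doctype

print(doctype.publicId)  # SYSTEM carries no public identifier
print(doctype.systemId)  # the DTD link

# In XML, a PUBLIC declaration without a DTD link is not well-formed.
# The public identifier below is a made-up value.
public_only = '<!DOCTYPE application PUBLIC "-//Example//DTD Sample//EN"><application/>'
try:
    minidom.parseString(public_only)
    public_only_accepted = True
except ExpatError:
    public_only_accepted = False

print(public_only_accepted)
```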