SGML
Welcome to the Wikiversity Center for the Study of SGML or Standard Generalized Markup Language. This is a content development project where participants create, organize and develop learning resources about SGML.
Remember this is a WIKI. In other words, criticism is good, but contributions are better.
Purpose
Untangle the SGML syntax by describing each production. The participant should be able to understand an SGML declaration and the basics of SGML.
Motivation
I think you can call it "computer archaeology" ;-) As the world has moved on, there is no practical interest behind this topic.
This is an overwhelming task, but IMHO it's worthwhile (and amusing?). As of 2008 there is no complete lay-oriented description of SGML (at least I haven't found any). Most of the SGML-related web resources that were available some years ago (by 2001) are now gone, and I think the English Wikipedia is not the right place for recreating this material (at least, until we can assure its quality).
Schedule
Honestly, I don't think this will be done overnight... maybe in a year or two.
Introduction
The Standard Generalized Markup Language (SGML) is a metalanguage in which one can define markup languages for documents. SGML is a descendant of IBM's Generalized Markup Language (GML), developed in the 1960s by Charles Goldfarb, Edward Mosher and Raymond Lorie.
SGML provides an abstract syntax that can be realized in many different concrete syntaxes. It was originally designed to enable the sharing of machine-readable documents in large projects in government, law and industry, which have to remain readable for several decades. It has also been used extensively in the printing and publishing industries, but its complexity has prevented its widespread application for small-scale general-purpose use.
Participants
I have no earthly idea why anyone in their right mind would sign up for this, but here's a place to do so:
- You ~~~~
- Mr Rho 07:38, 16 Nov 2017 (UTC) (I have no earthly idea why I signed up for this, but I am not in my right mind at most times)
- w:Rjgodoy 13:46, 2 May 2008 (UTC)
Content
(under construction)
An Overview of the SGML declaration
The SGML declaration is composed of:
- CHARSET: a description of the character set.
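- CAPACITY: limits the total amount of storage (counted in capacity points) that the entities, elements, attributes and other objects of a document may require.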
- SCOPE: whether the syntax applies to the document instance only, or both document prolog and instance.
- SYNTAX: the concrete syntax to be used within the document, which contains:
- a list of illegal characters (SHUNCHAR),
- a description of the character set used in the syntax (BASESET and DESCSET),
- the definition of special characters (FUNCTION),
- NAMING rules,
- a list of general and short-reference delimiters (DELIM),
- a list of reserved keywords for use in the DTD (NAMES),
- QUANTITY: restricts the maximum length of individual productions.
- FEATURES: optional features which modify the markup.
- APPINFO: application-specific information.
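To make the NAMING and QUANTITY entries a little more concrete, here is a minimal Python sketch of how a parser might decide whether a token is a valid name. It assumes the values given in the Reference Concrete Syntax listing on this page (NAMELEN 8, the extra name characters "-" and ".", NAMECASE GENERAL YES and ENTITY NO). The function names `is_name` and `normalize` are hypothetical, and case folding is simplified to plain ASCII upper-casing, so take it as an illustration rather than a conforming implementation.

```python
import string

# Values assumed from the reference concrete syntax (NAMING and QUANTITY).
LCNMSTRT = ""      # extra lower-case name start characters
UCNMSTRT = ""      # extra upper-case name start characters
LCNMCHAR = "-."    # extra lower-case name characters
UCNMCHAR = "-."    # extra upper-case name characters
NAMELEN = 8        # maximum length of a name
NAMECASE_GENERAL = True   # general names are folded to upper case
NAMECASE_ENTITY = False   # entity names keep their case

# Letters may always start a name; digits and the extra characters may follow.
NAME_START = set(string.ascii_letters) | set(LCNMSTRT) | set(UCNMSTRT)
NAME_CHARS = NAME_START | set(string.digits) | set(LCNMCHAR) | set(UCNMCHAR)


def is_name(token):
    """Return True if token is a valid name under the assumed syntax."""
    return (
        0 < len(token) <= NAMELEN
        and token[0] in NAME_START
        and all(ch in NAME_CHARS for ch in token)
    )


def normalize(token, entity=False):
    """Apply NAMECASE folding (simplified: ASCII upper-casing only)."""
    fold = NAMECASE_ENTITY if entity else NAMECASE_GENERAL
    return token.upper() if fold else token


for candidate in ["chapter", "sec-1.2", "1st", "blockquote"]:
    print(candidate, is_name(candidate))
```

Under these assumptions "chapter" and "sec-1.2" are accepted, "1st" is rejected because it starts with a digit, and "blockquote" is rejected because it is longer than NAMELEN. A declaration with a larger NAMELEN would accept it.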
Reference Concrete Syntax
<!SGML "ISO 8879:1986"
CHARSET
BASESET "ISO 646:1991//CHARSET IRV//ESC 2/8 4/2"
DESCSET
0 9 UNUSED
9 2 9 -- TAB, LF --
11 2 UNUSED
13 1 13 -- CR --
14 18 UNUSED
32 95 32
127 1 UNUSED
CAPACITY SGMLREF
TOTALCAP 35000
ENTCAP 35000
ENTCHCAP 35000
ELEMCAP 35000
GRPCAP 35000
EXGRPCAP 35000
EXNMCAP 35000
ATTCAP 35000
ATTCHCAP 35000
AVGRPCAP 35000
NOTCAP 35000
NOTCHCAP 35000
IDCAP 35000
IDREFCAP 35000
MAPCAP 35000
LKSETCAP 35000
LKNMCAP 35000
SCOPE DOCUMENT
SYNTAX
SHUNCHAR NONE -- do not change this --
BASESET "ISO 646:1991//CHARSET IRV//ESC 2/8 4/2"
DESCSET 0 128 0
FUNCTION
RE 13 -- CR --
RS 10 -- LF --
SPACE 32 -- SP --
TAB SEPCHAR 9 -- TAB --
NAMING
LCNMSTRT "" -- in addition to a..z --
UCNMSTRT "" -- in addition to A..Z --
LCNMCHAR "-." -- in addition to 0..9 --
UCNMCHAR "-." -- in addition to 0..9 --
NAMECASE
GENERAL YES
ENTITY NO
DELIM
GENERAL SGMLREF
MDO "<!" -- markup decl open --
MDC ">" -- markup decl close --
DSO "[" -- declaration subset open --
DSC "]" -- declaration subset close --
MSC "]]" -- marked section close --
COM "--" -- comment --
RNI "#" -- reserved name indicator --
LIT """ -- literal --
LITA "'" -- alternative literal --
GRPO "(" -- group open --
GRPC ")" -- group close --
AND "&" -- and connector --
OR "|" -- or connector --
SEQ "," -- seq connector --
OPT "?" -- opt occurrence indicator --
REP "*" -- rep occurrence indicator --
PLUS "+" -- plus occ ind, inclusion --
MINUS "-" -- exclusion, omission flag --
CRO "&#" -- character reference open --
ERO "&" -- entity reference open --
PERO "%" -- parameter entity reference open --
REFC ";" -- reference close --
PIO "<?" -- processing instruction open --
PIC ">" -- processing instruction close --
STAGO "<" -- start tag open --
ETAGO "</" -- end tag open --
TAGC ">" -- tag close --
NET "/" -- null end-tag --
VI "=" -- value indicator --
SHORTREF NONE
"&#TAB;"
"&#RE;"
"&#RS;"
"&#RS;B"
"&#RS;&#RE;"
"&#RS;B&#RE;"
"B&#RE;"
"&#SPACE;"
"BB"
"""
"#"
"%"
"'"
"("
")"
"*"
"+"
","
"-"
"--"
":"
";"
"="
"@"
"["
"]"
"^"
"_"
"{"
"|"
"}"
"~"
NAMES SGMLREF
-- available names for substitution, grouped by area of use.
names marked with (*) are overloaded and must be substituted
only once, but the translation needs to fit all uses.
DOCTYPE
ELEMENT
ANY
CDATA (*)
RCDATA (*)
PCDATA
EMPTY (*)
O
ATTLIST
ID
IDREF
IDREFS
ENTITY (*)
ENTITIES
NOTATION (*)
NAME
NAMES
NMTOKEN
NMTOKENS
NUTOKEN
NUTOKENS
NUMBER
NUMBERS
CDATA (*)
FIXED
CONREF
CURRENT
REQUIRED
IMPLIED (*)
ENTITY (*)
DEFAULT
STARTTAG
ENDTAG
MD
MS
PI
CDATA (*)
SDATA
NDATA
SUBDOC
SYSTEM
PUBLIC
(marked section keywords)
CDATA (*)
RCDATA (*)
IGNORE
INCLUDE
TEMP
NOTATION (*)
SHORTREF
USEMAP
EMPTY (*)
LINKTYPE
SIMPLE
IMPLIED (*)
LINK
INITIAL
IDLINK
USELINK
RESTORE
EMPTY (*)
POSTLINK
(named character entity references)
RE
RS
SPACE
--
QUANTITY SGMLREF
NAMELEN 8
LITLEN 240
PILEN 240
TAGLEN 960
ATTSPLEN 960
TAGLVL 24
ENTLVL 16
ATTCNT 40
GRPCNT 32
GRPGTCNT 96
GRPLVL 16
BSEQLEN 960
FEATURES
MINIMIZE
DATATAG NO
OMITTAG YES
RANK NO
SHORTTAG YES
LINK
SIMPLE NO -- YES requires number --
IMPLICIT NO
EXPLICIT NO -- YES requires number --
OTHER
CONCUR NO
SUBDOC NO -- YES requires number --
FORMAL YES
APPINFO NONE
>
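The DESCSET entries in the CHARSET section above can be read as triples: the first character number being described, how many consecutive characters the entry covers, and the matching number in the base set (or UNUSED). The following Python sketch is only an illustration and not part of any SGML tool; it expands those triples into a lookup table. The name `expand_descset` is hypothetical.

```python
# Each entry: (first described character number, count, base number or "UNUSED").
# Values copied from the CHARSET DESCSET of the declaration above.
DESCSET = [
    (0, 9, "UNUSED"),
    (9, 2, 9),          # TAB, LF
    (11, 2, "UNUSED"),
    (13, 1, 13),        # CR
    (14, 18, "UNUSED"),
    (32, 95, 32),
    (127, 1, "UNUSED"),
]


def expand_descset(entries):
    """Map each described character number to its base-set number.

    Characters declared UNUSED are mapped to None.
    """
    mapping = {}
    for start, count, base in entries:
        for offset in range(count):
            number = start + offset
            if number in mapping:
                raise ValueError(f"character number {number} described twice")
            mapping[number] = None if base == "UNUSED" else base + offset
    return mapping


charset = expand_descset(DESCSET)
used = sorted(n for n, base in charset.items() if base is not None)
print(len(charset), len(used))  # 128 98
```

With the values above, all 128 positions are described and 98 of them are usable: TAB, LF, CR and the characters 32 to 126.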
External Links
- SGML in the English Wikipedia
- The SGML Declaration, in SGML and HTML Explained, Martin Bryan, 1997.
- SGML Declarations, Wayne Wohler, IBM Corporation, 1994.
- The SGML Handbook, Charles F. Goldfarb, Oxford University Press, 1990. ISBN 0198537379, ISBN 9780198537373.