The Standard Generalized Markup Language (SGML) is a standard for defining generalized markup languages for documents. SGML is not itself a document language but a description of how to specify one; in that sense, it is a form of metadata.



SGML descends from IBM’s Generalized Markup Language (GML), which Charles Goldfarb, Edward Mosher, and Raymond Lorie created in the 1960s. Goldfarb coined the term “GML” from the initials of the three creators’ surnames and later wrote the definitive work on SGML syntax, The SGML Handbook.

SGML was designed to allow the sharing of machine-readable large-project documents in law, government, and industry. Many of these documents must remain readable for several decades. SGML was also widely used in technical references, industrial publishing, and the military and aerospace industries.


Concept of SGML and Its Advantages

The concept of SGML relies on the fact that documents have semantic and structural elements that can be described without reference to how those elements should be displayed. The actual display of a document may vary depending on the output medium and style preferences.
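
The separation of structure from display can be illustrated with a minimal sketch. The Python below is not SGML itself; the `Element` class, the element names (`doc`, `title`, `para`), and both renderers are hypothetical, chosen only to show one structural description producing two different displays.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from html import escape


@dataclass
class Element:
    """A structural element: a name and its content, with no display information."""
    name: str
    text: str = ""
    children: list[Element] = field(default_factory=list)


def render_plain(el: Element) -> str:
    """Render for a plain-text medium such as a terminal."""
    if el.name == "title":
        return el.text.upper() + "\n" + "=" * len(el.text)
    if el.name == "para":
        return el.text
    # Container element: render the children, separated by blank lines.
    return "\n\n".join(render_plain(c) for c in el.children)


def render_html(el: Element) -> str:
    """Render the same structure for a web browser."""
    tags = {"doc": "article", "title": "h1", "para": "p"}  # hypothetical mapping
    tag = tags[el.name]
    inner = escape(el.text) + "".join(render_html(c) for c in el.children)
    return f"<{tag}>{inner}</{tag}>"


doc = Element("doc", children=[
    Element("title", "Safety Manual"),
    Element("para", "Inspect the valve before each use."),
])

print(render_plain(doc))
print(render_html(doc))
```

The document tree never mentions fonts, capitalization, or tags; each renderer decides those for its own medium, which is the division of labor the paragraph above describes.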

Below are some known advantages of SGML-based documents:

  • They are more portable, since an SGML compiler can interpret any document by reference to its document type definition (DTD).
  • They can be created based on document structure rather than appearance characteristics, which can change over time.
  • Documents intended for the print medium can easily be adapted for other media, such as a computer display.
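
To make the role of a DTD more concrete, here is a rough sketch of checking a document against a declared structure. It is a toy, not a real SGML or DTD parser: the `memo` document type, the `CONTENT_MODEL` dictionary, and the `validate` function are all hypothetical, and the sample documents use XML syntax only because Python’s standard library can parse it.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Hypothetical content model, mirroring XML-style declarations such as
#   <!ELEMENT memo (to, from, body)>
#   <!ELEMENT to   (#PCDATA)>
# Each element name maps to the exact sequence of child elements it requires.
CONTENT_MODEL = {
    "memo": ["to", "from", "body"],
    "to": [],
    "from": [],
    "body": [],
}


def validate(el: ET.Element) -> list[str]:
    """Return a list of structural errors; an empty list means the tree conforms."""
    if el.tag not in CONTENT_MODEL:
        return [f"undeclared element <{el.tag}>"]
    errors = []
    expected = CONTENT_MODEL[el.tag]
    actual = [child.tag for child in el]
    if actual != expected:
        errors.append(f"<{el.tag}> contains {actual}, expected {expected}")
    # Check every child recursively against its own declaration.
    for child in el:
        errors.extend(validate(child))
    return errors


good = ET.fromstring("<memo><to>Ops</to><from>QA</from><body>Done.</body></memo>")
bad = ET.fromstring("<memo><to>Ops</to><body>Done.</body><note>Hi</note></memo>")

print(validate(good))
print(validate(bad))
```

Because the rules live in the document type rather than in any one program, any tool that reads the same declarations can interpret the same document, which is the portability advantage listed above. Real DTD content models also support choice, repetition, and optional elements, which this sketch leaves out.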

Hypertext Markup Language (HTML), the language that web browsers render, is the best-known example of an SGML-based language: HTML versions 2.0 through 4.01 were formally defined as SGML applications. Today, many documents are defined with the Extensible Markup Language (XML), a data description language that is a restricted form of SGML.



Document markup languages defined via SGML are called “applications” by the standard. Many pre-XML SGML applications were the proprietary property of the organizations that created them and were therefore unavailable on the World Wide Web.

Here’s a list of notable pre-XML SGML applications:

  • Text Encoding Initiative (TEI) – an academic consortium that designs, maintains, and develops technical standards for representing texts in digital form.
  • EDGAR (Electronic Data-Gathering, Analysis, and Retrieval) – a system that performs the automated collection, validation, indexing, acceptance, and forwarding of submissions by companies and other entities legally required to file forms with the US Securities and Exchange Commission (SEC).
  • LinuxDoc – used for documentation for Linux packages.
  • DocBook – a markup language initially created as an SGML application designed for authoring technical documentation.
  • SGMLguid – an early SGML document type definition (DTD) developed and used at the European Organization for Nuclear Research (CERN).
  • AAP DTD – a document type definition (DTD) for scientific documents developed by the Association of American Publishers.