Charles Goldfarb


The idea of markup languages was apparently first presented publicly by the engineer William W. Tunnicliffe (1922-1996) of Washington, D.C. In September 1967, during a meeting at the Canadian Government Printing Office, Tunnicliffe gave a presentation on separating the information content of documents from their format. In the 1970s, Tunnicliffe led the development of a standard called GenCode for the publishing industry, and he was later the first chair of the International Organization for Standardization committee that created SGML. At almost the same time, in the late 1960s, the book designer Stanley Rice published vague speculation along similar lines. Rice, an editor at a major publishing house, was writing about “Standardized Editorial Structures”. This was the beginning of a movement to separate the formatting of a document from its content.

In 1969 Charles F. Goldfarb, a graduate of Harvard Law School and Columbia College, hit upon the basic idea of markup languages while working on a primitive document management system intended for law firms. By the end of that year, leading a small team at IBM, he had developed the first markup language, called Generalized Markup Language, or GML. Goldfarb later explained that he had actually chosen the name GML to match the surname initials of the three researchers who worked on the project: Charles Goldfarb, Ed Mosher, and Ray Lorie. Goldfarb also coined the term “markup language.”

Goldfarb felt that GML should both describe the structure of a document and be structured so that it was both human-readable and machine-readable. In the early 1970s, he continued his work at IBM as GML grew in popularity. Several years later, in 1974, and with the influence of hundreds of people, the next version of the language, called Standard Generalized Markup Language (SGML), was born. SGML added concepts that were not part of the GML project, such as link processing, concurrent document types, and, most importantly, the concept of a validating parser (implemented as ARCSGML) that could read a document and check the correctness of its markup.
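
The two ideas in the paragraph above, markup that describes structure rather than appearance and a parser that checks the markup, can be illustrated with a minimal Python sketch. It assumes a simplified GML-like syntax in which each line starts with ":tag." followed by text. The tag set, the function names, and the sample document are hypothetical and chosen only for illustration; this is not the real GML or SGML grammar, and it is not how ARCSGML worked.

```python
import re

# Simplified, GML-like syntax: each line is ":tag." followed by text.
# Illustrative subset only; not the real GML or SGML grammar.
TAG_LINE = re.compile(r"^:(\w+)\.(.*)$")

# Hypothetical "document type": the tags this kind of document may use.
ALLOWED_TAGS = {"h1", "p", "li"}


def parse(source):
    """Turn marked-up text into a list of (tag, text) pairs."""
    elements = []
    for line in source.strip().splitlines():
        match = TAG_LINE.match(line.strip())
        if match is None:
            raise ValueError(f"not a tagged line: {line!r}")
        elements.append((match.group(1), match.group(2).strip()))
    return elements


def validate(elements):
    """Return the tags that the document type does not allow."""
    return [tag for tag, _ in elements if tag not in ALLOWED_TAGS]


def render_plain(elements):
    """One possible processor: the output style is chosen per tag."""
    styles = {"h1": str.upper, "p": str, "li": lambda s: "- " + s}
    return "\n".join(styles[tag](text) for tag, text in elements)


document = """
:h1.Brief for the court
:p.The facts are as follows.
:li.First precedent
:li.Second precedent
"""

elements = parse(document)
print(validate(elements))      # tags rejected by the document type, if any
print(render_plain(elements))  # same markup, one of many possible renderings
```

Because the markup names what each piece of text is rather than how it should look, a second processor (for example, one that builds an index of headings) could reuse the same parsed elements without any change to the document.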

As markup languages go, SGML was powerful, flexible, and complex, and it was used extensively to process IBM's huge body of documentation. It is widely known that when Tim Berners-Lee and Robert Cailliau created HTML at the end of the 1980s, they based their hypertext publishing language on SGML. HTML, an application of SGML, is by contrast easy to learn but not nearly as powerful. Their system ran on a NeXT computer and incorporated the concept of hyperlinks. Berners-Lee realized the need for a markup language that was easy to use and to implement in their system. In 1991 the Web debuted on the Internet, and it was the simplicity of HTML that made the Web grow at a feverish pace.

As Goldfarb later recalled in an interview:
We were trying to do an automated law-office application. I had been a lawyer (in fact, I still am). Lawyers must do research on existing case law, decisions of court, and so on, to find out which ones are applicable to a given situation, find out what the previous legal rulings have been, and then merge that with text that the lawyer has written himself. Eventually, if it’s, say, a brief for the court, he must then compose it and print it. At the time, which was 1969 or 1970, there weren’t any systems available that did these three things. So in order to get the systems to share the data we had to come up with a way to represent it that was independent of any of those applications.
It was a very small research project. There was initially myself and another researcher, Ed Mosher, working on it full time. Then, we had part-time consulting from a very brilliant fellow named Ray Lorie who is also one of the pioneers of relational databases. Ray had the most brilliant insight into the whole thing, which is that all the elements that are tagged the same way should be processed the same way. Our manager, Andy Symonds, contributed technically as well…

In 1975, Goldfarb moved from Cambridge, Massachusetts, to Silicon Valley and became a product planner at the IBM Almaden Research Center. There, he convinced IBM’s executives to deploy GML commercially in 1978 as part of IBM’s Document Composition Facility product. Development informally began that year on what ultimately became the SGML standard, and Goldfarb eventually became chair of the SGML committee. SGML was standardized and released by ISO in 1986 as ISO 8879.