Glossa (a formal language for semantic interoperability of computational resources) is the technical building block of The Agora. The Agora collects and transmits information to and from its partner databases using Glossa.
  • The fundamental paper that describes Glossa: "Semiotes: a Semantics for Sharing" (PostScript and PDF)
  • A technical report describing a bidirectional public-domain Glossa/SQL parser (PostScript).
  • The parser code itself is available as a tar file.
    You will need Java and TXL; the latter is available to academics for free from TXL Software Research, Inc.

Glossa is a language for specifying the semantics ("what the words mean") of ideas, data, and computations. ("Glossa" means "tongue" and "language" in Greek.)

With the diversity of databases and computational resources serving biology today, it would be wonderful if they could call upon each other's data and computations reliably and accurately. Unfortunately, that's not yet possible, because every database, like every scientist, has its own notion of what the words mean. Just try getting ten people to agree on the definition of a gene! For computations, the problem is knowing how to use the retrieved information correctly, assuming it can be retrieved at all.

The goal of Glossa is to provide a common language for defining the semantics of each computational resource's notions. Once these notions are computably defined, translation programs can translate the definitions. Because the definitions are computational constructs rather than natural language, and obey a set of grammatical rules, writing such translation programs is much easier.


One defines ideas with Glossa by building them from axioms, called semiotes, using the grammatical rules. A group of semiotes is called a bundle. Each semiote and each bundle has a four-part definition: formal and informal (natural language) definitions of both its syntax and its semantics.
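
As a rough illustration only, the relationship between semiotes, bundles, and their four-part definitions might be modelled as below. This is a sketch in Python, not Glossa itself and not the parser's data model: the class names (Definition, Semiote, Bundle), the field names, and the is_complete and names helpers are all hypothetical, and the definition strings are placeholders rather than real Glossa syntax.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Definition:
    """Hypothetical container for the four components of a definition."""
    formal_syntax: str
    informal_syntax: str       # natural-language description of the syntax
    formal_semantics: str
    informal_semantics: str    # natural-language description of the meaning

    def is_complete(self) -> bool:
        # A definition counts as complete here only if all four parts are non-empty.
        parts = (self.formal_syntax, self.informal_syntax,
                 self.formal_semantics, self.informal_semantics)
        return all(part.strip() for part in parts)


@dataclass(frozen=True)
class Semiote:
    """An axiom from which ideas are built (hypothetical representation)."""
    name: str
    definition: Definition


@dataclass
class Bundle:
    """A group of semiotes; it carries its own four-part definition."""
    name: str
    definition: Definition
    semiotes: List[Semiote] = field(default_factory=list)

    def names(self) -> List[str]:
        # Names of the member semiotes, in insertion order.
        return [s.name for s in self.semiotes]


# Placeholder strings only; none of this is real Glossa notation.
placeholder = Definition(
    formal_syntax="<formal syntax goes here>",
    informal_syntax="<plain-language syntax description>",
    formal_semantics="<formal semantics goes here>",
    informal_semantics="<plain-language meaning>",
)
example_semiote = Semiote(name="example_semiote", definition=placeholder)
example_bundle = Bundle(name="example_bundle", definition=placeholder,
                        semiotes=[example_semiote])
```

Keeping the four components together in one record makes it easy to check that neither the formal nor the informal half of a definition has been left out.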


Glossa is free and placed in the public domain. Please contact us if you think we need to add semiotes; it's very likely we do, but it's important to keep each one distinct.