Ontology for Words of the Day

If you go to the Word Archive page, there is a new graphic showing the archived Words of the Day organized into a hierarchy of basic semantic categories, an ontology. I’ll continue to add words of the day to this ontology; it is a work in progress, so feel free to add your comments and critiques. Designing ontologies is not easy, and it’s not wholly objective. The general categories I’ve chosen are by no means the only ones that might be appropriate, and astute readers will immediately wonder whether words might reasonably fit under more than one category. Maybe we need a network (well, a directed acyclic graph), not a simple tree of relationships. As time permits I hope to ‘pretty up’ the graphics as well, including an indication of what language each word belongs to (Spanish, English and Old English are currently undifferentiated, for instance). It would also be nice to link each word to its definition, so that mousing over a term shows you that information.

9 Responses to Ontology for Words of the Day

  1. bbear says:

    Okay, well, I don’t know a thing about linguistics, if that’s what this is a part of; and I’ve never before seen the word ‘ontology’ in this sense. Evidently it seeks to schematize words somewhat as the periodic table does atoms. And that’s fine, but the utility of atoms lies chiefly in their combination into molecules that make up the compounds which construct our world. So, to continue the analogy, we look for a chemistry of words, in which, besides a definitional axis, there are additional dimensions that bear on the word’s embedding in the phrases that make up messages…

    It’s easy to overlook these because most speakers today use the language in such a perfunctory way that the kinds of things I have in mind (cadence, timbre, texture, association) are vestigial, curled up out of sight like the notional extra dimensions of string theory. But if I write ‘the sun dropped below the horizon as the sea turned from viridian to amethyst to black,’ I’m communicating something different than if I choose ‘from green to purple to black.’ That is, I am if I can get the reader to cooperate and tap into his associations. In a sense, each message is a conspiracy between speaker and listener. Well, an agreement anyway. A comprehensive ontology might look at the terms of that agreement, though maybe the way I’ve defined it is so idiosyncratic as not to be possible…

    • achouston says:

      Nice analogy to chemistry. One big difference (analogies always ‘limp’ a bit, right?) is that the elements have a periodicity to them, hence the name ‘periodic’ table. Morphemes don’t appear to manifest this property, but how they combine is certainly deeply important (as it is for atoms in chemistry), as you point out. What the edges, or links, between nodes represent is an interesting question; what I’ve got now is basically ‘kind-of’: X is a kind of Y. (‘Part-of’ got snuck in there, too; it might be that ‘part-of’ should be a different type of link between terms.) Explicitly modeling semantic relationships such as ‘kind-of’ and ‘part-of’ is useful for computers that are trying to understand natural language, because they make inferences based on such models.
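
      To make the idea of typed links concrete, here is a minimal sketch in Python. It assumes each link is stored as (child, relation, parent), and it lets a term have more than one parent, so the structure is a graph rather than a tree. The class name, the method names and the sample categories are all hypothetical; they are not the categories in the Word Archive graphic.

```python
from collections import defaultdict


class Ontology:
    """Toy ontology with typed links (hypothetical sketch, not the site's data)."""

    def __init__(self):
        # links[relation][child] is the set of parents of `child` under `relation`.
        self.links = defaultdict(lambda: defaultdict(set))

    def add(self, child, relation, parent):
        """Record one link, e.g. add('viridian', 'kind-of', 'color')."""
        self.links[relation][child].add(parent)

    def ancestors(self, term, relation="kind-of"):
        """Return every term reachable from `term` by following `relation` links."""
        seen = set()
        stack = [term]
        while stack:
            node = stack.pop()
            for parent in self.links[relation].get(node, ()):
                if parent not in seen:  # also guards against accidental cycles
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def is_a(self, term, category):
        """Infer 'kind-of' transitively: X is a Y if Y is among X's ancestors."""
        return category in self.ancestors(term, "kind-of")


# Hypothetical sample data, chosen only for illustration.
onto = Ontology()
onto.add("viridian", "kind-of", "color")
onto.add("viridian", "kind-of", "pigment")   # two parents: a graph, not a tree
onto.add("color", "kind-of", "attribute")
onto.add("pigment", "kind-of", "substance")
onto.add("petal", "part-of", "flower")       # a different type of link

print(onto.is_a("viridian", "attribute"))
print(onto.is_a("viridian", "event"))
print(onto.ancestors("petal", "part-of"))
```

      Keeping ‘kind-of’ and ‘part-of’ in separate maps means an inference over one relation never wanders along the other, which is one way to treat ‘part-of’ as a different type of link.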

  2. bbear says:

    Or maybe not. Why wouldn’t a relational or object-relational database work as an off-the-shelf way to implement such a multi-dimensional ontology?…
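
    As a rough illustration of the relational option, here is a minimal sketch using Python’s built-in sqlite3 module. The table names (term, link), the columns and the sample rows are assumptions made up for this example; extra ‘dimensions’ such as language or syllable count are simply extra columns on the term table.

```python
import sqlite3

# In-memory database; the schema below is a hypothetical layout, not an existing one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (
    name      TEXT PRIMARY KEY,
    language  TEXT,      -- one extra 'dimension'
    syllables INTEGER    -- another, standing in for cadence
);
CREATE TABLE link (
    child    TEXT NOT NULL REFERENCES term(name),
    relation TEXT NOT NULL,          -- e.g. 'kind-of' or 'part-of'
    parent   TEXT NOT NULL REFERENCES term(name),
    PRIMARY KEY (child, relation, parent)
);
""")

conn.executemany("INSERT INTO term VALUES (?, ?, ?)", [
    ("viridian", "English", 4),
    ("amethyst", "English", 3),
    ("black", "English", 1),
    ("color", None, None),       # category nodes carry no word-level data here
    ("attribute", None, None),
])
conn.executemany("INSERT INTO link VALUES (?, ?, ?)", [
    ("viridian", "kind-of", "color"),
    ("amethyst", "kind-of", "color"),
    ("black", "kind-of", "color"),
    ("color", "kind-of", "attribute"),
])


def ancestors(conn, term, relation="kind-of"):
    """Follow `relation` links upward from `term` with a recursive query."""
    rows = conn.execute("""
        WITH RECURSIVE up(name) AS (
            SELECT parent FROM link WHERE child = ? AND relation = ?
            UNION
            SELECT link.parent FROM link JOIN up ON link.child = up.name
            WHERE link.relation = ?
        )
        SELECT name FROM up
    """, (term, relation, relation)).fetchall()
    return {name for (name,) in rows}


# Words ordered along the 'syllables' dimension, longest first.
by_cadence = [name for (name,) in conn.execute(
    "SELECT name FROM term WHERE syllables IS NOT NULL ORDER BY syllables DESC")]

print(ancestors(conn, "viridian"))
print(by_cadence)
```

    The recursive query uses UNION rather than UNION ALL, so duplicate rows are dropped and the walk stops even if a cycle is entered by mistake.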

  3. bbear says:

    Your point about periodicity is a good one. You always make me think. And I didn’t know about computers apprehending natural language the way you describe. So let me ponder it a bit and get back to you…

  4. bbear says:

    I thought about it and I think you’re on the right track. ‘Vermilion’ and ‘viridian’ each start with v-vowel-r and end with i-vowel-n, and they’re close on a lot of axes: cadence, timbre, frequency of occurrence. But definitionally they’re far apart (about 200 nm 😉 ) and in most situations that trumps everything else. You could substitute ‘vermilion’ for ‘viridian’ in the fragment above and get nearly as much sensory bang for the buck; a vermilion sea is a reach but not out of the question. But elsewhere (traffic lights, for instance, or bullfights) you’d be in trouble. Especially if you were expecting a computer to make inferences based on what it was hearing…

  5. achouston says:

    Hi BBear,

    The ontology can perhaps help a computer decide whether ‘vermilion’ and ‘viridian’ are colors or pigments rather than events or artifacts, for example. Within ‘colors’ we would need to add other features to distinguish these two. The ‘ontology’ of words of the day is not to be taken too seriously at this point; it’s a bottom-up approach to organizing the words as they continue to accrue, rather than just archiving them in a laundry list. By ‘bottom-up’ I mean that I’m only adding upper ‘kind-of’ and ‘part-of’ categories as they seem needed for the set of words so far. ‘Events’ and ‘artifacts’ at first approximation seem like different aspects of reality, at least at the level at which reality is described by natural human languages, and the point of an NL ontology is to attempt to capture and categorize language, not necessarily underlying reality, whatever that may be. And your ‘vermilion sea’ is not bad. Didn’t Homer speak of the ‘wine-dark sea’? 🙂

  6. bbear says:

    Wine-dark sea, yes. The place I was thinking of ( http://www.nature.nps.gov/air/WebCams/parks/porecam/porecam.cfm ) has a cold sea and a pounding surf against a desolate shore that sticks so far out into the Pacific it’s on another tectonic plate. You can actually see the change in the landform on either side of the road when you get close. There aren’t many sunny days out there, and toward dusk there’s often mist and a fog bank on the horizon that swallows what sun there is. So viridian is descriptive…

    Beyond that, I think the reason it works better here than vermilion is that it affords a descending triplet of syllables, viridian to amethyst to black, 4-3-1, so that the cadence itself mirrors the thing being described, the descent of the sun into the sea. And the hard ‘d’ in ‘viridian’ gives way to the softness of ‘amethyst’ (which contains ‘myst’ as a kind of bonus) the way visual forms soften in the gathering crepuscular gloom. I wish I could say I planned it that way, but it was an accident…

    In that vein, I leave you with the following from A. E. Housman: Scholar and Poet, by Norman Marlow (U. of Minnesota Press, 1958), which I found cited in Colin Dexter’s genre novel The Way Through The Woods, an Inspector Morse mystery (Crown, 1992):

    “Two of the most beautiful lines in Housman’s work are surely these:

    ‘And like a skylit water stood
    The bluebells in the azured wood.’

    Here again is a reflection in water, and this time the magic effect is produced by repeating the syllable ‘like’ inside the word ‘skylit’ but inverted as a reflection in water is inverted.”


  7. achouston says:

    Lots of great literary references here – thanks!
