Thursday, 5 April 2012

Educated guesses about ancestral shapes and sounds: what do dead languages like Latin and Greek sound like?

What might ancient creatures have looked like? What would dead languages have sounded like? And what are the evolutionary relationships between currently observed shapes and sounds? While we have widely accepted methods that allow us to speculate (in an educated fashion) about ancestral genetic sequences, we don't have well-developed approaches for shapes and functions.

I proposed to John Moriarty that we attempt to extend sequence inference to functions, and we got a grant with some most excellent colleagues: John Aston, Dorothy Buck and Vincent Macaulay. John and I wrote a paper in which we investigated this using Gaussian processes, a versatile mathematical tool. We showed that in some controlled settings we could take (functional) observations from the world and make sensible guesses about what their ancestors might have been. If you want to see a video of us implementing an experiment with the help of some schoolchildren then click here (a blog specifically about our school engagement is here). More or less, our task is to take the game of telephone and run it backwards to identify the original sound (a sound can be viewed as a curve, i.e. a function on the line, or as, e.g., a spectrogram, a function on the plane).

But once we suppose that we can reconstruct original sounds from mutated versions, we might hope to engage with some big and old questions: what do dead languages like Latin and Greek sound like? Can we use observations of contemporary speech sounds made at different leaves of linguistic trees (see the picture below) "to put probability distributions over" (make educated guesses about) possible ancestral speech sounds? One takes audio recordings of the same (sufficiently homologous) word in several different languages and attempts to make (probabilistic) inferences about the corresponding ancestral sounds.

On the left is Schleicher's original tree of Indo-European languages, from 1860. On the right is a numerical experiment where, given knowledge of the three black curves at the bottom and the evolutionary tree (thick black object), we can put a probability distribution over possible ancestral curves and sample from that distribution (the red curve is the mean, the blue curves a measure of standard deviation, and the dotted black curve a sample from that distribution).
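
For readers who like to see the arithmetic, here is a minimal Python sketch of the kind of calculation behind a numerical experiment like the one on the right. It is an illustration under stated assumptions, not the code from the paper: it assumes a covariance that factorises into a curve part and a tree part, an Ornstein-Uhlenbeck-style tree kernel exp(-d / length_scale) in tree distance d, a zero prior mean, a curve kernel with unit variance, and noise-free leaf curves recorded on a shared grid. The toy tree, its branch lengths, the three leaf curves and all function names are made up. Under those assumptions the posterior mean of the ancestral curve is just a weighted combination of the observed curves.

```python
import numpy as np


def tree_kernel(dist, length_scale):
    """Assumed OU-style covariance between two points at tree distance `dist`."""
    return np.exp(-np.asarray(dist, dtype=float) / length_scale)


def reconstruct_ancestor(leaf_curves, leaf_dists, anc_dists, length_scale=4.0):
    """Posterior mean of an ancestral curve under a separable GP (a sketch).

    leaf_curves: (n_leaves, n_grid) array, one observed curve per leaf.
    leaf_dists:  (n_leaves, n_leaves) tree distances between leaves.
    anc_dists:   (n_leaves,) tree distances from the ancestor to each leaf.
    """
    K = tree_kernel(leaf_dists, length_scale)        # leaf-to-leaf covariance
    k_star = tree_kernel(anc_dists, length_scale)    # ancestor-to-leaf covariance
    weights = np.linalg.solve(K, k_star)             # K^{-1} k_star (K is symmetric)
    mean = weights @ leaf_curves                     # weighted mix of leaf curves
    # Fraction of the prior variance left at the ancestor after seeing the leaves
    var_scale = 1.0 - k_star @ weights
    return mean, weights, var_scale


# Hypothetical tree: the root has children I and C; I has children A and B.
# Branch lengths: root-I = 1, I-A = 1, I-B = 1, root-C = 2.
leaf_dists = np.array([[0.0, 2.0, 4.0],
                       [2.0, 0.0, 4.0],
                       [4.0, 4.0, 0.0]])
root_dists = np.array([2.0, 2.0, 2.0])

# Made-up leaf curves on a shared grid (rows: A, B, C)
grid = np.linspace(0.0, 1.0, 50)
leaf_curves = np.vstack([np.sin(2 * np.pi * grid),
                         np.sin(2 * np.pi * grid + 0.3),
                         0.8 * np.sin(2 * np.pi * grid - 0.5)])

mean, weights, var_scale = reconstruct_ancestor(leaf_curves, leaf_dists, root_dists)
std = np.sqrt(var_scale)  # pointwise posterior sd, given the unit-variance curve kernel
```

In this toy tree the two closely related leaves A and B get equal weight, and each counts for less than the more distantly related leaf C, because A and B carry partly redundant information about the root.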

We just wrote a relatively non-technical paper in Trends in Ecology and Evolution, "Phylogenetic inference for function-valued traits: speech sound evolution" (free version here and not free version here), with authors: John Aston (Warwick Stats), Dorothy Buck (Imperial Maths), John Coleman (Oxford Phonetics), Colin Cotter (Imperial Aero), NJ, Vincent Macaulay (Glasgow Stats), Norman Macleod (Natural History Museum), John Moriarty (Manchester Maths), Andrew Nevins (UCL Linguistics). In it we suggest that we have all the tools to try to reconstruct ancient speech (we also have lots of people with strong opinions about what ancient speech might have sounded like). We also use the paper to emphasise that this approach could allow us to reconstruct evolutionary trees from (functional) data. John Coleman says that they're (informally) calling this activity of reconstructing past speech sounds "necro-phonetics". I think that's neat.

Nick