Monday, 8 January 2018

How cells adapt to progressive increase in mitochondrial mutation

Mitochondria produce the cell's major energy currency: ATP. When mitochondria become dysfunctional, this is associated with a variety of devastating diseases, from Parkinson's disease to cancer. Technological advances have allowed us to generate huge volumes of data about these diseases. However, it can be a challenge to turn these large, complicated datasets into a basic understanding of how the diseases work, so that we can come up with rational treatments.

We were interested in a dataset (see here) which measured what happened to cells as their mitochondria became progressively more dysfunctional. A typical cell has roughly 1000 copies of mitochondrial DNA (mtDNA), which carries the instructions for building some of the most important parts of the machinery responsible for making ATP in your cells. When mitochondrial DNA becomes mutated, these instructions accumulate errors, preventing the cell's energy machinery from working properly. Since each cell carries so many copies, it is natural to ask what happens as the fraction of mutated mtDNA (called 'heteroplasmy') gradually increases. We used mathematical modelling to explain how a cell attempts to cope with increasing levels of heteroplasmy, resulting in a wealth of hypotheses which we hope to explore experimentally in the future.

The central idea arising from our analysis of this large dataset is that cells seem to maintain the number of normal mtDNAs per unit cell volume as heteroplasmy initially increases from 0% mutant. We suggest they do this by shrinking. By getting smaller, cells reduce their energy demands as the fraction of mutant mtDNA increases, allowing them to balance their energy budget and keep energy supply equal to demand. However, cells can only get so small, and eventually they must change strategy. At a critical fraction of mutated mtDNA (h* in the cartoon above), we suggest that cells switch on an alternative energy production mode called glycolysis. This increases energy supply, and as a result cells grow larger again. These ideas, as well as experimental proposals to test them, are freely available in our Biochemical Journal paper "Mitochondrial DNA Density Homeostasis Accounts for a Threshold Effect in a Cybrid Model of a Human Mitochondrial Disease". Juvid, Iain and Nick
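As a postscript for the quantitatively minded, here is a minimal numerical sketch of this picture. The relation V(h) = V0(1 - h) below follows from holding the wild-type mtDNA density at its h = 0 value while assuming a fixed total copy number; the threshold h*, the glycolytic growth rate and all other parameter values are illustrative assumptions, not numbers taken from the paper.

```python
import numpy as np

# Toy model: how cell volume might vary with heteroplasmy h if the cell
# defends a constant density of wild-type (non-mutant) mtDNA.
# All parameter values are illustrative assumptions.

N_TOTAL = 1000           # total mtDNA copies per cell (assumed fixed)
V0 = 1.0                 # cell volume at h = 0 (arbitrary units)
H_STAR = 0.9             # assumed threshold heteroplasmy for glycolysis
GLYCOLYTIC_GROWTH = 0.5  # assumed rate of volume recovery above h*

def wild_type_copies(h):
    """Number of non-mutant mtDNA copies at heteroplasmy h."""
    return N_TOTAL * (1.0 - h)

def cell_volume(h):
    """Cell volume as a function of heteroplasmy.

    Below h*: keeping wild-type density at its h = 0 value, i.e.
    N(1 - h) / V = N / V0, implies V(h) = V0 * (1 - h).
    Above h*: glycolysis boosts energy supply and the cell grows
    again (modelled here, arbitrarily, as linear growth).
    """
    if h < H_STAR:
        return V0 * (1.0 - h)
    return V0 * (1.0 - H_STAR) + GLYCOLYTIC_GROWTH * (h - H_STAR)

for h in np.linspace(0.0, 1.0, 11):
    v = cell_volume(h)
    print(f"h = {h:.1f}  volume = {v:.2f}  "
          f"wild-type density = {wild_type_copies(h) / v:7.1f}")
```

Below the threshold the wild-type density stays pinned at its h = 0 value while the volume falls; above it, density homeostasis is abandoned and the volume recovers.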

Tuesday, 8 August 2017

What we learn from the learning rate

Cells need to sense their environment in order to survive. For example, some cells measure the concentration of food or the presence of signalling molecules. We are interested in studying the physical limits to sensing with limited resources, to understand the challenges faced by cells and to design synthetic sensors.

We have recently published a paper 'What we learn from the learning rate' (free version) in which we explore the interpretation of a metric called 'the learning rate' that has been used to measure the quality of a sensor (e.g. here). Our motivation is that a number of metrics have been applied in this field to make statements about the quality of sensing, or about limits to sensory performance (a metric is a number you can calculate from the properties of the sensor that, ideally, tells you how good the sensor is). For example, a limit of particular interest is the energy required for sensing. However, it is not always clear how to interpret these metrics. We want to find out what the learning rate means: if one sensor has a higher learning rate than another, what does that tell you?

The learning rate is defined as the rate at which changes in the sensor increase the information the sensor has about the signal. The information the sensor has about the signal is how much your uncertainty about the state of the signal is reduced by knowing the state of the sensor (this is known as the mutual information). From this definition it seems plausible that the learning rate could be a measure of sensing quality, but it is not obvious. Our approach is a test to destruction: challenge the learning rate in a variety of circumstances, and try to understand how it behaves and why.
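For reference, these two quantities can be written down explicitly. The first expression is the standard mutual information for discrete states; the second is the bipartite-system form of the learning rate used in this literature, as we read it, where w^s_{x→x'} denotes the sensor's transition rate from state x to x' while the signal sits in state s:

```latex
I(S;X) \;=\; \sum_{s,x} p(s,x)\,\log \frac{p(s,x)}{p(s)\,p(x)},
\qquad
l_X \;=\; \sum_{s}\sum_{x \neq x'} p(s,x)\, w^{s}_{x \to x'}\,
\ln \frac{p(s \mid x')}{p(s \mid x)} .
```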

To do this we need a framework to model a general signal and sensor system. The signal hops between discrete states and the sensor also hops between discrete states in a way that follows the signal. A simple example is a cell using a surface receptor to detect the concentration of a molecule in its environment.

The figure shows such a system. The circles represent the states and the arrows represent transitions between them. The signal is the concentration of a molecule in the cell’s environment. It can be in two states, high or low, where the high concentration is double the low one. The sensor is a single cell-surface receptor, which can be either unbound or bound to a molecule. The joint system can therefore be in four different states. The concentration jumps between its states with rates that don’t depend on the state of the sensor. The receptor becomes unbound at a constant rate and bound at a rate proportional to the molecule concentration.
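To make this concrete, here is a minimal sketch in code: we build the generator of the joint four-state Markov chain, solve for its stationary distribution, and compute the mutual information between signal and sensor. All rate constants are illustrative choices, not values from the paper.

```python
import numpy as np

# Joint (signal, sensor) system from the figure: signal in {low, high},
# receptor in {unbound, bound}. All rate constants are illustrative.

K_UP, K_DOWN = 1.0, 1.0        # signal switching rates (sensor-independent)
K_OFF = 2.0                    # unbinding rate (constant)
K_BIND = 1.0                   # binding rate per unit concentration
C = {"low": 1.0, "high": 2.0}  # 'high' is double the 'low' concentration

states = [(s, x) for s in ("low", "high") for x in ("unbound", "bound")]
idx = {st: i for i, st in enumerate(states)}

Q = np.zeros((4, 4))           # generator: Q[i, j] = rate from i to j
for (s, x), i in idx.items():
    other_s = "high" if s == "low" else "low"
    Q[i, idx[(other_s, x)]] = K_UP if s == "low" else K_DOWN
    if x == "unbound":
        Q[i, idx[(s, "bound")]] = K_BIND * C[s]  # binding ~ concentration
    else:
        Q[i, idx[(s, "unbound")]] = K_OFF
np.fill_diagonal(Q, -Q.sum(axis=1))

# Stationary distribution: solve pi Q = 0 with sum(pi) = 1.
A = np.vstack([Q.T, np.ones(4)])
b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

# Mutual information between signal and sensor, in bits.
p = pi.reshape(2, 2)           # rows: signal state, cols: sensor state
ps, px = p.sum(axis=1), p.sum(axis=0)
mi = sum(p[i, j] * np.log2(p[i, j] / (ps[i] * px[j]))
         for i in range(2) for j in range(2))
print("stationary distribution:", np.round(pi, 3))
print(f"mutual information: {mi:.4f} bits")
```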

We calculated the learning rate for several systems, including the one above, and compared it to the mutual information between the signal and the sensor (a refined measure of the correlation between them). We found that in the simplest case, shown in the figure, the learning rate essentially reports the correlation between the sensor and the signal, and so it shows you the same thing as the mutual information. In more complicated systems the learning rate and mutual information show qualitatively different behaviour. This is because the learning rate actually reflects the rate at which the sensor must change in response to the signal, which is not, in general, equivalent to the strength of correlations between the signal and sensor. Therefore, we do not think that the learning rate is useful as a general metric for the quality of a sensor. Rory, Nick and Tom
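As a postscript, here is a companion sketch that computes the learning rate for the same toy system, using the bipartite-system expression given earlier (only sensor jumps contribute; rates are the same illustrative values as above):

```python
import numpy as np

# Learning rate of the toy receptor: the rate at which sensor jumps
# increase the sensor's information about the signal,
#   l_X = sum over s, x -> x' of pi(s,x) * w(x->x'|s) * ln[p(s|x')/p(s|x)].
# Setup repeated from the previous sketch; rates are illustrative.

K_UP, K_DOWN, K_OFF, K_BIND = 1.0, 1.0, 2.0, 1.0
C = {"low": 1.0, "high": 2.0}
states = [(s, x) for s in ("low", "high") for x in ("unbound", "bound")]
idx = {st: i for i, st in enumerate(states)}

Q = np.zeros((4, 4))
for (s, x), i in idx.items():
    other_s = "high" if s == "low" else "low"
    Q[i, idx[(other_s, x)]] = K_UP if s == "low" else K_DOWN
    other_x = "bound" if x == "unbound" else "unbound"
    Q[i, idx[(s, other_x)]] = K_BIND * C[s] if x == "unbound" else K_OFF
np.fill_diagonal(Q, -Q.sum(axis=1))

A = np.vstack([Q.T, np.ones(4)])
pi, *_ = np.linalg.lstsq(A, np.array([0.0, 0.0, 0.0, 0.0, 1.0]), rcond=None)

p = pi.reshape(2, 2)             # rows: signal (low, high); cols: sensor
p_s_given_x = p / p.sum(axis=0)  # conditional distribution p(s | x)

lrate = 0.0
for (s, x), i in idx.items():    # sum over sensor jumps only
    other_x = "bound" if x == "unbound" else "unbound"
    j = idx[(s, other_x)]
    si = 0 if s == "low" else 1
    xi = 0 if x == "unbound" else 1
    lrate += pi[i] * Q[i, j] * np.log(p_s_given_x[si, 1 - xi]
                                      / p_s_given_x[si, xi])
print(f"learning rate: {lrate:.4f} nats per unit time")
```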

Tuesday, 18 July 2017


Complex (adj.): 1. Consisting of many different and connected parts. ‘A complex network of water channels’.

Oxford English Dictionary

‘Complex systems’ – like cells, the brain or human society – are often defined as those whose interesting behaviour emerges from the interaction of many connected elements. A simple but particularly useful representation of almost any complex system is therefore a network (aka a graph). When the connections (edges) between elements (nodes) have a direction, this takes the form of a directed network. For example, to describe interactions in an ecosystem, ecologists use directed networks called food webs, in which each species is a node and directed edges (usually drawn as arrows) go from prey to their predators. The last two decades have witnessed a great deal of research into the properties of networks, and how their structure relates to aspects of complex systems, such as their dynamics or robustness. In the case of ecosystems, it has long been thought that their remarkable stability – in the sense that they don’t tend to succumb easily to destructive avalanches of extinctions – must have something to do with their underlying architecture, especially given May’s paradox: mathematical models predict that ecosystems should become more unstable with increasing size and complexity, but this doesn’t seem to happen to, say, rainforests or coral reefs.

Trophic coherence

In 2014 we proposed a solution to May’s paradox: the key structural property of ecosystems is a food-web feature called “trophic coherence”. Ecologists classify species by trophic level in the following way. Plants (nodes with no incoming edges) have level one, herbivores (species whose only incoming edges come from plants) are at level two, and, in general, the level of any species is defined as the average level of its prey, plus one. Thus, if the network in the top left-hand corner of the figure below represented a food web, the nodes at the bottom would be plants (level 1), the next ones up herbivores (level 2), the next primary carnivores (level 3), and so on. In reality, though, food webs are never quite so neatly organised, and many species prey on various levels, making food webs a bit more like the network in the top right-hand corner, where most species have a fractional trophic level. To measure this degree of order, which we called trophic coherence, we attributed to each directed edge a “trophic difference”, the difference between the levels of the predator and the prey, and looked at the statistical distribution of differences over all the edges in the network. We called the standard deviation of this distribution the “incoherence parameter”, q: a perfectly coherent network like the one on the left has q=0, while a more incoherent one like that on the right has q>0 – in this case, q=0.7.
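The definition above translates directly into code. Here is a minimal sketch, on a small network invented for illustration: basal nodes get level 1, every other node gets one plus the mean level of its prey (a linear system), and q is the standard deviation of the trophic differences across edges.

```python
import numpy as np

# Trophic levels and incoherence parameter q for a toy directed network.
# Edges point from prey to predator; the network is invented for illustration.

edges = [(0, 2), (1, 2), (1, 3), (2, 4), (3, 4), (1, 4)]
n = 5
A = np.zeros((n, n))
for prey, predator in edges:
    A[prey, predator] = 1.0

k_in = A.sum(axis=0)        # in-degree = number of prey of each node

# Levels: s_i = 1 for basal nodes (k_in = 0); otherwise
# s_i = 1 + (mean level of i's prey). Solve (I - Lambda) s = 1,
# where Lambda_ij = A_ji / k_in_i for nodes with prey.
Lam = np.zeros((n, n))
has_prey = k_in > 0
Lam[has_prey, :] = A.T[has_prey, :] / k_in[has_prey, None]
s = np.linalg.solve(np.eye(n) - Lam, np.ones(n))

# Trophic differences over edges: their mean is 1 by construction,
# and q is their standard deviation, q = sqrt(<x^2> - 1).
diffs = np.array([s[pred] - s[prey] for prey, pred in edges])
q = np.sqrt(np.mean(diffs**2) - 1.0)
print("trophic levels:", np.round(s, 3))
print(f"incoherence parameter q = {q:.3f}")
```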

It turns out that the trophic coherence of food webs is key to their stability, and when we simulated (i.e. generated in the computer) networks with varying levels of coherence, we found that, for sufficiently coherent ones, the relationship between size and stability is inverted. Although there are plenty of caveats to this result – not least the question of how one should measure stability – it suggests a solution to May’s paradox. Since then, further research has shown that trophic coherence affects other structural and dynamical properties of networks – for instance, whether a cascade of activity will propagate through a neural network (example papers here, here and here!). But all these results were somewhat anecdotal, since we didn’t have a mathematical theory relating trophic coherence to other network features. This is what we set out to do in our most recent paper.

Figure. Four directed networks, plotted so that the height of each node on the vertical axis is proportional to its trophic level. The top two are synthetic networks, generated in a computer with the ‘preferential preying model’, which allows the user to tune trophic coherence [1,3]. They both have the same numbers of nodes and edges, but the one on the left is perfectly coherent (q=0) while the one on the right is more incoherent (q=0.7). The bottom two are empirically derived: the one on the left is the Ythan Estuary food web, which is significantly coherent (it has q=0.42, about 15% of its expected q) and belongs to the loopless regime; the one on the right is a representation of the Chlamydia pneumoniae metabolic network, which is significantly incoherent (q=8.98, or about 162% of the random expectation) and sits in the loopful regime. The top two networks are reproduced from the SI of Johnson et al., PNAS, 2014 [1], while the bottom two are from the SI of Johnson & Jones, PNAS, 2017 [5].


In statistical physics one thinks about systems in terms of ensembles – the sets of all possible systems which satisfy certain constraints – and this method has also been used in graph theory. For example, the Erdős–Rényi ensemble comprises all possible networks with given numbers of nodes N and edges L, while the configuration ensemble also specifies the degree sequence (the degree of a node being its number of neighbours). We defined the “coherence ensemble” as the set of all possible directed networks which have not only given N, L and degree sequences (each node in a directed network has two degrees, one in and one out) but also a specified trophic coherence. This allowed us to derive equations for the expected values of various network properties as a function of trophic coherence; in other words, the values we should expect to measure in a network, given its trophic coherence (and the other specified constraints), if we had no further knowledge of its structure.

Many network properties are heavily influenced by cycles – that is, paths through a network which begin and end at the same node. For example, in a food web you might find that eagles eat snakes, which eat squirrels, which eat eagles (probably in egg form), thus forming a cycle of length three. These cycles (properly called ‘directed cycles’ in directed networks), or loops, are related to various structural and dynamical features of complex systems. For example, feedback loops can destabilise ecosystems, mediate self-regulation of genes, or maintain neural activity in the brain. Furthermore, it had been reported that certain kinds of network – in particular, food webs and gene regulatory networks – often have either no cycles at all or only a small number of quite short ones. This was surprising, because in (arbitrarily large) random networks the number of cycles of length l grows exponentially with l, so it was assumed that there must be some evolutionary reason for this “looplessness”.

We were able to use our coherence-ensemble approach to derive the probability that a randomly chosen path is a cycle, as a function of q. From there we could obtain expected values for the number of cycles of length l, and for other quantities related to stability (in particular, for the eigenspectrum of the adjacency matrix, which captures the total extent of feedback in a system). It turns out that the number of cycles does indeed depend exponentially on length, but via a factor τ which is a function of trophic coherence. For sufficiently coherent networks, τ is negative, and hence the expected number of cycles of length l falls rapidly to zero; in fact, such networks have a high chance of being completely acyclic. Thus, our theory predicts that networks can belong to either of two regimes, depending on the “loop exponent” τ: a loopful one with lots of feedback, or a loopless one in which networks are acyclic or have just a few short cycles. A comparison with a large set of networks from the real world – including networks of species, genes, metabolites, neurons, trading nations and English words – shows that this is indeed so, and almost all of them are very close to our expectations given their trophic coherence.
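To see concretely how the adjacency spectrum captures feedback, note that trace(A^l) counts the closed walks of length l in a network (a proxy for its cycles) and, for large l, grows like λ1^l, where λ1 is the leading eigenvalue of the adjacency matrix A. A short sketch with two toy networks invented for illustration, one acyclic and one with a single feedback edge:

```python
import numpy as np

# Closed walks of length l are counted by trace(A^l) and grow like
# lambda_1 ** l, where lambda_1 is the leading adjacency eigenvalue.
# Toy networks for illustration: acyclic vs. one added feedback edge.

def closed_walks(A, lengths):
    return {l: int(np.trace(np.linalg.matrix_power(A, l))) for l in lengths}

n = 4
acyclic = np.zeros((n, n))
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3)]:  # all edges point "upwards"
    acyclic[i, j] = 1.0

loopy = acyclic.copy()
loopy[3, 0] = 1.0                              # one feedback edge: 3 -> 0

for name, A in [("acyclic", acyclic), ("loopy", loopy)]:
    lam1 = max(abs(np.linalg.eigvals(A)))
    print(f"{name}: leading eigenvalue = {lam1:.3f}, "
          f"closed walks = {closed_walks(A, [2, 3, 4, 6])}")
```

In the acyclic network every eigenvalue is zero and all the walk counts vanish (the loopless regime); adding a single feedback edge makes λ1 positive, and the counts grow exponentially with length.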

Our theory can also be used to ask how close quantities such as trophic coherence, or mean trophic level, are to their random expectations, given just N, L and the degree sequences, for any real directed network. We found, for example, that the food webs in our dataset tended to be very coherent, while networks derived from metabolic reactions were significantly incoherent (see the bottom two networks in the figure: the one on the left is a food web and the one on the right is a metabolic network). Our gene regulatory networks are interesting in that, while often quite coherent in absolute terms, they are in fact very close to their random expectation.

Open questions

This work leaves open many new questions. Why are some networks significantly coherent, and others incoherent? We can guess at the mechanism behind food-web coherence: the adaptations which allow a given predator, say a wolf, to hunt deer are also useful for catching prey like goats or elk, which have similar characteristics because they, in turn, have similar diets – i.e. trophic levels. This correlation between trophic levels and node function might be more general. For example, we have shown that in a network of words which are concatenated in a text, trophic level serves to identify syntactic function, and something similar may occur in networks of genes or metabolites. If edges tend to form primarily between nodes with certain functions, this might induce coherence or incoherence. Some networks, like the artificial neural networks used for “deep learning”, are deliberately coherent, which suggests another question: how does coherence affect the performance of different kinds of system? Might there be an optimal level of trophic coherence for neural networks? And how might it affect financial, trade, or social networks, which can, in some sense, be considered human ecosystems? We hope topics such as these will attract the curiosity of other researchers who can make further inroads. You can read our paper “Looplessness in networks is linked to trophic coherence” for free here and also in the journal PNAS. Sam and Nick.