Thursday, 16 June 2011

Crisis and Changes in Communities of Assets

The global financial system is composed of many different financial markets on which a diverse set of assets is traded. Because so many assets are traded on some markets, it can be convenient to think about groupings of them. For example, shares are assigned to industry sectors based on the business activities of their companies. These sectors provide a useful tool for sorting and comparing different companies, and it can be insightful to compare the performance of stocks within the same sector to identify any that are under- (or over-) performing. For some markets, however, an external classification like this is not possible. An alternative approach is to group assets based on the behaviour of their prices. Two assets which have strongly correlated price changes (they increase and decrease in value at similar times) would belong to the same group, and two assets which are weakly correlated would belong to different groups. Groups identified in this way can be useful for several reasons, but a familiar one is the construction of diversified portfolios to minimize investment risk; see the “Can we spread the risk?” post.

Much prior work along these lines focused on equity markets, so we set out to investigate the group structure of the foreign exchange (FX) market. To do this, we represented the FX market as a network in which each node represented an exchange rate (such as the EURUSD rate, which gives the number of US dollars that one receives in exchange for 1 euro) and each edge connecting a pair of exchange rates represented the strength of the correlation between those rates. A network like this is similar to a social network (such as a Facebook network) in which each node represents a person and two people are linked if they are friends. A group in the exchange rate network (known as a community) then corresponded to a set of nodes that had stronger links to each other than they did to the rest of the network.
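As a rough illustration of the idea (not the method used in the paper), the sketch below builds a small correlation network in Python. The return series are made up, the rescaling of correlations from [-1, 1] to edge weights in [0, 1] is an assumption chosen for simplicity, and the grouping step simply links rates whose edge weight exceeds a hypothetical threshold, which is a crude stand-in for a proper community-detection algorithm.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_network(returns):
    """Return {(rate_a, rate_b): weight} for every pair of rates.

    Assumption: weight = (correlation + 1) / 2, so that perfectly
    anti-correlated rates get weight 0 and perfectly correlated ones 1.
    """
    names = sorted(returns)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            edges[(a, b)] = (pearson(returns[a], returns[b]) + 1.0) / 2.0
    return edges


def threshold_groups(names, edges, threshold):
    """Crude grouping: merge rates joined by an edge of weight >= threshold."""
    parent = {name: name for name in names}

    def find(node):
        while parent[node] != node:
            node = parent[node]
        return node

    for (a, b), weight in edges.items():
        if weight >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Made-up daily returns (in percent) for three exchange rates.
returns = {
    "EURUSD": [0.2, -0.1, 0.3, -0.4, 0.1],
    "GBPUSD": [0.1, -0.2, 0.4, -0.3, 0.2],
    "USDJPY": [-0.3, 0.2, -0.1, 0.3, -0.2],
}
network = correlation_network(returns)
groups = threshold_groups(sorted(returns), network, threshold=0.75)
print(groups)
```

With these invented numbers, EURUSD and GBPUSD move together and end up in one group, while USDJPY sits on its own. Recomputing the network over a sliding time window would be one simple way to watch such groups change.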

Importantly, the exchange-rate groups that we found changed through time depending on market conditions, so we introduced several techniques to track the changing relationships between the rates. Using this approach, we were able to uncover major trading changes that occurred in the FX market during the 2007-2008 credit crisis and to identify the relative importance of the different rates.

You can find out more about this work in the paper, “Dynamic communities in multichannel data: An application to the foreign exchange market during the 2007–2008 credit crisis” Chaos 19, 1 (2009). Dan and Nick

Monday, 13 June 2011

The high-school hierarchy in protein networks

Proteins, the building blocks of cells, interact and form networks (each node is a protein and each link an interaction between them); see also Sumeet's post. We think that biological systems are modular. A module is a section of a whole that carries out its function relatively independently of the rest of the system (for example, consider the various components of your PC). Candidates for modules in a network are communities: groups of proteins that interact very closely with each other, and not so much with the rest of the network. Algorithms exist that can detect such communities.

But maybe there are communities inside each community. If you consider the social interaction network of a school (where nodes are pupils and links are friendships between them), you might expect there to be several large communities, one for each of the year groups. But, if pupils are more likely to be friends not only with someone in their year, but also with someone in their class, then each one of these large year-group communities would consist of several smaller communities, one for each of the class groups. At a yet smaller scale, each of these class-group communities might contain several friendship-group communities. In other words, there is structure of interest at many scales within the network.

We set out to investigate the multi-scale community structure of protein interaction networks. You can see this structure visualised in the image. On the top line (log(lambda)=-1) we are looking at the network at a low resolution: all the nodes are considered to be in one (purple) community. Moving down the figure (e.g. log(lambda)=1) increases resolution (like paying attention to classrooms instead of the whole school) and more structure is resolved: this community starts to split into several large communities. As we crank up the resolution yet higher (further down the image), these communities themselves split up.

Why are we interested in this structure? Proteins within a community might be expected to all carry out a similar task. We were interested in whether this was true at all scales: the result varied depending on which groups of proteins we were looking at. We were also interested because we simply don't know anything about many proteins, but perhaps the communities they are members of can suggest functions for them. In the school analogy, if all you knew about a pupil was which communities they were a member of, you could make a pretty good guess at many things particular to them, e.g. who their teacher was, or whether they'd be studying for exams.

You can read the full story in our paper "The Function of Communities in Protein Interaction Networks at Multiple Scales" in BMC Systems Biology 4, 100 (2010): here. You can also read more about this work in this short review in Biomedical Computational Review. Anna and Nick

Saturday, 11 June 2011

Promiscuous proteins: do proteins "date or party"?


Proteins are perhaps the major building blocks of the cell. They come in many different shapes and sizes, and they can join together (a bit like Lego) to make all sorts of useful structures. There are ways to come up with simple representations of such systems: one way of doing this is to think of them as networks (like we have computer networks, or railway networks, or Facebook networks). Displayed is a picture of a network of proteins; two proteins are joined if they interact (stick to each other). The different colours are different 'communities' of proteins. A community is just a group of proteins that interact a lot more amongst themselves than they do with outsiders. However, this picture isn't complete, because it's static. Cells are dynamic; they have a life-cycle, just like us, and they go through different stages: growing, dividing, dying. We are often interested in understanding what drives these changes; for instance, cancer happens when the growing and dividing stages go into overdrive. What is happening to the protein network as the cell goes through its different stages? At each stage, only part of the network is 'switched on'; the cell is making only those proteins needed for that stage. Imagine different parts of the network lighting up at different times. If we take this into account, can it help us to better understand what roles the different proteins are playing in the great cellular drama?

One interesting idea, suggested some years ago, was that if we focus on the seemingly important proteins, the ones that have many interactions (called 'hubs'), maybe by looking at when these interactions light up we can say something about what kind of protein it is. Supposing I am a hub protein in the network, with lots of partners. There could be two opposing scenarios: maybe all my partners get produced by the cell at the same time, and so all the interactions happen at once. In this case, it's like a big party; so I would be called a 'party hub'. On the other hand, it could be that my partners get switched on at different times (or places). In this case, my interactions happen one by one, like a sequence of dates, and so I would be called a 'date hub'. The idea that hubs came in two flavours, date and party, was quite exciting, because the two types seemed to have important roles in organising the whole network. Party hubs were like local coordinators: they helped to bring together many proteins with the same purpose. Date hubs were global organisers; they communicated between different parts of the network. Knowledge of what specific date and party hubs were doing could be a major step forward in understanding how the complicated protein cocktail produces specific kinds of cell behaviours.

Unfortunately, things turn out to be not so simple. Several people disputed the idea that 'date' and 'party' hubs really existed, presenting evidence (e.g. here and here, noting this response and this article and noting, in fact, that this literature moves fast!) that there was no consistent relationship between the pattern in which the interactions light up and the protein's role in organising the network. Despite this, the idea remained in vogue. In our recent article "Revisiting Date and Party Hubs: Novel Approaches to Role Assignment in Protein Interaction Networks", free in PLoS Computational Biology, we suggest that, based on their patterns of connections to different protein communities, the so-called date hubs (as defined) are not really any more likely to be global network coordinators than the party hubs. Moreover, protein hubs display a wide variety of 'lighting up' patterns for their interactions, and classifying them into just these two types is perhaps not carving nature at its joints.

It is not all bad news, however. So far, we have been thinking about roles for individual proteins. But what if we instead focus on interactions between proteins? In other words, what if we try to assign roles to the lines in the network, rather than the dots? Imagine that the lines are roads, joining up a bunch of cities. If I want to drive from one city to another, I will try to find the shortest path between them. Now, suppose we remove one of the lines; one road suddenly gets destroyed. How many of those shortest paths between cities have to be re-routed? If the answer is lots, then it means that the link we removed was important to efficiently connecting up the network. So, for each link, one way of measuring its importance is how many paths have to be re-routed if it's removed: this is called the link's betweenness. How is this betweenness relevant to the network of proteins? We found that the betweenness of a link is strongly related to the similarity of the two proteins joined by that link: the links with the highest betweenness tend to be interactions joining the most dissimilar proteins. This partly seems to mirror something observed in social networks, where a distinction can be made between 'weak' ties (or links) and 'strong' ties. Strong ties are close relations or friends; weak ties may be less familiar, or similar, acquaintances.
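To make the road analogy concrete, here is a minimal Python sketch of link betweenness on a toy network. It is an illustration only, not the code used in the paper: the six-node network is invented, and when a pair of nodes is joined by several equally short paths, each path is given an equal fractional share (the convention in Brandes' algorithm, adopted here as an assumption).

```python
from collections import deque


def edge_betweenness(adjacency):
    """Count, for each link, the shortest paths between node pairs that use it.

    `adjacency` maps each node to the set of its neighbours (undirected).
    """
    scores = {}
    for u in adjacency:
        for v in adjacency[u]:
            scores[frozenset((u, v))] = 0.0

    for source in adjacency:
        # Breadth-first search recording path counts and predecessors.
        distance = {source: 0}
        path_count = {node: 0 for node in adjacency}
        path_count[source] = 1
        predecessors = {node: [] for node in adjacency}
        order = []
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    path_count[neighbour] += path_count[node]
                    predecessors[neighbour].append(node)

        # Walk back from the farthest nodes, sharing out the paths.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = (path_count[pred] / path_count[node]) * (1.0 + dependency[node])
                scores[frozenset((pred, node))] += share
                dependency[pred] += share

    # Every unordered pair of nodes was counted once from each end.
    return {edge: score / 2.0 for edge, score in scores.items()}


# Toy network: two tightly knit triangles joined by a single bridge c-d.
toy_network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
scores = edge_betweenness(toy_network)
for edge, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(sorted(edge), score)
```

In this toy example the bridge between the two triangles carries every shortest path that crosses from one group to the other, so it has by far the highest betweenness, while a link inside a triangle such as a-b serves only its own two endpoints.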

We might imagine weak ties are important for communicating information across the network: for example, if you are looking for a job, it seems more likely that someone like a friend's colleague will be able to provide a useful tip than someone whom you know very well. Coming back to protein networks, if we think of betweenness as a way of measuring a link's importance for information flows between proteins, then our results indicate that here too the most important links are 'weak', in the sense that they are between dissimilar proteins that have different functions and are not part of the same group. This suggests that a deeper understanding of the roles played by specific links may help us to unravel the tangled webs of proteins that control and comprise cells, and thus ultimately, life itself. Sumeet and Nick. You can find a longer version of this article on Sumeet's blog.

Thursday, 9 June 2011

Can we spread the risk?


A primary concern for many financial-market practitioners is the strength of correlations between price changes of different assets; that is, whether prices move up or down at the same time. There are many reasons for investors to think about correlations, but perhaps the most familiar is risk management. If an investor owns strongly correlated assets then there is a high level of risk in their investments – decreases in the value of one asset are often accompanied by falls in the other assets. More generally, the strength of correlations is of interest because it can shed light on the state of the global economy. Because correlations can sometimes be explained by macroeconomic factors, looking at their levels can help to illuminate the forces driving markets.
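The link between correlation and risk can be seen in a textbook two-asset calculation, sketched below in Python. This is a standard formula rather than anything specific to our paper, and the weights and volatilities are made-up numbers: for weights w1 and w2 = 1 - w1, volatilities s1 and s2, and correlation rho, the portfolio variance is w1²s1² + w2²s2² + 2·w1·w2·rho·s1·s2.

```python
import math


def portfolio_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weight w1 in the first asset."""
    w2 = 1.0 - w1
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2.0 * w1 * w2 * rho * sigma1 * sigma2
    )
    return math.sqrt(variance)


# Hypothetical example: an equal split between two assets, each with 20% volatility.
for rho in (0.0, 0.5, 1.0):
    print(rho, round(portfolio_volatility(0.5, 0.2, 0.2, rho), 4))
```

With uncorrelated assets the combined volatility is noticeably lower than that of either asset alone; as the correlation approaches 1, the benefit of holding both disappears.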

Historically, assets from different markets tended to behave in different ways, which made it possible to achieve reasonable diversification by buying different types of asset. In our paper "Temporal Evolution of Financial Market Correlations", recently accepted by Physical Review E, however, we show that since the 2007-2008 credit crisis things are not that simple: assets that previously moved more or less independently now behave in a very similar manner. We demonstrate this phenomenon using principal component analysis and show that there has been a significant increase in correlations since the crisis. This has profound implications for risk management because diversification is now much more difficult. It also suggests that lots of different assets are now driven by the same economic forces. Dan and Nick

Friday, 3 June 2011

Flow with the Grow

In order to remain metabolically active, living things require a source of energy, and a regular supply of molecules of various kinds. As living things are composed of cells, large organisms inevitably face a fundamental challenge: they need to supply all of their component cells with the resources needed for survival. Mammals have cardio-vascular systems, plants have xylem and phloem, but how do fungi tackle the fundamental transport challenge? Compared to the other major kingdoms of multicellular life, transport in fungi is poorly understood. This is somewhat surprising, as transport in fungi is an ecologically critical process. Fungi are an essential component of soil: without fungi leaf litter would not degrade, and many fungi form foraging networks which circulate carbon, nitrogen and phosphate.

If a fungus grows, that increase in volume must come from somewhere: if the source of new volume is distant from the growth then this must create flows in the network. Because the volume is mostly water, which here is effectively incompressible, growth in one part of the network will be rapidly coupled to the rest. We suggest that fluid flows associated with growth might themselves be the major form of long-range transport in fungi.

To investigate transport in fungi and the developmental logic of fungal networks, we photographed growing fungal networks, and digitized the images to produce a sequence of matrices that describe how the networks change over time. For each sequence of networks we identify a set of fluid flows which are as small as possible whilst being consistent with the observed changes in volume. We found that those parts of the network that were predicted to carry a large current typically thickened over time, while other parts of the network became thinner, or were consumed by the fungi in order to fuel further exploratory growth. So our crude idea (that flows are directly coupled to growth) did seem to be, at least partly, consistent with the data.

You'll find our paper "Growth-induced mass flows in fungal networks" here and in its journal version (free, from Nov 2011, at the Proceedings of the Royal Society B) here. The Institute for Science, Innovation and Society also wrote a blog post about this article. Luke and Nick