Social scientists are fascinated by social influence: how people's beliefs, opinions and actions are influenced by others. This is relevant for understanding voting, health behaviour and opinions on issues like vaccination and climate change (topics our group is interested in). Mathematically inclined social scientists often interpret social influence using network theory. Networks, or graphs, are used to represent systems consisting of many individual units, known as nodes, and the interactions between them, which are referred to as edges or links. In social networks the nodes represent people and the links represent social ties such as friendships.
Given a particular graph, there are tools for modelling how opinions and beliefs can spread through it. However, in practice we often don’t know the structure of the social network itself. This could be because: i) the data we would like is unavailable; ii) privacy concerns about social network data mean we can't share it even if we have it; or iii) the data exists but is full of errors or omissions. Fortunately, we know a lot about the structure of social networks from decades of past research by social scientists and statisticians. For example, many social networks are known to be homophilous, meaning that people who are similar to each other are more likely to share a social connection (e.g. many of your friends are probably a similar age to you).
Inspired by this, we consider a simple mathematical model for homophilous networks known as a Random Geometric Graph (RGG). In an RGG the nodes are assigned random positions in a (unit) box. Each node is connected to every other node that lies within a set distance of it (see figure), which we refer to as the connection radius.
The positions of nodes may represent the positions of individuals in geographic space or in some “social space” where the coordinate axes might represent attributes such as age, income and education level. Since social networks are homophilous, we expect those who are closer together in “social space” to be more likely to share a social tie.
Figure: Example of a Random Geometric Graph with 100 nodes and a connection radius of 0.2.
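The construction described above is short enough to sketch in code. The following is a minimal illustration in Python, not the code used in our research: the function name `random_geometric_graph`, the use of NumPy, the two-dimensional box and the hard (non-periodic) boundaries are all assumptions made for the example.

```python
import numpy as np


def random_geometric_graph(n, radius, dim=2, seed=None):
    """Sketch of an RGG in the unit box (hypothetical helper, hard boundaries assumed).

    Returns the node positions and a symmetric 0/1 adjacency matrix.
    """
    rng = np.random.default_rng(seed)
    # Each node gets an independent, uniformly random position in [0, 1)^dim.
    positions = rng.random((n, dim))
    # Pairwise Euclidean distances between all nodes.
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Link every pair of distinct nodes within the connection radius.
    adjacency = dists <= radius
    np.fill_diagonal(adjacency, False)  # no self-links
    return positions, adjacency.astype(int)


# Same parameters as the figure: 100 nodes, connection radius 0.2.
positions, adjacency = random_geometric_graph(100, 0.2, seed=0)
print("nodes:", adjacency.shape[0], "links:", adjacency.sum() // 2)
```

Changing `seed` draws a different graph with the same number of nodes and the same connection radius, which is the sense in which an RGG is "random".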
One basic question we can ask about a
network is: “how long does it take something to spread across it?” We refer to
this as the diffusion timescale. The diffusion timescale in a graph is indicative
of how well connected the graph is and governs how quickly we might expect a
disease, rumour or the adoption of a new behaviour to spread through it (or
even how long it will take a zombie apocalypse to take hold). In our recent research we focus on the question:
“If we do not know the network (but perhaps know some of its properties), how precisely can we know the diffusion timescale?”
We show that different RGGs drawn at random with the same number of nodes and connection radius can have very different diffusion timescales. This implies that if we don’t have a good grasp of the graph structure, it could be difficult to predict the outcomes of processes such as the spread of an opinion through a social network. Put another way, we can gain a lot of extra information about diffusion timescales if we happen to know the social coordinates of individuals. On the other hand, we do find some classes of RGGs where the diffusion timescale is very predictable given only knowledge of the number of nodes and the connection radius.
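A rough way to see this kind of spread for yourself is sketched below. It is an illustration under stated assumptions, not our actual simulation code: we assume the diffusion timescale can be summarised by the inverse of the algebraic connectivity (the second-smallest eigenvalue of the graph Laplacian, which sets the decay rate of the slowest mode of simple diffusion on a connected graph), and the helper names, the two-dimensional box with hard boundaries and the parameter values are all hypothetical choices.

```python
import numpy as np


def rgg_adjacency(n, radius, rng):
    """Adjacency matrix of one RGG in the unit square (hypothetical helper)."""
    positions = rng.random((n, 2))
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    adjacency = (dists <= radius).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # no self-links
    return adjacency


def algebraic_connectivity(adjacency):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigenvalues = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    return eigenvalues[1]


rng = np.random.default_rng(1)
n, radius, samples = 100, 0.2, 50  # illustrative values only

# Draw many graphs with identical n and radius and record each one's value.
lambda2 = np.array(
    [algebraic_connectivity(rgg_adjacency(n, radius, rng)) for _ in range(samples)]
)

# A disconnected graph has algebraic connectivity zero (nothing can spread
# across it), so the assumed timescale 1 / lambda2 is only defined for
# connected samples.
connected = lambda2 > 1e-9
timescales = 1.0 / lambda2[connected]

print("connected samples:", int(connected.sum()), "of", samples)
if timescales.size > 0:
    print("shortest timescale:", timescales.min())
    print("longest timescale:", timescales.max())
```

The gap between the shortest and longest values printed gives a feel for how much the timescale can vary between graphs that share the same number of nodes and connection radius.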
Our work helps put limits on how accurately we can forecast the outcome of processes on networks given the available data (which is always imperfect). Future work may involve asking the same questions for real-world datasets. In addition, most of our new results were obtained through computer simulations, meaning that there is also scope for more theory.
You can read about our research in the paper “Large algebraic connectivity fluctuations in spatial network ensembles imply a predictive advantage from node location information” for free here or for not-free here in Physical Review E.
Matt and Nick.