Data approaches to identifying potential sources of emerging pathogens in humans, domesticated animals and crops
Ecological networks, in which nodes represent species and links represent interactions between those species, have been used to model and investigate a wide range of important phenomena. In ecological multi-host networks, nodes are host species, and two hosts are linked when they share one or more pathogens. The relative importance of each node can be quantified using centrality measures. Central hosts can act as interspecies super-spreaders, and identifying them is important for developing surveillance protocols and interventions aimed at preventing future disease emergence in humans, their domesticated animals or crops. Link prediction models that take into account the topology of observed interaction networks and the evolutionary relationships between hosts can be used to predict missing links between hosts and pathogens. Such missing links may indicate future emerging pathogens or existing but undocumented interactions between host and pathogen species. Developing the various components of this project requires combining skills in programming, data mining and data management (to extract the information needed to build the networks) with mathematical and statistical skills (for network analysis and prediction of missing links), underpinned by an understanding of evolutionary relationships between species (for parametrising and interpreting link prediction models). The project sits at the interface of data science and network analysis, with statistical components.
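As a rough illustration of the ideas above, the following Python sketch builds a host network from host-pathogen records, ranks hosts by degree centrality, and scores unobserved host-pathogen pairs. Everything in it is an assumption made for illustration: the records are invented placeholders rather than real data, the function names are hypothetical, degree centrality is only one of many possible centrality measures, and the missing-link score is a simple Jaccard-based neighbourhood heuristic that ignores evolutionary relationships between hosts, so it stands in for, rather than reproduces, the link prediction models described above.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical host-pathogen records (placeholders, not real data).
RECORDS = [
    ("host_A", "pathogen_1"),
    ("host_A", "pathogen_2"),
    ("host_B", "pathogen_1"),
    ("host_B", "pathogen_3"),
    ("host_C", "pathogen_2"),
    ("host_C", "pathogen_3"),
    ("host_D", "pathogen_3"),
]


def pathogens_by_host(records):
    """Map each host to the set of pathogens recorded for it."""
    carried = defaultdict(set)
    for host, pathogen in records:
        carried[host].add(pathogen)
    return dict(carried)


def host_network(records):
    """Link two hosts when they share a pathogen.

    Returns {(host_a, host_b): number_of_shared_pathogens}, with each
    pair stored once in sorted order.
    """
    hosts_by_pathogen = defaultdict(set)
    for host, pathogen in records:
        hosts_by_pathogen[pathogen].add(host)
    weights = defaultdict(int)
    for hosts in hosts_by_pathogen.values():
        for a, b in combinations(sorted(hosts), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(weights):
    """Fraction of the other hosts that each host is linked to.

    Hosts that share no pathogens with any other host do not appear
    in the network and are therefore omitted.
    """
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {h: 0.0 for h in neighbours}
    return {h: len(nb) / (n - 1) for h, nb in neighbours.items()}


def missing_link_scores(records):
    """Score each unobserved (host, pathogen) pair.

    Assumed heuristic: sum, over the hosts known to carry the pathogen,
    of the Jaccard similarity between their pathogen set and the
    candidate host's pathogen set. Higher means more plausible.
    """
    carried = pathogens_by_host(records)
    all_pathogens = set().union(*carried.values())
    scores = {}
    for host, own in carried.items():
        for pathogen in all_pathogens - own:
            score = 0.0
            for other, theirs in carried.items():
                if other != host and pathogen in theirs:
                    score += len(own & theirs) / len(own | theirs)
            scores[(host, pathogen)] = score
    return scores


if __name__ == "__main__":
    centrality = degree_centrality(host_network(RECORDS))
    for host, value in sorted(centrality.items(), key=lambda kv: -kv[1]):
        print(host, round(value, 3))
    ranked = sorted(missing_link_scores(RECORDS).items(), key=lambda kv: -kv[1])
    for (host, pathogen), score in ranked[:3]:
        print(host, pathogen, round(score, 3))
```

In this toy setting, hosts with the highest centrality are the candidates for interspecies super-spreaders, and the highest-scoring unobserved pairs are the candidate missing links. A realistic analysis would replace the heuristic with a model that also uses network topology more fully and host phylogeny.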