Community detection identifies a subnetwork of the synaptic proteome associated with differences in educational attainment*

Grant Robertson1,2, Colin McLean2, Oksana Sorokina2, David Sterratt2, Emilia Wysocka2, Katharina Heil2, W David Hill3, Ian Simpson2, Ian Deary3, and Douglas Armstrong2

1) Division of Psychiatry, Centre for Clinical Brain Sciences, University of Edinburgh 2) Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh 3) Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh

Background: The human synaptic proteome is a complex structure composed of over 5000 interacting proteins. Disruptions of these proteins have been associated with over 100 brain disorders, making them of considerable interest to researchers examining the molecular antecedents of these disorders. The structure of the human synaptic proteome can be modelled as a network in which each protein is a vertex and each interaction is an edge. A common property of complex networks is community structure: vertices form tightly interconnected groups (communities) with sparser connections between communities (1).

Previous studies have shown associations between communities detected in a subset of the synaptic proteome and cellular functions (2). The community detection methods used previously perform poorly at the scale of the complete synaptic proteome. We use a recently developed algorithm that scales well with increasing network size (3) to detect communities in a curated database of synaptic protein interactions. We then test whether the communities have an enriched association with educational attainment using Gene Set Analysis.

Methods: A network representing all published excitatory synaptic protein-protein interactions was generated using our database (5,999 proteins, 93,378 interactions). We used the Louvain algorithm to perform non-overlapping community detection (3). The genes encoding the proteins comprising each community were used as gene sets for Gene Set Analysis.
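
The Louvain algorithm greedily maximises modularity Q, which compares the number of edges inside each community with the number expected by chance. The following is a minimal sketch of that objective and of turning a partition into gene sets. It is not the pipeline used in this study: the function names, the toy network and the vertex labels are all hypothetical.

```python
from collections import defaultdict


def modularity(edges, partition):
    """Modularity Q of an undirected, unweighted graph without self-loops.

    edges: list of (u, v) pairs, each interaction listed once.
    partition: dict mapping each vertex to its community label.
    """
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # summed vertex degrees per community
    for u, v in edges:
        degree[partition[u]] += 1
        degree[partition[v]] += 1
        if partition[u] == partition[v]:
            intra[partition[u]] += 1
    # Q = sum over communities c of [ L_c / m - (d_c / 2m)^2 ]
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def communities_to_gene_sets(partition):
    """Group vertices by community label, giving one gene set per community."""
    sets = defaultdict(set)
    for vertex, label in partition.items():
        sets[label].add(vertex)
    return dict(sets)


# Toy network (hypothetical vertex names): two triangles joined by one edge.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
two_groups = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_group = {v: 0 for v in two_groups}

q_split = modularity(edges, two_groups)   # 5/14, about 0.357
q_merged = modularity(edges, one_group)   # 0.0
gene_sets = communities_to_gene_sets(two_groups)
```

Splitting the toy network into its two triangles gives a higher Q than treating it as one community, which is the kind of improvement Louvain searches for. At the scale of the full network a library implementation would normally be used instead, for example `louvain_communities` in NetworkX.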

Summary SNP data from a large (N=328,917) GWAS of educational attainment were downloaded from the Social Science Genetic Association Consortium (4). Gene-level p values were calculated using MAGMA (5). The communities discovered by the Louvain algorithm were then tested using the gene set analysis function in MAGMA, with correction for multiple testing by permutation.
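
Permutation-based correction for multiple testing can be illustrated with a single-step max-statistic procedure. This is a sketch only, and MAGMA's own procedure differs in detail. It assumes a simplified set statistic (the mean gene-level Z score) and a null generated by shuffling gene scores; the function name, gene names and scores are hypothetical.

```python
import random


def fwer_corrected_p(gene_z, gene_sets, n_perm=1000, seed=0):
    """Family-wise corrected p value per gene set (max-statistic method).

    gene_z: dict gene -> Z score; gene_sets: dict set name -> list of genes.
    """
    rng = random.Random(seed)
    genes = sorted(gene_z)
    scores = [gene_z[g] for g in genes]
    index = {g: i for i, g in enumerate(genes)}

    def set_means(values):
        # Simplified set statistic: mean score of the member genes.
        return {name: sum(values[index[g]] for g in members) / len(members)
                for name, members in gene_sets.items()}

    observed = set_means(scores)
    null_max = []
    for _ in range(n_perm):
        shuffled = scores[:]
        rng.shuffle(shuffled)  # break the link between genes and scores
        null_max.append(max(set_means(shuffled).values()))

    # Compare each observed statistic with the null distribution of the
    # largest statistic across all sets; the +1 terms avoid p = 0.
    return {name: (1 + sum(m >= obs for m in null_max)) / (n_perm + 1)
            for name, obs in observed.items()}


# Toy data (hypothetical): 12 genes, two candidate gene sets.
gene_z = {f"g{i}": z for i, z in enumerate(
    [3.1, 2.8, 2.6, 0.4, 0.2, 0.1, -0.1, -0.3, 0.0, 0.5, -0.6, 0.3])}
toy_sets = {"high": ["g0", "g1", "g2"], "low": ["g6", "g7", "g8"]}
corrected = fwer_corrected_p(gene_z, toy_sets)
```

Because every set is compared against the null distribution of the largest statistic over all sets, the resulting p values account for the number of sets tested without assuming the sets are independent.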

Results: Ten structural communities were identified, of which one (170 genes/proteins) showed a significantly enriched association with educational attainment (p=0.00047, corrected p=0.0055). Gene Ontology enrichment analysis showed that this community is functionally enriched for histone modification (fold change = 13.78, p < 1×10^-34). No other community showed a significant association with this phenotype.
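
The abstract does not state which test was used for the Gene Ontology enrichment. A common choice is the one-sided hypergeometric test, sketched here together with the fold enrichment, which is assumed to correspond to the reported fold change. The function names and the counts in the example are hypothetical and are not the study's data.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Annotated fraction in the gene set divided by the background fraction.

    k: annotated genes in the set, n: set size,
    K: annotated genes in the background, N: background size.
    """
    return (k / n) / (K / N)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the annotation."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(n, K) + 1))
    return favourable / total


# Toy counts (hypothetical): 2 of 4 set genes annotated, 3 of 10 overall.
fold = fold_enrichment(2, 4, 3, 10)          # (2/4) / (3/10) = 5/3
p_value = hypergeom_upper_tail(2, 4, 3, 10)  # (63 + 7) / 210 = 1/3
```

In the toy example the set holds the annotation at 5/3 times the background rate, but with so few genes the upper-tail probability is 1/3, so the enrichment would not be considered significant.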

Conclusions: Community detection algorithms can identify sets of genes encoding synaptic proteins that share a common network topology. Gene Set Analysis of GWAS results identifies an enriched association with educational attainment for the genes encoding the proteins in this novel subnetwork (community) of the synaptic proteome.

References

  1. Newman, M. E., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.
  2. Pocklington, A. J., Cumiskey, M., Armstrong, J. D., & Grant, S. G. (2006). The proteomes of neurotransmitter receptor complexes form modular networks with distributed functionality underlying plasticity and behaviour. Molecular Systems Biology, 2(1).
  3. Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
  4. Okbay, A., Beauchamp, J. P., Fontana, M. A., Lee, J. J., Pers, T. H., Rietveld, C. A., Turley, P., Chen, G. B., Emilsson, V., Meddens, S. F., Oskarsson, S., et al. (2016). Genome-wide association study identifies 74 loci associated with educational attainment. Nature, 533(7604), 539-542.
  5. de Leeuw, C. A., Mooij, J. M., Heskes, T., & Posthuma, D. (2015). MAGMA: generalized gene-set analysis of GWAS data. PLoS Computational Biology, 11(4), e1004219.

Funded by: Grant Robertson is entirely funded by PsySTAR, a new initiative funded by the Medical Research Foundation to provide postgraduate training for psychiatrists. It is jointly run by the Universities of Aberdeen, Dundee, Edinburgh and Glasgow.

* entered into the PhD student poster competition