Bioinformatics and Systems Biology

With the deluge of high-throughput data being generated in biological research, including in our own laboratory, there is an urgent need for bioinformatics to extract meaningful information from complex data structures. The questions we are trying to answer with these large datasets are no longer confined to biology; many are mathematical problems. Over the past 5 years, we have actively collaborated with Professor Jean Yang at the School of Mathematics and Statistics at the University of Sydney to develop a core bioinformatics unit within our lab. We currently have a small team of bioinformaticians (including computer scientists, statisticians and physicists) at various stages of their careers working with us to analyse and visualise the data generated within and beyond our lab, and to generate hypotheses and predictions from it. Our main goal is to carefully extract information from each molecular level and then integrate this information into a model that allows us to study the system as a whole.


 

A dynamic model of the mechanisms leading up to the activation of Akt. This model was designed to interrogate the location of the various phosphorylated forms of Akt and the implications of that location for how Akt is activated.

 

Time courses of the simulated results obtained from the model. The upper figure demonstrates that the model is able to reproduce experimental observations. However, closer inspection of the individual model components reveals significant differences between the dynamic behaviour of membrane-bound Akt and that of cytosolic Akt.

Dynamic Modelling

Using a bottom-up engineering approach, we reconstruct the reactions that make up biochemical networks in silico and model them with mechanistic rate equations (Wong and Krycer et al., FEBS Open Bio, 2015) to simulate the dynamic behaviour of every individual component in the system, including components that cannot be observed experimentally. When the model fails to reproduce experimental observations, it exposes gaps in what we know about the network; when it succeeds, it lets us predict how the system will behave when perturbed.
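As a minimal illustration of modelling with mechanistic rate equations, the sketch below simulates a toy three-species network with mass-action kinetics. It is not the published model: the species, the reactions and every rate constant are hypothetical, chosen only to show how a hidden component (here, membrane-bound Akt) can be followed over time.

```python
# Toy mass-action model. Species, reactions and rate constants are all
# hypothetical and are NOT taken from the published Akt model.

def derivatives(state, k):
    """Rate equations for cytosolic, membrane-bound and phosphorylated Akt."""
    akt_cyto, akt_mem, p_akt_mem = state
    recruit = k["recruit"] * akt_cyto    # cytosol -> membrane
    release = k["release"] * akt_mem     # membrane -> cytosol
    phos = k["phos"] * akt_mem           # phosphorylation at the membrane
    dephos = k["dephos"] * p_akt_mem     # dephosphorylation
    return (
        release - recruit,
        recruit - release - phos + dephos,
        phos - dephos,
    )

def rk4_step(f, state, k, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = f(state, k)
    k2 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)), k)
    k3 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)), k)
    k4 = f(tuple(s + dt * d for s, d in zip(state, k3)), k)
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(initial, k, dt=0.01, t_end=20.0):
    """Return a list of (time, state) pairs from t = 0 to t = t_end."""
    steps = int(round(t_end / dt))
    state = tuple(initial)
    trajectory = [(0.0, state)]
    for i in range(1, steps + 1):
        state = rk4_step(derivatives, state, k, dt)
        trajectory.append((i * dt, state))
    return trajectory

# Hypothetical rate constants (arbitrary units).
RATES = {"recruit": 1.0, "release": 0.5, "phos": 2.0, "dephos": 0.2}

# Start with all Akt in the cytosol, as a fraction of the total pool.
trajectory = simulate((1.0, 0.0, 0.0), RATES)
final_time, final_state = trajectory[-1]
print("t = %.1f: cytosolic %.3f, membrane %.3f, phosphorylated %.3f"
      % ((final_time,) + final_state))
```

In a sketch like this, comparing a simulated trajectory with a measured time course is what exposes a mismatch between model and experiment, and changing a rate constant is the simplest form of in silico perturbation.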

 


Flowchart depicting the computational prediction of kinase substrates using integrated phosphoproteomics data sets (Humphrey et al., Cell Metab, 2013).

Machine Learning

We apply various machine learning approaches to generate predictions about biological systems. We are particularly interested in predicting which kinases phosphorylate proteins in key signaling pathways.
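To make the idea of a prediction score concrete, the sketch below trains a small logistic regression classifier and uses it to rank candidate phosphorylation sites. It is a generic illustration, not the method used in the cited publications: the two features (a motif-match score and the log2 fold change after stimulation), the training values and the site names are all invented for this example.

```python
import math

# Hypothetical toy training set, invented for illustration only:
# (motif_score, log2_fold_change_after_stimulation, is_known_substrate)
TRAINING = [
    (0.90, 2.1, 1), (0.80, 1.7, 1), (0.70, 2.5, 1), (0.95, 1.2, 1),
    (0.20, 0.1, 0), (0.10, -0.3, 0), (0.30, 0.4, 0), (0.15, 0.0, 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Return a prediction score between 0 and 1 for one site."""
    bias, w_motif, w_fold = weights
    motif, fold = features
    return sigmoid(bias + w_motif * motif + w_fold * fold)

def train(data, learning_rate=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    n = len(data)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for motif, fold, label in data:
            error = predict(weights, (motif, fold)) - label
            grad[0] += error
            grad[1] += error * motif
            grad[2] += error * fold
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights

weights = train(TRAINING)

# Hypothetical unlabelled candidate sites to be scored.
candidates = {
    "site_A": (0.85, 2.0),
    "site_B": (0.10, 0.0),
    "site_C": (0.50, 1.0),
}
ranked = sorted(
    ((name, predict(weights, features)) for name, features in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(name, round(score, 3))
```

The highest-ranked candidates in a list like this are the ones that would be prioritised for experimental validation.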

Prediction scores from a model predicting Akt and mTOR substrates. Each point corresponds to a phosphorylation site and is coloured on a rainbow scale according to its prediction score (Yang et al., Bioinformatics, 2015).


Statistical Bioinformatics

We use a variety of bioinformatics tools to process, analyse and visualise large-scale data from various 'omes to provide new biological insights that can be experimentally validated. Many of these tools have been developed within the lab, including PhosphOrtholog, which maps protein modification sites between species (Chaudhuri and Sadrieh et al., BMC Genomics, 2015); DPA, which identifies pathways that are regulated across multiple perturbations (Yang and Patrick et al., Bioinformatics, 2014); and Thunderbolt, a pipeline for processing mass spectrometry-based proteomics data.
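One common way to map a modification site between species is to carry its residue position through a pairwise alignment of the two orthologous proteins. The sketch below shows that general idea only; it does not reproduce PhosphOrtholog, and the function name and the two aligned sequences are hypothetical.

```python
def map_site(aligned_source, aligned_target, source_position):
    """Map a 1-based residue position from source to target via an alignment.

    Both arguments are rows of the same pairwise alignment, with '-' for gaps.
    Returns the 1-based position in the ungapped target sequence, or None if
    the source residue aligns to a gap or the position is out of range.
    """
    if len(aligned_source) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    source_index = 0
    target_index = 0
    for src, tgt in zip(aligned_source, aligned_target):
        if src != "-":
            source_index += 1
        if tgt != "-":
            target_index += 1
        if src != "-" and source_index == source_position:
            return target_index if tgt != "-" else None
    return None

# Hypothetical alignment of two short orthologous sequences.
ALIGNED_SOURCE = "MAR-RSSTFGE"
ALIGNED_TARGET = "M-RPRSSTF-E"

serine_site = map_site(ALIGNED_SOURCE, ALIGNED_TARGET, 5)   # aligned to a residue
gapped_site = map_site(ALIGNED_SOURCE, ALIGNED_TARGET, 2)   # aligned to a gap
last_site = map_site(ALIGNED_SOURCE, ALIGNED_TARGET, 10)    # shifted by the gaps
print(serine_site, gapped_site, last_site)
```

A practical tool would also check that the mapped residue is still a modifiable amino acid in the target species before reporting the site as conserved.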

A parallel coordinates plot reveals how significantly regulated metabolic pathways change across various models of insulin resistance in adipocytes. The thickness of each solid line represents the number of genes/proteins within the pathway that follow a particular trend in our data, and dashed lines represent outlier genes/proteins within those pathways.

Quantification of phosphorylation of known and predicted AKT and mTOR substrates in response to insulin in 3T3-L1 adipocytes (Parker and Yang et al., Sci. Signal., 2015).


Network Analysis

Biological networks are commonly used to visualise relationships, for example between genes, between proteins, or between kinases and their substrates. Network analysis can reveal interesting regulatory patterns such as hubs (highly connected nodes) and bottlenecks (nodes that many shortest paths pass through), and can identify novel regulators of biological processes in a system.
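Hubs and bottlenecks can be quantified with two standard graph measures: degree (the number of neighbours) and betweenness centrality (the number of shortest paths passing through a node). The sketch below computes both for a small unweighted, undirected network using Brandes' algorithm. The network and its node names are hypothetical and do not come from any of the datasets described here.

```python
from collections import deque

# Hypothetical undirected interaction network, invented for illustration.
EDGES = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "bridge"),
    ("bridge", "x"), ("x", "y"), ("x", "z"),
]

def build_adjacency(edges):
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def betweenness(adjacency):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted)."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        order = []                                  # nodes in BFS visiting order
        predecessors = {n: [] for n in adjacency}
        path_count = {n: 0 for n in adjacency}
        distance = {n: -1 for n in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adjacency[node]:
                if distance[neighbour] < 0:         # first visit
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    path_count[neighbour] += path_count[node]
                    predecessors[neighbour].append(node)
        dependency = {n: 0.0 for n in adjacency}
        for node in reversed(order):                # farthest nodes first
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Each unordered pair was counted once from each endpoint.
    return {node: score / 2.0 for node, score in scores.items()}

adjacency = build_adjacency(EDGES)
degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
centrality = betweenness(adjacency)
for node in sorted(adjacency, key=lambda n: centrality[n], reverse=True):
    print(node, degree[node], centrality[node])
```

In this toy network the node named "bridge" has only two neighbours but lies on every path between the two halves of the graph, which is the signature of a bottleneck rather than a hub.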

The exercise-regulated phosphoproteome in human muscle, with mapped known and predicted kinases. Upregulated phosphosites are light red, downregulated phosphosites are dark red, known kinase edges are green and predicted kinase edges are yellow. The visualisation was created in collaboration with Tim Burykin (Coffey Life Lab, Charles Perkins Centre).

A protein-protein interaction wheel between the insulin signaling pathway and a computationally generated gene set reveals key nodes that represent hubs in the network. Lines in red indicate interactions between the insulin signaling pathway and the hub protein CTNNB1, which was experimentally verified to play a role in insulin action (Chaudhuri et al., npj Systems Biology and Applications, 2015).

A network visualising the relationships between kinases and their substrates in human skeletal muscle after acute exercise. The network has been overlaid with known and predicted kinase-substrate relationships and known protein-protein interactions.