network contains functions to arrange and analyze glycans in the context of networks. In such a network, each node represents a glycan, and each edge represents a relationship between two glycans, for instance their connection via a biosynthetic step. Because glycowork treats glycans as molecular graphs, these networks are hierarchical graphs: the network is one graph, and each node within it is itself a graph. network contains the following modules:
biosynthesis contains functions to construct and analyze biosynthetic glycan networks
evolution contains functions to compare (taxonomic) groups as to their glycan repertoires
biosynthesis
constructing and analyzing biosynthetic glycan networks
extracts diamond-shape motifs from biosynthetic networks (A->B,A->C,B->D,C->D) and uses evolutionary information to determine which path is taken from A to D
Arguments:
network (networkx object): biosynthetic network, returned from construct_network
species_list (list): list of species to compare network to
network_dic (dict): dictionary of form species name : biosynthetic network (gained from construct_network)
threshold (float): everything below or equal to that threshold will be cut; default:0.
nb_intermediates (int): number of intermediate nodes expected in a network motif to extract; has to be a multiple of 2 (2: diamond, 4: hexagon,…)
Returns:
Returns a dataframe listing each intermediate glycan and the proportion (0-1) of cases in which it has been experimentally observed in this path
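The diamond pattern described above (A->B, A->C, B->D, C->D) can be illustrated with a small, library-free sketch. This is not glycowork's implementation: the function names `find_diamond_motifs`, `intermediate_support`, and `apply_threshold`, the edge-list representation, and the toy per-species observation sets are all assumptions made for illustration. It only shows the idea of locating diamonds and scoring each intermediate by how often it was observed.

```python
from collections import defaultdict
from itertools import combinations

def find_diamond_motifs(edges):
    """Return (A, (B, C), D) for every A->B, A->C, B->D, C->D pattern.

    `edges` is an iterable of (source, target) pairs; with networkx,
    a DiGraph's edge list could be passed in the same form.
    """
    successors = defaultdict(set)
    for source, target in edges:
        successors[source].add(target)
    diamonds = []
    for a in sorted(successors):
        # Every unordered pair of children of A is a candidate (B, C).
        for b, c in combinations(sorted(successors[a]), 2):
            # D must be a shared child of both intermediates.
            shared = successors.get(b, set()) & successors.get(c, set())
            for d in sorted(shared):
                diamonds.append((a, (b, c), d))
    return diamonds

def intermediate_support(diamond, observed_by_species):
    """Proportion (0-1) of species in which each intermediate was observed."""
    _, intermediates, _ = diamond
    n_species = len(observed_by_species)
    return {
        glycan: sum(glycan in seen for seen in observed_by_species.values()) / n_species
        for glycan in intermediates
    }

def apply_threshold(support, threshold=0.0):
    """Cut every intermediate whose proportion is below or equal to the threshold."""
    return {glycan: p for glycan, p in support.items() if p > threshold}

# Toy network using the A/B/C/D labels from the description (hypothetical data).
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
observed_by_species = {
    "species_1": {"A", "B", "D"},
    "species_2": {"A", "B", "D"},
    "species_3": {"A", "C", "D"},
    "species_4": {"A", "B", "C", "D"},
}

diamonds = find_diamond_motifs(edges)
support = intermediate_support(diamonds[0], observed_by_species)
kept = apply_threshold(support, threshold=0.5)
```

In this toy example the path through B is supported by more species than the path through C, so B is the intermediate that survives the cut. Motifs with more intermediates (such as the hexagon case of nb_intermediates=4) are not covered by this sketch.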
highlights a certain attribute in the network that will be visible when using plot_network
Arguments:
network (networkx object): biosynthetic network, returned from construct_network
highlight (string): which attribute to highlight (choices are ‘motif’ for glycan motifs, ‘abundance’ for glycan abundances, ‘conservation’ for glycan conservation, ‘species’ for highlighting 1 species in multi-network)
motif (string): highlight=motif; which motif to highlight (absence/presence, in violet/green); default:None
abundance_df (dataframe): highlight=abundance; dataframe containing glycans and their relative intensity
glycan_col (string): highlight=abundance; column name of the glycans in abundance_df
intensity_col (string): highlight=abundance; column name of the relative intensities in abundance_df
conservation_df (dataframe): highlight=conservation; dataframe containing glycans from different species
network_dic (dict): highlight=conservation/species; dictionary of form species name : biosynthetic network (gained from construct_network); default:pre-computed milk networks
species (string): highlight=species; which species to highlight in a multi-species network
Returns:
Returns a network with the additional ‘origin’ (motif/species) or ‘abundance’ (abundance/conservation) node attribute storing the highlight
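As a rough illustration of the ‘motif’ highlight, the sketch below attaches an ‘origin’ attribute to every node, green when the motif is present and violet when it is absent. It is a simplified stand-in rather than the library's code: the function name `highlight_motif_sketch` is hypothetical, nodes are held in a plain dictionary instead of a networkx object, and motif presence is approximated by a substring check on the glycan string, which is only adequate for simple linear motifs.

```python
def highlight_motif_sketch(nodes, motif):
    """Return a copy of `nodes` with an 'origin' colour stored per glycan.

    `nodes` maps glycan string -> attribute dict. Presence of the motif is
    approximated by a substring test (an assumption for this sketch; real
    motif matching would compare glycan graphs).
    """
    highlighted = {}
    for glycan, attributes in nodes.items():
        new_attributes = dict(attributes)  # leave the input untouched
        new_attributes["origin"] = "green" if motif in glycan else "violet"
        highlighted[glycan] = new_attributes
    return highlighted

# Toy nodes (hypothetical example glycans, no attributes yet).
nodes = {
    "Gal(b1-4)Glc": {},
    "Fuc(a1-2)Gal(b1-4)Glc": {},
    "Neu5Ac(a2-3)Gal(b1-4)Glc": {},
}

highlighted = highlight_motif_sketch(nodes, "Fuc(a1-2)Gal")
```

A plotting routine could then read the ‘origin’ attribute of each node to choose its colour.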
compares biosynthetic patterns between glycomes of two conditions
Arguments:
df (dataframe): dataframe containing glycan sequences in first column and relative abundances in subsequent columns [alternative: filepath to .csv]
group1 (list): list of column indices or names for the first group of samples, usually the control
group2 (list): list of column indices or names for the second group of samples
analysis (string): what type of analysis to perform on networks, options are “reaction” for reaction type fluxes and “flow” for comparing flow to sinks; default:“reaction”
paired (bool): whether samples are paired or not (e.g., tumor & tumor-adjacent tissue from same patient); default:False
Returns:
Returns a dataframe with:
(i) Differential flow features
(ii) Their mean abundance across all samples in group1 + group2
(iii) Log2-transformed fold change of group2 vs group1 (i.e., negative = lower in group2)
(iv) Uncorrected p-values (Welch’s t-test) for difference in mean
(v) Effect size as Cohen’s d
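The statistics listed in the returned dataframe can be sketched for a single feature using only the standard library. The function name `compare_feature` is hypothetical, and several details are assumptions rather than documented behavior: Cohen's d is computed with the pooled standard deviation, its sign follows group2 minus group1, and only the unpaired case is handled. The p-value itself needs a t-distribution, so the sketch stops at the Welch t statistic and its degrees of freedom; `scipy.stats.ttest_ind(group1, group2, equal_var=False)` is one way to obtain the corresponding p-value.

```python
import math
from statistics import mean, variance

def compare_feature(group1, group2):
    """Summarise one feature's abundances in two unpaired groups."""
    m1, m2 = mean(group1), mean(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances
    n1, n2 = len(group1), len(group2)

    # Welch's t statistic: no assumption of equal variances.
    se2_1, se2_2 = v1 / n1, v2 / n2
    t_stat = (m2 - m1) / math.sqrt(se2_1 + se2_2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_1 + se2_2) ** 2 / (se2_1 ** 2 / (n1 - 1) + se2_2 ** 2 / (n2 - 1))

    # Cohen's d with pooled standard deviation (assumed variant).
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))

    return {
        "mean_abundance": mean(list(group1) + list(group2)),
        "log2fc": math.log2(m2 / m1),  # negative = lower in group2
        "welch_t": t_stat,
        "welch_dof": dof,
        "cohens_d": (m2 - m1) / pooled_sd,
    }

# Toy abundances for one feature (hypothetical values).
control = [10.0, 12.0, 11.0]
treated = [20.0, 22.0, 24.0]
result = compare_feature(control, treated)
```

Here the treated mean is twice the control mean, so the log2 fold change is 1; swapping the two groups would flip the sign of the fold change, the t statistic, and the effect size.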
evolution
investigating evolutionary relationships of glycans