motif

motif contains many functions to process glycans in various ways and use this processing to analyze glycans via curated motifs, graph features, and sequence features. It contains the following modules:

draw contains the GlycoDraw function to draw glycans in SNFG style
analysis contains functions for downstream analyses of important glycan motifs etc.
annotate contains functions to extract curated motifs, graph features, and sequence features from glycan sequences
graph is used to convert glycan sequences to graphs and contains helper functions to search for motifs, check whether two sequences describe the same glycan, etc.
processing contains functions to process IUPAC-condensed glycan sequences, as well as conversion functions to convert other nomenclatures into IUPAC-condensed.
regex contains functionality for performing powerful regular expression-like searches on glycans; get_match is the user-facing function.
query is used to interact with the databases contained in glycowork, delivering insights for sequences of interest
tokenization has helper functions to map m/z -> composition, composition -> structure, structure -> motif, and more

draw

drawing glycans in SNFG style

GlycoDraw

 GlycoDraw (glycan:str, vertical:bool=False, compact:bool=False,
            show_linkage:bool=True, dim:float=50,
            highlight_motif:Optional[str]=None,
            highlight_termini_list:List=[], reverse_highlight:bool=False,
            repeat:Union[bool,int,str,NoneType]=None,
            repeat_range:Optional[List[int]]=None,
            draw_method:Optional[str]=None,
            filepath:Union[str,pathlib.Path,NoneType]=None,
            suppress:bool=False, per_residue:List=[],
            pdb_file:Union[str,pathlib.Path,NoneType]=None,
            alt_text:Optional[str]=None, libr:dict=None)

Renders glycan structure using SNFG symbols or chemical structure representation

	Type	Default	Details
glycan	str		IUPAC-condensed glycan sequence
vertical	bool	False	Draw vertically
compact	bool	False	Use compact style
show_linkage	bool	True	Show linkage labels
dim	float	50	Base dimension for scaling
highlight_motif	Optional	None	Motif to highlight
highlight_termini_list	List	[]	Terminal positions (from ‘terminal’, ‘internal’, and ‘flexible’)
reverse_highlight	bool	False	Whether to highlight everything EXCEPT highlight_motif
repeat	Union	None	Repeat unit specification (True: n units, int: # of units, str: range of units)
repeat_range	Optional	None	Repeat unit range
draw_method	Optional	None	Drawing method: None, ‘chem2d’, ‘chem3d’
filepath	Union	None	Output file path
suppress	bool	False	Suppress display
per_residue	List	[]	Per-residue intensity values (order should be the same as the monosaccharides in glycan string)
pdb_file	Union	None	Path to an existing glycan structure file; only used when draw_method=‘chem3d’
alt_text	Optional	None	Custom ALT text for accessibility
libr	dict	None	Can be modified to draw unusual monosaccharides
Returns	Any		Drawing object

GlycoDraw("Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-2)Man(a1-3)[Neu5Gc(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)][GlcNAc(b1-4)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc",
         highlight_motif = "GlcNAc(b1-?)Man")
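
The per_residue parameter expects one value per monosaccharide, in the order the monosaccharides appear in the glycan string. The following is a minimal sketch of that ordering using only the standard library. `monosaccharide_order` is a hypothetical helper, not part of glycowork, and it assumes a well-formed IUPAC-condensed string without floating substituents in curly braces.

```python
import re

def monosaccharide_order(glycan: str) -> list[str]:
    """Return monosaccharides in left-to-right string order (hypothetical helper)."""
    # Drop branch brackets, then split on the parenthesised linkages.
    flat = glycan.replace("[", "").replace("]", "")
    return [m for m in re.split(r"\([^)]*\)", flat) if m]

glycan = "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"
residues = monosaccharide_order(glycan)
intensities = [0.2, 0.9, 0.5, 0.1, 0.1]  # illustrative values only
paired = list(zip(residues, intensities))  # one intensity per residue
```

A list built this way could be checked for length against the residue list before being passed as per_residue.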

annotate_figure

 annotate_figure (svg_input:str, scale_range:Tuple[int,int]=(25, 80),
                  compact:bool=False, glycan_size:str='medium',
                  filepath:Union[str,pathlib.Path]='',
                  scale_by_DE_res:Optional[pandas.core.frame.DataFrame]=None,
                  x_thresh:float=1, y_thresh:float=0.05,
                  x_metric:str='Log2FC')

Replaces text labels with glycan drawings in SVG figure

	Type	Default	Details
svg_input	str		Input SVG file path
scale_range	Tuple	(25, 80)	Min/max glycan dimensions
compact	bool	False	Use compact style
glycan_size	str	medium	Glycan size preset (‘small’, ‘medium’, ‘large’)
filepath	Union		Output file path
scale_by_DE_res	Optional	None	Differential expression results (motif_analysis.get_differential_expression)
x_thresh	float	1	X metric threshold
y_thresh	float	0.05	P-value threshold
x_metric	str	Log2FC	X axis metric (‘Log2FC’, ‘Effect size’)
Returns	Optional		Modified SVG code

plot_glycans_excel

 plot_glycans_excel (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
                     folder_filepath:Union[str,pathlib.Path],
                     glycan_col_num:int=0, scaling_factor:float=0.2,
                     compact:bool=False)

Creates Excel file with SNFG glycan images in a new column

	Type	Default	Details
df	Union		DataFrame or filepath with glycans
folder_filepath	Union		Output folder path
glycan_col_num	int	0	Glycan column index
scaling_factor	float	0.2	Image scaling
compact	bool	False	Use compact style
Returns	None

analysis

downstream analyses of important glycan motifs

get_pvals_motifs

 get_pvals_motifs (df:Union[pandas.core.frame.DataFrame,str],
                   glycan_col_name:str='glycan',
                   label_col_name:str='target', zscores:bool=True,
                   thresh:float=1.645, sorting:bool=True,
                   feature_set:List[str]=['exhaustive'],
                   multiple_samples:bool=False,
                   motifs:Optional[pandas.core.frame.DataFrame]=None,
                   custom_motifs:List[str]=[])

Identifies significantly enriched glycan motifs using Welch’s t-test with FDR correction and Cohen’s d effect size calculation, comparing samples above/below threshold

	Type	Default	Details
df	Union		Input dataframe or filepath (.csv/.xlsx)
glycan_col_name	str	glycan	Column name for glycan sequences
label_col_name	str	target	Column name for labels
zscores	bool	True	Whether data are z-scores
thresh	float	1.645	Threshold to separate positive/negative
sorting	bool	True	Sort p-value dataframe
feature_set	List	[‘exhaustive’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
multiple_samples	bool	False	Multiple samples with glycan columns
motifs	Optional	None	Modified motif_list
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
Returns	DataFrame		DataFrame with p-values, FDR-corrected p-values, and Cohen’s d effect sizes for glycan motifs

import pandas as pd

glycans = ['Man(a1-3)[Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc',
           'Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
           'GalNAc(a1-4)GlcNAcA(a1-4)[GlcN(b1-7)]Kdo(a2-5)[Kdo(a2-4)]Kdo(a2-6)GlcOPN(b1-6)GlcOPN',
           'Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
           'Glc(b1-3)Glc(b1-3)Glc']
label = [3.234, 2.423, 0.733, 3.102, 0.108]
test_df = pd.DataFrame({'glycan':glycans, 'binding':label})

print("Glyco-Motif enrichment p-value test")
out = get_pvals_motifs(test_df, 'glycan', 'binding').iloc[:10,:]

Glyco-Motif enrichment p-value test

	motif	pval	corr_pval	effect_size
4	GlcNAc	0.038120	0.205849	1.530905
8	Man	0.054356	0.234990	1.390253
24	Man(a1-?)Man	0.060923	0.234990	1.308333
22	Man(a1-3)Man	0.034212	0.205849	1.196586
14	GlcNAc(b1-4)GlcNAc	0.019543	0.175885	1.168815
23	Man(a1-6)Man	0.019543	0.175885	1.168815
25	Man(b1-4)GlcNAc	0.019543	0.175885	1.168815
7	Kdo	0.328790	0.479672	-0.811679
2	Glc	0.644180	0.668956	-0.811679
21	Man(a1-2)Man	0.177461	0.479672	0.772320
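
The pval and effect_size columns come from Welch's t-test and Cohen's d. The sketch below shows the two underlying statistics with the standard library only. It is an illustration of the textbook formulas, not glycowork's implementation; the input values are made up, and the p-value lookup and FDR correction are left out.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

above = [2.0, 3.0, 4.0]  # motif counts in glycans above the threshold (made-up)
below = [0.0, 1.0, 2.0]  # motif counts in glycans below the threshold (made-up)
t_stat, dof = welch_t(above, below)
d = cohens_d(above, below)
```

Unlike Student's t-test, the Welch variant does not assume equal variances in the two groups, which is why the degrees of freedom are estimated from the data.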

get_representative_substructures

 get_representative_substructures (enrichment_df:pandas.core.frame.DataFrame)

Constructs minimal glycan structures that represent significantly enriched motifs by optimizing for motif content while minimizing structure size using subgraph isomorphism

	Type	Details
enrichment_df	DataFrame	Output from get_pvals_motifs
Returns	List	Up to 10 minimal glycans containing enriched motifs

get_heatmap

 get_heatmap (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
              motifs:bool=False, feature_set:List[str]=['known'],
              transform:str='', datatype:str='response',
              rarity_filter:float=0.05,
              filepath:Union[str,pathlib.Path]='', index_col:str='glycan',
              custom_motifs:List[str]=[], return_plot:bool=False,
              show_all:bool=False, **kwargs:Any)

Creates hierarchically clustered heatmap visualization of glycan/motif abundances

	Type	Default	Details
df	Union		Input dataframe or filepath (.csv/.xlsx)
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘known’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
transform	str		Transform data before plotting
datatype	str	response	Data type: ‘response’ for quantitative values or ‘presence’ for presence/absence
rarity_filter	float	0.05	Min proportion for non-zero values
filepath	Union		Path to save plot
index_col	str	glycan	Column to use as index
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
return_plot	bool	False	Return plot object
show_all	bool	False	Show all tick labels
kwargs	Any
Returns	Optional		None or plot object if return_plot=True

glycans = ['Man(a1-3)[Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc',
           'Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
           'GalNAc(a1-4)GlcNAcA(a1-4)[GlcN(b1-7)]Kdo(a2-5)[Kdo(a2-4)]Kdo(a2-6)GlcN4P(b1-6)GlcN4P',
           'Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
           'Glc(b1-3)Glc(b1-3)Glc']
label = [3.234, 2.423, 0.733, 3.102, 0.108]
label2 = [0.134, 0.345, 1.15, 0.233, 2.981]
label3 = [0.334, 0.245, 1.55, 0.133, 2.581]
test_df = pd.DataFrame([label, label2, label3], columns = glycans)

get_heatmap(test_df, motifs = True, feature_set = ['known', 'exhaustive'])

plot_embeddings

 plot_embeddings (glycans:List[str],
                  emb:Union[Dict[str,numpy.ndarray],pandas.core.frame.DataFrame,NoneType]=None,
                  label_list:Optional[List[Any]]=None,
                  shape_feature:Optional[str]=None,
                  filepath:Union[str,pathlib.Path]='', alpha:float=0.8,
                  palette:str='colorblind', **kwargs:Any)

Visualizes learned glycan embeddings using t-SNE dimensionality reduction with optional group coloring

	Type	Default	Details
glycans	List		List of IUPAC-condensed glycan sequences
emb	Union	None	Glycan embeddings dict/DataFrame; defaults to SweetNet embeddings
label_list	Optional	None	Labels for coloring points
shape_feature	Optional	None	Monosaccharide/bond for point shapes
filepath	Union		Path to save plot
alpha	float	0.8	Point transparency
palette	str	colorblind	Color palette for groups
kwargs	Any		Keyword args passed to seaborn scatterplot
Returns	None

df_fabales = df_species[df_species.Order == 'Fabales'].reset_index(drop = True)
plot_embeddings(df_fabales.glycan.values.tolist(), label_list = df_fabales.Family.values.tolist())

Download completed.

characterize_monosaccharide

 characterize_monosaccharide (sugar:str,
                              df:Optional[pandas.core.frame.DataFrame]=None,
                              mode:str='sugar',
                              glycan_col_name:str='glycan',
                              rank:Optional[str]=None,
                              focus:Optional[str]=None,
                              modifications:bool=False,
                              filepath:Union[str,pathlib.Path]='',
                              thresh:int=10)

Analyzes connectivity and modification patterns of specified monosaccharides/linkages in glycan sequences

	Type	Default	Details
sugar	str		Monosaccharide or linkage to analyze
df	Optional	None	DataFrame with glycan column ‘glycan’; defaults to df_species
mode	str	sugar	Analysis mode: ‘sugar’, ‘bond’, ‘sugarbond’
glycan_col_name	str	glycan	Column name for glycan sequences
rank	Optional	None	Column name for group filtering
focus	Optional	None	Row value for group filtering
modifications	bool	False	Consider modified monosaccharides
filepath	Union		Path to save plot
thresh	int	10	Minimum count threshold for inclusion
Returns	None

characterize_monosaccharide('Rha', rank = 'Kingdom', focus = 'Fungi', modifications = True)

get_differential_expression

 get_differential_expression (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
                              group1:List[Union[str,int]],
                              group2:List[Union[str,int]],
                              motifs:bool=False,
                              feature_set:List[str]=['exhaustive',
                              'known'], paired:bool=False,
                              impute:bool=True, sets:bool=False,
                              set_thresh:float=0.9,
                              effect_size_variance:bool=False,
                              min_samples:float=0.1,
                              grouped_BH:bool=False,
                              custom_motifs:List[str]=[],
                              transform:Optional[str]=None,
                              gamma:float=0.1,
                              custom_scale:Union[float,Dict]=0,
                              glycoproteomics:bool=False,
                              level:str='peptide', monte_carlo:bool=False,
                              random_state:Union[int,numpy.random._generator.Generator,NoneType]=None)

Performs differential expression analysis using Welch’s t-test (or Hotelling’s T2 for sets) with multiple testing correction on glycomics abundance data

	Type	Default	Details
df	Union		DataFrame with glycans in rows (col 1) and abundance values in subsequent columns
group1	List		Column indices/names for first group
group2	List		Column indices/names for second group
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘exhaustive’, ‘known’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
paired	bool	False	Whether samples are paired
impute	bool	True	Impute zeros with a Random Forest model
sets	bool	False	Identify clusters of correlated glycans
set_thresh	float	0.9	Correlation threshold for clusters
effect_size_variance	bool	False	Calculate effect size variance
min_samples	float	0.1	Min fraction of non-zero samples required
grouped_BH	bool	False	Use two-stage adaptive Benjamini-Hochberg
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
transform	Optional	None	Transformation type: “CLR” or “ALR”
gamma	float	0.1	Uncertainty parameter for CLR transform
custom_scale	Union	0	Ratio of total signal in group2/group1 for an informed scale model (or group_idx: mean(group)/min(mean(groups)) signal dict for multivariate)
glycoproteomics	bool	False	Whether data is from glycoproteomics
level	str	peptide	Analysis level for glycoproteomics
monte_carlo	bool	False	Use Monte Carlo for technical variation
random_state	Union	None	optional random state for reproducibility
Returns	DataFrame		DataFrame with log2FC, p-values, FDR-corrected p-values, and Cohen’s d/Mahalanobis distance effect sizes

test_df = glycomics_data_loader.human_skin_O_PMC5871710_BCC

res = get_differential_expression(test_df, group1 = [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39],
                                  group2 = [2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40], motifs = True, paired = True)
res

You're working with an alpha of 0.044390023979542614 that has been adjusted for your sample size of 40.

	Glycan	Mean abundance	Log2FC	p-val	corr p-val	significant	corr Levene p-val	Effect size	Equivalence p-val
8	GalOS(b1-3)GalNAc	0.169090	-0.930987	0.000848	0.005641	True	0.871813	-0.884455	1.000000
0	H_antigen_type2	0.258644	-0.689804	0.002058	0.005641	True	0.940575	-0.797572	1.000000
1	Internal_LacNAc_type2	2.442730	0.467133	0.002552	0.005641	True	0.871813	0.776382	1.000000
9	GlcNAc6S(b1-6)GalNAc	1.098349	0.894961	0.002820	0.005641	True	0.871813	0.766511	1.000000
12	Neu5Ac(a2-3)Gal	13.016651	0.250839	0.004655	0.007449	True	0.940575	0.716794	1.000000
2	Terminal_LacNAc_type2	2.555526	-0.475258	0.007864	0.010485	True	0.871813	-0.664152	1.000000
14	Neu5Ac(a2-8)Neu5Ac	0.040878	-0.635598	0.018773	0.021455	True	0.871813	-0.574518	1.000000
10	Neu5Ac	17.472084	0.165081	0.048618	0.048618	False	0.871813	0.471179	1.000000
6	Gal	19.503926	0.111858	0.073880	0.073880	False	0.940575	0.422987	0.706774
11	Gal(b1-3)GalNAc	13.407321	0.095097	0.133433	0.133433	False	0.940575	0.350569	0.647072
7	GalNAc	13.576411	0.085435	0.168975	0.168975	False	0.940575	0.319743	0.647072
13	Neu5Ac(a2-6)GalNAc	4.414554	-0.118513	0.312657	0.312657	False	0.871813	-0.231928	0.647072
3	Disialyl_T_antigen	4.014195	-0.073129	0.543318	0.543318	False	0.871813	-0.138395	0.647072
5	Oglycan_core6	3.031385	0.018436	0.883520	0.883520	False	0.871813	0.033203	0.647072
4	Mucin_elongated_core2	4.998256	-0.012072	0.918651	0.918651	False	0.940575	-0.023143	0.647072
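
The corr p-val column holds p-values after multiple testing correction. As a rough illustration, the sketch below implements the plain Benjamini-Hochberg procedure with the standard library. This is an assumption for illustration: the function may use a different variant (for example the two-stage or grouped procedure selected via grouped_BH), so the numbers need not match the table above.

```python
def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the original order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # made-up raw p-values
corrected = benjamini_hochberg(raw)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.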

get_volcano

 get_volcano (df_res:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
              y_thresh:float=0.05, x_thresh:float=0, n:Optional[int]=None,
              label_changed:bool=True, x_metric:str='Log2FC',
              annotate_volcano:bool=False, filepath:str='', **kwargs:Any)

Creates volcano plot showing -log10(FDR-corrected p-values) vs Log2FC or effect size

	Type	Default	Details
df_res	Union		DataFrame from get_differential_expression with columns [Glycan, Log2FC, p-val, corr p-val]
y_thresh	float	0.05	Corrected p threshold for labeling
x_thresh	float	0	Absolute x metric threshold for labeling
n	Optional	None	Sample size for Bayesian-Adaptive Alpha
label_changed	bool	True	Add text labels to significant points
x_metric	str	Log2FC	x-axis metric: ‘Log2FC’ or ‘Effect size’
annotate_volcano	bool	False	Annotate dots with SNFG images
filepath	str		Path to save plot
kwargs	Any
Returns	None		Displays volcano plot

get_volcano(res)

You're working with a default alpha of 0.05. Set sample size (n = ...) for Bayesian-Adaptive Alpha Adjustment
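
The plot places each glycan at x = Log2FC (or effect size) and y = -log10(corrected p-value). A minimal sketch of those coordinates and of the labeling rule implied by y_thresh and x_thresh follows. `volcano_points` is a hypothetical helper, and the strict inequalities are an assumption; the two rows are copied from the get_differential_expression output above.

```python
from math import log10

def volcano_points(rows, y_thresh=0.05, x_thresh=0.0):
    """Map (name, log2fc, corr_p) rows to (name, x, y, labelled) tuples."""
    points = []
    for name, log2fc, corr_p in rows:
        y = -log10(corr_p)
        # Assumed rule: label when corrected p is below y_thresh and |x| exceeds x_thresh.
        labelled = corr_p < y_thresh and abs(log2fc) > x_thresh
        points.append((name, log2fc, y, labelled))
    return points

rows = [
    ("GalOS(b1-3)GalNAc", -0.930987, 0.005641),
    ("Gal", 0.111858, 0.073880),
]
points = volcano_points(rows)
```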

get_coverage

 get_coverage (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
               filepath:str='')

Visualizes glycan detection frequency across samples with intensity-based ordering

	Type	Details
df	Union	DataFrame with glycans in rows (col 1), abundances in columns
filepath	str	Path to save plot
Returns	None

test_df = pd.concat([test_df.iloc[:, 0], test_df[test_df.columns[1:]].astype(float)], axis = 1)

get_coverage(test_df)
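
Detection frequency can be read as the share of samples in which a glycan has a non-zero abundance. The sketch below computes that share with the standard library. `detection_frequency` is a hypothetical helper with made-up data, and treating "detected" as "greater than zero" is an assumption about how the plot is built.

```python
def detection_frequency(abundances):
    """Fraction of samples in which each glycan has a non-zero abundance."""
    return {
        glycan: sum(v > 0 for v in values) / len(values)
        for glycan, values in abundances.items()
    }

abundances = {  # made-up abundances, one list entry per sample
    "Gal(b1-3)GalNAc": [13.1, 12.7, 0.0, 14.2],
    "Neu5Ac(a2-6)GalNAc": [0.0, 0.0, 4.4, 0.0],
}
freq = detection_frequency(abundances)
```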

get_pca

 get_pca (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
          groups:Union[List[int],pandas.core.frame.DataFrame,NoneType]=None,
          motifs:bool=False, feature_set:List[str]=['known',
          'exhaustive'], pc_x:int=1, pc_y:int=2, color:Optional[str]=None,
          shape:Optional[str]=None, filepath:Union[str,pathlib.Path]='',
          custom_motifs:List[str]=[], transform:Optional[str]=None,
          rarity_filter:float=0.05)

Performs PCA on glycan/motif abundance data with group-based visualization

	Type	Default	Details
df	Union		DataFrame with glycans in rows (col 1), abundances in columns
groups	Union	None	Group labels (e.g., [1,1,1,2,2,2,3,3,3]) or metadata DataFrame with ‘id’ column
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘known’, ‘exhaustive’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
pc_x	int	1	Principal component for x-axis
pc_y	int	2	Principal component for y-axis
color	Optional	None	Column in metadata for color grouping
shape	Optional	None	Column in metadata for shape grouping
filepath	Union		Path to save plot
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
transform	Optional	None	Transformation type: “CLR” or “ALR”
rarity_filter	float	0.05	Min proportion for non-zero values
Returns	None

get_pca(test_df, motifs = True, groups = [1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2])

get_pval_distribution

 get_pval_distribution (df_res:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
                        filepath:Union[str,pathlib.Path]='')

Creates histogram of p-values from differential expression analysis

	Type	Details
df_res	Union	Output DataFrame from get_differential_expression
filepath	Union	Path to save plot
Returns	None

get_pval_distribution(res)

get_ma

 get_ma (df_res:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
         log2fc_thresh:int=1, sig_thresh:float=0.05,
         filepath:Union[str,pathlib.Path]='')

Generates MA plot (mean abundance vs log2 fold change) from differential expression results

	Type	Default	Details
df_res	Union		Output DataFrame from get_differential_expression
log2fc_thresh	int	1	Log2FC threshold for highlighting
sig_thresh	float	0.05	Significance threshold for highlighting
filepath	Union		Path to save plot
Returns	None

get_ma(res)

get_glycanova

 get_glycanova (df:Union[pandas.core.frame.DataFrame,str],
                groups:List[Any], impute:bool=True, motifs:bool=False,
                feature_set:List[str]=['exhaustive', 'known'],
                min_samples:float=0.1, posthoc:bool=True,
                custom_motifs:List[str]=[], transform:Optional[str]=None,
                gamma:float=0.1, custom_scale:float=0,
                random_state:Union[int,numpy.random._generator.Generator,NoneType]=None)

Performs one-way ANOVA with omega-squared effect size calculation and optional Tukey’s HSD post-hoc testing on glycomics data across multiple groups

	Type	Default	Details
df	Union		DataFrame with glycans in rows (col 1) and abundance values in columns
groups	List		Group labels for samples (e.g., [1,1,1,2,2,2,3,3,3])
impute	bool	True	Impute zeros with a Random Forest model
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘exhaustive’, ‘known’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
min_samples	float	0.1	Min fraction of non-zero samples required
posthoc	bool	True	Perform Tukey’s HSD test post-hoc
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
transform	Optional	None	Transformation type: “CLR” or “ALR”
gamma	float	0.1	Uncertainty parameter for CLR transform
custom_scale	float	0	Ratio of total signal in group2/group1 for an informed scale model (or group_idx: mean(group)/min(mean(groups)) signal dict for multivariate)
random_state	Union	None	optional random state for reproducibility
Returns	Tuple		(ANOVA results with F-stats and omega-squared effect sizes, post-hoc results)

test_df2 = glycomics_data_loader.HIV_gagtransfection_O_PMID35112714

anv, ph = get_glycanova(test_df2, [1,1,1,1,2,2,2,2,3,3,3,3], motifs = False)
anv

You're working with an alpha of 0.06364810000741428 that has been adjusted for your sample size of 12.

	Glycan	F statistic	p-val	corr p-val	significant	Effect size
0	Gal(b1-3)[Neu5Ac(a2-6)]GalNAc	3.337354	0.082356	0.255515	False	0.182074
1	Gal(b1-4)GlcNAc(b1-6)[Gal(b1-3)]GalNAc	2.264846	0.159697	0.255515	False	0.107511
2	Neu5Ac(a2-3)Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]Ga...	2.987894	0.101120	0.255515	False	-0.097792
3	Neu5Ac(a2-3)Gal(b1-3)[Neu5Ac(a2-6)]GalNAc	4.902553	0.036295	0.255515	False	0.159186
7	Neu5Ac(a2-3)Gal(b1-4)GlcNAc6S(b1-6)[Neu5Ac(a2-...	2.442388	0.142124	0.255515	False	-0.072202
4	Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-3/6)[GlcNAc(b1-...	1.346603	0.307886	0.368114	False	0.270963
5	Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Gal(b1-3)]Ga...	1.288259	0.322100	0.368114	False	0.031955
6	Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-3)...	0.292927	0.752927	0.752927	False	0.026720
8	Neu5Ac(a2-3)Gal(b1-3)GalNAc	0.000000	1.000000	1.000000	False	0.120779

get_meta_analysis

 get_meta_analysis (effect_sizes:Union[numpy.ndarray,List[float]],
                    variances:Union[numpy.ndarray,List[float]],
                    model:str='fixed', filepath:str='',
                    study_names:List[str]=[])

Performs fixed/random effects meta-analysis using DerSimonian-Laird method for between-study variance estimation, with optional Forest plot visualization

	Type	Default	Details
effect_sizes	Union		List of Cohen’s d/other effect sizes
variances	Union		Associated variance estimates
model	str	fixed	‘fixed’ or ‘random’ effects model
filepath	str		Path to save Forest plot
study_names	List	[]	Names corresponding to each effect size
Returns	Tuple		(combined effect size, two-tailed p-value)

get_meta_analysis([-8.759, -6.363, -5.199, -3.952],
                 [7.061, 4.041, 2.919, 1.968])

(-5.326913553837341, 3.005077298112724e-09)
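
For the default fixed-effects model, the combined effect is an inverse-variance weighted mean, tested with a two-tailed z-test. The sketch below spells that out with the standard library. It is a textbook version written for illustration, not the library's code, and it omits the random-effects (DerSimonian-Laird) branch and the Forest plot. Run on the same inputs as the call above, it should land close to the values shown there.

```python
from math import erfc, sqrt

def fixed_effect_meta(effect_sizes, variances):
    """Inverse-variance fixed-effect pooling with a two-tailed z-test."""
    weights = [1.0 / v for v in variances]
    combined = sum(w * e for w, e in zip(weights, effect_sizes)) / sum(weights)
    se = sqrt(1.0 / sum(weights))  # standard error of the pooled estimate
    z = combined / se
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return combined, p_value

combined, p_value = fixed_effect_meta(
    [-8.759, -6.363, -5.199, -3.952],
    [7.061, 4.041, 2.919, 1.968],
)
```

Studies with smaller variances receive larger weights, so the pooled estimate sits closer to the more precise effect sizes.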

get_time_series

 get_time_series (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
                  impute:bool=True, motifs:bool=False,
                  feature_set:List[str]=['known', 'exhaustive'],
                  degree:int=1, min_samples:float=0.1,
                  custom_motifs:List[str]=[],
                  transform:Optional[str]=None, gamma:float=0.1,
                  custom_scale:Union[float,Dict]=0)

Analyzes time series glycomics data using polynomial regression

	Type	Default	Details
df	Union		DataFrame with sample IDs as ‘sampleID_timepoint_replicate’ in col 1 (e.g., T1_h5_r1)
impute	bool	True	Impute zeros with a Random Forest model
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘known’, ‘exhaustive’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
degree	int	1	Polynomial degree for regression
min_samples	float	0.1	Min fraction of non-zero samples required
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
transform	Optional	None	Transformation type: “CLR” or “ALR”
gamma	float	0.1	Uncertainty parameter for CLR transform
custom_scale	Union	0	Ratio of total signal in group2/group1 for an informed scale model (or group_idx: mean(group)/min(mean(groups)) signal dict for multivariate)
Returns	DataFrame		DataFrame with regression coefficients and FDR-corrected p-values

t_dic = {}
t_dic["ID"] = ["D1_h5_r1", "D1_h5_r2", "D1_h5_r3", "D1_h10_r1", "D1_h10_r2", "D1_h10_r3", "D1_h15_r1", "D1_h15_r2", "D1_h15_r3"]
t_dic["Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Gal(b1-3)]GalNAc"] = [0.33, 0.31, 0.35, 1.51, 1.57, 1.66, 2.11, 2.04, 2.09]
t_dic["Fuc(a1-2)Gal(b1-3)GalNAc"] = [0.78, 1.01, 0.98, 0.88, 1.11, 0.72, 1.22, 1.00, 0.54]
t_dic["Neu5Ac(a2-6)GalNAc"] = [0.11, 0.09, 0.14, 0.02, 0.07, 0.10, 0.11, 0.09, 0.08]
get_time_series(pd.DataFrame(t_dic).set_index("ID").T)

You're working with an alpha of 0.0694557066556809 that has been adjusted for your sample size of 9.

	Glycan	Change	p-val	corr p-val	significant
0	Fuc(a1-2)Gal(b1-3)GalNAc	-0.009300	0.415220	0.633796	False
1	Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Gal(b1-3)]Ga...	0.005395	0.422530	0.633796	False
2	Neu5Ac(a2-6)GalNAc	-0.001835	0.843457	0.843457	False

get_jtk

 get_jtk (df_in:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
          timepoints:int, interval:int, periods:List[int]=[12, 24],
          motifs:bool=False, feature_set:List[str]=['known', 'exhaustive',
          'terminal'], custom_motifs:List[str]=[],
          transform:Optional[str]=None, gamma:float=0.1,
          correction_method:str='two-stage')

Identifies rhythmically expressed glycans using Jonckheere-Terpstra-Kendall algorithm for time series analysis

	Type	Default	Details
df_in	Union		DataFrame with glycans in rows (first column), then groups arranged by ascending timepoints
timepoints	int		Number of timepoints (each must have same number of replicates)
interval	int		Time units between experimental timepoints
periods	List	[12, 24]	Timepoints per cycle to test
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘known’, ‘exhaustive’, ‘terminal’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
transform	Optional	None	Transformation type: “CLR” or “ALR”
gamma	float	0.1	Uncertainty parameter for CLR transform
correction_method	str	two-stage	Multiple testing correction method
Returns	DataFrame		DataFrame with JTK results: adjusted p-values, period length, lag phase, amplitude

t_dic = {}
t_dic["Neu5Ac(a2-3)Gal(b1-3)GalNAc"] = [0.433138901, 0.149729209, 0.358018822, 0.537641256, 1.526963756, 1.349986672, 0.75156406, 0.736710183]
t_dic["Gal(b1-3)GalNAc"] = [0.919762334, 0.760237184, 0.725566662, 0.459945797, 0.523801515, 0.695106926, 0.627632047, 1.183511209]
t_dic["Gal(b1-3)[Neu5Ac(a2-6)]GalNAc"] = [0.533138901, 0.119729209, 0.458018822, 0.637641256, 1.726963756, 1.249986672, 0.55156406, 0.436710183]
t_dic["Fuc(a1-2)Gal(b1-3)GalNAc"] = [3.862169504, 5.455032837, 3.858163289, 5.614650335, 3.124254095, 4.189550337, 4.641831312, 4.19538484]
tps = 8  # number of timepoints in experiment
periods = [8]  # potential cycles to test
interval = 3  # units of time between experimental timepoints
t_df = pd.DataFrame(t_dic).T
t_df.columns = ["T3", "T6", "T9", "T12", "T15", "T18", "T21", "T24"]
get_jtk(t_df.reset_index(), tps, interval, periods = periods)

You're working with an alpha of 0.22004505213567527 that has been adjusted for your sample size of 1.
Significance inflation detected. The CLR/ALR transformation possibly cannot handle this dataset. Consider running again with a higher gamma value. Proceed with caution; for now switching to Bonferroni correction to be conservative about this.

	Molecule_Name	Adjusted_P_value	Period_Length	Lag_Phase	Amplitude	significant
0	Gal(b1-3)GalNAc	0.037499	8	12	0.785714	False
1	Gal(b1-3)[Neu5Ac(a2-6)]GalNAc	0.037499	8	12	0.785714	False
2	Neu5Ac(a2-3)Gal(b1-3)GalNAc	0.431049	8	9	0.500000	False
3	Fuc(a1-2)Gal(b1-3)GalNAc	0.694185	8	9	0.428571	False

get_jtk(t_df.reset_index(), tps, interval, periods = periods, motifs = True, feature_set = ['terminal'])

You're working with an alpha of 0.22004505213567527 that has been adjusted for your sample size of 1.

	Molecule_Name	Adjusted_P_value	Period_Length	Lag_Phase	Amplitude	significant
0	Terminal_Neu5Ac(a2-6)	0.059080	8	12	0.642857	True
1	Terminal_Gal(b1-3)	0.059080	8	3	0.642857	True
2	Terminal_Neu5Ac(a2-?)	0.059080	8	12	0.714286	True
3	Terminal_Neu5Ac(a2-3)	0.216933	8	9	0.428571	True
4	Terminal_Fuc(a1-2)	0.386476	8	3	0.285714	False

get_biodiversity

 get_biodiversity (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
                   group1:List[Union[str,int]],
                   group2:List[Union[str,int]],
                   metrics:List[str]=['alpha', 'beta'], motifs:bool=False,
                   feature_set:List[str]=['exhaustive', 'known'],
                   custom_motifs:List[str]=[], paired:bool=False,
                   permutations:int=999, transform:Optional[str]=None,
                   gamma:float=0.1, custom_scale:Union[float,Dict]=0,
                   random_state:Union[int,numpy.random._generator.Generator,NoneType]=None)

Calculates alpha (Shannon/Simpson) and beta (ANOSIM/PERMANOVA) diversity measures from glycomics data

	Type	Default	Details
df	Union		DataFrame with glycans in rows (col 1), abundances in columns
group1	List		First group column indices or group labels
group2	List		Second group indices or additional group labels
metrics	List	[‘alpha’, ‘beta’]	Diversity metrics to calculate
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘exhaustive’, ‘known’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
paired	bool	False	Whether samples are paired
permutations	int	999	Number of permutations for ANOSIM/PERMANOVA
transform	Optional	None	Transformation type: “CLR” or “ALR”
gamma	float	0.1	Uncertainty parameter for CLR transform
custom_scale	Union	0	Ratio of total signal in group2/group1 for an informed scale model (or group_idx: mean(group)/min(mean(groups)) signal dict for multivariate)
random_state	Union	None	optional random state for reproducibility
Returns	DataFrame		DataFrame with diversity indices and test statistics

res = get_biodiversity(test_df, group1 = [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39],
                                  group2 = [2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40], motifs = True, paired = True)
res

You're working with an alpha of 0.044390023979542614 that has been adjusted for your sample size of 40.

	Metric	Group1 mean	Group2 mean	p-val	Effect size	corr p-val	significant
0	Beta diversity (ANOSIM)	NaN	NaN	0.004004	0.139329	0.004004	True
1	Beta diversity (PERMANOVA)	NaN	NaN	0.004004	43.482858	0.004004	True
2	simpson_diversity	0.869624	0.866328	0.004991	-0.709840	0.004991	True
3	shannon_diversity	2.208122	2.185089	0.008222	-0.659637	0.008222	True
4	species_richness	15.000000	15.000000	1.000000	0.000000	1.000000	False
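
To make the alpha-diversity rows above easier to interpret, here is a minimal sketch of the three indices computed on a single abundance profile. It assumes the natural-log form of Shannon entropy and the Gini-Simpson form (1 - sum of squared proportions) of Simpson diversity; the function names are illustrative and are not glycowork's internal implementation.

```python
import math

def shannon_diversity(abundances):
    """Shannon entropy H = -sum(p * ln p), skipping zero-abundance entries."""
    total = sum(abundances)
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)

def simpson_diversity(abundances):
    """Gini-Simpson index: 1 - sum(p^2)."""
    total = sum(abundances)
    return 1.0 - sum((a / total) ** 2 for a in abundances)

def species_richness(abundances):
    """Number of features with non-zero abundance."""
    return sum(1 for a in abundances if a > 0)

# Hypothetical relative abundances of five glycans in one sample
sample = [40.0, 30.0, 20.0, 10.0, 0.0]
print(shannon_diversity(sample), simpson_diversity(sample), species_richness(sample))
```

A more even profile raises both the Shannon and the Simpson index, while richness only counts how many features are present.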

get_SparCC

 get_SparCC (df1:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
             df2:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
             motifs:bool=False, feature_set:List[str]=['known',
             'exhaustive'], custom_motifs:List[str]=[],
             transform:Optional[str]=None, gamma:float=0.1,
             partial_correlations:bool=False)

Calculates SparCC (Sparse Correlations for Compositional Data) between two matching datasets (e.g., glycomics)

	Type	Default	Details
df1	Union		First DataFrame with glycans in rows (col 1) and abundances in columns
df2	Union		Second DataFrame with same format as df1
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘known’, ‘exhaustive’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
transform	Optional	None	Transformation type: “CLR” or “ALR”
gamma	float	0.1	Uncertainty parameter for CLR transform
partial_correlations	bool	False	Use regularized partial correlations
Returns	Tuple		(Spearman correlation matrix, FDR-corrected p-value matrix)

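import pandas as pd
import seaborn as sns
from glycowork.glycan_data.loader import glycomics_data_loader
from glycowork.motif.analysis import get_SparCC

# Load the matching N- and O-glycomics time series
df1 = glycomics_data_loader.time_series_N_PMID32149347
df2 = glycomics_data_loader.time_series_O_PMID32149347
# Keep only the sample IDs present in both datasets
df1 = pd.merge(df1, df2[['ID']], on = 'ID', how = 'inner')
df2 = pd.merge(df2, df1[['ID']], on = 'ID', how = 'inner')
# Transpose so that glycans are in rows and samples are in columns
df1 = df1.set_index(df1.columns.tolist()[0]).T.reset_index()
df2 = df2.set_index(df2.columns.tolist()[0]).T.reset_index()

corr, pval = get_SparCC(df1, df2, motifs = True, transform = "CLR")
sns.clustermap(corr)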

You're working with an alpha of 0.04787928055709467 that has been adjusted for your sample size of 31.
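
The `transform = "CLR"` option refers to the centered log-ratio transform for compositional data. The sketch below shows the plain textbook CLR on one sample: the log of each part minus the mean of the logs. It is an illustration only; it does not model the `gamma` uncertainty parameter, and it assumes all values are strictly positive (zeros would need to be imputed first).

```python
import math

def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical relative abundances of three glycans in one sample
sample = [50.0, 30.0, 20.0]
transformed = clr(sample)
print(transformed)
```

CLR values sum to zero within a sample and are unchanged when the whole sample is rescaled, which is why the transform suits relative abundance data.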

get_roc

 get_roc (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
          group1:List[Union[str,int]], group2:List[Union[str,int]],
          motifs:bool=False, feature_set:List[str]=['known',
          'exhaustive'], paired:bool=False, impute:bool=True,
          min_samples:float=0.1, custom_motifs:List[str]=[],
          transform:Optional[str]=None, gamma:float=0.1,
          custom_scale:Union[float,Dict]=0,
          filepath:Union[str,pathlib.Path]='', multi_score:bool=False,
          random_state:Union[int,numpy.random._generator.Generator,
          NoneType]=None)

Calculates ROC curves and AUC scores for glycans/motifs or multi-glycan classifiers

	Type	Default	Details
df	Union		DataFrame with glycans in rows (col 1), abundances in columns
group1	List		First group indices/names
group2	List		Second group indices/names
motifs	bool	False	Analyze motifs instead of sequences
feature_set	List	[‘known’, ‘exhaustive’]	Feature sets to use; exhaustive, known, terminal1, terminal2, terminal3, chemical, graph, custom, size_branch
paired	bool	False	Whether samples are paired
impute	bool	True	Impute zeros with a Random Forest model
min_samples	float	0.1	Minimum fraction of non-zero samples required
custom_motifs	List	[]	Custom motifs if using ‘custom’ feature set
transform	Optional	None	Transformation type: “CLR” or “ALR”
gamma	float	0.1	Uncertainty parameter for CLR transform
custom_scale	Union	0	Ratio of total signal in group2/group1 for an informed scale model (or group_idx: mean(group)/min(mean(groups)) signal dict for multivariate)
filepath	Union		Path to save ROC plot
multi_score	bool	False	Find best multi-glycan score
random_state	Union	None	Optional random state for reproducibility
Returns	Union		(Feature scores with ROC AUC values)

get_roc(test_df, group1 = [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39],
                                  group2 = [2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40], motifs = True, paired = True)

[('GlcNAc6S(b1-6)GalNAc', 0.7599999999999999),
 ('Internal_LacNAc_type2', 0.715),
 ('Neu5Ac(a2-3)Gal', 0.6699999999999999),
 ('Neu5Ac', 0.65),
 ('Gal', 0.6),
 ('Gal(b1-3)GalNAc', 0.5874999999999999),
 ('GalNAc', 0.585),
 ('Mucin_elongated_core2', 0.4975),
 ('Oglycan_core6', 0.49250000000000005),
 ('Disialyl_T_antigen', 0.465),
 ('Neu5Ac(a2-6)GalNAc', 0.45),
 ('Neu5Ac(a2-8)Neu5Ac', 0.37000000000000005),
 ('H_antigen_type2', 0.2625),
 ('Terminal_LacNAc_type2', 0.2625),
 ('GalOS(b1-3)GalNAc', 0.2475)]
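
As a reading aid for the scores above, the following sketch computes a ROC AUC for a single feature as the probability that a randomly chosen value from one group exceeds a randomly chosen value from the other, counting ties as one half. Treating `group2` as the positive class is an assumption made for illustration, and the function is not glycowork's implementation.

```python
def roc_auc(group1_values, group2_values):
    """AUC as P(group2 value > group1 value), with ties counted as 0.5."""
    wins = 0.0
    for pos in group2_values:
        for neg in group1_values:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(group1_values) * len(group2_values))

# Hypothetical abundances of one motif in two groups of samples
control = [1.0, 2.0, 3.0, 4.0]
tumor = [2.5, 3.5, 4.5, 5.5]
print(roc_auc(control, tumor))
```

Under this definition, a value of 0.5 means the feature does not separate the groups, and values below 0.5 mean the feature tends to be lower in the group treated as positive.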

get_lectin_array

 get_lectin_array (df:Union[pandas.core.frame.DataFrame,str,pathlib.Path],
                   group1:List[Union[str,int]],
                   group2:List[Union[str,int]], paired:bool=False,
                   transform:str='')

Analyzes lectin microarray data by mapping lectin binding patterns to glycan motifs, calculating Cohen’s d effect sizes between groups and clustering results by significance

	Type	Default	Details
df	Union		DataFrame with samples as rows and lectins as columns, first column containing sample IDs
group1	List		First group indices/names
group2	List		Second group indices/names
paired	bool	False	Whether samples are paired
transform	str		Optional log2 transformation
Returns	DataFrame		DataFrame with altered glycan motifs, supporting lectins, and effect sizes

lectin_df = lectin_array_data_loader.A549_influenza_PMID33046650
get_lectin_array(lectin_df, [5,6,7], [8,9,10])

Lectin "Ab-LeB-1" is not found in our annotated lectin library and is excluded from analysis.
Lectin "APA" is not found in our annotated lectin library and is excluded from analysis.
Lectin "APP" is not found in our annotated lectin library and is excluded from analysis.
Lectin "Blood Group B [CLCP-19B]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "Blood Group H2" is not found in our annotated lectin library and is excluded from analysis.
Lectin "CA19-9 [121SLE]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "CCA" is not found in our annotated lectin library and is excluded from analysis.
Lectin "CD15 [ICRF29-2]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "CD15 [MY-1]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "CD15 [SP-159]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "Forssman" is not found in our annotated lectin library and is excluded from analysis.
Lectin "IAA" is not found in our annotated lectin library and is excluded from analysis.
Lectin "IRA" is not found in our annotated lectin library and is excluded from analysis.
Lectin "Le X [P12]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "Lewis A [7LE]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "Lewis B [218]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "Lewis Y [F3]" is not found in our annotated lectin library and is excluded from analysis.
Lectin "LFA" is not found in our annotated lectin library and is excluded from analysis.
Lectin "LPA" is not found in our annotated lectin library and is excluded from analysis.
Lectin "MNA-M " is not found in our annotated lectin library and is excluded from analysis.
Lectin "MUC5Ac Ab" is not found in our annotated lectin library and is excluded from analysis.
Lectin "PMA" is not found in our annotated lectin library and is excluded from analysis.
Lectin "PTA_1" is not found in our annotated lectin library and is excluded from analysis.
Lectin "PTA_2" is not found in our annotated lectin library and is excluded from analysis.
Lectin "SNA-S" is not found in our annotated lectin library and is excluded from analysis.
Lectin "SNA-V" is not found in our annotated lectin library and is excluded from analysis.
Lectin "VFA" is not found in our annotated lectin library and is excluded from analysis.

	motif	named_motifs	lectin(s)	change	score	significance
39	Neu5Ac(a2-6)Gal(b1-3)GlcNAc	[Internal_LacNAc_type1]	PSL, SNA, TJA-I, BDA, BPA, WGA_1, WGA_2	down	11.32	highly significant
38	Neu5Ac(a2-6)Gal(b1-4)GlcNAc	[Internal_LacNAc_type2]	PSL, SNA, TJA-I, BDA, BPA, ECA, RCA120, Ricin ...	down	10.81	highly significant
7	Man(a1-2)	[]	ASA, Con A, CVN, HHL, SVN_1, GRFT, SVN_2, SNA-...	up	4.83	moderately significant
14	Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Gal(b1-4)GlcNAc...	[Chitobiose, Trimannosylcore, Terminal_LacNAc_...	CA, CAA, DSA_1, DSA_2, DSA_3, AMA, BDA, BPA, C...	up	3.51	moderately significant
4	Gal(b1-3)GalNAc	[]	ACA, AIA, MPA, PNA_1, PNA_2, BDA, BPA	up	3.48	moderately significant
43	Neu5Ac(a2-6)GalNAc(b1-4)GlcNAc	[Internal_LacdiNAc_type2]	SNA, CSA, SBA, VVA_1, VVA_2, WFA, BPA, ECA, ST...	down	2.86	moderately significant
10	Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-4)][G...	[Chitobiose, Trimannosylcore, Terminal_LacNAc_...	Blackbean, Calsepa, PHA-E_1, PHA-E_2, AMA, BDA...	up	2.70	moderately significant
16	Fuc(a1-2)Gal(b1-3)GalNAc(b1-4)[Neu5Ac(a2-3)]Ga...	[Internal_LacNAc_type2, H_type3]	Cholera Toxin, AAA, AAL, ACA, AIA, AOL, BDA, B...	up	2.51	moderately significant
15	Gal(b1-3)GalNAc(b1-4)[Neu5Ac(a2-3)]Gal(b1-4)Gl...	[Internal_LacNAc_type2]	Cholera Toxin, ACA, AIA, BDA, BPA, CSA, ECA, L...	up	2.46	moderately significant
47	GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)]Ma...	[Chitobiose, Trimannosylcore, core_fucose]	TL, AAL, AMA, AOL, Con A, GNA, GNL, HHL, LcH, ...	up	2.36	moderately significant
18	Man(a1-6)	[]	Con A, GNA, GNL, HHL, NPA, SNA-II, UDA	up	2.30	moderately significant
17	Man(a1-3)	[]	Con A, GNA, GNL, HHL, NPA, SNA-II, UDA	up	2.30	moderately significant
22	Gal(b1-4)GlcNAc(b1-2)[Gal(b1-4)GlcNAc(b1-4)]Ma...	[Chitobiose, Trimannosylcore, Terminal_LacNAc_...	DSA_1, DSA_2, DSA_3, AMA, BDA, Blackbean, BPA,...	up	2.05	moderately significant
46	Fuc(a1-2)Gal(b1-3)GalNAc	[H_type3]	TJA-II, AAA, AAL, ACA, AIA, AOL, BDA, BPA, MPA...	up	1.96	moderately significant
3	Fuc(a1-6)	[]	AAL, AOL, LcH, PSA	up	1.70	moderately significant
34	Neu5Ac(a2-3)Gal(b1-3)GalNAc	[]	MAL-II, ACA, AIA, BDA, BPA, MPA, PNA_1, PNA_2,...	up	1.59	moderately significant
6	Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	[Chitobiose, Trimannosylcore]	AMA, Con A, GNA, GNL, HHL, NPA, SNA-II, UDA, W...	up	1.58	moderately significant
11	GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)[GlcNAc(b1-6...	[Chitobiose, Trimannosylcore]	Blackbean, PHA-L, AMA, Con A, GNA, GNL, HHL, N...	up	1.44	moderately significant
42	GlcNAc(b1-2)[GlcNAc(b1-6)]Man(a1-6)[GlcNAc(b1-...	[Chitobiose, Trimannosylcore, bisectingGlcNAc]	RPA, AMA, Blackbean, Con A, GNA, GNL, HHL, NPA...	up	1.40	moderately significant
41	GlcNAc(b1-2)[GlcNAc(b1-4)]Man(a1-3)[GlcNAc(b1-...	[Chitobiose, Trimannosylcore, bisectingGlcNAc]	RPA, AMA, Con A, GNA, GNL, HHL, NPA, SNA-II, U...	up	1.36	moderately significant
23	Gal(b1-4)GlcNAc	[Terminal_LacNAc_type2]	ECA, RCA120, Ricin B Chain, SJA, BDA, BPA	up	1.05	low significance
5	GlcNAc(b1-3)GalNAc	[]	AIA, UEA-II, WGA_1, WGA_2	up	0.86	low significance
26	Gal(a1-3)	[]	GS-I_1, GS-I_2, GS-I_3, GS-I_4, MNA-G, PA-IL	up	0.83	low significance
27	Gal(a1-4)	[]	GS-I_1, GS-I_2, GS-I_3, GS-I_4, MNA-G, PA-IL	up	0.83	low significance
30	Gal(b1-4)GlcNAc(b1-3)	[Terminal_LacNAc_type2]	LEA_1, LEA_2, STA, BDA, BPA, ECA, RCA120, Rici...	up	0.54	low significance
25	Gal(a1-3)Gal	[]	EEA, EEL, MOA, GS-I_1, GS-I_2, GS-I_3, GS-I_4,...	up	0.51	low significance
33	Neu5Ac(a2-3)Gal(b1-4)GlcNAc	[Internal_LacNAc_type2]	MAA_1, MAA_2, MAL-I, BDA, BPA, ECA, RCA120, Ri...	up	0.49	low significance
37	Gal(a1-3)GalNAc	[]	MOA, EEA, EEL, GS-I_1, GS-I_2, GS-I_3, GS-I_4,...	up	0.46	low significance
20	GalNAc(a1-4)	[]	GHA, HAA, HPA, CSA, GS-I_1, GS-I_2, GS-I_3, GS...	up	0.39	low significance
19	GalNAc(a1-3)	[]	GHA, HAA, HPA, CSA, GS-I_1, GS-I_2, GS-I_3, GS...	up	0.39	low significance
21	GalNAc(a1-3)GalNAc(b1-3)	[]	DBA, SBA, CSA, GHA, HAA, HPA, VVA_1, VVA_2, WF...	up	0.25	low significance
24	GalNAc(b1-4)GlcNAc	[Terminal_LacdiNAc_type2]	ECA, STA, CSA, SBA, VVA_1, VVA_2, WFA, BPA, WG...	up	0.20	low significance
44	Fuc(a1-2)Gal(b1-4)GalNAc(b1-3)	[]	SNA-II, AAA, AAL, AOL, BDA, BPA, CSA, SBA, VVA...	up	0.16	low significance
13	GalNAc(b1-4)	[]	CSA, SBA, VVA_1, VVA_2, WFA, BPA, WGA_1, WGA_2	up	0.13	low significance
12	GalNAc(b1-3)	[]	CSA, SBA, VVA_1, VVA_2, WFA, BPA, WGA_1, WGA_2	up	0.13	low significance
40	Fuc(a1-2)Gal(b1-4)GlcNAc	[H_antigen_type2, Internal_LacNAc_type2]	PTL-II, TJA-II, UEA-I, UEA-II, AAA, AAL, AOL, ...	up	0.13	low significance
28	GlcNAc(a1-3)	[]	HAA, HPA, WGA_1, WGA_2	up	0.12	low significance
29	GlcNAc(a1-4)	[]	HAA, HPA, WGA_1, WGA_2	up	0.12	low significance
32	Gal3S(b1-4)GlcNAc	[]	MAA_1, MAA_2, MAL-I, MAL-II	down	0.12	low significance
0	Fuc(a1-2)	[]	AAA, AAL, AOL	up	0.09	low significance
36	Gal3S(b1-4)	[]	MAL-II	down	0.08	low significance
35	Gal3S(b1-3)	[]	MAL-II	down	0.08	low significance
49	Fuc(a1-2)Gal(b1-4)GalNAc	[]	UEA-II, AAA, AAL, AOL, BDA, BPA	up	0.07	low significance
9	Gal(b1-4)	[]	BDA, BPA	up	0.05	low significance
8	Gal(b1-3)	[]	BDA, BPA	up	0.05	low significance
1	Fuc(a1-3)	[]	AAL, AOL, Lotus	down	0.03	low significance
2	Fuc(a1-4)	[]	AAL, AOL	down	0.03	low significance
31	GlcNAc(b1-4)GlcNAc(b1-4)	[Chitobiose]	LEA_1, LEA_2, WGA_1, WGA_2	down	0.01	low significance
50	GlcNAc(b1-3)	[]	WGA_1, WGA_2	down	0.01	low significance
51	GlcNAc(b1-4)	[]	WGA_1, WGA_2	down	0.01	low significance
45	GlcNAc(b1-4)GlcNAc(b1-4)GlcNAc(b1-4)	[Chitobiose]	STA, LEA_1, LEA_2, WGA_1, WGA_2	down	0.00	low significance
48	GlcNAc(b1-3)Gal	[]	UEA-II, WGA_1, WGA_2	up	0.00	low significance
52	Neu5Ac(a2-3)	[]	WGA_1, WGA_2	down	0.00	low significance
53	Neu5Ac(a2-6)	[]	WGA_1, WGA_2	down	0.00	low significance
54	Neu5Ac(a2-8)	[]	WGA_1, WGA_2	down	0.00	low significance
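
The description above mentions Cohen's d effect sizes between groups. A minimal sketch of the standard pooled-standard-deviation form for one lectin is shown below; the sign convention (group2 minus group1) and the function name are assumptions for illustration, and how glycowork aggregates per-lectin effect sizes into the `score` column is not covered here.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Cohen's d with a pooled standard deviation (group2 mean minus group1 mean)."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group2) - statistics.mean(group1)) / pooled_sd

# Hypothetical binding signals of one lectin in two groups of three samples
uninfected = [1.0, 2.0, 3.0]
infected = [3.0, 4.0, 5.0]
print(cohens_d(uninfected, infected))
```

A positive value here means the signal is higher in the second group, expressed in units of the pooled standard deviation.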

get_glycoshift_per_site

 get_glycoshift_per_site
                          (df:Union[pandas.core.frame.DataFrame,str,
                          pathlib.Path], group1:List[Union[str,int]],
                          group2:List[Union[str,int]], paired:bool=False,
                          impute:bool=True, min_samples:float=0.2,
                          gamma:float=0.1,
                          custom_scale:Union[float,Dict]=0,
                          random_state:Union[int,
                          numpy.random._generator.Generator,NoneType]=None)

Analyzes site-specific glycosylation changes in glycoproteomics data using generalized linear models (GLM) with compositional data normalization

	Type	Default	Details
df	Union		DataFrame with rows formatted as ‘protein_site_composition’ in col 1, abundances in remaining cols
group1	List		First group indices/names or group labels for multi-group
group2	List		Second group indices/names
paired	bool	False	Whether samples are paired
impute	bool	True	Impute zeros with a Random Forest model
min_samples	float	0.2	Minimum fraction of non-zero samples required
gamma	float	0.1	Uncertainty parameter for CLR transform
custom_scale	Union	0	Ratio of total signal in group2/group1 for an informed scale model (or group_idx: mean(group)/min(mean(groups)) signal dict for multivariate)
random_state	Union	None	Optional random state for reproducibility
Returns	DataFrame		DataFrame with GLM coefficients and FDR-corrected p-values

df_milk = glycoproteomics_data_loader.human_milk_N_PMID34087070

get_glycoshift_per_site(df_milk, ['Colostrum1', 'Colostrum2', 'Colostrum3'], ['Mature1', 'Mature2', 'Mature3'])

You're working with an alpha of 0.07862467893233027 that has been adjusted for your sample size of 6.

	Condition_coefficient	Condition_corr_pval	Condition_significant	high_Man_Condition_coefficient	high_Man_Condition_corr_pval	high_Man_Condition_significant	antennary_Fuc_Condition_coefficient	antennary_Fuc_Condition_corr_pval	antennary_Fuc_Condition_significant	complex_Condition_coefficient	...	Neu5Ac_Condition_significant	Hex_Condition_coefficient	Hex_Condition_corr_pval	Hex_Condition_significant	HexNAc_Condition_coefficient	HexNAc_Condition_corr_pval	HexNAc_Condition_significant	hybrid_Condition_coefficient	hybrid_Condition_corr_pval	hybrid_Condition_significant
sp\|P47710\|CASA1_69	0.351306	0.000000e+00	True	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	True	-1.543583	0.000000e+00	True	1.405226	0.000000e+00	True	0.351306	0.000000e+00	True
sp\|P01024\|CO3_85	-13.741464	0.000000e+00	True	-13.741464	0.000000	True	0.000000	1.000000e+00	False	0.000000	...	False	12.821387	0.000000e+00	True	-27.482928	0.000000e+00	True	-13.741464	0.000000e+00	True
sp\|P10909\|CLUS_103	-0.148812	0.000000e+00	True	0.000000	1.000000	False	0.000000	1.000000e+00	False	4.461066	...	True	-0.744062	0.000000e+00	True	-0.595249	0.000000e+00	True	-4.609878	0.000000e+00	True
sp\|Q13410\|BT1A1_55	-13.032160	1.632815e-86	True	0.000000	1.000000	False	-13.621108	2.312512e-126	True	-4.158213	...	True	-0.631943	1.290904e-18	True	12.400217	4.919813e-91	True	-8.873947	1.136641e-42	True
sp\|P01011\|AACT_106	-0.027180	8.973881e-15	True	0.000000	1.000000	False	2.529792	0.000000e+00	True	-2.556972	...	True	-0.135901	8.973881e-15	True	-0.108721	8.973881e-15	True	2.529792	0.000000e+00	True
sp\|P00709\|LALBA_90	-1.256621	1.737034e-08	True	0.000000	1.000000	False	-1.295395	5.342502e-01	False	-0.554787	...	True	3.643645	1.656325e-06	True	-5.026485	1.737034e-08	True	-0.701834	4.151344e-01	False
sp\|P02749\|APOH_253	-0.002492	2.356643e-06	True	0.000000	1.000000	False	0.000000	1.000000e+00	False	-0.002492	...	True	-0.012460	2.356643e-06	True	-0.009968	2.356643e-06	True	0.000000	1.000000e+00	False
sp\|P00738\|HPT_241	0.001144	1.133736e-04	True	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.001144	...	True	0.005722	1.133736e-04	True	0.004578	1.133736e-04	True	0.000000	1.000000e+00	False
sp\|P02765\|FETUA_156	-0.002672	3.394518e-03	True	0.000000	1.000000	False	0.000000	1.000000e+00	False	-0.002672	...	True	-0.013358	3.394518e-03	True	-0.010686	3.394518e-03	True	0.000000	1.000000e+00	False
sp\|P01871\|IGHM_46	-0.000319	7.588850e-02	True	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	True	-0.001594	7.588850e-02	True	-0.001275	7.588850e-02	True	-0.000319	9.443902e-02	False
sp\|Q08431\|MFGM_238	0.153221	1.005956e-01	False	-0.230342	0.970588	False	0.000000	1.000000e+00	False	0.000000	...	True	-0.010198	7.085456e-01	False	-0.009176	8.710687e-01	False	0.153221	1.239337e-01	False
sp\|P25311\|ZA2G_109	0.009796	1.359146e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	-0.263924	...	True	0.048980	1.482705e-01	False	0.039184	1.359146e-01	False	0.273720	7.532107e-02	True
sp\|P08571\|CD14_151	0.001329	3.366071e-01	False	0.001329	0.970588	False	0.000000	1.000000e+00	False	0.000000	...	False	0.007973	2.734933e-01	False	0.002658	3.125637e-01	False	0.001329	4.084166e-01	False
sp\|P10909\|CLUS_291	-0.000684	4.277733e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	False	-0.003422	3.522839e-01	False	-0.002053	3.743016e-01	False	-0.000684	4.471657e-01	False
sp\|P06858\|LIPL_70	-0.000554	5.525567e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	-0.000554	...	False	-0.002768	4.362290e-01	False	-0.002215	4.875500e-01	False	0.000000	1.000000e+00	False
sp\|P07602\|SAP_101	-0.001426	5.635031e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	False	-0.007129	4.508025e-01	False	-0.005704	5.008917e-01	False	-0.001426	6.311235e-01	False
sp\|P02788\|TRFL_156	-3.509734	6.172068e-01	False	0.000000	1.000000	False	0.781505	9.117647e-01	False	-0.558968	...	False	2.716540	2.734301e-01	False	-4.153083	9.749206e-02	False	-2.950766	4.471657e-01	False
sp\|P02788\|TRFL_497	0.282217	6.343135e-01	False	0.000000	1.000000	False	1.366091	9.117647e-01	False	-11.362849	...	False	1.888504	2.734301e-01	False	1.128866	5.708822e-01	False	-13.834092	5.356118e-07	True
sp\|P07602\|SAP_215	-0.001385	6.588839e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	False	-0.002770	6.108151e-01	False	-0.002770	6.120552e-01	False	0.000000	1.000000e+00	False
sp\|P10909\|CLUS_374	0.000707	6.588839e-01	False	0.000000	1.000000	False	0.000707	9.117647e-01	False	0.000000	...	False	0.003535	6.120552e-01	False	0.002828	6.120552e-01	False	0.000707	8.058303e-01	False
sp\|P0C0L5\|CO4B_HUMAN/sp\|P0C0L4\|CO4A	0.000433	6.588839e-01	False	0.000433	0.970588	False	0.000000	1.000000e+00	False	0.000000	...	False	0.003901	6.108151e-01	False	0.000867	6.021829e-01	False	0.000433	7.456105e-01	False
sp\|P02790\|HEMO_453	0.000962	6.588839e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000962	...	False	0.004810	6.108151e-01	False	0.003848	6.021829e-01	False	0.000000	1.000000e+00	False
sp\|P01876\|IGHA1_340	4.689804	6.588839e-01	False	3.339666	0.970588	False	0.000000	1.000000e+00	False	-4.685317	...	True	-0.676016	6.927812e-01	False	-0.296792	7.386504e-01	False	-2.193530	6.529675e-01	False
sp\|P0DOX2\|IGA2_HUMAN/sp\|P01877\|IGHA2	-1.970855	6.754175e-01	False	-4.157389	0.970588	False	0.000000	1.000000e+00	False	-5.021428	...	True	1.473677	2.656424e-01	False	-0.341289	6.021829e-01	False	-3.149798	1.501009e-01	False
sp\|P07602\|SAP_426	0.001044	6.862599e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	False	0.005222	6.862599e-01	False	0.002089	6.862599e-01	False	0.001044	8.235294e-01	False
sp\|P01877\|IGHA2_327	-0.727920	7.386459e-01	False	3.881761	0.970588	False	0.000000	1.000000e+00	False	0.000000	...	False	0.217980	9.310474e-01	False	-1.455839	7.386459e-01	False	-0.727920	8.235294e-01	False
sp\|P19652\|A1AG2_HUMAN/sp\|P02763\|A1AG1	-0.000394	7.493023e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	-0.000394	...	False	-0.001971	7.493023e-01	False	-0.001577	7.493023e-01	False	0.000000	1.000000e+00	False
sp\|P01833\|PIGR_186	-0.001645	7.727110e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	-0.027002	...	False	0.017131	6.108151e-01	False	-0.006581	7.727110e-01	False	0.025357	8.058303e-01	False
sp\|P01833\|PIGR_499	-2.947551	7.916424e-01	False	0.000000	1.000000	False	1.517432	9.117647e-01	False	-1.813514	...	False	-0.871071	6.542124e-01	False	2.856328	3.095250e-01	False	-2.813763	7.196543e-01	False
sp\|Q08380\|LG3BP_125	-0.000227	8.418318e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	False	-0.001135	8.418318e-01	False	-0.000908	8.418318e-01	False	-0.000227	8.418318e-01	False
sp\|P01591\|IGJ_71	0.986725	8.438826e-01	False	0.000000	1.000000	False	-1.068357	9.117647e-01	False	-0.341331	...	False	-0.877081	4.362290e-01	False	0.482604	5.351719e-01	False	1.031008	6.529675e-01	False
sp\|P01833\|PIGR_421	0.364442	8.577440e-01	False	0.000000	1.000000	False	1.445412	9.117647e-01	False	0.000000	...	False	0.129271	8.937029e-01	False	-0.455849	6.734337e-01	False	0.364442	8.577440e-01	False
sp\|P01833\|PIGR_469	0.917714	8.697268e-01	False	0.000000	1.000000	False	8.789078	9.836502e-04	True	14.562257	...	True	-2.408593	2.656424e-01	False	2.059357	3.386851e-01	False	6.868686	7.890305e-02	False
sp\|P10909\|CLUS_86	-0.000104	9.648675e-01	False	0.000000	1.000000	False	0.000000	1.000000e+00	False	0.000000	...	False	-0.000518	9.648675e-01	False	-0.000414	9.648675e-01	False	-0.000104	9.648675e-01	False

34 rows × 27 columns

annotate

extract curated motifs, graph features, and sequence features from glycan sequences

annotate_glycan

 annotate_glycan (glycan:Union[str,networkx.classes.digraph.DiGraph],
                  motifs:Optional[pandas.core.frame.DataFrame]=None,
                  termini_list:List=[],
                  gmotifs:Optional[List[networkx.classes.digraph.DiGraph]]=None)

Counts occurrences of known motifs in a glycan structure using subgraph isomorphism

	Type	Default	Details
glycan	Union		IUPAC-condensed glycan sequence or NetworkX graph
motifs	Optional	None	Motif dataframe (name + sequence); defaults to motif_list
termini_list	List	[]	Monosaccharide positions: ‘terminal’, ‘internal’, or ‘flexible’
gmotifs	Optional	None	Precalculated motif graphs for speed
Returns	DataFrame		DataFrame with motif counts for the glycan

annotate_glycan("Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-2)Man(a1-3)[Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc")

motif_name	Terminal_LewisX	Internal_LewisX	LewisY	SialylLewisX	SulfoSialylLewisX	Terminal_LewisA	Internal_LewisA	LewisB	SialylLewisA	SulfoLewisA	...	Mucin_elongated_core2	Fucoidan	Alginate	FG	XX	Difucosylated_core	GalFuc_core	DisialylLewisC	RM2	DisialylLewisA
Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-2)Man(a1-3)[Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc	0	1	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0

1 rows × 165 columns

annotate_dataset

 annotate_dataset (glycans:List[str],
                   motifs:Optional[pandas.core.frame.DataFrame]=None,
                   feature_set:List[str]=['known'], termini_list:List=[],
                   condense:bool=False, custom_motifs:List=[])

Comprehensive glycan annotation combining multiple feature types: structural motifs, graph properties, terminal sequences

	Type	Default	Details
glycans	List		List of IUPAC-condensed glycan sequences
motifs	Optional	None	Motif dataframe (name + sequence); defaults to motif_list
feature_set	List	[‘known’]	Feature types to analyze: known, graph, exhaustive, terminal(1-3), custom, chemical, size_branch
termini_list	List	[]	Monosaccharide positions: ‘terminal’, ‘internal’, or ‘flexible’
condense	bool	False	Remove columns with only zeros
custom_motifs	List	[]	Custom motifs when using ‘custom’ feature set
Returns	DataFrame		DataFrame mapping glycans to presence/absence of motifs

glycans = ['Man(a1-3)[Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc',
           'Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
           'GalNAc(a1-4)GlcNAcA(a1-4)[GlcN(b1-7)]Kdo(a2-5)[Kdo(a2-4)]Kdo(a2-6)GlcN4P(b1-6)GlcN4P']
print("Annotate Test")
out = annotate_dataset(glycans)

Annotate Test

motif_name	Chitobiose	Trimannosylcore	core_fucose(a1-3)
Man(a1-3)[Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	1	1	1
Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	1	1	0
GalNAc(a1-4)GlcNAcA(a1-4)[GlcN(b1-7)]Kdo(a2-5)[Kdo(a2-4)]Kdo(a2-6)GlcN4P(b1-6)GlcN4P	0	0	0

quantify_motifs

 quantify_motifs (df:Union[str,pandas.core.frame.DataFrame],
                  glycans:List[str], feature_set:List[str],
                  custom_motifs:List=[], remove_redundant:bool=True)

Extracts and quantifies motif abundances from glycan abundance data by weighting motif occurrences

	Type	Default	Details
df	Union		DataFrame or filepath with samples as columns, abundances as values
glycans	List		List of IUPAC-condensed glycan sequences
feature_set	List		Feature types to analyze: known, graph, exhaustive, terminal(1-3), custom, chemical, size_branch
custom_motifs	List	[]	Custom motifs when using ‘custom’ feature set
remove_redundant	bool	True	Remove redundant motifs via deduplicate_motifs
Returns	DataFrame		DataFrame with motif abundances (motifs as columns, samples as rows)

quantify_motifs(test_df.iloc[:, 1:], test_df.iloc[:, 0].values.tolist(), ['known', 'exhaustive'])

	control_1	tumor_1	control_2	tumor_2	control_3	tumor_3	control_4	tumor_4	control_5	tumor_5	...	control_16	tumor_16	control_17	tumor_17	control_18	tumor_18	control_19	tumor_19	control_20	tumor_20
H_antigen_type2	1.347737	0.892651	2.468405	1.810795	1.589162	0.449339	2.640132	0.572828	2.763890	0.737076	...	1.070249	0.647786	1.440912	1.810304	1.722289	1.475260	4.847788	4.552496	0.480035	0.494123
Internal_LacNAc_type2	8.845085	10.063160	13.435501	28.834006	5.585973	11.359659	11.672584	21.193308	12.734919	28.597709	...	10.883437	17.991155	21.166792	16.161351	11.909325	29.924308	12.820872	19.107379	8.802443	10.268911
Terminal_LacNAc_type2	52.982192	13.183951	24.413523	12.870782	9.555884	9.822266	12.628910	13.916662	26.569737	10.733867	...	18.779972	12.157928	14.828507	20.879287	27.689619	10.734756	28.328965	37.870847	14.835019	8.910804
Disialyl_T_antigen	20.803836	36.895471	32.803297	20.401157	33.971366	30.150599	37.703636	24.728411	31.798990	15.989214	...	46.337629	39.476930	39.087708	40.348217	35.791797	22.968160	11.026029	2.613718	44.676379	46.125360
Mucin_elongated_core2	61.827277	23.247111	37.849024	41.704788	15.141858	21.181925	24.301494	35.109970	39.304656	39.331576	...	29.663409	30.149083	35.995300	37.040638	39.598944	40.659064	41.149838	56.978227	23.637462	19.179715
Oglycan_core6	49.769912	15.548695	26.866089	26.215362	10.511929	12.055414	15.490453	23.290205	26.042379	31.278633	...	11.540685	11.661137	18.515116	17.537102	21.378093	33.895263	28.604677	45.566212	10.882797	8.913582
Gal	163.691481	126.500106	141.895063	147.702533	115.056369	132.721945	122.804259	138.398297	141.412183	167.203077	...	133.838024	140.218313	142.530133	139.697255	138.848449	154.791018	142.588964	157.426027	122.916027	120.555251
GalNAc	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	...	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000	100.000000
GalOS(b1-3)GalNAc	0.843710	1.185047	2.152084	0.687093	1.564450	0.381914	2.389590	0.533142	2.497482	0.338889	...	2.066978	1.088630	1.462826	2.259636	1.687785	1.137672	0.024033	0.117449	1.972512	1.304717
GlcNAc6S(b1-6)GalNAc	2.707913	4.438043	6.198123	6.684838	1.478960	11.921934	0.892356	3.821469	4.605009	28.210391	...	6.241593	11.157860	7.997660	4.916252	0.937290	15.269626	1.463159	0.565249	1.251077	2.680253
Neu5Ac	80.494155	134.094482	120.708503	125.892731	128.626161	137.543517	132.135127	124.740497	118.279272	134.227059	...	149.089683	152.360772	145.124475	140.251427	125.331418	121.962226	91.599064	72.000898	142.956534	148.579697
Gal(b1-3)GalNAc	99.156290	98.814953	97.847916	99.312907	98.435550	99.618086	97.610410	99.466858	97.502518	99.661111	...	97.933022	98.911370	98.537174	97.740364	98.312215	98.862328	99.975967	99.882551	98.027488	98.695283
Neu5Ac(a2-3)Gal	57.345927	94.670033	83.675402	103.574200	91.775344	106.231617	90.136699	98.461821	81.110136	117.087919	...	97.928245	109.749014	101.760261	93.222423	86.403840	96.715461	80.029183	69.040921	95.565848	99.973512
Neu5Ac(a2-6)GalNAc	23.063482	39.304399	36.644881	22.263129	36.571122	31.229766	41.628644	26.256121	37.088978	17.054227	...	50.675599	41.982557	42.829042	46.391984	38.682564	25.118814	11.540028	2.937334	47.171520	48.274238
Neu5Ac(a2-8)Neu5Ac	0.084745	0.120050	0.388219	0.055402	0.279696	0.082135	0.369784	0.022555	0.080158	0.084913	...	0.485839	0.629202	0.535171	0.637019	0.245015	0.127952	0.029853	0.022643	0.219166	0.331947

15 rows × 40 columns
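
The idea of weighting motif occurrences by glycan abundance can be sketched as follows: for one sample, each motif's value is the sum over glycans of (motif count in that glycan) times (abundance of that glycan). The motif counts below are written by hand for illustration, and `weight_motifs` is a hypothetical helper, not the function glycowork uses internally.

```python
def weight_motifs(motif_counts, abundances):
    """Sum count * abundance per motif across all glycans of one sample."""
    totals = {}
    for glycan, counts in motif_counts.items():
        for motif, count in counts.items():
            totals[motif] = totals.get(motif, 0.0) + count * abundances[glycan]
    return totals

# Hand-written (hypothetical) motif counts for three O-glycans
motif_counts = {
    "Gal(b1-3)GalNAc": {"Gal": 1, "GalNAc": 1},
    "Neu5Ac(a2-3)Gal(b1-3)GalNAc": {"Gal": 1, "GalNAc": 1, "Neu5Ac": 1},
    "Neu5Ac(a2-3)Gal(b1-3)[Neu5Ac(a2-6)]GalNAc": {"Gal": 1, "GalNAc": 1, "Neu5Ac": 2},
}
# Hypothetical relative abundances (in percent) for one sample
abundances = {
    "Gal(b1-3)GalNAc": 20.0,
    "Neu5Ac(a2-3)Gal(b1-3)GalNAc": 50.0,
    "Neu5Ac(a2-3)Gal(b1-3)[Neu5Ac(a2-6)]GalNAc": 30.0,
}
sample_profile = weight_motifs(motif_counts, abundances)
print(sample_profile)
```

Because a motif can occur more than once per glycan, weighted values can exceed 100 even when glycan abundances sum to 100, as the `Gal` and `Neu5Ac` rows in the output above illustrate.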

get_k_saccharides

 get_k_saccharides (glycans:Union[List[str],Set[str]], size:int=2,
                    up_to:bool=False, just_motifs:bool=False,
                    terminal:bool=False)

Extracts k-saccharide fragments from glycan sequences with options for different fragment sizes and positions

	Type	Default	Details
glycans	Union		List or set of IUPAC-condensed glycan sequences
size	int	2	Number of monosaccharides per fragment
up_to	bool	False	Include fragments up to size k (adds monosaccharides)
just_motifs	bool	False	Return nested list of motifs instead of count DataFrame
terminal	bool	False	Only count terminal fragments
Returns	Union		DataFrame of k-saccharide counts or list of motifs per glycan

glycans = ['Man(a1-3)[Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc',
           'Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
           'GalNAc(a1-4)GlcNAcA(a1-4)[GlcN(b1-7)]Kdo(a2-5)[Kdo(a2-4)]Kdo(a2-6)GlcN4P(b1-6)GlcN4P']
out = get_k_saccharides(glycans, size = 3)

	Fuc(a1-3)[GlcNAc(b1-4)]GlcNAc	GalNAc(a1-4)GlcNAcA(a1-4)Kdo	GlcN(b1-7)Kdo(a2-5)Kdo	GlcN(b1-?)[GlcNAcA(a1-?)]Kdo	GlcNAcA(a1-4)Kdo(a2-5)Kdo	GlcNAcA(a1-4)[GlcN(b1-7)]Kdo	Kdo(a2-4)Kdo(a2-6)GlcN4P	Kdo(a2-4)[Kdo(a2-5)]Kdo	Kdo(a2-5)Kdo(a2-6)GlcN4P	Kdo(a2-6)GlcN4P(b1-6)GlcN4P	Kdo(a2-?)Kdo(a2-?)GlcN4P	Man(a1-2)Man(a1-2)Man	Man(a1-2)Man(a1-3)Man	Man(a1-3)Man(a1-6)Man	Man(a1-3)Man(b1-4)GlcNAc	Man(a1-3)[Man(a1-6)]Man	Man(a1-6)Man(b1-4)GlcNAc	Man(a1-?)Man(a1-?)Man	Man(a1-?)Man(b1-?)GlcNAc	Man(a1-?)[Xyl(b1-?)]Man	Man(b1-4)GlcNAc(b1-4)GlcNAc	Xyl(b1-2)Man(b1-4)GlcNAc	Xyl(b1-2)[Man(a1-3)]Man	Xyl(b1-2)[Man(a1-6)]Man
0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	0	2	2	1	1	1	1
1	0	0	0	0	0	0	0	0	0	0	0	1	1	1	1	1	1	3	2	0	1	0	0	0
2	0	1	1	1	1	1	1	1	1	1	2	0	0	0	0	0	0	0	0	0	0	0	0	0

get_terminal_structures

 get_terminal_structures
                          (glycan:Union[str,networkx.classes.digraph.DiGraph],
                          size:int=1)

Identifies terminal monosaccharide sequences from non-reducing ends of glycan structure

	Type	Default	Details
glycan	Union		IUPAC-condensed glycan sequence or NetworkX graph
size	int	1	Number of monosaccharides in terminal fragment (1 or 2)
Returns	List		List of terminal structures with linkages

get_terminal_structures("Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc")

['Neu5Ac(a2-3)', 'Neu5Ac(a2-6)']
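
To make the idea of "terminal" concrete, here is a minimal standard-library sketch that does not use glycowork. It assumes that, in a fully specified IUPAC-condensed string, a non-reducing-end residue is one that either opens the string or directly follows an opening bracket. The helper name `terminal_residues` is hypothetical, and the sketch ignores floating substituents in curly braces and any graph input.

```python
import re

def terminal_residues(glycan: str) -> list[str]:
    """Return non-reducing-end residues (with their linkage) of an IUPAC-condensed string.

    Illustrative helper only, not part of glycowork.
    """
    # A residue is terminal when it starts the string or directly follows '['.
    pattern = r'(?:^|\[)([^\[\]()]+\([^)]*\))'
    return re.findall(pattern, glycan)

glycan = ("Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)"
          "[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc")
print(terminal_residues(glycan))  # ['Neu5Ac(a2-3)', 'Neu5Ac(a2-6)']
```

For this example the sketch gives the same list as the call above; a single unlinked monosaccharide such as 'Glc' yields an empty list here.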

get_molecular_properties

 get_molecular_properties (glycan_list:List[str], verbose:bool=False,
                           placeholder:bool=False)

Retrieves molecular properties from PubChem for a list of glycans using their SMILES representations

	Type	Default	Details
glycan_list	List		List of IUPAC-condensed glycan sequences
verbose	bool	False	Print SMILES not found on PubChem
placeholder	bool	False	Return dummy values instead of dropping failed requests
Returns	DataFrame		DataFrame with molecular parameters from PubChem

out = get_molecular_properties(["Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"])

O1C(O)[C@H](NC(C)=O)[C@@H](O)[C@H](O[C@@H]2O[C@H](CO)[C@@H](O[C@@H]3O[C@H](CO[C@H]4O[C@H](CO)[C@@H](O)[C@H](O)[C@@H]4O[C@@H]5O[C@H](CO)[C@@H](O[C@@H]6O[C@H](CO[C@]7(C(=O)O)C[C@H](O)[C@@H](NC(C)=O)[C@H]([C@H](O)[C@H](O)CO)O7)[C@H](O)[C@H](O)[C@H]6O)[C@H](O)[C@H]5NC(C)=O)[C@@H](O)[C@H](O[C@H]4O[C@H](CO)[C@@H](O)[C@H](O)[C@@H]4O[C@@H]5O[C@H](CO)[C@@H](O[C@@H]6O[C@H](CO)[C@H](O)[C@H](O[C@]7(C(=O)O)C[C@H](O)[C@@H](NC(C)=O)[C@H]([C@H](O)[C@H](O)CO)O7)[C@H]6O)[C@H](O)[C@H]5NC(C)=O)[C@@H]3O)[C@H](O)[C@H]2NC(C)=O)[C@H]1CO

	heavy_atom_count	h_bond_acceptor_count	complexity	bond_stereo_count	h_bond_donor_count	undefined_bond_stereo_count	defined_bond_stereo_count	isotope_atom_count	defined_atom_stereo_count	xlogp	exact_mass	undefined_atom_stereo_count	monoisotopic_mass	atom_stereo_count	molecular_weight	tpsa	covalent_unit_count	charge	rotatable_bond_count
Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	152	62	4410	0	39	0	0	0	56	-23.600000	2222.7830048	1	2222.7830048	57	2224.0	1070	1	0	43

graph

converts glycan sequences to graphs and contains helper functions to search for motifs, check whether two sequences describe the same glycan, etc.

glycan_to_nxGraph

 glycan_to_nxGraph (glycan:str,
                    libr:Optional[glycowork.glycan_data.loader.HashableDict[str,int]]=None,
                    termini:str='ignore',
                    termini_list:Optional[Tuple[str]]=None)

Wrapper for converting glycans into networkx graphs; also works with floating substituents

	Type	Default	Details
glycan	str		Glycan in IUPAC-condensed format
libr	Optional	None	Dictionary of form glycoletter:index
termini	str	ignore	How to encode terminal/internal position; options: ignore, calc, provided
termini_list	Optional	None	List of positions from terminal/internal/flexible
Returns	DiGraph		NetworkX graph object of glycan

print('Glycan to networkx Graph (only edges printed)')
print(glycan_to_nxGraph('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc').edges())

Glycan to networkx Graph (only edges printed)
[(1, 0), (3, 2), (4, 1), (4, 3), (5, 4), (6, 5), (7, 6), (9, 8), (10, 7), (10, 9)]
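
The edge list above can be read as follows: monosaccharides and linkages are numbered left to right (brackets skipped), and each edge points from the residue nearer the reducing end towards the non-reducing end. The following standard-library sketch reproduces that edge list for this example. It is an illustration of the numbering inferred from the printed output, not glycowork's implementation, and the name `glycan_edges` is hypothetical.

```python
import re

def glycan_edges(glycan: str) -> list[tuple[int, int]]:
    """Derive (parent, child) edges from an IUPAC-condensed string (illustrative only)."""
    tokens = re.findall(r'\[|\]|\([^)]*\)|[^\[\]()]+', glycan)
    edges = []
    pending = [[]]  # per bracket level: linkage nodes still waiting for their parent residue
    prev = None     # index of the most recent monosaccharide
    idx = 0         # running index over monosaccharides and linkages
    for tok in tokens:
        if tok == '[':
            pending.append([])                  # a branch opens
        elif tok == ']':
            closed = pending.pop()              # the branch's last linkage attaches
            pending[-1].extend(closed)          # to the next residue one level up
        elif tok.startswith('('):
            edges.append((idx, prev))           # linkage -> residue to its left
            pending[-1].append(idx)
            idx += 1
        else:
            edges.extend((idx, link) for link in pending[-1])  # residue -> waiting linkages
            pending[-1].clear()
            prev = idx
            idx += 1
    return sorted(edges)

print(glycan_edges('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'))
```

The stack of pending linkages is one simple way to handle nested branches without recursion; it assumes well-formed, fully bracketed input.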

graph_to_string

 graph_to_string (graph:networkx.classes.digraph.DiGraph,
                  canonicalize:bool=True, order_by:str='length')

Convert glycan graph back to IUPAC-condensed format, handling disconnected components

	Type	Default	Details
graph	DiGraph		Glycan graph (assumes root node is the one with the highest index)
canonicalize	bool	True	Whether to output canonicalized IUPAC-condensed
order_by	str	length	canonicalize by ‘length’ or ‘linkage’
Returns	str		IUPAC-condensed glycan string

graph_to_string(glycan_to_nxGraph('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'))

'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'

compare_glycans

 compare_glycans (glycan_a:Union[str,networkx.classes.digraph.DiGraph],
                  glycan_b:Union[str,networkx.classes.digraph.DiGraph],
                  return_matches:bool=False)

Check whether two glycans are identical

	Type	Default	Details
glycan_a	Union		First glycan to compare
glycan_b	Union		Second glycan to compare
return_matches	bool	False	Whether to return node mapping between glycans
Returns	bool		True if glycans are same, False if not

print("Graph Isomorphism Test")
print(compare_glycans('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc',
                      'Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'))

Graph Isomorphism Test
True
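
The two strings in this test differ only in the order in which the branches are written. A rough way to see why they describe the same glycan, without graph isomorphism, is to parse each string into a tree and sort the branches at every residue. The sketch below does this with the standard library only. It is a simplified stand-in rather than the method `compare_glycans` uses; it does not handle wildcards such as `?` or floating substituents, and the helper names are hypothetical.

```python
import re

def parse_glycan(glycan: str):
    """Parse an IUPAC-condensed string into a (residue, children) tree rooted at the reducing end."""
    tokens = re.findall(r'\[|\]|\([^)]*\)|[^\[\]()]+', glycan)

    def parse_node():
        residue = tokens.pop()               # read the string right to left
        children = []
        while tokens and tokens[-1] != '[':
            if tokens[-1] == ']':            # a bracketed side branch
                tokens.pop()
                linkage = tokens.pop()
                children.append((linkage, parse_node()))
                tokens.pop()                 # drop the matching '['
            else:                            # the main chain continues
                linkage = tokens.pop()
                children.append((linkage, parse_node()))
                break
        return residue, children

    return parse_node()

def canonical(node):
    """Sort the branches at every residue so that branch order no longer matters."""
    residue, children = node
    return residue, tuple(sorted((link, canonical(child)) for link, child in children))

def same_glycan(a: str, b: str) -> bool:
    return canonical(parse_glycan(a)) == canonical(parse_glycan(b))

print(same_glycan('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc',
                  'Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'))  # True
```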

subgraph_isomorphism

 subgraph_isomorphism (glycan:Union[str,networkx.classes.digraph.DiGraph],
                       motif:Union[str,networkx.classes.digraph.DiGraph],
                       termini_list:List=[], count:bool=False,
                       return_matches:bool=False)

Check if motif exists as subgraph in glycan

	Type	Default	Details
glycan	Union		Glycan sequence or graph
motif	Union		Glycan motif sequence or graph
termini_list	List	[]	List of monosaccharide positions from terminal/internal/flexible
count	bool	False	Whether to return count instead of presence/absence
return_matches	bool	False	Whether to return matched subgraphs as node lists
Returns	Union		Boolean presence, count, or (count, matches)

print("Subgraph Isomorphism Test")
print(subgraph_isomorphism('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc',
                           'Fuc(a1-6)GlcNAc'))

Subgraph Isomorphism Test
True

generate_graph_features

 generate_graph_features
                          (glycan:Union[str,networkx.classes.digraph.DiGraph],
                          glycan_graph:bool=True, label:str='network')

Compute graph features of glycan or network

	Type	Default	Details
glycan	Union		Glycan sequence or network graph
glycan_graph	bool	True	True if input is glycan, False if network
label	str	network	Label for output dataframe if glycan_graph=False
Returns	DataFrame		Dataframe of graph features

generate_graph_features("Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc")

	diameter	branching	nbrLeaves	avgDeg	varDeg	maxDeg	nbrDeg4	max_deg_leaves	mean_deg_leaves	deg_assort	...	flow_edgeMax	flow_edgeMin	flow_edgeAvg	flow_edgeVar	secorderMax	secorderMin	secorderAvg	secorderVar	egap	entropyStation
Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc	8	1	3	1.818182	0.330579	3.0	0	3.0	3.0	-1.850372e-15	...	0.333333	0.111111	0.217778	0.007289	45.607017	20.736441	31.679285	62.422895	0.30323	-2.191032

1 rows × 49 columns

largest_subgraph

 largest_subgraph (glycan_a:Union[str,networkx.classes.digraph.DiGraph],
                   glycan_b:Union[str,networkx.classes.digraph.DiGraph])

Find the largest common subgraph of two glycans

	Type	Details
glycan_a	Union	First glycan
glycan_b	Union	Second glycan
Returns	str	Largest common subgraph in IUPAC format

glycan1 = 'Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'
glycan2 = 'Man(a1-3)[Man(a1-6)]Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'
largest_subgraph(glycan1, glycan2)

'Fuc(a1-6)GlcNAc'

ensure_graph

 ensure_graph (glycan:Union[str,networkx.classes.digraph.DiGraph],
               **kwargs)

Ensures function compatibility with string glycans and graph glycans

	Type	Details
glycan	Union	Glycan in IUPAC-condensed format or as networkx graph
kwargs	VAR_KEYWORD
Returns	DiGraph	NetworkX graph object of glycan

ensure_graph("Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc")

<networkx.classes.digraph.DiGraph>

get_possible_topologies

 get_possible_topologies
                          (glycan:Union[str,networkx.classes.digraph.DiGraph],
                          exhaustive:bool=False,
                          allowed_disaccharides:Optional[Set[str]]=None,
                          modification_map:Dict[str,Set[str]]={'6S':
                          {'Gal', 'GlcNAc'}, '3S': {'Gal'}, '4S':
                          {'GalNAc'}, 'OS': {'Gal', 'GalNAc', 'GlcNAc'}},
                          return_graphs:bool=False)

Create possible glycan graphs given a floating substituent

	Type	Default	Details
glycan	Union		Glycan with floating substituent
exhaustive	bool	False	Whether to allow additions at internal positions
allowed_disaccharides	Optional	None	Permitted disaccharides when creating possible glycans
modification_map	Dict	{‘6S’: {‘Gal’, ‘GlcNAc’}, ‘3S’: {‘Gal’}, ‘4S’: {‘GalNAc’}, ‘OS’: {‘Gal’, ‘GalNAc’, ‘GlcNAc’}}	Maps modifications to valid attachments
return_graphs	bool	False	Whether to return glycan graphs (otherwise return converted strings)
Returns	List		List of possible topology strings or graphs

possible_topology_check

 possible_topology_check
                          (glycan:Union[str,networkx.classes.digraph.DiGraph],
                          glycans:List[Union[str,networkx.classes.digraph.DiGraph]],
                          exhaustive:bool=False, **kwargs)

Check whether glycan with floating substituent could match glycans from a list

	Type	Default	Details
glycan	Union		Glycan with floating substituent
glycans	List		List of glycans to check against
exhaustive	bool	False	Whether to allow additions at internal positions
kwargs	VAR_KEYWORD
Returns	List		List of matching glycans

possible_topology_check("{Neu5Ac(a2-3)}Gal(b1-4)GlcNAc(b1-6)[Gal(b1-3)]GalNAc",
                       ["Fuc(a1-2)Gal(b1-3)GalNAc", "Neu5Ac(a2-3)Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]GalNAc",
                       "Neu5Ac(a2-6)Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]GalNAc"])

['Neu5Ac(a2-3)Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]GalNAc']

deduplicate_glycans

 deduplicate_glycans (glycans:Union[List[str],Set[str]])

Remove duplicate glycans from a list/set, even if they have different strings

	Type	Details
glycans	Union	List/set of glycans to deduplicate
Returns	List	Deduplicated list of glycans

deduplicate_glycans(["Fuc(a1-2)Gal(b1-3)GalNAc", "Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-3)Gal(b1-3)]GalNAc",
                     "Neu5Ac(a2-3)Gal(b1-3)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)]GalNAc", "Neu5Ac(a2-6)Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]GalNAc"])

['Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-3)Gal(b1-3)]GalNAc',
 'Neu5Ac(a2-6)Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]GalNAc',
 'Fuc(a1-2)Gal(b1-3)GalNAc']

processing

process IUPAC-condensed glycan sequences into glycoletters etc.

min_process_glycans

 min_process_glycans (glycan_list:List[str])

Convert a list of glycans into nested lists of glycoletters

	Type	Details
glycan_list	List	List of glycans in IUPAC-condensed format
Returns	List	List of glycoletter lists

min_process_glycans(['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
                     'Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc'])

[['Man', 'a1-3', 'Man', 'a1-6', 'Man', 'b1-4', 'GlcNAc', 'b1-4', 'GlcNAc'],
 ['Man',
  'a1-2',
  'Man',
  'a1-3',
  'Man',
  'a1-6',
  'Man',
  'b1-4',
  'GlcNAc',
  'b1-4',
  'GlcNAc']]
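
The output above suggests what a glycoletter is: each monosaccharide and each linkage becomes one token, and branch brackets are dropped. A minimal standard-library sketch of that splitting, assuming plain IUPAC-condensed input, could look like this. The name `to_glycoletters` is hypothetical and the code is not glycowork's implementation.

```python
import re

def to_glycoletters(glycan: str) -> list[str]:
    """Split an IUPAC-condensed string into monosaccharide and linkage tokens (illustrative only)."""
    flat = glycan.replace('[', '').replace(']', '')            # drop branch brackets
    return [tok for tok in re.split(r'[()]', flat) if tok]      # split at linkage parentheses

glycans = ['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
           'Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc']
print([to_glycoletters(g) for g in glycans])
```

Note that the flattened token list no longer records which residue a branch attaches to, so it suits vocabulary building rather than structure comparison.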

get_lib

 get_lib (glycan_list:List[str])

Returns dictionary mapping glycoletters to indices

	Type	Details
glycan_list	List	List of IUPAC-condensed glycan sequences
Returns	Dict	Dictionary of glycoletter:index mappings

get_lib(['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
                     'Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc'])

{'GlcNAc': 0, 'Man': 1, 'a1-2': 2, 'a1-3': 3, 'a1-6': 4, 'b1-4': 5}

expand_lib

 expand_lib (libr_in:Dict[str,int], glycan_list:List[str])

Updates libr with newly introduced glycoletters

	Type	Details
libr_in	Dict	Existing dictionary of glycoletter:index
glycan_list	List	List of IUPAC-condensed glycan sequences
Returns	Dict	Updated dictionary with new glycoletters

lib1 = get_lib(['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
                     'Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc'])
lib2 = expand_lib(lib1, ['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'])
lib2

{'GlcNAc': 0, 'Man': 1, 'a1-2': 2, 'a1-3': 3, 'a1-6': 4, 'b1-4': 5, 'Fuc': 6}
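
The two outputs above are consistent with a simple scheme: the initial library indexes the sorted set of glycoletters, and expansion appends unseen glycoletters with the next free indices while leaving existing indices untouched. The sketch below mimics that behaviour with the standard library. It is an assumption based on the outputs shown, not a description of how `get_lib` and `expand_lib` work internally, and `build_lib` and `extend_lib` are hypothetical names.

```python
import re

def glycoletters(glycan: str) -> list[str]:
    # Split at parentheses and brackets; drop the empty strings left between delimiters.
    return [tok for tok in re.split(r'[()\[\]]', glycan) if tok]

def build_lib(glycans: list[str]) -> dict[str, int]:
    """Index the sorted set of glycoletters (illustrative only)."""
    letters = sorted({tok for g in glycans for tok in glycoletters(g)})
    return {letter: i for i, letter in enumerate(letters)}

def extend_lib(lib: dict[str, int], glycans: list[str]) -> dict[str, int]:
    """Append unseen glycoletters without changing existing indices (illustrative only)."""
    new = sorted({tok for g in glycans for tok in glycoletters(g)} - set(lib))
    out = dict(lib)                      # leave the input library unmodified
    for letter in new:
        out[letter] = len(out)
    return out

lib1 = build_lib(['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
                  'Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc'])
lib2 = extend_lib(lib1, ['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'])
print(lib2)
```

Keeping existing indices stable matters when the library is reused for data that was already encoded with the earlier mapping.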

presence_to_matrix

 presence_to_matrix (df:pandas.core.frame.DataFrame,
                     glycan_col_name:str='glycan',
                     label_col_name:str='Species')

Converts a dataframe with glycan occurrence to absence/presence matrix

	Type	Default	Details
df	DataFrame		DataFrame with glycan occurrence
glycan_col_name	str	glycan	Column name for glycans
label_col_name	str	Species	Column name for labels
Returns	DataFrame		Matrix with labels as rows and glycan occurrences as columns

out = presence_to_matrix(df_species[df_species.Order == 'Fabales'].reset_index(drop = True),
                         label_col_name = 'Family')

glycan	Apif(a1-2)Xyl(b1-2)[Glc6Ac(b1-4)]Glc	Ara(a1-2)Ara(a1-6)GlcNAc	Ara(a1-2)Glc(b1-2)Ara	Ara(a1-2)GlcA	Ara(a1-2)[Glc(b1-6)]Glc	Ara(a1-3)Gal(b1-6)Gal	Ara(a1-6)Glc	Araf(a1-3)Araf(a1-5)[Araf(a1-6)Gal(b1-6)Glc(b1-6)Man(a1-3)]Araf(a1-5)Araf(a1-3)Araf(a1-3)Araf	Araf(a1-3)Gal(b1-6)Gal	D-Apif(b1-2)Glc	D-Apif(b1-2)GlcA	D-Apif(b1-3)Xyl(b1-2)[Glc6Ac(b1-4)]Glc	D-Apif(b1-3)Xyl(b1-4)Rha(a1-2)Ara	D-Apif(b1-3)Xyl(b1-4)Rha(a1-2)D-Fuc	D-Apif(b1-3)Xyl(b1-4)[Glc(b1-3)]Rha(a1-2)D-Fuc	D-Apif(b1-3)[Gal(b1-4)Xyl(b1-4)]Rha(a1-2)D-Fuc	D-Apif(b1-3)[Gal(b1-4)Xyl(b1-4)]Rha(a1-2)[Rha(a1-3)]D-Fuc	D-Apif(b1-3)[Gal(b1-4)Xyl(b1-4)]Rha(a1-3)D-Fuc	D-Apif(b1-6)Glc	D-ApifOMe(b1-3)XylOMe(b1-4)RhaOMe(a1-2)D-FucOMe	D-ApifOMe(b1-3)XylOMe(b1-4)[GlcOMe(b1-3)]RhaOMe(a1-2)D-FucOMe	Fruf(a2-1)[Glc(b1-2)][Glc(b1-3)Glc4Ac6Ac(b1-3)]Glc	Fruf(a2-1)[Glc(b1-2)][Glc(b1-3)Glc4Ac6Ac(b1-3)]Glc6Ac	Fruf(a2-1)[Glc(b1-2)][Glc(b1-3)Glc6Ac(b1-3)]Glc	Fruf(a2-1)[Glc(b1-2)][Glc(b1-3)Glc6Ac(b1-3)]Glc6Ac	Fruf(b2-1)Glc3Ac6Ac	Fruf(b2-1)Glc4Ac6Ac	Fruf(b2-1)Glc6Ac	Fruf(b2-1)[Glc(b1-2)]Glc	Fruf(b2-1)[Glc(b1-2)][Glc(b1-3)Glc(b1-3)]Glc	Fruf(b2-1)[Glc(b1-2)][Glc(b1-3)]Glc6Ac	Fruf(b2-1)[Glc(b1-2)][Glc(b1-4)Glc(b1-3)]Glc	Fruf(b2-1)[Glc(b1-2)][Glc(b1-4)Glc(b1-3)]Glc6Ac	Fruf(b2-1)[Glc(b1-2)][Glc(b1-4)Glc6Ac(b1-3)]Glc	Fruf(b2-1)[Glc(b1-2)][Glc(b1-4)Glc6Ac(b1-3)]Glc6Ac	Fruf(b2-1)[Glc(b1-2)][Glc6Ac(b1-3)]Glc	Fruf(b2-1)[Glc(b1-2)][Glc6Ac(b1-3)]Glc6Ac	Fruf(b2-1)[Glc(b1-4)Glc6Ac(b1-3)]Glc6Ac	Fruf(b2-1)[Glc3Ac(b1-2)]Glc	Fruf(b2-1)[Glc6Ac(b1-2)]Glc	Fruf1Ac(b2-1)Glc2Ac4Ac6Ac	Fuc(a1-2)Gal(b1-2)Xyl(a1-6)Glc	Fuc(a1-2)Gal(b1-2)Xyl(a1-6)Glc(b1-4)Glc	Fuc(a1-2)Gal(b1-2)Xyl(a1-6)[Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)]Glc(b1-4)Glc	Fuc(a1-2)Gal(b1-2)Xyl(a1-6)[Glc(b1-4)]Glc(b1-4)Glc	Fuc(a1-2)Gal(b1-4)Xyl	Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-2)Man(a1-6)[GlcNAc(b1-2)Man(a1-3)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Fuc(a1-4)GlcNAc(b1-2)Man(a1-3)[Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	
Fuc(a1-6)GlcNAc(b1-2)[Man(a1-6)]Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(?1-?)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-2)Man(a1-3)[Man(a1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(?1-?)[Gal(?1-?)]GlcNAc(?1-?)[Fuc(a1-3)]GlcNAc(b1-2)Man(a1-3)[Gal(?1-?)Man(a1-3)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(a1-4)Gal	Gal(a1-6)Gal	Gal(a1-6)Gal(a1-6)Gal	Gal(a1-6)Gal(a1-6)Gal(a1-6)Gal(a1-6)Glc(a1-2)Fru	Gal(a1-6)Gal(a1-6)Gal(a1-6)Gal(a1-6)Glc(a1-2)Fruf	Gal(a1-6)Gal(a1-6)Gal(a1-6)Gal(a1-6)[Fruf(b2-1)]Glc	Gal(a1-6)Gal(a1-6)Gal(a1-6)Glc	Gal(a1-6)Gal(a1-6)Gal(a1-6)Glc(a1-2)Fru	Gal(a1-6)Gal(a1-6)Gal(a1-6)Glc(a1-2)Fruf	Gal(a1-6)Gal(a1-6)Glc	Gal(a1-6)Gal(a1-6)Glc(a1-2)Fru	Gal(a1-6)Gal(a1-6)Glc(a1-2)Fruf	Gal(a1-6)Glc(a1-2)Fru	Gal(a1-6)Glc(a1-2)Fruf	Gal(a1-6)Man	Gal(a1-6)Man(b1-4)Man	Gal(a1-6)Man(b1-4)Man(b1-4)Man(b1-4)Man	Gal(a1-6)Man(b1-4)Man(b1-4)Man(b1-4)[Gal(a1-6)]Man(b1-4)Man(b1-4)Man(b1-4)[Gal(a1-6)]Man	Gal(a1-6)Man(b1-4)Man(b1-4)[Gal(a1-6)]Man	Gal(a1-6)Man(b1-4)[Gal(a1-6)]Man	Gal(b1-2)Glc	Gal(b1-2)GlcA	Gal(b1-2)GlcA6Me	Gal(b1-2)Xyl(a1-6)Glc(b1-4)[Fuc(a1-2)Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Gal(b1-2)Xyl(a1-6)[Glc(b1-4)]Glc(b1-4)[Fuc(a1-2)Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Gal(b1-2)Xyl(a1-6)[Glc(b1-4)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Gal(b1-2)[Xyl(b1-3)]GlcA	Gal(b1-3)GlcNAc(b1-2)Man(a1-3)[Gal(b1-3)GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-2)Man(a1-3)[Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-2)Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-2)Man(a1-6)[GlcNAc(b1-2)Man(a1-3)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-2)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	
Gal(b1-3)GlcNAc(b1-4)Man(a1-3)[Gal(b1-3)GlcNAc(b1-4)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-4)Man(a1-3)[GlcNAc(b1-4)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-4)Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-4)Man(a1-6)[GlcNAc(b1-4)Man(a1-3)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)GlcNAc(b1-4)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-3)[Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-3/6)[Gal(b1-3)GlcNAc(b1-2)Man(a1-3/6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-3/6)[GlcNAc(b1-2)Man(a1-3/6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-3/6)[Xyl(b1-2)][Man(a1-3/6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-6)[GlcNAc(b1-2)Man(a1-3)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-4)]GlcNAc(b1-2)[Man(a1-6)]Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-3)[Fuc(a1-6)]GlcNAc(b1-2)[Man(a1-6)]Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Gal(b1-4)Gal(b1-4)Man	Gal(b1-4)Gal(b1-4)ManOMe	Gal(b1-4)GlcA	Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc	Gal(b1-4)GlcNAc(b1-2)[Gal(b1-4)GlcNAc(b1-4)]Man(a1-3)[Gal(b1-4)GlcNAc(b1-2)[Gal(b1-4)GlcNAc(b1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Gal(b1-4)Man(b1-4)Man	Gal(b1-4)Man(b1-4)Man(b1-4)Gal	
Gal(b1-4)Xyl(b1-4)Rha(a1-2)D-Fuc	Gal(b1-4)Xyl(b1-4)Rha(a1-2)D-Fuc1CoumOMe	Gal(b1-4)Xyl(b1-4)Rha(a1-2)D-Fuc1FerOMe	Gal(b1-4)Xyl(b1-4)Rha(a1-2)Fuc	Gal(b1-4)Xyl(b1-4)Rha(a1-2)Fuc4Ac	Gal(b1-4)Xyl(b1-4)Rha(a1-2)[Rha(a1-3)]D-Fuc	Gal(b1-4)Xyl(b1-4)Rha(a1-2)[Rha(a1-3)]D-Fuc1CoumOMe	Gal(b1-4)Xyl(b1-4)Rha(a1-2)[Rha(a1-3)]D-FucOMeOSin	Gal(b1-4)Xyl(b1-4)Rha(a1-2)[Rha(a1-3)]Fuc	Gal(b1-4)Xyl(b1-4)[D-Apif(b1-3)]Rha(a1-2)D-Fuc	Gal(b1-4)Xyl(b1-4)[D-Apif(b1-3)]Rha(a1-2)D-Fuc1CoumOMe	Gal(b1-4)Xyl(b1-4)[D-Apif(b1-3)]Rha(a1-2)[Rha(a1-3)]D-Fuc	Gal(b1-4)Xyl(b1-4)[D-Apif(b1-3)]Rha(a1-2)[Rha(a1-3)]D-Fuc1CoumOMe	GalA(a1-2)[Araf(a1-5)Araf(a1-4)]Rha(b1-4)GalA	GalA(a1-4)GalA(a1-4)GalA	GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-2)Rha(a1-4)GalA(a1-2)Rha(a1-4)GalA(a1-2)GalA	GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA(a1-4)GalA	GalOMe(b1-2)[XylOMe(b1-3)]GlcAOMe	GalOMe(b1-4)XylOMe(b1-4)RhaOMe(a1-2)D-FucOMe	GalOMe(b1-4)XylOMe(b1-4)RhaOMe(a1-2)[RhaOMe(a1-3)]D-FucOMe	GalOMe(b1-4)XylOMe(b1-4)[D-ApifOMe(b1-3)]RhaOMe(a1-2)[RhaOMe(a1-3)]D-FucOMe	Galf(b1-2)[Galf(b1-4)]Man	Glc(a1-2)Fru	Glc(a1-2)Glc(a1-3)Glc(a1-3)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Glc(a1-2)Glc(a1-3)Glc(a1-3)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Glc(a1-2)Glc(a1-3)Glc(a1-3)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Glc(a1-2)Glc(a1-3)Glc(a1-3)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Glc(a1-2)Rha(a1-6)Glc	Glc(a1-3)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Glc(a1-3)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Glc(a1-3)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-3)[Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Glc(a1-4)Glc(a1-2)Rha(a1-6)Glc	Glc(a1-4)Glc(a1-4)Glc(a1-6)Glc	Glc(a1-4)Glc(a1-4)GlcA	Glc(a1-4)GlcA(b1-2)GlcA	Glc(b1-2)Ara	Glc(b1-2)Ara(a1-2)GlcA	
Glc(b1-2)Gal(b1-2)Gal(b1-2)GlcA	Glc(b1-2)Gal(b1-2)GlcA	Glc(b1-2)Gal(b1-2)GlcA(b1-3)[Glc(b1-3)]Ara	Glc(b1-2)Glc	Glc(b1-2)Glc(a1-2)Fru	Glc(b1-2)Glc(a1-2)FrufOBzOCin	Glc(b1-2)Glc(b1-2)Glc	Glc(b1-2)GlcA	Glc(b1-2)Xyl	Glc(b1-2)[Ara(a1-3)]GlcA6Me	Glc(b1-2)[Ara(a1-3)]GlcAOMe	Glc(b1-2)[Ara(a1-6)]Glc	Glc(b1-2)[Glc(b1-3)]Glc(a1-2)Fruf	Glc(b1-2)[Glc(b1-3)]Glc1Fer6Ac(a1-2)Fruf1FerOBz	Glc(b1-2)[Glc(b1-3)]Glc6Ac(a1-2)Fru	Glc(b1-2)[Glc6Ac(b1-3)]Glc(a1-2)Fru	Glc(b1-2)[Glc6Ac(b1-3)]Glc1Fer(a1-2)Fruf1FerOBz	Glc(b1-2)[Glc6Ac(b1-3)]Glc6Ac(a1-2)Fru	Glc(b1-2)[Rha(a1-3)]GlcA	Glc(b1-2)[Xyl(b1-2)Ara(a1-6)]Glc	Glc(b1-2)[Xyl(b1-2)D-Fuc(b1-6)]Glc	Glc(b1-3)Ara	Glc(b1-3)Glc	Glc(b1-3)Glc(b1-3)[Glc(b1-2)]Glc(a1-2)Fru	Glc(b1-3)Glc(b1-3)[Glc(b1-2)]Glc(a1-2)Fruf	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc(a1-2)Fru	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc(a1-2)Fruf	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Coum6Ac(a1-2)Fruf1CoumOBz	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer(a1-2)Fruf1CoumOBz	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer(a1-2)Fruf1FerOBz	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer6Ac(a1-2)Fruf1CoumOBz	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer6Ac(a1-2)Fruf1FerOBz	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)]Glc6Ac(a1-2)Fru	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)][Rha(a1-4)]Glc1Coum6Ac(a1-2)Fruf1CoumOBz	Glc(b1-3)Glc6Ac(b1-3)[Glc(b1-2)][Rha(a1-4)]Glc1Fer6Ac(a1-2)Fruf1CoumOBz	Glc(b1-3)Rha1Fer(a1-4)Fruf(b2-1)GlcOBz	Glc(b1-3)[Araf(a1-4)]Rha(a1-2)Glc	Glc(b1-3)[Xyl(b1-4)]Rha(a1-2)D-FucOMe	Glc(b1-4)Glc(b1-3)[Glc(b1-2)]Glc(a1-2)Fru	Glc(b1-4)Glc(b1-3)[Glc(b1-2)]Glc(a1-2)Fruf	Glc(b1-4)Glc(b1-3)[Glc(b1-2)]Glc1Coum6Ac(a1-2)Fruf1FerOBz	Glc(b1-4)Glc(b1-3)[Glc(b1-2)]Glc1Fer(a1-2)Fruf1FerOBz	Glc(b1-4)Glc(b1-3)[Glc(b1-2)]Glc1Fer6Ac(a1-2)Fruf1CoumOBz	Glc(b1-4)Glc(b1-3)[Glc(b1-2)]Glc1Fer6Ac(a1-2)Fruf1FerOBz	Glc(b1-4)Glc(b1-3)[Glc(b1-2)]Glc6Ac(a1-2)Fru	Glc(b1-4)Glc(b1-4)Glc	Glc(b1-4)Glc(b1-4)Glc(b1-4)Man	Glc(b1-4)Glc6Ac(b1-3)Glc1Fer6Ac(a1-2)Fruf1FerOBz	Glc(b1-4)Glc6Ac(b1-3)Glc6Ac(a1-2)Fru	Glc(b1-4)Glc6Ac(b1-3)[Glc(b1-2)]Glc(a1-2)Fru	
Glc(b1-4)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Coum6Ac(a1-2)Fruf1FerOBz	Glc(b1-4)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer(a1-2)Fruf1FerOBz	Glc(b1-4)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer6Ac(a1-2)Fruf1CoumOBz	Glc(b1-4)Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer6Ac(a1-2)Fruf1FerOBz	Glc(b1-4)Glc6Ac(b1-3)[Glc(b1-2)]Glc6Ac(a1-2)Fru	Glc(b1-4)Man(b1-4)Glc	Glc(b1-4)Rha	Glc(b1-4)Rha1Fer(a1-4)Fruf(b2-1)GlcOBz	Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Glc(b1-6)Glc(b1-3)Glc	Glc(b1-6)GlcNAc	Glc1Cer	Glc2Ac(b1-4)[D-Apif(b1-3)Xyl(b1-2)]Glc	Glc2Ac3Ac4Ac6Ac(b1-3)Ara	Glc3Ac(b1-2)Glc(a1-2)Fru	Glc6Ac(a1-2)Fru	Glc6Ac(b1-2)Glc(a1-2)Fru	Glc6Ac(b1-2)Glc(a1-2)FrufOBzOCin	Glc6Ac(b1-3)Ara	Glc6Ac(b1-3)Glc6Ac(b1-3)[Glc6Ac(b1-2)]Glc1Fer6Ac(a1-2)Fruf1CoumOAcOBz	Glc6Ac(b1-3)Glc6Ac(b1-3)[Glc6Ac(b1-2)][RhaOAc(a1-4)]Glc1Fer6Ac(a1-2)Fruf1CoumOAcOBz	Glc6Ac(b1-3)[Glc(b1-2)]Glc1Coum(a1-2)Fruf1CoumOBz	Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer(a1-2)Fruf1CoumOBz	Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer(a1-2)Fruf1FerOBz	Glc6Ac(b1-3)[Glc(b1-2)]Glc1Fer6Ac(a1-2)Fruf1FerOBz	GlcA(b1-2)Glc	GlcA(b1-2)GlcA	GlcA(b1-2)GlcA(b1-2)Rha	GlcA4Me(a1-2)[Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)]Xyl	GlcA4Me(a1-2)[Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)]Xyl	GlcA4Me(a1-2)[Xyl(b1-4)]Xyl	GlcNAc(b1-2)Man(a1-3)[Gal(b1-3)GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)GlcNAc	GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Gal(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-3)[Man(a1-3)[Man(a1-6)]Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)GlcNAc	GlcNAc(b1-2)Man(a1-3)[Man(a1-3)[Man(a1-6)]Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	
GlcNAc(b1-2)Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-3/6)[Xyl(b1-2)][Man(a1-3/6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	GlcNAc(b1-2)Man(a1-3/6)[Xyl(b1-2)][Man(a1-3/6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	GlcNAc(b1-2)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-2)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	GlcNAc(b1-2)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-4)Man(a1-3)[GlcNAc(b1-4)Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-4)Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	GlcNAc(b1-4)Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcNAc(b1-4)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	GlcNAc(b1-4)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	GlcOMe(b1-3)[XylOMe(b1-4)]RhaOMe(a1-2)D-FucOMe	Glcf(b1-2)Xyl(b1-4)Rha(b1-4)[Xyl(b1-3)]Xyl	Hexf(?1-?)Xyl(b1-4)Rha(b1-4)[Xyl(a1-3)]Xyl	L-Lyx(a1-2)Ara(a1-2)GlcA	Lyx(a1-2)Ara(a1-2)GlcA	Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-3)[Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	
Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-6)[Man(a1-3)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)[Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)[Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-3)[Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Man(a1-2)Man(a1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)Man(a1-6)[Man(a1-2)Man(a1-3)]Man(a1-3)[Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-2)[Man(a1-6)]Man(a1-3)[Man(a1-2)Man(a1-6)[Man(a1-3)]Man(a1-6)]Man(b1-4)GlcNAc	Man(a1-2)Man(a1-3)Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-2)Man(a1-6)]Man(a1-3)[Man(a1-2)Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-3)[Man(a1-2)Man(a1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-6)[Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-3)[Man(a1-3)[Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc	Man(a1-2)Man(a1-3)[Man(a1-3)[Man(a1-6)]Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-3)[Man(a1-6)]Man(a1-6)[Man(a1-2)Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Man(a1-2)Man(a1-3)[Man(a1-6)]Man(a1-6)[Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-3)[Man(a1-6)]Man(a1-6)[Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Man(a1-2)Man(a1-6)[Man(a1-2)Man(a1-3)]Man(a1-6)[Man(a1-2)Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Man(a1-2)Man(a1-6)[Man(a1-3)]Man(a1-3)[Man(a1-2)Man(a1-6)[Man(a1-3)]Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAcN	Man(a1-2)Man(a1-6)[Man(a1-3)]Man(a1-6)[Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)Man(a1-6)[Man(a1-3)]Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-2)[Man(a1-3)]Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(a1-3)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-3)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(a1-6)Man(b1-4)GlcNAc(b1-4)GlcNAc	
Man(a1-3)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-3)Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-3)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-3)[Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-6)][Xyl(b1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(a1-3)[Man(a1-2)Man(a1-6)]Man(a1-6)[Man(a1-2)Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Man(a1-3)[Man(a1-6)]Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc	Man(a1-3)[Man(a1-6)]Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-3)[Man(a1-6)]Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(a1-3)[Man(a1-6)][Xylf(a1-2)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc-ol	Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAcN	Man(a1-3)[Xyl(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]Hex	Man(a1-3)[Xylf(b1-2)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(a1-3/6)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc	Man(a1-3/6)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-3/6)Man(a1-6)[Xyl(b1-2)][Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Man(a1-3/6)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Man(b1-2)Man	Man(b1-4)Gal(b1-4)Gal(b1-4)Man	Man(b1-4)Gal(b1-4)Gal(b1-4)ManOMe	Man(b1-4)Man	Man(b1-4)Man(b1-4)Man	Man(b1-4)Man(b1-4)Man(b1-4)Man	Man(b1-4)Man(b1-4)Man(b1-4)Man(b1-4)Man	Man(b1-4)Man(b1-4)Man(b1-4)[Gal(a1-6)]Man	Man(b1-4)Man(b1-4)[Gal(a1-6)]Man	Man(b1-4)Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)Man(b1-4)Man(b1-4)Man	Man(b1-4)Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)Man(b1-4)Man(b1-4)[Man(b1-6)]Man(b1-4)[Man(b1-6)]Man(b1-4)Man(b1-4)Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-3)Gal(a1-3)Gal(a1-6)]Man(b1-4)Man(b1-4)Man(b1-4)[Man(b1-6)]Man(b1-4)[Man(b1-6)]Man(b1-4)Man(b1-4)Man(b1-4)[Man(b1-6)]Man(b1-4)[Man(b1-6)]Man(b1-4)Man(b1-4)Man	Man(b1-4)[Gal(a1-6)]Man	Man(b1-4)[Gal(a1-6)]Man(b1-4)Man	
Man(b1-4)[Gal(a1-6)]Man(b1-4)Man(b1-4)Man	Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man	Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)Man	Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man	Man(b1-6)Glc	Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc	Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-4)]Man(a1-3)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Rha(a1-2)Ara	Rha(a1-2)Ara(a1-2)GlcA	Rha(a1-2)Ara(a1-2)GlcA6Me	Rha(a1-2)Ara(a1-2)GlcAOMe	Rha(a1-2)D-Ara(b1-2)GlcA	Rha(a1-2)Gal(b1-2)Glc	Rha(a1-2)Gal(b1-2)GlcA	Rha(a1-2)Gal(b1-2)GlcA6Me	Rha(a1-2)Gal(b1-2)GlcAOMe	Rha(a1-2)Glc	Rha(a1-2)Glc(b1-2)Glc	Rha(a1-2)Glc(b1-2)GlcA	Rha(a1-2)Glc(b1-2)GlcA6Me	Rha(a1-2)Glc(b1-2)GlcAOMe	Rha(a1-2)Glc(b1-6)Glc	Rha(a1-2)GlcA(b1-2)GlcA	Rha(a1-2)GlcAOMe(b1-2)GlcAOMe	Rha(a1-2)Rha(a1-2)Gal(b1-4)[Glc(b1-2)]GlcA	Rha(a1-2)Xyl	Rha(a1-2)Xyl(b1-2)Glc	Rha(a1-2)Xyl(b1-2)GlcA	Rha(a1-2)Xyl(b1-2)GlcA6Me	Rha(a1-2)Xyl(b1-2)GlcAOMe	Rha(a1-2)Xyl3Ac	Rha(a1-2)Xyl4Ac	Rha(a1-2)[Glc(b1-3)]Glc	Rha(a1-2)[Glc(b1-6)]Gal(b1-2)GlcA6Me	Rha(a1-2)[Rha(a1-4)]Glc	Rha(a1-2)[Rha(a1-6)]Gal	Rha(a1-2)[Rha(a1-6)]Glc	Rha(a1-2)[Xyl(b1-4)]Glc	Rha(a1-2)[Xyl(b1-4)]Glc(b1-6)Glc	Rha(a1-3)GlcA	Rha(a1-3)[Rha(a1-4)]Gal	Rha(a1-4)Gal(b1-2)GlcA	Rha(a1-4)Gal(b1-2)GlcAOMe	Rha(a1-4)Gal(b1-2)GlcOMe	Rha(a1-4)Gal(b1-4)Gal(b1-4)GalGro	Rha(a1-4)Xyl(b1-2)Glc	Rha(a1-4)Xyl(b1-2)GlcA	Rha(a1-4)Xyl(b1-2)GlcAOMe	Rha(a1-6)Glc	Rha(a1-6)[Xyl(b1-3)Xyl(b1-2)]Glc(b1-2)Glc	Rha(b1-2)Glc(b1-2)GlcA	Rha1Fer(a1-4)Fruf(b2-1)GlcOBz	RhaOMe(a1-2)[RhaOMe(a1-6)]GlcOMe-ol	RhaOMe(a1-6)GlcOMe(b1-2)GlcOMe-ol	Xyl(a1-2)[Man(a1-3)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Xyl(a1-3)Fuc(a1-4)Rha	Xyl(a1-6)Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)Glc-ol	Xyl(a1-6)Glc(b1-4)[Fuc(a1-2)Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc-ol	
Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Fuc(a1-2)Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc-ol	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc-ol	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc-ol	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Fuc(a1-2)Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc-ol	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	
Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc	Xyl(a1-6)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc-ol	Xyl(b1-2)Ara(a1-6)Glc	Xyl(b1-2)Ara(a1-6)GlcNAc	Xyl(b1-2)Ara(a1-6)[Glc(b1-2)]Glc	Xyl(b1-2)Ara(a1-6)[Glc(b1-4)]GlcNAc	Xyl(b1-2)D-Fuc(b1-6)Glc	Xyl(b1-2)D-Fuc(b1-6)GlcNAc	Xyl(b1-2)D-Fuc(b1-6)[Glc(b1-2)]Glc	Xyl(b1-2)Fuc(a1-6)Glc	Xyl(b1-2)Fuc(a1-6)GlcNAc	Xyl(b1-2)Fuc(b1-6)Glc	Xyl(b1-2)Fuc(b1-6)GlcNAc	Xyl(b1-2)Fuc(b1-6)[Glc(b1-2)]Glc	Xyl(b1-2)Gal(b1-2)GlcA6Me	Xyl(b1-2)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Xyl(b1-2)Rha(a1-2)Ara	Xyl(b1-2)Xyl(b1-3)[Rha(b1-2)Rha(b1-4)]Xyl	Xyl(b1-2)[Glc(b1-3)]Ara	Xyl(b1-2)[Glc2Ac(b1-4)]Glc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(a1-3)Man(b1-4)GlcNAc(b1-4)GlcNAc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(a1-3)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(a1-6)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(b1-4)GlcNAc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc-ol	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc	Xyl(b1-2)[Man(a1-3)][Man(a1-6)]Man(b1-4)ManNAc	Xyl(b1-2)[Man(a1-6)]Man(a1-3)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Xyl(b1-2)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc	Xyl(b1-2)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-3)]GlcNAc	Xyl(b1-2)[Rha(a1-3)]GlcA	Xyl(b1-3)Ara	Xyl(b1-3)Xyl(b1-2)[Rha(a1-6)]Glc(b1-2)Glc	Xyl(b1-3)Xyl(b1-4)Rha(a1-2)[Rha(a1-6)]Glc	Xyl(b1-3)Xyl(b1-4)Rha(a1-2)[Rha(a1-6)]Glc(b1-2)Glc	Xyl(b1-4)Rha(a1-2)Ara	Xyl(b1-4)Rha(a1-2)D-Fuc	
Xyl(b1-4)Rha(a1-2)D-FucOMe	Xyl(b1-4)Rha(a1-2)Fuc	Xyl(b1-4)Rha(a1-2)Fuc3Ac	Xyl(b1-4)Rha(a1-2)Fuc4Ac	Xyl(b1-4)Rha(a1-2)Glc	Xyl(b1-4)Rha(a1-2)[Rha(a1-3)]Fuc4Ac	Xyl(b1-4)Rha(a1-2)[Rha(a1-6)]Glc	Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl3Ac(b1-4)Xyl(b1-4)Xyl(b1-4)[GlcA(a1-2)]Xyl(b1-4)Xyl	Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl3Ac(b1-4)Xyl(b1-4)Xyl(b1-4)[GlcA(a1-2)]Xyl3Ac(b1-4)Xyl	Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl3Ac(b1-4)Xyl(b1-4)Xyl(b1-4)[GlcA4Me(a1-2)]Xyl(b1-4)Xyl	Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl3Ac(b1-4)Xyl(b1-4)Xyl(b1-4)[GlcA4Me(a1-2)]Xyl3Ac(b1-4)Xyl	Xyl(b1-4)Xyl(b1-4)[GlcA(a1-2)]Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl	Xyl(b1-4)[GlcAOMe(a1-2)]Xyl(b1-4)Xyl(b1-4)Xyl(b1-4)Xyl	Xyl2Ac3Ac4Ac(b1-3)Ara	Xyl4Ac(b1-3)Ara	XylOMe(b1-2)[RhaOMe(a1-6)]GlcOMe(b1-2)GlcOMe-ol	XylOMe(b1-3)XylOMe(b1-2)[RhaOMe(a1-6)]GlcOMe(b1-2)GlcOMe-ol	XylOMe(b1-4)RhaOMe(a1-2)D-FucOMe	XylOMe(b1-4)RhaOMe(a1-2)[RhaOMe(a1-6)]GlcOMe	XylOMe(b1-4)RhaOMe(a1-2)[RhaOMe(a1-6)]GlcOMe-ol	Xylf(b1-2)Xyl(b1-3)[Rha(b1-2)Rha(b1-4)]Xyl	[Araf(a1-3)Gal(b1-3)Gal(b1-6)]Gal(b1-3)Gal	[Araf(a1-3)Gal(b1-6)]Gal(b1-3)Gal	[Gal(a1-4)Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)[Man(b1-4)Man(b1-4)Man(b1-4)Gal(a1-6)]Man(b1-2)[Gal(a1-6)]Man(b1-2)[Gal(a1-4)Gal(a1-6)]Man(b1-4)Man	[Gal(a1-6)]Man(b1-4)Man	[Gal(a1-6)]Man(b1-4)Man(b1-4)Man	[Gal(a1-6)]Man(b1-4)Man(b1-4)Man(b1-4)Man(b1-4)Man	[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)Man(b1-4)Man	[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)[Gal(a1-6)]Man(b1-4)Man	[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Gal(b1-3)Gal(b1-6)[Araf(a1-3)]Gal(b1-6)]Gal(b1-3)Gal	[Gal(b1-3)Gal(b1-6)]Gal(b1-3)Gal	[Gal(b1-6)Gal(b1-6)Gal(b1-6)]Gal(b1-3)Gal	[Gal(b1-6)Gal(b1-6)]Gal(b1-3)Gal	[Gal(b1-6)]Gal(b1-3)Gal(b1-3)Gal(b1-3)Gal(b1-3)Gal(b1-3)Gal(b1-3)Gal	[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Araf(a1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	
[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Araf(a1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Araf(a1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Araf(a1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-5)Araf(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-5)Araf(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Fuc(a1-2)Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-5)Araf(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	
[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-5)Araf(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Gal(b1-2)Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc	[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)[Gal(b1-5)Araf(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)[Xyl(a1-6)]Glc(b1-4)Glc(b1-4)Glc
Family
Fabaceae	1	4	1	3	1	1	1	0	1	3	1	1	1	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	2	1	1	1	1	2	1	1	1	1	4	2	1	1	2	2	7	7	4	4	4	4	4	2	8	4	2	5	4	2	2	1	1	1	1	0	1	1	3	1	1	2	1	1	1	1	2	5	1	1	2	2	1	1	1	1	2	1	1	1	1	1	3	2	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	1	0	0	0	0	1	0	1	1	1	1	1	1	3	1	1	1	1	1	2	2	1	3	1	5	0	0	1	3	1	1	1	2	0	0	0	0	0	0	2	1	1	4	1	0	0	0	0	0	0	0	0	0	0	0	0	0	2	0	0	0	0	0	0	0	0	1	1	0	0	0	0	0	0	0	0	1	2	0	1	1	1	1	5	1	1	0	0	0	0	1	0	0	0	0	0	0	1	3	2	0	0	0	1	1	4	6	1	1	1	1	3	4	2	1	1	1	4	1	1	1	1	1	0	0	0	1	1	1	1	1	1	1	1	7	2	5	1	2	1	1	1	1	1	1	1	2	1	5	1	1	1	1	1	3	1	1	1	1	4	1	1	1	1	1	5	1	11	2	1	1	1	1	1	1	1	1	1	2	1	1	1	4	6	4	4	4	1	1	5	4	1	4	1	1	0	1	1	1	7	1	1	2	3	23	6	7	0	1	9	3	4	1	3	1	1	1	1	3	3	2	1	1	1	1	1	0	2	1	1	1	1	1	1	1	1	1	1	1	0	1	2	0	1	1	1	1	2	1	1	2	1	2	2	1	1	1	1	2	1	1	1	1	1	2	1	1	1	1	2	1	1	1	1	1	7	1	1	1	2	3	1	1	1	1	1	1	1	1	1	0	1	1	1	1	1	3	16	18	2	1	2	1	9	13	2	1	1	3	2	1	0	0	0	0	0	0	0	2	1	1	1	1	1	1	1	1	1	1	0	1	1	0	1	1	1	4	1	2	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1
Fagaceae	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
Polygalaceae	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	1	1	1	0	0	0	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	2	1	1	2	1	2	2	1	2	1	1	1	2	0	0	0	0	0	1	1	1	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	0	0	0	0	0	0	1	1	1	1	1	1	0	0	0	0	0	1	1	1	1	1	2	2	1	1	1	1	1	1	0	0	1	1	1	1	1	1	1	0	0	1	1	1	1	1	1	1	1	0	0	1	0	0	0	0	1	0	0	1	1	1	1	0	1	1	1	1	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	1	1	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
Quillajaceae	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	0	0	0	0	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0

enforce_class

 enforce_class (glycan:str, glycan_class:str, conf:Optional[float]=None,
                extra_thresh:float=0.3)

Determines whether glycan belongs to a specified class

	Type	Default	Details
glycan	str		Glycan in IUPAC-condensed nomenclature
glycan_class	str		Glycan class (O, N, free, or lipid)
conf	Optional	None	Prediction confidence to override class
extra_thresh	float	0.3	Threshold to override class
Returns	bool		True if glycan is in glycan class

enforce_class("Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc", "O")

False

IUPAC_to_SMILES

 IUPAC_to_SMILES (glycan_list:Union[str,List[str]])

Convert list of IUPAC-condensed glycans to isomeric SMILES using GlyLES

	Type	Details
glycan_list	Union	List of IUPAC-condensed glycans or single glycan
Returns	List	List of corresponding SMILES strings

IUPAC_to_SMILES(['Neu5Ac(a2-3)Gal(b1-4)Glc'])

O1C(O)[C@H](O)[C@@H](O)[C@H](O[C@@H]2O[C@H](CO)[C@H](O)[C@H](O[C@]3(C(=O)O)C[C@H](O)[C@@H](NC(C)=O)[C@H]([C@H](O)[C@H](O)CO)O3)[C@H]2O)[C@H]1CO

['O1C(O)[C@H](O)[C@@H](O)[C@H](O[C@@H]2O[C@H](CO)[C@H](O)[C@H](O[C@]3(C(=O)O)C[C@H](O)[C@@H](NC(C)=O)[C@H]([C@H](O)[C@H](O)CO)O3)[C@H]2O)[C@H]1CO']

canonicalize_composition

 canonicalize_composition (comp:str)

Converts composition from any common format to standardized dictionary

	Type	Details
comp	str	Composition in Hex5HexNAc4Fuc1Neu5Ac2 or H5N4F1A2 format
Returns	Dict	Dictionary of monosaccharide:count

print(canonicalize_composition("HexNAc2Hex1Fuc3Neu5Ac1"))
print(canonicalize_composition("N2H1F3A1"))

{'HexNAc': 2, 'Hex': 1, 'dHex': 3, 'Neu5Ac': 1}
{'HexNAc': 2, 'Hex': 1, 'dHex': 3, 'Neu5Ac': 1}
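
The shorthand form shown above can be read with a small parser. The following is a minimal sketch, not the library's implementation: `parse_shorthand_composition` and the `SHORTHAND` table are hypothetical names, and only the four letters that appear in the example output (H, N, F, A) are assumed.

```python
import re

# Letter-to-monosaccharide mapping inferred from the example output above (assumption).
SHORTHAND = {"H": "Hex", "N": "HexNAc", "F": "dHex", "A": "Neu5Ac"}

def parse_shorthand_composition(comp: str) -> dict:
    """Parse a string such as 'N2H1F3A1' into a monosaccharide:count dictionary."""
    counts = {}
    for letter, number in re.findall(r"([A-Za-z])(\d+)", comp):
        if letter not in SHORTHAND:
            raise ValueError(f"unknown shorthand letter: {letter}")
        name = SHORTHAND[letter]
        counts[name] = counts.get(name, 0) + int(number)
    return counts

print(parse_shorthand_composition("N2H1F3A1"))
```

Unknown letters raise an error rather than being dropped silently, which seems the safer choice for a sketch that covers only part of the notation.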

canonicalize_iupac

 canonicalize_iupac (glycan:str)

Convert glycan from IUPAC-extended, LinearCode, GlycoCT, WURCS, Oxford, GLYCAM, GlycoWorkBench, CSDB-linear, GlyConnect IDs, and GlyTouCan IDs to standardized IUPAC-condensed format

	Type	Details
glycan	str	Glycan sequence in any supported format
Returns	str	Standardized IUPAC-condensed format

print(canonicalize_iupac("NeuAc?1-36SGalb1-4GlcNACb1-6(Fuc?1-2Galb1-4GlcNacb1-3Galb1-3)GalNAc-sp3"))
print(canonicalize_iupac("WURCS=2.0/5,11,10/[a2122h-1b_1-5_2*NCC/3=O][a1122h-1b_1-5][a1122h-1a_1-5][a2112h-1b_1-5][a1221m-1a_1-5]/1-1-2-3-1-4-3-1-4-5-5/a4-b1_a6-k1_b4-c1_c3-d1_c6-g1_d2-e1_e4-f1_g2-h1_h4-i1_i2-j1"))
print(canonicalize_iupac("Ma3(Ma6)Mb4GNb4GN;N"))
print(canonicalize_iupac("α-D-Manp-(1→3)[α-D-Manp-(1→6)]-β-D-Manp-(1→4)-β-D-GlcpNAc-(1→4)-β-D-GlcpNAc-(1→"))
print(canonicalize_iupac("""RES
1b:b-dgal-HEX-1:5
2s:n-acetyl
3b:b-dgal-HEX-1:5
4b:b-dglc-HEX-1:5
5b:b-dgal-HEX-1:5
6b:a-dglc-HEX-1:5
7b:b-dgal-HEX-1:5
8b:a-lgal-HEX-1:5|6:d
9b:a-dgal-HEX-1:5
10s:n-acetyl
11s:n-acetyl
12b:b-dglc-HEX-1:5
13b:b-dgal-HEX-1:5
14b:a-lgal-HEX-1:5|6:d
15b:a-lgal-HEX-1:5|6:d
16s:n-acetyl
17s:n-acetyl
18b:b-dgal-HEX-1:5
LIN
1:1d(2+1)2n
2:1o(3+1)3d
3:3o(3+1)4d
4:4o(-1+1)5d
5:5o(-1+1)6d
6:6o(-1+1)7d
7:7o(2+1)8d
8:7o(3+1)9d
9:9d(2+1)10n
10:6d(2+1)11n
11:5o(-1+1)12d
12:12o(-1+1)13d
13:13o(2+1)14d
14:12o(-1+1)15d
15:12d(2+1)16n
16:4d(2+1)17n
17:1o(6+1)18d
"""))

Fuc(a1-2)Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)[Neu5Ac(a2-3)Gal6S(b1-4)GlcNAc(b1-6)]GalNAc
Fuc(a1-2)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)[Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc
Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc
Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc
Fuc(a1-2)[GalNAc(a1-3)]Gal(b1-?)GlcNAc(a1-?)[Fuc(a1-2)Gal(b1-?)[Fuc(a1-?)]GlcNAc(b1-?)]Gal(b1-?)GlcNAc(b1-3)Gal(b1-3)[Gal(b1-6)]GalNAc
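
To illustrate what one of these conversions involves, here is a rough sketch for a small LinearCode subset. It is not how canonicalize_iupac works internally: `linearcode_to_iupac` and the `LINEARCODE` table are hypothetical, sialic acids are left out because their linkages start at carbon 2, and branches are not reordered.

```python
import re

# Small subset of LinearCode symbols (assumption: enough for simple N-glycan cores).
LINEARCODE = {"GN": "GlcNAc", "AN": "GalNAc", "M": "Man", "A": "Gal", "G": "Glc", "F": "Fuc"}

def linearcode_to_iupac(code: str) -> str:
    """Translate e.g. 'Ma3(Ma6)Mb4GNb4GN;N' into IUPAC-condensed notation."""
    core = code.split(";")[0]                          # drop the trailing ';N' annotation
    core = core.replace("(", "[").replace(")", "]")    # LinearCode branches use parentheses

    def expand(match):
        name = LINEARCODE[match.group(1)]
        if match.group(2) is None:                     # reducing-end residue has no linkage
            return name
        return f"{name}({match.group(2)}1-{match.group(3)})"

    # Two-letter symbols come first so that 'GN' is not read as 'G' followed by 'N'.
    return re.sub(r"(GN|AN|M|A|G|F)(?:([ab])(\d))?", expand, core)

print(linearcode_to_iupac("Ma3(Ma6)Mb4GNb4GN;N"))
```

For the LinearCode example above, this yields the same string as the third line of the output.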

get_possible_linkages

 get_possible_linkages (wildcard:str, linkage_list:List[str]={'a1-?',
                        '?2-6', '?1-6', 'a2-5', '1-4', '?1-4', 'b1-9',
                        'b1-6', 'b1-2', 'a1-11', 'b2-8', '?1-3', 'a1-4',
                        'a2-3', 'a2-2', 'a2-9', 'a1-7', 'a2-4', 'a1-5',
                        'a2-6', '?2-?', 'a1-9', 'a1-6', 'b1-5', '?2-8',
                        'b2-2', 'b1-7', 'b1-4', 'b1-1', 'a2-7', 'b1-8',
                        'a2-?', 'b2-3', 'b2-4', 'b2-6', 'b2-7', 'a1-2',
                        '?1-2', 'b1-?', 'a1-1', '1-6', 'a2-8', 'a2-1',
                        '?1-?', 'a1-8', '?2-3', 'b2-1', 'b2-5', 'b1-3',
                        'a1-3', 'a2-11'})

Retrieves all linkages that match a given wildcard pattern

	Type	Default	Details
wildcard	str		Pattern to match, ? can be wildcard
linkage_list	List	{‘a1-?’, ‘?2-6’, ‘?1-6’, ‘a2-5’, ‘1-4’, ‘?1-4’, ‘b1-9’, ‘b1-6’, ‘b1-2’, ‘a1-11’, ‘b2-8’, ‘?1-3’, ‘a1-4’, ‘a2-3’, ‘a2-2’, ‘a2-9’, ‘a1-7’, ‘a2-4’, ‘a1-5’, ‘a2-6’, ‘?2-?’, ‘a1-9’, ‘a1-6’, ‘b1-5’, ‘?2-8’, ‘b2-2’, ‘b1-7’, ‘b1-4’, ‘b1-1’, ‘a2-7’, ‘b1-8’, ‘a2-?’, ‘b2-3’, ‘b2-4’, ‘b2-6’, ‘b2-7’, ‘a1-2’, ‘?1-2’, ‘b1-?’, ‘a1-1’, ‘1-6’, ‘a2-8’, ‘a2-1’, ‘?1-?’, ‘a1-8’, ‘?2-3’, ‘b2-1’, ‘b2-5’, ‘b1-3’, ‘a1-3’, ‘a2-11’}	List of linkages to search
Returns	Set		Matching linkages

get_possible_linkages("a1-?")

{'a1-1',
 'a1-2',
 'a1-3',
 'a1-4',
 'a1-5',
 'a1-6',
 'a1-7',
 'a1-8',
 'a1-9',
 'a1-?'}
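
The matching behaviour in this example (one `?` stands for exactly one character, so `a1-11` is excluded) can be approximated with shell-style wildcards from the standard library. This is a sketch with a swapped-in technique (`fnmatch`), not the library's own code; `match_linkages` and the candidate set are hypothetical.

```python
from fnmatch import fnmatchcase

def match_linkages(wildcard: str, linkages: set) -> set:
    """Return all linkages matching the wildcard, where '?' matches any single character."""
    return {linkage for linkage in linkages if fnmatchcase(linkage, wildcard)}

# Small illustrative subset of the default linkage list shown above.
candidates = {"a1-2", "a1-3", "a1-11", "a1-?", "b1-3", "?1-3", "1-4"}
print(sorted(match_linkages("a1-?", candidates)))
```

Because `?` also matches the literal `?` character, the ambiguous entry `a1-?` is returned as well, in line with the output above.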

get_possible_monosaccharides

 get_possible_monosaccharides (wildcard:str)

Retrieves all matching common monosaccharides of a type

	Type	Details
wildcard	str	Monosaccharide type; options: Hex, HexNAc, dHex, Sia, HexA, Pen, HexOS, HexNAcOS
Returns	Set	Matching monosaccharides

get_possible_monosaccharides("HexNAc")

{'GalNAc', 'GlcNAc', 'HexNAc', 'ManNAc'}

equal_repeats

 equal_repeats (r1:str, r2:str)

Check whether two repeat units could stem from the same repeating structure

	Type	Details
r1	str	First glycan sequence
r2	str	Second glycan sequence
Returns	bool	True if repeats are shifted versions

equal_repeats("Fuc2S3S(a1-3)Fuc2S(a1-4)Fuc2S3S", "Fuc2S(a1-4)Fuc2S3S(a1-3)Fuc2S")

True
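
One way to think about this check is to unroll the first repeat unit and look for the second one as a window inside it. The sketch below follows that idea for linear (unbranched) repeats only. It is an assumption about the logic, not the library's implementation, and `tokenize_linear` and `could_be_same_repeat` are hypothetical names.

```python
import re

def tokenize_linear(glycan: str) -> list:
    """Split an unbranched glycan into alternating monosaccharide and linkage tokens."""
    return [token for token in re.split(r"[()]", glycan) if token]

def could_be_same_repeat(r1: str, r2: str) -> bool:
    t1, t2 = tokenize_linear(r1), tokenize_linear(r2)
    if len(t1) != len(t2):
        return False
    # An odd token count means r1 ends on a monosaccharide; drop it to keep one full period.
    period = t1[:-1] if len(t1) % 2 == 1 else t1
    unrolled = period * 2
    # Only start windows at monosaccharide positions (even indices).
    return any(unrolled[i:i + len(t2)] == t2 for i in range(0, len(period), 2))

print(could_be_same_repeat("Fuc2S3S(a1-3)Fuc2S(a1-4)Fuc2S3S",
                           "Fuc2S(a1-4)Fuc2S3S(a1-3)Fuc2S"))
```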

get_class

 get_class (glycan:str)

Determines glycan class

	Type	Details
glycan	str	Glycan in IUPAC-condensed nomenclature
Returns	str	Glycan class (repeat, O, N, free, lipid, lipid/free, or empty)

get_class("Gal(b1-4)GlcNAc(b1-2)Man(a1-3)[Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc")

'N'
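
As a rough intuition for how a class can be read off the reducing end, consider the heavily simplified heuristic below. It is only a sketch under stated assumptions: `guess_class` and `is_class` are hypothetical, only N- and mucin-type O-glycans are recognized, and the repeat, free, and lipid classes as well as the confidence-based override of enforce_class are ignored.

```python
import re

def guess_class(glycan: str) -> str:
    """Guess 'N' or 'O' from the reducing end; return '' if neither rule applies."""
    # Ignore a core fucose branch directly in front of the reducing-end GlcNAc.
    trimmed = re.sub(r"\[Fuc\(a1-[36]\)\](?=GlcNAc$)", "", glycan)
    if trimmed.endswith("GlcNAc(b1-4)GlcNAc"):   # chitobiose core
        return "N"
    if trimmed.endswith("GalNAc"):               # mucin-type O-glycan
        return "O"
    return ""

def is_class(glycan: str, glycan_class: str) -> bool:
    return guess_class(glycan) == glycan_class

core_fucosylated = "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc"
print(guess_class(core_fucosylated), is_class(core_fucosylated, "O"))
```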

query

for interacting with the databases contained in glycowork, delivering insights for sequences of interest

get_insight

 get_insight (glycan:str,
              motifs:Optional[pandas.core.frame.DataFrame]=None)

Print meta-information about a glycan

	Type	Default	Details
glycan	str		Glycan in IUPAC-condensed format
motifs	Optional	None	DataFrame of glycan motifs; default:motif_list
Returns	None		Prints glycan meta-information

print("Test get_insight with 'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'")
get_insight('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc')

Test get_insight with 'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc'
Let's get rolling! Give us a few moments to crunch some numbers.

This glycan occurs in the following species: ['Acanthocheilonema_viteae', 'Adeno-associated_dependoparvovirusA', 'Aedes_aegypti', 'Angiostrongylus_cantonensis', 'Anopheles_gambiae', 'Antheraea_pernyi', 'Apis_mellifera', 'Ascaris_suum', 'Autographa_californica_nucleopolyhedrovirus', 'AvianInfluenzaA_Virus', 'Bombus_ignitus', 'Bombyx_mori', 'Bos_taurus', 'Brugia_malayi', 'Caenorhabditis_elegans', 'Cardicola_forsteri', 'Cooperia_onchophora', 'Cornu_aspersum', 'Crassostrea_gigas', 'Crassostrea_virginica', 'Cricetulus_griseus', 'Danio_rerio', 'Dictyocaulus_viviparus', 'Dirofilaria_immitis', 'Drosophila_melanogaster', 'Fasciola_hepatica', 'Gallus_gallus', 'Glossina_morsitans', 'Haemonchus_contortus', 'Haliotis_tuberculata', 'Heligmosomoides_polygyrus', 'Helix_lucorum', 'Homo_sapiens', 'HumanImmunoDeficiency_Virus', 'Hylesia_metabus', 'Hypsibius_exemplaris', 'Lutzomyia_longipalpis', 'Lymantria_dispar', 'Macaca_mulatta', 'Mamestra_brassicae', 'Megathura_crenulata', 'Mus_musculus', 'Nilaparvata_lugens', 'Oesophagostomum_dentatum', 'Onchocerca_volvulus', 'Onchocerca_volvulus', 'Ophiactis_savignyi', 'Opisthorchis_viverrini', 'Ostrea_edulis', 'Ovis_aries', 'Pan_troglodytes', 'Pristionchus_pacificus', 'Ramazzottius_varieornatus', 'Rattus_norvegicus', 'Schistosoma_mansoni', 'SemlikiForest_Virus', 'Spodoptera_frugiperda', 'Sus_scrofa', 'Tick_borne_encephalitis_virus', 'Tribolium_castaneum', 'Trichinella_spiralis', 'Trichoplusia_ni', 'Trichuris_suis', 'Tropidolaemus_subannulatus', 'Volvarina_rubella', 'undetermined', 'unidentified_influenza_virus']

Puh, that's quite a lot! Here are the phyla of those species: ['Arthropoda', 'Artverviricota', 'Chordata', 'Cossaviricota', 'Echinodermata', 'Kitrinoviricota', 'Mollusca', 'Negarnaviricota', 'Nematoda', 'Platyhelminthes', 'Tardigrada', 'Virus']

This glycan contains the following motifs: ['Chitobiose', 'Trimannosylcore', 'core_fucose']

This is the GlyTouCan ID for this glycan: G63041RA

This glycan has been reported to be expressed in: ['2A3_cell_line', 'A549_cell_line', 'AML_193_cell_line', 'C10_cell_line', 'CHOK1_cell_line', 'CHOS_cell_line', 'COLO_205_cell_line', 'COLO_320_cell_line', 'CRL_1620_cell_line', 'Caco_2_cell_line', 'Cal-27_cell_line', 'Cervicovaginal_Secretion', 'Co_115_cell_line', 'EOL_1_cell_line', 'FaDu_cell_line', 'HCT_15_cell_line', 'HCT_8_cell_line', 'HEK293_cell_line', 'HEL92_1_7_cell_line', 'HEL_cell_line', 'HL_60_cell_line', 'HT_29_cell_line', 'KG_1_cell_line', 'KG_1a_cell_line', 'KM12_cell_line', 'Kasumi_1_cell_line', 'LS174T_cell_line', 'LS180_cell_line', 'LS411N_cell_line', 'LoVo_cell_line', 'MDA_MB_231BR_cell_line', 'ME_1_cell_line', 'ML_1_cell_line', 'MOLM_13_cell_line', 'MOLM_14_cell_line', 'MV4_11_cell_line', 'M_07e_cell_line', 'NB_4_cell_line', 'NS0_cell_line', 'OCI_AML2_cell_line', 'OCI_AML3_cell_line', 'PLB_985_cell_line', 'RKO_cell_line', 'SCC-9_cell_line', 'SCC_25_cell_line', 'SW1116_cell_line', 'SW1398_cell_line', 'SW1463_cell_line', 'SW480_cell_line', 'SW48_cell_line', 'SW620_cell_line', 'SW948_cell_line', 'T84_cell_line', 'TF_1_cell_line', 'THP_1_cell_line', 'U_937_cell_line', 'VU-147T_cell_line', 'WiDr_cell_line', 'alveolus_of_lung', 'brain', 'brain', 'cerebellar_cortex', 'cerebellar_cortex', 'cerebellar_cortex', 'cerebellar_cortex', 'cerebellum', 'colon', 'cortex', 'digestive_tract', 'digestive_tract', 'forebrain', 'gills', 'gills', 'heart', 'heart', 'heart', 'hindbrain', 'hippocampal_formation', 'hippocampus', 'hippocampus', 'hippocampus', 'hippocampus', 'iPS1A_cell_line', 'iPS2A_cell_line', 'kidney', 'liver', 'liver', 'liver', 'lung', 'mantle', 'mantle', 'metastatic_pancreatic_ductal_adenocarcinoma', 'milk', 'mucus', 'muscle_of_leg', 'nerve_ending', 'ovary', 'pancreas', 'placenta', 'prefrontal_cortex', 'prefrontal_cortex', 'prefrontal_cortex', 'prefrontal_cortex', 'primary_pancreatic_ductal_adenocarcinoma', 'prostate_gland', 'seminal_fluid', 'striatum', 'striatum', 'striatum', 'striatum', 'testicle', 
'testis', 'trachea', 'urine', 'urothelium']

This glycan has been reported to be dysregulated in (disease, direction, sample): [('REM_sleep_behavior_disorder', 'down', 'serum'), ('benign_breast_tumor_tissues_vs_para_carcinoma_tissues', 'up', 'breast'), ('cystic_fibrosis', 'up', 'sputum'), ('female_breast_cancer', 'up', 'breast'), ('female_breast_cancer', 'up', 'cell_line'), ('prostate_cancer', 'up', 'prostate_cancer_biopsy'), ('thyroid_gland_papillary_carcinoma', 'up', 'serum'), ('urinary_bladder_cancer', 'down', 'urine')]

That's all we can do for you at this point!

glytoucan_to_glycan

 glytoucan_to_glycan (ids:List[str], revert:bool=False)

Convert between GlyTouCan IDs and IUPAC-condensed glycans

	Type	Default	Details
ids	List		List of GlyTouCan IDs or glycans
revert	bool	False	Whether to map glycans to IDs; default:False
Returns	List		List of glycans or IDs

glytoucan_to_glycan(['G63041RA'])

['Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc']

regex

for performing regular expression-like searches on glycans, which is particularly useful for finding complicated motifs

get_match

 get_match (pattern:Union[str,List[str]],
            glycan:Union[str,networkx.classes.digraph.DiGraph],
            return_matches:bool=True)

Find matches for glyco-regular expression in glycan

	Type	Default	Details
pattern	Union		Expression or pre-compiled pattern; e.g., “Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc”
glycan	Union		Glycan string or graph
return_matches	bool	True	Whether to return matches vs boolean
Returns	Union		Match results

# {} = between min and max occurrences, e.g., "Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc"
# * = zero or more occurrences, e.g., "Hex-HexNAc-([Hex|Fuc])*-HexNAc"
# + = one or more occurrences, e.g., "Hex-HexNAc-([Hex|Fuc])+-HexNAc"
# ? = zero or one occurrence, e.g., "Hex-HexNAc-([Hex|Fuc])?-HexNAc"
# {1,} = at minimum one occurrence, e.g., "Hex-HexNAc-([Hex|Fuc]){1,}-HexNAc"
# {,1} = at maximum one occurrence, e.g., "Hex-HexNAc-([Hex|Fuc]){,1}-HexNAc"
# {2} = exactly two occurrences, e.g., "Hex-HexNAc-([Hex|Fuc]){2}-HexNAc"
# ^ = start of sequence, e.g., "^Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc"
# % = middle of sequence (i.e., neither start nor end)
# $ = end of sequence, e.g., "Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc$"
# ?<= = lookbehind (i.e., provided pattern must be present before rest of pattern but is not included in match), e.g., "(?<=Xyl-)Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc"
# ?<! = negative lookbehind (i.e., provided pattern is not present before rest of pattern and is also not included in match), e.g., "(?<!Xyl-)Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc"
# ?= = lookahead (i.e., provided pattern must be present after rest of pattern but is not included in match), e.g., "Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc(?=-HexNAc)"
# ?! = negative lookahead (i.e., provided pattern is not present after rest of pattern and is not included in match), e.g., "Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc(?!-HexNAc)"

# Example: extracting the sequence from the a1-6 branch of N-glycans
pattern = "r[Sia]{,1}-Monosaccharide-([dHex]){,1}-Monosaccharide(?=-Mana6-Monosaccharide)"
print(get_match(pattern, "GalNAc(b1-4)GlcNAc(b1-2)Man(a1-3)[Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"))
print(get_match(pattern, "GalNAc(b1-4)GlcNAc(b1-2)Man(a1-3)[GalNAc(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"))
print(get_match(pattern, "GalNAc(b1-4)GlcNAc(b1-2)Man(a1-3)[Neu5Ac(a2-6)GalNAc(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"))
print(get_match(pattern, "GalNAc(b1-4)GlcNAc(b1-2)Man(a1-3)[Neu5Gc(a2-6)GalNAc(b1-4)[Fuc(a1-3)]GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"))

['Gal(b1-4)GlcNAc']
['GalNAc(b1-4)GlcNAc']
['Neu5Ac(a2-6)GalNAc(b1-4)GlcNAc']
['Neu5Gc(a2-6)GalNAc(b1-4)[Fuc(a1-3)]GlcNAc']

For interested users, here is a selection of regular expression patterns that we find useful in our own work:

Lewis or sialyl-Lewis structures:
pattern = “r[Sia]{,1}-[Gal|GalOS]{1}-([Fuc]){1}-[GlcNAc|GlcNAc6S]{1}”
Blood groups:
pattern = “rFuc-([Gal|GalNAc])?-Gal-GlcNAc”
a1-6 branch in N-glycans:
pattern = “r[Sia]{,1}-[Hex|HexNAc]{,1}-([dHex]){,1}-[Man|GlcNAc]{1}-([.-.|.]){,1}-Mana6(?=-Manb4-GlcNAc)”
b1-6 branch in O-glycans (from core 2/4/6):
pattern = “r[Sia|dHex]{,1}-[Hex|HexNAc]{,1}-([dHex]){,1}-.b6(?=-GalNAc)”
b1-3 branch in O-glycans (from core 1/2):
pattern = “r[Sia]{,1}-[.]{,1}-([dHex]){,1}-.b3(?=-GalNAc)”

get_match_batch

 get_match_batch (pattern:str,
                  glycan_list:List[Union[str,networkx.classes.digraph.DiGraph]],
                  return_matches:bool=True)

Find glyco-regular expression matches in list of glycans

	Type	Default	Details
pattern	str		Glyco-regular expression; e.g., “Hex-HexNAc-([Hex|Fuc]){1,2}-HexNAc”
glycan_list	List		List of glycans
return_matches	bool	True	Whether to return matches vs boolean
Returns	Union		Match results for each glycan

motif_to_regex

 motif_to_regex (motif:str)

Convert glycan motif to regular expression pattern

	Type	Details
motif	str	Glycan in IUPAC-condensed
Returns	str	Regular expression

motif_to_regex("Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-?)")

'Fuca3-([Galb4]){1}-GlcNAcb?'
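
The shape of this conversion can be reproduced for simple motifs with two substitutions: shorten each linkage and wrap each branch in a group that must occur exactly once. This is a minimal sketch that is only checked against the example above. `motif_to_pattern` is a hypothetical name, and nested or multi-residue branches may well be handled differently by motif_to_regex.

```python
import re

def motif_to_pattern(motif: str) -> str:
    """Convert e.g. 'Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-?)' into a glyco-regex-style string."""
    # "(a1-3)" -> "a3-": keep the anomer and the target position, append a separator.
    out = re.sub(r"\(([ab?])[\d?]-([\d?])\)", r"\1\2-", motif)
    # "[Galb4-]" -> "([Galb4]){1}-": a branch becomes a group that occurs exactly once.
    out = re.sub(r"\[([^\[\]]+)-\]", r"([\1]){1}-", out)
    return out.rstrip("-")

print(motif_to_pattern("Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-?)"))
```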

tokenization

helper functions to map m/z–>composition, composition–>structure, structure–>motif, and more

string_to_labels

 string_to_labels (character_string:str,
                   libr:Optional[Dict[str,int]]=None)

Tokenize word by indexing characters in library

	Type	Default	Details
character_string	str		String to tokenize
libr	Optional	None	Dictionary mapping characters to indices
Returns	List		List of character indices

string_to_labels(['Man','a1-3','Man','a1-6','Man'])

[None, None, None, None, None]

pad_sequence

 pad_sequence (seq:List[int], max_length:int,
               pad_label:Optional[int]=None,
               libr:Optional[Dict[str,int]]=None)

Pad sequences to same length using padding token

	Type	Default	Details
seq	List		Sequence to pad
max_length	int		Target length
pad_label	Optional	None	Padding token value
libr	Optional	None	Character library
Returns	List		Padded sequence

pad_sequence(string_to_labels(['Man','a1-3','Man','a1-6','Man']), 7)

[None, None, None, None, None, 25, 25]
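
The padding step itself is simple to sketch. The helper below is hypothetical (`pad_to_length`), takes the padding label explicitly instead of deriving it from a library, and leaves sequences that are already long enough unchanged, which is an assumption about the intended behaviour.

```python
def pad_to_length(seq: list, max_length: int, pad_label: int) -> list:
    """Append pad_label until the sequence reaches max_length."""
    if len(seq) >= max_length:
        return list(seq)  # assumption: longer sequences are returned as they are
    return list(seq) + [pad_label] * (max_length - len(seq))

# The indices 3, 7, and 9 are made-up token labels for illustration.
print(pad_to_length([3, 7, 3, 9, 3], 7, pad_label=25))
```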

stemify_glycan

 stemify_glycan (glycan:str, stem_lib:Optional[Dict[str,str]]=None,
                 libr:Optional[Dict[str,int]]=None)

Remove modifications from all monosaccharides in glycan

	Type	Default	Details
glycan	str		Glycan in IUPAC-condensed format
stem_lib	Optional	None	Modified to core monosaccharide mapping; default:created from lib
libr	Optional	None	Glycoletter to index mapping
Returns	str		Stemmed glycan string

stemify_glycan("Neu5Ac9Ac(a2-3)Gal6S(b1-3)[Neu5Ac(a2-6)]GalNAc")

'Neu5Ac(a2-3)Gal(b1-3)[Neu5Ac(a2-6)]GalNAc'
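
Conceptually, stemming maps every modified monosaccharide back to its unmodified stem. The sketch below does this with a longest-prefix lookup against a short stem list. Both `strip_modifications` and `STEMS` are hypothetical, the list is far from complete (for example, `GlcA` would be wrongly reduced to `Glc`), and the library builds its mapping from its own glycoletter library instead.

```python
import re

# Illustrative stem list only (assumption); a real mapping needs many more entries.
STEMS = ["Neu5Ac", "GalNAc", "GlcNAc", "Gal", "Glc", "Man", "Fuc"]

def strip_modifications(glycan: str) -> str:
    """Replace each monosaccharide token with the longest stem it starts with."""
    # Keep linkages and brackets as separate tokens so they pass through untouched.
    parts = re.split(r"(\([^)]*\)|\[|\])", glycan)
    out = []
    for part in parts:
        if part and part[0] not in "([]":
            matches = [stem for stem in STEMS if part.startswith(stem)]
            if matches:
                part = max(matches, key=len)
        out.append(part)
    return "".join(out)

print(strip_modifications("Neu5Ac9Ac(a2-3)Gal6S(b1-3)[Neu5Ac(a2-6)]GalNAc"))
```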

stemify_dataset

 stemify_dataset (df:pandas.core.frame.DataFrame,
                  stem_lib:Optional[Dict[str,str]]=None,
                  libr:Optional[Dict[str,int]]=None,
                  glycan_col_name:str='glycan', rarity_filter:int=1)

Remove monosaccharide modifications from all glycans in dataset

	Type	Default	Details
df	DataFrame		DataFrame with glycan column
stem_lib	Optional	None	Modified to core monosaccharide mapping; default:created from lib
libr	Optional	None	Glycoletter to index mapping
glycan_col_name	str	glycan	Column name for glycans
rarity_filter	int	1	Minimum occurrences to keep modification
Returns	DataFrame		DataFrame with stemified glycans

mask_rare_glycoletters

 mask_rare_glycoletters (glycans:List[str],
                         thresh_monosaccharides:Optional[int]=None,
                         thresh_linkages:Optional[int]=None)

Mask rare monosaccharides and linkages in glycans

	Type	Default	Details
glycans	List		List of IUPAC-condensed glycans
thresh_monosaccharides	Optional	None	Threshold for rare monosaccharides (default: 0.001*len(glycans))
thresh_linkages	Optional	None	Threshold for rare linkages (default: 0.03*len(glycans))
Returns	List		List of glycans with masked rare elements

mz_to_composition

 mz_to_composition (mz_value:float, mode:str='negative',
                    mass_value:str='monoisotopic', reduced:bool=False,
                    sample_prep:str='underivatized',
                    mass_tolerance:float=0.5, kingdom:str='Animalia',
                    glycan_class:str='all',
                    df_use:Optional[pandas.core.frame.DataFrame]=None,
                    filter_out:Optional[Set[str]]=None,
                    extras:List[str]=['doubly_charged'],
                    adduct:Optional[str]=None)

Map m/z value to matching monosaccharide composition

	Type	Default	Details
mz_value	float		m/z value from mass spec
mode	str	negative	MS mode: positive/negative
mass_value	str	monoisotopic	Mass type: monoisotopic/average
reduced	bool	False	Whether glycans are reduced
sample_prep	str	underivatized	Sample preparation method: underivatized/permethylated/peracetylated
mass_tolerance	float	0.5	Mass tolerance for matching
kingdom	str	Animalia	Taxonomic kingdom filter for choosing a subset of glycans to consider
glycan_class	str	all	Glycan class: N/O/lipid/free/all
df_use	Optional	None	Custom glycan database
filter_out	Optional	None	Monosaccharides to ignore during composition finding
extras	List	['doubly_charged']	Additional operations: adduct/doubly_charged
adduct	Optional	None	Chemical formula of adduct that contributes to m/z, e.g., "C2H4O2"
Returns	List		List of matching compositions

mz_to_composition(665.4, glycan_class='O', filter_out={'Kdn', 'P', 'HexA', 'Pen', 'HexN', 'Me', 'PCho', 'PEtN'},
                    reduced = True)

[{'Hex': 1, 'HexNAc': 2, 'Neu5Ac': 1, 'Neu5Gc': 1, 'dHex': 1}]
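
The returned composition appears consistent with a doubly charged, reduced ion, which fits the default `extras=['doubly_charged']`. The arithmetic can be checked by hand with the sketch below. The residue masses are commonly tabulated monoisotopic values, and the function `sketch_mz` is hypothetical; the library's internal constants and ion handling may differ.

```python
# Commonly tabulated monoisotopic residue masses in Da (assumed values)
RESIDUE = {"Hex": 162.0528, "HexNAc": 203.0794, "dHex": 146.0579,
           "Neu5Ac": 291.0954, "Neu5Gc": 307.0903}
WATER, H2, PROTON = 18.0106, 2.0157, 1.00728

def sketch_mz(composition, reduced=False, charge=1):
    """Negative-mode m/z of [M - zH]z- for a composition (illustrative only)."""
    mass = sum(RESIDUE[k] * v for k, v in composition.items()) + WATER
    if reduced:
        mass += H2  # reducing the reducing end adds two hydrogens
    return (mass - charge * PROTON) / charge

comp = {'Hex': 1, 'HexNAc': 2, 'Neu5Ac': 1, 'Neu5Gc': 1, 'dHex': 1}
mz = sketch_mz(comp, reduced=True, charge=2)
# True if the doubly charged ion falls inside the default 0.5 tolerance
print(abs(mz - 665.4) <= 0.5)
```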

match_composition_relaxed

 match_composition_relaxed (composition:Dict[str,int],
                            glycan_class:str='N', kingdom:str='Animalia',
                            df_use:Optional[pandas.core.frame.DataFrame]=None,
                            reducing_end:Optional[str]=None)

Map coarse-grained composition to matching glycans

	Type	Default	Details
composition	Dict		Dictionary indicating composition (e.g. {"dHex": 1, "Hex": 1, "HexNAc": 1})
glycan_class	str	N	Glycan class: N/O/lipid/free
kingdom	str	Animalia	Taxonomic kingdom filter for choosing a subset of glycans to consider
df_use	Optional	None	Custom glycan database
reducing_end	Optional	None	Reducing end specification
Returns	List		List of matching glycans

match_composition_relaxed({"Hex":3, "HexNAc":2, "dHex":1}, glycan_class = 'O')

['Fuc(a1-2)[Gal(a1-3)]Gal(b1-4)GlcNAc(b1-6)[Gal(b1-3)]GalNAc',
 'Fuc(a1-2)[Gal(a1-3)]Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]GalNAc',
 'Gal(b1-4)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-6)[Gal(b1-3)]GalNAc',
 'Gal(?1-3/4)Gal(b1-3/4)[Fuc(a1-3/4)]GlcNAc(b1-6)[Gal(b1-3)]GalNAc',
 'Gal(b1-4)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc',
 'Fuc(a1-2)Gal(b1-3/4)GlcNAc(b1-3)Gal(b1-3)[Gal(b1-6)]GalNAc',
 'Fuc(a1-2)Gal(b1-4)GlcNAc(b1-6)[Gal(?1-?)Gal(b1-3)]GalNAc',
 'Fuc(a1-2)[Gal(a1-3)]Gal(b1-3)GlcNAc(b1-3)Gal(b1-3)GalNAc',
 'Fuc(a1-2)[Gal(a1-3)]Gal(b1-4)GlcNAc(?1-3/4)Gal(b1-3)GalNAc',
 'Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal(b1-4)GlcNAc(b1-3)Gal',
 'Gal(?1-?)Gal(b1-4)GlcNAc(b1-6)[Fuc(a1-2)Gal(b1-3)]GalNAc',
 'Fuc(a1-2)Gal(b1-4)GlcNAc(b1-6)[Gal(a1-3)Gal(b1-3)]GalNAc',
 'Gal(a1-3)Gal(b1-4)GlcNAc(b1-6)[Fuc(a1-2)Gal(b1-3)]GalNAc',
 'Fuc(a1-2)Gal(b1-3)Gal(b1-3)GlcNAc(b1-6)[Gal(b1-3)]GalNAc',
 'Fuc(a1-2)Gal(b1-3)Gal(b1-3)[Gal(b1-4)GlcNAc(b1-6)]GalNAc',
 'Gal(b1-4)Gal(b1-3)[Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-6)]GalNAc',
 'Fuc(a1-2)Gal(?1-?)Gal(b1-3/4)GlcNAc(b1-6)[Gal(b1-3)]GalNAc',
 'Gal(a1-3)GalNAc(a1-3)[Fuc(a1-2)]Gal(b1-3)Gal(b1-3)GalNAc',
 'Man(a1-6)Glc(a1-4)GlcNAc(b1-4)[Fuc(a1-2)]Gal(b1-3)GalNAc',
 'Man(a1-6)Glc(b1-4)GlcNAc(b1-4)[Fuc(a1-2)]Gal(b1-3)GalNAc',
 'Fuc(a1-2)Gal(b1-3)GlcNAc(b1-3)Gal(b1-4)GlcNAc(b1-?)Man',
 'Gal(b1-2)Gal(a1-3)[Fuc(a1-2)]Gal(b1-3)[GlcNAc(b1-6)]GalNAc',
 'Fuc(a1-2)Gal(a1-3)Gal(a1-4)Gal(b1-3)[GlcNAc(b1-6)]GalNAc',
 'Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-6)[Gal(b1-3)]Gal(b1-3)GalNAc',
 'Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-?)Gal(b1-6)[Gal(b1-3)]GalNAc']

condense_composition_matching

 condense_composition_matching (matched_composition:List[str])

Find minimum set of glycans characterizing matched composition

	Type	Details
matched_composition	List	List of matching glycans
Returns	List	Minimal list of representative glycans

match_comp = match_composition_relaxed({'Hex':1, 'HexNAc':1, 'Neu5Ac':1}, glycan_class = 'O')
print(match_comp)
condense_composition_matching(match_comp)

['Neu5Ac(a2-3)Gal(b1-3)GalNAc', 'Gal(b1-3)[Neu5Ac(a2-6)]GalNAc', '{Neu5Ac(a2-3/6)}Gal(b1-3)GalNAc', 'Neu5Ac(a2-3)[GalNAc(b1-4)]Gal', 'Gal(a1-3)[Neu5Ac(a2-6)]GalNAc', 'Neu5Ac(a2-3/6)Gal(b1-3)GalNAc', 'Neu5Ac(a2-6)Gal(b1-3)GalNAc', 'Gal(?1-3)[Neu5Ac(a2-6)]GalNAc', 'Neu5Ac(a2-3/6)Gal(?1-3)GalNAc', 'Neu5Ac(a2-?)Hex(?1-?)GalNAc', 'Neu5Ac(a2-3)Gal(?1-?)GalNAc', 'Neu5Ac(a2-3/6)GalNAc(a1-6)Gal', 'Neu5Ac(a2-6)Gal(a1-3)GalNAc', 'Gal(b1-4)[Neu5Ac(a2-6)]GalNAc', 'Neu5Ac(a2-3)GalNAc(b1-3)Gal']

['Neu5Ac(a2-3)Gal(b1-3)GalNAc',
 'Neu5Ac(a2-3/6)Gal(b1-3)GalNAc',
 'Gal(b1-3)[Neu5Ac(a2-6)]GalNAc',
 'Gal(a1-3)[Neu5Ac(a2-6)]GalNAc',
 '{Neu5Ac(a2-3/6)}Gal(b1-3)GalNAc',
 'Neu5Ac(a2-3)[GalNAc(b1-4)]Gal',
 'Neu5Ac(a2-6)Gal(b1-3)GalNAc',
 'Neu5Ac(a2-3/6)GalNAc(a1-6)Gal',
 'Neu5Ac(a2-6)Gal(a1-3)GalNAc',
 'Gal(b1-4)[Neu5Ac(a2-6)]GalNAc',
 'Neu5Ac(a2-3)GalNAc(b1-3)Gal']

mz_to_structures

 mz_to_structures (mz_list:List[float], glycan_class:str,
                   kingdom:str='Animalia',
                   abundances:Optional[pandas.core.frame.DataFrame]=None,
                   mode:str='negative', mass_value:str='monoisotopic',
                   sample_prep:str='underivatized',
                   mass_tolerance:float=0.5, reduced:bool=False,
                   df_use:Optional[pandas.core.frame.DataFrame]=None,
                   filter_out:Optional[Set[str]]=None, verbose:bool=False)

Map precursor masses to structures, supporting accompanying relative intensities

	Type	Default	Details
mz_list	List		List of precursor masses
glycan_class	str		Glycan class: N/O/lipid/free
kingdom	str	Animalia	Taxonomic kingdom filter for choosing a subset of glycans to consider
abundances	Optional	None	Sample abundances matrix
mode	str	negative	MS mode: positive/negative
mass_value	str	monoisotopic	Mass type: monoisotopic/average
sample_prep	str	underivatized	Sample prep: underivatized/permethylated/peracetylated
mass_tolerance	float	0.5	Mass tolerance for matching
reduced	bool	False	Whether glycans are reduced
df_use	Optional	None	Custom glycan database
filter_out	Optional	None	Monosaccharides to ignore
verbose	bool	False	Whether to print non-matching compositions
Returns	Union		DataFrame of structures x intensities or empty list

mz_to_structures([674.29], glycan_class = 'O')

0 compositions could not be matched. Run with verbose = True to see which compositions.

	glycan	abundance
0	GlcNAc(b1-3)[Kdn(a2-6)]GalNAc	0
1	GalNAc(a1-3)[Kdn(a2-6)]GalNAc	0

compositions_to_structures

 compositions_to_structures (composition_list:List[Dict[str,int]],
                             glycan_class:str='N', kingdom:str='Animalia',
                             abundances:Optional[pandas.core.frame.DataFrame]=None,
                             df_use:Optional[pandas.core.frame.DataFrame]=None,
                             verbose:bool=False)

Map compositions to structures, supporting accompanying relative intensities

	Type	Default	Details
composition_list	List		List of compositions like {'Hex': 1, 'HexNAc': 1}
glycan_class	str	N	Glycan class: N/O/lipid/free
kingdom	str	Animalia	Taxonomic kingdom filter for choosing a subset of glycans to consider
abundances	Optional	None	Sample abundances matrix
df_use	Optional	None	Custom glycan database
verbose	bool	False	Whether to print non-matching compositions
Returns	DataFrame		DataFrame of structures x intensities

compositions_to_structures([{'Neu5Ac': 2, 'Hex': 1, 'HexNAc': 1}], glycan_class = 'O')

0 compositions could not be matched. Run with verbose = True to see which compositions.

	glycan	abundance
0	Neu5Ac(a2-3)Gal(b1-3)[Neu5Ac(a2-6)]GalNAc	0
1	Neu5Ac(a2-8)Neu5Ac(a2-6)[Gal(b1-3)]GalNAc	0
2	Neu5Ac(a2-3)[Neu5Ac(a2-6)]Gal(b1-3)GalNAc	0
3	Neu5Ac(a2-3)Gal(b1-4)[Neu5Ac(a2-6)]GalNAc	0

compositions_to_structures(["H1N1A2"], glycan_class = 'O')

0 compositions could not be matched. Run with verbose = True to see which compositions.

	glycan	abundance
0	Neu5Ac(a2-3)Gal(b1-3)[Neu5Ac(a2-6)]GalNAc	0
1	Neu5Ac(a2-8)Neu5Ac(a2-6)[Gal(b1-3)]GalNAc	0
2	Neu5Ac(a2-3)[Neu5Ac(a2-6)]Gal(b1-3)GalNAc	0
3	Neu5Ac(a2-3)Gal(b1-4)[Neu5Ac(a2-6)]GalNAc	0
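
The two calls above return the same structures, which suggests that the shorthand "H1N1A2" is read as Hex, HexNAc, and Neu5Ac counts. A minimal sketch of such a parser follows. The function `sketch_parse_shorthand` and its three-letter mapping are assumptions inferred from this example only; the library may accept more letters.

```python
import re

# Letter codes inferred from the example above (assumed, not exhaustive)
LETTER_TO_MONO = {"H": "Hex", "N": "HexNAc", "A": "Neu5Ac"}

def sketch_parse_shorthand(shorthand):
    # Each capital letter is followed by its count, e.g. "A2" -> Neu5Ac: 2
    pairs = re.findall(r"([A-Z])(\d+)", shorthand)
    return {LETTER_TO_MONO[letter]: int(n) for letter, n in pairs}

print(sketch_parse_shorthand("H1N1A2"))
```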

structure_to_basic

 structure_to_basic (glycan:str)

Convert glycan structure to base topology

	Type	Details
glycan	str	Glycan in IUPAC-condensed format
Returns	str	Base topology string

structure_to_basic("Neu5Ac(a2-3)Gal6S(b1-3)[Neu5Ac(a2-6)]GalNAc")

'Neu5Ac(?1-?)HexOS(?1-?)[Neu5Ac(?1-?)]HexNAc'
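
The conversion shown above can be pictured as three substitutions: every linkage becomes `(?1-?)`, each monosaccharide is replaced by its base type, and the position of a modification is dropped. The sketch below reproduces the example output under those assumptions. The `STEM` mapping is hypothetical and incomplete, and `sketch_to_basic` is not the library's code.

```python
import re

# Hypothetical, incomplete mapping from specific monosaccharides to base types
STEM = {"Gal": "Hex", "Glc": "Hex", "Man": "Hex",
        "GalNAc": "HexNAc", "GlcNAc": "HexNAc", "Fuc": "dHex"}

def sketch_to_basic(glycan):
    # Replace every linkage, whatever its content, with the fully ambiguous form
    masked = re.sub(r"\([^)]*\)", "(?1-?)", glycan)

    def generalize(match):
        core, mod = re.fullmatch(r"(.+?)(\d[SP])?", match.group(0)).groups()
        # Drop the modification position: '6S' becomes 'OS'
        return STEM.get(core, core) + ("O" + mod[-1] if mod else "")

    # Monosaccharide tokens start with a letter; '(?1-?)' contains none
    return re.sub(r"[A-Za-z][A-Za-z0-9]*", generalize, masked)

print(sketch_to_basic("Neu5Ac(a2-3)Gal6S(b1-3)[Neu5Ac(a2-6)]GalNAc"))
```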

glycan_to_composition

 glycan_to_composition (glycan:str,
                        stem_libr:Optional[Dict[str,str]]=None)

Map glycan to its composition

	Type	Default	Details
glycan	str		Glycan in IUPAC-condensed format
stem_libr	Optional	None	Modified to core monosaccharide mapping; default: created from lib
Returns	Dict		Dictionary of monosaccharide counts

glycan_to_composition("Neu5Ac(a2-3)Gal6S(b1-3)[Neu5Ac(a2-6)]GalNAc")

{'Hex': 1, 'HexNAc': 1, 'Neu5Ac': 2, 'S': 1}
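
Conceptually, the composition is a count of base monosaccharide types plus any modifications such as sulfate. The sketch below illustrates this on the example sequence. The tokenizer, the small `STEM` mapping, and the name `sketch_composition` are assumptions for illustration; the library derives its mapping from `lib` as noted in the table.

```python
import re
from collections import Counter

# Hypothetical, incomplete mapping from specific monosaccharides to base types
STEM = {"Gal": "Hex", "Glc": "Hex", "Man": "Hex",
        "GalNAc": "HexNAc", "GlcNAc": "HexNAc", "Fuc": "dHex"}

def sketch_composition(glycan):
    counts = Counter()
    # Split on linkages "(...)" and branch brackets, dropping empty pieces
    for token in re.split(r"\([^)]*\)|\[|\]", glycan):
        if not token:
            continue
        core, mod = re.fullmatch(r"(.+?)(\d[SP])?", token).groups()
        counts[STEM.get(core, core)] += 1
        if mod:
            counts[mod[-1]] += 1  # '6S' contributes one 'S'
    return dict(counts)

print(sketch_composition("Neu5Ac(a2-3)Gal6S(b1-3)[Neu5Ac(a2-6)]GalNAc"))
```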

glycan_to_mass

 glycan_to_mass (glycan:str, mass_value:str='monoisotopic',
                 sample_prep:str='underivatized',
                 stem_libr:Optional[Dict[str,str]]=None,
                 adduct:Union[str,float,NoneType]=None)

Calculate theoretical mass from glycan

	Type	Default	Details
glycan	str		Glycan in IUPAC-condensed format
mass_value	str	monoisotopic	Mass type: monoisotopic/average
sample_prep	str	underivatized	Sample prep: underivatized/permethylated/peracetylated
stem_libr	Optional	None	Modified to core monosaccharide mapping
adduct	Union	None	Chemical formula of adduct (e.g., "C2H4O2") OR its exact mass in Da
Returns	float		Theoretical mass

glycan_to_mass("Neu5Ac(a2-3)Gal6S(b1-3)[Neu5Ac(a2-6)]GalNAc")

1045.2903546

composition_to_mass

 composition_to_mass (dict_comp_in:Dict[str,int],
                      mass_value:str='monoisotopic',
                      sample_prep:str='underivatized',
                      adduct:Union[str,float,NoneType]=None)

Calculate theoretical mass from composition

	Type	Default	Details
dict_comp_in	Dict		Composition dictionary of monosaccharide:count
mass_value	str	monoisotopic	Mass type: monoisotopic/average
sample_prep	str	underivatized	Sample prep: underivatized/permethylated/peracetylated
adduct	Union	None	Chemical formula of adduct (e.g., "C2H4O2") OR its exact mass in Da
Returns	float		Theoretical mass

composition_to_mass({'Neu5Ac': 2, 'Hex': 1, 'HexNAc': 1, 'S': 1})

1045.2903546
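
The value above can be reproduced by summing residue masses and adding one water for the free reducing end. The sketch below does this with commonly tabulated monoisotopic residue masses. The constants and the function `sketch_composition_mass` are assumptions; the library's own tables may differ in the last decimals.

```python
# Commonly tabulated monoisotopic residue masses in Da (assumed values)
RESIDUE = {"Hex": 162.05282, "HexNAc": 203.07937,
           "Neu5Ac": 291.09542, "S": 79.95682}
WATER = 18.01056

def sketch_composition_mass(composition):
    # Sum of residue masses plus one water for the intact, underivatized glycan
    return sum(RESIDUE[k] * n for k, n in composition.items()) + WATER

mass = sketch_composition_mass({'Neu5Ac': 2, 'Hex': 1, 'HexNAc': 1, 'S': 1})
# True if the sketch agrees with the documented value to within 0.01 Da
print(abs(mass - 1045.2903546) < 0.01)
```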

calculate_adduct_mass

 calculate_adduct_mass (formula:str, mass_value:str='monoisotopic',
                        enforce_sign:bool=False)

Calculate mass of adduct from chemical formula, including signed formulas

	Type	Default	Details
formula	str		Chemical formula of adduct (e.g., "C2H4O2", "-H2O", "+Na")
mass_value	str	monoisotopic	Mass type: monoisotopic/average
enforce_sign	bool	False	If True, returns 0 for unsigned formulas
Returns	float		Formula mass

calculate_adduct_mass("C2H4O2")

60.021
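
As an illustration of how a signed formula could translate into a mass, the sketch below parses element symbols and counts and applies the leading sign. The element table is a small subset of standard monoisotopic masses, and `sketch_adduct_mass` is a hypothetical helper written to mirror the documented `enforce_sign` behavior, not the library's implementation.

```python
import re

# Standard monoisotopic atomic masses in Da (subset, for illustration)
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "O": 15.994915,
                "N": 14.003074, "Na": 22.989770}

def sketch_adduct_mass(formula, enforce_sign=False):
    signed = formula.startswith(("+", "-"))
    if enforce_sign and not signed:
        return 0.0  # documented behavior: unsigned formulas yield 0
    sign = -1.0 if formula.startswith("-") else 1.0
    body = formula[1:] if signed else formula
    total = 0.0
    # An element symbol followed by an optional count, e.g. "H2" or "Na"
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", body):
        total += MONOISOTOPIC[element] * int(count or 1)
    return sign * total

print(round(sketch_adduct_mass("C2H4O2"), 3))
print(sketch_adduct_mass("-H2O") < 0)
```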

get_unique_topologies

 get_unique_topologies (composition:Dict[str,int], glycan_type:str,
                        df_use:Optional[pandas.core.frame.DataFrame]=None,
                        universal_replacers:Optional[Dict[str,str]]=None,
                        taxonomy_rank:str='Kingdom',
                        taxonomy_value:str='Animalia')

Get all observed unique base topologies for composition

	Type	Default	Details
composition	Dict		Composition dictionary of monosaccharide:count
glycan_type	str		Glycan class: N/O/lipid/free/repeat
df_use	Optional	None	Custom glycan database to use for mapping
universal_replacers	Optional	None	Base-to-specific monosaccharide mapping
taxonomy_rank	str	Kingdom	Taxonomic rank for filtering
taxonomy_value	str	Animalia	Value at taxonomy rank
Returns	List		List of unique base topologies

get_unique_topologies({'HexNAc':2, 'Hex':1}, 'O', universal_replacers = {'dHex':'Fuc'})

['Hex(?1-?)HexNAc(?1-?)HexNAc',
 'Hex(?1-?)[HexNAc(?1-?)]HexNAc',
 'HexNAc(?1-?)[HexNAc(?1-?)]Hex',
 'HexNAc(?1-?)HexNAc(?1-?)Hex',
 'HexNAc(?1-?)Hex(?1-?)HexNAc']

get_random_glycan

 get_random_glycan (n:int=1, glycan_class:str='all',
                    kingdom:str='Animalia')

	Type	Default	Details
n	int	1	How many random glycans to sample
glycan_class	str	all	Glycan class: N/O/lipid/free/repeat/all
kingdom	str	Animalia	Taxonomic kingdom filter for choosing a subset of glycans to consider
Returns	Union		Returns a random glycan or list of glycans if n > 1

get_random_glycan()

'Fuc(a1-3/4)[Gal(b1-3/4)]GlcNAc(b1-3/6)[Fuc(a1-2)]Gal(b1-3)GalNAc'