GMADB Summary

As of June 2019, the GMADB contains 5203 glycan microarray samples collected from the Consortium for Functional Glycomics (CFG). When the same lectin has data from multiple experiments, either on different glycan array versions (from 1.0 to 5.2) or at different concentrations, each experiment is counted as a separate sample. Among the 5203 microarray samples, 1849 have protein sequence information available (Table 1 and Figure 1A). We ran a BLAST search of these sequences against all protein sequences from PDB protein structures, retaining matches with sequence similarity greater than 95%; the number of matched PDB entries is shown in Figure 1B. Because multiple microarray samples can share the same protein sequence and therefore match the same PDB files, we removed this redundancy. After deduplication, the microarray samples contain 541 unique protein sequences, which are cross-linked to 5790 unique PDB entries. Of these 5790 PDB entries, 1029 were found to contain glycan ligands, and the length distribution of the largest glycan ligand in each PDB file is shown in Figure 1C.
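
The redundancy-removal step can be illustrated with a minimal Python sketch. This is not the GMADB pipeline itself: the function name `cross_link`, the tuple layouts, and the toy sequences and PDB IDs are all hypothetical, and the sketch assumes that the 95% threshold is applied to a per-hit percentage score taken from the BLAST output.

```python
from collections import defaultdict

SIMILARITY_CUTOFF = 95.0  # percent; the text keeps hits "greater than 95%"


def cross_link(samples, blast_hits, cutoff=SIMILARITY_CUTOFF):
    """Collapse samples to unique sequences and link them to PDB entries.

    samples:    iterable of (sample_id, sequence or None)   [hypothetical layout]
    blast_hits: iterable of (sequence, pdb_id, percent)      [hypothetical layout]
    """
    # Samples without sequence information are skipped; duplicates collapse.
    unique_sequences = {seq for _, seq in samples if seq}

    links = defaultdict(set)
    for seq, pdb_id, percent in blast_hits:
        if seq in unique_sequences and percent > cutoff:
            # Upper-case the ID so "1abc" and "1ABC" count as one entry.
            links[seq].add(pdb_id.upper())
    return unique_sequences, dict(links)


# Toy data: four samples, two of which share a sequence, one without a sequence.
samples = [
    ("s1", "MKTAYIAK"),
    ("s2", "MKTAYIAK"),
    ("s3", None),
    ("s4", "GSHMLEDP"),
]
hits = [
    ("MKTAYIAK", "1abc", 99.1),
    ("MKTAYIAK", "1ABC", 97.0),   # same entry, different case
    ("MKTAYIAK", "2xyz", 90.0),   # below the cutoff, discarded
    ("GSHMLEDP", "3def", 100.0),
    ("GSHMLEDP", "1abc", 96.0),   # one PDB entry can match several sequences
]

unique_sequences, links = cross_link(samples, hits)
unique_pdb = set().union(*links.values())
print(len(unique_sequences), "unique sequences,", len(unique_pdb), "unique PDB entries")
```

Counting unique PDB entries over the union of all per-sequence sets, rather than summing the per-sequence counts, is what prevents an entry matched by several sequences from being counted more than once.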