Cluster analysis has proven successful in categorizing the physico-chemical and biological properties of compounds. However, conventional approaches to clustering molecular structures, in which chemical graphs are transformed into sequences of numbers, seldom meet chemists' expectations. Graph-based techniques that cluster compounds by common structural motifs are gaining popularity because they better mimic human categorization. This presentation introduces one such graph-based method, LibraryMCS, which clusters compounds hierarchically according to their maximum common substructures (MCS). Unlike some other graph-based clustering methods, LibraryMCS neither involves a similarity-based pre-clustering step nor relies on predefined fragments. Recent evaluations by different research groups indicated that LibraryMCS can produce high-quality clusters that agree with human categorization within practicable time (approximately 1000 structures/s). The presentation will recount and demonstrate typical uses of LibraryMCS: virtual HTS hit set profiling, R-group decomposition by learned scaffolds, perception of novel scaffolds, reverse engineering of combinatorial libraries, diversity assessment of large chemical libraries, and compound acquisition.
Hierarchical clustering of chemical structures by maximum common substructures
Posted by
Miklós Vargyas
on 13 09 2012
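To make the idea of clustering by maximum common substructure concrete, here is a minimal sketch in plain Python. It is not the LibraryMCS algorithm, whose internals the abstract does not describe. It is a toy greedy agglomerative scheme under stated assumptions: molecules are hypothetical heavy-atom graphs (element symbols plus unordered bonds, no bond orders), the MCS is taken as the largest common induced subgraph found by exhaustive search (so it may be disconnected and only suits very small graphs), and the names `mol`, `mcs`, `cluster` and `min_size` are invented for illustration.

```python
from itertools import combinations

def mol(atoms, bonds):
    """Toy molecule: a tuple of element symbols plus a set of undirected bonds."""
    return (tuple(atoms), frozenset(frozenset(b) for b in bonds))

def mcs(a, b):
    """Largest common induced subgraph of two toy molecules (exhaustive search)."""
    atoms_a, bonds_a = a
    atoms_b, bonds_b = b
    best = {}

    def extend(i, mapping, used):
        nonlocal best
        # Prune: mapping every remaining atom still could not beat the best so far.
        if len(mapping) + len(atoms_a) - i <= len(best):
            return
        if i == len(atoms_a):
            best = dict(mapping)
            return
        for j, label in enumerate(atoms_b):
            if j in used or label != atoms_a[i]:
                continue
            # Atom i may map to j only if its bonds to already mapped atoms agree.
            if all((frozenset((i, k)) in bonds_a) == (frozenset((j, m)) in bonds_b)
                   for k, m in mapping.items()):
                extend(i + 1, {**mapping, i: j}, used | {j})
        extend(i + 1, mapping, used)  # also try leaving atom i unmapped

    extend(0, {}, frozenset())
    keep = sorted(best)
    new_index = {old: new for new, old in enumerate(keep)}
    bonds = [(new_index[x], new_index[y]) for x, y in map(tuple, bonds_a)
             if x in new_index and y in new_index]
    return mol([atoms_a[k] for k in keep], bonds)

def cluster(library, min_size):
    """Greedy agglomerative clustering: repeatedly merge the two clusters
    whose scaffolds share the largest MCS, until that MCS has fewer than
    min_size atoms. Each cluster is a (scaffold, member names) pair."""
    clusters = [(m, [name]) for name, m in library.items()]
    while len(clusters) > 1:
        scaffold, i, j = max(
            ((mcs(clusters[i][0], clusters[j][0]), i, j)
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda t: len(t[0][0]))
        if len(scaffold[0]) < min_size:
            break
        merged = (scaffold, clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

# Hypothetical four-compound library (heavy atoms only).
library = {
    "ethanol": mol("CCO", [(0, 1), (1, 2)]),
    "propanol": mol("CCCO", [(0, 1), (1, 2), (2, 3)]),
    "dimethylamine": mol("CNC", [(0, 1), (1, 2)]),
    "trimethylamine": mol("CNCC", [(0, 1), (1, 2), (1, 3)]),
}
clusters = cluster(library, min_size=3)
for scaffold, members in clusters:
    print("".join(scaffold[0]), sorted(members))
```

In this toy library the two alcohols end up under a C-C-O scaffold and the two amines under a C-N-C scaffold, with no similarity-based pre-clustering and no predefined fragment list. Recording each merge instead of only the final clusters would yield the full hierarchy. The all-pairs exhaustive search used here is exponential and would not approach the throughput quoted in the abstract; a production tool would need a far more efficient MCS search.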