Foodinformatics: Quantitative Structure-Property Relationship Modeling of Volatile Organic Compounds in Peppers
The aim of this work was the foodinformatic (chemoinformatic) modeling of volatile organic compounds (VOCs) in different pepper samples, based on a quantitative structure-property relationship (QSPR) for the retention indices of 273 identified compounds. The experimental retention indices were measured by comprehensive two-dimensional gas chromatography combined with quadrupole mass spectrometry (GC × GC/qMS) using a coupled BPX5 and BP20 column system. All the VOCs were represented by both conformation-independent molecular descriptors and molecular fingerprints calculated with the Dragon and PaDEL-Descriptor software. The dataset was divided into training, validation, and test sets of molecules according to the Balanced Subsets Method (BSM). Subsequently, the V-WSP unsupervised variable reduction method was used to reduce multicollinearity, redundancy, and noise in the initial pool of 4,336 molecular descriptors and fingerprints. The resulting reduced pool of 1,664 variables was submitted to supervised variable subset selection by means of the replacement method (RM) in order to define a four-descriptor model. The quality of the model was measured by the coefficient of determination and the root-mean-square deviation in fitting (R²train = 0.879 and RMSDtrain = 72.1), validation (R²val = 0.832 and RMSDval = 91.7), and prediction (R²test = 0.915 and RMSDtest = 55.4). The small differences among these parameters across the three sets indicate a stable and predictive QSPR model. This QSPR model was developed in accordance with the five principles defined by the Organisation for Economic Co-operation and Development (OECD) for the validation of such models, in order to support its practical applicability.
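The two quality parameters reported for each set can be illustrated with a minimal sketch. It assumes the common definitions R² = 1 − SS_res/SS_tot and RMSD = sqrt(mean squared residual); the exact variants used in this work (for example, a degrees-of-freedom correction in the RMSD, or R² as a squared correlation coefficient) are not stated here and may differ. The function names and the retention index values are hypothetical and are not data from this study.

```python
import math


def rmsd(observed, predicted):
    """Root-mean-square deviation between observed and predicted values.

    Assumption: plain mean of squared residuals (no degrees-of-freedom correction).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical retention indices, for illustration only (not data from the study).
observed = [800.0, 950.0, 1100.0, 1250.0, 1400.0]
predicted = [820.0, 930.0, 1120.0, 1230.0, 1400.0]

print(f"R2 = {r_squared(observed, predicted):.3f}")
print(f"RMSD = {rmsd(observed, predicted):.1f}")
```

Applying the same two functions separately to the training, validation, and test predictions would give the three pairs of values that are compared to judge the stability of a model.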