Journal of Chemical Information and Computer Sciences, 2003-05, Vol.43 (3), p.964-969

Details

Author(s) / Contributors
Title
VSMP: A Novel Variable Selection and Modeling Method Based on the Prediction
Is part of
  • Journal of Chemical Information and Computer Sciences, 2003-05, Vol.43 (3), p.964-969
Place / Publisher
United States: American Chemical Society
Year of publication
2003
Source
Alma/SFX Local Collection
Descriptions/Notes
  • The use of numerous descriptors that are indicative of molecular structure and topology is becoming more common in quantitative structure-activity relationship (QSAR) studies. Choosing adequate descriptors for QSAR studies is important but difficult because there are no absolute rules to govern this choice. A variety of variable selection techniques, including stepwise regression, partial least squares/principal component analysis (PLS/PCA), neural networks, and evolutionary algorithms such as genetic algorithms, have been applied to this common problem. All-subsets regression (ASR) is capable of finding the best variable subset from among a large pool. In this paper, a novel variable selection and modeling method based on the prediction, VSMP for short, has been developed. Two controllable parameters are introduced into ASR to improve its performance: the interrelation coefficient between pairs of independent variables (r_int) and the correlation coefficient (q²) obtained using leave-one-out (LOO) cross-validation. This technique differs from other ASR-related variable selection procedures in two main features: (1) The search for optimal subsets is controlled by the statistic q² or the root-mean-square error of prediction (RMSEP) from the LOO cross-validation step rather than by the correlation coefficient obtained in the modeling step (r²). (2) The search for all optimal subsets is sped up by using the statistic r_int together with q². A comparison of the results of VSMP applied to the Selwood data set (n = 31 compounds, m = 53 descriptors) with those obtained from alternative algorithms shows the good performance of the technique.
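
The two controls described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how r_int and q² are combined internally, so the sketch assumes that r_int acts as a pre-filter that discards subsets containing a strongly intercorrelated descriptor pair, and that the LOO q² of an ordinary least-squares model ranks the remaining subsets. The function names `loo_q2` and `select_subset`, the threshold `r_int_max=0.75`, and the synthetic data are all hypothetical.

```python
from itertools import combinations

import numpy as np


def loo_q2(X, y):
    """Leave-one-out q^2 of an ordinary least-squares model with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    # Hat matrix; for OLS the LOO residual equals residual / (1 - leverage),
    # so no explicit refitting per left-out compound is needed.
    H = A @ np.linalg.pinv(A)
    residuals = y - H @ y
    press = np.sum((residuals / (1.0 - np.diag(H))) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def select_subset(X, y, size, r_int_max=0.75):
    """Exhaustively search subsets of a given size (assumed VSMP-like rule).

    Subsets with any descriptor pair whose |r| exceeds r_int_max are skipped
    before fitting; the rest are ranked by LOO q^2, not by the fitted r^2.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    best_cols, best_q2 = None, -np.inf
    for cols in combinations(range(X.shape[1]), size):
        if any(corr[i, j] > r_int_max for i, j in combinations(cols, 2)):
            continue  # cheap rejection: no model is fitted for this subset
        q2 = loo_q2(X[:, list(cols)], y)
        if q2 > best_q2:
            best_cols, best_q2 = cols, q2
    return best_cols, best_q2


# Synthetic demonstration data (NOT the Selwood data set).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(31, 8))
y_demo = 1.5 * X_demo[:, 1] - 2.0 * X_demo[:, 4] + 0.1 * rng.normal(size=31)
print(select_subset(X_demo, y_demo, size=2))
```

Skipping intercorrelated subsets before any model is fitted is what makes the filter cheap, since it needs only the descriptor correlation matrix, which is computed once. Ranking by q² rather than r² favors subsets that predict left-out compounds well rather than those that merely fit the training data.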
Language
English
Identifiers
ISSN: 0095-2338
eISSN: 1549-960X
DOI: 10.1021/ci020377j
Title ID: cdi_proquest_miscellaneous_73322733