Combining Classifiers of Pesticides Toxicity through a Neuro-fuzzy Approach

Emilio Benfenati¹, Paolo Mazzatorta¹, Daniel Neagu², and Giuseppina Gini²

¹ Istituto di Ricerche Farmacologiche "Mario Negri" Milano, Via Eritrea, 62, 20157 Milano, Italy
{Benfenati, Mazzatorta}@marionegri.it
² Dipartimento di Elettronica e Informazione, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy
Neagu@fusberta.elet.polimi.it, Gini@elet.polimi.it
http://airlab.elet.polimi.it/imagetox

Abstract. The increasing amount and complexity of data in toxicity prediction call for new approaches based on hybrid intelligent methods for mining the data. This need is even greater given the growing number of different classifiers applied in toxicity prediction. Consequently, there exists a need to develop tools that integrate the various approaches. The goal of this research is to apply neuro-fuzzy networks to improve the combination of the results of five classifiers applied to the toxicity of pesticides. In addition, fuzzy rules extracted from the trained networks can be used to compare the performance of the classifiers involved. Our results suggest that the neuro-fuzzy approach to combining classifiers has the potential to significantly improve on common classification methods for pesticide toxicity characterization and knowledge discovery.

1 Introduction

Quantitative structure-activity relationships (QSARs) correlate chemical structure with a wide variety of physical, chemical, biological (including biomedical, toxicological, ecotoxicological) and technological (glass transition temperatures of polymers, critical micelle concentrations of surfactants, rubber vulcanization rates) properties. Suitable correlations, once established and validated, can be used to predict properties of compounds that are as yet unmeasured or even unknown.
Classification systems are quite common in QSAR studies of carcinogenicity [9], because in this case carcinogenicity classes are defined by regulatory bodies such as IARC and EPA. For ecotoxicity, most QSAR models are regressions, referring to the dose that produces the toxic effect in 50% of the animals (for instance LC50: lethal concentration for 50% of the test animals). This dose is a continuous value, and regression seems the most appropriate approach. However, classification offers some advantages: i) the regulatory values are given as toxicity classes, and ii) classification can handle noisy data better. For this reason we investigated classification in the past [7], [8], [9] and do so again in this study.

No general rule exists for choosing an approach suited to a specific classification problem. In several cases, a selection of descriptors is the only essential condition for developing a general system. The next step consists in defining the best computational method for developing robust structure-activity models.

Artificial neural networks (ANNs) are an excellent tool that has been used to develop a wide range of real-world applications, especially when traditional solving methods fail [3]. They offer advantages such as the ability to learn from data, classification and generalization capabilities, fast computation once trained owing to parallel processing, and noise tolerance. The major shortcoming of neural networks is their low degree of human comprehensibility. More transparency is offered by fuzzy neural networks (FNN) [14], [16], [18], a paradigm that combines the comprehensibility of fuzzy reasoning and its capability to handle uncertainty with the capability to learn from examples.

The paper is organized as follows.
Section 2 briefly presents data preparation based on chemical descriptors and some of the most common classification techniques, and shows how they behave in toxicology modeling, with an emphasis on pesticides. Section 3 proposes the neuro-fuzzy approach for integrating all the studied classifiers, based on the structure developed as the FNN Implicit Knowledge Module (IKM) of the hybrid intelligent system NIKE (Neural explicit&Implicit Knowledge inference system [17]). Preliminary results indicate that combining several classifiers may lead to improved performance [5], [11], [12]. The extracted fuzzy rules give new insights into the applicability domain of the classifiers involved. Conclusions are summarized in the last section.

2 Materials and Methods

2.1 Data set

This paper investigates a data set of 57 common organophosphorus compounds. The main objective is to propose a good benchmark for the classification studies developed in this area. The toxicity values are the result of a wide bibliographic search, mainly of "The Pesticide Manual", the ECOTOX database system, RTECS and HSDB [1]. An important problem we faced is the variability of the toxicity data [2]. Indeed, different sources may report, for the same compound and the same end-point, LC50 values that differ by about two orders of magnitude. Such variability is due to several factors, such as the different individual reactions of the organisms tested, different laboratory procedures, different experimental conditions, or accidental errors.

The toxicity value was expressed in the form Log10(1/LC50). The values were then scaled to the interval [-1..1]. Four classes were defined: Class 1 [-1..-0.5), Class 2 [-0.5..0), Class 3 [0..0.5), Class 4 [0.5..1] (Table 2).
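The transformation and binning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say how the scaling to [-1..1] was carried out, so a linear min-max map is assumed here, and the LC50 values and all function names are hypothetical.

```python
import math


def to_log_toxicity(lc50):
    """Return Log10(1/LC50) for a strictly positive LC50 value."""
    if lc50 <= 0:
        raise ValueError("LC50 must be positive")
    return math.log10(1.0 / lc50)


def scale_to_interval(values):
    """Linearly map values onto [-1, 1] (min-max scaling is an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def toxicity_class(scaled):
    """Assign Class 1..4 using the intervals given in the text."""
    if not -1.0 <= scaled <= 1.0:
        raise ValueError("scaled value must lie in [-1, 1]")
    if scaled < -0.5:
        return 1  # [-1 .. -0.5)
    if scaled < 0.0:
        return 2  # [-0.5 .. 0)
    if scaled < 0.5:
        return 3  # [0 .. 0.5)
    return 4      # [0.5 .. 1]


# Hypothetical LC50 values, for illustration only (not taken from the data set).
lc50_values = [0.01, 0.5, 3.0, 100.0]
log_tox = [to_log_toxicity(v) for v in lc50_values]
scaled = scale_to_interval(log_tox)
classes = [toxicity_class(s) for s in scaled]
```

The half-open intervals matter at the boundaries: a scaled value of exactly 0 falls in Class 3, and only the last interval is closed on the right.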
2.2 Descriptors

A set of about 150 descriptors was calculated with different software packages: Hyperchem 5.0¹, CODESSA 2.2.1², Pallas 2.1³. The descriptors are split into six categories: Constitutional (34 descriptors), Geometrical (14), Topological (38), Electrostatic (57), Quantum-chemical (6), and Physico-chemical (4). To obtain a good model, it is necessary to select the variables that best describe the molecules. There is a risk that some descriptors do not add information and instead increase the noise, making the analysis of the results more complex. Furthermore, using a relatively low number of variables reduces the risk of overfitting. The descriptor selection (Table 1) was obtained by Principal Components Analysis (PCA), using SCAN⁴.

Table 1. Names of the chemical descriptors involved in the classification task (Cat. = category: C constitutional, G geometrical, T topological, E electrostatic, Q quantum-chemical).

Descriptor                                               Cat.  Code
Moment of inertia A                                      G     D1
Relative number of N atoms                               C     D2
Binding energy (Kcal/mol)                                Q     D3
DPSA-3 Difference in CPSAs (PPSA3-PNSA3) [Zefirov’s PC]  E     D4
Max partial charge (Qmax) [Zefirov’s PC]                 E     D5
ZX Shadow / ZX Rectangle                                 G     D6
Number of atoms                                          C     D7
Moment of inertia C                                      G     D8
PNSA-3 Atomic charge weighted PNSA [Zefirov’s PC]        E     D9
HOMO (eV)                                                E     D10
LUMO (eV)                                                Q     D11
Kier&Hall index (order 3)                                T     D12

2.3 Classification algorithms

Five classification algorithms were used for this work: LDA (Linear Discriminant Analysis), RDA (Regularized Discriminant Analysis), SIMCA (Soft Independent Modeling of Class Analogy), KNN (K Nearest Neighbors classification), and CART (Classification And Regression Tree). LDA and RDA are parametric statistical methods based on Fisher’s discriminant analysis, SIMCA and KNN are non-parametric statistical methods, and CART is a classification tree.

LDA: Fisher’s linear discrimination is an empirical method based on p-dimensional vectors of attributes. The separation between classes is made by a hyperplane, which divides the p-dimensional space of attributes.
RDA: the variations introduced in this model aim to overcome the main problems that affect both linear and quadratic discrimination. The most effective regularization was proposed by Friedman, who suggested a compromise between the two previous techniques using a two-parameter estimation method (λ and γ).

¹ Hypercube Inc., Gainesville, Florida, USA
² SemiChem Inc., Shawnee, Kansas, USA
³ CompuDrug, Budapest, Hungary
⁴ SCAN (Software for Chemometric Analysis) v.1.1, from Minitab: http://www.minitab.com

SIMCA: this model is one of the first used in chemometrics for modeling classes and, unlike the techniques described above, it is non-parametric. The idea is to consider each class separately and to look for a representation based on the principal components. An object is assigned to a class on the basis of its residual distance, rsd^2, from the model that represents the class:

r_{igj}^2 = (x_{igj} - \hat{x}_{igj})^2 , \qquad rsd^2 = \frac{\sum r_{igj}^2}{p - M_j} \qquad (1)

where \hat{x}_{igj} = co-ordinates of the object’s projection onto the inner space of the mathematical model for the class, x_{igj} = the object’s co-ordinates, p = number of variables, and M_j = number of significant principal components for class j.

KNN: this technique classifies each record in a data set based on a combination of the classes of the k record(s) most similar to it in a historical data set (here k = 1).

CART: a tree-shaped structure that represents sets of decisions. These decisions generate rules for the classification of a data set. CART provides a set of rules that can be applied to a new (unclassified) data set to predict which records will have a given outcome. It segments a data set by creating two-way splits.

The classification obtained using these algorithms is shown in Table 2.
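As an illustration of the KNN rule described above, the following minimal Python sketch classifies a record from its k most similar training records, with k = 1 as the default. The Euclidean distance, the tie-breaking rule, the function names and the two-descriptor toy data are all assumptions made for the example; the paper does not state which similarity measure was used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=1):
    """Predict the class of `query` by majority vote among its k nearest records."""
    # Rank training records from most to least similar to the query.
    ranked = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    # On a tie, max() keeps the label that was counted first,
    # i.e. the one belonging to the closer neighbour.
    return max(votes, key=votes.get)


# Hypothetical descriptor vectors and toxicity classes, for illustration only.
train_x = [(0.1, 0.2), (0.9, 0.8), (0.5, 0.5)]
train_y = [1, 3, 2]

nearest_class = knn_predict(train_x, train_y, (0.85, 0.75))
```

With k = 1 the vote reduces to copying the class of the single closest record, so the result is sensitive to the scale of each descriptor; in practice the descriptors would usually be normalized first.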
2.4 Validation

The most common validation methods are: i) Leave-one-out (LOO); ii) Leave-more-out (LMO); iii) Train & Test; iv) Bootstrap. We used LOO, since it is considered the best choice for small data sets [10]. According to LOO, given n objects, n models are computed. For each model, the training set consists of n-1 objects and the evaluation set consists of the single object left out. To estimate predictive ability, we considered the gap between the experimental (fitting) value and the predicted (cross-validation) value for each of the n objects left out of the model, one at a time.

Table 2. True class and class assigned by the algorithms for each compound⁵.

Compound                     True class  CART  LDA  KNN  SIMCA  RDA
Anilofos                     2           2     2    1    2      2
Chlorpyrifos                 1           2     2    1    2      2
Chlorpyrifos-methyl          2           2     2    1    2      2
Isazofos                     1           1     1    2    1      1
Phosalone                    2           2     2    2    2      2
Profenofos                   1           2     2    1    2      2
Prothiofos                   2           2     2    2    2      2
Azamethiphos                 2           2     2    1    4      2
Azinphos methyl              1           1     1    2    1      1
Diazinon                     3           3     1    1    4      1
Phosmet                      2           2     2    1    2      2
Pirimiphos ethyl             1           1     1    1    1      1
Pirimiphos methyl            2           3     1    2    1      1
Pyrazophos                   2           2     1    4    2      1
Quinalphos                   1           1     1    2    1      1
Azinphos-ethyl               1           1     1    1    2      1
Etrimfos                     1           1     1    3    3      1
Fosthiazate                  4           2     2    2    4      2
Methidathion                 1           1     1    1    1      1
Piperophos                   3           3     3    2    2      3
Tebupirimfos                 4           1     1    3    4      1
Triazophos                   1           1     1    2    1      1
Dichlorvos                   2           4     2    2    2      2
Disulfoton                   3           3     3    1    3      3
Ethephon                     4           4     4    4    4      4
Fenamiphos                   1           1     3    2    1      1
Fenthion                     2           2     3    2    2      3
Fonofos                      1           1     3    2    1      3
Glyphosate                   4           4     4    4    4      4
Isofenphos (isophenphos)     3           3     3    1    3      3
Methamidophos                4           4     4    3    4      4
Omethoate                    3           3     3    3    3      3
Oxydemeton-methyl            3           3     3    3    3      3
Parathion ethyl (parathion)  2           2     2    3    1      3
Parathion methyl             3           3     3    3    3      3
Phoxim                       2           2     1    1    1      1
Sulfotep                     1           1     3    2    2      2
Tribufos                     2           2     2    2    2      2
Trichlorfon                  2           2     2    1    2      4
Acephate                     4           4     1    3    4      4
Cadusafos                    2           2     3    3    2      2
Chlorethoxyfos               2           2     2    3    2      2
Demeton-S-methyl             3           3     3    3    3      3
Dimethoate                   3           3     1    1    3      3
Edifenphos                   2           2     3    1    2      2
EPN                          2           2     2    2    2      2
Ethion                       2           2     2    2    2      2
Ethoprophos                  3           3     3    2    2      3
Fenitrothion                 3           2     3    3    3      3
Formothion                   3           3     2    3    3      3
Methacrifos                  2           2     2    2    2      3
Phorate                      1           1     3    2    1      3
Propetamphos                 3           3     3    4    2      3
Sulprofos                    3           3     3    2    3      3
Temephos                     3           3     2    1    3      2
Terbufos                     1           1     3    2    3      3
Thiometon                    3           3     3    3    3      3

⁵ The 40 molecules with a blank background were used to train the neuro-fuzzy classifier.

3 The neuro-fuzzy combination of the classifiers

3.1 Motivations and architecture

Combining multiple classifiers can be seen as a direction for developing highly reliable pattern recognition systems, coming from the hybrid intelligent systems approach. Combining several classifiers may result in improved performance [4], [5]. The need to combine multiple classifiers arises from the demand for higher quality and reliability of the final models. There are different classification algorithms in almost all current pattern recognition application areas, each one having certain degrees of success, but none of them being