School of Computing

Multiobjective genetic algorithms for attribute selection

G. L. Pappa

Master's thesis, Pontificia Universidade Catolica do Parana - Brazil, December 2002.

Abstract

Attribute selection is one of the tasks that can be performed during the preprocessing of the data to be mined. It is an important task because, in most cases, data is collected for purposes other than classification. As a result, databases usually contain many irrelevant attributes, and if these attributes are not removed they can hinder the learning process.

This work proposes a multiobjective Genetic Algorithm (GA) for attribute selection. Its development and implementation were motivated by the great success obtained by GAs in applications where the search space is vast and by the advantage of performing a global search in the space of candidate solutions, unlike other algorithms based on local search. The proposed GA uses concepts of multiobjective optimization, since the attribute selection problem requires, in our case, the optimization of two objectives: the classification error and the number of rules generated by a rule induction algorithm.
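As an illustration of the two-objective setting, the sketch below assumes candidates are compared by Pareto dominance, a common choice in multiobjective optimization; the abstract does not say which multiobjective concept the thesis actually uses. The function names and the sample values are hypothetical.

```python
from typing import List, Tuple

# One candidate's objectives: (classification error, number of rules).
# Both are minimised. The pairing follows the abstract; the values below are made up.
Objectives = Tuple[float, int]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Objectives]) -> List[Objectives]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (error, rules) pairs for five attribute subsets.
candidates = [(0.10, 12), (0.12, 8), (0.10, 15), (0.20, 8), (0.15, 5)]
front = non_dominated(candidates)
print(front)
```

Under this assumption no single "best" subset exists: each point on the front trades a lower error for more rules, or the reverse.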

The evaluation of the individuals is performed according to the wrapper approach, i.e., the evaluation of each individual of the population involves running the classification algorithm to be used later (with the set of selected attributes), in order to make the attribute selection procedure more robust. The classification algorithm used in this work is C4.5.
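A minimal sketch of a wrapper-style evaluation follows, assuming each individual is a bit mask over the attributes. C4.5 is not implemented here: `toy_rule_learner` is a hypothetical stand-in that builds one rule per distinct attribute-value combination, and all names and data are illustrative rather than taken from the thesis.

```python
from collections import Counter, defaultdict


def project(row, mask):
    """Keep only the attribute values whose mask bit is 1."""
    return tuple(v for v, keep in zip(row, mask) if keep)


def toy_rule_learner(train_X, train_y, test_X, test_y):
    """Stand-in for C4.5 (assumption): one rule per distinct value combination.

    Returns (error rate on the test set, number of rules).
    """
    votes = defaultdict(Counter)
    for x, y in zip(train_X, train_y):
        votes[x][y] += 1
    default = Counter(train_y).most_common(1)[0][0]  # fallback for unseen combinations
    rules = {x: counts.most_common(1)[0][0] for x, counts in votes.items()}
    wrong = sum(rules.get(x, default) != y for x, y in zip(test_X, test_y))
    return wrong / len(test_y), len(rules)


def evaluate(mask, train, test, learner=toy_rule_learner):
    """Wrapper evaluation: run the learner on the selected attributes only."""
    train_X = [project(x, mask) for x, _ in train]
    train_y = [y for _, y in train]
    test_X = [project(x, mask) for x, _ in test]
    test_y = [y for _, y in test]
    return learner(train_X, train_y, test_X, test_y)


# Toy data: attribute 0 determines the class, attribute 1 is irrelevant.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "b"), ((1, 1), "b")]
test = [((0, 1), "a"), ((1, 0), "b")]

print(evaluate([1, 0], train, test))  # attribute 0 only
print(evaluate([1, 1], train, test))  # both attributes
```

On this toy data both masks classify the test set correctly, but dropping the irrelevant attribute halves the number of rules, which is the kind of trade-off the two objectives are meant to expose.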

In addition to the multiobjective GA, this work also proposes a multiobjective version of the forward sequential selection method, in order to compare multiobjective versions of two methods often used in the attribute selection task.

Experiments on 18 public-domain databases showed that the proposed multiobjective genetic algorithm and multiobjective forward sequential selection algorithm can solve the attribute selection task better than the single-objective methods.



Bibtex Record

@mastersthesis{1790,
author = {G. L. Pappa},
title = {Multiobjective Genetic Algorithms for Attribute Selection},
month = {December},
year = {2002},
pages = {182-196},
keywords = {determinacy analysis, Craig interpolants},
note = {},
doi = {},
url = {http://www.cs.kent.ac.uk/pubs/2002/1790},
publication_type = {mastersthesis},
submission_id = {24122_1076271959},
school = {Pontificia Universidade Catolica do Parana - Brazil},
}

School of Computing, University of Kent, Canterbury, Kent, CT2 7NF


Last Updated: 21/03/2014