Feature weighting and selection with a Pareto-optimal trade-off between relevancy and redundancy

Pattern Recognition Letters (PRL)
Evolutionary Computation
Theory
Authors
Affiliations

Ayan Das

Institute of Engineering & Management (IEM), Kolkata

Swagatam Das

Indian Statistical Institute (ISI), Kolkata

Published

March 1, 2017

Paper

Abstract

Feature Selection (FS) is an important pre-processing step in machine learning that reduces the number of features/variables used to describe each member of a dataset. The reduction works by eliminating non-discriminating and redundant features and selecting a subset of the existing features with higher discriminating power among the classes in the data. In this paper, we formulate feature selection as a bi-objective optimization problem over real-valued weights, one per feature. A subset of the weighted features is then selected as the best subset for subsequent classification of the data. Two information-theoretic measures, known as ‘relevancy’ and ‘redundancy’, are chosen for designing the objective functions for a very competitive Multi-Objective Optimization (MOO) algorithm called ‘Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D)’. We experimentally determine the best possible constraints on the weights to be optimized. We evaluate the proposed bi-objective feature selection and weighting framework on a set of 15 standard datasets using the popular k-Nearest Neighbor (k-NN) classifier. The experimental results indicate that our method is competitive with some of the state-of-the-art FS methods of current interest. We further demonstrate the effectiveness of our framework by changing the optimization scheme to the Non-dominated Sorting Genetic Algorithm (NSGA)-II and the classifier to Support Vector Machines (SVMs).
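To make the bi-objective idea concrete, here is a minimal Python sketch of a relevancy/redundancy trade-off over feature weights. It is an illustration under stated assumptions, not the paper's implementation: the exact objective formulas are not given in the abstract, so the weighted sums below are hypothetical; the weights are assumed to lie in [0, 1]; and plain random sampling of candidate weight vectors stands in for MOEA/D. All function names are invented for this example.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    y_vals, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    # Joint distribution estimated from co-occurrence counts.
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def objectives(weights, relevancy, redundancy):
    """Two objectives, both to be minimized (assumed forms, not the paper's).

    f1: negated weighted relevancy (feature/class mutual information).
    f2: weighted pairwise redundancy (feature/feature mutual information).
    """
    w = np.asarray(weights, dtype=float)
    f1 = -float(w @ relevancy)
    f2 = float(w @ redundancy @ w)
    return f1, f2

def pareto_front(points):
    """Indices of non-dominated points when every objective is minimized."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        # q dominates p if q <= p in all objectives and q < p in at least one.
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Toy data: features 0 and 1 duplicate the label, feature 2 is noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
noise = rng.integers(0, 2, size=200)
X = np.column_stack([y, y, noise])
n_features = X.shape[1]

relevancy = np.array([mutual_information(X[:, j], y) for j in range(n_features)])
redundancy = np.array([
    [0.0 if i == j else mutual_information(X[:, i], X[:, j])
     for j in range(n_features)]
    for i in range(n_features)
])

# Random candidate weight vectors in [0, 1] (a stand-in for MOEA/D).
candidates = rng.uniform(0.0, 1.0, size=(50, n_features))
scores = [objectives(w, relevancy, redundancy) for w in candidates]
front = pareto_front(scores)
print(f"{len(front)} non-dominated weight vectors out of {len(candidates)}")
```

Each weight vector on the resulting front represents a different compromise between high relevancy and low redundancy. A feature subset could then be read off by, for example, thresholding the weights, although the threshold and the selection rule are again assumptions of this sketch.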

Citation

BibTeX citation:
@article{das2017,
  author = {Das, Ayan and Das, Swagatam},
  title = {Feature Weighting and Selection with a {Pareto-optimal}
    Trade-Off Between Relevancy and Redundancy},
  journal = {Pattern Recognition Letters (PRL)},
  volume = {88},
  pages = {12--19},
  date = {2017-03-01},
  url = {http://www.sciencedirect.com/science/article/pii/S0167865517300041},
  langid = {en}
}
For attribution, please cite this work as:
Das, Ayan, and Swagatam Das. 2017. “Feature Weighting and Selection with a Pareto-Optimal Trade-Off Between Relevancy and Redundancy.” Pattern Recognition Letters (PRL) 88 (March): 12–19. http://www.sciencedirect.com/science/article/pii/S0167865517300041.