Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective

Xiao, Meng; Wang, Dongjie; Wu, Min; Liu, Kunpeng; Xiong, Hui; Zhou, Yuanchun; Fu, Yanjie

Published in

Association for Computing Machinery (ACM), ACM Transactions on Knowledge Discovery from Data, 18(4), p. 1-22, 2024

DOI: 10.1145/3638059

Journal article published in 2024 by Meng Xiao, Dongjie Wang, Min Wu, Kunpeng Liu, Hui Xiong, Yuanchun Zhou, Yanjie Fu

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving forbidden

Abstract

Feature transformation aims to reconstruct an effective representation space by mathematically refining the existing features. It serves as a pivotal approach to combat the curse of dimensionality, enhance model generalization, mitigate data sparsity, and extend the applicability of classical models. Existing research predominantly focuses on domain knowledge-based feature engineering or on learning latent representations. However, these methods, while insightful, lack full automation and fail to yield a traceable and optimal representation space. A key question arises: can we address these limitations concurrently when reconstructing a feature space for a machine learning task? Our initial work took a first step towards this challenge by introducing a novel self-optimizing framework. This framework uses three cascading reinforced agents to automatically select candidate features and operations for generating improved feature transformation combinations. Despite this progress, there was room to enhance its effectiveness and generalization capability. In this extended journal version, we advance our initial work from two distinct yet interconnected perspectives: 1) We refine the original framework by integrating a graph-based state representation method to capture feature interactions more effectively and by developing different Q-learning strategies to further alleviate Q-value overestimation. 2) We use a new optimization technique (actor-critic) to train the entire self-optimizing framework in order to accelerate model convergence and improve feature transformation performance. Finally, to validate the improved effectiveness and generalization capability of our framework, we perform extensive experiments and conduct comprehensive analyses. These provide empirical evidence of the improvements made in this journal version over the initial work, solidifying our framework’s standing as a substantial contribution to the field of automated feature transformation. To improve reproducibility, we have released the associated code and data on GitHub: https://github.com/coco11563/TKDD2023_code.
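
The cascading selection described in the abstract (a group of features, then an operation, then a second group of features) can be illustrated with a small sketch. This is not the authors' implementation: the operation sets, the function names `transform_step` and `random_policy`, and the toy data are all assumptions made for illustration, and a uniform random choice stands in for the three learned agents. The generated feature names record how each new feature was built, which is one simple way to keep a transformation traceable.

```python
import math
import random

# Hypothetical operation sets; the abstract does not list the real ones.
UNARY_OPS = {
    "sqrt": lambda x: math.sqrt(abs(x)),
    "square": lambda x: x * x,
}
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}


def transform_step(features, head_group, op_name, tail_group=None):
    """Apply one operation group-wise and return the new, named features.

    features:   dict mapping a feature name to its column (list of floats)
    head_group: names chosen by the (hypothetical) first agent
    op_name:    operation chosen by the (hypothetical) second agent
    tail_group: names chosen by the (hypothetical) third agent,
                used only when the operation is binary
    """
    new_features = {}
    if op_name in UNARY_OPS:
        op = UNARY_OPS[op_name]
        for h in head_group:
            new_features[f"{op_name}({h})"] = [op(v) for v in features[h]]
    else:
        op = BINARY_OPS[op_name]
        # Cross every head feature with every tail feature.
        for h in head_group:
            for t in tail_group:
                column = [op(a, b) for a, b in zip(features[h], features[t])]
                new_features[f"({h}{op_name}{t})"] = column
    return new_features


def random_policy(groups, rng):
    """Stand-in for the three learned agents: picks uniformly at random."""
    head = rng.choice(groups)
    op_name = rng.choice(sorted(UNARY_OPS) + sorted(BINARY_OPS))
    tail = rng.choice(groups) if op_name in BINARY_OPS else None
    return head, op_name, tail


# Toy feature space and an assumed grouping of its columns.
features = {"f1": [1.0, 2.0], "f2": [3.0, 4.0], "f3": [5.0, 6.0]}
groups = [["f1", "f2"], ["f3"]]

# A fixed, hand-picked decision so the result is easy to check.
crossed = transform_step(features, ["f1", "f2"], "*", ["f3"])
print(crossed)

# One randomly chosen step, merged back into the feature space.
head, op_name, tail = random_policy(groups, random.Random(0))
features.update(transform_step(features, head, op_name, tail))
print(sorted(features))
```

In a learned version of this loop, the random policy would be replaced by agents trained on a reward such as downstream task performance; how the paper defines states, rewards, and groups is not specified in the abstract.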
