Remote sensing streams a continuous data feed from the satellite to a ground station for analysis. The analytics often must run in real time, for example in emergency control, surveillance of military operations, or other rapidly changing scenarios. Traditional data mining requires all the data to be available before a model is induced by supervised learning for automatic image recognition or classification. Any update to the data forces the model to be rebuilt by loading all the previous and new data. The training time therefore grows without bound, which makes the approach unsuitable for real-time applications in remote sensing. As a contribution to solving this problem, this paper formulates and reports a new data analytics approach for remote sensing based on data stream mining. Fresh data collected from afar is used to approximate an image recognition model without reloading the history, which helps eliminate the latency of building the model again and again. In the past, data stream mining has had the drawback of being unable to approximate a classification model with a sufficiently high level of accuracy. This is due to the one-pass incremental learning mechanism inherent in the design of data stream mining algorithms. To solve this problem, a novel streamlined sensor data processing method is proposed, called the evolutionary expand-and-contract instance-based learning algorithm (EEAC-IBL). The multivariate data stream is first expanded into many subspaces; the subspaces that correspond to the characteristics of the features are then selected and condensed into a significant feature subset. The selection operates stochastically rather than deterministically, using evolutionary optimization to approximate the best subgroup. Data stream mining then follows, so that model learning for image recognition is done on the fly.
This stochastic approximation method is fast and accurate, offering an alternative to traditional machine learning methods for image recognition applications in remote sensing. Our experimental results show computational advantages over other classical approaches, with a mean accuracy improvement of 16.62%.
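The expand-and-contract idea described above can be illustrated with a minimal sketch. It is not the authors' EEAC-IBL implementation: the function names (`make_stream`, `prequential_accuracy`, `evolve_mask`), the synthetic data, the sliding-window 1-nearest-neighbour learner, and the single-bit mutation scheme are all assumptions chosen for brevity. Candidate feature subsets are represented as Boolean masks, each mask is scored by test-then-train (one-pass) accuracy on a buffered chunk of the stream, and a simple evolutionary loop keeps the best mask found.

```python
import random
from collections import deque


def make_stream(n, seed=0):
    """Hypothetical synthetic stream: feature 0 is informative, features 1-3 are noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        x = [y + rng.gauss(0, 0.1)] + [rng.gauss(0, 5) for _ in range(3)]
        data.append((x, y))
    return data


def prequential_accuracy(stream, mask, window=50):
    """One-pass, test-then-train accuracy of a 1-NN learner on the masked features."""
    memory = deque(maxlen=window)  # bounded instance store, so history is never reloaded
    correct = total = 0
    for x, y in stream:
        z = [v for v, keep in zip(x, mask) if keep]
        if memory:
            # Predict with the nearest stored instance (squared Euclidean distance).
            _, pred = min(memory, key=lambda m: sum((a - b) ** 2 for a, b in zip(m[0], z)))
            correct += pred == y
            total += 1
        memory.append((z, y))  # learn from the instance only after testing on it
    return correct / total if total else 0.0


def evolve_mask(stream, n_features, generations=30, pop_size=8, seed=0):
    """Stochastic search for a feature subset; returns (best_mask, best_accuracy)."""
    rng = random.Random(seed)

    def random_mask():
        m = [rng.random() < 0.5 for _ in range(n_features)]
        if not any(m):
            m[rng.randrange(n_features)] = True  # never allow an empty subset
        return m

    # Expand: the full feature set plus random candidate subspaces.
    population = [[True] * n_features] + [random_mask() for _ in range(pop_size - 1)]
    scored = [(prequential_accuracy(stream, m), m) for m in population]
    best_fit, best = max(scored, key=lambda s: s[0])

    # Contract: flip one feature at a time and keep the child if it is no worse.
    for _ in range(generations):
        child = list(best)
        i = rng.randrange(n_features)
        child[i] = not child[i]
        if not any(child):
            child[i] = True
        fit = prequential_accuracy(stream, child)
        if fit >= best_fit:
            best, best_fit = child, fit
    return best, best_fit


if __name__ == "__main__":
    chunk = make_stream(200, seed=1)
    mask, acc = evolve_mask(chunk, n_features=4)
    print("selected features:", [i for i, keep in enumerate(mask) if keep])
    print("prequential accuracy:", round(acc, 3))
```

Because the all-features mask is included in the initial population, the selected subset is never scored below the unselected baseline on the same chunk. A fuller treatment would re-run the search as new chunks arrive instead of scoring a single buffered chunk.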