Classifying herbal medicine origins by temporal and spectral data mining of electronic nose

Abstract

The origin of a herbal medicine affects its therapeutic effect, and origins could potentially be distinguished with an electronic nose (e-nose) system. Because the differences in odor fingerprints between herbal medicines from different origins can be tiny, discriminating origins can be much harder than discriminating categories. Better feature extraction methods are therefore important for accurate classification, but systematic studies comparing feature extraction methods are lacking, and there is no standardized, widely agreed-upon way to extract features from e-nose signals. To investigate the effectiveness of multiple feature engineering approaches, we classified the origins of three categories of herbal medicines using three kinds of feature extraction: manual feature extraction, mathematical transformation, and deep learning. Over 50 repeated experiments with bootstrapping, we compared the effectiveness of the extraction methods using a two-layer neural network with or without a dimensionality reduction method (principal component analysis or linear discriminant analysis) as the three base classifiers. Compared with conventional aggregated features, the Fast Fourier Transform (FFT) method and our novel approach (longitudinal-information-in-a-line) showed a significant accuracy improvement (p < 0.05) on all three base classifiers and all three herbal medicine categories, with highest median classification accuracies of 0.675 and 0.7, respectively, over 30 experiments. Two of the deep learning algorithms we applied also showed partially significant improvement: a one-dimensional convolutional neural network (1D-CNN) and a novel graph-pooling-based framework, multivariate time pooling (MTPool), with highest median accuracies of 0.75 and 0.65, respectively.
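
To make the contrast between aggregated and spectral features concrete, the following is a minimal sketch and not the pipeline used in this study. It assumes each e-nose sample is an array of shape (number of sensors, number of time points). The function names, the choice of maximum, mean, and standard deviation as the aggregated features, and the number of retained FFT coefficients are all illustrative assumptions.

```python
import numpy as np


def aggregated_features(signal):
    """Per-sensor summary statistics (assumed here: max, mean, std).

    signal: array of shape (n_sensors, n_timepoints).
    Returns a 1-D vector of length 3 * n_sensors.
    """
    return np.concatenate(
        [signal.max(axis=1), signal.mean(axis=1), signal.std(axis=1)]
    )


def fft_features(signal, n_coeffs=8):
    """Magnitudes of the first n_coeffs real-FFT coefficients per sensor.

    n_coeffs is a hypothetical setting chosen for illustration.
    Returns a 1-D vector of length n_sensors * n_coeffs.
    """
    spectrum = np.abs(np.fft.rfft(signal, axis=1))
    return spectrum[:, :n_coeffs].ravel()


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap training set with replacement.

    Samples that were never drawn form the out-of-bag test set.
    """
    train = rng.integers(0, n_samples, size=n_samples)
    test = np.setdiff1d(np.arange(n_samples), train)
    return train, test


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for one e-nose recording: 4 sensors, 64 time points.
    sample = rng.normal(size=(4, 64))
    print(aggregated_features(sample).shape)
    print(fft_features(sample).shape)
```

Either feature vector could then be passed to any of the base classifiers, and repeating `bootstrap_split` with a fresh random draw gives one resampled train/test partition per experiment.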