11 October 2023 A feature integration method for Chinese chunking
Chen Lyu, Xuejing Xu, Jiangping Huang
Proceedings Volume 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023); 1280045 (2023) https://doi.org/10.1117/12.3004006
Event: 6th International Conference on Computer Information Science and Application Technology (CISAT 2023), 2023, Hangzhou, China
Abstract
Chunking is a crucial task in natural language processing: it divides a text into syntactically correlated, non-overlapping chunks. Both discrete models and neural models can be applied to Chinese chunking. Their feature representations differ, yet both can achieve excellent performance. In this paper, we build both models and compare them in detail. We then present a feature integration method that combines the advantages of the two models, and experiments show that it is effective. Because the neural model uses pre-trained word embeddings, it can be regarded as a semi-supervised learning method. To make the comparison between the discrete and neural models fairer, we incorporate word clusters into both. Experimental results show that word cluster information does not significantly improve chunking performance, and that the feature integration method still improves the performance of both the discrete model and the neural model.
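As an illustration of what "non-overlapping chunks" means in practice, the following minimal sketch assumes chunking is cast as word-level sequence labeling with BIO tags, which is a common formulation but is not stated in the abstract. The function name `bio_to_chunks`, the tag set, and the example sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def bio_to_chunks(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert BIO tags into (label, start, end) spans, with end exclusive.

    Assumption: tags look like "B-NP", "I-NP", or "O". A stray "I-X" that
    does not continue a chunk of the same label is treated as a chunk start.
    """
    chunks: List[Tuple[str, int, int]] = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # The current chunk continues only on I-X with the same label X.
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                chunks.append((label, start, i))  # close the open chunk
            if tag == "O":
                start, label = None, None
            else:
                start, label = i, tag[2:]  # open a new chunk
    if label is not None:
        chunks.append((label, start, len(tags)))
    return chunks


# Hypothetical segmented sentence: "I like natural language processing".
words = ["我", "喜欢", "自然", "语言", "处理"]
tags = ["B-NP", "B-VP", "B-NP", "I-NP", "I-NP"]
print(bio_to_chunks(tags))  # [('NP', 0, 1), ('VP', 1, 2), ('NP', 2, 5)]
```

Each word index falls in at most one span, so the decoded chunks cannot overlap by construction.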
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Chen Lyu, Xuejing Xu, and Jiangping Huang "A feature integration method for Chinese chunking", Proc. SPIE 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023), 1280045 (11 October 2023); https://doi.org/10.1117/12.3004006
KEYWORDS
Machine learning

Neural networks

Information fusion

Feature extraction

Semantics
