Paper
TextRank keyword extraction method weighted by multivariate quantitative indexes
21 December 2021
Xin Luan, WenYa Gao, Ming Chen, DaLei Song
Proceedings Volume 12156, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2021); 121560G (2021) https://doi.org/10.1117/12.2626538
Event: International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2021), 2021, Sanya, China
Abstract
News text has distinctive characteristics that affect keyword extraction. Extracting keywords from news text requires attention not only to differences in the quantitative indexes of words but also to the influence of phrases. To improve keyword extraction for news texts, this paper constructs a keyword graph based on TextRank and improves the probability transition matrix by combining four quantitative indicators of each node (frequency, location, span, and part of speech), so that words receive different weights. Because word segmentation affects phrase extraction, phrases are reconstructed according to the law of recombination, and the concept of combinatorial entropy is defined to filter the reconstructed phrases. Each reconstructed phrase is then assigned a linearly weighted value based on its statistical quantitative indexes, and finally the top-N words or phrases are selected as keywords according to their weights. Experimental results show that the proposed algorithm not only outperforms the traditional TextRank and TF-IDF algorithms but also has clear advantages over the improved PositionRank and MyWPMWRank algorithms, raising the F value by up to 9.75% and effectively improving keyword extraction for news text.
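The abstract does not give the formulas, so the following is only a minimal sketch of the general idea of a TextRank whose transition probabilities are biased by per-word weights. Everything specific here is an assumption made for illustration: the function names `node_weights` and `weighted_textrank`, the equal coefficients, the part-of-speech scores, the definitions of location and span, and the window size are hypothetical and are not taken from the paper. Phrase reconstruction and combinatorial entropy are not covered.

```python
from collections import defaultdict


def node_weights(tokens, pos_tags, coeffs=(0.25, 0.25, 0.25, 0.25)):
    """Blend frequency, location, span and part of speech into one weight.

    All four definitions and the coefficients are illustrative assumptions.
    """
    n = len(tokens)
    first, last, count = {}, {}, defaultdict(int)
    for i, w in enumerate(tokens):
        first.setdefault(w, i)   # index of first occurrence
        last[w] = i              # index of last occurrence
        count[w] += 1
    pos_score = {"n": 1.0, "v": 0.6}  # assumed scores; other tags get 0.3
    a, b, c, d = coeffs
    weights = {}
    for w in count:
        freq = count[w] / n                    # relative frequency
        location = 1.0 - first[w] / n          # earlier words score higher
        span = (last[w] - first[w] + 1) / n    # stretch of text covered
        pos = pos_score.get(pos_tags.get(w), 0.3)
        weights[w] = a * freq + b * location + c * span + d * pos
    return weights


def weighted_textrank(tokens, weight, window=2, damping=0.85, iters=50):
    """TextRank in which a step v -> w is proportional to weight[w]."""
    # Undirected co-occurrence graph over a sliding window.
    neighbors = defaultdict(set)
    for i, w in enumerate(tokens):
        for v in tokens[i + 1:i + window]:
            if v != w:
                neighbors[w].add(v)
                neighbors[v].add(w)

    vocab = sorted(neighbors)
    n = len(vocab)
    if n == 0:
        return {}
    score = {w: 1.0 / n for w in vocab}
    # Normalizer for each row of the transition matrix.
    out_total = {v: sum(weight[u] for u in neighbors[v]) for v in vocab}

    for _ in range(iters):
        new_score = {}
        for w in vocab:
            rank = sum(score[v] * weight[w] / out_total[v]
                       for v in neighbors[w])
            new_score[w] = (1 - damping) / n + damping * rank
        score = new_score
    return score


tokens = ["storm", "hits", "coast", "storm", "warning", "coast"]
pos_tags = {"storm": "n", "hits": "v", "coast": "n", "warning": "n"}
scores = weighted_textrank(tokens, node_weights(tokens, pos_tags))
top_n = sorted(scores, key=scores.get, reverse=True)[:2]
print(top_n)
```

In this toy input, "storm" and "coast" have the same neighbors in the graph, so any difference between their scores comes only from the assumed node weights. Normalizing each row by the total weight of a node's neighbors keeps the transition matrix stochastic, so the scores continue to sum to 1.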
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xin Luan, WenYa Gao, Ming Chen, and DaLei Song "TextRank keyword extraction method weighted by multivariate quantitative indexes", Proc. SPIE 12156, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2021), 121560G (21 December 2021); https://doi.org/10.1117/12.2626538
KEYWORDS
Evolutionary algorithms

Feature extraction

Digital filtering

Quantization

Statistical modeling

Chaos

Information science
