|
1. INTRODUCTION
With the continuous development of the power Internet of Things, power big data has ushered in new opportunities [1]. Power grid dispatching is a management method that keeps all kinds of power production work orderly, ensures the safe and stable operation of the power grid internally, and provides reliable power supply externally. The existing power grid dispatching system focuses on power system data acquisition, system monitoring, security verification, etc., but has difficulty processing dispatching information expressed in natural language [2]. The dispatching system needs to convert unstructured power grid dispatching data into data that a computer can recognize, and to visualize the data so that staff can quickly and comprehensively trace fault causes and make auxiliary decisions when handling power grid dispatching faults, thereby improving the stability of the power system. At present, power grid dispatching personnel must be familiar with field equipment and power regulations, and must repeatedly consult and memorize a large amount of unstructured and semi-structured text. Therefore, targeted and efficient natural language understanding technology should be studied to realize semantic analysis of dispatching text [3, 4]. A knowledge graph is a knowledge representation method that represents entities and their interrelationships as a visual network, using triplets as the representation form [5, 6]. Power equipment, fault handling, business processes and grid structure in power grid dispatching can all be represented by a graph structure, and power grid dispatching needs to integrate and analyze a large amount of data.
By combining natural language parsing of power text with knowledge graph technology, unstructured data in power knowledge can be integrated into the knowledge graph, which can reduce mistaken operations in intelligent decision support, optimize disposal strategies, and improve work efficiency [7]. At present, knowledge graph technology is still at the initial exploration stage in the power sector, and research on knowledge graphs in the field of power grid dispatching remains scarce [8]. Literature [9] proposed the “neighborhood knowledge” model of power grid dispatching and a method for online discovery of fine operation rules, but the content of its knowledge graph is limited to the cross-section and its quantitative relationships. Literature [10] proposed the construction method of “one power grid diagram” to achieve comprehensive integration of power grid data, but did not consider the correlation with business scenario knowledge. Literature [11, 12] integrated existing multi-source power grid data to build a power equipment knowledge graph and realize intelligent search and visualization, mainly focusing on integrated equipment management when no power grid fault is present. Literature [13] integrated a graph computing platform and proposed an integration method for multi-source power grid information systems, which greatly improved the analysis of complex systems on the power grid side, but ignored the improvement of regulators' business knowledge and scenario-based man-machine interaction. In terms of assisted decision-making for power grid dispatching, literature [14] constructed a knowledge graph for the basic platform of the dispatch system to assist operation and maintenance personnel in analyzing business faults of the dispatch automation system. It mainly addresses faults of the automation operating system and does not involve power system faults.
In general, current research mainly focuses on conceptual frameworks and has few specific implementations; most work is limited to partial graph construction and single-function implementation, and comprehensive scenarios are insufficiently considered; convenient human-computer interaction is missing, and the algorithms ignore the leading role of regulators. Although knowledge graphs have been rapidly developed and applied in the field of electricity, research on the construction of knowledge graphs for power grid dispatch and on the intelligent assistance they enable has not yet been conducted in depth. Therefore, this paper takes power grid dispatch data as the research object and designs and constructs a domain knowledge graph framework. Firstly, the BERT-BiLSTM-CRF model is proposed for entity recognition of power grid dispatching data. Then, a top-down ontology construction method is proposed to assist the construction of the power dispatching knowledge graph, and the power grid dispatching data is visualized using the Neo4j graph database. Finally, intelligent auxiliary decision-making is realized on the basis of the power dispatching knowledge graph.
2. ANALYSIS OF CHARACTERISTICS OF POWER GRID DISPATCHING DATA
The power grid dispatching service needs to process a large amount of text information, such as power outage plans, operation tickets, work tickets, and fault alarm messages. The syntactic characteristics of these texts are similar, so this paper takes the more complex text of power outage and power transmission as an example to study computer natural language understanding algorithms. The structure of power grid dispatching data is shown in Figure 1; it generally includes the name of the applicant, application number, work location, work content, work time, job application unit, etc.
Among them, the work content and opinions are generally written in natural language, containing a large number of professional electricity vocabulary, which is difficult for computers to accurately identify. The characteristics of power grid dispatching data are as follows:
3. KNOWLEDGE ENTITY RECOGNITION MODEL OF POWER GRID DISPATCHING BASED ON PBERT-BILSTM-CRF
Entity recognition is an information extraction technology that determines the boundaries of items with specific meanings, such as person names and place names. Power regulation data contains a large number of unique location words and abbreviations and is therefore specialized domain text, so its characteristics must be fully considered when recognizing entities. To address the difficulty of accurately identifying power regulation entities, this paper proposes the PBERT-BiLSTM-CRF model as a named entity recognition model for the field of power grid fault handling. The BERT-BiLSTM-CRF power regulation entity recognition model mainly consists of three parts: the BERT representation layer, the BiLSTM feature extraction layer and the CRF output layer. The entity recognition process is shown in Figure 2.
3.1 BERT layer
The BERT layer is a knowledge representation model composed of multiple Transformer encoders. It converts the input text sequence into a sequence of vectors, each of which incorporates contextual information. The BERT layer captures word-level and sentence-level representations through masked language model training and next sentence prediction training, respectively. Compared with traditional models that can only obtain semantic information in one direction, the BERT layer obtains semantic information from both directions through its training objective. The residual modules in the BERT layer improve the optimization ability of the model and effectively alleviate the vanishing gradient problem. The BERT layer structure is shown in Figure 3.
Firstly, power grid dispatching data texts after segmentation are input into the BERT layer in the form of segmentation units o1, o2, …, ou, and semantic features of segmentation units, contextual position features, and position features in sentence fragments are extracted. Finally, the Transformer encoder outputs the texts in the form of vectors h1, h2, …, hu represented by different class features.
As the number of network layers increases, the training error grows and the accuracy of the neural network gradually decreases. Assume that the output of a designed residual block is H(x) = f(x) + x, where f(x) is the residual mapping learned by the stacked layers. Adding residual learning makes the network easier to optimize and makes the output more sensitive.
3.2 BiLSTM feature extraction layer
When entity recognition is performed on power grid dispatch texts, there is a strong semantic correlation between contextual texts. Therefore, LSTM models, which can process sequences of arbitrary length through a built-in gating mechanism, are introduced. An LSTM selectively forgets unimportant sequence information while retaining the current input information. In order to better obtain contextual semantic information, a forward LSTM is added to the backward LSTM in this paper, so that both the forward hidden state and the backward hidden state are obtained. The structure of the BiLSTM layer is shown in Figure 4. At time t, the output sequence of the forward LSTM hidden layer is given by the formulas, in which sigmoid and tanh are the activation functions; a is the gate control unit of the LSTM; w is the hidden layer weight matrix; μ is the weight matrix of the input vector x; b is the offset term; Ct is the text information at time t of the LSTM unit.
3.3 CRF output layer
Although the BiLSTM considers the context information of power text, it cannot take the dependency between power entity labels into account. Therefore, a CRF layer is added on top of the BiLSTM layer so that the model can consider the correlation between class labels and obtain the globally optimal annotation through global scoring with the transfer matrix. Firstly, the transfer matrix A is set so that the transfer score between adjacent labels in a statement of power safety hazards is represented by the matrix element Ai,j. The total score of a label sequence is determined by the transfers between its labels together with the score of each label.
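The transfer-matrix scoring described above can be made concrete with a small sketch. The code below is an illustrative pure-Python version, not the authors' implementation: it assumes the start and end transfers are kept in two separate vectors (`start`, `end`) instead of two extra rows and columns of A, and all function names and toy scores are hypothetical. It scores one label path, computes the log of the normalizing sum with the forward recursion, and finds the highest-scoring path by dynamic programming (Viterbi).

```python
import math


def path_score(emissions, transitions, start, end, labels):
    """Total score of one label path: per-position label scores plus transfer scores."""
    score = start[labels[0]] + emissions[0][labels[0]]
    for i in range(1, len(labels)):
        score += transitions[labels[i - 1]][labels[i]] + emissions[i][labels[i]]
    return score + end[labels[-1]]


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emissions, transitions, start, end):
    """Log of the summed exp-scores of all label paths (forward recursion)."""
    k = len(start)
    alpha = [start[j] + emissions[0][j] for j in range(k)]
    for row in emissions[1:]:
        alpha = [
            log_sum_exp([alpha[p] + transitions[p][j] for p in range(k)]) + row[j]
            for j in range(k)
        ]
    return log_sum_exp([alpha[j] + end[j] for j in range(k)])


def log_probability(emissions, transitions, start, end, labels):
    """Log-probability of a label path given the input scores."""
    return path_score(emissions, transitions, start, end, labels) - log_partition(
        emissions, transitions, start, end
    )


def viterbi_decode(emissions, transitions, start, end):
    """Return the highest-scoring label path and its score."""
    k = len(start)
    best = [start[j] + emissions[0][j] for j in range(k)]
    backpointers = []
    for row in emissions[1:]:
        new_best, pointers = [], []
        for j in range(k):
            # Best previous label for reaching label j at this position.
            prev = max(range(k), key=lambda p: best[p] + transitions[p][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + row[j])
        best = new_best
        backpointers.append(pointers)
    last = max(range(k), key=lambda j: best[j] + end[j])
    score = best[last] + end[last]
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score
```

With labels indexed as 0 = O, 1 = B, 2 = I, a strongly negative `transitions[0][2]` discourages an I label directly after O, which is the kind of label dependency the BiLSTM scores alone cannot express.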
Given the input sentence X = (x1, x2, …, xn) and the output label sequence Y = (y1, y2, …, yn), the total score of the label sequence is
$$s(X, Y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$$
In the formula, P ∈ R^{n×k} is the score matrix output by the BiLSTM layer, and Pi,j is the score of the j-th label for the i-th character of the text; yi is the label of the i-th character, with y0 and yn+1 denoting the start and end labels; n is the sentence length and k is the number of label types; Ai,j is the transfer score from label i to label j, and A ∈ R^{(k+2)×(k+2)}. Then, the score of the correct label sequence is divided by the sum over all possible label sequences, which normalizes the sequence path and yields the probability of label sequence Y given the input sequence X:
$$p(Y \mid X) = \frac{e^{s(X, Y)}}{\sum_{\tilde{Y} \in Y_X} e^{s(X, \tilde{Y})}}$$
where Y_X denotes all possible label sequences for X. During training, logarithmic maximum likelihood estimation is used to obtain the loss function from the logarithmic probability of the correct label sequence:
$$\log p(Y \mid X) = s(X, Y) - \log \sum_{\tilde{Y} \in Y_X} e^{s(X, \tilde{Y})}$$
Finally, the parameters are trained by the stochastic gradient descent learning algorithm. After the parameters are obtained, the Viterbi dynamic programming algorithm is used to find the output sequence with the maximum score, which is taken as the final annotation result for the entities with safety hazards in power production:
$$Y^{*} = \arg\max_{\tilde{Y} \in Y_X} s(X, \tilde{Y})$$
4. RESEARCH ON THE CONSTRUCTION OF POWER REGULATION KNOWLEDGE MAP AND ASSISTED DECISION MAKING
4.1 Ontology assisted construction
Because the entity information in power grid dispatch text is relatively fixed, the core elements can be further subdivided into various types of unstructured information. Therefore, this paper adopts the top-down ontology construction method to assist in constructing the knowledge graph of power grid dispatching. The process of ontology construction is as follows:
In the construction of the knowledge graph for power grid dispatching, the power grid dispatching ontology system provides the conceptual relationships on which the knowledge graph is built. Constraining entities, relationships and attributes through the ontology improves the effectiveness of knowledge graph construction.
4.2 Construction of knowledge graph
Knowledge storage is the storage of the data generated from power dispatching texts. Because these texts must be preserved for a long time and are huge in quantity, a high-performance database is required for storage management. Therefore, the Neo4j graph database is used to store safety hazards in power production, operated through the Cypher query language. Neo4j has all the characteristics of a mature database, namely atomicity, consistency, isolation and durability. The graph structure allows the data to be stored more efficiently, and query and display functions are provided through the Neo4j Web visual interface. The system takes user dialogue interaction as the main operation mode and integrates the high computational efficiency of MATLAB, the high search efficiency and excellent visualization of the Neo4j knowledge graph, and the friendly interface of Python. Python equipped with AIML technology is used to invoke MATLAB calculation and Neo4j display and search. The specific development process is shown in Figure 5.
4.3 Scheduling assistance question answering based on knowledge graph
In order to meet the demand for auxiliary Q&A in power grid dispatching, this paper combines the characteristics of power system operation and designs a dispatching auxiliary Q&A process based on the power grid dispatching knowledge graph and AIML. An example is shown in Figure 6.
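To give a feel for such a Q&A process, the following is a minimal sketch of turning a dispatcher's question into a parameterized Cypher query. It is an illustration under stated assumptions and not the system described in this paper: plain regular expressions stand in for AIML pattern matching and redirection, the ambiguity dictionary entry, the `Station` node label and the `name` property are hypothetical, and the path-length bound of 20 hops is arbitrary. Only the Cypher `shortestPath` function is an existing Neo4j feature.

```python
import re

# Hypothetical ambiguity dictionary: non-standard term -> standardized term.
AMBIGUITY_DICT = {"220 substation": "220 kV substation"}

# Two phrasings of the same question; both are normalized to one intent,
# which plays the role that redirection rules play in an AIML rule base.
SHORTEST_PATH_PATTERNS = [
    re.compile(
        r"^which is the shortest power supply path from (?P<src>.+) to (?P<dst>.+?)\??$",
        re.IGNORECASE,
    ),
    re.compile(
        r"^which is the shortest path of power supply distance from (?P<src>.+) to (?P<dst>.+?)\??$",
        re.IGNORECASE,
    ),
]


def standardize(text):
    """Collapse whitespace and replace non-standard terms with standard ones."""
    text = " ".join(text.split())
    for raw, standard in AMBIGUITY_DICT.items():
        text = text.replace(raw, standard)
    return text


def parse_question(question):
    """Return the normalized intent and its slots, or None if nothing matches."""
    text = standardize(question)
    for pattern in SHORTEST_PATH_PATTERNS:
        match = pattern.match(text)
        if match:
            return {
                "intent": "shortest_path",
                "src": match.group("src").strip(),
                "dst": match.group("dst").strip(),
            }
    return None


def build_cypher(parsed):
    """Build a parameterized Cypher query for a parsed shortest-path question."""
    if parsed is None or parsed["intent"] != "shortest_path":
        raise ValueError("unsupported question")
    query = (
        "MATCH (a:Station {name: $src}), (b:Station {name: $dst}), "
        "p = shortestPath((a)-[*..20]-(b)) RETURN p"
    )
    return query, {"src": parsed["src"], "dst": parsed["dst"]}
```

Passing the station names as query parameters rather than splicing them into the query string keeps user input out of the Cypher text; the returned pair could then be handed to a Neo4j client session.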
First, the ambiguity of the dialogue and of the input text is considered when matching Q&A statements through “pattern”. For example, “Which is the shortest power supply path from A to B?” and “Which is the shortest path of power supply distance from A to B” are the same question as the example in Figure 4. The “srai” label is used to normalize the dispatching question, after which unified processing is carried out. A common ambiguity dictionary is established to reduce ambiguity and standardize terms, addressing unclear units and non-standard terminology such as “220 substation” and “load adjustment”. After matching, variables are stored with the “think” and “set” tags in the “template” response process, and custom functions such as “shortest_path” are activated with the wildcard “#”. Cypher action statements are then generated from the parameters passed through the “get” tag. The results are output through a combination of text, voice and the Neo4j interactive interface, achieving interactive intelligent retrieval Q&A between regulators and the dispatching knowledge graph, which greatly lowers the threshold of graph database operation and improves work efficiency. In addition, during the fault scheduling process, the Q&A module can be accessed at any time to retrieve scheduling knowledge and assist in completing fault handling tasks.
4.4 Power grid dispatching fault information push and fault record
When a dispatching fault occurs, the fault information is transmitted back to the monitoring system of the controller in real time. According to the location of the fault, the fault plan of the area can be automatically recommended and the related fault cases can be matched.
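One simple way to picture the case matching mentioned above is to compare the set of observed fault phenomena with the phenomenon set attached to each historical case cluster. The sketch below assumes Jaccard set overlap as the similarity measure, which is an illustrative choice and not the matching method used in this paper; the function names, cluster names and phenomenon labels are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_cases(observed, case_clusters, top_k=3):
    """Rank case clusters by overlap between observed and recorded phenomena.

    observed: iterable of phenomenon labels reported for the current fault.
    case_clusters: dict mapping cluster name -> iterable of phenomenon labels.
    Returns up to top_k (cluster name, similarity) pairs, best first;
    clusters with no overlap are dropped.
    """
    scored = [
        (name, jaccard(observed, phenomena))
        for name, phenomena in case_clusters.items()
    ]
    scored = [(name, sim) for name, sim in scored if sim > 0]
    # Sort by descending similarity, then by name for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]
```

In a graph-backed system the candidate clusters would first be narrowed by subgraph retrieval around the faulty device, so a ranking like this would only run over a small set of related cases.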
At the same time, real-time fault information, which includes the faulty equipment, location, status and phenomenon, is matched with the fault phenomena and fault cases in the fault dispatching knowledge graph by imitating human thinking and working modes. The matching involves three aspects: the single device fault case cluster, the congener device fault case cluster and congener device fault handling. According to the fault information, initial constraints are formed and entity links are established. Historical fault information is collected through subgraph retrieval and path association of fault case clusters and fault handling knowledge, and knowledge search, calculation and recommendation are further achieved through path recall. Assume that there are N types of faults in the congener device E = (e1, e2, …, en), and that the set of all possible fault phenomena is denoted A. Then the N types of faults correspond to an empirical fault phenomenon subset Am = (a1, a2, …, am), and each fault case cluster C corresponds to a fault phenomenon subset. When device e1 fails and the fault phenomenon set is 1ae. The specific process is shown in Figure 7. Case event clusters can be automatically generated, saved and displayed by the fault case function module. The Neo4j front-end interface provides a visual display of fault case clusters and further fault information, which facilitates the identification of weak points in the power grid.
5. EXPERIMENTAL ANALYSIS
A learning set was formed from 9534 outage planning methods and protection opinions of a power grid company covering the 20 months from January 2020 to August 2021, which yielded more than 40,000 operation sentences after sentence segmentation. Using conventional word segmentation and the new word discovery algorithm proposed in this paper, a total of 28863 words were selected to form a professional dictionary for power outage planning in power grid dispatching.
Among them, there are 975 transformer-related words, 32 inspection anomalies, 783 circuit-related words, 5335 switches, 949 busbars, 15,340 knife switches (disconnectors), 684 substation names, 88 generators, and 1847 other commonly used words.
(1) Evaluation of the entity recognition model
In this paper, Precision (P), Recall (R) and F1 score are taken as the evaluation indicators, calculated as follows:
$$P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN}, \quad F1 = \frac{2PR}{P + R}$$
where TP is the number of correctly recognized entities, FP is the number of incorrectly recognized entities, and FN is the number of entities that were not recognized. The effectiveness and rationality of the entity recognition model for power production safety hazards can be evaluated through this evaluation index system. In order to verify the validity and rationality of the BERT-BiLSTM-CRF entity recognition model proposed in this paper, the dictionary-rule, BiLSTM-CRF and BERT-BiLSTM-CRF models are compared on a dataset of grid dispatching records. Table 1 shows the comparison of entity recognition across the different models.
Table 1. Comparison of three entity recognition models.
The experimental results show that the dictionary-rule based recognition method depends on the integrity and accuracy of the power grid dispatching dictionary, and typos in the power grid dispatching records cause incorrect recognition. The BiLSTM-CRF, ATT-BiLSTM-CRF and BERT-BiLSTM-CRF entity recognition models are built on deep learning, which can explore the relationships among the characters themselves and between the characters and the context of the text, and therefore achieve higher accuracy. Compared with the dictionary-rule recognition model, the BERT-BiLSTM-CRF entity recognition model improves performance considerably, increasing the F1 score by about 0.16, recall by about 0.20 and precision by about 0.1. The BERT-BiLSTM-CRF model adds the BERT vector representation layer to the BiLSTM-CRF model, increasing the F1 score by about 0.13, recall by about 0.19, and precision by about 0.06. The ATT-BiLSTM-CRF entity recognition model, which incorporates an attention mechanism, shows some improvement over the BiLSTM-CRF entity recognition model, but still falls short of the BERT-BiLSTM-CRF entity recognition model in precision, recall and F1 score. The power grid dispatching named entity recognition model based on BERT-BiLSTM-CRF uses the BERT vector representation layer to obtain abstract features of the text, fully considers the impact of context information on entities, and achieves the best entity extraction effect. Table 2 shows the experimental results of identifying the various types of entities in power grid dispatching with the BERT-BiLSTM-CRF model.
Table 2. Comparison of entity recognition by category
6. CONCLUSIONS
This paper studied the construction of a power grid dispatching knowledge graph and intelligent assisted decision-making. Considering the characteristics of regulation and operation tasks in actual scenarios and of power grid dispatching data, the power grid dispatching knowledge graph is constructed. On this basis, a power grid dispatching assisted decision-making system is developed to provide data management, intelligent interaction and auxiliary decision-making. The validity of the entity recognition model was verified with actual operation data of the power system, and the practicality of each functional module of the developed system was verified through decision queries. Knowledge guidance and decision recommendation in the task scenario improve the fault regulation ability of dispatchers and the safe operation level of the distribution network, which can provide a reference for other provinces and cities seeking to improve dispatching assisted decision-making.
7. ACKNOWLEDGMENTS
This work is supported by State Grid Corporation Headquarters Science and Technology Project: Research and Application of Key Technologies for Heterogeneous Equipment Modeling and Dynamic State Estimation of Regional New Power System (5108-202218280A-2-296-XG).
8. REFERENCES
[1] Zhang Yao, Wang Aohan, Zhang Hong, “Overview of smart grid development in China,” Power System Protection and Control, 49(05), 180–187 (2021).
[2] Wan Qian, Zhu Liyue, Ouyang Feng, “Artificial intelligence based public opinion analysis system for radio and television,” Radio & TV Broadcast Engineering, 46(12), 46–52 (2019).
[3] Liu Wenxia, Huang Yuchen, Wan Haiyang, et al., “Application of complex network theory in vulnerability and robustness evaluation of energy internet,” Smart Power, 49(01), 14–21 (2021).
[4] Yang Jian, Ye Qiuhong, Wang Huaijun, et al., “Research on big data technology in power failure prediction method,” Microcomputer Applications, 38(07), 188–190 (2022).
[5] Liu Jing, “Research on named entity recognition,” Computer Knowledge and Technology, 15(09), 179–180 (2019).
[6] Zhang Junfei, Bi Zhisheng, Wang Jing, et al., “Design of Chinese domain named entity recognition framework based on BLSTM-CRF,” Computing Technology and Automation, 38(03), 117–121 (2019).
[7] Pu Tianjiao, Qiao Ji, Han Xiao, et al., “Research and application of artificial intelligence in operation and maintenance for power equipment,” High Voltage Engineering, 46(02), 369–383 (2020).
[8] Zhao Hui, Pang Haiting, Feng Shanshan, et al., “Summary of Chinese named entity recognition technology,” Journal of Changchun University of Technology, 42(05), 444–450 (2021).
[9] Lavine B K, White C G, Davidson C E., “Genetic algorithms for variable selection and pattern recognition,” Comprehensive Chemometrics (Second Edition), 673–700 (2020). https://doi.org/10.1016/B978-0-12-409547-2.14888-7
[10] Yang Chenju, Sun Jun, Pi Qiandong, et al., “Hierarchical parsing based on CRF and multiple rules,” Journal of Jilin University (Science Edition), 58(06), 1452–1460 (2020).
[11] Palaz D, Magimai-Doss M, Collobert R., “End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition,” Speech Communication, 108, 15–32 (2019). https://doi.org/10.1016/j.specom.2019.01.004
[12] Changki Lee, “LSTM-CRF models for named entity recognition,” The Institute of Electronics, Information and Communication Engineers, D(4), E100 (2017).
[13] Yan G, Lei G, Yefeng W, et al., “Constructing a Chinese electronic medical record corpus for named entity recognition on resident admit notes,” BMC Medical Informatics and Decision Making, 19(Suppl 2) (2019).
[14] Xiaolin Li, Jiaying Fan, “Entity relationship extraction method based on dependency syntax analysis and rules,” Robotics Systems and Vehicle Technology (2019).
|