|
1. INTRODUCTION

As an economical, clean and renewable energy source, wind energy is favored by countries around the world. By the end of 2021, the installed capacity of onshore wind power in China had exceeded 300 million kilowatts [1]. Annual turbine maintenance costs account for about 10% to 20% of turbine operating revenue [2]. Because wind turbines operate in relatively harsh environments, they fail frequently, so it is important to evaluate their operating performance and predict potential failures.

At present, most research in China on wind turbine fault prediction is based on data-driven machine learning methods. For example, Jin Xiaohang et al. [3] used sparse autoencoders to encode and decode feature data for dimensionality reduction and then predicted power with deep neural networks. Wang Chao et al. [4] adopted an LSTM-based fault warning method and analyzed the prediction residuals with a sliding window. Li Senjuan et al. [5] used an SVM-based method to classify and predict fault and normal operation data. Liu Jiarui et al. [6] combined the autoencoder (AE) with the convolutional neural network (CNN) to propose a wind turbine fault warning method based on a deep convolutional autoencoder (DCAE). These studies mainly use the relevant algorithms to predict important characteristics of wind turbines, determine the abnormal-alarm threshold from the residuals, and finally use the threshold to analyze system faults; the reported experimental results are good. However, problems remain, such as dimensional differences caused by the excessive data volume fed to the network model, which introduce errors into the model, and the data-driven approach has certain shortcomings.
The output active power of a wind turbine directly reflects its performance, and modeling and analyzing it from operating data has become a research hotspot in wind turbine performance evaluation [7]. For example, Huang Lingling et al. [8] took the prediction error of a long short-term memory neural network as a monitoring index of dynamic deterioration and then used the fuzzy comprehensive evaluation method to evaluate the operating state of the wind turbine. Wang Yuhong et al. [9] proposed an ultra-short-term power prediction method for multiple wind turbines based on a BiLSTM network with the TPA (temporal pattern attention) mechanism. Building on this research on power prediction with LSTM and its improved variants, and retaining their advantages, this paper applies residual analysis with a seq2seq (sequence-to-sequence) neural network built from LSTM units to perform power prediction analysis on the SCADA system data of wind turbines.

2. WIND TURBINE SCADA SYSTEM OPERATION DATA PROCESSING

2.1 Data preprocessing

This paper uses the SCADA system operation data of a wind farm in Inner Mongolia. The data include fault data and normal operation data, and the main features are running time, wind speed, active power and rotational speed. Table 1 lists some of the important feature parameters used in this paper.

Table 1. Characteristic data used in wind turbine fault prediction
Part of the data is equal to 0. Such records are generally logged while the wind turbine is stopped, and the manufacturer also sets some values to 0. They interfere with model training and therefore need to be removed. The main exclusion criteria are: the speed is 0, the power is 0, or the wind speed is not between the cut-in and cut-out wind speeds.

2.2 Correlation analysis of data

Some SCADA variables are strongly correlated with one another, for example state variables such as wind speed, power and generator rotor speed. Such variables act as fixed learning parameters for model prediction and have little impact on model training. By contrast, small changes in weakly correlated variables often have a large impact on the other data, so some weakly correlated variables are needed as important inputs for optimizing model learning. The correlation of the data therefore needs to be analyzed. The Pearson correlation coefficient is a linear correlation coefficient that is mainly used to analyze relationships between data. For two variables $x$ and $y$ with $n$ samples, the correlation coefficient $r$ is calculated as:

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

where $\bar{x}$ and $\bar{y}$ are the sample means of the two variables.

A correlation analysis was carried out on the SCADA data used in this paper, and the resulting correlation chart is shown in Figure 1. Figure 1 shows that the first four variables (power, wind speed, speed and generator winding temperature) are very strongly correlated with one another, whereas the generator drive-end bearing temperature, the generator non-drive-end bearing temperature and the nacelle temperature are only weakly correlated with those four variables and will therefore have a greater impact on the model in subsequent training. For this reason, the later fault prediction experiments tend to use fault data involving the latter three variables for verification, to ensure the accuracy of the experiment.
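The exclusion rule of Section 2.1 and the Pearson coefficient of Section 2.2 can be illustrated with a minimal Python sketch. The field names, the cut-in and cut-out wind speeds (3 and 25 m/s) and the sample records are assumptions made for the example, not values taken from the wind farm data.

```python
import math

# Hypothetical cut-in / cut-out wind speeds in m/s; real values depend on the turbine model.
CUT_IN, CUT_OUT = 3.0, 25.0


def is_valid(record):
    """Keep a record only if speed and power are non-zero and the wind speed
    lies between the cut-in and cut-out wind speeds."""
    return (
        record["speed"] != 0
        and record["power"] != 0
        and CUT_IN <= record["wind_speed"] <= CUT_OUT
    )


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up sample records (not measured data).
records = [
    {"wind_speed": 0.0, "speed": 0.0, "power": 0.0},         # shutdown record
    {"wind_speed": 5.0, "speed": 900.0, "power": 300.0},
    {"wind_speed": 8.0, "speed": 1300.0, "power": 900.0},
    {"wind_speed": 11.0, "speed": 1700.0, "power": 1500.0},
    {"wind_speed": 30.0, "speed": 1750.0, "power": 1500.0},  # above cut-out
]
clean = [r for r in records if is_valid(r)]
r_wind_power = pearson_r(
    [r["wind_speed"] for r in clean], [r["power"] for r in clean]
)
```

Applying `pearson_r` to every pair of features yields the kind of correlation matrix that a chart such as Figure 1 visualizes.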
3. WIND TURBINE FAULT PREDICTION ALGORITHM BASED ON SEQ2SEQ

3.1 LSTM neural network

The LSTM network is a special recurrent neural network [10]. It retains the advantages of the RNN for sequence processing and adds gating switches that filter information. This mitigates problems that arise during training when the weights update slowly, such as vanishing or exploding gradients. The network has three gate structures: the output gate, the input gate and the forget gate. The forget gate determines whether the memory unit from the previous moment enters the calculation, the input gate determines whether the candidate memory unit is used, and the output gate determines whether the hidden state is used. Each gate is controlled by an activation function. The overall internal structure of the LSTM network is shown in Figure 2. The LSTM is calculated as follows:

$$ f_t = \sigma(w_{xf} x_t + w_{hf} h_{t-1} + b_f) $$
$$ i_t = \sigma(w_{xi} x_t + w_{hi} h_{t-1} + b_i) $$
$$ c'_t = \tanh(w_{xc} x_t + w_{hc} h_{t-1} + b_c) $$
$$ o_t = \sigma(w_{xo} x_t + w_{ho} h_{t-1} + b_o) $$
$$ c_t = f_t \odot c_{t-1} + i_t \odot c'_t $$
$$ h_t = o_t \odot \tanh(c_t) $$

where $f_t$ is the forget gate; $\sigma$ is the sigmoid activation function; $\tanh$ is the hyperbolic tangent activation function; $\odot$ denotes element-wise multiplication; $x_t$ is the input vector at the current moment; $w_{xf}$ is the weight between the input vector and the forget gate; $h_{t-1}$ is the hidden layer state of the previous moment; $w_{hf}$ is the weight between the hidden layer and the forget gate; $b_f$ is the bias of the forget gate; $i_t$ is the input gate; $w_{xi}$ is the weight between the input vector and the input gate; $w_{hi}$ is the weight between the hidden layer and the input gate; $b_i$ is the bias of the input gate; $c'_t$ is the candidate memory unit; $w_{xc}$ is the weight between the input vector and the candidate memory unit; $w_{hc}$ is the weight between the hidden layer and the candidate memory unit; $b_c$ is the bias of the candidate memory unit; $o_t$ is the output gate; $w_{xo}$ is the weight between the input vector and the output gate; $w_{ho}$ is the weight between the hidden layer and the output gate; $b_o$ is the bias of the output gate; $c_t$ is the memory unit; $c_{t-1}$ is the memory unit of the previous moment; and $h_t$ is the hidden state of the current moment.

3.2 seq2seq neural network

The sequence-to-sequence model was originally used mainly in natural language processing tasks such as machine translation and speech and text recognition [11]; later studies applied it to time series forecasting tasks and achieved good prediction results. The seq2seq model consists of an encoder and a decoder whose basic unit is the LSTM model [12]; the unrolled encoder and decoder are shown in Figure 3. In the figure, the encoder input sequence has length $t$ and the decoder output sequence has length $t'$. The encoder's final hidden layer state $h_t$ is passed to the decoder as input, and the decoder output is transformed to the required dimension by a fully connected layer to obtain the final output. The hidden layer state in the encoder at the current moment is calculated as follows:

$$ h_t = \mathrm{LSTM}(x_t, h_{t-1}) $$

The hidden layer state in the decoder at the current moment is calculated as follows:

$$ h'_t = \mathrm{LSTM}(f_{t-1}, h'_{t-1}) $$

where $h_t$ is the hidden layer state in the encoder at the current moment; $x_t$ is the input vector at the current moment; $h_{t-1}$ is the hidden layer state of the previous moment in the encoder; $\mathrm{LSTM}(\cdot)$ is the internal calculation function of the LSTM model; $h'_t$ is the hidden layer state at the current moment in the decoder; $f_{t-1}$ is the output vector at the previous moment; and $h'_{t-1}$ is the hidden layer state at the previous moment in the decoder.

4. EXPERIMENTAL RESEARCH AND RESULT ANALYSIS

4.1 Data analysis of wind turbine operation

A generator bearing problem in the wind turbine caused the SCADA system to report a fault, after which the wind turbine was shut down for maintenance.
To verify the effectiveness of the proposed algorithm, 3 months of data from the corresponding period of the year before the failure are used as the training set, and the subsequent 13 months of data, which include the fault time point, are used as the test set to verify the time of failure.

Because the data contain abnormal records such as downtime data and fault data, the data are first visualized as part of preprocessing. The variables used in this prediction experiment, i.e. the features in Table 1, are plotted in panels A to G of Figure 4. The figure shows many 0 values, which the SCADA system records while the wind turbine is shut down. Panel F shows that the temperature of the generator non-drive-end bearing is stable overall but fluctuates markedly in the red elliptical area, where it rises significantly because of the failure of the generator non-drive-end bearing.

4.2 Model parameter optimization

The 3 months of data from the corresponding period of the year before the generator bearing failure described above were extracted, and 19464 samples were obtained after processing for training the seq2seq network. The batch size is set to 64, the initial learning rate of the optimizer to 0.001, the sequence length seq_len to 12, and the number of training epochs to 50. The number of hidden-layer neurons hidden_dim was set to 8, 16, 32, 64 and 128 in comparative prediction experiments, which were evaluated with the mean absolute error (MAE) and the root mean square error (RMSE), calculated as follows:

$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| $$

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} $$

where $n$ is the number of data samples, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value. The specific experimental results are shown in Table 2.
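Both error measures follow directly from their definitions. The sketch below is a minimal pure-Python illustration; the function names and the sample values are assumptions made for the example and are not the experimental data.

```python
import math


def _check(actual, predicted):
    """Reject empty inputs and inputs of different lengths."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error: the average of |y_i - y_hat_i|."""
    _check(actual, predicted)
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean of (y_i - y_hat_i)^2."""
    _check(actual, predicted)
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)
    )


# Made-up normalized power values, for illustration only.
y_true = [0.50, 0.60, 0.70, 0.80]
y_pred = [0.40, 0.60, 0.90, 0.80]
print(f"MAE={mae(y_true, y_pred):.4f}, RMSE={rmse(y_true, y_pred):.4f}")
```

RMSE weights large errors more heavily than MAE, so on the same data RMSE is never smaller than MAE; reporting both gives a sense of whether the error is dominated by a few large deviations.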
Table 2. Model prediction performance for different values of hidden_dim
As the number of neurons increases, the model initially performs better, but this trend does not continue up to 128 neurons, which indicates that more neurons are not always better; the error curve is concave, and 32 neurons are used here.

4.3 Alarm threshold determination

The trained model was applied to the 3 months of training data described above, and the residual between the predicted and actual values of the active power was calculated, as shown in Figure 5(a). A normal distribution was fitted to the residuals, and the fitted distribution is shown in Figure 5(b). The residuals are mainly concentrated within ±0.5, and most of them are close to 0. Using the properties of the normal distribution, a suitable threshold a is set so that the interval [-a, a] contains more than 99.7% of the data. Data between the two green dashed lines are treated as normal, and data outside the dashed lines are treated as abnormal. From this, the alarm threshold is determined to be ±0.3584.

4.4 Verification of failure prediction methods

The model trained above, with 32 hidden-layer neurons, was used to predict the test data of the following 13 months. The predicted and actual values of active power on the test set are shown in Figure 6(a). To verify the accuracy of the model prediction, their difference was calculated, and the residual distribution is shown in Figure 6(b). As the figure shows, the predicted values output by the model basically match the actual values; only some of the data deviate noticeably.
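The alarm logic of Sections 4.3 and 4.4 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the exact procedure used in the experiments: the threshold is taken as three standard deviations around the mean of the training residuals (the usual 99.7% rule for a normal distribution), an alarm is raised at the first run of three consecutive exceedances, and all function names and sample values are hypothetical.

```python
import statistics


def alarm_threshold(train_residuals, k=3.0):
    """Symmetric threshold a such that [-a, a] covers mean +/- k*std of the
    training residuals (k=3 corresponds to about 99.7% for a normal fit)."""
    mu = statistics.fmean(train_residuals)
    sigma = statistics.pstdev(train_residuals)
    return abs(mu) + k * sigma


def first_alarm_index(residuals, threshold, run_length=3):
    """Index of the first sample that completes `run_length` consecutive
    residuals with |residual| > threshold, or None if there is no such run."""
    streak = 0
    for i, r in enumerate(residuals):
        streak = streak + 1 if abs(r) > threshold else 0
        if streak >= run_length:
            return i
    return None


# Made-up residuals, for illustration only.
train = [0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1]
a = alarm_threshold(train)
test = [0.0, 0.9, 0.1, 0.8, 0.9, 1.0, 0.2]
alarm_at = first_alarm_index(test, a)
```

A single in-range residual resets the counter, so isolated spikes above the alarm line do not trigger an alarm; only a sustained run does. Any further screening of a triggered point against operating records is a manual step that this sketch does not model.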
With the alarm threshold determined above, the red dotted line is taken as the alarm line. Among the first 90,000 or so data points, only some intermittently exceed the alarm line, as shown in the black elliptical area in the figure. In the red rectangular area, however, the residual exceeds the alarm threshold by a wide margin, and from this point the wind turbine can be judged to be in an abnormal state, which persists until the residual later exceeds the alarm threshold completely. The first point at which the threshold is exceeded three times in a row is data point 23476; it is discarded because it does not agree with the actual situation. The second run of three consecutive exceedances occurs at data point 88890, at which the wind turbine can be judged abnormal; afterwards the residual exceeds the threshold more and more often until it stays above the threshold. According to the correspondence between data points and time, combined with the alarm records provided by the SCADA system, fault prediction with the seq2seq neural network identifies the abnormal state of the wind turbine about 6 days in advance, so that the relevant components can be repaired or replaced early and unnecessary losses avoided.

5. CONCLUSION

In this paper, a seq2seq neural network is used to carry out power prediction experiments on wind turbines, the residual between the predicted and actual values of active power is calculated, and the alarm threshold is determined.
Verification with the fault data shows that the threshold is exceeded three times in a row 6 days before the SCADA system raises its alarm. This indicates that using the seq2seq neural network can effectively prevent faults from deteriorating, provide technical support for optimizing the maintenance strategy of the unit, and improve the operational reliability of the wind turbine.

ACKNOWLEDGMENTS

This work was supported by the Inner Mongolia Autonomous Region Science and Technology Plan Project "Research and Application of Key Components of Large Wind Turbines and Whole Machine Status Monitoring and Fault Early Warning Technology" (2021GG0433).

REFERENCES

[1] Lin, C., "Multiple measures to promote the high-quality development of distributed wind power," Machine E-commerce News, A07 (2022).
[2] Jin, X. H., Sun, Y., Shan, J. H., et al., "Review of fault diagnosis and prediction technology of wind turbines," Chinese Journal of Scientific Instrument, 38(05), 1041–1053 (2017).
[3] Jin, X. H., Xu, Z. W., Sun, Y., et al., "Online operation status monitoring of wind turbines based on SCADA data analysis and sparse self-coding neural network," Journal of Solar Energy, 42(06), 321–328 (2021).
[4] Wang, C., Li, Z. D., "Wind turbine gearbox bearing fault warning based on LSTM network," Electric Power Science and Engineering, 36(09), 40–45 (2020).
[5] Li, S. J., Zhang, P., Yue, D. W., et al., "Fault prediction of wind turbine based on support vector machine," Computer Simulation, 39(05), 84–88+180 (2022).
[6] Liu, J. R., Yang, G. T., Yang, X. Y., "Research on fault warning method of wind turbine based on deep convolutional autoencoder," Journal of Solar Energy, 43(11), 215–223 (2022).
[7] Ma, T. S., "Modeling and performance evaluation method of wind turbine based on improved LSTM," Shenyang University of Technology, Shenyang (2021).
[8] Huang, L. L., Li, S., Fu, Y., et al., "Ultra-short-term offshore wind power prediction based on wind turbine status," Acta Solar Sinica, 43(08), 391–398 (2022).
[9] Wang, Y. H., Shi, Y. X., Zhou, X., et al., "Ultra-short-term power prediction of BiLSTM multiwind turbine based on time mode attention mechanism," High Voltage Engineering, 48(05), 1884–1892 (2022).
[10] Chen, R., "Research on English Machine Translation Based on LSTM Attention Embedding," Automation and Instrumentation, 264(10), 140–143 (2021).
[11] Men, D., Chen, L., "Text abstract generation method based on improved Seq2Seq-Attention model," Electronic Design Engineering, 30(23), 6–10 (2022).
[12] Chen, Y. F., Zhang, D. H., Yu, H., Wang, Y. Q., "Multi-feature short-term bus load prediction based on Seq2seq model," Transactions of Electric Power System and Automation, 35(01), 1–6+35 (2023).
|