Smart fusion of sea surface wind fields based on machine learning
-
Abstract: Assimilation-based or interpolation-based fusion of sea surface wind fields from multi-source data is currently constrained by computing power. This paper proposes training an XGBoost-based machine learning model that corrects ERA-5 reanalysis data in the regions where multi-source satellite data and ERA-5 data overlap, and then applying the model to correct ERA-5 data quickly (machine learning inference). Because inference is fast, correcting and fusing the whole ERA-5 region takes only about 2 s, so the fused wind field over the entire sea surface can be built at low computational cost. The study works with typical wind field variables (10 m wind speed, 10 m wind direction, and the U10 and V10 components) and accounts for the difference between sea and land by using a land mask to exclude land areas. Four ERA-5 correction models are built, D_S_A_XGBoost, D_S_O_XGBoost, U_V_A_XGBoost and U_V_O_XGBoost, and used to generate the fused sea surface wind field. Comparing the ERA-5 reanalysis data before and after correction with the satellite data shows that all four models narrow the gap between ERA-5 and the satellite data. For wind speed, both the root mean square error (RMSE) and the mean absolute error (MAE) are effectively reduced. For wind direction, RMSEd and MAEd also tend to decrease. Evaluating the four XGBoost models against Tropical Atmosphere Ocean Array (TAO) buoy data shows that U_V_O_XGBoost gives the best correction of ERA-5 data: its correlation reaches 0.893, an increase of about 0.011. The results indicate that the method greatly increases fusion speed while maintaining wind field accuracy.
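The abstract works with two equivalent descriptions of the wind (speed and direction, or the U10 and V10 components) and reports direction errors as RMSEd and MAEd. The sketch below illustrates how the two descriptions relate and why direction errors need wrap-around handling. It assumes the meteorological convention (direction the wind blows from, in degrees clockwise from north) and assumes that direction differences are wrapped into [-180°, 180°) before averaging; the source does not state either choice, and all function names here are hypothetical.

```python
import math


def uv_to_speed_dir(u, v):
    """Convert U/V components (m/s) to speed and direction.

    Assumption: direction is where the wind comes FROM, in degrees
    clockwise from north (meteorological convention).
    """
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


def speed_dir_to_uv(speed, direction):
    """Inverse of uv_to_speed_dir under the same convention."""
    rad = math.radians(direction)
    return -speed * math.sin(rad), -speed * math.cos(rad)


def angular_difference(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def rmse(errors):
    """Root mean square of a list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    """Mean absolute value of a list of errors."""
    return sum(abs(e) for e in errors) / len(errors)


def direction_errors(predicted, observed):
    """RMSE and MAE of wind direction, using wrapped differences.

    Assumption: this is one plausible reading of RMSEd/MAEd, not
    necessarily the exact definition used in the paper.
    """
    diffs = [angular_difference(p, o) for p, o in zip(predicted, observed)]
    return rmse(diffs), mae(diffs)
```

Without the wrapping step, a prediction of 350° against an observation of 10° would count as a 340° error instead of 20°, which would inflate direction statistics near north.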
-
Tab. 1 Satellite data information used in the test (whole region)

| Satellite | Time | Number of data points |
| --- | --- | --- |
| HY-2B | 2021-01-31 00:00:00 | 7 026 |
| HY-2B | 2021-01-31 12:00:00 | 6 233 |
| CFOSAT | 2021-01-31 00:00:00 | 6 419 |
| CFOSAT | 2021-01-31 12:00:00 | 12 928 |
| MetOp-B | 2021-01-31 00:00:00 | 54 982 |
| MetOp-B | 2021-01-31 12:00:00 | 55 586 |
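Tab. 2 Evaluation results of the models trained on the whole region

| Satellite | Time | Model | Wind direction RMSEd/(°) | Wind direction MAEd/(°) | Wind speed RMSE/(m·s⁻¹) | Wind speed MAE/(m·s⁻¹) |
| --- | --- | --- | --- | --- | --- | --- |
| HY-2B | 2021-01-31 00:00:00 | Original | 42.941 | 12.480 | 1.313 | 0.917 |
| HY-2B | 2021-01-31 00:00:00 | U_V_A_XGBoost | 42.133 | 11.614 | 1.166 | 0.803 |
| HY-2B | 2021-01-31 00:00:00 | D_S_A_XGBoost | 39.497 | 12.490 | 1.083 | 0.774 |
| HY-2B | 2021-01-31 12:00:00 | Original | 50.959 | 11.917 | 1.238 | 0.946 |
| HY-2B | 2021-01-31 12:00:00 | U_V_A_XGBoost | 48.910 | 10.757 | 1.118 | 0.853 |
| HY-2B | 2021-01-31 12:00:00 | D_S_A_XGBoost | 43.517 | 10.989 | 1.133 | 0.845 |
| CFOSAT | 2021-01-31 00:00:00 | Original | 37.334 | 7.685 | 1.814 | 1.461 |
| CFOSAT | 2021-01-31 00:00:00 | U_V_A_XGBoost | 35.012 | 7.150 | 1.465 | 1.150 |
| CFOSAT | 2021-01-31 00:00:00 | D_S_A_XGBoost | 35.529 | 8.060 | 1.434 | 1.107 |
| CFOSAT | 2021-01-31 12:00:00 | Original | 79.938 | 14.234 | 1.340 | 1.030 |
| CFOSAT | 2021-01-31 12:00:00 | U_V_A_XGBoost | 78.729 | 13.814 | 1.194 | 0.903 |
| CFOSAT | 2021-01-31 12:00:00 | D_S_A_XGBoost | 76.858 | 15.671 | 1.269 | 0.952 |
| MetOp-B | 2021-01-31 00:00:00 | Original | 25.232 | 9.860 | 1.270 | 0.940 |
| MetOp-B | 2021-01-31 00:00:00 | U_V_A_XGBoost | 24.408 | 10.122 | 1.118 | 0.806 |
| MetOp-B | 2021-01-31 00:00:00 | D_S_A_XGBoost | 24.391 | 10.063 | 1.053 | 0.771 |
| MetOp-B | 2021-01-31 12:00:00 | Original | 32.589 | 8.530 | 1.190 | 0.883 |
| MetOp-B | 2021-01-31 12:00:00 | U_V_A_XGBoost | 31.433 | 8.534 | 1.076 | 0.786 |
| MetOp-B | 2021-01-31 12:00:00 | D_S_A_XGBoost | 33.728 | 9.318 | 1.011 | 0.735 |

Note: In the original table, bold type marks the best results.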
Tab. 3 Satellite data information used in the test (land mask)

| Satellite | Time | Number of data points |
| --- | --- | --- |
| HY-2B | 2021-01-31 00:00:00 | 7 026 |
| HY-2B | 2021-01-31 12:00:00 | 6 233 |
| CFOSAT | 2021-01-31 00:00:00 | 6 419 |
| CFOSAT | 2021-01-31 12:00:00 | 12 928 |
| MetOp-B | 2021-01-31 00:00:00 | 54 977 |
| MetOp-B | 2021-01-31 12:00:00 | 55 575 |
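Tab. 4 Evaluation results of the models trained with the land mask

| Satellite | Time | Model | Wind direction RMSEd/(°) | Wind direction MAEd/(°) | Wind speed RMSE/(m·s⁻¹) | Wind speed MAE/(m·s⁻¹) |
| --- | --- | --- | --- | --- | --- | --- |
| HY-2B | 2021-01-31 00:00:00 | Original | 42.941 | 12.480 | 1.313 | 0.917 |
| HY-2B | 2021-01-31 00:00:00 | U_V_O_XGBoost | 41.342 | 11.508 | 1.182 | 0.814 |
| HY-2B | 2021-01-31 00:00:00 | D_S_O_XGBoost | 38.860 | 12.409 | 1.076 | 0.767 |
| HY-2B | 2021-01-31 12:00:00 | Original | 50.959 | 11.917 | 1.238 | 0.946 |
| HY-2B | 2021-01-31 12:00:00 | U_V_O_XGBoost | 49.686 | 10.843 | 1.120 | 0.851 |
| HY-2B | 2021-01-31 12:00:00 | D_S_O_XGBoost | 42.023 | 10.990 | 1.120 | 0.834 |
| CFOSAT | 2021-01-31 00:00:00 | Original | 37.334 | 7.685 | 1.814 | 1.461 |
| CFOSAT | 2021-01-31 00:00:00 | U_V_O_XGBoost | 34.275 | 7.214 | 1.461 | 1.146 |
| CFOSAT | 2021-01-31 00:00:00 | D_S_O_XGBoost | 36.178 | 8.031 | 1.431 | 1.103 |
| CFOSAT | 2021-01-31 12:00:00 | Original | 79.938 | 14.234 | 1.340 | 1.030 |
| CFOSAT | 2021-01-31 12:00:00 | U_V_O_XGBoost | 79.149 | 13.827 | 1.219 | 0.914 |
| CFOSAT | 2021-01-31 12:00:00 | D_S_O_XGBoost | 76.327 | 15.197 | 1.241 | 0.937 |
| MetOp-B | 2021-01-31 00:00:00 | Original | 25.213 | 9.860 | 1.265 | 0.937 |
| MetOp-B | 2021-01-31 00:00:00 | U_V_O_XGBoost | 25.183 | 10.093 | 1.140 | 0.815 |
| MetOp-B | 2021-01-31 00:00:00 | D_S_O_XGBoost | 24.996 | 10.079 | 1.093 | 0.787 |
| MetOp-B | 2021-01-31 12:00:00 | Original | 32.586 | 8.525 | 1.182 | 0.880 |
| MetOp-B | 2021-01-31 12:00:00 | U_V_O_XGBoost | 31.501 | 8.627 | 1.111 | 0.806 |
| MetOp-B | 2021-01-31 12:00:00 | D_S_O_XGBoost | 33.056 | 9.357 | 1.058 | 0.763 |

Note: In the original table, bold type marks the best results.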
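Tab. 5 Wind field fusion results of different machine learning algorithms

| Data source/algorithm | Correlation coefficient | Root mean square error | Standard deviation |
| --- | --- | --- | --- |
| ERA-5 | 0.882 | 0.980 | 1.938 |
| XGBoost | 0.893 | 0.890 | 1.936 |
| Random Forest | 0.890 | 0.915 | 1.955 |
| AdaBoost | 0.892 | 0.906 | 1.978 |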
Tab. 6 Comparison of fusion time

| Fusion method | Mean inference time/s |
| --- | --- |
| XGBoost model | 2.063 |
| Interpolation method (IDW) | 226.616 |
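Table 6 uses inverse distance weighting (IDW) as the interpolation baseline. For readers unfamiliar with it, the following is a minimal generic sketch of IDW at a single query point. It is not the authors' implementation: the planar Euclidean distance, the power of 2, and the function name `idw_interpolate` are all assumptions made for illustration.

```python
import math


def idw_interpolate(samples, query, power=2.0):
    """Inverse distance weighted estimate at one query point.

    samples: list of (x, y, value) observations.
    query:   (x, y) location to estimate.
    power:   distance exponent (assumed 2.0 here, a common default).
    Distances are planar Euclidean, which is an assumption; real wind
    data on a lat/lon grid would need a great-circle distance.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            # The query coincides with an observation: return it directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Usage: two observations, estimate halfway between them.
observations = [(0.0, 0.0, 0.0), (2.0, 0.0, 10.0)]
midpoint_estimate = idw_interpolate(observations, (1.0, 0.0))
```

In this naive form every grid point visits every observation, so the cost grows with the number of grid points times the number of observations, whereas a trained tree model is evaluated once per grid point.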