Submitted: 2023-04-13  Revised: 2023-06-05
Abstract: Defect management for power grid equipment suffers from coarse control, after-the-fact handling, incomplete data, delayed defect elimination, and inadequate defect analysis, largely because defect data are incomplete. A concrete symptom is that some defect data circulate outside the defect management system (off-system circulation). To improve the completeness, timeliness, and standardization of equipment defect text, this paper proposes and applies a big-data model that mines such off-system defects. Using a large body of historical defect data, the model applies the TF-IDF algorithm to extract defect keywords, uses those keywords to select the work tickets that concern defects, and then uses the Pair Letters Similarity algorithm and the Jaro-Winkler algorithm to match those work tickets against existing defect records. Work tickets that cannot be matched are output as off-system defect data. Experiments show that the model effectively improves the completeness of defect data and offers clear advantages in both data completeness and the timeliness of data reporting.
Keywords: equipment defects; fuzzy matching; TF-IDF algorithm; Jaro-Winkler algorithm; Pair Letters Similarity algorithm
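The pipeline outlined in the abstract (keyword extraction, work-ticket filtering, fuzzy matching, output of unmatched tickets) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The function names, the 0.85 threshold, the smoothed IDF formula, and the rule of taking the larger of the two similarity scores are all assumptions. "Pair Letters Similarity" is interpreted here as the Dice coefficient over adjacent character pairs. Whitespace-tokenized English strings stand in for real defect text, which would first need Chinese word segmentation.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Rank tokens by TF-IDF summed over all tokenized docs (assumed scoring)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    scores = Counter()
    for doc in docs:
        if not doc:
            continue
        for tok, cnt in Counter(doc).items():
            idf = math.log((1 + n) / (1 + df[tok])) + 1  # smoothed IDF
            scores[tok] += cnt / len(doc) * idf
    return [tok for tok, _ in scores.most_common(top_k)]


def letter_pairs(s):
    s = s.replace(" ", "")
    return [s[i:i + 2] for i in range(len(s) - 1)]


def pair_letters_similarity(a, b):
    """Dice coefficient over adjacent character pairs (assumed definition)."""
    pa, pb = letter_pairs(a), letter_pairs(b)
    if not pa or not pb:
        return 1.0 if a == b else 0.0
    overlap = sum((Counter(pa) & Counter(pb)).values())
    return 2.0 * overlap / (len(pa) + len(pb))


def jaro(a, b):
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    m = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    a_seq = [a[i] for i in range(len(a)) if a_hit[i]]
    b_seq = [b[j] for j in range(len(b)) if b_hit[j]]
    t = sum(x != y for x, y in zip(a_seq, b_seq)) // 2  # transpositions
    return (m / len(a) + m / len(b) + (m - t) / m) / 3


def jaro_winkler(a, b, p=0.1):
    """Jaro similarity boosted by a shared prefix of up to 4 characters."""
    j = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def off_system_tickets(tickets, defects, keywords, threshold=0.85):
    """Tickets that mention a defect keyword but match no defect record."""
    unmatched = []
    for ticket in tickets:
        if not any(k in ticket for k in keywords):
            continue  # not a defect-related work ticket
        best = max(
            (max(pair_letters_similarity(ticket, d), jaro_winkler(ticket, d))
             for d in defects),
            default=0.0,
        )
        if best < threshold:  # threshold is an assumed value
            unmatched.append(ticket)
    return unmatched
```

In practice the threshold and the way the two similarity scores are combined would need tuning against labelled work tickets; the abstract does not state how the original model sets them.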
Funding: Science and Technology Project of China Southern Power Grid Co., Ltd. (GZ2015-2-0047)
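Author | Affiliation | Email |
Wan Jinjin (万金金)* | Guizhou Chuangxing Electric Power Research Institute Co., Ltd. | 2681378414@qq.com |
Wen Yi (文屹) | Guizhou Power Grid Co., Ltd. | |
Lü Qiansu (吕黔苏) | Electric Power Research Institute of Guizhou Power Grid Co., Ltd. | |
Zhang Xun (张迅) | Electric Power Research Institute of Guizhou Power Grid Co., Ltd. | |
Fan Qiang (范强) | Electric Power Research Institute of Guizhou Power Grid Co., Ltd. | |
Xiao Shuzhou (肖书舟) | Electric Power Research Institute of Guizhou Power Grid Co., Ltd. | |
Wan Yunlin (万云林) | Tiantang Power Supply Station, Yinjiang Power Supply Bureau, Tongren Power Supply Bureau, Guizhou Power Grid Co., Ltd. | |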