电力大数据 (Power Systems and Big Data), 2023, 26(1)

Power address matching model based on the "Retriever-Discriminator" architecture

ZHAO Jianpeng, SHENG Fang, XU Chuanzi, CHEN Yi, LUO Qing, CHEN Cong

(State Grid Hangzhou Electric Power Company)

Received: 2022-05-28  Revised: 2023-04-25

Abstract: To match addresses in the power address database against those in external address databases, ensure the accuracy of power addresses, and enable data sharing between the power system and external systems, an address matching model based on a "Retriever-Discriminator" architecture is proposed. The structure of the model is first described in detail. It comprises an address retriever, which narrows the search scope, and an address discriminator, which makes the final decision on whether a candidate address is a correct match. The address retriever is built on the term frequency-inverse document frequency (TF-IDF) algorithm, and the address discriminator is built on the Chinese pre-trained language model NEZHA. A negative sample training method is proposed to improve the discrimination performance of the address discriminator. The two datasets used in the experiments are described in detail. The experimental results show that the model can accurately find, in the external address database, the address that matches a given power address. In particular, the address discriminator identifies the matching address among multiple candidate addresses with an F1 score above 0.99.

Keywords: address matching, power address, term frequency-inverse document frequency, Chinese pre-trained language model, negative sample
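
To illustrate the retrieval stage described in the abstract, the following is a minimal sketch of a TF-IDF address retriever, not the authors' implementation. The character-bigram tokenization, the smoothed IDF formula, the sample addresses, and the names `TfidfRetriever` and `build_pairs` are all assumptions made for illustration. The abstract does not specify how negative samples are built, so the hard-negative construction shown here (labeling retrieved non-matching candidates as negatives) is one plausible reading only.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Split an address into overlapping character n-grams (assumed tokenization)."""
    text = text.replace(" ", "")
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class TfidfRetriever:
    """Toy retriever: ranks library addresses by TF-IDF cosine similarity."""

    def __init__(self, addresses, n=2):
        self.addresses = list(addresses)
        self.n = n
        docs = [Counter(char_ngrams(a, n)) for a in self.addresses]
        # Document frequency: number of addresses that contain each n-gram.
        df = Counter(term for doc in docs for term in doc)
        total = len(docs)
        # Smoothed IDF (an assumption) so common terms keep a positive weight.
        self.idf = {t: math.log((1 + total) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vectors = [self._normalize(doc) for doc in docs]

    def _normalize(self, counts):
        # Weight term counts by IDF, drop unseen terms, scale to unit length.
        vec = {t: tf * self.idf[t] for t, tf in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def retrieve(self, query, k=3):
        """Return the k library addresses most similar to the query, with scores."""
        q = self._normalize(Counter(char_ngrams(query, self.n)))
        scored = []
        for addr, vec in zip(self.addresses, self.vectors):
            score = sum(w * vec.get(t, 0.0) for t, w in q.items())
            scored.append((score, addr))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(addr, score) for score, addr in scored[:k]]


def build_pairs(retriever, query, true_match, k=3):
    """Label pairs for a discriminator: 1 for the true match, 0 for retrieved
    non-matching candidates (hypothetical hard-negative scheme)."""
    pairs = [(query, true_match, 1)]
    for addr, _ in retriever.retrieve(query, k):
        if addr != true_match:
            pairs.append((query, addr, 0))
    return pairs


# Made-up external address library, for illustration only.
external_library = [
    "杭州市西湖区文三路100号",
    "杭州市西湖区文三路102号",
    "杭州市上城区解放路100号",
    "宁波市海曙区中山西路5号",
]
retriever = TfidfRetriever(external_library)
candidates = retriever.retrieve("西湖区文三路100号", k=2)
pairs = build_pairs(retriever, "西湖区文三路100号", "杭州市西湖区文三路100号", k=3)
```

In a full pipeline, the labeled pairs would be passed to a sentence-pair classifier, such as one fine-tuned from NEZHA, that decides whether each candidate truly matches. That step is omitted here because the abstract gives no details of it.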

Author Name	Affiliation	E-mail
ZHAO Jianpeng^*	State Grid Hangzhou Electric Power Company	zhaojianpeng0830@163.com
SHENG Fang	State Grid Hangzhou Electric Power Company
XU Chuanzi	State Grid Hangzhou Electric Power Company
CHEN Yi	State Grid Hangzhou Electric Power Company
LUO Qing	State Grid Hangzhou Electric Power Company
CHEN Cong	State Grid Hangzhou Electric Power Company
