
电力大数据 (Power Systems and Big Data), 2025, 28(01)


Table structure recognition in documents based on deep learning

JIN Sha¹, LI Linhan¹, XIE Hailong¹, HUANG Xiaohong¹, DONG Qianqian²

(1. PowerChina Jiangxi Electric Power Construction Co., Ltd.; 2. Xi’an Polytechnic University)

Received: 2024-10-15 Revised: 2024-12-24

Abstract: Large models are increasingly applied in new-type power systems, where they play a major role in processing and analyzing large volumes of power data and in improving grid operating efficiency. To normalize the tables that appear in model training data, this paper proposes a deep learning method that combines a Spatial CNN and a Grid CNN to recognize and reconstruct table structure. The Spatial CNN predicts separator-line masks, and connected component analysis (CCA) extracts the separator lines from those masks to build a grid of cells. The Grid CNN cell-merging module then merges adjacent cells to correct splitting errors and produce an accurate table structure. For efficiency, a feature pyramid network (FPN) built on ResNet-18 serves as the backbone, integrating multi-scale features to strengthen structure recognition. The method improves recognition accuracy and streamlines reconstruction, providing a practical approach to automated table processing. Experiments show that, compared with existing strong methods, the proposed method improves precision by 1.1 percentage points and F1 by 0.3 percentage points on the SciTSR dataset, and reaches a TEDS-Struct score of 97.1% on the more complex PubTabNet dataset. It offers an efficient and accurate way to preprocess tables in power-industry documents for large language model training.

Keywords: large language model; power system; table processing; CNN
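The post-processing steps named in the abstract (connected component analysis on separator masks, grid construction, and merging of adjacent cells) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the two networks are not modeled, the row and column masks and the merge list are hand-written stand-ins for what a Spatial CNN and a Grid CNN might output, separate row and column masks are assumed, and every function name here is hypothetical.

```python
from collections import deque


def connected_components(mask):
    """Group 4-connected foreground pixels of a binary mask into components."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components


def separator_positions(mask, axis):
    """One coordinate per separator component (axis 0 gives y, axis 1 gives x)."""
    centers = []
    for pixels in connected_components(mask):
        coords = [p[axis] for p in pixels]
        centers.append(sum(coords) / len(coords))
    return sorted(centers)


def build_grid(row_seps, col_seps, height, width):
    """Cells as (top, bottom, left, right) boxes between separators and image borders."""
    ys = [0.0] + row_seps + [float(height)]
    xs = [0.0] + col_seps + [float(width)]
    return [[(ys[i], ys[i + 1], xs[j], xs[j + 1]) for j in range(len(xs) - 1)]
            for i in range(len(ys) - 1)]


def merge_cells(n_rows, n_cols, merges):
    """Union-find over grid cells; each pair in `merges` joins two adjacent cells."""
    parent = {(i, j): (i, j) for i in range(n_rows) for j in range(n_cols)}

    def find(cell):
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    for a, b in merges:
        parent[find(b)] = find(a)
    groups = {}
    for cell in parent:
        groups.setdefault(find(cell), []).append(cell)
    return sorted(sorted(g) for g in groups.values())


# Hand-written stand-ins for predicted masks on a 6 x 8 image (assumption).
height, width = 6, 8
row_mask = [[1 if r == 3 else 0 for _ in range(width)] for r in range(height)]
col_mask = [[1 if c in (2, 5) else 0 for c in range(width)] for _ in range(height)]

row_seps = separator_positions(row_mask, axis=0)
col_seps = separator_positions(col_mask, axis=1)
grid = build_grid(row_seps, col_seps, height, width)

# Stand-in for a merge prediction: the first two cells of the top row form one spanning cell.
merged = merge_cells(len(grid), len(grid[0]), [((0, 0), (0, 1))])
```

Union-find is used for the merge step so that chains of pairwise merges collapse into a single spanning cell regardless of the order in which the pairs arrive.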

CLC number: TP311

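Funding: Power Construction Corporation of China, Ltd. project; Key Scientific Research Program of the Shaanxi Provincial Department of Education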

Author	Affiliation	E-mail
JIN Sha*	PowerChina Jiangxi Electric Power Construction Co., Ltd.	3013195309@qq.com
LI Linhan	PowerChina Jiangxi Electric Power Construction Co., Ltd.
XIE Hailong	PowerChina Jiangxi Electric Power Construction Co., Ltd.
HUANG Xiaohong	PowerChina Jiangxi Electric Power Construction Co., Ltd.
DONG Qianqian	Xi’an Polytechnic University
