2025/10/7 17:24:21/
Contents
I. Installing rpunct
II. Usage
III. Errors when downloading the model
  1. Error details
  2. Cause
  3. Solution
IV. Errors at runtime
  1. Error details
  2. Cause
  3. Solution
V. Changing the default cache path

I. Installing rpunct

    pip install rpunct

Versions of the related dependencies:

    langdetect==1.0.9
    pandas==1.2.4
    simpletransformers==0.61.4
    six==1.16.0
    torch==1.8.1

GitHub repository: https://github.com/Felflare/rpunct
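Because the version pins above matter for getting rpunct to install and run, it can help to compare them against what is actually installed. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `compare_versions` are made up for this example, and the pinned numbers are simply the ones listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this article; adjust if your environment differs.
PINNED = {
    "langdetect": "1.0.9",
    "pandas": "1.2.4",
    "simpletransformers": "0.61.4",
    "six": "1.16.0",
    "torch": "1.8.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_versions(pinned):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, expected in pinned.items():
        found = installed_version(name)
        # Ignore local build tags such as "+cu111" when comparing.
        matches = found is not None and found.split("+")[0] == expected
        report[name] = (expected, found, matches)
    return report


for name, (expected, found, ok) in compare_versions(PINNED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {expected}, installed {found} ({status})")
```

A "differs" line is not necessarily a problem; it only flags where your environment departs from the one used in this article.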
II. Usage

Testing with a string:

    from rpunct import RestorePuncts

    def main():
        rpunct = RestorePuncts()
        text = rpunct.punctuate("""in 2018 cornell researchers built a high-powered detector that in combination with an algorithm-driven process called ptychography set a world record
    by tripling the resolution of a state-of-the-art electron microscope as successful as it was that approach had a weakness it only worked with ultrathin samples that were
    a few atoms thick anything thicker would cause the electrons to scatter in ways that could not be disentangled now a team again led by david muller the samuel b eckert
    professor of engineering has bested its own record by a factor of two with an electron microscope pixel array detector empad that incorporates even more sophisticated
    3d reconstruction algorithms the resolution is so fine-tuned the only blurring that remains is the thermal jiggling of the atoms themselves""")
        print(text)

    if __name__ == "__main__":
        main()

Note: only English text is supported. Be sure to include the if __name__ == "__main__": guard in the program; otherwise an error is raised.
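Reading from a file:

    from rpunct import RestorePuncts

    rpunct = RestorePuncts()
    # Open the input explicitly as UTF-8 (see the note below)
    with open("./upload/text/eng.txt", "r", encoding="utf-8") as f:
        text = f.read()
    output_text = rpunct.punctuate(text)
    print(output_text)

Note: during testing, the output text came out garbled when the utf-8 encoding was not specified.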
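The garbled output mentioned in the note is what happens when UTF-8 bytes are decoded with a different codec. Here is a small self-contained illustration; the temporary file and the sample string are invented for the demo, and latin-1 merely stands in for whatever non-UTF-8 default a system might use.

```python
import tempfile
from pathlib import Path

sample = "café"  # one non-ASCII character is enough to show the effect

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "eng.txt"
    path.write_text(sample, encoding="utf-8")

    # Explicit encoding: the text round-trips unchanged.
    good = path.read_text(encoding="utf-8")

    # Wrong codec: each UTF-8 byte is read as a separate character.
    bad = path.read_text(encoding="latin-1")

print(repr(good))
print(repr(bad))
```

Passing encoding="utf-8" to open() removes the dependence on the platform default, which is why the file-reading example above specifies it.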
Expected output:
In 2018, Cornell researchers built a high-powered detector that, in combination with an algorithm-driven process called Ptychography, set a world record by tripling the
resolution of a state-of-the-art electron microscope. As successful as it was, that approach had a weakness. It only worked with ultrathin samples that were a few atoms
thick. Anything thicker would cause the electrons to scatter in ways that could not be disentangled. Now, a team again led by David Muller, the Samuel B.
Eckert Professor of Engineering, has bested its own record by a factor of two with an Electron microscope pixel array detector empad that incorporates even more
sophisticated 3d reconstruction algorithms. The resolution is so fine-tuned the only blurring that remains is the thermal jiggling of the atoms themselves.

III. Errors when downloading the model
1. Error details

    OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like felflare/bert-restore-punctuation is not the path to a directory containing a file named config.json.
    Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

2. Cause

The target site cannot be reached, so the model data cannot be downloaded.

3. Solution
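Method 1

Locate punctuate.py in the package installation directory: D:\Anaconda\envs\speech\Lib\site-packages\rpunct\punctuate.py. (Look up the installation directory for your own environment.)
Alternatively, Ctrl + click RestorePuncts in the line from rpunct import RestorePuncts to jump straight to the file.
Change the download address to the following mirror:

    __author__ = "Daulet N."
    ...

    import os
    # Set the mirror before the remaining imports run
    os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

    import logging
    ...

Reference (see the comments section): https://zhuanlan.zhihu.com/p/627688602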
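The same mirror setting can probably be applied from your own script instead of editing the installed package. This is a sketch under the assumption that the Hugging Face libraries read HF_ENDPOINT when they are first imported, which is why the variable is set before anything else; the helper name `use_hf_mirror` is made up for this example.

```python
import os

# Mirror URL taken from this article; swap in another mirror if you prefer.
MIRROR = "https://hf-mirror.com"


def use_hf_mirror(endpoint=MIRROR):
    """Point Hugging Face downloads at a mirror unless one is already set."""
    os.environ.setdefault("HF_ENDPOINT", endpoint)
    return os.environ["HF_ENDPOINT"]


active = use_hf_mirror()
print("HF_ENDPOINT =", active)

# Import rpunct only after the variable is set, for example:
#     from rpunct import RestorePuncts
```

Using setdefault means a value already exported in the shell wins over the hard-coded mirror, so the script does not silently override an existing configuration.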
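Method 2

Load the model offline: download the model yourself, then fill in the corresponding model path.
Locate punctuate.py in the package installation directory: D:\Anaconda\envs\speech\Lib\site-packages\rpunct\punctuate.py. (Look up the installation directory for your own environment.)
Modify punctuate.py as follows:

    from transformers import AutoModel

    class RestorePuncts:
        def __init__(self, wrds_per_pred=250):
            ...
            # Load from the locally cached snapshot instead of the Hub
            self.model = AutoModel.from_pretrained("D:/AnacondaCLI/cache/huggingface/hub/models--felflare--bert-restore-punctuation/snapshots/954108a105ef1f89f08b71c25d6e33bb89cde724")

Note: in testing, the model could be loaded offline this way, but the program then failed to run properly. It is unclear whether the method is no longer supported after an update or whether the path was filled in incorrectly.
More details: https://huggingface.co/docs/transformers/installation#offline-mode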
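One possible explanation for the failure noted in Method 2, offered here as an untested guess: rpunct calls the predict method of a simpletransformers NERModel, and a bare AutoModel does not provide it. If so, keeping NERModel and handing it a local snapshot folder may work better. The sketch below only locates such a folder in the Hugging Face cache layout (models--org--name/snapshots/revision); `find_snapshot_dir` is a hypothetical helper, not part of any library.

```python
from pathlib import Path


def find_snapshot_dir(cache_root, repo_id):
    """Return the first cached snapshot folder for repo_id that contains
    config.json, or None if nothing usable is cached."""
    repo_dir = Path(cache_root) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return None
    for candidate in sorted(snapshots.iterdir()):
        if (candidate / "config.json").is_file():
            return candidate
    return None


# Default cache location (see section V); adjust if you moved the cache.
cache_root = Path.home() / ".cache" / "huggingface" / "hub"
local_model = find_snapshot_dir(cache_root, "felflare/bert-restore-punctuation")
print("local snapshot:", local_model)

# Untested idea: inside punctuate.py, keep NERModel and pass the folder
# in place of the repo id, e.g. NERModel("bert", str(local_model), ...).
```

Checking for config.json mirrors the error message in section III, which complains specifically about that file being absent.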
IV. Errors at runtime

1. Error details

    ValueError: 'use_cuda' set to True when cuda is unavailable. Make sure CUDA is available or set use_cuda=False.

2. Cause

CUDA is not available. Either make sure CUDA is available, or set use_cuda=False.

3. Solution

Locate ner_model.py under the package installation path: D:\Anaconda\envs\speech\Lib\site-packages\simpletransformers\ner\ner_model.py. (Look up the installation directory for your own environment.)
Quick ways to get there:
Ctrl + click RestorePuncts in the line from rpunct import RestorePuncts, then jump again through NERModel in the line from simpletransformers.ner import NERModel.
Press Ctrl + G and enter 114 to jump to the relevant line.
Modify the parameter in ner_model.py:

    class NERModel:
        def __init__(
            ...
            # use_cuda=True,
            # Change the parameter to False
            use_cuda=False,
            ...
        ):

Reference: https://github.com/Felflare/rpunct/issues/1
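Rather than hard-coding False, the flag can be derived from what PyTorch reports, so the same code runs on machines with and without a GPU. A minimal sketch follows; `cuda_is_available` is a name invented here, and the commented NERModel call shows a hypothetical edit to rpunct's punctuate.py, not the library's shipped code.

```python
def cuda_is_available():
    """Report whether PyTorch can see a CUDA device (False if torch is missing)."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


use_cuda = cuda_is_available()
print("use_cuda =", use_cuda)

# Hypothetical use where rpunct constructs its model:
#     NERModel("bert", "felflare/bert-restore-punctuation",
#              labels=self.valid_labels, use_cuda=use_cuda, args={...})
```

Passing the flag at the call site leaves the default in simpletransformers untouched, so other projects in the same environment are not affected.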
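V. Changing the default cache path

Default cache path (Windows): C:\Users\username\.cache\huggingface\hub
Add the environment variable HUGGINGFACE_HUB_CACHE and set it to a path of your choice.
More details: https://huggingface.co/docs/transformers/installation#install-with-conda

If rpunct cannot be installed because of dependency version conflicts, see https://github.com/samwaterbury/rpunct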
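For the cache-path change described in this section, the variable can also be set from Python before the Hugging Face libraries are imported. The sketch below is deliberately simplified: it checks only HUGGINGFACE_HUB_CACHE and the default path given above, whereas the real libraries may consult further variables. The folder name hf-cache and the helper `effective_hub_cache` are made up for illustration.

```python
import os
from pathlib import Path


def effective_hub_cache():
    """Simplified lookup: the override variable if set, else the default path."""
    override = os.environ.get("HUGGINGFACE_HUB_CACHE")
    if override:
        return Path(override)
    return Path.home() / ".cache" / "huggingface" / "hub"


# Hypothetical target folder; pick any location with enough free space.
os.environ["HUGGINGFACE_HUB_CACHE"] = str(Path.home() / "hf-cache")
print("cache directory:", effective_hub_cache())
```

Setting the variable system-wide, as the section describes, has the same effect and also covers programs started outside this script.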