Define different Jieba objects in one Python file
I have three different vocab.txt files (glove, tencent.ai, fasttext).
Target: use these vocab.txt files to initialize three separate jieba objects in one Python file.
Method: if three different jieba objects are defined, each one needs its own cache file, so the question is how to pass a different cache file path to each object. In
/home/user/anaconda3/envs/py36/lib/python3.6/site-packages/jieba/__init__.py, change the parameters of the Tokenizer.__init__() function so that it accepts a tmp_dir argument:
    class Tokenizer(object):

        # tmp_dir is the parameter added by this change
        def __init__(self, tmp_dir=None, dictionary=DEFAULT_DICT):
            self.lock = threading.RLock()
            if dictionary == DEFAULT_DICT:
                self.dictionary = dictionary
            else:
                self.dictionary = _get_abs_path(dictionary)
            self.FREQ = {}
            self.total = 0
            self.user_word_tag_tab = {}
            self.initialized = False
            # directory where this tokenizer writes its cache file
            self.tmp_dir = tmp_dir
            self.cache_file = None
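As an alternative to editing the installed package, the cache directory can probably be set from outside, since `tmp_dir` is an ordinary attribute on `jieba.Tokenizer` that is read when the dictionary is first loaded. The following is a minimal sketch under that assumption, not the method used in this post; `make_tokenizer`, `cache_dir`, `vocab_path`, and `tokenizer_cls` are hypothetical names that are not part of jieba.

```python
import os


def make_tokenizer(cache_dir, vocab_path=None, tokenizer_cls=None):
    """Return a tokenizer whose cache file is written into cache_dir.

    Hypothetical helper for illustration; it is not part of jieba.
    """
    if tokenizer_cls is None:
        # Imported lazily so the helper can also be tried with a stand-in class.
        from jieba import Tokenizer as tokenizer_cls

    tokenizer = tokenizer_cls()
    # The directory must exist before the cache is dumped into it.
    os.makedirs(cache_dir, exist_ok=True)
    # Assumption: tmp_dir is consulted lazily on first use, so it has to be
    # set before load_userdict() or cut() triggers initialization.
    tokenizer.tmp_dir = cache_dir
    if vocab_path is not None:
        tokenizer.load_userdict(vocab_path)
    return tokenizer
```

With an unpatched jieba, a call such as `make_tokenizer('glove.model', 'glove.model/vocab.txt')` would be expected to keep that tokenizer's cache under `glove.model/`, which avoids maintaining a modified copy of the library.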
Result:
    import os

    from jieba import Tokenizer  # the patched Tokenizer that accepts tmp_dir

    # Base directory that holds one sub-directory per embedding model.
    DATA_DIR = "/home/user/models/serving_embedding_torch/model_path/torch/data"


    class Jieba(object):
        """Wrap a jieba Tokenizer that has its own cache directory and user dictionary."""

        def __init__(self, vocab_path, model_path):
            super(Jieba, self).__init__()
            # tmp_dir is the keyword added to Tokenizer.__init__; the cache file
            # (jieba.cache) is written into this directory.
            self.jieba = Tokenizer(tmp_dir=os.path.join(DATA_DIR, model_path))
            self.jieba.load_userdict(vocab_path)

        def seg(self, text):
            print(list(self.jieba.cut(text, cut_all=False)))


    a = Jieba('glove.model/vocab.txt', 'glove.model')
    b = Jieba('tencent.model/vocab.txt', 'tencent.model')
    c = Jieba('fb.model/vocab.txt', 'fb.model')
    text = "区块链是一个好方向海派青年公寓龙爪槐"
    a.seg(text)
    b.seg(text)
    c.seg(text)
    (py36) user@big-001:~/models/serving_embedding_torch/model_path/torch/data$ python3 peel.py
    Building prefix dict from the default dictionary ...
    2019-10-17 17:14:20,745 DEBUG: Building prefix dict from the default dictionary ...
    Dumping model to file cache /home/user/models/serving_embedding_torch/model_path/torch/data/glove.model/jieba.cache
    2019-10-17 17:14:21,575 DEBUG: Dumping model to file cache /home/user/models/serving_embedding_torch/model_path/torch/data/glove.model/jieba.cache
    Loading model cost 0.899 seconds.
    2019-10-17 17:14:21,644 DEBUG: Loading model cost 0.899 seconds.
    Prefix dict has been built succesfully.
    2019-10-17 17:14:21,644 DEBUG: Prefix dict has been built succesfully.
    Building prefix dict from the default dictionary ...
    2019-10-17 17:14:26,352 DEBUG: Building prefix dict from the default dictionary ...
    Dumping model to file cache /home/user/models/serving_embedding_torch/model_path/torch/data/tencent.model/jieba.cache
    2019-10-17 17:14:27,101 DEBUG: Dumping model to file cache /home/user/models/serving_embedding_torch/model_path/torch/data/tencent.model/jieba.cache
    Loading model cost 0.805 seconds.
    2019-10-17 17:14:27,158 DEBUG: Loading model cost 0.805 seconds.
    Prefix dict has been built succesfully.
    2019-10-17 17:14:27,159 DEBUG: Prefix dict has been built succesfully.
    Building prefix dict from the default dictionary ...
    2019-10-17 17:18:41,279 DEBUG: Building prefix dict from the default dictionary ...
    Dumping model to file cache /home/user/models/serving_embedding_torch/model_path/torch/data/fb.model/jieba.cache
    2019-10-17 17:18:42,045 DEBUG: Dumping model to file cache /home/user/models/serving_embedding_torch/model_path/torch/data/fb.model/jieba.cache
    Loading model cost 0.822 seconds.
    2019-10-17 17:18:42,101 DEBUG: Loading model cost 0.822 seconds.
    Prefix dict has been built succesfully.
    2019-10-17 17:18:42,102 DEBUG: Prefix dict has been built succesfully.
    ['区块', '链是', '一个', '好', '方向', '海派', '青年', '公寓', '龙爪槐']
    ['区块链', '是', '一个', '好方向', '海派青年公寓', '龙爪槐']
    ['区块链', '是', '一个', '好', '方向', '海派', '青年', '公寓', '龙爪槐']