Custom Python Charmap Codec
I am trying to write a custom Python codec. Here is a short example:
    import codecs

    class TestCodec(codecs.Codec):
        def encode(self, input_, errors='strict'):
            return codecs.charmap_encode(input_, errors, {
                'a': 0x01,
                'b': 0x02,
                'c': 0x03,
            })

        def decode(self, input_, errors='strict'):
            return codecs.charmap_decode(input_, errors, {
                0x01: 'a',
                0x02: 'b',
                0x03: 'c',
            })

    def lookup(name):
        if name != 'test':
            return None
        return codecs.CodecInfo(
            name='test',
            encode=TestCodec().encode,
            decode=TestCodec().decode,
        )

    codecs.register(lookup)
    print(b'\x01\x02\x03'.decode('test'))
    print('abc'.encode('test'))
Decoding works, but encoding raises an exception:
    $ python3 codectest.py
    abc
    Traceback (most recent call last):
      File "codectest.py", line 29, in <module>
        print('abc'.encode('test'))
      File "codectest.py", line 8, in encode
        'c': 0x03,
    UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
Any ideas on how to use charmap_encode correctly?
Solution:
Take a look at https://docs.python.org/3/library/codecs.html#encodings-and-unicode (third paragraph):
There’s another group of encodings (the so called charmap encodings) that choose a different subset of all Unicode code points and how these code points are mapped to the bytes 0x0–0xff. To see how this is done simply open e.g. encodings/cp1252.py (which is an encoding that is used primarily on Windows). There’s a string constant with 256 characters that shows you which character is mapped to which byte value.
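To make the quoted passage concrete, the short sketch below inspects the `decoding_table` constant in the standard library's `encodings.cp1252` module. The variable names `table` and `euro` are only chosen here for illustration.

```python
import encodings.cp1252

# decoding_table is a string constant: the character at index i is
# what byte value i decodes to.
table = encodings.cp1252.decoding_table
print(len(table))

# Byte 0x80 is the euro sign in cp1252, so indexing the table should
# agree with what the codec itself returns.
euro = table[0x80]
print(ascii(euro), bytes([0x80]).decode('cp1252') == euro)
```

The custom codec in the answer relies on the same idea, just with a much shorter table.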
Following the hint to look at encodings/cp1252.py, consider the following code:
    import codecs

    class TestCodec(codecs.Codec):
        def encode(self, input_, errors='strict'):
            return codecs.charmap_encode(input_, errors, encoding_table)

        def decode(self, input_, errors='strict'):
            return codecs.charmap_decode(input_, errors, decoding_table)

    def lookup(name):
        if name != 'test':
            return None
        return codecs.CodecInfo(
            name='test',
            encode=TestCodec().encode,
            decode=TestCodec().decode,
        )

    # The index of each character is the byte value it decodes from:
    # 0x00 -> 'z', 0x01 -> 'a', 0x02 -> 'b', 0x03 -> 'c'.
    decoding_table = (
        'z'
        'a'
        'b'
        'c'
    )
    # Build the reverse mapping (code point -> byte value) for encoding.
    encoding_table = codecs.charmap_build(decoding_table)

    codecs.register(lookup)

    ### --- following is test/debug code
    print(ascii(encoding_table))
    print(b'\x01\x02\x03'.decode('test'))
    foo = 'abc'.encode('test')
    print(ascii(foo))
Output:
{97: 1, 122: 0, 99: 3, 98: 2}
abc
b'\x01\x02\x03'
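The printed encoding table hints at why the original attempt likely failed: its keys are integer code points (97 for 'a'), whereas the question's dict was keyed by one-character strings. As a minimal sketch under that reading, the example below compares the two shapes by calling `codecs.charmap_encode` directly. The names `str_keyed`, `int_keyed`, and `str_keyed_failed` are illustrative and do not come from the original post.

```python
import codecs

# Keys are str: the lookup uses the character's integer code point,
# so none of these keys match and every character is "undefined".
str_keyed = {'a': 0x01, 'b': 0x02, 'c': 0x03}
try:
    codecs.charmap_encode('abc', 'strict', str_keyed)
    str_keyed_failed = False
except UnicodeEncodeError:
    str_keyed_failed = True

# Keys are code points (ints), the same shape as the dict printed above.
int_keyed = {ord('a'): 0x01, ord('b'): 0x02, ord('c'): 0x03}
encoded, consumed = codecs.charmap_encode('abc', 'strict', int_keyed)
print(str_keyed_failed, ascii(encoded), consumed)
```

If this holds, keying the question's encoding dict with `ord('a')` and so on would be an alternative to `charmap_build`, though building both directions from a single decoding table keeps them consistent.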