Python encoding and decoding: Python email encoding and decoding problems

2023-07-05 08:59:03

Basically, I want to read all new emails from an inbox and put them in a database. I'm using Python because it has imaplib, but I know nothing about it.

Currently, I have something like this:

    def primitive_get_text_blocks(email_message_instance):
        maintype = email_message_instance.get_content_maintype()
        if maintype == 'multipart':
            return_parts = ""
            for part in email_message_instance.get_payload():
                if part.get_content_maintype() == 'text':
                    return_parts += " " + part.get_payload()
            return return_parts
        elif maintype == 'text':
            return email_message_instance.get_payload()
        return ""

    fromField = con.escape(email_message["From"])
    contentField = con.escape(primitive_get_text_blocks(email_message))

primitive_get_text_blocks is copy-pasted from somewhere.

The result is that I get database entries like this:

From what I understand, that has something to do with the text being encoded in UTF-7. So I changed to get_payload(decode=True), but that gives me bytes objects. If I append another decode('utf-8'), it sometimes crashes with errors like

'codec error can't decode to ...'.

I don't know how encodings work; I only want a Unicode string with the body of my email.

Why is there no simple convert(charset from, charset to)? How do I get a readable email body (and address)? I've discovered IMAP Fetch Encoding and tried decode_header, but I got no further.

I assume an encoding is the way bytes represent characters, so with that in mind, shouldn't decode take a byte array and spit out a string? Here on Stack Overflow I also came across somebody claiming it had something to do with being encoded with UTF-8 and UTF-7. What does that even mean?

I did google this, and there appear to be tons of duplicates, but the answers they got didn't really help me out (I've tried most of them).

Solution

Turns out it's quite easy. Even though all the documentation points to the glorious past when the unicode function was still a real thing, str does the same job in Python 3.
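To make that concrete, here is a small sketch of the two equivalent ways to turn bytes into text in Python 3. The sample bytes are made up for illustration; in the real code they would come from get_payload(decode=True).

```python
# Illustrative bytes only; in practice these come from get_payload(decode=True).
raw = "héllo wörld".encode("utf-8")

# str(bytes, encoding) and bytes.decode(encoding) do the same thing.
text_a = str(raw, "utf-8")
text_b = raw.decode("utf-8")

# Bytes that are not valid UTF-8 raise UnicodeDecodeError by default.
# errors="replace" substitutes U+FFFD instead of crashing.
lenient = str(b"abc\xff", "utf-8", errors="replace")
```

Whether silently replacing bad bytes is acceptable depends on what the database is used for; it avoids the crash but loses the original characters.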

So to recap, you have to pass decode=True to get_payload and wrap the result in str(..., 'utf-8').
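Hard-coding 'utf-8' only works when the mail really is UTF-8, which may explain the occasional decode crash mentioned in the question. A possible refinement, sketched below, is to ask each part for its declared charset and fall back to UTF-8 when none is given. The helper names get_text_body and get_header_text and the sample message are hypothetical, not part of the original code; the fallback and errors="replace" are assumptions you may want to change.

```python
import email
from email.header import decode_header, make_header


def get_text_body(message):
    """Return all text parts of an email.message.Message joined into one str."""
    chunks = []
    # walk() yields the message itself and every nested part.
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        raw = part.get_payload(decode=True)
        if raw is None:
            continue
        # Assumption: fall back to UTF-8 when the part declares no charset.
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(raw.decode(charset, errors="replace"))
        except LookupError:
            # The declared charset name is unknown to Python.
            chunks.append(raw.decode("utf-8", errors="replace"))
    return " ".join(chunks)


def get_header_text(message, name):
    """Return a header such as From with RFC 2047 encoded words decoded."""
    value = message[name]
    if value is None:
        return ""
    return str(make_header(decode_header(value)))


# Made-up sample message, used only to exercise the helpers.
sample = (
    b"From: =?utf-8?b?SsO8cmdlbg==?= <j@example.com>\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Gr=FC=DFe\r\n"
)
msg = email.message_from_bytes(sample)
body = get_text_body(msg)
sender = get_header_text(msg, "From")
```

With something like this, contentField would be built from get_text_body(email_message) and fromField from get_header_text(email_message, "From") before escaping.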
