[python] How to fix UnicodeDecodeError: 'utf-8' codec can't decode byte

thxxyj 2022. 11. 11. 10:16

[Symptom]

import subprocess

# cmd is the shell command string to run (defined elsewhere)
fd = subprocess.run(cmd, shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, executable='/bin/bash')
out = fd.stdout.decode('utf-8').strip()
err = fd.stderr.decode('utf-8').strip()

Running the code above raised the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xee in position 1583: invalid continuation byte

 

[Cause]

bytes.decode() uses errors='strict' by default, so any byte sequence in the command output that is not valid UTF-8 raises UnicodeDecodeError. From the Python Unicode HOWTO:

The errors argument specifies the response when the input string can’t be converted according to the encoding’s rules. Legal values for this argument are 'strict' (raise a UnicodeDecodeError exception), 'replace' (use U+FFFD, REPLACEMENT CHARACTER), 'ignore' (just leave the character out of the Unicode result), or 'backslashreplace' (inserts a \xNN escape sequence). The following examples show the differences:
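To find out which byte actually triggers the error, one option is to catch the exception and look at its attributes. The following is only a minimal sketch: the sample bytes are made up for illustration (a 0xee lead byte followed by a plain ASCII character), not taken from the real command output.

```python
# Hypothetical sample data: 0xee starts a 3-byte UTF-8 sequence,
# but the next byte ('a') is not a valid continuation byte.
data = b'ok \xeeabc'

try:
    data.decode('utf-8')  # errors defaults to 'strict'
    failure = None
except UnicodeDecodeError as e:
    # e.start is the offset of the offending byte, e.reason the decoder's message,
    # e.object the original bytes that were being decoded.
    failure = (e.start, e.reason, e.object[e.start:e.start + 1])

print(failure)
```

Printing a slice of e.object around e.start can help decide whether the bytes are simply noise or a sign that the output uses a different encoding.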

 

 

[Solution]

>>> b'\x80abc'.decode("utf-8", "strict")  
Traceback (most recent call last):
    ...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
  invalid start byte
>>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "backslashreplace")
'\\x80abc'
>>> b'\x80abc'.decode("utf-8", "ignore")
'abc'

 

I did not need the characters that cannot be decoded as UTF-8, so I used 'ignore':

fd = subprocess.run(cmd, shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, executable='/bin/bash')
out = fd.stdout.decode('utf-8', 'ignore').strip()
err = fd.stderr.decode('utf-8', 'ignore').strip()
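As an alternative, subprocess.run() itself accepts encoding and errors arguments (Python 3.6+), so stdout and stderr come back already decoded as str. The sketch below assumes a stand-in command (a child Python process that writes one invalid byte) instead of the real cmd, and uses 'replace' rather than 'ignore' so that dropped bytes stay visible as U+FFFD.

```python
import subprocess
import sys

# Hypothetical stand-in for the real command: writes the invalid UTF-8 byte 0x80, then "abc".
cmd = [sys.executable, '-c', "import sys; sys.stdout.buffer.write(b'\\x80abc\\n')"]

# With encoding/errors set, run() decodes the pipes for us.
fd = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                    encoding='utf-8', errors='replace')
out = fd.stdout.strip()
err = fd.stderr.strip()

print(repr(out))
```

Whether 'ignore' or 'replace' is the better choice depends on whether silently losing bytes is acceptable; 'replace' makes it easier to notice later that something was not valid UTF-8.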

 

 

Reference: https://docs.python.org/3/howto/unicode.html#the-unicode-type
