UnicodeDecodeError: ‘utf-8’ codec can’t decode byte 0x89 in position 1: invalid start byte
Encoding and decoding in Python
Encoding and decoding values can get confusing.
This sample code might help. I named the variables to show whether each value is encoded (bytes), decoded (a string), base64 encoded, or base64 decoded.
import base64

value = "Decoded string value"

# String -> bytes -> base64 bytes -> string
value_encode = value.encode('utf-8')
value_base64encode = base64.b64encode(value_encode)
value_decode = value_base64encode.decode('utf-8')
print(value_decode)

# Base64 string -> bytes -> base64 decoded bytes -> original string
value_encode = value_decode.encode('utf-8')
value_base64decode = base64.b64decode(value_encode)
value_decode = value_base64decode.decode('utf-8')
print(value_decode)
Sometimes you encode values so they don’t accidentally get processed incorrectly. For example, if a value contains code and you don’t want it to interfere with parsing or processing, you might encode it so that special characters are not interpreted as code and do not break anything.
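As a rough illustration of that idea, here is a minimal sketch in which a value containing a comma, quotes, and a newline is base64 encoded before it goes into a comma-separated record. The record layout and the field names (job-1, done) are made up for this example and are not from any particular system.

```python
import base64

# Hypothetical value with characters that would break a naive comma-separated record
value = 'print("a,b")\nprint("c")'

# Base64 output only uses A-Z, a-z, 0-9, +, / and =, so it is safe between commas
value_base64encode = base64.b64encode(value.encode('utf-8')).decode('ascii')

# Build and parse a made-up three-field record
record = ",".join(["job-1", value_base64encode, "done"])
fields = record.split(",")

# Recover the original value from the middle field
value_decode = base64.b64decode(fields[1]).decode('utf-8')
print(len(fields))
print(value_decode == value)
```

Without the base64 step, the comma inside the value would split the record into four fields instead of three.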
Encoding values properly can help stop some security problems where attackers try to inject code into a process to take malicious actions.
Attackers also may encode values to bypass security tools that inspect data.
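To make that concrete, here is a small sketch of a naive inspection rule that looks for a literal string and misses the same value once it is base64 encoded. The function names and the rule itself are hypothetical and far simpler than what a real security tool does.

```python
import base64

def naive_filter_blocks(data: str) -> bool:
    """Hypothetical inspection rule: block anything containing '<script>'."""
    return "<script>" in data.lower()

def filter_with_decode(data: str) -> bool:
    """Same rule, but also inspect the value after a base64 decode attempt."""
    if naive_filter_blocks(data):
        return True
    try:
        decoded = base64.b64decode(data, validate=True).decode('utf-8')
    except ValueError:
        # Not valid base64, or not UTF-8 text once decoded
        return False
    return naive_filter_blocks(decoded)

payload = "<script>alert(1)</script>"
payload_base64encode = base64.b64encode(payload.encode('utf-8')).decode('ascii')

print(naive_filter_blocks(payload))               # True
print(naive_filter_blocks(payload_base64encode))  # False
print(filter_with_decode(payload_base64encode))   # True
```

The point is only that a tool has to decode a value the same way the target will before inspecting it.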
At any rate, if you want to base64 encode a value, you first need to convert the string into bytes, then base64 encode those bytes. The result is still bytes, so decode it to turn it back into a string that you can pass into functions that expect a string.
When you want to decode the base64 encoded value, turn it back into bytes, then base64 decode it, then convert it back to a string.
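The two sequences of steps described above could be wrapped in a pair of helper functions, roughly like this. The function names and the charset parameter are my own choices for illustration, not part of the base64 module.

```python
import base64

def base64_encode_string(value: str, charset: str = 'utf-8') -> str:
    """String -> bytes -> base64 bytes -> string."""
    value_encode = value.encode(charset)
    value_base64encode = base64.b64encode(value_encode)
    # Base64 output is plain ASCII, so this decode cannot fail
    return value_base64encode.decode('ascii')

def base64_decode_string(value: str, charset: str = 'utf-8') -> str:
    """Base64 string -> bytes -> base64 decoded bytes -> string."""
    value_encode = value.encode('ascii')
    value_base64decode = base64.b64decode(value_encode)
    return value_base64decode.decode(charset)

encoded = base64_encode_string("Decoded string value")
print(encoded)
print(base64_decode_string(encoded))
```

The charset passed to the decode helper has to match the one used when encoding, otherwise the final step can fail or return the wrong characters.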
There are different character encodings, such as ASCII and UTF-8, so you will want to understand what character set you need to support or work with in order to use the proper encoding and decoding.
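Here is a short sketch of how the choice of character encoding changes the bytes you get. The sample value and the use of Latin-1 as a third encoding are just for illustration.

```python
# "café" written with an escape so the accented character is unambiguous
value = "caf\u00e9"

utf8_bytes = value.encode('utf-8')
latin1_bytes = value.encode('latin-1')
print(utf8_bytes)    # b'caf\xc3\xa9'
print(latin1_bytes)  # b'caf\xe9'

# ASCII has no representation for the accented character
try:
    value.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError as error:
    ascii_failed = True
    print("ASCII cannot represent this value:", error)

# Decoding with the wrong encoding may not raise an error, but the text comes out wrong
mojibake = utf8_bytes.decode('latin-1')
print(mojibake)  # cafÃ©
```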
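One challenge arises when you have to decode something and don’t know how it was encoded in the first place. If you guess wrong, or the bytes are not text at all, decoding can fail with an error like the UnicodeDecodeError in the title. You’ll probably want to refer to the source code or the documentation to make sure you decode it correctly.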
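As a minimal sketch of how the error in the title can come up, the example below base64 decodes a value whose underlying bytes are binary data rather than UTF-8 text, then tries to turn it into a string. The input bytes are made up to trigger the error and are not taken from any real case.

```python
import base64

# Hypothetical input: base64 text whose underlying bytes are not UTF-8 text
value_base64encode = base64.b64encode(b'\x00\x89binary')
value_base64decode = base64.b64decode(value_base64encode)

error_message = None
try:
    value_decode = value_base64decode.decode('utf-8')
except UnicodeDecodeError as error:
    # The base64 decode worked; only the bytes -> string step failed
    error_message = str(error)
    value_decode = None

print(error_message)

# One option when you only need something printable: replace undecodable bytes
value_replaced = value_base64decode.decode('utf-8', errors='replace')
print(repr(value_replaced))
```

If the data is not meant to be text, such as an image or a compressed file, the safer choice is usually to keep it as bytes and skip the final decode.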
© 2nd Sight Lab 2022