[3.12] gh-98188: Fix EmailMessage.get_payload to decode data when CTE value has extra text (GH-127547) #128529

miss-islington · 2025-01-06T01:32:29Z

Until now, message handling has been very strict about Content-Transfer-Encoding (CTE) values: mixed case was accepted, but trailing blanks or other text caused decoding to fail, even if the first token was a valid encoding. By Postel's Rule we should go ahead and decode as long as we can recognize that first token. We are not aware of any security or backward compatibility concerns with this fix.

This fix does introduce a new technique/pattern to the Message code: we look to see if the header has a 'cte' attribute, and if so we use that. This effectively promotes the header API exposed by HeaderRegistry to an API that any header parser "should" support. This seems like a reasonable thing to do. It is not, however, a requirement, as the string value of the header is still used if there is no cte attribute.
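The lookup pattern described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: `effective_cte` is a hypothetical helper name, and the fallback normalization (strip and lowercase) is an assumption based on the description.

```python
import email
from email import policy


def effective_cte(msg):
    """Hypothetical helper: pick the CTE token used to choose a decoder.

    Prefer the parsed ``cte`` attribute when the header object offers one
    (as HeaderRegistry headers do); otherwise fall back to the header's
    string value, which is an assumption about the fallback path.
    """
    header = msg.get("content-transfer-encoding", "")
    if hasattr(header, "cte"):
        # Structured header: the parser has already isolated the first token.
        return header.cte
    # Plain string (or Header) value: only normalize whitespace and case.
    return str(header).strip().lower()


RAW = "Content-Transfer-Encoding: Base64 trailing-text\n\naGVsbG8=\n"

# policy.default parses headers through HeaderRegistry, so .cte is available.
modern = email.message_from_string(RAW, policy=policy.default)
# policy.compat32 hands back a plain string, so the fallback path is taken.
legacy = email.message_from_string(RAW, policy=policy.compat32)

print(effective_cte(modern))
print(effective_cte(legacy))
```

With the structured header the helper should see only the first token, while the string fallback still carries the trailing text, which is why the two APIs end up with different degrees of leniency.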

The full fix (ignore any trailing blanks or blank-separated trailing text) applies only to the non-compat32 API. compat32 is only fixed to the extent that it now ignores trailing spaces. Note that the HeaderRegistry parsing still records a HeaderDefect if there is extra text.
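As a rough way to observe the behavior described above, the following sketch parses one message under both policies and inspects the defects recorded on the parsed header. The sample message and the helper names `cte_and_defects` and `decode_with` are made up for illustration. What `get_payload(decode=True)` returns depends on whether the running Python includes this fix, so no expected output is shown.

```python
import email
from email import errors, policy

# Sample message (an assumption for illustration): valid first token,
# followed by blank-separated trailing text.
RAW = (
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64 some-trailing-text\n"
    "\n"
    "aGVsbG8=\n"
)


def cte_and_defects(raw):
    """Return the parsed CTE token and the defects on the header."""
    msg = email.message_from_string(raw, policy=policy.default)
    header = msg["Content-Transfer-Encoding"]
    return header.cte, list(header.defects)


def decode_with(raw, pol):
    """Parse under the given policy and ask for the decoded payload."""
    msg = email.message_from_string(raw, policy=pol)
    return msg.get_payload(decode=True)


cte, defects = cte_and_defects(RAW)
print(cte, [type(d).__name__ for d in defects])

# With the fix, the non-compat32 API is expected to decode the base64 body;
# compat32 is only expected to tolerate trailing spaces, not trailing text.
print(decode_with(RAW, policy.default))
print(decode_with(RAW, policy.compat32))
```

Checking `header.defects` is a way for callers who want strictness to still detect the malformed header even though decoding now proceeds.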

(cherry picked from commit a62ba52)

Co-authored-by: RanKKI
Co-authored-by: Bénédikt Tran

Issue: email: get_payload(decode=True) doesn't handle Content-Transfer-Encoding with trailing white space #98188


miss-islington requested a review from a team as a code owner January 6, 2025 01:32

This was referenced Jan 6, 2025:

email: get_payload(decode=True) doesn't handle Content-Transfer-Encoding with trailing white space #98188 (Closed)

gh-98188: Fix EmailMessage.get_payload to decode data #127547 (Merged)

bedevere-app bot added the awaiting review label Jan 6, 2025

bitdancer merged commit dae5b16 into python:3.12 Jan 7, 2025
29 of 30 checks passed

bedevere-app bot removed the awaiting review label Jan 7, 2025
