Improve HTML/Markdown sanitization #14
Reference: treehouse/mastodon#14
The recent exploit required HTML formatting. Glitch has an open issue for Markdown rendering. Other vulnerabilities likely exist.
Removing the option to create posts in HTML/Markdown is trivial, but disabling the rendering of received HTML/Markdown posts is much more deeply integrated.
Perhaps the ability to toggle HTML/Markdown formatting instance-wide should be an upstream feature request. Disabling these features also breaks previously formatted posts (they will look the same as HTML-formatted posts sent to an instance that does not support HTML formatting).
I use Markdown formatting a lot, so disabling Markdown is a "won't fix". However, it's probably worth looking into better sanitization.
Title changed from "Disable HTML and Markdown Formatting" to "Improve HTML/Markdown sanitization"
I am not against Markdown (I'd prefer others to beta test 😅).
HTML I'd hope to deprecate.
The PoC of the last exploit looked scary because it stole autofill passwords, but much more was possible: an attacker could inject iframes. The vulnerability was only in Glitch, but upstream Mastodon fixed it, which makes me question Glitch.
We discussed this briefly on Discord, so just recapping: I think the best route here is to entity-encode every non-text character in the input (excluding "safe" symbols) before doing Markdown processing. Sanitization is prone to errors and corner cases, even if we use a full-blown HTML parser to do it. A lot of injection bugs rely on malformed tags to escape parser-based sanitization, so simply forbidding tags from appearing unencoded seems like the best route. This does bloat the size of the toot data, but it seems worthwhile.
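To make the encode-first idea concrete, here is a rough Python sketch (not Mastodon or Glitch code, which is Ruby). The `SAFE_SYMBOLS` allow-list and the `entity_encode` name are hypothetical, chosen only for illustration; the real set of "safe" symbols would need to be worked out against whatever Markdown features we want to keep.

```python
# Hypothetical allow-list: symbols left unencoded so basic Markdown syntax
# (emphasis, headings, lists, code spans, links) still works. This set is an
# assumption for illustration, not a vetted list.
SAFE_SYMBOLS = set("*_`#-+.!()[]:/")


def entity_encode(text: str) -> str:
    """Replace every character outside the allow-list with a numeric HTML entity.

    Letters, digits, and whitespace pass through unchanged; everything else
    (including '<', '>', '&', and quotes) is encoded, so no raw tag can reach
    the Markdown processor.
    """
    out = []
    for ch in text:
        if ch.isalnum() or ch.isspace() or ch in SAFE_SYMBOLS:
            out.append(ch)
        else:
            out.append(f"&#{ord(ch)};")
    return "".join(out)


if __name__ == "__main__":
    # The encoded string would then be handed to the Markdown renderer.
    print(entity_encode('**hi** <iframe src="x">'))
```

Each encoded character grows from one character to several (`<` becomes `&#60;`), which is the size bloat mentioned above. Note that with this allow-list `>` is also encoded, so Markdown blockquotes would stop working unless that symbol is handled separately.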