Re: Decode, transcode, sanitize, filter, escape
Hey,
Also bear in mind another consequence of Defense in Depth: you can't
trust HTML generators ;). Alternative formats like BBCode and Markdown
(which actually allows arbitrary HTML insertions as part of the
specification) generate HTML but do not necessarily filter or validate
the contents.
You also have a quality issue. Integrated apps like phpBB or GitHub
make alternative formats look safe because the input is already
restricted and sanitised by the app itself. Isolated libraries adhering
to the specification probably won't do this at all, leaving it up to
the user to sanitise the generated output.
Output from both formats still needs HTMLPurifier or a similar
whitelist-based sanitiser to ensure it's safe.
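
To illustrate what "whitelisted" means here, below is a rough sketch of
an allowlist pass over generated HTML. It is written in Python using only
the standard library, purely for illustration; it is not HTMLPurifier and
not a substitute for it. The names AllowlistSanitizer, sanitize,
ALLOWED_TAGS and DROP_WITH_CONTENT are hypothetical, and the tag list is
an assumption you would tune per application.

```python
import html
from html.parser import HTMLParser

# Assumption: a deliberately small allowlist; anything else is dropped.
ALLOWED_TAGS = {"p", "em", "strong", "ul", "ol", "li", "code", "pre", "blockquote"}
# Elements whose text content should vanish along with the tag itself.
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags (without attributes) and escaped text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            # All attributes are discarded, so onclick= and friends never survive.
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            # Text is re-escaped so it can never turn back into markup.
            self.out.append(html.escape(data))


def sanitize(generated_html):
    """Run generated HTML (e.g. Markdown or BBCode output) through the allowlist."""
    parser = AllowlistSanitizer()
    parser.feed(generated_html)
    parser.close()
    return "".join(parser.out)
```

The point of the design is that nothing reaches the output unless it was
explicitly allowed, which is the opposite of trying to strip known-bad
tags. A real sanitiser also has to deal with attributes, URLs, CSS and
malformed markup, which is why a dedicated library is the safer choice.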
Simple example is to grab phpmarkdown from Twitter, parse some
Markdown containing a <script> tag and see what it outputs. :P
If I recall, a common problem with naive BBCode libs was not filtering
for javascript: URIs properly.
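
As a rough sketch of the javascript: problem, here is a toy [url=...]
converter that checks the URI scheme against an allowlist before emitting
a link. It is plain Python for illustration only and is not taken from any
real BBCode library; is_safe_url, render_url_tags and ALLOWED_SCHEMES are
hypothetical names, and the set of allowed schemes is an assumption.

```python
import html
import re
from urllib.parse import urlsplit

# Assumption: only these schemes may appear in a generated link.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

# Browsers ignore tabs/newlines inside a URL and leading control characters,
# so "java\tscript:" can still execute. Remove them before checking the scheme.
_CONTROL_OR_SPACE = re.compile(r"[\x00-\x20\x7f]+")

URL_TAG = re.compile(r"\[url=([^\]]*)\](.*?)\[/url\]", re.IGNORECASE | re.DOTALL)


def is_safe_url(url):
    """True only if the URL has an explicitly allowed scheme."""
    cleaned = _CONTROL_OR_SPACE.sub("", url)
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return False
    return scheme in ALLOWED_SCHEMES


def render_url_tags(bbcode):
    """Convert [url=...]label[/url] to links, escaping everything else."""
    out = []
    pos = 0
    for match in URL_TAG.finditer(bbcode):
        # Text between tags is escaped, never passed through raw.
        out.append(html.escape(bbcode[pos:match.start()]))
        url, label = match.group(1).strip(), match.group(2)
        if is_safe_url(url):
            out.append('<a href="%s">%s</a>'
                       % (html.escape(url, quote=True), html.escape(label)))
        else:
            # Unsafe or scheme-less target: keep the label, drop the link.
            out.append(html.escape(label))
        pos = match.end()
    out.append(html.escape(bbcode[pos:]))
    return "".join(out)
```

Checking against a list of allowed schemes, rather than looking for the
literal string "javascript:", is what catches variants with mixed case,
embedded tabs or leading whitespace. The generated HTML should still go
through a proper sanitiser afterwards.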
> more specifically: accepting HTML, but trying to allow some of the tags but
> still filtering most of it.
> HTMLPurifier is the tool for this kind of job, but most people would
> recommend using some kind of alternative markup format, like
> BBCode<http://en.wikipedia.org/wiki/BBCode>
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team