When parsing an HTML string that contains a lone < character, the parser drops that character from the value it returns. For example:
<?php
require 'vendor/autoload.php'; // assumes masterminds/html5 was installed with Composer
use Masterminds\HTML5;
$html = '<img src="invalid-url" onerror="alert(\'XSS Attack prefix\')" /> 2 > 1 & 3 < 5 and some more text';
// Parse the document. $dom is a DOMDocument.
$html5 = new HTML5();
$dom = $html5->loadHTML($html);
// Render it as HTML5:
print $html5->saveHTML($dom);
The output of $html5->saveHTML($dom) should be:
<!DOCTYPE html>
<html><img src="invalid-url" onerror="alert('XSS Attack prefix')"> 2 > 1 & 3 < 5 and some more text</html>
but instead it is:
<!DOCTYPE html>
<html><img src="invalid-url" onerror="alert('XSS Attack prefix')"> 2 > 1 & 3 5 and some more text</html>
Note that the < character, which should appear in encoded form, is missing between 3 and 5.
This is a continuation of symfony/symfony#57597, where the problem affects the sanitization performed by html-sanitizer.
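
To make the report easier to verify, here is a minimal sketch that checks for the dropped character programmatically instead of by eye. It assumes masterminds/html5 is installed through Composer and that the autoloader lives at vendor/autoload.php. It looks at the parsed DOM text as well as the serialized output, since the loss may happen in either step, and it accepts both a raw and an encoded < in the serialized string.

```php
<?php
// Assumption: masterminds/html5 is installed via Composer in this directory.
require 'vendor/autoload.php';

use Masterminds\HTML5;

$html = '<img src="invalid-url" onerror="alert(\'XSS Attack prefix\')" /> 2 > 1 & 3 < 5 and some more text';

$html5 = new HTML5();
$dom = $html5->loadHTML($html);

// Inspect the parsed text directly, so that entity encoding in the
// serializer cannot mask the result.
$text = $dom->documentElement->textContent;
var_dump(strpos($text, '3 < 5') !== false); // false means "<" was lost during parsing

// Check the serialized output too, accepting a raw or an encoded "<".
$out = $html5->saveHTML($dom);
var_dump(strpos($out, '3 < 5') !== false || strpos($out, '3 &lt; 5') !== false);

// Parse errors recorded by the library, if any, may show where "<" was discarded.
print_r($html5->getErrors());
```

If the first check already prints false, the character is lost while tokenizing the input rather than while rendering it.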