Bugzilla@Mozilla – Bug 228856
[FIX] \0 in CSS is ignored
Last modified: 2009-02-09 11:30:05 PST
Summon comment box
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.6b) Gecko/20031213 Build Identifier: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.6b) Gecko/20031213 In CSS, \0 means a null character. But it is ignored by Mozilla. (ex. "c\0olor:red;" is treated as "color:red;".) BTW, a raw null character in CSS is treated correctly as null by Mozilla. (ex. "c*olor:red;" (*=null character(x00)) is not treated as "color:red".) Reproducible: Always Steps to Reproduce: Actual Results: \0 in CSS is ignored. Expected Results: \0 in CSS is treated as a null character. (equal to a raw null character) This \0 ignoring has possibility to avoid sanitizing of web application (webmail etc.). Internet Explorer executes JScript in CSS so \0 can be a vulnerability. http://altba.com/bakera/bug/nul-test.html http://www.st.ryukoku.ac.jp/%7Ekjm/security/memo/2003/12.html#20031216_IE Though Mozilla does not executes JavaScript in CSS (so far as I know), it can use for web bug easily. (ex. @i\0mport )
Created attachment 137652 [details] testcase for \0 c\0olor: red;
Created attachment 137653 [details] testcase of raw null c(null)olor: red;
http://www.w3.org/TR/CSS21/syndata.html#q6 says (third bullet): "Third, backslash escapes allow authors to refer to characters they can't easily put in a document. In this case, the backslash is followed by at most six hexadecimal digits (0..9A..F), which stand for the ISO 10646 ([ISO10646]) character with that number. If a character in the range [0-9a-zA-Z] follows the hexadecimal number, the end of the number needs to be made clear. There are two ways to do that: 1. with a space (or other whitespace character): "\26 B" ("&B"). In this case, user agents should treat a "CR/LF" pair (13/10) as a single whitespace character. 2. by providing exactly 6 hexadecimal digits: "\000026B" ("&B") " AFAICT, the two ways mentioned above to write zero is: c\0 olor: red; c\000000olor: red; which means that your example has an illegal unicode escape sequence. The question is, shouldn't this property name fail to parse in this case?
I think I remember seeing a comment in www-style: there's a typo in the CSS21 spec. That paragraph should read: "If a character in the range [0-9a-fA-F] follows the hexadecimal number, the end of the number needs to be made clear" In other words, you only need a space if the end of the escape is ambiguous (e.g., 'eat\0food'). 'c\0olor' is unambigous and (as far as I can see) valid.
Here's the comment I was thinking of, for completeness: http://lists.w3.org/Archives/Public/www-style/2003Oct/0075.html
Sounds reasonable to me. Confirming bug.
Created attachment 137699 [details] testcase of \0 (legal with CSS 2.1 Working Draft, 15 September 2003) legal backslash escape suggested by Mats. \0 is ignored also. (same result attachment 137652 [details])
Mats: > The question is, shouldn't this property name fail to parse in this case? Yes. I think not to ignore \0 or \000000 specially. \0 or \000000 should be unescaped to a null character as well as other backslash escapes, even if it breaks the property name.
I don't fully understand the previous comment, but I'd note that CSS 2.1 (even the current WD) doesn't place any restrictions on identifier names, specifically: "In CSS 2.1, identifiers ... can also contain ... any ISO 10646 character as a numeric code".
To Malcom. I'm sorry that I confused you. I could not explain my thought in English well. (Also, I may misunderstand the opinion of Mats.) It is certain that I misused the word, "specially". My opinion is only that \0 and \000000 in CSS should not be ignored by Mozilla. Other backslash escaped characters aren't ignored by Mozilla. A raw null character isn't ignored too. Only \0 and \000000 are ignored. It is strange behavior. I used "specially" to intend to point out the difference.
yes, nobody is arguing this is not a bug. :-)
I think the problem is basically here: http://lxr.mozilla.org/mozilla/source/content/html/style/src/nsCSSScanner.cpp#75 9 746 PRBool nsCSSScanner::GatherIdent(nsresult& aErrorCode, PRInt32 aChar, 747 nsString& aIdent) ... 758 if (aChar == CSS_ESCAPE) { 759 aChar = ParseEscape(aErrorCode); 760 if (0 < aChar) { 761 aIdent.Append(PRUnichar(aChar)); 762 } We don't append the unescaped character if it's NUL. AFAICS, this is only because ParseEscape overloads its return value so that zero means either 'no character' or 'ASCII NUL'. In ParseEscape, 'No character' is used only once (line 730), which deals with <backslash><newline>. Oddly, <backslash> at the end of the stream will return <backslash> (line 674). I can't see that ParseEscape will ever return zero otherwise, except where a legitimate NUL is escaped. So, it would help if we changed ParseEscape to return -1 for 'no character' in the <bs><nl> case (unless Ian has any better ideas) -- line 730 of nsCSSScanner.cpp. However, the output is written to an nsString, which (IIRC) is a zero- terminated string. I think it would be 'unhappy' if it contained a NUL character. So to fix this, we'd need to: 1. Change lines that check the return value of ParseEscape to allow a zero return to mean 'NUL' -- lines 632, 752, 760 (as above), 954. 2. Write the output into something that can accept NULs. 3. Change consumers to use that 'something'. As well as identifiers, this also affects \0 in url's and quoted strings. I don't know what the effects might be of allowing NUL's into strings further up the chain, though.
Ian: any chance you could suggest that the NUL character (escaped or unescaped) should be disallowed (ignored?) in CSS files entirely? :) Seriously, I can't see any good reason for allowing it, and it sounds like a nightmare to implement. Backwards compatibility with CSS1/CSS2 might be one reason, but I doubt any existing CSS parsers handle the situation correctly.
I don't really see that it should be disallowed, although admittedly I see very few reasons for allowing it.
Actually, maybe an nsString can contain NUL characters after all? http://www.mozilla.org/projects/xpcom/string-guide.html#Concrete_Classes says it's NUL-terminated, but the implementation has a length count, so maybe it can.
It can't contain null characters. We should probably just put in 0xfffd.
Not that it would matter in the real world, but that would technically make us non-compliant since the following two rules would become equivalent: .t\0 est { ... } .t\fffd est { ... } ...both matching: <div class="t�est"></div> ...and neither matching: <div class="t�est"></div> But on the other hand, that would never matter in the real world.
Any fix for this should probably also take into account the embedded NUL case (attachment 137653 [details]). At the moment, I suspect we parse '.c<NUL>olor' as '.c'. I don't think there's any way to allow NUL characters (escaped or not) without also breaking something else. dbaron's comment 16 sounds like a sensible compromise.
Created attachment 137905 [details] Testcase #4, 'content' with embedded NUL
Created attachment 137906 [details] [review] Patch rev. 1 David, I think you're wrong regarding nsString - it handles embedded NUL characters just fine. This bug is caused by three separate problems. 1. nsCSSScanner::ParseEscape returns zero to indicate it found an escaped newline (that the caller should ignore). 2. nsCSSValue and nsStyleContentData uses PRUnichar* which results in truncation at the first embedded NUL. 3. CSS class and property names are Atomized but nsAtomTable and pldhash.c doesn't handle strings/keys with embedded NULs (in fact truncates them so "col\0or" matches "col"). This patch fixes 1. and 2.
I could take at stab at 3. as well but I'm not sure if there are other plans (or bugs) for Atoms that should be considered at the same time...
Comment on attachment 137905 [details] Testcase #4, 'content' with embedded NUL Just a note to point out that this testcase doesn't actually include any embedded NUL characters (by which I meant an actual, literal, binary ASCII zero character).
Comment on attachment 137906 [details] [review] Patch rev. 1 > PRBool nsCSSScanner::ParseNumber(nsresult& aErrorCode, PRInt32 c, >@@ -947,11 +947,8 @@ > } > if (ch == CSS_ESCAPE) { > ch = ParseEscape(aErrorCode); >- if (ch < 0) { >- return PR_FALSE; >- } > } >- if (0 < ch) { >+ if (0 <= ch) { > aBuffer.Append(PRUnichar(ch)); > } > } This changes the return value of ParseNumber from PR_FALSE to PR_TRUE in the <bs><nl> case. Was that intentional? >RCS file: /cvsroot/mozilla/layout/html/base/src/nsFrameManager.cpp,v >RCS file: /cvsroot/mozilla/layout/html/style/src/nsCSSFrameConstructor.cpp,v Why the changes from nsAutoString to nsString here? You should be able to initialise an nsAutoString from a nsAString.
> 3. CSS class and property names are Atomized but nsAtomTable > and pldhash.c doesn't handle strings/keys with embedded NULs FYI: Nothing inherent in pldhash.c precludes counted strings rather than terminated strings as keys. /be
The fact is that pldhash.c does not _support_ counted strings natively and working around that limitation was tricky... Taking bug since I have a complete patch now.
Re: comment 23 > This changes the return value of ParseNumber from PR_FALSE to > PR_TRUE in the <bs><nl> case. No, it doesn't. The old ParseEscape() returns zero for this case and it was rightly ignored by both the cited removed tests. The new version of ParseEscape() now returns -1 for this case and it's still ignored. (Please note that the old ParseEscape() never returned negative values at all, so the removed |return PR_FALSE| is just dead code elimination.) Also note that the cited code is in nsCSSScanner::GatherString() and not in nsCSSScanner::ParseNumber() - this is probably "cvs diff -pu" not being very good at parsing C++. (I have not changed ParseNumber at all). You are right about the behaviour though - both the old and new version fails on <bs><nl> when a number is expected but AFAICT this is not a bug. This <bs><nl> syntax is only allowed for strings, see: http://www.w3.org/TR/CSS21/syndata.html#q6 > You should be able to initialise an nsAutoString from a nsAString. Yes, but I believe nsString(nsString& x) is more efficient than nsAutoString(nsString& x).
Created attachment 138243 [details] Testcase #5, extensive tests for NUL and \0
Created attachment 138244 [details] [review] Patch1 rev. 2 (Style System) There are two changes since rev. 1: 1. changed nsAString* to nsString* in the nsStyleContentData struct. 2. I discovered an additional embedded NUL problem; in the special parsing of the "class" HTML attribute value. (function ParseClasses() in file content/html/style/src/nsHTMLAttributes.cpp)
Created attachment 138245 [details] [review] Patch2 rev. 1 (XPCOM) This is the XPCOM part of the fix. The changes to nsStaticCaseInsensitiveNameTable are straightforward, (we save a pointer in NameTableEntry). The nsAtomTable changes are trickier, mostly because I wanted to keep the type nsStaticAtom as is. I used the same ptr|0x1 trick that is already used to mark a "wrapped atom". I had to add a 32 bit length field to nsAtomImpl though. I also want to point out an unrelated bug I found while experimenting with different solutions. At one point I had added a member to nsStaticAtomWrapper and that caused crashes because data was overwritten: - PL_ARENA_ALLOCATE(mem, gStaticAtomArena, sizeof(nsStaticAtom)); + PL_ARENA_ALLOCATE(mem, gStaticAtomArena, sizeof(nsStaticAtomWrapper)); If anyone can give me some feedback on the performance of this patch that would be appreciated. I do see a couple of things that can be done to improve performance a bit more if this patch is accepted.
Technically nsString doesn't allow embedded null characters, even if they work, since nsString is-a nsAFlatString, which has a get method.
Regarding the <bs><nl> syntax; We support it for identifiers as well as for strings, eg .na\ me is parsed as .name but I can't find anything in the CSS2.1 spec. regarding this, is this just an oversight in the spec. or is it a bug?
http://www.w3.org/TR/2004/CR-CSS21-20040225/syndata.html#q6 forbids character escapes for the zero-character.
So should \0 then cause the whole declaration/rule/whatever to be ignored?
I think the \ should tokenize as DELIM just like it would if it were followed by something that wasn't a hex digit.
(In reply to comment #34) > I think the \ should tokenize as DELIM just like it would if it were followed > by something that wasn't a hex digit. If it were followed by something that wasn't a hex digit, it would escape that character, not tokenise as DELIM.
Comment on attachment 138244 [details] [review] Patch1 rev. 2 (Style System) Per recent changes to CSS, this isn't what we want to do. I'm not entirely sure what is, though...
Comment on attachment 138245 [details] [review] Patch2 rev. 1 (XPCOM) I'm not sure how much of this patch is still relevant, but a few comments: * have you thought about the effects of this change on performance? Some of this code is used a lot. >Index: xpcom/ds/nsStaticNameTable.h >-class NS_COM nsStaticCaseInsensitiveNameTable >+class NS_COM nsStaticCaseInsensitiveNameTable : >+ public PLDHashTable > { > public: > enum { NOT_FOUND = -1 }; >@@ -69,7 +70,6 @@ public: > > private: > nsDependentCString* mNameArray; >- PLDHashTable mNameTable; > nsDependentCString mNullStr; > }; Why did you need to do this? Could you avoid making this public?
Created attachment 178082 [details] [review] work in progress on what I think we should do with \0 The callers of GatherIdent as well and the printing of error tokens need a good bit of additional work.
Anything we can do to poke this bug back into life?
Created attachment 343767 [details] [review] new minimally invasive fix; doesn't quite work All the patches on here are quite invasive and no longer applicable to trunk, so here's a minimum-necessary-change fix versus moz-central, that hopefully can go on many of the live branches, either as is or with adjustment for churn. It modifies nsCSSScanner::ParseAndAppendEscape and nothing else; and what it does is apply the \-<any character other than a hex digit> rule to \0, so e.g. "\000" is equivalent to "000". Note that this is *not* what Mats' test cases expect; they expect \0 (any form) to be equivalent to a literal NUL, but that is clearly forbidden by CSS2.1 section 4.1.3 ("[the number] must not be zero"). There are two reftests in here. One checks for *exactly* the behavior I coded, inside strings. The other began as Mats' testcase #5 but no longer bears any resemblance; it checks all the cases I could think of where a \0 or literal NUL character should prevent a style rule from matching. Four out of the umpteen subcases of this test fail; these constructs #id.\0 { color: red } .cl#\0 { color: red } are matching <div id="id" class="�"> and <div class="cl" id="�"> respectively, but it seems like they shouldn't. If I could get a little help with that, we could maybe put this one to bed soon. (I regret that some of the new files are uuencoded and thus not inspectible in-browser. They contain embedded NULs.)
(In reply to comment #40) > Four out of the umpteen > subcases of this test fail; these constructs > > #id.\0 { color: red } > .cl#\0 { color: red } > > are matching <div id="id" class="�"> and <div class="cl" id="�"> > respectively, but it seems like they shouldn't. If I could get a little help > with that, we could maybe put this one to bed soon. Are the class and ID attributes in those cases what you expect them to be? If so, are we using string comparison functions for the matching in question that don't handle embedded NULs?
Comment on attachment 343767 [details] [review] new minimally invasive fix; doesn't quite work Other than that, the patch looks fine, though, but marking review- because of the failing tests.
Created attachment 344334 [details] [review] (rev 2) test cases fixed [Checkin: Comment 45] It turns out the test failures were just because of errors in the test itself. I had accidentally reused class and ID names, so rules that were meant for other subcases were matching the problem subcases. I did some experiments using JS to check on the attribute values, and they are coming out the way they're supposed to.
Comment on attachment 344334 [details] [review] (rev 2) test cases fixed [Checkin: Comment 45] ok, r+sr=dbaron
Comment on attachment 344334 [details] [review] (rev 2) test cases fixed [Checkin: Comment 45] http://hg.mozilla.org/mozilla-central/rev/9f05f41462ee
dveditz: shall I prepare patches for older branches? There may be a little adjustment needed.
I suspect to get this in on branches you'd have to prepare patches and request approval; people are unlikely to come looking for you.
By reading this bug report, I was under the impression this could be used as a hack to specifically target Gecko (like the IE underscore or star hacks). If anyone thought the same and actually used it on their site, then backporting this patch to branches might be a bad idea.
A couple of casual web searches reveals a number of people asking whether any such hack exists, but nobody (that I saw) describing this as a hack to target Gecko; and in my opinion we should backport the patch, because the possibility of malicious people using the bug to sneak CSS past a filter is of greater concern. I've got a tested 1.9.0.x branch patch now and will have a 1.8.0.x branch patch shortly.
Created attachment 345026 [details] [review] patch for 1.9.0.x branch Here's the patch for 1.9.0.x. Beware raw NUL characters in the diff; after applying it you should make sure that the files layout/reftests/bugs/228856-2.html and layout/reftests/bugs/228856-2-ref.html are byte-for-byte identical to the files of the same name that currently exist in mozilla-central. It's probably too late for this to make 1.9.0.4 but I'm flagging it for that release anyway - it *is* a security issue and it's a low-risk patch IMO. 1.9.0.x's reftests pass with the patch applied.
Created attachment 345028 [details] [review] patch for 1.8.1.x (MOZILLA_1_8_BRANCH) Here's the patch for 1.8.1.x. Again, I'm flagging for approval for .18 and .19 despite expecting that this won't make .18, because this is a relatively safe fix for a security issue. There are no reftests on MOZILLA_1_8_BRANCH so this patch, unlike the others, does not include tests. I intended to manually test it, but I got link errors below gfx/ due to some sort of problem with freetype. The modified file itself compiles fine. I'm assuming this doesn't need fixed in 1.8.0.x (Firefox 1.5) but I'll do it if someone wants.
Comment on attachment 345026 [details] [review] patch for 1.9.0.x branch approved for 1.9.0.5 and 1.8.1.19, a=dveditz for release-drivers (please watch tinderbox for the tree to open before landing)
Fix checked into 1.8 and 1.9.0 branches
Verified for 1.9.0 since I see the new tests passing. Is there a good way to verify for 1.8? The manual testcases above don't turn all green in either 1.9.0 or 1.8 (though they turn more green and the results are consistent between 1.9.0 and 1.8).
(In reply to comment #54) > Verified for 1.9.0 since I see the new tests passing. Is there a good way to > verify for 1.8? The manual testcases above don't turn all green in either 1.9.0 > or 1.8 (though they turn more green and the results are consistent between > 1.9.0 and 1.8). By 'manual testcases' you mean the first four attachments to this bug and the sixth? The behavior as finally agreed upon is not what those expect. You can use the reftests from the m-c patch as manual testcases - http://hg.mozilla.org/mozilla-central/raw-file/9f05f41462ee/layout/reftests/bugs/228856-2.html (should be all green except for headings) and http://hg.mozilla.org/mozilla-central/raw-file/9f05f41462ee/layout/reftests/bugs/228856-1-ref.html (compare to http://hg.mozilla.org/mozilla-central/raw-file/9f05f41462ee/layout/reftests/bugs/228856-1-ref.html ) I don't remember if either of those tests needs other CSS features not available in 1.8, though.
Thanks, Zack. Those both seem to work and it looks good in 1.8.1.19: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.19pre) Gecko/2008120103 BonEcho/2.0.0.19pre. Marking as verified.
for 1.8.0 we need patch from bug 316394 (and regression patch from 316859)
Verified on 1.9.1 with builds on OS X and Windows and the mentioned tests from comment 55 Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090204 Shiretoko/3.1b3pre ID:20090204020327
Comment on attachment 345028 [details] [review] patch for 1.8.1.x (MOZILLA_1_8_BRANCH) a=asac for 1.8.0
(In reply to comment #59) > (From update of attachment 345028 [details] [review]) > a=asac for 1.8.0 Do you need me to revise the patch for 1.8.0 or is it good as is?