Bugzilla@Mozilla – Bug 501934
Crash [@ nsXBLBinding::GenerateAnonymousContent] with DOMAttrModified and oncommand removing element and xbl:inherits="xbl:text"
Last modified: 2010-05-09 15:52:16 PDT
Summon comment box
Created attachment 386497 [details] testcase See testcase, which crashes current trunk build and Firefox 3 after 100ms. http://crash-stats.mozilla.com/report/index/b25ff7b7-dfdb-4b4e-b896-312272090702?p=1 0 xul.dll xul.dll@0x421a98 1 xul.dll nsBaseHashtable<nsISupportsHashKey,nsCSSFrameConstructor::RestyleData,nsCSSFrameConstructor::RestyleData>::s_EnumStub obj-firefox/dist/include/nsBaseHashtable.h:346 2 xul.dll PL_DHashTableEnumerate obj-firefox/xpcom/build/pldhash.c:754 3 xul.dll nsXBLBinding::GenerateAnonymousContent content/xbl/src/nsXBLBinding.cpp:794 4 xul.dll nsXBLService::LoadBindings content/xbl/src/nsXBLService.cpp:664 5 xul.dll nsElementSH::PostCreate dom/base/nsDOMClassInfo.cpp:7620 6 xul.dll XPCWrappedNative::GetNewOrUsed js/src/xpconnect/src/xpcwrappednative.cpp:599 Firefox 3 crash report: http://crash-stats.mozilla.com/report/index/aae1cfba-71ac-4698-b21c-6100c2090702?p=1 0 xul.dll nsCOMPtr<nsIRDFNode>::operator= nsCOMPtr.h:713 1 xul.dll RealizeDefaultContent mozilla/content/xbl/src/nsXBLBinding.cpp:548 2 xul.dll nsBaseHashtable<nsURIHashKey,CacheScriptEntry,CacheScriptEntry>::s_EnumStub nsBaseHashtable.h:346 3 xul.dll PL_DHashTableEnumerate pldhash.c:724 4 xul.dll nsXBLBinding::GenerateAnonymousContent mozilla/content/xbl/src/nsXBLBinding.cpp:775 5 xul.dll nsXBLService::LoadBindings mozilla/content/xbl/src/nsXBLService.cpp:635 6 xul.dll nsElementSH::PostCreate mozilla/dom/src/base/nsDOMClassInfo.cpp:7236 7 xul.dll XPCWrappedNative::GetNewOrUsed mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:546 8 xul.dll XPCConvert::NativeInterface2JSObject mozilla/js/src/xpconnect/src/xpcconvert.cpp:1107 9 xul.dll XPCConvert::NativeData2JS mozilla/js/src/xpconnect/src/xpcconvert.cpp:482 10 xul.dll XPCWrappedNative::CallMethod mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2481 11 xul.dll XPC_WN_CallMethod mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp:1473 12 js3250.dll js_Invoke mozilla/js/src/jsinterp.c:1310 13 js3250.dll js_Interpret mozilla/js/src/jsinterp.c:4883 14 js3250.dll js_Invoke mozilla/js/src/jsinterp.c:1326 15 js3250.dll JS_CallFunctionValue mozilla/js/src/jsapi.c:5058 16 xul.dll nsJSContext::CallEventHandler mozilla/dom/src/base/nsJSEnvironment.cpp:1962 17 xul.dll nsGlobalWindow::RunTimeout mozilla/dom/src/base/nsGlobalWindow.cpp:7900 18 xul.dll nsGlobalWindow::TimerCallback mozilla/dom/src/base/nsGlobalWindow.cpp:8231 19 xul.dll nsTimerImpl::Fire mozilla/xpcom/threads/nsTimerImpl.cpp:400 20 xul.dll nsTimerEvent::Run mozilla/xpcom/threads/nsTimerImpl.cpp:490 21 xul.dll nsThread::ProcessNextEvent mozilla/xpcom/threads/nsThread.cpp:510 22 xul.dll nsBaseAppShell::Run mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp:170 23 nspr4.dll PR_GetEnv 24 firefox.exe wmain mozilla/toolkit/xre/nsWindowsWMain.cpp:87 25 firefox.exe firefox.exe@0x217f 26 kernel32.dll kernel32.dll@0x17076
It doesn't seem to crash online, so you need to download the testcase to your computer and test there, to see the crash.
Created attachment 386506 [details] stack Deleting current nsXBLInsertionPoint while iterating mInsertionPointTable
Created attachment 386507 [details] Hashtable assertion stack We could use a nsRefPtr to avoid the above, but we would still get an assertion on a recursive hash table op...
Created attachment 386509 [details] [review] Like so? Wrapping the enumerate op in a script blocker fixes it.
That makes sense.
Can we deal with script running when that scriptblocker goes out of scope?
Ping?
Mats, can you drive this one in for 1.9.2?
I'm reviewing the code to answer comment 6, and if there are additional crash paths...
Mats, any updates here?
Mats, any updates? We need to get this in soon...
I got interrupted by a couple of other bugs, sorry. I'll have a look at this one now...
The crash stopped occurring on trunk 2009-10-07-03 -- 2009-10-08-03 http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4d5760a4f96f&tochange=abe269bb23ef No crash on 1.9.2 tip either, but it does crash 1.9.1 tip. I suspect the underlying bug is still there though, also on trunk. I get this when loading the testcase on trunk: JavaScript error: event.target.parentNode is null I'm going to track down the exact change that fixed the bug/broke the test...
It's changeset eef814b58a9c that made it not crash anymore: "Bug 502567. Get rid of the silly ShouldBuildChildFrames check." It doesn't seem like a real fix though. Boris?
Yeah, I really don't know why that would have affected this bug, other than both being involved with xbl:text.
Created attachment 412447 [details] Testcase #2 In a pre-Bug 502567 tree, this testcase doesn't crash but triggers: ###!!! ASSERTION: op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0: 'op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0', file pldhash.c, line 612 ###!!! ASSERTION: Insertion parent should have NODE_IS_INSERTION_PARENT flag!: '!aParent || aParent->HasFlag(NODE_IS_INSERTION_PARENT)', file nsBindingManager.cpp, line 599 (so probably worth making a regression test from)
When bug 502567 removed the ShouldBuildChildFrames() from the if-condition it made LoadBindings() occur earlier (in AddFrameConstructionItemsInternal). Before bug 502567 it was delayed for the element with xbl:inherits="xbl:text" until nsElementSH::PostCreate(), see the attached "Hashtable assertion stack". It looks to me as it might still be possible to trigger the crash using document.addBinding but I haven't been able to so far.
Created attachment 412484 [details] [review] Patch rev. 2 Protect all "mInsertionPointTable->Enumerate" with nsAutoScriptBlocker. Remove a redundant null document check. Add a strong ref on "content". Re: comment 6 - Yes, I believe we can handle script running when they go out of scope in these blocks.
Created attachment 412485 [details] [review] crashtest.diff
Created attachment 412488 [details] [review] for 1.9.0 (including crashtests) Minor difference due to an existing nsAutoScriptBlocker. Moved it a bit earlier so the resulting code is the same as in the trunk patch.
Boris, now that you pointed it out in bug 529087, we do in fact access mBoundElement after the nsAutoScriptBlocker goes out of scope. http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLBinding.cpp#814 So, I should probably just wrap that part too, i.e. all of nsXBLBinding::GenerateAnonymousContent() except the first few lines that calls mNextBinding->GenerateAnonymousContent(); Or, wrap the loop at the end inside a "if (mBoundElement)" depending on what the answer is to your question in bug 529087 comment 3.
Mats, how close to a fix do you think we are here? We're really close to the code freeze and we'd love to see this blocker fixed before the freeze!
Created attachment 412771 [details] [review] Patch rev. 3 (diff -w) Blocking script locally in GenerateAnonymousContent() isn't enough, because we can't trust the weak ref 'mBoundElement' so we would crash in the AllowScripts() call in nsXBLBinding::InstallEventHandlers() instead (as in bug 529087). Fortunately there is only one consumer, nsXBLService::LoadBindings: http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLService.cpp#623 Wrapping the whole block, line 623 -- 639, should be enough to fix that. (A variation, if we can't block script through all of that, would be to hold a kungFuDeathGrip on 'aContent' here (same as 'mBoundElement') and block script in a narrower scope) Also, in nsXBLBinding::ChangeDocument we have the same problem with using of 'mBoundElement' (and maybe also 'aOldDocument') after the script blocker goes out of scope, so we need a larger scope there.
Comment on attachment 412771 [details] [review] Patch rev. 3 (diff -w) Looks reasonable.
Comment on attachment 412771 [details] [review] Patch rev. 3 (diff -w) Security bug, so need sr too.
http://hg.mozilla.org/mozilla-central/rev/d0badc9bc496 Will push the crashtests when older branches have shipped the fix.
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/28c179d1b30e
Comment on attachment 412771 [details] [review] Patch rev. 3 (diff -w) Approved for 1.9.1.7 and 1.9.0.17, a=dveditz for release-drivers
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/e2e5b13d483c CVS trunk: mozilla/content/xbl/src/nsXBLBinding.cpp 1.269 mozilla/content/xbl/src/nsXBLService.cpp 1.261
Did this cause orange on the Firefox3.5-Unittest tree?
I don't think so, the U/M/E unit test runs for my changeset were all green except for one mochitest failure in test_videocontrols_audio_direction.html which I think is a known intermittent orange. The run after my change has a few orange M with failed local storage tests - I don't think these are related to XBL at all. There's also some orange E with a lot of tests timing out, hard to say what that could be... Is there a way we can force more unit test runs so we can get a few more cycles to see if it's consistent?
The 3.5 unit test Tinderboxes are green now.
Yeah, sorry, I should have followed up. Talking with the build team, apparently it was building the tag (which was at the tip) not actually mozilla-1.9.1. Checking in something to mozilla-1.9.1 made it go green.
Verified for 1.9.1 using the nightly build on XP SP3 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8pre) Gecko/20100115 Shiretoko/3.5.8pre (.NET CLR 3.5.30729) and local testcase. Same testcase crashes 1.9.1.7 reliably when run.
Verified for 1.9.0.18 using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.16pre) Gecko/2010020215 GranParadiso/3.0.18pre (.NET CLR 3.5.30729) (my debug). Testcases crash 1.9.0.17 reliably when run locally and don't affect the 1.9.0.18 build.