Last Comment Bug 637957 - Make notifiers strong owners of mCallback, to prevent crashes due to unexpected events ordering
: Make notifiers strong owners of mCallback, to prevent crashes due to unexpect...
Status: RESOLVED FIXED
: [sg:moderate][hardblocker][tests in b...
:
Product: Toolkit
Classification: Components
Component: Storage
: unspecified
: All All
: -- normal (vote)
: mozilla2.0
Assigned To: Marco Bonardo [:mak]
: storage
:
: 638405 638123
: 629285 634796
  Show dependency treegraph
 
Reported: 2011-03-01 17:44 PST by Marco Bonardo [:mak]
Modified: 2011-05-24 23:37 PDT (History)
6 users (show)
See Also:
Crash Signature:
  ---
  ---
  ---
  ---
  ---
  ---
  ---
  ---
  ---
  final+
  ---
  .17+
  .17-fixed
  .19+
  .19-fixed


Attachments
patch v1.0 (5.21 KB, patch)
2011-03-02 07:12 PST, Marco Bonardo [:mak]
no flags Details | Diff | Splinter Review
patch v1.0 (8 lines context) (5.72 KB, patch)
2011-03-02 07:12 PST, Marco Bonardo [:mak]
sdwilsh: review+
Details | Diff | Splinter Review
patch v2.0 (1.31 KB, patch)
2011-03-02 09:42 PST, Marco Bonardo [:mak]
no flags Details | Diff | Splinter Review
patch v2.1 (1.48 KB, patch)
2011-03-02 09:50 PST, Marco Bonardo [:mak]
no flags Details | Diff | Splinter Review
1.9.2 rollup (10.34 KB, patch)
2011-03-07 12:47 PST, Shawn Wilsher :sdwilsh
no flags Details | Diff | Splinter Review
1.9.2 rollup (1.53 KB, patch)
2011-03-07 14:43 PST, Shawn Wilsher :sdwilsh
clegnitto: approval1.9.2.17+
Details | Diff | Splinter Review
1.9.1 rollup (1.45 KB, patch)
2011-03-07 14:46 PST, Shawn Wilsher :sdwilsh
clegnitto: approval1.9.1.19+
Details | Diff | Splinter Review

Summon comment box

Description Marco Bonardo [:mak] 2011-03-01 17:44:47 PST
See https://bugzilla.mozilla.org/show_bug.cgi?id=634796#c22

While spinnging the events loop in a HandleResult is not nice at all, owenrship should be changed so that it won't make us crash at random addresses.
Comment 1 Marco Bonardo [:mak] 2011-03-01 17:50:39 PST
Changing ownership here, should pay particularly attention that releases will happen on the right thread.
Comment 2 Shawn Wilsher :sdwilsh 2011-03-01 19:33:24 PST
(In reply to comment #1)
> Changing ownership here, should pay particularly attention that releases will
> happen on the right thread.
Releases will absolutely happen on the correct thread; the callback is being run on the thread it was called on (which is the one the releases should happen on).
Comment 3 Marco Bonardo [:mak] 2011-03-02 03:41:49 PST
(In reply to comment #2)
> (In reply to comment #1)
> > Changing ownership here, should pay particularly attention that releases will
> > happen on the right thread.
> Releases will absolutely happen on the correct thread; the callback is being
> run on the thread it was called on (which is the one the releases should happen
> on).

Yes it runs on the correct thread, but the runnable is not created on the same thread, so it could still release mCallback on the thread where the runnable was created rather than the thread where mCallback runs.
For example AsyncGetBookmarksForURI should be destroyed on the main-thread, the results notifier runs on the main thread, but is created and destroyed on the async thread afaict.
Comment 4 Marco Bonardo [:mak] 2011-03-02 04:49:32 PST
So, we have to release the ownership explicitely in Run(), this is fine, there is still the AddRef problem... Actually Run() could just:
AddRef
->HandleResult
Release
Comment 5 Marco Bonardo [:mak] 2011-03-02 07:12:11 PST
Created attachment 516261 [details] [review]
patch v1.0

The test is not perfect, but works. It is practically partially running on destruction, thus I use NS_RUNTIMEABORT, looks like I'm unable to make it properly fake-async (my tries with spinning ended up making it hang).
Comment 6 Marco Bonardo [:mak] 2011-03-02 07:12:41 PST
Created attachment 516262 [details] [review]
patch v1.0 (8 lines context)

better for review
Comment 7 Shawn Wilsher :sdwilsh 2011-03-02 08:23:25 PST
This fixes bug 634796, so I'm promoting this to hardblocker.  Review coming shortly...
Comment 8 Shawn Wilsher :sdwilsh 2011-03-02 09:10:16 PST
Comment on attachment 516262 [details] [review]
patch v1.0 (8 lines context)

>+++ b/storage/src/mozStorageAsyncStatementExecution.cpp
>+    // Hold a strong reference to the callback while notifying it, so that if it
>+    // should spin the events loop, won't be able to release it in spite of us.
>+    nsCOMPtr<mozIStorageStatementCallback> callback = mCallback;
nit: "...spin the event loop, the callback won't be released and freed out from under us."
Also, this needs to be done for ErrorCallback, with a test as well (easiest way to get that callback is to trigger a constraint error by trying to inset a duplicate primary key).

>+    // Ensure to release our reference on this thread.
>+    callback = 0;
You don't need to do this.  nsCOMPtr's destructor will run the moment this method returns, which will happen on this thread.

>+++ b/storage/test/test_HandleResult_spinEventsLoop.cpp
How about we go with test_async_callbacks_with_spun_event_loops.cpp?

I don't think the test actually tests this, however.  With that said, I'm OK with getting the fix in, and then spending time out of the release critical path to get the test right.  What we really want to test is this:
1) Fire off a query that would get a handleResult callback (ideally, just one).
2) When we get the handleResult call, spin the event loop until we get a handleCompletion callback
3) Set a heap variable that isn't a member of the callback object to true indicating that the object is currently alive.  In the callbacks destructor, we'll want to set this to false.  (we may need to create this heap variable before step 4, and may set it earlier too)
4) At this point, our handleResult spinning will be complete, and we should ensure that our heap variable that is set to false when our destructor runs is still true.

The above test should fail without your patch.  You can replace handleResult with handleError for the additional testcase I asked for.

r=sdwilsh on the code changes.  We need to work on the test more though, so file a follow-up please.
Comment 9 Marco Bonardo [:mak] 2011-03-02 09:18:30 PST
> 1) Fire off a query that would get a handleResult callback (ideally, just one).

the test does this

> 2) When we get the handleResult call, spin the event loop until we get a
> handleCompletion callback

and does this

> 3) Set a heap variable that isn't a member of the callback object to true
> indicating that the object is currently alive.  In the callbacks destructor,
> we'll want to set this to false.  (we may need to create this heap variable
> before step 4, and may set it earlier too)

it's done through a local property of the callback, I like the idea of a global var though, or maybe a static property.

> 4) At this point, our handleResult spinning will be complete, and we should
> ensure that our heap variable that is set to false when our destructor runs is
> still true.

and does this

> The above test should fail without your patch.

and does this!

My test does exactly what you said, apart using a global variable, could be made more readable maybe.

 You can replace handleResult
> with handleError for the additional testcase I asked for.

will do
Comment 10 Marco Bonardo [:mak] 2011-03-02 09:33:26 PST
I'm splitting the tests to bug 638123, since Shawn currently cannot do a full review of it and we can come with a more polished test.

The fix is pretty much straight-forward and I'll run the tests locally before pushing since they already work fine, we can land the tests in the next days on tinderboxes once properly reviewed.
Comment 11 Marco Bonardo [:mak] 2011-03-02 09:42:04 PST
Created attachment 516289 [details] [review]
patch v2.0

Addressed comments.

Before pushing I'll fix and run tests locally, then will attach them to the followup for review (so this won't block the release).
Comment 12 Marco Bonardo [:mak] 2011-03-02 09:50:41 PST
Created attachment 516292 [details] [review]
patch v2.1

Per IRC discussion.
Comment 13 Marco Bonardo [:mak] 2011-03-02 10:57:48 PST
I attached tests I used locally to bug 638123, after double checking they fail before the patch and pass after.  Those tests will land apart per comment 10.

http://hg.mozilla.org/mozilla-central/rev/290403971149
Comment 14 Marco Bonardo [:mak] 2011-03-02 12:36:53 PST
pushed a follow-up, I forgot to properly QI and luckily a Places test complained!
http://hg.mozilla.org/mozilla-central/rev/38028bf13fb2
Comment 15 Shawn Wilsher :sdwilsh 2011-03-02 13:06:23 PST
(In reply to comment #14)
> pushed a follow-up, I forgot to properly QI and luckily a Places test
> complained!
> http://hg.mozilla.org/mozilla-central/rev/38028bf13fb2
Er, what?  mCallback in both methods is a  mozIStorageStatementCallback*, so that QI doesn't do anything useful.
Comment 16 Shawn Wilsher :sdwilsh 2011-03-02 13:08:39 PST
...and yet that is very clearly what that assertion is saying.  Can somebody explain to me why?  I'm guessing XPConnect doesn't create the QI until it's needed, but would love a confirmation.
Comment 17 Marco Bonardo [:mak] 2011-03-02 13:15:13 PST
That fixed the error locally, i preferred avoiding a permaorange.
Looking at the thing, looks like there is some wrong iSupports inheritance in nsNavHistoryFolderResultNode, when test_async.js runs it probably doesn't cast it properly due to a bogus NS_IMPL_ISUPPORTS_INHERITED
Comment 18 Marco Bonardo [:mak] 2011-03-02 13:23:25 PST
or most likely ambiguous inheritance
Comment 19 Shawn Wilsher :sdwilsh 2011-03-02 19:49:21 PST
(In reply to comment #17)
> That fixed the error locally, i preferred avoiding a permaorange.
> Looking at the thing, looks like there is some wrong iSupports inheritance in
> nsNavHistoryFolderResultNode, when test_async.js runs it probably doesn't cast
> it properly due to a bogus NS_IMPL_ISUPPORTS_INHERITED
I'm still pretty sure we shouldn't have to do that.  Can you file a bug about it so we can get the right fix in eventually please?
Comment 20 Marco Bonardo [:mak] 2011-03-03 04:02:06 PST
sure, fwiw Neil suggested we could use a RefPtr to save the QI
Comment 21 Shawn Wilsher :sdwilsh 2011-03-04 07:10:39 PST
This can fix some crashes on branch (especially when Sync is installed).
Comment 23 Daniel Veditz 2011-03-07 10:44:10 PST
Does this apply to the 1.9.1 branch also?

We'll need roll-up patches to approve for the branches. Looks like there were at least two chunks checked in to fix this.
Comment 24 Shawn Wilsher :sdwilsh 2011-03-07 10:54:29 PST
(In reply to comment #23)
> Does this apply to the 1.9.1 branch also?
It should, yeah.

> We'll need roll-up patches to approve for the branches. Looks like there were
> at least two chunks checked in to fix this.
I'll do that later today.
Comment 25 Shawn Wilsher :sdwilsh 2011-03-07 12:47:35 PST
Created attachment 517502 [details] [review]
1.9.2 rollup
Comment 26 Shawn Wilsher :sdwilsh 2011-03-07 12:49:37 PST
Comment on attachment 517502 [details] [review]
1.9.2 rollup

This is very clearly not what I intended to upload...
Comment 27 Shawn Wilsher :sdwilsh 2011-03-07 12:55:53 PST
Somehow, my 1.9.2 tree was totally hosed.  Repulling and trying again...
Comment 28 Shawn Wilsher :sdwilsh 2011-03-07 14:43:28 PST
Created attachment 517555 [details] [review]
1.9.2 rollup
Comment 29 Shawn Wilsher :sdwilsh 2011-03-07 14:46:47 PST
Created attachment 517556 [details] [review]
1.9.1 rollup

Filename difference here and we use a different method name for checking if we should run (runEvent vs shouldNotify).  Otherwise, it's all the same.
Comment 30 Shawn Wilsher :sdwilsh 2011-03-09 13:34:44 PST
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/e32983ecfaf9
Comment 31 Shawn Wilsher :sdwilsh 2011-03-09 13:45:39 PST
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/425048c1d04f

Note You need to log in before you can comment on or make changes to this bug.