Fix a data race that could manifest as null pointer dereference in FutureBase::Release() #747

Merged
merged 7 commits into from
Nov 16, 2021

@dconeybe dconeybe commented Nov 12, 2021

The FutureBase class has two data members that can be accessed concurrently from multiple threads without any synchronization: api_ and handle_. This data race was detected by the thread sanitizer.

Although there are no known customer reports of crashes resulting from this data race, a recent continuous integration run crashed in Firestore with a null pointer dereference in the following stack trace:

firebase::FutureBase::Release()
firebase::FutureBase::~FutureBase()
firebase::Future<void>::~Future()
firebase::firestore::FirestoreIntegrationTest_FirestoreCanBeDeletedFromTransaction_Test::TestBody()

The test in question is FirestoreCanBeDeletedFromTransaction:

TEST_F(FirestoreIntegrationTest, FirestoreCanBeDeletedFromTransaction) {
  Firestore* db = TestFirestore();
  DisownFirestore(db);
  auto future = db->RunTransaction(
      [](Transaction&, std::string&) { return Error::kErrorOk; });
  std::promise<void> callback_done_promise;
  auto callback_done = callback_done_promise.get_future();
  future.AddOnCompletion([&](const Future<void>&) mutable {
    delete db;
    callback_done_promise.set_value();
  });
  Await(future);
  callback_done.wait();
}

This test deletes the Firestore instance from an auxiliary thread, which causes all of the Future objects it owns to be "released" on that thread. In rare cases, that release races with the destruction of the Future object on the test thread and yields the null pointer dereference.

This PR fixes the data race by protecting those two members with a mutex.
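
As a rough illustration of the idea (not the code from this PR), the sketch below guards two members named like the ones above with a single mutex, so that two threads calling Release() at the same time cannot both observe a non-null api_. FakeApi, GuardedHandle, and RaceTwoReleases are hypothetical names invented for this example, and std::mutex is used only to keep the sketch self-contained; the actual change may structure its locking differently.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the backing API object; not a Firebase type.
struct FakeApi {
  std::mutex mutex;
  int release_count = 0;

  void ReleaseFuture(int /*handle*/) {
    std::lock_guard<std::mutex> lock(mutex);
    ++release_count;
  }
};

// Simplified handle holding the two members named in the description.
class GuardedHandle {
 public:
  GuardedHandle(FakeApi* api, int handle) : api_(api), handle_(handle) {
    assert(api != nullptr);
  }

  // Safe to call from several threads: reading and clearing api_ happen
  // under one lock, so only one caller ever sees a non-null pointer.
  void Release() {
    FakeApi* api = nullptr;
    int handle = 0;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      api = api_;
      handle = handle_;
      api_ = nullptr;
      handle_ = 0;
    }
    // Call out only after unlocking, so mutex_ is not held across other code.
    if (api != nullptr) api->ReleaseFuture(handle);
  }

 private:
  std::mutex mutex_;
  FakeApi* api_;
  int handle_;
};

// Races two Release() calls and returns how many reached the API.
int RaceTwoReleases() {
  FakeApi api;
  GuardedHandle handle(&api, 42);
  std::thread a([&] { handle.Release(); });
  std::thread b([&] { handle.Release(); });
  a.join();
  b.join();
  return api.release_count;
}
```

Whichever thread takes the lock first clears the members; the other finds api_ already null and does nothing, so RaceTwoReleases() reports exactly one release.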

Googlers can see b/183225305 and b/205966141 for more details.

@dconeybe dconeybe self-assigned this Nov 12, 2021
@google-cla google-cla bot added the cla: yes label Nov 12, 2021
@dconeybe dconeybe added the tests-requested: quick Trigger a quick set of integration tests. label Nov 12, 2021
@github-actions github-actions bot added tests: in-progress This PR's integration tests are in progress. and removed tests-requested: quick Trigger a quick set of integration tests. labels Nov 12, 2021
github-actions bot commented Nov 12, 2021

❌  Integration test FAILED

Requested by @dconeybe on commit a219a97
Last updated: Mon Nov 15 19:40 PST 2021
View integration test log & download artifacts

Failures Configs
messaging [TEST] [ERROR] [Android] [ubuntu] [android_target]

Add flaky tests to go/fpl-cpp-flake-tracker

@github-actions github-actions bot added the tests: failed This PR's integration tests failed. label Nov 12, 2021
@firebase-workflow-trigger firebase-workflow-trigger bot removed the tests: in-progress This PR's integration tests are in progress. label Nov 12, 2021
@dconeybe dconeybe added the tests-requested: quick Trigger a quick set of integration tests. label Nov 15, 2021
@github-actions github-actions bot added tests: in-progress This PR's integration tests are in progress. tests: failed This PR's integration tests failed. and removed tests-requested: quick Trigger a quick set of integration tests. tests: failed This PR's integration tests failed. labels Nov 15, 2021
@dconeybe dconeybe added the tests-requested: quick Trigger a quick set of integration tests. label Nov 15, 2021
@github-actions github-actions bot removed tests-requested: quick Trigger a quick set of integration tests. tests: failed This PR's integration tests failed. labels Nov 15, 2021
@@ -20,6 +20,7 @@
#include <stddef.h>
#include <stdint.h>

#include <mutex>

Can you use firebase::Mutex and firebase::MutexLock instead?
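
For readers unfamiliar with the pattern being requested, here is a minimal sketch of a project-owned mutex plus a scoped lock. SketchMutex, SketchMutexLock, and Counter are hypothetical stand-ins written for this example; they are not the firebase::Mutex or firebase::MutexLock classes, whose real interfaces may differ.

```cpp
#include <mutex>

// Hypothetical stand-in for a project-provided mutex type.
class SketchMutex {
 public:
  void Acquire() { impl_.lock(); }
  void Release() { impl_.unlock(); }

 private:
  std::recursive_mutex impl_;
};

// Scoped lock: acquires in the constructor, releases in the destructor.
class SketchMutexLock {
 public:
  explicit SketchMutexLock(SketchMutex& mutex) : mutex_(&mutex) {
    mutex_->Acquire();
  }
  ~SketchMutexLock() { mutex_->Release(); }

  SketchMutexLock(const SketchMutexLock&) = delete;
  SketchMutexLock& operator=(const SketchMutexLock&) = delete;

 private:
  SketchMutex* mutex_;
};

// Example user: every access to value_ goes through the scoped lock.
class Counter {
 public:
  int Increment() {
    SketchMutexLock lock(mutex_);
    return ++value_;
  }

 private:
  SketchMutex mutex_;
  int value_ = 0;
};
```

With a wrapper like this, the standard header is included in one place rather than in every public header that needs locking, which is presumably the motivation behind the request.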

@@ -569,6 +569,9 @@ code.
## Release Notes
### Next Release
- Changes
- All Products (Android): Fixed a data race that could manifest as null

Can you change this to "General (Android):"?

Done.

jonsimantov previously approved these changes Nov 15, 2021
@github-actions github-actions bot dismissed jonsimantov’s stale review November 15, 2021 20:20

🍞 Dismissed stale approval on external PR.

@@ -20,6 +20,7 @@
#include <stddef.h>
#include <stdint.h>

#include <mutex>

⚠️ Lint warning: <mutex> is an unapproved C++11 header.

    }
    detail::RegisterForCleanup(api_, this);
  }

  return *this;
}

#if defined(FIREBASE_USE_MOVE_OPERATORS)
inline FutureBase::FutureBase(FutureBase&& rhs) noexcept
    : api_(NULL)  // NOLINT
{

⚠️ Lint warning: { should almost always be at the end of the previous line

Oh, could you fix the lint warning actually?

Oh wait just noticed it's not in your code - never mind.

@dconeybe dconeybe enabled auto-merge (squash) November 15, 2021 20:22
@github-actions github-actions bot added the tests: failed This PR's integration tests failed. label Nov 15, 2021
@firebase-workflow-trigger firebase-workflow-trigger bot removed the tests: in-progress This PR's integration tests are in progress. label Nov 15, 2021
@dconeybe dconeybe added the tests-requested: quick Trigger a quick set of integration tests. label Nov 15, 2021
@github-actions github-actions bot added tests: in-progress This PR's integration tests are in progress. tests: succeeded This PR's integration tests succeeded. and removed tests-requested: quick Trigger a quick set of integration tests. tests: failed This PR's integration tests failed. labels Nov 15, 2021
@firebase-workflow-trigger firebase-workflow-trigger bot removed the tests: in-progress This PR's integration tests are in progress. label Nov 15, 2021
@jonsimantov jonsimantov self-requested a review November 16, 2021 00:49
@@ -569,6 +569,9 @@ code.
## Release Notes
### Next Release
- Changes
- General (Android): Fixed a data race that could manifest as null pointer

One last note (it can wait for the followup) - this isn't Android only, right? We can remove the "(Android)" tag.

Ahh right. I've opened #749 to fix the release notes entry.

@dconeybe dconeybe merged commit a219a97 into main Nov 16, 2021
@github-actions github-actions bot added tests: in-progress This PR's integration tests are in progress. and removed tests: succeeded This PR's integration tests succeeded. labels Nov 16, 2021
@dconeybe dconeybe deleted the dconeybe/FixFutureDataRaces branch November 16, 2021 02:04
dconeybe added a commit that referenced this pull request Nov 16, 2021
I had originally thought that the issue was specific to Android, because the bug only surfaced on Android; however, it indeed affects all platforms.
@github-actions github-actions bot added the tests: failed This PR's integration tests failed. label Nov 16, 2021
@firebase-workflow-trigger firebase-workflow-trigger bot removed the tests: in-progress This PR's integration tests are in progress. label Nov 16, 2021
DellaBitta added a commit that referenced this pull request Dec 1, 2021
* Fix test on emulator workflow failures (#734)

* If simulator  install ios app failed, reset simulator and try again (#733)

* Trigger workflow move github api code to github.py (#746)

* Fix a data race that could manifest as null pointer dereference in FutureBase::Release() (#747)

* Cancel callbacks for messaging (#745)

* Cancel callbacks for messaging

util::Terminate is reference counted, so when there are more APIs than messaging active, the callbacks will not be canceled until later and can still cause a NULL ref due to the FutureData being destroyed now.

* Cancel callback earlier

* Update readme

* Remove "Android" tag from the release notes entry for #747 (#749)

* Remove calls to LogInfo, LogError, LogDebug during obj-c +load. (#706)

* Remove calls to LogInfo, LogError, LogDebug during obj-c +load.

This could be causing an issue in C++ as global class constructors have not yet been run.

* Add Objective-C/C++ and Java to code formatter script; format those files. (#755)

* Allow format_code to format .m/.mm files; clang-format already knows how.

* Run format_code.py on all objective-c/objective-c++ files.

* Add Java file extensions to format_code.py

* Format all Java source files.

* Remove check for objc header, as they are now supported.

* Format objective-c .h files.

* Don't let lint comment on line length any more; code formatting will report that.

* Messaging crash during initialization (#760)

* Messaging crash during initialization

* Update readme

* Don't redeclare inherited state in CredentialsProviderDesktop (#731)

* Reduce disk space usage when packaging the built SDK (#763)

Remove intermediate build files during desktop packaging step.

This should reduce the disk space usage, as those files (*.o and *.obj) are not required when merging libraries.

* Workaround for Linux x86 build:  downgrade libraries on GitHub runners (#764)

When installing 32-bit Linux dependencies on GitHub runners, downgrade libpcre2-8-0 to an earlier version to ensure compatibility with the i386 version of the package. This is something that should be fixed in a subsequent Ubuntu release and so is a temporary workaround.

This also adds checks to the various prerequisite commands run by build_desktop.py, which was previously just silently ignoring errors (making this much harder to track down). Now it will error out as soon as a command fails.

Co-authored-by: Mou Sun <69009538+sunmou99@users.noreply.github.com>
Co-authored-by: Denver Coneybeare <dconeybe@google.com>
Co-authored-by: Tobias Barendt <tobias@robotsquid.com>
Co-authored-by: Jon Simantov <jsimantov@google.com>
Co-authored-by: Sebastian Schmidt <mrschmidt@google.com>
@firebase firebase locked and limited conversation to collaborators Dec 16, 2021
Labels
api: firestore cla: yes tests: failed This PR's integration tests failed.
2 participants