gRPC abort in grpc::Channel::PerformOpsOnCall #2138

Closed
pschneider opened this issue Dec 2, 2018 · 34 comments

@pschneider

pschneider commented Dec 2, 2018

[REQUIRED] Step 2: Describe your environment

  • Xcode version: 10.1
  • Firebase SDK version: 5.13.0
  • Firebase Component: Firestore
  • Component version: Latest one included in 5.13.0 pod

[REQUIRED] Step 3: Describe the problem

We had never seen this crash before. We think it might be related to the recent gRPC switch in #1968 and related PRs.

The app is currently in a small internal beta test with about 20 users so far. The crash occurred on 3 different devices, for 3 of those 20 users (iPhone 6, iPhone 6s, iPhone X; all running iOS 12.0.1 or newer).

Steps to reproduce:

As far as we can tell from our analytics screen views and the code, we believe the app was in the following state:

  • We had one snapshot listener open which was about to be disposed (the user left the screen where this snapshot listener was running)
  • The document the snapshot listener was attached to might have been deleted or modified while the listener was being disposed

Stacktrace:

Crashed: com.google.firebase.firestore
SIGABRT ABORT 0x00000001d429b104
0  libsystem_kernel.dylib         0x1d429b104 __pthread_kill + 8
1  libsystem_pthread.dylib        0x1d4316070 pthread_kill$VARIANT$mp + 380
2  libsystem_c.dylib              0x1d41f2d78 abort + 140
3  grpcpp                         0x10203d414 non-virtual thunk to grpc::Channel::PerformOpsOnCall(grpc::internal::CallOpSetInterface*, grpc::internal::Call*) (channel_cc.cc)
4  APP_NAME                        0x10139345c grpc::ClientAsyncReaderWriter<grpc::ByteBuffer, grpc::ByteBuffer>::Finish(grpc::Status*, void*) (call.h:680)
5  APP_NAME                        0x101393238 firebase::firestore::remote::GrpcStream::Shutdown() (functional:1862)
6  APP_NAME                        0x1013931c8 firebase::firestore::remote::GrpcStream::FinishImmediately() (grpc_stream.h:186)
7  APP_NAME                        0x1013935c0 firebase::firestore::remote::GrpcStream::WriteAndFinish(grpc::ByteBuffer&&) (optional.h:184)
8  APP_NAME                        0x1013a9704 firebase::firestore::remote::WriteStream::TearDown(firebase::firestore::remote::GrpcStream*) (byte_buffer.h:88)
9  APP_NAME                        0x1013a4a04 firebase::firestore::remote::Stream::Close(firebase::firestore::util::Status const&) (memory:2595)
10 APP_NAME                        0x1013a4868 firebase::firestore::remote::Stream::Stop() (memory:2595)
11 APP_NAME                        0x1012fb454 firebase::firestore::util::AsyncQueue::ExecuteBlocking(std::__1::function<void ()> const&) (atomic:921)
12 APP_NAME                        0x1013032a8 firebase::firestore::util::internal::TimeSlot::InvokedByLibdispatch(void*) (executor_libdispatch.mm:173)
13 libdispatch.dylib              0x1d413e484 _dispatch_client_callout + 16
14 libdispatch.dylib              0x1d40e183c _dispatch_continuation_pop$VARIANT$mp + 412
15 libdispatch.dylib              0x1d40f1bac _dispatch_source_invoke$VARIANT$mp + 1704
16 libdispatch.dylib              0x1d40e5aac _dispatch_lane_serial_drain$VARIANT$mp + 284
17 libdispatch.dylib              0x1d40e6728 _dispatch_lane_invoke$VARIANT$mp + 432
18 libdispatch.dylib              0x1d40eeec8 _dispatch_workloop_worker_thread + 600
19 libsystem_pthread.dylib        0x1d43200dc _pthread_wqthread + 312
20 libsystem_pthread.dylib        0x1d4322cec start_wqthread + 4

com.google.firebase.firestore.rpc
0  libsystem_kernel.dylib         0x1d429d000 poll + 8
1  grpc                           0x101f39a3c pollset_work(grpc_pollset*, grpc_pollset_worker**, long long) (ev_poll_posix.cc:986)
2  grpc                           0x101f3c308 pollset_work(grpc_pollset*, grpc_pollset_worker**, long long) (ev_posix.cc:264)
3  grpc                           0x101f31270 cq_next(grpc_completion_queue*, gpr_timespec, void*) (completion_queue.cc:927)
4  grpcpp                         0x10203e968 grpc::CompletionQueue::AsyncNextInternal(void**, bool*, gpr_timespec) (completion_queue_cc.cc:57)
5  APP_NAME                        0x1012fdb50 firebase::firestore::remote::Datastore::PollGrpcQueue() (datastore.mm:128)
6  APP_NAME                        0x101303e6c firebase::firestore::util::internal::DispatchAsync(NSObject<OS_dispatch_queue>*, std::__1::function<void ()>&&)::$_0::__invoke(void*) (executor_libdispatch.mm:57)
7  libdispatch.dylib              0x1d413e484 _dispatch_client_callout + 16
8  libdispatch.dylib              0x1d40e5be0 _dispatch_lane_serial_drain$VARIANT$mp + 592
9  libdispatch.dylib              0x1d40e6728 _dispatch_lane_invoke$VARIANT$mp + 432
10 libdispatch.dylib              0x1d40eeec8 _dispatch_workloop_worker_thread + 600
11 libsystem_pthread.dylib        0x1d43200dc _pthread_wqthread + 312
12 libsystem_pthread.dylib        0x1d4322cec start_wqthread + 4

Thread #1
0  libsystem_kernel.dylib         0x1d429b374 __select_nocancel + 8
1  libsystem_dnssd.dylib          0x1d4232b78 deliver_request + 1000
2  libsystem_dnssd.dylib          0x1d4233ea0 DNSServiceCreateConnection + 116
3  libsystem_info.dylib           0x1d424c244 _mdns_search_ex + 940
4  libsystem_info.dylib           0x1d424bdac _mdns_search + 128
5  libsystem_info.dylib           0x1d424b2a8 mdns_addrinfo + 940
6  libsystem_info.dylib           0x1d4251118 search_addrinfo + 264
7  libsystem_info.dylib           0x1d42551c4 si_addrinfo + 1672
8  libsystem_info.dylib           0x1d4248964 _getaddrinfo_internal + 196
9  libsystem_info.dylib           0x1d4248894 getaddrinfo + 56
10 grpc                           0x101f6dba0 posix_blocking_resolve_address(char const*, char const*, grpc_resolved_addresses**) (resolve_address_posix.cc:87)
11 grpc                           0x101f6de98 do_request_thread(void*, grpc_error*) (resolve_address_posix.cc:157)
12 grpc                           0x101f3d2b0 GrpcExecutor::RunClosures(grpc_closure_list) (executor.cc:70)
13 grpc                           0x101f3d440 GrpcExecutor::ThreadMain(void*) (executor.cc:175)
14 grpc                           0x101f93338 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*)::'lambda'(void*)::__invoke(void*) (thd_posix.cc:101)
15 libsystem_pthread.dylib        0x1d431f2ac _pthread_body + 128
16 libsystem_pthread.dylib        0x1d431f20c _pthread_start + 48
17 libsystem_pthread.dylib        0x1d4322cf4 thread_start + 4

Thread #2
0  libsystem_kernel.dylib         0x1d429af0c __psynch_cvwait + 8
1  libsystem_pthread.dylib        0x1d4317cd8 _pthread_cond_wait$VARIANT$mp + 636
2  grpc                           0x101f8c950 gpr_cv_wait (sync_posix.cc:91)
3  grpc                           0x101f95e64 timer_thread(void*) (trace.h:72)
4  grpc                           0x101f93338 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*)::'lambda'(void*)::__invoke(void*) (thd_posix.cc:101)
5  libsystem_pthread.dylib        0x1d431f2ac _pthread_body + 128
6  libsystem_pthread.dylib        0x1d431f20c _pthread_start + 48
7  libsystem_pthread.dylib        0x1d4322cf4 thread_start + 4

Thread #3
0  libsystem_kernel.dylib         0x1d429af0c __psynch_cvwait + 8
1  libsystem_pthread.dylib        0x1d4317cd8 _pthread_cond_wait$VARIANT$mp + 636
2  grpc                           0x101f8c940 gpr_cv_wait (sync_posix.cc:89)
3  grpc                           0x101f95e64 timer_thread(void*) (trace.h:72)
4  grpc                           0x101f93338 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*)::'lambda'(void*)::__invoke(void*) (thd_posix.cc:101)
5  libsystem_pthread.dylib        0x1d431f2ac _pthread_body + 128
6  libsystem_pthread.dylib        0x1d431f20c _pthread_start + 48
7  libsystem_pthread.dylib        0x1d4322cf4 thread_start + 4

In the "Keys" entry of the crash log there's the following entry:

crash_info_entry_0
abort() called

These are all the stack traces related to Firestore / gRPC, I believe. There is no other information in the crash log.

I hope this information helps in finding the cause of this crash.

Kind regards

@ntnmrndn

ntnmrndn commented Dec 3, 2018

We have exactly the same issue. I guess we should downgrade Firebase while waiting for a fix.

@var-const var-const self-assigned this Dec 3, 2018
@var-const
Contributor

Thank you for the report. This certainly looks related to the recent gRPC transition. I will try to reproduce this and will follow up here.

@var-const
Contributor

We have exactly the same issue.

@ntnmrndn Does your stack trace also look identical?

@ntnmrndn

ntnmrndn commented Dec 3, 2018

@var-const I have the same stack trace, and a second one in addition (almost identical).

See:

Crashed: com.google.firebase.firestore
0  libsystem_kernel.dylib                0x19ac5f104 __pthread_kill + 8
1  libsystem_pthread.dylib               0x19acde998 pthread_kill$VARIANT$armv81 + 296
2  libsystem_c.dylib                     0x19abb6d78 abort + 140
3  grpcpp                                0x1054d5020 grpc::Channel::PerformOpsOnCall(grpc::internal::CallOpSetInterface*, grpc::internal::Call*) + 622
4  Alpaca                                0x1031c272c grpc::ClientAsyncReaderWriter<grpc::ByteBuffer, grpc::ByteBuffer>::Finish(grpc::Status*, void*) (async_stream.h:570)
5  Alpaca                                0x1031c203c firebase::firestore::remote::GrpcStream::Shutdown() (functional:1862)
6  Alpaca                                0x1031c1eb8 firebase::firestore::remote::GrpcStream::FinishImmediately() (grpc_stream.h:186)
7  Alpaca                                0x1031c2bd8 firebase::firestore::remote::GrpcStream::WriteAndFinish(grpc::ByteBuffer&&) (optional.h:184)
8  Alpaca                                0x10321fd04 firebase::firestore::remote::WriteStream::TearDown(firebase::firestore::remote::GrpcStream*) (byte_buffer.h:88)
9  Alpaca                                0x10320a9d8 firebase::firestore::remote::Stream::Close(firebase::firestore::util::Status const&) (memory:2595)
10 Alpaca                                0x10320a3ec firebase::firestore::remote::Stream::Stop() (memory:2595)
11 Alpaca                                0x1030090f0 firebase::firestore::util::AsyncQueue::ExecuteBlocking(std::__1::function<void ()> const&) (atomic:921)
12 Alpaca                                0x10302b97c firebase::firestore::util::internal::TimeSlot::Execute() (executor_libdispatch.mm:187)
13 Alpaca                                0x10302b7a8 firebase::firestore::util::internal::TimeSlot::InvokedByLibdispatch(void*) (executor_libdispatch.mm:173)
14 libclang_rt.asan_ios_dynamic.dylib    0x105af7030 asan_dispatch_call_block_and_release + 312
15 libdispatch.dylib                     0x19ab02484 _dispatch_client_callout + 16
16 libdispatch.dylib                     0x19aad8e14 _dispatch_continuation_pop$VARIANT$armv81 + 404
17 libdispatch.dylib                     0x19aae8ab4 _dispatch_source_invoke$VARIANT$armv81 + 1704
18 libdispatch.dylib                     0x19aadce84 _dispatch_lane_serial_drain$VARIANT$armv81 + 248
19 libdispatch.dylib                     0x19aaddaf4 _dispatch_lane_invoke$VARIANT$armv81 + 412
20 libdispatch.dylib                     0x19aae5f14 _dispatch_workloop_worker_thread + 584
21 libsystem_pthread.dylib               0x19ace40dc _pthread_wqthread + 312
22 libsystem_pthread.dylib               0x19ace6cec start_wqthread + 4

We also got a new crash in objc_object::release() but it might very well be on us :D

@pschneider
Author

Sadly, we have not yet been able to reproduce this crash in our development environment, but as soon as I have more information I'll provide it here.

@willtr101

We got the same issue; it seems related to the newly replaced gRPC.

@zackshapiro

zackshapiro commented Dec 3, 2018

I'm getting this as well.

Whenever the app backgrounds, I send a notification telling every listening view controller to call .remove() on the appropriate listeners (for good measure and to conserve phone resources), since I don't need to keep track of anything in the background. Another notification re-attaches snapshot listeners for anything listening, but neither of those events seems to be related to this.

I'm unable to reproduce this locally, but about 1/3 of my test users are experiencing it. It happens when they open the app. Otherwise I don't have any additional info.

From my Podfile.lock:

- FirebaseAuth (5.0.5):
- FirebaseAuthInterop (1.0.0)
- FirebaseCore (5.1.8):
- FirebaseFirestore (0.15.0):
- FirebaseFirestore/abseil-cpp (0.15.0):


@zackshapiro

zackshapiro commented Dec 3, 2018

I'm downgrading to FirebaseFirestore 0.14.0 to see if it fixes the issue, since #1968 was merged 20 days ago and 0.14.0 is the last release before it. Will report back with results.

After some testing, downgrading seems to have fixed the issue.

@var-const
Contributor

Ignore my previous post; looks like that function can indeed be optimized away.

@willtr101

willtr101 commented Dec 4, 2018

UPDATED: We fixed the following issue by adding pod 'Firebase/Firestore', '~> 5.12.0'

@zackshapiro could you please share how you handled the downgrade? We used:

  pod 'FirebaseCore', '~> 5.1.7' 
  pod 'FirebaseFirestore', '~> 0.14.0'
  pod 'Firebase/Database'
  pod 'Firebase/Storage'
  pod 'Firebase/Messaging'
  pod 'Firebase/Performance'
  pod 'Firebase/RemoteConfig'
  pod 'Firebase/InAppMessagingDisplay'

  pod 'FirebaseUI/Firestore'

However, we faced this error:

  In Podfile:
    FirebaseUI/Firestore was resolved to 5.2.2, which depends on
      Firebase/Firestore

Specs satisfying the `Firebase/Firestore` dependency were found, but they required a higher minimum deployment target.
CocoaPods could not find compatible versions for pod "FirebaseFirestore":
  In Podfile:
    FirebaseFirestore (~> 0.14.0)

    FirebaseUI/Firestore was resolved to 5.2.2, which depends on
      Firebase/Firestore was resolved to 5.13.0, which depends on
        FirebaseFirestore (= 0.15.0)

Any ideas?
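
The second error above can be reproduced outside CocoaPods. The sketch below assumes that CocoaPods' version operators follow RubyGems semantics (so `~> 0.14.0` means at least 0.14.0 but below 0.15.0) and uses Ruby's built-in Gem::Requirement to show that no FirebaseFirestore version can satisfy both constraints. `versions_satisfying` is a hypothetical helper written for illustration, not a CocoaPods API.

```ruby
require 'rubygems' # provides Gem::Requirement and Gem::Version

# Hypothetical helper: keep only the candidate versions that satisfy
# every requirement string at once.
def versions_satisfying(candidates, requirements)
  reqs = requirements.map { |r| Gem::Requirement.new(r) }
  candidates.select do |candidate|
    version = Gem::Version.new(candidate)
    reqs.all? { |req| req.satisfied_by?(version) }
  end
end

candidates = %w[0.14.0 0.15.0]

# Podfile pin combined with the exact version required by Firebase/Firestore 5.13.0.
p versions_satisfying(candidates, ['~> 0.14.0', '= 0.15.0']) # prints []

# The pin on its own is satisfiable.
p versions_satisfying(candidates, ['~> 0.14.0']) # prints ["0.14.0"]
```

Since no single version satisfies both constraints, the resolver has nothing to pick. Constraining Firebase/Firestore itself, as in the update at the top of this comment, avoids pinning against its exact dependency.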

@var-const
Contributor

We found a bug which might be the cause of the issue (hard to say with full confidence, because the issue is so hard to reproduce, but the symptoms look similar). We're actively working on a fix, and I will post an update here as soon as I have it.

@zackshapiro

@businessengine I just bumped the version of FirebaseFirestore down from 0.15.0 to 0.14.0 and did a pod install. That's it :)

@pschneider
Author

@var-const Thank you very much for your work on this! Much appreciated!

@zackshapiro

+1

@var-const
Contributor

Update: I merged a PR to master which I believe might fix the issue. However, because the issue is so difficult to reproduce, it's hard to know whether this fix covers all possible cases.

It would be super helpful if anyone could temporarily modify their Podfiles to get Firestore from the current master branch and give it a try. The instructions are here; let us know if you run into any difficulties with this process. Once we're confident that the issue is fully resolved, we'll try to release the fix ASAP.

Additionally, it would also be very helpful if you could enable logging in Firestore (if you're using Swift, it can be done via FirebaseConfiguration.shared.setLoggerLevel(.debug)) and attach the device/emulator logs here in case you experience crashes (hopefully not).

@wilhuff
Contributor

wilhuff commented Dec 6, 2018

@pschneider @zackshapiro @businessengine

Any chance you could try out our fix?

Add this to your Podfile for testing:

pod 'FirebaseCore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'master'
pod 'FirebaseFirestore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'master'
@ryanwilson ryanwilson added this to the M40 milestone Dec 6, 2018
@willtr101

Thanks @var-const for the quick fix on this. We have made a build with this fix and will monitor to see if the crash is gone.

@pschneider
Author

@wilhuff We've pointed our app at the fix as well. Unfortunately, the crash has not happened for us again since I created the issue (I guess the internal test usage of the app is too low right now), so it will be hard to really verify this. But I'll report back if we see the crash again.

@willtr101

We can confirm there has been no crash in the 12 hours since the fix was deployed!

@ghost

ghost commented Dec 8, 2018

Just a heads up that I am also seeing the exact same crash on the latest Firestore version. Will deploy the latest fix on my test environment and see if it still reproduces.

Thanks for the quick work on trying to fix this!

@wilhuff wilhuff changed the title from "Crash in channel_cc.cc non-virtual thunk to grpc::Channel::PerformOpsOnCall(grpc::internal::CallOpSetInterface*, grpc::internal::Call*)" to "gRPC abort in grpc::Channel::PerformOpsOnCall" Dec 8, 2018
@wilhuff
Contributor

wilhuff commented Dec 8, 2018

(I've updated the title here to avoid the "non-virtual thunk" because that description was misleading.)

At this point we haven't been able to reproduce this on device, but given the stack we were able to recreate it in unit tests added in #2146. We believe that PR fixes this issue, and it will go out with the next release.

@wilhuff wilhuff closed this as completed Dec 8, 2018
@zackshapiro

zackshapiro commented Dec 10, 2018

Thanks for all of your work on this. I really appreciate it, especially how quickly it was done.

Just curious, what does this functionality improve or enable that wasn't there before? Thanks

@wilhuff
Contributor

wilhuff commented Dec 10, 2018

I'm not sure I understand the question. Do you mean what did our fix change? The PR notes describe the underlying problem and the changes to address it, so we didn't repeat that here. However, maybe you meant something else; could you clarify which functionality you're inquiring about?

@wilhuff
Contributor

wilhuff commented Dec 10, 2018

A colleague has pointed out that you may have been asking why we're migrating to C++ internally. This doesn't really benefit existing users at all, but the point of the project is to make Firestore compatible with the Firebase Games SDKs, which are available in C++ and Unity and also run on Windows and Linux. Our eventual goal here is to share the core of this Firestore codebase among all those efforts, so we're moving away from Objective-C as the primary implementation language.

@zackshapiro

@wilhuff Yeah, I was asking more about what benefits, if any, these changes had for my company beyond what's in 0.14.0, and whether there was any urgency for me to upgrade our pod. I understand that your goal is to make Firestore compatible with the Games SDKs. Thanks for explaining.

@wilhuff
Contributor

wilhuff commented Dec 10, 2018

Our release notes list changes of note across Firebase products. If you're just interested in Firestore, the release notes come from our CHANGELOG.

Note that while we're putting a release together the CHANGELOG may have notes about changes that haven't yet made it out to CocoaPods. For example, as of this writing the 0.16.1 release of Firestore is in the CHANGELOG (and includes the fix for this issue) but is not yet available.

@ghost

ghost commented Dec 15, 2018

Just checking whether a new release will be pushed out soon?

I am in a bit of a bind: I don't want to ship my app with this version given the random fatal errors this issue creates (I've experienced it 4-5 times a week on my test build), and rolling back to the previous versions 5.11.0 or 5.12.0 gives me either the GoogleAppMeasurement issue #2151 (wrong version) or the #2102 issue at build time (multiple commands produce).

If someone could point me to the last stable version, that would be great! I tried rolling back to 5.10.0 but then got the #2151 error again (and I can't seem to find the GoogleAppMeasurement version that works with 5.10.0), so it would help if you could point me to the right GoogleAppMeasurement dependency.

Thanks!

@paulb777
Member

@fschaus 5.15.0 is planned to be pushed out in the next few work days. For some older versions, the GoogleAppMeasurement needs to be explicitly forced to match the FirebaseAnalytics version in the Podfile.

@ntnmrndn

@fschaus You need to specify a correct version in your Podfile. Maybe you can check out an older version of your Podfile.lock and use that?

It's definitely an issue in the pod spec tho...
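
One way to act on this suggestion is to read the resolved versions out of an older Podfile.lock and turn them into exact pins. The following is only a sketch: the sample text is the Podfile.lock excerpt quoted earlier in this thread (wrapped in a `PODS:` header for illustration), and `resolved_versions` and `pin_lines` are hypothetical helpers, not part of CocoaPods. It assumes resolved pods are listed with a two-space indent, as in the usual lock file layout.

```ruby
# Excerpt from a Podfile.lock (taken from an earlier comment in this thread).
LOCK_EXCERPT = <<~LOCK
  PODS:
    - FirebaseAuth (5.0.5):
    - FirebaseAuthInterop (1.0.0)
    - FirebaseCore (5.1.8):
    - FirebaseFirestore (0.15.0):
    - FirebaseFirestore/abseil-cpp (0.15.0):
LOCK

# Hypothetical helper: map each resolved pod name to its locked version.
# Only entries indented by exactly two spaces are read, so nested
# dependency constraints (indented further) are ignored.
def resolved_versions(lock_text)
  lock_text.each_line.each_with_object({}) do |line, versions|
    match = line.match(/\A  - ([^\s(]+) \(([^)]+)\):?\s*\z/)
    next unless match

    name = match[1]
    next if name.include?('/') # skip subspecs; the parent pod carries the version

    versions[name] = match[2]
  end
end

# Hypothetical helper: render exact-version Podfile lines.
def pin_lines(versions)
  versions.map { |name, version| "pod '#{name}', '#{version}'" }
end

puts pin_lines(resolved_versions(LOCK_EXCERPT))
```

The generated lines can be pasted into a Podfile to force the versions that were known to work together; which versions those are depends on the lock file you start from.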

@edwardbeecroft

Thanks for your work on this one guys. Intending to hold out for 5.15.0 (unless it's arriving after the holidays 😄).

@StackHelp

Same issue here as well

@wilhuff
Contributor

wilhuff commented Feb 14, 2019

This comment isn’t actionable.

This issue was filed against Firebase 5.13.0 and was fixed in 5.15.0. If you’re using a version prior to that point, please upgrade. If you’re using Firebase 5.15.0 or later please open a new issue with all the details.

@EvenDu123

I have an issue like this.
How can I fix it?

@mikelehen
Contributor

@EvenDu This comment isn’t actionable.

This issue was filed against Firebase 5.13.0 and was fixed in 5.15.0. If you’re using a version prior to that point, please upgrade. If you’re using Firebase 5.15.0 or later please open a new issue with all the details.

@firebase firebase locked as resolved and limited conversation to collaborators Jun 10, 2019