Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Firestore: Optimize local cache sync when resuming a query that had docs deleted #4982

Merged
merged 49 commits into from
May 8, 2023

Conversation

milaGGL
Copy link
Contributor

@milaGGL milaGGL commented May 5, 2023

Ported from firebase/firebase-js-sdk#7229

Implement an optimization in Firestore when resuming a query where documents have either been deleted or no longer match the query on the server (a.k.a. "removed"). The optimization avoids re-running the entire query just to figure out which documents were deleted/removed in most cases.

Background Information

When a Firestore query is sent to the server, the server replies with the documents in the result set and a "resume token". The result set and the resume token are stored locally on the client. If the same query is resumed at a later time, such as by a later call to Query.get() or a listener registered via Query.addSnapshotListener() reconnects, then the client sends the same query to the server, but this time includes the resume token. To save on network bandwidth, the server only replies with the documents that have changed since the timestamp encoded in the resume token. Additionally, if the query is resumed within 30 minutes, and persistence is enabled, then the customer is only billed for the delta, and not the entire result set (see https://firebase.google.com/docs/firestore/pricing#listens for the official and most up-to-date details on pricing).

The problem is that if some documents in the result set were deleted or removed (i.e. changed to no longer match the query) then the server simply does not observe their presence in the result set and does not send updates for them. This leaves the client's cache in an inconsistent state because it still contains the deleted/removed documents. To work around this cache inconsistency, the server also replies with an "existence filter", a count of the documents that matched the query on the server. The client then compares this count with the number of documents that match the query in its local cache. If those counts are the same then all is good and the result set is raised via a snapshot; however, if the counts do not match then this is called an "existence filter mismatch" and the client re-runs the entire query from scratch, without a resume token, to figure out which documents in its local cache were deleted or removed. Then, the deleted or removed documents go into "limbo" and individual document reads are issued for each of the limbo documents to bring them into sync with the server.

The inefficiency is realized when the client "re-runs the entire query from scratch". This is inefficient for 2 reasons: (1) it re-transmits documents that were just sent when the query was resumed, wasting network bandwidth and (2) it results in being billed for document reads of the entire result set.

The Optimization

To avoid this expensive re-running of the query from scratch the server has been modified to also reply with the names of the documents that had not changed since the timestamp encoded in the resume token. With this additional information, the client can determine which documents in its local cache were deleted or removed, and directly put them into "limbo" without having to re-run the entire query from scratch.

The document names sent from the server are encoded in a data structure called a "bloom filter". A bloom filter is a size-efficient way to encode a "set" of strings. The size efficiency comes at the cost of correctness; that is, when testing for membership in a bloom filter it may incorrectly report that a value is contained in the bloom filter when in fact it is not (a.k.a. a "false positive"). The probability of this happening is made to be exceptionally low by tweaking the parameters of the bloom filter. However, when a false positive does happen then the client is forced to fall back to a full requery. But eliminating the vast majority of the full requeries is an overall win.

Googlers see go/firestore-ttl-deletion-protocol-changes for full details.

milaGGL and others added 30 commits January 17, 2023 10:29
…e bloom filter support has now been deployed to production. (#4871)
@milaGGL milaGGL requested a review from dconeybe May 5, 2023 13:33
@milaGGL milaGGL self-assigned this May 5, 2023
@github-actions
Copy link
Contributor

github-actions bot commented May 5, 2023

📝 PRs merging into main branch

Our main branch should always be in a releasable state. If you are working on a larger change, or if you don't want this change to see the light of the day just yet, consider using a feature branch first, and only merge into the main branch when the code complete and ready to be released.

@google-oss-bot
Copy link
Contributor

google-oss-bot commented May 5, 2023

Coverage Report 1

Affected Products

  • firebase-firestore

    Overall coverage changed from 44.22% (41890a0) to 44.32% (4a18354) by +0.10%.

    13 individual files with coverage change

    FilenameBase (41890a0)Merge (4a18354)Diff
    AutoValue_TestingHooks_ExistenceFilterBloomFilterInfo.java?20.00%?
    BitSequence.java?43.48%?
    BitSequenceOrBuilder.java?0.00%?
    BloomFilter.java?87.72%?
    BloomFilterOrBuilder.java?0.00%?
    BloomFilterProto.java?0.00%?
    ExistenceFilter.java80.00%90.00%+10.00%
    LruGarbageCollector.java97.27%93.64%-3.64%
    RemoteSerializer.java79.18%79.45%+0.26%
    RemoteStore.java88.49%88.80%+0.31%
    TargetData.java77.50%77.78%+0.28%
    TestingHooks.java45.00%64.52%+19.52%
    WatchChangeAggregator.java98.26%98.60%+0.35%

Test Logs

  1. https://storage.googleapis.com/firebase-sdk-metric-reports/K0OSqdmyes.html
@github-actions
Copy link
Contributor

github-actions bot commented May 5, 2023

Unit Test Results

   162 files  +     88     162 suites  +88   1m 57s ⏱️ - 3m 10s
1 158 tests +   918  1 142 ✔️ +   903  16 💤 +16  0  - 1 
2 316 runs  +1 948  2 284 ✔️ +1 917  32 💤 +32  0  - 1 

Results for commit d9a59e2. ± Comparison against base commit 41890a0.

This pull request removes 240 and adds 1158 tests. Note that renamed tests count towards both.
com.google.firebase.appcheck.FirebaseAppCheckRegistrarTest ‑ testGetComponents
com.google.firebase.appcheck.FirebaseAppCheckTest ‑ testGetInstance_defaultFirebaseAppName_matchesDefaultGetter
com.google.firebase.appcheck.FirebaseAppCheckTest ‑ testGetInstance_otherFirebaseAppName_doesNotMatch
com.google.firebase.appcheck.debug.DebugAppCheckProviderFactoryTest ‑ testGetInstance_callTwice_sameInstance
com.google.firebase.appcheck.debug.FirebaseAppCheckDebugRegistrarTest ‑ testGetComponents
com.google.firebase.appcheck.debug.internal.DebugAppCheckProviderTest ‑ exchangeDebugToken_onFailure_setsTaskException
com.google.firebase.appcheck.debug.internal.DebugAppCheckProviderTest ‑ exchangeDebugToken_onSuccess_setsTaskResult
com.google.firebase.appcheck.debug.internal.DebugAppCheckProviderTest ‑ testDetermineDebugSecret_noStoredSecret_createsNewSecret
com.google.firebase.appcheck.debug.internal.DebugAppCheckProviderTest ‑ testDetermineDebugSecret_storedSecret_usesExistingSecret
com.google.firebase.appcheck.debug.internal.DebugAppCheckProviderTest ‑ testPublicConstructor_nullFirebaseApp_expectThrows
…
com.google.firebase.TimestampTest ‑ testCompare
com.google.firebase.TimestampTest ‑ testFromDate
com.google.firebase.TimestampTest ‑ testRejectBadDates
com.google.firebase.TimestampTest ‑ testTimestampParcelable
com.google.firebase.firestore.AggregateQuerySnapshotTest ‑ createWithCountShouldReturnInstanceWithTheGivenQueryAndCount
com.google.firebase.firestore.AggregateQueryTest ‑ testSourceMustNotBeNull
com.google.firebase.firestore.BlobTest ‑ testComparison
com.google.firebase.firestore.BlobTest ‑ testEquals
com.google.firebase.firestore.BlobTest ‑ testMutableBytes
com.google.firebase.firestore.CollectionReferenceTest ‑ testEquals
…

♻️ This comment has been updated with latest results.

@google-oss-bot
Copy link
Contributor

google-oss-bot commented May 5, 2023

Size Report 1

Affected Products

  • firebase-firestore

    TypeBase (41890a0)Merge (4a18354)Diff
    aar1.34 MB1.36 MB+21.6 kB (+1.6%)
    apk (aggressive)518 kB520 kB+1.95 kB (+0.4%)
    apk (release)3.94 MB3.95 MB+8.75 kB (+0.2%)

Test Logs

  1. https://storage.googleapis.com/firebase-sdk-metric-reports/mov1Uu0hVN.html
@google-oss-bot
Copy link
Contributor

google-oss-bot commented May 5, 2023

Startup Time Report 1

Note: Layout is sometimes suboptimal due to limited formatting support on GitHub. Please check this report on GCS.

Notes

Startup Times

  • fire-fst

    DeviceStatisticsDistributions
    oriole-32
    Percentile41890a04a18354DiffSignificant (?)
    p10325 ±32 μs351 ±80 μs+26.4 μs (+8.1%)NO
    p25338 ±47 μs369 ±96 μs+30.6 μs (+9.1%)NO
    p50355 ±63 μs398 ±114 μs+43.3 μs (+12.2%)NO
    p75381 ±73 μs484 ±209 μs+103 μs (+27.0%)NO
    p90423 ±98 μs636 ±432 μs+213 μs (+50.4%)NO

    20 test runs in comparison
    CommitTest Runs
    41890a0
    • 2023-05-05_04:15:40.081227_YyuI
    • 2023-05-05_04:15:40.083518_PZSZ
    • 2023-05-05_04:15:40.083530_wNXk
    • 2023-05-05_04:15:40.083536_tnaZ
    • 2023-05-05_04:15:40.083542_wmss
    • 2023-05-05_04:15:40.083547_UyBC
    • 2023-05-05_04:15:40.083553_hXuX
    • 2023-05-05_04:15:40.083558_eaTI
    • 2023-05-05_04:15:40.083563_NUxE
    • 2023-05-05_04:15:40.083569_BRjb
    4a18354
    • 2023-05-08_17:36:57.916586_bADU
    • 2023-05-08_17:36:57.920282_nsGU
    • 2023-05-08_17:36:57.920297_ZxcM
    • 2023-05-08_17:36:57.920305_cdUZ
    • 2023-05-08_17:36:57.920310_xQmU
    • 2023-05-08_17:36:57.920316_Tnou
    • 2023-05-08_17:36:57.920321_qVtH
    • 2023-05-08_17:36:57.920326_apzT
    • 2023-05-08_17:36:57.920331_uanA
    • 2023-05-08_17:36:57.920336_cNVp
    redfin-30
    Percentile41890a04a18354DiffSignificant (?)
    p10642 ±31 μs637 ±28 μs-4.65 μs (-0.7%)NO
    p25661 ±34 μs655 ±30 μs-5.58 μs (-0.8%)NO
    p50688 ±39 μs680 ±39 μs-7.80 μs (-1.1%)NO
    p75726 ±48 μs712 ±53 μs-14.1 μs (-1.9%)NO
    p90768 ±60 μs779 ±106 μs+10.7 μs (+1.4%)NO

    20 test runs in comparison
    CommitTest Runs
    41890a0
    • 2023-05-05_04:15:40.081227_YyuI
    • 2023-05-05_04:15:40.083518_PZSZ
    • 2023-05-05_04:15:40.083530_wNXk
    • 2023-05-05_04:15:40.083536_tnaZ
    • 2023-05-05_04:15:40.083542_wmss
    • 2023-05-05_04:15:40.083547_UyBC
    • 2023-05-05_04:15:40.083553_hXuX
    • 2023-05-05_04:15:40.083558_eaTI
    • 2023-05-05_04:15:40.083563_NUxE
    • 2023-05-05_04:15:40.083569_BRjb
    4a18354
    • 2023-05-08_17:36:57.916586_bADU
    • 2023-05-08_17:36:57.920282_nsGU
    • 2023-05-08_17:36:57.920297_ZxcM
    • 2023-05-08_17:36:57.920305_cdUZ
    • 2023-05-08_17:36:57.920310_xQmU
    • 2023-05-08_17:36:57.920316_Tnou
    • 2023-05-08_17:36:57.920321_qVtH
    • 2023-05-08_17:36:57.920326_apzT
    • 2023-05-08_17:36:57.920331_uanA
    • 2023-05-08_17:36:57.920336_cNVp
  • timeToInitialDisplay

    DeviceStatisticsDistributions
    oriole-32
    Percentile41890a04a18354DiffSignificant (?)
    p10193 ±10 ms207 ±25 ms+14.1 ms (+7.3%)NO
    p25201 ±17 ms213 ±25 ms+12.4 ms (+6.2%)NO
    p50210 ±22 ms222 ±31 ms+12.0 ms (+5.7%)NO
    p75219 ±26 ms233 ±34 ms+13.7 ms (+6.3%)NO
    p90227 ±30 ms250 ±39 ms+22.6 ms (+10.0%)NO

    20 test runs in comparison
    CommitTest Runs
    41890a0
    • 2023-05-05_04:15:40.081227_YyuI
    • 2023-05-05_04:15:40.083518_PZSZ
    • 2023-05-05_04:15:40.083530_wNXk
    • 2023-05-05_04:15:40.083536_tnaZ
    • 2023-05-05_04:15:40.083542_wmss
    • 2023-05-05_04:15:40.083547_UyBC
    • 2023-05-05_04:15:40.083553_hXuX
    • 2023-05-05_04:15:40.083558_eaTI
    • 2023-05-05_04:15:40.083563_NUxE
    • 2023-05-05_04:15:40.083569_BRjb
    4a18354
    • 2023-05-08_17:36:57.916586_bADU
    • 2023-05-08_17:36:57.920282_nsGU
    • 2023-05-08_17:36:57.920297_ZxcM
    • 2023-05-08_17:36:57.920305_cdUZ
    • 2023-05-08_17:36:57.920310_xQmU
    • 2023-05-08_17:36:57.920316_Tnou
    • 2023-05-08_17:36:57.920321_qVtH
    • 2023-05-08_17:36:57.920326_apzT
    • 2023-05-08_17:36:57.920331_uanA
    • 2023-05-08_17:36:57.920336_cNVp
    redfin-30
    Percentile41890a04a18354DiffSignificant (?)
    p10232 ±5 ms255 ±3 ms+23.5 ms (+10.1%)MAYBE
    p25238 ±5 ms261 ±4 ms+23.1 ms (+9.7%)MAYBE
    p50245 ±6 ms269 ±4 ms+24.2 ms (+9.9%)MAYBE
    p75253 ±6 ms279 ±6 ms+26.7 ms (+10.6%)MAYBE
    p90262 ±6 ms292 ±8 ms+30.6 ms (+11.7%)MAYBE

    20 test runs in comparison
    CommitTest Runs
    41890a0
    • 2023-05-05_04:15:40.081227_YyuI
    • 2023-05-05_04:15:40.083518_PZSZ
    • 2023-05-05_04:15:40.083530_wNXk
    • 2023-05-05_04:15:40.083536_tnaZ
    • 2023-05-05_04:15:40.083542_wmss
    • 2023-05-05_04:15:40.083547_UyBC
    • 2023-05-05_04:15:40.083553_hXuX
    • 2023-05-05_04:15:40.083558_eaTI
    • 2023-05-05_04:15:40.083563_NUxE
    • 2023-05-05_04:15:40.083569_BRjb
    4a18354
    • 2023-05-08_17:36:57.916586_bADU
    • 2023-05-08_17:36:57.920282_nsGU
    • 2023-05-08_17:36:57.920297_ZxcM
    • 2023-05-08_17:36:57.920305_cdUZ
    • 2023-05-08_17:36:57.920310_xQmU
    • 2023-05-08_17:36:57.920316_Tnou
    • 2023-05-08_17:36:57.920321_qVtH
    • 2023-05-08_17:36:57.920326_apzT
    • 2023-05-08_17:36:57.920331_uanA
    • 2023-05-08_17:36:57.920336_cNVp

  1. https://storage.googleapis.com/firebase-sdk-metric-reports/3or0f3IOJF/index.html
@milaGGL milaGGL merged commit c83d5e5 into master May 8, 2023
@milaGGL milaGGL deleted the mila/BloomFilter branch May 8, 2023 18:05
dconeybe added a commit to firebase/firebase-js-sdk that referenced this pull request May 8, 2023
dconeybe added a commit to firebase/firebase-js-sdk that referenced this pull request May 8, 2023
@firebase firebase locked and limited conversation to collaborators Jun 8, 2023
@dconeybe
Copy link
Contributor

For a discussion about the implementation details of this PR, see firebase/firebase-ios-sdk#12270.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
3 participants