Firestore: Optimize local cache sync when resuming a query that had docs deleted #11457

milaGGL · 2023-06-19T19:59:23Z

Ported from firebase/firebase-js-sdk#7229 and firebase/firebase-android-sdk#4982

Implement an optimization in Firestore when resuming a query where documents have either been deleted or no longer match the query on the server (a.k.a. "removed"). The optimization avoids re-running the entire query just to figure out which documents were deleted/removed in most cases.

Background Information

When a Firestore query is sent to the server, the server replies with the documents in the result set and a "resume token". The result set and the resume token are stored locally on the client. If the same query is resumed at a later time, such as by a later call to Query.get() or a listener registered via Query.addSnapshotListener() reconnects, then the client sends the same query to the server, but this time includes the resume token. To save on network bandwidth, the server only replies with the documents that have changed since the timestamp encoded in the resume token. Additionally, if the query is resumed within 30 minutes, and persistence is enabled, then the customer is only billed for the delta, and not the entire result set (see https://firebase.google.com/docs/firestore/pricing#listens for the official and most up-to-date details on pricing).

The problem is that if some documents in the result set were deleted or removed (i.e. changed to no longer match the query) then the server simply does not observe their presence in the result set and does not send updates for them. This leaves the client's cache in an inconsistent state because it still contains the deleted/removed documents. To work around this cache inconsistency, the server also replies with an "existence filter", a count of the documents that matched the query on the server. The client then compares this count with the number of documents that match the query in its local cache. If those counts are the same then all is good and the result set is raised via a snapshot; however, if the counts do not match then this is called an "existence filter mismatch" and the client re-runs the entire query from scratch, without a resume token, to figure out which documents in its local cache were deleted or removed. Then, the deleted or removed documents go into "limbo" and individual document reads are issued for each of the limbo documents to bring them into sync with the server.

The inefficiency is realized when the client "re-runs the entire query from scratch". This is inefficient for 2 reasons: (1) it re-transmits documents that were just sent when the query was resumed, wasting network bandwidth and (2) it results in being billed for document reads of the entire result set.

The Optimization

To avoid this expensive re-running of the query from scratch the server has been modified to also reply with the names of the documents that had not changed since the timestamp encoded in the resume token. With this additional information, the client can determine which documents in its local cache were deleted or removed, and directly put them into "limbo" without having to re-run the entire query from scratch.

The document names sent from the server are encoded in a data structure called a "bloom filter". A bloom filter is a size-efficient way to encode a "set" of strings. The size efficiency comes at the cost of correctness; that is, when testing for membership in a bloom filter it may incorrectly report that a value is contained in the bloom filter when in fact it is not (a.k.a. a "false positive"). The probability of this happening is made to be exceptionally low by tweaking the parameters of the bloom filter. However, when a false positive does happen then the client is forced to fall back to a full requery. But eliminating the vast majority of the full requeries is an overall win.

Googlers see go/firestore-ttl-deletion-protocol-changes for full details.

* upload initial code * add golden test with md5 calculator * resolve comments * add static const to kGoldenDocumentPrefix * resolve comments * resolve comments

…nner (#10862)

…, added in #10823: google_firestore_v1_ExistenceFilter, google_firestore_v1_BloomFilter, and google_firestore_v1_BitSequence (#10983)

…requery (#11440)

…l PR number)

google-oss-bot · 2023-06-19T20:25:11Z

Size Report ¹

Affected Products

FirebaseFirestore
Type Base (671c879) Merge (1743194) Diff
CocoaPods ? -51.5 kB ? (?)

Test Logs

Merge (1743194): https://github.com/firebase/firebase-ios-sdk/actions/runs/5315666714

https://storage.googleapis.com/firebase-sdk-metric-reports/MRAeUrALwC.html

google-oss-bot · 2023-06-19T21:31:32Z

Coverage Report ¹

Affected Products

`FirebaseFirestore-iOS-FirebaseFirestore.framework`

Overall coverage changed from ? (671c879) to 88.05% (1743194) by ?.

227 individual files with coverage change

Filename	Base (`671c879`)	Merge (`1743194`)	Diff
aggregate_alias.cc	?	100.00%	?
aggregate_field.cc	?	100.00%	?
aggregate_query.cc	?	100.00%	?
aggregation_result.nanopb.cc	?	0.00%	?
any.nanopb.cc	?	0.00%	?
array_contains_any_filter.cc	?	100.00%	?
array_contains_filter.cc	?	100.00%	?
async_queue.cc	?	100.00%	?
auth_token.cc	?	100.00%	?
autoid.cc	?	100.00%	?
background_queue.cc	?	100.00%	?
bits.cc	?	100.00%	?
bloom_filter.cc	?	90.12%	?
bloom_filter.nanopb.cc	?	0.00%	?
bound.cc	?	84.29%	?
bundle.nanopb.cc	?	0.00%	?
bundle_loader.cc	?	95.65%	?
bundle_reader.cc	?	91.26%	?
bundle_serializer.cc	?	91.49%	?
byte_stream_apple.mm	?	86.36%	?
byte_stream_cpp.cc	?	83.33%	?
byte_string.cc	?	80.30%	?
collection_reference.cc	?	100.00%	?
common.nanopb.cc	?	34.48%	?
comparison.cc	?	100.00%	?
composite_filter.cc	?	98.11%	?
connectivity_monitor.cc	?	100.00%	?
connectivity_monitor_apple.mm	?	51.92%	?
converters.mm	?	100.00%	?
database_id.cc	?	65.38%	?
database_info.cc	?	100.00%	?
datastore.cc	?	96.69%	?
delete_mutation.cc	?	84.00%	?
direction.cc	?	76.92%	?
document.cc	?	0.00%	?
document.nanopb.cc	?	96.67%	?
document_change.cc	?	62.50%	?
document_key.cc	?	93.06%	?
document_key_reference.cc	?	65.00%	?
document_overlay_cache.cc	?	100.00%	?
document_reference.cc	?	100.00%	?
document_set.cc	?	87.30%	?
document_snapshot.cc	?	100.00%	?
empty.nanopb.cc	?	0.00%	?
error_apple.mm	?	92.31%	?
event_manager.cc	?	98.08%	?
exception.cc	?	23.68%	?
exception_apple.mm	?	96.55%	?
executor_libdispatch.mm	?	96.06%	?
executor_std.cc	?	97.71%	?
exponential_backoff.cc	?	100.00%	?
field_filter.cc	?	95.12%	?
field_index.cc	?	92.68%	?
field_mask.cc	?	100.00%	?
field_path.cc	?	98.17%	?
field_transform.cc	?	41.67%	?
filesystem_apple.mm	?	83.33%	?
filesystem_common.cc	?	81.67%	?
filesystem_posix.cc	?	81.82%	?
filter.cc	?	62.50%	?
FIRAggregateField.mm	?	98.41%	?
FIRAggregateQuery.mm	?	100.00%	?
FIRAggregateQuerySnapshot.mm	?	100.00%	?
FIRCollectionReference.mm	?	95.08%	?
FIRDocumentChange.mm	?	76.32%	?
FIRDocumentReference.mm	?	97.10%	?
FIRDocumentSnapshot.mm	?	97.89%	?
firebase_app_check_credentials_provider_apple.mm	?	92.93%	?
firebase_auth_credentials_provider_apple.mm	?	76.64%	?
firebase_metadata_provider_apple.mm	?	86.96%	?
firebase_metadata_provider_noop.cc	?	100.00%	?
firestore.cc	?	92.13%	?
firestore.nanopb.cc	?	35.95%	?
firestore_client.cc	?	98.91%	?
firestore_index_value_writer.cc	?	90.00%	?
FIRFieldPath.mm	?	90.24%	?
FIRFieldValue.mm	?	89.41%	?
FIRFilter.mm	?	100.00%	?
FIRFirestore.mm	?	88.70%	?
FIRFirestoreSettings.mm	?	81.00%	?
FIRFirestoreSource.mm	?	90.91%	?
FIRGeoPoint.mm	?	82.22%	?
FIRListenerRegistration.mm	?	100.00%	?
FIRLoadBundleTask.mm	?	80.70%	?
FIRLocalCacheSettings.mm	?	48.97%	?
FIRQuery.mm	?	87.87%	?
FIRQuerySnapshot.mm	?	95.45%	?
FIRSnapshotMetadata.mm	?	100.00%	?
FIRTimestamp.m	?	79.35%	?
FIRTransaction.mm	?	97.64%	?
FIRTransactionOptions.mm	?	100.00%	?
FIRWriteBatch.mm	?	100.00%	?
FSTFirestoreComponent.mm	?	95.74%	?
FSTUserDataReader.mm	?	95.38%	?
FSTUserDataWriter.mm	?	87.06%	?
geo_point.cc	?	65.00%	?
grpc_completion.cc	?	100.00%	?
grpc_connection.cc	?	77.50%	?
grpc_nanopb.cc	?	94.87%	?
grpc_root_certificate_finder_generated.cc	?	100.00%	?
grpc_stream.cc	?	99.01%	?
grpc_streaming_reader.cc	?	100.00%	?
grpc_unary_call.cc	?	100.00%	?
grpc_util.cc	?	100.00%	?
hard_assert.cc	?	100.00%	?
http.nanopb.cc	?	0.00%	?
index.nanopb.cc	?	0.00%	?
index_backfiller.cc	?	100.00%	?
index_entry.cc	?	60.00%	?
in_filter.cc	?	100.00%	?
json_reader.cc	?	87.50%	?
key_field_filter.cc	?	100.00%	?
key_field_in_filter.cc	?	100.00%	?
key_field_not_in_filter.cc	?	100.00%	?
latlng.nanopb.cc	?	86.67%	?
leveldb_bundle_cache.cc	?	76.00%	?
leveldb_document_overlay_cache.cc	?	97.17%	?
leveldb_index_manager.cc	?	97.67%	?
leveldb_key.cc	?	98.82%	?
leveldb_lru_reference_delegate.cc	?	95.93%	?
leveldb_migrations.cc	?	92.64%	?
leveldb_mutation_queue.cc	?	92.42%	?
leveldb_opener.cc	?	76.81%	?
leveldb_overlay_migration_manager.cc	?	100.00%	?
leveldb_persistence.cc	?	90.82%	?
leveldb_remote_document_cache.cc	?	94.61%	?
leveldb_target_cache.cc	?	94.68%	?
leveldb_transaction.cc	?	98.79%	?
leveldb_util.cc	?	71.43%	?
load_bundle_task.cc	?	97.06%	?
local_documents_view.cc	?	96.86%	?
local_serializer.cc	?	87.78%	?
local_store.cc	?	100.00%	?
local_view_changes.cc	?	100.00%	?
logic_utils.cc	?	97.94%	?
log_apple.mm	?	93.33%	?
lru_garbage_collector.cc	?	91.34%	?
maybe_document.nanopb.cc	?	28.89%	?
md5.cc	?	91.88%	?
memory_bundle_cache.cc	?	100.00%	?
memory_document_overlay_cache.cc	?	100.00%	?
memory_eager_reference_delegate.cc	?	100.00%	?
memory_index_manager.cc	?	50.00%	?
memory_lru_reference_delegate.cc	?	95.41%	?
memory_mutation_queue.cc	?	100.00%	?
memory_persistence.cc	?	100.00%	?
memory_remote_document_cache.cc	?	93.51%	?
memory_target_cache.cc	?	100.00%	?
message.cc	?	100.00%	?
mutable_document.cc	?	68.52%	?
mutation.cc	?	86.15%	?
mutation.nanopb.cc	?	75.76%	?
mutation_batch.cc	?	88.30%	?
mutation_batch_result.cc	?	52.17%	?
nanopb_util.cc	?	100.00%	?
not_in_filter.cc	?	100.00%	?
object_value.cc	?	100.00%	?
online_state_tracker.cc	?	100.00%	?
ordered_code.cc	?	94.39%	?
order_by.cc	?	50.00%	?
overlay.cc	?	100.00%	?
patch_mutation.cc	?	100.00%	?
path.cc	?	100.00%	?
precondition.cc	?	86.49%	?
pretty_printing.cc	?	83.33%	?
proto_sizer.cc	?	100.00%	?
query.cc	?	98.45%	?
query.nanopb.cc	?	67.92%	?
query_core.cc	?	96.23%	?
query_engine.cc	?	98.28%	?
query_listener.cc	?	100.00%	?
query_listener_registration.cc	?	100.00%	?
query_snapshot.cc	?	84.21%	?
reader.cc	?	100.00%	?
reference_set.cc	?	88.89%	?
remote_event.cc	?	97.67%	?
remote_objc_bridge.cc	?	92.96%	?
remote_store.cc	?	91.16%	?
resource.nanopb.cc	?	0.00%	?
resource_path.cc	?	100.00%	?
schedule.cc	?	100.00%	?
secure_random_arc4random.cc	?	100.00%	?
serializer.cc	?	90.80%	?
server_timestamp_util.cc	?	95.77%	?
settings.cc	?	83.11%	?
set_mutation.cc	?	89.13%	?
snapshots_in_sync_listener_registration.cc	?	100.00%	?
snapshot_metadata.cc	?	100.00%	?
snapshot_version.cc	?	85.71%	?
status.cc	?	73.05%	?
status.nanopb.cc	?	72.22%	?
statusor.cc	?	80.00%	?
status_apple.mm	?	96.61%	?
status_errno.cc	?	37.82%	?
stream.cc	?	98.83%	?
strerror.cc	?	100.00%	?
string_apple.cc	?	100.00%	?
string_format.cc	?	93.94%	?
string_util.cc	?	100.00%	?
struct.nanopb.cc	?	7.32%	?
sync_engine.cc	?	95.26%	?
target.cc	?	96.23%	?
target.nanopb.cc	?	45.00%	?
target_data.cc	?	43.04%	?
target_id_generator.cc	?	100.00%	?
target_index_matcher.cc	?	97.80%	?
task.cc	?	94.78%	?
testing_hooks.cc	?	100.00%	?
timestamp.cc	?	94.00%	?
timestamp.nanopb.cc	?	100.00%	?
timestamp_internal.cc	?	53.33%	?
transaction.cc	?	90.06%	?
transaction_runner.cc	?	100.00%	?
transform_operation.cc	?	77.03%	?
user.cc	?	100.00%	?
user_data.cc	?	96.30%	?
value_util.cc	?	95.71%	?
verify_mutation.cc	?	12.50%	?
view.cc	?	98.39%	?
view_snapshot.cc	?	78.99%	?
watch_change.cc	?	90.00%	?
watch_stream.cc	?	90.70%	?
wrappers.nanopb.cc	?	11.11%	?
write.nanopb.cc	?	60.57%	?
writer.cc	?	96.97%	?
write_batch.cc	?	91.89%	?
write_stream.cc	?	94.37%	?

`FirebaseFirestore-iOS-FirebaseFirestoreSwift.framework`

Overall coverage changed from ? (671c879) to 46.92% (1743194) by ?.

22 individual files with coverage change

Filename	Base (`671c879`)	Merge (`1743194`)	Diff
CodablePassThroughTypes.swift	?	100.00%	?
CollectionReference+AsyncAwait.swift	?	96.88%	?
CollectionReference+WriteEncodable.swift	?	100.00%	?
DocumentID.swift	?	86.44%	?
DocumentReference+Codable.swift	?	50.00%	?
DocumentReference+ReadDecodable.swift	?	0.00%	?
DocumentReference+WriteEncodable.swift	?	100.00%	?
DocumentSnapshot+ReadDecodable.swift	?	80.00%	?
EncoderDecoder.swift	?	100.00%	?
ExplicitNull.swift	?	90.48%	?
FieldValue+Encodable.swift	?	100.00%	?
Firestore+AsyncAwait.swift	?	97.75%	?
FirestoreQuery.swift	?	0.00%	?
FirestoreQueryObservable.swift	?	0.00%	?
GeoPoint+Codable.swift	?	100.00%	?
QueryPredicate.swift	?	0.00%	?
ServerTimestamp.swift	?	96.97%	?
Timestamp+Codable.swift	?	100.00%	?
TimestampDecodingStrategy.swift	?	100.00%	?
TimestampEncodingStrategy.swift	?	100.00%	?
Transaction+WriteEncodable.swift	?	100.00%	?
WriteBatch+WriteEncodable.swift	?	100.00%	?

Test Logs

Merge (1743194): https://github.com/firebase/firebase-ios-sdk/actions/runs/5315666714

https://storage.googleapis.com/firebase-sdk-metric-reports/RbcfYLXyJk.html

…ure has been merged into the ios sdk in firebase/firebase-ios-sdk#11457

dconeybe · 2024-01-16T15:25:11Z

For a discussion about the implementation details of this PR, see #12270.

milaGGL and others added 30 commits February 15, 2023 21:52

add bloomFilter to proto (#10823)

ca6f7d2

Create bloom filter class header filer (#10841)

abd5af2

Merge branch 'master' into mila/BloomFilter

2e78e41

Implement BloomFilter class (#10846)

024df8c

Firestore: Add CalculateMd5Digest() function (#10854)

698c946

Add golden tests for BloomFilter (#10874)

93815e6

* upload initial code * add golden test with md5 calculator * resolve comments * add static const to kGoldenDocumentPrefix * resolve comments * resolve comments

Add expected count to Target (#10872)

3b5b49b

Merge branch 'master' into mila/BloomFilter

d08ee84

Merge branch 'master' into mila/BloomFilter

5252a62

Add unchanged names to existence filter and watchFilters spec test ru…

9dc9f3e

…nner (#10862)

Merge branch 'master' into mila/BloomFilter

de45a95

fields_array.h: add missing template specializations for bloom filter…

d86d788

…, added in #10823: google_firestore_v1_ExistenceFilter, google_firestore_v1_BloomFilter, and google_firestore_v1_BitSequence (#10983)

Apply bloom filter on existence filter mismatch (#10953)

b761c5f

Merge branch 'master' into mila/BloomFilter

62afd7c

Add new goog-listen-tags for BloomFilter (#10977)

08025ae

remove unnecessary local variable

3bc3a48

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

0745252

Use ByteString instead of std::vector in the BloomFilter (#11038)

5b83bd4

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

3819fcf

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

7e8f921

project.pbxproj update

d43007c

FSTSpecTests.mm: add TargetPurposeExistenceFilterMismatchBloom

bd0ebba

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

d196b9a

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

bed753b

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

6435094

FIRQueryTests.mm: static_cast<int> -> size_t{1}

7979684

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

665372e

Update the integration test to verify that bloom filter averted full …

d73d97c

…requery (#11440)

Firestore/CHANGELOG.md: Add change log entry (NOTE: update with actua…

749ee64

…l PR number)

Merge remote-tracking branch 'origin/master' into mila/BloomFilter

be2052b

google-oss-bot added the api: firestore label Jun 19, 2023

milaGGL requested a review from dconeybe June 19, 2023 20:03

dconeybe mentioned this pull request Jun 20, 2023

Firestore: Optimize local cache sync when resuming a query that had docs deleted firebase/firebase-js-sdk#7229

Merged

dconeybe approved these changes Jun 20, 2023

View reviewed changes

dconeybe merged commit 3f5972f into master Jun 20, 2023

dconeybe deleted the mila/BloomFilter branch June 20, 2023 16:10

dconeybe added a commit to firebase/firebase-js-sdk that referenced this pull request Jun 20, 2023

Remove the no-ios tag from bloom filter spec tests, now that the feat…

87b8c22

…ure has been merged into the ios sdk in firebase/firebase-ios-sdk#11457

dconeybe mentioned this pull request Jun 20, 2023

Firestore: Remove the no-ios tag from bloom filter spec tests firebase/firebase-js-sdk#7372

Merged

dconeybe added a commit to firebase/firebase-js-sdk that referenced this pull request Jun 20, 2023

Remove the no-ios tag from bloom filter spec tests, now that the feat…

75978dd

…ure has been merged into the ios sdk in firebase/firebase-ios-sdk#11457

dconeybe added a commit to firebase/firebase-android-sdk that referenced this pull request Jun 20, 2023

Remove the no-ios tag from bloom filter spec tests, now that the feat…

618ad06

…ure has been merged into the ios sdk in firebase/firebase-ios-sdk#11457

firebase locked and limited conversation to collaborators Jul 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Firestore: Optimize local cache sync when resuming a query that had docs deleted #11457

Firestore: Optimize local cache sync when resuming a query that had docs deleted #11457

milaGGL commented Jun 19, 2023 •

edited by dconeybe

Loading

google-oss-bot commented Jun 19, 2023

`FirebaseFirestore`

google-oss-bot commented Jun 19, 2023

`FirebaseFirestore-iOS-FirebaseFirestore.framework`

`FirebaseFirestore-iOS-FirebaseFirestoreSwift.framework`

dconeybe commented Jan 16, 2024

Firestore: Optimize local cache sync when resuming a query that had docs deleted #11457

Firestore: Optimize local cache sync when resuming a query that had docs deleted #11457

Conversation

milaGGL commented Jun 19, 2023 • edited by dconeybe Loading

Background Information

The Optimization

google-oss-bot commented Jun 19, 2023

Size Report 1

Affected Products

FirebaseFirestore

Test Logs

google-oss-bot commented Jun 19, 2023

Coverage Report 1

Affected Products

FirebaseFirestore-iOS-FirebaseFirestore.framework

FirebaseFirestore-iOS-FirebaseFirestoreSwift.framework

Test Logs

dconeybe commented Jan 16, 2024

milaGGL commented Jun 19, 2023 •

edited by dconeybe

Loading

Size Report ¹

`FirebaseFirestore`

Coverage Report ¹

`FirebaseFirestore-iOS-FirebaseFirestore.framework`

`FirebaseFirestore-iOS-FirebaseFirestoreSwift.framework`