BigQuery: Add support for BigQuery Storage API Arrow format in to_dataframe and to_arrow. #8551

Merged (20 commits) on Jul 12, 2019

Conversation

TheNeuralBit
Contributor

  • Makes _StreamParser abstract and splits it into two implementations: one for Arrow and one for Avro. The implementation is selected based on the schema set in the ReadSession.
  • Modifies the BigQuery client library to use the Arrow format for calls to client.list_rows(table).to_dataframe(bq_storage_client)
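
The parser split described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class names, the `parse` method, the `from_read_session` factory, and the plain-dict stand-in for a ReadSession (with its `schema_type` key) are all hypothetical, and the actual Avro/Arrow decoding is replaced by placeholders.

```python
import abc


class StreamParser(abc.ABC):
    """Hypothetical abstract parser; stands in for the PR's _StreamParser."""

    def __init__(self, read_session):
        self._read_session = read_session

    @abc.abstractmethod
    def parse(self, message):
        """Decode one serialized response message."""

    @staticmethod
    def from_read_session(read_session):
        # Assumption: the session is a plain dict that records which schema
        # is set. The real ReadSession is a protobuf message.
        schema_type = read_session.get("schema_type")
        if schema_type == "avro_schema":
            return AvroStreamParser(read_session)
        if schema_type == "arrow_schema":
            return ArrowStreamParser(read_session)
        raise TypeError("Unsupported schema type: {!r}".format(schema_type))


class AvroStreamParser(StreamParser):
    def parse(self, message):
        # Placeholder: a real parser would decode Avro-encoded rows here.
        return {"format": "avro", "payload": message}


class ArrowStreamParser(StreamParser):
    def parse(self, message):
        # Placeholder: a real parser would build an Arrow record batch here.
        return {"format": "arrow", "payload": message}
```

Selecting the implementation once, when the session is known, keeps the per-message read loop free of format checks.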
@TheNeuralBit TheNeuralBit requested review from tswast and a team July 1, 2019 19:34
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jul 1, 2019
@tswast tswast added api: bigquery Issues related to the BigQuery API. api: bigquerystorage Issues related to the BigQuery Storage API. do not merge Indicates a pull request not ready for merge, due to either quality or timing. labels Jul 1, 2019
@tswast
Contributor

tswast commented Jul 1, 2019

Added the do not merge label. We'll need to wait for the Arrow changes to hit prod before merging.

@tswast
Contributor

tswast commented Jul 3, 2019

Once #8609 goes in, I'll aim to rebase this on top of that change, providing a fast version of to_arrow, too.

@tswast
Contributor

tswast commented Jul 8, 2019

to_arrow is added in #8609 using tabledata.list. Once that's in, we can update this PR to use the BQ Storage API for that method as well.

@tswast tswast changed the title BigQuery Storage: Add support for arrow format in BQ Read API Jul 11, 2019
@tswast
Contributor

tswast commented Jul 11, 2019

Let's wait for #8644 to be merged and released to PyPI before merging this one.

@tswast tswast removed api: bigquerystorage Issues related to the BigQuery Storage API. do not merge Indicates a pull request not ready for merge, due to either quality or timing. labels Jul 11, 2019
@tswast tswast requested a review from shollyman July 11, 2019 21:22
@tswast tswast merged commit 8852687 into googleapis:master Jul 12, 2019
Labels
api: bigquery Issues related to the BigQuery API. cla: yes This human has signed the Contributor License Agreement.
3 participants