BigQuery: Add ability to pass in a table ID instead of a query to the %%bigquery magic. #9170

Merged: 6 commits, Sep 23, 2019

Contributor

@shubha-rajan shubha-rajan commented Sep 4, 2019

Third of 3 PRs towards resolving #9105 as described in review for #9147
Screenshot of feature in notebook:
image

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Sep 4, 2019
added default patch to unit tests
@shubha-rajan shubha-rajan force-pushed the bq-table-id-instead-of-query branch from 6761217 to 77e62c5 Compare September 6, 2019 05:33
@shubha-rajan shubha-rajan marked this pull request as ready for review September 6, 2019 14:49
@shubha-rajan shubha-rajan requested a review from a team September 6, 2019 14:49
    rows = client.list_rows(table_id, max_results=max_results)
except Exception as ex:
    error = str(ex)
if error:
Contributor

Could this be simplified to return from inside the exception handler?

Contributor Author

it definitely can. fixed in 7a13ca4
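
The simplification discussed in this thread can be sketched as follows. This is a minimal illustration, not the code from the PR: `fetch_rows_flag`, `fetch_rows_early_return`, and the `fetch` callable are hypothetical names, with `fetch` standing in for the `client.list_rows(...)` call.

```python
def fetch_rows_flag(fetch):
    """Original shape: record the error, then branch on it afterwards."""
    error = None
    rows = None
    try:
        rows = fetch()  # stand-in for client.list_rows(...)
    except Exception as ex:
        error = str(ex)
    if error:
        print("ERROR: " + error)
        return None
    return rows


def fetch_rows_early_return(fetch):
    """Simplified shape: report and return from inside the handler."""
    try:
        rows = fetch()  # stand-in for client.list_rows(...)
    except Exception as ex:
        print("ERROR: " + str(ex))
        return None
    return rows
```

Besides being shorter, the early-return form does not depend on the truthiness of the message: an exception whose message is an empty string would not trigger an `if error:` check.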

@@ -434,6 +444,26 @@ def _cell_magic(line, query):
    else:
        max_results = None

    error = None

    if not re.search(r"\s", query.rstrip()):
Contributor

What scenario is this protecting against? Is there a scenario where rstrip() does not remove whitespace characters from a unicode string?

What do we expect in the else case?

Contributor

Related: Does this repository enforce code coverage? Is there a test that covers whitespace that rstrip() does not remove, so that this branch is not taken?

Contributor Author

This condition is actually testing for queries that contain whitespace characters that aren't removed by rstrip (so any whitespace that isn't trailing, such as a trailing newline). The assumption being made (as described in #9105) is that anything containing whitespace is a SQL query and won't take this branch, while any string without whitespace is a table_id and will take this branch.

So any regular SQL query would fall into the else case. There are tests that check whether query strings without spaces are interpreted as table IDs, and a test that checks that a string without whitespace that isn't a valid table_id raises an appropriate error message.
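
A minimal sketch of the heuristic described above, using the same `re.search(r"\s", query.rstrip())` test that appears in the diff. The helper name `looks_like_table_id` is hypothetical and only wraps the condition for illustration.

```python
import re


def looks_like_table_id(cell_body):
    """Hypothetical helper: True when the cell body contains no whitespace
    apart from trailing whitespace, so it is assumed to be a table ID."""
    return not re.search(r"\s", cell_body.rstrip())


# A bare table ID with a trailing newline takes the table-ID branch.
is_table = looks_like_table_id("bigquery-public-data.samples.shakespeare\n")

# Anything with interior whitespace is assumed to be a SQL query.
is_query = not looks_like_table_id("SELECT 17 AS num")
```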

Contributor Author

I can add a comment in the code if that would be helpful.

Contributor

I would also write this down in a comment, i.e. that anything without whitespace is assumed to be a table identifier, which triggers a different use case of the magic command.

Contributor Author

added comment in d663104

@plamut plamut added the api: bigquery Issues related to the BigQuery API. label Sep 19, 2019
Contributor

@plamut plamut left a comment

The change generally works.

I did notice, however, that the logic is sensitive to the leading whitespace:

┌───────────────────────────────────────────┐
│ %%bigquery --max_results=6                │
│  bigquery-public-data.samples.shakespeare │
└───────────────────────────────────────────┘
  Executing query with job ID: a0f195e6-f748-4c42-8c3b-d89abaa37c9f
  Query executing: 0.92s

  ERROR:
    400 Syntax error: Unexpected identifier "bigquery" at [1:3]

Wouldn't it be better to strip the whitespace on both sides first, or is there a reason we only call .rstrip()?
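
The effect of the leading space can be reproduced in isolation. This is a small sketch using only the regular expression from the diff; the variable names are illustrative and not taken from the PR.

```python
import re

# Cell body with one leading space, as in the notebook cell above.
cell_body = " bigquery-public-data.samples.shakespeare\n"

# With rstrip() only, the leading space survives and matches \s,
# so the cell is treated as a SQL query.
treated_as_query_rstrip = bool(re.search(r"\s", cell_body.rstrip()))

# With strip(), both ends are trimmed and no whitespace remains,
# so the cell would be treated as a table ID.
treated_as_query_strip = bool(re.search(r"\s", cell_body.strip()))
```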

try:
    query_job = _run_query(client, query, job_config=job_config)
except Exception as ex:
    error = str(ex)
    return _print_error(str(ex), args.destination_var)
Contributor

This can be slightly confusing for the reader at first glance, because _print_error() itself does not return anything; it only has a side effect. I would simply express the same thing in two lines (the call, then a sole return on its own line).
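
To illustrate the point, here is a short sketch comparing the two spellings. `_print_error_sketch` is a hypothetical stand-in that, like the function described in this comment, prints and returns nothing; the two `run_*` functions are also illustrative names.

```python
def _print_error_sketch(error, destination_var=None):
    """Stand-in for _print_error: side effect only, implicitly returns None."""
    print("ERROR: {}".format(error))


def run_one_line(action):
    try:
        result = action()
    except Exception as ex:
        # Reads as if a value is being returned, but the value is None.
        return _print_error_sketch(str(ex))
    return result


def run_two_lines(action):
    try:
        result = action()
    except Exception as ex:
        # Same behavior, with the side effect and the exit made explicit.
        _print_error_sketch(str(ex))
        return
    return result
```

Both functions behave identically; the second form only makes it clearer to the reader that nothing is returned on failure.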

Contributor Author

@shubha-rajan shubha-rajan Sep 21, 2019

fixed in 6f2dd64

@@ -434,6 +444,21 @@ def _cell_magic(line, query):
    else:
        max_results = None

    if not re.search(r"\s", query.rstrip()):
        table_id = query.rstrip()
Contributor

BTW, stripping whitespace from the same value again is not necessary; we could store the stripped string in a variable the first time we do it.

Contributor Author

fixed in d663104

@shubha-rajan shubha-rajan requested a review from plamut September 21, 2019 00:46
Contributor

@plamut plamut left a comment

LGTM, thanks!

It might cause a merge conflict with #9245, but I'll address that there.

@plamut plamut merged commit ee0f70a into googleapis:master Sep 23, 2019
Labels
api: bigquery Issues related to the BigQuery API. cla: yes This human has signed the Contributor License Agreement.
4 participants