Comments (3)
Yes, please open a new issue
from spark-bigquery-connector.
Reading directly from a BigTable external table is not supported by the BigQuery Storage Read API we use. As a workaround, you can run a query and load its result:
bq_df = spark.read.format("bigquery") \
.option("viewsEnabled", True) \
.option("materializationProject", "<Project Name>") \
.option("materializationDataset", "<Dataset Name>") \
.load("SELECT * FROM <Project Name>.<Dataset Name>.<Table Name>")
Notice that there's some additional cost to that.
from spark-bigquery-connector.
@davidrabinowitz that actually fails as well, but it looks like it's for a different reason. Is this worth opening a different issue for, or am I missing something?
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
[<ipython-input-11-89b564a6f2ea>](https://localhost:8080/#) in <cell line: 1>()
3 .option("materializationProject", "dsci-632-390302") \
4 .option("materializationDataset", "iris_dataset") \
----> 5 .load("SELECT * FROM dsci-632-390302.iris_dataset.iris")
3 frames
[/content/spark-3.5.0-bin-hadoop3/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py](https://localhost:8080/#) in get_return_value(answer, gateway_client, target_id, name)
324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
325 if answer[1] == REFERENCE_TYPE:
--> 326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
328 format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o35.load.
: com.google.cloud.bigquery.connector.common.BigQueryConnectorException: Error creating destination table using the following query: [SELECT * FROM dsci-632-390302.iris_dataset.iris]
at com.google.cloud.bigquery.connector.common.BigQueryClient.materializeTable(BigQueryClient.java:648)
at com.google.cloud.bigquery.connector.common.BigQueryClient.materializeQueryToTable(BigQueryClient.java:588)
at com.google.cloud.bigquery.connector.common.BigQueryClient.getReadTable(BigQueryClient.java:390)
at com.google.cloud.spark.bigquery.BigQueryRelationProvider.createRelationInternal(BigQueryRelationProvider.scala:77)
at com.google.cloud.spark.bigquery.BigQueryRelationProvider.createRelation(BigQueryRelationProvider.scala:46)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:346)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:229)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:211)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:186)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Request couldn't be served.
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2087)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache.get(LocalCache.java:4019)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4933)
at com.google.cloud.bigquery.connector.common.BigQueryClient.materializeTable(BigQueryClient.java:636)
... 22 more
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Request couldn't be served.
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:114)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.create(HttpBigQueryRpc.java:219)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl$5.call(BigQueryImpl.java:410)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl$5.call(BigQueryImpl.java:399)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:103)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryRetryHelper.run(BigQueryRetryHelper.java:86)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryRetryHelper.runWithRetries(BigQueryRetryHelper.java:49)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl.create(BigQueryImpl.java:398)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl.create(BigQueryImpl.java:363)
at com.google.cloud.bigquery.connector.common.BigQueryClient.create(BigQueryClient.java:498)
at com.google.cloud.bigquery.connector.common.BigQueryClient$TempTableBuilder.createTableFromQuery(BigQueryClient.java:898)
at com.google.cloud.bigquery.connector.common.BigQueryClient$TempTableBuilder.call(BigQueryClient.java:885)
at com.google.cloud.bigquery.connector.common.BigQueryClient$TempTableBuilder.call(BigQueryClient.java:860)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4938)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3576)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2318)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2191)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2081)
... 25 more
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
POST https://www.googleapis.com/bigquery/v2/projects//jobs?prettyPrint=false
{
"code": 404,
"errors": [
{
"domain": "global",
"message": "Request couldn't be served.",
"reason": "notFound"
}
],
"message": "Request couldn't be served.",
"status": "NOT_FOUND"
}
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest$3.interceptResponse(AbstractGoogleClientRequest.java:466)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:552)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:493)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:603)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.create(HttpBigQueryRpc.java:217)
... 41 more
from spark-bigquery-connector.
Related Issues (20)
- Direct writemethod not working in Databricks for Spark 3.5 HOT 5
- Idempotent write support in BQ
- JARs marked 'latest' not being updated HOT 1
- Automatically read JSON types
- Storage Read API logging HOT 5
- BigQuery Pushdown filtering on Spark 3.4.2 HOT 10
- BIGNUMERIC Precision Handling: Inaccurate Decimal Values HOT 2
- java.lang.NoSuchMethodError: 'org.apache.spark.sql.catalyst.encoders.ExpressionEncoder org.apache.spark.sql.catalyst.encoders.RowEncoder.apply(org.apache.spark.sql.types.StructType)' HOT 4
- How to handle column datatype change? HOT 8
- Add `createTable(TableInfo)` method to `BigQueryClient`
- Error: This connector was made for Scala null, it was not meant to run on Scala 2.12 HOT 2
- Predicate pushdown doesn't work with DateTime BQ field - Spark 3.5, connector version 0.37 HOT 1
- error while writing with many timestamp data
- Request to Stop Enforcing Delete Permissions on Materialization Dataset HOT 1
- PySpark virtualenv: Missing dependencies lead to No FileSystem for scheme "gs" HOT 3
- Unable to write to a partitioned table in spark 3.5 with connector version spark-3.4-bigquery-0.34.0 HOT 6
- TIMESTAMP appears to add an hour compared to DATETIME HOT 2
- Indirect write to existing datetime column not possible HOT 2
- Configuring DataProc Cluster to Publish BigQuery Metrics on GCP HOT 1
- BigNumeric precision is too wide (76), Spark can only handle decimal types with max precision of 38 HOT 6
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from spark-bigquery-connector.