Comments (2)
Hi @Linchin, thanks for the help! It turns out that client.extract_table had already created the parquet files, and the line extract_job.result() tried to do the exact same thing, which caused the retention policy error. I deleted the extract_job.result() call, as it seems to be unnecessary, and now the job runs correctly. Thanks again. Since it's solved on my end, I am closing this issue.
from python-bigquery.
Thank you @kansuke-at-trimble for raising the issue! From what I can see, this is working as intended. I suppose 'bucket-a/train_dataset/000000000124.parquet' already exists, and a retention policy means that objects in the bucket can only be deleted or replaced once their age is greater than the retention period [1]. In other words, the file cannot be overwritten during the retention period. One way to avoid the conflict is to use a unique file name, for example one that includes a UUID or the timestamp at which it was generated.
Also, the client doesn't run the extract job itself: it sends the request to the backend, and the backend does the work. That's why there is no code in the client library doing the extraction. I hope this answers your question, and please let us know if you have any further questions :)
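The unique-file-name suggestion could be sketched roughly as follows. This is only an illustration, not code from the library: the helper name `unique_destination_uri` and the folder layout are assumptions, and it uses only the Python standard library. It places each export under its own timestamp-plus-UUID folder so that a new run never targets an object still under retention.

```python
import uuid
from datetime import datetime, timezone


def unique_destination_uri(bucket: str, prefix: str, extension: str = "parquet") -> str:
    """Build a wildcard GCS URI inside a run-specific folder (hypothetical helper)."""
    # A UTC timestamp keeps the folders sortable; the short random suffix guards
    # against two runs that start within the same second.
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    run_id = uuid.uuid4().hex[:8]
    # The single "*" is left for the export to fill in with its own file numbering.
    return f"gs://{bucket}/{prefix}/{timestamp}-{run_id}/*.{extension}"


# Bucket and prefix are taken from the path mentioned in this thread.
destination_uri = unique_destination_uri("bucket-a", "train_dataset")
print(destination_uri)
```

The resulting string would then be passed as the destination URI to client.extract_table in place of the fixed path. Whether a per-run folder or a per-file suffix fits better depends on how the exported files are consumed downstream.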