Comments (3)
This should be fixed now as of commit 141b1ef.
To test, just run:
git clone https://github.com/GoogleCloudPlatform/bigdata-interop.git
cd bigdata-interop
mvn -P hadoop1 package
# Or, for Hadoop 2:
mvn -P hadoop2 package
And you should find the file "gcs/target/gcs-connector-*-shaded.jar" available for use. To plug it into bdutil, simply run gsutil cp gcs/target/gcs-connector-*-shaded.jar gs://<your-bucket>/some-path/
and then edit bdutil/bdutil_env.sh for Hadoop 1 or bdutil/hadoop2_env.sh for Hadoop 2 to change:
GCS_CONNECTOR_JAR='https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-1.4.1-hadoop2.jar'
so that it points at your gs://<your-bucket>/some-path/ path instead; bdutil automatically detects that you're using a gs://-prefixed URI and will do the right thing during deployment.
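For instance, the edited line might look like the following (the jar name here is a hypothetical example; use whatever name your mvn package step actually produced):

```shell
# bdutil/bdutil_env.sh (or bdutil/hadoop2_env.sh for Hadoop 2)
# Hypothetical example path and jar name; substitute your own bucket
# and the shaded jar produced by your build.
GCS_CONNECTOR_JAR='gs://<your-bucket>/some-path/gcs-connector-1.4.2-SNAPSHOT-shaded.jar'
```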
Please let us know if it fixes the issue for you!
from hadoop-connectors.
It looks good now, I don't get those errors anymore, thanks!
Though, I'm curious about your solution. I expected to see some exponential backoff or rate limiting of the API requests; instead, I saw that you check whether the write operation actually succeeded even when it returned an error.
Is this always the case? Does GCS always write the object when it answers with a rateLimit error, so that no exponential-backoff retries are needed?
Right, we considered whether exponential backoff would be appropriate; we already plug in a lower-level retry initializer that performs exponential backoff on 5xx errors. One consideration is that the initial backoffs for 5xx errors are quite short, since under normal circumstances those errors mean that simply re-sending the request and getting routed to a fresh frontend should fix the problem. The rate buckets behind these particular 429 errors, on the other hand, can refill on the order of seconds, so there would be several retries that fail with a 429 again, ultimately really slowing down the startup of Spark jobs.
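To illustrate the timing mismatch (the intervals below are assumptions for illustration, not the connector's actual settings), a doubling backoff schedule tuned for transient 5xx errors starts with sub-second waits:

```java
import java.util.Arrays;

// Sketch of an exponential-backoff schedule like the one a lower-level
// retry initializer might use. A short initial interval is fine for
// transient 5xx errors, but if a 429's rate bucket refills over
// seconds, the first several retries would all fail with 429 again.
public class BackoffSchedule {
    // Returns the wait (in ms) before each retry: initial, 2x, 4x, ...
    static long[] schedule(long initialMillis, int retries) {
        long[] waits = new long[retries];
        long delay = initialMillis;
        for (int i = 0; i < retries; i++) {
            waits[i] = delay;
            delay *= 2; // double the wait on each failed attempt
        }
        return waits;
    }

    public static void main(String[] args) {
        // Assumed 250 ms initial interval, five retries.
        System.out.println(Arrays.toString(schedule(250, 5)));
    }
}
```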
In this case, it's not quite the original GCS write that succeeded (in fact, on a rateLimit error, we should expect the request that errored out not to have been written); rather, other concurrent writers, possibly on other machines, caused the objects to be written.
In this particular case of "empty objects" used as directory placeholders, that means we can optimize away the need to retry over the course of several seconds, since the rate limit likely means another writer already did our job for us (which is also why we have to check the metadata, in cases where createEmptyObjects is used with more advanced metadata).
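A minimal sketch of that idea, with invented class and method names and an in-memory map standing in for GCS (this is not the connector's real API): on a 429, verify whether a concurrent writer already created the placeholder instead of backing off and retrying.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: treat a rateLimit error on a directory-placeholder
// write as success if the object already exists with the expected
// (empty) contents, since another concurrent writer likely created it.
public class EmptyObjectCreator {
    // Simulated bucket: object name -> contents.
    private final Map<String, byte[]> bucket = new ConcurrentHashMap<>();

    static final int OK = 200;
    static final int RATE_LIMIT = 429;

    // Stand-in for the raw insert RPC. Here, a second write of the same
    // name simulates losing the race to a concurrent writer and getting
    // a 429 back.
    int rawInsert(String name) {
        if (bucket.containsKey(name)) {
            return RATE_LIMIT;
        }
        bucket.put(name, new byte[0]);
        return OK;
    }

    boolean createEmptyObject(String name) {
        int code = rawInsert(name);
        if (code == OK) {
            return true;
        }
        if (code == RATE_LIMIT) {
            // Rate limit on a placeholder write usually means another
            // writer created the same object; check it instead of
            // sleeping and retrying for several seconds.
            byte[] data = bucket.get(name);
            return data != null && data.length == 0;
        }
        return false; // other errors would go through backoff/retry
    }
}
```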
Related Issues (20)
- BQ storage library blocked on update to grpc v1.56
- GoogleCloudStorageFileSystem#delete recursive does not page
- Memory issues while running Apache Spark streaming applications on Google Dataproc cluster | OutOfMemoryError Java heap space
- flume sink hdfs to gcs, all gcs write threads blocked
- how to transfer file from local to gcs bucket using dataproc hadoop in intellij
- GCS Connector fails with StackOverflowError while accessing hadoop credentials
- GhfsStorageStatistics cannot be cast ERROR
- Support disabling automatic decompression of gzip files in GCS connector
- gcs-connector 3.0 not working with pyspark
- gcs-connector:3.0.0 failing due to certificate when accessing GCS from Github runner with WIF configuration
- Feature request: automatic identity deduction a la google.auth.default()
- gcs-connector-3.0.0-shaded CVEs
- How can I sink GCS connector metrics into GCP Cloud Monitor?
- globStatus should prioritize server-side filtering over listing all files and performing local matches
- Conversion from InputStream -> ByteBuffer on gRPC writes creates many byte[] allocations.
- Bug in `GoogleCloudStorageReadChannel` can cause an infinite loop
- hadoop3-2.2.22 and hadoop3-2.2.23 throw NoSuchMethodError at ServiceOptions.getService
- gcs-connector- CVE
- GCS connector throws rate limit errors
- Could not initialize class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemConfiguration