Comments (2)

pmkc commented on August 27, 2024

I was able to reproduce and diagnose this.

All data to and from Google Cloud Storage is routed through the proxy, but I never wired the authorization channel through it. That cannot work when Google's OAuth endpoint is unreachable without the proxy.

I should be able to fix it in a PR by the end of the week at the latest.
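
To illustrate the diagnosis, here is a minimal sketch using only java.net, not the connector's actual code. It assumes fs.gs.proxy.address holds a host:port string; the class name, the proxy address proxy.example.internal:3128, and both URLs are placeholders chosen for the example. The point is only that the token request has to be opened through the same Proxy as the storage request.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

public class ProxyAwareConnections {

    /** Parses a "host:port" string (assumed shape of fs.gs.proxy.address) into an HTTP proxy. */
    static Proxy parseProxy(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostPort.length() - 1) {
            throw new IllegalArgumentException("Expected host:port, got: " + hostPort);
        }
        String host = hostPort.substring(0, colon);
        int port = Integer.parseInt(hostPort.substring(colon + 1));
        // createUnresolved avoids a DNS lookup for the proxy host at parse time.
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    /** Prepares (but does not connect) an HTTP(S) connection that will go through the proxy. */
    static HttpURLConnection open(String url, Proxy proxy) throws IOException {
        return (HttpURLConnection) URI.create(url).toURL().openConnection(proxy);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical proxy address, standing in for the configured value.
        Proxy proxy = parseProxy("proxy.example.internal:3128");

        // Placeholder URLs: one data request and one OAuth token request.
        HttpURLConnection data = open("https://www.googleapis.com/storage/v1/b/bucket-name", proxy);
        HttpURLConnection token = open("https://oauth2.googleapis.com/token", proxy);

        // Both connections are configured with the same proxy; nothing is sent yet.
        System.out.println("data via proxy:  " + data.usingProxy());
        System.out.println("token via proxy: " + token.usingProxy());
    }
}
```

In the failing setup, only the first kind of connection gets the proxy, so the token refresh tries to connect directly.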

from hadoop-connectors.

lukeFalsina commented on August 27, 2024

I am experiencing a similar proxy issue. I have been trying to set fs.gs.proxy.address so that my on-premises Hadoop cluster reaches GCS through a corporate proxy, using a command like hadoop dfs -ls gs://bucket-name.

I checked via tcpdump and the connector does not attempt to use the proxy at all.

I also tried setting fs.gs.http.transport.type to a different type, and exporting HADOOP_OPTS="$HADOOP_OPTS -Dhttp.proxyHost=... as suggested by @peay, but with no luck.

The exception raised is always the same (it was also raised before I set the fs.gs.proxy.address property):

~ HADOOP_ROOT_LOGGER='DEBUG,console' hadoop fs -ls gs://bucket-name
17/06/08 12:52:14 DEBUG util.Shell: setsid exited with exit code 0
17/06/08 12:52:15 DEBUG conf.Configuration: parsing URL jar:file:/usr/lib/hadoop/hadoop-common-2.6.0-cdh5.8.2.jar!/core-default.xml
17/06/08 12:52:15 DEBUG conf.Configuration: parsing input stream sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@53eb80e9
17/06/08 12:52:15 DEBUG conf.Configuration: parsing URL file:/etc/hadoop/conf/core-site.xml
...
17/06/08 12:52:16 DEBUG gcsio.ForwardingGoogleCloudStorage: GoogleCloudStorageImpl.getItemInfo(gs://bucket-name)
17/06/08 12:52:16 DEBUG gcsio.GoogleCloudStorage: getItemInfo(gs://bucket-name)
17/06/08 12:52:16 DEBUG gcsio.GoogleCloudStorage: getBucket(bucket-name)
17/06/08 12:52:16 DEBUG util.RetryHttpInitializer: Request is missing a user-agent, adding default value of 'GHFS/1.6.0-hadoop2'
17/06/08 12:52:16 DEBUG gcsio.GoogleCloudStorage: getBucket(bucket-name) threw exception:
java.net.SocketException: Network is unreachable                                                                                        
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:618)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
        at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:275)
        at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:371)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1091)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
        at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:972)
        at com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:283)
        at com.google.api.client.auth.oauth2.TokenRequest.execute(TokenRequest.java:307)
        at com.google.cloud.hadoop.util.CredentialFactory$GoogleCredentialWithRetry.executeRefreshToken(CredentialFactory.java:132)
        at com.google.api.client.auth.oauth2.Credential.refreshToken(Credential.java:489)
        at com.google.api.client.auth.oauth2.Credential.intercept(Credential.java:217)
        at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:859)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
        at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
        at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getBucket(GoogleCloudStorageImpl.java:1657)
        at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:1612)
        at com.google.cloud.hadoop.gcsio.ForwardingGoogleCloudStorage.getItemInfo(ForwardingGoogleCloudStorage.java:214)
        at com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfo(GoogleCloudStorageFileSystem.java:1093)
        at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getFileStatus(GoogleHadoopFileSystemBase.java:1413)
        at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:64)
        at org.apache.hadoop.fs.Globber.doGlob(Globber.java:285)
        at org.apache.hadoop.fs.Globber.glob(Globber.java:151)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1656)
        at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.globStatus(GoogleHadoopFileSystemBase.java:1583)
        at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.globStatus(GoogleHadoopFileSystemBase.java:1506)
        at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:326)
        at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
        at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
        at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
ls: Error accessing: bucket: bucket-name
17/06/08 12:52:16 DEBUG gcs.GoogleHadoopFileSystemBase: GHFS.close:
17/06/08 12:52:16 DEBUG gcs.GoogleHadoopFileSystemBase: GHFS.processDeleteOnExit:
17/06/08 12:52:16 DEBUG gcsio.GoogleCloudStorageFileSystem: close()
17/06/08 12:52:16 DEBUG gcsio.ForwardingGoogleCloudStorage: GoogleCloudStorageImpl.close()
17/06/08 12:52:16 DEBUG gcsio.GoogleCloudStorage: close()

Note that other programs that require network access run fine and pick up the proxy settings correctly.
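
Regarding the HADOOP_OPTS attempt above, one thing that may be worth ruling out: the JVM reads separate proxy properties per URL scheme, so http.proxyHost only applies to http:// URLs, while https:// requests (the trace goes through HttpsClient) consult https.proxyHost and https.proxyPort. The sketch below is a standalone check, not part of the connector; the class name and default URL are placeholders. It prints which proxy the JVM-wide default selector would pick for a given URL under the flags it was started with.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class ProxySelectionCheck {

    /** Returns the first proxy the JVM-wide default selector would use for the URL. */
    static Proxy proxyFor(String url) {
        List<Proxy> proxies = ProxySelector.getDefault().select(URI.create(url));
        return proxies.isEmpty() ? Proxy.NO_PROXY : proxies.get(0);
    }

    public static void main(String[] args) {
        // Placeholder default; pass the URL of interest as the first argument.
        String url = args.length > 0 ? args[0] : "https://www.googleapis.com/";
        System.out.println(url + " -> " + proxyFor(url));
    }
}
```

Running it once with only the http.* flags and once with the https.* flags shows whether the JVM options affect HTTPS traffic at all. This only covers code that relies on the default selector; it says nothing about how the connector builds its own transport.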

