dataproc-jupyter-plugin's People

Contributors

aditee-accenture, danelias, dependabot[bot], harsha-accenture, jeyaprakash-nk, jinnthehuman, medb, ojarjur, ptwng, saranyaloganathan23, shubha-accenture, ywskycn

dataproc-jupyter-plugin's Issues

Dev branch review

Some of these comments are generic and the fix should be applied throughout the codebase.

Please use explicit types and avoid using any

.then((responseResult: any) => {
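One way to avoid any here is to describe the expected response once and narrow the parsed payload into it. The following is a minimal sketch; IApiError, IListResponse, ICluster and parseClusterList are hypothetical names, and the real field names depend on the API being called.

```typescript
// Hypothetical response shapes; the real fields depend on the API being called.
interface IApiError {
  code: number;
  message: string;
}

interface IListResponse<T> {
  items?: T[];
  error?: IApiError;
}

interface ICluster {
  clusterName: string;
  status: string;
}

// Narrow an untyped JSON payload into a typed response instead of passing `any` along.
function parseClusterList(payload: unknown): IListResponse<ICluster> {
  if (typeof payload !== 'object' || payload === null) {
    throw new Error('Unexpected response payload');
  }
  return payload as IListResponse<ICluster>;
}

const responseResult = parseClusterList(
  JSON.parse('{"items":[{"clusterName":"c1","status":"RUNNING"}]}')
);

// The compiler now knows that `items` holds ICluster objects.
const clusterNames: string[] = (responseResult.items ?? []).map(
  cluster => cluster.clusterName
);
```

With a typed result, a misspelled field such as responseResult.itmes becomes a compile-time error rather than a silent undefined at runtime.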

This pattern is common throughout the code. When there are multiple checks like this and only one of them can ever be true, we should move the logic into a separate util function and use its result for rendering.

Remove all instances of //@ts-ignore or add a comment so that the reasoning is clear

Remove commented out code

Commonly used strings should not be hard coded.

<div className="no-data-style">No rows to display</div>

This can be simplified to return selectedMode === toggleItem ? 'selected-header' : 'unselected-header'

if (selectedMode === toggleItem) {

If this check is here to handle a specific case, an error code alone is too broad, so we should use a more specific discriminator in responseResult to check for it. Regardless, we also need a comment describing exactly which case(s) we are setting the error view for.

if (responseResult.error && responseResult.error.code === 404) {

This table creation pattern can be generalized and reused across all the details pages so that we don't have to redevelop tables each time.

<div className="cluster-details-container">

Does data.message work in both the case where the JSON.parse call succeeds and fails?

throw new ServerConnection.ResponseError(response, data.message);
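A possible way to make the message safe in both cases is to derive it from the raw body text, falling back to that text when parsing fails or when no message field is present. This is only a sketch; extractErrorMessage is a hypothetical helper, not part of the plugin or of JupyterLab.

```typescript
// Hypothetical helper: returns a usable error message whether or not the body is JSON.
function extractErrorMessage(body: string): string {
  try {
    const parsed: unknown = JSON.parse(body);
    if (
      typeof parsed === 'object' &&
      parsed !== null &&
      'message' in parsed &&
      typeof (parsed as { message: unknown }).message === 'string'
    ) {
      return (parsed as { message: string }).message;
    }
  } catch {
    // Body was not valid JSON; fall through and use the raw text.
  }
  return body;
}

console.log(extractErrorMessage('{"message":"Not found"}'));
console.log(extractErrorMessage('<html>502 Bad Gateway</html>'));
```

The result could then be passed as the second argument of the ResponseError shown above, so the error never carries an undefined message.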

Use useState<T> where T is the type of the default value. Also, if the default value is shared between states, we should not duplicate work there.

const [jobInfo, setjobInfo] = useState({

Function doesn't seem to return undefined. We can remove that return type. I see this in other util functions as well

export const jobTypeValue = (data: IJobData): string | undefined => {

The params should either both be strings or both be a Date type. If we are sticking with a string param type, we need to handle invalid dates

export const elapsedTime = (endTime: string, jobStartTime: Date): string => {
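As an illustration of the first option, here is a sketch in which both parameters are Date objects and invalid or reversed dates yield a placeholder. The output format and the '-' placeholder are assumptions, not the plugin's current behavior.

```typescript
// Sketch only: both parameters are Dates; invalid or reversed dates return a placeholder.
function elapsedTime(endTime: Date, startTime: Date): string {
  const elapsedMs = endTime.getTime() - startTime.getTime();
  // getTime() is NaN for an invalid Date, so the difference is NaN as well.
  if (isNaN(elapsedMs) || elapsedMs < 0) {
    return '-';
  }
  const totalSeconds = Math.floor(elapsedMs / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;

  const parts: string[] = [];
  if (hours > 0) {
    parts.push(`${hours} hr`);
  }
  if (minutes > 0) {
    parts.push(`${minutes} min`);
  }
  parts.push(`${seconds} sec`);
  return parts.join(' ');
}

console.log(
  elapsedTime(new Date('2023-09-26T10:01:05Z'), new Date('2023-09-26T10:00:00Z'))
); // "1 min 5 sec"
console.log(elapsedTime(new Date('not a date'), new Date())); // "-"
```

Callers that only have strings would construct the Date at the call site, which keeps the validation in one place.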

Buttons should use the button html tag instead of div for accessibility

<div className="popup-button-style" onClick={onCancel}>

Performance issues loading batches

If the list of batches is very large (the project I tested on had 10k+ batches), either the tab crashes with an out-of-memory error or the batches list table reverts to a "no results" state after showing partial results.

I suspect the following things are happening:

  1. The plugin requests all pages of the batches list during initialization instead of requesting a handful and continuing as the user pages through the list.
  2. The batch info that the plugin requests is stored in memory, which can grow very large and cause OOM errors.
  3. Due to the volume of requests sent in a short period of time, the server returns with a 429 error.
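If those suspicions hold, one mitigation is to fetch a single page only when the user asks for it and to keep just that page in memory. The sketch below illustrates the idea with a fake fetch function; BatchPager, FetchPage and the response shape are hypothetical and only loosely modeled on a page-token style list API.

```typescript
// Hypothetical shapes, loosely modeled on a paged list API with page tokens.
interface IBatch {
  name: string;
}

interface IBatchPage {
  batches: IBatch[];
  nextPageToken?: string;
}

type FetchPage = (pageSize: number, pageToken?: string) => Promise<IBatchPage>;

// Fetches one page per call, only on demand, and returns just that page
// so the caller never has to hold the whole list in memory.
class BatchPager {
  private nextToken: string | undefined = undefined;
  private exhausted = false;

  constructor(private fetchPage: FetchPage, private pageSize: number) {}

  hasMore(): boolean {
    return !this.exhausted;
  }

  async next(): Promise<IBatch[]> {
    if (this.exhausted) {
      return [];
    }
    const page = await this.fetchPage(this.pageSize, this.nextToken);
    this.nextToken = page.nextPageToken;
    this.exhausted = !page.nextPageToken;
    return page.batches;
  }
}

// Stand-in for the real HTTP call: serves 5 fake batches and counts requests.
let requestCount = 0;
const fakeFetch: FetchPage = async (pageSize, pageToken) => {
  requestCount += 1;
  const all: IBatch[] = ['b0', 'b1', 'b2', 'b3', 'b4'].map(name => ({ name }));
  const start = pageToken ? parseInt(pageToken, 10) : 0;
  const end = Math.min(start + pageSize, all.length);
  return {
    batches: all.slice(start, end),
    nextPageToken: end < all.length ? String(end) : undefined
  };
};
```

Because requests are issued one at a time in response to user paging, this approach would also reduce the burst of traffic that may be triggering the 429 responses.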

Exception when networks list is empty

https://github.com/GoogleCloudDataproc/dataproc-jupyter-plugin/blame/e06b9fdd41bc9819d7c23d9505ef079479789003/src/runtime/runtimeService.tsx#L481

Some users or service accounts may not have permission to list networks, or the project may not have any networks.

Because the code accesses index [0] of the networks array, it throws the exception "Error listing Networks" when the array is empty.

The exception is not specific: it could be raised either by the HTTP request failing or by the networks list being empty. Could we please add a check before line https://github.com/GoogleCloudDataproc/dataproc-jupyter-plugin/blame/e06b9fdd41bc9819d7c23d9505ef079479789003/src/runtime/runtimeService.tsx#L481

such as:

setNetworklist(transformedNetworkList);
if (selectedRuntimeClone === undefined) {
  if (transformedNetworkList.length > 0) {
    setNetworkSelected(transformedNetworkList[0]);
  } else {
    DataprocLoggingService.log('No networks found. Account may lack access to list networks', LOG_LEVEL.ERROR);
    toast.error(`No networks found. Account may lack access to list networks.`, toastifyCustomStyle);
  }
}

incorrect persistent history server uri for dataproc serverless notebook

Error message

Error from Gateway: [Bad Request] failure creating a backend resource: failure starting the kernel creation: failure starting the kernel creation: failure creating session: [400 Bad Request] generic::invalid_argument: com.google.cloud.hadoop.services.common.error.DataprocException: Cluster name 'projects/my-project-id/locations/my-location/clusters/my-phs-cluster-name' must conform to ^(?:/?/?dataproc\.googleapis\.com/)?projects/([^/]+)/regions/([^/]+)/clusters/([^/]+) pattern (INVALID_ARGUMENT)
. Ensure gateway url is valid and the Gateway instance is running.

The regex in the error message expects a "regions" path segment, but the string passed to it uses "locations".
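The mismatch can be reproduced in isolation with the pattern and the cluster name taken from the error message. The "accepted" value below is only a hypothetical illustration of a string the pattern allows; where the path should actually be corrected is not established here.

```typescript
// Pattern copied from the error message above.
const clusterNamePattern = new RegExp(
  '^(?:/?/?dataproc\\.googleapis\\.com/)?projects/([^/]+)/regions/([^/]+)/clusters/([^/]+)'
);

// Value reported in the error message.
const rejected =
  'projects/my-project-id/locations/my-location/clusters/my-phs-cluster-name';

// Hypothetical value for comparison: the same path with "regions" in place of "locations".
const accepted = rejected.replace('/locations/', '/regions/');

console.log(clusterNamePattern.test(rejected)); // false
console.log(clusterNamePattern.test(accepted)); // true
```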

Steps

  • Open JupyterLab
  • Click "New Runtime Template" under "Dataproc Serverless Notebooks" in the Launcher
  • Select an existing "Persistent Spark History Server" from the drop-down menu
  • Click "Save" (after filling out the other required configuration fields)
  • Back in the Launcher, create a new Dataproc Serverless notebook with the newly created template

Environment

# OS
Ubuntu 20.04 LTS x86_64

# Python version
Python 3.10.13

# Relevant Python dependencies
jupyterlab==4.0.6
dataproc_jupyter_plugin==0.1.9

# output of gcloud version
Google Cloud SDK 448.0.0
beta 2023.09.22
bq 2.0.98
bundled-python3-unix 3.9.16
core 2023.09.22
gsutil 5.25

default subnetwork doesn't appear when creating a new runtime template

Steps

  • Click "New Runtime Template" from the Launcher
  • Look at "Network Configuration": "Primary network" and "Subnetwork" both say default. Clicking the subnetwork drop-down menu shows this image:
    Screenshot from 2023-09-26 11-41-10
  • Click the primary network drop-down menu and select a different primary network
  • Click the primary network drop-down menu again and select the "default" primary network; the correct default subnetwork (default-123456...) now appears
  • Clicking the subnetwork drop-down menu now also shows a list of the other available subnetworks

Expected behavior

  • The default subnetwork should be prepopulated when the runtime template config opens, so that the user does not have to switch the primary network back and forth to get the correct subnetwork value

Environment

# OS
Ubuntu 20.04 LTS x86_64

# Python version
Python 3.10.13

# Relevant Python dependencies
jupyterlab==4.0.6
dataproc_jupyter_plugin==0.1.9

# output of gcloud version
Google Cloud SDK 448.0.0
beta 2023.09.22
bq 2.0.98
bundled-python3-unix 3.9.16
core 2023.09.22
gsutil 5.25

License header is missing in multiple files

Here is the list of files that need fixes.
- ❌ Expected to find CONTRIBUTING, CONTRIBUTING.txt or CONTRIBUTING.md
- ❌ /babel.config.js: Missing a required license header (or its header was not recognized)
- ❌ /conftest.py: Missing a required license header (or its header was not recognized)
- ❌ /dataproc_plugin/__init__.py: Missing a required license header (or its header was not recognized)
- ❌ /dataproc_plugin/handlers.py: Missing a required license header (or its header was not recognized)
- ❌ /dataproc_plugin/tests/__init__.py: Missing a required license header (or its header was not recognized)
- ❌ /dataproc_plugin/tests/test_handlers.py: Missing a required license header (or its header was not recognized)
- ❌ /jest.config.js: Missing a required license header (or its header was not recognized)
- ❌ /setup.py: Missing a required license header (or its header was not recognized)
- ❌ /src/batches/batchDetails.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/batches/batches.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/batches/listBatches.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/cluster/cluster.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/cluster/clusterDetails.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/cluster/listCluster.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/handler/handler.ts: Missing a required license header (or its header was not recognized)
- ❌ /src/index.ts: Missing a required license header (or its header was not recognized)
- ❌ /src/jobs/jobDetails.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/jobs/jobs.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/jobs/labelProperties.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/jobs/submitJob.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/login/authLogin.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/login/configSelection.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/sessions/listSessions.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/sessions/sessionDetails.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/svg.d.ts: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/batchService.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/clusterServices.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/const.ts: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/deletePopup.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/globalFilter.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/jobServices.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/sessionService.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/statusDisplay.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/tableData.tsx: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/utils.ts: Missing a required license header (or its header was not recognized)
- ❌ /src/utils/viewLogs.tsx: Missing a required license header (or its header was not recognized)
- ❌ /style/index.js: Missing a required license header (or its header was not recognized)
- ❌ /ui-tests/jupyter_server_test_config.py: Missing a required license header (or its header was not recognized)
- ❌ /ui-tests/playwright.config.js: Missing a required license header (or its header was not recognized)
- ❌ /ui-tests/tests/dataproc_plugin.spec.ts: Missing a required license header (or its header was not recognized)

The "create runtime template" UI does not give the user any indication about what is wrong if their subnetwork does not have Private Google Access enabled.

Dataproc Serverless Sessions require Private Google Access to be enabled on the subnetwork they use.

Accordingly, the UI provided by the plugin for creating a runtime template filters out any subnetworks that do not have this enabled, but it does not give the user any indication as to why they are being filtered, or even the fact that they are filtered out.

It would be much better from a usability standpoint if the user saw that the subnetwork was found but is not supported, and it would be even better from a discoverability perspective if there was some sort of message indicating why it was not supported.
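A rough sketch of that suggestion: instead of dropping unsupported subnetworks, map every subnetwork to a menu option and mark the unsupported ones as disabled with an explanatory hint. The privateIpGoogleAccess field comes from the Compute Engine subnetwork resource; ISubnetworkOption, PGA_HINT and toSubnetworkOptions are hypothetical names, and how the hint is rendered is left to the UI.

```typescript
// Minimal subset of a Compute Engine subnetwork resource. privateIpGoogleAccess
// reports whether Private Google Access is enabled on the subnetwork.
interface ISubnetwork {
  name: string;
  privateIpGoogleAccess?: boolean;
}

// Hypothetical shape for an entry in the subnetwork drop-down menu.
interface ISubnetworkOption {
  name: string;
  disabled: boolean;
  hint?: string;
}

const PGA_HINT = 'Private Google Access is not enabled on this subnetwork';

// Keep every subnetwork visible; disable the unsupported ones and say why.
function toSubnetworkOptions(subnetworks: ISubnetwork[]): ISubnetworkOption[] {
  return subnetworks.map((subnet): ISubnetworkOption => {
    if (subnet.privateIpGoogleAccess === true) {
      return { name: subnet.name, disabled: false };
    }
    return { name: subnet.name, disabled: true, hint: PGA_HINT };
  });
}

const subnetworkOptions = toSubnetworkOptions([
  { name: 'default', privateIpGoogleAccess: true },
  { name: 'legacy-subnet', privateIpGoogleAccess: false },
  { name: 'unknown-subnet' }
]);
console.log(subnetworkOptions);
```

Treating a missing privateIpGoogleAccess value as unsupported is a conservative choice made for this sketch.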
