Comments (6)

rapoth commented on August 24, 2024

@Ragavenderan: Thanks for raising this! Can you give us a sample Scala method and how you intend to use it within Spark?

from spark.

Ragavenderan commented on August 24, 2024

Currently we have a JAR file that contains the function definitions.
This is how we call the method from PySpark:

from pyspark.sql import SparkSession, DataFrame

def load_curated_email(spark, category, startdate="", enddate="", location=""):
    df = spark._jvm.com.microsoft.odinml.Extractor.loadCuratedEmail(spark._jsparkSession, category, startdate, enddate, location)
    return DataFrame(df, spark)

imback82 commented on August 24, 2024

The proper way to do this is to introduce a new data source, so that you can do:

var df = spark.Read().Format("your_format_here").Option("startdate","something").Load();

Have you considered this option?

Otherwise, we have no plans to support calling Scala methods from .NET, other than RegisterJava for UDFs.
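
For comparison, here is a rough sketch of what the PySpark wrapper from the earlier comment might look like if the Scala logic were exposed as a data source. It is only an illustration: "your_format_here" is the placeholder from the snippet above, and the option names other than "startdate" ("category", "enddate", "location") are assumptions borrowed from the wrapper's parameters, not a confirmed contract of any existing data source.

```python
def load_curated_email(spark, category, startdate="", enddate="", location=""):
    """Load curated email through a custom data source instead of spark._jvm.

    Assumes a data source registered under the placeholder name
    "your_format_here" that understands the option keys used below.
    """
    reader = spark.read.format("your_format_here").option("category", category)
    # Forward only the filters the caller actually set (hypothetical option names).
    for key, value in (("startdate", startdate),
                       ("enddate", enddate),
                       ("location", location)):
        if value:
            reader = reader.option(key, value)
    # load() returns a regular DataFrame, so no manual wrapping is needed.
    return reader.load()
```

With this shape the same format and options could be used from .NET through spark.Read().Format(...).Option(...).Load(), so the Scala logic would not need to be duplicated in C#.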

zhiyong-gayang commented on August 24, 2024

Hi @imback82,
I assume the UDF can only be used in Spark SQL, right?
The scenario Raga shared is to call a Scala method that returns a DataFrame, as the Python code shows. Since we don't want to duplicate the logic in C#, is there any way to achieve this?

rapoth commented on August 24, 2024

@garyyang2002: Yes, that's correct.

I'm afraid what you want is not possible by any simple means, and it is a use case we cannot support immediately. While you can call into regular Java functions through Spark SQL, the use case described here is to call a wrapper function that then invokes spark.read.format().load(), which returns a DataFrame. This is a bit unconventional and is not the recommended way. That brings me to the next question: can you share some details about loadCuratedEmail()? How complex would it be to write this one function in .NET?

imback82 commented on August 24, 2024

A workaround is to use https://github.com/aelij/IgnoresAccessChecksToGenerator to access some internal classes (but this is not recommended, since internal classes can change and break your code).
