Comments (9)

nils-braun avatar nils-braun commented on May 22, 2024 2

Very thoughtful comments from your side!
Thanks for sharing your code and how-to. Yes - this is exactly correct.

Do you already have a running Superset service, or do you start one via docker or docker-compose? As I am new to the service, I would be interested in how best to deploy it for development. Currently, I am using a docker-compose setup (which works).

from dask-sql.

nils-braun avatar nils-braun commented on May 22, 2024 1

Thank you so much for testing it out! That is very valuable to me and with every test, we learn something more about it :-)

I will try to install and set up Apache Superset today and will come back with my thoughts after that.


nils-braun avatar nils-braun commented on May 22, 2024 1

Hi @rajagurunath!
I have successfully tested Apache Superset (by the way - great tool. Thanks for bringing it to my attention :-))

I was running on Superset master, but I can reproduce your error with dask-sql 0.1.2. It seems I was not implementing the Presto protocol properly.
Fortunately, I was already working on a new, better implementation in which this problem (and a few others) is fixed :-)
If you want, you can already try it out (see the linked pull request). If you would rather not run a source HEAD version, I also plan to do a new release soon.

With the new version in the PR, I was able to successfully add dask-sql as a data source, do some SQL queries, have a look into plots and dashboards etc. For example:
image

This is really a nice tool - I am thinking about writing some more documentation on how to get started with dask-sql and a BI tool. How did you start the software?


rajagurunath avatar rajagurunath commented on May 22, 2024 1

Wow, superb! Thanks a ton @nils-braun, that was really quick.

Indeed, Apache Superset is an awesome tool in every respect among the BI tools in the Python ecosystem.

Making the integration between dask-sql and a BI tool seamless will open up a lot of potential use cases, I guess.

  • No need to save the preprocessed (prepared) files in another DB (SQLite or DuckDB) for the BI tools to use
  • dask-sql lets you register custom functions (UDFs):
    • This can be leveraged to use a trained machine learning model as a UDF, for example
    • Many other pieces of complex data logic can be registered as UDFs, so people of all backgrounds can use them through SQL
  • Power of Dask (distributed computing) and simplicity, combined for the analytics community (no need for a complex ETL pipeline)
  • And much more (some curious thoughts straight from my mind, pardon me if repetitive)
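
The UDF idea in the list above could look roughly like the following sketch. It assumes dask-sql's `Context.register_function(f, name, parameters, return_type)` call; the `predict_score` function and its coefficients are hypothetical stand-ins for a trained model, not something from this thread.

```python
def predict_score(x, y):
    """Hypothetical stand-in for a trained model: a fixed linear score.

    Written with plain arithmetic so it works on scalars and on whole columns.
    """
    return 0.5 * x + 2.0 * y


def register_udf(context):
    """Register predict_score as a SQL function on a dask-sql Context.

    Assumption: `context` is a dask_sql.Context. numpy is imported lazily so
    the scoring function above stays usable without it.
    """
    import numpy as np

    context.register_function(
        predict_score,
        "predict_score",
        [("x", np.float64), ("y", np.float64)],  # parameter names and types
        np.float64,  # return type
    )
```

After `register_udf(c)`, a query along the lines of `SELECT predict_score(x, y) FROM timeseries` should then be available to any SQL client, including a BI tool, provided the table has numeric `x` and `y` columns.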

Currently, I follow the example in the dask-sql documentation: I prepare the data, start the dask-sql Presto server with the run_server method, and then connect to it from Superset.
For example:

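from dask.datasets import timeseries
from dask_sql import Context, run_server

# Create a context and register a demo Dask DataFrame as the table "timeseries"
c = Context()
df = timeseries()
c.register_dask_table(df, "timeseries")

# Start the Presto-compatible server so that BI tools can connect
run_server(context=c)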

Kindly let me know if this is what you were asking about, or whether I misunderstood your question.

I will also give it a try and let you know once I am able to set up some dashboards.

Thanks


avriiil avatar avriiil commented on May 22, 2024 1

@nils-braun - I'm curious whether you ended up writing the documentation pages mentioned in your comment above. I've found this in the Dask docs.

If not, I'd be up for taking this on.


quasiben avatar quasiben commented on May 22, 2024 1

@rrpelgrim I don't think anyone will say no to doing work here.


rajagurunath avatar rajagurunath commented on May 22, 2024

Hi @nils-braun

First of all, thanks for this awesome project 😍.

I have tried using dask-sql's Presto server with Apache Superset (an open-source BI tool in the Python ecosystem), using PyHive (presto) as the driver.
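
For readers who want to reproduce this setup, here is a minimal sketch of the two pieces involved: the SQLAlchemy URI that Superset asks for when adding a database, and a direct query through PyHive's DB-API interface. The host, the port 8080, and the empty catalog are assumptions about a local default setup, and both helper names are hypothetical.

```python
def presto_sqlalchemy_uri(host="localhost", port=8080, catalog=""):
    """Build a presto:// SQLAlchemy URI for Superset's database form.

    Assumption: the dask-sql server listens on `host`:`port`.
    """
    return f"presto://{host}:{port}/{catalog}"


def query_server(sql, host="localhost", port=8080):
    """Run one query against the server via PyHive and return all rows.

    PyHive is imported lazily; it needs to be installed separately and a
    server must already be running for this call to succeed.
    """
    from pyhive import presto

    cursor = presto.connect(host=host, port=port).cursor()
    cursor.execute(sql)
    return cursor.fetchall()


if __name__ == "__main__":
    # Print the URI to paste into Superset.
    print(presto_sqlalchemy_uri())
```

Calling `query_server("SELECT 1 + 1")` outside of Superset is a quick way to tell whether a problem lies in the server or in the BI tool.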

Superset happily detected and accepted the Presto protocol.

But, as you mentioned, there are some quirks; one of them is described below.

After a query executes successfully, Superset is not able to render the data, and the Superset log shows an HTTP 500 status.

Please see the screenshot below for details:
image

(My suspicion is that some data schema or contract is mismatched, which is why the UI is not able to render the table after the query executes successfully.)

Software Versions
Superset 0.37.2
dask-sql 0.1.2

Please let me know if you need any further information.

Thanks


avriiil avatar avriiil commented on May 22, 2024

@quasiben - good to know. I'll be drafting some code examples in the coming week or two and will check back in here when I have something to share. @nils-braun @rajagurunath - happy to incorporate any of your input, so feel free to share it here.


rajagurunath avatar rajagurunath commented on May 22, 2024

Hi @rrpelgrim,

Thanks for taking up this work.

It would be a great help to have scripts or automated ways of connecting and deploying dask-sql and BI tools together. As mentioned in issue #57, building Helm charts for both the dask-sql server and the BI tools was one such plan, so that users can use the charts to create the compute (dask-sql) and BI platform easily.

Feel free to come up with new suggestions/ideas 👍

Let me know if you need any further details.

