
[FEA] Semi Join (blazingsql, CLOSED)

blazingdb commented on August 15, 2024
[FEA] Semi Join

from blazingsql.

Comments (6)

felipeblazing commented on August 15, 2024

This is the relational algebra it generates. I am not sure what a semi join would generate, but I am guessing something very similar.

  LogicalJoin(condition=[=($0, $4)], joinType=[inner])
    LogicalTableScan(table=[[main, nation]])
    LogicalAggregate(group=[{0}])
      BindableTableScan(table=[[main, nation]], filters=[[=($0, 16)]], projects=[[0]], aliases=[[n_nationkey]])
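
The shape of this plan (an inner join against a deduplicated key set) can be checked on a small scale. The following is a minimal sketch, assuming Python's built-in sqlite3 as a stand-in engine rather than BlazingSQL, and a three-row sample of the nation table; the function name is hypothetical. It compares the IN query with a hand-written inner join over a GROUP BY subquery, which mirrors LogicalJoin over LogicalAggregate.

```python
import sqlite3


def in_vs_join_on_distinct():
    """Compare an IN-subquery with an inner join on deduplicated keys."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE nation (n_nationkey INTEGER, n_name TEXT)")
    # Small sample of rows for illustration; only key 16 passes the filter.
    con.executemany(
        "INSERT INTO nation VALUES (?, ?)",
        [(15, "MOROCCO"), (16, "MOZAMBIQUE"), (17, "PERU")],
    )
    in_query = """
        SELECT n.n_nationkey, n.n_name
        FROM nation AS n
        WHERE n.n_nationkey IN (
            SELECT other.n_nationkey FROM nation AS other
            WHERE other.n_nationkey = 16)
        ORDER BY n.n_nationkey
    """
    # Hand-written equivalent of the plan: join against the grouped keys.
    join_query = """
        SELECT n.n_nationkey, n.n_name
        FROM nation AS n
        INNER JOIN (
            SELECT other.n_nationkey FROM nation AS other
            WHERE other.n_nationkey = 16
            GROUP BY other.n_nationkey) AS k
        ON n.n_nationkey = k.n_nationkey
        ORDER BY n.n_nationkey
    """
    in_rows = con.execute(in_query).fetchall()
    join_rows = con.execute(join_query).fetchall()
    con.close()
    return in_rows, join_rows


in_rows, join_rows = in_vs_join_on_distinct()
print(in_rows == join_rows, in_rows)  # True [(16, 'MOZAMBIQUE')]
```

This only illustrates that the two formulations return the same rows; it says nothing about how BlazingSQL executes either one.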


felipeblazing commented on August 15, 2024

In the future we will be able to optimize out LogicalAggregate(group=[{0}]) using the cost-based optimizer (CBO) when the column in question is unique, but that is currently not supported.
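
Why the aggregate matters only for non-unique keys can be seen with a small experiment. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 instead of BlazingSQL, and the tables a and b and the function join_row_counts are made up for illustration. It counts rows for the IN query, a plain inner join, and an inner join on grouped keys.

```python
import sqlite3


def join_row_counts(b_keys):
    """Return (in_count, plain_join_count, dedup_join_count) for keys in b."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE a (k INTEGER)")
    con.execute("CREATE TABLE b (k INTEGER)")
    con.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
    con.executemany("INSERT INTO b VALUES (?)", [(k,) for k in b_keys])

    def count(sql):
        return con.execute(sql).fetchone()[0]

    in_count = count("SELECT COUNT(*) FROM a WHERE a.k IN (SELECT k FROM b)")
    # Plain inner join: repeats a row of a once per matching row of b.
    plain = count("SELECT COUNT(*) FROM a JOIN b ON a.k = b.k")
    # Join on grouped keys: the analogue of LogicalAggregate(group=[{0}]).
    dedup = count(
        "SELECT COUNT(*) FROM a "
        "JOIN (SELECT k FROM b GROUP BY k) AS d ON a.k = d.k"
    )
    con.close()
    return in_count, plain, dedup


print(join_row_counts([1, 2]))     # unique keys: (2, 2, 2)
print(join_row_counts([1, 1, 2]))  # duplicate key: (2, 3, 2)
```

With unique keys the grouping step changes nothing, which is the case an optimizer could skip; with a duplicated key the plain join over-counts, so the grouping is required to match the IN semantics.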


felipeblazing commented on August 15, 2024

I just tried the following and it worked for me:

import cudf
from blazingsql import BlazingContext

bc = BlazingContext()
bc.s3('bsql_data', bucket_name='blazingsql-colab', access_key_id='AKIAJGB3SR3IXU3TE5WA', secret_key='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')

bc.create_table('nation', 's3://bsql_data/tpch_sf1/nation/0_0_0.parquet')
result = bc.sql('select * from nation where nation.n_nationkey in ( select other.n_nationkey from nation as other where n_nationkey = 16)').get()
print(result.columns)

The output I got was:

0           16  MOZAMBIQUE            0   

                                       n_comment  
0  s. ironic, unusual asymptotes wake blithely r


felipeblazing commented on August 15, 2024

Can you show me a complete example where this does not work as you would expect?


VibhuJawa commented on August 15, 2024

Sorry for being unclear.
I would like the left semi join to work natively, i.e., I would like the SQL query below to work.

  SELECT e.EmpName, e.DepID  
  FROM @employees AS e  
  LEFT SEMIJOIN (SELECT (int?) DepID AS DepID, DepName FROM @departments) AS d  
  ON e.DepID == d.DepID;  

Question

Will there be performance implications from using the SELECT * FROM A WHERE A.key IN (SELECT B.key FROM B) pattern instead of a left semi join, or do you expect the performance to remain the same?
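
For reference, the semantics in question can be sketched without LEFT SEMIJOIN syntax. The example below is an illustration only, assuming Python's built-in sqlite3 rather than BlazingSQL; the sample rows and the function semi_join_variants are made up. It expresses the employees/departments semi join as both an IN and an EXISTS query and shows that each matching employee appears once, even when the department key is duplicated.

```python
import sqlite3


def semi_join_variants():
    """Run a semi join written as IN and as EXISTS; return both results."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employees (EmpName TEXT, DepID INTEGER)")
    con.execute("CREATE TABLE departments (DepID INTEGER, DepName TEXT)")
    # Made-up rows: Bob's department is missing, Cid has no department.
    con.executemany(
        "INSERT INTO employees VALUES (?, ?)",
        [("Ann", 1), ("Bob", 2), ("Cid", None)],
    )
    # DepID 1 appears twice to show that a semi join does not duplicate Ann.
    con.executemany(
        "INSERT INTO departments VALUES (?, ?)",
        [(1, "Sales"), (1, "Sales EU"), (3, "Ops")],
    )
    in_sql = """
        SELECT e.EmpName, e.DepID FROM employees AS e
        WHERE e.DepID IN (SELECT d.DepID FROM departments AS d)
        ORDER BY e.EmpName
    """
    exists_sql = """
        SELECT e.EmpName, e.DepID FROM employees AS e
        WHERE EXISTS (
            SELECT 1 FROM departments AS d WHERE d.DepID = e.DepID)
        ORDER BY e.EmpName
    """
    in_rows = con.execute(in_sql).fetchall()
    exists_rows = con.execute(exists_sql).fetchall()
    con.close()
    return in_rows, exists_rows


in_rows, exists_rows = semi_join_variants()
print(in_rows, exists_rows)  # [('Ann', 1)] [('Ann', 1)]
```

This shows only that the two formulations agree on results; whether they perform the same in a given engine depends on its planner and is not something this sketch measures.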


VibhuJawa commented on August 15, 2024

This is the relational algebra it generates. I am not sure what a semi join would generate, but I am guessing something very similar.

  LogicalJoin(condition=[=($0, $4)], joinType=[inner])
    LogicalTableScan(table=[[main, nation]])
    LogicalAggregate(group=[{0}])
      BindableTableScan(table=[[main, nation]], filters=[[=($0, 16)]], projects=[[0]], aliases=[[n_nationkey]])

Thanks for looking into this. I will update this issue if this pattern is not as performant as we would like for my use case.

