blazingdb / blazingsql

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.

Home Page: https://blazingsql.com

License: Apache License 2.0

Python 19.79% Shell 2.57% Java 6.30% FreeMarker 0.28% CMake 1.71% C++ 62.87% Cuda 2.41% Dockerfile 0.39% Lua 0.05% Jupyter Notebook 2.24% Makefile 0.03% Batchfile 0.02% CSS 0.23% C 0.03% Cython 1.09%
rapidsai sql python machine-learning machine-learning-workflow artificial-intelligence gpu-acceleration gpu rapids blazingsql

blazingsql's Introduction

A lightweight, GPU accelerated, SQL engine built on the RAPIDS.ai ecosystem.

Get Started on app.blazingsql.com

BlazingSQL is a GPU accelerated SQL engine built on top of the RAPIDS ecosystem. RAPIDS is based on the Apache Arrow columnar memory format, and cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.

BlazingSQL is a SQL interface for cuDF, with various features to support large scale data science workflows and enterprise datasets.

  • Query Data Stored Externally - a single line of code can register remote storage solutions, such as Amazon S3.
  • Simple SQL - incredibly easy to use: run a SQL query and the results are GPU DataFrames (GDFs).
  • Interoperable - GDFs are immediately accessible to any RAPIDS library for data science workloads.

Try our 5-min Welcome Notebook to start using BlazingSQL and RAPIDS AI.

Getting Started

Here are two copy-and-paste reproducible BlazingSQL snippets; keep scrolling to find example notebooks below.

Create and query a table from a cudf.DataFrame with progress bar:

import cudf

df = cudf.DataFrame()

df['key'] = ['a', 'b', 'c', 'd', 'e']
df['val'] = [7.6, 2.9, 7.1, 1.6, 2.2]

from blazingsql import BlazingContext
bc = BlazingContext(enable_progress_bar=True)

bc.create_table('game_1', df)

bc.sql('SELECT * FROM game_1 WHERE val > 4') # the query progress will be shown
key val
0 a 7.6
1 c 7.1
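
If no GPU is at hand, the expected rows can be checked on the CPU. The sketch below is only a stand-in: it uses Python's built-in sqlite3 module instead of BlazingSQL, keeps the same table name, data and query, and returns plain tuples rather than a GPU DataFrame.

```python
import sqlite3

# Same data as the cudf.DataFrame above, as (key, val) rows.
rows = [('a', 7.6), ('b', 2.9), ('c', 7.1), ('d', 1.6), ('e', 2.2)]

# CPU-only stand-in: an in-memory SQLite database, not BlazingSQL.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE game_1 ("key" TEXT, val REAL)')
conn.executemany('INSERT INTO game_1 VALUES (?, ?)', rows)

# Identical SQL text to the bc.sql(...) call above.
result = conn.execute('SELECT * FROM game_1 WHERE val > 4').fetchall()
conn.close()

print(result)
```

Only the rows with keys a (7.6) and c (7.1) pass the val > 4 filter.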

Create and query a table from an AWS S3 bucket:

from blazingsql import BlazingContext
bc = BlazingContext()

bc.s3('blazingsql-colab', bucket_name='blazingsql-colab')

bc.create_table('taxi', 's3://blazingsql-colab/yellow_taxi/taxi_data.parquet')

bc.sql('SELECT passenger_count, trip_distance FROM taxi LIMIT 2')
passenger_count trip_distance
0 1.0 1.1
1 1.0 0.7

Examples

Each notebook below can be launched on BlazingSQL Notebooks.

  • Welcome Notebook - An introduction to BlazingSQL Notebooks and the GPU Data Science Ecosystem.
  • The DataFrame - Learn how to use BlazingSQL and cuDF to create GPU DataFrames with SQL and Pandas-like APIs.
  • Data Visualization - Plug in your favorite Python visualization packages, or use GPU accelerated visualization tools to render millions of rows in a flash.
  • Machine Learning - Learn about cuML, which mirrors the Scikit-Learn API and offers GPU accelerated machine learning on GPU DataFrames.

Documentation

You can find our full documentation at docs.blazingdb.com.

Prerequisites

  • Anaconda or Miniconda installed
  • OS Support
    • Ubuntu 16.04/18.04 LTS
    • CentOS 7
  • GPU Support
    • Pascal or Better
    • Compute Capability >= 6.0
  • CUDA Support
    • 11.0
    • 11.2
    • 11.4
  • Python Support
    • 3.7
    • 3.8

Install Using Conda

BlazingSQL can be installed with conda (miniconda, or the full Anaconda distribution) from the blazingsql channel:

Stable Version

conda install -c blazingsql -c rapidsai -c nvidia -c conda-forge -c defaults blazingsql python=$PYTHON_VERSION cudatoolkit=$CUDA_VERSION

Where $CUDA_VERSION is 11.0, 11.2 or 11.4 and $PYTHON_VERSION is 3.7 or 3.8. For example, for CUDA 11.2 and Python 3.8:

conda install -c blazingsql -c rapidsai -c nvidia -c conda-forge -c defaults blazingsql python=3.8 cudatoolkit=11.2
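
When the install is scripted, the command can be assembled and validated before it is run. This is a minimal sketch, not part of BlazingSQL: the helper name conda_install_command is hypothetical, and the supported versions and channel order are copied from the text above.

```python
# Versions and channels as listed in this README.
SUPPORTED_CUDA = ('11.0', '11.2', '11.4')
SUPPORTED_PYTHON = ('3.7', '3.8')
CHANNELS = ('blazingsql', 'rapidsai', 'nvidia', 'conda-forge', 'defaults')


def conda_install_command(python_version, cuda_version):
    """Return the stable-version conda install command as a string."""
    if python_version not in SUPPORTED_PYTHON:
        raise ValueError('unsupported Python version: ' + python_version)
    if cuda_version not in SUPPORTED_CUDA:
        raise ValueError('unsupported CUDA version: ' + cuda_version)
    channel_args = ' '.join('-c ' + name for name in CHANNELS)
    return ('conda install ' + channel_args + ' blazingsql'
            + ' python=' + python_version
            + ' cudatoolkit=' + cuda_version)


print(conda_install_command('3.8', '11.2'))
```

Rejecting unsupported versions up front avoids waiting for the conda solver only to have it fail on an unavailable package.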

Nightly Version

The nightly version supports only CUDA 11+; see https://github.com/rapidsai/cudf#cudagpu-requirements

conda install -c blazingsql-nightly -c rapidsai-nightly -c nvidia -c conda-forge -c defaults blazingsql python=$PYTHON_VERSION  cudatoolkit=$CUDA_VERSION

Where $CUDA_VERSION is 11.0, 11.2 or 11.4 and $PYTHON_VERSION is 3.7 or 3.8. For example, for CUDA 11.2 and Python 3.8:

conda install -c blazingsql-nightly -c rapidsai-nightly -c nvidia -c conda-forge -c defaults blazingsql python=3.8  cudatoolkit=11.2

Build/Install from Source (Conda Environment)

This is the recommended way of building all of the BlazingSQL components and dependencies from source. It ensures that all the dependencies are available to the build process.

Stable Version

Install build dependencies

conda create -n bsql python=$PYTHON_VERSION
conda activate bsql
./dependencies.sh 21.08 $CUDA_VERSION

Where $CUDA_VERSION is 11.0, 11.2 or 11.4 and $PYTHON_VERSION is 3.7 or 3.8. For example, for CUDA 11.2 and Python 3.7:

conda create -n bsql python=3.7
conda activate bsql
./dependencies.sh 21.08 11.2

Build

The build process checks out the BlazingSQL repository, then builds and installs it into the conda environment.

cd $CONDA_PREFIX
git clone https://github.com/BlazingDB/blazingsql.git
cd blazingsql
git checkout main
export CUDACXX=/usr/local/cuda/bin/nvcc
./build.sh

NOTE: Run ./build.sh -h to see more build options.

$CONDA_PREFIX now has a folder for the blazingsql repository.

Nightly Version

Install build dependencies

The nightly version supports only CUDA 11+; see https://github.com/rapidsai/cudf#cudagpu-requirements

conda create -n bsql python=$PYTHON_VERSION
conda activate bsql
./dependencies.sh 21.10 $CUDA_VERSION nightly

Where $CUDA_VERSION is 11.0, 11.2 or 11.4 and $PYTHON_VERSION is 3.7 or 3.8. For example, for CUDA 11.2 and Python 3.8:

conda create -n bsql python=3.8
conda activate bsql
./dependencies.sh 21.10 11.2 nightly

Build

The build process checks out the BlazingSQL repository, then builds and installs it into the conda environment.

cd $CONDA_PREFIX
git clone https://github.com/BlazingDB/blazingsql.git
cd blazingsql
export CUDACXX=/usr/local/cuda/bin/nvcc
./build.sh

NOTE: Run ./build.sh -h to see more build options.

NOTE: You can perform static analysis with cppcheck with the command cppcheck --project=compile_commands.json in any of the cpp project build directories.

$CONDA_PREFIX now has a folder for the blazingsql repository.

Storage plugins

To build without the storage plugins (AWS S3, Google Cloud Storage), use the following arguments:

# Disable all storage plugins
./build.sh disable-aws-s3 disable-google-gs

# Disable AWS S3 storage plugin
./build.sh disable-aws-s3

# Disable Google Cloud Storage plugin
./build.sh disable-google-gs

NOTE: If you disable the storage plugins, you don't need to install the AWS SDK C++ or Google Cloud Storage beforehand (nor any of their dependencies).

SQL providers

To build without the SQL providers (MySQL, PostgreSQL, SQLite), use the following arguments:

# Disable all SQL providers
./build.sh disable-mysql disable-sqlite disable-postgresql

# Disable MySQL provider
./build.sh disable-mysql

...

NOTES:

  • If you disable the SQL providers, you don't need to install mysql-connector-cpp=8.0.23, libpq=13 or sqlite=3 (nor any of their dependencies).
  • Currently we support only MySQL, but PostgreSQL and SQLite will be ready for the next version!
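
For scripted builds, the disable flags from the two sections above can be collected in one place. This is a minimal sketch under stated assumptions: the helper name build_command and the short component names are hypothetical, the flag spellings are the ones shown above, and passing storage and SQL flags in the same ./build.sh call is an assumption that the README does not confirm.

```python
# Flag spellings copied from the build.sh examples in this README.
DISABLE_FLAGS = {
    'aws-s3': 'disable-aws-s3',
    'google-gs': 'disable-google-gs',
    'mysql': 'disable-mysql',
    'sqlite': 'disable-sqlite',
    'postgresql': 'disable-postgresql',
}


def build_command(disable=()):
    """Return the ./build.sh argument list with the requested components off."""
    unknown = [name for name in disable if name not in DISABLE_FLAGS]
    if unknown:
        raise ValueError('unknown component(s): ' + ', '.join(unknown))
    return ['./build.sh'] + [DISABLE_FLAGS[name] for name in disable]


print(build_command(['aws-s3', 'google-gs']))
```

The returned list can be passed to subprocess.run(..., check=True) from the blazingsql directory.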

Documentation

User guides and public API documentation can be found here.

Documentation for our internal code architecture can be built using Sphinx.

conda install -c conda-forge doxygen
cd $CONDA_PREFIX
cd blazingsql/docsrc
pip install -r requirements.txt
make doxygen
make html

The generated documentation can be viewed in a browser at blazingsql/docsrc/build/html/index.html

Community

Contributing

Have questions or feedback? Post a new GitHub issue.

Please see our guide for contributing to BlazingSQL.

Contact

Feel free to join our channel (#blazingsql) in the RAPIDS-GoAi Slack: join RAPIDS-GoAi workspace.

You can also email us at [email protected] or find out more details on BlazingSQL.com.

License

Apache License 2.0

RAPIDS AI - Open GPU Data Science

The RAPIDS suite of open source software libraries aims to enable execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Apache Arrow on GPU

The GPU version of Apache Arrow is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads. As the name implies, cuDF uses the Apache Arrow columnar data format on the GPU. Currently, a subset of the features in Apache Arrow is supported.

blazingsql's People

Contributors

0xflotus, actions-user, ajschmidt8, aocsa, aucahuasi, ayushdg, beckernick, chrisjar, christian8491, cjnolet, diegodfrf, dillon-cullinan, drabastomek, editaxz, felipeblazing, gcca, gumdropsteve, jeanp413, jglaser, kharoc, kkraus14, mario21ic, mrocklin, quasiben, raydouglass, rlratzel, roaramburu, rommeldb, romulo-auccapuclla, wmalpica

blazingsql's Issues

[BUG] Hive querying returns "hdfs path not found" error

Describe the bug
I'm trying the new Hive querying feature (in v11) using pyhive. The pyhive connection and cursor are created correctly. When I call the create_table() operation, the command fails with "hdfs path not found". The error message shows that the HDFS path is collected correctly, so the pyhive connection and the metadata collection in get_hive_table() may be working correctly.
Error snippet: "ParseSchemaError: [ParseSchema Error] Path 'hdfs://.../../../' does not exist."

Steps/Code to reproduce bug
from blazingsql import BlazingContext
from pyhive import hive
bc = BlazingContext()
cursor = hive.connect('your_hive_ip_address').cursor()
bc.create_table("hive_db_name.hive_table_name",cursor) # fails here

Expected behavior
bc.create_table() should succeed

Additional context
No HDFS registration step was followed while testing the pyhive-based connection.

Python does not start services automatically

Hello, I am running a simple demo. The query result is correct, but at the end there is a weird message:

Demo:

from blazingsql import BlazingContext
bc = BlazingContext()
import time
nation = bc.create_table('nation', '/tmp/nation_0_0.parquet')
sql = "select n_nationkey, n_comment from nation"
result_gdf = bc.sql(sql).get()
print(result_gdf)

Output:

(blazingsql) root@8d16c30f8ae3:~# python hola.py
connection established
columns = n_nationkey n_comment
0 0 haggle. carefully final deposits detect slyly...
1 1 al foxes promise slyly according to the regula...
2 2 y alongside of the pending deposits. carefully...
3 3 eas hang ironic, silent packages. slyly regula...
4 4 y above the carefully unusual theodolites. fin...
5 5 ven packages wake quickly. regu
6 6 refully final requests. regular, ironi
7 7 l platelets. regular accounts x-ray: unusual, ...
8 8 ss excuses cajole slyly across the packages. d...
9 9 slyly express asymptotes. regular deposits ha...
10 10 efully alongside of the slyly final dependenci...
11 11 nic deposits boost atop the quickly final requ...
12 12 ously. final, express gifts cajole a
13 13 ic deposits are blithely about the carefully r...
14 14 pending excuses haggle furiously deposits. pe...
15 15 rns. blithely bold courts among the closely re...
16 16 s. ironic, unusual asymptotes wake blithely r
17 17 platelets. blithely pending dependencies use f...
18 18 c dependencies. furiously express notornis sle...
19 19 ular asymptotes are about the furious multipli...
20 20 ts. silent requests haggle. closely express pa...
21 21 hely enticingly express accounts. even, final
22 22 requests against the platelets use never acco...
23 23 eans boost carefully special requests. account...
24 24 y final packages. slow foxes cajole quickly. q...
resultToken = 14785780709824492479
interpreter_path = 127.0.0.1
interpreter_port = 8891
handle = [<numba.cuda.cudadrv.driver.IpcHandle object at 0x7ff86d2819b0>]
client = <pyblazing.api.PyConnector object at 0x7ff86d2daef0>
calciteTime = 301
ralTime = 310
totalTime = 851
error_message =
Exception ignored in: <function ResultSetHandle.__del__ at 0x7ff86d2e42f0>
Traceback (most recent call last):
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/pyblazing/api.py", line 310, in __del__
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/pyblazing/api.py", line 229, in free_result
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/pyblazing/api.py", line 47, in _send_request
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/blazingdb/protocol/__init__.py", line 71, in send
TypeError: can only concatenate str (not "tuple") to str
Exception ignored in: <function PyConnector.__del__ at 0x7ff86d2d6c80>
Traceback (most recent call last):
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/pyblazing/api.py", line 64, in __del__
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/pyblazing/api.py", line 108, in close_connection
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/pyblazing/api.py", line 46, in _send_request
File "/miniconda3/envs/blazingsql/lib/python3.7/site-packages/blazingdb/protocol/__init__.py", line 61, in __init__
TypeError: can only concatenate str (not "tuple") to str
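
The TypeError that ends both tracebacks is the generic Python error raised when a str is concatenated with a tuple. The sketch below reproduces that class of error in isolation. It is not BlazingSQL code: the describe helper is hypothetical, the address tuple is shaped after the interpreter_path and interpreter_port values printed above, and the real cause inside blazingdb/protocol may differ.

```python
def describe(address):
    # Works when address is a str; fails when it is a (host, port) tuple.
    return 'connecting to ' + address


try:
    describe(('127.0.0.1', 8891))
except TypeError as exc:
    message = str(exc)
    print(message)

# One possible fix: format the tuple explicitly before concatenating.
host, port = ('127.0.0.1', 8891)
fixed = 'connecting to ' + host + ':' + str(port)
print(fixed)
```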

[BUG] Single Node Dockerfile doesn't build

Describe the bug

The Dockerfile described here might need to be updated to work with Conda: https://github.com/BlazingDB/pyBlazing/blob/develop/docker/single-node/Dockerfile

Step 10/16 : RUN conda install -y -c conda-forge -c defaults -c nvidia -c rapidsai -c blazingsql/label/cuda10.0 blazingsql-calcite blazingsql-orchestrator blazingsql-ral blazingsql-python python=3.7 cudatoolkit=10.0
 ---> Running in e811e50d46c5
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - blazingsql-orchestrator
  - blazingsql-calcite

Current channels:

  - https://conda.anaconda.org/conda-forge/linux-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://conda.anaconda.org/nvidia/linux-64
  - https://conda.anaconda.org/nvidia/noarch
  - https://conda.anaconda.org/rapidsai/linux-64
  - https://conda.anaconda.org/rapidsai/noarch
  - https://conda.anaconda.org/blazingsql/label/cuda10.0/linux-64
  - https://conda.anaconda.org/blazingsql/label/cuda10.0/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.


The command '/bin/sh -c conda install -y -c conda-forge -c defaults -c nvidia -c rapidsai -c blazingsql/label/cuda10.0 blazingsql-calcite blazingsql-orchestrator blazingsql-ral blazingsql-python python=3.7 cudatoolkit=10.0' returned a non-zero code: 1

Steps/Code to reproduce bug

wget https://raw.githubusercontent.com/BlazingDB/pyBlazing/develop/docker/single-node/Dockerfile

docker build -f Dockerfile -t blazing .

Expected behavior
Dockerfile should build.

Environment overview (please complete the following information)

  • Environment location: bare metal, DGX Station
  • Method of cuDF install: Docker
    • If method of install is [Docker], provide docker pull & docker run commands used

Environment details
N/A

Additional context
Add any other context about the problem here.

Slack Channel

Hi, I have been reaching out on GitHub but there is no response, and the Slack link listed on the page has expired. Can you update it with a new link?

[BUG] - AttributeError: 'BlazingContext' object has no attribute 'need_shutdown'

Describe the bug
After 25.10.19 I have not been able to get BlazingSQL to work.

CODE:

python3.7

from blazingsql import BlazingContext

bc = BlazingContext()
...
Output:
WARNING: blazingsql-orchestrator was not automativally started, its probably already running
WARNING: blazingsql-engine was not automativally started, its probably already running
^CTraceback (most recent call last):
File "cs_dataprep.py", line 17, in <module>
bc = BlazingContext()
File "/home/REMOVED/miniconda3/envs/blazing2/lib/python3.7/site-packages/pyblazing/apiv2/context.py", line 173, in __init__
internal_api.SetupOrchestratorConnection(orchestrator_host_ip, orchestrator_port)
File "/home/REMOVED/miniconda3/envs/blazing2/lib/python3.7/site-packages/pyblazing/api.py", line 904, in SetupOrchestratorConnection
client.connect(orchestrator_host_ip, orchestrator_port)
File "/home/REMOVED/miniconda3/envs/blazing2/lib/python3.7/site-packages/pyblazing/api.py", line 353, in connect
self.orchestrator_port, requestBuffer)
File "/home/REMOVED/miniconda3/envs/blazing2/lib/python3.7/site-packages/pyblazing/api.py", line 311, in send_request
return client.send(requestBuffer, expect_response)
File "/home/REMOVED/miniconda3/envs/blazing2/lib/python3.7/site-packages/blazingdb/protocol/__init__.py", line 69, in send
length = struct.unpack('I', self.connection.socket.recv(4))[0]
KeyboardInterrupt
Exception ignored in: <function BlazingContext.__del__ at 0x7fccc269bbf8>
Traceback (most recent call last):
File "/home/REMOVED/miniconda3/envs/blazing2/lib/python3.7/site-packages/pyblazing/apiv2/context.py", line 212, in __del__
if self.need_shutdown:
AttributeError: 'BlazingContext' object has no attribute 'need_shutdown'
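
The traceback suggests that the constructor was interrupted (Ctrl-C) before need_shutdown was assigned, and that the finalizer then ran on a half-initialised object. Below is a minimal sketch of that failure mode with one defensive pattern. SafeContext and cleanup_log are hypothetical names, not BlazingSQL code, and the RuntimeError stands in for the interrupted connection attempt.

```python
import gc

cleanup_log = []


class SafeContext:
    def __init__(self, interrupted=False):
        if interrupted:
            # Stand-in for the connect call being interrupted.
            raise RuntimeError('setup interrupted')
        self.need_shutdown = True

    def __del__(self):
        # getattr with a default tolerates a half-initialised object;
        # a bare self.need_shutdown would raise AttributeError here.
        if getattr(self, 'need_shutdown', False):
            cleanup_log.append('shutdown')
        else:
            cleanup_log.append('skipped')


try:
    SafeContext(interrupted=True)
except RuntimeError:
    pass
gc.collect()  # make finalisation explicit on non-refcounting runtimes

ctx = SafeContext()
del ctx
gc.collect()

print(cleanup_log)
```

Python still runs the finalizer when the constructor raises, so a finalizer should not assume that every attribute exists.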

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of cuDF install: conda

Environment details

 ***git***
 Not inside a git repository
 
 ***OS Information***
 DISTRIB_ID=Ubuntu
 DISTRIB_RELEASE=18.04
 DISTRIB_CODENAME=bionic
 DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
 NAME="Ubuntu"
 VERSION="18.04.3 LTS (Bionic Beaver)"
 ID=ubuntu
 ID_LIKE=debian
 PRETTY_NAME="Ubuntu 18.04.3 LTS"
 VERSION_ID="18.04"
 HOME_URL="https://www.ubuntu.com/"
 SUPPORT_URL="https://help.ubuntu.com/"
 BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
 PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
 VERSION_CODENAME=bionic
 UBUNTU_CODENAME=bionic
 Linux REMOVED-ub 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
 
 ***GPU Information***
 Mon Oct 28 10:13:09 2019
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |===============================+======================+======================|
 |   0  GeForce RTX 2080    Off  | 00000000:01:00.0  On |                  N/A |
 | 24%   30C    P8    12W / 215W |    692MiB /  7979MiB |     10%      Default |
 +-------------------------------+----------------------+----------------------+
 
 +-----------------------------------------------------------------------------+
 | Processes:                                                       GPU Memory |
 |  GPU       PID   Type   Process name                             Usage      |
 |=============================================================================|
 |    0      1192      G   /usr/lib/xorg/Xorg                            28MiB |
 |    0      1227      G   /usr/bin/gnome-shell                          57MiB |
 |    0      1497      G   /usr/lib/xorg/Xorg                           195MiB |
 |    0      1630      G   /usr/bin/gnome-shell                         147MiB |
 |    0      2642      G   /snap/pycharm-community/155/jbr/bin/java       8MiB |
 |    0     15504      C   blazingsql-engine                            249MiB |
 +-----------------------------------------------------------------------------+
 
 ***CPU***
 Architecture:        x86_64
 CPU op-mode(s):      32-bit, 64-bit
 Byte Order:          Little Endian
 CPU(s):              8
 On-line CPU(s) list: 0-7
 Thread(s) per core:  1
 Core(s) per socket:  8
 Socket(s):           1
 NUMA node(s):        1
 Vendor ID:           GenuineIntel
 CPU family:          6
 Model:               158
 Model name:          Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
 Stepping:            12
 CPU MHz:             4615.365
 CPU max MHz:         4900,0000
 CPU min MHz:         800,0000
 BogoMIPS:            7200.00
 Virtualization:      VT-x
 L1d cache:           32K
 L1i cache:           32K
 L2 cache:            256K
 L3 cache:            12288K
 NUMA node0 CPU(s):   0-7
 Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
 
 ***CMake***
 /home/REMOVED/miniconda3/envs/blazing2/bin/cmake
 cmake version 3.15.4
 
 CMake suite maintained and supported by Kitware (kitware.com/cmake).
 
 ***g++***
 /usr/bin/g++
 g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
 Copyright (C) 2017 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
 
 ***nvcc***
 
 ***Python***
 /home/REMOVED/miniconda3/envs/blazing2/bin/python
 Python 3.7.3
 
 ***Environment Variables***
 PATH                            : /home/REMOVED/miniconda3/envs/blazing2/bin:/home/REMOVED/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/REMOVED/.local/bin:/home/REMOVED/.local/bin
 LD_LIBRARY_PATH                 :
 NUMBAPRO_NVVM                   :
 NUMBAPRO_LIBDEVICE              :
 CONDA_PREFIX                    : /home/REMOVED/miniconda3/envs/blazing2
 PYTHON_PATH                     :
 
 ***conda packages***
 /home/REMOVED/miniconda3/condabin/conda
 # packages in environment at /home/REMOVED/miniconda3/envs/blazing2:
 #
 # Name                    Version                   Build  Channel
 _libgcc_mutex             0.1                        main
 alsa-lib                  1.1.5             h516909a_1001    conda-forge
 arrow-cpp                 0.14.1           py37h5ac5442_4    conda-forge
 blazingsql-calcite        0.4.5                         0    blazingsql
 blazingsql-communication  0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
 blazingsql-io             0.4.4                         0    blazingsql
 blazingsql-orchestrator   0.4.5                         0    blazingsql
 blazingsql-protocol       0.4.5                    py37_0    blazingsql
 blazingsql-python         0.4.5           cuda10.0_py37_0    blazingsql/label/cuda10.0
 blazingsql-ral            0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
 blazingsql-toolchain      0.4.5                         0    blazingsql
 bokeh                     1.3.4                    py37_0    conda-forge
 boost                     1.70.0           py37h9de70de_1    conda-forge
 boost-cpp                 1.70.0               h8e57a91_2    conda-forge
 brotli                    1.0.7             he1b5a44_1000    conda-forge
 bzip2                     1.0.8                h516909a_1    conda-forge
 c-ares                    1.15.0            h516909a_1001    conda-forge
 ca-certificates           2019.9.11            hecc5488_0    conda-forge
 certifi                   2019.9.11                py37_0    conda-forge
 chardet                   3.0.4                    pypi_0    pypi
 click                     7.0                        py_0    conda-forge
 cloudpickle               1.2.2                      py_0    conda-forge
 cmake                     3.15.4               hf94ab9c_0    conda-forge
 cppzmq                    4.4.1                hc9558a2_0    conda-forge
 cudatoolkit               10.0.130                      0
 cudf                      0.10.0                   py37_0    rapidsai
 curl                      7.65.3               hf8cf82a_0    conda-forge
 cython                    0.29.13          py37he1b5a44_0    conda-forge
 cytoolz                   0.10.0           py37h516909a_0    conda-forge
 dask                      2.6.0                      py_0    conda-forge
 dask-core                 2.6.0                      py_0    conda-forge
 dask-cudf                 0.10.0                   py37_0    rapidsai
 distributed               2.6.0                      py_0    conda-forge
 dlpack                    0.2                  he1b5a44_1    conda-forge
 double-conversion         3.1.5                he1b5a44_1    conda-forge
 expat                     2.2.5             he1b5a44_1004    conda-forge
 fastavro                  0.22.5           py37h516909a_0    conda-forge
 flatbuffers               1.11                     pypi_0    pypi
 fontconfig                2.13.1            h86ecdb6_1001    conda-forge
 freetype                  2.10.0               he983fc9_1    conda-forge
 fsspec                    0.5.2                      py_0    conda-forge
 geoip2                    2.9.0                    pypi_0    pypi
 gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
 gflags                    2.2.2             he1b5a44_1001    conda-forge
 giflib                    5.1.7                h516909a_1    conda-forge
 glog                      0.4.0                he1b5a44_1    conda-forge
 gmock                     1.10.0                        0    conda-forge
 grpc-cpp                  1.23.0               h18db393_0    conda-forge
 gtest                     1.10.0               hc9558a2_0    conda-forge
 heapdict                  1.0.1                      py_0    conda-forge
 icu                       64.2                 he1b5a44_1    conda-forge
 idna                      2.8                      pypi_0    pypi
 jinja2                    2.10.3                     py_0    conda-forge
 jpeg                      9c                h14c3975_1001    conda-forge
 krb5                      1.16.3            h05b26f9_1001    conda-forge
 lcms2                     2.9                  h2e4bb80_0    conda-forge
 libblas                   3.8.0               14_openblas    conda-forge
 libcblas                  3.8.0               14_openblas    conda-forge
 libcudf                   0.10.0               cuda10.0_0    rapidsai
 libcurl                   7.65.3               hda55be3_0    conda-forge
 libedit                   3.1.20181209         hc058e9b_0
 libevent                  2.1.10               h72c5cf5_0    conda-forge
 libffi                    3.2.1                hd88cf55_4
 libgcc-ng                 9.1.0                hdf63c60_0
 libgcrypt                 1.8.4             hf484d3e_1000    conda-forge
 libgfortran-ng            7.3.0                hdf63c60_2    conda-forge
 libgpg-error              1.36                 he1b5a44_0    conda-forge
 libgsasl                  1.8.0             h19a2143_1004    conda-forge
 libhdfs3                  2.3               h311b756_1006    conda-forge
 libiconv                  1.15              h516909a_1005    conda-forge
 liblapack                 3.8.0               14_openblas    conda-forge
 libllvm8                  8.0.1                hc9558a2_0    conda-forge
 libntlm                   1.4               h14c3975_1002    conda-forge
 libnvstrings              0.10.0               cuda10.0_0    rapidsai
 libopenblas               0.3.7                h6e990d7_2    conda-forge
 libpng                    1.6.37               hed695b0_0    conda-forge
 libprotobuf               3.8.0                h8b12597_0    conda-forge
 librmm                    0.10.0               cuda10.0_0    rapidsai
 libsodium                 1.0.17               h516909a_0    conda-forge
 libssh2                   1.8.2                h22169c7_2    conda-forge
 libstdcxx-ng              9.1.0                hdf63c60_0
 libtiff                   4.0.10            h57b8799_1003    conda-forge
 libuuid                   2.32.1            h14c3975_1000    conda-forge
 libuv                     1.33.1               h516909a_0    conda-forge
 libxcb                    1.13              h14c3975_1002    conda-forge
 libxml2                   2.9.9                hee79883_5    conda-forge
 llvmlite                  0.30.0           py37h8b12597_0    conda-forge
 locket                    0.2.0                      py_2    conda-forge
 lz4-c                     1.8.3             he1b5a44_1001    conda-forge
 markupsafe                1.1.1            py37h14c3975_0    conda-forge
 maven                     3.6.0                         0    conda-forge
 maxminddb                 1.5.1                    pypi_0    pypi
 msgpack-python            0.6.2            py37hc9558a2_0    conda-forge
 ncurses                   6.1                  he6710b0_1
 numba                     0.46.0           py37hb3f55d8_0    conda-forge
 numpy                     1.17.3           py37h95a1406_0    conda-forge
 nvstrings                 0.10.0                   py37_0    rapidsai
 olefile                   0.46                       py_0    conda-forge
 openjdk                   11.0.1            h46a85a0_1017    conda-forge
 openssl                   1.1.1c               h516909a_0    conda-forge
 packaging                 19.2                       py_0    conda-forge
 pandas                    0.24.2           py37hb3f55d8_0    conda-forge
 parquet-cpp               1.5.1                         2    conda-forge
 partd                     1.0.0                      py_0    conda-forge
 pillow                    6.2.1            py37h6b7be26_0    conda-forge
 pip                       19.3.1                   py37_0
 psutil                    5.6.3            py37h516909a_0    conda-forge
 pthread-stubs             0.4               h14c3975_1001    conda-forge
 pyarrow                   0.14.1           py37h8b68381_2    conda-forge
 pyparsing                 2.4.2                      py_0    conda-forge
 python                    3.7.3                h5b0a415_0    conda-forge
 python-dateutil           2.8.0                      py_0    conda-forge
 pytz                      2019.3                     py_0    conda-forge
 pyyaml                    5.1.2            py37h516909a_0    conda-forge
 rapidjson                 1.1.0             he1b5a44_1002    conda-forge
 re2                       2019.09.01           he1b5a44_0    conda-forge
 readline                  7.0                  h7b6447c_5
 requests                  2.22.0                   pypi_0    pypi
 rhash                     1.3.6             h14c3975_1001    conda-forge
 rmm                       0.10.0                   py37_0    rapidsai
 setuptools                41.4.0                   py37_0
 six                       1.12.0                py37_1000    conda-forge
 snappy                    1.1.7             he1b5a44_1002    conda-forge
 sortedcontainers          2.1.0                      py_0    conda-forge
 sqlite                    3.30.1               h7b6447c_0
 tblib                     1.4.0                      py_0    conda-forge
 thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
 tk                        8.6.9             hed695b0_1003    conda-forge
 toolz                     0.10.0                     py_0    conda-forge
 tornado                   6.0.3            py37h516909a_0    conda-forge
 uriparser                 0.9.3                he1b5a44_1    conda-forge
 urllib3                   1.25.6                   pypi_0    pypi
 wheel                     0.33.6                   py37_0
 xorg-fixesproto           5.0               h14c3975_1002    conda-forge
 xorg-inputproto           2.3.2             h14c3975_1002    conda-forge
 xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
 xorg-libx11               1.6.9                h516909a_0    conda-forge
 xorg-libxau               1.0.9                h14c3975_0    conda-forge
 xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
 xorg-libxext              1.3.4                h516909a_0    conda-forge
 xorg-libxfixes            5.0.3             h516909a_1004    conda-forge
 xorg-libxi                1.7.10               h516909a_0    conda-forge
 xorg-libxrender           0.9.10            h516909a_1002    conda-forge
 xorg-libxtst              1.2.3             h14c3975_1002    conda-forge
 xorg-recordproto          1.14.2            h14c3975_1002    conda-forge
 xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
 xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
 xorg-xproto               7.0.31            h14c3975_1007    conda-forge
 xz                        5.2.4                h14c3975_4
 yaml                      0.1.7             h14c3975_1001    conda-forge
 zeromq                    4.3.2                he1b5a44_2    conda-forge
 zict                      1.0.0                      py_0    conda-forge
 zlib                      1.2.11               h7b6447c_3
 zstd                      1.4.0                h3b9ef0a_0    conda-forge


Basic SQL Functions

Missing basic SQL operations
I don't see many basic functions, such as:

  1. Data type casting
  2. IS (NOT) NULL
  3. String functions
    a) Substring
    b) Regex substring
    c) Regex LIKE
    d) Length, Trim
  4. Date functions
    a) Date formatting
    b) Adding / subtracting times and dates
  5. Interval function
  6. Sequence function

Reference
For reference, here are the built-in functions in Spark SQL; we may need similar ones here:
https://spark.apache.org/docs/latest/api/sql/index.html

No such file or directory: 'blazingsql-orchestrator': 'blazingsql-orchestrator'

I'm testing the conda installation procedure released 3 days ago.
(rapids) sh-4.2$ conda list | grep blazing
blazingdb-toolchain 0.4.0 py37hf484d3e_0 blazingsql
blazingsql-calcite 0.4.0 py37_0 blazingsql
blazingsql-communication 0.4.0 py37_80 blazingsql
blazingsql-io 0.4.0 py37_31 blazingsql
blazingsql-orchestrator 0.4.0 py37_19 blazingsql
blazingsql-protocol 0.4.0 py37_25 blazingsql
blazingsql-python 0.4.0 cuda10.0_py37_14 blazingsql/label/cuda10.0
blazingsql-ral 0.4.0 cuda10.0_py37_5 blazingsql/label/cuda10.0

However, when I try to run bc = BlazingContext(), I get the following:
AttributeError: 'BlazingContext' object has no attribute 'processes'
FileNotFoundError: [Errno 2] No such file or directory: 'blazingsql-orchestrator': 'blazingsql-orchestrator'

Perhaps there is something else I need to do before I call bc = BlazingContext() that I'm not aware of?

Function - calculate memory impact

Is your feature request related to a problem? Please describe.

It is difficult to know whether a file will fit in memory when importing it into BlazingSQL.

Describe the solution you'd like
I would like either:

A)
bc.estimate_mem_usage(csv/parquet/gdf/pdf):
    <do calculations>
    return value_mem_usage

B)
bc.fits_in_mem(csv/parquet/gdf/pdf):
    available_mem = <get available GPU memory>
    value_mem_usage = <calculate file impact>
    if available_mem - value_mem_usage > 0:
        return True
    else:
        return False
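
To make the request concrete, here is a minimal standalone sketch of options A and B in plain Python. Nothing here is an existing BlazingSQL API: the function names mirror the pseudocode above, the EXPANSION_FACTOR table is a hypothetical placeholder (real ratios would have to be measured), and the available GPU memory is passed in by the caller rather than queried.

```python
import os

# Hypothetical expansion factors: how much larger a file might become once
# decoded into GPU memory. These values are placeholders, not measurements.
EXPANSION_FACTOR = {".csv": 1.0, ".parquet": 4.0}


def estimate_mem_usage(path, file_size=None):
    """Return a rough estimate, in bytes, of the memory needed to load `path`."""
    if file_size is None:
        # Fall back to the on-disk size when the caller does not supply one.
        file_size = os.path.getsize(path)
    ext = os.path.splitext(path)[1].lower()
    factor = EXPANSION_FACTOR.get(ext, 1.0)
    return int(file_size * factor)


def fits_in_mem(path, available_mem, file_size=None):
    """Return True if the estimated usage leaves some memory to spare."""
    return available_mem - estimate_mem_usage(path, file_size) > 0
```

For example, under these assumed factors, fits_in_mem("data.parquet", available_mem=300, file_size=100) would return False because the estimate is 400 bytes.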

Dynamic Resource Allocation

I'm not sure whether Dask has a way to specify the number of workers or the maximum memory to be used for a job. If it does, inherit those settings and apply them at the query level. This means that for every query we should have the option to limit workers and memory.

Recommended Modes

  1. Fixed Mode
    In this mode, workers, memory, or both are fixed per process.
  2. Step Mode
    In this mode, workers, memory, or both can take a range of values (lower and upper bounds) plus a step value. The allocation increments by the step value up to the upper bound.
  3. Auto Mode
    This simply runs Step Mode with default values, or with values derived from the size of the data to be processed and the available resources.
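
The three modes above could be sketched roughly as follows. This is only an illustration of the proposal: step_schedule, resource_plan, and the auto-mode defaults (1 to 4 in steps of 1) are hypothetical names and values, not part of BlazingSQL or Dask.

```python
def step_schedule(lower, upper, step):
    """Yield resource values from lower to upper (inclusive), moving by step."""
    if step <= 0:
        raise ValueError("step must be positive")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    value = lower
    while value < upper:
        yield value
        value += step
    # Always finish at the upper bound, even if the step would overshoot it.
    yield upper


def resource_plan(mode, lower=1, upper=None, step=1):
    """Return the list of worker counts (or memory sizes) a query may use."""
    if mode == "fixed":
        return [lower]
    if mode == "step":
        return list(step_schedule(lower, upper, step))
    if mode == "auto":
        # Hypothetical defaults; a real engine would derive these from the
        # data size and the resources available.
        return list(step_schedule(1, 4, 1))
    raise ValueError(f"unknown mode: {mode}")
```

With these definitions, resource_plan("step", lower=2, upper=8, step=3) gives the sequence 2, 5, 8.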

[BUG] Reading from csv/cudf with `blazing` and `dask_client` fails

Describe the bug

Reading from CSV with blazing and dask_client seems to fail. The code below works if I don't supply the Dask client to the BlazingContext.

Steps/Code to reproduce bug

import dask_cudf 

from blazingsql import BlazingContext
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
from distributed import wait
from dask import delayed
import os
import cudf
import pandas as pd
### start workers
cluster = LocalCUDACluster()
client = Client(cluster)

bc = BlazingContext(dask_client=client)
bc
Script that triggers the bug:
pd.DataFrame({'1':[0,1,2]}).to_csv('test.csv',index=False)
result = bc.create_table('test', os.getcwd()+'/'+'test.csv')
table_result = bc.sql('SELECT * FROM test').get()
table_df = table_result.columns
table_df

Python Output:

Index(['1'], dtype='object')

Dask Worker Trace

distributed.worker - WARNING -  Compute Failed
Function:  convert_to_dask
args:      ([<class 'blazingdb.protocol.transport.responses'>], <pyblazing.api.PyConnector object at 0x7f574a735048>)
kwargs:    {}
Exception: CudaAPIError(217, 'Call to call_cuIpcOpenMemHandle results in UNKNOWN_CUDA_ERROR')

Expected behavior
I would expect the behavior to be the same whether or not a Dask client is provided.

Environment overview (please complete the following information)

  • Environment location: Docker
  • Blazing Build: Source
  • Cudf Build - Docker dev container

Follow up question:

Are there example workflows using dask-cudf with blazing that you can share? That would be super helpful.

[BUG] Multiple LIKEs

Describe the bug

When trying to query with multiple LIKE conditions, only the first applies.

Steps/Code to reproduce bug

Here's a copy of the notebook in Google Colab: https://colab.research.google.com/drive/1VrUN6DEcfZZOTZb2ZKMC_YvS3jCQffd_

Here's the Query:

# tag column names
column_names = ['key', 'fare_amount', 'pickup_longitude', 'pickup_latitude', 
                'dropoff_longitude', 'dropoff_latitude', 'passenger_count']

# 5m row table (given column names) from taxi_00.csv
bc.create_table('taxi', '/home/winston/bsql-demos/taxi_00.csv', names=column_names)

# find january instances with 20 in the fare amount
query = '''
        select 
            *
        from taxi 
            where key like '%-01-%'
            and fare_amount like '%20%'
        '''

# query the table 
results = bc.sql(query).get()

# extract cudf dataframe
df = results.columns

# how's it look?
df.head()

Expected behavior

Return instances where key has -01- (January) in its value and fare_amount has 20 in its value.
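
As a reference for the expected semantics, the two LIKE conditions can be checked in plain Python on a few rows. The rows below are made up for illustration (they are not taken from taxi_00.csv), and like_contains is a hypothetical helper that only covers the '%text%' pattern; how the engine casts fare_amount to a string may differ.

```python
def like_contains(value, needle):
    """Mimic SQL `value LIKE '%needle%'` for a needle without wildcards."""
    return needle in str(value)


# Made-up sample rows, for illustration only.
rows = [
    {"key": "2014-01-05 10:00:00", "fare_amount": 20.5},
    {"key": "2014-01-09 11:30:00", "fare_amount": 7.0},
    {"key": "2014-03-02 08:15:00", "fare_amount": 20.0},
]

# Both conditions must hold, exactly like `WHERE ... AND ...` in the query.
matches = [
    r for r in rows
    if like_contains(r["key"], "-01-") and like_contains(r["fare_amount"], "20")
]
```

Only the first row satisfies both conditions. If only the first LIKE were applied, as described in this report, the second row would be returned as well.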

Environment overview

  • Environment location: Cloud (Google)
  • Method of cuDF install: conda

Environment details
Please run and paste the output of the print_env.sh script here, to gather any other relevant environment details

<details><summary>Click here to see environment details</summary><pre>
     
     ***git***
     Not inside a git repository
     
     ***OS Information***
     DISTRIB_ID=Ubuntu
     DISTRIB_RELEASE=16.04
     DISTRIB_CODENAME=xenial
     DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"
     NAME="Ubuntu"
     VERSION="16.04.6 LTS (Xenial Xerus)"
     ID=ubuntu
     ID_LIKE=debian
     PRETTY_NAME="Ubuntu 16.04.6 LTS"
     VERSION_ID="16.04"
     HOME_URL="http://www.ubuntu.com/"
     SUPPORT_URL="http://help.ubuntu.com/"
     BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
     VERSION_CODENAME=xenial
     UBUNTU_CODENAME=xenial
     Linux winston-gpu-rig 4.15.0-1047-gcp #50-Ubuntu SMP Wed Oct 2 00:50:34 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
     
     ***GPU Information***
     Sat Nov  9 06:53:24 2019
     +-----------------------------------------------------------------------------+
     | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
     |-------------------------------+----------------------+----------------------+
     | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
     | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
     |===============================+======================+======================|
     |   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
     | N/A   77C    P0    33W /  70W |   5640MiB / 15079MiB |      0%      Default |
     +-------------------------------+----------------------+----------------------+
     
     +-----------------------------------------------------------------------------+
     | Processes:                                                       GPU Memory |
     |  GPU       PID   Type   Process name                             Usage      |
     |=============================================================================|
     |    0      1952      G   /usr/lib/xorg/Xorg                            98MiB |
     |    0      2893      C   blazingsql-engine                           5531MiB |
     +-----------------------------------------------------------------------------+
     
     ***CPU***
     Architecture:          x86_64
     CPU op-mode(s):        32-bit, 64-bit
     Byte Order:            Little Endian
     CPU(s):                4
     On-line CPU(s) list:   0-3
     Thread(s) per core:    2
     Core(s) per socket:    2
     Socket(s):             1
     NUMA node(s):          1
     Vendor ID:             GenuineIntel
     CPU family:            6
     Model:                 63
     Model name:            Intel(R) Xeon(R) CPU @ 2.30GHz
     Stepping:              0
     CPU MHz:               2300.000
     BogoMIPS:              4600.00
     Hypervisor vendor:     KVM
     Virtualization type:   full
     L1d cache:             32K
     L1i cache:             32K
     L2 cache:              256K
     L3 cache:              46080K
     NUMA node0 CPU(s):     0-3
     Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat md_clear arch_capabilities
     
     ***CMake***
     /home/winston/miniconda3/envs/bzsqlenv/bin/cmake
     cmake version 3.15.5
     
     CMake suite maintained and supported by Kitware (kitware.com/cmake).
     
     ***g++***
     /usr/bin/g++
     g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
     Copyright (C) 2015 Free Software Foundation, Inc.
     This is free software; see the source for copying conditions.  There is NO
     warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     
     
     ***nvcc***
     
     ***Python***
     /home/winston/miniconda3/envs/bzsqlenv/bin/python
     Python 3.7.3
     
     ***Environment Variables***
     PATH                            : /home/winston/bin:/home/winston/.local/bin:/home/winston/miniconda3/envs/bzsqlenv/bin:/home/winston/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
     LD_LIBRARY_PATH                 :
     NUMBAPRO_NVVM                   :
     NUMBAPRO_LIBDEVICE              :
     CONDA_PREFIX                    : /home/winston/miniconda3/envs/bzsqlenv
     PYTHON_PATH                     :
     
     ***conda packages***
     /home/winston/miniconda3/condabin/conda
     # packages in environment at /home/winston/miniconda3/envs/bzsqlenv:
     #
     # Name                    Version                   Build  Channel
     _libgcc_mutex             0.1                        main
     alsa-lib                  1.1.5             h516909a_1001    conda-forge
     arrow-cpp                 0.14.1           py37h5ac5442_4    conda-forge
     attrs                     19.3.0                   pypi_0    pypi
     backcall                  0.1.0                    pypi_0    pypi
     blazingsql-calcite        0.4.5                         0    blazingsql
     blazingsql-communication  0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
     blazingsql-io             0.4.4                         0    blazingsql
     blazingsql-orchestrator   0.4.5                         0    blazingsql
     blazingsql-protocol       0.4.5                    py37_0    blazingsql
     blazingsql-python         0.4.5           cuda10.0_py37_0    blazingsql/label/cuda10.0
     blazingsql-ral            0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
     blazingsql-toolchain      0.4.5                         3    blazingsql
     bleach                    3.1.0                    pypi_0    pypi
     bokeh                     1.3.4                    py37_0    conda-forge
     boost                     1.70.0           py37h9de70de_1    conda-forge
     boost-cpp                 1.70.0               h8e57a91_2    conda-forge
     brotli                    1.0.7             he1b5a44_1000    conda-forge
     bzip2                     1.0.8                h516909a_1    conda-forge
     c-ares                    1.15.0            h516909a_1001    conda-forge
     ca-certificates           2019.9.11            hecc5488_0    conda-forge
     certifi                   2019.9.11                py37_0    conda-forge
     click                     7.0                        py_0    conda-forge
     cloudpickle               1.2.2                      py_0    conda-forge
     cmake                     3.15.5               hf94ab9c_0    conda-forge
     cppzmq                    4.4.1                hc9558a2_0    conda-forge
     cudatoolkit               10.0.130                      0
     cudf                      0.10.0                   py37_0    rapidsai
     curl                      7.65.3               hf8cf82a_0    conda-forge
     cython                    0.29.14          py37he1b5a44_0    conda-forge
     cytoolz                   0.10.1           py37h516909a_0    conda-forge
     dask                      2.6.0                      py_0    conda-forge
     dask-core                 2.6.0                      py_0    conda-forge
     dask-cudf                 0.10.0                   py37_0    rapidsai
     decorator                 4.4.1                    pypi_0    pypi
     defusedxml                0.6.0                    pypi_0    pypi
     distributed               2.6.0                      py_0    conda-forge
     dlpack                    0.2                  he1b5a44_1    conda-forge
     double-conversion         3.1.5                he1b5a44_2    conda-forge
     entrypoints               0.3                      pypi_0    pypi
     expat                     2.2.5             he1b5a44_1004    conda-forge
     fastavro                  0.22.7           py37h516909a_0    conda-forge
     flatbuffers               1.11                     pypi_0    pypi
     fontconfig                2.13.1            h86ecdb6_1001    conda-forge
     freetype                  2.10.0               he983fc9_1    conda-forge
     fsspec                    0.5.2                      py_0    conda-forge
     gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
     gflags                    2.2.2             he1b5a44_1002    conda-forge
     giflib                    5.1.7                h516909a_1    conda-forge
     glog                      0.4.0                he1b5a44_1    conda-forge
     gmock                     1.10.0                        1    conda-forge
     grpc-cpp                  1.23.0               h18db393_0    conda-forge
     gtest                     1.10.0               hc9558a2_1    conda-forge
     heapdict                  1.0.1                      py_0    conda-forge
     icu                       64.2                 he1b5a44_1    conda-forge
     importlib-metadata        0.23                     pypi_0    pypi
     ipykernel                 5.1.3                    pypi_0    pypi
     ipython                   7.9.0                    pypi_0    pypi
     ipython-genutils          0.2.0                    pypi_0    pypi
     jedi                      0.15.1                   pypi_0    pypi
     jinja2                    2.10.3                     py_0    conda-forge
     jpeg                      9c                h14c3975_1001    conda-forge
     jsonschema                3.1.1                    pypi_0    pypi
     jupyter-client            5.3.4                    pypi_0    pypi
     jupyter-core              4.6.1                    pypi_0    pypi
     jupyterlab                0.34.0                   pypi_0    pypi
     jupyterlab-launcher       0.13.1                   pypi_0    pypi
     krb5                      1.16.3            h05b26f9_1001    conda-forge
     lcms2                     2.9                  h2e4bb80_0    conda-forge
     libblas                   3.8.0               14_openblas    conda-forge
     libcblas                  3.8.0               14_openblas    conda-forge
     libcudf                   0.10.0               cuda10.0_0    rapidsai
     libcurl                   7.65.3               hda55be3_0    conda-forge
     libedit                   3.1.20181209         hc058e9b_0
     libevent                  2.1.10               h72c5cf5_0    conda-forge
     libffi                    3.2.1                hd88cf55_4
     libgcc-ng                 9.1.0                hdf63c60_0
     libgcrypt                 1.8.4             hf484d3e_1000    conda-forge
     libgfortran-ng            7.3.0                hdf63c60_2    conda-forge
     libgpg-error              1.36                 he1b5a44_0    conda-forge
     libgsasl                  1.8.0             h19a2143_1004    conda-forge
     libhdfs3                  2.3               h311b756_1006    conda-forge
     libiconv                  1.15              h516909a_1005    conda-forge
     liblapack                 3.8.0               14_openblas    conda-forge
     libllvm8                  8.0.1                hc9558a2_0    conda-forge
     libntlm                   1.4               h14c3975_1002    conda-forge
     libnvstrings              0.10.0               cuda10.0_0    rapidsai
     libopenblas               0.3.7                h6e990d7_3    conda-forge
     libpng                    1.6.37               hed695b0_0    conda-forge
     libprotobuf               3.8.0                h8b12597_0    conda-forge
     librmm                    0.10.0               cuda10.0_0    rapidsai
     libsodium                 1.0.17               h516909a_0    conda-forge
     libssh2                   1.8.2                h22169c7_2    conda-forge
     libstdcxx-ng              9.1.0                hdf63c60_0
     libtiff                   4.0.10            h57b8799_1003    conda-forge
     libuuid                   2.32.1            h14c3975_1000    conda-forge
     libuv                     1.33.1               h516909a_0    conda-forge
     libxcb                    1.13              h14c3975_1002    conda-forge
     libxml2                   2.9.10               hee79883_0    conda-forge
     llvmlite                  0.30.0           py37h8b12597_1    conda-forge
     locket                    0.2.0                      py_2    conda-forge
     lz4-c                     1.8.3             he1b5a44_1001    conda-forge
     markupsafe                1.1.1            py37h516909a_0    conda-forge
     maven                     3.6.0                         0    conda-forge
     mistune                   0.8.4                    pypi_0    pypi
     more-itertools            7.2.0                    pypi_0    pypi
     msgpack-python            0.6.2            py37hc9558a2_0    conda-forge
     nbconvert                 5.6.1                    pypi_0    pypi
     nbformat                  4.4.0                    pypi_0    pypi
     ncurses                   6.1                  he6710b0_1
     notebook                  6.0.2                    pypi_0    pypi
     numba                     0.46.0           py37hb3f55d8_1    conda-forge
     numpy                     1.17.3           py37h95a1406_0    conda-forge
     nvstrings                 0.10.0                   py37_0    rapidsai
     olefile                   0.46                       py_0    conda-forge
     openjdk                   11.0.1            h46a85a0_1017    conda-forge
     openssl                   1.1.1d               h516909a_0    conda-forge
     packaging                 19.2                       py_0    conda-forge
     pandas                    0.24.2           py37hb3f55d8_0    conda-forge
     pandocfilters             1.4.2                    pypi_0    pypi
     parquet-cpp               1.5.1                         2    conda-forge
     parso                     0.5.1                    pypi_0    pypi
     partd                     1.0.0                      py_0    conda-forge
     pexpect                   4.7.0                    pypi_0    pypi
     pickleshare               0.7.5                    pypi_0    pypi
     pillow                    6.2.1            py37h6b7be26_0    conda-forge
     pip                       19.3.1                   py37_0
     prometheus-client         0.7.1                    pypi_0    pypi
     prompt-toolkit            2.0.10                   pypi_0    pypi
     psutil                    5.6.5            py37h516909a_0    conda-forge
     pthread-stubs             0.4               h14c3975_1001    conda-forge
     ptyprocess                0.6.0                    pypi_0    pypi
     pyarrow                   0.14.1           py37h8b68381_2    conda-forge
     pygments                  2.4.2                    pypi_0    pypi
     pyparsing                 2.4.4                      py_0    conda-forge
     pyrsistent                0.15.5                   pypi_0    pypi
     python                    3.7.3                h5b0a415_0    conda-forge
     python-dateutil           2.8.1                      py_0    conda-forge
     pytz                      2019.3                     py_0    conda-forge
     pyyaml                    5.1.2            py37h516909a_0    conda-forge
     pyzmq                     18.1.0                   pypi_0    pypi
     rapidjson                 1.1.0             he1b5a44_1002    conda-forge
     re2                       2019.09.01           he1b5a44_0    conda-forge
     readline                  7.0                  h7b6447c_5
     rhash                     1.3.6             h14c3975_1001    conda-forge
     rmm                       0.10.0                   py37_0    rapidsai
     send2trash                1.5.0                    pypi_0    pypi
     setuptools                41.6.0                   py37_0
     six                       1.13.0                   py37_0    conda-forge
     snappy                    1.1.7             he1b5a44_1002    conda-forge
     sortedcontainers          2.1.0                      py_0    conda-forge
     sqlite                    3.30.1               h7b6447c_0
     tblib                     1.4.0                      py_0    conda-forge
     terminado                 0.8.2                    pypi_0    pypi
     testpath                  0.4.4                    pypi_0    pypi
     thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
     tk                        8.6.9             hed695b0_1003    conda-forge
     toolz                     0.10.0                     py_0    conda-forge
     tornado                   6.0.3            py37h516909a_0    conda-forge
     traitlets                 4.3.3                    pypi_0    pypi
     uriparser                 0.9.3                he1b5a44_1    conda-forge
     wcwidth                   0.1.7                    pypi_0    pypi
     webencodings              0.5.1                    pypi_0    pypi
     wheel                     0.33.6                   py37_0
     xorg-fixesproto           5.0               h14c3975_1002    conda-forge
     xorg-inputproto           2.3.2             h14c3975_1002    conda-forge
     xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
     xorg-libx11               1.6.9                h516909a_0    conda-forge
     xorg-libxau               1.0.9                h14c3975_0    conda-forge
     xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
     xorg-libxext              1.3.4                h516909a_0    conda-forge
     xorg-libxfixes            5.0.3             h516909a_1004    conda-forge
     xorg-libxi                1.7.10               h516909a_0    conda-forge
     xorg-libxrender           0.9.10            h516909a_1002    conda-forge
     xorg-libxtst              1.2.3             h516909a_1002    conda-forge
     xorg-recordproto          1.14.2            h516909a_1002    conda-forge
     xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
     xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
     xorg-xproto               7.0.31            h14c3975_1007    conda-forge
     xz                        5.2.4                h14c3975_4
     yaml                      0.1.7             h14c3975_1001    conda-forge
     zeromq                    4.3.2                he1b5a44_2    conda-forge
     zict                      1.0.0                      py_0    conda-forge
     zipp                      0.6.0                    pypi_0    pypi
     zlib                      1.2.11               h7b6447c_3
     zstd                      1.4.0                h3b9ef0a_0    conda-forge
     
</pre></details>

UDF

Why do we need UDFs?
As the number of users and use cases grows, users will need different kinds of functions for their use cases. Providing a way to register their own logic/code as a UDF callable from SQL will help users accomplish their tasks.

How should it be implemented?
Since BlazingSQL is built around Python, developers should be able to write their own functions in Python and register them with a decorator.

Additional context
Different types of UDFs would be appreciated:

  1. Simple UDFs that work on a column
  2. UDFs on a window / sliding window
  3. The ability to run UDFs in clustered mode (not sure, just throwing out an idea)
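
A rough sketch of what decorator-based registration might look like for the simple column case. Everything here is hypothetical: UDF_REGISTRY, register_udf, and apply_udf are illustrative names, not BlazingSQL APIs, and a plain Python list stands in for a column.

```python
# Hypothetical registry mapping SQL function names to Python callables.
UDF_REGISTRY = {}


def register_udf(name=None):
    """Decorator that registers a Python function under a SQL-visible name."""
    def decorator(func):
        UDF_REGISTRY[name or func.__name__] = func
        return func  # leave the original function usable as-is
    return decorator


@register_udf("double")
def double(value):
    return value * 2


def apply_udf(name, column):
    """Apply a registered UDF element-wise to a column (a plain list here)."""
    func = UDF_REGISTRY[name]
    return [func(v) for v in column]
```

With this in place, apply_udf("double", [1, 2, 3]) evaluates the registered function on each value. A real implementation would need to run on GPU columns rather than Python lists.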

[BUG] b'In function ddlCreateTableService: cannot create the table: Could not create table'

(rapids_blazing) sh-4.2$ conda list | grep blazing
blazingdb-toolchain 0.4.0 py37hf484d3e_0 blazingsql
blazingsql-calcite 0.4.0 py37_0 blazingsql
blazingsql-communication 0.4.0 py37_80 blazingsql
blazingsql-io 0.4.0 py37_31 blazingsql
blazingsql-orchestrator 0.4.0 py37_19 blazingsql
blazingsql-protocol 0.4.0 py37_25 blazingsql
blazingsql-python 0.4.0 cuda10.0_py37_14 blazingsql/label/cuda10.0
blazingsql-ral 0.4.0 cuda10.0_py37_5 blazingsql/label/cuda10.0

Yesterday I was able to create a table based on a cuDF DataFrame, but today I'm getting errors. I have recreated the whole instance and repeated all the steps, but I still get errors similar to the following:
b'In function ddlCreateTableService: cannot create the table: Could not create table'

There is no further information. Any ideas on how I can trace the error?
In addition, it seems that version 0.4.2 will change how BlazingContext launches processes. Could it be related to this? When will this release be conda-installable?

If I execute BlazingContext() again and try to create the table, I get one of two different errors:
1)
Already connected to the Orchestrator
b'In function ddlCreateTableService: cannot create the table: Connection to server failed.'

WARNING: blazingsql-orchestrator was not automativally started, its probably already running
WARNING: blazingsql-engine was not automativally started, its probably already running
WARNING: blazingsql-algebra was not automativally started, its probably already running
Already connected to the Orchestrator
Unexpected error on create_table, can only concatenate str (not "tuple") to str

[BUG] GCP Bucket - Can't pull from bucket which is private and requires a JSON Key

Describe the bug
Cannot register the filesystem: Couldn't create gcs::ClientOptions for Project ID XXXXX
status=Could not automatically determine credentials. For more information, please see https://developers.google.com/identity/protocols/application-default-credentials

Steps/Code to reproduce bug
bc.gcs('dir_name', project_id='xxx', bucket_name='xxx', use_default_adc_json_file=False, adc_json_file='../home/david/Downloads/a3e4838767e8.json')

Expected behavior
The JSON key from the service account credentials should be used.

Environment overview (please complete the following information)
Docker Pull

[BUG] Erroneous results with use of "<>" in the query

Erroneous results with use of "<>" in the query

Steps/Code to reproduce bug

>>> bc.create_table('tbl', df)
<pyblazing.apiv2.context.BlazingTable object at 0x7f7c4969cb90>
>>> df
   a    b
0  0  1.5
1  1  1.5
2  0  1.5
3  1  1.5
4  0  1.5
5  1  1.5
6  0  1.5
7  1  1.5
8  0  1.5
9  1  1.5
>>> bc.sql('SELECT CASE WHEN a <> 1 THEN 0 ELSE b END as s, a, b from tbl')
30956
               s  a    b
0   0.000000e+00  0  1.5
1  4.940656e-324  1  1.5
2   0.000000e+00  0  1.5
3  4.940656e-324  1  1.5
4   0.000000e+00  0  1.5
5  4.940656e-324  1  1.5
6   0.000000e+00  0  1.5
7  4.940656e-324  1  1.5
8   0.000000e+00  0  1.5
9  4.940656e-324  1  1.5
>>> bc.sql('SELECT CASE WHEN a <> 1 THEN 0 ELSE b END as s, a, b from tbl').s.sum()
30956
2.5e-323
>>> int(bc.sql('SELECT CASE WHEN a <> 1 THEN 0 ELSE b END as s, a, b from tbl').s.sum())
30956
0

Expected behavior

>>> df = cudf.DataFrame()
>>> df['a'] = [i%2 for i in range(10)]
>>> df['b'] = 1.5
>>> df.b.sum()
15.0
>>> df[df.a == 1]['b'].sum()
7.5
>>> df
   a    b
0  0  1.5
1  1  1.5
2  0  1.5
3  1  1.5
4  0  1.5
5  1  1.5
6  0  1.5
7  1  1.5
8  0  1.5
9  1  1.5
>>> int(df[df.a == 1]['b'].sum())
7

*When I use the actual data I also see some NaNs in the query result, even though the input rows don't contain any NaNs.
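One observation that may help narrow this down: the value 4.940656e-324 in the `s` column is the smallest positive subnormal double, which is also what the 8 bytes of the integer 1 look like when read as a float64. The sketch below only demonstrates that numeric coincidence with the standard library. Whether the engine actually reinterprets an integer buffer as a double in this CASE expression is an assumption, not something the report confirms.

```python
import struct

# Reinterpret the 8 bytes of the int64 value 1 as a float64.
as_double = struct.unpack("<d", struct.pack("<q", 1))[0]

# The smallest positive subnormal double (displayed as 4.940656e-324 in the report).
print(as_double)

# Five rows take the ELSE branch in the example, so sum five of these values.
total = 5 * as_double
print(total)
print(int(total))
```

The three printed values can be compared with the `s` column, `.s.sum()`, and `int(...sum())` in the report above.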

In《Data Lake to AI — BlazingSQL + RAPIDS Initial Benchmark》,the E2E workflow Repo had gone?

The post URL is https://blog.blazingdb.com/data-lake-to-ai-blazingsql-rapids-initial-benchmark-aa753031ac8b

To see the E2E workflow, see our Public Github Repo.

You can see the full workload at the link above, but we want to go over a few code snippets to show you how you can expect to interact with BlazingSQL.

The public GitHub repo URL is https://github.com/BlazingDB/blazingsql-public-demos/blob/master/mortgage-xgboost/e2e.py, but it returns a 404.

[BUG] Runs Slow on First Run

Describe the bug
All queries run slower on the first launch of BlazingSQL. If I launch a Python kernel with BlazingSQL, all queries and create_table statements run slowly. If I restart the kernel and launch again, BlazingSQL gets dramatically faster.

Steps/Code to reproduce bug
I launched a new GCP server with CUDA 10.0 installed.
I installed miniconda and then installed bsql with dask-cuda and jupyterlab:

conda install -c blazingsql-nightly/label/cuda10.0 -c blazingsql-nightly -c rapidsai-nightly -c conda-forge -c defaults blazingsql
conda install -c rapidsai-nightly dask-cuda

I launched JupyterLab and executed the following code:

from blazingsql import BlazingContext
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
cluster = LocalCUDACluster()
client = Client(cluster)
bc = BlazingContext(dask_client=client, network_interface='lo')
files = ['/home/rodrigo/tpch_sf100/lineitem/0_0_' + str(num) + '.parquet' for num in range(0,72)]
bc.create_table('lineitem',files)

The above code runs dramatically faster if I restart the kernel and run it again. Something seems to make the first kernel launched run slower than it should.
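To put numbers on the difference, each call can be timed separately in the first kernel and again after a restart. The helper below is a minimal sketch using only the standard library; `time_runs` is a hypothetical name and not part of BlazingSQL.

```python
import time


def time_runs(fn, repeats=3):
    """Call fn() `repeats` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings
```

For instance, `time_runs(lambda: bc.create_table('lineitem', files))` run in a fresh kernel and again after a restart would show whether only the very first call is slow or every call in the first kernel is. One possibility worth ruling out (an assumption, not a diagnosis) is that the second kernel benefits from the operating system having already cached the Parquet files.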

Expected behavior
It should run equally fast on the first launch and on the nth launch.

Environment overview (please complete the following information)

  • n1-standard-8 w/ 4 Tesla T4 GPUs, CUDA 10.0, Ubuntu 16.04
  • Method of BSQL install: [conda, Docker, or from source]
    • conda blazingsql-nightly


[QST] Latest Source Build Failing

Describe the bug

I am trying to do a fresh source build, but it fails with the following build output:

-- The following REQUIRED packages have been found:

 * GTest

-- Configuring done
-- Generating done
-- Build files have been written to: /opt/conda/envs/bzsqlenv/blazingdb-communication/build
[  4%] Building CXX object CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/Server.cc.o
[  9%] Building CXX object CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/Client.cc.o
[ 13%] Building CXX object CMakeFiles/blazingdb-manager.dir/src/blazingdb/manager/Manager.cc.o
[ 18%] Building CXX object CMakeFiles/blazingdb-manager.dir/src/blazingdb/transport/io/fd_reader_writer.cpp.o
[ 22%] Building CXX object CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/io/fd_reader_writer.cpp.o
[ 27%] Building CXX object CMakeFiles/blazingdb-transport.dir/src/blazingdb/manager/Manager.cc.o
/opt/conda/envs/bzsqlenv/blazingdb-communication/src/blazingdb/transport/io/fd_reader_writer.cpp:4:19: fatal error: zmq.hpp: No such file or directory
compilation terminated.
CMakeFiles/blazingdb-manager.dir/build.make:153: recipe for target 'CMakeFiles/blazingdb-manager.dir/src/blazingdb/transport/io/fd_reader_writer.cpp.o' failed
make[2]: *** [CMakeFiles/blazingdb-manager.dir/src/blazingdb/transport/io/fd_reader_writer.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /opt/conda/envs/bzsqlenv/blazingdb-communication/src/blazingdb/transport/Client.cc:4:0:
/opt/conda/envs/bzsqlenv/blazingdb-communication/include/blazingdb/network/TCPSocket.h:13:19: fatal error: zmq.hpp: No such file or directory
compilation terminated.
CMakeFiles/blazingdb-transport.dir/build.make:75: recipe for target 'CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/Client.cc.o' failed
make[2]: *** [CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/Client.cc.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /opt/conda/envs/bzsqlenv/blazingdb-communication/src/blazingdb/transport/Server.cc:4:0:
/opt/conda/envs/bzsqlenv/blazingdb-communication/include/blazingdb/network/TCPSocket.h:13:19: fatal error: zmq.hpp: No such file or directory
compilation terminated.
CMakeFiles/blazingdb-transport.dir/build.make:88: recipe for target 'CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/Server.cc.o' failed
make[2]: *** [CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/Server.cc.o] Error 1
/opt/conda/envs/bzsqlenv/blazingdb-communication/src/blazingdb/transport/io/fd_reader_writer.cpp:4:19: fatal error: zmq.hpp: No such file or directory
compilation terminated.
CMakeFiles/blazingdb-transport.dir/build.make:153: recipe for target 'CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/io/fd_reader_writer.cpp.o' failed
make[2]: *** [CMakeFiles/blazingdb-transport.dir/src/blazingdb/transport/io/fd_reader_writer.cpp.o] Error 1
In file included from /opt/conda/envs/bzsqlenv/blazingdb-communication/src/blazingdb/manager/Manager.cc:6:0:
/opt/conda/envs/bzsqlenv/blazingdb-communication/include/blazingdb/network/TCPSocket.h:13:19: fatal error: zmq.hpp: No such file or directory
compilation terminated.
CMakeFiles/blazingdb-manager.dir/build.make:101: recipe for target 'CMakeFiles/blazingdb-manager.dir/src/blazingdb/manager/Manager.cc.o' failed
make[2]: *** [CMakeFiles/blazingdb-manager.dir/src/blazingdb/manager/Manager.cc.o] Error 1
CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/blazingdb-manager.dir/all' failed
make[1]: *** [CMakeFiles/blazingdb-manager.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from /opt/conda/envs/bzsqlenv/blazingdb-communication/src/blazingdb/manager/Manager.cc:6:0:
/opt/conda/envs/bzsqlenv/blazingdb-communication/include/blazingdb/network/TCPSocket.h:13:19: fatal error: zmq.hpp: No such file or directory
compilation terminated.
CMakeFiles/blazingdb-transport.dir/build.make:166: recipe for target 'CMakeFiles/blazingdb-transport.dir/src/blazingdb/manager/Manager.cc.o' failed
make[2]: *** [CMakeFiles/blazingdb-transport.dir/src/blazingdb/manager/Manager.cc.o] Error 1
CMakeFiles/Makefile2:649: recipe for target 'CMakeFiles/blazingdb-transport.dir/all' failed
make[1]: *** [CMakeFiles/blazingdb-transport.dir/all] Error 2
Makefile:140: recipe for target 'all' failed
make: *** [all] Error 2
######################################################################### Build failed blazingdb-communication @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@


Steps/Code to reproduce bug

conda activate bzsqlenv
conda install -c conda-forge maven flatbuffers gtest gmock rapidjson
conda install -c conda-forge -c rapidsai-nightly/label/cuda10.0 cudf=0.10.0a191009
conda install -c rapidsai-nightly/label/cuda10.0 dask-cudf=0.10.0a191009
conda install -c conda-forge -c blazingsql blazingsql-toolchain
cd $CONDA_PREFIX
git clone https://github.com/BlazingDB/pyBlazing.git
cd pyBlazing/scripts
build-all.sh

Any ideas?
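The compiler error says that `zmq.hpp` is not on the include path. That header is the C++ binding for ZeroMQ and is shipped by the `cppzmq` package on conda-forge. A quick way to see whether it is present in the active environment is to look under the usual include directories. The helper below is a hypothetical sketch; whether the toolchain package is meant to provide the header is not stated in this report.

```python
import os


def find_header(name, prefixes):
    """Return every existing `<prefix>/include/<name>` path."""
    hits = []
    for prefix in prefixes:
        candidate = os.path.join(prefix, "include", name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # CONDA_PREFIX is set by `conda activate`; the other prefixes are common system locations.
    prefixes = [p for p in (os.environ.get("CONDA_PREFIX"), "/usr", "/usr/local") if p]
    print(find_header("zmq.hpp", prefixes) or "zmq.hpp not found")
```

If nothing is found, installing `cppzmq` into the environment before running the build script is one thing to try.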

[DOC] mkdir /blazingsql

Report incorrect documentation

Location of incorrect documentation
https://github.com/BlazingDB/pyBlazing#install-using-conda

Describe the problems or issues found in the documentation
When installing with Conda, changing the first two commands from this:

mkdir /blazingsql # Make a blazingsql directory in root folder for Apache Calcite schema management. This will requirement will be removed soon.

chown <user_name> /blazingsql

to this:

sudo mkdir /blazingsql
sudo chmod 777 /blazingsql

resolved my comment in issue #72

Steps taken to verify documentation is incorrect
Installed in multiple Conda environments, with both blazingsql and blazingsql-nightly

Suggested fix for documentation
Add sudo before mkdir /blazingsql in the instructions to make the first line read:

sudo mkdir /blazingsql

I'm not sure whether changing the second line is needed, or whether chmod 777 is preferable to chown <user_name>.
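Whichever permission command ends up in the docs, the outcome that matters is that the user running BlazingSQL can write to the directory. A minimal sketch of that check, assuming the `/blazingsql` path from the instructions above (`dir_is_usable` is a hypothetical helper name):

```python
import os


def dir_is_usable(path):
    """True if `path` is an existing directory the current user can write to and enter."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)
```

Running `dir_is_usable("/blazingsql")` as the intended user after either the chown or the chmod variant shows whether that variant is sufficient on a given machine.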

[BUG] java.lang.NoClassDefFoundError: CatalogColumnDataType

After installing via:

conda install -c blazingsql/label/cuda10.0 -c blazingsql -c rapidsai -c nvidia -c conda-forge -c defaults blazingsql python=3.6 cudatoolkit=10.0

I get the following error on import:

java.lang.NoClassDefFoundError: com/blazingdb/calcite/catalog/domain/CatalogColumnDataType

I followed the recommended install directions and haven't found a way around this issue. I tried the nightly build too, but it gave the same error.

[BUG] Table gets created in bad state when files not found

from blazingsql import BlazingContext
import cudf
bc = BlazingContext(dask_client=client)
bc.create_table('lineitem', 's3://bsql_data/tpch_sf1/lineitem/0_0_0.parquet') #this is a user error but a common one
bc.s3('bsql_data', bucket_name='blab', access_key_id='', secret_key='')
bc.create_table('lineitem', 's3://bsql_data/tpch_sf1/lineitem/0_0_0.parquet')

If a user accidentally creates a table before registering the file system, the table doesn't get created properly and can't be dropped without restarting. When table creation fails, it should leave behind no state that would affect creating that same table again.
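The requested behavior amounts to validating everything before recording the table. The toy catalog below is a sketch of that ordering only. `TableRegistry` and its methods are hypothetical and do not reflect how BlazingSQL stores tables internally.

```python
class TableRegistry:
    """Toy stand-in for a table catalog; not BlazingSQL's implementation."""

    def __init__(self):
        self.filesystems = set()
        self.tables = {}

    def register_filesystem(self, prefix):
        self.filesystems.add(prefix)

    def create_table(self, name, uri):
        # Validate first: nothing is written to self.tables until the URI resolves.
        scheme, _, rest = uri.partition("://")
        prefix = rest.split("/", 1)[0]
        if scheme == "s3" and prefix not in self.filesystems:
            raise ValueError(f"filesystem {prefix!r} is not registered")
        self.tables[name] = uri
```

With this ordering, a failed `create_table` leaves `tables` untouched, so the same call succeeds once the filesystem has been registered.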

[BUG] Some JOIN workloads seem to crash with RMM errors

Describe the bug
Sometimes, when running queries with JOIN, we see crashes like:

ERROR: CUDA Runtime call cudaStreamSynchronize(stream) in line 283 of file /conda/envs/bzsqlenv/blazingdb-ral/src/Interpreter/interpreter_cpp.cu failed with an illegal memory access was encountered (77).
terminate called after throwing an instance of 'thrust::system::system_error'
 what():  rmm_allocator::deallocate(): RMM_FREE: __global__ function call is not configured
Aborted (core dumped)

Steps/Code to reproduce bug

algebra = """LogicalProject(visit_nbr=[$0], store_nbr=[$1], scan_type=[$3], rpt_cd=[$4], upc_nbr=[$5], othr_income_ind=[$6], scan_rtl_amt=[$10], visit_dt=[$11], scan_cnt=[$12], vendor_nbr=[CASE($14, $17, null:INTEGER)], CASE=[CASE(=(CAST($3):INTEGER, 3), 38, $2)], *=[*($7, $8)], CASE12=[CASE(=(CAST($9):INTEGER, 1), $10, 0:DOUBLE)], CASE13=[CASE(AND(>=($11, 2018-01-27), <($11, 2019-01-26), =(CAST($3):INTEGER, 0), =(CAST($16):INTEGER, 4), <>(/($10, $7), $18), OR(=(CASE($14, $17, null:INTEGER), 467830), =(CASE($14, $17, null:INTEGER), 475632), =(CASE($14, $17, null:INTEGER), 735105)), =(CAST($9):INTEGER, 1)), 0:DOUBLE, <>(CAST($9):INTEGER, 1), 0:DOUBLE, $7)], CASE14=[CASE(AND(>=($11, 2018-01-27), <($11, 2019-01-26), =(CAST($3):INTEGER, 0), =(CAST($16):INTEGER, 4), <>(/($10, $7), $18), OR(=(CASE($14, $17, null:INTEGER), 467830), =(CASE($14, $17, null:INTEGER), 475632), =(CASE($14, $17, null:INTEGER), 735105))), 0:DOUBLE, $7)], CASE15=[CASE(=(CAST($13):INTEGER, 1), $10, 0:DOUBLE)], CASE16=[CASE(AND(>=($11, 2018-01-27), <($11, 2019-01-26), =(CAST($3):INTEGER, 0), =(CAST($16):INTEGER, 4), <>(/($10, $7), $18), OR(=(CASE($14, $17, null:INTEGER), 467830), =(CASE($14, $17, null:INTEGER), 475632), =(CASE($14, $17, null:INTEGER), 735105))), 0:DOUBLE, <>(CAST($13):INTEGER, 1), 0:DOUBLE, $7)], ==[=($1, 7368)], =18=[=($1, 241)], =19=[=($1, 3335)], =20=[=($1, 2466)], =21=[=($1, 717)], =22=[=($1, 2168)], =23=[=($1, 3816)], =24=[=($1, 5746)], =25=[=($1, 5915)], =26=[=($1, 245)], =27=[=($1, 997)], =28=[=($1, 4108)], >==[>=($11, 2019-01-01)])
 LogicalTableScan(table=[[main, res]])"""
res1 = bc.sql("select whatever it does not matter", algebra=algebra)

Expected behavior
Execution without crash

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of cuDF install: from source (latest)

Environment details
 ***git***
 commit 8c50b2c7ccfcd2aed9e19558e276d3952d9e7629 (HEAD -> develop, origin/feature/gpuci, origin/develop, origin/HEAD)
 Author: Percy Camilo Triveño Aucahuasi <[email protected]>
 Date:   Mon Dec 2 12:32:38 2019 -0500
 
 Add dask-cuda as dep
 ***git submodules***
 
 ***OS Information***
 DISTRIB_ID=Ubuntu
 DISTRIB_RELEASE=16.04
 DISTRIB_CODENAME=xenial
 DISTRIB_DESCRIPTION="Ubuntu 16.04.4 LTS"
 NAME="Ubuntu"
 VERSION="16.04.4 LTS (Xenial Xerus)"
 ID=ubuntu
 ID_LIKE=debian
 PRETTY_NAME="Ubuntu 16.04.4 LTS"
 VERSION_ID="16.04"
 HOME_URL="http://www.ubuntu.com/"
 SUPPORT_URL="http://help.ubuntu.com/"
 BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
 VERSION_CODENAME=xenial
 UBUNTU_CODENAME=xenial
 Linux pctabz 4.15.0-70-generic #79~16.04.1-Ubuntu SMP Tue Nov 12 14:01:10 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
 
 ***GPU Information***
 Tue Dec  3 11:16:34 2019
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |===============================+======================+======================|
 |   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
 | N/A   51C    P0    N/A /  N/A |      0MiB /  4042MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 
 +-----------------------------------------------------------------------------+
 | Processes:                                                       GPU Memory |
 |  GPU       PID   Type   Process name                             Usage      |
 |=============================================================================|
 |  No running processes found                                                 |
 +-----------------------------------------------------------------------------+
 
 ***CPU***
 Architecture:          x86_64
 CPU op-mode(s):        32-bit, 64-bit
 Byte Order:            Little Endian
 CPU(s):                8
 On-line CPU(s) list:   0-7
 Thread(s) per core:    2
 Core(s) per socket:    4
 Socket(s):             1
 NUMA node(s):          1
 Vendor ID:             GenuineIntel
 CPU family:            6
 Model:                 158
 Model name:            Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
 Stepping:              9
 CPU MHz:               3479.683
 CPU max MHz:           3800,0000
 CPU min MHz:           800,0000
 BogoMIPS:              5616.00
 Virtualization:        VT-x
 L1d cache:             32K
 L1i cache:             32K
 L2 cache:              256K
 L3 cache:              6144K
 NUMA node0 CPU(s):     0-7
 Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
 
 ***CMake***
 /home/percy/Applications/anaconda/conda/envs/new3/bin/cmake
 cmake version 3.15.5
 
 CMake suite maintained and supported by Kitware (kitware.com/cmake).
 
 ***g++***
 /usr/bin/g++
 g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
 Copyright (C) 2015 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
 
 ***nvcc***
 
 ***Python***
 /home/percy/Applications/anaconda/conda/envs/new3/bin/python
 Python 3.7.4
 
 ***Environment Variables***
 PATH                            : /home/percy/Applications/anaconda/conda/envs/new3/bin:/home/percy/Applications/anaconda/conda/condabin:/home/percy/Applications/gcloud/google-cloud-sdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/home/percy/Applications/docker-compose/current:/home/percy/Applications/kubectl:/home/percy/Applications/minikube:/home/percy/Applications/ctop:/home/percy/Applications/anaconda/conda/bin
 LD_LIBRARY_PATH                 :
 NUMBAPRO_NVVM                   :
 NUMBAPRO_LIBDEVICE              :
 CONDA_PREFIX                    : /home/percy/Applications/anaconda/conda/envs/new3
 PYTHON_PATH                     :
 
 ***conda packages***
 /home/percy/Applications/anaconda/conda/condabin/conda
 # packages in environment at /home/percy/Applications/anaconda/conda/envs/new3:
 #
 # Name                    Version                   Build  Channel
 _libgcc_mutex             0.1                        main
 arrow-cpp                 0.15.0           py37h090bef1_2    conda-forge
 blazingsql                0.6                      pypi_0    pypi
 bokeh                     1.4.0                    py37_0    conda-forge
 boost-cpp                 1.70.0               h8e57a91_2    conda-forge
 brotli                    1.0.7             he1b5a44_1000    conda-forge
 bsql-engine               0.6                      pypi_0    pypi
 bsql-toolchain            0.4.7                         0    blazingsql-nightly
 bsql-toolchain-aws-cpp    0.4.7                         0    blazingsql-nightly
 bsql-toolchain-gcp-cpp    0.4.7                         0    blazingsql-nightly
 bzip2                     1.0.8                h516909a_1    conda-forge
 c-ares                    1.15.0            h516909a_1001    conda-forge
 ca-certificates           2019.11.28           hecc5488_0    conda-forge
 certifi                   2019.11.28               py37_0    conda-forge
 cffi                      1.13.2           py37h8022711_0    conda-forge
 chardet                   3.0.4                 py37_1003    conda-forge
 click                     7.0                        py_0    conda-forge
 cloudpickle               1.2.2                      py_1    conda-forge
 cmake                     3.15.5               hf94ab9c_0    conda-forge
 cppzmq                    4.4.1                hc9558a2_0    conda-forge
 cryptography              2.8              py37h72c5cf5_0    conda-forge
 cudatoolkit               10.0.130                      0
 cudf                      0.11.0a191202         py37_3723    rapidsai-nightly/label/cuda10.0
 cudnn                     7.6.4                cuda10.0_0
 cupy                      6.5.0            py37h49a79c6_0    conda-forge
 curl                      7.67.0               hbc83047_0
 cyrus-sasl                2.1.26               h82bf5a1_4
 cython                    0.29.14          py37he1b5a44_0    conda-forge
 cytoolz                   0.10.1           py37h516909a_0    conda-forge
 dask                      2.8.1                      py_0    conda-forge
 dask-core                 2.8.1                      py_0    conda-forge
 dask-cuda                 0.11.0a0+16.gac37fed          pypi_0    pypi
 dask-cudf                 0.11.0a191202         py37_3723    rapidsai-nightly/label/cuda10.0
 distributed               2.8.1                      py_0    conda-forge
 dlpack                    0.2                  he1b5a44_1    conda-forge
 double-conversion         3.1.5                he1b5a44_2    conda-forge
 expat                     2.2.5             he1b5a44_1004    conda-forge
 fastavro                  0.22.7           py37h516909a_0    conda-forge
 fastrlock                 0.4             py37he1b5a44_1000    conda-forge
 freetype                  2.10.0               he983fc9_1    conda-forge
 fsspec                    0.6.1                      py_0    conda-forge
 future                    0.18.2                   py37_0    conda-forge
 gflags                    2.2.2             he1b5a44_1002    conda-forge
 glog                      0.4.0                he1b5a44_1    conda-forge
 gmock                     1.10.0                        1    conda-forge
 grpc-cpp                  1.23.0               h18db393_0    conda-forge
 gtest                     1.10.0               hc9558a2_1    conda-forge
 heapdict                  1.0.1                      py_0    conda-forge
 icu                       64.2                 he1b5a44_1    conda-forge
 idna                      2.8                   py37_1000    conda-forge
 jinja2                    2.10.3                     py_0    conda-forge
 jpeg                      9c                h14c3975_1001    conda-forge
 jpype1                    0.7              py37h9de70de_0    conda-forge
 krb5                      1.16.1               h173b8e3_7
 libblas                   3.8.0               14_openblas    conda-forge
 libcblas                  3.8.0               14_openblas    conda-forge
 libcudf                   0.11.0a191202     cuda10.0_3723    rapidsai-nightly/label/cuda10.0
 libcurl                   7.67.0               h20c2e04_0
 libdb                     6.1.26            hf484d3e_2000    conda-forge
 libedit                   3.1.20181209         hc058e9b_0
 libevent                  2.1.10               h72c5cf5_0    conda-forge
 libffi                    3.2.1                hd88cf55_4
 libgcc-ng                 9.1.0                hdf63c60_0
 libgfortran-ng            7.3.0                hdf63c60_2    conda-forge
 liblapack                 3.8.0               14_openblas    conda-forge
 libllvm8                  8.0.1                hc9558a2_0    conda-forge
 libntlm                   1.4               h14c3975_1002    conda-forge
 libnvstrings              0.11.0a191202     cuda10.0_3723    rapidsai-nightly/label/cuda10.0
 libopenblas               0.3.7                h5ec1e0e_4    conda-forge
 libpng                    1.6.37               hed695b0_0    conda-forge
 libprotobuf               3.8.0                h8b12597_0    conda-forge
 librmm                    0.11.0b191202       cuda10.0_58    rapidsai-nightly/label/cuda10.0
 libsodium                 1.0.17               h516909a_0    conda-forge
 libssh2                   1.8.2                h22169c7_2    conda-forge
 libstdcxx-ng              9.1.0                hdf63c60_0
 libtiff                   4.1.0                hfc65ed5_0    conda-forge
 libuv                     1.33.1               h516909a_0    conda-forge
 llvmlite                  0.30.0           py37h8b12597_1    conda-forge
 locket                    0.2.0                      py_2    conda-forge
 lz4-c                     1.8.3             he1b5a44_1001    conda-forge
 markupsafe                1.1.1            py37h516909a_0    conda-forge
 maven                     3.6.0                         0    conda-forge
 msgpack-python            0.6.2            py37hc9558a2_0    conda-forge
 nccl                      2.4.8.1              hd6f8bf8_1    conda-forge
 ncurses                   6.1                  he6710b0_1
 netifaces                 0.10.9          py37h516909a_1000    conda-forge
 numba                     0.46.0           py37hb3f55d8_1    conda-forge
 numpy                     1.17.3           py37h95a1406_0    conda-forge
 nvstrings                 0.11.0a191202         py37_3723    rapidsai-nightly/label/cuda10.0
 olefile                   0.46                       py_0    conda-forge
 openjdk                   8.0.192           h14c3975_1003    conda-forge
 openssl                   1.1.1d               h516909a_0    conda-forge
 packaging                 19.2                       py_0    conda-forge
 pandas                    0.24.2           py37hb3f55d8_1    conda-forge
 parquet-cpp               1.5.1                         2    conda-forge
 partd                     1.0.0                      py_0    conda-forge
 pillow                    6.2.1            py37h34e0f95_0
 pip                       19.3.1                   py37_0
 psutil                    5.6.7            py37h516909a_0    conda-forge
 pyarrow                   0.15.0           py37h8b68381_1    conda-forge
 pycparser                 2.19                     py37_1    conda-forge
 pyhive                    0.6.1                    py37_0
 pynvml                    8.0.3                      py_0    conda-forge
 pyopenssl                 19.1.0                   py37_0    conda-forge
 pyparsing                 2.4.5                      py_0    conda-forge
 pysocks                   1.7.1                    py37_0    conda-forge
 python                    3.7.4                h265db76_1
 python-dateutil           2.8.1                      py_0    conda-forge
 pytz                      2019.3                     py_0    conda-forge
 pyyaml                    5.1.2            py37h516909a_1    conda-forge
 rapidjson                 1.1.0             he1b5a44_1002    conda-forge
 re2                       2019.12.01           he1b5a44_0    conda-forge
 readline                  7.0                  h7b6447c_5
 requests                  2.22.0                   py37_1    conda-forge
 rhash                     1.3.6             h14c3975_1001    conda-forge
 rmm                       0.11.0b191202           py37_58    rapidsai-nightly/label/cuda10.0
 sasl                      0.2.1           py37hf484d3e_1001    conda-forge
 setuptools                42.0.1                   py37_0
 six                       1.13.0                   py37_0    conda-forge
 snappy                    1.1.7             he1b5a44_1002    conda-forge
 sortedcontainers          2.1.0                      py_0    conda-forge
 sqlalchemy                1.3.11           py37h516909a_0    conda-forge
 sqlite                    3.30.1               h7b6447c_0
 tblib                     1.4.0                      py_0    conda-forge
 thrift                    0.11.0          py37he1b5a44_1001    conda-forge
 thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
 thrift_sasl               0.3.0           py37h516909a_1001    conda-forge
 tk                        8.6.8                hbc83047_0
 toolz                     0.10.0                     py_0    conda-forge
 tornado                   6.0.3            py37h516909a_0    conda-forge
 uriparser                 0.9.3                he1b5a44_1    conda-forge
 urllib3                   1.25.7                   py37_0    conda-forge
 wheel                     0.33.6                   py37_0
 xz                        5.2.4                h14c3975_4
 yaml                      0.2.2                he1b5a44_0    conda-forge
 zeromq                    4.3.2                he1b5a44_2    conda-forge
 zict                      1.0.0                      py_0    conda-forge
 zlib                      1.2.11               h7b6447c_3
 zstd                      1.4.3                h3b9ef0a_0    conda-forge

Additional context

About BlazingSQL processing power and scalability

1. On the official website, the data processed by BlazingSQL and Spark is 15.6 GB, which is less than the 16 GB of memory on the T4 GPU. If the data volume exceeds 16 GB, can the T4 GPU still process it?
2. I didn't see an introduction to distributed execution on the official website, or a description of how it scales out. Is the distributed capability of Apache Arrow used?

[QST] blazingsql on pypi ?

I see the project has a setup.py, and the last commit mentions conda. However, it's not yet published to PyPI. Is this planned?

[QST] Working with Dask and BlazingSQL

I am trying to scale to multiple GPUs and multiple nodes using dask_cudf and BlazingSQL, and I have some questions based on my experimentation:

1. Does the DataFrame have to be persisted in memory?

I get the following KeyError when I try to create the table as shown below.

From the error, I gather that it currently only works if the DataFrame is persisted in memory.
It would also be nice to check for this and raise an error.

df = dask_cudf.from_cudf(cudf.DataFrame({'x':[1,2]*16,'y':[0,1]*16}), npartitions=8)
result = bc.create_table('test',df)

Error Trace:

/opt/conda/envs/rapids/lib/python3.7/site-packages/pyblazing/api.py in dask_cudf_to_BlazingDaskTable(dask_cudf, dask_client)
   1407     who_has = dask_client.who_has()
   1408     ips = [re.findall(r'(?:\d+\.){3}\d+', who_has[str(k)][0])[0]
-> 1409            for k in dask_cudf.dask.keys()]
   1410 
   1411     dask_cudf_ret = [NodeTableSchema(ip=p[0], gdf=tableSchemaFrom(p[1]))

/opt/conda/envs/rapids/lib/python3.7/site-packages/pyblazing/api.py in <listcomp>(.0)
   1407     who_has = dask_client.who_has()
   1408     ips = [re.findall(r'(?:\d+\.){3}\d+', who_has[str(k)][0])[0]
-> 1409            for k in dask_cudf.dask.keys()]
   1410 
   1411     dask_cudf_ret = [NodeTableSchema(ip=p[0], gdf=tableSchemaFrom(p[1]))

KeyError: "('from_pandas-72100d0ce723a00bc5c6e285ba433698', 0)"
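The check suggested above could look roughly like the sketch below. It mirrors the lookup visible in the trace (`who_has[str(k)]` for each key of the DataFrame's graph), but reports missing keys with an explicit message instead of a bare KeyError. `check_partitions_located` is a hypothetical helper, not a pyblazing function, and it takes plain Python inputs so that it can be tried without a cluster.

```python
def check_partitions_located(graph_keys, who_has):
    """Raise a descriptive error if any partition key has no worker holding it.

    graph_keys: iterable of task keys (for example, the keys of the DataFrame's graph)
    who_has:    mapping of stringified key -> list of worker addresses,
                indexed the same way as in the trace above
    """
    missing = [key for key in graph_keys if not who_has.get(str(key))]
    if missing:
        raise ValueError(
            f"{len(missing)} partition(s) are not in worker memory; "
            "call persist() and wait() on the DataFrame before create_table"
        )
```

Called at the top of the conversion routine, this would turn the KeyError into an actionable message for users who pass a DataFrame that has not been persisted.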

2. With persisted DataFrames when the number of partitions != the number of workers:

From the comment at #92 (comment), I gather that BlazingSQL currently only supports DataFrames whose number of partitions equals the number of workers. Is that correct?

3. Persisted DataFrames with the number of partitions == the number of workers:

I can create the table, but I can't seem to query it. Any ideas about what could be happening here?

For example, the code below works:

df = dask_cudf.from_cudf(cudf.DataFrame({'x':[1,2]*16,'y':[0,1]*16}), npartitions=8)
df = df.persist()
done = wait(df)
# print(client.has_what())

result = bc.create_table('test',df)

But when I try to query it as shown below, it fails with the following trace:

table_result = bc.sql('select x from test').get()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-f481e632c131> in <module>
----> 1 table_result = bc.sql('select x from test').get()

/opt/conda/envs/rapids/lib/python3.7/site-packages/pyblazing/apiv2/sql.py in get(self)
     65             for worker in list(self.dask_client.scheduler_info()["workers"]):
     66                 dask_futures.append(self.dask_client.submit(internal_api.convert_to_dask, self.metaToken, self.client, workers = [worker]))
---> 67             temp = dd.from_delayed(dask_futures)
     68 
     69         return temp

/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/io/io.py in from_delayed(dfs, meta, divisions, prefix, verify_meta)
    561 
    562     if meta is None:
--> 563         meta = delayed(make_meta)(dfs[0]).compute()
    564     else:
    565         meta = make_meta(meta)

/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py in compute(self, **kwargs)
    163         dask.base.compute
    164         """
--> 165         (result,) = compute(self, traverse=False, **kwargs)
    166         return result
    167 

/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
    434     keys = [x.__dask_keys__() for x in collections]
    435     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 436     results = schedule(dsk, keys, **kwargs)
    437     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    438 

/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   2543                     should_rejoin = False
   2544             try:
-> 2545                 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   2546             finally:
   2547                 for f in futures.values():

/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   1843                 direct=direct,
   1844                 local_worker=local_worker,
-> 1845                 asynchronous=asynchronous,
   1846             )
   1847 

/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    760         else:
    761             return sync(
--> 762                 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    763             )
    764 

/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    331     if error[0]:
    332         typ, exc, tb = error[0]
--> 333         raise exc.with_traceback(tb)
    334     else:
    335         return result[0]

/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py in f()
    315             if callback_timeout is not None:
    316                 future = gen.with_timeout(timedelta(seconds=callback_timeout), future)
--> 317             result[0] = yield future
    318         except Exception as exc:
    319             error[0] = sys.exc_info()

/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/gen.py in run(self)
    733 
    734                     try:
--> 735                         value = future.result()
    736                     except Exception:
    737                         exc_info = sys.exc_info()

/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1699                             exc = CancelledError(key)
   1700                         else:
-> 1701                             raise exception.with_traceback(traceback)
   1702                         raise exc
   1703                     if errors == "skip":

/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py in __call__()
    503         """
    504         meth = self.dispatch(type(arg))
--> 505         return meth(arg, *args, **kwargs)
    506 
    507     @property

/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/utils.py in make_meta_object()
    354 
    355     if is_scalar(x):
--> 356         return _nonempty_scalar(x)
    357 
    358     raise TypeError("Don't know how to create metadata from {0}".format(x))

/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/utils.py in _nonempty_scalar()
    520         return make_scalar(dtype)
    521 
--> 522     raise TypeError("Can't handle meta of type '{0}'".format(typename(type(x))))
    523 
    524 

TypeError: Can't handle meta of type 'NoneType'

It would be amazing if you could share an example of scaling out using dask-cudf so that I can follow it for my workflows.

[BUG] OS NoneType Error Importing BlazingContext in Colab

Describe the bug

Error: TypeError: expected str, bytes or os.PathLike object, not NoneType

Using the install script from this repo's README.md (CUDA 10, Python 3.6), I was able to successfully install BlazingSQL v0.11 on a P100 instance in Google Colab (after installing RAPIDS AI v0.11, importing cuDF, and making a test Series), but ran into an error when trying to import BlazingContext.


Steps/Code to reproduce bug
Here's the notebook I ran in Google Colab. It's just the default "Get Started" notebook from rapids.ai, with the BlazingSQL v0.11 install script added in the cell below the RAPIDS install cell.

Conda install script (run in Google Colab after running rapids-colab.sh and importing cuDF):

!conda install -c blazingsql/label/cuda10.0 -c blazingsql -c rapidsai -c nvidia -c conda-forge -c defaults blazingsql python=3.6 cudatoolkit=10.0

Expected behavior
Importing BlazingContext from blazingsql succeeds without error.

Environment details

Google Colab with Hardware Accelerator set to GPU (Tesla P100-PCIE-16GB)
Output of !bash print_env.sh:

<details><summary>Click here to see environment details</summary><pre>
     
     **git***
     Not inside a git repository
     
     ***OS Information***
     DISTRIB_ID=Ubuntu
     DISTRIB_RELEASE=18.04
     DISTRIB_CODENAME=bionic
     DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
     NAME="Ubuntu"
     VERSION="18.04.3 LTS (Bionic Beaver)"
     ID=ubuntu
     ID_LIKE=debian
     PRETTY_NAME="Ubuntu 18.04.3 LTS"
     VERSION_ID="18.04"
     HOME_URL="https://www.ubuntu.com/"
     SUPPORT_URL="https://help.ubuntu.com/"
     BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
     PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
     VERSION_CODENAME=bionic
     UBUNTU_CODENAME=bionic
     Linux 8a41bde5458e 4.14.137+ #1 SMP Thu Aug 8 02:47:02 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
     
     ***GPU Information***
     Sun Dec 22 15:07:04 2019
     +-----------------------------------------------------------------------------+
     | NVIDIA-SMI 440.44       Driver Version: 418.67       CUDA Version: 10.1     |
     |-------------------------------+----------------------+----------------------+
     | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
     | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
     |===============================+======================+======================|
     |   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
     | N/A   36C    P0    32W / 250W |    407MiB / 16280MiB |      0%      Default |
     +-------------------------------+----------------------+----------------------+
     
     +-----------------------------------------------------------------------------+
     | Processes:                                                       GPU Memory |
     |  GPU       PID   Type   Process name                             Usage      |
     |=============================================================================|
     +-----------------------------------------------------------------------------+
     
     ***CPU***
     Architecture:        x86_64
     CPU op-mode(s):      32-bit, 64-bit
     Byte Order:          Little Endian
     CPU(s):              2
     On-line CPU(s) list: 0,1
     Thread(s) per core:  2
     Core(s) per socket:  1
     Socket(s):           1
     NUMA node(s):        1
     Vendor ID:           GenuineIntel
     CPU family:          6
     Model:               79
     Model name:          Intel(R) Xeon(R) CPU @ 2.20GHz
     Stepping:            0
     CPU MHz:             2200.000
     BogoMIPS:            4400.00
     Hypervisor vendor:   KVM
     Virtualization type: full
     L1d cache:           32K
     L1i cache:           32K
     L2 cache:            256K
     L3 cache:            56320K
     NUMA node0 CPU(s):   0,1
     Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities
     
     ***CMake***
     /usr/local/bin/cmake
     cmake version 3.12.0
     
     CMake suite maintained and supported by Kitware (kitware.com/cmake).
     
     ***g++***
     /usr/bin/g++
     g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
     Copyright (C) 2017 Free Software Foundation, Inc.
     This is free software; see the source for copying conditions.  There is NO
     warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     
     
     ***nvcc***
     /usr/local/cuda/bin/nvcc
     nvcc: NVIDIA (R) Cuda compiler driver
     Copyright (c) 2005-2018 NVIDIA Corporation
     Built on Sat_Aug_25_21:08:01_CDT_2018
     Cuda compilation tools, release 10.0, V10.0.130
     
     ***Python***
     /usr/local/bin/python
     Python 3.6.7
     
     ***Environment Variables***
     PATH                            : /usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin:/opt/bin
     LD_LIBRARY_PATH                 : /usr/lib64-nvidia
     NUMBAPRO_NVVM                   :
     NUMBAPRO_LIBDEVICE              :
     CONDA_PREFIX                    :
     PYTHON_PATH                     :
     
     ***conda packages***
     /usr/local/bin/conda
     # packages in environment at /usr/local:
     #
     # Name                    Version                   Build  Channel
     _libgcc_mutex             0.1                        main    conda-forge
     arrow-cpp                 0.15.0           py36h090bef1_2    conda-forge
     asn1crypto                0.24.0                   py36_0
     blazingsql                0.11            cuda10.0_py36_2    blazingsql/label/cuda10.0
     blinker                   1.4                        py_1    conda-forge
     bokeh                     1.4.0                    py36_0    conda-forge
     boost-cpp                 1.70.0               h8e57a91_2    conda-forge
     brotli                    1.0.7             he1b5a44_1000    conda-forge
     bsql-toolchain            0.11                          0    blazingsql
     bsql-toolchain-aws-cpp    0.11                          0    blazingsql
     bsql-toolchain-gcp-cpp    0.11                          0    blazingsql
     bzip2                     1.0.8                h516909a_2    conda-forge
     c-ares                    1.15.0            h516909a_1001    conda-forge
     ca-certificates           2019.11.28           hecc5488_0    conda-forge
     cachetools                3.1.1                      py_0    conda-forge
     cairo                     1.16.0            hfb77d84_1002    conda-forge
     certifi                   2019.11.28               py36_0    conda-forge
     cffi                      1.13.2           py36h8022711_0    conda-forge
     cfitsio                   3.470                hb60a0a2_2    conda-forge
     chardet                   3.0.4                 py36_1003    conda-forge
     click                     7.0                        py_0    conda-forge
     cloudpickle               1.2.2                      py_1    conda-forge
     conda                     4.5.4                    py36_0
     conda-env                 2.6.0                h36134e3_1
     cppzmq                    4.4.1                hc9558a2_0    conda-forge
     cryptography              2.8              py36h72c5cf5_1    conda-forge
     cudatoolkit               10.0.130                      0    nvidia
     cudf                      0.11.0b191212            py36_7    rapidsai-nightly
     cudnn                     7.6.0                cuda10.0_0    nvidia
     cugraph                   0.11.0                   py36_0    rapidsai
     cuml                      0.11.0          cuda10.0_py36_0    rapidsai
     cupy                      6.6.0            py36h809cb0f_1    conda-forge
     curl                      7.65.3               hf8cf82a_0    conda-forge
     cuspatial                 0.11.0b191212            py36_0    rapidsai-nightly
     cyrus-sasl                2.1.27               he38ecfd_0    conda-forge
     cython                    0.29.14          py36he1b5a44_0    conda-forge
     cytoolz                   0.10.1           py36h516909a_0    conda-forge
     dask                      2.9.0                      py_0    conda-forge
     dask-core                 2.9.0                      py_0    conda-forge
     dask-cuda                 0.11.0                   py36_0    rapidsai
     dask-cudf                 0.11.0b191212            py36_7    rapidsai-nightly
     decorator                 4.4.1                      py_0    conda-forge
     distributed               2.9.0                      py_0    conda-forge
     dlpack                    0.2                  he1b5a44_1    conda-forge
     double-conversion         3.1.5                he1b5a44_2    conda-forge
     expat                     2.2.5             he1b5a44_1004    conda-forge
     fastavro                  0.22.8           py36h516909a_0    conda-forge
     fastrlock                 0.4             py36he1b5a44_1000    conda-forge
     fontconfig                2.13.1            h86ecdb6_1001    conda-forge
     freetype                  2.10.0               he983fc9_1    conda-forge
     freexl                    1.0.5             h14c3975_1002    conda-forge
     fsspec                    0.6.2                      py_0    conda-forge
     future                    0.18.2                   py36_0    conda-forge
     gcsfs                     0.6.0                      py_0    conda-forge
     gdal                      2.4.3            py36h5f563d9_9    conda-forge
     geos                      3.7.2                he1b5a44_2    conda-forge
     geotiff                   1.5.1                hbd99317_7    conda-forge
     gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
     gflags                    2.2.2             he1b5a44_1002    conda-forge
     giflib                    5.1.7                h516909a_1    conda-forge
     glib                      2.58.3          py36h6f030ca_1002    conda-forge
     glog                      0.4.0                he1b5a44_1    conda-forge
     google-auth               1.10.0                     py_0    conda-forge
     google-auth-oauthlib      0.4.1                      py_0    conda-forge
     grpc-cpp                  1.23.0               h18db393_0    conda-forge
     hdf4                      4.2.13            hf30be14_1003    conda-forge
     hdf5                      1.10.5          nompi_h3c11f04_1104    conda-forge
     heapdict                  1.0.1                      py_0    conda-forge
     icu                       64.2                 he1b5a44_1    conda-forge
     idna                      2.8                   py36_1000    conda-forge
     jinja2                    2.10.3                     py_0    conda-forge
     joblib                    0.14.1                     py_0    conda-forge
     jpeg                      9c                h14c3975_1001    conda-forge
     jpype1                    0.7              py36h9de70de_0    conda-forge
     json-c                    0.13.1            h14c3975_1001    conda-forge
     kealib                    1.4.10            h58c409b_1005    conda-forge
     krb5                      1.16.4               h2fd8d38_0    conda-forge
     libblas                   3.8.0               14_openblas    conda-forge
     libcblas                  3.8.0               14_openblas    conda-forge
     libcudf                   0.11.0               cuda10.0_0    rapidsai
     libcugraph                0.11.0               cuda10.0_0    rapidsai
     libcuml                   0.11.0               cuda10.0_0    rapidsai
     libcumlprims              0.11.0               cuda10.0_0    nvidia
     libcurl                   7.65.3               hda55be3_0    conda-forge
     libcuspatial              0.11.0               cuda10.0_0    rapidsai
     libdap4                   3.20.4               hd3bb157_0    conda-forge
     libedit                   3.1.20170329      hf8c457e_1001    conda-forge
     libevent                  2.1.10               h72c5cf5_0    conda-forge
     libffi                    3.2.1             he1b5a44_1006    conda-forge
     libgcc-ng                 9.2.0                hdf63c60_0    conda-forge
     libgcrypt                 1.8.4             hf484d3e_1000    conda-forge
     libgdal                   2.4.3                h2f07a13_9    conda-forge
     libgfortran-ng            7.3.0                hdf63c60_2    conda-forge
     libgpg-error              1.36                 he1b5a44_0    conda-forge
     libgsasl                  1.8.0             h19a2143_1004    conda-forge
     libhdfs3                  2.3               h311b756_1006    conda-forge
     libiconv                  1.15              h516909a_1005    conda-forge
     libkml                    1.3.0             h4fcabce_1010    conda-forge
     liblapack                 3.8.0               14_openblas    conda-forge
     libllvm8                  8.0.1                hc9558a2_0    conda-forge
     libnetcdf                 4.7.1           nompi_h94020b1_102    conda-forge
     libntlm                   1.4               h14c3975_1002    conda-forge
     libnvstrings              0.11.0               cuda10.0_0    rapidsai
     libopenblas               0.3.7                h5ec1e0e_6    conda-forge
     libpng                    1.6.37               hed695b0_0    conda-forge
     libpq                     11.5                 hd9ab2ff_2    conda-forge
     libprotobuf               3.8.0                h8b12597_0    conda-forge
     librmm                    0.11.0               cuda10.0_0    rapidsai
     libsodium                 1.0.17               h516909a_0    conda-forge
     libspatialite             4.3.0a            h4f6d029_1032    conda-forge
     libssh2                   1.8.2                h22169c7_2    conda-forge
     libstdcxx-ng              9.2.0                hdf63c60_0    conda-forge
     libtiff                   4.1.0                hfc65ed5_0    conda-forge
     libuuid                   2.32.1            h14c3975_1000    conda-forge
     libxcb                    1.13              h14c3975_1002    conda-forge
     libxgboost                1.0.0.SNAPSHOT       cuda10.0_1    rapidsai
     libxml2                   2.9.10               hee79883_0    conda-forge
     llvmlite                  0.30.0           py36h8b12597_1    conda-forge
     locket                    0.2.0                      py_2    conda-forge
     lz4-c                     1.8.3             he1b5a44_1001    conda-forge
     markupsafe                1.1.1            py36h516909a_0    conda-forge
     msgpack-python            0.6.2            py36hc9558a2_0    conda-forge
     nccl                      2.4.6.1              cuda10.0_0    nvidia
     ncurses                   6.1               hf484d3e_1002    conda-forge
     netifaces                 0.10.9          py36h516909a_1000    conda-forge
     numba                     0.46.0           py36hb3f55d8_1    conda-forge
     numpy                     1.17.3           py36h95a1406_0    conda-forge
     nvstrings                 0.11.0b191212            py36_7    rapidsai-nightly
     oauthlib                  3.0.1                      py_0    conda-forge
     olefile                   0.46                       py_0    conda-forge
     openjdk                   8.0.192           h14c3975_1003    conda-forge
     openjpeg                  2.3.1                h981e76c_3    conda-forge
     openssl                   1.1.1d               h516909a_0    conda-forge
     packaging                 19.2                       py_0    conda-forge
     pandas                    0.24.2           py36hb3f55d8_1    conda-forge
     parquet-cpp               1.5.1                         2    conda-forge
     partd                     1.1.0                      py_0    conda-forge
     pcre                      8.43                 he1b5a44_0    conda-forge
     pillow                    6.2.1            py36hd70f55b_1    conda-forge
     pip                       19.3.1                   py36_0    conda-forge
     pixman                    0.38.0            h516909a_1003    conda-forge
     poppler                   0.67.0               h14e79db_8    conda-forge
     poppler-data              0.4.9                         1    conda-forge
     postgresql                11.5                 hc63931a_2    conda-forge
     proj                      6.2.1                hc80f0dc_0    conda-forge
     psutil                    5.6.7            py36h516909a_0    conda-forge
     pthread-stubs             0.4               h14c3975_1001    conda-forge
     py-xgboost                1.0.0.SNAPSHOT   cuda10.0py36_1    rapidsai
     pyarrow                   0.15.0           py36h8b68381_1    conda-forge
     pyasn1                    0.4.8                      py_0    conda-forge
     pyasn1-modules            0.2.7                      py_0    conda-forge
     pycosat                   0.6.3            py36h0a5515d_0
     pycparser                 2.19                     py36_1    conda-forge
     pyhive                    0.6.1                    py36_0
     pyjwt                     1.7.1                      py_0    conda-forge
     pynvml                    8.0.3                      py_0    conda-forge
     pyopenssl                 19.1.0                   py36_0    conda-forge
     pyparsing                 2.4.5                      py_0    conda-forge
     pysocks                   1.7.1                    py36_0    conda-forge
     python                    3.6.7             h357f687_1006    conda-forge
     python-dateutil           2.8.1                      py_0    conda-forge
     pytz                      2019.3                     py_0    conda-forge
     pyyaml                    5.2              py36h516909a_0    conda-forge
     re2                       2019.12.01           he1b5a44_0    conda-forge
     readline                  8.0                  hf8c457e_0    conda-forge
     requests                  2.22.0                   py36_1    conda-forge
     requests-oauthlib         1.2.0                      py_0    conda-forge
     rmm                       0.11.0b191212           py36_80    rapidsai-nightly
     rsa                       4.0                        py_0    conda-forge
     ruamel_yaml               0.15.37          py36h14c3975_2
     sasl                      0.2.1           py36he1b5a44_1001    conda-forge
     scikit-learn              0.22             py36hcdab131_1    conda-forge
     scipy                     1.4.0            py36h921218d_0    conda-forge
     setuptools                42.0.2                   py36_0    conda-forge
     six                       1.13.0                   py36_0    conda-forge
     snappy                    1.1.7             he1b5a44_1002    conda-forge
     sortedcontainers          2.1.0                      py_0    conda-forge
     sqlalchemy                1.3.12           py36h516909a_0    conda-forge
     sqlite                    3.30.1               hcee41ef_0    conda-forge
     tblib                     1.6.0                      py_0    conda-forge
     thrift                    0.11.0          py36he1b5a44_1001    conda-forge
     thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
     thrift_sasl               0.3.0           py36h516909a_1001    conda-forge
     tk                        8.6.10               hed695b0_0    conda-forge
     toolz                     0.10.0                     py_0    conda-forge
     tornado                   6.0.3            py36h516909a_0    conda-forge
     tzcode                    2019a             h516909a_1002    conda-forge
     uriparser                 0.9.3                he1b5a44_1    conda-forge
     urllib3                   1.25.7                   py36_0    conda-forge
     wheel                     0.33.6                   py36_0    conda-forge
     xerces-c                  3.2.2             h8412b87_1004    conda-forge
     xgboost                   1.0.0.SNAPSHOT   cuda10.0py36_1    rapidsai
     xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
     xorg-libice               1.0.10               h516909a_0    conda-forge
     xorg-libsm                1.2.3             h84519dc_1000    conda-forge
     xorg-libx11               1.6.9                h516909a_0    conda-forge
     xorg-libxau               1.0.9                h14c3975_0    conda-forge
     xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
     xorg-libxext              1.3.4                h516909a_0    conda-forge
     xorg-libxrender           0.9.10            h516909a_1002    conda-forge
     xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
     xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
     xorg-xproto               7.0.31            h14c3975_1007    conda-forge
     xz                        5.2.4             h14c3975_1001    conda-forge
     yaml                      0.2.2                h516909a_1    conda-forge
     zeromq                    4.3.2                he1b5a44_2    conda-forge
     zict                      1.0.0                      py_0    conda-forge
     zlib                      1.2.11            h516909a_1006    conda-forge
     zstd                      1.4.3                h3b9ef0a_0    conda-forge
     
</pre></details>

Additional context

BlazingSQL Conda install output:

  • Available on GitHub (gist) here as a text file.

Full error output:

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-26-0b19b5b41f48> in <module>()
----> 1 from blazingsql import BlazingContext

2 frames

/usr/local/lib/python3.6/site-packages/blazingsql/__init__.py in <module>()
      1 from pyblazing.apiv2 import S3EncryptionType
      2 from pyblazing.apiv2 import DataType
----> 3 from pyblazing.apiv2.context import BlazingContext

/usr/local/lib/python3.6/site-packages/pyblazing/apiv2/context.py in <module>()
     41     os.path.join(
     42         os.getenv("CONDA_PREFIX"),
---> 43         'lib/blazingsql-algebra.jar'))
     44 jpype.addClassPath(
     45     os.path.join(

/usr/lib/python3.6/posixpath.py in join(a, *p)
     78     will be discarded.  An empty last part will result in a path that
     79     ends with a separator."""
---> 80     a = os.fspath(a)
     81     sep = _get_sep(a)
     82     path = a

TypeError: expected str, bytes or os.PathLike object, not NoneType
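
The traceback points at os.path.join receiving None from os.getenv("CONDA_PREFIX"), and the print_env.sh output above shows CONDA_PREFIX empty in this Colab session. The sketch below reproduces that failure with the standard library only. The helper name algebra_jar_path is hypothetical, and the "/usr/local" value is an assumption taken from the "packages in environment at /usr/local" line in the conda listing; whether setting it is enough for the import to succeed has not been verified here.

```python
import os


def algebra_jar_path(env=os.environ):
    """Mirror the failing expression in context.py (hypothetical helper):
    join CONDA_PREFIX with the relative path of the algebra jar."""
    return os.path.join(env.get("CONDA_PREFIX"), "lib/blazingsql-algebra.jar")


# Reproduce the error: with CONDA_PREFIX unset, os.path.join is handed None.
try:
    algebra_jar_path(env={})
    failed = False
    message = ""
except TypeError as exc:
    failed = True
    message = str(exc)

# Possible workaround (assumption): point CONDA_PREFIX at the conda root
# before importing blazingsql, e.g. os.environ["CONDA_PREFIX"] = "/usr/local".
jar = algebra_jar_path(env={"CONDA_PREFIX": "/usr/local"})

print(failed, message)
print(jar)
```

In a notebook, the equivalent step would be to set os.environ["CONDA_PREFIX"] in a cell that runs before "from blazingsql import BlazingContext".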

[BUG] Resulting Column Name as Alias Not Applying

Describe the bug

Column aliases provided in a query are not applied to the result columns, which are instead given generic names: $f0, $f1, ..., $fn.

Context
After creating a table ("taxi"), I'm trying to:

  1. extract the hour, month, and year from each row of a datetime column (key), each as a new column titled hours, months, and years (respectively)
  2. compute the difference between the dropoff and pickup longitude columns as a new column, longitude_distance
  3. compute the difference between the dropoff and pickup latitude columns as a new column, latitude_distance

However, the new time columns (hours, months, years) are output as $f0, $f1, and $f2, and the distance columns (longitude_distance, latitude_distance) are output as $f3 and $f4.

Here's the query and execution:

# define the query
query = '''
        SELECT hour(key) as hours, month(key) as months, year(key) - 2000 as years,  
        dropoff_longitude - pickup_longitude as longitude_distance, 
        dropoff_latitude - pickup_latitude as latitude_distance, 
        passenger_count FROM main.taxi
        '''

# run query on table
X_train = bc.sql(query).get()

# extract dataframe
X_train_gdf = X_train.columns

# how's that look?
X_train_gdf.head()

Here's the current output:
[image: query output with the aliased columns titled $f0 through $f4]

Steps/Code to reproduce bug

Here's the notebook in Colab: https://colab.research.google.com/drive/1gEX0CrTMNLu5Y4V4JbLw5HAQm6UHgtAr

The notebook displays the DataFrame with the incorrect column names, and the issue can be reproduced by downloading and running the notebook locally. The notebook currently works around it with a manual correction.
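
The notebook's actual correction is not shown here, so the following is only a sketch of what a manual rename could look like. It assumes the generic names follow the order of the aliased expressions in the SELECT list and that passenger_count keeps its own name; generic_to_alias and fix_columns are hypothetical helpers, not part of BlazingSQL.

```python
# Aliases in the order they appear in the SELECT list of the query above.
aliases = ["hours", "months", "years", "longitude_distance", "latitude_distance"]


def generic_to_alias(names):
    """Map engine-generated names ($f0, $f1, ...) to the intended aliases."""
    return {f"$f{i}": name for i, name in enumerate(names)}


def fix_columns(columns, mapping):
    """Return the column names with generic names replaced; others are kept."""
    return [mapping.get(col, col) for col in columns]


rename_map = generic_to_alias(aliases)

# Column names as described in the report (passenger_count is an assumption).
observed = ["$f0", "$f1", "$f2", "$f3", "$f4", "passenger_count"]
fixed = fix_columns(observed, rename_map)

print(rename_map)
print(fixed)
```

With a cuDF DataFrame in hand, the same mapping could be passed to its rename method via the columns argument, leaving the query itself untouched until the alias handling is fixed.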

Expected behavior

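[image: query output with the columns titled hours, months, years, longitude_distance, and latitude_distance]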

Environment overview

  • Environment location: local and Google Cloud (same issue in both)
  • Method of install: conda

Environment details

<details><summary>Click here to see environment details</summary><pre>
     
     **git***
     commit 7a45cd317eb341cfa8693f5377ed1c7052a6eaee (HEAD -> feature_taxi, origin/feature_taxi)
     Author: Winston <[email protected]>
     Date:   Mon Oct 28 03:45:16 2019 -0700
     
     formatted and runs e2e with temp fixes (issue: column names)
     **git submodules***
     
     ***OS Information***
     DISTRIB_ID=Ubuntu
     DISTRIB_RELEASE=16.04
     DISTRIB_CODENAME=xenial
     DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"
     NAME="Ubuntu"
     VERSION="16.04.6 LTS (Xenial Xerus)"
     ID=ubuntu
     ID_LIKE=debian
     PRETTY_NAME="Ubuntu 16.04.6 LTS"
     VERSION_ID="16.04"
     HOME_URL="http://www.ubuntu.com/"
     SUPPORT_URL="http://help.ubuntu.com/"
     BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
     VERSION_CODENAME=xenial
     UBUNTU_CODENAME=xenial
     Linux winston-gpu-rig 4.15.0-1047-gcp #50-Ubuntu SMP Wed Oct 2 00:50:34 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
     
     ***GPU Information***
     Sun Nov  3 06:40:18 2019
     +-----------------------------------------------------------------------------+
     | NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
     |-------------------------------+----------------------+----------------------+
     | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
     | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
     |===============================+======================+======================|
     |   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
     | N/A   42C    P8    10W /  70W |     98MiB / 15079MiB |      0%      Default |
     +-------------------------------+----------------------+----------------------+
     
     +-----------------------------------------------------------------------------+
     | Processes:                                                       GPU Memory |
     |  GPU       PID   Type   Process name                             Usage      |
     |=============================================================================|
     |    0      2014      G   /usr/lib/xorg/Xorg                            98MiB |
     +-----------------------------------------------------------------------------+
     
     ***CPU***
     Architecture:          x86_64
     CPU op-mode(s):        32-bit, 64-bit
     Byte Order:            Little Endian
     CPU(s):                4
     On-line CPU(s) list:   0-3
     Thread(s) per core:    2
     Core(s) per socket:    2
     Socket(s):             1
     NUMA node(s):          1
     Vendor ID:             GenuineIntel
     CPU family:            6
     Model:                 63
     Model name:            Intel(R) Xeon(R) CPU @ 2.30GHz
     Stepping:              0
     CPU MHz:               2300.000
     BogoMIPS:              4600.00
     Hypervisor vendor:     KVM
     Virtualization type:   full
     L1d cache:             32K
     L1i cache:             32K
     L2 cache:              256K
     L3 cache:              46080K
     NUMA node0 CPU(s):     0-3
     Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat md_clear arch_capabilities
     
     ***CMake***
     /home/winston/miniconda3/envs/bzsqlenv/bin/cmake
     cmake version 3.15.5
     
     CMake suite maintained and supported by Kitware (kitware.com/cmake).
     
     ***g++***
     /usr/bin/g++
     g++ (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
     Copyright (C) 2015 Free Software Foundation, Inc.
     This is free software; see the source for copying conditions.  There is NO
     warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     
     
     ***nvcc***
     
     ***Python***
     /home/winston/miniconda3/envs/bzsqlenv/bin/python
     Python 3.7.3
     
     ***Environment Variables***
     PATH                            : /home/winston/bin:/home/winston/.local/bin:/home/winston/miniconda3/envs/bzsqlenv/bin:/home/winston/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
     LD_LIBRARY_PATH                 :
     NUMBAPRO_NVVM                   :
     NUMBAPRO_LIBDEVICE              :
     CONDA_PREFIX                    : /home/winston/miniconda3/envs/bzsqlenv
     PYTHON_PATH                     :
     
     ***conda packages***
     /home/winston/miniconda3/condabin/conda
     # packages in environment at /home/winston/miniconda3/envs/bzsqlenv:
     #
     # Name                    Version                   Build  Channel
     _libgcc_mutex             0.1                        main
     alsa-lib                  1.1.5             h516909a_1001    conda-forge
     arrow-cpp                 0.14.1           py37h5ac5442_4    conda-forge
     attrs                     19.3.0                   pypi_0    pypi
     backcall                  0.1.0                    pypi_0    pypi
     blazingsql-calcite        0.4.5                         0    blazingsql
     blazingsql-communication  0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
     blazingsql-io             0.4.4                         0    blazingsql
     blazingsql-orchestrator   0.4.5                         0    blazingsql
     blazingsql-protocol       0.4.5                    py37_0    blazingsql
     blazingsql-python         0.4.5           cuda10.0_py37_0    blazingsql/label/cuda10.0
     blazingsql-ral            0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
     blazingsql-toolchain      0.4.5                         3    blazingsql
     bleach                    3.1.0                    pypi_0    pypi
     bokeh                     1.3.4                    py37_0    conda-forge
     boost                     1.70.0           py37h9de70de_1    conda-forge
     boost-cpp                 1.70.0               h8e57a91_2    conda-forge
     brotli                    1.0.7             he1b5a44_1000    conda-forge
     bzip2                     1.0.8                h516909a_1    conda-forge
     c-ares                    1.15.0            h516909a_1001    conda-forge
     ca-certificates           2019.9.11            hecc5488_0    conda-forge
     certifi                   2019.9.11                py37_0    conda-forge
     click                     7.0                        py_0    conda-forge
     cloudpickle               1.2.2                      py_0    conda-forge
     cmake                     3.15.5               hf94ab9c_0    conda-forge
     cppzmq                    4.4.1                hc9558a2_0    conda-forge
     cudatoolkit               10.0.130                      0
     cudf                      0.10.0                   py37_0    rapidsai
     curl                      7.65.3               hf8cf82a_0    conda-forge
     cython                    0.29.13          py37he1b5a44_0    conda-forge
     cytoolz                   0.10.0           py37h516909a_0    conda-forge
     dask                      2.6.0                      py_0    conda-forge
     dask-core                 2.6.0                      py_0    conda-forge
     dask-cudf                 0.10.0                   py37_0    rapidsai
     decorator                 4.4.1                    pypi_0    pypi
     defusedxml                0.6.0                    pypi_0    pypi
     distributed               2.6.0                      py_0    conda-forge
     dlpack                    0.2                  he1b5a44_1    conda-forge
     double-conversion         3.1.5                he1b5a44_2    conda-forge
     entrypoints               0.3                      pypi_0    pypi
     expat                     2.2.5             he1b5a44_1004    conda-forge
     fastavro                  0.22.5           py37h516909a_0    conda-forge
     flatbuffers               1.11                     pypi_0    pypi
     fontconfig                2.13.1            h86ecdb6_1001    conda-forge
     freetype                  2.10.0               he983fc9_1    conda-forge
     fsspec                    0.5.2                      py_0    conda-forge
     gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
     gflags                    2.2.2             he1b5a44_1002    conda-forge
     giflib                    5.1.7                h516909a_1    conda-forge
     glog                      0.4.0                he1b5a44_1    conda-forge
     gmock                     1.10.0                        0    conda-forge
     grpc-cpp                  1.23.0               h18db393_0    conda-forge
     gtest                     1.10.0               hc9558a2_0    conda-forge
     heapdict                  1.0.1                      py_0    conda-forge
     icu                       64.2                 he1b5a44_1    conda-forge
     importlib-metadata        0.23                     pypi_0    pypi
     ipykernel                 5.1.3                    pypi_0    pypi
     ipython                   7.9.0                    pypi_0    pypi
     ipython-genutils          0.2.0                    pypi_0    pypi
     jedi                      0.15.1                   pypi_0    pypi
     jinja2                    2.10.3                     py_0    conda-forge
     jpeg                      9c                h14c3975_1001    conda-forge
     json5                     0.8.5                    pypi_0    pypi
     jsonschema                3.1.1                    pypi_0    pypi
     jupyter-client            5.3.4                    pypi_0    pypi
     jupyter-core              4.6.1                    pypi_0    pypi
     jupyterlab                0.34.0                   pypi_0    pypi
     jupyterlab-launcher       0.13.1                   pypi_0    pypi
     jupyterlab-server         1.0.6                    pypi_0    pypi
     krb5                      1.16.3            h05b26f9_1001    conda-forge
     lcms2                     2.9                  h2e4bb80_0    conda-forge
     libblas                   3.8.0               14_openblas    conda-forge
     libcblas                  3.8.0               14_openblas    conda-forge
     libcudf                   0.10.0               cuda10.0_0    rapidsai
     libcurl                   7.65.3               hda55be3_0    conda-forge
     libedit                   3.1.20181209         hc058e9b_0
     libevent                  2.1.10               h72c5cf5_0    conda-forge
     libffi                    3.2.1                hd88cf55_4
     libgcc-ng                 9.1.0                hdf63c60_0
     libgcrypt                 1.8.4             hf484d3e_1000    conda-forge
     libgfortran-ng            7.3.0                hdf63c60_2    conda-forge
     libgpg-error              1.36                 he1b5a44_0    conda-forge
     libgsasl                  1.8.0             h19a2143_1004    conda-forge
     libhdfs3                  2.3               h311b756_1006    conda-forge
     libiconv                  1.15              h516909a_1005    conda-forge
     liblapack                 3.8.0               14_openblas    conda-forge
     libllvm8                  8.0.1                hc9558a2_0    conda-forge
     libntlm                   1.4               h14c3975_1002    conda-forge
     libnvstrings              0.10.0               cuda10.0_0    rapidsai
     libopenblas               0.3.7                h6e990d7_2    conda-forge
     libpng                    1.6.37               hed695b0_0    conda-forge
     libprotobuf               3.8.0                h8b12597_0    conda-forge
     librmm                    0.10.0               cuda10.0_0    rapidsai
     libsodium                 1.0.17               h516909a_0    conda-forge
     libssh2                   1.8.2                h22169c7_2    conda-forge
     libstdcxx-ng              9.1.0                hdf63c60_0
     libtiff                   4.0.10            h57b8799_1003    conda-forge
     libuuid                   2.32.1            h14c3975_1000    conda-forge
     libuv                     1.33.1               h516909a_0    conda-forge
     libxcb                    1.13              h14c3975_1002    conda-forge
     libxml2                   2.9.10               hee79883_0    conda-forge
     llvmlite                  0.30.0           py37h8b12597_0    conda-forge
     locket                    0.2.0                      py_2    conda-forge
     lz4-c                     1.8.3             he1b5a44_1001    conda-forge
     markupsafe                1.1.1            py37h14c3975_0    conda-forge
     maven                     3.6.0                         0    conda-forge
     mistune                   0.8.4                    pypi_0    pypi
     more-itertools            7.2.0                    pypi_0    pypi
     msgpack-python            0.6.2            py37hc9558a2_0    conda-forge
     nbconvert                 5.6.1                    pypi_0    pypi
     nbformat                  4.4.0                    pypi_0    pypi
     ncurses                   6.1                  he6710b0_1
     notebook                  6.0.1                    pypi_0    pypi
     numba                     0.46.0           py37hb3f55d8_1    conda-forge
     numpy                     1.17.3           py37h95a1406_0    conda-forge
     nvstrings                 0.10.0                   py37_0    rapidsai
     olefile                   0.46                       py_0    conda-forge
     openjdk                   11.0.1            h46a85a0_1017    conda-forge
     openssl                   1.1.1c               h516909a_0    conda-forge
     packaging                 19.2                       py_0    conda-forge
     pandas                    0.24.2           py37hb3f55d8_0    conda-forge
     pandocfilters             1.4.2                    pypi_0    pypi
     parquet-cpp               1.5.1                         2    conda-forge
     parso                     0.5.1                    pypi_0    pypi
     partd                     1.0.0                      py_0    conda-forge
     pexpect                   4.7.0                    pypi_0    pypi
     pickleshare               0.7.5                    pypi_0    pypi
     pillow                    6.2.1            py37h6b7be26_0    conda-forge
     pip                       19.3.1                   py37_0
     prometheus-client         0.7.1                    pypi_0    pypi
     prompt-toolkit            2.0.10                   pypi_0    pypi
     psutil                    5.6.3            py37h516909a_0    conda-forge
     pthread-stubs             0.4               h14c3975_1001    conda-forge
     ptyprocess                0.6.0                    pypi_0    pypi
     pyarrow                   0.14.1           py37h8b68381_2    conda-forge
     pygments                  2.4.2                    pypi_0    pypi
     pyparsing                 2.4.2                      py_0    conda-forge
     pyrsistent                0.15.5                   pypi_0    pypi
     python                    3.7.3                h5b0a415_0    conda-forge
     python-dateutil           2.8.0                      py_0    conda-forge
     pytz                      2019.3                     py_0    conda-forge
     pyyaml                    5.1.2            py37h516909a_0    conda-forge
     pyzmq                     18.1.0                   pypi_0    pypi
     rapidjson                 1.1.0             he1b5a44_1002    conda-forge
     re2                       2019.09.01           he1b5a44_0    conda-forge
     readline                  7.0                  h7b6447c_5
     rhash                     1.3.6             h14c3975_1001    conda-forge
     rmm                       0.10.0                   py37_0    rapidsai
     send2trash                1.5.0                    pypi_0    pypi
     setuptools                41.6.0                   py37_0
     six                       1.12.0                py37_1000    conda-forge
     snappy                    1.1.7             he1b5a44_1002    conda-forge
     sortedcontainers          2.1.0                      py_0    conda-forge
     sqlite                    3.30.1               h7b6447c_0
     tblib                     1.4.0                      py_0    conda-forge
     terminado                 0.8.2                    pypi_0    pypi
     testpath                  0.4.2                    pypi_0    pypi
     thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
     tk                        8.6.9             hed695b0_1003    conda-forge
     toolz                     0.10.0                     py_0    conda-forge
     tornado                   6.0.3            py37h516909a_0    conda-forge
     traitlets                 4.3.3                    pypi_0    pypi
     uriparser                 0.9.3                he1b5a44_1    conda-forge
     wcwidth                   0.1.7                    pypi_0    pypi
     webencodings              0.5.1                    pypi_0    pypi
     wheel                     0.33.6                   py37_0
     xorg-fixesproto           5.0               h14c3975_1002    conda-forge
     xorg-inputproto           2.3.2             h14c3975_1002    conda-forge
     xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
     xorg-libx11               1.6.9                h516909a_0    conda-forge
     xorg-libxau               1.0.9                h14c3975_0    conda-forge
     xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
     xorg-libxext              1.3.4                h516909a_0    conda-forge
     xorg-libxfixes            5.0.3             h516909a_1004    conda-forge
     xorg-libxi                1.7.10               h516909a_0    conda-forge
     xorg-libxrender           0.9.10            h516909a_1002    conda-forge
     xorg-libxtst              1.2.3             h14c3975_1002    conda-forge
     xorg-recordproto          1.14.2            h516909a_1002    conda-forge
     xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
     xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
     xorg-xproto               7.0.31            h14c3975_1007    conda-forge
     xz                        5.2.4                h14c3975_4
     yaml                      0.1.7             h14c3975_1001    conda-forge
     zeromq                    4.3.2                he1b5a44_2    conda-forge
     zict                      1.0.0                      py_0    conda-forge
     zipp                      0.6.0                    pypi_0    pypi
     zlib                      1.2.11               h7b6447c_3
     zstd                      1.4.0                h3b9ef0a_0    conda-forge
     
</pre></details>

[BUG] HDFS Register Failure

Describe the bug
Registering HDFS with the BlazingContext fails: the bc.hdfs() registration step cannot connect to HDFS. Tested on both the stable and nightly versions of BlazingSQL.
RAL.log file shows the error trace: "|TRACE|deregisterFileSystem: filesystem authority not found".

Steps/Code to reproduce bug
from blazingsql import BlazingContext
import cudf
bc = BlazingContext()
bc.hdfs('test_dir', host='', port=<port_number>, user='', kerberos_ticket='/path/to/keytablefile.keytab')

Expected behavior
bc.hdfs() should register HDFS successfully without any failures.

Environment overview (please complete the following information)

  • Environment location: Docker (with a conda install of the BlazingSQL packages)
  • Method of cuDF install: conda
    • If method of install is [Docker], provide docker pull & docker run commands used

Environment details
Please run and paste the output of the print_env.sh script here, to gather any other relevant environment details

Additional context
Add any other context about the problem here.

Migration of IO module with cudf-0.12

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

----For BlazingSQL Developers----
How and where should this be implemented?
In what part of the code should the feature be implemented? What should the APIs and/or classes look like?

Other design considerations
What components of the engine could be affected by this? What functions should we make sure we use/reuse?

Testing considerations?
What sort of unit tests and/or end-to-end tests should be implemented to test this?

Is multi-machine, multi-GPU operation supported?

Our data center has 20 servers, each with 4 graphics cards.
Can we build a single cluster that makes full use of all of these resources (80 NVIDIA 2080 Ti GPUs) for maximum parallelism?
Thanks!

[BUG] Distributed Query client = Client('127.0.0.1:8786') error

With blazingsql 0.4.4, I want to run a distributed SQL query across several GPUs. According to the documentation, the first steps are:

from blazingsql import BlazingContext
import cudf
import dask_cudf
import dask
from dask.distributed import Client
client = Client('127.0.0.1:8786')

However, the call client = Client('127.0.0.1:8786') raises the following error:
OSError: Timed out trying to connect to 'tcp://127.0.0.1:8786' after 10 s: in <distributed.comm.tcp.TCPConnector object at 0x7f97eb566e80>: ConnectionRefusedError: [Errno 111] Connection refused
I would like to know if there is something else I need to do or if the error is because of something not working in my environment.
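
A ConnectionRefusedError generally means that no process is listening at the given address. Client('127.0.0.1:8786') connects to a Dask scheduler that is already running; it does not start one. The sketch below is one way to check this before creating the client. It assumes the scheduler is expected at 127.0.0.1:8786, as in the report, and the helper name scheduler_reachable is hypothetical (it is not part of dask or blazingsql).

```python
import socket


def scheduler_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if some process accepts TCP connections at host:port."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the report; adjust it for your own setup.
    host, port = "127.0.0.1", 8786
    if scheduler_reachable(host, port):
        print(f"Something is listening on {host}:{port}; Client() can attempt to connect.")
    else:
        print(f"Nothing is listening on {host}:{port}; start a scheduler first.")
```

If the check reports that nothing is listening, a scheduler (for example one started with the dask-scheduler command, which listens on port 8786 by default) probably needs to be started before the client is created.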

[BUG] [Build Failure] Error in the Build

Building from source fails with the following output:
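
The fatal errors in the log below all report headers that cannot be found, which suggests those headers are not on the compiler's include path. Because make -j interleaves the output, a log like this is easier to read once the missing headers are tallied. The following is a minimal sketch, assuming GCC's "fatal error: <header>: No such file or directory" message format; the function name missing_headers is hypothetical, and the sample lines are copied from the log.

```python
import re
from collections import Counter

# Matches GCC's message for an #include that cannot be resolved.
MISSING_HEADER = re.compile(r"fatal error: (\S+): No such file or directory")


def missing_headers(log_text: str) -> Counter:
    """Count how often each header is reported missing in a build log."""
    return Counter(MISSING_HEADER.findall(log_text))


# Three lines copied from the build log in this report.
sample = """\
/conda/envs/bsql/blazingsql/engine/src/CodeTimer.h:11:39: fatal error: blazingdb/manager/Context.h: No such file or directory
/conda/envs/bsql/blazingsql/engine/src/communication/network/Server.h:3:41: fatal error: blazingdb/transport/Message.h: No such file or directory
/conda/envs/bsql/blazingsql/engine/src/operators/OrderBy.h:5:39: fatal error: blazingdb/manager/Context.h: No such file or directory
"""

# Print the distinct missing headers, most frequent first.
for header, count in missing_headers(sample).most_common():
    print(f"{count:3d}  {header}")
```

Passing the full log text to the function instead of the sample gives the complete list of headers the compiler could not find.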

-- The following OPTIONAL packages have been found:

 * PythonLibs

-- The following REQUIRED packages have been found:

 * aws-cpp-sdk-core, <https://aws.amazon.com/sdk-for-cpp/>
   AWS SDK for C++ allows to integrate any C++ application with AWS services. Module: aws-cpp-sdk-core
 * aws-cpp-sdk-s3, <https://aws.amazon.com/sdk-for-cpp/>
   AWS SDK for C++ allows to integrate any C++ application with AWS services. Module: aws-cpp-sdk-s3
 * aws-cpp-sdk-kms, <https://aws.amazon.com/sdk-for-cpp/>
   AWS SDK for C++ allows to integrate any C++ application with AWS services. Module: aws-cpp-sdk-kms
 * aws-cpp-sdk-s3-encryption, <https://aws.amazon.com/sdk-for-cpp/>
   AWS SDK for C++ allows to integrate any C++ application with AWS services. Module: aws-cpp-sdk-s3-encryption
 * storage_client, <https://github.com/googleapis/google-cloud-cpp>
   Google Cloud Client Library for C++ 
 * Threads
 * GTest

-- The following OPTIONAL packages have not been found:

 * PkgConfig

-- Configuring done
-- Generating done
-- Build files have been written to: /conda/envs/bsql/blazingsql/engine/build
make -j all && ctest
Scanning dependencies of target blazingsql-engine
[  0%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/config/BlazingConfig.cpp.o
[  2%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/config/GPUManager.cu.o
[  2%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/exception/RalException.cpp.o
[  2%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/Schema.cpp.o
[  3%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_parser/ParquetParser.cpp.o
[  3%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/operators/OrderBy.cpp.o
[  5%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/operators/JoinOperator.cpp.o
[  5%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/operators/GroupBy.cpp.o
[  6%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_provider/UriDataProvider.cpp.o
[  7%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/Traits/RuntimeTraits.cpp.o
[  8%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_parser/CSVParser.cpp.o
[  9%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_parser/JSONParser.cpp.o
[ 10%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/utilities/RalColumn.cpp.o
[ 10%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_parser/GDFParser.cpp.o
[ 13%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/utilities/StringUtils.cpp.o
[ 13%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/Config/Config.cpp.o
[ 13%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_parser/OrcParser.cpp.o
[ 14%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_parser/ArgsUtil.cpp.o
[ 14%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/data_parser/ParserUtil.cpp.o
[ 15%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/io/DataLoader.cpp.o
[ 15%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/utilities/CommonOperations.cpp.o
[ 16%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/Interpreter/interpreter_cpp.cu.o
[ 18%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/CalciteInterpreter.cpp.o
[ 18%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/CalciteExpressionParsing.cpp.o
[ 18%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/utilities/TableWrapper.cpp.o
[ 18%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/ColumnManipulation.cu.o
[ 19%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/ResultSetRepository.cpp.o
[ 22%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/GDFColumn.cu.o
[ 22%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/JoinProcessor.cpp.o
[ 22%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/GDFCounter.cu.o
[ 22%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/LogicalFilter.cpp.o
[ 24%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/cython/initialize.cpp.o
[ 24%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/Utils.cu.o
[ 24%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/QueryState.cpp.o
[ 25%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/cython/io.cpp.o
[ 26%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/CodeTimer.cpp.o
[ 27%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/parser/expression_utils.cpp.o
[ 27%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/cython/errors.cpp.o
[ 28%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/cython/engine.cpp.o
[ 29%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/cuDF/Allocator.cpp.o
[ 29%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/cuDF/generator/sample_generator.cu.o
[ 30%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/communication/factory/MessageFactory.cpp.o
[ 32%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/communication/network/Client.cpp.o
[ 32%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/communication/CommunicationData.cpp.o
[ 32%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/communication/network/Server.cpp.o
[ 34%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/distribution/NodeColumns.cpp.o
[ 34%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/distribution/NodeSamples.cpp.o
[ 35%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/distribution/Exception.cpp.o
In file included from /conda/envs/bsql/blazingsql/engine/src/CodeTimer.cpp:8:0:
/conda/envs/bsql/blazingsql/engine/src/CodeTimer.h:11:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:452: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/CodeTimer.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/CodeTimer.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
[ 36%] Building CUDA object CMakeFiles/blazingsql-engine.dir/src/distribution/primitives_util.cu.o
[ 36%] Building CXX object CMakeFiles/blazingsql-engine.dir/src/distribution/primitives.cpp.o
In file included from /conda/envs/bsql/blazingsql/engine/src/communication/network/Client.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/communication/network/Client.h:3:47: fatal error: blazingdb/manager/NodeDataMessage.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:608: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/communication/network/Client.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/communication/network/Client.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/communication/network/Server.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/communication/network/Server.h:3:41: fatal error: blazingdb/transport/Message.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:621: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/communication/network/Server.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/communication/network/Server.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/communication/CommunicationData.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/communication/CommunicationData.h:4:38: fatal error: blazingdb/transport/Node.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:634: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/communication/CommunicationData.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/communication/CommunicationData.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/operators/OrderBy.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/operators/OrderBy.h:5:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
In file included from /conda/envs/bsql/blazingsql/engine/src/operators/GroupBy.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/operators/GroupBy.h:5:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:101: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/operators/OrderBy.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/operators/OrderBy.cpp.o] Error 1
CMakeFiles/blazingsql-engine.dir/build.make:127: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/operators/GroupBy.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/operators/GroupBy.cpp.o] Error 1
/conda/envs/bsql/blazingsql/engine/src/cython/initialize.cpp:19:50: fatal error: blazingdb/transport/io/reader_writer.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:517: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/cython/initialize.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/cython/initialize.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/distribution/NodeSamples.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/distribution/NodeSamples.h:5:38: fatal error: blazingdb/transport/Node.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:673: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/distribution/NodeSamples.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/distribution/NodeSamples.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/distribution/primitives.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/distribution/primitives.h:6:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:686: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/distribution/primitives.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/distribution/primitives.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/operators/JoinOperator.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/operators/JoinOperator.h:4:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:114: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/operators/JoinOperator.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/operators/JoinOperator.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/communication/messages/ComponentMessages.h:3:0,
                 from /conda/envs/bsql/blazingsql/engine/src/communication/factory/MessageFactory.h:3,
                 from /conda/envs/bsql/blazingsql/engine/src/communication/factory/MessageFactory.cpp:1:
/conda/envs/bsql/blazingsql/engine/src/communication/messages/GPUComponentMessage.h:4:41: fatal error: blazingdb/transport/Address.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:595: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/communication/factory/MessageFactory.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/communication/factory/MessageFactory.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/distribution/NodeColumns.cpp:1:0:
/conda/envs/bsql/blazingsql/engine/src/distribution/NodeColumns.h:5:38: fatal error: blazingdb/transport/Node.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:660: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/distribution/NodeColumns.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/distribution/NodeColumns.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/io/DataLoader.cpp:2:0:
/conda/envs/bsql/blazingsql/engine/src/io/DataLoader.h:16:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/io/DataLoader.cpp.o] Error 1
CMakeFiles/blazingsql-engine.dir/build.make:348: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/io/DataLoader.cpp.o' failed
In file included from /conda/envs/bsql/blazingsql/engine/src/CalciteInterpreter.h:9:0,
                 from /conda/envs/bsql/blazingsql/engine/src/CalciteInterpreter.cpp:1:
/conda/envs/bsql/blazingsql/engine/src/io/DataLoader.h:16:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
In file included from /conda/envs/bsql/blazingsql/engine/src/LogicalFilter.cpp:19:0:
/conda/envs/bsql/blazingsql/engine/src/CodeTimer.h:11:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
In file included from /conda/envs/bsql/blazingsql/engine/src/cython/io.cpp:2:0:
/conda/envs/bsql/blazingsql/engine/src/cython/../io/DataLoader.h:16:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
CMakeFiles/blazingsql-engine.dir/build.make:426: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/LogicalFilter.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/LogicalFilter.cpp.o] Error 1
CMakeFiles/blazingsql-engine.dir/build.make:374: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/CalciteInterpreter.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/CalciteInterpreter.cpp.o] Error 1
CMakeFiles/blazingsql-engine.dir/build.make:530: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/cython/io.cpp.o' failed
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/cython/io.cpp.o] Error 1
In file included from /conda/envs/bsql/blazingsql/engine/src/cython/../CalciteInterpreter.h:9:0,
                 from /conda/envs/bsql/blazingsql/engine/src/cython/engine.cpp:2:
/conda/envs/bsql/blazingsql/engine/src/cython/../io/DataLoader.h:16:39: fatal error: blazingdb/manager/Context.h: No such file or directory
compilation terminated.
make[2]: *** [CMakeFiles/blazingsql-engine.dir/src/cython/engine.cpp.o] Error 1
CMakeFiles/blazingsql-engine.dir/build.make:556: recipe for target 'CMakeFiles/blazingsql-engine.dir/src/cython/engine.cpp.o' failed
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h: In function ‘const char* _cudaGetErrorEnum(cudaError_t)’:
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorSystemNotReady’ not handled in switch [-Wswitch]
  switch(error) {
        ^
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorIllegalState’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorStreamCaptureUnsupported’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorStreamCaptureInvalidated’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorStreamCaptureMerge’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorStreamCaptureUnmatched’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorStreamCaptureUnjoined’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorStreamCaptureIsolation’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorStreamCaptureImplicit’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:38:8: warning: enumeration value ‘cudaErrorCapturedEvent’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h: In function ‘const char* _cudaGetErrorEnum(CUresult)’:
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_ILLEGAL_STATE’ not handled in switch [-Wswitch]
  switch(error) {
        ^
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_SYSTEM_NOT_READY’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_STREAM_CAPTURE_UNSUPPORTED’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_STREAM_CAPTURE_INVALIDATED’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_STREAM_CAPTURE_MERGE’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_STREAM_CAPTURE_UNMATCHED’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_STREAM_CAPTURE_UNJOINED’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_STREAM_CAPTURE_ISOLATION’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_STREAM_CAPTURE_IMPLICIT’ not handled in switch [-Wswitch]
/conda/envs/bsql/blazingsql/engine/src/Interpreter/helper_cuda.h:220:8: warning: enumeration value ‘CUDA_ERROR_CAPTURED_EVENT’ not handled in switch [-Wswitch]
CMakeFiles/Makefile2:756: recipe for target 'CMakeFiles/blazingsql-engine.dir/all' failed
make[1]: *** [CMakeFiles/blazingsql-engine.dir/all] Error 2
Makefile:140: recipe for target 'all' failed
make: *** [all] Error 2
cp libblazingsql-engine.so /conda/envs/bsql/lib/libblazingsql-engine.so
cp: cannot stat 'libblazingsql-engine.so': No such file or directory
The command '/bin/bash -c source activate bsql && cd $CONDA_PREFIX && git clone https://github.com/BlazingDB/blazingsql.git && cd blazingsql && export CUDACXX=/usr/local/cuda/bin/nvcc && conda/recipes/blazingsql/build.sh' returned a non-zero code: 1

Json/XML SQL Functions

**Flattening JSON**
The documentation says that all JSON functions are supported via cuIO, but it is not clear whether they can be used from SQL.

Describe the solution you'd like
Read a JSON document with SQL functions and flatten it into a table, along with other functions that help with data validation. Provide the same for XML documents.
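
To make the requested output shape concrete, here is a minimal sketch in plain Python of what "flattening" a nested JSON document into one table row could look like. It does not use BlazingSQL or cuIO; the `flatten` helper, the dotted column naming, and the sample document are all assumptions made for illustration.

```python
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into one dict keyed by dotted column names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the column prefix
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


# Hypothetical sample document
doc = json.loads('{"id": 1, "user": {"name": "ann", "address": {"city": "Lima"}}}')
row = flatten(doc)
print(row)
```

Each key of `row` would correspond to one column of the flattened table; arrays are left untouched in this sketch.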

Ref
https://docs.snowflake.net/manuals/sql-reference/functions-semistructured.html
https://spark.apache.org/docs/latest/api/sql/index.html (XML and JSON functions)

[BUG] Reading a file twice causes kernel to die

Describe the bug

When I try to read the attached table twice, it works the first time but fails on the second read. I also checked against cuDF, which seems to read the file correctly.

The kernel seems to die with the following error:

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  rmm_allocator::deallocate(): RMM_FREE: __global__ function call is not configured
Aborted (core dumped)

You can download the file from the url: https://srv-file6.gofile.io/download/ywd3Ab/item.csv

Code to reproduce bug

import os
from blazingsql import BlazingContext

bc = BlazingContext()
bc

item_result = bc.create_table('item', 
                         os.getcwd() + '/item.csv', 
                         delimiter= '|',
                         file_format = 'csv',
                         header  = None)

for i in range(0,2):
    table_result = bc.sql('select * from item limit 5').get()
    table_df = table_result.columns
    print(table_df.head(10))
    print(f"Worked {i} time")

Output:

BlazingContext ready
       0                 1           2     3                                                  4  ...        17      18       19  20                                              21
0  17820  AAAAAAAAAAAABAJK  2000-07-09  None  regular somas past the fluffy braids engage up...  ...      cyan      Oz  Unknown  57  8MJdS2aYdSydZ9Zw6llsXatytb6AUrj42owNPSpbbr0ARL
1  17821  AAAAAAAAAAAABAJL  2000-08-23  None  quiet idle hockey players would was. enticing ...  ...  metallic  Pallet  Unknown  33                   bmqB7bBbNCSJvGawoJGX25VQD24bX

[2 rows x 22 columns]
Worked 0 time
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  rmm_allocator::deallocate(): RMM_FREE: __global__ function call is not configured
Aborted (core dumped)


Environment overview (please complete the following information)

  • Environment location: Docker
  • Method of cuDF install: Docker (Docker nightly on 16th October)

Additional context

This is most probably related to the issue of reading files in a certain order, discussed on Slack.

support jdbc

Many applications access SQL-backed data through JDBC.
Please consider adding this feature.
Thanks.


[BUG] S3 File Not Found

BlazingSQL is unable to find S3 files that have been confirmed to actually be in S3.

ParseSchemaError Traceback (most recent call last)
ParseSchemaError: [ParseSchema Error] Path '/test/rocky2/restid=000002/' does not exist. File or directory paths are expected to be in one of the following formats:
For local file paths: '/folder0/folder1/fileName.extension'
For local file paths with wildcard: '/folder0/folder1/*fileName*.*'
For local directory paths: '/folder0/folder1/'
For s3 file paths: 's3://registeredFileSystemName/folder0/folder1/fileName.extension'
For s3 file paths with wildcard: '/folder0/folder1/*fileName*.*'
For s3 directory paths: 's3://registeredFileSystemName/folder0/folder1/'
For gs file paths: 'gs://registeredFileSystemName/folder0/folder1/fileName.extension'
For gs file paths with wildcard: '/folder0/folder1/*fileName*.*'
For gs directory paths: 'gs://registeredFileSystemName/folder0/folder1/'
For HDFS file paths: 'hdfs://registeredFileSystemName/folder0/folder1/fileName.extension'
For HDFS file paths with wildcard: '/folder0/folder1/*fileName*.*'
For HDFS directory paths: 'hdfs://registeredFileSystemName/folder0/folder1/'
Exception ignored in: 'cio.parseSchemaPython'
cio.ParseSchemaError: [ParseSchema Error] Path '/test/rocky2/restid=000002/' does not exist. File or directory paths are expected to be in one of the following formats:
For local file paths: '/folder0/folder1/fileName.extension'
For local file paths with wildcard: '/folder0/folder1/*fileName*.*'
For local directory paths: '/folder0/folder1/'
For s3 file paths: 's3://registeredFileSystemName/folder0/folder1/fileName.extension'
For s3 file paths with wildcard: '/folder0/folder1/*fileName*.*'
For s3 directory paths: 's3://registeredFileSystemName/folder0/folder1/'
For gs file paths: 'gs://registeredFileSystemName/folder0/folder1/fileName.extension'
For gs file paths with wildcard: '/folder0/folder1/*fileName*.*'
For gs directory paths: 'gs://registeredFileSystemName/folder0/folder1/'
For HDFS file paths: 'hdfs://registeredFileSystemName/folder0/folder1/fileName.extension'
For HDFS file paths with wildcard: '/folder0/folder1/*fileName*.*'
For HDFS directory paths: 'hdfs://registeredFileSystemName/folder0/folder1/'

I start the BlazingContext and set the s3 details like so:

s3_bc = BlazingContext()

s3_bc.s3('tld',
         bucket_name=NAME,
         access_key_id=auth_config["s3_access_key_id"],
         secret_key=auth_config["s3_secret_access_key"])

s3_bc.create_table('tld_s3', '/test/rocky2/restid=000002/')

I tried it several ways, all with the same error.

I tried giving it the file name as:

  1. '/test/rocky2/restid=000002/'
  2. '/test/rocky2/restid=000002/part.*'
  3. '/test/rocky2/restid=000002/part.parquet'
  4. '/test/rocky2/restid=/part*.parquet'
  5. '/test/rocky2//part*.parquet'
  6. '/test/rocky2//'
  7. And as a list of files with full paths (which included s3://BUCKET_NAME).

I have confirmed the files are in s3.
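
One thing the error message itself suggests is that S3 paths are expected in the form 's3://registeredFileSystemName/folder0/folder1/', where the registered name would be the first argument passed to `s3()` ('tld' above). The sketch below only builds such a path string; `s3_table_path` is a hypothetical helper, and whether this form resolves the reported error is not confirmed in this thread.

```python
def s3_table_path(registered_name, key):
    """Build 's3://<registered name>/<key>' as described in the error message."""
    return "s3://" + registered_name.strip("/") + "/" + key.lstrip("/")


# 'tld' is the name registered with s3_bc.s3(...) in the report above
path = s3_table_path("tld", "/test/rocky2/restid=000002/")
print(path)

# Assumed usage, not verified against this bug:
# s3_bc.create_table('tld_s3', path)
```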

Be able to create a blazing table from compressed csv files

I would like to be able to specify compression, as one can in cudf.read_csv.

Describe alternatives you've considered
Calling read_csv directly and then creating a table from the result, but this is more complicated and uses more memory when the query only requires a few columns.

Window Function

Why do we need window functions?
When working with big data, window functions help slice the data, for example removing duplicates with rank / row number / dense rank. Without these built-in functions, removing duplicates is hard and complex. A few use cases are listed below:

  1. Running totals within a group
  2. Arithmetic calculations like Max, Min, Avg within a group
  3. First Value / Last Value / Nth Value within a group

The above are a few use cases of standard window functions. Another is calculating or filling data over a sliding window (by range or rows) within a group, such as sales for the last 7 days from the current day (sliding by range from the date in the current row back 7 days within the window).
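
As a rough illustration of two of these use cases (de-duplication with ROW_NUMBER and a running total per group), here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. It assumes the bundled SQLite is version 3.25 or newer (when window functions were added); the `sales` table and its rows are made up, and nothing here reflects BlazingSQL syntax support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "2019-11-01", 10.0),
        ("A", "2019-11-02", 20.0),
        ("A", "2019-11-02", 20.0),  # exact duplicate row
        ("B", "2019-11-01", 5.0),
        ("B", "2019-11-03", 7.0),
    ],
)

# Remove duplicates: keep only the first row of each identical group
deduped = conn.execute(
    """
    SELECT store, day, amount FROM (
        SELECT store, day, amount,
               ROW_NUMBER() OVER (PARTITION BY store, day, amount ORDER BY day) AS rn
        FROM sales
    ) WHERE rn = 1
    ORDER BY store, day
    """
).fetchall()

# Running total of amount per store, ordered by day, over the distinct rows
running = conn.execute(
    """
    SELECT store, day,
           SUM(amount) OVER (
               PARTITION BY store ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM (SELECT DISTINCT store, day, amount FROM sales)
    ORDER BY store, day
    """
).fetchall()

print(deduped)
print(running)
```

A sliding 7-day window would use a RANGE frame instead of ROWS; that variant is left out here because frame support differs between engines.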

Describe the solution you'd like
For implementation, please refer to the Hive/Presto/Pandas/Spark SQL/MySQL/Postgres implementations.

Additional context
Here are a few links that give a better understanding of these functions:
http://shzhangji.com/blog/2017/09/04/hive-window-and-analytical-functions/
https://medium.com/jbennetcodes/how-to-get-rid-of-loops-and-use-window-functions-in-pandas-or-spark-sql-907f274850e4
https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
https://acadgild.com/blog/windowing-functions-in-hive

Enterprise DB window function documentation
https://docs.snowflake.net/manuals/sql-reference/functions-analytic.html
https://docs.aws.amazon.com/redshift/latest/dg/c_Window_functions.html

RDBMS window functions
https://dev.mysql.com/doc/refman/8.0/en/window-functions-usage.html
https://www.postgresql.org/docs/9.1/tutorial-window.html

[BUG] jdk crash on start

Describe the bug

The JDK crashes during the BlazingSQL healthcheck call.

Steps/Code to reproduce bug

The primary code is included below. The app crashes every time, but I am struggling to reproduce it standalone. The same healthcheck, run in the same container from an empty Python env (no surrounding GPU app running), passes.

Maybe there's a way to pull out lower-level logs/traces? Let me know.

import asyncio
import sys
import uuid

import cudf
from blazingsql import BlazingContext

bc = BlazingContext()

async def run_query(sql, tables):
    out = None
    # Register every input table before running the query
    for table_name in tables:
        bc.create_table(table_name, tables[table_name])
    try:
        out = bc.sql(sql)
    except Exception as e:
        print('blazing run_query err', e)
        raise e
    finally:
        # Always drop the temporary tables, even on failure
        for table_name in tables:
            bc.drop_table(table_name)
    return out

async def healthcheck():
    print('start')
    nines = cudf.DataFrame({'a': [9, 9, 9, 9, 9], 'b': [0, 1, 2, 3, 4]})
    table_name = 'u' + uuid.uuid4().hex
    tables = {table_name: nines}
    gdf = await run_query('select SUM(a * b) as cross_product from ' + table_name, tables)
    print('end')

# healthcheck is a coroutine, so it has to be driven by an event loop
asyncio.run(healthcheck())

=>

forge-etl-python_1     | 2019-11-27T20:50:55.091791578Z start
forge-etl-python_1     | 2019-11-27T20:50:55.642078285Z #
forge-etl-python_1     | 2019-11-27T20:50:55.642138234Z # A fatal error has been detected by the Java Runtime Environment:
forge-etl-python_1     | 2019-11-27T20:50:55.642144365Z #
forge-etl-python_1     | 2019-11-27T20:50:55.642162646Z #  SIGSEGV (0xb) at pc=0x00007f99c1bcea10, pid=6, tid=0x00007f9a36183740
forge-etl-python_1     | 2019-11-27T20:50:55.642169205Z #
forge-etl-python_1     | 2019-11-27T20:50:55.642181950Z # JRE version: OpenJDK Runtime Environment (8.0_192-b01) (build 1.8.0_192-b01)
forge-etl-python_1     | 2019-11-27T20:50:55.642187416Z # Java VM: OpenJDK 64-Bit Server VM (25.192-b01 mixed mode linux-amd64 compressed oops)
forge-etl-python_1     | 2019-11-27T20:50:55.642191729Z # Problematic frame:
forge-etl-python_1     | 2019-11-27T20:50:55.642527428Z # C  [libblazingsql-engine.so+0xfaa10]  gdf_column_cpp::dtype() const+0x0
forge-etl-python_1     | 2019-11-27T20:50:55.642561387Z #
forge-etl-python_1     | 2019-11-27T20:50:55.642569283Z # Core dump written. Default location: /opt/graphistry/apps/forge/etl-server-python/core or core.6
forge-etl-python_1     | 2019-11-27T20:50:55.642576476Z #
forge-etl-python_1     | 2019-11-27T20:50:55.644013778Z # An error report file with more information is saved as:
forge-etl-python_1     | 2019-11-27T20:50:55.644034209Z # /tmp/hs_err_pid6.log
forge-etl-python_1     | 2019-11-27T20:50:55.677110517Z #
forge-etl-python_1     | 2019-11-27T20:50:55.677153111Z # If you would like to submit a bug report, please visit:
forge-etl-python_1     | 2019-11-27T20:50:55.677159727Z #   http://www.azulsystems.com/support/
forge-etl-python_1     | 2019-11-27T20:50:55.677164188Z # The crash happened outside the Java Virtual Machine in native code.
forge-etl-python_1     | 2019-11-27T20:50:55.677168627Z # See problematic frame for where to report the bug.
forge-etl-python_1     | 2019-11-27T20:50:55.677172691Z #
ubuntu@ip-172-31-28-246:~/graphistry$ 

Environment overview (please complete the following information)

rapids 0.11 nightly / blazing nightly: https://hub.docker.com/repository/docker/graphistry/graphistry-blazing (source activate rapids)

[BUG] - SQL query, multiple WHERE clauses

Describe the bug
Not sure if this is a bug or a feature request.

Running a SQL query with "... WHERE table1.colA LIKE '%foo%' AND table1.colB = 'bar'" returns an empty DataFrame.

Each condition works on its own, but combining the two in one WHERE clause produces the empty DataFrame.

Steps/Code to reproduce bug
bc.sql("SELECT table1.colA, table1.colB from table1 WHERE table1.colA LIKE '%foo%' AND table1.colB = 'bar'").get()

Expected behavior
I expected to get a valid result.

Using grep on the raw data gives expected result.
Running the same query in an sqlite database gives expected result.
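
A minimal sketch of that kind of SQLite cross-check, using Python's built-in sqlite3 module. The table contents are hypothetical stand-ins for the real data, which is not included in this report; only the query shape matches the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (colA TEXT, colB TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        ("xfoox", "bar"),    # matches both conditions
        ("foo", "baz"),      # matches only the LIKE condition
        ("nothing", "bar"),  # matches only the equality condition
    ],
)

query = (
    "SELECT table1.colA, table1.colB FROM table1 "
    "WHERE table1.colA LIKE '%foo%' AND table1.colB = 'bar'"
)
rows = conn.execute(query).fetchall()
print(rows)
```

Only rows satisfying both conditions should come back, so a non-empty result here is the reference behavior the combined query is being compared against.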

"... WHERE table1.colA LIKE '%foo%'" - Expected result
"... WHERE table1.colB = 'bar'" - Expected result

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of cuDF install: conda

Environment details

 ***git***
 commit c4371b0021fb74d71e35b8ef84e8d5629b3111c4 (HEAD -> master)
 Author: REMOVED REMOVED <REMOVED.REMOVED@REMOVED>
 Date:   Tue Oct 29 08:02:22 2019 +0100
 
 change structure
 ***git submodules***
 
 ***OS Information***
 DISTRIB_ID=Ubuntu
 DISTRIB_RELEASE=18.04
 DISTRIB_CODENAME=bionic
 DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
 NAME="Ubuntu"
 VERSION="18.04.3 LTS (Bionic Beaver)"
 ID=ubuntu
 ID_LIKE=debian
 PRETTY_NAME="Ubuntu 18.04.3 LTS"
 VERSION_ID="18.04"
 HOME_URL="https://www.ubuntu.com/"
 SUPPORT_URL="https://help.ubuntu.com/"
 BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
 PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
 VERSION_CODENAME=bionic
 UBUNTU_CODENAME=bionic
 Linux REMOVED 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
 
 ***GPU Information***
 Tue Nov  5 10:58:50 2019
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |===============================+======================+======================|
 |   0  GeForce RTX 2080    Off  | 00000000:01:00.0  On |                  N/A |
 | 24%   29C    P8    14W / 215W |   3327MiB /  7979MiB |     29%      Default |
 +-------------------------------+----------------------+----------------------+
 
 +-----------------------------------------------------------------------------+
 | Processes:                                                       GPU Memory |
 |  GPU       PID   Type   Process name                             Usage      |
 |=============================================================================|
 |    0      1465      G   /usr/lib/xorg/Xorg                            28MiB |
 |    0      1501      G   /usr/bin/gnome-shell                          57MiB |
 |    0      1813      G   /usr/lib/xorg/Xorg                           369MiB |
 |    0      1963      G   /usr/bin/gnome-shell                         238MiB |
 |    0      3183      G   ...uest-channel-token=12555389628416861818    42MiB |
 |    0      5836      G   /snap/pycharm-community/155/jbr/bin/java      28MiB |
 |    0     15894      G   /usr/lib/firefox/firefox                       6MiB |
 |    0     16611      G   /usr/lib/firefox/firefox                      26MiB |
 |    0     17223      G   /usr/lib/firefox/firefox                     185MiB |
 |    0     17427      G   /usr/lib/firefox/firefox                     150MiB |
 |    0     17848      G   /usr/lib/firefox/firefox                       6MiB |
 |    0     18210      G   /usr/lib/firefox/firefox                      83MiB |
 |    0     18322      G   /usr/lib/firefox/firefox                     150MiB |
 |    0     19738      C   blazingsql-engine                           1325MiB |
 |    0     26193      C   ...Data/anaconda3/envs/REMOVED/bin/python   623MiB |
 +-----------------------------------------------------------------------------+
 
 ***CPU***
 Architecture:        x86_64
 CPU op-mode(s):      32-bit, 64-bit
 Byte Order:          Little Endian
 CPU(s):              8
 On-line CPU(s) list: 0-7
 Thread(s) per core:  1
 Core(s) per socket:  8
 Socket(s):           1
 NUMA node(s):        1
 Vendor ID:           GenuineIntel
 CPU family:          6
 Model:               158
 Model name:          Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
 Stepping:            12
 CPU MHz:             4600.005
 CPU max MHz:         4900,0000
 CPU min MHz:         800,0000
 BogoMIPS:            7200.00
 Virtualization:      VT-x
 L1d cache:           32K
 L1i cache:           32K
 L2 cache:            256K
 L3 cache:            12288K
 NUMA node0 CPU(s):   0-7
 Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
 
 ***CMake***
 /media/REMOVED/Data/anaconda3/envs/REMOVED/bin/cmake
 cmake version 3.15.4
 
 CMake suite maintained and supported by Kitware (kitware.com/cmake).
 
 ***g++***
 /usr/bin/g++
 g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
 Copyright (C) 2017 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
 
 ***nvcc***
 
 ***Python***
 /media/REMOVED/Data/anaconda3/envs/REMOVED/bin/python
 Python 3.7.3
 
 ***Environment Variables***
 PATH                            : /media/REMOVED/Data/anaconda3/envs/REMOVED/bin:/media/REMOVED/Data/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
 LD_LIBRARY_PATH                 :
 NUMBAPRO_NVVM                   :
 NUMBAPRO_LIBDEVICE              :
 CONDA_PREFIX                    : /media/REMOVED/Data/anaconda3/envs/REMOVED
 PYTHON_PATH                     :
 
 ***conda packages***
 /media/REMOVED/Data/anaconda3/condabin/conda
 # packages in environment at /media/REMOVED/Data/anaconda3/envs/REMOVED:
 #
 # Name                    Version                   Build  Channel
 _libgcc_mutex             0.1                        main
 alsa-lib                  1.1.5             h516909a_1001    conda-forge
 altair                    3.2.0                    pypi_0    pypi
 argh                      0.26.2                   pypi_0    pypi
 arrow-cpp                 0.14.1           py37h5ac5442_4    conda-forge
 astor                     0.8.0                    pypi_0    pypi
 attrs                     19.3.0                   pypi_0    pypi
 backcall                  0.1.0                    pypi_0    pypi
 base58                    1.0.3                    pypi_0    pypi
 blazingsql-calcite        0.4.5                         0    blazingsql
 blazingsql-communication  0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
 blazingsql-io             0.4.4                         0    blazingsql
 blazingsql-orchestrator   0.4.5                         0    blazingsql
 blazingsql-protocol       0.4.5                    py37_0    blazingsql
 blazingsql-python         0.4.5           cuda10.0_py37_0    blazingsql/label/cuda10.0
 blazingsql-ral            0.4.5                cuda10.0_0    blazingsql/label/cuda10.0
 blazingsql-toolchain      0.4.5                         0    blazingsql
 bleach                    3.1.0                    pypi_0    pypi
 blessings                 1.7                      pypi_0    pypi
 blinker                   1.4                      pypi_0    pypi
 bokeh                     1.3.4                    py37_0    conda-forge
 boost                     1.70.0           py37h9de70de_1    conda-forge
 boost-cpp                 1.70.0               h8e57a91_2    conda-forge
 boto3                     1.10.4                   pypi_0    pypi
 botocore                  1.13.4                   pypi_0    pypi
 brotli                    1.0.7             he1b5a44_1000    conda-forge
 bzip2                     1.0.8                h516909a_1    conda-forge
 c-ares                    1.15.0            h516909a_1001    conda-forge
 ca-certificates           2019.9.11            hecc5488_0    conda-forge
 certifi                   2019.9.11                py37_0    conda-forge
 chardet                   3.0.4                    pypi_0    pypi
 click                     7.0                        py_0    conda-forge
 cloudpickle               1.2.2                      py_0    conda-forge
 cmake                     3.15.4               hf94ab9c_0    conda-forge
 cppzmq                    4.4.1                hc9558a2_0    conda-forge
 cudatoolkit               10.0.130                      0
 cudf                      0.10.0                   py37_0    rapidsai
 curl                      7.65.3               hf8cf82a_0    conda-forge
 cython                    0.29.13          py37he1b5a44_0    conda-forge
 cytoolz                   0.10.0           py37h516909a_0    conda-forge
 dask                      2.6.0                      py_0    conda-forge
 dask-core                 2.6.0                      py_0    conda-forge
 dask-cudf                 0.10.0                   py37_0    rapidsai
 decorator                 4.4.1                    pypi_0    pypi
 defusedxml                0.6.0                    pypi_0    pypi
 distributed               2.6.0                      py_0    conda-forge
 dlpack                    0.2                  he1b5a44_1    conda-forge
 docutils                  0.15.2                   pypi_0    pypi
 double-conversion         3.1.5                he1b5a44_1    conda-forge
 entrypoints               0.3                      pypi_0    pypi
 enum-compat               0.0.3                    pypi_0    pypi
 expat                     2.2.5             he1b5a44_1004    conda-forge
 fastavro                  0.22.5           py37h516909a_0    conda-forge
 flatbuffers               1.11                     pypi_0    pypi
 fontconfig                2.13.1            h86ecdb6_1001    conda-forge
 freetype                  2.10.0               he983fc9_1    conda-forge
 fsspec                    0.5.2                      py_0    conda-forge
 future                    0.18.1                   pypi_0    pypi
 geoip2                    2.9.0                    pypi_0    pypi
 gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
 gflags                    2.2.2             he1b5a44_1002    conda-forge
 giflib                    5.1.7                h516909a_1    conda-forge
 glog                      0.4.0                he1b5a44_1    conda-forge
 gmock                     1.10.0                        0    conda-forge
 grpc-cpp                  1.23.0               h18db393_0    conda-forge
 gtest                     1.10.0               hc9558a2_0    conda-forge
 heapdict                  1.0.1                      py_0    conda-forge
 icu                       64.2                 he1b5a44_1    conda-forge
 idna                      2.8                      pypi_0    pypi
 importlib-metadata        0.23                     pypi_0    pypi
 ipykernel                 5.1.3                    pypi_0    pypi
 ipython                   7.9.0                    pypi_0    pypi
 ipython-genutils          0.2.0                    pypi_0    pypi
 jedi                      0.15.1                   pypi_0    pypi
 jinja2                    2.10.3                     py_0    conda-forge
 jmespath                  0.9.4                    pypi_0    pypi
 jpeg                      9c                h14c3975_1001    conda-forge
 json5                     0.8.5                    pypi_0    pypi
 jsonschema                3.1.1                    pypi_0    pypi
 jupyter-client            5.3.4                    pypi_0    pypi
 jupyter-core              4.6.1                    pypi_0    pypi
 jupyterlab                1.1.4                    pypi_0    pypi
 jupyterlab-server         1.0.6                    pypi_0    pypi
 krb5                      1.16.3            h05b26f9_1001    conda-forge
 lcms2                     2.9                  h2e4bb80_0    conda-forge
 libblas                   3.8.0               14_openblas    conda-forge
 libcblas                  3.8.0               14_openblas    conda-forge
 libcudf                   0.10.0               cuda10.0_0    rapidsai
 libcurl                   7.65.3               hda55be3_0    conda-forge
 libedit                   3.1.20181209         hc058e9b_0
 libevent                  2.1.10               h72c5cf5_0    conda-forge
 libffi                    3.2.1                hd88cf55_4
 libgcc-ng                 9.1.0                hdf63c60_0
 libgcrypt                 1.8.4             hf484d3e_1000    conda-forge
 libgfortran-ng            7.3.0                hdf63c60_2    conda-forge
 libgpg-error              1.36                 he1b5a44_0    conda-forge
 libgsasl                  1.8.0             h19a2143_1004    conda-forge
 libhdfs3                  2.3               h311b756_1006    conda-forge
 libiconv                  1.15              h516909a_1005    conda-forge
 liblapack                 3.8.0               14_openblas    conda-forge
 libllvm8                  8.0.1                hc9558a2_0    conda-forge
 libntlm                   1.4               h14c3975_1002    conda-forge
 libnvstrings              0.10.0               cuda10.0_0    rapidsai
 libopenblas               0.3.7                h6e990d7_2    conda-forge
 libpng                    1.6.37               hed695b0_0    conda-forge
 libprotobuf               3.8.0                h8b12597_0    conda-forge
 librmm                    0.10.0               cuda10.0_0    rapidsai
 libsodium                 1.0.17               h516909a_0    conda-forge
 libssh2                   1.8.2                h22169c7_2    conda-forge
 libstdcxx-ng              9.1.0                hdf63c60_0
 libtiff                   4.0.10            h57b8799_1003    conda-forge
 libuuid                   2.32.1            h14c3975_1000    conda-forge
 libuv                     1.33.1               h516909a_0    conda-forge
 libxcb                    1.13              h14c3975_1002    conda-forge
 libxml2                   2.9.9                hee79883_5    conda-forge
 llvmlite                  0.30.0           py37h8b12597_0    conda-forge
 locket                    0.2.0                      py_2    conda-forge
 lz4-c                     1.8.3             he1b5a44_1001    conda-forge
 markupsafe                1.1.1            py37h14c3975_0    conda-forge
 maven                     3.6.0                         0    conda-forge
 maxminddb                 1.5.1                    pypi_0    pypi
 mistune                   0.8.4                    pypi_0    pypi
 more-itertools            7.2.0                    pypi_0    pypi
 msgpack-python            0.6.2            py37hc9558a2_0    conda-forge
 nbconvert                 5.6.1                    pypi_0    pypi
 nbformat                  4.4.0                    pypi_0    pypi
 ncurses                   6.1                  he6710b0_1
 notebook                  6.0.1                    pypi_0    pypi
 numba                     0.46.0           py37hb3f55d8_0    conda-forge
 numpy                     1.17.3           py37h95a1406_0    conda-forge
 nvstrings                 0.10.0                   py37_0    rapidsai
 olefile                   0.46                       py_0    conda-forge
 openjdk                   11.0.1            h46a85a0_1017    conda-forge
 openssl                   1.1.1c               h516909a_0    conda-forge
 packaging                 19.2                       py_0    conda-forge
 pandas                    0.24.2           py37hb3f55d8_0    conda-forge
 pandocfilters             1.4.2                    pypi_0    pypi
 parquet-cpp               1.5.1                         2    conda-forge
 parso                     0.5.1                    pypi_0    pypi
 partd                     1.0.0                      py_0    conda-forge
 pathtools                 0.1.2                    pypi_0    pypi
 pexpect                   4.7.0                    pypi_0    pypi
 pickleshare               0.7.5                    pypi_0    pypi
 pillow                    6.2.1            py37h6b7be26_0    conda-forge
 pip                       19.3.1                   py37_0
 prometheus-client         0.7.1                    pypi_0    pypi
 prompt-toolkit            2.0.10                   pypi_0    pypi
 protobuf                  3.10.0                   pypi_0    pypi
 psutil                    5.6.3            py37h516909a_0    conda-forge
 pthread-stubs             0.4               h14c3975_1001    conda-forge
 ptyprocess                0.6.0                    pypi_0    pypi
 pyarrow                   0.14.1           py37h8b68381_2    conda-forge
 pygments                  2.4.2                    pypi_0    pypi
 pyparsing                 2.4.2                      py_0    conda-forge
 pyrsistent                0.15.5                   pypi_0    pypi
 python                    3.7.3                h5b0a415_0    conda-forge
 python-dateutil           2.8.0                      py_0    conda-forge
 pytz                      2019.3                     py_0    conda-forge
 pyyaml                    5.1.2            py37h516909a_0    conda-forge
 pyzmq                     18.1.0                   pypi_0    pypi
 rapidjson                 1.1.0             he1b5a44_1002    conda-forge
 re2                       2019.09.01           he1b5a44_0    conda-forge
 readline                  7.0                  h7b6447c_5
 requests                  2.22.0                   pypi_0    pypi
 rhash                     1.3.6             h14c3975_1001    conda-forge
 rmm                       0.10.0                   py37_0    rapidsai
 s3transfer                0.2.1                    pypi_0    pypi
 send2trash                1.5.0                    pypi_0    pypi
 setuptools                41.4.0                   py37_0
 shellgraph                0.1.0                    pypi_0    pypi
 six                       1.12.0                py37_1000    conda-forge
 snappy                    1.1.7             he1b5a44_1002    conda-forge
 sortedcontainers          2.1.0                      py_0    conda-forge
 sqlite                    3.30.1               h7b6447c_0
 streamlit                 0.49.0                   pypi_0    pypi
 tblib                     1.4.0                      py_0    conda-forge
 terminado                 0.8.2                    pypi_0    pypi
 testpath                  0.4.2                    pypi_0    pypi
 thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
 tk                        8.6.9             hed695b0_1003    conda-forge
 toml                      0.10.0                   pypi_0    pypi
 toolz                     0.10.0                     py_0    conda-forge
 tornado                   5.1.1                    pypi_0    pypi
 traitlets                 4.3.3                    pypi_0    pypi
 tzlocal                   2.0.0                    pypi_0    pypi
 uriparser                 0.9.3                he1b5a44_1    conda-forge
 urllib3                   1.25.6                   pypi_0    pypi
 validators                0.14.0                   pypi_0    pypi
 watchdog                  0.9.0                    pypi_0    pypi
 wcwidth                   0.1.7                    pypi_0    pypi
 webencodings              0.5.1                    pypi_0    pypi
 wheel                     0.33.6                   py37_0
 xorg-fixesproto           5.0               h14c3975_1002    conda-forge
 xorg-inputproto           2.3.2             h14c3975_1002    conda-forge
 xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
 xorg-libx11               1.6.9                h516909a_0    conda-forge
 xorg-libxau               1.0.9                h14c3975_0    conda-forge
 xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
 xorg-libxext              1.3.4                h516909a_0    conda-forge
 xorg-libxfixes            5.0.3             h516909a_1004    conda-forge
 xorg-libxi                1.7.10               h516909a_0    conda-forge
 xorg-libxrender           0.9.10            h516909a_1002    conda-forge
 xorg-libxtst              1.2.3             h14c3975_1002    conda-forge
 xorg-recordproto          1.14.2            h14c3975_1002    conda-forge
 xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
 xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
 xorg-xproto               7.0.31            h14c3975_1007    conda-forge
 xz                        5.2.4                h14c3975_4
 yaml                      0.1.7             h14c3975_1001    conda-forge
 zeromq                    4.3.2                he1b5a44_2    conda-forge
 zict                      1.0.0                      py_0    conda-forge
 zipp                      0.6.0                    pypi_0    pypi
 zlib                      1.2.11               h7b6447c_3
 zstd                      1.4.0                h3b9ef0a_0    conda-forge


[BUG] Algebra unstable after a create table

Describe the bug
Calcite becomes unstable after I run a create table call that passes only the table name and the file path.

Steps/Code to reproduce bug
When I create a table with the following code:

bc.create_table('nation', table_list)

That is, the call omits these params: delimiter='|', dtype=column_types, names=column_names

Expected behavior
The SQL script must be executed correctly.

Environment overview (please complete the following information)
Docker

Additional context
I got this error message on Algebra:

ERROR: org.hibernate.AssertionFailure - an assertion failure occured (this may indicate a bug in Hibernate, but is more likely due to unsafe use of the session)
org.hibernate.AssertionFailure: null id in com.blazingdb.calcite.catalog.domain.CatalogColumnImpl entry (don't flush the Session after an exception occurs)
        at org.hibernate.event.def.DefaultFlushEntityEventListener.checkId(DefaultFlushEntityEventListener.java:82)
        at org.hibernate.event.def.DefaultFlushEntityEventListener.getValues(DefaultFlushEntityEventListener.java:190)
        at org.hibernate.event.def.DefaultFlushEntityEventListener.onFlushEntity(DefaultFlushEntityEventListener.java:147)
        at org.hibernate.event.def.AbstractFlushingEventListener.flushEntities(AbstractFlushingEventListener.java:219)
        at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:99)
        at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
        at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1216)
        at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:383)
        at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:133)
        at com.blazingdb.calcite.catalog.repository.DatabaseRepository.createDatabase(DatabaseRepository.java:58)
        at com.blazingdb.calcite.catalog.repository.DatabaseRepository.updateDatabase(DatabaseRepository.java:156)
        at com.blazingdb.calcite.catalog.connection.CatalogServiceImpl.dropTable(CatalogServiceImpl.java:36)
        at com.blazingdb.calcite.application.CalciteService.processRequest(CalciteService.java:78)
        at com.blazingdb.calcite.application.TCPService.run(TCPService.java:134)
        at java.base/java.lang.Thread.run(Thread.java:834)
Waiting for messages in TCP port: 8890

I think pyBlazing should verify the params before executing the SQL script.
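
A minimal sketch of the kind of check suggested above. The helper `check_create_table_args` is hypothetical and not part of pyBlazing. It assumes that delimited text files need `delimiter`, `dtype` and `names` (the three parameters named in this report) and that text files can be recognised by extension. The real requirements may differ.

```python
# Hypothetical pre-flight check; not part of pyBlazing.
# Assumption: delimited text tables need these three keyword arguments.
REQUIRED_FOR_TEXT = ("delimiter", "dtype", "names")
# Assumption: these extensions are treated as delimited text.
TEXT_EXTENSIONS = (".csv", ".tbl", ".psv", ".txt")


def check_create_table_args(table_name, paths, **kwargs):
    """Raise ValueError before any request is sent if arguments are incomplete."""
    if not isinstance(table_name, str) or not table_name:
        raise ValueError("table_name must be a non-empty string")
    if not paths:
        raise ValueError("at least one file path is required")
    is_text = all(str(p).lower().endswith(TEXT_EXTENSIONS) for p in paths)
    missing = [key for key in REQUIRED_FOR_TEXT if key not in kwargs]
    if is_text and missing:
        raise ValueError(
            "create_table(%r) is missing: %s" % (table_name, ", ".join(missing))
        )
    return True
```

Called as `check_create_table_args('nation', table_list)`, this would raise a `ValueError` naming the missing parameters instead of letting the request reach Calcite.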

Partition/Cluster/Bucket Keys like Hive

Why do we need keys on the table?

  • Queries do not always scan the whole table. Sometimes we only look at the incremental load, which requires a (sub)partition key on the table.
  • In a distributed setting, it is good to have sort key options both within each worker and globally, which gives the developer room for optimization.

Describe the solution you'd like
Look at the Hive implementation.
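
To illustrate the incremental-load case, here is a rough sketch of partition pruning over Hive-style paths, where each partition is a directory named `key=value`. The function names, the `load_date` key and the file paths are made up for the example. Nothing here is BlazingSQL API.

```python
from pathlib import PurePosixPath


def partition_values(path):
    """Extract Hive-style key=value directory components from a file path."""
    directories = PurePosixPath(path).parts[:-1]  # skip the file name
    return dict(d.split("=", 1) for d in directories if "=" in d)


def prune_partitions(paths, **wanted):
    """Keep only the files whose partition values match every requested key."""
    return [
        p for p in paths
        if all(partition_values(p).get(k) == str(v) for k, v in wanted.items())
    ]


# Hypothetical layout: one directory per load date.
files = [
    "sales/load_date=2019-10-01/part-0.parquet",
    "sales/load_date=2019-10-02/part-0.parquet",
    "sales/load_date=2019-10-02/part-1.parquet",
]

# Only the files of the latest load are kept; the rest are never opened.
latest = prune_partitions(files, load_date="2019-10-02")
```

With a partition key, a query over the latest load only has to read the two files under `load_date=2019-10-02`.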

[BUG] Error in create a table from a dask_cudf dataframe

I'm trying to create a blazingsql (0.4.3) table based on a dask_cudf dataframe but I'm getting an error:

df2=dask_cudf.read_csv(list_of_files)
df2.head()

bc.create_table('test', df2)

AttributeError: 'NoneType' object has no attribute 'table_name'

[BUG] Error in create a table from a dask_cudf dataframe

According to #82, in version 0.4.5 it should be possible to create a table based on a dask_cudf dataframe, but I got an error:

# packages in environment at /home/ec2-user/anaconda3/envs/rapids-blazing:
blazingsql-calcite        0.4.5                         0    blazingsql
blazingsql-communication  0.4.4                         0    blazingsql
blazingsql-io             0.4.4                         0    blazingsql
blazingsql-orchestrator   0.4.5                         0    blazingsql
blazingsql-protocol       0.4.5                    py37_0    blazingsql
blazingsql-python         0.4.5           cuda10.0_py37_0    blazingsql
blazingsql-ral            0.4.5                cuda10.0_3    blazingsql
blazingsql-toolchain      0.4.5                         0    blazingsql

df=dask_cudf.read_csv(files, compression='gzip', sep='~', header=None, encoding='cp1252', low_memory=False, usecols=numerics)

len(df)
3168478

type(df)
dask_cudf.core.DataFrame

bc.create_table('test', df)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-4280a0575141> in <module>
      1 #from blazingsql import BlazingContext
      2 #bc = BlazingContext()
----> 3 bc.create_table('test', df)
      4 result = bc.sql('SELECT count(*) FROM test').get()
      5 result_gdf = result.columns

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/apiv2/context.py in create_table(self, table_name, input, **kwargs)
    247                               table_name,
    248                               dask_client=self.dask_client,
--> 249                               **kwargs)
    250         table = self.sqlObject.create_table(ds)
    251 

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/apiv2/datasource.py in build_datasource(client, input, table_name, **kwargs)
    438                         descriptor,
    439                         dask_cudf=input,
--> 440                         **kwargs)
    441 
    442     # TODO percy raise error if ds is None

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/apiv2/datasource.py in __init__(self, client, table_name, descriptor, **kwargs)
    220 
    221         # init the data source
--> 222         self._valid = self._load(**kwargs)
    223 
    224     def is_valid(self):

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/apiv2/datasource.py in _load(self, **kwargs)
    258             return self._load_orc(**kwargs)
    259         elif type == Type.dask_cudf:
--> 260             return self._load_dask_cudf(**kwargs)
    261         else:
    262             # TODO percy manage errors

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/apiv2/datasource.py in _load_dask_cudf(self, **kwargs)
    402             dtypes=column_dtypes,
    403             dask_cudf=dask_cudf,
--> 404             dask_client=dask_client
    405         )
    406 

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/apiv2/bridge.py in create_table(client, table_name, **kwargs)
     33     @staticmethod
     34     def create_table(client, table_name, **kwargs):
---> 35         return pyblazing.create_table(table_name, **kwargs)
     36 
     37     @staticmethod

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/api.py in create_table(tableName, **kwargs)
   1316     else:
   1317         dask_client = kwargs['dask_client']
-> 1318         dask_tables = dask_cudf_to_BlazingDaskTable(dask_cudf, dask_client)
   1319 
   1320     if (len(columnTypes) > 0):

~/anaconda3/envs/rapids-blazing/lib/python3.7/site-packages/pyblazing/api.py in dask_cudf_to_BlazingDaskTable(dask_cudf, dask_client)
   1405         gdf_to_BlazingTable).compute()
   1406 
-> 1407     who_has = dask_client.who_has()
   1408     ips = [re.findall(r'(?:\d+\.){3}\d+', who_has[str(k)][0])[0]
   1409            for k in dask_cudf.dask.keys()]

AttributeError: 'NoneType' object has no attribute 'who_has'
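
The traceback ends in `dask_client.who_has()` with `dask_client` being `None`, which suggests that no Dask client reached `create_table`. A minimal sketch of a guard that would report this earlier and more clearly follows. `require_dask_client` is a hypothetical helper, not an existing pyblazing function.

```python
def require_dask_client(dask_client):
    """Fail early with a readable message when no usable Dask client was supplied."""
    if dask_client is None:
        raise ValueError(
            "creating a table from a dask_cudf DataFrame needs a Dask client, "
            "but dask_client is None"
        )
    # who_has() is the method the traceback shows being called on the client.
    if not callable(getattr(dask_client, "who_has", None)):
        raise TypeError("dask_client must provide a who_has() method")
    return dask_client
```

Since the traceback shows `create_table` forwarding `dask_client=self.dask_client`, one thing worth checking is whether the context object was given a Dask client when it was created. How this version accepts one is not confirmed by the report.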

[QST] Is LIKE supported?

What is your question?
Hello. I'm trying to filter my data set using the LIKE filter but I always get the same result regardless of the filter value I use. So I'm wondering if LIKE is actually supported.
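
One way to narrow this down is to compare against a reference engine. The sketch below uses Python's built-in `sqlite3` module, not BlazingSQL, to show that different LIKE patterns should return different rows: `%` matches any run of characters and `_` matches exactly one. The table and values are made up. Note that SQLite's LIKE is case-insensitive for ASCII by default, which other engines may not match.

```python
import sqlite3

# In-memory reference table with made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nation (n_name TEXT)")
conn.executemany(
    "INSERT INTO nation VALUES (?)",
    [("ARGENTINA",), ("BRAZIL",), ("CANADA",), ("PERU",)],
)


def names_like(pattern):
    """Return the names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT n_name FROM nation WHERE n_name LIKE ? ORDER BY n_name",
        (pattern,),
    )
    return [row[0] for row in rows]


print(names_like("%A"))   # names ending in A
print(names_like("_E%"))  # names whose second letter is E
print(names_like("%Z%"))  # names containing Z
```

If the same patterns all return the same rows in BlazingSQL, the filter is probably being ignored rather than misread.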
