intel-ai / hdk
A low-level execution library for analytic data processing.
License: Apache License 2.0
It appears that the environment variable MAPD_LOG_DIR, set here https://github.com/intel-ai/omniscidb/blob/jit-engine/Calcite/CMakeLists.txt#L30, is not being picked up by the log4j properties file: https://github.com/intel-ai/omniscidb/blob/jit-engine/Calcite/java/calcite/src/main/resources/log4j2.properties.
To reproduce, build as normal, then enter build/Tests and run ArrowBasedExecuteTest --gtest_filter=Select.GroupBy (the filter keeps the test run brief).
@vlad-penkin Ilya suggested you might have some ideas about how to debug this?
Since the default L0 driver location is in the system libraries, there's an issue when building the jit-engine branch. CMake's find_package looks for the headers and finds them in /usr/include, but CMake does not add that directory to the include paths even when target_include_directories explicitly lists the system paths (see https://gitlab.kitware.com/cmake/cmake/-/issues/17966 for details). Providing hints/paths to find_package breaks the linking process for other libraries due to conflicts.
There's also currently no conda package we could use to avoid the system includes/libraries. We need to either build such a package or create a workaround for building jit-engine with L0 under a conda env.
Blocked by #34
Add a smoke/sanity test for Modin powered by HDK to the GitHub Actions tests in this repo. Will need to use the HDK branch in the Modin repository for now.
e.g.:
import pyarrow
import pandas

import pyhdk

storage = pyhdk.storage.ArrowStorage(1)  # 1 is the schema id
data_mgr = pyhdk.storage.DataMgr()
data_mgr.registerDataProvider(storage)
calcite = pyhdk.sql.Calcite(storage)
executor = pyhdk.Executor(data_mgr)

at = pyarrow.Table.from_pandas(
    pandas.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
)
opt = pyhdk.storage.TableOptions(2)
storage.importArrowTable(at, "test", opt)

sql = "SELECT * FROM test;"
ra = calcite.process(sql)
rel_alg_executor = pyhdk.sql.RelAlgExecutor(executor, storage, data_mgr, ra)
print(rel_alg_executor.execute().to_arrow().to_pandas())
print(rel_alg_executor.execute(just_explain=True).to_explain_str())
See #51 for details
My attempt to use Ubuntu 22 was blocked by an invalid CUDA package for this Ubuntu version; see https://askubuntu.com/questions/1421423/cuda-11-7-dependencies-issue-on-ubuntu-22-04
I'll check whether it's possible to update Maven only.
Most of the code generation enabling requires a simple change: switching to the correct pointer address space and calling convention. For native code generation, this is handled by CodegenTraits; however, the existing codegen logic does not allow passing CodegenTraits to CodeGenerator at construction time.
From Igor:
we have a version mark here https://github.com/intel-ai/hdk/blob/main/CMakeLists.txt#L5 and here https://github.com/intel-ai/hdk/blob/main/CMakeLists.txt#L32 - we could probably drop one of them so the version doesn't have to be changed in two places each time
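Until one of the marks is dropped, a small consistency check could catch drift between the two places. A sketch (the helper and regex are illustrative, not part of the repo):

```python
import re

def versions_match(cmake_text: str) -> bool:
    # Collect every "VERSION x.y[.z]" occurrence in the CMake source and
    # check that they all agree, so the two marks cannot silently diverge.
    versions = re.findall(r"VERSION\s+(\d+\.\d+(?:\.\d+)?)", cmake_text)
    return len(set(versions)) <= 1
```

Such a check could run in CI against CMakeLists.txt until the duplication is removed.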
Remove the java target folder and the Cython-generated .cpp files in the python folder when running make clean from this repo.
Functionality required:
Functionality not required/desired:
Initial attempts have failed due to linking problems, but we can try again once https://github.com/intel-ai/omniscidb/pull/332 lands.
Also requires:
If I update omniscidb submodule to the latest jit-engine branch then I get this error trying to parse RelAlg JSON queries in Calcite:
java.lang.NoClassDefFoundError: com/fasterxml/jackson/annotation/JsonIncludeProperties
at com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector.findPropertyInclusionByName(JacksonAnnotationIntrospector.java:321) ~[calcite-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
Looks like it is related to the latest change in the jackson-databind version used by Calcite. The problem can be reproduced using the ienkovich/config branch of HDK and the ienkovich/pyhdk-config branch of Modin.
Use GitHub Pages + Sphinx + GitHub Actions for auto-building the docs?
Running pytest from the hdk tests directory causes a crash on the second test.
Specifically, this line is failing:
if (JNI_CreateJavaVM(&jvm, (void**)&env, &vm_args) != JNI_OK) {
LOG(FATAL) << "Couldn't initialize JVM.";
}
And because the logger is no longer around at that point, we fail to log the message and abort.
The problem appears to be the JNI context trying to initialize the JVM twice.
Full backtrace:
Thread 1 "python3" received signal SIGABRT, Aborted.
0x00007ffff7c8e36c in ?? () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff7c8e36c in ?? () from /usr/lib/libc.so.6
#1 0x00007ffff7c3e838 in raise () from /usr/lib/libc.so.6
#2 0x00007ffff7c28535 in abort () from /usr/lib/libc.so.6
#3 0x00007fff29d4cac0 in logger::Logger::~Logger (this=0x7fffffff6110, __in_chrg=<optimized out>)
at /home/alexb/Projects/hdk/omniscidb/Logger/Logger.cpp:459
#4 0x00007fff29959551 in (anonymous namespace)::JVM::createJVM (max_mem_mb=<optimized out>)
at /home/alexb/Projects/hdk/omniscidb/Calcite/CalciteJNI.cpp:144
#5 (anonymous namespace)::JVM::getInstance (max_mem_mb=<optimized out>)
at /home/alexb/Projects/hdk/omniscidb/Calcite/CalciteJNI.cpp:88
#6 CalciteJNI::Impl::Impl (this=0x5555564e1060, schema_provider=..., udf_filename=..., calcite_max_mem_mb=<optimized out>)
at /home/alexb/Projects/hdk/omniscidb/Calcite/CalciteJNI.cpp:171
#7 0x00007fff2995ab13 in std::make_unique<CalciteJNI::Impl, std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long&> ()
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/unique_ptr.h:857
#8 CalciteJNI::CalciteJNI (this=0x555556a04190, schema_provider=..., udf_filename=..., calcite_max_mem_mb=1024)
at /home/alexb/Projects/hdk/omniscidb/Calcite/CalciteJNI.cpp:602
#9 0x00007fff2997906e in __gnu_cxx::new_allocator<CalciteJNI>::construct<CalciteJNI, std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, unsigned long&> (this=<optimized out>, __p=0x555556a04190)
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/ext/new_allocator.h:146
#10 std::allocator_traits<std::allocator<CalciteJNI> >::construct<CalciteJNI, std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, unsigned long&> (__a=..., __p=0x555556a04190)
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/alloc_traits.h:483
#11 std::_Sp_counted_ptr_inplace<CalciteJNI, std::allocator<CalciteJNI>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, unsigned long&> (
__a=..., this=0x555556a04180)
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/shared_ptr_base.h:548
#12 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<CalciteJNI, std::allocator<CalciteJNI>, std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, unsigned long&> (__a=...,
__p=<optimized out>, this=<optimized out>)
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/shared_ptr_base.h:679
#13 std::__shared_ptr<CalciteJNI, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<CalciteJNI>, std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, unsigned long&> (__tag=...,
this=<optimized out>)
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/shared_ptr_base.h:1344
#14 std::shared_ptr<CalciteJNI>::shared_ptr<std::allocator<CalciteJNI>, std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, unsigned long&> (__tag=..., this=<optimized out>)
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/shared_ptr.h:359
#15 std::allocate_shared<CalciteJNI, std::allocator<CalciteJNI>, std::shared_ptr<SchemaProvider>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, unsigned long&> (__a=...)
at /home/alexb/.conda/envs/omnisci-dev/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/shared_ptr.h:702
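The getInstance path in frame #5 is meant to create the JVM once per process; every later caller should reuse the existing instance. The intended pattern, sketched in Python with illustrative names:

```python
import threading

class JVM:
    # The JVM can only be created once per process, so later calls must
    # reuse the existing instance instead of re-creating it.
    _instance = None
    _lock = threading.Lock()

    def __init__(self, max_mem_mb: int):
        self.max_mem_mb = max_mem_mb

    @classmethod
    def get_instance(cls, max_mem_mb: int = 1024) -> "JVM":
        # Double-creation is exactly the failure seen above: the lock plus
        # the None check guarantee a single construction per process.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(max_mem_mb)
            return cls._instance
```

The C++ side already routes creation through JVM::getInstance, so the question is why a second initialization path reaches JNI_CreateJavaVM.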
In case of an Arrow data import error in PyHDK, we just get a segfault with no proper diagnostic message. Python exceptions would be much nicer.
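A minimal sketch of the desired behavior, assuming a hypothetical import_arrow_table wrapper around the existing storage call (all names here are illustrative):

```python
def import_arrow_table(storage, table, name: str, options):
    # Hypothetical wrapper: convert a low-level import failure into a
    # descriptive Python exception instead of crashing the process.
    try:
        storage.importArrowTable(table, name, options)
    except Exception as e:
        raise RuntimeError(f"Arrow import failed for table {name!r}: {e}") from e
```

In practice this would mean catching the C++ exception at the Cython boundary and translating it, rather than letting it escape.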
Need to enable PR checks in the HDK CI.
For Taxi Q3/Q4, we need to determine how to pull the Date/Time runtime into SPIRV. For CUDA, we compile the extension functions into a CUDA FatBinary at build time, then use the CUDA linker. We could follow a similar approach with SPIRV, or move the time extraction functions to the module and inline them during the JIT process. The downside to this could be increased module compile time (though there are some optimizations meant to keep such increases to a minimum), so we are considering building a benchmark to test.
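If we do build such a benchmark, a minimal timing harness might look like this (compile_fn stands for a hypothetical zero-arg hook into the real JIT compile step):

```python
import timeit

def min_compile_time(compile_fn, repeat: int = 5) -> float:
    # Time the compile step several times and take the minimum, the usual
    # low-noise estimate for a deterministic operation. number=1 because a
    # single compile is already the unit of interest.
    return min(timeit.repeat(compile_fn, number=1, repeat=repeat))
```

Running it once with the extension functions inlined into the module and once without would quantify the compile-time cost of the inlining approach.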
There's a bug in how we process steps if multiple threads are running.
There also seems to be a flaw in recompilation when QueryMustRunOnCPU is thrown.
ArrowBasedExecutionTest fails:
2022-11-17T08:53:04.345347 F 2829187 0 0 RelAlgExecutor.cpp:622 Check failed: co.device_type == ExecutorDeviceType::GPU
In the unit tests, we delete storage between tests. This deletes the DataMgr, but the Executor ends up with a pointer to the old DataMgr. The result is a segfault the next time an Executor is created, because the Python Executor class calls getExecutor, which pulls from the Executor pool.
Needs investigation.
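One way to avoid the stale pointer, sketched in Python with hypothetical names: key the executor pool on the DataMgr an executor was built with, so a rebuilt DataMgr can never receive a cached executor that still points at the deleted one.

```python
_executor_pool: dict = {}

class Executor:
    # Stand-in for the real Executor; it captures the DataMgr at creation,
    # which is exactly the pointer that goes stale in the bug above.
    def __init__(self, data_mgr):
        self.data_mgr = data_mgr

def get_executor(data_mgr) -> Executor:
    # Keying on the DataMgr's identity means a new DataMgr always yields
    # a fresh Executor instead of one holding a dangling reference.
    key = id(data_mgr)
    if key not in _executor_pool:
        _executor_pool[key] = Executor(data_mgr)
    return _executor_pool[key]
```

The real fix likely also needs to evict pool entries when storage is deleted, since id() keys can be reused after the object is garbage-collected.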
Build library exposing required endpoints and Python wrapper
Compiler error when bringing in the extract-from-time code:
heterogeneous-data-kernels/omniscidb/QueryEngine/ExtractFromTime.cpp: In function 'int64_t ExtractFromTime(ExtractField, int64_t)':
heterogeneous-data-kernels/omniscidb/QueryEngine/ExtractFromTime.cpp:156:1: error: inlining failed in call to always_inline 'int64_t extract_epoch(int64_t)': function body can be overwritten at link time
156 | extract_epoch(const int64_t timeval) {
| ^~~~~~~~~~~~~
heterogeneous-data-kernels/omniscidb/QueryEngine/ExtractFromTime.cpp:270:27: note: called from here
270 | return extract_epoch(timeval);
| ~~~~~~~~~~~~~^~~~~~~~~
Add GPU manager initialization and options for controlling device selection at the user level.
When running inside a Jupyter notebook we get:
[3] data_mgr = pyhdk.storage.DataMgr()
[4] data_mgr.registerDataProvider(storage)
----> [6] calcite = pyhdk.sql.Calcite(storage)
[7] executor = pyhdk.Executor(data_mgr)
[9] import pyarrow
File _sql.pyx:36, in pyhdk._sql.Calcite.__cinit__()
RuntimeError: Couldn't initialize JVM.
Here is the scenario:
conda env remove -n omnisci-dev
conda env update -f omniscidb/scripts/mapd-deps-conda-dev-env.yml
git clone https://github.com/intel-ai/modin.git
conda activate omnisci-dev
mamba install -c conda-forge pyhdk
cd modin/
git checkout ienkovich/pyhdk
pip install -e .
cd ..
python python/tests/modin/modin_smoke_test.py
Here is the error:
UserWarning: Distributing <class 'list'> object. This may take some time.
FutureWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
0 12
Name: a, dtype: int64
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f2bcc5c3b85, pid=2895537, tid=2895537
#
# JRE version: OpenJDK Runtime Environment (11.0.15) (build 11.0.15-internal+0-adhoc..src)
# Java VM: OpenJDK 64-Bit Server VM (11.0.15-internal+0-adhoc..src, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C [libjimage.so+0x2b85] ImageStrings::find(Endian*, char const*, int*, unsigned int)+0x65
#
# Core dump will be written. Default location: /localdisk2/afedotov/git/hdk/core
#
# An error report file with more information is saved as:
# /localdisk2/afedotov/git/hdk/hs_err_pid2895537.log
#
# If you would like to submit a bug report, please visit:
# https://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)
Currently, ArrowStorage fails to import such data due to improper schema checks. But other issues might also exist in the actual data import.
C++ exception with description "Mismatched type for column col4: timestamp[s] vs. time32[s]" thrown in the test body.
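A proper schema check would compare column types up front and raise a descriptive error, rather than letting the import crash deeper in the C++ code. A Python sketch of the idea (types compared as strings purely for illustration):

```python
def check_column_types(expected: dict, actual: dict) -> None:
    # Validate the incoming Arrow schema against the table schema before
    # importing any data, producing the same kind of message as the C++
    # exception quoted above instead of a segfault.
    for col, exp_type in expected.items():
        act_type = actual.get(col)
        if act_type != exp_type:
            raise TypeError(
                f"Mismatched type for column {col}: {exp_type} vs. {act_type}"
            )
```

The real implementation would compare pyarrow DataType objects, not strings, and would also need to allow safe implicit casts (e.g. widening integer types).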
Our current flow assumes an exact location for the build folder. This is a request to lift that restriction to allow something like this:
cd /my/build/folder
cmake /path/to/hdk
The method insertData depends on Catalog::getTableEpochs for error handling. This requires linking Catalog into Fragmenter, while Fragmenter is currently a dependency of data fetch in QueryEngine. We need to elevate the Catalog accesses to remove the dependency.
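One option is to have Fragmenter depend on a narrow interface rather than the full Catalog; then only the interface needs to be linked into the QueryEngine data-fetch chain. A Python sketch of the idea (all names hypothetical):

```python
from typing import List, Protocol

class TableEpochsProvider(Protocol):
    # The single Catalog capability that insertData actually needs
    # for its error handling.
    def get_table_epochs(self, table_id: int) -> List[int]: ...

class Fragmenter:
    def __init__(self, epochs: TableEpochsProvider):
        # Depending on the narrow interface keeps the full Catalog out
        # of Fragmenter's (and hence QueryEngine's) link dependencies.
        self._epochs = epochs

    def insert_data(self, table_id: int) -> List[int]:
        # Capture epochs up front so a failed insert can be rolled back.
        return self._epochs.get_table_epochs(table_id)
```

In C++ this would be an abstract base class implemented by Catalog and passed into Fragmenter, the same dependency-inversion move.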
Enabling this includes successful execution of all OmniSci and HDK tests and integration with Modin.
In preparation for the repo merge, we should add the GitHub Actions from the other repo here, to run under the omniscidb folder. We will need to copy the actions over and update the paths.
Status:
The manylinux2014_x86_64 container does not work because of an outdated repo URL. The container from cibuildwheel cannot be used because it does not have sudo.
It seems people build their own containers for their builds and check them using auditwheel.
After CUDA tests were introduced to CI, we saw failures of the Select.FilterAndSimpleAggregation test. The failure is flaky, but when it fails, it always fails in the same way:
Expected equality of these values:
20
v<int64_t>( run_simple_agg("SELECT COUNT(*) FROM test WHERE MOD(x, 7) <> 7;", dt))
Which is: 22
I found that it's enough to leave only this particular query in the test to reproduce the failure. The query is supposed to return the number of rows but somehow returns a greater value. The input table has 10 fragments with 2 rows each.
I dumped generated IR module and all data copied to a CUDA device. Dumps are the same for good and bad runs. It looks like we run the same code on the same data but get different results.
I was able to reproduce it on an August 30 version of the jit-engine branch, so the problem is not new. I don't know when it was introduced.
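For context on the expected value: MOD(x, 7) is always in [0, 6] for non-negative x, so the predicate MOD(x, 7) <> 7 matches every row, and the count must equal the table's row count (10 fragments × 2 rows = 20). The semantics can be sanity-checked in Python:

```python
# Stand-in data: 20 rows, matching 10 fragments of 2 rows each.
rows = list(range(20))
# x % 7 is in [0, 6] for non-negative x, so the predicate never fails.
count = sum(1 for x in rows if x % 7 != 7)
assert count == len(rows)
```

Since the predicate cannot filter anything out, a result of 22 means extra rows are being counted, consistent with a fragment/row bookkeeping bug on the GPU path rather than a wrong filter.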
Support installing all required dependencies on Linux using vcpkg.