duckdb / extension-template

Template for DuckDB extensions to help you develop, test and deploy a custom extension

License: MIT License

CMake 6.39% Makefile 33.97% Python 21.80% C++ 14.50% JavaScript 6.17% Shell 17.17%

extension-template's Introduction

DuckDB

DuckDB is a high-performance analytical database system. It is designed to be fast, reliable, portable, and easy to use. DuckDB provides a rich SQL dialect, with support far beyond basic SQL. DuckDB supports arbitrary and nested correlated subqueries, window functions, collations, complex types (arrays, structs), and more. For more information on using DuckDB, please refer to the DuckDB documentation.

Installation

If you want to install and use DuckDB, please see our website for installation and usage instructions.

Data Import

For CSV files and Parquet files, data import is as simple as referencing the file in the FROM clause:

SELECT * FROM 'myfile.csv';
SELECT * FROM 'myfile.parquet';

Refer to our Data Import section for more information.

SQL Reference

The website contains a reference of functions and SQL constructs available in DuckDB.

Development

For development, DuckDB requires CMake, Python 3, and a C++11-compliant compiler. Run make in the root directory to compile the sources. To build a non-optimized debug version, use make debug. After making changes, run make unit and make allunit to verify that your version works properly. To test performance, run BUILD_BENCHMARK=1 BUILD_TPCH=1 make and then run the standard benchmarks from the root directory by executing ./build/release/benchmark/benchmark_runner. The details of the benchmarks are in our Benchmark Guide.

Please also refer to our Build Guide and Contribution Guide.

Support

See the Support Options page.

extension-template's People

Contributors

carlopi, jaicewizard, jalateras, jexp, krlmlr, mytherin, nickcrews, pdet, peter-gy, samansmink, szarnyasg, tshauck, ttanay

extension-template's Issues

Python test script does not pass

Steps to reproduce:

  1. Create a new repo from latest template
  2. run make
  3. pip install duckdb --pre --upgrade
  4. run python3 -m pytest under dir test/python

Expected result:

python test passed

Actual result:

  conn.execute(f"load '{extension_binary}'")

E duckdb.duckdb.InvalidInputException: Invalid Input Error: Initialization function "quack_init" from file "/path/to/duckdb-template/build/release/extension/quack/quack.duckdb_extension" threw an exception: "INTERNAL Error: Missing DB manager"
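
One thing that may be worth ruling out for a report like this is a mismatch between the DuckDB tag the extension was built against and the version of the duckdb Python package that pip installed. The sketch below is only a diagnostic idea, not a confirmed cause of the error above. The helper names normalize and versions_match are hypothetical, and the submodule tag is passed in by hand.

```python
def normalize(version: str) -> str:
    """Strip whitespace and a leading 'v' so 'v0.9.1' and '0.9.1' compare equal."""
    version = version.strip()
    return version[1:] if version.startswith("v") else version


def versions_match(submodule_tag: str, package_version: str) -> bool:
    """Return True when the build tag and the Python package version are identical."""
    return normalize(submodule_tag) == normalize(package_version)


# Usage idea (needs the third-party duckdb package, so it is left as a comment):
#   import duckdb
#   print(versions_match("v0.9.1", duckdb.__version__))
```

A pre-release version string from pip install --pre would not compare equal to a release tag, so this check would flag it.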

How do I dynamically link libraries in extension?

I'm trying to build an ODBC scanner extension and I need to link to unixodbc. I can't figure out how to link it in CMakeLists.txt. I get an undefined reference during the build.

https://github.com/rupurt/odbc-scanner-duckdb-extension/pull/2/files#diff-1e7de1ae2d059d21e1dd75d5812d5a34b0222cef273b7c3a2af62eb747f9d20aR10

[ 78%] Linking CXX executable test_sqlite3_api_wrapper
/usr/bin/ld: ../../src/libduckdb.so: undefined reference to `SQLDriverConnect'
collect2: error: ld returned 1 exit status
gmake[3]: *** [tools/sqlite3_api_wrapper/CMakeFiles/test_sqlite3_api_wrapper.dir/build.make:128: tools/sqlite3_api_wrapper/test_sqlite3_api_wrapper] Error 1

Suggestion: More realistic example function

Hi,

after talking about the repo yesterday, I wanted to check it out at night.
Unfortunately, not being a C++ coder and having no DuckDB API/SPI docs available (add a link in the README and code?),
it took me about an hour to turn the static function with no parameters into something that takes a parameter and is executed for each row of input.

Trying to use the existing extensions as inspiration was only minimally helpful because their internal structure (even the simple excel-format function) was vastly different from the boilerplate function, using the BinaryExecutor for each vectorized chunk of input. Figuring out how to access the content and how to create and return a string_t also took a lot of searching in the extension examples. I found it with StringVector::EmptyString(result, size) and then memcpy.

My suggestions would be (I also have a PR ready if you want).

  • turn it into a kind of "Hello World" function that takes a single string argument
  • return a friendly message using the argument, e.g. "Hello "+name+"!"
  • the name boilerplate has negative connotations and is not welcoming
  • I would make it something friendly and fun, e.g. quack(name) -> "Quack "+name+" 🐥"
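
To make the proposed behaviour concrete, here is a small Python model of it. This is not the C++ extension code. It only mirrors the suggested quack(name) semantics applied to each row of input. The helper quack_column and its pass-through of None (standing in for SQL NULL) are assumptions made for illustration.

```python
def quack(name: str) -> str:
    """Model of the proposed scalar function: greet a single name."""
    return f"Quack {name} 🐥"


def quack_column(names):
    """Apply quack to every row of a column; None is passed through unchanged (assumption)."""
    return [None if name is None else quack(name) for name in names]


print(quack_column(["Jane", None, "Sam"]))
```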

Cheers, Michael

#3

Error when loading the extension with the official duckdb binary!

When I follow the README to check out and compile the extension-template, I get a binary of the quack extension under extension-template/build/release/extension/quack/quack.duckdb_extension. This extension can be successfully loaded and used with the duckdb binary built at the same time (extension-template/build/release/duckdb).

I currently build against the duckdb tag v0.9.1 (which is currently the default of the submodule). If I download the binary from https://github.com/duckdb/duckdb/releases/download/v0.9.1/duckdb_cli-linux-amd64.zip and use the duckdb binary it contains, I get the following error when I try to LOAD 'quack';:

D INSTALL 'quack.duckdb_extension';
D LOAD 'quack';
Error: Invalid Input Error: Initialization function "quack_init" from file "/home/jr/.duckdb/extensions/v0.9.1/linux_amd64_gcc4/quack.duckdb_extension" threw an exception: "INTERNAL Error: Missing DB manager"
D

I tried this with several Linux-based build environments and always hit the same issue. I also have the same problem with a different custom extension. It works with the duckdb binary built at the same time as the extension, but not with the release version (which most users have). This prevents distribution of the binary.

Any idea how to fix that?

Some nice-to-have features

This issue groups a few nice-to-have features we would like to see added. The issue will be updated as we think of more. Feel free to contribute ideas.

  • Update notifier: third-party extension maintainers should be notified in some way of a new DuckDB release.
  • Some standardized way of getting documentation on an extension (either a block of text or a URL to the docs)
  • Auto-install extension dependencies
  • Use DuckDB's clang-format, clang-tidy, and editorconfig
  • Ensure releases also produce binaries as artifacts on GitHub for all builds
  • Change extension script path: it currently prefixes all paths with the extension namespace. This should be a configurable path so that multiple extensions can be deployed to the same directory, allowing a single custom_extension_repository setting to easily access a bunch of extensions that are managed separately.
  • CI to ensure that linkage of both the static extension and the loadable extension is correct: especially when an extension links against other libraries, errors here are easy to make and render the extension unloadable.
  • Explain how the build process of extensions works: clarify that DuckDB's CMake file is the root CMake file
  • Document and/or provide a template for linking dependencies to both the static lib and the loadable extension

Support for multiple languages

This is a great start, I admire what you folks are doing! I wonder if there is any plan to provide an API to register Python functions as UDFs or UDTFs that we can call from SQL. The Python binding has a map that lets us process the data in Python, but the SQL interface doesn't support registering Python functions yet AFAIK, and the only way to support UDFs and UDTFs seems to be native DuckDB extensions, if I'm not mistaken.

Help: explain portability of an extension

Hey there! I am working on https://github.com/NickCrews/libpostal-duckdb

I have a question about distribution: how portable is an extension file? If I build an extension for

  • x86
  • macOS
  • duckdb 0.9.2

Is that going to work on Windows x86 machine using duckdb 0.9.2? Or do I need to publish a built version for every combo of (arch, OS, duckdb version)?
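
The install paths quoted in error messages elsewhere on this page contain both a DuckDB version and a platform name (for example v0.9.1/linux_amd64_gcc4), which hints that binaries are looked up per version and per platform. If one build per combination does turn out to be needed, the set of builds can be enumerated as in the sketch below. The platform names other than linux_amd64 and the helper build_matrix are illustrative assumptions, not taken from this thread.

```python
from itertools import product


def build_matrix(platforms, duckdb_versions):
    """Return one entry per (platform, DuckDB version) combination."""
    return [
        {"platform": platform, "duckdb_version": version}
        for platform, version in product(platforms, duckdb_versions)
    ]


# Illustrative inputs; adjust to the platforms and versions you actually target.
matrix = build_matrix(["linux_amd64", "osx_amd64", "windows_amd64"], ["v0.9.2"])
for entry in matrix:
    print(entry["platform"], entry["duckdb_version"])
```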

The reason I ask is because I am trying to incorporate libpostal as a dependency. It isn't on vcpkg. So I started with git submodules. But then I found an unofficial homebrew recipe that would be much easier, but it would limit the build to working on Linux and macOS. So if I need to build on Windows, that is a nonstarter.

I figured that once we have this figured out I can submit a PR that clarifies this in the README, as a C NOOB this would have helped me. Thank you!

Windows VC2022 - Compiler error code 9009 for postgres_scanner (maybe a template issue?)

I started out looking for guidance, not sure how to include postgres_scanner in the full duckdb build in a Windows 10, 64-bit environment. My hope is that the issues I had might help others navigate through. I suspect part of what I experienced was a cmake conversion glitch, maybe in the template. If it isn't, please just reject this.

While building the labs postgres_scanner, I found an issue in the .vcxproj: it makes a call to "sh" in a Windows environment, where "sh" doesn't exist. I do not know nearly enough about the cmake process to suggest a fix, but I do have a mitigation to offer.

I edited the postgres_scanner_loadable_extension.vcxproj, replacing:
sh pgconfigure

Instead I employed the bash environment that comes with git, e.g.:
"C:\Program Files\Git\bin\bash.exe" -c "sh pgconfigure"

This worked and I was able to run into the next issue, versions.
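
The same fallback can be expressed as a small script-level check. This is a sketch of the mitigation described above, not a fix for the generated project file. The function name configure_command is hypothetical, and the Git Bash path is the one quoted in this report and may differ on other machines.

```python
import shutil

# Path taken from the report above; an assumption for any other installation.
GIT_BASH = r"C:\Program Files\Git\bin\bash.exe"


def configure_command(which=shutil.which):
    """Use plain sh when it is on PATH, otherwise route the call through Git Bash."""
    if which("sh"):
        return ["sh", "pgconfigure"]
    return [GIT_BASH, "-c", "sh pgconfigure"]


print(configure_command())
```

The lookup function is a parameter so the choice can be checked without depending on what is installed locally.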

I've captured a lot of additional detail, but I believe those who are more knowledgeable about cmake and the resulting vcxproj files will understand the mitigation.

Full path:
C:\duckdb\extension\postgres_scanner\postgres_scanner_loadable_extension.vcxproj

Hope this helps the next poor soul trying to build out this extension in windows.

Extension Does not Build for DuckDB v0.8.1

When I create an extension repo and try to build it using DuckDB v0.8.1 (the latest release as of this writing), the extension does not get built. When I attempt to execute the quack() scalar function, it isn't found. Additionally, the extension doesn't exist at build/release/extension/quack/quack.duckdb_extension, though the directory build/release/extension/quack/ does.

Steps to reproduce:

  1. Create a new repo from this template.
  2. Update the DuckDB git submodule to point to the v0.8.1 tag.
  3. run make
  4. Run ./build/release/duckdb
  5. Run select quack('Jane') as result;

Expected result:

The query result documented in the README:

D select quack('Jane') as result;
┌───────────────┐
│    result     │
│    varchar    │
├───────────────┤
│ Quack Jane 🐥 │
└───────────────┘

Actual Result:

D select quack('Jane') as result;
Error: Catalog Error: Scalar Function with name quack does not exist!
Did you mean "acos"?
LINE 1: select quack('Jane') as result;
               ^
D load quack;
Error: IO Error: Extension "/home/mark/.duckdb/extensions/v0.8.1/linux_amd64/quack.duckdb_extension" not found.

Candidate extensions: "icu"
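
The path in the IO error follows a visible pattern: home directory, .duckdb/extensions, DuckDB version, platform, then the extension file. The sketch below rebuilds that path so it can be compared with where the build actually wrote the file. The layout is inferred only from the error messages quoted on this page, and expected_extension_path is a hypothetical helper.

```python
from pathlib import PurePosixPath


def expected_extension_path(home: str, version: str, platform: str, name: str) -> PurePosixPath:
    """Rebuild the lookup path seen in the error message (layout inferred, not documented here)."""
    return (
        PurePosixPath(home)
        / ".duckdb"
        / "extensions"
        / version
        / platform
        / f"{name}.duckdb_extension"
    )


print(expected_extension_path("/home/mark", "v0.8.1", "linux_amd64", "quack"))
```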

edit: formatting

rename is broken

  1. The documentation for set_extension_name.py is wrong: it expects two arguments, not one.

  2. When renaming the extension to something that has an underscore in the name or starts with a lowercase letter, the build fails:

/home/ziereis/db721_reader/build/release/codegen/src/generated_extension_loader.cpp:9:26: error: ‘Db721ScanExtension’ was not declared in this scope; did you mean ‘Db721_scanExtension’?
    9 |         db.LoadExtension<Db721ScanExtension>();
      |                          ^~~~~~~~~~~~~~~~~~
      |                          Db721_scanExtension
/home/ziereis/db721_reader/build/release/codegen/src/generated_extension_loader.cpp:9:45: error: no matching function for call to ‘duckdb::DuckDB::LoadExtension<<expression error> >()’
    9 |         db.LoadExtension<Db721ScanExtension>();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
In file included from /home/ziereis/db721_reader/duckdb/src/include/duckdb/main/extension/generated_extension_loader.hpp:11,
                 from /home/ziereis/db721_reader/build/release/codegen/src/generated_extension_loader.cpp:1:
/home/ziereis/db721_reader/duckdb/src/include/duckdb/main/database.hpp:90:14: note: candidate: ‘template<class T> void duckdb::DuckDB::LoadExtension()’
   90 |         void LoadExtension() {
      |              ^~~~~~~~~~~~~
/home/ziereis/db721_reader/duckdb/src/include/duckdb/main/database.hpp:90:14: note:   template argument deduction/substitution failed:
/home/ziereis/db721_reader/build/release/codegen/src/generated_extension_loader.cpp:9:45: error: template argument 1 is invalid
    9 |         db.LoadExtension<Db721ScanExtension>();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
/home/ziereis/db721_reader/build/debug/codegen/src/generated_extension_loader.cpp:9:26: error: ‘Db721ReaderExtension’ was not declared in this scope; did you mean ‘Db721readerExtension’?
    9 |         db.LoadExtension<Db721ReaderExtension>();
      |                          ^~~~~~~~~~~~~~~~~~~~
      |                          Db721readerExtension
/home/ziereis/db721_reader/build/debug/codegen/src/generated_extension_loader.cpp:9:47: error: no matching function for call to ‘duckdb::DuckDB::LoadExtension<<expression error> >()’
    9 |         db.LoadExtension<Db721ReaderExtension>();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
In file included from /home/ziereis/db721_reader/duckdb/src/include/duckdb/main/extension/generated_extension_loader.hpp:11,
                 from /home/ziereis/db721_reader/build/debug/codegen/src/generated_extension_loader.cpp:1:
/home/ziereis/db721_reader/duckdb/src/include/duckdb/main/database.hpp:90:14: note: candidate: ‘template<class T> void duckdb::DuckDB::LoadExtension()’
   90 |         void LoadExtension() {
      |              ^~~~~~~~~~~~~
/home/ziereis/db721_reader/duckdb/src/include/duckdb/main/database.hpp:90:14: note:   template argument deduction/substitution failed:
/home/ziereis/db721_reader/build/debug/codegen/src/generated_extension_loader.cpp:9:47: error: template argument 1 is invalid
    9 |         db.LoadExtension<Db721ReaderExtension>();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~

I assume it is because of duckdb/src/main/extension/CMakeLists.txt:

# generated_extension_loader.hpp
set(EXT_LOADER_NAME_LIST "")
set(EXT_LOADER_BODY "")
if(NOT ${DISABLE_BUILTIN_EXTENSIONS})
  foreach(EXT_NAME IN LISTS DUCKDB_EXTENSION_NAMES)
    string(TOUPPER ${EXT_NAME} EXT_NAME_UPPERCASE)
    if(${DUCKDB_EXTENSION_${EXT_NAME_UPPERCASE}_SHOULD_LINK})

      # Assumes lowercase input!
      string(REPLACE "_" ";" EXT_NAME_SPLIT ${EXT_NAME})
      set(EXT_NAME_CAMELCASE "")
      foreach(EXT_NAME_PART IN LISTS EXT_NAME_SPLIT)
        string(SUBSTRING ${EXT_NAME_PART} 0 1 FIRST_LETTER)
        string(SUBSTRING ${EXT_NAME_PART} 1 -1 REMAINDER)
        string(TOUPPER ${FIRST_LETTER} FIRST_LETTER)
        set(EXT_NAME_CAMELCASE
            "${EXT_NAME_CAMELCASE}${FIRST_LETTER}${REMAINDER}")
      endforeach()

      set(EXT_LOADER_NAME_LIST "${EXT_LOADER_NAME_LIST},\n\t\"${EXT_NAME}\"")
      set(EXT_LOADER_BODY
          "${EXT_LOADER_BODY}\
    if (extension==\"${EXT_NAME}\") {
        db.LoadExtension<${EXT_NAME_CAMELCASE}Extension>();
        return true;
    }
")
    endif()
  endforeach()
endif()
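
The name conversion in the CMake snippet above can be mirrored in a few lines of Python, which makes it easier to see which class name the generated loader will ask for. This is a reading of the quoted CMake logic, not code from the repository. The function names are hypothetical, and empty parts are tolerated here as a convenience.

```python
def cmake_camel_case(ext_name: str) -> str:
    """Split on '_', uppercase the first letter of each part, keep the rest unchanged."""
    parts = ext_name.split("_")
    return "".join(part[:1].upper() + part[1:] for part in parts)


def loader_class_name(ext_name: str) -> str:
    """Class name that the generated loader passes to db.LoadExtension<...>()."""
    return cmake_camel_case(ext_name) + "Extension"


for name in ["quack", "db721_scan", "db721reader"]:
    print(name, "->", loader_class_name(name))
```

For db721_scan this yields Db721ScanExtension, the name the compiler reports as undeclared, while the renamed sources declare Db721_scanExtension.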

Additionally, the SQL test also fails after renaming:

Filters: /home/ziereis/db721_reader/test/*
[0/1] (0%): /home/ziereis/db721_reader/test/sql/Db721Reader.test
================================================================================
Query failed, but error message did not match expected error message: Catalog Error: Scalar Function with name Db721Reader does not exist! (/home/ziereis/db721_reader/test/sql/Db721Reader.test:6)!
================================================================================
SELECT Db721Reader('Sam');
Actual result:
================================================================================
Catalog Error: Scalar Function with name db721reader does not exist!
Did you mean "decade"?
LINE 1: SELECT Db721Reader('Sam');
               ^


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
unittest is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
/home/ziereis/db721_reader/test/sql/Db721Reader.test
-------------------------------------------------------------------------------
/home/ziereis/db721_reader/duckdb/test/sqlite/test_sqllogictest.cpp:213
...............................................................................

/home/ziereis/db721_reader/test/sql/Db721Reader.test:6: FAILED:
explicitly with message:
  0

[1/1] (100%): /home/ziereis/db721_reader/test/sql/Db721Reader.test              
===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed

The name of the function should be all lowercase (db721reader), but the rename script replaces it with Db721Reader.

Vcpkg dependency management

Vcpkg is a C/C++ dependency manager that works well with CMake. I think it would be a killer feature to have the extension template pre-configured for building extensions with vcpkg. Extensions would have a vcpkg manifest that contains all dependencies. To add a dependency, all that would be required is adding it to the manifest and then using find_package in CMake. This saves a lot of time, because the approach currently used in extensions such as arrow and spatial is essentially to build dependencies from source manually, which is a cumbersome process to get right. I've done some initial testing that seems to work well, with both linking the static extensions and producing the loadable extensions working "out of the box".

Segfault in Python on Linux

Hi --

I'm running into a segfault when trying to load an extension built this way through Python. I think it may be Linux-specific, but I'm not 100% sure. I can load the extension OK when I do it through the CLI in the build folder, just not through Python.

E.g. if I follow the instructions from the README, then run:

python3 ./scripts/set_extension_name.py wt
cd duckdb; git checkout v0.7.0; cd ..
make

This creates an extension at build/release/extension/wt/wt.duckdb_extension. If I then try to use that extension from Python, the code segfaults on Linux.

# tester.py
# pip install duckdb==0.7.0
import duckdb

con = duckdb.connect(config={'allow_unsigned_extensions': True})
con.load_extension('build/release/extension/wt/wt.duckdb_extension')

And running...

python tester.py 
Segmentation fault
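
When reproducing a crash like this, it can help to run the load in a child process so the segfault shows up as an exit status instead of taking down the test runner. The sketch below uses only the standard library. The name run_isolated is hypothetical, and LOAD_SNIPPET simply repeats the tester script above.

```python
import subprocess
import sys

# The reproduction script from above, run in a separate interpreter.
LOAD_SNIPPET = """
import duckdb
con = duckdb.connect(config={'allow_unsigned_extensions': True})
con.load_extension('build/release/extension/wt/wt.duckdb_extension')
"""


def run_isolated(code: str) -> int:
    """Run code in a child Python process and return its exit status."""
    completed = subprocess.run([sys.executable, "-c", code])
    return completed.returncode


# Usage idea: status = run_isolated(LOAD_SNIPPET)
# On POSIX, a negative status -N means the child was killed by signal N.
```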

Create schema in extension transaction not available to TableFunction CreateView

Howdy,

I'm trying to create an ODBC extension that maps an attached database to DuckDB schemas and views. I'm following the structure of the Postgres extension which has been super helpful but it uses the default main schema in the memory catalog. I can get it to work if I start a new transaction for every row of SQLTables but ideally I would like everything to be rolled back if there is a failure. It also works sometimes if I create the schemas in the bind function and the views in the attach function. This makes me think there is some background work or data in a buffer that has not been flushed.

Are there any examples of this you can point me to? Or do you have a recommended approach for this kind of thing?

I have attached a copy of my branch that is causing the issue https://github.com/rupurt/odbc-scanner-duckdb-extension/compare/odbc-attach-schema-not-found#diff-3f910f46660aa2e53cd6309da4fc6ff8fe21edc2321d24271f90932ca708c7b2R274

Cheers 🍻

Use `cookie-cutter`

There's a repository that can create parameterized templates, maybe we can use this instead of the set_extension_name.py script
