vertica / vertica-sqlalchemy-dialect Goto Github PK

View Code? Open in Web Editor NEW

2.0 7.0 7.0 116 KB

Official Vertica SQLAlchemy dialect to connect Vertica and SQLAlchemy applications.

Home Page: https://pypi.org/project/vertica-sqlalchemy-dialect/

License: Apache License 2.0

Shell 3.46% Python 96.54%

database dialect python sqlalchemy

vertica-sqlalchemy-dialect's Introduction

Vertica SQLAlchemy Dialect

Vertica dialect for SQLAlchemy uses the pure-Python DB-API driver vertica-python, to connect a Vertica database and SQLAlchemy applications.

This repo was derived from a series of chained forks as outlined in the Acknowledgement.

Warning This dialect currently doesn't not have state-of-the-art features and support maybe limited based on Vertica developer availability. However, we encourage you to raise a PR to add new features that may help your SQLAlchemy application.

Prerequisites

You will need the following softwares to run, build and test the dialect. Everything apart from Python and pip can be installed via pip itself.

Python 3.x or higher
pip 22 or higher
sqlalchemy>=1.3.24,<=1.4.44
vertica-python 1.1.1 or higher

Vertica-Python

vertica-python is needed to use the Vertica-SQLAlchemy-Dialect. The connector does not need to be installed as the dialect installation takes care of it.

Note: We recommend using the vertica-python connector. However, the dialect also allows connecting using pyodbc. More instructions are at the end of this README.

Installing Vertica SQLAlchemy Dialect

The Vertica SQLAlchemy package can be installed from the public PyPI repository using pip:

pip install --upgrade vertica-sqlalchemy-dialect

pip automatically installs all required modules, including vertica-python.

Verifying Your Installation

Create a file (e.g. validate.py) that contains the following Python sample code, which connects to Vertica and displays the Vertica version:

Using vertica-python

from sqlalchemy import create_engine

engine = create_engine(
    'vertica+vertica_python://{user}:{password}@{host}:{port}/{db}'.format(
        user='<your_user_login_name>',
        password='<your_password>',
        host='<your_Host_IP>',
        port='<your_db_port>',
        db='your_db_name'
    )
)
try:
    connection = engine.connect()
    results = connection.execute('select version()').fetchone()
    print(results[0])
finally:
    connection.close()
    engine.dispose()

Replace the credentials with the values for Vertica DB and user.

For more details, see Connection Parameters.
Execute the sample code. For example, if you created a file named validate.py:
```
python validate.py
```
The Vertica version (e.g. v12.0.1) should be displayed.

Parameters and Behavior

As much as possible, Vertica SQLAlchemy provides compatible functionality for SQLAlchemy applications. For information on using SQLAlchemy, see the SQLAlchemy documentation.

Note: Current state of the dialect only supports metadata functions. It is still under development.

However, Vertica SQLAlchemy also provides specific parameters and behavior, which are described in the following sections.

Connection Parameters

Vertica SQLAlchemy Dialect uses the following syntax for the connection string used to connect to Vertica and initiate a session:

'vertica+vertica_python://<user>:<password>@<host_name>/<database_name>'

Where:

<user> is the login name for your Vertica user.
<password> is the password for your Vertica user.
<host_name> is the IP/FQDN of your Vertica Host.
<database_name> is the name of your Vertica Database.

You can optionally specify the initial database and schema for the Vertica session by including them at the end of the connection string, separated by /. You can also specify other supported parameters by vertica-python at the end of the connection string:

'vertica+vertica_python://<user>:<password>@<host_name>/<database_name>?session_label=sqlalchemy&connection_load_balance=1'

For more information, check out the connection options of vertica-python.

Opening and Closing Connection

Open a connection by executing engine.connect(); avoid using engine.execute(). Make certain to close the connection by executing connection.close() before engine.dispose(); otherwise, the Python Garbage collector removes the resources required to communicate with Vertica, preventing the Python connector from closing the session properly.

# Example
engine = create_engine(...)
connection = engine.connect()
try:
    connection.execute(<SQL>)
finally:
    connection.close()
    engine.dispose()

Using pyodbc instead of vertica-python

You may use pyodbc instead of vertica-python for the connection.

Create a Vertica DSN

You will need to have a Vertica ODBC driver installed from Vertica-Client-Drivers. For steps to install ODBC for Vertica, follow official Vertica Docs.

For example, you will need to configure these files with your credentials:

/etc/vertica.ini

[Driver]
ErrorMessagesPath = /opt/vertica/lib64/
ODBCInstLib = /usr/lib/x86_64-linux-gnu/libodbcinst.so
DriverManagerEncoding=UTF-16

~/.odbc.ini

[ODBC Data Sources]
vertica = "My Database"

[verticadsn]
Description = My Database
Driver = /opt/vertica/lib64/libverticaodbc.so
Database = docker
Servername = 127.0.0.1
UID = dbadmin
PWD =

Then use the Vertica DSN in a file like so:

from sqlalchemy import create_engine

engine = sa.create_engine('vertica+pyodbc://@verticadsn')
try:
    res = engine.connect().scalar('select version();')
    print(res)
finally:
    connection.close()
    engine.dispose()

This should display the Vertica version info: "Vertica Analytic Database v12.0.0-0".

vertica-sqlalchemy-dialect's People

Contributors

Stargazers

Watchers

Forkers

sitingren rajasekhar-vuppala vishalksimplify dingqiangliu rvoona

vertica-sqlalchemy-dialect's Issues

vertica_python not working with SQLAlchemy

When I try to connect with latest version of sqlalchemy 2.0, we have faced the issue, Older version it's fine.

import sqlalchemy as sa
engine = sa.create_engine('vertica+vertica_python://:data@****:5433/')

conn = engine.connect()

sqlalchemy.exc.ObjectNotExecutableError: Not an executable object: 'select version()'

SQLAlchemy Version in Use
2.0.2

DBAPI (i.e. the database driver)
python_vertica

Database Vendor and Major Version
Vertica 10.1.0

Python Version
3.11

Operating system
Window/Linux
To Reproduce
import sqlalchemy as sa

create engine

engine = sa.create_engine('vertica+vertica_python://:data@****:5433/')
conn = engine.connect()

sqlalchemy 2.0 support

Hello,

What are the gaps to support sqlalchemy 2.0? Is there a plan for it?

Thank you

Trying to send DF to Vertica


engine = sqlalchemy.create_engine('vertica+vertica_python://username:password@localhost:3306/db_name')
connection  = engine.raw_connection()

df.to_sql("df", if_exists="replace", con=connection, schema="schema_name")

ValueError Traceback (most recent call last)
File ~\AppData\Roaming\Python\Python311\site-packages\pandas\io\sql.py:2675, in SQLiteDatabase.execute(self, sql, params)
2674 try:
-> 2675 cur.execute(sql, *args)
2676 return cur

File ~\AppData\Roaming\Python\Python311\site-packages\vertica_python\vertica\cursor.py:188, in Cursor.handle_ctrl_c..wrap(self, *args, **kwargs)
187 try:
--> 188 return func(self, *args, **kwargs)
189 except KeyboardInterrupt:

File ~\AppData\Roaming\Python\Python311\site-packages\vertica_python\vertica\cursor.py:265, in Cursor.execute(self, operation, parameters, use_prepared_statements, copy_stdin, buffer_size)
264 if parameters:
--> 265 operation = self.format_operation_with_parameters(operation, parameters)
266 self._execute_simple_query(operation)

File ~\AppData\Roaming\Python\Python311\site-packages\vertica_python\vertica\cursor.py:739, in Cursor.format_operation_with_parameters(self, operation, parameters, is_copy_data)
738 if parameters and '%s' not in operation:
--> 739 raise ValueError(f'Invalid SQL: {operation}'
740 "\nHINT: When argument 'parameters' is a tuple/list, "
741 'variables in SQL should be specified with positional format (%s) placeholders. '
742 'Question mark (?) placeholders have to be used with use_prepared_statements=True setting.')
743 tlist = []

ValueError: Invalid SQL:
SELECT
name
FROM
sqlite_master
WHERE
type IN ('table', 'view')
AND name=?;

HINT: When argument 'parameters' is a tuple/list, variables in SQL should be specified with positional format (%s) placeholders. Question mark (?) placeholders have to be used with use_prepared_statements=True setting.

The above exception was the direct cause of the following exception:

DatabaseError Traceback (most recent call last)
Cell In[56], line 1
----> 1 df.to_sql("df", if_exists="replace", con=connection, schema="rtinkler")

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\util_decorators.py:333, in deprecate_nonkeyword_arguments..decorate..wrapper(*args, **kwargs)
327 if len(args) > num_allow_args:
328 warnings.warn(
329 msg.format(arguments=_format_argument_list(allow_args)),
330 FutureWarning,
331 stacklevel=find_stack_level(),
332 )
--> 333 return func(*args, **kwargs)

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\core\generic.py:3081, in NDFrame.to_sql(self, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)
2886 """
2887 Write records stored in a DataFrame to a SQL database.
2888
(...)
3077 [(1,), (None,), (2,)]
3078 """ # noqa: E501
3079 from pandas.io import sql
-> 3081 return sql.to_sql(
3082 self,
3083 name,
3084 con,
3085 schema=schema,
3086 if_exists=if_exists,
3087 index=index,
3088 index_label=index_label,
3089 chunksize=chunksize,
3090 dtype=dtype,
3091 method=method,
3092 )

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\io\sql.py:842, in to_sql(frame, name, con, schema, if_exists, index, index_label, chunksize, dtype, method, engine, **engine_kwargs)
837 raise NotImplementedError(
838 "'frame' argument should be either a Series or a DataFrame"
839 )
841 with pandasSQL_builder(con, schema=schema, need_transaction=True) as pandas_sql:
--> 842 return pandas_sql.to_sql(
843 frame,
844 name,
845 if_exists=if_exists,
846 index=index,
847 index_label=index_label,
848 schema=schema,
849 chunksize=chunksize,
850 dtype=dtype,
851 method=method,
852 engine=engine,
853 **engine_kwargs,
854 )

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\io\sql.py:2851, in SQLiteDatabase.to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype, method, engine, **engine_kwargs)
2840 raise ValueError(f"{col} ({my_type}) not a string")
2842 table = SQLiteTable(
2843 name,
2844 self,
(...)
2849 dtype=dtype,
2850 )
-> 2851 table.create()
2852 return table.insert(chunksize, method)

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\io\sql.py:984, in SQLTable.create(self)
983 def create(self) -> None:
--> 984 if self.exists():
985 if self.if_exists == "fail":
986 raise ValueError(f"Table '{self.name}' already exists.")

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\io\sql.py:970, in SQLTable.exists(self)
969 def exists(self):
--> 970 return self.pd_sql.has_table(self.name, self.schema)

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\io\sql.py:2866, in SQLiteDatabase.has_table(self, name, schema)
2855 wld = "?"
2856 query = f"""
2857 SELECT
2858 name
(...)
2863 AND name={wld};
2864 """
-> 2866 return len(self.execute(query, [name]).fetchall()) > 0

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\io\sql.py:2687, in SQLiteDatabase.execute(self, sql, params)
2684 raise ex from inner_exc
2686 ex = DatabaseError(f"Execution failed on sql '{sql}': {exc}")
-> 2687 raise ex from exc

DatabaseError: Execution failed on sql '
SELECT
name
FROM
sqlite_master
WHERE
type IN ('table', 'view')
AND name=?;
': Invalid SQL:
SELECT
name
FROM
sqlite_master
WHERE
type IN ('table', 'view')
AND name=?;

HINT: When argument 'parameters' is a tuple/list, variables in SQL should be specified with positional format (%s) placeholders. Question mark (?) placeholders have to be used with use_prepared_statements=True setting

dependency conflict

The sqlalchemy version dependency is different from docs and src.

docs:

Prerequisites
You will need the following softwares to run, build and test the dialect. Everything apart from Python and pip can be installed via pip itself.

Python 3.x or higher

pip 22 or higher

SQLAlchemy > 1.3.24 and < 2.0

vertica-python 1.1.1 or higher

src:

    install_requires=(
        'six >= 1.10.0',
        'sqlalchemy<2.0.0,>=1.4.0',
        'vertica-python>=1.1.1'
    ),

Is it must be higher than 1.4.0?
I would like to use this dialect for Superset, but apache-superset is not supported sqlalchemy 1.4.0 now.

if I install sqlalchemy==1.4.46:

apache-superset 2.0.1 requires sqlalchemy!=1.3.21,<1.4,>=1.3.16, but you have sqlalchemy 1.4.46 which is incompatible.

if I install sqlalchemy== 1.3.16:

vertica-sqlalchemy-dialect 0.0.1 depends on sqlalchemy<2.0.0 and >=1.4.0

info_cache is now required for to_sql?

Hello, it looks like SQLAlchemy has changed requirements on the fly without consulting the DB vendors.

They call the dialect method has_table with extra parameter info_cache. But vertica-sqlalchemy-dialect doesn't expect this parameter. I was wondering if it's possible to add **kw dummy parameter to all the methods in base.py so that sqlalchemy doesn't fail?

I raised the issue with SQLAlchemy but obviously they blame it on Vertica. Here's the link:
sqlalchemy/sqlalchemy#11135