futurewei-cloud / chogori-sql

Horizontally scalable SQL Service built upon Chogori-Platform SKV storage

License: Other

Emacs Lisp 0.01% CMake 0.01% Makefile 0.76% M4 0.25% Shell 0.18% C 84.41% PLpgSQL 6.72% Perl 2.04% C++ 2.69% Yacc 1.47% Lex 0.47% Ruby 0.56% Python 0.18% XSLT 0.14% CSS 0.01% Assembly 0.01% Roff 0.07% sed 0.01% DTrace 0.01% XS 0.02%

chogori-sql's People

Contributors

yazhifeng xieus ivan-avramov daweilics jfunston

chogori-sql's Issues

Add column name to PG column Ref

SKV records are schema-based and are not laid out in projection order, so we need the column name when deserializing an SKV record into a PgTuple.
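
A minimal sketch of why the column name matters, assuming hypothetical stand-in types: `SkvRecord` (a name-to-value map), `ColumnRef`, and `fillTuple` are illustrative only and are not the actual Chogori or PG gate types. The tuple is built in projection order, but each value is located in the record by name.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an SKV record: values are keyed by schema field name.
using SkvRecord = std::map<std::string, std::string>;

// Hypothetical column reference that carries the column name.
struct ColumnRef {
    std::string name;
};

// Build the tuple in projection order by looking each column up by name.
// A column that is missing from the record yields std::nullopt.
std::vector<std::optional<std::string>> fillTuple(
        const SkvRecord& record, const std::vector<ColumnRef>& projection) {
    std::vector<std::optional<std::string>> tuple;
    tuple.reserve(projection.size());
    for (const auto& col : projection) {
        auto it = record.find(col.name);
        if (it == record.end()) {
            tuple.push_back(std::nullopt);
        } else {
            tuple.push_back(it->second);
        }
    }
    return tuple;
}
```

With only a positional index in the column reference, the lookup above would not be possible whenever the projection order differs from the schema order.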

Add SQL entities for PG connector

Define Data type, schema, and expression for PG connector

Implement Create Table DDL

Create a table, either system catalog table or a user table on top of catalog tables.

Remove tablet related YBC PG gate apis

The YBC PG gate APIs include YBC features that we don't need, for example:

YBCStatus YBCPgCreateTableSetNumTablets(YBCPgStatement handle, int32_t num_tablets);

YBCStatus YBCPgCreateTableAddSplitRow(YBCPgStatement handle, int num_cols,
YBCPgTypeEntity **types, uint64_t *data);

YBCStatus YBCPgCreateIndexSetNumTablets(YBCPgStatement handle, int32_t num_tablets);

YBCStatus YBCPgCreateIndexAddSplitRow(YBCPgStatement handle, int num_cols,
YBCPgTypeEntity **types, uint64_t *data);

We need to remove them from the PG Gate APIs and PG code.

Investigate whether we should move secondary index out of PgDml to PgDmlRead

We need to investigate whether this refactoring is worthwhile.

Implement DML logic in PG connector

Implement all the DML logic such as select, insert, update, delete in PG connector.

Change class names and k2 pg namespaces

Change the class names to be more meaningful to chogori-sql, as in the branch jfang-dev, to make merging easier.

Add support to scan SKV by a given list of row ids

Need to implement the logic in pg_op.h and make sure that K2 SKV supports this feature

CHECKED_STATUS PopulateDmlByRowIdOps(const vector *ybctids) {
}

Add retry logic to handle errors

We need a generic retry framework that decides whether an operation or transaction should be retried and what the retry policy should be.
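
One possible shape for such a framework, as a hedged sketch: `OpResult`, `RetryPolicy`, and `runWithRetry` are hypothetical names introduced here, and the bounded-attempts-with-exponential-backoff policy is an assumption rather than a decision recorded in this issue.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical outcome classification; real status codes would come from SKV/K2.
enum class OpResult { Ok, RetryableError, FatalError };

// Hypothetical retry policy: bounded attempts with exponential backoff.
struct RetryPolicy {
    int maxAttempts = 3;
    std::chrono::milliseconds initialBackoff{10};
    double backoffMultiplier = 2.0;
};

// Run op until it succeeds, fails fatally, or the attempt budget is exhausted.
// Returns the result of the last attempt.
OpResult runWithRetry(const std::function<OpResult()>& op, const RetryPolicy& policy) {
    assert(policy.maxAttempts > 0);
    auto backoff = policy.initialBackoff;
    OpResult result = OpResult::FatalError;
    for (int attempt = 1; attempt <= policy.maxAttempts; ++attempt) {
        result = op();
        if (result != OpResult::RetryableError) {
            return result;  // success or non-retryable failure
        }
        if (attempt < policy.maxAttempts) {
            std::this_thread::sleep_for(backoff);
            backoff = std::chrono::milliseconds(
                static_cast<long long>(backoff.count() * policy.backoffMultiplier));
        }
    }
    return result;
}
```

Keeping the "is this retryable?" classification separate from the policy lets the same loop serve both single operations and whole transactions.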

Add sequence table support

Sequence table support

Clean up PgGate code base to remove the yb folder

We should put common, entities, pggate under k2/connector directly and remove the yb folder.

change namespace to database in catalog manager

As discussed on #56, we should change namespace to database.

Remove global namespace yb

We need to remove every place where the whole yb namespace is brought in via

using namespace yb;

Implement DDL logic in PG connector

This does not include the actual logic for creating a table or database, only the code path that sends DDL commands down to the PG connector.

support database level catalog version

Right now, the catalog version is kept at the cluster level and is used to signal that caches need to be refreshed. For finer-grained control, we should ideally keep a catalog version per database. However, this requires changing the ybc logic inside PG. Also pay attention to updates of shared tables such as pg_database.

Introduce SQL read/write operations

Introduce a template read/write operation and base the PG operation on it.

Fix condition expression allocation logic AllocBindConditionExpr() in pg_column

We should use an AND condition as the root of the expression tree and add a new operand to it each time a condition expression is allocated.
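
The intended allocation behavior can be sketched as follows. `CondExpr` and `BindConditions` are simplified, hypothetical types used only for illustration; they are not the actual pg_column or PgExpr classes.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified expression node.
struct CondExpr {
    std::string op;  // e.g. "AND", "=", "<"
    std::vector<std::unique_ptr<CondExpr>> operands;
};

class BindConditions {
public:
    // Lazily create the top-level AND node, then append a fresh operand to it
    // and hand it back so the caller can fill in the actual condition.
    CondExpr* allocBindConditionExpr() {
        if (!root_) {
            root_ = std::make_unique<CondExpr>();
            root_->op = "AND";
        }
        root_->operands.push_back(std::make_unique<CondExpr>());
        return root_->operands.back().get();
    }

    const CondExpr* root() const { return root_.get(); }

private:
    std::unique_ptr<CondExpr> root_;
};
```

Because each operand is heap-allocated, pointers returned by earlier allocations stay valid when later operands are added.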

TPCC-SQL client

We have to choose a TPCC-SQL driver to use and benchmark our SQL milestone

Pg Env default logic

PGGate API get() on future into k23si should have a reasonable timeout.

Otherwise, the call hangs forever. This can happen when PG gate fails to contact the Chogori/SKV layer.

For example, in the following code:
std::shared_ptr SqlCatalogManager::NewTransactionContext() {
std::future txn_future = k2_adapter_->beginTransaction();
std::shared_ptr txn = std::make_shared(txn_future.get()); // hang when K2_adapter can't return a transaction asynchronously.
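
A minimal sketch of a bounded wait, assuming the adapter hands back a `std::future` as in the snippet above. `getWithTimeout` is a hypothetical helper, not an existing PG gate function; it relies only on the standard `std::future::wait_for`.

```cpp
#include <chrono>
#include <future>
#include <optional>

// Wait for a future with an upper bound instead of calling get() unconditionally.
// Returns std::nullopt when the result is not ready within the timeout.
template <typename T>
std::optional<T> getWithTimeout(std::future<T>& fut, std::chrono::milliseconds timeout) {
    if (fut.wait_for(timeout) != std::future_status::ready) {
        return std::nullopt;  // caller can map this to a timeout status
    }
    return fut.get();
}
```

The caller would then turn an empty result into an error status rather than blocking. Note that a future created with a deferred launch policy reports `std::future_status::deferred`, which this sketch also treats as not ready.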

Add secondary Index support

Store the secondary index as another schema. We need to insert the index data into SKV ourselves, since SKV has no secondary index concept.

Might not be needed for Milestone M0.1.

Add logic to maintain cluster level logic

We need to maintain cluster-level state, for example whether the cluster has already been bootstrapped, and the catalog version that indicates when the catalog has been updated.

Bind PgExpr to SKV expression

Right now, we bind PgExpr to an intermediate expression data structure in pg_op_api.h. We should refactor this to bind PgExpr directly to the SKV expression, which avoids an additional conversion when we call the SKV client.

installation and deployment

PG connector API implementation

Add Transaction handler

Clean up YB_ Data types in PG and PG gate

Need to clean up the YB_ data types:

typedef enum PgDataType {
YB_YQL_DATA_TYPE_NOT_SUPPORTED = -1,
YB_YQL_DATA_TYPE_UNKNOWN_DATA = 999,
YB_YQL_DATA_TYPE_NULL_VALUE_TYPE = 0,
YB_YQL_DATA_TYPE_INT8 = 1,
YB_YQL_DATA_TYPE_INT16 = 2,
YB_YQL_DATA_TYPE_INT32 = 3,
YB_YQL_DATA_TYPE_INT64 = 4,
YB_YQL_DATA_TYPE_STRING = 5,
YB_YQL_DATA_TYPE_BOOL = 6,
YB_YQL_DATA_TYPE_FLOAT = 7,
YB_YQL_DATA_TYPE_DOUBLE = 8,
YB_YQL_DATA_TYPE_BINARY = 9,
YB_YQL_DATA_TYPE_TIMESTAMP = 10,
YB_YQL_DATA_TYPE_DECIMAL = 11,
YB_YQL_DATA_TYPE_VARINT = 12,
YB_YQL_DATA_TYPE_INET = 13,
YB_YQL_DATA_TYPE_LIST = 14,
YB_YQL_DATA_TYPE_MAP = 15,
YB_YQL_DATA_TYPE_SET = 16,
YB_YQL_DATA_TYPE_UUID = 17,
YB_YQL_DATA_TYPE_TIMEUUID = 18,
YB_YQL_DATA_TYPE_TUPLE = 19,
YB_YQL_DATA_TYPE_TYPEARGS = 20,
YB_YQL_DATA_TYPE_USER_DEFINED_TYPE = 21,
YB_YQL_DATA_TYPE_FROZEN = 22,
YB_YQL_DATA_TYPE_DATE = 23,
YB_YQL_DATA_TYPE_TIME = 24,
YB_YQL_DATA_TYPE_JSONB = 25,
YB_YQL_DATA_TYPE_UINT8 = 100,
YB_YQL_DATA_TYPE_UINT16 = 101,
YB_YQL_DATA_TYPE_UINT32 = 102,
YB_YQL_DATA_TYPE_UINT64 = 103
} YBCPgDataType;

TPCC-SQL benchmarking

Run and publish TPCC-SQL benchmark results

Add support to indicate the operation is in PG gate or not in PG's explain or explain analyze for query plans

Need to expose information to indicate whether the query plan is running inside PG gate/pushed to external storage or not when we run "explain SQL" or "explain analyze SQL".

Implement K2 Adapter to integrate with SKV client

map SQL operations to SKV read/write requests

allow to pass an isNull flag to PgTuple

We need to add an isNull flag to PgTuple so that we can handle null values.

[SQL] SQL catalog manager

PG connector session

Add tenant support for catalog manager

We don't have user and account support yet, need to add that later.

Support row identifier (ybctid) in k2 adapter and k2 SKV

ybctid is a unique row identifier string that is heavily used in the layers above the k2 adapter. We need to support vending out the identifier for a given row and using the identifier to issue read and write requests. This may require adding support to the SKV client API.

Remove the global std namespace from port.h

It seems port.h brings in the whole std namespace; we should remove that:

namespace std {} // Avoid error if we didn't see std.
using namespace std; // NOLINT

K23SI-PG code integration

Integration of tpcc-sql, chogori-sql, and SKV

Work needed to integrate the components of the SQL milestone and get an end-to-end minimal working transaction

Exposing SKV Query API

Refactor RStatusCode to use K2's HTTP like status code

Need to address the comment from the following PR:

#56

"Considering adding a work item to replace/eliminate this RStatusCode, with K2's HTTP like status code"

Implement create database DDL

Create a DB by cloning the template DB.

Remove TableIdentifier object

There is no need to keep it; consider removing it.

Add logic to pass in configuration parameters

We need to pass in configuration parameters instead of using the defaults. This could be done via the gflags library.

Fine tune logs in PG gate and catalog manager

Change some logs to debug level
Pass in the log level as a parameter, for example via gflags
Use an overloaded method such as toString() to dump object information
Fine tune logs

bootstrap of PG system tables

refactor SessionTransactionContext in Catalog Manager to use lazy logic to start transaction on the first operation

Start the transaction lazily, on the first operation, to reduce the chance of operation timeouts.
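
A hedged sketch of the lazy-start idea: `Txn` and `LazyTransactionContext` are hypothetical stand-ins for the real transaction handle and SessionTransactionContext, and the `beginTxn` callback stands in for whatever call actually begins a transaction.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical transaction handle.
struct Txn {
    int id;
};

// Hypothetical session context that defers beginning the transaction until
// the first operation actually needs it.
class LazyTransactionContext {
public:
    explicit LazyTransactionContext(std::function<std::shared_ptr<Txn>()> beginTxn)
        : beginTxn_(std::move(beginTxn)) {}

    // Start the transaction on first use; later calls reuse the same handle.
    std::shared_ptr<Txn> getTxn() {
        if (!txn_) {
            txn_ = beginTxn_();
        }
        return txn_;
    }

    bool started() const { return txn_ != nullptr; }

private:
    std::function<std::shared_ptr<Txn>()> beginTxn_;
    std::shared_ptr<Txn> txn_;
};
```

Because no transaction exists until `getTxn()` is first called, idle time between creating the context and issuing the first operation does not count against the transaction's lifetime.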

Use future based APIs for Catalog Manager

We would like to have consistent APIs. Right now, only some components use future-based APIs. We might want to convert the catalog manager to future-based APIs later. However, to do that, we need to think about how to support future-based APIs on the server side, since the catalog manager will eventually be a remote server.