futurewei-cloud / chogori-sql
Horizontally scalable SQL Service built upon Chogori-Platform SKV storage
License: Other
SKV records are laid out by schema, not by projection order. We need the column names when deserializing an SKV record into a PgTuple.
Define data types, schemas, and expressions for the PG connector.
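To illustrate why the column names are needed, here is a minimal sketch that reorders values arriving in schema order into projection order by matching on column name. The function `ToProjectionOrder` and the use of plain strings for values are assumptions made for illustration; they are not the connector's actual types.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: `record_values` is in schema order, `projection` lists
// the column names in the order the caller wants them back.
inline std::vector<std::string> ToProjectionOrder(
    const std::vector<std::string>& schema_columns,
    const std::vector<std::string>& record_values,
    const std::vector<std::string>& projection) {
    // Build a name -> position index from the schema.
    std::unordered_map<std::string, std::size_t> index_by_name;
    for (std::size_t i = 0;
         i < schema_columns.size() && i < record_values.size(); ++i) {
        index_by_name[schema_columns[i]] = i;
    }
    std::vector<std::string> out;
    out.reserve(projection.size());
    for (const auto& col : projection) {
        auto it = index_by_name.find(col);
        // An unknown column yields an empty placeholder in this sketch.
        out.push_back(it == index_by_name.end() ? std::string()
                                                : record_values[it->second]);
    }
    return out;
}
```

For a schema `{id, name, age}` and a projection `{age, id}`, the helper returns the `age` value first and the `id` value second, regardless of the order in which the record stores them.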
Create a table, either system catalog table or a user table on top of catalog tables.
The YBC PG Gate APIs include YBC features that we don't need, for example:
YBCStatus YBCPgCreateTableSetNumTablets(YBCPgStatement handle, int32_t num_tablets);
YBCStatus YBCPgCreateTableAddSplitRow(YBCPgStatement handle, int num_cols,
YBCPgTypeEntity **types, uint64_t *data);
YBCStatus YBCPgCreateIndexSetNumTablets(YBCPgStatement handle, int32_t num_tablets);
YBCStatus YBCPgCreateIndexAddSplitRow(YBCPgStatement handle, int num_cols,
YBCPgTypeEntity **types, uint64_t *data);
We need to remove them from the PG Gate APIs and PG code.
Need to investigate whether we should do the refactor.
Implement all the DML logic such as select, insert, update, delete in PG connector.
Change the class names to be more meaningful to chogori sql as in the branch jfang-dev to make merge easier.
Need to implement the logic in pg_op.h and make sure that K2 SKV supports this feature
CHECKED_STATUS PopulateDmlByRowIdOps(const vector *ybctids) {
}
Need a generic retry logic/framework that decides whether an operation or transaction should be retried and what the retry policy should be.
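One possible shape for such a framework is sketched below: the operation reports whether its failure is retryable, and a policy object bounds the attempts and the backoff. The names `Outcome`, `RetryPolicy`, and `RunWithRetry` are hypothetical and do not come from the chogori-sql code base; exponential backoff is just one policy choice among several.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical classification of an operation result.
enum class Outcome { kOk, kRetryable, kFatal };

// Hypothetical policy: bounded attempts with exponential backoff.
struct RetryPolicy {
    int max_attempts = 3;
    std::chrono::milliseconds initial_backoff{10};
    double backoff_multiplier = 2.0;
};

// Runs `op` until it succeeds, fails fatally, or the attempts are exhausted.
// Returns the last outcome observed (kFatal if max_attempts <= 0).
inline Outcome RunWithRetry(const RetryPolicy& policy,
                            const std::function<Outcome()>& op) {
    auto backoff = policy.initial_backoff;
    Outcome last = Outcome::kFatal;
    for (int attempt = 1; attempt <= policy.max_attempts; ++attempt) {
        last = op();
        if (last != Outcome::kRetryable) {
            return last;  // success or a non-retryable failure
        }
        if (attempt < policy.max_attempts) {
            std::this_thread::sleep_for(backoff);
            backoff = std::chrono::milliseconds(static_cast<long long>(
                backoff.count() * policy.backoff_multiplier));
        }
    }
    return last;
}
```

Keeping the retry decision (the `Outcome` returned by the operation) separate from the policy (attempts and backoff) lets the same loop serve both single operations and whole transactions.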
Sequence table support
We should put common, entities, pggate under k2/connector directly and remove the yb folder.
As discussed on #56, we should change namespace to database.
Need to remove any place that we bring in the whole yb namespace by using
using namespace yb;
This won't include the actual logic for creating a table or database, only the code path that sends a DDL command down to the PG connector.
Right now, the catalog version is kept at the cluster level and is used to signal a cache refresh. For fine-grained control, we should ideally keep a catalog version per database. However, this requires changing the YBC logic inside PG. Also pay attention to updates of shared tables such as pg_database.
Introduce a template read/write operation and base the PG operation on it.
We should use an AND condition as the root of the expression tree and add a new operand to it each time a condition expression is allocated.
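A minimal sketch of that idea follows, assuming a simple node type. `Expr` and `ConditionBuilder` are hypothetical names chosen for illustration; the connector's real expression classes may look different.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical expression node.
struct Expr {
    std::string op;                               // e.g. "AND", "EQ", "GT"
    std::vector<std::shared_ptr<Expr>> operands;  // child expressions
};

// Holds a condition tree whose root is always an AND node.
class ConditionBuilder {
public:
    ConditionBuilder() : root_(std::make_shared<Expr>()) { root_->op = "AND"; }

    // Allocates a new condition expression and attaches it as an operand
    // of the AND root, then hands it back so the caller can fill it in.
    std::shared_ptr<Expr> NewCondition(std::string op) {
        auto cond = std::make_shared<Expr>();
        cond->op = std::move(op);
        root_->operands.push_back(cond);
        return cond;
    }

    const std::shared_ptr<Expr>& root() const { return root_; }

private:
    std::shared_ptr<Expr> root_;
};
```

With this layout, every allocated condition is implicitly conjoined with the others, so the caller never has to stitch the conditions together afterwards.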
We have to choose a TPCC-SQL driver to use and benchmark our SQL milestone
Otherwise, it will hang forever. This could happen when it fails to contact the Chogori/SKV layer,
e.g., in the following code:
std::shared_ptr SqlCatalogManager::NewTransactionContext() {
std::future txn_future = k2_adapter_->beginTransaction();
std::shared_ptr txn = std::make_shared(txn_future.get()); // hang when K2_adapter can't return a transaction asynchronously.
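One way to avoid blocking forever in `get()` is to bound the wait with `std::future::wait_for`. The helper below is only a sketch: `GetWithTimeout` is a hypothetical name, and how the caller reacts to a timeout (error status, retry) is left open.

```cpp
#include <chrono>
#include <future>

// Waits for `fut` for at most `timeout`. On success, stores the value in
// `*out` and returns true. Returns false if the result is not ready in time
// (this also covers a deferred future, which wait_for never reports as ready).
template <typename T>
bool GetWithTimeout(std::future<T>& fut, std::chrono::milliseconds timeout,
                    T* out) {
    if (fut.wait_for(timeout) != std::future_status::ready) {
        return false;  // caller can surface an error instead of hanging
    }
    *out = fut.get();
    return true;
}
```

A caller would check the return value and report a failure to reach the storage layer rather than constructing the transaction context from a result that never arrives.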
Store each secondary index as another schema. We need to insert the index data into SKV ourselves, since SKV has no concept of a secondary index.
Might not be needed for Milestone M0.1.
Need to maintain cluster-level state, for example, whether the cluster has already been bootstrapped, and the catalog version that indicates when the catalog has been updated.
Right now, we bind PgExpr to an intermediate expression data structure in pg_op_api.h. We should refactor this to bind PgExpr directly to the SKV expression to avoid an additional conversion when we call SKV clients.
Need to clean up the YB_ data types:
typedef enum PgDataType {
YB_YQL_DATA_TYPE_NOT_SUPPORTED = -1,
YB_YQL_DATA_TYPE_UNKNOWN_DATA = 999,
YB_YQL_DATA_TYPE_NULL_VALUE_TYPE = 0,
YB_YQL_DATA_TYPE_INT8 = 1,
YB_YQL_DATA_TYPE_INT16 = 2,
YB_YQL_DATA_TYPE_INT32 = 3,
YB_YQL_DATA_TYPE_INT64 = 4,
YB_YQL_DATA_TYPE_STRING = 5,
YB_YQL_DATA_TYPE_BOOL = 6,
YB_YQL_DATA_TYPE_FLOAT = 7,
YB_YQL_DATA_TYPE_DOUBLE = 8,
YB_YQL_DATA_TYPE_BINARY = 9,
YB_YQL_DATA_TYPE_TIMESTAMP = 10,
YB_YQL_DATA_TYPE_DECIMAL = 11,
YB_YQL_DATA_TYPE_VARINT = 12,
YB_YQL_DATA_TYPE_INET = 13,
YB_YQL_DATA_TYPE_LIST = 14,
YB_YQL_DATA_TYPE_MAP = 15,
YB_YQL_DATA_TYPE_SET = 16,
YB_YQL_DATA_TYPE_UUID = 17,
YB_YQL_DATA_TYPE_TIMEUUID = 18,
YB_YQL_DATA_TYPE_TUPLE = 19,
YB_YQL_DATA_TYPE_TYPEARGS = 20,
YB_YQL_DATA_TYPE_USER_DEFINED_TYPE = 21,
YB_YQL_DATA_TYPE_FROZEN = 22,
YB_YQL_DATA_TYPE_DATE = 23,
YB_YQL_DATA_TYPE_TIME = 24,
YB_YQL_DATA_TYPE_JSONB = 25,
YB_YQL_DATA_TYPE_UINT8 = 100,
YB_YQL_DATA_TYPE_UINT16 = 101,
YB_YQL_DATA_TYPE_UINT32 = 102,
YB_YQL_DATA_TYPE_UINT64 = 103
} YBCPgDataType;
Run and publish TPCC-SQL benchmark results
Need to expose information to indicate whether the query plan is running inside PG gate/pushed to external storage or not when we run "explain SQL" or "explain analyze SQL".
Map SQL operations to SKV read/write requests.
Need to add an isNull flag to PgTuple so that we can handle null values.
We don't have user and account support yet; we need to add that later.
ybctid is a unique row identifier string heavily used in the layers above the k2 adapter. We need to support vending out the identifier for a given row, and using the identifier to issue read and write requests. This may require adding support to the SKV client API.
It seems port.h brings in the whole std namespace; we should remove that.
namespace std {} // Avoid error if we didn't see std.
using namespace std; // NOLINT
Work needed to integrate the components of the SQL milestone and get an end-to-end minimal working transaction
Need to address the comment from the following PR:
"Considering adding a work item to replace/eliminate this RStatusCode, with K2's HTTP like status code"
Create a DB by cloning the template DB.
There is no need to keep it; consider removing it.
Need to pass in configuration parameters instead of using the default ones. This could be done via the gflags library.
Add lazy logic to reduce the chance of operation timeouts.
We would like to have consistent APIs. Right now, only some of them are future-based. We might want to convert the catalog manager to future-based APIs later. However, to do that, we need to think about how to support future-based APIs on the server side, since the catalog manager will eventually be a remote server.
std::shared_ptr has the same functionality and it does not rely on inheritance. We should change to use std::shared_ptr instead.
Getting k2 schemas requires a request to seastar over the shared queue which could be expensive. Schemas are immutable so they can easily be cached by the k2 adapter.
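A cache along those lines could look like the sketch below. `Schema`, `SchemaCache`, and the `Fetcher` callback are hypothetical stand-ins, and keying the cache by name plus version is an assumption; the callback represents the expensive request over the shared queue.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an immutable SKV schema.
struct Schema {
    std::string name;
    unsigned version;
};

// Caches schemas by (name, version). Because schemas are immutable,
// entries never need to be invalidated.
class SchemaCache {
public:
    using Fetcher = std::function<std::shared_ptr<const Schema>(
        const std::string&, unsigned)>;

    explicit SchemaCache(Fetcher fetch) : fetch_(std::move(fetch)) {}

    std::shared_ptr<const Schema> Get(const std::string& name,
                                      unsigned version) {
        const std::string key = name + "#" + std::to_string(version);
        std::lock_guard<std::mutex> lock(mu_);
        auto it = cache_.find(key);
        if (it != cache_.end()) {
            return it->second;  // cache hit: no round trip
        }
        auto schema = fetch_(name, version);  // the expensive request
        if (schema) {
            cache_.emplace(key, schema);
        }
        return schema;
    }

private:
    Fetcher fetch_;
    std::mutex mu_;
    std::unordered_map<std::string, std::shared_ptr<const Schema>> cache_;
};
```

This sketch holds the lock while fetching, which keeps it simple but serializes concurrent misses; a production version might fetch outside the lock.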
SKV does not support unsigned integers when the integer is used to compare records. It would be fine to store unsigned integers as signed integers as long as we don't use them for record filtering.
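The caveat about record filtering can be seen in a small sketch: an unsigned value stored in a signed field round-trips without loss, but large values become negative, so signed comparison no longer matches unsigned order. `StoreAsSigned` and `LoadAsUnsigned` are hypothetical helpers written for this illustration, and the negative result assumes the usual two's-complement representation.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the bits of an unsigned 32-bit value as a signed one.
inline int32_t StoreAsSigned(uint32_t v) {
    int32_t out;
    std::memcpy(&out, &v, sizeof(out));
    return out;
}

// Recovers the original unsigned value from its signed storage form.
inline uint32_t LoadAsUnsigned(int32_t v) {
    uint32_t out;
    std::memcpy(&out, &v, sizeof(out));
    return out;
}
```

For example, `1u < 3000000000u` as unsigned values, yet `StoreAsSigned(1u) > StoreAsSigned(3000000000u)` because the larger value is stored as a negative number. Reading the value back is safe; filtering or ordering on the stored form is not.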