dingodb / dingo

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

Home Page: https://www.dingodb.com

License: Apache License 2.0

Java 98.58% Shell 0.40% Dockerfile 0.01% FreeMarker 1.01%
serving embedding-store vector-database mysql-compatibility embedding-search key-value-distributed-store vector-ocean unified-sql structured-data unstructured-data

dingo's People

Contributors

astor-oss, benben005, cliff57, elaine11ya, githubgxll, guojn1, guomm37, haotman, haoton2023, hechanghaogary, human-ai-alaya, jt030, jwcxs-m, jycz, ketor, kevinguo1989, lasyard, rock-git, sophia0608, stevenchennet, wangzhen11aaa, wt0530, yuhaijun999, zbynek, zhangjie0303

dingo's Issues

[dingo-expr] Support null values in operators and functions.

Null values are not allowed in dingo-expr operations. If an operand is null, exceptions may be thrown.

Null handling should be implemented according to the SQL standard.

Currently, the literal 'null' is parsed as a variable and compiled to an RtNull, but the operators do no further processing of it.
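
As an illustration of the SQL-standard behavior asked for here, the following is a minimal sketch and not dingo-expr code. The class and method names are hypothetical. It assumes arithmetic on a null operand yields null, and that AND follows three-valued logic, where FALSE dominates and null stands for UNKNOWN.

```java
final class NullAwareOps {
    // Arithmetic: any null operand makes the result null instead of throwing.
    static Integer add(Integer a, Integer b) {
        if (a == null || b == null) {
            return null;
        }
        return a + b;
    }

    // Three-valued AND: FALSE wins over UNKNOWN (null); otherwise null propagates.
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) {
            return Boolean.FALSE;
        }
        if (a == null || b == null) {
            return null;
        }
        return Boolean.TRUE;
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));
        System.out.println(add(1, null));
        System.out.println(and(null, false));
        System.out.println(and(null, true));
    }
}
```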

Support date/datetime functions.

Summary:
Add the following datetime functions:
Now(), curDate(), Current_Date, Current_Time, CurTime(), Current_Timestamp(), From_UnixTime(), Unix_TimeStamp(), Date_Format().
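
A rough sketch of how three of these functions could behave, written with java.time. It is not the DingoDB implementation. The class name is hypothetical, all values are assumed to be in UTC (MySQL uses the session time zone), and only the MySQL-style specifiers %Y, %m, %d, %H, %i and %s are handled.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Locale;

final class SqlDateFns {
    // From_UnixTime: seconds since the epoch to a datetime (UTC assumed).
    static LocalDateTime fromUnixTime(long epochSeconds) {
        return LocalDateTime.ofEpochSecond(epochSeconds, 0, ZoneOffset.UTC);
    }

    // Unix_TimeStamp: datetime back to seconds since the epoch (UTC assumed).
    static long unixTimestamp(LocalDateTime t) {
        return t.toEpochSecond(ZoneOffset.UTC);
    }

    // Date_Format: expands a small subset of MySQL-style specifiers.
    static String dateFormat(LocalDateTime t, String format) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < format.length(); i++) {
            char c = format.charAt(i);
            if (c != '%' || i + 1 == format.length()) {
                sb.append(c);
                continue;
            }
            char spec = format.charAt(++i);
            switch (spec) {
                case 'Y': sb.append(String.format(Locale.ROOT, "%04d", t.getYear())); break;
                case 'm': sb.append(String.format(Locale.ROOT, "%02d", t.getMonthValue())); break;
                case 'd': sb.append(String.format(Locale.ROOT, "%02d", t.getDayOfMonth())); break;
                case 'H': sb.append(String.format(Locale.ROOT, "%02d", t.getHour())); break;
                case 'i': sb.append(String.format(Locale.ROOT, "%02d", t.getMinute())); break;
                case 's': sb.append(String.format(Locale.ROOT, "%02d", t.getSecond())); break;
                default: sb.append(spec); // unknown specifier: emit the character as-is
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LocalDateTime t = fromUnixTime(86400L);
        System.out.println(dateFormat(t, "%Y-%m-%d %H:%i:%s"));
    }
}
```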

Support aggregation SUM0 and function CASE.

There are two aggregation functions, SUM and SUM0, in the Calcite logical plan. Sometimes SUM is transformed to CASE(COUNT(x) = 0, null, SUM0(x)). Without support for SUM0 and CASE, these queries fail.
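
The rewrite can be pictured with plain Java collections. This is a sketch, not planner or executor code, and the class name is hypothetical. It assumes SUM0 returns 0 for an empty or all-null input while SUM returns null in that case, and that COUNT(x) counts only non-null values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

final class SumRewrite {
    // COUNT(x): number of non-null values.
    static long count(List<Long> xs) {
        return xs.stream().filter(Objects::nonNull).count();
    }

    // SUM0(x): sum of non-null values, 0 when there are none.
    static long sum0(List<Long> xs) {
        return xs.stream().filter(Objects::nonNull).mapToLong(Long::longValue).sum();
    }

    // SUM(x) expressed as CASE(COUNT(x) = 0, null, SUM0(x)).
    static Long sum(List<Long> xs) {
        if (count(xs) == 0) {
            return null;
        }
        return Long.valueOf(sum0(xs));
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1L, null, 2L)));
        System.out.println(sum(Collections.<Long>emptyList()));
        System.out.println(sum0(Collections.<Long>emptyList()));
    }
}
```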

Support multiple schema names mapped to multiple meta service providers.

Currently in DingoDB, there is only one schema, named 'DINGO', which is statically bound to a single meta service.

If several meta services are provided, additional schema names are needed.

A "schema name --- meta service provider" registration mechanism would maximize flexibility.
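
One possible shape for such a registration mechanism is sketched below. Every name in it (MetaServiceProvider, SchemaRegistry, describe) is hypothetical and not taken from the DingoDB code base. Treating schema names as case-insensitive is also an assumption.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for whatever interface a meta service exposes.
interface MetaServiceProvider {
    String describe();
}

final class SchemaRegistry {
    private final Map<String, MetaServiceProvider> providers = new ConcurrentHashMap<>();

    // Binds a schema name to a provider; a name can be bound only once.
    void register(String schemaName, MetaServiceProvider provider) {
        if (providers.putIfAbsent(normalize(schemaName), provider) != null) {
            throw new IllegalArgumentException("Schema already registered: " + schemaName);
        }
    }

    // Resolves a schema name to its provider, if one was registered.
    Optional<MetaServiceProvider> lookup(String schemaName) {
        return Optional.ofNullable(providers.get(normalize(schemaName)));
    }

    // Assumption: schema names are compared case-insensitively.
    private static String normalize(String schemaName) {
        return schemaName.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        SchemaRegistry registry = new SchemaRegistry();
        registry.register("DINGO", () -> "default meta service");
        registry.register("OTHER", () -> "second meta service");
        System.out.println(registry.lookup("dingo").map(MetaServiceProvider::describe).orElse("none"));
    }
}
```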

[dingo-kv] Update distributed kv status and region size

Reason

Currently, the heartbeat information sent from Executor to Coordinator cannot be used for region management and scheduling.

Solution

  • Use approximation to obtain the region size and the total key count of each region.

[dingo-config] Unify configuration of the coordinator and store modules

Change the following item names:

Old name | New name
default.replicas | store.default.replicas
default.partitions | store.default.partitions
kv.cluster.id | store.cluster.id
kv.cluster.name | store.cluster.name
kv.initialServerList | store.initialServerList
kv.failoverRetries | store.failoverRetries
kv.futureTimeoutMillis | store.futureTimeoutMillis
placement.driver.options.fake | coordinator.options.fake
placement.driver.options.pdGroupId | coordinator.options.raftGroupId
placement.driver.options.initialPdServerList | coordinator.options.initCoordinatorSrvList
placement.driver.options.cliOptions.timeoutMs | coordinator.options.cliOptions.timeoutMs
placement.driver.options.cliOptions.maxRetry | coordinator.options.cliOptions.maxRetry
placement.driver.options.cliOptions.rpcProcessorThreadPoolSize | coordinator.options.cliOptions.rpcProcessorThreadPoolSize
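
For anyone carrying an old configuration forward, the renames above could be applied mechanically. The sketch below assumes the settings are available as flat java.util.Properties keys, which the issue does not state. The class name is hypothetical and only three of the renames are included for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

final class ConfigKeyMigration {
    // A subset of the old-name to new-name pairs listed in the issue.
    static final Map<String, String> RENAMES = new LinkedHashMap<>();

    static {
        RENAMES.put("default.replicas", "store.default.replicas");
        RENAMES.put("kv.cluster.id", "store.cluster.id");
        RENAMES.put("placement.driver.options.pdGroupId", "coordinator.options.raftGroupId");
    }

    // Returns a copy in which renamed keys use their new names; other keys are kept.
    static Properties migrate(Properties old) {
        Properties result = new Properties();
        for (String name : old.stringPropertyNames()) {
            result.setProperty(RENAMES.getOrDefault(name, name), old.getProperty(name));
        }
        return result;
    }

    public static void main(String[] args) {
        Properties old = new Properties();
        old.setProperty("default.replicas", "3");
        old.setProperty("kv.cluster.id", "0");
        System.out.println(migrate(old).stringPropertyNames());
    }
}
```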

[Bug]: Default value of a table column is not applied when inserting data.

What happened?

When inserting a record into a table that has a column with the constraint 'NOT NULL DEFAULT ...', if no value is provided for that column, the query does not use the default value but throws an exception.

Version

0.3.0

Contact Details

No response

Relevant log output

No response

Support date, time, datetime

Add support for the date, time, and datetime types to Dingo.

date

The DATE type is used for values with a date part but no time part. Dingo retrieves and displays DATE values in 'YYYY-MM-DD' format. The supported range is '1000-01-01' to '9999-12-31'.

time

The TIME type is used for values with a time part but no date part. Dingo retrieves and displays TIME values in 'HH:mm:ss' format.

datetime

The DATETIME type is used for values that contain both date and time parts. Dingo retrieves and displays DATETIME values in 'YYYY-MM-DD hh:mm:ss' format. The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'.

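Add basic string functions

  • str1 || str2 || str3: concatenates str1, str2, ..., strn into a single string
  • Format(x,n): formats the number x, keeping n digits after the decimal point
  • Locate(substr,str): finds the position of the substring substr in str
  • Lower(str): converts all letters in str to lowercase
  • Lcase(str): converts all letters in str to lowercase
  • Upper(str): converts all letters in str to uppercase
  • Ucase(str): converts all letters in str to uppercase
  • Left(str,x): returns the leftmost x characters of str
  • Right(str,x): returns the rightmost x characters of str
  • Repeat(str,x): returns str repeated x times
  • Replace(str,a,b): replaces every occurrence of the string a in str with the string b
  • Trim(str): removes the spaces at the beginning and end of str
  • Ltrim(str): removes the spaces at the beginning of str
  • Rtrim(str): removes the spaces at the end of str
  • MID(str,n,len): returns the substring of length len starting at position n of str
  • SubString(str,x,y): returns the substring of str that starts at position x and is y characters long
  • Reverse(str): reverses the order of the characters in str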
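
A few of these functions are sketched below in plain Java to make the intended results concrete. This is illustrative only and the class name is hypothetical. It assumes MySQL-style 1-based positions for MID and Locate, with Locate returning 0 when the substring is absent. The issue itself does not spell these details out.

```java
final class StringFns {
    // Left(str, x): the leftmost x characters, clamped to the string length.
    static String left(String s, int x) {
        return s.substring(0, Math.max(0, Math.min(x, s.length())));
    }

    // Right(str, x): the rightmost x characters, clamped to the string length.
    static String right(String s, int x) {
        int k = Math.max(0, Math.min(x, s.length()));
        return s.substring(s.length() - k);
    }

    // Repeat(str, x): str concatenated x times; empty for x <= 0.
    static String repeat(String s, int x) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < x; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    // MID(str, n, len): substring of length len starting at 1-based position n.
    static String mid(String s, int n, int len) {
        if (n < 1 || n > s.length() || len <= 0) {
            return "";
        }
        int start = n - 1;
        int take = Math.min(len, s.length() - start);
        return s.substring(start, start + take);
    }

    // Locate(substr, str): 1-based position of substr in str, 0 if absent.
    static int locate(String sub, String s) {
        return s.indexOf(sub) + 1;
    }

    // Reverse(str): characters in reverse order.
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(left("dingo", 2) + " " + right("dingo", 2) + " " + mid("dingo", 2, 3));
    }
}
```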

[Bug]: Local files cannot be imported into dingo

What happened?

[ ] CSV files with a boolean column (default null) cannot be imported
[ ] Import tools can compute the imported record count

Version

v0.4.0

Contact Details

No response

Relevant log output

No response

[Bug]: NullPointerException thrown when getting the table list.

What happened?

Do the following JDBC call:

DatabaseMetaData metaData = connection.getMetaData();
ResultSet resultSet = metaData.getTables(null, null, "%", null);

Then a NullPointerException is thrown.

Version

0.3.0-SNAPSHOT

Contact Details

No response

Relevant log output

No response

IS TRUE, IS FALSE, IS NOT TRUE and IS NOT FALSE support

MySQL implements the bool type as tinyint(1), so a = TRUE is different from a IS TRUE: the former returns false if a is not 1, because TRUE is 1; the latter returns true if a is a number and is not 0.
DingoDB does have a bool type, but operators such as IS TRUE are still needed to mimic this behavior.
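
The four operators can be sketched over a nullable Boolean, with null standing for SQL UNKNOWN. This is a minimal illustration under that assumption, not DingoDB code. The class name and the toBoolean helper (which applies the MySQL-style "non-zero number is true" rule described above) are hypothetical.

```java
final class TruthTests {
    // IS TRUE: true only for a definite TRUE; null (UNKNOWN) gives false.
    static boolean isTrue(Boolean v) {
        return Boolean.TRUE.equals(v);
    }

    // IS FALSE: true only for a definite FALSE.
    static boolean isFalse(Boolean v) {
        return Boolean.FALSE.equals(v);
    }

    // IS NOT TRUE: true for FALSE and for null.
    static boolean isNotTrue(Boolean v) {
        return !isTrue(v);
    }

    // IS NOT FALSE: true for TRUE and for null.
    static boolean isNotFalse(Boolean v) {
        return !isFalse(v);
    }

    // MySQL-style numeric truth: any non-zero number counts as true.
    static Boolean toBoolean(Number n) {
        if (n == null) {
            return null;
        }
        return n.doubleValue() != 0;
    }

    public static void main(String[] args) {
        System.out.println(isTrue(toBoolean(2)));
        System.out.println(isTrue(null));
        System.out.println(isNotTrue(null));
    }
}
```

Unlike ordinary comparison, none of these four operators ever returns null, which is why they cannot be expressed as a plain equality test.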

[DIP-0] DingoDB Improvement Proposals

DingoDB Improvement Proposals

Purpose

The purpose of a DingoDB Improvement Proposal (DIP) is to introduce any major change into DingoDB. This is required in order to balance the need to support new features and use cases against the risk of accidentally introducing half thought-out interfaces that cause needless problems when changed.

What is considered a major change that needs a DIP?

Any of the following should be considered a major change:

  • Any major new feature, subsystem, or piece of functionality
  • Any change that impacts the public interfaces of the project

What are the "public interfaces" of the project? All of the following are public interfaces that people build around:

  • Data types
  • SQL
  • REST endpoints
  • Data passed between backend and frontend
  • Configuration
  • Command line tools and arguments

What should be included in a DIP?

A DIP should contain the following sections:

  • Motivation: describe the problem to be solved.
  • Proposed Change: describe the new thing you want to do. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences, depending on the scope of the change.
  • New or Changed Public Interfaces: impact to any of the "compatibility commitments" described above. We want to call these out in particular so everyone thinks about them.
  • New dependencies: describe any third-party libraries that the feature will require. In particular, make sure their license is compatible with the Apache License v2.0 (https://www.apache.org/licenses/LICENSE-2.0).
  • Migration Plan and Compatibility: if this feature requires additional support for seamless upgrades, describe how that will work. In particular, it’s important to mention if:
    • The feature requires a database migration;
    • The feature will coexist with similar functionality for some period of time, allowing for a deprecation period.
  • Rejected Alternatives: What are the other alternatives you considered and why are they worse? The goal of this section is to help people understand why this is the best solution now, and also to prevent churn in the future when old alternatives are reconsidered.

Who should initiate the DIP?

Anyone can initiate a DIP, but preferably someone with the intention of implementing it.

Process

  1. Create an issue with the prefix “[DIP]” in the title. The issue will be tagged as “dip” by a committer, and the title will be updated with the current DIP number.
  2. Notify the dingodb@zetyun mailing list that the DIP has been created. Use the subject line “[DISCUSS] DIP-0 DingoDB Improvement Proposals”; the body of the email should look something like “Please discuss & subscribe here: #1”.
  3. When writing the issue, fill in the sections as described above in “What should be included in a DIP?”. You can use the template included at the end of this document.
  4. A committer will initiate the discussion, and ensure that there’s enough time to analyze the proposal. Before accepting the DIP, a committer should call for a vote, requiring 3 votes and no vetoes from committers. Votes are time-boxed to 1 week and conducted through email (with the subject [VOTE]).
  5. Create a pull request implementing the DIP, and referencing the issue.

Template

[DIP] Proposal for _

Motivation

Description of the problem to be solved.

Proposed Change

Describe how the feature will be implemented, or the problem will be solved. If possible, include mocks, screenshots, or screencasts (even if from different tools).

New or Changed Public Interfaces

Describe any new additions to the model, views or REST endpoints. Describe any changes to existing sub modules.

New dependencies

Describe any packages that are required. Are they actively maintained? What are their licenses?

Migration Plan and Compatibility

Describe any database migrations that are necessary, or updates to stored URLs.

Rejected Alternatives

Describe alternative approaches that were considered and rejected.

[dingo-driver] Basic JDBC driver for DingoDB.

Originally, the Calcite JDBC driver was used to access DingoDB. Implement basic functionality in dingo-driver to replace it, so that the 'sql -> db' process can be simplified and customized further.

CAST functions support

The CAST (sth AS type) function needs to be supported. In fact, some CAST operations are added even when they are not used explicitly in SQL. For example, an AVG aggregation over an INTEGER column requires the result to be cast back to INTEGER.
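
The AVG case can be pictured as a sum accumulated in a wider type, divided by the count, and then cast back to the column type. The sketch below is illustrative only and the class name is hypothetical. It assumes integer division on the widened sum and that an empty input yields null. How DingoDB rounds the result is not stated in the issue.

```java
final class AvgCast {
    // AVG over INTEGER values: accumulate as long, divide, then cast back to int.
    static Integer avg(int[] values) {
        if (values.length == 0) {
            return null; // AVG of no rows is null
        }
        long sum = 0L; // widened so that the sum cannot overflow int
        for (int v : values) {
            sum += v;
        }
        long quotient = sum / values.length;
        // The implicit CAST(... AS INTEGER): the average of ints always fits in int.
        return Math.toIntExact(quotient);
    }

    public static void main(String[] args) {
        System.out.println(avg(new int[] {1, 2}));
        System.out.println(avg(new int[] {Integer.MAX_VALUE, Integer.MAX_VALUE}));
    }
}
```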

[Bug]: Array of char/varchar type is not supported.

What happened?

Create a table

create table test(
    id int,
    data varchar array,
    primary key(id)
)

then insert some values

insert into test values (1, array['a', 'b', 'c'])

the execution would fail.

Version

0.4.0-SNAPSHOT

Contact Details

No response

Relevant log output

No response

[dingo-exec] Add table id to the key to access the row store.

Currently, only the primary key fields of a table are used to construct the key in the kv store. If a single kv store holds all tables, the keys of different tables will be mixed up. A fixed-length table id should therefore be prepended to the key to identify the data of each table.
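
A minimal sketch of the idea follows. It assumes an 8-byte big-endian table id and an already encoded primary key, neither of which is specified in the issue, and the class and method names are hypothetical. With a fixed-length prefix, the table id can be split off again without any delimiter.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class TableKeyCodec {
    // Assumption: the table id occupies a fixed 8 bytes at the start of every key.
    static final int TABLE_ID_BYTES = Long.BYTES;

    // Prepends the table id to the encoded primary key.
    static byte[] encode(long tableId, byte[] primaryKey) {
        return ByteBuffer.allocate(TABLE_ID_BYTES + primaryKey.length)
            .putLong(tableId) // ByteBuffer writes big-endian by default
            .put(primaryKey)
            .array();
    }

    // Reads the table id back from the fixed-length prefix.
    static long tableIdOf(byte[] key) {
        return ByteBuffer.wrap(key).getLong();
    }

    // Strips the prefix, leaving the original primary key bytes.
    static byte[] primaryKeyOf(byte[] key) {
        return Arrays.copyOfRange(key, TABLE_ID_BYTES, key.length);
    }

    public static void main(String[] args) {
        byte[] pk = {0x01, 0x02};
        byte[] keyInTable1 = encode(1L, pk);
        byte[] keyInTable2 = encode(2L, pk);
        // The same primary key no longer collides across tables.
        System.out.println(Arrays.equals(keyInTable1, keyInTable2));
    }
}
```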

[dingo-kvstore-api] Refactor the original store-api to hide more details from store implementation.

Generally, a key-value store implementation doesn't need to know the row type of a table. Keeping the encoding/decoding inside the exec engine makes the store implementation and resource management less coupled with the main execution layer of DingoDB.

The new api module can be called dingo-kvstore-api, and the original dingo-store-local module can be migrated to dingo-kvstore-rocks. The explicit 'rocks' is used because there may be multiple storage implementations.

A columnar store api may be needed, so leave dingo-store-api in place for future work.

IS NULL and IS NOT NULL support

Null is a special value in SQL. To test whether a value is null or not, the special predicates IS NULL and IS NOT NULL are needed.

[Bug]: Results of exprs are not correctly typed.

What happened?

The results of expression evaluation are not correctly typed, which makes avro encoding fail because of a mismatched schema. As a result, updating a table sometimes crashes.

Version

0.3.0

Contact Details

No response

Relevant log output

No response

[dingo-sdk] Support creating tables using annotations

[ ] Support creating a table from annotations on a Java POJO
[ ] Support Date type of column
[ ] Support Time type of column
[ ] Support Timestamp type of column
[ ] Support boolean type of column
[ ] Support String type of column
