I have "wide" tables, where for some analytical queries would greatly benefit from col

Can I use the columnar store extension with timescales (timescaledb issue, closed)

timescale commented on May 20, 2024

Can I use the columnar store extension with Timescale?

from timescaledb.

Comments (5)

stalltron commented on May 20, 2024

The answer at this point in time is no. The data partitioning (chunking) TimescaleDB uses is optimized for indexing the data so that queries, especially as they increase in complexity, are performant across larger volumes of data. With a columnar store you lose almost all indexing (i.e., there is no B-tree support at all), so it doesn't make sense to combine the two models given our design decisions. We've had some internal engineering discussions about ideas for columnar storage, but it is not on any shorter-term roadmap.

akulkarni commented on May 20, 2024

Also, if you feel comfortable sharing the general structure of the data you are storing (and the relevant queries), we can take a closer look and suggest how best to store that data in Timescale.

tr8dr commented on May 20, 2024

I recognize that columnar storage is poor for certain workloads and better for others. My main issue is the cost of table scans when a given column-narrow ad-hoc query cannot be resolved by an index.
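To put rough numbers on that scan cost, here is a back-of-the-envelope sketch in Python. The table shape (200 columns, 8 bytes per value, 100 million rows) and the `bytes_scanned` helper are made-up assumptions for illustration, and the model ignores compression, page overhead, and caching.

```python
def bytes_scanned(rows: int, total_cols: int, cols_needed: int,
                  bytes_per_value: int = 8) -> dict:
    """Estimate bytes read by a full scan under two idealized layouts."""
    # Row layout: every row is read whole, so all columns are paid for.
    row_store = rows * total_cols * bytes_per_value
    # Column layout: only the columns the query touches are read.
    column_store = rows * cols_needed * bytes_per_value
    return {
        "row_store": row_store,
        "column_store": column_store,
        "ratio": row_store / column_store,
    }


# Hypothetical wide table: a query touching 3 of 200 columns.
estimate = bytes_scanned(rows=100_000_000, total_cols=200, cols_needed=3)
print(estimate)
```

Under these assumptions the ratio is simply `total_cols / cols_needed`, which is why a column-narrow query over a wide table is where an unindexed scan of row storage hurts most.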

I suspect the biggest win for my sort of queries would be if I could apply parallel disk reads plus filtering (in this case, on one server with a 10-way disk array and many cores). This would be similar, without the special hardware, to what Netezza does, i.e. parallel reads with filtration based on what part of a query can be run on a chunk of data, on each tightly coupled cpu <-> disk pair.

At the moment, short of creating numerous indices across many columns, some queries will involve a linear table scan. A linear scan can work reasonably well with parallelism.
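The idea of scanning chunks in parallel and filtering each one where it is read can be sketched in a few lines of Python. This is only an illustration of the pattern, not how TimescaleDB or Netezza implement it; the in-memory `chunks` list, the `(id, price)` row shape, and the helper names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk, predicate):
    """Linear scan of one chunk, filtering rows as they are read."""
    return [row for row in chunk if predicate(row)]


def parallel_scan(chunks, predicate, workers=4):
    """Scan every chunk concurrently and concatenate the surviving rows."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, so the output is deterministic.
        parts = pool.map(lambda chunk: scan_chunk(chunk, predicate), chunks)
        return [row for part in parts for row in part]


# Hypothetical data: 10 chunks (think one per disk) of 100 (id, price) rows each.
chunks = [
    [(i, float(i % 7)) for i in range(c * 100, (c + 1) * 100)]
    for c in range(10)
]
hits = parallel_scan(chunks, lambda row: row[1] > 5.0, workers=10)
print(len(hits))
```

Threads are used here because the real workload would be I/O-bound disk reads; for CPU-heavy filtering in Python, a process pool would be the more likely choice.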

mfreed commented on May 20, 2024

Hi @tr8dr, sorry for the delay in responding.

One of the lesser-advertised features in our recent 0.1.0 release is the ability to associate multiple Postgres tablespaces with a single hypertable, so that this single "table" can reside across multiple disks, and chunks belonging to this hypertable can then be queried in parallel.

Better documentation is forthcoming for the new attach_tablespace() API command, but if you are interested in the details:

71c5e78
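As a rough sketch of what that setup might look like, the Python snippet below only builds and prints the SQL one would run; it does not connect to a database. The hypertable name `conditions`, the tablespace names, and the mount paths are hypothetical, and the argument order `attach_tablespace(tablespace, hypertable)` follows later TimescaleDB documentation, so it is an assumption that it matches the 0.1.0 release.

```python
def tablespace_statements(hypertable: str, mounts: dict[str, str]) -> list[str]:
    """Build SQL that creates one tablespace per disk and attaches it to a hypertable.

    Names and paths are interpolated directly, so they are assumed to be trusted.
    """
    statements = []
    for name, path in mounts.items():
        # Standard PostgreSQL: a tablespace maps a name to a directory on disk.
        statements.append(f"CREATE TABLESPACE {name} LOCATION '{path}';")
        # Assumed argument order (tablespace, hypertable); check your version's docs.
        statements.append(f"SELECT attach_tablespace('{name}', '{hypertable}');")
    return statements


# Hypothetical layout: two disks mounted under /mnt.
sql = tablespace_statements(
    "conditions",
    {"disk1": "/mnt/disk1/pgdata", "disk2": "/mnt/disk2/pgdata"},
)
print("\n".join(sql))
```

Once more than one tablespace is attached, the intent described above is that chunks of the hypertable are spread across them.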

mfreed commented on May 20, 2024

@tr8dr I'm going to close out this issue unless there's anything else?
