kobsio / klogs
Fast, scalable and reliable logging using Fluent Bit and ClickHouse
Home Page: https://kobs.io/main/plugins/klogs/
License: MIT License
Hi,
Thanks for this interesting project for storing logs in ClickHouse.
Here are a couple of suggestions for schema optimization:
ZSTD-compressed data is several times slower to read than data compressed with the default LZ4.
https://kb.altinity.com/altinity-kb-schema-design/codecs/codecs-speed/
And for LowCardinality (highly compressible) columns, it usually doesn't make sense to use ZSTD, because the compression ratio is already fine.
cluster LowCardinality(String) CODEC (ZSTD(1)),
namespace LowCardinality(String) CODEC (ZSTD(1)),
app String CODEC (ZSTD(1)),
pod_name String CODEC (ZSTD(1)),
container_name String CODEC (ZSTD(1)),
So I would suggest adjusting it as follows:
cluster LowCardinality(String),
namespace LowCardinality(String),
app LowCardinality(String),
pod_name LowCardinality(String),
container_name LowCardinality(String),
Even if there can be many distinct app, pod_name, and container_name values (well, not that many, probably in the low hundreds of thousands), it's still perfectly fine to use LowCardinality here, because it works well as a per-part local dictionary.
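One way to sanity-check the cardinality assumption is to count distinct values per data part, since the LowCardinality dictionary is local to each part. This is only a sketch: the table name logs.logs_local is a placeholder that does not come from this issue, and uniq returns an approximate count.

```sql
-- Hypothetical table name; replace logs.logs_local with the real table.
SELECT
    _part,                                   -- virtual column: name of the data part
    count() AS rows,
    uniq(app) AS distinct_apps,              -- approximate distinct counts
    uniq(pod_name) AS distinct_pods,
    uniq(container_name) AS distinct_containers
FROM logs.logs_local
GROUP BY _part
ORDER BY distinct_pods DESC
LIMIT 20
```

If the per-part distinct counts stay well below the row counts, that would be consistent with LowCardinality being a reasonable fit for these columns.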
fields_string Nested(key String, value String) CODEC (ZSTD(1)),
fields_number Nested(key String, value Float64) CODEC (ZSTD(1)),
It makes sense to use LowCardinality for the key column here as well.
fields_string Nested(key LowCardinality(String), value String),
fields_number Nested(key LowCardinality(String), value Float64),
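Put together, the suggested column definitions might look roughly like the table below. This is a sketch, not the project's actual schema: the table name logs_example, the timestamp, host, and log columns with their types, the engine, and the sorting key are all assumptions added only to make the statement complete.

```sql
-- Sketch only. logs_example, the timestamp/host/log column types,
-- the engine, and the ORDER BY key are placeholders, not the klogs schema.
CREATE TABLE logs_example
(
    timestamp      DateTime,
    cluster        LowCardinality(String),
    namespace      LowCardinality(String),
    app            LowCardinality(String),
    pod_name       LowCardinality(String),
    container_name LowCardinality(String),
    host           String,
    -- LowCardinality on the key sub-column, plain types for the values
    fields_string  Nested(key LowCardinality(String), value String),
    fields_number  Nested(key LowCardinality(String), value Float64),
    log            String
)
ENGINE = MergeTree
ORDER BY (cluster, namespace, app, pod_name, container_name, host, timestamp);
```

With no CODEC clause, each column falls back to the server's default compression, which is LZ4 unless the server configuration says otherwise.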
ORDER BY (cluster, namespace, app, pod_name, container_name, host, -toUnixTimestamp(timestamp));
As I understand it, host is the k8s node hostname?
In that case, why is it almost last in the ORDER BY clause?
BTW, why are you trying to reverse the timestamp? Is it because of a lot of queries with ORDER BY timestamp DESC?
If you can show the result of this query for one of these tables with a lot of data, it would be helpful.
SELECT
    database,
    table,
    column,
    type,
    sum(rows) AS rows,
    sum(column_data_compressed_bytes) AS compressed_bytes,
    formatReadableSize(compressed_bytes) AS compressed,
    formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed,
    sum(column_data_uncompressed_bytes) / compressed_bytes AS ratio,
    any(compression_codec) AS codec
FROM system.parts_columns AS pc
LEFT JOIN system.columns AS c
    ON (pc.database = c.database) AND (c.table = pc.table) AND (c.name = pc.column)
WHERE (database LIKE '%') AND (table LIKE '%') AND active
GROUP BY
    database,
    table,
    column,
    type
ORDER BY database, table, sum(column_data_compressed_bytes) DESC
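For a quicker look at a single table, system.columns also exposes per-column sizes, the configured codec, and whether a column is part of the sorting key. The sketch below assumes a database named logs and a table named logs_local; both names are placeholders that do not come from this issue.

```sql
-- Hypothetical database/table names; adjust the WHERE clause to the real table.
SELECT
    name,
    type,
    compression_codec,
    is_in_sorting_key,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 2) AS ratio
FROM system.columns
WHERE database = 'logs' AND table = 'logs_local'
ORDER BY data_compressed_bytes DESC
```

Nested columns show up here as separate entries such as fields_string.key and fields_string.value, so the effect of LowCardinality on the key sub-column can be read off directly.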