Comments (21)
Umami is hosted on a small Hetzner Cloud machine (CX 21). I'm using MySQL as database.
from umami.
That sounds really long. Those queries should be using indexed columns so it shouldn't take that long. What kind of server is it running on?
from umami.
I'm running on a much smaller server (although fewer users) and I haven't had queries that long. Check your database and make sure the indexes were created.
from umami.
Indexes are there.
from umami.
You can try doing an EXPLAIN
on the query to see what's going on. This is the query for metrics for MySQL:
select sum(t.c) as "pageviews",
count(distinct t.session_id) as "uniques",
sum(case when t.c = 1 then 1 else 0 end) as "bounces",
sum(t.time) as "totaltime"
from (
select session_id,
date_trunc('hour', created_at),
count(*) c,
floor(unix_timestamp(max(created_at)) - unix_timestamp(min(created_at))) as "time"
from pageview
where website_id=?
and created_at between ? and ?
group by 1, 2
) t
Replace the ?
with parameters.
from umami.
Also, what is the date range on that query?
from umami.
Running the query itself, it takes "just" 7,125s:
select sum(t.c) as "pageviews",
count(distinct t.session_id) as "uniques",
sum(case when t.c = 1 then 1 else 0 end) as "bounces",
sum(t.time) as "totaltime"
from (
select session_id,
date_trunc('hour', created_at),
count(*) c,
floor(unix_timestamp(max(created_at)) - unix_timestamp(min(created_at))) as "time"
from pageview
where website_id=2
and created_at between from_unixtime(1597788000) and from_unixtime(1598392799)
group by 1, 2
) t
Explained:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | ALL | 242784 | ||||||
2 | DERIVED | pageview | ref | pageview_created_at_idx,pageview_website_id_idx | pageview_website_id_idx | 4 | const | 242784 | Using where; Using temporary; Using filesort |
Also, what is the date range on that query?
7 days is set in the UI. However, the first entry was created on 2020-08-20 03:14:25.
from umami.
Try creating a new composite index and see if that helps:
create index pageview_website_id_created_at_idx on pageview(website_id, created_at);
from umami.
Still no change.
metrics: 10.96s
pageviews: 19.03s
from umami.
@SoftCreatR do you think you can send me a mysql dump of your data? I only have postgresql data. I ran a test with 500K records and it returned in 30ms. I want to see if there is something mysql specific.
from umami.
Ok, I mocked up a database in MySQL with a million records. Even this simple query it takes 5 seconds:
select session_id,
count(*) c
from pageview
where website_id=1
group by 1
The same query in Postgres takes 13ms. It seems MySQL has issues with group by
and count
in the query. May need someone with more MySQL experience to chime in here.
from umami.
I guess, the problem here is the numeric grouping. I'm not sure, but it could be, that indexes don't have that much effect in this case.
from umami.
I'll keep trying with different queries but it seems really odd that such is simple query is so slow.
from umami.
@SoftCreatR try adding this index:
create index pageview_website_id_session_id_created_at_idx on pageview(website_id, session_id, created_at);
from umami.
No change. Ill send you a dump of my db. Maybe that helps.
from umami.
@SoftCreatR Getting closer to a solution. I've got the main queries down to less than 2 seconds.
from umami.
@SoftCreatR try pulling the lastest build. I made some improvements. And make sure you create this index:
create index pageview_website_id_session_id_created_at_idx on pageview(website_id, session_id, created_at);
from umami.
Created the index, pulled the latest commits, rebuilt it. Result:
from umami.
It's even worse: It consumes all available RAM and CPU so it's completely useless for me atm.
from umami.
So, after some analysis, 3 queries are fired:
select sum(t.c) as "pageviews",
count(distinct t.session_id) as "uniques",
sum(case when t.c = 1 then 1 else 0 end) as "bounces",
sum(t.time) as "totaltime"
from (
select session_id,
DATE_FORMAT(created_at, '%Y-%m-%d %H:00:00'),
count(*) c,
floor(unix_timestamp(max(created_at)) - unix_timestamp(min(created_at))) as "time"
from pageview
where website_id=2
and created_at between TIMESTAMP'2020-08-21 22:00:00' and TIMESTAMP'2020-08-28 21:59:59.999000'
group by 1, 2
) t
Execution time (Avg): 10.375s
Explanation:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | ALL | 460892 | 192252,00 | 100,00 | 100,00 | ||||||
2 | DERIVED | pageview | ref | pageview_created_at_idx,pageview_website_id_idx,pageview_website_id_session_id_created_at_idx | pageview_website_id_session_id_created_at_idx | 4 | const | 460892 | 950738,00 | 100,00 | 88,77 | Using where; Using index; Using temporary; Using filesort |
select DATE_FORMAT(convert_tz(created_at,'+00:00','+02:00'), '%Y-%m-%d') t,
count(distinct session_id) y
from pageview
where website_id=2
and created_at between TIMESTAMP'2020-08-21 22:00:00' and TIMESTAMP'2020-08-28 21:59:59.999000'
group by 1
order by 1
Execution time (Avg): 2.750s
Explanation:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | pageview | ref | pageview_created_at_idx,pageview_website_id_idx,pageview_website_id_session_id_created_at_idx | pageview_website_id_session_id_created_at_idx | 4 | const | 460860 | 843947,00 | 100,00 | 100,00 | Using where; Using index; Using filesort |
select distinct url x, count(*) y
from pageview
where website_id=2
and created_at between TIMESTAMP'2020-08-21 22:00:00' and TIMESTAMP'2020-08-28 21:59:59.999000'
group by 1
order by 2 desc
Execution time (Avg): 18.004s
Explanation:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | pageview | ref | pageview_created_at_idx,pageview_website_id_idx,pageview_website_id_session_id_created_at_idx | pageview_website_id_idx | 4 | const | 460709 | 950401,00 | 100,00 | 88,77 | Using where; Using temporary; Using filesort |
I've updated the dump, that I've sent to you already.
from umami.
@SoftCreatR The data dump was very useful. I was able to fix a couple bugs from it. For example the domain field for the website should just be the domain, not including http. So thanks for that.
Using the new dump and I made a few small improvements. Here are the numbers from the details page:
A little higher than I'd like but not terrible. But I think I understand what is going on with your server. You are running it on a live site while I am just using a local MySQL instance with no incoming traffic. With the amount of hits you are getting, it's probably taking up all your resources for these queries. You might need a much stronger server or set up a read replica.
Originally I had a cookie using localStorage that would cache some data and save a few queries, but I removed it due to GDPR concerns. It would probably help a lot in your case. I can add it back and make it configurable. What do you think? Since you're EU based it's probably more of a concern for you.
from umami.
Related Issues (20)
- Filter by url | query string | country | etc doesn't work for site details
- Clicking on team website from dashboard changes team HOT 1
- admin user shows role as unknown HOT 1
- Website endpoint returns 500 after upgrading to 2.10.2 HOT 3
- Argument `websiteId` is missing. HOT 1
- No query parameters captured HOT 2
- api for real-time online counts & api for the current page HOT 2
- Add `splitted_url_query` and `splitted_referrer_query` for better UTM/Query Params filtering HOT 2
- missing formal parameter HOT 5
- crypto is leaking into the frontend, causing 100kb of useless js
- completely redundant large dependency HOT 4
- No region and cities are showing HOT 2
- Inconsistent query result: Field user is required to return data, got `null` instead HOT 2
- world map HOT 1
- Search is broken HOT 1
- report's query is not working HOT 1
- This issue still exists in v2.10.2. url encode special characters
- Dashboard: undefined is not an object (evaluating 'D.find(t=>t.value===e).label') HOT 1
- translations: regression with `tracking code` HOT 2
- Environment variable "CLIENT_IP_HEADER" support Case insensitive HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from umami.