Comments (3)
Hi Prashanth,
To confirm the result, I rerun Q7 and Q8 using both Kùzu and duckdb. Kùzu and duckdb do generate the same result for Q7 and Q8.
For Q7:
Kùzu actually returns 170 rather than 169 for Q7 (same result as duckdb). Can you rerun your testing script?
Query 7:
MATCH (p:Person)-[:LivesIn]->(:City)-[:CityIn]->(s:State)
WHERE p.age >= $age_lower AND p.age <= $age_upper AND s.country = $country
WITH p, s
MATCH (p)-[:HasInterest]->(i:Interest)
WHERE lower(i.interest) = lower($interest)
RETURN count(p.id) AS numPersons, s.state AS state, s.country AS country
ORDER BY numPersons DESC LIMIT 1
State in United States with the most users between ages 23-30 who have an interest in photography:
shape: (1, 3)
┌────────────┬────────────┬───────────────┐
│ numPersons ┆ state ┆ country │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ str │
╞════════════╪════════════╪═══════════════╡
│ 170 ┆ California ┆ United States │
└────────────┴────────────┴───────────────┘
For Q8:
Both Kùzu and duckdb give 1214477 as the result.
from kuzudb-study.
BTW:
Here is the DDL and copy statements that i used to load the data:
create table person(id int64, name varchar, gender varchar, birthday date, age int64, isMarried bool, primary key(id));
create table city (id int64, city varchar, state varchar, country varchar, lat float, lon float, population int, primary key(id));
create table state(id int64, state varchar, country varchar, primary key(id));
create table interest(id int64, interest varchar, primary key(id));
create table livesin(id1 int64, id2 int64);
create table cityin(id1 int64, id2 int64);
create table hasinterest(id1 int64, id2 int64);
create table follows(id1 int64, id2 int64);
copy person from '${outputPath}/nodes/persons.csv' (AUTO_DETECT TRUE);
copy city from '${outputPath}/nodes/cities.csv' (AUTO_DETECT TRUE);
copy state from '${outputPath}/nodes/states.csv' (AUTO_DETECT TRUE);
copy interest from '${outputPath}/nodes/interests.csv' (AUTO_DETECT TRUE);
copy livesin from '${outputPath}/edges/lives_in.csv' (AUTO_DETECT TRUE);
copy cityin from '${outputPath}/edges/city_in.csv' (AUTO_DETECT TRUE);
copy hasinterest from '${outputPath}/edges/interests.csv' (AUTO_DETECT TRUE);
copy follows from '${outputPath}/edges/follows.csv' (AUTO_DETECT TRUE);
Q7 in sql:
SELECT COUNT(p.id) AS numPersons, s.state AS state, s.country AS country
FROM person p
JOIN livesin pl ON p.id = pl.id1
JOIN city c ON pl.id2 = c.id
JOIN cityin ci ON c.id = ci.id1
JOIN state s ON ci.id2 = s.id
JOIN hasinterest hi ON p.id = hi.id1
JOIN interest i ON hi.id2 = i.id
WHERE p.age >= 23 AND p.age <= 30 AND s.country = 'United States'
AND LOWER(i.interest) = LOWER('photography')
GROUP BY s.state, s.country
ORDER BY numPersons DESC
LIMIT 1;
Q8 in sql:
SELECT COUNT(f.id1) AS numFollowers
FROM person p1
JOIN follows f ON p1.id = f.id1
JOIN person p2 ON f.id2 = p2.id
WHERE p1.id > p2.id;
from kuzudb-study.
Hi @acquamarin, it looks like both my databases had some inconsistent state, leading the counts to not match up. I just cleared my entire DBs and regenerated the data and reran the queries again, and it all adds up! Very relieved that this isn't a bug. 😅
Thanks a LOT for doing this check in DuckDB as well -- this helped me learn how the graph actually uses relational algebra under the hood, and it also helps to have a reputable SQL DB show the same results as the graph. It will help in a lot of future work, too!
I'm confident that this isn't a bug, so we can close this issue and move on to other things. Cheers!
Query 7:
MATCH (p:Person)-[:LivesIn]->(:City)-[:CityIn]->(s:State)
WHERE p.age >= $age_lower AND p.age <= $age_upper AND s.country = $country
WITH p, s
MATCH (p)-[:HasInterest]->(i:Interest)
WHERE lower(i.interest) = lower($interest)
RETURN count(p.id) AS numPersons, s.state AS state, s.country AS country
ORDER BY numPersons DESC LIMIT 1
State in United States with the most users between ages 23-30 who have an interest in photography:
shape: (1, 3)
┌────────────┬────────────┬───────────────┐
│ numPersons ┆ state ┆ country │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ str │
╞════════════╪════════════╪═══════════════╡
│ 170 ┆ California ┆ United States │
└────────────┴────────────┴───────────────┘
Query 7 completed in 0.012754s
Query 8:
MATCH (p1:Person)-[f:Follows]->(p2:Person)
WHERE p1.id > p2.id
RETURN count(f) as numFollowers
Number of second degree connections reachable in the graph:
shape: (1, 1)
┌──────────────┐
│ numFollowers │
│ --- │
│ i64 │
╞══════════════╡
│ 1214477 │
└──────────────┘
Query 8 completed in 0.103467s
from kuzudb-study.
Related Issues (9)
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from kuzudb-study.