Giter Site home page Giter Site logo

Comments (3)

acquamarin avatar acquamarin commented on September 24, 2024

Hi Prashanth,

To confirm the result, I rerun Q7 and Q8 using both Kùzu and duckdb. Kùzu and duckdb do generate the same result for Q7 and Q8.
For Q7:
Kùzu actually returns 170 rather than 169 for Q7 (same result as duckdb). Can you rerun your testing script?
Query 7:

        MATCH (p:Person)-[:LivesIn]->(:City)-[:CityIn]->(s:State)
        WHERE p.age >= $age_lower AND p.age <= $age_upper AND s.country = $country
        WITH p, s
        MATCH (p)-[:HasInterest]->(i:Interest)
        WHERE lower(i.interest) = lower($interest)
        RETURN count(p.id) AS numPersons, s.state AS state, s.country AS country
        ORDER BY numPersons DESC LIMIT 1
    

            State in United States with the most users between ages 23-30 who have an interest in photography:
shape: (1, 3)
┌────────────┬────────────┬───────────────┐
│ numPersons ┆ state      ┆ country       │
│ ---        ┆ ---        ┆ ---           │
│ i64        ┆ str        ┆ str           │
╞════════════╪════════════╪═══════════════╡
│ 170        ┆ California ┆ United States │
└────────────┴────────────┴───────────────┘

For Q8:
Both Kùzu and duckdb give 1214477 as the result.

from kuzudb-study.

acquamarin avatar acquamarin commented on September 24, 2024

BTW:
Here is the DDL and copy statements that i used to load the data:

create table person(id int64, name varchar, gender varchar, birthday date, age int64, isMarried bool, primary key(id));
create table city (id int64, city varchar, state varchar, country varchar, lat float, lon float, population int, primary key(id));
create table state(id int64, state varchar, country varchar, primary key(id));
create table interest(id int64, interest varchar, primary key(id));
create table livesin(id1 int64, id2 int64);
create table cityin(id1 int64, id2 int64);
create table hasinterest(id1 int64, id2 int64);
create table follows(id1 int64, id2 int64);
copy person from '${outputPath}/nodes/persons.csv' (AUTO_DETECT TRUE);
copy city from '${outputPath}/nodes/cities.csv' (AUTO_DETECT TRUE);
copy state from '${outputPath}/nodes/states.csv' (AUTO_DETECT TRUE);
copy interest from '${outputPath}/nodes/interests.csv' (AUTO_DETECT TRUE);
copy livesin from '${outputPath}/edges/lives_in.csv' (AUTO_DETECT TRUE);
copy cityin from '${outputPath}/edges/city_in.csv' (AUTO_DETECT TRUE);
copy hasinterest from '${outputPath}/edges/interests.csv' (AUTO_DETECT TRUE);
copy follows from '${outputPath}/edges/follows.csv' (AUTO_DETECT TRUE);

Q7 in sql:

SELECT COUNT(p.id) AS numPersons, s.state AS state, s.country AS country
FROM person p
JOIN livesin pl ON p.id = pl.id1
JOIN city c ON pl.id2 = c.id
JOIN cityin ci ON c.id = ci.id1
JOIN state s ON ci.id2 = s.id
JOIN hasinterest hi ON p.id = hi.id1
JOIN interest i ON hi.id2 = i.id
WHERE p.age >= 23 AND p.age <= 30 AND s.country = 'United States'
    AND LOWER(i.interest) = LOWER('photography')
GROUP BY s.state, s.country
ORDER BY numPersons DESC
LIMIT 1;

Q8 in sql:

SELECT COUNT(f.id1) AS numFollowers
FROM person p1
JOIN follows f ON p1.id = f.id1
JOIN person p2 ON f.id2 = p2.id
WHERE p1.id > p2.id;

from kuzudb-study.

prrao87 avatar prrao87 commented on September 24, 2024

Hi @acquamarin, it looks like both my databases had some inconsistent state, leading the counts to not match up. I just cleared my entire DBs and regenerated the data and reran the queries again, and it all adds up! Very relieved that this isn't a bug. 😅

Thanks a LOT for doing this check in DuckDB as well -- this helped me learn how the graph actually uses relational algebra under the hood, and it also helps to have a reputable SQL DB show the same results as the graph. It will help in a lot of future work, too!

I'm confident that this isn't a bug, so we can close this issue and move on to other things. Cheers!

Query 7:
 
        MATCH (p:Person)-[:LivesIn]->(:City)-[:CityIn]->(s:State)
        WHERE p.age >= $age_lower AND p.age <= $age_upper AND s.country = $country
        WITH p, s
        MATCH (p)-[:HasInterest]->(i:Interest)
        WHERE lower(i.interest) = lower($interest)
        RETURN count(p.id) AS numPersons, s.state AS state, s.country AS country
        ORDER BY numPersons DESC LIMIT 1
    

            State in United States with the most users between ages 23-30 who have an interest in photography:
shape: (1, 3)
┌────────────┬────────────┬───────────────┐
│ numPersons ┆ state      ┆ country       │
│ ---        ┆ ---        ┆ ---           │
│ i64        ┆ str        ┆ str           │
╞════════════╪════════════╪═══════════════╡
│ 170        ┆ California ┆ United States │
└────────────┴────────────┴───────────────┘
            
Query 7 completed in 0.012754s

Query 8:
 
        MATCH (p1:Person)-[f:Follows]->(p2:Person)
        WHERE p1.id > p2.id
        RETURN count(f) as numFollowers
    
Number of second degree connections reachable in the graph:
shape: (1, 1)
┌──────────────┐
│ numFollowers │
│ ---          │
│ i64          │
╞══════════════╡
│ 1214477      │
└──────────────┘
Query 8 completed in 0.103467s

from kuzudb-study.

Related Issues (9)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.