Giter Site home page Giter Site logo

Comments (9)

marekro avatar marekro commented on July 22, 2024 1

Just commenting to document I am seeing the same problem with Django 4.2 and django_cte manifesting itself as
django.db.utils.OperationalError: (1054, "Unknown column 'cte.id' in 'on clause'")
Current workaround is to stay with Django < 4.2, I am hesitating to apply the workaround applied by bigBrain1901 here, having multiple places where CTEs are used (exactly same scenarios as detailed above). Any indication when a version of django_cte would be available, fixing this issue?

from django-cte.

blissdev avatar blissdev commented on July 22, 2024 1

My company is hitting this issue as well. What is needed for review on #78?

from django-cte.

millerdev avatar millerdev commented on July 22, 2024

Recording some notes, mostly thinking out loud.

Unfortunately the Query object appears to be unaware of column aliases added by SQLCompiler when with_col_aliases=True. In the test failure above the bad column references appeared in JOIN clauses, but this could be a problem anywhere a cte.col.some_name (an instance of CTEColumn) is referenced.

In some cases CTEColumn.as_sql is passed the compiler that added the alias, and in other cases it is not. Both scenarios appear in TestCTE.test_named_ctes.

Scenario 1: compiler adding the alias is passed to CTEColumn.as_sql

def make_root_mapping(rootmap):
    return Region.objects.filter(
        parent__isnull=True
    ).values(
        "name",  # SQLCompiler adds an alias to this column because of `with_col_aliases=True`
        root=F("name"),
    ).union(
        # Reference to rootmap.col.name, which is aliased by SQLCompiler.
        # Here would be possible for <CTEColumn rootmap.name> to get the alias from
        # the compiler, although it would require a deep reach.
        rootmap.join(Region, parent=rootmap.col.name).values(
            "name",
            root=rootmap.col.root,
        ),
        all=True,
    )

Scenario 2: compiler adding the alias is not passed to CTEColumn.as_sql

# Reference to rootmap.col.name, which has been aliased by SQLCompiler.
# There is no easy way for CTEColumn to get the alias because it does not have
# a reference to the compiler that created the alias in this context.
rootmap.join(Order, region_id=rootmap.col.name)

from django-cte.

millerdev avatar millerdev commented on July 22, 2024

I don't have a timeline. Pull requests are welcome.

from django-cte.

bernd-wechner avatar bernd-wechner commented on July 22, 2024

I just tried following the README, and running tests and indeed it fails with:

======================================================================
ERROR: test_named_ctes (tests.test_cte.TestCTE)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/.../.venvs/cte/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/.../.venvs/cte/lib/python3.10/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
    return super().execute(query, params)
sqlite3.OperationalError: no such column: rootmap.name

and I can see:

$ pip freeze | grep Django
Django==4.2.3

As an aside nose is also broken with Python 3.10 and so: #73

from django-cte.

bernd-wechner avatar bernd-wechner commented on July 22, 2024

Reading @millerdev's notes to date, I took a look at the SQL generated in test_named_ctes.

WITH RECURSIVE "rootmap" AS
  (SELECT "region"."name" AS "col1",
          "region"."name" AS "root"
   FROM "region"
   WHERE "region"."parent_id" IS NULL
   UNION ALL SELECT "region"."name" AS "col1",
                    "rootmap"."root" AS "root"
   FROM "region"
   INNER JOIN "rootmap" ON "region"."parent_id" = ("rootmap"."name")),
               "totals" AS
  (SELECT "rootmap"."root" AS "root",
          COUNT("orders"."id") AS "orders_count",
          SUM("orders"."amount") AS "region_total"
   FROM "orders"
   INNER JOIN "rootmap" ON "orders"."region_id" = ("rootmap"."name")
   GROUP BY 1)
SELECT "region"."name",
       "region"."parent_id",
       "totals"."orders_count" AS "orders_count",
       "totals"."region_total" AS "region_total"
FROM "region"
INNER JOIN "totals" ON "region"."name" = ("totals"."root")

And I suspect the issue (am learning here) is that pre-Django4.2 it was perhaps:

WITH RECURSIVE "rootmap" AS
  (SELECT "region"."name",
          "region"."name" AS "root"
   FROM "region"
   WHERE "region"."parent_id" IS NULL
   UNION ALL SELECT "region"."name" AS,
                    "rootmap"."root" AS "root"
   FROM "region"
   INNER JOIN "rootmap" ON "region"."parent_id" = ("rootmap"."name")),
               "totals" AS
  (SELECT "rootmap"."root" AS "root",
          COUNT("orders"."id") AS "orders_count",
          SUM("orders"."amount") AS "region_total"
   FROM "orders"
   INNER JOIN "rootmap" ON "orders"."region_id" = ("rootmap"."name")
   GROUP BY 1)
SELECT "region"."name",
       "region"."parent_id",
       "totals"."orders_count" AS "orders_count",
       "totals"."region_total" AS "region_total"
FROM "region"
INNER JOIN "totals" ON "region"."name" = ("totals"."root")

That is visible in rootmap.query and we can see: rootmap.query.alias_cols is True and rootmap.query.default_cols is False

And the challenge as I read the out-loud thinking is that rootmap.col.name needs to resolve to "col1" here not "name" .

rootmap.query knows about "col1" but by the looks of it that is a standard django.db.models.query.QuerySet().query but the Django code gets a tad hard to understand around here but it seems to call on django.db.models.sql.query.Query which builds that SQL. More reading needed I guess to work out what's going on, but somewhere around here those aliases must be generated and I'd like to image the mapping retained somehow. As surely all the standard queryset methods that can take field names as arguments (like .filter() for example) need to map those field names to the same aliases later as SQL is generated.

Anyhow, I too am just thinking out load, and leaving some notes.

from django-cte.

pgammans avatar pgammans commented on July 22, 2024

More notes:

The breaking change was added in django commit 70499b2 and related to 8c3046d
This basically ensure all columns have aliases fixing there issues ref#28333 but this also triggers django-cte queries.

You can just add the following to CTEQueryCompiler.as_sql to fix all the tests but i have the feeling this is not a good idea
ie:

class CTEQueryCompiler(SQLCompiler):
    def as_sql(self, *args, **kwargs):
        with_col_aliases = kwargs.pop('with_col_aliases', None)
        ....

I have a set of fixes that, improves alias support to the cte columns and link them so changing the cte alias changes the col alias This is needed so that the joins and where clauses are updated.
Unfortunately this seams to fail as as the outer select is built incorrectly, and it will need more work. The outer select is now incorrect as the CTE's columns are names col1, col2 etc.

from django-cte.

pgammans avatar pgammans commented on July 22, 2024

To prevent anyone starting afresh.
I have a set of commits that fix this, but atm its based on #75 so i need to re-base it. It will need so work but i will hopefully submit a draft PR in the next few days.

Whilst it passes all the check the test coverage isn't as good.

Ideally if someone what to write a test against where the CTE query triggers any of the related django bugs, i would be grateful. I yet to identify them but here are two that i found so far,

I need to ensure that the new changes dont revert these fixes, the last one is significant as i had to manipulate the col aliases to ensure they match the column names.
ie

WITH RECURSIVE "rootmap" AS
  (SELECT "region"."name" AS "name",
          "region"."name" AS "root"
   FROM "region"
   WHERE "region"."parent_id" IS NULL
   UNION ALL SELECT "region"."name" AS "name",
                    "rootmap"."root" AS "root"
   FROM "region"
   ....

from django-cte.

afpekmezci avatar afpekmezci commented on July 22, 2024

the solution of removing with_col_aliases may cause side effects if union and order_by used together.

# simple model

class Population(models.Model):
    class Meta:
        db_table = "population"

    year = models.PositiveIntegerField()
    population = models.BigIntegerField()

    objects = CTEManager()
def get_population_queryset(
    start_year: int,
) -> QuerySet:
    population_queryset_one = Population.objects.filter(
        year=start_year,
    ).order_by("year")
    population_queryset_second = Population.objects.filter(
        year=start_year + 10,
    ).order_by("year")
    union_queryset = population_queryset_one.union(
        population_queryset_second,
    ).order_by("year")
    print(
        '--DJANGO QUERY : \n',
        union_queryset.query
    )
    return union_queryset

def calculate_population(
        start_year: int,
):
    queryset = get_population_queryset(
        start_year=start_year,
    )
    cte = With(
        queryset,
        name="population_cte",
    )
    print(
        '--CTE (WITH) QUERY : \n',
        cte.query
    )
    populations = (
        cte.queryset().with_cte(
            cte
        ).order_by('year')
    )
    print(
        '--POPULATIONS QUERY : \n',
        populations.query
    )
    print(populations)
--DJANGO QUERY :
(
    SELECT
        "population"."id" AS "col1",
        "population"."year" AS "col2",
        "population"."population" AS "col3"
    FROM
        "population"
    WHERE
        "population"."year" = 1976
    ORDER BY
        "population"."year" ASC
)
UNION
    (
        SELECT
            "population"."id" AS "col1",
            "population"."year" AS "col2",
            "population"."population" AS "col3"
        FROM
            "population"
        WHERE
            "population"."year" = 1986
        ORDER BY
            "population"."year" ASC
    )
ORDER BY
    "col2" ASC 

--CTE (WITH) QUERY :
(
    SELECT
        "population"."id" AS "col1",
        "population"."year" AS "col2",
        "population"."population" AS "col3"
    FROM
        "population"
    WHERE
        "population"."year" = 1976
    ORDER BY
        "population"."year" ASC
)
UNION
    (
        SELECT
            "population"."id" AS "col1",
            "population"."year" AS "col2",
            "population"."population" AS "col3"
        FROM
            "population"
        WHERE
            "population"."year" = 1986
        ORDER BY
            "population"."year" ASC
    )
ORDER BY
    "col2" ASC

--POPULATIONS QUERY :
WITH RECURSIVE "population_cte" AS (
    (
        SELECT
            "population"."id",
            "population"."year",
            "population"."population"
        FROM
            "population"
        WHERE
            "population"."year" = 1976
        ORDER BY
            "population"."year" ASC
    )
    UNION
        (
            SELECT
                "population"."id",
                "population"."year",
                "population"."population"
            FROM
                "population"
            WHERE
                "population"."year" = 1986
            ORDER BY
                "population"."year" ASC
        )
    ORDER BY
        "col2" ASC
)
SELECT
    "population_cte"."id",
    "population_cte"."year",
    "population_cte"."population"
FROM
    "population_cte"
ORDER BY
    "population_cte"."year" ASC

as you can see on the POPULATIONS QUERY, the UNION part of the query is trying to call ORDER BY "col2" ASC, which has been removed after applying with_cte method.

from django-cte.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.