Giter Site home page Giter Site logo

Comments (8)

aarondl avatar aarondl commented on April 27, 2024

Hi @mvdan. First let me apologize for not getting back to you quicker. I saw this issue right away and then promptly got distracted by other things.

I'm going to respond to both issues here since they're essentially the same kind of problem. And the first thing I want to say is that I agree with you that there are inefficiencies in the in-memory representation. However it's a trade-off on multiple levels. The reason they behave like they do currently is that was the path of least resistance to get it working for these cases. Let me elaborate on the reasons:

UUID:

  1. [16]byte is not supported by the lib/pq driver (which we shamelessly assume everyone uses). Meaning we would have to have created another type and the less types we create the better since we then become the maintainers of those types. I know we have a lot already but they were all absolutely necessary, a UUID type is not.
  2. [16]byte is not supported nicely by things like JSON and we'd be forced to implement MarshalJSON, MarshalTOML, TextMarshal, BinaryMarshal etc etc and support all that too.

Now these things could be circumvented by using []byte but then you've lost a lot of the reason for making the change since you're saving some 3 integers or so by not having the slice header. And recall that you'll always have multiple allocations of this anyway since you'll be forced into converting to slice all over the place (as usual when using non-slices).

Char:

  1. byte is not a supported type for the lib/pq library to scan a char(N) into, because it's a variable width type. So in order to accomplish this one we'd not only need a custom byte type, but we'd have to dig into another level of metadata for field types inside databases (the length of variable width columns) which we'd like to avoid if possible since it puts more requirements on information_schema support and may impact later additions of say SQLite.

In summary the positive side of these changes are:

  1. Leaner (and truer) in-memory representation of these types

The downsides are:

  1. More custom SQLBoiler types for users to deal with, rather than pure Go types.
  2. Array usage in Go, which complicates usage of the struct fields, with no usability gains since you're still converting to/from a separate UUID package
  3. Unpleasant representations in other serialization formats that we currently try to support.
  4. Requires additional metadata from the information_schema

I recognize the inefficiencies, and hopefully you recognize the tradeoffs that I've outlined and that neither is an ideal state. I don't want to say we'll never do this because this kind of inefficiency makes me really sad but I think we need some more data on whether or not this amounts to being a bottleneck in real-world applications. I'm sure there's some awful cases like a table with 80 char(1) columns... but I'd like to see something in the wild where this is hurting enough to make the change. I think rather than have this, I'd prefer a mechanism whereby users can select the types generated via config files - but that has it's own downsides in terms of complexity.

Thoughts?

from sqlboiler.

mvdan avatar mvdan commented on April 27, 2024

[16]byte is not supported by the lib/pq driver

Ah, didn't know that.

[16]byte is not supported nicely by things like JSON and we'd be forced to implement MarshalJSON, MarshalTOML, TextMarshal, BinaryMarshal etc etc and support all that too.

JSON/TOML and the rest would be covered by the Text one. And I don't know why Binary would be important here - currently it's just the string version, so you could just use Text as well.

Now these things could be circumvented by using []byte but then you've lost a lot of the reason for making the change since you're saving some 3 integers or so by not having the slice header.

Remember that the []byte would contain 16 bytes, not the 36 bytes (chars) that it takes when stringified. But yes, it would be a small gain to go from string to []byte.

And recall that you'll always have multiple allocations of this anyway since you'll be forced into converting to slice all over the place (as usual when using non-slices).

Depends on the use-case. My understanding is that right now there is one stringification and allocation when gathering the fields from postgres. If we used [16]byte, this would not happen upfront and it would be up to the user to do if needed.

In summary the positive side of these changes are:

  1. Leaner (and truer) in-memory representation of these types

I would add the ability to use go.uuid a lot easier. Right now, you have to use String() everywhere.

The downsides are:

  1. More custom SQLBoiler types for users to deal with, rather than pure Go types.

I understand that's only because lib/pq does not support it.

  1. Array usage in Go, which complicates usage of the struct fields, with no usability gains since you're still converting to/from a separate UUID package

Complicates how?

There would be usability gains. If go.uuid defines type UUID [16]byte, you can assign and convert to [16]byte easily. If not, you have to do str = uuid.String() everywhere.

  1. Unpleasant representations in other serialization formats that we currently try to support.

If this is a problem, you could always fall back to the text marshaling which is what is currently done, right?

  1. Requires additional metadata from the information_schema

I don't know anything about that :)

I'm not going to insist, so feel free to close this issue if you want to decline this request/proposal. This was more of a "I wonder why they do it this way" issue.

I have a different view of the ups and downs of [16]byte vs string, but I do see that there are still disadvantages. And since you guys know more about sqlboiler and SQL in general than me, I'll leave it to your judgement.

Thanks again!

from sqlboiler.

mvdan avatar mvdan commented on April 27, 2024

Also, since you merged the "char" issue into this one.

  1. byte is not a supported type for the lib/pq library to scan a char(N) into, because it's a variable width type. So in order to accomplish this one we'd not only need a custom byte type, but we'd have to dig into another level of metadata for field types inside databases (the length of variable width columns) which we'd like to avoid if possible since it puts more requirements on information_schema support and may impact later additions of say SQLite.

I understood that "char" and char(1) were different things. Like byte and [1]byte are in Go.

If "char" really is a byte in postgres, then I think what I said makes sense. If "char" is just a synonim for char(1), then I misread the documentation and it's nonsense.

from sqlboiler.

aarondl avatar aarondl commented on April 27, 2024

You're right about serialization I THINK. Having said that I've never had the TextMarshal work for JSON encoding for me. Not sure why but I'm sure I tried several times. Going to test this again :)
The one point I made that was sort of lost here I think was any use of the [16]byte requires a slice struct to be created regardless, since no method in Go takes [n]byte you may end up doing something like this: strings.ToUpper(uuid[:]), but this is also assuming you actually need to use it in some function you didn't write, and since it's a uuid you probably don't. This is why I mentioned extra allocs. It's possible they live on the stack though. I also considered this a "complication" hence my point where I said "Array usage in Go, which complicated the usage of struct fields". Arrays in Go are more cumbersome to use than a string or []byte.
lib/pq not supporting it does force us to make additional types yes, you're right.
I hadn't realized that a type alias for [16]byte would allow assignment, somehow I thought you still needed to have an explicit conversion.
Just so you're aware we harvest metadata from columns from the information_schema parts, this is an SQL Standard and if we're to maintain compatibility across multiple database engines, we need to make sure they all support the attributes that we ask for. It -is- a standard but like most things it's half implemented by every sql database :|
You're also absolutely right about char, thanks very much for pointing it out! This may change things for us depending on what's in the information_schema tables.

Anyway. I'm going to think about this a little more with renewed enthusiasm about char. It's really not a bad idea. Thanks for all the discussion :)

from sqlboiler.

aarondl avatar aarondl commented on April 27, 2024

Hey @mvdan. We've been busy on other projects so let this issue drift a little.

Did a bit more research on this one, we're good to go on all fronts. Aside from it being a breaking and just an optimization forgive me for a little more drift :) The facts we gathered were all correct -and- I found that the TextMarshaler problems I've seen in the past were probably my own doing. The information_schema data is also sufficient.

We'll get to this one.

from sqlboiler.

mvdan avatar mvdan commented on April 27, 2024

Awesome! I assume you're talking about "char" - if so, could you change the title?

from sqlboiler.

aarondl avatar aarondl commented on April 27, 2024

I still want the discussion searchable so that we can point to it later. Looked at UUID again and found several other negatives, so definitely not going to do that unless there's solid data and people having trouble keeping a lot of these in ram for long periods of time. Starting on "char" implementation.

from sqlboiler.

aarondl avatar aarondl commented on April 27, 2024

https://github.com/vattle/sqlboiler/releases/tag/v2.1.0

from sqlboiler.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.