Comments (17)
I'm thinking it might be best to create a new provider instead of trying to fit these changes into the existing one.
What's your opinion about that?
fwiw, I've seen comments elsewhere about breaking the type providers for different data types into different packages so maybe that's a place to start for that, otherwise just creating something new might help to avoid any legacy stuff altogether?
from fsharp.data.
I went ahead and started a JsonProvider2
implementation.
https://github.com/fsprojects/FSharp.Data/compare/main...mlaily:FSharp.Data:jsonprovider2?expand=1
Here are some progress screenshots and side-by-side comparisons with the original JsonProvider
...
An array is an array. It does not depend on the number of items it contains:
An empty string is still a string, not an optional value:
0, 1, or "yes" are not booleans:
from fsharp.data.
I just learnt about empty strings being inferred as null. Using this file as a sample:
{
"Lines": [
" **",
" *",
" ",
" ",
" ",
" ",
" *",
" *",
" ***** *",
" ******* ",
" ***** *",
" *",
" ",
" ",
" ",
" ",
" "
]
}
Now reading this file using the provider results in an array for Lines
being:
[|" **"; " *"; " *"; " *";
" ***** *"; " ******* "; " ***** *"; " *"|]
Very surprising and interesting to me that I now find this newly created issue. Are the nuances of the JsonProvider documented somewhere? For example, I don't find any mention of null
in the reference documentation.
from fsharp.data.
@goswinr It doesn't seem to help. Here is my test program and files:
#r "nuget: FSharp.Data"
open FSharp.Data
type Provider = JsonProvider<"sample.json", InferenceMode = InferenceMode.NoInference>
let sample = Provider.Load "test.json"
// show the Lines from json value and from generated type
printfn
"%A\n%A"
(sample.JsonValue.Properties() |> Array.find (fst >> ((=) "Lines")) |> snd)
(sample.Lines)
sample.json:
{
"Lines": [
""
]
}
test.json:
{
"Lines": [
" **",
" *",
" ",
" ",
" ",
" ",
" *",
" *",
" ***** *",
" ******* ",
" ***** *",
" *",
" ",
" ",
" ",
" ",
" "
]
}
It looks like the provider reads all lines, as can be seen when inspecting the JsonValue
, but "provides" only a subset.
Output:
PS C:\> dotnet fsi .\Scratch.fsx
[
" **",
" *",
" ",
" ",
" ",
" ",
" *",
" *",
" ***** *",
" ******* ",
" ***** *",
" *",
" ",
" ",
" ",
" ",
" "
]
[|" **"; " *"; " *"; " *";
" ***** *"; " ******* "; " ***** *"; " *"|]
from fsharp.data.
@mlaily I love this work! Is your JsonProvider2 working ? I guess these changes are not merging here soon. Could you publish your own nuget?
from fsharp.data.
@goswinr thanks for your interest.
This new json provider is still a prototype: there are a lot of things that are still changing or are not tested properly and can crash.
My branch is available but it's way too early for a release.
Handling null or missing values properly is proving challenging, but I'm learning a lot and making progress.
Ideally, I would also like to use System.Text.Json, but for now all my changes are based on the existing JsonValue parser.
I'll worry about how to release it when it works well enough for production use :)
from fsharp.data.
@dlidstrom does inferenceMode = InferenceMode.NoInference
help? in type MyJson = JsonProvider<myFilePath, InferenceMode = InferenceMode.NoInference>
from fsharp.data.
Yes it seems MS puts quite an effort for System.Text.Json (also Newtonsoft would probably be faster than custom implementation).
System.Text.Json source code is open (https://github.com/dotnet/runtime/tree/main/src/libraries/System.Text.Json/src/System/Text/Json) and one of the maintainers @eiriktsarpalis has been active in F# community. I don't know if there would be any quick wins without rewriting a lot, and do we even want dependencies like System.Memory to FSharp.Data (to use Spans).
from fsharp.data.
His advice was:
just use Utf8JsonReader under wraps. That way make sure it's compliant and also take advantage of all the vectorization code it's using under the hood.
This makes sense:
The current FSharp.Data parser JsonValue (see: https://github.com/fsprojects/FSharp.Data/blob/main/src/FSharp.Data.Json.Core/JsonValue.fs#L39)
would actually map the System.Text.Json Utf8JsonReader values (just testing here: https://gist.github.com/Thorium/09a67ce08adee4fcc02ec7f0048e6962).
So the parsing could be replaced without re-building the whole domain model.
And now that FSharp.Data is split to multiple NuGet packages, maybe a dependency of System.Text.Json to FSharp.Data.Json.Core would be acceptable?
from fsharp.data.
I haven't worked on it for a while but I actually have started to use System.Text.Json on my branch.
I opted to use the JsonNode
api instead of the JsonDocument
one. https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-dom
It seemed the most fitting for a balance between (user)-convenience and performance for a general use library.
(Keep in mind we want to be able to serialize json, not only deserialize it, and IMO an F# immutable wrapper over a mutable DOM is perfect. It's also very close to what is done with the XmlProvider and I think the two of them having a similar feature set is a good idea.)
The parsing somewhat works (not for all cases yet I think), but serialization is currently non-existent.
One big problem I had that I decided to ignore for now is that I didn't manage to make the dependency on System.Text.Json work everywhere when using the compiled provider.
In my tests it works with Visual Studio Code without issue, but I can't figure out how to make it work with Visual Studio! (IIRC a project using the compiled provider referencing STJ fails to compile, complaining that it doesn't find STJ or something similar)
Another thing is my JsonWrapper
object backing every provided type after erasure is currently a struct
. My reasoning is that this JsonWrapper
is only there because there are members on JsonNode
I don't want to expose, and if I could I would use JsonNode
directly as the backing type for provided types, so I want to make it have the smallest footprint possible. I'm not 100% sure it's the right call...
Anyway if you are curious I pushed my current version to my branch (still in a somewhat PoC-state):
- https://github.com/mlaily/FSharp.Data/blob/3c590f61a4a9c51d7421eb45931a590ace4ef9c3/src/FSharp.Data.Json.Core2/JsonRuntime2.fs#L22
- https://github.com/mlaily/FSharp.Data/blob/3c590f61a4a9c51d7421eb45931a590ace4ef9c3/src/FSharp.Data.Json.Core2/JsonDocument2.fs#L15
(side note: the JsonNode
design makes it hard to use properly, e.g. because it uses null
to represent json null tokens, and plenty of other strange decisions, but it still seems to me like the most fitting api to use for our case)
from fsharp.data.
I'm trying to get this working. So far:
- I updated System.Text.Json to version 8.0.0. There are more serialization options which may be helpful.
- I merged main to this branch. No conflicts, but now the build wanted me to run some code formatting tool:
dotnet fake build -t Format
- "build.cmd" failed as I didn't have .NET 6.0.5. That happens in main branch too, which is Ironic as this provider is using .NET5. "dotnet build" works but I need to use "build.cmd -t pack" to generate packet. So I copied the latest
C:\Program Files\dotnet\packs\Microsoft.NETCore.App.Ref\6.0.25
to requiredC:\Program Files\dotnet\packs\Microsoft.NETCore.App.Ref\6.0.5
and now I can run "build.cmd -t pack" - Packed. Then I unzipped
\bin\FSharp.Data.6.3.0.nupkg
and copied over my files in my Nuget cache atC:\Users\(myuser)\.nuget\packages\fsharp.data\6.3.0
. - Then, to avoid missing STJ complaint, I copied netstandard2.0 dll versions of
System.Text.Json.dll
andMicrosoft.Bcl.AsyncInterfaces.dll
from my nuget-cache to a temp-folder and then to nuget-cache both foldersFSharp.Data\6.3.0\typeproviders\fsharp41\netstandard2.0
andFSharp.Data\6.3.0\lib\netstandard2.0
(The temp is needed because NuGet-restores will remove them easily.) If those are needed in the package, there could be added some post-build deployment task.
Now I'm this far:
Error FS3021 Unexpected exception from provided type 'FSharp.Data.JsonProvider2,Sample="{\"x\": 1}"'
member 'GetMethods': The type provider 'ProviderImplementation.JsonProvider2' reported an error:
The design-time type 'System.Text.Json.Nodes.JsonNode' utilized by a type provider was not found in the target reference assembly
set '[tgt assembly FSharp.Core, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a;
---<cut list of assemblies...but not System.Text.Json here>---
]'. You may be referencing a profile which contains fewer types than those needed by the type provider you are using.
from fsharp.data.
I forgot to mention another interesting thing about this: if you dotnet build
a test project referencing FSharp.Data containing my version based on STJ, it works, and then VS is able to build it too! (at least until you clean your bin output folder...)
from fsharp.data.
As far as I know:
System.Text.Json is released on multiple frameworks:
net462 / netstandard2.0 / net6.0 / net7.0 / net8.0
TypeProvider uses the runtime-code on compile-time.
Different IDEs may use different versions (e.g. VS2022 may be running on net462, meanwhile VSCode can be running on net6 or whatever).
After compilation is the easy part, because if the customer referenced the System.Text.Json to their project (or other dependencies did it), then it probably picks the correct runtime-version anyway.
from fsharp.data.
I tried to play with this version but I think it goes very complex to try to expose the System.Text.Json types on design-time.
I think what I would rather prefer, is just to use the current "obsolete" JSON-model mapping and add 2 methods:
- For input, current
.Parse
, to have sister method like ".ParseWithSystemTextJson" - For output, current
.JsonValue
to have sister method like ".JsonValueWithSystemTextJson" (or JsonSaveOptions overload but that's messy).
And so, not to expose System.Text.Json types to outside at all. Design-time would work as it is, no System.Text.Json, and runtime user would have option to use faster serialization if needed. Keep backward compatibility. Just map current Json types to Utf8JsonReader types with simple match clause. And, actually, if even more independency wanted, the System.Text.Json.dll could be loaded with reflection, so that if it's missing, the methods would just throw an error, that way there would not be dependency needed to Nuget-packet level at all to the actual library.
from fsharp.data.
I think exposing the underlying JsonNode
is a desirable feature. The providers are not perfect, and it's very valuable as a user to be able to get full control and bypass the provider if you need it, and if the provider is based on STJ I would expect to get control of the STJ primitives, and not just some opaque wrapper. (Same as the XmlProvider)
I don't like the idea of choosing an implementation based on a method name but without actually seeing anything else change.
Also, throwing at runtime if STJ is missing seems worse than failing at build time...
I agree adding STJ as a dependency of FSharp.Data isn't that great, but I don't see a good path around that...
from fsharp.data.
hmm, maybe the ideas are both doable..
Maybe the "faster serializer" I was thinking (not as good as raw STJ as you pointed out) could be a separate nuget package depending both fsharp.data and system.text.json. That way current old provider can stay as it is, and old projects could benefit with minimal migration.
from fsharp.data.
That would probably be ideal if we could make the providers independent yeah.
Even though I'm aware some work has already been done to go in that direction, I'm not sure how easy it will be though.
from fsharp.data.
Related Issues (20)
- The design-time type 'FSharp.Data.Runtime.StructuralInference+InferenceMode' utilized by a type provider was not found HOT 8
- Support DateOnly and TimeOnly types HOT 2
- FSharp.Data.Core.Tests.CSharp.csproj: [NU1504] Duplicate 'PackageReference' items found. HOT 2
- s HOT 1
- JsonProvider - writable properties? HOT 1
- When closing VS 2022, the FSharp.Data.DesignTime stays on lock that prevents closing the editor. HOT 1
- Why are the core features (`CommonRuntime` folder) implemented in the Http project? HOT 6
- Question: an F# type provider backed by ML.NET for AI/learning HOT 1
- Requesting a new beta package published to nuget.org HOT 3
- FSharp.Data.Runtime.Utilities is not published HOT 2
- Preserving linefeed style HOT 3
- A new FSharp.Data.JsonSaveOptions value needed HOT 3
- XmlProvider generates types based on name attribute in XSD schema rather than the type attribute HOT 5
- support empty space separator HOT 1
- update pictures in docs with #r nuget reference (no .dll ref)
- Update FSharp.Core from 5 to 6 or 7, to get less memory usage and better performance HOT 4
- System.Text.Json for JsonProvider? HOT 1
- Memory Limitations in XmlProvider HOT 10
- FS3033: Multiple definition of element error.
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from fsharp.data.