
streamparser-json's People

Contributors

callumlocke, dependabot[bot], drawmindmap, juanjodiaz, knownasilya, miunau, mrazauskas, slevy85

streamparser-json's Issues

Stream incomplete values

Hey @juanjoDiaz

Is it possible to stream incomplete values? Currently both onToken and onValue only provide complete tokens or attributes. However, I want to be able to easily access a given string value before it is fully complete. It would need an artificial closing quote, dynamically generated until the real closing quote finally arrives.

Any idea how to do it?
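Purely for illustration, here is a sketch of the kind of API this request implies. Both the option name and the partial flag below are hypothetical, not something the library is known to expose today:

// Hypothetical sketch only: `emitPartialTokens` and the `partial` flag are made up
// to illustrate the requested behaviour, not taken from the library.
const { Tokenizer } = require('@streamparser/json');

const tokenizer = new Tokenizer({ emitPartialTokens: true }); // hypothetical option
tokenizer.onToken = ({ value, partial }) => {
  if (partial) {
    // The incomplete string would be surfaced as if an artificial closing quote
    // had been appended; a later callback delivers the complete value.
    console.log(`partial value so far: "${value}"`);
  }
};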

Tokenizer token offset is incorrect

The Tokenizer reports the wrong offset for tokens that follow a string token containing escaped special characters. The difference from the expected offset matches the number of such escaped characters in the input string.

Some examples

This is the expected behaviour

const streamParser = require('@streamparser/json'); // import added for completeness

test('testing string 1', async () => {
  const json = JSON.stringify({ "abcd": "abcd" });
  console.log('raw string length: ', json.length);
  const tokenizer = new streamParser.Tokenizer();
  tokenizer.onToken = (token) => console.log(token);
  tokenizer.write(json);
  console.log(json[7]);
});
  
// raw string length:  15
// { token: 0, value: '{', offset: 0 }  
// { token: 9, value: 'abcd', offset: 1 }  
// { token: 4, value: ':', offset: 7 }  // Using this token as the reference
// { token: 9, value: 'abcd', offset: 8 }  
// { token: 1, value: '}', offset: 14 }  
// :  // We print the expected character

Using a single \t special character

test('testing string 2', async () => {  
  const json = JSON.stringify({"ab\t": "abcd"});  
  console.log('raw string length: ', json.length)  
  const tokenizer = new streamParser.Tokenizer()  
  tokenizer.onToken = (token) => console.log(token);  
  tokenizer.write(json)  
  console.log(json[6])  
})  
  
// raw string length:  15 // Same length as above
// { token: 0, value: '{', offset: 0 }  
// { token: 9, value: 'ab\t', offset: 1 }  
// { token: 4, value: ':', offset: 6 } // Off by 1 now
// { token: 9, value: 'abcd', offset: 7 }  
// { token: 1, value: '}', offset: 13 }  
// " // This isn't the character we expected

The deviation from the expected offset grows with the number of escaped special characters:

test('testing string 3', async () => {  
  const json = JSON.stringify({"\t\n": "abcd"});  
  console.log('raw string length: ', json.length)  
  const tokenizer = new streamParser.Tokenizer()  
  tokenizer.onToken = (token) => console.log(token);  
  tokenizer.write(json)  
  console.log(json[5])  
})  
  
// raw string length:  15  // Same length
// { token: 0, value: '{', offset: 0 }  
// { token: 9, value: '\t\n', offset: 1 }  
// { token: 4, value: ':', offset: 5 }  // Off by 2 now
// { token: 9, value: 'abcd', offset: 6 }  
// { token: 1, value: '}', offset: 12 }  
// n

My expectation is that the offset should be relative to the raw input string. I understand that this is a niche use case, but is this something you can fix?
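To make the expectation concrete, here is a small illustration (not library code) of why the ':' in the second example should be reported at offset 7 rather than 6: the escape sequence occupies two characters in the serialized input.

// Illustration only: the raw JSON for test 2 contains a two-character escape
// sequence (backslash + 't'), so offsets relative to the raw input shift by one.
const json = JSON.stringify({ "ab\t": "abcd" }); // raw text: {"ab\t":"abcd"}
console.log(json.indexOf(':')); // 7 -> the offset relative to the raw input
console.log(json[7]);           // ':' as expected; the tokenizer reports 6 instead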

Cannot import package, incorrect exports

I tried to use the package in a vite project and I get the following error:

[vite] Internal server error: Failed to resolve entry for package "@streamparser/json". The package may have incorrect main/module/exports specified in its package.json.

It seems like the "module" key in package.json points to a file that does not exist: ./dist/mjs/index.js.
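For reference, this is roughly the kind of main/module/exports mapping the error message is referring to. The dist paths below are illustrative (taken from the error text), not copied from the published package.json:

{
  "main": "./dist/cjs/index.js",
  "module": "./dist/mjs/index.js",
  "exports": {
    ".": {
      "import": "./dist/mjs/index.js",
      "require": "./dist/cjs/index.js"
    }
  }
}

Whichever files these fields point at, they need to actually exist in the published tarball for bundlers like Vite to resolve the entry.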

Cannot find module '@streamparser/json/index' or its corresponding type declarations

After upgrading to the latest version of all packages, I'm getting this type error:

../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@types/json2csv__plainjs/src/StreamParser.d.ts:1:58 - error TS2307: Cannot find module '@streamparser/json/index' or its corresponding type declarations.

1 import { Tokenizer, TokenizerOptions, TokenParser } from '@streamparser/json/index';
                                                           ~~~~~~~~~~~~~~~~~~~~~~~~~~


Found 1 error in ../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@types/json2csv__plainjs/src/StreamParser.d.ts:1

Versions:

"@json2csv/plainjs": "6.1.3",
"@types/json2csv__plainjs": "6.1.0",
"@streamparser/json": "0.0.14"

Replace nested objects and arrays

@juanjoDiaz ,

I know this is probably out of the scope of this library, but do you think it is possible to adjust the code so that nested objects and arrays are omitted?

I have a large json object that looks like this:

{
  "cards": [
    {
      "id": 1,
      "name": "Some card name"
    },
    {},
    {}
  ],
  "meta": {
    "updated": "2022-12-31"
  }
}

The cards array is very large, so that it won't fit into memory on its own. (Even when parsing in chunks)
I'd like to get all objects as flat objects that replace nested arrays with "[...]" and nested objects with "{...}".
The result would look like this:

{
  "cards": "[...]",
  "meta": "{...}"
},
{
  "id": 1,
  "name": "Some card name"
},
{},
{},
{
  "updated": "2022-12-31"
}

I'm aware that this is probably out of scope for this repo, but I would like to apply the changes in my fork.
Can you point me in the right direction: where should I look, and where would those changes fit best?
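For what it's worth, a possible approach without forking, sketched under the assumption that JSONParser supports `paths` (JSONPath-like selectors) and `keepStack`; the exact onValue signature depends on the library version:

// Sketch only: select the sub-values directly so the full "cards" array is never
// materialised in memory as a whole.
const { JSONParser } = require('@streamparser/json');

const parser = new JSONParser({
  paths: ['$.cards.*', '$.meta'], // emit each card and the meta object separately
  keepStack: false,               // don't retain emitted children inside their parents
});

parser.onValue = (value) => {
  console.log(value); // each card, then the meta object, one at a time
};

This doesn't produce the literal "[...]"/"{...}" placeholders, but it avoids holding the whole array in memory, which seems to be the underlying goal.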

Best regards and thanks a lot for the awesome parser :-)

Error importing `JSONParser`

In our code base, when we try to import JSONParser with import { JSONParser } from '@streamparser/json'; we get the following error: TS2307: Cannot find module '@streamparser/json' or its corresponding type declarations.

Currently, as a workaround, we are importing it with const jsonStreamParsers = require('@streamparser/json');, which works fine.

My question is: do you have any insight into why the usual import is failing? Or is this likely a problem with our project's configuration in some way?

Thanks.

error TS7029: Fallthrough case in switch

I just tried migrating from the old json2csv package to the new one, and now I'm getting type errors from within node_modules:

../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@streamparser/json/src/tokenizer.ts:374:11 - error TS7029: Fallthrough case in switch.

374           case TokenizerStates.STRING_UNICODE_DIGIT_4:
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@streamparser/json/src/tokenizer.ts:500:11 - error TS7029: Fallthrough case in switch.

500           case TokenizerStates.NUMBER_AFTER_E:
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@streamparser/json/src/tokenizer.ts:697:18 - error TS6133: 'parsedToken' is declared but its value is never read.

697   public onToken(parsedToken: ParsedTokenInfo): void {
                     ~~~~~~~~~~~

../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@streamparser/json/src/tokenparser.ts:324:18 - error TS6133: 'parsedElementInfo' is declared but its value is never read.

324   public onValue(parsedElementInfo: ParsedElementInfo): void {
                     ~~~~~~~~~~~~~~~~~


Found 4 errors in 2 files.

Errors  Files
     3  ../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@streamparser/json/src/tokenizer.ts:374
     1  ../../common/temp/node_modules/.pnpm/@[email protected]/node_modules/@streamparser/json/src/tokenparser.ts:324

Message parser should be able to support arbitrary whitespace such as '\n', '\t', '\r', and ' ' within and between messages

How do I configure the stream parser to be able to discard whitespace between JSON messages?

I am facing an issue while implementing an RPC system, which requires JSONParser to parse binary input into JSON for transmission over RPC. The issue is that input streams can be separated by a variety of whitespace characters, while the current separator implementation in the library only supports a single separator.

As things stand, we have to keep our input streams in the following form:
{...message}{...message}.

However, to improve readability, we would like to be able to add whitespace between messages, as demonstrated below (a workaround sketch follows the examples).

// Normal separation
{...message}{...message}{...message}

// Spaces
{...message} {...message}    {...message}

// New lines
{...message}
{...message}
{...message}

// Any combination
{...message}                                  {...message}
                       {...message}
{...message}
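Until something like this is supported natively, one hedged workaround sketch is to pre-filter the character stream, dropping whitespace only while we are between top-level messages, and feed the result to the parser. The depth/string tracking below is hand-rolled and assumes every message is an object or array; the empty-string separator in the usage note is also an assumption about the parser API:

// Sketch only: drops whitespace between top-level messages before it reaches the
// parser. Whitespace inside messages (including inside strings) is left untouched.
class InterMessageWhitespaceFilter {
  constructor(parser) {
    this.parser = parser;
    this.depth = 0;        // nesting depth of {} / []
    this.inString = false; // currently inside a JSON string?
    this.escaped = false;  // previous character was a backslash inside a string
  }

  write(chunk) {
    let out = '';
    for (const ch of chunk) {
      if (this.inString) {
        out += ch;
        if (this.escaped) this.escaped = false;
        else if (ch === '\\') this.escaped = true;
        else if (ch === '"') this.inString = false;
        continue;
      }
      if (this.depth === 0 && /\s/.test(ch)) continue; // between messages: drop it
      if (ch === '"') this.inString = true;
      else if (ch === '{' || ch === '[') this.depth += 1;
      else if (ch === '}' || ch === ']') this.depth -= 1;
      out += ch;
    }
    if (out) this.parser.write(out);
  }
}

// Usage sketch:
// const parser = new JSONParser({ separator: '' }); // assumption: '' allows back-to-back messages
// const filtered = new InterMessageWhitespaceFilter(parser);
// filtered.write('{...message}   \n\t{...message}');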

ignore BOM

I ran into an issue where the tokenizer is choking on files with a BOM. This throws with Error: Unexpected "ï" at position "0" in state START.

I was able to patch tokenizer with a quick and dirty addition of TokenizerStates.BOM. Unfortunately I don't have time to submit a formal PR but wanted to raise the issue for tracking.
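Until the tokenizer handles it natively, a hedged workaround sketch is to strip a UTF-8 BOM from the first chunk before it reaches the tokenizer:

// Workaround sketch: remove a UTF-8 BOM from the very first chunk so the
// tokenizer never sees it. Handles both string and byte input.
function stripBom(chunk, isFirstChunk) {
  if (!isFirstChunk) return chunk;
  if (typeof chunk === 'string') return chunk.replace(/^\uFEFF/, '');
  // Uint8Array / Buffer input: drop the EF BB BF prefix if present.
  if (chunk.length >= 3 && chunk[0] === 0xef && chunk[1] === 0xbb && chunk[2] === 0xbf) {
    return chunk.subarray(3);
  }
  return chunk;
}

// tokenizer.write(stripBom(firstChunk, true));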

Keep jsonPath to each object

Hey @juanjoDiaz ,
could the Tokenizer be extended to keep the jsonPath of each emitted object?

something like this:

jsonparser.onValue = (value, key, parent, stack, jsonPath) => {
   console.log(jsonPath);
   //e.g. ['someProp', 0, 'someProp',...]
};

What would be the right place to look at?
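For what it's worth, something close to this may already be derivable from the arguments onValue receives; a hedged sketch, assuming each entry on stack exposes the key it was reached through:

// Sketch only: builds a jsonPath-like array from the existing callback arguments,
// assuming each stack entry carries a `key` property (the root entry has none).
jsonparser.onValue = (value, key, parent, stack) => {
  const jsonPath = stack
    .map((ancestor) => ancestor.key)
    .filter((k) => k !== undefined)
    .concat(key === undefined ? [] : [key]);
  console.log(jsonPath); // e.g. ['someProp', 0, 'someProp', ...]
};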

Thanks for this awesome parser!
