Comments (24)
Could you please supply a reproducible example? Would you expect the json to be parsed into an R object?
from rpostgres.
Kinda hard to setup a nice reproducible example because it depends on your SQL database.
All I am saying is: when using RPostgreSQL's dbGetQuery on a table that contains data of type json you get the warning mentioned above. Try to query any Postgres column of type json in R to reproduce this.
Currently a popular workaround is to: SELECT col_that_is_json::text FROM table_that_contains_json
which casts json to text. So since you guys are working on a new postgres package, I thought true json support might be a good idea.
from rpostgres.
Perhaps you could create such a database with SQL code... I have limited capability to learn Postgres specific SQL so if you want me to look into this more, you'll need to include a basic example.
from rpostgres.
Sure, here's a minimal example (find the elaborate example here: https://github.com/mbannert/timeseriesdb/tree/master/inst/sql):
CREATE TABLE timeseries_json (ts_key varchar primary key,
ts_data json);
INSERT INTO timeseries_json (ts_key,ts_data) VALUES ('ts2','{"ts_key":"ts2","ts_data":{"2010-01-01": "132.02", "2010-02-01": "232.6", "2010-03-01": "23.16", "2010-04-01": "-121.09", "2010-05-01": "311.16", "2010-06-01": "83", "2010-07-01": "60.13", "2010-08-01": "60.75", "2010-09-01": "104.39", "2010-10-01": "282.39", "2010-11-01": "122.66", "2010-12-01": "190.82", "2011-01-01": "100.83", "2011-02-01": "100.35", "2015-02-01": "133.8"},"ts_frequency":12}'::json);
Now try to read that in R and it should to issue a warning: RS-DBI driver warning: (unrecognized PostgreSQL field type json (id:114) in column 1)
Not a big deal, since you could cast it to text when selecting but still it would be nice to avoid the warning. Btw. among the json packages RJSONIO did the best job to get it into out-of-the-box into the right R format.
from rpostgres.
@mbannert Would you expect RPostgres to parse it for you?
from rpostgres.
No, package's like @jeroenooms jsonlite are just perfect (though RJSONIO worked better in this case). I just don't like the fact that RS-DBI complains about json as a postgres type though it's perfectly valid postgreSQL. Casting to text is easy with the current json but with postgres new binary representation jsonb it might not be in the future. No big deal, but still it would be nice to avoid the warning.
from rpostgres.
Casting as text seems best as JSON isn't a basic R type. I guess if RPostgres
could do this automatically, that would be perfect (alternative is to use a::json
(CAST(a as json)
) in the SQL query).
from rpostgres.
hi guys - sorry for resurrecting an old bug, but when I use rpostgres to retrieve a jsonb field... it seems to return junk (a set of 2 digit numbers). Am I doing something wrong in casting ?
library(RPostgres)
con <- dbConnect(RPostgres::Postgres(), host="girac.com", user="david", password="dd", dbname="production")
bbs_results <- dbSendQuery(con, "select id,mobile,data from data_scores where JSONB_EXTRACT_PATH_TEXT(data,'type') = 'Selfie'")
fetchedResult <- dbFetch(bbs_results)
id mobile
1 5456 +91846840
data
1 7b 6c, 22, 3a, 20, 22, 68, 74, 74, 70, 73, 3a, 2f, 2f, 73, 33, 2d, 61, 70, 2d, 73, 6f, 75, 74, 68, 65, 61, 73, 74, 2d, 31, 2e, 61, 6d, 61, 7a, 6f, 6e, 61, 77, 73, 2e, 63, 6f, 6d, 2f, 72, 65, 64, 63, 61, 72, 70, 65, 74, 2d, 6d, 65, 64, 69, 61, 2f, 63, 61, 72, 64, 73, 2f, 70, 72, 6f, 64, 2f, 30, 30, 30, 2f, 30, 32, 31, 2f, 36, 36, 33, 2f, 4e, 41, 6c, 4b, 78, 70, 38, 77, 4e, 64, 6f, 6b, 6d, 6f, 59, 31, 6b, 6e, 35, 39, 22, 2c, 20, 22, 74, 79, 70, 65, 22, 3a, 20, 22, 53, 65, 6c, 66, 69, 65, 22, 7d
``
from rpostgres.
It looks like a raw vector. What sort of content exactly are you expecting? Try:
str(fetchedResult)
from rpostgres.
@jeroenooms thanks for the reply. the content is jsonb - you can see that my query is actually filtering on it.
select id,mobile,data from data_scores where JSONB_EXTRACT_PATH_TEXT(data,'type') = 'Selfie'
This is what it gives me
> str(fetchedResult)
'data.frame': 1 obs. of 3 variables:
$ id : int 5456
$ mobile: chr "+91846840"
$ data :List of 1
..$ : raw 7b 22 75 72 ...
from rpostgres.
So how is this wrong? How are we supposed to know what is in your db? It looks fine to me. You got one record with three fields: an id
, a phone number and a binary data blob.
from rpostgres.
@jeroenooms - the data is actually Postgres jsonb.
I didnt claim someone is wrong - I'm trying to figure out my mistake.
When I use rpostgresql, then it returns
> f$data
[1] "{\"url\": \"https://s3-ap-southeast-1.amazonaws.com/media/cards/prod/000/020/242/Dp7ENZzAe1GaBW3\", \"type\": \"Selfie\"}"
[2] "{\"url\": \"https://s3-ap-southeast-1.amazonaws.com/media/cards/prod/000/020/242/Dp7ENZQRzAJOmaBW3\", \"type\": \"Selfie\"}"
[3] "{\"url\": \"https://s3-ap-southeast-1.amazonaws.com/media/cards/prod/000/020/400/kvXN4AMyjZojZdJP\", \"type\": \"Selfie\"}"
from rpostgres.
OK so the data
field gets returned as a raw vector instead of a string. Try:
sapply(fetchedResult$data, rawToChar)
from rpostgres.
hey that seems to have worked :
> rawToChar(fetchedResult$data[[1]])
[1] "{\"url\": \"https://s3-ap-southeast-1.amazonaws.com/media/cards/prod/000/021/663/NAlKxokmoY1kn59\", \"type\": \"Selfie\"}"
> jsonlite::fromJSON(rawToChar(fetchedResult$data[[1]]))
$url
[1] "https://s3-ap-southeast-1.amazonaws.com/media/cards/prod/000/021/663/NAlKxokmoY1kn59"
$type
[1] "Selfie"
is this the right way to extract json from the query fields (say when there are about 100000 rows ) ?
from rpostgres.
For now maybe you can use:
sapply(fetchedResult$data, rawToChar)
It might make more sense if RPostGres would automatically convert json
types to strings instead of raw vector. Not sure why it is not doing that. Maybe open an new issue for this.
from rpostgres.
Actually I think I see the problem. One sec.
from rpostgres.
@jeroenooms will do.
BTW, I just realized you were the author of opencpu and jsonlite ! Thank you so much for that work. I couldnt work without it.
from rpostgres.
Can you please try to reinstall RPostgress and see if it returns a string now?
devtools::install_github("rstats-db/DBI")
devtools::install_github("rstats-db/RPostgres", ref="jsonb")
from rpostgres.
yup - its working great now!!
thanks so much for that..
This is the first time I'm using databases with R. In general do everyone load all the rows in memory (using dbfetch) and then apply the transformations.. or is there a "lazy" way to do it. We use dplyr... but RPostgres does not support it natively.
from rpostgres.
I know that RPostgreSQL
supports lazy ways of doing things; I'm surprised that rpostgres
doesn't.
Laziness here can be inferred from the fact that the table is about 22GB.
> suppressPackageStartupMessages(library(dplyr))
> pg <- src_postgres()
> system.time(dsf <- tbl(pg, sql("SELECT * FROM crsp.dsf")))
user system elapsed
0.013 0.001 0.048
> dsf
Source: postgres 9.5.0 [igow@localhost:5432/crsp]
From: <derived table> [?? x 20]
cusip permno permco issuno hexcd hsiccd date bidlo askhi prc vol ret bid ask shrout cfacpr cfacshr openprc numtrd retx
(chr) (int) (dbl) (dbl) (dbl) (dbl) (date) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
1 68391610 10000 7952 10396 3 3990 1986-01-07 2.375 2.750 -2.5625 1000 NA NA NA 3680 1 1 NA NA NA
2 68391610 10000 7952 10396 3 3990 1986-01-08 2.375 2.625 -2.5000 12800 -0.024390 NA NA 3680 1 1 NA NA -0.024390
3 68391610 10000 7952 10396 3 3990 1986-01-09 2.375 2.625 -2.5000 1400 0.000000 NA NA 3680 1 1 NA NA 0.000000
4 68391610 10000 7952 10396 3 3990 1986-01-10 2.375 2.625 -2.5000 8500 0.000000 NA NA 3680 1 1 NA NA 0.000000
5 68391610 10000 7952 10396 3 3990 1986-01-13 2.500 2.750 -2.6250 5450 0.050000 NA NA 3680 1 1 NA NA 0.050000
6 68391610 10000 7952 10396 3 3990 1986-01-14 2.625 2.875 -2.7500 2075 0.047619 NA NA 3680 1 1 NA NA 0.047619
7 68391610 10000 7952 10396 3 3990 1986-01-15 2.750 3.000 -2.8750 22490 0.045455 NA NA 3680 1 1 NA NA 0.045455
8 68391610 10000 7952 10396 3 3990 1986-01-16 2.875 3.125 -3.0000 10900 0.043478 NA NA 3680 1 1 NA NA 0.043478
9 68391610 10000 7952 10396 3 3990 1986-01-17 2.875 3.125 -3.0000 8470 0.000000 NA NA 3680 1 1 NA NA 0.000000
10 68391610 10000 7952 10396 3 3990 1986-01-20 2.875 3.125 -3.0000 1000 0.000000 NA NA 3680 1 1 NA NA 0.000000
.. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
from rpostgres.
hi Ian - are you able to do this with jsonb columns ? my problem is that
rpostgresql does not support the json/jsonb types.
However, you are right - lazy rpostgres would be awesome!
On Wed, Feb 10, 2016 at 10:53 PM, Ian Gow [email protected] wrote:
I know that RPostgreSQL supports lazy ways of doing things; I'm surprised
that rpostgres doesn't.Laziness here can be inferred from the fact that the table is about 22GB.
suppressPackageStartupMessages(library(dplyr))
pg <- src_postgres()
system.time(dsf <- tbl(pg, sql("SELECT * FROM crsp.dsf")))
user system elapsed
0.013 0.001 0.048
dsf
Source: postgres 9.5.0 [igow@localhost:5432/crsp]
From: [?? x 20]cusip permno permco issuno hexcd hsiccd date bidlo askhi prc vol ret bid ask shrout cfacpr cfacshr openprc numtrd retx (chr) (int) (dbl) (dbl) (dbl) (dbl) (date) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
1 68391610 10000 7952 10396 3 3990 1986-01-07 2.375 2.750 -2.5625 1000 NA NA NA 3680 1 1 NA NA NA
2 68391610 10000 7952 10396 3 3990 1986-01-08 2.375 2.625 -2.5000 12800 -0.024390 NA NA 3680 1 1 NA NA -0.024390
3 68391610 10000 7952 10396 3 3990 1986-01-09 2.375 2.625 -2.5000 1400 0.000000 NA NA 3680 1 1 NA NA 0.000000
4 68391610 10000 7952 10396 3 3990 1986-01-10 2.375 2.625 -2.5000 8500 0.000000 NA NA 3680 1 1 NA NA 0.000000
5 68391610 10000 7952 10396 3 3990 1986-01-13 2.500 2.750 -2.6250 5450 0.050000 NA NA 3680 1 1 NA NA 0.050000
6 68391610 10000 7952 10396 3 3990 1986-01-14 2.625 2.875 -2.7500 2075 0.047619 NA NA 3680 1 1 NA NA 0.047619
7 68391610 10000 7952 10396 3 3990 1986-01-15 2.750 3.000 -2.8750 22490 0.045455 NA NA 3680 1 1 NA NA 0.045455
8 68391610 10000 7952 10396 3 3990 1986-01-16 2.875 3.125 -3.0000 10900 0.043478 NA NA 3680 1 1 NA NA 0.043478
9 68391610 10000 7952 10396 3 3990 1986-01-17 2.875 3.125 -3.0000 8470 0.000000 NA NA 3680 1 1 NA NA 0.000000
10 68391610 10000 7952 10396 3 3990 1986-01-20 2.875 3.125 -3.0000 1000 0.000000 NA NA 3680 1 1 NA NA 0.000000
.. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...—
Reply to this email directly or view it on GitHub
#18 (comment).
from rpostgres.
RPostgreSQL
complains, but seems to silently covert to text. The field ranges
is jsonb
:
> system.time(tagged_data <- tbl(pg, sql("SELECT id, ranges FROM director_bio.raw_tagging_data")))
user system elapsed
0.004 0.000 0.005
Warning message:
In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 1)
> tagged_data
Source: postgres 9.5.0 [igow@localhost:5432/crsp]
From: <derived table> [?? x 2]
id ranges
(chr) (chr)
1 AVIyJJoGxrHqqznVUw96 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 378, "startOffset": 350}
2 AVIyJHkoxrHqqznVUw95 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 648, "startOffset": 624}
3 AVIyI-yhxrHqqznVUw94 {"end": "/div/div/div/ul/li[4]", "start": "/div/div/div/ul/li[4]", "endOffset": 17, "startOffset": 7}
4 AVIyI9HrxrHqqznVUw93 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 490, "startOffset": 477}
5 AVIyI7GPxrHqqznVUw92 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 472, "startOffset": 458}
6 AVIyIyaDxrHqqznVUw91 {"end": "/div/div/div/ul/li[5]", "start": "/div/div/div/ul/li[5]", "endOffset": 24, "startOffset": 7}
7 AVIyIyUXxrHqqznVUw90 {"end": "/div/div/div/ul/li", "start": "/div/div/div/ul/li", "endOffset": 33, "startOffset": 7}
8 AVIyIlwfxrHqqznVUw9z {"end": "/div/div/div/ul/li[5]", "start": "/div/div/div/ul/li[5]", "endOffset": 24, "startOffset": 7}
9 AVIyIgTIxrHqqznVUw9y {"end": "/div/div/div/ul/li", "start": "/div/div/div/ul/li", "endOffset": 21, "startOffset": 7}
10 AVIyIYDfxrHqqznVUw9x {"end": "/div/div/div/pre[2]", "start": "/div/div/div/pre[2]", "endOffset": 98, "startOffset": 80}
.. ... ...
Warning message:
In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 1)
But one might be able to pre-process using PostgreSQL functions. For example, this might work for some cases:
> system.time(tagged_data <- tbl(pg, sql("SELECT id, (jsonb_each(ranges)).* FROM director_bio.raw_tagging_data")))
user system elapsed
0.005 0.000 0.006
Warning message:
In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 2)
> tagged_data
Source: postgres 9.5.0 [igow@localhost:5432/crsp]
From: <derived table> [?? x 3]
id key value
(chr) (chr) (chr)
1 AVIyJJoGxrHqqznVUw96 end "/div/div/div/pre"
2 AVIyJJoGxrHqqznVUw96 start "/div/div/div/pre"
3 AVIyJJoGxrHqqznVUw96 endOffset 378
4 AVIyJJoGxrHqqznVUw96 startOffset 350
5 AVIyJHkoxrHqqznVUw95 end "/div/div/div/pre"
6 AVIyJHkoxrHqqznVUw95 start "/div/div/div/pre"
7 AVIyJHkoxrHqqznVUw95 endOffset 648
8 AVIyJHkoxrHqqznVUw95 startOffset 624
9 AVIyI-yhxrHqqznVUw94 end "/div/div/div/ul/li[4]"
10 AVIyI-yhxrHqqznVUw94 start "/div/div/div/ul/li[4]"
.. ... ... ...
Warning message:
In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 2)
from rpostgres.
hey that is interesting - we were trying to filter tbl (or in your case...
the lazy tagged_data) using a complicated jsonb query "select data,created_at
from data_scores d, jsonb_array_elements(d.data) o where data @> '[]' and
jsonb_object_field_text(o,'url') like '%RraQg3Ab%';"
I'm gonna try the way you recommended. thanks!
On Wed, Feb 10, 2016 at 11:26 PM, Ian Gow [email protected] wrote:
RPostgreSQL complains, but seems to silently covert to text. The field
ranges is jsonb:system.time(tagged_data <- tbl(pg, sql("SELECT id, ranges FROM director_bio.raw_tagging_data")))
user system elapsed
0.004 0.000 0.005 Warning message:In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 1)> tagged_dataSource: postgres 9.5.0 [igow@localhost:5432/crsp]From: [?? x 2]id ranges (chr) (chr)1 AVIyJJoGxrHqqznVUw96 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 378, "startOffset": 350}2 AVIyJHkoxrHqqznVUw95 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 648, "startOffset": 624}3 AVIyI-yhxrHqqznVUw94 {"end": "/div/div/div/ul/li[4]", "start": "/div/div/div/ul/li[4]", "endOffset": 17, "startOffset": 7}4 AVIyI9HrxrHqqznVUw93 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 490, "startOffset": 477}5 AVIyI7GPxrHqqznVUw92 {"end": "/div/div/div/pre", "start": "/div/div/div/pre", "endOffset": 472, "startOffset": 458}6 AVIyIyaDxrHqqznVUw91 {"end": "/div/div/div/ul/li[5]", "start": "/div/div/div/ul/li[5]", "endOffset": 24, "startOffset": 7}7 AVIyIyUXxrHqqznVUw90 {"end": "/div/div/div/ul/li", "start": "/div/div/div/ul/li", "endOffset": 33, "startOffset": 7}8 AVIyIlwfxrHqqznVUw9z {"end": "/div/div/div/ul/li[5]", "start": "/div/div/div/ul/li[5]", "endOffset": 24, "startOffset": 7}9 AVIyIgTIxrHqqznVUw9y {"end": "/div/div/div/ul/li", "start": "/div/div/div/ul/li", "endOffset": 21, "startOffset": 7}10 AVIyIYDfxrHqqznVUw9x {"end": "/div/div/div/pre[2]", "start": "/div/div/div/pre[2]", "endOffset": 98, "startOffset": 80}
.. ... ...Warning message:In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 1)But one might be able to pre-process using PostgreSQL functions. For
example, this might work for some cases:system.time(tagged_data <- tbl(pg, sql("SELECT id, (jsonb_each(ranges)).* FROM director_bio.raw_tagging_data")))
user system elapsed
0.005 0.000 0.006 Warning message:In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 2)> tagged_dataSource: postgres 9.5.0 [igow@localhost:5432/crsp]From: [?? x 3]id key value (chr) (chr) (chr)1 AVIyJJoGxrHqqznVUw96 end "/div/div/div/pre"2 AVIyJJoGxrHqqznVUw96 start "/div/div/div/pre"3 AVIyJJoGxrHqqznVUw96 endOffset 3784 AVIyJJoGxrHqqznVUw96 startOffset 3505 AVIyJHkoxrHqqznVUw95 end "/div/div/div/pre"6 AVIyJHkoxrHqqznVUw95 start "/div/div/div/pre"7 AVIyJHkoxrHqqznVUw95 endOffset 6488 AVIyJHkoxrHqqznVUw95 startOffset 6249 AVIyI-yhxrHqqznVUw94 end "/div/div/div/ul/li[4]"10 AVIyI-yhxrHqqznVUw94 start "/div/div/div/ul/li[4]"
.. ... ... ...Warning message:In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type jsonb (id:3802) in column 2)—
Reply to this email directly or view it on GitHub
#18 (comment).
from rpostgres.
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
from rpostgres.
Related Issues (20)
- Site to test the same database in multiple backends HOT 1
- `dbWriteTable(temporary = TRUE, overwrite = TRUE)` deletes non-temporary table! HOT 1
- Printing of "invalid" `PqConnection` object HOT 1
- Connection to AWS DB with RPostgres on M1 Mac HOT 18
- binding parameters fails with input from dbQuoteLiteral HOT 2
- Connection closed unexpectedly by server HOT 9
- Encoding issue with identifiers passed via dbWriteTable or copy_to? HOT 4
- Cannot use DBI::dbWriteTable to save temporary table to Redshift HOT 2
- Failure to acquire connections on Mac OS within multicore fork()-ed process HOT 2
- Redshift varchar character length not respected by dplyr::rows_*() HOT 5
- collation warning on Ubuntu 22.04 HOT 4
- relicense RPostgres as MIT HOT 2
- using databases.pacha.dev for testing HOT 4
- Curious bug with transaction and `dbSendQuery()` HOT 6
- Redshift: dbWriteTable() with larger DFs and CSVs HOT 3
- Unable to create table with logical column HOT 4
- Schema with DBI::Id appears not to work with `dbListTables` HOT 4
- Bigints in dbQuoteLiteral HOT 4
- Polygon/multipolygon columns: incorrect write, correct read HOT 7
- Something wrong with the 1.4.5 Mac binary on CRAN with Redshift HOT 33
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from rpostgres.