matthewjrogers / rairtable

Efficient, Tidyverse-friendly wrapper for the Airtable API

Home Page: https://matthewjrogers.github.io/rairtable/

License: Other

R 100.00%
r airtable airtable-api api-client

rairtable's Introduction

  • 👋 Hi, I’m @matthewjrogers. I'm a programmer and R enthusiast living in Vermont
  • 👀 I’m interested in machine learning, public data accessibility in R and Python, and building a better broadband data set
  • 🌱 I’m currently learning about spatial ML and efficient API interaction in R
  • 💞️ I’m looking to collaborate on R packages or projects related to my interests
  • 📫 How to reach me: [email protected]

rairtable's People

Contributors

matthewjrogers, mspittler, roboton

rairtable's Issues

Downloading only fields included in a View

Is your feature request related to a problem? Please describe.

I mistakenly assumed that reading data based on an Airtable object for a view would only include the fields in that view (in addition to including only the records in that view). Currently, the read_airtable() function downloads the entire underlying table (with the exception of empty fields, which alerted me to this behaviour). Including only the fields that are not hidden in the Airtable view would be useful because it limits the amount of data requested through the Airtable API for very large tables.

Describe the solution you'd like

A good solution would be to only download the fields that are shown in the view, if the Airtable API makes this possible.

Describe alternatives you've considered

An alternative solution would be to specify the fields to include in the download with a character vector of field names. This would provide fine-grained programmatic control, but it is a bit less user-friendly, especially for tables with many fields.

Thank you!
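
As a rough illustration of the character-vector alternative, the sketch below builds a list-records URL by hand. It assumes the `view` and repeated `fields[]` query parameters of the Airtable list-records endpoint. The helper name `build_list_records_url()` and the placeholder base ID are hypothetical and are not part of rairtable.

```r
# Hypothetical helper (not part of rairtable): build a list-records URL that
# restricts the response to one view and to a chosen set of fields.
build_list_records_url <- function(base_id, table, view = NULL, fields = NULL) {
  # Percent-encode each element individually, including reserved characters.
  enc <- function(x) {
    vapply(x, utils::URLencode, character(1), reserved = TRUE, USE.NAMES = FALSE)
  }
  url <- paste0("https://api.airtable.com/v0/", enc(base_id), "/", enc(table))
  query <- character(0)
  if (!is.null(view)) {
    query <- c(query, paste0("view=", enc(view)))
  }
  if (!is.null(fields)) {
    # One `fields[]` parameter per field name; the brackets are percent-encoded.
    query <- c(query, paste0("fields%5B%5D=", enc(fields)))
  }
  if (length(query) > 0) {
    url <- paste0(url, "?", paste(query, collapse = "&"))
  }
  url
}

# Placeholder base ID; the table, view, and field names are illustrative.
url <- build_list_records_url(
  base_id = "appXXXXXXXXXXXXXX",
  table = "Assessments",
  view = "Grid view",
  fields = c("Name", "Status")
)
```

Passing the field names explicitly keeps the request small regardless of which fields happen to be hidden in the view.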

encode_batch_patch and encode_batch_post drop precision of numbers

Describe the bug
encode_batch_patch() and encode_batch_post() (and likely other functions) use jsonlite's toJSON() function. This presents a problem, as toJSON() defaults to only 4 digits of precision after the decimal point. There are many cases where more precision is necessary (like with lat/lng data).

To Reproduce

jsonlite::toJSON(list(lat = 1.9993292939931929), digits = 10000)
{"lat":[1.9993292939931928]}
jsonlite::toJSON(list(lat = 1.9993292939931929))
{"lat":[1.9993]}

Expected behavior
Preferably no precision would be lost. Consider different packages for converting to JSON or set digits to something higher.
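
One possible way to act on this, sketched under the assumption that the package keeps using jsonlite: toJSON() documents `digits = NA` as a request for maximum precision, which avoids picking an arbitrary large number. The variable names below are illustrative only.

```r
library(jsonlite)

x <- list(lat = 1.9993292939931929)

# Default behaviour: 4 digits after the decimal point.
default_json <- toJSON(x)

# digits = NA asks jsonlite for maximum precision.
precise_json <- toJSON(x, digits = NA)

# Round-trip check: parse the JSON back and compare with the original value.
round_trip <- fromJSON(precise_json)$lat
difference <- abs(round_trip - x$lat)
```

The round trip is a quick way to confirm that the encoded value is close enough for coordinates before sending it to the API.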

Integrate metadata API

Some key features of the Airtable metadata API are now generally available to all account tiers. Access to base metadata in particular would allow

  1. Easy access to table names, reducing potential for typographical errors in code writing
  2. Simplified reference for multiselect options
  3. Potential to obviate need to look up and copy base IDs

This work will need to happen in tandem with API key deprecation, as the metadata API endpoints do not accept API keys.

It seems there is some community appetite for this functionality per bergant/airtabler#11
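
For orientation, a minimal sketch of what a base-schema request could look like with {httr2}, assuming the `meta/bases/{baseId}/tables` endpoint and bearer-token authentication with a personal access token. The helper name `base_schema_request()`, the placeholder base ID, and the placeholder token are hypothetical. The request is only built here, not sent.

```r
library(httr2)

# Hypothetical helper (not part of rairtable): build, but do not send, a
# request for the table schemas of one base.
base_schema_request <- function(base_id, token) {
  request("https://api.airtable.com/v0/meta/bases") |>
    req_url_path_append(base_id, "tables") |>
    req_auth_bearer_token(token)
}

# Placeholder ID and token; substitute real values before performing the request.
req <- base_schema_request("appXXXXXXXXXXXXXX", token = "patXXXX.placeholder")

# To actually call the API:
# tables <- req |> req_perform() |> resp_body_json()
```

The parsed response would then be the natural source for table names and select options mentioned in the list above.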

Getting Error when trying to read base

I moved from my Mac to an Ubuntu machine and am getting a strange error from code that works on macOS.

set_airtable_api_key('pat0TSQcD.....SECRET')
airtable_base_id = 'appgRIfj...SECRET'

table <- airtable('Assessments', airtable_base_id)
assessments <- read_airtable(table)

Error in data.table::rbindlist(dta, use.names = TRUE, fill = TRUE) :
Column 17 of item 1 is length 3 inconsistent with column 1 which is length 100. Only length-1 columns are recycled.
(works on the Mac)

sessionInfo()
R version 4.2.3 (2023-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

other attached packages:
[1] fmsb_0.7.5 kableExtra_1.3.4 scales_1.2.1 ggplot2_3.4.1 rairtable_0.1.1 dplyr_1.1.1 sqldf_0.4-11 RSQLite_2.3.0 gsubfn_0.7 proto_1.0.0
[11] knitr_1.42

The Mac is running
R 4.2.1, airtable_0.1.1 dplyr_1.1.0

Best regards and many thanks
Wolfgang

Columns returned in random order

read_airtable() is returning columns in random order, irrespective of initial order in the base. I would expect to get columns in the order provided.

Structure and format for `airtable` class objects

This is a bit of a brain dump so I'm hoping it makes sense but here is a quick summary of the existing class structure and potential new directions/alternate approaches.

Background

The new implementation of airtable class objects in the pull request for migrating the package to {httr2} #14 includes four S3 object types:

  • airtable
  • airtable_base_schema
  • airtable_table_schema
  • airtable_fields_schema

An airtable object is a list with a base ID, table ID or name, user-facing base url, and API request url. A table url is included if table is specified as a table ID (but not as a table name). A base name/description, user permissions level, and table name can optionally be included. The base name (called description for the airtable object to avoid a conflict with the table name value) is pulled from the metadata if description is NULL. The table value is assumed to also be the table name if the supplied table is not a table ID. One or more views can be included (although multiple views will break other functions).

airtable_base_schema is a list that includes airtable_table_schema and airtable_fields_schema as components.

These were all part of the existing development branch but the pull request converted them from environment objects to list objects to use the S3 vectors classes functions available through vctrs.

Possible challenges with existing airtable class objects

Currently, the airtable class objects represent a single table, but the values are not validated when the object is created, so it is possible to create an airtable object that has a table and view that do not exist within the specified base. Would we want to add validation so this is no longer possible? If so, is an airtable object effectively just a subset of the data included within an airtable_table_schema?

I'd also prefer a clearer hierarchy where there is an object that represents a single base, an object that represents a single table, and an object that represents a single view (potentially using sub-classes, e.g. airtable, airtable_tbl, airtable_view).
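
To make the proposed hierarchy concrete, here is a minimal sketch using plain base-R S3 lists rather than vctrs. The constructor names are hypothetical, the class names follow the suggestion above rather than the package's current API, and the check that base IDs start with "app" is an assumption about Airtable ID formats. The validation only checks the shape of the inputs, not whether the table or view exists in the base.

```r
# Hypothetical constructors illustrating a base > table > view hierarchy.
new_airtable <- function(base) {
  # Assumption: Airtable base IDs start with "app".
  stopifnot(is.character(base), length(base) == 1, grepl("^app", base))
  structure(list(base = base), class = "airtable")
}

new_airtable_tbl <- function(base, table) {
  stopifnot(is.character(table), length(table) == 1)
  x <- new_airtable(base)
  x$table <- table
  # Prepend the subclass so methods for "airtable" still apply.
  class(x) <- c("airtable_tbl", class(x))
  x
}

new_airtable_view <- function(base, table, view) {
  # Exactly one view, since multiple views break other functions.
  stopifnot(is.character(view), length(view) == 1)
  x <- new_airtable_tbl(base, table)
  x$view <- view
  class(x) <- c("airtable_view", class(x))
  x
}

# Placeholder base ID; table and view names are illustrative.
v <- new_airtable_view("appXXXXXXXXXXXXXX", "Assessments", "Grid view")
```

With this layout, a function that only needs a base can dispatch on "airtable", while one that needs a view can require "airtable_view".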

Correspondence between API and class structure

I think one goal for the class structure of the package should be a clear correspondence between the package class structure and the data model built in to the Airtable Web API.

Right now I think the airtable and airtable_table_schema are both close equivalents to the table model: https://airtable.com/developers/web/api/model/table-model We could add a similar object to serve as an equivalent for a table config object: https://airtable.com/developers/web/api/model/table-config

airtable_fields_schema should exist as (or be convertible to) an array of field config objects: https://airtable.com/developers/web/api/field-model

Ideally, when create_table() is implemented, it could take an airtable_table_schema and use it to create an identically structured table. Similarly, an airtable_fields_schema could be used to add a set of fields using create_field() (or some additional function, e.g. create_fields()).

The view metadata endpoint is part of the Enterprise API but it effectively includes an implicit view object model that we could also use as a base for an additional class or sub-class: https://airtable.com/developers/web/api/get-view-metadata

Object type for airtable class objects

While the option to return to an environment object as a base might make it easier to set the active view, I think I prefer the idea of sticking with the infrastructure that vctrs offers for vector/list-style S3 objects.

Automatically create new Single or Multi select options.

Describe the bug
When trying to add new rows from a data.frame in a Single or Multi select column, the following error is given:

Error: Error in POST. Error Code 422: Unprocessable entity. The request was well-formed but was unable to be followed due to semantic errors.

Ensure that the column types in R are compatible with the column types of your Airtable table.

The reason is that the options in the data frame haven't been added to the select field yet. However, on many occasions, you don't know the options in advance.

The solution here is to add new unique options that aren't present in the column yet. After doing some digging, I found this implementation in a different package. Is this something you can add to rairtable (which is better maintained and has a better UI)?

To Reproduce

Where Status is a Single Select column

x <- tibble(status = c("Done", "To Do", "In Progress"))
insert_records(x, base)
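
One avenue worth noting, assuming the `typecast` option of the Airtable create-records endpoint: when it is set to true, Airtable attempts best-effort conversion of string values, which for select fields can include creating options that do not exist yet. The sketch below only builds such a request body with jsonlite. Whether and how rairtable exposes this option is not covered here, and the variable names are illustrative.

```r
library(jsonlite)

x <- data.frame(
  status = c("Done", "To Do", "In Progress"),
  stringsAsFactors = FALSE
)

# One record per row, each wrapped in a `fields` object.
records <- lapply(seq_len(nrow(x)), function(i) {
  list(fields = as.list(x[i, , drop = FALSE]))
})

# typecast = TRUE is serialised as a top-level `"typecast":true` flag.
body <- toJSON(list(records = records, typecast = TRUE), auto_unbox = TRUE)
```

Because typecasting can create unintended options from typos, it is probably best left as an opt-in setting.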

Error in reading User field types

Getting this error when trying to read_airtable:

Error in data.table::rbindlist(dta, use.names = TRUE, fill = TRUE) : Column 9 of item 1 is length 3 inconsistent with column 1 which is length 100. Only length-1 columns are recycled.

The error seems to correspond with "User" field types. I know that the airtabler package splits such fields into three columns: user name, email address, and something else I'm not remembering right now. Maybe the error is related to this oddity of Airtable, where one field is actually three?
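
A small sketch of how a three-part value can be kept in a single cell when binding rows with data.table. The record shape (a User value holding id, email, and name) is an assumption made for illustration, and `as_row()` is a hypothetical helper, not the package's actual parsing code.

```r
library(data.table)

# Illustrative records; the three-part User value is an assumed shape.
records <- list(
  list(Name = "Task A",
       Owner = list(id = "usr1", email = "a@example.com", name = "Ann")),
  list(Name = "Task B",
       Owner = list(id = "usr2", email = "b@example.com", name = "Bob"))
)

# Wrap any multi-element value in list() so it occupies exactly one cell
# instead of being treated as a column of length 3.
as_row <- function(fields) {
  lapply(fields, function(v) if (is.list(v) || length(v) > 1) list(v) else v)
}

dt <- rbindlist(lapply(records, as_row), use.names = TRUE, fill = TRUE)
```

The resulting Owner column is a list column with one entry per record, which can be unnested into separate columns afterwards if needed.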

PAT instead of API key

Airtable is phasing out API keys by next year, eventually making users migrate to Personal Access Tokens (PATs). Can the set_airtable_api_key() function be used with PATs? If not, will there be a new function for this? Thanks!

`update_records()` fails for data type multiple select.

I have noticed that batch_encode_post() called from update_records() returns malformed JSON for the data type Multiple Select which is an Array of Strings. For a given tibble, e.g.

tibble(
  airtable_record_id = "recQAGljdD111pHOb",
  Preferences = list(c("Men", "Babies"))
)

batch_encode_post() returns:

"{\"records\":[{\"id\":\"recQAGljdD111pHOb\",\"fields\":{\"Preferences\":[[\"Men\",\"Babies\"]]}}]}"

but it should be formed as:

"{\"records\":[{\"id\":\"recQAGljdD111pHOb\",\"fields\":{\"Preferences\":[\"Men\",\"Babies\"]}}]}"

I could gaffer tape it by adding the following after line 57 in update_records.R, but this is of course no solution:

library(magrittr)
# Workaround only: strip the extra level of array nesting from the JSON string.
batch_json_requests %<>%
    stringr::str_replace_all(stringr::fixed(":[["), ":[") %>%
    stringr::str_replace_all(stringr::fixed("]]}"), "]}")

Thanks for the package!
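
For comparison, a sketch of how the expected body can be produced directly with jsonlite when each record is built as a nested list instead of being serialised from a data frame with a list column. This is an illustration under that assumption, not the package's actual encoding code. Wrapping the value in I() keeps it a JSON array even when only one option is selected, since `auto_unbox = TRUE` would otherwise turn a single value into a bare string.

```r
library(jsonlite)

record <- list(
  id = "recQAGljdD111pHOb",
  # I() prevents auto_unbox from collapsing a one-element selection.
  fields = list(Preferences = I(c("Men", "Babies")))
)

# An unnamed list of records serialises as a JSON array of objects.
body <- toJSON(list(records = list(record)), auto_unbox = TRUE)
```

Building the structure explicitly avoids post-processing the JSON string, which could also rewrite legitimate nested arrays.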

Refactor suggestion: Migrating package from `{httr}` to `{httr2}`

I'm excited to see a new R package for working with Airtable and I'm definitely interested in contributing! I added some Airtable data access functions to my own getdata package but struggled to get the offset working.

I was wondering if you may be open for a pull request to migrate the package from using {httr} to use {httr2}. I've migrated a few different API packages from {httr} or {jsonlite} to {httr2} over the past year or two and I've found that using {httr2} makes it much easier to add new features and handle rate limiting and authentication. If you're open to it, I'd be happy to do a first pass on migrating the rairtable package over to {httr2}.
