jyprojs / patentr
R package to access USPTO bulk data in tidy, rectangular format
License: Other
Each assignment copies the entire data frame, resulting in a slowdown (very inefficient code)
Fix values in WKU column to equal patent number instead (and switch column name to Pat_ID or similar)
Hi,
I noticed that the CSV output for week 1 of 2019 has blank values in the "references" field for all WKUs. I checked a few of them at random on https://portal.uspto.gov/pair/PublicPair, using the patent number as the search field, and found that the latest reference document on the "Display References" tab shows multiple references for these patents. Then I compared the contents for week 1 of 2012. The CSV output for that week has populated references fields for the issued patents, and these matched the information on the PublicPair portal for randomly chosen patents.
Can you please run an iteration with the 2019 and 2012 week 1 data and check the reason for the references mismatch between the patentr output and the PublicPair portal?
Thanks!
Example of the PAL is 1995 wk 32
~10-50 rows showing the patent format output
Congratulations on the work done! The tool works well and seems very user-friendly. I know that the USPTO bulk files contain other fields that are not currently implemented in the package, such as the abstract and the description of each patent. This information is fundamental for analyzing the technical state of the art of a domain. Is it possible to implement some functions for extracting other text fields of the patents?
For 1 January 1976, TXT currently represents the date as 19760101, while the XML formats represent it as 1976-01-01; TXT should switch to 1976-01-01 to maintain consistency and readability.
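A minimal sketch of the conversion described above, assuming the TXT dates arrive as eight-digit strings. The function name to_iso_date is hypothetical and not part of patentr; anything that is not exactly eight digits is passed through unchanged rather than guessed at.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of patentr): convert a compact YYYYMMDD
// date such as "19760101" to the dashed form "1976-01-01".
// Input that is not exactly eight digits is returned unchanged.
std::string to_iso_date(const std::string& compact) {
    if (compact.size() != 8) {
        return compact;
    }
    for (unsigned char c : compact) {
        if (!std::isdigit(c)) {
            return compact;
        }
    }
    return compact.substr(0, 4) + "-" + compact.substr(4, 2) + "-" +
           compact.substr(6, 2);
}
```

Passing already-dashed input through untouched means the same helper could be applied to both TXT and XML dates without double conversion.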
fout << currID << ",\"" << title << "\"," << appDate << "," << issDate << ",\"" << inventor << "\",\"" << assignee << "\"," << iclClass << "," << refs << "\n";
should be like above and not
fout << currID << ",\"" << title << "\"," << appDate << "," << issDate << "," << inventor << "," << assignee << "," << iclClass << "," << refs << "\n";
note: assignee and inventor fields may have commas
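Quoting alone still breaks if a field itself contains a double quote. The sketch below shows one common way to handle that, in the style of RFC 4180: wrap the field in quotes and double any embedded quote. The name csv_quote is hypothetical and not taken from the package source.

```cpp
#include <string>

// Hypothetical helper (not part of patentr): wrap a CSV field in double
// quotes and double any embedded double quote, so that commas and quotes
// inside the field do not split or corrupt the row.
std::string csv_quote(const std::string& field) {
    std::string out = "\"";
    for (char c : field) {
        if (c == '"') {
            out += '"';  // escape an embedded quote by doubling it
        }
        out += c;
    }
    out += '"';
    return out;
}
```

With such a helper, the write could read fout << currID << "," << csv_quote(title) << "," << appDate and so on, applying csv_quote to title, inventor, and assignee.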
confirm content as well in tests/testthat/test-convert.R
1980 has 53 Tuesdays; the example code gets the 6th-from-last through 2nd-from-last weeks, not the last five weeks
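The week count can be computed rather than assumed to be 52. The sketch below counts the Tuesdays in a given year, assuming one bulk file per Tuesday; the function names are hypothetical, and the day-of-week calculation uses Sakamoto's method for the Gregorian calendar.

```cpp
// Day of week for a Gregorian date (Sakamoto's method).
// Returns 0 = Sunday, 1 = Monday, 2 = Tuesday, ..., 6 = Saturday.
int day_of_week(int y, int m, int d) {
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (m < 3) {
        y -= 1;
    }
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

bool is_leap_year(int y) {
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

// Hypothetical helper: number of Tuesdays in the given year, assuming
// one bulk-data week per Tuesday.
int tuesdays_in_year(int year) {
    const int tuesday = 2;
    int jan1 = day_of_week(year, 1, 1);
    int days = is_leap_year(year) ? 366 : 365;
    // Zero-based day-of-year index of the first Tuesday.
    int first = (tuesday - jan1 + 7) % 7;
    // Count the days first, first + 7, ... that fall within the year.
    return (days - first + 6) / 7;
}
```

For 1980 this gives 53, so under that assumption the last five weeks would be weeks 49 through 53 rather than 48 through 52.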
no default value, output file name must be provided
likely w/ cat statements, per n patents
Hi devs,
Thanks for the work and package.
For more recent years I am getting the TRUE placeholder and no data is downloading.
I tried running the examples and other years before 2001, and it attempts to download but I am getting this error:
Error in utils::download.file(url = curr_url, destfile = dest_file) :
cannot open URL 'https://bulkdata.uspto.gov/data/patent/grant/redbook/fulltext/2001/pftaps20010102_wk01.zip'
In addition: Warning message:
In utils::download.file(url = curr_url, destfile = dest_file) :
InternetOpenUrl failed: 'The certificate authority is invalid or incorrect'
Probably can't verify the SSL of the USPTO website.
On an unrelated note, is there a way to download the patent data connected to a certain inventor or company instead of going by week?
Hi,
I have two questions on the package's execution.
df2018w1 <- get_bulk_patent_data(year = 2018, week = 1)
Error in cat("WKU,Title,App_Date,Issue_Date,Inventor,Assignee,ICL_Class,References,Claims\n", :
argument "output_file" is missing, with no default
Thanks!