ip-tools / uspto-opendata-python
A client library for accessing the USPTO Open Data APIs, written in Python.
Home Page: https://docs.ip-tools.org/uspto-opendata-python/
License: MIT License
I can't download fields such as the abstract or description of a patent using this library.
Okay, this one's pretty simple:
pip install uspto-opendata-python
# success
import uspto
# success
import uspto.pdb.client
# ImportError: No module named pdb.client
import pkgutil
[name for _, name, _ in pkgutil.iter_modules(['uspto'])]
# []
Any ideas?
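One detail worth noting about the check above: pkgutil.iter_modules(['uspto']) treats 'uspto' as a directory path relative to the current working directory, so an empty list does not by itself show that the installed package has no submodules. A minimal sketch of a more direct check follows. The helper name list_submodules is made up for this example, and the standard-library json package stands in for uspto so the snippet runs anywhere.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of the direct submodules of an importable package."""
    package = importlib.import_module(package_name)
    # Only packages have __path__; a plain module (e.g. a stray uspto.py) does not.
    search_paths = getattr(package, "__path__", None)
    if search_paths is None:
        return []
    return sorted(name for _, name, _ in pkgutil.iter_modules(search_paths))


# "json" is only a stand-in; replace it with "uspto" in the affected environment.
print(list_submodules("json"))
```

Printing the __file__ attribute of the imported module is also a quick way to see which file was actually picked up. A local file or directory named uspto shadowing the installed package is one possible cause, but that is an assumption rather than something confirmed in this thread.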
We should take some details about the mm (Minimum Should Match) parameter of the DisMax query parser into consideration, see #10 (comment) ff.
The upstream user interface currently always sets mm=0%, see #10 (comment).
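As a rough illustration of what mm controls, the sketch below computes how many optional query clauses would have to match for a given positive percentage. It assumes the rounding-down rule described in the Solr documentation and deliberately ignores the other mm forms (plain integers, negative values, conditional expressions). The function name required_matches is made up for this example.

```python
def required_matches(optional_clauses, mm):
    """Return the number of optional clauses that must match for an mm value like "75%".

    Only non-negative percentages are handled in this sketch.
    """
    mm = mm.strip()
    if not mm.endswith("%"):
        raise ValueError("this sketch only handles percentage values")
    percent = int(mm[:-1])
    if percent < 0:
        raise ValueError("negative percentages are not handled here")
    # The computed value is rounded down.
    return (optional_clauses * percent) // 100


for spec in ("0%", "50%", "75%", "100%"):
    print(spec, required_matches(4, spec))
```

With mm=0% the computed minimum is 0 for any number of clauses, so mm adds no constraint on top of ordinary OR matching.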
I tried the following.
from uspto.peds.client import UsptoPatentExaminationDataSystemClient
client = UsptoPatentExaminationDataSystemClient()
client.search('appEarlyPubNumber:(US 2006-0063272 A1)')
client.search('appEarlyPubNumber:(US 2006-0063272)')
client.search('appEarlyPubNumber:(2006-0063272 A1)')
client.search('appEarlyPubNumber:(2006-0063272 2017-0042821)')
All of these return:
{'numFound': 0,
'start': 0,
'docs': [],
'metadata': {'indexLastUpdatedDate': 'Thu May 30 02:30:21 EDT 2019',
'queryId': '9f12c1af-cb6b-4f8c-8e0e-97289ba404ec',
'responseHeader': {'zkConnected': True, 'status': 0, 'QTime': 73}}}
I know the client has some issues, but searching by patent number works fine.
client.search('patentNumber:(6583088 6875727 8697602)')
Hello, I have been using the API for about a month now and I noticed something different today. The query below returns 451438 records, but it should only return the 269 records associated with the given examiner.
# Peds basic query to check if PEDS is online
from uspto.peds.client import UsptoPatentExaminationDataSystemClient
import pandas as pd
name = 'WILSON, NICHOLAS R'
client = UsptoPatentExaminationDataSystemClient()
expression = "appExamName:{0}".format(name)
result = client.search(expression)
{'numFound': 451438,
'start': 0,
'docs': [{'corrAddrCountryName': 'UNITED STATES',
'applId': '03429712',
'totalPtoDays': '0',
'appFilingDate': '1954-05-13T00:00:00Z',
'appExamName': 'MATZ, DANIEL R',
'appExamNameFacet': 'MATZ, DANIEL R',
...
I also emailed PEDS. They recently throttled the number of requests they could handle, but we were able to get them to increase it again. I don't think the problem I'm experiencing is related to their changes, though. Any thoughts?
Hey! This is the only way I can see to contact you, so here I go!
I'm the author and maintainer of patent_client, a library with a similar scope and feature set to your own. patent_client is under active development and growing, so if you'd like, I'd love to have you contribute, or add a note to your README pointing to it!
Thanks!
Parker
I would like to know if I can download a list of patent numbers or application numbers in synchronous mode. I can do that on https://ped.uspto.gov/peds/ by giving comma-separated values like '6583088, 6875727, 8697602, 6331531, 6274350, 10112906, 9491944, 9504251, 9137998'
I ask because, from my own testing, the request seems to take roughly constant time: whether you request one number or 300, it takes almost the same time to complete.
Something like:
from uspto.peds.client import UsptoPatentExaminationDataSystemClient
client = UsptoPatentExaminationDataSystemClient()
client.download_document(
type='patent',
numbers='6583088, 6875727, 8697602, 6331531, 6274350, 10112906, 9491944, 9504251, 9137998', # or list
)
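The numbers argument shown above is the poster's proposal, not an existing parameter of the library. As a sketch of just the input-handling side of such a feature, the snippet below accepts either a comma-separated string or a list, cleans it up, and splits it into batches. The helper names and the batch size are assumptions made for this illustration.

```python
def normalize_numbers(numbers):
    """Turn a comma-separated string or an iterable into a clean, de-duplicated list."""
    if isinstance(numbers, str):
        numbers = numbers.split(",")
    cleaned = []
    seen = set()
    for number in numbers:
        number = str(number).strip()
        if number and number not in seen:
            seen.add(number)
            cleaned.append(number)
    return cleaned


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


numbers = normalize_numbers("6583088, 6875727, 8697602, 6331531, 6274350")
for batch in batches(numbers, 2):  # the batch size of 2 is arbitrary
    print(batch)
```

Each batch could then be turned into a single search or download request, which is where the roughly constant request time described above would pay off.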
I tried to search on USPTO web site for patents of Amazon and the result was ~9000 patents. Using this library I found only ~600 patents.
I was searching for patents using expression:
expression = 'firstNamedApplicant:(Amazon)'
What other fields can be used? (I tried some other fields, but the search results didn't change.)
Coming from #7, @rahul-gj created a fork of this library called uspto-peds-python, which just wraps the PEDS Search API and is purely based on the requests and BeautifulSoup packages.
This variant is obviously able to operate with a trimmed-down subset of dependencies, which apparently makes it more usable for specific use cases. However, the same thing could be achieved using the extras_require mechanism of setuptools.
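For readers unfamiliar with it, extras_require is a keyword argument to setuptools.setup() that maps an extra name to additional requirements. A minimal sketch of how a setup.py could be organized this way follows. The split into core and optional dependencies and the extra names ("xml", "cli") are illustrative assumptions, not the project's actual layout.

```python
# Fragment of a hypothetical setup.py; group names and contents are placeholders.
CORE_REQUIRES = [
    "requests",
    "beautifulsoup4",
]

EXTRAS_REQUIRE = {
    "xml": ["lxml"],    # hypothetical extra
    "cli": ["docopt"],  # hypothetical extra
}

# Convenience extra that pulls in every optional dependency.
EXTRAS_REQUIRE["all"] = sorted(
    {requirement for group in EXTRAS_REQUIRE.values() for requirement in group}
)

# In a real setup.py these would be passed on as:
# setup(..., install_requires=CORE_REQUIRES, extras_require=EXTRAS_REQUIRE)
print(EXTRAS_REQUIRE["all"])
```

A user would then opt in with pip's bracket syntax, for example pip install "some-package[cli]", while a plain install pulls in only the core requirements.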
This issue has been created to track the reintegration of both variants with each other again.
Thanks for your valuable input on that, Rahul.
Can we update the dependencies?
Installing the library unnecessarily uninstalls newer packages and installs older ones.
Found existing installation: urllib3 1.24.1
Uninstalling urllib3-1.24.1:
Successfully uninstalled urllib3-1.24.1
Found existing installation: idna 2.8
Uninstalling idna-2.8:
Successfully uninstalled idna-2.8
Found existing installation: requests 2.21.0
Uninstalling requests-2.21.0:
Successfully uninstalled requests-2.21.0
Found existing installation: lxml 4.3.1
Uninstalling lxml-4.3.1:
Successfully uninstalled lxml-4.3.1
Found existing installation: beautifulsoup4 4.7.1
Uninstalling beautifulsoup4-4.7.1:
Successfully uninstalled beautifulsoup4-4.7.1
beautifulsoup4-4.6.0, lxml-4.2.5 requests-2.18.4 urllib3-1.22
Based on comments by @andyhegedus coming from #5, we would like to add support for accessing more data sources from USPTO in the future.
Building wheel for regex (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /usr/bin/python2.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"'; __file__='"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-JPZCzH
cwd: /tmp/pip-install-uxmfp1/regex/
Complete output (1451 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/regex
copying regex_3/__init__.py -> build/lib.linux-x86_64-2.7/regex
copying regex_3/regex.py -> build/lib.linux-x86_64-2.7/regex
copying regex_3/_regex_core.py -> build/lib.linux-x86_64-2.7/regex
copying regex_3/test_regex.py -> build/lib.linux-x86_64-2.7/regex
running build_ext
building 'regex._regex' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/regex_3
x86_64-linux-gnu-gcc -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -ffile-prefix-map=/build/python2.7-W40Ff2/python2.7-2.7.18=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/include/python2.7 -c regex_3/_regex.c -o build/temp.linux-x86_64-2.7/regex_3/_regex.o
regex_3/_regex.c: In function ‘bytes1_char_at’:
regex_3/_regex.c:755:15: error: ‘Py_UCS1’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
755 | return *((Py_UCS1*)text + pos);
| ^~~~~~~
| Py_UCS4
regex_3/_regex.c:755:15: note: each undeclared identifier is reported only once for each function it appears in
regex_3/_regex.c:755:23: error: expected expression before ‘)’ token
755 | return *((Py_UCS1*)text + pos);
| ^
regex_3/_regex.c: In function ‘bytes1_set_char_at’:
regex_3/_regex.c:760:8: error: ‘Py_UCS1’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
760 | *((Py_UCS1*)text + pos) = (Py_UCS1)ch;
| ^~~~~~~
| Py_UCS4
regex_3/_regex.c:760:16: error: expected expression before ‘)’ token
760 | *((Py_UCS1*)text + pos) = (Py_UCS1)ch;
| ^
regex_3/_regex.c:760:40: error: expected ‘;’ before ‘ch’
760 | *((Py_UCS1*)text + pos) = (Py_UCS1)ch;
| ^~
| ;
regex_3/_regex.c: In function ‘bytes1_point_to’:
regex_3/_regex.c:765:13: error: ‘Py_UCS1’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
765 | return (Py_UCS1*)text + pos;
| ^~~~~~~
| Py_UCS4
regex_3/_regex.c:765:21: error: expected expression before ‘)’ token
765 | return (Py_UCS1*)text + pos;
| ^
regex_3/_regex.c: In function ‘bytes2_char_at’:
regex_3/_regex.c:770:15: error: ‘Py_UCS2’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
770 | return *((Py_UCS2*)text + pos);
| ^~~~~~~
| Py_UCS4
regex_3/_regex.c:770:23: error: expected expression before ‘)’ token
770 | return *((Py_UCS2*)text + pos);
| ^
regex_3/_regex.c: In function ‘bytes2_set_char_at’:
regex_3/_regex.c:775:8: error: ‘Py_UCS2’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
775 | *((Py_UCS2*)text + pos) = (Py_UCS2)ch;
| ^~~~~~~
| Py_UCS4
regex_3/_regex.c:775:16: error: expected expression before ‘)’ token
775 | *((Py_UCS2*)text + pos) = (Py_UCS2)ch;
| ^
regex_3/_regex.c:775:40: error: expected ‘;’ before ‘ch’
775 | *((Py_UCS2*)text + pos) = (Py_UCS2)ch;
| ^~
| ;
^
regex_3/_regex.c:26230:16: note: declared here
26230 | PyMODINIT_FUNC PyInit__regex(void) {
| ^~~~~~~~~~~~~
regex_3/_regex.c: At top level:
regex_3/_regex.c:26217:27: error: storage size of ‘regex_module’ isn’t known
26217 | static struct PyModuleDef regex_module = {
| ^~~~~~~~~~~~
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python2.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"'; __file__='"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-A7_y2n/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /home/sudhanshu/.local/include/python2.7/regex Check the logs for full command output.
WARNING: You are using pip version 20.3.4; however, version 23.2.1 is available.
You should consider upgrading via the '/usr/bin/python2.7 -m pip install --upgrade pip' command.
sudhanshu@
This is about writing query expressions properly.
uspto-peds search 'appExamName:"WILSON, NICHOLAS R"'
Note the quotes around the examiner name here.
uspto-peds search 'patentNumber:(6583088 OR 6875727 OR 8697602)'
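The two expressions above can also be built programmatically, which avoids forgetting the quotes or the OR operators. This is a minimal sketch. The helper names phrase and any_of are made up for this example, and the escaping rule (backslash before embedded quotes and backslashes) is an assumption based on common Lucene/Solr query syntax.

```python
def phrase(field, value):
    """Build field:"value" so that a multi-word value is matched as one phrase."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '{0}:"{1}"'.format(field, escaped)


def any_of(field, values):
    """Build field:(a OR b OR c) from a list of values."""
    terms = [str(value).strip() for value in values]
    terms = [term for term in terms if term]
    if not terms:
        raise ValueError("at least one value is required")
    return "{0}:({1})".format(field, " OR ".join(terms))


print(phrase("appExamName", "WILSON, NICHOLAS R"))
print(any_of("patentNumber", [6583088, 6875727, 8697602]))
```

The resulting strings can be passed to the search command shown above or to client.search() as used elsewhere on this page.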
--numberlist= command line option.
I tried cmd, powershell, ipython-qt-console, and cygwin.
This is the Cygwin output; the others are the same.
$ pip install uspto-opendata-python
Collecting uspto-opendata-python
Using cached uspto-opendata-python-0.7.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\cygwin64\tmp\pip-build-irgg6e4i\uspto-opendata-python\setup.py", line 5, in <module>
README = open(os.path.join(here, 'README.rst')).read()
File "C:\Users\user\WinPython-32bit-3.6.2.0Qt5\python-3.6.2\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 4791: character maps to <undefined>
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\cygwin64\tmp\pip-build-irgg6e4i\uspto-opendata-python\
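The traceback shows README.rst being read with the platform default encoding (cp1252 on this Windows setup), which has no mapping for byte 0x9d. A common remedy is to open the file with an explicit encoding. The sketch below reproduces the failure on a temporary file and shows the explicit-encoding read. It assumes the README is UTF-8 encoded, and the helper name read_text is made up for this example.

```python
import io
import os
import tempfile


def read_text(path):
    """Read a text file as UTF-8 regardless of the platform default encoding."""
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.rst")
    # U+201D (right double quotation mark) is encoded in UTF-8 as E2 80 9D.
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(u"A \u201cquoted\u201d word")

    text = read_text(path)

    # Reading the same bytes as cp1252 fails, mirroring the traceback above.
    try:
        with io.open(path, encoding="cp1252") as handle:
            handle.read()
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

print(cp1252_failed)
```

In setup.py the equivalent change would be to pass encoding='utf-8' when opening README.rst. io.open accepts that argument on both Python 2 and Python 3.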