ip-tools / uspto-opendata-python

A client library for accessing the USPTO Open Data APIs, written in Python.

Home Page: https://docs.ip-tools.org/uspto-opendata-python/

License: MIT License

Python 97.35% Makefile 2.65%

uspto pair patent information research search bulk-api bulk-download bulk-downloader opendata

uspto-opendata-python's Issues

Problem with downloading full information about patent

I can't download fields such as the abstract or the description of a patent using this library.

Adjust the mm (Minimum Should Match) Parameter of Lucene/Solr

We should take some details about the mm (Minimum Should Match) Parameter of the DisMax query parser into consideration, see #10 (comment) ff.

The upstream user interface currently will always set mm=0%, see #10 (comment).
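
To make the effect of mm more concrete, here is a rough sketch of how a non-negative percentage translates into the number of optional clauses that must match. It assumes the round-down behaviour described in the Solr documentation for percentage values. `required_clauses` is a hypothetical helper, not part of this library, and negative or conditional mm specifications are not handled.

```python
def required_clauses(optional_clauses, mm_percent):
    """Number of optional clauses that must match for an mm percentage.

    Hypothetical helper for illustration; only handles non-negative
    percentages such as 0, 75 or 100.
    """
    if optional_clauses < 0:
        raise ValueError("optional_clauses must not be negative")
    if not 0 <= mm_percent <= 100:
        raise ValueError("mm_percent must be between 0 and 100")
    # Integer division rounds the computed value down.
    return optional_clauses * mm_percent // 100


for mm in (0, 75, 100):
    print("mm={0}%: {1} of 3 clauses required".format(mm, required_clauses(3, mm)))
```

With mm=0%, none of the terms in a three-term query is individually required, which would be consistent with an unquoted multi-word value matching far more documents than expected.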

How to search by appEarlyPubNumber?

I tried the following:

from uspto.peds.client import UsptoPatentExaminationDataSystemClient

client = UsptoPatentExaminationDataSystemClient()

client.search('appEarlyPubNumber:(US 2006-0063272 A1)')
client.search('appEarlyPubNumber:(US 2006-0063272)')
client.search('appEarlyPubNumber:(2006-0063272 A1)')
client.search('appEarlyPubNumber:(2006-0063272 2017-0042821)')

All of these give:

{'numFound': 0,
 'start': 0,
 'docs': [],
 'metadata': {'indexLastUpdatedDate': 'Thu May 30 02:30:21 EDT 2019',
  'queryId': '9f12c1af-cb6b-4f8c-8e0e-97289ba404ec',
  'responseHeader': {'zkConnected': True, 'status': 0, 'QTime': 73}}}

I know the client has some issues, but searching by patent number works fine:

client.search('patentNumber:(6583088 6875727 8697602)')

Hello, I have been using the API for about a month now and I noticed something different today. The query below returns 451438 records, but it should only return the 269 records associated with the given examiner.

# Peds basic query to check if PEDS is online
from uspto.peds.client import UsptoPatentExaminationDataSystemClient
import pandas as pd
name = 'WILSON, NICHOLAS R'
client = UsptoPatentExaminationDataSystemClient()
expression = "appExamName:{0}".format(name)
result = client.search(expression)

{'numFound': 451438,
 'start': 0,
 'docs': [{'corrAddrCountryName': 'UNITED STATES',
   'applId': '03429712',
   'totalPtoDays': '0',
   'appFilingDate': '1954-05-13T00:00:00Z',
   'appExamName': 'MATZ, DANIEL R',
   'appExamNameFacet': 'MATZ, DANIEL R',
...

I also emailed PEDS. They recently throttled the number of requests they could handle, but we were able to get them to increase it again. I don't think the problem I'm experiencing is related to their changes, though. Any thoughts?
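
One quick way to confirm that a query matches too broadly is to compare the returned documents against the name that was searched for. This is a minimal sketch, assuming the response shape shown above. `mismatched_docs` is a hypothetical helper, not a library function, and the response here is a trimmed-down stand-in rather than live data.

```python
def mismatched_docs(result, field, expected):
    """Return the docs whose `field` value differs from `expected`."""
    return [doc for doc in result.get("docs", []) if doc.get(field) != expected]


# Trimmed-down response in the shape shown above (not fetched from PEDS).
result = {
    "numFound": 451438,
    "start": 0,
    "docs": [{"applId": "03429712", "appExamName": "MATZ, DANIEL R"}],
}

bad = mismatched_docs(result, "appExamName", "WILSON, NICHOLAS R")
print(len(bad), "of", len(result["docs"]), "returned docs name a different examiner")
```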

Bug in the POST query to the API

I copied and pasted the query from the docs, and the API responds with a 404.

Future Development? - Patent Client

Hey! This is the only way I can see to contact you, so here I go!

I'm the author and maintainer of patent_client, a library with a similar scope and feature set as your own. patent_client is under active development, and growing, so if you'd like, I'd love to have you contribute, or add a note on your readme pointing to it!

PyPI | GitHub | Docs

Thanks!

Parker

Synchronously download documents for multiple patent numbers

I would like to know whether I can download documents for a list of patent numbers or application numbers in synchronous mode. I can do that on https://ped.uspto.gov/peds/ by giving comma-separated values like '6583088, 6875727, 8697602, 6331531, 6274350, 10112906, 9491944, 9504251, 9137998'.

The reason is that, as far as I can tell from testing, this is close to a constant-time operation: whether you request one number or 300, the request takes almost the same time to complete.

Something like:

from uspto.peds.client import UsptoPatentExaminationDataSystemClient
client = UsptoPatentExaminationDataSystemClient()

# Proposed call signature; `numbers` is not an existing parameter.
client.download_document(
    type='patent',
    numbers='6583088, 6875727, 8697602, 6331531, 6274350, 10112906, 9491944, 9504251, 9137998',  # or a list
)
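
A rough sketch of how such a call could be approximated on the caller's side, assuming some per-number download callable is available. `parse_numbers`, `download_many` and `download_one` are hypothetical names, and the stand-in function below does not contact the USPTO.

```python
def parse_numbers(numbers):
    """Accept a comma-separated string or an iterable; return a clean list of strings."""
    if isinstance(numbers, str):
        numbers = numbers.split(",")
    return [str(n).strip() for n in numbers if str(n).strip()]


def download_many(download_one, numbers):
    """Call `download_one(number)` for each number and collect the results by number."""
    return {number: download_one(number) for number in parse_numbers(numbers)}


# Stand-in for a real per-document download function (hypothetical).
def fake_download(number):
    return {"number": number, "status": "ok"}


results = download_many(fake_download, "6583088, 6875727, 8697602")
print(sorted(results))
```

Note that this makes one call per number, so it does not reproduce the single-request behaviour of the web interface described above.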

Unable to access the USPTO PBD system

Problem with search fields

I searched the USPTO website for Amazon patents and got about 9000 results. Using this library, I found only about 600 patents.

I was searching for patents using the expression:
expression = 'firstNamedApplicant:(Amazon)'

What other fields can be used? (I tried some other fields, but the search results didn't change.)

Reintegrate aspects from "uspto-peds-python" fork

Coming from #7, @rahul-gj created a fork of this library called uspto-peds-python, which just wraps the PEDS Search API and is purely based on the requests and BeautifulSoup packages.

This variant obviously gets by with a trimmed-down subset of dependencies, which makes it more usable for specific use cases. However, the same could be achieved through the extras_require mechanism of setuptools.

This issue has been created to track the reintegration of both variants with each other again.

Thanks for your valuable input on that, Rahul.

Outdated dependencies

Can we update the dependencies?

Installing the package unnecessarily uninstalls up-to-date packages and installs older versions instead.

  Found existing installation: urllib3 1.24.1
    Uninstalling urllib3-1.24.1:
      Successfully uninstalled urllib3-1.24.1
  Found existing installation: idna 2.8
    Uninstalling idna-2.8:
      Successfully uninstalled idna-2.8
  Found existing installation: requests 2.21.0
    Uninstalling requests-2.21.0:
      Successfully uninstalled requests-2.21.0
  Found existing installation: lxml 4.3.1
    Uninstalling lxml-4.3.1:
      Successfully uninstalled lxml-4.3.1
  Found existing installation: beautifulsoup4 4.7.1
    Uninstalling beautifulsoup4-4.7.1:
      Successfully uninstalled beautifulsoup4-4.7.1
beautifulsoup4-4.6.0,  lxml-4.2.5 requests-2.18.4 urllib3-1.22

Add more data sources from USPTO

Based on comments by @andyhegedus coming from #5, we would like to add support for accessing more data sources from USPTO in the future.

BDSS API: https://bulkdata.uspto.gov/
Bulk search and download API: https://developer.uspto.gov/api-catalog/bulk-search-and-download
PatentsView API: https://developer.uspto.gov/api-catalog/patentsview

Unable to install: many errors, e.g. while building regex

  Building wheel for regex (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python2.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"'; __file__='"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-JPZCzH
       cwd: /tmp/pip-install-uxmfp1/regex/
  Complete output (1451 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-2.7
  creating build/lib.linux-x86_64-2.7/regex
  copying regex_3/__init__.py -> build/lib.linux-x86_64-2.7/regex
  copying regex_3/regex.py -> build/lib.linux-x86_64-2.7/regex
  copying regex_3/_regex_core.py -> build/lib.linux-x86_64-2.7/regex
  copying regex_3/test_regex.py -> build/lib.linux-x86_64-2.7/regex
  running build_ext
  building 'regex._regex' extension
  creating build/temp.linux-x86_64-2.7
  creating build/temp.linux-x86_64-2.7/regex_3
  x86_64-linux-gnu-gcc -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -ffile-prefix-map=/build/python2.7-W40Ff2/python2.7-2.7.18=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/include/python2.7 -c regex_3/_regex.c -o build/temp.linux-x86_64-2.7/regex_3/_regex.o
  regex_3/_regex.c: In function ‘bytes1_char_at’:
  regex_3/_regex.c:755:15: error: ‘Py_UCS1’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
    755 |     return *((Py_UCS1*)text + pos);
        |               ^~~~~~~
        |               Py_UCS4
  regex_3/_regex.c:755:15: note: each undeclared identifier is reported only once for each function it appears in
  regex_3/_regex.c:755:23: error: expected expression before ‘)’ token
    755 |     return *((Py_UCS1*)text + pos);
        |                       ^
  regex_3/_regex.c: In function ‘bytes1_set_char_at’:
  regex_3/_regex.c:760:8: error: ‘Py_UCS1’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
    760 |     *((Py_UCS1*)text + pos) = (Py_UCS1)ch;
        |        ^~~~~~~
        |        Py_UCS4
  regex_3/_regex.c:760:16: error: expected expression before ‘)’ token
    760 |     *((Py_UCS1*)text + pos) = (Py_UCS1)ch;
        |                ^
  regex_3/_regex.c:760:40: error: expected ‘;’ before ‘ch’
    760 |     *((Py_UCS1*)text + pos) = (Py_UCS1)ch;
        |                                        ^~
        |                                        ;
  regex_3/_regex.c: In function ‘bytes1_point_to’:
  regex_3/_regex.c:765:13: error: ‘Py_UCS1’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
    765 |     return (Py_UCS1*)text + pos;
        |             ^~~~~~~
        |             Py_UCS4
  regex_3/_regex.c:765:21: error: expected expression before ‘)’ token
    765 |     return (Py_UCS1*)text + pos;
        |                     ^
  regex_3/_regex.c: In function ‘bytes2_char_at’:
  regex_3/_regex.c:770:15: error: ‘Py_UCS2’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
    770 |     return *((Py_UCS2*)text + pos);
        |               ^~~~~~~
        |               Py_UCS4
  regex_3/_regex.c:770:23: error: expected expression before ‘)’ token
    770 |     return *((Py_UCS2*)text + pos);
        |                       ^
  regex_3/_regex.c: In function ‘bytes2_set_char_at’:
  regex_3/_regex.c:775:8: error: ‘Py_UCS2’ undeclared (first use in this function); did you mean ‘Py_UCS4’?
    775 |     *((Py_UCS2*)text + pos) = (Py_UCS2)ch;
        |        ^~~~~~~
        |        Py_UCS4
  regex_3/_regex.c:775:16: error: expected expression before ‘)’ token
    775 |     *((Py_UCS2*)text + pos) = (Py_UCS2)ch;
        |                ^
  regex_3/_regex.c:775:40: error: expected ‘;’ before ‘ch’
    775 |     *((Py_UCS2*)text + pos) = (Py_UCS2)ch;
        |                                        ^~
        |                                        ;
   ^
    regex_3/_regex.c:26230:16: note: declared here
    26230 | PyMODINIT_FUNC PyInit__regex(void) {
          |                ^~~~~~~~~~~~~
    regex_3/_regex.c: At top level:
    regex_3/_regex.c:26217:27: error: storage size of ‘regex_module’ isn’t known
    26217 | static struct PyModuleDef regex_module = {
          |                           ^~~~~~~~~~~~
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python2.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"'; __file__='"'"'/tmp/pip-install-uxmfp1/regex/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-A7_y2n/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /home/sudhanshu/.local/include/python2.7/regex Check the logs for full command output.
WARNING: You are using pip version 20.3.4; however, version 23.2.1 is available.
You should consider upgrading via the '/usr/bin/python2.7 -m pip install --upgrade pip' command.

Improve query expression documentation

Introduction

This is about writing query expressions properly.

Searching for names

uspto-peds search 'appExamName:"WILSON, NICHOLAS R"'

Note the quotes around the examiner name here.

-- #10 (comment)
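
When the name comes from a variable, the quoting can be applied programmatically. This is a minimal sketch. `phrase_expression` is a hypothetical helper, not part of this library, and the escaping of backslashes and double quotes assumes standard Lucene/Solr query syntax.

```python
def phrase_expression(field, value):
    """Build `field:"value"`, escaping backslashes and double quotes in the value."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '{0}:"{1}"'.format(field, escaped)


print(phrase_expression("appExamName", "WILSON, NICHOLAS R"))
# prints: appExamName:"WILSON, NICHOLAS R"
```

The resulting string can then be passed to the search call in place of the unquoted expression.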

Searching for (multiple) document numbers

For querying numberlists, we propose an expression like the following (see also #10 (comment)):

uspto-peds search 'patentNumber:(6583088 OR 6875727 OR 8697602)'

Improve querying numberlists by providing an appropriate --numberlist= command line option.
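
Such an option could be built on a small expression builder. This is a sketch under the assumption that the numbers are already available as a list. `numberlist_expression` is a hypothetical helper, not an existing function of this library.

```python
def numberlist_expression(field, numbers):
    """Build `field:(N1 OR N2 OR ...)` from an iterable of document numbers."""
    cleaned = [str(n).strip() for n in numbers if str(n).strip()]
    if not cleaned:
        raise ValueError("at least one number is required")
    return "{0}:({1})".format(field, " OR ".join(cleaned))


print(numberlist_expression("patentNumber", ["6583088", "6875727", "8697602"]))
# prints: patentNumber:(6583088 OR 6875727 OR 8697602)
```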

On Windows, installation fails with a UnicodeDecodeError.

I tried cmd, PowerShell, the IPython Qt console, and Cygwin.

This is the Cygwin output; the others are the same.

$ pip install uspto-opendata-python
Collecting uspto-opendata-python
  Using cached uspto-opendata-python-0.7.1.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\cygwin64\tmp\pip-build-irgg6e4i\uspto-opendata-python\setup.py", line 5, in <module>
        README = open(os.path.join(here, 'README.rst')).read()
      File "C:\Users\user\WinPython-32bit-3.6.2.0Qt5\python-3.6.2\lib\encodings\cp1252.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 4791: character maps to <undefined>

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\cygwin64\tmp\pip-build-irgg6e4i\uspto-opendata-python\
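
The traceback shows setup.py reading README.rst with the platform default encoding (cp1252 here), which has no mapping for byte 0x9d. A likely fix, assuming the README is UTF-8 encoded (the traceback itself does not confirm this), is to pass the encoding explicitly when opening the file. `read_text` is a hypothetical helper, and the temporary file only stands in for the real README.

```python
import io
import os
import tempfile


def read_text(path):
    """Read a text file as UTF-8, independent of the platform's default encoding."""
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()


# Demonstration file containing curly quotes; the right double quotation
# mark is encoded in UTF-8 as the bytes E2 80 9D.
demo_path = os.path.join(tempfile.mkdtemp(), "README.rst")
with io.open(demo_path, "w", encoding="utf-8") as handle:
    handle.write(u"A \u201cquoted\u201d word\n")

print(repr(read_text(demo_path)))
```

In setup.py, the corresponding change would be to replace the bare `open(...)` call on line 5 with a call that specifies `encoding='utf-8'`.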
