load_patstat's People

Contributors

dportabella, erikkemperman, exedre, hughamacmullaniv, leofiore, simonemainardi

load_patstat's Issues

Specified key was too long when creating table tls906_person.

load_patstat.sh complains with this error:

CREATE TABLE tls906_person (
  person_id int(11) NOT NULL DEFAULT '0',
  person_name varchar(300) COLLATE utf8mb4_unicode_ci NOT NULL,
  person_address varchar(1000) COLLATE utf8mb4_unicode_ci NOT NULL,
  person_ctry_code char(2) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  nuts char(5) NOT NULL DEFAULT '',
  nuts_level smallint  NOT NULL DEFAULT '9',
  doc_std_name_id int(11) NOT NULL DEFAULT '0',
  doc_std_name varchar(500) COLLATE utf8mb4_unicode_ci NOT NULL,
  psn_id int(11) NOT NULL DEFAULT '0',
  psn_name varchar(500) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  psn_level tinyint(4) NOT NULL DEFAULT '0',
  psn_sector varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  han_id int(11) NOT NULL DEFAULT '0',
  han_name varchar(500) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  han_harmonized int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (person_id),
  KEY IX_ppat_person_ctry_code (person_ctry_code),
  KEY IX_ppat_nuts (nuts),
  KEY IX_ppat_psn_name (psn_name(333)),
  KEY IX_ppat_psn_sector (psn_sector),
  KEY IX_ppat_psn_id (psn_id),
  KEY IX_ppat_han_id (han_id),
  KEY IX_han_name (han_name(333)),
  KEY IX_han_harmonized (han_harmonized)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci  AVG_ROW_LENGTH=100
--------------

ERROR 1071 (42000) at line 422: Specified key was too long; max key length is 1000 bytes
Bye

I am running MySQL version 5.7.17.

Any idea what causes this?
Why does it work for you?
Did you configure your MySQL server in some particular way?
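
A possible explanation, sketched as plain arithmetic: the error message gives a limit of 1000 bytes per key, and MySQL reserves up to 4 bytes per character for utf8mb4 columns (3 bytes for the older utf8). The helper names below are made up for illustration; only the prefix length 333 and the 1000-byte limit come from the output above.

```python
# Worst-case bytes per character that MySQL reserves in an index key.
BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}

# Limit quoted in the ERROR 1071 message above.
MAX_KEY_BYTES = 1000


def key_bytes(prefix_chars, charset):
    """Worst-case size in bytes of an index prefix of prefix_chars characters."""
    return prefix_chars * BYTES_PER_CHAR[charset]


def max_prefix_chars(charset, limit=MAX_KEY_BYTES):
    """Longest index prefix (in characters) that still fits within the limit."""
    return limit // BYTES_PER_CHAR[charset]


for charset in ("utf8", "utf8mb4"):
    # Prints the size of a 333-character prefix and the longest prefix that fits.
    print(charset, key_bytes(333, charset), max_prefix_chars(charset))
```

Under this reading, psn_name(333) and han_name(333) fit with 3-byte utf8 (999 bytes) but not with utf8mb4 (1332 bytes), so the prefix 333 was presumably chosen for a utf8 setup. Shortening both prefixes to 250 or creating the table with utf8 should stay within the limit; neither change has been tested against load_patstat.sh here.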

Import also the PATSTAT Register data files

Is there also a script to import this data into MySQL?

$ ls PATSTAT_2017a/PATSTAT_Register_2017_Spring/Data
reg101_appln_part01.zip
reg102_pat_publn_part01.zip
reg103_ipc_part01.zip
reg106_prior_part01.zip
reg107_parties_part01.zip
reg107_parties_part02.zip
reg108_applicant_states_part01.zip
reg109_design_states_part01.zip
reg110_title_part01.zip
reg111_licensee_part01.zip
reg112_licensee_states_part01.zip
reg113_terms_of_grant_part01.zip
reg114_dates_part01.zip
reg117_relation_part01.zip
reg118_prev_filed_appln_part01.zip
reg125_appeal_part01.zip
reg127_petition_rvw_part01.zip
reg128_limitation_part01.zip
reg130_opponent_part01.zip
reg135_text_part01.zip
reg136_search_report_part01.zip
reg201_proc_step_part01.zip
reg201_proc_step_part02.zip
reg202_proc_step_text_part01.zip
reg202_proc_step_text_part02.zip
reg202_proc_step_text_part03.zip
reg202_proc_step_text_part04.zip
reg203_proc_step_date_part01.zip
reg203_proc_step_date_part02.zip
reg301_event_data_part01.zip
reg301_event_data_part02.zip
reg401_appeal_result_part01.zip
reg402_event_text_part01.zip
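
As a starting point for such a script, here is a minimal sketch that groups the zip parts by table name, assuming every file follows the regNNN_name_partNN.zip pattern seen in the listing above. The function name and the regular expression are illustrative only and are not part of load_patstat.sh.

```python
import re
from collections import defaultdict

# Assumed naming scheme, inferred from the listing: reg<3 digits>_<name>_part<NN>.zip
PART_RE = re.compile(r"^(reg\d{3}_\w+?)_part(\d+)\.zip$")


def group_parts(filenames):
    """Map each table name to its zip parts, sorted by file name."""
    tables = defaultdict(list)
    for name in sorted(filenames):
        match = PART_RE.match(name)
        if match:  # silently skip anything that does not look like a table part
            tables[match.group(1)].append(name)
    return dict(tables)


# A few names taken from the listing above.
listing = [
    "reg107_parties_part02.zip",
    "reg107_parties_part01.zip",
    "reg201_proc_step_part01.zip",
    "reg202_proc_step_text_part03.zip",
]
for table, parts in group_parts(listing).items():
    print(table, parts)
```

Each table's parts could then be unzipped and loaded in order. The table definitions for the reg tables are not given in this thread, so they would have to come from the PATSTAT Register documentation.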

PATSTAT_2017a tls203_part18 corrupted file?

I have a problem importing the PATSTAT_2017a tls203_part18.zip file.
I wonder if the data is corrupted, and whether the corruption comes from PATSTAT itself or from the (legal) copy I received.

Can you please execute the command below and let me know if you get the same result?

In particular, line 3077 contains the substring "the desired bund is formed. -A $ 'N < \ ~ Ny\«.4\ \", which seems corrupted.

$ unzip -p data_PATSTAT_Biblio_2017_Spring_04/tls203_part18.zip | cat -n | head -n 3078 | tail -n 3
  3076	419831085,"en","METHODS, APPARATUS, AND ARTICLES OF MANUFACTURE TO ENCODE AUXILIARY DATA INTO TEXT DATA AND METHODS, APPARATUS, AND ARTICLES OF MANUFACTURE TO OBTAIN ENCODED DATA FROM TEXT DATA Methods, apparatus, and articles of manufacture to encode auxiliary data into text data and methods, apparatus, and articles of manufacture to obtain encoded data from text data are disclosed. An example method to embed auxiliary data into text data includes selecting a portion of auxiliary data to be encoded into text data, mapping the portion of auxiliary data to a first set of one or more encoded characters representative of the portion of the auxiliary data, mapping a position of the portion of auxiliary data within the auxiliary data to a second set of one or more encoded characters representative of the portion of the auxiliary data, and generating encoded data by including the first set of encoded characters and the second set of encoded characters in the text data. 102 __ 110 -O f 100 DATABASE DATA 104 106 -108 DATA AUXILIARY AUXILIARY REQUEST DATA DATA RECEIVER ENCODER DECODER AUTHORIZED UNAUTHORIZED PARTY PARTY"
  3077	419831086,"en","A bund is formed, according to this invention, by moulding bund sections in a bund mould which may be placed on the ground to receive and consolidate material in the cross-sectional shape of the required bund. After forming a short bund section the bund mould is repositioned to form a continuation of the formed bund section, refilled to form a further bund wall section as a continuation of the previously formed bund section and this process is repeated until the desired bund is formed. -A $ 'N < \ ~ Ny\«.4\ \ \ ''''N' N 'A~\ \ & 'N*~'~~ ~'NN~NN 'K K> >\~N4 N N ' N x'\ '~ ~ ' N\N~' 'K \N.tN'7tK\. ~NQN< \'*'*' ''K-'' N'' N-> N' N' ''N N' '4"
  3078	419831088,"en","A method is provided for the wet surface treatment of titanium dioxide, in order to produce durable universal grade titanium dioxide rutile pigment with superior optical properties. The method is characterized in that, a hydrous zirconia and silica composite layer is co-precipitated at acidic pH. Then, a layer of alumina is precipitated under a range of pH required for complete precipitation above the initial composite layer. The upper pH limit of the slurry during the alumina precipitation can be well controlled to avoid any chance for dissolution or damage of the composite zirconia-silica layer formed. Zirconia-silica composite layers and alumina thus precipitated advantageously improve the competence of the layers formed over a TiO2 base and provide improved durability with superior optical performance. The total surface treatment cycle time and chemicals used are minimal compared to conventional methods. Improvements in throughput and washing efficiency are also realized."
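
For anyone without unzip at hand, the same check can be sketched with Python's standard zipfile module. This is only a sketch: it assumes the archive holds a single member, and the function names are invented here. The CRC test may help separate the two cases: if the stored checksums still match, the archive was probably not damaged in transit and the odd characters were already in the data when it was zipped.

```python
import io
import itertools
import zipfile


def crc_ok(zip_source):
    """True if every member of the archive passes its stored CRC check."""
    with zipfile.ZipFile(zip_source) as archive:
        return archive.testzip() is None


def numbered_lines(zip_source, first, last, encoding="utf-8"):
    """Return (line_number, text) pairs for lines first..last (1-based, inclusive)
    of the first archive member, roughly like `unzip -p | cat -n | head | tail`."""
    with zipfile.ZipFile(zip_source) as archive:
        member = archive.namelist()[0]  # assumption: one data file per zip
        with archive.open(member) as raw:
            window = itertools.islice(enumerate(raw, start=1), first - 1, last)
            return [
                (number, line.decode(encoding, errors="replace").rstrip("\r\n"))
                for number, line in window
            ]
```

Calling numbered_lines("data_PATSTAT_Biblio_2017_Spring_04/tls203_part18.zip", 3076, 3078) should return the same three lines as the shell command, which makes it easy to compare two copies of the file.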

Table tls903

The tls903 file is not imported (not even by the original PATSTAT script). Do you know if there is a reason for this? Is it possible to import it as well?

reg_code,reg_label,up_reg_code,up_reg_label,up_reg_code_alt,up_reg_label_alt,ctry_code
"AT111","Mittelburgenland","AT11","BURGENLAND (A)","","","AT"
"AT112","Nordburgenland","AT11","BURGENLAND (A)","","","AT"
"AT113","Südburgenland","AT11","BURGENLAND (A)","","","AT"
"AT121","Mostviertel-Eisenwurzen","AT12","NIEDERÖSTERREICH","","","AT"
...
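
To illustrate that the file looks like a plain comma-separated lookup table with a header row, here is a small sketch that parses the sample rows above with Python's csv module. The function name is made up, and nothing here reflects how load_patstat.sh loads the other tables.

```python
import csv
import io

# Header and rows copied from the excerpt above.
SAMPLE = '''reg_code,reg_label,up_reg_code,up_reg_label,up_reg_code_alt,up_reg_label_alt,ctry_code
"AT111","Mittelburgenland","AT11","BURGENLAND (A)","","","AT"
"AT112","Nordburgenland","AT11","BURGENLAND (A)","","","AT"
"AT113","Südburgenland","AT11","BURGENLAND (A)","","","AT"
"AT121","Mostviertel-Eisenwurzen","AT12","NIEDERÖSTERREICH","","","AT"
'''


def read_regions(text):
    """Parse tls903-style CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_regions(SAMPLE)
print(list(rows[0].keys()))
print(rows[0]["reg_code"], rows[0]["up_reg_code"], rows[0]["ctry_code"])
```

The seven header names would map directly onto the columns of a tls903 table, so nothing in the format itself seems to prevent an import.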

standardized EEE-PPAT person table and TLS221_INPADOC_PRS

The README.md file says:

The utility is also capable of loading the standardized EEE-PPAT person table with harmonized assignee names and assignee sector allocations (https://www.ecoom.be/en/EEE-PPAT). This table has been officially included in version 2015a.

Is this the tls906_person table, which is already included by default in patstat_2017a?

TLS221_INPADOC_PRS DVD zipped table may be copied there as well.

Is this the tls231_inpadoc_legal_event table, which is already included by default in patstat_2017a?

Question about myisamchk and myisampack

The load_patstat.sh script uses myisamchk and myisampack. It seems that these commands require root or mysql user permissions on Unix, and I don't have these permissions. What are the implications if I run the script without these three commands?

    myisamchk  --keys-used=0 -rqp $MYSQLDATAPATH/$DB/$1*.MYI
    ...
    myisampack $MYSQLDATAPATH/$DB/$1.MYI
    ...
    myisamchk  -rqp --sort-buffer-size=2G $MYSQLDATAPATH/$DB/$1*.MYI
