wetneb / openrefine-wikibase

This repository has migrated to: https://gitlab.com/nfdi4culture/ta1-data-enrichment/openrefine-wikibase

License: Other

Languages: Python 93.31%, HTML 6.54%, Dockerfile 0.15%
Topics: openrefine, wikidata, reconciliation, reconciliation-service, wikibase, reconciliation-interface


openrefine-wikibase's People

Contributors

diegodlh, framawiki, jbpressac, paulduchesne, regisrob, shigapov, stylestrip, thadguidry, wetneb


openrefine-wikibase's Issues

Autocomplete: show main label instead of matched alias

When searching for "start date", two properties show up with apparently the same name: one has it as its main label, the other as an alias. The behaviour of the autocomplete endpoint should match that of Wikidata.

Wikidata properties reconciliation

Hello Antonin,

Are there plans to also reconcile the properties of Wikidata and not just the items? This would be very useful to facilitate mapping with other ontologies such as schema.org.

Support querying additional columns "As Property" to increase match score

The old Freebase reconciliation service accepted a parameter called "columnDetails" for passing additional columns to the service, to increase match scoring.

Some Code: https://github.com/OpenRefine/OpenRefine/blob/d5fb07384242d07241b0ce103c47d930046f1135/main/src/com/google/refine/model/recon/StandardReconConfig.java#L212

Here is a simple test case CSV that can be added to test functionality once implemented:

Title,Published In,Author
"Ovarian Cancer",International Journal of Medical Sciences,E Visonà
"Trilobita Walch, 1771",,
Insecticide,JAMA,Steketee

Matches:
https://www.wikidata.org/wiki/Q24791449
https://www.wikidata.org/wiki/Q21032691
https://www.wikidata.org/wiki/Q21032524
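
For reference, the reconciliation API that OpenRefine speaks already defines a per-query "properties" array for exactly this purpose. Below is a minimal sketch of what such a batch could look like on the wire, using only the Python standard library; the endpoint URL and the property IDs (P1433 for "published in", P50 for "author") are my assumptions for illustration, not taken from this issue.

    import json
    import urllib.parse
    import urllib.request

    # Hypothetical endpoint; the service has moved over time.
    ENDPOINT = "https://wikidata.reconci.link/en/api"

    # One reconciliation query passing extra columns as property
    # constraints, in the "pid" + "v" format OpenRefine sends.
    queries = {
        "q0": {
            "query": "Ovarian Cancer",
            "limit": 3,
            "properties": [
                {"pid": "P1433", "v": "International Journal of Medical Sciences"},
                {"pid": "P50", "v": "E Visonà"},
            ],
        }
    }

    data = urllib.parse.urlencode({"queries": json.dumps(queries)}).encode()
    with urllib.request.urlopen(ENDPOINT, data=data) as resp:
        for candidate in json.loads(resp.read())["q0"]["result"]:
            print(candidate["id"], candidate["name"], candidate["score"])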

CORS support for access to openrefine-wikidata

When embedding query strings, we need to access endpoints that are CORS-enabled.

For example, this MediaWiki API endpoint is not natively CORS-enabled:
https://es.wikipedia.org/w/api.php?action=query&prop=extracts&exsentences=3&titles=Inform%C3%A1tica&exintro=&format=json

Hay ([email protected]) advised appending 'origin=*', like this:
https://es.wikipedia.org/w/api.php?action=query&prop=extracts&exsentences=3&titles=Inform%C3%A1tica&exintro=&format=json&origin=*

Similarly, the openrefine-wikidata site is not CORS-enabled:
https://tools.wmflabs.org/openrefine-wikidata/en/api?query=https://en.wikipedia.org/wiki/Tom_Hanks

We tried this:
https://tools.wmflabs.org/openrefine-wikidata/en/api?query=https://en.wikipedia.org/wiki/Tom_Hanks&origin=*

but it does not behave the way the 'origin' parameter does on the MediaWiki (PHP) API.

How can we access openrefine-wikidata if our access programs require the target site to be CORS-enabled?

/jay gray
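
The error pages quoted elsewhere in this tracker look like Bottle's defaults, so, assuming the service is a Bottle app, a minimal server-side fix could be a hook that adds the CORS header to every response. This is a sketch of the general technique, not the service's actual code:

    from bottle import Bottle, response

    app = Bottle()

    # Send Access-Control-Allow-Origin on every response so that browser
    # clients may call the API cross-origin (the equivalent of appending
    # 'origin=*' on the MediaWiki API).
    @app.hook("after_request")
    def enable_cors():
        response.headers["Access-Control-Allow-Origin"] = "*"

    @app.route("/en/api")
    def api():
        return {"status": "ok"}  # placeholder handler for illustration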

Queries preprocessing

I wonder if a little automatic preprocessing couldn't dramatically increase the reconciliation rate. It looks like a single blank space after an apostrophe completely changes the result.

[screencast]
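
As an illustration of the kind of preprocessing proposed here, a sketch (my own, not the service's code) that strips the stray space after an apostrophe and collapses runs of whitespace before querying:

    import re

    def normalize_query(text: str) -> str:
        """Illustrative preprocessing for reconciliation queries."""
        text = re.sub(r"'\s+", "'", text)  # "l' église" -> "l'église"
        text = re.sub(r"\s+", " ", text)   # collapse runs of whitespace
        return text.strip()

    assert normalize_query("l' église  Saint-Pierre ") == "l'église Saint-Pierre"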

null value for properties in request causes 500 internal server error
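
The title says it all; below is a sketch of the defensive handling it calls for, dropping null-valued property constraints from an incoming query instead of failing with a 500. Illustrative only, not the service's actual code:

    def clean_properties(query: dict) -> dict:
        """Drop property constraints whose value is null."""
        props = query.get("properties") or []
        query["properties"] = [p for p in props if p.get("v") is not None]
        return query

    q = {"query": "Journey", "properties": [{"pid": "P123", "v": None}]}
    assert clean_properties(q)["properties"] == []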

Remember history of edit summaries

When processing a large data set gradually, the edit summary may often be precisely the same, e.g. "Adding population numbers based on 2011 census". It would be useful not to have to re-type it every time.

How to create reconciliation data from existing QIDs?

If I wanted to use the update and add features and had obtained Wikidata QIDs by means other than reconciling with OpenRefine, how should I construct the reconciliation data to be able to proceed?

I can reconcile afresh using the textual data or the QIDs, but that seems unnecessary.
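
One workaround, sketched below: the service resolves QIDs (and Wikidata/Wikipedia URLs) used as the query string, so a column of known QIDs can be re-matched exactly without textual reconciliation. The endpoint URL is an assumption for illustration.

    import json
    import urllib.parse
    import urllib.request

    ENDPOINT = "https://wikidata.reconci.link/en/api"  # assumed endpoint
    qids = ["Q42", "Q727"]
    queries = {f"q{i}": {"query": qid, "limit": 1} for i, qid in enumerate(qids)}
    data = urllib.parse.urlencode({"queries": json.dumps(queries)}).encode()
    with urllib.request.urlopen(ENDPOINT, data=data) as resp:
        results = json.loads(resp.read())
        for key, qid in zip(queries, qids):
            best = results[key]["result"][0]
            print(qid, "->", best["id"], best["match"])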

Guess Types sometimes causes continuous "Working..."

Steps to reproduce with OpenRefine 2.6 RC2

When trying to reconcile on Column 1... (where the 3rd row is purposefully misspelled)

All  Column 1  Column 2
1.   Journey   Ubisoft
2.   Journey   Microsoft
3.   Jouney    Infocom
4.

I get the error stack trace below. Perhaps this is due to the empty value in my 4th row, where "query":""?

Interestingly, the 3rd row actually shows up as "q1"; note the misspelling I purposely made, "Jouney". Yet it does not throw an error, and I get Types returned for Column 2 just fine... but reconciling on Column 1 always throws an error.

21:30:01.805 [                   refine] POST /command/core/guess-types-of-column (9027ms)
21:30:02.357 [                  command] Failed to guess cell types for load
{"q0":{"query":"Journey","limit":3},"q1":{"query":"Jouney","limit":3},"q2":{"query":"","limit":3}} (552ms)
java.io.IOException:
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html>
        <head>
            <title>Error: 500 Internal Server Error</title>
            <style type="text/css">
              html {background-color: #eee; font-family: sans;}
              body {background-color: #fff; border: 1px solid #ddd;
                    padding: 15px; margin: 15px;}
              pre {background-color: #eee; border: 1px solid #ddd; padding: 5px;}
            </style>
        </head>
        <body>
            <h1>Error: 500 Internal Server Error</h1>
            <p>Sorry, the requested URL <tt>&#039;https://tools.wmflabs.org/openrefine-wikidata/api&#039;</tt>
               caused an error:</p>
            <pre>Internal Server Error</pre>
        </body>
    </html>

        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.guessTypes(GuessTypesOfColumnCommand.java:191)
        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.doPost(GuessTypesOfColumnCommand.java:89)
        at com.google.refine.RefineServlet.service(RefineServlet.java:177)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
...
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
...
        at java.lang.Thread.run(Unknown Source)
21:30:02.360 [                  command] Exception caught (3ms)
java.io.IOException:
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html>
        <head>
            <title>Error: 500 Internal Server Error</title>
            <style type="text/css">
              html {background-color: #eee; font-family: sans;}
              body {background-color: #fff; border: 1px solid #ddd;
                    padding: 15px; margin: 15px;}
              pre {background-color: #eee; border: 1px solid #ddd; padding: 5px;}
            </style>
        </head>
        <body>
            <h1>Error: 500 Internal Server Error</h1>
            <p>Sorry, the requested URL <tt>&#039;https://tools.wmflabs.org/openrefine-wikidata/api&#039;</tt>
               caused an error:</p>
            <pre>Internal Server Error</pre>
        </body>
    </html>

        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.guessTypes(GuessTypesOfColumnCommand.java:191)
        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.doPost(GuessTypesOfColumnCommand.java:89)
        at com.google.refine.RefineServlet.service(RefineServlet.java:177)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
...
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
...
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
21:30:07.445 [           ProjectManager] Saving some modified projects ... (5085ms)
21:30:07.453 [        project_utilities] Saved project '1754850059635' (8ms)

Copy references across statements in Wikidata schema

I began using OpenRefine seriously, and the Wikidata support is just fantastic. One thing that would save some time: after adding a reference in the Wikidata schema, it would help if the reference could be "filled down", copied and pasted, or otherwise duplicated for cases where all the statements are supported by the same source (e.g. census data).

Performance challenges

The reconciliation service's throughput is remarkably low. It causes part of the requests to time out and produce errors. Processing times are very long and error rates high, resulting in a lot of manual work.

The previews generally don't work, so the links to Wikidata cannot be used as a substitute for automatic lookup. The link contains the Wikidata ID, which can be seen in the browser and checked that way.

It would be handy not to have to reconcile already-reconciled data, but copying reconciliation data produces no results.

I have a database of 3000 items and several reconciled properties. I am running a 3.0 beta, though not necessarily the very latest (I cannot find the version number).

I have tested with different internet connections, and that does not seem to be the reason.

Search preview

After a reconciliation, when no candidate is suitable, it's sometimes necessary to modify the spelling of the term by using the "search" function.

But Wikidata then returns so many homonyms, and QIDs carry so little information, that it is difficult to choose the right one without previewing each Wikidata page returned.

[screenshot]

There was a system of this type with Freebase or with the OpenCorporates reconciliation service.

[screenshot]

It would be a great improvement if it could be implemented easily. :)

Thanks again for this very useful service!

PS: I'm sure this point must have been raised elsewhere, but I did not find it.

Type constraints should be smoother

Currently, candidates whose type does not match the target type (via subclasses) are only included if no candidate of the right type was found, and their score is divided by two. But some types are arguably closer than others: "university" and "school" are close to each other while not being subclasses of each other, whereas "school" and "tombstone" are further apart.

Figure out a way to return penalized candidates whose type is "close" to the target type (for an appropriate notion of proximity).
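
A sketch of one possible notion of proximity (my own illustration, not a design decision recorded in this issue): measure how far apart the two types sit in the P279 "subclass of" graph and scale the penalty accordingly.

    import networkx as nx

    def type_proximity_penalty(graph: nx.DiGraph, candidate_type: str,
                               target_type: str, max_dist: int = 4) -> float:
        """Score multiplier in [0, 1]: 1.0 for identical types, falling
        linearly with the undirected distance in the subclass graph.
        `graph` has an edge A -> B when A is a subclass of B."""
        undirected = graph.to_undirected(as_view=True)
        try:
            d = nx.shortest_path_length(undirected, candidate_type, target_type)
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            return 0.0
        return max(0.0, 1.0 - d / max_dist)

    # university and school both subclass educational institution:
    g = nx.DiGraph([("Q3918", "Q2385804"), ("Q3914", "Q2385804")])
    print(type_proximity_penalty(g, "Q3918", "Q3914"))  # 0.5 (two hops apart)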

Server error when the service tries to guess the category

This is the first time I get this kind of error, right when I launch a reconciliation. :-|

12:03:46.534 [                  command] Failed to guess cell types for load
{"q0":{"query":"rommel","limit":3},"q1":{"query":"carette leuven","limit":3},"q2":{"query":"leuven sigaren","limit":3},"q3":{"query":"guillemin gombert bruxelles","limit":3},"q4":{"query":"incendie","limit":3},"q5":{"query":"guillemin directeur institut","limit":3},"q6":{"query":"de smet","limit":3},"q7":{"query":"pierre victor guillemin","limit":3},"q8":{"query":"bosco","limit":3},"q9":{"query":"1878rachez","limit":3}} (2289ms)
java.io.IOException:
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html>
        <head>
            <title>Error: 500 Internal Server Error</title>
            <style type="text/css">
              html {background-color: #eee; font-family: sans;}
              body {background-color: #fff; border: 1px solid #ddd;
                    padding: 15px; margin: 15px;}
              pre {background-color: #eee; border: 1px solid #ddd; padding: 5px;}
            </style>
        </head>
        <body>
            <h1>Error: 500 Internal Server Error</h1>
            <p>Sorry, the requested URL <tt>&#039;https://tools.wmflabs.org/openrefine-wikidata/fr/api&#039;</tt>
               caused an error:</p>
            <pre>Internal Server Error</pre>
        </body>
    </html>

        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.guessTypes(GuessTypesOfColumnCommand.java:191)
        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.doPost(GuessTypesOfColumnCommand.java:89)
        at com.google.refine.RefineServlet.service(RefineServlet.java:177)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
        at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
        at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
12:03:46.535 [                  command] Exception caught (1ms)
java.io.IOException:
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html>
        <head>
            <title>Error: 500 Internal Server Error</title>
            <style type="text/css">
              html {background-color: #eee; font-family: sans;}
              body {background-color: #fff; border: 1px solid #ddd;
                    padding: 15px; margin: 15px;}
              pre {background-color: #eee; border: 1px solid #ddd; padding: 5px;}
            </style>
        </head>
        <body>
            <h1>Error: 500 Internal Server Error</h1>
            <p>Sorry, the requested URL <tt>&#039;https://tools.wmflabs.org/openrefine-wikidata/fr/api&#039;</tt>
               caused an error:</p>
            <pre>Internal Server Error</pre>
        </body>
    </html>

        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.guessTypes(GuessTypesOfColumnCommand.java:191)
        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.doPost(GuessTypesOfColumnCommand.java:89)
        at com.google.refine.RefineServlet.service(RefineServlet.java:177)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
        at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
        at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

Exclude certain types during reconciliation?

Currently in OpenRefine there's an option to reconcile against particular types or no particular type; would it be possible to exclude certain types (for example, Wikidata disambiguation pages) during reconciliation? I'm not sure if this is a request better put to the OpenRefine repo; please let me know if so. Thanks!
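
Whether it lands in OpenRefine or in the service, the filter itself is simple. A sketch, assuming candidates in the reconciliation API's result format, with Wikimedia disambiguation pages (Q4167410) as the default exclusion:

    # Q4167410 is "Wikimedia disambiguation page".
    EXCLUDED_TYPES = frozenset({"Q4167410"})

    def filter_candidates(candidates, excluded_types=EXCLUDED_TYPES):
        """Drop candidates whose declared types intersect the exclusion
        set. Assumes each candidate carries the reconciliation API's
        "type" list of {"id": ..., "name": ...} objects."""
        return [
            c for c in candidates
            if not {t["id"] for t in c.get("type", [])} & excluded_types
        ]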

Compatibility with Google Refine 2.5

Hi - thank you SO MUCH for implementing this service. I have been waiting for it for a very long time!

I wanted to test it with a small dataset, but can't get it to work. Here's my process:

I scraped the member section of http://www.kvab.be/ (the Flemish Academy of Science, which has many notable members). The dataset is in Mix'n'Match too (see http://tools.wmflabs.org/mix-n-match/?mode=catalog_details&catalog=359); there I got 90 matches with Wikidata.

Added the data to OpenRefine and cleaned it a bit.
[screenshot]
The data I'm working with is this: https://drive.google.com/open?id=0B9jfaELxxRSxU2hadzJJZGxleTg

Now I want to reconcile the names in the 'persoon 1' column. (I'm absolutely certain that many of these are on Wikidata)

Starting the reconciliation service:
[screenshot]

I think it already goes wrong here: I get no automatic type suggestions like in the example you provided at https://tools.wmflabs.org/openrefine-wikidata/.
[screenshot]

When I type anything in the 'reconcile against type' box, it just keeps saying 'searching...' and produces no results. I tried with human, Q5; it makes no difference.
[screenshot]

For now, I have just typed Q5 there and continued with the reconciliation itself.

It's busy now. It takes a few minutes for 233 names.
[screenshot]

And the results: no matches, unfortunately, although I'm quite certain that quite a few of these names have exact and unique matches on Wikidata.
[screenshot]

For instance, Katlijne van der Stighelen has an exact and unique match:
[screenshot]

But even when using the 'search for match' function, OpenRefine keeps hanging on the search.
[screenshot]

Now I'm wondering whether this is an error on my side. Am I doing something wrong in the process, or do others have the same issue? I have noticed earlier that my Chrome browser sometimes acts strangely. Thanks!

Show progress in console while reconciling

Reconciling seems to hang sometimes. It would be good to emit some progress indicators in the console (if not in the UI), so that we can notice a hang and restart OpenRefine. Especially on slower connections, it's crucial to be able to tell slowness apart from a hang.

Exception when trying to add OpenRefine date (not string) values with too high precision

I got...

org.wikidata.wdtk.wikibaseapi.apierrors.MediaWikiApiErrorException: [modification-failed] Out of range, must be no higher than 11
  at org.wikidata.wdtk.wikibaseapi.apierrors.MediaWikiApiErrorHandler.throwMediaWikiApiErrorException(MediaWikiApiErrorHandler.java:62)
  at org.wikidata.wdtk.wikibaseapi.ApiConnection.checkErrors(ApiConnection.java:422)
  at org.wikidata.wdtk.wikibaseapi.ApiConnection.sendJsonRequest(ApiConnection.java:362)
  at org.wikidata.wdtk.wikibaseapi.WbEditingAction.performAPIAction(WbEditingAction.java:720)
  at org.wikidata.wdtk.wikibaseapi.WbEditingAction.wbEditEntity(WbEditingAction.java:298)
  at org.wikidata.wdtk.wikibaseapi.WikibaseDataEditor.createItemDocument(WikibaseDataEditor.java:247)
  at org.openrefine.wikidata.editing.EditBatchProcessor.performEdit(EditBatchProcessor.java:140)
  at org.openrefine.wikidata.operations.PerformWikibaseEditsOperation$PerformEditsProcess.run(PerformWikibaseEditsOperation.java:211)
  at java.lang.Thread.run(Thread.java:748)
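
The error message itself names the constraint: Wikibase refuses time precisions above 11 (day), while OpenRefine date cells can carry hour, minute, or second precision (codes 12 to 14). A clamp before the edit is sent would avoid the exception; this is a sketch of the idea, not the fix that was actually committed:

    # Wikibase time precision codes: 9 = year, 10 = month, 11 = day,
    # 12 = hour, 13 = minute, 14 = second. Statement values only accept
    # precisions up to 11.
    MAX_WIKIBASE_TIME_PRECISION = 11

    def clamp_time_precision(precision: int) -> int:
        """Clamp an OpenRefine-derived precision to what Wikibase accepts."""
        return min(precision, MAX_WIKIBASE_TIME_PRECISION)

    print(clamp_time_precision(14))  # second precision -> 11 (day)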

Use geographical coordinates for match scoring

This is such an amazing project that really helps so many other datasets to connect to Wikidata! Thank you @wetneb 🙇

I'm using the tool to reconcile the Natural Earth dataset of public domain map data against Wikidata. This worked really well for countries, with 100% matches (nvkelso/natural-earth-vector#224 (comment)), but while matching airports by name/IATA code, there are multiple matches due to wrong data in Wikidata itself.

In such geographical datasets, it would be really useful to leverage the coordinate information in the source dataset (usually as X=lon, Y=lat columns) to suggest the best match when possible.
[screenshot]

User story

  • User chooses the X, Y coordinate columns before starting reconciliation
  • Include the distance between the source coordinates and the Wikidata coordinate (P625) in the match score (see the sketch below)
  • Have a facet slider to filter out results by match distance
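
A sketch of the scoring component this user story asks for, using a plain haversine distance. The (lat, lon) shape of the P625 value and the 100 km cutoff are illustrative assumptions:

    from math import asin, cos, radians, sin, sqrt

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance in kilometres between two WGS84 points."""
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        h = (sin((lat2 - lat1) / 2) ** 2
             + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371 * asin(sqrt(h))

    def distance_score(source_xy, candidate_p625, cutoff_km=100.0):
        """100 for identical coordinates, falling linearly to 0 at the
        cutoff. source_xy is (X=lon, Y=lat); candidate_p625 is (lat, lon)."""
        d = haversine_km(source_xy[1], source_xy[0], *candidate_p625)
        return max(0.0, 100.0 * (1 - d / cutoff_km))

    print(distance_score((4.9041, 52.3676), (52.3676, 4.9041)))  # 100.0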

Ability to enable/disable parts of the schema for specific updates

It'd be nice to be able to apply only parts of the schema for certain facets of the data.

When items already exist and have some statements or terms defined differently from my dataset, I don't want to replace those terms, and I don't want to duplicate the statement with my slightly different statements.

Perhaps if I can reconcile them with something like #42 this wouldn't be needed.

Currently I delete the part of the schema that would modify someone else's good description which just happens to be different from mine. The same goes for statements: if the same statement already exists but with one more or one fewer qualifier, and I'm not careful to delete the part of the schema that would create the statement, I end up duplicating it.

Sort the candidates by inverse QID magnitude?

Hi @wetneb. I come back with my scoring problems. :p

I was wondering: is it technically possible to rank the candidates of a reconciliation by increasing QID number? Let me explain. If I reconcile, for example, the string "Amsterdam" against a vague type like "geographic region" (Q82794), the result is often surprising. The first Amsterdam proposed is an obscure town in Texas, while the famous city in the Netherlands is only in 4th position.

[screenshot]

It's therefore impossible to perform a "match each cell with its best candidate". On the other hand, when you click on "search for match", the right Amsterdam appears directly in first position.
[screenshot]

It's immediately recognizable by the fact that its QID contains only three digits, undoubtedly a sign that it's an important entity created very early in the history of Wikidata, and certainly well before the locality in Texas (Q4748816).

If it were possible to perform this ranking, the manual reconciliation task would be greatly facilitated.
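
Mechanically, the proposed tie-break is easy to express; a sketch follows, with the caveat that QID magnitude is at best a rough proxy for prominence:

    def qid_number(candidate):
        """Numeric part of a candidate's QID, e.g. "Q727" -> 727."""
        return int(candidate["id"].lstrip("Q"))

    def rank(candidates):
        """Sort by the service's score (descending) and, among equal
        scores, prefer lower QIDs, which tend to be older items."""
        return sorted(candidates, key=lambda c: (-c["score"], qid_number(c)))

    candidates = [
        {"id": "Q4748816", "score": 100},  # Amsterdam, Texas
        {"id": "Q727", "score": 100},      # Amsterdam, Netherlands
    ]
    print([c["id"] for c in rank(candidates)])  # ['Q727', 'Q4748816']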

Scoring System

Hi Antonin,

I finally had the opportunity to test the reconciliation intensively: it's great work :) For now, the only remark that comes to mind concerns the scoring of ties. For the French month "Avril", for example, I have four candidates that each have 100% probability. The first three are "Avril" and the fourth is "Avril Lavigne". I wonder if we should not penalize candidates whose Levenshtein distance exceeds a certain threshold, something like that. For the rest, thank you again!

Yours,
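
A sketch of the suggested penalty: compute the edit distance between the query and each candidate's label, and halve the score past a threshold. The threshold and the halving factor are illustrative choices, not anything the service implements:

    def levenshtein(a: str, b: str) -> int:
        """Classic dynamic-programming edit distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

    def penalized_score(score, query, label, threshold=3):
        """Halve the score when the label strays too far from the query."""
        return score / 2 if levenshtein(query, label) > threshold else score

    print(penalized_score(100.0, "Avril", "Avril"))          # 100.0
    print(penalized_score(100.0, "Avril", "Avril Lavigne"))  # 50.0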

Error 403 forbidden - make error messages more meaningful

Hi @wetneb,

I get this error in the OpenRefine console during a reconciliation. It's difficult to say which values cause it. Is this error documented?

10:05:17.403 [    refine-standard-recon]
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html>
        <head>
            <title>Error: 403 Forbidden</title>
            <style type="text/css">
              html {background-color: #eee; font-family: sans;}
              body {background-color: #fff; border: 1px solid #ddd;
                    padding: 15px; margin: 15px;}
              pre {background-color: #eee; border: 1px solid #ddd; padding: 5px;}
            </style>
        </head>
        <body>
            <h1>Error: 403 Forbidden</h1>
            <p>Sorry, the requested URL <tt>&#039;https://tools.wmflabs.org/openrefine-wikidata/fr/api&#039;</tt>
               caused an error:</p>
            <pre>{&quot;message&quot;: &quot;invalid query&quot;, &quot;status&quot;: &quot;error&quot;, &quot;details&quot;: &quot;&#039;id&#039;&quot;}</pre>
        </body>
    </html>

Duplicate items created when creating multiple new linked items in one update

I think there's a problem creating multiple interlinked items in one update.

I'm trying to create a mayor and an executive and link them to each other and to an existing municipality in one update. What ends up happening is the following:

  • an item Q57414669 is created without a label, with just the "part of" statement to Q57414667
  • an item Q57414667 is created without a label, with the "has part" statement to Q57414666
  • an item Q57414666 is created with a label, with everything except the "part of" statement which Q57414669 got
  • the municipality Q1983450 gets an "executive body" statement to Q57414667
  • an item Q57414665 is created with a label and everything except the "has part" statement which Q57414667 got

Here's the "Export to QuickStatements" output of the same update set, after running the updates:

Q57414667	Laf	"Burgermeesterskommittee van Newcastle"
Q57414667	Len	"Mayoral Committee of Newcastle"
Q57414667	Daf	"uitvoerende gesag van die Newcastle Plaaslike Munisipaliteit"
Q57414667	Den	"executive authority of the Newcastle Local Municipality"
Q57414667	P17	Q258
Q57414667	P1001	Q1983450
Q57414667	P279	Q640506
Q57414667	P527	Q57414669
Q1983450	P208	Q57414667
Q1983450	P1313	Q57414669
Q57414669	Len	"Executive Mayor of Newcastle Local Municipality"
Q57414669	Laf	"Uitvoerende Burgermeester van Newcastle Plaaslike Munisipaliteit"
Q57414669	Den	"head of government of the Newcastle Local Municipality"
Q57414669	Daf	"regeringshoof van die Newcastle Plaaslike Munisipaliteit"
Q57414669	P17	Q258
Q57414669	P361	Q57414667
Q57414669	P1001	Q1983450
Q57414669	P31	Q294414

[screenshots]
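
A likely shape for the fix is two-pass creation: mint every new item first, then add the statements that reference the freshly created QIDs. The sketch below is my illustration of that idea; create_item and add_statements are hypothetical stand-ins for the editing backend, not functions from this codebase.

    def create_interlinked_items(new_items, create_item, add_statements):
        """First pass: create each item with its labels only. Second pass:
        add statements, replacing references to not-yet-existing items by
        the QIDs minted in the first pass."""
        qids = {key: create_item(item["labels"]) for key, item in new_items.items()}
        for key, item in new_items.items():
            resolved = [
                {**st, "value": qids.get(st["value"], st["value"])}
                for st in item["statements"]
            ]
            add_statements(qids[key], resolved)
        return qids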

Implement "type_strict"

@thadguidry: can you point me to the description of what each value of "type_strict" should do?
I can see that it expects "any", "all" or "should", but I'm not sure what they mean. I think the interface currently implements the "any" mode: items have to match at least one of the provided types (so, the union). I guess "all" means the intersection of the provided types. As for "should", I suppose it becomes a soft constraint? How is it specified?
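
For what it's worth, here is that reading expressed as a sketch; it encodes the interpretation in the question above, not an authoritative spec:

    def type_matches(candidate_types, target_types, type_strict="any"):
        """ "any": at least one target type (union; the current mode).
        "all": every target type (intersection).
        "should": a soft preference that never excludes a candidate
        (scoring elsewhere would reward matches)."""
        cand, target = set(candidate_types), set(target_types)
        if type_strict == "all":
            return target <= cand
        if type_strict == "should":
            return True
        return bool(cand & target)

    print(type_matches({"Q5"}, {"Q5", "Q901"}, "any"))  # True
    print(type_matches({"Q5"}, {"Q5", "Q901"}, "all"))  # False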

Impossible to match these five QIDs

I don't know what these five QIDs have in particular, but I cannot match them with Wikidata (they exist, they are even very important items, and "search for match" finds them).

Weird. It may be nothing, but it may be the sign of a bug somewhere, so I prefer to report it.

To reproduce the bug: go to the "qid" column and try to reconcile with Wikidata (en), choosing "Reconcile against no particular type".

[screencast]

(OR 2.8 and 3.0, Windows 10, Google Chrome)

OR Project : wikipedia_errors.openrefine.tar.gz

Let users fetch descriptions or sitelinks

The data extension protocol should let users fetch descriptions, sitelinks, aliases, and other things like that. These could be added by default to the suggested properties.
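
In data extension terms, this would mean accepting pseudo-property identifiers alongside real PIDs. A sketch of such a request, where "description" is the hypothetical pseudo-property this issue proposes (the endpoint URL and the response shape are also assumptions):

    import json
    import urllib.parse
    import urllib.request

    ENDPOINT = "https://wikidata.reconci.link/en/api"  # assumed endpoint
    extend = {
        "ids": ["Q42"],
        # P856 (official website) is a real property; "description" is the
        # pseudo-property this issue asks for and does not exist today.
        "properties": [{"id": "P856"}, {"id": "description"}],
    }
    data = urllib.parse.urlencode({"extend": json.dumps(extend)}).encode()
    with urllib.request.urlopen(ENDPOINT, data=data) as resp:
        print(json.loads(resp.read())["rows"]["Q42"])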

Suggest service sometimes does not respond

Hi Antonin,

Spent some time this evening with PR1167 and couldn't get the Suggest service to respond like it did earlier today: no dropdown of suggested properties for my columns, nor for "Reconcile against type:". However, the main type guessing was working consistently. No errors in the command console. FYI.
