elasticsearch-reindex's People

Contributors

abador, avvs, chrishiestand, danpaz, drinks, garbin, i-like-robots, jwarkentin, mrq1911, pakaufmann, thedeveloper

elasticsearch-reindex's Issues

Fails to reindex documents that have parents

If you set up a parent-child relationship between documents, the child documents fail to be reindexed.

PUT /company
{
  "mappings": {
    "branch": {},
    "employee": {
      "_parent": {
        "type": "branch"
      }
    }
  }
}

In this case, any employee documents will not be reindexed.

Not working with Node 4.2.1

Getting the following error:

Starting reindex in 1 shards.
/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:65
Log.prototype.listenerCount = function (event) {
^

RangeError: Maximum call stack size exceeded
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:65:40)
at Function.EventEmitter.listenerCount (events.js:399:20)
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:68:25)
at Function.EventEmitter.listenerCount (events.js:399:20)
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:68:25)
at Function.EventEmitter.listenerCount (events.js:399:20)
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:68:25)
at Function.EventEmitter.listenerCount (events.js:399:20)
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:68:25)
at Function.EventEmitter.listenerCount (events.js:399:20)
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:68:25)
at Function.EventEmitter.listenerCount (events.js:399:20)
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:68:25)
at Function.EventEmitter.listenerCount (events.js:399:20)
at Log.listenerCount (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:68:25)
at Function.EventEmitter.listenerCount (events.js:399:20)
worker exited with error code: 1
Reindexing completed sucessfully.

Do a reindex in background or detached mode

Hello all.

I've tried to execute the script in background mode (from a command line with &) but I haven't been able to run it and keep the output result in a file (for example).

If I run it as:

elasticsearch-reindex -f http://localhost:9200/index -t https:/<new_elastic>:9200 indexer.js &

The output reindexing [--------------------------------------] 0/2998080(0%) 270.5 0.0s - 1/30 working is still shown and the script is not executed in background mode.

If I run it in the background and redirect the output to a file, like:

elasticsearch-reindex -f http://localhost:9200/index -t https:/<new_elastic>:9200 indexer.js 1> /var/tmp/es-reindex.log &

the file contains only this output:

Starting reindex in 30 shards.
Reindexing completed sucessfully.

The progress output reindexing [--------------------------------------] ... is missing.

Is there any way to run it in the background and keep the output?

Thanks in advance.

Regards,
Miguel.
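
One way to approach the question above is to launch the CLI from a small Node script instead of relying on the shell's &. The sketch below is only an illustration: runDetached is a hypothetical helper name, the URLs in the example call are placeholders, and it assumes the progress line is written to stderr (the command above redirects stdout only). Many progress bars also draw only when attached to a terminal, so the bar may still be absent from the log.

```javascript
// Sketch: launch a command detached from the terminal and send both
// stdout and stderr to a log file. runDetached is a hypothetical helper.
const { spawn } = require('child_process');
const fs = require('fs');

function runDetached(command, args, logPath) {
  // Open the log in append mode; the child receives its own copy of the descriptor.
  const fd = fs.openSync(logPath, 'a');
  const child = spawn(command, args, {
    detached: true,            // let the child keep running after this script exits
    stdio: ['ignore', fd, fd], // no stdin; stdout and stderr both go to the log
  });
  fs.closeSync(fd);            // the parent's copy is no longer needed
  child.unref();               // do not keep this script's event loop alive
  return child;
}

// Example call (not executed here); the URLs are placeholders:
// runDetached('elasticsearch-reindex',
//   ['-f', 'http://localhost:9200/index', '-t', 'http://localhost:9200/index-new', 'indexer.js'],
//   '/var/tmp/es-reindex.log');
```

In a shell, redirecting both streams (for example with 2>&1) has the same effect on where the output goes.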

Increasing memory usage when trying to reindex an index

I have an index with 26 million documents which I am trying to reindex. However, the memory usage of the reindex script grows constantly; it is already at 4 GB while reporting 19% done. Is this expected? It seems like a lot of memory for a simple copy operation, which should be done in a streaming fashion, right?

Also, no documents have arrived in index-2 yet. Would it only start writing data at the end?

$ elasticsearch-reindex -f http://localhost:9200/index-1/type -t http://localhost:9200/index-2/type -c 3 -b 10000 -q 1000 -s 10m -o 600000
Starting reindex in 1 shards.
 reindexing [======------------------------] 5016000/26000000(19%) 4028.4 16852.6s - 1/1 working
 1119 ec2-user  20   0 4833m 4.1g 9904 R 99.8 13.8  68:04.02 /usr/bin/node /usr/bin/elasticsearch-reindex

Add flag to turn off logging completely?

Small feature request: I'm currently using /dev/null as a destination for the logs, which is fine, except some of my colleagues are Windows users, and thus the script will fail if it tries to write there. I'm happy to look at this myself but would like to check that this would be an acceptable feature.
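
Until such a flag exists, a wrapper script could pick the platform's null device itself. The sketch below is an assumption about how one might do that; nullDevicePath is a hypothetical helper and nothing here reflects the tool's own options.

```javascript
// Sketch: return a path that discards everything written to it.
// nullDevicePath is a hypothetical helper, not part of elasticsearch-reindex.
function nullDevicePath(platform = process.platform) {
  // Windows has no /dev/null; its null device is addressed as \\.\NUL.
  return platform === 'win32' ? '\\\\.\\NUL' : '/dev/null';
}

console.log(nullDevicePath());
```

Newer Node releases also expose os.devNull, which makes the helper unnecessary there.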

Using "-z" as last option in command line prevents reindex script from executing

Hi,

The command line tool fails to execute the provided reindex script if the last option is "-z", e.g.

elasticsearch-reindex -f 'https://user:[email protected]/index/type' \
  -t 'https://user:[email protected]/index/type' -c 2 -b 25 -q 50 -v 2.1 -z my_reindex.js

won't execute my_reindex.js, whereas

elasticsearch-reindex -f 'https://user:[email protected]/index/type' \
  -t 'https://user:[email protected]/index/type' -z -c 2 -b 25 -q 50 -v 2.1 my_reindex.js

will.

I suspect this is due to the way command line options are evaluated. This is not a big issue, but should be fixed or documented.

Cheers and thanks for a great tool,
Patrick

Reindex is failing with error No living connections

While trying to re-index with the command below:

elasticsearch-reindex -f http://172.16.84.220:9200/prod.jps.2015.07.21/ -t http://172.16.84.220:9200/

where 172.16.84.220 is a dedicated ES client node. Besides the client node, the ES cluster has a dedicated master node and a data node.

I see the following error:

Elasticsearch ERROR: 2015-08-14T22:27:15Z
  Error: Request error, retrying -- getaddrinfo ENOTFOUND http
      at Log.error (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:213:60)
      at checkRespForFailure (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:192:18)
      at HttpConnector.<anonymous> (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/connectors/http.js:153:7)
      at ClientRequest.wrapper (/usr/local/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/node_modules/lodash/index.js:3095:19)
      at ClientRequest.emit (events.js:107:17)
      at Socket.socketErrorListener (_http_client.js:271:9)
      at Socket.emit (events.js:107:17)
      at net.js:950:16
      at process._tickCallback (node.js:355:11)

Elasticsearch WARNING: 2015-08-14T22:27:15Z
  Unable to revive connection: http://http/

Elasticsearch WARNING: 2015-08-14T22:27:15Z
  No living connections

Reindex error: Error: No Living connections

Environment

  • Node Version - v0.12.7
  • npm Version - 2.11.3
  • OS - Debian 7 - x86_64
  • ES Version - 1.7.1
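
The "Unable to revive connection: http://http/" line suggests that the host part of one URL was not extracted as intended, possibly because the target URL carries no index path. That is an assumption; the tool's own parsing is not shown here. As an illustration of the idea only, this sketch uses Node's built-in URL class to separate the host from the index and type; splitEsUrl is a hypothetical helper.

```javascript
// Sketch: split "http://host:port/index/type" into host, index, and type.
// splitEsUrl is a hypothetical helper; credentials in the URL are not carried over.
function splitEsUrl(input) {
  const url = new URL(input); // throws on input that is not a URL at all
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new TypeError('Expected an http(s) URL, got: ' + input);
  }
  // Drop empty segments so a trailing slash does not produce an empty index name.
  const [index, type] = url.pathname.split('/').filter(Boolean);
  return {
    host: url.protocol + '//' + url.host, // scheme, hostname, and port
    index: index || null,
    type: type || null,
  };
}

console.log(splitEsUrl('http://172.16.84.220:9200/prod.jps.2015.07.21/'));
console.log(splitEsUrl('http://172.16.84.220:9200/'));
```

The second call returns a null index, which makes it easy to report a missing target index before any connection is attempted.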

Authentication exception, and then data that should only be visible when authenticated

I'm trying to re-index a Compose MongoDB instance. With the correct credentials, I get this error:

Reindex error: Authentication Exception :: {"path":"/_bulk","query":{},"body":"{\"index\":{\"_index\":\"mutations_568e7be77d7aad7e18213534_2.4.x\",\"_type\":\"credit\",\"_id\":\"AVMIVJkZc9q5AbmWDN3X\"}}\n{\"type\":\"credit\",\"amount\":6600,\"origin\":\"SHOP_ce3741e6-be4d-482a-9d11-72b88bdfa9fd\",\"description\":\"Bestelling: undefined, Mon Feb 22 2016 04:34:34 GMT-0500 (EST)\ 
(...) 
"statusCode":401,"response":"<html><body><h1>401 Unauthorized</h1>\nYou need a valid user and password to access this content.\n</body></html>\n","wwwAuthenticateDirective":"Basic realm=\"bk_db\""}

The omitted bit is a very long dump of data in the elasticsearch index which makes me think that we are actually authenticated (why would it show data otherwise? Unless it's a massive security problem somewhere)

Any thoughts?

re-index not working

Hey, I need help. I tried:
elasticsearch-reindex -f http://localhost:9200/sms -t http://localhost:9200/sms-new
And I get this error:

Starting reindex in 1 shards.
/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:60
      throw new TypeError('Invalid hosts config. Expected a URL, an array of urls, a host config object, ' +
      ^

TypeError: Invalid hosts config. Expected a URL, an array of urls, a host config object, or an array of host config objects.
    at new Transport (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:60:13)
    at Object.EsApiClient (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/client.js:57:22)
    at new Client (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/client.js:101:10)
    at createClient (/usr/lib/node_modules/elasticsearch-reindex/bin/elasticsearch-reindex.js:194:18)
    at Object.<anonymous> (/usr/lib/node_modules/elasticsearch-reindex/bin/elasticsearch-reindex.js:212:14)
    at Module._compile (module.js:570:32)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
    at Function.Module._load (module.js:438:3)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:390:7)
    at startup (bootstrap_node.js:150:9)
    at bootstrap_node.js:505:3
worker exited with error code: 1
Reindexing completed sucessfully.

How can I fix it?
And if I try to re-index with an indexer.js (including a date), I get this error:

Starting reindex in 4 shards.
/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:60
      throw new TypeError('Invalid hosts config. Expected a URL, an array of urls, a host config object, ' +
      ^

TypeError: Invalid hosts config. Expected a URL, an array of urls, a host config object, or an array of host config objects.
    at new Transport (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:60:13)
    at Object.EsApiClient (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/client.js:57:22)
    at new Client (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/client.js:101:10)
    at createClient (/usr/lib/node_modules/elasticsearch-reindex/bin/elasticsearch-reindex.js:194:18)
    at Object.<anonymous> (/usr/lib/node_modules/elasticsearch-reindex/bin/elasticsearch-reindex.js:212:14)
    at Module._compile (module.js:570:32)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
    at Function.Module._load (module.js:438:3)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:390:7)
    at startup (bootstrap_node.js:150:9)
    at bootstrap_node.js:505:3
/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:60
      throw new TypeError('Invalid hosts config. Expected a URL, an array of urls, a host config object, ' +
      ^

TypeError: Invalid hosts config. Expected a URL, an array of urls, a host config object, or an array of host config objects.
    at new Transport (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:60:13)
    at Object.EsApiClient (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/client.js:57:22)
    at new Client (/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/client.js:101:10)
    at createClient (/usr/lib/node_modules/elasticsearch-reindex/bin/elasticsearch-reindex.js:194:18)
    at Object.<anonymous> (/usr/lib/node_modules/elasticsearch-reindex/bin/elasticsearch-reindex.js:212:14)
    at Module._compile (module.js:570:32)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
    at Function.Module._load (module.js:438:3)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:390:7)
    at startup (bootstrap_node.js:150:9)
    at bootstrap_node.js:505:3
worker exited with error code: 1
worker exited with error code: 1
/usr/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:60
      throw new TypeError('Invalid hosts config. Expected a URL, an array of urls, a host config object, ' +
      ^

Error when document has _parent field

Hello.

I have a child document that depends on a parent document.

During re-indexing, the following error occurs:

{
  "name": "elasticsearch-reindex",
  "hostname": "AONATE-PC",
  "pid": 3920,
  "level": 40,
  "index": {
    "_index": "issues",
    "_type": "messages",
    "_id": "D5EE3CE7",
    "status": 400,
    "error": "RoutingMissingException[routing is required for [issues]/[messages]/[D5EE3CE7]]"
  },
  "msg": "",
  "time": "2016-07-12T12:45:56.113Z",
  "src": {
    "file": "C:....\elasticsearch-reindex.js",
    "line": 211
  },
  "v": 0
}

Apparently the _parent field is not returned or is not accessible.
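
The RoutingMissingException means the bulk action for the child document carried neither a parent nor a routing value. The Elasticsearch 1.x and 2.x bulk API accepts a _parent key in the action metadata, so a custom indexer could forward it if the scrolled hit exposes it. The sketch below assumes that it does, either as item._parent or as item.fields._parent; whether the tool actually requests and returns that field is exactly what this issue questions. indexWithParent, parentOf, and the use of options.index are hypothetical.

```javascript
// Sketch of a custom indexer that copies the parent id into the bulk action.
// Assumption: the hit exposes the parent as item._parent or item.fields._parent.
function parentOf(item) {
  if (item._parent) return item._parent;
  const fromFields = item.fields && item.fields._parent;
  // Stored fields may come back as single-element arrays.
  return Array.isArray(fromFields) ? fromFields[0] : fromFields;
}

function indexWithParent(item, options) {
  const meta = {
    _index: options.index || item._index, // options.index is an assumption
    _type: options.type || item._type,
    _id: item._id,
  };
  const parent = parentOf(item);
  if (parent) meta._parent = parent; // routes the child to its parent's shard
  return [{ index: meta }, item._source];
}

module.exports = { index: indexWithParent };
```

Documents without a parent are passed through unchanged, so the same indexer can be used for both branch and employee style types.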

Re-indexing into an existing index

I am attempting to reindex an index into an already created/initialized index with new mapping configurations for given fields (but no existing data).

The tool does report success (though it does throw the dropped connection message) but the ES index is empty upon inspection.

If I target a non-existent index, the tool reports success (and the dropped connection message) and the ES index has all the relevant docs.

fwiw, I tried this with a very small number of docs ~100, so it isn't an issue of attempting to move gigs of data.

Reindex error: Error: Request Timeout after 30000ms (elasticsearch 6.0.1)

I need help! I'm using elasticsearch 6.0.1 and I'm trying to run this command:

elasticsearch-reindex -f http://****-****-****:9200/datapatient/patient -t http://localhost:9200/datapatientnew/patient

But I get an error: Reindex error: Error: Request Timeout after 30000ms

I also tried to run the following command from head plugin of my local machine:

_reindex

{
  "source": {
    "remote": {
      "host": "http://****-****-****:9200"
    },
    "index": "datapatient",
    "type": "patient"
  },
  "dest": {
    "index": "datapatientnew"
  }
}

And I got an error too:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Remote responded with a chunk that was too large. Use a smaller batch size."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Remote responded with a chunk that was too large. Use a smaller batch size.",
    "caused_by": {
      "type": "content_too_long_exception",
      "reason": "entity content is too long [523680994] for the configured buffer limit [104857600]"
    }
  },
  "status": 400
}

But when I try the same command with an index containing fewer documents, it works well.

Is there any way to increase the timeout value?
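
For the second error, the message itself points at the batch size: the _reindex API reads the source in scroll batches, and the size field inside source controls how many documents each batch holds. The sketch below builds such a request body; buildRemoteReindexBody is a hypothetical helper, the host is a placeholder, and the batch size of 100 is only illustrative. It does not address the CLI timeout.

```javascript
// Sketch: build a _reindex body that pulls from a remote cluster in small batches.
// buildRemoteReindexBody is a hypothetical helper.
function buildRemoteReindexBody({ remoteHost, sourceIndex, sourceType, destIndex, batchSize }) {
  return {
    source: {
      remote: { host: remoteHost },
      index: sourceIndex,
      type: sourceType,
      size: batchSize, // documents per scroll batch; smaller batches mean smaller responses
    },
    dest: { index: destIndex },
  };
}

const body = buildRemoteReindexBody({
  remoteHost: 'http://remote-es.example:9200', // placeholder host
  sourceIndex: 'datapatient',
  sourceType: 'patient',
  destIndex: 'datapatientnew',
  batchSize: 100, // illustrative value, not a recommendation
});
console.log(JSON.stringify(body, null, 2));
```

With large documents the batch size may need to go lower still, since the limit in the error applies to the size of each response rather than to the document count.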

Wrap the binary into a library so it can be called from other Node code?

Currently, this tool can only be called from the command line.

I'd like to call it from within my Node.js code: after creating a new index, I would run the reindex and then update the alias.

I planned to use the official _reindex command directly, but AWS ES does not support it. So why not wrap this module so that it can be called from other Node scripts?
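
Until the module exposes a programmatic API, the flow described above can be approximated by running the CLI as a child process and swapping the alias afterwards. This is a sketch under assumptions: runCli and aliasSwapActions are hypothetical helpers, and the hosts and index names are placeholders. Logs elsewhere on this page show the success line printed after a worker exited with an error, so the exit status may not be a reliable signal; checking document counts before swapping the alias would be safer.

```javascript
// Sketch: run the CLI from Node, then build the alias swap request.
// runCli and aliasSwapActions are hypothetical helpers.
const { execFile } = require('child_process');

// Run a command and resolve with its stdout, or reject if it fails to run or exits non-zero.
function runCli(command, args) {
  return new Promise((resolve, reject) => {
    execFile(command, args, (error, stdout) => {
      if (error) reject(error);
      else resolve(stdout);
    });
  });
}

// Body for POST /_aliases that moves an alias from one index to another in one request.
function aliasSwapActions(alias, oldIndex, newIndex) {
  return {
    actions: [
      { remove: { index: oldIndex, alias: alias } },
      { add: { index: newIndex, alias: alias } },
    ],
  };
}

// Example flow (not executed here); host and index names are placeholders:
// runCli('elasticsearch-reindex', ['-f', 'http://localhost:9200/old', '-t', 'http://localhost:9200/new'])
//   .then(() => console.log(JSON.stringify(aliasSwapActions('current', 'old', 'new'))));
```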

Reindex with the new mapping

I changed the mapping template, and I want the reindexed indices to use the new mapping. Is that possible with this plugin?

Strange behaviour when data is re-indexed with Shard feature

Hello all.

I'm using this script to re-index from a 1.7 ES cluster to a 6.3 ES cloud cluster. The 1.7 cluster has a very large index (many GB), and I want to re-index it into the new 6.3 cluster split by date (by month in this case). The indexer.js contains something similar to this:

var moment = require('moment');

module.exports = {
  sharded:{
    field: "created_at",
    start: "2018-01-01",
    end:   "2018-02-01",
    interval: 'month'
  },
  index: function(item, options) {
    return [
      {index:{_index: 'new-index_' + moment(item._source.date).format('YYYY-MM'), _type:options.type || item._type, _id: item._id}},
      item._source
    ];
  }
};

If the script is executed with this configuration, I see that two indexes have been created:

  • new-index_2018-07: With all data from 01-01-2018 to 30-01-2018 (both included).
  • new-index_2018-08: With all data from 31-01-2018 to 31-01-2018.

If the "end" date is replaced by "31-01-2018", only data from 01-01-2018 to 30-01-2018 (both included) is migrated; data from 31-01-2018 is not migrated.

Has anyone experienced something similar? Is it a problem with my configuration?

Thanks in advance.

Regards,
Miguel.
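
The split reported above (1 to 30 January in one shard, 31 January in another) is what a fixed 30-day step with an exclusive end date would produce. That is only a guess and has not been checked against the tool's source. For comparison, the sketch below computes half-open calendar-month windows in UTC; monthWindows is a hypothetical helper, not the tool's sharding logic.

```javascript
// Sketch: split [start, end) into half-open calendar-month windows, all in UTC.
// monthWindows is a hypothetical helper.
function monthWindows(startIso, endIso) {
  const end = new Date(endIso + 'T00:00:00Z');
  const windows = [];
  let cursor = new Date(startIso + 'T00:00:00Z');
  while (cursor < end) {
    // First day of the following month; Date.UTC rolls month 12 over to January.
    const next = new Date(Date.UTC(cursor.getUTCFullYear(), cursor.getUTCMonth() + 1, 1));
    const upper = next < end ? next : end;
    windows.push({
      gte: cursor.toISOString().slice(0, 10), // inclusive lower bound
      lt: upper.toISOString().slice(0, 10),   // exclusive upper bound
    });
    cursor = upper;
  }
  return windows;
}

console.log(monthWindows('2018-01-01', '2018-02-01'));
console.log(monthWindows('2018-01-15', '2018-03-10'));
```

Because every upper bound is exclusive and equals the next lower bound, each document falls into exactly one window, including those dated on the last day of a month.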

Not re-indexing properly

Hi,

When I try to re-index some indices I get:
Starting reindex in 1 shards.
Reindexing completed sucessfully.

But nothing really happens... I could reindex some of my indices (but I used the aliases instead of the indices).

I noticed that this happens with big indices... It does work with smaller ones but I am specifically trying to reindex one index that is 4.51GB and it is failing. Any solution?

Thanks,
Uriel

error when trying the custom indexer

Elasticsearch version: 5.3.0

Error: Scroll error: [illegal_argument_exception] No search type for [scan] :: {"path":"/zoe_old/impacts/_search","query":{"search_type":"scan","scroll":"1m","size":100},"body":"{}","statusCode":400,"response":"{\"error\":{\"root_cause\":[{\"type\":\"illegal_argument_exception\",\"reason\":\"No search type for [scan]\"}],\"type\":\"illegal_argument_exception\",\"reason\":\"No search type for [scan]\"},\"status\":400}"}
    at scroll_fetch (/Users/manuzenou/.nvm/versions/node/v6.9.4/lib/node_modules/elasticsearch-reindex/bin/elasticsearch-reindex.js:258:15)
    at respond (/Users/manuzenou/.nvm/versions/node/v6.9.4/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:308:9)
    at checkRespForFailure (/Users/manuzenou/.nvm/versions/node/v6.9.4/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:248:7)
    at HttpConnector.<anonymous> (/Users/manuzenou/.nvm/versions/node/v6.9.4/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/connectors/http.js:164:7)
    at IncomingMessage.wrapper (/Users/manuzenou/.nvm/versions/node/v6.9.4/lib/node_modules/elasticsearch-reindex/node_modules/lodash/index.js:3095:19)
    at emitNone (events.js:91:20)
    at IncomingMessage.emit (events.js:185:7)
    at endReadableNT (_stream_readable.js:974:12)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)
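
The error arises because the scan search type was deprecated in Elasticsearch 2.1 and removed in 5.0; a scroll sorted by _doc is the replacement. The sketch below shows how a request could be built per major version, mirroring the path, query, and body layout of the error dump above. buildScrollSearch is a hypothetical helper and not the tool's code.

```javascript
// Sketch: choose scroll search parameters that match the cluster's major version.
// buildScrollSearch is a hypothetical helper.
function buildScrollSearch(esMajorVersion, index, type) {
  const request = {
    path: '/' + index + '/' + type + '/_search',
    query: { scroll: '1m', size: 100 },
    body: {},
  };
  if (esMajorVersion >= 5) {
    // search_type=scan no longer exists; sorting by _doc gives the same cheap iteration order.
    request.body.sort = ['_doc'];
  } else {
    request.query.search_type = 'scan';
  }
  return request;
}

console.log(JSON.stringify(buildScrollSearch(5, 'zoe_old', 'impacts')));
```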

Reindex not working

Hi,

I tried to create a new index from my existing index. Below is the output:

> elasticsearch-reindex -f http://127.0.0.1:9200/customer/external -t http://127.0.0.1:9200/

Starting reindex in 1 shards.
 reindexing [==============================] 1/1(100%) 0.0 0.0s - 1/1 working
Reindexing completed sucessfully.

But the new mappings are not applied in my new index.

Can someone help me with this?

ES Shield

Team,

I have a question: how do I connect to an Elasticsearch cluster that is protected by Shield?

Thanks
Pranesh
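
Shield authenticates with HTTP basic auth, and another issue on this page passes credentials inside the -f and -t URLs, which suggests (but does not confirm) that the tool forwards them. Under that assumption, the sketch below builds such a URL and percent-encodes reserved characters in the credentials. withBasicAuth is a hypothetical helper and the host and credentials are placeholders.

```javascript
// Sketch: embed basic-auth credentials in a URL.
// withBasicAuth is a hypothetical helper; host and credentials are placeholders.
function withBasicAuth(baseUrl, username, password) {
  const url = new URL(baseUrl);
  url.username = username; // the URL setters percent-encode characters such as "@" and ":"
  url.password = password;
  return url.href;
}

console.log(withBasicAuth('https://es.example.com:9200/index/type', 'es_admin', 'p@ss:word'));
```

The resulting string can then be passed wherever a plain cluster URL is accepted; keep in mind that credentials on a command line are visible in shell history and process listings.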

Reindex fails - socket hang up on initial search

Node v5.7.1
Elasticsearch-reindex v1.1.14

Attempting to reindex from one index into another, using the following:

elasticsearch-reindex -f https://hostname.es.amazonaws.com/originalindex/log-type -t https://hostname.es.amazonaws.com/newindex/ index-script.js

Results in the following error:

Elasticsearch ERROR: 2016-08-11T15:37:35Z
Error: Request error, retrying
POST https://hostname.es.amazonaws.com/originalindex/NCSA-common-log-format/_search?search_type=scan&scroll=1m&size=100 => socket hang up
at Log.error (/temp/.nvm/versions/node/v5.7.1/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/log.js:225:56)
at checkRespForFailure (/temp/.nvm/versions/node/v5.7.1/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/transport.js:240:18)
at HttpConnector.<anonymous> (/temp/.nvm/versions/node/v5.7.1/lib/node_modules/elasticsearch-reindex/node_modules/elasticsearch/src/lib/connectors/http.js:162:7)
at ClientRequest.wrapper (/temp/.nvm/versions/node/v5.7.1/lib/node_modules/elasticsearch-reindex/node_modules/lodash/index.js:3095:19)
at emitOne (events.js:90:13)
at ClientRequest.emit (events.js:182:7)
at TLSSocket.socketCloseListener (_http_client.js:271:9)
at emitOne (events.js:95:20)
at TLSSocket.emit (events.js:182:7)
at TCP._onclose (net.js:475:12)

The result is the same if the last argument (the custom index script) is omitted.

Note that the URL being POSTed to in the log above is accessible via curl and returns

{
  "_scroll_id": "somestuff==",
  "took": 17,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 176434,
    "max_score": 0,
    "hits": []
  }
}
