how to tell if ETL is done about tdm HOT 14 OPEN

cisco-ie commented on September 21, 2024
how to tell if ETL is done

from tdm.

Comments (14)

anubisg1 commented on September 21, 2024

Yes, hopefully everything will work now. Not sure why the DBMS didn't start (or stopped).

m30004jh6220:/home/anubisg1/tdm # docker-compose run --rm etl python main.py --stage search
Warning: Your Pipfile requires python_version 3.6, but you are using 3.7.2 (/root/.local/share/v/d/bin/python).
$ pipenv --rm and rebuilding the virtual environment may resolve the issue.
$ pipenv check will surely fail.
INFO:root:Loading configuration.
INFO:root:Awaiting DBMS availability.
INFO:root:Awaiting DBMS connectivity.
INFO:root:Creating database.
ERROR:root:TDM database already exists! Not overwriting.
INFO:root:Awaiting Search availability.
INFO:root:Populating search database with parsed data.
INFO:root:Acquiring DataPaths from TDM...

INFO:root:Setting up ES...
INFO:root:Populating ES with DataPaths...

remingtonc commented on September 21, 2024

Hmmm that is problematic. It looks like ElasticSearch failed to come up. The good news is that it looks like the data successfully loaded into the database, just not ElasticSearch. Could you try running docker logs tdm_search_1 and adding the logs here so we can see why it failed?

Also, what is the OS it is running on?

remingtonc commented on September 21, 2024

Likely related to #24 and due to vm.max_map_count being too low. #24 (comment)

If using Linux:

sysctl -w vm.max_map_count=262144
docker-compose up -d --no-deps search

ETL should resume loading once ES is up.
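To confirm the setting took effect before restarting search, a small check like the following could help. This is only a sketch: the helper name `max_map_count_ok` is made up for illustration, and it assumes a Linux host where the kernel exposes the value under /proc.

```python
from pathlib import Path

# Threshold taken from the sysctl command above.
REQUIRED_MAX_MAP_COUNT = 262144


def max_map_count_ok(proc_file="/proc/sys/vm/max_map_count",
                     required=REQUIRED_MAX_MAP_COUNT):
    """Return (ok, current) for the kernel setting stored in proc_file.

    Hypothetical helper: it only reads the value, it never changes it.
    """
    current = int(Path(proc_file).read_text().strip())
    return current >= required, current
```

Calling `max_map_count_ok()` on the Docker host returns a pair such as `(False, 65530)` when the limit is still too low. Note that `sysctl -w` does not survive a reboot; adding `vm.max_map_count=262144` to /etc/sysctl.conf is the usual way to make it permanent.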

anubisg1 commented on September 21, 2024

Apparently you are right:

linux-qdx4:/home/anubisg1/tdm # docker logs tdm_search_1
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
[2019-02-14T08:49:11,841][INFO ][o.e.n.Node ] [] initializing ...
[2019-02-14T08:49:12,118][INFO ][o.e.e.NodeEnvironment ] [tfW51mX] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/sda2)]], net usable_space [23.6gb], net total_space [40gb], types [btrfs]
[2019-02-14T08:49:12,119][INFO ][o.e.e.NodeEnvironment ] [tfW51mX] heap size [494.9mb], compressed ordinary object pointers [true]
[2019-02-14T08:49:12,121][INFO ][o.e.n.Node ] [tfW51mX] node name derived from node ID [tfW51mXoRAWO-apLpxsoUQ]; set [node.name] to override
[2019-02-14T08:49:12,122][INFO ][o.e.n.Node ] [tfW51mX] version[6.4.0], pid[1], build[default/tar/595516e/2018-08-17T23:18:47.308994Z], OS[Linux/4.12.14-lp150.12.45-default/amd64], JVM["Oracle Corporation"/OpenJDK 64-Bit Server VM/10.0.2/10.0.2+13]
[2019-02-14T08:49:12,122][INFO ][o.e.n.Node ] [tfW51mX] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.xMtgDaNQ, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -XX:UseAVX=2, -Des.cgroups.hierarchy.override=/, -Xms512m, -Xmx512m, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=tar]
[2019-02-14T08:49:14,983][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [aggs-matrix-stats]
[2019-02-14T08:49:14,985][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [analysis-common]
[2019-02-14T08:49:14,986][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [ingest-common]
[2019-02-14T08:49:14,986][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [lang-expression]
[2019-02-14T08:49:14,986][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [lang-mustache]
[2019-02-14T08:49:14,987][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [lang-painless]
[2019-02-14T08:49:14,987][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [mapper-extras]
[2019-02-14T08:49:14,987][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [parent-join]
[2019-02-14T08:49:14,987][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [percolator]
[2019-02-14T08:49:14,987][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [rank-eval]
[2019-02-14T08:49:14,988][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [reindex]
[2019-02-14T08:49:14,988][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [repository-url]
[2019-02-14T08:49:14,988][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [transport-netty4]
[2019-02-14T08:49:14,988][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [tribe]
[2019-02-14T08:49:14,989][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-core]
[2019-02-14T08:49:14,989][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-deprecation]
[2019-02-14T08:49:14,989][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-graph]
[2019-02-14T08:49:14,989][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-logstash]
[2019-02-14T08:49:14,990][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-ml]
[2019-02-14T08:49:14,990][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-monitoring]
[2019-02-14T08:49:14,990][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-rollup]
[2019-02-14T08:49:14,990][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-security]
[2019-02-14T08:49:14,990][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-sql]
[2019-02-14T08:49:14,991][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-upgrade]
[2019-02-14T08:49:14,991][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded module [x-pack-watcher]
[2019-02-14T08:49:14,992][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded plugin [ingest-geoip]
[2019-02-14T08:49:14,992][INFO ][o.e.p.PluginsService ] [tfW51mX] loaded plugin [ingest-user-agent]
[2019-02-14T08:49:17,580][WARN ][o.e.d.s.ScriptModule ] Script: returning default values for missing document values is deprecated. Set system property '-Des.scripting.exception_for_missing_value=true' to make behaviour compatible with future major versions.
[2019-02-14T08:49:19,927][INFO ][o.e.x.s.a.s.FileRolesStore] [tfW51mX] parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]
[2019-02-14T08:49:21,163][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/120] [Main.cc@109] controller (64 bit): Version 6.4.0 (Build cf8246175efff5) Copyright (c) 2018 Elasticsearch BV
[2019-02-14T08:49:21,892][INFO ][o.e.d.DiscoveryModule ] [tfW51mX] using discovery type [zen]
[2019-02-14T08:49:22,856][INFO ][o.e.n.Node ] [tfW51mX] initialized
[2019-02-14T08:49:22,857][INFO ][o.e.n.Node ] [tfW51mX] starting ...
[2019-02-14T08:49:23,160][INFO ][o.e.t.TransportService ] [tfW51mX] publish_address {172.27.0.5:9300}, bound_addresses {0.0.0.0:9300}
[2019-02-14T08:49:23,177][INFO ][o.e.b.BootstrapChecks ] [tfW51mX] bound or publishing to a non-loopback address, enforcing bootstrap checks
ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2019-02-14T08:49:23,199][INFO ][o.e.n.Node ] [tfW51mX] stopping ...
[2019-02-14T08:49:23,268][INFO ][o.e.n.Node ] [tfW51mX] stopped
[2019-02-14T08:49:23,268][INFO ][o.e.n.Node ] [tfW51mX] closing ...
[2019-02-14T08:49:23,281][INFO ][o.e.n.Node ] [tfW51mX] closed
[2019-02-14T08:49:23,283][INFO ][o.e.x.m.j.p.NativeController] Native controller process has stopped - no new native processes can be started

anubisg1 commented on September 21, 2024

I'm running it on openSUSE Leap 15.0

anubisg1 commented on September 21, 2024

Afterwards I ended up with another error:

INFO:root:Loading IOS_XR 6.1.2 data.
INFO:root:Loading IOS_XR 6.1.3 data.
INFO:root:Loading IOS_XR 6.2.1 data.
INFO:root:Loading IOS_XR 6.2.2 data.
INFO:root:Loading IOS_XR 6.3.1 data.
INFO:root:Loading IOS_XR 6.5.1 data.
INFO:root:Awaiting Search availability.
INFO:root:Populating search database with parsed data.
INFO:root:Acquiring DataPaths from TDM...

===
Unable to establish connection, perhaps arango is not running.

Traceback (most recent call last):
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/connectionpool.py", line 384, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1321, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 296, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 265, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/util/retry.py", line 367, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/connectionpool.py", line 384, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1321, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 296, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 265, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 118, in <module>
main()
File "main.py", line 103, in main
populate_search(db, config['search']['searchURL'])
File "/data/search.py", line 27, in populate_search
query_iterable = query_all_datapaths(db)
File "/data/search.py", line 67, in query_all_datapaths
return db.AQLQuery(query, rawResults=True, batchSize=1000)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/pyArango/database.py", line 204, in AQLQuery
json_encoder = json_encoder, **moreArgs)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/pyArango/query.py", line 146, in __init__
request = self.connection.session.post(database.cursorsURL, data = json.dumps(payload, cls=json_encoder, default=str))
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/pyArango/connection.py", line 43, in __call__
ret = self.fct(*args, **kwargs)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/requests/sessions.py", line 581, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/root/.local/share/virtualenvs/data-I7nS9QO2/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

anubisg1 commented on September 21, 2024

I started the DB manually:

m30004jh6220:/home/anubisg1/tdm # docker-compose up -d --no-deps
tdm_kibana_1 is up-to-date
tdm_nginx_1 is up-to-date
Starting tdm_doc_1 ...
tdm_goaccess_1 is up-to-date
Starting tdm_doc_1
Starting tdm_dbms_1 ...
Starting tdm_etl_1 ...
tdm_web_1 is up-to-date
tdm_search_1 is up-to-date
Starting tdm_etl_1
Starting tdm_dbms_1 ... done
m30004jh6220:/home/anubisg1/tdm # docker logs -f tdm_dbms_1
automatically choosing storage engine
Initializing root user...Hang on...
Initializing database...Hang on...
Database initialized...Starting System...
automatically choosing storage engine

Let's see if things change.

remingtonc commented on September 21, 2024

Dependency issues are fun.. :) If dbms and search are both up, and ETL has stopped, try the following to jump ETL straight to the ElasticSearch loading.

docker-compose run --rm etl python main.py --stage search
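Before rerunning that stage, it may be worth checking that both services actually answer over HTTP. The following is a rough sketch, not part of tdm: `http_up` and `wait_for_services` are hypothetical helpers, and the URLs you pass in are assumptions that depend on how your compose file publishes ports.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_up(url, timeout=3.0):
    """Return True if url gives any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (for example 401 Unauthorized), so it is up.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused, reset, timed out, or closed without a response.
        return False


def wait_for_services(urls, probe=http_up, attempts=30, delay=2.0,
                      sleep=time.sleep):
    """Poll until every URL responds; return the URLs that are still down."""
    pending = list(urls)
    for attempt in range(attempts):
        pending = [u for u in pending if not probe(u)]
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay)
    return pending
```

For example, `wait_for_services(["http://localhost:9200", "http://localhost:8529"])` would poll the default Elasticsearch and ArangoDB ports, assuming they are reachable from where the script runs. An empty list back means both responded and the search stage is worth retrying.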

remingtonc commented on September 21, 2024

@anubisg1 Wonderful. It should be smooth sailing from here. Please follow up here with success or failure.

anubisg1 commented on September 21, 2024

@remingtonc It finally worked. The interesting thing about the DBMS is that it started, and several hours in the container went down with error 137. Once I restarted it and resumed the ETL process, all went well.

A side question, if I may: I am solely interested in NX-OS. Is there any way I can add newer NX-OS releases (9.2.2 is missing, for example) and tell ETL to skip all other OSes in the future?

remingtonc commented on September 21, 2024

@anubisg1 Hm, that's typically due to the OOM killer. There is a lot of room for improvement in the efficiency of the ETL process; it simply hasn't been addressed yet. This OOM is rather problematic, however, and I will prioritize preventing it. I suspect it's because we're flattening in the query. If it's working now and ES has the data, good.
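For context on the 137: exit statuses above 128 conventionally mean "killed by signal (status - 128)", so 137 is signal 9, SIGKILL, which is what the kernel OOM killer sends. A minimal sketch of that arithmetic (the function name is made up, and it assumes a POSIX host where Python knows the signal names):

```python
import signal


def describe_exit_code(code):
    """Explain an exit status using the 128 + signal number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return "exited normally" if code == 0 else f"exited with error {code}"
```

Here `describe_exit_code(137)` gives "killed by SIGKILL". SIGKILL alone does not prove an OOM kill, since `docker kill` sends the same signal; `docker inspect --format '{{.State.OOMKilled}}' tdm_dbms_1` should say whether Docker recorded it as one.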

Part of the room for improvement includes how we declare what we want to load. This might change in the future as we want to use YANG Catalog's data, but in this version there is a "map" mirroring the structure of the Cisco YangModels folder. Removing any of the mapped items results in the associated data not being loaded. For example, you could remove the xr/xe sections entirely, and remove the other versions as well.
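As a rough illustration of that idea, not the actual tdm code: suppose the map were a plain dict from OS folder to release folders. The dict shape, the `keep_only` helper, and the XE/NX release strings below are all assumptions made up for the example; only the XR versions come from the ETL log in this thread.

```python
# Hypothetical stand-in for the ETL map: OS folder -> releases to load.
# The real structure in tdm may differ; XE and NX entries are invented.
OS_RELEASE_MAP = {
    "xr": ["6.1.2", "6.1.3", "6.2.1", "6.2.2", "6.3.1", "6.5.1"],
    "xe": ["16.9.1"],
    "nx": ["9.2.1"],
}


def keep_only(os_map, wanted_os, wanted_releases=None):
    """Return a new map restricted to one OS and, optionally, some releases.

    Anything left out of the returned map would simply not be loaded.
    """
    releases = os_map.get(wanted_os, [])
    if wanted_releases is not None:
        releases = [r for r in releases if r in wanted_releases]
    return {wanted_os: list(releases)}
```

With this, `keep_only(OS_RELEASE_MAP, "nx")` leaves just the NX-OS section, which is the effect of deleting the xr and xe sections by hand.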

anubisg1 commented on September 21, 2024

@remingtonc thank you. I will play with it and see how it works.

You explained how I could skip things I'm not interested in, but what about adding new ones? NX-OS 9.2(2) is missing, for example.

I'm assuming it is enough to add the entry in the map file, or am I wrong?

remingtonc commented on September 21, 2024

@anubisg1 To add an OS/Release, there is a certain amount of static information loaded at the beginning of ETL. This includes the OSes largely to ensure we maintain consistent naming etc. as there seems to be a lot of different ways people like to type the OS names. :)

Add an entry with the name the way it's supposed to be presented here:
https://github.com/cisco-ie/tdm/blob/master/etl/src/static.py#L296
Map the name to the folder name here:
https://github.com/cisco-ie/tdm/blob/master/etl/src/yang/__init__.py#L41
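The two steps can be pictured roughly like this. This is a sketch only: the tables and helpers below are hypothetical and are not copied from static.py or yang/__init__.py, whose real contents may be shaped quite differently. It just shows why one agreed display name plus a name-to-folder mapping keeps naming consistent.

```python
import re

# Hypothetical tables, invented for illustration.
OS_DISPLAY_NAMES = ["IOS XR", "IOS XE", "NX-OS"]
OS_NAME_TO_FOLDER = {"IOS XR": "xr", "IOS XE": "xe", "NX-OS": "nx"}


def _key(name):
    """Lowercase and drop spaces/punctuation so spelling variants collide."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


_CANONICAL = {_key(name): name for name in OS_DISPLAY_NAMES}


def canonical_os_name(raw):
    """Map a loosely typed OS name to its single agreed display name."""
    try:
        return _CANONICAL[_key(raw)]
    except KeyError:
        raise ValueError(f"unknown OS name: {raw!r}") from None


def folder_for(raw):
    """Look up the YANG folder for an OS name, however it was typed."""
    return OS_NAME_TO_FOLDER[canonical_os_name(raw)]
```

In this sketch "Nx-os", "NX-OS" and "nxos" all resolve to the display name "NX-OS" and the folder "nx", while an OS that was never registered raises a `ValueError` instead of silently creating a new spelling.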

anubisg1 commented on September 21, 2024

Thank you!
