fititnt / uwazi-docker

Dockerized version of Uwazi (“openness” in Swahili). HURIDOCS designed Uwazi to make human rights information more open and accessible to the defenders who need it.

License: The Unlicense

Shell 71.79% Dockerfile 28.21%
uwazi-docker docker uwazi

uwazi-docker's People

Contributors

fititnt, mayeulk, vasyugan

uwazi-docker's Issues

data persistence

For now I am not clear on how it can be ensured that my data actually survives.
A docker-compose stop && docker-compose start seems to wipe everything.

Is there any safe way of bringing the swarm (if that's the correct word) down and up again without losing all the data?

I find docker increasingly fascinating, but I am also close to giving up on it with regard to Uwazi, because it seems way too easy to lose all your data, and at the same time, backup and restore doesn't work as it should.
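
As general Docker Compose background for the question above: `stop` and `start` keep the containers and their volumes, while `down` removes the containers and `down -v` also removes named volumes. A minimal sketch for checking whether any volume disappears across a restart; the helper names `snapshot_volumes` and `lost_volumes` are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Print the names of all Docker volumes, one per line, sorted.
snapshot_volumes() {
  docker volume ls -q | sort
}

# Print the volumes that are listed in the first snapshot but not in the second.
# $1 = file with the "before" snapshot, $2 = file with the "after" snapshot.
# comm needs sorted input, which snapshot_volumes provides.
lost_volumes() {
  comm -23 "$1" "$2"
}
```

One way to use it: run `snapshot_volumes > before.txt`, then `docker-compose stop && docker-compose start`, then `snapshot_volumes > after.txt` and `lost_volumes before.txt after.txt`. An empty result means no volume was removed; data that lives only inside a container's own filesystem is a separate matter and is lost whenever the container itself is removed.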

Additional step `yarn reindex` on Uwazi initialization

If this issue is not caused by #1, inform the maintainers.


An index_not_found_exception error was fixed by running the undocumented command yarn reindex after yarn blank-state.

Assuming Elasticsearch and Mongo are already running from this repository (e.g. docker-compose up elasticsearch mongo), these are the steps:

git clone git@github.com:huridocs/uwazi.git
cd uwazi
## You need NVM installed, see https://github.com/creationix/nvm
nvm install 6.13
npm install
## You need Yarn installed, see https://yarnpkg.com/
yarn production-build
yarn blank-state
yarn run-production

# Open browser on http://127.0.0.1:3000/

## Still does not work. Getting this error:

{ error: 
      [ '[index_not_found_exception] no such index, with { resource.type="index_or_alias" & resource.id="uwazi_development" & index_uuid="_na_" & index="uwazi_development" } :: {"path":"/uwazi_development/_search","query":{},"body":"{\\"_source\\":{\\"include\\":[\\"title\\",\\"icon\\",\\"processed\\",\\"creationDate\\",\\"template\\",\\"metadata\\",\\"type\\",\\"sharedId\\",\\"toc\\",\\"attachments\\",\\"language\\",\\"file\\",\\"uploaded\\",\\"published\\",\\"relationships\\"]},\\"from\\":0,\\"size\\":30,\\"query\\":{\\"bool\\":{\\"must\\":[{\\"bool\\":{\\"should\\":[]}}],\\"must_not\\":[],\\"filter\\":[{\\"term\\":{\\"published\\":true}},{\\"term\\":{\\"language\\":\\"en\\"}}]}},\\"sort\\":[{\\"creationDate.sort\\":{\\"order\\":\\"desc\\",\\"unmapped_type\\":\\"boolean\\"}}],\\"aggregations\\":{\\"all\\":{\\"global\\":{},\\"aggregations\\":{\\"types\\":{\\"terms\\":{\\"field\\":\\"template.raw\\",\\"missing\\":\\"missing\\",\\"size\\":9999},\\"aggregations\\":{\\"filtered\\":{\\"filter\\":{\\"bool\\":{\\"must\\":[{\\"bool\\":{\\"should\\":[]}},{\\"term\\":{\\"language\\":\\"en\\"}}],\\"filter\\":[{\\"match\\":{\\"published\\":true}}]}}}}}}}}}","statusCode":404,"response":"{\\"error\\":{\\"root_cause\\":[{\\"type\\":\\"index_not_found_exception\\",\\"reason\\":\\"no such index\\",\\"resource.type\\":\\"index_or_alias\\",\\"resource.id\\":\\"uwazi_development\\",\\"index_uuid\\":\\"_na_\\",\\"index\\":\\"uwazi_development\\"}],\\"type\\":\\"index_not_found_exception\\",\\"reason\\":\\"no such index\\",\\"resource.type\\":\\"index_or_alias\\",\\"resource.id\\":\\"uwazi_development\\",\\"index_uuid\\":\\"_na_\\",\\"index\\":\\"uwazi_development\\"},\\"status\\":404}"}',
        '    at respond (/alligo/code/fititnt/uwazi-docker/uwazi/node_modules/elasticsearch/src/lib/transport.js:295:15)',
        '    at checkRespForFailure (/alligo/code/fititnt/uwazi-docker/uwazi/node_modules/elasticsearch/src/lib/transport.js:254:7)',
        '    at HttpConnector.<anonymous> (/alligo/code/fititnt/uwazi-docker/uwazi/node_modules/elasticsearch/src/lib/connectors/http.js:159:7)',
        '    at IncomingMessage.bound (/alligo/code/fititnt/uwazi-docker/uwazi/node_modules/elasticsearch/node_modules/lodash/dist/lodash.js:729:21)',
        '    at emitNone (events.js:91:20)',
        '    at IncomingMessage.emit (events.js:185:7)',
        '    at endReadableNT (_stream_readable.js:974:12)',
        '    at _combinedTickCallback (internal/process/next_tick.js:80:11)',
        '    at process._tickCallback (internal/process/next_tick.js:104:9)' ] },
  status: 500 }
    at /alligo/code/fititnt/uwazi-docker/uwazi/app/react/ServerRouter.js:184:15
    at process._tickCallback (internal/process/next_tick.js:109:7)

yarn reindex
yarn run-production
# http://127.0.0.1:3000/ Shows "Uwazi To start you need to create some templates in settings"

Automate `yarn migrate` & `yarn reindex` routines on uwazi-docker

As reference https://github.com/huridocs/uwazi/wiki/Backup-and-restore

Backup and restoring operations are performed manually.

Backup
In order to have a full backup of your data, all you need to do is dump the whole collection in MongoDB, and keep a copy of everything contained in the "uploaded_documents" folder.

Restore
Follow these steps in a fresh Uwazi install:

1. Copy/extract the documents in the "uploaded_documents".
2. Restore the database.
3. Run "yarn migrate" in the uwazi directory. This will update your data if needed.
4. Run "yarn reindex".
5. Run the server and navigate to localhost:3000

To spare the user a docker exec -it..., both yarn migrate and yarn reindex could be run through some way of re-importing old data, similar to how docker-compose run -e IS_FIRST_RUN=true --rm uwazi and the code in docker-entrypoint.sh work.

The export/import of MongoDB and uploaded_documents could still rely on external commands and tools, but these two steps should not.
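
A minimal sketch of what such an entrypoint step could look like. The variable name `IS_RESTORE` and the function `run_restore_steps` are assumptions modeled on the existing `IS_FIRST_RUN` switch; they are not taken from the actual docker-entrypoint.sh.

```shell
#!/bin/sh
# Hypothetical entrypoint helper: run Uwazi's post-restore steps on request.
# IS_RESTORE is an assumed variable name, not an existing option of this repository.
run_restore_steps() {
  if [ "${IS_RESTORE:-false}" = "true" ]; then
    echo "Restore requested: running yarn migrate and yarn reindex"
    # Steps 3 and 4 of the restore procedure quoted above, in that order.
    yarn migrate && yarn reindex
  else
    echo "IS_RESTORE is not true: skipping migrate and reindex"
  fi
}
```

With something like this called from the entrypoint, a restore could be triggered the same way as a first run, for example `docker-compose run -e IS_RESTORE=true --rm uwazi`, after the database dump and uploaded_documents have been put back by other means.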

use bind mounts instead of named volumes

This is maybe a matter of taste, but I find bind mounts for databases, documents and dumps preferable to named docker volumes: they ease backup and restore, and they survive when I wipe /var/lib/docker, so my precious user data is safe.
So in my local install, the uwazi service has the following volume parameter:

volumes:
  - ./documents:/home/node/documents
  - ./dump:/home/node/dump
  - ./log:/home/node/log

and mongo has:

volumes:
 - ./data/mongo:/data/db
 - ./dump:/dump

I don't care about elasticsearch because, in my understanding, it holds no persistent data. I have added a directory for dumps because I do a daily mongodump for backup purposes and want to have access to it.
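
The daily dump mentioned above could be scripted roughly as follows. This is a sketch under assumptions: the service is named `mongo`, `/dump` is the mount point from the volumes stanza above, and the helper names `dump_dir_for` and `daily_mongodump` are made up for illustration.

```shell
#!/bin/sh
# Build the in-container dump path for a given day, e.g. /dump/2019-05-01.
# /dump is the mongo-side mount point of ./dump in the stanza above.
dump_dir_for() {
  printf '/dump/%s\n' "$1"
}

# Run mongodump inside the mongo service and write into today's directory.
# -T disables pseudo-TTY allocation so the command also works from cron.
daily_mongodump() {
  day=$(date +%F)
  docker-compose exec -T mongo mongodump --out "$(dump_dir_for "$day")"
}
```

Because `/dump` is bind-mounted, the result shows up on the host under `./dump/<date>` and can be picked up by whatever backup tool is already in use.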

rewrite build scripts to accommodate current releases

I think the build script needs to be completely rewritten. The release versions of Uwazi don't require you to build Uwazi; they are ready to run. See the description of the install/upgrade procedure here.

Meanwhile I have transitioned to running Uwazi natively rather than dockerized, because this is just way easier. I am not quite sure that there is a substantial benefit to using docker with the release versions. I'll be archiving my own repository because I haven't been maintaining it in a long while.

Explicit use a hardcoded version of uwazi instead master branch

Using this:

## Download Uwazi
RUN git clone -b master --single-branch --depth=1 https://github.com/huridocs/uwazi.git /home/node/uwazi/ \
  && chown node:node -R /home/node/uwazi/ \
  && cd /home/node/uwazi/ \
  && npm install

This may not be the best idea. It is safer to explicitly pin a release version.
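
A sketch of the pinned variant of the clone command above. `git clone -b` accepts a tag name as well as a branch name; the variable `UWAZI_VERSION`, its default `v1.3` (a release tag mentioned elsewhere in these issues) and the function name are assumptions for illustration.

```shell
#!/bin/sh
# Assumed release tag to pin; override by setting UWAZI_VERSION in the environment.
UWAZI_VERSION="${UWAZI_VERSION:-v1.3}"

# Print the clone command for a pinned tag instead of the master branch.
# $1 = tag or branch name to check out.
pinned_clone_command() {
  printf 'git clone -b %s --single-branch --depth=1 https://github.com/huridocs/uwazi.git /home/node/uwazi/\n' "$1"
}

pinned_clone_command "$UWAZI_VERSION"
```

In the Dockerfile itself the same idea would probably be expressed with a build argument (`ARG UWAZI_VERSION`) substituted into the existing `RUN git clone` line, so that a rebuild always produces the same Uwazi version until the pin is changed on purpose.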

make elasticsearch behave

Today I found that the elasticsearch container has brought down our server with 32 GB of RAM two times in a row. I don't know exactly what the ulimit memlock -1 that you added to the elasticsearch service means; the docker documentation does not explain it. Does it mean that you lift all limitations? Because I'd rather see Uwazi fail than the entire server it runs on, and that's what I have seen two times today. For the time being, I have stopped uwazi and will investigate the situation.
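
For context on the question above: in Docker's ulimit settings a memlock value of -1 means "unlimited", so it lifts the cap on how much memory a process may lock into RAM; it does not by itself limit how much memory the container may use. A sketch of one way to put a hard cap on a running container, assuming the default container name `uwazi-docker_elasticsearch_1` and a limit chosen purely for illustration; the function name is hypothetical.

```shell
#!/bin/sh
# Cap the memory of a running container so that the container, not the host,
# hits the limit first.
# $1 = container name, $2 = limit in a form docker understands (e.g. 4g).
# Setting --memory-swap to the same value prevents extra swap usage on top.
cap_container_memory() {
  docker update --memory "$2" --memory-swap "$2" "$1"
}
```

Example call: `cap_container_memory uwazi-docker_elasticsearch_1 4g`. The change does not survive a recreation of the container, so a permanent setup would put the limit in docker-compose.yml instead. The Elasticsearch JVM heap should also stay well below the container limit; the official Elasticsearch images read it from the `ES_JAVA_OPTS` environment variable (for example `-Xms1g -Xmx1g`, values again only an assumption).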

Document how to start uwazi-docker with test data

From #48 (comment) by @mayeulk

Is there a test database already populated with documents, a full data model with many relationships among entities, a rich collection, etc?
This could help testing a lot.
Thank you.

The reports in #48 and #50 would be easier to debug with test data.

In theory, this is already possible, because the README.md section (Advanced) All initialization options with default values gives a hint on how to change a few variables. But as of the last update to this repository, I had not investigated exactly which path is used for the test data.

Add a compatible version of poppler to uwazi-docker for Uwazi v1.3

Uwazi v1.3, released yesterday (https://github.com/huridocs/uwazi/releases/tag/v1.3), has a new dependency: pdftotext version 0.67.0.

The base image node:8-slim will probably install version 0.26.5-2+deb8u4 (refs https://packages.debian.org/pt-br/jessie/poppler-utils) from Sat Sep 27, 2014, whereas poppler-0.68 was released on Sun Aug 19, 2018 (https://poppler.freedesktop.org/releases.html). So the developers used a pretty recent version, maybe with very good reason.

Jessie no longer works

The jessie node image no longer gets updates; obviously, Debian has pulled the plug on jessie. Therefore I am now rebuilding with stretch.
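
A sketch of the base image switch described above, done as a scripted edit of the Dockerfile. The image names `node:8-jessie` and `node:8-stretch` are assumptions (the first appears in a build log elsewhere in these issues), and `switch_base_image` is a hypothetical helper.

```shell
#!/bin/sh
# Rewrite the FROM line of a Dockerfile from one base image to another.
# $1 = path to the Dockerfile, $2 = old image, $3 = new image.
# Only a line that consists exactly of "FROM <old image>" is changed.
switch_base_image() {
  out="$1.new"
  sed "s|^FROM $2\$|FROM $3|" "$1" > "$out" && mv "$out" "$1"
}
```

Example call: `switch_base_image Dockerfile node:8-jessie node:8-stretch`, followed by a rebuild of the uwazi service. Package versions differ between Debian releases, so the apt-get install step may need a second look afterwards.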

Draft of how to run uwazi on production environment

We have a development-instructions.md here, but it is not very explicit about how to run Uwazi in production. Since people will eventually ask how to do it, let's at least create a file with a draft that points to the huridocs/uwazi wiki (https://github.com/huridocs/uwazi/wiki) and makes it clear that uwazi-docker, by default and by design, is not meant to run in production, since it aims to be more user friendly.

Also, not that I (Rocha) have any special relationship with HURIDOCS, but it should point out that it is possible to run Uwazi both for testing and in production with https://www.uwazi.io

redis connection in broken state

After a while (maybe 30 or 60 minutes?), the uwazi-docker_uwazi_1 container stops.
Below are extracts of the logs. Attached are longer logs. Happy to provide more if needed.

2023-07-25T20:38:37.944Z [uwazi_development] uncaughtException: Redis connection in broken state: connection timeout exceeded.
Error: Redis connection in broken state: connection timeout exceeded.
at RedisClient.connection_gone (/home/node/uwazi/prod/node_modules/redis/index.js:588:19)
at RedisClient.on_error (/home/node/uwazi/prod/node_modules/redis/index.js:346:10)
at Socket.<anonymous> (/home/node/uwazi/prod/node_modules/redis/index.js:223:14)
at Socket.emit (node:events:513:28)
at Socket.emit (node:domain:489:12)
at emitErrorNT (node:internal/streams/destroy:151:8)
at emitErrorCloseNT (node:internal/streams/destroy:116:3)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
[Tenant error] Error: Accessing nonexistent async context
2023-07-25T20:38:37.946Z [uwazi_development]
uncaught exception or unhandled rejection, Node process finished !!
Error: Redis connection in broken state: connection timeout exceeded.
at RedisClient.connection_gone (/home/node/uwazi/prod/node_modules/redis/index.js:588:19)
at RedisClient.on_error (/home/node/uwazi/prod/node_modules/redis/index.js:346:10)
at Socket.<anonymous> (/home/node/uwazi/prod/node_modules/redis/index.js:223:14)
at Socket.emit (node:events:513:28)
at Socket.emit (node:domain:489:12)
at emitErrorNT (node:internal/streams/destroy:151:8)
at emitErrorCloseNT (node:internal/streams/destroy:116:3)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
original error: {
"code": "CONNECTION_BROKEN",

Originally posted by @mayeulk in #45 (comment)

Instructions or mechanism for upgrading

I just saw that Uwazi 1.5 is out. Upgrading on a plain installation is easy. But on docker it seems to involve removing the image and building a new one, or is there a better way?
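
A sketch of the rebuild route mentioned above, assuming the service is called `uwazi` and that the image build fetches Uwazi at build time; `upgrade_uwazi` is a hypothetical helper, not something this repository ships.

```shell
#!/bin/sh
# Rebuild the uwazi image from scratch and restart the service on the new image.
# --pull refreshes the base image, --no-cache forces every build step
# (including the step that downloads Uwazi) to run again.
upgrade_uwazi() {
  docker-compose build --pull --no-cache uwazi && docker-compose up -d uwazi
}
```

Data kept in volumes or bind mounts is not touched by the rebuild itself, but it is still worth taking a database dump first. Whether follow-up steps such as yarn migrate and yarn reindex (described for restores in another issue here) are needed after an upgrade has not been verified in this sketch.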

FileNotFoundError: [Errno 2] No such file or directory: '/home/jr/uwazi-docker/data/mongo/WiredTiger.turtle.set'

After cloning the code,

docker-compose run -e IS_FIRST_RUN=true --rm uwazi

ends with the following error:

Status: Downloaded newer image for mongo:3.4
Creating uwazi-docker_mongo_1         ... done
Creating uwazi-docker_elasticsearch_1 ... done
Building uwazi
Traceback (most recent call last):
  File "bin/docker-compose", line 6, in <module>
  File "compose/cli/main.py", line 71, in main
  File "compose/cli/main.py", line 127, in perform_command
  File "compose/cli/main.py", line 845, in run
  File "compose/cli/main.py", line 1297, in run_one_off_container
  File "compose/service.py", line 316, in create_container
  File "compose/service.py", line 352, in ensure_image_exists
  File "compose/service.py", line 1067, in build
  File "site-packages/docker/api/build.py", line 154, in build
  File "site-packages/docker/utils/build.py", line 31, in tar
  File "site-packages/docker/utils/build.py", line 79, in create_archive
  File "tarfile.py", line 1803, in gettarinfo
FileNotFoundError: [Errno 2] No such file or directory: '/home/jr/uwazi-docker/data/mongo/WiredTiger.turtle.set'
[4393] Failed to execute script docker-compose
jr@erwin:~/uwazi-docker$ ls data/mongo/WiredTiger.
WiredTiger.lock    WiredTiger.turtle  WiredTiger.wt 
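
One plausible reading of the traceback above: while building the uwazi image, docker-compose archives the whole build context, which here seems to include data/mongo, and the already running mongo container creates and removes temporary WiredTiger files in that directory while the archive is being written. Under that assumption, excluding the data directory from the build context avoids the race. A minimal sketch; `ignore_in_build_context` is a hypothetical helper.

```shell
#!/bin/sh
# Add a pattern to the .dockerignore file in a directory unless it is already there.
# $1 = directory that holds the Dockerfile, $2 = pattern to exclude.
ignore_in_build_context() {
  dir=$1
  pattern=$2
  file="$dir/.dockerignore"
  touch "$file"
  # -q quiet, -x whole-line match, -F fixed string (no regex interpretation)
  grep -qxF "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
}
```

Example call from the repository root: `ignore_in_build_context . data`. The function is idempotent, so running it twice leaves a single entry.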

Elasticsearch fails for memory related causes

$ node run.js ./database/reindex_elastic.js
Deleting index... uwazi_development
{ FetchError: request to http://elasticsearch:9200/uwazi_development failed, reason: getaddrinfo ENOTFOUND elasticsearch elasticsearch:9200
    at ClientRequest.<anonymous> (/home/node/uwazi/node_modules/node-fetch/index.js:133:11)
    at emitOne (events.js:96:13)
    at ClientRequest.emit (events.js:188:7)
    at Socket.socketErrorListener (_http_client.js:314:9)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at connectErrorNT (net.js:1034:8)
    at _combinedTickCallback (internal/process/next_tick.js:80:11)
    at process._tickCallback (internal/process/next_tick.js:104:9)
  name: 'FetchError',
  message: 'request to http://elasticsearch:9200/uwazi_development failed, reason: getaddrinfo ENOTFOUND elasticsearch elasticsearch:9200',
  type: 'system',
  errno: 'ENOTFOUND',
  code: 'ENOTFOUND' }
Creating index... uwazi_development
{ FetchError: request to http://elasticsearch:9200/uwazi_development failed, reason: getaddrinfo ENOTFOUND elasticsearch elasticsearch:9200
    at ClientRequest.<anonymous> (/home/node/uwazi/node_modules/node-fetch/index.js:133:11)
    at emitOne (events.js:96:13)
    at ClientRequest.emit (events.js:188:7)
    at Socket.socketErrorListener (_http_client.js:314:9)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at connectErrorNT (net.js:1034:8)
    at _combinedTickCallback (internal/process/next_tick.js:80:11)
    at process._tickCallback (internal/process/next_tick.js:104:9)
  name: 'FetchError',
  message: 'request to http://elasticsearch:9200/uwazi_development failed, reason: getaddrinfo ENOTFOUND elasticsearch elasticsearch:9200',
  type: 'system',
  errno: 'ENOTFOUND',
  code: 'ENOTFOUND' }
Indexing documents and entities... - 0 indexed
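
The log above reports getaddrinfo ENOTFOUND, meaning the hostname elasticsearch could not be resolved when the reindex script ran. That would be consistent with the Elasticsearch container not running at that moment (for instance because it had exited), or with the script running outside the compose network. A sketch of a wait loop that could run before reindexing; the function name and the retry values are assumptions.

```shell
#!/bin/sh
# Poll an Elasticsearch URL until it answers or the attempts run out.
# $1 = URL, $2 = maximum number of attempts, $3 = seconds to wait between attempts.
# Returns 0 as soon as one request succeeds, 1 if all attempts fail.
wait_for_elasticsearch() {
  url=$1
  attempts=$2
  delay=$3
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s silences progress output, -f turns HTTP error codes into a failure
    if curl -sf "$url" >/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Example: `wait_for_elasticsearch http://elasticsearch:9200 30 2 && yarn reindex`. If the loop never succeeds, the logs of the elasticsearch container are the next place to look, since a container that exited on startup would produce exactly this symptom.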

Document and/or change default behavior on uwazi-docker bind to :3000 for more than just localhost (e.g be public accessibly)

Refs: #31, #32, #33, and also huridocs/uwazi#2047 and huridocs/uwazi#2051


This issue is for discussing how to document, or even change, the default behavior of uwazi-docker, which exposes Uwazi for external access if the user makes no changes to a default docker installation.

Even if the uwazi application changes some of its behavior, we would still need to make changes here in the docker version. Also, since we are using the official v1.3 release (and no really essential change that would affect uwazi-docker has to be made), we would still stick with the v1.3 version from 14 days ago, or maybe apply a very specific file patch. That would make it easier to use uwazi-docker with a more stable version rather than just the latest version on the uwazi development branch.

Should use VOLUME for Uwazi document directory

The contents of the Uwazi document directory and the mongodb database seem to be the two things which contain all the payload of Uwazi. Thus the documents dir should not be part of the Uwazi container but on a separate volume, like the db.

GUI for MongoDB

Configure docker-compose.yml to optionally install a GUI for those who want to debug but are not yet familiar with MongoDB. Even for those who are, it is not hard to get overwhelmed by the docker networking if I intentionally expose only the Uwazi ports, not the internal ones.

Working MVP

Make it work and document a Minimum Viable Product. Not all applications need to run as containers; they just need to work.

GUI for ElasticSearch

Configure docker-compose.yml to optionally install a GUI for those who want to debug but are not yet familiar with ElasticSearch. Even for those who are, it is not hard to get overwhelmed by the docker networking if I intentionally expose only the Uwazi ports, not the internal ones.

Unmaintained deb package list, not starting

$ docker-compose run -e IS_FIRST_RUN=true --rm uwazi
Building uwazi
Step 1/8 : FROM node:8-jessie
 ---> 0f8964092ab1
Step 2/8 : LABEL maintainer="Emerson Rocha <[email protected]>"
 ---> Using cache
 ---> 47df3a7c03bb
Step 3/8 : RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y   bzip2   dh-autoreconf   git   libpng-dev   poppler-utils
 ---> Running in 9c17417cbab4
Ign http://deb.debian.org jessie InRelease
Ign http://security.debian.org jessie/updates InRelease
Ign http://deb.debian.org jessie-updates InRelease
Ign http://security.debian.org jessie/updates Release.gpg
Ign http://deb.debian.org jessie Release.gpg
Ign http://security.debian.org jessie/updates Release
Ign http://deb.debian.org jessie-updates Release.gpg
Err http://security.debian.org jessie/updates/main amd64 Packages
  
Ign http://deb.debian.org jessie Release
Err http://security.debian.org jessie/updates/main amd64 Packages
  
Ign http://deb.debian.org jessie-updates Release
Err http://security.debian.org jessie/updates/main amd64 Packages
  
Err http://security.debian.org jessie/updates/main amd64 Packages
  
Err http://security.debian.org jessie/updates/main amd64 Packages
  404  Not Found [IP: 151.101.194.132 80]
Err http://deb.debian.org jessie/main amd64 Packages
  404  Not Found
Err http://deb.debian.org jessie-updates/main amd64 Packages
  404  Not Found
W: Failed to fetch http://security.debian.org/debian-security/dists/jessie/updates/main/binary-amd64/Packages  404  Not Found [IP: 151.101.194.132 80]

W: Failed to fetch http://deb.debian.org/debian/dists/jessie/main/binary-amd64/Packages  404  Not Found

W: Failed to fetch http://deb.debian.org/debian/dists/jessie-updates/main/binary-amd64/Packages  404  Not Found

E: Some index files failed to download. They have been ignored, or old ones used instead.
ERROR: Service 'uwazi' failed to build : The command '/bin/sh -c DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y   bzip2   dh-autoreconf   git   libpng-dev   poppler-utils' returned a non-zero code: 100
(base) 

https://unix.stackexchange.com/a/508728/455148 said (in 2019):

"Wheezy and Jessie were recently removed from the mirror network, so if you want to continue fetching Jessie backports, you need to use archive.debian.org instead" "Since you’re building a container image, I highly recommend basing it on Debian 9 (Stretch) instead."

Is this project maintained?
Thank you.
Mayeul

service mongo should not publish port 27017

The docker-compose.yml for service mongo contains the stanza

ports:
  - 27017:27017

There is no reason why mongodb should be directly reachable from outside, all the more so since uwazi uses a passwordless setup. It appears that publishing the port even makes it accessible to the outside world. By default, mongodb only listens on 127.0.0.1, but the publishing of the port by docker seems to circumvent this security measure and makes mongo accessible to the world, which is really not what you want as long as uwazi cannot handle db credentials.
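
A sketch for checking how the port is actually published on a given host. `docker-compose port mongo 27017` prints the host binding for that container port (for example 0.0.0.0:27017); the classifier `binding_exposure` below is a hypothetical helper that only interprets that string.

```shell
#!/bin/sh
# Classify a host binding as printed by `docker-compose port mongo 27017`.
# Prints "none" for an empty string (port not published), "local" for a
# loopback binding, and "public" for anything else, such as 0.0.0.0:27017.
binding_exposure() {
  case "$1" in
    "") echo "none" ;;
    127.*|"[::1]"*) echo "local" ;;
    *) echo "public" ;;
  esac
}
```

Example: `binding_exposure "$(docker-compose port mongo 27017)"`. If the result is "public", removing the ports stanza, or writing the mapping as 127.0.0.1:27017:27017 so that only the loopback interface is bound, keeps mongo reachable for the other services on the compose network without publishing it to the outside.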

Use Debian Bookworm

Originally posted by @fititnt in #45 (comment)
Hi @mayeulk, this issue is now fixed. There may be newer ones; feel free to open other issues.
I didn't attempt to use the very latest Debian 12 (Bookworm), which will have LTS until 2028 (see https://endoflife.date/debian); the image is now on Debian 11 (Bullseye), but if you want we could do a quick check.

A relevant point that differs from the current Uwazi documentation: this uwazi-docker is using MongoDB 6.0, not MongoDB 4.2, which I recently had problems using on Debian. In theory the Uwazi app is working, but I haven't tested every feature, so @acme in particular might need to run tests on a backup (not the main database) and maybe warn me if it was a bad idea.

If you all have some time in the next weeks and are interested, I could try to test again on a very recent Debian, so we could push the versions up and then just keep doing minor incremental upgrades for several years. For now I have only done this with MongoDB.

But, as expected, fititnt/uwazi-docker, even with the current approach of pointing to the latest production tag of uwazi, is guaranteed to break after some years because of its dependencies. This is still likely to be better, since at least users don't stay on a version that is too old.

Installation error?

Hello. I installed Uwazi on the Ubuntu 20.04 platform. I loaded sample data in the installation preference. However, when I logged in as an administrator, I could not make any changes in the settings. An error occurred while recording. I reinstalled the installation. In this installation, I did it without installing the sample data. I could not log in with my user information on the user login screen.
In both installations, when I exit the ssh connection, the services on the server stop. I cannot access Uwazi.
What mistake do you think I'm making?
Thank you in advance for your help.
Kind regards.
