Comments (13)

iseessel commented on September 23, 2024

Hi @bridgeqiqi, @antal-horvath, and @doulemint. We have fixed the Docker containers and the tutorials in our latest release: https://github.com/facebookresearch/vissl/releases/tag/v0.1.6

Please check out https://vissl.ai/tutorials/.

from vissl.

prigoyal commented on September 23, 2024

Hi @antal-horvath, thank you for the report. We need to update our Docker images. In the meantime, I'd recommend following INSTALL.md to cross-check for any missing commands.

For classy vision, you can run pip install classy-vision@https://github.com/facebookresearch/ClassyVision/tarball/master and it should work :)

antal-huck commented on September 23, 2024

Thanks for the quick reply. I'll try next Monday and will let you know if it worked.

antal-huck commented on September 23, 2024

Alright, I followed the installation instructions for the vissl pip package up to RUN pip install vissl. This step took forever because the dependency resolver never finished here:
INFO: pip is looking at multiple versions of pillow to determine which version is compatible with other requirements. This could take a while.
Downloading Pillow-7.0.0-cp38-cp38-manylinux1_x86_64.whl (2.1 MB)
Downloading Pillow-6.2.2-cp38-cp38-manylinux1_x86_64.whl (2.1 MB)
Downloading Pillow-6.2.1-cp38-cp38-manylinux1_x86_64.whl (2.1 MB)
Downloading Pillow-6.2.0.tar.gz (37.4 MB)
I then followed step 4 instead.

Now, although I installed apex using this line in the Dockerfile
RUN VERSIONSTR=$(python3 /tmp/get_version_str.py); echo $VERSIONSTR; python3 -m pip install -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/$VERSIONSTR/download.html apex;
I realized after running the swav training command again that there is a warning about apex not being available.

Also, python3 -c 'import apex' yields:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/apex/__init__.py", line 13, in <module>
    from pyramid.session import UnencryptedCookieSessionFactoryConfig
ImportError: cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)

In my case, the Dockerfile command above yields the following during the docker build process:
py38_cu102_pyt181
Looking in links: https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/py38_cu102_pyt181/download.html
Collecting apex
Downloading apex-0.9.10dev.tar.gz (36 kB)
...
...
Successfully built apex velruse anykeystore cryptacular pbkdf2
Installing collected packages: zope.interface, urllib3, plaster, PasteDeploy, idna, chardet, certifi, zope.deprecation, webob, venusian, translationstring, transaction, requests, plaster-pastedeploy, oauthlib, MarkupSafe, hupper, greenlet, defusedxml, wtforms, SQLAlchemy, requests-oauthlib, repoze.sendmail, python3-openid, pyramid, pbkdf2, anykeystore, zope.sqlalchemy, wtforms-recaptcha, velruse, pyramid-mailer, cryptacular, apex
Successfully installed MarkupSafe-1.1.1 PasteDeploy-2.1.1 SQLAlchemy-1.4.12 anykeystore-0.2 apex-0.9.10.dev0 certifi-2020.12.5 chardet-4.0.0 cryptacular-1.5.5 defusedxml-0.7.1 greenlet-1.0.0 hupper-1.10.2 idna-2.10 oauthlib-3.1.0 pbkdf2-1.3 plaster-1.0 plaster-pastedeploy-0.7 pyramid-2.0 pyramid-mailer-0.15.1 python3-openid-3.2.0 repoze.sendmail-4.4.1 requests-2.25.1 requests-oauthlib-1.3.0 transaction-3.0.1 translationstring-1.4 urllib3-1.26.4 velruse-1.1.1 venusian-3.0.0 webob-1.8.7 wtforms-2.3.3 wtforms-recaptcha-0.3.2 zope.deprecation-4.4.0 zope.interface-5.4.0 zope.sqlalchemy-1.4
WARNING: Running pip as root will break packages and permissions. You should install packages reliably by using venv: https://pip.pypa.io/warnings/venv

The last warning might suggest that RUN python3 -m pip install -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/py38_cu102_pyt181/download.html apex does not work as expected.
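
Another reading of the build log, offered as an assumption rather than a confirmed diagnosis: pip appears to have resolved the name apex to an unrelated project (apex-0.9.10dev, installed together with pyramid, velruse, and others) rather than to an NVIDIA Apex wheel from the -f page. A minimal sketch of a check that could be run inside the image follows. The helper names are hypothetical, and the test simply assumes that NVIDIA Apex does not depend on pyramid.

```python
from importlib import metadata


def looks_like_pyramid_apex(requirements):
    """Return True if any requirement name starts with 'pyramid'.

    The build log shows the installed apex pulling in pyramid, which
    NVIDIA Apex is assumed not to depend on.
    """
    for req in requirements or []:
        # Requirement strings look like "pyramid>=1.4" or "pyramid (>=1.4)".
        name = req.split(";")[0].strip().lower()
        if name.startswith("pyramid"):
            return True
    return False


def check_installed_apex():
    """Describe the installed 'apex' distribution, if there is one."""
    try:
        dist = metadata.distribution("apex")
    except metadata.PackageNotFoundError:
        return "apex is not installed"
    if looks_like_pyramid_apex(dist.requires):
        return f"apex {dist.version} looks like the Pyramid toolkit, not NVIDIA Apex"
    return f"apex {dist.version} does not depend on pyramid"


if __name__ == "__main__":
    print(check_installed_apex())
```

Running this as a late build step would make a wrong-package install visible before the import fails at training time.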

antal-huck commented on September 23, 2024

I think we can close this issue as it is indeed solved with

RUN python3 -m pip uninstall -y classy_vision
RUN python3 -m pip install classy-vision@https://github.com/facebookresearch/ClassyVision/tarball/master

But, since this issue was a result of the dockerfile not being up-to-date, and I was trying to fix it, I want to share my experience.

  1. Adding the installation of classy-vision to the current version of the Dockerfile in this repo did solve the problem. But with CUDA 10.2 and torch==1.5, I cannot get the tests in ./dev/run/quick_tests.sh to pass, because that setup does not support the A100-SXM4 GPU architecture.
  2. Therefore, I tried to set up my own Dockerfile. Because of the WARNING at the bottom of my last comment, I also added a venv to the image, following https://pythonspeed.com/articles/activate-virtualenv-dockerfile/. Here's my Dockerfile:

FROM nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04
ENV DEBIAN_FRONTEND=noninteractive

# Install some basic utilities
RUN apt-get update && apt-get install -y wget vim nano curl ca-certificates sudo git python3 python3-dev python3-pip python3-distutils python3-setproctitle python3-opencv python3-venv && rm -rf /var/lib/apt/lists/*

# set up a virtual environment (venv) in the image, see https://pythonspeed.com/articles/activate-virtualenv-dockerfile/
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# could be necessary for vissl? adapted it to venv
RUN ln -sv $VIRTUAL_ENV/bin/python3 /usr/bin/python

# setup python environment
RUN python3 -m pip install --no-cache-dir -U setuptools
RUN python3 -m pip install --upgrade pip

# install pytorch
RUN python3 -m pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu112/torch_nightly.html

# install apex
COPY ./get_version_str.py /tmp/get_version_str.py
RUN VERSIONSTR=$(python3 /tmp/get_version_str.py); echo $VERSIONSTR; python3 -m pip install -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/$VERSIONSTR/download.html apex;

# install vissl
RUN git clone --recursive https://github.com/facebookresearch/vissl.git /tmp/vissl
WORKDIR /tmp/vissl
RUN python3 -m pip install --progress-bar off -r requirements.txt
RUN python3 -m pip install opencv-python
RUN python3 -m pip uninstall -y classy_vision
RUN python3 -m pip install classy-vision@https://github.com/facebookresearch/ClassyVision/tarball/master
RUN python3 -m pip install -e ".[dev]"

#RUN python3 -m pip uninstall pyramid -y

RUN python3 -c 'import vissl, apex'

But docker build --network=host -t vissl:latest . still fails when trying to import apex at the last line, which results in the same error mentioned in the previous comment:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/venv/lib/python3.8/site-packages/apex/__init__.py", line 13, in <module>
    from pyramid.session import UnencryptedCookieSessionFactoryConfig
ImportError: cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)

And the proposed solution in https://stackoverflow.com/questions/66610378/unencryptedcookiesessionfactoryconfig-error-when-importing-apex does not solve the problem because then we would get the error No module named 'pyramid'.

@prigoyal I'm eager to use this repo, but I would need a Docker image with the latest CUDA and PyTorch versions and without the apex problems. When do you plan to update the Dockerfile?

prigoyal commented on September 23, 2024

Hi @antal-horvath, thank you so much. I appreciate you sharing the Dockerfile above. We just need to refresh the Docker files on our end. If you are up for it, please feel free to send a PR. :)

As for Apex, we have built Apex packages only up to PyTorch 1.8.0. Other versions will be supported in a future release. However, you don't need to be blocked by this: we provide a simple bash script that installs Apex from source. https://github.com/facebookresearch/vissl/blob/master/docker/common/install_apex.sh
It sounds like this might be the scalable long-term solution for you (?), since you'd like to keep adapting Apex, CUDA, PyTorch, etc. to the latest versions. :)

I am excited and willing to support you; let me know what you prefer and which solution works. :)

antal-huck commented on September 23, 2024

Actually, I can't get it to work: not with PyTorch 1.8.0 (same pyramid error), and not with the above-mentioned script (with CUDA 11.1, cuDNN 8, and torch 1.8.0). The script results in the following error:

...
  File "/opt/venv/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1561, in _get_cuda_arch_flags
    arch_list[-1] += '+PTX'
IndexError: list index out of range
Running setup.py install for apex: finished with status 'error'

ERROR: Command errored out with exit status 1: /opt/venv/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-sv_7wu_k/apex_5dfed5a412cf4f1c9412e6765e702701/setup.py'"'"'; file='"'"'/tmp/pip-install-sv_7wu_k/apex_5dfed5a412cf4f1c9412e6765e702701/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record /tmp/pip-record-w3h7171m/install-record.txt --single-version-externally-managed --compile --install-headers /opt/venv/include/site/python3.8/apex Check the logs for full command output.
Exception information:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/pip/_internal/req/req_install.py", line 825, in install
    success = install_legacy(
  File "/opt/venv/lib/python3.8/site-packages/pip/_internal/operations/install/legacy.py", line 81, in install
    raise LegacyInstallFailure
pip._internal.operations.install.legacy.LegacyInstallFailure
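
The IndexError in _get_cuda_arch_flags is consistent with compiling CUDA extensions where no GPU is visible (as is usual during docker build) and TORCH_CUDA_ARCH_LIST is unset, which leaves PyTorch with an empty architecture list to append +PTX to. That is an assumption about this particular build, not something confirmed in the thread. A sketch of composing a value for that variable follows. The helper arch_list_value is hypothetical, and 8.0 is the compute capability of the A100 mentioned earlier.

```python
import os


def arch_list_value(capabilities, with_ptx=True):
    """Build a TORCH_CUDA_ARCH_LIST string such as '7.0;8.0+PTX'.

    `capabilities` is an iterable of (major, minor) tuples.
    """
    archs = sorted(set(capabilities))
    if not archs:
        raise ValueError("at least one compute capability is required")
    parts = [f"{major}.{minor}" for major, minor in archs]
    if with_ptx:
        # Mirror PyTorch's default of adding PTX to the newest architecture.
        parts[-1] += "+PTX"
    return ";".join(parts)


if __name__ == "__main__":
    # Assumed target: A100 only. Set this before the apex build step runs.
    os.environ["TORCH_CUDA_ARCH_LIST"] = arch_list_value([(8, 0)])
    print(os.environ["TORCH_CUDA_ARCH_LIST"])
```

In a Dockerfile, the equivalent would be an ENV line that sets TORCH_CUDA_ARCH_LIST to the resulting string ahead of the step that compiles Apex.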

prigoyal commented on September 23, 2024

Thank you, @antal-horvath. We will take this into account and work on providing the official VISSL Docker image.

To unblock you immediately, it sounds like your blocker is primarily the Apex installation? Apex is currently used / required for 3 things:

  1. AMP for mixed precision
  2. SyncBatchNorm layer
  3. LARC

For 1) and 2), PyTorch now provides alternatives.
On 1), for using PyTorch AMP, see https://github.com/facebookresearch/vissl/blob/master/vissl/config/defaults.yaml#L724
On 2), for using PyTorch SyncBN, see https://github.com/facebookresearch/vissl/blob/master/vissl/config/defaults.yaml#L706

For 3), we recently added LARS to VISSL. We can look into adding LARC directly as well. LMK if this is a blocker.
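
The two linked config sections can presumably be switched from the command line with Hydra-style overrides. Below is a sketch that flattens a nested dict into such override strings. The key names (AMP_TYPE, SYNC_BN_TYPE, and their parents) are recalled from defaults.yaml and are assumptions to verify against the linked lines; to_overrides is a hypothetical helper, not part of VISSL.

```python
def to_overrides(tree, prefix="config"):
    """Flatten a nested dict into 'config.A.B=value' override strings."""
    overrides = []
    for key, value in tree.items():
        path = f"{prefix}.{key}"
        if isinstance(value, dict):
            overrides.extend(to_overrides(value, path))
        else:
            overrides.append(f"{path}={value}")
    return overrides


# Assumed key names; check them against defaults.yaml before use.
NO_APEX = {
    "MODEL": {
        "AMP_PARAMS": {"USE_AMP": True, "AMP_TYPE": "pytorch"},
        "SYNC_BN_CONFIG": {
            "CONVERT_BN_TO_SYNC_BN": True,
            "SYNC_BN_TYPE": "pytorch",
        },
    },
}

if __name__ == "__main__":
    print(" ".join(to_overrides(NO_APEX)))
```

The printed strings would be appended to the training command, so that neither AMP nor SyncBN goes through Apex.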

Hope this helps.

doulemint commented on September 23, 2024

New update: the link https://github.com/facebookresearch/ClassyVision/tarball/master doesn't work right now.

prigoyal commented on September 23, 2024

Hi @doulemint, thank you for reaching out. The Classy Vision master branch was renamed to main. Could you try with that? Our installation instructions have been updated to reflect the change: https://github.com/facebookresearch/vissl/blob/main/INSTALL.md#step-4-install-vissl :)
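
Since only the branch name changed, the direct-reference requirement quoted earlier in the thread needs just its last path segment swapped. A small sketch with a hypothetical helper, classy_vision_requirement; the URL pattern is the one quoted above with master replaced by main.

```python
def classy_vision_requirement(branch="main"):
    """Build the pip direct-reference string for a ClassyVision branch tarball."""
    base = "https://github.com/facebookresearch/ClassyVision/tarball"
    return f"classy-vision@{base}/{branch}"


if __name__ == "__main__":
    # Pass the printed string to `python3 -m pip install`.
    print(classy_vision_requirement())
```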

bridgeqiqi commented on September 23, 2024

How do I register the dataset_catalog?
I followed the official tutorial code, save_file(json_data, '<my_vissl_path>/configs/config/dataset_catalog.json'), and then ran print(VisslDatasetCatalog.list()). It shows an empty list [].
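
One thing worth ruling out first is whether the file on disk has the expected content. The sketch below uses only the standard library to write a catalog and read its dataset names back. The dataset name, paths, and JSON shape are assumptions modelled on the tutorial, and the helpers are hypothetical. It does not establish which path VISSL actually reads the catalog from.

```python
import json
import os
import tempfile


def write_catalog(path, catalog):
    """Write a dataset catalog dict to `path` as JSON, creating parent dirs."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as f:
        json.dump(catalog, f, indent=2)


def catalog_names(path):
    """Return the sorted dataset names stored in the JSON file at `path`."""
    with open(path) as f:
        return sorted(json.load(f))


if __name__ == "__main__":
    # Hypothetical dataset name and paths, for illustration only.
    catalog = {
        "dummy_data_folder": {
            "train": ["/data/train", "/data/train"],
            "val": ["/data/val", "/data/val"],
        }
    }
    path = os.path.join(
        tempfile.mkdtemp(), "configs", "config", "dataset_catalog.json"
    )
    write_catalog(path, catalog)
    print(catalog_names(path))
```

If the names read back correctly while VisslDatasetCatalog.list() is still empty, the likelier cause is that VISSL is looking at a different file than the one written.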

iseessel commented on September 23, 2024

Hi @antal-horvath and @doulemint, I've fixed the Docker containers in this PR: #458

LMK if you have any questions.

iseessel commented on September 23, 2024

@bridgeqiqi The tutorials have a few problems right now. I am working on fixing them and will release the fixes ASAP.
