Getting started

This repository contains tutorials, scripts, examples etc. for getting started with your machine learning / NLP project.

The information is mainly tailored to users of DFKI's PEGASUS system.

Software development

IDE

Debugger

One of the key features of any good IDE is its debugging support. A debugger makes it much easier to find and fix bugs in your code, so you no longer need to scatter print statements.

Tutorials for debuggers:

GitHub Copilot

Coding best practices

Get familiar with coding standards and best practices! This improves your code a lot and makes it much easier to maintain.

You can use automated tools to enforce coding styles:

Remote server

Today's machine learning requires large computing resources that your local machine won't have. Thus, you need to connect to a remote server to run your experiments.

SSH

SSH keys

SSH config

Example

An SSH config may contain entries like the one below. Replace <dfki_account> with your DFKI account and <pegasus_account> with your PEGASUS account.

# PEGASUS via SSH-Gate
# pegasus.dfki is a custom hostname (alias)
Host pegasus.dfki
    User <pegasus_account>
    # change this to a different login node if needed
    HostName login2.pegasus.kl.dfki.de
    ProxyJump <dfki_account>@sshgate.sb.dfki.de

With such a config, you can simply connect to PEGASUS by typing ssh pegasus.dfki.
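
To check that the entry is picked up as intended, ssh -G prints the options ssh would apply to a host without opening a connection. The sketch below wraps this in a small helper. The function name show_ssh_settings is made up for illustration, and pegasus.dfki is the alias from the example above.

```shell
# Print the user, host name and jump host that ssh resolves for an alias.
# ssh -G only evaluates the configuration; it does not connect.
show_ssh_settings() {
  ssh -G "$1" | grep -E '^(user|hostname|proxyjump) '
}

# Usage (on your local machine):
#   show_ssh_settings pegasus.dfki
```

If the output does not show the expected login node or jump host, the Host pattern in your config probably does not match the alias you typed.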

SSH proxy

An SSH connection can be used as a proxy to access resources on the intranet:

# replace <proxy_port> with a port number, e.g. 8001
ssh -D <proxy_port> pegasus.dfki

This creates a SOCKS proxy (see https://ma.ttias.be/socks-proxy-linux-ssh-bypass-content-filters/).

Enable this proxy in your system or browser settings, or use a proxy browser plugin such as FoxyProxy.

Use the following settings:

  • Proxy Type: SOCKS5
  • Proxy IP address: localhost
  • Proxy port: <proxy_port>
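
The proxy can also be used from the command line. The following is a minimal sketch, assuming curl is installed locally and the pegasus.dfki alias from the SSH config example exists. The function names and the port 8001 are illustrative choices, not part of the original setup.

```shell
PROXY_PORT=8001   # example port; any free local port works

# Open the SOCKS proxy in the background.
# -N: run no remote command, -f: go to background after authentication.
start_socks_proxy() {
  ssh -D "$PROXY_PORT" -N -f pegasus.dfki
}

# Fetch a URL through the proxy. The socks5h scheme lets the proxy side
# resolve host names, which is needed for intranet-only names.
fetch_via_proxy() {
  curl --proxy "socks5h://localhost:${PROXY_PORT}" "$1"
}

# Usage:
#   start_socks_proxy
#   fetch_via_proxy <intranet_url>
```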

tmux

Your connection to a remote server might be lost, so it is important to keep sessions on the remote server alive independently of your own connection. tmux provides this and many other features that make working on remote servers much easier.

Alternatives to tmux are: screen, ...
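
A typical tmux workflow can be sketched with two helpers, for example in your .bashrc on the remote server. The session name "experiments" and the function names are hypothetical; the tmux subcommands and flags are standard.

```shell
TMUX_SESSION=experiments   # hypothetical session name

# Attach to the session if it already exists, otherwise create it (-A).
tmux_attach_or_create() {
  tmux new-session -A -s "$TMUX_SESSION"
}

# Show all sessions that are still running on this server.
tmux_list() {
  tmux list-sessions
}

# Usage:
#   tmux_attach_or_create   # start working; detach with Ctrl-b followed by d
#   tmux_list               # after reconnecting, check what is still running
```

Detaching leaves everything inside the session running, so a dropped SSH connection does not stop a long experiment.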

Environment

.bashrc example

The .bashrc in your home directory is loaded every time you start a bash session. It is a good place to define global environment variables, e.g., cache paths or login credentials. For example:

# append to "~/.bashrc"

# shortcuts
alias ll="ls -l"

# PIP cache
# http://projects.dfki.uni-kl.de/km-publications/web/ML/core/hpc-doc/posts/pypi-cache/
export PIP_INDEX_URL=http://pypi-cache/index
export PIP_TRUSTED_HOST=pypi-cache
export PIP_NO_CACHE=true

# Huggingface
export HF_LOGIN="<your huggingface login>"
export HF_PASSWORD="<your huggingface API key>"
export HF_DATASETS_CACHE="/netscratch/$USER/datasets/hf_datasets_cache/"
export TRANSFORMERS_CACHE="/netscratch/$USER/datasets/transformers_cache/"

# Weights & Biases
export WANDB_API_KEY="<your WANDB API key>"
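
A typo in one of these exports often only shows up later, for example when a download lands in the wrong cache. The sketch below is a small sanity check that lists variables that are unset or empty. The function name missing_env_vars is hypothetical; printenv only sees exported variables, which matches the export lines above.

```shell
# Print the names of the given environment variables that are unset or empty.
missing_env_vars() {
  local name missing=""
  for name in "$@"; do
    # printenv prints nothing for variables that are not exported
    if [ -z "$(printenv "$name")" ]; then
      missing="$missing $name"
    fi
  done
  # drop the leading space before printing
  echo "${missing# }"
}

# Usage: prints an empty line if everything is set
missing_env_vars HF_DATASETS_CACHE TRANSFORMERS_CACHE WANDB_API_KEY
```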

Python environments (conda / virtualenv / ...)

Slurm

Read the PEGASUS documentation. It should provide all necessary information. For other questions, please use the cluster chat.

Some potentially useful commands:

# starts an interactive job with PyTorch (8-hour time limit)
srun -K \
  --container-image=/netscratch/enroot/nvcr.io_nvidia_pytorch_23.07-py3.sqsh \
  --container-workdir="$(pwd)" \
  --container-mounts=/netscratch/$USER:/netscratch/$USER,/ds:/ds:ro,"$(pwd)":"$(pwd)" \
  --time 08:00:00 --pty bash

# list your current jobs
squeue --me
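
For longer runs, a non-interactive batch job is often more convenient than an interactive one. The sketch below writes a job script that reuses the container flags from the interactive example. The file name train_job.sh, the job name and the train.py entry point are placeholders, and whether these settings suit your project is an assumption; check the PEGASUS documentation for the recommended options.

```shell
JOB_SCRIPT="${JOB_SCRIPT:-./train_job.sh}"   # hypothetical file name

# Write the job script. The quoted 'EOF' keeps $USER and $(pwd) unexpanded
# so that they are evaluated when the job runs, not when the file is written.
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --job-name=train
#SBATCH --time=08:00:00
#SBATCH --output=slurm-%j.out

# train.py is a placeholder for your own entry point
srun -K \
  --container-image=/netscratch/enroot/nvcr.io_nvidia_pytorch_23.07-py3.sqsh \
  --container-workdir="$(pwd)" \
  --container-mounts=/netscratch/$USER:/netscratch/$USER,/ds:/ds:ro,"$(pwd)":"$(pwd)" \
  python train.py
EOF

# Submit the job:  sbatch "$JOB_SCRIPT"
# Cancel a job:    scancel <job_id>
```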

Docker & containers

PEGASUS uses enroot for containers. If you have prebuilt Docker images, you can convert them as follows:

srun -p $ANY_PARTITION \
  enroot import \
  -o /netscratch/$USER/enroot/malteos_eulm_latest.sqsh \
  'docker://ghcr.io#malteos/eulm:latest'
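
Note that enroot separates the registry from the image path with a # instead of a slash. The helpers below are a small sketch of that conversion and of the file naming used in the command above. Both function names are hypothetical, and the sketch assumes the image reference always starts with a registry host followed by a slash.

```shell
# Turn "registry/path/image:tag" into the URI form that enroot import expects.
to_enroot_uri() {
  local ref="$1"
  local registry="${ref%%/*}"   # everything before the first slash
  local image="${ref#*/}"       # everything after the first slash
  printf 'docker://%s#%s\n' "$registry" "$image"
}

# Derive a flat .sqsh file name by replacing "/" and ":" with "_".
to_sqsh_name() {
  local ref="$1"
  local image="${ref#*/}"
  printf '%s.sqsh\n' "$(printf '%s' "$image" | tr '/:' '__')"
}

to_enroot_uri ghcr.io/malteos/eulm:latest
to_sqsh_name ghcr.io/malteos/eulm:latest
```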

Build custom images with Podman:

sbin/podman_build.sh

Jupyter notebooks

You can run Jupyter notebooks on PEGASUS:

# start interactive compute job
# --container-save=$EVAL_DEV_IMAGE 
srun \
  --container-mounts=/netscratch:/netscratch,/home/$USER:/home/$USER \
  --container-image=$IMAGE \
  --container-workdir="$(pwd)" -p RTX6000 --time 08:00:00 --pty /bin/bash

# run in compute job
echo "Jupyter starting at ... http://${HOSTNAME}.kl.dfki.de:8880" && jupyter notebook --ip=0.0.0.0 --port=8880 \
    --allow-root --no-browser --config /home/mostendorff/.jupyter/jupyter_notebook_config.json \
    --notebook-dir /netscratch/mostendorff/experiments

# start with fixed token (for VSCode -> "Specify Jupyter connection")
JUPYTER_TOKEN=yoursecrettoken jupyter notebook
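
If the notebook URL is not reachable from your machine, an SSH port forward is one possible alternative to the SOCKS proxy. The sketch below only builds and prints the command so you can inspect it first. It assumes the pegasus.dfki alias from the SSH config example, the port 8880 and host name pattern from the command above, and that login nodes may reach compute nodes on that port. The function name is hypothetical.

```shell
# Print the ssh command that forwards a local port to Jupyter on a compute node.
# $1: short host name of the compute node, $2: port (defaults to 8880)
jupyter_tunnel_cmd() {
  local node="$1"
  local port="${2:-8880}"
  echo "ssh -N -L ${port}:${node}.kl.dfki.de:${port} pegasus.dfki"
}

# Usage: replace <node> with the value of HOSTNAME inside the compute job
#   jupyter_tunnel_cmd <node>
```

Run the printed command in a local terminal and open http://localhost:8880 in your browser while it is running.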

Other useful links & resources

Contributors

malteos, bhauptvogel