This is how I set up a fresh Linux installation to start working in machine learning and programming.
It is also useful for WSL (Windows Subsystem for Linux) and I will add comments for it as well.
I keep this tutorial handy in case I do a clean OS install or if I need to check some of my initial settings.
- WSL installation guide
- Basic Settings
- Setup Git
- Check your branches in git log history in a pretty line
- Push with tags: multi-line git alias
- GitHub Markdown math expressions for README.md, etc.
- GitLab Markdown math expressions for README.md, etc.
- Install Git Large File System
- Make a new Git (LFS) repository from local
- Manage multiple GitHub or GitLab accounts
- Install Docker Engine for Linux.
- Install Python versions with pyenv and virtual environments with poetry
- Install pyenv first
- Install poetry
- Useful Data Science libraries
- Install and setup Flask for Python Web Development
- Install and setup Ruby, Bundler and Jekyll for websites
- Install LaTeX and latexdiff
- Shell Scripting for convenience
- Install Pandoc to convert/export markdown, HTML, LaTeX, Word
- CUDA and GPU settings
- Accessibility Stuff
https://www.groovypost.com/howto/install-windows-subsystem-for-linux-in-windows-11/
- Open the cmd with administrator privileges
wsl --install
- Restart computer
- Make username under new Linux terminal
If in the above tutorial for separate git accounts, for example, you needed to use paths to locations in the Windows system, you can replace C: with /mnt/c/
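As a rough sketch of that path replacement, the helper below turns a Windows path into its /mnt/ equivalent. The name win_to_wsl is made up for this example and is not part of WSL, and it assumes a plain drive-letter path such as C:\Users\me. WSL also ships a wslpath utility that does this conversion properly, so treat this only as an illustration of the rule.

```shell
# Hypothetical helper: convert a drive-letter Windows path to its WSL mount path.
# Assumes input like C:\Users\me (drive letter, colon, backslash-separated path).
win_to_wsl() {
    # The first character is the drive letter; WSL mounts drives in lowercase.
    drive=$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')
    # Everything after "C:" with backslashes turned into forward slashes.
    rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

win_to_wsl 'C:\Users\me\personal'   # prints /mnt/c/Users/me/personal
```

Single quotes around the Windows path matter here, otherwise the shell eats the backslashes before the function sees them.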
There is a difference between running an interactive shell inside of a started up Linux system, say, if you had Ubuntu installed the regular way, and running the main WSL window.
Running the WSL software inside Windows opens a login shell, which is different from the interactive shell we are used to.
Login shells load .profile, which in turn reads .bashrc if the shell being used is bash. However, bash ignores .profile when a .bash_profile exists, which usually means that .bashrc is never read, and therefore neither is .bash_aliases.
This can be fixed in a few ways:
- Run the command
bash
every time at start up to open an interactive shell inside the login shell. The only difference this makes is that to exit WSL via commands you'd have to run exit twice: once on the interactive shell and then once on the login shell.
- Move all the contents of .bash_profile to .profile, then delete .bash_profile so that nothing stops the startup files from running, even in a login shell.
- Add
source ~/.profile
to the beginning of .bash_profile so that it is run regardless, and therefore also loads .bashrc if necessary. Personally I chose this one.
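For the last option, here is a minimal sketch of adding that line only once, so re-running the setup doesn't stack duplicates. The function name prepend_line_once is my own invention for this example, not a standard command.

```shell
# Hypothetical helper: put a line at the top of a file unless it is already there.
prepend_line_once() {
    line=$1
    file=$2
    touch "$file"                       # create the file if it is missing
    if ! grep -qxF -- "$line" "$file"; then
        tmp=$(mktemp) || return 1
        { printf '%s\n' "$line"; cat "$file"; } > "$tmp"
        cat "$tmp" > "$file"            # overwrite in place to keep permissions
        rm -f "$tmp"
    fi
}
```

It would be called as prepend_line_once 'source ~/.profile' ~/.bash_profile, and calling it a second time changes nothing.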
Setup root password: https://www.cyberciti.biz/faq/how-to-change-root-password-on-macos-unix-using-terminal/
sudo passwd root
- Set up the screen lock command so you can do it every time you stand up: https://itsfoss.com/ubuntu-shortcuts/
- Set up your WiFi connection.
- For ease of use, make hidden files and file extensions visible.
Hidden files visibility:
I can't work without seeing hidden files, so in Ubuntu we can press CTRL+H and the hidden files will appear.
To set it as the default:
https://help.ubuntu.com/stable/ubuntu-help/nautilus-views.html.en
In order for most anything else to install properly, we need these first:
sudo apt update
sudo apt install \
build-essential \
curl \
libbz2-dev \
libffi-dev \
liblzma-dev \
libncursesw5-dev \
libreadline-dev \
libsqlite3-dev \
libssl-dev \
libxml2-dev \
libxmlsec1-dev \
llvm \
make \
tk-dev \
wget \
xz-utils \
zlib1g-dev
- Install SublimeText4 for ease of use (this is my personal favorite, but it's not necessary)
https://www.sublimetext.com/docs/linux_repositories.html
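wget -qO - https://download.sublimetext.com/sublimehq-pub.gpg | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/sublimehq-archive.gpg > /dev/null
echo "deb https://download.sublimetext.com/ apt/stable/" | sudo tee /etc/apt/sources.list.d/sublime-text.list
sudo apt-get update
sudo apt-get install sublime-text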
- Paste the SublimeText4 preferences (my personal preferences)
{
"ignored_packages":
[
"Vintage",
],
"spell_check": true,
"tab_size": 4,
"translate_tabs_to_spaces": true,
"copy_with_empty_selection": false
}
Also, Sublime Text is all about the plugins. Install Package Control by typing CTRL+Shift+P, then typing "Install Package Control"
Then here's some cool packages to try:
- LaTeXTools
- MarkdownTOC
- MarkdownPreview
- MarkdownEditing
- Alignment
- IncrementSelection
- Selection Evaluator
- Paste as One Line
- Invert Current Color Scheme
- PackageResourceViewer
Now, for the Invert Current Color Scheme, I have my own fork that works with Sublime Text 4, so use the PackageResourceViewer to replace the main python file with my code:
https://github.com/elisa-aleman/sublime-invert-current-color-scheme
In MarkdownTOC.sublime-settings, paste the following for hyperlink markdowns and compatibility with MarkdownPreview:
{
"defaults": {
"autoanchor": true,
"autolink": true,
"markdown_preview": "github",
"uri_encoding": false
},
}
After installing Markdown Editing, add this to the SublimeText4 preferences (my personal preferences)
"mde.auto_fold_link.enabled": false,
I found myself needing paired dollar signs ($...$) when writing math.
Searching for how to do it with macros, I found this post about key bindings, which is a much better solution:
https://stackoverflow.com/questions/34115090/sublime-text-2-trying-to-escape-the-dollar-sign
As long as we use the double-escaped dollar sign from that post, the key bindings work freely.
- Preferences > Key Bindings:
- Add this inside the brackets:
// Auto-pair dollar signs
{ "keys": ["$"], "command": "insert_snippet", "args": {"contents": "\\$$0\\$"}, "context":
[
{ "key": "setting.auto_match_enabled", "operator": "equal", "operand": true },
{ "key": "selection_empty", "operator": "equal", "operand": true, "match_all": true },
{ "key": "following_text", "operator": "regex_contains", "operand": "^(?:\t| |\\)|]|\\}|>|$)", "match_all": true },
{ "key": "preceding_text", "operator": "not_regex_contains", "operand": "[\\$a-zA-Z0-9_]$", "match_all": true },
{ "key": "eol_selector", "operator": "not_equal", "operand": "string.quoted.double", "match_all": true }
]
},
{ "keys": ["$"], "command": "insert_snippet", "args": {"contents": "\\$${0:$SELECTION}\\$"}, "context":
[
{ "key": "setting.auto_match_enabled", "operator": "equal", "operand": true },
{ "key": "selection_empty", "operator": "equal", "operand": false, "match_all": true }
]
},
{ "keys": ["$"], "command": "move", "args": {"by": "characters", "forward": true}, "context":
[
{ "key": "setting.auto_match_enabled", "operator": "equal", "operand": true },
{ "key": "selection_empty", "operator": "equal", "operand": true, "match_all": true },
{ "key": "following_text", "operator": "regex_contains", "operand": "^\\$", "match_all": true }
]
},
{ "keys": ["backspace"], "command": "run_macro_file", "args": {"file": "Packages/Default/Delete Left Right.sublime-macro"}, "context":
[
{ "key": "setting.auto_match_enabled", "operator": "equal", "operand": true },
{ "key": "selection_empty", "operator": "equal", "operand": true, "match_all": true },
{ "key": "preceding_text", "operator": "regex_contains", "operand": "\\$$", "match_all": true },
{ "key": "following_text", "operator": "regex_contains", "operand": "^\\$", "match_all": true }
]
},
- To convert a file's indentation width, go to the lower right corner of Sublime Text
- Click on Spaces
- Select the current space number
- Click Convert indentation to Tabs
- Select the desired space number
- Click Convert indentation to Spaces
Most Linux distributions already have git installed, but we can update it and manage it with apt-get.
sudo apt-get update
sudo apt-get install git
Then setup the configuration. Make an account at GitHub to get a username and email associated with Git. Type the settings on the terminal. My settings are like this:
git config --global http.proxy http://{PROXY_HOST}:{PORT}
git config --global user.name {YOUR_USERNAME}
git config --global user.email {YOUR_EMAIL}
git config --global color.ui auto
git config --global merge.conflictstyle diff3
git config --global core.editor nano
git config --global core.autocrlf input
git config --global core.fileMode false
git config --global pull.ff only
This should make a file ~/.gitconfig with the following text
# ~/.gitconfig
[http]
proxy = http://{PROXY_HOST}:{PORT}
[user]
name = YOUR_USERNAME
email = YOUR_EMAIL
[color]
ui = auto
[merge]
conflictstyle = diff3
[core]
editor = nano
autocrlf = input
fileMode = false
[pull]
ff = only
[alias]
adog = log --all --decorate --oneline --graph
That last one, git adog, is very useful, as I explain in "Check your branches in git log history in a pretty line".
This makes your history tree pretty and easy to understand inside of the terminal. I found this in https://stackoverflow.com/a/35075021
git log --all --decorate --oneline --graph
Not everyone would be doing a git log all the time, but when you need it just remember: "A Dog" = git log --all --decorate --oneline --graph
Actually, let's set an alias:
git config --global alias.adog "log --all --decorate --oneline --graph"
This adds the following to the .gitconfig file:
[alias]
adog = log --all --decorate --oneline --graph
And you run it like:
git adog
To chain several commands in one alias, for example to push and then push the tags in a single command, use '!git ... && git ...' as the format.
Push with tags:
git config --global alias.pusht '!git push && git push --tags'
GitHub Markdown renders math written in LaTeX syntax inside dollar sign delimiters (GitHub uses MathJax for this). The delimiters are different from the ones GitLab uses.
Inline:
> $a^2 + b^2 = c^2$
Block:
> $$a^2 + b^2 = c^2$$
But it only supports one line of math, so for multiple lines you have to do this:
> $$a^2 + b^2 = c^2$$
> <!-- (line break is important) -->
> $$c = \sqrt{ a^2 + b^2 }$$
It can even display matrices and the like:
> $$
> l_1 =
> \begin{bmatrix}
> \begin{bmatrix}
> x_1 & y_1
> \end{bmatrix} \\
> \begin{bmatrix}
> x_2 & y_2
> \end{bmatrix} \\
> ... \\
> \begin{bmatrix}
> x_n & y_n
> \end{bmatrix} \\
> \end{bmatrix}
> $$
However, % comments will break the environment.
Math syntax in LaTeX:
https://katex.org/docs/supported.html
Following this guide, math in GitLab Markdown is written differently than in, say, GitHub or LaTeX. However, inside the delimiters it is rendered with KaTeX, which uses LaTeX math syntax!
https://docs.gitlab.com/ee/user/markdown.html#math
Inline:
> $`a^2 + b^2 = c^2`$
Block:
> ```math
> a^2 + b^2 = c^2
> ```
But it only supports one line of math, so for multiple lines you have to do this:
> ```math
> a^2 + b^2 = c^2
> ```
> ```math
> c = \sqrt{ a^2 + b^2 }
> ```
It can even display matrices and the like:
> ```math
> l_1 =
> \begin{bmatrix}
> \begin{bmatrix}
> x_1 & y_1
> \end{bmatrix} \\
> \begin{bmatrix}
> x_2 & y_2
> \end{bmatrix} \\
> ... \\
> \begin{bmatrix}
> x_n & y_n
> \end{bmatrix} \\
> \end{bmatrix}
> ```
However, % comments will break the environment.
Math syntax in LaTeX:
https://katex.org/docs/supported.html
Git LFS lets Git handle files larger than 50 MB. Still, Git LFS has some limitations if you don't buy data packs to increase your usage limit. By default you get 1 GB of storage and 1 GB of bandwidth (how much you push or pull per month). For US$5, you can add a data pack that adds 50 GB of bandwidth and 50 GB of Git LFS storage.
Now we need to install the git-lfs package to use it:
sudo apt install git-lfs
Now that we have Git and Python installed, we can make our first project. I like to leave this part of the tutorial in even if it doesn't classify as a setup because using Git and GitLFS was confusing at first.
First make a repository on GitHub with no .gitignore, no README and no license. Then, on local terminal, cd to the directory of your project and initialize git
cd path/to/your/project
git init
If using Git LFS:
git lfs install
It's supposed to be ready, but first, let's make a few hooks executable
chmod +x .git/hooks/*
Make a .gitignore depending on which files you don't want in the repository and add it
git add .gitignore
If using Git LFS, add the tracking settings for this project (For example, heavy csv files in this case)
git lfs track "*.csv"
And then add them to git
git add .gitattributes
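git lfs track writes each pattern into .gitattributes as a line like *.csv filter=lfs diff=lfs merge=lfs -text. As a quick sanity check before committing, a small sketch like the one below lists which patterns are handed to LFS. The function name lfs_tracked_patterns is hypothetical and only reads the file; running git lfs track with no arguments gives a similar listing.

```shell
# Hypothetical helper: list the patterns in a .gitattributes file that use the LFS filter.
# Defaults to ./.gitattributes when no file is given.
lfs_tracked_patterns() {
    # Skip comment lines, then print the first field (the pattern) of LFS lines.
    awk '!/^#/ && /filter=lfs/ { print $1 }' "${1:-.gitattributes}"
}
```

Run inside the project folder as lfs_tracked_patterns, it should print *.csv after the tracking command above.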
Commit these changes first
git commit -m "First commit, add .gitignore and .gitattributes"
Now add all the data from your local repository. git add . adds all the files in the folder.
git add .
Depending on the size of your project, it might be wiser to add it in parts instead of all at once. e.g.
git add *.py
git add *.csv
...
or
git add dir1
git add dir2
...
Check if all the paths are added
git status
Check if all the Git LFS files are tracked correctly
git lfs ls-files
If so, commit.
git commit -m "First data commit"
Set the new remote URL from the repository you created on GitHub. It'll appear with a copy button and everything, and end in .git
git remote add origin remote_repository_URL_here
Verify the new remote URL
git remote -v
Set upstream and then push only the lfs files to remote
git lfs push origin master
Afterwards push normally to upload everything
git push --set-upstream origin master
You only need to write --set-upstream origin master the first time for a normal push; after this just write git push. For git lfs push you always have to write the remote and branch.
Because I want to update my personal code when I find better ways to program at work, I want to push and pull from my personal GitHub account aside from the work GitLab projects. CAUTION: DON'T UPLOAD COMPANY SECRETS TO YOUR PERSONAL ACCOUNT
To be able to do this, I followed these guides:
https://blog.gitguardian.com/8-easy-steps-to-set-up-multiple-git-accounts/
- Generate an SSH key
First, create an SSH key for your personal account:
ssh-keygen -t rsa -b 4096 -C "<personal_email>" -f ~/.ssh/<personal_key>
Then for your work account:
ssh-keygen -t rsa -b 4096 -C "<work_email>" -f ~/.ssh/<work_key>
- Add a passphrase
When prompted, type a passphrase and press Enter. It will ask for it twice, so type it again and press Enter.
To update the passphrase for your SSH keys:
ssh-keygen -p -f ~/.ssh/<personal_key>
You can check your newly created key with:
ls -la ~/.ssh
which should output <personal_key> and <personal_key>.pub.
Do the same steps for the <work_key>.
- Tell ssh-agent
The guide uses a -K flag that works for macOS and such, but we don't need it on Linux.
eval "$(ssh-agent -s)" && \
ssh-add ~/.ssh/<personal_key>
ssh-add ~/.ssh/<work_key>
- Edit your SSH config
nano ~/.ssh/config
-----------nano----------
# Work account - default
Host <some_host_name_work>
HostName <HOST>
Port <PORT>
User git
IdentityFile ~/.ssh/<work_key>
# Personal account
Host <personal_host_name>
HostName github.com
User git
IdentityFile ~/.ssh/<personal_key>
CTRL+O
CTRL+X
-------------------------
- Copy the SSH public key
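cat ~/.ssh/<personal_key>.pub
The guide pipes this into pbcopy, which only exists on macOS. On Linux, copy the printed key from the terminal; on WSL you can pipe it into clip.exe instead.
Then paste it into the settings of the respective website, such as the GitHub SSH settings page. Give it a title that tells you it's the key for this computer.
Same for your <work_key>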
- Structure your workspace for different profiles
Now, for each key pair (aka profile or account), we will create a .conf file to make sure that your individual repositories have the user settings overridden accordingly. Let's suppose your home directory looks like this:
/myhome/
|__.gitconfig
|__work/
|__personal/
We are going to create two overriding .gitconfigs for each dir like this:
/myhome/
|__.gitconfig
|__work/
|_.gitconfig.work
|__personal/
|_.gitconfig.pers
Of course the folder and filenames can be whatever you prefer.
- Set up your Git configs
In the personal git projects folder, make .gitconfig.pers
nano ~/personal/.gitconfig.pers
---------------nano-----------------
# ~/personal/.gitconfig.pers
[user]
email = <personal_email>
name = Your Name
[github] #or gitlab or whatever
user = "personal-username"
[core]
sshCommand = "ssh -i ~/.ssh/<personal_key>"
------------------------------------
Then do the same in the work folder with nano ~/work/.gitconfig.work
---------------nano-----------------
# ~/work/.gitconfig.work
[user]
email = <work_email>
name = Your Name
[github] #or gitlab or whatever
user = "work_username"
[core]
sshCommand = "ssh -i ~/.ssh/<work_key>"
------------------------------------
And finally add this to the end of your original main .gitconfig file (the quotes must be straight double quotes, or git will not read the condition):
[includeIf "gitdir:~/personal/"] # include for all .git projects under personal/
path = ~/personal/.gitconfig.pers
[includeIf "gitdir:~/work/"]
path = ~/work/.gitconfig.work
Now finally to confirm if it worked, go to any work project you have and type the following:
cd ~/work/work-project
git config user.email
It should be your work e-mail.
Now go to a personal project:
cd ~/personal/personal-project
git config user.email
And it should output your personal e-mail.
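If you'd rather try the includeIf mechanism before touching your real configuration, here is a throwaway sketch that rebuilds the folder layout inside a temporary directory and prints which e-mail git picks in each folder. Everything in it is an assumption for illustration: the function name demo_includeif, the example.com addresses and the demo repository names are made up, and it needs a git recent enough to support includeIf.

```shell
# Sandbox demo: builds a fake home directory, so your real ~/.gitconfig is untouched.
demo_includeif() {
    sandbox=$(mktemp -d) || return 1
    mkdir -p "$sandbox/work/demo" "$sandbox/personal/demo"

    # One override file per profile (made-up addresses).
    printf '[user]\n\temail = work@example.com\n' > "$sandbox/work/.gitconfig.work"
    printf '[user]\n\temail = personal@example.com\n' > "$sandbox/personal/.gitconfig.pers"

    # Main config with straight double quotes around each condition.
    cat > "$sandbox/.gitconfig" <<'EOF'
[includeIf "gitdir:~/personal/"]
    path = ~/personal/.gitconfig.pers
[includeIf "gitdir:~/work/"]
    path = ~/work/.gitconfig.work
EOF

    for profile in work personal; do
        (
            # Point git at the sandbox only inside this subshell.
            HOME=$sandbox
            GIT_CONFIG_GLOBAL=$sandbox/.gitconfig
            export HOME GIT_CONFIG_GLOBAL
            cd "$sandbox/$profile/demo" || exit 1
            git init -q .
            printf '%s: ' "$profile"
            git config user.email
        )
    done
    rm -rf "$sandbox"
}

demo_includeif
```

On a real setup, git config --show-origin user.email additionally tells you which file the value came from, which helps when the wrong profile shows up.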
- To clone new projects, especially private or protected ones, use the username before the website:
git clone https://<username>@github.com/<organization>/<repo>.git
If you have two-factor authentication enabled, the clone might fail on the first try, because you need to generate a Personal Access Token.
Then copy and paste that token as the password when the terminal asks you for user and password.
And done! When you push or pull from the personal account you might encounter some 2 factor authorizations at login, but otherwise it's ready to work on both personal and work projects.
For WSL, do the whole thing on Windows first, then follow these steps:
https://devblogs.microsoft.com/commandline/sharing-ssh-keys-between-windows-and-wsl-2/
- Copy keys to WSL
cp -r /mnt/c/Users/<username>/.ssh ~/.ssh
- Update permissions on the keys
chmod 600 ~/.ssh/id_rsa
Repeat for the other keys as well
- Install keychain
sudo apt install keychain
- Add keychain eval to .bash_profile for every key you have:
echo 'eval "$(keychain --eval --agents ssh id_rsa)"' >> ~/.bash_profile
echo 'eval "$(keychain --eval --agents ssh id_rsa_<personal_key>)"' >> ~/.bash_profile
echo 'eval "$(keychain --eval --agents ssh id_rsa_<work_key>)"' >> ~/.bash_profile
This will make it so that every time you start up the computer you have to type in the passphrases for each of the keys, but they'll remain accessible after that.
- Setup Git config to match
Copy .gitconfig from the Windows home to the WSL home folder.
Mirror the folder structure and sub-configuration files (e.g. .gitconfig.pers, .gitconfig.work).
Now any new folders created under WSL in these folders will have the same permissions.
However, if you want to access a git repository under the Windows environment through WSL, entering the paths to match will not be enough.
For example, even if you add:
[includeIf "gitdir:/mnt/c/Users/<username>/personal/"] # include for all .git projects under personal/
path = /mnt/c/Users/<username>/personal/.gitconfig.pers
Git will return an error like this:
fatal: detected dubious ownership in repository at '/mnt/c/Users/......'
To add an exception for this directory, call:
git config --global --add safe.directory /mnt/c/Users/......
This happens because the path to the directory is different than expected, even if it points at the same directory.
This link explains that the newer versions of git are stricter with directory ownership.
This can be bypassed by setting this: (However, only use this if you do not consider yourself at risk)
git config --global --add safe.directory '*'
Now it is accessible from both ends!
If you mirrored the folders as well as added the windows folders, your configuration file should look like this:
[includeIf "gitdir:~/personal/"] # include for all .git projects under personal/
path = ~/personal/.gitconfig.pers
[includeIf "gitdir:/mnt/c/Users/<username>/personal/"] # include for all .git projects under personal/
path = /mnt/c/Users/<username>/personal/.gitconfig.pers
Docker allows us to run server apps that share an internal environment separate from the OS.
Follow the following guide for docker. https://docs.docker.com/engine/install/
For Ubuntu, specifically, there's this guide: https://docs.docker.com/engine/install/ubuntu/
Reboot after installing.
Depending on your installation, you might already have a Python, but it is better to avoid using it since the system depends on it, so we install a separate version with pyenv. Pyenv also keeps pip and python matched to each other in the correct version.
This is especially useful if you need different versions for different projects (maybe caused by tensorflow updates vs. other libraries' updates...). Follow this installation guide:
https://github.com/pyenv/pyenv#installation
For Linux, we have to use the GitHub distribution: clone it and build it with make. Then we add the paths to .bash_profile for bash or to .zprofile for zsh.
cd ~
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
cd ~/.pyenv && src/configure && make -C src
cd ~
Then we add it to our PATH so that the pyenv command is found, and so that every time we open python it's the pyenv one and not the system one:
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
source ~/.bash_profile
That eval line I found out thanks to this StackOverflow post
Now let's install and set a Python version (3.10.7 was the latest when I wrote this):
pyenv install 3.10.7
pyenv global 3.10.7
We can confirm we are using the correct one:
pyenv versions
which python
python -V
which pip
pip -V
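When pyenv is active, which python should point into the shims folder under the pyenv root rather than /usr/bin. A small sketch of that check, assuming the default ~/.pyenv location unless PYENV_ROOT is set (the function name uses_pyenv_python is made up for this example):

```shell
# Hypothetical helper: succeed if the given path lives under pyenv's shims folder.
uses_pyenv_python() {
    pyenv_root=${PYENV_ROOT:-$HOME/.pyenv}
    case "$1" in
        "$pyenv_root"/shims/*) return 0 ;;   # e.g. ~/.pyenv/shims/python
        *) return 1 ;;                       # e.g. /usr/bin/python
    esac
}
```

A typical call would be uses_pyenv_python "$(command -v python)" && echo "pyenv python is active". If nothing is printed, the PATH lines in .bash_profile probably haven't been loaded in this shell yet.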
Poetry is a tool to manage Python project dependencies and environments with a syntax that can be version controlled (e.g. with git) and shared with a group. It lets you use a virtual environment to install all dependencies locally, and remove or update them as needed, while previous states of the environment stay accessible via the commit history.
pip install poetry
Usage guide: https://python-poetry.org/
Making a new project can be as easy as:
poetry new project-name-here
cd project-name-here
Then, instead of using pip install or pip uninstall we use poetry add and poetry remove
poetry add tqdm
This updates the dependency control files pyproject.toml and poetry.lock, which can be committed to version control.
And finally, when cloning a repository, you can use poetry install to easily install all the dependencies controlled by poetry in one command.
This is my generic fresh start install so I can work. Usually I'd install all of them in general, but recently I only install the necessary libraries under venv. There's more libraries with complicated installations in other repositories of mine, and you might not wanna run this particular piece of code without checking what I'm doing first. For example, you might have a specific version of Tensorflow that you want, or some of these you won't use. But I'll leave it here as reference.
pip install numpy scipy jupyter statsmodels \
pandas tqdm retry openpyxl
pip install matplotlib adjustText plotly kaleido
pip install scikit-learn sympy pyclustering
pip install beautifulsoup4 requests selenium
pip install gensim nltk langdetect
For Japanese NLP tools see: https://github.com/elisa-aleman/MeCab-python
For Chinese NLP tools see: https://github.com/elisa-aleman/StanfordCoreNLP_Chinese
pip install tensorflow tflearn keras \
torch torchaudio torchvision \
optuna
To install XGBoost with CPU:
pip install xgboost
To install XGBoost with CUDA GPU integration:
git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
mkdir build
cd build
cmake .. -DUSE_CUDA=ON
make -j8
cd ../python-package
python setup.py install
To install LightGBM with CPU:
pip install lightgbm
Install the dependencies for the LightGBM GPU build:
apt-get install libboost-all-dev
apt install ocl-icd-libopencl1
apt install opencl-headers
apt install clinfo
apt install ocl-icd-opencl-dev
Install LightGBM with CUDA GPU integration:
pip install lightgbm --install-option=--gpu --install-option="--opencl-include-dir=/usr/local/cuda/include/" --install-option="--opencl-library=/usr/local/cuda/lib64/libOpenCL.so"
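For Minepy / Maximal Information Coefficient, we need a C++ compiler as a dependency. On Windows that means installing the Visual Studio C++ Build Tools first; on Linux, the build-essential package installed earlier should cover it: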
https://visualstudio.microsoft.com/visual-cpp-build-tools/
pip install minepy
Note to self: re-write with poetry project use instead of venv
Install OpenCV with CPU and no extra options:
python -m pip install -U opencv-python opencv-contrib-python
Install the dependencies to build OpenCV from source:
apt-get update
apt-get upgrade
apt-get install build-essential
apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev
apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
apt-get install libxvidcore-dev libx264-dev
apt-get install libgtk-3-dev
apt-get install libatlas-base-dev gfortran pylint
apt-get install python2.7-dev python3.5-dev python3.6-dev
apt-get install unzip
Now the ffmpeg dependency:
add-apt-repository ppa:jonathonf/ffmpeg-3
apt update
apt install ffmpeg libav-tools x264 x265
Check the version:
ffmpeg
Download and build opencv
wget https://github.com/opencv/opencv/archive/3.4.0.zip -O opencv-3.4.0.zip
wget https://github.com/opencv/opencv_contrib/archive/3.4.0.zip -O opencv_contrib-3.4.0.zip
unzip opencv-3.4.0.zip
unzip opencv_contrib-3.4.0.zip
cd opencv-3.4.0
mkdir build_3.5
mkdir build
cd build
Make, but remember to replace Python versions:
which python
cmake -DCMAKE_BUILD_TYPE=Release \
-D WITH_FFMPEG=ON \
-D PYTHON3_EXECUTABLE=<path to your python> \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-3.4.0/modules \
-D OPENCV_ENABLE_NONFREE=True ..
make -j8 #(where -j8 is for 8 cores in the server cpu)
make install
ldconfig
Note to self: re-write with poetry project use instead of venv
I'm using this guide: https://flask.palletsprojects.com/en/2.1.x/installation/#install-flask
For Web Development, it's apparently better to make a Virtual Environment to install flask project-wise instead of system or user level.
First we go to our project and make the virtual environment.
cd my-project
python -m venv venv
. venv/bin/activate
On Windows (for example in Git Bash) the script is under Scripts instead:
. venv/Scripts/activate
so if one fails try the other.
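To avoid the trial and error, a small sketch like this picks whichever activate script exists. The function name activate_venv is made up for this example, and it assumes the environment folder is called venv unless you pass another path.

```shell
# Hypothetical helper: source the activate script from bin/ or Scripts/, whichever exists.
activate_venv() {
    venv_dir=${1:-venv}
    if [ -f "$venv_dir/bin/activate" ]; then
        . "$venv_dir/bin/activate"        # Linux, WSL, macOS layout
    elif [ -f "$venv_dir/Scripts/activate" ]; then
        . "$venv_dir/Scripts/activate"    # Windows layout (e.g. Git Bash)
    else
        echo "no activate script found in $venv_dir" >&2
        return 1
    fi
}
```

It has to be defined in your shell (for example in .bashrc) rather than run as a separate script, because activation only affects the shell that sources it. Then activate_venv inside my-project does the rest.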
After activating, python and pip are the ones internal to the project in the current bash session:
which python
>>> .... my-project\venv/Scripts/python
pip --version
pip 22.1 from c:\users\...\my-project\venv\lib\site-packages\pip (python 3.9)
So since this pip is empty, let's install ONLY what we need for the project:
pip install --upgrade pip
pip install selenium beautifulsoup4 pandas pathlib retry
pip install flask
Now let's test it:
I'm using this guide here:
https://flask.palletsprojects.com/en/2.1.x/quickstart/
Under: my-project/python/hello.py
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"
Then in bash:
cd my-project/python
export FLASK_APP=hello
flask run
and it should return:
* Serving Flask app 'hello' (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
* Running on http://127.0.0.1:5000 (Press CTRL+C to quit)
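Unlike Jekyll, it won't update as you edit, unless you set up a development environment variable (this is for Flask 2.1, as in the guide; FLASK_ENV was removed in Flask 2.3, where flask --app hello run --debug does the same):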
export FLASK_APP=hello
export FLASK_ENV=development
flask run
It will update, but you still have to click refresh on the screen.
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello_world():
    return "<h1>Hello, World!</h1>"
Now, to make the project a package and keep running under the same structure as before, I use this structure (which, by the way, you can output with tree -a -N -n --charset ascii on the terminal after installing it with sudo apt install tree):
my-project
|-- .gitignore
|-- README.md
|-- logs
| `-- tasklog.md
|-- python
| |-- ProjectPaths.py
| |-- __pycache__
| |-- package
| | |-- __init__.py
| | |-- __pycache__
| | `-- module.py
| `-- tests
| |-- 0_test.py
| `-- __pycache__
|-- requirements.txt
`-- venv
ProjectPaths.py is just a bunch of methods I like to call to make directories without much hassle.
To run the tests and the files in the package, now the command is like this:
cd my-project/python
python -m tests.0_test
Notice that it's a . instead of a / and also there's no .py
It should now run and import as if from the parent directory python
I also make websites in my free time, and lots of researchers have their projects on a GitHub Pages website. For this, I like to use Jekyll in combination with GitHub Pages. First let's install all our dependencies.
sudo apt-get install ruby-full build-essential zlib1g-dev
Then, we have to add to the $PATH so that ruby gems are found:
echo '# Install Ruby Gems to ~/gems' >> ~/.bash_profile
echo 'export GEM_HOME="$HOME/gems"' >> ~/.bash_profile
echo 'export PATH="$HOME/gems/bin:$PATH"' >> ~/.bash_profile
source ~/.bash_profile
Then install
gem install bundler jekyll jekyll-sitemap
Now it's installed! Well, to make a Jekyll Github Page I followed this tutorial, so go ahead and do it:
https://docs.github.com/en/pages/setting-up-a-github-pages-site-with-jekyll
Now, once you have your website repository and you're ready to test jekyll serve, do the following:
cd (your_repository_here)
bundle init
bundle add jekyll
bundle add jekyll-sitemap
bundle add webrick
And then all that's left to do is to serve the website with jekyll! Also for the sitemaps make sure to check this tutorial:
https://github.com/jekyll/jekyll-sitemap
And add this to your _config.yml
url: "https://example.com" # the base hostname & protocol for your site
plugins:
- jekyll-sitemap
bundle exec jekyll serve
If you get an error like:
Could not find webrick-1.7.0 in any of the sources
Run `bundle install` to install missing gems.
Do as it says and just run:
bundle install
Now you can work on the website and look at how it changes on screen.
By the way, if you are hosting on GitHub Pages and have a custom domain, you need to add these to the DNS
| Type | Name | Points to | TTL |
|---|---|---|---|
| A | @ | 185.199.108.153 | 600 seconds |
| A | @ | 185.199.109.153 | 600 seconds |
| A | @ | 185.199.110.153 | 600 seconds |
| A | @ | 185.199.111.153 | 600 seconds |
| CNAME | www | your-username.github.io | 600 seconds |
sudo apt install texlive-latex-extra
sudo apt-get install latexdiff
This installs a few packages along with it, including latexdiff which I use a lot as a PhD student.
https://github.com/elisa-aleman/latex_helpers
I made these shell scripts to compile faster when using bibliographies and to delete cumbersome files that aren't needed every time I compile. Since they are .sh scripts, they run directly in the Linux terminal (or with Git Bash on Windows).
- Install Package Control.
- Install LaTeXTools plugin.
https://tex.stackexchange.com/a/85487
If you have the LaTeXTools plugin, it already does that except that it is mapped on Shift+Enter instead of Enter.
For Japanese UTF-8 text in XeLaTeX:
\usepackage{xeCJK}
Set the fonts: these are the default, but they have no bold
\setCJKmainfont{IPAMincho} % No bold, serif
\setCJKsansfont{IPAGothic} % No bold, sans-serif
\setCJKmonofont{IPAGothic} % No bold, sans-serif
Installing fonts, for example, Aozora mincho has guaranteed bold
https://web.archive.org/web/20200321102301/http://blueskis.wktk.so/AozoraMincho/download.html
Make sure to install for all users:
https://stackoverflow.com/questions/55264642/how-to-force-win10-to-install-fonts-in-c-windows-fonts
Set the installed font:
\setCJKmainfont[BoldFont=AozoraMincho-bold,AutoFakeSlant=0.15]{Aozora Mincho}
Japanese document style:
\usepackage[english,japanese]{babel} % For Japanese date format
\usepackage{indentfirst} % For Japanese style indentation
\setlength\parindent{11pt}
Japanese babel messes itemize up inside tables, so:
\usepackage{enumitem}
\newlist{jpcompactitemize}{itemize}{1} % defined new list
\setlist[jpcompactitemize]{topsep=0em, itemsep=-0.5em, label=\textbullet} % new list setup
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage[lighttt]{lmodern}
\usepackage{listings} % to display code
\usepackage{lstautogobble} % to indent inside latex without affecting the code, keeping the indent the code has inside
\usepackage{anyfontsize} % for code font size
\usepackage[os=win]{menukeys} % to display keystrokes
% For the color behind the code sections:
\usepackage{xcolor} %custom colours
\definecolor{light-gray}{gray}{0.95} %the shade of grey that stack exchange uses
\definecolor{editorGreen}{rgb}{0, 0.5, 0} % #007C00 -> rgb(0, 124, 0)
% Load more detailed language definitions for nicer colors
% (\input reads the files as-is; \include is meant for chapters in the document body)
\input{lststyle-css.sty}
\input{lststyle-html5.sty}
% Set up the code display lst options
\lstset{
% for the code font and size:
% basicstyle=\ttfamily\small,
basicstyle=\ttfamily\fontsize{10}{12}\selectfont,
% to avoid spaces showing as brackets in strings
showstringspaces=false,
% for straight quotes in code
upquote=true,
% for the middle tildes in the code
literate={~}{{\fontfamily{ptm}\selectfont \textasciitilde}}1,
% for the line break in long texts
breaklines=true,
postbreak=\mbox{\textcolor{red}{$\hookrightarrow$}\space},
% for the keyword colors in the code
keywordstyle=\color{blue}\bfseries\ttfamily,
stringstyle=\color{purple},
commentstyle=\color{darkgray}\ttfamily,
keywordstyle={[2]{\color{editorGreen}\bfseries\ttfamily}},
autogobble=true % to ignore latex indents but keep code indent
}
% unnecessary in XeLaTeX
% % For this specific document with lots of degree signs inside listings
% \lstset{
% literate={°}{\textdegree}1
% }
% for straight double quotes in code
\usepackage[T1]{fontenc}
% frame set up
\usepackage[framemethod=TikZ]{mdframed} %nice frames
\mdfsetup{
backgroundcolor=light-gray,
roundcorner=7pt,
leftmargin=1,
rightmargin=1,
innerleftmargin=1em,
innertopmargin=0.5em,
innerbottommargin=0,
outerlinewidth=1,
linecolor=light-gray,
}
% Make it affect all lstlistings
\BeforeBeginEnvironment{lstlisting}{\begin{mdframed}\vskip-.5\baselineskip}
\AfterEndEnvironment{lstlisting}{\end{mdframed}}
% Make colored box around inline code
\usepackage{realboxes}
\usepackage{xpatch}
\makeatletter
\xpretocmd\lstinline{\Colorbox{light-gray}\bgroup\appto\lst@DeInit{\egroup}}{}{}
\makeatother
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
When it comes down to it, especially when working with LaTeX or git, you find yourself typing the same commands over and over again. That costs time and causes frustration, so I find that writing a script every now and then saves me a lot of time in the future.
Once in a while those scripts need some input, so that they are useful in as many cases as possible instead of being a one-time thing.
Looking for how to do this, I ran across a simple StackOverflow question, which led me to the getopts shell builtin and a tutorial for it.
This is a working example:
while getopts ":a:" opt; do
case $opt in
a)
echo "-a was triggered, Parameter: $OPTARG" >&2
;;
\?)
echo "Invalid option: -$OPTARG" >&2
exit 1
;;
:)
echo "Option -$OPTARG requires an argument." >&2
exit 1
;;
esac
done
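The example above only handles an option that takes a value. As a minimal sketch, a script that also wants a plain on/off flag and some file names after the options could look like the following. The function name parse_args and the options -v and -o are made up for illustration; only getopts, OPTIND and OPTARG come from the shell itself.

```shell
# Hypothetical helper: -v is a flag, -o takes a value,
# everything after the options is treated as a file name.
parse_args() {
    verbose=0
    output=""
    OPTIND=1   # reset so the function can be called more than once
    while getopts ":vo:" opt; do
        case $opt in
            v) verbose=1 ;;
            o) output=$OPTARG ;;
            \?) echo "Invalid option: -$OPTARG" >&2; return 1 ;;
            :) echo "Option -$OPTARG requires an argument." >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))   # drop the options that getopts already consumed
    echo "verbose=$verbose output=$output files=$*"
}

parse_args -v -o out.pdf paper.tex
# prints: verbose=1 output=out.pdf files=paper.tex
```

The shift line is the part that is easy to forget: without it, the positional arguments would still start with the options.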
Now sometimes you'll want to have fancy arguments with both a short name (-) and a long name (--), for example -a and --doall both pointing to the same command. In that case I recommend using nhoffman's implementation of Python's argparse in bash:
argparse.bash by nhoffman on GitHub
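If pulling in argparse.bash is more than a small script needs, short and long names can also be matched by hand with a plain while/case loop. This is only a sketch of that alternative, not how argparse.bash works internally; parse_long and the --name option are hypothetical, and -a/--doall are borrowed from the example above.

```shell
# Hypothetical hand-rolled parser: -a/--doall is a flag,
# -n VALUE, --name VALUE and --name=VALUE all set the same variable.
parse_long() {
    doall=0
    name=""
    while [ $# -gt 0 ]; do
        case $1 in
            -a|--doall) doall=1 ;;
            -n|--name)
                [ $# -ge 2 ] || { echo "Option $1 requires an argument." >&2; return 1; }
                name=$2
                shift ;;                      # extra shift for the value
            --name=*) name=${1#--name=} ;;    # strip the "--name=" prefix
            --) shift; break ;;               # explicit end of options
            -*) echo "Invalid option: $1" >&2; return 1 ;;
            *) break ;;                       # first non-option argument
        esac
        shift
    done
    echo "doall=$doall name=$name rest=$*"
}

parse_long --doall --name=paper main.tex
# prints: doall=1 name=paper rest=main.tex
```

This stays readable for two or three options; beyond that, generated help text and validation are where a library like argparse.bash starts to pay off.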
Personally, I find it tiring to compile a LaTeX document, run the bibliography, and then compile the document twice more so that all the references end up where they need to be. The output files also clutter my workspace, and I only need to see them when I run into certain errors.
Also, for academic papers, I used latexdiff commands quite a lot, and while they are customizable, I noticed that one particular configuration covered most journals.
So I made LaTeX helpers, a couple of bash scripts that make that process faster.
So instead of typing
pdflatex paper.tex
bibtex paper
pdflatex paper.tex
pdflatex paper.tex
open paper.pdf
rm paper.log paper.out paper.aux paper.... and so on
Every. Single. Time.
I just need to type:
./latexcompile.sh paper.tex --view --clean
and if I needed to make a latexdiff I just:
./my_latexdiff.sh paper_V1-1.tex paper.tex --newversion="2" --compile --view --clean
And there it is, a latexdiff PDF right on my screen.
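To make the idea concrete, here is a simplified stand-in for the compile-and-clean cycle. It is not the actual latexcompile.sh from the repository: the names run and compile_cycle, the DRY_RUN switch, and the exact list of files to delete (.bbl and .blg are my assumption) are all made up for illustration.

```shell
# Hypothetical sketch, not the real latexcompile.sh.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

compile_cycle() {
    base=${1%.tex}                  # paper.tex -> paper
    run pdflatex "$base.tex" &&     # first pass writes the citation keys
    run bibtex "$base" &&           # build the bibliography
    run pdflatex "$base.tex" &&     # second pass pulls the bibliography in
    run pdflatex "$base.tex" &&     # third pass settles the references
    run rm -f "$base.log" "$base.out" "$base.aux" "$base.bbl" "$base.blg"
}

DRY_RUN=1 compile_cycle paper.tex   # prints the five commands without running them
```

Chaining with && means the cleanup only happens after a successful build, so the log is still there when something goes wrong.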
I would also commonly have several documents in different languages, or a latexdiff command I wanted to keep around, so I would save those commands in another script, called cur_compile_all.sh or cur_latexdiff.sh, so I didn't have to remember version numbers and the like when working across several weeks or months.
Usually with code such as:
cd en
./latexcompile.sh paper.tex --view --clean --xelatex
cd ../es
./latexcompile.sh paper.tex --view --clean --xelatex
cd ../jp
./latexcompile.sh paper.tex --view --clean --xelatex
And so on, to save time.
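The repeated cd lines can also be written as a loop. This is just a sketch under the assumption that every language folder contains its own latexcompile.sh and paper.tex, as in the commands above; compile_all is a hypothetical name.

```shell
# Hypothetical wrapper: run the same compile command in each given folder.
compile_all() {
    for lang in "$@"; do
        # The subshell keeps the cd local, so no "cd .." is needed afterwards.
        ( cd "$lang" && ./latexcompile.sh paper.tex --view --clean --xelatex ) ||
            echo "Compilation failed in $lang" >&2
    done
}
```

Calling compile_all en es jp from the project root would then replace the six lines above, and a failure in one folder does not stop the others.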
I discovered Pandoc recently when I was asked to share a PDF of my private GitLab Markdown notes. I wasn't going to share the whole repository just so the notes could be displayed in GitLab, so I searched for an alternative.
It can be installed on Windows, macOS, Linux, ChromeOS, BSD, Docker, and more, so it's really portable.
Pandoc Install:
https://pandoc.org/installing.html
Pandoc Manual:
https://pandoc.org/MANUAL.html
Export to PDF syntax
pandoc test1.md -s -o test1.pdf
Note that it uses LaTeX (pdflatex by default) to convert to PDF, so text in non-Latin scripts (Japanese, etc.) might return errors. Switching the PDF engine to XeLaTeX handles Unicode input:
pandoc test1.md -s -o test1.pdf --pdf-engine=xelatex
But it doesn't load a font for Japanese... Also, the default margins are way too wide.
So, in the YAML metadata block at the top of the original markdown file, we need to add variables for LaTeX:
---
title: "Title"
author: "Name"
date: YYYY-MM-DD
# add the following
geometry: margin=1.5cm
output: pdf_document
# CJKmainfont: IPAMincho  # default font, but no bold
# install this one for bold Japanese: https://web.archive.org/web/20200321102301/http://blueskis.wktk.so/AozoraMincho/download.html
# https://stackoverflow.com/questions/55264642/how-to-force-win10-to-install-fonts-in-c-windows-fonts (install for all users)
CJKmainfont: Aozora Mincho
CJKoptions:
- BoldFont=AozoraMincho-bold
- AutoFakeSlant=0.15
---
And voilà, the markdown is now a PDF.
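For files where I can't or don't want to edit the metadata block, the same variables can be passed on the command line with pandoc's -V option. A minimal sketch, assuming the Aozora Mincho font from above is installed; the wrapper name md2pdf is hypothetical.

```shell
# Hypothetical wrapper: md2pdf notes.md [notes.pdf]
md2pdf() {
    in=$1
    out=${2:-${in%.md}.pdf}          # default: same name with a .pdf extension
    pandoc "$in" -s -o "$out" \
        --pdf-engine=xelatex \
        -V geometry:margin=1.5cm \
        -V CJKmainfont="Aozora Mincho" \
        -V CJKoptions=BoldFont=AozoraMincho-bold
}
```

Calling md2pdf test1.md would then write test1.pdf with the same margins and font as the metadata block above. Values in the file's own metadata block still apply for anything not set here.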
At first I was unsure whether it would process the GitHub or the GitLab math environments, since their syntax is different.
According to the Math section of the User's Guide, it uses the same dollar-sign syntax as GitHub.
Inline: $x=3$
Block: $$x=3$$
For a Linux server to use an NVIDIA GPU for calculations instead of the CPU, we need to install CUDA, and for neural networks specifically, cuDNN is also needed.
- NVIDIA drivers installation guide: https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html
- CUDA installation guide: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
- cuDNN installation guide: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#download
If you already have NVIDIA or CUDA tools installed and want to upgrade, you can remove the old packages with these commands before installing the new software (the quotes keep the shell from expanding the * itself):
sudo apt remove --purge "cuda*"
sudo apt remove --purge "nvidia*"
sudo apt remove --purge "libcuda*"
Be prepared to run your server in console only mode, since you'll be altering the graphics drivers during this process.
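After the installation (and a reboot), a quick sanity check is to see whether the driver and toolkit commands are reachable. This is only a sketch; check_gpu_tools is a made-up name, while nvidia-smi and nvcc are the command-line tools that ship with the driver and the CUDA toolkit respectively.

```shell
# Hypothetical helper: report whether the NVIDIA command-line tools are on the PATH.
check_gpu_tools() {
    for tool in nvidia-smi nvcc; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found at $(command -v "$tool")"
        else
            echo "$tool: not found"
        fi
    done
}

check_gpu_tools
```

If both are found, running nvidia-smi lists the driver version and the visible GPUs, and nvcc --version prints the CUDA compiler version. If nvcc is reported as missing even though CUDA is installed, the toolkit's bin folder is probably not on the PATH yet; the CUDA installation guide covers that in its post-installation steps.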
When designing new things it's important to keep in mind color theory, as well as accessibility for visually impaired and color-blind people. Checking all of that by hand takes time one could spend on so much else, so here's a tool that helps with it and also visualizes how people with different ranges of color vision would perceive a palette. It's called Paletton.
A new tool called "Bionic Reading" was developed, which bolds the beginnings of words so that our eyes glide over them more easily, basically enabling speed reading without having to train specifically for it. Lots of neurodivergent people such as myself (I have ADHD and am autistic) have a hard time following long texts or focusing when there is too much information at the same time (say, with very small line spacing). The tool has been praised by the ND (neurodivergent) community, since making it available for businesses or companies to use would mean more accessibility in everyday services... or at least it was, until they decided to charge an OUTRAGEOUS amount of money to implement it, making it obviously unattractive for companies and therefore ruining it for everyone.
That is why someone decided to make "Not Bionic Reading" which is, legally speaking, not the same thing as Bionic Reading and therefore can be made available for everyone as Open Source.
Here's the usable link: https://not-br.neocities.org/
Have fun reading!
https://pncnmnp.github.io/blogs/firefox-dark-mode.html
After hunting on the web for about 30 minutes, I found this thread on Bugzilla. It turns out starting with Firefox 60, extensions are no longer allowed to interact with the native pdf viewer. Determined, I decided to locally modify the CSS rendered by Firefox's PDF viewer. The steps for the same are:
- Open Firefox and press Alt to show the top menu, then click on Help → Troubleshooting Information
- Click the Open Directory button beside the Profile Directory entry
- Create a folder named chrome in the directory that opens
- In the chrome folder, create a CSS file with the name userContent.css
- Open the userContent.css file and insert -
#viewerContainer > #viewer > .page > .canvasWrapper > canvas {
/* both functions go in one declaration; a second "filter" line would override the first */
filter: grayscale(100%) invert(100%);
}
- On Firefox's URL bar, type about:config.
- Search for toolkit.legacyUserProfileCustomizations.stylesheets and set it to true.
- Restart Firefox and fire up a PDF file to see the change!
I found a solution in this post:
https://superuser.com/a/1527417
The following snippet adds a div overlay to any browser tab currently displaying a PDF document.
- Open up your browser's Dev tools then browser console.
- Paste this JavaScript code in your browser console:
const overlay = document.createElement("div");
const css = `
position: fixed;
pointer-events: none;
top: 0;
left: 0;
width: 100vw;
height: 100vh;
background-color: white;
mix-blend-mode: difference;
z-index: 1;
`
overlay.setAttribute("style", css);
document.body.appendChild(overlay);
- Hit Enter
Special thanks: https://www.reddit.com/r/chrome/comments/e3txhi/comment/fem1cto
That is all for now. This is my initial setup for the lab environment under a proxy. If I have any projects that need further tinkering, that goes on another repository / tutorial.