Comments (4)
Hello Paul,
Thanks a lot for the question. It gives me the opportunity to clarify a few things about the Ubuntu 14.04.4 LTS process. On Ubuntu 14.04.4 LTS, the most efficient way to set up the system is to use docker directly (without the VirtualBox overhead). It's more efficient because all those "virtual machines" run as "native" processes on the host, without incurring larger virtualisation overheads.
Follow the Appendix A process to set those up for Ubuntu. Don't forget to log out and log back in at the end of this process. This is important because otherwise you won't be able to run docker without sudo. When you log back in, you should be able to check that docker runs fine:
docker run hello-world
One more time (because it got me twice): the above command must run cleanly without sudo. My book has installation instructions to get to this point, which are valid right now but might become invalid at some point in the future. So always cross-check the latest docker installation instructions here.
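If docker run hello-world still asks for sudo after the setup, one thing worth checking is whether the current login session has picked up the new group membership. The sketch below assumes that the installation process grants sudo-less access by adding your user to a group named docker (the usual approach, but check Appendix A for what it actually does); in_group is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# in_group GROUP [GROUP_LIST]: succeed if GROUP appears in the space-separated
# GROUP_LIST. When GROUP_LIST is omitted, the groups of the current session
# (as reported by `id -nG`) are used.
# Assumption: sudo-less docker access comes from membership in a "docker" group.
in_group() {
    group="$1"
    list="${2-$(id -nG)}"
    case " $list " in
        *" $group "*) return 0 ;;
        *) return 1 ;;
    esac
}

if in_group docker; then
    echo "docker group is active in this session"
else
    echo "docker group is not active yet: log out and log back in"
fi
```

A session started before the group was added keeps its old group list, which is why logging out and back in matters.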
Now, one thing that I must admit is not very explicit in the book is that if one installs Vagrant with sudo apt-get install vagrant, one gets a rather old version (1.4.3 at the moment). Vagrant moves fast and significant bug fixes have been added in between. So please install or upgrade Vagrant to the latest available version (1.8.1 at the moment) with something along the lines of:
wget https://releases.hashicorp.com/vagrant/1.8.1/vagrant_1.8.1_x86_64.deb
sudo dpkg -i vagrant_1.8.1_x86_64.deb
You can find the latest process and URLs here. If you don't have docker installed, or if Vagrant is old, you might get errors like this:
The executable 'docker' Vagrant is trying to run was not
found in the PATH variable. This is an error. Please verify
this software is installed and on the path.
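Both causes of this error can be checked before running vagrant up. The following is a minimal sketch, not part of the book's tooling: missing_tools and version_at_least are hypothetical helper names, the 1.8.1 threshold is simply the version mentioned above rather than a documented minimum, and the comparison relies on GNU sort -V, which Ubuntu's coreutils provides.

```shell
#!/bin/sh
# missing_tools TOOL...: print the name of each tool that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# version_at_least HAVE WANT: succeed if HAVE >= WANT.
# Uses version sort (GNU sort -V): WANT must sort first, or be equal to HAVE.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

missing="$(missing_tools docker vagrant)"
if [ -n "$missing" ]; then
    # Unquoted on purpose so several names end up on one line.
    echo "not found in PATH:" $missing
else
    # `vagrant --version` prints e.g. "Vagrant 1.8.1"; keep the second field.
    have="$(vagrant --version | awk '{print $2}')"
    if version_at_least "$have" 1.8.1; then
        echo "Vagrant $have looks recent enough"
    else
        echo "Vagrant $have is older than 1.8.1: consider upgrading"
    fi
fi
```

Version sort is used instead of a plain string comparison so that a release such as 1.10.0 is correctly treated as newer than 1.8.1.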
If you have docker and Vagrant installed, at this point, you should be able to do the usual process:
git clone https://github.com/scalingexcellence/scrapybook.git
cd scrapybook
vagrant up --no-parallel
The system should be up and running after some time (a bit longer the first time because it downloads docker images). The process above is 100% Vagrant and Docker based; it works nicely, is very efficient, and is highly recommended for Ubuntu 14.04.4 LTS.
This should be enough to run the book's system, and this is where the answer really finishes. The rest of the material is just for reference.
Some Reference Material
Of course one can use VirtualBox under Ubuntu. One thing to be aware of is that a Linux system that already runs inside a virtual machine (e.g. the ones one gets from Amazon AWS EC2) might not have virtualization extensions enabled. As per #5, I'm not willing to support such systems to a great extent, but I provide some pointers and workarounds there. So if you are on AWS/EC2, prefer docker.
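A common way to see whether virtualization extensions are visible to a Linux system is to look for the vmx (Intel VT-x) or svm (AMD-V) CPU flags in /proc/cpuinfo. The sketch below does that; has_virt_flags is a hypothetical helper name, and a present flag only means the CPU advertises the feature (it can still be disabled in the firmware settings).

```shell
#!/bin/sh
# has_virt_flags [FILE]: succeed if FILE (default /proc/cpuinfo) lists the
# vmx (Intel VT-x) or svm (AMD-V) flag as a whole word.
has_virt_flags() {
    grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_virt_flags; then
    echo "CPU advertises virtualization extensions"
else
    echo "no vmx/svm flag visible: prefer the docker flow"
fi
```

Inside a guest without nested virtualization, neither flag usually shows up, which matches the AWS/EC2 situation described above.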
I will assume from now on that you run on a machine that has virtualization extensions enabled and that you want to run the usual VirtualBox flow. First of all, you will have to install VirtualBox as described here and then Vagrant as described previously. As mentioned in issue 5 here, there's no further need to explicitly download and install scrapybook.box. This is great and simplifies the process. (If you did so, it wouldn't really hurt, but keep in mind that you would have to change config.vm.box = "lookfwd/scrapybook" to config.vm.box = "scrapybook" in Vagrantfile.dockerhost.)
So let's assume that you take the easy path and you've just downloaded and installed VirtualBox and Vagrant on Ubuntu 14.04.4 LTS. All you have to do then is set an environment variable:
export SCRAPYBOOK_FORCE_HOST_VM=TRUE
and then the typical:
git clone https://github.com/scalingexcellence/scrapybook.git
cd scrapybook
vagrant up --no-parallel
Unfortunately, --provider=virtualbox won't work because it tries to treat the servers' definitions as VirtualBox images instead of docker images. config.vm.box is necessary for VirtualBox images but optional and meaningless for docker images, hence the very confusing error message:
There are errors in the configuration of this machine. Please fix
the following errors and try again:
vm:
* A box must be specified.
All those extra comments and processes are just for reference. The only thing that you really need for Ubuntu 14.04.4 LTS is the latest docker + Vagrant, as described at the beginning of this answer. This is the most efficient and easiest way for Ubuntu.
from scrapybook.
Hello, maybe something stupid but I think you forgot to cd into the scrapybook directory before running the vagrant up command.
Cheers
from scrapybook.
Thank you for the thorough answer.
Getting the latest vagrant did the trick!
wget https://releases.hashicorp.com/vagrant/1.8.1/vagrant_1.8.1_x86_64.deb
sudo dpkg -i vagrant_1.8.1_x86_64.deb
from scrapybook.
Awesome!
from scrapybook.