
ansible-playbooks's Introduction



As of January 8, 2024, Snowplow is introducing the Snowplow Limited Use License Agreement, and we will be releasing new versions of our core behavioral data pipeline technology under this license.

Our mission to empower everyone to own their first-party customer behavioral data remains the same. We value all of our users and remain dedicated to helping our community use Snowplow in the optimal capacity that fits their business goals and needs.

We reflect on our Snowplow origins and provide more information about these changes in our blog post here → https://eu1.hubs.ly/H06QJZw0


Overview

Snowplow is a developer-first engine for collecting behavioral data. In short, it allows you to:

Thousands of organizations like Burberry, Strava, and Auto Trader rely on Snowplow to collect, manage, and operationalize real-time event data from their central data platform to uncover deeper customer journey insights, predict customer behaviors, deliver differentiated customer experiences, and detect fraudulent activities.


Why Snowplow?

  • 🏔️ “Glass-box” technical architecture capable of processing billions of events per day.
  • 🛠️ Over 20 SDKs to collect data from web, mobile, server-side, and other sources.
  • ✅ A unique approach based on schemas and validation ensures your data is as clean as possible.
  • 🪄 Over 15 enrichments to get the most out of your data.
  • 🏭 Stream data to your data warehouse/lakehouse or SaaS destinations of choice — Snowplow fits nicely within the Modern Data Stack.

➡ Where to start? ⬅️

Snowplow Community Edition

Community Edition equips you with everything you need to start creating behavioral data in a high-fidelity, machine-readable way. Head over to the Quick Start Guide to set things up.

Snowplow Behavioral Data Platform

Looking for an enterprise solution with a console, APIs, data governance, workflow tooling? The Behavioral Data Platform is our managed service that runs in your AWS, Azure or GCP cloud. Book a demo.

The documentation is a great place to learn more.

Would you rather dive into the code? Then you are already in the right place!


Snowplow technology 101


The repository structure follows the conceptual architecture of Snowplow, which consists of six loosely-coupled sub-systems connected by five standardized data protocols/formats.

To briefly explain these six sub-systems:

  • Trackers fire Snowplow events. Currently we have 15 trackers, covering web, mobile, desktop, server and IoT.
  • Collector receives Snowplow events from trackers. Currently we have one official collector implementation with different sinks: Amazon Kinesis, Google PubSub, Amazon SQS, Apache Kafka and NSQ.
  • Enrich cleans up the raw Snowplow events, enriches them and puts them into storage. Currently we have several implementations, built for different environments (GCP, AWS, Apache Kafka), and one core library.
  • Storage is where the Snowplow events live. Currently we store the Snowplow events in a flat file structure on S3, and in the Redshift, Postgres, Snowflake and BigQuery databases.
  • Data modeling is where event-level data is joined with other data sets and aggregated into smaller data sets, and business logic is applied. This produces a clean set of tables which make it easier to perform analysis on the data. We officially support data models for Redshift, Snowflake and BigQuery.
  • Analytics are performed on the Snowplow events or on the aggregate tables.

For more information on the current Snowplow architecture, please see the Technical architecture.


About this repository

This repository is an umbrella repository for all loosely-coupled Snowplow components and is updated on each component release.

Since June 2020, all components have been extracted into their dedicated repositories (more info here) and this repository serves as an entry point for Snowplow users and as a historical artifact.

Components that have been extracted to their own repository are still here as git submodules.

Trackers

A full list of supported trackers can be found on our documentation site. Popular trackers and use cases include:

  • Web: JavaScript, AMP
  • Mobile: Android, iOS, React Native, Flutter
  • Gaming: Unity, C++, Lua
  • TV: Roku, iOS, Android, React Native
  • Desktop & Server: Command line, .NET, Go, Java, Node.js, PHP, Python, Ruby, Scala, C++, Rust, Lua

Loaders

Iglu

Data modeling

Web

Mobile

Media

Retail

Testing

Parsing enriched event


Community

We want to make it super easy for Snowplow users and contributors to talk to us and connect with one another, to share ideas, solve problems and help make Snowplow awesome. Join the conversation:

  • Meetups. Don’t miss your chance to talk to us in person. We are often on the move with meetups in Amsterdam, Berlin, Boston, London, and more.
  • Discourse. Our forum for all Snowplow users: engineers setting up Snowplow, data modelers structuring the data, and data consumers building insights. You can find guides, recipes, questions and answers from Snowplow users and the Snowplow team. All questions and contributions are welcome!
  • Twitter. Follow @Snowplow for official news and @SnowplowLabs for engineering-heavy conversations and release announcements.
  • GitHub. If you spot a bug, please raise an issue in the GitHub repository of the component in question. Likewise, if you have developed a cool new feature or an improvement, please open a pull request; we’ll be glad to integrate it into the codebase! For brainstorming a potential new feature, Discourse is the best place to start.
  • Email. If you want to talk to Snowplow directly, email is the easiest way. Get in touch at [email protected].

Copyright and license

Snowplow is copyright 2012-2023 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

ansible-playbooks's People

Contributors

alexanderdean, benfradet, bigsnarfdude, chuwy, fblundun, jbeemster, jrluis, mhadam, rzats, yalisassoon


ansible-playbooks's Issues

RVM operations should not be sudo'ed

Just confirmed that Jekyll does not work on a fresh build, even when making sure to log out between Ruby & snowplow.github.com installation.

vagrant@precise64:~$ jekyll --server
/home/vagrant/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/site_ruby/1.9.1/rubygems/dependency.rb:298:in `to_specs': Could not find 'jekyll' (>= 0) among 7 total gem(s) (Gem::LoadError)
    from /home/vagrant/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/site_ruby/1.9.1/rubygems/dependency.rb:309:in `to_spec'
    from /home/vagrant/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/site_ruby/1.9.1/rubygems/core_ext/kernel_gem.rb:53:in `gem'
    from /home/vagrant/.rvm/rubies/ruby-1.9.3-p484/bin/jekyll:22:in `<main>'
    from /home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/bin/ruby_executable_hooks:15:in `eval'
    from /home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/bin/ruby_executable_hooks:15:in `<main>'

Can't get latest website installer working

From a clean dev-environment and ansible-playbooks:

vagrant@precise64:~$ ansible-playbook /vagrant/ansible-playbooks/snowplow-website.yml \
> --inventory-file=/home/vagrant/ansible_hosts --connection=local

PLAY [vagrant] ****************************************************************

GATHERING FACTS ***************************************************************
ok: [127.0.0.1]

TASK: [base | Update apt cache] ***********************************************
ok: [127.0.0.1]

TASK: [base | Update all packages] ********************************************
changed: [127.0.0.1]

TASK: [base | install basic packages] *****************************************
changed: [127.0.0.1] => (item=curl,vim,libxslt-dev,libxml2-dev,unzip,python-pip)

TASK: [ruby-rvm | Ensure curl is installed (used to download RVM)] ************
ok: [127.0.0.1] => (item=curl)

TASK: [ruby-rvm | Ensure RVM is installed] ************************************
changed: [127.0.0.1]

TASK: [ruby-rvm | Ensure Ruby 1.9.3 is installed as default] ******************
changed: [127.0.0.1]

TASK: [ruby-rvm | Start RVM (otherwise only launched when the shell is rebooted)] ***
changed: [127.0.0.1]

TASK: [ruby-rvm | NOTICE! You will need to exit the terminal and open a fresh terminal before you can use RVM. \n(Or alternatively run '. /home/vagrant/.rvm/scripts/rvm] ***
changed: [127.0.0.1]

TASK: [jekyll | Ensure Jekyll's gem dependencies are installed] ***************
failed: [127.0.0.1] => (item={'version': u'1.4.3', 'flags': '--no-ri --no-rdoc', 'name': 'jekyll'}) => {"changed": true, "cmd": "gem install jekyll -v 1.4.3 --no-ri --no-rdoc ", "delta": "0:00:32.792299", "end": "2014-03-05 12:38:44.284301", "item": {"flags": "--no-ri --no-rdoc", "name": "jekyll", "version": "1.4.3"}, "rc": 1, "start": "2014-03-05 12:38:11.492002"}
stderr: ERROR:  While executing gem ... (Gem::FilePermissionError)
    You don't have write permissions into the /opt/vagrant_ruby/lib/ruby/gems/1.8 directory.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
           to retry, use: --limit @/home/vagrant/snowplow-website.retry

127.0.0.1                  : ok=9    changed=6    unreachable=0    failed=1

ruby-rvm.yaml* is missing a pg dependency

Almost all our Ruby stuff requires PG access, and the pg gem cannot be installed without first:

sudo apt-get install libpq-dev

* or arguably snowplow-batch-pipeline.yaml instead

Problems auto-cloning snowplow-python tracker

Have tried:

  • Deleting dev-environment's own .git* files and folders
  • Installing with dest=/vagrant/
  • Installing with dest=/vagrant/snowplow-python-tracker

All give different error messages, suggesting getting closer to the answer (?), but no joy yet...

Remove yui-compressor from snowplow-javascript-tracker.yml

Grunt takes care of compression so this is no longer needed.

Running the playbook without removing the line produces this error:

TASK: [yui-compressor | Ensure YUI Compressor is downloaded] ****************** 
failed: [127.0.0.1] => {"dest": "/tmp/", "failed": true, "gid": 0, "group": "root", "mode": "01777", "owner": "root", "response": "Request failed: <urlopen error [Errno 1] _ssl.c:504: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>", "size": 4096, "state": "directory", "status_code": -1, "uid": 0, "url": "https://github.com/downloads/yui/yuicompressor/yuicompressor-2.4.2.zip"}
msg: Request failed

FATAL: all hosts have already failed -- aborting

node installation missing 2 dependencies

TASK: [nodejs | Node.js | Make] ***********************************************
failed: [127.0.0.1] => {"changed": true, "cmd": "/usr/bin/make ", "delta": "0:00:00.430994", "end": "2014-03-17 11:00:48.098899", "item": "", "rc": 2, "start": "2014-03-17 11:00:47.667905"}
stderr: make[1]: g++: Command not found
make[1]: *** [/tmp/node-v0.10.25/out/Release/obj.target/v8_base/deps/v8/src/accessors.o] Error 127
make: *** [node] Error 2
stdout: /usr/bin/make -C out BUILDTYPE=Release V=1
make[1]: Entering directory `/tmp/node-v0.10.25/out'
  g++ '-DENABLE_DEBUGGER_SUPPORT' '-DENABLE_EXTRA_CHECKS' '-DV8_TARGET_ARCH_X64' -I../deps/v8/src  -Wall -Wextra -Wno-unused-parameter -pthread -m64 -fno-strict-aliasing -O2 -fno-strict-aliasing -fno-tree-vrp -fno-omit-frame-pointer -fno-rtti -fno-exceptions -MMD -MF /tmp/node-v0.10.25/out/Release/.deps//tmp/node-v0.10.25/out/Release/obj.target/v8_base/deps/v8/src/accessors.o.d.raw  -c -o /tmp/node-v0.10.25/out/Release/obj.target/v8_base/deps/v8/src/accessors.o ../deps/v8/src/accessors.cc
make[1]: Leaving directory `/tmp/node-v0.10.25/out'

FATAL: all hosts have already failed -- aborting

Fix:

sudo apt-get install build-essential g++
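
The failure above comes down to g++ not being on the PATH when make runs. A minimal preflight sketch that a playbook step could run before the build is shown here; `missing_commands` is a hypothetical helper name, not something that exists in the playbooks.

```shell
# Hypothetical helper: print each named command that is not on PATH.
# Returns 0 when every command is available, 1 otherwise.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return "$rc"
}
```

For the Node.js build this could be called as `missing_commands g++ make`, installing the packages from the fix only when something is reported.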

Add Android environment setup

Here are the exact build steps that you need to build the Android Tracker. I've tested it on a fresh Ubuntu 14.04 x64 VM.

Your VM needs 2 GB of RAM for the SDK to build the tracker.

# Installing OpenJDK
apt-get update
apt-get install -y openjdk-7-jre openjdk-7-jdk

# Download the latest Android SDK
wget http://dl.google.com/android/android-sdk_r23.0.2-linux.tgz
tar -xvf android-sdk_r23.0.2-linux.tgz

# Setup the Android environment
export ANDROID_HOME=`pwd`/android-sdk-linux
export PATH="$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools:$PATH"

# Adds the environment to your profile as well
echo "export ANDROID_HOME=`pwd`/android-sdk-linux
export PATH=\"\$ANDROID_HOME/tools:\$ANDROID_HOME/platform-tools:\$PATH\"" >> ~/.bashrc

# Downloads the Platform and Build tools, App Support and Play Services SDKs
( sleep 5 && while [ 1 ]; do sleep 1; echo y; done ) | android update sdk -u --all --filter platform-tool-19.1.0,build-tools-19.1.0,android-19,105,106,99,98,16,4,2

# Libraries required to build
apt-get install -y libc6-i386 lib32stdc++6 lib32gcc1 lib32ncurses5 zlib1g lib32z1

# Actual build steps
apt-get install -y git-core
git clone https://github.com/snowplow/snowplow-android-tracker.git
cd snowplow-android-tracker/
#git checkout develop
./gradlew build

snowplow-js-tracker missing Java dependency

vagrant@precise64:/vagrant/snowplow-javascript-tracker$ grunt publish
Running "concat:dist" (concat) task
File "dist/snowplow.js" created.

Running "min:dist" (min) task
Warning: Command failed: /bin/sh: 1: java: not found
 Use --force to continue.

Aborted due to warnings.

Can't figure out how to make Ansible respect the Java 8 requirement

$ /vagrant/ansible-playbooks/roles/gradle/tasks$ ansible-playbook /vagrant/ansible-playbooks/java-8-gradle.yml --inventory-file=/vagrant/home/ansible/ansible_hosts --connection=local

When installing:

TASK: [oracle-java | Install Oracle Java Repo Installer Repository] ***********
ok: [127.0.0.1]

TASK: [oracle-java | Wizardry to bypass the Oracle License File prompt] *******
changed: [127.0.0.1]

TASK: [oracle-java | Install Oracle Java 6] ***********************************
skipping: [127.0.0.1]

TASK: [oracle-java | Install Oracle Java 7] ***********************************
skipping: [127.0.0.1]

TASK: [oracle-java | Install Oracle Java 8] ***********************************
ok: [127.0.0.1]

TASK: [oracle-java | Install PyCurl (required for apt-repository updates)] ****
ok: [127.0.0.1]

TASK: [oracle-java | Install Oracle Java Repo Installer Repository] ***********
ok: [127.0.0.1]

TASK: [oracle-java | Wizardry to bypass the Oracle License File prompt] *******
changed: [127.0.0.1]

TASK: [oracle-java | Install Oracle Java 6] ***********************************
skipping: [127.0.0.1]

TASK: [oracle-java | Install Oracle Java 7] ***********************************
ok: [127.0.0.1]

TASK: [oracle-java | Install Oracle Java 8] ***********************************
skipping: [127.0.0.1]

The first run through, for Java 8, successfully installs Java 8 and skips the others. When it comes to Gradle's turn, it seems to attempt to install Java 7, despite being run in an environment that specifically requests Java 8.

Maybe you know what's going on here?

Incompatible character encodings bug with jekyll --server

Having fixed the other issues, now left with:

vagrant@precise64:/vagrant/snowplow.github.com$ jekyll --server
Configuration from /vagrant/snowplow.github.com/_config.yml
Building site: . -> ./_site
Liquid Exception: incompatible character encodings: UTF-8 and ISO-8859-1 in blog-post
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/liquid-2.4.1/lib/liquid/block.rb:92:in `join'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/liquid-2.4.1/lib/liquid/block.rb:92:in `render_all'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/liquid-2.4.1/lib/liquid/block.rb:82:in `render'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/liquid-2.4.1/lib/liquid/template.rb:124:in `render'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/liquid-2.4.1/lib/liquid/template.rb:132:in `render!'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/jekyll-0.12.1/lib/jekyll/convertible.rb:101:in `do_layout'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/jekyll-0.12.1/lib/jekyll/post.rb:195:in `render'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/jekyll-0.12.1/lib/jekyll/site.rb:200:in `block in render'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/jekyll-0.12.1/lib/jekyll/site.rb:199:in `each'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/jekyll-0.12.1/lib/jekyll/site.rb:199:in `render'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/jekyll-0.12.1/lib/jekyll/site.rb:41:in `process'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/gems/jekyll-0.12.1/bin/jekyll:264:in `<top (required)>'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/bin/jekyll:23:in `load'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/bin/jekyll:23:in `<main>'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/bin/ruby_executable_hooks:15:in `eval'
/home/vagrant/.rvm/gems/ruby-1.9.3-p484@global/bin/ruby_executable_hooks:15:in `<main>'
Build Failed

Add terraform install

This needs tidying up, is hideous:

cd /tmp
wget https://dl.bintray.com/mitchellh/terraform/terraform_0.3.0_linux_amd64.zip
unzip terraform_0.3.0_linux_amd64.zip
sudo mv terraform* /usr/bin
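
One possible tidy-up of the steps above, as a sketch only: it works in a throwaway directory and removes the archive before moving the binaries, so the zip file does not end up in /usr/bin. The function names `terraform_url` and `install_terraform` are hypothetical, and it is an assumption that other versions follow the same file naming as 0.3.0.

```shell
# Hypothetical helper: rebuild the download URL from the issue for a given version.
# Assumption: other versions use the same naming pattern as 0.3.0.
terraform_url() {
    echo "https://dl.bintray.com/mitchellh/terraform/terraform_${1}_linux_amd64.zip"
}

# Hypothetical helper: download, unpack and install inside a temporary directory,
# which is removed again whether or not the install succeeds.
install_terraform() {
    version="$1"
    archive="terraform_${version}_linux_amd64.zip"
    workdir="$(mktemp -d)" || return 1
    (
        cd "$workdir" &&
        wget "$(terraform_url "$version")" &&
        unzip "$archive" &&
        rm "$archive" &&
        sudo mv terraform* /usr/bin
    )
    status=$?
    rm -rf "$workdir"
    return "$status"
}
```

Calling `install_terraform 0.3.0` would then reproduce the manual steps for the version named in the issue.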

Make Postgres open like Travis CI's

Travis CI has a very open PG which is great for tests.

To achieve the same:

$ sudo vi /etc/postgresql/8.4/main/pg_hba.conf

then add/replace:

local   all             postgres                                trust
host    all             postgres          127.0.0.1/32                      trust

then:

$ sudo /etc/init.d/postgresql reload
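
The manual edit above could be scripted roughly as follows. This is a sketch under assumptions: `prepend_rule` is a hypothetical helper, and it only detects a rule that is already present as an exact, whitespace-identical line. It places the rule at the top of the file because PostgreSQL applies the first pg_hba.conf record that matches a connection.

```shell
# Hypothetical helper: put a rule on the first line of a pg_hba-style file
# unless that exact line is already present somewhere in the file.
prepend_rule() {
    file="$1"
    rule="$2"
    if grep -qxF -- "$rule" "$file"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    status=0
    # Write through the original file so its owner and permissions are kept.
    { printf '%s\n' "$rule"; cat "$file"; } > "$tmp" && cat "$tmp" > "$file" || status=1
    rm -f "$tmp"
    return "$status"
}
```

Run as root, it would be called once per rule against /etc/postgresql/8.4/main/pg_hba.conf, for example with 'local all postgres trust', followed by the reload command shown above.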

Figure out why Neo4J complains about number of file opens

Neo4J complains about the maximum number of file opens on loading / restarting:

vagrant@precise64:/etc/init.d$ sudo ./neo4j-service restart
 * Restarting Neo4j Graph Database neo4j
WARNING: Max 1024 open files allowed, minimum of 40 000 recommended. See the Neo4j manual.
Using additional JVM arguments:  -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=conf/neo4j-server.properties -Djava.util.logging.config.file=conf/logging.properties -Dlog4j.configuration=file:conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
Starting Neo4j Server...WARNING: not changing user
process [2168]... waiting for server to be ready..... OK.
http://localhost:7474/ is ready.

I've followed the instructions in the following blog post to increase the limit: http://www.delimited.io/blog/2014/1/15/getting-started-with-neo4j-on-ubuntu-server, but I still get the same error...
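
When debugging this, it may help to check which limit a given shell actually has, since the limit seen by the service started through sudo can differ from the one in an interactive login. The sketch below compares the output of `ulimit -n` with the 40000 mentioned in the warning; `nofile_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when an open-files limit (as printed by `ulimit -n`)
# is "unlimited" or at least the given minimum.
nofile_ok() {
    limit="$1"
    minimum="$2"
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$minimum" ]
}

# Report the soft limit of the current shell against the recommended minimum.
current="$(ulimit -n)"
if nofile_ok "$current" 40000; then
    echo "open files limit $current is sufficient"
else
    echo "open files limit $current is below 40000"
fi
```

Running the same check from a root shell (for example via `sudo sh`) shows whether the raised limit reaches the context in which the init script runs.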

node.js version check doesn't work

TASK: [nodejs | Node.js | Checking installed version of node.js] **************
failed: [127.0.0.1] => {"changed": true, "cmd": "/usr/bin/test "$(node -v 2> /dev/null)" = v0.10.25 ", "delta": "0:00:00.003899", "end": "2014-03-17 11:08:12.214885", "item": "", "rc": 1, "start": "2014-03-17 11:08:12.210986"}
...ignoring
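
The check in the log returns 1 both when node is not installed and when it reports a different version, so the two cases cannot be told apart from the result. A minimal sketch that separates them is shown below; `version_status` is a hypothetical helper, not part of the nodejs role.

```shell
# Hypothetical helper: compare a command's reported version with the expected one.
# Usage: version_status <expected> <command> [args...]
# Prints "match", "mismatch" or "missing"; returns 0 only on a match.
version_status() {
    expected="$1"
    shift
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing"
        return 2
    fi
    actual="$("$@" 2>/dev/null)"
    if [ "$actual" = "$expected" ]; then
        echo "match"
        return 0
    fi
    echo "mismatch"
    return 1
}
```

For the task above the call would be `version_status v0.10.25 node -v`, and the playbook could then skip the build only on "match".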
