
saltfs's Introduction

Servo's SaltStack Configuration


What's going on?

Salt is a configuration management tool that we use to automate Servo's infrastructure. See docs/salt.md to get started.

Contributing

There are guides available on the Servo wiki, as well as some documentation in-tree in the docs folder. If you see a way that these configurations could be improved, or you try to set up your own instance and run into trouble, file an issue!

Travis

Travis CI is set up to test all configurations.

License

This repository is distributed under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE and LICENSE-MIT for details.

Note that some files in underscore-prefix directories (e.g. under _modules) are copies (possibly with changes) of files from the main Salt repo; these files have headers detailing the source of those files, any changes made, and the original license notice associated with those files.

saltfs's People

Contributors

aneeshusa, bors-servo, canaltinova, charlesvdv, delan, edunham, emilio, ferjm, gw3583, jdm, jgraham, jrmuizel, kichjang, larsbergstrom, manishearth, mbrubeck, metajack, mfeckie, mortimergoro, mrego, mrobinson, ms2ger, nical, notriddle, nox, paulrouget, sagudev, simonsapin, uk992, wafflespeanut


saltfs's Issues

Investigate OSX client perf

Below are some notes on raw OS X builder performance, comparing our Mac minis with a Mac Pro. HUGE improvement on test-css and some improvement on test-wpt/compile.

servo-mac3 (mini), mac-rel-css
compile (./mach build --release): 16min 30s
test-css (./mach test-css --release --processes 4 --log-raw test-css.log): 29min 29s

servo-mac2 (mini), mac-rel-wpt
compile (./mach build --release): 15min 15s
test-wpt (./mach test-wpt --release --processes 4 --log-raw test-wpt.log): 17min 27s
test-css (./mach test-css --release --processes 4 --log-raw test-css.log): 25min 46s

servo-macpro1
compile from full clean (./mach build --release): 16min 55s
compile from target-clean (./mach build --release): 12min 57s
test-css (./mach test-css --release --processes 4 --log-raw test-css.log): 15min 23s
^ stayed around 15-20% idle the whole time. WindowServer 33% CPU, Terminal 35% CPU; auto-hiding the Dock took 10% off WindowServer's CPU usage.
test-css (./mach test-css --release --processes 6 --log-raw test-css.log): 14min 48s
test-css (./mach test-css --release --processes 8 --log-raw test-css.log): 14min 43s
test-wpt (./mach test-wpt --release --processes 4 --log-raw test-wpt.log): 15min 57s
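For scale, the headline test-css numbers above (mini vs. Mac Pro, both at 4 processes) work out to roughly a 1.9x speedup:

```python
# Quick arithmetic on the timings above: servo-mac3 vs. servo-macpro1,
# test-css with --processes 4.
mini_secs = 29 * 60 + 29    # 29min 29s on servo-mac3
macpro_secs = 15 * 60 + 23  # 15min 23s on servo-macpro1
speedup = mini_secs / macpro_secs
print(round(speedup, 2))  # prints 1.92
```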

cc @metajack @edunham

homu repos that use travis should exclude master

Right now this causes a lot of spurious Travis runs: 1) the initial PR, 2) the merge into tmp, 3) the merge into auto, 4) the fast-forward to master.

(2) will be fixed by #56. (4) is what this bug proposes to fix. The other two are both needed.
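To cut (4), the affected repos' Travis configs could blacklist branches that only ever see already-tested commits. A hypothetical .travis.yml fragment (auto, homu's testing branch, is deliberately left unlisted so it still runs):

```yaml
# hypothetical .travis.yml fragment
branches:
  except:
    - master   # only receives fast-forwards of already-tested commits
    - tmp      # homu's scratch merge branch (see #56)
```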

Vagrant on Linux failing to load due to permission error with the minion file?

@aneeshusa Do you have any idea why this might be failing when I attempt to vagrant up a new machine and how to fix it?

Copying salt minion config to /etc/salt
Failed to upload a file to the guest VM via SCP due to a permissions
error. This is normally because the SSH user doesn't have permission
to write to the destination location. Alternately, the user running
Vagrant on the host machine may not have permission to read the file.

Source: /home/larsberg/saltfs/.travis/minion
Dest: /etc/salt/minion
[larsberg@thor saltfs]$ 

Move buildmaster from Linode to AWS (tracking issue)

cc @larsbergstrom

Async steps

  • remove Longview states from SaltFS, since they're strictly for Linode instances
  • Select appropriate size AWS instance for new master
    • According to Longview, current master CPU hits 24% of one core, memory stays under 4GB, and disk usage is around 13GB.
    • an m3.medium instance on AWS has 1 core and 4.02GB RAM
  • manually create AWS instance, give it an elastic IP
  • Set up salt master w/ identical versions to old master on the AWS instance
  • turn old master's minion id into servo-master0
  • Follow instructions at https://github.com/servo/servo/wiki/Buildbot-administration#linux to set up the AWS host as servo-master and run a Salt highstate
  • Copy Pillar data and master settings directly to new master with sftp
    • /srv/salt/pillar
    • /srv/pillar/*
    • /etc/salt/master
  • Copy /etc/salt/pki/master directly to new master with sftp
  • Update all Salt minions to say master: build.servo.org in /etc/salt/minion; amend https://github.com/servo/servo/wiki/Buildbot-administration#linux to reflect this change
    • Create servo-master0.servo.org A record pointing to old buildmaster
    • Create servo-master1.servo.org A record pointing to new buildmaster
    • Set all slaves to dual-master as per https://docs.saltstack.com/en/latest/topics/tutorials/multimaster.html#configure-minions -- looks like the processes for managing salt with salt get pretty involved, so it'll be faster to make the change manually this time around
      • servo-mac1
      • servo-macpro1
      • servo-mac2
      • servo-mac3
      • servo-master0
      • servo-master1
      • servo-linux1
      • servo-linux2
      • servo-linux-cross1
      • servo-linux-cross2
  • Test that new master can connect to all minions, such as running a debug highstate
  • Test whether Homu service is running on new master
  • Verify that Buildbot is running on the new master
  • Verify that all requisite ports are permitted by the new master's security group
  • Check all Homu-enabled repos to verify that hooks (https://github.com/servo/servo/settings/hooks) use build.servo.org rather than the Linode master's IP
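Per the multi-master tutorial linked above, the dual-master step on each minion amounts to replacing the single master: entry in /etc/salt/minion with a list, e.g.:

```yaml
# /etc/salt/minion fragment (sketch; hostnames per the checklist above)
master:
  - servo-master0.servo.org
  - servo-master1.servo.org
```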

At 9am PST Wednesday 3/30

  • sftp /home/servo/homu/main.db from old master to new; restart homu
  • In the Cloudflare interface, edit build.servo.org record to point at new master. All webhooks use DNS, so they will not need to be modified.
  • Retry a failed build to verify that everything's working

Workflow for upgrading Salt

Homebrew has packaged Salt 2015.8.1, so I'll put in a PR to update the pinned Salt version soon. However, upgrading Salt requires some extra care, for two reasons:

  1. It's best to restart the service when the package is updated to keep the installed and running versions in sync, but restarting a master or minion service will interrupt an ongoing highstate.
  2. Masters need to be updated before minions, but salt '*' state.highstate cannot enforce ordering - Salt's orchestration can help with this.

I want to make sure there is a concrete plan for this and future Salt version upgrades to prevent breakage. Do you currently have a workflow for updating Salt? If not, I think manual updates are fine for now since there are so few machines: first on the salt master, then on the minions. I'll look into adding some automation to this.
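If we do automate it, Salt's orchestrate runner can encode the master-before-minions ordering. A hypothetical orchestration state (file name, state IDs, and targets are illustrative), run with salt-run state.orchestrate orch.upgrade_salt:

```yaml
# /srv/salt/orch/upgrade_salt.sls (sketch only; service restarts omitted)
upgrade-master:
  salt.function:
    - name: pkg.install
    - tgt: 'servo-master*'
    - arg:
      - salt-master

upgrade-minions:
  salt.function:
    - name: pkg.install
    - tgt: '*'
    - arg:
      - salt-minion
    - require:
      - salt: upgrade-master
```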

Homu-ify glutin

17:29:13 <@mbrubeck> jack or edunham: servo/glutin needs  to be homu-ified

add an android cargo builder

We want some deps checked on Android, but we need a special cargo builder for that. There is already some work done in master.cfg so that specific projects can opt into Android builds (i.e., ANDROID_PROJECTS).

I think good candidates are any of the *-sys crates and their bindings: skia, rust-azure, mozjs, rust-mozjs, libfontconfig, etc. These were recently broken on Android when we modified them for desktop ARM Linux systems. Some basic build checking would have prevented that bustage.

Tweak RAM allotment for Vagrant VMs

The Vagrant machines have a dual usage: they're used to check that the Salt states run successfully, and also to actually try building Servo to make sure the Salted configuration is correct. For the first use case, it's nice to be able to just vagrant up and see that all the machines come up successfully, so right now the Vagrantfile allots 1 GB per VM (4 GB total), which is reasonable on a laptop. However, actually building Servo takes more RAM (at least 3-4 GB, apparently?); this is mitigated by the fact that you're likely to spin up only one VM at a time while investigating Servo build failures.

We should figure out a way to tweak the RAM allotments to support both use cases more easily than just editing the Vagrantfile by hand.

Refs #239
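One low-effort option: let an environment variable override the default, so neither use case requires editing the Vagrantfile by hand. A hypothetical Vagrantfile fragment (the variable name is made up):

```ruby
# Vagrantfile fragment (sketch): default to 1 GB, override when building Servo
config.vm.provider "virtualbox" do |vb|
  vb.memory = Integer(ENV.fetch("SALTFS_VM_MEMORY", "1024"))
end
```

Usage would then be e.g. SALTFS_VM_MEMORY=4096 vagrant up to bring up a build-capable VM.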

Ignoring statuses from the try branch.

I'll submit a PR when I get more work done. This is more to keep @larsbergstrom updated, and for anyone to help out.

Right now I've found something that works as a filter for reporting statuses. (This all goes in master.cfg for Buildbot.)

def conditional_status(branch=None):
    # Buildbot passes None for the default branch; treat it as "master".
    # Report statuses for every branch except "try".
    return (branch or "master") != "try"

old_buildFinished = GitHubStatus.buildFinished
old_buildStarted = GitHubStatus.buildStarted

def new_buildFinished(self, builderName, build, results):
    source_stamp = build.getSourceStamps()[0]
    if conditional_status(source_stamp.branch):
        old_buildFinished(self, builderName, build, results)

def new_buildStarted(self, builderName, build):
    source_stamp = build.getSourceStamps()[0]
    if conditional_status(source_stamp.branch):
        old_buildStarted(self, builderName, build)

GitHubStatus.buildFinished = new_buildFinished
GitHubStatus.buildStarted = new_buildStarted

It's not pretty, but it works. I'm going to try to make this look a little prettier so that it can be used in the config file without making anyone feel gross.

Any improvement on this would be great.
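One way to make it less gross: wrap the filtering in a dedicated status-target subclass instead of monkeypatching. The sketch below stubs out GitHubStatus (and the build object) so it can run standalone; in master.cfg you would subclass the real Buildbot GitHubStatus instead, so treat all names here as assumptions.

```python
class GitHubStatus:
    """Stand-in for Buildbot's GitHubStatus, for this sketch only."""
    def __init__(self):
        self.reported = []

    def buildStarted(self, builderName, build):
        self.reported.append(("started", builderName))

    def buildFinished(self, builderName, build, results):
        self.reported.append(("finished", builderName, results))


class BranchFilteredGitHubStatus(GitHubStatus):
    """Report GitHub statuses for every branch except those listed."""
    skipped_branches = ("try",)

    def _wanted(self, build):
        # Buildbot uses None for the default branch; treat it as master.
        branch = build.getSourceStamps()[0].branch or "master"
        return branch not in self.skipped_branches

    def buildStarted(self, builderName, build):
        if self._wanted(build):
            GitHubStatus.buildStarted(self, builderName, build)

    def buildFinished(self, builderName, build, results):
        if self._wanted(build):
            GitHubStatus.buildFinished(self, builderName, build, results)


class FakeStamp:
    def __init__(self, branch):
        self.branch = branch


class FakeBuild:
    """Minimal stand-in for a Buildbot build object (demo only)."""
    def __init__(self, branch):
        self._branch = branch

    def getSourceStamps(self):
        return [FakeStamp(self._branch)]
```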

Problems with setting up the msttfonts on linux

We moved EC2 builders this past week, and had problems with the MS TT fonts installer again. The relevant install lines appear to be here:
https://github.com/servo/saltfs/blob/master/servo-dependencies.sls#L10-L19

What seems to have happened (from looking at the failing machine) is that the mscorefonts-eula flag was set to false rather than true before the package was installed, so the package showed as installed but none of the fonts were available. I had to reset the EULA flag and then forcibly uninstall/reinstall the fonts.

Any ideas? CC: @metajack @aneeshusa @edunham

Trim down the symlinks to ARM cross-compilation binaries

We use cross-compilation to test ARM32 and ARM64 builds. However, the target triples differ between the installed packages, rustc, and build scripts, so we create a symlink farm in /home/servo/bin that points to the installed binaries to provide the right names. Currently, we create symlinks for almost every binary installed (see the binaries variable in arm.sls), but we don't use all of these, and we should trim down the list.

Steps:

  • Set up Vagrant locally and edit the Vagrantfile to provide enough memory to build Servo (last I checked, 4GB was enough, but it may have increased since then...).
  • Run a test build of Servo for ARM32 and ARM64 to make sure everything is working. To build Servo, you'll want to use the steps for arm32 and arm64 from our steps config. Make sure to set up the appropriate environment variables, as configured in our Buildbot config.
  • Edit the arm.sls file to whittle down the binaries variable; I'd recommend trying to remove one at a time.
  • After each edit, run vagrant provision to update the symlinks (old symlinks should get automatically removed) and try to build Servo for ARM32 and ARM64 inside Vagrant. You'll need to use the servo-linux-cross1 VM.
  • Submit a PR with the minimum set of symlinks needed.
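For reference while whittling, the symlink farm amounts to this mapping (a runnable sketch of the idea; the real list and paths live in arm.sls, and the triples below are illustrative):

```python
import os

def make_links(bindir, installed_prefix, expected_prefix, binaries):
    """Create expected_prefix-<bin> symlinks pointing at installed_prefix-<bin>.

    Mirrors the idea in arm.sls: rustc and build scripts look for one
    target triple while the distro packages install another.
    """
    os.makedirs(bindir, exist_ok=True)
    for name in binaries:
        link = os.path.join(bindir, "%s-%s" % (expected_prefix, name))
        target = "/usr/bin/%s-%s" % (installed_prefix, name)
        if os.path.lexists(link):  # replace stale symlinks on re-provision
            os.remove(link)
        os.symlink(target, link)
    return sorted(os.listdir(bindir))
```

Trimming the list then means shrinking the binaries argument (the binaries variable in arm.sls) and re-running vagrant provision.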

Alternatives for multi-platform CI support in servo/servo

Today, we have a default strategy for each new platform or build configuration in the servo/servo repository - add new buildbot rules and rebalance those rules across a new set of builders we spin up.

That has a couple of issues:

  1. Editing buildbot rules and deploying them is something that really only core CI maintainers can do: it requires root access and the ability to "deal with meltdowns," since any change can trigger cascading failures that someone must be able to recover the whole system from
  2. Each builder we spin up requires additional oversight, maintenance, etc.

One alternative that I'm considering is to add both AppVeyor support (barosl/homu#87) and the ability to gate on multiple CI systems (barosl/homu#100) to homu.

This would mean that for some new platforms (Windows - servo/servo#9406, ARM - https://github.com/mmatyas/servo-nightly/blob/master/.travis.yml, etc.) and also some tests (test-tidy), we could run them on Travis or AppVeyor infrastructure and use the merged buildbot+travis+appveyor results.

The upsides are:

  1. We don't have to maintain those servers.
  2. As seen from the PRs pointed to, contributing to and testing the CI rules is much more easily done by community members.

Downsides:

  1. If any one of the three services goes offline, we're blocked on landing things. Today, we're really only blocked if Amazon, Linode, MacStadium, or GitHub goes down. This would add another one or two services.
  2. It may be difficult to get extremely large instance types (e.g., the c4.4xlarge that we use on EC2) on some of these other services, at least in the first 3-6 months, which could put an upper limit on our build times.
  3. More homu complexity.

Thoughts? CC @Manishearth @metajack @edunham

Set up log removal on buildbot master

The buildbot manual (http://docs.buildbot.net/0.8.1/full.html) recommends rules like:

     @weekly cd BASEDIR && find . -mindepth 2 -ipath './public_html/*' -prune -o -type f -mtime +14 -exec rm {} \;
     @weekly cd BASEDIR && find twistd.log* -mtime +14 -exec rm {} \;

I couldn't quite make this work (the -ipath expression to exclude public_html kept giving me errors). I also think we should keep 5 days instead of 2 weeks, as we don't have that much disk space :-)

We ran out of disk again today, though luckily it was during normal working hours, so nobody had to do a late-night recovery.
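Along those lines, a crontab sketch with the 5-day retention (untested; BASEDIR left as a placeholder as in the manual):

```shell
# sketch: weekly cleanup, keeping 5 days of build and twistd logs
@weekly cd BASEDIR && find . -mindepth 2 -ipath './public_html/*' -prune -o -type f -mtime +5 -exec rm {} \;
@weekly cd BASEDIR && find twistd.log* -mtime +5 -exec rm {} \;
```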

CC @edunham

Sync GitHub Servo organization groups with saltfs reviewers?

Right now, the GitHub Servo org has a somewhat ad-hoc list of groups and group membership that is overlapping but not entirely in sync with the saltfs reviewers.

In particular, we have an admins group from "the old days" when admins existed plus a developers group which has now-interchangeable permissions, but was separate because admins used to be owners back in the day.

There are also owners, but only a few of those, and it's a separate "bit" from group membership.

Finally, there's the servo:cargo-publish group, which should possibly be the same as these others?

cc @edunham @metajack

Automatically apply updates regularly

Right now, we manually apply updates on the linux boxes (the macs are set to auto-update). We should add a cron job or similar to auto-update the linux boxes.
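A minimal sketch of what that could look like as a Salt state (the state ID and schedule are made up; cron.present and its special argument are documented Salt features):

```yaml
# sketch: weekly unattended upgrade on the Linux minions
auto-apt-upgrade:
  cron.present:
    - name: apt-get update && apt-get -y upgrade
    - user: root
    - special: '@weekly'
```

Alternatively, installing Ubuntu's unattended-upgrades package would get us security updates without a custom cron job.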

Need automation for prebuilding Rust compiler

In order to do a Servo release, we prebuild the Rust compiler. This is currently done on @larsbergstrom's laptop.
Some relatively up-to-date docs on the prebuild process are at https://github.com/servo/servo/wiki/Updating-the-Rust-compiler-used-by-Servo

The machine which successfully builds the compiler is Ubuntu 14.04 LTS with installed software as described by the dpkg list at https://pastebin.mozilla.org/8835186

A better solution will be to have Salt configs describing a machine capable of building the compiler, which could be applied to a VM on one's laptop or to an EC2 instance.

Consider moving to reserved EC2 instances?

cc @edunham @metajack

We're finding that, particularly after any network hiccups, buildbot is really bad about spinning up a new EC2 latent instance. I also can't figure out how to connect one manually: if you do the obvious thing of spinning one up and starting buildbot on it, you get an error because the master is not in the process of starting it up (the only time latent slaves can connect). And the master doesn't seem to spin up new ones unless you restart it, which is really hard to time between homu runs without breaking other things.

Is there some command I'm missing here that I could be using instead?

Or, should we consider moving to EC2 reserved instances? Even using a smaller instance type would help, as we're basically running with only one EC2 instance anyway most of the time.

We didn't see this as much before because we always had my linode instance to "pick up the slack".

buildbot-master needs python3

Homu only works on a python3 toolchain. Our salt configuration does not autoinstall this.

Homu worked so far because it somehow had a python3 virtualenv with everything it needed, without a global install of any of the python3 tools.

Specifically, we need:

sudo apt-get install python3.4 python3-pip
pip3 install virtualenv

We can do the former with a - require: - pkg: key in homu.sls somewhere.

I'm not sure how to indicate a pip3-specific dependency, though. Perhaps something with bin_env?

We also need to make sure that the pip/python/virtualenv setup does not make 3.4 the default for the pip/python/virtualenv binaries. Installing them as shown above seemed to preserve the default, but I'm not sure if that's something we can rely on.
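Putting the pieces together, a hypothetical homu.sls fragment (state IDs are illustrative; bin_env pointed at pip3 is Salt's documented way to pin which pip is used, which also leaves the system-default python/pip untouched):

```yaml
python3-toolchain:
  pkg.installed:
    - pkgs:
      - python3.4
      - python3-pip

python3-virtualenv:
  pip.installed:
    - name: virtualenv
    - bin_env: /usr/bin/pip3
    - require:
      - pkg: python3-toolchain
```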

cc @edunham @larsbergstrom

Killing servo processes doesn't always work

From http://build.servo.org/builders/linux-rel/builds/367:

pkill -x servo
 in dir /home/servo/buildbot/slave/linux-rel/build (timeout 1200 secs)
 watching logfiles {}
 argv: ['pkill', '-x', 'servo']
 environment:
  HOME=/home/servo
  PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
  PWD=/home/servo/buildbot/slave/linux-rel/build
  TERM=linux
  UPSTART_INSTANCE=
  UPSTART_JOB=buildbot-slave
 using PTY: False

command timed out: 1200 seconds without output running ['pkill', '-x', 'servo'], attempting to kill
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1

remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Failure: exceptions.RuntimeError: SIGKILL failed to kill process
exceptions.RuntimeError: SIGKILL failed to kill process
]

License

This repo has a fair amount of code in it; it would be a good idea to license it properly.

Help with a build issue on Mac OS X

I searched the issue list, and a similar report was suggested to be an autoconf linking problem before being closed. I tried to follow the suggestion there (servo/servo#9504) but did not succeed. I don't know if it is the same problem or not. Help would be appreciated.

cd /Users/xli/GoogleDrive/[email protected]/temp/git/servo/target/release/build/harfbuzz-sys-14497f51d6b45da3/out && make -j8
/Applications/Xcode.app/Contents/Developer/usr/bin/make all-recursive
Making all in src
/Applications/Xcode.app/Contents/Developer/usr/bin/make all-recursive
Making all in hb-ucdn
CC libhb_ucdn_la-ucdn.lo
CCLD libhb-ucdn.la
CXX libharfbuzz_la-hb-blob.lo
CXX libharfbuzz_la-hb-buffer-serialize.lo
CXX libharfbuzz_la-hb-buffer.lo
CXX libharfbuzz_la-hb-common.lo
CXX libharfbuzz_la-hb-face.lo
CXX libharfbuzz_la-hb-font.lo
CXX libharfbuzz_la-hb-ot-tag.lo
CXX libharfbuzz_la-hb-set.lo
CXX libharfbuzz_la-hb-shape.lo
CXX libharfbuzz_la-hb-shape-plan.lo
CXX libharfbuzz_la-hb-shaper.lo
CXX libharfbuzz_la-hb-unicode.lo
CXX libharfbuzz_la-hb-warning.lo
CXX libharfbuzz_la-hb-ot-font.lo
CXX libharfbuzz_la-hb-ot-layout.lo
CXX libharfbuzz_la-hb-ot-map.lo
CXX libharfbuzz_la-hb-ot-shape.lo
CXX libharfbuzz_la-hb-ot-shape-complex-arabic.lo
CXX libharfbuzz_la-hb-ot-shape-complex-default.lo
CXX libharfbuzz_la-hb-ot-shape-complex-hangul.lo
CXX libharfbuzz_la-hb-ot-shape-complex-hebrew.lo
CXX libharfbuzz_la-hb-ot-shape-complex-indic.lo
CXX libharfbuzz_la-hb-ot-shape-complex-indic-table.lo
CXX libharfbuzz_la-hb-ot-shape-complex-myanmar.lo
CXX libharfbuzz_la-hb-ot-shape-complex-thai.lo
CXX libharfbuzz_la-hb-ot-shape-complex-tibetan.lo
CXX libharfbuzz_la-hb-ot-shape-complex-use.lo
CXX libharfbuzz_la-hb-ot-shape-complex-use-table.lo
CXX libharfbuzz_la-hb-ot-shape-normalize.lo
CXX libharfbuzz_la-hb-ot-shape-fallback.lo
CXX libharfbuzz_la-hb-fallback-shape.lo
CXX libharfbuzz_la-hb-ucdn.lo
CXX main-main.o
CXX test-test.o
CXX test_buffer_serialize-test-buffer-serialize.o
CXX test_size_params-test-size-params.o
CXX test_would_substitute-test-would-substitute.o
GEN harfbuzz.pc

--- stderr
configure: WARNING:
You will not be able to create source packages with 'make dist'
because gtk-doc >= 1.15 is not found.
make[1]: warning: -jN forced in submake: disabling jobserver mode.
sed: 1: "s@%prefix%@/Users/xli/G ...": bad flag in substitute command: 'm'
/bin/sh: harfbuzz.pc: command not found
make[5]: *** [harfbuzz.pc] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [all-recursive] Error 1
make[3]: *** [all] Error 2
make[2]: *** [all-recursive] Error 1
make[1]: *** [all] Error 2
make: *** [all] Error 2
thread '<main>' panicked at 'assertion failed: Command::new("make").args(&["-R", "-f", "makefile.cargo", &format!("-j{}", env::var("NUM_JOBS").unwrap())]).status().unwrap().success()', /Users/xli/GoogleDrive/[email protected]/temp/git/servo/.cargo/registry/src/github.com-88ac128001ac3a9a/harfbuzz-sys-0.1.2/build.rs:14
note: Run with RUST_BACKTRACE=1 for a backtrace.

[Warning] Could not generate notification! Optional Python module 'pyobjc' is not installed.
Build completed in 122.83s

Wiki/Docs for contributing

The current README says to check the wiki for information on tutorials and contributing, but it looks like the wiki has been disabled. I have a few things I'd like to write up (such as how to use Vagrant, how to invoke Salt, etc.) so can the wiki be re-enabled? Alternately, I can start a CONTRIBUTING file, but I'm worried it will get longer and harder to navigate over time.

Investigations of failures to provision the ARM cross builders

As noted in #274, we are failing to install the Ubuntu packages 'g++-arm-linux-gnueabihf' and 'g++-aarch64-linux-gnu' in GCE.

Just now, I did an apt-get upgrade on the cross builders (as I noticed that they had failed to autoupdate) and got:

root@ip-172-31-34-111:~# apt-get upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
You might want to run 'apt-get -f install' to correct these.
The following packages have unmet dependencies:
 libstdc++-4.8-dev-armhf-cross : Depends: libstdc++6-armhf-cross (>= 4.8.4-2ubuntu1~14.04.1cross0.11.1) but 4.8.2-16ubuntu4cross0.11 is installed
 libstdc++6-armhf-cross : Depends: gcc-4.8-arm-linux-gnueabihf-base (= 4.8.2-16ubuntu4cross0.11) but 4.8.4-2ubuntu1~14.04.1cross0.11.1 is installed
E: Unmet dependencies. Try using -f.
root@ip-172-31-34-111:~# 
