coreos / ignition

First boot installer and configuration tool

Home Page: https://coreos.github.io/ignition/

License: Apache License 2.0

Languages: Go 96.68%, C 1.99%, Shell 1.16%, Makefile 0.15%, Dockerfile 0.03%


Ignition

Ignition is the utility used by Fedora CoreOS and RHEL CoreOS to manipulate disks during the initramfs. This includes partitioning disks, formatting partitions, writing files (regular files, systemd units, etc.), and configuring users. On first boot, Ignition reads its configuration from a source of truth (remote URL, network metadata service, hypervisor bridge, etc.) and applies the configuration.

Usage

Odds are good that you don't want to invoke Ignition directly. In fact, it isn't even present in the root filesystem. Take a look at the Getting Started Guide for details on providing Ignition with a runtime configuration.
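For a rough sense of what a configuration looks like, here is a minimal sketch that writes a single file (2.x-spec fields as used in the issues below; the path and contents are purely illustrative, and the spec version you need depends on your distribution):

{
  "ignition": { "version": "2.2.0" },
  "storage": {
    "files": [{
      "filesystem": "root",
      "path": "/etc/example",
      "mode": 420,
      "contents": { "source": "data:,example%20file%0A" }
    }]
  }
}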

Contact

Contributing

See CONTRIBUTING for details on submitting patches and the contribution workflow.

To help triage or fix bugs, see the current Ignition issues.

Config Validation

To validate a config for Ignition, binaries of a CLI tool called ignition-validate are available on the releases page. There is also an ignition-validate container: quay.io/coreos/ignition-validate.

Example:

# This example uses podman, but docker can be used too
podman run --pull=always --rm -i quay.io/coreos/ignition-validate:release - < myconfig.ign
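# The standalone ignition-validate binary from the releases page can
# presumably be pointed at a local config file directly, e.g.:
ignition-validate myconfig.ign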


ignition's Issues

Ignition config translation does not set `wipeFilesystem`

Bug

Operating System Version

Any

Ignition Version

Spec versions 1.x and 2.0.0

Environment

Any

Expected Behavior

Ignition configs translated from 2.0.0 (or below) to 2.1.0 should set wipeFilesystem = true. Config versions 2.0.0 (and below) always wipe the filesystem.

Actual Behavior

Ignition configs translated from 2.0.0 (or below) to 2.1.0 have wipeFilesystem set to its default value of false.
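For illustration only (a sketch, not verified translator output), the translated 2.1.0 filesystem entry should carry the flag explicitly instead of relying on the default, roughly:

{
  "name": "root",
  "mount": {
    "device": "/dev/disk/by-label/ROOT",
    "format": "ext4",
    "wipeFilesystem": true
  }
}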

Support conditional user creation

Splitting this out of discussion here:

coreos/afterburn#90 (comment)

Today CL precreates a core user. Not all distributions will want to do that; I personally prefer the cloud-init semantics, where a default user is created dynamically if and only if you do not specify a different username.

I suspect that for most people who define a different user with Ignition than core for CL, the core user is unused. And it could be a source of conflict if the machine is joined to an LDAP domain or similar that happens to have a core user.

Default files.overwrite to false

Feature Request

Environment

Any

Desired Feature

files.overwrite defaults to true, unlike directories.overwrite and links.overwrite. Default files.overwrite to false for spec 3.0.
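With that default, a spec 3.0 config that intends to replace an existing file would have to opt in explicitly. A hypothetical sketch of such a file entry (3.0-style fields; path and contents are illustrative):

{
  "path": "/etc/motd",
  "overwrite": true,
  "contents": { "source": "data:,replaced%0A" }
}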

Other Information

Ignition files stage is not fully declarative and behavior is unspecified

Feature Request and a Bug Report in 1

2 for 1! what a deal!

Environment

All

Desired Feature

Note: I will use "node" to refer to any "filelike" object (e.g. normal file, symlink, directory).

Background

Currently Ignition has no defined behavior for how to handle duplicate entries in the files/dirs/links sections. Ignition currently just blindly writes each entry in order. Appended configs append to the end of each list. Furthermore, there's nothing that specifies what order the files/dirs/links substages should be run in.

Details of Ignition's current behavior as implemented but not documented:

  • Ignition always follows symlinks when navigating the given path, but not for the terminal node. The exception to this is appending to files, which appends to the link target. Appending to a symlink is an error.
  • Ignition filesystem entries are executed in the order: directories first, then files, then links.
  • Directories are sorted before creation such that /a will be created before /a/b. The sorting is not stable, so appended configs could actually be overwritten by the base config.
  • Any leading directories needed for a node are created with owner/group root and mode 0755, except for directories, where leading directories are created with the same permissions as the specified directory. (This should be documented.)
  • Duplicate paths are allowed; one entry may overwrite another.

Proposal (2.x spec)

  1. Ignition's strict backwards-compat requirements mean we can't change the dirs/files/links order. If a current user is overriding a file created in the base config with a symlink, swapping the creation order could cause Ignition to fail or produce a different result. Neither of these is acceptable.

  2. This ordering should become part of the spec.

  3. The current implementation for ordering of files/dirs/links should become part of the spec. This is useful for overriding things in base configs and allows appended configs to override things in the config being appended to. This also has the advantage of already being implemented and thus has no backwards-compat risk with existing configs.

Other misc. thoughts for the 3.0 spec

Configs should disallow duplicate entries

Ideally any one self-contained Ignition config (i.e. one without replace/append) should not have duplicate paths for the same filesystem, and Ignition should error out if that is the case. Ignition specs are declarative, and thus it should be an error to specify that some path should be both a directory and a file. Obviously appended configs make this more complicated (more on that later). The other tricky case is appending to files. We need to allow duplicate appends, but should check that it's not trying to append to a directory.

Ordering of links/dirs/files

Assuming there are no duplicate entries, there's no reason files should not be last. Ordering between directories and links is more complicated, however. Links can point to directories. It's possible a user might want to have something like a file /foo/bar/baz where foo is a directory existing on the system, bar is a symlink to another directory, and baz is another directory they want to create. We can either say all paths for directories should not require resolving links, or merge the dirs/files/links sections and use the user-supplied ordering there. There are other options to explore here. One big constraint is that we need to be able to translate 2.x configs to a semantically identical 3.x config, which gets tricky if the new spec is unordered. If we go from an unordered config to an ordered, deduped config, the translator needs to be able to determine what Ignition 2.x would have done and generate a 3.x config that does the same. This is extraordinarily tricky.

Merging configs

Any merged config ought to have no duplicates. When two configs are merged, any duplicate entries ought to be resolved by picking the newer one (excluding files with append=true).
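As a hypothetical illustration: if a base config and an appended config both define /etc/foo as below, the proposal is that the merged result keeps only the newer (appended) entry.

Base config entry:

{ "filesystem": "root", "path": "/etc/foo", "contents": { "source": "data:,old%0A" } }

Appended config entry:

{ "filesystem": "root", "path": "/etc/foo", "contents": { "source": "data:,new%0A" } }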

document storage better

disks

  • is partition number 1 or 0 indexed?
  • size documentation confusing
    • how do you say 'use the remaining available space'?
    • what meaning does + have?
    • it's documented as an integer but some docs use a string in the json

filesystems

  • create has no documentation

Add AliCloud support for ignition

Issue by @onionf91


Issue Report

Feature Request

Environment

AliCloud

Desired Feature

I'd like to host a CoreOS VM on AliCloud, but Ignition doesn't support AliCloud. If possible, could you please add AliCloud support to Ignition?

Other Information

Ignition appears to swallow S3 errors

Issue by @colhom


Issue Report

Bug

Container Linux Version

stable @ 1520.8.0
ignition config v2.1.0

Environment

AWS (us-west-2, m3.medium, t2.medium)

Expected Behavior

Ignition fetches the S3 bucket object content and uses it as a config.

  • The S3 object exists and is accessible to the IAM profile of the EC2 instance running Ignition
  • Tried with and without a verification hash

Actual Behavior

ignition-disks fails to fetch the config contents from s3, with the following output:

[    7.007728] systemd-networkd[234]: eth0: Gained IPv6LL
(1 of 2) A start job is running for…-mapper-usr.device (8s / no limit)
(1 of 2) A start job is running for…-mapper-usr.device (9s / no limit)
(1 of 2) A start job is running for…-mapper-usr.device (9s / no limit)
(2 of 2) A start job is running for Ignition (disks) (10s / no limit)
(2 of 2) A start job is running for Ignition (disks) (10s / no limit)
... 
(2 of 2) A start job is running for…ition (disks) (1min 1s / no limit)
(2 of 2) A start job is running for…ition (disks) (1min 1s / no limit)
(2 of 2) A start job is running for…ition (disks) (1min 2s / no limit)
(1 of 2) A start job is running for…er-usr.device (1min 2s / no limit)
(1 of 2) A start job is running for…er-usr.device (1min 3s / no limit)
[   66.496720] ignition[387]: failed to fetch config: RequestCanceled: request context canceled
caused by: context deadline exceeded
[   66.501156] ignition[387]: failed to acquire config: RequestCanceled: request context canceled
caused by: context deadline exceeded
Failed to start Ignition (disks).
See 'systemctl status ignition-disks.service' for details.
Dependency failed for Ignition (files).
Dependency failed for Ignition (record completion).
Dependency failed for Initrd Default Target.

Naive analysis indicates it's hitting the context deadline either when fetching the IAM credentials or the s3 object contents.

Reproduction Steps

Host the Ignition config as an S3 bucket object:

{
    "ignition": {
      "config": {
        "append": [
          {
            "source": "s3://<config-s3-bucket>/ign/registry-cache-masters.json",
            "verification": {}
          },
          {
            "source": "s3://<config-s3-bucket>/ign/custom-cacerts-masters.json",
            "verification": {}
          }
        ]
      },
      "timeouts": {},
      "version": "2.1.0"
    }
}

/cc @dgonyeo

Ignition does not support nested RAID

Issue by @ajeddeloh


Issue Report

Bug

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1506.0.0+2017-08-28-1705
VERSION_ID=1506.0.0
BUILD_ID=2017-08-28-1705
PRETTY_NAME="Container Linux by CoreOS 1506.0.0+2017-08-28-1705 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Any

Expected Behavior

Ignition can create nested raids

Actual Behavior

Ignition fails to create nested raids.

Reproduction Steps

Use this CLC to create an instance.

storage:
  disks:
    - device: /dev/vdb
      wipe_table: true
      partitions:
       - label: root11
         number: 1
         size: 256MiB
         start: 0
         type_guid: be9067b9-ea49-4f15-b4f6-f36f8c9e1818
       - label: root21
         number: 2
         size: 256MiB
         start: 0
         type_guid: be9067b9-ea49-4f15-b4f6-f36f8c9e1818
       - label: root12
         number: 3
         size: 256MiB
         start: 0
         type_guid: be9067b9-ea49-4f15-b4f6-f36f8c9e1818
       - label: root22
         number: 4
         size: 256MiB
         start: 0
         type_guid: be9067b9-ea49-4f15-b4f6-f36f8c9e1818
  raid:
    - name: "inner1"
      level: "raid1"
      devices:
        - "/dev/vdb1"
        - "/dev/vdb2"
    - name: "inner2"
      level: "raid1"
      devices:
        - "/dev/vdb3"
        - "/dev/vdb4"
    - name: "outer"
      level: "raid0"
      devices:
        - "/dev/md/inner1"
        - "/dev/md/inner2"
  filesystems:
    - name: "ROOT"
      mount:
        device: "/dev/md/outer"
        format: "ext4"
        create:
          options:
            - "-L"
            - "ROOT"
    - name: "NOT_ROOT"
      mount:
        device: "/dev/vda9"
        format: "ext4"
        create:
          options:
            - "-L"
            - "WASTELAND"

Other Information

createRaids creates a list of devices it needs to make the arrays and waits for those devices before it creates the arrays. In the case of nested raid this includes devices it has yet to create.

Add ability for Ignition to report provisioning failures

Issue by @dgonyeo


Issue Report

Feature Request

Environment

All

Desired Feature

Some clouds (e.g. packet.net) provide the ability for a machine to report its state. When Ignition fails, it should report this state to the cloud it's running on.

Other Information

internal/providers provides cloud-specific logic. An additional function could be added there that gets called when Ignition fails (pretty much identically to how the NewFetcher function there gets used). Packet seems like a good first choice for a cloud to do this on.

#585 is related, but that suggests reporting to a user-provided URL whereas this is reporting to the APIs for the given cloud environment in which Ignition is running.

Ignition spec 2.x partitioning is not translatable to tools other than sgdisk

Bug

Operating System Version

Any

Ignition Version

Spec version 2.x (maybe 1.x as well?)

Environment

Any

Overview of Partitioning Today:

The Ignition spec allows specifying 0 for the "start" or "size" of a partition. This is passed through to sgdisk, where it refers to the starting sector of the largest available block and the ending sector of the largest available block, respectively. This allows easy creation of a "last partition" filling the rest of the disk (start and size both 0) and creating a bunch of new partitions on the same disk (start 0 and defined size), but is disastrous in all other situations. For one, size 0 always means "try to make the partition end at the end of the largest block" regardless of what start is. This is unintuitive and not useful. (A more useful version would be to make the partition as big as possible with the specified start.)
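For illustration, here is a sketch (2.x-style fields; the device and sizes are hypothetical) of the common "fill the rest of the disk" pattern that relies on these sgdisk semantics:

{
  "ignition": { "version": "2.1.0" },
  "storage": {
    "disks": [{
      "device": "/dev/sda",
      "partitions": [
        { "number": 1, "label": "DATA", "start": 0, "size": 2097152 },
        { "number": 2, "label": "SCRATCH", "start": 0, "size": 0 }
      ]
    }]
  }
}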

We want to change the partitioning support with 3.0 to break from these semantics and either eliminate them (as an attractive nuisance) or replace them with something more intuitive (like have size 0 mean "as big as possible for the given starting sector"). Ignition internally translates old configs to the latest.

The problem is that the meaning of "0" in old configs is dependent on the contents of the disk it's defined for; it has no concrete meaning by itself without the context of the contents of the disk. If the new config fixes the quirks of the old version, there's no way to translate it to the new version since nothing with the same semantics exists in the new version and we don't know the contents of the disk it will be run on.

This leaves four awful options:

  1. Keep the sgdisk semantics forever in the spec
  2. Change the spec. Have some future Ignition fail if it encounters 0's for start/size in a 2.x spec since it can't translate it to a new spec version with the same meaning.
  3. Change the spec. Have some mechanism for detecting this config uses "tainted" start/size 0 and carry twice the partitioning logic in the disks stage.
  4. Change the spec. Have the translator make a "best effort" translation.

Add Ignition options to create partition or RAID volume only if missing

Issue by @travisgroth


Issue Report

Bug

Container Linux Version

# cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1409.7.0
VERSION_ID=1409.7.0
BUILD_ID=2017-07-19-0005
PRETTY_NAME="Container Linux by CoreOS 1409.7.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

VMware ESXi / PXE Boot

Expected Behavior

When using a storage layout which does not wipe partition tables or filesystems at boot time, the system functions normally. Example CT snippet:

storage:
  disks:
    - device: /dev/sda
      wipe_table: false
      partitions:
        - label: ETCD
          number: 1
          size: 10GiB
        - label: DOCKER
          number: 2
          size: 0

  filesystems:
    - name: etcd
      mount:
        device: /dev/disk/by-partlabel/ETCD
        format: ext4
        create:
          force: false

    - name: docker
      mount:
        device: /dev/disk/by-partlabel/DOCKER
        format: ext4
        create:
          force: true

Actual Behavior

If booted with the provided snippet, the system will hang for a long period of time and eventually reboot. I have yet to catch the screen at reboot time. As soon as I change 'force' and 'wipe_table' to 'true', the system boots as expected, but it obviously wipes things I don't want to be wiped. Either one of those two settings alone will trigger this behavior: I've tested with only wipe_table as false and with only a single filesystem as create=false, and I get the same issue. Leaving out the options has a similar effect.

As one might imagine, I'm trying to carve off persistent space for etcd to live on a master while doing a fresh format on the docker volume to keep things clean.

Reproduction Steps

  1. Create a CLC config specifying a device without wipe_table set to true, or a CLC config specifying a filesystem without 'force' set to true
  2. Transpile into an Ignition config (tested with 0.4.2, which transpiles into a 2.0.0 Ignition config)
  3. Boot the system. It will hang approximately after the RNG initializes

Other Information

Ignition appears to ignore proxy environment variables for http requests

Issue by @thomasmortensson


Issue Report

Bug

Ignition appears to ignore proxy environment variables for http requests.

Container Linux Version

NAME="VMware Photon"
VERSION="1.0"
ID=photon
VERSION_ID=1.0
PRETTY_NAME="VMware Photon/Linux"
ANSI_COLOR="1;34"
HOME_URL="https://vmware.github.io/photon/"
BUG_REPORT_URL="https://github.com/vmware/photon/issues"

Environment

kubernetes-anywhere deployment
https://github.com/kubernetes/kubernetes-anywhere/blob/master/phase1/vsphere/README.md

Using vSphere as the deployment mechanism.

ignition:
docker.io/cnastorage/k8s-ignition:v2

Expected Behavior

Expected to be able to download files when proxy environment variables are set.

Actual Behavior

null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #1
null_resource.kubernetes-master: Still creating... (2m20s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #2
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #3
null_resource.kubernetes-master: Still creating... (2m30s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #4
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #5
null_resource.kubernetes-master: Still creating... (2m40s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #6
null_resource.kubernetes-master: Still creating... (2m50s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #7
null_resource.kubernetes-master: Still creating... (3m0s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #8
null_resource.kubernetes-master: Still creating... (3m10s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #9
null_resource.kubernetes-master: Still creating... (3m20s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #10
null_resource.kubernetes-master: Still creating... (3m30s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #11
null_resource.kubernetes-master: Still creating... (3m40s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #12
null_resource.kubernetes-master: Still creating... (3m50s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #13
null_resource.kubernetes-master: Still creating... (4m0s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #14
null_resource.kubernetes-master: Still creating... (4m10s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: attempt #15
null_resource.kubernetes-master: Still creating... (4m20s elapsed)
null_resource.kubernetes-master (remote-exec): DEBUG    : files: createFilesystemsFiles: createFiles: GET error: Get https://storage.googleapis.com/kubernetes-release/release/v1.6.5/bin/linux/amd64/kubectl: dial tcp: lookup storage.googleapis.com on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: i/o timeout
null_resource.kubernetes-master (remote-exec): CRITICAL : files: createFilesystemsFiles: createFiles: Error fetching file "/usr/local/bin/kubectl": unable to fetch resource (no more attempts available)
null_resource.kubernetes-master (remote-exec): CRITICAL : files: failed to create files: failed to create files: failed to resolve file "/usr/local/bin/kubectl"
null_resource.kubernetes-master (remote-exec): Failed to docker run installer container
Error applying plan:

1 error(s) occurred:

* null_resource.kubernetes-master: 1 error(s) occurred:

* Script exited with non-zero exit status: 1

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
Makefile:70: recipe for target 'do' failed
make[1]: *** [do] Error 1
make[1]: Leaving directory '/opt/kubernetes-anywhere'
Makefile:48: recipe for target 'deploy-cluster' failed
make: *** [deploy-cluster] Error 2

Reproduction Steps

  1. Use kubernetes-anywhere behind a proxy server (i.e. deploy something using Ignition behind a proxy server; requests will fail)

Other Information

I do not believe this to be a bug with kubernetes-anywhere but actually with the underlying ignition code.

I have tried the following within my environment.

  • Set http_proxy, https_proxy, HTTP_PROXY and HTTPS_PROXY in /etc/sysconfig/proxy
  • Set proxy env vars via /etc/environment
  • Set proxy info in ENV section of DockerFile
  • Set proxy info in docker service envvars for startup. - I can verify that docker can do a docker pull
  • verified I can wget this url with environment variables

I am not the only one this affects:
kubernetes-retired/kubernetes-anywhere#313

It looks like ignition doesn't make use of ProxyFromEnvironment

// NewHttpClient creates a new client with the given logger and timeouts.

Basically, anyone behind a corporate proxy CANNOT use Ignition; I would imagine this is a large percentage of business users.

network: interface goes down

Initially, the interface looks good and works fine:

3: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:48:1c:b9:9b:a0 brd ff:ff:ff:ff:ff:ff
    inet 10.187.10.10/24 brd 10.187.10.255 scope global ens1f0
       valid_lft forever preferred_lft forever
    inet6 fe80::a248:1cff:feb9:9ba0/64 scope link 
       valid_lft forever preferred_lft forever

However, after some period of time or after scheduled updates, the interface shows DOWN. Importantly, the switch it is connected to shows UP.

core@td-np10 ~ $ ip link show dev ens1f0
3: ens1f0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether a0:48:1c:b9:9b:a0 brd ff:ff:ff:ff:ff:ff
core@td-np10 ~ $ 
core@td-np10 ~ $ sudo ip link set dev ens1f0 up
RTNETLINK answers: N

Please assist.

Revisit SELinux labeling approach

So, we had a large discussion in both #569 and coreos/bugs#2417 about the best way to tackle SELinux labeling. To summarize, the two options were:

(a) create an ignition-relabel.service unit that runs restorecon on all the created files
(b) load the policy from the initramfs

In the end, we went with (a). I'd like us to now reconsider option (b). The reason is that there are some pitfalls with (a):

  1. somewhat goes against the Ignition philosophy of doing everything pre-pivot

All mutations specified in the Ignition config happen pre-pivot. Dumping a relabeling service to "complete the job" post-pivot goes against that.

  2. (most importantly) running relabeling operations on a booting system is ill-defined/racy

This is also mentioned in the Fedora devel thread I had linked in the PR: basically there is no reliable way to eliminate race conditions with other systemd services that may access mislabelled files. One example of this is systemd-sysctl.service, which runs earlier than ignition-relabel.service currently, and thus can trip on drop-in files in /etc/sysctl.d/. In fact, right now systemd-analyze plot shows there are more than a dozen services that start before ignition-relabel.service.

(Un)fortunately AFAIK systemd has no concept of "run this unit first"; i.e. if multiple units have the same Before= specification, it's still undefined when each will run; they're normally run in parallel. Of course, you could start playing the game of tacking on specific troublesome services in your Before= but that's hacky and brittle.

In contrast, the pitfall of (b) is that:

  1. it's heavy-handed and might lead to systemd issues

Loading the policy in the initramfs seems like a big change. The key piece of software this directly affects is systemd. Normally, post-pivot systemd is the one that loads the policy first. However, systemd is designed to deal with pre-loaded policies from the initramfs. First, it always tries to load the policy, regardless of whether a policy is already loaded. Even if it fails (e.g. somehow the policy prevents systemd from loading the policy, which it currently does not, but let's say), systemd will just keep going. Which is fine; we already have the correct policy loaded.

Note that even with the policy being loaded in the initramfs, it would still be best to run a relabeling pass (or to use setfscreatecon at file creation time). Files created by utilities Ignition runs through chroot will likely be labeled properly. Other files Ignition directly creates might be labeled correctly through the default algorithm (e.g. file transition rules, or inheritance from the parent dir), but they might not be. So most of the current "file tracking" code used for relabeling would still be needed.

My proposal is to add a new experimental compile time switch which loads the policy in Ignition and performs relabeling as part of the initramfs. We can then enable this switch in FCOS and/or RHCOS and see how it works. If there are major roadblocks, we just drop the experimental switch. If not, we stabilize on the new behaviour for the selinuxRelabel knob and drop the ignition-relabel.service path.

Thoughts? Are there other pitfalls with (b) (or major advantages of (a))?

Expand support for Ignition files

Issue by @crawford


/cc @colhom

Currently, Ignition only supports including file contents inline within the config. This works fine for small text files but isn't really feasible for binary files, large files, or even a large number of small files (some cloud-providers limit the size of the userdata). I wanted to start a discussion about how support for files should evolve.

Here are my rough notes:

  • File Contents
    • Format
      • Plain text
      • Compressed
      • Alternate encoding (e.g. base64)?
    • Source
      • Inline
      • Reference
        • URI
          • Local resource?
          • Network resource
        • Verification
          • Hash
            • MD5?
            • SHA1?
            • SHA256
            • SHA512
          • PGP
            • Detached signature
            • Inline signature
            • DNS?
            • Key location
              • UEFI variable
              • Inline
          • HTTPS?

dbus not present in initramfs

disks: createFilesystems: op(1): [failed]   waiting for devices [/dev/disk/by-label/ROOT]: dial unix /var/run/dbus/system_bus_socket: no such file or directory

We need to either include the dbus-daemon in the initramfs or use a different wait mechanism in Ignition.

Ignore cloud-configs

Ignition needs to ignore cloud-configs. Likewise, coreos-cloudinit needs to ignore ignition configs.

Ignition should have reasonable default file permissions

Issue by @ajeddeloh


Issue Report

Feature Request

Environment

Any

Desired Feature

Ignition should create files with sane permissions (644), not 000, when mode is not specified.
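Until such a default exists, the workaround is to set the mode explicitly on every file. A sketch of a 2.x-style file entry (path and contents are illustrative) using mode 420 (octal 0644):

{
  "filesystem": "root",
  "path": "/etc/example.conf",
  "mode": 420,
  "contents": { "source": "data:,example%0A" }
}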

Other Information

This is a breaking change. Suddenly making files more permissive is a security concern.

Kola AWS test failure: nfs unit not enabled

Bug

Operating System Version

Red Hat CoreOS 4.0.4760

Ignition Version

Ignition 0.27.0

Environment

AWS (ami: ami-0fc9356f07902b569)

Expected Behavior

Active var-mnt.mount in rhcos.linux.nfs.v4 test: https://github.com/coreos/mantle/blob/master/kola/tests/misc/nfs.go#L121

Actual Behavior

--- FAIL: rhcos.linux.nfs.v4 (167.07s)
        nfs.go:76: NFS server booted.
        nfs.go:81: Test file "/tmp/tmp.G04yeAUFem" created on server.
        nfs.go:118: NFS client booted.
        nfs.go:131: var-mnt.mount status is "unknown": Process exited with status 3
systemctl status var-mnt.mount
โ— var-mnt.mount - NFS Client
   Loaded: loaded (/etc/systemd/system/var-mnt.mount; disabled; vendor preset: enabled)
   Active: inactive (dead)
    Where: /var/mnt
     What: 172.31.18.42:/

Other Information

Raising issue here as suggested by @arithx. Original mantle issue: coreos/mantle#905

The detailed logs can be found here: https://continuous-infra-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/rhcos/job/jerzhang-test-coreos-rhcos-aws-test-and-launch/6/ , under build artifacts.

`ignition --help` returns non-zero return code

Bug

[root@vanilla-f28 artifacts]# ignition --help
Usage of ignition:
  -clear-cache
        clear any cached config
  -config-cache string
        where to cache the config (default "/run/ignition.json")
  -fetch-timeout duration
        initial duration for which to wait for config (default 1m0s)
  -log-to-stdout
        log to stdout instead of the system log when set
  -oem value
        current oem. [azure brightbox cloudsigma cloudstack digitalocean ec2 exoscale file gce hyperv interoute niftycloud openstack packet pxe qemu rackspace rackspace-onmetal vagrant vagrant-virtualbox virtualbox vmware]
  -root string
        root of the filesystem (default "/")
  -stage value
        execution stage. [disks files]
  -version
        print the version and exit
[root@vanilla-f28 artifacts]# echo $?
2

See the non-zero return code above ^^

Operating System Version

Fedora 28

Ignition Version

[root@vanilla-f28 artifacts]# ignition --version
Ignition 0.27.0

RPM file here. Koji scratch build here

Ignition: Support for Basic Authentication

Issue by @jcrowthe


Issue Report

CoreOS Version

$ cat /etc/os-release
NAME=CoreOS 1235.6.0

Expected Behavior

Accept basic authentication username/password in URL of remote Ignition config.

Actual Behavior

Ignition fails to pull the remote config.

Reproduction Steps

  1. Provide the following as ignition data:
{
    "ignition": {
        "version": "2.0.0",
        "config": {
            "append": [{
                "source": "http://test:test@<server-ip>/test.json",
                "verification": {}
            }]
        }
    }
}
  2. Host the following test.json at the server address above:
{
    "ignition": {
        "version": "2.0.0"
    },
    "storage": {
        "files": [{
            "filesystem": "root",
            "path": "/foo/bar",
            "contents": {
                "source": "data:,example%20file%0A"
            }
        }]
    }
}

Feature Request

This is a request for basic authentication support in Ignition. This support would allow ignition data to be housed on a remote server that requires a username/password in order to be accessed. The application of this feature would be to provide ignition data that contains sensitive data, such as SSL certificates, passwords, etc.

While an obvious alternative to this solution is to embed this sensitive config information directly into the parent ignition file, this is not always possible. Specifically, right now, cloud providers have a max of 16KB of data that can be placed inside the user-data (AWS) or custom-data (Azure) fields, where the ignition data is submitted. Once this limit is reached, it is necessary to reference further configs by using Ignition's remote config ability.

Let's consider this request. Thanks!

ignition does not attempt to retry hung GET from metadata server with default timeouts

Bug

Operating System Version

Red Hat CoreOS

Ignition Version

2.2.0

Environment

Red Hat CoreOS

What hardware/cloud provider/hypervisor is being used to run Ignition?

Openstack Ocata

Expected Behavior

ignition gets config from metadata server

Actual Behavior

Ignition's initial GET hangs for some reason (network not quite ready yet?), but it never retries.

Reproduction Steps

$ cat config.ign
{
	"ignition": {
		"version": "2.2.0"
	},
	"passwd": {
		"users": [{
			"groups": ["sudo"],
			"name": "sjennings",
			"sshAuthorizedKeys": ["ssh-rsa AAAAB3NzaC1..."]
		}]
	}
}

$ openstack server create --image rhcos-20180712 --flavor m1.medium --key-name yubikey rhcos2 --user-data config.ign --wait --nic net-id=dbf780a9-5035-48fc-a929-133de0a1101c

instance logs

[    5.049390] systemd[1]: Started Remount /sysroot/var for Ignition.         Starting Ignition (files)...

[    5.050564] systemd[1]: Starting Ignition (files)...
[    5.057660] ignition[764]: INFO     : Ignition 0.26.0
[    5.058432] ignition[764]: INFO     : reading system config file "/usr/lib/ignition/base.ign"
[    5.059398] ignition[764]: INFO     : no config at "/usr/lib/ignition/base.ign"
[    5.060438] ignition[764]: DEBUG    : parsed url from cmdline: ""
[    5.061193] ignition[764]: INFO     : no config URL provided
[    5.061807] ignition[764]: INFO     : reading system config file "/usr/lib/ignition/user.ign"
[    5.062622] ignition[764]: INFO     : no config at "/usr/lib/ignition/user.ign"
[    5.063345] ignition[764]: DEBUG    : config drive ("/dev/disk/by-label/config-2") not found. Waiting...
[    5.064172] ignition[764]: INFO     : GET http://169.254.169.254/openstack/latest/user_data: attempt #1
[    5.065014] ignition[764]: DEBUG    : config drive ("/dev/disk/by-label/CONFIG-2") not found. Waiting...
[    6.058136] ignition[764]: DEBUG    : config drive ("/dev/disk/by-label/config-2") not found. Waiting...
[    6.061856] ignition[764]: DEBUG    : config drive ("/dev/disk/by-label/CONFIG-2") not found. Waiting...
[    7.058360] ignition[764]: DEBUG    : config drive ("/dev/disk/by-label/config-2") not found. Waiting...
...
[   34.063241] ignition[764]: DEBUG    : config drive ("/dev/disk/by-label/config-2") not found. Waiting...
[   34.064892] ignition[764]: DEBUG    : config drive ("/dev/disk/by-label/CONFIG-2") not found. Waiting...
[   35.057952] ignition[764]: INFO     : neither config drive nor metadata service were available in time. Continuing without a config...
[  OK  ] Started Ignition (files).
[   35.059562] ignition[764]: ERROR    : timed out while fetching config from config drive (CONFIG-2)
         Starting Switch Root...
[   35.060612] systemd[1]: Started Ignition (files).
[   35.061185] ignition[764]: ERROR    : timed out while fetching config from config drive (config-2)
[   35.061872] ignition[764]: WARNING  : failed to fetch config: not a config (empty)
[   35.062580] ignition[764]: INFO     : not a config (empty): ignoring user-provided config
[   35.063302] ignition[764]: INFO     : reading system config file "/usr/lib/ignition/default.ign"
[   35.064189] ignition[764]: INFO     : no config at "/usr/lib/ignition/default.ign"
[   35.064889] systemd[1]: Starting Switch Root...
[   35.065496] systemd[1]: Switching root.

Other Information

Opt-in translation between config versions (vs. current required support)

Currently config/v2_2 is importing config/v2_1 for translation (and backwards-compat). It would be nice for that to be optional (with the logic stored in a separate package), so consumers who knew they only needed to support v2.2 configs could prune the code that supports older config versions.

I'm sure there are many possible approaches to this sort of issue, but see here for an approach I've used previously. A generic version struct lets you extract the target version. You can then look up the appropriate reader from a registry. The per-version readers register themselves on import, and consumers who want backwards compat opt in by importing the old versions they want to support.

If this is an issue you want to address, and the libpod hook approach sounds reasonable, I can work up an ignition PR for it.

Failure to fetch the user-data on EC2

This looks like a transient network failure, but then it just seems to hang after 5/6 seconds.

ignition[319]: Ignition v0.1.4
ignition[319]: op(1): [started]  GET "http://169.254.169.254/2009-04-04/user-data"
ignition[319]: op(1): [failed]   GET "http://169.254.169.254/2009-04-04/user-data": Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: network is unreachable
ignition[319]: op(2): [started]  GET "http://169.254.169.254/2009-04-04/user-data"
ignition[319]: op(2): [failed]   GET "http://169.254.169.254/2009-04-04/user-data": Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: network is unreachable
ignition[319]: op(3): [started]  GET "http://169.254.169.254/2009-04-04/user-data"
ignition[319]: op(3): [failed]   GET "http://169.254.169.254/2009-04-04/user-data": Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: network is unreachable
ignition[319]: op(4): [started]  GET "http://169.254.169.254/2009-04-04/user-data"
ignition[319]: op(4): [failed]   GET "http://169.254.169.254/2009-04-04/user-data": Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: network is unreachable
ignition[319]: op(5): [started]  GET "http://169.254.169.254/2009-04-04/user-data"
ignition[319]: op(5): [failed]   GET "http://169.254.169.254/2009-04-04/user-data": Get http://169.254.169.254/2009-04-04/user-data: dial tcp 169.254.169.254:80: network is unreachable
ignition[319]: op(6): [started]  GET "http://169.254.169.254/2009-04-04/user-data"
 start job is running for Ignition (disks) (6s / 31s)
 start job is running for Ignition (disks) (7s / 31s)
 start job is running for Ignition (disks) (7s / 31s)
...
 start job is running for Ignition (disks) (30s / 31s)
 start job is running for Ignition (disks) (31s / 31s)
ignition-disks.service: Start operation timed out. Terminating.
ignition-disks.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
systemd[1]: Failed to start Ignition (disks).

Ignition Config not working as expected on VMWare Fusion

Issue by @thomasbell1985


Issue Report

I am attempting to use the ignition config to configure the authorized ssh keys for the core user using guest info for vmware. I am following the docs found here. It does not appear to be working correctly, i.e. the authorized ssh keys are not being set.

Bug

Container Linux Version

Below is the os-release information

(screenshot of /etc/os-release output)

BUG_REPORT_URL="https://issues.coreos.com"

Environment

I am using VMware Fusion Pro 10.0.1 for the Mac.

Expected Behavior

When setting the following parameters with the correct ignition config:

guestinfo.coreos.config.data = ""
guestinfo.coreos.config.data.encoding = "base64"

I expect the parameters specified to be set on the machine at boot time. I am attempting to set the ssh_authorized_keys for the core user.

Actual Behavior

The machine boots, but no authorized keys are set for the core user. Running the following command:

$: ls .ssh/

shows an empty directory.

Reproduction Steps

  1. Create an Ignition config.yaml that looks like this:
passwd:
  users:
    - name: core
    - ssh_authorized_keys:
      - "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDLu+/5DffWK2Olj1SdlG335yKdRtUVq6WKhMoee0lF4UaqaXPXJx2R2wD869QcK+ys2AJshCpH/sPVKqFZrvtagROlMND/y8JAtfdlamILwUve7gGisSz5a4wOJIV4MQfzFC29aiGcHasBDDu4XyWICeWCa8vhgqeAAm2VsdHvMbqSbOWLIHYj5JxMBE13/e3Y9TuBiBYQegW4GFCVj3GCO7jrI+hF6RkTBorHhP8DeoiB7zGHwVbVn5iHgEfUpJKqXCVL+JI/7TZ0DnCKGvF3ibn8B6ZrUO43jzNAERif44aAf39Dib4fNgR7RuVh6Ss+QsiuZlBE5XmJeUGaUhtumaT1Md5jd5m2uKRlt7gJECl6376hFdUDIJmK8oZQsGqwyWrzHGi8YrOtYpUladIgTS9JGnmfoTretxTfmdUsNCMnFB8hzPLNp6CtZL+ibSJIQxhX5bbLDj+V3Teeb4S+iP/14Hk2DzA3ItzFxYRudb6xeI8xfcRYiH8W+gN1zPJ3ppWrB8KgXxvYxyd/rwDs6uwU6nu8zXJRCoz0ltLL9hPmyZGsH/3CX4W0m99L8nPcakn/WSvDTfYJJD6P2o8lI4DbjF9vl3J3d4muiIQMFTvuRVTTw6gkkX9zJKzdL34jTXuYvtKy9iNVsIw7f98L1tTnCzNzRh1LobTSJUXrbw== [email protected]"        
  2. I then transpile using the latest version of the Ignition transpiler by running the following command:
ct --in-file path/to/ignition-config.yaml --out-file path/to/ignition-config.json
  3. Base64 encode the ignition-config.json using the following command:
base64  path/to/ignition-config.json

which produces this output:

eyJpZ25pdGlvbiI6eyJjb25maWciOnt9LCJ0aW1lb3V0cyI6e30sInZlcnNpb24iOiIyLjEuMCJ9LCJuZXR3b3JrZCI6e30sInBhc3N3ZCI6eyJ1c2VycyI6W3sibmFtZSI6ImNvcmUifSx7InNzaEF1dGhvcml6ZWRLZXlzIjpbInNzaC1yc2EgQUFBQUIzTnphQzF5YzJFQUFBQURBUUFCQUFBQ0FRREx1Ky81RGZmV0syT2xqMVNkbEczMzV5S2RSdFVWcTZXS2hNb2VlMGxGNFVhcWFYUFhKeDJSMndEODY5UWNLK3lzMkFKc2hDcEgvc1BWS3FGWnJ2dGFnUk9sTU5EL3k4SkF0ZmRsYW1JTHdVdmU3Z0dpc1N6NWE0d09KSVY0TVFmekZDMjlhaUdjSGFzQkREdTRYeVdJQ2VXQ2E4dmhncWVBQW0yVnNkSHZNYnFTYk9XTElIWWo1SnhNQkUxMy9lM1k5VHVCaUJZUWVnVzRHRkNWajNHQ083anJJK2hGNlJrVEJvckhoUDhEZW9pQjd6R0h3VmJWbjVpSGdFZlVwSktxWENWTCtKSS83VFowRG5DS0d2RjNpYm44QjZaclVPNDNqek5BRVJpZjQ0YUFmMzlEaWI0Zk5nUjdSdVZoNlNzK1FzaXVabEJFNVhtSmVVR2FVaHR1bWFUMU1kNWpkNW0ydUtSbHQ3Z0pFQ2w2Mzc2aEZkVURJSm1LOG9aUXNHcXd5V3J6SEdpOFlyT3RZcFVsYWRJZ1RTOUpHbm1mb1RyZXR4VGZtZFVzTkNNbkZCOGh6UExOcDZDdFpMK2liU0pJUXhoWDViYkxEaitWM1RlZWI0UytpUC8xNEhrMkR6QTNJdHpGeFlSdWRiNnhlSTh4ZmNSWWlIOFcrZ04xelBKM3BwV3JCOEtnWHh2WXh5ZC9yd0RzNnV3VTZudTh6WEpSQ296MGx0TEw5aFBteVpHc0gvM0NYNFcwbTk5TDhuUGNha24vV1N2RFRmWUpKRDZQMm84bEk0RGJqRjl2bDNKM2Q0bXVpSVFNRlR2dVJWVFR3Nmdra1g5ekpLemRMMzRqVFh1WXZ0S3k5aU5Wc0l3N2Y5OEwxdFRuQ3pOelJoMUxvYlRTSlVYcmJ3PT0gazhzLWNsdXN0ZXItdXNlckBteWRvbWFpbi5jb20iXX1dfSwic3RvcmFnZSI6e30sInN5c3RlbWQiOnt9fQ==

The decoded output looks like this:

{"ignition":{"config":{},"timeouts":{},"version":"2.1.0"},"networkd":{},"passwd":{"users":[{"name":"core"},{"sshAuthorizedKeys":["ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDLu+/5DffWK2Olj1SdlG335yKdRtUVq6WKhMoee0lF4UaqaXPXJx2R2wD869QcK+ys2AJshCpH/sPVKqFZrvtagROlMND/y8JAtfdlamILwUve7gGisSz5a4wOJIV4MQfzFC29aiGcHasBDDu4XyWICeWCa8vhgqeAAm2VsdHvMbqSbOWLIHYj5JxMBE13/e3Y9TuBiBYQegW4GFCVj3GCO7jrI+hF6RkTBorHhP8DeoiB7zGHwVbVn5iHgEfUpJKqXCVL+JI/7TZ0DnCKGvF3ibn8B6ZrUO43jzNAERif44aAf39Dib4fNgR7RuVh6Ss+QsiuZlBE5XmJeUGaUhtumaT1Md5jd5m2uKRlt7gJECl6376hFdUDIJmK8oZQsGqwyWrzHGi8YrOtYpUladIgTS9JGnmfoTretxTfmdUsNCMnFB8hzPLNp6CtZL+ibSJIQxhX5bbLDj+V3Teeb4S+iP/14Hk2DzA3ItzFxYRudb6xeI8xfcRYiH8W+gN1zPJ3ppWrB8KgXxvYxyd/rwDs6uwU6nu8zXJRCoz0ltLL9hPmyZGsH/3CX4W0m99L8nPcakn/WSvDTfYJJD6P2o8lI4DbjF9vl3J3d4muiIQMFTvuRVTTw6gkkX9zJKzdL34jTXuYvtKy9iNVsIw7f98L1tTnCzNzRh1LobTSJUXrbw== [email protected]"]}]},"storage":{},"systemd":{}}

I have validated this config here which results in the following output:

(screenshot of the validation output)

Other Information

If I base64 encode a cloud-config.yml file and put it in the guest info parameter, things work as expected, but not with the Ignition config.

Ignition runs on every boot

When Ignition runs the disks target, systemd cannot check /etc/machine-id because root isn't mounted yet. As a result, it attempts to run the stage on every boot.

Appending to a file is not idempotent

Bug

Operating System Version

Any

Ignition Version

0.28.0

Environment

Any

Expected Behavior

If storage.files[].append is true, the contents are appended exactly once.

Actual Behavior

In the presence of failures, the contents can be appended multiple times.

Reproduction Steps

On Container Linux:

{
  "ignition": {
    "version": "2.2.0"
  },
  "storage": {
    "files": [
      {
        "filesystem": "root",
        "path": "/foo",
        "append": true,
        "contents": {
          "source": "data:,hello%0A"
        },
        "mode": 420
      },
      {
        "filesystem": "root",
        "path": "/bar",
        "append": true,
        "contents": {
          "source": "https://httpbin.org/status/404,404,404,200"
        },
        "mode": 420
      }
    ]
  }
}

Other Information

On Container Linux, if Ignition fails, the machine will automatically reboot after 5 minutes and Ignition will rerun. With the above config, fetching /bar will fail 75% of the time but will eventually succeed, at which point /foo will contain one line per attempt.

We can fix this by appending to files in two passes:

  1. Before the first time we append to a particular file in the files stage, check if file.ignition-orig exists. If so, overwrite file with file.ignition-orig before appending to file. If not, copy file to file.ignition-orig.
  2. At the end of the files stage, delete the .ignition-orig files.

That approach appropriates *.ignition-orig as a private namespace for Ignition. One alternative is to create an .ignition-temp directory at the root of the filesystem and put the backup files there.

No log output seen if ignition hangs

Bug

Operating System Version

  • RHCOS ~4392

Ignition Version

0.27.0-2.gitcc7ebe0.fc30

Environment

  • libvirt
  • openstack

What hardware/cloud provider/hypervisor is being used to run Ignition?

  • libvirt
  • openstack

Expected Behavior

If Ignition is unable to honor the config, the boot fails with log output.

Actual Behavior

If Ignition is unable to honor the config, booting hangs with no log output.

Reproduction Steps

  1. Provide an ignition config to RHCOS (example to be provided)
  2. Booting hangs with no logs

Other Information

The following PRs addressed similar issues earlier on:

This was reported by @eparis.

Nodes created/touched by Ignition lack the proper selinux labels

Issue by @dgonyeo


Issue Report

Bug

Environment

All distros with selinux enabled

Expected Behavior

Filesystem nodes created by Ignition receive the correct default selinux labels for their location

Actual Behavior

Nodes created by Ignition lack a selinux label (ls -Z shows system_u:object_r:unlabeled_t:s0)

Other Information

When a node (i.e. a file, dir, or link) is created on a distro with selinux, a default label is applied to the node based on the selinux policies. These selinux policies are stored on disk, and on a typical distro with selinux enabled (Fedora 27 is what I'm testing on) they are loaded in after the switch-root occurs. This means that any node created by Ignition lacks the proper default selinux label. One side effect of this is that when Ignition creates a user, /etc/{passwd,shadow,group} become unlabeled, sssd.service fails to start as a result, and then the machine doesn't fully boot.

I asked Dan Walsh the following:

Would it make sense for Ignition to fetch the SELinux policy off of the root filesystem and load it into the initramfs systemd?

His answer was:

that involves a lot of intelligence and would have to be coordinated with the systemd team

So loading in the selinux policies in the initramfs would be a fairly involved solution to this.

Luckily Ignition knows which nodes it creates, so a far easier solution would be to call restorecon on the touched nodes once the policies have been loaded. Having Ignition create a unit like this resulted in a Fedora machine successfully booting in my testing.

This leaves the question of what creates this unit. I see two options:

  • Ignition generates this unit in the files stage. The path for restorecon could be set in the internal/distro package, and if the file exists in the root fs then Ignition will add/enable this generated unit. This approach somewhat breaks the Ignition model by being dependent on work post switch-root, but this work would happen as early as possible after that point. It should be safe to always generate this unit when restorecon exists, since having things labeled properly on systems with selinux disabled won't affect things.
  • Some tool that generates Ignition configs for Fedora/RHEL (the ct equivalent that doesn't exist yet) is responsible for generating this unit and including it in the config. This approach makes hand writing Ignition configs that don't break machines much trickier, since the config author would need to manually generate this unit.

Which option should I take?

And as a side note, I'm going to add the ability to manually set extended attributes on nodes eventually. Any node that has security.selinux set for it in the config wouldn't be included in the list of nodes to be relabeled.

Support for encrypted luks volumes

Issue by @crawford


Issue Report

Feature Request

Environment

All

Desired Feature

Ignition should be able to create LUKS devices which can then be used to back filesystems. This is needed for anyone who wants to encrypt their root filesystem.

I'm not sure where the decryption secrets come from just yet. (/cc @brianredbeard @mjg59)

I think the config structure would look something like this:

{
  "ignition": { "version": "2.0.0" },
  "storage": {
    "luks": [{
      "device": "/dev/disk/by-partlabel/ROOT",
      "name": "encrypted-root",
      "other configuration": "goes here"
    }],
    "filesystems": [{
      "name": "root",
      "mount": {
        "device": "/dev/mapper/encrypted-root",
        "format": "btrfs",
        "create": {
          "force": true,
          "options": [ "--label=ROOT" ]
        }
      }
    }]
  }
}

We still have to figure out how to make the initramfs automatically open the LUKS devices before attempting to mount ROOT.
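
Conceptually, the initramfs would have to do something along these lines before mounting ROOT (a rough sketch; where the key material comes from is exactly the open question above):

# open the LUKS container defined in the config above
cryptsetup open /dev/disk/by-partlabel/ROOT encrypted-root --key-file /path/to/key

# the filesystem is then created on, and mounted from, the mapped device
mkfs.btrfs --force --label=ROOT /dev/mapper/encrypted-root
mount /dev/mapper/encrypted-root /sysroot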

Add device-independent Ignition option for specifying partition start/size

Issue by @bgilbert


Issue Report

Feature Request

Environment

Bare metal

Desired Feature

Provide a device-independent way to specify the start and size of a partition.

Other Information

These values are currently expressed in device-specific logical sectors, which are 4 KiB on 4K-native drives and 512 bytes otherwise (ignoring special cases like DVD-RAM disks). This could be awkward if a fleet of bare-metal machines contains data drives which are a mix of 512n/512e disks and 4Kn disks.

Reproduction

Ignition config:

{
 "ignition": {"version": "2.1.0"},
 "storage": {
  "disks": [{
   "device": "/dev/vda",
   "partitions": [{
    "size": 8192,
    "label": "TEST"
   }]
  }]
 }
}

Execution environment:

touch image
truncate image -s 8G
./coreos_production_pxe.sh -i config.ign -global virtio-blk-device.logical_block_size=4096 -drive if=virtio,format=raw,file=image -append coreos.first_boot=1
sudo sgdisk -p /dev/vda

This will report a 32 MiB volume, or a 4 MiB volume if the logical_block_size parameter is removed.
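
The discrepancy is just sector-size arithmetic: the config requests 8192 logical sectors, so

8192 sectors * 4096 bytes/sector = 32 MiB   (4Kn disk)
8192 sectors *  512 bytes/sector =  4 MiB   (512-byte logical sectors)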

cc @dgonyeo

Support for metadata in OpenStack/Ec2 providers

Feature Request

Desired Feature

When deploying on OpenStack (and I think also EC2, but I've not tested it), it's a little inconvenient that the instance metadata isn't read, which means you have to manually duplicate some data (in particular the SSH key for the core user) in the Ignition config.

For example, start an instance like:

$ nova boot --user-data ignition.json --image coreos --key-name default --nic net-name=ctlplane --flavor baremetal test

The --key-name default means the SSH key is passed via either the Nova metadata service or the config drive, but Ignition only considers user_data (not meta_data.json), so you must specify the key inline in the Ignition config like:

(undercloud) [stack@undercloud ~]$ cat ignition.json 
{
  "ignition": {
    "config": {},
    "timeouts": {},
    "version": "2.1.0"
  },
  "networkd": {},
  "passwd": {
    "users": [
      {
        "name": "core",
        "sshAuthorizedKeys": [
          "ssh-rsa thekey root@undercloud"
        ]
      }
    ]
  }
}

https://github.com/coreos/ignition/blob/master/internal/providers/openstack/openstack.go#L49 reads from e.g. openstack/latest/user_data, but inside openstack/latest/meta_data.json there's the key provided due to the --key-name default, as well as some other useful things like the hostname.
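
For reference, the meta_data.json served alongside user_data looks roughly like this (an abbreviated, illustrative example; exact contents vary by deployment):

{
  "uuid": "d8e02d56-2648-49a3-bf97-6be8f1204f38",
  "name": "test",
  "hostname": "test.novalocal",
  "public_keys": {
    "default": "ssh-rsa thekey root@undercloud"
  }
}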

I'm wondering if a PR to read this data would be acceptable, so we can populate defaults for e.g. the SSH key and hostname without necessarily needing an explicit Ignition config? Coming from cloud-init this would seem a little simpler to me, but I'd welcome feedback on that before spending time on code :)

Environment

I'm testing Ignition 0.27.0 on a recent trunk OpenStack build.

Exporting Ignition's `renderConfig` functionality

Feature Request

Exporting Ignition's renderConfig functionality so that external agents can resolve Ignition configs the same way the Ignition engine does on nodes.

Desired Feature

Ignition uses renderConfig on machines to resolve remote sources for replace and append directives.

  • right now the renderConfig function is unexported
  • the Fetcher struct used by renderConfig cannot be vendored by external projects, as it is part of an internal package

Exporting the renderConfig function would allow external agents to render an Ignition config that is compliant with Ignition's behaviour.
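
As a very rough sketch, an exported entry point could look something like this (all names here are hypothetical; none of them exist in Ignition today):

// Package render is a hypothetical public package wrapping the engine's
// config resolution logic.
package render

// Renderer resolves an Ignition config the same way the engine does on a
// node: fetch any replace/append sources, merge them, and return a fully
// self-contained config.
type Renderer interface {
	// Render takes a raw config (possibly referencing remote sources) and
	// returns the resolved config with no remaining remote references.
	Render(raw []byte) (resolved []byte, err error)
}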

Specific Use Case

To manage the configuration for machines in OpenShift clusters, there is a plan to deploy an Ignition endpoint inside the cluster that will serve Ignition configs to machines. The config served to machines should be static and should have no remote sources.

Rendering the remote sources in the Ignition config (which might have been provided by the user) ensures that machines joining the cluster at some point in the future continue to receive the same configuration, even if the contents of the remote location have changed.

/cc: @crawford @yifan-gu

Ignition silently fails to enable a nonexistent unit

Issue by @bgilbert


Issue Report

Bug

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1590.0.0
VERSION_ID=1590.0.0
BUILD_ID=2017-11-08-0831
PRETTY_NAME="Container Linux by CoreOS 1590.0.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Any

Expected Behavior

If Ignition is asked to enable a nonexistent unit, it fails the boot.

Actual Behavior

Ignition claims to have enabled the unit, but nothing happens.

Reproduction Steps

  1. Boot the following CLC:
systemd:
  units:
    - name: blargh.service
      enabled: true
  2. Check journalctl -t ignition and systemctl status.

Other Information

This happens because Ignition configures systemd units via presets, and presets are allowed to reference nonexistent units.
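
Concretely, the enablement above amounts to writing a preset entry like the following (the file path shown is illustrative), and since presets are allowed to reference units that don't exist, nothing complains:

# /etc/systemd/system-preset/20-ignition.preset
enable blargh.service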

journalctl says:

Nov 10 01:19:32 localhost ignition[439]: files: op(8): [started]  processing unit "blargh.service"
Nov 10 01:19:32 localhost ignition[439]: files: op(8): [finished] processing unit "blargh.service"
Nov 10 01:19:32 localhost ignition[439]: files: op(9): [started]  enabling unit "blargh.service"
Nov 10 01:19:32 localhost ignition[439]: files: op(9): [finished] enabling unit "blargh.service"

Ignition should be able to enable systemd aliases

Issue by @arithx


Issue Report

Bug

Expected Behavior

When you enable a systemd alias, the aliased service should be correctly enabled on the resulting system.

Actual Behavior

Ignition writes a systemd preset, and presets explicitly do not work with aliases.

Other Information

I ran into this when rewriting the kola NFS tests: trying to enable nfsd.service resulted in nothing happening on the NFS host. Changing the unit being enabled to nfs-server.service fixed the issue.
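
Until Ignition handles aliases, one way to find the real unit behind an alias is to ask systemd on a running system, e.g. (the output shown assumes nfsd.service is shipped as an alias of nfs-server.service, as on the NFS host above):

$ systemctl show -p Id nfsd.service
Id=nfs-server.service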

Add support for fetching the user data with the VersionId

Feature Request

Environment

This applies to EC2 instances fetching their userdata from an S3 bucket.

Desired Feature

It would be nice if the FetchFromS3 function could also take the VersionId of the S3 object into consideration when fetching the userdata from S3. Something like this:

at: internal/resource/url.go

 	sess.Config.Region = aws.String(region)
 
+	// u.Query() returns a map[string][]string; take the first versionId value, if any
+	var versionId *string
+	if v := u.Query().Get("versionId"); v != "" {
+		versionId = aws.String(v)
+	}
+
 	input := &s3.GetObjectInput{
-		Bucket: &u.Host,
-		Key:    &u.Path,
+		Bucket:    &u.Host,
+		Key:       &u.Path,
+		VersionId: versionId,
 	}
 	err = f.fetchFromS3WithCreds(ctx, dest, input, sess)
 	if err != nil {
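
With something like this in place, a config could pin an exact object version, e.g. (the bucket, key, and version ID below are made up):

{
  "ignition": {
    "version": "2.2.0",
    "config": {
      "replace": {
        "source": "s3://mybucket/worker.ign?versionId=3HL4kqtJvjVBH40Nrjfkd"
      }
    }
  }
}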

Other Information

My use case for this is Terraform. Every time I execute a deployment with Terraform and an instance has its userdata stored in an S3 bucket, Terraform replaces the whole instance because it assumes the userdata might have changed. This causes several instances to be replaced even when their userdata has not changed.

providers/virtualbox: investigate using GuestProperties

Ignition currently supports a virtualbox provider, which reads straight JSON bytes from a well-known GPT partition: https://github.com/coreos/ignition/blob/v0.28.0/internal/providers/virtualbox/virtualbox.go

This means that at the injection site (e.g. vagrant-virtualbox) we have a bunch of logic to handcraft a raw GPT disk: https://github.com/coreos/vagrant-ignition/blob/v0.0.1/lib/vagrant-ignition/action/IgnitionDiskGenerator.rb

However, virtualbox already provides a HostService for exchanging properties between host and guest, which can be used from the guest to consume host-provided userdata: https://www.virtualbox.org/browser/vbox/trunk/src/VBox/HostServices/GuestProperties/service.cpp
These properties can be managed directly via VBoxManage guestproperty: http://underpop.online.fr/v/virtualbox/vboxmanage-guestproperty-virtualbox.html
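
For example, the injection side could become something like this (the property name is made up; the 128-byte value limit discussed below is the main obstacle):

# on the host: attach the Ignition config to the VM as a guest property
VBoxManage guestproperty set myvm /Ignition/Config "$(cat config.ign)"

# on the guest (with the guest utilities available): read it back
VBoxControl guestproperty get /Ignition/Config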

As a side benefit, guest properties could also be used to report provisioning status back to the host (outside the scope of this ticket, but nice).

The main problem I see is that property values are currently limited to 128 bytes: https://www.virtualbox.org/browser/vbox/trunk/include/VBox/HostServices/GuestPropertySvc.h?rev=71010#L40

It would be good to investigate:

  • whether vbox upstream would be OK with taking a patch to enlarge that size to something slightly larger (e.g. 16 KiB, like the AWS limit)
  • whether there is already a good way to consume this from userspace, or whether we need additional features in the kernel guest drivers (i.e. recent Linux VBOXGUEST)
  • whether it makes sense to implement and use this in Ignition (e.g. to get rid of udev settling races)

Ignition follows absolute symlinks

Bug

Operating System Version

Any

Ignition Version

v0.28.0

Environment

What hardware/cloud provider/hypervisor is being used to run Ignition?
Any

Expected Behavior

Ignition either refuses to follow absolute symlinks or follows them as if they were rooted in the filesystem the link is in.

Actual Behavior

Ignition follows absolute symlinks with the initramfs root as /

Reproduction Steps

Boot with this config:

{
  "ignition": { "version": "2.2.0" },
  "storage": {
    "links": [{
      "filesystem": "root",
      "path": "/var/bar",
      "target": "/etc"
    },
    {
      "filesystem": "root",
      "path": "/var/bar/baz",
      "target": "/foo/bar"
    }
    ]
  }
}

Observe that no symlink to /foo/bar named baz exists in /etc

Other Information

Ignition should either fail when asked to follow an absolute symlink, or follow it as if it were rooted in the filesystem it is operating on.

Release Ignition v0.28.0

Release checklist:

  • Write release notes in NEWS. Get them reviewed and merged
    • If doing a branched release, also include a PR to merge the NEWS changes into master
  • Ensure your local copy is up to date with master and your working directory is clean
  • Ensure you can sign commits and any yubikeys/smartcards are plugged in
  • Run ./tag_release <vX.Y.Z> <git commit hash>
  • Push that tag to Github
  • Run ./build_releases
  • Sign the release artifacts by running
gpg --local-user 0xCDDE268EBB729EC7 --detach-sign --armor <path to artifact>

for each release artifact. Do not try to sign all of them at once by globbing. If you do, gpg will sign the combination of all the release artifacts instead of each one individually.
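
For example, a shell loop invokes gpg once per artifact and avoids that pitfall (the artifact path pattern is illustrative):

for artifact in ./bin/releases/*; do
    gpg --local-user 0xCDDE268EBB729EC7 --detach-sign --armor "$artifact"
done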

  • Create a draft release on Github and upload all the release artifacts and their signatures. Copy and paste the release notes from NEWS here as well.
  • Publish the release
  • Sync the docs using ignition for PROJECT and the version X.Y.Z (not vX.Y.Z) for RELEASE.
  • Bump the Ignition ebuild in coreos-overlay
  • Vendor the new Ignition version in mantle

systemd instantiated services are not started when enabled via ignition

Issue by @redbaron


Bug

Instantiated services can be enabled and return no error, but when the machine boots, the service is not enabled:

systemd:
  units:
    - name: echo@.service
      contents: |
        [Unit]
        Description=f
        [Service]
        Type=oneshot
        ExecStart=/bin/echo %i
        [Install]
        WantedBy=multi-user.target
    - name: [email protected]
      enable: true

then boot, run systemctl status [email protected], and it shows that the service didn't start.

Container Linux Version

NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1520.8.0
VERSION_ID=1520.8.0
BUILD_ID=2017-10-26-0342
PRETTY_NAME="Container Linux by CoreOS 1520.8.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Expected Behavior

Enabled service starts

Actual Behavior

Ignition uses systemd presets, which apparently do not support enabling instantiated services.
There is a discussion at https://lists.freedesktop.org/archives/systemd-devel/2015-August/033834.html where a hack with explicit symlinks installed via Alias= is used. It would be nice if Ignition took care of this.
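
Until Ignition handles this itself, one possible workaround (an untested sketch) is to create the wants symlink explicitly via the storage section, matching the unit names used in the example above:

storage:
  links:
    - filesystem: root
      path: /etc/systemd/system/multi-user.target.wants/[email protected]
      target: /etc/systemd/system/echo@.service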

Provide a way for Ignition to report provisioning success/failure

Issue by @bgilbert


Issue Report

Feature Request

Environment

Any

Desired Feature

Consider adding a way (in the Ignition config itself, or on the kernel command line) to specify a URL Ignition can POST to report success or failure.
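
Purely as a strawman for discussion, a config-side knob might look something like this (no such section exists in any current spec version):

{
  "ignition": {
    "version": "2.3.0-experimental",
    "report": {
      "url": "https://provisioning.example.com/ignition-status"
    }
  }
}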

Other Information

Use cases:

  1. Cluster monitoring, e.g. nodes which fail provisioning due to a disk failure
  2. Platform-specific reporting via platform hooks, e.g. coreos/bugs#2130
  3. kola logging
