Giter Site home page Giter Site logo

Comments (13)

zackliu avatar zackliu commented on August 12, 2024

you run az container show combined with az container logs to get more details.
The initialization failed will produce this problem, but Azure will have two different behaviors to initialization failed. One is deployment shows successful and container shows terminated, the other is endless deployment.

from azure-container-agents-plugin.

raphaelquati avatar raphaelquati commented on August 12, 2024

I'm tryiing to use my docker image as a JNLP agent (windows). Same problem.
But the problem is not the plugin. Is Azure itself

In Linux it is working.

from azure-container-agents-plugin.

zackliu avatar zackliu commented on August 12, 2024

Seems it's a connection issue, but I can't tell why connection timed out.
In the new version(0.4.0), we support SSH connection. You can have a try.

from azure-container-agents-plugin.

jamesabbottsmith avatar jamesabbottsmith commented on August 12, 2024

I've tried this with an SSH connection. I see similar behaviour. The container is started and appears to run, but exits before doing anything useful.

$ az container logs --name ...
Linux caas-82e05980240a429795153b976151f58a-2347572569-kqlfb 4.4.0-101-generic #124-Ubuntu SMP Fri Nov 10 18:29:59 UTC 2017 x86_64 GNU/Linux

I also see the ALPN callback dropped message in the logs

ALPN callback dropped: SPDY and HTTP/2 are disabled. Is alpn-boot on the boot class path?

from azure-container-agents-plugin.

zackliu avatar zackliu commented on August 12, 2024

@jamesabbottsmith Can you provide more logs about SSHLauncher in Jenkins logs.

Such as:

Jan 16, 2018 6:16:32 AM INFO com.microsoft.jenkins.containeragents.remote.SSHLauncher launch
SSHLauncher: Start to connect node acissh-g0w3t : 52.224.69.226 via SSH
Jan 16, 2018 6:16:44 AM INFO com.microsoft.jenkins.containeragents.remote.SSHLauncher launch
SSHLauncher: Copy slave.jar to remote host successfully

from azure-container-agents-plugin.

atrauzzi avatar atrauzzi commented on August 12, 2024

I've noticed a lot of issues with ACI even in just manually creating containers. It seems like the service has a lot of performance and consistency issues. Definitely to the extent that a plugin like this wouldn't make sense until the start time of any Windows or Linux container can be brought to under ~15 seconds.

from azure-container-agents-plugin.

arjanschaaf avatar arjanschaaf commented on August 12, 2024

I'm trying to setup an Jenkins deployment using the Solution Template together with an ACI based build environment. I'm running into very similar problems with the this setup and it is impossible to debug. az container show and az container logs are impossible to use because the containers are gone before you know it.

jenkins.log shows many entries like these:

Jan 26, 2018 1:01:10 PM com.microsoft.jenkins.containeragents.aci.AciCloud provision
INFO: Start ACI container for label ACI-container workLoad 1
Jan 26, 2018 1:01:10 PM com.microsoft.jenkins.containeragents.aci.AciCloud provision
INFO: Using ACI Container template: ACI-container
Jan 26, 2018 1:01:10 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning ACI-container from AciAgents with 1 executors. Remaining excess workload: 0
Jan 26, 2018 1:01:10 PM com.microsoft.jenkins.containeragents.aci.AciCloud$1 call
INFO: Add ACI node: aci-container-8q2gl
Jan 26, 2018 1:01:10 PM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Jan 26, 2018 1:01:12 PM com.microsoft.jenkins.containeragents.aci.AciService createDeployment
INFO: Waiting for deployment aci-container-q2m0kn9b
Jan 26, 2018 1:01:22 PM com.microsoft.jenkins.containeragents.aci.AciCloud$1 call
WARNING: AciCloud: Provision agent aci-container-8q2gl failed: Deployment aci-container-q2m0kn9b status: Failed
Jan 26, 2018 1:01:22 PM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Jan 26, 2018 1:01:22 PM com.microsoft.jenkins.containeragents.aci.AciService deleteAciContainerGroup
INFO: Delete ACI Container Group: aci-container-8q2gl successfully
Jan 26, 2018 1:01:30 PM hudson.slaves.NodeProvisioner$2 run
WARNING: Unexpected exception encountered while provisioning agent ACI-container
java.lang.Exception: java.lang.Exception: Deployment aci-container-q2m0kn9b status: Failed
	at com.microsoft.jenkins.containeragents.aci.AciCloud$1.call(AciCloud.java:136)
	at com.microsoft.jenkins.containeragents.aci.AciCloud$1.call(AciCloud.java:80)
	at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: Deployment aci-container-q2m0kn9b status: Failed
	at com.microsoft.jenkins.containeragents.aci.AciService.createDeployment(AciService.java:137)
	at com.microsoft.jenkins.containeragents.aci.AciContainerTemplate.provisionAgents(AciContainerTemplate.java:128)
	at com.microsoft.jenkins.containeragents.aci.AciCloud$1.call(AciCloud.java:104)
	... 6 more

from azure-container-agents-plugin.

arjanschaaf avatar arjanschaaf commented on August 12, 2024

So I'm finding my way around the Azure Portal and I finally found some relevant error feedback in the Deployments view of my ResourceGroup. Turned out I needed to issue this command and then it started working!!
az provider register --namespace Microsoft.ContainerInstance

from azure-container-agents-plugin.

chenkennt avatar chenkennt commented on August 12, 2024

@atrauzzi We recommend to use SSH to launch container as it's much more stable than JNLP. Please have a try and let us know if there is any issue.

from azure-container-agents-plugin.

atrauzzi avatar atrauzzi commented on August 12, 2024

I don't have SSH access to launch nodes on certain services. So this is still an issue.

Why did you close this issue before I could reply?

from azure-container-agents-plugin.

zackliu avatar zackliu commented on August 12, 2024

@atrauzzi Please provide more logs and configurations about using SSH to launch ACI as you haven't succeeded. From our test, SSH is more stable than JNLP. Please keep in contact or we don't know what going on now.

from azure-container-agents-plugin.

ravee-rachakonda-by avatar ravee-rachakonda-by commented on August 12, 2024

I am facing similar issue with jenkins/ssh-slave setup with this plugin. I see the agent gets deployed in ACI but gets deleted and another instance keeps coming up.

2020-03-29T17:18:15.716749905Z: [ERROR] 2020-03-29 17:18:15.716+0000 [id=112036] INFO c.m.j.c.aci.AciService#createDeployment: Deployment ztest-r3tkt376 succeed
2020-03-29T17:18:15.971375792Z: [ERROR] 2020-03-29 17:18:15.971+0000 [id=112036] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-03-29T17:18:16.119031025Z: [ERROR] 2020-03-29 17:18:16.114+0000 [id=112209] INFO c.m.j.c.remote.SSHLauncher#launch: SSHLauncher: Start to connect node ztest-kctsj : 52.152.198.206 via SSH

2020-03-29T17:18:16.422975930Z: [ERROR] 2020-03-29 17:18:16.422+0000 [id=112209] INFO c.m.j.c.remote.SSHLauncher#launch: SSHLauncher: Copy slave.jar to remote host successfully
2020-03-29T17:18:16.605629524Z: [ERROR] 2020-03-29 17:18:16.605+0000 [id=112209] INFO c.m.j.c.remote.SSHLauncher#launch: SSHLauncher: launching agent failed
2020-03-29T17:18:16.606118343Z: [ERROR] 2020-03-29 17:18:16.605+0000 [id=112036] WARNING c.m.j.c.aci.AciCloud$1#call: AciCloud: Provision agent ztest-kctsj failed: java.lang.InterruptedException: java.io.EOFException: unexpected stream termination
2020-03-29T17:18:17.050554205Z: [ERROR] 2020-03-29 17:18:17.050+0000 [id=112036] INFO o.j.p.cloudstats.CloudStatistics#getIdFor: No support for cloud-stats-plugin by class com.microsoft.jenkins.containeragents.aci.AciAgent

2020-03-29T17:18:17.241691129Z: [ERROR] 2020-03-29 17:18:17.241+0000 [id=112215] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-03-29T17:18:18.162858608Z: [ERROR] 2020-03-29 17:18:18.161+0000 [id=112209] INFO c.m.j.c.aci.AciService#deleteAciContainerGroup: Delete ACI Container Group: ztest-kctsj successfully

2020-03-29T17:18:24.527343702Z: [ERROR] 2020-03-29 17:18:24.526+0000 [id=88] WARNING hudson.slaves.NodeProvisioner#lambda$update$6: Unexpected exception encountered while provisioning agent ztest
2020-03-29T17:18:24.527390804Z: [ERROR] java.lang.InterruptedException: java.io.EOFException: unexpected stream termination
2020-03-29T17:18:24.527398804Z: [ERROR] at com.microsoft.jenkins.containeragents.remote.SSHLauncher.launch(SSHLauncher.java:121)
2020-03-29T17:18:24.527403904Z: [ERROR] at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:292)
2020-03-29T17:18:24.527409204Z: [ERROR] Caused: java.util.concurrent.ExecutionException
2020-03-29T17:18:24.527414405Z: [ERROR] at java.util.concurrent.FutureTask.report(FutureTask.java:122)
2020-03-29T17:18:24.527419405Z: [ERROR] at java.util.concurrent.FutureTask.get(FutureTask.java:192)
2020-03-29T17:18:24.527424205Z: [ERROR] at com.microsoft.jenkins.containeragents.aci.AciCloud$1.call(AciCloud.java:115)
2020-03-29T17:18:24.527429005Z: [ERROR] Caused: java.lang.Exception
2020-03-29T17:18:24.527433705Z: [ERROR] at com.microsoft.jenkins.containeragents.aci.AciCloud$1.call(AciCloud.java:140)
2020-03-29T17:18:24.527438606Z: [ERROR] at com.microsoft.jenkins.containeragents.aci.AciCloud$1.call(AciCloud.java:80)
2020-03-29T17:18:24.527443306Z: [ERROR] at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
2020-03-29T17:18:24.527448206Z: [ERROR] at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
2020-03-29T17:18:24.527452906Z: [ERROR] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
2020-03-29T17:18:24.527457606Z: [ERROR] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2020-03-29T17:18:24.527462407Z: [ERROR] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2020-03-29T17:18:24.527467207Z: [ERROR] at java.lang.Thread.run(Thread.java:748)
2020-03-29T17:18:24.527471907Z: [ERROR] 2020-03-29 17:18:24.527+0000 [id=88] WARNING c.m.j.c.aci.AciCloud#canProvision: Cannot provision: template for label ztest is not available now, because it failed to provision last time.

2020-03-29T17:18:28.893307318Z: [ERROR] 2020-03-29 17:18:28.893+0000 [id=112245] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$0: Started DockerContainerWatchdog Asynchronous Periodic Work
2020-03-29T17:18:28.893558628Z: [ERROR] 2020-03-29 17:18:28.893+0000 [id=112245] INFO c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog has been triggered
2020-03-29T17:18:28.893581829Z: [ERROR] 2020-03-29 17:18:28.893+0000 [id=112245] INFO c.n.j.p.d.DockerContainerWatchdog$Statistics#writeStatisticsToLog: Watchdog Statistics: Number of overall executions: 571, Executions with processing timeout: 0, Containers removed gracefully: 0, Containers removed with force: 0, Containers removal failed: 0, Nodes removed successfully: 0, Nodes removal failed: 0, Container removal average duration (gracefully): 0 ms, Container removal average duration (force): 0 ms, Average overall runtime of watchdog: 0 ms, Average runtime of container retrieval: 0 ms
2020-03-29T17:18:28.893631631Z: [ERROR] 2020-03-29 17:18:28.893+0000 [id=112245] INFO c.n.j.p.d.DockerContainerWatchdog#loadNodeMap: We currently have 2 nodes assigned to this Jenkins instance, which we will check
2020-03-29T17:18:28.893644931Z: [ERROR] 2020-03-29 17:18:28.893+0000 [id=112245] INFO c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog check has been completed
2020-03-29T17:18:28.893813838Z: [ERROR] 2020-03-29 17:18:28.893+0000 [id=112245] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$0: Finished DockerContainerWatchdog Asynchronous Periodic Work. 0 ms

2020-03-29T17:18:34.526268101Z: [ERROR] 2020-03-29 17:18:34.525+0000 [id=96] INFO c.m.j.c.aci.AciCloud#provision: Start ACI container for label ztest workLoad 1
2020-03-29T17:18:34.526304302Z: [ERROR] 2020-03-29 17:18:34.526+0000 [id=96] INFO c.m.j.c.aci.AciCloud#provision: Using ACI Container template: ztest
2020-03-29T17:18:34.526324603Z: [ERROR] 2020-03-29 17:18:34.526+0000 [id=96] INFO h.s.NodeProvisioner$StandardStrategyImpl#apply: Started provisioning ztest from SaaSBld with 1 executors. Remaining excess workload: 0
2020-03-29T17:18:34.526693417Z: [ERROR] 2020-03-29 17:18:34.526+0000 [id=112036] INFO c.m.j.c.aci.AciCloud$1#call: Add ACI node: ztest-3qs59
2020-03-29T17:18:34.538040658Z: [ERROR] 2020-03-29 17:18:34.537+0000 [id=112036] INFO c.m.j.c.a.AciCleanTask$DeploymentRegistrar#registerDeployment: AzureAciCleanUpTask: registerDeployment: Registering deployment ztest-6rmxcrfl in saas-*****-01
2020-03-29T17:18:34.853962320Z: [ERROR] 2020-03-29 17:18:34.853+0000 [id=112036] INFO o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?

2020-03-29T17:18:35.578729453Z: [ERROR] 2020-03-29 17:18:35.578+0000 [id=112036] INFO c.m.j.c.aci.AciService#createDeployment: Waiting for deployment ztest-6rmxcrfl

from azure-container-agents-plugin.

ravee-rachakonda-by avatar ravee-rachakonda-by commented on August 12, 2024

Logs from the container before it dies out

TX1-XX:aci$ az container logs --name ztest-23n25 --resource-group XX-01 --subscription XX

  • [[ ssh-rsa XXX ]]
  • write_key 'ssh-rsa XXX'
  • local ID_GROUP
    ++ stat -c %U:%G /home/jenkins
  • ID_GROUP=jenkins:jenkins
  • mkdir -p /home/jenkins/.ssh
  • echo 'ssh-rsa XXXXX'
  • chown -Rf jenkins:jenkins /home/jenkins/.ssh
  • chmod 0700 -R /home/jenkins/.ssh
  • [[ 0 -gt 0 ]]
  • grep _
  • env
  • ssh-keygen -A
    ssh-keygen: generating new host keys: DSA
  • exec /usr/sbin/sshd -D -e
    Server listening on 0.0.0.0 port 22.
    Server listening on :: port 22.
    Did not receive identification string from 10.240.255.55 port 53589
    Did not receive identification string from 10.240.255.56 port 61040
    Did not receive identification string from 10.240.255.55 port 53924
    Did not receive identification string from 10.240.255.56 port 61365
    Did not receive identification string from 10.240.255.55 port 54129
    Did not receive identification string from 10.240.255.56 port 61557

from azure-container-agents-plugin.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.