kubecost / features-bugs
A public repository for filing of Kubecost feature requests and bugs. Please read the issue guidelines before filing an issue here.
When going to /cluster-inspect, I'm met with a blank page. Upon inspecting the API requests the frontend is making to the backend, this one is consistently failing with HTTP 500: /model/savings/clusterSizingETL.
/overview page. Cluster breakdown. /cluster-inspect, sometimes failing.
HAR files with reproduced behavior linked in Slack thread below.
Identify all locations in the frontend that are calling /model/savings/clusterSizingETL. Build a graceful failure mode so that the frontend doesn't retry the request too many times, and doesn't end up displaying a blank page.
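A minimal sketch of the graceful failure mode requested above, written in Python for illustration rather than in the frontend's own language. The function name, the attempt limit, and the backoff values are assumptions and not part of Kubecost.

```python
import time


def fetch_with_bounded_retries(fetch, max_attempts=3, base_delay=0.5,
                               fallback=None, sleep=time.sleep):
    """Call fetch() up to max_attempts times, then give up gracefully.

    Returns (data, error). On success error is None; after the final
    failure data is the fallback, so the caller can still render a page.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch(), None
        except Exception as err:  # e.g. an HTTP 500 surfaced as an exception
            last_error = err
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay, 2 * base_delay, ...
                sleep(base_delay * 2 ** attempt)
    return fallback, last_error


# Hypothetical stand-in for a request to /model/savings/clusterSizingETL
# that always fails.
calls = []


def always_failing_fetch():
    calls.append(1)
    raise RuntimeError("HTTP 500")


data, error = fetch_with_bounded_retries(
    always_failing_fetch, fallback={}, sleep=lambda seconds: None)
print(len(calls), data, error)
```

With a non-empty error and a fallback value, the page can show an error banner next to whatever data did load instead of rendering nothing.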
Link to a Slack thread showing that at least four users are running into this issue.
My team has enjoyed using the /cluster-inspect view, as it concisely summarizes all the activity on the cluster. Being unable to use it now disrupts our workflow.
The "Continuous Request Right-Sizing" feature currently uses the max algorithm for recommendations, which causes services with high start-up CPU usage to be overprovisioned.
For example, some of our services spike to ~2 cores at start-up, then drop down to ~0.3 cores when stable. This has two negative effects:
Introduce cpu.request.autoscaling.kubecost.com/algorithm and cpu.request.autoscaling.kubecost.com/q annotations (or similar) to allow the algorithmCPU and qCPU right-size recommendation parameters to be set on a per-workload basis.
It probably makes sense to introduce this for memory as well, for consistency.
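To illustrate why the choice of algorithm matters for the start-up spike described above, here is a small sketch comparing a max-based recommendation with a quantile-based one. The sample series and the nearest-rank quantile helper are assumptions for illustration; they are not Kubecost's actual implementation.

```python
import math


def nearest_rank_quantile(samples, q):
    """Return the q-quantile of samples using the nearest-rank method."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Hypothetical CPU usage in cores: a brief ~2-core start-up spike,
# then steady state at ~0.3 cores.
usage = [2.0] * 2 + [0.3] * 98

max_recommendation = max(usage)
quantile_recommendation = nearest_rank_quantile(usage, 0.95)
print(max_recommendation, quantile_recommendation)
```

In this made-up series the max algorithm sizes the request for the two spike samples, while the 0.95 quantile tracks the steady-state usage.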
Allow arbitrary query parameters to be added to the recommendation API requests (e.g., request.autoscaling.kubecost.com/extraRecommendationParameters: "algorithmCPU=quantile&qCPU=0.95").
This could be useful for allowing the use of alpha/experimental parameters, without making it part of the Cluster Controller's API.
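As a sketch of how such an annotation value could be turned into request parameters, the snippet below parses the example string with Python's standard library. The annotation name comes from the proposal above; how the Cluster Controller would actually consume it is an assumption.

```python
from urllib.parse import parse_qsl, urlencode

# Value of the proposed (not yet existing) annotation
# request.autoscaling.kubecost.com/extraRecommendationParameters
annotation_value = "algorithmCPU=quantile&qCPU=0.95"

# Parse into a plain dict of extra query parameters.
extra_params = dict(parse_qsl(annotation_value))

# Merge with hypothetical base parameters; explicit extras win on conflict.
base_params = {"window": "2d"}
merged = {**base_params, **extra_params}
query_string = urlencode(merged)
print(extra_params)
print(query_string)
```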
No response
What problem are you trying to solve?
I would like to resize my cluster nodes for more efficient utilization using Kubecost's "Rightsize your cluster nodes" savings recommendations.
Describe the solution you'd like
The recommended instance sizes and quantities should abide by Kubernetes best practices, or in the case of a cloud-based environment, the cloud providers' best practices and limitations.
Kubernetes:
Considerations for large clusters
Azure:
Azure Kubernetes Service service limits
AWS:
Amazon EKS - Elastic Network Interface (ENI) max pods
ENI max pods by instance
Google:
GKE max pods per node
Describe alternatives you've considered
Manually calculating instance quantities and sizes.
How would users interact with this feature?
I can envision a few different ways. Looks like 110 pods per node is the most common maximum. Kubecost could set that as the default quantity and then make it adjustable via the advanced settings in the UI.
A future version may take provider maximums into consideration based on provider details, such as AWS EKS nodes using ENI.
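The two ideas above can be sketched as a lower bound on node count derived from a per-node pod limit. The 110 default is the figure quoted above; the ENI-based formula is the commonly documented EKS calculation without prefix delegation, and the instance figures passed in are illustrative assumptions. Function names are hypothetical.

```python
import math

DEFAULT_MAX_PODS_PER_NODE = 110  # common default maximum quoted above


def min_nodes_for_pods(pod_count, max_pods_per_node=DEFAULT_MAX_PODS_PER_NODE):
    """Smallest node count that can hold pod_count pods under the limit."""
    return math.ceil(pod_count / max_pods_per_node)


def eni_max_pods(enis, ipv4_per_eni):
    """EKS-style pod limit: ENIs * (IPv4 addresses per ENI - 1) + 2."""
    return enis * (ipv4_per_eni - 1) + 2


# Illustrative inputs: 500 pods, and an instance type with
# 3 ENIs and 10 IPv4 addresses per ENI.
pods = 500
eni_limit = eni_max_pods(enis=3, ipv4_per_eni=10)
print(min_nodes_for_pods(pods))             # using the 110 default
print(min_nodes_for_pods(pods, eni_limit))  # using the ENI-derived limit
```

A recommendation that needs fewer nodes than this lower bound would not be schedulable regardless of CPU and memory headroom.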
gz#2241
(related to Zendesk ticket kubecost/cost-analyzer-helm-chart#2241)
When using IRSA, Kubecost cannot access AWS EC2 resources and logs the following messages, even when the service account has the correct policy.
I back tested this with 1.101 and 1.102 and all versions have the issue.
error message:
WRN unable to get addresses: operation error EC2: DescribeAddresses, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: c63cf5bd-27d3-4919-8251-08fcf7ce7151, InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.ca-central-1.amazonaws.com/id/2086E4D4C3BEAFFF61F3617142CA5DCC
WRN unable to get disks: operation error EC2: DescribeVolumes, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: 97337482-f0e2-489d-b8e6-c9108a264d8e, InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.ca-central-1.amazonaws.com/id/2086E4D4C3BEAFFF61F3617142CA5DCC
To Reproduce
Steps to reproduce the behavior:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "KubecostSavingsAccess",
"Effect": "Allow",
"Action": [
"ec2:DescribeAddresses",
"ec2:DescribeVolumes"
],
"Resource": "*"
}
]
}
create an IRSA account with the policy:
>eksctl create iamserviceaccount \
--name kubecost-serviceaccount \
--namespace kubecost \
--cluster jesse-temp --region ca-central-1 \
--attach-policy-arn arn:aws:iam::297945954695:policy/jesse-temp-savings-policy \
--override-existing-serviceaccounts \
--approve
install kubecost and view logs
helm install kubecost kubecost/cost-analyzer --version 1.104.1 \
--set serviceAccount.create=false --set serviceAccount.name=kubecost-serviceaccount
Expected behavior
no errors
What impact will this have on your ability to get value out of Kubecost?
savings reports broken for /orphaned-resources
What problem are you trying to solve?
Provide the ability to hide Cloud Costs & Possible savings
Breakout from Issue kubecost/cost-analyzer-helm-chart#1574
Problem: Kubecost data needs to be integrated with BI and other custom FinOps tools. The following solution would help achieve this.
Please refer to the OpenCost implementation of CSV export to an S3 bucket and other cloud targets. This really helps to integrate with existing BI and FinOps tools.
https://www.opencost.io/docs/integrations/csv-export
By doing this you can also enable the current API-based CSV export of cost data to include dates. Currently the JSON export has dates. The CSV export is simple and nice, but dates would make it more usable.
"Kubecost does provide access to our APIs which power all the data that you see in our UI. You can also opt to download any reports you see in the UI as a .csv. Here is a blog we published on how users can integrate Kubecost Data into a Datadog Dashboard."
The blog does not match what is being requested. The blog is specific to the Datadog integration. The ask is an export option for the data, so that we can integrate with any custom solution rather than one particular tool. The OpenCost option of exporting the data to a bucket or storage, for example, is more generic and any tooling can consume it. We do not use Datadog. A sub-ask in this request is adding a date field to the CSV file exported from Kubecost, which currently does not have one.
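A minimal sketch of the shape of export being asked for: one CSV row per day and aggregate, with an explicit date column. The column names and sample rows are hypothetical and do not reflect Kubecost's or OpenCost's actual export schema.

```python
import csv
import io

# Hypothetical column set; the point is the leading "date" column.
FIELDNAMES = ["date", "namespace", "totalCost"]


def allocations_to_csv(rows):
    """Serialize daily cost rows to CSV text with a date column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Made-up sample data for illustration only.
sample_rows = [
    {"date": "2023-11-01", "namespace": "kubecost", "totalCost": 1.25},
    {"date": "2023-11-02", "namespace": "kubecost", "totalCost": 1.5},
]
csv_text = allocations_to_csv(sample_rows)
print(csv_text)
```

The resulting text could then be written to a bucket or other storage target so that any BI tool can consume it.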
No response
What problem are you trying to solve?
We have a request in ZD ticket: 3771 to show costs for local ephemeral storage.
Kubernetes documentation: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#local-ephemeral-storage
Note: In addition to local, there are CSI and Generic ephemeral volumes: https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#types-of-ephemeral-volumes
In their example, they have a deployment which consumes the majority of local ephemeral storage across some number of nodes in the cluster, and they would like to know the deployment's cost associated with the underlying storage volume.
Cluster nodes are configured with:
allocatable:
  ephemeral-storage: "<Storage capacity>"
Deployments have the ephemeral-storage limit / requests:
resources:
  limits:
    ephemeral-storage: <Storage limit>
  requests:
    ephemeral-storage: <Storage request>
Describe the solution you'd like
Visibility into ephemeral storage costs on the allocations page.
With the exception of Generic ephemeral volumes, I believe they are similar to the configMap, downwardAPI, and secret volume types in that they are not associated with a PV. Because of this, the solution may be a new ephemeral-volume allocations cost category that would be visible on the allocations page and when drilling into the Container level. The ability to enable / disable this category may be useful as well.
1.106.2
n/a
Other (specify in description)
I incorrectly see this message when I request a long window; the data does eventually load: https://infra.kceng.dev/cloud?reportTitle=Cumulative+cost+for+last+12+months+by+provider&window=365d&agg=provider
Load this view: https://infra.kceng.dev/cloud?reportTitle=Cumulative+cost+for+last+12+months+by+provider&window=365d&agg=provider
Expect a loading state, but instead shows a data not available message
No response
No response
No response
No response
Hello Team,
I recently deployed Kubecost (version v1.106.4) in my Kubernetes cluster and have set up a scheduled report. However, I'm not receiving the scheduled report emails, and I suspect that the sender email ID and relay server details need to be configured.
I would appreciate any guidance on where and how I can configure these details within Kubecost. Specifically, I would like to know:
The email domain used by Kubecost to send scheduled reports.
The configuration location for setting the sender email ID.
The steps to configure the email/relay server details in Kubecost.
Your assistance in resolving this matter is highly valued. Thank you in advance for your help!
Regards,
Renjith
Currently, Kubecost doesn't support Scaleway as a provider and the pod crashes indefinitely:
2023/08/29 13:20:39 maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
2023-08-29T13:20:39.778948267Z ??? Log level set to info
2023-08-29T13:20:39.779009061Z INF Starting Kubecost cost-model version 1.105.2 (94410c5)
2023-08-29T13:20:39.780996678Z INF Prometheus/Thanos Client Max Concurrency set to 5
2023-08-29T13:20:39.78686936Z INF Success: retrieved the 'up' query against prometheus at: http://prometheus-operated.prometheus.traefik.mesh:9090/
2023-08-29T13:20:39.793785625Z INF Retrieved a prometheus config file from: http://prometheus-operated.prometheus.traefik.mesh:9090/
2023-08-29T13:20:39.803457139Z INF Using scrape interval of 60.000000
2023-08-29T13:20:39.80406909Z INF NAMESPACE: kubecost
2023-08-29T13:20:40.104978244Z INF Done waiting
2023-08-29T13:20:40.105568815Z INF Starting *v1.Namespace controller
2023-08-29T13:20:40.105756237Z INF Starting *v1.Node controller
2023-08-29T13:20:40.10582145Z INF Starting *v1.Pod controller
2023-08-29T13:20:40.10590626Z INF Starting *v1.Service controller
2023-08-29T13:20:40.106037486Z INF Starting *v1.ConfigMap controller
2023-08-29T13:20:40.106141752Z INF Starting *v1.DaemonSet controller
2023-08-29T13:20:40.10621519Z INF Starting *v1.Deployment controller
2023-08-29T13:20:40.106333703Z INF Starting *v1.StatefulSet controller
2023-08-29T13:20:40.106450723Z INF Starting *v1.ReplicaSet controller
2023-08-29T13:20:40.106565679Z INF Starting *v1.PersistentVolume controller
2023-08-29T13:20:40.106637394Z INF Starting *v1.PersistentVolumeClaim controller
2023-08-29T13:20:40.106684403Z INF Starting *v1.StorageClass controller
2023-08-29T13:20:40.106728666Z INF Starting *v1.Job controller
2023-08-29T13:20:40.106767749Z INF Starting *v1beta1.PodDisruptionBudget controller
2023-08-29T13:20:40.106813194Z INF Starting *v1.ReplicationController controller
2023-08-29T13:20:40.10943267Z INF Found ProviderID starting with "scaleway", using Scaleway Provider
2023-08-29T13:20:40.12136691Z INF No asset-report-configs configmap found at install time, using existing configs: configmaps "asset-report-configs" not found
2023-08-29T13:20:40.131145807Z INF No advanced-report-configs configmap found at install time, using existing configs: configmaps "advanced-report-configs" not found
2023-08-29T13:20:40.217114238Z INF No saved-report-configs configmap found at install time, using existing configs: configmaps "saved-report-configs" not found
2023-08-29T13:20:40.416305838Z INF No pricing-configs configmap found at install time, using existing configs: configmaps "pricing-configs" not found
2023-08-29T13:20:40.614670714Z INF No cloud-cost-report-configs configmap found at install time, using existing configs: configmaps "cloud-cost-report-configs" not found
2023-08-29T13:20:40.815326476Z INF No product-configs configmap found at install time, using existing configs: configmaps "product-configs" not found
2023-08-29T13:20:41.014819183Z INF No alert-configs configmap found at install time, using existing configs: configmaps "alert-configs" not found
2023-08-29T13:20:41.21705655Z INF No recurring-budget-rule-configs configmap found at install time, using existing configs: configmaps "recurring-budget-rule-configs" not found
2023-08-29T13:20:41.416224485Z INF No group-report-configs configmap found at install time, using existing configs: configmaps "group-report-configs" not found
2023-08-29T13:20:41.616794296Z INF No budget-configs configmap found at install time, using existing configs: configmaps "budget-configs" not found
2023-08-29T13:20:41.813632291Z INF No group-filters configmap found at install time, using existing configs: configmaps "group-filters" not found
2023-08-29T13:20:42.015814434Z INF No metrics-config configmap found at install time, using existing configs: configmaps "metrics-config" not found
2023-08-29T13:20:42.214287803Z INF No app-configs configmap found at install time, using existing configs: configmaps "app-configs" not found
2023-08-29T13:20:42.547232903Z INF Init: AggregateCostModel cache warming disabled
panic: provider is required: failed to convert cost model provider
goroutine 1 [running]:
github.com/kubecost/kubecost-cost-model/pkg/cmd/costmodel.Initialize()
/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/costmodel/costmodel.go:1577 +0x2db9
github.com/kubecost/kubecost-cost-model/pkg/cmd/costmodel.Execute(0x1?)
/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/costmodel/costmodel.go:2028 +0xf7
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute.newCostModelCommand.func1(0xc0000e3400?, {0x330f85e?, 0x4?, 0x330f862?})
/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/commands.go:43 +0x2f
github.com/spf13/cobra.(*Command).execute(0xc000fb6600, {0x5a1a300, 0x0, 0x0})
/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:916 +0x87c
github.com/spf13/cobra.(*Command).ExecuteC(0xc000fb7200)
/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1040 +0x38d
github.com/spf13/cobra.(*Command).Execute(...)
/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:968
github.com/opencost/opencost/pkg/cmd.Execute(0x0?, {0xc00103fee0, 0x3, 0x3})
/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/opencost/pkg/cmd/commands.go:61 +0x3a5
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute()
/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/commands.go:27 +0x1e5
main.main()
/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/cmd/costmodel/main.go:12 +0x13
Add Scaleway as a supported provider for Kubecost.
No response
No response
v1.107.1
v1.27
AKS
We had SAML working previously without leader/follower. We tried enabling leader/follower with the StatefulSet option, but that still doesn't work. The login keeps redirecting in an infinite loop.
Dashboard must be visible correctly in addition to SAML working.
Kubecost dashboard not visible.
No response
No response
No response
1.107.0
1.26
GKE
I have 30 NFS mount points that I use in my application. These mount points are attached to my pods via NFS PVs, but Kubecost treats these PVs as physical cloud volumes and reports them as such.
Here each NFS volume is reported as a physical volume:
The main NFS server has 3 TB of storage, and for each NFS volume that I attach to my pods, Kubecost reports 3 TB of physical volume.
Create PV and PVC on Kubernetes:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: name-here
spec:
  capacity:
    storage: "3000Gi"
  accessModes:
    - "ReadWriteMany"
  nfs:
    server: filestore.nfs.server
    path: /home/directory001
    path: /mnt/directory001
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: name-here
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 3000Gi
  volumeName: name-here
Kubecost is expected to recognize that these volumes are NFS shares.
Unfortunately, my cost report doesn't show correct values.
No response
https://kubecost.slack.com/archives/CLFV60Y90/p1699553491710209
When reviewing cloud costs, it's really helpful to see spend change over time, which we used to show on the cloud costs view.
Add spend change pills as shown in allocation view
None yet
No response
network-costs-rs, running as a DaemonSet, listens on IPv4 addresses only:
/ # netstat -lntp | grep network-costs
tcp 0 0 0.0.0.0:3001 0.0.0.0:* LISTEN 1/network-costs-rs
As a result, Prometheus running on an IPv6 network cannot scrape its metrics.
It would be great to have an option to listen on IPv6 addresses as well.
No response
No response
v1.106.3
1.24
EKS
I am trying to upgrade from v1.101.3 to v1.106.3 on EKS v1.24, but the Network, Grafana, and Cost Analyzer pods are failing with a CrashLoopBackOff error.
The events say : Back-off restarting failed container
Kubecost should be installed with no issues
No response
No response
No response
No response
Moving kubecost/cost-analyzer-helm-chart#2050 (comment) over from helm chart repo
see above
No response
No response
What problem are you trying to solve?
We want savings information based on the labels that we have at the namespace and app level. The API can already filter by label, which means the cost-model API already knows the label mappings, but it doesn't return the associated labels in the Savings API response.
Describe the solution you'd like
Having those labels in the response of the Savings API (Container Request Right Sizing Recommendation API (V2)) would help teams do further automation, such as building Grafana dashboards and more.
Describe alternatives you've considered
We tried using the filter approach, but it is overkill and doesn't help much with automation or with mining the data for more insights, because the response doesn't include the label mappings.
How would users interact with this feature?
Users can use the API to fetch the label mappings with the Savings API response, and teams can then group those savings by label and derive insights from the responses.
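The kind of grouping described above could look like the following sketch if labels were included in the response. The field names "labels" and "monthlySavings" and the sample records are hypothetical, since the current response does not carry labels and its exact schema is not shown here.

```python
from collections import defaultdict


def savings_by_label(recommendations, label_key):
    """Sum savings per value of label_key; unlabeled items go to '__unlabeled__'."""
    totals = defaultdict(float)
    for item in recommendations:
        value = item.get("labels", {}).get(label_key, "__unlabeled__")
        totals[value] += item["monthlySavings"]
    return dict(totals)


# Made-up records shaped the way the feature request envisions.
recommendations = [
    {"container": "api", "labels": {"team": "payments"}, "monthlySavings": 10.0},
    {"container": "worker", "labels": {"team": "payments"}, "monthlySavings": 5.5},
    {"container": "web", "labels": {"team": "search"}, "monthlySavings": 2.0},
    {"container": "cron", "labels": {}, "monthlySavings": 1.0},
]
totals = savings_by_label(recommendations, "team")
print(totals)
```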
When consuming allocation data via the API, there isn't a reliable method to determine if data points have already been reconciled with the CSP or not. As a comparison, there is a feature in the Kubecost UI that highlights whether or not costs have been already reconciled (although it isn't clear how reliable that functionality is either).
A simple property in each data element in the return response like reconciled: True would be enough. Optionally, it would also be useful to be able to submit this as a query parameter, for example to only fetch costs that have been reconciled.
*CostAdjustment properties being different than 0: although true for most cases, there's likely a variety of scenarios where the adjustment for a reconciled cost might still be 0.
No response
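The *CostAdjustment heuristic mentioned above can be sketched as follows, which also shows where it breaks down. The sample records are invented for illustration; only the idea of checking keys that end in CostAdjustment comes from the issue text.

```python
def looks_reconciled(allocation):
    """Heuristic: treat an item as reconciled if any *CostAdjustment is non-zero."""
    return any(
        key.endswith("CostAdjustment") and value != 0
        for key, value in allocation.items()
    )


# Invented sample items. The second one could have been reconciled with an
# adjustment of exactly 0, which the heuristic cannot distinguish.
adjusted = {"name": "ns-a", "cpuCostAdjustment": 0.0, "ramCostAdjustment": -0.12}
zero_adjusted = {"name": "ns-b", "cpuCostAdjustment": 0.0, "ramCostAdjustment": 0.0}

print(looks_reconciled(adjusted))
print(looks_reconciled(zero_adjusted))
```

An explicit reconciled property in the response would remove this ambiguity.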
cost-analyzer-1.107.1
v1.27.6+f67aeb3
OpenShift
panic: Error in OIDC discovery 'https://keycloak-keycloak-operator.apps-crc.testing/realms/sso/.well-known/openid-configuration': Get "https://keycloak-keycloak-operator.apps-crc.testing/realms/sso/.well-known/openid-configuration": tls: failed to verify certificate: x509: certificate signed by unknown authority
goroutine 1 [running]:
github.com/kubecost/kubecost-cost-model/pkg/cmd/costmodel.Execute(0x1?)
/app/kubecost-cost-model/pkg/cmd/costmodel/costmodel.go:2650 +0x8b9d
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute.newCostModelCommand.func1(0xc001528800?, {0x4855d85?, 0x4?, 0x4855d89?})
/app/kubecost-cost-model/pkg/cmd/commands.go:68 +0x2f
github.com/spf13/cobra.(*Command).execute(0xc001526000, {0x75ac380, 0x0, 0x0})
/go/pkg/mod/github.com/spf13/[email protected]/command.go:916 +0x87c
github.com/spf13/cobra.(*Command).ExecuteC(0xc001527800)
/go/pkg/mod/github.com/spf13/[email protected]/command.go:1040 +0x38d
github.com/spf13/cobra.(*Command).Execute(...)
/go/pkg/mod/github.com/spf13/[email protected]/command.go:968
github.com/opencost/opencost/pkg/cmd.Execute(0x0?, {0xc00143fec0, 0x7, 0x7})
/app/opencost/pkg/cmd/commands.go:61 +0x3a5
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute()
/app/kubecost-cost-model/pkg/cmd/commands.go:43 +0x353
main.main()
/app/kubecost-cost-model/cmd/costmodel/main.go:12 +0x13
### Steps to reproduce
Create a keycloak with self-signed certificate and try to run kubecost
### Expected behavior
OIDC connection works with self-signed certs
### Impact
_No response_
### Screenshots
_No response_
### Logs
_No response_
### Slack discussion
_No response_
### Troubleshooting
- [X] I have read and followed the [issue guidelines](https://github.com/kubecost/features-bugs/blob/main/ISSUE_GUIDELINES.md) and this is a bug impacting only the Kubecost application.
- [X] I have searched other issues in this repository and mine is not recorded.
1.107.1 (f87c784)
1.28
EKS
I have a reasonably large multi-az cluster with KubeCost installed. Network costs daemon is installed and is up and running. No cross-zone traffic is shown in the KubeCost UI (Allocation / Network Costs). There is a large "Adjustment" cost, implying that KubeCost is aware of the spend, but just can't classify the traffic.
I have followed instructions at Network Cost Configuration - Troubleshooting
I think I may have discovered a clue, and I'll include screenshots below. It seems that the Grafana dashboard for network costs is also missing cross-zone and cross-region costs, but when I look at the PromQL that's fetching the metrics, it's clearly wrong (see screenshots below). It's using the wrong labels: sameRegion instead of same_region and sameZone instead of same_zone. Is it possible the cost model engine is also looking at the wrong labels?
As you can see from the screenshots and config below, the classified network traffic metrics are there in prometheus, but not making it all the way to the UI somehow.
Any help would be greatly appreciated!
I would expect to see some traffic in the cross-zone and cross-region columns of the Allocation / Network Costs screen.
One of our main use cases for KubeCost is to help tease out cross zone network costs. Since it's not classifying properly, this is making the tool much less useful.
Here you can see all Prometheus kubecost-networking targets are up:
Here you can see that kubecost_pod_network_egress_bytes_total metrics are being captured:
Here you can see that cross zone and cross region traffic is missing:
And here you can see why. The name of the labels being used in the promql is wrong:
If I fix it, you can see that the data appears. This proves that my classification rules are working. The kubecost-networking daemon is classifying traffic.
I have included my network costs config below.
Here's what things look like in the UI. This is network costs for kube-system namespace:
Here you can see all Kubecost services and pods up and running:
$ k get all --namespace kubecost
NAME READY STATUS RESTARTS AGE
pod/kubecost-cost-analyzer-8779958fd-978f6 2/2 Running 0 29h
pod/kubecost-grafana-79c8884f54-gzrv9 2/2 Running 0 7d4h
pod/kubecost-network-costs-2mfm2 1/1 Running 0 7d4h
pod/kubecost-network-costs-2q645 1/1 Running 0 7d4h
pod/kubecost-network-costs-59x2p 1/1 Running 0 7d4h
pod/kubecost-network-costs-5jv2b 1/1 Running 0 7d4h
pod/kubecost-network-costs-5mrpb 1/1 Running 0 7d4h
pod/kubecost-network-costs-5vd7s 1/1 Running 0 7d4h
pod/kubecost-network-costs-8plj7 1/1 Running 0 7d4h
pod/kubecost-network-costs-8t4mx 1/1 Running 0 7d4h
pod/kubecost-network-costs-bfvxp 1/1 Running 0 7d4h
pod/kubecost-network-costs-bj49l 1/1 Running 0 7d4h
pod/kubecost-network-costs-bpmjf 1/1 Running 0 7d4h
pod/kubecost-network-costs-bvnkn 1/1 Running 0 7d4h
pod/kubecost-network-costs-cgvnd 1/1 Running 0 7d4h
pod/kubecost-network-costs-d5gh6 1/1 Running 0 7d4h
pod/kubecost-network-costs-dfftx 1/1 Running 0 7d4h
pod/kubecost-network-costs-dl9gx 1/1 Running 0 7d4h
pod/kubecost-network-costs-dmbr4 1/1 Running 0 7d4h
pod/kubecost-network-costs-dv86s 1/1 Running 0 7d4h
pod/kubecost-network-costs-dvxgv 1/1 Running 0 7d4h
pod/kubecost-network-costs-f9qnk 1/1 Running 0 7d4h
pod/kubecost-network-costs-gd46d 1/1 Running 0 7d4h
pod/kubecost-network-costs-h6vlp 1/1 Running 0 7d4h
pod/kubecost-network-costs-hwkww 1/1 Running 0 7d4h
pod/kubecost-network-costs-jjhcc 1/1 Running 0 7d4h
pod/kubecost-network-costs-kbhhw 1/1 Running 0 7d4h
pod/kubecost-network-costs-kcxbf 1/1 Running 0 7d4h
pod/kubecost-network-costs-kdgns 1/1 Running 0 7d4h
pod/kubecost-network-costs-kllst 1/1 Running 0 7d4h
pod/kubecost-network-costs-kpts6 1/1 Running 0 7d4h
pod/kubecost-network-costs-kqkwp 1/1 Running 0 7d4h
pod/kubecost-network-costs-mc8v2 1/1 Running 0 7d4h
pod/kubecost-network-costs-n9rd5 1/1 Running 0 7d4h
pod/kubecost-network-costs-nl7dc 1/1 Running 0 7d4h
pod/kubecost-network-costs-nsn9r 1/1 Running 0 6d2h
pod/kubecost-network-costs-q66fg 1/1 Running 0 7d4h
pod/kubecost-network-costs-qkvt5 1/1 Running 0 7d4h
pod/kubecost-network-costs-qn85p 1/1 Running 0 7d4h
pod/kubecost-network-costs-r9s69 1/1 Running 0 7d4h
pod/kubecost-network-costs-rlmzj 1/1 Running 0 7d4h
pod/kubecost-network-costs-rms9h 1/1 Running 0 7d4h
pod/kubecost-network-costs-rxpj9 1/1 Running 0 7d4h
pod/kubecost-network-costs-sqpxp 1/1 Running 0 7d4h
pod/kubecost-network-costs-t6tdh 1/1 Running 0 7d4h
pod/kubecost-network-costs-tl58d 1/1 Running 0 7d4h
pod/kubecost-network-costs-twfxk 1/1 Running 0 7d4h
pod/kubecost-network-costs-tzkqc 1/1 Running 0 7d4h
pod/kubecost-network-costs-v546l 1/1 Running 0 7d4h
pod/kubecost-network-costs-w9h4d 1/1 Running 0 7d4h
pod/kubecost-network-costs-w9lps 1/1 Running 0 7d4h
pod/kubecost-network-costs-wfcq2 1/1 Running 0 7d4h
pod/kubecost-network-costs-wkzmt 1/1 Running 0 7d4h
pod/kubecost-network-costs-wxgjh 1/1 Running 0 7d4h
pod/kubecost-network-costs-wzhzv 1/1 Running 0 7d4h
pod/kubecost-network-costs-zcgb4 1/1 Running 0 7d4h
pod/kubecost-prometheus-server-74b9f65cb5-gl97j 2/2 Running 0 7d4h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubecost-cost-analyzer ClusterIP 172.20.127.37 <none> 9003/TCP,9090/TCP 7d4h
service/kubecost-grafana ClusterIP 172.20.184.57 <none> 80/TCP 7d4h
service/kubecost-network-costs ClusterIP None <none> 3001/TCP 7d4h
service/kubecost-prometheus-server ClusterIP 172.20.46.173 <none> 80/TCP 7d4h
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kubecost-network-costs 54 54 54 54 54 <none> 7d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kubecost-cost-analyzer 1/1 1 1 7d4h
deployment.apps/kubecost-grafana 1/1 1 1 7d4h
deployment.apps/kubecost-prometheus-server 1/1 1 1 7d4h
NAME DESIRED CURRENT READY AGE
replicaset.apps/kubecost-cost-analyzer-675b4d75f5 0 0 0 6d3h
replicaset.apps/kubecost-cost-analyzer-68d47798d8 0 0 0 7d4h
replicaset.apps/kubecost-cost-analyzer-76fd85c97d 0 0 0 6d3h
replicaset.apps/kubecost-cost-analyzer-8779958fd 1 1 1 29h
replicaset.apps/kubecost-cost-analyzer-fff47d548 0 0 0 6d3h
replicaset.apps/kubecost-grafana-79c8884f54 1 1 1 7d4h
replicaset.apps/kubecost-prometheus-server-74b9f65cb5 1 1 1 7d4h
And here is the network costs config:
$ k get cm network-costs-config -o yaml
apiVersion: v1
data:
  config.yaml: |
    destinations:
      cross-region: []
      direct-classification:
      - ips:
        - 10.0.3.0/24
        - 10.0.60.0/22
        - 10.0.72.0/22
        - 10.0.80.0/24
        region: us-east1
        zone: us-east1-a
      - ips:
        - 10.0.4.0/24
        - 10.0.68.0/22
        - 10.0.76.0/22
        - 10.0.81.0/24
        region: us-east1
        zone: us-east1-b
      - ips:
        - 10.0.5.0/24
        - 10.0.36.0/22
        - 10.0.82.0/24
        region: us-east1
        zone: us-east1-c
      - ips:
        - 10.0.32.0/22
        - 10.0.56.0/22
        - 10.0.83.0/24
        - 10.0.92.0/24
        region: us-east1
        zone: us-east1-d
      - ips:
        - 10.0.93.0/24
        region: us-east1
        zone: us-east1-e
      - ips:
        - 10.0.40.0/22
        - 10.0.64.0/22
        - 10.0.84.0/24
        - 10.0.94.0/24
        region: us-east1
        zone: us-east1-f
      in-region: []
      in-zone:
      - 127.0.0.0/8
      - 169.254.0.0/16
      - 172.16.0.0/12
      - 192.168.0.0/16
      internet: []
    services:
      amazon-web-services: true
      azure-cloud-services: false
      google-cloud-services: false
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: kubecost
    meta.helm.sh/release-namespace: kubecost
  creationTimestamp: "2023-12-05T14:39:35Z"
  labels:
    app: cost-analyzer
    app.kubernetes.io/instance: kubecost
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cost-analyzer
    helm.sh/chart: cost-analyzer-1.107.1
  name: network-costs-config
  namespace: kubecost
  resourceVersion: "2931042379"
  uid: d4064385-5f4c-402c-8505-206fc6b89cb1
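To make the intent of the direct-classification rules above concrete, here is a rough sketch of how CIDR-to-zone rules can classify a source and destination pair. It uses a subset of the CIDRs from the config; the function names and result strings are hypothetical, and this is not how the network-costs daemon is actually implemented.

```python
import ipaddress

# Subset of the direct-classification rules from the config above.
ZONE_CIDRS = {
    "us-east1-a": ["10.0.3.0/24", "10.0.60.0/22"],
    "us-east1-b": ["10.0.4.0/24", "10.0.68.0/22"],
}


def zone_of(ip):
    """Return the zone whose CIDR contains ip, or None if unclassified."""
    address = ipaddress.ip_address(ip)
    for zone, cidrs in ZONE_CIDRS.items():
        if any(address in ipaddress.ip_network(cidr) for cidr in cidrs):
            return zone
    return None


def classify(src_ip, dst_ip):
    """Classify traffic as 'same-zone', 'cross-zone', or 'unclassified'."""
    src_zone, dst_zone = zone_of(src_ip), zone_of(dst_ip)
    if src_zone is None or dst_zone is None:
        return "unclassified"
    return "same-zone" if src_zone == dst_zone else "cross-zone"


print(classify("10.0.3.5", "10.0.61.7"))  # both addresses fall in us-east1-a rules
print(classify("10.0.3.5", "10.0.69.1"))  # us-east1-a to us-east1-b
print(classify("10.0.3.5", "8.8.8.8"))    # destination matches no rule
```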
No response
106.2
n/a
EKS
Back button in browser stopped working.
Return to the previous view of cloud data.
Instead of expected behavior the back button did not change the page. This makes browsing data hard in this particular moment but also causes me to lose some faith in the Kubecost product because most users will view this as table stakes.
No response
No response
No response
When looking at daily/hourly trends over time, I'm unable to get a sense for movement by looking at the graph tooltip.
It would be helpful to show at least one more decimal place, i.e. $2k --> $2.1k
Show two more decimal places...
This same concept may apply to other graphs...
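A small sketch of the tooltip formatting being requested, showing how one extra decimal place separates values that otherwise collapse to the same label. The function name and the thresholds are assumptions for illustration, not the UI's actual formatter.

```python
def format_compact_usd(amount, decimals=1):
    """Format a dollar amount compactly, e.g. 2130 -> '$2.1k' with decimals=1."""
    if abs(amount) >= 1000:
        return f"${amount / 1000:.{decimals}f}k"
    return f"${amount:.{decimals}f}"


# Two daily totals that are indistinguishable without a decimal place.
day_one, day_two = 2130, 2180
print(format_compact_usd(day_one, decimals=0), format_compact_usd(day_two, decimals=0))
print(format_compact_usd(day_one), format_compact_usd(day_two))
```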
1.106.3
1.25.6
AKS
We deployed kubecost using the following commands
The Prometheus pods and the pods below are stuck in the ContainerCreating state:
kubecost-grafana-57c8b5d877-v4xwv 0/2 ContainerCreating 0 110m
kubecost-kube-state-metrics-68459c8d5f-p78sn 0/1 ContainerCreating 0 110m
The URL needs to be accessible and the details need to load. Currently it's stuck in the state below.
No response
PS C:\Users\nidicula\OneDrive - RM PLC\WORK\AKSClusterIssues\KubeCostSetup> kubectl get events --sort-by=.metadata.creationTimestamp --field-selector type!=Normal -n kubecost
LAST SEEN TYPE REASON OBJECT MESSAGE
7m30s Warning FailedMount pod/kubecost-kube-state-metrics-68459c8d5f-p78sn Unable to attach or mount volumes: unmounted volumes=[kube-api-access-hrzkm], unattached volumes=[kube-api-access-hrzkm]: timed out waiting for the condition
7m35s Warning FailedMount pod/kubecost-prometheus-server-7f745bf6f4-65wr8 Unable to attach or mount volumes: unmounted volumes=[kube-api-access-4xjrf], unattached volumes=[config-volume kube-api-access-4xjrf storage-volume]: timed out waiting for the condition
2m46s Warning FailedMount pod/kubecost-grafana-57c8b5d877-v4xwv (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-bwnlz" : chown c:\var\lib\kubelet\pods\c8ce496a-54ea-46fc-802b-6994d5df6ec1\volumes\kubernetes.io~projected\kube-api-access-bwnlz\..2023_10_25_08_46_05.1185287298\token: not supported by windows
2m46s Warning FailedMount pod/kubecost-kube-state-metrics-68459c8d5f-p78sn (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-hrzkm" : chown c:\var\lib\kubelet\pods\d2eef8ec-6d67-43ad-a2bf-4f1219c6a978\volumes\kubernetes.io~projected\kube-api-access-hrzkm\..2023_10_25_08_46_05.1332739432\token: not supported by windows
2m44s Warning FailedMount pod/kubecost-prometheus-server-7f745bf6f4-65wr8 (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-4xjrf" : chown c:\var\lib\kubelet\pods\637f2370-8210-4b79-85db-0a0c35f247f6\volumes\kubernetes.io~projected\kube-api-access-4xjrf\..2023_10_25_08_46_07.4008867723\token: not supported by windows
12m Warning FailedMount pod/kubecost-prometheus-server-7f745bf6f4-65wr8 Unable to attach or mount volumes: unmounted volumes=[kube-api-access-4xjrf], unattached volumes=[storage-volume config-volume kube-api-access-4xjrf]: timed out waiting for the condition
27m Warning FailedMount pod/kubecost-grafana-57c8b5d877-v4xwv Unable to attach or mount volumes: unmounted volumes=[kube-api-access-bwnlz], unattached volumes=[sc-dashboard-volume kube-api-access-bwnlz config ldap sc-dashboard-provider storage]: timed out waiting for the condition
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-k5xbr 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-gmsf8 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-m2bw6 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-m9ldv 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-rhc7t 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-rpfll 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-xqpb5 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m Warning FailedScheduling pod/kubecost-prometheus-node-exporter-2ns68 0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
12m Warning FailedMount pod/kubecost-grafana-57c8b5d877-v4xwv Unable to attach or mount volumes: unmounted volumes=[kube-api-access-bwnlz], unattached volumes=[config ldap sc-dashboard-provider storage sc-dashboard-volume kube-api-access-bwnlz]: timed out waiting for the condition
7m35s Warning FailedMount pod/kubecost-grafana-57c8b5d877-v4xwv Unable to attach or mount volumes: unmounted volumes=[kube-api-access-bwnlz], unattached volumes=[storage sc-dashboard-volume kube-api-access-bwnlz config ldap sc-dashboard-provider]: timed out waiting for the condition
Describe the bug
A user reported they noticed GPU costs on their cluster. After looking into the details of their environment, we noticed that they use the aws-virtual-gpu-device-plugin to manage their GPU devices on their cluster. I was able to reproduce the same issue by deploying an AWS GPU-supported node and deploying the controller. Before deploying the DaemonSet controller, I had valid values displaying for the underlying node GPU cost, but after deploying, my node_gpu_count metric emitted from /model/metrics is 0.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Idle should still be associated with the node to attribute total cost to an allocation correctly.
Screenshots
Namespace with GPU related to Negative Idle:
Node with no controller deployed:
Node deployed, then controller added:
Prometheus metrics corresponding with toggling the label for the controller DaemonSet:
What problem are you trying to solve?
We consume Kubecost data via API, but we build reports in the UI to quickly verify the data. Mapping the UI data to the matching API can be difficult depending on options (aggregating by labels, filtering by labels, sharing idle, etc.) especially because the UI uses a different API than the Allocation API.
Describe the solution you'd like
It would be helpful if there were a way to translate what I am viewing in the UI to the matching API call.
Describe alternatives you've considered
Trial and error matching up UI and API calls.
How would users interact with this feature?
Ideally, there would be an option on the allocation page to "Show me the API call."
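A minimal sketch of what a "Show me the API call" helper could produce, assuming the UI options map onto Allocation API query parameters named `window`, `aggregate`, `accumulate` and `shareIdle`. The base URL and the function name are hypothetical, and the real UI may use additional or different parameters.

```python
from urllib.parse import urlencode


def allocation_api_url(base_url, window, aggregate, share_idle=False, accumulate=True):
    """Build an Allocation API URL from options selected in a report view."""
    params = {
        "window": window,
        # Multiple aggregation fields are joined with commas.
        "aggregate": ",".join(aggregate),
        "accumulate": str(accumulate).lower(),
        "shareIdle": str(share_idle).lower(),
    }
    return f"{base_url.rstrip('/')}/model/allocation?{urlencode(params)}"


# Hypothetical base URL for a port-forwarded Kubecost instance.
print(allocation_api_url("http://localhost:9090", "7d", ["namespace"]))
```

Showing a URL like this next to a report would let users copy the equivalent request instead of matching UI and API results by trial and error.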
1.106.2
1.23.17
EKS
After setting up the monthly run rate allocation report with time window last week, downloading the report as pdf (and scheduling it to be sent to email recipients) shows the cumulative cost for last week instead. The allocation report is correct through the UI, but not when exported as a pdf file (and when it is sent as a scheduled email).
Last Week as date option, aggregated by Namespace, Cost for chart, and Monthly Rate for cost metric.
An allocation report with a monthly run rate based on cost from last week should be generated with the correct numbers. If a report is scheduled to be sent via email, the same allocation report should be generated.
Describe the bug
If cloud-integration is used, /diagnostics "pricing sources" leads users to believe they have a configuration issue.
When kubecost detects that cloud-integration is present, disable "Cloud Cost Settings" in /settings
To Reproduce
.Values.kubecostProductConfigs.cloudIntegrationSecret
Expected behavior
Users should not be able to configure "cloud cost settings" when using cloud-integrations
Cloud Costs panel in /overview should link to: https://guide.kubecost.com/hc/en-us/articles/4407595968919-Setting-Up-Cloud-Integrations
Potentially bring /diagnostics "Cloud Integrations" to top of page as this is the preferred method for cloud billing integration.
Collect logs (please complete the following information):
NA
1.106.2
1.27.4
EKS
We've configured Kubecost with AWS price reconciliation as described here: https://docs.kubecost.com/install-and-configure/install/cloud-integration/aws-cloud-integrations
After the configuration it looks like everything is working fine:
but we are still getting a warning about "A pricing source is unavailable: Savings Plan, Reserved Instance, and Out-Of-Cluster". We are not sure if this is a bug or if we're missing some configuration:
No warnings, or else errors that point to why something isn't working as intended.
1.106.4
1.25.11
AKS
We deployed kubecost using the following helm command
helm install kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --namespace kubecost --create-namespace --set kubecostToken="bmlkaWN1bGFnZW9yZ2VAaW4ucm0uY29txm343yadf98"
As per the following link
https://www.kubecost.com/install#show-instructions
The containers should be running, but the pods are stuck in the ContainerCreating state.
Please find the details below.
The pods are not in the Running state; they are in ContainerCreating, and Kubecost cannot be accessed.
PS /home/nithin> kubectl logs kubecost-cost-analyzer-7596f84b9d-tv5w2 -n kubecost
Defaulted container "cost-model" out of: cost-model, cost-analyzer-frontend
Error from server (BadRequest): container "cost-model" in pod "kubecost-cost-analyzer-7596f84b9d-tv5w2" is waiting to start: ContainerCreating
In some namespaces there seems to be no activity, and Kubecost does not show them in the 'Cumulative cost' view. However, finance staff would like to have them listed (even with zero activity/cost). Can Kubecost provide an option to display these namespaces if required, or some way to inject them into the report?
What problem are you trying to solve?
Currently RBAC allows filtering in cost allocation. A customer is requesting filtering in assets as well.
Ideally, ensure that all savings and advanced reports adhere to the filters as well.
Describe the solution you'd like
Customer has stated: "We have a bunch of groups and every group should have access to exactly one cluster. Also in every cluster we want to exclude visibility to a few namespaces."
The primary focus for the next revision could be on namespaces and clusters.
Describe alternatives you've considered
We offered the option of using the API to pull data into another dashboard, but the Kubecost UI has other valuable features.
How would users interact with this feature?
filters.json configmap
Describe the bug
On the abandoned workloads savings page, PVCs deployed to the cluster have a line item for "x-unmounted-pvcs," regardless of whether those PVCs are attached to a pod. Perhaps this is intended. Still, it seems to pull all "-unmounted-pvc" values.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I'm unsure how we expect to show "abandoned pvcs" since this page is related to traffic, and I'm unsure how that metric relates to PVCs. An explanation on the page to describe how to interpret this data would be helpful.
What impact will this have on your ability to get value out of Kubecost?
This helps clarify what the information on this page is trying to portray.
Please share the support case, if any
ZD 4488
v1.106.0
1.26.3
AKS
Kubecost configured and correctly integrated with Azure with no errors reported through the UI Bug Report.
Even so, when trying to get the ExternalCost value through the API, this always returns 0.
Result:
{ "code": 200, "data": [ { "1": { "name": "1", "properties": { "namespace": "1", "namespaceLabels": { "istio_injection": "enabled", "kubernetes_io_metadata_name": "1", "zen_security": "enabled" } }, "window": { "start": "2023-10-03T00:00:00Z", "end": "2023-10-10T00:00:00Z" }, "start": "2023-10-03T00:00:00Z", "end": "2023-10-09T15:50:00Z", "minutes": 9590, "cpuCores": 0.06648, "cpuCoreRequestAverage": 0.06643, "cpuCoreUsageAverage": 0.01415, "cpuCoreHours": 10.62639, "cpuCost": 0.92469, "cpuCostAdjustment": 0, "cpuEfficiency": 0.21304, "gpuCount": 0, "gpuHours": 0, "gpuCost": 0, "gpuCostAdjustment": 0, "networkTransferBytes": 3323627386.04374, "networkReceiveBytes": 6739244814.07457, "networkCost": 0.00232, "networkCrossZoneCost": 0, "networkCrossRegionCost": 0, "networkInternetCost": 0.00232, "networkCostAdjustment": 0, "loadBalancerCost": 0, "loadBalancerCostAdjustment": 0, "pvBytes": 0, "pvByteHours": 0, "pvCost": 0, "pvs": null, "pvCostAdjustment": 0, "ramBytes": 762101360.72892, "ramByteRequestAverage": 749403818.31074, "ramByteUsageAverage": 465442891.87187, "ramByteHours": 121809200823.17249, "ramCost": 0.7994, "ramCostAdjustment": 0, "ramEfficiency": 0.62108, "externalCost": 0, "sharedCost": 0.03852, "totalCost": 1.76493, "totalEfficiency": 0.40224, "proportionalAssetResourceCosts": {}, "lbAllocations": null, "sharedCostBreakdown": {} } } ] }
Get the correct value from ExternalCost.
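To check this across many namespaces rather than by eye, a small Python sketch like the following could total `externalCost` per allocation name. It assumes only the response shape shown in the result above (a `data` list of objects keyed by allocation name); the function name and the trimmed sample payload are illustrative.

```python
import json


def external_cost_by_name(response_text):
    """Sum the externalCost field per allocation name across all returned windows."""
    payload = json.loads(response_text)
    totals = {}
    for allocation_set in payload.get("data", []):
        # An entry in "data" may be null when a window has no allocations.
        for name, allocation in (allocation_set or {}).items():
            totals[name] = totals.get(name, 0.0) + allocation.get("externalCost", 0.0)
    return totals


# Trimmed-down version of the response shown above.
sample = '{"code": 200, "data": [{"1": {"name": "1", "externalCost": 0, "totalCost": 1.76493}}]}'
print(external_cost_by_name(sample))
```

If every name maps to 0.0 even though the Azure integration reports no errors, that supports the behavior described in this report.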
#5155
1.108.1
1.24
EKS
node_total_hourly_cost metric gives the wrong price tag...
I noticed it in several ec2 types, especially c family…even more strange is that in some clusters it does show the right price:
This happened on version 1.105.1 and again after upgrading to 1.108.1, which is the latest at the moment. I was hoping it was a known bug that had already been fixed in newer versions...
The query node_total_hourly_cost{instance_type="c5d.18xlarge"} returns $3.46.
wrong cost estimation for workloads
https://kubecost.slack.com/archives/CE76NJE6S/p1704884137548639
1.98
1.25
EKS
Hello folks!
After upgrading our EKS cluster to 1.25, we noticed that Kubecost stopped working due to some deprecations in the PSP features (as stated here kubecost/cost-analyzer-helm-chart#1773 (comment)).
After some days of research, we found that workaround and decided to apply it. It worked as expected, and Kubecost is now back online.
However, we noticed that we lost our old metrics, and we only have metrics from today onward. Checking the Kubecost logs, I found the following (pasted in the logs section).
Given this, is our data still there but somehow not being read, or did we hit another issue and lose it after pushing the updated helm chart to Argo?
thanks
The old metrics should still exist in the console.
Big: FinOps can't get any value from it.
2024-01-10T18:38:52.833535279Z WRN CostModel.ComputeAllocation: Node spot query result for missing node: cluster-one/fargate-ip-10-149-241-43.ec2.internal
2024-01-10T18:38:52.836968532Z INF ETL: Allocation[1h]: AggregatedStore[UDejW]: run: aggregated [2024-01-10T18:00:00+0000, 2024-01-10T19:00:00+0000) from 132 to 46 in 733.849µs
2024-01-10T18:38:52.840605258Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T14:00:00+0000, 2024-01-10T15:00:00+0000) from 0 to 0 in 460ns
2024-01-10T18:38:52.878688631Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T15:00:00+0000, 2024-01-10T16:00:00+0000) from 0 to 0 in 380ns
2024-01-10T18:38:52.918977102Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T16:00:00+0000, 2024-01-10T17:00:00+0000) from 0 to 0 in 370ns
2024-01-10T18:38:52.94252342Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T17:00:00+0000, 2024-01-10T18:00:00+0000) from 0 to 0 in 350ns
2024-01-10T18:38:52.994287847Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T18:00:00+0000, 2024-01-10T19:00:00+0000) from 17 to 2 in 1.534649ms
2024-01-10T18:39:01.650451796Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:39:01.650540317Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:39:01.65316764Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:39:01.653273281Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:40:01.694239625Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:40:01.694317886Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:40:01.696444753Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:40:01.696531594Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:40:07.488728344Z INF http: named cookie not present
2024-01-10T18:40:07.488836665Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:07.489532994Z INF ETL: Allocation: QueryAllocation([2024-01-03T18:40:07+0000, 2024-01-10T18:40:07+0000), [cluster]) from AggregatedStore[1d] 602.178µs [query 263.583µs] [idle/tenancy 480ns] [external 330ns] [aggregate 337.105µs] [accumulate 440ns] [stop 240ns]
2024-01-10T18:40:07.490493286Z INF http: named cookie not present
2024-01-10T18:40:07.490555817Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:07.490567657Z INF http: named cookie not present
2024-01-10T18:40:07.490665469Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:07.491098084Z INF ETL: QuerySummaryAllocation([2024-01-10T14:40:07+0000, 2024-01-10T18:40:07+0000), [namespace]) from AggregatedStore[1h] 464.366µs [query 322.865µs] [idle/tenancy 25.52µs] [external 380ns] [aggregate 114.971µs] [accumulate 400ns] [stop 230ns]
2024-01-10T18:40:07.49158431Z INF ETL: QuerySummaryAllocation([2024-01-07T18:40:07+0000, 2024-01-10T18:40:07+0000), [cluster]) from AggregatedStore[1d] 372.185µs [query 251.633µs] [idle/tenancy 18.641µs] [external 520ns] [aggregate 100.361µs] [accumulate 810ns] [stop 220ns]
2024-01-10T18:40:08.063251158Z INF http: named cookie not present
2024-01-10T18:40:08.06332345Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.064087869Z INF ETL: QuerySummaryAllocation([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [namespace]) from AggregatedStore[1h] 658.648µs [query 362.115µs] [idle/tenancy 180.052µs] [external 640ns] [aggregate 115.231µs] [accumulate 390ns] [stop 220ns]
2024-01-10T18:40:08.06967862Z ERR ETL: failed to merge cloud usage: error merging cloud usage: MergeAssetSetRanges failed: expected range length 24, but got 1
2024-01-10T18:40:08.069887802Z INF ETL: Asset: QueryAsset([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [type]) from ETLStore[1h] 648.518µs [query 439.196µs] [cloud 121.551µs] [aggregate 86.951µs] [accumulate 510ns] [stop 310ns]
2024-01-10T18:40:08.079095549Z INF http: named cookie not present
2024-01-10T18:40:08.079168321Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.079718088Z INF ETL: QuerySummaryAllocation([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from AggregatedStore[1h] 471.256µs [query 338.014µs] [idle/tenancy 54.551µs] [external 230ns] [aggregate 78.071µs] [accumulate 220ns] [stop 170ns]
2024-01-10T18:40:08.080747351Z INF http: named cookie not present
2024-01-10T18:40:08.080843862Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.08073867Z INF ETL: Asset: QueryAsset([2024-01-03T18:40:08+0000, 2024-01-10T18:40:08+0000), [service]) from ETLStore[1d] 439.355µs [query 374.264µs] [cloud 11.6µs] [aggregate 52.931µs] [accumulate 310ns] [stop 250ns]
2024-01-10T18:40:08.264739847Z INF ETL: Allocation: QueryAllocation([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster node namespace pod container]) from ETLStore[1d] 1.658131ms [query 1.348698ms] [idle/tenancy 1.51µs] [external 230ns] [aggregate 306.733µs] [accumulate 400ns] [stop 560ns]
2024-01-10T18:40:08.26576271Z INF ETL: Allocation: QueryAllocation([2024-01-08T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster node namespace pod controller]) from ETLStore[1d] 2.792447ms [query 1.865884ms] [idle/tenancy 820ns] [external 270ns] [aggregate 809.321µs] [accumulate 115.902µs] [stop 250ns]
2024-01-10T18:40:08.265991922Z INF [Profiler] 3.090089ms: Savings: abandonedWorkloads
2024-01-10T18:40:08.266806192Z INF ETL: Allocation: QueryAllocation([2024-01-08T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster node namespace pod container]) from ETLStore[1d] 3.738708ms [query 3.418894ms] [idle/tenancy 660ns] [external 260ns] [aggregate 318.234µs] [accumulate 410ns] [stop 250ns]
2024-01-10T18:40:08.26737143Z INF [Profiler] 4.343265ms: Savings: requestSizing
2024-01-10T18:40:08.267471681Z INF ETL: Asset: QueryAsset([2024-01-08T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from ETLStore[1d] 297.653µs [query 239.423µs] [cloud 4.65µs] [aggregate 48.65µs] [accumulate 3.33µs] [stop 1.6µs]
2024-01-10T18:40:08.268544885Z INF [Profiler] 5.433419ms: Savings: clusterSizing
2024-01-10T18:40:08.273082982Z INF http: named cookie not present
2024-01-10T18:40:08.273221034Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.27366333Z INF http: named cookie not present
2024-01-10T18:40:08.27369793Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.27446743Z INF ETL: QuerySummaryAllocation([2024-01-03T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from AggregatedStore[1d] 1.150625ms [query 308.914µs] [idle/tenancy 68.001µs] [external 440ns] [aggregate 772.32µs] [accumulate 700ns] [stop 250ns]
2024-01-10T18:40:08.276718678Z INF ETL: QuerySummaryAllocation([2024-01-03T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from AggregatedStore[1d] 2.172538ms [query 1.683962ms] [idle/tenancy 104.601µs] [external 690ns] [aggregate 382.355µs] [accumulate 600ns] [stop 330ns]
2024-01-10T18:40:08.427697895Z INF Found Discount for InstanceType: t3a.xlarge of 0.00
2024-01-10T18:40:08.427784386Z INF [Turndown Savings] Failed to locate 'instance_type' on node pricing metric.
2024-01-10T18:40:08.427811097Z INF Found Discount for InstanceType: of 0.00
2024-01-10T18:40:08.427837787Z INF [Turndown Savings] Failed to locate 'instance_type' on node pricing metric.
2024-01-10T18:40:08.427859757Z INF Found Discount for InstanceType: of 0.00
2024-01-10T18:40:08.427907348Z INF Found Discount for InstanceType: t3a.large of 0.00
2024-01-10T18:40:08.427956138Z INF Found Discount for InstanceType: t3a.medium of 0.00
2024-01-10T18:40:08.428006209Z INF [Profiler] 165.524482ms: Savings: nodeTurndown
2024-01-10T18:41:01.739217306Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:41:01.739297098Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:41:01.741587906Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:41:01.741670107Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.770136424Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.770219475Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.772953Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.77304115Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
======= snip=========
What problem are you trying to solve?
My organization has a predefined list of allowed instance sizes. The cluster rightsizing savings report recommends instances sizes and families that are disallowed by my organization.
Describe the solution you'd like
A way to define what instances/families are allowed to be suggested. Ideally, a flexible way to define what instances are allowed or disallowed based on names, families, and potentially instance properties(generation, region, network features, GPU, etc.). A good MVP would be simple name matching for allowed/disallowed.
Describe alternatives you've considered
Manually picking instances my organization supports based on the specs of the instances Kubecost suggests (vCPU, GB of memory, etc.)
How would users interact with this feature?
helm, configmap, API, download from object storage, settings, etc.
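The "simple name matching" MVP described in this request could look roughly like the Python sketch below. It uses shell-style wildcards so that a family such as `m5.*` can be listed once; the function names and the rule that the disallow list takes precedence are assumptions made for illustration, not existing Kubecost behavior.

```python
from fnmatch import fnmatchcase


def is_allowed(instance_type, allowed, disallowed):
    """Return True if an instance type passes the allow and disallow patterns."""
    # Assumption: a disallow match always wins over an allow match.
    if any(fnmatchcase(instance_type, pattern) for pattern in disallowed):
        return False
    # An empty allow list means "everything not explicitly disallowed".
    return not allowed or any(fnmatchcase(instance_type, pattern) for pattern in allowed)


def filter_recommendations(candidates, allowed=(), disallowed=()):
    """Keep only the candidate instance types the organization permits."""
    return [c for c in candidates if is_allowed(c, allowed, disallowed)]


candidates = ["m5.large", "c5d.18xlarge", "t3a.medium"]
print(filter_recommendations(candidates, allowed=["m5.*", "t3a.*"], disallowed=["*.medium"]))
```

Matching on instance properties (generation, region, GPU) would need pricing metadata in addition to the name and is not covered by this sketch.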
What problem are you trying to solve?
Each namespace has a PVC that maps to an Azure Storage Account (via Blob CSI). The Storage is (a) already accounted for in other costing and (b) is not actively using 10TiB of data (it says it has) and therefore should not be priced as such in Kubecost.
Describe the solution you'd like
A way to allow users to ignore specific resources through a configuration within Kubecost
Describe alternatives you've considered
Using a relabel config to drop the metrics from Prometheus directly
How would users interact with this feature?
Configuration through values.yaml, maybe through an ignoredResources section, which can specify a string for resources to ignore. "{namespace}/{objectType}/{objectName}" or "kubecost/persistentvolumeclaim/kubecost-cost-analyzer". Ideally, this includes wildcard support to handle multiple namespaces or objects.
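To make the proposed "{namespace}/{objectType}/{objectName}" format concrete, here is a small Python sketch of how such patterns might be matched with wildcard support. The function name, the case-insensitive comparison, and the choice of shell-style wildcards are all assumptions; no such setting exists in Kubecost as far as this request indicates.

```python
from fnmatch import fnmatchcase


def is_ignored(namespace, object_type, object_name, ignored_patterns):
    """Check a resource against '{namespace}/{objectType}/{objectName}' patterns."""
    # Lowercase both sides so "PersistentVolumeClaim" matches "persistentvolumeclaim".
    key = f"{namespace}/{object_type}/{object_name}".lower()
    return any(fnmatchcase(key, pattern.lower()) for pattern in ignored_patterns)


patterns = [
    "kubecost/persistentvolumeclaim/kubecost-cost-analyzer",
    "*/persistentvolumeclaim/blob-*",  # hypothetical PVC naming scheme
]
print(is_ignored("team-a", "PersistentVolumeClaim", "blob-data", patterns))
print(is_ignored("team-a", "pod", "web", patterns))
```

Note that with shell-style matching `*` also matches `/`, so a pattern such as `team-a/*` would ignore every object in that namespace; a stricter implementation might match each of the three segments separately.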
What problem are you trying to solve?
In /allocations in the UI, users are able to "Inspect Details" when aggregating by Namespace. This opens the /details page in a new browser tab. This is shown as an option when aggregating by namespace, but not when aggregating from a group like "Owner" or "Product"
[Examples of the "Inspect Details" pop-up for aggregation:namespace]
You are able to edit the endpoint URL directly and specify the name + type (shown below), and it brings up the /details page for that selection. In my case, I will use "type=product,name=grafana", which works but this /details page is missing the "Cloud Costs" and "Network Costs".
If I go to Advanced Reporting, I can see the "product:grafana" line item does have associated external cost, so it should also show within the /details inspector page.
[Example of a different /details page for "namespace:kubecost" - this one showing the Cloud + Network Costs]
Describe the solution you'd like
Add ability to pull up the Inspect Details (/details) for any aggregation in the UI. This would behave just like it does in v1.99 /allocations UI for "namespace"
Within each new /details page, add Cloud Costs and Network Costs that are currently missing when you go to the URL directly.
This should ideally be added to Advanced Reporting as well. I should be able to click and get to the "Inspect Details" page from any "aggregation" row where possible.
Describe alternatives you've considered
If these pages need to be accessed via hard coding the URL vs adding an "Inspect Details" hyperlink to each reporting row in UI, that would be fine, but the cloud costs + network costs that are missing when going direct to the URL should be added
How would users interact with this feature?
They would click on the row name within the UI and "Inspect Details" when aggregating by Owner/Team/Department/Environment (any of the built-in "aggregation groups"). This would be an option in any reporting view that makes sense, including in the new Advanced Reporting.
106.2
n/a
EKS
Most other Kubecost views are sorted by total costs, but this particular view of cloud costs is not.
OR
Individual items sorted by descending cost.
Was confused how "costs could be $4k+ but then have the biggest item be $6" then saw that there were bigger items in the graph tool tip... thought there was a major data problem... and then realized that data wasn't sorted like in other views.
When I look at data over long windows, e.g. quarters or years, I'd really like to view it by weeks or months.
The proposal is to add a step size to this window that supports default, daily, weekly, and monthly.
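A minimal sketch of what a step size would do to the underlying data, assuming costs are available as one value per day. The function name and the choice of Monday as the start of a week are illustrative assumptions; the "default" option from the proposal is left out because its meaning is not specified here.

```python
from datetime import date, timedelta


def bucket_costs(daily_costs, step="daily"):
    """Re-aggregate {date: cost} into daily, weekly or monthly buckets."""
    buckets = {}
    for day, cost in sorted(daily_costs.items()):
        if step == "daily":
            key = day
        elif step == "weekly":
            # Key each week by its Monday.
            key = day - timedelta(days=day.weekday())
        elif step == "monthly":
            key = day.replace(day=1)
        else:
            raise ValueError(f"unsupported step: {step}")
        buckets[key] = buckets.get(key, 0.0) + cost
    return buckets


daily = {date(2024, 1, 29): 10.0, date(2024, 1, 31): 5.0, date(2024, 2, 1): 2.0}
print(bucket_costs(daily, "weekly"))
print(bucket_costs(daily, "monthly"))
```

The same three days land in one weekly bucket but two monthly buckets, which is the kind of difference a step selector would expose for quarter- or year-long windows.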
When consuming allocation data via the API, there isn't a reliable method to determine if data points have already been reconciled with the CSP or not. As a comparison, there is a feature in the Kubecost UI that highlights whether or not costs have been already reconciled (although it isn't clear how reliable that functionality is either).
A simple property in each data element in the return response like reconciled: True would be enough. Optionally, it would also be useful to be able to submit this as a query parameter. For example, to only fetch costs that have been reconciled.
*CostAdjustment properties being different than 0: although true for most cases, there's likely a variety of scenarios where the adjustment for a reconciled cost might still be 0.
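For reference, the workaround described above can be sketched in a few lines of Python. The adjustment field names are the ones that appear in Allocation API responses elsewhere in this tracker; the function name is hypothetical, and as the issue notes, a result of False does not prove that a cost is unreconciled.

```python
ADJUSTMENT_FIELDS = (
    "cpuCostAdjustment",
    "gpuCostAdjustment",
    "networkCostAdjustment",
    "loadBalancerCostAdjustment",
    "pvCostAdjustment",
    "ramCostAdjustment",
)


def looks_reconciled(allocation):
    """Heuristic: treat any non-zero *CostAdjustment value as a sign of reconciliation."""
    # Missing fields are treated as 0, i.e. as "no adjustment seen".
    return any(allocation.get(field, 0) != 0 for field in ADJUSTMENT_FIELDS)


print(looks_reconciled({"cpuCostAdjustment": 0, "ramCostAdjustment": -0.12}))
print(looks_reconciled({"cpuCostAdjustment": 0, "ramCostAdjustment": 0}))
```

An explicit reconciled property in the response would remove the ambiguity in the second case, where an adjustment of exactly 0 cannot be told apart from "not yet reconciled".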
I want to integrate an OIDC provider for authentication. For its configuration I have the clientID, clientSecret and discoveryURL from the OIDC provider. But currently, to configure OIDC in Kubecost, I also need to specify an authURL with all required parameters hardcoded in the URL.
The discoveryURL should be used to discover all other required information, such as the authURL. On login the user should be redirected to the authURL with all required parameters automatically added. If required, additional parameters should be configurable. Kubecost should also add a nonce (state) parameter automatically, as defined by the OIDC spec.
Use the hardcoded authUrl with all parameters (redundant) and nonce parameter state (insecure).
I'm setting up Auth0 as OIDC provider, but this feature request applies to all OIDC spec compliant OIDC providers.
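The requested flow can be sketched in Python as follows, assuming the discovery document has already been fetched from the discoveryURL and parsed into a dict. OIDC discovery documents expose the authorization URL under `authorization_endpoint`; the function name, the example URLs, and the extra `audience` parameter are hypothetical. The sketch generates both a `state` and a `nonce` value, since the OIDC spec treats them as separate parameters.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(discovery_doc, client_id, redirect_uri,
                            scope="openid", extra_params=None):
    """Build a per-login authorization URL from a parsed OIDC discovery document."""
    # Fresh random values per login attempt; the caller must store them
    # (for example in the session) and verify them on the callback.
    state = secrets.token_urlsafe(32)
    nonce = secrets.token_urlsafe(32)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "nonce": nonce,
    }
    # Optional provider-specific parameters supplied through configuration.
    params.update(extra_params or {})
    url = f"{discovery_doc['authorization_endpoint']}?{urlencode(params)}"
    return url, state, nonce


# Hypothetical discovery document and client settings.
discovery_doc = {"authorization_endpoint": "https://idp.example.com/authorize"}
url, state, nonce = build_authorization_url(
    discovery_doc,
    "my-client-id",
    "https://kubecost.example.com/callback",
    extra_params={"audience": "kubecost"},
)
query = parse_qs(urlsplit(url).query)
print(query["client_id"], query["audience"])
```

Generating the values per request is the point of the feature request: a hardcoded authURL cannot carry a fresh state or nonce.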
What problem are you trying to solve?
I would like to be able to rightsize the RDS instances to what is appropriate to the usage.
Describe the solution you'd like
Currently Kubecost is able to recommend instance types for the k8s nodes, and I would like a similar feature for other AWS services, starting with RDS (for example, suggesting Graviton instances). It should also provide some information on the performance impact of using a given instance type.
Describe alternatives you've considered
AWS CloudWatch/Datadog
How would users interact with this feature?
Through the Kubecost UI under Assets, with a separate section for each AWS service.
What problem are you trying to solve?
I want to be able to understand which of my groups (i.e. namespaces/teams/services) are using the most money on resources they have allocated but aren't actually being used. Total cost for each grouping is useful, but it's not very meaningful if resource usage is high so the cost is justified. Efficiency for each grouping is useful, but it's not very helpful if most of my low efficiency groups aren't costing much anyways.
Describe the solution you'd like
The Kubecost UI should offer more columns / configurable columns that show those sorts of metrics. For example: Cost of Resources Used, Cost of (pod-level) Idle Resources, Cost of RAM Used, Cost of Idle RAM, etc.
Describe alternatives you've considered
I've tried exporting metrics to Datadog to look at these numbers. But the only costing metrics Kubecost exports are all by node (i.e. node_ram_hourly_cost). So they can't be used to gain insight into groups such as namespace/team/etc.
How would users interact with this feature?
They would sort by a metric such as "Cost of Idle Resources" to quickly recognize which groups are paying a lot for resources that aren't necessarily needed.
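Until such columns exist, a rough approximation can be computed from Allocation API fields. The Python sketch below assumes that `cpuEfficiency` and `ramEfficiency` represent the fraction of the allocated cost that was actually used, so that idle cost is `cost * (1 - efficiency)`; that interpretation, the function names, and the sample numbers are assumptions and may not match how Kubecost defines efficiency internally.

```python
def idle_cost(allocation):
    """Estimate the cost of requested-but-unused CPU and RAM for one allocation."""
    def unused(cost_field, efficiency_field):
        cost = allocation.get(cost_field, 0.0)
        efficiency = allocation.get(efficiency_field, 0.0)
        # Clamp at zero: usage above the request yields efficiency > 1.
        return max(0.0, cost * (1.0 - efficiency))

    cpu_idle = unused("cpuCost", "cpuEfficiency")
    ram_idle = unused("ramCost", "ramEfficiency")
    return {"cpuIdleCost": cpu_idle, "ramIdleCost": ram_idle,
            "totalIdleCost": cpu_idle + ram_idle}


def rank_by_idle_cost(allocations):
    """Return allocation names sorted by estimated idle cost, highest first."""
    return sorted(allocations,
                  key=lambda name: idle_cost(allocations[name])["totalIdleCost"],
                  reverse=True)


# Illustrative numbers only.
allocations = {
    "team-a": {"cpuCost": 10.0, "cpuEfficiency": 0.5, "ramCost": 4.0, "ramEfficiency": 0.25},
    "team-b": {"cpuCost": 100.0, "cpuEfficiency": 0.99, "ramCost": 0.0, "ramEfficiency": 0.0},
}
print(rank_by_idle_cost(allocations))
```

In this example team-b has the larger total cost but team-a wastes more money, which is the distinction that neither total cost nor efficiency shows on its own.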