features-bugs's Issues

The `/cluster-inspect` view is blank when selecting a cluster

Describe the bug

When going to /cluster-inspect, I'm met with a blank page. Upon inspecting the API requests the frontend is making to the backend, this one is consistently failing with HTTP 500 /model/savings/clusterSizingETL.

image

Reproduce

  1. Go to the /overview page.
  2. Click on an active cluster under Cluster breakdown
  3. Get redirected to /cluster-inspect, sometimes failing

HAR files with reproduced behavior linked in Slack thread below.

Expected behavior

Identify all locations in the frontend that are calling /model/savings/clusterSizingETL. Build a graceful failure mode so that the frontend doesn't retry the request too many times, and doesn't end up displaying a blank page.
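The graceful failure mode requested above could look roughly like the sketch below. It is written in Python only for illustration (the actual frontend is not Python), and `fetch_with_fallback`, `render_cluster_inspect`, the attempt limit, and the fallback message are all hypothetical names and values, not part of Kubecost.

```python
import time
from typing import Callable, Optional


def fetch_with_fallback(
    fetch: Callable[[], dict],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call `fetch` at most `max_attempts` times, then give up and return None."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            # Back off exponentially between attempts, but not after the last one.
            if attempt < max_attempts - 1:
                sleep(base_delay_s * (2 ** attempt))
    return None


def render_cluster_inspect(sizing: Optional[dict]) -> str:
    """Render a degraded view instead of a blank page when sizing data is missing."""
    if sizing is None:
        return "Cluster sizing data is temporarily unavailable."
    return f"Cluster sizing: {sizing}"
```

The key design point is that a failed request to /model/savings/clusterSizingETL degrades one panel rather than the whole page, and the retry count is bounded.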

Please share the support case, if any

Link to a Slack thread showing that at least four users are running into this issue.

What impact will this have on your ability to get value out of Kubecost?

My team has enjoyed using the /cluster-inspect view, as it concisely summarizes all the activity on the cluster. Being unable to use it now disrupts our workflow.

[Feature] Support `algorithmCPU` parameter for "Continuous Request Right-Sizing"

Problem Statement

The "Continuous Request Right-Sizing" currently uses the max algorithm for recommendations, which causes services with high start-up CPU usage to be overprovisioned.

For example, some of our services spike to ~2 cores at start-up, then drop down to ~0.3 cores when stable. This has two negative effects:

  1. The CPU request is over-provisioned, increasing cost (for some services, our CPU efficiency has dropped to 1% since enabling this feature)
  2. If a pod is left running for longer than the window, the start-up spike will no longer be considered, so it will be right-sized down to a reasonable 0.3 CPU. However, that causes the pods to be re-created, re-introducing the start-up spike, so it'll then be "right-sized" back to 2 CPU
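To illustrate why the choice of algorithm matters for spiky workloads, here is a small sketch comparing a max-based recommendation with a quantile-based one. The usage series is made up, and the nearest-rank quantile used here is an assumption for illustration; Kubecost's own quantile computation may differ.

```python
import math
from typing import Sequence


def recommend_max(samples: Sequence[float]) -> float:
    """Recommend the highest observed usage."""
    return max(samples)


def recommend_quantile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank quantile: the smallest sample with at least a fraction q
    of all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Hypothetical usage series: 2 start-up samples at 2 cores, then 98 stable samples.
usage = [2.0] * 2 + [0.3] * 98

max_recommendation = recommend_max(usage)                  # driven by the spike
quantile_recommendation = recommend_quantile(usage, 0.95)  # ignores the brief spike
```

With this series, the max algorithm recommends 2.0 cores while the 0.95 quantile recommends 0.3 cores, which matches the over-provisioning described above.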

Solution Description

Introduce cpu.request.autoscaling.kubecost.com/algorithm and cpu.request.autoscaling.kubecost.com/q annotations (or similar) to allow the algorithmCPU and qCPU right-size recommendations parameters to be set on a per-workload basis.

It probably makes sense to introduce this for memory as well, for consistency.

Alternatives

Allow arbitrary query parameters to be added to the recommendation API requests (e.g., request.autoscaling.kubecost.com/extraRecommendationParameters: "algorithmCPU=quantile&qCPU=0.95")

This could be useful for allowing the use of alpha/experimental parameters, without making it part of the Cluster Controller's API.

Additional Context

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

Take Kubernetes best practices (or Providers) into consideration when recommending instance sizes - max pods

What problem are you trying to solve?
I would like to resize my cluster nodes for more efficient utilization using Kubecost's "Rightsize your cluster nodes" savings recommendations.

Describe the solution you'd like
The recommended instance sizes and quantities should abide by Kubernetes best practices, or in the case of a cloud-based environment, the cloud providers' best practices and limitations.

Kubernetes:
Considerations for large clusters

Azure:
Azure Kubernetes Service service limits

AWS:
Amazon EKS - Elastic Network Interface (ENI) max pods
ENI max pods by instance

Google:
GKE max pods per node

Describe alternatives you've considered
Manually calculating instance quantities and sizes.

How would users interact with this feature?
I can envision a few different ways. Looks like 110 pods per node is the most common maximum. Kubecost could set that as the default quantity and then make it adjustable via the advanced settings in the UI.
image
A future version may take provider maximums into consideration based on provider details, such as AWS EKS nodes using ENI.
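As a rough sketch of the proposed behavior, a pod-count floor could be applied on top of whatever node count the sizing recommendation produces. The function names are hypothetical, and the default of 110 pods per node is the common maximum mentioned above, not a provider-specific limit.

```python
def min_nodes_for_pods(total_pods: int, max_pods_per_node: int = 110) -> int:
    """Smallest node count that can schedule `total_pods` under the per-node limit."""
    if max_pods_per_node <= 0:
        raise ValueError("max_pods_per_node must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_pods // max_pods_per_node)


def adjust_recommendation(
    recommended_nodes: int, total_pods: int, max_pods_per_node: int = 110
) -> int:
    """Never recommend fewer nodes than the pod limit allows."""
    return max(recommended_nodes, min_nodes_for_pods(total_pods, max_pods_per_node))
```

For example, a recommendation of 3 large nodes for a cluster running 500 pods would be raised to 5 nodes under the 110-pod default.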

gz#2241

(related to Zendesk ticket kubecost/cost-analyzer-helm-chart#2241)


DescribeAddresses and DescribeVolumes fail with valid IRSA config

When using IRSA, Kubecost cannot access AWS EC2 resources and logs the following messages, even when the service account has the correct policy.

I back-tested this with 1.101 and 1.102, and all versions have the issue.

error message:

WRN unable to get addresses: operation error EC2: DescribeAddresses, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: c63cf5bd-27d3-4919-8251-08fcf7ce7151, InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.ca-central-1.amazonaws.com/id/2086E4D4C3BEAFFF61F3617142CA5DCC

WRN unable to get disks: operation error EC2: DescribeVolumes, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: 97337482-f0e2-489d-b8e6-c9108a264d8e, InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.ca-central-1.amazonaws.com/id/2086E4D4C3BEAFFF61F3617142CA5DCC

To Reproduce
Steps to reproduce the behavior:

  1. Create a policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "KubecostSavingsAccess",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeAddresses",
                "ec2:DescribeVolumes"
            ],
            "Resource": "*"
        }
    ]
}

  2. Create an IRSA service account with the policy:

eksctl create iamserviceaccount \
    --name kubecost-serviceaccount \
    --namespace kubecost \
    --cluster jesse-temp --region ca-central-1 \
    --attach-policy-arn arn:aws:iam::297945954695:policy/jesse-temp-savings-policy \
    --override-existing-serviceaccounts \
    --approve

  3. Install Kubecost and view the logs:

helm install kubecost kubecost/cost-analyzer --version 1.104.1 \
  --set serviceAccount.create=false --set serviceAccount.name=kubecost-serviceaccount

Expected behavior
no errors

What impact will this have on your ability to get value out of Kubecost?
The savings report at /orphaned-resources is broken.

[Feature] CSV export to S3 buckets and other cloud targets

Problem Statement

Problem: Kubecost data needs to be integrated with BI and other custom FinOps tools. The following solution would help achieve this.

Please refer to the OpenCost implementation of CSV export to S3 buckets and other cloud targets. It makes integrating with existing BI and FinOps tools much easier.
https://www.opencost.io/docs/integrations/csv-export

This work could also add dates to the current API-based CSV export of cost data. The JSON export already includes dates; the CSV export is simple and convenient, but including dates would make it more usable.

Solution Description

Please refer to the OpenCost implementation of CSV export to S3 buckets and other cloud targets. It makes integrating with existing BI and FinOps tools much easier.
https://www.opencost.io/docs/integrations/csv-export

This work could also add dates to the current API-based CSV export of cost data. The JSON export already includes dates; the CSV export is simple and convenient, but including dates would make it more usable.
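Until such an export exists, one workaround is to flatten the JSON export into CSV rows that carry a date column. The sketch below assumes a simplified, hypothetical input shape (a list of windows, each with a `start` timestamp and a `costs` mapping); the real Kubecost API response is structured differently and would need its own field mapping.

```python
import csv
import io
from typing import Dict, List


def allocations_to_csv(windows: List[Dict]) -> str:
    """Flatten per-window costs into CSV rows, one row per (date, name)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "name", "totalCost"])
    for window in windows:
        # Keep only the YYYY-MM-DD part of the ISO 8601 start timestamp.
        date = window["start"][:10]
        for name, cost in sorted(window["costs"].items()):
            writer.writerow([date, name, cost])
    return buf.getvalue()


# Hypothetical data for illustration only.
sample = [
    {"start": "2023-08-01T00:00:00Z", "costs": {"kube-system": 1.5, "default": 2.0}},
]
csv_text = allocations_to_csv(sample)
```

The resulting text can be written to a file or uploaded to object storage by whatever tooling the BI pipeline already uses.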

Alternatives

"Kubecost does provide access to our APIs which power all the data that you see in our UI. You can also opt to download any reports you see in the UI as a .csv. Here is a blog we published on how users can intregrate Kubecost Data into a Datadog Dashboard."

The blog does not match what is being requested. It specifically covers the Datadog integration. The ask is an export option for the data, so that we can integrate with any custom solution rather than one particular tool. The OpenCost option of exporting the data to a bucket or other storage is more generic, and any tooling can consume it. We do not use Datadog. A sub-ask in this request is to add a date field to the CSV file exported from Kubecost, which currently has none.

Additional Context

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

Add support for ephemeral storage costs

What problem are you trying to solve?

We have a request in ZD ticket: 3771 to show costs for local ephemeral storage.

Kubernetes documentation: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#local-ephemeral-storage

Note: In addition to local, there are CSI and Generic ephemeral volumes: https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#types-of-ephemeral-volumes

In their example, they have a deployment that consumes the majority of local ephemeral storage across some number of nodes in the cluster, and they would like to know the deployment's cost associated with the underlying storage volume.

Cluster nodes are configured with:

allocatable:
  ephemeral-storage: "<Storage capacity>"

Deployments have the ephemeral-storage limit / requests:

resources:
  limits:
    ephemeral-storage: <Storage limit>
  requests:
    ephemeral-storage: <Storage request>

Describe the solution you'd like

Visibility into ephemeral storage costs on the allocations page.

With the exception of generic ephemeral volumes, I believe they are similar to the configMap, downwardAPI, and secret volume types in that they are not associated with a PV. Because of this, the solution may be a new ephemeral-volume cost category that would be visible on the allocations page and when drilling into the container level. The ability to enable or disable this category may be useful as well.

[Bug] Incorrect ("data missing") message shown when cloud data is loading

Kubecost Version

1.106.2

Kubernetes Version

n/a

Kubernetes Platform

Other (specify in description)

Description

I incorrectly see this message when I request a long window; the data does eventually load: https://infra.kceng.dev/cloud?reportTitle=Cumulative+cost+for+last+12+months+by+provider&window=365d&agg=provider

image

Steps to reproduce

Load this view: https://infra.kceng.dev/cloud?reportTitle=Cumulative+cost+for+last+12+months+by+provider&window=365d&agg=provider

Expected behavior

A loading state should be shown while the data loads; instead, the page shows a data-not-available message.

Impact

No response

Screenshots

No response

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

Not receiving the scheduled report emails

Hello Team,

I recently deployed Kubecost (version v1.106.4) in my Kubernetes cluster and have set up a scheduled report. However, I'm not receiving the scheduled report emails, and I suspect that the sender email ID and relay server details need to be configured.

I would appreciate any guidance on where and how I can configure these details within Kubecost. Specifically, I would like to know:

  • The email domain used by Kubecost to send scheduled reports.
  • The configuration location for setting the sender email ID.
  • The steps to configure the email/relay server details in Kubecost.

Your assistance in resolving this matter is highly valued. Thank you in advance for your help!

Regards,
Renjith

[Bug] Pod crashes when used in Scaleway

Problem Statement

Currently, Kubecost doesn't support Scaleway as a provider and the pod crashes indefinitely:

2023/08/29 13:20:39 maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
2023-08-29T13:20:39.778948267Z ??? Log level set to info
2023-08-29T13:20:39.779009061Z INF Starting Kubecost cost-model version 1.105.2 (94410c5)
2023-08-29T13:20:39.780996678Z INF Prometheus/Thanos Client Max Concurrency set to 5
2023-08-29T13:20:39.78686936Z INF Success: retrieved the 'up' query against prometheus at: http://prometheus-operated.prometheus.traefik.mesh:9090/
2023-08-29T13:20:39.793785625Z INF Retrieved a prometheus config file from: http://prometheus-operated.prometheus.traefik.mesh:9090/
2023-08-29T13:20:39.803457139Z INF Using scrape interval of 60.000000
2023-08-29T13:20:39.80406909Z INF NAMESPACE: kubecost
2023-08-29T13:20:40.104978244Z INF Done waiting
2023-08-29T13:20:40.105568815Z INF Starting *v1.Namespace controller
2023-08-29T13:20:40.105756237Z INF Starting *v1.Node controller
2023-08-29T13:20:40.10582145Z INF Starting *v1.Pod controller
2023-08-29T13:20:40.10590626Z INF Starting *v1.Service controller
2023-08-29T13:20:40.106037486Z INF Starting *v1.ConfigMap controller
2023-08-29T13:20:40.106141752Z INF Starting *v1.DaemonSet controller
2023-08-29T13:20:40.10621519Z INF Starting *v1.Deployment controller
2023-08-29T13:20:40.106333703Z INF Starting *v1.StatefulSet controller
2023-08-29T13:20:40.106450723Z INF Starting *v1.ReplicaSet controller
2023-08-29T13:20:40.106565679Z INF Starting *v1.PersistentVolume controller
2023-08-29T13:20:40.106637394Z INF Starting *v1.PersistentVolumeClaim controller
2023-08-29T13:20:40.106684403Z INF Starting *v1.StorageClass controller
2023-08-29T13:20:40.106728666Z INF Starting *v1.Job controller
2023-08-29T13:20:40.106767749Z INF Starting *v1beta1.PodDisruptionBudget controller
2023-08-29T13:20:40.106813194Z INF Starting *v1.ReplicationController controller
2023-08-29T13:20:40.10943267Z INF Found ProviderID starting with "scaleway", using Scaleway Provider
2023-08-29T13:20:40.12136691Z INF No asset-report-configs configmap found at install time, using existing configs: configmaps "asset-report-configs" not found
2023-08-29T13:20:40.131145807Z INF No advanced-report-configs configmap found at install time, using existing configs: configmaps "advanced-report-configs" not found
2023-08-29T13:20:40.217114238Z INF No saved-report-configs configmap found at install time, using existing configs: configmaps "saved-report-configs" not found
2023-08-29T13:20:40.416305838Z INF No pricing-configs configmap found at install time, using existing configs: configmaps "pricing-configs" not found
2023-08-29T13:20:40.614670714Z INF No cloud-cost-report-configs configmap found at install time, using existing configs: configmaps "cloud-cost-report-configs" not found
2023-08-29T13:20:40.815326476Z INF No product-configs configmap found at install time, using existing configs: configmaps "product-configs" not found
2023-08-29T13:20:41.014819183Z INF No alert-configs configmap found at install time, using existing configs: configmaps "alert-configs" not found
2023-08-29T13:20:41.21705655Z INF No recurring-budget-rule-configs configmap found at install time, using existing configs: configmaps "recurring-budget-rule-configs" not found
2023-08-29T13:20:41.416224485Z INF No group-report-configs configmap found at install time, using existing configs: configmaps "group-report-configs" not found
2023-08-29T13:20:41.616794296Z INF No budget-configs configmap found at install time, using existing configs: configmaps "budget-configs" not found
2023-08-29T13:20:41.813632291Z INF No group-filters configmap found at install time, using existing configs: configmaps "group-filters" not found
2023-08-29T13:20:42.015814434Z INF No metrics-config configmap found at install time, using existing configs: configmaps "metrics-config" not found
2023-08-29T13:20:42.214287803Z INF No app-configs configmap found at install time, using existing configs: configmaps "app-configs" not found
2023-08-29T13:20:42.547232903Z INF Init: AggregateCostModel cache warming disabled
panic: provider is required: failed to convert cost model provider

goroutine 1 [running]:
github.com/kubecost/kubecost-cost-model/pkg/cmd/costmodel.Initialize()
	/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/costmodel/costmodel.go:1577 +0x2db9
github.com/kubecost/kubecost-cost-model/pkg/cmd/costmodel.Execute(0x1?)
	/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/costmodel/costmodel.go:2028 +0xf7
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute.newCostModelCommand.func1(0xc0000e3400?, {0x330f85e?, 0x4?, 0x330f862?})
	/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/commands.go:43 +0x2f
github.com/spf13/cobra.(*Command).execute(0xc000fb6600, {0x5a1a300, 0x0, 0x0})
	/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:916 +0x87c
github.com/spf13/cobra.(*Command).ExecuteC(0xc000fb7200)
	/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1040 +0x38d
github.com/spf13/cobra.(*Command).Execute(...)
	/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:968
github.com/opencost/opencost/pkg/cmd.Execute(0x0?, {0xc00103fee0, 0x3, 0x3})
	/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/opencost/pkg/cmd/commands.go:61 +0x3a5
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute()
	/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/pkg/cmd/commands.go:27 +0x1e5
main.main()
	/home/runner/work/release-scripts/release-scripts/release-scripts/workdir-prod-tag/kubecost/kubecost-cost-model/cmd/costmodel/main.go:12 +0x13

Solution Description

Add Scaleway as a supported provider for Kubecost.

Alternatives

No response

Additional Context

No response

[Bug] Leader Follower doesn't work with SAML enabled

Kubecost Helm Chart Version

v1.107.1

Kubernetes Version

v1.27

Kubernetes Platform

AKS

Description

We previously had SAML working without leader-follower. We tried enabling leader-follower with the StatefulSet option, but it still doesn't work: the login keeps redirecting in an infinite loop.

Steps to reproduce

  1. Enable leader-follower with replica count of 2.
  2. Have SAML with correct settings and with RBAC enabled.
  3. Visit the kubecost url to access dashboard.
  4. It redirects to the SAML login; once logged in, it keeps redirecting.

Expected behavior

Dashboard must be visible correctly in addition to SAML working.

Impact

Kubecost dashboard not visible.

Screenshots

No response

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

[Bug] NFS volumes are reported as cloud disks

Kubecost Helm Chart Version

1.107.0

Kubernetes Version

1.26

Kubernetes Platform

GKE

Description

I have 30 NFS mount points that my application uses. They are attached to my pods via NFS PVs, but Kubecost treats these PVs as physical cloud volumes and reports them as such.

Here each NFS volume is reported as a physical volume:
image

The main NFS server has 3 TB of storage, and for each NFS volume that I attach to my pods, Kubecost reports a 3 TB physical volume.

Steps to reproduce

Create PV and PVC on Kubernetes:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: name-here
spec:
  capacity:
    storage: "3000Gi"
  accessModes:
    - "ReadWriteMany"
  nfs:
    server: filestore.nfs.server
    path: /home/directory001
    path: /mnt/directory001

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: name-here
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 3000Gi
  volumeName: name-here

Expected behavior

Kubecost is expected to recognize that these volumes are NFS shares.

Impact

Unfortunately, my cost report doesn't show correct values.

Screenshots

image

Logs

No response

Slack discussion

https://kubecost.slack.com/archives/CLFV60Y90/p1699553491710209

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

[Feature] Add spend change back to the cloud cost view

Problem Statement

When reviewing cloud costs, it's really helpful to see spend change over time, which we used to show on the cloud costs view.

Solution Description

Add spend change pills as shown in allocation view

image

Alternatives

None yet

Additional Context

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

Allow network-costs to listen on IPv6 addresses

Problem Statement

network-costs-rs, running as a DaemonSet, listens on IPv4 addresses only:

/ # netstat -lntp | grep network-costs
tcp        0      0 0.0.0.0:3001            0.0.0.0:*               LISTEN      1/network-costs-rs

As a result, Prometheus running on an IPv6 network cannot scrape its metrics.

Solution Description

It would be great to have an option to listen on IPv6 addresses as well.

Alternatives

No response

Additional Context

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

[Bug] Unable to upgrade to v1.106.3 from v1.101.3

Kubecost Helm Chart Version

v1.106.3

Kubernetes Version

1.24

Kubernetes Platform

EKS

Description

I am trying to upgrade from v1.101.3 to v1.106.3 on EKS v1.24, but the Network, Grafana, and Cost Analyzer pods are failing with a CrashLoopBackOff error.

The events say : Back-off restarting failed container

Steps to reproduce

  1. EKSv1.24 is up and running
  2. Kubecost v1.101.3 is installed
  3. Upgrade kubecost to v1.106.3

Expected behavior

Kubecost should be installed with no issues

Impact

No response

Screenshots

No response

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

Adding label mappings in the Savings API (Container Request Right Sizing Recommendation API (V2)) response

What problem are you trying to solve?
We want savings information broken down by the labels we apply at the namespace and app level. The API can already filter by label, which means the cost-model API knows the label mappings, but it doesn't return the associated labels in the Savings API response.

Describe the solution you'd like
Including those labels in the response of the Savings API (Container Request Right Sizing Recommendation API (V2)) would help teams build further automation, such as Grafana dashboards.

Describe alternatives you've considered
We tried the filter approach, but it is overkill and doesn't help much with automation or with extracting further insights, because the response doesn't include label mappings.

How would users interact with this feature?
Users would fetch the label mappings as part of the Savings API response; teams could then group the savings by label and derive insights from the results.
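If labels were returned, the grouping step could be as simple as the sketch below. The `labels` and `monthlySavings` field names are hypothetical placeholders for whatever the response would actually contain; they are not taken from the current API.

```python
from collections import defaultdict
from typing import Dict, List


def savings_by_label(recommendations: List[Dict], label_key: str) -> Dict[str, float]:
    """Sum savings per value of `label_key`; items without the label are grouped separately."""
    totals: Dict[str, float] = defaultdict(float)
    for rec in recommendations:
        value = rec.get("labels", {}).get(label_key, "__unlabeled__")
        totals[value] += rec["monthlySavings"]
    return dict(totals)


# Hypothetical recommendations for illustration only.
recs = [
    {"labels": {"team": "payments"}, "monthlySavings": 10.0},
    {"labels": {"team": "payments"}, "monthlySavings": 5.0},
    {"labels": {"team": "search"}, "monthlySavings": 2.0},
    {"monthlySavings": 1.0},
]
grouped = savings_by_label(recs, "team")
```

The grouped totals could then feed a dashboard or a per-team report without a separate filtered request per label value.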

[Feature] Method to determine if costs have been reconciled

Problem Statement

When consuming allocation data via the API, there isn't a reliable method to determine if data points have already been reconciled with the CSP or not. As a comparison, there is a feature in the Kubecost UI that highlights whether or not costs have been already reconciled (although it isn't clear how reliable that functionality is either).

Solution Description

A simple property in each data element in the return response like reconciled: True would be enough. Optionally, it would also be useful to be able to submit this as a query parameter. For example, to only fetch costs that have been reconciled.

Alternatives

  • Fetch data with at least 48 hours of lag: although this should work for the happy path, it isn't safe when reconciliation fails or is delayed.
  • Test for *CostAdjustment properties being different than 0: although true for most cases, there's likely a variety of scenarios where the adjustment for a reconciled cost might still be 0.
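For reference, the second alternative amounts to a heuristic like the one sketched here. The function name is hypothetical and the example field names are illustrative; the check simply looks at any property whose name ends in `CostAdjustment`.

```python
from typing import Dict


def looks_reconciled(allocation: Dict[str, float]) -> bool:
    """Heuristic: treat an allocation as reconciled if any *CostAdjustment is non-zero.

    This can return False for a reconciled item whose adjustments are all 0,
    which is exactly the weakness described above.
    """
    return any(
        key.endswith("CostAdjustment") and value != 0
        for key, value in allocation.items()
    )


# Illustrative allocation items.
adjusted = {"cpuCost": 1.0, "cpuCostAdjustment": 0.0, "ramCostAdjustment": -0.2}
unadjusted = {"cpuCost": 1.0, "cpuCostAdjustment": 0.0, "ramCostAdjustment": 0.0}
```

An explicit `reconciled` property in the response would remove the need for this kind of guesswork.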

Additional Context

No response

[Bug] OIDC doesn't work with self-signed certificate

Kubecost Version

cost-analyzer-1.107.1

Kubernetes Version

v1.27.6+f67aeb3

Kubernetes Platform

OpenShift

Description

panic: Error in OIDC discovery 'https://keycloak-keycloak-operator.apps-crc.testing/realms/sso/.well-known/openid-configuration': Get "https://keycloak-keycloak-operator.apps-crc.testing/realms/sso/.well-known/openid-configuration": tls: failed to verify certificate: x509: certificate signed by unknown authority

goroutine 1 [running]:
github.com/kubecost/kubecost-cost-model/pkg/cmd/costmodel.Execute(0x1?)
	/app/kubecost-cost-model/pkg/cmd/costmodel/costmodel.go:2650 +0x8b9d
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute.newCostModelCommand.func1(0xc001528800?, {0x4855d85?, 0x4?, 0x4855d89?})
	/app/kubecost-cost-model/pkg/cmd/commands.go:68 +0x2f
github.com/spf13/cobra.(*Command).execute(0xc001526000, {0x75ac380, 0x0, 0x0})
	/go/pkg/mod/github.com/spf13/[email protected]/command.go:916 +0x87c
github.com/spf13/cobra.(*Command).ExecuteC(0xc001527800)
	/go/pkg/mod/github.com/spf13/[email protected]/command.go:1040 +0x38d
github.com/spf13/cobra.(*Command).Execute(...)
	/go/pkg/mod/github.com/spf13/[email protected]/command.go:968
github.com/opencost/opencost/pkg/cmd.Execute(0x0?, {0xc00143fec0, 0x7, 0x7})
	/app/opencost/pkg/cmd/commands.go:61 +0x3a5
github.com/kubecost/kubecost-cost-model/pkg/cmd.Execute()
	/app/kubecost-cost-model/pkg/cmd/commands.go:43 +0x353
main.main()
	/app/kubecost-cost-model/cmd/costmodel/main.go:12 +0x13

Steps to reproduce

Create a Keycloak instance with a self-signed certificate and try to run Kubecost

Expected behavior

OIDC connection works with self-signed certs

Impact

No response

Screenshots

No response

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines (https://github.com/kubecost/features-bugs/blob/main/ISSUE_GUIDELINES.md) and this is a bug impacting only the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

[Bug] Network Costs is not showing any cross-zone traffic

Kubecost Version

1.107.1 (f87c784)

Kubernetes Version

1.28

Kubernetes Platform

EKS

Description

I have a reasonably large multi-az cluster with KubeCost installed. Network costs daemon is installed and is up and running. No cross-zone traffic is shown in the KubeCost UI (Allocation / Network Costs). There is a large "Adjustment" cost, implying that KubeCost is aware of the spend, but just can't classify the traffic.

I have followed instructions at Network Cost Configuration - Troubleshooting

I think I may have discovered a clue, and I'll include screenshots below. It seems that the grafana dashboard for network costs is also missing cross zone and cross region costs, but when I look at the promql that's fetching the metrics, it's clearly wrong (see screenshots below). It's using the wrong labels: sameRegion instead of same_region and sameZone instead of same_zone. Is it possible the cost model engine is also looking at the wrong labels?
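The label mismatch described above can be illustrated with a small rewrite of the dashboard query. The example query string is an assumption for illustration (the actual Grafana panel query is only visible in the screenshots); the metric name and the four label names come from this report.

```python
import re

# Wrong label name (as used by the dashboard) -> label name present in Prometheus.
LABEL_RENAMES = {"sameRegion": "same_region", "sameZone": "same_zone"}

_LABEL_PATTERN = re.compile(r"\b(" + "|".join(LABEL_RENAMES) + r")\b")


def fix_label_names(promql: str) -> str:
    """Replace the camelCase label names with their snake_case equivalents."""
    return _LABEL_PATTERN.sub(lambda m: LABEL_RENAMES[m.group(1)], promql)


# Hypothetical dashboard query using the wrong labels.
broken = (
    'sum(rate(kubecost_pod_network_egress_bytes_total'
    '{sameZone="false", sameRegion="true"}[5m]))'
)
fixed = fix_label_names(broken)
```

A selector on a label that does not exist matches no series, which would explain why the cross-zone and cross-region panels stay empty even though the metrics are present.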

As you can see from the screenshots and config below, the classified network traffic metrics are there in prometheus, but not making it all the way to the UI somehow.

Any help would be greatly appreciated!

Steps to reproduce

  1. Configure KubeCost in a multi-az EKS cluster
  2. Enable network costs
  3. Generate some traffic
  4. Look at the Allocations / Network Costs section of the UI for any namespace
  5. Note that while traffic is captured, and costs are reconciled with AWS, the cross-zone and cross-region columns are all zeros.

Expected behavior

I would expect to see some traffic in the cross-zone and cross-region columns of the Allocation / Network Costs screen.

Impact

One of our main use cases for KubeCost is to help tease out cross zone network costs. Since it's not classifying properly, this is making the tool much less useful.

Screenshots

Here you can see all Prometheus kubecost-networking targets are up:
image

Here you can see that kubecost_pod_network_egress_bytes_total metrics are being captured:
image

Here you can see that cross zone and cross region traffic is missing:
image

And here you can see why. The name of the labels being used in the promql is wrong:
image

If I fix it, you can see that the data appears. This proves that my classification rules are working. The kubecost-networking daemon is classifying traffic.
image

I have included my network costs config below.

Here's what things look like in the UI. This is network costs for kube-system namespace:
image

Logs

Here you can see all Kubecost services and pods up and running:
$ k get all --namespace kubecost
NAME                                              READY   STATUS    RESTARTS   AGE
pod/kubecost-cost-analyzer-8779958fd-978f6        2/2     Running   0          29h
pod/kubecost-grafana-79c8884f54-gzrv9             2/2     Running   0          7d4h
pod/kubecost-network-costs-2mfm2                  1/1     Running   0          7d4h
pod/kubecost-network-costs-2q645                  1/1     Running   0          7d4h
pod/kubecost-network-costs-59x2p                  1/1     Running   0          7d4h
pod/kubecost-network-costs-5jv2b                  1/1     Running   0          7d4h
pod/kubecost-network-costs-5mrpb                  1/1     Running   0          7d4h
pod/kubecost-network-costs-5vd7s                  1/1     Running   0          7d4h
pod/kubecost-network-costs-8plj7                  1/1     Running   0          7d4h
pod/kubecost-network-costs-8t4mx                  1/1     Running   0          7d4h
pod/kubecost-network-costs-bfvxp                  1/1     Running   0          7d4h
pod/kubecost-network-costs-bj49l                  1/1     Running   0          7d4h
pod/kubecost-network-costs-bpmjf                  1/1     Running   0          7d4h
pod/kubecost-network-costs-bvnkn                  1/1     Running   0          7d4h
pod/kubecost-network-costs-cgvnd                  1/1     Running   0          7d4h
pod/kubecost-network-costs-d5gh6                  1/1     Running   0          7d4h
pod/kubecost-network-costs-dfftx                  1/1     Running   0          7d4h
pod/kubecost-network-costs-dl9gx                  1/1     Running   0          7d4h
pod/kubecost-network-costs-dmbr4                  1/1     Running   0          7d4h
pod/kubecost-network-costs-dv86s                  1/1     Running   0          7d4h
pod/kubecost-network-costs-dvxgv                  1/1     Running   0          7d4h
pod/kubecost-network-costs-f9qnk                  1/1     Running   0          7d4h
pod/kubecost-network-costs-gd46d                  1/1     Running   0          7d4h
pod/kubecost-network-costs-h6vlp                  1/1     Running   0          7d4h
pod/kubecost-network-costs-hwkww                  1/1     Running   0          7d4h
pod/kubecost-network-costs-jjhcc                  1/1     Running   0          7d4h
pod/kubecost-network-costs-kbhhw                  1/1     Running   0          7d4h
pod/kubecost-network-costs-kcxbf                  1/1     Running   0          7d4h
pod/kubecost-network-costs-kdgns                  1/1     Running   0          7d4h
pod/kubecost-network-costs-kllst                  1/1     Running   0          7d4h
pod/kubecost-network-costs-kpts6                  1/1     Running   0          7d4h
pod/kubecost-network-costs-kqkwp                  1/1     Running   0          7d4h
pod/kubecost-network-costs-mc8v2                  1/1     Running   0          7d4h
pod/kubecost-network-costs-n9rd5                  1/1     Running   0          7d4h
pod/kubecost-network-costs-nl7dc                  1/1     Running   0          7d4h
pod/kubecost-network-costs-nsn9r                  1/1     Running   0          6d2h
pod/kubecost-network-costs-q66fg                  1/1     Running   0          7d4h
pod/kubecost-network-costs-qkvt5                  1/1     Running   0          7d4h
pod/kubecost-network-costs-qn85p                  1/1     Running   0          7d4h
pod/kubecost-network-costs-r9s69                  1/1     Running   0          7d4h
pod/kubecost-network-costs-rlmzj                  1/1     Running   0          7d4h
pod/kubecost-network-costs-rms9h                  1/1     Running   0          7d4h
pod/kubecost-network-costs-rxpj9                  1/1     Running   0          7d4h
pod/kubecost-network-costs-sqpxp                  1/1     Running   0          7d4h
pod/kubecost-network-costs-t6tdh                  1/1     Running   0          7d4h
pod/kubecost-network-costs-tl58d                  1/1     Running   0          7d4h
pod/kubecost-network-costs-twfxk                  1/1     Running   0          7d4h
pod/kubecost-network-costs-tzkqc                  1/1     Running   0          7d4h
pod/kubecost-network-costs-v546l                  1/1     Running   0          7d4h
pod/kubecost-network-costs-w9h4d                  1/1     Running   0          7d4h
pod/kubecost-network-costs-w9lps                  1/1     Running   0          7d4h
pod/kubecost-network-costs-wfcq2                  1/1     Running   0          7d4h
pod/kubecost-network-costs-wkzmt                  1/1     Running   0          7d4h
pod/kubecost-network-costs-wxgjh                  1/1     Running   0          7d4h
pod/kubecost-network-costs-wzhzv                  1/1     Running   0          7d4h
pod/kubecost-network-costs-zcgb4                  1/1     Running   0          7d4h
pod/kubecost-prometheus-server-74b9f65cb5-gl97j   2/2     Running   0          7d4h

NAME                                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/kubecost-cost-analyzer       ClusterIP   172.20.127.37   <none>        9003/TCP,9090/TCP   7d4h
service/kubecost-grafana             ClusterIP   172.20.184.57   <none>        80/TCP              7d4h
service/kubecost-network-costs       ClusterIP   None            <none>        3001/TCP            7d4h
service/kubecost-prometheus-server   ClusterIP   172.20.46.173   <none>        80/TCP              7d4h

NAME                                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/kubecost-network-costs   54        54        54      54           54          <none>          7d4h

NAME                                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kubecost-cost-analyzer       1/1     1            1           7d4h
deployment.apps/kubecost-grafana             1/1     1            1           7d4h
deployment.apps/kubecost-prometheus-server   1/1     1            1           7d4h

NAME                                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/kubecost-cost-analyzer-675b4d75f5       0         0         0       6d3h
replicaset.apps/kubecost-cost-analyzer-68d47798d8       0         0         0       7d4h
replicaset.apps/kubecost-cost-analyzer-76fd85c97d       0         0         0       6d3h
replicaset.apps/kubecost-cost-analyzer-8779958fd        1         1         1       29h
replicaset.apps/kubecost-cost-analyzer-fff47d548        0         0         0       6d3h
replicaset.apps/kubecost-grafana-79c8884f54             1         1         1       7d4h
replicaset.apps/kubecost-prometheus-server-74b9f65cb5   1         1         1       7d4h

And here is the network costs config:

$ k get cm network-costs-config -o yaml
apiVersion: v1
data:
  config.yaml: |
    destinations:
      cross-region: []
      direct-classification:
      - ips:
        - 10.0.3.0/24
        - 10.0.60.0/22
        - 10.0.72.0/22
        - 10.0.80.0/24
        region: us-east1
        zone: us-east1-a
      - ips:
        - 10.0.4.0/24
        - 10.0.68.0/22
        - 10.0.76.0/22
        - 10.0.81.0/24
        region: us-east1
        zone: us-east1-b
      - ips:
        - 10.0.5.0/24
        - 10.0.36.0/22
        - 10.0.82.0/24
        region: us-east1
        zone: us-east1-c
      - ips:
        - 10.0.32.0/22
        - 10.0.56.0/22
        - 10.0.83.0/24
        - 10.0.92.0/24
        region: us-east1
        zone: us-east1-d
      - ips:
        - 10.0.93.0/24
        region: us-east1
        zone: us-east1-e
      - ips:
        - 10.0.40.0/22
        - 10.0.64.0/22
        - 10.0.84.0/24
        - 10.0.94.0/24
        region: us-east1
        zone: us-east1-f
      in-region: []
      in-zone:
      - 127.0.0.0/8
      - 169.254.0.0/16
      - 172.16.0.0/12
      - 192.168.0.0/16
      internet: []
    services:
      amazon-web-services: true
      azure-cloud-services: false
      google-cloud-services: false
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: kubecost
    meta.helm.sh/release-namespace: kubecost
  creationTimestamp: "2023-12-05T14:39:35Z"
  labels:
    app: cost-analyzer
    app.kubernetes.io/instance: kubecost
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cost-analyzer
    helm.sh/chart: cost-analyzer-1.107.1
  name: network-costs-config
  namespace: kubecost
  resourceVersion: "2931042379"
  uid: d4064385-5f4c-402c-8505-206fc6b89cb1
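
As an illustration of what the direct-classification entries above express, here is a minimal sketch that maps an IP address to a zone by CIDR membership. It is not the network-costs daemon's actual implementation; the function name `classify_zone` and the `DIRECT_CLASSIFICATION` excerpt are hypothetical, and only two of the configured zones are reproduced.

```python
import ipaddress

# Hypothetical excerpt of the direct-classification entries from the config above.
DIRECT_CLASSIFICATION = [
    {"zone": "us-east1-a", "ips": ["10.0.3.0/24", "10.0.60.0/22"]},
    {"zone": "us-east1-b", "ips": ["10.0.4.0/24", "10.0.68.0/22"]},
]


def classify_zone(ip, entries=DIRECT_CLASSIFICATION):
    """Return the zone of the first CIDR block containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for entry in entries:
        for cidr in entry["ips"]:
            if addr in ipaddress.ip_network(cidr):
                return entry["zone"]
    return None  # Not covered by any direct-classification block.
```

An address that matches no block returns `None`, which is a quick way to spot traffic that the config does not classify directly.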

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

[Bug] History appears to be broken on cloud costs view

Kubecost Version

106.2

Kubernetes Version

n/a

Kubernetes Platform

EKS

Description

Back button in browser stopped working.

Steps to reproduce

  1. Load this view https://infra.kceng.dev/cloud?reportTitle=Cumulative+cost+for+last+60+days+by+item&window=60d&agg=item&context=W3sicHJvcGVydHkiOiJzZXJ2aWNlIiwidmFsdWUiOiJBbWF6b25FQzIiLCJuYW1lIjoiU2VydmljZSJ9XQ%3D%3D&filters=W10%3D&costMetric=AmortizedNetCost&selectedProviderId=&selectedItemName=
  2. Click "Cloud Costs" in navigation
  3. Hit back button in your browser

Expected behavior

Return to the previous view of cloud data.

Impact

Instead of the expected behavior, the back button did not change the page. This makes browsing data hard in the moment, and it also causes me to lose some faith in the Kubecost product, because most users will view working browser history as table stakes.

Screenshots

No response

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

[Feature] Show more decimals in cloud costs graph tooltips

Problem Statement

When looking at daily/hourly trends over time, I'm unable to get a sense for movement by looking at the graph tooltip.

image

Solution Description

It would be helpful to show at least one more decimal place, i.e. $2k --> $2.1k
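
A minimal sketch of the requested formatting, assuming a simple k/M suffix scheme. The function name `format_compact_currency` and its `decimals` parameter are hypothetical and do not reflect the Kubecost frontend's code.

```python
def format_compact_currency(value, decimals=1):
    """Format a dollar amount with a k/M suffix and a fixed number of decimals."""
    for threshold, suffix in ((1_000_000, "M"), (1_000, "k")):
        if abs(value) >= threshold:
            return f"${value / threshold:.{decimals}f}{suffix}"
    return f"${value:.{decimals}f}"
```

With `decimals=0` a value of 2100 renders as `$2k`, which is the current behavior described above; with `decimals=1` it renders as `$2.1k`.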

Alternatives

Show two more decimal places...

Additional Context

This same concept may apply to other graphs...

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

Deployed Kubecost in Azure AKS cluster, but data is not loading

Kubecost Helm Chart Version

1.106.3

Kubernetes Version

1.25.6

Kubernetes Platform

AKS

Description

image

We deployed kubecost using the following commands

image

The Prometheus pods and the following pods are stuck in the ContainerCreating state:

kubecost-grafana-57c8b5d877-v4xwv 0/2 ContainerCreating 0 110m
kubecost-kube-state-metrics-68459c8d5f-p78sn 0/1 ContainerCreating 0 110m

Steps to reproduce

Expected behavior

The URL needs to be accessible and the details need to load. Currently it is stuck in the state below:

image

Impact

No response

Screenshots

image

Logs

PS C:\Users\nidicula\OneDrive - RM PLC\WORK\AKSClusterIssues\KubeCostSetup> kubectl get events --sort-by=.metadata.creationTimestamp --field-selector type!=Normal -n kubecost
LAST SEEN   TYPE      REASON             OBJECT                                             MESSAGE
7m30s       Warning   FailedMount        pod/kubecost-kube-state-metrics-68459c8d5f-p78sn   Unable to attach or mount volumes: unmounted volumes=[kube-api-access-hrzkm], unattached volumes=[kube-api-access-hrzkm]: timed out waiting for the condition
7m35s       Warning   FailedMount        pod/kubecost-prometheus-server-7f745bf6f4-65wr8    Unable to attach or mount volumes: unmounted volumes=[kube-api-access-4xjrf], unattached volumes=[config-volume kube-api-access-4xjrf storage-volume]: timed out waiting for the condition
2m46s       Warning   FailedMount        pod/kubecost-grafana-57c8b5d877-v4xwv              (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-bwnlz" : chown c:\var\lib\kubelet\pods\c8ce496a-54ea-46fc-802b-6994d5df6ec1\volumes\kubernetes.io~projected\kube-api-access-bwnlz\..2023_10_25_08_46_05.1185287298\token: not supported by windows
2m46s       Warning   FailedMount        pod/kubecost-kube-state-metrics-68459c8d5f-p78sn   (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-hrzkm" : chown c:\var\lib\kubelet\pods\d2eef8ec-6d67-43ad-a2bf-4f1219c6a978\volumes\kubernetes.io~projected\kube-api-access-hrzkm\..2023_10_25_08_46_05.1332739432\token: not supported by windows
2m44s       Warning   FailedMount        pod/kubecost-prometheus-server-7f745bf6f4-65wr8    (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-4xjrf" : chown c:\var\lib\kubelet\pods\637f2370-8210-4b79-85db-0a0c35f247f6\volumes\kubernetes.io~projected\kube-api-access-4xjrf\..2023_10_25_08_46_07.4008867723\token: not supported by windows
12m         Warning   FailedMount        pod/kubecost-prometheus-server-7f745bf6f4-65wr8    Unable to attach or mount volumes: unmounted volumes=[kube-api-access-4xjrf], unattached volumes=[storage-volume config-volume kube-api-access-4xjrf]: timed out waiting for the condition
27m         Warning   FailedMount        pod/kubecost-grafana-57c8b5d877-v4xwv              Unable to attach or mount volumes: unmounted volumes=[kube-api-access-bwnlz], unattached volumes=[sc-dashboard-volume kube-api-access-bwnlz config ldap sc-dashboard-provider storage]: timed out waiting for the condition
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-k5xbr        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-gmsf8        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-m2bw6        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-m9ldv        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-rhc7t        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-rpfll        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-xqpb5        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
11m         Warning   FailedScheduling   pod/kubecost-prometheus-node-exporter-2ns68        0/9 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/9 nodes are available: 9 No preemption victims found for incoming pod.
12m         Warning   FailedMount        pod/kubecost-grafana-57c8b5d877-v4xwv              Unable to attach or mount volumes: unmounted volumes=[kube-api-access-bwnlz], unattached volumes=[config ldap sc-dashboard-provider storage sc-dashboard-volume kube-api-access-bwnlz]: timed out waiting for the condition
7m35s       Warning   FailedMount        pod/kubecost-grafana-57c8b5d877-v4xwv              Unable to attach or mount volumes: unmounted volumes=[kube-api-access-bwnlz], unattached volumes=[storage sc-dashboard-volume kube-api-access-bwnlz config ldap sc-dashboard-provider]: timed out waiting for the condition

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

Clusters using aws-virtual-gpu-device-plugin have negative GPU idle cost

Describe the bug
A user reported they noticed GPU costs on their cluster. After looking into the details of their environment, we noticed that they use the aws-virtual-gpu-device-plugin to manage their GPU devices on their cluster. I was able to reproduce the same issue by deploying an AWS GPU-supported node and deploying the controller. Before deploying the DaemonSet controller, I had valid values displaying for the underlying node GPU cost, but after deploying, my node_gpu_count metric emitted from /model/metrics is 0.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy an AWS cluster with a GPU-supported device (g4dn.xlarge)
  2. Deploy Kubecost and aws-virtual-gpu-device-plugin
  3. Deploy an example application using a GPU
  4. See that the node associated with the GPU has 0 GPUs, and the deployment has a negative GPU cost.

Expected behavior
Idle should still be associated with the node to attribute total cost to an allocation correctly.
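
The arithmetic behind a negative idle cost can be sketched as follows, assuming idle is computed as the node's GPU cost minus the GPU cost allocated to workloads. That formula, the function name `gpu_idle_cost`, and the prices used are assumptions for illustration, not confirmed Kubecost internals.

```python
def gpu_idle_cost(gpu_count, gpu_hourly_price, allocated_gpu_cost, hours=1.0):
    """Idle GPU cost: what the node's GPUs cost minus what workloads were charged."""
    node_gpu_cost = gpu_count * gpu_hourly_price * hours
    return node_gpu_cost - allocated_gpu_cost
```

Under this assumption, a node reporting `node_gpu_count` of 0 contributes no GPU cost, so any GPU cost still allocated to a workload pushes idle below zero.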

Screenshots
Namespace with GPU related to Negative Idle:
image

Node with no controller deployed:
image

Node deployed, then controller added:
image

Prometheus metrics corresponding with toggling the label for the controller DaemonSet:
image

┆Issue is synchronized with this Jira Task by Unito

Add option to "show api call" in the UI

What problem are you trying to solve?
We consume Kubecost data via API, but we build reports in the UI to quickly verify the data. Mapping the UI data to the matching API can be difficult depending on options (aggregating by labels, filtering by labels, sharing idle, etc.) especially because the UI uses a different API than the Allocation API.

Describe the solution you'd like
It would be helpful if there were a way to translate what I am viewing in the UI to the matching API call.

Describe alternatives you've considered
Trial and error matching up UI and API calls.

How would users interact with this feature?
Ideally, there would be an option on the allocation page to "Show me the API call."
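
A rough sketch of what such a translation could produce, assuming the target is the Allocation API at `/model/allocation` with the `window`, `aggregate`, `accumulate`, and `shareIdle` query parameters. The function name `allocation_api_url` and the base URL are hypothetical.

```python
from urllib.parse import urlencode


def allocation_api_url(base_url, window, aggregate, accumulate=True, share_idle=None):
    """Build an Allocation API URL from the options selected in a report."""
    params = {
        "window": window,
        "aggregate": aggregate,
        "accumulate": str(accumulate).lower(),
    }
    if share_idle is not None:
        params["shareIdle"] = share_idle
    return f"{base_url.rstrip('/')}/model/allocation?{urlencode(params)}"


# Example with a hypothetical base URL:
print(allocation_api_url("http://localhost:9090", "7d", "namespace", share_idle="weighted"))
```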


[BUG] Allocation Report run rate cost is incorrect when schedule sent as csv or pdf to emails

Kubecost Helm Chart Version

1.106.2

Kubernetes Version

1.23.17

Kubernetes Platform

EKS

Description

After setting up a monthly run rate allocation report with the time window set to last week, downloading the report as a PDF (or scheduling it to be sent to email recipients) shows the cumulative cost for last week instead. The allocation report is correct in the UI, but not when exported as a PDF file or sent as a scheduled email.

Steps to reproduce

  1. Create a new allocation report with Last Week as date option, aggregated by Namespace, Cost for chart, and Monthly Rate for cost metric.
  2. Download the file as a PDF (or schedule the report to be sent to an email).

Expected behavior

An allocation report with a monthly run rate based on last week's cost should be generated with the correct numbers. If the report is scheduled to be sent via email, the same allocation report should be generated.
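
To make the difference between the two figures concrete, here is a minimal sketch of projecting a cumulative cost to a monthly rate. The 730 hours per month constant and the function name `monthly_run_rate` are assumptions; Kubecost's exact definition may differ.

```python
HOURS_PER_MONTH = 730  # Assumed average month length in hours.


def monthly_run_rate(cumulative_cost, window_hours, hours_per_month=HOURS_PER_MONTH):
    """Project the cost accrued over window_hours to a full month."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return cumulative_cost / window_hours * hours_per_month
```

For a one-week window (168 hours), the monthly rate is roughly 4.3 times the cumulative cost, so an export showing the cumulative figure understates the expected number by that factor.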

correct status on diagnostic "Pricing Sources" when using cloud-integration and disable settings UI cloud providers

Describe the bug

If cloud-integration is used, /diagnostics "pricing sources" leads users to believe they have a configuration issue.

When kubecost detects that cloud-integration is present, disable "Cloud Cost Settings" in /settings

To Reproduce

  1. Configure .Values.kubecostProductConfigs.cloudIntegrationSecret
  2. Deploy the Helm chart
  3. view errors in /diagnostics page and the ability to configure clouds in /settings

Expected behavior

Users should not be able to configure "cloud cost settings" when using cloud-integrations
Cloud Costs panel in /overview should link to: https://guide.kubecost.com/hc/en-us/articles/4407595968919-Setting-Up-Cloud-Integrations
Potentially bring /diagnostics "Cloud Integrations" to top of page as this is the preferred method for cloud billing integration.

Screenshots
image
image

Collect logs (please complete the following information):

NA


[Bug] A pricing source is unavailable when everything seems fine

Kubecost Helm Chart Version

1.106.2

Kubernetes Version

1.27.4

Kubernetes Platform

EKS

Description

We've configured Kubecost with AWS price reconciliation as described here: https://docs.kubecost.com/install-and-configure/install/cloud-integration/aws-cloud-integrations

After the configuration it looks like everything is working fine:
image
but we are still getting a warning: "A pricing source is unavailable: Savings Plan, Reserved Instance, and Out-Of-Cluster". We're not sure if this is a bug or if we're missing some configuration:
image

Steps to reproduce

  1. Install kubecost
  2. Configure price reconciliation: https://docs.kubecost.com/install-and-configure/install/cloud-integration/aws-cloud-integrations
  3. configure spot instance pricing: https://docs.kubecost.com/install-and-configure/install/cloud-integration/aws-cloud-integrations/aws-spot-instances

Expected behavior

Either no warnings, or errors that point to why something isn't working as intended.

Impact

No response

Screenshots

No response

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

Deployed Kubecost and it's stuck in the ContainerCreating state

Kubecost Version

1.106.4

Kubernetes Version

1.25.11

Kubernetes Platform

AKS

Description

We deployed kubecost using the following helm command

helm install kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --namespace kubecost --create-namespace --set kubecostToken="bmlkaWN1bGFnZW9yZ2VAaW4ucm0uY29txm343yadf98"

As per the following link

https://www.kubecost.com/install#show-instructions

Steps to reproduce

  1. Install kubecost using the following link - https://www.kubecost.com/install#show-instructions
  2. Command - helm install kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --namespace kubecost --create-namespace --set kubecostToken="bmlkaWN1bGFnZW9yZ2VAaW4ucm0uY29txm343yadf98"

Expected behavior

The containers should be running, but the pods are in the ContainerCreating state.

Please see below:

image

Impact

The pods are not in the Running state; they are stuck in ContainerCreating, and we are unable to access Kubecost.

Screenshots

image

Logs

PS /home/nithin> kubectl logs kubecost-cost-analyzer-7596f84b9d-tv5w2 -n kubecost                     
Defaulted container "cost-model" out of: cost-model, cost-analyzer-frontend
Error from server (BadRequest): container "cost-model" in pod "kubecost-cost-analyzer-7596f84b9d-tv5w2" is waiting to start: ContainerCreating

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

Show namespace items even if their total cost is zero

In some namespaces, there seems to be no activity, and Kubecost does not show them in the 'Cumulative cost' view. However, finance staff would like to have them listed (even with an activity/cost of 0). Can Kubecost provide an option to display these namespaces if required? Or is there some way to inject them into the report?
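
Until such an option exists, one workaround is to post-process exported data. A minimal sketch, assuming you have a mapping of namespace to cost from a report and a separate list of all namespaces in the cluster; the function name `with_zero_cost_namespaces` is hypothetical.

```python
def with_zero_cost_namespaces(costs, all_namespaces):
    """Return a report that lists every namespace, using 0.0 where no cost was reported."""
    report = {namespace: 0.0 for namespace in all_namespaces}
    report.update(costs)  # Reported costs override the zero defaults.
    return dict(sorted(report.items()))
```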

enable RBAC filtering in assets

What problem are you trying to solve?

Currently, RBAC allows for filtering in cost allocation. The customer is requesting filtering in assets as well.

Ideally, ensure all savings and advanced reports adhere to the filters as well.

Describe the solution you'd like

Customer has stated: "We have a bunch of groups and every group should have access to exactly one cluster. Also in every cluster we want to exclude visibility to a few namespaces."

The primary focus for the next revision could be on namespaces and clusters.
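
The customer's rule (one cluster per group, minus a few hidden namespaces) can be sketched as a simple filter. The record shape with `cluster` and `namespace` keys and the function name `visible_items` are hypothetical and do not reflect the filters.json format.

```python
def visible_items(items, allowed_cluster, hidden_namespaces):
    """Keep only items in the group's cluster whose namespace is not hidden."""
    hidden = set(hidden_namespaces)
    return [
        item
        for item in items
        if item["cluster"] == allowed_cluster and item["namespace"] not in hidden
    ]
```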

Describe alternatives you've considered

We offered other options, such as using the API to pull data into another dashboard, but the Kubecost UI has other valuable features.

How would users interact with this feature?

filters.json configmap


Unmounted PVCs appearing under Abandoned workloads

Describe the bug
On the abandoned workloads savings page, PVCs deployed to the cluster have a line item for "x-unmounted-pvcs," regardless if those PVCs are attached to a pod. Perhaps this is intended. Still, it seems to pull all "-unmounted-pvc" values.

To Reproduce
Steps to reproduce the behavior:

  1. Go to the Abandoned workload savings page and look for "-unmounted-pvc" values

Expected behavior
I'm unsure how we expect to show "abandoned pvcs" since this page is related to traffic, and I'm unsure how that metric relates to PVCs. An explanation on the page to describe how to interpret this data would be helpful.

Screenshots
image

What impact will this have on your ability to get value out of Kubecost?
This helps clarify what the information on this page is trying to portray.

Please share the support case, if any
ZD 4488

[Bug] Getting 0 in ExternalCost field when performing GET request to Kubecost API

Kubecost Helm Chart Version

v1.106.0

Kubernetes Version

1.26.3

Kubernetes Platform

AKS

Description

Kubecost configured and correctly integrated with Azure with no errors reported through the UI Bug Report.
Even so, when trying to get the ExternalCost value through the API, this always returns 0.

  • Kubecost version

image

  • Cloud Integration

image

  • Prometheus Status

image

  • Through the UI the value is correctly returned.

image

Steps to reproduce

  1. Try making a GET request to the Kubecost API: curl --location 'https://xxxx.xxx.xxx.xx.ai/kubecost/model/allocation?aggregate=namespace&externalCost=true&accumulate=true&shareIdle=weighted&reconcile=false&window=7d&shareCost=100' --header 'Authorization: Bearer token'

Result:

{ "code": 200, "data": [ { "1": { "name": "1", "properties": { "namespace": "1", "namespaceLabels": { "istio_injection": "enabled", "kubernetes_io_metadata_name": "1", "zen_security": "enabled" } }, "window": { "start": "2023-10-03T00:00:00Z", "end": "2023-10-10T00:00:00Z" }, "start": "2023-10-03T00:00:00Z", "end": "2023-10-09T15:50:00Z", "minutes": 9590, "cpuCores": 0.06648, "cpuCoreRequestAverage": 0.06643, "cpuCoreUsageAverage": 0.01415, "cpuCoreHours": 10.62639, "cpuCost": 0.92469, "cpuCostAdjustment": 0, "cpuEfficiency": 0.21304, "gpuCount": 0, "gpuHours": 0, "gpuCost": 0, "gpuCostAdjustment": 0, "networkTransferBytes": 3323627386.04374, "networkReceiveBytes": 6739244814.07457, "networkCost": 0.00232, "networkCrossZoneCost": 0, "networkCrossRegionCost": 0, "networkInternetCost": 0.00232, "networkCostAdjustment": 0, "loadBalancerCost": 0, "loadBalancerCostAdjustment": 0, "pvBytes": 0, "pvByteHours": 0, "pvCost": 0, "pvs": null, "pvCostAdjustment": 0, "ramBytes": 762101360.72892, "ramByteRequestAverage": 749403818.31074, "ramByteUsageAverage": 465442891.87187, "ramByteHours": 121809200823.17249, "ramCost": 0.7994, "ramCostAdjustment": 0, "ramEfficiency": 0.62108, "externalCost": 0, "sharedCost": 0.03852, "totalCost": 1.76493, "totalEfficiency": 0.40224, "proportionalAssetResourceCosts": {}, "lbAllocations": null, "sharedCostBreakdown": {} } } ] }
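
To check the field across every namespace in a response like the one above, a small helper can pull out `externalCost` per allocation. This is a sketch that assumes the response shape shown here (a `data` list of objects keyed by allocation name); the function name `external_costs` is hypothetical.

```python
def external_costs(response):
    """Sum externalCost per allocation name across all sets in the response."""
    totals = {}
    for allocation_set in response.get("data", []):
        for name, allocation in (allocation_set or {}).items():
            totals[name] = totals.get(name, 0.0) + allocation.get("externalCost", 0.0)
    return totals


# Trimmed-down version of the response above.
sample = {"code": 200, "data": [{"1": {"name": "1", "externalCost": 0, "totalCost": 1.76493}}]}
print(external_costs(sample))
```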

Expected behavior

The API should return the same ExternalCost value that the UI shows, rather than 0.

Impact

No response

Screenshots

image
image

Logs

No response

Slack discussion

#5155

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

node_total_hourly_cost showing the wrong price (eks)[Bug]

Kubecost Helm Chart Version

1.108.1

Kubernetes Version

1.24

Kubernetes Platform

EKS

Description

The node_total_hourly_cost metric reports the wrong price.

I noticed it on several EC2 instance types, especially the C family. Even stranger, some clusters do show the right price:
image

EC2 pricing:
image

This happened on version 1.105.1 and again after upgrading to 1.108.1, which is the latest at the moment. I was hoping it was a known bug that had already been fixed in newer versions.

Steps to reproduce

  1. Install kubecost
  2. Query node_total_hourly_cost{instance_type="c5d.18xlarge"}
  3. The result is $0.00670

image

image

Expected behavior

The query node_total_hourly_cost{instance_type="c5d.18xlarge"} should return $3.46.
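
A quick way to flag this kind of discrepancy is to compare the reported metric with a reference price within a tolerance. This is a minimal sketch; the function name `is_price_plausible` and the 5% tolerance are assumptions.

```python
def is_price_plausible(reported, expected, tolerance=0.05):
    """Return True if reported is within tolerance (as a fraction) of expected."""
    return abs(reported - expected) <= tolerance * expected


# The values from this report: 0.00670 reported versus 3.46 expected.
print(is_price_plausible(0.00670, 3.46))
```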

Impact

wrong cost estimation for workloads

Screenshots

No response

Logs

No response

Slack discussion

https://kubecost.slack.com/archives/CE76NJE6S/p1704884137548639

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

[Bug] Kubecost not showing the old metrics after some updates to allow the upgrade to EKS 1.25

Kubecost Helm Chart Version

1.98

Kubernetes Version

1.25

Kubernetes Platform

EKS

Description

Hello folks!

After upgrading our EKS cluster to 1.25, we noticed that Kubecost stopped working due to the deprecation of the PSP features (as stated here kubecost/cost-analyzer-helm-chart#1773 (comment))

After a few days of research, we found that workaround and decided to apply it. It worked as expected, and Kubecost is now back online.

However, we noticed that we lost our old metrics and only have new metrics from today onward. Checking the Kubecost logs, I found the following (pasted in the Logs section).

Given this, is our data still there and somehow not being read, or did we hit another issue and lose it after pushing the Helm chart with the updates to Argo?

thanks

Steps to reproduce

  1. Update cluster to 1.25
  2. Apply the last workaround stated here kubecost/cost-analyzer-helm-chart#1773 (comment)
  3. Check that kubecost is back on, but missing the old metrics.

Expected behavior

The old metrics should still be available in the console.

Impact

Big: FinOps can't get any value from it.

Screenshots

image

Logs

200
2024-01-10T18:38:52.833535279Z WRN CostModel.ComputeAllocation: Node spot  query result for missing node: cluster-one/fargate-ip-10-149-241-43.ec2.internal
199
2024-01-10T18:38:52.836968532Z INF ETL: Allocation[1h]: AggregatedStore[UDejW]: run: aggregated [2024-01-10T18:00:00+0000, 2024-01-10T19:00:00+0000) from 132 to 46 in 733.849µs
198
2024-01-10T18:38:52.840605258Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T14:00:00+0000, 2024-01-10T15:00:00+0000) from 0 to 0 in 460ns
197
2024-01-10T18:38:52.878688631Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T15:00:00+0000, 2024-01-10T16:00:00+0000) from 0 to 0 in 380ns
196
2024-01-10T18:38:52.918977102Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T16:00:00+0000, 2024-01-10T17:00:00+0000) from 0 to 0 in 370ns
195
2024-01-10T18:38:52.94252342Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T17:00:00+0000, 2024-01-10T18:00:00+0000) from 0 to 0 in 350ns
194
2024-01-10T18:38:52.994287847Z INF ETL: Asset[1h]: AggregatedStore.Run[sipig]: run: aggregated [2024-01-10T18:00:00+0000, 2024-01-10T19:00:00+0000) from 17 to 2 in 1.534649ms
193
2024-01-10T18:39:01.650451796Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
192
2024-01-10T18:39:01.650540317Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
191
2024-01-10T18:39:01.65316764Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
190
2024-01-10T18:39:01.653273281Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
189
2024-01-10T18:40:01.694239625Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
188
2024-01-10T18:40:01.694317886Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
187
2024-01-10T18:40:01.696444753Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
186
2024-01-10T18:40:01.696531594Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
185
2024-01-10T18:40:07.488728344Z INF http: named cookie not present
184
2024-01-10T18:40:07.488836665Z INF [JWT Groups] No Cookie set
183
2024-01-10T18:40:07.489532994Z INF ETL: Allocation: QueryAllocation([2024-01-03T18:40:07+0000, 2024-01-10T18:40:07+0000), [cluster]) from AggregatedStore[1d] 602.178µs [query 263.583µs] [idle/tenancy 480ns] [external 330ns] [aggregate 337.105µs] [accumulate 440ns] [stop 240ns]
182
2024-01-10T18:40:07.490493286Z INF http: named cookie not present
181
2024-01-10T18:40:07.490555817Z INF [JWT Groups] No Cookie set
180
2024-01-10T18:40:07.490567657Z INF http: named cookie not present
179
2024-01-10T18:40:07.490665469Z INF [JWT Groups] No Cookie set
178
2024-01-10T18:40:07.491098084Z INF ETL: QuerySummaryAllocation([2024-01-10T14:40:07+0000, 2024-01-10T18:40:07+0000), [namespace]) from AggregatedStore[1h] 464.366µs [query 322.865µs] [idle/tenancy 25.52µs] [external 380ns] [aggregate 114.971µs] [accumulate 400ns] [stop 230ns]
2024-01-10T18:40:07.49158431Z INF ETL: QuerySummaryAllocation([2024-01-07T18:40:07+0000, 2024-01-10T18:40:07+0000), [cluster]) from AggregatedStore[1d] 372.185µs [query 251.633µs] [idle/tenancy 18.641µs] [external 520ns] [aggregate 100.361µs] [accumulate 810ns] [stop 220ns]
2024-01-10T18:40:08.063251158Z INF http: named cookie not present
2024-01-10T18:40:08.06332345Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.064087869Z INF ETL: QuerySummaryAllocation([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [namespace]) from AggregatedStore[1h] 658.648µs [query 362.115µs] [idle/tenancy 180.052µs] [external 640ns] [aggregate 115.231µs] [accumulate 390ns] [stop 220ns]
2024-01-10T18:40:08.06967862Z ERR ETL: failed to merge cloud usage: error merging cloud usage: MergeAssetSetRanges failed: expected range length 24, but got 1
2024-01-10T18:40:08.069887802Z INF ETL: Asset: QueryAsset([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [type]) from ETLStore[1h] 648.518µs [query 439.196µs] [cloud 121.551µs] [aggregate 86.951µs] [accumulate 510ns] [stop 310ns]
2024-01-10T18:40:08.079095549Z INF http: named cookie not present
2024-01-10T18:40:08.079168321Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.079718088Z INF ETL: QuerySummaryAllocation([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from AggregatedStore[1h] 471.256µs [query 338.014µs] [idle/tenancy 54.551µs] [external 230ns] [aggregate 78.071µs] [accumulate 220ns] [stop 170ns]
2024-01-10T18:40:08.080747351Z INF http: named cookie not present
2024-01-10T18:40:08.080843862Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.08073867Z INF ETL: Asset: QueryAsset([2024-01-03T18:40:08+0000, 2024-01-10T18:40:08+0000), [service]) from ETLStore[1d] 439.355µs [query 374.264µs] [cloud 11.6µs] [aggregate 52.931µs] [accumulate 310ns] [stop 250ns]
2024-01-10T18:40:08.264739847Z INF ETL: Allocation: QueryAllocation([2024-01-09T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster node namespace pod container]) from ETLStore[1d] 1.658131ms [query 1.348698ms] [idle/tenancy 1.51µs] [external 230ns] [aggregate 306.733µs] [accumulate 400ns] [stop 560ns]
2024-01-10T18:40:08.26576271Z INF ETL: Allocation: QueryAllocation([2024-01-08T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster node namespace pod controller]) from ETLStore[1d] 2.792447ms [query 1.865884ms] [idle/tenancy 820ns] [external 270ns] [aggregate 809.321µs] [accumulate 115.902µs] [stop 250ns]
2024-01-10T18:40:08.265991922Z INF [Profiler] 3.090089ms: Savings: abandonedWorkloads
2024-01-10T18:40:08.266806192Z INF ETL: Allocation: QueryAllocation([2024-01-08T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster node namespace pod container]) from ETLStore[1d] 3.738708ms [query 3.418894ms] [idle/tenancy 660ns] [external 260ns] [aggregate 318.234µs] [accumulate 410ns] [stop 250ns]
2024-01-10T18:40:08.26737143Z INF [Profiler] 4.343265ms: Savings: requestSizing
2024-01-10T18:40:08.267471681Z INF ETL: Asset: QueryAsset([2024-01-08T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from ETLStore[1d] 297.653µs [query 239.423µs] [cloud 4.65µs] [aggregate 48.65µs] [accumulate 3.33µs] [stop 1.6µs]
2024-01-10T18:40:08.268544885Z INF [Profiler] 5.433419ms: Savings: clusterSizing
2024-01-10T18:40:08.273082982Z INF http: named cookie not present
2024-01-10T18:40:08.273221034Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.27366333Z INF http: named cookie not present
2024-01-10T18:40:08.27369793Z INF [JWT Groups] No Cookie set
2024-01-10T18:40:08.27446743Z INF ETL: QuerySummaryAllocation([2024-01-03T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from AggregatedStore[1d] 1.150625ms [query 308.914µs] [idle/tenancy 68.001µs] [external 440ns] [aggregate 772.32µs] [accumulate 700ns] [stop 250ns]
2024-01-10T18:40:08.276718678Z INF ETL: QuerySummaryAllocation([2024-01-03T18:40:08+0000, 2024-01-10T18:40:08+0000), [cluster]) from AggregatedStore[1d] 2.172538ms [query 1.683962ms] [idle/tenancy 104.601µs] [external 690ns] [aggregate 382.355µs] [accumulate 600ns] [stop 330ns]
2024-01-10T18:40:08.427697895Z INF Found Discount for InstanceType: t3a.xlarge of 0.00
2024-01-10T18:40:08.427784386Z INF [Turndown Savings] Failed to locate 'instance_type' on node pricing metric.
2024-01-10T18:40:08.427811097Z INF Found Discount for InstanceType:  of 0.00
2024-01-10T18:40:08.427837787Z INF [Turndown Savings] Failed to locate 'instance_type' on node pricing metric.
2024-01-10T18:40:08.427859757Z INF Found Discount for InstanceType:  of 0.00
2024-01-10T18:40:08.427907348Z INF Found Discount for InstanceType: t3a.large of 0.00
2024-01-10T18:40:08.427956138Z INF Found Discount for InstanceType: t3a.medium of 0.00
2024-01-10T18:40:08.428006209Z INF [Profiler] 165.524482ms: Savings: nodeTurndown
2024-01-10T18:41:01.739217306Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:41:01.739297098Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:41:01.741587906Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:41:01.741670107Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.770136424Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.770219475Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.772953Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
2024-01-10T18:42:01.77304115Z INF Error getting node pricing. Error: Invalid Pricing Key "us-east-1,,linux"
======= snip=========

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

Allow Cluster Rightsizing report to only suggest from allowed instance types

What problem are you trying to solve?
My organization has a predefined list of allowed instance sizes. The cluster rightsizing savings report recommends instance sizes and families that are disallowed by my organization.

Describe the solution you'd like
A way to define which instances/families are allowed to be suggested. Ideally, a flexible way to define which instances are allowed or disallowed based on names, families, and potentially instance properties (generation, region, network features, GPU, etc.). A good MVP would be simple name matching for allowed/disallowed.
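
The name-matching MVP described above could look roughly like the sketch below. It is only an illustration, not how Kubecost works: the function names, the allow/deny parameters, and the use of shell-style wildcards are all assumptions made for this example.

```python
from fnmatch import fnmatchcase


def is_allowed(instance_type, allow=("*",), deny=()):
    """Return True if the name matches an allow pattern and no deny pattern.

    Hypothetical helper: deny patterns take precedence over allow patterns.
    """
    if any(fnmatchcase(instance_type, pattern) for pattern in deny):
        return False
    return any(fnmatchcase(instance_type, pattern) for pattern in allow)


def filter_recommendations(candidates, allow=("*",), deny=()):
    """Keep only the candidate instance types the organization permits."""
    return [c for c in candidates if is_allowed(c, allow, deny)]


# Hypothetical candidate list, as a rightsizing report might produce it.
candidates = ["t3a.medium", "t3a.large", "t3a.xlarge", "m5.large", "c6g.xlarge"]
allowed = filter_recommendations(
    candidates, allow=["t3a.*", "m5.*"], deny=["*.xlarge"]
)
print(allowed)
```

Matching on instance properties (generation, GPU, and so on) would need instance metadata that this sketch does not model.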

Describe alternatives you've considered
Manually picking instances my organization supports based on the specs of the instances Kubecost suggests (vCPU, GB of memory, etc.)

How would users interact with this feature?
helm, configmap, API, download from object storage, settings, etc.

Add ability to ignore specific resources

What problem are you trying to solve?
Each namespace has a PVC that maps to an Azure Storage Account (via Blob CSI). The storage is (a) already accounted for in other costing and (b) not actively using the 10TiB of data that it reports, and therefore should not be priced as such in Kubecost.

Describe the solution you'd like
A way to allow users to ignore specific resources through a configuration within Kubecost

Describe alternatives you've considered
Using a relabel config to drop the metrics from Prometheus directly

How would users interact with this feature?
Configuration through values.yaml, maybe through an ignoredResources section, which can specify a string for resources to ignore. "{namespace}/{objectType}/{objectName}" or "kubecost/persistentvolumeclaim/kubecost-cost-analyzer". Ideally, this includes wildcard support to handle multiple namespaces or objects.
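
As a rough sketch of how such patterns might be evaluated, the snippet below matches a resource against "{namespace}/{objectType}/{objectName}" entries with wildcard support. The ignoredResources name comes from the request above; the function names and the per-segment wildcard semantics are assumptions for illustration, not an existing Kubecost feature.

```python
from fnmatch import fnmatchcase


def matches(pattern, namespace, object_type, object_name):
    """Match one "{namespace}/{objectType}/{objectName}" pattern.

    Each segment is matched separately, so "*" never crosses a "/".
    The object type is compared case-insensitively.
    """
    parts = pattern.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected namespace/objectType/objectName, got {pattern!r}")
    ns_pat, type_pat, name_pat = parts
    return (
        fnmatchcase(namespace, ns_pat)
        and fnmatchcase(object_type.lower(), type_pat.lower())
        and fnmatchcase(object_name, name_pat)
    )


def is_ignored(namespace, object_type, object_name, ignored_resources):
    """Return True if any configured pattern matches the resource."""
    return any(
        matches(p, namespace, object_type, object_name) for p in ignored_resources
    )


# Hypothetical configuration values.
ignored_resources = [
    "kubecost/persistentvolumeclaim/kubecost-cost-analyzer",
    "*/persistentvolumeclaim/blob-*",
]
print(is_ignored("team-a", "PersistentVolumeClaim", "blob-data", ignored_resources))
```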


Link to "Inspect Details" for any aggregation group (Owner/Product etc), fix missing Cloud/Network

What problem are you trying to solve?

In /allocations in the UI, users are able to "Inspect Details" when aggregating by Namespace, which opens the /details page in a new browser tab. This option is not shown when aggregating by a group like "Owner" or "Product".

[Examples of the "Inspect Details" pop-up for aggregation:namespace]

You are able to edit the endpoint URL directly and specify the name + type (shown below), and it brings up the /details page for that selection. In my case, I will use "type=product,name=grafana", which works, but this /details page is missing the "Cloud Costs" and "Network Costs".

If I go to Advanced Reporting, I can see the "product:grafana" line item does have associated external cost, so it should also show within the /details inspector page.

[Example of a different /details page for "namespace:kubecost" - this one showing the Cloud + Network Costs]

Describe the solution you'd like

Add ability to pull up the Inspect Details (/details) for any aggregation in the UI. This would behave just like it does in v1.99 /allocations UI for "namespace"

Within each new /details page, add Cloud Costs and Network Costs that are currently missing when you go to the URL directly.

This should ideally be added to Advanced Reporting as well. I should be able to click and get to the "Inspect Details" page from any "aggregation" row where possible.

Describe alternatives you've considered

If these pages need to be accessed by hard-coding the URL rather than through an "Inspect Details" hyperlink on each reporting row in the UI, that would be fine. However, the cloud costs + network costs that are missing when going directly to the URL should still be added.

How would users interact with this feature?

They would click on the row name within the UI and "Inspect Details" when aggregating by Owner/Team/Department/Environment (any of the built-in "aggregation groups"). This would be an option in any reporting view that makes sense, including in the new Advanced Reporting.


[Bug] Cloud cost view not default sorted by total cost

Kubecost Version

106.2

Kubernetes Version

n/a

Kubernetes Platform

EKS

Description

Most other Kubecost views are sorted by total costs, but this particular view of cloud costs is not.

Steps to reproduce

  1. View this page and look at total cost: https://infra.kceng.dev/cloud?reportTitle=Cumulative+cost+for+last+60+days+by+item&window=60d&agg=item&context=W3sicHJvcGVydHkiOiJzZXJ2aWNlIiwidmFsdWUiOiJBbWF6b25FQzIiLCJuYW1lIjoiU2VydmljZSJ9XQ%3D%3D&filters=W10%3D&costMetric=AmortizedNetCost&selectedProviderId=&selectedItemName=

OR

  1. Visit https://infra.kceng.dev/cloud
  2. Drill into EC2 and view this page: https://infra.kceng.dev/cloud?reportTitle=Cumulative+cost+for+last+7+days+by+item&agg=item&context=W3sicHJvcGVydHkiOiJzZXJ2aWNlIiwidmFsdWUiOiJBbWF6b25FQzIiLCJuYW1lIjoiU2VydmljZSJ9XQ%3D%3D&filters=W10%3D

Expected behavior

Individual items sorted by descending cost.

Impact

I was confused about how "costs could be $4k+ but then have the biggest item be $6", then saw that there were bigger items in the graph tooltip. I thought there was a major data problem, and then realized that the data wasn't sorted like in other views.

Screenshots

image

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

[Feature] Support different step sizes in cloud data, e.g. daily, weekly, and monthly

Problem Statement

When I look at data over long windows, e.g. quarters or years, I'd really like to view it by weeks or months.

Solution Description

The proposal is to add a step size option to this window that supports default, daily, weekly, and monthly.

image
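
To make the proposal concrete, here is a minimal sketch of how daily cost data could be rolled up into weekly or monthly steps. The function names and the sample data are hypothetical, weeks are assumed to start on Monday, and the "default" step is left out because its behavior is not specified here.

```python
from collections import defaultdict
from datetime import date, timedelta


def bucket_start(day, step):
    """Return the first day of the bucket that `day` falls into."""
    if step == "daily":
        return day
    if step == "weekly":
        # Assumption: weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if step == "monthly":
        return day.replace(day=1)
    raise ValueError(f"unsupported step: {step}")


def resample(daily_costs, step):
    """Sum daily costs into buckets of the requested step size."""
    totals = defaultdict(float)
    for day, cost in daily_costs.items():
        totals[bucket_start(day, step)] += cost
    return dict(sorted(totals.items()))


# Hypothetical data: 10 days at $10/day, starting Monday 2024-01-29.
daily = {date(2024, 1, 29) + timedelta(days=i): 10.0 for i in range(10)}
print(resample(daily, "weekly"))
print(resample(daily, "monthly"))
```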

Alternatives

No response

Additional Context

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

[Feature] Method to determine if costs have been reconciled through the API

Problem Statement

When consuming allocation data via the API, there isn't a reliable method to determine if data points have already been reconciled with the CSP or not. As a comparison, there is a feature in the Kubecost UI that highlights whether or not costs have been already reconciled (although it isn't clear how reliable that functionality is either).

Solution Description

A simple property on each data element in the response, like reconciled: True, would be enough. Optionally, it would also be useful to be able to submit this as a query parameter, for example to fetch only costs that have been reconciled.

Alternatives

  • Fetch data with at least 48 hours of lag: although this should work for the happy path, it's not safe in situations where problems occurred while reconciling the data.
  • Test for *CostAdjustment properties being different than 0: although true for most cases, there's likely a variety of scenarios where the adjustment for a reconciled cost might still be 0.
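
For reference, the two workarounds above can be sketched as client-side heuristics. This is an illustration only: the function names and the sample allocation record are hypothetical, and, as noted in the list, neither check is a reliable substitute for an explicit reconciled property.

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from the first alternative: allow 48 hours for reconciliation.
RECONCILIATION_LAG = timedelta(hours=48)


def has_nonzero_adjustment(allocation):
    """True if any property ending in "CostAdjustment" is non-zero."""
    return any(
        key.endswith("CostAdjustment") and value != 0
        for key, value in allocation.items()
    )


def is_old_enough(window_end, now):
    """True if the allocation window ended at least 48 hours ago."""
    return now - window_end >= RECONCILIATION_LAG


def probably_reconciled(allocation, window_end, now):
    """Conservative guess: require both heuristics to agree."""
    return is_old_enough(window_end, now) and has_nonzero_adjustment(allocation)


# Hypothetical allocation record, not real API output.
sample = {"name": "kubecost", "cpuCost": 1.2, "cpuCostAdjustment": -0.1, "ramCostAdjustment": 0.0}
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
print(probably_reconciled(sample, datetime(2024, 1, 7, tzinfo=timezone.utc), now))
```

Requiring both checks reduces false positives but will still miss reconciled costs whose adjustment is legitimately 0.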

Additional Context

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

[Feature] Use OIDC discoveryURL to construct authURL

Problem Statement

I want to integrate an OIDC provider for authentication. For its configuration I have the clientID, clientSecret and discoveryURL from the OIDC provider. But currently, to configure OIDC in Kubecost, I also need to specify an authURL with all required parameters hardcoded in the URL.

Solution Description

The discoveryURL should be used to discover all other required information, such as the authURL. On login, the user should be redirected to the authURL with all required parameters automatically added. If required, additional parameters should be configurable. Kubecost should also add a nonce (state) parameter automatically, as defined by the OIDC spec.
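
A minimal sketch of the requested flow is shown below, assuming the discovery document has already been fetched from the discoveryURL and parsed into a dictionary. The function name, its parameters, and the example values are hypothetical; only authorization_endpoint and the query parameter names come from the OIDC specifications. Note that the spec defines state and nonce as two separate parameters, so the sketch generates both.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_auth_url(discovery_doc, client_id, redirect_uri, scope="openid", extra_params=None):
    """Build an authorization URL from an OIDC discovery document.

    Returns the URL plus the generated state and nonce, which the caller
    must store and verify when the provider redirects back.
    """
    endpoint = discovery_doc["authorization_endpoint"]
    state = secrets.token_urlsafe(32)  # fresh per login, protects against CSRF
    nonce = secrets.token_urlsafe(32)  # fresh per login, binds the ID token to this request
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "nonce": nonce,
    }
    params.update(extra_params or {})  # configurable additional parameters
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode(params), state, nonce


# Hypothetical discovery document and client settings.
doc = {"authorization_endpoint": "https://example.auth0.com/authorize"}
url, state, nonce = build_auth_url(doc, "my-client", "https://kubecost.example.com/oidc/callback")
query = parse_qs(urlsplit(url).query)
print(query["client_id"], query["scope"])
```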

Alternatives

Use the hardcoded authURL with all parameters (redundant) and a fixed value for the nonce parameter state (insecure).

Additional Context

I'm setting up Auth0 as the OIDC provider, but this feature request applies to all spec-compliant OIDC providers.

Troubleshooting

  • I have read and followed the issue guidelines and this is a feature request only for the Kubecost application.
  • I have searched other issues in this repository and mine is not recorded.

Monitor RDS instances and provide recommendations on instance type

What problem are you trying to solve?
I would like to be able to rightsize the RDS instances to what is appropriate to the usage.

Describe the solution you'd like
Currently Kubecost is able to recommend instance types for the k8s nodes. I would like a similar feature for other AWS services, starting with RDS, for example suggesting Graviton instances. It should also provide some information on the performance impact of using a given instance type.

Describe alternatives you've considered
AWS CloudWatch / Datadog

How would users interact with this feature?

Through the Kubecost UI, under Assets, with a separate section for each AWS service.

Expand UI so users can gain insight into the most inefficient+expensive namespaces/teams/services/etc

What problem are you trying to solve?
I want to be able to understand which of my groups (i.e. namespaces/teams/services) are spending the most money on resources that they have allocated but aren't actually using. Total cost for each grouping is useful, but it's not very meaningful on its own: if resource usage is high, the cost is justified. Efficiency for each grouping is useful, but it's not very helpful if most of my low-efficiency groups aren't costing much anyway.

Describe the solution you'd like
The Kubecost UI has more columns / configurable columns that show those sorts of metrics. For example: Cost of Resources Used, Cost of (pod-level) Idle Resources, Cost of RAM Used, Cost of Idle RAM, etc.

Describe alternatives you've considered
I've tried exporting metrics to Datadog to look at these numbers, but the only costing metrics Kubecost exports are all by node (e.g. node_ram_hourly_cost), so they can't be used to gain insight into groups such as namespace/team/etc.

How would users interact with this feature?
They would sort by a metric such as "Cost of Idle Resources" to quickly recognize which groups are paying a lot for resources that aren't necessarily needed.
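
To illustrate the kind of ranking this would enable, the sketch below derives a "Cost of Idle Resources" value from total cost and efficiency and sorts groups by it. The group records are made up, and treating efficiency as the fraction of cost that is actually used is a simplifying assumption for this example.

```python
def idle_cost(total_cost, efficiency):
    """Cost of allocated-but-unused resources.

    Assumes efficiency is a fraction between 0 and 1 of the cost in use.
    """
    return total_cost * (1.0 - efficiency)


# Hypothetical per-group figures.
groups = [
    {"name": "team-a", "total_cost": 1000.0, "efficiency": 0.9},
    {"name": "team-b", "total_cost": 400.0, "efficiency": 0.25},
    {"name": "team-c", "total_cost": 20.0, "efficiency": 0.05},
]

# Rank by idle cost, highest first.
ranked = sorted(
    groups,
    key=lambda g: idle_cost(g["total_cost"], g["efficiency"]),
    reverse=True,
)
for g in ranked:
    print(g["name"], round(idle_cost(g["total_cost"], g["efficiency"]), 2))
```

In this example team-a has the highest total cost and team-c the lowest efficiency, yet team-b ranks first, which is the insight neither column gives on its own.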
