cloud-cost-reporter's People

Contributors

timalces, voxsecundus

cloud-cost-reporter's Issues

Handle Azure API inconsistent resource group capitalisation in instance id

The Azure API sometimes returns instance ids with the resource group in lowercase and sometimes in uppercase. When this occurs it can cause issues for cloud-cost-visualiser, as exact comparisons of these instance ids will not match. To accommodate this Azure inconsistency, when creating an instance log we should first downcase the resource group in the instance id.
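A minimal sketch of the downcasing step, assuming the instance id follows the usual Azure resource id layout with a `/resourceGroups/<name>/` segment. The helper name `normalise_instance_id` is hypothetical and not taken from the codebase.

```ruby
# Hypothetical helper: lowercases only the resource group segment of an
# Azure instance id and leaves every other part of the id untouched.
def normalise_instance_id(instance_id)
  # Match the segment case-insensitively, as Azure may also vary its casing.
  instance_id.sub(%r{(/resourceGroups/)([^/]+)}i) do
    "#{Regexp.last_match(1)}#{Regexp.last_match(2).downcase}"
  end
end

# Example usage with a made-up id:
# normalise_instance_id("/subscriptions/abc/resourceGroups/MyGroup/providers/Microsoft.Compute/virtualMachines/node01")
```

Only the group name is changed, so the VM name and provider casing are kept as Azure reported them.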

Add Note That Budget is Monthly

When adding a project, the Budget (c.u.) could be made a little more specific with something like Budget (c.u./month) or Monthly Budget (c.u.)

Record compute group for AWS instance logs

For use by the cloud-cost-visualiser, record each compute node's compute group (if any) when creating an AWS project's instance logs. Already implemented for Azure projects.

Data egress figures for Azure

Daily and weekly reports currently include figures (cost and GB) for data egress on AWS only. This should also be included for Azure projects.

As we are currently receiving all the cost & usage data from Azure for a given period, it should be possible to iterate through these and obtain entries specifically for data out. Initial investigation suggests this would be entries with "UsageResourceKind" (part of "properties" -> "additionalInfo") that includes "DataTrOut".
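The filtering described above could look roughly like the sketch below. It assumes each usage entry has already been parsed into a Ruby Hash, and that "additionalInfo" may arrive either as a nested Hash or as a JSON string; both the entry shape and the method names are assumptions for illustration.

```ruby
require "json"

# Hypothetical helper: extracts "UsageResourceKind" from one usage entry,
# returning an empty string when it is missing or cannot be parsed.
def usage_resource_kind(entry)
  info = entry.dig("properties", "additionalInfo")
  info = JSON.parse(info) if info.is_a?(String)
  info.is_a?(Hash) ? info["UsageResourceKind"].to_s : ""
rescue JSON::ParserError
  ""
end

# Keeps only the entries whose resource kind mentions "DataTrOut".
def data_out_entries(usage_entries)
  usage_entries.select { |entry| usage_resource_kind(entry).include?("DataTrOut") }
end
```

The cost and GB figures for the report would then be summed over the returned entries.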

Helper for Regular Running

Add some sort of helper for regular running of daily checks & weekly reports. Perhaps just configuration files with our defaults (currently I set my cronjob to run daily checks at midday every day).

It's worth looking at cron configuration.

Record core infrastructure costs (Azure)

For use by the project specific version of cloud-cost-visualiser, we must record the costs specifically attributed to the project's core infrastructure, as part of the daily report.

Currently this is only required for Azure. Most likely this will involve adding a tag to the relevant resources, used in a similar way to how compute costs are identified.

Support for Multiple Regions & Resource Groups

In some cases resources may be spread between regions (applies to both AWS & Azure) & resource groups (Azure-specific). I would like to be able to configure a "project" that combines the costs of multiple regions and/or resource groups into one.

Needs some thought on implementation, initial thoughts:

  • Meta-projects that combine the data from child projects
  • Changes to project configuration to loop through project details to add multiple groups & regions

Ability to List Configuration

While using the tool I found it difficult to list what I already have. This mainly applies to projects & compute mappings. I think it's worth having functionality to be able to list & show the various configured components.

Greater pricing flexibility

Currently pricing for both AWS and Azure is limited in the following ways:

  • AWS cost forecasts assume shared, Linux-based instances with no special software installed. Similarly, Azure projects assume 'Standard' VMs. Accommodating other types should be explored.
  • To reduce data write/read for Azure pricing, currently only UK prices are stored. Storing all prices globally would be excessive and introduce long read times, but we could explore selectively storing prices for the locations we know projects belong to.

Prevent blanks when creating/ updating projects & handle SDK breaking input

In manage_projects.rb we don't always validate if an entered attribute is blank. We should add more validations of this, especially as in the case of regions this breaks the AWS sdk.

We can't be certain what unexpected values users might enter, so should also investigate generally handling SDK breaking errors within manage_projects.rb (they are handled in the daily and weekly reports).
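As a sketch of the blank validation, a prompt helper could simply re-ask until a non-blank value is entered. The name `prompt_required` and its keyword arguments are hypothetical and not part of manage_projects.rb.

```ruby
require "stringio"

# Hypothetical helper: keeps prompting until a non-blank value is entered.
# The input and output streams are injectable so it can be exercised without a terminal.
def prompt_required(label, input: $stdin, output: $stdout)
  loop do
    output.print "#{label}: "
    line = input.gets
    raise ArgumentError, "no input received for #{label}" if line.nil?

    value = line.strip
    return value unless value.empty?

    output.puts "#{label} cannot be blank."
  end
end

# Example usage with canned input instead of a terminal:
# prompt_required("Region", input: StringIO.new("\neu-west-2\n"), output: StringIO.new)
```

Rejecting the blank value at the prompt means it never reaches the SDK client.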

AWS pricing using saved data

After the implementation of #65 the pricing details of AWS instances are now stored in a text file. It would be worth exploring whether it is more efficient to use the data in this file to estimate future costs in the weekly report, instead of the current practice of making multiple individual pricing queries (which also involve a small cost).

Azure App Creation & Unmanaged Sites

This issue is largely unique to Azure, as the AWS CLI & API both only require the Access ID & Secret to get in. The issue is with Azure not allowing user-based API access.

In some situations we would like to use this to monitor cloud costs of externally managed accounts. Sadly, in these scenarios there will only be a single authenticated user that can be used and they will be unable to create app registrations. This means that only an authenticated az CLI exists.

Needs some thought/evaluation.

Support "pretty" compute names in daily reports

For the most part daily reports will be used for internal cost tracking, however, there could be occasions where this information is exposed to a customer in which case the pretty names will be needed.

Record costs for each compute node group (Azure)

For use by the project specific version of cloud-cost-visualiser, we must record the costs for each group of compute nodes, as part of the daily report.

Currently this is only required for Azure. Most likely this will involve adding a tag to the relevant resources, used in a similar way to how overall compute costs are identified.

Project credentials/ permissions validation at creation

It would be helpful if some test queries/ SDK objects were made upon creating a new project (in manage_projects.rb) to determine if the credentials given are valid and, if not, which additional permissions were required.

Command Line Interface

Implement a proper command interface for interacting with the tool. This is the first step towards further integration with other Flight Tools and the packaging system. Some sort of entrypoint/short name can be used for accessing the tool, such as cloud-cost or ccr or perhaps just costs.

Initial mock-up of CLI (open for discussion & interpretation):

  • dbsetup
  • compute-mapping
  • project
  • report
    • daily
    • weekly

Further to the above, implementation of the compulsory & optional args in a more standardised format.

Azure costs duplication workaround

Sometimes Azure will create duplicates of all its cost items (possibly when an invoice is being prepared), resulting in doubled cost figures. To work around this, add some initial processing of the Azure API cost data received to remove any duplicates.
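A rough sketch of that initial processing, assuming the cost items are Hashes and that duplicated items are identical in every field. If the duplicates differ in some field (for example an id), a list of fields to compare on can be passed instead; the method name and the `keys:` parameter are hypothetical.

```ruby
# Hypothetical helper: drops repeated cost items before any totals are calculated.
# With no keys given, two items count as duplicates only if every field matches.
def remove_duplicate_cost_items(cost_items, keys: nil)
  return cost_items.uniq if keys.nil?

  # Compare on the chosen fields only, keeping the first occurrence of each.
  cost_items.uniq { |item| item.values_at(*keys) }
end
```

Note that two genuinely separate charges with identical fields would also be merged, so the comparison fields need checking against real Azure data.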

Replace inline rescues with specific error rescues

Inline rescues are used rather often in the codebase, and they're dangerous because they rescue StandardError and all of its descendants, meaning completely unrelated errors (like typos or runtime problems) will be caught and ignored. It would be better practice to replace the inline rescues with begin-rescue blocks for what exactly we're looking for.
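To illustrate the difference, here is a small sketch using date parsing as a stand-in example. The method names are hypothetical and the code is not taken from the codebase.

```ruby
require "date"

# Inline form: any StandardError is swallowed, including a TypeError caused
# by accidentally passing nil, so the real problem is hidden.
def parse_date_inline(text)
  Date.strptime(text, "%Y-%m-%d") rescue nil
end

# Explicit form: only the expected "invalid date" error is rescued.
# Date.strptime raises an ArgumentError (or a subclass) for malformed input.
def parse_date_explicit(text)
  Date.strptime(text, "%Y-%m-%d")
rescue ArgumentError
  nil
end
```

With the explicit form, passing nil raises a TypeError instead of quietly returning nil, which makes the calling bug visible.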

Record Azure and AWS instance details

The cloud-cost-visualiser project requires access to details of what instances are available in each region, their prices and resources (memory, GPUs and CPUs).

Although Azure prices are currently stored, this application should also obtain and store details of Azure instance sizes, AWS prices and instance sizes. For read/write efficiency, only details for regions/locations used by existing projects should be stored.

Relates to https://github.com/openflighthpc/cloud-cost-visualiser/issues/9

AWS SDK error handling

Currently there is no special handling of AWS SDK errors (such as permission errors or failed queries), with these currently throwing exceptions that end the running of the current file. More robust handling of these should be added.

Use instances on day when forecasting costs

Currently the weekly report uses the latest instance logs for all forecasts. As there are some forecast days that are in the past (i.e. dates between the last cost log and today), for these dates we should instead use the costs of the instances recorded as running on that day. This should also be suitably explained in the report text.

Summary/Short Mode

While the current daily messages contain a lot of useful data, akin to "showing the working" for the various costs, a way of producing a digest of daily costs can come in useful for quick-glance understanding of costs.

I propose that this message would only contain:

:moneybag: Usage for DATE :moneybag:
Compute Costs (USD): X
Data Out Costs (USD): X
Total Costs (USD): X
Total Compute Units (Flat): X
Total Compute Units (Risk): X
FC Credits: X

Question: Would this be better as a CLI flag or a per-project setting?

Record log dates

At the moment we record a timestamp for each InstanceLog and CostLog and do a lot of searches for logs on specific dates using LIKE queries on these timestamps. These types of query are very inefficient - we should add a date field for these logs and use this to query logs on a specific day instead.

Make type tagging consistent

At the moment Azure projects allow for instances to be tagged with type => core or type => compute. Whereas AWS projects use compute => true and core => true. AWS projects should be updated to also use the type => format.
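While AWS resources are being re-tagged, old-style tags could be mapped onto the type => format when they are read. The sketch below assumes tags are available as a Hash of string keys and values; `normalise_type_tag` is a hypothetical name.

```ruby
# Hypothetical helper: converts compute => true / core => true tags into the
# type => compute / type => core format, leaving all other tags as they are.
def normalise_type_tag(tags)
  return tags if tags.key?("type")

  %w[compute core].each do |type|
    next unless tags[type].to_s == "true"

    # Drop the old boolean tag and add the equivalent type tag.
    return tags.reject { |key, _| key == type }.merge("type" => type)
  end
  tags
end
```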

Record storage cost log

Record a new cost log for the project's storage costs.

  • For Azure this is all 'Disk' costs
  • For AWS this is all 'EBS' costs + the Usage Type Group S3: Storage - Standard
  • For both this includes those for 'core', and excludes data out costs

Credentials protection/ encryption

The application could potentially include credentials for various different users/ apps in its database. If someone gained access to where the application was hosted, they could potentially access all of these details. It would therefore be prudent to protect/ obscure/ encrypt these credentials (and/or the database as a whole).

Track budget change history

A project's budget may change over its lifetime. At the moment only the latest value is recorded, but it would be useful to record a history of these changes. This would allow for month/date specific budgets to be used by the related cloud costs visualiser application.

Include project name in output

As the title says: in situations where multiple projects exist on multiple providers for one channel of output, including the project name in the message would help clarify what the costs are for.

When adding a project, 'start date' can be prompted twice.

Steps to reproduce:

[jack@jack-alces ruby-cost-tracker]$ ruby manage_projects.rb 
List, add, or update project(s) (list/add/update/validate)? add
Project name: test
Host (aws or azure): azure
Start date (YYYY-MM-DD): 2020-10-09
End date (YYYY-MM-DD). Press enter to leave blank: 
Start date (YYYY-MM-DD): 

Support for Deleting Projects & Compute Mappings

As per title, also see CLI issue.

It's probably worth retiring/disabling projects so that data is preserved for them but they are excluded from everything else, perhaps just setting the end_date to today to give the illusion of deletion/decommissioning.

More detailed Azure API error messages

Queries to Azure APIs are now attempted up to 3 times. If all 3 fail, currently only the final error message is printed. It would be more useful to print the error message for each attempt (only printing them if all attempts fail).
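One possible shape for this, as a sketch only: a wrapper that collects the message from each failed attempt and raises them all together once every attempt has failed. The name `with_retries` and its parameters are hypothetical, and the existing retry code may be structured differently.

```ruby
# Hypothetical wrapper: runs the block up to `attempts` times, recording the
# error message from each failure. Nothing is reported if an attempt succeeds.
def with_retries(attempts: 3, rescue_from: [StandardError])
  errors = []
  attempts.times do |index|
    begin
      return yield
    rescue *rescue_from => e
      errors << "Attempt #{index + 1}: #{e.message}"
    end
  end
  # Every attempt failed, so report all of the collected messages at once.
  raise "All #{attempts} attempts failed:\n#{errors.join("\n")}"
end
```

In practice `rescue_from` would be narrowed to the error classes the Azure queries are expected to raise, rather than left at StandardError.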

Investigate using Azure VM power state instead of availability for instance logs

At the moment we use the 'availability' of an Azure VM, as reported by the Azure AvailabilityStatuses API. However this availability status / the Azure API's reporting of it may be unreliable. It may therefore be more appropriate to use the power status of the VM.

At the moment only the instanceView API seems to give this power state, but a separate query must be made for each VM, so this could be very slow if there are many instances to check. It should be investigated further whether there is a better API or a more efficient way of checking and using power state.

Set default date further in the past

Azure costs can still be unreliable 36-48 hours after the end of a given day, being higher or lower than the figures later reported by Azure.

To give them more time to stabilise, change the default retrieval date to be 3 days in the past rather than 2.

In addition, further validation should be made of when Azure costs fluctuate/ stabilise (moving this default back further if needed).
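Keeping the offset in a single place would make it easy to move the default back again later. A minimal sketch, with a hypothetical constant and method name:

```ruby
require "date"

# Hypothetical constant: how many days back the default report date sits.
DEFAULT_DAYS_BACK = 3

# Returns the default date for cost retrieval, relative to `today`.
def default_report_date(today = Date.today, days_back: DEFAULT_DAYS_BACK)
  today - days_back
end
```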

Record compute group costs for AWS projects

For use by the cloud-cost-visualiser, record cost logs for any compute groups in AWS projects (already implemented for Azure projects). This will involve adding and filtering by a new tag.

Define project tag separate from project name

At the moment a project tag must match a project's name as it is saved in the database. Instead it should be possible to define & set a project tag that is different from the project name, to give more flexibility and ensure no overlap when creating clusters in an account.

Create baseline pricing/instance data

As part of an ongoing development in cloud-cost-visualiser, it would be useful to have a copy of the instance/pricing data available to use from the get-go, instead of needing to add at least one of each project and retrieve the data manually. It may be worth adding to the README that the data could be out of date, and may not reflect exact values until it is updated.

README Tagging Clarification

It is worth updating the README to better address some of the caveats of tagging post-resource creation

These tags should be added at the point a resource is created. If adding tags to instances in the AWS online console, these tags will also be applied to their associated storage.

The above somewhat reads that if tags are added after a resource is created that it'd propagate them to the storage volumes (which is sadly not the case).

My suggestion is to either expand this to address both scenarios (e.g. a section for "at creation time" and "after creation tagging") or to reduce & generalise the docs to say something along the lines of "For tracking by project tag, ensure that all desired resources are given a tag with the key project and a value of what you have named the project. It is recommended to check that all expected resources (IPs, volumes, etc) have the expected tag before configuring the project tracking."

Make Installation Instructions Clearer

The installation instructions could be made clearer by putting all the commands to get to a working state in a single code block. From scanning the README (as I usually do with projects) I figured all I needed to do was clone the repository to be able to go. In fact, the installation process is clone -> bundle install -> db setup -> mappings setup.

This could be made clearer.

Prevent/ handle database locking errors

If ruby daily_reports.rb and ruby record_instance_logs.rb are run at the same time then the error SQLite3::BusyException: database is locked may occur, breaking the relevant process. This should be investigated further and the error prevented and/or gracefully handled.

Ability to record costs for each day within a given date range

If the cloud-cost-reporter is added mid project, there will be no cost data in the database for dates before this point. Although not an issue for this application, it may leave the cloud-cost-visualiser with significant gaps, which can currently only be filled by generating daily reports for each day individually.

It would therefore be useful to have a script for obtaining and recording cost data for a given range of dates (just recording, no generation of reports). For all or just a named project.
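The core of such a script could be a loop over each day in the range, as in the sketch below. The method name `each_report_date` is hypothetical, and `record_costs_for` in the usage comment is only a placeholder for whatever recording logic the daily report already uses.

```ruby
require "date"

# Hypothetical helper: yields every date from start_date to end_date inclusive.
# Dates are given as YYYY-MM-DD strings, matching the prompts in manage_projects.rb.
def each_report_date(start_date, end_date)
  return enum_for(:each_report_date, start_date, end_date) unless block_given?

  first = Date.strptime(start_date, "%Y-%m-%d")
  last = Date.strptime(end_date, "%Y-%m-%d")
  (first..last).each { |date| yield date }
end

# Example usage (record_costs_for is a placeholder, not an existing method):
# each_report_date("2020-10-01", "2020-10-07") { |date| record_costs_for(project, date) }
```

An end date earlier than the start date yields nothing, so the script would simply record no costs in that case.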
