openflighthpc / cloud-cost-reporter
License: Eclipse Public License 2.0
The Azure API sometimes returns instance ids with either a lowercase or uppercase resource group. When this occurs it can cause issues for cloud-cost-visualiser, as exact comparisons of these instance ids will not match. To accommodate this Azure inconsistency, when creating an instance log we should first downcase the resource group in the instance id.
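A minimal pure-Ruby sketch of the proposed normalisation (the helper name is mine, and it assumes ids follow the standard ARM path format):

```ruby
# Sketch: downcase only the resource group segment of an Azure instance id.
# Assumes ids follow the standard ARM format:
#   /subscriptions/<id>/resourceGroups/<name>/providers/...
def normalise_instance_id(instance_id)
  instance_id.sub(%r{(/resourceGroups/)([^/]+)}i) { "#{$1}#{$2.downcase}" }
end
```

Storing the normalised id on the instance log would mean later exact comparisons in cloud-cost-visualiser always match, whichever casing Azure returned.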
When adding a project, the Budget (c.u.) prompt could be made a little more specific with something like Budget (c.u./month) or Monthly Budget (c.u.).
For use by the cloud-cost-visualiser, record each compute node's compute group (if any) when creating an AWS project's instance logs. This is already implemented for Azure projects.
Daily and weekly reports currently include figures (cost and GB) for data egress on AWS only. This should also be included for Azure projects.
Since we already receive all the cost & usage data from Azure for a given period, it should be possible to iterate through it and obtain the entries specifically for data out. Initial investigation suggests these would be entries with a "UsageResourceKind" (part of "properties" -> "additionalInfo") that includes "DataTrOut".
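A sketch of that filtering step over the already-retrieved usage items. The field names come from the issue text; treating additionalInfo as a JSON-encoded string is an assumption about the API response shape:

```ruby
require 'json'

# Select the data-out entries from a list of parsed Azure usage items.
# "UsageResourceKind" and the "DataTrOut" substring are taken from the
# issue text above; everything else here is illustrative.
def data_out_entries(usage_items)
  usage_items.select do |item|
    info = item.dig("properties", "additionalInfo")
    info = JSON.parse(info) if info.is_a?(String)
    info.is_a?(Hash) && info["UsageResourceKind"].to_s.include?("DataTrOut")
  end
end
```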
Add some sort of helper for regularly running the daily checks & weekly reports. Perhaps just configuration files with our defaults (currently I set my cronjob to run daily checks at midday every day). It's worth looking at cron configuration.
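As a concrete starting point for those defaults, an example crontab entry could be shipped or documented (the install path is a placeholder, and weekly_reports.rb is my guess at the weekly script's name):

```
# m  h dom mon dow  command
0 12 * * *  cd /path/to/cloud-cost-reporter && ruby daily_reports.rb
0 12 * * 1  cd /path/to/cloud-cost-reporter && ruby weekly_reports.rb
```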
For use by the project-specific version of cloud-cost-visualiser, we must record the costs specifically attributed to the project's core infrastructure as part of the daily report.
Currently this is only required for Azure. Most likely this will involve adding a tag to the relevant resources, used in a similar way to how compute costs are identified.
In some cases resources may be spread between regions (applies to both AWS & Azure) and resource groups (Azure-specific). I would like to be able to configure a "project" that combines the costs of multiple regions and/or resource groups into one.
Needs some thought on implementation.
While using the tool I found it difficult to list what I already have. This mainly applies to projects & compute mappings. It's worth having functionality to list & show the various configured components.
Currently, pricing data for both AWS and Azure is limited in several ways.
For use by the cloud-cost-visualiser, record core infrastructure cost logs for AWS projects (already implemented for Azure). This will involve adding and filtering by a new tag.
In manage_projects.rb we don't always validate whether an entered attribute is blank. We should add more of these validations, especially as a blank region breaks the AWS SDK. We can't be certain what unexpected values users might enter, so we should also investigate general handling of SDK-breaking errors within manage_projects.rb (they are already handled in the daily and weekly reports).
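A minimal presence check that could run before any entered value reaches the SDK (the method name is hypothetical):

```ruby
# Raise early on blank input instead of letting it break the SDK later.
def validate_present!(name, value)
  raise ArgumentError, "#{name} cannot be blank" if value.to_s.strip.empty?
  value
end
```

Usage would look like `region = validate_present!("region", gets.chomp)`, re-prompting on ArgumentError.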
It would be helpful to be given the option to set an end date when creating a project (but it should not be required).
After the implementation of #65, the pricing details of AWS instances are now stored in a text file. It would be worth exploring whether it would be more efficient to use the data in this file to estimate future costs in the weekly report, instead of the current practice of making multiple individual pricing queries (which also incur a small cost).
This issue is largely unique to Azure, as the AWS CLI & API both only require the Access ID & Secret to get in; the problem is that Azure does not allow user-based API access.
In some situations we would like to use this to monitor cloud costs of externally managed accounts. Sadly, in these scenarios there will only be a single authenticated user available, and they will be unable to create app registrations. This means that only an authenticated az CLI exists.
Needs some thought/evaluation.
Printing text will fill up cron logs, so we should set the 'slack' option to only post to Slack.
For the most part daily reports will be used for internal cost tracking; however, there could be occasions where this information is exposed to a customer, in which case the pretty names will be needed.
For use by the project-specific version of cloud-cost-visualiser, we must record the costs for each group of compute nodes as part of the daily report.
Currently this is only required for Azure. Most likely this will involve adding a tag to the relevant resources, used in a similar way to how overall compute costs are identified.
It would be helpful if some test queries/SDK objects were made upon creating a new project (in manage_projects.rb) to determine whether the given credentials are valid and, if not, which additional permissions are required.
Implement a proper command interface for interacting with the tool; this is the first step towards further integration with other Flight Tools and the packaging system. Some sort of entrypoint/short name could be used for accessing the tool, such as cloud-cost, ccr, or perhaps just costs.
Initial mock-up of CLI (open for discussion & interpretation):
Further to the above, the compulsory & optional arguments should be implemented in a more standardised format.
Sometimes Azure will create duplicates of all its cost items (possibly when an invoice is being prepared), resulting in doubled cost figures. To work around this, add some initial processing of the Azure API cost data received to remove any duplicates.
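One possible dedup step, sketched over parsed cost items. Keying on "properties" assumes the duplicates Azure produces are identical apart from top-level metadata such as their "id"; if two genuinely distinct charges could share identical properties, a narrower key would be needed:

```ruby
# Drop duplicate Azure cost items before any totals are calculated.
def dedup_cost_items(items)
  items.uniq { |item| item["properties"] }
end
```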
Inline rescues are used rather often in the codebase, and they're dangerous because they rescue StandardError and all of its descendants, meaning completely unrelated errors (like typos or runtime problems) will be caught and ignored. It would be better practice to replace the inline rescues with begin/rescue blocks that rescue exactly the errors we're expecting.
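A self-contained illustration of the difference (Integer() stands in for any call that can fail in both expected and unexpected ways):

```ruby
def risky_parse(text)
  Integer(text) # raises ArgumentError for "oops", TypeError for nil
end

# An inline rescue hides *every* StandardError, including the TypeError
# from an unexpected nil:
#   value = risky_parse(input) rescue 0
#
# A targeted rescue handles only the failure we expect; anything else
# still surfaces:
def safe_parse(text)
  risky_parse(text)
rescue ArgumentError
  0
end
```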
The cloud-cost-visualiser project requires access to details of what instances are available in each region, their prices and resources (memory, GPUs and CPUs).
Although Azure prices are currently stored, this application should also obtain and store details of Azure instance sizes, and of AWS prices and instance sizes. For read/write efficiency, only details for regions/locations used by existing projects should be stored.
Relates to https://github.com/openflighthpc/cloud-cost-visualiser/issues/9
Currently there is no special handling of AWS SDK errors (such as permission errors or failed queries); these currently throw exceptions that end the running of the current file. More robust handling of these should be added.
Currently the weekly report uses the latest instance logs for all forecasts. As there are some forecast days that are in the past (i.e. dates between the last cost log and today), for these dates we should instead use the costs of the instances recorded as running on that day. This should also be suitably explained in the report text.
While the current daily messages contain a lot of useful data, akin to "showing the working" for the various costs, a way of producing a digest of daily costs can come in useful for quick-glance understanding of costs.
I propose that this message would only contain:
:moneybag: Usage for DATE :moneybag:
Compute Costs (USD): X
Data Out Costs (USD): X
Total Costs (USD): X
Total Compute Units (Flat): X
Total Compute Units (Risk): X
FC Credits: X
Question: Would this be better as a CLI flag or a per-project setting?
At the moment we record a timestamp for each InstanceLog and CostLog, and do a lot of searches for logs on specific dates using LIKE queries on these timestamps. These queries are very inefficient - we should add a date field to these logs and use it to query logs on a specific day instead.
At the moment Azure projects allow for instances to be tagged with type => core or type => compute, whereas AWS projects use compute => true and core => true. AWS projects should be updated to also use the type => format.
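A transitional mapping could keep existing AWS tags working during the change (tag keys are those quoted above; the helper name is mine):

```ruby
# Read the new Azure-style "type" tag if present, otherwise fall back
# to the legacy AWS boolean tags.
def node_type(tags)
  return tags["type"] if tags["type"]
  return "compute" if [true, "true"].include?(tags["compute"])
  return "core" if [true, "true"].include?(tags["core"])
  nil
end
```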
Would be useful to have a script that gets the latest instance logs, without needing to also get new cost logs.
Record a new cost log for the project's storage costs.
S3: Storage - Standard
The parts that should refer to this project instead refer to Openflight Boilerplate.
The application could potentially include credentials for various users/apps in its database. If someone gained access to where the application was hosted, they could potentially access all of these details. It would therefore be prudent to protect/obscure/encrypt these credentials (and/or the database as a whole).
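One stdlib-only approach, sketched with AES-256-GCM via Ruby's OpenSSL bindings; the key would live outside the database (for example in an environment variable), so a leaked database alone would not expose the credentials:

```ruby
require 'openssl'

# Encrypt a credential string; returns everything needed to decrypt it
# later except the key itself.
def encrypt_credential(plaintext, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm').encrypt
  cipher.key = key
  iv = cipher.random_iv
  ciphertext = cipher.update(plaintext) + cipher.final
  { iv: iv, tag: cipher.auth_tag, data: ciphertext }
end

def decrypt_credential(blob, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm').decrypt
  cipher.key = key
  cipher.iv = blob[:iv]
  cipher.auth_tag = blob[:tag] # GCM fails decryption if data was tampered with
  cipher.update(blob[:data]) + cipher.final
end
```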
A project's budget may change over its lifetime. At the moment only the latest value is recorded, but it would be useful to record a history of these changes. This would allow for month/date specific budgets to be used by the related cloud costs visualiser application.
If generating a daily report fails for a project (e.g. as seen with Azure API returning repeated 504s), would be helpful to send a message to slack highlighting the error.
As the title says, in situations where multiple projects exist on multiple providers for one output channel, including the project name in the message would help clarify what the costs are for.
Steps to reproduce:
[jack@jack-alces ruby-cost-tracker]$ ruby manage_projects.rb
List, add, or update project(s) (list/add/update/validate)? add
Project name: test
Host (aws or azure): azure
Start date (YYYY-MM-DD): 2020-10-09
End date (YYYY-MM-DD). Press enter to leave blank:
Start date (YYYY-MM-DD):
As per title, also see CLI issue.
It's probably worth retiring/disabling projects so data is preserved for them but they are excluded from all, perhaps just setting the end_date to today to give the illusion of deletion/decommissioning.
Queries to Azure APIs are now attempted up to 3 times. If all 3 fail, currently only the final error message is printed. It would be more useful to print the error message for each attempt (only printing them if all attempts fail).
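A sketch of the proposed behaviour (the helper name is mine, and a real version would rescue the specific Azure/timeout errors rather than StandardError):

```ruby
# Retry the block up to `attempts` times, keeping every attempt's error
# and only reporting them if all attempts fail.
def with_retries(attempts: 3)
  errors = []
  attempts.times do
    begin
      return yield
    rescue StandardError => e
      errors << e
    end
  end
  raise "All #{attempts} attempts failed: #{errors.map(&:message).join('; ')}"
end
```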
At the moment, Azure projects use resource group based tracking, and require a set of resource groups to be specified. It would be useful to be able to specify just a subscription and evaluate costs for everything on that subscription.
For https://github.com/openflighthpc/cloud-cost-visualiser/issues/61 we must record the compute group for each Azure compute InstanceLog.
Compute group for a resource is identified using the tag "compute_group" => "groupname" (see #93).
At the moment we use the 'availability' of an Azure VM, as reported by the Azure AvailabilityStatuses API. However, this availability status (or the Azure API's reporting of it) may be unreliable. It may therefore be more appropriate to use the power status of the VM.
At the moment only the instanceView API seems to give this power state, but a single query must be made per VM, so it is potentially very slow if there are many instances to check. It should be investigated further whether there is a better API or more efficient way of checking and using power state.
Azure costs appear to be unreliable 36-48 hours after the end of a given day, being higher or lower than the figures Azure later reports.
To give them more time to stabilise, change the default retrieval date to 3 days in the past rather than 2.
In addition, further validation should be made of when Azure costs fluctuate/stabilise (moving this default back further if needed).
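The change itself is small; keeping the lag as a named constant would make it easy to push back further if costs turn out to need longer to settle (names here are illustrative):

```ruby
require 'date'

DEFAULT_COST_LAG_DAYS = 3 # proposed default; currently 2

def default_retrieval_date(today = Date.today)
  today - DEFAULT_COST_LAG_DAYS
end
```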
For use by the cloud-cost-visualiser, record cost logs for any compute groups in AWS projects (already implemented for Azure projects). This will involve adding and filtering by a new tag.
At the moment a project tag must match a project's name as it is saved in the database. Instead it should be possible to define & set a project tag that is different from the project name, to give more flexibility and ensure no overlap when creating clusters in an account.
As part of an ongoing development in cloud-cost-visualiser, it would be useful to have a copy of the instance/pricing data available to use from the get-go, instead of needing to add at least one of each project and retrieve the data manually. It may be worth adding to the README that the data could be out of date, and may not reflect exact values until it is updated.
Determine what kinds of costs are included in 'other' costs and how best to separate out data storage costs.
Occasionally queries to Azure APIs fail due to timeout issues - either a 504 error from Azure, or a Net::ReadTimeout error. See: https://alces.slack.com/archives/C01A4K27B37/p1606743409002000
Increasing the HTTParty timeout may resolve this. It may also be worth adding automated retries for Azure timeout errors, up to 2 times.
It is worth updating the README to better address some of the caveats of tagging after resource creation. The README currently says:
"These tags should be added at the point a resource is created. If adding tags to instances in the AWS online console, these tags will also be applied to their associated storage."
The above somewhat reads as if tags added after a resource is created would propagate to the associated storage volumes (which is sadly not the case).
My suggestion is to either expand this to address both scenarios (e.g. a section for "at creation time" and one for "after creation" tagging) or to reduce & generalise the docs to say something along the lines of: "For tracking by project tag, ensure that all desired resources are given a tag with the key project and a value of what you have named the project. It is recommended to check that all expected resources (IPs, volumes, etc.) have the expected tag before configuring project tracking."
The installation instructions could be made clearer by putting all the commands needed to reach a working state in a single code block. From scanning the README (as I usually do with projects) I figured all I needed to do was clone the repository, but really the installation process is clone -> bundle install -> db setup -> mappings setup. This could be made clearer.
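For example, the README could collapse the steps into one block like the following (the clone URL matches this repository; the db and mappings setup steps are left as placeholders since their exact commands aren't given here):

```
git clone https://github.com/openflighthpc/cloud-cost-reporter.git
cd cloud-cost-reporter
bundle install
# ...then run the db setup and mappings setup steps from the README
```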
If ruby daily_reports.rb and ruby record_instance_logs.rb are run at the same time, the error SQLite3::BusyException: database is locked may occur, breaking the relevant process. This should be investigated further and the error prevented and/or gracefully handled.
If the cloud-cost-reporter is added mid-project, there will be no cost data in the database for dates before this point. Although not an issue for this application, it may leave the cloud-cost-visualiser with significant gaps, which can currently only be filled by generating daily reports for each day individually.
It would therefore be useful to have a script for obtaining and recording cost data for a given range of dates (just recording, no generation of reports), for all projects or just a named one.