ks888 / lambstatus
[Maintenance mode] Serverless Status Page System
Home Page: https://lambstatus.github.io
License: Apache License 2.0
So far, LambStatus can import metrics only from CloudWatch Metrics. New Relic is a widely used monitoring SaaS, so let's support it as well.
As a Status Admin I want the ability to edit incident timelines so that I can make updates to the timeline notes (Like inaccurate spelling or updates on a url) without having to delete the entire incident.
From @stephencornelius' comment on gitter:
stephencornelius @stephencornelius Jul 14 19:36
Hi, ive just setup lambstatus and really liking it so far, very easy to get working. However I’ve got a question about metrics, im adding cloudwatch metrics and was wondering if theres anyway to specify the statistic? e.g. for ELB 2xx status code the only statistic that makes sense is sum, all others just report 1, im guessing average is chosen by default?
Kishin Yagami @ks888 22:53
@stephencornelius Thank you for using LambStatus. You're right. On collecting the datapoints from CloudWatch, 'Average' statistic is used. Unfortunately there is no option to choose the other statistics. I guess it is better to add an option to choose them.
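A minimal sketch of what such an option could look like, assuming the datapoints are fetched through CloudWatch's GetMetricStatistics API. The helper name `build_metric_request` and its defaults are hypothetical and not part of LambStatus; the function only builds request parameters and does not call AWS.

```python
from datetime import datetime, timedelta, timezone

# Statistics accepted by CloudWatch's GetMetricStatistics API.
VALID_STATISTICS = {"SampleCount", "Average", "Sum", "Minimum", "Maximum"}


def build_metric_request(namespace, metric_name, dimensions,
                         statistic="Average", period=300, hours=24, now=None):
    """Build GetMetricStatistics parameters with a selectable statistic.

    Hypothetical helper: 'Average' stays the default, as described above.
    """
    if statistic not in VALID_STATISTICS:
        raise ValueError(f"unsupported statistic: {statistic!r}")
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": dimensions,
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,  # seconds per datapoint
        "Statistics": [statistic],
    }


# Example: an ELB 2xx count, where only Sum is meaningful.
request = build_metric_request(
    "AWS/ELB", "HTTPCode_Backend_2XX",
    [{"Name": "LoadBalancerName", "Value": "my-elb"}],  # placeholder name
    statistic="Sum",
)
```

With boto3, the resulting dictionary could be passed on as `cloudwatch.get_metric_statistics(**request)`.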
The UserEmail and UserName parameters of the CloudFormation template accept any value, which causes errors like #6 later.
Add the AllowedPattern property to these parameters so that wrong values are noticed as soon as possible.
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/parameters-section-structure.html
Would be good to also specify different accounts to fetch metrics from too.
The concatenation used to form the S3 website endpoints used by CloudFront is broken for regions that use the .region instead of the -region format (Ohio, Canada, Mumbai, Seoul, Frankfurt and London)
http://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteEndpoints.html
The two general forms of an Amazon S3 website endpoint are as follows:
bucket-name.s3-website-region.amazonaws.com (dash region)
bucket-name.s3-website.region.amazonaws.com (dot region)
Listing of all region endpoints
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_website_region_endpoints
Including a mapping lookup as a possible fix?
https://blog.doismellburning.co.uk/pointing-aws-cloudfront-at-an-s3-website-with-cloudformation/
Admin and Status page CloudFront distribution origins
https://github.com/ks888/LambStatus/blob/v0.3.0/cloudformation/lamb-status.yml#L2524
https://github.com/ks888/LambStatus/blob/v0.3.0/cloudformation/lamb-status.yml#L2573
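The mapping-lookup idea can be sketched as follows. This is only an illustration of the logic, not the template fix itself: the function name is hypothetical, and the region codes are my own mapping of the region names listed in this issue.

```python
# Regions this issue lists as using the dot form of the website endpoint.
# The region codes are an assumption derived from the region names above.
DOT_FORM_REGIONS = {
    "us-east-2",       # Ohio
    "ca-central-1",    # Canada
    "ap-south-1",      # Mumbai
    "ap-northeast-2",  # Seoul
    "eu-central-1",    # Frankfurt
    "eu-west-2",       # London
}


def s3_website_endpoint(bucket, region):
    """Return the S3 website endpoint, choosing the dash or dot form per region."""
    separator = "." if region in DOT_FORM_REGIONS else "-"
    return f"{bucket}.s3-website{separator}{region}.amazonaws.com"


print(s3_website_endpoint("my-bucket", "us-east-1"))
print(s3_website_endpoint("my-bucket", "eu-central-1"))
```

In the CloudFormation template itself, the same lookup would more likely be a Mappings section keyed by region and read with Fn::FindInMap.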
Hello,
When adding a Service Name on the settings page, it combines whatever you enter there with the green Status text. I.E. My AWS MonitoringStatus . There needs to be a space between the service name entered and the word Status.
The other thing, is when adding scheduled Maintenance. The word currently shown on the Status page is Maintenances. It should say Maintenance.
Nothing huge, but it helps with the presentation :)
From @maximede's comment on gitter:
Maxime Deravet @maximede 00:16
Great! Thanks! Also, is there any plan to link cloudwatch alarms to component status?
Kishin Yagami @ks888 09:53
Such an integration sounds interesting. I don't have a specific plan to support it, though.
Anyway, it would be good to create the issue at first.
From antonivs's comment on reddit:
One other issue that I haven't investigated yet is that the site I'm testing with
happened to be down for a number hours on Friday, but for some reason I
don't see any gap on the LambStatus metrics chart, although there's a clear
gap in the charts on CloudWatch. If I find out more, I'll let you know the details.
The LambStatus metrics chart connects the points with straight lines and doesn't check whether any data is missing in between. CloudWatch Metrics, it seems, connects two points only when no data is missing between them. That is probably why the charts look different. This issue fixes the LambStatus chart.
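One possible way to make the gap visible is sketched below. It assumes datapoints are `(timestamp_in_seconds, value)` pairs sorted by time and that the chart breaks the line at a `None` value, as many charting libraries do; the function name `insert_gaps` is hypothetical.

```python
def insert_gaps(points, period):
    """Insert a (timestamp, None) marker wherever two consecutive
    datapoints are more than `period` seconds apart."""
    result = []
    previous = None
    for timestamp, value in points:
        if previous is not None and timestamp - previous > period:
            # One marker is enough for a chart to break the line here.
            result.append((previous + period, None))
        result.append((timestamp, value))
        previous = timestamp
    return result


points = [(0, 1.0), (300, 1.2), (1200, 0.9)]  # datapoints at 600 and 900 are missing
print(insert_gaps(points, period=300))
```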
So that you and I don't have to wait for a release to take advantage of the latest commit, and since the template must be hosted in S3, it's probably worth having another 'Create Stack' button that lets people use a template based on the latest commit (like a 'beta' button or something). That way people are less likely to feel the need to fork or do something manual.
Introduce LambCI, a nice alternative to SaaS CI.
Also, it may reveal some advantages (and disadvantages) of this kind of tool.
Sometimes scheduled maintenance is inevitable. In that case, it is important to announce the maintenance schedule in advance. This issue implements the features to support this.
Suggestion from @kaustubhmenon #14 (comment)
Suggestion:
Integrate a new CSS/SCSS file for manipulating the UI, without touching any core SCSS files. Could also be similar to Edit CSS in Wordpress (https://en.support.wordpress.com/custom-design/editing-css/). While inspecting the structure I noticed that all class names are appended with a random string in the end which could be a possible blocker for targeting CSS classes?
My reply:
StatusPage.io also has a similar feature https://help.statuspage.io/knowledge_base/topics/using-custom-css . Though the css class has a random prefix since CSS modules are used, maybe we can let the principal UI components have the id attribute and customize the design like this:
#container {
  width: 90%;
  max-width: 850px;
}
To show the recent performance of the web service, it is good to have graphs of the response time, uptime, and so on. Although LambStatus will not be able to monitor the service itself, it can integrate with other monitoring SaaS.
I'm going to start with CloudWatch Metrics since we don't need another account to try it.
TODOs:
Created the show-cloudwatch-metrics branch for this purpose.
I clicked the Launch Stack button, but the stack creation failed.
Here are some errors:
...
21:49:36 UTC+1100 DELETE_IN_PROGRESS AWS::IAM::Role CognitoSMSCallerRole
21:49:36 UTC+1100 DELETE_IN_PROGRESS AWS::S3::BucketPolicy StatusPageS3BucketPolicy
21:49:29 UTC+1100 ROLLBACK_IN_PROGRESS AWS::CloudFormation::Stack StatusPage The following resource(s) failed to create: [AdminPageFrontend, StatusPageDistribution, AdminPageDistribution, LambdaRoleInstanceProfile, StatusPageFrontend]. . Rollback requested by user.
21:49:27 UTC+1100 CREATE_FAILED AWS::CloudFront::Distribution StatusPageDistribution Resource creation cancelled
21:49:27 UTC+1100 CREATE_FAILED AWS::IAM::InstanceProfile LambdaRoleInstanceProfile Resource creation cancelled
21:49:27 UTC+1100 CREATE_FAILED AWS::CloudFront::Distribution AdminPageDistribution Resource creation cancelled
21:49:27 UTC+1100 CREATE_FAILED Custom::S3SyncObjects AdminPageFrontend Data returned must be an object
21:49:26 UTC+1100 CREATE_COMPLETE AWS::DynamoDB::Table ServiceComponentTable
21:49:26 UTC+1100 CREATE_FAILED Custom::S3SyncObjects StatusPageFrontend Data returned must be an object
21:49:26 UTC+1100 CREATE_COMPLETE AWS::DynamoDB::Table IncidentTable
21:49:26 UTC+1100 CREATE_COMPLETE AWS::DynamoDB::Table IncidentUpdateTable
21:49:23 UTC+1100 CREATE_COMPLETE AWS::Lambda::Permission GetComponentsLambdaInvokePermission
...
https://github.com/streaka/lambda-ping
The problem is there is no 'uptime' metric naturally occurring on cloudwatch. It would be nice to package with this an optional lambda function that pings on a fixed schedule, making that available in cloudwatch for metric selection (and it can be a 'recommended metric' or something in the UI).
I'm using the above script to add 'uptime' metrics and whatever else might be handy... i'll see if I can integrate this in.
Could you make me a contributor and I can publish on another branch instead of a fork?
Suggestion from @kaustubhmenon #14 (comment)
Suggestion:
Ability to add external links to the footer or header. Would love to add links for documentation, support, etc.
Manipulating the structure of the Header, Main Container and Footer would be an added bonus if possible. Something similar to https://www.statuspage.io/features/customization.
My reply:
Maybe we can support these customizations by having the feature to customize the header html and footer html.
At the moment, to my knowledge, the only way to update is through a whole new AWS CF Stack. I was wondering if Lambda could be used somehow to auto update when a new update is pushed on github.
From @maximede's comment on gitter:
Maxime Deravet @maximede 08:29
Hey there !
I'm a bit confused by what the github repo states :
Choose the AWS region different from your service's region. If both your service and its status page rely on the same region, the region outage may stop both.
I'm not sure how I can achieve that as, when I try to add a metric, I don't have access to the metrics coming from a different region. Am I missing something?
Kishin Yagami @ks888 09:52
Oops, that’s a bug. Only the metrics from the LambStatus' region are fetched. The metrics from other regions should be fetched, too. I will fix this on the weekend and release the new version.
I have a lot of suggestions, so I thought it would be a good idea to combine them all here.
From antonivs's comment on reddit:
Edit: Forgot to mention, I installed it with no trouble except that I was dutifully
waiting 20+ minutes for the password email, when I realized that it had gone
into my junk folder like 15 minutes before. I guess that's because the from
address is invalid.
Add a settings page for things like custom CSS, colours and custom domains.
Hi!
I'm really interested in LambStatus, particularly because all of our applications are using the serverless framework.
Nonetheless, I feel like LambStatus could use an "Event" model. e.g. A client would ping the Event API at the end of a successful deployment.
Currently we are doing this with Cachet as an "incident". Though to my eyes, a deployment is not an incident.
The bigger question is are status pages meant to be a repository of events? Taking a quick look around at the public status pages, no one seems to be doing this.
CloudWatch data comes back as bytes. Being able to convert 4352360851042 into 3.958 TB would be useful and more human-friendly.
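A small sketch of the requested conversion, assuming binary units (1 KB = 1024 bytes), which is what the example figures in this issue imply. The function name and the three-decimal default are my own choices.

```python
def humanize_bytes(num_bytes, precision=3):
    """Convert a raw byte count into a human-friendly string (1024-based units)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if abs(value) < 1024 or unit == units[-1]:
            return f"{value:.{precision}f} {unit}"
        value /= 1024


print(humanize_bytes(4352360851042))  # 3.958 TB
```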
Status reason:
The following resource(s) failed to delete: [AdminPageS3, StatusPageS3].
Do we need doc on stack delete? Or can this be covered in stack delete? I'll try to just delete the buckets and run stack delete again..
Settings page is breaking showing:
And status page is stuck on fetching data. It looks like the get-settings and get-public-settings functions are both not working properly
Settings table in DynamoDB just contains this (not sure if good or bad):
Logs say this for /aws/lambda/myapp-GetSettings in CloudWatch
Logs say this for /aws/lambda/myapp-GetPublicSettings in CloudWatch
Because the code is minified and web packed etc, no idea how to begin to fix this myself. You may need to provide some info on how to load up the lambda projects and debug them etc as I don't know how to set up a similar development workflow
From antonivs's comment on reddit:
Hourly is fine. However, the issue is that when weekly or monthly is selected,
the range of the Y axis is not adjusted to match the range of the data, which
I think is the reason that the lines then appear "squashed" almost flat. I saw
this with my own site data, as well as with your demo.
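A sketch of how the Y axis range could follow the data for whichever time frame is selected. The function name and the 10% padding are assumptions for illustration; the values are whatever datapoints are currently displayed, with `None` marking missing data.

```python
def y_axis_range(values, padding=0.1):
    """Return (low, high) axis bounds that fit the displayed data with some padding."""
    present = [v for v in values if v is not None]
    if not present:
        return (0.0, 1.0)  # arbitrary default when there is nothing to plot
    low, high = min(present), max(present)
    if low == high:
        # Flat line: pad relative to the value, or by 1.0 if the value is zero.
        margin = abs(low) * padding or 1.0
    else:
        margin = (high - low) * padding
    return (low - margin, high + margin)


print(y_axis_range([10, 20]))  # (9.0, 21.0)
```

Recomputing the range each time the time frame changes keeps weekly and monthly views from being drawn against a range left over from another view.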
Feature suggestion from @GustavoEsser:
Show the uptime percentage of components. Maybe the view like the screenshot below looks good.
From J4cku's comment on reddit:
[–]J4cku
Hi, one question, does it have API so that my internal monitoring
can push info when one of my components go down and remove
it when it's up again?
[–]kyagami[S]
Thank you for asking! So far, there is no API to post an incident
and change components' status. However, it seems some users
need such an API and I'm going to implement it within one month.
I think the API will (partially?) solve the issue #41 and so it will help some users.
When incidents are created/updated, send the notification to users.
So far, clients access the API Gateway directly. It works, but the API Gateway and the corresponding Lambda functions often repeat the same computations, which incurs unnecessary cost. If an API is public and its responses can safely be shared among users, the responses should be cached.
API Gateway has its own cache mechanism, but it's costly. So I think putting CloudFront in front of the API Gateway and using CloudFront's cache is the most cost-effective option.
So far the GET external-metrics API returns all the metrics at once. If someone has a large number of metrics, the request may time out.
This issue adds pagination support to the external-metrics API.
(And change the frontend's action so that all the metrics are still fetched in the end.)
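A minimal sketch of cursor-based pagination for such an API, with the server side and the client loop side by side. The field names `metrics` and `nextCursor`, the page size, and both function names are hypothetical; the issue does not specify the actual API shape.

```python
def paginate(items, cursor=None, limit=50):
    """Server side: return one page plus the cursor of the next page (None when done)."""
    start = int(cursor) if cursor is not None else 0
    next_start = start + limit
    next_cursor = str(next_start) if next_start < len(items) else None
    return {"metrics": items[start:next_start], "nextCursor": next_cursor}


def fetch_all(get_page):
    """Client side: keep requesting pages until no cursor is returned."""
    metrics, cursor = [], None
    while True:
        response = get_page(cursor)
        metrics.extend(response["metrics"])
        cursor = response["nextCursor"]
        if cursor is None:
            return metrics


all_metrics = [f"metric-{i}" for i in range(120)]
fetched = fetch_all(lambda cursor: paginate(all_metrics, cursor, limit=50))
print(len(fetched))  # 120, fetched in three pages
```

Each request stays small, so a single call is less likely to time out, while the frontend still ends up with the full list.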
Folks will try to clone this repo and they will change the .env file. The change can't be committed, and it will make it hard for them to keep their repo in sync with this one.
What if we gitignore .env and add a .env-example for reference?
I noticed that I had put the wrong domain for my email address when creating the email address :(
That's my bad, but I was wondering if it might be possible for a stack upgrade with the correct email address to send the email out to me, instead of re-creating the stack?
It's minor, but a consideration.
From @blw9u2012's comment on gitter:
Brandon Walton @blw9u2012 06:02
@ks888 i'm having issues with displaying metrics from our target groups. It displays the correct graph but the Y axis is only showing a range between 0 and 1. Am i goofing somewhere?
i want to display the Y axis in milliseconds
i've added a metric from the AWS/ApplicationELB namespace and added the TargetResponseTime metric name and dimension for one of our load balancers
Kishin Yagami @ks888 22:35
@blw9u2012 Thank you for asking. The data points are averaged over 5 minutes when the 'Day' time frame is selected. Perhaps the maximum value of your data points is higher than 1.0, but the maximum value of the AVERAGED values is lower than 1.0.
Or, the rounded value in the tooltip may be the problem. Even if the actual value is 0.01, the value in the tooltip is rounded and will be 0. Such a behavior seems inconvenient in your case, so it should be fixed.
Brandon Walton @blw9u2012 23:09
@ks888 thanks for responding! I also looked into the metrics and the unit that the metric TargetResponseTime returns is in seconds vs the latency metric in the demo site being milliseconds. I don't think that you can specify another unit to be returned other than seconds. I thought that this possibly could be related but it sounds like the latter issue you described. In any event thanks for responding!
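Both effects mentioned in this thread can be illustrated with a short sketch. The sample values are made up, and `format_tooltip` is a hypothetical helper showing one way to keep small values visible by using significant digits instead of rounding to an integer.

```python
import math
from statistics import mean

# Effect 1: averaging hides peaks. One sample exceeds 1.0, the 5-minute average does not.
samples = [0.2, 0.3, 2.5, 0.2, 0.3]  # made-up response times in seconds
print(max(samples), mean(samples))

# Effect 2: rounding to an integer turns small values into 0.
print(round(0.01))


def format_tooltip(value, significant_digits=3):
    """Format a value with a fixed number of significant digits (hypothetical helper)."""
    if value == 0:
        return "0"
    magnitude = math.floor(math.log10(abs(value)))
    decimals = max(significant_digits - 1 - magnitude, 0)
    return f"{value:.{decimals}f}"


print(format_tooltip(0.0123))   # 0.0123
print(format_tooltip(123.456))  # 123
```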
The suggestion from VTHokie2015 at reddit:
[–]VTHokie2015 2 points 10 hours ago
I think you could make the background a light grey and make the cards have white background
to add more distinction
[–]kyagami 1 point 2 minutes ago
Wow, I tried your idea and it surely improved the page design! I will change the background color
or maybe enable a user to change that color in the settings page! Thank you!
When the color is #FAFAFA, the status page looks like this:
My concern is the conflict with the background of the logo image, though LambStatus does not support the feature to change the logo image of the status page header for now. So maybe it's better to enable a user to specify the background color in the settings page.
So far the admin page is not protected by user authentication. Anyone who knows the URL of an admin page can change the service status. To stop this, support user authentication (maybe using Amazon Cognito User Pools).
At least these functions are necessary:
From @ccannell's comment on gitter:
Christopher Cannell @ccannell 05:33
I've deployed LambStatus and now I'm a bit lost on the next step. I'd like to check an HTTP server status regularly. What is the best way to approach that? Is there documentation on the LambStatus API? Should I run a periodic lambda to poll my HTTP server and then post to LambStatus API on changes?
Kishin Yagami @ks888 10:42
Thank you for asking. Since LambStatus has few features for monitoring the service, such as alarm, it's better to post your server's status to your monitoring service at first. Then, integrate the monitoring service with LambStatus. If your monitoring service is CloudWatch, the integration is easy. If not, it is harder because there is no LambStatus API to post the data.
LambStatus API to post the data sounds nice. Maybe it's better to support it.
As a Status Admin I want the ability to inject notes at a specific time into the timeline so that when I get additional information about an event that occurred in the past I can add when that event occurred with an accurate timestamp.
Use case: I am tracking an Incident on a web page outage. I find out that DNS was changed at 4:10 PM and I was not notified until 4:30 PM. I want to add an event at the specific time it occurred (4:10 PM) rather than at the time I had a chance to post it (4:30 PM).
Add two factor authentication for logging in
When the incident is created/updated, tweet it.
Add an integration to AWS SNS in order to enable notifications
If I get some time I might throw together a PR or 2 with these implemented