ks888 / lambstatus

[Maintenance mode] Serverless Status Page System

Home Page: https://lambstatus.github.io

License: Apache License 2.0

JavaScript 96.49% HTML 0.12% Shell 0.27% CSS 3.12%
serverless aws-lambda statuspage react redux nodejs javascript cloudformation lambda

lambstatus's People

Contributors

ajohnstone, beck, kbariotis, ks888, nodomain, salekseev, wmnnd

lambstatus's Issues

Integrate with New Relic Metrics

So far, LambStatus can import metrics only from CloudWatch Metrics. New Relic is a widely used monitoring SaaS, so let's support it too.

Ability to edit Incident timeline

As a Status Admin, I want the ability to edit incident timelines so that I can update the timeline notes (for example, to fix a misspelling or update a URL) without having to delete the entire incident.

[Metrics] Add an option to choose the statistics

From @stephencornelius' comment on gitter:

stephencornelius @stephencornelius Jul 14 19:36
Hi, I've just set up LambStatus and am really liking it so far; it was very easy to get working. However, I've got a question about metrics: I'm adding CloudWatch metrics and was wondering if there's any way to specify the statistic. E.g., for the ELB 2xx status code the only statistic that makes sense is sum; all others just report 1. I'm guessing average is chosen by default?

Kishin Yagami @ks888 22:53
@stephencornelius Thank you for using LambStatus. You're right. On collecting the datapoints from CloudWatch, 'Average' statistic is used. Unfortunately there is no option to choose the other statistics. I guess it is better to add an option to choose them.
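
A minimal sketch of what such an option could look like, assuming each registered metric carried a hypothetical `statistic` field. The function name and the shape of the `metric` object are illustrative, not LambStatus code; the returned object follows the parameter names of CloudWatch's GetMetricStatistics API.

```javascript
// Statistics accepted by CloudWatch's GetMetricStatistics API.
const VALID_STATISTICS = ['SampleCount', 'Average', 'Sum', 'Minimum', 'Maximum'];

// Hypothetical helper: builds GetMetricStatistics parameters for one metric.
// `metric.statistic` is an assumed new field; it falls back to 'Average',
// which is the behavior described in the reply above.
function buildGetMetricStatisticsParams(metric, startTime, endTime, periodSeconds) {
  const statistic = metric.statistic || 'Average';
  if (!VALID_STATISTICS.includes(statistic)) {
    throw new Error(`Unsupported statistic: ${statistic}`);
  }
  return {
    Namespace: metric.namespace,
    MetricName: metric.metricName,
    Dimensions: metric.dimensions || [],
    StartTime: startTime,
    EndTime: endTime,
    Period: periodSeconds,
    Statistics: [statistic]
  };
}
```

For the ELB 2xx case from the question, the metric would be registered with `statistic: 'Sum'`, while metrics without the field would keep using 'Average'.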

Change the color of the header according to the service status

So far, the color of the header is always green:
[Screenshot: status page header, 2017-10-14]

A user will notice an incident immediately if this color changes according to the service status, such as red for a major outage. The GitHub Status page actually changes the title logo when an incident happens.

Healthy
Major outage
Minor outage

S3 website endpoints

The concatenation used to form the S3 website endpoints used by CloudFront is broken for regions that use the .region format instead of the -region format (Ohio, Canada, Mumbai, Seoul, Frankfurt, and London).

http://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteEndpoints.html
The two general forms of an Amazon S3 website endpoint are as follows:
bucket-name.s3-website-region.amazonaws.com (dash region)
bucket-name.s3-website.region.amazonaws.com (dot region)

Listing of all region endpoints
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_website_region_endpoints

Including a mapping lookup as a possible fix?
https://blog.doismellburning.co.uk/pointing-aws-cloudfront-at-an-s3-website-with-cloudformation/
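
The lookup idea can be sketched as follows. This is only an illustration in JavaScript: the function name is hypothetical, and the set of dot-form regions is taken from the six regions named above, so any regions added later would need to be checked against the AWS endpoint listing.

```javascript
// Regions whose S3 website endpoint uses the dot form, per the list above
// (Ohio, Canada, Mumbai, Seoul, Frankfurt, London).
const DOT_FORM_REGIONS = new Set([
  'us-east-2',
  'ca-central-1',
  'ap-south-1',
  'ap-northeast-2',
  'eu-central-1',
  'eu-west-2'
]);

// Hypothetical helper: returns the website endpoint for a bucket in a region,
// choosing the separator by lookup instead of always concatenating with '-'.
function s3WebsiteEndpoint(bucketName, region) {
  const separator = DOT_FORM_REGIONS.has(region) ? '.' : '-';
  return `${bucketName}.s3-website${separator}${region}.amazonaws.com`;
}
```

In the CloudFormation template itself, the equivalent would presumably be a Mappings section keyed by region and read with Fn::FindInMap, as the linked blog post suggests.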

Admin and Status page CloudFront distribution origins
https://github.com/ks888/LambStatus/blob/v0.3.0/cloudformation/lamb-status.yml#L2524
https://github.com/ks888/LambStatus/blob/v0.3.0/cloudformation/lamb-status.yml#L2573

Couple of suggestions

Hello,

When adding a Service Name on the settings page, it is joined directly to the green Status text. For example, "My AWS Monitoring" is shown as "My AWS MonitoringStatus". There needs to be a space between the service name entered and the word Status.

The other thing is the scheduled maintenance section. The word currently shown on the Status page is Maintenances. It should say Maintenance.

Nothing huge, but it helps with the presentation :)

Link cloudwatch alarms to component status

From @maximede's comment on gitter:

Maxime Deravet @maximede 00:16
Great! Thanks! Also, is there any plan to link cloudwatch alarms to component status?

Kishin Yagami @ks888 09:53
Such an integration sounds interesting. I don't have a specific plan to support it, though.
Anyway, it would be good to create the issue at first.

[Metrics] Do not connect the points if there is the missing data in between

From antonivs's comment on reddit:

One other issue that I haven't investigated yet is that the site I'm testing with
happened to be down for a number hours on Friday, but for some reason I
don't see any gap on the LambStatus metrics chart, although there's a clear
gap in the charts on CloudWatch. If I find out more, I'll let you know the details.

The LambStatus metrics chart connects the points with straight lines and doesn't check whether any data is missing. CloudWatch Metrics seems to connect points only when there is no missing data in between, which is probably why the charts look different. This issue fixes that.
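
One possible approach, sketched under assumptions: before the datapoints reach the chart, insert a null placeholder wherever two consecutive points are further apart than the expected collection interval. The function name and the `{ timestamp, value }` datapoint shape are hypothetical, and whether a null value actually breaks the line depends on the chart library in use.

```javascript
// Inserts a null placeholder wherever two consecutive datapoints are further
// apart than `maxGapMs`, so that a chart can break the line at that position.
// Datapoints are assumed to be { timestamp: <ms since epoch>, value: <number> }
// and already sorted by timestamp.
function markGaps(datapoints, maxGapMs) {
  const result = [];
  datapoints.forEach((point, i) => {
    if (i > 0 && point.timestamp - datapoints[i - 1].timestamp > maxGapMs) {
      // Place the placeholder one interval after the last known point.
      result.push({ timestamp: datapoints[i - 1].timestamp + maxGapMs, value: null });
    }
    result.push(point);
  });
  return result;
}
```

With one-minute datapoints, `markGaps(points, 60000)` leaves continuous stretches untouched and adds a single null entry at the start of each outage window.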

Have separate URL for latest commit on master

So that you and I don't have to wait for a release to take advantage of the latest commit, it's probably worth having another 'Create Stack' button that uses a template built from the latest commit on master (like a 'beta' button or something). Since the template must be hosted in S3, this needs a separate template URL. That way people are less likely to feel the need to fork or do something manual.

Support scheduled maintenance

Scheduled maintenance is sometimes inevitable. In that case, it is important to announce the maintenance schedule in advance. This issue implements the features to support this.

Support custom CSS

Suggestion from @kaustubhmenon #14 (comment)

Suggestion:
Integrate a new CSS/SCSS file for manipulating the UI, without touching any core SCSS files. This could also be similar to Edit CSS in WordPress (https://en.support.wordpress.com/custom-design/editing-css/). While inspecting the structure I noticed that all class names have a random string appended at the end, which could be a possible blocker for targeting CSS classes?

My reply:
StatusPage.io also has a similar feature https://help.statuspage.io/knowledge_base/topics/using-custom-css . Though the css class has a random prefix since CSS modules are used, maybe we can let the principal UI components have the id attribute and customize the design like this:

#container {
    width: 90%;
    max-width: 850px;
}

Integrate with CloudWatch Metrics

To show the recent performance of the web service, it is good to have graphs of the response time, uptime, and so on. Although LambStatus will not have the ability to monitor the service itself, it can integrate with monitoring SaaS products.

I'm going to start with CloudWatch Metrics since we don't need another account to try it.

TODOs:

  • List available metrics
  • Register the metrics whose graph will be shown on the StatusPage
  • Periodically collect the data of registered metrics
  • Show the collected data on the StatusPage

Created the show-cloudwatch-metrics branch for this purpose.

Launch Stack failed

I clicked the Launch Stack button, but the stack creation failed.
Here are some errors:

...
21:49:36 UTC+1100	DELETE_IN_PROGRESS	AWS::IAM::Role	CognitoSMSCallerRole	
21:49:36 UTC+1100	DELETE_IN_PROGRESS	AWS::S3::BucketPolicy	StatusPageS3BucketPolicy	
21:49:29 UTC+1100	ROLLBACK_IN_PROGRESS	AWS::CloudFormation::Stack	StatusPage	The following resource(s) failed to create: [AdminPageFrontend, StatusPageDistribution, AdminPageDistribution, LambdaRoleInstanceProfile, StatusPageFrontend]. . Rollback requested by user.
21:49:27 UTC+1100	CREATE_FAILED	AWS::CloudFront::Distribution	StatusPageDistribution	Resource creation cancelled
21:49:27 UTC+1100	CREATE_FAILED	AWS::IAM::InstanceProfile	LambdaRoleInstanceProfile	Resource creation cancelled
21:49:27 UTC+1100	CREATE_FAILED	AWS::CloudFront::Distribution	AdminPageDistribution	Resource creation cancelled
21:49:27 UTC+1100	CREATE_FAILED	Custom::S3SyncObjects	AdminPageFrontend	Data returned must be an object
21:49:26 UTC+1100	CREATE_COMPLETE	AWS::DynamoDB::Table	ServiceComponentTable	
21:49:26 UTC+1100	CREATE_FAILED	Custom::S3SyncObjects	StatusPageFrontend	Data returned must be an object
21:49:26 UTC+1100	CREATE_COMPLETE	AWS::DynamoDB::Table	IncidentTable	
21:49:26 UTC+1100	CREATE_COMPLETE	AWS::DynamoDB::Table	IncidentUpdateTable	
21:49:23 UTC+1100	CREATE_COMPLETE	AWS::Lambda::Permission	GetComponentsLambdaInvokePermission	
...

Integrate uptime testing

https://github.com/streaka/lambda-ping

The problem is there is no 'uptime' metric naturally occurring on cloudwatch. It would be nice to package with this an optional lambda function that pings on a fixed schedule, making that available in cloudwatch for metric selection (and it can be a 'recommended metric' or something in the UI).

I'm using the above script to add 'uptime' metrics and whatever else might be handy... i'll see if I can integrate this in.
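
As a rough sketch of the idea (not taken from the lambda-ping script), a scheduled function could turn each ping result into a custom CloudWatch datapoint. The helper names, the `Custom/Uptime` namespace, and the `Available` metric name are assumptions for illustration; the returned object follows the parameter names of CloudWatch's PutMetricData API.

```javascript
// Treats any 2xx or 3xx HTTP status code as "up".
function isUpStatus(statusCode) {
  return statusCode >= 200 && statusCode < 400;
}

// Hypothetical helper: converts one ping result into PutMetricData parameters.
// The namespace and metric name are illustrative, not part of LambStatus.
function buildUptimeMetricData(url, isUp, timestamp) {
  return {
    Namespace: 'Custom/Uptime',
    MetricData: [
      {
        MetricName: 'Available',
        Dimensions: [{ Name: 'Url', Value: url }],
        Timestamp: timestamp,
        Value: isUp ? 1 : 0,
        Unit: 'Count'
      }
    ]
  };
}
```

Averaging this 0/1 metric over a period gives the fraction of successful pings, which could then be selected like any other CloudWatch metric.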

Could you make me a contributor and I can publish on another branch instead of a fork?

Auto update?

At the moment, to my knowledge, the only way to update is through a whole new AWS CF Stack. I was wondering if Lambda could be used somehow to auto update when a new update is pushed on github.

Fetch CloudWatch metrics from other regions

From @maximede's comment on gitter:

Maxime Deravet @maximede 08:29
Hey there !
I'm a bit confused by what the github repo states :
Choose the AWS region different from your service's region. If both your service and its status page rely on the same region, the region outage may stop both.
I'm not sure how I can achieve that as, when I try to add a metric, I don't have access to the metrics coming from a different region. Am I missing something?

Kishin Yagami @ks888 09:52
Oops, that’s a bug. Only the metrics from the LambStatus' region are fetched. The metrics from other regions should be fetched, too. I will fix this on the weekend and release the new version.

Feature suggestions

I have a lot of suggestions, so I thought it would be a good idea to combine them all here.

  • Component Grouping #10

  • Two factor authentication #13

  • AWS SNS integration #11

  • Custom colours/ styling changeable via a settings page #12

    • This could be done by having the S3 page trigger a Lambda event in order to modify the CSS in S3
  • SSO

    • E.g. With google/ facebook

Settings page

Add a settings page for things like custom CSS, colours and custom domains.

Event model?

Hi!

I'm really interested in LambStatus, particularly because all of our applications are using the serverless framework.

Nonetheless, I feel like LambStatus could use an "Event" model. e.g. A client would ping the Event API at the end of a successful deployment.

Currently we are doing this with Cachet as an "incident". Though to my eyes, a deployment is not an incident.

The bigger question is are status pages meant to be a repository of events? Taking a quick look around at the public status pages, no one seems to be doing this.

Delete stack fails

Status reason:
The following resource(s) failed to delete: [AdminPageS3, StatusPageS3].

Do we need doc on stack delete? Or can this be covered in stack delete? I'll try to just delete the buckets and run stack delete again..

Broken on latest commit

The settings page is broken and shows:

[Screenshot, 2017-11-26 6:37:41 pm]

And the status page is stuck on fetching data. It looks like the get-settings and get-public-settings functions are both not working properly.

The Settings table in DynamoDB just contains this (not sure if good or bad):

[Screenshot, 2017-11-26 6:33:53 pm]

Logs say this for /aws/lambda/myapp-GetSettings in CloudWatch:

[Screenshot, 2017-11-26 6:34:57 pm]

Logs say this for /aws/lambda/myapp-GetPublicSettings in CloudWatch:

[Screenshot, 2017-11-26 6:36:18 pm]

Because the code is minified and webpacked, I have no idea how to begin fixing this myself. You may need to provide some info on how to load up the Lambda projects and debug them, as I don't know how to set up a similar development workflow.

API to create and update an incident

From J4cku's comment on reddit:

[–]J4cku

Hi, one question, does it have API so that my internal monitoring
can push info when one of my components go down and remove
it when it's up again?

[–]kyagami[S]

Thank you for asking! So far, there is no API to post an incident
and change components' status. However, it seems some users
need such an API and I'm going to implement it within one month.

I think the API will (partially?) solve the issue #41 and so it will help some users.

Set CloudFront in front of API gateway

So far, clients access the API Gateway directly. This works, but the API Gateway and the corresponding Lambda functions often repeat the same computations, which generates unnecessary costs. If an API is public and its responses can be safely shared among users, the responses should be cached.

API Gateway has its own cache mechanism, but it's costly. So I think putting CloudFront in front of the API Gateway and using CloudFront's cache is the most cost-effective option.

Let the external-metrics API support pagination

So far, the GET external-metrics API returns all the metrics. If someone has a large number of metrics, it may return timeout errors.

This issue lets the external-metrics API support pagination.
(And changes the frontend's action so that all the metrics are still fetched in the end.)
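
On the frontend side, fetching everything from a paginated API could look roughly like this. It is a sketch only: `fetchPage` and the `{ metrics, nextCursor }` response shape are hypothetical, since the issue does not specify how the API would expose pages.

```javascript
// Collects every page from a paginated endpoint.
// `fetchPage(cursor)` is a hypothetical function that resolves to
// { metrics: [...], nextCursor: <string or undefined> }.
// The first call receives an undefined cursor.
async function fetchAllMetrics(fetchPage) {
  const all = [];
  let cursor;
  do {
    const page = await fetchPage(cursor);
    all.push(...page.metrics);
    cursor = page.nextCursor;
  } while (cursor);
  return all;
}
```

Each request stays small enough to avoid the timeout, while the caller still ends up with the complete list.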

Git ignore .env

Folks will try to clone this repo and they will change the .env file. The change can't be committed, and it will make it hard for them to keep their repo in sync with this one.

What if we gitignore .env and add a .env-example for reference?

Update of CF UserEmail param after stack create

I noticed that I had put the wrong domain in my email address when creating the stack :(
That's my bad, but I was wondering whether a stack update with the correct email address could send the email out to me, instead of my having to re-create the stack?

It's minor, but a consideration.

The rounded value in the tooltip causes the problem

From @blw9u2012's comment on gitter:

Brandon Walton @blw9u2012 06:02
@ks888 i'm having issues with displaying metrics from our target groups. It displays the correct graph but the Y axis is only showing a range between 0 and 1. Am i goofing somewhere?
i want to display the Y axis in milliseconds
i've added a metric from the AWS/ApplicationELB namespace and added the TargetResponseTime metric name and dimension for one of our load balancers

Kishin Yagami @ks888 22:35
@blw9u2012 Thank you for asking. The data points are averaged over 5 minutes when the 'Day' time frame is selected. Perhaps the maximum value of your data points is higher than 1.0, but the maximum value of the AVERAGED values is lower than 1.0.
Or, the rounded value in the tooltip may be the problem. Even if the actual value is 0.01, the value in the tooltip is rounded and will be 0. Such behavior seems inconvenient in your case, so it should be fixed.

Brandon Walton @blw9u2012 23:09
@ks888 thanks for responding! I also looked into the metrics and the unit that the metric TargetResponseTime returns is in seconds vs the latency metric in the demo site being milliseconds. I don't think that you can specify another unit to be returned other than seconds. I thought that this possibly could be related but it sounds like the latter issue you described. In any event thanks for responding!
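
One way the tooltip rounding could be relaxed, as a sketch: keep three significant digits for small values and round only larger ones to integers. The function name and the threshold of 100 are assumptions for illustration, not the fix that was actually applied.

```javascript
// Formats a datapoint for the tooltip. Values of 100 or more are rounded to
// an integer; smaller values keep three significant digits, so a value like
// 0.01 is no longer displayed as 0.
function formatTooltipValue(value) {
  if (Math.abs(value) >= 100) {
    return String(Math.round(value));
  }
  // Number(...) drops the trailing zeros that toPrecision adds.
  return String(Number(value.toPrecision(3)));
}
```

A TargetResponseTime of 0.01 seconds would then show as "0.01" instead of "0", while a latency of 250.7 milliseconds would still show as "251".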

Change the background color of the status page

The suggestion from VTHokie2015 at reddit:

[–]VTHokie2015 2 points 10 hours ago 
I think you could make the background a light grey and make the cards have white background
to add more distinction

[–]kyagami 1 point 2 minutes ago 
Wow, I tried your idea and it surely improved the page design! I will change the background color
or maybe enable a user to change that color in the settings page! Thank you!

When the color is #FAFAFA, the status page looks like this:
[Screenshot: status page with #FAFAFA background, 2018-01-17]

My concern is the conflict with the background of the logo image, though LambStatus does not support the feature to change the logo image of the status page header for now. So maybe it's better to enable a user to specify the background color in the settings page.

Support user authentication

So far, the admin page is not protected by user authentication. Anyone who knows the URL of the admin page can change the service status. To stop this, support user authentication (maybe using Amazon Cognito User Pools).

At least these functions are necessary:

  • The admin user can invite a new user (OK to use AWS Console)
  • The invited user can do an initial setup.
  • The user can sign in/out.
  • A user who forgets the password can recover access.
  • Protect API Gateway so that only authenticated users can call its APIs.
  • Create Cognito User Pools using CloudFormation (Note: CloudFormation does not have a Cognito resource)

LambStatus API to post the metric data

From @ccannell's comment on gitter:

Christopher Cannell @ccannell 05:33
I've deployed LambStatus and now I'm a bit lost on the next step. I'd like to check an HTTP server status regularly. What is the best way to approach that? Is there documentation on the LambStatus API? Should I run a periodic lambda to poll my HTTP server and then post to LambStatus API on changes?

Kishin Yagami @ks888 10:42
Thank you for asking. Since LambStatus has few features for monitoring the service, such as alarm, it's better to post your server's status to your monitoring service at first. Then, integrate the monitoring service with LambStatus. If your monitoring service is CloudWatch, the integration is easy. If not, it is harder because there is no LambStatus API to post the data.
LambStatus API to post the data sounds nice. Maybe it's better to support it.

Ability to add notes at a specific time rather than just the current time

As a Status Admin I want the ability to inject notes at a specific time into the timeline so that when I get additional information about an event that occurred in the past I can add when that event occurred with an accurate timestamp.

Use case: I am tracking an incident on a web page outage. I find out that DNS was changed at 4:10 PM, but I was not notified until 4:30 PM. I want to add an event at the time it occurred (4:10 PM) rather than at the time I had a chance to post it (4:30 PM).
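
The request above can be sketched as a timeline helper that accepts an explicit timestamp and defaults to the current time. The function name and the `{ message, occurredAt }` entry shape are hypothetical, not the LambStatus data model.

```javascript
// Adds a note to an incident timeline with an explicit timestamp and returns
// a new timeline ordered newest-first. Timeline entries are assumed to be
// { message: <string>, occurredAt: <Date> }. When no timestamp is given,
// the current time is used, which matches the existing behavior.
function addTimelineNote(timeline, message, occurredAt = new Date()) {
  const updated = [...timeline, { message, occurredAt }];
  updated.sort((a, b) => b.occurredAt - a.occurredAt);
  return updated;
}
```

In the use case above, a note posted at 4:30 PM with `occurredAt` set to 4:10 PM would appear below the 4:30 PM entry, in the order the events actually happened.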

2FA

Add two factor authentication for logging in

A few ideas

  • separate Lambda function that actually does the checking (I would make it modular as there will be a multitude of ways that users want their systems checked)
  • display/import stats from influxdb/grafana
  • embed a twitter feed of selected user

If I get some time I might throw together a PR or 2 with these implemented
