serverless-transcribe

A simple, serverless web UI for Amazon Transcribe. Supports WAV, FLAC, AMR, MP3, MP4, Ogg (Opus), and WebM audio without any fixed costs.

How it Works

Once the project has been launched in CloudFormation, you will have access to a webpage that allows users to upload audio files. The page uploads the files directly to S3. The S3 bucket is configured to watch for audio files. When it sees new audio files, an AWS Lambda function is invoked, which starts a transcription job.
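As a rough illustration of the upload-triggered step, the sketch below shows how a Lambda handler might turn an S3 event notification into arguments for Transcribe's StartTranscriptionJob call. The helper name build_job_request, the job-name scheme, and the en-US language code are assumptions made for this example; the project's actual function may be written differently.

```python
import uuid
from urllib.parse import unquote_plus


def build_job_request(s3_event, language_code="en-US"):
    """Build StartTranscriptionJob arguments from an S3 event notification.

    Hypothetical helper: the job-name scheme and the default language
    code are assumptions, not taken from the project.
    """
    record = s3_event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    return {
        "TranscriptionJobName": f"upload-{uuid.uuid4()}",
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
    }


# Trimmed-down example of the event S3 delivers to Lambda.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-media"},
                "object": {"key": "uploads/interview%281%29.mp3"},
            }
        }
    ]
}

request = build_job_request(sample_event)
print(request["Media"]["MediaFileUri"])
```

Inside a Lambda function, the resulting dictionary could be passed to boto3 as boto3.client("transcribe").start_transcription_job(**request).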

File detection is based on the file extension. Supported extensions are: .wav, .flac, .amr, .3ga, .mp3, .mp4, .m4a, .oga, .ogg, .opus, and .webm.
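A minimal sketch of extension-based detection, using the list above. The function name is_supported_audio is hypothetical, and treating extensions case-insensitively is an assumption rather than documented behavior.

```python
from pathlib import PurePosixPath

# Extensions listed in this README as supported.
SUPPORTED_EXTENSIONS = {
    ".wav", ".flac", ".amr", ".3ga", ".mp3", ".mp4",
    ".m4a", ".oga", ".ogg", ".opus", ".webm",
}


def is_supported_audio(key):
    """Return True when the object key ends in a supported extension.

    Lower-casing the suffix is an assumption made for this sketch.
    """
    return PurePosixPath(key).suffix.lower() in SUPPORTED_EXTENSIONS


for key in ["uploads/talk.mp3", "uploads/TALK.WAV", "uploads/notes.txt"]:
    print(key, is_supported_audio(key))
```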

Another Lambda function is triggered by an EventBridge event when the transcription job completes or fails. An email is then sent to the user who uploaded the file, containing either details about the job failure or the raw transcript extracted from the job results.
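The sketch below illustrates how such a function might branch on the job status. It assumes the standard "Transcribe Job State Change" event, whose detail carries TranscriptionJobName and TranscriptionJobStatus; the function name and the returned action labels are hypothetical, and sending the email itself is left out.

```python
def summarize_job_event(event):
    """Decide what to do with a Transcribe job state change event.

    Hypothetical helper: the action labels are invented for this sketch.
    """
    detail = event["detail"]
    job_name = detail["TranscriptionJobName"]
    status = detail["TranscriptionJobStatus"]

    if status == "COMPLETED":
        return {"job": job_name, "action": "send_transcript"}
    if status == "FAILED":
        # Failure details could be looked up separately with GetTranscriptionJob.
        return {"job": job_name, "action": "send_failure_notice"}
    # Any other status is ignored in this sketch.
    return {"job": job_name, "action": "ignore"}


sample_event = {
    "source": "aws.transcribe",
    "detail-type": "Transcribe Job State Change",
    "detail": {
        "TranscriptionJobName": "upload-1234",
        "TranscriptionJobStatus": "FAILED",
    },
}

print(summarize_job_event(sample_event))
```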

The webpage is protected by HTTP Basic authentication, with a single set of credentials that you set when launching the stack. This is handled by an authorizer on the API Gateway, and could be extended to allow for more robust authorization schemes.
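For illustration only, an authorizer of this kind could check the Authorization header and return an IAM policy roughly as follows. The function names and the "user" principal are assumptions for this sketch; the project's authorizer may be implemented differently.

```python
import base64
import hmac


def check_basic_auth(header_value, expected_user, expected_password):
    """Return True when a Basic Authorization header carries the expected credentials."""
    scheme, _, encoded = (header_value or "").partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and invalid UTF-8.
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking the mismatch position through timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok


def authorizer_response(allowed, method_arn):
    """Build the policy document an API Gateway Lambda authorizer returns."""
    return {
        "principalId": "user",  # placeholder principal for this sketch
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": "Allow" if allowed else "Deny",
                    "Resource": method_arn,
                }
            ],
        },
    }


header = "Basic " + base64.b64encode(b"admin:s3cret").decode("ascii")
allowed = check_basic_auth(header, "admin", "s3cret")
print(authorizer_response(allowed, "arn:aws:execute-api:example")["policyDocument"]["Statement"][0]["Effect"])
```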

Amazon Transcribe currently has file limits of 4 hours and 2 GB.

AWS Costs

The costs of running and using this project are almost entirely based on usage. Media files uploaded to S3 are set to expire after one day, and the resulting files in the transcripts bucket expire after 30 days. The Lambda functions have no fixed costs, so you will only be charged when they are invoked. Amazon Transcribe is "pay-as-you-go based on the seconds of audio transcribed per month".

Most resources created from the CloudFormation template include a Project resource tag, which you can use for cost allocation. Transcription jobs also include this tag, and can include an optional tag defined using stack parameters.

How to Use

The project is organized using a SAM CloudFormation template. Launching a stack from this template will create all the resources necessary for the system to work.

Requirements

  • The stack must be launched in an AWS region that supports SES. The addresses that SES will send to and from are determined by your SES domain verification and sandboxing status.

Deploying from the AWS Serverless Application Repository

The application can be deployed from the AWS Serverless Application Repository into your own AWS account. This is the easiest way to get started with serverless-transcribe.

Using the SAM CLI to deploy

The app can also be deployed using the AWS SAM CLI. Once the CLI is installed, you can run sam deploy --guided in the project directory to deploy the application. (After the first deploy, you can use sam deploy if samconfig.toml is present in the directory.)

Any other deployment method that is compatible with SAM templates would also work.

Note: The deploy script that was previously included in the project is no longer supported.

serverless-transcribe's People

Contributors

farski, topher200

serverless-transcribe's Issues

AccessDenied: Invalid according to Policy

@farski After creating an S3 bucket with versioning enabled, I deployed the solution.
I get a link, fill in the email, set speakers to one, and upload a short video.
As soon as I click upload, I get the following error:

<Error> <Code>AccessDenied</Code> <Message> Invalid according to Policy: Policy Condition failed: ["starts-with", "$Content-Type", "audio/"] </Message> <RequestId>16839082886D08F3</RequestId> <HostId> 5rEgrEN3HIyNty2CLkuxaDL6EUv/X1Kn0OrFw4oIAW8JRpuHinUfq92PyADc6oqVB547oN8dkXI= </HostId> </Error>
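The error suggests that the upload form's POST policy only accepts files whose Content-Type begins with audio/, and a browser would typically label a video file with a video/ type. The sketch below imitates that check; satisfies_starts_with is a hypothetical helper written for illustration, not part of S3 or of this project.

```python
def satisfies_starts_with(condition, form_fields):
    """Evaluate a ["starts-with", "$Field", "prefix"] policy condition.

    Hypothetical helper that mimics the check locally.
    """
    operator, field, prefix = condition
    if operator != "starts-with":
        raise ValueError(f"unsupported operator: {operator}")
    # Policy field names are written as "$Content-Type"; drop the "$".
    value = form_fields.get(field.lstrip("$"), "")
    return value.startswith(prefix)


condition = ["starts-with", "$Content-Type", "audio/"]

print(satisfies_starts_with(condition, {"Content-Type": "video/mp4"}))
print(satisfies_starts_with(condition, {"Content-Type": "audio/mpeg"}))
```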

README could be more clear on how to create UPLOAD_ACCESS_KEY

The way the README is written, it's a little confusing how many S3 buckets one must create. The README clearly says that you must create one (STACK_RESOURCES_BUCKET), but then it talks about providing access keys to a different bucket (MEDIA_BUCKET_IDENTIFIER). Does one have to create the second bucket to be able to provide access keys to it? Or is there a way to generate the access keys without knowing the bucket ahead of time?

To resolve this, I'd do one of these things:

  1. Provide explicit instructions on how to generate UPLOAD_ACCESS_KEY.
  2. Have the CloudFormation script generate UPLOAD_ACCESS_KEY instead, if that is possible. Why does the user have to create the IAM role themselves at all?

{"message":null}

I tried your project and deployed it with deploy.sh without problems. But when I open the CloudFormation output link and log in with the correct username and password, no page appears for uploading a file to transcribe; it only displays {"message":null}. I don't know how to fix this and would appreciate any help. Thanks.
