assemblyai / assemblyai-ruby-sdk

The AssemblyAI Ruby SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

Home Page: https://www.assemblyai.com

assemblyai-ruby-sdk's Introduction


AssemblyAI Ruby SDK

The AssemblyAI Ruby SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async transcription, audio intelligence models, and the latest LeMUR models.

The Ruby SDK does not support Streaming STT at this time.

Documentation

Visit the AssemblyAI documentation for step-by-step instructions and more details about our AI models and API.

Quickstart

Install the gem and add it to the application's Gemfile by executing:

bundle add assemblyai

If bundler is not being used to manage dependencies, install the gem by executing:

gem install assemblyai

Import the AssemblyAI package and create an AssemblyAI object with your API key:

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: 'YOUR_API_KEY')

You can now use the client object to interact with the AssemblyAI API.
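
Rather than hard-coding the key, you may prefer to read it from the environment, as the issue report further down does with ENV['ASSEMBLYAI_API_KEY']. A minimal sketch follows; fetch_api_key is a hypothetical helper, not part of the SDK, and the variable name is only a convention.

```ruby
# Hypothetical helper (not part of the SDK): look up the API key in an
# environment-like object and fail early with a clear message if it is missing.
def fetch_api_key(env = ENV, name = 'ASSEMBLYAI_API_KEY')
  key = env[name]
  raise ArgumentError, "#{name} is not set" if key.nil? || key.strip.empty?

  key
end

# Usage (requires the assemblyai gem and a real key in the environment):
#   client = AssemblyAI::Client.new(api_key: fetch_api_key)
```

Failing at startup tends to be easier to debug than an authentication error on the first API call.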

Speech-To-Text

Transcribe an audio file with a public URL
transcript = client.transcripts.transcribe(
  audio_url: 'https://storage.googleapis.com/aai-web-samples/espn-bears.m4a',
)

transcribe queues a transcription job and polls it until the status is completed or error.

If you don't want to wait until the transcript is ready, you can use submit:

transcript = client.transcripts.submit(
  audio_url: 'https://storage.googleapis.com/aai-web-samples/espn-bears.m4a'
)
Transcribe a local audio file
uploaded_file = client.files.upload(file: '/path/to/your/file')
# You can also pass an IO object or base64 string
# uploaded_file = client.files.upload(file: File.new('/path/to/your/file'))

transcript = client.transcripts.transcribe(audio_url: uploaded_file.upload_url)
puts transcript.text

transcribe queues a transcription job and polls it until the status is completed or error.

If you don't want to wait until the transcript is ready, you can use submit:

transcript = client.transcripts.submit(audio_url: uploaded_file.upload_url)
Enable additional AI models

You can extract even more insights from the audio by enabling any of our AI models using transcription options. For example, here's how to enable the speaker diarization model to detect who said what.

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  speaker_labels: true
)

transcript.utterances.each do |utterance|
  # puts adds the newline that a single-quoted printf format would omit.
  puts format('Speaker %<speaker>s: %<text>s', speaker: utterance.speaker, text: utterance.text)
end
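
Once utterances are available, they can be reshaped with plain Ruby. The sketch below assumes only that each utterance responds to speaker and text, as in the example above; format_utterances and text_by_speaker are hypothetical helpers, not SDK methods.

```ruby
# Hypothetical helper: one "Speaker X: text" line per utterance.
# Array(...) turns a nil utterance list into an empty array.
def format_utterances(utterances)
  Array(utterances).map do |utterance|
    format('Speaker %<speaker>s: %<text>s', speaker: utterance.speaker, text: utterance.text)
  end
end

# Hypothetical helper: join everything each speaker said into one string.
def text_by_speaker(utterances)
  Array(utterances).group_by(&:speaker).transform_values do |items|
    items.map(&:text).join(' ')
  end
end

# Usage:
#   puts format_utterances(transcript.utterances)
#   p text_by_speaker(transcript.utterances)
```
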
Get a transcript

This will return the transcript object in its current state. If the transcript is still processing, the status field will be queued or processing. Once the transcript is complete, the status field will be completed.

transcript = client.transcripts.get(transcript_id: transcript.id)
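
If you used submit and want to wait for the result yourself, one option is to poll get until the status becomes completed or error. This is a rough sketch of that idea, not the SDK's own polling logic; wait_for_transcript, interval, and max_attempts are hypothetical names, and the status is compared as a string on the assumption that it matches the values described above.

```ruby
# Hypothetical helper (not part of the SDK): poll a transcript until it
# reaches a terminal status or the attempt budget runs out.
def wait_for_transcript(client, transcript_id, interval: 3, max_attempts: 100)
  max_attempts.times do
    transcript = client.transcripts.get(transcript_id: transcript_id)
    # to_s keeps the comparison working whether status is a String or a Symbol.
    return transcript if %w[completed error].include?(transcript.status.to_s)

    sleep(interval)
  end
  raise "Transcript #{transcript_id} did not finish after #{max_attempts} attempts"
end

# Usage:
#   transcript = wait_for_transcript(client, transcript.id)
```

The attempt limit keeps a stuck job from blocking the caller forever.
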
Get sentences and paragraphs
sentences = client.transcripts.get_sentences(transcript_id: transcript.id)
p sentences

paragraphs = client.transcripts.get_paragraphs(transcript_id: transcript.id)
p paragraphs
Get subtitles
srt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::SRT
)
srt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::SRT,
  chars_per_caption: 32
)

vtt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::VTT
)
vtt = client.transcripts.get_subtitles(
  transcript_id: transcript.id,
  subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::VTT,
  chars_per_caption: 32
)
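
A common next step is writing the subtitles to a file. The sketch below assumes get_subtitles returns string-like content (hence the to_s); save_subtitles is a hypothetical helper, not an SDK method.

```ruby
# Hypothetical helper: fetch subtitles for a transcript and write them to disk.
# chars_per_caption is only forwarded when the caller provides it.
def save_subtitles(client, transcript_id:, subtitle_format:, path:, chars_per_caption: nil)
  options = { transcript_id: transcript_id, subtitle_format: subtitle_format }
  options[:chars_per_caption] = chars_per_caption unless chars_per_caption.nil?

  subtitles = client.transcripts.get_subtitles(**options)
  File.write(path, subtitles.to_s)
  path
end

# Usage:
#   save_subtitles(
#     client,
#     transcript_id: transcript.id,
#     subtitle_format: AssemblyAI::Transcripts::SubtitleFormat::SRT,
#     path: 'captions.srt'
#   )
```
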
List transcripts

This will return a page of transcripts you created.

page = client.transcripts.list

You can pass parameters to .list to filter the transcripts. To paginate through the remaining pages, call the .list_by_url method with the current page's prev_url until it is nil.

# Check prev_url before each request so a nil URL is never passed to list_by_url.
until page.page_details.prev_url.nil?
  page = client.transcripts.list_by_url(url: page.page_details.prev_url)
end
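
To collect the transcripts from every page, the pagination can be wrapped in a small helper. This is a sketch under assumptions: all_transcripts is a hypothetical name, and it assumes each page exposes its items through a transcripts accessor alongside page_details.

```ruby
# Hypothetical helper: gather the transcripts from every page into one array.
# Assumes each page responds to `transcripts` and `page_details.prev_url`.
def all_transcripts(client, **list_params)
  page = client.transcripts.list(**list_params)
  transcripts = page.transcripts.to_a

  until page.page_details.prev_url.nil?
    page = client.transcripts.list_by_url(url: page.page_details.prev_url)
    transcripts.concat(page.transcripts.to_a)
  end

  transcripts
end

# Usage:
#   all_transcripts(client).each { |item| puts item.id }
```

For accounts with many transcripts, processing each page as it arrives may be preferable to holding everything in memory.
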
Delete a transcript
response = client.transcripts.delete(transcript_id: transcript.id)

Apply LLMs to your audio with LeMUR

Call LeMUR endpoints to apply LLMs to your transcript.

Prompt your audio with LeMUR
response = client.lemur.task(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927'],
  prompt: 'Write a haiku about this conversation.'
)
Summarize with LeMUR
response = client.lemur.summary(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927'],
  answer_format: 'one sentence',
  context: {
    'speakers': ['Alex', 'Bob']
  }
)
Ask questions
response = client.lemur.question_answer(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927'],
  questions: [
    {
      question: 'What are they discussing?',
      answer_format: 'text'
    }
  ]
)
Generate action items
response = client.lemur.action_items(
  transcript_ids: ['0d295578-8c75-421a-885a-2c487f188927']
)
Delete LeMUR request
response = client.lemur.task(...)
deletion_response = client.lemur.purge_request_data(request_id: response.request_id)
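
If you always want LeMUR request data removed after use, the task and purge calls can be paired in one helper. A minimal sketch, assuming only the task and purge_request_data calls shown above; with_purged_lemur_task is a hypothetical name.

```ruby
# Hypothetical helper: run a LeMUR task, pass the response to the block,
# and purge the request data afterwards even if the block raises.
def with_purged_lemur_task(client, transcript_ids:, prompt:)
  response = client.lemur.task(transcript_ids: transcript_ids, prompt: prompt)
  begin
    yield response
  ensure
    client.lemur.purge_request_data(request_id: response.request_id)
  end
end

# Usage:
#   with_purged_lemur_task(client, transcript_ids: [transcript.id], prompt: 'Summarize.') do |response|
#     p response
#   end
```

The ensure clause means the purge still runs when the block fails, and the helper returns whatever the block returns.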

assemblyai-ruby-sdk's Issues

Uninitialized constant error since 1.0.0-beta.12+

Hi there!

Since updating to 1.0.0-beta.12 and beyond, the error below started to show up:

NameError: uninitialized constant AssemblyAI::Gemconfig (NameError)

        "X-Fern-SDK-Version": AssemblyAI::Gemconfig::VERSION,

Not sure if I missed anything in the config or a step from the README. Let me know if you need more details. Thanks!

Clarification Needed: Gemspec Dependency Range vs. Faraday Compatibility

Hello Team,

I noticed that the gemspec specifies Faraday greater than 1.10. However, the assemblyai gem seems to require Faraday 2.7 or later. Is this intentional? There appears to be a discrepancy between the declared dependency range and the Faraday versions the gem actually works with. Could you please clarify?

Thank you.
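
The mismatch can be illustrated with RubyGems' own version classes. The requirement strings below are assumptions taken from the wording of this report ('>= 1.10' declared, '>= 2.7' actually needed), not copied from the gemspec, and 1.10.3 is just an example version.

```ruby
require 'rubygems'

# Hypothetical helper: does a version string satisfy a requirement string?
def satisfies?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

declared = '>= 1.10' # assumption: the range described in the report
needed = '>= 2.7'    # assumption: what the gem reportedly needs at runtime

# A Faraday 1.x release passes the declared range but not the real one,
# so Bundler could resolve to a version the gem cannot work with.
puts satisfies?(declared, '1.10.3') # true
puts satisfies?(needed, '1.10.3')   # false
```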

Allow file upload without loading entire file into memory

In other SDKs we use streams or equivalent to only read parts of the file into memory and stream the data to the file upload API.
We support file uploads up to 2.2GB, so we have to be able to upload files without loading the whole file into memory.
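
For illustration, reading a file in fixed-size chunks in Ruby can look like the sketch below. It shows only the reading side; how the chunks would be streamed to the upload API is left open, and each_chunk and its default chunk size are hypothetical, not part of the SDK.

```ruby
# Hypothetical helper: yield an IO's contents in fixed-size chunks so that
# only one chunk is held in memory at a time.
def each_chunk(io, chunk_size: 5 * 1024 * 1024)
  # IO#read(length) returns nil at end of file, which ends the loop.
  while (chunk = io.read(chunk_size))
    yield chunk
  end
end

# Usage:
#   File.open('/path/to/your/file', 'rb') do |file|
#     each_chunk(file) { |chunk| puts chunk.bytesize }
#   end
```

Opening the file in binary mode ('rb') avoids encoding conversions on audio data.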

Word search should accept array of words

Currently, the word search method accepts a single string; it should accept an array of strings:

word = 'foo'
response = client.transcripts.word_search(
  transcript_id: transcript.id,
  words: word
)

It should be:

word = 'foo'
response = client.transcripts.word_search(
  transcript_id: transcript.id,
  words: [word]
)
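
Until the signature changes, callers could normalize their input so that both a single word and a list of words end up as an array. A small sketch; normalize_words is a hypothetical helper, not an SDK method.

```ruby
# Hypothetical helper: accept a single word, an array of words, or nil,
# and always return an array of unique, non-empty strings.
def normalize_words(words)
  Array(words).map(&:to_s).reject(&:empty?).uniq
end

# Usage:
#   response = client.transcripts.word_search(
#     transcript_id: transcript.id,
#     words: normalize_words('foo')
#   )
```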

Missing realtime file

The real-time file was intentionally removed, but the generator also generated imports of that file, and those imports are still in the SDK.
As a result, the SDK cannot be imported:

/Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.4/lib/types_export.rb:49:in `require_relative': cannot load such file -- /Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.4/lib/assemblyai/realtime/types/realtime (LoadError)
	from /Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.4/lib/types_export.rb:49:in `<top (required)>'
	from /Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.4/lib/assemblyai.rb:4:in `require_relative'
	from /Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.4/lib/assemblyai.rb:4:in `<top (required)>'
	from /Users/niels/.rbenv/versions/3.3.0/lib/ruby/3.3.0/bundled_gems.rb:74:in `require'
	from /Users/niels/.rbenv/versions/3.3.0/lib/ruby/3.3.0/bundled_gems.rb:74:in `block (2 levels) in replace_require'
	from /Users/niels/RubymineProjects/AssemblyAISample/transcribe.rb:3:in `<main>'

`from_json': undefined local variable or method `parsed_json' for class AssemblyAI::Transcripts::Transcript (NameError)

I'm running this code:

require 'dotenv/load'
require 'assemblyai'

client = AssemblyAI::Client.new(
  api_key: ENV['ASSEMBLYAI_API_KEY']
)

transcript = client.transcripts.transcribe(
  audio_url: 'https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3',
  speech_model: AssemblyAI::Transcripts::SpeechModel::NANO
)

raise transcript.error unless transcript.error.nil?

puts transcript.text

Which throws this error:

/Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.9/lib/assemblyai/transcripts/types/transcript.rb:453:in `from_json': undefined local variable or method `parsed_json' for class AssemblyAI::Transcripts::Transcript (NameError)

        words = parsed_json["words"]&.map do |v|
                ^^^^^^^^^^^
	from /Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.9/lib/assemblyai/transcripts/client.rb:160:in `submit'
	from /Users/niels/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/assemblyai-1.0.0.pre.beta.9/lib/assemblyai/transcripts/polling_client.rb:76:in `transcribe'
	from /Users/niels/RubymineProjects/AssemblyAISample/transcribe.rb:9:in `<main>'
