mzsm / logia

Transcription and subtitling support app for video creators.

License: MIT License

TypeScript 75.56% HTML 0.12% Python 21.75% CSS 2.57%
closed-captions closedcaptions subtitles transcribe transcription whisper speech-to-text

Logia

DOWNLOAD HERE

What Can This App Do For You?

Transcribes video and audio files automatically using speech recognition.
Supports about 100 languages, including English, Japanese, Chinese, French, and Korean.
If the automatic transcription is wrong, you can correct it manually.
Exports subtitle files for YouTube and HTML5 video, so you can easily add high-quality subtitles to your videos.
Also exports CSV and plain text, so it is useful well beyond video production.
Speech recognition runs entirely on your machine, so audio data is never sent anywhere else.
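
As an illustration of the subtitle export described above, here is a minimal sketch that renders timed text as WebVTT, the format used by HTML5 `<track>` elements. This README does not say which subtitle formats Logia actually writes, so the choice of WebVTT is an assumption, and `Cue`, `format_timestamp`, and `to_webvtt` are hypothetical names, not part of the app.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle cue (hypothetical structure, not Logia's own type)."""
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Render cues as the full text of a WebVTT file."""
    blocks = ["WEBVTT"]  # mandatory file header
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text.strip()}")
    # The header and each cue are separated by one blank line.
    return "\n\n".join(blocks) + "\n"
```

Writing the returned string to a `.vtt` file with UTF-8 encoding should give a file that an HTML5 video player can load as a text track.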

System Requirements

  • OS
    • A recent, stable release of Windows or macOS
    • On Macs with Apple Silicon: macOS 13.5 or later
  • CPU
    • As fast as possible
  • Memory
    • As much as possible

Recommended

The app runs faster if your computer has the following hardware.

On Windows

  • A CUDA-capable NVIDIA GPU

On macOS

  • Apple Silicon

Performance Benchmark Results

The figures below are sample measurements taken by the developer and friends.
Processing time depends heavily on your PC's specifications, the content of the audio, and other factors.
Treat them as a rough guide only.

Sample #1

  • Language
    • Japanese
  • Duration
    • 20:03
  • Genre
    • Conversation between two people

CPU Only

| CPU | Mem. | Model | Time | Speed |
|---|---|---|---|---|
| Ryzen 7 7840U | 32GB | Medium | 12:14 | x1.63 |
| Ryzen 7 7840U | 32GB | Large-v3 | 19:38 | x1.02 |

With a CUDA-capable NVIDIA GPU

| CPU | GPU | Mem. | Model | Time | Speed |
|---|---|---|---|---|---|
| i7-6700K | GeForce GTX 1070 Ti | 32GB | Medium | 05:03 | x3.97 |
| i7-6700K | GeForce GTX 1070 Ti | 32GB | Large-v3 | 08:13 | x2.44 |
| Ryzen 7 7745HX | GeForce RTX 4090 | 64GB | Medium | 01:14 | x16.45 |
| Ryzen 7 7745HX | GeForce RTX 4090 | 64GB | Large-v3 | 01:27 | x13.86 |

With Apple Silicon

| CPU/GPU | GPU Cores | Mem. | Model | Time | Speed |
|---|---|---|---|---|---|
| Apple M1 | 8 | 16GB | Medium | 04:46 | x4.20 |
| Apple M1 | 8 | 16GB | Large-v3 | 08:50 | x2.27 |
| Apple M2 | 10 | 24GB | Medium | 03:42 | x5.42 |
| Apple M2 | 10 | 24GB | Large-v3 | 06:25 | x3.12 |

Sample #2

  • Language
    • English
  • Duration
    • 21:03
  • Genre
    • Solo speech

CPU Only

| CPU | Mem. | Model | Time | Speed |
|---|---|---|---|---|
| Ryzen 7 7840U | 32GB | Medium | 10:59 | x1.91 |
| Ryzen 7 7840U | 32GB | Large-v3 | 15:04 | x1.39 |

With a CUDA-capable NVIDIA GPU

| CPU | GPU | Mem. | Model | Time | Speed |
|---|---|---|---|---|---|
| i7-6700K | GeForce GTX 1070 Ti | 32GB | Medium | 05:05 | x4.14 |
| i7-6700K | GeForce GTX 1070 Ti | 32GB | Large-v3 | 06:21 | x3.31 |
| Ryzen 7 7745HX | GeForce RTX 4090 | 64GB | Medium | 01:04 | x19.76 |
| Ryzen 7 7745HX | GeForce RTX 4090 | 64GB | Large-v3 | 01:18 | x16.37 |

With Apple Silicon

| CPU/GPU | GPU Cores | Mem. | Model | Time | Speed |
|---|---|---|---|---|---|
| Apple M1 | 8 | 16GB | Medium | 04:28 | x4.72 |
| Apple M1 | 8 | 16GB | Large-v3 | 08:08 | x2.59 |
| Apple M2 | 10 | 24GB | Medium | 03:19 | x6.37 |
| Apple M2 | 10 | 24GB | Large-v3 | 05:55 | x3.56 |
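
The Speed column in the benchmark results appears to be the audio duration divided by the processing time, so x2.44 would mean the file was transcribed 2.44 times faster than real time. The README does not state the formula, so this reading is an assumption inferred from the figures. The sketch below recomputes one row under that assumption; `parse_clock` and `speed_factor` are hypothetical helper names. Because the listed times are rounded to whole seconds, recomputed values for other rows can differ slightly in the last digits.

```python
def parse_clock(value: str) -> int:
    """Convert "MM:SS" or "HH:MM:SS" into whole seconds."""
    seconds = 0
    for part in value.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def speed_factor(duration: str, processing_time: str) -> float:
    """Return audio duration divided by processing time (assumed definition)."""
    return parse_clock(duration) / parse_clock(processing_time)


# Sample #1, GTX 1070 Ti, Large-v3: 20:03 of audio processed in 08:13.
print(f"x{speed_factor('20:03', '08:13'):.2f}")  # prints "x2.44"
```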

Dependencies

This app is built with the following libraries.

  • Faster Whisper
    • A reimplementation of OpenAI's Whisper model using CTranslate2.
    • (Original)
      • Whisper
        • A set of open source speech recognition models from OpenAI.

On Apple Silicon

  • MLX
    • An array framework for machine learning research on Apple Silicon.
  • mlx-whisper
    • OpenAI Whisper on Apple Silicon with MLX and the Hugging Face Hub.
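
To show roughly how these libraries fit together, here is a minimal sketch that picks a backend by platform and runs Faster Whisper. How Logia itself selects and calls its backend is not described in this README, so `pick_backend` and `transcribe_with_faster_whisper` are hypothetical helpers, and the platform check is an assumption. The `WhisperModel` and `transcribe` calls follow the usage documented by the faster-whisper package.

```python
import platform
from typing import Optional


def pick_backend(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Return "mlx-whisper" on Apple Silicon Macs, else "faster-whisper".

    Hypothetical helper for illustration only. Note that a Python build
    running under Rosetta reports "x86_64" and falls back to faster-whisper.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Darwin" and machine == "arm64":
        return "mlx-whisper"
    return "faster-whisper"


def transcribe_with_faster_whisper(path: str, model_size: str = "medium"):
    """Yield (start, end, text) for each segment recognized in the file."""
    # Imported lazily so this module loads without the package installed.
    from faster_whisper import WhisperModel

    # device="auto" lets the library use CUDA when it is available.
    model = WhisperModel(model_size, device="auto", compute_type="default")
    segments, _info = model.transcribe(path)
    for segment in segments:
        yield segment.start, segment.end, segment.text
```

The model sizes used in the benchmarks, `medium` and `large-v3`, are both names that Faster Whisper accepts for `model_size`.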

License

MIT License

Where the Name Comes From

Logia comes from the Greek word λόγια (the plural of λόγος (logos)), which means "words". It is also a pun on "log" and "AI".
