Amazon Transcribe POC

  • pricing

Billed by audio duration // 0.024 USD per minute // ap-northeast-2
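
As a rough illustration of the arithmetic, the per-minute rate above can be turned into a cost estimate. This is a minimal sketch that assumes purely linear billing; it ignores any minimum charge per request, volume tiers, or free tier, and the function name `estimate_cost_usd` is made up for this example.

```python
from decimal import Decimal

# Rate quoted above for ap-northeast-2 (USD per minute of audio).
RATE_USD_PER_MINUTE = Decimal("0.024")


def estimate_cost_usd(audio_seconds: int) -> Decimal:
    """Linear estimate: audio duration in minutes times the per-minute rate."""
    return RATE_USD_PER_MINUTE * Decimal(audio_seconds) / Decimal(60)


if __name__ == "__main__":
    # A 12-minute video, the same length as the one used in the performance test.
    print(estimate_cost_usd(12 * 60))
```

Decimal is used instead of float so that small per-minute amounts do not pick up binary rounding noise.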

  • performance

Converting a 12-minute MP4 video to SRT takes about 1 minute.
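
A timing like this could be reproduced by polling the job status until it finishes. The sketch below assumes a boto3 Transcribe client (or anything with a compatible `get_transcription_job` method) is passed in; `wait_for_job` and its parameters are hypothetical names, and the elapsed time is measured on the client side from the first poll, not from job submission.

```python
import time


def wait_for_job(client, job_name, poll_seconds=5.0, timeout_seconds=900.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job is COMPLETED or FAILED; return (status, elapsed_seconds)."""
    start = clock()
    while True:
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        status = response["TranscriptionJob"]["TranscriptionJobStatus"]
        elapsed = clock() - start
        if status in ("COMPLETED", "FAILED"):
            return status, elapsed
        if elapsed >= timeout_seconds:
            raise TimeoutError(f"job {job_name} still {status} after {elapsed:.0f}s")
        # Job is still QUEUED or IN_PROGRESS: wait before asking again.
        sleep(poll_seconds)
```

The `sleep` and `clock` arguments are injectable only so the loop can be exercised without real waiting; in normal use the defaults are enough.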

  • input

Amazon S3 // MP3, MP4, WAV, FLAC, AMR, OGG, and WebM.

  • output

Amazon S3 // SRT, Text, VTT
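
The S3-in, S3-out flow described in the input and output sections could be sketched with boto3 roughly as follows. The bucket, key, and job names are placeholders, `ko-KR` is an assumed language code, and the helper names `build_job_request` and `start_job` are invented for this example; the media format is naively taken from the file extension.

```python
def build_job_request(job_name, bucket, key, language_code="ko-KR"):
    """Build keyword arguments for the Transcribe start_transcription_job call."""
    # Assumption: the object key ends with a supported extension such as ".mp4".
    media_format = key.rsplit(".", 1)[-1].lower()
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        # Write results back to the same bucket (an assumption for simplicity).
        "OutputBucketName": bucket,
        # Ask for subtitle files in addition to the transcript.
        "Subtitles": {"Formats": ["srt", "vtt"]},
    }


def start_job(job_name, bucket, key, region="ap-northeast-2"):
    """Submit the job; requires boto3 and AWS credentials to be configured."""
    import boto3  # imported here so build_job_request works without boto3

    client = boto3.client("transcribe", region_name=region)
    return client.start_transcription_job(**build_job_request(job_name, bucket, key))
```

A call such as `start_job("demo-job", "my-bucket", "videos/sample.mp4")` would then submit the job, assuming the caller has permission to read and write that bucket.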

  • alternative

Naver CLOVA Note - https://clovanote.naver.com/

Google STT AI - https://cloud.google.com/speech-to-text?hl=ko

OpenAI Whisper - https://platform.openai.com/docs/guides/speech-to-text

  • note

If you plan to integrate with other AWS services, Amazon Transcribe is the natural choice. It has a clear advantage in network cost and architecture.

Real-time (streaming) speech recognition is also possible, but it was not considered here because its performance and accuracy are poor.

  • reference

https://aws.amazon.com/transcribe/

https://aws.amazon.com/ko/blogs/korea/amazon-transcribe-now-supports-speech-to-text-in-korean/

https://www.awsgeek.com/Amazon-Transcribe/

  • comparison

Service            Pricing (60s)   Limit
Amazon Transcribe  32 KRW          2 GB
OpenAI Whisper     8 KRW           25 MB
Clova Voice        60 KRW          2 GB
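
To make the comparison concrete, the figures in the table can be applied to a specific file. This is only a sketch: it assumes the "Limit" column is a maximum upload size, treats MB and GB as decimal units, and prices every service linearly per 60 seconds. The `compare` function is a hypothetical helper.

```python
MB = 1000 ** 2
GB = 1000 ** 3

# Figures copied from the comparison table: (KRW per 60 s, size limit in bytes).
SERVICES = {
    "Amazon Transcribe": (32, 2 * GB),
    "OpenAI Whisper": (8, 25 * MB),
    "Clova Voice": (60, 2 * GB),
}


def compare(audio_seconds, file_bytes):
    """Return {service: cost in KRW} for services whose limit fits the file."""
    minutes = audio_seconds / 60
    return {
        name: price * minutes
        for name, (price, limit) in SERVICES.items()
        if file_bytes <= limit
    }


if __name__ == "__main__":
    # A 12-minute video of about 100 MB: too large for the 25 MB limit.
    print(compare(12 * 60, 100 * MB))
```

Under these assumptions the cheapest per-minute option drops out for larger files unless the audio is split or compressed first.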