Introduction to WAAS
WAAS, or Whisper as a Service, is a tool for handling audio and video transcription tasks. Users can interact with it through either a graphical user interface (GUI) or an application programming interface (API).
Core Features
WAAS handles several kinds of transcription task through its API. To add a new transcription job to the queue, users make a POST request to the /v1/transcribe endpoint; a worker then processes the job asynchronously. The /v1/detect endpoint detects the language of an audio file. Output formats are flexible as well: when retrieving a finished job's result from the /v1/download/<job_id> endpoint, users can choose from options such as a JSON response, plain text with timecodes, WebVTT files, and more.
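As a rough illustration of how the three endpoints fit together, the sketch below only builds their URLs and sends nothing over the network. The base URL, the helper names, and the "output" query parameter are assumptions made for this example, not details confirmed by the text above.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; replace it with the address of your own WAAS deployment.
BASE_URL = "http://localhost:5000"


def endpoint(path, **query):
    """Join the base URL, an endpoint path, and optional query parameters."""
    url = BASE_URL.rstrip("/") + path
    if query:
        url += "?" + urlencode(query)
    return url


def transcribe_url(**options):
    # A POST request to this URL queues a new transcription job.
    return endpoint("/v1/transcribe", **options)


def detect_url():
    # Language detection endpoint for an audio file.
    return endpoint("/v1/detect")


def download_url(job_id, **options):
    # The finished result of a job is retrieved from this URL.
    return endpoint("/v1/download/" + quote(str(job_id), safe=""), **options)


print(transcribe_url())
print(detect_url())
# "output" is an assumed parameter name for selecting the result format.
print(download_url("abc123", output="vtt"))
```

Because the job runs asynchronously, a client would typically keep the job identifier returned when the job is queued and use it later to build the download URL.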
Basic Usage
Getting started with WAAS is relatively straightforward. Installation requires a Python environment, version 3.8 to 3.10. After setting the necessary environment variables and creating a .envrc file with the required configuration, users can run the full setup with docker-compose. Those who prefer remote development can use devcontainers instead. To transcribe a file, users send requests with curl. For example, to upload a Norwegian audio file with an email callback, they issue a curl command whose parameters specify the output format, the language, and the model. Alternatively, the request can specify a webhook callback rather than an email address.
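The upload described above can also be sketched in Python rather than curl. The example below only constructs the request object and does not send it. The parameter names (output, language, model, email_callback, webhook_id), the audio/mpeg content type, and the example values are assumptions for illustration; check them against your WAAS deployment before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_transcribe_request(base_url, audio, *, output, language, model,
                             email_callback=None, webhook_id=None):
    """Build (but do not send) a POST request for /v1/transcribe.

    All query parameter names here are assumed for this sketch.
    """
    params = {"output": output, "language": language, "model": model}
    if email_callback:
        params["email_callback"] = email_callback
    if webhook_id:
        params["webhook_id"] = webhook_id
    url = base_url.rstrip("/") + "/v1/transcribe?" + urlencode(params)
    # The audio bytes form the request body; the content type is an assumption.
    return Request(url, data=audio,
                   headers={"Content-Type": "audio/mpeg"}, method="POST")


# Placeholder bytes stand in for the contents of a real audio file.
req = build_transcribe_request(
    "http://localhost:5000",      # hypothetical deployment address
    b"fake audio bytes",
    output="vtt",
    language="norwegian",
    model="large",
    email_callback="user@example.com",
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would actually submit the job; swapping email_callback for webhook_id would request a webhook callback instead.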
WAAS compares favorably to some existing transcription services. Where other services may offer fewer output options or a less intuitive API, WAAS combines a range of output formats with both a GUI and an API, which makes it suitable for novice and experienced users alike.