The Attendi Speech Service API allows you to quickly obtain an accurate transcription of an uploaded audio file. The API is a SaaS solution for healthcare institutions. We have different language models dedicated to a variety of domains within healthcare.
This document will take you through the basic steps to get the Attendi Speech Service API up and running, so that you can integrate it seamlessly into your own application. We will show you how to authenticate your requests to the API, link customers and transcribe audio.
The Attendi Speech Service API uses your account’s API keys to authenticate requests. Attendi will provide you with your API keys. If you don’t include your API key when making an API request, the Speech Service API returns an error.
There are two types of API keys: publishable and secret.
Secret keys should be kept confidential, and you should only store them on your servers. You must not share your secret API key with any third parties. The secret API key should be treated like a password.
Publishable key: transcribe audio directly from a client application where the API key cannot be secured.
Secret key: use the Customers endpoints, and transcribe audio via machine-to-machine communication.
Authentication to the API is done via a custom HTTP header. Provide your API key in the following header: x-API-key.
All API requests must be made over HTTPS. Requests made over plain HTTP will fail, as will requests made without authentication.
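As a minimal sketch of the two rules above, the following Python snippet builds (but does not send) an authenticated request using only the standard library. The host name api.example.com and the helper name build_request are placeholders chosen for illustration, not part of the Attendi API. Only the x-API-key header and the HTTPS requirement come from this document.

```python
import urllib.request
from urllib.parse import urlsplit


def build_request(url: str, api_key: str, body: bytes = b"", method: str = "POST") -> urllib.request.Request:
    """Return a request that carries the API key; refuse anything that is not HTTPS."""
    if urlsplit(url).scheme != "https":
        raise ValueError("the API only accepts requests over HTTPS")
    if not api_key:
        raise ValueError("an API key is required")
    request = urllib.request.Request(url, data=body or None, method=method)
    # urllib stores the name as "X-api-key"; HTTP header names are case-insensitive.
    request.add_header("x-API-key", api_key)
    return request


# Hypothetical host: replace it with the base URL you received from Attendi.
request = build_request("https://api.example.com/v1/speech/transcribe", "YOUR_API_KEY")
```

Checking the scheme on the client side is optional, since the server rejects plain HTTP anyway, but it avoids sending the key over an unencrypted connection by mistake.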
Cross-origin requests are only allowed for the transcribe endpoint, using your publishable key from a frontend application.
All traffic is protected by end-to-end TLS/SSL encryption, so both the audio sent to the backend and the transcription returned to your app are encrypted in transit. We only accept requests over HTTPS.
Requests to transcribe audio with the Attendi Speech Service API can be parameterized with a customer ID, which lets you track usage per customer. Several customer endpoints let you create, update or retrieve customers. Each customer has both an internal ID (generated by us) and an external ID (provided by you). The name of a customer can be updated using either ID. To update the name using the internal ID, use the PATCH method. To insert or update the name using the external ID, use the PUT method.
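The choice between PATCH and PUT can be sketched as below. This is an illustration only: the host, the /v1/customers/{id} path, the "name" body field and the helper name build_customer_update are all assumptions, since this document does not list the exact customer endpoints. Only the mapping from ID type to HTTP method is taken from the text above.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # hypothetical host


def build_customer_update(customer_id: str, name: str, secret_key: str, id_kind: str) -> urllib.request.Request:
    """Build (without sending) a request that sets a customer's name."""
    # From the text: internal ID -> PATCH (update), external ID -> PUT (insert or update).
    methods = {"internal": "PATCH", "external": "PUT"}
    if id_kind not in methods:
        raise ValueError("id_kind must be 'internal' or 'external'")
    # Hypothetical path and body shape; check the API reference for the real ones.
    url = f"{BASE_URL}/v1/customers/{quote(customer_id, safe='')}"
    body = json.dumps({"name": name}).encode("utf-8")
    request = urllib.request.Request(url, data=body, method=methods[id_kind])
    request.add_header("x-API-key", secret_key)  # customer endpoints need the secret key
    request.add_header("Content-Type", "application/json")
    return request
```

Because the customer endpoints require the secret key, a call like this belongs on your server, not in a client application.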
You can send a POST request to the /v1/speech/transcribe endpoint to obtain a transcription of an audio file. The audio file you want to transcribe must be encoded in base64 so that the binary audio data is converted into characters. Additionally, the audio should meet the following specifications:
Single (mono) channel recording
16 kHz sampling rate
16-bit audio recording
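If your audio is stored as a WAV file, the three specifications above can be checked before uploading. The sketch below uses Python's standard wave module. The helper name check_wav is made up for this example, and the assumption that the audio is in a WAV container is ours, not a requirement stated in this document.

```python
import wave


def check_wav(source) -> list:
    """Return a list of problems; an empty list means the file meets the specifications.

    `source` is a file path (str) or a binary file object containing WAV data.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected 1 channel (mono), found {wav.getnchannels()}")
        if wav.getframerate() != 16000:
            problems.append(f"expected 16000 Hz sampling rate, found {wav.getframerate()}")
        if wav.getsampwidth() != 2:  # sample width is reported in bytes; 2 bytes = 16 bits
            problems.append(f"expected 16-bit samples, found {wav.getsampwidth() * 8}-bit")
    return problems
```

For example, check_wav("recording.wav") returns an empty list for a compliant file, and one message per violated requirement otherwise.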
If your file does not meet these requirements, you can convert it with a tool such as ffmpeg, a command-line tool for converting audio.
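One possible ffmpeg invocation, written as a Python argument list so that it can be run with subprocess, is sketched below. The file names and the helper name ffmpeg_convert_command are placeholders, and writing the result as a WAV file is an assumption. The flags themselves are standard ffmpeg audio options: -ac sets the channel count, -ar the sampling rate, and -sample_fmt the sample format.

```python
import subprocess


def ffmpeg_convert_command(src: str, dst: str) -> list:
    """Build an ffmpeg command that converts `src` to mono, 16 kHz, 16-bit audio."""
    return [
        "ffmpeg",
        "-i", src,             # input file
        "-ac", "1",            # one (mono) channel
        "-ar", "16000",        # 16 kHz sampling rate
        "-sample_fmt", "s16",  # signed 16-bit samples
        dst,                   # output file, e.g. a .wav file
    ]


# To actually run the conversion (requires ffmpeg to be installed and on PATH):
# subprocess.run(ffmpeg_convert_command("input.mp3", "output.wav"), check=True)
```

The equivalent shell command is: ffmpeg -i input.mp3 -ac 1 -ar 16000 -sample_fmt s16 output.wav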
Besides the audio file, you must select the language model you want to use (DistrictCare, ResidentialCare or MentalHealthcare) and provide a unique UserID. The UserID can be any unique value, for example a UUID. We do not save any personal data, only the unique UserID, which we use to measure usage and to trace problems with the functionality. Optionally, you can add the browser’s User-Agent as metadata.
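Putting the pieces together, a request body for the /v1/speech/transcribe endpoint could be assembled roughly as follows. The JSON field names ("audio", "model", "userId", "metadata", "userAgent") and the helper name build_transcribe_body are assumptions made for this sketch, because this document does not spell out the request schema. The base64 encoding, the three model names and the UserID requirement come from the text above.

```python
import base64
import json
import uuid
from typing import Optional

LANGUAGE_MODELS = {"DistrictCare", "ResidentialCare", "MentalHealthcare"}


def build_transcribe_body(audio: bytes, model: str, user_id: str,
                          user_agent: Optional[str] = None) -> bytes:
    """Return a JSON request body with the audio encoded as base64 text."""
    if model not in LANGUAGE_MODELS:
        raise ValueError(f"unknown language model: {model!r}")
    if not user_id:
        raise ValueError("a unique UserID is required")
    # Field names below are hypothetical; check the API reference for the real schema.
    payload = {
        "audio": base64.b64encode(audio).decode("ascii"),
        "model": model,
        "userId": user_id,
    }
    if user_agent is not None:
        payload["metadata"] = {"userAgent": user_agent}
    return json.dumps(payload).encode("utf-8")


# A random UUID is one way to obtain a unique, non-personal UserID.
example_user_id = str(uuid.uuid4())
```

In practice you would generate the UserID once per user and store it, so that usage and problem reports stay traceable to the same identifier across requests.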