Using the /transcribe API with AWS S3
Overview
Amazon Web Services (AWS) Simple Storage Service (S3) is a common location for archiving audio files and metadata together in zip files. If you already have such files stored in S3, you can use the /transcribe API's support for S3 to process them from that location. This can save upload time, because Conversations typically must upload your files to S3 for processing.
To use the /transcribe API with S3 from the command line, you must pass your AWS_ACCESS_KEY (referred to in the /transcribe API as your aws_id) and AWS_SECRET_KEY (referred to as your aws_secret) by using curl's support for submitting form fields.
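As a rough illustration of the credential mapping described above, the following Python sketch reads the two keys from environment variables and places them in the aws_id and aws_secret form fields. The helper name s3_credential_fields is hypothetical and exists only for this example. The sketch does not call the API, and it does not show the other fields that a /transcribe request needs.

```python
import os


def s3_credential_fields(env=None):
    """Map AWS credentials to the aws_id / aws_secret form fields.

    Hypothetical helper for illustration only; it does not call the API.
    """
    env = os.environ if env is None else env
    required = ("AWS_ACCESS_KEY", "AWS_SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variable(s): " + ", ".join(missing))
    return {
        "aws_id": env["AWS_ACCESS_KEY"],
        "aws_secret": env["AWS_SECRET_KEY"],
    }


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    demo_env = {"AWS_ACCESS_KEY": "EXAMPLE_ID", "AWS_SECRET_KEY": "EXAMPLE_SECRET"}
    print(s3_credential_fields(demo_env))
```

Reading the keys from the environment rather than hard-coding them keeps the secret out of scripts and shell history. The resulting dictionary would be sent as form fields by whichever HTTP client you use.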
The following is the general format of a curl command that calls the /transcribe API to transcribe a file or directory that is stored in S3:
Fields to Provide
The user-specific fields that you need to provide are the following:
The authorization token that you are using to retrieve information.
A company's authorization token is found on the Conversations Accounts page in the Company section.
The Amazon key for the bucket in which the file that you want to transcribe is stored.
The secret Amazon key for the bucket in which the file that you want to transcribe is stored.
The Amazon S3 bucket in which the file that you want to transcribe is stored.
path/to/file/or/directory
The path to the file that you want to process, to a zip file that contains the audio file that you want to process (and an optional metadata file), or to a directory that contains a hierarchy of files that you want to process. If you specify a directory, all of the files located under that directory are queued for transcription. Submitted files that are not in a format supported by Conversations are not processed and are listed in the Conversations folder's process log as UNSUPPORTED.
You must specify the Amazon S3 region of the S3 bucket. The region option on the request specifies which regional endpoint to use for the request. This option is required and reduces request latency.
The base URL that is correct according to the environment being used.
The short name of the organization that you are using.
An organization's short name is found on the Conversations Accounts page in the Organizations section.
The Conversations folder in which you want the transcript and audio output that is produced by Conversations to be stored.
The following is a specific example of calling the /transcribe API to transcribe a zip file that is stored in S3:
This example transcribes the audio in the zip file named documentation-TEST.zip in the bucket example.company.com and puts the results of that transcription in the Test01 folder of the organization Test-Testing. As with other calls to the /transcribe API, it returns the request ID for your transcription request, which you can subsequently use with the /request API.
By default, any zip file in S3 that you have identified for transcription using the /transcribe API remains stored in S3 after its contents have been transcribed. If you do not need to keep such files, the /transcribe API includes a delete=true option that deletes a file after its content has been transcribed. In an application, you would pass this as an additional parameter to the /transcribe API call. In a curl command, you would add the -F delete=true option to your command line.
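The following Python sketch shows one way an application might assemble curl -F arguments and optionally append the delete=true option described above. The helper name curl_form_args and the placeholder values are assumptions made for illustration. Only the aws_id, aws_secret, and delete names come from this article, and the sketch omits the other fields and the URL that a real request needs.

```python
import shlex


def curl_form_args(fields, delete_after=False):
    """Return curl -F arguments for the given form fields.

    Hypothetical helper: when delete_after is true, it appends the
    delete=true option so the file is removed after transcription.
    """
    args = []
    for name, value in fields.items():
        # Note: curl treats -F values that start with @ or < as file references.
        args += ["-F", f"{name}={value}"]
    if delete_after:
        args += ["-F", "delete=true"]
    return args


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    fields = {"aws_id": "EXAMPLE_ID", "aws_secret": "EXAMPLE_SECRET"}
    print(shlex.join(curl_form_args(fields, delete_after=True)))
```

Leaving delete_after at its default of False matches the API's default behavior of keeping the file in S3.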