A text transcript is a document that records all of the audio content from an audio or video file or event. Essentially, it’s a log of what happened.
A text transcript should identify who is speaking, what they said, and how they acted or reacted, as well as environmental sounds such as applause or laughter.
It should also contain visual information: descriptions of scenery, non-verbal communication, or anything else that makes it clear to someone reading the transcript what happened.
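As a rough illustration of the elements listed above, the following Python sketch assembles a transcript from a list of entries. The `Entry` class, the `render` function, and the convention of putting sounds and visual descriptions in square brackets are all assumptions made for this example; they are not a required format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One transcript item (hypothetical structure for illustration)."""
    kind: str                      # "speech", "sound", or "visual"
    text: str
    speaker: Optional[str] = None  # only meaningful for speech entries


def render(entries):
    """Return the transcript as plain text, one entry per line."""
    lines = []
    for e in entries:
        if e.kind == "speech":
            # Identify who is speaking and what they said
            lines.append(f"{e.speaker or 'Unknown speaker'}: {e.text}")
        else:
            # Assumed convention: sounds and visual descriptions in brackets
            lines.append(f"[{e.text}]")
    return "\n".join(lines)


entries = [
    Entry("visual", "A presenter walks on stage and waves."),
    Entry("speech", "Welcome, everyone.", speaker="Presenter"),
    Entry("sound", "Applause"),
]
transcript = render(entries)
print(transcript)
```

Keeping speech, sounds, and visual descriptions in one ordered list mirrors the idea that the transcript is a single log of what happened.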
How to write a transcript
A robust text transcript will satisfy three WCAG success criteria: 1.2.1 Audio-only and Video-only (Prerecorded) – Level A, the “text description of visual content” portion of 1.2.3 Audio Description or Media Alternative (Prerecorded) – Level A, and 1.2.8 Media Alternative (Prerecorded) – Level AAA.
If you picture a TV or movie script that describes the set, where the actors are, and how they’re reacting, in addition to the actors’ lines, you’re on the right track.
In other words, if you’re writing the transcript yourself, do your best to cover all requirements in a single transcript.
Most companies outsource transcript writing to services. Note, however, that most transcript services don’t provide the text description of visual content unless you explicitly request it.
If you plan to do it yourself, the WikiHow article “How to write a transcript” provides an excellent guide.
Audio Description script
If you are writing the script for an audio description track, you only need to describe the visual elements of the video, not the audio elements.
The “Audio Description Standards” section of 3Play’s “The Ultimate Guide to Audio Description” provides guidelines for writing audio descriptions.