Speech-to-text Has No Idea What You're Talking About
I’ve been a student employee of the Instructional Center for Educational Technologies (ICET) team at the University of Wisconsin-Platteville since late May. Over the last several months, I’ve been fortunate enough to get to try out all kinds of fun and new things. I’ve learned about cascading style sheets, practiced some HTML5 coding, helped create a style guide, prepared instructions, become incredibly familiar with Desire2Learn... the list goes on and on. I've also been fortunate enough to meet and work with numerous people who genuinely care about what they’re doing and the students and faculty they’re doing it for.
This year, ICET has made it a priority to ensure everyone can access ICET content in whatever way they need to. This process has included initiatives to convert PDFs, videos, and anything else we’ve got to formats that are screen-reader compatible, accessible to people with hearing impairments, and ultimately ADA compliant. Each of us on the team has made contributions to these efforts. I was excited about my own contribution: adding closed captioning to a Desire2Learn quiz tutorial video.
To tackle this project, I used the Speech-to-Text feature in Camtasia Studio. I should note that I had no previous experience with this kind of software, and if anyone has suggestions, I would definitely love to see them in the comments section below. Anyway, this particular screen recording and video editing software was easy to use and fun to play around with. With barely any learning curve at all, I was able to quickly load the audio and video files I needed and get the show on the road.
Just seconds after clicking the speech-to-text button, I knew what I’d be doing at work for a while. A. Very. Long. While. The excitement I had felt just moments earlier basically disappeared. If I had to give speech-to-text an accuracy rating, I’d say it was correct about 15% of the time. Plus, there was no punctuation. According to the tip that came up when I hovered my cursor over the speech-to-text button, the instructor in the video I was working on could’ve completed some tutorials to train the computer to understand her voice. Obviously, it was too late for that.
I spent hours listening to the instructor at 50% of her normal speed, frantically typing each word while the audio file played. In case you're wondering, slow motion really does take a person's voice down several octaves. After particularly long captioning sessions, I would take out my earbuds and wonder why everyone around me was speaking so quickly (and why they sounded like chipmunks).
We share our world with 3-D printers, genetically modified tomatoes, and robotic vacuum cleaners. On a planet where a person expects a built-in camera on pretty much everything, I was shocked to find out that speech-to-text technology thinks my professors are saying "thugs" when they are actually saying "folks". As disappointed as I was with technology during this task, the frequent misunderstandings actually provided some serious entertainment. When you imagine instructors actually saying the things speech-to-text hears, it's hilarious. Here are my favorite PG-rated blunders:
Instructor said: You can put a password on a quiz.
Speech-to-text heard: You can put a bad word on a quiz
Instructor said: You have to provide the hints but the idea is you can give them a little bit of information.
Speech-to-text heard: You have to go ride the hints ideally shoes can heaving a little pits of it for Nathan.
Instructor said: I think that’s kind of a limit, personally.
Speech-to-text heard: I think cat skin is a gimmick personally
Thanks for checking out the ICET Blog! Don't forget we're just an email or phone call away for all your educational technology needs!