This article explains how to use WhisperAI, OpenAI’s artificial intelligence model that converts speech into text, within ChatGPT.
What is Whisper AI?
Whisper AI is a speech recognition model that OpenAI announced in a blog post on September 21, 2022. Trained on around 680,000 hours of audio, Whisper AI converts your speech into text. Notably, it is multilingual: it works not only in English but also in many other languages.
Why should you use Whisper by OpenAI?
Using Whisper AI, you can transcribe what you say into text without needing to type or use a keyboard.
Simply by speaking, you can finish in seconds tasks that would normally take minutes of typing. By combining WhisperAI with other models and applications, you can add voice input to a wide range of applications.
By using WhisperAI within ChatGPT, you can communicate through speech instead of typing. This allows you to engage in conversations, express your questions, and receive better responses without the need to write. This integration empowers you to interact with ChatGPT more effectively.
Furthermore, because ChatGPT is built for interactive dialogue, incorporating WhisperAI turns that dialogue into a spoken conversation, with artificial intelligence at work on both ends.
Moreover, it’s considered a best practice to formulate prompts accurately and express questions clearly when using GPT models. Speaking lets you describe your problem to ChatGPT more quickly and in more detail, which can improve the quality of the responses you receive.
How to use Whisper AI in ChatGPT?
Currently, OpenAI has integrated the WhisperAI speech recognition model only into the ChatGPT mobile application, so at this time you can use WhisperAI within ChatGPT only in the mobile app. The steps below walk through the process.

- Open the ChatGPT mobile application and log in to your account.
- On the main screen of the app, tap the icon located on the right side of the message box where messages are typed.
- If you’re tapping this icon for the first time, the mobile app will request permission to use your microphone for recording. Grant the permission to proceed.
- WhisperAI will then become active automatically, and the recording process will start. During the recording process, the color of the bottom part of the screen will change, indicating that your voice is being recorded.
- Once you’ve finished speaking, end the recording by tapping either the “Tap to Stop Recording” button on the screen or the icon located on the right side of the message box.
- When you end the recording, a message saying “Converting to Text” will appear, indicating that your voice is being transcribed to text. Thanks to WhisperAI, this usually takes only 1-2 seconds, after which the text version of your spoken words appears in the message box.
- If you decide to cancel the transcription, first tap the “Tap to Stop Recording” button to end the recording. Then, during the “Converting” phase, tap the “X” button located in the top-right corner of the screen that appears.
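The steps above can be read as a small state machine. The following Python sketch is only an illustration of that flow, not how the ChatGPT app is implemented; the state names and event names (`tap_microphone`, `tap_stop`, `text_ready`, `tap_x`) are hypothetical labels chosen for this example.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # message box visible, nothing being recorded
    RECORDING = auto()   # microphone active, bottom of the screen changes color
    CONVERTING = auto()  # "Converting to Text" is shown
    DONE = auto()        # transcribed text is in the message box
    CANCELLED = auto()   # user dismissed the conversion with the "X" button


# Allowed transitions, keyed by (current state, event).
# Event names are hypothetical labels for the taps described in the steps.
TRANSITIONS = {
    (State.IDLE, "tap_microphone"): State.RECORDING,
    (State.RECORDING, "tap_stop"): State.CONVERTING,
    (State.CONVERTING, "text_ready"): State.DONE,
    (State.CONVERTING, "tap_x"): State.CANCELLED,
}


def step(state: State, event: str) -> State:
    """Return the next state, or stay in place if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def run(events: list[str]) -> State:
    """Replay a sequence of events starting from the idle screen."""
    state = State.IDLE
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["tap_microphone", "tap_stop", "text_ready"])` ends in `State.DONE`, while replacing the last event with `"tap_x"` ends in `State.CANCELLED`, which mirrors the cancel step in the list.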
Important Points to Note While Using Whisper AI
The key point to note here is that the longer your speech, the longer the conversion process to text will take.
In our experience, converting approximately 1 minute of speech takes around 3 to 4 seconds, which is fast enough for our needs.
When your speech is too long, not only does the conversion take longer, but it is also more likely to fail because of a timeout. If an error occurs, you can tap the “Try Again” button that appears on the screen to attempt converting the same speech to text once more. Alternatively, you can start a new recording and convert that instead.
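As a rough illustration of these figures, the Python sketch below scales the observed timing to other recording lengths. It assumes that conversion time grows linearly with speech length, which is our simplification and not something OpenAI documents; the function name `estimate_conversion_seconds` is made up for this example.

```python
# Observed above: about 60 seconds of speech converts in 3 to 4 seconds.
OBSERVED_SPEECH_SECONDS = 60.0
OBSERVED_CONVERSION_SECONDS = (3.0, 4.0)


def estimate_conversion_seconds(speech_seconds: float) -> tuple[float, float]:
    """Estimate the (low, high) conversion time, ASSUMING linear scaling."""
    if speech_seconds < 0:
        raise ValueError("speech_seconds must be non-negative")
    scale = speech_seconds / OBSERVED_SPEECH_SECONDS
    low, high = OBSERVED_CONVERSION_SECONDS
    return (low * scale, high * scale)


low, high = estimate_conversion_seconds(150)  # a 2.5-minute recording
print(f"Estimated conversion time: {low:.1f} to {high:.1f} seconds")
# prints "Estimated conversion time: 7.5 to 10.0 seconds"
```

Real conversion times also depend on network conditions and server load, so treat the result as a ballpark figure only.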
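The “Try Again” behavior is a retry on timeout. The Python sketch below shows that pattern in general terms; it is not the app’s actual code. The `transcribe` callable and the `make_flaky` helper are hypothetical stand-ins, and we assume a timeout is reported as Python’s built-in `TimeoutError`.

```python
from typing import Callable


def transcribe_with_retry(transcribe: Callable[[], str], max_attempts: int = 3) -> str:
    """Call `transcribe` until it succeeds or the attempts run out.

    `transcribe` is a hypothetical stand-in for the conversion step and is
    expected to raise TimeoutError when the conversion times out.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return transcribe()
        except TimeoutError as error:
            # Comparable to tapping "Try Again" on the same recording.
            last_error = error
    raise RuntimeError(f"conversion failed after {max_attempts} attempts") from last_error


def make_flaky(failures: int) -> Callable[[], str]:
    """Build a fake transcriber that times out `failures` times, then succeeds."""
    remaining = [failures]

    def fake() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TimeoutError("conversion timed out")
        return "hello world"

    return fake


print(transcribe_with_retry(make_flaky(2)))  # prints "hello world"
```

If every attempt times out, the function raises an error instead of looping forever, which corresponds to giving up and starting a new recording.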
Does WhisperAI transcribe languages other than English?
Yes, WhisperAI can understand and transcribe your speech not only in English but also in many other languages. Moreover, even if you mix English with other languages in your speech, WhisperAI can comprehend this and accurately transcribe your multi-lingual conversation into text.
How to use voice transcription for ChatGPT on desktop or in a browser?
WhisperAI is a useful and efficient feature. However, it currently comes as a default feature only in the mobile version of ChatGPT and is not available by default on desktop devices or through a browser. Nevertheless, there are a few ways to use voice typing with ChatGPT through a browser.
One of these methods is to use a Chrome extension. Some extensions convert your voice into text and provide voice typing directly in your browser, giving you a similar experience to the mobile app.
Another method is to use the voice-to-text feature built into your operating system. By enabling the default voice-to-text functionality on your Mac or Windows device, you can use voice typing in your browser as well. For details on activating voice typing in the browser, see our related article.