xAI has added new capabilities to its Grok chatbot, bringing Vision and Voice Mode to iPhone users. The update allows you to interact with the world through your phone’s camera and engage in spoken conversations in multiple languages. The features are live on the Grok iOS app, while Android users will need to wait.

Real-Time Visual and Voice Interactions

With Grok Vision, you can now point your iPhone camera at an object and ask, “What am I looking at?” Grok will identify it and respond out loud, providing real-time context. This puts Grok on par with comparable features in ChatGPT and Google Gemini, giving iPhone users a practical way to combine visual recognition with conversational AI.

Introducing Grok Vision, multilingual audio, and realtime search in Voice Mode. Available now.

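Grok habla español (Grok speaks Spanish)

Grok parle français (Grok speaks French)

Grok Türkçe konuşuyor (Grok speaks Turkish)

グロクは日本語を話す (Grok speaks Japanese)

ग्रोक हिंदी बोलता है (Grok speaks Hindi)

Ebby Amir (@ebbyamir), April 22, 2025, pic.twitter.com/lcaSyty2n5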

Voice Mode also gains multilingual audio: you can speak to Grok in languages including Spanish, French, Turkish, Japanese, and Hindi, and it will respond accordingly. xAI has added real-time web search to Voice Mode as well, so spoken answers can draw on up-to-date information. Together, these updates make Grok more interactive in daily use.

Expanded Features and Rollout

The release comes shortly after Grok introduced its memory feature, which helps the chatbot remember your past interactions and preferences. This makes replies more relevant and personalized. xAI also launched Studio, a workspace similar to ChatGPT’s Canvas, where you can generate documents or write code in a distraction-free interface.

According to xAI, these updates are part of a broader push to make Grok a more capable assistant across tasks that blend voice, vision, and context. Grok is now available for free on the App Store.