Hot on the heels of OpenAI's GPT-4o, Google has revealed its own AI prowess with Project Astra. Unveiled at the Google I/O conference, Astra promises to bring a new level of intelligent assistance to everyday life. Google DeepMind CEO Demis Hassabis introduced the prototype AI with a demo that showcased its multi-faceted capabilities: Astra can identify objects that produce sound, engage in creative wordplay, explain code it sees on a screen, and even help users locate items they've misplaced. Its potential extends to wearable tech, offering on-the-go analysis and witty banter based on visual cues.
What makes Astra stand out is its use of both camera and microphone inputs to process, understand, and remember events, effectively building a timeline that lets it interact intelligently with its environment. Though these features are still at the prototype stage, Google has signaled that Astra's technology could be integrated into its Gemini app as 'Gemini Live' later this year. The goal is to create an 'agent with agency', echoing Sundar Pichai's vision of AI that can proactively think and plan on behalf of the user.
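Google hasn't published implementation details, but the described behavior, continuous audio-visual ingestion feeding a rolling, queryable memory, can be sketched in a few lines. The toy code below is purely illustrative: the Event and EventTimeline classes and their fields are hypothetical and not part of any Google API.

```python
from dataclasses import dataclass
from collections import deque
from time import time

@dataclass
class Event:
    """A single observation fused from camera and microphone input."""
    timestamp: float
    modality: str      # "vision" or "audio"
    description: str   # e.g. "glasses on desk, next to a red apple"

class EventTimeline:
    """Rolling memory of recent observations; oldest entries evicted first."""
    def __init__(self, max_events: int = 1000):
        self.events: deque[Event] = deque(maxlen=max_events)

    def record(self, modality: str, description: str) -> None:
        self.events.append(Event(time(), modality, description))

    def recall(self, query: str) -> list[Event]:
        # Naive keyword lookup; a real system would use semantic search
        # over learned embeddings rather than substring matching.
        q = query.lower()
        return [e for e in self.events if q in e.description.lower()]

timeline = EventTimeline()
timeline.record("vision", "glasses on desk, next to a red apple")
timeline.record("audio", "speaker playing music by the window")
print(timeline.recall("glasses"))  # -> the event recalling where the glasses are
```

The fixed-size deque mirrors the key trade-off such a system faces: memory is bounded, so the agent can only recall what still fits in its window of recent events.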
Apart from Astra, Google's conference was abuzz with other AI developments. Google CEO Sundar Pichai promised improvements to Gemini 1.5 Pro, including a mammoth 2 million-token context window for processing information, a huge leap from its earlier one million tokens and dwarfing the 128,000 tokens of OpenAI's GPT-4 Turbo. Simon Willison, an AI researcher, flagged the considerable cost of actually filling such a window while acknowledging its potential. In addition, Google introduced Gemini 1.5 Flash, a budget-friendly model built for rapid, high-volume tasks; Gems, customizable AI personas for user interaction; and generative models for creative content, including Imagen 3 for images and Veo for video.
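For readers who want to experiment, both Gemini 1.5 models are reachable through Google's google-generativeai Python SDK. The minimal sketch below assumes an API key in a GOOGLE_API_KEY environment variable and the model names "gemini-1.5-flash" and "gemini-1.5-pro"; it counts tokens before sending a prompt, which matters given Willison's cost caveat, since billing scales with how much of that giant context window you actually use.

```python
import os
import google.generativeai as genai

# Assumes a valid API key in the GOOGLE_API_KEY environment variable.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Flash is the budget option; swap in "gemini-1.5-pro" for the 2M-token window.
model = genai.GenerativeModel("gemini-1.5-flash")

prompt = "Summarize the key announcements from Google I/O in three bullets."

# Check the token cost before committing: the window is huge,
# but you pay for every token you actually send.
token_count = model.count_tokens(prompt)
print(f"Prompt uses {token_count.total_tokens} tokens")

response = model.generate_content(prompt)
print(response.text)
```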