Claude 3.5: AI's New Computer Skills

Source: The Verge on October 22, 2024
Curated on October 24, 2024

Anthropic has unveiled a notable new capability for its Claude 3.5 Sonnet AI model: the ability to control a computer the way a person does. The 'computer use' feature is available in public beta so that developers can experiment with it and provide feedback. Claude operates the machine by mimicking human actions such as moving a cursor, clicking, and typing. Competitors including Microsoft, OpenAI, and Google have demonstrated similar technologies, but Anthropic's capability is said to enable more dynamic interaction with onscreen elements than previously seen.

Despite this leap forward, Anthropic cautions that using AI for such tasks remains 'experimental'. It can be cumbersome and error-prone, and Claude is restricted from certain activities, such as engaging with social media and election-related tasks. Claude's 'flipbook' approach, which builds a composite image from screenshots rather than using a continuous video stream, limits its ability to catch fleeting actions or notifications. The company is gathering developer feedback to improve the feature over time.

Alongside the new computer control abilities, Claude 3.5 Sonnet also shows marked improvements on several performance benchmarks, notably in agentic coding and tool use. On SWE-bench Verified, a coding benchmark, its score rose from 33.4% to 49.0%, placing it ahead of several other models. It also achieved higher scores on TAU-bench in both the retail and airline domains, a step forward for AI-enabled productivity tools.
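The 'flipbook' limitation mentioned above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the event names, timings, and the one-second snapshot interval are hypothetical, and nothing here reflects Anthropic's actual implementation or API. It simply shows why sampling the screen at discrete moments can miss something that appears and disappears between two snapshots.

```python
from dataclasses import dataclass


@dataclass
class ScreenEvent:
    """A hypothetical on-screen event visible during [start, end) seconds."""
    name: str
    start: float
    end: float


def visible_in_snapshots(event: ScreenEvent, interval: float, duration: float) -> bool:
    """Return True if any snapshot, taken every `interval` seconds, sees the event."""
    steps = int(duration / interval) + 1
    return any(event.start <= i * interval < event.end for i in range(steps))


# Assumed example events: a long-lived dialog and a brief notification.
events = [
    ScreenEvent("dialog box", start=1.0, end=6.0),
    ScreenEvent("toast notification", start=2.2, end=2.8),
]

# Snapshots at t = 0, 1, 2, ... 10 seconds (an assumed interval).
seen = {e.name: visible_in_snapshots(e, interval=1.0, duration=10.0) for e in events}
print(seen)
```

In this toy setup the dialog box is captured, while the notification falls entirely between the snapshots at 2 and 3 seconds and is never seen. Shortening the interval (for example to 0.5 seconds) would catch it, at the cost of processing more screenshots.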
