Google Enhances Vertex AI with New Features and Models
On Wednesday, as part of its ongoing effort to dominate the enterprise generative AI market, Google introduced updates to several first-party media-generating AI models on its Vertex AI cloud platform. The enhancements were unveiled at the company’s Cloud Next event.
New Additions to Google’s AI Portfolio
Lyria: Text-to-Music Model Preview
Google’s innovative text-to-music model, known as Lyria, is now available for preview to select customers. Lyria allows users to generate music across a diverse range of styles, from soothing lo-fi beats to intricate piano solos, providing an alternative to traditional royalty-free music libraries.
Enhancements to Veo 2 for Video Creation
The Veo 2 model, which specializes in video generation, has received significant upgrades that include:
- Background removal and manipulation features, allowing users to eliminate unwanted images, logos, and objects from videos.
- Enhanced capabilities to adjust camera angles and pacing, facilitating the creation of timelapse and drone-style videos.
- The ability to extend the frame of a video, which can help convert a landscape video to portrait orientation.
These Veo 2 features are currently in preview.
Voice Cloning with Chirp 3
Another notable addition is the voice-cloning functionality powered by Chirp 3, Google’s audio understanding model. Now generally available, this technology can synthesize speech in approximately 35 languages. The feature, called Instant Custom Voice, can replicate a voice from just ten seconds of audio input. Google is also launching a preview tool called Transcription with Diarization, which separates and identifies individual speakers in recordings with multiple participants. To mitigate potential misuse, Google has implemented a diligence process to verify that users have permission to use the voices they clone.
Advancements in Imagen 3 Image Generation
The Imagen 3 image generator has also been updated. Google says the changes improve its ability to remove unwanted objects and to reconstruct missing or damaged sections of images.
Safety Measures and Content Safeguards
All media produced by Lyria, Veo, and Imagen is marked with Google’s SynthID watermarking technology. Google also says that each of its generative AI models has built-in safeguards to prevent the creation of harmful content.
Addressing Copyright Concerns
Google has not disclosed the specific data sources it uses to train its models, an area often fraught with controversy over intellectual property. The company has previously indicated, however, that it offers opt-out mechanisms for training data, along with an indemnity policy aimed at protecting Google Cloud and Vertex AI users from potential copyright disputes arising from AI-related activities.
Conclusion
These updates signal Google’s push to strengthen its generative AI capabilities and to compete with rival platforms such as Amazon’s Bedrock. As the technology advances, safeguarding intellectual property and user permissions remains critical.