Breakthroughs in Open AI: Claude 2.1, Stable Video Diffusion, and More Democratizing Creativity
Anthropic One-Ups OpenAI, Stability AI Pioneers Video Generation, and More Experimental AI With Cross-Domain Potential
Welcome to the latest jam-packed edition of our AI advancements newsletter! We've got some incredibly exciting updates to share regarding major new AI models and capabilities that have launched over the past weeks. There's a ton happening in the AI world right now, so let's dive right in!
Anthropic Releases Claude 2.1 to Compete with OpenAI
AI safety company Anthropic has released an upgraded version of its conversational AI assistant, Claude 2.1 (https://www.anthropic.com/index/claud...). This new model aims to directly compete with, and even surpass, OpenAI's recently launched GPT-4 Turbo.
Some of the key updates in Claude 2.1 include:
- 200,000 token context window (https://www.anthropic.com/index/claud...) - This allows Claude to reference and "remember" far more information than other models. For reference, GPT-4 Turbo has a 128,000 token context window.
- Reduced hallucination rates - Anthropic claims a 2x decrease in the rate at which Claude makes false statements. This should improve accuracy for tasks like summarization and question answering.
- New system prompts and tool use features - These allow users to better customize Claude for their specific needs and integrate it with existing systems.
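To make the system-prompt feature concrete, here is a minimal sketch of the kind of request payload a developer might assemble. The payload shape loosely mirrors Anthropic's Messages API, but treat the exact field names and values here as illustrative assumptions rather than authoritative documentation:

```python
# Hedged sketch: assembling a chat request with a custom system prompt
# for Claude 2.1. Field names mirror Anthropic's Messages API shape,
# but verify against the official docs before relying on them.

def build_request(system_prompt: str, user_message: str,
                  model: str = "claude-2.1") -> dict:
    """Assemble a request dict with a system prompt steering the model's role."""
    return {
        "model": model,
        "max_tokens": 1024,
        # The system prompt sets persona and constraints for the whole chat.
        "system": system_prompt,
        "messages": [
            {"role": "user", "content": user_message},
        ],
    }

req = build_request(
    "You are a contracts analyst. Answer only from the provided document.",
    "Summarize the termination clauses.",
)
print(req["system"])
```

A system prompt like this is how an enterprise could pin Claude to a specific role (say, a legal reviewer) instead of relying on ad-hoc instructions in every user message.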
In its announcement, Anthropic emphasized Claude's strengths in processing long, complex documents like legal contracts or financial reports. The large context window and accuracy improvements make it well suited for enterprise use cases that demand high precision.
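To get a feel for what a 200,000-token window means in practice, here is a back-of-the-envelope sketch. The 4-characters-per-token ratio is a rough rule of thumb for English text, not an Anthropic figure:

```python
# Rough estimate of whether a document fits in a 200k-token context window.
# CHARS_PER_TOKEN = 4 is an approximate English-text average, not exact.

CONTEXT_TOKENS = 200_000
CHARS_PER_TOKEN = 4

def fits_in_context(text: str, reserve_tokens: int = 4_000) -> bool:
    """Check whether a document plausibly fits, reserving room for the reply."""
    est_tokens = len(text) // CHARS_PER_TOKEN
    return est_tokens <= CONTEXT_TOKENS - reserve_tokens

# ~500 pages at ~1,500 characters/page is ~187k estimated tokens:
# comfortably inside the window.
print(fits_in_context("x" * (500 * 1500)))
```

By this estimate, an entire multi-hundred-page contract can go into a single prompt, which is exactly the enterprise scenario Anthropic highlights.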
Pricing also now scales more efficiently, although the "Pro" plan still runs $20/month for full access. For those wanting to try Claude 2.1 on the cheap, third-party services like Nativ.dev will likely offer pay-as-you-go access soon.
With Claude 2.1, Anthropic is essentially "one-upping" OpenAI barely a month after the launch of GPT-4 Turbo. That model was already extremely impressive, but the 200k context window is a true game-changer. Doubling Claude 2's previous 100k window could significantly improve reasoning and accuracy over long inputs while cutting down on repetition.
The hallucination and comprehension gains around summarization and document QA also can't be overstated. Precision is crucial for real-world usage, and Claude seems optimized specifically for complex business and research applications rather than casual conversational chatbots.
Platform customization features like tool use APIs also help Claude stand out from the crowd. Integration opportunities will make it far more useful than a generic model reliant on prompts alone.
Stable Diffusion Team Launches AI Video Generation
In another huge open source AI advance, Stability AI has released Stable Video Diffusion (https://stability.ai/news/stable-vide...) - their first generative model capable of creating short videos based on text prompts.
While still an initial research release, the videos already show impressive coherence and fidelity across up to 25 frames. Backgrounds stay stable, motion looks natural, and faces/objects contain a high level of realistic detail.
The model code (https://github.com/Stability-AI/gener...) is available on Stability AI's GitHub, with the weights (https://huggingface.co/stabilityai/st...) hosted on Hugging Face. This opens up endless possibilities for developers to build on this foundation and create better generative video AIs.
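Because the weights are openly hosted, the model can be loaded through Hugging Face's `diffusers` library. The sketch below is a hedged example of that workflow: the pipeline class and model ID reflect the Hugging Face release, but the fps value and generation parameters are assumptions, and actually running `generate_clip` requires a CUDA GPU plus a large weights download:

```python
# Hedged sketch: image-to-video generation with Stable Video Diffusion
# via Hugging Face diffusers. Heavy imports live inside the function so
# the file stays importable without a GPU environment.

def clip_duration_seconds(num_frames: int, fps: int = 7) -> float:
    """Duration of a generated clip; the fps default here is an assumption."""
    return num_frames / fps

def generate_clip(image_path: str):
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import load_image

    # Model ID from Stability AI's Hugging Face release.
    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid",
        torch_dtype=torch.float16,
    )
    pipe.to("cuda")
    image = load_image(image_path)
    # Returns a list of PIL frames conditioned on the input image.
    return pipe(image, num_frames=25).frames[0]

# The initial release caps out around 25 frames: only a few seconds of video.
print(round(clip_duration_seconds(25), 2))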
Stability AI themselves also released an interface demonstrating practical use cases like advertising or entertainment. You can provide an image or text prompt and get AI-generated 720p video clips in response.
They emphasize that this is not intended for real applications yet. But given Stable Diffusion's trajectory and community contributions across individuals and big tech, we can expect rapid iteration from researchers around the world.
Just as Stable Diffusion 2 showed the power of open source for images, Stable Video Diffusion can pioneer a similar renaissance for intelligent video generation. Having full access to data and code unlocks creativity that no single company can match.
Experimental AI Video Effects & Voice-to-Music
Also in experimental AI tech, Twitter user Nathan Shipley demonstrated using Stable Diffusion for "in-painting" video visual effects (https://twitter.com/CitizenPlain/stat...). By erasing part of a moving video, the AI can realistically generate missing elements like fire, water, ice, or fantastical creatures emerging from a person's body.
The seamless integration isn't perfect enough for professional use today. But as algorithms and data improve, anyone could leverage this tech for DIY video edits or effects that previously required green screens and complex editing. Democratizing these capabilities unlocks new creative potential.
Finally, a new web app called Musicy (https://twitter.com/aribk24/status/17...) allows you to sing or hum into your mic and convert your voice into realistic instrumental music. Feed it your off-key singing and out comes a competent piano riff - it almost sounds like magic!
This technology can empower creators without traditional musical training to make beautiful compositions. While the audio quality isn't as high fidelity as human performances yet, the musical ideas themselves show AI starting to mimic the human creative process.
The Rapid Pace of Public AI Continues
As these examples show, publicly available AI capabilities are accelerating faster than ever thanks to open source development and healthy competition. The Stable Diffusion community proved that talented researchers around the world can collaborate to push algorithms beyond what even Big Tech's billions can do internally.
Anthropic's rapid progress in catching up to OpenAI likewise shows the power of transparency, strong ethics, and public conversation driving innovation for the common good. Startups with purpose can absolutely compete and often outpace legacy institutions.
There's still much progress needed on issues like accuracy, content filtering, and combating malicious use cases. But the overall pace of advancement shows no signs of slowing down. Each month brings new discoveries that push AIs closer to creative competency rivaling median human performance.
Cloud compute/data has also democratized capabilities that once cost billions in R&D, putting these tools in the hands of underserved groups. With care, equitability, and vision, this Cambrian explosion of AI creativity could bring wonder and possibility to millions more worldwide. We hope you share our optimism and excitement for the innovations still to come!