The Future of AI: Exciting Developments and Updates in Artificial Intelligence
Multimodal ChatGPT, Surprising Efficiency, US Regulations, Copyright Victory - Exciting AI Developments Reshape the Future
Artificial intelligence (AI) continues its rapid advancement, and developments that will shape the future are emerging all the time. In this month's AI newsletter, we highlight some of the most exciting recent news in the world of AI.
ChatGPT Gets Smarter with Multimodal Capabilities
ChatGPT, the conversational AI chatbot from OpenAI that took the world by storm at the end of last year, has received an upgrade. The new version of ChatGPT now has multimodal capabilities, meaning it can both take in and generate visual information along with text. Users can upload images to ChatGPT and have it describe or modify them. This allows for more interactive and intelligent conversations with the bot.
Previously, different AI models within ChatGPT handled text and image processing separately. Bringing multimodal abilities together into one system is a huge step forward in developing more human-like communication abilities in AI. OpenAI says these new features are still rolling out slowly to users.
ChatGPT May Be More Efficient Than Expected
In a new paper, Microsoft researchers reported that the backbone model behind ChatGPT, called GPT-3.5, has only 20 billion parameters. OpenAI has not confirmed the figure, but it is far lower than many experts had estimated based on the model's capabilities. For reference, GPT-3, OpenAI's previous-generation model, contained 175 billion parameters.
If the figure is accurate, the smaller parameter count suggests GPT-3.5 is much more efficient than expected. With clever model architecture and training, OpenAI's researchers would have created a highly performant conversational AI that needs fewer computational resources. This has positive implications for the future scalability and accessibility of powerful AI systems.
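To put the two parameter counts side by side, here is a rough back-of-the-envelope sketch in Python. It assumes each parameter is stored as a 16-bit number (2 bytes), which is our assumption and not something stated in the paper, and it counts only the memory needed to hold the weights. The function name is made up for illustration.

```python
def weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate memory needed just to store the weights, in GB (10^9 bytes).

    Assumes 2 bytes per parameter (16-bit weights) unless told otherwise.
    """
    return num_parameters * bytes_per_parameter / 1e9


GPT3_PARAMS = 175_000_000_000           # figure cited above for GPT-3
REPORTED_GPT35_PARAMS = 20_000_000_000  # figure reported for GPT-3.5 (unconfirmed)

gpt3_gb = weight_memory_gb(GPT3_PARAMS)
gpt35_gb = weight_memory_gb(REPORTED_GPT35_PARAMS)
ratio = GPT3_PARAMS / REPORTED_GPT35_PARAMS

print(f"GPT-3 weights:   about {gpt3_gb:.0f} GB")
print(f"GPT-3.5 weights: about {gpt35_gb:.0f} GB")
print(f"GPT-3 is {ratio:.2f}x larger by parameter count")
```

Under that assumption, the weights alone come to about 350 GB for GPT-3 versus about 40 GB for a 20-billion-parameter model, a ratio of 8.75 to 1. Real serving costs depend on much more than weight storage, so treat this as a sense of scale only.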
U.S. Government Lays Out AI Regulations
The White House recently released an executive order detailing the Biden administration's approach to regulating AI technology. The order establishes principles for developing AI safely and equitably in the US. There are still many details to be worked out, but it signals the government's intent to exert more control over powerful AI systems.
Key elements of the executive order include requiring developers of the most high-risk AI systems to conduct safety testing and provide the results to regulators. The order also directs federal agencies to prioritize AI safety, algorithmic fairness, and data privacy when adopting AI tools. Additionally, it launches initiatives to fund AI research and hire AI talent into the government.
The order applies only to the most powerful AI systems, likely leaving smaller open source models unaffected for now. However, the administration says it will continuously reassess the levels at which AI regulation is needed. The tech industry will be closely monitoring how these policies take shape.
Major Win for AI Art in Copyright Lawsuit
In a closely watched case, DeviantArt, Stability AI, and Midjourney largely prevailed at an early stage of a lawsuit alleging that AI art infringes on copyrights. The judge dismissed most of the claims, saying there was no concrete evidence that specific copyrighted works were reproduced in the AI models involved, while leaving the plaintiffs room to amend their complaint.
This ruling is an early signal of how courts may apply long-standing copyright law to emerging AI art generation tech. It suggests that unless an AI image is nearly identical to, or copied from, a single copyrighted source, an infringement claim may be hard to sustain. For now, this lets AI artists and tool developers keep creating with less threat of legal action.
Of course, more lawsuits are likely as the technology continues advancing rapidly. But this early victory helps provide some reassurance that current copyright law stands on the side of AI innovation.
ASL Translation Makes Tech More Accessible
Nvidia researchers achieved a new milestone in AI accessibility tech. They created an AI system that recognizes American Sign Language (ASL) fingerspelling in real-time video and translates it into text. The technology could one day enable instant communication aids for the deaf community.
The model uses computer vision techniques to identify hand shapes corresponding to different letters, then combines letters sequentially to generate words and sentences. In tests, it achieved over 93% accuracy in detecting fingerspelled words, which can then be output as text or speech.
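For readers curious about the "combine letters sequentially" step, here is a minimal Python sketch of one way it could work. This is not the researchers' code. It assumes a vision model has already produced one (letter, confidence) guess per video frame, and the function name, the "_" pause label, and the thresholds are all hypothetical choices made for illustration.

```python
from itertools import groupby

PAUSE = "_"  # hypothetical label for frames showing no recognizable hand shape


def frames_to_text(frame_predictions, min_confidence=0.5, min_run=2):
    """Turn per-frame (letter, confidence) guesses into words.

    Steps: drop low-confidence frames, merge runs of the same letter,
    ignore runs too short to be a deliberate sign, and treat pauses
    as the gaps between words.
    """
    confident = [label for label, conf in frame_predictions if conf >= min_confidence]

    chars = []
    for label, run in groupby(confident):
        if len(list(run)) < min_run:
            continue  # a one-frame flicker is probably noise
        chars.append(" " if label == PAUSE else label)

    # Normalize spacing so repeated or trailing pauses do not pile up.
    return " ".join("".join(chars).split())


frames = [
    ("H", 0.90), ("H", 0.95), ("H", 0.90),
    ("I", 0.80), ("I", 0.85),
    ("Q", 0.30),                      # low-confidence guess, dropped
    ("_", 0.90), ("_", 0.90),         # pause between words
    ("M", 0.90), ("M", 0.90),
    ("O", 0.70), ("O", 0.90),
    ("M", 0.80), ("M", 0.90),
]
print(frames_to_text(frames))  # prints "HI MOM"
```

One limitation of this simple approach is that doubled letters, as in "HELLO", would be merged into one, so a real system would need an extra cue to tell a held letter from a repeated one.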
Advancements like this showcase how AI can make information and communication more accessible to all. Tech that bridges barriers between people has immense value beyond its technical achievements. We look forward to seeing if and when this research makes its way into real-world products.
Midjourney Upgrades Website and User Experience
Midjourney, the AI art generator that rivals DALL-E, redesigned its website to be faster, easier to navigate, and mobile-friendly. Users can now view their generation history organized by date and search through past images more intuitively.
But the biggest improvement is yet to come: native image generation integrated into the site. Currently Midjourney relies on its Discord bot to produce images, but generating images directly on the website is on the way. This will provide a smoother, unified user experience within Midjourney's own platform.
The upgraded website lays the foundations for Midjourney to keep innovating independently, rather than being anchored to Discord. It's part of the continual evolution among AI art platforms competing for users. Which platform will unlock the next big feature?
New AI Music Generation Model Sounds Impressive
Gen-1, an AI system for generating original music, launched its own website and brand as GenMusicAI. Developed by Futurverse, Gen-1 showcases incredibly high-quality music and audio generation. Judging by the published samples, it can produce original songs, instrumentals, and other formats.
Early demos reveal music with clarity in individual instruments, ambiance, and stereo sound nearly on par with human-composed tracks. The AI model handles melody, rhythm, structure, and other musical elements seamlessly. Gen-1 is the most promising AI music generator we've heard yet.
Now operating as GenMusicAI, the team plans to offer paid music generation features to users soon. This could shake up industries like music production, licensing, advertising, and more. Keep an eye out for the official launch!
Advances in AI Video Generation
AI researchers are making big strides in text-to-video generation as well. Models like VideoCrafter have recently demonstrated rapidly improving quality for AI-generated video. Another new method, FreeNoise, extends video length to as much as 20 seconds with better coherence.
There's still progress to be made, but innovations like these inch us closer to seamless, customizable AI video production. Possible applications span entertainment, advertising, art and beyond. While some worry about misuse, the possibilities to benefit creativity and human connection are immense.
Of course, there are still technical challenges around resolution, artifacts, and context consistency that today's models must overcome before reaching that goal. But with open source tools accelerating research, AI video generation is one space making swift headway.
The Future is Bright with AI
As AI research and applications gallop forward, it can be hard to keep up with everything happening in the field. But these recent developments give just a taste of the exciting innovations and opportunities on the horizon. From multimodal AI assistants to creative generative models and accessibility tech, artificial intelligence is reshaping society.
While regulators debate guidelines for governing AI safely and ethically, one thing is clear: there's no slowing the technology down. AI will drive progress in every industry and area of life. The road ahead will have obstacles, but the possibilities are too immense not to forge ahead. We can shape an inspiring future with AI if we face the challenges with collective optimism.