AI research

Microsoft’s NaturalSpeech 3 clones voices and emotions

NaturalSpeech 3 is Microsoft’s latest text-to-speech system that can clone voices and emotions. Microsoft Research Asia, Azure Speech, and partner universities have developed a new speech synthesis system called NaturalSpeech 3. The system uses a new approach that breaks down speech into different sub-units such as content, prosody, timbre, and acoustic details. The research …

How exploration could help with reasoning in language models

Meta researchers have investigated whether reinforcement learning can improve the reasoning ability of large language models. The researchers compared different algorithms, including Proximal Policy Optimization (PPO) and Expert Iteration (EI), to find out how well they can improve the reasoning ability of language models. The core idea is that models can generate their own …

Researchers teach a robot to walk around San Francisco using AI’s word prediction techniques

A study by the University of California, Berkeley, enables robots to navigate based on the principle of word prediction from language models. This approach could pave the way for a new generation of robots that can navigate complex environments with minimal training. In their paper, “Humanoid Locomotion as Next Token Prediction,” the researchers treat …

Researchers develop generative AI worm that can steal data and send spam emails

Researchers have developed a generative AI worm called Morris II that can steal data and send spam. The Morris II worm uses two methods: a text-based, self-replicating prompt and a self-replicating prompt embedded in an image file. In the first method, the worm uses Retrieval Augmented Generation (RAG) to “poison” an email program’s database, allowing …

Google’s ScreenAI reliably navigates smartphone screens

Google is taking another step on the long road to language and voice-controlled computer interfaces. Google Research has introduced a new AI model called ScreenAI that can understand user interfaces and infographics. It sets new benchmarks for various tasks, including answering content questions based on infographics, summarizing them, and navigating through user interfaces. At …

The future of chatbots could be 1 bit

Researchers at Microsoft Research and the University of the Chinese Academy of Sciences have unveiled BitNet b1.58, a 1-bit language model that promises high performance at significantly reduced cost and power consumption. The development of large-scale language models, such as GPT-4, has made significant progress in recent years, but the high energy and memory …

For Microsoft’s bGPT, the world is just bytes

Byte instead of token: A new paper from researchers at Microsoft Research Asia, the Central Conservatory of Music, China, and Tsinghua University introduces bGPT, a transformer model that relies on byte prediction instead of classical token prediction. Similar attempts have been made before, but unlike other models, which are usually limited to specific formats …

New foundation model “Evo” unlocks sequence modeling and design at the genomic scale

A team from Together AI and the Arc Institute presents Evo, an AI model for biological research that can interpret DNA, RNA, and proteins and enable generative design at the molecular and genomic level. Developed by a team of experts consisting of Eric Nguyen, Michael Poli, Matthew Durrant, Patrick Hsu and Brian Hie, the model …

How DeepMind’s Genie AI could reshape robotics by generating interactive worlds from images

Researchers at DeepMind have developed Genie, a model that creates worlds from images and moves video game characters around in them on their own. It sounds like a gimmick, but it could be the basis for something much bigger. “What if, given a large corpus of videos from the Internet, we could not only …
