Trends in AI: February

March 1, 2023

Artificial intelligence (AI) has made impressive strides in recent years. February 2023 brought notable announcements in generative models for music, text, and video, as well as in language-to-code conversion and image manipulation. This article highlights some of the most remarkable of them.

Language Models

In the field of language models, Microsoft researchers have presented BioGPT, a generative pre-trained transformer trained on biomedical literature that outperforms other scientific LLMs and performs on par with human experts. OpenAI has introduced a web-based tool that distinguishes AI-generated from human-written text, intended to help counter plagiarism and harmful campaigns at scale. Researchers from Salesforce, UCSF, and Berkeley have introduced ProGen, an LLM capable of generating functional protein sequences. Salesforce Research has also presented BLIP-2, a state-of-the-art vision-language model capable of zero-shot image-to-text generation and visual question answering (VQA).

Conversational AI

In the rapidly growing field of conversational AI, there are several exciting developments to keep an eye on. For example, Google has recently been testing a new competitor to ChatGPT called Apprentice Bard. Unlike ChatGPT, Apprentice Bard is powered by Google's LLM LaMDA and has knowledge of recent events, which could give it an advantage in certain contexts.

Meanwhile, Microsoft researchers have introduced FLAME, a Formula-Language Model for Excel. This powerful tool is capable of repairing, auto-completing, and reconstructing the syntax of formulas, which could save users a lot of time and effort.

Meta has released a new large language model called LLaMA, designed to help researchers advance their work in AI. LLaMA is a state-of-the-art model available in four sizes (7B, 13B, 33B, and 65B parameters) and is versatile enough to apply to many different use cases. By sharing the code for LLaMA, Meta enables other researchers to test new approaches to reducing problems such as bias, toxicity, and hallucinations in large language models. Meta is releasing the model under a noncommercial, research-focused license and granting access on a case-by-case basis to academic researchers, government and civil society organizations, and industry research labs worldwide.

Microsoft has recently announced the release of a new ChatGPT-powered Bing search engine. You can read more about it in this blog post.

Finally, Google has introduced Bard, an experimental conversational AI that is part of Search. Powered by Google's LLM LaMDA, Bard has the potential to revolutionize the way we interact with search engines and other online tools.

Image and Video Generation

Meta researchers have proposed LEVER, a state-of-the-art approach to language-to-code generation that learns to verify generated programs using their execution results.

Researchers from Meta AI and the University of British Columbia have presented MINOTAUR, a unified multi-task model for query-based video understanding.

Google and The Hebrew University of Jerusalem have released Dreamix, a new method for video generation and editing.

CarperAI has released a new diff model, a language model trained on code edits, capable of generating code changes with commit messages.

Runway has presented a new model that uses language and images to generate new videos.

Researchers from CMU and Adobe have proposed pix2pix-zero, a new image-to-image translation method that outperforms existing models for real and synthetic image editing.

Alibaba researchers have presented Composer, a new generation paradigm that allows flexible control of the output image, such as spatial layout and palette.

Researchers from Oxford have demonstrated a state-of-the-art model called RealFusion, capable of reconstructing a full 360° photographic model of an object from a single image.

Music Generation

In the field of music generation, Google Research has introduced MusicLM, a transformer-based text-to-audio model that can produce tracks of varying genres, instruments, and concepts.

Google researchers have also introduced SingSong, a system that generates instrumental music to accompany vocal inputs. Separately, anonymous researchers have introduced Noise2Music, a diffusion model that generates high-quality 30-second music clips from text prompts.

ByteDance AI Lab researchers have introduced Make-An-Audio, a text-to-audio diffusion model that can transform images and videos into audio, thanks to robust generalization.

Germany's Max Planck Institute has released Moûsai, a text-to-music generative model capable of producing long-context, high-quality music.

Legal and Medical AI

Stability AI has introduced MedARC, an open-source research organization focusing on developing foundation models for medical AI research.

The world's seventh-largest law firm has announced it will start using Harvey, generative AI software that builds custom LLMs for law firms to draft contracts, client memos, and other legal documents.

Other Advancements

Researchers at Stanford University have introduced ControlNet, an open-source neural network structure for fine-tuning Stable Diffusion models with additional input conditions.

Researchers from NYU and the University of Maryland have proposed a new approach to generating hard text prompts from images.

Berkeley researchers have proposed Hindsight Finetuning, a novel technique to significantly improve LLMs' performance with a limited amount of human feedback.

Microsoft has released BioGPT-Large, a 1.5B-parameter model for biomedical text generation that achieves state-of-the-art accuracy of 81%.

Nvidia researchers have introduced Re-ViLM, a retrieval-augmented visual language model for image-to-text generation. Baidu researchers have introduced ERNIE-Music, the first text-to-music generation model in the waveform domain using diffusion models.

Google researchers have showcased ViT-22B, the largest vision transformer model at 22B parameters.

Stanford researchers have presented Hyena, a new method achieving state-of-the-art performance on language tasks while reducing training costs by 20% and improving inference time by up to 100x.

Researchers from CMU have introduced a model capable of generating high-resolution 3D photorealistic images from a 2D label. Researchers from UC Berkeley have proposed Hindsight Instruction Relabeling.

Takeaway

In conclusion, the field of AI has made significant strides in various areas, from music generation to legal and medical AI. The advancements showcased in February 2023 demonstrate the potential of AI to revolutionize industries and improve our daily lives. With continued research and development, we can expect to see even more exciting advancements in the future. As AI continues to evolve, it is important to remain aware of its potential impact on society and to ensure its development and implementation are responsible and ethical.
