Microsoft introduced Kosmos-1, a neural network that accepts text, images, audio, and video content as input.
The researchers call the system a “multimodal large language model” (MLLM). In their view, such algorithms will form the basis of artificial general intelligence (AGI), capable of performing tasks at a human level.
“As a basic part of intelligence, multimodal perception is a necessity for achieving AGI, in terms of knowledge acquisition and grounding in the real world,” the researchers said.
According to examples from the paper, Kosmos-1 can:
- analyze images and answer questions about them;
- read text from pictures;
- create image captions;
- score 22–26% on a visual IQ test.
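To illustrate the basic idea behind such models, here is a minimal toy sketch (not Microsoft’s code, and the token names are hypothetical): a multimodal language model treats “text plus image” as a single sequence, splicing image embeddings between special boundary tokens so an ordinary language model can process the whole prompt.

```python
def build_multimodal_sequence(parts):
    """Flatten interleaved text/image parts into one token list.

    parts: list of ("text", str) or ("image", str) tuples.
    Text is split into word tokens; each image becomes a
    <image> ... </image> span around a placeholder token.
    """
    tokens = []
    for kind, content in parts:
        if kind == "text":
            tokens.extend(content.split())
        elif kind == "image":
            # In a real model this placeholder would be a sequence of
            # patch embeddings produced by a vision encoder.
            tokens.extend(["<image>", f"[emb:{content}]", "</image>"])
        else:
            raise ValueError(f"unknown part kind: {kind}")
    return tokens

# A visual question-answering style prompt: text and an image interleaved.
prompt = [
    ("text", "Question:"),
    ("image", "cat.png"),
    ("text", "What animal is this? Answer:"),
]
print(build_multimodal_sequence(prompt))
```

The point of the sketch is only the interleaving: once images are mapped into the same token sequence as text, capabilities like captioning and visual question answering reduce to ordinary next-token prediction.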
Microsoft trained Kosmos-1 on data from the Internet, including the 800 GB English-language text corpus The Pile and the Common Crawl web archive. After training, the researchers evaluated the model’s abilities on several tasks:
- language understanding and generation;
- text classification without optical character recognition (OCR);
- image captioning;
- visual question answering;
- web page question answering;
- zero-shot image classification.
According to Microsoft, Kosmos-1 outperformed current models on many of these tasks. The researchers plan to publish the project’s source code on GitHub in the near future.
Recall that in January, Microsoft introduced VALL-E, a simulator that can mimic a human voice from a short audio sample.


