AI is a complex, fast-evolving field in which organizations and individuals keep looking for new solutions to pressing challenges. This year has brought a number of notable innovations that have pushed the field's boundaries. In this article, we list the top ten AI innovations of 2021 to date.
Facebook's UniT
Researchers from Facebook AI Research have presented a new transformer model, the Unified Transformer (UniT). UniT uses an encoder-decoder architecture to handle multiple tasks and domains in a single model, with fewer parameters than training a separate model for each task. According to the Facebook team, UniT is a step towards general intelligence.
OpenAI DALL·E & CLIP
DALL·E is a 12-billion-parameter version of OpenAI's GPT-3. It is a transformer that generates images from text prompts. The model can handle multiple objects in an image, rendering a new image or modifying an existing one based on the prompt.
The OpenAI research team also demonstrated a neural network called Contrastive Language-Image Pre-training, or CLIP. This neural network was trained on 400 million image-text pairs. Like the GPT family, CLIP can learn to perform a range of tasks, such as optical character recognition (OCR), geolocation and action recognition.
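As a rough illustration of how a model trained on image-text pairs can be applied to a new task, the sketch below scores one image embedding against several candidate captions and picks the closest one. It is a toy example, not OpenAI's code: the three-dimensional embeddings, the captions and the temperature value are all made-up assumptions, and a real system would obtain the embeddings from trained image and text encoders.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Rank candidate captions by similarity to the image embedding.

    The temperature is an assumed value that sharpens the distribution.
    """
    labels = list(text_embeddings)
    logits = [
        cosine_similarity(image_embedding, text_embeddings[label]) / temperature
        for label in labels
    ]
    return dict(zip(labels, softmax(logits)))

# Hypothetical embeddings, invented for this example only.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.3],
    "a street sign": [0.2, 0.1, 0.9],
}

probs = zero_shot_classify(image_embedding, text_embeddings)
best_label = max(probs, key=probs.get)
print(best_label)
```

Because the task is expressed as a set of captions, changing the task only means changing the candidate texts, with no retraining in this sketch.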
Facebook BlenderBot 2
Facebook’s BlenderBot 2 is the first open-source chatbot of its kind with long-term memory. Facebook has worked to make its conversational AI more empathetic, knowledgeable, and capable. BlenderBot 2.0 builds a long-term memory that it can access continuously, while also searching the internet for information and holding conversations on almost any topic.
Google Translatotron 2
In 2019, Google launched Translatotron, an end-to-end speech-to-speech translation model. At the time, it was the first end-to-end framework capable of directly translating speech from one language to another.
The system could synthesize translated speech that preserved the original speaker's voice. However, this feature could be misused to generate speech in someone else's voice and create deepfake audio.
This year, Google released Translatotron 2, an updated version in which the trained model is restricted to retaining the source speaker’s voice. Unlike the previous version, it cannot generate speech in a different speaker's voice, which mitigates potential misuse for creating spoofed audio.
Google's Vertex AI
Google introduced Vertex AI, a managed machine learning platform for deploying and maintaining AI models, at this year’s Google I/O conference. The new platform brings AutoML and AI Platform together under a unified API, client library, and user interface.
Previously, researchers had to run millions of test images to train algorithms, but now they can rely on Vertex’s technology stack to do the heavy lifting.
Microsoft's FLAML
Microsoft’s FLAML is a Python package that finds a well-suited ML model at low computational cost. It eliminates the manual process of choosing the best model and settings.
This AutoML system is primarily focused on model selection, hyperparameter tuning, feature engineering, neural architecture search, and model compression.
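To make the idea of choosing a model setting under a limited compute budget concrete, here is a minimal toy sketch. It does not use FLAML and does not reproduce FLAML's search strategy: the k-nearest-neighbour classifier, the tiny dataset, the per-candidate costs and the budget are all invented for illustration.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D features)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def validation_error(train, valid, k):
    """Fraction of validation points that are misclassified."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in valid)
    return wrong / len(valid)

def budgeted_search(candidates, train, valid, budget):
    """Try the cheapest settings first and stop before exceeding the budget.

    candidates is a list of (k, cost) pairs; cost is an abstract, assumed unit.
    Returns ((best_k, best_error), total_cost_spent).
    """
    best = None
    spent = 0
    for k, cost in sorted(candidates, key=lambda c: c[1]):
        if spent + cost > budget:
            break
        spent += cost
        error = validation_error(train, valid, k)
        if best is None or error < best[1]:
            best = (k, error)
    return best, spent

# Made-up data: (feature, label) pairs, with one noisy label at 0.4.
train = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 1),
         (0.45, 0), (0.6, 1), (0.7, 1), (0.8, 1)]
valid = [(0.15, 0), (0.38, 0), (0.42, 0), (0.65, 1), (0.75, 1)]

# Hypothetical settings and costs: a larger k is assumed to be more expensive.
candidates = [(1, 1), (3, 2), (5, 4), (7, 8)]

best, spent = budgeted_search(candidates, train, valid, budget=7)
print(best, spent)
```

With this budget the most expensive setting is never evaluated, which is the trade-off an AutoML tool automates: it spends limited compute on the candidates most likely to pay off.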
Microsoft's MusicBERT
MusicBERT is Microsoft’s large-scale pre-trained model for symbolic music understanding. It covers applications such as emotion classification, genre classification and matching pieces of music. Microsoft built the model using the OctupleMIDI encoding method, a bar-level masking strategy, and a large-scale symbolic music corpus containing over a million pieces of music.
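The two techniques named above can be sketched roughly as follows. In an octuple-style encoding each note is a single token with eight attributes, and bar-level masking hides one attribute for every note in a bar at once, so the model cannot simply copy the answer from a neighbouring note in the same bar. This is a simplified illustration and not Microsoft's implementation: the field names, their order, the readable values and the `[MASK]` placeholder are assumptions.

```python
from collections import namedtuple

# One token per note, with eight attributes (assumed names and order).
Octuple = namedtuple(
    "Octuple",
    "time_sig tempo bar position instrument pitch duration velocity",
)

MASK = "[MASK]"  # hypothetical placeholder for a hidden attribute

def bar_level_mask(tokens, bar, field):
    """Hide one attribute for every note that belongs to the given bar."""
    return [
        token._replace(**{field: MASK}) if token.bar == bar else token
        for token in tokens
    ]

# Made-up notes: two in bar 0 and one in bar 1.
notes = [
    Octuple("4/4", 120, 0, 0, "piano", 60, 4, 80),
    Octuple("4/4", 120, 0, 4, "piano", 64, 4, 80),
    Octuple("4/4", 120, 1, 0, "piano", 67, 8, 90),
]

masked = bar_level_mask(notes, bar=0, field="pitch")
for token in masked:
    print(token)
```

During pre-training, the model would then be asked to predict the hidden values from the remaining context.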
Microsoft’s Neural TTS
Microsoft’s Neural Text-to-Speech (TTS) software enables developers to create custom synthetic voices. The system is structured in three components: a text analyzer, a neural acoustic model and a neural vocoder.
The text analyzer converts plain text to pronunciations, the acoustic model converts the pronunciations to acoustic features, and finally, the vocoder generates waveforms.
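The three-stage flow described above can be sketched as a chain of functions. This is a toy stand-in and not Microsoft's system: the lexicon, the pitch table and the sine-wave "vocoder" are made-up placeholders for what are neural models in the real product.

```python
import math

# Hypothetical pronunciation lexicon and per-phoneme pitch values (in Hz).
LEXICON = {"hi": ["HH", "AY"], "you": ["Y", "UW"]}
PITCH_HZ = {"HH": 0.0, "AY": 220.0, "Y": 200.0, "UW": 180.0}

def text_analyzer(text):
    """Stage 1: plain text -> pronunciations (a list of phonemes)."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON.get(word, []))  # unknown words are skipped
    return phonemes

def acoustic_model(phonemes, frames_per_phoneme=5):
    """Stage 2: pronunciations -> acoustic features (one pitch value per frame)."""
    return [PITCH_HZ[p] for p in phonemes for _ in range(frames_per_phoneme)]

def vocoder(frames, sample_rate=8000, samples_per_frame=80):
    """Stage 3: acoustic features -> waveform samples between -1 and 1."""
    waveform = []
    phase = 0.0
    for pitch in frames:
        for _ in range(samples_per_frame):
            phase += 2 * math.pi * pitch / sample_rate
            waveform.append(math.sin(phase))
    return waveform

phonemes = text_analyzer("hi you")
frames = acoustic_model(phonemes)
waveform = vocoder(frames)
print(len(phonemes), len(frames), len(waveform))
```

Keeping the stages separate means each one can be swapped independently, for example by training a new acoustic model for a custom voice while reusing the same text analyzer.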
Google's TensorFlow 3D
Google’s TensorFlow 3D is a highly modular library that brings 3D deep learning capabilities to TensorFlow. TensorFlow previously lacked dedicated tooling for 3D scene understanding; the new library provides a set of operations, loss functions, data processing tools, metrics and models for developing, training and deploying advanced 3D scene understanding models.