Now Facebook’s AI model can anticipate your future actions
Anticipating future actions and predicting them precisely is exciting but difficult. For example, it may be easy to guess whether the next ball in a game of cricket will be hit for a six or a four, and a wrong prediction costs little. Now consider an autonomous vehicle stopped at a stop sign: it must predict whether a pedestrian will cross the road. Anticipating future activity is hard for AI because it requires both modeling the sequence of previous actions and predicting the multimodal distribution of future ones.
To meet this challenge, Rohit Girdhar of Facebook AI Research and Kristen Grauman of the University of Texas at Austin have developed the Anticipative Video Transformer (AVT).
The science behind AVT
The researchers built AVT on recent advances in transformer architectures, particularly those from image modeling and natural language processing. The result is an end-to-end, attention-based video modeling architecture that attends to previously observed video in order to anticipate future actions.
Given an input video clip, the model predicts the actions that will follow. To accomplish this, it relies on a two-stage architecture consisting of:
- A backbone network, called AVT-b, that operates on individual frames or short clips. It adopts the recently proposed Vision Transformer (ViT) architecture, which has shown impressive results on static-image classification; followed by
- A head architecture, called AVT-h, that operates on the frame- or clip-level features to predict future features and actions, using a causal transformer decoder to predict the future feature for each input frame.
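The two-stage design above can be sketched roughly as follows. This is a minimal, hypothetical simplification in PyTorch: a tiny MLP stands in for the ViT backbone (AVT-b), and a small causally masked transformer stands in for the decoder head (AVT-h). None of the class names, layer sizes, or other details come from the paper's code.

```python
import torch
import torch.nn as nn

class AVTSketch(nn.Module):
    """Toy two-stage model: per-frame backbone + causal transformer head."""
    def __init__(self, frame_dim=64, embed_dim=32, num_actions=10):
        super().__init__()
        # AVT-b stand-in: the paper uses a Vision Transformer (ViT) here.
        self.backbone = nn.Sequential(nn.Linear(frame_dim, embed_dim), nn.GELU())
        # AVT-h stand-in: causally masked self-attention over frame features.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4,
                                           batch_first=True)
        self.head = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(embed_dim, num_actions)

    def forward(self, frames):                # frames: (B, T, frame_dim)
        feats = self.backbone(frames)         # per-frame features
        T = feats.size(1)
        # Causal mask: each frame attends only to itself and earlier frames.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        future = self.head(feats, mask=mask)  # predicted future features
        return self.classifier(future)        # per-frame action logits

model = AVTSketch()
logits = model(torch.randn(2, 8, 64))  # 2 clips of 8 frames each
print(logits.shape)                    # torch.Size([2, 8, 10])
```

Because every frame position produces a prediction, the model can be supervised at each step of the observed video, not just at the end.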
Additionally, AVT uses causal attention modeling – predicting future actions based solely on the frames seen so far – and is trained with objectives inspired by self-supervised learning. The architecture of the AVT model is illustrated in the paper.
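The causal attention just mentioned can be illustrated with a standard upper-triangular mask. This is a common implementation pattern, assumed here for illustration rather than taken from AVT's code: masked positions receive a score of negative infinity, so after the softmax each frame's attention is distributed only over itself and the past.

```python
import torch

T = 4                                   # number of observed frames
scores = torch.randn(T, T)              # raw attention scores (toy values)
# True above the diagonal = "this position lies in the future, mask it out".
mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float('-inf'))
weights = torch.softmax(scores, dim=-1)  # rows sum to 1 over past frames only
print(weights[0])                        # frame 0 attends only to itself
```

Masked entries become exactly zero after the softmax, so no information from future frames leaks into a prediction.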
In addition, the researchers train the model to predict future actions and features using three losses:
- First, the model classifies the features of a clip's last frame to predict the labeled future action.
- Second, the model regresses the features of intermediate frames onto the features of the frames that follow them, which trains it to predict what the next step is likely to be.
- Third, they train the model to classify intermediate actions.
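The three losses above can be sketched as follows. This is a hedged illustration with assumed tensor names, shapes, and loss weighting (an unweighted sum), not the paper's implementation:

```python
import torch
import torch.nn.functional as F

B, T, D, num_actions = 2, 8, 32, 10
pred_feats = torch.randn(B, T, D)       # head's predicted per-frame features
true_feats = torch.randn(B, T, D)       # backbone's actual per-frame features
logits = torch.randn(B, T, num_actions) # per-frame action logits
future_action = torch.randint(0, num_actions, (B,))    # labeled next action
frame_actions = torch.randint(0, num_actions, (B, T))  # intermediate labels

# 1) Classify the labeled future action from the last frame's prediction.
loss_next = F.cross_entropy(logits[:, -1], future_action)
# 2) Regress each frame's predicted feature onto the *next* frame's feature.
loss_feat = F.mse_loss(pred_feats[:, :-1], true_feats[:, 1:])
# 3) Classify the intermediate actions at every observed frame.
loss_inter = F.cross_entropy(logits[:, :-1].reshape(-1, num_actions),
                             frame_actions[:, :-1].reshape(-1))
loss = loss_next + loss_feat + loss_inter  # combined training objective
print(loss.item())
```

The feature-regression term is what gives the training its self-supervised flavor: the targets are the model's own features for later frames, with no extra labels required.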
“Through extensive experimentation on four popular benchmarks, we show its applicability to anticipating future actions, achieving state-of-the-art results and demonstrating the importance of its anticipative training objectives,” according to the paper.
As for future applications, the researchers believe AVT could be a strong candidate for tasks beyond anticipation, such as self-supervised learning, general action recognition in tasks that require attentive modeling of temporal order, and even the discovery of action patterns and boundaries.
Recent advances in Facebook’s AI
- In a recent announcement, Facebook AI introduced a new audio-only language model, the Generative Spoken Language Model (GSLM), which it describes as the first high-performance, text-independent NLP model. GSLM operates directly on raw audio signals, without labels or text, going from speech input to speech output and expanding the boundaries of textless NLP across a variety of spoken languages.
- Last month, the Facebook team introduced the Instance-Conditioned GAN (IC-GAN), a new image-generation model. Conditioned on instance images, whether or not they come from the training set, the model produces diverse, high-quality images. Moreover, unlike previous approaches, IC-GAN can produce realistic yet unexpected image combinations.
- Opacus, a free, open-source library for training deep learning models with differential privacy, was also recently released by Facebook. The tool is intended to be simple, flexible and fast, with a user-friendly API that lets ML practitioners make a training pipeline differentially private with just two lines of code.
Facebook’s recent advancements in AI and ML have come a long way. Time and again, the organization’s researchers push the field of artificial intelligence forward with strong, results-oriented work.