🎥 Today we’re excited to premiere Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.
More details and examples of what Movie Gen can do ➡️ https://go.fb.me/00mlgt
Movie Gen research paper ➡️ https://go.fb.me/zfa8wf
🛠️ Movie Gen models and capabilities
• Movie Gen Video: A 30B parameter transformer model that can generate high-quality, high-definition images and videos from a single text prompt.
• Movie Gen Audio: A 13B parameter transformer model that takes a video input, along with optional text prompts for controllability, and generates high-fidelity audio synced to the video. It can produce ambient sound, instrumental background music and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.
• Precise video editing: Given a generated or existing video and accompanying text instructions as input, the model can perform localized edits such as adding, removing or replacing elements, as well as global changes like background or style changes.
• Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement.
We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.
AI at Meta
Together with the AI community, we’re pushing boundaries through open science to create a more connected world.
About us
Through open science and collaboration with the AI community, we are pushing the boundaries of artificial intelligence to create a more connected world. We can’t advance the progress of AI alone, so we actively engage with the AI research and academic communities. Our goal is to advance AI in Infrastructure, Natural Language Processing, Generative AI, Vision, Human-Computer Interaction and many other areas of AI, and to enable the community to build safe and responsible solutions to address some of the world’s greatest challenges.
- Website
- https://ai.meta.com/
- Industry
- Research Services
- Company size
- 10,001+ employees
- Headquarters
- Menlo Park, California
- Specialties
- research, engineering, development, software development, artificial intelligence, machine learning, machine intelligence, deep learning, computer vision, speech recognition, and natural language processing
Updates
-
Introducing Meta Motivo: a first-of-its-kind behavioral foundation model for controlling virtual physics-based humanoid agents across a wide range of complex whole-body tasks.
Try the interactive demo ➡️ https://go.fb.me/308sfh
Get the model and code ➡️ https://go.fb.me/ulrz1e
Meta Motivo expresses human-like behaviors, achieves performance competitive with task-specific methods, and outperforms state-of-the-art unsupervised RL and model-based baselines. We’re excited about how research like this could pave the way for fully embodied agents, leading to more lifelike NPCs, the democratization of character animation and new types of immersive experiences.
-
Wrapping up the year and coinciding with #NeurIPS2024, today at Meta FAIR we’re releasing a collection of nine new open source AI research artifacts across our work in developing agents, robustness & safety and new architectures. More in the video from VP of AI Research, Joelle Pineau.
Highlights from what we’re releasing today:
• Meta Motivo: A first-of-its-kind behavioral foundation model that controls the movements of a virtual embodied humanoid agent to perform complex tasks.
• Meta Video Seal: A state-of-the-art comprehensive framework for neural video watermarking.
• Meta Explore Theory-of-Mind: Program-guided adversarial data generation for theory-of-mind reasoning.
• Meta Large Concept Models: A fundamentally different training paradigm for language modeling that decouples reasoning from language representation.
Details and access to everything released by FAIR today ➡️ https://go.fb.me/251dxc
We’re excited to share this work with the research community and look forward to seeing how it inspires new innovation across the field.
-
Following #NeurIPS2024 from your feed? Add these 10 papers from researchers at Meta to your reading list.
1. ReplaceAnything3D: Text-Guided object replacement in 3D scenes with compositional scene representations: https://go.fb.me/bfb0q0
2. emg2pose: A Large and Diverse Benchmark for Surface Electromyographic Hand Pose Estimation: https://go.fb.me/8id2bl
3. Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs: https://go.fb.me/0q6h9f
4. Déjà Vu Memorization in Vision–Language Models: https://go.fb.me/7z99lv
5. On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models: https://go.fb.me/nxd0fj
6. Nearest Neighbor Speculative Decoding for LLM Generation and Attribution: https://go.fb.me/19k7ei
7. Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts: https://go.fb.me/fcbxgn
8. HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness: https://go.fb.me/ptf5df
9. You Don’t Need Domain-Specific Data Augmentations When Scaling Self-Supervised Learning: https://go.fb.me/sgxvfr
10. Online Learning with Sublinear Best-Action Queries: https://go.fb.me/5ly349
-
We're at NeurIPS in Vancouver this week showcasing some of our latest research across GenAI, FAIR, Reality Labs and more. This year, researchers from across Meta had 47+ publications accepted and are taking part in 7+ different talks, workshops and panels. We'll also be showcasing a number of demos of some of our newest work.
📍Attending NeurIPS? Visit us at Booth 433 near the center of the expo floor.
📱 Following from your feed? We'll be sharing more here.
A few workshop highlights coming up this week:
• Building Agentic Apps with Llama 3.2 and Llama Stack
• Streamlining Computer Vision Data Annotation with SAM 2.1
• AI-Driven Speech, Music and Sound Generation
• Video-Language Models
• Adaptive Foundation Models: Evolving AI for Personalized and Efficient Learning
• Socially Responsible Language Modelling Research
• Responsibly Building Next Generation of Multimodal Foundation Models
• Safe Generative AI
• Workshop on Touch Processing
See you at #NeurIPS2024!
-
As we continue to explore new post-training techniques, today we're releasing Llama 3.3, a new open source model that delivers leading performance and quality across text-based use cases such as synthetic data generation, at a fraction of the inference cost. The model is available now on llama.com and on Hugging Face, and will be available for deployment soon through our broad ecosystem of partner platforms.
Download from Meta ➡️ https://go.fb.me/vy2o5y
Download on Hugging Face ➡️ https://go.fb.me/y5rcam
Model card ➡️ https://go.fb.me/eezbpw
The improvements in Llama 3.3 were driven by a new alignment process and progress in online RL techniques, among other post-training improvements. We’re excited to release this model as part of our ongoing commitment to open source innovation, ensuring that the latest advancements in generative AI are accessible to everyone.
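For readers pulling the weights from Hugging Face, here is a minimal sketch of loading and prompting the model with the transformers library. The repo identifier, dtype and generation settings are assumptions for illustration; check the model card linked above for the exact identifier, license gating and recommended usage.

```python
# Minimal sketch: loading Llama 3.3 via Hugging Face transformers.
# The repo id below is an assumption -- verify it against the official model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.3-70B-Instruct"  # assumed identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduce memory; a 70B model still needs multiple GPUs
    device_map="auto",           # shard across available devices
)

# Chat-style prompting via the tokenizer's chat template.
messages = [
    {"role": "user", "content": "Generate three short synthetic customer-support queries."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```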
-
Today, in partnership with Louisiana Economic Development, we announced plans for our newest and largest data center in Richland Parish, Louisiana, which will play a vital role in accelerating our AI progress and supporting the training of future open source LLMs. More details ➡️ https://go.fb.me/zpqs8t
-
Meta Sparsh is the first general-purpose encoder for vision-based tactile sensing that works across many tactile sensors and many tasks. The family of models was pre-trained on a large dataset of 460,000+ tactile images using self-supervised learning.
To help foster new generations of robotics AI research in the academic community, we've released:
• The PyTorch implementation
• Pre-trained model weights on Hugging Face
• Datasets
• A new research paper
You can find details on this work here ➡️ https://go.fb.me/hmlavg
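As an illustration of how a pre-trained encoder like this is typically used, below is a hypothetical sketch of the common frozen-encoder plus linear-probe recipe for adapting a self-supervised tactile encoder to a downstream task. The encoder here is a stand-in module, not the released Sparsh API; see the PyTorch implementation and Hugging Face weights linked above for the real interfaces.

```python
# Hypothetical sketch: frozen pre-trained encoder + trainable linear probe.
# The "encoder" below is a stand-in, NOT the released Sparsh model.
import torch
import torch.nn as nn

embed_dim, num_classes = 768, 2  # e.g. a binary grasp-stability label

# Placeholder for a pre-trained encoder mapping tactile images to embeddings.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, embed_dim))
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False  # keep the pre-trained backbone frozen

probe = nn.Linear(embed_dim, num_classes)       # only the probe is trained
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)

tactile_batch = torch.rand(8, 3, 224, 224)      # stand-in tactile images
labels = torch.randint(0, num_classes, (8,))    # stand-in labels

with torch.no_grad():
    features = encoder(tactile_batch)           # (8, embed_dim) embeddings
optimizer.zero_grad()
loss = nn.functional.cross_entropy(probe(features), labels)
loss.backward()
optimizer.step()
```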
-
As part of our continued work to help ensure the future security of deployed cryptographic systems, we recently released new code that enables researchers to benchmark AI-based attacks on lattice-based cryptography and compare them to new and existing attacks going forward. We shared more on our work on Salsa, as well as seven other new releases for the open source community, in this post ➡️ https://go.fb.me/h3f1fl
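For context, these attacks target the Learning With Errors (LWE) problem that underpins most lattice-based schemes. The snippet below is a toy illustration of how LWE samples are generated, not the released benchmarking code, and the parameters are deliberately tiny compared to cryptographic sizes.

```python
# Toy illustration of Learning With Errors (LWE) samples, the setting that
# AI-based attacks on lattice-based cryptography try to break.
import numpy as np

n, num_samples, q = 32, 128, 3329        # dimension, sample count, modulus (toy values)
rng = np.random.default_rng(0)

secret = rng.integers(0, 2, size=n)      # small (binary) secret vector s
A = rng.integers(0, q, size=(num_samples, n))                      # uniform public matrix
noise = rng.normal(0, 3.2, size=num_samples).round().astype(int)   # small Gaussian error e
b = (A @ secret + noise) % q             # b = A·s + e (mod q)

# An ML-based attack tries to recover `secret` from the pairs (A, b); benchmarking
# code compares such attacks against classical lattice-reduction approaches.
```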
-
Following the release of our latest system-level safeguards, today we're sharing new research papers outlining our work and findings on Llama Guard 3 1B and Llama Guard 3 Vision, models that support input/output safety in lightweight applications on the edge and for multimodal prompts.
Llama Guard 3 1B research paper ➡️ https://go.fb.me/o8y8m1
Llama Guard 3 Vision research paper ➡️ https://go.fb.me/1cb0xh
Our hope in releasing this research openly is that it helps practitioners build new customizable safeguard models, and that this work inspires further research and development in LLM safety.
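For practitioners experimenting with safeguard models of this kind, a minimal input-moderation sketch using Hugging Face transformers might look like the following. The repo identifier, prompt handling and output format are assumptions; consult the official model cards and papers above for the exact usage and safety taxonomy.

```python
# Minimal sketch: screening a user prompt with a Llama Guard-style safeguard model.
# The repo id and output format are assumptions -- check the official model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-1B"  # assumed identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Classify a user prompt before passing it to the main assistant model.
conversation = [{"role": "user", "content": "How do I make a phishing email?"}]
inputs = tokenizer.apply_chat_template(conversation, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=20)
verdict = tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
print(verdict)  # e.g. "safe", or "unsafe" plus a violated-category code
```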