Overview
SAM 3D is a foundation model developed by Meta Superintelligence Labs that reconstructs complete 3D shapes, textures, and layouts from a single image. Designed to handle real-world challenges like occlusions, clutter, and unusual poses, the model transforms masked objects in images into detailed 3D representations with accurate pose, geometry, and appearance.
The system combines progressive training methodologies with a human- and model-in-the-loop data engine, enabling it to excel in complex natural scenes that previous 3D generation models struggled with. Released publicly under the SAM License, SAM 3D provides both single-object and multi-object reconstruction capabilities, as well as specialized human body reconstruction through SAM 3D Body.
Researchers and developers can access the model through an interactive online playground or run it locally using the provided checkpoints and Python implementation. SAM 3D represents a significant advancement in mask-prompted 3D reconstruction, achieving ≥5:1 human-preference wins over prior models on real-world objects and scenes.
Key Features
Single-Image 3D Object Reconstruction — Converts any masked object from a 2D image into a complete 3D model with geometry, texture, pose, and layout, handling small objects, occlusions, and challenging real-world conditions.
Multi-Object Scene Reconstruction — Processes multiple objects simultaneously from a single image. Example notebooks demonstrate combining Objects and Body models in a shared reference frame.
SAM 3D Body for Human Reconstruction — Recovers detailed 3D human body meshes from single images through a specialized promptable foundation model optimized for human subjects.
Progressive Training with Human Feedback — Trained with a human- and model-in-the-loop data engine across progressive stages to improve robustness and quality in unfiltered natural image scenarios.
Multiple Output Formats — SAM 3D Objects exports 3D Gaussian splats in PLY format; SAM 3D Body outputs full-body meshes based on the Momentum Human Rig, both compatible with modern 3D rendering and visualization tools.
Interactive Playground — Offers browser-based access through the Segment Anything Playground, where you can type concepts (e.g., "sneaker", "chair") to select objects via SAM 3, then convert them to 3D without requiring local setup.
Pricing & Plans
SAM 3D is publicly released under the SAM License (a custom license). Code, weights, and the web demo are available at no charge, subject to license terms.
| Plan | Price | Access |
|---|---|---|
| Open Source | Free | Full access to model checkpoints, source code, demo scripts, and Jupyter notebooks via GitHub |
| Playground | Free | Browser-based interactive demo for testing single and multi-object reconstruction without installation |
| Local Deployment | Free | Run on your own hardware following the repository's installation requirements (PyTorch + PyTorch3D; GPU recommended) |
License Terms: The SAM License is a custom license that generally allows research and commercial use, subject to Acceptable Use Policy and Trade Control restrictions. Users must review the license file before deployment.
System Requirements: GPU acceleration is recommended for optimal performance. Follow the repository's INSTALL guide and requirements files for detailed setup instructions.
Pros & Cons
Pros:
- Publicly released under the SAM License with full access to code, model weights, and implementation details
- Handles complex real-world scenarios including occlusions, clutter, and unusual object poses
- Supports both single-object and multi-object reconstruction from one image
- Includes specialized human body reconstruction capabilities
- Free playground available for testing without technical setup
- Demonstrated strong performance with ≥5:1 human-preference wins in evaluations
Cons:
- Requires technical expertise and GPU hardware for local deployment
- No text-to-3D generation capability (text prompts only used for object selection via SAM 3)
- Designed as a research tool rather than production-ready application
- Documentation primarily targets researchers and developers
- Released under custom SAM License (not OSI-approved open source)
Best For
- Computer vision researchers developing or benchmarking 3D reconstruction methods
- AR/VR developers prototyping single-image 3D content generation pipelines
- Game developers experimenting with automatic 3D asset creation from photos
- Academic institutions teaching 3D computer vision and deep learning
- Technical users with GPU access who need publicly available 3D reconstruction capabilities
- Teams evaluating mask-prompted single-image 3D reconstruction for research applications
FAQ
Is SAM 3D free to use?
Yes, SAM 3D is publicly released under the SAM License (a custom license). All model checkpoints, code, and the online playground are available at no charge. Users should review the license terms, including Acceptable Use Policy and Trade Control restrictions, before commercial deployment.
What are the system requirements to run SAM 3D locally?
The model requires PyTorch and PyTorch3D according to the repository's installation guide. GPU acceleration is recommended for optimal performance. Follow the INSTALL documentation and requirements files in the GitHub repository for detailed setup instructions. Exact VRAM requirements are not documented in one place, but a modern GPU with ample memory is advised.
Can I use SAM 3D for commercial projects?
SAM 3D is released under the SAM License, which generally allows both research and commercial use. However, you must comply with the Acceptable Use Policy and Trade Control restrictions outlined in the license. Review the license file in the repository before deploying for commercial purposes.
Does SAM 3D support text prompts for 3D generation?
SAM 3D does not perform text-to-3D generation. However, in the integrated demo pipeline, text prompts can be used for object selection—you type a concept (e.g., "sneaker") and SAM 3 generates the mask, which SAM 3D then reconstructs into 3D. The core SAM 3D model itself works with images and masks.
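To make the image-plus-mask input concrete, here is a minimal NumPy sketch of the kind of mask-prompted input described above. It assumes only that the model consumes an RGB image together with a binary object mask; the array names and the `mask_bbox` helper are illustrative, not part of the SAM 3D API.

```python
import numpy as np

# Illustrative RGB image (H x W x 3) and a binary mask marking one object.
image = np.zeros((64, 64, 3), dtype=np.uint8)
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 10:30] = True  # pretend SAM 3 produced this mask from "sneaker"

def mask_bbox(mask: np.ndarray) -> tuple[int, int, int, int]:
    """Return (y0, x0, y1, x1) bounds of the True region, end-exclusive."""
    ys, xs = np.nonzero(mask)
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1

y0, x0, y1, x1 = mask_bbox(mask)
crop = image[y0:y1, x0:x1]       # object crop, passed alongside the mask
mask_crop = mask[y0:y1, x0:x1]
print((y0, x0, y1, x1), crop.shape)  # (20, 10, 40, 30) (20, 20, 3)
```

In the integrated demo, SAM 3 produces this mask from a text concept; the core reconstruction model only ever sees the image and mask arrays.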
How accurate are the 3D reconstructions?
According to the research paper, SAM 3D achieves ≥5:1 human-preference wins over prior models on real-world objects and scenes. The SAM 3D Body repository also includes per-dataset quantitative metrics such as MPJPE (Mean Per Joint Position Error) and PCK (Percentage of Correct Keypoints) for human reconstruction tasks.
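MPJPE and PCK are standard pose-estimation metrics; the NumPy sketch below shows how they are typically computed and is not the repository's evaluation code.

```python
import numpy as np

def mpjpe(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Per Joint Position Error: average Euclidean distance
    between predicted and ground-truth joints, each of shape (J, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck(pred: np.ndarray, gt: np.ndarray, threshold: float) -> float:
    """Percentage of Correct Keypoints: fraction of joints whose
    error falls within the given distance threshold."""
    errors = np.linalg.norm(pred - gt, axis=-1)
    return float((errors <= threshold).mean())

gt = np.zeros((4, 3))
pred = gt + np.array([0.03, 0.0, 0.0])  # uniform 30 mm offset (metres)
print(mpjpe(pred, gt))      # 0.03
print(pck(pred, gt, 0.05))  # 1.0
print(pck(pred, gt, 0.02))  # 0.0
```

Lower MPJPE and higher PCK indicate better reconstructions; the units and thresholds depend on each dataset's conventions.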
Can SAM 3D reconstruct transparent or reflective objects?
The documentation does not specify performance on transparent or highly reflective materials. The model is trained on natural images with occlusions and clutter, but material-specific capabilities are not disclosed in detail.
What output formats does SAM 3D support?
SAM 3D Objects exports 3D Gaussian splats in PLY format, while SAM 3D Body outputs full-body meshes based on the Momentum Human Rig. Both formats are compatible with modern 3D visualization and rendering tools. Converting to other formats like GLB or OBJ may require downstream processing.
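PLY is a simple header-plus-data format, which is why it interoperates so widely. The pure-Python sketch below parses a tiny ASCII PLY point cloud to show the structure; real Gaussian-splat PLY files carry additional per-vertex properties (scales, rotations, opacity, color coefficients) beyond the x/y/z shown here.

```python
# Minimal ASCII PLY point cloud; splat exports add more per-vertex properties.
ply_text = """ply
format ascii 1.0
element vertex 2
property float x
property float y
property float z
end_header
0.0 0.0 0.0
1.0 2.0 3.0
"""

def read_ascii_ply(text: str):
    """Parse the header and vertex rows of a minimal ASCII PLY file."""
    lines = text.strip().splitlines()
    header_end = lines.index("end_header")
    n_vertices = 0
    for line in lines[:header_end]:
        if line.startswith("element vertex"):
            n_vertices = int(line.split()[-1])
    start = header_end + 1
    rows = [tuple(map(float, l.split())) for l in lines[start : start + n_vertices]]
    return n_vertices, rows

n, points = read_ascii_ply(ply_text)
print(n, points)  # 2 [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
```

For downstream conversion to GLB or OBJ, libraries such as trimesh can read PLY files directly rather than hand-parsing them as above.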
How can I improve reconstruction quality?
Try experimenting with different random seeds in the inference parameters. The repository provides single-object and multi-object example notebooks that demonstrate various reconstruction options. Follow the repository's guidance for available inference settings to optimize results for your specific images.
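Seed experimentation usually means fixing every source of randomness and comparing runs. The sketch below is a generic pattern, with `fake_inference` standing in for a stochastic reconstruction call; the repository's actual seed parameter and inference entry point may be named differently.

```python
import random

import numpy as np

def set_seed(seed: int) -> None:
    """Seed common RNG sources. A real PyTorch run would also call
    torch.manual_seed(seed) and seed CUDA for full reproducibility."""
    random.seed(seed)
    np.random.seed(seed)

def fake_inference(seed: int) -> float:
    """Stand-in for a stochastic reconstruction call."""
    set_seed(seed)
    return float(np.random.rand())

# Same seed -> same result; different seeds yield different candidate
# reconstructions that can be compared visually for quality.
a, b, c = fake_inference(0), fake_inference(0), fake_inference(1)
print(a == b, a == c)  # True False
```

Sweeping a handful of seeds and keeping the best-looking reconstruction is a cheap way to improve results without touching any other inference settings.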