Segment Anything
Meta's AI model that can extract any object from an image.
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Features
- "이미지 세그멘테이션"을 위한 첫번째 파운데이션 모델
- 픽셀이 어떤 객체에 속해있는지를 식별하는 것
- 추가적인 훈련 없이도 새로운 이미지 도메인(물속 사진이나, 세포 현미경 사진등)에도 적용 가능
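To make the zero-shot point concrete, here is a minimal sketch using the repository's SamAutomaticMaskGenerator to segment everything in an arbitrary image; the checkpoint path and the image filename ("example.jpg") are placeholders.

import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Placeholder checkpoint path; see "Download Model Checkpoints" below.
sam = sam_model_registry["default"](checkpoint="./sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.imread("example.jpg")  # placeholder image file
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # the generator expects RGB input

masks = mask_generator.generate(image)  # one dict per detected object
print(f"Found {len(masks)} masks")
print(masks[0].keys())  # includes 'segmentation', 'area', 'bbox', ...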
Projects
- Magic Copy - a Chrome extension using the Segment Anything Model
- Edit Anything - uses Meta's Segment-Anything Model to recognize every part of an image and to edit or regenerate each part
- Inpaint Anything - image inpainting using Segment Anything
- FoodSAM - SAM for food images
- SAM.cpp - Meta's Segment Anything Model implemented in pure C/C++
- Language Segment-Anything - SAM with text prompt
- GroundingDINO
- GroundingSAM (GroundedSAM; Grounded SAM; Grounding-SAM; Grounding SAM; Grounded-Segment-Anything; Grounded Segment Anything)
- Semantic-SAM - Segment and Recognize Anything at Any Granularity
- FastSAM - Fast Segment Anything Model
- MobileSAM - the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond
- Segment Anything Web - a Hugging Face Space by Xenova - a SAM demo that runs directly in the browser on Hugging Face
- SegmentAnything-TensorRT - running SAM with TensorRT
- Track-Anything
- Caption-Anything
- SegmentAnythingin3D (Segment Anything in 3D; SA3D)
- AnyLabeling
Predict example
The code requires python>=3.8, as well as pytorch>=1.7 and torchvision>=0.8.
Install Segment Anything:
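Per the upstream README, installation is a single pip command:

pip install git+https://github.com/facebookresearch/segment-anything.git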
The following optional dependencies are necessary for mask post-processing, saving masks in COCO format, the example notebooks, and exporting the model in ONNX format. jupyter is also required to run the example notebooks.
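The corresponding command from the upstream README is:

pip install opencv-python pycocotools matplotlib onnxruntime onnx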
Download Model Checkpoints:
Three versions of the model are available, with different backbone sizes. These models can be instantiated by running
from segment_anything import sam_model_registry
sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")
Download the checkpoint for the corresponding model type from the links in the upstream README:
- default or vit_h: ViT-H SAM model - the most parameters
- vit_l: ViT-L SAM model - medium
- vit_b: ViT-B SAM model - the most lightweight
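For example, the ViT-H (default) checkpoint used in the example below can be fetched with curl; this URL is the one referenced in that example, and the vit_l and vit_b checkpoints are linked from the same upstream page.

curl -O https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

With a checkpoint downloaded, the model can be used in a few lines to get masks from a prompt: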
from segment_anything import SamPredictor, sam_model_registry
sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")
predictor = SamPredictor(sam)
predictor.set_image(<your_image>)
masks, _, _ = predictor.predict(<input_prompts>)
Example code
# -*- coding: utf-8 -*-
# curl -O https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
from segment_anything import SamPredictor, sam_model_registry
import cv2
import os
import numpy as np
from datetime import datetime

use_gpu = False
sam = sam_model_registry["default"](checkpoint="./sam_vit_h_4b8939.pth")
if use_gpu:
    sam.to(device="cuda")
predictor = SamPredictor(sam)

files = (
    "mpv-shot0001.jpg",
    "mpv-shot0002.jpg",
    "mpv-shot0003.jpg",
    "mpv-shot0004.jpg",
    "mpv-shot0005.jpg",
    "mpv-shot0006.jpg",
    "mpv-shot0007.jpg",
    "mpv-shot0008.jpg",
    "mpv-shot0009.jpg",
    "mpv-shot0010.jpg",
    "mpv-shot0011.jpg",
    "mpv-shot0012.jpg",
    "mpv-shot0013.jpg",
    "mpv-shot0014.jpg",
    "mpv-shot0015.jpg",
    "mpv-shot0016.jpg",
)

for filename in files:
    file_prefix, ext = os.path.splitext(filename)
    input_point = np.array([[1950, 378]])  # (x, y) pixel coordinates of the prompt
    input_label = np.array([1])  # 1 = foreground point, 0 = background point
    img = cv2.imread(filename)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # SamPredictor expects RGB; cv2 loads BGR
    begin = datetime.now()
    predictor.set_image(img)
    masks, scores, logits = predictor.predict(
        point_coords=input_point,
        point_labels=input_label,
        multimask_output=True,
    )
    duration = (datetime.now() - begin).total_seconds()
    print(f"Predict duration: {duration:.02f}s")
    for i in range(masks.shape[0]):
        mask_filename = file_prefix + f"-{i}.png"
        # Masks are boolean arrays; convert to 0/255 uint8 before writing.
        cv2.imwrite(mask_filename, (masks[i] * 255).astype(np.uint8))
        print(f"Save mask image: {mask_filename}")
On an NVIDIA GeForce RTX 3070 Ti, the predict duration comes out to about 0.72s.
Internal test results
- NVIDIA GeForce RTX 3070 Ti
- 3840x2160 (4K original) image - about 0.7s per prediction