Papers

This page collects not only papers but also other interesting topics.

Computer Vision

FALdetector

Detecting_Photoshopped_Faces_by_Scripting_Photoshop.png
Detecting Photoshopped faces by scripting Photoshop.

Photo Wake-Up

Feature_picasso.gif
Creates a 3D character animation from a single image.

Image Deduplicator

Image_Deduplicator_-_mona_lisa.png
Makes it easy to find similar images.
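One common way to find near-duplicate images is a difference hash compared by Hamming distance. The sketch below illustrates that general idea on small grayscale grids; it is not taken from Image Deduplicator, and the names `dhash`, `hamming`, and `is_near_duplicate` as well as the `max_distance` threshold are hypothetical choices made for illustration.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels as rows of equal length


def dhash(pixels: Grid) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness drops from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Treat two grids as near-duplicates when their hashes differ in few bits.

    max_distance is an assumed threshold, not a recommended value.
    """
    return hamming(dhash(a), dhash(b)) <= max_distance
```

Because the hash only records whether brightness rises or falls between neighbours, a uniformly brightened copy of an image produces the same hash. In practice the images would first be resized to a small fixed grid and converted to grayscale.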

Image Inpainting for Irregular Holes Using Partial Convolutions

Image_Inpainting_for_Irregular_Holes_Using_Partial_Convolutions_-_Preview.jpg
Image inpainting for irregular holes using partial convolutions.

Noise2Noise

Noise2Noise_-_Preview.jpg
Removes noise from images.

Super SloMo

Super_SloMo_-_Preview.jpg
Turns ordinary video into slow-motion video.

Video-to-Video

Video-to-Video_Synthesis_-_Preview.jpg
Produces a photorealistic output video that accurately depicts the content of the input video.

Towards-Realtime-MOT

Towards-Realtime-MOT_-_MOT16-03.gif
A fast, deep-learning-based multi-object tracker.

Poly-MOT

Poly-MOT-Visualization.gif
A polyhedral framework for 3D multi-object tracking.

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

191217_nvidia_902.png
An AI that builds a 3D model from a single 2D photo.

PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization

PIFuHD-sample.png
An AI that builds a 3D model from a single photo of a person.

Image GPT

Openai-image-gpt-sample.jpg

SinGAN

Learning a Generative Model from a Single Natural Image

CRAFT

Craft_example.gif
Text region segmentation for OCR.

MONAI

MONAI_-_COPLE-Net.png
AI Toolkit for Healthcare Imaging

C3DPO

C3DPO-demo.gif
Canonical 3D pose networks for non-rigid structure from motion.

f-BRS

Fbrs_interactive_demo.gif
Region selection -> object selection; a good fit for annotation tools.

virtual-walk

Virtual-walk_-demo-_paris.gif
Virtual walks in Google Street View using PoseNet and applying Deep Learning models to recognize actions.
Github - virtual-walk

SwAV

Facebookresearch-swav.gif
Unsupervised learning of visual features by contrasting cluster assignments.

DeepSORT


Likely useful when tracking multiple people.

OCRNet

The semantic segmentation model used for the 현대제강 (Hyundai Jegang) project at b5t.

Surface Crack Detection (Outlier Detection)

Surface_Crack_Outlier_Detection.png
A practical AI model that uses the VAE from alibi-detect to find cracks in concrete surfaces.
See Alibi Detect#Surface Crack Detection.

Deep Daze

`mist over green hills` -> DeepDaze_-_Mist_over_green_hills.jpg
Generates images from text.
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)
See Deep Daze.

MMSkeleton

Mmskeleton_demo_video.gif
Action recognition from skeletons.
See MMSkeleton.

Monodepth2

Depth_prediction-monodepth2.png
Predicts a depth map from an image.
See Depth Prediction, Monodepth2.

Multi-person Real-time Action Recognition Based-on Human Skeleton

Realtime-Action-Recognition-demo-video.gif
Applies ML to the skeletons from OpenPose; 9 actions; multiple people. (Warning from the author: unfortunately, this is only useful as a course demo, not for real-world applications.)
https://github.com/felixchenfy/Realtime-Action-Recognition
See Skeleton Based Action Recognition.

SOLD2

SOLD2-demo_moving_camera.gif
A joint deep network for feature line detection and description.
https://github.com/cvg/SOLD2
See Self-supervised Occlusion-aware Line Description and Detection (SOLD2).

Self-Supervised Vision Transformers with DINO

Self-Supervised_Vision_Transformers_with_DINO_-_brief.gif
PyTorch implementation and pretrained models for DINO.
https://github.com/facebookresearch/dino
See Self-Supervised Vision Transformers with DINO.

Handsfree.js

Handsfree.gif
Easy face, hand, and pose recognition on the web.
https://github.com/midiblocks/handsfree

What-If Tool

Wit-smile-intro.png
Visually probe the behavior of trained machine learning models with minimal coding.
https://pair-code.github.io/what-if-tool/
https://github.com/PAIR-code/what-if-tool

Physics-based Human Motion Estimation and Synthesis from Videos

Physics-based_Human_Motion_Estimation_and_Synthesis_from_Videos.png
Physics-based human motion estimation and synthesis from videos (ICCV 2021).
https://nv-tlabs.github.io/physics-pose-estimation-project-page/

LoFTR: Detector-Free Local Feature Matching with Transformers

Loftr-github-demo.gif
Detector-free local feature matching with Transformers.
https://github.com/zju3dv/LoFTR

LaMa

LaMa_Demo.gif
Removes objects from high-resolution images.
https://github.com/saic-mdal/lama

Self-supervised Geometric Correspondence for Category-level 6D Object Pose Estimation in the Wild

Self-supervised_Geometric_Correspondence_for_Category-level_6D_Object_Pose_Estimation_in_the_Wild.jpg
Self-supervised geometric correspondence for category-level 6D object pose estimation.
https://kywind.github.io/self-pose

S-NeRF - Neural Radiance Fields for Street Views

S-NeRF_-Neural_Radiance_Fields_for_Street_Views-_sample.png
Extracts depth information from images for street views.
https://arxiv.org/abs/2303.00749 https://ziyang-xie.github.io/s-nerf/

Animated Drawings

Animated_Drawings_Sample.gif
An open-source project that animates children's drawings.

Segment Anything in High Quality

SAM_vs_HQ-SAM_Overview.png
SAM_vs_HQ-SAM_Demo_4.gif
Alias: sam-hq, HQ-SAM

Segment Anything in 3D

SA3D-preview.gif
Alias: SegmentAnythingin3D, SA3D

LivePortrait

LivePortrait_-_showcase2.gif
Efficient portrait animation with stitching and retargeting control.

SHARP

Sharp-teaser.jpg
Sharp Monocular View Synthesis in Less Than a Second
An approach that synthesizes photorealistic views from a single image.

Point Cloud

PointNet

PointNet.jpg
PointNet
Deep learning on point sets for 3D classification and segmentation.

Natural Language Processing

Google’s Zero-Label Language Learning Achieves Results Competitive With Supervised Learning

Towards Zero-Label Language Learning
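The Google AI research team explores zero-label learning in natural language processing (training on synthetic data only) and introduces Unsupervised Data Generation (UDG), a training data creation procedure designed to synthesize high-quality training data without human annotation.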

Medical

Methodology

Miscellaneous

MMDetection3D

Mmdet3d_outdoor_demo.gif
3D object detection.

Extracting Training Data from Large Language Models

Hacking natural-language AI models

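According to a paper published jointly by Google, Harvard, Stanford, OpenAI, and Apple, specific data used in training could be extracted simply by querying a large language model.
The attack on GPT-2 extracted news headlines and personal information such as home addresses with very high accuracy.
Other language models, not just GPT-2, may also be vulnerable to this kind of attack, so preprocessing of training data deserves extra care.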

JoJoGan

JoJoGAN-samples.gif
Creates JoJo-style images - Official PyTorch repo for JoJoGAN: One Shot Face Stylization
https://github.com/mchong6/JoJoGAN

Waifu Labs V2

Waifu_Labs_v2_sample.png
A customizable Japanese anime-style illustration generator drawn by AI.
https://waifulabs.com/blog/ai-creativity

StyleNeRF

StyleNeRF_web_demo.gif
Generates images from multiple 3D viewing angles.
https://github.com/facebookresearch/StyleNeRF

Anime BigGAN Toy

Anime_BigGAN_Toy_-_Sample.gif
Generate Amazing Anime Pictures With BigGAN. Just Have Fun
https://github.com/HighCWu/anime_biggan_toy

sahi

Sahi_-_sliced_inference.gif
A lightweight vision library for performing large scale object detection/ instance segmentation.
https://github.com/obss/sahi
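Sliced inference can be pictured as cutting a large image into overlapping windows, running a detector on each window, and shifting the resulting boxes back into full-image coordinates. The following is a minimal sketch of that windowing step only, not SAHI's API: the function names `slice_boxes` and `to_full_image`, the square `slice_size`, and the `overlap_ratio` default are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def _starts(length: int, window: int, step: int) -> List[int]:
    """Start offsets of windows covering `length` pixels along one axis."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window, step))
    starts.append(length - window)  # last window sits flush with the edge
    return starts


def slice_boxes(width: int, height: int, slice_size: int,
                overlap_ratio: float = 0.2) -> List[Box]:
    """Overlapping square windows that together cover the whole image."""
    step = max(1, int(slice_size * (1 - overlap_ratio)))
    return [
        (x, y, min(x + slice_size, width), min(y + slice_size, height))
        for y in _starts(height, slice_size, step)
        for x in _starts(width, slice_size, step)
    ]


def to_full_image(box: Box, slice_box: Box) -> Box:
    """Shift a box predicted inside a slice back to full-image coordinates."""
    dx, dy = slice_box[0], slice_box[1]
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)
```

Detections from neighbouring windows overlap near the seams, so a real pipeline would still need a merging step (for example non-maximum suppression) after shifting the boxes; that step is omitted here.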

Exploiting Diffusion Prior for Real-World Image Super-Resolution (StableSR)

StableSR_-_example.png
Exploiting Diffusion Prior for Real-World Image Super-Resolution
https://iceclear.github.io/projects/stablesr/

ReIdentificationNet

Multi-camera-tracking-ReIdentificationNet.gif
Enhance Multi-Camera Tracking Accuracy by Fine-Tuning AI Models with Synthetic Data
https://developer.nvidia.com/blog/enhance-multi-camera-tracking-accuracy-by-fine-tuning-ai-models-with-synthetic-data/

MegaSaM

MegaSaM-Sample.gif
MegaSaM - Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos
https://mega-sam.github.io/

CAT4D

Howitworks_cat4d.gif
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
https://cat-4d.github.io/

Graphs

Speech

Playing Games

Generative Agents - Interactive Simulacra of Human Behavior

Generative_Agents_-_Interactive_Simulacra_of_Human_Behavior.png
Generative Agents - Interactive Simulacra of Human Behavior
https://arxiv.org/abs/2304.03442

Time Series

Computer Code

Audio

Robots

Game Engine

Diffusion Models Are Real-Time Game Engines

(Roughly, a DOOM screenshot)
Real-time recordings of people playing the game DOOM simulated entirely by the GameNGen neural model.
https://gamengen.github.io/

3D Modeling

Hunyuan3D

Hunyuan3D_2.0.jpg
Tencent's high-resolution 3D asset generation system.
https://github.com/Tencent/Hunyuan3D-2

Bolt3D

Bolt3D_-_howitworks_website.jpg
An ultra-fast 3D scene generation model.
https://szymanowiczs.github.io/bolt3d

Knowledge Base

Adversarial

Music

Reasoning

TODOs

Favorite site

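[Recommended] OpenResearch.ai ¹ - Deep learning papers are well organized here.
[Recommended] Papers With Code : the latest in machine learning - Collects papers together with their code.
- [Highly recommended] Browse the State-of-the-Art in Machine Learning | Papers With Code - A list of state-of-the-art machine learning methods, organized by category.
Sci-Hub: making uncommon knowledge common - Sci-Hub statistics and database
- A database of 88.34 million research papers and documents (about 100 TB)
  - Medicine, chemistry, biology, humanities, physics, engineering, economics, ...
  - Sources: 80% journals, 6% conferences, 5% books
  - 77% were published between 1980 and 2020, and 36% between 2010 and 2020
- A SQL table containing only the DOI list and metadata is also available for download
- The full paper data can be downloaded via torrent (Reddit's Rescue Mission)
[Recommended] Deep Learning Bible - 4. Object Detection - Korean - WikiDocs - Analyzes and summarizes many papers in Korean.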

Article

The General Index - A free n-gram index of a million journals | GeekNews
Unpaywall - Open-access paper search engine | GeekNews
top-cited-2022-papers.tsv - The 100 most-cited AI papers of 2022 (rankings by citation count and paper count over three years: by country and by company)
- Top 5 papers of the past three years
  - 2022: AlphaFold Protein Structure DB, ColabFold, DALL-E2, ConvNet, PaLM
  - 2021: AlphaFold Protein Structure Prediction, Swin Transformer, CLIP, ...
  - 2020: Transformers for Image Recognition, GPT-3, YOLOv4, ...

Tools

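Writefull - AI that helps with writing academic papers
- Title Generator: suggests a title from an abstract
- Abstract Generator: generates an abstract from the introduction and conclusion
- Paraphraser: rewords sentences so they can be cited while avoiding accusations of plagiarism
- Academizer: rewrites sentences written in non-academic language into academic language
- GPT Detector: checks whether a given paragraph was generated by an AI such as GPT-3 or ChatGPT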

References

¹ Openresearch.ai-190921.zip
