Caffe

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.

Models

CaffeNet vs AlexNet

CaffeNet
AlexNet
GoogleNet

Databases

Web demo

The web demo requires several Python packages, including Werkzeug, Flask, tornado, numpy, pandas, Pillow, and itsdangerous. Install them as follows.

$ pip install -r examples/web_demo/requirements.txt

The Python path must also be set as follows.

PYTHONPATH=/path/to/caffe/python:$PYTHONPATH

The demo can then be started with examples/web_demo/app.py.

$ python examples/web_demo/app.py -h
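The PYTHONPATH step can also be done from inside a script by prepending the pycaffe directory to sys.path. The following is a minimal sketch: the helper name add_caffe_to_path is hypothetical, and /path/to/caffe is the same placeholder used in the PYTHONPATH line of this section.

```python
import os
import sys


def add_caffe_to_path(caffe_root):
    """Prepend <caffe_root>/python to sys.path (hypothetical helper).

    Has the same effect as PYTHONPATH=/path/to/caffe/python:$PYTHONPATH,
    but only for the current Python process.
    """
    pycaffe_dir = os.path.join(caffe_root, "python")
    if pycaffe_dir not in sys.path:
        sys.path.insert(0, pycaffe_dir)
    return pycaffe_dir


# Usage (placeholder path, replace with the real Caffe checkout):
#   add_caffe_to_path("/path/to/caffe")
#   import caffe
```

This only affects the running interpreter, so the shell variable is still needed when launching app.py directly.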

Mean value

Stackoverflow - What is the order of mean values in Caffe's train.prototxt?

https://groups.google.com/forum/#!topic/caffe-users/9opH6AW3Irw (answer by Evan Shelhamer):
[Mean] values are BGR for historical reasons -- the original CaffeNet training lmdb was made with image processing by OpenCV which defaults to BGR order.
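Because the mean values are in BGR order, an image loaded as RGB has to have its channels reversed before the mean is subtracted. The sketch below shows this with NumPy. The function name preprocess_rgb is hypothetical and the mean values are illustrative assumptions; take the actual mean_value entries from your own prototxt.

```python
import numpy as np

# Illustrative per-channel means in BGR order (assumed values; use the
# mean_value entries from your own prototxt instead).
MEAN_BGR = np.array([104.0, 117.0, 123.0], dtype=np.float32)


def preprocess_rgb(image_rgb, mean_bgr=MEAN_BGR):
    """Turn an H x W x 3 RGB image into a mean-subtracted C x H x W BGR array."""
    image = np.asarray(image_rgb, dtype=np.float32)
    bgr = image[:, :, ::-1]        # RGB -> BGR by reversing the channel axis
    bgr = bgr - mean_bgr           # broadcasts the 3 means over H and W
    return bgr.transpose(2, 0, 1)  # H x W x C -> C x H x W
```

Subtracting RGB-ordered means from a BGR image (or the reverse) runs without any error but shifts the red and blue channels by the wrong amounts, so the order is worth checking explicitly.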

Troubleshooting

Multi-GPU Usage

caffe/multigpu.md at master · BVLC/caffe

Currently Multi-GPU is only supported via the C/C++ paths and only for training.

The GPUs to be used for training can be set with the "-gpu" flag on the command line to the 'caffe' tool. e.g. "build/tools/caffe train --solver=models/bvlc_alexnet/solver.prototxt --gpu=0,1" will train on GPUs 0 and 1.

NOTE: each GPU runs the batchsize specified in your train_val.prototxt. So if you go from 1 GPU to 2 GPU, your effective batchsize will double. e.g. if your train_val.prototxt specified a batchsize of 256, if you run 2 GPUs your effective batch size is now 512. So you need to adjust the batchsize when running multiple GPUs and/or adjust your solver params, specifically learning rate.
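The batch size arithmetic in the note above can be sketched as follows. Both function names are hypothetical helpers for illustration and are not part of Caffe.

```python
def effective_batch_size(per_gpu_batch, num_gpus):
    """Each GPU runs the full prototxt batch, so the totals add up."""
    return per_gpu_batch * num_gpus


def per_gpu_batch_for(target_effective_batch, num_gpus):
    """Batch size to put in train_val.prototxt to keep the effective batch fixed."""
    if target_effective_batch % num_gpus != 0:
        raise ValueError("target batch is not divisible by the number of GPUs")
    return target_effective_batch // num_gpus


# A prototxt batch of 256 on 2 GPUs gives an effective batch of 512.
# To keep the effective batch at 256 on 2 GPUs, set the prototxt batch to 128.
```

The alternative is to keep the per-GPU batch unchanged and retune the solver parameters, in particular the learning rate, for the larger effective batch.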

Could NOT find Atlas

The following error message may appear while running cmake.

Could NOT find Atlas (missing: Atlas_LAPACK_LIBRARY)

This error means the Atlas libraries could not be found. In this case, work around it by using OpenBLAS instead. As cmake/Dependencies.cmake shows, this is done by setting the BLAS value to Open.

$ cmake -DBLAS=Open -DUSE_CUDNN=1 -G "Unix Makefiles"

Python can't import _caffe module

Python can't import _caffe module #263

Problem with dynamic loading and caffe.proto

Problem with dynamic loading and caffe.proto #1917

The following error message may appear when running the executable.

[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: caffe.proto

The causes vary, but in my case the problem came from having additionally installed OpenCV's contrib extension modules. Removing the extension modules and reinstalling OpenCV fixes it.

ImportError: No module named gpu_nms

One of the errors that occurs with Faster-RCNN. See the related entry for the solution.

Segmentation Fault

Segmentation faults can occur in a great many situations, so check the items below.

Boost version. (For reference, in my case, with the Boost.Python package from 1.60.0 the program crashed when accessing a Mat from the OpenCV Python bindings. After downgrading to 1.59.0 it ran normally.)
- When a similar symptom occurred, I reinstalled the Python-related packages in this order: NumPy -> Boost.Python -> OpenCV -> Caffe.

R6034

If this message appears when using Pycaffe, the following changes may get it working.

In the Python package where Protobuf is installed, remove __import__('pkg_resources').declare_namespace(__name__) from google/__init__.py.
In Pycaffe's caffe/io.py, remove import skimage.io and from skimage.transform import resize. This appears to be a problem that arises when Python's __import__ machinery is involved. (The exact cause still needs to be confirmed.)

registry.count(type) (0 vs. 1)

The following error may occur at run time.

Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: Input (known types: MemoryData)

In this case, if you built statically, switching to a dynamic build resolves it. (The code that registers layers with the LayerFactory lives at global scope.)
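The error message reflects a registry pattern: each layer type adds itself to a global table when its registration code runs, and creating a layer fails if that code never ran. A likely explanation for the static build case is that the linker drops object files nothing references directly, so their global registration code is never executed. The following is a Python analogy only, not Caffe's actual C++ code; all names in it are hypothetical.

```python
# Global table mapping layer type names to creator callables.
LAYER_REGISTRY = {}


def register_layer(type_name):
    """Decorator that registers a layer class under type_name when it is defined."""
    def decorator(creator):
        LAYER_REGISTRY[type_name] = creator
        return creator
    return decorator


def create_layer(type_name):
    """Look up a registered creator, failing the way the Caffe check does."""
    if type_name not in LAYER_REGISTRY:
        known = ", ".join(sorted(LAYER_REGISTRY))
        raise ValueError(
            "Unknown layer type: %s (known types: %s)" % (type_name, known))
    return LAYER_REGISTRY[type_name]()


@register_layer("MemoryData")
class MemoryDataLayer(object):
    pass


# "Input" is never registered here, so create_layer("Input") raises
# ValueError, mirroring a build in which the Input layer's registration
# code was left out of the final binary.
```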

No module named skimage.io

If skimage.io is missing from your Python installation, install the package as follows.

$ pip install scikit-image
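To check up front whether a module such as skimage.io can be found, without waiting for the import to fail inside Pycaffe, something like the following can be used. This is a minimal sketch using the standard library's importlib.util.find_spec; the helper name has_module is hypothetical.

```python
import importlib.util


def has_module(name):
    """Return True if the module can be located on the current Python path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # For a dotted name, a missing parent package raises
        # ModuleNotFoundError (a subclass of ImportError).
        return False


# Usage:
#   if not has_module("skimage.io"):
#       print("run: pip install scikit-image")
```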

cudaSuccess (2 vs. 0) out of memory

During training and similar workloads, you may run into an error message like the following.

I0329 21:13:11.459560  5026 net.cpp:217] Network initialization done.
I0329 21:13:11.459564  5026 net.cpp:218] Memory required for data: 343607608
I0329 21:13:11.459643  5026 solver.cpp:42] Solver scaffolding done.
I0329 21:13:11.459671  5026 solver.cpp:222] Solving CaffeNet
I0329 21:13:11.459676  5026 solver.cpp:223] Learning Rate Policy: step
I0329 21:13:11.459689  5026 solver.cpp:266] Iteration 0, Testing net (#0)
I0329 21:15:33.072834  5026 solver.cpp:315]     Test net output #0: accuracy = 0
I0329 21:15:33.072907  5026 solver.cpp:315]     Test net output #1: loss = 7.48307 (* 1 = 7.48307 loss)
F0329 21:15:33.540781  5026 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
    @     0x7f6a22caddaa  (unknown)
    @     0x7f6a22cadce4  (unknown)
    @     0x7f6a22cad6e6  (unknown)
    @     0x7f6a22cb0687  (unknown)
    @     0x7f6a230a0fda  caffe::SyncedMemory::mutable_gpu_data()
    @     0x7f6a23094352  caffe::Blob<>::mutable_gpu_data()
    @     0x7f6a230c4f05  caffe::ConvolutionLayer<>::Forward_gpu()
    @     0x7f6a2304c41d  caffe::Net<>::ForwardFromTo()
    @     0x7f6a2304c887  caffe::Net<>::ForwardPrefilled()
    @     0x7f6a2306b454  caffe::Solver<>::Step()
    @     0x7f6a2306be7f  caffe::Solver<>::Solve()
    @           0x4085d8  train()
    @           0x406b71  main
    @     0x7f6a221bfec5  (unknown)
    @           0x40711d  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)

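Checking with a utility such as nvidia-smi shows that the GPU is running out of memory. If this happens during training, lower the batch size (the batch_size of the data layer in the train_val.prototxt that solver.prototxt references). If that is still not enough, the only remaining option is to change the layer structure.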
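The "Memory required for data" line in the log above gives a rough lower bound to compare against the free memory nvidia-smi reports. The sketch below extracts that number and converts it to MiB; the function names are hypothetical. As far as I can tell, the figure covers only the net's data blobs, so actual GPU usage (parameters, gradients, the test net, library workspace) will be higher.

```python
import re

# Matches the byte count in a Caffe log line such as
# "net.cpp:218] Memory required for data: 343607608".
_MEMORY_LINE = re.compile(r"Memory required for data: (\d+)")


def memory_required_bytes(log_text):
    """Return the first reported byte count in log_text, or None if absent."""
    match = _MEMORY_LINE.search(log_text)
    return int(match.group(1)) if match else None


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024.0 * 1024.0)
```

For the log above this comes to roughly 328 MiB. Since this memory is for per-sample data blobs, it should shrink roughly in proportion to the batch size.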

Documentation

Welcome to the BrainCaffe wiki! Caffe Documentation - ver.Kor: https://github.com/ys7yoo/BrainCaffe/wiki; BrainCaffe.wiki_-master-_8616aa1.zip (2017-09-01)

Favorite site

Caffe web site
Caffe user groups
Convolutional Neural Networks (CNN) for images
Github: Study materials and example codes for caffe library (understanding deep learning by analyzing the Caffe open-source code)
[Recommended] Deep learning study guide (HW / SW preparation) ¹
- Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning
- A Full Hardware Guide to Deep Learning

Caffe

Category

Caffe Tour

Caffe source code

Caffe:Api

Layers

Programming

Libraries