Kubernetes
The de facto standard for container orchestration.
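Docker plays a central role among the powerful, innovative tools and technologies that brought containerized software development to the open-source community. Kubernetes, which Google introduced in June 2014, is an open-source container management tool used to speed up development and simplify operations.
Google has long used containers for its internal operations. At DockerCon in the summer of 2014, Google open-sourced Kubernetes, which was developed to meet the needs of the explosively growing Docker ecosystem. Working with Red Hat, CoreOS, and other organizations and projects, the Kubernetes maintainers grew it into the most-downloaded project on Docker Hub. The Kubernetes team plans to expand the project and grow the community so that software developers can spend less time managing infrastructure and more time building apps.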
Categories
- Container Orchestration
- Cloud Native Computing Foundation (CNCF; Linux Foundation)
- Kubernetes:Basic
- Kubernetes:Pod
- Kubernetes:ReplicaSet
- Kubernetes:Volume
- Kubernetes:Secret
- Kubernetes:Service
- Kubernetes:ServiceAccount
- Kubernetes:PersistentVolume (Kubernetes:SharedStorage)
- Kubernetes:HorizontalPodAutoscaler (Autoscaling)
- Kubernetes:CustomResourceDefinition (CRD; CRDs)
- Kubernetes:StorageClass
- Kubernetes:API (kube-apiserver)
- Kubernetes:Troubleshooting
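- DevOps:Kubernetes - Building a Kubernetes-based DevOps server.
- kubectl - command-line utility
- kubeadm - command for bootstrapping a cluster
- kubelet - component that runs on every machine in the cluster and performs tasks such as starting pods and containers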
- Persistent Volume (PV)
- Persistent Volume Claim (PVC)
Lightweight
- Lightweight Kubernetes (k3s; Rancher)
- microk8s (Canonical)
- minikube
- kind
- k3d - k8s (k3s) that runs inside Docker
Cloud Service
Container Network Interface (CNI)
- Ingress
- Flannel
- ACI
- Calico
- Canal
- Cilium
- CNI-Genie
- Contiv
- Multus
- NSX-T
- Nuage
- Romana
- Weave Net
Libraries
Block Storage
Object Storage
Configuration
certificate
Secret
Sidecar
ETC
- CDK for Kubernetes (cdk8s)
- Porter - open-source Kubernetes-based PaaS.
- kubespray - cluster deployment automation tool.
- Docker
- Docker Swarm
- k0s
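- netshoot - container that bundles the various tools needed to track down network problems.
- KDash - fast and simple Kubernetes dashboard
- Talos Linux - Linux designed for K8s
- StackRox - Kubernetes security platform
- Lens - application for managing Kubernetes clusters
- kubeflow - machine learning platform that automates the repetitive work across the whole pipeline of developing, training, and deploying machine learning models.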
- Istio
- Kubeshark - API traffic viewer for Kubernetes
- Caretta - tool that generates a Kubernetes dependency map
- KWOK - Kubernetes WithOut Kubelet
Patterns
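- Foundational patterns
- Health Probe pattern
- Predictable Demands pattern
- Automated Placement pattern
- Structural patterns
- Init Container pattern
- Sidecar pattern
- Behavioral patterns
- Batch Job pattern
- Stateful Service pattern
- Service Discovery pattern
- Advanced patterns
- Controller pattern
- Operator pattern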
HTTP connection
See the Kube-apiserver#HTTPS Endpoint entry.
Checklist for using Kubernetes in production
Running applications in production can be tricky. This post proposes an opinionated checklist for going to production with a web service (i.e. application exposing HTTP API) on Kubernetes.
This checklist was shaped by my own limited experience at Zalando, so please see it only as a template for your own checklist, adapted to your environment. Some items can be considered optional or aspirational, depending on context.
General
- Application's name, description, purpose, and owning team is clearly documented (e.g. in a central application registry or wiki)
- Application's criticality level was defined (e.g. "tier 1" if the app is highly critical for the business)
- Development team has sufficient knowledge/experience with the technology stack
- Responsible 24/7 on-call team is identified and informed
- Go-Live plan exists incl. steps for potential rollback
Application
- Application's code repository (git) has clear instructions on how to develop, how to configure, and how to contribute changes (important for emergency fixes)
- Code dependencies are pinned (i.e. hotfix changes do not accidentally pull in new libraries)
- All relevant code is instrumented with OpenTracing or OpenTelemetry
- OpenTracing/OpenTelemetry semantic conventions are followed (incl. additional company conventions)
- All outgoing HTTP calls have a defined timeout
- HTTP connection pools are configured with sane values according to expected traffic
- Thread pools and/or non-blocking async code is correctly implemented/configured
- Database connection pools are sized correctly
- Retries and retry policies (e.g. backoff with jitter) are implemented for dependent services
- Circuit breakers are implemented
- Fallbacks for circuit breakers are defined according to business requirements
- Load shedding / rate limiting mechanisms are implemented (could be part of provided infrastructure)
- Application metrics are exposed for collection (e.g. to be scraped by Prometheus)
- Application logs go to stdout/stderr
- Application logs follow good practices (e.g. structured logging, meaningful messages), log levels are clearly defined, and debug logging is disabled for production by default (with option to turn on)
- Application container crashes on fatal errors (i.e. it does not enter some unrecoverable state or deadlock)
- Application design/code was reviewed by a senior/principal engineer
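Two of the items above, a defined timeout on every outgoing HTTP call and retries with backoff and jitter, can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the checklist author's implementation. The helper names `backoff_delays`, `call_with_retries`, and `fetch`, and all default values, are assumptions chosen for the example.

```python
import random
import time
import urllib.request


def backoff_delays(retries, base=0.1, cap=5.0, rng=random.random):
    """Yield one "full jitter" delay per retry.

    Retry n waits a random time in [0, min(cap, base * 2**n)) seconds,
    so clients that failed together do not retry in lockstep.
    """
    for n in range(retries):
        yield rng() * min(cap, base * (2 ** n))


def call_with_retries(func, attempts=4, base=0.1, cap=5.0,
                      retry_on=(OSError,), sleep=time.sleep):
    """Call func(); on a retryable error, wait a jittered delay and retry.

    `attempts` is the total number of calls, so at most attempts - 1
    sleeps happen. The last error is re-raised once attempts run out.
    """
    delays = backoff_delays(attempts - 1, base, cap)
    while True:
        try:
            return func()
        except retry_on:
            delay = next(delays, None)
            if delay is None:
                raise  # attempts exhausted: surface the last error
            sleep(delay)


def fetch(url, timeout=2.0):
    """Outgoing HTTP call with an explicit timeout (in seconds)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

A call such as `call_with_retries(lambda: fetch(url))` would then combine both behaviors. Retrying only on `OSError` (which covers connection failures and timeouts raised by `urllib`) is a simplification; a real service would also decide which HTTP status codes are safe to retry.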
Security & Compliance
- Application can run as unprivileged user (non-root)
- Application does not require a writable container filesystem (i.e. can be mounted read-only)
- HTTP requests are authenticated and authorized (e.g. using OAuth)
- Mechanisms to mitigate Denial of Service (DoS) attacks are in place (e.g. ingress rate limiting, WAF)
- A security audit was conducted
- Automated vulnerability checks for code / dependencies are in place
- Processed data is understood, classified (e.g. PII), and documented
- Threat model was created and risks are documented
- Other applicable organizational rules and compliance standards are followed
CI/CD
- Automated code linting is run on every change
- Automated tests are part of the delivery pipeline
- No manual operations are needed for production deployments
- All relevant team members can deploy and roll back
- Production deployments have smoke tests and optionally automatic rollbacks
- Lead time from code commit to production is fast (e.g. 15 minutes or less including test runs)
Kubernetes
- Development team is trained in Kubernetes topics and knows relevant concepts
- Kubernetes manifests use the latest API version (e.g. apps/v1 for Deployment)
- Container runs as non-root and uses a read-only filesystem
- A proper Readiness Probe was defined, see blog post about Readiness/Liveness Probes
- No Liveness Probe is used, or there is a clear rationale to use a Liveness Probe, see blog post about Readiness/Liveness Probes
- Kubernetes deployment has at least two replicas
- A Pod Disruption Budget was defined (or is automatically created, e.g. by pdb-controller)
- Horizontal autoscaling (HPA) is configured if adequate
- Memory and CPU requests are set according to performance/load tests
- Memory limit equals memory requests (to avoid memory overcommit)
- CPU limits are not set or impact of CPU throttling is well understood
- Application is correctly configured for the container environment (e.g. JVM heap, single-threaded runtimes, runtimes not container-aware)
- Single application process runs per container
- Application can handle graceful shutdown and rolling updates without disruptions, see this blog post
- Pod Lifecycle Hook (e.g. "sleep 20" in preStop) is used if the application does not handle graceful termination
- All required Pod labels are set (e.g. Zalando uses "application", "component", "environment")
- Application is set up for high availability: pods are spread across failure domains (AZs, default behavior for cross-AZ clusters) and/or application is deployed to multiple clusters
- Kubernetes Service uses the right label selector for pods (e.g. not only matches the "application" label, but also "component" and "environment" for future extensibility)
- There are no anti-affinity rules defined, unless really required (pods are spread across failure domains by default)
- Optional: Tolerations are used as needed (e.g. to bind pods to a specific node pool)
See also this curated checklist of Kubernetes production best practices.
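Several of the Kubernetes items above can be checked mechanically against a Deployment manifest. The sketch below assumes the manifest has already been parsed into a Python dict (for example from YAML or JSON) and covers only a handful of the items. The function name `check_deployment` and the wording of the findings are hypothetical, and resource quantities are compared as plain strings, so `512Mi` and `0.5Gi` would be reported as different.

```python
def check_deployment(manifest):
    """Return a list of findings for a Deployment manifest given as a dict."""
    findings = []
    if manifest.get("apiVersion") != "apps/v1":
        findings.append("apiVersion should be apps/v1")

    spec = manifest.get("spec", {})
    # Kubernetes defaults to a single replica when the field is omitted.
    if spec.get("replicas", 1) < 2:
        findings.append("fewer than two replicas")

    pod_spec = spec.get("template", {}).get("spec", {})
    pod_sc = pod_spec.get("securityContext", {})
    for c in pod_spec.get("containers", []):
        name = c.get("name", "<unnamed>")
        res = c.get("resources", {})
        requests = res.get("requests", {})
        limits = res.get("limits", {})

        if "memory" not in requests or "cpu" not in requests:
            findings.append(f"{name}: memory and CPU requests must be set")
        if limits.get("memory") != requests.get("memory"):
            findings.append(f"{name}: memory limit should equal memory request")
        if "cpu" in limits:
            findings.append(f"{name}: CPU limit set, check throttling impact")

        sc = c.get("securityContext", {})
        # runAsNonRoot may be set on the pod or overridden per container.
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)):
            findings.append(f"{name}: runAsNonRoot is not enabled")
        if not sc.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root filesystem is writable")
        if "readinessProbe" not in c:
            findings.append(f"{name}: no readiness probe")
    return findings
```

An empty list means none of these particular checks failed; it says nothing about the remaining items, such as Pod Disruption Budgets or label conventions, which live in other objects.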
Monitoring
- Metrics for The Four Golden Signals are collected
- Application metrics are collected (e.g. via Prometheus scraping)
- Backing data store (e.g. PostgreSQL database) is monitored
- SLOs are defined
- Monitoring dashboards (e.g. Grafana) exist (could be automatically set up)
- Alerting rules are defined based on impact, not potential causes
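As a rough illustration of the first item, the Four Golden Signals (latency, traffic, errors, saturation) can be derived from a window of request records. The sketch below is an assumption-laden toy, not a replacement for Prometheus: the function name `golden_signals`, the record shape `(latency_seconds, http_status)`, and the use of in-flight requests as the saturation measure are all choices made for the example.

```python
import math


def golden_signals(requests, window_seconds, in_flight, max_in_flight):
    """Summarize one observation window.

    requests: list of (latency_seconds, http_status) tuples.
    in_flight / max_in_flight: current vs. maximum concurrent requests,
    used here as a stand-in for saturation.
    """
    n = len(requests)
    latencies = sorted(latency for latency, _ in requests)
    errors = sum(1 for _, status in requests if status >= 500)

    if n:
        # Nearest-rank 99th percentile: smallest value with at least
        # 99% of the samples at or below it.
        rank = max(1, math.ceil(99 * n / 100))
        p99 = latencies[rank - 1]
    else:
        p99 = 0.0

    return {
        "latency_p99_s": p99,
        "traffic_rps": n / window_seconds,
        "error_ratio": errors / n if n else 0.0,
        "saturation": in_flight / max_in_flight,
    }
```

An impact-based alert would then compare values such as `error_ratio` or `latency_p99_s` against thresholds taken from the SLO, rather than alerting on an individual cause like a single pod restart.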
Testing
- Breaking points were tested (system/chaos test)
- Load test was performed which reflects the expected traffic pattern
- Backup and restore of the data store (e.g. PostgreSQL database) was tested
24/7 On-Call
- All relevant 24/7 personnel are informed about the go-live (e.g. other teams, SREs, or other roles like incident commanders)
- 24/7 on-call team has sufficient knowledge about the application and business context
- 24/7 on-call team has necessary production access (e.g. kubectl, kube-web-view, application logs)
- 24/7 on-call team has expertise to troubleshoot production issues with the tech stack (e.g. JVM)
- 24/7 on-call team is trained and confident to perform standard operations (scale up, rollback, etc.)
- Runbooks are defined for application-specific incident handling
- Runbooks for overload scenarios have pre-approved business decisions (e.g. what customer feature to disable to reduce load)
- Monitoring alerts to page the 24/7 on-call team are set up
- Automatic escalation rules are in place (e.g. page next level after 10 minutes without acknowledgement)
- Process for conducting postmortems and disseminating incident learnings exists
- Regular application/operational reviews are conducted (e.g. looking at SLO breaches)
Documentation
- Kubernetes Hardening Guidance from the US National Security Agency (NSA)
- https://media.defense.gov/2021/Aug/03/2002820425/-1/-1/1/CTR_KUBERNETES%20HARDENING%20GUIDANCE.PDF
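- CTR_KUBERNETES_HARDENING_GUIDANCE.PDF - Key points
- Scan containers and pods for vulnerabilities or misconfigurations
- Run containers and pods with the least privileges possible
- Separate networks to contain the damage a compromise can cause
- Use firewalls to limit unneeded network connectivity, and encrypt traffic
- Use strong authentication and authorization to limit user and administrator access and to limit the attack surface
- Use log auditing so that administrators can monitor activity and be alerted to potentially dangerous behavior
- Periodically review all k8s settings and use vulnerability scans to confirm that risks are addressed and security patches are applied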
- [Translation] 10 common mistakes made in Kubernetes
- https://coffeewhale.com/kubernetes/mistake/2020/11/29/mistake-10/
Building a web server with k8s
- Building a web server with k8s - Part 1. Install k3s with Calico - Oznote.io_-_k8s_web_server_1.pdf
- Building a web server with k8s - Part 2. Set up nginx-ingress, IPVS, MetalLB - Oznote.io_-_k8s_web_server_2.pdf
- Building a web server with k8s - Part 3. Deploy Next.js (with bun.js) - Oznote.io_-_k8s_web_server_3.pdf
- Building a web server with k8s - Part 4. Apply HTTPS to nginx ingress (using cert-manager) - Oznote.io_-_k8s_web_server_4.pdf
- Building a web server with k8s - Part 5. Deploy and integrate MariaDB - Oznote.io_-_k8s_web_server_5.pdf
- Building a web server with k8s - Part 5-1. (Bonus) Deploy and integrate PostgreSQL - Oznote.io_-_k8s_web_server_5-1.pdf
- Building a web server with k8s - Part 6. Set up MinIO - Oznote.io_-_k8s_web_server_6.pdf
List of Container Orchestration
- Container Orchestration
- Marathon (Apache)
- Rancher (SUSE)
- Shipyard
- Kubernetes (Google)
- OpenShift (Red Hat)
- Swarm (Docker)
- Mesos (Apache)
- DEIS
- Nomad (HashiCorp)
- Tanzu (VMware)
- Singularity (HPC)
See also
Favorite site
- kubernetes web site
- Kubernetes - ArchWiki
- Case study: build server virtualization using Docker + Kubernetes
- After reading Google's container management paper (Borg Paper, Kubernetes)
- Installing Kubernetes (Ubuntu 15.04)
- Container orchestration system, "Kubernetes"
- Slideshare: Introduce Google Kubernetes (ko)
- Slideshare: Case study: build server virtualization using Docker + Kubernetes
- [Recommended] CNCF Cloud Native Interactive Landscape - image of the Kubernetes ecosystem landscape
Documentation
How to install
- [Recommended] kubernetes installation on ubuntu
- [Recommended] How to install Kubernetes Mesos Framework on Ubuntu (ko)
- [Recommended] Google Kubernetes – Container Cluster Manager (ko)
- Setting up Kubernetes on Ubuntu
- Setting up CoreOS + Kubernetes with Vagrant
- Installing Kubernetes on Ubuntu
- A brief introduction to Kubernetes and an installation guide for Kubernetes-Docker integration
- [Recommended] Container Tutorials: get started kubernetes
- [Recommended] kubernetes installation on ubuntu
- Installing Kubernetes (Ubuntu 14.04.3 LTS)
- Introducing Kubernetes version 1.0!
- Get that kubernetes cluster working !!!
- Youtube: Setting up and using a single node Kubernetes cluster
- Youtube: Kubernetes : Getting started with Kubernetes
Guide
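- (Translation) 10 common mistakes made in Kubernetes
- [Recommended] A Kubernetes guide for Docker Swarm lovers - Kubernetes as seen by someone who knows Docker Swarm
- [Recommended] How Kubernetes Pods work, revisited starting from containers
- [Recommended] (Translation) Operating 2,500 Kubernetes nodes
- [Recommended] (Translation) Life of a packet in Kubernetes - #1
- [Recommended] Getting started with Kubernetes - What is Kubernetes? [1]
- [Recommended] Kubernetes Guide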
Tutorials
- Kubernetes Guide - for beginners, from installation to deployment <hands-on edition>
- kubectl for Docker Users | Kubernetes
Article
Videos
- Kubernetes official documentary - parts 1 and 2, 55 minutes in total
Online Tools
- kyaml2go - Kubernetes client-go code generator for resource YAMLs
- KUBERNETES INSTANCE CALCULATOR - calculator that helps you find suitable instances for a Kubernetes cluster.
- Kubernetes Spec Explorer - Reference Guide and Documentation
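- Makes it easy to look up documentation for every Kubernetes resource, property, type, and example
- Tree view with the schema, type, and description of every resource
- Shows the change history (properties added/removed/modified) since version X
- Supports every version since 1.25, including the newly released 1.32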
References
- [1] Getting_Started_with_Kubernetes_-_What_is_Kubernetes.pdf