NumPy

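NumPy (pronounced "num-pie") is a Python library that makes it easy to work with matrices and, more generally, large multidimensional arrays. Beyond the data structures, NumPy also provides efficiently implemented functions for numerical computation.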

NumPy APIs

numpy
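numpy.reshape - changes the shape of an array without changing its data
numpy.expand_dims - adds a dimension (e.g. np.expand_dims(a, axis=0))
numpy.newaxis - used when indexing to insert a new axis (e.g. a[:, np.newaxis])
numpy.arange - similar to Python's range, but returns an array (e.g. np.arange(24))
numpy.ndarray.astype - converts an array to another dtype; this is an ndarray method (e.g. a.astype(np.float32))
numpy.argwhere - returns the indices of the non-zero elements (e.g. np.argwhere(arr >= 1))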
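
A minimal sketch of the calls listed above, applied to one small array. The variable names are only illustrative.

```python
import numpy as np

a = np.arange(24)              # 0..23, shape (24,)
b = np.reshape(a, (2, 3, 4))   # same data, shape (2, 3, 4)
c = np.expand_dims(b, axis=0)  # shape (1, 2, 3, 4)
d = a[:, np.newaxis]           # shape (24, 1)
e = a.astype(np.float32)       # converted copy with dtype float32
f = np.argwhere(b >= 22)       # one index row per matching element

print(b.shape, c.shape, d.shape, e.dtype)
print(f)
```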

Array

One- and two-dimensional arrays are created and used as follows.

import numpy as np

a = np.array([1, 2, 3])  # Create a rank 1 array
print(type(a))           # Prints "<class 'numpy.ndarray'>"
print(a.shape)           # Prints "(3,)"
print(a[0], a[1], a[2])  # Prints "1 2 3"

b = np.array([[1,2,3],[4,5,6]])    # Create a rank 2 array
print(b.shape)                     # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"

Arrays can also be created in the following ways.

import numpy as np
a = np.zeros((2,2))  # Create an array of all zeros
b = np.ones((1,2))   # Create an array of all ones
c = np.full((2,2), 7) # Create a constant array
d = np.eye(2)        # Create a 2x2 identity matrix
e = np.random.random((2,2)) # Create an array filled with random values

Total number of elements:

>>> x = np.zeros((3, 5, 2), dtype=np.complex128)
>>> x.size
30
>>> np.prod(x.shape)
30

Return the indices of the elements that are non-zero:

https://www.numpy.org/devdocs/reference/generated/numpy.nonzero.html

>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a > 3
array([[False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])
>>> np.nonzero(a > 3)
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
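
As a small sketch of how the tuple returned by np.nonzero can be used: it indexes the original array directly, and it carries the same information as np.argwhere, only grouped per axis instead of per element.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

rows, cols = np.nonzero(a > 3)   # one index array per axis
values = a[rows, cols]           # the matching elements, in row-major order

pairs = np.argwhere(a > 3)       # shape (6, 2): one (row, col) pair per element
same = np.array_equal(pairs, np.column_stack((rows, cols)))

print(values)
print(same)
```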

Array slicing

Slicing works as follows.

import numpy as np
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
b = a[:2, 1:3]
# [[2 3]
#  [6 7]]

row_r1 = a[1, :] # Rank 1 view of the second row of a: # Prints "[5 6 7 8] (4,)"
row_r2 = a[1:2, :] # Rank 2 view of the second row of a: # Prints "[[5 6 7 8]] (1, 4)"

col_r1 = a[:, 1] # Rank 1: [ 2,  6, 10]
col_r2 = a[:, 1:2] # Rank 2: [[ 2], [ 6], [10]]
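
The comments above call the slices "views". A short sketch of what that implies: writing through a slice changes the original array, while an explicit copy does not.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

b = a[:2, 1:3]         # a view: shares memory with a
b[0, 0] = 77           # also changes a[0, 1]

c = a[:2, 1:3].copy()  # an independent copy
c[0, 0] = -1           # a is not affected

print(a[0, 1])                  # 77
print(np.shares_memory(a, b))   # True
print(np.shares_memory(a, c))   # False
```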

Array Indexing

import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
a[[0, 1, 0]] # [[1, 2], [3, 4], [1, 2]]
a[[0, 1, 0], [0, 1, 0]] # [1, 4, 1]
# a[[0, 1, 0], [0, 1]] # ERROR!
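
The last line fails because the two index lists have lengths 3 and 2, which cannot be broadcast together. A common use of paired index arrays is picking one element per row, sketched here with an illustrative `cols` array:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
cols = np.array([0, 1, 0])         # column to pick in each row

picked = a[np.arange(3), cols]     # elements (0,0), (1,1), (2,0)
print(picked)                      # [1 4 5]

a[np.arange(3), cols] += 10        # the same pairs can be assigned to
print(a)
```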

Zero-copy Buffer

Copy Memory-mapped file into CuPy array · Issue #3431 · cupy/cupy

That's exactly right. Zero-copy is the most common reason to use mmap. Another way to wrap a mmap with a NumPy array is to do this:

mm = mmap.mmap(...)
arr = np.ndarray(..., buffer=mm, ...)

numpy.frombuffer also appears to work.
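
A minimal sketch of both approaches. To stay self-contained it uses an anonymous 16-byte mapping (an assumption made for this example) rather than a real memory-mapped file.

```python
import mmap

import numpy as np

# Anonymous 16-byte mapping, no backing file (assumption for this sketch).
mm = mmap.mmap(-1, 16)

# Wrap the mapping without copying: 4 x int32 = 16 bytes.
arr = np.ndarray((4,), dtype=np.int32, buffer=mm)
arr[:] = [1, 2, 3, 4]              # writes go straight into the mapping

# np.frombuffer gives a second zero-copy array over the same bytes.
view = np.frombuffer(mm, dtype=np.int32)

print(view)                        # [1 2 3 4]
print(np.shares_memory(arr, view)) # True
```

The mapping has to stay open for as long as arrays refer to it.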

Add axis

NumPy 배열에 축 추가하기 (adding axis to NumPy Array) : np.newaxis, np.tile

Add a new axis of length 1 by indexing:

import numpy as np
a = np.array([1., 2., 3., 4.])
a[:, np.newaxis]

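Repeat an array with np.tile. Note that np.tile(a, 2) repeats along the last axis and yields shape (2, 8); a new axis appears only when reps has more entries than a.ndim, e.g. np.tile(a, (3, 1, 1)):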

import numpy as np
a = np.arange(8).reshape(2, 4)
b = np.tile(a, 2)
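
The shape that np.tile produces depends on `reps`. A short sketch comparing a scalar `reps` with a longer tuple; the np.repeat line is an alternative added here for comparison.

```python
import numpy as np

a = np.arange(8).reshape(2, 4)

b = np.tile(a, 2)                             # repeats along the last axis
c = np.tile(a, (3, 1, 1))                     # reps longer than a.ndim: new leading axis
d = np.repeat(a[np.newaxis, ...], 3, axis=0)  # same result as c

print(b.shape)  # (2, 8)
print(c.shape)  # (3, 2, 4)
```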

Finding the max/min

(Numpy) 최소값(min) 최대값(max).. : 네이버블로그

>>> import numpy as np
>>> n1 = np.array([[0,0,1,2,3],
...                [5,6,7,0,0]])
>>> n1.max()
7
>>> n1.min()
0
>>> np.max(n1, axis=0)
array([5, 6, 7, 2, 3])
>>> np.max(n1, axis=1)
array([3, 7])
>>> np.min(n1, axis=0)
array([0, 0, 1, 0, 0])
>>> np.min(n1, axis=1)
array([0, 0])

min and max with np.apply_along_axis

Applies a function to the 1-D slices along the given axis.

>>> import numpy as np
>>> n1 = np.array([[0,0,1,2,3],
...                [5,6,7,0,0]])
>>> np.apply_along_axis(lambda a: np.min(a), 1, n1)
array([0, 0])
>>> np.apply_along_axis(lambda a: np.min(a), 0, n1)
array([0, 0, 1, 0, 0])
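
np.apply_along_axis is mostly useful for functions that have no built-in axis argument. A sketch with a hypothetical helper `value_range`, next to the equivalent vectorized expression (apply_along_axis calls the function once per slice, so the built-in reductions are usually faster):

```python
import numpy as np

n1 = np.array([[0, 0, 1, 2, 3],
               [5, 6, 7, 0, 0]])

def value_range(row):
    # Hypothetical helper: spread between largest and smallest value.
    return row.max() - row.min()

ranges = np.apply_along_axis(value_range, 1, n1)  # one value per row
vectorized = n1.max(axis=1) - n1.min(axis=1)      # same result, no Python loop

print(ranges)  # [3 7]
```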

Finding the ROI rect of a mask

import numpy as np

def find_roi(array):
    # Find the coordinates of all True (non-zero) elements.
    true_indices = np.argwhere(array)

    # Without any True element, no ROI can be defined.
    if true_indices.size == 0:
        return None

    # Find the minimum and maximum coordinate along each axis.
    y_min, x_min = true_indices.min(axis=0)
    y_max, x_max = true_indices.max(axis=0)

    # Return the ROI as (x1, y1, x2, y2); the max coordinates are inclusive.
    return x_min, y_min, x_max, y_max

# Example boolean-like array
arr = np.array([[0, 0, 0, 0, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 0, 0, 0, 0]])

roi = find_roi(arr)
print(f"ROI: {roi}")
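
Because the ROI uses inclusive max coordinates, cropping with it needs a `+ 1` on the slice ends. A self-contained sketch of the same idea, written with np.nonzero instead of the function above:

```python
import numpy as np

mask = np.zeros((6, 6), dtype=bool)
mask[1:5, 2:4] = True                # same pattern as the example array

ys, xs = np.nonzero(mask)            # row and column indices of True cells
x1, y1, x2, y2 = xs.min(), ys.min(), xs.max(), ys.max()

# Max coordinates are inclusive, so add 1 when slicing.
crop = mask[y1:y2 + 1, x1:x2 + 1]

print((int(x1), int(y1), int(x2), int(y2)))  # (2, 1, 3, 4)
print(crop.shape)                            # (4, 2)
```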

Notes on the axis argument

import numpy as np
arr = np.array([[0, 0, 0, 0, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [0, 0, 0, 0, 0, 0]])
idx = np.argwhere(arr)

print(idx)
"""
array([[1, 2],
       [1, 3],
       [2, 2],
       [2, 3],
       [3, 2],
       [3, 3],
       [4, 2],
       [4, 3]])
"""

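idx.min() returns the minimum over all elements.
idx.min(axis=0) - axis=0 reduces along the rows (downwards), so it returns the minimum of each column.
idx.min(axis=1) - axis=1 reduces along the columns (across), so it returns the minimum of each row.

This can be pictured as follows:

     col1 col2
        |  |
array([[1, 2],   - row1
       [1, 3],   - row2
       [2, 2],   - row3
       [2, 3],   - row4
       [3, 2],   - row5
       [3, 3],   - row6
       [4, 2],   - row7
       [4, 3]])  - row8

idx.min() is 1.
idx.min(axis=0) has one entry per column (2 columns), so it is array([1, 2]).
idx.min(axis=1) has one entry per row (8 rows), so it is array([1, 1, 2, 2, 2, 3, 2, 3]).

With an axis argument, the result always has one dimension fewer than the input.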
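
The three calls can be checked directly. This sketch rebuilds the same index array by hand and also shows `keepdims=True`, which keeps the reduced axis with length 1:

```python
import numpy as np

idx = np.array([[1, 2], [1, 3], [2, 2], [2, 3],
                [3, 2], [3, 3], [4, 2], [4, 3]])

overall = idx.min()                    # scalar: minimum of all elements
per_col = idx.min(axis=0)              # shape (2,): minimum of each column
per_row = idx.min(axis=1)              # shape (8,): minimum of each row
kept = idx.min(axis=1, keepdims=True)  # shape (8, 1): reduced axis kept

print(overall, per_col, per_row, kept.shape)
```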

Notes on the axis argument in 3 dimensions

import numpy as np
arr = np.array([[[1,4,3], [2,1,4]],
                [[1,4,3], [3,1,4]],
                [[4,3,2], [2,4,1]],
                [[4,3,2], [3,4,1]]])

The shape is (4, 2, 3), so the array can be read as an image with H=4, W=2, C=3 (RGB).

As in the 2-D case, each reduction removes exactly one dimension.

arr.min(axis=0) - axis=0 is H. Removing H leaves shape (W=2, C=3): for each (W, C) position, the min is taken across the 4 rows.
arr.min(axis=1) - axis=1 is W. Removing W leaves shape (H=4, C=3): for each row and channel, the min is taken across the 2 columns.
arr.min(axis=2) - axis=2 is C. Removing C leaves shape (H=4, W=2): for each pixel, the min is taken across its 3 channels, so the channels collapse and the result looks like a grayscale image.

The result of arr.min(axis=0):

          C1 C2 C3   C4 C5 C6
          /  /  /    /  /  /
array([[[1, 4, 3], [2, 1, 4]],
          C1 C2 C3   C4 C5 C6
          /  /  /    /  /  /
       [[1, 4, 3], [3, 1, 4]],
          C1 C2 C3   C4 C5 C6
          /  /  /    /  /  /
       [[4, 3, 2], [2, 4, 1]],
          C1 C2 C3   C4 C5 C6
          /  /  /    /  /  /
       [[4, 3, 2], [3, 4, 1]]])

Taking the min over all entries that share the same C{N} label gives array([[1, 3, 2], [2, 1, 1]]).

The result of arr.min(axis=1):

          C1 C2 C3
          /  /  /
array([[[1, 4, 3],
        [2, 1, 4]],  ~ H1 group
          \  \  \
          C1 C2 C3

          C1 C2 C3
          /  /  /
       [[1, 4, 3],
        [3, 1, 4]],  ~ H2 group
          \  \  \
          C1 C2 C3

          C1 C2 C3
          /  /  /
       [[4, 3, 2],
        [2, 4, 1]],  ~ H3 group
          \  \  \
          C1 C2 C3

          C1 C2 C3
          /  /  /
       [[4, 3, 2],
        [3, 4, 1]]]) ~ H4 group
          \  \  \
          C1 C2 C3

Within each H{N} group, taking the min per C{N} gives array([[1, 1, 3], [1, 1, 3], [2, 3, 1], [3, 3, 1]]).

The result of arr.min(axis=2):

array([[[1, 4, 3],   ~ W1
        [2, 1, 4]],  ~ W2
       [[1, 4, 3],   ~ W3
        [3, 1, 4]],  ~ W4
       [[4, 3, 2],   ~ W5
        [2, 4, 1]],  ~ W6
       [[4, 3, 2],   ~ W7
        [3, 4, 1]]]) ~ W8

Taking the min within each W{N} line (one pixel each) gives array([[1, 1], [1, 1], [2, 1], [2, 1]]).
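
A short check of the three reductions on the same array. The last line is an addition for illustration: passing a tuple of axes reduces H and W in one call.

```python
import numpy as np

arr = np.array([[[1, 4, 3], [2, 1, 4]],
                [[1, 4, 3], [3, 1, 4]],
                [[4, 3, 2], [2, 4, 1]],
                [[4, 3, 2], [3, 4, 1]]])

# Result shape for each single-axis reduction.
shapes = {axis: arr.min(axis=axis).shape for axis in range(3)}
print(shapes)  # {0: (2, 3), 1: (4, 3), 2: (4, 2)}

# Reduce H and W together: one minimum per channel.
per_channel = arr.min(axis=(0, 1))
print(per_channel)
```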

Datatypes

import numpy as np
x = np.array([1, 2])
print(x.dtype)        # Prints "int64" on most 64-bit platforms

x = np.array([1, 2], dtype=np.int64)  # Force a particular datatype
print(x.dtype)        # Prints "int64"
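
The dtype also determines how values are converted and how arithmetic behaves. A small sketch of two effects that are easy to miss: float-to-int conversion truncates, and uint8 array arithmetic wraps around.

```python
import numpy as np

f = np.array([1.7, -2.2, 3.9])
i = f.astype(np.int64)             # truncates toward zero

u = np.array([250, 255], dtype=np.uint8)
wrapped = u + np.uint8(10)         # stays uint8, wraps modulo 256
widened = u.astype(np.int32) + 10  # widen first to avoid the wrap

print(i)        # [ 1 -2  3]
print(wrapped)  # [4 9]
print(widened)  # [260 265]
```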

Random data

Semantic segmentation with OpenCV and deep learning

Random colors

# load the class label names
CLASSES = open(args["classes"]).read().strip().split("\n")

# if a colors file was supplied, load it from disk
if args["colors"]:
    COLORS = open(args["colors"]).read().strip().split("\n")
    COLORS = [np.array(c.split(",")).astype("int") for c in COLORS]
    COLORS = np.array(COLORS, dtype="uint8")

# otherwise, we need to randomly generate RGB colors for each class
# label
else:
    # initialize a list of colors to represent each class label in
    # the mask (starting with 'black' for the background/unlabeled
    # regions)
    np.random.seed(42)
    COLORS = np.random.randint(0, 255, size=(len(CLASSES) - 1, 3),
        dtype="uint8")
    COLORS = np.vstack([[0, 0, 0], COLORS]).astype("uint8")

RGB888 image filled with a specific color

import numpy as np

def make_rgb888(width: int, height: int, r=0, g=0, b=0) -> np.ndarray:
    return np.full(
        shape=(height, width, 3),
        fill_value=(r, g, b),
        dtype=np.uint8,  # 8-bit unsigned, so channel values 0..255 fit
    )

Valid RGB888 Image

from numpy import ndarray

DTYPE_KIND_BOOLEAN = "b"
DTYPE_KIND_SIGNED_INTEGER = "i"
DTYPE_KIND_UNSIGNED_INTEGER = "u"
DTYPE_KIND_FLOATING_POINT = "f"
DTYPE_KIND_COMPLEX_FLOATING_POINT = "c"
DTYPE_KIND_TIMEDELTA = "m"
DTYPE_KIND_DATETIME = "M"
DTYPE_KIND_OBJECT = "O"
DTYPE_KIND_BYTE_STRING = "S"
DTYPE_KIND_UNICODE = "U"
DTYPE_KIND_VOID = "V"


def valid_rgb888(image: ndarray) -> None:
    dimension = len(image.shape)
    if dimension != 3:
        raise ValueError(f"The `shape` of `image` must be 3. (vs {dimension})")

    h, w, c = image.shape
    assert h >= 1
    assert w >= 1
    assert c >= 1

    if c != 3:
        raise ValueError(f"The `channel` must be 3. (vs {c})")

    kind = image.dtype.kind
    if kind not in [DTYPE_KIND_SIGNED_INTEGER, DTYPE_KIND_UNSIGNED_INTEGER]:
        raise ValueError(
            f"The `kind` must be '{DTYPE_KIND_SIGNED_INTEGER}'"
            f" or '{DTYPE_KIND_UNSIGNED_INTEGER}'. (vs {kind})"
        )

    item_size = image.dtype.itemsize
    if item_size != 1:
        raise ValueError(f"The `item-size` must be 1. (vs {item_size})")

Creating a 1-channel mask from a 3-channel image using a color key

from typing import Final, Tuple

from numpy import bool_, uint8, where
from numpy.typing import NDArray

BLACK_COLOR: Final[Tuple[int, int, int]] = (0, 0, 0)
DEFAULT_CHROMA_COLOR: Final[Tuple[int, int, int]] = BLACK_COLOR
CHANNEL_MIN: Final[int] = 0
CHANNEL_MAX: Final[int] = 255

def generate_mask(
    image: NDArray[uint8],
    chroma_color=DEFAULT_CHROMA_COLOR,
) -> NDArray[uint8]:
    assert image.dtype == uint8
    assert len(image.shape) == 3
    assert image.shape[-1] == 3

    # Compare every channel with the chroma color: shape (H, W, 3).
    channels_cmp: NDArray[bool_] = image == chroma_color
    # A pixel matches only if all three channels match: shape (H, W, 1).
    pixel_cmp: NDArray[bool_] = channels_cmp.all(axis=-1, keepdims=True)
    # where() returns the default integer dtype here, so cast back to uint8.
    return where(pixel_cmp, CHANNEL_MIN, CHANNEL_MAX).astype(uint8)

Merging a 3-channel image and a 1-channel mask into a 4-channel (BGRA) image

from typing import Final, Optional, Tuple

from numpy import concatenate, ndarray, uint8, where
from numpy.typing import NDArray

def merge_to_bgra32(image: NDArray[uint8], mask: NDArray[uint8]) -> NDArray[uint8]:
    assert image.dtype == uint8
    assert len(image.shape) == 3
    assert image.shape[-1] == 3

    assert mask.dtype == uint8
    assert len(mask.shape) == 3
    assert mask.shape[-1] == 1

    return concatenate((image, mask), axis=-1)
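
A compact, self-contained sketch of the two steps on a tiny 2x2 image. It inlines the logic instead of calling the functions above, and it casts the mask to uint8 explicitly so that the dtypes match when concatenating.

```python
import numpy as np

# 2x2 BGR image: only the top-left pixel equals the chroma key (black).
image = np.array([[[0, 0, 0], [10, 20, 30]],
                  [[0, 5, 0], [255, 255, 255]]], dtype=np.uint8)

# 1-channel mask: 0 where the pixel equals the key, 255 elsewhere.
is_key = (image == (0, 0, 0)).all(axis=-1, keepdims=True)
mask = np.where(is_key, 0, 255).astype(np.uint8)   # shape (2, 2, 1)

# Append the mask as the alpha channel.
bgra = np.concatenate((image, mask), axis=-1)      # shape (2, 2, 4)

print(mask[..., 0])
print(bgra.shape, bgra.dtype)
```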

Boolean array indexing

import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
bool_idx = (a > 2) # [[False, False], [ True,  True], [ True,  True]]
bool_idx.dtype # bool
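
A boolean array of this kind is typically used to select, count, or overwrite the matching elements. A minimal sketch:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

selected = a[a > 2]          # 1-D array of the matching elements
count = int((a > 2).sum())   # True counts as 1

b = a.copy()
b[b > 2] = 0                 # assign through the mask

print(selected)  # [3 4 5 6]
print(count)     # 4
print(b)
```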

Select all non-black pixels in a NumPy array

Stackoverflow - How to select all non-black pixels in a NumPy array?

You should use np.any instead of np.all for the second case of selecting all but black pixels:

np.any(image != [0, 0, 0], axis=-1)

Or simply get a complement of black pixels by inverting a boolean array by ~:

black_pixels_mask = np.all(image == [0, 0, 0], axis=-1)
non_black_pixels_mask = ~black_pixels_mask

Working example:

import numpy as np
import matplotlib.pyplot as plt

image = plt.imread('example.png')
plt.imshow(image)
plt.show()

image_copy = image.copy()

black_pixels_mask = np.all(image == [0, 0, 0], axis=-1)

non_black_pixels_mask = np.any(image != [0, 0, 0], axis=-1)  
# or non_black_pixels_mask = ~black_pixels_mask

image_copy[black_pixels_mask] = [255, 255, 255]
image_copy[non_black_pixels_mask] = [0, 0, 0]

plt.imshow(image_copy)
plt.show()

Reversing the image RGB/BGR channel order

Numpy / OpenCV image BGR to RGB | Scientific Computing | SciVision

Conversion between any/all of BGR, RGB, and GBR may be necessary when working with

Matplotlib pyplot.imshow(): M x N x 3 image, where last dimension is RGB.
OpenCV imshow(): M x N x 3 image, where last dimension is BGR
Scientific Cameras: some output M X N x 3 image, where last dimension is GBR

BGR to RGB - OpenCV image to Matplotlib: rgb = bgr[...,::-1].copy()

RGB to BGR - Matplotlib image to OpenCV: bgr = rgb[...,::-1].copy()

RGB to GBR: gbr = rgb[...,[1,2,0]].copy()

The axis order convention for Python images:

3-D: H x W x 3, where the last axis is color (e.g. RGB)
4-D: H x W x 3 x 1, where the last axis is typically an alpha channel
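
A short sketch of the BGR/RGB reversal. It also shows why the snippets above end with `.copy()`: the reversed slice is a non-contiguous view, and the copy turns it into a regular contiguous array.

```python
import numpy as np

rgb = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)  # tiny H x W x 3 image

view = rgb[..., ::-1]   # reversed channel order, no data copied
bgr = view.copy()       # contiguous array with its own memory

print(rgb[0, 0], bgr[0, 0])     # [0 1 2] [2 1 0]
print(view.flags.c_contiguous)  # False
print(bgr.flags.c_contiguous)   # True
```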

Converting an image to grayscale

Create a random 3-channel image of width 2 and height 3:

import numpy as np
width = 2
height = 3
bgr_image = np.random.randint(0, 256, (height, width, 3), dtype=np.uint8)
bgr_image

Result:

array([[[177, 140,  52],
        [ 35, 179, 111]],

       [[149, 196, 241],
        [  0,  64, 236]],

       [[188, 246,  74],
        [185, 129,  10]]], dtype=uint8)

Create a 1-channel grayscale image with mean:

bgr_image.mean(axis=2)

Result:

array([[123.        , 108.33333333],
       [195.33333333, 100.        ],
       [169.33333333, 108.        ]])

Use np.stack to stack (bgr_image.mean(axis=2),) * 3 along the channel axis (axis=2; axis=-1 gives the same result). The final astype(np.uint8) truncates the fractional part:

np.stack((bgr_image.mean(axis=2),) * 3, axis=2).astype(dtype=np.uint8)

Result:

array([[[123, 123, 123],
        [108, 108, 108]],

       [[195, 195, 195],
        [100, 100, 100]],

       [[169, 169, 169],
        [108, 108, 108]]], dtype=uint8)

Serialize and Deserialize

Python: Serialize and Deserialize Numpy 2D arrays

import numpy as np
as_array = np.array([ [1,2,3], [4,5,6], [7,8,9] ])
array_data_type = as_array.dtype.name
array_shape = as_array.shape
as_bytes = as_array.tobytes()
result = np.frombuffer(as_bytes, dtype = array_data_type).reshape(array_shape)
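
With tobytes/frombuffer the dtype and shape have to be carried separately, as above. An alternative sketch, not from the linked article, uses np.save and np.load on an in-memory buffer; the .npy format stores dtype and shape in its header.

```python
import io

import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)

buffer = io.BytesIO()
np.save(buffer, original)     # writes header (dtype, shape) plus data
payload = buffer.getvalue()   # plain bytes, ready to store or send

restored = np.load(io.BytesIO(payload))

print(restored.dtype, restored.shape)
```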

stack

numpy.stack — NumPy v1.19 Manual

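Joins a sequence of arrays along a new axis.

The axis parameter specifies the index of the new axis in the dimensions of the result. For example, with axis=0 it becomes the first dimension, and with axis=-1 it becomes the last dimension: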

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6], [7, 8]])
>>> np.stack([a, b])
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])
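
A small sketch of how the `axis` choice changes the layout, with np.concatenate shown for contrast since it joins along an existing axis instead of creating one:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

first = np.stack([a, b], axis=0)      # first[0] is a, first[1] is b
last = np.stack([a, b], axis=-1)      # last[i, j] is [a[i, j], b[i, j]]
cat = np.concatenate([a, b], axis=0)  # shape (4, 2), no new axis

print(first.shape, last.shape, cat.shape)
print(last[0, 0])  # [1 5]
```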

Example

import numpy as np
## ..
# Example inputs (assumed for illustration; the original snippet elides them).
x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)
v = np.array([9, 10])
w = np.array([11, 12])

print(x + y)
print(np.add(x, y))

print(x - y)
print(np.subtract(x, y))

print(x * y)
print(np.multiply(x, y))

print(x / y)
print(np.divide(x, y))

print(np.sqrt(x))

## Use dot for the inner product.
print(v.dot(w))
print(np.dot(v, w))

## Sums are computed as follows.
x = np.array([[1,2],[3,4]])
print(np.sum(x))  # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1))  # Compute sum of each row; prints "[3 7]"

## Transpose (swap the axes) as follows.
x = np.array([[1,2], [3,4]])
print(x.T)  # Prints "[[1 3], [2 4]]"

## Create an array with the same shape and dtype (contents are uninitialized).
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
y = np.empty_like(x)

## Build another array by repeating one value (or array).
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1))  # 4 stacked copies of v, shape (4, 3)

Numpy generate data from linear function

Random data can be sampled around a linear function.


x = np.arange(100)
delta = np.random.uniform(-10,10, size=(100,))
y = .4 * x +3 + delta
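
As a sanity check, a least-squares line fitted to such data should come out close to the slope 0.4 and intercept 3 used to generate it. A sketch using np.polyfit; the seeded generator is an assumption added for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (assumption for this sketch)

x = np.arange(100)
delta = rng.uniform(-10, 10, size=100)
y = 0.4 * x + 3 + delta

# Fit a degree-1 polynomial; coefficients come highest degree first.
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)
```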

Mean, variance, standard deviation

numpy.mean(arr) # mean
numpy.var(arr) # variance
numpy.std(arr) # standard deviation
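
By default np.var and np.std compute the population statistics (ddof=0). A small sketch with concrete numbers, including the sample variance via ddof=1:

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0, 4.0])

mean = np.mean(arr)               # 2.5
var = np.var(arr)                 # population variance: 1.25
std = np.std(arr)                 # square root of the population variance
sample_var = np.var(arr, ddof=1)  # sample variance: 5/3

print(mean, var, std, sample_var)
```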

Mean over multiple dimensions

Stackoverflow - How to find the average colour of an image in Python with OpenCV?

average = img.mean(axis=0).mean(axis=0)
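
The chained call can also be written as a single reduction over a tuple of axes. A sketch on a small synthetic image (the array below is made up for the example):

```python
import numpy as np

img = np.arange(24, dtype=np.float64).reshape(2, 4, 3)  # H=2, W=4, C=3

chained = img.mean(axis=0).mean(axis=0)  # average over H, then over W
direct = img.mean(axis=(0, 1))           # both axes in one call

print(direct)  # one average per channel
```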

Reducing dimensions (flatten)

from numpy import array
a = array([[1], [2], [3]])
a.flatten()
print(a.flatten())
## Output: "[1 2 3]"
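
flatten always returns a copy. When a copy is not needed, ravel or reshape(-1) return a view where possible, as this sketch shows:

```python
import numpy as np

a = np.array([[1], [2], [3]])

flat = a.flatten()   # always a new copy
rav = a.ravel()      # a view here, since a is contiguous
resh = a.reshape(-1) # same idea as ravel

print(np.shares_memory(a, flat))  # False
print(np.shares_memory(a, rav))   # True
```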

Readonly (Lock; Freeze)

import numpy as np

a = np.zeros(11)
print("Before any change")
print(a)

a[1] = 2
print("After first change")
print(a)

a.setflags(write=False)
print("After making the array read-only, attempting a second change")
a[1] = 7  # Raises "ValueError: assignment destination is read-only"
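
A sketch of handling the failed write: the assignment raises ValueError, the array is left unchanged, and a copy is writable again.

```python
import numpy as np

a = np.zeros(3)
a.setflags(write=False)

try:
    a[1] = 7
    raised = False
except ValueError:
    raised = True     # assignment to a read-only array is rejected

b = a.copy()          # the copy gets its own, writable memory
b[1] = 7

print(raised, a, b)
```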

Image processing with numpy

Image processing with numpy | Python informer (uses PIL)

Type Hint

Stackoverflow - Type hinting / annotation (PEP 484) for numpy.ndarray
Type hinting / annotation (PEP 484) for ndarray, dtype, and ufunc #7370
Using a library: nptyping - Type hints for Numpy
Typing (numpy.typing) — NumPy v1.22.dev0 Manual

from typing import TypeVar, Generic, Tuple, Union, Optional
import numpy as np

Shape = TypeVar("Shape")
DType = TypeVar("DType")


class Array(np.ndarray, Generic[Shape, DType]):
    """
    Use this to type-annotate numpy arrays, e.g.

        def transform_image(image: Array['H,W,3', np.uint8], ...):
            ...

    """
    pass


def func(arr: Array['N,2', int]):
    return arr*2


print(func(arr = np.array([(1, 2), (3, 4)])))
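
Recent NumPy versions also ship numpy.typing.NDArray, which annotates the dtype (but not the shape) without a custom class. A minimal sketch; `to_float` is a hypothetical helper used only for illustration.

```python
import numpy as np
import numpy.typing as npt


def to_float(image: npt.NDArray[np.uint8]) -> npt.NDArray[np.float32]:
    """Hypothetical helper: scale uint8 pixel values to the range [0, 1]."""
    return image.astype(np.float32) / np.float32(255)


out = to_float(np.array([[0, 255]], dtype=np.uint8))
print(out.dtype, out)
```

The annotations are checked only by static type checkers; nothing is enforced at runtime.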

C Extension

[Recommended] How to extend NumPy
Stackoverflow: Pointer-type mismatch with PyArray_SimpleNew
Numpy - C (Simple example)

Required

Before calling any NumPy C API function, the following function must be called.

#include <numpy/ndarrayobject.h>
// ...
import_array();

NumPy style arrays for C++

Stackoverflow - NumPy style arrays for C++?

Here are several free software that may suit your needs.

The GNU Scientific Library is a GPL software written in C. Thus, it has a C-like allocation and way of programming (pointers, etc.). With the GSLwrap, you can have a C++ way of programming, while still using the GSL. GSL has a BLAS implementation, but you can use ATLAS instead of the default CBLAS, if you want even more performances.
The boost/uBLAS library is a BSL library, written in C++ and distributed as a boost package. It is a C++-way of implementing the BLAS standard. uBLAS comes with a few linear algebra functions, and there is an experimental binding to ATLAS.
eigen is a linear algebra library written in C++, distributed under the LGPL3 (or GPL2). It's a C++ way of programming, but more integrated than the two others (more algorithms and data structures are available). Eigen claim to be faster than the BLAS implementations above, while not following the de-facto standard BLAS API. Eigen does not seem to put a lot of effort on parallel implementation.
Armadillo is LGPL3 library for C++. It has binding for LAPACK (the library used by numpy). It uses recursive templates and template meta-programming, which is a good point (I don't know if other libraries are doing it also?).
xtensor is a C++ library that is BSD licensed. It offers A C++ API very similar to that of NumPy. See https://xtensor.readthedocs.io/en/latest/numpy.html for a cheat sheet.

These alternatives are really good if you just want to get data structures and basic linear algebra. Depending on your taste about style, license or sysadmin challenges (installing big libraries like LAPACK may be difficult), you may choose the one that best suits your needs.

Troubleshooting

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

The following error occurred while using something like PandasAI:

Traceback (most recent call last):
  File "/home/yourid/Project/steel-reports/main.py", line 16, in <module>
    sys.exit(main())
  File "/home/yourid/Project/steel-reports/steelreports/entrypoint.py", line 60, in main
    return run_default_app(*files, debug=debug, verbose=verbose)
  File "/home/yourid/Project/steel-reports/steelreports/apps/__init__.py", line 7, in run_default_app
    from steelreports.apps.default import DefaultApp
  File "/home/yourid/Project/steel-reports/steelreports/apps/default.py", line 3, in <module>
    from pandasai import SmartDataframe
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandasai/__init__.py", line 6, in <module>
    from pandasai.smart_dataframe import SmartDataframe
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandasai/smart_dataframe/__init__.py", line 26, in <module>
    import pandasai.pandas as pd
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandasai/pandas/__init__.py", line 13, in <module>
    from pandas import *
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandas/__init__.py", line 22, in <module>
    from pandas.compat import is_numpy_dev as _is_numpy_dev  # pyright: ignore # noqa:F401
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandas/compat/__init__.py", line 18, in <module>
    from pandas.compat.numpy import (
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandas/compat/numpy/__init__.py", line 4, in <module>
    from pandas.util.version import Version
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandas/util/__init__.py", line 2, in <module>
    from pandas.util._decorators import (  # noqa:F401
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandas/util/_decorators.py", line 14, in <module>
    from pandas._libs.properties import cache_readonly
  File "/home/yourid/Project/steel-reports/.venv/lib/python3.9/site-packages/pandas/_libs/__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

This is a NumPy version problem. If you are using 2.x, downgrade to 1.x. (As of 2024-12-03, numpy 1.26.4 is the latest 1.x release.)

Favorite site

References

Taewan.kim_-_numpy_cheat_sheet.pdf