Question

I am trying to get the unique counts per column, but my array holds categorical variables (dtype object):

val, count = np.unique(x, axis=1, return_counts=True)

However, I get this error:

TypeError: The axis argument to unique is not supported for dtype object

How can I solve this?

Sample x:

array([[' Private', ' HS-grad', ' Divorced'],
       [' Private', ' 11th', ' Married-civ-spouse'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Married-civ-spouse'],
       [' Private', ' 9th', ' Married-spouse-absent'],
       [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Never-married'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)

The counts I need are the following:

for x_T in x.T:
    val, count = np.unique(x_T, return_counts=True)
    print(val, count)


[' Private' ' Self-emp-not-inc'] [8 1]
[' 11th' ' 9th' ' Bachelors' ' HS-grad' ' Masters' ' Some-college'] [1 1 2 2 2 1]
[' Divorced' ' Married-civ-spouse' ' Married-spouse-absent'
 ' Never-married'] [1 6 1 1]

Answers

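Although the output does not look like yours (the counts are pooled over the whole array rather than reported per column), you can use itemfreq, which gives the required counts: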

import numpy as np
from scipy.stats import itemfreq

x = np.array([[' Private', ' HS-grad', ' Divorced'],
       [' Private', ' 11th', ' Married-civ-spouse'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Married-civ-spouse'],
       [' Private', ' 9th', ' Married-spouse-absent'],
       [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Never-married'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)

itemfreq(x)

Output:

array([[' 11th', 1],
       [' 9th', 1],
       [' Bachelors', 2],
       [' Divorced', 1],
       [' HS-grad', 2],
       [' Married-civ-spouse', 6],
       [' Married-spouse-absent', 1],
       [' Masters', 2],
       [' Never-married', 1],
       [' Private', 8],
       [' Self-emp-not-inc', 1],
       [' Some-college', 1]], dtype=object)
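
To my knowledge, itemfreq was deprecated in SciPy 1.0 and is no longer available in recent releases. A minimal sketch of the same pooled counts using only NumPy, assuming the sample array from the question (the name freq is introduced here for illustration): when no axis is given, np.unique flattens the array first, which is allowed for dtype object.

```python
import numpy as np

x = np.array([[' Private', ' HS-grad', ' Divorced'],
              [' Private', ' 11th', ' Married-civ-spouse'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Married-civ-spouse'],
              [' Private', ' 9th', ' Married-spouse-absent'],
              [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Never-married'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)

# Without `axis`, np.unique works on the flattened array, so the counts
# are totals over all columns (the same information itemfreq reports).
vals, counts = np.unique(x, return_counts=True)

# `freq` is an illustrative name: a plain dict mapping value -> total count.
freq = {v: int(c) for v, c in zip(vals, counts)}
for value, n in freq.items():
    print(repr(value), n)
```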

Alternatively, you could try specifying another dtype, for example:

val, count = np.unique(x.astype("<U22"), axis=1, return_counts=True)

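For this to be useful, however, your array would have to be different: with axis=1, np.unique compares whole columns and counts duplicate columns, not the distinct values within each column.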
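
The difference between the two behaviours can be sketched as follows. This is only an illustration under the assumption that the sample array from the question is used; column_counts is a hypothetical helper name, not a NumPy function. It wraps the per-column loop from the question, and the last lines show what axis=1 returns once the array has a string dtype.

```python
import numpy as np

x = np.array([[' Private', ' HS-grad', ' Divorced'],
              [' Private', ' 11th', ' Married-civ-spouse'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Married-civ-spouse'],
              [' Private', ' 9th', ' Married-spouse-absent'],
              [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Never-married'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)


def column_counts(arr):
    """Hypothetical helper: one (values, counts) pair per column of a 2-D array."""
    # Each column is 1-D, so np.unique needs no `axis` and accepts dtype object.
    return [np.unique(col, return_counts=True) for col in arr.T]


per_column = column_counts(x)
for vals, counts in per_column:
    print(vals, counts)

# With a string dtype, axis=1 is accepted, but it deduplicates whole columns.
# The three columns here are all different, so each one is counted once.
cols, col_counts = np.unique(x.astype("<U22"), axis=1, return_counts=True)
print(cols.shape, col_counts)  # (9, 3) [1 1 1]
```

The loop over x.T is cheap here because it runs once per column, not once per row, so it stays practical even for arrays with many rows.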
