How to standard scale a 3D matrix?

Question

I am working on a signal classification problem and would like to scale the dataset matrix first, but my data is in a 3D format (batch, length, channels).
I tried to use scikit-learn's StandardScaler:

    from sklearn.preprocessing import StandardScaler

    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

But I got this error message:

    Found array with dim 3. StandardScaler expected <= 2.

I think one solution would be to split the matrix by channel into multiple 2D matrices, scale them separately, and then put them back into 3D format, but I wonder if there is a better solution.
Thank you very much.

Bert Kellerman · Accepted Answer

You will have to fit and store a scaler for each channel:

    from sklearn.preprocessing import StandardScaler

    # One scaler per index along axis 1; each is fitted on a 2D slice.
    scalers = {}
    for i in range(X_train.shape[1]):
        scalers[i] = StandardScaler()
        X_train[:, i, :] = scalers[i].fit_transform(X_train[:, i, :])

    # Reuse the scalers fitted on the training data for the test data.
    for i in range(X_test.shape[1]):
        X_test[:, i, :] = scalers[i].transform(X_test[:, i, :])
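To see what this loop computes: each StandardScaler fitted on a slice X_train[:, i, :] standardizes its columns across the batch axis, so every (position, channel) pair ends up with its own mean and standard deviation. Below is a minimal NumPy-only sketch of the same arithmetic. It assumes float arrays and the population standard deviation (ddof=0) that StandardScaler uses; the function names fit_stats and apply_stats are illustrative and not part of scikit-learn.

```python
import numpy as np

def fit_stats(X_train):
    """Mean and std over the batch axis, one value per (position, channel)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)           # ddof=0, as in StandardScaler
    std = np.where(std == 0, 1.0, std)  # constant features are only centered
    return mean, std

def apply_stats(X, mean, std):
    """Standardize X with statistics learned on the training set."""
    return (X - mean) / std

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 8, 3))  # (batch, length, channels)
X_test = rng.normal(loc=5.0, scale=2.0, size=(20, 8, 3))

mean, std = fit_stats(X_train)
X_train_scaled = apply_stats(X_train, mean, std)
X_test_scaled = apply_stats(X_test, mean, std)  # reuse the training statistics

print(mean.shape)                                          # (8, 3)
print(np.allclose(X_train_scaled.mean(axis=0), 0.0))       # True
print(np.allclose(X_train_scaled.std(axis=0), 1.0))        # True
```

The test set is scaled with the training statistics on purpose, mirroring the fit_transform / transform split in the loop.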

Kilian Batzner · Answer

If you want to scale each feature differently, like StandardScaler does, you can use this:

    import numpy as np
    from sklearn.base import TransformerMixin
    from sklearn.preprocessing import StandardScaler


    class NDStandardScaler(TransformerMixin):
        def __init__(self, **kwargs):
            self._scaler = StandardScaler(copy=True, **kwargs)
            self._orig_shape = None

        def fit(self, X, **kwargs):
            X = np.array(X)
            # Save the original shape to reshape the flattened X later
            # back to its original shape
            if len(X.shape) > 1:
                self._orig_shape = X.shape[1:]
            X = self._flatten(X)
            self._scaler.fit(X, **kwargs)
            return self

        def transform(self, X, **kwargs):
            X = np.array(X)
            X = self._flatten(X)
            X = self._scaler.transform(X, **kwargs)
            X = self._reshape(X)
            return X

        def _flatten(self, X):
            # Reshape X to <= 2 dimensions
            if len(X.shape) > 2:
                n_dims = np.prod(self._orig_shape)
                X = X.reshape(-1, n_dims)
            return X

        def _reshape(self, X):
            # Reshape X back to its original shape
            if len(X.shape) >= 2:
                X = X.reshape(-1, *self._orig_shape)
            return X

It simply flattens the features of the input before passing it to sklearn's StandardScaler, then reshapes the result back to the original shape. The usage is the same as for StandardScaler:

    data = [[[0, 1], [2, 3]], [[1, 5], [2, 9]]]
    scaler = NDStandardScaler()
    print(scaler.fit_transform(data))

prints

    [[[-1. -1.]
      [ 0. -1.]]

     [[ 1.  1.]
      [ 0.  1.]]]

The arguments with_mean and with_std are passed directly to StandardScaler and therefore work as expected. copy=False won't work, because the reshaping does not happen in place. For 2D inputs, NDStandardScaler works like StandardScaler:

    data = [[0, 0], [0, 0], [1, 1], [1, 1]]
    scaler = NDStandardScaler()
    scaler.fit(data)
    print(scaler.transform(data))
    print(scaler.transform([[2, 2]]))

prints

    [[-1. -1.]
     [-1. -1.]
     [ 1.  1.]
     [ 1.  1.]]
    [[3. 3.]]

just like in the sklearn example for StandardScaler.

Marco Cerliani · Answer

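With only 3 lines of code (after the import)...

    from sklearn.preprocessing import StandardScaler

    # Flatten to (batch * length, channels), scale per channel, reshape back.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train.reshape(-1, X_train.shape[-1])).reshape(X_train.shape)
    X_test = scaler.transform(X_test.reshape(-1, X_test.shape[-1])).reshape(X_test.shape)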
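The reshape in this answer pools every time step of every sample into one long table with one column per channel, so each channel gets a single mean and standard deviation (in contrast to a loop that fits one scaler per position). Here is a NumPy-only sketch of the equivalent arithmetic, assuming data laid out as (batch, length, channels); the variable names and the synthetic data are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
channel_scales = np.array([1.0, 10.0, 100.0, 1000.0])
X_train = rng.normal(size=(50, 10, 4)) * channel_scales  # (batch, length, channels)
X_test = rng.normal(size=(5, 10, 4)) * channel_scales

# Statistics over batch and length together: one value per channel.
mean = X_train.mean(axis=(0, 1))
std = X_train.std(axis=(0, 1))  # ddof=0, matching StandardScaler

# Broadcasting applies the per-channel statistics along the last axis.
X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std

# Same numbers as flattening to 2D, scaling column-wise, and reshaping back.
flat = X_train.reshape(-1, X_train.shape[-1])
flat_scaled = (flat - flat.mean(axis=0)) / flat.std(axis=0)
print(np.allclose(X_train_scaled, flat_scaled.reshape(X_train.shape)))  # True
```

Which variant fits better depends on whether a channel should share one scale across all time steps or be normalized position by position.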

PJRobot · Answer

    from sklearn.preprocessing import MinMaxScaler

    minMaxScaler = MinMaxScaler()

    # Flatten (s0, s1, s2) to (s0 * s1, s2), scale, then restore the 3D shape.
    s0, s1, s2 = y_train.shape
    y_train = y_train.reshape(s0 * s1, s2)
    y_train = minMaxScaler.fit_transform(y_train)
    y_train = y_train.reshape(s0, s1, s2)

    # Apply the scaler fitted on the training data to the test data.
    s0, s1, s2 = y_test.shape
    y_test = y_test.reshape(s0 * s1, s2)
    y_test = minMaxScaler.transform(y_test)
    y_test = y_test.reshape(s0, s1, s2)

I just reshaped the data like that. For zero-padded data, something similar works; here the scaler is fitted only on the first time step of each sequence (x_train[0::s1]):

    from sklearn.preprocessing import MinMaxScaler

    minMaxScaler = MinMaxScaler()

    s0, s1, s2 = x_train.shape
    x_train = x_train.reshape(s0 * s1, s2)
    # Rows 0, s1, 2 * s1, ... are the first time step of each sequence.
    minMaxScaler.fit(x_train[0::s1])
    x_train = minMaxScaler.transform(x_train)
    x_train = x_train.reshape(s0, s1, s2)

    s0, s1, s2 = x_test.shape
    x_test = x_test.reshape(s0 * s1, s2)
    x_test = minMaxScaler.transform(x_test)
    x_test = x_test.reshape(s0, s1, s2)
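When sequences are zero-padded, another option is to compute the scaling statistics from the real time steps only. The sketch below does the min-max scaling by hand with NumPy, under the assumption that a padded time step is one where every channel is exactly zero. The helper names are hypothetical, and this is not how the answer above or MinMaxScaler treats padding.

```python
import numpy as np

def fit_minmax_ignoring_padding(X):
    """Per-channel min and max over time steps that are not all-zero padding."""
    flat = X.reshape(-1, X.shape[-1])        # (batch * length, channels)
    real = flat[np.any(flat != 0, axis=1)]   # drop all-zero rows
    return real.min(axis=0), real.max(axis=0)

def transform_minmax(X, lo, hi):
    """Scale to [0, 1] per channel, then restore zeros at padded steps."""
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    scaled = (X - lo) / span
    padded = ~np.any(X != 0, axis=-1, keepdims=True)
    return np.where(padded, 0.0, scaled)

# Two sequences of length 3 with 2 channels; the second ends with one padded step.
x_train = np.array([[[2.0, 10.0], [4.0, 20.0], [6.0, 30.0]],
                    [[4.0, 30.0], [6.0, 10.0], [0.0, 0.0]]])

lo, hi = fit_minmax_ignoring_padding(x_train)
x_train_scaled = transform_minmax(x_train, lo, hi)
print(lo, hi)            # per-channel minimum and maximum of the real steps
print(x_train_scaled)
```

Note that a real value equal to the channel minimum also maps to 0, so after scaling the padded steps can no longer be told apart by value alone; keeping a separate mask is the safer design if that matters downstream.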