Normalize a pandas DataFrame by row

Question

What is the most idiomatic way to normalize each row of a pandas DataFrame? Normalizing the columns is easy, so one (very ugly!) option is:

(df.T / df.T.sum()).T

pandas' broadcasting rules prevent df / df.sum(axis=1) from doing this.
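To make the failure concrete, here is a minimal sketch on a made-up two-row frame (the labels r1, r2, a, b are hypothetical, chosen only for illustration). With the / operator, pandas matches the index of the row-sum Series against the column labels of the DataFrame, so nothing lines up:

```python
import pandas as pd

# Hypothetical toy data, not from the original question.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 6.0]}, index=["r1", "r2"])

# Row sums are indexed by the row labels r1 and r2.
row_sums = df.sum(axis=1)

# The operator aligns row_sums' index with df's columns (a, b).
# No labels match, so every cell of the result is NaN.
broken = df / row_sums
print(broken)
```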

joris · Accepted Answer

To work around the broadcasting issue, you can use the div method:

df.div(df.sum(axis=1), axis=0)

See http://pandas.pydata.org/pandas-docs/stable/basics.html#matching-broadcasting-behavior
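A quick check on toy data (the values and labels are made up for illustration) suggests that div with axis=0 matches the row sums against the row index, so each row ends up summing to 1 and the result agrees with the double-transpose version from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data used only to illustrate the call.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 6.0]}, index=["r1", "r2"])

# axis=0 tells div to align the Series with the row index of df.
normalized = df.div(df.sum(axis=1), axis=0)

# The transpose-based approach from the question, for comparison.
via_transpose = (df.T / df.T.sum()).T

print(normalized)
```

Here both rows come out as 0.25 and 0.75, since 1/4 and 2/8 are equal, as are 3/4 and 6/8.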

Rafa · Answer

I would suggest using scikit-learn's preprocessing module and transposing your DataFrame as needed:

'''
Created on 05/11/2015
@author: rafaelcastillo
'''
import random

import matplotlib.pyplot as plt
import numpy as np
import pandas
from sklearn import preprocessing


def create_cos(number_graphs, length, amp):
    # This function is used to generate cos-kind graphs for testing
    # number_graphs: number of graphs to plot
    # length: number of points included in the x axis
    # amp: Y domain modifications to draw different shapes
    x = np.arange(length)
    amp = np.pi * amp
    xx = np.linspace(np.pi * 0.3 * amp, -np.pi * 0.3 * amp, length)
    for i in range(number_graphs):
        # Cosine curve plus a small random offset for each point
        iterable = (2 * np.cos(t) + random.random() * 0.1 for t in xx)
        # np.float was removed in NumPy 1.24; the builtin float is equivalent
        y = np.fromiter(iterable, float)
        if i == 0:
            yfinal = y
            continue
        yfinal = np.vstack((yfinal, y))
    return x, yfinal


x, y = create_cos(70, 24, 3)
data = pandas.DataFrame(y)

x_values = data.columns.values
num_rows = data.shape[0]

# Plot every row of the raw data
fig, ax = plt.subplots()
for i in range(num_rows):
    ax.plot(x_values, data.iloc[i])
ax.set_title('Raw data')
plt.show()

# MinMaxScaler works column-wise, so transpose to scale each original row
std_scale = preprocessing.MinMaxScaler().fit(data.transpose())
df_std = std_scale.transform(data.transpose())
data = pandas.DataFrame(np.transpose(df_std))

# Plot every row of the scaled data
fig, ax = plt.subplots()
for i in range(num_rows):
    ax.plot(x_values, data.iloc[i])
ax.set_title('Data Normalized')
plt.show()
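Note that MinMaxScaler rescales each row to the range [0, 1] rather than dividing by the row sum, so this is a different kind of normalization from the one in the question. As a rough pandas-only sketch of the same row-wise min-max idea (not part of the answer above, and using made-up data), the transposes can be avoided with sub and div along axis=0:

```python
import pandas as pd

# Hypothetical toy data used only to illustrate the technique.
df = pd.DataFrame({"a": [1.0, 10.0], "b": [2.0, 20.0], "c": [3.0, 40.0]})

# Per-row minimum and per-row range (max minus min).
row_min = df.min(axis=1)
row_range = df.max(axis=1) - row_min

# Subtract each row's minimum, then divide by that row's range.
scaled = df.sub(row_min, axis=0).div(row_range, axis=0)
print(scaled)
```

One difference to keep in mind: a row whose values are all equal has a range of zero, so this sketch yields NaN for that row, whereas MinMaxScaler guards against a zero range internally.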