## Problem

The scikit-learn Python library has several classes for imputing (predicting) missing values in arrays.

I have a Python program that I wrote a while ago. It uses the *Imputer* class from the *sklearn.preprocessing* package, with the `axis=1` parameter set to force a prediction of values **row-wise** instead of the default column-wise prediction.

For example, I wanted an array like this (nan = missing value) …

```
[[ 10. nan 20. 15.]
[ 200. 200. 200. nan]
[ nan nan 5000. 6000.]]
```

… to have its missing values predicted with the **row-wise mean**. The expected outcome is:

```
[[ 10. 15. 20. 15.]
[ 200. 200. 200. 200.]
[5500. 5500. 5000. 6000.]]
```
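To make the target concrete, here is a minimal sketch that computes the same expected result with plain NumPy, independent of any scikit-learn imputer. The variable names `row_means` and `expected` are only illustrative.

```python
import numpy as np

X = np.asarray([[10, np.nan, 20, 15],
                [200, 200, 200, np.nan],
                [np.nan, np.nan, 5000, 6000]])

# Mean of each row, ignoring NaN; keepdims=True keeps shape (3, 1) for broadcasting
row_means = np.nanmean(X, axis=1, keepdims=True)

# Keep observed values and substitute the row mean where a value is missing
expected = np.where(np.isnan(X), row_means, X)
print(expected)
```

Each missing value is replaced by the mean of the observed values in its own row, for example (10 + 20 + 15) / 3 = 15 in the first row.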

Here’s the code, using `sklearn.preprocessing.Imputer`:

```
import numpy as np
# Note: Imputer only exists in older scikit-learn releases (removed in 0.22)
from sklearn.preprocessing import Imputer

# Create simple test array
X = np.asarray([[10, np.nan, 20, 15],
                [200, 200, 200, np.nan],
                [np.nan, np.nan, 5000, 6000]])

# Create imputer object, replacing NaN with the mean of each row (axis=1)
mean_imputer = Imputer(missing_values=np.nan, strategy='mean', axis=1)

# Fit and apply the imputer
imputed_X = mean_imputer.fit_transform(X)
```

Unfortunately, the *Imputer* class is now deprecated. The code above raises a warning:

`DeprecationWarning: Class Imputer is deprecated; Imputer was deprecated in version 0.20 and will be removed in 0.22. Import impute.SimpleImputer from sklearn instead.`

Doubly unfortunately, `impute.SimpleImputer` does not include an `axis` parameter, **so I can no longer request a row-wise imputation**.

scikit-learn’s GitHub Issue “Remove SimpleImputer’s axis parameter” https://github.com/scikit-learn/scikit-learn/issues/10636 suggests:

Future (and default) behavior is equivalent to axis=0 (impute along columns). Row-wise imputation can be performed with FunctionTransformer (e.g., FunctionTransformer(lambda X: Imputer().fit_transform(X.T).T)).

Eh.

## Solution

Why not replace `preprocessing.Imputer` with `impute.SimpleImputer` as suggested, then transpose the array before applying the imputer and transpose the result back? Works for me.

```
import numpy as np
from sklearn.impute import SimpleImputer

# Create simple test array
X = np.asarray([[10, np.nan, 20, 15],
                [200, 200, 200, np.nan],
                [np.nan, np.nan, 5000, 6000]])

# Create imputer object, replacing NaN with feature (column) means
mean_imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

# Fit and apply the imputer on the transposed array so that the
# column-wise means are really row-wise means, then transpose back
imputed_X = mean_imputer.fit_transform(X.T).T

# Review original data
print(f'Original:\n{X}')

# View imputed data
print(f'\nImputed:\n{imputed_X}')
```

Output:

```
Original:
[[ 10. nan 20. 15.]
[ 200. 200. 200. nan]
[ nan nan 5000. 6000.]]
Imputed:
[[ 10. 15. 20. 15.]
[ 200. 200. 200. 200.]
[5500. 5500. 5000. 6000.]]
```

Done!
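The `FunctionTransformer` route quoted from the GitHub issue can be sketched along the same lines, which may be useful when the row-wise step has to sit inside a `Pipeline`. This is a minimal sketch, not an official recipe: the helper name `impute_rows_with_mean` and the `StandardScaler` step are assumptions added purely for illustration, and `SimpleImputer` stands in for the deprecated `Imputer` from the quote.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def impute_rows_with_mean(X):
    """Fill each NaN with the mean of its own row by imputing on the transpose."""
    imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
    return imputer.fit_transform(X.T).T


X = np.asarray([[10, np.nan, 20, 15],
                [200, 200, 200, np.nan],
                [np.nan, np.nan, 5000, 6000]])

# Wrap the helper so it exposes the usual fit/transform interface
row_imputer = FunctionTransformer(impute_rows_with_mean)
imputed_X = row_imputer.fit_transform(X)
print(imputed_X)

# The wrapped step can then be chained with other transformers
# (StandardScaler is only a placeholder for whatever comes next)
pipeline = Pipeline([
    ('impute_rows', row_imputer),
    ('scale', StandardScaler()),
])
scaled_X = pipeline.fit_transform(X)
```

A named function is used instead of a `lambda` so that the pipeline can still be pickled. Because each row is filled from its own values, nothing is learned at fit time. One caveat: by default `SimpleImputer` discards columns that contain only missing values, so a row that is entirely NaN would be dropped from the transposed result and change the output shape.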

## Reference

scikit-learn’s Imputation of Missing Values with impute.SimpleImputer

scikit-learn’s Imputation of Missing Values using preprocessing.Imputer

Ashish says

Could you provide a detailed explanation of how to apply FunctionTransformer() to impute row-wise?

I have been trying to do it for 2 days and I am clueless.

PS: I am a beginner in ML and any help would be greatly appreciated. Thanks in advance.

Leah says

Hi Ashish,

I do not know what issues you are facing specifically. Scikit-Learn.org is a good place to start; it includes several examples: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.FunctionTransformer.html

I hope you’ve found your solution.

Trenton says

This works fine when fitting and transforming directly with the imputer, but not when imputing within a Pipeline.