I was given a code chunk to run in Jupyter to learn about one-hot encoding, but when I run it an error shows up.
from sklearn.preprocessing import OneHotEncoder as ohc
enc = ohc(drop='if_binary', sparse_output=False).set_output(transform='pandas')
df = enc.fit_transform(default[["student"]])
default_enc = default.assign(student = df['student_Yes'])
Then I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-f958840e2f7e> in <module>
1 from sklearn.preprocessing import OneHotEncoder as ohc
2 default = pd.read_csv("default.csv", index_col=[0])
----> 3 enc = ohc(drop = 'if_binary',sparse_output=False).set_output(transform='pandas')
4 df = enc.fit_transform(default[["student"]])
5 default_enc = default.assign(student = df['student_Yes'])
/usr/local/lib64/python3.6/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
TypeError: __init__() got an unexpected keyword argument 'sparse_output'
I have tried updating Anaconda and sklearn. The code is supposed to work; the next few problems rely on editing it to see how different parts affect it.
The error you’re encountering means that the OneHotEncoder class in the scikit-learn version your notebook is running does not accept a sparse_output parameter in its __init__ method. This is a version difference.
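Before changing anything, it may help to confirm which versions the notebook kernel is really using, since the traceback points at a Python 3.6 site-packages directory that an Anaconda update would not necessarily touch. The following is a minimal sketch to run in a notebook cell; version_tuple is a hypothetical helper written for this example, not part of scikit-learn.

```python
import re
import sys

import sklearn


def version_tuple(version):
    """Return (major, minor) parsed from a version string such as '0.24.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("Unrecognised version string: " + version)
    return int(match.group(1)), int(match.group(2))


print("Python:", sys.version.split()[0])
print("scikit-learn:", sklearn.__version__)
print("Loaded from:", sklearn.__file__)

# sparse_output and set_output were both introduced in scikit-learn 1.2
has_sparse_output = version_tuple(sklearn.__version__) >= (1, 2)
print("Supports sparse_output / set_output:", has_sparse_output)
```

If the path printed for "Loaded from" is not inside your Anaconda installation, the notebook is using a different Python than the one you updated.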
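The sparse_output parameter was added to the OneHotEncoder constructor in scikit-learn 1.2, where it replaced the older sparse parameter. The set_output method was added in the same release. Your traceback shows Python 3.6, and the newest scikit-learn that supports Python 3.6 is 0.24.x (scikit-learn 1.2 requires Python 3.8 or later), so updating sklearn inside that environment cannot give you sparse_output. You can either create an environment with a newer Python and scikit-learn 1.2 or later, in which case your original code runs unchanged, or adapt the code to the older API.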
Here’s how you can modify your code so that it runs on scikit-learn 0.24 (the newest release available for Python 3.6):
from sklearn.preprocessing import OneHotEncoder
import pandas as pd
# sparse=False is the pre-1.2 name for sparse_output=False
enc = OneHotEncoder(drop='if_binary', sparse=False)
encoded = enc.fit_transform(default[["student"]])
# get_feature_names is the pre-1.0 name for get_feature_names_out
df = pd.DataFrame(encoded, columns=enc.get_feature_names(["student"]), index=default.index)
default_enc = default.assign(student=df['student_Yes'])
In this code:
sparse=False makes fit_transform return a dense NumPy array, so no toarray() call is needed.
get_feature_names is used to get the column names for the one-hot encoded features (here, student_Yes).
pd.DataFrame is used to create a DataFrame from the encoded array; passing index=default.index keeps its rows aligned with the original DataFrame, which matters because default.csv was read with index_col=[0].
default.assign(student=...) replaces the original student column with the encoded column, as in your original code.
Make sure to adjust the code according to your specific requirements and the structure of your dataset.
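If the same notebook has to run on both old and new scikit-learn versions, one option is to detect the available keyword and method names at runtime. This is only a sketch under stated assumptions: the small DataFrame below is made-up data standing in for default.csv (the balance values are invented for illustration), and the inspect-based check is one possible approach, not something scikit-learn prescribes.

```python
import inspect

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for default.csv; the values are invented.
default = pd.DataFrame(
    {"student": ["No", "Yes", "No", "Yes"],
     "balance": [729.5, 817.2, 1073.5, 529.3]},
    index=[1, 2, 3, 4],
)

# Use sparse_output where the constructor accepts it (1.2+), else sparse.
params = inspect.signature(OneHotEncoder.__init__).parameters
dense_kwarg = "sparse_output" if "sparse_output" in params else "sparse"
enc = OneHotEncoder(drop="if_binary", **{dense_kwarg: False})

# With the dense option set, fit_transform returns a NumPy array.
encoded = enc.fit_transform(default[["student"]])

# get_feature_names_out exists from 1.0 on; older releases use get_feature_names.
if hasattr(enc, "get_feature_names_out"):
    columns = enc.get_feature_names_out(["student"])
else:
    columns = enc.get_feature_names(["student"])

# Reuse the original index so the rows line up when assigning back.
df = pd.DataFrame(encoded, columns=columns, index=default.index)
default_enc = default.assign(student=df["student_Yes"])
print(default_enc)
```

Because drop='if_binary' drops the first of the two categories ("No"), the only encoded column is student_Yes, which is what the assign call expects.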