I was given a code chunk to run in Jupyter to learn about one-hot encoding, but when I run it an error shows up.
    from sklearn.preprocessing import OneHotEncoder as ohc
    enc = ohc(drop='if_binary', sparse_output=False).set_output(transform='pandas')
    df = enc.fit_transform(default[["student"]])
    default_enc = default.assign(student = df['student_Yes'])
Then I get this error:
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-8-f958840e2f7e> in <module>
          1 from sklearn.preprocessing import OneHotEncoder as ohc
          2 default = pd.read_csv("default.csv", index_col=[0])
    ----> 3 enc = ohc(drop = 'if_binary',sparse_output=False).set_output(transform='pandas')
          4 df = enc.fit_transform(default[["student"]])
          5 default_enc = default.assign(student = df['student_Yes'])

    /usr/local/lib64/python3.6/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
         61             extra_args = len(args) - len(all_args)
         62             if extra_args <= 0:
    ---> 63                 return f(*args, **kwargs)
         64
         65             # extra_args > 0

    TypeError: __init__() got an unexpected keyword argument 'sparse_output'
I have tried updating Anaconda and sklearn. The code is supposed to work, and the next few problems rely on editing it to see how different parts affect it.
The error you’re encountering suggests that the OneHotEncoder class in scikit-learn does not have a sparse_output parameter in its __init__ method. This could be due to version differences.
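The sparse_output parameter was added to the OneHotEncoder constructor in scikit-learn 1.2, where it replaced the older sparse parameter, and set_output was introduced in the same release. Your traceback shows the code running under Python 3.6 (/usr/local/lib64/python3.6/site-packages), and the newest scikit-learn series that supports Python 3.6 is 0.24, so the version your notebook imports predates both features. Updating Anaconda does not help if the Jupyter kernel still uses that Python 3.6 installation. To run the exercise code unchanged, the kernel needs Python 3.8 or later with scikit-learn 1.2 or later; otherwise the code has to be adapted to the older API.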
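Before changing any code, it may help to confirm which interpreter and which scikit-learn version the notebook kernel really uses. The following is a minimal sketch to run in a notebook cell; version_tuple and supports_sparse_output are hypothetical helper names introduced here for illustration, and the (1, 2) threshold reflects the release in which sparse_output was added.

```python
import sys
from itertools import takewhile


def version_tuple(version):
    """Return (major, minor) from a version string such as '0.24.2' or '1.3.dev0'."""
    numbers = []
    for piece in version.split(".")[:2]:
        # Keep only the leading digits, so '2rc1' is read as 2
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers)


def supports_sparse_output(version):
    """True if this scikit-learn version accepts OneHotEncoder(sparse_output=...)."""
    return version_tuple(version) >= (1, 2)


# Interpreter that the running kernel actually uses
print(sys.executable)

try:
    import sklearn
except ImportError:
    print("scikit-learn is not installed in this environment")
else:
    print(sklearn.__version__, supports_sparse_output(sklearn.__version__))
```

If the printed interpreter path still points at the Python 3.6 installation, the update went into a different environment than the one the notebook is using.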
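Here’s how you can modify your code so that it runs on the older release:

    from sklearn.preprocessing import OneHotEncoder
    import pandas as pd

    # Releases before 1.2 spell this option sparse=False (sparse_output is 1.2+)
    enc = OneHotEncoder(drop='if_binary', sparse=False)
    encoded = enc.fit_transform(default[["student"]])  # dense NumPy array
    # get_feature_names is the pre-1.0 name of get_feature_names_out
    df = pd.DataFrame(encoded, columns=enc.get_feature_names(["student"]), index=default.index)
    default_enc = default.join(df)

In this code:
- sparse=False makes fit_transform return a dense NumPy array, so no toarray() call is needed. toarray() only exists on the sparse matrix that the encoder returns by default.
- get_feature_names returns the generated column names, here student_Yes. From scikit-learn 1.0 onward the method is called get_feature_names_out.
- pd.DataFrame wraps the array in a DataFrame. Passing index=default.index keeps its rows aligned with default.
- default.join(df) adds the encoded column to default. To overwrite the student column instead, as in your original code, use default.assign(student=df['student_Yes']).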
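If the same notebook has to run on both old and new installations, one option is to choose the keyword by inspecting the constructor signature. This is only a sketch: dense_kwargs is a hypothetical helper, not part of scikit-learn, and the two stand-in classes below merely mimic the old and new constructor signatures so the example runs without scikit-learn installed.

```python
import inspect


def dense_kwargs(encoder_cls):
    """Pick the keyword that requests dense output for this encoder class.

    Hypothetical helper for illustration, not part of scikit-learn.
    """
    params = inspect.signature(encoder_cls).parameters
    if "sparse_output" in params:
        return {"sparse_output": False}
    return {"sparse": False}


# Stand-in classes that mimic the two constructor signatures
class OldStyleEncoder:
    def __init__(self, drop=None, sparse=True):
        self.drop, self.sparse = drop, sparse


class NewStyleEncoder:
    def __init__(self, drop=None, sparse_output=True):
        self.drop, self.sparse_output = drop, sparse_output


old = OldStyleEncoder(drop="if_binary", **dense_kwargs(OldStyleEncoder))
new = NewStyleEncoder(drop="if_binary", **dense_kwargs(NewStyleEncoder))
print(old.sparse, new.sparse_output)
```

With the real class the call would read OneHotEncoder(drop='if_binary', **dense_kwargs(OneHotEncoder)). This only covers the renamed keyword; set_output and get_feature_names_out would still need their own handling on older releases.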
Make sure to adjust the code according to your specific requirements and the structure of your dataset.