Replicate MLPClassifier() of sklearn in keras


Problem description

I am new to keras. I was attempting an ML problem. About the data:

It has 5 input features, 4 output classes and about 26000 records.

I had first attempted it using MLPClassifier() as follows:

clf = MLPClassifier(verbose=True, tol=1e-6, batch_size=300, hidden_layer_sizes=(200,100,100,100), max_iter=500, learning_rate_init= 0.095, solver='sgd', learning_rate='adaptive', alpha = 0.002)
clf.fit(train, y_train)

After testing, I usually got an LB score of around 99.90. To gain more flexibility over the model, I decided to implement the same model in Keras as a starting point and then modify it in an attempt to increase the LB score. I came up with the following:

model = Sequential()
model.add(Dense(200, input_dim=5, kernel_initializer='uniform', activation='relu'))
model.add(Dense(100, kernel_initializer='uniform', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(100, kernel_initializer='uniform', activation='relu'))
model.add(Dense(100, kernel_initializer='uniform', activation='relu'))
model.add(Dense(4, kernel_initializer='uniform', activation='softmax'))

lrate = 0.095
decay = lrate/125
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
hist = model.fit(train, categorical_labels, epochs=125, batch_size=256, shuffle=True, verbose=2)

The model seems pretty similar to the MLPClassifier() model, but the LB scores were pretty disappointing at around 97. Can somebody please tell me what exactly is wrong with this model, or how we can replicate the MLPClassifier model in Keras? I think regularisation might be one of the factors that went wrong here.
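
One possible gap worth noting (a sketch, not something from the original post): MLPClassifier's alpha=0.002 is an L2 penalty on the weights, and the Keras model above applies no such penalty anywhere. In Keras this would roughly correspond to a kernel_regularizer on each Dense layer, although sklearn scales its penalty by the number of samples, so 0.002 is only an approximate analogue, not an exact equivalent:

from keras import regularizers
from keras.layers import Dense
from keras.models import Sequential

# Rough analogue of MLPClassifier(alpha=0.002): an L2 penalty on every layer's weights.
# sklearn divides the penalty by the sample count, so the value is not directly interchangeable.
l2 = regularizers.l2(0.002)

model = Sequential()
model.add(Dense(200, input_dim=5, activation='relu', kernel_regularizer=l2))
model.add(Dense(100, activation='relu', kernel_regularizer=l2))
model.add(Dense(100, activation='relu', kernel_regularizer=l2))
model.add(Dense(100, activation='relu', kernel_regularizer=l2))
model.add(Dense(4, activation='softmax', kernel_regularizer=l2))
# compile and fit as in the original snippet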

Loss curves:

Here is the code:

#import libraries
import pandas as pd
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss
from sklearn.preprocessing import MinMaxScaler, scale, StandardScaler, Normalizer

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.utils import to_categorical
from keras import regularizers
from keras.optimizers import SGD

#load data
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

#generic preprocessing 
#encode as integer
mapping = {'Front':0, 'Right':1, 'Left':2, 'Rear':3}
train = train.replace({'DetectedCamera':mapping})
test = test.replace({'DetectedCamera':mapping})
#renaming column
train.rename(columns = {'SignFacing (Target)': 'Target'}, inplace=True)
mapping = {'Front':0, 'Left':1, 'Rear':2, 'Right':3}
train = train.replace({'Target':mapping})


#split data
y_train = train['Target']
test_id = test['Id']
train.drop(['Target','Id'], inplace=True, axis=1)
test.drop('Id',inplace=True,axis=1)
train_train, train_test, y_train_train, y_train_test = train_test_split(train, y_train)

scaler = StandardScaler()
scaler.fit(train_train)
train_train = scaler.transform(train_train)
train_test = scaler.transform(train_test)
test = scaler.transform(test)

#one-hot encode the integer targets for categorical_crossentropy
categorical_labels = to_categorical(y_train_train, num_classes=4)
categorical_labels_test = to_categorical(y_train_test, num_classes=4)

#training and modelling
model = Sequential()
model.add(Dense(200, input_dim=5, kernel_initializer='uniform', activation = 'relu'))
model.add(Dense(100, kernel_initializer='uniform', activation='relu'))
# model.add(Dropout(0.2))
# model.add(Dense(100, init='uniform', activation='relu'))
# model.add(Dense(100, init='uniform', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(100, kernel_initializer='uniform', activation='relu'))
model.add(Dense(100, kernel_initializer='uniform', activation='relu'))
model.add(Dense(4, kernel_initializer='uniform', activation='softmax'))

lrate = 0.095
decay = lrate/250
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
hist = model.fit(train_train, categorical_labels, validation_data=(train_test, categorical_labels_test), epochs=100, batch_size=256, shuffle=True, verbose=2)
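
For completeness, a sketch of how the scaled test set and test_id might then be turned into predictions; the submission layout below is an assumption, not part of the original post:

#predict class probabilities on the scaled test set and map back to the 0-3 integer labels
preds = model.predict(test, batch_size=256)
pred_classes = preds.argmax(axis=1)

#assumed submission layout: one Id column and one Target column
submission = pd.DataFrame({'Id': test_id, 'Target': pred_classes})
submission.to_csv('submission.csv', index=False)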

The files: train.csv test.csv

Answer

To get a bona fide scikit estimator you can use KerasClassifier from tensorflow.keras.wrappers.scikit_learn. For example:

from sklearn.datasets import make_classification
from tensorflow import keras
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier


X, y = make_classification(
    n_samples=26000, n_features=5, n_classes=4, n_informative=3, random_state=0
)


def build_fn(optimizer):
    model = Sequential()
    model.add(
        Dense(200, input_dim=5, kernel_initializer="he_normal", activation="relu")
    )
    model.add(Dense(100, kernel_initializer="he_normal", activation="relu"))
    model.add(Dense(100, kernel_initializer="he_normal", activation="relu"))
    model.add(Dense(100, kernel_initializer="he_normal", activation="relu"))
    model.add(Dense(4, kernel_initializer="he_normal", activation="softmax"))
    model.compile(
        loss="categorical_crossentropy",
        optimizer=optimizer,
        metrics=[
            keras.metrics.Precision(name="precision"),
            keras.metrics.Recall(name="recall"),
            keras.metrics.AUC(name="auc"),
        ],
    )
    return model


clf = KerasClassifier(build_fn, optimizer="rmsprop", epochs=500, batch_size=300)
clf.fit(X, y)
clf.predict(X)
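
Because the wrapped model is a regular scikit-learn estimator, it also plugs into the usual scikit tooling. A minimal sketch (the grid values below are illustrative placeholders, not recommendations):

from sklearn.model_selection import GridSearchCV, cross_val_score

# k-fold scoring, exactly as with MLPClassifier
print(cross_val_score(clf, X, y, cv=3).mean())

# grid search over an argument of build_fn
grid = GridSearchCV(
    KerasClassifier(build_fn, epochs=100, batch_size=300),
    param_grid={"optimizer": ["rmsprop", "adam", "sgd"]},
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_)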
