Model Validation in Python (DataCamp, GitHub)

DataCamp_Model_Validation_in_Python

This is a memo sharing what I learnt in Model Validation (using Python), capturing the learning objectives as well as my personal notes. The course is taught by Kasey Jones from DataCamp and has four chapters:

Chapter 1. Basic Modeling in scikit-learn

Chapter 2. Validation Basics

Chapter 3. Cross Validation

Chapter 4. Selecting the best model with Hyperparameter tuning

Personal Notes:

//medium.com/ai-in-plain-english/model-validation-in-python-ad23c1d215b

'''
Evaluating model accuracy on validation dataset
Now it's your turn to monitor model accuracy with a validation data set. A model definition has been provided as model. Your job is to add the code to compile it and then fit it. You'll check the validation score in each epoch.
INSTRUCTIONS
100XP
Compile your model using 'adam' as the optimizer and 'categorical_crossentropy' for the loss. To see what fraction of predictions are correct (the accuracy) in each epoch, specify the additional keyword argument metrics=['accuracy'] in model.compile().
Fit the model using the predictors and target. Create a validation split of 30% (or 0.3). This will be reported in each epoch.
'''
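# Imports (the course environment preloads these, along with predictors and target)
from keras.models import Sequential
from keras.layers import Dense

# Save the number of columns in predictors: n_cols
n_cols = predictors.shape[1]
input_shape = (n_cols,)

# Specify the model: two hidden layers of 100 units and a 2-class softmax output
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=input_shape))
model.add(Dense(100, activation='relu'))
model.add(Dense(2, activation='softmax'))

# Compile the model, tracking accuracy in each epoch
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model, holding out 30% of the rows for validation
hist = model.fit(predictors, target, validation_split=0.3)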
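
To make the validation_split=0.3 argument above concrete: Keras holds out the last 30% of the rows passed to fit, taken before any shuffling, and reports metrics on them after each epoch. The NumPy sketch below mimics that slicing on a made-up array. The data, the helper name split_like_keras, and the rounding of the row count are illustrative assumptions, not part of the course or of Keras itself.

```python
import numpy as np

def split_like_keras(x, y, validation_split):
    """Hypothetical helper: hold out the LAST fraction of rows for validation."""
    n_val = int(round(len(x) * validation_split))  # rows reserved for validation
    split_at = len(x) - n_val
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

# Made-up data: 10 rows with 2 feature columns each
predictors_demo = np.arange(20).reshape(10, 2)
target_demo = np.arange(10)

(train_x, train_y), (val_x, val_y) = split_like_keras(predictors_demo, target_demo, 0.3)
print(train_x.shape, val_x.shape)  # 7 training rows, 3 validation rows
print(val_y)                       # the last three targets
```

Because the held-out rows come from the end of the arrays, data that is sorted (for example by class) should be shuffled before calling fit, otherwise the validation set may not resemble the training set.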

#==============================================================================================================================#
#Chapter 2 - Regression
#==============================================================================================================================#
# Shared imports (assumed: the course environment preloads these, along with
# X, y, X_fertility, df_columns and the display_plot helper)
import numpy as np
import matplotlib.pyplot as plt

#Fit & predict for regression
# Import LinearRegression
from sklearn.linear_model import LinearRegression

# Create the regressor: reg
reg = LinearRegression()

# Create the prediction space
prediction_space = np.linspace(min(X_fertility), max(X_fertility)).reshape(-1, 1)

# Fit the model to the data
reg.fit(X_fertility, y)

# Compute predictions over the prediction space: y_pred
y_pred = reg.predict(prediction_space)

# Print R^2
print(reg.score(X_fertility, y))

# Plot regression line
plt.plot(prediction_space, y_pred, color='black', linewidth=3)
plt.show()

#==============================================================================================================================#
#Train/test split for regression
# Import necessary modules
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Create training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create the regressor: reg_all
reg_all = LinearRegression()

# Fit the regressor to the training data
reg_all.fit(X_train, y_train)

# Predict on the test data: y_pred
y_pred = reg_all.predict(X_test)

# Compute and print R^2 and RMSE
print("R^2: {}".format(reg_all.score(X_test, y_test)))
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print("Root Mean Squared Error: {}".format(rmse))

#==============================================================================================================================#
#5-fold cross-validation
# Import the necessary modules
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Create a linear regression object: reg
reg = LinearRegression()

# Compute 5-fold cross-validation scores: cv_scores
cv_scores = cross_val_score(reg, X, y, cv=5)

# Print the 5-fold cross-validation scores
print(cv_scores)

# Print the average 5-fold cross-validation score
print("Average 5-Fold CV Score: {}".format(np.mean(cv_scores)))

#==============================================================================================================================#
#K-Fold CV comparison
# Import necessary modules
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Create a linear regression object: reg
reg = LinearRegression()

# Perform 3-fold CV
cvscores_3 = cross_val_score(reg, X, y, cv=3)
print(np.mean(cvscores_3))

# Perform 10-fold CV
cvscores_10 = cross_val_score(reg, X, y, cv=10)
print(np.mean(cvscores_10))

#==============================================================================================================================#
#Regularization I: Lasso
# Import Lasso
from sklearn.linear_model import Lasso

# Instantiate a lasso regressor: lasso
# Note: the normalize argument was removed in scikit-learn 1.2; on newer
# versions, drop it and scale the features beforehand instead
lasso = Lasso(alpha=0.4, normalize=True)

# Fit the regressor to the data
lasso.fit(X, y)

# Compute and print the coefficients
lasso_coef = lasso.coef_
print(lasso_coef)

# Plot the coefficients
plt.plot(range(len(df_columns)), lasso_coef)
plt.xticks(range(len(df_columns)), df_columns.values, rotation=60)
plt.margins(0.02)
plt.show()

#==============================================================================================================================#
#Regularization II: Ridge
# Import necessary modules
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Setup the array of alphas and lists to store scores
alpha_space = np.logspace(-4, 0, 50)
ridge_scores = []
ridge_scores_std = []

# Create a ridge regressor: ridge (same normalize caveat as for Lasso above)
ridge = Ridge(normalize=True)

# Compute scores over range of alphas
for alpha in alpha_space:

    # Specify the alpha value to use: ridge.alpha
    ridge.alpha = alpha

    # Perform 10-fold CV: ridge_cv_scores
    ridge_cv_scores = cross_val_score(ridge, X, y, cv=10)

    # Append the mean of ridge_cv_scores to ridge_scores
    ridge_scores.append(np.mean(ridge_cv_scores))

    # Append the std of ridge_cv_scores to ridge_scores_std
    ridge_scores_std.append(np.std(ridge_cv_scores))

# Display the plot (display_plot is a helper supplied by the course environment)
display_plot(ridge_scores, ridge_scores_std)

#==============================================================================================================================#
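
The cross_val_score calls above hide the fold-by-fold work, so here is a minimal sketch that unpacks 5-fold cross-validation by hand and compares it with the one-line version. It assumes synthetic data from make_regression as a stand-in for the course datasets (X_demo and y_demo are made-up names), and relies on the fact that for a regressor an integer cv is treated as an unshuffled KFold.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (an assumption; the course uses its own datasets)
X_demo, y_demo = make_regression(n_samples=100, n_features=3, noise=10.0, random_state=42)

reg = LinearRegression()

# Manual 5-fold loop: fit on four folds, then score (R^2) on the held-out fold
manual_scores = []
for train_idx, test_idx in KFold(n_splits=5).split(X_demo):
    reg.fit(X_demo[train_idx], y_demo[train_idx])
    manual_scores.append(reg.score(X_demo[test_idx], y_demo[test_idx]))

# The one-line equivalent used in the notes above
cv_scores = cross_val_score(LinearRegression(), X_demo, y_demo, cv=5)

# Check whether the two approaches agree fold by fold
print(np.allclose(manual_scores, cv_scores))
print("Average 5-Fold CV Score: {}".format(np.mean(cv_scores)))
```

Each row is used for testing exactly once, which is why the averaged score is usually a steadier estimate than a single train/test split.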
