Calculate RMSE on the validation set. What is it equal to? Provide the answer, rounded to the nearest FIVE decimal places (e.g. 12.3456789 -> 12.34568). Notice the speed of the algorithm. Ans:- # code here

Computer Networking: A Top-Down Approach (7th Edition)
7th Edition
ISBN:9780133594140
Author:James Kurose, Keith Ross
Publisher:James Kurose, Keith Ross
Chapter1: Computer Networks And The Internet
Section: Chapter Questions
Problem R1RQ: What is the difference between a host and an end system? List several different types of end...
icon
Related questions
Question
100%

import numpy as np

import pandas as pd

from catboost

import CatBoostRegressor

from lightgbm import LGBMRegressor

from sklearn.linear_model import Lasso

from sklearn.metrics import mean_squared_error

from sklearn.model_selection import train_test_split

from xgboost import XGBRegressor

df=pd.read_csv('data.csv')
X = df.drop('shares', axis=1)
y = df['shares']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.40, random_state=13)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.25, random_state=13)
Ans:- # code here

Q- Now let's train our first model - XGBoost. A link to the documentation: https://xgboost.readthedocs.io/en/latest/

We will use Scikit-Learn Wrapper interface for XGBoost (and the same logic applies to the following LightGBM and CatBoost models). Here, we work on the regression task - hence we will use XGBRegressor. Read about the parameters of XGBRegressor: https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.XGBRegressor

The main list of XGBoost parameters: https://xgboost.readthedocs.io/en/latest/parameter.html# Look through this list so that you understand which parameters are presented in the library.

Take XGBRegressor with MSE objective (objective='reg:squarederror'), 200 trees (n_estimators=200), learning_rate=0.01, max_depth=5, random_state=13 and all other default parameter values. Train it on the train set (fit function).

q5: Calculate Root Mean Squared Error (RMSE) on the validation set. What is it equal to? Provide the answer, rounded to the nearest FIVE decimal places (e.g. 12.3456789 -> 12.34568).

Ans:-

import xgboost as xgb

from xgboost.sklearn import XGBClassifier

xgb1 = XGBClassifier( learning_rate =0.01, n_estimators=200, max_depth=5, random_state=13, objective='reg:squarederror')

# Code here

Q- 

In the task 5, we have decided to build 200 trees in our model. However, it is hard to understand whether it is a good decision - maybe it is too much? Maybe 150 is a better number? Or 100? Or 50 is enough?

During the training process, it is possible to stop constructing the ensemble if we see that the validation error does not decrease anymore. Using the same XGBoost model, call fit function (to train it) with eval_set=[(X_val, y_val)] (to evaluate the boosting model after building a new tree) and early_stopping_rounds=50 (and other default parameter values). This early_stopping_rounds says that if the validation metric does not increase on 50 consequent iterations, the training stops.

q6: Calculate RMSE on the validation set. What is it equal to? Provide the answer, rounded to the nearest FIVE decimal places (e.g. 12.3456789 -> 12.34568).

Ans:- # code here

Q- Notes on parameter tuning: https://xgboost.readthedocs.io/en/latest/tutorials/param_tuning.html

Here, we tuned some parameters of the algorithm. Take XGBRegressor with the following parameters:

  • objective='reg:squarederror'
  • n_estimators=5000
  • learning_rate=0.001
  • max_depth=4
  • gamma=1
  • subsample=0.5
  • random_state=13
  • all other default parameter values

Train it in the same manner, as in the task 6, but with early_stopping_rounds=500.

q7: Calculate RMSE on the validation set. What is it equal to? Provide the answer, rounded to the nearest FIVE decimal places (e.g. 12.3456789 -> 12.34568).

Notice the speed of the algorithm.

Ans:- # code here

 

Expert Solution
trending now

Trending now

This is a popular solution!

steps

Step by step

Solved in 4 steps with 1 images

Blurred answer
Recommended textbooks for you
Computer Networking: A Top-Down Approach (7th Edi…
Computer Networking: A Top-Down Approach (7th Edi…
Computer Engineering
ISBN:
9780133594140
Author:
James Kurose, Keith Ross
Publisher:
PEARSON
Computer Organization and Design MIPS Edition, Fi…
Computer Organization and Design MIPS Edition, Fi…
Computer Engineering
ISBN:
9780124077263
Author:
David A. Patterson, John L. Hennessy
Publisher:
Elsevier Science
Network+ Guide to Networks (MindTap Course List)
Network+ Guide to Networks (MindTap Course List)
Computer Engineering
ISBN:
9781337569330
Author:
Jill West, Tamara Dean, Jean Andrews
Publisher:
Cengage Learning
Concepts of Database Management
Concepts of Database Management
Computer Engineering
ISBN:
9781337093422
Author:
Joy L. Starks, Philip J. Pratt, Mary Z. Last
Publisher:
Cengage Learning
Prelude to Programming
Prelude to Programming
Computer Engineering
ISBN:
9780133750423
Author:
VENIT, Stewart
Publisher:
Pearson Education
Sc Business Data Communications and Networking, T…
Sc Business Data Communications and Networking, T…
Computer Engineering
ISBN:
9781119368830
Author:
FITZGERALD
Publisher:
WILEY