Contents

Model initialization
Model implementation
Multivariate cost function
Multivariate gradient descent implementation
    Multivariate gradient computation
    Multivariate gradient descent

In the gradient-descent linear prediction model implemented earlier, each training example had only a single feature (house size), which is clearly unrealistic. Here we increase the number of features and implement the gradient-descent linear prediction model again.
First, a quick review of how the gradient-descent linear model is built:

Implement the linear model f = w*x + b, where the parameters w and b are to be determined, and search for the best (w, b) combination:
1. Introduce a cost function J(w, b) (also called the loss function) that measures how well the model fits the data.
2. The model is closest to the actual data when the cost function is minimized, so gradient descent is used to find the optimal (w, b).
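For reference, the update rule that batch gradient descent applies at every step (and that the code below implements) is, with learning rate $\alpha$:

$$
w_j := w_j - \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j} \quad (j = 0,\dots,n-1), \qquad
b := b - \alpha \frac{\partial J(\mathbf{w},b)}{\partial b}
$$

where, for the squared-error cost used here,

$$
\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m}\sum_{i=0}^{m-1}\bigl(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\bigr)\,x_j^{(i)}, \qquad
\frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m}\sum_{i=0}^{m-1}\bigl(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\bigr).
$$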
Model initialization
Each new house example has 4 features: house size, number of bedrooms, number of floors, and age of the home.

Size (sqft)  Number of Bedrooms  Number of floors  Age of Home  Price (1000s dollars)
2104         5                   1                 45           460
1416         3                   2                 40           232
 852         2                   1                 35           178

The table above contains 3 training examples. With an input feature vector x of length n = 4, the model is

$f_{\mathbf{w},b}(\mathbf{x}) = w_0 x_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + b = \mathbf{w} \cdot \mathbf{x} + b$

In code, X_train is the input matrix and y_train is the vector of target values:
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])
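As a quick sanity check (not part of the original snippet), the array shapes can be printed to confirm there are 3 examples with 4 features each:

# X_train has one row per example and one column per feature
print(f"X_train shape: {X_train.shape}")   # (3, 4)
print(f"y_train shape: {y_train.shape}")   # (3,)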
The model parameters are the weight vector w and the bias b. In code, each element of w corresponds to one house feature:
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])
Model implementation
def predict(x, w, b):
    """
    single predict using linear regression
    Args:
      x (ndarray): Shape (n,) example with multiple features
      w (ndarray): Shape (n,) model parameters
      b (scalar):  model parameter
    Returns:
      p (scalar):  prediction
    """
    p = np.dot(x, w) + b
    return p
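A minimal usage sketch (not in the original post): make a single prediction for the first training example using the preset parameters w_init and b_init. Because those parameters are already close to optimal for this data, the result should be near the target value of 460.

# single prediction for the first training example
x_vec = X_train[0, :]                     # shape (4,)
f_wb = predict(x_vec, w_init, b_init)
print(f"prediction: {f_wb:0.2f}, target: {y_train[0]}")   # approximately 460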
Multivariate cost function
The cost function is

$J(\mathbf{w},b) = \frac{1}{2m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right)^2$

and is implemented as:
def compute_cost(X, y, w, b):
    """
    compute cost
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter
    Returns:
      cost (scalar): cost
    """
    m = X.shape[0]
    cost = 0.0
    for i in range(m):
        f_wb_i = np.dot(X[i], w) + b          # (n,)(n,) = scalar (see np.dot)
        cost = cost + (f_wb_i - y[i]) ** 2    # scalar
    cost = cost / (2 * m)                     # scalar
    return cost
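As a quick sanity check (not in the original post), evaluating the cost at the preset parameters w_init and b_init should give a value close to zero, since those parameters already fit the three examples well:

# cost at the preset (near-optimal) parameters
cost = compute_cost(X_train, y_train, w_init, b_init)
print(f"Cost at preset w, b: {cost:.6f}")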
Multivariate gradient descent implementation
Multivariate gradient computation
def compute_gradient(X, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter
    Returns:
      dj_dw (ndarray (n,)): The gradient of the cost w.r.t. the parameters w.
      dj_db (scalar):       The gradient of the cost w.r.t. the parameter b.
    """
    m, n = X.shape            # (number of examples, number of features)
    dj_dw = np.zeros((n,))
    dj_db = 0.
    for i in range(m):
        err = (np.dot(X[i], w) + b) - y[i]
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err * X[i, j]
        dj_db = dj_db + err
    dj_dw = dj_dw / m
    dj_db = dj_db / m
    return dj_db, dj_dw
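The double for-loop above is easy to read but slow for large datasets. As a minimal alternative sketch (not the original post's code), the same gradients can be computed in vectorized form with NumPy:

def compute_gradient_vectorized(X, y, w, b):
    """Vectorized version of compute_gradient; returns the same dj_db, dj_dw."""
    m = X.shape[0]
    err = X @ w + b - y        # prediction error for all m examples, shape (m,)
    dj_dw = X.T @ err / m      # dj_dw[j] = (1/m) * sum_i err[i] * X[i, j]
    dj_db = err.sum() / m      # scalar
    return dj_db, dj_dw

Either version can be passed as gradient_function to the gradient_descent routine below.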
Multivariate gradient descent

def gradient_descent(X, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):
    """
    Performs batch gradient descent to learn w and b. Updates w and b by taking
    num_iters gradient steps with learning rate alpha
    Args:
      X (ndarray (m,n))   : Data, m examples with n features
      y (ndarray (m,))    : target values
      w_in (ndarray (n,)) : initial model parameters
      b_in (scalar)       : initial model parameter
      cost_function       : function to compute cost
      gradient_function   : function to compute the gradient
      alpha (float)       : Learning rate
      num_iters (int)     : number of iterations to run gradient descent
    Returns:
      w (ndarray (n,)) : Updated values of parameters
      b (scalar)       : Updated value of parameter
    """
    # An array to store cost J at each iteration, primarily for graphing later
    J_history = []
    w = copy.deepcopy(w_in)  # avoid modifying global w within function
    b = b_in

    for i in range(num_iters):
        # Calculate the gradient and update the parameters
        dj_db, dj_dw = gradient_function(X, y, w, b)

        # Update parameters using w, b, alpha and gradient
        w = w - alpha * dj_dw
        b = b - alpha * dj_db

        # Save cost J at each iteration
        if i < 100000:  # prevent resource exhaustion
            J_history.append(cost_function(X, y, w, b))

        # Print cost at intervals 10 times or as many iterations if < 10
        if i % math.ceil(num_iters / 10) == 0:
            print(f"Iteration {i:4d}: Cost {J_history[-1]:8.2f}")

    return w, b, J_history  # return final w, b and J history for graphing
Testing the gradient descent algorithm:
# initialize parameters
initial_w = np.zeros_like(w_init)
initial_b = 0.
# some gradient descent settings
iterations = 1000
alpha = 5.0e-7
# run gradient descent
w_final, b_final, J_hist = gradient_descent(X_train, y_train, initial_w, initial_b,
                                            compute_cost, compute_gradient,
                                            alpha, iterations)
print(f"b,w found by gradient descent: {b_final:0.2f},{w_final}")
m, _ = X_train.shape
for i in range(m):
    print(f"prediction: {np.dot(X_train[i], w_final) + b_final:0.2f}, target value: {y_train[i]}")

# plot cost versus iteration
fig, (ax1, ax2) = plt.subplots(1, 2, constrained_layout=True, figsize=(12, 4))
ax1.plot(J_hist)
ax2.plot(100 + np.arange(len(J_hist[100:])), J_hist[100:])
ax1.set_title("Cost vs. iteration");  ax2.set_title("Cost vs. iteration (tail)")
ax1.set_ylabel("Cost")             ;  ax2.set_ylabel("Cost")
ax1.set_xlabel("iteration step")   ;  ax2.set_xlabel("iteration step")
plt.show()
Result: the left plot shows the cost over all iterations, and the right plot shows the tail of the curve. As the right-hand plot shows, the cost is still decreasing when training ends, so the optimal (w, b) combination has not yet been found. A concrete way to fix this will be covered in a later post.
The complete code:
import copy, math
import numpy as np
import matplotlib.pyplot as plt

np.set_printoptions(precision=2)  # reduced display precision on numpy arrays

X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])

b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])


def predict(x, w, b):
    """
    single predict using linear regression
    Args:
      x (ndarray): Shape (n,) example with multiple features
      w (ndarray): Shape (n,) model parameters
      b (scalar):  model parameter
    Returns:
      p (scalar):  prediction
    """
    p = np.dot(x, w) + b
    return p


def compute_cost(X, y, w, b):
    """
    compute cost
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter
    Returns:
      cost (scalar): cost
    """
    m = X.shape[0]
    cost = 0.0
    for i in range(m):
        f_wb_i = np.dot(X[i], w) + b          # (n,)(n,) = scalar (see np.dot)
        cost = cost + (f_wb_i - y[i]) ** 2    # scalar
    cost = cost / (2 * m)                     # scalar
    return cost


def compute_gradient(X, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter
    Returns:
      dj_dw (ndarray (n,)): The gradient of the cost w.r.t. the parameters w.
      dj_db (scalar):       The gradient of the cost w.r.t. the parameter b.
    """
    m, n = X.shape            # (number of examples, number of features)
    dj_dw = np.zeros((n,))
    dj_db = 0.
    for i in range(m):
        err = (np.dot(X[i], w) + b) - y[i]
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err * X[i, j]
        dj_db = dj_db + err
    dj_dw = dj_dw / m
    dj_db = dj_db / m
    return dj_db, dj_dw


def gradient_descent(X, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):
    """
    Performs batch gradient descent to learn w and b. Updates w and b by taking
    num_iters gradient steps with learning rate alpha
    Args:
      X (ndarray (m,n))   : Data, m examples with n features
      y (ndarray (m,))    : target values
      w_in (ndarray (n,)) : initial model parameters
      b_in (scalar)       : initial model parameter
      cost_function       : function to compute cost
      gradient_function   : function to compute the gradient
      alpha (float)       : Learning rate
      num_iters (int)     : number of iterations to run gradient descent
    Returns:
      w (ndarray (n,)) : Updated values of parameters
      b (scalar)       : Updated value of parameter
    """
    # An array to store cost J at each iteration, primarily for graphing later
    J_history = []
    w = copy.deepcopy(w_in)  # avoid modifying global w within function
    b = b_in

    for i in range(num_iters):
        # Calculate the gradient and update the parameters
        dj_db, dj_dw = gradient_function(X, y, w, b)

        # Update parameters using w, b, alpha and gradient
        w = w - alpha * dj_dw
        b = b - alpha * dj_db

        # Save cost J at each iteration
        if i < 100000:  # prevent resource exhaustion
            J_history.append(cost_function(X, y, w, b))

        # Print cost at intervals 10 times or as many iterations if < 10
        if i % math.ceil(num_iters / 10) == 0:
            print(f"Iteration {i:4d}: Cost {J_history[-1]:8.2f}")

    return w, b, J_history  # return final w, b and J history for graphing


# initialize parameters
initial_w = np.zeros_like(w_init)
initial_b = 0.
# some gradient descent settings
iterations = 1000
alpha = 5.0e-7
# run gradient descent
w_final, b_final, J_hist = gradient_descent(X_train, y_train, initial_w, initial_b,
                                            compute_cost, compute_gradient,
                                            alpha, iterations)
print(f"b,w found by gradient descent: {b_final:0.2f},{w_final}")
m, _ = X_train.shape
for i in range(m):
    print(f"prediction: {np.dot(X_train[i], w_final) + b_final:0.2f}, target value: {y_train[i]}")

# plot cost versus iteration
fig, (ax1, ax2) = plt.subplots(1, 2, constrained_layout=True, figsize=(12, 4))
ax1.plot(J_hist)
ax2.plot(100 + np.arange(len(J_hist[100:])), J_hist[100:])
ax1.set_title("Cost vs. iteration");  ax2.set_title("Cost vs. iteration (tail)")
ax1.set_ylabel("Cost")             ;  ax2.set_ylabel("Cost")
ax1.set_xlabel("iteration step")   ;  ax2.set_xlabel("iteration step")
plt.show()