Gradient descent is a method that is popular in many areas. It is used in various deep learning models and optimization problems. The technique is a fundamental approach for minimizing functions whose minimum we cannot find directly, such as the loss of deep learning layers.
Deep learning layers are linear map functions. They try to map the input data X to the target vector y. Layers are a powerful building block in deep learning models. However, we need to find the layer weights that minimize the error between the prediction vector and the target vector. Finding the optimal weights is a very hard task.
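To make that concrete, here is a tiny sketch of a linear layer and the error we want to minimize. The tensors X, y, and w below are made-up example values, not a real dataset:
import torch
X = torch.randn(5, 3)                    # 5 samples, 3 features (made-up data)
y = torch.randn(5)                       # target vector (made-up data)
w = torch.randn(3, requires_grad=True)   # the layer weights we need to find
y_pred = X @ w                           # the linear map: prediction vector
error = ((y_pred - y) ** 2).mean()       # the error we want to minimize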
This article explains the basic concept of gradient descent. We will also dive into PyTorch code that implements gradient descent.
Basic Concept of Gradient Descent
A good way to understand this concept is to visualize, visualize, and visualize the problem that we need to minimize.
The example problem used in this article is the simple function:

y = x²
Although that is a very simple function, it can be a good example to visualize. We need to plot the function and its slopes:
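If you want to draw the plot yourself, here is a minimal matplotlib sketch. It assumes the function y = x² and its slope 2x; the sample points and tangent segments are just illustrative choices:
import numpy as np
import matplotlib.pyplot as plt
xs = np.linspace(-4, 4, 100)
ys = xs ** 2                             # the function y = x^2
plt.plot(xs, ys, label="y = x^2")
# Draw short tangent segments at a few points to visualize the slopes
for x0 in [-3, -1.5, 0, 1.5, 3]:
    slope = 2 * x0                       # derivative of x^2 at x0
    ts = np.linspace(x0 - 0.7, x0 + 0.7, 10)
    plt.plot(ts, x0 ** 2 + slope * (ts - x0), "r--")
plt.legend()
plt.show()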

Each point on the function has a different slope, and the lowest point of the function is where the slope is 0. For y = x², the slope at a point x is 2x, so it is 4 at x = 2 and 0 at x = 0. Notice that the slopes tell us which direction the optimal solution lies in, right?
What if we move the point in the direction given by the slope line?


Let's do it.

The above image shows our experiment. We expected it to work well, but the point just swings back and forth between two positions. You can see the problem in the tiny plain-Python sketch below.
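Here, x = 2 is a made-up starting point and we move by the full slope at every step:
x = 2.0                    # made-up starting point
for i in range(4):
    slope = 2 * x          # slope of y = x^2 at the current point
    x = x - slope          # move by the full slope (no learning rate)
    print(x)               # prints -2.0, 2.0, -2.0, 2.0: the point just swings
Hmmm, how do we fix it?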
Ahaaaa, why don't we reduce the value of the slope before we move? We can do that by multiplying the slope by a learning rate:

x_new = x - learning_rate * slope
Let's try it.

Amazing, we can implement gradient descent as a simple function, as sketched below.
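Here is a minimal pure-Python sketch of that simple function; the starting point 2.0 and learning rate 0.1 are just example values:
def gradient_descent(x, lr=0.1, steps=10):
    for i in range(steps):
        slope = 2 * x          # derivative of y = x^2
        x = x - lr * slope     # scale the step by the learning rate
    return x

print(gradient_descent(2.0))   # about 0.21, getting close to the minimum at x = 0
In the next step, we move on to the PyTorch code section. Let's go!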
Autograd in PyTorch
How do we find the gradient, the slope at a point x, with PyTorch? First, we need to define the function in PyTorch code.

Let's code the above function:
import torch
# Set the initial point of x
x = torch.tensor(2.0, requires_grad = True)  # requires_grad=True: we want the grad with respect to x
# Set our function
y = x**2

Ok, we have a function to experiment with for the gradient descent method. Let's explore each tensor first.
What about the x and y?
x
# tensor(2., requires_grad=True)
# grad's value before backward => None
x.grad

Hmm, the grad of x doesn't return a value yet. Let's try the backward method on y:

# backward the function to find the grad.
y.backward()

And let's try to print out the grad of x again.
# grad's value after backward => 4
x.grad
# tensor(4.)

Alright, we get the tensor(4.) value from x.grad.
PyTorch accumulates gradient values every time the backward method is called. So, we need to reset the grad value to zero, as the small sketch below shows.
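For example, in a separate sketch with a fresh tensor w and the same kind of function, calling backward twice without resetting adds the gradients together:
w = torch.tensor(2.0, requires_grad=True)
z = w ** 2
z.backward(retain_graph=True)
print(w.grad)   # tensor(4.)
z.backward()
print(w.grad)   # tensor(8.) -> the new gradient was added to the previous one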
# set the grad to zero again
x.grad.data.zero_()
# tensor(0.)

Gradient Descent Experiment with PyTorch
We need to define the function to be optimized by our method. In this case, we want to find the lowest value of y.
# Set the number of steps
N = 9
# Set our learning rate
lr = 0.1
# set the initial point
x = torch.tensor(4., requires_grad = True)
# set the function to be optimized
y_fn = lambda x: x**2
y = y_fn(x)

Then we run the iterations that minimize our function and move toward the optimal solution:
cost_value = []
for i in range(N):
    y.backward(retain_graph=True)      # compute the gradient of y with respect to x
    grad = x.grad.data
    x.data = x.data - lr * grad        # gradient descent update step
    x.grad.data.zero_()                # reset the accumulated gradient
    cost_value.append(y_fn(x.data))    # record the cost at the new x
    # print(f"iteration: {i}\ngrad: {grad}\nx: {x.data}\ny: {y_fn(x.data)}")
    # print('='*25)
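To check the result, we can print the recorded cost values. With the starting point 4 and learning rate 0.1 used above, they should shrink step by step (roughly 10.24, 6.55, 4.19, ... heading toward 0):
for i, c in enumerate(cost_value):
    print(f"step {i}: y = {float(c):.4f}")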
The value of y goes down at every step. Wow, this works well. We have implemented gradient descent with PyTorch.
If you enjoy reading code, you can find the code at this link:
Enjoy it. If you like my content, please subscribe to my Medium.