What is gradient descent?
Gradient descent is an iterative algorithm designed to search for a local minimum of a function.
Local minimum
Let us start by defining what a minimum is. A minimum is the point on a function with the lowest y-value. So where does the "local" come from? To get a feel for it, graph the following function on Desmos: y=0.2x^4+2x^3+5x^2+3x. You will quickly see that there are two separate low points, one deeper than the other. Imagine you can't draw the entire graph because you want to save computational power. If you could only see the right part of the graph, you would point to roughly (-0.4, -0.525) and say: "Well, this has to be the minimum. It goes up on both sides", not knowing that it is only a local minimum: the lowest point in its neighborhood, but not on the whole curve. This will make more sense when we look at how gradient descent works.
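If you would rather check the two low points numerically than read them off a graph, here is a minimal sketch. It simply samples the curve on a fine grid and reports every sample that is lower than both of its neighbours. The helper name find_local_minima, the interval from -8 to 2, and the number of samples are assumptions picked for illustration, not part of gradient descent itself.

```python
def f(x):
    """The example curve from the text."""
    return 0.2 * x**4 + 2 * x**3 + 5 * x**2 + 3 * x


def find_local_minima(func, start, stop, steps):
    """Return (x, y) for every sampled point that is lower than both neighbours.

    Hypothetical helper: a brute-force scan, only meant to confirm the picture.
    """
    width = (stop - start) / steps
    xs = [start + i * width for i in range(steps + 1)]
    ys = [func(x) for x in xs]
    return [
        (xs[i], ys[i])
        for i in range(1, steps)
        if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]
    ]


# Assumed search window [-8, 2], sampled in steps of 0.001.
minima = find_local_minima(f, -8.0, 2.0, 10_000)
for x, y in minima:
    print(f"local minimum near x = {x:.2f}, y = {y:.2f}")
```

The scan should report two low points, one near x = -5.26 and one near x = -0.38, with the left one being much deeper. Scanning the whole curve like this is exactly the expensive "draw the entire graph" approach that gradient descent tries to avoid.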
How does the algorithm work?
To follow how gradient descent works you need a bit of calculus knowledge. If you already have that, this section will be much easier to follow.
Imagine yourself as a ball on top of the function curve. For this example we will keep using y=0.2x^4+2x^3+5x^2+3x, because it shows exactly what we are looking for. The ball rolls along the curve, always in the downhill direction, because gravity pulls it down. We can calculate which way the slope points and how steep it is with some entry-level calculus: we just need to take the derivative of the function. For f(x) = x^2, for example, the derivative is f'(x) = 2x. We move left if the value of the derivative at our current position is positive, and right if it is negative. Back to our ball: if it starts on the right side of the curve, rolls down, and reaches the low point at roughly (-0.4, -0.525), gravity will keep it trapped in that hole, even though it isn't the deepest hole on the entire curve.
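The rolling ball can be sketched in a few lines of Python. For the example curve, the power rule gives the derivative f'(x) = 0.8x^3 + 6x^2 + 10x + 3. In each step, the ball moves against the sign of the derivative, by an amount proportional to how steep the slope is. The learning rate of 0.01, the 2000 steps, and the two starting positions are assumptions chosen so that this example settles down. They are not values prescribed by the method.

```python
def f(x):
    """The example curve from the text."""
    return 0.2 * x**4 + 2 * x**3 + 5 * x**2 + 3 * x


def f_prime(x):
    """Derivative of f, obtained with the power rule."""
    return 0.8 * x**3 + 6 * x**2 + 10 * x + 3


def gradient_descent(x, learning_rate=0.01, steps=2000):
    """Roll the ball downhill, starting at x.

    learning_rate and steps are assumed values for this sketch.
    """
    for _ in range(steps):
        # Positive slope -> move left, negative slope -> move right.
        x -= learning_rate * f_prime(x)
    return x


# Two assumed starting positions, one on each side of the hump between the holes.
right_start = gradient_descent(1.0)
left_start = gradient_descent(-7.0)

print(f"start at  1.0 -> x = {right_start:.3f}, y = {f(right_start):.3f}")
print(f"start at -7.0 -> x = {left_start:.3f}, y = {f(left_start):.3f}")
```

Starting on the right, the ball should come to rest in the shallow hole near x = -0.38. Starting on the left, it should end up in the deeper hole near x = -5.26. The algorithm is the same in both runs, and only the starting position decides which minimum it finds.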