R: Performing Gradient Descent Directly on Functions Instead of Data Points from the Function

Time:01-15

I am using the R programming language. Using the gradientR function from the linked website, I tried to run gradient descent on the following example:

x = seq(1,1000, by=1)
y = (x^3) - (2*x) -5

gdec.eta1 = gradientR(y = y, X = x, eta = 100, iters = 1000)

However, I got the following error:

Error in if (sqrt(sum(grad^2)) <= epsilon) { : 
  missing value where TRUE/FALSE needed

Can someone please show me what I am doing wrong? Why is this error being produced?

And does anyone know if some other gradient descent functions in R would allow you to "directly" optimize this function instead of generating points from this function?

Something like this:

func2 <- function(x) {
  x^3 - 2* x - 5
}

gradientR(func2, eta = 100, iters = 1000)

Does anyone know if this is possible?

Thanks!

Note: This works for the following example (from the website linked above):

> y = rnorm(n = 10000, mean = 0, sd = 1)
> x1 = rnorm(n = 10000, mean = 0, sd = 1)
> x2 = rnorm(n = 10000, mean = 0, sd = 1)
> x3 = rnorm(n = 10000, mean = 0, sd = 1)
> x4 = rnorm(n = 10000, mean = 0, sd = 1)
> x5 = rnorm(n = 10000, mean = 0, sd = 1)
> 
> ptm <- proc.time()
> gdec.eta1 = gradientR(y = y, X = data.frame(x1,x2,x3, x4,x5), eta = 100, iters = 1000)
[1] "Initialize parameters..."
[1] "Algorithm converged"
[1] "Final gradient norm is 9.80308529574335e-05"

I just don't know why it doesn't work for my example.

Answer:

We assume that gradientR solves the problem you have and the question is getting it to work with your input. There are several problems here:

  1. One cannot pass a function to gradientR. y and X must be vectors or matrices.

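  2. Many optimization problems have scaling issues if you give them numbers of very different magnitudes, and this one is no exception. Here x runs up to 1000 and y up to about 1e9, so the iterates most likely overflow and the gradient becomes NaN, which would explain why the if condition evaluates to NA instead of TRUE/FALSE. Use x/1000 instead of x.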

  3. Just to be clear on what the underlying problem being solved is, it is to find a coefficient vector b such that the sum of squares of the residual vector y - cbind(1, x) %*% b is minimized, where y and x are known. Several commenters interpreted the problem differently, but if their interpretation is what you want then gradientR is not applicable and, in any case, the problem of maximizing or minimizing x^3 - 2x - 5 over all real x has no finite solution, since the cubic is unbounded in both directions.

  4. If you want to pass func instead of y then just write a simple wrapper, grad2, as shown below.
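
The scaling issue in point 2 can be illustrated with a small stand-in for gradientR. This is only a sketch: gd_ls is a hypothetical helper written for this illustration (plain gradient descent on the mean squared error with a fixed start and a fixed step), not the actual gradientR code, and the step size of 0.2 is an assumption chosen so that the scaled problem converges.

```r
# Hypothetical stand-in for gradientR: gradient descent on mean squared error.
gd_ls <- function(y, x, step = 0.2, iters = 1000) {
  X <- cbind(1, x)                  # design matrix with an intercept column
  b <- c(0, 0)                      # fixed starting point, for reproducibility
  for (i in seq_len(iters)) {
    resid <- y - drop(X %*% b)
    grad <- -2 * drop(crossprod(X, resid)) / length(y)
    if (!all(is.finite(grad))) {    # Inf or NaN: the iteration has blown up
      return(list(coef = b, iters = i, diverged = TRUE))
    }
    b <- b - step * grad
  }
  list(coef = b, iters = iters, diverged = FALSE)
}

f <- function(x) x^3 - 2 * x - 5
x_raw <- seq(1, 1000, by = 1)
x_scaled <- x_raw / 1000

raw <- gd_ls(f(x_raw), x_raw)           # unscaled inputs
scaled <- gd_ls(f(x_scaled), x_scaled)  # inputs scaled to (0, 1]

raw$diverged      # the gradient stops being finite on the unscaled data
scaled$diverged   # the scaled run completes normally
scaled$coef       # close to coef(lm(y ~ x)) on the scaled data
```

With the unscaled data each step multiplies the error by a large factor, so the gradient overflows after a few dozen iterations. A convergence test like the one in the error message then compares a NaN against epsilon, which yields NA.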

To fix the scaling issue use it with x/1000 and y as shown.

x = seq(1, 1000, by = 1) / 1000
y = (x^3) - (2*x) -5

gdec.eta1 = gradientR(y = y, X = x, eta = 100, iters = 1000)
str(gdec.eta1)
## List of 2
##  $ coef  : num [1:2, 1] -5.2 -1.1
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:2] "rep.1..length.y.." "X"
##   .. ..$ : NULL
##  $ l2loss: num [1:260] 101.44 50.34 25.5 13.83 8.86 ...

# check that lm gives the same coefficients
coef(lm(y ~ x))
## (Intercept)           x 
##   -5.200701   -1.098500 
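
The least-squares objective from point 3 can also be checked directly, without gradientR or lm. The sketch below solves the normal equations (X'X) b = X'y in base R. The names X, b and rss are chosen here for illustration only.

```r
x <- seq(1, 1000, by = 1) / 1000
y <- x^3 - 2 * x - 5

X <- cbind(1, x)                            # intercept column plus x
b <- solve(crossprod(X), crossprod(X, y))   # solves (X'X) b = X'y
drop(b)                                     # intercept and slope

# Residual sum of squares, the quantity being minimized
rss <- function(b) sum((y - X %*% b)^2)

# Moving away from b can only increase the residual sum of squares
rss(b) < rss(b + c(0.01, 0))
```

The coefficients agree with the lm output above, which confirms that gradient descent, lm and the normal equations are all minimizing the same sum of squares.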

Now define a function which takes func rather than y. func must be such that func(x) is y. We test it out at the end and it gives the same coefficients.

# func must be such that func(X) gives Y
grad2 <- function(func, X, ...) gradientR(func(X), X, ...)

# test
x = seq(1, 1000, by = 1) / 1000
func <- function(x) (x^3) - (2*x) -5

# grad2 (defined above) evaluates func on x and passes both to gradientR
gdec.etal2 <- grad2(func, x, eta = 100, iters = 1000)
str(gdec.etal2)
## List of 2
##  $ coef  : num [1:2, 1] -5.2 -1.1
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:2] "rep.1..length.y.." "X"
##   .. ..$ : NULL
##  $ l2loss: num [1:238] 141.66 69.93 34.71 17.59 9.57 ...
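
If the goal really is to optimize x^3 - 2x - 5 itself rather than to fit a line to points sampled from it, gradientR is not the right tool, but a local minimum can still be found once the search is restricted. The following is a minimal sketch under that assumption. optimize is a base R function (stats package), whereas gd_fun and func2_grad are hypothetical helpers written for this illustration, and the interval c(0, 2), the start x0 = 1 and the step eta = 0.1 are arbitrary choices.

```r
func2 <- function(x) x^3 - 2 * x - 5

# Base R: one-dimensional minimization over a bounded interval
opt <- optimize(func2, interval = c(0, 2))
opt$minimum     # approximately sqrt(2/3), the local minimizer
opt$objective   # value of func2 at that point

# Hypothetical helper: gradient descent on a function, given its derivative
gd_fun <- function(grad, x0, eta = 0.1, iters = 1000, epsilon = 1e-8) {
  x <- x0
  for (i in seq_len(iters)) {
    g <- grad(x)
    if (abs(g) <= epsilon) break    # stop once the derivative is close to zero
    x <- x - eta * g
  }
  x
}

func2_grad <- function(x) 3 * x^2 - 2   # derivative of func2
x_min <- gd_fun(func2_grad, x0 = 1)
x_min           # also approximately sqrt(2/3)
```

Both approaches find only the local minimum near sqrt(2/3). Started from a point to the left of -sqrt(2/3), the gradient descent sketch would run off towards minus infinity, which is the unboundedness mentioned in point 3.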