A brief description of my model:
- Consists of a single parameter X of dtype ComplexDouble and shape (20, 20, 20, 3). For reference, this must be complex because I need to perform FFTs etc. on it.
- X is used to compute a real scalar value, Y, as the output.
- The objective is to minimise the value of Y using autograd to optimize the value of X.
Simple gradient descent-based optimizers like torch.optim.SGD and torch.optim.Adam seem to work fine for this process. I would like to extend this to L-BFGS.
The problem is that when I use
optimizer = optim.LBFGS(solver.parameters())

def closure():
    optimizer.zero_grad()
    Y = model.forward()
    Y.backward()
    return Y

for i in range(steps):
    optimizer.step(closure)
I get the error
  File "xx\Python\Python38\lib\site-packages\torch\optim\lbfgs.py", line 410, in step
    if gtd > -tolerance_change:
RuntimeError: "gt_cpu" not implemented for 'ComplexDouble'
According to the source file, the directional derivative gtd comes out complex, and the comparison gtd > -tolerance_change is not implemented for complex values, which breaks the algorithm.
Is there any way to get L-BFGS working for my complex parameter (e.g. using an alternative library), or is this fundamentally impossible? I had some ideas about replacing these "faulty" dot products with something like real(a.conj() * b), but I wasn't sure whether that would work.
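For what it's worth, here is a minimal sketch of why that replacement looks reasonable. It assumes two small random complex vectors (a and b are purely illustrative, not taken from lbfgs.py) and compares the plain dot product, the real part of the conjugated dot product, and the ordinary real dot product of the stacked (real, imag) components:

```python
import torch

torch.manual_seed(0)

# Illustrative complex vectors; the names and sizes are assumptions.
a = torch.randn(5, dtype=torch.cdouble)
b = torch.randn(5, dtype=torch.cdouble)

# Plain dot product: no conjugation, so the result is a complex tensor.
complex_dot = a.dot(b)

# Real part of the conjugated dot product, Re(a^H b).
real_inner = torch.real(a.conj().dot(b))

# Treat each complex vector as a real vector of (real, imag) pairs and
# take the ordinary real dot product of the two.
stacked = torch.view_as_real(a).flatten().dot(torch.view_as_real(b).flatten())

print(complex_dot.dtype)                    # a complex dtype
print(torch.allclose(real_inner, stacked))  # the two real values agree
```

If the two real values agree, then real(a.conj().dot(b)) is just the inner product L-BFGS would see if X were stored as a real tensor with twice as many entries, so the quantities it compares stay real.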
CodePudding user response:
My intuition was correct. I replaced every occurrence of a.dot(b) in the file with torch.real(a.conj().dot(b)) and L-BFGS is working great!
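A different approach that avoids editing lbfgs.py (not what I did above, just a sketch) is to store the parameter as a real tensor with a trailing dimension of size 2 and build the complex view inside the forward pass with torch.view_as_complex, so the optimizer only ever sees real tensors. The objective, shapes, and the names target, x_real, and objective below are hypothetical stand-ins for the real model:

```python
import torch

torch.manual_seed(0)

# Hypothetical toy problem: recover `target` by matching its FFT.
target = torch.randn(4, 4, dtype=torch.cdouble)

# Real-valued parameter; the last dimension holds (real, imag) parts.
x_real = torch.zeros(4, 4, 2, dtype=torch.double, requires_grad=True)

def objective():
    x = torch.view_as_complex(x_real)           # complex view of the real storage
    diff = torch.fft.fftn(x) - torch.fft.fftn(target)
    return (diff.real ** 2 + diff.imag ** 2).sum()  # real scalar

# The optimizer is given only the real tensor.
optimizer = torch.optim.LBFGS([x_real], max_iter=50)

def closure():
    optimizer.zero_grad()
    y = objective()
    y.backward()
    return y

initial = objective().item()
for _ in range(5):
    optimizer.step(closure)
final = objective().item()

print(initial, final)
```

This keeps the installed library untouched, at the cost of one view_as_complex call wherever the model needs the complex form of X.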
