I haven't read this paper yet, but I'm familiar with the general work. The aspect that everyone ignores is that, yes, linear transformations like matrix operations or Fourier transforms are incredibly fast in optics, but the nonlinearity is the sticking point. Optical propagation can be nonlinear, but only at very high intensities. The elephant in the room is that the linear operations rely on parallelism, i.e. they split the optical power across multiple paths, so each path has very low intensity and thus exhibits very little nonlinearity. The solution so far has been that everyone simply uses optical-to-electrical conversion and does the nonlinearity digitally (or sometimes in analog electronics). That sort of works for one layer, but it completely falls apart for multiple layers: it is neither cost- nor energy-efficient to have hundreds or possibly thousands of A/D converters.
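
To put rough numbers on the power-splitting argument, here is a minimal sketch. It assumes a Kerr-type nonlinearity, where the nonlinear phase shift is roughly phi = gamma * P * L, and an ideal lossless 1-to-N splitter. The values of `GAMMA`, `TOTAL_POWER_W` and `LENGTH_M` are made-up illustrative numbers, not measurements from any real device or from the paper.

```python
def kerr_phase_shift(gamma_per_w_m, power_w, length_m):
    """Nonlinear (Kerr) phase shift in radians: phi = gamma * P * L."""
    return gamma_per_w_m * power_w * length_m


def per_path_phase_shift(gamma_per_w_m, total_power_w, length_m, n_paths):
    """Phase shift in each path after an ideal, lossless 1-to-N power split."""
    return kerr_phase_shift(gamma_per_w_m, total_power_w / n_paths, length_m)


# Illustrative assumptions only (hypothetical values, not device data).
GAMMA = 1.0          # nonlinear coefficient, 1/(W*m)
TOTAL_POWER_W = 1.0  # optical power before splitting, W
LENGTH_M = 1.0       # interaction length, m

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        phi = per_path_phase_shift(GAMMA, TOTAL_POWER_W, LENGTH_M, n)
        print(f"{n:5d} paths -> nonlinear phase per path: {phi:.4g} rad")
```

Under these assumptions the per-path nonlinear effect drops as 1/N, so the same parallelism that makes the linear part fast is what starves each path of the intensity it would need for an all-optical nonlinearity.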