Description
Hi Patrick! Thank you so much for your inspiring work.
I am currently trying to implement sequence-to-sequence forecasting with a CDE, but the output is smoother when I do not add time as an additional dimension to my data, and I am not sure whether I have implemented the code correctly.

My input data has shape (batch size, input sequence length, feature dimension). I map it to a latent space with a GRU, giving a tensor of shape (batch size, input sequence length, latent dimension), which I treat as the observation at each time step. I then concatenate the time points with this latent vector (adding 1 to the latent dimension) before fitting the cubic spline to these features:
```python
x = torch.cat((t_x, x), dim=2)
```
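For context, after this concatenation I fit the spline and take the initial observation `X0` roughly as follows (a simplified sketch; the choice of `torchcde.natural_cubic_coeffs` is my assumption about the intended usage):

```python
import torchcde

# x: (batch, input_seq_len, latent_dim + 1) time-augmented latent observations
coeffs = torchcde.natural_cubic_coeffs(x)   # spline coefficients over the input window
X = torchcde.CubicSpline(coeffs)            # the control path X(t)
X0 = X.evaluate(X.interval[0])              # value at the first time point, used for z0
```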
After that I do:
```python
z0 = self.mlp(X0)
t_forecast = torch.arange(start=0, end=t_y.shape[1], dtype=dtype, device=self.args.device)  # the forecasting time steps
z_T = cdeint(X=X, z0=z0, func=self.func, t=t_forecast)  # (batch size, output sequence length, latent dimension)
pred_y = self.decoder(z_T)  # linear layer on the latent dimension; output size: (batch size, output sequence length)
```
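(`self.func` is the usual CDE vector field, returning something of shape (batch, hidden channels, input channels); a minimal sketch of such a module, with placeholder channel sizes, is below.)

```python
import torch

class CDEFunc(torch.nn.Module):
    """Vector field f_theta(z): maps the hidden state to a (hidden x input) matrix."""
    def __init__(self, input_channels, hidden_channels):
        super().__init__()
        self.input_channels = input_channels
        self.hidden_channels = hidden_channels
        self.linear = torch.nn.Linear(hidden_channels, hidden_channels * input_channels)

    def forward(self, t, z):
        # z: (batch, hidden_channels)
        out = self.linear(z).tanh()
        # reshape so cdeint can contract this against dX/dt
        return out.view(*z.shape[:-1], self.hidden_channels, self.input_channels)
```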
The loss is then computed between `pred_y` and the ground truth.
My questions are:
- Is it correct to do time series forecasting with a CDE in this way? If it is, why does the version without `t` work better?
- If I want to do extrapolation, is it correct to feed a longer `t` into the CDE via `torch.arange` (roughly as in the sketch below)?
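To be concrete, by "a longer `t`" I mean something like this (`extra_steps` is a placeholder for how far past the training horizon I want to go):

```python
# same setup as above, but integrate past the original forecast horizon
extra_steps = 10  # placeholder: number of additional steps to extrapolate
t_extrap = torch.arange(start=0, end=t_y.shape[1] + extra_steps,
                        dtype=dtype, device=self.args.device)
z_T = cdeint(X=X, z0=z0, func=self.func, t=t_extrap)
pred_y = self.decoder(z_T)  # (batch size, output sequence length + extra_steps)
```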
Thank you in advance!