OpenAI's REINFORCE and actor-critic examples for reinforcement learning contain the following two lines:
policy_loss = torch.cat(policy_loss).sum()
loss = torch.stack(policy_losses).sum() + torch.stack(value_losses).sum()
The first uses torch.cat, the second uses torch.stack.
As far as I can tell, the documentation doesn't clearly distinguish between them.
What is the difference between the two functions?
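To make the question concrete, here is a minimal sketch of the two calls on small tensors. The loss values and the shapes are my own assumptions for illustration (shape (1,) entries for the torch.cat case, zero-dimensional entries for the torch.stack case), not values taken from the examples themselves.

```python
import torch

# Hypothetical per-step losses of shape (1,), assumed to resemble the torch.cat case.
losses_1d = [torch.tensor([1.0]), torch.tensor([2.0]), torch.tensor([3.0])]

# torch.cat joins tensors along an existing dimension: three (1,) tensors -> (3,).
cat_1d = torch.cat(losses_1d)
# torch.stack joins tensors along a new dimension: three (1,) tensors -> (3, 1).
stack_1d = torch.stack(losses_1d)
print(cat_1d.shape, stack_1d.shape)

# Hypothetical zero-dimensional (scalar) losses, assumed to resemble the torch.stack case.
losses_0d = [torch.tensor(1.0), torch.tensor(2.0), torch.tensor(3.0)]

# torch.stack still works here: three () tensors -> (3,).
stack_0d = torch.stack(losses_0d)
print(stack_0d.shape)

# torch.cat has no existing dimension to join along, so it raises an error.
try:
    torch.cat(losses_0d)
    cat_0d_failed = False
except RuntimeError:
    cat_0d_failed = True
print(cat_0d_failed)

# After .sum() the results agree, which is why both forms can appear in similar code.
total_cat = cat_1d.sum().item()
total_stack = stack_0d.sum().item()
```

If this sketch is right, the choice in each example may simply follow from the shape of the individual loss tensors, but I would like confirmation.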
#python #machine-learning