Reading paper list
Deep Reinforcement Learning for Optimal Portfolio Allocation https://icaps23.icaps-conference.org/papers/finplan/FinPlan23_paper_4.pdf. Uses a 60-day look-back window for the MVO (mean-variance optimization) baseline.

Policy gradient

Policy gradient lecture slides: https://rail.eecs.berkeley.edu/deeprlcourse-fa18/static/slides/lec-5.pdf (from https://rail.eecs.berkeley.edu/deeprlcourse-fa18/)
Lilian Weng, policy gradient: https://lilianweng.github.io/posts/2018-04-08-policy-gradient/
The difference between policy-based and value-based methods: https://www.reddit.com/r/reinforcementlearning/comments/mkz9gl/policybased_vs_valuebased_are_they_truly_different/

$$ J(\theta) = E_{\tau \sim p_\theta(\tau)} \left[ \sum_{t} r(s_t, a_t) \right] $$

$$ \theta^* = \underset{\theta}{\arg\max}\, J(\theta) $$

The objective function \( J(\theta) \) is the expected return of a policy parameterized by \( \theta \). \( \tau \sim p_\theta(\tau) \) denotes a trajectory sampled from the trajectory distribution induced by the policy \( \pi_\theta \).
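A minimal NumPy sketch of how \( J(\theta) \) can be estimated in practice: sample trajectories \( \tau \sim p_\theta(\tau) \) under a tabular softmax policy, average their returns to estimate \( J(\theta) \), and accumulate \( \sum_t \nabla_\theta \log \pi_\theta(a_t|s_t) \, R(\tau) \) for the REINFORCE-style gradient estimate. The 2-state toy MDP, horizon, and trajectory count below are made-up assumptions for illustration, not taken from the linked material.

```python
# Monte Carlo estimate of J(theta) and of the REINFORCE gradient on a toy MDP.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, horizon = 2, 2, 10
# P[s, a, s'] : transition probabilities; R[s, a] : rewards (arbitrary toy values)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.3, 0.7], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

def softmax_policy(theta, s):
    """pi_theta(a|s) as a softmax over per-state logits theta[s]."""
    z = np.exp(theta[s] - theta[s].max())
    return z / z.sum()

def sample_trajectory(theta):
    """Sample tau ~ p_theta(tau); return visited states, actions, and total reward."""
    s, states, actions, ret = 0, [], [], 0.0
    for _ in range(horizon):
        probs = softmax_policy(theta, s)
        a = rng.choice(n_actions, p=probs)
        states.append(s)
        actions.append(a)
        ret += R[s, a]
        s = rng.choice(n_states, p=P[s, a])
    return states, actions, ret

def estimate_J_and_grad(theta, n_traj=2000):
    """Monte Carlo estimates of J(theta) and its REINFORCE gradient."""
    J, grad = 0.0, np.zeros_like(theta)
    for _ in range(n_traj):
        states, actions, ret = sample_trajectory(theta)
        J += ret
        for s, a in zip(states, actions):
            probs = softmax_policy(theta, s)
            # grad_theta log pi(a|s) for a softmax policy: one-hot(a) - pi(.|s)
            g = -probs
            g[a] += 1.0
            grad[s] += g * ret
    return J / n_traj, grad / n_traj

theta = np.zeros((n_states, n_actions))
J, grad = estimate_J_and_grad(theta)
print("estimated J(theta):", J)
print("estimated grad J(theta):\n", grad)
```

Taking a step in the direction of the estimated gradient (gradient ascent on \( J \)) is exactly the vanilla policy gradient update described in the slides.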

The Bellman equation plays a vital role in reinforcement learning.

Iterative Policy Evaluation

$$ V_{k+1}(s) = \sum_{a, s'} \pi(a|s)\, p(s'|s,a) \left[ r(s, a, s') + \gamma V_{k}(s') \right] $$

\( V_{k} \): value function after the \( k \)-th iteration.
\( \pi(a|s) \): policy, the probability of taking action \( a \) in state \( s \).
\( p(s'|s,a) \): probability of the next state \( s' \) given state \( s \) and action \( a \).
\( r(s, a, s') \): reward for that transition.
\( \gamma \): discount factor.
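A minimal sketch of the iterative policy evaluation update above, assuming a made-up 2-state/2-action MDP and a uniform policy; the transition table, rewards, and stopping tolerance are illustrative only.

```python
# Iterative policy evaluation:
# V_{k+1}(s) = sum_{a,s'} pi(a|s) p(s'|s,a) [ r(s,a,s') + gamma * V_k(s') ]
import numpy as np

n_states, n_actions = 2, 2
gamma = 0.9

# p[s, a, s'] : transition probabilities; r[s, a, s'] : rewards (toy values)
p = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.3, 0.7], [0.6, 0.4]]])
r = np.array([[[1.0, 0.0], [0.0, 0.0]],
              [[0.0, 0.0], [0.0, 2.0]]])

# pi[s, a] : a fixed (here uniform) policy to evaluate
pi = np.full((n_states, n_actions), 1.0 / n_actions)

V = np.zeros(n_states)
for k in range(1000):
    # Bellman expectation backup for every state at once
    V_new = np.einsum("sa,sat,sat->s", pi, p, r + gamma * V)
    if np.max(np.abs(V_new - V)) < 1e-8:  # stop when a sweep barely changes V
        break
    V = V_new

print(f"converged after {k + 1} sweeps, V = {V}")
```

Repeated sweeps converge to \( V^\pi \), the value function of the evaluated policy, which is the fixed point of the Bellman expectation backup.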