The inverted pendulum system is generally modeled as a linear system, so the model is valid only for small oscillations of the pendulum.
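The restriction to small oscillations comes from the usual linearization step, which replaces sin(theta) with theta. The sketch below is only an illustration of how quickly that approximation degrades; the function name `linearization_error` and the sample angles are assumptions chosen for the example and do not come from the source.

```python
import math


def linearization_error(theta_rad: float) -> float:
    """Relative error of the small-angle approximation sin(theta) ~ theta.

    Hypothetical helper for illustration; theta_rad is the pendulum angle
    from the upright position, in radians.
    """
    exact = math.sin(theta_rad)
    if exact == 0.0:
        # At theta = 0 the approximation is exact; avoid dividing by zero.
        return 0.0
    return abs(theta_rad - exact) / abs(exact)


if __name__ == "__main__":
    # Sample angles (assumed values) to show the error growing with amplitude.
    for deg in (5, 15, 30, 60):
        theta = math.radians(deg)
        print(f"{deg:>2} deg: relative error = {linearization_error(theta):.2%}")
```

The error stays well below one percent for swings of a few degrees and grows rapidly beyond that, which is why a linear model is normally trusted only near the upright equilibrium.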
The algorithm revises the reinforcement signal and improves the exploration policy to overcome the negative effect of limit cycles in the inverted pendulum system.