Attitude control in quadrotor unmanned aerial vehicle (UAV) systems is traditionally managed by optimal control loops tuned to minimize performance errors. While robust, these loops perform sub-optimally in dynamic and unpredictable environments. This has inspired new interest in sophisticated approaches such as reinforcement learning (RL), which should...