As artificial intelligence (AI) advances rapidly, integrating learning-based intelligence into industrial robot control systems is increasingly desirable. Serial manipulators are used extensively in the automotive, aerospace, and medical industries. Over time, actuator degradation can impair their performance. Reinforcement learning offers a way to build adaptive control systems that compensate for this degradation, which can extend the robot's service life and reduce downtime.
This project explores the application of reinforcement learning (RL) to improve control strategies for serial manipulators, such as ABB's IRB series (Figure 1 (a)) and the Franka Emika Panda (Figure 1 (b)), which are 6- and 7-axis articulated robots, respectively.

Figure 1 (a): ABB IRB 2400

Figure 1 (b): Franka Emika Panda Robot
Figure 1: Serial manipulators explored in this project: (a) ABB IRB 2400 and (b) Franka Emika Panda
This project introduces fundamental concepts in robotics, with a primary focus on serial manipulators with six degrees of freedom (DOF). It begins by explaining the Denavit-Hartenberg (D-H) convention and presenting the corresponding parameters for the IRB 2400 and Franka Emika Panda robots. These robots were chosen because training datasets for machine learning applications are available for them and because they are comprehensively documented, though the primary focus is on the IRB series from ABB. The study first discusses typical control setups for serial manipulators before turning to learning-based methods.
To establish a foundation, the derivations of the forward and inverse kinematics and of the dynamic model of a 6-DOF serial manipulator are included for completeness. Linear regression was first used to model forward kinematics on the IRB 120 dataset from Kaggle. However, it could not capture the nonlinearity inherent in forward kinematics, which motivated more expressive models. A neural network with ReLU activation, three hidden layers, and 256 neurons per layer was explored next. Building on this, the study searched over neural network architectures and found that a configuration with LeakyReLU activation, three hidden layers, and 1000 neurons per layer performed best.
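Why a linear model struggles with forward kinematics can be seen on a toy case. The sketch below is not the IRB 120 experiment. It assumes a hypothetical one-link planar arm of unit length whose tip coordinate is $x = \cos q$, and fits a straight line to $x$ as a function of the joint angle $q$. All names (`fit_line`, `r_squared`, `linear_fit_quality`) are illustrative.

```python
import math


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y ~ intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def linear_fit_quality(q_min, q_max, samples=201, link_length=1.0):
    """R^2 of a linear fit to the tip x-coordinate of a one-link planar arm.

    Assumption: hypothetical toy arm, x = link_length * cos(q).
    """
    qs = [q_min + (q_max - q_min) * k / (samples - 1) for k in range(samples)]
    xs = [link_length * math.cos(q) for q in qs]
    slope, intercept = fit_line(qs, xs)
    return r_squared(qs, xs, slope, intercept)


# Over the full joint range the linear fit explains almost none of the variance.
r2_full = linear_fit_quality(-math.pi, math.pi)
# Over a narrow range around q = pi/2 the map is nearly linear.
r2_local = linear_fit_quality(math.pi / 2 - 0.1, math.pi / 2 + 0.1)
print(f"R^2 full range: {r2_full:.4f}, R^2 local range: {r2_local:.6f}")
```

A linear model is therefore adequate only as a local approximation. Over the full workspace the trigonometric terms dominate, which is consistent with the move to neural networks described above.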
This study further explores reinforcement learning using the Actor-Critic method and neural network solutions for inverse kinematics. The findings suggest that learning-based methods can significantly reduce the computational burden of serial manipulator control, enabling real-time computation and improved responsiveness. Additionally, integrating sensor data into training datasets provides a means to compensate for actuator degradation over time.
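For orientation, a common textbook form of the one-step Actor-Critic update is given below. This is an assumption for illustration, since the variant used in this study is not specified here. The symbols are also assumptions: $V_w$ is the critic with parameters $w$, $\pi_\phi$ is the actor (policy) with parameters $\phi$, $\gamma$ is the discount factor, and $\alpha_w$, $\alpha_\phi$ are learning rates.

$$
\delta_t = r_{t+1} + \gamma V_w(s_{t+1}) - V_w(s_t)
$$

$$
w \leftarrow w + \alpha_w \, \delta_t \, \nabla_w V_w(s_t), \qquad \phi \leftarrow \phi + \alpha_\phi \, \delta_t \, \nabla_\phi \log \pi_\phi(a_t \mid s_t)
$$

The critic reduces its temporal-difference error $\delta_t$, and the actor shifts probability toward actions that produced a positive $\delta_t$.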
In conclusion, the integration of artificial intelligence into serial manipulators offers substantial benefits, enhancing both computational efficiency and adaptability in control systems.
To place an object in an arbitrary position and orientation within the workspace, at least six Degrees of Freedom (DOF) are necessary: three for position and three for orientation. This study investigates a serial manipulator with six joints, which gives the robot six DOF.
One of the most widely used conventions for attaching reference frames to the links of a spatial kinematic chain is the Denavit-Hartenberg (D-H) convention. It employs four parameters to define the relative position and orientation between consecutive links. These D-H parameters are:
$a_{i-1}$ : The link length, the perpendicular distance between the z-axes of two consecutive link frames, measured along their common normal.
$\alpha_{i-1}$ : The link twist, the angle about the common normal (the x-axis) from the old z-axis to the new z-axis.
$d_i$ : The link offset, the displacement along the previous z-axis to the common normal. For a prismatic joint, this is the joint variable; for a revolute joint, it is constant.
$\theta_i$ : The joint angle, measured about the previous z-axis from the old x-axis to the new x-axis. For a revolute joint, this is the joint variable; for a prismatic joint, it is constant.
![Figure 2: Description of link and joint parameters [6]](https://prod-files-secure.s3.us-west-2.amazonaws.com/484c79ae-8827-4abb-895c-4cd1fae48ce6/bc4e00a7-7adb-45df-b762-0bd115dfa6f3/Description_of_link_and_joint_parameters_(2).jpg)
Figure 2: Description of link and joint parameters [6]
<aside> 💡
Figure 2 illustrates the four D-H parameters in Cartesian space.
</aside>
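The four parameters define one homogeneous transform per link, and chaining these transforms gives the forward kinematics. The sketch below is a minimal illustration that assumes the modified (Craig) form suggested by the subscripts $a_{i-1}$, $\alpha_{i-1}$, $d_i$, $\theta_i$. Under the classic D-H convention the elementary transforms are composed in a different order, so the matrix entries differ. The parameter table is a hypothetical planar two-link arm, not the IRB 2400 or Panda parameters, and the function names are illustrative.

```python
import math


def dh_transform(alpha_prev, a_prev, d, theta):
    """Transform from frame i-1 to frame i (modified D-H form, assumed).

    Equivalent to Rot_x(alpha_prev) * Trans_x(a_prev) * Rot_z(theta) * Trans_z(d).
    """
    ca, sa = math.cos(alpha_prev), math.sin(alpha_prev)
    ct, st = math.cos(theta), math.sin(theta)
    return [
        [ct, -st, 0.0, a_prev],
        [st * ca, ct * ca, -sa, -sa * d],
        [st * sa, ct * sa, ca, ca * d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(dh_rows):
    """Chain the link transforms from the base frame to the last frame."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for alpha_prev, a_prev, d, theta in dh_rows:
        T = matmul4(T, dh_transform(alpha_prev, a_prev, d, theta))
    return T


# Hypothetical planar two-link arm with unit link lengths (illustration only).
# Each row is (alpha_{i-1}, a_{i-1}, d_i, theta_i).
q1, q2 = math.pi / 2, -math.pi / 2
dh_rows = [
    (0.0, 0.0, 0.0, q1),   # joint 1 at the base
    (0.0, 1.0, 0.0, q2),   # joint 2 at the end of link 1
    (0.0, 1.0, 0.0, 0.0),  # fixed tool frame at the end of link 2
]
T_tool = forward_kinematics(dh_rows)
tool_x, tool_y = T_tool[0][3], T_tool[1][3]
print(f"tool position: ({tool_x:.3f}, {tool_y:.3f})")
```

For these joint angles the result agrees with the planar closed form $x = \cos q_1 + \cos(q_1 + q_2)$, $y = \sin q_1 + \sin(q_1 + q_2)$, which places the tool at $(1, 1)$. A 6-DOF manipulator uses the same chain with six rows taken from its D-H table.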