Introducing the perfect synergy

The AI agent that will help you operate your business even better

Robotic arm in industry
  • Article
  • 13 Minute Read
  • 03 Aug 2023

Businesses are always seeking greater intelligence and control over their operations, which in turn leads to greater efficiency and productivity. To remain competitive in today's rapidly evolving business world, it is essential to keep up with the latest technological advancements. The emergence of AI technologies, including foundation models like those behind ChatGPT, has demonstrated their potential to revolutionize the way we work. Digital Twins and Reinforcement Learning are two recent, highly impactful technologies with the same revolutionary potential.

By leveraging technologies such as Digital Twins and Reinforcement Learning, businesses can gain a competitive advantage by lowering costs and improving quality and safety, for example in production processes.

Conversely, companies that fail to embrace these technological changes and keep pace with innovation risk falling behind the competition. The possibilities for leveraging such technologies seem limitless, so it is time to elevate your business to the next level.

Your expert for questions

Hendrik Reese
Partner at PwC Germany
Tel: +49 89 5790-6093

The next level of automation and decision making

Digital Twins virtually replicate real-world objects and let you visualize their data, e.g. in a metaverse. They can be related to one another, and the interactions between them are framed as a Simulation. Imagine, for example, the production of a screw. It sounds like a simple process, but many parameters must be managed to achieve the right production quality. The machine can be represented by a Digital Twin, as can the metal blank. When the machine transforms the blank into a screw, the two components interact, as do their Digital Twins, which can then be used to control the process. In other words, the real-world process is simulated virtually.

“AI has emerged from the scientific castles and is ready to conquer businesses and society all over the world. Now is the time to adapt and succeed in this exciting journey.”

Hendrik Reese, Partner at PwC Germany

A wide range of tools for modeling such Simulations explicitly already exists. Even so, building these Simulations is often difficult, tedious and time-consuming, because explicit knowledge about the possible interactions between the components, and hence between their Digital Twins, is limited. For example, not all processes and interactions that influence the transformation of the blank are known, let alone can be modeled explicitly, e.g. the effects of ambient pressure or blank quality. AI can overcome this shortcoming.

Building on the data foundation of the corresponding Digital Twins, the complex influencing factors do not need to be modeled explicitly; instead, the AI learns them from data. The resulting AI-based Simulations require only the acquisition of data and save the effort of time-consuming manual Simulation modeling. AI technologies based on Digital Twins thus represent a turning point for companies that want to increase their efficiency and productivity and fully exploit the potential of their digital transformation.

An AI-based Simulation is also a suitable foundation for Reinforcement Learning (RL). This type of AI learns from experience by interacting with its environment, rather than relying solely on labeled data sets. This makes it well suited to tasks where the optimal actions, e.g. machine parameters, are not well defined or are difficult to determine a priori. Recalling the screw example, an AI trained in the simulated environment can then be deployed on the real-world machine, where it autonomously produces optimal screws without expensive human intervention. The combination of Digital Twins and RL thus has the potential to revolutionize the way business decisions are made. To back this up, surveys show that 62% of companies using RL already benefit after a maximum of three months.

Discover the benefits

Before diving into the benefits of Reinforcement Learning, let's take a quick look into the engine room: Reinforcement Learning is a subfield of AI and, in its original form, distinct from the well-known supervised machine learning paradigm that learns from labeled data (see below).

The RL set-up consists of an environment in which a so-called agent executes actions. Recalling the earlier example, the screw production machine could be controlled not by a human being but by an AI entity, the agent, which adapts the machine parameters (the actions) so that the desired quality is always produced. In a static environment without variation, e.g. in blank quality or in machine or ambient temperature, this replacement does not provide much business value. In general, though, these assumptions do not hold, and the screw manufacturing machine is embedded in a real-world environment with changing conditions. Like a human being, the agent has to learn how to adjust the machine's parameters to different environmental conditions to achieve consistent quality. The agent learns to choose the optimal machine parameters through interaction with the environment during training, and thus from experience.
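The interplay of environment, agent and actions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real machine interface: the `ScrewMachineEnv` class, the three hardness classes and the rule that the best pressure setting equals the hardness class are invented for this example. It compares a fixed machine setting with a policy that adapts to each incoming blank; the adaptive policy is written by hand here, whereas an RL agent would have to learn it.

```python
import random


class ScrewMachineEnv:
    """Toy stand-in for the screw machine (hypothetical, not a real simulator)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.hardness = 0  # hardness class of the current blank (0, 1 or 2)

    def reset(self):
        self.hardness = self.rng.randrange(3)
        return self.hardness  # the state the agent observes

    def step(self, action):
        # Assumption: a screw is good only if the pressure setting (action)
        # matches the hardness class of the blank.
        reward = 1.0 if action == self.hardness else 0.0
        self.hardness = self.rng.randrange(3)  # the next blank differs
        return self.hardness, reward


def run_episode(env, policy, n_screws=100):
    """Let a policy control the machine and return the number of good screws."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(n_screws):
        action = policy(state)            # the agent picks machine parameters
        state, reward = env.step(action)  # the environment responds
        total_reward += reward
    return total_reward


def fixed_policy(state):
    return 1  # always the same setting, ignores the blank


def adaptive_policy(state):
    return state  # adjusts the setting to the observed blank


print("fixed setting:   ", run_episode(ScrewMachineEnv(seed=0), fixed_policy))
print("adaptive setting:", run_episode(ScrewMachineEnv(seed=0), adaptive_policy))
```

In this toy set-up the fixed setting produces good screws only when the blank happens to match it, while the adaptive policy succeeds on every blank. If the blanks never varied, both policies would perform the same, which is why a static environment offers little value for an adaptive agent.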

The great advantage of an RL agent over a human adjusting the machine parameters for every single screw is production without delays, e.g. for setting up the machine. The agent also works around the clock at the same speed and quality, which increases productivity and leads to fewer human errors. For a technological deep-dive, see our personal favorite below: the “AI-Enthusiast Section”.

To further increase the benefits of RL, it can be combined with other machine learning techniques. One popular approach is Deep Reinforcement Learning (DRL), which combines Reinforcement Learning with deep learning. In DRL, the agent is represented by a neural network that learns how to control the machine. The major advantage is the ability to detect hidden patterns for controlling the machine that are too complex for a human operator to comprehend.
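To make the idea of an agent represented by a neural network more concrete, here is a minimal sketch of a policy network in plain Python. All names and sizes (`PolicyNetwork`, three sensor readings, four candidate machine settings, eight hidden units) are assumptions chosen for illustration, and the network is untrained: it only shows how a state is mapped to action probabilities. In practice a deep learning framework would be used, and training would adjust the weights.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create a weight matrix with small random values (no bias terms)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def forward(layer, inputs, activation):
    """Apply one fully connected layer followed by an activation function."""
    return [activation(sum(w * x for w, x in zip(row, inputs))) for row in layer]


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class PolicyNetwork:
    """Hypothetical two-layer network mapping a state to action probabilities."""

    def __init__(self, n_inputs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.hidden = make_layer(n_inputs, n_hidden, rng)
        self.output = make_layer(n_hidden, n_actions, rng)

    def action_probabilities(self, state):
        hidden_values = forward(self.hidden, state, math.tanh)
        scores = forward(self.output, hidden_values, lambda v: v)
        return softmax(scores)


policy = PolicyNetwork(n_inputs=3, n_hidden=8, n_actions=4)
state = [0.7, 0.2, 0.5]  # made-up sensor readings, e.g. temperature, hardness, speed
probs = policy.action_probabilities(state)
best_action = max(range(len(probs)), key=probs.__getitem__)
print(probs, best_action)
```

During training, a DRL algorithm would change the weights so that actions leading to high reward become more probable. The hidden layer is where patterns in the sensor readings that are hard for a human to spell out can be captured.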

The ability to autonomously find solutions and hidden strategies for complex tasks that would otherwise require intensive human intervention sets DRL apart in the AI field. This type of automation leads to greater efficiency, higher and more consistent product quality and better robustness against process variations (as in the screw production example above).

These benefits can also help, for example, to mitigate the shortage of well-trained staff. Finding well-trained staff, e.g. for production assembly or welding, is expensive and time-consuming, and RL can reduce the dependency on such highly specialized personnel. Another great advantage of RL is that creating an agent-controlled welding robot requires significantly less pre-collected, labeled data than classic machine learning approaches, e.g. computer vision for facial recognition. The reason is that an RL agent develops its own solution strategy through interaction rather than being presented with the solution to be learned, e.g. what a face looks like. The screw manufacturing machine is only the tip of the iceberg: many more applications of DRL across all industries are in development or already in use.

RL use cases

Smart Manufacturing

In a smart manufacturing facility, a Reinforcement Learning system can be employed to optimize the production process of electronic devices. The RL system interacts with the machines, adjusting parameters like machine speed, temperature, and material usage to maximize production output while minimizing defects and resource consumption. Through continuous trial and error, the RL system learns and refines its strategies, adapting to changing conditions such as raw material variations or market demand. As a result, the smart manufacturing system achieves higher production efficiency, reduced defects, and improved resource utilization, leading to enhanced profitability and competitive advantage.


Logistics

Autonomous systems in logistics based on Reinforcement Learning enable warehouses to operate efficiently without human intervention, optimizing tasks such as inventory management and material handling. By utilizing RL-based robots, the risk of accidents and injuries is reduced as humans are removed from potentially hazardous environments. The continuous operation and scalability of autonomous systems enhance productivity and meet increasing demands, leading to a more streamlined and cost-effective logistics operation.

Real Estate

A real estate company can utilize a Digital Twin to optimize energy consumption, improve tenant experiences, and enhance space management. By monitoring and analyzing data from various sources (e.g. sensors) in real time, the company can identify inefficiencies, adjust systems, and reduce energy costs. The Digital Twin also empowers tenants to control their space, request maintenance services, and access building information, leading to increased satisfaction and engagement.


Medicine

Reinforcement Learning can play a vital role in the field of medicine by enabling more robust and expedited diagnoses as well as facilitating the development of therapies. By leveraging disease outcomes and patient feedback, RL systems can continuously learn and adapt, discovering novel and improved treatment schemes that enhance patients' well-being. This iterative process allows for personalized and optimized healthcare approaches, ultimately leading to better patient outcomes and improved quality of care.


Chatbots

ChatGPT demonstrates how Reinforcement Learning can be used to make chatbots communicate in a human-like way. Through Reinforcement Learning from human feedback, human ratings of model responses are used as a reward signal. By iteratively adjusting its parameters based on this feedback, ChatGPT improves its ability to generate more appropriate and satisfactory responses, resulting in more natural and engaging conversations.


Finance

Reinforcement Learning systems in the financial sector have the potential to revolutionize investment strategies and risk management. By leveraging historical financial data, RL-based systems can simulate and evaluate the outcomes of various investment strategies, enabling the development of innovative and more robust approaches. These systems learn from past performance, adapt to changing market conditions, and optimize investment decisions, ultimately aiming to generate higher returns and minimize risks for investors.


Marketing

In marketing, a Reinforcement Learning based system can leverage customer behavior and purchase data to predict and recommend products that are most likely to maximize sales. By analyzing individual customer preferences and interactions, the RL system can continually learn and adapt its sales strategy to optimize recommendations for each customer. This personalized approach enhances the customer experience, increases the chances of successful conversions, and ultimately drives higher sales and revenue for the business.

All these applications demonstrate RL's potential, its technological readiness and its ability to create real business value across a wide range of industries. Just as importantly, a good understanding of RL from both an academic and a business perspective has made the risks of developing RL systems highly manageable.

Digital Twins facilitate trust in RL

The cost and time savings of RL can be increased even further by optimizing the training phase of RL models. Training RL models in a real-world environment is often challenging because of the high costs, long training times and risks involved. For example, training on a real welding robot works, but it takes a lot of time, causes high material costs and will most likely damage the robot itself. However, RL can also be applied in virtual environments that simulate real-world conditions well enough. This enables agent training without the risk of serious damage and massively decreases costs. Virtualized training also makes it possible to train multiple agents in parallel, which shortens training times and thus reduces costs even more. For example, training an agent for autonomous driving virtually avoids potentially costly and dangerous situations in the real world. After training in the virtual environment, the agent is deployed in the real world, where it acts according to its learned policy. The cascading development steps, embedded in agile project management, shorten the time to ROI and minimize process risks, as adjustments can be made quickly and regularly during project implementation.

“AI has gotten to the point where it is ready to be applied in real-world products. Using it, things are possible that seemed impossible not long ago. It’s great times offering great opportunities.”

Dr. Janis Kesten-Kühne, Manager at PwC Germany

The agent's applicability in the real world mainly depends on how well the virtual environment replicates the real world; the difference between the two is known as the simulation-to-reality gap. In general, it is neither necessary nor cost-effective to create a virtual environment that replicates reality exactly. Instead, it is more important to develop an environment that gives a sufficiently complete picture of reality while omitting unnecessary information. Such a focus on the essentials requires a good understanding of the real world. In the end, it leads to an appropriate balance between accuracy and cost-effectiveness that makes the best use of the advantages of the virtual environment.

Digital Twins are a great way to hit the sweet spot between cost and accuracy. In the context of RL, Digital Twins represent the real-world entities and serve as the virtual environment in which the agent is trained. Additionally, the agent itself can be represented by a Digital Twin of a real-world entity, e.g. the screw manufacturing machine. Because the Digital Twins are updated in real time, this digital representation of the real world yields well-trained agents that perform well once deployed in the real world. The semantic data structuring and real-time updates provided by Digital Twins allow DRL algorithms to be trained in a safe and trustworthy virtual environment. This foundation then enables the development of DRL models.

PwC, a partner you can trust

PwC, a trusted partner for trustworthy AI

We believe in driving business transformation through Digital Twins and Reinforcement Learning. With our strong technological background, we understand the capabilities of RL and strive to enable you to leverage the latest advances in AI. PwC is the ideal partner for businesses looking to move beyond traditional processes and outdated systems and to streamline their operations through digital transformation based on Digital Twins and RL. As a leading trust company, we have a strong focus on compliance and regulation, ensuring that our solutions meet the highest standards of safety, security, privacy, and scalability. We understand that implementing RL requires a deep understanding of the technology and its potential risks, including data accuracy, reliability, and bias. Beyond its obvious benefits, RL also offers continuous learning, adaptation to changing environments, the discovery of new solution strategies and great scalability. Our team draws on a wealth of experience in this context, e.g. in conceptualizing, managing and programming applications and in selecting appropriate data, and is ready to find the right solution for many more companies and applications, enabling the industry to leverage the benefits of RL. To ensure the trustworthiness of the data, algorithms, and RL agents used, we employ transparency, data governance, and validation processes and controls so that the decisions made are accurate, safe, secure, and ethical.

Our team of technology, governance and business experts will work closely with you to understand your unique needs and develop a customized RL and Digital Twin based solution that meets your specific requirements at every step along the way: from planning and implementation to ongoing support and maintenance. We help you maintain compliance and stay up to date with the latest regulations, as PwC believes in building long-term relationships based on trust, reliability and a commitment to excellence.

Our AI-Enthusiast Section

The central objective of an RL agent is to collect as much total reward as possible by choosing the appropriate action in each state it encounters during an episode. The reward can explicitly or implicitly encode many objectives, e.g. reaching a destination along the shortest path or producing a product of consistently the same quality.

The RL paradigm follows a clear learning procedure. During training, the agent observes a state of the environment (an element of the state space) and receives a reward from it. Based on its policy (the plan by which the agent aims to obtain the highest reward), the agent chooses in each state, possibly taking previously visited states into account, an action from the action space with which to manipulate the environment. The environment can be the real world, an appropriate simulation or some kind of well-fitted sandbox that provides the agent with the reward corresponding to the state-action pair. The action is chosen so as to maximize the expected cumulative (and most often discounted) reward on the way to the final goal. The chosen action then manipulates the environment, and the new state and the resulting reward are returned to the agent. Based on the tuple <state, action, reward>, the agent updates its policy using a specific learning algorithm to improve the expected reward. This procedure aims to find the policy that reaches the goal in the most efficient way.

After training, the inference phase follows. In this phase, the agent is deployed and independently adapts its actions to achieve the desired environmental state, based on the policy learned during training.
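The training and inference phases described above can be sketched with tabular Q-learning, one common learning algorithm; the article does not name a specific one, so this choice is an assumption. The environment is a made-up corridor of five positions in which the agent has to reach the destination along the shortest path, and the step cost, learning rate, discount factor and exploration rate are illustrative values only.

```python
import random

N_STATES, GOAL = 5, 4  # corridor positions 0..4, destination at position 4
ACTIONS = (-1, +1)     # action 0 moves left, action 1 moves right


def step(state, action_index):
    """Toy environment: returns the new state, the reward and a done flag."""
    new_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = new_state == GOAL
    reward = 0.0 if done else -1.0  # every extra step costs reward
    return new_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Training phase: learn a table of expected discounted rewards (Q-values)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Policy: mostly pick the best known action, sometimes explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda i: q[state][i])
            new_state, reward, done = step(state, action)
            # Update from the experienced state, action, reward and new state.
            target = reward if done else reward + gamma * max(q[new_state])
            q[state][action] += alpha * (target - q[state][action])
            state = new_state
    return q


def greedy_policy(q):
    """Best known action for every state."""
    return [max((0, 1), key=lambda i: q[s][i]) for s in range(N_STATES)]


def run_greedy(q, max_steps=20):
    """Inference phase: follow the learned policy and count the steps needed."""
    state, steps = 0, 0
    while state != GOAL and steps < max_steps:
        action = max((0, 1), key=lambda i: q[state][i])
        state, _, _ = step(state, action)
        steps += 1
    return steps


q_table = train()
print("learned policy:", greedy_policy(q_table))
print("steps to goal: ", run_greedy(q_table))
```

The update inside `train` uses the new state in addition to the state, action and reward, because the discounted future reward is estimated from where the action led. `run_greedy` corresponds to the inference phase: the policy is no longer changed, only applied.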

Digital Twins and AI Revolutionize Business: Trust Is Key

Digital Twins and AI are significantly transforming industries, providing efficiency and insights. Use cases from healthcare, logistics and manufacturing with reduced downtimes and improved product quality show the incredible potential of the symbiosis of Digital Twins and Reinforcement Learning. These technologies will be the decisive factor for the transformation of companies in the future.

Find out how to embrace these technologies with a clear strategy for a competitive edge in our whitepaper.

Contact us

Hendrik Reese

Partner, PwC Germany

Tel: +49 151 70423201

Dr. Janis Kesten-Kühne

Manager, PwC Germany

Tel: +49 170 9831-117