Artificial Intelligence (AI) Learning
As discussed earlier, in Reinforcement Learning the agent makes decisions in order to maximize its rewards. These rewards are the reinforcements through which this type of agent learns.
Reinforcements are of two types:
Positive Reinforcement:
When the agent completes a task and the feedback (the points awarded for the task) is positive, this is termed positive reinforcement. This type of reinforcement improves the agent's performance: it signals that making decisions and performing tasks in this particular manner will earn more rewards in the future as well.
Negative Reinforcement:
Whenever the agent fails to perform a task as required, it is given negative reinforcement, that is, a negative reward or penalty. This can be thought of as punishing a child for misbehaving. Negative reinforcement tells the agent that this kind of performance or decision must be avoided in the future when solving similar problems.
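The effect of the two kinds of reinforcement can be illustrated with a minimal Python sketch. It assumes the agent keeps one numeric preference per action and nudges it toward each reward it receives. The names `update_preference`, `step_size`, `good_move`, and `bad_move` are hypothetical and chosen only for illustration; they are not from the original article.

```python
def update_preference(preference, reward, step_size=0.1):
    """Move the stored preference a small step toward the received reward."""
    return preference + step_size * (reward - preference)

# Hypothetical actions; both start with a neutral preference of 0.0.
preferences = {"good_move": 0.0, "bad_move": 0.0}

# Positive reinforcement: the task succeeded, so the reward is positive.
preferences["good_move"] = update_preference(preferences["good_move"], +1.0)

# Negative reinforcement: the task failed, so the reward is negative.
preferences["bad_move"] = update_preference(preferences["bad_move"], -1.0)

# The agent now favors the action with the highest preference.
best_action = max(preferences, key=preferences.get)
```

After one update each, the rewarded action has a higher preference than the penalized one, so an agent that picks the highest-valued action repeats the behavior that earned the reward and avoids the one that was punished.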
Factors on which the performance of an agent that learns through reinforcement depends:
Input:
The agent takes the initial state as the input from which it has to start. This is an important phase because all observations and inferences are drawn starting from this state, and the agent's past states are not considered.
Output:
The output state that the system reaches after solving a given problem is not fixed: there are multiple ways of solving a problem, and the agent can choose a different solution each time it solves the same type of problem.
Training/Learning:
The training phase, or learning phase, is when the agent builds its knowledge base from the rewards or punishments it receives for the outputs it produces. It is a very important phase in Reinforcement Learning because it lets the agent learn from experience in a way similar to how humans do. Reproducing this human-like behavior in agents is a main goal of Artificial Intelligence.
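The three factors above (a starting input state, a non-fixed output, and a training phase driven by rewards) can be put together in a small sketch. The example below uses tabular Q-learning, one common reinforcement learning technique that the article itself does not name, on a hypothetical five-cell corridor where the agent starts at cell 0 and is rewarded for reaching cell 4. The environment, the reward values, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions made for illustration.

```python
import random

N_STATES = 5          # cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)    # move one cell left or one cell right
START_STATE = 0       # the input: the state every episode starts from

def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # positive reinforcement at the goal
    return next_state, -0.1, False      # small penalty for every other move

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Build the agent's knowledge base (a Q-table) from rewards and penalties."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = START_STATE, False
        while not done:
            # Occasionally explore, so the path taken is not fixed.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Q-learning update: shift the estimate toward the reward
            # plus the discounted value of the best next action.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q_table = train()
# Preferred action in each non-goal cell after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
```

In this sketch the Q-table plays the role of the knowledge base: it starts empty of information and is shaped only by the rewards and penalties the agent collects. After training, `greedy_policy` should prefer moving right in every cell, since that is the shortest route to the rewarded goal.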
Translated from: https://www.includehelp.com/ml-ai/main-points-of-reinforcement-learning-in-artificial-intelligence.aspx