I need help with two questions, 3 and 4, NOT 5. Subject: Robotics.

3. Reinforcement Learning

Reinforcement learning (RL) agents learn by taking state-dependent actions and experiencing reward arising from interaction with their environments. One method is to use a table-based Q-learning algorithm.

Figure 1: The inverted pendulum problem

Q-learning tables are discrete, but most real-world tasks involve systems that have continuous states and are controlled using continuous actions. With this in mind, consider how a table-based Q-learning algorithm could learn to balance an inverted pendulum (as shown in Fig. 1). To achieve this:

(a) Describe a suitable reward function.

[3 marks]

(b) Describe a suitable choice of states and explain why they are appropriate.

[3 marks]

(c) Describe a suitable choice of actions and explain why they are appropriate and how they relate to the states discussed in part (b).

[3 marks]

(d) Discuss how an inverted pendulum task could be either an MDP or a POMDP. [2 marks]

(e) Discuss how simulated experience generated from a model within an RL agent can increase the speed with which the RL algorithm converges. How can this assist in finding a solution to the inverted pendulum task?

[4 marks]

(f) The Dyna-Q algorithm is one such model-based approach to RL. Using high-level pseudocode in no more than 12 lines, describe the operation of the Dyna-Q algorithm and describe all its key terms.

[5 marks]
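
For anyone working through parts (e) and (f), here is a rough Python sketch of tabular Dyna-Q. It is only an illustration under stated assumptions, not a model answer: the 5-state chain environment (`chain_step`), the parameter values, and the deterministic-model assumption are all made up for the example, and a pendulum version would first need its angle and angular velocity binned into discrete states. It shows the three ingredients usually named for Dyna-Q: the direct Q-learning update from real experience, the learned model, and the planning updates replayed from that model.

```python
import random

def dyna_q(step, n_states, n_actions, start, episodes=200, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q sketch. Assumes a deterministic, episodic environment."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]  # action-value table
    model = {}  # learned model: (state, action) -> (reward, next_state, done)

    def greedy(s):
        # Greedy action with random tie-breaking
        best = max(Q[s])
        return rng.choice([a for a in range(n_actions) if Q[s][a] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning update: alpha = learning rate, gamma = discount
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])

    for _ in range(episodes):
        s, done = start, False
        while not done:
            # Epsilon-greedy exploration
            a = rng.randrange(n_actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)
            update(s, a, r, s2, done)        # direct RL from real experience
            model[(s, a)] = (r, s2, done)    # model learning
            for _ in range(planning_steps):  # planning from simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps2, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return Q

def chain_step(s, a):
    """Hypothetical 5-state chain: action 1 moves right, action 0 moves left.
    Reaching state 4 gives reward 1 and ends the episode."""
    s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    done = s2 == 4
    return (1.0 if done else 0.0), s2, done

Q = dyna_q(chain_step, n_states=5, n_actions=2, start=0)
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(4)]
```

Setting `planning_steps=0` reduces this to plain Q-learning, which makes it easy to compare how many real steps each variant needs before the greedy policy settles.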

4. State estimation

(a) When building a full state feedback controller, why is it often necessary to use some form of state estimator?

[3 marks]

(b) The Luenberger observer is a deterministic state estimator. Draw its signal flow graph to illustrate its operation and explain the design and function of the Luenberger gain L.

[3 marks]

(c) The Kalman filter is a stochastic state estimator. Draw and compare a signal flow graph of the Kalman estimator with that of the Luenberger observer, illustrating all the Kalman estimator’s important components, including its noise sources.

[4 marks]

(d) The Kalman filter iteratively computes 5 variables as illustrated below

Write a short paragraph on each of the terms 1 – 5 to explain their meaning and function.

[10 marks]
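
The figure for part (d) did not come through in the post, so the following is a guess at what the five terms are: the usual predict/update quantities of a discrete Kalman filter. This scalar Python sketch is only meant to show how they fit together in one iteration. The function name `kalman_step`, the default values for `A`, `B`, `H`, `Q`, `R`, and the measurement list are all assumptions made for the example.

```python
def kalman_step(x, P, u, z, A=1.0, B=0.0, H=1.0, Q=1e-4, R=0.01):
    """One iteration of a scalar Kalman filter (illustrative sketch).

    x, P : previous state estimate and its error variance
    u, z : control input and new measurement
    Q, R : assumed process and measurement noise variances
    """
    # 1. Predicted (a priori) state estimate from the system model
    x_pred = A * x + B * u
    # 2. Predicted (a priori) error variance, inflated by process noise
    P_pred = A * P * A + Q
    # 3. Kalman gain: weighs prediction uncertainty against measurement noise
    K = P_pred * H / (H * P_pred * H + R)
    # 4. Corrected (a posteriori) state estimate using the innovation
    x_new = x_pred + K * (z - H * x_pred)
    # 5. Corrected (a posteriori) error variance
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new, K

# Usage: estimate a roughly constant value from hypothetical noisy readings
measurements = [1.1, 0.9, 1.05, 0.95, 1.0] * 4
x, P = 0.0, 1.0  # poor initial guess with large uncertainty
for z in measurements:
    x, P, K = kalman_step(x, P, u=0.0, z=z)
```

A Luenberger observer applies the same kind of correction as step 4, but with a fixed, designer-chosen gain L in place of the gain K that the Kalman filter recomputes from the noise variances.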

5. Gaussian processes

Describe the main difference between using Gaussian Processes and Support Vector Machines in approximating linear functions.

[20 marks]
