---
id: 5e8f2f13c4cdbe86b5c72da5
title: 'Reinforcement Learning With Q-Learning: Example'
challengeType: 11
videoId: RBBSNta234s
bilibiliIds:
  aid: 848073871
  bvid: BV1uL4y187Eq
  cid: 409139471
dashedName: reinforcement-learning-with-q-learning-example
---
# --question--
## --text--
Fill in the blanks to complete the following Q-Learning equation:
```py
Q[__A__, __B__] = Q[__A__, __B__] + LEARNING_RATE * (reward + GAMMA * np.max(Q[__C__, :]) - Q[__A__, __B__])
```
## --answers--
A: `state`

B: `action`

C: `next_state`

---
A: `state`

B: `action`

C: `prev_state`

---
A: `state`

B: `reaction`

C: `next_state`
## --video-solution--
1