

[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Day 10: Training an RNN to count to three)
[#]: via: (https://jvns.ca/blog/2020/11/20/day-10--training-an-rnn-to-count-to-three/)
[#]: author: (Julia Evans https://jvns.ca/)
Day 10: Training an RNN to count to three
======
Yesterday I was trying to train an RNN to generate English that sounds kind of like Shakespeare. That was not working, so today I instead tried to do something MUCH simpler: train an RNN to generate sequences like
```
0 1 2 0 1 2 0 1 2 0 1 2
```
and slightly more complicated sequences like
```
0 1 2 1 0 1 2 1 0 1 2 1 0 1 2 1 0
```
I used (I think) the exact same RNN that I couldn't get to work yesterday when I trained it on Shakespeare to generate English, so it was cool to see that I could at least use it for this much simpler task (memorizing short sequences of numbers).
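To make "memorize a short sequence" a bit more concrete, here is a minimal sketch of how the two sequences above could be turned into training data, assuming the usual next-token setup where the model sees each number and has to predict the one that follows. The helper names `make_sequence` and `next_token_pairs` are made up for illustration and are not taken from the gist.

```python
from itertools import cycle, islice

# Hypothetical helpers for illustration; these names are not from the gist.

def make_sequence(pattern, length):
    """Repeat `pattern` until the sequence has `length` items."""
    return list(islice(cycle(pattern), length))

def next_token_pairs(seq):
    """Assumed next-token framing: the target is the input shifted by one."""
    return seq[:-1], seq[1:]

simple = make_sequence([0, 1, 2], 12)     # 0 1 2 0 1 2 ...
bouncy = make_sequence([0, 1, 2, 1], 17)  # 0 1 2 1 0 1 2 1 ...

inputs, targets = next_token_pairs(simple)
print(simple)
print(inputs)
print(targets)
```

One reason the second sequence is "slightly more complicated": in `0 1 2 1 0 ...` a `1` is sometimes followed by `2` and sometimes by `0`, so the model cannot get it right from the current number alone and has to rely on its hidden state.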
### the jupyter notebook
It's late so I won't explain all the code in this blog post, but here's the PyTorch code I wrote to train the RNN to count to three.
* Here it is as a [github gist][1]
* and [here it is on Colab][2] if you want to run it yourself
The gist includes a few experiments with different sequence lengths. Unsurprisingly, it takes longer to train the RNN to memorize a sequence of length 20 than a sequence of length 5.
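For readers who want a rough idea of what such an experiment can look like without opening the notebook, here is a small PyTorch sketch. It is not the code from the gist: the class name `CountingRNN`, the hidden size, the learning rate, and the number of steps are all assumptions picked for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the run repeatable

class CountingRNN(nn.Module):
    """Hypothetical model for illustration; not the architecture from the gist."""
    def __init__(self, vocab_size=3, hidden_size=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.RNN(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, hidden=None):
        states, hidden = self.rnn(self.embed(tokens), hidden)
        return self.out(states), hidden

def train(seq, steps=300, lr=0.01):
    """Train on one sequence; steps and lr are assumed values."""
    data = torch.tensor(seq).unsqueeze(0)        # shape (1, len(seq))
    inputs, targets = data[:, :-1], data[:, 1:]  # predict the next number
    model = CountingRNN()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        logits, _ = model(inputs)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return model, losses

def generate(model, start, length):
    """Feed the model its own most likely prediction, one number at a time."""
    tokens, hidden = [start], None
    with torch.no_grad():
        for _ in range(length - 1):
            logits, hidden = model(torch.tensor([[tokens[-1]]]), hidden)
            tokens.append(int(logits[0, -1].argmax()))
    return tokens

model, losses = train([0, 1, 2] * 4)
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
print(generate(model, 0, 12))
```

Passing a longer list to `train` and watching how many steps the loss needs to get close to zero is one way to try the sequence-length comparison yourself.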
### simplifying is nice
I'm super happy that I got an RNN to do something that I actually understand! I feel pretty hopeful that on Monday I'll be able to go back to the character RNN problem of trying to get the RNN to generate English words now that I have this simpler thing working.
--------------------------------------------------------------------------------
via: https://jvns.ca/blog/2020/11/20/day-10--training-an-rnn-to-count-to-three/
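Author: [Julia Evans][a]
Topic selection: [lujun9972][b]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)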
[a]: https://jvns.ca/
[b]: https://github.com/lujun9972
[1]: https://gist.github.com/jvns/b8804fb9d0672ce147a28d22648b4bd7
[2]: https://colab.research.google.com/gist/jvns/b8804fb9d0672ce147a28d22648b4bd7/rnn-123.ipynb