People's approach to chatbots with neural networks is all wrong. A seq2seq network only predicts the next most likely character. It's a prediction network, not a generator. A second network needs to interface with those predictions, pay attention to the meaningful parts of the input, reason with them, and be trained to generate desired responses. However, there are major issues with the way neural networks are trained that need to be addressed first.
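As a toy illustration of why chained next-character predictions don't amount to purposeful generation, here's a bigram count model standing in for the seq2seq network. Everything in it (the training string, the names) is made up for the demo:

```python
# Toy illustration: "generation" here is just repeated greedy
# next-character prediction, and it quickly degenerates into a loop.
from collections import Counter, defaultdict

text = "the cat sat on the mat. the cat ate. the mat sat."
counts = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    counts[a][b] += 1  # how often character b follows character a

def predict_next(c):
    """Return the single most likely next character after c."""
    return counts[c].most_common(1)[0][0]

# "Generating" text = chaining one-step predictions.
out = "t"
for _ in range(20):
    out += predict_next(out[-1])
print(out)  # "the the the the the t" -- fluent-looking, but mindless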
First, a seq2seq network that predicts 'footwear' instead of 'shoes' incurs more training error than one that predicts 'shot', simply because 'shot' shares characters with the target while 'footwear' barely shares any, even though 'footwear' is semantically the better answer. Such a loss function is nonsensical. Words, phrases, sentences, paragraphs, and topics need to be represented as meaningful latent vectors, optimally compressed together, so that semantic closeness translates into low loss.
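Here's a minimal sketch of the mismatch. The per-character error metric is a crude stand-in for the real character-level loss, and the three-dimensional "embeddings" are made-up toy vectors, not the output of any trained model:

```python
# Character-level error is blind to meaning; a semantic embedding isn't.
import numpy as np

def char_level_error(prediction, target):
    """Fraction of character mismatches, padding the shorter word."""
    n = max(len(prediction), len(target))
    p, t = prediction.ljust(n), target.ljust(n)
    return sum(a != b for a, b in zip(p, t)) / n

print(char_level_error("shot", "shoes"))      # 0.4   -> "close" to the target
print(char_level_error("footwear", "shoes"))  # 0.875 -> nearly maximal error

# With (hypothetical) semantic embeddings the ordering flips:
emb = {
    "shoes":    np.array([0.9, 0.8, 0.1]),  # footwear-ish direction
    "footwear": np.array([0.8, 0.9, 0.1]),
    "shot":     np.array([0.1, 0.0, 0.9]),  # unrelated direction
}

def cosine_distance(a, b):
    return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_distance(emb["footwear"], emb["shoes"]))  # ~0.007 -> low loss
print(cosine_distance(emb["shot"], emb["shoes"]))      # ~0.84  -> high loss
```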
Second, researchers have disregarded the neuroplasticity of the brain because backpropagation delivers quick results. Without plasticity, though, it's impossible to do one-shot learning or to generalize knowledge to new tasks. There will also never be a proper dataset to train on, because for any given input there are multiple good responses, and supervised learning severely punishes creative answers that are better than the ones in the dataset.
Solving these two issues is vital to creating intelligent chatbots. Consider: someone counts to five on their fingers while saying, "ichi, ni, san, shi, go." If you copied the finger counting, you could remember at least some of these foreign numbers, right? Even if you only got one right. This works because neurons that fire together, wire together. The neurons that light up for those sounds get subtly wired to the neurons that light up for counting on your fingers, so they become vaguely activated the next time you count on your fingers. With intelligent attention, such as recalling what the person looked like or any other detail that activates the neurons wired together by the experience, you can amplify that subtle activation and remember the numbers clearly. With repetition the connection grows stronger and stronger, until you no longer need to count on your fingers at all.
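Here's a toy version of that wiring in code. It's a plain Hebbian associative memory of my own construction, not anything from the papers below: random ±1 vectors stand in for the two activity patterns, and other random pairs stand in for prior experiences that make the fresh memory faint:

```python
# "Fire together, wire together": a Hebbian memory linking a
# finger-counting pattern to a sound pattern via an outer product.
import numpy as np

rng = np.random.default_rng(0)
n = 200  # neurons per pattern

fingers = rng.choice([-1.0, 1.0], size=n)  # "counting on fingers" activity
sounds  = rng.choice([-1.0, 1.0], size=n)  # "ichi, ni, san, ..." activity

# Many unrelated experiences already stored in the same synapses;
# this interference is what makes the new memory subtle at first.
W = sum(np.outer(rng.choice([-1.0, 1.0], n), rng.choice([-1.0, 1.0], n))
        for _ in range(200)) / n

def recall_overlap(W):
    """How strongly does the finger cue evoke the sound pattern? (-1 to 1)"""
    return float(np.dot(np.sign(W @ fingers), sounds) / n)

W += np.outer(sounds, fingers) / n              # one-shot Hebbian wiring
print("after one experience:", recall_overlap(W))  # faint, but above chance

for _ in range(5):                              # repetition strengthens it
    W += np.outer(sounds, fingers) / n
print("after repetition:", recall_overlap(W))   # close to 1.0
```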
I plan to publish my informal research here in a few weeks, along with code to play around with, but for now, if you're interested in the subject, here's some further reading:

>MIT scientists discover fundamental rule of brain plasticity
news.mit.edu/2018/mit-scientists-discover-fundamental-rule-of-brain-plasticity-0622
>tl;dr: neurons can only strengthen their synapses by weakening those of neighbouring neurons

>Learning to learn with backpropagation of Hebbian plasticity
arxiv.org/abs/1609.02228
>makes the plasticity of each connection a learnable parameter in addition to the baseline weights, so some parts of the network are hard-wired, some soft-wired (rough sketch below)

>The Kanerva Machine: A Generative Distributed Memory
arxiv.org/abs/1804.01756
>optimal online compression that greatly outperforms differentiable neural computers
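To make the "soft-wired" idea concrete, here's a rough sketch of the mechanism in the second paper as I understand it: every connection gets a fixed weight plus a plasticity coefficient that scales a fast Hebbian trace. The names and constants are mine, and this simplifies the paper's actual formulation:

```python
# Sketch of differentiable Hebbian plasticity: the effective weight of
# each synapse is w + alpha * hebb, where w and alpha would be learned
# by backprop and hebb is a fast trace updated within an episode.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 8, 4

w     = rng.normal(0, 0.5, (n_out, n_in))  # hard-wired part
alpha = rng.normal(0, 0.5, (n_out, n_in))  # how plastic each synapse is
hebb  = np.zeros((n_out, n_in))            # fast trace, reset per episode
eta   = 0.1                                # trace learning rate

def step(x, hebb):
    """One forward step with plastic weights."""
    y = np.tanh((w + alpha * hebb) @ x)
    # Hebbian update: connections between co-active units strengthen.
    hebb = (1 - eta) * hebb + eta * np.outer(y, x)
    return y, hebb

x = rng.normal(size=n_in)
for t in range(3):
    y, hebb = step(x, hebb)
    print(f"step {t}: mean |hebb| = {np.abs(hebb).mean():.3f}")  # trace builds up
```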