The Relationship between Representational Conflicts, Generalization, and Encoding Requirements in an Instance Memory Network

Authors: Mathew Wakefield, Matthew Mitchell, Lisa Wise, Christopher McCarthy

Abstract:

This paper proposes an interpretation of artificial neural networks (ANNs) and explores some of its implications. The interpretation views an ANN as a memory that encodes instances of experience. An experiment explores the behavior of encoding and retrieval of instances from this memory. A localised-representation ANN is constructed that allows control over encoding and over the size of the retrieved memory sample, and is evaluated on the MNIST digits dataset. The relationship between input familiarity, conflict within retrieved samples, and error rates is described and shown to be an effective driver of memory encoding. Results indicate that selective encoding, combined with retrieval samples large enough to allow detection of memory conflicts, produces optimal performance, and that error rates are normally distributed with respect to input familiarity and conflict. By using input familiarity and sample consistency to guide memory encoding, the number of encoding trials was reduced to 18.33% of the training data while maintaining good recognition performance on the test data.

Keywords: Artificial Neural Networks, ANNs, representation, memory, conflict monitoring, confidence.
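To make the encoding mechanism concrete, the following is a minimal sketch (not the authors' implementation) of how input familiarity and conflict within a retrieved memory sample can gate what gets encoded into an instance memory. The retrieved sample size k, the familiarity threshold, and all class and method names are illustrative assumptions; distances are plain Euclidean over flattened pixel vectors.

import numpy as np

class InstanceMemory:
    """Illustrative instance memory with conflict-driven selective encoding."""

    def __init__(self, k=5, familiarity_threshold=4.0):
        self.k = k                                    # retrieved sample size (assumed)
        self.familiarity_threshold = familiarity_threshold  # assumed cutoff
        self.instances = []                           # encoded input vectors
        self.labels = []                              # their class labels

    def retrieve(self, x):
        """Return distances and labels of the k stored instances nearest to x."""
        dists = np.linalg.norm(np.array(self.instances) - x, axis=1)
        idx = np.argsort(dists)[: self.k]             # ascending by distance
        return dists[idx], [self.labels[i] for i in idx]

    def encode_if_needed(self, x, label):
        """Encode x only when it is unfamiliar or the retrieved sample conflicts."""
        if len(self.instances) < self.k:              # bootstrap the memory
            self.instances.append(x)
            self.labels.append(label)
            return True
        dists, sample_labels = self.retrieve(x)
        unfamiliar = dists[0] > self.familiarity_threshold  # no close match
        conflict = len(set(sample_labels)) > 1              # sample disagrees
        wrong = sample_labels[0] != label                   # retrieval would err
        if unfamiliar or conflict or wrong:
            self.instances.append(x)
            self.labels.append(label)
            return True
        return False                                        # skip this encoding trial

    def predict(self, x):
        """Classify by majority vote over the retrieved sample."""
        _, sample_labels = self.retrieve(x)
        return max(set(sample_labels), key=sample_labels.count)

# Hypothetical usage over MNIST-style data: one pass over training pairs,
# encoding selectively, then majority-vote classification at test time.
#   memory = InstanceMemory(k=5, familiarity_threshold=4.0)
#   for x, y in zip(train_images, train_labels):
#       memory.encode_if_needed(x.ravel() / 255.0, y)
#   prediction = memory.predict(test_images[0].ravel() / 255.0)

In this sketch an input is encoded only when no close match exists, the retrieved labels disagree, or retrieval would misclassify it, which is one plausible reading of using familiarity and sample consistency to reduce the number of encoding trials.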

