An Image-Based Facial Emotion Detection Chatbot

Authors

  • WGL Harshani, Department of Computer Science, University of Sri Jayewardenepura, Sri Lanka
  • Ananda Dehigaspitiya, Department of Computer Science, University of Sri Jayewardenepura, Sri Lanka

Keywords

Facial Emotion Detection, NLP, Chatbot, FER-2013, Accuracy, LangChain

Abstract

In the evolving domain of conversational AI, integrating visual recognition capabilities into chatbots represents a pivotal step toward empathetic, context-aware interaction. This study introduces an emotion-aware chatbot system that uses facial emotion recognition (FER) to enhance emotional intelligence in human-AI communication. The problem addressed is the lack of conversational systems capable of interpreting non-verbal cues, such as facial expressions, to create meaningful and personalized interactions. Our chatbot allows users to submit facial images, enabling the system to recognize and classify emotions in real time and to generate responses tailored to the user's emotional state. The FER model was trained on the FER-2013 benchmark dataset, which categorizes expressions into seven classes: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral. Because initial training achieved only moderate results, data augmentation and hyperparameter tuning were applied to improve robustness. LangChain, an open-source framework for building conversational agents, was used to orchestrate the chatbot's conversational flow, leveraging its modular architecture for dynamic, adaptive dialogue management across textual and visual inputs. Emotions recognized by the FER model were passed to LangChain, which generated contextually relevant responses matched to the user's emotional state; the framework enabled seamless integration of visual input processing with language-based conversation, ensuring smooth transitions between emotion recognition and response generation. Unlike conventional chatbots, this system takes a multimodal approach that bridges textual and visual emotional inputs. This research contributes a detailed framework for integrating FER into conversational agents, emphasizing its potential for building rapport, improving engagement, and creating empathetic dialogue. Future work will focus on improving FER performance through advanced architectures such as Vision Transformers and larger, more diverse datasets, and on exploring real-world use cases in healthcare and customer service to demonstrate the impact of emotion-aware AI on communication platforms.
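The paper does not publish its implementation, but a minimal sketch of the kind of FER classifier the abstract describes, assuming a small Keras CNN trained on 48x48 grayscale FER-2013 images with an augmentation stage, might look as follows. Layer sizes, transforms, and hyperparameters here are illustrative assumptions, not the authors' configuration:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    NUM_CLASSES = 7  # Angry, Disgust, Fear, Happy, Sad, Surprise, Neutral

    # Augmentation layers stand in for the data-augmentation step the
    # abstract mentions; the exact transforms used are not specified.
    augmentation = tf.keras.Sequential([
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ])

    model = models.Sequential([
        tf.keras.Input(shape=(48, 48, 1)),  # FER-2013 images: 48x48 grayscale
        augmentation,                       # active during training only
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(256, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

    # Assumes one-hot labels; optimizer and loss choices are illustrative.
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

At inference time, the argmax over the seven output probabilities yields the emotion label that is handed to the dialogue layer.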
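Similarly, the LangChain orchestration step (feeding the label produced by the FER model into response generation) could be sketched as below. This is an illustrative pattern, not the authors' code; the prompt wording and the ChatOpenAI backend are assumptions:

    from langchain_core.prompts import ChatPromptTemplate
    from langchain_openai import ChatOpenAI  # requires OPENAI_API_KEY

    # Prompt template that injects the emotion label predicted by the FER model.
    prompt = ChatPromptTemplate.from_messages([
        ("system",
         "You are an empathetic assistant. The user's facial expression "
         "suggests they are feeling {emotion}. Acknowledge that feeling "
         "and respond supportively."),
        ("human", "{message}"),
    ])

    llm = ChatOpenAI(model="gpt-4o-mini")  # assumed backend; any chat model works
    chain = prompt | llm  # LangChain Expression Language: prompt output feeds the model

    # "Sad" would come from the FER classifier's prediction.
    reply = chain.invoke({"emotion": "Sad", "message": "I failed my exam today."})
    print(reply.content)

Keeping the emotion label as a plain string makes the FER model and the dialogue layer independently replaceable, which matches the modular architecture the abstract describes.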

Published

02/06/2025

How to Cite

Welihena Gamage, L., & Dehigaspitiya, A. (2025). An Image-Based Facial Emotion Detection Chatbot. International Journal of Research in Computing, 4(i), 40–47. Retrieved from http://ijrcom.org/index.php/ijrc/article/view/143
