2 Ways to Build Your FAQBot
A chatbot is an AI solution that simulates a conversation with an end-user in natural language over a messaging app, the phone, or the web.
Why are ChatBots important?
A chatbot is one of the most promising platforms for interaction between humans and machines. An AI chatbot can handle loosely phrased questions, which means you don't have to be overly specific when asking the bot for an answer. AI-based chatbots leverage Natural Language Processing (NLP) techniques to understand the question and formulate an answer to it. Building an AI-based chatbot requires a large knowledge base: the larger the knowledge-base/QnA set, the more complex the NLP techniques it can apply and the smarter it becomes.
In business, the most common use of AI-based chatbots is to answer customer queries on the web or to help the customer-service team resolve customer queries quickly. This type of AI-based chatbot is called an FAQ bot.
FAQ ChatBot
Customers usually have questions they ask over and over, and to respond to them you can provide an interactive platform that helps customers find the answers themselves. In this scenario, you can deploy an FAQ bot, which automatically responds to customers with solutions to their queries. It is most useful to ship the chatbot with your application or website so it can resolve users' questions there.
Types of architecture models for FAQ ChatBot
- Retrieval-Based
- Generative
Retrieval-Based Models
A retrieval-based model is much easier to build and more reliable, though it doesn't guarantee 100% accuracy when responding to a customer query. It surfaces the most probable responses, and you can control the bot so it never sends an inappropriate response to the customer.
These days, retrieval-based models are the ones most widely used in business. They identify the context of the message and deliver the most relevant response from a predefined knowledge base.
Generative Models
Generative models are used to develop smart bots and are a cutting-edge solution. However, this type of model is rarely used in business applications.
These models are comparatively difficult to build and develop. Training them requires a large investment of time and effort, with millions of examples of historical chats/conversations. Deep learning is typically used to implement generative conversational models. Even then, the responses from the model may not be reliable.
Learning Approaches for Retrieval-Based Model
Here, we will discuss two different learning approaches that we have used to implement a retrieval-based model for an FAQ chatbot:
Supervised Learning for FAQ chatbot
Architecture
Features
- Requires an intent and context for each question in the FAQ set
- Creates graph-based relationships between the words within an intent
Training
- Create several variations of each question in the FAQ set
- Label the intent and context of each question
- Train a classification model to predict the intent of the input query (see the sketch after this list)
- Design the prediction logic for a given question
- Build the logic/rules for response generation
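To make the training step concrete, here is a minimal sketch of the intent-classification part, assuming scikit-learn is available; the example question variations and intent labels are illustrative placeholders, not part of any real FAQ set.

```python
# Minimal sketch: train an intent classifier on hand-written question variations.
# The questions and intent labels below are hypothetical examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

training_questions = [
    "how do I reset my password",
    "I forgot my password, what should I do",
    "where can I change my password",
    "what are your support hours",
    "when is customer support available",
]
training_intents = [
    "reset_password", "reset_password", "reset_password",
    "support_hours", "support_hours",
]

# TF-IDF features feed a simple linear classifier that predicts the intent label.
intent_classifier = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
intent_classifier.fit(training_questions, training_intents)
```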
Response
- Predicts the intent and identifies the context
- Responds with an answer based on the class probabilities (a probability-threshold sketch follows this list)
- Creates the response based on the intent and context
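Here is a minimal sketch of the response step, reusing the `intent_classifier` from the training sketch above; the probability threshold and the answer texts are assumptions for illustration, not values from a real deployment.

```python
# Minimal sketch: pick the most probable intent and fall back when confidence is low.
FALLBACK_THRESHOLD = 0.6  # assumed value; tune it on your own data

answers = {
    "reset_password": "You can reset your password from the account settings page.",
    "support_hours": "Our support team is available 9 AM to 6 PM on weekdays.",
}

def respond(query: str) -> str:
    probabilities = intent_classifier.predict_proba([query])[0]
    best_index = probabilities.argmax()
    # Low confidence means the query likely falls outside the known intents.
    if probabilities[best_index] < FALLBACK_THRESHOLD:
        return "Sorry, I am not sure I understood. Could you rephrase your question?"
    return answers[intent_classifier.classes_[best_index]]

print(respond("I can't remember my password"))
```

Because the classifier exposes a probability for each intent, the fallback is just a threshold check on the top class.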
Assets
- Can handle contextual conversation
- Responses are more reliable
- On-premise solution
- Builds a domain-specific vocabulary from the training dataset, so it can handle words that fall outside the common English vocabulary
- A fallback mechanism is easy to integrate, as the model gives a direct probability for each intent
Limitations
- Requires spending a lot of time generating training data
- Preparing training data becomes difficult for a large set of unique FAQs
- Cannot handle questions that are merely semantically similar to those in the training dataset
- A chatbot sounds intelligent when it has the ability to understand and maintain the context of a conversation and respond to the human accordingly
- Scalability is a pain point, as every new question requires generating variations along with its intent and context for training
Semi-supervised Learning for FAQ ChatBot
Architecture
Features
- Pre-processes the text data by removing stop words, punctuation, and garbage text (a small pre-processing sketch follows this list)
- Converts the text into vectors using NLP techniques such as TF-IDF and word2vec (with pre-trained word vectors)
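A minimal sketch of the pre-processing step described above, using only the Python standard library; the stop-word list here is a tiny illustrative subset, not a complete one.

```python
# Minimal sketch: lower-case, strip punctuation, and drop stop words.
import string

STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "for", "do", "i", "my"}

def preprocess(text: str) -> list[str]:
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [token for token in text.split() if token not in STOP_WORDS]

print(preprocess("How do I reset my password?"))  # ['how', 'reset', 'password']
```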
Training
- This technique doesn't actually require a training phase, provided a suitable domain-related word2vec model is already available
- Requires training only when the FAQ contains domain-specific words that are not present in the common English vocabulary (see the word-vector training sketch after this list)
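When the FAQ is full of domain-specific terms, a small word-vector model can be trained on domain text. Here is a minimal sketch using gensim's Word2Vec; the tokenised corpus below is a hypothetical placeholder, and a real corpus would need thousands of sentences for the vectors to be useful.

```python
# Minimal sketch: train domain-specific word vectors with gensim (4.x API).
from gensim.models import Word2Vec

domain_corpus = [
    ["reset", "the", "faqbot", "api", "token", "from", "the", "dashboard"],
    ["the", "faqbot", "api", "token", "expires", "after", "ninety", "days"],
]

domain_model = Word2Vec(
    sentences=domain_corpus,  # pre-tokenised sentences
    vector_size=100,          # dimensionality of the word vectors
    window=5,
    min_count=1,
    workers=2,
)
word_vectors = domain_model.wv  # look-up table: word -> dense vector
```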
Response
- Computes the distance between the feature vector of the query question and that of each question in the FAQ dataset
- Identifies the most similar question in the FAQ
- Responds with the answer of the most similar question from the predefined FAQ (a retrieval sketch follows this list)
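A minimal sketch of the retrieval step, assuming scikit-learn: the query and every FAQ question are converted to TF-IDF vectors, cosine similarity picks the closest question, and a similarity threshold drives the fallback. The FAQ entries and the threshold value are illustrative assumptions.

```python
# Minimal sketch: retrieve the answer of the most similar FAQ question.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

faq = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("What are your support hours?", "Support is available 9 AM to 6 PM on weekdays."),
]
questions = [question for question, _ in faq]

vectorizer = TfidfVectorizer(stop_words="english")
question_vectors = vectorizer.fit_transform(questions)

def answer(query: str, threshold: float = 0.3) -> str:
    query_vector = vectorizer.transform([query])
    similarities = cosine_similarity(query_vector, question_vectors)[0]
    best = similarities.argmax()
    # Fall back when no FAQ question is similar enough to the query.
    if similarities[best] < threshold:
        return "Sorry, I could not find a matching question in the FAQ."
    return faq[best][1]

print(answer("I forgot my password"))
```

Adding a new FAQ entry only means appending a question-answer pair and re-fitting the vectorizer, which is what makes this approach easy to scale.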
Assets
- Can capture the semantic meaning of the text
- A scalable approach, as you only need to add new FAQs to the knowledge base
- Doesn't require a training phase once the domain vocabulary and word vectors are prepared
- A fallback mechanism can be integrated
Limitations
- Cannot handle contextual conversations
- Responses are sometimes irrelevant, as it relies on pre-trained word vectors and numeric similarity
- Requires testing on sample data when integrating the fallback mechanism, since the threshold depends on the computed distance metrics
Summary
Chatbots have become exceptionally popular these days for improving customer experience in various business models. Multiple learning methods are available for implementing a chatbot. In this blog, two learning methodologies for implementing a retrieval-based FAQ bot were discussed: 1. supervised learning and 2. semi-supervised learning.
The supervised-learning method can be used when a small amount of data is available, or when different variations of the same question are available for training. But when you have a large number of question-answer pairs, the semi-supervised approach is more useful. Supervised learning can handle contextual conversations, though the training workflow and rules need to be provided while building the implementation architecture. In contrast, semi-supervised learning is implemented with automated functions, so no external rules are integrated into the system. Supervised learning requires more effort to prepare training data every time new samples are added. The semi-supervised model is a scalable approach: you just add new QA pairs to the training file and let the system learn the semantic meanings and relations within the text.
In conclusion, when the scope of the FAQBot is limited and contextual conversation is required, the supervised-learning approach fits well, and the responses of the system will be more reliable in that case. But when a more scalable system is required and a large number of QA pairs are accessible, the semi-supervised model is more suitable for implementing the FAQBot. The system understands the semantic meaning of the text to generate responses, so the responses may not be reliable in the initial phase, but the system will learn and improve its responses over time.
So if you like this blog, don't forget to give us 👏