How to Create a Chatbot for Your Website

Do you have a small online store or a personal website and want to improve your customer service?

A chatbot can be a good way to answer frequently asked questions (FAQ), explain your shipping policies and share your latest news.

In this article, you will learn step by step how to build and deploy a chatbot for your own goals.

In this specific case, I deploy a chatbot that uses the Retrieval-Augmented Generation (RAG) technique to answer questions about the latest appointments published by the State Agency for the Official State Gazette (BOE). As an example, we will work with the appointment of Mr. José Fernández Albertos as Director of the Public Policy Department of the Cabinet of the Presidency of the Government.

You only need two things:

A free OpenAI account (ChatGPT)
A free HuggingFace account

Below you can see the result of following these steps, and you can ask the chatbot about the appointment:

What are we going to do?

Retrieval Augmented Generation (RAG)

RAG is a process that lets us use our own information and documents to extend the knowledge of language models such as the one behind ChatGPT.

In the first step we transform our documents into vectors, because although ChatGPT appears to work with words, it actually works with numbers. An embedding model performs this conversion.

Finally, when we make a new query, the system first searches the previously added documents and retrieves the relevant passages. It then answers the query, giving priority to that information.
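The retrieve-then-answer flow described above can be sketched in a few lines of Python. This is only a toy illustration, not the code used in the deployed chatbot: the word-count vector stands in for a real embedding model, the two sample documents are hypothetical, and the function names (embed, cosine, retrieve, build_prompt) are invented for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a word-count vector. A real model returns a dense numeric vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved text ahead of the question so the model answers from it."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical mini-corpus standing in for the text extracted from the PDFs.
documents = [
    "Shipping takes three to five working days.",
    "José Fernández Albertos is appointed Director of the Public Policy Department.",
]
question = "Who was appointed Director of the Public Policy Department?"
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)
```

In a real deployment the prompt built at the end would be sent to the language model, and the word counts would be replaced by vectors from an embedding model.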

OpenAI Account

1. Create an OpenAI Account

2. Access the OpenAI API

Once we have an account and log in, we are offered two options. Choose API.

Check that you have been given $5 of credit for the OpenAI API. That amount is enough to process several documents and chat about them many times.

3. Create an API Key

To use our account from other sites we need a unique personal code, known as an API key.

After we click "Create new secret key", the key is shown only once and cannot be viewed again, so copy it and keep it somewhere safe.

HuggingFace Account

1. Create a HuggingFace Account

Hugging Face is a leading platform for developing and deploying language models. It offers an open library with a wide variety of pre-trained models, tokenization tools and an active community. Its mission is to democratize and ease access to the latest innovations in natural language processing (NLP).

2. Token Generation

To vectorize the texts we will use an embedding model, as mentioned in the previous steps. We will use the Hugging Face platform to obtain a free embedding model, which requires a token, just as with OpenAI.

Deploying in HuggingFace Space

1. Access HuggingFace Spaces

2. Create a new Space

Spaces is a cloud-based collaboration platform that provides an easy way to host and showcase Machine Learning (ML) applications.

It is a simple way to build a portfolio of your ML projects, with both free and paid options.

For our chatbot, the important settings are the Space name and Gradio as the Space SDK.

3. Add Secrets

We have to add our tokens as secrets so that nobody who views the Space can read them.

Settings > Scroll down to "New Secret" > Create two secrets

Hugging Face token: name it HF
OpenAI token: name it OPENAI
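Secrets added to a Space are made available to the running app as environment variables, so the code can read them by name. The sketch below shows one way to do that with the two names chosen above (HF and OPENAI). The helper names load_secret and load_tokens are hypothetical and are not taken from the article's app.py.

```python
import os


def load_secret(name: str) -> str:
    """Return a secret exposed as an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Secret {name!r} is not set; add it in the Space settings.")
    return value


def load_tokens() -> dict[str, str]:
    """Read both tokens under the secret names used in this article."""
    return {"hf": load_secret("HF"), "openai": load_secret("OPENAI")}
```

Failing early with a readable error makes a missing or misspelled secret name easy to spot in the Space logs.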

4. Add code and requirements

Add the following two files via Files > Upload Files:

Document 1: app.py
Document 2: requirements.txt
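For orientation, here is a minimal sketch of the shape an app.py for a Gradio Space can take. It is not the article's actual app.py: the answer function is a hypothetical placeholder that only echoes the question, and the title is invented. The real app would retrieve passages from the PDFs and call the language model inside answer.

```python
def answer(message: str, history: list) -> str:
    """Placeholder reply; a real app would retrieve PDF passages and call the language model."""
    if not message.strip():
        return "Please type a question."
    return f"(placeholder) You asked: {message}"


def build_demo():
    """Create the chat UI. Gradio is imported here so the file loads even where it is not installed."""
    import gradio as gr  # provided by the Gradio Space SDK

    return gr.ChatInterface(fn=answer, title="BOE appointments chatbot")


# On a Space, app.py would end by starting the interface:
# build_demo().launch()
```

Keeping the reply logic in a plain function makes it possible to try it locally, for example with answer("Who was appointed?", []), before uploading the file.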

5. Add the folder with the PDF files

Create a folder with your PDF files and name it MyPDFs
Files > Upload Files
Drag and drop the folder with the PDF files
Make sure that the folder contains only PDF files.
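Before uploading, the "only PDF files" requirement can be checked locally with a short script. This is an optional sketch using only the Python standard library; the function name non_pdf_files is hypothetical.

```python
from pathlib import Path


def non_pdf_files(folder: str) -> list[str]:
    """Return the relative paths of files in the folder (and subfolders) that are not PDFs."""
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Folder not found: {folder}")
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() != ".pdf"
    )
```

Calling non_pdf_files("MyPDFs") returns an empty list when the folder is ready to upload; any names it returns are files to remove first.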

Add to your website

Congratulations, we have finished the project!

Note that if you keep a large number of documents in cloud storage, such as an S3 service, you can still use them by modifying the code accordingly.

This method is best suited to a test like the one shown here, for reasons such as the data being published. For more professional deployments or production environments, do not hesitate to contact me; I will be happy to help.