AI chatbots are only as good as the data they learn from. Most large language models (LLMs) rely only on their training datasets.
If you want a chatbot to know more about your business, the best approach is to implement a retrieval-augmented generation (RAG) pipeline that supplies Gemini with your website data at query time, rather than retraining the model. This workflow helps you do that.
This workflow uses a scheduler to scrape a website on a regular basis using Apify; web pages are then indexed or updated in a Pinecone vector database. This allows the chatbot to provide accurate and up-to-date information. The workflow uses Google's Gemini AI for both embeddings and response generation.
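The "index or update" step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's actual nodes: `InMemoryIndex`, `embed`, and `run_scheduled_crawl` are hypothetical stand-ins for Pinecone, the Gemini embedding model, and the scheduled Apify run, and skipping unchanged pages via a content hash is an assumption rather than something the template is known to do.

```python
import hashlib
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


class InMemoryIndex:
    """Hypothetical stand-in for a vector database, keyed by page URL."""

    def __init__(self) -> None:
        self.records: dict[str, dict] = {}

    def upsert_page(self, url: str, text: str) -> str:
        """Insert a new page, overwrite a changed one, skip an unchanged one."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        existing = self.records.get(url)
        if existing is not None and existing["hash"] == digest:
            return "unchanged"  # assumption: identical content is not re-embedded
        status = "updated" if existing is not None else "indexed"
        self.records[url] = {"hash": digest, "vector": embed(text), "text": text}
        return status


def run_scheduled_crawl(index: InMemoryIndex, pages) -> dict[str, str]:
    """One scheduler tick: `pages` is an iterable of (url, text) from the scraper."""
    return {url: index.upsert_page(url, text) for url, text in pages}


index = InMemoryIndex()
first_run = run_scheduled_crawl(index, [("https://example.com/pricing", "Plans start at 10 USD")])
second_run = run_scheduled_crawl(index, [("https://example.com/pricing", "Plans start at 12 USD")])
third_run = run_scheduled_crawl(index, [("https://example.com/pricing", "Plans start at 12 USD")])
```

Keying records by URL is what lets a repeated crawl overwrite a page's old entry instead of adding a duplicate, so the chatbot only ever sees the latest version of each page.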
This workflow is split into two parts, each highlighted with a green sticky note:
All nodes with an orange sticky note require setup.
1. Google Cloud project and Vertex AI API
2. Apify account
3. Pinecone account
RAG stands for retrieval-augmented generation.
It is a technique that provides an AI model (such as a large language model) with additional data. That allows the LLM to give more up-to-date and topic-specific information.
RAG is a way to complement an LLM by giving it more up-to-date information.
You can think of the LLM as the CPU processing your question, and RAG as the hard drive providing information.
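To make the "hard drive" analogy concrete, here is a minimal sketch of the two RAG steps: retrieve the most relevant stored text, then add it to the prompt before the LLM sees the question. The word-count `embed` function and the prompt wording are assumptions made for illustration; a real pipeline would use a learned embedding model and a vector database instead.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real pipeline would call an embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda doc: cosine(query, embed(doc)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str], top_k: int = 1) -> str:
    """Augment the question with retrieved context before sending it to an LLM."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


documents = [
    "Our support team is available on weekdays from 9 to 5.",
    "Shipping takes three business days.",
]
prompt = build_prompt("When is support available?", documents)
```

The LLM itself is unchanged; only the prompt grows. That is why updating the stored documents is enough to keep answers current.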
No. Website Content Crawler can scrape any website, so you can, in theory, use this template to build a RAG pipeline for someone else. You can even combine data from multiple websites.
In theory, yes. You could replace the Gemini node with another LLM. If you are looking for inspiration for a RAG implementation with Ollama, check out this template.
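One way to see why the model can be swapped is that the answering step only needs something that turns a prompt into text. The sketch below is illustrative only: `gemini_stub` and `ollama_stub` are hypothetical placeholders that return canned strings, not real client calls to either service.

```python
from typing import Callable


def answer(question: str, context: str, generate: Callable[[str], str]) -> str:
    """Build the augmented prompt and hand it to whichever model is plugged in."""
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)


def gemini_stub(prompt: str) -> str:
    """Hypothetical placeholder for a Gemini call; echoes the question line."""
    return "[gemini stub] " + prompt.splitlines()[-1]


def ollama_stub(prompt: str) -> str:
    """Hypothetical placeholder for an Ollama call; echoes the question line."""
    return "[ollama stub] " + prompt.splitlines()[-1]


reply_a = answer("What are your hours", "Open 9 to 5", gemini_stub)
reply_b = answer("What are your hours", "Open 9 to 5", ollama_stub)
```

Because this workflow also uses Gemini for embeddings, swapping the embedding model as well would generally mean re-indexing the pages, since vectors produced by different models are not comparable.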