APAI 2i05 - How to launch an AI App without breaking the bank
Use vectorstores for low, fixed cost, AI solutions
Good News Everybody: 2024 is here, and it’s time you, yes you, launched an AI application without breaking the bank.
Want the TLDR?
Use vectorstores for production retrieval (no LLM calls) and use LLMs (e.g. GPT-4) only to finetune your app (data sources, app context, etc.).
Table of Contents:
Launch an AI application without breaking the bank
The Fine Print
The Big Print
Weekly Lesson: Use LLMs to finetune your data, use vectorstores in production
The Super Fine Print (yes, it comes after the lesson)
Launch an AI application without breaking the bank
First, the small print:
If you intend to generate realtime answers you have no control over - no idea how they will come out, in what format, with what content - and you are fine with the AI putting your product/company in hot water, saying racist or misogynist things, basically hallucinating with authority… do not read.
The big print:
This newsletter is for anybody who wants to sell a genuine product, idea, or piece of knowledge, who has control over the content, and who is never surprised by biased AI training.
For example, if you use Kajabi to sell your online courses, read this. Any Substack newsletter writers? Read this. Any YouTube/Vimeo/Loom/etc. online content? Read this. And so on… any digital content creators… read this.
Imagine you are a Substack writer. You are writing a free newsletter on resume writing aimed at people for whom English is a second language, and you are promoting your online video course, which is a paid product. You want to ensure your message is clear, consistent, and unambiguous.
For every course cohort, you create a small database with the course materials and homework, plus a chatbot that answers questions. The problem is twofold: even though your material barely changes, your chatbot costs you money every time somebody searches for information. And despite your best efforts, the chatbot misses the nuances of questions and is sometimes plain wrong.
The solution:
There are many AI apps/solutions that let you deploy a chatbot. Pick one that lets you load your data into a vectorstore, then save finetuned answers into a new (finetuned) vectorstore.
The production solution should pull data only from the vectorstore. This is a fixed cost that is significantly lower than using ChatGPT in realtime.
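To make this concrete, here is a minimal sketch of the production path: embed your curated Q&A pairs once, then answer each user query with a pure similarity lookup - no LLM call at query time. The bag-of-words "embedding" below is a toy stand-in for a real embedding model; the retrieval logic is the same either way.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding". A real app would use a sentence-embedding
    # model here, but the store/query mechanics do not change.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.items = []  # list of (question embedding, curated answer)

    def add(self, question: str, answer: str):
        self.items.append((embed(question), answer))

    def query(self, question: str):
        # Return (similarity, answer) for the best match; the similarity
        # doubles as a confidence score for the routing described later.
        q = embed(question)
        return max(((cosine(q, e), a) for e, a in self.items),
                   default=(0.0, None))

store = VectorStore()
store.add("How long should my resume be?", "One page for most applicants.")
store.add("Should I list every job?", "Only the last 10-15 years of relevant work.")

score, answer = store.query("what length should a resume be")
```

Once the store is loaded, every query is a fixed-cost lookup: no per-request LLM billing, and the same question always gets the same curated answer.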
Do you still have to use the LLM?
Yes, if you want the answers given to the user to be conversational.
No, if you want the answers to read more like reference material.
Let’s go over the Yes use case.
If yes, there are two options to start:
Start using the AI app right away, using the LLM. Anytime you see a good answer, save it in the new (finetuned) database. When you decide you have enough good answers in the finetuned database, switch the AI application to that dataset only.
Go through your main/baseline set of questions - such as an FAQ. Once you have gone through all those questions and saved the best answers in the new finetuned dataset, launch your AI app using this new finetuned dataset.
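The two options above share the same two-phase shape, sketched below. `call_llm` is a placeholder for whatever LLM API you use; during the curation phase, every answer you approve goes into the finetuned store, and production then serves only from that store.

```python
def call_llm(question: str) -> str:
    # Placeholder for a real LLM call (e.g. the GPT-4 API).
    return f"draft answer for: {question}"

finetuned = {}  # curated question -> answer store

def curate(question: str) -> str:
    """Phase 1: generate with the LLM; you, the human, vet the draft."""
    draft = call_llm(question)
    # ...review the draft here; only save it if it is good:
    finetuned[question] = draft
    return draft

def answer_in_production(question: str) -> str:
    """Phase 2: serve only from the curated store - no live LLM calls."""
    return finetuned.get(question, "Please refine your query")

curate("Do I need a cover letter?")
print(answer_in_production("Do I need a cover letter?"))
```

Option 1 runs both phases side by side from day one; option 2 runs phase 1 over your FAQ before launch. Either way, the switch to production is just a matter of no longer calling `call_llm`.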
Now, first of all, this is a proper AI solution. I have written about this many times: a vectorstore query/retrieval app is a proper AI app. AI is NOT just ChatGPT. AI is complex.
Secondly, from the user-experience point of view, the answers will look the same as when chatting with an LLM, but they will be wicked fast, consistent, high quality, unbiased, and to the point. Win-win-win.
How do you update/maintain the database?
Any app pulling data from a vectorstore should give you a confidence score. Have the app act on that score - i.e., if the confidence score is lower than XX (where XX could be 50%, for example), respond with a canned answer such as “Please refine your query” AND save the question in a regular report that collects all the edge questions that could not be answered.
On a regular basis, make a point of going through the new set of questions and saving the best answers to the finetuned database. This lets you organically finetune your dataset of answers.
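That routing rule is simple to implement. A sketch, assuming the retriever returns a `(confidence, answer)` pair (the threshold value and the demo retriever below are illustrative, not prescriptive):

```python
THRESHOLD = 0.5      # the "XX" from the text; tune it for your dataset
unanswered_log = []  # edge questions collected for the regular review

def respond(question, retrieve):
    """Route on the store's confidence score: canned answer below threshold."""
    confidence, answer = retrieve(question)
    if confidence < THRESHOLD:
        unanswered_log.append(question)  # save for later finetuning
        return "Please refine your query"
    return answer

# A hand-wired retriever for illustration; a real one queries the vectorstore.
def demo_retrieve(question):
    if "resume" in question:
        return 0.9, "One page for most applicants."
    return 0.1, None

print(respond("How long should my resume be?", demo_retrieve))
print(respond("What is the weather today?", demo_retrieve))
print(unanswered_log)
```

The log is the raw material for the review loop: each entry is a real user question your store could not answer, ready to be curated into the finetuned database.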
What is your funny/horror “AI hallucinated on me” story? Share below:
What did we learn?
Weekly lesson 4: use vectorstores for fixed, low cost, fast AI solutions.
Use Large Language Models (LLMs) such as GPT-4 to finetune your data and make your answers more conversational, but save those answers in a vectorstore. Then use the vectorstore’s finetuned data in production to answer queries from your user base.
Pros:
ZERO hallucinations - yes folks, no hallucinations.
Control over your data - no giving your data to OpenAI (or, really, Google, Microsoft, Anthropic…) so they can sell it for $$.
Consistent, deterministic results.
Energy conscious - yes, using ChatGPT is incredibly electricity-intensive, so next time you see someone preaching at a guy driving his diesel truck, check their AI usage footprint. You will be shocked at how expensive their “cute AI poem” actually is.
Cons:
Now that you know about the power consumption, you must live with the guilt anytime you fire up ChatGPT and lazily say, “I am too lazy to think, so pretend you are an expert in UX and tell me how I can avoid hiring an actual professional.”
Super Fine Print
Finetuning works well in combination with classifiers.
Run through your finetuning as described above; then you can apply UCIS classifier filters (User Context/Intent/Sentiment).
Classifiers act as filters - when you search a database, you can apply them to the query.
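A minimal sketch of the filter idea, with a keyword rule standing in for a real classifier model: classify the query first, then restrict the similarity search to store entries tagged with the matching label.

```python
# Each store entry carries a label; here just "intent", but the same
# pattern extends to user context and sentiment (the UCIS idea).
entries = [
    {"answer": "One page for most applicants.", "intent": "format"},
    {"answer": "Use strong action verbs.",      "intent": "wording"},
]

def classify_intent(question: str) -> str:
    # Stand-in for a real intent classifier.
    return "format" if "long" in question else "wording"

def filtered_candidates(question: str):
    """Apply the classifier label as a pre-filter before vector search."""
    intent = classify_intent(question)
    # ...the similarity search then runs over this smaller candidate set.
    return [e for e in entries if e["intent"] == intent]

print(filtered_candidates("How long should a resume be?"))
```

Filtering before the vector search both speeds up retrieval and keeps answers from the wrong category out of the results entirely.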
For the 'No' use case (reference type answers), would you still use a vectorstore or another kind of database?