Want to get better results out of your AI Application?
Yes, that includes regular chatbots, fancier chat-over-data bots, and fancy multi-prompt, multi-agent, multi-datasource, multi-AI applications. I mentioned in issue 2i09 that some of the fundamental AI application stack layers that need finetuning to get better results are:
Better AI Prompts
Finetuned (vectorstore) data
A finetuned or domain-aware AI model
(the list was longer, but for our discussion, this is all that’s needed)
The list is in ascending order of cost: Nr 1 (prompts) is the least expensive to finetune and Nr 3 (the model) is the most expensive.
There are many write-ups on improving your prompts, such as the very actionable “Zero-Shot, One-Shot, Few-Shot” one. Prompts are inexpensive to tune and give you immediate results. This APAI issue is about Nr 2: finetuning your datasource.
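Before we move on, here is a minimal sketch of what zero-shot versus few-shot prompts look like. The sentiment-classification task and the example reviews are invented for this illustration; they are not taken from the write-up mentioned above.

```python
# Minimal illustration of zero-shot vs. few-shot prompting.
# The sentiment-classification task and reviews are invented for this sketch.

zero_shot_prompt = """Classify the sentiment of this review as Positive or Negative.

Review: "The battery died after two days."
Sentiment:"""

few_shot_prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "Great screen, fast shipping."
Sentiment: Positive

Review: "Stopped working after a week."
Sentiment: Negative

Review: "The battery died after two days."
Sentiment:"""

# Both prompts go to the same model; the few-shot version usually yields more
# consistent answers because the examples pin down the task and output format.
```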
Table of Contents:
AI Solutions need regular datasource finetuning
Part 3 of 3 - Startup Feature Launch
Finetune your Datasource
What’s Next?
AI Solutions need regular datasource finetuning
Let’s take the example chatbot that Alejandro explains in his “chat with your pdf” newsletter. In that issue, he uses the latest draft of his upcoming book “The Science of Computation”. Suppose he publishes the chatbot and people start using it. At some point, there will be an updated version of the book, and Alejandro will need to update the datasource, in this case the PDF book.
This is a simplistic example, but it illustrates the fundamental point: data needs to be updated.
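To make that concrete, here is a minimal, self-contained sketch of the update cycle. The in-memory store and naive chunker are toy stand-ins I am assuming for illustration; a real deployment would use a PDF loader and a vector database, and Alejandro’s actual stack may differ.

```python
# Self-contained sketch of the update cycle. InMemoryStore and chunk_text are
# toy stand-ins for a real vector database and PDF loader, assumed for
# illustration only.

class InMemoryStore:
    """Toy stand-in for a vector store that tracks chunks per source."""

    def __init__(self):
        self.chunks = []  # list of (text, metadata) pairs

    def delete_source(self, source):
        """Remove every chunk that came from the given source document."""
        self.chunks = [c for c in self.chunks if c[1]["source"] != source]

    def add(self, text, metadata):
        self.chunks.append((text, metadata))


def chunk_text(text, size=500):
    """Naive fixed-size chunker; real stacks split on document structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def refresh_datasource(store, source, text, version):
    """Drop the stale chunks for this source, then index the new edition."""
    store.delete_source(source)
    for chunk in chunk_text(text):
        store.add(chunk, {"source": source, "version": version})


store = InMemoryStore()
refresh_datasource(store, "science-of-computation.pdf", "first draft text ...", "v1")
refresh_datasource(store, "science-of-computation.pdf", "second draft text ...", "v2")
# Only v2 chunks remain; stale v1 content cannot leak into answers.
```

The key point is that updating means deleting the stale chunks for that source before re-indexing, not just appending the new edition next to the old one.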
What if we wanted to build a chatbot for a large body of data? For example, a chatbot for a municipality, loaded with all of the city’s bylaws. This poses a couple of problems:
Data becomes stale fast
Any time there’s room for interpretation in how to apply regulatory and compliance data, contextual data for the particular issue or domain is needed.
This last point is where finetuning your datasource comes into play.
Finetuning a datasource is needed to handle edge cases, overlapping context, ambiguous situations, and the like.
Finetuning is best done by Subject Matter Experts (SMEs): since it deals with ambiguity and edge cases, you need answers you can trust with a high degree of confidence, and that is where SMEs come in.
How do you finetune your datasource? The third product feature we are launching at the Doha WebSummit addresses exactly this problem.
Part 3 of 3 - Startup Feature Launch
As mentioned in a previous newsletter, I am heading to Doha (Qatar) at the end of February to showcase NeuralDreams at the WebSummit.
We are launching three product features that form the core product offering.
The first, announced two days ago, is a contextualized Ad Delivery Network.
The second, announced yesterday, is a Datasource Marketplace.
The third, announced today, is the ability to Finetune Datasources.
What makes it special:
It addresses ambiguity, edge cases, and specialized domain queries.
Layered datasources: you set a primary datasource, a secondary datasource, and a tertiary reference library (see the example below).
No-code finetuning, deployment, and admin.
Finetune your Datasource
(This section previews an upcoming feature of the NeuralDreams platform. The feature is being launched at the Doha (Qatar) WebSummit, Feb 26-29, 2024.)
The ability to finetune your datasource works hand in hand with the Datasource Marketplace announced yesterday and the Ad Network feature announced two days ago.
Do you have particular domain knowledge? Start with baseline datasources, finetune them, and offer them on the marketplace for others to use or finetune further.
Example for layered datasources:
Suppose you work for a municipality and your organization builds a chatbot for handling alcohol licences for restaurants and other establishments.
You start by loading reference information, such as the municipality’s complete set of bylaws. That becomes your reference datasource.
Then you finetune that datasource to address mainstream licence requests, such as those from restaurants, bars, and other venues. That becomes your secondary datasource.
Then you finetune it further to cover edge cases such as temporary venues, airplanes, ships, etc. That becomes your primary datasource.
The reason for a layered approach is that each layer can be updated independently; otherwise the data becomes stale.
When a user makes a query, your AI chatbot first looks for an answer in the primary datasource; if none is found, it falls back to the secondary datasource, and failing that, it retrieves from the reference datasource.
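Here is a small, self-contained sketch of that fallback logic. The keyword lookup and the bylaw snippets are toys invented for illustration; a real system would presumably use vector-similarity search with a confidence threshold to decide whether a layer has an answer.

```python
# Self-contained sketch of the primary -> secondary -> reference fallback.
# The keyword lookup and bylaw snippets are invented for illustration; a real
# system would use vector-similarity search with a confidence threshold.

def answer_from(datasource, query):
    """Toy lookup: return the entry whose topic appears in the query, if any."""
    for topic, answer in datasource.items():
        if topic in query.lower():
            return answer
    return None


def layered_answer(query, layers):
    """Walk the layers in priority order; the first layer with a hit wins."""
    for layer in layers:
        hit = answer_from(layer, query)
        if hit is not None:
            return hit
    return "No answer found in any datasource."


primary   = {"temporary venue": "Edge-case ruling: 14-day permit, SME-reviewed."}
secondary = {"restaurant": "Mainstream rule: standard licence, 6-week processing."}
reference = {"licence": "Bylaw text: all alcohol sales require a licence."}

layers = [primary, secondary, reference]
print(layered_answer("How do I licence a temporary venue?", layers))   # primary wins
print(layered_answer("What licence does a restaurant need?", layers))  # secondary
```

Because layers are consulted in priority order, the SME-curated primary layer always wins when it has an answer, while the broad reference layer acts as a safety net.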
Do you need to run all three datasources? No.
Can you use only one datasource? Yes.
Can you use two datasources? Yes.
The powerful thing about this: you can share datasources at any level.
How much does it cost? It’s free.
What should you do?
Be bold: be an early adopter, or leave a comment below.
What’s Next?
Getting Started with NeuralDreams Apps:
Sign up for the Datasource Finetuning beta list, and indicate “Datasource Finetuning beta” in your message.
Do not be shy; contact me for details. If you are reading this, either you could use a datasource or you have data to offer. Either way, send me a message.
What is the best finetuning feature you use in your daily work with AI?