APAI 2i06 - How to build AI Products for the Military
Start a Generative AI product for the defence industry
OpenAI has quietly amended their terms of service to allow military applications using their GenAI models.
This means OpenAI will no longer frown upon you building an AI product for the military, but it does NOT mean the military is ready to buy your product.
Read on to find out how to navigate this news.
Table of Contents:
The AI Military Complex is ready for startups
Weekly Lesson 5: Data wants to be Free, but “clean-data” is for money
Domain Aware AI Solutions - what are they and how do you get started?
The AI Military Complex is ready for startups
OpenAI has announced they allow Generative AI products for the military.
First, let’s clarify a couple of things:
The military has been using AI for a long time already. The change in terms really matters for the civilian population and the ecosystem feeding the military-industrial complex.
Historically, governments and the military have used predictive AI rather than generative AI. GenAI is slowly finding more use in NLP and input processing, but for now it is unlikely to play any significant role in actual military deployments.
Now, how do we use GenAI for the defence industry?
There is a massive industry that navigates the mountain of complexity and regulations that govern any contracts around the military.
GenAI is well suited to helping companies navigate these complexities, nuances and multi-layered contracts.
Most US military contracts go to a small number of apex contractors. These companies are the top dogs, and they deal with a small army (pun intended) of subcontractors. These subcontractors in turn have their own networks of vendors and subprocessors.
The International Traffic in Arms Regulations (ITAR) is the United States regulation that controls the manufacture, sale, and distribution of defense and space-related articles and services as defined in the United States Munitions List (USML).
Besides rocket launchers, torpedoes, and other military hardware, the list also restricts the plans, diagrams, photos, and other documentation used to build ITAR-controlled military gear. This is referred to by ITAR as “technical data”.
Handling the complexities of “being compliant” falls to a billion-dollar industry whose sole purpose is to supply the US military with the products and services it contracts for, while following the regulations defined by the government.
How do you start building a GenAI product for the defence industry?
Generative AI has massive problems (I have written about this a few times), so you should not rely on it as the final word on anything where you need to meet regulatory, legal, or compliance requirements.
The mainstream use case for a GenAI product is to help organizations stay compliant.
First, pick a vertical you want to serve.
Start with the ITAR and USML regulatory data set (ingest it into a vectorstore).
Add apex contractor data - or your immediate upstream contractor rules.
Add your own data on top of the baseline vectorstore data.
(Optional) If needed, add downstream data - vendor/subprocessor contractual obligations.
Get the list of contractual obligations your company or customers will need answers about. Finetune the application on these obligations (search the system, then save the answers into the new finetuned vectorstore).
Define classifier filters that are specific to your vertical.
For example, if you are exporting anything and you are using the application to go through a compliance checklist, you must check against the latest list of bad agents. This data must be fresh, or you must tell the user when it was last updated.
Important: Make sure to add a detailed audit trail feature to your application for all searches/responses.
Important: Ensure there’s a mechanism to update baseline vectorstore data and classifiers.
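The layering described in the steps above can be sketched roughly as follows. This is a minimal illustration, not a production design: `LayeredStore`, `Passage`, `freshness_note`, the layer names and the 30-day threshold are all hypothetical, the bag-of-words `embed` function is a stand-in for a real embedding model, and the sample passages are placeholder text rather than actual regulation wording.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
import math
import re


def embed(text: str) -> Counter:
    """Toy embedding (word counts). Assumption: a real system uses an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Passage:
    text: str
    layer: str          # hypothetical layer names: "baseline", "upstream", "own", "downstream"
    last_updated: date  # kept so the user can always be told how fresh the data is


class LayeredStore:
    """In-memory stand-in for a vectorstore holding several data layers."""

    def __init__(self) -> None:
        self.passages: list[Passage] = []

    def add(self, text: str, layer: str, last_updated: date) -> None:
        self.passages.append(Passage(text, layer, last_updated))

    def search(self, query: str, k: int = 3) -> list[tuple[float, Passage]]:
        q = embed(query)
        scored = [(cosine(q, embed(p.text)), p) for p in self.passages]
        scored = [s for s in scored if s[0] > 0]           # drop passages with no overlap
        scored.sort(key=lambda s: s[0], reverse=True)      # best match first
        return scored[:k]


def freshness_note(passage: Passage, today: date, max_age_days: int = 30) -> str:
    """Report when a passage was last updated and whether it exceeds the age limit."""
    age = (today - passage.last_updated).days
    status = "STALE" if age > max_age_days else "fresh"
    return f"[{passage.layer}] last updated {passage.last_updated.isoformat()} ({status})"


store = LayeredStore()
store.add("Technical data includes plans, diagrams, and photos used to build controlled gear.",
          "baseline", date(2024, 1, 10))
store.add("Subcontractors must report any export of technical data to the prime contractor.",
          "upstream", date(2023, 6, 1))
store.add("Internal policy: engineering drawings live on the restricted share.",
          "own", date(2024, 1, 20))

for score, passage in store.search("Can we share diagrams and technical data?"):
    print(round(score, 2), freshness_note(passage, today=date(2024, 2, 1)))
```

Keeping the layer and the last-updated date on every passage is what lets the application tell the user where an answer came from and how old the underlying data is.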
That’s it. Easy, eh?
The audit trail is very important.
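One way to make an audit trail hard to alter quietly is to chain the entries together with hashes, so that editing an old record invalidates everything after it. The sketch below assumes this hash-chain approach, which is my own illustration rather than a requirement from any regulation; `AuditTrail`, its field names and the sample queries are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only log where each entry stores the hash of the previous entry."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _digest(body: dict) -> str:
        # Serialize with sorted keys so the same body always hashes the same way.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def record(self, user: str, query: str, response: str, sources: list[str]) -> dict:
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "query": query,
            "response": response,
            "sources": sources,
            "prev_hash": self.entries[-1]["hash"] if self.entries else "",
        }
        entry = {**body, "hash": self._digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that the chain links are intact."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or self._digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


trail = AuditTrail()
trail.record("analyst-1", "Is this drawing technical data?",
             "Likely yes; see the cited passage.", ["baseline:passage-12"])
trail.record("analyst-1", "Who approves the export?",
             "See the upstream contractor rules.", ["upstream:passage-3"])
print(trail.verify())   # True: the chain is intact

trail.entries[0]["response"] = "No."   # simulate tampering with an old record
print(trail.verify())   # False: the stored hash no longer matches
```

In a real product the log would live in durable storage with access controls; the in-memory list here only demonstrates the chaining idea.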
The classifiers and the finetuning of the app are your differentiating factors.
A product like NeuralDreams’ ITAR dataset will start you ahead of the competition.
What is one compliance or regulatory dataset you would want to have available to get you started?
Weekly Lesson 5: Data wants to be Free, but “clean-data” is for money
The internet is full of gurus telling you to give away your data/knowledge as much as possible to build a brand. There are many reasons to do so - one being to build credibility and to become a source of knowledge.
However, in cases like the one above, with its complicated compliance and regulatory ecosystem, the data alone is not quite useless, but almost.
This is because it requires context, interpretation and nuance, and it must be constantly updated and checked for staleness.
This means that a system that “cleanses” the data so it becomes “information”, and actually **relevant** information, is a system worth money.
What is one data domain for which you wish you had a clean data source, so you could start on your AI product idea?
Domain Aware AI Solutions - what are they and how do you get started?
Domain aware AI solutions are systems that have in-depth knowledge of a specific domain/vertical.
This knowledge is a combination of some or all of the following: a database, a vectorstore, AI models, and a workflow and rule manager.
The least expensive way to start with a domain aware AI solution is to use a publicly available LLM (e.g. GPT-4) and a vectorstore that contains specific domain data - for example the ITAR, USML & EAR regulations.
To some extent, any “chat over data” chatbot is a “domain aware” AI application. However, the data in your dataset needs to be of higher quality than what is already in the LLM's body of knowledge.
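The "public LLM plus domain vectorstore" setup boils down to retrieving the most relevant domain passages and placing them in the prompt. Here is a rough sketch under stated assumptions: word overlap stands in for real vector search, the document ids and texts are invented placeholders, `retrieve` and `build_prompt` are hypothetical helper names, and the actual call to an LLM is deliberately left out.

```python
import re


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    q = tokens(query)
    scored = [(len(q & tokens(text)), doc_id, text) for doc_id, text in documents.items()]
    scored = [s for s in scored if s[0] > 0]        # ignore documents with no overlap
    scored.sort(key=lambda s: (-s[0], s[1]))        # highest overlap first, then by id
    return [(doc_id, text) for _, doc_id, text in scored[:k]]


def build_prompt(query: str, passages: list[tuple[str, str]]) -> str:
    """Assemble a prompt that restricts the model to the retrieved passages."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return (
        "Answer using only the passages below and cite passage ids. "
        "If the passages do not answer the question, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {query}"
    )


# Placeholder documents, not real regulation text.
documents = {
    "itar-note-1": "Technical data includes plans, diagrams, and photos used to build controlled gear.",
    "usml-note-1": "The munitions list covers rocket launchers, torpedoes, and other military hardware.",
    "hr-note-1": "Vacation requests need manager approval.",
}
query = "Are diagrams of torpedoes technical data?"
prompt = build_prompt(query, retrieve(query, documents))
print(prompt)  # this string is what you would send to the LLM of your choice
```

Asking the model to cite passage ids makes each answer traceable back to the dataset, which also gives the audit trail something concrete to record.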
The ultimate domain aware AI solution will contain:
A specialized (finetuned) vectorstore
Classifier AI filters
Specialized domain trained NLP AI model
Domain aware rule and workflow manager
Domain aware AI agents
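To make the classifier filter and rule manager pieces less abstract, here is a toy sketch of how they might fit together. Everything in it is an assumption for illustration: the keyword lists, the labels, the `Rule` structure and the step descriptions are hypothetical, and a real classifier filter would be a trained model rather than keyword matching.

```python
from dataclasses import dataclass
from typing import Callable


def classify(query: str) -> set[str]:
    """Toy keyword classifier that tags a query with domain labels."""
    keywords = {
        "export": {"export", "ship", "shipment", "foreign"},
        "technical_data": {"drawing", "diagram", "plans", "blueprint"},
    }
    words = set(query.lower().replace("?", " ").replace(",", " ").split())
    return {label for label, terms in keywords.items() if words & terms}


@dataclass
class Rule:
    name: str
    applies: Callable[[set[str]], bool]   # decides from the labels whether the rule fires
    required_step: str                    # workflow step to add when it does


RULES = [
    Rule("screening", lambda labels: "export" in labels,
         "check the latest list of bad agents and report its last update date"),
    Rule("tech-data", lambda labels: "technical_data" in labels,
         "confirm whether the item counts as controlled technical data"),
]


def plan_workflow(query: str) -> list[str]:
    """Return the workflow steps that the rules require for this query."""
    labels = classify(query)
    return [rule.required_step for rule in RULES if rule.applies(labels)]


for step in plan_workflow("Can we ship this drawing to a foreign vendor?"):
    print("-", step)
```

Separating classification from rules means the rules can be updated when regulations change without retraining or replacing the classifier.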
Get started as discussed above; that will give you the core offering of your product. If you are already in a more complex phase of your AI journey and find yourself needing domain aware models and/or agents, leave a comment or contact me directly.
What is one AI digital product you think would fit well in the ecosystem for the defence industry?


