Our Company Has Helped More than 400 Customers Explore GenAI Adoption

From barbershop scheduling to cancer research, here's what we've learned.

The generative AI waves come fast and they come frequently. The tech industry has endured what seems like countless hype cycles and contractions since ChatGPT debuted in November of 2022, some substantial and some subtle. It’s an exciting time—and a potentially confusing one. 

I launched my software engineering consultancy, Loka, 20 years ago with an eye toward navigating exactly these sorts of Silicon Valley trends, and my teams have been knee-deep in generative AI (GenAI) since late ’22. Our decade-plus of building practical machine learning (ML) and artificial intelligence applications for businesses of all kinds gave us a head start and a firmer foundation than similar firms. It also gives me a unique perspective for advising our clients as they consider how GenAI can amplify their own offerings. The current GenAI wave finds us at the forefront of establishing diverse, real-world use cases for the technology.

We’ve met with leaders from more than 400 companies in the last 12 months, exploring the prospects of GenAI adoption. From those conversations we’re now engaging in over 100 active projects, from barbershop scheduling software to revolutionary drug discovery that might cure cancer one day. As part of AWS’ first class of GenAI Competency Partners, we’re connecting a lot of those projects to powerful AWS tools. We’re also advising some of them away from GenAI if their data or their business model isn’t properly prepared to harness it. Our goal is sober assessment and mutually beneficial partnership. Loka is in this for the long-term benefit of our customers.

We’ve come away from these conversations and engagements with crucial insights on the latest best practices in GenAI. I have six in mind I want to share. For the sake of brevity, I’ll start with three aimed at anyone just getting started with GenAI. 

GenAI Chatbots Rule

Retrieval-augmented generation, aka RAG, is the most common use case we’ve seen. It usually shows up as the ubiquitous chatbot offering to ease your engagement on almost every contemporary website. It’s low-hanging fruit that leverages even unstructured written information, and it can provide value in a few different ways.

RAG systems improve the accuracy and depth of generative outputs by dynamically fetching relevant information from a vast collection of documents before generating responses. This process ensures that the generated content is not only relevant but also informed by the most up-to-date and comprehensive data available. It’s becoming more and more commoditized, as almost every large language model (LLM) provider or framework offers easily implementable RAG options. And now Amazon offers a savvy out-of-the-box solution called Amazon Q.

RAG has proven again and again to save a wide variety of businesses time and money. In tandem with small LLMs, RAG systems can replicate or exceed the performance of a much larger LLM. And they address one of the most serious limitations of LLMs in general: They’re great reasoning engines but poor knowledge bases. That is, they’re capable of writing plans to solve problems, thinking step by step, summarizing content, answering questions about provided information, extracting insights, using tools and so on, but they can’t be trusted to know all the facts the way a search engine or encyclopedia does. Engineering approaches like RAG provide useful information for the LLM to use as context.
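To make the pattern concrete, here’s a minimal sketch of the retrieve-then-generate loop. The `embed` and `generate` functions are hypothetical stand-ins for whatever embedding model and LLM you choose (e.g. via Amazon Bedrock), and a production system would use a vector database instead of this brute-force scan.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=3):
    """Return the k documents most similar to the query by cosine similarity."""
    q = np.asarray(query_vec)
    scores = [
        q @ np.asarray(v) / (np.linalg.norm(q) * np.linalg.norm(v))
        for v in doc_vecs
    ]
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

def answer(question, docs, embed, generate):
    """Ground the LLM in retrieved context before it generates a response."""
    doc_vecs = [embed(d) for d in docs]                  # index the corpus
    context = retrieve(embed(question), doc_vecs, docs)  # fetch relevant passages
    prompt = (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n---\n".join(context)
        + f"\n\nQuestion: {question}\nAnswer:"
    )
    return generate(prompt)
```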

Don't Forget the “AI” in GenAI

There’s no doubt that GenAI is sexy. It captures the imagination. That’s part of the reason we are where we are with it—everyone wants to tap this seemingly boundless source of novelty. But amid the enthusiasm, the truth is not all AI challenges are best met with the newest tools in the toolbox. Or as I like to say, don’t forget the AI in GenAI.

Traditional ML models, often overshadowed by the glitz of GenAI, still hold significant advantages in many scenarios. These models are not only the backbone of many AI systems, but in some cases they outperform their generative counterparts. Knowing when and why to choose traditional ML and deep learning over GenAI is the best way to deploy effective, efficient and sustainable AI solutions.

Non-generative ML models tend to be ideal for rapid data processing and straightforward tasks like structured data classification (email spam filtering, customer segmentation, fraud detection), predictive analytics, trend forecasting and anomaly or outlier detection. Traditional ML models are also cheaper and easier to deploy and scale in production environments: they’re usually faster at inference, and their lower computational requirements make them a more cost-effective solution.

Traditional machine learning models excel in specific scenarios due to their efficiency and specialization, and GenAI can play a complementary role, especially in data preparation phases. For instance, in situations requiring large, labeled datasets that are time-consuming to compile, GenAI models may offer a swift and cost-effective solution for bulk labeling.

That said, a better example of complementarity might be GenAI that uses the outputs of an ML model (e.g. a forecasting model) to present results in an interesting, dynamic way (e.g. through NLP or graphics). The same goes for using LLMs to convert unstructured data into structured training data.
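As a rough illustration of that labeling idea, here’s a hypothetical sketch that uses an LLM to turn unlabeled emails into a structured training set. The `generate` function stands in for any LLM completion call, and the label set is invented for the example.

```python
LABELS = ["opportunity", "spam"]  # hypothetical label set for the example

def llm_label(text, generate):
    """Ask the LLM for a single label; reject anything off-menu."""
    prompt = (
        f"Classify the email below as one of {LABELS}. "
        "Reply with the label only.\n\nEmail:\n" + text
    )
    label = generate(prompt).strip().lower()
    return label if label in LABELS else None

def build_training_set(emails, generate):
    """Bulk-label emails, dropping any the LLM couldn't classify cleanly."""
    labeled = [(email, llm_label(email, generate)) for email in emails]
    return [(email, label) for email, label in labeled if label is not None]
```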

Our team recently conducted a GenAI Workshop—Loka’s flagship GenAI discovery program—with a client that connects contractors to government projects. Our initial scope was to apply GenAI to classify emails as genuine business opportunities or spam. After achieving some baseline results, we experimented with traditional ML classification algorithms, such as K-Nearest Neighbors and a Support Vector Classifier, on top of embeddings (vector representations of emails) for the same task. The performance, inference time and cost demonstrated that these conventional ML algorithms were a more favorable approach than GenAI for this task.
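For reference, the conventional approach looked roughly like the sketch below. It assumes a hypothetical `embed` function that returns a fixed-length vector per email (any sentence-embedding model would do) and uses scikit-learn’s Support Vector Classifier.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

def train_spam_classifier(emails, labels, embed):
    """Train an SVC on email embeddings and report held-out performance."""
    X = [embed(email) for email in emails]  # vector representations of emails
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.2, random_state=42
    )
    clf = SVC(kernel="rbf")
    clf.fit(X_train, y_train)
    print(classification_report(y_test, clf.predict(X_test)))
    return clf
```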

Taking GenAI into Production Is a Big Step

Many of our clients want to get their proofs of concept (POCs) into customers’ hands quickly. But there’s a massive difference between a POC and a productionized service.

A POC can be useful for selecting a model or provider and even for validating the feasibility of your use case. You’ll need to consider the distinctions between models and model providers, and selecting the right combination might be harder than you expect: not all models are available from all providers, and not all providers fit your use case and budget. Realistic goals and expectations can be the difference between going to production and wasting money on a POC. Many lessons from traditional ML and deep learning carry over to GenAI.

Here are some essential points to keep in mind when going into production:

  • You don’t always need the best model available. A good-enough result vs a perfect one can be the difference between a working solution and a prohibitively expensive one.
  • Evaluation. How well your system performs can be difficult to measure and even hard to define. Before going to production, we strongly recommend setting up at least a minimal automated evaluation framework (see the sketch after this list).
  • Different areas and different products require different levels of monitoring and traceability, but both are essential in some measure. Your system may not require extensive traceability, but it’ll still need at least basic monitoring, which you’ll appreciate when you have to fix bugs and continue developing.
  • Remember that anything you expose to the wider web requires guardrails—and some solutions require stricter guardrails than others. The level of verification that user prompts and responses require varies with the use case, but in general you always need some level of guardrails around your solution to stop toxic content, avoid exploitation or simply save money.
  • The last stage of going to production generally entails some level of optimization. Inference is expensive, so a well-considered architecture, caching, properly sized instances and quantization can significantly lower solution cost.
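To make the evaluation point concrete, here’s a minimal sketch of an automated evaluation loop. Everything in it is hypothetical: the golden-set cases, the `run_pipeline` callable and the keyword-match scorer are placeholders, and real evaluations usually grow into semantic-similarity or LLM-graded scoring.

```python
# A minimal automated evaluation harness: a tiny "golden set" of questions
# with expected key phrases, run against the system on every change.
GOLDEN_SET = [
    {"question": "What is our refund window?", "must_contain": "30 days"},
    {"question": "Which plan includes SSO?", "must_contain": "Enterprise"},
]

def evaluate(run_pipeline):
    """Score the pipeline against the golden set; returns the pass rate."""
    passed = 0
    for case in GOLDEN_SET:
        answer = run_pipeline(case["question"])
        # Crude keyword check; swap in a stronger scorer as the system matures.
        if case["must_contain"].lower() in answer.lower():
            passed += 1
    score = passed / len(GOLDEN_SET)
    print(f"{passed}/{len(GOLDEN_SET)} cases passed ({score:.0%})")
    return score
```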

These are the fundamentals to consider as you move forward with GenAI adoption or expansion. They’re intended as an introduction to the world Loka has occupied for the last year, plucked from the countless lessons and solutions we’ve generated along the way. Stay tuned for my follow-up: another three insights that take us deeper into the technical questions around GenAI adoption, all based on real-world use cases we’ve delivered.

Special thanks to Diogo Oliveira and André Ferreira for their technical insights.
