VENTURE BEAT
In the last six months, AI, specifically generative AI, has been thrust into the mainstream by OpenAI’s launch of ChatGPT and DALL-E to the general public. For the first time, anyone with an internet connection can interact with an AI that feels smart and useful — not just a cool prototype that’s interesting.
With this elevation of AI from sci-fi toy to real-life tool has come a mixture of widely publicized concerns (do we need to pause AI experiments?) and excitement (four-day work week!). Behind closed doors, software companies are scrambling to get AI into their products, and engineering leaders already feel the pressure of higher expectations from the boardroom and customers.
As an engineering leader, you’ll need to prepare for the increasing demands placed on your team and make the most of the new technological advancements to outrun your competition. Following the strategies outlined below will set you and your team up for success.
Channel ideas into realistic projects
Generative AI is nearing the Peak of Inflated Expectations in Gartner’s Hype Cycle. Ideas are starting to flow. Your peers and the board will come to you with new projects they see as opportunities to ride the AI wave.
Whenever people think big about what’s possible and how technology can enable them, it’s a great thing for engineering! But here comes the hard part. Many ideas coming across your desk will be accompanied by a how, which may not be anchored in reality.
There may be an assumption that you can just plug a model from OpenAI into your application and, presto, high-quality automation. However, if you peel back the how and extract the what of the idea, you might discover realistic projects with strong stakeholder support. Skeptics who previously doubted automation was attainable for some tasks may now be willing to consider new possibilities, regardless of the underlying tool you choose to use.
Opportunities and challenges of generative AI
The newfangled AI capturing the headlines is really good at quickly generating text, code and images. For some applications, the potential time savings for humans are huge. Yet, it also has some serious weaknesses compared to existing technologies. Considering ChatGPT as an example:
- ChatGPT has no concept of “confidence level.” It doesn’t provide a way to differentiate between when there is a lot of evidence backing up its statements versus when it’s making a best guess from word associations. If that best guess is factually wrong, it still sounds surprisingly realistic, making ChatGPT’s mistakes even more dangerous.
- ChatGPT doesn’t have access to “live” information. It can’t even tell you anything about the past several months.
- ChatGPT is ignorant of domain-specific terminology and concepts that aren’t publicly available for it to scrape from the web. It might associate your internal company project names and acronyms with unrelated concepts from obscure corners of the internet.
But technology has answers:
- Bayesian machine learning (ML) models (and plenty of classical statistics tools) include confidence bounds for reasoning about the likelihood of errors.
- Modern streaming architectures allow data to be processed with very low latency, whether for updating information retrieval systems or machine learning models.
- GPT models (and other pre-trained models from sources like HuggingFace) can be “fine-tuned” with domain-specific examples. This can dramatically improve results, but it also takes time and effort to curate a meaningful dataset for tuning.
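To make the first point about confidence bounds concrete, here is a minimal sketch of a Bayesian estimate that reports how sure it is. It assumes a simple success/failure setting (for example, how often an automated suggestion was accepted) with a uniform prior; the function name, the counts and the sampling approach are illustrative choices, not a recommendation of any specific library.

```python
import random


def beta_credible_interval(successes, failures, level=0.95, draws=20000, seed=0):
    """Approximate a credible interval for a success rate.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + successes, 1 + failures). The interval is estimated by
    sampling the posterior instead of using a closed-form quantile.
    """
    rng = random.Random(seed)
    samples = sorted(
        rng.betavariate(1 + successes, 1 + failures) for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Same observed rate (80%), very different amounts of evidence (made-up counts).
small = beta_credible_interval(successes=8, failures=2)
large = beta_credible_interval(successes=800, failures=200)

print("10 observations:   %.2f to %.2f" % small)
print("1000 observations: %.2f to %.2f" % large)
```

Both cases observe the same 80% rate, but the interval from 10 observations is much wider than the one from 1,000. That width is the signal a downstream system or a human reviewer can use to decide how much to trust the estimate.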
As an engineering leader, you know your business and how to extract requirements from your stakeholders. What you need next, if you don’t already have it, is confidence in evaluating which tool is a good fit for those requirements. ML tools, which range from simple regression models to the large language models (LLMs) behind the latest “AI” buzz, now need to be options in your toolbox that you feel confident evaluating.
Evaluating potential machine learning projects
Not every engineering organization needs a team dedicated to ML or data science. But before long, every engineering organization will need someone who can cut through the buzz and articulate what ML can and cannot do for their business. That judgment comes from experience working on successful and failed data projects. If you can’t name this person on your team, I suggest you find them!
In the interim, as you talk to stakeholders and set expectations for their dream projects, go through this checklist:
Has a simpler approach, like a rules-based algorithm, already been tried for this problem? What specifically did that simpler approach not achieve that ML might?
It’s tempting to think that a “smart” algorithm will solve a problem better and with less effort than a dozen “if” statements hand-crafted from interviewing a domain expert. That’s almost certainly not the case when considering the overhead of maintaining a learned model in production. When a rules-based approach is intractable or prohibitively expensive, it is time to seriously consider ML.
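As a rough illustration of what "trying the simpler approach first" can look like, the sketch below routes support tickets with a few hand-written rules and measures them against labeled examples. The categories, keywords and tickets are all hypothetical; the point is that the misses tell you specifically what the rules did not achieve.

```python
def route_ticket(text):
    """Hand-written rules, e.g. distilled from interviewing a domain expert."""
    lowered = text.lower()
    if "refund" in lowered or "charge" in lowered:
        return "billing"
    if "password" in lowered or "login" in lowered:
        return "account"
    if "crash" in lowered or "error" in lowered:
        return "bug"
    return "general"


# Hypothetical labeled examples supplied by a stakeholder.
labeled = [
    ("I was charged twice this month", "billing"),
    ("Cannot login after resetting my password", "account"),
    ("The app shows an error when I export", "bug"),
    ("Do you offer student discounts?", "general"),
    ("Please refund my last invoice", "billing"),
    ("The dashboard is confusing", "general"),
    ("My card was billed but the order never arrived", "billing"),
]

# Collect the tickets the rules get wrong, then compute the baseline accuracy.
misses = [(text, label) for text, label in labeled if route_ticket(text) != label]
baseline = (len(labeled) - len(misses)) / len(labeled)

print("Baseline accuracy: %.0f%%" % (baseline * 100))
for text, label in misses:
    print("Missed (%s): %s" % (label, text))
```

Here the rules miss a billing ticket that never uses the words "refund" or "charge". If misses like that can be fixed with one more rule, ML is probably not needed yet. If the list of misses keeps growing faster than the rules, that is evidence for considering a learned model.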
Can a human provide several specific examples of what a successful ML algorithm would output?
If a stakeholder hopes to find some nebulous “insights” or “anomalies” in a data set but can’t give specific examples, that’s a red flag. Any data scientist can discover statistical outliers, but don’t expect them to be useful.
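The gap between "statistical outlier" and "useful finding" can be sketched in a few lines. The example below flags days whose deploy counts sit more than two standard deviations from the mean, then compares them with the one day a stakeholder actually cares about. The data, the threshold and the stakeholder's example are all invented for illustration.

```python
import statistics


def zscore_outlier_days(values, threshold=2.0):
    """Return indexes of values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Hypothetical deploys per day. Day 7 was a planned release; day 11 was an outage.
daily_deploys = [4, 5, 3, 6, 5, 4, 5, 21, 4, 6, 5, 0]
flagged = zscore_outlier_days(daily_deploys)

# The specific example a stakeholder would call a real anomaly (assumed).
stakeholder_days = {11}
overlap = [day for day in flagged if day in stakeholder_days]

print("Flagged days:", flagged)
print("Flagged days the stakeholder cares about:", overlap)
```

The detector flags the planned release and misses the outage. Without the stakeholder's concrete example, there would be no way to tell that the output is unhelpful.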
Is high-quality data readily available?
Garbage-in, garbage-out, as they say. Data hygiene and data architecture projects might be prerequisites to an ML project.
Is there an analogous problem with a documented ML solution?
If not, it doesn’t mean ML can’t help, but you should be prepared for a longer research cycle, a need for deeper ML expertise on the team and the possibility that the project ultimately fails.
Has ‘good enough’ been precisely defined?
For most use cases, an ML model can never be 100% accurate. Without clear guidance to the contrary, an engineering team can easily waste time inching closer to the elusive 100%, with each percentage point of improvement being more time-consuming than the last.
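One way to pin down "good enough" is to write the acceptance criteria as code before any modeling starts. The sketch below assumes a yes/no prediction task judged on precision and recall; the thresholds and the sample predictions are placeholders that a stakeholder would need to supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriteria:
    """Thresholds agreed with stakeholders up front (values are placeholders)."""
    min_precision: float
    min_recall: float


def precision_recall(predicted, actual):
    """Compute precision and recall for paired boolean predictions and labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def is_good_enough(predicted, actual, criteria):
    """True once both agreed thresholds are met, signalling it is time to stop tuning."""
    precision, recall = precision_recall(predicted, actual)
    return precision >= criteria.min_precision and recall >= criteria.min_recall


criteria = AcceptanceCriteria(min_precision=0.75, min_recall=0.60)
actual = [True, True, True, False, False, True, False, True]
predicted = [True, True, False, False, True, True, False, True]

print(precision_recall(predicted, actual))
print("Ship it" if is_good_enough(predicted, actual, criteria) else "Keep iterating")
```

With the bar written down, the team has an explicit stopping point instead of an open-ended chase toward 100%.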
In conclusion
Start evaluating any proposal to introduce a new ML model into production with a healthy dose of skepticism, just like you would a proposal to add a new data store to your production stack. Effective gatekeeping will ensure ML becomes a useful tool in your team’s repertoire, not something stakeholders perceive as a boondoggle.
The Hype Cycle’s dreaded Trough of Disillusionment is inevitable. Its depth, though, is controlled by the expectations you set and the value you deliver. Channel new ideas from around your company into realistic projects — with or without AI — and upskill your team so you can quickly recognize and capitalize on the new opportunities advances in ML are creating.
Stephen Kappel is head of data at Code Climate.