Why Code Testing Startup Nova AI Uses Open Source LLMs More Often Than OpenAI

It is a universal truth of human nature that the developers who write the code should not be the ones who test it. First, most of them despise the task. Second, as with any good auditing process, those doing the work should not be the ones reviewing it.

Not surprisingly, code testing in all its forms – usability testing, language- or task-specific testing, end-to-end testing – is the focus of a growing number of generative AI startups. Every week TechCrunch reports on another, like Antithesis ($47 million raised), CodiumAI ($11 million raised) and QA Wolf ($20 million raised). And new ones are popping up all the time, like recent Y Combinator graduate Momentic.

Another is year-old startup Nova AI, an Unusual Academy accelerator graduate that raised a $1 million pre-seed round. It’s trying to outdo its competitors with its end-to-end testing tools by breaking many of Silicon Valley’s rules about how startups should operate, founder and CEO Zach Smith tells TechCrunch.

While Y Combinator’s standard advice is to start small, Nova AI targets medium to large companies with complex codebases and an urgent need. Smith declined to name customers who are using or testing his product, describing them only as mostly late-stage (Series C or beyond) venture-backed startups in e-commerce, fintech or consumer products, with “intensive user experiences. Downtime for these functions is costly.”

Nova AI’s technology scans its customers’ code to automatically create tests using generative AI. It is particularly aimed at continuous integration and continuous delivery/deployment (CI/CD) environments, where engineers are constantly shipping bits and pieces of code into production.
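Nova AI hasn’t published how its pipeline is wired, but the pattern maps onto a simple loop: when a change lands, hand the modified source to a code-capable model hosted inside the customer’s own infrastructure and ask it to draft tests. Below is a minimal sketch in Python, assuming a locally hosted open source model behind an OpenAI-compatible endpoint; the URL, model name, prompt, and file name are all illustrative, not Nova AI’s actual setup.

```python
import requests

# Hypothetical local inference server (e.g., vLLM or llama.cpp exposing an
# OpenAI-compatible API); customer code never leaves the customer's machines.
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "starcoder2-15b-instruct"  # illustrative open source code model

def generate_tests(filename: str, source_code: str) -> str:
    """Ask the locally hosted model to draft unit tests for one changed file."""
    prompt = (
        f"Write pytest unit tests for the following file ({filename}). "
        f"Return only runnable Python code.\n\n{source_code}"
    )
    resp = requests.post(
        ENDPOINT,
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # In a CI/CD hook this would run over the files touched by the commit.
    with open("checkout.py") as f:
        print(generate_tests("checkout.py", f.read()))
```

In a real CI/CD integration, a step like this would run on each pull request and commit the generated tests back for review.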

The idea for Nova AI came from the experiences Smith and his co-founder Jeffrey Shih had as engineers at large technology companies. Smith is a former Google employee who worked on cloud-related teams that helped customers use a lot of automation technology. Shih previously worked at Meta (and before that at Unity and Microsoft) with a rare AI specialty involving synthetic data. The pair has since added a third co-founder, AI data scientist Henry Li.

There’s another rule Nova AI doesn’t follow: while tons of AI startups are building on top of OpenAI’s industry-leading GPT, Nova AI uses OpenAI’s GPT-4 as little as possible, only to generate some code and handle some labeling tasks. No customer data is passed to OpenAI.

While OpenAI promises that the data of those on a paid business plan will not be used to train its models, companies still don’t trust it. “When we talk to large companies, they say, ‘We don’t want our data going into OpenAI,’” Smith said.

It’s not just the engineering teams of large companies who feel this way. OpenAI is fending off a number of lawsuits from those who do not want their work used to train its models, or who believe their work ends up in its output without authorization or payment.

Nova AI instead relies heavily on open source models like Llama, developed by Meta, and StarCoder (from the BigCode community, developed by ServiceNow and Hugging Face), as well as building its own models. The team isn’t using Google’s Gemma with clients yet, but has tested it and “seen good results,” Smith says.

For example, he explains that one common use of an LLM like GPT-4 is to create “vector embeddings” of data so that LLMs can use the vectors for semantic search. Vector embeddings translate chunks of text into numbers, allowing the LLM to perform operations such as clustering them with other, similar chunks of text. Nova AI creates vector embeddings of its customers’ source code, but is careful not to send any of that code to OpenAI.

“In this case, instead of using OpenAI’s embedding models, we leverage our own open source embedding models so that when we need to go through each file, we don’t just send it to OpenAI,” Smith explained.
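Smith doesn’t name the embedding models Nova AI uses, but the workflow he describes can be reproduced with off-the-shelf open source tooling. Here is a minimal sketch using the sentence-transformers library, with an illustrative (not Nova AI’s) embedding model, that embeds each source file locally and runs a semantic search over the results; the directory and query are assumptions.

```python
from pathlib import Path

from sentence_transformers import SentenceTransformer, util

# Illustrative open source embedding model, downloaded once and run entirely
# on local hardware; Nova AI has not disclosed which models it actually uses.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Embed every source file in the repository locally: no API calls, no data
# sent to a third party.
files = sorted(Path("src").rglob("*.py"))
embeddings = model.encode([f.read_text() for f in files])

# Semantic search: rank files by cosine similarity to a natural-language query.
query = model.encode("checkout flow payment validation")
scores = util.cos_sim(query, embeddings)[0]
ranked = sorted(zip(files, scores), key=lambda pair: -float(pair[1]))
for path, score in ranked[:5]:
    print(f"{float(score):.3f}  {path}")
```

The design choice Smith is pointing at is simply where the model runs: the embedding math is the same either way, but a self-hosted model means the source files never cross the customer’s network boundary.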

Smith has found that not having to submit customer data to OpenAI appeases nervous companies, and that open source AI models are also cheaper and more than adequate for narrow, specific tasks. In Nova AI’s case, that task is writing tests.

“The open LLM industry is really proving that if you go very narrow, it can beat GPT-4 and these big domain providers,” he said. “We don’t need to provide a huge model that can tell you what your grandma wants for her birthday. Right? We have to write a test. And that’s it. That’s why our models are specially tailored to this.”

Open source models are also advancing rapidly. For example, Meta recently unveiled a new version of Llama that is gaining widespread recognition in tech circles and could convince more AI startups to look for OpenAI alternatives.
