The question-and-answer website Stack Overflow was founded 17 years ago to let programmers (human programmers) post questions about programming problems and get answers from a community of like-minded people.
Since its founding, the world has become enamored with how ChatGPT and other generative AI programs can not only offer answers but even perform the work itself, handing you your own custom code.
How should a community of people sharing information respond to the sudden appeal of AI automation?
Stack Overflow’s CEO, Prashanth Chandrasekar, has been running the company for almost six years, and he has a plan. I sat down with him this month to learn how that plan is coming together.
“Things have changed; we want to change with the times,” Chandrasekar told me. “We wanted to go with the direction of the flow of the river.”
The rise of ChatGPT
He didn’t really have a choice. Starting in 2022, the rise of ChatGPT had an immediate impact on Stack Overflow’s public website traffic, which delivers the advertising that accounts for a large portion of its revenue.
A primary way in which people came to Stack Overflow was always a Google search. Paid search on Google was the “user interface to Stack Overflow,” as Chandrasekar puts it.
Being able to ask a chatbot instead of searching quickly cut into those Google queries, and traffic began to decline.
Chandrasekar arrived at a broad philosophical conclusion about not just AI and automation but also the internet.
“Our view is that the nature of the internet has changed,” he said. It is no longer mostly about paid search from human queries driving website traffic. “The user interface has changed to be gen AI tools,” he observed.
“And, so, we’re having to sort of be as aware of that industry change as possible. We need to also diversify” as a property.
Chandrasekar and his team realized there’s a lot of value in Stack Overflow’s 60 million questions and answers for addressing the shortcomings of generative AI. One option was to sue OpenAI and other makers of AI foundation models. The “pre-trained” large language models include Stack Overflow conversations, which are legally the property of Stack Overflow.
“We said, OK, we could do that, or we could do something a little bit more, let’s say, novel, relative to what everybody else was doing.”
Instead of filing suit, Stack Overflow put in code to block scraping of the public website, “and then we said, ‘If you’re building a chatbot or whatever, you have to work with us. Let’s have a fair business agreement.’”
Knowledge as a service
On top of advertising and the paid enterprise version of Stack Overflow, called Teams, Chandrasekar is building out a third revenue stream called knowledge solutions, or knowledge-as-a-service.
Chandrasekar has been signing deals to license Stack Overflow’s content, first with Google and OpenAI.
In the past 18 months, he’s also been busy signing “all the cloud hyperscalers, I can’t name them, but you know them,” he said. “We’re in the process of doing many more.”
These deals include integration with Microsoft’s Visual Studio Code tool, GitHub Copilot, and Google’s Gemini Code Assist, to get answers right inside the programming environment.
“All these companies are taking, are leveraging an official licensed version of our dataset to train their models for pre-training to do things like RAG and indexing in some cases.”
Answers get surfaced on ChatGPT or other LLMs, with a credit and a link back to Stack Overflow. “The goal is to actually recapture the traffic that people gave to our website directly,” he said. “We are becoming more of a headless site: if people are spending all their time in gen AI bots, that’s also fine.”
Of course, there’s the risk of not having direct relationships with the end user if they are not actually on the Stack Overflow site. Chandrasekar said the company has various agreements to get pertinent information from OpenAI and others about things such as the prompt that the chatbot user is using.
“There are a lot of subtleties in the engagement between the partner and us,” he told me. “We’re working through each scenario,” he said, such as how much of an LLM’s “context window” (the recent memory of chats) is shared with Stack Overflow.
AI’s shortcomings
Chandrasekar said that Stack Overflow is fixing three major shortcomings of the technology for OpenAI and the other giants.
One shortcoming is what he calls the trust issue. “You don’t trust what’s coming out of it,” he said, alluding to the notorious LLM hallucinations and confabulations.
The second shortcoming is LLM brain drain. “If you don’t generate new information, these LLMs are not going to progress in their intelligence level,” as is clear from the controversies over so-called synthetic data that can pollute LLMs.
Last, and perhaps most important, “the answers coming out of the gen AI are actually not information,” said Chandrasekar. “There may be an answer, but they may be tapped out on complexity because that’s too complicated of a set of circumstances,” and, therefore, they really need to have the rich context of Stack Overflow, he said.
Teams integration
At the same time that he has inked deals with the giants for the public Stack Overflow, Chandrasekar has begun a second part of the knowledge solutions business. Stack Overflow is integrating its public content into the Teams product for businesses that want to expand their internal information sources for the purpose of agentic AI.
The Teams product was launched because companies said they wanted their own version of Stack Overflow as a repository not for general programming information but for their particular corporate processes. Now, said Chandrasekar, the same companies want to expand that information pool because they want to develop AI agents that do much more than programming.
“We have companies like Uber who have actually done this with us already, where they’ve built an AI chatbot that’s called an assistant, or, in a very generous sense, agent because it’s all about performing the action. But that agent is leveraging the Teams information for something like, ‘How do we actually do this’ inside Uber?” The bot serves up an automatic answer inside a corporate chat. “The bot has all the information from the group, so it’s surfacing the right information at the right time.”
In other words, agentic AI automates what employees traditionally do with Teams. “We noticed this because our APIs are red-hot; our APIs are being hit constantly by the bot.”
Given that trend, Chandrasekar is adding a new element to Teams, licensing the entire Stack Overflow public website data to the same Teams customers.
“We said, ‘Why don’t we take our knowledge solutions product, our public platform data, and also present that to companies alongside Teams data?’ An agent then can have the information from 60 million questions and answers, and also all the information specific to the company, and then it’s even more armed with the right answer at the right time.”
The Teams product, like the large licensing deals, is integrated into various products, such as Atlassian’s Jira IT ticketing system.
Stack Overflow is negotiating licensing terms with enterprise Teams users. Chandrasekar declined to discuss pricing details other than to say, “It’s a value-based pricing model.”
“It’s early stages,” he said, in determining what the market will allow for such content licensing.
Website improvements
At the same time that knowledge solutions are being developed, the public Stack Overflow site is receiving some fairly significant improvements.
The company is still “working our way back” to the level of traffic prior to ChatGPT, said Chandrasekar, without disclosing traffic numbers, adding that the site “has not yet fully” made it back to the pre-ChatGPT level.
Chandrasekar is building out the functionality of the public site to make it more real-time.
The traditional mode of Stack Overflow is one person posting a question and then others posting their suggested answers.
Two other forms of exchange are buried deep within the site: chats and discussions. Chandrasekar describes these as “swim lane” modes of interaction that don’t offer the perfection of the main Stack Overflow answers but can get a response to someone much more quickly.
Chandrasekar’s philosophy is that “we want to provide multiple form factors and problem types for technologists of different types.”
The company is also thinking about adding tutorial video content from users. “There’s a lot of great content,” he observed. “Imagine if somebody is testing DeepSeek and we’re able to, let’s say, live stream that and capture it and somebody else can learn from that experience. We really want to go from being a knowledge base into much more of a community site.”
If this sounds to you like Reddit, Chandrasekar said he gets that a lot. The difference, he noted, is that “we’re obviously a very specialized audience; we are not trying to be all things to all people.” He admires the larger social site. “I know them very well, and they’ve been great. They’re very much a close cousin to us, or, maybe, a bigger brother.”
He observed that Reddit’s licensing deals with OpenAI helped pave the way for the knowledge solutions business.
There is always the danger in expanding a successful property that one can spread one’s efforts too thin. How does Stack Overflow place its bets?
“We want to pick the ones that resonate the most with our users,” he said. “I talk to the community a lot, basically engage with them to understand which ones to double down on. We are now going and running tests on which ones actually make a difference.”
The relationship internally with gen AI has also changed. Early on, when ChatGPT went public, some users of Stack Overflow were grabbing ready-made answers from the bot. The site responded by banning the cut-and-paste replies.
“But then, we said, let’s talk to the community and see how they wanted to do things,” he said. “One thing that became obvious is that people still found it to be fairly difficult to just engage human to human on asking questions.”
Traditionally, human moderation on the site might lead to moderators scolding repetitive or newbie questions. “If you’re asking a question about a technical subject, if someone had answered before, somebody would tell you, that’s a wrong question. Go search before you ask.”
There came “a huge opportunity to use AI.”
The site recently went live with “Gemini-powered” answers. Now, “Gemini is giving you a prompt that it’s already been asked and answered” and taking you to the relevant listing, “all in a private window of just you and the AI,” so there’s no shame in your newbie inquiry.
The changes to Stack Overflow’s public site are the latest initiative, but the licensing deals and the additions to Teams seem to be helping the business already.
Going forward
“We’re growing as a company,” said Chandrasekar, while declining to disclose financials. Stack Overflow is owned by European investment giant Prosus NV of the Netherlands, which acquired it four years ago for $1.8 billion.
Prosus is publicly listed, so the company may disclose actual financial information about Stack Overflow when it announces its full fiscal year report, which it usually does toward the end of June each year.
The knowledge solutions part of the business has become Stack Overflow’s fastest-growing business, followed by Teams sales and the advertising business, which is a “very steady” business because of the constant demand to advertise to programmers where they spend time. Each of the three is roughly a third of revenue, said Chandrasekar.
Perhaps the stickiest part of this is how the community handles it. There was pushback when the company first approached Google and OpenAI about data licensing, which is not surprising, since the community considers the 60 million questions and answers their community property in a sense, even if they are legally the property of the Stack Overflow corporation.
There was so much pushback that some users said they were banned from the site if they caused a stir over the licensing deals.
Without getting into the details of past conflicts, said Chandrasekar, at this point, Stack Overflow users have come to realize that the company is not a not-for-profit and that it needs to make an income to serve its purpose.
“I did an AMA two weeks ago,” an “Ask Me Anything” with users on Stack Overflow, he recalled. “I said, look, we’re not like any other site out there, but we don’t take donations. You have a business that supports the site, and one way to drive a business is to leverage what you have that’s useful to add value in the ecosystem.”
The result, he said, is that “they realized this is a good thing for Stack, and for them, because by not doing this, we are actually not capturing the revenue that we need to be able to invest back into the community, to build the moderator tools that they need. So, the community understands it now, slowly.”