On March 27, 2019, the founder of Ahrefs, Ukrainian Dmytro Gerasymenko, announced an ambitious new project — the YEP search engine, which he claimed would be able to compete with Google. How could a team of 100 people compete with the tech giant and search market monopolist? Dmytro stated that he wanted to fix some problems that would always remain problematic with Google — user privacy and revenue sharing with content creators. He previously spoke about his idea of the search engine of the future in an interview with AIN.Capital.
A year ago, YEP.com went online. The team hasn’t made any big announcements because the search engine is still in beta. But in just under a year of operation, it has already received more than 250 million search queries.
It also has an image search, a news section, and an AI summary — an LLM-based tool that allows users to get a summary of their search query generated by artificial intelligence.
Dmytro Gerasymenko and Ahrefs Head of PR Daria Samokish told AIN.Capital how YEP works today, how a team of 16 people is building an AI-based search engine that aims to one day break Google’s hold on the search market, and what problems YEP faces because it refuses to track users.
At what stage is the product?
Dmytro: The Yep beta version has been running for a year, but we have not announced it. In June 2022, TechCrunch wrote about the launch of the beta version of Yep. There has not been an official full-fledged launch of the search engine yet. The search engine is already working and shows good results, but they are not yet perfect, so it is too early for a full-fledged launch. We are not yet ready to compete with Google in terms of search results. We are currently working on improving our algorithms.
Daria: Nevertheless, we have done a lot during this year. In June 2022, when we announced the beta, we only had a web search. But the YEP interface and capabilities have expanded now: current news, image search, etc. We are considering the possibility of acquiring some ready-made map solutions this year or in the near future.
DuckDuckGo, Neeva, Ecosia, and other search engines offer features that users are already used to. If we want to compete with them, we have to provide a search service that matches the benchmark already established in the market. We can’t compete while our search engine is still worse than Google’s. We have to take into account all the expectations that the market already has. And we want to resolve all the issues while we are in beta so we can meet users’ expectations at the official launch.
Competitive advantages
Dmytro: First, it is our unique business model — we will give 90% of our profits to content creators, while large search engines take all the money.
Secondly, privacy. From the very beginning, YEP did not plan to collect cookies and sell user data to third parties, as Google does, even though this creates some problems for us.
We already have a couple of million search queries a day, and we know that some of them come from real people.
But there are also scrapers. These may already be our competitors — search engines trying to get our search results and then work with them. Since we have very high privacy guarantees, we can’t track which queries are from real people and which are from robots and filter out the latter.
Daria: Privacy issues have been discussed in Europe and the United States. Google and Apple are facing lawsuits over this issue. We want to comply with all the governmental requirements of the European Union, the US, and other countries. The infrastructure of our search engine makes it possible to handle extensive search data without transferring it to any third parties.
Even if we used some click data (which we are not going to do), it would not leak anywhere, as happened with DuckDuckGo, which recently had a scandal: it turned out that they had shared some data on users in the United States with Microsoft. Such a situation is impossible with us, because YEP was built from the start as a private search engine.
Dmytro: Thirdly, we want to develop competition in the search market for the benefit of companies and users.
There are many search engines that do not have their own index or have a very small one. Therefore, they buy search results via API from Bing, but offer them in their interface with additional features. There are a lot of search engines, but only Google, Bing, and YEP have their own large indexes that cover most of the internet, different languages, and countries.
Google does not have an API. It does not allow third-party programs to use its search results. And Microsoft has recently announced a significant price increase. For some features, the price will increase by nine times! It is clear that Microsoft is simply getting rid of competitors this way. Now all these services have to pay 4-9 times more. For some of them, it will become unprofitable.
We want to offer an affordable API and allow people to use our search results for commercial purposes in their programs, AI applications, ChatGPT tools, etc. We have the third largest index after Google and Microsoft’s Bing — so we can do it.
Other search engines will either find a way to cope with the price increase, or most competitors in the search market will simply disappear. Of course, some search engines have their own index, but it is 100 times smaller than ours. We can also mention Yandex, which sometimes ranks third in terms of index size. But for international search engines, it is unlikely that anyone will buy access to Yandex’s index, for political reasons.
How the third most active web crawler was built
Daria: In general, how did Dmytro come to the idea of launching Ahrefs? One day, while he was running several projects simultaneously, Dmytro needed search data for one of them. He contacted the biggest search data seller at the time, Majestic. After reviewing an invoice from the company, Dmytro thought: “For this money, I could build a similar system and collect the search data myself.” That is how Ahrefs, with its web crawler, began its journey to becoming the third most active crawler in the world.
Dmytro: It took years. We did not have the kind of money Google or Microsoft have, so we had to optimize everything so the program could run at minimal cost.
The Google Index is already 26 years old. Microsoft’s Bing Index has been in development for 14 years. And the Ahrefs Index that YEP now uses has 13 years of development behind it. Considering the volume, the available infrastructure base, and the developers’ expertise, we could technically and morally compete with those giant companies.
Investments and development
Daria: We have already invested $60+ million. Most of it was spent on data centers. Initially, the entire infrastructure was in Singapore, but now we are starting to index from the US, and we are in the process of opening an American data center. That will increase our crawling and data processing capabilities. Although our data center is currently only in Singapore, this does not limit the use of YEP.
Dmytro: The infrastructure we invest in is not only for YEP but also for Ahrefs. It serves tens of thousands of Ahrefs users worldwide. We use 80% of what we need for YEP in Ahrefs. We need a crawler to have data in Ahrefs. We need to index this data to have Ahrefs data. YEP is just another kind of index.
We expect the greatest number of early YEP users to be in the US. It is essential for a search engine to be close to the user, so we want a data center there; it will make the service much faster for US users. Also, if something happens in one data center, we can serve all our Ahrefs and YEP users from the other one, in the US or Singapore.
What is YEP TLDR (AI Summary in the search field)
Daria: In December, word about OpenAI first spread in the USA through tech journalists who write for developers. By late February and early March, the ChatGPT boom in the USA was so loud that it echoed in Ukraine within the next month.
Then Microsoft invested in OpenAI and declared its great comeback to the search engine market: “Google, hold on!” Google responded with Bard. Both tech giants announced AI summaries to ease users’ search experience.
While everybody was talking about whose AI was cooler, we developed our first AI Summary prototype and rolled it out to be used in Yep. Our AI Summary is called YEP TLDR (too long, didn’t read).
This technology creates AI-generated summaries of the top search results for a given query. For example, you enter “What is blockchain” in the search field and don’t need to read all 10 pages in the search results to get an answer. Instead, YEP TLDR reads them for you, keeps only the core meaning, and produces a condensed text that leaves you with a clear understanding of what blockchain is.
YEP TLDR is available as a widget on the YEP search page. A summary is ready within the few seconds the program needs to read the top results and write a short text. I use YEP by default, and TLDR helps me a lot.
Why YEP TLDR is better than others
Dmytro: We didn’t make big announcements since it was a new tool requiring many refining steps. It is an advanced technology driven by an LLM (large language model). And here we have several problems.
Daria: American specialists said such tools could hallucinate by producing conspiracies instead of facts. For example, an AI made a story based on an article in The Wall Street Journal about a child kidnapping that never happened. Likewise, ChatGPT or any search engine with an AI summary can provide users with inaccurate or fake information.
Dmytro: We are working hard on YEP TLDR to avoid such failures. We don’t ask the AI system to generate a text from scratch. Instead, we tell it: “Here are ten pages; read them and make a summary for us.” I believe we have now reached the point where “hallucination” should no longer occur in YEP.
However, we can’t promise users that the issue is fixed entirely. But here you get a bonus: users can check the accuracy of our data. Our summary always has a source link below it, so you can see in one click where the AI got this information. And any user can leave feedback under this source link. So, if you read information that looks inaccurate or hallucinated, you can always report the issue to us.
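The approach Dmytro describes, handing the model a fixed set of pages and showing source links under the summary, can be sketched roughly as follows. This is only an illustration under stated assumptions, not YEP's actual code: the `Page` fields, the function names, and the prompt wording are all hypothetical, and the call to the language model itself is left out.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One search result; all field names here are illustrative."""
    url: str
    title: str
    text: str


def build_summary_prompt(query: str, pages: list[Page], max_chars: int = 500) -> str:
    """Ask the model to summarize only the supplied pages and cite them by number."""
    sources = []
    for i, page in enumerate(pages, start=1):
        excerpt = page.text[:max_chars]  # trim each page to keep the prompt short
        sources.append(f"[{i}] {page.title} ({page.url})\n{excerpt}")
    return (
        "Summarize the answer to the query using ONLY the numbered sources below.\n"
        "Cite sources as [n]. If the sources do not answer the query, say so.\n\n"
        f"Query: {query}\n\n" + "\n\n".join(sources)
    )


def source_links(pages: list[Page]) -> list[str]:
    """Links shown under the summary so a reader can check each claim."""
    return [f"[{i}] {page.url}" for i, page in enumerate(pages, start=1)]
```

Restricting the model to numbered sources does not rule out mistakes, but it gives every sentence of the summary something a reader can check against.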
How 16 people could create an AI with extras
Dmytro: Our team is small. Ahrefs has 110 employees in total. Only 16 of them are working on YEP full-time. For comparison, our biggest competitor has 1,500 specialists there. Our revenue is also very different: we planned to earn $100 million a year, while Google makes $160 billion on search alone.
Daria: Our smallness is our advantage. We work like a startup, without bureaucracy. Here is how I see the creation of YEP TLDR. OpenAI launched the sensational ChatGPT. Somebody shared the news in our developer chat, and then Dmytro said: “We will gather five to ten people and make the same for YEP. One will develop a prototype; another, the interface. The third one trains the model; the fourth will do tests, etc.”
Now imagine the same process in a large corporation. The idea gets lost at the team level, where managers supervise managers with all their approvals, calls, and negotiations with investors. Our team can create a product without unnecessary calls. One guy comes and says, “Hey guys, what can we do to make this?” The developers, designers, and I gather and work out a solution very fast. I think we had a clear vision of how to integrate the AI summary into YEP after just a few days.
Ukraine-made search engine
Dmytro: Although Ahrefs and YEP are companies based in Singapore, we have very close ties to Ukraine. The founder & CEO is Ukrainian, and the leadership team is from Ukraine. In general, about 20-25 people in the company (one-third of the entire workforce) are from Ukraine. Some of them are geographically located in Ukraine. During the war, some left, some stayed on principle, and some returned.
Daria: The human intellectual capital of Ukrainians is extraordinary and noticeable in the world. When we announced the beta, we made one announcement, after which 200 media outlets wrote about us. And after just one announcement, we have two million daily queries on YEP. So, you can imagine that we did not involve advertising or marketing — we have not done anything in this direction yet.
When will YEP leave beta, and what else needs to be done
Dmytro: Currently, when you submit a search query that we cannot answer for some reason, we offer you the option to use Bing, Google, or DuckDuckGo results. All of YEP’s competitors are reachable with one click of a button.
We did this because, first of all, we want our users to have the fastest and most effective access to relevant information. We want them to have a simple and superb search engine that pays content authors for their contributions. This is the goal of YEP.
The moment I no longer have to click this button to look at Google or DuckDuckGo search results, because YEP always gives me a great result, will be the time to take YEP out of beta. I hope to get there by the end of this year. But nobody can forecast the future.
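A one-click fallback of the kind described above can be as simple as building a link to each alternative engine with the same query. The sketch below is an illustration, not YEP's implementation: the dictionary and function names are hypothetical, and only the public search URL patterns of the three engines are assumed.

```python
from urllib.parse import urlencode

# Public search URL patterns; the dictionary and function names are illustrative.
FALLBACK_ENGINES = {
    "Google": "https://www.google.com/search",
    "Bing": "https://www.bing.com/search",
    "DuckDuckGo": "https://duckduckgo.com/",
}


def fallback_links(query: str) -> dict[str, str]:
    """Return one link per alternative engine, carrying the same query."""
    params = urlencode({"q": query})  # percent-encodes the query safely
    return {name: f"{base}?{params}" for name, base in FALLBACK_ENGINES.items()}
```

Because the links are plain URLs, offering them requires no tracking of the user on YEP's side.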
Nowadays, language models have a drastic impact. Things that used to take years are now doable within a month. For example, you make a search query and get results. Some of them are good; others aren’t. Until recently, we had no fast tool to check them. Sure, we could ask a person to evaluate the results, but there is no time for that: we have just 200 milliseconds. Now, we can use a language model to check the relevance of every result. We have used this technology since last year.
Today, a very large LLM automatically checks the relevance of our search results.
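To make the idea concrete, here is a minimal sketch of relevance filtering under a time budget. It is an assumption-laden illustration rather than YEP's pipeline: `score` is a hypothetical stand-in for the LLM relevance call, the threshold is arbitrary, and treating the 200 milliseconds as the budget for this step is an interpretation of the interview.

```python
import time
from typing import Callable


def rerank_with_budget(
    query: str,
    results: list[str],
    score: Callable[[str, str], float],  # hypothetical stand-in for the LLM relevance call
    threshold: float = 0.5,              # arbitrary cut-off, chosen for illustration
    budget_s: float = 0.2,               # assumed: the 200 ms mentioned in the interview
) -> list[str]:
    """Keep results scored at or above the threshold, best first.

    Results not scored before the deadline are appended unscored, in their
    original order, so slow scoring never drops them silently.
    """
    deadline = time.monotonic() + budget_s
    scored: list[tuple[float, int, str]] = []
    unscored: list[str] = []
    for position, result in enumerate(results):
        if time.monotonic() >= deadline:
            unscored.append(result)
            continue
        value = score(query, result)
        if value >= threshold:
            scored.append((value, position, result))
    # Sort by score descending; ties keep the original ranking order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [result for _, _, result in scored] + unscored
```

Passing the scorer in as a function keeps the sketch independent of any particular model; in a test it can be replaced by something as simple as word overlap.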
How can YEP break the monopoly in the search engine market
Daria: According to Statista, Google had a 92% share in the past year. However, its market share is shrinking. Statista reports 87% for Google in 2023. Competitors never sleep. For competition to improve, the biggest player must free up some space. It is the basics of the market economy: the more competing and innovative players there are, the better the experience for users. But if there is a monopolist, it sets the rules, and everyone else follows them.
Many American media outlets have reported that Google has monopolized the path to users. Google is present by default on smartphones and tablets. You may buy a new phone and download a browser you like, but it will come with Google as the default search engine. To change it, users must edit some settings. We developed a button on the main page of YEP: “Set YEP as your default search engine.” You can do that quickly, without having to read half of the Internet as I did to set up an alternative search engine.
European lawmakers are also very concerned about this issue. Laws are demanding that companies pull back to increase competition. So we can expect a situation in which, for example, Apple gives iPhone users a choice of default search engine. That has already happened to Google: Android smartphone users must be able to choose a search engine from a list, and the same applies in India. So we expect some help from national governments in winning part of the market from the monopolist.
Could YEP win some part of the market from Google
Dmytro: We want YEP to become the most popular search engine. Therefore, in the first stage, we would like to help alternative search engines and allow them to reduce the monopoly of Google and Bing. Since search is costly to maintain, Google and Microsoft actually control everyone else with their prices. We can fix this and allow alternative search engines to develop as well.
And, of course, we will try to make YEP as good as the top search engines, maybe better. Then we will start a marketing campaign to get more and more users. And with our business model, we can win over many people who are open to it.
Daria: We are often asked, “How will you compete with Google?” It always makes me smile. It will take years to start competing with Google. The investments Google or Bing make in search are incomparable with what we can invest. In terms of budgets, it’s like a massive whale vs. a small rabbit. But we keep up with the market, trying to offer features that users want. You can look at the search, and although it’s still a beta, it is already an impressive beta that works and follows market trends. When the story of ChatGPT broke in the media, YEP reacted very quickly and built an AI summary feature with a team of only 16 people! And if something else comes up, our team will gather