How to Build a Recommender System

The Excitement and Challenges of AI Projects

I often hear requests like, "Let's build a recommender system" or more commonly, "Let's implement AI." The appetite for AI solutions is growing rapidly.

For large corporations, this journey can feel like navigating a corporate-technological labyrinth. While startups might seem like an easier playing field, the reality is often more complex.

Excitement runs high at the start of such a project. Outside a corporate environment, a prototype can usually be built within two weeks, which raises great hopes. After this initial phase, however, the project often enters the notorious "valley of death." A team that makes it through can end up with something truly valuable.

Is it worth embarking on such a mission? How can we ensure we don't get stuck halfway?

Setting Up for Success: The Key Criteria

Before diving into action, it’s crucial to set the right conditions from the start. Success in analytical projects often depends on several key factors:

  1. A Clear Goal Instead of Vague Visions: The objective should be realistic and well-defined, avoiding overhyped, unrealistic expectations.
  2. Quality Input Data Instead of Garbage: High-quality, relevant data is essential for any AI project. Inadequate data can lead to poor outcomes.
  3. Solid Work Organization: It’s essential to have an effective process in place, rather than attempting to extract results without proper planning and resources.
  4. Technology and Architecture That Fit the Problem: Choose the right tools for the job rather than overcomplicating the solution with unnecessary technology.

Source: “Chief Data Officer” (https://books.chiefdataofficer.pl).

Flexibility in Startups: The Need for Quick Pivots

Once the groundwork is laid, it's time to move forward. Startups are usually more flexible: changing the purpose of a solution or pivoting the entire project is natural and often necessary. Without the ability to quickly adjust the tech stack and business approach, a project may fail, and its participants may shift their focus to other tasks.

Good preparation, especially learning from previous failures, is key. By analyzing past mistakes, we can avoid falling into the same traps in future projects.

The Data Trap: Expect the Unexpected

Even when the data appears to be available, in practice we often return to the data preparation stage multiple times. Each iteration costs time and resources, which limits the number of possible adjustments. In most projects, data preparation consumes about 80% of the resources, in corporations and startups alike. It is worth investing in methods that drastically speed up this process, so that results can be tested faster and there is room for more iterations. The difference can be stark: I've seen projects where one cycle took four hours and others where it took four months. Those are two completely different scenarios.

Additionally, managing the resulting technical debt is crucial. Like any debt, it can either act as a lever for growth or become a burdensome expense.

A common assumption is that "the data is already there." This can be a major pitfall. Yes, data exists, but it often requires significant cleaning and preparation before it becomes usable. It might take seven iterations of data transformations before reaching a truly effective dataset. Many projects don’t survive long enough to reach this point due to limited budgets or waning patience from stakeholders.
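To make the cleaning step concrete, here is a minimal sketch of one repeatable cleaning pass over product records. The field names (`name`, `price`) and the rules (accent stripping, decimal commas, duplicate names) are assumptions chosen for illustration, not the rules of any specific project. The point is that each iteration is a function that can be rerun in seconds and that reports what it dropped.

```python
import re
import unicodedata


def normalize_name(raw: str) -> str:
    """Lowercase, strip accents and collapse whitespace in a product name."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", text).strip().lower()


def clean_products(records: list[dict]) -> tuple[list[dict], dict]:
    """Return cleaned records plus a report of how many rows were dropped and why."""
    report = {"missing_name": 0, "bad_price": 0, "duplicate": 0}
    seen = set()
    cleaned = []
    for rec in records:
        name = normalize_name(rec.get("name") or "")
        if not name:
            report["missing_name"] += 1
            continue
        try:
            # Assumption: prices may arrive as strings with a decimal comma.
            price = float(str(rec.get("price")).replace(",", "."))
        except ValueError:
            report["bad_price"] += 1
            continue
        if not price > 0:  # also rejects NaN
            report["bad_price"] += 1
            continue
        if name in seen:
            report["duplicate"] += 1
            continue
        seen.add(name)
        cleaned.append({"name": name, "price": price})
    return cleaned, report


if __name__ == "__main__":
    sample = [
        {"name": "  Café  Latte ", "price": "3,50"},
        {"name": "cafe latte", "price": 3.5},
        {"name": "", "price": 1.0},
        {"name": "Oat Milk", "price": "n/a"},
        {"name": "Rye Bread", "price": 2},
    ]
    rows, dropped = clean_products(sample)
    print(rows)
    print(dropped)
```

Keeping the report next to the cleaned data makes it easier to see, from one iteration to the next, whether a new rule helped or simply threw away more rows.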

Leveraging Advanced Tools and AI Technologies

Today, we have access to powerful tools like Large Language Models (LLMs) operating in the cloud or on local machines. Although using generative AI is tempting, it’s not always the best solution. During training sessions, tasks are often completed swiftly and efficiently, but the real challenge lies in integrating and operationalizing these solutions.

Pilot results obtained within a few days can show that a solution is feasible, but that is far from full implementation. A fully functional system needs many additional features that may not be exciting but are essential for the final deployment. Relying solely on pilot results can lead to excessive optimism.

A Case Study: Building a Recommender System for Food-Tech

In a sample project for a food-tech recommender system, we managed to organize the data, integrate information about hundreds of thousands of products from various sources, and adapt it to the needs of users in the food industry. We made many mistakes along the way but learned valuable lessons. We repeatedly adjusted our approach to the data and rules, gradually increasing the system’s accuracy.
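Tracking accuracy across iterations requires a fixed metric. The sketch below uses precision at k, a common choice for recommenders; the article does not say which metric the project used, so treat the metric and the function names as assumptions.

```python
def precision_at_k(recommended: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k recommendations that the user found relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


def mean_precision_at_k(
    recs_by_user: dict[str, list[str]],
    relevant_by_user: dict[str, set[str]],
    k: int,
) -> float:
    """Average precision@k over all users that received recommendations."""
    scores = [
        precision_at_k(recs, relevant_by_user.get(user, set()), k)
        for user, recs in recs_by_user.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    recs = {"u1": ["a", "b"], "u2": ["x", "y"]}
    relevant = {"u1": {"a"}, "u2": {"x", "y"}}
    print(mean_precision_at_k(recs, relevant, k=2))
```

Computing the same number after every change to the data or rules shows whether an iteration actually improved the system, and the same function can score a human expert's picks for comparison.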

By using classical algorithms (not just LLMs), we developed a unique architecture that continuously evolved at all levels. As a result, it now surpasses the capabilities of top market experts in the food industry and may soon help you during your shopping!
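As one example of what a classical algorithm can look like, here is a minimal sketch of item-based collaborative filtering with cosine similarity over purchase histories. The article does not describe the project's actual architecture, so this is a generic textbook technique, and the data shape (user mapped to a set of purchased items) and function names are assumptions.

```python
from collections import defaultdict
from math import sqrt


def item_vectors(purchases: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert user -> items into item -> set of users who bought it."""
    users_by_item = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            users_by_item[item].add(user)
    return users_by_item


def cosine(a: set[str], b: set[str]) -> float:
    """Cosine similarity between two binary vectors represented as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user: str, purchases: dict[str, set[str]], top_n: int = 3) -> list[str]:
    """Rank items the user has not bought by similarity to items they have."""
    users_by_item = item_vectors(purchases)
    owned = purchases.get(user, set())
    scores = defaultdict(float)
    for candidate, buyers in users_by_item.items():
        if candidate in owned:
            continue
        for item in owned:
            scores[candidate] += cosine(buyers, users_by_item[item])
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


if __name__ == "__main__":
    history = {
        "ann": {"oat milk", "granola"},
        "bob": {"oat milk", "granola", "honey"},
        "cat": {"oat milk", "rye bread"},
        "dan": {"granola", "honey"},
    }
    print(recommend("ann", history))
```

A baseline like this needs no training and is easy to inspect, which makes it a useful reference point before anything heavier is considered.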

Overcoming the "Valley of Death" in AI Projects

The "valley of death" is common in AI projects. It is the period when the algorithm still underperforms human experts. With rapid iterations, patience, advanced technology, and a solid plan, however, the solution can eventually outperform humans and scale effectively.

This is where the real value lies, and it's worth fighting for.


Author: Mariusz Jażdżyk

The author is a lecturer at Kozminski University, specializing in building data-driven organizations in startups. He teaches courses based on his book Chief Data Officer, where he explores the practical aspects of implementing data strategies and AI solutions.

FirstScore