A seemingly simple technical gesture. Great restraint. Thousands of reactions on social media. This is the kind of event La Liga Tech analyzes, for example when Karim Benzema scored a Panenka penalty in the 81st minute of the Champions League semi-final between Real Madrid and Manchester City on Tuesday, April 26, 2022.
La Liga Tech is a spin-off from La Liga, the body responsible for the championship of the first division of Spanish football. For almost eight years, it has been developing a set of technological solutions, including an OTT (over-the-top) platform that allows content to be delivered from a mobile or web application.
“We made a big technological gamble in 2013 to modernize and professionalize the competition and expand La Liga’s growth opportunities,” said Tom Woods, Marketing and Communications Lead at LaLiga Tech. “This is something that has driven the growth of La Liga for many years around the world and in all aspects of its operations: how it creates its content, how the teams prepare, right through to the management of the competition,” he added.
About 10 months ago, La Liga launched LaLiga Tech as a technology enabler to help sports clubs engage fans, track player performance and protect broadcast rights for all kinds of events.
One of La Liga Tech’s specialties is data processing. This includes analyzing audience behavior, measuring consumption of content from broadcast platforms or social networks, detecting unlicensed streams and fraud in sports betting, and selling tickets and merchandise. “It ultimately comes down to the data,” says Tom Woods.
But there is no question of La Liga Tech creating new silos or scattering data across different systems. Its data scientists have therefore favored the adoption of a data lake.
“It’s easier to work when you have data in one place,” says Rafa Zambrano, head of data science at La Liga Tech. “When we started, deploying statistics and machine learning models was complicated because the data was scattered across different systems.”
Rafa Zambrano, Data Science Manager, La Liga Tech
LaLiga Tech chooses Apache Spark, followed by Databricks
About four years ago, La Liga opted to set up a data lake. According to Rafa Zambrano, data scientists wanted “low-cost administration and maintenance” and “accelerated development” of their projects. Their choice fell on Databricks on the Microsoft Azure cloud, through the Azure Databricks offering.
In a blog post (in Spanish), Guillermo Roldán, head of architecture at LaLiga Tech, explained the choice of Databricks in detail. The subsidiary wanted a solution that processes events in batch or in real time, runs in the cloud and can rely on GPUs. It also sought to avoid vendor lock-in, leaving open the possibility of migrating to another cloud if necessary.
Finally, the chief architect praised Azure Databricks integration with Azure AD, autoscaling, pay-as-you-go, and notebook-integrated data science environments. Note that Microsoft is one of the partners of La Liga.
Despite the availability of Databricks on various clouds, caution remains about dependence on the chosen providers and vendors. To limit the impact, LaLiga Tech chose to “bet on Apache Spark,” which gives it “peace of mind” since it doesn’t need Databricks or Azure to run Spark jobs, according to Guillermo Roldán.
This does not prevent La Liga Tech from exploiting the vast majority of features offered by this “lakehouse”.
“We use Databricks for various Big Data projects. When we need the Apache Spark engine to process data quickly, we use the scheduler to automate and test our tasks, and we have multiple metadata stores to manage data and security,” lists Rafa Zambrano. “We are processing real-time data with Databricks and are starting to test MLOps and AI model governance features as part of our data science projects.”
The data lake thus aggregates data from many DBMSs to analyze the engagement and demand of more than 15 million fans across various websites, social networks, OTT platforms and several applications.
“Having this data in a central data lake allows us to understand who our fans are and identify profiles,” says Tom Woods. “Is it someone who comes to the stadium? Do they want to buy merchandise? Who interacts with our gaming platform? And so on.”
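The profile identification described above can be illustrated with a minimal sketch. It assumes that records from different source systems share a common fan identifier once they land in the data lake; the field names, source labels and profile categories below are hypothetical and are not LaLiga Tech's actual schema.

```python
from collections import defaultdict

# Hypothetical event records, as they might arrive from separate source systems.
# Field names and values are assumptions made for illustration only.
events = [
    {"fan_id": "f1", "source": "ticketing", "action": "bought_ticket"},
    {"fan_id": "f1", "source": "ott", "action": "watched_stream"},
    {"fan_id": "f2", "source": "shop", "action": "bought_merchandise"},
    {"fan_id": "f3", "source": "fantasy_game", "action": "played"},
    {"fan_id": "f2", "source": "ott", "action": "watched_stream"},
]

# Map each action to a coarse profile label (illustrative taxonomy).
PROFILE_BY_ACTION = {
    "bought_ticket": "stadium_goer",
    "bought_merchandise": "merchandise_buyer",
    "played": "gamer",
    "watched_stream": "streamer",
}


def build_profiles(records):
    """Group events by fan and collect the profile labels seen for each fan."""
    labels_by_fan = defaultdict(set)
    for record in records:
        label = PROFILE_BY_ACTION.get(record["action"])
        if label is not None:
            labels_by_fan[record["fan_id"]].add(label)
    # Sort labels so the result is deterministic.
    return {fan_id: sorted(labels) for fan_id, labels in labels_by_fan.items()}


profiles = build_profiles(events)
for fan_id, labels in sorted(profiles.items()):
    print(fan_id, labels)
```

The point of the sketch is that this grouping only becomes a single pass once the records sit in one place; with siloed systems, each source would first have to be queried and reconciled separately.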
Analyzing increasingly specific data
“It helps us reach our audience and make the competition more relevant to that audience. And it helps our group grow,” he continued.
The data science environment also includes data and statistics produced during football matches. “We have a view of all the events of a match: score, fouls, number of passes, actions of each player, etc., as well as tracking of the position of the players and the ball using specific cameras,” explains Rafa Zambrano. “We generate a data set of about 3 million rows per game.”
Each stadium of La Liga member clubs has 16 optical tracking cameras that locate the players, the referees and the ball 25 times per second. For this, LaLiga Tech uses computer vision technology from ChyronHego, a specialist in broadcast technology for media events. This data is combined with events received through the Opta Sport sports statistics platform.
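A quick back-of-the-envelope calculation shows how a 25 Hz sampling rate can lead to roughly 3 million records per game. The sketch below assumes one record per tracked object per frame, 90 minutes of play, and 23 tracked objects (22 players plus the ball); these assumptions are ours, and the article does not describe how the data set is actually laid out.

```python
FRAMES_PER_SECOND = 25      # sampling rate quoted in the article
MATCH_MINUTES = 90          # assumption: regulation time, no stoppage time
TRACKED_OBJECTS = 22 + 1    # assumption: 22 players plus the ball, referees excluded


def position_rows(fps, minutes, objects):
    """Return (frames, rows), assuming one row per tracked object per frame."""
    frames = fps * 60 * minutes
    return frames, frames * objects


frames, rows = position_rows(FRAMES_PER_SECOND, MATCH_MINUTES, TRACKED_OBJECTS)
print(f"{frames:,} frames, {rows:,} position rows")
# prints "135,000 frames, 3,105,000 position rows"
```

Including the referees or stoppage time would push the figure higher, so this is only an order-of-magnitude check.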
According to the head of data science, this huge amount of data, obtained in batch or in real time, makes it possible to “generate compelling statistics and predictions.” “For example, when a player shoots at the goal, we can calculate his probability of scoring,” he explains.
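The article does not say how this scoring probability is computed. A common approach for this kind of estimate is a logistic model over shot features, sketched below. The features (`distance_m`, `angle_deg`, `is_header`) and every coefficient are invented for illustration; a real model would be fitted on historical shot data.

```python
import math


def scoring_probability(distance_m, angle_deg, is_header=False):
    """Logistic model of a shot's chance of scoring.

    All coefficients are made up for illustration; they are not fitted
    values and are not LaLiga Tech's model.
    """
    intercept = 0.3
    z = (
        intercept
        - 0.12 * distance_m               # farther shots are less likely to score
        + 0.025 * angle_deg               # a wider view of the goal helps
        - 0.9 * (1 if is_header else 0)   # headers are harder to convert
    )
    # The logistic function maps any real number to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


close_shot = scoring_probability(distance_m=6, angle_deg=40)
long_shot = scoring_probability(distance_m=25, angle_deg=15)
print(f"close: {close_shot:.2f}, long: {long_shot:.2f}")
```

With tracking data available, the distance and angle would come from the ball and goal positions at the moment of the shot.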
This is one of the features of MediaCoach, a solution offered to both broadcasters and football clubs. It extracts over 1,900 data points (and 300 real-time metrics) per player. Clubs use it to analyze and improve team tactics. The tool can also analyze the work of referees, with the aim of avoiding as many interruptions as possible to make matches as fluid as possible.
To detect and prevent match-fixing, La Liga Tech analyzes real-time data from forty sports betting platforms. A neural network is used to detect whether certain bets are out of the ordinary or are placed by people close to teams or players. “Our neural network generates predictions. A regression model compares these predictions with real-time data from the betting platforms,” explains Rafa Zambrano. “The regression model makes the comparison easier to interpret. We can point out which variables most affect the results before the team in charge of this analysis alerts the police.”
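The comparison step can be sketched as follows. This is a simplified illustration, not LaLiga Tech's pipeline: the `predicted` list stands in for the neural network's output, `observed` for figures received from a betting platform, and the two-standard-deviation threshold is an arbitrary choice.

```python
from statistics import mean, pstdev


def flag_outliers(predicted, observed, threshold=2.0):
    """Fit observed = slope * predicted + intercept by least squares, then
    return the indices whose residual exceeds `threshold` standard deviations."""
    mx, my = mean(predicted), mean(observed)
    sxx = sum((x - mx) ** 2 for x in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(predicted, observed))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (slope * x + intercept) for x, y in zip(predicted, observed)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # perfect fit, nothing stands out
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


# Hypothetical betting volumes for eight markets (invented numbers).
predicted = [100, 120, 90, 110, 105, 95, 115, 100]
observed = [102, 118, 91, 109, 300, 96, 113, 101]

suspicious = flag_outliers(predicted, observed)
print(suspicious)
```

A linear fit keeps the result easy to read: each flagged market comes with a residual that an analyst can inspect, which matches the interpretability argument made in the quote.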
La Liga Tech is interested in other sports
La Liga Tech has thus succeeded in developing a comprehensive solution to process data from one sport's competition: football. It now has to adapt its algorithms to other sports.
“My main challenge at the moment is to replicate what we have achieved in La Liga to meet the needs of other competitions,” said Rafa Zambrano. “We need to adapt our solutions to other sports and work with other partners.”
For his part, Tom Woods believes that La Liga Tech has succeeded in its technological bet and that the entity, like its parent company, contributes to the appeal of the Spanish championship.
“We hold ourselves responsible for fan engagement, content protection, anti-fraud and many other aspects,” said Tom Woods.
“Other organizations in the sports industry are looking to do the same. We have technology packages that are already working, the right connectors to the IT ecosystem and some clubs are using them in production,” he continued.
La Liga Tech’s first customers are the Spanish and Belgian football leagues, but the entity fully intends to offer its services to a range of sports leagues, including those dedicated to eSports, as well as to broadcasters. “We have an agreement with the World Padel Tour, a sport that is rapidly gaining popularity,” explains Tom Woods. For those unfamiliar with it, padel is a racket sport that mixes the rules of tennis, squash, table tennis and Basque pelota. “We built their OTT platform. Today, they have more than 400,000 registered users,” explains the communications manager.