Data Collection and Analysis in Equestrian

Sussi Zhu
Published in CISS AL Big Data
6 min read · Oct 26, 2023

Introduction

We usually look back at wins and losses to decide who the biggest sports stars are. For example, Stephen Curry grew in popularity because of his exceptionally high three-point shooting numbers in recent years. The tennis player John McEnroe is well known because he won 77 singles titles and 78 doubles titles over his career, a combined total that still stands as a record. These examples show how focusing only on past results narrows our field of vision. But what if, using historical data, we could reach into the future and congratulate athletes beforehand? This is where game prediction comes into play.

1. Data Collection in Equestrian: What? How?

Show jumping is a thrilling branch of equestrian sport in which riders guide their horses through a series of obstacles in a timed event. The sport has become strongly data-driven over the past decade, powered by advancing technology and the mass of information now available. Technology increasingly supports optimization and prediction in ways that human judgment alone cannot match. Analysts and researchers collect data across several areas of the sport.

Fig. 1: Athlete registration page Fédération Equestre Internationale (International Equestrian Federation) FEI. “Fei & International Information: ESNZ.” ESNZ | The New Zealand National Federation for Equestrian Sports, 12 Oct. 2023, www.nzequestrian.org.nz/esnz/membershipregistration/fei-international-information/.

Personal Information

British Equestrian collects personal data, which it may share with reputable organizations for record-keeping. Such information often includes name, age, address, email, career history, sporting results, and qualifications. This type of data is not typically openly available on the web, so the organization uses several methods to collect it. British Equestrian generally obtains information from individuals in the following ways:

a). Directly from the individual.

This channel generally refers to surveys in which athletes voluntarily provide their personal information.

b). Web server.

When you visit the website, the web server records your IP address, the identifier that computers and devices use to recognize and communicate with each other. This allows information about your visit to be collected.

c). Third parties.

This channel draws on assessments or comments from coaches; rider, owner, or horse information provided by member bodies; competition results; testing results; and similar sources.

d). Registration process

Registration is a common means of data collection, not only in the equestrian industry but across countless websites. When an individual registers as an athlete, they may be required to enter essential information such as name, age, email, and nationality, as shown in Figure 1. This gives the association data points that are stored in its database for future analysis.

The data collected from you may then be transferred to a destination outside British Equestrian’s secure network. However, the external parties that receive this information are expected to be reliable and secure, so that the data is used only as necessary.

Equine Welfare

The Equine Welfare Data Collective (EWDC), in collaboration with the United Horse Coalition, has introduced an enhanced and more flexible monthly data collection system. This process gives participants more freedom to update their data as they prefer. All data is aggregated to ensure the anonymity of individual organizations and users. However, the challenge with voluntary participation is that the organization may not acquire enough data to analyze. Thus, the EWDC introduced a reward system: by contributing data, participants receive exclusive reports, invitations to round table discussions, and an EWDC badge to publicly showcase their commitment to the mission.

2. Focus on specific key performance indicators (KPI)

To predict the outcomes of a game, the first and most crucial step is to grasp past data. According to Christina Rasnake, the director of Sports Science & Analytics at the University of Delaware, a mistake people tend to make is over-collecting data: many analysts gather excess data that is irrelevant to the subject. Instead, “the key principles in collecting data for team sports are standardization, centralization, integration, and implementation,” says Rasnake. Therefore, when searching for data points, we need to acquire information from reliable and standardized sources and determine the key indicators of the athletes’ performance. The indicators need to stay constant throughout the examination process so that athletes are compared accurately and fairly.
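As a rough illustration of keeping indicators constant, the sketch below computes the same small set of KPIs for every rider from a list of round results. The KPI names (`clear_round_rate`, `avg_faults`, `avg_time_s`), the `Round` record, and the sample data are all assumptions made up for this example; they are not taken from Rasnake or from any federation's data format.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Round:
    """One show jumping round (hypothetical record layout)."""
    rider: str
    faults: int      # penalty points in the round
    time_s: float    # round time in seconds


# The KPI set is defined once and applied unchanged to every rider.
KPI_NAMES = ("clear_round_rate", "avg_faults", "avg_time_s")


def compute_kpis(rounds):
    """Return {rider: {kpi_name: value}} using identical definitions for all riders."""
    by_rider = {}
    for r in rounds:
        by_rider.setdefault(r.rider, []).append(r)

    kpis = {}
    for rider, rs in by_rider.items():
        kpis[rider] = {
            "clear_round_rate": sum(1 for r in rs if r.faults == 0) / len(rs),
            "avg_faults": mean(r.faults for r in rs),
            "avg_time_s": mean(r.time_s for r in rs),
        }
    return kpis


# Made-up sample data, for illustration only.
SAMPLE_ROUNDS = [
    Round("Rider A", 0, 72.0),
    Round("Rider A", 4, 70.0),
    Round("Rider B", 0, 75.0),
    Round("Rider B", 0, 77.0),
]

if __name__ == "__main__":
    for rider, values in compute_kpis(SAMPLE_ROUNDS).items():
        print(rider, values)
```

Because every rider passes through the same function, no athlete is judged on an indicator that another athlete lacks.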

Fig. 2: concept of web scraping “What Is Web Scraping?” WebHarvy Web Scraping Software — Easy to Use Web Scraper, WebHarvy, 25 Oct. 2023, www.webharvy.com/articles/what-is-web-scraping.html.

Once sources have been identified, it is impractical to search manually for the needed data among the vast quantity of data points. Thus, we will use a process called “web scraping,” as shown in Figure 2. Web scraping refers to extracting exactly the information we need from a large site. This can be done with Python.
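A minimal sketch of the idea, using only Python's standard `html.parser` module, is shown here. The HTML snippet, its column names, and the `scrape_results` helper are hypothetical stand-ins: a real results page would first have to be downloaded (for example with `urllib.request`), would have its own layout, and should only be scraped in line with the site's terms of use.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows, each a list of cell strings
        self._row = None      # row currently being read, or None
        self._in_cell = False
        self._cell = []       # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._cell = []

    def handle_data(self, data):
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._row.append("".join(self._cell).strip())
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up page content standing in for a downloaded results page.
SAMPLE_HTML = """
<html><body>
<h1>Results</h1>
<table>
  <tr><th>Rider</th><th>Horse</th><th>Faults</th><th>Time</th></tr>
  <tr><td>Rider A</td><td>Horse X</td><td>0</td><td>72.31</td></tr>
  <tr><td>Rider B</td><td>Horse Y</td><td>4</td><td>70.85</td></tr>
</table>
<p>Footer text that we do not need.</p>
</body></html>
"""


def scrape_results(html):
    """Return one dict per table row, keyed by the header row's labels."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    for record in scrape_results(SAMPLE_HTML):
        print(record)
```

Only the table cells are kept; the heading and footer text around the table are ignored, which is the "extract exactly what we need" step in miniature.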

3. Analytical Models/Tools

Power BI

One tool that professionals, including Rasnake herself, use to analyze data is Microsoft Power BI, a powerful program for interpreting and visualizing data. When data points are first entered into Excel, they can seem disorganized. Power BI helps clean up the raw data and allows for the creation of dashboards that provide concise and meaningful analysis, enabling quick, efficient, and customized reporting and presentation.

IBM Cognos Analytics

Another useful tool is IBM Cognos Analytics. In this software, you can create active reports from all the data collected. You can also switch between different “regions” after sorting the data. For instance, by clicking the drop-down menu, you can easily switch between athletes to visualize the different trends in their data. Hence, instead of generating individual charts and graphs for each athlete, I am able to view every visualization in one report.

Fig. 3: Radar chart creation process (steps) “Radar Chart Steps Template.” Edraw, 2023, www.edrawsoft.com/template-radar-chart-steps.html.

Radar Charts

Radar charts are another helpful tool for analyzing results across several factors at once: they offer a vivid visual depiction of data that is easy to understand and interpret. (To create radar charts, I will also be using the IBM Cognos Analytics reporting tool.) Rather than trendlines or countless data points on a graph, a radar chart presents the results in a compact, organized way. Radar charts also support analysis over time, allowing users to track progress and identify areas for improvement. Instead of focusing on the correlation between two factors, this method lets us see the relationship among multiple parameters. Their simplicity and intuitiveness allow for concise, fast assessments. The process of creating the chart is also relatively straightforward, as displayed in Figure 3. Because radar charts clearly illustrate a player’s strengths and weaknesses, they support the game bidding and result prediction process.
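To make the geometry behind a radar chart concrete, here is a small standard-library sketch that is independent of IBM Cognos Analytics. It scales each indicator to a 0 to 1 score and places it on its own evenly spaced axis. The KPI names, the (worst, best) bounds, and the sample rider are invented for illustration, and a plotting tool would still be needed to draw the resulting polygon.

```python
import math


def normalize(values, bounds):
    """Scale each KPI to 0..1 from its (worst, best) bounds.

    'best' may be smaller than 'worst' for KPIs where lower is better.
    """
    scaled = {}
    for name, value in values.items():
        worst, best = bounds[name]
        score = (value - worst) / (best - worst)
        scaled[name] = min(1.0, max(0.0, score))  # clamp to the chart's range
    return scaled


def radar_vertices(scores):
    """Return (name, x, y) per KPI: axes evenly spaced, first axis pointing up."""
    n = len(scores)
    points = []
    for i, (name, score) in enumerate(scores.items()):
        angle = math.pi / 2 - 2 * math.pi * i / n  # clockwise from the top
        points.append((name, score * math.cos(angle), score * math.sin(angle)))
    return points


# Hypothetical bounds and rider values, for illustration only.
SAMPLE_BOUNDS = {
    "clear_round_rate": (0.0, 1.0),   # higher is better
    "avg_faults": (8.0, 0.0),         # lower is better
    "avg_time_s": (90.0, 60.0),       # lower is better
}
SAMPLE_RIDER = {"clear_round_rate": 0.5, "avg_faults": 2.0, "avg_time_s": 75.0}

if __name__ == "__main__":
    scores = normalize(SAMPLE_RIDER, SAMPLE_BOUNDS)
    for name, x, y in radar_vertices(scores):
        print(f"{name}: score={scores[name]:.2f} x={x:.3f} y={y:.3f}")
```

Each vertex lies at a distance from the center equal to its score, so a larger polygon means stronger overall performance and a dent in the outline marks a weakness.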

Conclusion

In short, investigating data in the sports industry is no simple task. The prediction process is built up in layers: identifying key indicators or parameters, then visualizing and interpreting the data, and finally carrying out further analysis with mathematical models. By following these procedures, we can gain valuable insights and make predictions.

Citations:

“Data Extraction in Python.” ScrapingBee, 2023, www.scrapingbee.com/tutorials/data-extraction-in-python/.

Harkins, Ashley. American Horse Council, 4 Mar. 2021, horsecouncil.org/press-releases/equine-welfare-data-collective-updates-data-collection-procedure/.

Vico Plaza, Joaquín. “How to Analyze the Main KPI of Your Athletes.” Vitruve, 3 Mar. 2023, vitruve.fit/blog/how-to-analyze-the-main-kpi-of-your-athletes/.

“Privacy Policy.” British Equestrian, 2023, www.britishequestrian.org.uk/privacy-policy.

Rasnake, Christina. “Best Practices in Data Collection for Sport with Christina Rasnake.” SimpliFaster, 24 Sept. 2021, simplifaster.com/articles/data-collection-sport-christina-rasnake/.
