Global AI and Data Science

Predictions Of Qatar World Cup Results

By Moloy De posted Sat December 31, 2022 08:55 PM

Happy New Year 2023.

The 2022 FIFA World Cup was the 22nd edition of the FIFA World Cup, the international football tournament contested by the men's national teams of FIFA's member associations, and it was won by the Argentina national football team. It took place in Qatar from 20 November to 18 December 2022, making it the first World Cup held in the Arab world and the Muslim world, and the second held entirely in Asia after the 2002 tournament in South Korea and Japan. France were the defending champions, having defeated Croatia 4–2 in the 2018 final. In 2022, however, France finished second, losing the final to Argentina in a tiebreaker (penalty shoot-out).

This tournament was the last with 32 participating teams; the number of teams increases to 48 for the 2026 edition. To avoid the extremes of Qatar's hot climate, the event was held in November and December. It ran over a reduced time frame of 29 days, with 64 matches played in eight venues across five cities. The Qatar national football team entered the event, their first World Cup, automatically as the host's national team, alongside 31 teams determined by the qualification process. The tournament started with an upset when Qatar lost 0-2 to the relatively unknown Ecuador in the opening match on 20 November.

I used the Kaggle dataset "International football results from 1872 to 2022", which lists 43,740 international football matches between countries together with their results, venues and tournament types, such as World Cup match, pre-World Cup match, European League or friendly match. So, given two countries, I could extract all the international matches played between them so far, together with their results, and work out the numbers and proportions of wins, losses and draws. For each pair I considered two pie charts, one covering all international matches and one covering only World Cup matches, and forecast the results of the 2022 FIFA World Cup from them.
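The head-to-head tally described above can be sketched in a few lines of Python. This is only an illustration, not the exact procedure used for the post: the field names (`home_team`, `away_team`, `home_score`, `away_score`, `tournament`) and the tournament label "FIFA World Cup" are assumptions about the dataset layout, the sample rows are made up, and `head_to_head` and `predict` are hypothetical helper names.

```python
from collections import Counter

# Made-up sample rows in the ASSUMED column layout; these are not real results.
MATCHES = [
    {"home_team": "Team A", "away_team": "Team B", "home_score": 2, "away_score": 1, "tournament": "Friendly"},
    {"home_team": "Team B", "away_team": "Team A", "home_score": 0, "away_score": 1, "tournament": "Friendly"},
    {"home_team": "Team A", "away_team": "Team B", "home_score": 1, "away_score": 1, "tournament": "Friendly"},
    {"home_team": "Team B", "away_team": "Team A", "home_score": 3, "away_score": 0, "tournament": "FIFA World Cup"},
    {"home_team": "Team A", "away_team": "Team C", "home_score": 4, "away_score": 0, "tournament": "Friendly"},
]


def head_to_head(matches, team, opponent, tournament=None):
    """Proportions of wins, draws and losses for `team` against `opponent`.

    If `tournament` is given, only matches with that tournament label count.
    Returns an empty dict when the two teams have no matching fixtures.
    """
    counts = Counter()
    for m in matches:
        if {m["home_team"], m["away_team"]} != {team, opponent}:
            continue  # not a match between these two teams
        if tournament is not None and m["tournament"] != tournament:
            continue  # filtered out by tournament type
        if m["home_score"] == m["away_score"]:
            counts["draw"] += 1
        else:
            home_won = m["home_score"] > m["away_score"]
            winner = m["home_team"] if home_won else m["away_team"]
            counts["win" if winner == team else "loss"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: counts[k] / total for k in ("win", "draw", "loss")}


def predict(proportions):
    """Pick the most frequent outcome; ties resolve in the order win, draw, loss."""
    if not proportions:
        return None
    return max(proportions, key=proportions.get)


all_matches = head_to_head(MATCHES, "Team A", "Team B")
world_cup_only = head_to_head(MATCHES, "Team A", "Team B", tournament="FIFA World Cup")
print(all_matches, predict(all_matches))
print(world_cup_only, predict(world_cup_only))
```

The two dictionaries correspond to the two pie charts: one over all internationals and one restricted to World Cup matches. How to reconcile them when they disagree is a separate decision that this sketch leaves open.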

The final was played between Argentina and France on 18 December 2022 at Lusail Stadium, Qatar. Since 1930, France and Argentina had played 12 international matches, including 3 World Cup matches and 1 match in the Brazil Independence Cup in 1972; the rest were friendly matches. Argentina was clearly predicted as the winner by the two pie charts below. But such a closely contested final, with Messi and Mbappé both proving their mettle and the winner decided in a tiebreaker, was not foreseen.

Draws were possible in group matches but not in knockout matches. The Group F match between Croatia and Belgium on 1 December ended in a goalless draw, which was the predicted result.

Argentina won their quarter-final match against the Netherlands on 10th December in a tiebreaker. The two teams had played 9 international matches so far, including 3 friendly matches, and their last international meeting was in 2014 in Brazil. The head-to-head record so far clearly does not match intuition. Giving more weight to recent matches might have improved the prediction.
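One way to give more weight to recent matches is exponential decay, where a match's weight halves every fixed number of years. The sketch below is an assumption about how such weighting could be done, not something used for the predictions in this post; the function name `weighted_proportions`, the half-life of 8 years and the sample dates are all hypothetical.

```python
from datetime import date


def weighted_proportions(results, as_of, half_life_years=8.0):
    """Recency-weighted proportions of win/draw/loss.

    `results` is a list of (match_date, outcome) pairs, with outcome one of
    "win", "draw" or "loss". A match played `half_life_years` before `as_of`
    counts half as much as a match played on `as_of` itself.
    """
    weights = {"win": 0.0, "draw": 0.0, "loss": 0.0}
    for match_date, outcome in results:
        age_years = (as_of - match_date).days / 365.25
        weights[outcome] += 0.5 ** (age_years / half_life_years)
    total = sum(weights.values())
    if total == 0:
        return weights
    return {k: v / total for k, v in weights.items()}


# Made-up history: two old losses and one recent win.
history = [
    (date(1980, 1, 1), "loss"),
    (date(1990, 1, 1), "loss"),
    (date(2020, 1, 1), "win"),
]
print(weighted_proportions(history, as_of=date(2022, 11, 20)))
```

With unweighted counts this made-up history favours a loss (two out of three), while the decayed weights favour a win because the only recent match dominates. The half-life is a tuning choice that would need to be checked against past tournaments.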

Results in the group stage have three possible outcomes for a team: win, loss or draw. So a uniformly random prediction has a 33.33% probability of being correct. I achieved 40.48% accuracy, which is encouraging.
Results in the knockout stage have two possible outcomes for a team: win or loss. So a random prediction is expected to have 50% accuracy. My prediction accuracy of 64.29% is quite acceptable.
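Because the two stages have different random baselines, the raw percentages are not directly comparable. A possible way to put them on a common scale, offered here only as a sketch and not as the measure used in this post, is a chance-corrected score: the improvement over the random baseline divided by the maximum possible improvement. The function name `skill_over_chance` is hypothetical, and the baseline assumes all outcomes are equally likely.

```python
def skill_over_chance(accuracy, n_outcomes):
    """Chance-corrected accuracy: 0 means no better than a uniform random
    guess over `n_outcomes` outcomes, 1 means every prediction was correct."""
    baseline = 1.0 / n_outcomes
    return (accuracy - baseline) / (1.0 - baseline)


group_skill = skill_over_chance(0.4048, n_outcomes=3)     # group stage: win/draw/loss
knockout_skill = skill_over_chance(0.6429, n_outcomes=2)  # knockout stage: win/loss
print(f"group stage skill:    {group_skill:.3f}")
print(f"knockout stage skill: {knockout_skill:.3f}")
```

On this scale both stages beat chance, with the knockout predictions further above their baseline than the group predictions. The uniform baseline is itself a simplification, since draws are usually less frequent than wins or losses.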
Below is the overall accuracy, which does not have a sound theoretical standing because it combines the two match formats of the group phase and the knockout phase.
Improvements to the prediction process have been suggested that take into account individual player data and more micro-level field data, such as the number of passes, shots on goal and infringements. By contrast, the above has been a naïve prediction process based on publicly available data.

QUESTION I : How can one meaningfully construct a measure of prediction accuracy that covers both the group phase and the knockout phase?
QUESTION II : Is it meaningful to come up with predictions based on simulation?

REFERENCE : 2022 FIFA World Cup Wikipedia