I. Introduction
Soccer, one of the most popular sports in the world, has a fascination of its own. Soccer
fans are enchanted by the tense and exciting moments around goals, especially the last-gasp
goals that decide the result of a game and, in turn, the final ranks and points of the teams.
For example, in the 2015-16 season of the English Premier League, the dark horse team
Leicester, which had only been promoted to the Premier League for the 2014-15 season,
surprisingly finished ahead of all the other teams and won the championship. This
illustrates one of the most charming aspects of soccer: its complexity, which makes game
results hard to predict.
Although game results and team ranks are hard to predict, curious people always want to
identify the key factors that determine the result, each for their own reasons.
Betting companies have to predict game results and ranks as accurately as possible,
since they must set odds that produce a stable profit from gamblers. Their methodology
may be complex and varied: it can range from analyzing the strength of the two teams and
the likely strategies of the two coaches to the choice of referee on game day, the injury
situation of both teams, and their upcoming schedules. Gamblers and sports fans also
want to predict game results and ranks correctly: gamblers want to earn money from
betting companies, and fans take pleasure in correctly predicting that their favorite club
will win a game or the season's championship. Limited by a lack of information and
industry experience, they have to base their predictions on fewer inputs, such as the
general performance of the two teams in the current season (which can easily be obtained
from the league table), historical game records, and the odds offered by betting companies.
As a statistician and a soccer fan, I focus on predicting game results using statistical
modeling techniques. I predict game results and team ranks in a very direct way: by
predicting the number of goals scored by each team. The reason for predicting goals is
that, regardless of the strategies the coaches use or the types of goals scored, the team
that scores more goals than its opponent wins the game. From the predicted result of each
game, I can easily build a final table containing team ranks and final points. The goal
of the project is to predict the final ranks and points of the 2016-17 Premier League season.
This goal will be achieved in the following main steps:
(1) Collect the data and reorganize it so that it can be used to construct prediction
models
(2) Evaluate and test several models on the 2015-16 season
(3) Select the best of those models
(4) Make predictions with the best model selected in step (3)
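
The idea of turning predicted goals into a final table can be illustrated with a minimal sketch. It assumes the usual league scoring of 3 points for a win, 1 for a draw and 0 for a loss, with teams level on points separated by goal difference and then by goals scored. The function name build_table, the tuple layout, and the teams A, B and C with their scores are hypothetical and serve only as an illustration; they are not the data or code of this project.

```python
from collections import defaultdict


def build_table(matches):
    """Build a ranked table from predicted scores.

    matches: iterable of (home_team, away_team, home_goals, away_goals).
    Returns a list of (rank, team, points, goal_difference).
    """
    # Per-team totals: points, goals for (gf), goals against (ga).
    stats = defaultdict(lambda: {"points": 0, "gf": 0, "ga": 0})

    for home, away, home_goals, away_goals in matches:
        stats[home]["gf"] += home_goals
        stats[home]["ga"] += away_goals
        stats[away]["gf"] += away_goals
        stats[away]["ga"] += home_goals

        # Assumed scoring: 3 points for a win, 1 each for a draw.
        if home_goals > away_goals:
            stats[home]["points"] += 3
        elif home_goals < away_goals:
            stats[away]["points"] += 3
        else:
            stats[home]["points"] += 1
            stats[away]["points"] += 1

    # Sort by points, then goal difference, then goals scored.
    # The team name is an arbitrary last tie-breaker (an assumption).
    ranked = sorted(
        stats.items(),
        key=lambda item: (
            -item[1]["points"],
            -(item[1]["gf"] - item[1]["ga"]),
            -item[1]["gf"],
            item[0],
        ),
    )
    return [
        (rank, team, s["points"], s["gf"] - s["ga"])
        for rank, (team, s) in enumerate(ranked, start=1)
    ]


# Hypothetical predicted scores for three made-up teams.
example_matches = [
    ("A", "B", 2, 1),
    ("B", "C", 0, 0),
    ("C", "A", 1, 3),
]
table = build_table(example_matches)
for row in table:
    print(row)
```

In this toy example, teams B and C finish level on points, so their order is decided by goal difference. The same function could be applied to the predicted scores of a full season once a goal-prediction model is available.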