Friday, September 26, 2014

A brief look at team totals

While I generally go to great lengths to emphasise the need to avoid placing too much weight on small samples and recent results, I also recognise that absolute certainty is not something we have time to wait for. So, with five weeks in the books, it's time to start delving into the numbers, starting here with a first look at some of the team trends.

We're focusing on teams first as their data tends to stabilise much more quickly than for an individual player[1] and thus becomes more reliable at an early stage in the season. To underline this point, let's look at some like-for-like data from last season.

We've plotted each team's SiB +/-[2] through the first five gameweeks of last season against their eventual season total to see how well these two metrics correlate. The stronger this correlation, the higher our confidence that early season results tell us a lot about the way the rest of the season might play out:

It isn't perfect, of course, but it's a stronger correlation than I expected and really suggests that team data through five weeks should hold some weight. Liverpool and City stand out as the biggest exceptions, turning unremarkable starts into memorable attacking seasons, and thus offer a word of caution about overreacting to what we are going to discuss below, but suffice to say, five games is more important than you might think.[3] Here's how the teams line up so far:

What's more surprising? That Man Utd are second to last on the attacking +/- rank or that you aren't particularly surprised to see them there? In one sense, averaging 6.8 SiB per game isn't the end of the world but when you consider the opponents they've faced - including all three promoted sides - it becomes clear just how badly this team have struggled to create chances, and they've been somewhat fortunate to score the goals they have. They'll likely climb these rankings in the coming weeks given the amount of talent on offer but the numbers suggest that the "free scoring" narrative being suggested by some may not really materialise and we might see this team struggle even further to pick up results. Di Maria, Rooney, Mata and even Herrera have all proven to be useful fantasy assets so far, yet despite the notion that the team is 'underachieving', this group are actually overachieving by a distance, with the midfield trio scoring six goals on just seven shots on target. Rooney's numbers support his success a little more, but questions have been raised about his overall game and with constant speculation of him playing a deeper role, or even dropping to the bench, he too looks like a risky investment. There's so much talent here that we'd still expect someone to enjoy a positive fantasy season but selecting anyone from this team is essentially a bet against the numbers and that's not something I'm here to encourage.[4]

Brendan Rodgers has suggested that Liverpool are "broken", and while it isn't entirely clear what he means, his concern seems premature if we purely look at their shot generation. Despite playing what looks on paper to be a tough schedule, they've outshot the league average in every game other than this past week at West Ham, with the only real drop off from last year being the drop in shot conversion. Though losing a player like Suarez, and for the past two weeks Sturridge, makes a drop in conversion a likelihood, we might expect them to start hitting the target a bit more going forward as well as potentially benefiting from some improvement in the G/SoT department.[5] The aforementioned BBC article helpfully points out that Balotelli hasn't scored a Premier League goal since November 2012, since when he's made exactly four league starts. Fabregas hasn't scored since 2010! Panic! Sell! Overreact! No, Balotelli isn't Suarez but he's been fine so far in the shot department and you don't need to worry about off field issues as you're not paying his wages. With a very nice run of five fixtures to come, this is a team to target as other managers abandon ship.

Southampton made perhaps the most waves in the offseason; rarely in a good way. We all know the talent this team lost during the summer but while players were brought in to fill the gaps (particularly on the attacking side of the pitch), they weren't heralded stars and thus few gave the Saints much chance to succeed again this time around. Three wins and a draw from the first five is one thing but it's the fact that the underlying stats are equally impressive which really stands out. A 20% attacking SiB +/- and a -37% on the defensive side[6] both put Southampton among the top three, right up with the elite teams we expect to see up there. Pelle has posted remarkable shot totals, leading the league in both total shots and SiB, and he's ably supported by fellow newbie Tadic and breakout candidate Ward Prowse[7] who have each shown potential to contribute very healthy assist and goal totals. The defense hasn't missed a beat either and with very solid games on deck, this is another team to target as other managers are hesitant to believe in Koeman's side.

Tottenham have not made an impressive start to the season all around but it's their defense which is really interesting. I'm not sure many had extremely high hopes for this unit but they were a strong side last season and Eric Dier's pair of early goals caused thousands of managers to flock to him, pushing his ownership up to 27%. The goals are a great bonus but clean sheets should still be driving your targets for defenders in the 5.5m-6.0m range and Tottenham just don't look like delivering in that area in their current incarnation. They're giving up chances all over the field and conceding a tonne of chances inside the box, with worryingly bad performances against West Ham[8] and Liverpool. Things were better in the last two games but with trips to Arsenal and City in the next three gameweeks it's time to cash in on that Dier profit.

In the next couple of weeks I plan to do a quick take on all 20 teams to look at their lineup, ownership and potential differentiators, all while leaning on the new team dashboards. If there's any feedback on what other metrics you'd like to see on there, please let me know @plfantasy. Thanks for sticking with the blog during this period of radio silence and I hope I can repay everyone's faith with some improved analysis and graphics in the coming weeks.

1. A two shot variance for a team that averages 10 per game will obviously cause less noise than that same two shot variance for a player who might only average 2.5 per appearance

2. +/- metrics show the amount of shots above/below the average number the league has managed against that team. So if Arsenal have conceded 6 SiB a game and Chelsea come into town and manage 8, they would register an aSiB +/- of 33% for that game

3. It's worth noting that SiB +/- is opponent adjusted so strength of schedule will be less of an issue here than if we simply looked at goals scored or even shots registered

4. That's not to say I'd suggest not owning Di Maria, Falcao or van Persie; just be mindful that you're buying into a team who simply cannot be assumed to reach their historic heights just because of the crest on the front of their shirts

5. How you feel about the likelihood of an improvement in G/SoT depends on your view on how much of that lies with the skill of the attacking player and how much is out of their hands, once they've hit the target

6. Remember that a minus number is a good thing for defenses, showing that they conceded 37% less than the league average

7. Arsenal "Ward Prowse" 2015/16 shirts will be on sale shortly

8. One of the more unlikely clean sheets in recent seasons with 18 shots (12 SiB) conceded without the Hammers breaching the Spurs goal
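
To make the arithmetic in notes 2 and 6 concrete, here is a minimal Python sketch of the +/- calculation. The function name and arguments are my own labels for illustration (they don't come from any real data feed), and it assumes the baseline is simply the average SiB the opponent has conceded per game.

```python
def sib_plus_minus(shots_in_box, opponent_avg_conceded):
    """Relative difference between a team's SiB in one game and the
    average SiB that opponent normally concedes (hypothetical helper)."""
    if opponent_avg_conceded <= 0:
        raise ValueError("opponent average must be positive")
    return (shots_in_box - opponent_avg_conceded) / opponent_avg_conceded


# Note 2's example: Arsenal concede 6 SiB a game and Chelsea manage 8.
chelsea_attack = sib_plus_minus(8, 6)
print(f"{chelsea_attack:+.0%}")  # +33%
```

The same function covers the defensive side: pass in the SiB a team conceded and the average its opponent usually manages, and a negative result means the defense allowed less than expected.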

Wednesday, August 27, 2014

When to believe your eyes (or at least the data table)

When trying to obtain data for use in any kind of forecast you are faced with any number of questions with varying degrees of complication. Is the data from a reliable source? Do we have enough of it? How should it be interpreted? When is it stable enough to be relied upon?

That latter question is where this post will focus. Generally, more data is better than less data. Nate Silver fans’ ears will prick up at that simplistic statement as his excellent book The Signal and the Noise is full of instances where too much data can cloud our judgement, but for our purposes let’s say that when trying to judge the quality of a team you’d prefer to have data for 10 games rather than 5 (data from every game in Liverpool history would start to be too ‘noisy’ as Bill Shankly or Ian Rush have little bearing on the current crop of players).

After two games last season, Everton had amassed 42 shots (26 SiB) giving them a crazy 21(13) average. While the team played well the rest of the way, their season averages of 14 shots and 8 SiB per game were considerably below that initial surge, which could have led to a couple of panic buys as managers sought to get 'coverage' of 'must own' teams. Looking at this season to date, what should we make of West Ham's 35 shots (22 SiB) or Swansea's 15(7) efforts?

More learned statisticians will likely be able to analyse this question with more certainty and skill, but for our purposes, we are just looking for a quick guideline as to when we can believe what the data is showing us.

For simplicity, I have plotted each team's average shot totals (both in total and those only in the box) for the season against the rolling average on a gameweek-by-gameweek basis. These lines will obviously converge as the season progresses but the speed at which this happens is less obvious. The data is plotted below with some quick analysis following the chart:

By GW6, of the forty team/location pairs (20 teams each at home/away), 34 see their rolling average within just two shots of their final season average. Thus if at that point in the season a team had an average shot total of 10, we'd expect with some certainty that they would finish the season with between 8 and 12. The only notable departures were Sunderland at home, who fell from three strong performances (strangely including Arsenal and Liverpool) and a 21 shot average to just 14 on the season, and Liverpool at home, who improved throughout the year, taking their 15 shot average through GW6 to 21 by the time they fell just short of a title bid.
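
For anyone who wants to repeat this check on their own numbers, the comparison can be sketched in a few lines of Python. This is only an illustration: the function names are mine, the shot counts are invented rather than taken from any real team, and the two shot tolerance is the same rough guideline used above.

```python
from itertools import accumulate


def running_averages(shots):
    """Average shots per game after each gameweek (cumulative mean)."""
    return [total / n for n, total in enumerate(accumulate(shots), start=1)]


def within_tolerance(shots, gameweek, tolerance=2.0):
    """True if the average through `gameweek` sits within `tolerance`
    shots of the final season average."""
    averages = running_averages(shots)
    return abs(averages[gameweek - 1] - averages[-1]) <= tolerance


# Invented shot counts for one hypothetical team/location pair.
example = [21, 20, 22, 12, 13, 11, 14, 12, 13, 12]

print(within_tolerance(example, gameweek=3))  # False: a hot start inflates the average
print(within_tolerance(example, gameweek=6))  # True: the average has settled by now
```

Running `within_tolerance` for every team/location pair at a given gameweek and counting the `True` results would give a figure comparable to the 34 out of 40 quoted above.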

It's dangerous to draw too many conclusions through six weeks, especially when further splitting the data into home/away games, but ultimately we can't wait until we're absolutely sure (if that day ever even arrives) as decisions on transfers need to be made sooner rather than later. Still, six weeks feels like a good benchmark to start taking things a bit more seriously and putting some weight behind any big revisions to impressions you had coming into the year. From memory of Silver's aforementioned book, I think this is something akin to Bayesian inference, where our initial hypothesis should be impacted by new data but to varying degrees based on how strong our initial opinion was. Thus, if you loved Alexis Sanchez coming into the season, his somewhat disappointing three shots in two games should move the needle less than David Nugent's zero SiB, as you were probably less sure of the Leicester man's prospects initially (though even there, two weeks is probably too early to panic unless you've seen any real issues with his or Leicester's gameplan).
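
One rough way to picture that idea is a weighted average between what you believed in preseason and what you have seen so far. The Python sketch below is a simple shrinkage calculation and not a full Bayesian model. Every number in it is an assumption made up for illustration: the prior means, the "prior weight" (how many games' worth of evidence your preseason opinion is treated as) and the observed averages.

```python
def updated_estimate(prior_mean, prior_weight, observed_mean, games_played):
    """Blend a preseason expectation with observed data.

    prior_weight is expressed in 'games of evidence': the stronger the
    initial opinion, the larger the weight and the less new data moves it.
    """
    total_weight = prior_weight + games_played
    return (prior_mean * prior_weight + observed_mean * games_played) / total_weight


# Hypothetical inputs: a strong prior for one player, a weak prior for another.
strong_prior_player = updated_estimate(prior_mean=3.5, prior_weight=10,
                                       observed_mean=1.5, games_played=2)
weak_prior_player = updated_estimate(prior_mean=2.0, prior_weight=3,
                                     observed_mean=0.0, games_played=2)

print(round(strong_prior_player, 2))  # 3.17
print(round(weak_prior_player, 2))    # 1.2
```

With these made-up inputs, two poor games trim the strongly held estimate by about 10% but cut the weakly held one by 40%, which is the asymmetry described above.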

After being away for a few months I'm sure everyone is thrilled to read a piece which basically tells you what you already knew, but hey, I had limited data to play with while on a recent flight and this is what I managed to cobble together. This also ties in well with my plan to launch the new graphics and forecast tables right around the GW6 mark. Next up is some actual analysis of the new season. 

Tuesday, August 19, 2014

The more things change . . .

Personally, I've just wrapped up one of the best and almost certainly most important years of my life, having got married, travelled to four continents, finally got a new job and bought a house. For this blog though, the results have been less promising. I considered charting the quality of content here with my life developments or perhaps dousing the fire of my own work but given that I'm still travelling I'll stick to simple words for now.

Long story short, I've had priorities which have trumped this blog which meant that (a) the weekly content has suffered (and stopped at the end of the last season) and (b) I totally ignored this year's pre season activities. My policy has always been to only post things which are worth reading so I didn't ever want to mail anything in with out of date data or banal narratives. I wasn't sure I could make a quality preseason guide so I didn't and I wasn't sure if I'd even be back for this year. Now it's started though, I got that familiar buzz on opening weekend - even if my team was assembled the night before - and so I've come to the conclusion I'm not yet ready to walk away.

There will be a couple of changes though. First, I'm hoping to move to a more 'graphic' based format, hopefully hosted on a new site that allows for a bit more flexibility. Second, I won't aim to put out weekly lineup lessons or 'fanning the flames' pieces which take up masses of time and in all honesty become repetitive for you to read and me to write (no, you shouldn't buy the 19 year old right back who played once but scored with the only shot of his life). I will however continue to post written pieces where a particular player needs attention or where a new concept/trend arises.

There are a lot of good sites around which cover player fitness, team news and what I'll call 'standard' reporting and while I've never tried to offer great depth in those areas, I'm abandoning that area entirely now. I know less about the weekly ups and downs of football than most of you probably do as I'm simply not plugged into it 24/7 thanks to living in Canada. I no longer default to Sky Sports News as my background noise and I don't discuss Rooney's hamstring in the elevator at work anymore. The problem with this approach is that when data tables show Stevan Jovetic as the best forecasted player for a given week despite knowing that there's a 99% chance he won't play, many people get confused/annoyed and complain. You can't please everyone though and there likely won't be comments on the new site anyway (I'm always on Twitter though for any fairer comments or queries).

So the plan for the next couple of weeks is to get the new graphics completed and launch the new site. That should nicely coincide with the time when we have some somewhat useful data (~GW5). In the meantime, I'll start getting back into the swing of things by highlighting some promising new players and offering caution to those whose early success looks unsustainable (basically a prolonged fanning the flames piece).

I've just realised that given my absence there could be no one reading this but if you are, thanks for sticking with me and I hope I can reward that loyalty with a couple of useful tips in the coming season.