Choosing Your Variables: A Home Video Example

When setting up an analysis model, it is necessary to consider all the factors that might affect the outcome.  A sound theoretical foundation for the model is necessary to understand statistical relationships.  I have often seen researchers jump to a conclusion based on a couple of data points and miss that something else is going on.  This early step is easy to skip, yet it is vital to the process.

I saw this example in one of my favorite newsletters, from Statista.  The finding was that streaming dominates home entertainment spending and that spending is increasing dramatically.


Statista estimates the home entertainment market in 2018 at $23 billion, including streaming, DVDs, VOD, and others.  Further, the article suggests an 11.5% growth in total home entertainment spending.  These are impressive numbers and an interesting finding if you take them at face value.

Even if the researcher cannot access all the variables in the model, it is worth acknowledging their existence in the model or modifying the questions to account for the gap.

The problem with the Statista analysis is that the home entertainment options are considered independently of a major home entertainment source: cable/satellite television (pay TV).  Industry figures show that US pay TV revenues reached a peak in 2015 at nearly $102 billion.  In 2018, that number dropped to $87 billion.  Not only does pay television subscriber revenue (absent advertising and other revenue sources) dwarf revenue from the other video sources, but the 14% drop in pay TV revenues also offsets much of the gains in the home entertainment market.
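To make the comparison concrete, here is a minimal sketch that puts the figures quoted above side by side.  It is only an illustration, not Statista's method.  The variable and function names are my own, the pay TV peak is rounded up to $102 billion, and the 11.5% growth is assumed to describe the change into 2018 from the prior period, which the article summary does not spell out.  Keep in mind that the pay TV drop spans 2015 to 2018, so the two changes may not cover the same period.

```python
# Figures quoted in the post, in billions of US dollars.
HOME_ENT_2018 = 23.0     # Statista estimate: streaming, DVDs, VOD, and others
HOME_ENT_GROWTH = 0.115  # quoted growth in total home entertainment spending
PAY_TV_2015 = 102.0      # "nearly $102 billion" peak, rounded (assumption)
PAY_TV_2018 = 87.0       # pay TV subscriber revenue in 2018


def prior_value(current, growth_rate):
    """Back out the earlier value implied by a current value and a growth rate."""
    return current / (1.0 + growth_rate)


# Dollar gain implied by the quoted growth rate (assumes growth is into 2018).
home_ent_gain = HOME_ENT_2018 - prior_value(HOME_ENT_2018, HOME_ENT_GROWTH)

# Dollar and percentage change in pay TV revenue from the 2015 peak to 2018.
pay_tv_change = PAY_TV_2018 - PAY_TV_2015
pay_tv_pct_change = pay_tv_change / PAY_TV_2015

# Size of the market in 2018 once pay TV is put back into the picture.
combined_2018 = HOME_ENT_2018 + PAY_TV_2018
pay_tv_share = PAY_TV_2018 / combined_2018

print(f"Home entertainment gain:  {home_ent_gain:+.1f} billion")
print(f"Pay TV change 2015-2018:  {pay_tv_change:+.1f} billion ({pay_tv_pct_change:.1%})")
print(f"Combined 2018 market:     {combined_2018:.1f} billion")
print(f"Pay TV share of combined: {pay_tv_share:.0%}")
```

Under these assumptions, the implied gain in home entertainment is a little under $2.4 billion, while pay TV alone still makes up roughly four-fifths of the combined 2018 total.  Leaving pay TV out removes most of the market from the model.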

By taking pay TV out of the home entertainment market, the picture drawn is incomplete at best.  The market is not growing so much as it is shifting from one type of delivery to another.  As affordable internet speeds have increased, people have chosen to drop major pay TV outlets for streaming media.  Services like Hulu are transitioning from being ancillary to cable TV to actually replacing it.  The analysis completely misses the major trend of “cord-cutting.”  At best, the report misinterprets the intention of a majority of the streaming content users.

Missing a major variable is more than bad journalism.  It could amount to plain bad research that produces misleading results.  Every researcher must draw the line between a fishing expedition, with unrelated variables tossed into a model, and the incomplete picture caused by a missing variable.  I cannot blame Statista too much.  Data visualization is much more difficult with categories that have vastly different levels.  Still, difficulty is not an excuse when it comes to building a realistic data model.
