# TSA Throughput numbers - Poly and R2

Updated: Jun 15

**The TSA charts are updated **__here__**.**

Several people have asked how we come up with these charts and what the R2 number in the chart with the polynomial regression means.

This is a bit lengthy, but here we go. I should have added it earlier, but I thought it was obvious or easy to google:

As far as the charts, there's no special expertise or secret sauce.

I simply use available Microsoft Excel spreadsheet tools for forecasting, with mixed results.

I occasionally consult an expert, someone with a degree in aerospace engineering (my son), about the data and forecasts to make sure they are not all bogus. That's it. If it's all a joke, he is in on it.

**The poly is simply an equation that describes current data.**

It does so better than a linear regression. The chart below "__since traffic bottom__" has a linear regression line (silver dashed line) - most are familiar with that term and its meaning from financial analysis and charts in popular media.

With some data, a linear regression line is not the best description of the data and trendline.

So, polynomial regression equations are available to describe the data and trendline in a more meaningful way. You can even move up to higher-order polynomials (cubic equations, etc.).

The R2 (R^2) describes how well that __polynomial regression__ equation describes said data - roughly, the share of the variation in the data that the equation accounts for. It is currently about 91.6%. When I started looking at the data, it was 88%.

How well the poly curve describes the body of data (how well it "fits") is also called "goodness of fit", and it is expressed in the form of the R2 (R squared, or R^2).
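For anyone who would rather poke at this outside of Excel, here is a minimal sketch of the same idea in Python with NumPy. It is not how the charts are made (those come from Excel's trendline tools). The passenger numbers are made up, the order-2 polynomial is an assumption, and `r_squared` and `fit_and_score` are just illustrative helper names.

```python
import numpy as np

def r_squared(y, y_fit):
    """R^2 = 1 - SS_res / SS_tot: share of the variation in y the fit accounts for."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    ss_res = np.sum((y - y_fit) ** 2)        # what the curve misses
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation around the average
    return 1.0 - ss_res / ss_tot

def fit_and_score(days, passengers, order):
    """Least-squares polynomial fit of the given order, plus its R^2."""
    coeffs = np.polyfit(days, passengers, order)
    return coeffs, r_squared(passengers, np.polyval(coeffs, days))

# Made-up numbers, NOT real TSA data: a curved recovery with a weekly wiggle
days = np.arange(56)
passengers = (90_000 + 2_000 * days + 120 * days**2
              + 15_000 * np.sin(2 * np.pi * days / 7))

_, r2_linear = fit_and_score(days, passengers, 1)   # straight line
_, r2_poly = fit_and_score(days, passengers, 2)     # order 2 is an assumption

print(f"linear R^2: {r2_linear:.3f}")
print(f"poly   R^2: {r2_poly:.3f}")
```

A straight line is just the order-1 case, so on the same data the order-2 fit can never score lower. The interesting part is how much higher it scores.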

The poly curve shooting into future days on the calendar is a trendline that simply means something like this:

**IF** the new data rolling in follows the same pattern as the previous data, then the new data should remain close to the poly curve. That's it - not more, not less.

*The polynomial equation is unaware of the million variables out there that can affect the actual traffic numbers*. The polynomial equation simply describes existing data, and the poly curve can be plotted into the future using the existing equation which describes the body of existing data. Obviously, the further into the future you plot, the more speculative the resulting projected traffic.
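To make "plotted into the future" concrete, here is a small Python/NumPy sketch. Again, the numbers are invented and the order-2 fit is an assumption. The point is only that the projection is the same equation evaluated for day numbers that have no data yet.

```python
import numpy as np

# Made-up daily passenger counts, NOT real TSA data: 6 weeks of curved recovery
days = np.arange(42)
passengers = (100_000 + 3_000 * days + 100 * days**2
              + 10_000 * np.sin(2 * np.pi * days / 7))

# Fit only what has been seen so far (order 2 is an assumption)
a, b, c = np.polyfit(days, passengers, 2)

# The projection is the same equation, evaluated for days without data.
# It knows nothing about anything that is not already in the numbers.
future_days = np.arange(42, 42 + 28)          # the next 4 weeks
projection = a * future_days**2 + b * future_days + c

for day, value in zip(future_days[::7], projection[::7]):
    print(f"day {day}: projected {value:,.0f} passengers")
```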

The fact that the poly R2 went from 88% to 91.6% tells me the following:

Since the poly R2 has increased, the new data is indeed following the same pattern as the previous data, and the (slightly) new equation is about 3.6 percentage points better at describing the new body of existing data.

As new data arrives that largely matches previous data patterns, I would *expect the poly R2 to remain about the same or increase slightly* - more matching data is easier to describe in an equation.

Why?

If the new data were significantly off the previous trendline (or poly, short for polynomial curve), then any possible equation describing the new body of existing data would have a harder time describing it properly, and the R2 would decrease - that's true for both higher and lower traffic than normal.

Example: Let's say we suddenly have several days of zero traffic or several days of 10 million passengers - or both, one followed by the other. The new data would be so dramatically different from the previous pattern of existing data that any equation attempting to describe the newly emerged pattern (all the available traffic data: previous, plus the new zero or 10 million days - the new body of existing data) would only be a very, very rough approximation of the trendline for the data - worse than the previous equations. So, the R2 would decrease in both cases, whether the new data comes in high or low.
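The thought experiment above is easy to try out. This is a sketch in Python/NumPy with invented numbers (not TSA data), an assumed order-2 polynomial, and a hypothetical helper called `poly_r2`. It refits the curve after tacking on a week of zero traffic, and separately a week of 10 million passengers a day.

```python
import numpy as np

def poly_r2(y, order=2):
    """Fit a polynomial to y against day number and return the R^2 of that fit."""
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    y_fit = np.polyval(np.polyfit(x, y, order), x)
    return 1.0 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)

# Made-up numbers, NOT real TSA data: 8 weeks of steady, curved recovery
days = np.arange(56)
baseline = (90_000 + 2_000 * days + 120 * days**2
            + 15_000 * np.sin(2 * np.pi * days / 7))

# Same history, followed by one wildly different week
week_of_zero = np.concatenate([baseline, np.zeros(7)])
week_of_10m = np.concatenate([baseline, np.full(7, 10_000_000.0)])

r2_base = poly_r2(baseline)
r2_zero = poly_r2(week_of_zero)
r2_10m = poly_r2(week_of_10m)

print(f"steady data:       R^2 = {r2_base:.3f}")
print(f"+ week of zero:    R^2 = {r2_zero:.3f}")
print(f"+ week of 10 mil.: R^2 = {r2_10m:.3f}")
```

In both cases a smooth curve cannot follow the sudden jump, so the refit equation describes the whole body of data worse than before.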

Each week that passes, you get another piece of the puzzle (new data) and the larger picture slowly emerges. Also, each week, with more data, one can evaluate the new data vs the old polynomial equations vs the new poly - you can tell where we are relative to the old curve or if we embarked on a new one.

Last week's poly flattened a bit, which means the outlook for the last quarter of '20 no longer looked as optimistic as it did the week before. However, we were looking at data in the *middle of a cycle* - so that poly is not to be trusted as much as a fully periodical one.

**This week, looking at a full cycle, the poly shows that the new data arriving matches the previous weeks' prediction - we are still on track to reach traffic numbers of 80% or more of 2019 levels in the last quarter of 2020. Take a look at the 120-day poly **__here__**.**

What's a cycle (or periodical)?

The traffic bottom on my extinction island interwebs shows 4/14/2020, which was a Tuesday. As you can see, traffic follows a predictable pattern each week.

So, to really compare each new week to the previous one (apples to apples), ideally you should do your analysis with data in 7 day blocks, starting with 4/14/2020. Not really required for the expert rocket scientists, but for __Holiday Inn Express types__ like myself, it's a lot easier to understand and apply.
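If you keep the data somewhere other than a spreadsheet, the 7-day bookkeeping can be sketched with Python's standard library. `CYCLE_START` and `full_weeks` are names made up for this illustration. The only fact taken from above is the 4/14/2020 start date.

```python
from datetime import date, timedelta

CYCLE_START = date(2020, 4, 14)  # the traffic bottom, a Tuesday

def full_weeks(latest, start=CYCLE_START):
    """Count complete 7-day blocks from start through latest (both inclusive).

    Returns (number_of_blocks, last_day_of_the_last_complete_block);
    the second item is None if not even one block is complete.
    """
    days_available = (latest - start).days + 1
    weeks = days_available // 7
    if weeks <= 0:
        return 0, None
    return weeks, start + timedelta(days=7 * weeks - 1)

weeks, last_day = full_weeks(date(2020, 6, 14))
print(weeks, last_day)  # 8 2020-06-08
```

Each block runs Tuesday through Monday, so data through Sunday 6/14 only completes the block ending Monday 6/8. The remaining six days are the "middle of a cycle" and are best left out of an apples-to-apples comparison.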

That's it in a nutshell.

Oh, if you're still reading: someone asked something like "what happens when the poly approaches previous (normal) traffic numbers? Should it not flatten?"

Of course it should. Just look at the initial __drop off the cliff in March__ - and mirror it.

Normalization should look about like a mirror of that drop off, but that's just another guess too.

How we get there is way more interesting.

Is it going to be a **V**, **U** or painful **L** shape? Any bets?