THOUGHTS ON PROFESSIONAL
WRESTLING ANALYTICS
By Chris Harrington
(indeedwrestling@gmail.com)
I didn’t spend my childhood tracking professional sports. In
fact, I can recall only two contests that ever really intrigued me. The first was the annual March Madness NCAA
Basketball Tournament. I was fascinated
with the odds-making around how to seed the teams, and with how often a Number 14 seed
beat a Number 3 seed (answer: 17 times). The
second was the Olympics Medal Count standings with the international intrigue
and all of the unusual game specialization (Grenada, population 110,000, won a
gold medal in 2012 in the 400m dash!) However, these are both short-lived
events. So, instead of watching sports,
I was always more excited about setting a fictitious Tecmo Super Bowl Sack
Record with Howie Long on the NES.
With my love for mathematics, I did hear about
sabermetricians (people who engage in “specialized analysis of baseball through
objective evidence, especially baseball statistics that measure in-game
activity”-Wikipedia). It sounded interesting, but I couldn’t imagine
studying the workings of baseball statistics when I detested actually watching
baseball games. It was when I read Michael Lewis’ Moneyball that I was instantly drawn to the description of
historian/statistician/writer Bill James. In his works, James posed baseball
questions and answered them with statistics and insightful analysis. He
self-published books full of metrics, some of them his own inventions, and
proved his points and theses about baseball. Since he wasn’t a traditional
sports writer, his approach baffled some and left them uninterested. A lot of people didn’t foresee any demand for
that form of exploration. Yet he
prospered among a core group of like-minded folk, and over time his examinations
grew in popularity and influence. Bill James succeeded in inspiring other
people to look for relationships and squeeze meaning from these imaginary numbers.
That idea, aggregating data points and using statistical tools to glean
significant (and insignificant) conclusions, fascinated me.
Luckily, instead of baseball, I did have an athletic endeavor
that intrigued me - Professional Wrestling. Whilst the WWE’s moniker of “Sports
Entertainment” is oft-derided, I loved the nexus of the colorful characters
from theater with the truly impressive displays of activity.
In High School I began tracking results from the syndicated
C-shows such as WCW Worldwide and WWF Jakked. I loved the randomness – seeing
the journeymen battling the jobbers-to-the-stars against the developmental
prospects. I began to scour for additional results so I could put together
brackets populated by the win-loss records for the likes of Super Calo, Fit
Finlay, Crash Holly and Mideon. I wanted
to know who was the king of the jobbers (the tallest midget)?
I was hardly the first person to do this. For instance, one
of the first pieces of pro wrestling analytics that I came across was from the original UseNet
group rec.sport.pro-wrestling (RSPW). Nicolas
Seafort had decided to tackle someone’s question about Alex Wright’s win-loss
record by compiling the data going back several years to the mid-1990s. (You can
still read an archive of his great work over at www.solie.org.)
Most pro wrestling analytics projects have started in a
similar way – someone asks a somewhat open-ended question (“Is this the
youngest roster that WWF has ever had?”). To answer that, you need to
compile data across a large timespan before a meaningful comparison
can be made.
A big turning point for me was an econometrics class that I
took at the University of Rochester. One of my final projects was creating a
predictive model to look for rhyme and reason in how the Pro Wrestling Illustrated
500 ranked various wrestlers. It required importing the historical PWI 500 lists, cross-referencing all of the names and
adding various accomplishments in that year (# of PPV appearances, belts won,
federations worked) and prior year rankings to see which variables
showed a meaningful correlation with the rankings. As with many quandaries, it’s a much
easier question to ask (“What are the PWI 500 rankings based on?”) than answer.
THE CHALLENGES
In my time reviewing professional wrestling records and
attempting to create meaningful (and random) analytics and analysis, here are many of the
major challenges that I’ve encountered:
1. Name & Identity Confusion
People have a lot of gimmicks
and the same gimmicks get re-used. Some territories will cycle through various
people under the same mask. Legends and rumors often replace discernible facts
(e.g. who played Doink the Clown at a given European House Show after Borne was
fired). In older reports, names are
reported phonetically, as they sounded over the loudspeaker, rather than as they
were actually spelled, and particularly pernicious misspellings abound!
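One way to approach this is to treat a ring name as something that maps to different performers on different dates, and to normalize spellings before any lookup. The following is a minimal Python sketch under those assumptions. The `STINTS` table, the "Masked Gimmick" name, the dates, and the `performer_a`/`performer_b` IDs are all hypothetical placeholders, not real records.

```python
import unicodedata
from datetime import date
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


# Hypothetical stints: (ring name as reported, first date, last date, performer ID).
# The same gimmick appears twice because two different people worked under it.
STINTS = [
    ("Masked Gimmick", date(1993, 1, 1), date(1993, 12, 31), "performer_a"),
    ("Masked Gimmick", date(1994, 1, 1), date(1995, 6, 30), "performer_b"),
    # A phonetic misspelling recorded as an explicit alias for the same stint.
    ("Maskd Gimick", date(1994, 1, 1), date(1995, 6, 30), "performer_b"),
]


def resolve(ring_name: str, on: date) -> Optional[str]:
    """Return the performer ID behind a ring name on a given date, if known."""
    key = normalize(ring_name)
    for reported, start, end, performer in STINTS:
        if normalize(reported) == key and start <= on <= end:
            return performer
    return None  # unknown: better to leave a gap than to guess


print(resolve("MASKED  gimmick!", date(1993, 5, 1)))
print(resolve("Maskd Gimick", date(1994, 3, 1)))
```

Returning `None` for an unmatched name or date keeps legends and rumors out of the dataset until someone can source them.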
2. “Proprietary” resources
In any field of research some
materials are easier to access than others. Professional Wrestling History is no
exception. There are terrific Pro Wrestling Historians who have spent enormous
swaths of time carefully transcribing results and indexing information. Some
publish record books or write detailed biographies. Information may be posted
online in large, well-organized chunks (I am fond of www.thehistoryofwwe.com posting an entire
year on a single webpage). Other times, the information is available but broken
into dozens of posts on somewhat obscure webpages or tape-trading sites. Or,
the results could show up as chunks of text behind paywalls in Newsletter
Archives or only in non-electronic published forms. Television ratings may be
quoted, but the raw information is usually available only from the measurement
services directly, for a hefty subscription fee. The work involved in
collecting and organizing the results can be incredible and consequently many
people don’t share their “raw” databases.
3. Incomplete and evolving Record-keeping
As people bounce from project
to project, it can be hard to keep track of what new information has been
revealed since the last time you may have built a database on a certain
subject. Not only are new matches taking place every day, but despite the adage
“once it’s out there, it’s out there forever”, old information can seem to slip away. Some of it is the shift from published newspapers to online news sources. Some of it is message boards disappearing years later, or websites no longer being maintained as federations go out of business and fans lose interest. Historians, often older fans, do a fantastic job resurrecting accounts and memories of people who were around in younger days, but if information isn’t recorded in a real-time fashion, there is a strong chance it will be lost eventually.
4. Private Information
Discrete data such as “number
of buys for a Pay-Per-view” or “downside guarantee for a WCW employee” aren’t
always available. Since WWE is a publicly traded company, many of its metrics
have been available in some form since its IPO in late 1999. However, most pro wrestling
companies were not publicly traded and the quality of the information that the promoters
released publicly varies enormously. Companies that have “gone under” also face
the danger that many of the best sources of their records (records, gate
receipts, booking sheets) were not captured and carefully recorded. Still, to
pay the tax man, many state commissions did have a hand in auditing the
information (attendance, gate, licensing wrestlers) so information can vary
state to state.
5. Exaggerations and
assorted Tall Tales
No one’s memory is perfect.
Likewise, promoters have an incentive to be larger-than-life and appear infallible.
There are a lot of reasons to lie, ranging from outright jealousy and pettiness to
benign indifference and honest confusion. Fans misremember events they
attended. Wrestlers misrecall where they
were and who they wrestled. The past is remembered
with rose-colored glasses. The seedy nature of pro-wrestling and the kayfabe-laden,
pseudo-sport coverage it’s given leave large holes in the record. Published articles
from respectable news sources have been laced with mistakes and even outright carny
lies.
6. Over-reliance on same resources (or unsourced claims)
There are situations where a
single source of “truth” (Wikipedia, for instance) is cited to credit a fact (an
age, a claim about the success or attendance of an event, etc.) that cannot be
firmly established. True primary or
well-sourced secondary resources can be quite difficult to obtain. Events with
results emerge, and it can be difficult to ascertain the origin of the
information.
7. Inaccurate Assumptions
Good data collection practices can
still lead to poorly reasoned conclusions. This especially happens when comparing
datapoints from different countries or from different time periods. I’ve especially noticed the confusion that
stems from Television (Nielsen) share ratings and Pay-per-view “buyrate” (the percentage
of users, among the universe of PPV-capable television systems, who
bought a PPV). Comparing arena attendances from companies across the United
States doesn’t always factor in the population differences and the venues
available to run.
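To make the buyrate and attendance points concrete, here is a small Python sketch. It assumes buyrate is computed exactly as described above (buys divided by the PPV-capable universe, as a percentage) and uses attendance per 100,000 residents as one possible population adjustment. The function names and every number below are hypothetical, chosen only for illustration.

```python
def buyrate(buys: int, ppv_universe: int) -> float:
    """Buys as a percentage of the PPV-capable universe."""
    if ppv_universe <= 0:
        raise ValueError("ppv_universe must be positive")
    return 100.0 * buys / ppv_universe


def buys_from_buyrate(rate_pct: float, ppv_universe: int) -> float:
    """Invert a quoted buyrate back into an estimated number of buys."""
    return rate_pct * ppv_universe / 100.0


def attendance_per_100k(attendance: int, market_population: int) -> float:
    """Attendance scaled to a market of 100,000 people."""
    if market_population <= 0:
        raise ValueError("market_population must be positive")
    return 100_000 * attendance / market_population


# The same quoted buyrate means twice as many buys once the universe doubles.
print(buys_from_buyrate(1.0, 20_000_000))  # hypothetical earlier universe
print(buys_from_buyrate(1.0, 40_000_000))  # hypothetical later universe

# A smaller crowd can be the stronger draw relative to its market.
print(attendance_per_100k(5_000, 250_000))
print(attendance_per_100k(10_000, 2_000_000))
```

Per-capita attendance is only one adjustment; venue capacity and how often a town was run would also matter, and neither is modeled here.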
8. Attempting to capture intangible variables
How “hot” was the crowd? How “good”
was the wrestler? Was the show “successful”? Tallying wins & losses is often an easier
exercise, but finding ways to quantify this sort of extremely atmospheric and experiential
information is hard. Often, only the basic statistics (Who wrestled? How long? Where
and When? Which spot on the card? What
was the attendance? How old or how tall or how much did they weigh?) are
available, but more complexity can be added with thoughtful measurement.
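As a rough illustration of the "basic statistics" layer, the sketch below stores one row per match and tallies wins and losses from it. The `MatchRecord` fields are my own assumed schema, the wrestler and venue names are placeholders, and the tally assumes singles matches with a clear winner and loser (no draws, tag matches, or no-contests).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class MatchRecord:
    event_date: date
    venue: str
    card_position: int                 # 1 = opening match
    winner: str
    loser: str
    minutes: Optional[float] = None    # often missing in old results
    attendance: Optional[int] = None   # often missing or disputed


def win_loss(records: Iterable[MatchRecord]) -> Dict[str, Tuple[int, int]]:
    """Return {wrestler: (wins, losses)} for the given matches."""
    wins: Counter = Counter()
    losses: Counter = Counter()
    for rec in records:
        wins[rec.winner] += 1
        losses[rec.loser] += 1
    return {name: (wins[name], losses[name]) for name in set(wins) | set(losses)}


# Placeholder data, not real results.
sample = [
    MatchRecord(date(1998, 1, 3), "Venue X", 1, "Wrestler A", "Wrestler B"),
    MatchRecord(date(1998, 1, 10), "Venue X", 2, "Wrestler A", "Wrestler C", minutes=4.5),
    MatchRecord(date(1998, 1, 17), "Venue Y", 1, "Wrestler B", "Wrestler C"),
]
for name, (w, l) in sorted(win_loss(sample).items()):
    print(name, w, l)
```

Keeping duration and attendance optional makes the gaps in the record explicit rather than filling them with guesses.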
9. Lack of a formal community
Unlike more established analysis
in professional sports, pro wrestling analytics remains somewhat scattered. This
is partially because of the broadness of the term “pro wrestling” and the
number of different federations, wrestlers, styles, and locations that it
covers. There are language barriers when
you’re comparing Lucha Libre and Puroresu. Fans often are attached to the
promotion they grew up with so older fans are more likely to cover WWWF or
Georgia Championship Wrestling while younger fans are more focused on ECW and
Ring of Honor. Pro Wrestling Historians and Pro Wrestling Analysts often cross
paths, but they don’t always do the same work – transcribing results versus
standardizing the information into a generalized and indexed format.
THE RESOURCES
Yet, we’ve come a long way in the past thirty years in
overcoming these challenges. In the past dozen years, the internet has grown
into an enormous resource with many well-organized databases.
For instance, there are incredible wrestling databases online now. Here is a list of websites that have search engines
that I will often consult for information – especially birth/death,
height/weight, debut or other relevant statistics:
When I started subscribing to Dave Meltzer’s Wrestling Observer Newsletter, I’ll
admit that I was disappointed that a portion of the publication was taken up
with RING RESULTS. These would be text recaps of the house shows and international
tours of various pro-wrestling organizations. I would skip that part and
instead focus in on the insider news and rumors. It didn’t seem important to
know how many people attended the Butte, Montana event or who worked in the
third match from the top.
However, now I cherish that the information has been
recorded and saved. The raw and unbiased data provides an unbelievably
excellent pool of information to be mined. It’s exceptionally easy to miss the
seemingly unimportant points of history which are happening now while you’re
blinded by whatever the “big story” of
the week may be.
These are the websites that I normally visit besides the
ones listed above:
My favorite Pro Wrestling Discussion Boards:
Pro Wrestling Newsletters and Audio:
Pro Wrestling History Sites:
Pro Wrestling Statistics & Other Interesting Data:
WHERE ARE WE?
Right now, pro-wrestling analysis/analytics is usually distributed in
one of four ways:
a) Posted on a message board (some behind a pay wall, some
not). Sometimes this is reposted on other message boards.
b) Incorporated into newsletter analysis. Dave Meltzer has
done a lot of great original work. I’ve read
very good debates on “drawing power” from message boards reappropriated into
point systems and brought into newsletter articles.
c) Published stand-alone on a website for reference.
Sometimes the author will post a link to the analysis on message boards or
link-sharing sites like reddit.com.
d) Formally published as an article in a magazine (Fighting
Spirit Magazine, for instance), analysis in a newspaper article (for instance,
USA Today articles about Pro Wrestling deaths) or stand-alone in a book (The
Pro Wrestling Hall of Fame: The Heels).
What we don’t have now is a clearinghouse that provides a
place to have detailed pro wrestling analytics discussions outside of the general
pro wrestling discussion areas. I am not
aware of a purely pro wrestling analytics (statistics, studies, etc.) journal or
a message board purely devoted to this topic. It’s likely the universe is still
composed of many isolated souls working on their own pet projects.
Hopefully, someday we can all come together and share what we’ve learned and
what we want to learn in a larger setting.
I'm just now digging into your blog after searching for a similar "one stop shop" type of SABR organization or RetroSheet project for wrestling and coming up dry. Now that it's three years since this post, have you seen or heard about any such efforts beyond the sites you linked? I've started helping out with historical results for Alabama at WrestlingData, but despite how large their database is, there is still a lot of detail they do not capture.