By the Numbers: Stats Analysts Learn Big Data Skills

Players, coaches, spectators and refs aren't the only parties involved in athletic contests these days. A growing number of data analysts are joining today's teams, compiling information on a wide variety of individual and team performance statistics to help maximize chances of winning.

Davidson College's department of mathematics and computer science has become a center for the study of sports analytics. The college is providing a growing number of students with opportunities not only for classroom instruction in the field, but also for internships to practice it in real games.

But there's more to it for students than quantifying the action of a physical endeavor they've always enjoyed. According to Associate Professor Tim Chartier, who initiated the development of sports analytics study at Davidson, sports analytics teaches students valuable intellectual skills that have direct application in the lucrative world of commercial data analytics in general.

Davidson alumni Brian Sachtjen '12 and Rob Lorenzen '13 visited campus recently to introduce students in Chartier's finite math class to the data analytics work they conduct at Red Ventures, a marketing and sales agency for many of the largest consumer products companies in the country. The company uses data tracking systems to help its clients identify and market to high-value customers.

The two alumni presented class members with an actual Excel spreadsheet from their work at Red Ventures. It included hundreds of rows of client contacts, listing about 25 factors such as referrer domain, internet provider, operating system used, zip code, browser used and customer age.

After demonstrating several Excel analysis techniques, Sachtjen and Lorenzen left the class with a homework assignment not unlike the analysis they conduct in their jobs:

"Using the data set provided, please identify a customer attribute (ex: time of site, state, # page views, time to call) that may have an impact on Red Ventures' marketing efforts."

The assignment continued with the challenge, "Create a brief executive summary describing your hypothesis of the customer attributes you chose to evaluate, the results of your regression analysis, and whether that attribute correlates with either revenue or call conversion. Next, explain which attribute you chose for your ranking, the results of the ranking, and how you would alter your marketing strategy to target the top ranked customer types."
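As a rough illustration of the kind of regression check the assignment describes, the Python sketch below fits a straight line between one customer attribute and revenue and reports how strongly the two correlate. The attribute chosen (page views), the numbers and the function name are hypothetical; the Red Ventures data set and the alumni's Excel techniques are not reproduced here.

```python
from math import sqrt


def simple_regression(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical sample: page views per visitor and revenue per visitor.
page_views = [1, 2, 3, 4, 5, 6]
revenue = [10.0, 14.5, 21.0, 24.0, 31.5, 35.0]

slope, intercept, r = simple_regression(page_views, revenue)
print(f"revenue is roughly {intercept:.2f} + {slope:.2f} * page_views (r = {r:.3f})")
```

A value of r near 1 or -1 would suggest the attribute moves with revenue, while a value near 0 would suggest it does not. Correlation alone does not show that the attribute causes the change.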

Chartier pointed out, "That's a much more sophisticated problem than analyzing data to figure out how turnovers are affected by various player combinations on the basketball floor, but it's really only a matter of degree. Data analytics in general is the process of examining raw data to discover useful information, suggest conclusions, and support decision-making. It's a valuable skill to have at any level."

Bracketology

Chartier launched Davidson's sports analytics program as a homework assignment for a class in 2010. He knew that millions of fans engage in an annual ritual of projecting the results of the NCAA's 64-team "March Madness" basketball tournament. While most people pick teams based on their records and intuition, Chartier asked his class to develop mathematical formulas that could generate more accurate results. The students' brackets proved to be so accurate that they generated media attention from top national outlets: the New York Times, CBS, USA Today and ESPNU.

Chartier has assigned students to develop formulas every year since, with results from past years giving them new ideas for ways to tweak their calculations and hopefully be even more accurate this time around. They invite members of the general public to play along and "Let the power of math help your bracketology!"

The March Madness basketball tournament is ideal for data collection and analysis, Chartier noted. It is structured in a bracket form where you win or lose, advance or go home. The season is about 30 games long, so there's ample time to gather data on team performance.

The high profile and success of Davidson bracketology and Chartier's widespread ambassadorial efforts on behalf of math applied to everyday living have opened up myriad opportunities for student involvement with sports analytics over the past few years.

For the second year in a row, the Charlotte Hornets NBA basketball team has hired four students (Brandon Liang '17, Vincent Zhu '16, Phillip Bader '17 and Rich Korzelius '15) to operate its system of high-tech SportVU cameras during games. Mounted on the overhead catwalks, each of six cameras shoots 25 frames per second. Accompanying software provides X, Y and Z coordinates of the ball and X, Y coordinates of every person on the court for each frame, accumulating gigabytes of data that coaches and management can use to analyze the game and player performance. While the system is largely automated, situations of uncertainty generate queries to the Davidson student system operators that require their human interpretation.

The range of statistics SportVU can generate is far broader than what's in a newspaper box score, covering aspects of the game such as touches in the paint, passes per possession, three-pointers off of kick-out passes, total distance run in a game, speed a player runs, leaping ability, shot trajectory and more. Brandon Liang said, "People tend to focus on the numbers of points, assists and rebounds. But there are a lot of things happening on the court that can only be shown in gathering and studying more data."
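To give a sense of how a statistic like total distance run could be derived from per-frame coordinates, here is a minimal Python sketch. It assumes each player's position arrives as one (x, y) pair per frame at 25 frames per second and that coordinates are in feet. The function names, the units and the sample track are illustrative assumptions, not SportVU's actual data format or software.

```python
from math import hypot

FRAMES_PER_SECOND = 25  # frame rate cited for the cameras


def total_distance(positions):
    """Sum the straight-line distance between consecutive (x, y) frames."""
    return sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    )


def average_speed(positions, fps=FRAMES_PER_SECOND):
    """Average speed in coordinate units per second over the tracked frames."""
    elapsed_seconds = (len(positions) - 1) / fps
    if elapsed_seconds <= 0:
        return 0.0
    return total_distance(positions) / elapsed_seconds


# Hypothetical track: four consecutive frames for one player, in feet.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (6.0, 8.0)]
print(total_distance(track))   # distance covered across the four frames
print(average_speed(track))    # feet per second over that span
```

Real tracking data is noisy, so a production system would likely smooth the coordinates before summing. This sketch skips that step.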

Angus Mitchell, Quantitative Analyst/Systems Developer for the Hornets, supervises the Davidson students, and said he's been extremely pleased with how they have performed. In fact, he said, "This year we are trying to leverage the fact that we've tapped into a pipeline of really smart kids at Davidson by expanding their roles. We've tried to get them to do some actual analysis as opposed to simply working the cameras."

But the Hornets aren't the only hometown team benefiting from sports analytics at Davidson. A group of students calling themselves the "CatStats" team has been regularly meeting with men's basketball coaches to gather and analyze data on the Davidson Wildcats.

The CatStats roster is Jason Feldman '18, Abhi Jain '18, Ross Kruse '17, Grant McClure '17, and Richard Yan '15, as well as alumnus Sachtjen. The group uses commercially available databases to study tendencies of opposing teams, in essence offering an additional scouting report on the opposition for the Davidson coaches. Each CatStats member analyzes different data sets and brings that information to the group's meeting with coaches.

The work has been particularly helpful this year with Davidson's transition to the A-10 conference, which involves playing many opponents the team has never faced. McClure, a math major and serious sports fan, said several game factors are particularly revealing when analyzing an opponent: effective field goal percentage (which gives extra weight to three-point shots), turnover percentage, offensive rebounds, how often teams get to the line and strength of schedule. The group is also conducting a longer range study of lineups to discover which combination of players on the floor produces the best results for the team.
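For readers unfamiliar with effective field goal percentage, the sketch below uses the commonly cited definition, in which each made three-pointer counts as one and a half made shots: (field goals made + 0.5 x three-pointers made) / field goals attempted. The function name and the sample numbers are illustrative, and the article does not say which exact formula the CatStats group uses.

```python
def effective_fg_pct(fg_made, three_made, fg_attempted):
    """Effective field goal percentage under the commonly used definition.

    fg_made counts all made field goals, including three-pointers;
    three_made is the subset of those that were three-pointers.
    """
    if fg_attempted == 0:
        return 0.0  # no attempts, so report zero instead of dividing by zero
    return (fg_made + 0.5 * three_made) / fg_attempted


# Hypothetical team line: 20 made shots on 50 attempts, 8 of them threes.
plain_pct = 20 / 50
efg = effective_fg_pct(20, 8, 50)
print(f"FG% = {plain_pct:.3f}, eFG% = {efg:.3f}")
```

In this made-up example, the eight three-pointers lift the team from 40 percent shooting to an effective 48 percent, which is why the statistic can reveal more about an opponent than raw shooting percentage.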

Sachtjen had not been involved with sports analytics at Davidson, but joined the CatStats team this school year. He quickly proved his worth by devising a method to use Excel software to speed the process of reviewing play-by-play game logs. A process of compiling data that previously took three hours was automated to require no more than 15 minutes.

Broad Applications

While basketball has been the focus of Davidson's involvement in sports analytics, the skill can be applied to almost every sport. Students Tommy Rhodes '17 and Elena Vasiou Sivvopoulou '15 had internships last summer with Michael Waltrip Racing.

The opportunities for involvement in other sports have recently grown through a new program in the Charlotte-Mecklenburg Schools (CMS) system. Beginning next school year, 21 students from each of the 19 CMS high schools will be enrolled in a program administered by National Amateur Sports called StatSquad STEM. The program offers the high schoolers an internship that combines STEM instruction and sports data analysis.

Four Davidson students helped beta test the program last semester: Ross Kruse '17 worked with football, Alex Feliciano '15 worked with men's soccer, Eric Hart '15 worked with women's soccer and Sarah Klett '15 worked with volleyball. More Davidson students will be involved as mentors as the program gears up next fall.

Davidson's activities in the field of sports analytics reflect the rapid growth of the field in the world outside of Chambers Building. There is now a Journal of Sports Analytics, and a large annual sports analytics conference at MIT. Chartier has helped organize a new annual conference in the Carolinas. The conferences cover a wide range of sports, from spelling bees to rugby, volleyball, baseball and soccer. The meetings feature research papers on such topics as "The Three Dimensions of Rebounding," "Predicting Points and Valuing Decisions in Real Time with NBA Optical Tracking Data" and "The Three Biases in Umpire Decision Making."

Chartier is under no illusion that analytics will transform the way a game is played, or bring a team instant success. He said, "In the heat of the moment a player is not thinking of what the data says when figuring out what to do next. But it can highly affect preparation for the game. It's sort of like helping the team from the bench, giving them a head start."

He concluded, "We're not training our students to be sports analysts, though for many of them that would be a dream job. But the training in data analytics helps them get jobs at firms like Red Ventures, and that could certainly end up being a tremendous career."