We live in a data-driven world where every organization needs data to make informed decisions. Whether it is used to improve our quality of life or to tackle mankind’s greatest challenges, data is at the forefront of our everyday lives and a crucial part of human advancement. This year, the Department of Statistics at U of T hosted the annual DataFest competition for the seventh time. The event is an international data analysis and statistics competition sponsored by the American Statistical Association (ASA). Each year, the ASA’s partner organizations give academic institutions access to data so that they can run local analysis competitions based on a real-world problem.
At the event, 25 undergraduate student teams competed to analyze data on the play2PREVENT game. The video game focuses on gaining insight into adolescent behavior and identifying at-risk youth. The players were asked to play a game called "Elm City Stories" for six weeks. The game aims to help players make better life choices like resisting drugs or engaging in HIV prevention. DataFest student teams were given access to play the game before the official competition and they had three days to analyze, synthesize and present their findings.
Students who take part in the competition gain many advantages, including the chance to win up to $800 in cash prizes and to meet researchers and recruiters. Participants also gain valuable industry experience. This year’s winning group included students Jessica Wang, Meiyi Wu, Yuanjun Xia, and Nanyi Wang. The students agreed that the most intriguing element of the competition is the opportunity to engage with real-life problems, which draws companies’ attention because the work can lead to solutions.
The journey to winning was not easy. Statistical science student Wu described the team’s process during the competition as nerve-racking. She said, “My team began by playing the specified game and reading every instruction carefully. Then, we went through the entire data set and analyzed each variable to clean the data set and preserve the useful information. As a next step, we built linear mixed models and found a better model to get our conclusions.”
Through their analysis, the team gained valuable insights into players’ behaviour and performance. Jessica Wang explained, “Players who chose a female avatar performed better in terms of being able to resist drugs after playing the game for six weeks than players with a male avatar.” She added, “Older players performed worse in comparison to younger players.”
This year’s event took place on Zoom, which meant students faced the additional challenge of having to present their work virtually. Xia and Nanyi Wang both agree that teamwork and creativity during the presentation played a critical role in winning the competition. Xia said, “Each of us offered our opinions on this project and we helped each other to complete the whole presentation. Visualization is also quite important to let the judges understand your work easily. We did great on our presentation, and this was one of the reasons why we won.” Other teammates explained that including specific, detailed data from a variety of perspectives, such as age, gender, and ethnicity, in their findings allowed the team to stand out during the competition.
The competition drew many industry professionals: team submissions were sent to a panel of industry judges consisting of data scientists and statisticians from Amazon, BMO, TD, JLL Technologies, Roche Pharma, Rogers Communications, Sleepout, and Wealthsimple. Paul Edwards, Director of Credit at Wealthsimple and a competition judge, said he was drawn to the project because many industries rely on data to make impactful decisions, especially given advances in machine learning, cloud computing, and computational power. He was particularly excited to see what new perspectives the students could bring to the table by using data to solve real-life problems. The main skills the judges were looking for in the project were creativity, technical expertise, and communication. He said, “There’s a fundamental that goes into communicating technical information. They had to be clear on their writeups and visualization. I was also looking for less obvious problems they were trying to tackle. There were a few teams that did something quite different.”
DataFest not only allows students to apply their statistics and data analysis skills, but also showcases their work to a variety of potential employers. While the four-day competition was challenging, Jessica Wang explained that there are advantages to participating in the competition. She said, “This competition and data analysis project enriches your resume whether you win a prize or not. It demonstrates to the recruiters that you are proactive, and that you are so interested in data that you are willing to invest your personal time in a competition like this.”