HistoryTracker: Minimizing Human Interactions in Baseball Game Annotation


– Good morning everyone, my name is Jorge, and I’m a PhD student at NYU, and today I’m going to present our paper, HistoryTracker: Minimizing
Human Interactions in Baseball Game Annotation. So bear with me, even if
you don’t care about sports. I hope you are going to get something out of this presentation. So, well, this paper is about capture and tracking data for sports. But, why do you need this? Tracking data is revolutionizing
sports and athletics, so, nowadays every team sport in Major League uses sports data, tracking data to capture
knowledge about the sport, and based on that create new strategies, develop skills, and prevent
injuries for players. Now, how do we compute tracking data? Traditionally, people use either automated methods or manual methods. In the automated methods,
companies put lots of cameras and radars
around the playing field, and those radars are going to capture the position of player
and ball through time at a very high frame rate. So on the positive side,
these automated methods create a high volume of data. However, there is a catch. The data’s not perfect. If you look at the chart on the right, you’re going to see that half of the pop up balls, and
20% of the ground balls in baseball games are missed
by the MLB tracking system. So, there has to be
another way to do this, and there is. So, people have been using Manual Tracking as well, to capture this data. And basically Manual Tracking systems use specialized software, plus
trained human annotators, to create data that’s reliable
and available to everyone. So as an example, Opta
is one of the biggest companies that does manual
annotation for sports, and they have a huge coverage. So they cover 14 of the
major team sports nowadays. Just to see the size
of how much it’s used, they have annotated the
FIFA World Cup games. So the question is, Manual
Tracking is widely used, but can we make it better? And the answer is, yes. So we propose HistoryTracker, which is a system that takes advantage of historical tracking data to improve baseball manual annotation. And how much better are we talking about? Our system is 26% faster,
and 30% more accurate than the baseline manual annotation. What’s the idea? We warm-start the annotator. So, we give an initial approximation of the play to the worker, using historical tracking data. And traditionally this
is a very simple idea. Basically you have a lot of plays that you have annotated before, or you have tracking data for. And then you can use these plays that you previously captured to warm-start the annotations so that the user doesn’t have to
annotate everything from scratch. This is how our system works. So this is a picture,
a view of the system, and it’s intentionally blurred out so that you don’t read anything. But these are the three
components of our system. There’s the fast play retrieval, the event based tuning, and the refinement on demand. And we are going to cover each one of them in more detail now. Basically we have a database of three years of automated tracking data that we can use to
warm-start the annotation. And this is roughly 80,000
plays in our database. This is how the plays look like. On the left we have the spatial temporal trajectory for ball and players. And on the right we have the associated events for this game. So you have, in gray, the events associated with player actions, such as balls pitched,
balls hit, and balls caught. And in green you have the events associated with player movement, such as player reached
a base, or player ran. We map from those events to an index that can be easily queried,
and retrieved in the database. So essentially each one of
those events is associated with a bit that can be either zero or one, whether this event happened or not. So in this play you have that
the ball type is a grounder, the player running is also happening, and the first baseman is also running. So this is how it works, instead of asking the users to annotate these bits one by one, we map these game events to questions that are more intuitive
for the user to answer. So questions such as which players ran, or who are stealing bases,
and what’s the batter on base? So basically the user has to answer those questions just by
clicking in those check boxes, and then we retrieve that trajectory that approximates the
play that he’s seeing. Like so. So let’s say that the user annotated this questionnaire on the left. We map this questionnaire to an index. And then we retrieve the trajectory that matches this index
as best as possible, using Jaccard similarity. Another thing that we allow is, we allow the user to
weight those questions, so that they can give more importance to certain events in the game. For example, here we’re weighting the who ran action with more weight. And in this way we give more importance to the players that ran in the field. Finally, now that we have a trajectory for our approximation, we have to align this trajectory with the video that the user is seeing. And this is how we do it. Basically the video and trajectory might not be aligned initially. So if you look at the video on the top, and the trajectory on the bottom, you’re going to see that the trajectory starts moving before the video. Like so. So the ball is moving, but
the play didn’t start yet. So we have to somehow align those two, in order to enable the game annotation, and this is what we do. First of all, we automatically align the events based on the hit sound. Then we let the users align the video, and the trajectory events by dragging and dropping those events on the field. And finally we will re-query the database, so that the events match
them in the temporal aspect. So this is how it looks like. The user just has to click, drag and drop the video frame across the screen. So, ball’s caught, now he’s annotating the
ball’s release event, and finally the ball is caught again. After each time that the user
drags and drops this game, it’s going to re-query the database, and retrieve a trajectory that approximates the game in a better way. So you are going to see that the player and ball now are aligned with the video. Like so. Finally, this is the part where the manual annotation really happens. So, some elements of
the retrieved trajectory might not represent the video correctly. Let’s say for example that in this case the first baseman does
not match the video. So we have to let the user
fix this trajectory somehow. And the way you do it is very simple. We just let the user select the element that he wants to annotate, and then we clear this trajectory, and let the user manually
annotate it from scratch. But notice that all the other players are already positioned in the field. So this is going to make
the process much faster, because the user doesn’t have to annotate every single player. He just has to annotate the players that he thinks need to be fixed. So intuitively, this is
going to be much faster than the baseline manual annotation. Our system is then comprised
of just three steps, that are going to make the
annotation much faster, and more reliable. But how do we evaluate this? The evaluation is made by comparing the manual tracking from scratch with our system, HistoryTracker. And it was done using 10
plays and eight users. So each play is annotated eight times, Four with the baseline annotation, and four with HistoryTracker. And just to give you an idea, this is what the plays look like. You won’t have time to look at each play individually, but you can see that the
balls move in all directions, and there are multiple
players on the field. So we tried to convey as much as possible of the variability in the baseball games. So first we compare accuracy, and we can see that HistoryTracker
is 30% more accurate than the baseline annotation from scratch. This was really surprising for us, because originally we thought that our annotation and the baseline
annotation from scratch was going to be exactly the same. But then we started thinking about it, and this actually makes
a little bit of sense. So if you look at the
screens of the video footage, you are going to see
that the baseball camera points at specific portions of the video. So essentially the user is not going to see the entire field at all times. The user has to think about, and estimate the positions
of the other players. And therefore, this is going to introduce a lot of error in their annotations. If the user uses historical data to approximate these trajectories, then the results are going to be better. And this is what we see in our chart. Finally we also have a time comparison of HistoryTracker with the baseline. And HistoryTracker is 26% faster. We already expected this,
but this is great news, because 26% is not a lot,
if you think about it. But if you think about it at scale, then it makes a lot of difference. So one baseball game has
approximately 67 batted balls. So if a person is to
annotate all those games, this person is going to
save more than one hour. And that’s a lot of effort
that this person’s saving. One major league season
has more than 2000 games. So if you think about it, this person is going to save
3000 hours in annotation. And that’s a huge effort
that they are saving. Finally here we have
some example annotations. So on the top we have the Ground Truth of the trajectories that were annotated. And on the bottom we have the trajectories annotated using HistoryTracker. And you can see that the top trajectories and the bottom trajectories
look very alike. So not only is the error
computed very small, but also the overall
trajectories look the same. So here are some conclusions: HistoryTracker is a system
that warm-starts annotation using historical tracking data. And we noticed that it
produces high quality data, with lower user input. What’s the advantage of that? Well, manual tracking is much cheaper than automated tracking. So if you think about
it, Major League Baseball spent millions of dollars in their system that uses cameras and radars to track the player and ball positions. However with manual annotation, anyone can just drag and drop things on the screen, and create tracking data that can be used for analysis. This is very useful for smaller teams in minor leagues, because they don’t have as much money as those major companies. So we can democratize tracking data and analytics for smaller teams. As future work, we have
proposed HistoryTracker for baseball, but this can be applied to any other number of domains. If you think about sports,
then any other team sport, such as basketball or soccer, can take advantage of our method. And we just have to come up with questions that enable the
quick retrieval of plays, to enable users to quickly approximate what they want to annotate. But what we want to do is minimize the user effort by using previous data that we already acquired. Another thing that I want to point out is that even if you
don’t work with sports, you can think about, well, everybody needs to label data. And everybody needs to
make users label data. So can we think about ways to
use previously collected data to make this process
faster, and effortless? So I encourage you to think about this. Everybody has lots of data collected, everybody needs to collect more data, so let’s try to make this
work easier for our users. And with that, I thank you
all for your attention, and I open for questions now. (audience applause)

– [Host] Thank you Jorge. Questions?

– [Man 1] Thank you very much for this presentation, it was very interesting. Question, you mentioned in your last sentence exactly something that I was wondering about. You mentioned that you can use this kind of tool, HistoryTracker, for annotating other fields.

– Mm hm.

– [Man 1] For instance game play, or human computer interaction in some way. But you also mentioned that you have 8000 hours of data, with zillions of coded events.

– Mm hm.

– [Man 1] What would be the minimum time needed to feed your database?

– I see. We actually thought about this. And we’re trying to transform the system into a product. But this 80,000 hours of plays that we have so far, we cannot use them because those are private. So we have to somehow create this data from scratch. And the idea is that, using many annotations from scratch, we can crowd source it. In Mechanical Turk, or whichever system. And then, let’s say if you have 100 plays, this can be already used to approximate the next ones. In essence when more users start using the tool, we are going to gather more data, and this is going to make the process faster and faster. So, I hope this answers your question.

– [Man 1] Thank you.

– [Host] More questions?

– Hi, Cliff Lampe, University of Michigan. I appreciate the work, it’s really cool, combining, kind of, the spatial element, plus the time series data. Have you considered applying it to non entertainment activities? So the thing I was thinking of would be logistics, or the factory floor, or things like that. Is there anything different about the contexts of work than you think about the context of play?

– Mm hm, yes. So, one of the particularities of the baseball games is that they are very structured. So this list of events that I presented before, they clearly describe every single action, and every single movement of players. I’m thinking that in an industrial setting these movements are not gonna be as structured as in a baseball game. So it’s going to be a little bit more challenging to apply this system in this setting, but it’s possible. So if you think about the questions that you can ask, you can ask for example, where are the employees situated? Or, what’s the closest employees to this person? And based on these questions you can query the database, and approximate those trajectories. So it involves a little bit more work, but I hope that this is possible as well.

– [Host] We have time for one more question. We’re not gonna start the next one until 20 to 12, so… (audience laughs) Better to have a question than to stand here silently. Anything else you want to say, Jorge? (Jorge laughs)

– Well, this is just my first CHI, so I’m really happy to be here. And yeah, thank you very much. (audience applause)

– [Man 2] I have one simple question.

– [Jorge] Right here.

– [Man 2] Errors, is that something that the system is accounting for? And if it is, is that something that’s typically annotated? I know some baseball, but I know errors happen, right? And if a ball is not caught, or something happens, right? That has to be annotated, so, is your system flexible to include those?

– [Host] Go up to the podium and use the mic, please.

– Yes, so, the errors are included in the questionnaire. So the user is going to, the user never introduces new events to the system, but he uses historical data to give those events importance. So yes, it’s possible.
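
The play-retrieval step described in the talk (game events mapped to bits, matched against stored plays with Jaccard similarity, with optional per-question weights) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the event names, the helpers `to_bits`, `weighted_jaccard`, and `retrieve`, and the two toy plays are all hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical event vocabulary: the talk describes events like these,
# but the real system's list is not given in the transcript.
EVENTS = ["ball_is_grounder", "batter_ran", "first_baseman_ran",
          "runner_stole_base", "ball_caught"]


def to_bits(answers: Set[str]) -> List[int]:
    """Map checked questionnaire answers to one bit per event (1 = happened)."""
    return [1 if event in answers else 0 for event in EVENTS]


def weighted_jaccard(a: List[int], b: List[int],
                     weights: Optional[List[float]] = None) -> float:
    """Weighted Jaccard similarity of two bit vectors.

    With all weights equal to 1.0 this is the plain Jaccard similarity.
    Returns 0.0 when neither vector has any weighted bit set.
    """
    if weights is None:
        weights = [1.0] * len(a)
    shared = sum(w for x, y, w in zip(a, b, weights) if x and y)
    either = sum(w for x, y, w in zip(a, b, weights) if x or y)
    return shared / either if either else 0.0


def retrieve(query: Set[str], plays: Dict[str, Set[str]],
             weights: Optional[Dict[str, float]] = None) -> str:
    """Return the id of the stored play whose events best match the query."""
    w = [(weights or {}).get(event, 1.0) for event in EVENTS]
    q = to_bits(query)
    return max(plays, key=lambda pid: weighted_jaccard(q, to_bits(plays[pid]), w))


# Toy database standing in for the historical plays (assumed data).
PLAYS = {
    "play_a": {"ball_is_grounder"},
    "play_b": {"runner_stole_base", "batter_ran"},
}
QUERY = {"ball_is_grounder", "runner_stole_base"}

print(retrieve(QUERY, PLAYS))                              # play_a
print(retrieve(QUERY, PLAYS, {"runner_stole_base": 5.0}))  # play_b
```

With equal weights the grounder-only play is the closest match (similarity 1/2 versus 1/3). Giving the stolen-base question a weight of 5 moves the match to the play in which a runner stole a base (5/7 versus 1/6), which mirrors the idea of letting the annotator give more importance to certain events.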
