Google’s AI Plays Football…For Science! ⚽️


Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér.

Reinforcement learning is an important subfield of machine learning research in which we teach an agent to choose a sequence of actions in an environment to maximize a score. This enables these AIs to play Atari games at a superhuman level, control drones and robot arms, and even drive cars.

A few episodes ago, we talked about DeepMind’s Behaviour Suite, which opened up the possibility of measuring how these AIs perform with respect to seven core capabilities of reinforcement learning algorithms. Among them were how well such an AI performs when shown a new problem, how well, or how much, it memorizes, how willing it is to explore novel solutions, how well it scales to larger problems, and more.

In the meantime, the Google Brain research team has also been busy creating a physics-based 3D football (or, for some of you, soccer) simulation in which we can ask an AI to control one or multiple players in this virtual environment. This is a particularly difficult task because it requires finding a delicate balance between rudimentary short-term control tasks, like passing, and long-term strategic planning.

In this environment, we can also test our reinforcement learning agents against handcrafted, rule-based teams. For instance, here you can see that DeepMind’s IMPALA algorithm, specifically, the one that was run for 500 million training steps, is the only one that can reliably beat the medium and hard handcrafted teams. The easy case is tuned to be suitable for single-machine research work, while the hard case is meant to challenge sophisticated AIs that were trained on a massive array of machines. I like this idea a lot.

Another design decision I particularly like here is that these agents can be trained either from pixels or from the internal game state. Okay, so what does that really mean?

Training from pixels is easy to understand but very hard to perform: it simply means that the agent sees the same content as what we see on the screen. Deep reinforcement learning is able to do this by training a neural network to understand what events take place on the screen, and passing (no pun intended) all of this event information to a reinforcement learner that is responsible for the strategic, gameplay-related decisions.

Now, what about the other one? Learning from the internal game state means that the algorithm sees a bunch of numbers that relate to quantities within the game, such as the positions of all the players and the ball, the current score, and so on. This is typically easier to perform, because the AI is given high-quality, relevant information and is not burdened with the task of visually parsing the entire scene. For instance, OpenAI’s amazing Dota 2 team learned this way.

Of course, to maximize impact, the source code for this project is also available. This will not only help researchers train and test their own reinforcement learning algorithms on a challenging scenario; they can also extend it and make up their own scenarios.

Now, note that so far I tried my hardest not to comment on the names of the players and the teams, but my will to resist just ran out. Go, Real Bayesians!

Thanks for watching and for your generous support, and I’ll see you next time!
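The reinforcement learning loop described above, an agent choosing actions in an environment to maximize a score, can be sketched in a few lines. This is a generic toy example, not the IMPALA setup or the football environment: a hypothetical one-dimensional "pitch" on which a tabular Q-learning agent learns to walk toward the goal.

```python
import random

random.seed(0)

# Toy environment: a 1-D "pitch" with cells 0..4. The agent starts at
# cell 0 and scores (+1 reward, episode ends) on reaching cell 4.
# Actions: 0 = step left, 1 = step right. Every step costs -0.01.
N_CELLS = 5
ACTIONS = (0, 1)

def step(state, action):
    nxt = max(0, min(N_CELLS - 1, state + (1 if action == 1 else -1)))
    done = nxt == N_CELLS - 1
    reward = 1.0 if done else -0.01
    return nxt, reward, done

# Tabular Q-learning: Q[state][action] estimates the achievable score.
Q = [[0.0, 0.0] for _ in range(N_CELLS)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

for episode in range(200):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the current estimate,
        # occasionally explore a random action.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[state][a])
        nxt, reward, done = step(state, action)
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + (0.0 if done else gamma * max(Q[nxt]))
        Q[state][action] += alpha * (target - Q[state][action])
        state = nxt

# The learned greedy policy moves right from every cell, toward the goal.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_CELLS - 1)]
print(policy)  # → [1, 1, 1, 1]
```

The same loop structure, observe, act, receive a reward, update, underlies the football agents; they simply swap the table for a deep neural network and the five-cell pitch for pixels or the internal game state.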

69 thoughts on “Google’s AI Plays Football…For Science! ⚽️”

  1. This channel is soooo good. Because of videos like this, we come across other people's solutions to challenges we wouldn't think of, haha

  2. Next step: do the same thing with two teams of physically simulated T-rexes.
    https://www.youtube.com/watch?v=-ryF7237gNo

  3. The training-on-pixels (screen) part is really doubtful. Basically, there is a small map at the bottom which indicates every player's position, including the ball. How can they know that the model isn't just looking at that small map and ignoring everything else? If the model is just looking at that, there isn't much difference between the pixel and raw float representations, since we know a CNN can handle that very well. In the paper they also say the entire screen is downsampled to 72×96 in greyscale for training (just a quick read, so if that's wrong, please tell me), and that makes the doubt more profound: how can the model even know where the ball is? The ball would be downsampled to nothing!

    I have this opinion basically because, if one says "oh, end-to-end reinforcement learning is so powerful," then why would we even need things like object detection in autonomous vehicles? Just let the model learn everything from what it sees from the camera!

    Well, maybe in the future it will become reality, but for now I'll hold on to my reservations.

  4. How is this different from learning StarCraft? It looks like they just plugged their existing AI into another game. Is there any novelty here?

  5. Is there any research on "calming down" AI?

    I mean, does anyone train a strong AI to become beatable by weaker players, to get the "easy/medium difficulty" that is usually lacking?
    Because right now we only see "strongest of the strongest" or "strong with exploitable mistakes" 🙁

  6. The spectator industry makes a lot of revenue. I imagine future games designed specifically for viewer aesthetics, played (and playable) only by optimized agents. Twitch already proves elements of this right now.

  7. I’m still a bit confused about certain aspects of reinforcement learning. Does it just continue recording the actions and rewards forever? Because that would take up a lot of space.

  8. <shameless_self_promotion> I played around with this environment, it's really nice. If anyone is interested, I made a step-by-step math and code video/text tutorial series to train a PPO algorithm to play this game: https://www.youtube.com/watch?v=SWllbdcrKLI Thanks!</shameless_self_promotion>

  9. I love how the AI is forced to "have an idea" of the hidden areas of the field, similar to the AlphaStar approach. Would be interesting to see how it performs against human players.

  10. Learning from gamestates may be more efficient but it seems to me learning from pixels is more universal and natural.

  11. Atlas is watching this thinking….
    "I wanna go outside and play football"
    Imagine a sturdy robot like Atlas with a real-time problem-solving/solution drive… single-minded determination and totally task-driven. Yikes.

  12. How would AI handle a game like Rocket League? I haven't seen an example that accurately deals with depth or camera-angle changes yet.

  13. Will be interesting to see whether AIs can come up with any strategic novelties in sports.

    Some football tactics in recent memory already seem like something an AI would invent – like Barca's 'all midfield, just keep passing forever' phase, and Stoke's tactic where they deliberately kicked the ball out every time they got possession, because they conceded fewer goals when defending for most of the match.

  14. I want to see an AI for fighting games that replicates human behaviour. In fighting games, the AI can easily beat the player since they can react instantly.

    The AI doesn't play like a human does. It's just programmed to make mistakes sometimes. It's not actually thinking like a human and it doesn't have limited reaction times.

  15. It is probably too early to introduce the concept of "loot boxes"; it may blow an APU. And you do know I've been playing FIFA AI for years, so the significance of football gaming AI may be lost on the casual viewer.

  16. Wait a minute, don't tell me they are dumping the entire screenshot/video of the play into a deep neural network; that's a lot of GPU power. The AI does not need to know about those large blank sections of the soccer field; those could be zeroed out. They should first send the image through an AI whose only job is to cut out circles around the "important" parts of the image, as determined by backpropagation. Then the (x, y) of the center of each circle is kept with its cutout, and all the cutouts and their coordinates are pasted into a much smaller image, and that is what you feed into the AI that tries to do the task. That way, the resources that would have been spent idly watching the grass wave in the digital wind at the input layer can go to a larger hidden layer, where the magic happens. It's still a general solution, because the first AI determines what is important, and that can change with each different game. I call this idea "giving the AI a subconscious": the subconscious takes in all the information, but its job is to filter out the boring stuff and pass only the meaty, useful information to the conscious, which uses that cut-down input to do the task. Judging how much information to pass on should also be learned with backpropagation; it's basically a neural network handling and overseeing the dropout layer.

  17. This isn't an open-source research foundation, but RLBot is really cool and based on a huge game which is also physics-based and requires some strategic thinking, Rocket League: https://www.rlbot.org/

  18. I was just thinking this myself. The neural network should translate the pixels into information like positions and identifiers. Then the AI could reason with these positions to predict the best course of action, like a chess computer. But I wonder, how would an AI come up with its own strategies and rules of thumb, like humans tend to do?

  19. There’s no way frequentists will be more popular than bayesians if the pool of fans consists entirely of machine learning researchers….

  20. I'm fully expecting an RL algorithm to discover how to break the simulation and teleport a player from one end of the field to the other or something.

  21. I added a Chinese translation for this video, but it seems it needs approval from the author or other viewers before it appears in the subtitle menu.
    If anybody is familiar with Chinese, please do me a favor and approve it: https://www.youtube.com/timedtext_editor?action_mde_edit_form=1&v=Uk9p4Kk98_g&lang=zh-TW&bl=csps&ui=sub&ref=hub&ar=1570430089088&tab=captions

  22. Can't wait for games where the enemy AI actually uses deep learning. Also I kinda want to download this and set g=0.1 in the physics engine so I can see what moon football looks like.

  23. For pixel-based training, once the AI learns what information is irrelevant to the game's mechanics (e.g., fans in the stands, decoration), it can probably save a lot of computation.

    So, an AI that doesn't just improve its quality, but its learning strategy as well (reflecting on its own learning progress and evaluating it across different simulations).

  24. I optimized the Chinese translation for this video; I hope the author or other viewers approve it so it appears in the subtitle menu.

    If anybody is familiar with Chinese, please do me a favor and approve it: https://www.youtube.com/timedtext_editor?action_mde_edit_form=1&v=Uk9p4Kk98_g&lang=zh-TW&bl=csps&ui=sub&ref=hub&ar=1570430089088&tab=captions
