AlphaGo learning to play from scratch

deepmind.com/blog/alphago-zero-learning-scratch/

>3 days: beats AG Lee by 100 games to 0.
>40 days becomes most powerful Go program

We are done.

Other urls found in this thread:

blog.openai.com/more-on-dota-2/
tcec.chessdom.com/live.php
alphago-games.com/#agzero_vs_agmaster
nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ
deepmind.com/blog/alphago-zero-learning-scratch/
quantamagazine.org/a-brain-built-from-atomic-switches-can-learn-20170920/

>No human input at all
Now this is podracing.

...

AI getting out of hand in a short moment doesn't seem impossible at all.

*getting out of the hands of the Jews

>a fucking game
Call me when it does something impressive like inventing a new branch of maths.

fuck off back to

>making AI learn how to slowly immobilize and choke its opponent

I've a bad feeling about this...

I suggest we create an AI that is good at torture next.

I don't see how it's not impressive when you literally don't know how to tie your own shoes without mom's input

That's right, I use your mom's input instead.

>maths
>s

it's learning by playing with the other, older AI

it's nothing new..

No. It's learning by playing against itself. When it's weak as fuck, it plays against a copy of itself that is weak as fuck too.

Alright... Well that is something new to me

What's scary is how the less stuff they did to help the learning process, the stronger it actually got in the end. It's almost as if the bias from learning from human players actually hurt it.

It's not new to the AI sphere tho - they're merely copying this
blog.openai.com/more-on-dota-2/

>took only 3 days to teach itself without human intervention
Now can we train an AI to build a better AI that can do it in 2 days?
Then once we've done that, take that "trained AI" and get it to work without human intervention like this newest one? THEN we'll truly be fucked.

OR

>only knowledge/training the AI was given was a basic understanding of the rules of the game.
Tweak the AI so it can learn in 3 days without even knowing the rules of the game. Then we'd have an AI that's both generalized and self teaching.

If we just try to get it to create better versions of itself without defining what "better" is, it'll just end up creating meaningless computer viruses whose only goal is virality. It won't give a crap about dominating, changing, or even understanding the world in any way.

video games are finished

They really are. Back in the day I used to bot easily-bottable games like EVE and that gave me a huge enough advantage over other human players, I can't imagine what the games would look like once everyone's a fucking bot. They'd straight up become unplayable as there will ALWAYS be a small percentage of assholes who'll deploy an entire network of AIs who simply do everything better, if it's not for the sake of winning it will be for the sake of ruining everyone else's fun. There's no reason not to use an AI if it really is good at the game, and what's worse, there's no way to get caught using one if you throw in enough random delays into the keystrokes to make it seem like a human is pressing them.

At least I can safely say that I enjoyed a lifetime worth of games before the age of AIs ruined them for everyone

i'm a little suspicious of the claim that the network itself can play at dan level without tree search.

if true, it kind of implies that GO isn't nearly as complex as previously thought

Indeed. The strongest chess tournaments are also now computer tournaments. For example, the Top Chess Engine Championship which just started:

tcec.chessdom.com/live.php

"Maths" as a plural of math is a UK thing, like the u in colour. It's not incorrect.

>plural
>of math

Just because your entire culture doesn't understand the concept of isomorphism doesn't mean you have to be a brainlet too.

>they don't understand isomorphism

This is a new one for me. Explain?

Sorry, I'm not the best person for it. Half of all my mathematical knowledge comes from independent research into artificial intelligence, so I don't generally have a good way of formally describing things. It's mostly intuitive, and I can only enjoy others' pursuit of mathematics by observing from a distance.

Maybe some other user wants to help tackle that one.

>independent "research"
>mathematical "knowledge"
>don't generally have a good way of formally describing things
lol. at least thank you for your honesty.

>thank you for your honesty
It's all I have really. I can't claim to make AI without actually making AI, and I don't believe it's safe for me to do that yet. I'll only personally consider myself a mathematician after I formalize topology independently. Later, we can check my work and see how well my axioms match up with the standard topology axioms.

bump

It's only a matter of years now before this happens to every domain.

deepmind is pretty close to the singularity. A few tweaks and they have it.


It uses one neural network rather than two. Earlier versions of AlphaGo used a “policy network” to select the next move to play and a “value network” to predict the winner of the game from each position. These are combined in AlphaGo Zero, allowing it to be trained and evaluated more efficiently.
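For anyone wondering what "combined" might look like: a rough sketch of one network with a shared trunk and two output heads, one giving move probabilities and one giving a win estimate. This is NOT DeepMind's architecture; the class name `TwoHeadedNet`, the layer sizes, the random weights and the 9-cell toy board are all made up for illustration.

```python
import math
import random

def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class TwoHeadedNet:
    """Toy stand-in (hypothetical): one shared hidden layer feeding two heads."""

    def __init__(self, n_inputs, n_hidden, n_moves, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.trunk = init(n_hidden, n_inputs)       # shared by both heads
        self.policy_head = init(n_moves, n_hidden)  # one score per move
        self.value_head = init(1, n_hidden)         # single win estimate

    @staticmethod
    def _matvec(weights, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    def forward(self, board):
        hidden = [math.tanh(h) for h in self._matvec(self.trunk, board)]
        policy = softmax(self._matvec(self.policy_head, hidden))
        value = math.tanh(self._matvec(self.value_head, hidden)[0])  # in (-1, 1)
        return policy, value

# Untrained weights, so the outputs are meaningless; only the shapes matter here.
net = TwoHeadedNet(n_inputs=9, n_hidden=16, n_moves=9)
policy, value = net.forward([0, 1, -1, 0, 0, 0, 1, -1, 0])
```

Both outputs come from the same hidden layer, so one forward pass serves move selection and position evaluation.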

It is able to do this by using a novel form of reinforcement learning, in which AlphaGo Zero becomes its own teacher. The system starts off with a neural network that knows nothing about the game of Go. It then plays games against itself, by combining this neural network with a powerful search algorithm. As it plays, the neural network is tuned and updated to predict moves, as well as the eventual winner of the games.

This updated neural network is then recombined with the search algorithm to create a new, stronger version of AlphaGo Zero, and the process begins again. In each iteration, the performance of the system improves by a small amount, and the quality of the self-play games increases, leading to more and more accurate neural networks and ever stronger versions of AlphaGo Zero.
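A bare-bones sketch of the self-play idea described above, with heavy simplifications: a table of move preferences stands in for the neural network, there is no search step at all, and the game is a toy (take 1 to 3 stones, whoever takes the last stone wins). The update rule (reward the winner's moves, dampen the loser's) and every name in it are assumptions for illustration, not what AlphaGo Zero does.

```python
import random

MAX_TAKE = 3  # a player may remove 1..3 stones; taking the last stone wins

def legal_moves(stones):
    return list(range(1, min(MAX_TAKE, stones) + 1))

def pick_move(prefs, stones, rng):
    """Sample a move in proportion to its current preference weight."""
    moves = legal_moves(stones)
    weights = [prefs[(stones, m)] for m in moves]
    return rng.choices(moves, weights=weights)[0]

def self_play_game(prefs, start, rng):
    """The current policy plays both sides; return each side's moves and the winner."""
    history = {0: [], 1: []}
    stones, player = start, 0
    while True:
        move = pick_move(prefs, stones, rng)
        history[player].append((stones, move))
        stones -= move
        if stones == 0:
            return history, player
        player = 1 - player

def train(start=10, games=5000, seed=0):
    rng = random.Random(seed)
    # Start from uniform preferences: the policy knows only the rules.
    prefs = {(s, m): 1.0 for s in range(1, start + 1) for m in legal_moves(s)}
    for _ in range(games):
        history, winner = self_play_game(prefs, start, rng)
        for key in history[winner]:
            prefs[key] += 1.0                        # reinforce the winner's choices
        for key in history[1 - winner]:
            prefs[key] = max(0.1, prefs[key] * 0.9)  # dampen the loser's choices
    return prefs

prefs = train()
best_from_3 = max(legal_moves(3), key=lambda m: prefs[(3, m)])
```

Each game is generated by the policy as it stands after the previous update, so the opponent improves at the same rate as the learner.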

This technique is more powerful than previous versions of AlphaGo because it is no longer constrained by the limits of human knowledge. Instead, it is able to learn tabula rasa from the strongest player in the world: AlphaGo itself.

>This technique is more powerful than previous versions of AlphaGo because it is no longer constrained by the limits of human knowledge.

iirc, the first one was also trained against a copy of itself. this just means it doesn't need to be "bootstrapped" using human examples.

This one is exponentially more impressive than the others. No policy network. 1 Neural Network. No bootstrapping whatsoever.

how about the search technique itself? Is it discovering how to think ahead, or is that one of the axioms of the system?

Now I want to read some follow-on articles where Go experts observe AlphaGo Zero and distill new Go knowledge that had never been discovered before.

And that suggests another project: figure out how to distill this new expertise into lessons for humans.

humanistic thinking patterns. Pointless. Human go players don't matter now.

Here are the "final form" games.
alphago-games.com/#agzero_vs_agmaster

> no rollouts

wait, it doesn't use MCTS? so the search is just purely weighted by the value network and not simulations?

that's kind of funny. i remember someone getting a little butthurt a few months ago when i suggested that MCTS was non-ideal and dissimilar to how humans play the game.

Here is the article
nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ

Yes no MCTS

>And that suggests another project: figure out how to distill this new expertise into lessons for humans.

is there a copy that isn't behind a paywall?

That article isn't paywalled, I messed up.

Click the link at the bottom of this article for a free version. deepmind.com/blog/alphago-zero-learning-scratch/

How about they use AI to teach ME how to play GO at the master level? That has always been the true vision of AI that I've fantasized about. Who the fuck cares about computers being good at something? I want to be good at something. AI could optimize my practice sessions and learn more about me than I can and then use that to schedule difficulty, tempo, tactics, etc.

can you link the PDF? i still can't find it?

When we can directly connect with computers, then you will get your wish.

Musk is working on that.

Maths is the shorthand for mathematics. Why do you lads say "math" anyway? It comes across as "mathematic".

shit is literally wishful thinking. Human mind is very shit compared to what AI can do. Biological neurons aren't good.

>talking about an arbitrary time in the future that might not even arrive depending on how humans handle the time between now and then

You DO realize AI can't do shit right now, right?

Dude chink board games lmao

What the fuck does this have to do with reality

I don't think you understand the pace it can advance.

human life is the same as a very complex game

Tree search doesn't work due to the amount of potential arrangements. It's just of superior skill, maybe not of superior "grand strategy" though. But that doesn't matter (for Go, not for more complex things like video games) if you get destroyed at everything else.

It plays in a way that is beyond your capacity to emulate.

Why don't you teach it to code? We would have better internet by now if you did.

don't be so sure, emulations of the physiology of our brains may prove to be very important
quantamagazine.org/a-brain-built-from-atomic-switches-can-learn-20170920/

>Tree search doesn't work due to the amount of potential arrangements

not quite right. naive tree search does not work. alphago uses tree search, but at any given point in a game, only a few moves will actually make sense, and ignoring the rest shrinks the search space considerably.

iirc, their original paper claimed that the policy network they used can actually play at a high level even without tree search.
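To put rough numbers on the pruning point: a back-of-the-envelope count of positions at the bottom of a full search tree versus a pruned one. The figure of about 250 legal moves per turn is a commonly quoted average for Go, and keeping only 5 candidate moves is an arbitrary assumption, not anything from the paper.

```python
def positions_searched(branching, depth):
    """Leaf count of a full tree with the given branching factor and depth."""
    return branching ** depth

full = positions_searched(250, 6)  # assumed average branching factor for Go
pruned = positions_searched(5, 6)  # hypothetical: keep the 5 most plausible moves
ratio = full // pruned             # how many times smaller the pruned tree is
```

Looking only six moves ahead, the pruned tree is already smaller by a factor of 50 to the 6th power.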

They obviously fucking taught it how to play GO, give me a fucking break. They didn't just plug it up to a blank GO board and fucking let it learn all the rules by itself.

>I used a word I don't know the meaning of.

>human life is the same as a very complex game
In what way?

Are you retarded? Of course they gave it the rules. No one is claiming otherwise.

What the hell do I do with an epdf?

Apparently you have to open the page on Chrome or Firefox and turn off your adblock.

i can't even see it

> re-directing to basic pdf

and then it dumps me to the paywalled version.

what was wrong with pdf in the first place? why is it an e-pdf? are pdfs not "e" as in electronic anyway? who comes up with this shit?

It's restricted in what it can do but it wasn't given anything more than that, it just played against itself (initially with very random moves).

Self play seems to be THE approach wherever it is applicable. Once you combine self play with genetic algorithms, shit is gonna get scary.

There was already a heads-up texas holdem AI that used self play to get good, IIRC

That's literally what life is.

"understanding" is a false meme

Anything you've ever done or thought is literally just pattern matching. You can believe that unicorns will come and burn the earth down and it's totally fine as long as you keep reproducing and competing in such a manner.

Understanding is just like a shortcut abstraction we use to map concepts easier. We know absolutely jack shit in the grand scheme of things.

It is both a UK thing and it is incorrect in the sense that it is inconsistent. The u's in their English are there because it is descended from French and German, though that isn't to say that they don't also have a lot of Latin root words randomly strewn across their language.

American English (i.e. Webster's dictionary) went through the effort of converting all the French spellings to Latin spellings to make the language more consistent.

>economics -> econs
lmao

In a category, a morphism [math]f:A\to B[/math] is called an isomorphism if there exists another morphism [math]g: B\to A[/math] such that [math]g\circ f = 1_A[/math] and [math]f\circ g = 1_B[/math]. In other words what you said makes no sense and it's clear you are just trying to use "big words" you clearly don't understand in order to make yourself sound intelligent.

Just stop.

The search technique meant it generated a bunch of games from that position. This doesn't do that anymore and relies entirely on the single neural network. Just read the page in OP, it's short and clear.

No more rollouts, no more Monte Carlo, no more -- a bunch of other shit.

There is just one neural network now but it is not the same thing as the value network before. This network is much simpler and only takes the current board configuration as input (the older networks also included a bunch of "hand-engineered" features).

The goal isn't Go. It's just a problem they're working on as a proof of concept. Their goal is to develop a system that can use AI to solve a problem and in the process make new discoveries that humans may not have noticed. With AlphaGo Zero they've got it to the point where their system learns by itself (doesn't require big data, or any data really), requires way less computing power and way less electrical power than when they started, is simple, and it's not only rediscovered expert Go techniques but it's discarded them in favor of better ones currently unknown to humans.

Training humans to play Go is probably not even on their radar. From this point they're either going to keep perfecting their AlphaGo stuff or they're going to move onto other problems in society.

>alphago uses tree search
Not AlphaGo Zero. Read the OP link.

AlphaGo Zero is a completely different approach from their original paper.

Perhaps it is you who should read the paper.

>fat fuck amerimongrel detected

Oh no, their blog lied to me!! I will never believe something a blog says ever again!!!

You got fucking rekt and shouldn't have made this post

>not playing singleplayer

oh come on...

why are some people so pro AI I don't get it
there's much more of a barrier to the real world of it changing things than just the AI's ability

>computers are better calculators than humans
whoopdidoo

Sad to see these beautiful games destroyed one by one ...

Chess I can understand, as computers are kind of cheating.

But AlphaGo is just better than you. Humans just need to git gud.

Spotted the hook nose :^)

why is this page 9?

This. Fucking pattern recognition is enough.

their blog lied to me too. the blog said it doesn't do random playouts, but the figure says they're still using MCTS.

would be great to actually get a PDF

all

You're misunderstanding what they mean by "rollouts". [spoiler]They have nothing to do with tree searching.[/spoiler]

a playout is just a random game started from a certain state, right? MCTS estimates the value of each state by playing many random games keeping track of how often a random game results in a win from any given state.

so yeah, a playout isn't really a tree search in and of itself, but aren't playouts the defining feature of MCTS?
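A small sketch of the playout idea as described in this post: estimate how good a state is by playing lots of uniformly random games from it and counting wins. This is only the playout half, with no tree built on top, and it uses a toy game (take 1 to 3 stones, last stone wins) rather than Go. Function names and the playout count are made up.

```python
import random

def random_playout(stones, rng, max_take=3):
    """Play uniformly random moves to the end.

    Returns True if the player who was to move at the start takes the last stone.
    """
    player = 0
    while True:
        stones -= rng.randint(1, min(max_take, stones))
        if stones == 0:
            return player == 0
        player = 1 - player

def playout_value(stones, n_playouts=2000, seed=0):
    """Fraction of random games won by the player to move: a crude value estimate."""
    rng = random.Random(seed)
    wins = sum(random_playout(stones, rng) for _ in range(n_playouts))
    return wins / n_playouts

value_1 = playout_value(1)  # only one legal move, and it wins on the spot
value_5 = playout_value(5)  # a noisy estimate somewhere between 0 and 1
```

The estimate only reflects how random players fare from that state, so it can be far from the value under good play.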

The MCTS does not roll out to the end of the game. It's only a limited depth, I think.

that's not really MCTS then...

I guess, that's what they were doing in the previous versions of alphago though.
Limited depth search then use one of the NNs to evaluate the board.
They've combined the two NNs into one in this version, so I'm not sure how it works.

MCTS makes up for the lack of an accurate value function, as i understand. if they have a good way to measure the value of any given state, then it would make sense that random playouts are unnecessary.
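A sketch of that trade-off on a toy game (take 1 to 3 stones, last stone wins): a depth-limited search that asks a value function at the horizon instead of playing random games to the end. The `value_estimate` here is a hand-written stand-in for a learned evaluator, and the search is plain negamax, not whatever AlphaGo Zero actually runs. All names are hypothetical.

```python
def legal_moves(stones, max_take=3):
    return range(1, min(max_take, stones) + 1)

def value_estimate(stones):
    """Hypothetical stand-in for a learned value function.

    Scores the position for the player to move: +1 looks winning, -1 looks losing.
    """
    return -1.0 if stones % 4 == 0 else 1.0

def negamax(stones, depth):
    """Depth-limited search that consults the value function at the horizon."""
    if stones == 0:
        return -1.0  # the previous player took the last stone, so the mover has lost
    if depth == 0:
        return value_estimate(stones)  # no playout: just ask the evaluator
    return max(-negamax(stones - m, depth - 1) for m in legal_moves(stones))

def best_move(stones, depth=2):
    """Pick the move whose resulting position is worst for the opponent."""
    return max(legal_moves(stones), key=lambda m: -negamax(stones - m, depth - 1))

move = best_move(10)
```

With an accurate evaluator a two-ply lookahead is enough in this game. With a poor one, the search would need to go deeper or fall back on playouts to make up the difference.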

In the future, our wars will be fought by drones controlled by AI that developed tactical skills by simulating battles against themselves. They will be the most elite generals in history and will be unstoppable.

will Time Travel be involved?

>implying that's bad
Humans are too unintelligent, unethical and subservient to their biological imperatives to cease reproduction on their own. AI will be our savior by denying us the burden of existence.

You don't need the closing tag if there is no text after the spoiler, and also spoilers don't work outside of /tv/.

You must be really stupid if you type out the tags by hand.

What's that Schopenhauer? If you were so smart then why are you dead now? You are dead, DEAD.