AlphaGo uses "cheap exploits" to beat humans

Saw this awesome video on AlphaGo strategy. Eventually it goes into detail on the fact that AlphaGo makes non-ideal moves in an effort to simplify the board state. The thing is, by simplifying the board into easier-to-digest small sub-battles, it can use its strength in Monte Carlo analysis to find ideal moves and eke out the victory.

This is very interesting because it means the AI is basically using exploits to beat humans. Rather than making the best possible moves, it simply wins, with a strategy oriented around its strengths.

Source
youtube.com/watch?v=YXKUuHnbyiE

But it still beat a human user. Don't belittle our soon-to-be robot overlords

This isn't belittling. It's actually better.

It makes substandard moves because its specific hardware/architecture is so much better once the board is simplified. For instance, the hard part about Go was the sheer size of the board and how it made simple branching searches terrible. AlphaGo's natural strategy evolved into attempting to break the board up into simpler pieces where such searches work best.

Cheap exploits? That's like when people complain that you used a certain weapon in a game because it's "overpowered". It didn't break any rules, it still plays the game better than us. If anything, that makes it more impressive. I hope this 1-hour video I didn't bother to watch doesn't imply otherwise.

My favorite thing is when you look at a graph of its processing, it seemed to get stuck and then just made a shit random move. It reminded me of a player getting frustrated and then just trying something randomly.

OP wasn't saying it's bad.

From the AI's perspective reducing complexity is the best possible move

Ha. I had a very old chess program that did just this.

>cheap exploits
>always wins
so now that you know the cheap exploits, why can't people beat the computer?

It was mostly trained against itself though, not against humans.

It's very smart to orient strategy toward your strengths

>cheap exploits

You mean it plays correctly in order to win and doesn't need to compute everything. That isn't an exploit. An exploit is a bug/error in the game rules that not everyone knows about and that shouldn't be there.

AlphaGo learned to use this strategy on its own, correct? More specifically, this wasn't programmed into it by the researchers who worked on it? Cool stuff.

No, the neural network that was trained (by watching humans and playing against itself) takes a game state as input and returns a weight for every possible move depending on how good it feels to the AI.
Nothing more; the part that was trained is just capable of discerning the vague area where the next move should probably be played.

The decision making is purely computational from here; it is roughly as follows:

1) Get the game state and the weights using the NN.

2) Roll a random number and choose a move according to this result and the weights.

3) Repeat from 1 with this new game state until the game is over. Record who wins. Play tens of thousands of games, and note the winrate of every move.

4) Choose the move with the highest winrate and play it for real.
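The loop above can be sketched in a few lines of toy Python. This is not the actual AlphaGo implementation (which uses a proper tree search); `policy_net` and the game-state methods (`play`, `is_over`, `winner`, `to_move`) are made-up placeholders standing in for the network and the Go rules:

```python
import random

def mcts_choose_move(state, policy_net, n_playouts=10_000):
    """Pick a move by running many weighted random playouts (toy sketch)."""
    wins = {}    # first move -> playouts won when opening with it
    plays = {}   # first move -> playouts that opened with it
    for _ in range(n_playouts):
        # 1) get the move weights for the current position from the network
        moves, weights = policy_net(state)
        # 2) sample a first move in proportion to its weight
        first = random.choices(moves, weights=weights)[0]
        s = state.play(first)
        # 3) keep sampling moves until the game ends, then record who won
        while not s.is_over():
            ms, ws = policy_net(s)
            s = s.play(random.choices(ms, weights=ws)[0])
        plays[first] = plays.get(first, 0) + 1
        if s.winner() == state.to_move():
            wins[first] = wins.get(first, 0) + 1
    # 4) play the move with the highest observed winrate for real
    return max(plays, key=lambda m: wins.get(m, 0) / plays[m])
```

Step 2 is why the bot doesn't always replay the same game: the first move is sampled, not argmaxed, so every candidate with nonzero weight gets explored.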

what is sholite? can't find this term.

this is reinforcement learning and it's similar to how humans "know" good and bad moves. what's your point?

>Alphago does non-ideal moves in an effort to simplify the board state
humans do that as well when they feel ahead.
trading a few points for a more secure win is always correct strategy

>AI is basically using exploits to beat humans, rather than making the best possible moves
best move possible will always (ALWAYS) be out of reach of humans/computers. talking about perfect play is useless for a game as complex as go on a 19x19 board.
as for the strategy, see above.
alphago is simply maximizing its win percentage.
montecarlo bots already did this before neural nets came around

It's using a different strategy than a human would use. That's not a "cheap exploit", and it's not even making "non-ideal" moves; its moves are good considering its goal is to simplify the board down to a state where it can beat a human.

This is just some sperg crying I DON'T CARE IF YOU'RE FOLLOWING THE RULES YOU'RE PLAYING IT WRONG

Yeah, the OP is actually playing devil's advocate. I was just very interested when I heard the person say he noticed it trying to "simplify" the board and making mediocre moves to do so. It made me think about how humans exploit shitty AI using tricks rather than "good" gameplay.

>when you look at a graph of it processing
where can I see such a thing?
I hope you're not referring to the video in the OP, because that's just an SGF editor (KGS) and has nothing to do with AlphaGo.
also for anyone wondering: the commentator in the OP's video is afaik a failed youngseong or whateverthefuck they're called. a former Korean insei basically.
chess equivalent would be a young IM who then quit tournament chess before becoming GM

pic is black to play

This is how humans play as well. In chess, for example, you try to bring the game deep into a board state you are familiar with but your opponent is not. You can basically play from memory, while your opponent has to calculate moves.

I don't see it.
I'd prob do D-10
B-15 seems k too

>But it still beat a human user.
Here's the thing, though:

My experience playing against go computer players as a noob was that they could be tough to beat, until you learned the things they're stupid about. Then you could just smash them effortlessly.

That's why they were lousy practice. Instead of learning to play a good game of go, you were only learning about that one program's flaws. If you smashed a human player making the same mistakes, they'd learn from it, or if you played against another human, they probably wouldn't fall for the same tricks. Good skills are robust against many opponents.

However, pro players usually study their opponents' games before they play them, and AlphaGo was surely fed databases of the games of everyone important it has played against. Studying the opponent's games would be a significant advantage for one pro player to have against another.

They haven't made AlphaGo available for pros to play against repeatedly, knowing they're playing against AlphaGo each time. So they haven't been able to probe for its weaknesses, find where it plays weaker than a human.

When that happens, they may find that there are strong approaches to beating AlphaGo.

That's a cheating AI

Damn the butthurt is so strong.
I can't wait for truckers to go out of jobs because AI is cheating the roads.

Why do people care about alphago?

Because computers weren't 'supposed to' beat human players at go for another decade or two.

Not an exploit. Of course the optimal strategy for an intelligence is the one which can actually be deployed by that intelligence. Humans do not attempt to play optimally either, otherwise they wouldn't be able to make a single move.

It represents clear proof that AI is beating conservative expectations and more in line with Kurzweil optimism.

Or engineers and architects, when people figure out a neural network can churn out hyperoptimized designs through random processes far quicker than humans through planned design.

No it doesn't.

Go is just a boardgame, like chess.

People talk a lot about all the fancy AI concepts applied to AlphaGo, but the real innovation that dramatically improved gobot performance is a straightforward method for evaluating positions, like piece scoring in chess.

In 2006 "upper confidence bounds applied to trees" (UCT) was invented, and suddenly making gobots wasn't a fumbling black art anymore, rather it was like computer chess: a matter of optimizing and throwing computational power at a straightforward mathematical approach.
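For reference, the selection rule at the heart of UCT is just the UCB1 bandit formula applied at every node of the search tree: descend into the child move that maximizes winrate plus an exploration bonus for rarely-tried moves. A minimal sketch (the function name and exploration constant are just conventional choices, not anything specific to AlphaGo):

```python
import math

def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: exploit high winrates, but explore under-visited moves."""
    if visits == 0:
        return float('inf')  # an untried move always gets picked first
    exploit = wins / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

# at each tree node, the search descends into the child with the highest
# ucb1(child_wins, child_visits, node_visits) score
```

The exploration term shrinks as a move gets more visits, so the search provably converges on the best move instead of tunnel-visioning on early lucky playouts; that guarantee is what turned gobot writing from a black art into an engineering problem.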

It took a few years for confidence in the approach to build so someone would throw money at it, as was necessary for chess to exceed the best human players.

Low IQ post

Are you expecting some magical algorithm that has to come to a special mountain of the Gods through an avatar? Your post reeks of idiocy and low IQ mistakes. These small innovations and slight improvements at harder tasks are what lead to the singularity.

UCT is a straightforward, go-specific algorithm. It produces reasonably good results using simple programs.

Back in the day, people used to use chess as an example of something that would prove computers were reaching a human level of intelligence. But then someone found a shortcut to turn it into a manageable tree-searching problem, and successful chessbots were much less interesting.

This is the same thing. People were saying you'd need human-like intelligence to play go well, because the tricks that worked for chess didn't work for go. But then someone found a trick that removed the apparent need for human-like intelligence.

Don't even need neural nets for that.

What is this image showing?

different guy here but it looks like maximising structural strength and minimising material, probably some evolutionary algo

The algorithm shifts material from the least load-bearing area to the most load-bearing one. It starts with random placement.
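A toy version of that reallocation loop, with material and stress as flat arrays (real topology optimization recomputes stress with a FEM solver each step; here the stress field is just a fixed made-up input):

```python
def reallocate(material, stress, steps=100):
    """Toy topology optimization: repeatedly move one unit of material
    from the least-stressed occupied cell to the most-stressed cell."""
    material = list(material)
    for _ in range(steps):
        occupied = [i for i, m in enumerate(material) if m > 0]
        lo = min(occupied, key=lambda i: stress[i])
        hi = max(range(len(material)), key=lambda i: stress[i])
        if lo == hi:
            break  # all remaining material already sits where stress is highest
        material[lo] -= 1
        material[hi] += 1
    return material
```

Material is conserved; it just migrates toward the load paths, which is why the resulting shapes look organic rather than planned.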

>using exploits to beat humans, rather than making the best possible moves

If those "cheap exploits" win the game consistently then they're the best possible moves ya fuckin dingus

No they're not. They're the best moves so far. Against a better AI they might be losing.

Is it really using "less than ideal" moves if it's winning? Sure, you could say that as a program it would lose to a truly omniscient opponent but that's not really a huge insult when every other human would as well.

Besides, the measure of the "idealness" of a move is its propensity to win you the game. If a given strategy wins every time, it should be by definition an "ideal" strategy. Just because we can theoretically conceive of a strategy which beats it doesn't mean anything until someone can actually implement that strategy.

who cares? the goal with the project was to beat humans. the technology will get better. nobody claimed that the bot makes perfect moves every time.

>best move possible will always (ALWAYS) be out of reach of humans/computers

"no"

>Is it really using "less than ideal" moves if it's winning?
Yes, the idea should be to aim for the objectively strongest move, that way we can use its analysis to improve human play.
Chess programs don't play the trickiest moves or the moves most likely to trip up a human, they play the moves most likely to win against an optimal opponent.
There's nothing wrong with using techniques like this to win in the short term. However, the post I was replying to claimed they were the "best possible moves". This is clearly not true and in the long term it would be preferable to develop methods which allow us to play more optimally.

>However, the post I was replying to claimed they were the "best possible moves".
Ah, I see.

OP doesn't know what an exploit is. There are no exploits in go or chess.

To expand on this: what AlphaGo is doing, in modern gaming lingo, is finding a new metagame (in other words, rules above the rules). This is in no way an exploit.

>AI

All alphago is is a Go solver. It's no more of an AI than an A* search algorithm.

>Kurzweil

is that way

AlphaGo does most of its learning by playing itself, so this strategy also works against itself. Calling it an exploit is just weird

The basic idea behind AlphaGo is that it guesses the probability that any given move will lead to a win. The Monte Carlo tree search stuff is actually not necessary; it can play with just the neural network, although not as strongly.

The algorithm will naturally trend towards board states which it knows more about (which are similar to board states it has seen before), because it will have higher certainty that these moves lead to a win.

The one game that Lee Sedol was able to win against AlphaGo was mainly because he was able to drive the board state to something very unique, something AlphaGo had probably never seen before. People called it "God's Touch" because it was such an incredible move.

wouldn't it get locked into predictable patterns in this way, if that makes any sense at all? could a pro beat alphago with enough practice against it?

Theoretically you could. But this would require you to be able to calculate deeper than alpha go can, which is just not possible for a human.

I also imagine in the field deployed version of the algorithm, they add some minor randomness into the move decision process to prevent it from playing the same game over and over.

of course, but it might still be possible to get a sense of its overall strategy

I would say its strategy is to accumulate small advantages over the course of the game, whereas human players will generally go for bigger advantages, or gambits/bluffs. AlphaGo is perfectly content making a move that keeps the game even, because it knows it can find an advantage later.

>it simply wins and has a strategy orientated around it's strengths.

Sounds like every contestant in every contest ever. Including wars.

OP get over yourself.

and any MCTS algorithm will calculate "deeper" in terms of search depth/breadth than a human player.

>rather than making the best possible moves.
Of course it doesn't make the best possible moves, as it would be computationally impossible to calculate them.