/lesson/

/lesson/

Today I will be presenting an introduction to interactive proofs, a concept in complexity theory and complexity-based cryptography.

First, I'll give a layman's explanation with no technical knowledge required.

Then, I'll give an interactive proof for graph non-isomorphism.

Finally, I'll be talking about more advanced topics, like Arthur-Merlin protocols and the set lower bound protocol, zero-knowledge proofs, formal definitions of zero-knowledge, and references to learn more.

Please post any questions if you have them.

Other urls found in this thread:

u.cs.biu.ac.il/~lindell/89-856/complete-89-856.pdf.
en.wikipedia.org/wiki/Shamir's_Secret_Sharing.
twitter.com/AnonBabble

What can this help achieve our solve? What real world examples can this be used for?

In interactive proofs, there are two parties: the prover and the verifier. Generally the prover has more computational power, or some secret knowledge that the verifier does not have.

We want to design a protocol for the prover to prove some statement to the verifier.

If the statement is true and the prover is honest, the verifier should believe the statement is true.

If the statement is false and the prover is dishonestly trying to prove the statement, the verifier should catch the prover in the lie with some high probability, despite being less powerful than the prover.

Consider the following scenario. You are V and your friend is P. You are colorblind, but your friend P wants to convince you that the two socks she is holding in her hands are of different colors (in other respects, they are identical - so you cannot tell them apart yourself). How can your friend convince you that the socks are different colors, even if you're colorblind?

This is a great question. It turns out zero-knowledge proofs are extremely important in cryptography, and your computer does IPs with servers every time you go to a HTTPS website. I'll talk more about this at the end.

The solution is a bit tricky. The insight is that, if the socks were the same color, then your friend P couldn't tell the difference between the two either.

You hold the two socks, one in your left hand and one in your right hand, and your friend notices which sock is held in which hand. Then she leaves the room. You flip a coin, and if it's heads, you keep the socks in the same hands, and if it's tails, you switch the socks. Then you put the coin back in your pocket, and call your friend back into the room.

Your question to her is "Did I switch the socks?"

Now, if the socks are truly different colors, it's easy to answer. But if they were the same, your friend can't tell whether you switched them or not. So she only has a 1/2 chance of getting it right, by just guessing.

Question to the reader: This only gives you a 1/2 chance of catching the prover in a lie. How can you amplify this so you're almost 100% certain that she's not lying?

Less entropy, more variables?

My physics knowledge is a little bit rusty, but not really. Try again, you're allowed to design the protocol however you want.

Either continue until she makes a mistake, or until you've tried enough times that you've crossed some threshold that means the probability is next to none

Otherwise this makes me want to give the sarcastic answer "trust" or that you shouldn't be bothering with someone that is willing to confuse you over socks

Maybe a third party, that is also not colorblind, is put through the sock swap test to see if they can tell when the socks were swapped?

>Maybe a third party, that is also not colorblind, is put through the sock swap test to see if they can tell when the socks were swapped?
Sure

>Either continue until she makes a mistake, or until you've tried enough times that you've crossed some threshold that means the probability is next to none
Exactly. If we know that the rounds are independent of each other (interested reader can try proving this), then it's easy to see the chance of making no mistakes is 1/2^n where n is the number of rounds. So within only 10 rounds you're 99.9% sure that she isn't lying. In some sense, each round is like a third party since they're independent.

>Otherwise this makes me want to give the sarcastic answer "trust" or that you shouldn't be bothering with someone that is willing to confuse you over socks
Sometimes on the internet, people pretend to be someone they're not, so it's possible they're a malicious attacker pretending to be your friend instead.

One thing to note: this protocol is information-theoretically zero-knowledge. What I mean is that you didn't actually learn which socks were which color, or anything other than the fact that the socks are different colors.

How can you see this? Because every question you asked, you already had the answer to. So even a correct answer from an honest prover wouldn't be giving you any information. This is different than, for example, than you and your friend buying an spectrometre and measuring
the wavelengths of the socks: then you do actually learn something more than just that the socks have different colors.

Next I'll be giving an interactive proof for graph non-isomorphism, that is very similar in spirit, but we're going to formalize it just to see it can be done. I will assume basic probability and algebra knowledge.

This is a great thread.

Preliminaries.

A graph is a set of nodes and edges. The OP has a picture of two graphs. We can represent a graph as a list of nodes, labeled however you want, like [A, B, C, D, E], and a list of edges between nodes, like [[A, C], [D, B], [B, E], [A, E]].

Clearly there are different ways to represent the same graph in this format. I could have called A C and C A. Or I could've said [E, B] instead of [B, E], or rearrange the order of the edges. In general, given two representations of graphs, it turns out that it's not very easy to check if they're the "same" graph, and by same graph, I mean that you can tell me which nodes in the first graph correspond to which nodes in the second graph such that all the edges line up. (again, see picture in the OP)

So, the problem of checking whether two graphs are same is called graph isomorphism.

We'll be presenting a protocol for graph non-isomorphism. So, you and the prover both know about two graphs, and the prover wants to show you that they're not isomorphic. Assume both graphs' nodes are labeled 1..N.

Any suggestions? Note that in this case, if you simply give one of the graphs to the prover, it's easy to tell which one you gave. How can you "mix it up"?

It is illuminati thread where OP is broadcasting knowledge we otherwise have to pay for

God speed OP you are fighting the good fight

Forgot to mention. You the verifier don't have enough computational power to check whether the graphs are isomorphic or not (remember I said it's a hard problem?).

But the prover has unlimited computational power, she can solve anything she wants instantly.

Two of the problems are easy to fix.

>Or I could've said [E, B] instead of [B, E]
We can just agree that we'll always put the smaller letter first when we're talking about graphs.
>or rearrange the order of the edges.
We can consider each edge as a two-letter word and order them like the dictionary.

Both of these, it's very easy to do yourself and check that a given graph is the same as after applying these transformations.

>I could have called A C and C A.
But this is a hard one. In general, there are n! permutations of n nodes (reply if you don't understand what this means).

So maybe one graph has nodes 1, 2, 3, 4, 5, but the other one's, respectively, a re 4, 5, 3, 1, 2. If you wanted to check for equality you kind of have to check all n! permutations (technically this is not true due to a recent result of Laszlo Babai in 2015 - but just believe me that this still isn't easy to do).

We've isolated the problem to be, given two list of nodes, does there exist some permutation of the first list such that if you relabeled the edges and ordered it like I mentioned above, the edges match up?

Back to the interactive protocol. The idea is, you want to give one of the graphs at random to the prover, and the prover tries to tell you which one you gave (using their infinite power).

As I mentioned, the prover can just see which one you gave her by comparing the representations. So what you do is you randomly pick a permutation, maybe by flipping coins. This is easy since you only have to come up with one permutation, not all n!. Then you apply this permutation to one of the graphs, also picked at random, and send this graph to the prover.

If the two graphs weren't isomorphic, then the graph you send is only isomorphic to one of the original graphs, because all you did was rearrange the nodes, you didn't actually change the structure of the graph.

Anyone want to explain why the prover fails in the case that the two graphs were isomorphic?

Would that be comparing the relations between each individual node and then using their miss-matched-ness as proof that they're non-isomorphic? Using smaller quanta for the smaller computer and then sumarizing the set of info?

I don't know what I'm doing, i graduated highschool and then worked at a supermarket for two years

I'm not entirely sure what you mean. To formalize what "hard" and "easy" mean it would take forever. But what I mean is the verifier's computer just isn't strong enough to check whether the graphs are isomorphic or not.

Here are two graphs I just made in Python. Both vertices list are 1 through 100, both have 50 edges.

Graph 1
E = [[61, 64], [95, 22], [50, 29], [14, 32], [28, 56], [41, 77], [41, 42], [80, 82], [26, 4], [57, 99], [12, 17], [87, 69], [93, 13], [20, 4], [76, 14], [4, 94], [53, 98], [77, 74], [71, 28], [9, 90], [58, 60], [49, 59], [24, 54], [19, 12], [44, 34], [2, 32], [1, 54], [77, 12], [37, 37], [17, 86], [16, 92], [65, 52], [40, 30], [37, 71], [51, 43], [14, 9], [13, 67], [59, 31], [14, 75], [33, 83], [16, 77], [3, 33], [94, 72], [70, 22], [96, 87], [86, 78], [46, 36], [53, 33], [98, 4], [71, 22]]

Graph 2
E = [[40, 25], [80, 38], [30, 69], [93, 59], [1, 50], [90, 85], [14, 20], [68, 27], [7, 55], [61, 74], [37, 80], [33, 64], [86, 49], [83, 32], [37, 13], [4, 75], [87, 72], [4, 24], [50, 42], [46, 92], [50, 80], [67, 20], [20, 63], [63, 71], [10, 87], [44, 71], [30, 89], [29, 73], [82, 47], [88, 5], [53, 6], [82, 56], [51, 51], [54, 32], [53, 67], [69, 75], [33, 8], [55, 100], [53, 16], [53, 19], [89, 43], [4, 96], [9, 58], [21, 2], [66, 19], [51, 65], [80, 87], [46, 17], [24, 58], [69, 1]]

It's not really that easy to check if there's some way to relabel the vertices that would make all these edges to match up.k

What I mean to say is, if the prover tried to prove for each permutation why it doesn't work, the verifier just doesn't have enough power to even verify the prover's results, since there would be so much data sent (on the order of n!, which is really really large - for 100 nodes, there are 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000 permutations).

>Anyone want to explain why the prover fails in the case that the two graphs were isomorphic?

Basically, the distribution of permutations is identical for both graphs if they're isomorphic. Say the first graph is G, the second is H, the one you sent is K.

It's equally likely to generate a permutation phi such that phi(G) = K than some other permutation psi such that psi(H) = K if they're isomorphic.

You're always just shuffling between a few of the n! permutations. If you know group theory this should be clear since permutations form a group. Otherwise, hopefully it's clear as well, but let me know if it isn't.

Also note that this protocol too is information theoretically zero-knowledge.

Next I'll just be briefly reviewing some advanced topics. Reply if you have any questions about anything!

>she
god damn you...

hey hey hold on
if the socks are different she can trick you into believing they're the same by saying the wrong thing a couple of times even though she knows it better.
Your procedure has two outcomes: she is always right or 50% of the time wrong.
but the second option has two explanations: the socks are different but she doesnt want to show you or the sockd are the same.

Bretty gud pleb filter

it's like 230 am where I am lol
I'll participate when I wake up

excellent thread so far, man. we need more of this.

Great observation. This is why we say explicitly, "we are proving the socks are different colors". If the protocol fails, then we don't believe anything about the socks, for exactly the reason you mention.

OP, you're doing a great job. I'm really for making Veeky Forums a great place to discuss science and exercise our brains

But you can't. It's dead. The Inane Trolling Garbage mod murdered it in 2013 and drastically reduced the traffic to the point of going from two posts every minute to two posts every hour or so.

It's over.

Yeah you have to limit OPs example to what he told you. The girl is obligated to tell the truth when describing the socks, in order for the example to make sense

The important parts were getting an answer or if a situation without you yourself knowing about the situation

So you're treating the entire attempt at comparing the graphs as one data point, then you're comparing the attempts? My brain was putting it together as finding the resonance and seeing if it harmonizes. I think I'm learning i dont know. I hope i am

Well it is science after all...it was never popular to begin with. People want to be entertained with anime,music,computers or just shitpost about politics. Its not funny for them when their brain hurts

The prover can always just be like "Fuck off verifier, I'm outta here" and stop responding. The only time you learn anything is if the protocol succeeds and she did get all the answers right, otherwise, your belief about the state of the socks should not change.

This way no matter if the prover is lying or telling the truth no matter what the socks are lying, you never end up believing something false

I'm sick of seeing "1=0 math disproven" and "philosophy is the best! no fuck philosophy" threads and other obvious shitposting. Even the homework help threads are better. And I know that there are smart people on Veeky Forums, so this current state is pretty bad. I really think the anonymous imageboard format is one that encourages honest discussion. That's the reason for this thread.

----------------

Zero-knowledge proofs. We've seen two information-theoretically zero knowledge proofs, but there's a weaker notion of computationally-theoretically zero knowledge proof. This basically means that for any reasonable computer (polynomial-time, if you know about complexity already), you won't learn anything more than just the statement itself.

Here's an idea of what you can do. Let's say you have a password to a service and the server knows it too. But you're going across an insecure communication channel. How can you prove to the server that you really do have the correct password?

One way to do it is to construct a zero knowledge proof where the answer is "I have the password". But if it's zero-knowledge, no one who intercepts the messages learn anything about the password, so this is secure. The actual construction of this is pretty complicated, but there are several papers.

------------------------

Arthur Merlin protocols

One crucial part of both of these protocols were that your randomness is secret. If the prover saw your coins, then she could always guess correctly, every time. But what if we wanted our protocols to be secure even if she could somehow see our coin flips (maybe by hacking into our computer)?

We can use something called a Set Lower Bound protocol, which is due to Goldwasser and Sipser, next.

I'll be honest, I have no idea what you're talking about.

Did you understand the sock example? If you did then you basically have the concept, the graph example is just a formalization.

The idea is if you have two isomorphic graphs and randomly shuffle one of them, then you can't tell if you shuffled the first one or the second one.

Proof. Consider isomorphic graphs A and B. By assumption there exists a function f such that f(A) = B.

Then shuffle B with a random permutation g to get C, say g(B) = C.

By plugging in, we get g(f(A)) = C. But the composition of two permutations, g and f, is just another permutation, call it h, so h(A) = C. And we chose a random permutation from all possible permutations, so it's just as likely we chose h as f. So the prover can't tell with any probability which graph we permuted and sent.

Set Lower Bound

Let there exist some set S, containing strings that all have some property. For example, all the strings in S could be representations of primes under 5 million.

The prover wants to prove that the size of S is at least some number K. If this is true then the prover should be able to prove this to you.

If it's really less than K/2, then you should reject with high probability.

Also assume that given a single string, you can check if it's in S. But K is so large, you can't ask the prover to just give you all the strings in S and check them all, that would be too much work.

What we do is agree on a random hash function. A hash function is just something that takes a long string and makes it into a small string (maybe 128 characters long) in a fixed way. The point is that the hash function kind of shuffles around the inputs, after passing through the hash function, they LOOK random and are evenly distributed.

Then all you do is choose a random 128 character string, and ask the prover to give you some string in S that when the hash function is applied, it equals the string you picked. There's a fair bit of math involved but it should be clear that if S is really large, this is easy, since most random strings will have some preimage in S, but if S is small, then it's hard to find a preimage, since there aren't that many strings in S to start with. Remember you can verify yourself that some string is in S.

Note that the randomness wasn't secret in this protocol, the prover can know which random string you chose (in fact, you told her yourself).

Now to generalize this to all interactive proofs, just let S be something like "The set of all strings that are encodings of conversations between you and me that would lead to me believing 2 graphs are not isomorphic". If this set is large, then it's likely the two graphs are not isomorphic.

IP = PSPACE (Little technical knowledge required)

I won't go into too many details, but a very very surprising result due to Adi Shamir proved that IP = PSPACE unconditionally. What this means is that the things you can prove with interactive proofs is very very very large.

Problems in PSPACE include things like generalized chess (an arbitrary size board but dont let the game go on forever).

This means that someone with infinite power can prove to you some sequence of moves in chess is optimal, and you can make sure she's not telling any lies, even though solving chess yourself is basically impossible!

--------

Formal definition of computational zero knowledge. CS/stats knowledge required.
This is just an interesting way to state what zero knowledge means.

Define a PPT as a probablistic polynomial-time Turing machine (or computer, if you don't know what Turing machines are).

Then two distributions of random variables A, and B, are indistinguishable if for any PPT (there are infinite of them), given some list of values from either A or B, they can't tell if the values came from A and B.

The random variables themselves could be very different, but as long as they're close enough such that you can't tell them apart in polynomial-time, we're good.

Then we define the zero knowledge property as, if an IP is ZK, and the statement we're trying to prove is true, then there exists some polynomial time simulator S that can simulate the prover's responses to the verifier. Note that S doesn't have infinite power, S has the same amount of power as V, but S can still produce the same distribution of output that the prover does.

This seems kind of weird right? That's why the invention of Zero Knowledge Proofs (by Goldwasser) was so revolutionary.

If you're interested in this, then I'd suggest looking at www.cs.cornell.edu/courses/cs6830/2009fa/scribes/lecture17.pdf, which suggests that assuming that commitment schemes exist, then all of NP has zero knowledge proofs.

Interesting stuff, OP.

Took a discrete mathematics class recently and that really got me in to this side of math and computer science, though I've always been interested in computer security in general.

Thanks for sharing

It turns out, due to a result of Goldwasser, Sipser, Micali, et al., assuming one-way functions exist, every single interactive proof also admits a zero knowledge proof. If you're following along a proof sketch is easy.

Convert the IP to be an Arthur-Merlin protocol, so the coins are public.

But then, instead of a plaintext response, the prover encrypts her messages before sending them to you so you can't understand them. So how do you verify? Well, the statement "If you know the encryption key, you would accept" is in NP. So you can reduce to 3-coloring (See the Cornell link I posted above) and prove that instead.

This has very serious implications in cryptography, and also in real life, since we do believe there exist one-way functions and we know of a few candidates (that are used by your computer every day to communicate securely over the Internet).

------------

If you're a math/CS person and want to learn more, I recommend Arora and Barak's Computational Complexity textbook, available online. For in depth information on zero knowledge proofs, I recommend Yehuda Lindell's notes: u.cs.biu.ac.il/~lindell/89-856/complete-89-856.pdf.

-------------

This is it for me, for new information. Please respond if you have any questions about any of this, or if you're looking for more information.

A cute example I learned in discrete math was Shamir's Secret Sharing: en.wikipedia.org/wiki/Shamir's_Secret_Sharing. Only requires polynomials and modular arithmetic.

This is the same Shamir mentioned in . Also, the guy who invented Arthur Merlin proofs several decades ago is Laszlo Babai, the same person who came up with a quasipoly time algorithm for graph isomorphism. These people do a lot!

Can we make this a new general? Like everyweek someone decides to introduce a topic they're interested in and explain it to other anons. The topics could be anything science or maths related like this one just accessible to most people.

Seriously this is one of the best threads I've seen on sci in a while.

Great thread, OP!

If you wish to have more interesting, fun and educational threads on sci instead of shitty baits and popsci retards, as I know I do, take a look at this thread (it's not very presentable yet haha) and vote for the future of this board.

I know there are plenty of people just like you, and many more people who are interested in the content people like you produce, but they are barely able to communicate as they are drowning in a swamp of shit. This is not a way to go!

if you're never 100% sure about the socks how is it a real proof?

This is an interesting question both computationally and philosophically.

Most proof systems guarantee 100% soundness (for example Fitch-style for prop logic). So why are we justified claiming an IP is sound even if it's not 100% sound?

The idea is that it's because we can get arbitrarily close to 100% by just increasing the number of rounds.

If the IP guaranteed 99.999999999999% correctness no matter how many rounds there were, then this does not satisfy the definition of soundness. It has to become smaller with every round, and generally just repeating the rounds will lead to exponential amplification.

Thus you can get as close to 100% as you like without much work. Consider these two events.

A) You end up believing the socks are different colors when they're the same
B) Someone randomly picks an atom from the entire universe and it happens to be within your body

If you do the sock protocol 300 times, B is more likely than A. So for all practical purposes, we're good.

The other reason is that if we were to insist on 100% soundness, it turns out the number of things we could prove becomes very very small, in fact it's equal to the statements in NP, if you know complexity theory. So our interactive proofs aren't as powerful as they are if we're okay with arbitrarily high probability. Does that make sense?