ITT: We share things we realized that made us feel clever (be they stupid/obvious or not)

Sorry if this is pretty obvious for those deeper into CS and perhaps AI shit.

I'm studying mathematical logic. In simple propositional logic, you have formulas, and you can deduce things from them using inference rules. So, say you have an argument with a bunch of premises, P1, P2, ... , Pn, and a conclusion supposedly drawn from them. To prove that the conclusion can be drawn from the premises, you have to play around with the premises, using inference rules to modify the premise formulas, until you reach the conclusion.

So, I realized, say you have a formula. When you apply an inference rule to it, you create a new formula, which is "connected" to the previous one by the rule you used. From every new formula you can "spawn" new ones using inference rules.

In this way, you have a directed, weighted graph where nodes are logical formulas, weights are inference rules, and the edge directions represent a relationship "original formula => formula generated by applying the weight/inference rule of this edge". Thus, and here's the thing, finding a proof is a graph search problem: you start from a bunch of formulas that are your premises, generate a graph with formulas, and then you search for your conclusion in those formulas. The proof *is* the path between your premises and your conclusion.
I'm still trying to tie in the idea of having *multiple* premises to draw a conclusion from; perhaps you do a big AND with all of them and then use that as your single source for the graph search algorithm.
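Here's a rough Python sketch of what I mean, in case the pic doesn't make it clear. The formula encoding and the two inference rules are placeholders I made up just to show the shape of the thing; the point is only that once formulas are nodes and rule applications are edges, proof search is literally BFS, and the chain of parent pointers you walk back is the proof.

[code]
from collections import deque

# Formulas as nested tuples, e.g. ("->", "p", "q") for p -> q, ("and", "p", "q") for p & q.

def apply_rules(formula, known):
    """Yield (new_formula, rule_name) pairs derivable in one step."""
    # Modus ponens: from p and a known p -> q, derive q.
    for other in known:
        if isinstance(other, tuple) and other[0] == "->" and other[1] == formula:
            yield other[2], "modus ponens"
    # Conjunction elimination: from p & q, derive p and derive q.
    if isinstance(formula, tuple) and formula[0] == "and":
        yield formula[1], "and-elim-left"
        yield formula[2], "and-elim-right"

def prove(premises, goal):
    """BFS over formulas; returns the proof as a list of (formula, rule) steps, or None."""
    known = set(premises)
    parent = {p: None for p in premises}        # edge back to the formula each one came from
    queue = deque(premises)
    while queue:
        current = queue.popleft()
        if current == goal:
            steps, node = [], current
            while node is not None:
                edge = parent[node]
                steps.append((node, edge[1] if edge else "premise"))
                node = edge[0] if edge else None
            return list(reversed(steps))        # the path through the graph *is* the proof
        for new, rule in apply_rules(current, known):
            if new not in known:
                known.add(new)
                parent[new] = (current, rule)
                queue.append(new)
    return None

# Example: premises p, p -> q, q -> r; goal r.
print(prove({"p", ("->", "p", "q"), ("->", "q", "r")}, "r"))
[/code]

Note the multiple premises just become multiple start nodes of the search, which comes to the same thing as ANDing them into one formula and immediately applying and-elimination.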

Pic related, sorry for my poor Krita skills.


I just made a thread asking for good logic books. What are you using to learn logic?

That's actually really cool OP! The problem you run into is the computing power necessary to properly map any relationships that aren't trivial.

If you find a solution, email it to me and I'll claim the million dollar P=NP proof prize on your behalf.

I am teaching a graph theory class, and we were proving some theorem that states that several statements are equivalent. I made a point to explain that we could view each statement as a node, and each time we prove some statement implies another we would draw a directed edge, and we are done proving the theorem once the directed graph is strongly connected.

I think only one student had any idea what the fuck I was saying. Still worth it.
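If anyone wants to see the bookkeeping, here's a tiny sketch (hypothetical statement labels; "all equivalent" once every statement can reach every other along the proved implications, i.e. the digraph is strongly connected):

[code]
# Statements A..D as nodes; each proved implication is a directed edge.
implications = {
    "A": ["B"],   # we proved A => B
    "B": ["C"],
    "C": ["D"],
    "D": ["A"],   # closing the cycle makes the digraph strongly connected
}

def reachable(start, edges):
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, []))
    return seen

nodes = set(implications)
print(all(reachable(v, implications) == nodes for v in nodes))   # True: the four statements are equivalent
[/code]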

Yeah, I was thinking about how computationally expensive this shit can be, and how you can't just go around generating new formulas using inference rules at random, you have to do it with some heuristic.

For multiple premises, either model the proof as a rooted tree (with the conclusion as root and the premises as leaves) or combine all the premises into a single compound proposition using logical conjunction (if finitely many premises) or the universal quantifier (if infinitely many).

>simple thing you've realized
Category theory is basically set theory but with the axiomatization being carried out entirely on functions, so there are no axioms on sets (in particular, objects-as-sets need not be extensional or wellfounded).

This is the textbook of my logic class.

Noice! I still have to take graph theory, in my uni we take it along with combinatorics, not sure if this is a special case or the norm. Anyway it seems like pretty hard shit.

>weighted graph
labeled graph

What you are saying sounds a lot like Natural deduction to me: en.wikipedia.org/wiki/Natural_deduction

Yes, I was thinking about natural deduction, but I had never seen a proof phrased as a path in a graph of formulas. I guess AI uses similar thinking, because there's a lot of talk about search in state spaces and whatnot (where, I think, actions are edges and states are nodes).

Pretty much right OP.

Automated theorem proving is basically what you just described.

However this is not a universal approach. There are some theorems which you will never be able to prove by an automated method.

I assigned personality to numbers and it helps me to link them to other numbers.
5 is a dark, handsome businessman type. 7 is a hip number that is sociable but edgy. 3 is kind of retarded in a childish way.
I'm not a synesthete, but this helps me to link numbers for quick mental calculations.
At least it makes basic calculation more fun.

When I was in 9th grade I came up with a formula for the area of a triangle that only used side lengths

A = 1/2*a*b*sqrt(1 - ((c^2-a^2-b^2)/(-2ab))^2)

This will work no matter what sides you choose for a, b, and c

made me feel like less of a brainlet for a while
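With the parentheses as above, that's (1/2)ab times sqrt(1 - cos^2 C) with cos C taken from the law of cosines, so it should agree with Heron's formula. Quick numeric sanity check (arbitrary made-up side lengths):

[code]
from math import sqrt

def area_anon(a, b, c):
    # (1/2)ab * sqrt(1 - cos^2 C), with cos C = (c^2 - a^2 - b^2) / (-2ab).
    cos_c = (c**2 - a**2 - b**2) / (-2 * a * b)
    return 0.5 * a * b * sqrt(1 - cos_c**2)

def area_heron(a, b, c):
    s = (a + b + c) / 2
    return sqrt(s * (s - a) * (s - b) * (s - c))

for sides in [(3, 4, 5), (7, 8, 9), (2.5, 6.1, 7.0)]:
    print(area_anon(*sides), area_heron(*sides))   # the two values agree for each triangle
[/code]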

Because of NP completeness or something like that?

pretty cool.
what you can conclude from this is the ubiquity of graph theory in problem solving. it can literally be used anywhere. one thing that doesn't seem right, however, a small nitpick: instead of 'weights' being inference rules, you should probably call them 'actions' or something of that sort. perhaps you can add an actual 'weight' to each (node, action) pair depending on some heuristic indicating how 'good' that action is
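roughly, the search would then look like this sketch (greedy best-first; `apply_rules` and `heuristic` are stand-ins for whatever rule generator and scoring function you'd actually plug in):

[code]
import heapq
from itertools import count

def best_first_prove(premises, goal, apply_rules, heuristic):
    """Expand the most promising formula first. `apply_rules(f)` yields (new_formula, rule)
    pairs; `heuristic(f, goal)` scores how promising f looks (lower = better). Both stand-ins."""
    tie = count()                                   # tie-breaker so the heap never compares formulas
    frontier = [(heuristic(p, goal), next(tie), p) for p in premises]
    heapq.heapify(frontier)
    seen = set(premises)
    while frontier:
        _, _, current = heapq.heappop(frontier)
        if current == goal:
            return True
        for new, _rule in apply_rules(current):
            if new not in seen:
                seen.add(new)
                heapq.heappush(frontier, (heuristic(new, goal), next(tie), new))
    return False
[/code]

same skeleton as plain BFS, just a priority queue keyed on the heuristic instead of a FIFO.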

Thanks for weighing in! lol
You're right, weights kinda suggest quantity, and here we have what seem like functions (?) that map one formula into another.

My idea of considering them "weights" was that sometimes edges would take the same "values" (i.e. the same inference rule), and other times different ones. I guess I was thinking of an enumeration as in C-like programming languages.

This is actual autism

You can indeed view logic this way, OP; inference rules with multiple clauses can be represented either as multi-edges (like in Petri nets) or by having nodes represent, not propositions, but conjunctions of propositions.

I don't think this visualization actually buys you anything, though. It is not a practical way to view a particular proof (you need to represent irrelevant nodes as well, to keep the depiction from being trivial and content-free), and graph algorithms are not a practical way to *explore* this graph either.

>There are some theorems which you will never be able to prove by an automated method.
Can you elaborate on that?

No it's not. It's essentially a memory aid.

When I was bored in lecture today I realized that "r" is technically a vowel

I was trying to think about differential forms and how to understand them "concretely". I started from the fact that if M is an n-manifold, then an n-form associates to each point x the "infinitesimal" tangent volume at x, and then realized that a form could always be seen as some sort of volume, but that we have to rethink the notion of volume.
A differential form is a map that associates to each point on a manifold a k-linear alternating form on the tangent space.
So first, we need to think about alternating multilinear forms. A great example of such a form on a vector space V of dimension n, with a basis [math]B[/math], is the determinant [math]\det_B[/math], which associates to n vectors the oriented volume (relative to B) of the parallelotope generated by those vectors.
Now, of course, it is not so easy to understand k-linear alternating forms on a vector space of dimension n for k < n and what they measure.
The point I want to make is that they still measure "the" k-volume of a k-parallelotope, only that because k < n, there are many (independent) ways to do it.
(cont.)

(cont'd)
What I mean is that the volume of a parallelotope generated by k vectors [math]u_1, \dots, u_k[/math] should be thought of as the wedge product [math]u_1 \wedge \dots \wedge u_k[/math] (which, as you notice, is intrinsic) and that each linear form [math]\phi \in \left(\Lambda^k V\right)^*[/math] is a way of measuring volumes (it's a way of interpreting the property that k-linear alternating maps factor through the k-th exterior power)
When n = k, there is only one such way of measuring, up to a constant scaling factor (corresponding to a choice of orientation and units; it's the example of the determinant with respect to a basis that we discussed above), but when k < n, there are several: for example, if you choose a basis [math]v_1, \dots, v_n[/math], you can define such a measure as the volume of the projection onto [math]\langle v_{i_1}, \dots, v_{i_k} \rangle[/math] along [math]\langle v_j, j \not \in \{i_1,\dots, i_k\}\rangle[/math] for each [math]i_1 < \dots < i_k[/math].
Now finally, what is a differential k-form ? It's a way to coherently assign a weighted volume to each tangent k-parallelotope (that tangent parallelotope represents an "infinitesimal element of k-volume").
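A concrete instance of the projection picture, for the standard 2-form [math]dx \wedge dy[/math] on [math]\mathbb{R}^3[/math]:
[eqn](dx \wedge dy)(u, v) = u_x v_y - u_y v_x = \det\begin{pmatrix} u_x & v_x \\ u_y & v_y \end{pmatrix}[/eqn]
i.e. the signed area of the parallelogram spanned by u and v after projecting onto the xy-plane along the z-axis; a general constant 2-form on [math]\mathbb{R}^3[/math] is a combination [math]a\, dx \wedge dy + b\, dy \wedge dz + c\, dz \wedge dx[/math], a weighted mix of the three coordinate-plane projections of that parallelogram.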

Show and tell thread? Cool.

Alright, here's all I got for things that felt cool to discover independently. Now bear in mind I suck at math and am balls retarded, so when I "discovered" these I had an almost perma-buzz the whole day.

[eqn]\sum_{k=0}^{\infty} \frac{1}{n^k}=\frac{n}{n-1}[/eqn]
[eqn]\sum_{n=1}^{\infty} \frac{1}{2^n}=1[/eqn]
[eqn]\left(\sum_{i=1}^ni\right)^2 = \sum_{i=1}^n i^3[/eqn]
[eqn](n-1)\sum_{k=1}^{\infty} \frac{1}{n^k}=1[/eqn]
[eqn]\left(\sum_{i=1}^{n}i^k = \frac{a_1 n^{d_1} + a_2 n^{d_2} + \dots + a_m n^{d_m}}{c}\right)\rightarrow \left(a_1 + a_2 + \dots + a_m = c\right)[/eqn]

Last one is a little tricky to state in a formula because once again I am retarded. Basically, in the closed-form equation for the sum of the first n positive integers raised to any power k, the sum of the coefficients of the terms in the numerator is always equal to the denominator. At first I thought it was just the number of terms, in which case I could just say m is equal to c, but some of the equations have c + 1 terms with one of them negative, so if you just add the coefficients you still get c.
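If anyone wants to poke at the first four numerically (the infinite sums are truncated, so this proves nothing, it just shows the numbers line up):

[code]
from fractions import Fraction

# (1) sum_{k>=0} 1/n^k = n/(n-1), truncated at 60 terms, for a few values of n.
for n in range(2, 6):
    s = sum(Fraction(1, n**k) for k in range(0, 60))
    print(float(s), n / (n - 1))

# (2) sum_{n>=1} 1/2^n -> 1 (the n = 2 case of (1) minus the k = 0 term).
print(float(sum(Fraction(1, 2**k) for k in range(1, 60))))

# (3) (1 + 2 + ... + m)^2 == 1^3 + 2^3 + ... + m^3, exactly.
for m in range(1, 20):
    assert sum(range(1, m + 1)) ** 2 == sum(i**3 for i in range(1, m + 1))

# (4) (n-1) * sum_{k>=1} 1/n^k = 1, again truncated.
for n in range(2, 6):
    print(float((n - 1) * sum(Fraction(1, n**k) for k in range(1, 60))))
[/code]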

desu, I never understood where those area formulas for simple geometric objects come from

like base*height/2: is this the area by definition? how do we know this is an area?

>I'm still trying to tie the idea of having *multiple* premises to draw a conclusion from

you could use a petri net

I guess for any k-form [math] \omega [/math] you can find a basis [math] v_1,\dots,v_n [/math] such that [math] \omega = v_1^* \wedge \dots \wedge v_k^* [/math]. then [math] \omega [/math] is some sort of volume in the vector space generated by [math] v_1,\dots,v_k [/math] (which is unique up to scaling) and the value of [math] \omega [/math] on any other vectors is the volume of the projection onto this subspace. is this true?

dude just integrate the characteristic function of the triangle

Provided you already know the area of a rectangle. That being said, there *is* something inherently complicated about the idea of area/volume.

>I'm still trying to tie the idea of having *multiple* premises to draw a conclusion from, perhaps you do a big AND with all of them and then use that as your single-source for the graph search algorithm.

currying, using AND

The hypergraph is the concept you're looking for.

As somebody else has mentioned, you're going to run into a brick wall. For example, if you claim that you can construct a complete algorithm (one that, on any input, either produces a proof or establishes that the input is a contradiction) running in time polynomial in the length of the target proposition, you would solve the satisfiability problem fairly trivially as follows:

Take a formula [math]\Phi[/math] in 3-CNF. Obtain [math]\neg\Phi[/math] in 3-DNF. Then return 0 as the satisfiability of the original formula [math]\Phi[/math] if and only if [math]T \iff \neg\Phi[/math].

If you can do what you hope to do, you will have shown that P=NP.
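The equivalence that reduction leans on ("Phi is unsatisfiable iff not-Phi is a tautology") is easy to see by brute force on a toy formula; the brute force below is of course exactly the exponential enumeration a polynomial-time prover would have to avoid:

[code]
from itertools import product

def satisfiable(formula, variables):
    """formula: a function from an assignment dict to bool."""
    return any(formula(dict(zip(variables, values)))
               for values in product([False, True], repeat=len(variables)))

def tautology(formula, variables):
    return all(formula(dict(zip(variables, values)))
               for values in product([False, True], repeat=len(variables)))

# Phi = (x or y) and (not x or y) and (not y): unsatisfiable.
phi = lambda a: (a["x"] or a["y"]) and (not a["x"] or a["y"]) and not a["y"]
not_phi = lambda a: not phi(a)

print(satisfiable(phi, ["x", "y"]), tautology(not_phi, ["x", "y"]))   # False True
[/code]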

Also, on the topic of the hypergraph, you can use the concept to begin to intuitively see why the problem of what you're doing is hard in the computational sense: in a hypergraph, any edge may have one or more sources and one or more destinations.

In particular, we would expect any directed edge coming from a proposition to, in fact, have at least two destinations, and the reason is simple: it's trivially obvious that, given one proposition, you should be able to prove at least two propositions from it using the same inference rule. Let's now say that I grant you that the finite sub-hypergraph restricted to your propositions and proofs of interest is polynomial in width. Now, what does it mean to follow a hyper-edge in this graph? Moreover, what does it mean to follow a path in this hypergraph? Since we know that every edge has two destinations, following an edge inherently presents a choice of two nodes at which to arrive.

This concept of choice is where your proposed method will begin to unravel. You see, even if we have an optimal procedure in the form of an O(1) oracle which tells us which edge to follow in the hypergraph between our current node and our desired conclusion (or contradiction), we would still need to visit both nodes at each junction of choice in order to guarantee that we always encompass the shortest path to our conclusion; therefore, in order to make sure we never miss a junction, where the path length is p(n) for some polynomial wrt. n, the length of the target, we actually need to visit 2^p(n) nodes, requiring exponential time.
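You can see the growth pattern on even a dumb toy model: give every "formula" (here just a string) two successors per step and count the frontier. The particular successors are made up; only the doubling matters:

[code]
def successors(formula):
    return [formula + "a", formula + "b"]   # stand-in for "two conclusions per rule application"

frontier = ["p"]
for depth in range(1, 11):
    frontier = [new for f in frontier for new in successors(f)]
    print(depth, len(frontier))             # 2, 4, 8, ..., 1024: exponential in the depth
[/code]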

What you've described seems to be close to older methods in SAT solving.

See here: en.wikipedia.org/wiki/Resolution_(logic)
You could also improve it somewhat by "projecting" onto a standard digraph (with some loss, of course) and making use of backtracking: en.wikipedia.org/wiki/DPLL_algorithm
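For the flavor of DPLL without reading the article, here's a bare-bones sketch (clauses as lists of signed ints, DIMACS style, so 3 means x3 and -3 means not-x3; just unit propagation plus split-and-backtrack, none of the modern refinements):

[code]
# Minimal DPLL sketch. A formula is a list of clauses; a clause is a list of nonzero ints.

def simplify(clauses, assignment):
    """Drop satisfied clauses and falsified literals under the partial assignment."""
    out = []
    for clause in clauses:
        new_clause, satisfied = [], False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                new_clause.append(lit)
        if not satisfied:
            out.append(new_clause)
    return out

def unit_propagate(clauses, assignment):
    """Repeatedly assign variables forced by unit clauses; None signals a conflict."""
    while True:
        simplified = simplify(clauses, assignment)
        if any(len(c) == 0 for c in simplified):
            return None
        units = [c[0] for c in simplified if len(c) == 1]
        if not units:
            return simplified
        assignment[abs(units[0])] = units[0] > 0

def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})
    clauses = unit_propagate(clauses, assignment)
    if clauses is None:
        return None                      # conflict on this branch: backtrack
    if not clauses:
        return assignment                # every clause satisfied
    var = abs(clauses[0][0])             # split on some unassigned variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))
[/code]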

Hey OP, in case you're still here, I just wanted to say that I had come up with the *exact* same thing for the *exact* same reasons you described around 1-2 years ago. I know the way you're thinking, since you're obviously thinking like me, and I'd like to give you a small boost since you're brave enough to come onto an anonymous forum and write down your thoughts.

>The functionality of AIs has a lot to do with entropy
>I believe that von Neumann came up with the Merge Sort not to sort data but in an attempt to create an entropy-reduction AI, because he reached the same conclusion

As for your last question, it's quite easy actually. You apply confidence to your results. Confidence (at least in our case) is used to reduce the data quantity required to describe something, therefore we trade accuracy so we can process a higher amount of it. You're not absolutely sure that the market will crash, but based on the combined data you've gathered so far you can conclude with a high confidence (without having to go through every detail) that there will eventually be a crash like there always is. This is a lot like reducing the quality of an image so it becomes lighter - the image still resembles what it is supposed to be, it's just the detail that's gone.

Therefore, the reverse process of that is going in depth and "sorting" your thoughts by proving every single one of them, memorizing the concrete results so you don't have to go through that again. Or in other words, zooming into the detail of the problem to reduce the entropy so you can zoom out again in a more ordered state. Every following iteration of that increases confidence by reducing the entropy of the problem - confidence is negative entropy.

Good luck

Mine is discrete mathematical structures

grothendieck?

yeah but you claim that the integral gives the area precisely because you fall back on the formula that you had faith in beforehand


I speak of the triangle, but the same thing holds for the rectangle.

That's because the area of a shape is formally defined as the Lebesgue integral over the set determined by the locus of the shape.
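Spelled out for the right triangle with vertices (0,0), (b,0), (0,h), the integral really does give back the school formula:
[eqn]\operatorname{Area} = \int_{0}^{b} \int_{0}^{h\left(1 - \frac{x}{b}\right)} 1 \, dy \, dx = \int_{0}^{b} h\left(1 - \frac{x}{b}\right) dx = h\left(b - \frac{b}{2}\right) = \frac{bh}{2}[/eqn]
and any other triangle with base b and height h can be sheared onto this one without changing its Lebesgue measure, so base*height/2 isn't a definition, it's a computation.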

Yeah this is a great realization

This is the basic concept behind automatic theorem proving

This reminds me of when I was in highschool, I programmed my ti calculator to automatically solve quadratics just with some valuable input. Brainlets thought I was a genius

>can you elaborate on that?

I think the quoted user is talking about computability. I don't understand much about it, but Church and Turing proved there is a limit to what an algorithm can decide in a finite amount of time (the Entscheidungsproblem is undecidable). Automated theorem proving for anything as expressive as first-order logic sits at that limit: a prover can confirm a theorem when a proof exists, but in general it may run forever when there is none, and heuristics only help in practice.

As I said, I don't understand much theoretical CS, so it would be better if some other user could explain this in more precise and comprehensible terms.

Only in English I think, and that depends on your accent. In many languages it certainly is a consonant.

This guy gave the intuition as to why the problem is hard in the computational sense:

i wish we were friends irl

What makes you say that?

Really nice thread, OP. I'll mess with this for the next month, if I find anything interesting, I'll make a thread about it.