I'm working on a project that iterates through all threads on Veeky Forums, counts up the proper nouns (names, book titles, etc.), and then graphs them. Hopefully I'll be able to track trends in Veeky Forums's zeitgeist over time.

Whaddya think, guys?

Do it

You should ask whoever runs warosu to give you a copy of the archive.

great idea, didn't think about applying it to archives.

Really cool, user, hit me up with your spelunquikal graphs whenever you want

Hope this is spelunkquikal enough for you. I think this is a representation of all "People" mentioned on the board.

I should say this is not optimized for imageboard speech, so it probably doesn't figure out lowercase names.

And one more for /pol/. This is literature, right?

always wanted to do something like this. good luck

This is fucking great

Can you do something like frequency of the Veeky Forums 100 or just top 10? Get an actual sense of what's discussed...

Nick Land mentioned more than Plato

never change, Veeky Forums

Please don't post anything from /pol/.
Thanks

>Sorry
>Steppenwolf

time to skew results

John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green

>10368817
John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John GreenJohn Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John GreenJohn Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green John Green

Please don't post porn or pictures like that on this thread.

And sure, here's a pastebin (link below) of the top 100 most common Persons on Veeky Forums. The formal definition of a "Person" is anything nltk tags as both a singular proper noun (NNP) and a PERSON.
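
For anyone who wants to try this, here is a rough sketch of what that definition could look like in code. It is not the code behind the graphs above, just a guess at the approach: the function names `extract_persons` and `top_persons` are made up for illustration, and it assumes nltk is installed along with its tokenizer, tagger, and named-entity chunker data (the package names to download vary by nltk version).

```python
from collections import Counter


def extract_persons(text):
    """Return PERSON entities built from NNP tokens in one post.

    Assumption: nltk and its data packages are installed. The import is
    kept inside the function so the counting code below works without it.
    """
    import nltk

    persons = []
    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        # ne_chunk yields plain (word, tag) tuples and labeled subtrees.
        for node in nltk.ne_chunk(tagged):
            if hasattr(node, "label") and node.label() == "PERSON":
                # Keep only the singular proper nouns inside the entity.
                words = [word for word, tag in node.leaves() if tag == "NNP"]
                if words:
                    persons.append(" ".join(words))
    return persons


def top_persons(posts, extractor=extract_persons, n=100):
    """Count names across all posts and return the n most common."""
    counts = Counter()
    for post in posts:
        counts.update(extractor(post))
    return counts.most_common(n)
```

Calling `top_persons(posts)` on a list of post strings should give (name, count) pairs you can feed straight into a bar chart. The tagger and chunker lean heavily on capitalization, so lowercase names will probably be missed, and every repeat inside a single post is counted, so spam will skew the totals.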

Bonus: Graph of /g/. It's pretty error-prone, since it also picks up stuff like "Fuck" and "D".
---
paste(.)ee/p/YYsXF

It's broken forever now.

Pure gold

...

...

This is pretty awesome, user. I am trying to learn this kind of thing, but I am a beginner and I am still making basic charts with matplotlib.

Could you share your code? I want to learn to do something like this. Thanks.

you're working on a shitpost meter?

why?

...

I was working on an attempt to create either topic models of what this board discusses or a word embedding to see the underlying associations held by the board as a whole, but I got lazy when it came to formatting the raw data. If I ever get it done, though, I'm gonna post it.