Hey guise, I wrote up a Python script to create a word cloud based on Veeky Forums threads

Does this look kinda accurate?

Was expecting "CS" to be bigger.

>faggot
Why the homophobia?

>ca

?

Why the homophobiaphobia?

I literally have no idea why that's there

>python
dropped

muh infinity doesn't ca

Nvm figured it out. My tokenizer is splitting "can't" into "ca" + "n't" and such
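
For anyone hitting the same "ca" + "n't" split, here is a minimal sketch of one way around it. It assumes a plain regex tokenizer instead of whatever tokenizer the script actually uses, and `tokenize` is a made-up name for illustration.

```python
import re

# Assumption: a simple regex tokenizer, not the tokenizer from the original script.
# An apostrophe is allowed inside a word, so contractions stay in one piece.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")


def tokenize(text):
    """Return lowercase word tokens, keeping contractions like "can't" whole."""
    # Normalize the typographic apostrophe to a plain one before matching.
    normalized = text.lower().replace("\u2019", "'")
    return WORD_RE.findall(normalized)


print(tokenize("Prove me wrong. Protip: you can't"))
```

The tradeoff is that "can't" and "can" are now counted as separate words, which is probably what you want for a word cloud anyway.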

>fucking ca brainlet
leafs btfo

...

Fixed :-')

Oh my god. This is hilarious. I'm going to print this and put it on a wall.

source

literally gib

do it for every board and make a "guess the board" game out of it

Fucking ca ca shit calculus brainlet OP

Should I put up the code on Pastebin or smth then?

pastebin would be great

op don't fuck shit fucking brainlet
>muh anime derivative

this shit is gold. also do this and yes please

Uploaded to pastebin by popular request

pastebin.com/tu9Qv9s0

what is the requests module

Just pip install any packages you're missing on your machine

why tf can I never get pip to work goddammit

I recommend getting anaconda (anaconda.org/)

oops I meant anaconda.com/download/

I need(ed) python 2.7
It was just a mess man

goddamn python slow as fuck

>thread 5600/27998
foreeeeeeeeeeeeeeeever

but thanks a lot user very funny

I don't think it's python that's being slow, it's more likely the responses from the Veeky Forums API. Anything involving HTTP requests is slow.

mother fucker I am trolled fucking hard
>Thread 162160701 (27997 / 27998)
yasssssssssssss almost there!
>Traceback (most recent call last):
ffffffffffffffffffffffffffffffffffffff

please fucking tell me it saved the data it just spent ten million years downloading

alright I got it thank god for REPLs

lol nice pic

what error were you getting?

Apparently you can change the style/color scheme of the word cloud:
github.com/amueller/word_cloud

had to nltk.download('brown')

Pic looks pretty cool. Someone should give this a shot.

Too easy to guess the board name since the board itself is usually a major discussion topic. How about omitting strings with 3 or less letters to make it harder?
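
The length filter suggested above could look roughly like this. It is only a sketch: `drop_short_words` and `min_length` are hypothetical names, and it assumes the words are already tokenized into a list of strings.

```python
from collections import Counter


def drop_short_words(words, min_length=4):
    """Keep only words with at least min_length letters (drops 3 or fewer by default)."""
    return [w for w in words if len(w) >= min_length]


# Made-up sample words, just to show the effect of the filter.
words = ["the", "sci", "math", "calculus", "cs", "proof", "math"]
counts = Counter(drop_short_words(words))
print(counts.most_common())
```

Short board abbreviations get dropped, but so do short stopwords, so the cloud also loses some filler.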

So basically, there are so many "prove me wrong protip you can't" b8s here that can't is the single biggest word in your cloud. Pretty accurate tbqh.

>pol reddit retard

fag

You can probably just take a subset of all threads and still get a representative sample.
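
A rough sketch of the subset idea, assuming you already have a list of thread IDs from the board. `sample_threads` is a hypothetical helper, not part of the posted script.

```python
import random


def sample_threads(thread_ids, k, seed=None):
    """Pick up to k distinct thread IDs at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    ids = list(thread_ids)
    return rng.sample(ids, min(k, len(ids)))


# Made-up thread IDs for illustration only.
all_threads = list(range(1000, 1100))
subset = sample_threads(all_threads, k=10, seed=42)
print(len(subset), "of", len(all_threads), "threads selected")
```

Fetching 10 threads instead of 100 cuts the number of HTTP requests by the same factor, which is where most of the waiting comes from.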

Do one for one of the slower boards like /asp/ or /po/

lol this is true. very sad

Do diy.
t. retard who can't install python modules to run your script because easy_install keeps using some directory that is neither Anaconda>Spyder nor my generic Idle
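
One common cause of that kind of directory mixup is the installer belonging to a different Python than the one running the script. A minimal sketch of a workaround, assuming pip is available for that interpreter: run pip as a module of the exact interpreter you are using. `pip_install_command` and `pip_install` are hypothetical helper names.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter that is running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Install a package into the current interpreter's environment."""
    subprocess.check_call(pip_install_command(package))


# Shows which Python would receive the package, without installing anything.
print(sys.executable)
print(" ".join(pip_install_command("requests")))
```

Running this from inside Spyder or IDLE prints the interpreter that environment really uses, so you can tell them apart before installing.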

>fucking brainlet
LUL
yes please

Dr. Python, im spacetime

Pretty sure 4chinz thinks my home computer is a bot now, I couldn't get any board to load this morning.

Kek, I wonder how this would look on other boards

...

>alt gays
>shill boomers
checks out

Python rots your brain, use _anything_ else

Fuck, you're right, that's an important one.

t. can't program

Build it yourself then faggot

Bette dan u

ezzpezz

Can you post the script? Learning python and I want to break down a script piece by piece

...

Better know how all those fucking modules work m8

please someone make for more boards

why do brainlets love swearing so much

Hey OP does Veeky Forums now think you're a bot and refuse to serve pages to you? I haven't been able to access any pages since I ran the wordcloud on /pol/. The catalog is empty, no images display on the main page or any thread page, I can't post anywhere because captcha is fucked, etc. My advice to anyone running this is to modify the source to put some random delays in when fetching pages.
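
A minimal sketch of the random delay idea. The fetch function is passed in, so this does not assume anything about how the posted script downloads pages; `fetch_politely` and its parameters are hypothetical names, and the delay range is a guess rather than a documented limit.

```python
import random
import time


def fetch_politely(urls, fetch, min_delay=1.0, max_delay=3.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing a random interval between calls."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Random pause so the requests do not arrive in a tight, regular burst.
            sleep(random.uniform(min_delay, max_delay))
        results[url] = fetch(url)
    return results


# Stand-in fetch function so the example runs without any network access.
def fake_fetch(url):
    return "contents of " + url


pages = fetch_politely(["thread/1", "thread/2"], fake_fetch, min_delay=0.01, max_delay=0.02)
print(pages)
```

With the requests module you would pass something like `lambda url: requests.get(url, timeout=10).json()` as `fetch`. It makes a full run slower, but that beats getting cut off.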

This

Why the cursophobia?

>fucking brainlet shit math calculus
yeah that pretty much sums it up
good work OP

Shit forced meme

>being memephobic
it's 2018

Still works for me, see pic

>every single board is "fucking"
kek

oh this is just humiliating

Guess the board, fags

...

/v/

/aco/....??

It's obviously /d/ you newfag.

Why the homophobiaphobiaphobia?

Oh come on, this /a/ is too easy
Where else would Darling in the Franxx get so much attention

ah it's fucking brave settings fucking up Veeky Forums images and everything

Thanks for the script, user. My systems were most obliging.

...

>496x248
whyy

fucking linux shit

...

With a botnet hidden in plain sight.

Brainlet's first time with a Python module. Luckily user saved the words in a file so I didn't have to sit around for 10 minutes after I fucked up.
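
Saving the words to a file and reusing them can be sketched like this. It is not the posted script's code: `load_or_build_words`, `cache_path`, and `build_words` are hypothetical names, and JSON is just one convenient format.

```python
import json
import os
import tempfile


def load_or_build_words(cache_path, build_words):
    """Return cached words if the file exists, else build them and save them."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    words = build_words()  # the slow part, e.g. downloading every thread
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(words, f)
    return words


# Demo in a temporary directory: the second call reads the file instead of rebuilding.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "words.json")
    first = load_or_build_words(path, lambda: ["brainlet", "calculus"])
    second = load_or_build_words(path, lambda: ["never", "called"])
    print(first == second)
```

If the plotting step crashes afterwards, rerunning only costs a file read.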

...

Meanwhile...
reference.wolfram.com/language/ref/WordCloud.html

OP here, made a new version that automatically goes through every board and saves wordcloud pictures (good res) for each. Also added adjustable number of threads to read from each board + removal of keywords from the board's title to make guessing harder.
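
For the curious, removing the board title keywords could look something like this. It is a sketch under assumptions, not the code from the new version: `remove_title_keywords` is a hypothetical helper and the title string is just an example.

```python
import re


def remove_title_keywords(words, board_title):
    """Drop any word that also appears in the board's title (case-insensitive)."""
    banned = set(re.findall(r"[a-z]+", board_title.lower()))
    return [w for w in words if w.lower() not in banned]


# Example title and made-up words for illustration only.
title = "Science & Math"
words = ["math", "proof", "Science", "calculus"]
print(remove_title_keywords(words, title))
```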

pastebin.com/zsW2JG8M

/g/ probably
Veeky Forums?

What plugins do you use and in what IDE?

I meant modules.

why fucking has to be big in every board

No gentoo???

its here in the left bottom corner

fuck i mean right

Oic now. Smaller meme than I thought I guess

even botnet isn't that big im disappointed

awesome

google software user

Don't say it in the filename brooooo

>diddly
They got some weird memes out there

Looking at all of these I have to say, jesus fuck, do we really say fucking that fucking much on Veeky Forums?

I mean, for fucks sake fucking is the biggest fucking word on every single one of these fucking things.

I use Anaconda which comes with many modules/packages, but you only need the ones that are imported at the top of the script.