Question about image processing in c++

I've been trying for a week or so and I can't get it through my head how image processing works in c++ (without using image libraries, only basic tools, on an 8-bit bmp).

I understand that I take the header, get the height/width, and then I thought I could just read the image data into a 2D array (type[height][width]) and each pixel (array[x][y]) would represent the color.

But this is giving me the same value across the entire image. I need to single out the background color and then perform operations based on that. I read about color palette indexes but I couldn't make heads or tails of it.

Can somebody explain the process to me in retard or pseudocode?

Other urls found in this thread:

pastebin.com/DvtBRXTf
en.wikipedia.org/wiki/BMP_file_format
pastebin.com/SRG52Bxy
en.wikipedia.org/wiki/Desirable_difficulty

don't you think you should at least tell us what the image format is?

Don't you need 3 data points per pixel for the 3 component colors?

I don't know shit about this, but some input is better than nothing (also it bumps the thread).

post your code that is giving you "same value"

in simpler image formats, it'll be like 4 bytes per pixel: R,G,B,Alpha
but some image formats like GIF, PNG, or JPG have built-in compression

8 bit .bmp

I'm starting to figure it out a bit, looks like after the two headers in the file there are 4 bytes that give NULL,B,G,R indexes.

But how do I use these now? If I take the next value after them it should be the first pixel in the image; so then how do I get the color at imageArray[0][0]?

Here's current status of the code, a bit messy atm: pastebin.com/DvtBRXTf

The ultimate goal is to take an image with a solid background, find the "edges" of the inner image and do a straight edge crop.
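
A rough sketch of that crop step, not a finished solution. It assumes the pixel indexes are already in one flat, top-down, unpadded buffer of `w * h` bytes and that the background is a single palette index (taken here from the top-left pixel). `Box`, `findBounds` and `cropTo` are made-up names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Box {
    std::size_t left, top, right, bottom;  // inclusive bounds of the non-background area
    bool found;                            // false if every pixel is background
};

// Scan every pixel and track the smallest rectangle holding all non-background pixels.
inline Box findBounds(const std::vector<std::uint8_t>& px,
                      std::size_t w, std::size_t h, std::uint8_t background) {
    assert(px.size() >= w * h);
    Box b{w, h, 0, 0, false};
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            if (px[y * w + x] == background) continue;
            if (x < b.left) b.left = x;
            if (x > b.right) b.right = x;
            if (y < b.top) b.top = y;
            if (y > b.bottom) b.bottom = y;
            b.found = true;
        }
    }
    return b;
}

// Copy the rectangle row by row into a new, smaller buffer.
inline std::vector<std::uint8_t> cropTo(const std::vector<std::uint8_t>& px,
                                        std::size_t w, const Box& b) {
    std::vector<std::uint8_t> out;
    if (!b.found) return out;
    const std::size_t newW = b.right - b.left + 1;
    for (std::size_t y = b.top; y <= b.bottom; ++y) {
        const std::uint8_t* row = px.data() + y * w + b.left;
        out.insert(out.end(), row, row + newW);
    }
    return out;
}
```

Typical use would be `Box b = findBounds(px, w, h, px[0]);` followed by `cropTo(px, w, b)`. Comparing indexes only works if the background really is one exact color; a noisy background would need a tolerance on the palette colors instead.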

For some reason I can't upload the example image, keep getting upload failed.

imageArray doesn't contain color data?

en.wikipedia.org/wiki/BMP_file_format

wat

>and then I thought I could just read the image data into a 2D array (type[height][width]) and each pixel (array[x][y]) would represent the color.

you don't need a 2D array because you know the width. you can use the mod operator to tell when the 1D array should start the next row. then using the height you can figure out how many rows there are total.
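
A small sketch of the flat-array idea, with hypothetical helper names (`rowStride`, `pixelAt`). Two things from the BMP format matter here: each stored row is padded up to a multiple of 4 bytes, and with a positive height field the rows are stored bottom-up. The sketch assumes 8 bits per pixel and a positive height.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes per stored row: BMP pads every row up to a multiple of 4 bytes.
inline std::size_t rowStride(std::size_t width, std::size_t bitsPerPixel) {
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// Palette index of pixel (x, y) in an 8bpp image kept in one flat buffer
// exactly as it was read from the file. y = 0 means the top row on screen.
inline std::uint8_t pixelAt(const std::vector<std::uint8_t>& data,
                            std::size_t width, std::size_t height,
                            std::size_t x, std::size_t y) {
    assert(x < width && y < height);
    const std::size_t stride = rowStride(width, 8);
    const std::size_t fileRow = height - 1 - y;  // file order is bottom-up
    return data[fileRow * stride + x];
}
```

For a 120-pixel-wide 8-bit image the stride is exactly 120, so no padding shows up; a 3-pixel-wide one would have a stride of 4.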

en.wikipedia.org/wiki/BMP_file_format

you shouldn't use fgets to read the file in.
use fread the whole way.

check header offset 1E for compression method. if it's 0 it isn't compressed and you will have an easier job.
look at header offset 1C to determine the bit depth. this will determine how you interpret the pixel values. check the wikipedia link under "Pixel Formats" it teaches you everything.
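
A sketch of pulling those fields out of the file once it has been read into a byte buffer. It assumes the common 40-byte BITMAPINFOHEADER layout (offsets as listed on the Wikipedia page); `BmpInfo`, `readU16`, `readU32` and `parseBmpInfo` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// BMP stores multi-byte values little-endian: lowest byte first.
inline std::uint16_t readU16(const std::vector<std::uint8_t>& b, std::size_t off) {
    assert(off + 2 <= b.size());
    return static_cast<std::uint16_t>(b[off] | (b[off + 1] << 8));
}

inline std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t off) {
    assert(off + 4 <= b.size());
    return static_cast<std::uint32_t>(b[off])
         | (static_cast<std::uint32_t>(b[off + 1]) << 8)
         | (static_cast<std::uint32_t>(b[off + 2]) << 16)
         | (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

struct BmpInfo {
    std::uint32_t pixelOffset;   // offset 0x0A: where the pixel array starts
    std::uint32_t dibSize;       // offset 0x0E: size of the DIB header
    std::int32_t  width;         // offset 0x12
    std::int32_t  height;        // offset 0x16 (positive means bottom-up rows)
    std::uint16_t bitsPerPixel;  // offset 0x1C
    std::uint32_t compression;   // offset 0x1E (0 means uncompressed)
};

// Returns false if the buffer is too short or does not start with "BM".
inline bool parseBmpInfo(const std::vector<std::uint8_t>& file, BmpInfo& out) {
    if (file.size() < 54 || file[0] != 'B' || file[1] != 'M') return false;
    out.pixelOffset  = readU32(file, 0x0A);
    out.dibSize      = readU32(file, 0x0E);
    out.width        = static_cast<std::int32_t>(readU32(file, 0x12));
    out.height       = static_cast<std::int32_t>(readU32(file, 0x16));
    out.bitsPerPixel = readU16(file, 0x1C);
    out.compression  = readU32(file, 0x1E);
    return true;
}
```

After a successful parse, checking `bitsPerPixel == 8 && compression == 0` confirms the file is the easy case discussed here.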

Seriously, look at the link again: en.wikipedia.org/wiki/BMP_file_format
It even provides you small 2x2 and 4x2 example images with their hex representations, completely annotated with the meaning of each byte. Search for "Example 1" and "Example 2"

I'm actually blown away at how accessible this information is. 15 years ago I had to go to the store and hunt for books on this stuff. You can just google it and even get examples breaking the file down byte by byte.

on my first attempt I simply read each byte as an unsigned char and stored in imageArray, but this doesn't seem to work.

A 2D array just felt easier to visualize at first, but thanks, that might be easier to process

Yes, I've looked through that. The bit depth is 8, no compression;
"The 8-bit per pixel (8bpp) format supports 256 distinct colors and stores 1 pixel per 1 byte. Each byte is an index into a table of up to 256 colors."

What's getting me is "index into a table of up to 256 colors." This table supposedly comes between the headers (first 54 bytes) and the pixel information. But how long is it, and how do I use it?

Sorry if I'm being dense, this isn't really my field and I've never worked with reading image data before.

Ok it finally clicked that the color table is 256 bytes (checked the header and confirmed this) not sure why that took me so long

So my assumption is read these 256 into an array (what type? int?) and then colors[imageArray[x][y]] is my color?

Correct
Offset 0A (from the top of the file) has a 4 byte value that tells you where the pixel data actually begins.
Offset 0E gives a 4 byte value telling you the length of the DIB header.
The color table will be between the DIB header and the pixel array.
The color array will be of type unsigned char (most likely, an 8 bit value where it's 3 bits red 3 bits green 2 bits blue).

>doing any image processing without image libraries
why the fuck would you torture yourself with this?

It's a programming test for a job. The company might actually be shit so I might not accept but I figure this is good practice for me anyway.

Reading 0A gives me 54, which is immediately after the DIB header. If I simply read 256 bytes after the DIB and store them in an unsigned char array, they all come out null. I'm still missing something. Reading offset 22 ("This is the size of the raw bitmap data") gives me 64; does that mean anything important to me?

Try using opencv

>codemonkey shit

Cs50 has a section on BMP file format and it's pretty well done maybe check that out OP

>The color table (palette) occurs in the BMP image file directly after the BMP file header, the DIB header (and after optional three red, green and blue bitmasks if the BITMAPINFOHEADER header with BI_BITFIELDS option is used). Therefore, its offset is the size of the BITMAPFILEHEADER plus the size of the DIB header (plus optional 12 bytes for the three bit masks).
>The color table is a block of bytes (a table) listing the colors used by the image. Each pixel in an indexed color image is described by a number of bits (1, 4, or 8) which is an index of a single color described by this table. The purpose of the color palette in indexed color bitmaps is to inform the application about the actual color that each of these index values corresponds to.
>The colors in the color table are usually specified in the 4-byte per entry RGBA32 format.
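
To make the quoted description concrete, here is a sketch of reading the palette as 4-byte entries. In BMP files the bytes of each entry are ordered blue, green, red, reserved. The caller passes the palette offset (file header size plus DIB header size) and the entry count; `PaletteEntry` and `readPalette` are illustrative names, not anything from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One color table entry, in the order the bytes appear in the file.
struct PaletteEntry {
    std::uint8_t blue, green, red, reserved;
};

// Reads `count` entries starting at `paletteOffset`.
// Returns an empty vector if the table would run past the end of the buffer.
inline std::vector<PaletteEntry> readPalette(const std::vector<std::uint8_t>& file,
                                             std::size_t paletteOffset,
                                             std::size_t count) {
    std::vector<PaletteEntry> pal;
    if (paletteOffset + count * 4 > file.size()) return pal;
    pal.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint8_t* p = &file[paletteOffset + i * 4];
        pal.push_back({p[0], p[1], p[2], p[3]});
    }
    return pal;
}
```

With a 256-entry table the palette takes 256 * 4 = 1024 bytes. Each pixel byte is then just an index: `pal[pixelByte]` gives the color of that pixel.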

>they all come out null
You may be reading the data incorrectly or mishandling it within your program.
In fact I'm sure you are just looking at your code.

It's a pretty good shit test for programmers. He has to prove he has the practical skill to read a datasheet/spec/whitepaper and apply the knowledge within to create a working piece of software.

OP linked his code.
pastebin.com/DvtBRXTf

You can see exactly why they give these kinds of tests. It's a trainwreck.

Code bizarrely uses C-style streams for file IO, but c++ iostream for console output. Pick one or the other.

Line 31 (his constructor) creates a new local variable that masks the class variable of the same name, then "initializes" it. Doesn't really break anything, but reveals a conceptual misunderstanding.

Line 48 comment reveals he thinks the palette is only 4 bytes, a gross conceptual error. Likely, the palette is 256*4=1024 bytes long.

Lines 64-65 allocate memory that is never deallocated (he didn't delete[] it): i.e. the function they are going to look at most blithely leaks memory. Not a good look.

Lines 54, 66 use the

Thank you for pointing all of that out! Criticism is what I need. Granted a lot of it was that I was in the middle of just slapping shit together to experiment/not worrying yet about later parts of the issue. I really appreciate all of this help.

I've worked mostly in standard C Linux and I've forgotten a lot of the C++ stuff I learned. That and CS is my minor field, so I definitely haven't gotten as rigorous an education in it.

If the image is greyscale then each element of the image matrix can represent the whiteness using only one component.

I read the whole thread and I suggest you read bmp specification and also get a good book on C and start with simpler problems because clearly your knowledge/experience in c is lacking.

We could tell you the answer, but (and I am talking from experience) if you don't figure it out yourself you will not learn anything.

>start with simpler problems because clearly your knowledge/experience in c is lacking
I disagree with this, the emphasis on simple/toy programs is what hamstrings so many programmers.
It doesn't get any easier to read and apply a specification. He has to struggle with it in order to raise his level.

It is a color image

Thank you all so much for the input, I definitely have a long way to go and I appreciate it.

I've taken everything into account, studied up on C++ filestreams, and rewrote my code to be much cleaner. I wrote a struct to contain each 4-byte index and tried to read 1024 bytes, storing 4 at a time in each struct in a struct array. However after 172 bytes (after the 54 byte headers) I'm reading in null.

Here's my updated code. I'm not worrying about deleting or the most-variables just yet. I want to make sure I'm reading the image correctly first.

pastebin.com/SRG52Bxy

Have you tried getting gud?

that is idiotic
do you even know how your brain works?
brainlet

Here's: the input file, my current output file if I just write out what I read in and store, and my goal output file

Format changed to .jpg for posting because it wasn't letting me post 256-color .bmp

>do you even know how your brain works?
I do. Seems like you don't.
Those of us who have developed real skills in multiple areas know that you never learn anything if you don't ramp up the difficulty at some point.
There are millions of guys in gyms "perfecting their form" on baby weight and after 5 years their max squat is still 2plate. There are millions of people learning languages "mastering the basics" reading and rereading beginner textbooks, and after 5 years they still can't even read a children's book. And OP is one of the millions who have written toy programs and homework assignments but never actually tried to write something that interacts with a real world specification.

en.wikipedia.org/wiki/Desirable_difficulty

In my defense the first code piece I sent was mid-improv. I have three college semesters of work in standard C, but only about a month or so of working in C++. Most of our work is in data structures, things like AVL trees and various graphs for storing and interfacing with large databases.

So I went into this not having the slightest clue about how images store data and how to read binary data, and greatly underestimated the issue.

you aren't reading totally null, obviously your read has captured some aspect of the input file. looks like you might be off by a byte or two reading in the palette desu.
also, please post just the input file and please change it to a lossless (.png), i want to see if i can do it, sounds fun.

programming is not difficult by definition, so your whole case is bullshit. It's a fucking memorization + trial & error game.

...

>wonder if windows reads colors differently
>move it to school's linux server for commandline use
> check file->good()
> "File opened succesfully!"
> file->read(fhead, 14) //read file header
> check file->good()
> error

Guess I'll die

>programming is not difficult by definition
Are you saying that all programs are equally easy to write? Hello world = Kernel programming, in terms of difficulty, by definition?

>It's a fucking memorization + trial & error game.
>implying it is as easy to memorize 1 digit of pi as it is to memorize 10^27 digits of pi
Enjoying driving trucks through the holes in your idiotic arguments, please, go on.

You're fucking wrong. Only a person who has never programmed something relatively complex can say such ridiculous bullshit. "Memorization + trial&error" is what novices or shitty programmers do when developing simple programs.

>programming is not difficult by definition
What definition of programming are you using?

>Hello world = Kernel programming,
pretty much, yeah. different documentation, same shit.

you're saying it's a better idea to memorize 10^27 digits of pi instead of 1 at a time. How great.

>you're saying it's a better idea to memorize 10^27 digits of pi instead of 1 at a time.
you have reached that point where you're just babbling instead of closing thread
let me guess, you're only pretending to be retard

Decades of hard Research & Development done by fucking smart people, and an internet twat comes and says that kernel programming and programming in general are easy bullshit. So either you're a superhuman or you're just an ignorant idiot. I'll pick the latter.

fun project, I finished it

the load format is exactly 14 bytes header, 40 bytes dibheader, 1024 bytes palette, and then 14400 (that's 120 x 120) pixels

after you read in the image and crop it (resizing it) you have to recalculate before you write it
filesize = 14 + 40 + 1024 + (width * height)
and arraystart = 1078

then you just write it the same way you read it. ezpz

dibheader+22 has to be changed as well (width*height) before writing.
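
A sketch of that recalculation, under the same layout as above (14-byte file header, 40-byte DIB header, 1024-byte palette, 8 bits per pixel). It assumes the "+22" refers to the image size field at file offset 0x22. One caveat: `width * height` is only the right pixel data size when the width is a multiple of 4, like 120. After a crop it may not be, so the sketch uses the padded row size. `writeU32` and `patchHeader` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Store a 32-bit value little-endian at the given offset.
inline void writeU32(std::vector<std::uint8_t>& b, std::size_t off, std::uint32_t v) {
    assert(off + 4 <= b.size());
    b[off]     = static_cast<std::uint8_t>(v & 0xFF);
    b[off + 1] = static_cast<std::uint8_t>((v >> 8) & 0xFF);
    b[off + 2] = static_cast<std::uint8_t>((v >> 16) & 0xFF);
    b[off + 3] = static_cast<std::uint8_t>((v >> 24) & 0xFF);
}

// Updates the first 54 bytes (file header + DIB header) for a new image size.
inline void patchHeader(std::vector<std::uint8_t>& header,
                        std::uint32_t width, std::uint32_t height) {
    assert(header.size() >= 54);
    const std::uint32_t stride      = ((width * 8 + 31) / 32) * 4;  // padded row bytes
    const std::uint32_t imageSize   = stride * height;
    const std::uint32_t pixelOffset = 14 + 40 + 1024;                // 1078
    writeU32(header, 0x02, pixelOffset + imageSize);  // total file size
    writeU32(header, 0x0A, pixelOffset);              // start of pixel array
    writeU32(header, 0x12, width);
    writeU32(header, 0x16, height);
    writeU32(header, 0x22, imageSize);                // size of raw pixel data
}
```

When writing the cropped pixels, each row then has to be followed by `stride - width` zero bytes so the data matches the sizes in the header.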

So, in your code
>int hsize = fhead[10] - 14; //get image header size (imagestart - fileheadersize)
this line is wrong in a very sick way because it just so happens to produce the correct answer of 40.

because fhead is an array of char, fhead[10] gives you 1 byte of data which happens to be 54.

what you really need is the dword (4 bytes) beginning at fhead[10]. there are a few ways to do it, but using a cast to int is probably simplest to understand
>int pixels_offset = *(int*)&fhead[10];

second, it is wrong to think you can calculate the DIB header size this way. because you don't know what the palette size is, and you don't know that until you parse the header itself. that's why the very first dword of the DIB header (offset 14 from the start of the file) is the size of the DIB header itself.
>int DIBsize;
>myFile->read((char*)&DIBsize, 4);
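
The pointer cast above works on typical x86 builds, but it leans on the buffer being suitably aligned and on the compiler tolerating the type pun. A slightly safer sketch of the same idea copies the four bytes with `std::memcpy`. Like the cast, it assumes a little-endian host. `dwordAt` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies the 4 bytes starting at buf[offset] into a 32-bit integer.
// Assumes the host is little-endian, matching how BMP stores its fields.
inline std::uint32_t dwordAt(const char* buf, std::size_t size, std::size_t offset) {
    assert(offset + sizeof(std::uint32_t) <= size);
    std::uint32_t value = 0;
    std::memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

With the 14-byte file header in `fhead`, `dwordAt(fhead, 14, 10)` would give the full pixel data offset (1078 for this file) instead of only its low byte, 54.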

again thanks for the fun op and gl