Questions about compilers and executable formats for a new OS 

First off, apologies for cross-posting; if necessary, let me know which
group(s) my questions are most relevant to.

I am developing an OS and want it to be able to run programs
compiled in an existing environment; preferably Linux (cc) or DOS
(DJGPP). I realize I will have to write my own libraries, but after
looking at compilers, and the specs of a few executable formats
("old" a.out, ELF and COFF mainly), I am mostly confused. Here are
my concerns and questions:

*my OS uses segmentation (in protected mode, of course; it's written
for 386+). But I understand that pretty much all compilers out there
produce flat code. Is there one that doesn't?

*if the answer to the question above is 'no', I suppose there are
ways around that; I guess I can put the .text section in a code
segment, the .data in a data segment, and have DS-GS point to the
latter. Will that work?

*if that works, what about function calls? I suppose the compilers,
after putting the parameters on the stack, just produce an
intra-segment jump or call. (because flat mode = one big segment,
right?) Is there any way to 'patch' this during, say, the relocation
phase at load time, to produce far (inter-segment) jumps or calls? If
not, why not, if yes, how?

*I am not sure I understand how compilers work, so forgive me if the
following question sounds dumb: when you use, say, "printf" in your
program, what does the compiler do with it? Is it an external function
part of a shared library, so the link is resolved at load time; or is
it statically included in the final executable (thus making the
executable platform specific)? The first solution, I guess, but then
how exactly do you resolve the link at load time?

*I would also appreciate some advice on what kind of executable
format would be best suited for my project, given that I want to keep
everything as simple as possible. I do not intend to take over the
world, this is just a program started as an academic project that I
am continuing to work on for the fun of it. ELF was my first choice
seeing it is pretty much a standard in the Unix (Linux) world, but it
seems overly complex. A simple "Hello, World!" compiled with cc gave
me 29 (!!!) sections in the binary file. Plus, the documentation
itself is not that great (lack of examples...). Right now I hesitate
between COFF and a.out, but I'd like to know if they allow calls to
shared libraries.

Thanks a lot for your help!

David



Sun, 10 Nov 2002 03:00:00 GMT  

Quote:

> First off, apologies for cross-posting; let me know if necessary to
> which group(s) my questions are most relevant.

The two OS groups are probably your best choices.

Quote:
> I am developing an OS and want it to be able to run programs
> compiled in an existing environment; preferably Linux (cc) or DOS
> (DJGPP).

[Big snip]

Quote:
> *I am not sure I understand how compilers work, so forgive me if the
> following question sounds dumb: when you use, say, "printf" in your
> program, what does the compiler do with it?

If you're not sure about this, I'd learn much more about
computers before attempting an OS. Once you are more
knowledgeable, look at some OSes first. "Operating Systems:
Design and Implementation," by Tanenbaum for a tiny UNIX
system, and uC/OS by Labrosse for an embedded system OS are
two good choices.

Quote:
> Is it an external function
> part of a shared library, so the link is resolved at load time; or is
> it statically included in the final executable (thus making the
> executable platform specific)? The first solution, I guess, but then
> how exactly do you resolve the link at load time?

For a standard library function, it usually gets linked
when the executable is built, but you could implement
it as a DLL. A basic method is for the loader to place it
into memory and resolve physical addresses as offsets from
the starting address.
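
To make the "offsets from the starting address" idea concrete, here is a minimal sketch in C. It assumes a made-up image layout in which the file carries a list of offsets of 32-bit words that each hold an address relative to the start of the image; the names apply_relocations, image and reloc_offsets are hypothetical and do not come from any real executable format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical loader step: every entry in reloc_offsets[] is the offset
 * of a 32-bit word inside the image that holds an address relative to
 * the start of the image. Adding the load base makes it absolute.
 * Returns 0 on success, -1 if an entry falls outside the image. */
int apply_relocations(uint8_t *image, size_t image_size,
                      const uint32_t *reloc_offsets, size_t reloc_count,
                      uint32_t load_base)
{
    assert(image != NULL);
    assert(reloc_offsets != NULL || reloc_count == 0);

    for (size_t i = 0; i < reloc_count; i++) {
        uint32_t off = reloc_offsets[i];
        uint32_t value;

        if (image_size < sizeof value || off > image_size - sizeof value)
            return -1;                      /* corrupt relocation entry */

        memcpy(&value, image + off, sizeof value);  /* unaligned-safe read */
        value += load_base;                         /* relative -> absolute */
        memcpy(image + off, &value, sizeof value);
    }
    return 0;
}
```

Real formats store richer relocation records (type, symbol, addend), so treat this only as the simplest possible case.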

Those are good questions which show you have a basic grasp
of what goes on inside a computer. I'd see the FAQs of the
OS groups for more info.

--
Craig

Manchester, NH
On a long enough timeline, the survival rate for
everyone drops to zero.



Sun, 10 Nov 2002 03:00:00 GMT  

Quote:
> *my OS uses segmentation (in protected mode, of course; it's written
> for 386+). But I understand that pretty much all compilers out there
> produce flat code. Is there one that doesn't?

No free compilers that I know of.
All assemblers support it, though.

Quote:
> *if the answer to the question above is 'no', I suppose there are
> ways around that; I guess I can put the .text section in a code
> segment, the .data in a data segment, and have DS-GS point to the
> latter. Will that work?

No, I don't think so. The flat memory model depends on all segment registers
being loaded with selectors spanning the same linear addresses. DS
is used to reference both the stack and the code segment.
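
As an illustration of what "spanning the same linear addresses" means in practice, here is a small C sketch that packs 386 segment descriptors. In a flat setup the code and data descriptors both use base 0 and a 4 GiB limit, so an offset refers to the same linear address whichever segment register is used. The function names are my own; the access values 0x9A and 0x92 are the usual ring 0 code and data encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Pack one 8-byte 386 segment descriptor.
 * limit  : 20-bit limit (counted in 4 KiB pages when the G flag is set)
 * access : access byte (present bit, DPL, segment type)
 * flags  : high nibble of byte 6 (G, D/B, reserved, AVL) */
uint64_t make_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu && flags <= 0xFu);

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base 23:0   */
    d |= (uint64_t)access << 40;                   /* access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit 19:16 */
    d |= (uint64_t)flags << 52;                    /* G, D/B, AVL */
    d |= (uint64_t)(base >> 24) << 56;             /* base 31:24  */
    return d;
}

/* Flat model: code and data both start at 0 and cover 4 GiB. */
uint64_t flat_code_descriptor(void)
{
    return make_descriptor(0, 0xFFFFFu, 0x9A, 0xC);
}

uint64_t flat_data_descriptor(void)
{
    return make_descriptor(0, 0xFFFFFu, 0x92, 0xC);
}
```

A segmented OS would pass a non-zero base and a smaller limit per segment, which is exactly what flat-model compiler output does not expect.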

Quote:
> *if that works, what about function calls? I suppose the compilers,
> after putting the parameters on the stack, just produce an
> intra-segment jump or call. (because flat mode = one big segment,
> right?) Is there any way to 'patch' this during, say, the relocation
> phase at load time, to produce far (inter-segment) jumps or calls? If
> not, why not, if yes, how?

Inter-segment (far) jumps / calls are two bytes longer, so I guess the
answer is no.
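
To show where the two extra bytes come from, here is a short C sketch that emits both forms for 32-bit code: the near call is opcode E8 plus a 32-bit relative displacement (5 bytes), the direct far call is opcode 9A plus a 32-bit offset and a 16-bit selector (7 bytes). The function names are only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NEAR_CALL_LEN = 5, FAR_CALL_LEN = 7 };

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* CALL rel32: opcode E8, then a displacement measured from the address
 * of the instruction that follows the call. */
size_t encode_near_call(uint8_t *out, uint32_t call_site, uint32_t target)
{
    assert(out != NULL);
    out[0] = 0xE8;
    put_le32(out + 1, target - (call_site + NEAR_CALL_LEN));
    return NEAR_CALL_LEN;
}

/* CALL ptr16:32: opcode 9A, then the 32-bit offset, then the selector. */
size_t encode_far_call(uint8_t *out, uint16_t selector, uint32_t offset)
{
    assert(out != NULL);
    out[0] = 0x9A;
    put_le32(out + 1, offset);
    put_le16(out + 5, selector);
    return FAR_CALL_LEN;
}
```

Since the compiler left only 5 bytes at each call site, a loader cannot overwrite a near call with a far one without shifting all the code after it.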

Quote:
> *I am not sure I understand how compilers work, so forgive me if the
> following question sounds dumb: when you use, say, "printf" in your
> program, what does the compiler do with it? Is it an external function
> part of a shared library, so the link is resolved at load time; or is
> it statically included in the final executable (thus making the
> executable platform specific)?

Both versions. It depends on how you link your application and whether
or not the linker supports shared libraries.

Quote:
> The first solution, I guess, but then
> how exactly do you resolve the link at load time?

Through the executable file and shared libraries. They contain this
information.
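
Roughly, the executable lists the names it needs and the library lists the names it provides, and the loader matches them up. A minimal C sketch of that matching step follows; the structures export_entry and import_entry and the function resolve_imports are hypothetical and much simpler than what ELF or Win32 PE/COFF actually store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tables: a library exports names with addresses, and an
 * executable has one slot per imported name for the loader to fill. */
struct export_entry { const char *name; uint32_t address; };
struct import_entry { const char *name; uint32_t resolved; };

/* Fill in the address of every import that the library exports.
 * Returns the number of imports that could not be resolved. */
size_t resolve_imports(struct import_entry *imports, size_t n_imports,
                       const struct export_entry *exports, size_t n_exports)
{
    assert(imports != NULL || n_imports == 0);
    assert(exports != NULL || n_exports == 0);

    size_t missing = 0;
    for (size_t i = 0; i < n_imports; i++) {
        size_t j;

        imports[i].resolved = 0;
        for (j = 0; j < n_exports; j++) {
            if (strcmp(imports[i].name, exports[j].name) == 0) {
                imports[i].resolved = exports[j].address;
                break;
            }
        }
        if (j == n_exports)
            missing++;              /* name not exported by the library */
    }
    return missing;
}
```

The program then calls through the filled-in slots instead of through addresses fixed at link time.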

Quote:
> *I would also appreciate some advice on what kind of executable
> format would be best suited for my project, given that I want to keep
> everything as simple as possible. I do not intend to take over the
> world, this is just a program started as an academic project that I
> am continuing to work on for the fun of it. ELF was my first choice
> seeing it is pretty much a standard in the Unix (Linux) world, but it
> seems overly complex. A simple "Hello, World!" compiled with cc gave
> me 29 (!!!) sections in the binary file.

Yes, it's very bloated! My shared library for RDOS with only stubs
was 1MB!

Quote:
> Plus, the documentation
> itself is not that great (lack of examples...). Right now I hesitate
> between COFF and a.out, but I'd like to know if they allow calls to
> shared libraries.

Win32 COFF and ELF do; I'm not sure about the others.




Sun, 10 Nov 2002 03:00:00 GMT  

Quote:

> > *if the answer to the question above is 'no', I suppose there are
> > ways around that; I guess I can put the .text section in a code
> > segment, the .data in a data segment, and have DS-GS point to the
> > latter. Will that work?

> No, I don't think so. Flat memory model depends on all segment registers
> being loaded with a selector spanning the same linear addresses. DS
> is used to reference both stack and code segment.

I think you can get away with CS and DS being different if you write
the appropriate linker script (i.e. make sure data doesn't go in the .text
section).

Not having stack and data share a segment is a different issue.  Taking
the address of automatic variables becomes a real issue.  I am currently
thinking about modifying GCC to handle separate stack and data segments,
but I have a feeling it is a *big* job (where big might even involve
starting again!!).
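
To spell out why the address of an automatic variable is the problem: a flat-model pointer is just an offset, so when a function passes &local to another function, the callee dereferences that offset through DS even though it was formed relative to SS. The C sketch below models this with a simplified segment (base and byte-granular limit only); the type and function names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified segment: only base and a byte-granular limit. */
struct segment { uint32_t base; uint32_t limit; };

/* Linear address formed for segment:offset. The limit check is shown as
 * an assertion; real hardware raises a fault instead. */
uint32_t linear_address(struct segment seg, uint32_t offset)
{
    assert(offset <= seg.limit);
    return seg.base + offset;
}

/* A near pointer carries only the offset. It can be handed from code
 * addressing through SS to code addressing through DS only if both
 * segments map that offset to the same linear address. */
int near_pointer_is_portable(struct segment ss, struct segment ds,
                             uint32_t offset)
{
    return linear_address(ss, offset) == linear_address(ds, offset);
}
```

With identical flat segments the check always passes; with separate stack and data bases it fails for every offset, which is why the compiler itself would have to know which segment each pointer belongs to.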

Quote:

> > *if that works, what about function calls? I suppose the compilers,
> > after putting the parameters on the stack, just produce an
> > intra-segment jump or call. (because flat mode = one big segment,
> > right?) Is there any way to 'patch' this during, say, the relocation
> > phase at load time, to produce far (inter-segment) jumps or calls? If
> > not, why not, if yes, how?

> Inter-segment (far) jumps / calls are two bytes longer, so I guess the
> answer is no.

But why would you need to?  If each separately compiled and linked
module has its own code segment (which seems reasonable), then all
compiler generated branches are near.

[...]

Quote:
> > am continuing to work on for the fun of it. ELF was my first choice
> > seeing it is pretty much a standard in the Unix (Linux) world, but it
> > seems overly complex. A simple "Hello, World!" compiled with cc gave
> > me 29 (!!!) sections in the binary file.

> Yes, it's very bloated! My shared library for RDOS with only stubs
> was 1MB!

Do you have to have all this information in ELF though?  Can't you just
have the sections you care about and discard extraneous stuff?
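
On the section count: as far as I know, a loader works from the program headers (the segments to map), while the section headers are there for the linker and debugger. A small C sketch that reads both counts from a little-endian ELF32 header is below; the offsets are those of the e_phnum and e_shnum fields, and the struct and function names are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct elf32_counts {
    uint16_t program_headers;   /* e_phnum: what a loader walks    */
    uint16_t section_headers;   /* e_shnum: linker/debugger detail */
};

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 and fills *out for a little-endian ELF32 header,
 * -1 if the buffer is too short or is not such a header. */
int elf32_read_counts(const uint8_t *hdr, size_t len,
                      struct elf32_counts *out)
{
    assert(hdr != NULL && out != NULL);

    if (len < 52)                               /* ELF32 header size */
        return -1;
    if (hdr[0] != 0x7F || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;                              /* bad magic */
    if (hdr[4] != 1 || hdr[5] != 1)             /* ELFCLASS32, ELFDATA2LSB */
        return -1;

    out->program_headers = read_le16(hdr + 44); /* e_phnum */
    out->section_headers = read_le16(hdr + 48); /* e_shnum */
    return 0;
}
```

If that is right, the 29 sections matter much less to a simple loader than the handful of program headers do.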

Greg

--

Dept. Of Computing,                     web:    http://www.soi.city.ac.uk/~gel/
City University,                        phone:  +44 20 7477 8341
London, UK.  EC1V 0HB.                  fax:    +44 20 7477 8587



Sun, 10 Nov 2002 03:00:00 GMT  

Quote:

> > ways around that; I guess I can put the .text section in a code
> > segment, the .data in a data segment, and have DS-GS point to the
> > latter. Will that work?

> No, I don't think so. Flat memory model depends on all segment registers
> being loaded with a selector spanning the same linear addresses. DS
> is used to reference both stack and code segment.

Really? How is that possible? I thought push and pop, and other
stack-related instructions, *always* implied the use of SS!!


Sun, 10 Nov 2002 03:00:00 GMT  

Quote:

> First off, apologies for cross-posting; let me know if necessary to
> which group(s) my questions are most relevant.

  Alt.os.development is best.  Comp.lang.asm.x86 is moderated (I'm one
of the moderators).  This topic is close enough to on-topic there.  I
*think* it is off topic for comp.lang.c and you ought to drop that group
for further questions, but I can't speak with any authority about that
group.  Posting alt.os.development things to alt.os.assembly is common
and no one seems to mind.

Quote:
> *my OS uses segmentation (in protected mode, of course; it's written
> for 386+). But I understand that pretty much all compilers out there
> produce flat code. Is there one that doesn't?

  None that I know of.

Quote:
> *if the answer to the question above is 'no', I suppose there are
> ways around that; I guess I can put the .text section in a code
> segment, the .data in a data segment, and have DS-GS point to the
> latter. Will that work?

  I think there are some GCC compile options that are needed to make
that work.  I think it puts things like the tables for switch statements
into .text but accesses them with DS.  You need to make sure that
nothing in .text is accessed with DS.  I forget the details, but I know
that it can be done.

  It still uses DS for things in the stack segment.  I expect that would
be impossible to fix without major modifications to the compiler.  You
didn't mention it, but I would think a true segmented model would split
stack from data (I also think that is a good reason why a true segmented
model is a bad idea).

Quote:
> *if that works, what about function calls? I suppose the compilers,
> after putting the parameters on the stack, just produce an
> intra-segment jump or call. (because flat mode = one big segment,
> right?) Is there any way to 'patch' this during, say, the relocation
> phase at load time, to produce far (inter-segment) jumps or calls? If
> not, why not, if yes, how?

  Not directly, because the inter-segment call is a longer instruction.
I have done it (in mixing 16 bit small and medium memory models).  A
custom linker must generate an extra stub for every cross segment call.
The caller calls the stub (near) and the stub calls the other segment
far.  Storing the return linkage is very tricky because the called
routine knows the stack offsets of the parameters based on normal near
linkage.  In summary, it is not impossible, but for your purpose it is
not practical.
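
  For what it is worth, the mechanical part of the stub approach can be sketched in C as below. This is a simplified 32-bit illustration, not the 16-bit linker I actually wrote: emit_stub and retarget_near_call are hypothetical names, and the stub layout (far call followed by a near return) is just one possible choice. It does nothing about the hard part, which is that the callee now finds extra return linkage on the stack between itself and its parameters and must come back with a far return.

```c
#include <assert.h>
#include <stdint.h>

/* Byte layout of one hypothetical stub:
 *   9A <offset32> <selector16>   call far selector:offset
 *   C3                           ret (near, back to the original caller) */
enum { CALL_REL32_LEN = 5, STUB_LEN = 8 };

static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Write the stub that forwards to selector:offset in another segment. */
void emit_stub(uint8_t stub[STUB_LEN], uint16_t selector, uint32_t offset)
{
    assert(stub != 0);
    stub[0] = 0x9A;
    put_le32(stub + 1, offset);
    stub[5] = (uint8_t)selector;
    stub[6] = (uint8_t)(selector >> 8);
    stub[7] = 0xC3;
}

/* Re-aim an existing 5-byte near call (E8 rel32) at a stub that lives
 * in the caller's own code segment. The instruction length is unchanged. */
void retarget_near_call(uint8_t *call_insn, uint32_t call_site,
                        uint32_t stub_addr)
{
    assert(call_insn != 0 && call_insn[0] == 0xE8);
    put_le32(call_insn + 1, stub_addr - (call_site + CALL_REL32_LEN));
}
```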

  What do you want to achieve with a segmented model that can't be
achieved with paging?  Don't treat segmented model as a goal in itself.
Look at what you are trying to do and at how much effort you can
expend.  Maybe it can be solved without segmenting;  Maybe it can be
kludged with segmenting despite the flat model compiler;  Maybe you can
modify gcc (source code is available).

--
http://www.erols.com/johnfine/
http://www.geocities.com/SiliconValley/Peaks/8600/



Sun, 10 Nov 2002 03:00:00 GMT  

Visit http://www.mega-tokyo.com/osd and my homepage for the latest version of
the OS Loader, which currently boots from MS-DOS FAT12/16 partitions, sets
up PMode and runs a program made with either NASM or GCC/DJGPP (linked COFF
format).

All the info is available at the sites.

Good Luck
Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror:   http://members.xoom.com/alexfru




Sun, 10 Nov 2002 03:00:00 GMT  

Quote:
> > Yes, it's very bloated! My shared library for RDOS with only stubs
> > was 1MB!

> Do you have to have all this information in ELF though?  Can't you just
> have the sections you care about and discard extraneous stuff?

How? I've already discarded 2MB of debug information!
The only thing I can think of is to omit certain parts of the library.
However, the GLIBC project is so large that I have a feeling it would
take a long time just to figure out how the build process works. To be
usable, a working LIBC should be smaller than 100k, and I have a
feeling this would be next to impossible to achieve. The Win32 DLLs are
only about 30k.

Maybe the easiest way is to start from scratch and incorporate the
modules as I need them. This also would mean I'd have to do a lot of
work when new LIBC versions come out.

BTW, it took several days just to figure out how to compile it on LINUX.
Cygwin didn't work!




Mon, 11 Nov 2002 03:00:00 GMT  

Quote:

> > > Yes, it's very bloated! My shared library for RDOS with only stubs
> > > was 1MB!

> > Do you have to have all this information in ELF though?  Can't you just
> > have the sections you care about and discard extraneous stuff?

> How? I've already discarded 2MB of debug information!

This is not a rhetorical question:

can you not just do

  objcopy ld_generated_elf.o slim_elf.o -R all_useless_sections

so that you basically just copy .text, .data and .rodata?

Or have I got the wrong end of the stick? (I thought you were concerned
that ELF files incorporate all this extraneous crap).

Greg

--

Dept. Of Computing,                     web:    http://www.soi.city.ac.uk/~gel/
City University,                        phone:  +44 20 7477 8341
London, UK.  EC1V 0HB.                  fax:    +44 20 7477 8587



Mon, 11 Nov 2002 03:00:00 GMT  
 