2014/11/07

DISCLAIMER: These notes are from the defunct k8 project which precedes SquirrelJME. The notes for SquirrelJME start on 2016/02/26! The k8 project was effectively a Java SE 8 operating system and as such all of the notes are in the context of that scope. That project is no longer my goal as SquirrelJME is the spiritual successor to it.

03:27

Thinking about it, I will do this. First the classes will be translated to a nice form that is easier to translate (class metrics and such). After that they can either be dumped directly to KBF or other things for special kernel usage. For the special usage, magical stuff will be activated and made so that it works. I figure for my kernel design there will be a core hypervisor and the kernel will just be another hypervisor. The core hypervisor will do whatever it wants so when a sub-process such as the kernel requests some memory it will give it freely. However the core hypervisor will be quite limited in that it will lack support for filesystems and such, but it will permit access to whatever runs below it to system resources. So the core hypervisor will have less system calls in the IPC, very basic stuff. The kernel will handle threading and can control the CPU via the core hypervisor. The core hypervisor will need process and thread management in it however, so it knows where to direct IPC calls made between userspace processes and kernels. So when a kernel spawns a new subprocess it will own that. A subkernel running on the core will not have the capability to kill another kernel on the same level, but could kill any subprocess or thread that it owns (the kernel is a process). The core will have simple process management and can slice kernels time wise, where the kernel will have to schedule itself to run or other tasks. So multiple kernels can schedule their tasks at once. However a kernel running below a kernel (in a virtual machine) would be incapable of forcing scheduling of tasks so that it has less control (it could kill them and change their properties but it could not force a subprocess to be run ahead of a parent or sibling process). However, the kernel can just allocate slices based on what is runnable. So say if two threads need to be run and another is sleeping, the kernel can just requst that a slice be given to those two threads.

03:37

But back to the magic, the magic stuff allows stranger and more dangerous transformations of Java code such as enabling direct register access and such. Only the core hypervisor will be compiled with this in mind, the kernel will not be and will instead rely on the core hypervisor to manage maps and such. So when the kernel wants to map some memory for a process (perhaps for a memory mapped file) it can just request from the hypervisor to allocate (but not to fill) a virtual page for the specified process. Then when a page fault is performed on it, things happen. This means that creation of direct ByteBuffers will be handled by the core hypervisor (if it allows making one) and the objects will be gifted or made visible to a process that requested it. If the process dies or garbage collects the direct byte buffer then it is removed. Normal non-direct ByteBuffers are always backed by an array anyway so when those are created it is just a new byte array. Direct byte buffers are a window (managed by the core hypervisor, it creates this object) to real memory. Direct byte buffers are required to be initialized to zero however a kernel could request a direct buffer that does not clear its contents. If any clearing is specified it will not be done until the buffer is actually accessed (that is it page faults), and a direct byte buffer might be split on page boundaries of which they might be allocated and others might not be. The direct byte buffer would then need an owner which manages its contents and paging status and which thread/process triggered it. So if it is mapped to a file, then the process(es) accessing it will block until the kernel signals a ready. Data from a file will be read somewhere into that memory region then the page will be activated to that freshly allocated region for the subprocess and it will be resumed.

04:55

What I need to do is figure out the format of KBF files, I did make a giant map before but that would be insufficient to handle all the new stuff in the class file and everything in it (I did not know about a bunch of the stuff I know now then). I will need it to contain annotations, but the main thing would be methods and the ABI, both of those need to work. My main goal is so that a class only need to be loaded in memory once for every process, this means that if one process loads the class SocketChannel it gets loaded once in memory. All processes on the same kernel will share the same boot classpath and cannot override it. Then later on, another process decides to load SocketChannel, since it is already in memory the process just needs to initialize the class static stuff and linkage information. So this means that every class will need linkage zones that are used by the method code. Basically a per-class import and export table for that class (the table being static) that is basically a function pointer elsewhere in memory to that specific class code. However the one complex thing to handle would be interfaces. Interfaces could be in any order and have varying descriptors associated with them. It is not possible to statically bind where something points to in an interface. However, it could be done statically. When a class is initialized it may have implemented interfaces. In this case the class has an interface table with the expected exports for that type for each interface. So lets say that there are three classes: A, B, C, and an interface I. B and C both implement I, and A wants to call something in the interface I. Due to the varying order of methods A cannot know where the method lies in B and C. B and C will both have a table that is common for the interface which maps where the imports and exports go for the interface type. So A would obtain the interface table pointer for the specific B and C objects for the interface, then do a normal jump with the function pointer. Now the main issue is determining the actual interface specifier for a random class. I could do a linear search but that would be insanely slow as that would have to be done for all interfaces. On a sub note, subclasses that extend parent classes of which are not static will need to just stack on all of their fields, so that something similar to struct magic like in C will work where you can have fake object orientation with multiple structs that both have another but the same struct as the first member. You could cast both structs to that same member struct, type. Interfaces only have public static final fields so that is a non-issue. Anyway, I will need to think about this. The struct stuff mentioned before is called aliasing.

05:31

Thought about a least common denominator table for interfaces. Meaning that each class that is an interface gets an index which is static which points to an interface map for each class, where the interface table is located so it can execute methods and such. However, that would not work because if multiple interfaces get loaded they may get placed in the same slot. The only alternative that would be safe would be to have each interface have its own index, where then a class would have all of the interfaces mapped in. So say if 2000 interfaces are loaded on a 32-bit system, that would add an overhead of 8000 bytes to every loaded class so it can store the entire gigantic table for every interface that is possible. It would be fast as the classes could just index into it by the interface index offset. It would dereference the table then jump to the specified pointer in the table. It can work but that is not memory efficient. I do know one thing I can do regardless, is that I can always optimize the known stuff. The wrapped integer and float stuff and the Math and StrictMath classes. That is, instead of actually calling Math.pow() on a number, the compiler will just optimize the value at compile time. Since these are very well known classes that it would be horribly unoptimized to just call the method anyway. However, the method itself in the class will still be implemented (Math being as fast as possible while StrictMath being as accurate as possible). The same can go for the primitive wrapper types too, so stuff like Long.divideUnsigned() is never really called but optimized by the compiler to have the expected effect. Another thing that could be optimized is native type boxing. There would be a difference between stuff like Object o = 2 compared to foo("%d%n", 2). The compiler will have to detect if a boxed type is even wanted to be in an object. Although calling another method with a boxed type will require it to be created. So how would it be possible to make it so that it never does get created but is passed by value. Then if the method being invoked never makes the passed integer visible anywhere it does not have to create an object for it. I suppose for those objects, they can retain a very basic form. Still have a class identity but have a flat class type so that they are basically just the class identifier and the value they contain. Thus they would have no large overhead in memory but will still require that they are allocated. So there will need to be a way to determine stack locality. If an object is located on the stack in one method and it becomes exposed to outside objects, the sub method should be able to move the object from the stack and make it visible while filling the former stack area with a special inidicator that it got removed from the stack and placed at a specific region in memory. The stack would have to able to be grown similar to alloca in C, and new could either just add to the stack or create an object in the heap. So new would need some hints, if the object is just going to be set to a field then it can be allocated on the heap, if it is never exposed then it could be placed on the stack. So each allocation hint will have two states, ONSTACK and INHEAP. The calling class on new will not have any idea if the object violates something, so if it hints with ONSTACK and it turns out the object exposes itself everywhere then it will allocate on the heap instead as if INHEAP were called. So code will have to handle in the event that an object needs to be moved from the stack to the heap. Then if circumstances permit, it may be possible to move the object from the heap into the stack but that would mean that it has no references elsewhere so that would be limited. So object handling would have to handle cases where it is on the stack and in the heap.

05:59

Having interfaces hashing to a slot would be good, but the hash would need to be really good to not collide. However, the main thing that would be an issue is if a class decides to implement 65,535 interfaces, that would be very bad for hashing.

09:18

Another thing the compiler can optimize is the Atomic boxed classes which provide atomic interfaces. This would be so that two different CPUs which might see different memory can correctly read and write values in that they are truly atomic as per CPU.

10:39

Need to go into describing stack operations on the byte code ops.

13:01

After much typing, only 18 more opcodes to describe. Then once that is done I can start implementing a byte code reader so that methods are read and described. From there I can make super generic optimization and translation to reduce the number of rewrites for every architecture.