path: root/sys/src/9/pc64/main.c
Age | Commit message | Author
2022-08-12 | 9: compute available kernel pages using sizeof(Proc*) | Ori Bernstein
procs come from the dynamic pools, so we don't need to remove the memory used by possible procs from the total available.
2022-08-10 | 9/port: allow kiloprocs -- allocate procs lazily | Ori Bernstein
Preallocate the small data structures around procs eagerly, but use malloc to allocate the large proc data structures when we need them, which allows us to scale to many more procs. There are still many scalability bottlenecks, so we only crank up the nproc limit by a little bit this time around, and will crank it up more as we optimize more.
2021-05-12 | pc64: avoid getcr3() in mmuflushtlb() | cinap_lenrek
it turns out that calculating the physical address of pml4 is faster than reading the machine register, so pass it explicitly.
2021-01-17 | pc, pc64: add minimal HPET driver to measure LAPIC and TSC frequencies | cinap_lenrek
This adds the new function pointer PCArch.clockinit(), which is a timer dependent initialization routine. It also takes over the job of guesscpuhz(). This way, the architecture ident code can switch between different timers (i8253, HPET and XEN timer).
2020-12-20 | kernel: handle tos and per process pcycle counters in port/ | cinap_lenrek
we might as well handle the per process cycle counter in the portable part instead of duplicating the code in every arch and having inconsistent implementations. we now have portable kenter() and kexit() functions, meant to be used in trap/syscall from user, which update the counters. some kernels missed initializing Mach.cyclefreq.
2020-12-06 | pc, pc64: move all fpu specific code from main.c to fpu.c | cinap_lenrek
2020-12-06 | amd64: FP: back to static size for allocation and copying | Sigrid
2020-12-06 | amd64: FP: always use enough to fit AVX state and align to 64 bytes | Sigrid
2020-12-06 | amd64, vmx: support avx/avx2 for host/guest; use *noavx= in plan9.ini to disable | Sigrid
2020-12-05 | pc, pc64: allocate dma bounce buffer right after xinit() | cinap_lenrek
2020-11-21 | pc, pc64: fix grub multiboot | cinap_lenrek
It appears that our IDT overlaps with the data structures passed from grub in multiboot load. So defer setup of the interrupt table after the multiboot parameters have been processed.
2020-11-21 | pc, pc64: disable all pci devices for /dev/reboot | cinap_lenrek
Make sure all pci busmaster activity is disabled, including MSI/MSI-X interrupts, before switching control to the new kernel.
2020-11-17 | pc, pc64: load idt early in trapinit0() | cinap_lenrek
loading the interrupt vector table early allows us to handle traps during bootup before mmuinit() which gives better diagnostics for debugging. we also can handle general protection fault on rdmsr() and wrmsr() which helps during cpuidentify() and archinit() when probing for cpu features.
2020-09-13 | kernel: massive pci code rewrite | cinap_lenrek
The new pci code is moved to port/pci.[hc] and shared by all ports. Each port has its own PCI controller implementation, providing the pcicfgrw*() functions for low level pci config space access. The locking for pcicfgrw*() is now done by the caller (only port/pci.c). Device drivers now need to include "../port/pci.h" in addition to "io.h". The new code now checks bridge windows and membars while enumerating the bus, giving the pc driver a chance to re-assign them. This is needed because some UEFI implementations fail to assign the bars for some devices, so we need to do it ourselves. (See pcireservemem()). While working on this, it was discovered that the pci code assumed the smallest I/O bar size is 16 (pcibarsize()), which is wrong. I/O bars can be as small as 4 bytes. Bit 1 in an I/O bar is also reserved and should be masked off, making the port mask: port = bar & ~3;
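
The port mask mentioned at the end of this message can be illustrated with a short sketch. The helper name iobarport() is hypothetical and is not taken from port/pci.c; only the expression bar & ~3 comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the actual port/pci.c code.
 * In a PCI I/O BAR, bit 0 is set to mark I/O space and bit 1 is
 * reserved, so the port base is the BAR value with the low two
 * bits cleared.
 */
uint32_t
iobarport(uint32_t bar)
{
	assert(bar & 1);	/* bit 0 set: this is an I/O BAR */
	return bar & ~(uint32_t)3;
}
```

With a 4-byte I/O bar read back as 0xC005, this yields port 0xC004, an address that a mask assuming 16-byte granularity would round down to 0xC000.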
2020-04-04 | pc, pc64: new memory map code | cinap_lenrek
This replaces the memory map code for both pc and pc64 kernels with a unified implementation using the new portable memory map code. The main motivation is to be robust against broken e820 memory maps by the bios and delay the Conf.mem[] allocation after archinit(), so mp and acpi tables can be reserved and excluded from user memory.
There are a few changes:
new memreserve() function has been added for archinit() to reserve bios and acpi tables.
upareserve() has been replaced by upaalloc(), which now has an address argument.
umbrwmalloc() and umbmalloc() have been replaced by umballoc().
both upaalloc() and umballoc() return physical addresses or -1 on error. the physical address -1 is now used as a sentinel value instead of 0 when dealing with physical addresses.
archmp and archacpi now always use vmap() to access the bios tables and reserve the ranges.
more overflow checks have been added.
ramscan() has been rewritten using vmap().
to handle the population of kernel memory, pc and pc64 now have pmap() and punmap() functions to do permanent mappings.
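
The switch from 0 to -1 as the error value matters because physical address 0 is a legitimate allocation result. The following is a minimal sketch of that convention using a toy allocator. The names NOADDR, Range and rangealloc() are assumptions made for illustration; they are not the signatures of upaalloc() or umballoc().

```c
#include <assert.h>
#include <stdint.h>

#define NOADDR (~(uint64_t)0)	/* hypothetical name for the -1 sentinel */

typedef struct Range Range;
struct Range {
	uint64_t base;	/* first free physical address */
	uint64_t size;	/* bytes remaining */
};

/*
 * Toy bump allocator over a single free range (illustration only).
 * Returns the allocated physical address, or NOADDR on failure.
 * align must be a power of two.
 */
uint64_t
rangealloc(Range *r, uint64_t size, uint64_t align)
{
	uint64_t pa;

	assert(align != 0 && (align & (align-1)) == 0);
	pa = (r->base + align-1) & ~(align-1);
	/* reject wraparound while aligning and requests that do not fit */
	if(pa < r->base || size > r->size || pa - r->base > r->size - size)
		return NOADDR;
	r->size -= (pa - r->base) + size;
	r->base = pa + size;
	return pa;
}
```

A caller compares the result against NOADDR rather than testing for zero, so a range that starts at physical address 0 can still be handed out.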
2020-01-26 | kernel: implement portable userinit() and simplify process creation | cinap_lenrek
replace machine specific userinit() by a portable implementation that uses kproc() to create the first process. the initcode text is mapped using kmap(), so there is no need for machine specific tmpmap() functions. initcode stack preparation should be done in init0() where the stack is mapped and can be accessed directly. replacing the machine specific userinit() allows some big simplifications as sysrfork() and kproc() are now the only callers of newproc() and we can avoid initializing fields that we know are being initialized by these callers. rename autogenerated init.h and reboot.h headers. the initcode[] and rebootcode[] blobs are now in *.i files and hex generation was moved to portmkfile. the machine specific mkfile only needs to specify how to build rebootcode.out and initcode.out.
2019-12-02 | pc, pc64: clear debug watchpoint registers on exec and exit | cinap_lenrek
when a process does an exec syscall, procsetup() is called and we have to disable the debug watchpoint registers. just clearing p->dr is not enough as we are not going thru a procsave() and procrestore() cycle which would disable and reload the saved debug registers. instead of clearing debug registers in procfork(), we should clear the saved debug registers before a process goes to die (pexit() calls sched() with up->state = Moribund) as the Proc structure can get reused for kernel processes (kproc) which never call procfork() and would therefore have debug registers loaded.
2019-08-29 | pc64: map kernel text readonly and everything else no-execute | cinap_lenrek
the idea is to catch bugs and make kernel exploitation harder by mapping the kernel text section readonly and everything else no-execute. l.s maps the KZERO address space using 2MB pages so to get the 4K granularity for the text section we use the new ptesplit() function to split that mapping up. we need to set EFER no-execute enable bit early in apbootstrap so secondary application processors will understand the NX bit in our shared kernel page tables. also APBOOTSTRAP needs to be kept executable. rebootjump() needs to mark REBOOTADDR page executable.
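
To make the 2MB-to-4K split concrete, here is a simplified model of expanding one large-page directory entry into a full page table. It is a sketch under stated assumptions and is not the 9front ptesplit(): the function name splitpde() and the constant names are hypothetical, and the PAT bit (bit 12 of a 2MB entry) is assumed to be clear.

```c
#include <assert.h>
#include <stdint.h>

enum {
	PTEVALID = 1<<0,
	PTEWRITE = 1<<1,
	PTESIZE  = 1<<7,	/* in a page directory entry: maps a 2MB page */
	NPTE     = 512,		/* entries in one amd64 page table */
	PGSZ     = 4096,
};

/*
 * Illustrative model only. Expand a 2MB page directory entry into
 * 512 4K page table entries that cover the same physical range and
 * carry the same flag bits, including no-execute (bit 63).
 */
void
splitpde(uint64_t pde, uint64_t pte[NPTE])
{
	uint64_t base, flags, nx;
	int i;

	assert(pde & PTESIZE);
	base = pde & 0x000FFFFFFFE00000ull;	/* 2MB aligned frame */
	flags = pde & 0xFFFull & ~(uint64_t)PTESIZE;
	nx = pde & (1ull << 63);
	for(i = 0; i < NPTE; i++)
		pte[i] = (base + (uint64_t)i * PGSZ) | flags | nx;
}
```

Once a mapping is split this way, individual entries can be changed independently, for example clearing PTEWRITE only on the 4K pages that hold kernel text.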
2019-06-28 | pc64: preallocate mmupool page tables | cinap_lenrek
preallocate 2% of user pages for page tables and MMU structures and keep them mapped in the VMAP range. this leaves more space in the KZERO window and avoids running out of kernel memory on machines with large amounts of memory.
2019-06-20 | pc64: actually fix it, what was i THINKING | cinap_lenrek
2019-06-20 | pc64: fix compiler warning in rebootjump() entry calculation | cinap_lenrek
2018-11-19 | pc, pc64: park application processors in rebootcode with mmu off | cinap_lenrek
instead of having application processors spin in mpshutdown() with mmu on, and be subject to reboot() overriding kernel text and modifying page tables, park the application processors in rebootcode idle loop with the mmu off.
2018-07-11 | pc64: update headers to match pc | aiju
2018-05-27 | sdram: experimental ramdisk driver | cinap_lenrek
this driver makes regions of physical memory accessible as a disk. to use it, ramdiskinit() has to be called before confinit(), so that conf.mem[] banks can be reserved. currently, only pc and pc64 kernel use it, but otherwise the implementation is portable. ramdisks are not zeroed when allocated, so that the contents are preserved across warm reboots. to not waste memory, physical segments do not allocate Page structures or populate the segment pte's anymore. there's also a new SG_CACHED attribute.
2018-05-21 | pc64: fix fpu bug | cinap_lenrek
fpurestore() unconditionally changed fpstate to FPinactive when the kernel used the FPU. but in the FPinit case, the registers are not saved by mathemu(), resulting in all zero initialized registers being loaded once userspace uses the FPU so the process would have wrong MXCSR value. the index overflow check was wrong with using shifted value.
2018-02-18 | devether: mux bridges, portable netconsole | cinap_lenrek
2017-11-16 | 9pc64: handle special case in fpurestore() for procexec()/procsetup() | cinap_lenrek
when a process does an exec, it calls procsetup() which unconditionally sets the TS flag and fpstate=FPinit and fpurestore() should not revert the fpstate.
2017-11-14 | pc64: fix fpurestore() mistake | cinap_lenrek
cannot just reenable the fpu in FPactive case as we might have been procsaved() and rescheduled on another cpu. what was i thinking... thanks qu7uux for reproducing the problem.
2017-11-12 | pc64: allow using the FPU in syscall and pagefault handlers | cinap_lenrek
The aim is to take advantage of SSE instructions such as AES-NI in the kernel by lazily saving and restoring FPU state across system calls and pagefaults. (everything that can do I/O) This is accomplished by the functions fpusave() and fpurestore(). fpusave() remembers the current state and disables the FPU if it was active by setting the TS flag. In case the FPU gets used, the current state gets saved and a new PFPU.fpslot is allocated by mathemu(). fpurestore() restores the previous FPU state, reenabling the FPU if fpusave() disabled it. In the most common case, when userspace is not using the FPU, then fpusave()/fpurestore() just toggle the FPpush bit in up->fpstate. When the FPU was active, but we do not use the FPU, then nothing needs to be saved or restored. We just switched the TS flag on and off again. Note, this is done for the amd64 kernel only.
2017-11-04 | kernel: introduce per process FPU struct (PFPU) for more flexible machine specific fpu handling | cinap_lenrek
introducing the PFPU structure which allows the machine specific code some flexibility on how to handle the FPU process state. for example, in the pc and pc64 kernel, the FPsave structure is around 512 bytes. with avx512, it could grow up to 2K. instead of embedding that into the Proc structure, it is more effective to allocate it on first use of the fpu, as most processes do not use simd or floating point in the first place. also, the FPsave structure has special 16 byte alignment constraint, which further favours dynamic allocation. this gets rid of the memmoves in pc/pc64 kernels for the alignment. there is also devproc, which is now checking if the fpsave area is actually valid before reading it, so debuggers do not see garbage data. the Notsave structure is gone now, as it was not used on any machine.
2017-10-29 | kernel: introduce devswap #¶ to serve /dev/swap and handle swapfile encryption | cinap_lenrek
2017-09-02 | devvmx: call vmxshutdown from reboot() function manually | aiju
2017-08-28 | devvmx, vmx: lilu dallas multivm | aiju
2017-06-28 | kernel: pass bootargs also in multiboot command line, retire the bootline mechanism to pass arguments to /boot/boot | cinap_lenrek
2017-06-25 | pc, pc64: support for multiboot framebuffer, common bootargs and multiboot code | cinap_lenrek
2017-06-20 | pc, pc64: adapt devvmx to work on pc64 | aiju
2017-06-13 | pc/pc64: keep shadow copy of DR7 in Mach and use that to check whether we need to reset DR7 in procsave(); remove superfluous reset of DR7 in mmurelease() | aiju
2017-06-12 | kernel: add support for hardware watchpoints | aiju
2017-03-11 | kernel: get rid of active.Lock and active.thunderbirdsargo | cinap_lenrek
2017-03-11 | pc kernel: give cpu servers as many image cache structures as processes | cinap_lenrek
2016-01-05 | kernel: change active.machs from bitmap to char array to support up to 64 cpus on pc64 | cinap_lenrek
2015-11-30 | kernel: cleanup exit()/shutdown()/reboot() code | cinap_lenrek
introduce cpushutdown() function that does the common operation of initiating shutdown, returning once all cpus got the message and are about to shutdown. this avoids duplicated code which isn't really machine specific. automatic reboot on panic only when *debug= is not set and the machine is a cpu server or has no display, otherwise just hang.
2015-08-05 | devkbd: disable mouse/keyboard on shutdown, disable ps2 mouse on init, remove kbdenable()/kbdinit() | cinap_lenrek
on vmware, loading a new kernel sometimes reboots when wiggling the mouse. disabling keyboard and mouse on shutdown fixes the issue. make sure ps2 mouse is disabled on init, will get re-enabled in i8042auxenable(). keyboard isn't special anymore, we can just use the devreset entry point in the device to do the keyboard initialization, so kbdinit()/kbdenable() are not needed anymore.
2015-02-07 | kernel: reduce Page structure size by changing Page.cachectl[] | cinap_lenrek
there are no kernels currently that do page coloring, so the only use of cachectl[] is flushing the icache (on arm and ppc). on pc64, cachectl consumes 32 bytes in each page resulting in over 200 megabytes of overhead for 32gb of ram with 4K pages. this change removes cachectl[] and adds txtflush ulong that is set to ~0 by pio() to instruct putmmu() to flush the icache.
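
The overhead figure quoted here follows from simple arithmetic: 32GB of ram in 4K pages is 8388608 pages, and 32 bytes per page comes to 256MB. A small sketch of the calculation, with the hypothetical helper name pageoverhead():

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of per-page bookkeeping needed to describe rambytes of
 * memory, given the page size and the bookkeeping bytes per page.
 * Hypothetical helper for illustration only.
 */
uint64_t
pageoverhead(uint64_t rambytes, uint64_t pagesize, uint64_t perpage)
{
	assert(pagesize != 0);
	return (rambytes / pagesize) * perpage;
}
```

Calling pageoverhead(32ULL<<30, 4096, 32) gives 268435456 bytes, which matches the "over 200 megabytes" stated in the message.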
2014-12-19 | pc, pc64: adjust mpshutdown() comment to reflect the current state | cinap_lenrek
2014-12-17 | pc, pc64: remove old B.COM command line parsing and just pass tokenized BOOTLINE to /boot/boot as argv[] | cinap_lenrek
this change allows command line passing to /boot/boot from qemu like: qemu -kernel 9pcf -append "-u glenda tcp"
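
As an illustration of what passing a tokenized command line as argv[] means, the sketch below splits a line on spaces and tabs. It is a simplified stand-in and not the kernel's code: the name splitargs() is hypothetical, quoting is not handled, and whether the kernel adds a program name in front is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for a tokenizer (illustration only).
 * Splits the writable string s in place on spaces and tabs and
 * stores up to maxargs pointers in argv. Returns the count.
 */
int
splitargs(char *s, char **argv, int maxargs)
{
	int argc = 0;

	assert(s != NULL && argv != NULL);
	while(argc < maxargs){
		while(*s == ' ' || *s == '\t')
			s++;
		if(*s == '\0')
			break;
		argv[argc++] = s;
		while(*s != '\0' && *s != ' ' && *s != '\t')
			s++;
		if(*s == '\0')
			break;
		*s++ = '\0';	/* terminate this token */
	}
	return argc;
}
```

Applied to the qemu example, the string "-u glenda tcp" yields three arguments: "-u", "glenda" and "tcp".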
2014-12-16 | kernel: remove obsolete comment regarding Mntcache size in */main.c | cinap_lenrek
2014-10-18 | pc, pc64: allow passing RSDT pointer in *acpi= boot parameter, early bootscreeninit(), fix rampage() usage | cinap_lenrek
rampage() cannot be used after meminit(), so test for conf.mem[0].npage != 0 and use xalloc()/mallocalign() instead. this allows us to use vmap() early before mmuinit() which is needed for bootscreeninit() and acpi. to get memory for page tables, pc64 needs a lowraminit(). with EFI, the RSDT pointer is passed in *acpi= parameter from the efi loader. as the RSDT is usually at the end of the physical address space (and not to be found in bios areas), we cannot KMAP() it so we need to vmap().
2014-09-21 | pc64: print "Plan 9" on boot, cleanup pccpu64 files | cinap_lenrek
2014-06-22 | pc64: fix comment for preallocpages() | cinap_lenrek