summaryrefslogtreecommitdiff
path: root/sys/src/9/port/portfns.h
AgeCommit message (Collapse)Author
2023-04-08kernel: Clear secrets on rebootcinap_lenrek
The idea is that when we reboot, we zero out memory written by processes that have the private flag set (such as factotum and keyfs), and also clear the secrmem pool, which contains TLS keys and the state of the random number generator. This is so the newly booted kernel or firmware will not find these secret keys in memory.
2022-09-259/port: revert timer wheel change, breaks pi4 boot, needs more time ↵cinap_lenrek
investigating
2022-09-249/port: reimplement timers to use timer wheelOri Bernstein
when many processes go to sleep, our old timer would slow to a crawl; this new implementation does not.
2022-09-03kernel: half NERR, refcount Note's to avoid excessive allocations for ↵cinap_lenrek
postnotepg() Half NERR stack to 32. When posing a note to a large group, avoid allocating Notes for each individual process, but post the reference instread. factor out process interruption into procinterrupt(). Avoid allocation of notes in alarmkproc, just posting the same note to everyone.
2022-08-17kernel: allocate notes in heapcinap_lenrek
de-bloat the proc structure by allocating notes with on the heap instead of embedding them in the proc structure. This saves around 640 bytes per process.
2022-08-17kernel: simplify notify() adding common popnote() functioncinap_lenrek
Handlin notes is common for all architectures except how the note has to be pushed on the user stack. This change adds a popnote() function that returns only the note string or nil if the process should not be notified (no notes or user notes hold off). Popnote() also handles common errors like notify during note handling or missing note handler and will suicide the process in that case.
2022-08-13dtracy: make timer probes run in interrupt contextOri Bernstein
When probing a timer, we were running in our own kproc, and not in an interrupt context, which meant that we didn't have any access to anything worth sampling, so we didn't give any data back. This moves the probe to the hzclock interrupt, and returns the pc in the probe.
2022-05-28kernel: add chdev command to devconsJacob Moody
2021-10-23kernel: fix stat bugscinap_lenrek
In a few places, we where using a fixed buffer of sizeof(Dir)+100 size for stat. This is not correct and fails if the name returned in stat is long. This results in being unable to seek to the end of file with a long filename. The kernel should do the same thing as dirfstat() from libc; handling the conversion and buffer allocation and returning a freeable Dir* pointer. For this, a new dirchanstat() function was added. The fstat syscall was not rewriting the name to the last path element; fix it. In addition, gracefully handle the mountfix case, reallocating the buffer to accomidate the required stat length plus size of the new name so dirsetname() does not fail.
2021-10-11kernel: move waserror() macro to port/portfns.hcinap_lenrek
2021-10-03kernel: ensure that all accesses to Mhead.mount is done with Mhead.lock acquiredcinap_lenrek
The Mhead structures have two sources of references to them: - from Pgrp.mnthash hash-table - from a channels Chan.umh pointer as returned by namec() for a union directory Unless one holds the Mhead.lock RWLock, the Mhead.mount chain can be mutated by eigther cmount(), cunmount() or closepgrp(). Readers, skipping acquiering the lock where: mountfix(): responsible for rewriting directory entries for union directory reads; was walking the Mhead.mount chain to detect if the passed channel itself appears in the mount list. cmount(): had a check and copy when "new" chan was a union itself and if the MCREATE flag is set and would copy the mount table. All this needs to be done with Mhead read-locked while copying the mount entries. devproc(): in the handler for reading /proc/n/ns file. namec(): while checking if the Chan->umh should be initialized. In addition to this, cmount() is changed to do the mountfree() of the original mount chain when MREPL is done after releasing the locks. Also, some cosmetic changes...
2021-05-29kernel: use 64-bit virtual entry point for expanded header, document ↵cinap_lenrek
behaviour in a.out(6) For 64-bit architectures, the a.out header has the HDR_MAGIC flag set in the magic and is expanded by 8 bytes containing the 64-bit virtual address of the programs entry point. While Exec.entry contains physical address for kernel images. Our sysexec() would always use Exec.entry, even for 64-bit a.out binaries, which worked because PADDR(entry) == entry for userspace pointers. This change fixes it, having the kernel use the 64-bit entry point and document the behaviour in the manpage.
2021-04-02kernel: get rid of physical page bank array and use conf.mem[] insteadcinap_lenrek
We can take advantage of the fact that xinit() allocates kernel memory from conf.mem[] banks always at the beginning of a bank, so the separate palloc.mem[] array can be eleminated as we can calculate the amount of non-kernel memory like: upages = cm->npage - (PGROUND(cm->klimit - cm->kbase)/BY2PG) for the number of reserved kernel pages, we provide the new function: ulong nkpages(Confmem*) This eleminates the error case of running out of slots in the array and avoids wasting memory in ports that have simple memory configurations (compared to pc/pc64).
2020-12-22kernel: avoid palloc lock during mmurelease()cinap_lenrek
Previously, mmurelease() was always called with palloc spinlock held. This is unneccesary for some mmurelease() implementations as they wont release pages to the palloc pool. This change removes pagechainhead() and pagechaindone() and replaces them with just freepages() call, which aquires the palloc lock internally as needed. freepages() avoids holding the palloc lock while walking the linked list of pages, avoding some lock contention.
2020-12-20kernel: handle tos and per process pcycle counters in port/cinap_lenrek
we might as well handle the per process cycle counter in the portable part instead of duplicating the code in every arch and have inconsistent implementations. we now have a portable kenter() and kexit() function, that is ment to be used in trap/syscall from user, which updates the counters. some kernels missed initializing Mach.cyclefreq.
2020-12-19kernel: remove Proc* argument from procsetuser() functioncinap_lenrek
2020-12-13kernel: implement per file descriptor OCEXEC flag, reject ORCLOSE when ↵cinap_lenrek
opening /fd, /srv and /shr The OCEXEC flag used to be maintained per channel, making it shared between all the file desciptors. This has a unexpected side effects with regard to channel passing drivers such as devdup (/fd), devsrv (/srv) and devshr (/shr). For example, opening a /srv file with OCEXEC makes it impossible to be remounted by exportfs as it internally does a exec() to mount and re-export it. There is no way to reset the flag. This change makes the OCEXEC flag per file descriptor, so a open with the OCEXEC flag only affects the fd group of the calling process, and not the channel itself. On rfork(RFFDG), the per file descriptor flags get copied. On dup(), the per file descriptor flags are reset. The second modification is that /fd, /srv and /shr should reject the ORCLOSE flag, as the files that are returned have already been opend.
2020-11-03pc, pc64: allocate i/o port space for unassigned pci bars, move ioalloc() to ↵cinap_lenrek
port/iomap.c With some newer UEFI firmware, not all pci bars get programmed and we have to assign them ourselfs. This was already done for memory bars. This change adds the same for i/o port space, by providing a ioreservewin() function which can be used to allocate port space within the parent pci-pci bridge window. Also, the pci code now allocates the pci config space i/o ports 0xCF8/0xCFC so userspace needs to use devpnp to access pci config space now. (see latest realemu change). Also, this moves the ioalloc()/iofree() code out of devarch into port/iomap.c as it can be shared with the ppc mtx kernel.
2020-04-26kernel: improve page reclaimation strategy and lockingcinap_lenrek
when reclaiming pages from an image, always reclaim all the hash chains equally. that way, we avoid being biased towards the chains at the start of the Image.pghash[] array. images can be in two states: active or inactive. inactive images are the ones which are not used by program while active ones aare. when reclaiming pages, we should try to reclaim pages from inactive images first and only if that set becomes exhausted attempt to release text pages and attempt to reclaim pages from active images. when we run out of Image structures, it makes only sense to reclaim pages from inactive images, as reclaiming pages from active ones will never free any Image structures. change putimage() to require a image already locked and make it unlock the image. this avoids many pointless unlock()/lock() sequences as all callers of putimage() already had the image locked.
2020-04-12kernel: remove unused mem2bl() prototypecinap_lenrek
2020-04-04kernel: add portable memory map code (port/memmap.c)cinap_lenrek
This is a generic memory map for physical addresses. Entries can be added with memmapadd() giving a range and a type. Ranges can be allocated and freed from the map. The code automatically resolves overlapping ranges by type priority.
2020-02-23kernel: fix multiple devproc bugs and pid reuse issuescinap_lenrek
devproc assumes that when we hold the Proc.debug qlock, the process will be prevented from exiting. but there is another race where the process has already exited and the Proc* slot gets reused. to solve this, on process creation we also have to acquire the debug qlock while initializing the fields of the process. this also means newproc() should only initialize fields *not* protected by the debug qlock. always acquire the Proc.debug qlock when changing strings in the proc structure to avoid doublefree on concurrent update. for changing the user string, we add a procsetuser() function that does this for auth.c and devcap. remove pgrpnote() from pgrp.c and replace by static postnotepg() in devproc. avoid the assumption that the Proc* entries returned by proctab() are continuous. fixed devproc permission issues: - make sure only eve can access /proc/trace - none should only be allowed to read its own /proc/n/text - move Proc.kp checks into procopen() pid reuse was not handled correctly, as we where only checking if a pid had a living process, but there still could be processes expecting a particular parentpid or noteid. this is now addressed with reference counted Pid structures which are organized in a hash table. read access to the hash table does not require locks which will be usefull for dtracy later.
2020-01-26kernel: implement portable userinit() and simplify process creationcinap_lenrek
replace machine specific userinit() by a portable implemntation that uses kproc() to create the first process. the initcode text is mapped using kmap(), so there is no need for machine specific tmpmap() functions. initcode stack preparation should be done in init0() where the stack is mapped and can be accessed directly. replacing the machine specific userinit() allows some big simplifications as sysrfork() and kproc() are now the only callers of newproc() and we can avoid initializing fields that we know are being initialized by these callers. rename autogenerated init.h and reboot.h headers. the initcode[] and rebootcode[] blobs are now in *.i files and hex generation was moved to portmkfile. the machine specific mkfile only needs to specify how to build rebootcode.out and initcode.out.
2019-12-07pc: replace duplicated and broken mmu flush code in vunmap()cinap_lenrek
comparing m with MACHP() is wrong as m is a constant on 386. add procflushothers(), which flushes all processes except up using common procflushmmu() routine.
2019-09-19kernel: simplify pgrpnote(); moving the note string copying to procwrite()cinap_lenrek
keeps handling of devproc's note and notepg files similar and in the same place and reduces stack usage.
2019-08-27kernel: catch execution read fault on SG_NOEXEC segmentcinap_lenrek
fault() now has an additional pc argument that is used to detect fault on a non-executable segment. that is, we check on read fault if the segment has the SG_NOEXEC attribute and the program counter is within faulting page.
2019-05-01kernel: get rid of checkpagerefs() debuggingcinap_lenrek
was only implemented by the pc kernel. does not account pages used by the mount cache.
2019-05-01kernel: export freepages() function so it can be used in mmurelease()cinap_lenrek
2019-01-22devswap: simplify, don't panic when writing swapfile failscinap_lenrek
always start the pager kproc in swapinit(), simplifying kickpager(). allow zero conf.nswap and conf.nswppo. avoid allocating the reference map and iolist arrays in that case. use ulong for ioptr and iolist indices. don't panic when writing pages out to the swapfile fails. just requeue the page in the io transaction list so we will try again next time executeio() is run or just free the page when the swap reference was dropped. remove unused pagersummary() function.
2018-05-27sdram: experimental ramdisk drivercinap_lenrek
this driver makes regions of physical memory accessible as a disk. to use it, ramdiskinit() has to be called before confinit(), so that conf.mem[] banks can be reserved. currently, only pc and pc64 kernel use it, but otherwise the implementation is portable. ramdisks are not zeroed when allocated, so that the contents are preserved across warm reboots. to not waste memory, physical segments do not allocate Page structures or populate the segment pte's anymore. theres also a new SG_CHACHED attribute.
2018-01-05stats: show amount of reclaimable pages (add -r flag)cinap_lenrek
reclaimable pages are user pages that are used for caches like the image cache, mount cache and swap cache.
2017-10-29kernel: introduce devswap #¶ to serve /dev/swap and handle swapfile encryptioncinap_lenrek
2017-06-12kernel: add support for hardware watchpointsaiju
2017-03-29kernel: fix twakeup()/timerdel() race conditioncinap_lenrek
timerdel() did not make sure that the timer function is not active (on another cpu). just acquiering the Timer lock in the timer function only blocks the caller of timerdel()/timeradd() but not the other way arround (on a multiprocessor). this changes the timer code to track activity of the timer function, having timerdel() wait until the timer has finished executing.
2017-01-12kernel: make the mntcache robust against fileserver like fossil that do not ↵cinap_lenrek
change the qid.vers on wstat introducing new ctrunc() function that invalidates any caches for the passed in chan, invoked when handling wstat with a specified file length or on file creation/truncation. test program to reproduce the problem: #include <u.h> #include <libc.h> #include <libsec.h> void main(int argc, char *argv[]) { int fd; Dir *d, nd; fd = create("xxx", ORDWR, 0666); write(fd, "1234", 4); d = dirstat("xxx"); assert(d->length == 4); nulldir(&nd); nd.length = 0; dirwstat("xxx", &nd); d = dirstat("xxx"); assert(d->length == 0); fd = open("xxx", OREAD); assert(read(fd, (void*)&d, 4) == 0); }
2016-11-12kernel/qio: make readblist() offset of type ulong as the restcinap_lenrek
2016-11-07kernel/qio: big cleanup of qio functionscinap_lenrek
remove bl2mem(), it is broken. a fault while copying to memory yields a partially freed block list. it can be simply replaced by readblist() and freeblist(), which we also use for qcopy() now. remove mem2bl(), and handle putting back remainer from a short read internally (splitblock()) avoiding the releasing and re- acquiering of the ilock. always attempt to free blocks outside of the ilock. have qaddlist() return the number of bytes enqueued, which avoids walking the block list twice.
2016-11-05kernel: avoid padblock copying for devtls/devssl/esp, cleanup debuggingcinap_lenrek
to avoid copying in padblock() when adding cryptographics macs to a block in devtls/devssl/esp we reserve 16 extra bytes to the allocation. remove qio ixsummary() function and add acid function qiostats() to /sys/lib/acid/kernel simplify iallocb(), remove iallocsummary() statitics.
2016-09-11kernel: xoroshiro128+ generator for rand()/nrand()cinap_lenrek
the kernels custom rand() and nrand() functions where not working as specified in rand(2). now we just use libc's rand() and nrand() functions but provide a custom lrand() impelmenting the xoroshiro128+ algorithm as proposed by aiju.
2016-08-27kernel: switch to fast portable chacha based seed-once random number generatorcinap_lenrek
2016-08-27kernel: add secalloc() and secfree() functions for secret memory allocationcinap_lenrek
The kernel needs to keep cryptographic keys and cipher states confidential. secalloc() allocates memory from the secret pool which is protected from debuggers reading the memory thru devproc. secfree() releases the memory, overriding the data with garbage.
2016-03-30devsegment: cleanupscinap_lenrek
- return distinct error message when attempting to create Globalseg with physseg name - copy directory name to up->genbuf so it stays valid after we unlock(&glogalseglock) - cleanup wstat() handling, allow changing uid - make sure global segment size is below SEGMAXSIZE - move isoverlap() check from globalsegattach() into segattach() - remove Proc* argument from globalsegattach(), segattach() and isoverlap() - make Physseg.attr and segattach attr parameter an int for consistency
2016-03-27zynq: introduce SG_FAULT to prevent access to AXI segment while PL is not readycinap_lenrek
access to the axi segment hangs the machine when the fpga is not programmed yet. to prevent access, we introduce a new SG_FAULT flag, that when set on the Segment.type or Physseg.attr, causes the fault handler to immidiately return with an error (as if the segment would not be mapped). during programming, we temporarily set the SG_FAULT flag on the axi physseg, flush all processes tlb's that have the segment mapped and when programming is done, we clear the flag again.
2016-03-10kernel: make fversion()/mntversion() types consistentcinap_lenrek
2015-12-21kernel: missing changes for ibrk() prototypecinap_lenrek
2015-11-30kernel: cleanup exit()/shutdown()/reboot() codecinap_lenrek
introduce cpushutdown() function that does the common operation of initiating shutdown, returning once all cpu's got the message and are about to shutdown. this avoids duplicated code which isnt really machine specific. automatic reboot on panic only when *debug= is not set and the machine is a cpu server or has no display, otherwise just hang.
2015-08-09kernel: pgrpcpy(), simplify Mount structurecinap_lenrek
instead of ordering the source mount list, order the new destination list which has the advantage that we do not need to wlock the source namespace, so copying can be done in parallel and we do not need the copy forward pointer in the Mount structure. the Mhead back pointer in the Mount strcture was unused, removed.
2015-08-06kernel: change vmemchr() length argument to ulong and simplifycinap_lenrek
2015-07-28kernel: export mntattach() from devmnt.c avoiding bogus struct passing and ↵cinap_lenrek
special case in namec() we already export mntauth() and mntversion(), so why not stop being sneaky and just export mntattach() so bindmount() and devshr can just call it directly with proper arguments being checked. we can also avoid handling #M attach specially in namec() by having the devmnt's attach function do error(Enoattach).
2015-07-26kernel: pipelined read ahead for the mount cachecinap_lenrek
this changes devmnt adding mntrahread() function and some helpers for it to do pipelined sequential read ahead for the mount cache. basically, cread() calls mntrahread() with Mntrah structure and it figures out if we where reading sequentially and if thats the case issues reads of c->iounit size in advance. the read ahead state (Mntrah) is kept in the mount cache so we can handle (read ahead) cache invalidation in the presence of writes.