path: root/sys/src/9/port/devproc.c
Age | Commit message | Author
2022-09-03 | kernel: halve NERR, refcount Notes to avoid excessive allocations for postnotepg() | cinap_lenrek
Halve the NERR stack to 32. When posting a note to a large group, avoid allocating a Note for each individual process and post a reference instead. Factor out process interruption into procinterrupt(). Avoid allocating notes in alarmkproc by posting the same note to everyone.
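A minimal sketch of the reference-counted note idea described above, assuming a single-threaded setting. The names mknote(), increfnote(), freenote() and postgroup() are hypothetical and are not the kernel's actual functions; the real code also needs locking or atomic reference counts.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Note Note;
struct Note {
	int ref;		/* number of holders of this note */
	char msg[128];
};

/* Allocate a note holding a copy of msg, with one reference. */
Note*
mknote(const char *msg)
{
	Note *n;

	n = malloc(sizeof *n);
	if(n == NULL)
		return NULL;
	n->ref = 1;
	strncpy(n->msg, msg, sizeof n->msg - 1);
	n->msg[sizeof n->msg - 1] = '\0';
	return n;
}

/* Take another reference instead of copying the note. */
Note*
increfnote(Note *n)
{
	n->ref++;
	return n;
}

/* Drop a reference; returns 1 if this was the last one and the note was freed. */
int
freenote(Note *n)
{
	if(--n->ref > 0)
		return 0;
	free(n);
	return 1;
}

/*
 * Post one note to nproc queues (hypothetical stand-ins for per-process
 * note queues): one allocation, nproc extra references.
 * Returns the resulting reference count.
 */
int
postgroup(Note *n, Note **queue, int nproc)
{
	int i;

	for(i = 0; i < nproc; i++)
		queue[i] = increfnote(n);
	return n->ref;
}
```

Each receiver calls freenote() when it has consumed the note, so the memory is released only after the last holder is done.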
2022-08-17 | kernel: allocate notes in heap | cinap_lenrek
de-bloat the proc structure by allocating notes on the heap instead of embedding them in the proc structure. This saves around 640 bytes per process.
2022-08-10 | 9/port: allow kiloprocs -- allocate procs lazily | Ori Bernstein
Preallocate the small data structures around procs eagerly, but use malloc to allocate the large proc data structures when we need them, which allows us to scale to many more procs. There are still many scalability bottlenecks, so we only raise the nproc limit a little this time around, and will raise it further as we optimize more.
2021-10-03 | kernel: ensure that all accesses to Mhead.mount are done with Mhead.lock acquired | cinap_lenrek
The Mhead structures have two sources of references to them:
- from the Pgrp.mnthash hash table
- from a channel's Chan.umh pointer as returned by namec() for a union directory

Unless one holds the Mhead.lock RWLock, the Mhead.mount chain can be mutated by either cmount(), cunmount() or closepgrp().

Readers that skipped acquiring the lock were:
- mountfix(): responsible for rewriting directory entries for union directory reads; was walking the Mhead.mount chain to detect if the passed channel itself appears in the mount list.
- cmount(): had a check and copy when the "new" chan was a union itself and the MCREATE flag was set, and would copy the mount table. All this needs to be done with Mhead read-locked while copying the mount entries.
- devproc(): in the handler for reading the /proc/n/ns file.
- namec(): while checking if Chan->umh should be initialized.

In addition to this, cmount() is changed to do the mountfree() of the original mount chain when MREPL is done after releasing the locks. Also, some cosmetic changes...
2020-12-23 | devproc: allow anyone to change user of its own processes to "none" | cinap_lenrek
2020-03-08 | devproc: return process id when reading /proc/n/ctl file | cinap_lenrek
allow reading the control file of a process and return its pid number. if the process has exited, return an error. this can be useful as a way to test if a process is still alive, and also makes it behave similarly to network protocol directories. another side effect is that processes which erroneously open the ctl file ORDWR are allowed to do so as long as they have write permission and the process is not a kernel process.
2020-03-07 | devproc: don't allow /proc/$pid/ctl to be opened for reading | cinap_lenrek
2020-03-05 | devproc: fix syscalltrace read for ratrace | cinap_lenrek
2020-02-28 | devproc: make sure writewatchpt() doesn't overflow the watchpoint array | cinap_lenrek
the user buffer could be changed while we parse it, resulting in a different number of watchpoints than initially calculated. so add a check to the parse loop so we won't overflow the watchpoint array.
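The kind of check described above can be sketched as follows. This is an illustration only, not the code of writewatchpt(): the Watch type, MAXWATCH and parsewatch() are hypothetical, and the sketch parses a plain string rather than a user buffer. The point it shows is that the capacity test sits inside the parse loop, so the loop stays safe even if an earlier count of entries no longer matches the data.

```c
#include <stddef.h>
#include <string.h>

enum { MAXWATCH = 4 };	/* hypothetical capacity of the watchpoint array */

typedef struct Watch Watch;
struct Watch {
	char text[32];	/* stand-in for the parsed watchpoint fields */
};

/*
 * Parse newline-separated entries from buf into w[0..nw-1].
 * Returns the number of entries stored, or -1 if buf holds
 * more entries than the array can take.
 */
int
parsewatch(const char *buf, Watch *w, int nw)
{
	const char *e;
	size_t len, copy;
	int n;

	n = 0;
	while(*buf != '\0'){
		e = strchr(buf, '\n');
		len = e != NULL ? (size_t)(e - buf) : strlen(buf);
		if(len > 0){
			if(n >= nw)
				return -1;	/* checked on every iteration */
			copy = len < sizeof w[n].text ? len : sizeof w[n].text - 1;
			memcpy(w[n].text, buf, copy);
			w[n].text[copy] = '\0';
			n++;
		}
		buf += len;
		if(*buf == '\n')
			buf++;
	}
	return n;
}
```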
2020-02-28 | devproc: cleanup procwrite size checks | cinap_lenrek
writes to /proc/n/notepg and /proc/n/note should be able to write up to ERRMAX-1 bytes, not ERRMAX-2. simplify write to /proc/n/args by just copying to a local buf first and then doing a kstrdup(). the value of Proc.nargs does not matter when Proc.setargs is 1.
2020-02-23 | kernel: fix multiple devproc bugs and pid reuse issues | cinap_lenrek
devproc assumes that when we hold the Proc.debug qlock, the process will be prevented from exiting. but there is another race where the process has already exited and the Proc* slot gets reused. to solve this, on process creation we also have to acquire the debug qlock while initializing the fields of the process. this also means newproc() should only initialize fields *not* protected by the debug qlock.

always acquire the Proc.debug qlock when changing strings in the proc structure to avoid a double free on concurrent update. for changing the user string, we add a procsetuser() function that does this for auth.c and devcap.

remove pgrpnote() from pgrp.c and replace it by a static postnotepg() in devproc.

avoid the assumption that the Proc* entries returned by proctab() are contiguous.

fixed devproc permission issues:
- make sure only eve can access /proc/trace
- none should only be allowed to read its own /proc/n/text
- move Proc.kp checks into procopen()

pid reuse was not handled correctly, as we were only checking if a pid had a living process, but there still could be processes expecting a particular parentpid or noteid. this is now addressed with reference counted Pid structures which are organized in a hash table. read access to the hash table does not require locks, which will be useful for dtracy later.
2019-09-21 | devproc: fix fishy locking in proctext(), check proc validity, static functions | cinap_lenrek
the locking in proctext() is wrong. we have to acquire Proc.seglock when reading segments from Proc.seg[] as segments do not have a private freelist and can therefore be reused for other data structures. once we have Proc.seglock acquired, check that the process pid is still valid so we won't accidentally read some other process's segments (for both proctext() and procctlmemio()). this should also give a better error message to distinguish the case when the process did segdetach() the segment in question before we could acquire Proc.seglock. declare private functions as static.
2019-09-19 | devproc: move proctab() call after Qnotepg special case in procwrite() | cinap_lenrek
2019-09-19 | kernel: simplify pgrpnote(); moving the note string copying to procwrite() | cinap_lenrek
keeps handling of devproc's note and notepg files similar and in the same place and reduces stack usage.
2019-09-08 | devproc: restore psstate info string in procstopwait() | cinap_lenrek
2018-12-05 | kernel: fix tprof on multiprocessor | cinap_lenrek
segclock() has to be called from hzclock(), otherwise only processes running on cpu0 would catch the interrupt and the time delta would be wrong. lock the segment when allocating Seg->profile as profile ctl might be issued from multiple processes. Proc->debug qlock is not sufficient. Seg->profile can never be freed or reallocated once set as the timer interrupt accesses it without any locking.
2018-06-03 | kernel: stop the practice of passing DMDIR to devdir() perm argument | cinap_lenrek
devdir internally replicates the qid in the perm stat field already, and the practice of explicitly passing it just causes confusion when done inconsistently.
2018-02-25 | ns, devproc: quote path and spec arguments for /proc/$pid/ns, namespace(6) does support quoting | cinap_lenrek
2017-11-04 | kernel: introduce per process FPU struct (PFPU) for more flexible machine specific fpu handling | cinap_lenrek
introducing the PFPU structure which allows the machine specific code some flexibility on how to handle the FPU process state. for example, in the pc and pc64 kernel, the FPsave structure is around 512 bytes. with avx512, it could grow up to 2K. instead of embedding that into the Proc structure, it is more effective to allocate it on first use of the fpu, as most processes do not use simd or floating point in the first place. also, the FPsave structure has a special 16 byte alignment constraint, which further favours dynamic allocation. this gets rid of the memmoves in pc/pc64 kernels for the alignment. there is also devproc, which now checks if the fpsave area is actually valid before reading it, so debuggers do not see garbage data. the Notsave structure is gone now, as it was not used on any machine.
2017-06-20 | kernel: add support for sticky segments (cached, preallocated, never paged) | cinap_lenrek
2017-06-12 | pc/pc64: debugexc: ignore exception if in kernel mode and can't get hold of up->debug | aiju
2017-06-12 | kernel: add support for hardware watchpoints | aiju
2017-05-21 | kernel: avoid panic with segio and SG_FAULT segments | cinap_lenrek
the problem is that segio doesn't check segment attributes, and it can't really in the case of SG_FAULT, which can be inherited from pseg and toggled at any time. so instead of returning -1 from fault into the fault$cputype handler, which then panics when the fault happened in kernel mode, we jump into segio's waserror() block just like in the demand load i/o error case (faulterror()).
2017-05-06 | devproc: can't wait for ourselves to stop (thanks Shamar) | cinap_lenrek
2016-09-07 | kernel: use tk2ms() instead of TK2MS macro for process time conversion | cinap_lenrek
this code isn't time critical and the process TReal delta can become very long, so use tk2ms() which is less prone to overflow.
2016-09-06 | devproc: do unsigned subtraction to get MACHP(0)->ticks - up->times[TReal] delta | cinap_lenrek
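A small illustration of why an unsigned subtraction is the usual way to compute a tick delta, assuming a 32-bit tick counter that wraps around. The function name tickdelta() and the fixed-width type are assumptions for this sketch; the kernel uses its own types.

```c
#include <stdint.h>

/*
 * Elapsed ticks between start and now.
 * Unsigned arithmetic is modulo 2^32, so the result is correct
 * even if the counter wrapped between the two samples, as long
 * as the true distance fits in 32 bits.
 */
uint32_t
tickdelta(uint32_t now, uint32_t start)
{
	return now - start;
}
```

For example, with start two ticks before the wrap and now five ticks after it, the function returns 7.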
2016-08-27 | kernel: add secalloc() and secfree() functions for secret memory allocation | cinap_lenrek
The kernel needs to keep cryptographic keys and cipher states confidential. secalloc() allocates memory from the secret pool, which is protected from debuggers reading the memory through devproc. secfree() releases the memory, overwriting the data with garbage.
2015-12-16 | devproc: remove unused extern int unfair | cinap_lenrek
2015-07-15 | devproc: make sure statbufread offset won't turn negative | cinap_lenrek
2015-04-16 | kernel: add segio() function for reading/writing segments | cinap_lenrek
devproc's procctlmemio() did not handle physical segment types correctly, as it assumed it can just kmap() the page in question and write to it. physical segments however need to be mapped uncached, but kmap() will always map cached as it assumes normal memory. on some machines, aliasing memory with different cache attributes leads to undefined behaviour! we borrow the code from devsegment and provide a generic segio() function to read and write user segments, which handles all the cases without using kmap by just spawning a kproc that attaches the segment that needs to be read from or written to. fault() will set up the right mmu attributes for us. it will also properly flush pages for segments that maintain instruction cache when written. however, tlbs have to be flushed separately. segio() is used for devsegment and devproc now, which also allows for simplification of fixfault(): there is no special error handling case anymore, as fixfault() is now called from the faulting process *only*. reads from /proc/$pid/mem can now span multiple pages.
2015-04-12 | kernel: fixed segment support (for fpga experiments) | cinap_lenrek
fixed segments are contiguous in physical memory but allocated in user pages. unlike shared segments, they are not allocated on demand; the pages are allocated at creation time (devsegment). fixed segments are never swapped out, segfreed or resized and can only be destroyed as a whole. the physical base address can be discovered by userspace reading the ctl file in devsegment.
2015-02-07 | kernel: reduce Page structure size by changing Page.cachectl[] | cinap_lenrek
there are no kernels currently that do page coloring, so the only use of cachectl[] is flushing the icache (on arm and ppc). on pc64, cachectl consumes 32 bytes in each page, resulting in over 200 megabytes of overhead for 32gb of ram with 4K pages. this change removes cachectl[] and adds a txtflush ulong that is set to ~0 by pio() to instruct putmmu() to flush the icache.
2014-12-21 | kernel: avoid inconsistent reads in /proc/#/fd and /proc/#/ns | cinap_lenrek
to allow bytewise access to /proc/#/fd, the contents of the file were recreated on each call. if fds had been closed or reassigned between the reads, the offset would be inconsistent and a read could start off in the middle of a line. this happens when you cat the /proc/#/fd file of a busy process that mutates its file descriptor table. to fix this, we now return one line record at a time. if the line fits in the read size, then this means the next read will always start at the beginning of the next line record. we remember the consumed byte count in Chan.mrock and the current record in Chan.nrock (these fields are free to use for non-directory files). if a read comes in and the offset is the same as c->mrock, we do not need to regenerate the file and just render the next c->nrock's record. for reads smaller than the line size, we have to regenerate the content up to the offset and the race is still possible, but this should not be the common case. the same algorithm is now used for the /proc/#/ns file, allowing a simpler reimplementation and getting rid of the Mntwalk state structure.
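The record-at-a-time scheme described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the devproc code: the Rock structure and readrec() are hypothetical, the records come from a fixed array of strings instead of being rendered from a live file descriptor table, and offsets are assumed to be non-negative.

```c
#include <string.h>

typedef struct Rock Rock;
struct Rock {
	long mrock;	/* offset just past the last complete record returned */
	int nrock;	/* index of the next record to render */
};

/*
 * Return at most one line record per call.
 * rec[0..nrec-1] stands in for the lines the kernel would render.
 * Returns the number of bytes copied into buf, 0 at end of file.
 */
long
readrec(Rock *r, const char **rec, int nrec, char *buf, long n, long off)
{
	long len, skip, o;
	int i;

	if(off == r->mrock){
		/* fast path: continue with the remembered record */
		i = r->nrock;
		skip = 0;
	} else {
		/* slow path: regenerate from the start to find off */
		o = 0;
		for(i = 0; i < nrec; i++){
			len = (long)strlen(rec[i]);
			if(off < o + len)
				break;
			o += len;
		}
		skip = off - o;
	}
	if(i >= nrec)
		return 0;
	len = (long)strlen(rec[i]) - skip;
	if(n < len){
		/* short read: forget the rock so the next read regenerates */
		memcpy(buf, rec[i] + skip, (size_t)n);
		r->mrock = -1;
		return n;
	}
	memcpy(buf, rec[i] + skip, (size_t)len);
	r->mrock = off + len;
	r->nrock = i + 1;
	return len;
}
```

A sequential reader with a buffer large enough for one line always takes the fast path, so each read starts at a record boundary even if the underlying table changes between calls.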
2014-11-07 | devproc: reset p->pdbg under p->debug qlock in procstopwait() | cinap_lenrek
there's a race where procstopwait() is interrupted by a note, setting p->pdbg to nil *before* acquiring the lock, while pexit() and procctl() access it assuming it doesn't change under them while they are holding the lock.
2014-08-17 | kernel: make noswap flag exclude processes from killbig() if not eve, reset noswap flag on exec | cinap_lenrek
2014-07-15 | devproc: nil | cinap_lenrek
2014-07-15 | devproc: fix syscalltrace error handling, consistent use of nil for pointers | cinap_lenrek
2014-07-14 | devproc: fix mistake | cinap_lenrek
2014-07-14 | devproc: fix procctlmemio bugs | cinap_lenrek
don't kill the calling process when demand load fails if fixfault() is called from devproc. this happens when you delete the binary of a running process and try to debug the process accessing uncached pages through the /proc/$pid/mem file.

fixes to procctlmemio():
- fix missed unlock as txt2data() can error
- make sure the segment isn't freed by taking a reference (under p->seglock)
- access the page with segment locked (see comment)
- get rid of the segment stealer lock

other stuff:
- move txt2data() and data2txt() to segment.c
- add procpagecount() function
- change the return type of mcountseg() to ulong
2014-06-22 | kernel: new pagecache, remove Lock from page, use cmpswap for Ref instead of Lock | cinap_lenrek
make the Page structure less than half its original size by getting rid of the Lock and the lru. The Lock was required to coordinate the unchaining of pages that were both cached and on the lru freelist. now pages have a single next pointer that is used for the palloc.head freelist xor for page cache hash chains in Image.pghash[]. cached pages are not on the freelist anymore, but will be reclaimed from images by the pager when the freelist runs out of pages. each Image has its own 512 hash chains for cached page lookup. That is 2MB worth of pages and there should be no collisions for most text images. page reclaiming can be done without holding palloc.lock as the Image is the owner of the page hash chains protected by the Image's lock. reclaiming Image structures can be done quickly by only reclaiming pages from inactive images, that is images which are not currently in use by segments. the Ref structure has no Lock anymore, only a single long that is atomically incremented or decremented using cmpswap(). there are various other changes as a consequence. and lots of pikeshedding, sorry.
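A lock-free reference count built on compare-and-swap, as the last part of the message describes, might look like the sketch below. It uses C11 atomics as a stand-in for the kernel's cmpswap(); the Ref layout and the incref()/decref() bodies here are assumptions for illustration, not the kernel's implementation.

```c
#include <stdatomic.h>

typedef struct Ref Ref;
struct Ref {
	atomic_long ref;	/* the count itself; no Lock needed */
};

/* Atomically add one to the count and return the new value. */
long
incref(Ref *r)
{
	long old;

	old = atomic_load(&r->ref);
	/* on failure the exchange reloads old with the current value, so just retry */
	while(!atomic_compare_exchange_weak(&r->ref, &old, old + 1))
		;
	return old + 1;
}

/* Atomically subtract one from the count and return the new value. */
long
decref(Ref *r)
{
	long old;

	old = atomic_load(&r->ref);
	while(!atomic_compare_exchange_weak(&r->ref, &old, old - 1))
		;
	return old - 1;
}
```

The caller that sees decref() return 0 is the one responsible for freeing the object.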
2014-05-26 | devproc: handle 64bit address writes to /proc/n/mem files | cinap_lenrek
procwrite() truncated the offset to a 32bit ulong. introduce an off2addr() function that does the sign extension hack and use it consistently for Qmem reads and writes.
2014-05-26 | devproc: fix close and closefiles procctl | cinap_lenrek
for the CMclose procctl, the fd number was not bounds checked before indexing into the Fgrp.fd array. for CMclosefiles, we looped fd from 0..maxfd-1, but need to loop from 0..maxfd as maxfd is inclusive.
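Both fixes can be illustrated with a small sketch. The Fgrp here is a simplified, hypothetical stand-in (an array of open flags rather than channel pointers), and closefd() and closefiles() are illustrative names, not the procctl handlers themselves.

```c
enum { NFD = 8 };	/* hypothetical table size */

typedef struct Fgrp Fgrp;
struct Fgrp {
	int open[NFD];	/* nonzero if the slot holds an open file */
	int maxfd;	/* highest slot in use, inclusive */
};

/* Close one descriptor; the fd is bounds checked before indexing. */
int
closefd(Fgrp *f, int fd)
{
	if(fd < 0 || fd > f->maxfd || !f->open[fd])
		return -1;
	f->open[fd] = 0;
	return 0;
}

/* Close every descriptor; maxfd is inclusive, so the loop uses <=. */
int
closefiles(Fgrp *f)
{
	int fd, n;

	n = 0;
	for(fd = 0; fd <= f->maxfd; fd++)
		if(closefd(f, fd) == 0)
			n++;
	return n;	/* number of descriptors actually closed */
}
```

With a loop condition of fd < f->maxfd, the descriptor in slot maxfd would be left open.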
2014-04-01 | devproc: change address format in segment file to %8p (thanks eekee) | cinap_lenrek
the original format for addresses was %8lux, which was changed to %p for amd64. this broke linuxemu, which assumes a fixed format in the segment file. as a compromise we change it to %8p, and the amd64 port of linuxemu will hopefully use a more robust parser :)
2014-02-08 | pc64: handle negative file offsets when accessing kernel memory with devproc | cinap_lenrek
a file offset is a 64 bit signed integer; negative offsets are invalid and rejected by the kernel. to still access kernel memory on amd64, we unconditionally clear the sign bit of the 64 bit offset in libmach, and devproc sign extends the offset back to a 64 bit address.
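The round trip described above can be sketched as follows. This is an illustration under assumptions, not the code in libmach or devproc: addr2off() is a hypothetical name for the caller side, and off2addr() here restores the top bit by copying bit 62 into bit 63, which only recovers the original address when those two bits were equal (as they are for canonical amd64 addresses).

```c
#include <stdint.h>

/* Caller side: clear the sign bit so the offset is never negative. */
int64_t
addr2off(uint64_t addr)
{
	return (int64_t)(addr & ~(UINT64_C(1) << 63));
}

/* Kernel side: sign extend from bit 62 to rebuild the 64 bit address. */
uint64_t
off2addr(int64_t off)
{
	uint64_t a;

	a = (uint64_t)off;
	if(a & (UINT64_C(1) << 62))
		a |= UINT64_C(1) << 63;
	return a;
}
```

User addresses in the lower half pass through unchanged, while kernel addresses in the upper half survive the trip through a non-negative file offset.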
2014-01-20 | kernel: apply uintptr for ulong when a pointer is stored | cinap_lenrek
this change is in preparation for amd64. the systab calling convention was also changed to return uintptr (as segattach returns a pointer), and the arguments are now passed as va_list, which handles amd64 arguments properly (all arguments are passed in 64bit quantities on the stack, though the upper part will not be initialized when the element is smaller than 8 bytes). this is partial. xalloc needs to be converted in the future.
2013-12-31 | devproc: fix noteid permission checks for none | cinap_lenrek
make sure noteid is valid (>0). prohibit changing the note group of kernel processes. this is also checked for in pgrpnote(). prevent the "none" user from changing its note group to that of another "none" session. this would allow it to send notes to none processes other than its own.
2013-12-29 | kernel: don't call pprint() while holding up->debug qlock | cinap_lenrek
pprint() might block or even (maliciously) call into devproc write, which will corrupt the qlock chain on an attempt to qlock up->debug again.
2013-12-07 | devproc: make sure /proc/n/wait waits for the right process's children | cinap_lenrek
there's a race when we wait for a process's children and that process exits before we sleep().
2013-09-22 | devproc: check for p->dot == nil, run closeproc with up->dot = up->slash | cinap_lenrek
p->dot can be nil when a process exits (see pexit()). set closeproc's dot to up->slash so it will show up right in devproc.
2013-08-27 | devproc: properly handle exclusive refcount for /proc/trace | cinap_lenrek