Age | Commit message (Collapse) | Author |
|
this code isnt time critical and process TReal delta can become
very long, so use tk2ms() which is less prone to overflow.
|
|
|
|
The kernel needs to keep cryptographic keys and cipher states
confidential. secalloc() allocates memory from the secret pool
which is protected from debuggers reading the memory thru devproc.
secfree() releases the memory, overriding the data with garbage.
|
|
|
|
|
|
devproc's procctlmemio() did not handle physical segment
types correctly, as it assumed it can just kmap() the page
in question and write to it. physical segments however
need to be mapped uncached but kmap() will always map
cached as it assumes normal memory. on some machines with
aliasing memory with different cache attributes
leads to undefined behaviour!
we borrow the code from devsegment and provide a generic
segio() function to read and write user segments which
handles all the cases without using kmap by just spawning
a kproc that attaches the segment that needs to be read
from or written to. fault() will setup the right mmu
attributes for us. it will also properly flush pages for
segments that maintain instruction cache when written.
however, tlb's have to be flushed separately.
segio() is used for devsegment and devproc now, which
also allows for simplification of fixfault() as there is no
special error handling case anymore as fixfault() is now
called from faulting process *only*.
reads from /proc/$pid/mem can now span multiple pages.
|
|
fixed segments are continuous in physical memory but
allocated in user pages. unlike shared segments, they
are not allocated on demand but the pages are allocated
on creation time (devsegment). fixed segments are
never swapped out, segfreed or resized and can only be
destroyed as a whole.
the physical base address can be discovered by userspace
reading the ctl file in devsegment.
|
|
there are no kernels currently that do page coloring,
so the only use of cachectl[] is flushing the icache
(on arm and ppc).
on pc64, cachectl consumes 32 bytes in each page resulting
in over 200 megabytes of overhead for 32gb of ram with 4K
pages.
this change removes cachectl[] and adds txtflush ulong
that is set to ~0 by pio() to instruct putmmu() to flush
the icache.
|
|
to allow bytewise access to /proc/#/fd, the contents of the file where
recreated on each call. if fd's had been closed or reassigned between
the reads, the offset would be inconsistent and a read could start off
in the middle of a line. this happens when you cat /proc/#/fd file of
a busy process that mutates its filedescriptor table.
to fix this, we now return one line record at a time. if the line
fits in the read size, then this means the next read will always start
at the beginning of the next line record. we remember the consumed
byte count in Chan.mrock and the current record in Chan.nrock. (these
fields are free to usefor non-directory files)
if a read comes in and the offset is the same as c->mrock, we do not
need to regenerate the file and just render the next c->nrock's record.
for reads smaller than the line count, we have to regenerate the content
up to the offset and the race is still possible, but this should not
be the common case.
the same algorithm is now used for /proc/#/ns file, allowing a simpler
reimplementation and getting rid of Mntwalk state strcture.
|
|
theres a race where procstopwait() is interrupted by a note,
setting p->pdbg to nil *before* acquiering the lock and
and pexit() and procctl() accessing it assuming it doesnt
change under them while they are holding the lock.
|
|
noswap flag on exec
|
|
|
|
|
|
|
|
dont kill the calling process when demand load fails if fixfault()
is called from devproc. this happens when you delete the binary
of a running process and try to debug the process accessing uncached
pages thru /proc/$pid/mem file.
fixes to procctlmemio():
- fix missed unlock as txt2data() can error
- make sure the segment isnt freed by taking a reference (under p->seglock)
- access the page with segment locked (see comment)
- get rid of the segment stealer lock
other stuff:
- move txt2data() and data2txt() to segment.c
- add procpagecount() function
- make return type mcounseg() to ulong
|
|
Lock
make the Page stucture less than half its original size by getting rid of
the Lock and the lru.
The Lock was required to coordinate the unchaining of pages that where
both cached and on the lru freelist.
now pages have a single next pointer that is used for palloc.head
freelist xor for page cache hash chains in Image.pghash[].
cached pages are not on the freelist anymore, but will be reclaimed
from images by the pager when the freelist runs out of pages.
each Image has its own 512 hash chains for cached page lookup. That is
2MB worth of pages and there should be no collisions for most text images.
page reclaiming can be done without holding palloc.lock as the Image is
the owner of the page hash chains protected by the Image's lock.
reclaiming Image structures can be done quickly by only reclaiming pages from
inactive images, that is images which are not currently in use by segments.
the Ref structure has no Lock anymore. Only a single long that is atomically
incremented or decremnted using cmpswap().
there are various other changes as a consequence code. and lots of pikeshedding,
sorry.
|
|
procwrite() did truncate the offset to 32bit ulong.
introduce off2addr() function that does the sign
extension hack and use it conststently for Qmem
reads and writes.
|
|
for the CMclose procctl, the fd number was not
bounds checked before indexing in the Fgrp.fd
array.
for the CMclosefiles, we looped fd from 0..maxfd-1,
but need to loop from 0..maxfd as maxfd is inclusive.
|
|
the original format for addresses was %8lux which was changed
to %p for amd64. this broke linuxemu which assumes fixed format
in the segment file. as a compromize we change it to %8p and
amd64 port of linuxemu will hopefully use a more robust parser :)
|
|
file offset is 64 bit signed integer, negative offsets
are invalid and rejected by the kernel. to still access
kernel memory on amd64, we unconditionally clear the sign
bit of the 64 bit offset in libmach and devproc sign
extends the offset back to a 64 bit address.
|
|
this change is in preparation for amd64. the systab calling
convention was also changed to return uintptr (as segattach
returns a pointer) and the arguments are now passed as
va_list which handles amd64 arguments properly (all arguments
are passed in 64bit quantities on the stack, tho the upper
part will not be initialized when the element is smaller
than 8 bytes).
this is partial. xalloc needs to be converted in the future.
|
|
make sure noteid is valid (>0).
prohibit changing note group of kernel processes. this is also
checked for in pgrpnote().
prevent "none" user from changing its note group to another "none"
sessions. this would allow him to send notes other none processes
other than its own.
|
|
pprint() might block or even (maliciously) call into
devproc write which will corrupt the qlock chain on attempt
to qlock up->debug again.
|
|
theres a race when we wait for a process children and that
process exits before we sleep().
|
|
p->dot can be nil when process exits (see pexit())
set closeprocs dot to up->slash so it will show up
right in devproc.
|
|
|
|
the syscallno check in syscallfmt() was wrong. the unsigned
syscall number was cast to an signed integer. so negative
values would pass the check provoking bad memory access from
kernel. the check also has an off by one. one has to check
syscallno >= nsyscalls instead of syscallno > nsyscalls.
access to the p->syscalltrace string was not protected
from modification in devproc. you could awake the process
and cause it to free the string giving an opportunity for
the kernel to access bad memory. or someone could kill the
process (pexit would just free it).
now the string is protected by the usual p->debug qlock. we
also keep the string arround until it is overwritten again
or the process exists. this has the nice side effect that
one can inspect it after the process crashed.
another problem was that our validaddr() would error() instead
of pexiting the current process. the code was changed to only
access up->s.args after it was validated and copied instead of
accessing the user stack directly. this also prevents a sneaky
multithreaded process from chaning the arguments under us.
in case our validaddr() errors, we cannot assume valid user
stack after the waserror() if block. use up->s.arg[0] for the
noted() call to avoid bad access.
|
|
assuming that this check tried to prevent the hostowner
from killing init, it is silly because init would just
handle the note.
with kbdfs, we actually want to send interrupt note to
the initial process group so instead of working arround
this with rfork(RFNOTEG|RFNAMEG), we remove the check.
|
|
procopen.
|
|
|
|
in devproc status read handler the p->status, p->text and p->user
could overflow the local statbuf buffer as they where copied into
it with code like: memmove(statbuf+someoff, p->text, strlen(p->text)).
now using readstr() which will truncate if the string is too long.
make strncpy() usage consistent, make sure results are always null
terminated.
|
|
we have to acquire p->seglock before we lock the individual
segments of the process and lock them. if we dont then pexit()
might free the segments before we can lock them causing the
"qunlock called with qlock not held, from ..." prints.
|
|
always make sure that there are child processes we can wait for
before sleeping.
put pwait() sleep into a loop and recheck. this is not strictly
neccesary but prevents accidents if there are spurious wakeups
or a bug.
|
|
|
|
and proc
|
|
|
|
|
|
|
|
|
|
|