share/doc/papers/kernmalloc/kernmalloc.t

   1 .\" Copyright (c) 1988 The Regents of the University of California.
   2 .\" All rights reserved.
   3 .\"
   4 .\" Redistribution and use in source and binary forms, with or without
   5 .\" modification, are permitted provided that the following conditions
   6 .\" are met:
   7 .\" 1. Redistributions of source code must retain the above copyright
   8 .\"    notice, this list of conditions and the following disclaimer.
   9 .\" 2. Redistributions in binary form must reproduce the above copyright
  10 .\"    notice, this list of conditions and the following disclaimer in the
  11 .\"    documentation and/or other materials provided with the distribution.
  12 .\" 3. All advertising materials mentioning features or use of this software
  13 .\"    must display the following acknowledgement:
  14 .\"     This product includes software developed by the University of
  15 .\"     California, Berkeley and its contributors.
  16 .\" 4. Neither the name of the University nor the names of its contributors
  17 .\"    may be used to endorse or promote products derived from this software
  18 .\"    without specific prior written permission.
  19 .\"
  20 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  21 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  22 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  23 .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  24 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  25 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  26 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  27 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  28 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  29 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  30 .\" SUCH DAMAGE.
  31 .\"
  32 .\"     @(#)kernmalloc.t        5.1 (Berkeley) 4/16/91
  33 .\" $FreeBSD$
  34 .\"
  35 .\" reference a system routine name
  36 .de RN
  37 \fI\\$1\fP\^(\h'1m/24u')\\$2
  38 ..
  39 .\" reference a header name
  40 .de H
  41 .NH \\$1
  42 \\$2
  43 ..
  44 .\" begin figure
  45 .\" .FI "title"
  46 .nr Fn 0 1
  47 .de FI
  48 .ds Lb Figure \\n+(Fn
  49 .ds Lt \\$1
  50 .KF
  51 .DS B
  52 .nf
  53 ..
  54 .\"
  55 .\" end figure
  56 .de Fe
  57 .DE
  58 .ce
  59 \\*(Lb.  \\*(Lt
  60 .sp
  61 .KE
  62 ..
  63 .EQ
  64 delim $$
  65 .EN
  66 .ds CH "
  67 .pn 295
  68 .sp
  69 .rs
  70 .ps -1
  71 .sp -1
  72 .fi
  73 Reprinted from:
  74 \fIProceedings of the San Francisco USENIX Conference\fP,
  75 pp. 295-303, June 1988.
  76 .ps
  77 .\".sp |\n(HMu
  78 .rm CM
  79 .nr PO 1.25i
  80 .TL
  81 Design of a General Purpose Memory Allocator for the 4.3BSD UNIX\(dg Kernel
  82 .ds LF Summer USENIX '88
  83 .ds CF "%
  84 .ds RF San Francisco, June 20-24
  85 .EH 'Design of a General Purpose Memory ...''McKusick, Karels'
  86 .OH 'McKusick, Karels''Design of a General Purpose Memory ...'
  87 .FS
  88 \(dgUNIX is a registered trademark of AT&T in the US and other countries.
  89 .FE
  90 .AU
  91 Marshall Kirk McKusick
  92 .AU
  93 Michael J. Karels
  94 .AI
  95 Computer Systems Research Group
  96 Computer Science Division
  97 Department of Electrical Engineering and Computer Science
  98 University of California, Berkeley
  99 Berkeley, California  94720
 100 .AB
 101 The 4.3BSD UNIX kernel uses many memory allocation mechanisms,
 102 each designed for the particular needs of the utilizing subsystem.
 103 This paper describes a general purpose dynamic memory allocator
 104 that can be used by all of the kernel subsystems.
 105 The design of this allocator takes advantage of known memory usage
 106 patterns in the UNIX kernel and a hybrid strategy that is time-efficient
 107 for small allocations and space-efficient for large allocations.
 108 This allocator replaces the multiple memory allocation interfaces
 109 with a single easy-to-program interface,
 110 results in more efficient use of global memory by eliminating
 111 partitioned and specialized memory pools,
 112 and is quick enough that no performance loss is observed
 113 relative to the current implementations.
 114 The paper concludes with a discussion of our experience in using
 115 the new memory allocator,
 116 and directions for future work.
 117 .AE
 118 .LP
 119 .H 1 "Kernel Memory Allocation in 4.3BSD
 120 .PP
 121 The 4.3BSD kernel has at least ten different memory allocators.
 122 Some of them handle large blocks,
 123 some of them handle small chained data structures,
 124 and others include information to describe I/O operations.
 125 Often the allocations are for small pieces of memory that are only
 126 needed for the duration of a single system call.
 127 In a user process such short-term
 128 memory would be allocated on the run-time stack.
 129 Because the kernel has a limited run-time stack,
 130 it is not feasible to allocate even moderate blocks of memory on it.
 131 Consequently, such memory must be allocated through a more dynamic mechanism.
 132 For example,
 133 when the system must translate a pathname,
 134 it must allocate a one kilobye buffer to hold the name.
 135 Other blocks of memory must be more persistent than a single system call
 136 and really have to be allocated from dynamic memory.
 137 Examples include protocol control blocks that remain throughout
 138 the duration of the network connection.
 139 .PP
 140 Demands for dynamic memory allocation in the kernel have increased
 141 as more services have been added.
 142 Each time a new type of memory allocation has been required,
 143 a specialized memory allocation scheme has been written to handle it.
 144 Often the new memory allocation scheme has been built on top
 145 of an older allocator.
 146 For example, the block device subsystem provides a crude form of
 147 memory allocation through the allocation of empty buffers [Thompson78].
 148 The allocation is slow because of the implied semantics of
 149 finding the oldest buffer, pushing its contents to disk if they are dirty,
 150 and moving physical memory into or out of the buffer to create
 151 the requested size.
 152 To reduce the overhead, a ``new'' memory allocator was built in 4.3BSD
 153 for name translation that allocates a pool of empty buffers.
 154 It keeps them on a free list so they can
 155 be quickly allocated and freed [McKusick85].
 156 .PP
 157 This memory allocation method has several drawbacks.
 158 First, the new allocator can only handle a limited range of sizes.
 159 Second, it depletes the buffer pool, as it steals memory intended
 160 to buffer disk blocks to other purposes.
 161 Finally, it creates yet another interface of
 162 which the programmer must be aware.
 163 .PP
 164 A generalized memory allocator is needed to reduce the complexity
 165 of writing code inside the kernel.
 166 Rather than providing many semi-specialized ways of allocating memory,
 167 the kernel should provide a single general purpose allocator.
 168 With only a single interface,
 169 programmers do not need to figure
 170 out the most appropriate way to allocate memory.
 171 If a good general purpose allocator is available,
 172 it helps avoid the syndrome of creating yet another special
 173 purpose allocator.
 174 .PP
 175 To ease the task of understanding how to use it,
 176 the memory allocator should have an interface similar to the interface
 177 of the well-known memory allocator provided for
 178 applications programmers through the C library routines
 179 .RN malloc
 180 and
 181 .RN free .
 182 Like the C library interface,
 183 the allocation routine should take a parameter specifying the
 184 size of memory that is needed.
 185 The range of sizes for memory requests should not be constrained.
 186 The free routine should take a pointer to the storage being freed,
 187 and should not require additional information such as the size
 188 of the piece of memory being freed.
 189 .H 1 "Criteria for a Kernel Memory Allocator
 190 .PP
 191 The design specification for a kernel memory allocator is similar to,
 192 but not identical to,
 193 the design criteria for a user level memory allocator.
 194 The first criterion for a memory allocator is that it make good use
 195 of the physical memory.
 196 Good use of memory is measured by the amount of memory needed to hold
 197 a set of allocations at any point in time.
 198 Percentage utilization is expressed as:
 199 .ie t \{\
 200 .EQ
 201 utilization~=~requested over required
 202 .EN
 203 .\}
 204 .el \{\
 205 .sp
 206 .ce
 207 \fIutilization\fP=\fIrequested\fP/\fIrequired\fP
 208 .sp
 209 .\}
 210 Here, ``requested'' is the sum of the memory that has been requested
 211 and not yet freed.
 212 ``Required'' is the amount of memory that has been
 213 allocated for the pool from which the requests are filled.
 214 An allocator requires more memory than requested because of fragmentation
 215 and a need to have a ready supply of free memory for future requests.
 216 A perfect memory allocator would have a utilization of 100%.
 217 In practice,
 218 having a 50% utilization is considered good [Korn85].
 219 .PP
 220 Good memory utilization in the kernel is more important than
 221 in user processes.
 222 Because user processes run in virtual memory,
 223 unused parts of their address space can be paged out.
 224 Thus pages in the process address space
 225 that are part of the ``required'' pool that are not
 226 being ``requested'' need not tie up physical memory.
 227 Because the kernel is not paged,
 228 all pages in the ``required'' pool are held by the kernel and
 229 cannot be used for other purposes.
 230 To keep the kernel utilization percentage as high as possible,
 231 it is desirable to release unused memory in the ``required'' pool
 232 rather than to hold it as is typically done with user processes.
 233 Because the kernel can directly manipulate its own page maps,
 234 releasing unused memory is fast;
 235 a user process must do a system call to release memory.
 236 .PP
 237 The most important criterion for a memory allocator is that it be fast.
 238 Because memory allocation is done frequently,
 239 a slow memory allocator will degrade the system performance.
 240 Speed of allocation is more critical when executing in the
 241 kernel than in user code,
 242 because the kernel must allocate many data structure that user
 243 processes can allocate cheaply on their run-time stack.
 244 In addition, the kernel represents the platform on which all user
 245 processes run,
 246 and if it is slow, it will degrade the performance of every process
 247 that is running.
 248 .PP
 249 Another problem with a slow memory allocator is that programmers
 250 of frequently-used kernel interfaces will feel that they
 251 cannot afford to use it as their primary memory allocator.
 252 Instead they will build their own memory allocator on top of the
 253 original by maintaining their own pool of memory blocks.
 254 Multiple allocators reduce the efficiency with which memory is used.
 255 The kernel ends up with many different free lists of memory
 256 instead of a single free list from which all allocation can be drawn.
 257 For example,
 258 consider the case of two subsystems that need memory.
 259 If they have their own free lists,
 260 the amount of memory tied up in the two lists will be the
 261 sum of the greatest amount of memory that each of
 262 the two subsystems has ever used.
 263 If they share a free list,
 264 the amount of memory tied up in the free list may be as low as the
 265 greatest amount of memory that either subsystem used.
 266 As the number of subsystems grows,
 267 the savings from having a single free list grow.
 268 .H 1 "Existing User-level Implementations
 269 .PP
 270 There are many different algorithms and
 271 implementations of user-level memory allocators.
 272 A survey of those available on UNIX systems appeared in [Korn85].
 273 Nearly all of the memory allocators tested made good use of memory,
 274 though most of them were too slow for use in the kernel.
 275 The fastest memory allocator in the survey by nearly a factor of two
 276 was the memory allocator provided on 4.2BSD originally
 277 written by Chris Kingsley at California Institute of Technology.
 278 Unfortunately,
 279 the 4.2BSD memory allocator also wasted twice as much memory
 280 as its nearest competitor in the survey.
 281 .PP
 282 The 4.2BSD user-level memory allocator works by maintaining a set of lists
 283 that are ordered by increasing powers of two.
 284 Each list contains a set of memory blocks of its corresponding size.
 285 To fulfill a memory request,
 286 the size of the request is rounded up to the next power of two.
 287 A piece of memory is then removed from the list corresponding
 288 to the specified power of two and returned to the requester.
 289 Thus, a request for a block of memory of size 53 returns
 290 a block from the 64-sized list.
 291 A typical memory allocation requires a roundup calculation
 292 followed by a linked list removal.
 293 Only if the list is empty is a real memory allocation done.
 294 The free operation is also fast;
 295 the block of memory is put back onto the list from which it came.
 296 The correct list is identified by a size indicator stored
 297 immediately preceding the memory block.
 298 .H 1 "Considerations Unique to a Kernel Allocator
 299 .PP
 300 There are several special conditions that arise when writing a
 301 memory allocator for the kernel that do not apply to a user process
 302 memory allocator.
 303 First, the maximum memory allocation can be determined at
 304 the time that the machine is booted.
 305 This number is never more than the amount of physical memory on the machine,
 306 and is typically much less since a machine with all its
 307 memory dedicated to the operating system is uninteresting to use.
 308 Thus, the kernel can statically allocate a set of data structures
 309 to manage its dynamically allocated memory.
 310 These data structures never need to be
 311 expanded to accommodate memory requests;
 312 yet, if properly designed, they need not be large.
 313 For a user process, the maximum amount of memory that may be allocated
 314 is a function of the maximum size of its virtual memory.
 315 Although it could allocate static data structures to manage
 316 its entire virtual memory,
 317 even if they were efficiently encoded they would potentially be huge.
 318 The other alternative is to allocate data structures as they are needed.
 319 However, that adds extra complications such as new
 320 failure modes if it cannot allocate space for additional
 321 structures and additional mechanisms to link them all together.
 322 .PP
 323 Another special condition of the kernel memory allocator is that it
 324 can control its own address space.
 325 Unlike user processes that can only grow and shrink their heap at one end,
 326 the kernel can keep an arena of kernel addresses and allocate
 327 pieces from that arena which it then populates with physical memory.
 328 The effect is much the same as a user process that has parts of
 329 its address space paged out when they are not in use,
 330 except that the kernel can explicitly control the set of pages
 331 allocated to its address space.
 332 The result is that the ``working set'' of pages in use by the
 333 kernel exactly corresponds to the set of pages that it is really using.
 334 .FI "One day memory usage on a Berkeley time-sharing machine"
 335 .so usage.tbl
 336 .Fe
 337 .PP
 338 A final special condition that applies to the kernel is that
 339 all of the different uses of dynamic memory are known in advance.
 340 Each one of these uses of dynamic memory can be assigned a type.
 341 For each type of dynamic memory that is allocated,
 342 the kernel can provide allocation limits.
 343 One reason given for having separate allocators is that
 344 no single allocator could starve the rest of the kernel of all
 345 its available memory and thus a single runaway
 346 client could not paralyze the system.
 347 By putting limits on each type of memory,
 348 the single general purpose memory allocator can provide the same
 349 protection against memory starvation.\(dg
 350 .FS
 351 \(dgOne might seriously ask the question what good it is if ``only''
 352 one subsystem within the kernel hangs if it is something like the
 353 network on a diskless workstation.
 354 .FE
 355 .PP
 356 \*(Lb shows the memory usage of the kernel over a one day period
 357 on a general timesharing machine at Berkeley.
 358 The ``In Use'', ``Free'', and ``Mem Use'' fields are instantaneous values;
 359 the ``Requests'' field is the number of allocations since system startup;
 360 the ``High Use'' field is the maximum value of
 361 the ``Mem Use'' field since system startup.
 362 The figure demonstrates that most
 363 allocations are for small objects.
 364 Large allocations occur infrequently,
 365 and are typically for long-lived objects
 366 such as buffers to hold the superblock for
 367 a mounted file system.
 368 Thus, a memory allocator only needs to be
 369 fast for small pieces of memory.
 370 .H 1 "Implementation of the Kernel Memory Allocator
 371 .PP
 372 In reviewing the available memory allocators,
 373 none of their strategies could be used without some modification.
 374 The kernel memory allocator that we ended up with is a hybrid
 375 of the fast memory allocator found in the 4.2BSD C library
 376 and a slower but more-memory-efficient first-fit allocator.
 377 .PP
 378 Small allocations are done using the 4.2BSD power-of-two list strategy;
 379 the typical allocation requires only a computation of
 380 the list to use and the removal of an element if it is available,
 381 so it is quite fast.
 382 Macros are provided to avoid the cost of a subroutine call.
 383 Only if the request cannot be fulfilled from a list is a call
 384 made to the allocator itself.
 385 To ensure that the allocator is always called for large requests,
 386 the lists corresponding to large allocations are always empty.
 387 Appendix A shows the data structures and implementation of the macros.
 388 .PP
 389 Similarly, freeing a block of memory can be done with a macro.
 390 The macro computes the list on which to place the request
 391 and puts it there.
 392 The free routine is called only if the block of memory is
 393 considered to be a large allocation.
 394 Including the cost of blocking out interrupts,
 395 the allocation and freeing macros generate respectively
 396 only nine and sixteen (simple) VAX instructions.
 397 .PP
 398 Because of the inefficiency of power-of-two allocation strategies
 399 for large allocations,
 400 a different strategy is used for allocations larger than two kilobytes.
 401 The selection of two kilobytes is derived from our statistics on
 402 the utilization of memory within the kernel,
 403 that showed that 95 to 98% of allocations are of size one kilobyte or less.
 404 A frequent caller of the memory allocator
 405 (the name translation function)
 406 always requests a one kilobyte block.
 407 Additionally the allocation method for large blocks is based on allocating
 408 pieces of memory in multiples of pages.
 409 Consequently the actual allocation size for requests of size
 410 $2~times~pagesize$ or less are identical.\(dg
 411 .FS
 412 \(dgTo understand why this number is $size 8 {2~times~pagesize}$ one
 413 observes that the power-of-two algorithm yields sizes of 1, 2, 4, 8, \&...
 414 pages while the large block algorithm that allocates in multiples
 415 of pages yields sizes of 1, 2, 3, 4, \&... pages.
 416 Thus for allocations of sizes between one and two pages
 417 both algorithms use two pages;
 418 it is not until allocations of sizes between two and three pages
 419 that a difference emerges where the power-of-two algorithm will use
 420 four pages while the large block algorithm will use three pages.
 421 .FE
 422 In 4.3BSD on the VAX, the (software) page size is one kilobyte,
 423 so two kilobytes is the smallest logical cutoff.
 424 .PP
 425 Large allocations are first rounded up to be a multiple of the page size.
 426 The allocator then uses a first-fit algorithm to find space in the
 427 kernel address arena set aside for dynamic allocations.
 428 Thus a request for a five kilobyte piece of memory will use exactly
 429 five pages of memory rather than eight kilobytes as with
 430 the power-of-two allocation strategy.
 431 When a large piece of memory is freed,
 432 the memory pages are returned to the free memory pool,
 433 and the address space is returned to the kernel address arena
 434 where it is coalesced with adjacent free pieces.
 435 .PP
 436 Another technique to improve both the efficiency of memory utilization
 437 and the speed of allocation
 438 is to cluster same-sized small allocations on a page.
 439 When a list for a power-of-two allocation is empty,
 440 a new page is allocated and divided into pieces of the needed size.
 441 This strategy speeds future allocations as several pieces of memory
 442 become available as a result of the call into the allocator.
 443 .PP
 444 .FI "Calculation of allocation size"
 445 .so alloc.fig
 446 .Fe
 447 Because the size is not specified when a block of memory is freed,
 448 the allocator must keep track of the sizes of the pieces it has handed out.
 449 The 4.2BSD user-level allocator stores the size of each block
 450 in a header just before the allocation.
 451 However, this strategy doubles the memory requirement for allocations that
 452 require a power-of-two-sized block.
 453 Therefore,
 454 instead of storing the size of each piece of memory with the piece itself,
 455 the size information is associated with the memory page.
 456 \*(Lb shows how the kernel determines
 457 the size of a piece of memory that is being freed,
 458 by calculating the page in which it resides,
 459 and looking up the size associated with that page.
 460 Eliminating the cost of the overhead per piece improved utilization
 461 far more than expected.
 462 The reason is that many allocations in the kernel are for blocks of
 463 memory whose size is exactly a power of two.
 464 These requests would be nearly doubled if the user-level strategy were used.
 465 Now they can be accommodated with no wasted memory.
 466 .PP
 467 The allocator can be called both from the top half of the kernel,
 468 which is willing to wait for memory to become available,
 469 and from the interrupt routines in the bottom half of the kernel
 470 that cannot wait for memory to become available.
 471 Clients indicate their willingness (and ability) to wait with a flag
 472 to the allocation routine.
 473 For clients that are willing to wait,
 474 the allocator guarrentees that their request will succeed.
 475 Thus, these clients can need not check the return value from the allocator.
 476 If memory is unavailable and the client cannot wait,
 477 the allocator returns a null pointer.
 478 These clients must be prepared to cope with this
 479 (hopefully infrequent) condition
 480 (usually by giving up and hoping to do better later).
 481 .H 1 "Results of the Implementation
 482 .PP
 483 The new memory allocator was written about a year ago.
 484 Conversion from the old memory allocators to the new allocator
 485 has been going on ever since.
 486 Many of the special purpose allocators have been eliminated.
 487 This list includes
 488 .RN calloc ,
 489 .RN wmemall ,
 490 and
 491 .RN zmemall .
 492 Many of the special purpose memory allocators built on
 493 top of other allocators have also been eliminated.
 494 For example, the allocator that was built on top of the buffer pool allocator
 495 .RN geteblk
 496 to allocate pathname buffers in
 497 .RN namei
 498 has been eliminated.
 499 Because the typical allocation is so fast,
 500 we have found that none of the special purpose pools are needed.
 501 Indeed, the allocation is about the same as the previous cost of
 502 allocating buffers from the network pool (\fImbuf\fP\^s).
 503 Consequently applications that used to allocate network
 504 buffers for their own uses have been switched over to using
 505 the general purpose allocator without increasing their running time.
 506 .PP
 507 Quantifying the performance of the allocator is difficult because
 508 it is hard to measure the amount of time spent allocating
 509 and freeing memory in the kernel.
 510 The usual approach is to compile a kernel for profiling
 511 and then compare the running time of the routines that
 512 implemented the old abstraction versus those that implement the new one.
 513 The old routines are difficult to quantify because
 514 individual routines were used for more than one purpose.
 515 For example, the
 516 .RN geteblk
 517 routine was used both to allocate one kilobyte memory blocks
 518 and for its intended purpose of providing buffers to the filesystem.
 519 Differentiating these uses is often difficult.
 520 To get a measure of the cost of memory allocation before
 521 putting in our new allocator,
 522 we summed up the running time of all the routines whose
 523 exclusive task was memory allocation.
 524 To this total we added the fraction
 525 of the running time of the multi-purpose routines that could
 526 clearly be identified as memory allocation usage.
 527 This number showed that approximately three percent of
 528 the time spent in the kernel could be accounted to memory allocation.
 529 .PP
 530 The new allocator is difficult to measure
 531 because the usual case of the memory allocator is implemented as a macro.
 532 Thus, its running time is a small fraction of the running time of the
 533 numerous routines in the kernel that use it.
 534 To get a bound on the cost,
 535 we changed the macro always to call the memory allocation routine.
 536 Running in this mode, the memory allocator accounted for six percent
 537 of the time spent in the kernel.
 538 Factoring out the cost of the statistics collection and the
 539 subroutine call overhead for the cases that could
 540 normally be handled by the macro,
 541 we estimate that the allocator would account for
 542 at most four percent of time in the kernel.
 543 These measurements show that the new allocator does not introduce
 544 significant new run-time costs.
 545 .PP
 546 The other major success has been in keeping the size information
 547 on a per-page basis.
 548 This technique allows the most frequently requested sizes to be
 549 allocated without waste.
 550 It also reduces the amount of bookkeeping information associated
 551 with the allocator to four kilobytes of information
 552 per megabyte of memory under management (with a one kilobyte page size).
 553 .H 1 "Future Work
 554 .PP
 555 Our next project is to convert many of the static
 556 kernel tables to be dynamically allocated.
 557 Static tables include the process table, the file table,
 558 and the mount table.
 559 Making these tables dynamic will have two benefits.
 560 First, it will reduce the amount of memory
 561 that must be statically allocated at boot time.
 562 Second, it will eliminate the arbitrary upper limit imposed
 563 by the current static sizing
 564 (although a limit will be retained to constrain runaway clients).
 565 Other researchers have already shown the memory savings
 566 achieved by this conversion [Rodriguez88].
 567 .PP
 568 Under the current implementation,
 569 memory is never moved from one size list to another.
 570 With the 4.2BSD memory allocator this causes problems,
 571 particularly for large allocations where a process may use
 572 a quarter megabyte piece of memory once,
 573 which is then never available for any other size request.
 574 In our hybrid scheme,
 575 memory can be shuffled between large requests so that large blocks
 576 of memory are never stranded as they are with the 4.2BSD allocator.
 577 However, pages allocated to small requests are allocated once
 578 to a particular size and never changed thereafter.
 579 If a burst of requests came in for a particular size,
 580 that size would acquire a large amount of memory
 581 that would then not be available for other future requests.
 582 .PP
 583 In practice, we do not find that the free lists become too large.
 584 However, we have been investigating ways to handle such problems
 585 if they occur in the future.
 586 Our current investigations involve a routine
 587 that can run as part of the idle loop that would sort the elements
 588 on each of the free lists into order of increasing address.
 589 Since any given page has only one size of elements allocated from it,
 590 the effect of the sorting would be to sort the list into distinct pages.
 591 When all the pieces of a page became free,
 592 the page itself could be released back to the free pool so that
 593 it could be allocated to another purpose.
 594 Although there is no guarantee that all the pieces of a page would ever
 595 be freed,
 596 most allocations are short-lived, lasting only for the duration of
 597 an open file descriptor, an open network connection, or a system call.
 598 As new allocations would be made from the page sorted to
 599 the front of the list,
 600 return of elements from pages at the back would eventually
 601 allow pages later in the list to be freed.
 602 .PP
 603 Two of the traditional UNIX
 604 memory allocators remain in the current system.
 605 The terminal subsystem uses \fIclist\fP\^s (character lists).
 606 That part of the system is expected to undergo major revision within
 607 the next year or so, and it will probably be changed to use
 608 \fImbuf\fP\^s as it is merged into the network system.
 609 The other major allocator that remains is
 610 .RN getblk ,
 611 the routine that manages the filesystem buffer pool memory
 612 and associated control information.
 613 Only the filesystem uses
 614 .RN getblk
 615 in the current system;
 616 it manages the constant-sized buffer pool.
 617 We plan to merge the filesystem buffer cache into the virtual memory system's
 618 page cache in the future.
 619 This change will allow the size of the buffer pool to be changed
 620 according to memory load,
 621 but will require a policy for balancing memory needs
 622 with filesystem cache performance.
 623 .H 1 "Acknowledgments
 624 .PP
 625 In the spirit of community support,
 626 we have made various versions of our allocator available to our test sites.
 627 They have been busily burning it in and giving
 628 us feedback on their experiences.
 629 We acknowledge their invaluable input.
 630 The feedback from the Usenix program committee on the initial draft of
 631 our paper suggested numerous important improvements.
 632 .H 1 "References
 633 .LP
 634 .IP Korn85 \w'Rodriguez88\0\0'u
 635 David Korn, Kiem-Phong Vo,
 636 ``In Search of a Better Malloc''
 637 \fIProceedings of the Portland Usenix Conference\fP,
 638 pp 489-506, June 1985.
 639 .IP McKusick85
 640 M. McKusick, M. Karels, S. Leffler,
 641 ``Performance Improvements and Functional Enhancements in 4.3BSD''
 642 \fIProceedings of the Portland Usenix Conference\fP,
 643 pp 519-531, June 1985.
 644 .IP Rodriguez88
 645 Robert Rodriguez, Matt Koehler, Larry Palmer, Ricky Palmer,
 646 ``A Dynamic UNIX Operating System''
 647 \fIProceedings of the San Francisco Usenix Conference\fP,
 648 June 1988.
 649 .IP Thompson78
 650 Ken Thompson,
 651 ``UNIX Implementation''
 652 \fIBell System Technical Journal\fP, volume 57, number 6,
 653 pp 1931-1946, 1978.