share/doc/papers/kernmalloc/kernmalloc.t

   1 .\" Copyright (c) 1988 The Regents of the University of California.
   2 .\" All rights reserved.
   3 .\"
   4 .\" Redistribution and use in source and binary forms, with or without
   5 .\" modification, are permitted provided that the following conditions
   6 .\" are met:
   7 .\" 1. Redistributions of source code must retain the above copyright
   8 .\"    notice, this list of conditions and the following disclaimer.
   9 .\" 2. Redistributions in binary form must reproduce the above copyright
  10 .\"    notice, this list of conditions and the following disclaimer in the
  11 .\"    documentation and/or other materials provided with the distribution.
  12 .\" 3. Neither the name of the University nor the names of its contributors
  13 .\"    may be used to endorse or promote products derived from this software
  14 .\"    without specific prior written permission.
  15 .\"
  16 .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  17 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  18 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  19 .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  20 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  21 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  22 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  23 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  24 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  25 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  26 .\" SUCH DAMAGE.
  27 .\"
  28 .\"     @(#)kernmalloc.t        5.1 (Berkeley) 4/16/91
  29 .\" $FreeBSD$
  30 .\"
  31 .\" reference a system routine name
  32 .de RN
  33 \fI\\$1\fP\^(\h'1m/24u')\\$2
  34 ..
  35 .\" reference a header name
  36 .de H
  37 .NH \\$1
  38 \\$2
  39 ..
  40 .\" begin figure
  41 .\" .FI "title"
  42 .nr Fn 0 1
  43 .de FI
  44 .ds Lb Figure \\n+(Fn
  45 .ds Lt \\$1
  46 .KF
  47 .DS B
  48 .nf
  49 ..
  50 .\"
  51 .\" end figure
  52 .de Fe
  53 .DE
  54 .ce
  55 \\*(Lb.  \\*(Lt
  56 .sp
  57 .KE
  58 ..
  59 .EQ
  60 delim $$
  61 .EN
  62 .ds CH "
  63 .pn 295
  64 .sp
  65 .rs
  66 .ps -1
  67 .sp -1
  68 .fi
  69 Reprinted from:
  70 \fIProceedings of the San Francisco USENIX Conference\fP,
  71 pp. 295-303, June 1988.
  72 .ps
  73 .\".sp |\n(HMu
  74 .rm CM
  75 .nr PO 1.25i
  76 .TL
  77 Design of a General Purpose Memory Allocator for the 4.3BSD UNIX\(dg Kernel
  78 .ds LF Summer USENIX '88
  79 .ds CF "%
  80 .ds RF San Francisco, June 20-24
  81 .EH 'Design of a General Purpose Memory ...''McKusick, Karels'
  82 .OH 'McKusick, Karels''Design of a General Purpose Memory ...'
  83 .FS
  84 \(dgUNIX is a registered trademark of AT&T in the US and other countries.
  85 .FE
  86 .AU
  87 Marshall Kirk McKusick
  88 .AU
  89 Michael J. Karels
  90 .AI
  91 Computer Systems Research Group
  92 Computer Science Division
  93 Department of Electrical Engineering and Computer Science
  94 University of California, Berkeley
  95 Berkeley, California  94720
  96 .AB
  97 The 4.3BSD UNIX kernel uses many memory allocation mechanisms,
  98 each designed for the particular needs of the utilizing subsystem.
  99 This paper describes a general purpose dynamic memory allocator
 100 that can be used by all of the kernel subsystems.
 101 The design of this allocator takes advantage of known memory usage
 102 patterns in the UNIX kernel and a hybrid strategy that is time-efficient
 103 for small allocations and space-efficient for large allocations.
 104 This allocator replaces the multiple memory allocation interfaces
 105 with a single easy-to-program interface,
 106 results in more efficient use of global memory by eliminating
 107 partitioned and specialized memory pools,
 108 and is quick enough that no performance loss is observed
 109 relative to the current implementations.
 110 The paper concludes with a discussion of our experience in using
 111 the new memory allocator,
 112 and directions for future work.
 113 .AE
 114 .LP
 115 .H 1 "Kernel Memory Allocation in 4.3BSD
 116 .PP
 117 The 4.3BSD kernel has at least ten different memory allocators.
 118 Some of them handle large blocks,
 119 some of them handle small chained data structures,
 120 and others include information to describe I/O operations.
 121 Often the allocations are for small pieces of memory that are only
 122 needed for the duration of a single system call.
 123 In a user process such short-term
 124 memory would be allocated on the run-time stack.
 125 Because the kernel has a limited run-time stack,
 126 it is not feasible to allocate even moderate blocks of memory on it.
 127 Consequently, such memory must be allocated through a more dynamic mechanism.
 128 For example,
 129 when the system must translate a pathname,
 130 it must allocate a one kilobye buffer to hold the name.
 131 Other blocks of memory must be more persistent than a single system call
 132 and really have to be allocated from dynamic memory.
 133 Examples include protocol control blocks that remain throughout
 134 the duration of the network connection.
 135 .PP
 136 Demands for dynamic memory allocation in the kernel have increased
 137 as more services have been added.
 138 Each time a new type of memory allocation has been required,
 139 a specialized memory allocation scheme has been written to handle it.
 140 Often the new memory allocation scheme has been built on top
 141 of an older allocator.
 142 For example, the block device subsystem provides a crude form of
 143 memory allocation through the allocation of empty buffers [Thompson78].
 144 The allocation is slow because of the implied semantics of
 145 finding the oldest buffer, pushing its contents to disk if they are dirty,
 146 and moving physical memory into or out of the buffer to create
 147 the requested size.
 148 To reduce the overhead, a ``new'' memory allocator was built in 4.3BSD
 149 for name translation that allocates a pool of empty buffers.
 150 It keeps them on a free list so they can
 151 be quickly allocated and freed [McKusick85].
 152 .PP
 153 This memory allocation method has several drawbacks.
 154 First, the new allocator can only handle a limited range of sizes.
 155 Second, it depletes the buffer pool, as it steals memory intended
 156 to buffer disk blocks to other purposes.
 157 Finally, it creates yet another interface of
 158 which the programmer must be aware.
 159 .PP
 160 A generalized memory allocator is needed to reduce the complexity
 161 of writing code inside the kernel.
 162 Rather than providing many semi-specialized ways of allocating memory,
 163 the kernel should provide a single general purpose allocator.
 164 With only a single interface,
 165 programmers do not need to figure
 166 out the most appropriate way to allocate memory.
 167 If a good general purpose allocator is available,
 168 it helps avoid the syndrome of creating yet another special
 169 purpose allocator.
 170 .PP
 171 To ease the task of understanding how to use it,
 172 the memory allocator should have an interface similar to the interface
 173 of the well-known memory allocator provided for
 174 applications programmers through the C library routines
 175 .RN malloc
 176 and
 177 .RN free .
 178 Like the C library interface,
 179 the allocation routine should take a parameter specifying the
 180 size of memory that is needed.
 181 The range of sizes for memory requests should not be constrained.
 182 The free routine should take a pointer to the storage being freed,
 183 and should not require additional information such as the size
 184 of the piece of memory being freed.
 185 .H 1 "Criteria for a Kernel Memory Allocator
 186 .PP
 187 The design specification for a kernel memory allocator is similar to,
 188 but not identical to,
 189 the design criteria for a user level memory allocator.
 190 The first criterion for a memory allocator is that it make good use
 191 of the physical memory.
 192 Good use of memory is measured by the amount of memory needed to hold
 193 a set of allocations at any point in time.
 194 Percentage utilization is expressed as:
 195 .ie t \{\
 196 .EQ
 197 utilization~=~requested over required
 198 .EN
 199 .\}
 200 .el \{\
 201 .sp
 202 .ce
 203 \fIutilization\fP=\fIrequested\fP/\fIrequired\fP
 204 .sp
 205 .\}
 206 Here, ``requested'' is the sum of the memory that has been requested
 207 and not yet freed.
 208 ``Required'' is the amount of memory that has been
 209 allocated for the pool from which the requests are filled.
 210 An allocator requires more memory than requested because of fragmentation
 211 and a need to have a ready supply of free memory for future requests.
 212 A perfect memory allocator would have a utilization of 100%.
 213 In practice,
 214 having a 50% utilization is considered good [Korn85].
 215 .PP
 216 Good memory utilization in the kernel is more important than
 217 in user processes.
 218 Because user processes run in virtual memory,
 219 unused parts of their address space can be paged out.
 220 Thus pages in the process address space
 221 that are part of the ``required'' pool that are not
 222 being ``requested'' need not tie up physical memory.
 223 Because the kernel is not paged,
 224 all pages in the ``required'' pool are held by the kernel and
 225 cannot be used for other purposes.
 226 To keep the kernel utilization percentage as high as possible,
 227 it is desirable to release unused memory in the ``required'' pool
 228 rather than to hold it as is typically done with user processes.
 229 Because the kernel can directly manipulate its own page maps,
 230 releasing unused memory is fast;
 231 a user process must do a system call to release memory.
 232 .PP
 233 The most important criterion for a memory allocator is that it be fast.
 234 Because memory allocation is done frequently,
 235 a slow memory allocator will degrade the system performance.
 236 Speed of allocation is more critical when executing in the
 237 kernel than in user code,
 238 because the kernel must allocate many data structure that user
 239 processes can allocate cheaply on their run-time stack.
 240 In addition, the kernel represents the platform on which all user
 241 processes run,
 242 and if it is slow, it will degrade the performance of every process
 243 that is running.
 244 .PP
 245 Another problem with a slow memory allocator is that programmers
 246 of frequently-used kernel interfaces will feel that they
 247 cannot afford to use it as their primary memory allocator.
 248 Instead they will build their own memory allocator on top of the
 249 original by maintaining their own pool of memory blocks.
 250 Multiple allocators reduce the efficiency with which memory is used.
 251 The kernel ends up with many different free lists of memory
 252 instead of a single free list from which all allocation can be drawn.
 253 For example,
 254 consider the case of two subsystems that need memory.
 255 If they have their own free lists,
 256 the amount of memory tied up in the two lists will be the
 257 sum of the greatest amount of memory that each of
 258 the two subsystems has ever used.
 259 If they share a free list,
 260 the amount of memory tied up in the free list may be as low as the
 261 greatest amount of memory that either subsystem used.
 262 As the number of subsystems grows,
 263 the savings from having a single free list grow.
 264 .H 1 "Existing User-level Implementations
 265 .PP
 266 There are many different algorithms and
 267 implementations of user-level memory allocators.
 268 A survey of those available on UNIX systems appeared in [Korn85].
 269 Nearly all of the memory allocators tested made good use of memory,
 270 though most of them were too slow for use in the kernel.
 271 The fastest memory allocator in the survey by nearly a factor of two
 272 was the memory allocator provided on 4.2BSD originally
 273 written by Chris Kingsley at California Institute of Technology.
 274 Unfortunately,
 275 the 4.2BSD memory allocator also wasted twice as much memory
 276 as its nearest competitor in the survey.
 277 .PP
 278 The 4.2BSD user-level memory allocator works by maintaining a set of lists
 279 that are ordered by increasing powers of two.
 280 Each list contains a set of memory blocks of its corresponding size.
 281 To fulfill a memory request,
 282 the size of the request is rounded up to the next power of two.
 283 A piece of memory is then removed from the list corresponding
 284 to the specified power of two and returned to the requester.
 285 Thus, a request for a block of memory of size 53 returns
 286 a block from the 64-sized list.
 287 A typical memory allocation requires a roundup calculation
 288 followed by a linked list removal.
 289 Only if the list is empty is a real memory allocation done.
 290 The free operation is also fast;
 291 the block of memory is put back onto the list from which it came.
 292 The correct list is identified by a size indicator stored
 293 immediately preceding the memory block.
 294 .H 1 "Considerations Unique to a Kernel Allocator
 295 .PP
 296 There are several special conditions that arise when writing a
 297 memory allocator for the kernel that do not apply to a user process
 298 memory allocator.
 299 First, the maximum memory allocation can be determined at
 300 the time that the machine is booted.
 301 This number is never more than the amount of physical memory on the machine,
 302 and is typically much less since a machine with all its
 303 memory dedicated to the operating system is uninteresting to use.
 304 Thus, the kernel can statically allocate a set of data structures
 305 to manage its dynamically allocated memory.
 306 These data structures never need to be
 307 expanded to accommodate memory requests;
 308 yet, if properly designed, they need not be large.
 309 For a user process, the maximum amount of memory that may be allocated
 310 is a function of the maximum size of its virtual memory.
 311 Although it could allocate static data structures to manage
 312 its entire virtual memory,
 313 even if they were efficiently encoded they would potentially be huge.
 314 The other alternative is to allocate data structures as they are needed.
 315 However, that adds extra complications such as new
 316 failure modes if it cannot allocate space for additional
 317 structures and additional mechanisms to link them all together.
 318 .PP
 319 Another special condition of the kernel memory allocator is that it
 320 can control its own address space.
 321 Unlike user processes that can only grow and shrink their heap at one end,
 322 the kernel can keep an arena of kernel addresses and allocate
 323 pieces from that arena which it then populates with physical memory.
 324 The effect is much the same as a user process that has parts of
 325 its address space paged out when they are not in use,
 326 except that the kernel can explicitly control the set of pages
 327 allocated to its address space.
 328 The result is that the ``working set'' of pages in use by the
 329 kernel exactly corresponds to the set of pages that it is really using.
 330 .FI "One day memory usage on a Berkeley time-sharing machine"
 331 .so usage.tbl
 332 .Fe
 333 .PP
 334 A final special condition that applies to the kernel is that
 335 all of the different uses of dynamic memory are known in advance.
 336 Each one of these uses of dynamic memory can be assigned a type.
 337 For each type of dynamic memory that is allocated,
 338 the kernel can provide allocation limits.
 339 One reason given for having separate allocators is that
 340 no single allocator could starve the rest of the kernel of all
 341 its available memory and thus a single runaway
 342 client could not paralyze the system.
 343 By putting limits on each type of memory,
 344 the single general purpose memory allocator can provide the same
 345 protection against memory starvation.\(dg
 346 .FS
 347 \(dgOne might seriously ask the question what good it is if ``only''
 348 one subsystem within the kernel hangs if it is something like the
 349 network on a diskless workstation.
 350 .FE
 351 .PP
 352 \*(Lb shows the memory usage of the kernel over a one day period
 353 on a general timesharing machine at Berkeley.
 354 The ``In Use'', ``Free'', and ``Mem Use'' fields are instantaneous values;
 355 the ``Requests'' field is the number of allocations since system startup;
 356 the ``High Use'' field is the maximum value of
 357 the ``Mem Use'' field since system startup.
 358 The figure demonstrates that most
 359 allocations are for small objects.
 360 Large allocations occur infrequently,
 361 and are typically for long-lived objects
 362 such as buffers to hold the superblock for
 363 a mounted file system.
 364 Thus, a memory allocator only needs to be
 365 fast for small pieces of memory.
 366 .H 1 "Implementation of the Kernel Memory Allocator
 367 .PP
 368 In reviewing the available memory allocators,
 369 none of their strategies could be used without some modification.
 370 The kernel memory allocator that we ended up with is a hybrid
 371 of the fast memory allocator found in the 4.2BSD C library
 372 and a slower but more-memory-efficient first-fit allocator.
 373 .PP
 374 Small allocations are done using the 4.2BSD power-of-two list strategy;
 375 the typical allocation requires only a computation of
 376 the list to use and the removal of an element if it is available,
 377 so it is quite fast.
 378 Macros are provided to avoid the cost of a subroutine call.
 379 Only if the request cannot be fulfilled from a list is a call
 380 made to the allocator itself.
 381 To ensure that the allocator is always called for large requests,
 382 the lists corresponding to large allocations are always empty.
 383 Appendix A shows the data structures and implementation of the macros.
 384 .PP
 385 Similarly, freeing a block of memory can be done with a macro.
 386 The macro computes the list on which to place the request
 387 and puts it there.
 388 The free routine is called only if the block of memory is
 389 considered to be a large allocation.
 390 Including the cost of blocking out interrupts,
 391 the allocation and freeing macros generate respectively
 392 only nine and sixteen (simple) VAX instructions.
 393 .PP
 394 Because of the inefficiency of power-of-two allocation strategies
 395 for large allocations,
 396 a different strategy is used for allocations larger than two kilobytes.
 397 The selection of two kilobytes is derived from our statistics on
 398 the utilization of memory within the kernel,
 399 that showed that 95 to 98% of allocations are of size one kilobyte or less.
 400 A frequent caller of the memory allocator
 401 (the name translation function)
 402 always requests a one kilobyte block.
 403 Additionally the allocation method for large blocks is based on allocating
 404 pieces of memory in multiples of pages.
 405 Consequently the actual allocation size for requests of size
 406 $2~times~pagesize$ or less are identical.\(dg
 407 .FS
 408 \(dgTo understand why this number is $size 8 {2~times~pagesize}$ one
 409 observes that the power-of-two algorithm yields sizes of 1, 2, 4, 8, \&...
 410 pages while the large block algorithm that allocates in multiples
 411 of pages yields sizes of 1, 2, 3, 4, \&... pages.
 412 Thus for allocations of sizes between one and two pages
 413 both algorithms use two pages;
 414 it is not until allocations of sizes between two and three pages
 415 that a difference emerges where the power-of-two algorithm will use
 416 four pages while the large block algorithm will use three pages.
 417 .FE
 418 In 4.3BSD on the VAX, the (software) page size is one kilobyte,
 419 so two kilobytes is the smallest logical cutoff.
 420 .PP
 421 Large allocations are first rounded up to be a multiple of the page size.
 422 The allocator then uses a first-fit algorithm to find space in the
 423 kernel address arena set aside for dynamic allocations.
 424 Thus a request for a five kilobyte piece of memory will use exactly
 425 five pages of memory rather than eight kilobytes as with
 426 the power-of-two allocation strategy.
 427 When a large piece of memory is freed,
 428 the memory pages are returned to the free memory pool,
 429 and the address space is returned to the kernel address arena
 430 where it is coalesced with adjacent free pieces.
 431 .PP
 432 Another technique to improve both the efficiency of memory utilization
 433 and the speed of allocation
 434 is to cluster same-sized small allocations on a page.
 435 When a list for a power-of-two allocation is empty,
 436 a new page is allocated and divided into pieces of the needed size.
 437 This strategy speeds future allocations as several pieces of memory
 438 become available as a result of the call into the allocator.
 439 .PP
 440 .FI "Calculation of allocation size"
 441 .so alloc.fig
 442 .Fe
 443 Because the size is not specified when a block of memory is freed,
 444 the allocator must keep track of the sizes of the pieces it has handed out.
 445 The 4.2BSD user-level allocator stores the size of each block
 446 in a header just before the allocation.
 447 However, this strategy doubles the memory requirement for allocations that
 448 require a power-of-two-sized block.
 449 Therefore,
 450 instead of storing the size of each piece of memory with the piece itself,
 451 the size information is associated with the memory page.
 452 \*(Lb shows how the kernel determines
 453 the size of a piece of memory that is being freed,
 454 by calculating the page in which it resides,
 455 and looking up the size associated with that page.
 456 Eliminating the cost of the overhead per piece improved utilization
 457 far more than expected.
 458 The reason is that many allocations in the kernel are for blocks of
 459 memory whose size is exactly a power of two.
 460 These requests would be nearly doubled if the user-level strategy were used.
 461 Now they can be accommodated with no wasted memory.
 462 .PP
 463 The allocator can be called both from the top half of the kernel,
 464 which is willing to wait for memory to become available,
 465 and from the interrupt routines in the bottom half of the kernel
 466 that cannot wait for memory to become available.
 467 Clients indicate their willingness (and ability) to wait with a flag
 468 to the allocation routine.
 469 For clients that are willing to wait,
 470 the allocator guarrentees that their request will succeed.
 471 Thus, these clients can need not check the return value from the allocator.
 472 If memory is unavailable and the client cannot wait,
 473 the allocator returns a null pointer.
 474 These clients must be prepared to cope with this
 475 (hopefully infrequent) condition
 476 (usually by giving up and hoping to do better later).
 477 .H 1 "Results of the Implementation
 478 .PP
 479 The new memory allocator was written about a year ago.
 480 Conversion from the old memory allocators to the new allocator
 481 has been going on ever since.
 482 Many of the special purpose allocators have been eliminated.
 483 This list includes
 484 .RN calloc ,
 485 .RN wmemall ,
 486 and
 487 .RN zmemall .
 488 Many of the special purpose memory allocators built on
 489 top of other allocators have also been eliminated.
 490 For example, the allocator that was built on top of the buffer pool allocator
 491 .RN geteblk
 492 to allocate pathname buffers in
 493 .RN namei
 494 has been eliminated.
 495 Because the typical allocation is so fast,
 496 we have found that none of the special purpose pools are needed.
 497 Indeed, the allocation is about the same as the previous cost of
 498 allocating buffers from the network pool (\fImbuf\fP\^s).
 499 Consequently applications that used to allocate network
 500 buffers for their own uses have been switched over to using
 501 the general purpose allocator without increasing their running time.
 502 .PP
 503 Quantifying the performance of the allocator is difficult because
 504 it is hard to measure the amount of time spent allocating
 505 and freeing memory in the kernel.
 506 The usual approach is to compile a kernel for profiling
 507 and then compare the running time of the routines that
 508 implemented the old abstraction versus those that implement the new one.
 509 The old routines are difficult to quantify because
 510 individual routines were used for more than one purpose.
 511 For example, the
 512 .RN geteblk
 513 routine was used both to allocate one kilobyte memory blocks
 514 and for its intended purpose of providing buffers to the filesystem.
 515 Differentiating these uses is often difficult.
 516 To get a measure of the cost of memory allocation before
 517 putting in our new allocator,
 518 we summed up the running time of all the routines whose
 519 exclusive task was memory allocation.
 520 To this total we added the fraction
 521 of the running time of the multi-purpose routines that could
 522 clearly be identified as memory allocation usage.
 523 This number showed that approximately three percent of
 524 the time spent in the kernel could be accounted to memory allocation.
 525 .PP
 526 The new allocator is difficult to measure
 527 because the usual case of the memory allocator is implemented as a macro.
 528 Thus, its running time is a small fraction of the running time of the
 529 numerous routines in the kernel that use it.
 530 To get a bound on the cost,
 531 we changed the macro always to call the memory allocation routine.
 532 Running in this mode, the memory allocator accounted for six percent
 533 of the time spent in the kernel.
 534 Factoring out the cost of the statistics collection and the
 535 subroutine call overhead for the cases that could
 536 normally be handled by the macro,
 537 we estimate that the allocator would account for
 538 at most four percent of time in the kernel.
 539 These measurements show that the new allocator does not introduce
 540 significant new run-time costs.
 541 .PP
 542 The other major success has been in keeping the size information
 543 on a per-page basis.
 544 This technique allows the most frequently requested sizes to be
 545 allocated without waste.
 546 It also reduces the amount of bookkeeping information associated
 547 with the allocator to four kilobytes of information
 548 per megabyte of memory under management (with a one kilobyte page size).
 549 .H 1 "Future Work
 550 .PP
 551 Our next project is to convert many of the static
 552 kernel tables to be dynamically allocated.
 553 Static tables include the process table, the file table,
 554 and the mount table.
 555 Making these tables dynamic will have two benefits.
 556 First, it will reduce the amount of memory
 557 that must be statically allocated at boot time.
 558 Second, it will eliminate the arbitrary upper limit imposed
 559 by the current static sizing
 560 (although a limit will be retained to constrain runaway clients).
 561 Other researchers have already shown the memory savings
 562 achieved by this conversion [Rodriguez88].
 563 .PP
 564 Under the current implementation,
 565 memory is never moved from one size list to another.
 566 With the 4.2BSD memory allocator this causes problems,
 567 particularly for large allocations where a process may use
 568 a quarter megabyte piece of memory once,
 569 which is then never available for any other size request.
 570 In our hybrid scheme,
 571 memory can be shuffled between large requests so that large blocks
 572 of memory are never stranded as they are with the 4.2BSD allocator.
 573 However, pages allocated to small requests are allocated once
 574 to a particular size and never changed thereafter.
 575 If a burst of requests came in for a particular size,
 576 that size would acquire a large amount of memory
 577 that would then not be available for other future requests.
 578 .PP
 579 In practice, we do not find that the free lists become too large.
 580 However, we have been investigating ways to handle such problems
 581 if they occur in the future.
 582 Our current investigations involve a routine
 583 that can run as part of the idle loop that would sort the elements
 584 on each of the free lists into order of increasing address.
 585 Since any given page has only one size of elements allocated from it,
 586 the effect of the sorting would be to sort the list into distinct pages.
 587 When all the pieces of a page became free,
 588 the page itself could be released back to the free pool so that
 589 it could be allocated to another purpose.
 590 Although there is no guarantee that all the pieces of a page would ever
 591 be freed,
 592 most allocations are short-lived, lasting only for the duration of
 593 an open file descriptor, an open network connection, or a system call.
 594 As new allocations would be made from the page sorted to
 595 the front of the list,
 596 return of elements from pages at the back would eventually
 597 allow pages later in the list to be freed.
 598 .PP
 599 Two of the traditional UNIX
 600 memory allocators remain in the current system.
 601 The terminal subsystem uses \fIclist\fP\^s (character lists).
 602 That part of the system is expected to undergo major revision within
 603 the next year or so, and it will probably be changed to use
 604 \fImbuf\fP\^s as it is merged into the network system.
 605 The other major allocator that remains is
 606 .RN getblk ,
 607 the routine that manages the filesystem buffer pool memory
 608 and associated control information.
 609 Only the filesystem uses
 610 .RN getblk
 611 in the current system;
 612 it manages the constant-sized buffer pool.
 613 We plan to merge the filesystem buffer cache into the virtual memory system's
 614 page cache in the future.
 615 This change will allow the size of the buffer pool to be changed
 616 according to memory load,
 617 but will require a policy for balancing memory needs
 618 with filesystem cache performance.
 619 .H 1 "Acknowledgments
 620 .PP
 621 In the spirit of community support,
 622 we have made various versions of our allocator available to our test sites.
 623 They have been busily burning it in and giving
 624 us feedback on their experiences.
 625 We acknowledge their invaluable input.
 626 The feedback from the Usenix program committee on the initial draft of
 627 our paper suggested numerous important improvements.
 628 .H 1 "References
 629 .LP
 630 .IP Korn85 \w'Rodriguez88\0\0'u
 631 David Korn, Kiem-Phong Vo,
 632 ``In Search of a Better Malloc''
 633 \fIProceedings of the Portland Usenix Conference\fP,
 634 pp 489-506, June 1985.
 635 .IP McKusick85
 636 M. McKusick, M. Karels, S. Leffler,
 637 ``Performance Improvements and Functional Enhancements in 4.3BSD''
 638 \fIProceedings of the Portland Usenix Conference\fP,
 639 pp 519-531, June 1985.
 640 .IP Rodriguez88
 641 Robert Rodriguez, Matt Koehler, Larry Palmer, Ricky Palmer,
 642 ``A Dynamic UNIX Operating System''
 643 \fIProceedings of the San Francisco Usenix Conference\fP,
 644 June 1988.
 645 .IP Thompson78
 646 Ken Thompson,
 647 ``UNIX Implementation''
 648 \fIBell System Technical Journal\fP, volume 57, number 6,
 649 pp 1931-1946, 1978.