| Jeremy Hylton : weblog : 2003-12-04 |
Thursday, December 04, 2003
I've spent a lot of time this week working out bugs in the cache simulations. Many of the results reported in earlier blog entries -- both on cache access performance and on 2Q simulation results were wrong. That seems to be par for the course: The cache simulations are sometimes complicated. Understanding simulation results means checking the results for consistency ten different ways. Eventually you understand what the numbers are telling you and how they relate to each other.
The new 2Q implementation is probably correct, but doesn't produce the dramatically good results reported earlier. It's still better than LRU by a non-trivial margin (5% difference in hit rate) for large caches.
The ARC algorithm has been very difficult to implementation for a cache with variable-sized objects. Much of the specific logic from their pseudocode is different when you are calculating with object sizes instead of pages. For example, if you add a new object, the object you choose to evict may be smaller, so you have to evict several objects.
The other interesting problem with the ARC cache is that it seems to adjust its p factor too slowly. In the paper, they say that 1 results in changes that are fast enough, but they only have tiny caches in the simulation. For a very large cache (like a 500MB ZEO cache), it seems to adapt far too slowly. The paper boasts that there are no tuning parameters in ARC, but I needed to tune the adjustment rate to get decent results.
We're still stuck trying to find a decent implementation of an LRU list when the objects in the list are stored on disk and not in memory. The Iyengar paper doesn't cover replacements at all, even though it seems to say that a cache is typically in an equilibrium where it must always do replacement.
For the record, here is apparently correct simulation results for LRU and 2Q on the zope.org cache trace.
LRUCacheSimulation, cache size 50,000,000 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS Nov 25 00:36 2:19:55 51340 15399 47 116 30.0% 28391 Nov 25 02:56 1:55:01 54952 14318 26 442 26.1% 31652 Nov 25 04:51 1:30:01 72530 23387 45 97 32.2% 41593 Nov 25 06:21 1:00:13 46689 15161 16 111 32.5% 25083 Nov 25 07:21 1:45:09 53628 17812 46 143 33.2% 28617 Nov 25 09:06 1:15:02 58238 19730 23 163 33.9% 30879 Nov 25 10:21 2:05:01 126466 36876 527 4751 29.2% 84047 Nov 25 12:26 1:15:03 51919 15501 22 68 29.9% 25989 Nov 25 13:41 2:45:03 69965 17362 107 296 24.8% 45034 Nov 25 16:26 4:19:34 37131 11960 3 38 32.2% 21452 Nov 25 20:46 6:05:06 110716 26365 140 2605 23.8% 78027 Nov 26 02:51 2:04:57 79765 24191 438 1157 30.3% 48735 Nov 26 04:56 40:01 53910 16997 63 187 31.5% 28491 Nov 26 05:36 1:15:11 51677 14870 8 456 28.8% 30886 Nov 26 06:51 1:00:08 51355 13124 16 92 25.6% 30407 Nov 26 07:51 1:04:56 51923 16151 5 68 31.1% 28086 Nov 26 08:56 1:05:13 66551 21848 12 90 32.8% 37771 Nov 26 10:01 50:04 48137 17088 6 80 35.5% 22723 Nov 26 10:51 28:00 40506 15784 3 70 39.0% 16731 Nov 25 00:36 34:43:38 1177398 353924 1553 11030 30.1% 684594 OVERALL TwoQSimluation, cache size 50,000,000 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS HOTHIT Nov 25 00:36 2:19:55 51340 14223 71 116 27.7% 29929 90 Nov 25 02:56 1:55:01 54952 17129 28 442 31.2% 36735 5932 Nov 25 04:51 1:30:01 72530 29424 52 97 40.6% 43540 13274 Nov 25 06:21 1:00:13 46689 19995 17 111 42.8% 27212 7724 Nov 25 07:21 1:45:09 53628 19438 46 143 36.2% 34643 7832 Nov 25 09:06 1:15:02 58238 22877 15 163 39.3% 34734 9308 Nov 25 10:21 2:05:01 126466 45287 432 4751 35.8% 81699 17469 Nov 25 12:26 1:15:03 51919 18809 19 68 36.2% 28539 7551 Nov 25 13:41 2:45:03 69965 14727 107 296 21.0% 59106 11076 Nov 25 16:26 4:19:34 37131 15132 2 38 40.8% 23250 4213 Nov 25 20:46 6:05:06 110716 27964 121 2605 25.3% 82201 16686 Nov 26 02:51 2:04:57 79765 28798 292 1157 36.1% 50332 12103 Nov 26 04:56 40:01 53910 20701 60 187 38.4% 32999 7766 Nov 26 05:36 1:15:11 51677 19818 9 456 38.3% 31505 6998 Nov 26 06:51 1:00:08 51355 18963 11 92 36.9% 30628 8204 Nov 26 07:51 1:04:56 51923 21585 6 68 41.6% 32241 7465 Nov 26 08:56 1:05:13 66551 27667 13 90 41.6% 38881 9298 Nov 26 10:01 50:04 48137 20918 7 80 43.5% 26887 6758 Nov 26 10:51 28:00 40506 19589 4 70 48.4% 20435 5447 Nov 25 00:36 34:43:38 1177398 423044 1312 11030 35.9% 745496 165194 OVERALL
The ARC simulator is barely functional. There are a few corner cases where I don't understand what it's doing, so I doubt it's right. I'm also ignoring invalidations and writes, but it's producing some basic results that make some sense. It looks like it's favoring the LRU (L2) component over the FIFO component (L1) at the expense of performance. When I compare to 2Q, which is using more space for the FIFO component, I see that ARC has more LRU hits but many fewer FIFO hits.
It's difficult to say if these results are representative of ARC or are caused by bugs in my implementation. (Likely the latter given my recent track record and the hour.)
The first chart is with a "walk factor" of 500. If it needs to adjust p, it does so using the basic algorithm multiplied by the walk factor.
ARCCacheSimulation, cache size 50,000,000 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE LRUTHITS EVICTS P Nov 25 00:36 2:19:55 51340 13001 0 116 25.3% 170 32170 2332000 Nov 25 02:56 1:55:01 54952 17790 0 442 32.4% 8280 33913 1145500 Nov 25 04:51 1:30:01 72530 18813 0 97 25.9% 8983 55903 5223500 Nov 25 06:21 1:00:13 46689 15301 0 111 32.8% 3566 31520 5353500 Nov 25 07:21 1:45:09 53628 15866 0 143 29.6% 4885 35772 6715500 Nov 25 09:06 1:15:02 58238 16603 0 163 28.5% 4693 42441 6187500 Nov 25 10:21 2:05:01 126466 38843 0 4751 30.7% 18441 90298 508500 Nov 25 12:26 1:15:03 51919 11498 0 68 22.1% 2424 38796 1910000 Nov 25 13:41 2:45:03 69965 10883 0 296 15.6% 8175 57130 7010000 Nov 25 16:26 4:19:34 37131 9392 0 38 25.3% 2146 23714 8427500 Nov 25 20:46 6:05:06 110716 32635 0 2605 29.5% 31363 83417 4433000 Nov 26 02:51 2:04:57 79765 22837 0 1157 28.6% 8656 58531 5609000 Nov 26 04:56 40:01 53910 14455 0 187 26.8% 3025 38794 6279000 Nov 26 05:36 1:15:11 51677 14862 0 456 28.8% 4006 36383 7866000 Nov 26 06:51 1:00:08 51355 13421 0 92 26.1% 2902 37051 10099500 Nov 26 07:51 1:04:56 51923 15462 0 68 29.8% 2922 38083 9251000 Nov 26 08:56 1:05:13 66551 21951 0 90 33.0% 4321 44048 9926000 Nov 26 10:01 50:04 48137 16766 0 80 34.8% 3382 31469 8320000 Nov 26 10:51 28:00 40506 15277 0 70 37.7% 3165 23202 7099500 Nov 25 00:36 34:43:38 1177398 335656 0 11030 28.5% 125505 832635 7099500 OVERALL
The couple of places where ARC was competitive with 2Q, it was were the LRU hit rate was very high. Everywhere else 2Q appeared to win by devoting more space to the LRU component. (This over simplifies, because 2Q is also more hesitant to promote to the LRU component.) So I tried changing the walk factor to 2000, since ARC is never using much of the cache for FIFO. It showed a small improvement, but not much.
ARCCacheSimulation, cache size 50,000,000 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE LRUTHITS EVICTS P Nov 25 00:36 2:19:55 51340 13160 0 116 25.6% 163 32042 9064000 Nov 25 02:56 1:55:01 54952 16400 0 442 29.8% 5725 35860 10772000 Nov 25 04:51 1:30:01 72530 23346 0 97 32.2% 5752 50476 12750000 Nov 25 06:21 1:00:13 46689 15478 0 111 33.2% 2868 31651 10224000 Nov 25 07:21 1:45:09 53628 16411 0 143 30.6% 4215 35884 16850000 Nov 25 09:06 1:15:02 58238 19977 0 163 34.3% 4066 38113 16094000 Nov 25 10:21 2:05:01 126466 38994 0 4751 30.8% 16923 92417 14000 Nov 25 12:26 1:15:03 51919 12217 0 68 23.5% 3050 37720 5578000 Nov 25 13:41 2:45:03 69965 11217 0 296 16.0% 5612 58317 20374000 Nov 25 16:26 4:19:34 37131 13122 0 38 35.3% 1238 26363 20258000 Nov 25 20:46 6:05:06 110716 24968 0 2605 22.6% 22161 84526 11990000 Nov 26 02:51 2:04:57 79765 22640 0 1157 28.4% 7114 57781 9536000 Nov 26 04:56 40:01 53910 14579 0 187 27.0% 2378 38769 13806000 Nov 26 05:36 1:15:11 51677 16014 0 456 31.0% 3039 35352 18124000 Nov 26 06:51 1:00:08 51355 14413 0 92 28.1% 1852 35752 28536000 Nov 26 07:51 1:04:56 51923 16757 0 68 32.3% 1785 35976 26386000 Nov 26 08:56 1:05:13 66551 23303 0 90 35.0% 2963 40329 25508000 Nov 26 10:01 50:04 48137 17391 0 80 36.1% 1817 32920 20414000 Nov 26 10:51 28:00 40506 16211 0 70 40.0% 2987 24224 17756000 Nov 25 00:36 34:43:38 1177398 346598 0 11030 29.4% 95708 824472 17756000 OVERALL
Boosting the walk factor to 10,000 gets the overall hit rate up to 31.4%, a little better than LRU and still worse than 2Q.