The Grey Labyrinth is a collection of puzzles, riddles, mind games, paradoxes and other intellectually challenging diversions. Related topics: puzzle games, logic puzzles, lateral thinking puzzles, philosophy, mind benders, brain teasers, word problems, conundrums, 3d puzzles, spatial reasoning, intelligence tests, mathematical diversions, paradoxes, physics problems, reasoning, math, science.

   
The Grey Labyrinth Forum Index
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups    RegisterRegister  
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

phpBB question ....
Goto page Previous  1, 2
 
Reply to topic    The Grey Labyrinth Forum Index -> Off-Topic
View previous topic :: View next topic  
Author Message
Zag
Unintentionally offensive old coot



PostPosted: Thu Nov 10, 2011 3:59 pm    Post subject: 41 Reply with quote

extro, true LRU (for those following along without a scorecard, that's Least Recently Used) algorithms (without hardware support) are typically inefficient unless the objects you are storing are relatively big compare to pointers and hash tables. The problem is that either you keep objects quickly findable by reference, in which case hunting down the least recently used one is expensive, or you keep objects sorted by last use, in which case it is expensive to find an object when it is requested.

The solution to that is to keep the object references in a hash table so they can be found quickly AND in a linked list in order of last reference. When you are asked for an object, you can quickly find it (or not) from the hash table, AND you can move its linked list entry quickly to the front. When asked to store a new object, you can quickly find the oldest one (assuming you are keeping head and tail references for you list) to throw it away.

If the objects being cached are only the size of, say, six or eight pointers, then you are spending more than half your memory allocated on the hash table and the links.

I had such a problem, once (gawd, 25 years ago, now). I kept a hash table of the pointers to the objects, where I ignored the last two bits of the pointer when hashing (and zeroed those bits when making an actual pointer from it). The last two bits were used to keep track of the age of the objects in this way:

Filling up the cache initially, I tagged new pointers with 00 until I had hit 1/2 of the size of the cache (in object count -- these objects were all the same size, which was the same size as 8 pointers. Then I tagged with 01 (including rereferenced 00 ones) until 2/3 of the cache was full. (By experimentation, I learned that this was pretty close to making 1/3 of the cache with 00 and 1/3 of the cache with 01, thanks to some 00's being converted to 01). I then tagged new and rereferenced objects with 10 until the cache was full. Throughout this process I kept an accurate count of how many objects were tagged with each number.

Now that the cache was full, when a new object came in, I found the first object tagged with 00 and threw it out, then I tagged the new object with 11. So I was throwing away not the exact least-recently-used, but it was in the bottom third. Once I had gone through the list and thrown out all the objects tagged with 00, I started throwing out 01's and tagging new objects with 00. etc. I called it NVRU -- Not Very Recently Used -- cache.
Back to top
View user's profile Send private message Send e-mail Visit poster's website Yahoo Messenger
extro...*
Guest



PostPosted: Thu Nov 10, 2011 4:42 pm    Post subject: 42 Reply with quote

Zag wrote:
The solution to that is to keep the object references in a hash table so they can be found quickly AND in a linked list in order of last reference. When you are asked for an object, you can quickly find it (or not) from the hash table, AND you can move its linked list entry quickly to the front. When asked to store a new object, you can quickly find the oldest one (assuming you are keeping head and tail references for you list) to throw it away.


This is exactly what I did.
Back to top
Antrax
ESL Student



PostPosted: Thu Nov 10, 2011 4:48 pm    Post subject: 43 Reply with quote

I wonder if the next generation of programmers will consider what I do for a living to be common knowledge (Psuedo-LRU I learned during my degree; extro's question I got as a homework assignment in another course. BTW, you don't need both data structures, you can just combine them)
_________________
After years of disappointment with get rich quick schemes, I know I'm gonna get rich with this scheme. And quick!
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Zag
Unintentionally offensive old coot



PostPosted: Thu Nov 10, 2011 5:15 pm    Post subject: 44 Reply with quote

Sorted hash tables are really just both. The fact that its a single entity to you doesn't mean its significantly smaller.

BTW, that whole description of the NVRU cache was not what I was talking about as the clever caching I did for the fat-client application. That's another topic for when I have some more time.
Back to top
View user's profile Send private message Send e-mail Visit poster's website Yahoo Messenger
extro...*
Guest



PostPosted: Thu Nov 10, 2011 5:49 pm    Post subject: 45 Reply with quote

Antrax wrote:
BTW, you don't need both data structures, you can just combine them


You need to combine them ... The nodes of the linked list need to be in the hash table, so that when you find an element, you can move it to the front of the list in constant time (and also why the list needs to be doubly linked).
Back to top
Zag
Unintentionally offensive old coot



PostPosted: Thu Nov 10, 2011 7:33 pm    Post subject: 46 Reply with quote

Ok. The much requested (well, twice) Interchange caching system.


The smallest addressable element was a zObject. Every user definition, every post in a discussion, every email, every article, every image, etc. was a zObject (basically, everything that represented something that might be displayed on its own). Every zObject was addressable with a unique zOID (Ziff Object IDentifier), which was, I think, 8 bytes long. (To ensure uniqueness, all zOIDs that were created by either clients or publishing tools started with the first byte zero, which meant a temporary zOID and only had meaning on that machine. As objects were uploaded to the server, they were given permanent zOIDs and all internal references were updated.)

Every zObject was split into two parts, the header and the content. In the header were the title, the dates created and last updated locally, the zOIDs for the creator and owner, and two lists of zOIDs for objects that it owns. (That is, objects that are needed in order to display it.) The first list was the zObjects for which it only needs the headers, and the second list was the zObjects for which it needs the entire body. On the server, these lists included the dates of every change to the list. (Basically, every entry also had the date that it was put on the list. Also, there were breadcrumbs left for elements removed from the list.) The header might also have some other things, dependent on the type of the element.

In the content portion of the zObject was the large lump of content, defined by what sort of thing it was, as well as a list of other zOIDs that the zObject references (though doesn't own -- i.e. links). These are kept in a separate list for ease of updating from temporary to permanent zOIDs, and for ease of putting a "subscription" on the object.

The UI was completely Event driven. When a new zObjectViewer was created for a particular zObject, it would register events for when those objects arrived in the local store. It would immediately display any objects that are already in the local store, but leave the event handlers there in case the object was updated. Whatever was available would display immediately, and new stuff would pop in as it arrived.

The system ran very well unconnected from the server, because the user could place "Subscriptions" on objects so that they would always be updated whenever the client hit the server. Subscriptions could be shallow or deep, and were somewhat defined by the type of thing that the subscription was on. For a Discussion board, for instance, a shallow subscription meant just the headers of all the threads, whereas a deep subscription meant all the contents of all the threads. Note that a subscription only limited what content was available to you when you are not connected. Anything (that you had paid for) was available to you while you were connected. It would update subscriptions in the background while you were connected, so you could connect and browse around for a while as your subscriptions are updated.

When not using subscriptions, zObjects were requested in groups. For instance, say you are click on "Grey Labyrinth - Visitor Submitted Puzzles" something you've never opened before. You probably already have had the header of that object to be able to see it and click on it (exception: if it was a link in some rich text, like in an email) so right away it can display the top stuff. Meanwhile, a request went to the server to get full contents of the zOID for that zObject. The server would queue up several messages to you right away: That zObject's contents, a single combined message with the headers of all the zObjects that it owns, then an individual message for the contents of each zObject that it owns and needs the contents of. So you would open VSP and the summary list would arrive using only 2 messages, and it took, typically, less than two seconds. (Remember, 9600 baud modem.) Also, we now already have the headers of all the individual threads.

Now the trickier (and far more common case): You have visited VSP before and are now doing so again. This time, the request for the zObject that represents the VSP page includes the date that it was last updated. The server has a choice to send a new header for the VSP zObject, or only a delta to its lists, and it chooses to send the contents or not depending on whether or not they have changed since the date you said you were up to date. Similarly, in sending the headers of all that zObjects owned objects, it know whether or not they changed or were created since that date and only sends you the ones you need. With a discussion board, typically it can display much of the page right away, and then it displays the rest of the page, typically before you have even scrolled down to see it.

When it goes to display an article with pictures, it knows that the pictures are in the list of owned objects for which it needs contents. So the "Update Me" message for the article includes the date, and it knows that you have the picture up to date as of then. If the picture hasn't changed, then it doesn't need to download it again. Consider on web sites that you visit frequently, and all the pictures that you get once per day (with default browser settings) but which actually never change. Now imagine doing it on a 9600 baud modem. This is why the Internet, in 1993, mostly did not include a lot of pictures. For something like a news article with several large pictures, it takes a long time to download the first time, and if you go back to it tomorrow it takes just as long. It has to do this even though the picture is still in your browser cache, because maybe that picture is something that changes frequently, like, for instance, an election map. In Interchange, unless you explicitly deleted that article from your cache, it was already there and it knew that it was up to date (or not) and only updated it if it wasn't.
Back to top
View user's profile Send private message Send e-mail Visit poster's website Yahoo Messenger
Display posts from previous: by   
Reply to topic    The Grey Labyrinth Forum Index -> Off-Topic All times are GMT
Goto page Previous  1, 2
Page 2 of 2

 
Jump to:  
You cannot post new topics in this forum
You can reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group
Site Design by Wx3