187. Hashing around ~~~~~~~~~~~~~~ There are occasions, in programming, when you need to need to scan down a list and find if there is a match with a particular string. The string might be a name, a radio callsign in my case, or a number of some sort, or a mixed string of numbers letters and punctuation. If your program has to scan down a long list, then this can be very slow indeed. This is because the string might not have any obvious location in the list, so the lot has to be searched. There is a way round this, called 'hashing', whereby the program calculates a definite location in the list for a particular string, which is ALMOST unique. If the location in question is found, as happens occasionally, to already contain a different string, then the next free location is used for the new string, which hardly slows things down at all. The theory behind the technique is too involved to be covered in a short article like this, but here is a ready-to-use Function to try it out. The variable as% at the beginning of line 10 should be set to a size larger than the maximum expected length of the list. 50% larger is fine, but you can get away with as low as 20% with a slight reduction is speed, but smaller margins will cause a drastic reduction in speed as the list fills up. The string array list$ contains the list of strings, and the integer array list% is used to keep track of how many times each string has been searched for. The latter can be useful for statistical purposes, but if you aren't interested, then simply omit all terms involving list%. Lines 20 to 30 form a simple demonstration program, using a list of length 100. (Well you shouldn't fill it up completely, as I implied above, so the real maximum is in the region of 65-80.) The Function FNlist() does two things. First, it checks the list to see if the string passed in brackets is already on the list, and adds it if necessary. It then increments the count by one for that string, and returns this number. So, if a string was not already on the list, then "1" is returned. The second time you search for that string, "2" would be returned, as so on. Thus "1" implies "not already on list", "2" and above implies it was found, the number being the total number of times it was searched for, including the very first time. Obviously, a list as short as 100 doesn't take long to scan from top to bottom anyway, but with much larger lists it's a different story. This Function takes no more time to search in a list 1000 long as it does one 100 long, and a 10-character string takes only about 6/100 of a second to check for on a Master. 10as%=100:DIMlist$(as%),list%(as%) 20REPEAT:INPUT"Type in a string "s$ 30PRINT"Searched ";FNlist(s$);" times" 40UNTILFALSE 50: 100DEFFNlist(n$):LOCALh%,j% 110FORj%=1TOLEN(n$):h%=h%+j%*ASC(MID$(n$,j%,1)):NEXT 120h%=RND(-h%):h%=RND(2):h%=RND(as%)-1 130REPEAT:h%=(h%MODas%)+1:UNTILlist$(h%)=""ORlist$(h%)=n$ 140list$(h%)=n$:list%(h%)=list%(h%)+1 150=list%(h%)