Saturday, February 1, 2014

LOCATE vs SEEK, which is faster?

From time to time someone would inquire which is faster?  LOCATE or SEEK? And the answer mostly is it depends, if the records are small, then the speed difference is negligible enough.  True.  But that does not still answer which really is faster.  When we say faster, it has to be at least a nanosecond quicker than the other one.  We are not arguing "how" fast but "which" is faster.



To understand better which is really faster, we have to first understand how the two differs.  How does LOCATE performs searching?  How does SEEK performs searching?  Let us start with SEEK() first.

HOW DOES SEEK() PERFORM SEARCHING?

In order for SEEK() to properly kick in, you have to have an index on field(s) where you wanted to search for a match.  An index can be structural (part of the table in the form of a cdx extension, auto-opens with the table) or non-structural (stand-alone idx file, does not auto-open with table, can be re-used or discarded later).

Once a field is indexed, when you perform a SEEK(), what it does is since the field subject for searching is now ordered properly (check SET INDEX and SET ORDER commands in VFP help), it moves the record pointer outright to the first record matching the search criteria.  After that though is a little bit hazy (for me) because it may do any of these (I am not absolutely sure which exactly it does):

  1. From that jump-start point, it SKIPs records one by one until it find the first near match (affected by SET EXACT and SET NEAR commands setting or the Exactly Equal (==) operator).  Or;
  2. Maps the possible sets where the search should be performed (starting from the jump-start point) and breaks it into three section, then:
    1. Tries matching on the first part (jump-start point), if match is not found then
    2. Jumps to the 2nd part, if it is beyond the match then jumps back to part 1 and does sequential search
    3. If it is not beyond the search criteria, jumps to 3rd and final part 
    4. If 3rd part is beyond, jumps back to 2nd part and performs sequential search from there
    5. If 3rd is not beyond does a sequential search from that point
Number 2 above is what I can recall from reading a Clipper book ages ago from a respected author (I still read books a lot during those days).

So SEEK() will still perform a sequential search but it jumps straight to the first near match.

Sequential search (also known as linear search), according to Wiki, is a method for finding a particular value in a list, that consists of checking every one of its elements, one at a time and in sequence, until the desired one is foundhttp://en.wikipedia.org/wiki/Linear_search


HOW DOES LOCATE PERFORM SEARCHING?

Because LOCATE is designed to work even on non-indexed  (not sorted) field(s), then in some cases it is really useful especially if you wanted two or more fields for comparisons.  But how does LOCATE do it actually on a non-sorted field(s)?

When you perform any LOCATE command, what VFP (and some old xBase apps) do is to immediately position the record at the top.   This can be easily tested like this:

LOCAL lnloop
CREATE CURSOR junk (whatever I)
FOR lnloop = 1 TO 100
  INSERT INTO junk VALUES (m.lnloop)
NEXT
MESSAGEBOX(RECNO())

* Issue locate
LOCATE
MESSAGEBOX(RECNO())

* jump to another record
goto(30)
MESSAGEBOX(RECNO())

* Issue locate
LOCATE
MESSAGEBOX(RECNO())

* jump to another record
goto(85)
MESSAGEBOX(RECNO())

* Issue locate
LOCATE
MESSAGEBOX(RECNO())


So you see, whenever you issue a LOCATE, it immediately positions itself on top of the record.  Why?  What is the reason?  Because LOCATE is designed to allow user to search for a match by always starting from top then performing a sequential search or skipping records one by one until it hits the match you are looking for or it hits EOF().  So the speed with which LOCATE finding a match depends on the actual record position of the match.  If it is near the top, it will be quick.  If the record subject to match is near EOF(), then it will take longer.  Imagine a table with million of records where the match is positioned at 999,999?


Summary:

Looking on the way these two search engines work, it is very clear that SEEK() is really faster than LOCATE.  As the records grow, LOCATE will comparatively take longer than SEEK() in searching for a match.

Hope I was able to give a clear explanation on these two.  Cheers!





4 comments:

  1. why are you still teaching about a dead language?

    ReplyDelete
  2. Which? SEEK() and LOCATE command or VFP? If it is VFP, then I am sorry, there are still tons of developers using this one. ;)

    ReplyDelete
  3. Absolutely agree with Mr. Jun. VFP is not a dead language Mr. Anonymous ;) BTW Congratulations Mr. Jun as Ceil Silver Ambassador 2012 at Southwestfox Arizona USA, i hope and wish i could attend the Southwestfox soon.

    ReplyDelete
  4. Absolutely agree with Mr. Jun. VFP is not a dead language. BTW, Congratulations Mr. Jun as 2012 Ceil Silver Ambassador in Southwestfox Arizona. I wish i could attend soon the conference.

    ReplyDelete