
Ron Wilson

EDN Executive Editor Ron Wilson explores how IC design teams really work: the struggle for power efficiency and performance, wrestling with semiconductor processes and design methodologies, the challenges of global design teams. How do we somehow herd architecture, IP, design and verification into a successful tape-out?




Wednesday, June 13, 2007

Search engines, software developers and systems engineers

Jun 13 2007 3:46PM


I had the privilege yesterday evening, courtesy of angel-investor group Silicom Ventures, of listening to a panel of very eminent professionals from the search engine world talk about the future of search technology. They had many cogent things to say about where search is going, including these important points. First, search results improve enormously when the range of the search is narrowed to a specific topic area: say, software development, invertebrate biology, or Indian restaurants in San Francisco. Second, search results improve enormously when the engine records the post-search behavior of prior users who have made similar searches. This, of course, is the basis of the famous (or infamous) Amazon “customers who bought this book also bought …” feature. In fact, Steve Larsen, CEO of specialized search start-up Krugle, said that Amazon sometimes has to dial back its so-called collaborative filtering algorithm because its predictions become accurate enough that customers find them uncomfortable.
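For readers who have not met the idea before, the "also bought" style of recommendation can be sketched in a few lines. This is only a toy illustration of co-occurrence counting, not a description of how Amazon or any panelist's company actually does it. The function names and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(sessions):
    """Count how often each ordered pair of items shows up in the same session."""
    co = defaultdict(Counter)
    for items in sessions:
        # set() so a repeated item within one session is counted once
        for a, b in permutations(set(items), 2):
            co[a][b] += 1
    return co


def also_chosen(co, item, k=3):
    """Return up to k items most often seen alongside `item`."""
    return [other for other, _ in co.get(item, Counter()).most_common(k)]


# Hypothetical histories: one list of chosen results per prior user.
sessions = [
    ["sorting", "graphs", "compilers"],
    ["sorting", "graphs"],
    ["sorting", "cooking"],
]

co = build_cooccurrence(sessions)
print(also_chosen(co, "sorting", k=1))  # prints ['graphs']
```

Even this toy version hints at why the real thing is a systems question as much as an algorithmic one: the counting is trivial, but gathering and storing the behavior of millions of users is not.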

Third, the user interface on search engines is hopelessly archaic. Text entry of keywords, for heaven’s sake. A single text list of results, with only priority ranking and no other form of indexing. Only very modest text previews of the result pages. All of that is 1960s stuff, of the same generation as the text-based adventure games we used to play on teletype machines.

A more controversial point was that the whole concept of text-matching, even with page ranking thrown in, is inadequate. Search needs to extract meaning, not keywords, from target pages and from queries, and match the meanings, at least in the view of Powerset, Inc. CEO Barney Pell, who has bet a good deal of money on this assertion.

All of this was quite interesting. But the thing that interested me most was the remarkable bias of all the speakers. Two were venture capitalists, and hence can be excused for a rather myopic view of systems design. But two were computer science PhDs, and this concerns me.

The myopia is this. All the panelists clearly think of search as a software problem, not a systems problem. When they discuss the future of the technology, they talk in terms of algorithms, always implicitly assuming that the algorithms are coded in something C-ish and run on increasingly preposterous giant server farms.

My conclusion, and I hope I’m being vastly unjust here, is that even in very fine institutions, computer science still means programming technology, neither really about computers nor really science. There doesn’t seem to be the mindset, even among fine CS graduates, to address complex problems as systems issues—to assume that the system hardware, topology and software will have to be developed concurrently. The hardware is a given—multicore CPUs from Intel or AMD. The topology is a given—whatever the server vendors make available to lash blades and racks together. The guiding lights of search technology address the problem as if the only independent variable were algorithms. At that rate, they will be searching for a long time.




Reader Comments



at 6/14/2007 2:52:41 PM, Ken Krugler said:
You're right that if you view search as only having an algorithm dial to turn, then you're going to be in for some serious pain & suffering.

In fact, one key thing we've learned about commercial-grade search is that it's more about operations and less about algorithms. Having a reliable crawling system, for example, is part algorithm, part hardware, part system architecture, and a whole lot of ops-related tasks.

I'm talking about stuff like monitoring, twiddling, re-running, updating, pushing, prodding, and the 100 other things you need to do well to handle extracting lots of usable data coming from a bunch of not-very-well behaved web servers.



at 6/15/2007 4:16:45 PM, Ron Wilson said:
Ken:
Very interesting, as they say. Is there anywhere I can go to learn more about the subject?
ron

©1997-2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.