how to search
Lesson 9

Fravia's Nofrill
Web design

24 January 1998


(Based on some original private emailings from +ORC)

Preceding lessons:
lesson_5 about general agora http:// retrieving ~ July 1996
lesson_6 about ftping files, agora queries and emailing altavista ~ December 1996
lesson_7 about the W3gate, search spiders, error messages and evaluation of results ~ March 1997
lesson_8 about advanced searching techniques (combing and klebing) ~ November 1997

Study! If you do not know what an agora server, an FTPsearch, an archie search and an FTPremailer are, you are paying your provider much too much... and remember that in order to get useful results you may also use klebing and combing techniques and (if really necessary) even lure or trap targets!
Go to Different techniques walking on the quicksand
Go to When you need to find something fast
Go to Site changes monitoring
Go to Tierra highlights: win a prize! (A small nice exercise)


'to know an answer is easy, the difficult 
part is to know how to find any answer' (+ORC)

'don't waste your time doing it yourself, 
others have already done it for you' (Surreal)
(Part of the following is based on some private emailings from Surreal: Thanx a lot Surreal :-)

Different techniques walking on the quicksand

You want to set up a site, prepare a paper, or find information in a hurry about a target you want to crack, the butterflies of the Orinoco river, German poets of the XVI century, old Märklin steam locs, or silent actresses of 1920s cinema?

In 90% of the problem-solution situations you are actually desperately trying to do something that somebody else has ALREADY DONE: you are seeking solutions for your problem that have ALREADY BEEN FOUND. Yes, but WHERE on the ever-expanding, quicksand-shifting deep deep web? Each of your projects has its own constraints and caveats, as you'll see.

A crack for a new version of a program could have been made a few days before you downloaded the program. In this case, good updated warez sites are a lot faster (and produce far far better results) than ftp search engines.

The butterflies will throw hundreds of irrelevant sites at you until you learn how to use advanced query features to EXCLUDE all non-university sites.
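The "exclude all non-university sites" idea can also be sketched as a local filter over a list of result URLs. This is only an illustration, not the syntax of any particular search engine: the suffix list, the function names and the example.* URLs below are all assumptions made up for the sketch.

```python
from urllib.parse import urlparse

# Assumed list of academic host suffixes; extend it for other countries.
ACADEMIC_SUFFIXES = (".edu", ".ac.uk", ".edu.au")

def is_academic(url):
    """Return True if the URL's host ends with one of the assumed suffixes."""
    host = (urlparse(url).hostname or "").lower()
    return host.endswith(ACADEMIC_SUFFIXES)

def keep_academic(urls):
    """Drop every result whose host is not on an academic domain."""
    return [u for u in urls if is_academic(u)]

# Hypothetical result list, for illustration only.
results = [
    "http://www.biology.example.edu/orinoco/butterflies.html",
    "http://www.example.com/cheap-butterfly-posters.html",
    "http://zoology.example.ac.uk/lepidoptera/",
]
academic = keep_academic(results)
```

Where the search engine itself offers a domain restriction in its advanced query form, using that is of course faster than filtering afterwards.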

Most sites about German poets are irrelevant or dilettantish until you learn to use strings in GERMAN to find relevant sites.

For Märklin steam locs you of course need more PICTURES than text, and you'll have to learn how to use the PICTURES tags correctly in the advanced search queries.

Silent actress sites (as well as the preceding Märklin locs sites) are better found and accessed using usenet queries than search engines.

The main problem with searching in this way is the "quicksand" aspect of the web: if you ask hotbot how much snow there is NOW at the alpine resorts you'll visit tomorrow, you won't be amused to find tons of links to the snow situation 8 months ago, when hotbot last visited those sites, and then, trying to access them, to discover that more than 50% have disappeared, another good 30% were never updated and the remaining 20% have changed their sector of interest :-(

When you need to find something fast

When you need to find something fast, use infoseek and "search only in these results" a couple of times. Infoseek is the only search engine that keeps getting better and better. Altavista and Hotbot, on the contrary, are either stable or slightly worsening. This has a lot to do with the awful (and totally useless) 'commercialisation' of the web. Infoseek is 'coming back' from too much commercialisation (it started as a 'payment' search engine), while Altavista and Hotbot are 'going towards' commercialisation. And commercialisation, as you know, is VERY BAD, because it has absolutely nothing to do with CONTENT, which is what you are supposed to be seeking in primis.
As usual, history can help to find solutions. Web search engines used (many many years ago) to be no-frill EFFECTIVE search engines. Dunno if you know what a gopher Veronica search is... it's obsolete searching now, yet try it out and compare it with, say, hotbot, and you'll notice that not all differences are for the better. Another aggravating commercial infection is that many search engines are PUSHING their results, de facto faking them.
They filter results and push commercial (paying) sites and/or commercial "friends" into the pole positions... as if all people using them were really so idiotic as to believe that their solution must dwell among the first ten results, and would never care to examine the remaining pages... well, maybe most people using those search engines ARE so stupid, I don't know.

Anyway, when you need to find something fast, use infoseek and "search only in these results" a couple of times. When you don't find anything this way, or when you need "time sensitive" information, use altavista's "advanced query" or hotbot's date filter, and specify a date.
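The two steps above (narrowing inside a previous result set, then cutting by date) can be mimicked on a local list of results. This is a rough sketch of the idea, not how infoseek, altavista or hotbot work internally; the record layout, the function names and the sample entries are all invented for illustration.

```python
from datetime import date

def search_within(results, term):
    """Keep only the results whose text mentions the term (case-insensitive)."""
    term = term.lower()
    return [r for r in results if term in r["text"].lower()]

def modified_since(results, cutoff):
    """Keep only the results last modified on or after the cutoff date."""
    return [r for r in results if r["modified"] >= cutoff]

# Hypothetical result records, for illustration only.
results = [
    {"url": "http://a.example/snow", "text": "Snow report for alpine resorts",
     "modified": date(1998, 1, 20)},
    {"url": "http://b.example/old", "text": "Alpine snow report, last season",
     "modified": date(1997, 5, 2)},
    {"url": "http://c.example/ski", "text": "Ski rental prices",
     "modified": date(1998, 1, 10)},
]

# "Search only in these results" a couple of times...
narrowed = search_within(search_within(results, "snow"), "alpine")
# ...then specify a date for the time sensitive part.
fresh = modified_since(narrowed, date(1998, 1, 1))
```

Each pass only ever shrinks the set, which is why a couple of well chosen terms usually get you there faster than one long query.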

If you still don't find what you need, set up a filter in dejanews, which is always VERY good for clues on where to look (rather than as a place to find the answer itself).

Combining these three approaches will get you to a point where you hardly have to drop or "forget" anything you need.

You can solve 95% (or more) of the problem situations (looking for info) within max 2 hrs, with these 3 search engines/filters. If you can't find it, you can be 90% sure it can't be found on the (html based) web. However, the time sensitive info will still be a huge problem (people don't update keywords to the search engines, so 90% of the time sensitive info you're looking for is noise).

There are various tools that notify you of uploads on the web. These tools may also be useful for stalking purposes (unless the webmaster has really a LOT of free time, you will soon recognize his time patterns and be able to reconstruct his REAL timezone).

Site changes monitoring

I recently came across a tool called Tierra highlights, which monitors pages and notifies you of changes, then displays the page with the changes made in another colour.
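The "display the page with the changes marked" part is easy to approximate. The following is a minimal sketch using Python's standard difflib module; it is not how Tierra highlights works internally (that is unknown here), and the marker string stands in for the other colour. The function names and sample pages are assumptions.

```python
import difflib

def changed_lines(old_text, new_text):
    """Return the lines that are new or altered in new_text, in order."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]

def highlight(old_text, new_text, marker=">> "):
    """Return new_text with every added or altered line prefixed by marker."""
    out = []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        tag, body = line[:2], line[2:]
        if tag == "  ":        # line unchanged between the two versions
            out.append(body)
        elif tag == "+ ":      # line only present in the new version
            out.append(marker + body)
        # "- " (removed) and "? " (hint) lines are skipped
    return "\n".join(out)

# Hypothetical snapshots of the same page, for illustration only.
old_page = "Welcome\nNothing here yet\nContact: me"
new_page = "Welcome\nPage updated on Monday\nContact: me"
marked = highlight(old_page, new_page)
```

A line-based diff like this is crude for real html (a single changed tag marks its whole line), but it is enough to see at a glance what moved.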

I don't have to tell you that some of the page maintainers I targeted have been the ftp uploaders for *ahem* time sensitive programs, so you can see what I'm getting at, I believe.

I ran this proggie for the duration of the evaluation period with the maximum number of 15 sites and set the best pages to update every 15 minutes. I got into the ftp sites nearly 9 times out of 10 (as I got the ftp info within 15 mins after the uploader put it on his page). Beats the hell out of loading a page and not knowing how many hours have gone by since its last update (timezone delays etc).
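The periodic check itself boils down to remembering a fingerprint of each page and comparing it at the next visit. Here is a small sketch of that idea under stated assumptions: the PageWatcher class, the fetch callable and the example URL are all hypothetical, and the actual fetching and the 15 minute timer are left out so that the logic can be tried without a network.

```python
import hashlib

class PageWatcher:
    """Remember a digest per URL and report when the content changes."""

    def __init__(self, fetch):
        self.fetch = fetch      # assumed callable: url -> page text
        self.digests = {}       # url -> digest of the last seen content

    def check(self, url):
        """Return True if the page differs from the previous check."""
        text = self.fetch(url)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        previous = self.digests.get(url)
        self.digests[url] = digest
        # The very first visit has nothing to compare with, so it is not a change.
        return previous is not None and previous != digest

# A fake "web" so the sketch runs offline; swap in a real fetcher as needed.
pages = {"http://example.com/news": "version 1"}
watcher = PageWatcher(lambda url: pages[url])

first = watcher.check("http://example.com/news")    # first sight of the page
pages["http://example.com/news"] = "version 2"
second = watcher.check("http://example.com/news")   # content has changed
third = watcher.check("http://example.com/news")    # nothing new since then
```

Calling check() for each watched URL from a loop that sleeps 15 minutes between rounds would reproduce the schedule described above; storing only digests keeps the memory use tiny even with many pages.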

Other possible uses: satellite channel codes pages, warez/passwords pages etc, but also realaudio pages, sensitive information pages and general stalking purposes.

Tierra highlights: win a prize! (A small nice exercise)

You may find this program at http://www.tierra.com/. Its protection scheme is relatively easy to reverse; I'll leave to my readers the pleasure of working on it right now. Consider it a small +HCU exercise for February: the best and most elegant short solution will once more win a nice "strategical" prize (the previous prize was won by +Frog's Print, as you may recall).

I would like to hear your opinion about this and any other "page updating" util you know of: if you know a better tool, please let me know (and write maybe a small essay on it).
Go ahead, enjoy!

(c) fravia+ 1997, work in progress, all rights reserved nevertheless


(c) Fravia 1995, 1996, 1997, 1998. All rights reserved