init: deps-jar: compile-single: run-single: addLink: before if addLink: after if 0 is NOT empty WHILE INIT getNextURL: crawl_level=0 NEXT URL GIVEN www.mpi-sb.mpg.de ROBOT - SAVE CON OPENED: http://www.mpi-sb.mpg.de/units/ag5/teaching/ss05/is05/links.htm CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www2.links2go.com/topic/Information_Retrieval NEW LINK FOUND: http://www.scils.rutgers.edu/~rieh/cir/ NEW LINK FOUND: http://www.isi.edu/~muslea/RISE/index.html NEW LINK FOUND: http://www.mri.mq.edu.au/~einat/web_ir/ NEW LINK FOUND: http://ai.iit.nrc.ca/bibliographies/ml-applied-to-ir.html NEW LINK FOUND: http://web.soi.city.ac.uk/~andym/ir_groups.html NEW LINK FOUND: http://www.alakhawayn.ma/~Y.Houmame/irlinks.html NEW LINK FOUND: http://www.alakhawayn.ma/~Y.Houmame/ir_groups.html NEW LINK FOUND: http://www-ir.inf.ethz.ch/ir_groups.html NEW LINK FOUND: http://www.cs.jhu.edu/~weiss/ NEW LINK FOUND: http://www.cs.jhu.edu/~weiss/ir.html NEW LINK FOUND: http://www.cs.jhu.edu/~weiss/papers.html NEW LINK FOUND: http://mansci1.uwaterloo.ca/~jjiang/biblio.html NEW LINK FOUND: http://mansci1.uwaterloo.ca/~jjiang/ NEW LINK FOUND: http://www.bright.org.br./pfg/Review/ir/welcome.htm NEW LINK FOUND: http://web.syr.edu/~diekemar/ir.html NEW LINK FOUND: http://www.dcs.gla.ac.uk/idom/ir_resources/ NEW LINK FOUND: http://www.dcs.gla.ac.uk/idom/ir_resources/test_collections/ NEW LINK FOUND: http://www.iam.unibe.ch/~scg/Archive/Software/FreeDB/FreeDB.home.html NEW LINK FOUND: http://www.cs.utk.edu/~berry/sc95/sc95.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/mlir/ NEW LINK FOUND: http://www.kdnuggets.com/index.html NEW LINK FOUND: http://www.kdnuggets.com/tools.html NEW LINK FOUND: http://www.cs.bham.ac.uk/~anp/TheDataMine.html NEW LINK FOUND: http://www.cs.umbc.edu/~ian/research-bib.html NEW LINK FOUND: http://www.asis.org/Publications/JASIS/ NEW LINK FOUND: http://www.wkap.nl/journals/ir NEW LINK FOUND: 
http://searchenginewatch.internet.com/resources/software.html NEW LINK FOUND: http://info.webcrawler.com/mak/projects/robots/robots.html NEW LINK FOUND: http://www-db.stanford.edu/~cho/crawler-paper/ NEW LINK FOUND: http://www.cs.jcu.edu.au/~alison/TONY/lexparse.html NEW LINK FOUND: http://developer.java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/ NEW LINK FOUND: http://open.muscat.com/stemming/ NEW LINK FOUND: http://www.ils.unc.edu/keyes/java/porter/index.html NEW LINK FOUND: ftp://ftp.cs.cornell.edu/pub/smart/ NEW LINK FOUND: ftp://ftp.cs.cornell.edu/pub/smart/cran/ NEW LINK FOUND: ftp://ftp.cs.cornell.edu/pub/smart/med/ NEW LINK FOUND: ftp://ftp.cs.cornell.edu/pub/smart/time/ NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/Onderw/Smart/ NEW LINK FOUND: http://www.ee.umd.edu/~oard/software.html NEW LINK FOUND: http://www.dcs.gla.ac.uk/ftp/pub/Bruin/HTML/SMART.HTM NEW LINK FOUND: http://www.cs.wpi.edu/~ifc/courses/CS525I/ NEW LINK FOUND: http://tux.cs.brown.edu/courses/cs295-3/ NEW LINK FOUND: http://www.sims.berkeley.edu/courses/is202/f99/ NEW LINK FOUND: http://www.sims.berkeley.edu/courses/is296a-4/f99/ NEW LINK FOUND: http://maya.cs.depaul.edu/~mobasher/classes/ds599/ NEW LINK FOUND: http://www.gslis.utexas.edu/~palmquis/inforets99/ NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15850-s99/www/home.html NEW LINK FOUND: http://www.cs.umbc.edu/~nicholas/courses/691d/index.html NEW LINK FOUND: http://www.cs.utk.edu/~cs494/ NEW LINK FOUND: http://www.cs.princeton.edu/courses/archive/spring98/cs598b/ NEW LINK FOUND: http://ciir.cs.umass.edu/cmpsci646/ NEW LINK FOUND: http://hobart.cs.umass.edu/~allan/irclass.html NEW LINK FOUND: http://www.cs.wisc.edu/~shavlik/cs838/cs838.html NEW LINK FOUND: http://ei.cs.vt.edu/~cs5604/ NEW LINK FOUND: http://www.ags.uci.edu/~jblevins/reviews/DISE/DISE_syllabus.html NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/Onderw/Ir/ir.html NEW LINK FOUND: 
http://www.cs.duke.edu/education/courses/cps370/fall98/ NEW LINK FOUND: http://www.acm.org/sigir/ NEW LINK FOUND: http://sdmc.iss.nus.sg/sigir-ap/ NEW LINK FOUND: http://www-nlpir.nist.gov/ NEW LINK FOUND: http://trec.nist.gov/ NEW LINK FOUND: http://irsg.eu.org/ NEW LINK FOUND: http://www.almaden.ibm.com/cs/k53/clever.html NEW LINK FOUND: http://www.canis.uiuc.edu/projects/interspace/ NEW LINK FOUND: http://ils.unc.edu/iris/ NEW LINK FOUND: http://www-ai.ijs.si/DunjaMladenic/yplanet.html NEW LINK FOUND: http://google.stanford.edu/ NEW LINK FOUND: http://cap.anu.edu.au/cap/projects/text_retrieval/ NEW LINK FOUND: http://pastime.anu.edu.au/TAR/ NEW LINK FOUND: http://www.research.digital.com/SRC/personal/Krishna_Bharat/WebArcheology/ NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ NEW LINK FOUND: http://www-uilots.let.uu.nl/~uplift/ NEW LINK FOUND: http://www.public.iastate.edu/~CYBERSTACKS/Aristotle.htm NEW LINK FOUND: http://robotics.Stanford.EDU/users/sahami/SONIA/SONIAproject.html NEW LINK FOUND: http://www.cs.bu.edu/groups/ivc/ImageRover/Home.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/filter_project.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/filter.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/software.html NEW LINK FOUND: http://www.cogsci.kun.nl/~profile/others.html NEW LINK FOUND: http://www.dsv.su.se/~fk/if_Doc/IntFilter.html NEW LINK FOUND: http://ils.unc.edu/gants/filterbib.html NEW LINK FOUND: http://www.cs.kun.nl/is/research/filter/references.html NEW LINK FOUND: ftp://ftp.cs.umd.edu/pub/hcil/Reports-Abstracts-Bibliography/3643html/filter.html NEW LINK FOUND: http://lsi.argreenhouse.com/~remde/lsi/ NEW LINK FOUND: http://lsi.argreenhouse.com/~remde/lsi/LSIpapers.html NEW LINK FOUND: http://www.cs.utk.edu/~lsi/ NEW LINK FOUND: http://lsa.colorado.edu/ NEW LINK FOUND: http://psych.colorado.edu/~rehder/lsa.html NEW LINK FOUND: http://maude.nmsu.edu/essay/index.html NEW LINK FOUND: 
http://kf.oise.utoronto.ca/lsa/ NEW LINK FOUND: http://www.cse.ogi.edu/~mhereim/ NEW LINK FOUND: http://www.site.uottawa.ca/tanka/ts.html NEW LINK FOUND: http://www.cs.columbia.edu/~hjing/summarization.html NEW LINK FOUND: http://www.dcs.shef.ac.uk/~gael/summarization.html NEW LINK FOUND: http://www.dcs.shef.ac.uk/~gael/alphalist.html NEW LINK FOUND: http://www.cs.columbia.edu/~radev/summarization/ NEW LINK FOUND: http://www.isi.edu/natural-language/projects/SUMMARIST.html NEW LINK FOUND: http://www.isi.edu/natural-language/summ-indic-phrases.html NEW LINK FOUND: http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ NEW LINK FOUND: http://extractor.iit.nrc.ca/ NEW LINK FOUND: http://dblp.uni-trier.de/ NEW LINK FOUND: http://lorca.compapp.dcu.ie/SIGIR98-wshop/program.html NEW LINK FOUND: http://www.webnetjrl.com/ NEW LINK FOUND: http://trec.nist.gov/pubs.html NEW LINK FOUND: http://www3.cm.deakin.edu.au/apweb98/program.htm NEW LINK FOUND: http://www.acm.org/pubs/contents/proceedings/ir/ NEW LINK FOUND: http://www.iit.nrcps.ariadne-t.gr/~costass/mulsaic97.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/sss/papers/ NEW LINK FOUND: http://ciir.cs.umass.edu/nir97/ NEW LINK FOUND: http://journals.ecs.soton.ac.uk/~lac/ht97/summary.html NEW LINK FOUND: http://proceedings.www6conf.org/ NEW LINK FOUND: http://atlanta.cs.nchu.edu.tw/www/ToC.html NEW LINK FOUND: http://www5conf.inria.fr/ NEW LINK FOUND: http://www.supercomp.org/sc95/proceedings/ NEW LINK FOUND: http://ai.iit.nrc.ca/DEIL/ NEW LINK FOUND: http://www-cse.ucsd.edu/users/rik/MLIA.html NEW LINK FOUND: http://www.enst.fr/~rungsawa/irrs.html/ NEW LINK FOUND: http://ciir.cs.umass.edu/publications/index.shtml NEW LINK FOUND: http://www.cse.ucsd.edu/groups/guru/publications.html NEW LINK FOUND: http://www.ub2.lu.se/desire/radar/lit-about-search-services.html NEW LINK FOUND: http://web.syr.edu/~mdtaffet/nlp_sites.html NEW LINK FOUND: http://etupc19.wiwi.uni-karlsruhe.de/webmining/bib/ NEW LINK FOUND: 
http://www.cs.ualberta.ca/~tszhu/webmining.htm NEW LINK FOUND: http://xtasy.lib.indiana.edu/jmdocs/irsys.html NEW LINK FOUND: http://www.muc.saic.com/ NEW LINK FOUND: http://www.cs.mu.oz.au/~alistair/ NEW LINK FOUND: http://www.cs.mu.oz.au/~alistair/exploring/ NEW LINK FOUND: http://www-users.cs.umn.edu/~mobasher/webminer/survey/survey.html NEW LINK FOUND: http://www.cs.umbc.edu/~mayfield/ngrams.html NEW LINK FOUND: http://websom.hut.fi/ NEW LINK FOUND: http://www.cs.umbc.edu/abir/ NEW LINK FOUND: http://www.asis.org/midyear-96/girillpaper.html NEW LINK FOUND: http://www-ir.inf.ethz.ch/Public-Web-Pages/mittendorf/mittendorf.html NEW LINK FOUND: http://lcavwww.epfl.ch/LSI/index.html nach ParserDelegator() PARSER RUN 135 crawl_level: 0 MAX_CRAWL_DEPTH: 2 http://www2.links2go.com/topic/Information_Retrieval 0 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.scils.rutgers.edu/~rieh/cir/ 1 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.isi.edu/~muslea/RISE/index.html 2 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.mri.mq.edu.au/~einat/web_ir/ 3 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ai.iit.nrc.ca/bibliographies/ml-applied-to-ir.html 4 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://web.soi.city.ac.uk/~andym/ir_groups.html 5 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.alakhawayn.ma/~Y.Houmame/irlinks.html 6 135 before 
addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.alakhawayn.ma/~Y.Houmame/ir_groups.html 7 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-ir.inf.ethz.ch/ir_groups.html 8 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.jhu.edu/~weiss/ 9 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.jhu.edu/~weiss/ir.html 10 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.jhu.edu/~weiss/papers.html 11 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://mansci1.uwaterloo.ca/~jjiang/biblio.html 12 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://mansci1.uwaterloo.ca/~jjiang/ 13 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.bright.org.br./pfg/Review/ir/welcome.htm 14 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://web.syr.edu/~diekemar/ir.html 15 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.dcs.gla.ac.uk/idom/ir_resources/ 16 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added 
via addLink to crawl_level 1 http://www.dcs.gla.ac.uk/idom/ir_resources/test_collections/ 17 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.iam.unibe.ch/~scg/Archive/Software/FreeDB/FreeDB.home.html 18 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.utk.edu/~berry/sc95/sc95.html 19 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ee.umd.edu/medlab/mlir/ 20 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.kdnuggets.com/index.html 21 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.kdnuggets.com/tools.html 22 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.bham.ac.uk/~anp/TheDataMine.html 23 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.umbc.edu/~ian/research-bib.html 24 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.asis.org/Publications/JASIS/ 25 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.wkap.nl/journals/ir 26 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 
http://searchenginewatch.internet.com/resources/software.html 27 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://info.webcrawler.com/mak/projects/robots/robots.html 28 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-db.stanford.edu/~cho/crawler-paper/ 29 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.jcu.edu.au/~alison/TONY/lexparse.html 30 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://developer.java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/ 31 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://open.muscat.com/stemming/ 32 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ils.unc.edu/keyes/java/porter/index.html 33 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 ftp://ftp.cs.cornell.edu/pub/smart/ 34 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 ftp://ftp.cs.cornell.edu/pub/smart/cran/ 35 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 ftp://ftp.cs.cornell.edu/pub/smart/med/ 36 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 
ftp://ftp.cs.cornell.edu/pub/smart/time/ 37 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://pi0959.kub.nl:2080/Paai/Onderw/Smart/ 38 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ee.umd.edu/~oard/software.html 39 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.dcs.gla.ac.uk/ftp/pub/Bruin/HTML/SMART.HTM 40 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.wpi.edu/~ifc/courses/CS525I/ 41 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://tux.cs.brown.edu/courses/cs295-3/ 42 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.sims.berkeley.edu/courses/is202/f99/ 43 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.sims.berkeley.edu/courses/is296a-4/f99/ 44 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://maya.cs.depaul.edu/~mobasher/classes/ds599/ 45 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.gslis.utexas.edu/~palmquis/inforets99/ 46 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 
http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15850-s99/www/home.html 47 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.umbc.edu/~nicholas/courses/691d/index.html 48 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.utk.edu/~cs494/ 49 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.princeton.edu/courses/archive/spring98/cs598b/ 50 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ciir.cs.umass.edu/cmpsci646/ 51 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://hobart.cs.umass.edu/~allan/irclass.html 52 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.wisc.edu/~shavlik/cs838/cs838.html 53 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ei.cs.vt.edu/~cs5604/ 54 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ags.uci.edu/~jblevins/reviews/DISE/DISE_syllabus.html 55 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://pi0959.kub.nl:2080/Paai/Onderw/Ir/ir.html 56 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 
http://www.cs.duke.edu/education/courses/cps370/fall98/ 57 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.acm.org/sigir/ 58 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://sdmc.iss.nus.sg/sigir-ap/ 59 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-nlpir.nist.gov/ 60 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://trec.nist.gov/ 61 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://irsg.eu.org/ 62 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.almaden.ibm.com/cs/k53/clever.html 63 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.canis.uiuc.edu/projects/interspace/ 64 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ils.unc.edu/iris/ 65 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-ai.ijs.si/DunjaMladenic/yplanet.html 66 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://google.stanford.edu/ 67 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 1 http://cap.anu.edu.au/cap/projects/text_retrieval/ 68 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://pastime.anu.edu.au/TAR/ 69 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.research.digital.com/SRC/personal/Krishna_Bharat/WebArcheology/ 70 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ 71 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-uilots.let.uu.nl/~uplift/ 72 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.public.iastate.edu/~CYBERSTACKS/Aristotle.htm 73 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://robotics.Stanford.EDU/users/sahami/SONIA/SONIAproject.html 74 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.bu.edu/groups/ivc/ImageRover/Home.html 75 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ee.umd.edu/medlab/filter/filter_project.html 76 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ee.umd.edu/medlab/filter/filter.html 77 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 1 http://www.ee.umd.edu/medlab/filter/software.html 78 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cogsci.kun.nl/~profile/others.html 79 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.dsv.su.se/~fk/if_Doc/IntFilter.html 80 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ils.unc.edu/gants/filterbib.html 81 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.kun.nl/is/research/filter/references.html 82 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 ftp://ftp.cs.umd.edu/pub/hcil/Reports-Abstracts-Bibliography/3643html/filter.html 83 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://lsi.argreenhouse.com/~remde/lsi/ 84 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://lsi.argreenhouse.com/~remde/lsi/LSIpapers.html 85 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.utk.edu/~lsi/ 86 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://lsa.colorado.edu/ 87 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 
http://psych.colorado.edu/~rehder/lsa.html 88 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://maude.nmsu.edu/essay/index.html 89 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://kf.oise.utoronto.ca/lsa/ 90 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cse.ogi.edu/~mhereim/ 91 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.site.uottawa.ca/tanka/ts.html 92 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.columbia.edu/~hjing/summarization.html 93 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.dcs.shef.ac.uk/~gael/summarization.html 94 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.dcs.shef.ac.uk/~gael/alphalist.html 95 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.columbia.edu/~radev/summarization/ 96 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.isi.edu/natural-language/projects/SUMMARIST.html 97 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.isi.edu/natural-language/summ-indic-phrases.html 98 135 
before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ 99 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://extractor.iit.nrc.ca/ 100 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://dblp.uni-trier.de/ 101 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://lorca.compapp.dcu.ie/SIGIR98-wshop/program.html 102 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.webnetjrl.com/ 103 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://trec.nist.gov/pubs.html 104 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www3.cm.deakin.edu.au/apweb98/program.htm 105 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.acm.org/pubs/contents/proceedings/ir/ 106 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.iit.nrcps.ariadne-t.gr/~costass/mulsaic97.html 107 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ee.umd.edu/medlab/filter/sss/papers/ 108 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ciir.cs.umass.edu/nir97/ 109 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://journals.ecs.soton.ac.uk/~lac/ht97/summary.html 110 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://proceedings.www6conf.org/ 111 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://atlanta.cs.nchu.edu.tw/www/ToC.html 112 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www5conf.inria.fr/ 113 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.supercomp.org/sc95/proceedings/ 114 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ai.iit.nrc.ca/DEIL/ 115 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-cse.ucsd.edu/users/rik/MLIA.html 116 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.enst.fr/~rungsawa/irrs.html/ 117 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://ciir.cs.umass.edu/publications/index.shtml 118 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 
http://www.cse.ucsd.edu/groups/guru/publications.html 119 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.ub2.lu.se/desire/radar/lit-about-search-services.html 120 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://web.syr.edu/~mdtaffet/nlp_sites.html 121 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://etupc19.wiwi.uni-karlsruhe.de/webmining/bib/ 122 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.ualberta.ca/~tszhu/webmining.htm 123 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://xtasy.lib.indiana.edu/jmdocs/irsys.html 124 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.muc.saic.com/ 125 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.mu.oz.au/~alistair/ 126 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.mu.oz.au/~alistair/exploring/ 127 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-users.cs.umn.edu/~mobasher/webminer/survey/survey.html 128 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 
http://www.cs.umbc.edu/~mayfield/ngrams.html 129 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://websom.hut.fi/ 130 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.cs.umbc.edu/abir/ 131 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www.asis.org/midyear-96/girillpaper.html 132 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://www-ir.inf.ethz.ch/Public-Web-Pages/mittendorf/mittendorf.html 133 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 http://lcavwww.epfl.ch/LSI/index.html 134 135 before addLink(isd.getUrls()[i]) addLink: before if addLink: after if after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 1 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=0 crawl_level++ getNextURL: crawl_level=1 NEXT URL GIVEN www2.links2go.com ROBOT - SAVE CON OPENED: http://www2.links2go.com/topic/Information_Retrieval CONTENT GOT: text/html javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www2.links2go.com/topic/Information_Retrieval vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.scils.rutgers.edu ROBOT - SAVE CON OPENED: http://www.scils.rutgers.edu/~rieh/cir/ CONTENT GOT: text/html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.isi.edu FC1: java.io.FileNotFoundException: http://www.scils.rutgers.edu/~rieh/cir/ ROBOT - SAVE CON OPENED: http://www.isi.edu/~muslea/RISE/index.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK 
FOUND: http://www.isi.edu/info-agents/RISE nach ParserDelegator() PARSER RUN 1 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.isi.edu/info-agents/RISE 0 1 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.mri.mq.edu.au ROBOT - SAVE CON OPENED: http://www.mri.mq.edu.au/~einat/web_ir/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ai.iit.nrc.ca java.lang.NullPointerException - http://www.mri.mq.edu.au/~einat/web_ir/ ROBOT - SAVE CON OPENED: http://ai.iit.nrc.ca/bibliographies/ml-applied-to-ir.html CONTENT GOT: text/html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN web.soi.city.ac.uk FC1: java.io.FileNotFoundException: http://ai.iit.nrc.ca/bibliographies/ml-applied-to-ir.html ROBOT - SAVE CON OPENED: http://web.soi.city.ac.uk/~andym/ir_groups.html CONTENT GOT: text/html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.alakhawayn.ma FC1: java.io.FileNotFoundException: http://web.soi.city.ac.uk/~andym/ir_groups.html ROBOT - SAVE CON OPENED: http://www.alakhawayn.ma/~Y.Houmame/irlinks.html CONTENT GOT: text/html 0 is NOT empty FC1: java.io.FileNotFoundException: http://www.aui.ma/personal/~Y.Houmame/irlinks.html WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.alakhawayn.ma ROBOT - SAVE CON OPENED: http://www.alakhawayn.ma/~Y.Houmame/ir_groups.html CONTENT GOT: text/html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-ir.inf.ethz.ch FC1: java.io.FileNotFoundException: http://www.aui.ma/personal/~Y.Houmame/ir_groups.html ROBOT - SAVE CON OPENED: http://www-ir.inf.ethz.ch/ir_groups.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.jhu.edu java.lang.NullPointerException - http://www-ir.inf.ethz.ch/ir_groups.html unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.jhu.edu unsupported 
Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.jhu.edu unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN mansci1.uwaterloo.ca ROBOT - SAVE CON OPENED: http://mansci1.uwaterloo.ca/~jjiang/biblio.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN mansci1.uwaterloo.ca java.lang.NullPointerException - http://mansci1.uwaterloo.ca/~jjiang/biblio.html ROBOT - SAVE CON OPENED: http://mansci1.uwaterloo.ca/~jjiang/ java.lang.NullPointerException - http://mansci1.uwaterloo.ca/~jjiang/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.bright.org.br. java.lang.NullPointerException - http://www.bright.org.br./pfg/Review/ir/welcome.htm ROBOT - SAVE CON OPENED: http://www.bright.org.br./pfg/Review/ir/welcome.htm 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN web.syr.edu ROBOT - SAVE CON OPENED: http://web.syr.edu/~diekemar/ir.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://mansci1.uwaterloo.ca/~jjiang/biblio.html NEW LINK FOUND: http://mansci1.uwaterloo.ca/~jjiang/ NEW LINK FOUND: http://www-inf.enst.fr/~rungsawa/irrs.html NEW LINK FOUND: http://www-inf.enst.fr/~rungsawa/ NEW LINK FOUND: http://www.seas.gwu.edu/student/chulee/bib.html NEW LINK FOUND: http://www.seas.gwu.edu/student/chulee/ NEW LINK FOUND: http://www.sils.umich.edu/~mjpinto/ILS609Page/Bibliography/ IRBibliography.html NEW LINK FOUND: http://www.si.umich.edu/~mjpinto/ NEW LINK FOUND: http://joinus.comeng.chungnam.ac.kr/~dolphin/db/indices/a-tree/s/ Salton:Gerard.html NEW LINK FOUND: http://superbook.bellcore.com/~std/LSI.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/0.html NEW LINK FOUND: http://www.hcibib.org/ NEW LINK FOUND: http://www.ubilab.ubs.ch/sigir96/welcome.html NEW LINK FOUND: http://trec.nist.gov/ NEW LINK FOUND: http://cs.nyu.edu/cs/faculty/grishman/muc6.html NEW LINK FOUND: http://www.dcs.gla.ac.uk/logic95/ NEW 
LINK FOUND: http://www.rxrc.xerox.com/research/mltt/DMHead/CLIR/ NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/Onderw/Ir/ir.html NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/Onderw/Barcpaai/barcpaai.html NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/engels.html NEW LINK FOUND: http://www.cs.bilkent.edu.tr/~david/cs533/cs533.html NEW LINK FOUND: http://www.cs.bilkent.edu.tr/~david/david.html NEW LINK FOUND: http://ocelot.cat.syr.edu/~farhad/dissertation.html NEW LINK FOUND: http://ocelot.cat.syr.edu/~farhad/ NEW LINK FOUND: http://www.fask.uni-mainz.de/user/krueger/dissweb/Diss-00.html NEW LINK FOUND: http://www.cs.columbia.edu/~acl/nlpfaq.txt NEW LINK FOUND: http://www.dcs.gla.ac.uk/idom/irlist/ NEW LINK FOUND: http://xxx.lanl.gov/cmp-lg/ NEW LINK FOUND: http://www.dlib.org/ NEW LINK FOUND: gopher://ukoln.bath.ac.uk:7070/11/BUBL_Main_Menu/E/E2/E2EI06 NEW LINK FOUND: gopher://ukoln.bath.ac.uk:7070/11/BUBL_Main_Menu/E/E2/E2EJ02 NEW LINK FOUND: gopher://ukoln.bath.ac.uk:7070/11/BUBL_Main_Menu/E/E2/E2EJ05 NEW LINK FOUND: http://www.hotwired.com/frontdoor/ NEW LINK FOUND: http://www.asis.org/ NEW LINK FOUND: http://www.acm.org/sigir/ NEW LINK FOUND: http://www.tipster.org NEW LINK FOUND: http://ciir.cs.umass.edu/info/ciirinfo.html NEW LINK FOUND: http://www.dcs.gla.ac.uk/ir/ NEW LINK FOUND: http://www-nlpir.nist.gov/ NEW LINK FOUND: http://www-ir.inf.ethz.ch/ NEW LINK FOUND: http://www.dcs.glasgow.ac.uk/Keith/Preface.html NEW LINK FOUND: http://ciir.cs.umass.edu:80/info/psfiles/irpubs/sigir93-4/sigir93-4.html NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/Onderw/Paai/Ai_ir/ai_ir.html NEW LINK FOUND: http://sunsite.anu.edu.au/mirrors/dlib/dlib/november95/11croft.html NEW LINK FOUND: http://www.cs.kun.nl/~theoh/ NEW LINK FOUND: http://bliss.berkeley.edu/papers/ NEW LINK FOUND: http://bliss.berkeley.edu/ NEW LINK FOUND: http://www.cse.ogi.edu/CSLU/HLTsurvey/HLTsurvey.html NEW LINK FOUND: http://www.enee.umd.edu//medlab/filter/filter_project.html NEW LINK FOUND: 
http://cs.nyu.edu/cs/faculty/grishman/proteus.html NEW LINK FOUND: http://www-nlp.cs.umass.edu/ NEW LINK FOUND: http://www.compapp.dcu.ie/~asmeaton/group.html NEW LINK FOUND: http://hwr.nici.kun.nl/~profile/index.html NEW LINK FOUND: http://www.hardlink.com/~chambers/HLP/ NEW LINK FOUND: http://www.dcs.gla.ac.uk/idom/ir_resources/ NEW LINK FOUND: http://www.ee.umd.edu/medlab/mlir/mlir.html NEW LINK FOUND: http://www.glue.umd.edu/~oard/Welcome.html NEW LINK FOUND: http://www.digits.com/ NEW LINK FOUND: mailto: diekemar@mailbox.syr.edu NEW LINK FOUND: http://web.syr.edu/~diekemar/ NEW LINK FOUND: mailto:diekemar@mailbox.syr.edu nach ParserDelegator() PARSER RUN 61 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://mansci1.uwaterloo.ca/~jjiang/biblio.html 0 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://mansci1.uwaterloo.ca/~jjiang/ 1 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-inf.enst.fr/~rungsawa/irrs.html 2 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-inf.enst.fr/~rungsawa/ 3 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.seas.gwu.edu/student/chulee/bib.html 4 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.seas.gwu.edu/student/chulee/ 5 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sils.umich.edu/~mjpinto/ILS609Page/Bibliography/ IRBibliography.html 6 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.si.umich.edu/~mjpinto/ 7 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://joinus.comeng.chungnam.ac.kr/~dolphin/db/indices/a-tree/s/ Salton:Gerard.html 8 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://superbook.bellcore.com/~std/LSI.html 9 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/0.html 10 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hcibib.org/ 11 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ubilab.ubs.ch/sigir96/welcome.html 12 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://trec.nist.gov/ 13 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs.nyu.edu/cs/faculty/grishman/muc6.html 14 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.gla.ac.uk/logic95/ 15 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/DMHead/CLIR/ 16 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pi0959.kub.nl:2080/Paai/Onderw/Ir/ir.html 17 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pi0959.kub.nl:2080/Paai/Onderw/Barcpaai/barcpaai.html 18 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pi0959.kub.nl:2080/Paai/engels.html 19 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.bilkent.edu.tr/~david/cs533/cs533.html 20 61 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.bilkent.edu.tr/~david/david.html 21 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ocelot.cat.syr.edu/~farhad/dissertation.html 22 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ocelot.cat.syr.edu/~farhad/ 23 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.fask.uni-mainz.de/user/krueger/dissweb/Diss-00.html 24 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~acl/nlpfaq.txt 25 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.gla.ac.uk/idom/irlist/ 26 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://xxx.lanl.gov/cmp-lg/ 27 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dlib.org/ 28 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 gopher://ukoln.bath.ac.uk:7070/11/BUBL_Main_Menu/E/E2/E2EI06 29 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 gopher://ukoln.bath.ac.uk:7070/11/BUBL_Main_Menu/E/E2/E2EJ02 30 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 gopher://ukoln.bath.ac.uk:7070/11/BUBL_Main_Menu/E/E2/E2EJ05 31 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hotwired.com/frontdoor/ 32 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org/ 33 61 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/sigir/ 34 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tipster.org 35 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu/info/ciirinfo.html 36 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.gla.ac.uk/ir/ 37 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-nlpir.nist.gov/ 38 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-ir.inf.ethz.ch/ 39 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.glasgow.ac.uk/Keith/Preface.html 40 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu:80/info/psfiles/irpubs/sigir93-4/sigir93-4.html 41 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pi0959.kub.nl:2080/Paai/Onderw/Paai/Ai_ir/ai_ir.html 42 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sunsite.anu.edu.au/mirrors/dlib/dlib/november95/11croft.html 43 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.kun.nl/~theoh/ 44 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://bliss.berkeley.edu/papers/ 45 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://bliss.berkeley.edu/ 46 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 http://www.cse.ogi.edu/CSLU/HLTsurvey/HLTsurvey.html 47 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.enee.umd.edu//medlab/filter/filter_project.html 48 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs.nyu.edu/cs/faculty/grishman/proteus.html 49 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-nlp.cs.umass.edu/ 50 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.compapp.dcu.ie/~asmeaton/group.html 51 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://hwr.nici.kun.nl/~profile/index.html 52 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hardlink.com/~chambers/HLP/ 53 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.gla.ac.uk/idom/ir_resources/ 54 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/mlir/mlir.html 55 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard/Welcome.html 56 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.digits.com/ 57 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto: diekemar@mailbox.syr.edu 58 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://web.syr.edu/~diekemar/ 59 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
mailto:diekemar@mailbox.syr.edu 60 61 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.dcs.gla.ac.uk ROBOT - SAVE CON OPENED: http://www.dcs.gla.ac.uk/idom/ir_resources/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.dcs.gla.ac.uk/ir/ir_resources.html nach ParserDelegator() PARSER RUN 1 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.dcs.gla.ac.uk/ir/ir_resources.html 0 1 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.dcs.gla.ac.uk ROBOT - SAVE CON OPENED: http://www.dcs.gla.ac.uk/idom/ir_resources/test_collections/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: ftp://medir.ohsu.edu/pub/ohsumed/ NEW LINK FOUND: http://www.daviddlewis.com/resources/testcollections/reuters21578/ NEW LINK FOUND: http://trec.nist.gov/ nach ParserDelegator() PARSER RUN 3 crawl_level: 1 MAX_CRAWL_DEPTH: 2 ftp://medir.ohsu.edu/pub/ohsumed/ 0 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.daviddlewis.com/resources/testcollections/reuters21578/ 1 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://trec.nist.gov/ 2 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.iam.unibe.ch ROBOT - SAVE CON OPENED: http://www.iam.unibe.ch/~scg/Archive/Software/FreeDB/FreeDB.home.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.iam.unibe.ch/~scg/Archive/Software/FreeDB/FreeDB.home.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.utk.edu ROBOT - SAVE CON OPENED: 
http://www.cs.utk.edu/~berry/sc95/sc95.html CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ee.umd.edu ROBOT - SAVE CON OPENED: http://www.ee.umd.edu/medlab/mlir/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.clis.umd.edu/dlrg/ NEW LINK FOUND: http://www.clis.umd.edu/ NEW LINK FOUND: mailto:oard@glue.umd.edu NEW LINK FOUND: http://www.dlib.org/dlib/december97/oard/12oard.html NEW LINK FOUND: http://www.glue.umd.edu/~oard/Welcome.html nach ParserDelegator() PARSER RUN 5 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.clis.umd.edu/dlrg/ 0 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clis.umd.edu/ 1 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:oard@glue.umd.edu 2 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dlib.org/dlib/december97/oard/12oard.html 3 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard/Welcome.html 4 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.kdnuggets.com ROBOT - SAVE CON OPENED: http://www.kdnuggets.com/index.html CONTENT GOT: text/html javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.kdnuggets.com/index.html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.kdnuggets.com ROBOT - SAVE CON OPENED: http://www.kdnuggets.com/tools.html CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is 
NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.bham.ac.uk ROBOT - SAVE CON OPENED: http://www.cs.bham.ac.uk/~anp/TheDataMine.html CONTENT GOT: text/html javax.swing.text.ChangedCharSetException vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.umbc.edu java.lang.NullPointerException - http://www.cs.bham.ac.uk/~anp/TheDataMine.html ROBOT - SAVE CON OPENED: http://www.cs.umbc.edu/~ian/research-bib.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.cs.umbc.edu/~ian/research-bib.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.asis.org ROBOT - SAVE CON OPENED: http://www.asis.org/Publications/JASIS/ CONTENT GOT: text/html javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.asis.org/Publications/JASIS/ vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.wkap.nl unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN searchenginewatch.internet.com unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN info.webcrawler.com ROBOT - SAVE CON OPENED: http://info.webcrawler.com/mak/projects/robots/robots.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.webcrawler.com/info.wbcrwl/yellow-pages NEW LINK FOUND: http://www.webcrawler.com/info.wbcrwl/white-pages NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/advanced/web.htm? NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/preferences.htm?ver=193 NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/help/index.htm? NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/advanced/' + searchType.toLowerCase() + '.htm? NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/advanced/'+searchType+'.htm? 
NEW LINK FOUND: http://clickit.go2net.com/adclick?cid=372985&area=click.tracking&site=search&cp=info.wbcrwl&src=0&shape=link&rawto=http://msxml.webcrawler.com/info.wbcrwl/searchspy/ NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/web/Hurricane%2BSeason NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/web/Raymond%2BFinale NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/web/Ringtones NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/web/Solar%2BSail NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/web/Jerry%2BRice NEW LINK FOUND: http://msxml.webcrawler.com/info.wbcrwl/search/web/Sith%2BPosters NEW LINK FOUND: http://www.webcrawler.com/info.wbcrwl/search/help/bookmark.htm NEW LINK FOUND: https://secure.ah-ha.com/guaranteed_inclusion/teaser.aspx?network=WebCrawler NEW LINK FOUND: http://www.webcrawler.com/info.wbcrwl/search/help/terms.htm NEW LINK FOUND: http://www.webcrawler.com/info.wbcrwl/search/help/privacy.htm NEW LINK FOUND: http://www.webcrawler.com/info.wbcrwl/search/help/contact.htm NEW LINK FOUND: http://www.webcrawler.com/info.wbcrwl/search/help/popular.htm NEW LINK FOUND: http://www.dogpile.com NEW LINK FOUND: http://www.infospaceinc.com nach ParserDelegator() PARSER RUN 22 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.webcrawler.com/info.wbcrwl/yellow-pages 0 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webcrawler.com/info.wbcrwl/white-pages 1 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/advanced/web.htm? 
2 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/preferences.htm?ver=193 3 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/help/index.htm? 4 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/advanced/' + searchType.toLowerCase() + '.htm? 5 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/advanced/'+searchType+'.htm? 6 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://clickit.go2net.com/adclick?cid=372985&area=click.tracking&site=search&cp=info.wbcrwl&src=0&shape=link&rawto=http://msxml.webcrawler.com/info.wbcrwl/searchspy/ 7 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/web/Hurricane%2BSeason 8 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/web/Raymond%2BFinale 9 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/web/Ringtones 10 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/web/Solar%2BSail 11 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://msxml.webcrawler.com/info.wbcrwl/search/web/Jerry%2BRice 12 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://msxml.webcrawler.com/info.wbcrwl/search/web/Sith%2BPosters 13 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webcrawler.com/info.wbcrwl/search/help/bookmark.htm 14 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 https://secure.ah-ha.com/guaranteed_inclusion/teaser.aspx?network=WebCrawler 15 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webcrawler.com/info.wbcrwl/search/help/terms.htm 16 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webcrawler.com/info.wbcrwl/search/help/privacy.htm 17 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webcrawler.com/info.wbcrwl/search/help/contact.htm 18 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webcrawler.com/info.wbcrwl/search/help/popular.htm 19 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dogpile.com 20 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.infospaceinc.com 21 22 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-db.stanford.edu ROBOT - SAVE CON OPENED: http://www-db.stanford.edu/~cho/crawler-paper/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www-db.stanford.edu/~cho/crawler-paper/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.jcu.edu.au ROBOT - SAVE CON OPENED: http://www.cs.jcu.edu.au/~alison/TONY/lexparse.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: 
http://www.cs.jcu.edu.au/~alison/TONY/lexparse.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN developer.java.sun.com ROBOT - SAVE CON OPENED: http://developer.java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/ CONTENT GOT: text/html javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://developer.java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/ vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN open.muscat.com ROBOT - SAVE CON OPENED: http://open.muscat.com/stemming/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ils.unc.edu java.lang.NullPointerException - http://open.muscat.com/stemming/ ROBOT - SAVE CON OPENED: http://www.ils.unc.edu/keyes/java/porter/index.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.tartarus.org/~martin/PorterStemmer/ nach ParserDelegator() PARSER RUN 1 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.tartarus.org/~martin/PorterStemmer/ 0 1 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ftp.cs.cornell.edu unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ftp.cs.cornell.edu unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ftp.cs.cornell.edu unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ftp.cs.cornell.edu unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN pi0959.kub.nl ROBOT - SAVE CON OPENED: http://pi0959.kub.nl:2080/Paai/Onderw/Smart/ CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ee.umd.edu ROBOT - SAVE CON OPENED: 
http://www.ee.umd.edu/~oard/software.html CONTENT GOT: text/html FC1: java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.ee.umd.edu/~oard/software.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.dcs.gla.ac.uk ROBOT - SAVE CON OPENED: http://www.dcs.gla.ac.uk/ftp/pub/Bruin/HTML/SMART.HTM CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR68-11?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR68-5?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR69-39?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR69-40?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR71-115?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR72-139?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR80-447?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR83-560?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR83-561?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR87-868?abstract=SMART NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR90-1115?abstract=SMART NEW LINK FOUND: mailto: tech-reports@cs.cornell.edu nach ParserDelegator() PARSER RUN 12 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR68-11?abstract=SMART 0 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR68-5?abstract=SMART 1 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR69-39?abstract=SMART 2 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR69-40?abstract=SMART 3 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR71-115?abstract=SMART 4 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR72-139?abstract=SMART 5 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR80-447?abstract=SMART 6 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR83-560?abstract=SMART 7 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR83-561?abstract=SMART 8 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR87-868?abstract=SMART 9 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR90-1115?abstract=SMART 10 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto: tech-reports@cs.cornell.edu 11 12 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.wpi.edu ROBOT - SAVE CON OPENED: http://www.cs.wpi.edu/~ifc/courses/CS525I/ java.lang.NullPointerException - http://www.cs.wpi.edu/~ifc/courses/CS525I/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN tux.cs.brown.edu ROBOT - SAVE CON OPENED: http://tux.cs.brown.edu/courses/cs295-3/ CONTENT GOT: text/html vor ParserDelegator() NEW 
LINK FOUND: http://www.cs.brown.edu/people/th NEW LINK FOUND: http://www.amazon.com/exec/obidos/ASIN/0471056693/ NEW LINK FOUND: http://www.amazon.com/exec/obidos/ASIN/0521780195/ NEW LINK FOUND: http://www.amazon.com/exec/obidos/ASIN/0387987800/ NEW LINK FOUND: http://www.amazon.com/exec/obidos/ASIN/0387946187/ NEW LINK FOUND: http://www.amazon.com/exec/obidos/ASIN/0198538642/ NEW LINK FOUND: http://www.amazon.com/exec/obidos/ASIN/0070428077// NEW LINK FOUND: http://www.amazon.com/exec/obidos/ASIN/0387952845/ NEW LINK FOUND: http://www.informatik.uni-freiburg.de/~ml/ecmlpkdd/ NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture1.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture2.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Homework1.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture3.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture4.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Homework2.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Programming1.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture5.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture6.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture7.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture8.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture9.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Homework3.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture10.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture11.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture12.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture13.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Programming2.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture14.pdf NEW LINK FOUND: 
http://www.cs.brown.edu/people/th/Course/Lecture15.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture16.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Lecture17.pdf NEW LINK FOUND: file:/course/cs295-3/JordanBishop-Chapter2.pdf NEW LINK FOUND: file:/course/cs295-3/JordanBishop-Chapter14.pdf NEW LINK FOUND: http://www.cs.brown.edu/people/th/Course/Homework4.pdf NEW LINK FOUND: file:/course/cs295-3/JordanBishop-Chapter15.pdf NEW LINK FOUND: http://citeseer.nj.nec.com/135897.html NEW LINK FOUND: http://www.cs.cmu.edu/Groups/NIPS/ nach ParserDelegator() PARSER RUN 37 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.cs.brown.edu/people/th 0 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.amazon.com/exec/obidos/ASIN/0471056693/ 1 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.amazon.com/exec/obidos/ASIN/0521780195/ 2 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.amazon.com/exec/obidos/ASIN/0387987800/ 3 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.amazon.com/exec/obidos/ASIN/0387946187/ 4 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.amazon.com/exec/obidos/ASIN/0198538642/ 5 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.amazon.com/exec/obidos/ASIN/0070428077// 6 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.amazon.com/exec/obidos/ASIN/0387952845/ 7 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-freiburg.de/~ml/ecmlpkdd/ 8 37 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture1.pdf 9 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture2.pdf 10 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Homework1.pdf 11 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture3.pdf 12 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture4.pdf 13 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Homework2.pdf 14 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Programming1.pdf 15 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture5.pdf 16 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture6.pdf 17 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture7.pdf 18 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture8.pdf 19 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture9.pdf 20 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) 
URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Homework3.pdf 21 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture10.pdf 22 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture11.pdf 23 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture12.pdf 24 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture13.pdf 25 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Programming2.pdf 26 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture14.pdf 27 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture15.pdf 28 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture16.pdf 29 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brown.edu/people/th/Course/Lecture17.pdf 30 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 file:/course/cs295-3/JordanBishop-Chapter2.pdf 31 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 file:/course/cs295-3/JordanBishop-Chapter14.pdf 32 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 http://www.cs.brown.edu/people/th/Course/Homework4.pdf 33 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 file:/course/cs295-3/JordanBishop-Chapter15.pdf 34 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://citeseer.nj.nec.com/135897.html 35 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/Groups/NIPS/ 36 37 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.sims.berkeley.edu ROBOT - SAVE CON OPENED: http://www.sims.berkeley.edu/courses/is202/f99/ CONTENT GOT: text/html javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.sims.berkeley.edu/courses/is202/f99/ vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.sims.berkeley.edu ROBOT - SAVE CON OPENED: http://www.sims.berkeley.edu/courses/is296a-4/f99/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.sims.berkeley.edu NEW LINK FOUND: http://www.sims.berkeley.edu/~hearst nach ParserDelegator() PARSER RUN 2 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.sims.berkeley.edu 0 2 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sims.berkeley.edu/~hearst 1 2 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN maya.cs.depaul.edu ROBOT - SAVE CON OPENED: http://maya.cs.depaul.edu/~mobasher/classes/ds599/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: mailto:mobasher@cs.depaul.edu NEW LINK FOUND: http://maya.cs.depaul.edu/~mobasher NEW LINK FOUND: 
mailto:mobasher@cs.depaul.edu NEW LINK FOUND: mailto:mobasher@cs.depaul.edu nach ParserDelegator() PARSER RUN 4 crawl_level: 1 MAX_CRAWL_DEPTH: 2 mailto:mobasher@cs.depaul.edu 0 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://maya.cs.depaul.edu/~mobasher 1 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:mobasher@cs.depaul.edu 2 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:mobasher@cs.depaul.edu 3 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.gslis.utexas.edu ROBOT - SAVE CON OPENED: http://www.gslis.utexas.edu/~palmquis/inforets99/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.gslis.utexas.edu NEW LINK FOUND: http://www.utexas.edu NEW LINK FOUND: http://www.gslis.utexas.edu/~palmquis/ NEW LINK FOUND: mailto:palmquis@uts.cc.utexas.edu NEW LINK FOUND: mailto:pasqua@mail.utexas.edu NEW LINK FOUND: mailto:aferrell@ccwf.cc.utexas.edu,ackelly@gslis.utexas.edu,mcooper@gslis.utexas.edu,johnz@utxvms.cc.utexas.edu,Pasqua@mail.utexas.edu,rdoyle@gslis.utexas.edu,aristm@gslis.utexas.edu,leith@mail.utexas.edu,amir@cs.utexas.edu,olane@gslis.utexas.edu,mandy@gslis.utexas.edu,rwp@mail.utexas.edu,sandenaw@gslis.utexas.edu,casburn@gslis.utexas.edu,ehahn@mail.utexas.edu,sjohnson@gslis.utexas.edu,amy@gslis.utexas.edu,puppydv@gslis.utexas.edu,smegula@utxvms.cc.utexas.edu,jarizola@tenet.edu,jforbes@mail.utexas.edu,palmquis@uts.cc.utexas.edu NEW LINK FOUND: mailto:Pasqua@mail.utexas.edu nach ParserDelegator() PARSER RUN 7 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.gslis.utexas.edu 0 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.utexas.edu 1 7 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.gslis.utexas.edu/~palmquis/ 2 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:palmquis@uts.cc.utexas.edu 3 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:pasqua@mail.utexas.edu 4 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:aferrell@ccwf.cc.utexas.edu,ackelly@gslis.utexas.edu,mcooper@gslis.utexas.edu,johnz@utxvms.cc.utexas.edu,Pasqua@mail.utexas.edu,rdoyle@gslis.utexas.edu,aristm@gslis.utexas.edu,leith@mail.utexas.edu,amir@cs.utexas.edu,olane@gslis.utexas.edu,mandy@gslis.utexas.edu,rwp@mail.utexas.edu,sandenaw@gslis.utexas.edu,casburn@gslis.utexas.edu,ehahn@mail.utexas.edu,sjohnson@gslis.utexas.edu,amy@gslis.utexas.edu,puppydv@gslis.utexas.edu,smegula@utxvms.cc.utexas.edu,jarizola@tenet.edu,jforbes@mail.utexas.edu,palmquis@uts.cc.utexas.edu 5 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:Pasqua@mail.utexas.edu 6 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.cmu.edu ROBOT - SAVE CON OPENED: http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15850-s99/www/home.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.cs.cmu.edu/~guyb NEW LINK FOUND: mailto:guyb@cs.cmu.edu NEW LINK FOUND: http://www.cs.cmu.edu/~lafferty NEW LINK FOUND: mailto:lafferty@cs.cmu.edu NEW LINK FOUND: http://www.cs.cmu.edu/afs/glmiller/public/www/home.html NEW LINK FOUND: mailto:glmiller@cs.cmu.edu NEW LINK FOUND: http://www.cs.cmu.edu/~cleah/ NEW LINK FOUND: mailto:cleah@cs.cmu.edu nach ParserDelegator() PARSER RUN 8 crawl_level: 1 MAX_CRAWL_DEPTH: 2 
http://www.cs.cmu.edu/~guyb 0 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:guyb@cs.cmu.edu 1 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~lafferty 2 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:lafferty@cs.cmu.edu 3 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/glmiller/public/www/home.html 4 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:glmiller@cs.cmu.edu 5 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~cleah/ 6 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:cleah@cs.cmu.edu 7 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.umbc.edu ROBOT - SAVE CON OPENED: http://www.cs.umbc.edu/~nicholas/courses/691d/index.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.cs.umbc.edu/~nicholas/courses/691d/index.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.utk.edu ROBOT - SAVE CON OPENED: http://www.cs.utk.edu/~cs494/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.cs.utk.edu/~berry NEW LINK FOUND: http://www.cs.utk.edu/~browne NEW LINK FOUND: mailto:berry@cs.utk.edu NEW LINK FOUND: mailto:browne@cs.utk.edu NEW LINK FOUND: http://www.cs.utk.edu/~letsche NEW LINK FOUND: http://www.cs.utk.edu/~hudgens NEW LINK FOUND: ftp://ftp.vt.edu/pub/reuse/IR.code NEW LINK FOUND: http://www.cs.utk.edu/~cs494/labs/hall_of_fame/ NEW LINK FOUND: 
http://www.cs.utk.edu/~cs494/labs/lab1.html NEW LINK FOUND: http://www.cs.utk.edu/~cs494/labs/lab2.html NEW LINK FOUND: http://www.cs.utk.edu/~cs494/labs/lab3.html NEW LINK FOUND: http://www.cs.utk.edu/~cs494/labs/lab4.html NEW LINK FOUND: http://www.cs.utk.edu/~cs494/labs/lab5.html NEW LINK FOUND: http://www.cs.utk.edu/~cs494/labs/lab6.1.ps NEW LINK FOUND: http://www.cs.utk.edu/~berry NEW LINK FOUND: http://www.cs.utk.edu/ NEW LINK FOUND: http://www.utk.edu/ NEW LINK FOUND: http://gopher.lib.utk.edu:70/0/Other-Internet-Resources/pictures/html-docs/home.html NEW LINK FOUND: http://ciir.cs.umass.edu/index.html NEW LINK FOUND: http://cnidr.org/ NEW LINK FOUND: http://glimpse.cs.arizona.edu:1994/ NEW LINK FOUND: http://www.cs.utk.edu/~lsi NEW LINK FOUND: http://www.netlib.org/nse/ NEW LINK FOUND: http://potomac.ncsl.nist.gov/prise/prise.html NEW LINK FOUND: http://www-ir.inf.ethz.ch/ir_groups.html NEW LINK FOUND: mailto:cs494@cs.utk.edu nach ParserDelegator() PARSER RUN 26 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.cs.utk.edu/~berry 0 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~browne 1 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:berry@cs.utk.edu 2 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:browne@cs.utk.edu 3 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~letsche 4 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~hudgens 5 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.vt.edu/pub/reuse/IR.code 6 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.cs.utk.edu/~cs494/labs/hall_of_fame/ 7 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~cs494/labs/lab1.html 8 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~cs494/labs/lab2.html 9 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~cs494/labs/lab3.html 10 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~cs494/labs/lab4.html 11 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~cs494/labs/lab5.html 12 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~cs494/labs/lab6.1.ps 13 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~berry 14 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/ 15 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.utk.edu/ 16 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://gopher.lib.utk.edu:70/0/Other-Internet-Resources/pictures/html-docs/home.html 17 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu/index.html 18 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cnidr.org/ 19 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://glimpse.cs.arizona.edu:1994/ 20 26 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.utk.edu/~lsi 21 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.netlib.org/nse/ 22 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://potomac.ncsl.nist.gov/prise/prise.html 23 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-ir.inf.ethz.ch/ir_groups.html 24 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:cs494@cs.utk.edu 25 26 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.princeton.edu ROBOT - SAVE CON OPENED: http://www.cs.princeton.edu/courses/archive/spring98/cs598b/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.princeton.edu NEW LINK FOUND: http://www.cs.princeton.edu NEW LINK FOUND: http://www.cs.princeton.edu/~aslp/ NEW LINK FOUND: http://www.cs.princeton.edu/~aslp/ NEW LINK FOUND: mailto:aslp@cs.princeton.edu NEW LINK FOUND: mailto:barbu@cs.princeton.edu NEW LINK FOUND: http://computer.org/computer/dli/ NEW LINK FOUND: http://www.cise.nsf.gov/iis/dli_home.html NEW LINK FOUND: http://computer.org/computer/ NEW LINK FOUND: http://computer.org/computer/dli/r50028/r50028.htm NEW LINK FOUND: http://elib.cs.berkeley.edu/papers.html NEW LINK FOUND: http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1995-0013 NEW LINK FOUND: http://www.cise.nsf.gov/iis/dli_home.html NEW LINK FOUND: http://cimic3.rutgers.edu/ieee_dltf.html NEW LINK FOUND: http://neumann.computer.org/cspress/CATALOG/pr07402.htm NEW LINK FOUND: http://cesdis1.gsfc.nasa.gov/admin/adl97/adlcall.html NEW LINK FOUND: 
http://neumann.computer.org/cspress/CATALOG/pr08010.htm NEW LINK FOUND: http://www.alexandria.ucsb.edu/conferences/ADL98/ NEW LINK FOUND: http://www5conf.inria.fr/ NEW LINK FOUND: http://proceedings.www6conf.org/ NEW LINK FOUND: http://www7.conf.au/ NEW LINK FOUND: http://www.acm.org/pubs/contents/proceedings/dl/226931/index.html NEW LINK FOUND: http://www.acm.org/pubs/contents/proceedings/dl/263690/index.html NEW LINK FOUND: http://www.ks.com/dl98/1.html NEW LINK FOUND: http://www.acm.org/pubs/contents/proceedings/ir/ NEW LINK FOUND: http://www.asis.org/annual-96/ElectronicProceedings/ NEW LINK FOUND: http://www.asis.org/annual-97/ASIS97.htm NEW LINK FOUND: http://www.asis.org/annual-97/accepted.htm NEW LINK FOUND: http://www.infonortics.com/bath97.html NEW LINK FOUND: http://www.infonortics.com/infodesc.html NEW LINK FOUND: http://www.asis.org/Publications/JASIS/jasis.html NEW LINK FOUND: http://www.jstor.org/ NEW LINK FOUND: http://www.videolib.princeton.edu/ NEW LINK FOUND: http://www.precise.cs.princeton.edu/ NEW LINK FOUND: http://alvarado97.princeton.edu/~cddc/ NEW LINK FOUND: http://www.kcpl.lib.mo.us/search/srchengines.htm NEW LINK FOUND: http://www.searchenginewatch.com/ NEW LINK FOUND: http://www.searchenginewatch.com/reports/reviewchart.html NEW LINK FOUND: http://www.infonortics.com/test/bathmeeting/bath1997/debwiley/bathmeet.htm NEW LINK FOUND: http://lcweb.loc.gov/z3950/agency/ NEW LINK FOUND: http://www.cs.berkeley.edu/~phelps/Multivalent/index.html nach ParserDelegator() PARSER RUN 41 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.princeton.edu 0 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.princeton.edu 1 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.princeton.edu/~aslp/ 2 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.cs.princeton.edu/~aslp/ 3 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:aslp@cs.princeton.edu 4 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:barbu@cs.princeton.edu 5 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://computer.org/computer/dli/ 6 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cise.nsf.gov/iis/dli_home.html 7 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://computer.org/computer/ 8 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://computer.org/computer/dli/r50028/r50028.htm 9 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://elib.cs.berkeley.edu/papers.html 10 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-diglib.stanford.edu/cgi-bin/WP/get/SIDL-WP-1995-0013 11 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cise.nsf.gov/iis/dli_home.html 12 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cimic3.rutgers.edu/ieee_dltf.html 13 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://neumann.computer.org/cspress/CATALOG/pr07402.htm 14 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cesdis1.gsfc.nasa.gov/admin/adl97/adlcall.html 15 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://neumann.computer.org/cspress/CATALOG/pr08010.htm 16 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.alexandria.ucsb.edu/conferences/ADL98/ 17 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www5conf.inria.fr/ 18 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://proceedings.www6conf.org/ 19 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www7.conf.au/ 20 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/pubs/contents/proceedings/dl/226931/index.html 21 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/pubs/contents/proceedings/dl/263690/index.html 22 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ks.com/dl98/1.html 23 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/pubs/contents/proceedings/ir/ 24 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org/annual-96/ElectronicProceedings/ 25 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org/annual-97/ASIS97.htm 26 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org/annual-97/accepted.htm 27 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.infonortics.com/bath97.html 28 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 http://www.infonortics.com/infodesc.html 29 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org/Publications/JASIS/jasis.html 30 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.jstor.org/ 31 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.videolib.princeton.edu/ 32 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.precise.cs.princeton.edu/ 33 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://alvarado97.princeton.edu/~cddc/ 34 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.kcpl.lib.mo.us/search/srchengines.htm 35 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.searchenginewatch.com/ 36 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.searchenginewatch.com/reports/reviewchart.html 37 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.infonortics.com/test/bathmeeting/bath1997/debwiley/bathmeet.htm 38 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lcweb.loc.gov/z3950/agency/ 39 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.berkeley.edu/~phelps/Multivalent/index.html 40 41 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ciir.cs.umass.edu ROBOT - SAVE CON OPENED: 
http://ciir.cs.umass.edu/cmpsci646/ CONTENT GOT: text/html vor ParserDelegator() javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://ciir.cs.umass.edu/cmpsci646/ PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN hobart.cs.umass.edu ROBOT - SAVE CON OPENED: http://hobart.cs.umass.edu/~allan/irclass.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.wisc.edu java.lang.NullPointerException - http://hobart.cs.umass.edu/~allan/irclass.html ROBOT - SAVE CON OPENED: http://www.cs.wisc.edu/~shavlik/cs838/cs838.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http:/~belew/ NEW LINK FOUND: http:/~shavlik/ NEW LINK FOUND: http:/areas/ai/cs540-all.html NEW LINK FOUND: http:/~shavlik/cs760.html NEW LINK FOUND: mailto:mlir@cs.wisc.edu NEW LINK FOUND: http:/~shavlik/cs838/mail/ NEW LINK FOUND: http:/~shavlik/cs838/cs838-sched-tbl.html NEW LINK FOUND: http:/~shavlik/cs838/cs838-sched-txt.html NEW LINK FOUND: mailto:belew@cs.wisc.edu NEW LINK FOUND: mailto:shavlik@cs.wisc.edu NEW LINK FOUND: http:/~belew/ NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/lsi2.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/sig.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dair94/inc.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dair94/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dpe-ml/dair93.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dair93/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/aaai94_3.ps NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/abstract.html NEW LINK FOUND: 
http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/info-spiders.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/abstract.html NEW LINK FOUND: http://www.cs.washington.edu/research/projects/softbots/www/oren.html NEW LINK FOUND: ftp://cs.washington.edu/pub/map/papers/Category-Translation.ps NEW LINK FOUND: http://www.cs.washington.edu/homes/map/ NEW LINK FOUND: http://cs-www.uchicago.edu/~kris/index.html NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/hammond.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www.csi.uottawa.ca/~holte/Learning/index.html NEW LINK FOUND: http://www.csi.uottawa.ca/~holte/Learning/TR-95-12.ps NEW LINK FOUND: http://www.csi.uottawa.ca/~holte/Learning/aaai94ss.ps NEW LINK FOUND: http://www.csi.uottawa.ca/~holte/Learning/kbse93.ps NEW LINK FOUND: http://webhound.www.media.mit.edu/projects/webhound/doc/Webhound.html NEW LINK FOUND: http://agents.www.media.mit.edu/groups/agents/papers/aaai-ymp/aaai.html NEW LINK FOUND: ftp://media.mit.edu/pub/agents/interface-agents/coll-agents.ps NEW LINK FOUND: http://yezdi.www.media.mit.edu/people/yezdi NEW LINK FOUND: http://memetral.www.media.mit.edu/people/memetral/ NEW LINK FOUND: http://pattie.www.media.mit.edu/people/pattie/ NEW LINK FOUND: http://www.aaai.org/Publications/Press/press.html NEW LINK FOUND: http://anther.learning.cs.cmu.edu/ifhome.html NEW LINK FOUND: http://anther.learning.cs.cmu.edu/ml95.ps NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/user/mitchell/ftp/tomhome.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/project-home.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/webagent-plus.ps.Z NEW LINK FOUND: 
http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/mltagung-e.ps.Z NEW LINK FOUND: http://www.ics.uci.edu/dir/faculty/AI/pazzani/ NEW LINK FOUND: http://www.ics.uci.edu/~pazzani/I3.html NEW LINK FOUND: http://www.ics.uci.edu/~pazzani/Coldlist.html NEW LINK FOUND: http://robotics.stanford.edu/groups/nobotics/home.html NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/jvcir.ps NEW LINK FOUND: http://flamingo.stanford.edu/users/marko/bio.html NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/vcir.abs NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/lira.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/lira.abs NEW LINK FOUND: ftp://scr.siemens.com:/pub/learning/Papers/voorhees/README NEW LINK FOUND: ftp://scr.siemens.com:/pub/learning/Papers/towell/ml95.ps.gz NEW LINK FOUND: http://www.cs.colorado.edu/~andreas/Time-Series/MyPapers/topic-spotting.ps.Z NEW LINK FOUND: http://www.cs.colorado.edu/~andreas/ NEW LINK FOUND: ftp://parcftp.xerox.com/pub/qca/SIGIR95.ps NEW LINK FOUND: http:/~shavlik/cs838/dlewis-sigir94.ps NEW LINK FOUND: http://ai.iit.nrc.ca/DEIL/krulwich.ps.Z NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/krulwich.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://ai.iit.nrc.ca/DEIL/martin.ps.Z NEW LINK FOUND: http://www.research.att.com/orgs/ssr/people/wcohen/postscript/ml-95-ir.ps NEW LINK FOUND: http://www.research.att.com/orgs/ssr/people/wcohen/ NEW LINK FOUND: ftp://odyssey.ucc.ie/pub/filtering/INNC94.ps NEW LINK FOUND: http://ciir.cs.umass.edu/info/psfiles/irpubs/ml95_ToC.html NEW LINK FOUND: 
ftp://ciir-ftp.cs.umass.edu/pub/papers/lewis/nlirbib93.ps.Z NEW LINK FOUND: http://atg1.wustl.edu/DL94/paper/futrelle.html NEW LINK FOUND: http://ai.iit.nrc.ca/II_public/ NEW LINK FOUND: http://www.cs.columbia.edu/~acl/home.html NEW LINK FOUND: http://xxx.lanl.gov/cmp-lg/ NEW LINK FOUND: http://www.cogsci.princeton.edu/~wn/index.html NEW LINK FOUND: http://www.glue.umd.edu/enee/medlab/filter/filter.html NEW LINK FOUND: http://www.uspto.gov/web/ipnii/ NEW LINK FOUND: http://www.itd.nrl.navy.mil/ONR/aci/ NEW LINK FOUND: http://www.aic.nrl.navy.mil/~aha/people.html NEW LINK FOUND: http:/~shavlik/interesting-links.html NEW LINK FOUND: http:/~shavlik/journal-tocs/tocs.html NEW LINK FOUND: mailto:shavlik@cs.wisc.edu NEW LINK FOUND: mailto:belew@cs.wisc.edu nach ParserDelegator() PARSER RUN 92 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http:/~belew/ 0 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/ 1 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/areas/ai/cs540-all.html 2 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/cs760.html 3 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:mlir@cs.wisc.edu 4 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/cs838/mail/ 5 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/cs838/cs838-sched-tbl.html 6 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/cs838/cs838-sched-txt.html 7 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:belew@cs.wisc.edu 8 92 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:shavlik@cs.wisc.edu 9 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~belew/ 10 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/lsi2.ps 11 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/abstract.html 12 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/sig.ps 13 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/abstract.html 14 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dair94/inc.ps 15 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dair94/abstract.html 16 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dpe-ml/dair93.ps 17 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dair93/abstract.html 18 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/aaai94_3.ps 19 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 20 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/abstract.html 21 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/info-spiders.ps 22 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 23 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 24 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/abstract.html 25 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.washington.edu/research/projects/softbots/www/oren.html 26 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://cs.washington.edu/pub/map/papers/Category-Translation.ps 27 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.washington.edu/homes/map/ 28 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-www.uchicago.edu/~kris/index.html 29 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/hammond.ps 30 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 31 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 32 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via 
addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte/Learning/index.html 33 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte/Learning/TR-95-12.ps 34 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte/Learning/aaai94ss.ps 35 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte/Learning/kbse93.ps 36 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://webhound.www.media.mit.edu/projects/webhound/doc/Webhound.html 37 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://agents.www.media.mit.edu/groups/agents/papers/aaai-ymp/aaai.html 38 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://media.mit.edu/pub/agents/interface-agents/coll-agents.ps 39 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://yezdi.www.media.mit.edu/people/yezdi 40 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://memetral.www.media.mit.edu/people/memetral/ 41 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pattie.www.media.mit.edu/people/pattie/ 42 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Publications/Press/press.html 43 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://anther.learning.cs.cmu.edu/ifhome.html 44 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 
2 http://anther.learning.cs.cmu.edu/ml95.ps 45 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/user/mitchell/ftp/tomhome.html 46 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/project-home.html 47 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/webagent-plus.ps.Z 48 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 49 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 50 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/mltagung-e.ps.Z 51 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ics.uci.edu/dir/faculty/AI/pazzani/ 52 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ics.uci.edu/~pazzani/I3.html 53 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ics.uci.edu/~pazzani/Coldlist.html 54 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/groups/nobotics/home.html 55 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/jvcir.ps 56 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via 
addLink to crawl_level 2 http://flamingo.stanford.edu/users/marko/bio.html 57 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/vcir.abs 58 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/lira.ps 59 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 60 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 61 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/lira.abs 62 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://scr.siemens.com:/pub/learning/Papers/voorhees/README 63 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://scr.siemens.com:/pub/learning/Papers/towell/ml95.ps.gz 64 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.colorado.edu/~andreas/Time-Series/MyPapers/topic-spotting.ps.Z 65 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.colorado.edu/~andreas/ 66 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://parcftp.xerox.com/pub/qca/SIGIR95.ps 67 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/cs838/dlewis-sigir94.ps 68 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://ai.iit.nrc.ca/DEIL/krulwich.ps.Z 69 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/krulwich.ps 70 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 71 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 72 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/DEIL/martin.ps.Z 73 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.research.att.com/orgs/ssr/people/wcohen/postscript/ml-95-ir.ps 74 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.research.att.com/orgs/ssr/people/wcohen/ 75 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://odyssey.ucc.ie/pub/filtering/INNC94.ps 76 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu/info/psfiles/irpubs/ml95_ToC.html 77 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ciir-ftp.cs.umass.edu/pub/papers/lewis/nlirbib93.ps.Z 78 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://atg1.wustl.edu/DL94/paper/futrelle.html 79 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/II_public/ 80 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~acl/home.html 81 92 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://xxx.lanl.gov/cmp-lg/ 82 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cogsci.princeton.edu/~wn/index.html 83 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/enee/medlab/filter/filter.html 84 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.uspto.gov/web/ipnii/ 85 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.itd.nrl.navy.mil/ONR/aci/ 86 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aic.nrl.navy.mil/~aha/people.html 87 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/interesting-links.html 88 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http:/~shavlik/journal-tocs/tocs.html 89 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:shavlik@cs.wisc.edu 90 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:belew@cs.wisc.edu 91 92 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ei.cs.vt.edu ROBOT - SAVE CON OPENED: http://ei.cs.vt.edu/~cs5604/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.cs.vt.edu/~ramakris NEW LINK FOUND: mailto:naren@cs.vt.edu NEW LINK FOUND: mailto:fmin@cvt.edu nach ParserDelegator() PARSER RUN 3 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.cs.vt.edu/~ramakris 0 3 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:naren@cs.vt.edu 1 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:fmin@cvt.edu 2 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ags.uci.edu ROBOT - SAVE CON OPENED: http://www.ags.uci.edu/~jblevins/reviews/DISE/DISE_syllabus.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.ags.uci.edu/~jblevins/reviews/DISE/DISE_syllabus.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN pi0959.kub.nl ROBOT - SAVE CON OPENED: http://pi0959.kub.nl:2080/Paai/Onderw/Ir/ir.html CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.duke.edu unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.acm.org unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN sdmc.iss.nus.sg java.lang.NullPointerException - http://sdmc.iss.nus.sg/sigir-ap/ ROBOT - SAVE CON OPENED: http://sdmc.iss.nus.sg/sigir-ap/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-nlpir.nist.gov ROBOT - SAVE CON OPENED: http://www-nlpir.nist.gov/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.itl.nist.gov/iaui NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.commerce.gov NEW LINK FOUND: http://www.ta.doc.gov NEW LINK FOUND: mailto: ellen.voorhees@nist.gov NEW LINK FOUND: http://www.nist.gov/public_affairs/privacy.htm nach ParserDelegator() PARSER RUN 6 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.itl.nist.gov/iaui 0 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 1 6 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.commerce.gov 2 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ta.doc.gov 3 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto: ellen.voorhees@nist.gov 4 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/public_affairs/privacy.htm 5 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN trec.nist.gov ROBOT - SAVE CON OPENED: http://trec.nist.gov/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.itl.nist.gov NEW LINK FOUND: http://www.itl.nist.gov/div894/894.02 NEW LINK FOUND: http://www.itl.nist.gov/div894 NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.ic-arda.org NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.doc.gov NEW LINK FOUND: http://www.ta.doc.gov NEW LINK FOUND: http://www.nist.gov/public_affairs/privacy.htm NEW LINK FOUND: http://www.nist.gov/public_affairs/disclaim.htm NEW LINK FOUND: http://www.nist.gov/admin/foia/foia.htm NEW LINK FOUND: mailto:trec@nist.gov nach ParserDelegator() PARSER RUN 11 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.itl.nist.gov 0 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.itl.nist.gov/div894/894.02 1 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.itl.nist.gov/div894 2 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.ic-arda.org 3 11 before addLink(isd.getUrls()[i]) 
after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 4 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.doc.gov 5 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ta.doc.gov 6 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/public_affairs/privacy.htm 7 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/public_affairs/disclaim.htm 8 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/admin/foia/foia.htm 9 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:trec@nist.gov 10 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN irsg.eu.org ROBOT - SAVE CON OPENED: http://irsg.eu.org/ java.lang.NullPointerException - http://irsg.eu.org/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.almaden.ibm.com ROBOT - SAVE CON OPENED: http://www.almaden.ibm.com/cs/k53/clever.html CONTENT GOT: text/html javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.almaden.ibm.com/cs/k53/clever.html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.canis.uiuc.edu ROBOT - SAVE CON OPENED: http://www.canis.uiuc.edu/projects/interspace/ CONTENT GOT: text/html javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.canis.uiuc.edu/projects/interspace/ vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ils.unc.edu ROBOT - 
SAVE CON OPENED: http://ils.unc.edu/iris/ CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-ai.ijs.si ROBOT - SAVE CON OPENED: http://www-ai.ijs.si/DunjaMladenic/yplanet.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.yahoo.com NEW LINK FOUND: http://alchemist.ijs.si/yquint/yquint.exe NEW LINK FOUND: http://www-ai.ijs.si/MarkoGrobelnik/MarkoGrobelnik.html NEW LINK FOUND: http://www-ai.ijs.si/DunjaMladenic/home.html NEW LINK FOUND: mailto:Dunja.Mladenic@ijs.si nach ParserDelegator() PARSER RUN 5 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.yahoo.com 0 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://alchemist.ijs.si/yquint/yquint.exe 1 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-ai.ijs.si/MarkoGrobelnik/MarkoGrobelnik.html 2 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-ai.ijs.si/DunjaMladenic/home.html 3 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:Dunja.Mladenic@ijs.si 4 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN google.stanford.edu ROBOT - SAVE CON OPENED: http://google.stanford.edu/ CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN cap.anu.edu.au javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://google.stanford.edu/ ROBOT - SAVE CON OPENED: http://cap.anu.edu.au/cap/projects/text_retrieval/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: 
http://web.soi.city.ac.uk/~andym/PADRE/anu.html NEW LINK FOUND: http://acsys.anu.edu.au/acsys/ NEW LINK FOUND: http://pastime.anu.edu.au/TAR/ NEW LINK FOUND: http://www-nlpir.nist.gov/TREC NEW LINK FOUND: http://cs.anu.edu.au/people/David.Hawking NEW LINK FOUND: http://pastime.anu.edu.au/pbt NEW LINK FOUND: http://cap.anu.edu.au/~rbs NEW LINK FOUND: http://cap.anu.edu.au/~peterb NEW LINK FOUND: http://pastime.anu.edu.au/TAR/publications.html NEW LINK FOUND: http://cs.anu.edu.au/techreports/1996/TR-CS-96-04.ps.gz NEW LINK FOUND: mailto:dave@cs.anu.edu.au nach ParserDelegator() PARSER RUN 11 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://web.soi.city.ac.uk/~andym/PADRE/anu.html 0 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://acsys.anu.edu.au/acsys/ 1 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pastime.anu.edu.au/TAR/ 2 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-nlpir.nist.gov/TREC 3 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs.anu.edu.au/people/David.Hawking 4 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pastime.anu.edu.au/pbt 5 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cap.anu.edu.au/~rbs 6 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cap.anu.edu.au/~peterb 7 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pastime.anu.edu.au/TAR/publications.html 8 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs.anu.edu.au/techreports/1996/TR-CS-96-04.ps.gz 9 11 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:dave@cs.anu.edu.au 10 11 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN pastime.anu.edu.au ROBOT - SAVE CON OPENED: http://pastime.anu.edu.au/TAR/ java.lang.NullPointerException - http://pastime.anu.edu.au/TAR/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.research.digital.com ROBOT - SAVE CON OPENED: http://www.research.digital.com/SRC/personal/Krishna_Bharat/WebArcheology/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.research.compaq.com/SRC/personal/Krishna_Bharat/WebArcheology/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.cmu.edu ROBOT - SAVE CON OPENED: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/ NEW LINK FOUND: http://www.cs.cmu.edu/~WebKB/ILP-data.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-51/www/co-training/data/ NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs/project/theo-20/www/data/bootstrappingIE/7sectors.tar.gz NEW LINK FOUND: http://www.cs.cmu.edu/~TextLearning NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/overview-aaai98.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/overview-aij99.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/~rayid/mypapers/kdd-shield.ps NEW LINK FOUND: http://www.cs.cmu.edu/~dunja/WshKDD2000.html NEW LINK FOUND: http://www.cora.justresearch.com NEW LINK FOUND: http://www.cs.cmu.edu/~mccallum/papers/cora-aaaiss99.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/~biokb/ NEW LINK FOUND: 
http://www.cs.cmu.edu/~craven/craven.ismb99.ps NEW LINK FOUND: http://www.cs.cmu.edu/~rayid/mypapers/ecoc-icml.ps NEW LINK FOUND: http://www.cs.cmu.edu/~knigam/papers/cotrain-CIKM00.ps NEW LINK FOUND: http://www.csee.umbc.edu/cikm/2000/ NEW LINK FOUND: http://www.cs.cmu.edu/~knigam/papers/emcat-mlj99.ps NEW LINK FOUND: http://www.cs.cmu.edu/~knigam/papers/maxent-ijcaiws99.ps NEW LINK FOUND: http://www.cs.cmu.edu/~knigam/papers/keywordcat-aclws99.ps NEW LINK FOUND: http://www.cs.cmu.edu/~mccallum/papers/hier-icml98.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/~mccallum/papers/emcat-aaai98.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/~mccallum/papers/emactive-icml98.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/colt98_final.ps NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/mitchell-pubs/iccs99.ps NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/aaai-ws-aslog.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/~knigam/papers/multinomial-aaaiws98.ps NEW LINK FOUND: http://www.cs.cmu.edu/~sean/papers/icml2000.ps NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ilp98.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ecml98.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/~rayid/mypapers/CIKM-model.ps NEW LINK FOUND: http://www.cs.cmu.edu/~rayid/mypapers/ACL.ps NEW LINK FOUND: http://www.ai.mit.edu/~jrennie/papers/icml99.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/aaai99iedict-rj.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/~knigam/papers/bootstrap-ijcaiws99.ps NEW LINK FOUND: http://www.cs.cmu.edu/~mccallum/papers/ieshrink-aaaiws99s.ps NEW LINK FOUND: http://www.cs.cmu.edu/~mccallum/papers/iestruct-aaaiws99s.ps NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/ms-ie.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/ling-ie.ps.gz NEW LINK FOUND: 
http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/webie.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/gi-ie.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/danthesis.ps.gz NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/choon-thesis.html NEW LINK FOUND: http://www.cs.cmu.edu/~rayid/ NEW LINK FOUND: http://www.cs.cmu.edu/~rosie/ NEW LINK FOUND: http://www.cs.cmu.edu/~mccallum NEW LINK FOUND: http://www.cs.cmu.edu/~tom NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/user/dunja/www/home.html NEW LINK FOUND: http://www.cs.cmu.edu/~knigam NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/user/jslttery/www/home_page.html NEW LINK FOUND: http://www.alakhawayn.ma/~A.Bensaid/ NEW LINK FOUND: http://www.cs.wisc.edu/~craven/ NEW LINK FOUND: http://www.isi.edu/~dipasquo/ NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs/user/dayne/www/dayne-home.html NEW LINK FOUND: http://www.ai.univie.ac.at/~juffi/ NEW LINK FOUND: http://www.eng.uwaterloo.ca/Student/hkatirai NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs/user/choon/www/ NEW LINK FOUND: http://www.ai.mit.edu/~jrennie/ NEW LINK FOUND: file:/afs/cs.cmu.edu/project/theo-9/webkb/documents/www/index.html nach ParserDelegator() PARSER RUN 59 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/ 0 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~WebKB/ILP-data.html 1 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-51/www/co-training/data/ 2 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html 3 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via 
addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs/project/theo-20/www/data/bootstrappingIE/7sectors.tar.gz 4 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~TextLearning 5 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/overview-aaai98.ps.gz 6 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/overview-aij99.ps.gz 7 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~rayid/mypapers/kdd-shield.ps 8 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~dunja/WshKDD2000.html 9 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cora.justresearch.com 10 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~mccallum/papers/cora-aaaiss99.ps.gz 11 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~biokb/ 12 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~craven/craven.ismb99.ps 13 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~rayid/mypapers/ecoc-icml.ps 14 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~knigam/papers/cotrain-CIKM00.ps 15 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.csee.umbc.edu/cikm/2000/ 16 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~knigam/papers/emcat-mlj99.ps 17 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~knigam/papers/maxent-ijcaiws99.ps 18 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~knigam/papers/keywordcat-aclws99.ps 19 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~mccallum/papers/hier-icml98.ps.gz 20 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~mccallum/papers/emcat-aaai98.ps.gz 21 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~mccallum/papers/emactive-icml98.ps.gz 22 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/colt98_final.ps 23 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/mitchell-pubs/iccs99.ps 24 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/aaai-ws-aslog.ps.gz 25 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~knigam/papers/multinomial-aaaiws98.ps 26 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~sean/papers/icml2000.ps 27 59 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ilp98.ps.gz 28 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ecml98.ps.gz 29 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~rayid/mypapers/CIKM-model.ps 30 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~rayid/mypapers/ACL.ps 31 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ai.mit.edu/~jrennie/papers/icml99.ps.gz 32 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/aaai99iedict-rj.ps.gz 33 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~knigam/papers/bootstrap-ijcaiws99.ps 34 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~mccallum/papers/ieshrink-aaaiws99s.ps 35 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~mccallum/papers/iestruct-aaaiws99s.ps 36 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/ms-ie.ps.gz 37 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/ling-ie.ps.gz 38 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/webie.ps.gz 39 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs/project/theo-11/www/wwkb/gi-ie.ps.gz 40 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/danthesis.ps.gz 41 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/choon-thesis.html 42 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~rayid/ 43 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~rosie/ 44 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~mccallum 45 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~tom 46 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/user/dunja/www/home.html 47 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~knigam 48 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/user/jslttery/www/home_page.html 49 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.alakhawayn.ma/~A.Bensaid/ 50 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.wisc.edu/~craven/ 51 59 before addLink(isd.getUrls()[i]) 
after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/~dipasquo/ 52 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs/user/dayne/www/dayne-home.html 53 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ai.univie.ac.at/~juffi/ 54 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.eng.uwaterloo.ca/Student/hkatirai 55 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs/user/choon/www/ 56 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ai.mit.edu/~jrennie/ 57 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 file:/afs/cs.cmu.edu/project/theo-9/webkb/documents/www/index.html 58 59 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-uilots.let.uu.nl ROBOT - SAVE CON OPENED: http://www-uilots.let.uu.nl/~uplift/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www-uilots.let.uu.nl/uplift/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.public.iastate.edu ROBOT - SAVE CON OPENED: http://www.public.iastate.edu/~CYBERSTACKS/Aristotle.htm CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.utm.edu/research/iep/a/aristotl.htm NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Agents/eichmann.ethical/eichmann.html NEW LINK FOUND: mailto:willhill@research.att.com NEW LINK FOUND: mailto:terveen@research.att.com NEW LINK FOUND: http://www.phoaks.com//index.html NEW LINK FOUND: mailto:sfchang@ctr.columbia.edu NEW LINK 
FOUND: mailto:jrsmith@ctr.columbia.edu NEW LINK FOUND: http://www.ctr.columbia.edu/webseek NEW LINK FOUND: http://www.ctr.columbia.edu/VisualSEEk NEW LINK FOUND: http://www.ctr.columbia.edu/~sfchang/dpc96-2/title.html NEW LINK FOUND: mailto:info@documagix.com NEW LINK FOUND: http://webreview.com/96/03/08/addict/index.html NEW LINK FOUND: http://lislin.gws.uky.edu/Sitemap/Sitemap.html NEW LINK FOUND: mailto:yoelle@haifa.vnet.ibm.com NEW LINK FOUND: http://www5conf.inria.fr/fich_html/papers/P37/Overview.html NEW LINK FOUND: mailto:rlborgen@devvax.jpl.nasa.gov NEW LINK FOUND: http://www.ub2.lu.se/desire/radar/archive/indexing/schwartz.96/Papers/Wagner@JPL.html NEW LINK FOUND: mailto:doemel@informatik.uni-frankfurt.de NEW LINK FOUND: http://www.tm.informatik.uni-frankfurt.de/~doemel/Papers/WWWFall94/www-fall94.html NEW LINK FOUND: mailto:weiss@cs.jhu.edu NEW LINK FOUND: http://www.parc.xerox.com/istl/projects/mlia/papers/weiss.ps NEW LINK FOUND: mailto:rma@ks.com NEW LINK FOUND: http://www.ub2.lu.se/desire/radar/archive/indexing/schwartz.96/Papers/Akscyn@KS.html NEW LINK FOUND: mailto:taeha@cosmos.kaist.ac.kr NEW LINK FOUND: http://www.ub2.lu.se/auto_new/UDC.html NEW LINK FOUND: http://www.ub2.lu.se/autoclass.html NEW LINK FOUND: http://www.ub2.lu.se/W4/summary.html NEW LINK FOUND: http://www.ub2.lu.se/nordic_w4.html NEW LINK FOUND: http://www.ub2.lu.se/W4/plan.html NEW LINK FOUND: http://www.ub2.lu.se/W4/phase1.html NEW LINK FOUND: mailto:rweiss@lcs.mit.edu NEW LINK FOUND: http://www.psrg.lcs.mit.edu:80/publications/Papers/hypert.pdf NEW LINK FOUND: http://www.psrg.lcs.mit.edu:80/~sheldon/dist-indexing-workshop-position.html NEW LINK FOUND: mailto:lieber@media.mit.edu NEW LINK FOUND: http://www.info.unicaen.fr/~serge/3wia/workshop/papers/paper29.html NEW LINK FOUND: http://lieber.www.media.mit.edu/people/lieber/Lieberary/Letizia/Letizia.html NEW LINK FOUND: mailto:pattie@media.mit.edu NEW LINK FOUND: http://www.agents-inc.com/ NEW LINK FOUND: mailto:punch@cps.msu.edu 
NEW LINK FOUND: mailto:mitiak-i@is.aist-nara.ac.jp NEW LINK FOUND: http://ai-www.aist-nara.ac.jp/doc/people/mitiak-i/aaai96/aaai96_5_12.html NEW LINK FOUND: mailto:moreinfo@netscape.com NEW LINK FOUND: http://www.the-coast.com/frames/cata-datasheet.html NEW LINK FOUND: mailto:francis@slab.ntt.jp NEW LINK FOUND: http://www.ingrid.org/ NEW LINK FOUND: http://www.ingrid.org/francis/www4/Overview.html NEW LINK FOUND: http://www.w3.org/pub/Conferences/WWW4/Papers/300/ NEW LINK FOUND: mailto:shafer@oclc.org NEW LINK FOUND: http://purl.oclc.org/scorpion/ NEW LINK FOUND: http://www.oracle.com/support/ NEW LINK FOUND: http://www.ub2.lu.se/desire/radar/archive/indexing/schwartz.96/Papers/Roberts@Oracle.html NEW LINK FOUND: mailto:kirsten@csd.sgi.com NEW LINK FOUND: http://www5conf.inria.fr/fich_html/papers/P39/Overview.html NEW LINK FOUND: mailto:gravano@cs.stanford.edu NEW LINK FOUND: http://gloss.stanford.edu/ NEW LINK FOUND: http://www-db.stanford.edu/pub/gravano/1993/stan.cs.tn.93.002.ps NEW LINK FOUND: http://www-db.stanford.edu/pub/gravano/1996/tois96.ps NEW LINK FOUND: mailto:genesereth@cs.stanford.edu NEW LINK FOUND: http://infomaster.stanford.edu/tutorial/ NEW LINK FOUND: http://infomaster.stanford.edu:4000/ NEW LINK FOUND: http://logic.stanford.edu/people/duschka/papers/complete-query-plans.ps NEW LINK FOUND: http://www-db.stanford.edu/pub/keller/1995/iiaw95-infomaster.ps NEW LINK FOUND: mailto:marko@rsv.ricoh.com NEW LINK FOUND: http://www-diglib.stanford.edu/diglib/WP/PUBLIC/DOC93.ps NEW LINK FOUND: http://elib.stanford.edu/Dienst/UI/2.0/Describe/stanford.cs%2fCS-TR-98-1605 NEW LINK FOUND: http://elib.stanford.edu/Dienst/UI/2.0/Describe/stanford.cs%2fCS-TN-97-52 NEW LINK FOUND: mailto:kamiya@cs.stanford.edu NEW LINK FOUND: mailto:fln@di.uminho.pt NEW LINK FOUND: http://www.w3.org/pub/Conferences/WWW4/Papers/portugal/ NEW LINK FOUND: mailto:pedwards@csd.abdn.ac.uk NEW LINK FOUND: http://www.parc.xerox.com/istl/projects/mlia/papers/edwards.ps NEW LINK FOUND: 
mailto:udi@cs.arizona.edu NEW LINK FOUND: http://www.cs.arizona.edu/webglimpse/ NEW LINK FOUND: http://www.w3.org/pub/WWW/Search/9605-Indexing-Workshop/Papers/Manber@Arizona.html NEW LINK FOUND: mailto:hchen@bpa.arizona.edu NEW LINK FOUND: http://ai.bpa.arizona.edu/Lists/list_demos.html NEW LINK FOUND: mailto:gaines@cpsc.ucalgary.ca NEW LINK FOUND: http://ksi.cpsc.ucalgary.ca/articles/ConceptMaps/ NEW LINK FOUND: http://www.w3j.com/1/gaines.134/paper/134.html NEW LINK FOUND: mailto:pazzani@ics.uci.edu NEW LINK FOUND: http://www.ics.uci.edu/~pazzani/Syskill.html NEW LINK FOUND: http://www.ics.uci.edu/~pazzani/RTF/AAAI.html NEW LINK FOUND: mailto:ray@sherlock.sims.berkeley.edu NEW LINK FOUND: http://sherlock.berkeley.edu/asis96/asis96.html NEW LINK FOUND: mailto:rad@cs.ucsb.edu NEW LINK FOUND: http://www.cs.ucsb.edu/TRs/TRCS96-05.html NEW LINK FOUND: mailto:hammond@cs.uchicago.edu NEW LINK FOUND: http://infolab.cs.uchicago.edu/echo/echo.html NEW LINK FOUND: http://mingo.info-science.uiowa.edu/eichmann/www-s96/interaction_protocols.html NEW LINK FOUND: http://mingo.info-science.uiowa.edu/eichmann/disw.html NEW LINK FOUND: http://mingo.info-science.uiowa.edu/eichmann/www-s96/Overview.html NEW LINK FOUND: mailto:sgauch@ittc.ukans.edu NEW LINK FOUND: http://www.ittc.ukans.edu/~xzhu/wwwAgents/agents/lba/eecs/lba.html NEW LINK FOUND: http://www.ittc.ukans.edu/~xzhu/wwwAgents/agents/rba/rba.html NEW LINK FOUND: http://www.ittc.ukans.edu/~xzhu/wwwAgents/project.html NEW LINK FOUND: http://www.ittc.ukans.edu/~sgauch/papers/ITTC-FY97-TR-11100-1.html NEW LINK FOUND: http://www.ittc.ukans.edu/~sgauch/papers/JASIS97.html NEW LINK FOUND: mailto:seanl@cs.umd.edu NEW LINK FOUND: http://www.cs.umd.edu/projects/plus/SHOE/spec.html NEW LINK FOUND: http://www.cs.umd.edu/projects/plus/SHOE/ontologies.html NEW LINK FOUND: http://www.cs.umd.edu/projects/plus/SHOE/spec.html NEW LINK FOUND: http://www.cs.umd.edu/projects/plus/SHOE/aaai-paper.html NEW LINK FOUND: mailto:moises@accugraph.com 
NEW LINK FOUND: mailto:rekent@eecs.wsu.edu NEW LINK FOUND: http://wave.eecs.wsu.edu NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Autools/kent/kent.html NEW LINK FOUND: http://www.igd.fhg.de/www/www95/papers/94/www3.html NEW LINK FOUND: http://www.hermans.org/agents/ NEW LINK FOUND: http://info.webcrawler.com/mak/projects/robots/threat-or-treat.html NEW LINK FOUND: http://pattie.www.media.mit.edu/people/pattie/CACM-94/CACM-94.p1.html NEW LINK FOUND: http://www.dlib.org/dlib/february97/02mckiernan.html NEW LINK FOUND: mailto:gerrymck@iastate.edu NEW LINK FOUND: http://wwwscout.cs.wisc.edu/scout/report/archive/scout-960823.html#16 nach ParserDelegator() PARSER RUN 113 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.utm.edu/research/iep/a/aristotl.htm 0 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Agents/eichmann.ethical/eichmann.html 1 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:willhill@research.att.com 2 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:terveen@research.att.com 3 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.phoaks.com//index.html 4 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:sfchang@ctr.columbia.edu 5 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:jrsmith@ctr.columbia.edu 6 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ctr.columbia.edu/webseek 7 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ctr.columbia.edu/VisualSEEk 8 113 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ctr.columbia.edu/~sfchang/dpc96-2/title.html 9 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:info@documagix.com 10 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://webreview.com/96/03/08/addict/index.html 11 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lislin.gws.uky.edu/Sitemap/Sitemap.html 12 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:yoelle@haifa.vnet.ibm.com 13 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www5conf.inria.fr/fich_html/papers/P37/Overview.html 14 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:rlborgen@devvax.jpl.nasa.gov 15 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/desire/radar/archive/indexing/schwartz.96/Papers/Wagner@JPL.html 16 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:doemel@informatik.uni-frankfurt.de 17 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tm.informatik.uni-frankfurt.de/~doemel/Papers/WWWFall94/www-fall94.html 18 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:weiss@cs.jhu.edu 19 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.parc.xerox.com/istl/projects/mlia/papers/weiss.ps 20 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 mailto:rma@ks.com 21 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/desire/radar/archive/indexing/schwartz.96/Papers/Akscyn@KS.html 22 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:taeha@cosmos.kaist.ac.kr 23 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/auto_new/UDC.html 24 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/autoclass.html 25 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/W4/summary.html 26 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/nordic_w4.html 27 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/W4/plan.html 28 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/W4/phase1.html 29 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:rweiss@lcs.mit.edu 30 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.psrg.lcs.mit.edu:80/publications/Papers/hypert.pdf 31 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.psrg.lcs.mit.edu:80/~sheldon/dist-indexing-workshop-position.html 32 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:lieber@media.mit.edu 33 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via 
addLink to crawl_level 2 http://www.info.unicaen.fr/~serge/3wia/workshop/papers/paper29.html 34 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lieber.www.media.mit.edu/people/lieber/Lieberary/Letizia/Letizia.html 35 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:pattie@media.mit.edu 36 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.agents-inc.com/ 37 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:punch@cps.msu.edu 38 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:mitiak-i@is.aist-nara.ac.jp 39 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai-www.aist-nara.ac.jp/doc/people/mitiak-i/aaai96/aaai96_5_12.html 40 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:moreinfo@netscape.com 41 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.the-coast.com/frames/cata-datasheet.html 42 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:francis@slab.ntt.jp 43 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ingrid.org/ 44 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ingrid.org/francis/www4/Overview.html 45 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/Conferences/WWW4/Papers/300/ 46 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 mailto:shafer@oclc.org 47 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://purl.oclc.org/scorpion/ 48 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.oracle.com/support/ 49 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/desire/radar/archive/indexing/schwartz.96/Papers/Roberts@Oracle.html 50 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:kirsten@csd.sgi.com 51 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www5conf.inria.fr/fich_html/papers/P39/Overview.html 52 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:gravano@cs.stanford.edu 53 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://gloss.stanford.edu/ 54 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-db.stanford.edu/pub/gravano/1993/stan.cs.tn.93.002.ps 55 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-db.stanford.edu/pub/gravano/1996/tois96.ps 56 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:genesereth@cs.stanford.edu 57 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://infomaster.stanford.edu/tutorial/ 58 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://infomaster.stanford.edu:4000/ 59 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added 
via addLink to crawl_level 2 http://logic.stanford.edu/people/duschka/papers/complete-query-plans.ps 60 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-db.stanford.edu/pub/keller/1995/iiaw95-infomaster.ps 61 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:marko@rsv.ricoh.com 62 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-diglib.stanford.edu/diglib/WP/PUBLIC/DOC93.ps 63 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://elib.stanford.edu/Dienst/UI/2.0/Describe/stanford.cs%2fCS-TR-98-1605 64 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://elib.stanford.edu/Dienst/UI/2.0/Describe/stanford.cs%2fCS-TN-97-52 65 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:kamiya@cs.stanford.edu 66 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:fln@di.uminho.pt 67 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/Conferences/WWW4/Papers/portugal/ 68 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:pedwards@csd.abdn.ac.uk 69 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.parc.xerox.com/istl/projects/mlia/papers/edwards.ps 70 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:udi@cs.arizona.edu 71 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.arizona.edu/webglimpse/ 
72 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/WWW/Search/9605-Indexing-Workshop/Papers/Manber@Arizona.html 73 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:hchen@bpa.arizona.edu 74 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.bpa.arizona.edu/Lists/list_demos.html 75 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:gaines@cpsc.ucalgary.ca 76 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ksi.cpsc.ucalgary.ca/articles/ConceptMaps/ 77 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3j.com/1/gaines.134/paper/134.html 78 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:pazzani@ics.uci.edu 79 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ics.uci.edu/~pazzani/Syskill.html 80 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ics.uci.edu/~pazzani/RTF/AAAI.html 81 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:ray@sherlock.sims.berkeley.edu 82 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sherlock.berkeley.edu/asis96/asis96.html 83 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:rad@cs.ucsb.edu 84 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.ucsb.edu/TRs/TRCS96-05.html 
85 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:hammond@cs.uchicago.edu 86 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://infolab.cs.uchicago.edu/echo/echo.html 87 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://mingo.info-science.uiowa.edu/eichmann/www-s96/interaction_protocols.html 88 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://mingo.info-science.uiowa.edu/eichmann/disw.html 89 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://mingo.info-science.uiowa.edu/eichmann/www-s96/Overview.html 90 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:sgauch@ittc.ukans.edu 91 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ittc.ukans.edu/~xzhu/wwwAgents/agents/lba/eecs/lba.html 92 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ittc.ukans.edu/~xzhu/wwwAgents/agents/rba/rba.html 93 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ittc.ukans.edu/~xzhu/wwwAgents/project.html 94 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ittc.ukans.edu/~sgauch/papers/ITTC-FY97-TR-11100-1.html 95 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ittc.ukans.edu/~sgauch/papers/JASIS97.html 96 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:seanl@cs.umd.edu 97 113 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/projects/plus/SHOE/spec.html 98 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/projects/plus/SHOE/ontologies.html 99 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/projects/plus/SHOE/spec.html 100 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/projects/plus/SHOE/aaai-paper.html 101 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:moises@accugraph.com 102 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:rekent@eecs.wsu.edu 103 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://wave.eecs.wsu.edu 104 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Autools/kent/kent.html 105 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.igd.fhg.de/www/www95/papers/94/www3.html 106 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hermans.org/agents/ 107 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://info.webcrawler.com/mak/projects/robots/threat-or-treat.html 108 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pattie.www.media.mit.edu/people/pattie/CACM-94/CACM-94.p1.html 109 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via 
addLink to crawl_level 2 http://www.dlib.org/dlib/february97/02mckiernan.html 110 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:gerrymck@iastate.edu 111 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://wwwscout.cs.wisc.edu/scout/report/archive/scout-960823.html#16 112 113 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN robotics.Stanford.EDU ROBOT - SAVE CON OPENED: http://robotics.Stanford.EDU/users/sahami/SONIA/SONIAproject.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www-diglib.stanford.edu/diglib/ NEW LINK FOUND: http://oi.stanford.edu/ NEW LINK FOUND: http://flamingo.stanford.edu/users/sahami/bio.html NEW LINK FOUND: http://www-leland.stanford.edu/~yusufali/ NEW LINK FOUND: http://www.parc.xerox.com/istl/members/baldonad/ NEW LINK FOUND: http://robotics.stanford.edu/~koller NEW LINK FOUND: http://www.sims.berkeley.edu/~hearst NEW LINK FOUND: http://robotics.stanford.edu/users/sahami/papers-dir/dl98-sonia.ps NEW LINK FOUND: http://robotics.stanford.edu/users/sahami/papers-dir/ml97-hier.ps NEW LINK FOUND: http://robotics.stanford.edu/users/sahami/papers-dir/kdd96-learn-bn.ps NEW LINK FOUND: http://robotics.stanford.edu/users/sahami/papers-dir/gm-clustering.ps NEW LINK FOUND: http://robotics.stanford.edu/users/sahami/papers-dir/sonia-abst.ps NEW LINK FOUND: http://robotics.stanford.edu/users/sahami/papers-dir/ml96-mcmm.ps NEW LINK FOUND: http://robotics.stanford.edu/users/sahami/papers-dir/ml96-featsel.ps NEW LINK FOUND: mailto:sahami@cs.Stanford.EDU NEW LINK FOUND: http://www.stanford.edu/stanford.html NEW LINK FOUND: http://www-cs.stanford.edu/csd/csd.html nach ParserDelegator() PARSER RUN 17 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www-diglib.stanford.edu/diglib/ 0 17 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://oi.stanford.edu/ 1 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://flamingo.stanford.edu/users/sahami/bio.html 2 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-leland.stanford.edu/~yusufali/ 3 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.parc.xerox.com/istl/members/baldonad/ 4 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/~koller 5 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sims.berkeley.edu/~hearst 6 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/users/sahami/papers-dir/dl98-sonia.ps 7 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/users/sahami/papers-dir/ml97-hier.ps 8 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/users/sahami/papers-dir/kdd96-learn-bn.ps 9 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/users/sahami/papers-dir/gm-clustering.ps 10 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/users/sahami/papers-dir/sonia-abst.ps 11 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/users/sahami/papers-dir/ml96-mcmm.ps 12 17 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/users/sahami/papers-dir/ml96-featsel.ps 13 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:sahami@cs.Stanford.EDU 14 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.stanford.edu/stanford.html 15 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cs.stanford.edu/csd/csd.html 16 17 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.bu.edu ROBOT - SAVE CON OPENED: http://www.cs.bu.edu/groups/ivc/ImageRover/Home.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.cs.bu.edu/groups/ivc/ImageRover/approach.html NEW LINK FOUND: http://www.cs.bu.edu/groups/ivc/ImageRover/demo.html NEW LINK FOUND: http://www.cs.bu.edu/groups/ivc/ImageRover/publications.html NEW LINK FOUND: http://www.cs.bu.edu/groups/ivc/contact.html NEW LINK FOUND: http://www.cs.bu.edu/groups/ivc/ nach ParserDelegator() PARSER RUN 5 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.cs.bu.edu/groups/ivc/ImageRover/approach.html 0 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.bu.edu/groups/ivc/ImageRover/demo.html 1 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.bu.edu/groups/ivc/ImageRover/publications.html 2 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.bu.edu/groups/ivc/contact.html 3 5 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.bu.edu/groups/ivc/ 4 5 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ee.umd.edu ROBOT - SAVE CON OPENED: http://www.ee.umd.edu/medlab/filter/filter_project.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.ee.umd.edu/medlab/Welcome.html NEW LINK FOUND: http://www.umiacs.umd.edu/labs/CLIP/index.html NEW LINK FOUND: http://www.glue.umd.edu/~dlrg NEW LINK FOUND: http://www.glue.umd.edu/~oard/research.html NEW LINK FOUND: http://www.clis.umd.edu/dlrg/filter/ NEW LINK FOUND: http://www.clis.umd.edu/dlrg/clir/ NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/trec5.ps NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/sigir96.ps NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/thesis.ps.gz NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/filter/filter.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/mlir.ps NEW LINK FOUND: http://www.glue.umd.edu/~oard/research.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/forum.ps NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/aumdr.ps NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/smc95.ps NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/smc.ps NEW LINK FOUND: http://www.cs.umd.edu/Server/TR/UMCP-CSD:CS-TR-3514/Body?format=postscript NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/signidr.ps NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/balt.ps NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/umir.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/papers/neural.ps NEW LINK FOUND: http://www.glue.umd.edu/~oard/Welcome.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/declaris.html NEW LINK FOUND: http://www.cs.umd.edu/users/bonnie NEW LINK FOUND: http://www.cs.umd.edu/users/christos NEW LINK FOUND: http://www.glue.umd.edu/~march NEW LINK FOUND: 
http://www.glue.umd.edu/~dlrg/ NEW LINK FOUND: http://documents.cfar.umd.edu/DocumentsGroup/ NEW LINK FOUND: http://www.glue.umd.edu/~oard/ nach ParserDelegator() PARSER RUN 29 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.ee.umd.edu/medlab/Welcome.html 0 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.umiacs.umd.edu/labs/CLIP/index.html 1 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~dlrg 2 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard/research.html 3 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clis.umd.edu/dlrg/filter/ 4 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clis.umd.edu/dlrg/clir/ 5 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/trec5.ps 6 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/sigir96.ps 7 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/thesis.ps.gz 8 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/filter/filter.html 9 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/mlir.ps 10 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard/research.html 11 29 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/forum.ps 12 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/aumdr.ps 13 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/smc95.ps 14 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/smc.ps 15 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/Server/TR/UMCP-CSD:CS-TR-3514/Body?format=postscript 16 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/signidr.ps 17 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/balt.ps 18 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/umir.html 19 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/papers/neural.ps 20 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard/Welcome.html 21 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/declaris.html 22 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/users/bonnie 23 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 http://www.cs.umd.edu/users/christos 24 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~march 25 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~dlrg/ 26 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://documents.cfar.umd.edu/DocumentsGroup/ 27 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard/ 28 29 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ee.umd.edu ROBOT - SAVE CON OPENED: http://www.ee.umd.edu/medlab/filter/filter.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.glue.umd.edu/~oard/research.html NEW LINK FOUND: mailto:jinmook@glue.umd.edu NEW LINK FOUND: http://www.glue.umd.edu/~oard NEW LINK FOUND: mailto:oard@glue.umd.edu NEW LINK FOUND: http://www.wam.umd.edu/~jinmook NEW LINK FOUND: mailto:jinmook@glue.umd.edu NEW LINK FOUND: http://usa.nedstatbasic.net/cgi-bin/viewstat?name=jinmook3 nach ParserDelegator() PARSER RUN 7 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.glue.umd.edu/~oard/research.html 0 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:jinmook@glue.umd.edu 1 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard 2 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:oard@glue.umd.edu 3 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.wam.umd.edu/~jinmook 4 7 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:jinmook@glue.umd.edu 5 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://usa.nedstatbasic.net/cgi-bin/viewstat?name=jinmook3 6 7 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ee.umd.edu ROBOT - SAVE CON OPENED: http://www.ee.umd.edu/medlab/filter/software.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: ftp://db.stanford.edu/pub/sift/sift-1.1-netnews.tar.Z NEW LINK FOUND: http://www-db.stanford.edu/people/tyan.html NEW LINK FOUND: ftp://db.stanford.edu/pub/yan/ NEW LINK FOUND: http://193.62.196.6/sift/index.html NEW LINK FOUND: http://www.cs.umn.edu/Research/GroupLens/ NEW LINK FOUND: http://www-sloan.mit.edu/ccs/paul.html NEW LINK FOUND: http://ccs.mit.edu/ NEW LINK FOUND: http://www.cs.umn.edu/~bmiller/ NEW LINK FOUND: http://www.cs.umn.edu/~herlocke/ NEW LINK FOUND: http://www.cs.umn.edu/~herlocke/nr/nr.html NEW LINK FOUND: http://www.firefly.com NEW LINK FOUND: http://www.lucifer.com/~sasha/articles/ACF.html NEW LINK FOUND: http://www.lucifer.com/~sasha/home.html NEW LINK FOUND: http://www.MachinaSapiens.qc.ca/english/products/infoscan/infoscanang.html NEW LINK FOUND: http://www.MachinaSapiens.qc.ca/index.html NEW LINK FOUND: http://www.panix.com/~erik/InfoTicker.cgi NEW LINK FOUND: http://www.panix.com/~erik NEW LINK FOUND: http://www.newssieve.com/ NEW LINK FOUND: http://www.haneke.de NEW LINK FOUND: http://iagent.iti.gov.sg/ NEW LINK FOUND: http://www.iti.gov.sg NEW LINK FOUND: http://www.iti.gov.sg/iti_people/iti_staff/kflai/ NEW LINK FOUND: http://www.iti.gov.sg/iti_people/iti_staff/kflai/trec95.html NEW LINK FOUND: http://home.wisewire.com/ NEW LINK FOUND: http://www2.echo.lu/libraries/en/projects/borges.html NEW LINK FOUND: 
http://www.compapp.dcu.ie/~asmeaton/asmeaton.html NEW LINK FOUND: ftp://ftp.cs.pdx.edu/pub/faculty/jrb/rama/ NEW LINK FOUND: http://www.cse.ogi.edu/~jrb/ NEW LINK FOUND: http://www.cs.pdx.edu/ NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/browse.tar.Z NEW LINK FOUND: http://www.cse.rmit.edu.au/~rdsajj NEW LINK FOUND: http://www.cse.rmit.edu.au/~rdsajj/brow.ps NEW LINK FOUND: http://www.clarinet.com/newsclip.html NEW LINK FOUND: ftp://ftp.digex.net/pub/networking/news/trn/strn/README.strn NEW LINK FOUND: http://www.island-resort.com/sm.htm NEW LINK FOUND: http://www.island-resort.com/lgl.htm NEW LINK FOUND: ftp://ftp.cs.cornell.edu/pub/smart NEW LINK FOUND: http://www.cs.cornell.edu/Info/Department/Annual95/Faculty/Salton.html NEW LINK FOUND: http://www.cs.cornell.edu/ NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/Onderw/Smart/hands.html NEW LINK FOUND: http://pi0959.kub.nl:2080/Paai/engels.html NEW LINK FOUND: http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR85-686?abstract= NEW LINK FOUND: ftp://ftp.media.mit.edu/pub/agents/interface-agents/MAXIMS NEW LINK FOUND: http://memetral.www.media.mit.edu/people/memetral/ NEW LINK FOUND: http://agents.www.media.mit.edu/groups/agents/ NEW LINK FOUND: ftp://parcftp.xerox.com/pub/collab-filter/ NEW LINK FOUND: ftp://uniwa.uwa.edu.au/pub/nn NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/user/dmaltz/www/home.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/user/dmaltz/Doc/mit-thesis.ps.gz NEW LINK FOUND: ftp://ftp.informatik.rwth-aachen.de/pub/packages/procmail NEW LINK FOUND: http://www.ii.com/internet/faqs/launchers/mail/filtering-faq/ NEW LINK FOUND: http://www.myxa.com/elm.html NEW LINK FOUND: http://www.iki.fi/~era NEW LINK FOUND: http://www.iki.fi/~era/procmail/ NEW LINK FOUND: http://www.iki.fi/~era/procmail/mini-faq.html NEW LINK FOUND: http://www.flounder.net/~mrsam/maildrop/ NEW LINK FOUND: http://www.geocities.com/SiliconValley/Peaks/5799/maildrop.README.html NEW LINK FOUND: 
http://www.nmt.edu/~mfisk/style.cgi?unixtools/mailfilt.html NEW LINK FOUND: http://www.nmt.edu/~mfisk/ NEW LINK FOUND: ftp://ftp.foretune.co.jp/pub/network/mail/mailagent/ NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/project-home.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/user/stork/mosaic/storkhome.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-5/www/pleiades.html NEW LINK FOUND: http://math-www.uni-paderborn.de/~axel/NoShit/ NEW LINK FOUND: http://math-www.uni-paderborn.de/~axel/ NEW LINK FOUND: http://ils.unc.edu/gants/report.html NEW LINK FOUND: http://ruby.ils.unc.edu/gants/ NEW LINK FOUND: http://ils.unc.edu/ NEW LINK FOUND: http://www.fortunecity.com/skyscraper/telnet/639/gusdocs.html NEW LINK FOUND: http://www.fortunecity.com/skyscraper/telnet/639/mauro/mauro.htm NEW LINK FOUND: http://www.siemens.de/servers/wwash NEW LINK FOUND: http://www.scienceindex.com/ NEW LINK FOUND: http://www.neci.nj.nec.com/homepages/lawrence/ NEW LINK FOUND: http://www.neci.nj.nec.com/ NEW LINK FOUND: http://www.neci.nec.com/~lawrence/papers.html NEW LINK FOUND: http://cmc.dsv.su.se/select/ NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/users/dmn NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/users/dmn/select NEW LINK FOUND: http://chaffaway.inkey.com/ NEW LINK FOUND: mailto: ian@inkey.com NEW LINK FOUND: http://www.inkey.com/ NEW LINK FOUND: http://www.cm.org NEW LINK FOUND: http://sourceforge.net/projects/scoop/ NEW LINK FOUND: http://www.glue.umd.edu/~oard NEW LINK FOUND: mailto:oard@glue.umd.edu NEW LINK FOUND: http://www.wam.umd.edu/~jinmook NEW LINK FOUND: mailto:jinmook@glue.umd.edu nach ParserDelegator() PARSER RUN 87 crawl_level: 1 MAX_CRAWL_DEPTH: 2 ftp://db.stanford.edu/pub/sift/sift-1.1-netnews.tar.Z 0 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-db.stanford.edu/people/tyan.html 1 87 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://db.stanford.edu/pub/yan/ 2 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://193.62.196.6/sift/index.html 3 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umn.edu/Research/GroupLens/ 4 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-sloan.mit.edu/ccs/paul.html 5 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ccs.mit.edu/ 6 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umn.edu/~bmiller/ 7 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umn.edu/~herlocke/ 8 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umn.edu/~herlocke/nr/nr.html 9 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.firefly.com 10 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lucifer.com/~sasha/articles/ACF.html 11 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lucifer.com/~sasha/home.html 12 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.MachinaSapiens.qc.ca/english/products/infoscan/infoscanang.html 13 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.MachinaSapiens.qc.ca/index.html 14 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via 
addLink to crawl_level 2 http://www.panix.com/~erik/InfoTicker.cgi 15 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.panix.com/~erik 16 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.newssieve.com/ 17 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.haneke.de 18 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://iagent.iti.gov.sg/ 19 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.iti.gov.sg 20 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.iti.gov.sg/iti_people/iti_staff/kflai/ 21 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.iti.gov.sg/iti_people/iti_staff/kflai/trec95.html 22 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://home.wisewire.com/ 23 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www2.echo.lu/libraries/en/projects/borges.html 24 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.compapp.dcu.ie/~asmeaton/asmeaton.html 25 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.cs.pdx.edu/pub/faculty/jrb/rama/ 26 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cse.ogi.edu/~jrb/ 27 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.pdx.edu/ 28 87 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/browse.tar.Z 29 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cse.rmit.edu.au/~rdsajj 30 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cse.rmit.edu.au/~rdsajj/brow.ps 31 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clarinet.com/newsclip.html 32 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.digex.net/pub/networking/news/trn/strn/README.strn 33 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.island-resort.com/sm.htm 34 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.island-resort.com/lgl.htm 35 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.cs.cornell.edu/pub/smart 36 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cornell.edu/Info/Department/Annual95/Faculty/Salton.html 37 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cornell.edu/ 38 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pi0959.kub.nl:2080/Paai/Onderw/Smart/hands.html 39 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pi0959.kub.nl:2080/Paai/engels.html 40 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs-tr.cs.cornell.edu:80/TR/CORNELLCS:TR85-686?abstract= 41 87 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.media.mit.edu/pub/agents/interface-agents/MAXIMS 42 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://memetral.www.media.mit.edu/people/memetral/ 43 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://agents.www.media.mit.edu/groups/agents/ 44 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://parcftp.xerox.com/pub/collab-filter/ 45 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://uniwa.uwa.edu.au/pub/nn 46 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/user/dmaltz/www/home.html 47 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/user/dmaltz/Doc/mit-thesis.ps.gz 48 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.informatik.rwth-aachen.de/pub/packages/procmail 49 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ii.com/internet/faqs/launchers/mail/filtering-faq/ 50 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.myxa.com/elm.html 51 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.iki.fi/~era 52 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.iki.fi/~era/procmail/ 53 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.iki.fi/~era/procmail/mini-faq.html 54 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.flounder.net/~mrsam/maildrop/ 55 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.geocities.com/SiliconValley/Peaks/5799/maildrop.README.html 56 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nmt.edu/~mfisk/style.cgi?unixtools/mailfilt.html 57 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nmt.edu/~mfisk/ 58 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.foretune.co.jp/pub/network/mail/mailagent/ 59 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/project-home.html 60 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/user/stork/mosaic/storkhome.html 61 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-5/www/pleiades.html 62 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://math-www.uni-paderborn.de/~axel/NoShit/ 63 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://math-www.uni-paderborn.de/~axel/ 64 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ils.unc.edu/gants/report.html 65 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ruby.ils.unc.edu/gants/ 66 
87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ils.unc.edu/ 67 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.fortunecity.com/skyscraper/telnet/639/gusdocs.html 68 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.fortunecity.com/skyscraper/telnet/639/mauro/mauro.htm 69 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.siemens.de/servers/wwash 70 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scienceindex.com/ 71 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.neci.nj.nec.com/homepages/lawrence/ 72 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.neci.nj.nec.com/ 73 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.neci.nec.com/~lawrence/papers.html 74 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cmc.dsv.su.se/select/ 75 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/users/dmn 76 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/users/dmn/select 77 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://chaffaway.inkey.com/ 78 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto: ian@inkey.com 79 87 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.inkey.com/ 80 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cm.org 81 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sourceforge.net/projects/scoop/ 82 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/~oard 83 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:oard@glue.umd.edu 84 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.wam.umd.edu/~jinmook 85 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:jinmook@glue.umd.edu 86 87 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cogsci.kun.nl ROBOT - SAVE CON OPENED: http://www.cogsci.kun.nl/~profile/others.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.dsv.su.se java.lang.NullPointerException - http://www.cogsci.kun.nl/~profile/others.html ROBOT - SAVE CON OPENED: http://www.dsv.su.se/~fk/if_Doc/IntFilter.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: mailto:fk@dsv.su.se NEW LINK FOUND: http://www.dsv.su.se/~evafaahr/Filter.html NEW LINK FOUND: http://www.dsv.su.se/~jpalme/w4g/rating-choices.html NEW LINK FOUND: http://www.dsv.su.se/~fk NEW LINK FOUND: http://www.dsv.su.se/~jpalme NEW LINK FOUND: http://www.glue.umd.edu/enee/medlab/filter/filter.html nach ParserDelegator() PARSER RUN 6 crawl_level: 1 MAX_CRAWL_DEPTH: 2 mailto:fk@dsv.su.se 0 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.dsv.su.se/~evafaahr/Filter.html 1 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dsv.su.se/~jpalme/w4g/rating-choices.html 2 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dsv.su.se/~fk 3 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dsv.su.se/~jpalme 4 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/enee/medlab/filter/filter.html 5 6 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ils.unc.edu ROBOT - SAVE CON OPENED: http://ils.unc.edu/gants/filterbib.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://ils.unc.edu/gants NEW LINK FOUND: mailto:gants@ils.unc.edu NEW LINK FOUND: http://ils.unc.edu/gants NEW LINK FOUND: mailto:gants@ils.unc.edu nach ParserDelegator() PARSER RUN 4 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://ils.unc.edu/gants 0 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:gants@ils.unc.edu 1 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ils.unc.edu/gants 2 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:gants@ils.unc.edu 3 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.kun.nl ROBOT - SAVE CON OPENED: http://www.cs.kun.nl/is/research/filter/references.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.cs.ru.nl/is/research/filter/references.html 0 
is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ftp.cs.umd.edu unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN lsi.argreenhouse.com ROBOT - SAVE CON OPENED: http://lsi.argreenhouse.com/~remde/lsi/ CONTENT GOT: text/html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN lsi.argreenhouse.com FC1: java.io.FileNotFoundException: http://lsi.argreenhouse.com/~remde/lsi/ ROBOT - SAVE CON OPENED: http://lsi.argreenhouse.com/~remde/lsi/LSIpapers.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://lsi.argreenhouse.com/~remde/lsi/LSIpapers.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.utk.edu ROBOT - SAVE CON OPENED: http://www.cs.utk.edu/~lsi/ CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN lsa.colorado.edu ROBOT - SAVE CON OPENED: http://lsa.colorado.edu/ CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN psych.colorado.edu ROBOT - SAVE CON OPENED: http://psych.colorado.edu/~rehder/lsa.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://psych.colorado.edu/~rehder/lsa.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN maude.nmsu.edu ROBOT - SAVE CON OPENED: http://maude.nmsu.edu/essay/index.html java.lang.NullPointerException - http://maude.nmsu.edu/essay/index.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN kf.oise.utoronto.ca ROBOT - SAVE CON OPENED: http://kf.oise.utoronto.ca/lsa/ CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cse.ogi.edu javax.swing.text.ChangedCharSetException java.lang.NullPointerException - 
http://kf.oise.utoronto.ca/lsa/ ROBOT - SAVE CON OPENED: http://www.cse.ogi.edu/~mhereim/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.cse.ogi.edu/~mhereim/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.site.uottawa.ca ROBOT - SAVE CON OPENED: http://www.site.uottawa.ca/tanka/ts.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.csi.uottawa.ca/~clank/IIA.html NEW LINK FOUND: http://www.site.uottawa.ca/~kbarker NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://by.genie.uottawa.ca/profs/barriere/barriere.html NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://ai.iit.nrc.ca/~martin NEW LINK FOUND: http://ai.iit.nrc.ca/ NEW LINK FOUND: http://www.csi.uottawa.ca/~terry NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://www.csi.uottawa.ca/~debruijn/berry_home.html NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://www.csi.uottawa.ca/~delannoy NEW LINK FOUND: http://www.csi.uottawa.ca/~holte NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://www.csi.uottawa.ca:80/~u668426/ NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://www.csi.uottawa.ca:80/~clank/ NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://ai.iit.nrc.ca/staff/joel.html NEW LINK FOUND: http://ai.iit.nrc.ca/ NEW LINK FOUND: http://www.csi.uottawa.ca/~stan NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://www.csi.uottawa.ca/~szpak NEW LINK FOUND: http://www.site.uottawa.ca/tanka/kaml.html NEW LINK FOUND: http://ai.iit.nrc.ca/staff/peter.html NEW LINK FOUND: http://ai.iit.nrc.ca/ NEW LINK FOUND: http://www.csi.uottawa.ca/tanka/uploadable/tr98-04.ps NEW LINK FOUND: http://www.csi.uottawa.ca/~delannoy/relevant_cfps.htm NEW LINK FOUND: http://www.csi.uottawa.ca/tanka/uploadable/TS_exper.ps NEW 
LINK FOUND: http://www.csi.uottawa.ca/~szpak/proposals/text-summ-1996_ToC.html NEW LINK FOUND: http://www.csi.uottawa.ca/~debruijn/irbib.html NEW LINK FOUND: http://www.dcs.shef.ac.uk/~gael/alphalist.html NEW LINK FOUND: http://bib.cs.uni-dortmund.de/journals/journals-index.html NEW LINK FOUND: http://www.csi.uottawa.ca/~szpak/proposals/text-summ-1996.html#RTFToC21 NEW LINK FOUND: http://crrm.univ-mrs.fr/gateway/info-sci/Infosci1.html NEW LINK FOUND: http://researchsmp2.cc.vt.edu/DB/db/indices/a-tree/index.html NEW LINK FOUND: http://concept.cs.uah.edu/CG/ NEW LINK FOUND: http://www-ssdi.di.fct.unl.pt/~jea/ NEW LINK FOUND: http://www.cs.columbia.edu/~radev/summarization/ NEW LINK FOUND: http://www.contrib.andrew.cmu.edu/~zechner/klaus.html NEW LINK FOUND: http://ultratext.hil.unb.ca/Texts/ NEW LINK FOUND: http://library.usask.ca/90th NEW LINK FOUND: http://www.tipster.org/index.htm NEW LINK FOUND: http://www.tipster.org/summcall.htm NEW LINK FOUND: http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ NEW LINK FOUND: http://www.phil.uni-sb.de/FR/Infowiss/Abstract/program.html NEW LINK FOUND: http://www.mcs.surrey.ac.uk/SystemQ/ NEW LINK FOUND: http://www.machinasapiens.com/english/service/demo/form_demo_infos_ang.html NEW LINK FOUND: http://www.machinasapiens.com/english/company/machinaang.html NEW LINK FOUND: http://www.isoquest.com/DownLoad/NetOwl.exe NEW LINK FOUND: http://www.glu.com/ResearchMTT.html NEW LINK FOUND: http://ai.iit.nrc.ca/II_public/visualize NEW LINK FOUND: http://www.dur.ac.uk/~dcs3py/pages/work/Documents/lit-survey/IV-Survey/index.html#sec3.3.5 NEW LINK FOUND: http://ai.iit.nrc.ca/II_public/extractor.html NEW LINK FOUND: http://ai.iit.nrc.ca/jair/keyphrases/ NEW LINK FOUND: http://www.lingsoft.fi/demos.html NEW LINK FOUND: http://transend.labs.bt.com/prosum/word/index.html NEW LINK FOUND: http://transend.labs.bt.com/cgi-bin/prosum/prosum NEW LINK FOUND: http://www.verity.com/support/s97is/sis301u/10_is4.htm NEW LINK FOUND: 
ftp://ftp.cs.cornell.edu/pub/smart/ NEW LINK FOUND: http://www.qdeck.com/cgi-bin/download.pl?WebCompass NEW LINK FOUND: http://www.inxight.com/inprodl.htm NEW LINK FOUND: http://www.ik.fh-hannover.de/ik/personen/ben/Buchdeckel.htm NEW LINK FOUND: http://www.ik.fh-hannover.de/ik/personen/ben/SimSumDemoHeader.htm NEW LINK FOUND: http://www.ik.fh-hannover.de/ik/personen/ben/SimSumlastHeader.htm NEW LINK FOUND: http://aztec.lib.utk.edu/libres/libre7n2/khoo.html NEW LINK FOUND: http://www.mitre.org/resources/centers/advanced_info/g04f/bnn/mmiis97.html NEW LINK FOUND: http://ai.iit.nrc.ca/II_public/extractor/applications.html NEW LINK FOUND: http://www.dcs.glasgow.ac.uk/Keith/Preface.html NEW LINK FOUND: http://www.dcs.glasgow.ac.uk/Keith/Chapter.2/Ch.2.html NEW LINK FOUND: http://www.kgw.tu-berlin.de/~mengel/SpeechTech/HLTsurvey.html NEW LINK FOUND: http://www.kgw.tu-berlin.de/~mengel/SpeechTech/ch7node6.html#SECTION74 NEW LINK FOUND: http://www.kgw.tu-berlin.de/~mengel/SpeechTech/ch7node9.html#CH7REF NEW LINK FOUND: http://www.g7.fed.us/enrm/pattern.html NEW LINK FOUND: http://www.cs.mu.oz.au/~sbaillie/RP.html NEW LINK FOUND: http://www.dcs.shef.ac.uk/research/ilash/Seminars/paice.html NEW LINK FOUND: http://www.csi.uottawa.ca/dept/kaml/KAML.html NEW LINK FOUND: http://www.site.uottawa.ca/ NEW LINK FOUND: http://www.eng.uottawa.ca/ nach ParserDelegator() PARSER RUN 79 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.csi.uottawa.ca/~clank/IIA.html 0 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/~kbarker 1 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 2 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://by.genie.uottawa.ca/profs/barriere/barriere.html 3 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 4 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/~martin 5 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/ 6 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~terry 7 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 8 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~debruijn/berry_home.html 9 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 10 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~delannoy 11 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte 12 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 13 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca:80/~u668426/ 14 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 15 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca:80/~clank/ 16 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 
17 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/staff/joel.html 18 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/ 19 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~stan 20 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 21 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~szpak 22 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/kaml.html 23 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/staff/peter.html 24 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/ 25 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/tanka/uploadable/tr98-04.ps 26 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~delannoy/relevant_cfps.htm 27 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/tanka/uploadable/TS_exper.ps 28 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~szpak/proposals/text-summ-1996_ToC.html 29 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~debruijn/irbib.html 30 79 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/~gael/alphalist.html 31 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://bib.cs.uni-dortmund.de/journals/journals-index.html 32 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~szpak/proposals/text-summ-1996.html#RTFToC21 33 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://crrm.univ-mrs.fr/gateway/info-sci/Infosci1.html 34 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://researchsmp2.cc.vt.edu/DB/db/indices/a-tree/index.html 35 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://concept.cs.uah.edu/CG/ 36 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-ssdi.di.fct.unl.pt/~jea/ 37 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~radev/summarization/ 38 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.contrib.andrew.cmu.edu/~zechner/klaus.html 39 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ultratext.hil.unb.ca/Texts/ 40 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://library.usask.ca/90th 41 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tipster.org/index.htm 42 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.tipster.org/summcall.htm 43 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ 44 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.phil.uni-sb.de/FR/Infowiss/Abstract/program.html 45 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mcs.surrey.ac.uk/SystemQ/ 46 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.machinasapiens.com/english/service/demo/form_demo_infos_ang.html 47 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.machinasapiens.com/english/company/machinaang.html 48 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isoquest.com/DownLoad/NetOwl.exe 49 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glu.com/ResearchMTT.html 50 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/II_public/visualize 51 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dur.ac.uk/~dcs3py/pages/work/Documents/lit-survey/IV-Survey/index.html#sec3.3.5 52 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/II_public/extractor.html 53 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/jair/keyphrases/ 54 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/demos.html 55 79 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://transend.labs.bt.com/prosum/word/index.html 56 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://transend.labs.bt.com/cgi-bin/prosum/prosum 57 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.verity.com/support/s97is/sis301u/10_is4.htm 58 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.cs.cornell.edu/pub/smart/ 59 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.qdeck.com/cgi-bin/download.pl?WebCompass 60 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.inxight.com/inprodl.htm 61 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ik.fh-hannover.de/ik/personen/ben/Buchdeckel.htm 62 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ik.fh-hannover.de/ik/personen/ben/SimSumDemoHeader.htm 63 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ik.fh-hannover.de/ik/personen/ben/SimSumlastHeader.htm 64 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://aztec.lib.utk.edu/libres/libre7n2/khoo.html 65 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mitre.org/resources/centers/advanced_info/g04f/bnn/mmiis97.html 66 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/II_public/extractor/applications.html 67 79 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.glasgow.ac.uk/Keith/Preface.html 68 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.glasgow.ac.uk/Keith/Chapter.2/Ch.2.html 69 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.kgw.tu-berlin.de/~mengel/SpeechTech/HLTsurvey.html 70 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.kgw.tu-berlin.de/~mengel/SpeechTech/ch7node6.html#SECTION74 71 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.kgw.tu-berlin.de/~mengel/SpeechTech/ch7node9.html#CH7REF 72 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.g7.fed.us/enrm/pattern.html 73 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.mu.oz.au/~sbaillie/RP.html 74 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/research/ilash/Seminars/paice.html 75 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/dept/kaml/KAML.html 76 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/ 77 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.eng.uottawa.ca/ 78 79 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.columbia.edu ROBOT - SAVE CON OPENED: 
http://www.cs.columbia.edu/~hjing/summarization.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ NEW LINK FOUND: http://www.cs.columbia.edu/~radev/ists97 NEW LINK FOUND: http://www.cs.columbia.edu/~radev/aaai-sss98-its NEW LINK FOUND: http://www-nlpir.nist.gov/related_projects/tipster_summac/results_eval.html NEW LINK FOUND: http://www.dcs.shef.ac.uk/~gael/summarization.html NEW LINK FOUND: http://www.site.uottawa.ca/tanka/ts.html NEW LINK FOUND: http://transend.labs.bt.com/ NEW LINK FOUND: http://www.inxight.com/ NEW LINK FOUND: http://extractor.iit.nrc.ca/ NEW LINK FOUND: http://www.trl.ibm.co.jp/projects/langtran/abst_e.htm NEW LINK FOUND: http://www.dcs.shef.ac.uk/~gael/alphalist.html NEW LINK FOUND: http://www.mews.org/jto/ab45001.html NEW LINK FOUND: http://www.jprc.com/users/mkant/summarize.html NEW LINK FOUND: http://www.kgw.tu-berlin.de/~mengel/SpeechTech/ch7node6.html NEW LINK FOUND: http://www.lehmam.freesurf.fr/automatic_summarization.htm NEW LINK FOUND: mailto:hjing@cs.columbia.edu nach ParserDelegator() PARSER RUN 16 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ 0 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~radev/ists97 1 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~radev/aaai-sss98-its 2 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-nlpir.nist.gov/related_projects/tipster_summac/results_eval.html 3 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/~gael/summarization.html 4 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.site.uottawa.ca/tanka/ts.html 5 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://transend.labs.bt.com/ 6 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.inxight.com/ 7 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://extractor.iit.nrc.ca/ 8 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.trl.ibm.co.jp/projects/langtran/abst_e.htm 9 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/~gael/alphalist.html 10 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mews.org/jto/ab45001.html 11 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.jprc.com/users/mkant/summarize.html 12 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.kgw.tu-berlin.de/~mengel/SpeechTech/ch7node6.html 13 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lehmam.freesurf.fr/automatic_summarization.htm 14 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:hjing@cs.columbia.edu 15 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.dcs.shef.ac.uk ROBOT - SAVE CON OPENED: http://www.dcs.shef.ac.uk/~gael/summarization.html CONTENT GOT: text/html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.dcs.shef.ac.uk FC1: java.io.FileNotFoundException: 
http://www.dcs.shef.ac.uk/~gael/summarization.html ROBOT - SAVE CON OPENED: http://www.dcs.shef.ac.uk/~gael/alphalist.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.dcs.shef.ac.uk/~gael/alphalist.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.columbia.edu ROBOT - SAVE CON OPENED: http://www.cs.columbia.edu/~radev/summarization/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.summarization.com NEW LINK FOUND: http://www.geocities.com/Athens/Forum/1373/acl-2000-summarization-theme.html NEW LINK FOUND: http://www.isi.edu/~cyl/was-anlp2000.html NEW LINK FOUND: http://www.cs.columbia.edu/~radev/aaai-sss98-its NEW LINK FOUND: http://www.aaai.org/Press/Reports/Symposia/Spring/SS-98-06/SS-98-06.html NEW LINK FOUND: http://www.cs.columbia.edu/~radev/ists97 NEW LINK FOUND: http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ NEW LINK FOUND: http://mitpress.mit.edu/book-home.tcl?isbn=0262133598 NEW LINK FOUND: http://www.dcs.shef.ac.uk/~gael/alphalist.html NEW LINK FOUND: http://www.cs.columbia.edu/~hjing/summarization.html NEW LINK FOUND: http://www.jprc.com/users/mkant/summarize.html NEW LINK FOUND: http://www.mitre.org/pubs/edge/first.htm NEW LINK FOUND: http://www.site.uottawa.ca/tanka/ts.html NEW LINK FOUND: http://www.aclweb.org/u/bin/search-index.cgi?database_name=acl&keywords=summary+summarization&max_output=1000 NEW LINK FOUND: http://www.infoseek.com/Titles?qt=%22text+summarization%22&col=WW&sv=I2&svx=NSQuickseek NEW LINK FOUND: http://www.altavista.digital.com/cgi-bin/query?pg=q&what=web&kl=XX&q=%22text+summarization%22 NEW LINK FOUND: http://transend.labs.bt.com/ NEW LINK FOUND: http://www.inxight.com NEW LINK FOUND: http://www.si.umich.edu/~radev/summarization/large-bib.doc nach ParserDelegator() PARSER RUN 19 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.summarization.com 0 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 http://www.geocities.com/Athens/Forum/1373/acl-2000-summarization-theme.html 1 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/~cyl/was-anlp2000.html 2 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~radev/aaai-sss98-its 3 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/Reports/Symposia/Spring/SS-98-06/SS-98-06.html 4 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~radev/ists97 5 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ 6 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://mitpress.mit.edu/book-home.tcl?isbn=0262133598 7 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/~gael/alphalist.html 8 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~hjing/summarization.html 9 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.jprc.com/users/mkant/summarize.html 10 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mitre.org/pubs/edge/first.htm 11 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/tanka/ts.html 12 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.aclweb.org/u/bin/search-index.cgi?database_name=acl&keywords=summary+summarization&max_output=1000 13 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.infoseek.com/Titles?qt=%22text+summarization%22&col=WW&sv=I2&svx=NSQuickseek 14 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.altavista.digital.com/cgi-bin/query?pg=q&what=web&kl=XX&q=%22text+summarization%22 15 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://transend.labs.bt.com/ 16 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.inxight.com 17 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.si.umich.edu/~radev/summarization/large-bib.doc 18 19 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.isi.edu ROBOT - SAVE CON OPENED: http://www.isi.edu/natural-language/projects/SUMMARIST.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.isi.edu/~cyl/must/must_beta.htm NEW LINK FOUND: http://www.isi.edu/natural-language/ONTOLOGIES.html NEW LINK FOUND: http://www.isi.edu/natural-language/ONTOLOGIES.html NEW LINK FOUND: http://www.isi.edu/natural-language/mt/nitrogen/ NEW LINK FOUND: http://www.isi.edu/~cyl/must/must_beta.htm NEW LINK FOUND: http://www.isi.edu/natural-language/summ-indic-phrases.html NEW LINK FOUND: http://www.isi.edu/~cyl/summarist/full-01.jpg NEW LINK FOUND: http://www.isi.edu/natural-language/people/hovy.html NEW LINK FOUND: http://www.isi.edu/~cyl NEW LINK FOUND: http://www.isi.edu/~marcu/ nach ParserDelegator() PARSER RUN 10 crawl_level: 1 MAX_CRAWL_DEPTH: 2 
http://www.isi.edu/~cyl/must/must_beta.htm 0 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/natural-language/ONTOLOGIES.html 1 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/natural-language/ONTOLOGIES.html 2 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/natural-language/mt/nitrogen/ 3 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/~cyl/must/must_beta.htm 4 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/natural-language/summ-indic-phrases.html 5 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/~cyl/summarist/full-01.jpg 6 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/natural-language/people/hovy.html 7 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/~cyl 8 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/~marcu/ 9 10 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.isi.edu ROBOT - SAVE CON OPENED: http://www.isi.edu/natural-language/summ-indic-phrases.html CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ik.fh-hannover.de javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.isi.edu/natural-language/summ-indic-phrases.html ROBOT 
- SAVE CON OPENED: http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.ik.fh-hannover.de/ik/projekte/Dagstuhl/Abstract/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN java.lang.NullPointerException - http://extractor.iit.nrc.ca/ extractor.iit.nrc.ca ROBOT - SAVE CON OPENED: http://extractor.iit.nrc.ca/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN dblp.uni-trier.de ROBOT - SAVE CON OPENED: http://dblp.uni-trier.de/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://dblp.uni-trier.de/ NEW LINK FOUND: http://www.uni-trier.de/ NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/ NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/ NEW LINK FOUND: http://www.acm.org/sigmod/dblp/db/index.html NEW LINK FOUND: http://www.vldb.org/dblp/db/index.html NEW LINK FOUND: http://www.cobase.cs.ucla.edu/pub/dblp/html/db/ NEW LINK FOUND: http://sunsite.informatik.rwth-aachen.de/dblp/db/ NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/index.html NEW LINK FOUND: http://www.acm.org/sigmod/dblp/db/indices/a-tree/index.html NEW LINK FOUND: http://www.vldb.org/dblp/db/indices/a-tree/index.html NEW LINK FOUND: http://www.cobase.cs.ucla.edu/pub/dblp/html/db/indices/a-tree/index.html NEW LINK FOUND: http://sunsite.informatik.rwth-aachen.de/dblp/db/indices/a-tree/index.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/indices/t-form.html NEW LINK FOUND: http://www.acm.org/sigmod/dblp/db/indices/t-form.html NEW LINK FOUND: http://www.vldb.org/dblp/db/indices/t-form.html NEW LINK FOUND: http://www.cobase.cs.ucla.edu/pub/dblp/html/db/indices/t-form.html NEW LINK FOUND: http://sunsite.informatik.rwth-aachen.de/dblp/db/indices/t-form.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/conf/index.a.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/journals/index.html NEW LINK FOUND: 
http://www.informatik.uni-trier.de/~ley/db/subjects.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/conf/vldb/index.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/journals/tods/index.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/conf/sigmod/HullZ96.html NEW LINK FOUND: http://www.acm.org/ NEW LINK FOUND: http://link.springer.de/ NEW LINK FOUND: mailto:ley@uni-trier.de NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/about/faq.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/about/awards.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/papers/gi97.ps NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/db/copyright.html NEW LINK FOUND: http://www.informatik.uni-trier.de/~ley/addr.html NEW LINK FOUND: mailto:ley@uni-trier.de nach ParserDelegator() PARSER RUN 33 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://dblp.uni-trier.de/ 0 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.uni-trier.de/ 1 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/ 2 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/ 3 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/sigmod/dblp/db/index.html 4 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.vldb.org/dblp/db/index.html 5 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cobase.cs.ucla.edu/pub/dblp/html/db/ 6 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sunsite.informatik.rwth-aachen.de/dblp/db/ 7 33 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/index.html 8 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/sigmod/dblp/db/indices/a-tree/index.html 9 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.vldb.org/dblp/db/indices/a-tree/index.html 10 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cobase.cs.ucla.edu/pub/dblp/html/db/indices/a-tree/index.html 11 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sunsite.informatik.rwth-aachen.de/dblp/db/indices/a-tree/index.html 12 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/indices/t-form.html 13 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/sigmod/dblp/db/indices/t-form.html 14 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.vldb.org/dblp/db/indices/t-form.html 15 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cobase.cs.ucla.edu/pub/dblp/html/db/indices/t-form.html 16 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sunsite.informatik.rwth-aachen.de/dblp/db/indices/t-form.html 17 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/conf/index.a.html 18 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.informatik.uni-trier.de/~ley/db/journals/index.html 19 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/subjects.html 20 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/conf/vldb/index.html 21 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/journals/tods/index.html 22 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/conf/sigmod/HullZ96.html 23 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/ 24 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://link.springer.de/ 25 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:ley@uni-trier.de 26 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/about/faq.html 27 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/about/awards.html 28 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/papers/gi97.ps 29 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/db/copyright.html 30 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.uni-trier.de/~ley/addr.html 31 33 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:ley@uni-trier.de 32 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN lorca.compapp.dcu.ie ROBOT - SAVE CON OPENED: http://lorca.compapp.dcu.ie/SIGIR98-wshop/program.html CONTENT GOT: text/html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.webnetjrl.com FC1: java.io.FileNotFoundException: http://lorca.compapp.dcu.ie/SIGIR98-wshop/program.html ROBOT - SAVE CON OPENED: http://www.webnetjrl.com/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.megacockcravers.com/2/main.htm?id=marcoola NEW LINK FOUND: http://www.megacockcravers.com/2/main.htm?id=marcoola NEW LINK FOUND: http://www.springbreakspycam.com/t1/pps=marcoola/ NEW LINK FOUND: http://www.springbreakspycam.com/t1/pps=marcoola/ NEW LINK FOUND: http://www.cumfiesta.com/gallys/?id=marcoola NEW LINK FOUND: http://www.cumfiesta.com/gallys/?id=marcoola NEW LINK FOUND: http://www.bigtitsroundasses.com/t1/pps=marcoola/ NEW LINK FOUND: http://www.bigtitsroundasses.com/t1/pps=marcoola/ NEW LINK FOUND: http://www.oxpass.com/t1/pps=marcoola/ NEW LINK FOUND: http://www.streetblowjobs.com/2/main.htm?id=marcoola NEW LINK FOUND: http://www.wild-girl.biz NEW LINK FOUND: http://www.asianglamors.com NEW LINK FOUND: http://www.cumfiesta.com/5/main.htm?id=marcoola NEW LINK FOUND: http://www.wild-girl.biz/handjobs.htm NEW LINK FOUND: http://www.webnetjrl.com NEW LINK FOUND: http://www.webnetjrl.com/Hentai.htm nach ParserDelegator() PARSER RUN 16 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.megacockcravers.com/2/main.htm?id=marcoola 0 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.megacockcravers.com/2/main.htm?id=marcoola 1 16 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.springbreakspycam.com/t1/pps=marcoola/ 2 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.springbreakspycam.com/t1/pps=marcoola/ 3 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cumfiesta.com/gallys/?id=marcoola 4 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cumfiesta.com/gallys/?id=marcoola 5 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.bigtitsroundasses.com/t1/pps=marcoola/ 6 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.bigtitsroundasses.com/t1/pps=marcoola/ 7 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.oxpass.com/t1/pps=marcoola/ 8 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.streetblowjobs.com/2/main.htm?id=marcoola 9 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.wild-girl.biz 10 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asianglamors.com 11 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cumfiesta.com/5/main.htm?id=marcoola 12 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.wild-girl.biz/handjobs.htm 13 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webnetjrl.com 14 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 http://www.webnetjrl.com/Hentai.htm 15 16 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN trec.nist.gov ROBOT - SAVE CON OPENED: http://trec.nist.gov/pubs.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://trec.nist.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Lori+Buckland NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Lori+Buckland NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Lori+Buckland NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.gpo.gov/ NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.gpo.gov/ NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.gpo.gov/ NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: 
http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.ntis.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov NEW LINK FOUND: http://www.nist.gov NEW LINK FOUND: http://www.nist.gov/cgi-bin/wwwph?Donna+Harman NEW LINK FOUND: mailto:trec@nist.gov nach ParserDelegator() PARSER RUN 56 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://trec.nist.gov 0 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 http://www.nist.gov 1 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 2 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Lori+Buckland 3 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 4 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Lori+Buckland 5 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 6 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 7 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 8 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Lori+Buckland 9 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.gpo.gov/ 10 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 11 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 12 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 13 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink 
to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 14 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 15 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 16 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 17 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 18 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.gpo.gov/ 19 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 20 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 21 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 22 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 23 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.gpo.gov/ 24 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 25 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 26 56 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 27 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 28 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov 29 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 30 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 31 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 32 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 33 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov 34 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 35 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 36 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Ellen+Voorhees 37 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 38 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov 39 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 40 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 41 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 42 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov 43 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 44 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 45 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 46 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.ntis.gov 47 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 48 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 49 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 50 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=http://www.ntis.gov 51 56 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/exit_nist.cgi?url=www.doc.gov 52 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov 53 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nist.gov/cgi-bin/wwwph?Donna+Harman 54 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:trec@nist.gov 55 56 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www3.cm.deakin.edu.au ROBOT - SAVE CON OPENED: http://www3.cm.deakin.edu.au/apweb98/program.htm CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.acm.org javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www3.cm.deakin.edu.au/apweb98/program.htm unsupported Site 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.iit.nrcps.ariadne-t.gr java.lang.NullPointerException - http://www.iit.nrcps.ariadne-t.gr/~costass/mulsaic97.html ROBOT - SAVE CON OPENED: http://www.iit.nrcps.ariadne-t.gr/~costass/mulsaic97.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ee.umd.edu ROBOT - SAVE CON OPENED: http://www.ee.umd.edu/medlab/filter/sss/papers/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: mailto:oard@glue.umd.edu NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/sss/ NEW LINK FOUND: http://www.aaai.org/Symposia/Spring/1997/sssparticipation-97.html NEW LINK FOUND: http://www.cs.cmu.edu/~ref/aaai-ref.html NEW LINK FOUND: http://isserv.tas.ntt.co.jp/~hayashi/papers/AAAI-CLTR/hayashi.html NEW LINK FOUND: http://www.ee.umd.edu/medlab/filter/sss/ NEW LINK FOUND: 
http://www.aaai.org/Symposia/Spring/1997/sssparticipation-97.html NEW LINK FOUND: mailto:oard@glue.umd.edu nach ParserDelegator() PARSER RUN 8 crawl_level: 1 MAX_CRAWL_DEPTH: 2 mailto:oard@glue.umd.edu 0 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/sss/ 1 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Symposia/Spring/1997/sssparticipation-97.html 2 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~ref/aaai-ref.html 3 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://isserv.tas.ntt.co.jp/~hayashi/papers/AAAI-CLTR/hayashi.html 4 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ee.umd.edu/medlab/filter/sss/ 5 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Symposia/Spring/1997/sssparticipation-97.html 6 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:oard@glue.umd.edu 7 8 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ciir.cs.umass.edu ROBOT - SAVE CON OPENED: http://ciir.cs.umass.edu/nir97/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.acm.org/sigir/conferences/sigir97/ NEW LINK FOUND: mailto:callan@cs.umass.edu NEW LINK FOUND: http://www.acm.org/sigir/conferences/sigir97/wshops.html nach ParserDelegator() PARSER RUN 3 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.acm.org/sigir/conferences/sigir97/ 0 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 mailto:callan@cs.umass.edu 1 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/sigir/conferences/sigir97/wshops.html 2 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN journals.ecs.soton.ac.uk ROBOT - SAVE CON OPENED: http://journals.ecs.soton.ac.uk/~lac/ht97/summary.html CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://journals.ecs.soton.ac.uk/~lac/ht97/summary.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN proceedings.www6conf.org ROBOT - SAVE CON OPENED: http://proceedings.www6conf.org/ CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN atlanta.cs.nchu.edu.tw javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://proceedings.www6conf.org/ ROBOT - SAVE CON OPENED: http://atlanta.cs.nchu.edu.tw/www/ToC.html 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www5conf.inria.fr java.lang.NullPointerException - http://atlanta.cs.nchu.edu.tw/www/ToC.html ROBOT - SAVE CON OPENED: http://www5conf.inria.fr/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.inria.fr/welcome-eng.html NEW LINK FOUND: http://www.cec.lu/ NEW LINK FOUND: http://www-ercim.inria.fr/ NEW LINK FOUND: http://www.w3.org/ NEW LINK FOUND: http://jeeves.ncsa.uiuc.edu/Public/IW3C2/Members.html NEW LINK FOUND: http://www7.conf.au NEW LINK FOUND: http://www6conf.slac.stanford.edu NEW LINK FOUND: http://www.elsevier.nl/homepage/sac/paris96/ NEW LINK FOUND: http://www5conf.inria.fr/fich_html/webcast/Welcome.html NEW LINK FOUND: http://www.atmedia.net/WWW5Volunteers/ NEW LINK FOUND: mailto:www5-info@inria.fr NEW LINK FOUND: http://www.inria.fr/welcome-eng.html NEW LINK FOUND: http://www.cec.lu NEW LINK FOUND: 
http://www-ercim.inria.fr NEW LINK FOUND: http://www.w3.org/ nach ParserDelegator() PARSER RUN 15 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.inria.fr/welcome-eng.html 0 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cec.lu/ 1 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-ercim.inria.fr/ 2 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/ 3 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://jeeves.ncsa.uiuc.edu/Public/IW3C2/Members.html 4 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www7.conf.au 5 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6conf.slac.stanford.edu 6 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.elsevier.nl/homepage/sac/paris96/ 7 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www5conf.inria.fr/fich_html/webcast/Welcome.html 8 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.atmedia.net/WWW5Volunteers/ 9 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:www5-info@inria.fr 10 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.inria.fr/welcome-eng.html 11 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cec.lu 12 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www-ercim.inria.fr 13 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/ 14 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.supercomp.org ROBOT - SAVE CON OPENED: http://www.supercomp.org/sc95/proceedings/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://www.supercomp.org/sc95/proceedings/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ai.iit.nrc.ca ROBOT - SAVE CON OPENED: http://ai.iit.nrc.ca/DEIL/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://ai.iit.nrc.ca/DEIL/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-cse.ucsd.edu ROBOT - SAVE CON OPENED: http://www-cse.ucsd.edu/users/rik/MLIA.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www-cse.ucsd.edu/users/rik/ NEW LINK FOUND: http://www.cs.wisc.edu/~shavlik/ NEW LINK FOUND: http://www.parc.xerox.com/istl/projects/mlia/mlia-papers.html NEW LINK FOUND: mailto:rik@cs.ucsd.edu NEW LINK FOUND: http://www.dcs.gla.ac.uk/Keith/Preface.html NEW LINK FOUND: ftp://cs.washington.edu/pub/map/papers/Category-Translation.ps NEW LINK FOUND: http://www.cs.washington.edu/homes/map/ NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/hammond.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www.csi.uottawa.ca/~holte/Learning/TR-95-12.ps NEW LINK FOUND: http://www.csi.uottawa.ca/~holte/Learning/aaai94ss.ps NEW LINK FOUND: http://www.csi.uottawa.ca/~holte/Learning/kbse93.ps NEW LINK FOUND: http://agents.www.media.mit.edu/groups/agents/papers/aaai-ymp/aaai.html NEW LINK FOUND: ftp://media.mit.edu/pub/agents/interface-agents/coll-agents.ps NEW LINK FOUND: http://yezdi.www.media.mit.edu/people/yezdi NEW LINK 
FOUND: http://memetral.www.media.mit.edu/people/memetral/ NEW LINK FOUND: http://pattie.www.media.mit.edu/people/pattie/ NEW LINK FOUND: http://www.aaai.org/Publications/Press/press.html NEW LINK FOUND: http://anther.learning.cs.cmu.edu/ml95.ps NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/webagent-plus.ps.Z NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/mltagung-e.ps.Z NEW LINK FOUND: http://www.ics.uci.edu/~pazzani/Coldlist.html NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/jvcir.ps NEW LINK FOUND: http://flamingo.stanford.edu/users/marko/bio.html NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/vcir.abs NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/lira.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://robotics.stanford.edu/people/marko/papers/lira.abs NEW LINK FOUND: ftp://scr.siemens.com:/pub/learning/Papers/towell/ml95.ps.gz NEW LINK FOUND: http://www.cs.colorado.edu/~andreas/Time-Series/MyPapers/topic-spotting.ps.Z NEW LINK FOUND: http://www.cs.colorado.edu/~andreas/ NEW LINK FOUND: ftp://parcftp.xerox.com/pub/qca/SIGIR95.ps NEW LINK FOUND: http:/~shavlik/cs838/dlewis-sigir94.ps NEW LINK FOUND: http://ai.iit.nrc.ca/DEIL/krulwich.ps.Z NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/krulwich.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://ai.iit.nrc.ca/DEIL/martin.ps.Z NEW LINK FOUND: http://www.research.att.com/orgs/ssr/people/wcohen/postscript/ml-95-ir.ps NEW LINK FOUND: http://www.research.att.com/orgs/ssr/people/wcohen/ NEW LINK FOUND: ftp://odyssey.ucc.ie/pub/filtering/INNC94.ps NEW LINK FOUND: 
http://ai.bpa.arizona.edu/papers/mlir93/mlir93.html NEW LINK FOUND: http://ai.bpa.arizona.edu/papers/hicss27g/hicss27g.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/lsi2.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/sig.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dair94/inc.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dair94/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dpe-ml/dair93.ps NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/dair93/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/aaai94_3.ps NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/abstract.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/info-spiders.ps NEW LINK FOUND: http://www.isi.edu/sims/knoblock/sss95/proceedings.html NEW LINK FOUND: http://www.aaai.org/Press/press.html NEW LINK FOUND: http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/abstract.html NEW LINK FOUND: http://argo.gslis.ucla.edu/pbiron/pubs/cir/cir.html NEW LINK FOUND: http://ciir.cs.umass.edu/info/psfiles/irpubs/ml95_ToC.html NEW LINK FOUND: ftp://ciir-ftp.cs.umass.edu/pub/papers/lewis/nlirbib93.ps.Z NEW LINK FOUND: http://atg1.wustl.edu/DL94/paper/futrelle.html NEW LINK FOUND: http://ai.iit.nrc.ca/II_public/ NEW LINK FOUND: http://www.cs.columbia.edu/~acl/home.html NEW LINK FOUND: http://xxx.lanl.gov/cmp-lg/ NEW LINK FOUND: http://www.cogsci.princeton.edu/~wn/index.html NEW LINK FOUND: http://www.glue.umd.edu/enee/medlab/filter/filter.html NEW LINK FOUND: http://www.uspto.gov/web/ipnii/ NEW LINK FOUND: http://www.itd.nrl.navy.mil/ONR/aci/ NEW LINK FOUND: http://ciir.cs.umass.edu/ NEW LINK 
FOUND: http://fox.cs.vt.edu/foxlinks.html NEW LINK FOUND: mailto:rik@cs.ucsd.edu NEW LINK FOUND: mailto:shavlik@cs.wisc.edu nach ParserDelegator() PARSER RUN 77 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www-cse.ucsd.edu/users/rik/ 0 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.wisc.edu/~shavlik/ 1 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.parc.xerox.com/istl/projects/mlia/mlia-papers.html 2 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:rik@cs.ucsd.edu 3 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.gla.ac.uk/Keith/Preface.html 4 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://cs.washington.edu/pub/map/papers/Category-Translation.ps 5 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.washington.edu/homes/map/ 6 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/hammond.ps 7 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 8 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 9 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte/Learning/TR-95-12.ps 10 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte/Learning/aaai94ss.ps 11 77 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.csi.uottawa.ca/~holte/Learning/kbse93.ps 12 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://agents.www.media.mit.edu/groups/agents/papers/aaai-ymp/aaai.html 13 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://media.mit.edu/pub/agents/interface-agents/coll-agents.ps 14 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://yezdi.www.media.mit.edu/people/yezdi 15 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://memetral.www.media.mit.edu/people/memetral/ 16 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pattie.www.media.mit.edu/people/pattie/ 17 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Publications/Press/press.html 18 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://anther.learning.cs.cmu.edu/ml95.ps 19 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/webagent-plus.ps.Z 20 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 21 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 22 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-6/web-agent/www/mltagung-e.ps.Z 23 77 before addLink(isd.getUrls()[i]) 
after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ics.uci.edu/~pazzani/Coldlist.html 24 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/jvcir.ps 25 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://flamingo.stanford.edu/users/marko/bio.html 26 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/vcir.abs 27 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/lira.ps 28 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 29 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 30 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/people/marko/papers/lira.abs 31 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://scr.siemens.com:/pub/learning/Papers/towell/ml95.ps.gz 32 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.colorado.edu/~andreas/Time-Series/MyPapers/topic-spotting.ps.Z 33 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.colorado.edu/~andreas/ 34 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://parcftp.xerox.com/pub/qca/SIGIR95.ps 35 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added 
via addLink to crawl_level 2 http:/~shavlik/cs838/dlewis-sigir94.ps 36 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/DEIL/krulwich.ps.Z 37 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/krulwich.ps 38 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 39 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 40 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/DEIL/martin.ps.Z 41 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.research.att.com/orgs/ssr/people/wcohen/postscript/ml-95-ir.ps 42 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.research.att.com/orgs/ssr/people/wcohen/ 43 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://odyssey.ucc.ie/pub/filtering/INNC94.ps 44 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.bpa.arizona.edu/papers/mlir93/mlir93.html 45 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.bpa.arizona.edu/papers/hicss27g/hicss27g.html 46 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/lsi2.ps 47 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www-cse.ucsd.edu:80/users/rik/papers/sigir92/abstract.html 48 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/sig.ps 49 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/sigir94/abstract.html 50 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dair94/inc.ps 51 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dair94/abstract.html 52 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dpe-ml/dair93.ps 53 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/dair93/abstract.html 54 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/aaai94_3.ps 55 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 56 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai94-steier/abstract.html 57 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/info-spiders.ps 58 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/sims/knoblock/sss95/proceedings.html 59 77 before addLink(isd.getUrls()[i]) 
after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Press/press.html 60 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-cse.ucsd.edu:80/users/rik/papers/aaai-sss95/abstract.html 61 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://argo.gslis.ucla.edu/pbiron/pubs/cir/cir.html 62 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu/info/psfiles/irpubs/ml95_ToC.html 63 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ciir-ftp.cs.umass.edu/pub/papers/lewis/nlirbib93.ps.Z 64 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://atg1.wustl.edu/DL94/paper/futrelle.html 65 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ai.iit.nrc.ca/II_public/ 66 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~acl/home.html 67 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://xxx.lanl.gov/cmp-lg/ 68 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cogsci.princeton.edu/~wn/index.html 69 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.glue.umd.edu/enee/medlab/filter/filter.html 70 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.uspto.gov/web/ipnii/ 71 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.itd.nrl.navy.mil/ONR/aci/ 72 77 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu/ 73 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://fox.cs.vt.edu/foxlinks.html 74 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:rik@cs.ucsd.edu 75 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:shavlik@cs.wisc.edu 76 77 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.enst.fr ROBOT - SAVE CON OPENED: http://www.enst.fr/~rungsawa/irrs.html/ CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://perso.enst.fr/~rungsawa/irrs.html/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN ciir.cs.umass.edu ROBOT - SAVE CON OPENED: http://ciir.cs.umass.edu/publications/index.shtml CONTENT GOT: text/html FC1: java.io.FileNotFoundException: http://ciir.cs.umass.edu/publications/index.shtml 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cse.ucsd.edu ROBOT - SAVE CON OPENED: http://www.cse.ucsd.edu/groups/guru/publications.html CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.ub2.lu.se javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.cse.ucsd.edu/groups/guru/publications.html ROBOT - SAVE CON OPENED: http://www.ub2.lu.se/desire/radar/lit-about-search-services.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.ambrosiasw.com/~fprefect/matrix/matrix.shtml NEW LINK FOUND: http://www.lib.rug.ac.be/internet/search.html NEW LINK FOUND: http://www.searchenginewatch.com/ NEW LINK FOUND: http://www.searchenginewatch.com/features.htm NEW LINK 
FOUND: http://www.melee.com/mica/index.html NEW LINK FOUND: http://www.zdnet.com/pcmag/features/websearch/_open.htm NEW LINK FOUND: http://www.internetworld.com/1997/12/iwlabs.html NEW LINK FOUND: http://www.internetworld.com/cur/1997/09/report.html NEW LINK FOUND: http://www4.zdnet.com/pccomp/features/excl0997/sear/sear.html NEW LINK FOUND: http://www.winona.msus.edu/is-f/library-f/webind2/webind2.htm NEW LINK FOUND: http://www.cs.uga.edu/~seong/search.html NEW LINK FOUND: http://www.onlineinc.com/database/FebDB97/nets2.html NEW LINK FOUND: http://www.firstmonday.dk NEW LINK FOUND: http://www.stark.k12.oh.us/Docs/search/ NEW LINK FOUND: http://www.library.nwu.edu/resources/internet/search/evaluate.html NEW LINK FOUND: http://www.internetworld.com/1997/01/iwlabs.html NEW LINK FOUND: http://www.pcmag.com/iu/srchsite/_open.htm NEW LINK FOUND: http://www.asis.org/annual-96/ElectronicProceedings/chu.html NEW LINK FOUND: http://www.zdnet.com/pccomp/features/fea1096/sub2.html NEW LINK FOUND: http://www.curtin.edu.au/curtin/library/staffpages/gwpersonal/senginestudy/ NEW LINK FOUND: http://www.pcworld.com/engines.html NEW LINK FOUND: http://www.philb.com/msengine.htm NEW LINK FOUND: http://www.focus.de/DD/DDN/ddn.htm NEW LINK FOUND: http://www.imaginet.fr/~gmaire/search.htm NEW LINK FOUND: http://www.onlineinc.com/database/JuneDB/nets6.html NEW LINK FOUND: http://neal.ctstateu.edu:2001/htdocs/websearch.html NEW LINK FOUND: http://www.onlineinc.com/onlinemag/MayOL/zorn5.html NEW LINK FOUND: http://www.hotwired.com/wired/4.05/indexing/index.html NEW LINK FOUND: http://www.internetworld.com/1996/05/showdown.html NEW LINK FOUND: http://www.internetworld.com/1996/05/guiding.html NEW LINK FOUND: http://www.library.ucsb.edu/untangle/eagan.html NEW LINK FOUND: http://www.chem.msu.su/eng/comparison.html NEW LINK FOUND: http://cord.iupui.edu/~arzurmue/ NEW LINK FOUND: http://lib-www.ucr.edu/pubs/navigato.html NEW LINK FOUND: http://ukoln.bath.ac.uk/ariadne/issue2/engines/ NEW LINK 
FOUND: http://www.indiana.edu/~librcsd/search/ NEW LINK FOUND: http://www.wsulibs.wsu.edu/general/robots.htm NEW LINK FOUND: http://www.pcworld.com/reprints/lycos.htm NEW LINK FOUND: http://www.cnet.com/Content/Reviews/Compare/Search/ NEW LINK FOUND: http://www.microsoft.com/usability/webconf/schlichting/schlichting.htm NEW LINK FOUND: http://www.zdnet.com/~pccomp/features/internet/search/sub1.html NEW LINK FOUND: http://www.imt.net/~notess/compeng.html NEW LINK FOUND: http://sunsite.berkeley.edu/Help/searchdetails.html NEW LINK FOUND: http://www.ub2.lu.se/NNC/libconf-archive/0039.html NEW LINK FOUND: http://www.bubl.bath.ac.uk/BUBL/IWinship.html NEW LINK FOUND: http://www.zdnet.com/~pccomp/features/internet/search/index.html NEW LINK FOUND: http://www.leeds.ac.uk/ucs/docs/fur14/fur14.html NEW LINK FOUND: http://issfw.palomar.edu/Library/TGSEARCH.HTM NEW LINK FOUND: gopher://online.lib.uic.edu:70/0F-1%3A1803%3A06.txt NEW LINK FOUND: http://burns.library.uvic.ca/CLA/CLA_overhead1.html NEW LINK FOUND: http://hackberry.chem.niu.edu:80/Infobahn/Paper23/ NEW LINK FOUND: http://www.internetworld.com/1995/04/feat41.htm NEW LINK FOUND: http://www.ix.de/ix/raven/Web/9412/WebPoints.html NEW LINK FOUND: http://sunsite.unc.edu/cmc/mag/1994/sep/spiders.html NEW LINK FOUND: http://www.ub2.lu.se/nav_menu.html NEW LINK FOUND: http://www.ub2.lu.se/tk/websearch_systemat.html NEW LINK FOUND: http://www.ub2.lu.se/tk/demos/BFD9602-en.html NEW LINK FOUND: http://www.ub2.lu.se/tk/demos/DO9603-manus.html NEW LINK FOUND: http://www.ub2.lu.se/tk/demos/DO9603-meng.html NEW LINK FOUND: http://www.ub2.lu.se/desire/radar/reports/D3.11/ NEW LINK FOUND: http://www.wilpaterson.edu/home/staff/kwagner/eval.htm NEW LINK FOUND: http://www.tiac.net/users/hope/findqual.html NEW LINK FOUND: http://www.nlc-bnc.ca/pubs/netnotes/notes15.htm NEW LINK FOUND: http://magi.com/~mmelick/it96jan.htm NEW LINK FOUND: http://www.w3.org/pub/Conferences/WWW4/Papers/169/ NEW LINK FOUND: 
http://fuzine.mt.cs.cmu.edu/mlm/signidr94.html NEW LINK FOUND: http://www.wais.com/SIGNIDR NEW LINK FOUND: http://fuzine.mt.cs.cmu.edu/mlm/signidr94brief.html NEW LINK FOUND: http://info.webcrawler.com/bp/WWW94.html NEW LINK FOUND: http://info.webcrawler.com/signidr/WCOverview.html NEW LINK FOUND: http://www.cs.colorado.edu/home/mcbryan/mypapers/www94.ps NEW LINK FOUND: http://arlo.wilsonhs.pps.k12.or.us/search.html NEW LINK FOUND: http://www.oise.on.ca/~mberns/RoboSearch.html NEW LINK FOUND: http://www.mispress.com/websearch/webstoc.html NEW LINK FOUND: http://www.mispress.com/websearch/websch4.html NEW LINK FOUND: http://argus-inc.com/searcher/index.html NEW LINK FOUND: http://www.cs.concordia.ca/w3-paper.html NEW LINK FOUND: http://info.webcrawler.com/mak/projects/aliweb/paper-www94/paper.html NEW LINK FOUND: http://www1.cern.ch/PapersWWW94/spider.ps NEW LINK FOUND: http://www.ub2.lu.se/UB2proj/LIS_collection/lorcan.html NEW LINK FOUND: http://www.mecklerweb.com/mags/ww/news/oct-95/news/1-6halves.html NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/debra/article.html NEW LINK FOUND: http://www.win.tue.nl/help/doc/demo.ps NEW LINK FOUND: http://www.library.ucsb.edu/untangle/lager.html NEW LINK FOUND: http://www.virage.com/wpaper/ NEW LINK FOUND: http://wwwqbic.almaden.ibm.com/ NEW LINK FOUND: http://www.scu.edu.au/ausweb96/educn/barry1/paper.html NEW LINK FOUND: http://www.mccmedia.com/search-list/ NEW LINK FOUND: http://www.tiac.net/users/hope/tips/ NEW LINK FOUND: http://www.tiac.net/users/hope/iw96/tips96.html NEW LINK FOUND: http://www.sci.ouc.bc.ca/libr/connect96/search.htm NEW LINK FOUND: http://www.monash.com/spidap.html NEW LINK FOUND: http://www.macworld.com/pages/december.96/Column.2893.html NEW LINK FOUND: http://www.tisl.ukans.edu/~sgauch/papers/JUCS96.html NEW LINK FOUND: http://userpage.fu-berlin.de/~angela/bond/browsing.htm NEW LINK FOUND: http://www.dom.de/FreiRaum/uli/buch.htm NEW LINK FOUND: 
http://www.ora.com/catalog/netresearch/noframes.html NEW LINK FOUND: http://www.winmag.com/library/1996/1196/11fc1.htm NEW LINK FOUND: http://www.winmag.com/library/1996/1196/11fc2.htm NEW LINK FOUND: http://www.netskills.ac.uk/resources/searching/index.shtml NEW LINK FOUND: http://ukoln.bath.ac.uk/dlib/dlib/march97/bt/03pollock.html NEW LINK FOUND: http://www.sciam.com/0397issue/0397intro.html NEW LINK FOUND: http://www.macworld.com/pages/december.96/Column.2893.html NEW LINK FOUND: http://www.cs.uchicago.edu/~swain/pubs/TR-96-14.ps NEW LINK FOUND: http://www.cs.uchicago.edu/~swain/pubs/TR-96-14.ps.Z NEW LINK FOUND: http://www.cs.uchicago.edu/~swain/pubs/TR-96-14.pdf NEW LINK FOUND: http://www.cs.uchicago.edu/~swain/pubs/CVPR97sub.ps NEW LINK FOUND: http://osiris.sunderland.ac.uk/sst/se/ NEW LINK FOUND: http://www.ukoln.ac.uk/ariadne/issue6/survey/ NEW LINK FOUND: http://www.searchinsider.com NEW LINK FOUND: http://www.accesscom.com/~ziegler/search.html NEW LINK FOUND: http://www.ub.uni-bielefeld.de/biblio/search/smkurs.htm NEW LINK FOUND: http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html NEW LINK FOUND: http://www.newscientist.com/keysites/networld/lost.html NEW LINK FOUND: http://www.library.ucsb.edu/untangle/lager.html NEW LINK FOUND: http://www.ariadne.ac.uk/issue9/search-engines/ NEW LINK FOUND: http://www.curtin.edu.au/curtin/library/staffpages/gwpersonal/senginestudy/zindex.htm NEW LINK FOUND: http://www.state.wi.us/agencies/dpi/www/search.html NEW LINK FOUND: http://login.eunet.no/~presno/bok/10.html NEW LINK FOUND: http://www.ualberta.ca/~nfriesen/536/ NEW LINK FOUND: http://www.sciam.com/0397issue/0397lynch.html NEW LINK FOUND: http://www.cnet.com/Content/Features/Dlife/Search/ NEW LINK FOUND: http://www.pcworld.com/software/internet_www/articles/dec96/1412p182a.html NEW LINK FOUND: http://www.pscw.uva.nl/sociosite/SEARCH/about_searching.html NEW LINK FOUND: http://osiris.sunderland.ac.uk/sst/se/ NEW LINK FOUND: 
http://rs.internic.net/nic-support/nicnews/archive/september96/enduser.html NEW LINK FOUND: http://rs.internic.net/nic-support/nicnews/oct96/enduser.html NEW LINK FOUND: http://www1.zdnet.com/complife/fea/9708/findny10.html NEW LINK FOUND: http://www4.zdnet.com/pccomp/besttips/search.html NEW LINK FOUND: http://debussy.cs.arizona.edu/sb/paper.html NEW LINK FOUND: http://www.winona.msus.edu/is-f/library-f/webind2/webind2.htm NEW LINK FOUND: http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html NEW LINK FOUND: http://www.w3.org/pub/WWW/Search/9605-Indexing-Workshop/ NEW LINK FOUND: http://ukoln.bath.ac.uk/ariadne/issue3/acdc/ NEW LINK FOUND: http://www.ub2.lu.se/W4/summary.html NEW LINK FOUND: http://www.ub2.lu.se/desire/radar/reports/D3.12/ NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/schwartz.harvest/schwartz.harvest.html NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Agents/eichmann.ethical/eichmann.html NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/lim/coollist.html NEW LINK FOUND: http://stork.ukc.ac.uk/computer_science/Html/Pubs/IHC10-94/ NEW LINK FOUND: http://www.igd.fhg.de/www/www95/papers/82/html-files/discover.html NEW LINK FOUND: http://www.igd.fhg.de/www/www95/papers/47/fwsf/fwsf.html NEW LINK FOUND: http://www.w3.org/pub/Conferences/WWW4/Papers/134/ NEW LINK FOUND: http://union.ncsa.uiuc.edu/HyperNews/get/www/indexing.html NEW LINK FOUND: http://www.informatik.th-darmstadt.de/~neuss/www1/w4-main.html NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Autools/kent/kent.html NEW LINK FOUND: http://www.w3.org/pub/Conferences/WWW4/Papers/66/ NEW LINK FOUND: http://dbcl13.cs.ust.hk:8001/IndexServer/doc/paper66.html NEW LINK FOUND: http://dbcl13.cs.ust.hk:8001/IndexServer/doc/wwwindex.ps NEW LINK FOUND: http://www.w3.org/pub/Conferences/WWW4/Papers/300/ NEW LINK FOUND: http://info.webcrawler.com/mak/projects/robots/robots.html NEW LINK FOUND: 
http://info.webcrawler.com/mak/projects/robots/faq.html NEW LINK FOUND: http://info.webcrawler.com/mak/projects/robots/guidelines.html NEW LINK FOUND: http://info.webcrawler.com/mak/projects/robots/threat-or-treat.html NEW LINK FOUND: http://pubweb.nexor.co.uk/public/cusi/doc/simultaneous.html NEW LINK FOUND: http://lorne.stir.ac.uk:80/~jf1/papers/signidr.html NEW LINK FOUND: http://rbse.jsc.nasa.gov/agents/ NEW LINK FOUND: http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/HCI/glazer/glazer.html NEW LINK FOUND: http://www.cern.ch/PapersWWW94/reinpost.ps NEW LINK FOUND: http://www.win.tue.nl/win/cs/is/debra/tut/ NEW LINK FOUND: http://www5conf.inria.fr/fich_html/slides/panels/Panel10/overview.htm NEW LINK FOUND: http://www5conf.inria.fr/fich_html/slides/dday/agents/overview.htm NEW LINK FOUND: http://www.mccmedia.com/robots/ NEW LINK FOUND: http://www.dstc.edu.au/RDU/reports/webir.html NEW LINK FOUND: http://ciir.cs.umass.edu/nir96/ NEW LINK FOUND: http://www.obs-us.com/obs/english/papers/mont1.htm NEW LINK FOUND: http://www6.nttlab.com/HyperNews/get/PAPER267.html NEW LINK FOUND: http://www6.nttlab.com/HyperNews/get/PAPER205.html NEW LINK FOUND: http://www6.nttlab.com/HyperNews/get/PAPER96.html NEW LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER53.html NEW LINK FOUND: ftp://ftp.loc.gov/pub/z3950/articles/italy.ps NEW LINK FOUND: ftp://ftp.loc.gov/pub/z3950/articles/italy.pdf NEW LINK FOUND: http://www.dstc.edu.au/RDU/reports/QuestNet95.html NEW LINK FOUND: http://www.firstmonday.dk/issues/indexes/index2_3.html NEW LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER40.html NEW LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER222.html NEW LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER3.html NEW LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER68.html NEW LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER39.html NEW LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER206.html NEW LINK FOUND: http://www.objs.com/survey/crawl.htm NEW 
LINK FOUND: http://www6.nttlabs.com/HyperNews/get/PAPER118.html NEW LINK FOUND: http://www.macworld.com/features/pov.4.4.html NEW LINK FOUND: http://www.library.ucsb.edu/untangle/callery.html NEW LINK FOUND: http://www.ariadne.ac.uk/issue10/search-engines/ NEW LINK FOUND: http://debussy.cs.arizona.edu/sb/paper.html NEW LINK FOUND: http://www.tamu.edu//global_info/index-articles.html NEW LINK FOUND: http://lib-www.ucr.edu/pubs/navigato.html NEW LINK FOUND: http://www.gwdg.de/~hkuhn1/pagesuch.html NEW LINK FOUND: http://www.zdnet.com/products/searchuser.html NEW LINK FOUND: http://www.ub2.lu.se/desire/ NEW LINK FOUND: http://linnea.helsinki.fi/meta/ NEW LINK FOUND: http://www.lub.lu.se/dc/nmd_viewer.pl NEW LINK FOUND: http://www.ub2.lu.se/person_tk.html NEW LINK FOUND: mailto:Traugott.Koch@ub2.lu.se nach ParserDelegator() PARSER RUN 195 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.ambrosiasw.com/~fprefect/matrix/matrix.shtml 0 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lib.rug.ac.be/internet/search.html 1 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.searchenginewatch.com/ 2 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.searchenginewatch.com/features.htm 3 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.melee.com/mica/index.html 4 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.zdnet.com/pcmag/features/websearch/_open.htm 5 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.internetworld.com/1997/12/iwlabs.html 6 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.internetworld.com/cur/1997/09/report.html 7 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www4.zdnet.com/pccomp/features/excl0997/sear/sear.html 8 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.winona.msus.edu/is-f/library-f/webind2/webind2.htm 9 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.uga.edu/~seong/search.html 10 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.onlineinc.com/database/FebDB97/nets2.html 11 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.firstmonday.dk 12 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.stark.k12.oh.us/Docs/search/ 13 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.library.nwu.edu/resources/internet/search/evaluate.html 14 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.internetworld.com/1997/01/iwlabs.html 15 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.pcmag.com/iu/srchsite/_open.htm 16 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org/annual-96/ElectronicProceedings/chu.html 17 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.zdnet.com/pccomp/features/fea1096/sub2.html 18 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.curtin.edu.au/curtin/library/staffpages/gwpersonal/senginestudy/ 19 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.pcworld.com/engines.html 20 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.philb.com/msengine.htm 21 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.focus.de/DD/DDN/ddn.htm 22 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.imaginet.fr/~gmaire/search.htm 23 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.onlineinc.com/database/JuneDB/nets6.html 24 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://neal.ctstateu.edu:2001/htdocs/websearch.html 25 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.onlineinc.com/onlinemag/MayOL/zorn5.html 26 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hotwired.com/wired/4.05/indexing/index.html 27 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.internetworld.com/1996/05/showdown.html 28 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.internetworld.com/1996/05/guiding.html 29 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.library.ucsb.edu/untangle/eagan.html 30 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.chem.msu.su/eng/comparison.html 31 195 before addLink(isd.getUrls()[i]) 
after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cord.iupui.edu/~arzurmue/ 32 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lib-www.ucr.edu/pubs/navigato.html 33 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ukoln.bath.ac.uk/ariadne/issue2/engines/ 34 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.indiana.edu/~librcsd/search/ 35 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.wsulibs.wsu.edu/general/robots.htm 36 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.pcworld.com/reprints/lycos.htm 37 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cnet.com/Content/Reviews/Compare/Search/ 38 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.microsoft.com/usability/webconf/schlichting/schlichting.htm 39 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.zdnet.com/~pccomp/features/internet/search/sub1.html 40 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.imt.net/~notess/compeng.html 41 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sunsite.berkeley.edu/Help/searchdetails.html 42 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/NNC/libconf-archive/0039.html 43 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.bubl.bath.ac.uk/BUBL/IWinship.html 44 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.zdnet.com/~pccomp/features/internet/search/index.html 45 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.leeds.ac.uk/ucs/docs/fur14/fur14.html 46 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://issfw.palomar.edu/Library/TGSEARCH.HTM 47 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 gopher://online.lib.uic.edu:70/0F-1%3A1803%3A06.txt 48 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://burns.library.uvic.ca/CLA/CLA_overhead1.html 49 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://hackberry.chem.niu.edu:80/Infobahn/Paper23/ 50 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.internetworld.com/1995/04/feat41.htm 51 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ix.de/ix/raven/Web/9412/WebPoints.html 52 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://sunsite.unc.edu/cmc/mag/1994/sep/spiders.html 53 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/nav_menu.html 54 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/tk/websearch_systemat.html 55 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/tk/demos/BFD9602-en.html 56 195 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/tk/demos/DO9603-manus.html 57 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/tk/demos/DO9603-meng.html 58 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/desire/radar/reports/D3.11/ 59 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.wilpaterson.edu/home/staff/kwagner/eval.htm 60 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tiac.net/users/hope/findqual.html 61 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nlc-bnc.ca/pubs/netnotes/notes15.htm 62 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://magi.com/~mmelick/it96jan.htm 63 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/Conferences/WWW4/Papers/169/ 64 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://fuzine.mt.cs.cmu.edu/mlm/signidr94.html 65 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.wais.com/SIGNIDR 66 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://fuzine.mt.cs.cmu.edu/mlm/signidr94brief.html 67 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://info.webcrawler.com/bp/WWW94.html 68 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://info.webcrawler.com/signidr/WCOverview.html 69 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.colorado.edu/home/mcbryan/mypapers/www94.ps 70 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://arlo.wilsonhs.pps.k12.or.us/search.html 71 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.oise.on.ca/~mberns/RoboSearch.html 72 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mispress.com/websearch/webstoc.html 73 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mispress.com/websearch/websch4.html 74 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://argus-inc.com/searcher/index.html 75 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.concordia.ca/w3-paper.html 76 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://info.webcrawler.com/mak/projects/aliweb/paper-www94/paper.html 77 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www1.cern.ch/PapersWWW94/spider.ps 78 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/UB2proj/LIS_collection/lorcan.html 79 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mecklerweb.com/mags/ww/news/oct-95/news/1-6halves.html 80 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/debra/article.html 81 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.win.tue.nl/help/doc/demo.ps 82 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.library.ucsb.edu/untangle/lager.html 83 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.virage.com/wpaper/ 84 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://wwwqbic.almaden.ibm.com/ 85 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scu.edu.au/ausweb96/educn/barry1/paper.html 86 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mccmedia.com/search-list/ 87 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tiac.net/users/hope/tips/ 88 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tiac.net/users/hope/iw96/tips96.html 89 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sci.ouc.bc.ca/libr/connect96/search.htm 90 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.monash.com/spidap.html 91 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.macworld.com/pages/december.96/Column.2893.html 92 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tisl.ukans.edu/~sgauch/papers/JUCS96.html 93 195 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://userpage.fu-berlin.de/~angela/bond/browsing.htm 94 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dom.de/FreiRaum/uli/buch.htm 95 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ora.com/catalog/netresearch/noframes.html 96 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.winmag.com/library/1996/1196/11fc1.htm 97 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.winmag.com/library/1996/1196/11fc2.htm 98 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.netskills.ac.uk/resources/searching/index.shtml 99 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ukoln.bath.ac.uk/dlib/dlib/march97/bt/03pollock.html 100 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sciam.com/0397issue/0397intro.html 101 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.macworld.com/pages/december.96/Column.2893.html 102 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.uchicago.edu/~swain/pubs/TR-96-14.ps 103 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.uchicago.edu/~swain/pubs/TR-96-14.ps.Z 104 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.uchicago.edu/~swain/pubs/TR-96-14.pdf 105 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 http://www.cs.uchicago.edu/~swain/pubs/CVPR97sub.ps 106 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://osiris.sunderland.ac.uk/sst/se/ 107 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ukoln.ac.uk/ariadne/issue6/survey/ 108 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.searchinsider.com 109 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.accesscom.com/~ziegler/search.html 110 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub.uni-bielefeld.de/biblio/search/smkurs.htm 111 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html 112 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.newscientist.com/keysites/networld/lost.html 113 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.library.ucsb.edu/untangle/lager.html 114 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ariadne.ac.uk/issue9/search-engines/ 115 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.curtin.edu.au/curtin/library/staffpages/gwpersonal/senginestudy/zindex.htm 116 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.state.wi.us/agencies/dpi/www/search.html 117 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 http://login.eunet.no/~presno/bok/10.html 118 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ualberta.ca/~nfriesen/536/ 119 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sciam.com/0397issue/0397lynch.html 120 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cnet.com/Content/Features/Dlife/Search/ 121 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.pcworld.com/software/internet_www/articles/dec96/1412p182a.html 122 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.pscw.uva.nl/sociosite/SEARCH/about_searching.html 123 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://osiris.sunderland.ac.uk/sst/se/ 124 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://rs.internic.net/nic-support/nicnews/archive/september96/enduser.html 125 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://rs.internic.net/nic-support/nicnews/oct96/enduser.html 126 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www1.zdnet.com/complife/fea/9708/findny10.html 127 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www4.zdnet.com/pccomp/besttips/search.html 128 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://debussy.cs.arizona.edu/sb/paper.html 129 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.winona.msus.edu/is-f/library-f/webind2/webind2.htm 130 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html 131 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/WWW/Search/9605-Indexing-Workshop/ 132 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ukoln.bath.ac.uk/ariadne/issue3/acdc/ 133 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/W4/summary.html 134 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/desire/radar/reports/D3.12/ 135 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/schwartz.harvest/schwartz.harvest.html 136 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Agents/eichmann.ethical/eichmann.html 137 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/lim/coollist.html 138 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://stork.ukc.ac.uk/computer_science/Html/Pubs/IHC10-94/ 139 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.igd.fhg.de/www/www95/papers/82/html-files/discover.html 140 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.igd.fhg.de/www/www95/papers/47/fwsf/fwsf.html 141 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/Conferences/WWW4/Papers/134/ 142 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://union.ncsa.uiuc.edu/HyperNews/get/www/indexing.html 143 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.informatik.th-darmstadt.de/~neuss/www1/w4-main.html 144 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Autools/kent/kent.html 145 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/Conferences/WWW4/Papers/66/ 146 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://dbcl13.cs.ust.hk:8001/IndexServer/doc/paper66.html 147 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://dbcl13.cs.ust.hk:8001/IndexServer/doc/wwwindex.ps 148 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.w3.org/pub/Conferences/WWW4/Papers/300/ 149 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://info.webcrawler.com/mak/projects/robots/robots.html 150 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://info.webcrawler.com/mak/projects/robots/faq.html 151 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://info.webcrawler.com/mak/projects/robots/guidelines.html 152 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) 
URL added via addLink to crawl_level 2 http://info.webcrawler.com/mak/projects/robots/threat-or-treat.html 153 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://pubweb.nexor.co.uk/public/cusi/doc/simultaneous.html 154 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lorne.stir.ac.uk:80/~jf1/papers/signidr.html 155 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://rbse.jsc.nasa.gov/agents/ 156 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/HCI/glazer/glazer.html 157 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cern.ch/PapersWWW94/reinpost.ps 158 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.win.tue.nl/win/cs/is/debra/tut/ 159 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www5conf.inria.fr/fich_html/slides/panels/Panel10/overview.htm 160 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www5conf.inria.fr/fich_html/slides/dday/agents/overview.htm 161 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.mccmedia.com/robots/ 162 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dstc.edu.au/RDU/reports/webir.html 163 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu/nir96/ 164 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.obs-us.com/obs/english/papers/mont1.htm 165 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlab.com/HyperNews/get/PAPER267.html 166 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlab.com/HyperNews/get/PAPER205.html 167 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlab.com/HyperNews/get/PAPER96.html 168 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlabs.com/HyperNews/get/PAPER53.html 169 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.loc.gov/pub/z3950/articles/italy.ps 170 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.loc.gov/pub/z3950/articles/italy.pdf 171 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dstc.edu.au/RDU/reports/QuestNet95.html 172 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.firstmonday.dk/issues/indexes/index2_3.html 173 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlabs.com/HyperNews/get/PAPER40.html 174 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlabs.com/HyperNews/get/PAPER222.html 175 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlabs.com/HyperNews/get/PAPER3.html 176 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www6.nttlabs.com/HyperNews/get/PAPER68.html 177 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlabs.com/HyperNews/get/PAPER39.html 178 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlabs.com/HyperNews/get/PAPER206.html 179 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.objs.com/survey/crawl.htm 180 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www6.nttlabs.com/HyperNews/get/PAPER118.html 181 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.macworld.com/features/pov.4.4.html 182 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.library.ucsb.edu/untangle/callery.html 183 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ariadne.ac.uk/issue10/search-engines/ 184 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://debussy.cs.arizona.edu/sb/paper.html 185 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tamu.edu//global_info/index-articles.html 186 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lib-www.ucr.edu/pubs/navigato.html 187 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.gwdg.de/~hkuhn1/pagesuch.html 188 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.zdnet.com/products/searchuser.html 189 195 before addLink(isd.getUrls()[i]) 
after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/desire/ 190 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://linnea.helsinki.fi/meta/ 191 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lub.lu.se/dc/nmd_viewer.pl 192 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ub2.lu.se/person_tk.html 193 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:Traugott.Koch@ub2.lu.se 194 195 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN web.syr.edu ROBOT - SAVE CON OPENED: http://web.syr.edu/~mdtaffet/nlp_sites.html CONTENT GOT: text/html before ParserDelegator() NEW LINK FOUND: http://istweb.syr.edu/facstaff/facultymember.asp?id=37 NEW LINK FOUND: http://www.syr.edu/ NEW LINK FOUND: http://istweb.syr.edu/ NEW LINK FOUND: http://www.cs.bham.ac.uk/~pxc/nlpa/nlpgloss.html NEW LINK FOUND: http://www.cse.unsw.edu.au/~billw/nlpdict.html NEW LINK FOUND: http://donelaitis.vdu.lt/publikacijos/SDoCL.htm NEW LINK FOUND: http://www.itl.nist.gov/iaui/894.02/related_projects/tipster/gloss.htm NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/amalgam/amalgsoft.html NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/tagsets/brown.html NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/tagsets/ice.html NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/tagsets/llc.html NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/tagsets/lob.html NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/tagsets/parts.html NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/tagsets/pow.html NEW LINK FOUND: http://www.scs.leeds.ac.uk/amalgam/tagsets/sec.html NEW LINK FOUND: 
http://www.rdues.liv.ac.uk/ NEW LINK FOUND: http://www.rdues.liv.ac.uk/april.shtml NEW LINK FOUND: http://www.rdues.liv.ac.uk/aprdemo/ NEW LINK FOUND: http://www.rdues.liv.ac.uk/aprdemo/dbintro.html NEW LINK FOUND: http://www.rdues.liv.ac.uk/aprdemo/dbadvanced.html NEW LINK FOUND: http://www.rdues.liv.ac.uk/aprdemo/dbstats.html NEW LINK FOUND: http://www.ling.gu.se/~lager/index.html NEW LINK FOUND: http://www.ling.gu.se/~lager/mutbl.html NEW LINK FOUND: http://www.ling.gu.se/~lager/Mutbl/demo.html NEW LINK FOUND: http://www.ling.gu.se/~lager/Home/brilltagger_ui.html NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/ucrel/ NEW LINK FOUND: http://www.lancs.ac.uk/ NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/ucrel/claws/ NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/ucrel/claws/trial.html NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/ucrel/claws/ NEW LINK FOUND: http://www.comp.lancs.ac.uk/ucrel/claws5tags.html NEW LINK FOUND: http://www.comp.lancs.ac.uk/ucrel/claws7tags.html NEW LINK FOUND: http://titania.cobuild.collins.co.uk/form.html NEW LINK FOUND: http://www.ling.gu.se/~lager/index.html NEW LINK FOUND: http://www.ling.gu.se/~lager/mutbl.html NEW LINK FOUND: http://www.ling.gu.se/~lager/Mutbl/demo.html NEW LINK FOUND: http://www.ling.gu.se/~lager/Home/cgtagger_ui.html NEW LINK FOUND: http://www.cs.brandeis.edu/~paulb/CoreLex/overview.html NEW LINK FOUND: http://www.dfki.de/~paulb/corelex.html NEW LINK FOUND: http://www.lingsoft.fi/cgi-pub/engcg NEW LINK FOUND: http://www.lingsoft.fi/doc/engcg/intro/mtags.html NEW LINK FOUND: http://www.lingsoft.fi/doc/engcg/intro/stags.html NEW LINK FOUND: http://www.lingsoft.fi/doc/engcg/intro/ NEW LINK FOUND: http://www.conexor.fi/anlp97/anlp97.html NEW LINK FOUND: http://www.conexor.fi/testing.html NEW LINK FOUND: http://www.conexor.fi/engcg2.html NEW LINK FOUND: http://www.conexor.fi/testing.html NEW LINK FOUND: http://www.conexor.fi/fdg.html NEW LINK FOUND: 
http://www.conexor.fi/anlp97/anlp97.html NEW LINK FOUND: http://www.conexor.fi/testing.html NEW LINK FOUND: http://www.conexor.fi/np-parser.html NEW LINK FOUND: http://www.lingsoft.fi/cgi-pub/engtwol NEW LINK FOUND: http://www.lingsoft.fi/doc/engcg/intro/mtags.html NEW LINK FOUND: http://lcweb2.loc.gov/law/GLINv1/GLIN.html NEW LINK FOUND: http://cimic.rutgers.edu/~holowcza/glin/ling/ NEW LINK FOUND: http://cimic.rutgers.edu/~holowcza/glin/ling/query.html NEW LINK FOUND: http://cimic.rutgers.edu/cgi-bin/cgiwrap/holowcza/ling/db/index.pl NEW LINK FOUND: http://www.ltg.ed.ac.uk/~jo/interarbora/ NEW LINK FOUND: http://www.ltg.ed.ac.uk/~jo/interarbora/general.html NEW LINK FOUND: http://www.comp.nus.edu.sg/~qiul/NLPTools/JavaRAP.html NEW LINK FOUND: http://www.comp.nus.edu.sg/%7Eqiul/Publications/lrec04_JavaRAP.pdf NEW LINK FOUND: http://www-appn.comp.nus.edu.sg/%7Erpnlpir/cgi-bin/JavaRAP/JavaRAPdemo.html NEW LINK FOUND: http://www.link.cs.cmu.edu/link/index.html NEW LINK FOUND: http://www.link.cs.cmu.edu/link/submit-sentence-4.html NEW LINK FOUND: http://www.link.cs.cmu.edu/link/explain-output.html NEW LINK FOUND: http://www.ltg.ed.ac.uk/software/posdemo.html NEW LINK FOUND: http://www.ltg.ed.ac.uk/software/posdemo.html NEW LINK FOUND: http://www.ltg.ed.ac.uk/software/thistle/demos/index.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Accueil.en.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Accueil.fr.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Morpho/Morpho.en.cgi NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Morpho/Morpho.fr.cgi NEW LINK FOUND: http://www.teemapoint.com/ NEW LINK FOUND: http://www.teemapoint.com/nlpdemo/servlet/ParserServlet NEW LINK FOUND: http://www.teemapoint.com/nlpdemo/docs/about.html NEW LINK FOUND: http://cs.nyu.edu/cs/projects/lsp/ NEW LINK FOUND: http://www.ling.gu.se/~lager/index.html NEW LINK FOUND: http://www.ling.gu.se/~lager/mutbl.html NEW LINK FOUND: http://www.ling.gu.se/~lager/Mutbl/demo.html NEW LINK 
FOUND: http://www.ling.gu.se/~lager/Home/chunker_ui.html NEW LINK FOUND: http://perun.si.umich.edu/clair/ NEW LINK FOUND: http://perun.si.umich.edu/~radev/ NEW LINK FOUND: http://perun.si.umich.edu/~radev/nsir/nsir_test.cgi NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/paice/cdp.htm NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/ NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/paice/stemdemo.htm NEW LINK FOUND: http://www.clres.com/ NEW LINK FOUND: http://www.clres.com/cgi-bin/parse.cgi NEW LINK FOUND: http://www.clres.com/parse_interpret.html NEW LINK FOUND: http://clg1.bham.ac.uk/tagger.html NEW LINK FOUND: http://clg1.bham.ac.uk/tagger/tagset.html NEW LINK FOUND: http://www.cogilex.com/ NEW LINK FOUND: http://www.cogilex.com/online.asp NEW LINK FOUND: http://www.cogilex.com/frames.htm NEW LINK FOUND: http://www-rali.iro.umontreal.ca/SILC/SILC.en.cgi/2 NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Accueil.en.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Accueil.fr.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/SILC/SILC.en.cgi NEW LINK FOUND: http://www-rali.iro.umontreal.ca/SILC/SILC.fr.cgi NEW LINK FOUND: http://quebec.alis.com/castil/essai_silc.cgi?lang_form=en NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/tools/audiosummarizer NEW LINK FOUND: http://grid.let.rug.nl/~vannoord/TextCat/Demo/textcat.html NEW LINK FOUND: http://www.coli.uni-sb.de/~thorsten/tnt/ NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Accueil.en.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Accueil.fr.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/TrialDir/demo/index.cgi/lang=en NEW LINK FOUND: http://www-rali.iro.umontreal.ca/TrialDir/demo/index.cgi/lang=fr NEW LINK FOUND: http://www-rali.iro.umontreal.ca/TrialDir/demo/QetR.en.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/TrialDir/demo/QetR.fr.html NEW LINK FOUND: http://www-rali.iro.umontreal.ca/Trial/ NEW LINK 
FOUND: http://www.rdues.liv.ac.uk/ NEW LINK FOUND: http://www.webcorp.org.uk/ NEW LINK FOUND: http://www.ling.gu.se/~lager/index.html NEW LINK FOUND: http://www.ling.gu.se/~lager/mutbl.html NEW LINK FOUND: http://www.ling.gu.se/~lager/Mutbl/demo.html NEW LINK FOUND: http://www.ling.gu.se/~lager/Home/pwe_ui.html NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/fst/fsplay.html NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/fst/fsinput.html NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/tools/guesser NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/demos/english NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/demos/english NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/demos/english NEW LINK FOUND: http://www.cis.upenn.edu/~treebank/tokenization.html NEW LINK FOUND: http://www.xrce.xerox.com/publis/mltt/mltt-004.ps NEW LINK FOUND: http://www.ltg.ed.ac.uk/projects/ledtools/ale-ra/node6.html NEW LINK FOUND: http://www.phon.ucl.ac.uk/home/dick/mwg.htm NEW LINK FOUND: http://www.phil-fak.uni-duesseldorf.de/sfb282/C8/ NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/fsnlp/morph.html NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/general/index.htm NEW LINK FOUND: http://www.cs.jmu.edu/common/projects/Stemming/ NEW LINK FOUND: http://ciir.cs.umass.edu/whatsnew/stemming.html NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/general/dawson.htm NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/general/krovetz.htm NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/general/lovins.htm NEW LINK FOUND: http://www.cs.waikato.ac.nz/~eibe/stemmers/ NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/ NEW LINK FOUND: http://www.comp.lancs.ac.uk/computing/research/stemming/general/paice.htm NEW LINK FOUND: http://www.tartarus.org/~martin/PorterStemmer/index.html NEW LINK FOUND: 
http://www.comp.lancs.ac.uk/computing/research/stemming/general/porter.htm NEW LINK FOUND: http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2lemma.htm NEW LINK FOUND: http://www.kcl.ac.uk/humanities/cch/chwp/siemens2/siemens3.html NEW LINK FOUND: http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2pos.htm NEW LINK FOUND: ftp://ftp.cis.upenn.edu/pub/treebank/doc/tagguide.ps.gz NEW LINK FOUND: http://www.comp.lancs.ac.uk/ucrel/annotation.html#POS NEW LINK FOUND: http://www.ltg.ed.ac.uk/software/pos/#general NEW LINK FOUND: http://www.lynellen.com/pos.html NEW LINK FOUND: http://www.georgetown.edu/cball/ling361/tagging_overview.html NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/fsnlp/tagger.html NEW LINK FOUND: http://w3.arizona.edu/~ling/hh/522/LexSemBiblio.html NEW LINK FOUND: http://budling.nytud.hu/~kalman/reading/semantics.html NEW LINK FOUND: http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2parse.htm NEW LINK FOUND: http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2full.htm NEW LINK FOUND: ftp://ftp.cis.upenn.edu/pub/treebank/doc/arpa94.ps.gz NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/fsnlp/fsparsing.html NEW LINK FOUND: http://www.georgetown.edu/luperfoy/Discourse-Treebank/dri-home.html NEW LINK FOUND: http://www.sil.org/linguistics/RST/ NEW LINK FOUND: http://www.sigdial.org/ NEW LINK FOUND: http://ipra-www.uia.ac.be/ipra/ NEW LINK FOUND: http://www.dcs.shef.ac.uk/research/groups/nlp/pragmatics.html NEW LINK FOUND: http://www.summarization.com/~radev/u/db/acl/ NEW LINK FOUND: http://www.aclweb.org/ NEW LINK FOUND: http://cora.whizbang.com/Artificial_Intelligence/NLP/index.html NEW LINK FOUND: http://www.acm.org/pubs/corr/ NEW LINK FOUND: http://www.cs.columbia.edu/~radev/newacl/conferences.html NEW LINK FOUND: http://www.cicling.org/ NEW LINK FOUND: http://www.dcs.shef.ac.uk/research/ilash/iccl/ NEW LINK FOUND: http://www.dcs.shef.ac.uk/research/ilash/iccl/past_meetings.html NEW LINK FOUND: 
http://www.nodali.sics.se/bibliotek/kval/coling.html NEW LINK FOUND: http://www.eacl.org/ NEW LINK FOUND: http://issco-www.unige.ch/eacl/conf/ NEW LINK FOUND: http://www.cs.cornell.edu/home/llee/emnlp.html NEW LINK FOUND: http://www.cs.jhu.edu/~yarowsky/sigdat.html NEW LINK FOUND: http://parlevink.cs.utwente.nl/sigparse/ NEW LINK FOUND: http://www.aclweb.org/naacl NEW LINK FOUND: http://www.gte.com/anlp-naacl2000 NEW LINK FOUND: http://www.cs.cmu.edu/~ref/naacl2001.html NEW LINK FOUND: http://www.link.cs.cmu.edu/lexfn/ NEW LINK FOUND: http://morph.ldc.upenn.edu/annotation/ NEW LINK FOUND: http://www.hit.uib.no/corpora/welcome.txt NEW LINK FOUND: http://www.hit.uib.no/corpora/ NEW LINK FOUND: http://web.syr.edu/~mdtaffet/student_sites.html NEW LINK FOUND: http://www.aaai.org/AITopics/html/natlang.html NEW LINK FOUND: http://www.cs.columbia.edu/~radev/nlpfaq.txt NEW LINK FOUND: http://roadkill.scms.rgu.ac.uk/staff/asga/nat.html NEW LINK FOUND: http://www.cis.upenn.edu/~treebank/home.html NEW LINK FOUND: http://www.coli.uni-sb.de/~hansu/what_is_cl.html NEW LINK FOUND: http://www.hum.uva.nl/~ewn/ NEW LINK FOUND: ftp://ftp.cogsci.princeton.edu/pub/wordnet/5papers.pdf NEW LINK FOUND: http://www.cogsci.princeton.edu/~wn/ NEW LINK FOUND: http://www.conexor.fi/testing.html NEW LINK FOUND: http://crl.nmsu.edu/cgi-bin/Tools/CLR/clrcat NEW LINK FOUND: http://www.hltcentral.org/htmlengine.shtml?id=239 NEW LINK FOUND: http://www.ifi.unizh.ch/CL/InteractiveTools.html NEW LINK FOUND: http://www.ltg.ed.ac.uk/software/index.html NEW LINK FOUND: http://www.lingsoft.fi/demos.html NEW LINK FOUND: http://www.rxrc.xerox.com/research/mltt/toolhome NEW LINK FOUND: http://www.ling.gu.se/~lager/Mutbl/demo.html NEW LINK FOUND: http://www.aaai.org/Repository-Mirror/nlp-tools.html NEW LINK FOUND: http://registry.dfki.de/ NEW LINK FOUND: http://cslp.comp.nus.edu.sg/CS6207/course/nlpres.html NEW LINK FOUND: http://www.cam.sri.com/html/demos.html NEW LINK FOUND: 
http://www-nlp.stanford.edu/links/statnlp.html NEW LINK FOUND: http://titania.cobuild.collins.co.uk/ NEW LINK FOUND: http://www.hd.uib.no/icame.html NEW LINK FOUND: http://www.ldc.upenn.edu/ NEW LINK FOUND: http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/contents.htm NEW LINK FOUND: http://www.site.uottawa.ca/~mjarmasz/corpus/index.html NEW LINK FOUND: http://donelaitis.vdu.lt/publikacijos/SDoCL3.htm NEW LINK FOUND: http://www.ife.dk/url-cl.htm NEW LINK FOUND: http://www.ccl.umist.ac.uk/teaching/material/3009/intro/ NEW LINK FOUND: http://www.ruf.rice.edu/~barlow/corpus.html NEW LINK FOUND: http://solaris3.ids-mannheim.de/ijcl/ NEW LINK FOUND: http://tractor.bham.ac.uk/ijcl/teubert_cl.html NEW LINK FOUND: http://www.cst.ku.dk/dan/corpus.html NEW LINK FOUND: http://www.clres.com/ NEW LINK FOUND: http://www-personal.umich.edu/~jlawler/levin.html NEW LINK FOUND: http://www-personal.umich.edu/~jlawler/levin.verbs NEW LINK FOUND: http://www.isi.edu/natural-language/nlp-at-isi.html NEW LINK FOUND: http://www.itl.nist.gov/iaui/894.02/ NEW LINK FOUND: http://www.rxrc.xerox.com/publis/mltt/mltttech.html NEW LINK FOUND: http://www.rxrc.xerox.com/publis/mltt/mlttart.html NEW LINK FOUND: http://www.clres.com/siglex.html NEW LINK FOUND: http://stp.ling.uu.se/~torbjorn/Mutbl/bibliography.html NEW LINK FOUND: http://trec.nist.gov/ NEW LINK FOUND: http://www3.pitt.edu/~jehst49/smallback.html NEW LINK FOUND: http://member.bcentral.com/cgi-bin/fc/fastcounter-login?1759178 NEW LINK FOUND: http://fastcounter.bcentral.com/fc-join NEW LINK FOUND: http://fastcounter.bcentral.com NEW LINK FOUND: mailto:mdtaffet@syr.edu nach ParserDelegator() PARSER RUN 229 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://istweb.syr.edu/facstaff/facultymember.asp?id=37 0 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.syr.edu/ 1 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://istweb.syr.edu/ 2 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.bham.ac.uk/~pxc/nlpa/nlpgloss.html 3 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cse.unsw.edu.au/~billw/nlpdict.html 4 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://donelaitis.vdu.lt/publikacijos/SDoCL.htm 5 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.itl.nist.gov/iaui/894.02/related_projects/tipster/gloss.htm 6 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/amalgam/amalgsoft.html 7 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/tagsets/brown.html 8 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/tagsets/ice.html 9 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/tagsets/llc.html 10 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/tagsets/lob.html 11 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/tagsets/parts.html 12 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/tagsets/pow.html 13 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.scs.leeds.ac.uk/amalgam/tagsets/sec.html 14 229 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rdues.liv.ac.uk/ 15 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rdues.liv.ac.uk/april.shtml 16 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rdues.liv.ac.uk/aprdemo/ 17 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rdues.liv.ac.uk/aprdemo/dbintro.html 18 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rdues.liv.ac.uk/aprdemo/dbadvanced.html 19 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rdues.liv.ac.uk/aprdemo/dbstats.html 20 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/index.html 21 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/mutbl.html 22 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Mutbl/demo.html 23 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Home/brilltagger_ui.html 24 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/ucrel/ 25 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lancs.ac.uk/ 26 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.comp.lancs.ac.uk/computing/research/ucrel/claws/ 27 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/ucrel/claws/trial.html 28 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/ucrel/claws/ 29 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/ucrel/claws5tags.html 30 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/ucrel/claws7tags.html 31 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://titania.cobuild.collins.co.uk/form.html 32 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/index.html 33 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/mutbl.html 34 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Mutbl/demo.html 35 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Home/cgtagger_ui.html 36 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.brandeis.edu/~paulb/CoreLex/overview.html 37 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dfki.de/~paulb/corelex.html 38 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/cgi-pub/engcg 39 229 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/doc/engcg/intro/mtags.html 40 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/doc/engcg/intro/stags.html 41 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/doc/engcg/intro/ 42 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/anlp97/anlp97.html 43 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/testing.html 44 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/engcg2.html 45 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/testing.html 46 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/fdg.html 47 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/anlp97/anlp97.html 48 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/testing.html 49 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/np-parser.html 50 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/cgi-pub/engtwol 51 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/doc/engcg/intro/mtags.html 52 229 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lcweb2.loc.gov/law/GLINv1/GLIN.html 53 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cimic.rutgers.edu/~holowcza/glin/ling/ 54 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cimic.rutgers.edu/~holowcza/glin/ling/query.html 55 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cimic.rutgers.edu/cgi-bin/cgiwrap/holowcza/ling/db/index.pl 56 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/~jo/interarbora/ 57 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/~jo/interarbora/general.html 58 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.nus.edu.sg/~qiul/NLPTools/JavaRAP.html 59 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.nus.edu.sg/%7Eqiul/Publications/lrec04_JavaRAP.pdf 60 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-appn.comp.nus.edu.sg/%7Erpnlpir/cgi-bin/JavaRAP/JavaRAPdemo.html 61 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.link.cs.cmu.edu/link/index.html 62 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.link.cs.cmu.edu/link/submit-sentence-4.html 63 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.link.cs.cmu.edu/link/explain-output.html 64 229 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/software/posdemo.html 65 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/software/posdemo.html 66 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/software/thistle/demos/index.html 67 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Accueil.en.html 68 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Accueil.fr.html 69 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Morpho/Morpho.en.cgi 70 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Morpho/Morpho.fr.cgi 71 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.teemapoint.com/ 72 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.teemapoint.com/nlpdemo/servlet/ParserServlet 73 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.teemapoint.com/nlpdemo/docs/about.html 74 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cs.nyu.edu/cs/projects/lsp/ 75 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/index.html 76 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.ling.gu.se/~lager/mutbl.html 77 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Mutbl/demo.html 78 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Home/chunker_ui.html 79 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://perun.si.umich.edu/clair/ 80 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://perun.si.umich.edu/~radev/ 81 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://perun.si.umich.edu/~radev/nsir/nsir_test.cgi 82 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/paice/cdp.htm 83 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/ 84 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/paice/stemdemo.htm 85 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clres.com/ 86 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clres.com/cgi-bin/parse.cgi 87 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clres.com/parse_interpret.html 88 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://clg1.bham.ac.uk/tagger.html 89 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) 
URL added via addLink to crawl_level 2 http://clg1.bham.ac.uk/tagger/tagset.html 90 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cogilex.com/ 91 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cogilex.com/online.asp 92 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cogilex.com/frames.htm 93 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/SILC/SILC.en.cgi/2 94 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Accueil.en.html 95 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Accueil.fr.html 96 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/SILC/SILC.en.cgi 97 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/SILC/SILC.fr.cgi 98 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://quebec.alis.com/castil/essai_silc.cgi?lang_form=en 99 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/tools/audiosummarizer 100 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://grid.let.rug.nl/~vannoord/TextCat/Demo/textcat.html 101 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.coli.uni-sb.de/~thorsten/tnt/ 102 229 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Accueil.en.html 103 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Accueil.fr.html 104 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/TrialDir/demo/index.cgi/lang=en 105 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/TrialDir/demo/index.cgi/lang=fr 106 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/TrialDir/demo/QetR.en.html 107 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/TrialDir/demo/QetR.fr.html 108 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-rali.iro.umontreal.ca/Trial/ 109 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rdues.liv.ac.uk/ 110 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.webcorp.org.uk/ 111 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/index.html 112 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/mutbl.html 113 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Mutbl/demo.html 114 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to 
crawl_level 2 http://www.ling.gu.se/~lager/Home/pwe_ui.html 115 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/fst/fsplay.html 116 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/fst/fsinput.html 117 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/tools/guesser 118 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/demos/english 119 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/demos/english 120 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/demos/english 121 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cis.upenn.edu/~treebank/tokenization.html 122 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.xrce.xerox.com/publis/mltt/mltt-004.ps 123 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/projects/ledtools/ale-ra/node6.html 124 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.phon.ucl.ac.uk/home/dick/mwg.htm 125 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.phil-fak.uni-duesseldorf.de/sfb282/C8/ 126 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.rxrc.xerox.com/research/mltt/fsnlp/morph.html 127 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/general/index.htm 128 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.jmu.edu/common/projects/Stemming/ 129 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ciir.cs.umass.edu/whatsnew/stemming.html 130 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/general/dawson.htm 131 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/general/krovetz.htm 132 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/general/lovins.htm 133 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.waikato.ac.nz/~eibe/stemmers/ 134 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/ 135 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/general/paice.htm 136 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.tartarus.org/~martin/PorterStemmer/index.html 137 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/computing/research/stemming/general/porter.htm 138 229 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2lemma.htm 139 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.kcl.ac.uk/humanities/cch/chwp/siemens2/siemens3.html 140 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2pos.htm 141 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.cis.upenn.edu/pub/treebank/doc/tagguide.ps.gz 142 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.comp.lancs.ac.uk/ucrel/annotation.html#POS 143 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/software/pos/#general 144 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lynellen.com/pos.html 145 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.georgetown.edu/cball/ling361/tagging_overview.html 146 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/fsnlp/tagger.html 147 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://w3.arizona.edu/~ling/hh/522/LexSemBiblio.html 148 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://budling.nytud.hu/~kalman/reading/semantics.html 149 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2parse.htm 150 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/corpus2/2full.htm 151 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.cis.upenn.edu/pub/treebank/doc/arpa94.ps.gz 152 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/fsnlp/fsparsing.html 153 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.georgetown.edu/luperfoy/Discourse-Treebank/dri-home.html 154 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sil.org/linguistics/RST/ 155 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.sigdial.org/ 156 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ipra-www.uia.ac.be/ipra/ 157 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/research/groups/nlp/pragmatics.html 158 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.summarization.com/~radev/u/db/acl/ 159 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aclweb.org/ 160 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cora.whizbang.com/Artificial_Intelligence/NLP/index.html 161 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.acm.org/pubs/corr/ 162 229 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~radev/newacl/conferences.html 163 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cicling.org/ 164 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/research/ilash/iccl/ 165 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.dcs.shef.ac.uk/research/ilash/iccl/past_meetings.html 166 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.nodali.sics.se/bibliotek/kval/coling.html 167 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.eacl.org/ 168 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://issco-www.unige.ch/eacl/conf/ 169 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cornell.edu/home/llee/emnlp.html 170 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.jhu.edu/~yarowsky/sigdat.html 171 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://parlevink.cs.utwente.nl/sigparse/ 172 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aclweb.org/naacl 173 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.gte.com/anlp-naacl2000 174 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~ref/naacl2001.html 175 229 before 
addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.link.cs.cmu.edu/lexfn/ 176 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://morph.ldc.upenn.edu/annotation/ 177 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hit.uib.no/corpora/welcome.txt 178 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hit.uib.no/corpora/ 179 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://web.syr.edu/~mdtaffet/student_sites.html 180 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/AITopics/html/natlang.html 181 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.columbia.edu/~radev/nlpfaq.txt 182 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://roadkill.scms.rgu.ac.uk/staff/asga/nat.html 183 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cis.upenn.edu/~treebank/home.html 184 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.coli.uni-sb.de/~hansu/what_is_cl.html 185 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hum.uva.nl/~ewn/ 186 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 ftp://ftp.cogsci.princeton.edu/pub/wordnet/5papers.pdf 187 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cogsci.princeton.edu/~wn/ 188 229 
before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.conexor.fi/testing.html 189 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://crl.nmsu.edu/cgi-bin/Tools/CLR/clrcat 190 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hltcentral.org/htmlengine.shtml?id=239 191 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ifi.unizh.ch/CL/InteractiveTools.html 192 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ltg.ed.ac.uk/software/index.html 193 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.lingsoft.fi/demos.html 194 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/research/mltt/toolhome 195 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.gu.se/~lager/Mutbl/demo.html 196 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.aaai.org/Repository-Mirror/nlp-tools.html 197 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://registry.dfki.de/ 198 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://cslp.comp.nus.edu.sg/CS6207/course/nlpres.html 199 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cam.sri.com/html/demos.html 200 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www-nlp.stanford.edu/links/statnlp.html 201 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://titania.cobuild.collins.co.uk/ 202 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.hd.uib.no/icame.html 203 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ldc.upenn.edu/ 204 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ling.lancs.ac.uk/monkey/ihe/linguistics/contents.htm 205 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.site.uottawa.ca/~mjarmasz/corpus/index.html 206 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://donelaitis.vdu.lt/publikacijos/SDoCL3.htm 207 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ife.dk/url-cl.htm 208 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ccl.umist.ac.uk/teaching/material/3009/intro/ 209 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ruf.rice.edu/~barlow/corpus.html 210 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://solaris3.ids-mannheim.de/ijcl/ 211 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://tractor.bham.ac.uk/ijcl/teubert_cl.html 212 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cst.ku.dk/dan/corpus.html 213 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via 
addLink to crawl_level 2 http://www.clres.com/ 214 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-personal.umich.edu/~jlawler/levin.html 215 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-personal.umich.edu/~jlawler/levin.verbs 216 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.isi.edu/natural-language/nlp-at-isi.html 217 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.itl.nist.gov/iaui/894.02/ 218 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/publis/mltt/mltttech.html 219 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.rxrc.xerox.com/publis/mltt/mlttart.html 220 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.clres.com/siglex.html 221 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://stp.ling.uu.se/~torbjorn/Mutbl/bibliography.html 222 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://trec.nist.gov/ 223 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www3.pitt.edu/~jehst49/smallback.html 224 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://member.bcentral.com/cgi-bin/fc/fastcounter-login?1759178 225 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://fastcounter.bcentral.com/fc-join 226 229 before addLink(isd.getUrls()[i]) after 
addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://fastcounter.bcentral.com 227 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:mdtaffet@syr.edu 228 229 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN etupc19.wiwi.uni-karlsruhe.de ROBOT - SAVE CON OPENED: http://etupc19.wiwi.uni-karlsruhe.de/webmining/bib/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.ualberta.ca java.lang.NullPointerException - http://etupc19.wiwi.uni-karlsruhe.de/webmining/bib/ ROBOT - SAVE CON OPENED: http://www.cs.ualberta.ca/~tszhu/webmining.htm CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN xtasy.lib.indiana.edu javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.cs.ualberta.ca/~tszhu/webmining.htm ROBOT - SAVE CON OPENED: http://xtasy.lib.indiana.edu/jmdocs/irsys.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://xcite.lib.indiana.edu/jmdocs/papersi.html NEW LINK FOUND: http://ezinfo.ucs.indiana.edu/~harter/colinked.html NEW LINK FOUND: http://www.asis.org NEW LINK FOUND: http://www.asis.org/Bulletin/Jun-95/cdavis.html NEW LINK FOUND: http://www.ics.hawaii.edu/um-96/ NEW LINK FOUND: http://www.ubilab.ubs.ch/sigir96/welcome.html NEW LINK FOUND: http://ezinfo.ucs.indiana.edu/~harter/arist.html NEW LINK FOUND: http://memex.lib.indiana.edu/Research/shaw-mla.html NEW LINK FOUND: http://www-slis.lib.indiana.edu/HomePages/davisc.html NEW LINK FOUND: http://ezinfo.ucs.indiana.edu/~harter/home.html NEW LINK FOUND: http://xcite.lib.indiana.edu/ NEW LINK FOUND: http://www-slis.lib.indiana.edu/HomePages/upriss.html NEW LINK FOUND: http://www-slis.lib.indiana.edu/HomePages/shawd.html NEW LINK FOUND: 
http://xcite.lib.indiana.edu/jmdocs/restext.html NEW LINK FOUND: http://ezinfo.ucs.indiana.edu/~harter/info_retrieval.htm nach ParserDelegator() PARSER RUN 15 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://xcite.lib.indiana.edu/jmdocs/papersi.html 0 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ezinfo.ucs.indiana.edu/~harter/colinked.html 1 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org 2 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.asis.org/Bulletin/Jun-95/cdavis.html 3 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ics.hawaii.edu/um-96/ 4 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.ubilab.ubs.ch/sigir96/welcome.html 5 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ezinfo.ucs.indiana.edu/~harter/arist.html 6 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://memex.lib.indiana.edu/Research/shaw-mla.html 7 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-slis.lib.indiana.edu/HomePages/davisc.html 8 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ezinfo.ucs.indiana.edu/~harter/home.html 9 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://xcite.lib.indiana.edu/ 10 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-slis.lib.indiana.edu/HomePages/upriss.html 11 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) 
URL added via addLink to crawl_level 2 http://www-slis.lib.indiana.edu/HomePages/shawd.html 12 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://xcite.lib.indiana.edu/jmdocs/restext.html 13 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://ezinfo.ucs.indiana.edu/~harter/info_retrieval.htm 14 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.muc.saic.com ROBOT - SAVE CON OPENED: http://www.muc.saic.com/ 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.mu.oz.au java.lang.NullPointerException - http://www.muc.saic.com/ ROBOT - SAVE CON OPENED: http://www.cs.mu.oz.au/~alistair/ CONTENT GOT: text/html vor ParserDelegator() PARSER RUN 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.mu.oz.au javax.swing.text.ChangedCharSetException java.lang.NullPointerException - http://www.cs.mu.oz.au/~alistair/ ROBOT - SAVE CON OPENED: http://www.cs.mu.oz.au/~alistair/exploring/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.cs.rmit.edu.au/~jz/ NEW LINK FOUND: http://www.cs.mu.oz.au/~alistair NEW LINK FOUND: http://www.cs.mu.oz.au/~alistair/abstracts/zm98:forum.html NEW LINK FOUND: http://www.cs.mu.oz.au/~alistair/exploring/ nach ParserDelegator() PARSER RUN 4 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.cs.rmit.edu.au/~jz/ 0 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.mu.oz.au/~alistair 1 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.mu.oz.au/~alistair/abstracts/zm98:forum.html 2 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.cs.mu.oz.au/~alistair/exploring/ 3 4 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-users.cs.umn.edu ROBOT - SAVE CON OPENED: http://www-users.cs.umn.edu/~mobasher/webminer/survey/survey.html CONTENT GOT: text/html vor ParserDelegator() nach ParserDelegator() PARSER RUN 0 crawl_level: 1 MAX_CRAWL_DEPTH: 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.umbc.edu ROBOT - SAVE CON OPENED: http://www.cs.umbc.edu/~mayfield/ngrams.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.c3.lanl.gov:8075/cgi/casey/revdict/ NEW LINK FOUND: http://www.cs.umbc.edu/~elm/Projects/telltale/ NEW LINK FOUND: http://potomac.ncsl.nist.gov/TREC/trec3.papers/cavnar_ngram_94.ps NEW LINK FOUND: http://potomac.ncsl.nist.gov/TREC/t3_proceedings.html NEW LINK FOUND: http://xxx.lanl.gov/cmp-lg/ACL-95-proceedings.html#tut-ngrams NEW LINK FOUND: http://xxx.lanl.gov/cmp-lg/ACL-95-proceedings.html NEW LINK FOUND: http://www.cs.umbc.edu/cikm/iia/submitted/viewing/crowder.ps NEW LINK FOUND: http://www.cs.umbc.edu/cikm/iia/proc.html NEW LINK FOUND: http://potomac.ncsl.nist.gov/TREC/t3_proceedings.html NEW LINK FOUND: http://www.cs.umbc.edu/cikm/1993/program NEW LINK FOUND: http://patent.womplex.ibm.com/ NEW LINK FOUND: http://www.seas.gwu.edu/student/chulee/ NEW LINK FOUND: http://www.musicbase.co.uk/music//orbit/ngram/ngram_main.html NEW LINK FOUND: mailto:james.mayfield@jhuapl.edu NEW LINK FOUND: http://www.cs.umbc.edu/~mayfield/personal-info.html nach ParserDelegator() PARSER RUN 15 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.c3.lanl.gov:8075/cgi/casey/revdict/ 0 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/~elm/Projects/telltale/ 1 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL 
added via addLink to crawl_level 2 http://potomac.ncsl.nist.gov/TREC/trec3.papers/cavnar_ngram_94.ps 2 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://potomac.ncsl.nist.gov/TREC/t3_proceedings.html 3 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://xxx.lanl.gov/cmp-lg/ACL-95-proceedings.html#tut-ngrams 4 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://xxx.lanl.gov/cmp-lg/ACL-95-proceedings.html 5 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/cikm/iia/submitted/viewing/crowder.ps 6 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/cikm/iia/proc.html 7 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://potomac.ncsl.nist.gov/TREC/t3_proceedings.html 8 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/cikm/1993/program 9 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://patent.womplex.ibm.com/ 10 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.seas.gwu.edu/student/chulee/ 11 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.musicbase.co.uk/music//orbit/ngram/ngram_main.html 12 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:james.mayfield@jhuapl.edu 13 15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/~mayfield/personal-info.html 14 
15 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN websom.hut.fi ROBOT - SAVE CON OPENED: http://websom.hut.fi/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: mailto:websom@websom.hut.fi nach ParserDelegator() PARSER RUN 1 crawl_level: 1 MAX_CRAWL_DEPTH: 2 mailto:websom@websom.hut.fi 0 1 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.cs.umbc.edu ROBOT - SAVE CON OPENED: http://www.cs.umbc.edu/abir/ CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.cs.umbc.edu/~ian/ NEW LINK FOUND: mailto:ian@cs.umbc.edu NEW LINK FOUND: http://www.cs.umbc.edu/~ian/agent-ir.html NEW LINK FOUND: http://www.cs.washington.edu/homes/etzioni/ NEW LINK FOUND: http://snoopy.cs.umbc.edu:8080/web-agent/ NEW LINK FOUND: http://www.cs.umd.edu/projects/plus/SHOE NEW LINK FOUND: http://www.cs.dartmouth.edu/~rus/ NEW LINK FOUND: http://www.cs.dartmouth.edu/~rus/papers/info/info.html NEW LINK FOUND: http://www-nlp.cs.umass.edu/ NEW LINK FOUND: http://ciir.cs.umass.edu/ NEW LINK FOUND: http://www.public.iastate.edu/~CYBERSTACKS/Aristotle.htm NEW LINK FOUND: http://www.cs.cmu.edu/~softagents/index.html NEW LINK FOUND: http://www.cs.cmu.edu/~softagents/dvina/ NEW LINK FOUND: http://www.cs.cmu.edu/~softagents/webmate/ NEW LINK FOUND: http://www.cs.washington.edu/research/projects/softbots/www/softbots.html NEW LINK FOUND: http://robotics.stanford.edu/groups/nobots/ NEW LINK FOUND: http://robotics.stanford.edu/groups/nobotics/home.html NEW LINK FOUND: http://www.cs.umbc.edu/kqml/ NEW LINK FOUND: http://www.cs.umbc.edu/ngram/ NEW LINK FOUND: http://superbook.bellcore.com/~std/LSI.html NEW LINK FOUND: http://www.cs.umd.edu/projects/plus/SHOE/ NEW LINK FOUND: http://www.agentsoft.com/xml/ NEW LINK 
FOUND: http://www.cs.ruu.nl/people/theo/abstract_agents2.html NEW LINK FOUND: http://lcs.www.media.mit.edu/groups/agents/research.html NEW LINK FOUND: http://www.media.mit.edu/ NEW LINK FOUND: http://robotics.Stanford.EDU/~marko/ NEW LINK FOUND: http://fab.stanford.edu/ NEW LINK FOUND: http://www.alexa.com/ NEW LINK FOUND: http://www-diglib.stanford.edu/ NEW LINK FOUND: http://www.si.umich.edu/UMDL/ NEW LINK FOUND: http://www.si.umich.edu/UMDL/nsfdlsites.html NEW LINK FOUND: http://www.si.umich.edu/UMDL/otherdlsites.html NEW LINK FOUND: http://www.si.umich.edu/UMDL/otherdlpubs.html nach ParserDelegator() PARSER RUN 33 crawl_level: 1 MAX_CRAWL_DEPTH: 2 http://www.cs.umbc.edu/~ian/ 0 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 mailto:ian@cs.umbc.edu 1 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/~ian/agent-ir.html 2 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.washington.edu/homes/etzioni/ 3 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://snoopy.cs.umbc.edu:8080/web-agent/ 4 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/projects/plus/SHOE 5 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.dartmouth.edu/~rus/ 6 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.dartmouth.edu/~rus/papers/info/info.html 7 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-nlp.cs.umass.edu/ 8 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://ciir.cs.umass.edu/ 9 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.public.iastate.edu/~CYBERSTACKS/Aristotle.htm 10 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~softagents/index.html 11 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~softagents/dvina/ 12 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.cmu.edu/~softagents/webmate/ 13 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.washington.edu/research/projects/softbots/www/softbots.html 14 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/groups/nobots/ 15 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.stanford.edu/groups/nobotics/home.html 16 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/kqml/ 17 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umbc.edu/ngram/ 18 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://superbook.bellcore.com/~std/LSI.html 19 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.cs.umd.edu/projects/plus/SHOE/ 20 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.agentsoft.com/xml/ 21 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 
http://www.cs.ruu.nl/people/theo/abstract_agents2.html 22 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://lcs.www.media.mit.edu/groups/agents/research.html 23 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.media.mit.edu/ 24 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://robotics.Stanford.EDU/~marko/ 25 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://fab.stanford.edu/ 26 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.alexa.com/ 27 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www-diglib.stanford.edu/ 28 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.si.umich.edu/UMDL/ 29 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.si.umich.edu/UMDL/nsfdlsites.html 30 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.si.umich.edu/UMDL/otherdlsites.html 31 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 http://www.si.umich.edu/UMDL/otherdlpubs.html 32 33 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2 (finished) 0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www.asis.org ROBOT - SAVE CON OPENED: http://www.asis.org/midyear-96/girillpaper.html CONTENT GOT: text/html vor ParserDelegator() NEW LINK FOUND: http://www.asis.org NEW LINK FOUND: mailto:trg@llnl.gov NEW LINK FOUND: mailto:luk@ecst.csuchico.edu nach ParserDelegator() PARSER RUN 3 
crawl_level: 1 MAX_CRAWL_DEPTH: 2
http://www.asis.org 0 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2
mailto:trg@llnl.gov 1 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2
mailto:luk@ecst.csuchico.edu 2 3 before addLink(isd.getUrls()[i]) after addLink(isd.getUrls()[i]) URL added via addLink to crawl_level 2
(finished)
0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN www-ir.inf.ethz.ch
ROBOT - SAVE CON OPENED: http://www-ir.inf.ethz.ch/Public-Web-Pages/mittendorf/mittendorf.html
0 is NOT empty WHILE INIT getNextURL: crawl_level=1 NEXT URL GIVEN lcavwww.epfl.ch
java.lang.NullPointerException - http://www-ir.inf.ethz.ch/Public-Web-Pages/mittendorf/mittendorf.html
ROBOT - SAVE CON OPENED: http://lcavwww.epfl.ch/LSI/index.html
CONTENT GOT: text/html
FC1: java.io.FileNotFoundException: http://lcavwww.epfl.ch/LSI/index.html
VECTOR IS EMPTY
BUILD SUCCESSFUL (total time: 6 minutes 51 seconds)