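Web Engineering
Masashi Toyoda (Institute of Industrial Science)
Information and Communication Engineering, 2007 Winter, Monday 14:45-16:15
Topics
- 10/01 Introduction
- Lecture materials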
Anna Patterson. Why Writing Your Own Search Engine is Hard. ACM Queue vol.2 no.2, 2004
- 10/15 The Anatomy of a Large-Scale Search Engine
- Sergey Brin and Lawrence Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine. WWW7, 1998
- 11/05 PageRank
- Page, Lawrence; Brin, Sergey; Motwani, Rajeev; Winograd, Terry. The PageRank Citation Ranking: Bringing Order to the Web. 1999
- 11/12 Hub and Authority Analysis
- J. Kleinberg. Authoritative Sources in a Hyperlinked Environment, Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46(1999). Also appears as IBM Research Report RJ 10076, May 1997.
- Krishna Bharat and Monika R. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment, SIGIR '98, 1998.
- Soumen Chakrabarti, Byron Dom, Prabhakar Raghavan, Sridhar Rajagopalan, David Gibson, Jon Kleinberg. Automatic resource list compilation by analyzing hyperlink structure and associated text, WWW7, 1998.
- Jeffrey Dean, Monika R. Henzinger. Finding Related Pages in the World Wide Web, WWW8, 1999.
- 11/19 Link Database and Mirror Detection
- Keith H. Randall, Raymie Stata, Rajiv Wickremesinghe, Janet L. Wiener. The Link Database: Fast Access to Graphs of the Web, Compaq Systems Research Center, Tech Report: SRC-RR-175, 2001
- Andrei Z. Broder, Steven C. Glassman, Mark S. Manasse, Geoffrey Zweig. Syntactic Clustering of the Web, WWW6, 1997
- 11/26 Graph Structure of the Web
- Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins. Trawling the Web for Emerging Cyber-Communities. WWW8, 1999
- Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, Janet Wiener. Graph structure in the web. WWW9, 2000
- 12/10 Parallel Crawlers
- Junghoo Cho, Hector Garcia-Molina. Parallel crawlers. WWW2002, 2002
- 12/17 Evolution of Web Pages
- Dennis Fetterly, Mark Manasse, Marc Najork, and Janet Wiener. A Large-Scale Study of the Evolution of Web Pages. WWW2003
- Alexandros Ntoulas, Junghoo Cho, Christopher Olston. What's new on the web?: the evolution of the web from a search engine perspective. WWW2004
- 01/21
- 02/04
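Evaluation
Grading is based on two submitted reports.
11/19 First report assignment (deadline 12/17)
Choose one interesting full paper related to the Web from those presented at
well-known international conferences such as WWW, SIGIR, and SIGKDD, and
summarize it in at most 6 pages. Do not choose any of the papers listed above
that were covered in the lectures. The report must include the following items.
- Paper title, authors, conference name, and publication year
- Why you found the paper interesting
- A summary of the paper (about 4 pages)
- Three strong points of the paper
- Three weak points of the paper
Send the report in PDF format to the e-mail address below.
Begin the Subject: line with [Web Engineering Report].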
toyoda [@] tkl.iis.u-tokyo.ac.jp
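12/17 Second report assignment announced (deadline 2/4)
Choose one of the following tasks and submit it as a report.
- Choose a paper different from the one in your first report and submit a report in the same format.
- Conduct a computational experiment on Web data and report its purpose, method, and results. It may be a replication or improvement of an existing technique, or a demo of an original web service built with the Yahoo! or Google APIs.
Send the report in PDF format to the e-mail address below.
Begin the Subject: line with [Web Engineering Report 2].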
toyoda [@] tkl.iis.u-tokyo.ac.jp
Web Engineering
Masashi Toyoda (Institute of Industrial Science)
Information and Communication Engineering, 2007 Winter, Monday 14:45-16:15
Topics
- 10/01 Introduction
- Lecture materials
Anna Patterson. Why Writing Your Own Search Engine is Hard. ACM Queue vol.2 no.2, 2004
- 10/15 The Anatomy of a Large-Scale Search Engine
- Sergey Brin and Lawrence Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine. WWW7, 1998
- 11/05 PageRank
- Page, Lawrence; Brin, Sergey; Motwani, Rajeev; Winograd, Terry. The PageRank Citation Ranking: Bringing Order to the Web. 1999
- 11/12 Hub and Authority Analysis
- J. Kleinberg. Authoritative Sources in a Hyperlinked Environment, Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46(1999). Also appears as IBM Research Report RJ 10076, May 1997.
- Krishna Bharat and Monika R. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment, SIGIR '98, 1998.
- Soumen Chakrabarti, Byron Dom, Prabhakar Raghavan, Sridhar Rajagopalan, David Gibson, Jon Kleinberg. Automatic resource list compilation by analyzing hyperlink structure and associated text, WWW7, 1998.
- Jeffrey Dean, Monika R. Henzinger. Finding Related Pages in the World Wide Web, WWW8, 1999.
- 11/19 Link Database and Near-Mirror Detection
- Keith H. Randall, Raymie Stata, Rajiv Wickremesinghe, Janet L. Wiener. The Link Database: Fast Access to Graphs of the Web, Compaq Systems Research Center, Tech Report: SRC-RR-175, 2001
- Andrei Z. Broder, Steven C. Glassman, Mark S. Manasse, Geoffrey Zweig. Syntactic Clustering of the Web, WWW6, 1997
- 11/26 Graph Structure of the Web
- Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins. Trawling the Web for Emerging Cyber-Communities. WWW8, 1999
- Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, Janet Wiener. Graph structure in the web. WWW9, 2000
- 12/10 Parallel Crawlers
- Junghoo Cho, Hector Garcia-Molina. Parallel crawlers. WWW2002, 2002
- 12/17 Evolution of Web Pages
- Dennis Fetterly, Mark Manasse, Marc Najork, and Janet Wiener. A Large-Scale Study of the Evolution of Web Pages. WWW2003
- Alexandros Ntoulas, Junghoo Cho, Christopher Olston. What's new on the web?: the evolution of the web from a search engine perspective. WWW2004
- 01/21
- 02/04
Evaluation
Grading is based on two reports:
11/19 First report announced (deadline 12/17)
Choose one interesting full paper related to the Web from a major
international conference, such as WWW, SIGIR, or SIGKDD, and summarize
it in at most 6 pages in total. Do not select any of the papers
covered in the lectures (listed above). The report must include the
following.
- Information about the paper (title, authors, conference, year)
- The reason why you are interested in the paper
- A summary of the paper (about 4 pages)
- Three strong points about the paper
- Three weak points about the paper
Send the report in PDF format to the following e-mail address.
The "Subject:" line should begin with "[Web Engineering Report]".
toyoda [@] tkl.iis.u-tokyo.ac.jp
12/17 Second report announced (deadline 2/4)
Submit a report on one of the following.
- Choose another paper and summarize it in the same way as in the first report.
- Perform a computational experiment on Web data and report its purpose, method, and results. It may be a replication or an improvement of an existing technique, or a demonstration of an original web service built with the APIs of Yahoo!, Google, Amazon, etc.
Send the report in PDF format to the following e-mail address.
The "Subject:" line should begin with "[Web Engineering Report 2]".
toyoda [@] tkl.iis.u-tokyo.ac.jp