Sub-document Timestamping: A Study on the Content Creation Dynamics of Web Documents

Conference Paper (2016)
Author(s)

Y. Zhao (TU Delft - Web Information Systems)

C. Hauff (TU Delft - Web Information Systems)

Research Group
Web Information Systems
Publication Year
2016
Language
English
Pages (from-to)
203-214
ISBN (print)
978-3-319-43996-9
ISBN (electronic)
978-3-319-43997-6

Abstract

The creation time of a document is an important piece of information in temporal information retrieval, especially for document clustering, timeline construction and search engine improvements. Given the manner in which content on the Web is created, updated and deleted, the common assumption that each document has only one creation time is not suitable for Web documents. In this paper, we investigate to what extent this assumption is wrong. We introduce two methods to timestamp individual parts (sub-documents) of Web documents and analyze in detail the creation and update dynamics of three classes of Web documents.
