Towards Distribution of Web Sites in a Crawler Used for Large Scale Web Accessibility Assessment

General Information

Download Towards Distribution of Web Sites in a Crawler Used for Large Scale Web Accessibility Assessment as PDF (326 KB).

Title: Towards Distribution of Web Sites in a Crawler Used for Large Scale Web Accessibility Assessment.
Author(s): Leiming Chen, Dongmei Wang, Morten Goodwin.
Published date: June 2006.
Published at: Second Workshop on Web Accessibility and Metamodelling 2006.

Abstract


A mechanism used for large scale accessibility measurement may involve a distributed web crawler. Furthermore, it makes sense to spread the web sites involved across different access points (crawler locations / crawler nodes) of the distributed crawler. In this publication we present an algorithm that utilises the available resources to a much greater extent than the traditional uniform distribution of web sites.
Our novel algorithm, the Time Weighted Object Migration Automaton (TWOMA), is an extension of the Object Migration Automaton (OMA). The heart of our scheme involves continuously accessing web sites while measuring the duration of each access. Note that accessing a site involves downloading it and measuring its accessibility. When a web site is accessed, the following happens:


The above scheme is repeated for as long as the crawling / measurement is ongoing. This ensures that the scheme works in a dynamic environment (such as the real web). Furthermore, we show in this publication, using experimental data, that the algorithm works towards an optimal distribution of web sites across the available access points.
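To make the contrast with a uniform distribution concrete, the following minimal sketch compares two static assignments of web sites to crawler nodes. It is not TWOMA: the time-balanced variant is a plain greedy longest-first heuristic swapped in for illustration, and it assumes the access durations are already known rather than measured continuously. The site names, the durations, and the function names are all hypothetical.

```python
from typing import Dict, List


def uniform_distribution(durations: Dict[str, float], nodes: int) -> List[List[str]]:
    """Round-robin assignment: every node gets (almost) the same number of sites."""
    groups: List[List[str]] = [[] for _ in range(nodes)]
    for i, site in enumerate(durations):
        groups[i % nodes].append(site)
    return groups


def time_balanced_distribution(durations: Dict[str, float], nodes: int) -> List[List[str]]:
    """Greedy longest-first assignment (an illustrative stand-in, NOT TWOMA).

    Each site, slowest first, goes to the node with the smallest total
    access time so far.
    """
    groups: List[List[str]] = [[] for _ in range(nodes)]
    loads = [0.0] * nodes
    for site in sorted(durations, key=durations.get, reverse=True):
        target = loads.index(min(loads))
        groups[target].append(site)
        loads[target] += durations[site]
    return groups


def node_loads(groups: List[List[str]], durations: Dict[str, float]) -> List[float]:
    """Total access time per node for a given assignment."""
    return [sum(durations[site] for site in group) for group in groups]


# Hypothetical measured access durations in seconds (assumed values).
durations = {"site-a": 9.0, "site-b": 1.0, "site-c": 8.0, "site-d": 2.0}

print(node_loads(uniform_distribution(durations, 2), durations))        # [17.0, 3.0]
print(node_loads(time_balanced_distribution(durations, 2), durations))  # [10.0, 10.0]
```

With equal site counts per node, the uniform assignment leaves one node busy for 17 seconds while the other is idle after 3. Weighting by access time evens the load out, which is the kind of resource utilisation the abstract refers to.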

BIBTEX


@inproceedings{chenwangolsentwoma2006,
author = {Leiming Chen and Dongmei Wang and Morten Goodwin Olsen},
title = {Towards Distribution of Web Sites in a Crawler Used for Large Scale Web Accessibility Assessment},
booktitle = {WWAM},
year = {2006},
month = {June}
}

The author of this document is:
Morten Goodwin
E-mail address is:
morten.goodwin [at] uia.no
Phone is:
+47 95 24 86 79