Because today's Web environment is built on HTML, a language that can only describe presentation and cannot reveal content, the semi-structured data and heterogeneous data sources on the Web make Web data mining difficult.
- 由于现行的网络环境以HTML语言为基础构建,它是一种只能描述形式而不能揭示内容的语言,因此,Web上的半结构化数据和异构数据源问题给Web数据挖掘带来了困难。