A Study of Information Extraction Tools for
Online English Newspapers (PDF):
Comparative Analysis

M. Hanumanthappa; Deepa T. Nagalavi; Manish Kumar

Abstract

Information retrieval, in this context, is the task of retrieving relevant and useful information from e-newspapers. Electronic newspapers are electronic replicas of traditional newspapers, and they are becoming increasingly popular because they are easy and convenient to access. Newspapers are a source of timely information: documents comprising news items and several independent informative articles. Many newspapers also present news on the same subject from different perspectives, and in this fast-moving era it is impractical to read multiple newspapers. It is therefore essential to quickly summarize articles collected from different newspapers and present them to the reader in a compact and concise manner, without compromising the structure and format of the news. A system that achieves this task should first parse the e-newspapers available in PDF format and convert them to text format. Second, data mining techniques are applied to identify and summarize the articles from the various newspapers. This survey focuses on article identification methods and on popular extraction tools used to extract the contents of e-newspapers for conversion from PDF to text format. A comparative study of the extraction tools, based on source type, programming language, and working characteristics, is also presented.
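
The second stage described in the abstract (summarizing articles on the same subject once they are available as text) can be illustrated with a minimal sketch. The abstract does not name a specific summarization method, so the word-frequency sentence scoring used here is a swapped-in, deliberately simple technique and not the authors' approach. The function names, the stop-word list, and the sample articles are all hypothetical, and the input is assumed to be plain text already extracted from PDF.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are",
              "on", "for", "it", "by", "with", "was"}


def split_sentences(text):
    """Split plain text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercase the sentence and return its words minus the stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def summarize(articles, max_sentences=2):
    """Return the highest-scoring sentences across several articles.

    A sentence's score is the average frequency of its content words,
    where frequencies are counted over all sentences of all articles.
    Selected sentences are returned in their original order.
    """
    sentences = []
    for article in articles:
        sentences.extend(split_sentences(article))
    # Drop exact duplicate sentences while keeping first-seen order.
    sentences = list(dict.fromkeys(sentences))

    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]


# Hypothetical text from two newspapers covering the same story.
paper_a = "The city council approved the new budget. Rain is expected tomorrow."
paper_b = ("The new budget was approved by the city council on Monday. "
           "Local teams played well.")

for line in summarize([paper_a, paper_b], max_sentences=2):
    print(line)
```

Sentences that share vocabulary across newspapers score higher, which is a rough proxy for "the same subject reported from different perspectives". A real system would need a PDF-to-text step in front of this and a more robust article identification method.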

International Journal of Research in Computer and Telecommunication Engineering
