The Center for Education and Research in Information Assurance and Security (CERIAS)

Real Time Text Analysis on Internet Relay Chat Conversations

Download

Download PDF Document

Author

Marvin O. Michels

Tech report number

CERIAS TR 2012-03

Entry type

mastersthesis

Abstract

Internet Relay Chat (IRC) has been, and still is, used for a range of legal and illegal activities. Investigations involving IRC tend to be arduous and require a vast number of man-hours for constant monitoring, whether the monitoring is done by law enforcement or by an ordinary user browsing the channels. This research developed the IRC Data Gathering Tool (IRCDGT), which provides real-time analysis of IRC chat messages as well as real-time updates to the investigator. The tool is intended to reduce the number of man-hours an investigator must spend in front of a computer. A crawler was developed for IRC that goes through a list of channels and reports on what is being discussed in each. Basic keyword analysis statistically outperformed keyword & POST analysis in terms of recall, while there was no significant difference between the two in terms of precision. Topic analysis was performed in near real time to enhance the keyword analysis. Lastly, natural language processing appears to have difficulty with the language of the Internet subculture.

Date

2012-05-13

Key alpha

Michels

School

Purdue University

Publication Date

2012-05-13

Contents

Internet Relay Chat, Keyword Analysis, Topic Analysis

Subject

Analyzing Internet Relay Chat Conversations in real time and near real time.
