Abstract
Internet Relay Chat (IRC) has been, and still is, used for a range of legal and illegal activities. Investigations involving IRC tend to be arduous and require a vast number of man-hours of constant monitoring, whether by law enforcement or by an ordinary user browsing the channels. This research developed the IRC Data Gathering Tool (IRCDGT), which facilitates real-time analysis of IRC chat messages and delivers real-time updates to the investigator. The tool is intended to reduce the number of man-hours an investigator must spend in front of a computer. A crawler was developed for IRC that goes through a list of channels and reports on what is being discussed in them. Basic keyword analysis statistically outperforms keyword & POST analysis in terms of recall, while there is no significant difference between the two in terms of precision. Topic analysis was performed in near-real time to enhance the keyword analysis. Lastly, natural language processing appears to have difficulty with the language of the Internet subculture.