Authors
Wei Jiang, Mummoorthy Murugesan, Chris Clifton, Luo Si
Abstract
Similar document detection plays an important role in many applications, such as file management, copyright protection, and plagiarism prevention. Existing protocols assume that the contents of files stored on a server (or multiple servers) are directly accessible. This assumption rules out some practical applications, e.g., detecting plagiarized documents between two conferences, where submissions are confidential. We propose novel protocols to detect similar documents between two entities when the documents cannot be openly shared between them. We also conduct experiments to show the practical value of the proposed protocols.