Lately I have been doing some interesting work on forensic linguistics. We are working on a project called SMIRK that we will reveal after publication, or while it's in publication. One thing I have learned is that the best one can do with authorship attribution (being able to detect the author of a text) is an accuracy rate of about 70%. We have been able to reach that accuracy rate even with a small training set of text, which I believe is great. Forensic authorship analysis can be very useful when an investigator does not know the author of a message. I also learned about n-grams, which I feel could really aid in forensics, especially because n-grams can help in achieving a language-independent authorship system.
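To make the n-gram idea a bit more concrete, here is a minimal sketch in Python. It is not the SMIRK method, just a toy illustration: it counts character n-grams (which need no tokenizer or dictionary, hence the language independence) and picks the known author whose n-gram profile is closest to an unknown message by cosine similarity. The function names, the choice of n = 3, and the use of cosine similarity are all my own assumptions for the example.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles."""
    shared = set(a) & set(b)
    dot = sum(a[g] * b[g] for g in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile matches nothing
    return dot / (norm_a * norm_b)


def most_likely_author(unknown_text, known_texts, n=3):
    """Return the author whose known text is closest to unknown_text.

    known_texts maps a (hypothetical) author name to a sample of their writing.
    """
    profile = char_ngrams(unknown_text, n)
    return max(
        known_texts,
        key=lambda author: cosine_similarity(
            profile, char_ngrams(known_texts[author], n)
        ),
    )
```

For example, calling `most_likely_author(message, {"alice": alice_sample, "bob": bob_sample})` returns whichever name has the more similar profile. A real system would need far more text per author and a proper classifier, but the feature extraction step looks much like this.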



