Work on the design of AI tools to limit cases of online harassment or trafficking is likely to increase in the future. A concrete example is the STOP project, which aims to analyze Twitter posts to limit the risk of suicide. The Institut québécois d’intelligence artificielle (Mila) has designed an algorithm capable of detecting signs of online sexual exploitation so that authorities can intervene accordingly. Carnegie Mellon University’s School of Computer Science and McGill University participated in the development of the algorithm.
Trying to find a smart solution to the scourge of sexual exploitation
InfoShield: that’s the name of the algorithm that could one day be used by Canadian police in their fight against online sexual exploitation. According to the International Labour Organization, approximately 4.8 million people are trafficked for sexual exploitation each year. This global industry, controlled by criminal organizations, is estimated to generate almost 70 billion euros.
To fight this scourge, a research team has designed an algorithm capable of identifying human trafficking activity in online escort ads. Internet advertisements are regularly used in this kind of trafficking: the online advertising market is constantly growing and offers criminals an anonymous, low-risk platform on which to operate with near-total impunity.
That’s according to Reihaneh Rabbany, a core academic member of Mila, assistant professor at McGill’s School of Computer Science, and Canada CIFAR AI Chair.
“The majority of victims are advertised online and have no influence on the wording of the ads posted by their exploiter, who typically controls four to six victims at a time.”
However, this practice has a flaw: organized online activity can be detected through similar wording and duplicate text across ads.
The algorithm analyzes information and targets ads
The InfoShield algorithm has been designed so that it is “able to spot millions of ads and highlight commonalities between them,” according to Christos Faloutsos, a professor at CMU’s School of Computer Science. The algorithm can scan information circulating on the web and social networks in real time, 24 hours a day, then analyze it by linking related data together.
Catalina Vajiac and Meng-Chieh Lee, two researchers on the project, explain how the tool works:
“Human trafficking is a serious societal problem and difficult to overcome. By searching for small clusters of ads that contain similar text rather than analyzing standalone ads, we are able to locate clusters of ads that are most likely to correspond to organized activity, which is a strong signal of human trafficking.”
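The clustering idea described above can be illustrated with a minimal sketch. This is not the actual InfoShield implementation; it simply groups ads whose wording overlaps heavily (here, via token-set Jaccard similarity and a similarity threshold, both assumptions for illustration) so that multi-ad clusters, rather than standalone ads, surface as candidates for organized activity.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Token-set Jaccard similarity between two ads."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_similar_ads(ads: list[str], threshold: float = 0.6) -> list[list[int]]:
    """Greedy single-link grouping of ads with heavily overlapping wording.

    Returns clusters of ad indices; clusters containing more than one ad
    are the "small groups of similar text" that may signal organized posting.
    """
    tokens = [set(ad.lower().split()) for ad in ads]
    # Union-find over ad indices, linking pairs above the similarity threshold.
    parent = list(range(len(ads)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(ads)), 2):
        if jaccard(tokens[i], tokens[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters: dict[int, list[int]] = {}
    for i in range(len(ads)):
        clusters.setdefault(find(i), []).append(i)
    return [c for c in clusters.values() if len(c) > 1]
```

Flagging clusters instead of individual ads mirrors the signal the researchers describe: two near-duplicate ads end up in one cluster, while an unrelated ad is left out.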
Christos Faloutsos, Catalina Vajiac, and Namyong Park of Carnegie Mellon University, Reihaneh Rabbany, Aayushi Kulshrestha, and Sacha Levy of McGill University and Mila, Meng-Chieh Lee of National Chiao Tung University, and Cara Jones of Marinus Analytics authored the publication outlining the body of work that led to the development of the tool.
How was the algorithm trained and tested?
In order to test InfoShield, the researchers applied the algorithm to a set of escort ads already identified by experts trained to recognize this type of advertising. In these experiments, the algorithm flagged ads with an accuracy of 85%. According to the research team, these results outperform other AI algorithms performing the same task.
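The kind of check described above can be sketched in a few lines. The figures below are hypothetical, chosen only to illustrate how an 85% score arises when flagged ads are compared against expert labels; the function name and data are assumptions, not the team's evaluation code.

```python
def precision(flagged: list[int], labeled_trafficking: set[int]) -> float:
    """Fraction of flagged ads that experts had labeled as trafficking-related."""
    if not flagged:
        return 0.0
    hits = sum(1 for ad_id in flagged if ad_id in labeled_trafficking)
    return hits / len(flagged)

# Hypothetical example: the algorithm flags 20 ads, 17 of which match expert labels.
flagged = list(range(20))
expert_labels = set(range(17))
print(f"precision: {precision(flagged, expert_labels):.0%}")  # → precision: 85%
```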
The training data for the model contained real ads placed by human traffickers, an additional difficulty for the researchers, who could not share examples of the identified similarities, or the data itself, for data- and victim-protection reasons. Nonetheless, the researchers were able to mine publicly available datasets, which they used to train InfoShield.
The researchers hope that the development of tools like this can benefit society and that their algorithm can be leveraged by law enforcement to combat sexual exploitation.