August 25, 2010
Spam is commonly known as unsolicited or unwanted email messages in the Internet causing potential threat to Internet Security. Users spend a valuable amount of time deleting spam emails. More importantly, ever increasing spam emails occupy server storage space and consume network bandwidth. Keyword-based spam email filtering strategies will eventually be less successful to model spammer behavior as the spammer constantly changes their tricks to circumvent these filters. The ...
January 14, 2020
Social networking websites face a constant barrage of spam, unwanted messages that distract, annoy, and even defraud honest users. These messages tend to be very short, making them difficult to identify in isolation. Furthermore, spammers disguise their messages to look legitimate, tricking users into clicking on links and tricking spam filters into tolerating their malicious behavior. Thus, some spam filters examine relational structure in the domain, such as connections amo...
May 7, 2012
The battle between email service providers and senders of mass unsolicited emails (Spam) continues to gain traction. Vast numbers of Spam emails are sent mainly from automatic botnets distributed over the world. One method for mitigating Spam in a computationally efficient manner is fast and accurate blacklisting of the senders. In this work we propose a new sender reputation mechanism that is based on an aggregated historical data-set which encodes the behavior of mail trans...
October 10, 2014
We use the Enron email corpus to study relationships in a network by applying six different measures of centrality. Our results came out of an in-semester undergraduate research seminar. The Enron corpus is well suited to statistical analyses at all levels of undergraduate education. Through this note's focus on centrality, students can explore the dependence of statistical models on initial assumptions and the interplay between centrality measures and hierarchical ranking, a...
November 4, 2010
There is a tremendous increase in spam traffic these days. Spam messages muddle up users inbox, consume network resources, and build up DDoS attacks, spread worms and viruses. Our goal is to present a definite figure about the characteristics of spam and spammers. Since spammers change their mode of operation to counter anti spam technology,continues evaluation of the characteristics of spam and spammers technology has become mandatory. These evaluations help us to enhance th...
February 19, 2004
Unsolicited bulk email (aka. spam) is a major problem on the Internet. To counter spam, several techniques, ranging from spam filters to mail protocol extensions like hashcash, have been proposed. In this paper we investigate the effectiveness of several spam filtering techniques and technologies. Our analysis was performed by simulating email traffic under different conditions. We show that genetic algorithm based spam filters perform best at server level and naive Bayesian ...
January 25, 2002
We study the topology of e-mail networks with e-mail addresses as nodes and e-mails as links using data from server log files. The resulting network exhibits a scale-free link distribution and pronounced small-world behavior, as observed in other social networks. These observations imply that the spreading of e-mail viruses is greatly facilitated in real e-mail networks compared to random architectures.
March 17, 2015
Tracing criminal ties and mining evidence from a large network to begin a crime case analysis has been difficult for criminal investigators due to large numbers of nodes and their complex relationships. In this paper, trust networks using blind carbon copy (BCC) emails were formed. We show that our new shortest paths network search algorithm combining shortest paths and network centrality measures can isolate and identify criminals' connections within a trust network. A group...
October 14, 2009
In this paper we discuss the techniques involved in the design of the famous statistical spam filters that include Naive Bayes, Term Frequency-Inverse Document Frequency, K-Nearest Neighbor, Support Vector Machine, and Bayes Additive Regression Tree. We compare these techniques with each other in terms of accuracy, recall, precision, etc. Further, we discuss the effectiveness and limitations of statistical filters in filtering out various types of spam from legitimate e-mails...
September 12, 2012
A large part of modern day communications are carried out through the medium of E-mails, especially corporate communications. More and more people are using E-mail for personal uses too. Companies also send notifications to their customers in E-mail. In fact, in the Multinational business scenario E-mail is the most convenient and sought-after method of communication. Important features of E-mail such as its speed, reliability, efficient storage options and a large number of ...