Friday, June 6, 2008

Malware Trend in 2007

I read the IBM Internet Security Systems X-Force 2007 Trend
Statistics report, which describes trends for various threats in 2007.
This team has been tracking trends since 2000. I found the report
to be quite interesting. In the rest of this post, I highlight some
of the interesting points from the report and what they mean in the
context of malware detection.

(I) The X-Force team reports continued growth in Web browser exploitation. This
clearly shows that the infection vector is changing to the Web. Earlier
the primary infection vectors were email and the network. Therefore,
for detecting malware, drive-by-downloads (DBD) and other threats targeted at hacking through the Web browser need a lot of attention.

(II) X-Force also reports a marked increase in obfuscated exploits, i.e.,
exploits that use various code obfuscation techniques (such as encryption).
Here is a quote, "X-Force estimated that nearly 80 percent of Web exploits
used obfuscation and/or self decryption ... By the end of 2007, X-Force
believed this rate had reached 100 percent, ...". This means that going
forward, Web exploits will increasingly harbor indiscernible code, rendering signature-based techniques less effective. Advanced
techniques (such as behavior-based detection) are clearly needed to detect
such malware. To exacerbate the situation, the X-Force report stated that
there was a 30% increase in new malware samples in 2007 over 2006. This
further drives home the point that signature-based detectors will have trouble
keeping up with the volume of malware, since they cannot detect new threats.
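To make the point about obfuscation concrete, here is a minimal sketch of why a byte-pattern signature misses an encoded payload. The signature, the sample bytes, and the one-byte XOR key are all made up for illustration; real exploits and real signature engines are far more elaborate.

```python
# Toy illustration: a byte-pattern signature misses the same payload once it
# is XOR-encoded. All names and byte strings here are hypothetical.

SIGNATURE = b"EVIL_PAYLOAD"  # made-up pattern standing in for a real signature


def signature_match(data: bytes, signature: bytes = SIGNATURE) -> bool:
    """Return True if the signature occurs anywhere in the data."""
    return signature in data


def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (applying it twice decodes)."""
    return bytes(b ^ key for b in data)


plain_sample = b"header" + SIGNATURE + b"trailer"
obfuscated_sample = xor_encode(plain_sample, key=0x5A)

print(signature_match(plain_sample))       # the plain sample is flagged
print(signature_match(obfuscated_sample))  # the encoded sample is missed
print(xor_encode(obfuscated_sample, 0x5A) == plain_sample)  # same payload after decoding
```

The encoded sample carries exactly the same payload, which it can decode at run time, yet the static pattern never appears in the bytes that a scanner sees.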

(III) There was another very interesting point made by the report. Modern
malware uses features from various types of classic malware (such as viruses, worms,
and spyware), pulling the successful features of each into new strains. To quote the report, "Modern malware is now the digital equivalent
of the Swiss Army knife, and 2007 data continues to support this." This trend
also indicates that the behavior of malware is becoming more sophisticated, which
again supports my claim that detection techniques based on analyzing behavior are
better suited to handle malware of the future. Another interesting tidbit from the
report: "Trojans make up the largest class of malware in 2007 as opposed to downloaders,
which were the largest category in 2006." Recall that a Trojan appears to be a
legitimate file with some hidden functionality (for example, that of a rootkit).
Trojans are historically a problematic class of malware for signature-based detectors.

Overall, I found the report to be very interesting. Read it for yourself.
You can find the report here.

Wednesday, April 23, 2008

Zero Day Threat by Acohido and Swartz

I read the book Zero Day Threat (ZDT) by Byron Acohido and Jon Swartz. I really liked the book! Zero Day Threat is about the underground cyber-economy. It makes some surprising points grounded in real truths. I liked that the book paints a complete picture, i.e., how malware,
identity theft, and "drop off" gangs collaborate to facilitate
a well-oiled cyber-economy. Since my research area is security,
I was very familiar with the different types of malware brought up in Zero Day Threat. However, this book gave me a complete picture of the problem.

I particularly appreciated two features of the book:

Structure: Each chapter is broken into three sections: exploiters,
enablers, and expediters. The Exploiters sections focus on crooks (such
as scam artists and drug addicts) and how they benefit from the
underground economy. The Enablers sections focus on credit card
companies, banks, and credit bureaus, and how their current practices
enable the underground cyber-economy. The Expediters sections cover the
people (good and bad) who allow the cybercrooks to exploit
vulnerabilities in an expeditious manner. I thought this structure
was just brilliant! It really brings out the correlation between
various factors and actors that enable the underground cyber-economy.

Narrative Style: I really enjoyed various anecdotes in the book.
There are several stories about people being scammed or getting
lured into the profitable cyber-underground. For example, there is a story of
a "drop off" gang in Edmonton which is narrated throughout the
book. These anecdotes make the book very interesting and provide
a "human side" to the cyber-underground.

I highly recommend this book.

Wednesday, March 19, 2008

Botnets in USA Today

I got a call from Byron Acohido over at the USA Today last weekend,
and we had an interesting talk about botnets. Byron and Jon Swartz ended
up writing an article about botnets which appeared as the cover story
in the Money section of the USA Today on March 17, 2008. Here's a link to the full
story (link). I found the entire article to be a fascinating read
on the nature of botnets. Here are some of the highlights, but
definitely go and read the entire article.
  • On a typical day, 40% of the 800 million computers connected to the Internet are bots engaged in various nefarious activities, such as spamming, stealing sensitive data, and engaging in denial-of-service attacks. Think about it. Approximately 320 million computers are engaged in these illicit activities!
  • Later on in the article they describe various features of Storm, the state of the art for botnets. Storm introduced various innovations into the bot landscape, such as using P2P-style communication to converse with the bots and encrypting the command-and-control (C&C) traffic. Command-and-control is the traffic from the bot-herder to the bots instructing them to perform various nefarious activities. Note that this means that network-based botnet solutions that simply look for centralized C&C communication will not work. Moreover, encrypted traffic is a major problem for network-based solutions. See my earlier blog post where I argue that we should move to a cooperative solution. This is looking like a very good idea. Storm also has a self-defense mechanism, i.e., anyone trying to probe the botnet is punished with a denial-of-service attack. I found this self-defense mechanism of Storm to be very interesting.
Overall a fascinating article!
I plan to drop by Byron's book signing at the RSA Conference in San
Francisco on April 7th. Byron also has an interesting blog which is related to the
material in the book.

Wednesday, March 5, 2008

Model Checking and Security

Model checking is a technique for verifying temporal properties of finite-state systems. One of
the attractive features of model checking over other techniques (such as theorem proving)
is that if a property is not true, a model checker provides a counter-example which
explains why the property is not true. The inventors of model checking, Edmund Clarke,
Allen Emerson, and Joseph Sifakis, won the 2007 ACM Turing Award (see the announcement here). I have a personal connection to two of the recipients. Edmund Clarke was my adviser
at Carnegie Mellon, and Allen Emerson and I have collaborated on a few projects and he
has supported me throughout my career.
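For readers who have not seen a model checker produce a counter-example, here is a minimal sketch. It handles only the simplest kind of property (an invariant that must hold in every reachable state), not full temporal logic, and the two-process system it checks is a hypothetical toy of my own making, not taken from any real tool.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Explore all reachable states breadth-first.

    Returns None if the invariant holds in every reachable state, otherwise a
    shortest path (counter-example) from the initial state to a violation.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# Hypothetical toy system: two processes that enter a critical section
# without any locking. A state is a pair of per-process locations.
STEP = {"idle": "trying", "trying": "critical", "critical": "idle"}


def successors(state):
    """Each move advances exactly one process by one step."""
    for i in range(2):
        nxt = list(state)
        nxt[i] = STEP[state[i]]
        yield tuple(nxt)


def mutual_exclusion(state):
    """The property: both processes are never critical at the same time."""
    return state != ("critical", "critical")


counterexample = check_invariant(("idle", "idle"), successors, mutual_exclusion)
print(counterexample)  # a sequence of states ending with both processes critical
```

Because the search is exhaustive over a finite state space, a result of None would mean the property holds everywhere, while a returned trace explains step by step how the violation is reached.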

In this note I try to summarize various applications of model checking to security.

Protocol verification: Protocols in the realm of security (henceforth referred to
as security protocols) are very tricky to get correct. For
example, flaws in authentication protocols have been discovered several years after the protocols were published. Techniques based on model checking have been extensively used to verify these protocols. The tricky part in applying these techniques for verifying security protocols is
modeling the capabilities of the attacker. Gavin Lowe used the FDR model checker to find
a subtle attack on the Needham-Schroeder authentication protocol (this
publication can be found here). Following Lowe's work there
was a flurry of activity on this topic. Interested readers can look at the proceedings of
the Computer Security Foundations Symposium (CSF).

Vulnerability assessment: Imagine you are given an enterprise network with various components (firewalls, routers, and Intrusion Prevention Systems (IPSs)).
Vulnerability assessment tries to ascertain how an attacker can penetrate the specified network. Vulnerability assessment is crucial in updating policies of various security appliances (such as firewalls and IPSs) and ascertaining the risk of various decisions. Traditionally, vulnerability assessment has been performed by red teams. Red teaming is a very valuable activity but can provide no guarantees that the entire state space of vulnerabilities has been explored. Along with Oleg Sheyner and Jeannette Wing, I explored techniques based on model checking for vulnerability assessment. We formally specify the network and express the negation of the attacker's goal (e.g., attacker gets root access on a critical server) as a property to be verified. If
the specified network is vulnerable, then the model checker will output a counter-example (which is an attack on the network). The innovation we devised was to output the set of all
counter-examples or attacks as an attack graph, which is a succinct representation of all attacks on the network. Analysis of the attack graph can provide a basis for vulnerability assessment. This paper can be downloaded here.
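The idea of collecting all attacks into one graph can be sketched on a toy network. This is only an illustration under strong simplifying assumptions, not the algorithm from the paper: the hosts and exploits are invented, the attacker's state is just the set of hosts it controls, and every intermediate state in this toy network can still reach the goal, so no pruning is needed.

```python
# Hypothetical network: each exploit lets an attacker who controls `src`
# gain control of `dst`. Host and exploit names are made up.
EXPLOITS = [
    ("internet", "web", "web-server overflow"),
    ("internet", "mail", "mail-server overflow"),
    ("web", "db", "stolen db credentials"),
    ("mail", "db", "sql injection"),
    ("web", "mail", "trust relationship"),
]
START = frozenset({"internet"})
GOAL = "db"


def all_attacks(controlled=START, path=()):
    """Yield every exploit sequence that ends with the attacker on GOAL."""
    if GOAL in controlled:
        yield path
        return
    for src, dst, name in EXPLOITS:
        if src in controlled and dst not in controlled:
            yield from all_attacks(controlled | {dst}, path + (name,))


def attack_graph():
    """Merge all attacks into one graph whose nodes are sets of controlled hosts."""
    edges, seen, stack = set(), {START}, [START]
    while stack:
        controlled = stack.pop()
        if GOAL in controlled:
            continue  # goal reached, nothing more to explore from here
        for src, dst, name in EXPLOITS:
            if src in controlled and dst not in controlled:
                nxt = controlled | {dst}
                edges.add((controlled, name, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return edges


attacks = list(all_attacks())
edges = attack_graph()
print(len(attacks), sum(len(a) for a in attacks), len(edges))
```

Listing the attacks one by one repeats shared prefixes over and over, whereas the graph stores each state transition once. That sharing is what makes the graph a compact summary of every way in.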

Other applications: There are several problems in security that can be addressed using model checking. For example, Tom Reps and I have used model checking to analyze properties of security policies in trust-management systems. Ninghui Li and his collaborators have used techniques based on model checking to analyze several classes of security properties.
In the context of security, the advantage model checking has over other techniques (such as
testing) is that it exhaustively covers the state-space. After all, if you just have one
vulnerability, an attacker will exploit that vulnerability, i.e., an attacker just needs one
door to get through your system. Thus the completeness guarantee that a model checker provides is very valuable in the context of security.

Wednesday, February 13, 2008

Cooperating Detectors

A malware detector tries to determine whether a program is malicious (examples
of malicious programs are drive-by-downloads, botnets, and keyloggers).
Malware detection is primarily performed at two vantage points: host and
network. This post explains why cooperation between host-based and
network-based detectors is a good thing.

Traditionally, detection has been performed either at the network or host level, but
not both. First, let me examine both approaches separately.

A network-based detector monitors events by examining a session or
network flow and tries to determine whether it is malicious. The
advantage of a network-based detector is ease of deployment -- there
are not that many points of deployment for a network-based detector
(typically they are deployed behind border routers).

Unfortunately, network-based detectors have a limited view of each
network session. In fact, if a session happens to be
encrypted, as is common with VPNs, Skype, and some bots, a
network-based detector is essentially blind. For example, a botmaster
can hide its communication with the bots by simply encrypting the session.

By contrast, host-based detectors have a more comprehensive view of system activities, i.e.,
they have the potential to observe every event at the host, including malicious ones. However, the major drawback of a host-based detector is that it has to be widely deployed. Typically in a managed
network (such as in an enterprise), a host-based detector has to be deployed at
every host.

Cooperation between host-based and network-based detectors can potentially
address the shortcomings of each detector. I've come up with three possible scenarios.

1) Host-based detector helping the network-based detector.
A network-based detector can pull alerts from a host-based
detector and a host-based detector can push alerts to a network-based
detector. This is a simple solution and I suspect the easiest
scenario for cooperation.

2) Queue up suspicious activity on a virtual machine.
If a network-based detector determines that a session is
"suspicious," it can divert the suspicious traffic to a virtual machine
with a host-based detector for more in-depth analysis. The trick here
is figuring out what events are indeed "suspicious" (you do not want
too much traffic to go through the "slow path" corresponding to a
host-based detector). There is already a
startup called FireEye adopting this solution. I find this line of work quite intriguing.

3) Pushing signatures.
This third scenario has been explored quite thoroughly in academic
literature. It involves the cooperation of host-based and network-based detectors to
push signatures for malware in real-time. For example, if a
host-based detector recognizes an attack, it pushes out a signature to a
network-based detector. The advantage of this
approach is that, by updating a network-based detector, an entire
enterprise can be protected against that particular threat. However, in my
view this is not a good approach in the long run. Hackers are creating malware variants at
an alarming rate and signatures won't be able to keep up.
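As a rough sketch of the second scenario, the fast path and slow path could be wired together as follows. The session fields, the thresholds, and both "detectors" are placeholder assumptions for illustration only. They are not real detection logic and do not describe any vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """A summarized network session; the fields are illustrative assumptions."""
    src: str
    encrypted: bool
    bytes_out: int


def network_suspicious(session, byte_threshold=1_000_000):
    """Cheap fast-path check: flag what the network detector cannot clear."""
    return session.encrypted or session.bytes_out > byte_threshold


def host_analysis(session):
    """Stand-in for slow, in-depth analysis on a virtual machine."""
    return "malicious" if session.bytes_out > 5_000_000 else "benign"


def triage(sessions):
    """Return a verdict per source and the number of slow-path analyses."""
    verdicts, slow_path = {}, 0
    for s in sessions:
        if network_suspicious(s):
            slow_path += 1
            verdicts[s.src] = host_analysis(s)
        else:
            verdicts[s.src] = "benign"
    return verdicts, slow_path


sessions = [
    Session("10.0.0.5", encrypted=False, bytes_out=2_000),
    Session("10.0.0.7", encrypted=True, bytes_out=9_000_000),
    Session("10.0.0.9", encrypted=True, bytes_out=40_000),
]
verdicts, slow_path = triage(sessions)
print(verdicts, slow_path)
```

The design question is entirely in `network_suspicious`: the looser it is, the more traffic ends up on the slow path, and the tighter it is, the more the network detector's blind spots go unexamined.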

Wednesday, January 30, 2008

Case for kernel-level detection

Why kernel-level detection?
These are my thoughts on why malware detection should be performed at the
kernel level. In general, the lower in the system hierarchy your
detector resides, the harder it is for an attacker to evade your detector.
For example, if a detector uses system-call interposition, an attacker can
evade it by using kernel calls directly. (On Windows,
system-call interposition can be done using the following
package.) In my conversations with
a guy from NSA (name withheld for obvious reasons:-)) he confirmed that
new malware they are observing in their lab is using kernel calls directly.
Also, look at the following article.

The semantic-gap problem:
A natural question that comes to mind is: why not perform detection at an even lower layer
in the hierarchy? Say, the VM layer or, even better, the hardware. As you move
down in the system hierarchy, you lose some high-level semantics. Let me explain.
Let's say you are doing detection at the VM layer. A high-level event (such as
opening a file) manifests itself as a sequence of low-level events (such as writing to
a memory page or an interrupt). In other words, there is a gap between
the events you observe at the VM level and the corresponding high-level event. To my knowledge
the "semantic gap" issue was first articulated in the following paper:

Peter M. Chen, Brian D. Noble, "When virtual is better than real",
Proceedings of the 2001 Workshop on Hot Topics in Operating Systems (HotOS),
May 2001.
The paper can be downloaded at the following site.
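To give a feel for what bridging the semantic gap involves, here is a toy sketch that lifts a low-level event trace back into high-level events. The event names and the patterns are entirely hypothetical. A real VM-level monitor sees far noisier data and needs detailed knowledge of the guest operating system.

```python
# Hypothetical low-level event names and patterns, invented for illustration.
PATTERNS = {
    "open_file": ("trap", "page_write", "disk_read"),
    "send_packet": ("trap", "page_write", "nic_interrupt"),
}


def lift(trace):
    """Greedily map a low-level trace to high-level events.

    Events that match no known pattern are kept as ("unknown", event).
    """
    lifted, i = [], 0
    while i < len(trace):
        for name, pattern in PATTERNS.items():
            if tuple(trace[i:i + len(pattern)]) == pattern:
                lifted.append(name)
                i += len(pattern)
                break
        else:
            lifted.append(("unknown", trace[i]))
            i += 1
    return lifted


low_level = ["trap", "page_write", "disk_read", "timer_interrupt",
             "trap", "page_write", "nic_interrupt"]
print(lift(low_level))
```

Even in this toy, two different high-level events share a prefix of low-level events, so the monitor has to look further ahead before it can tell them apart. That ambiguity grows the lower you go.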

As you move down in the hierarchy, the semantic gap problem becomes harder. The
semantic gap problem still exists at the kernel level, but it is more tractable
than at the other layers. Therefore, I think kernel-level detection hits
the "sweet spot". Implementing detectors at kernel level is harder than other
approaches (such as system-call interposition), but then everything good in life takes
effort:-) I strongly believe that detectors that use system-call interposition are very
easy to evade, so what is the point in having them? The next generation of malware
will definitely use kernel calls directly.