Chrome extensions and security

Adrienne wrote a blog post about some of her recent work analyzing Google Chrome extensions for security-related bugs. It’s a nice read and illuminates mistakes made by a surprisingly large number of extension developers (27 of 100 extensions leak private information!).

Although I don’t use Chrome on a regular basis, I had believed that its simple APIs and the (presumably) greater thought that went into security-related design would have made it more difficult for developers to write vulnerable extensions.

It’s not just extensions that are problematic, either: in a recent screenshot of a Blackhole Exploit Pack control panel, the exploits it served were far more successful against Google Chrome (as a percentage of visitors) than against all versions of FF and IE combined.

Posted in Uncategorized | Comments Off

paper at IMC 2011

This year we have a paper studying the activity of suspended users on Twitter, which will appear at IMC in November. The title is “Suspended Accounts In Retrospect: An Analysis of Twitter Spam”, and the paper presents a unique perspective on spam as compared to our previous papers (in CCS and S&P). We look back in time and collect the spam tweets sent by users who were eventually suspended, and then try to tie as much together as we can.

For example, one spam campaign, advertising a single landing page, can use well over 100,000 Twitter accounts and send millions of tweets. Each of the accounts involved was created for the purpose of sending spam, and generally has never sent a non-spam tweet (there are some exceptions to this, of course!). The resources of the Twitter spammers are quite impressive: being able to throw away 100k accounts (they all get suspended eventually) after sending a few tweets demonstrates the account resources they have at hand.

Anyway, read more in a couple of weeks when we post the PDF!

Posted in Uncategorized | Comments Off

Anti-virus labels are not suitable for system evaluation

I won’t name names, but there are plenty of researchers out there who rely on anti-virus labeling in their research. While this could work, without manual validation there’s very little chance the AV labels can be used as any sort of ground truth.

Here are 5 reports:
1. fc39ce1593cfb6ca1eb0c289a2ca561c
2. c4d93b536f35b350a992a402dfd72e12
3. c77ba55255c1db38568ca3a73d4b8a72
4. e57d938e0754e4fbb3b87cf818a0fc69
5. e397696b7835ccdcfad9d768cf1a091c

Quick highlights in classification from each report:
1. Bredolab, Krap, Ursnif, Downloader, Generic, etc…
2. Krap, Kryptic, Generic packed, etc…
3. Bredolab, Oficla, Krap, Zbot, Ldpinch, etc…
4. Bredolab, Harnig, Krap, Ursnif, etc…
5. FakeAV, Bubnix, etc…

Based on those 5 reports, it’s certainly not obvious that these samples all belong to the exact same family of malware. But they do: if you run each one, they issue nearly identical HTTP requests. Report #3 seems to have the most diverse set of well-known names, almost a grab bag of popular malware.
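To put a rough number on that disagreement, here is a minimal sketch that compares the label sets from the highlights above. The family names are copied from the list as written (each entry ends in "etc…", so the sets are partial), and the choice of Jaccard overlap as the measure is my own assumption, not something the reports provide.

```python
from collections import Counter
from itertools import combinations

# Family names copied from the highlights above. Each report's list ends in
# "etc...", so these sets are partial. "Generic" and "Generic packed" are
# kept as distinct strings, exactly as listed.
reports = {
    1: {"Bredolab", "Krap", "Ursnif", "Downloader", "Generic"},
    2: {"Krap", "Kryptic", "Generic packed"},
    3: {"Bredolab", "Oficla", "Krap", "Zbot", "Ldpinch"},
    4: {"Bredolab", "Harnig", "Krap", "Ursnif"},
    5: {"FakeAV", "Bubnix"},
}


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)


# Names that every one of the five reports agrees on.
common = set.intersection(*reports.values())

# How many reports mention each name.
counts = Counter(name for labels in reports.values() for name in labels)

# Pairwise overlap between reports, keyed by (report, report).
overlap = {
    (i, j): jaccard(reports[i], reports[j])
    for i, j in combinations(sorted(reports), 2)
}

if __name__ == "__main__":
    print("common to all reports:", common)
    for name, n in counts.most_common():
        print(f"{name}: {n} of {len(reports)} reports")
    for pair, score in overlap.items():
        print(pair, round(score, 2))
```

With these partial sets, no name is shared by all five reports, and report #5 has no overlap with any of the others, even though all five samples behave the same way when run.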

There are a few things I can say for certain: It’s definitely malware. It’s not Bredolab. It’s also not Harnig, Zbot, Ldpinch, Oficla, or any sort of FakeAV. I’m not sure what a few of the names, like Krap and Ursnif, refer to, so I can’t definitively say it’s not those.

Suppose someone develops a malware classification technique and validates it against a set of malware using ground truth obtained from VirusTotal labels (see lots of papers from IEEE S&P, USENIX, ACM CCS, and everywhere else!). Based on these reports: Which AV should be trusted? Will that same AV perform well on another family of malware? Do any of the labels have more or less meaning than others?

If an AV says a binary is Bredolab (Report #1), what does that mean? Did engineers determine that a particular binary, with a specific MD5, is Bredolab? Did they find a few bytes in the binary that typically indicate Bredolab? Did the network traffic match Bredolab?

In summary, the labels that AV programs produce for malware are too noisy to be used with any confidence to evaluate a system unless each sample is manually validated.

Posted in research | Comments Off

Click Trajectories press!

The paper, “Click Trajectories: End-to-End Analysis of the Spam Value Chain”, got quite a bit of press recently, so there are a number of great articles that summarize the paper’s content and include quotes from banks and other security researchers and experts.

Newspaper, online news, and blogs:

You can even watch a video!

Posted in research | Comments Off

Papers at 2011 IEEE Symp. on S&P

We had two papers at Oakland this year, and I’ve put the PDFs up online. Kirill and Kurt presented on Tuesday afternoon (schedule).

NYT Article on the “Click Trajectories” work: http://nyti.ms/j6sf5c

“Click Trajectories: End-to-End Analysis of the Spam Value Chain”, Kirill Levchenko, Andreas Pitsillidis, Neha Chachra, Brandon Enright, Mark Felegyhazi, Chris Grier, Tristan Halvorson, Chris Kanich, Christian Kreibich, He Liu, Damon McCoy, Nicholas Weaver, Vern Paxson, Geoffrey M. Voelker, and Stefan Savage. Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, May 2011. PDF

and

“Design and Evaluation of a Real-Time URL Spam Filtering Service”, Kurt Thomas, Chris Grier, Justin Ma, Vern Paxson, and Dawn Song. Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, May 2011. PDF

Posted in Uncategorized | Comments Off

Naming some popular spambots

Part of what I’ve been doing lately is finding, running, and maintaining bots in a controlled environment. The first part, finding, which includes identifying the binaries I’m running, turns out to be difficult to do.

Through a few “special” techniques, I come up with some new binaries that produce spam. For example, the binary with MD5 f03077adfdedc55b9ae906be897f2cc0. It runs, connects to a C&C, has an obfuscated C&C protocol, and ends up sending spam. So what is it? VirusTotal says: (Screenshot 1)

What does that mean? Well, in my opinion, it means that none of the AV signatures have a clue; they just say it’s probably bad stuff. This binary happens to be an installer for a newer version of Rustock, which I can verify by watching it run. I have several thousand binaries that I’ve acquired using the same technique as this one, most of which also have useless AV labels.

Why is that? Malware distribution is complicated. There are a lot of steps, intermediate binaries, packers, crypto, etc. What happens is that somewhere along the line of installation, an AV signature matched, and other things were then labeled according to the same signature. I see this a lot with generic droppers: the bot binaries that are run become labeled with Virut (an old-school generic dropper) or Harnig (another generic dropper), both of which can drop any number and type of malware binaries. In some experiments, I’ve seen over 15 different binaries downloaded and executed by a single dropper, and this behavior changes on subsequent executions.
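Since the samples here are referred to by MD5 rather than by AV label, this is a minimal sketch of keying a collection of binaries by their MD5. The function names and the idea of an in-memory index are my own illustration, not the tooling actually used for these experiments.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_samples(paths):
    """Map MD5 -> list of paths, so byte-identical binaries share one key."""
    index = {}
    for path in paths:
        index.setdefault(md5_of_file(path), []).append(path)
    return index
```

An MD5 only says that two files are byte-for-byte identical. Repacked copies of the same bot get different hashes, which is part of why identifying what a binary actually is takes more than a lookup.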

Posted in research | Comments Off

presenting at CCS Tuesday

I’m going to be at CCS 2010 in Chicago this week presenting @spam: The Underground on 140 Characters or Less. My presentation is the 3rd talk of the conference in the security session (on the first day).

Posted in Uncategorized | Comments Off

Illinois email going away!

grier@uiuc.edu and grier@illinois.edu are going to stop working this Friday! CITES is officially done forwarding my email. Use my new ones @berkeley.edu, or better yet: grier@imchris.org!

Posted in Uncategorized | Comments Off

Running research on AWS

At the beginning of the year, in the middle of the project that led to the CCS paper on Twitter spam, I decided to try out Amazon Web Services. As I’ve slowly become familiar with the process, I’ve found that more and more I can make use of EC2, S3, Elastic MapReduce, or even Virtual Private Cloud for what I’m working on.

Sometimes this is a mixed blessing: I spend a bunch of time working out how to use a certain AWS feature when I could have just run with something I baked myself. Other times, it’s awesome.

One of the current projects I’m working on is building a service that can quickly identify spam URLs. The goal is a service that Facebook, Twitter, or anyone else could put in the line of fire to remove bad messages before they are seen a few thousand times (and reported as spam or phishing).

For the most part, our stuff is up and working, and it even works at quite a large volume on a relatively small number of computers. What’s even better is that it’s currently operating on AWS, running a reduced volume (1 million URLs per day) on medium-sized instances for around $860/month, which works out to about $.00086 per URL of daily capacity. We’ve also been testing to see if this scales, and it looks like it does, a little less than linearly with the number of URLs we need to handle per day. As we evaluate more of the system we’ve built and write the paper, I’ll post a few updates on how it’s built and what we are using to make it all work.
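One way to read the cost figures above is as a monthly bill divided by daily capacity. The sketch below spells out that arithmetic. The 30-day month and the strictly linear extrapolation are my assumptions for illustration, and the variable names are made up.

```python
# Figures quoted in the post.
MONTHLY_COST_USD = 860.0
DAILY_CAPACITY_URLS = 1_000_000

# Assumption (not from the post): a 30-day month.
DAYS_PER_MONTH = 30

# Monthly cost per URL of daily capacity: 860 / 1,000,000 = 0.00086.
cost_per_capacity_url = MONTHLY_COST_USD / DAILY_CAPACITY_URLS

# Cost per URL actually processed over the month, under the 30-day assumption.
urls_per_month = DAILY_CAPACITY_URLS * DAYS_PER_MONTH
cost_per_processed_url = MONTHLY_COST_USD / urls_per_month


def monthly_cost_linear(daily_urls):
    """Rough monthly cost for a given daily volume, assuming linear scaling.

    The post reports scaling that is a little less than linear, so this is
    only a ballpark extrapolation.
    """
    return cost_per_capacity_url * daily_urls


if __name__ == "__main__":
    print(f"per URL of daily capacity: ${cost_per_capacity_url:.5f}")
    print(f"per URL processed:         ${cost_per_processed_url:.7f}")
    print(f"10M URLs/day, linear:      ${monthly_cost_linear(10_000_000):,.0f}/month")
```

Under these assumptions the cost per URL actually processed is roughly 30 times smaller than the $.00086 figure, because each unit of daily capacity handles about 30 URLs over a month.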

Posted in research | Comments Off

a journal paper

In the summer of 2007 I wrote a paper on the OP web browser that was published at Oakland in 2008. A few months afterward I was invited to submit it as a “fast tracked” paper in a journal. I thought it would be an easy way to add in some of the work we had done while working on and using OP since summer 2007.

If, or when, the journal paper actually gets published, security and systems researchers will have been using OP since November 2007 (3 years ago!) and Chrome since Sept 2008 (2 years ago!). They will also have had the opportunity to read the Gazelle paper (summer 2009), use Firefox with out-of-process plugins (spring 2010), and possibly even try out a full multi-process Firefox (upcoming release?), not to mention LCIE in IE8 (spring 2009). And this list doesn’t even include the many other security improvements that have been made in these browsers by both researchers and industry.

Posted in research | Comments Off