Taking a look at some logs from a RSS-collector two things raised my eyebrows. The first is how many feeds are being served by FeedBurner instead of directly being served by the website it self. The part that worries me is that a lot of those feeds also are about security, privacy and compliance. I think a lot of those people have something to think about in 2012.
The other thing that worries me even more is something I discussed with WordPress developers a couple of years ago and I know others who have done the same with other projects. A lot of projects learned to do input validation, but most of them still need to learn to do output validation. The parser I currently use appears to be very strict luckily and drops a feed when it doesn’t parse correctly. Here comes the funny part, other parses like from Google Reader seems to be more forgiving.
When I search for “libxml exploit” on Google Search I get 1.220.000 results back. I didn’t start searching for parsers currently in use, but this doesn’t look very promising. With current hash-issues in mind, how could this be used to be an attack vector? Keep in mind that a lot of sites use FeedBurner to take the load of there site. And yes, FeedBurner doesn’t really clean things up if I may believe my current logs. So the recipe looks like a good exploit to misuse, a high profile WordPress based website with FeedBurner enabled and watch the fireworks.