Facebook Open Sources Presto for Munching Petabytes of Data
By on November 7th, 2013

Facebook unveiled Presto, a SQL-on-Hadoop engine that it developed in-house, back in June this year. The SQL engine is capable of doing fast interactive analysis on the social networking site’s enormous 250-petabyte-and-growing data warehouse, with processing speed 10 times faster than Hive.

Today the company has open sourced Presto and the code was made available today under the Apache v2 license. According to Facebook, Presto is “ten times better” than alternatives such as Hive when it comes to CPU efficiency and latency for a large number of queries.

“It currently supports a large subset of ANSI SQL, including joins, left/right outer joins, sub-queries, and most of the common aggregate and scalar functions, including approximate distinct counts (using HyperLogLog) and approximate percentiles (based on quantile digest),” Martin Traverso, a software engineer at Facebook said.

Presto

Facebook initially relied on Hadoop MapReduce along with Hive, however, as users increased and its data kept multiplying, the approached seemed very slow. To overcome this issue, Facebook started the development of Presto in the fall of 2012 and was released to Facebook employees last spring. Facebook says that the engine is used by over 1000 employees, running over 30,000 queries on a daily basis.

So, who can use Presto? Well, if you’re a business with 750GB or more data, Presto could be the right choice for you, and Facebook estimates that the system could be relevant for such businesses.

Presto, unlike Hive, does not depend on MapReduce computing framework, which in fact has led to improved scheduling, says Facebook. The software is already being tested by a number of other large Internet services, namely AirBnB and Dropbox.

You can get the source code here.

 

Tags: ,
Author: Joel Fernandes Google Profile for Joel Fernandes
Joel Fernandes (G+) is a tech enthusiast and a social media blogger. During his leisure time, he enjoys taking photographs, and photography is one of his most loved hobbies. You can find some of his photos on Flickr. He does a little of web coding, and maintains a tech blog of his own - Techo Latte. Joel is currently pursuing his Masters in Computer Application from Bangalore, India. You can get in touch with him on Twitter - @joelfernandes, or visit his Facebook Profile for more information.

Joel Fernandes has written and can be contacted at joel@techie-buzz.com.

Leave a Reply

Name (required)

Website (optional)

 
    Warning: call_user_func() expects parameter 1 to be a valid callback, function 'advanced_comment' not found or invalid function name in /home/keith/techie-buzz.com/htdocs/wp-includes/comment-template.php on line 1694
 
Copyright 2006-2012 Techie Buzz. All Rights Reserved. Our content may not be reproduced on other websites. Content Delivery by MaxCDN