Facebook Open Sources Presto for Munching Petabytes of Data

Facebook unveiled Presto, a SQL-on-Hadoop engine that it developed in-house, back in June this year. The engine runs fast, interactive analysis on the social networking site’s enormous 250-petabyte-and-growing data warehouse, and Facebook claims it processes queries 10 times faster than Hive.

Today the company open sourced Presto, making the code available under the Apache v2 license. According to Facebook, Presto is “ten times better” than alternatives such as Hive when it comes to CPU efficiency and latency for a large number of queries.

“It currently supports a large subset of ANSI SQL, including joins, left/right outer joins, sub-queries, and most of the common aggregate and scalar functions, including approximate distinct counts (using HyperLogLog) and approximate percentiles (based on quantile digest),” Martin Traverso, a software engineer at Facebook, said.
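To give a rough idea of what an approximate distinct count involves, here is a minimal Python sketch of the general HyperLogLog technique. It is not Presto's code (Presto is written in Java), and the class name, the choice of SHA-1 as the hash, and the default of 4,096 registers are all assumptions made purely for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog counter (hypothetical, not Presto's implementation)."""

    def __init__(self, p=12):
        self.p = p                      # number of hash bits used to pick a register
        self.m = 1 << p                 # number of registers (4,096 by default)
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: SHA-1 truncated to 64 bits stands in for a fast 64-bit hash.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)                  # top p bits select the register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        # Harmonic mean of 2^register across all registers, scaled by alpha * m^2.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting while many
        # registers are still empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate
```

Adding the same value twice leaves the registers unchanged, so duplicates do not inflate the estimate. With 4,096 registers the typical error is around 1.6 percent, while the memory used stays fixed no matter how many values are added.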


Facebook initially relied on Hadoop MapReduce along with Hive. However, as its user base grew and its data kept multiplying, that approach proved too slow. To overcome this, Facebook began developing Presto in the fall of 2012 and released it to its employees last spring. Facebook says that the engine is now used by over 1,000 employees, who run more than 30,000 queries a day.

So, who can use Presto? Facebook estimates that the system could be relevant for businesses with 750GB or more of data.

Unlike Hive, Presto does not depend on the MapReduce computing framework, which, according to Facebook, has led to improved scheduling. The software is already being tested by a number of other large Internet services, namely Airbnb and Dropbox.

You can get the source code here.


Published by

Joel Fernandes

Joel Fernandes is a tech enthusiast and a social media blogger. In his leisure time he enjoys photography, one of his favorite hobbies, and you can find some of his photos on Flickr. He does a little web coding and maintains a tech blog of his own, Techo Latte. Joel is currently pursuing his Master's in Computer Applications in Bangalore, India. You can get in touch with him on Twitter - @joelfernandes, or visit his Facebook Profile for more information.
