Thursday, May 22, 2008

The biggest Data Warehouse in the world

Computerworld published a story today about Yahoo's database. Yahoo has a 2-petabyte database and claims "that it is not only the world's single-largest database, but also the busiest". A petabyte equals one thousand terabytes, one million gigabytes, or one billion megabytes.
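As a quick check of the unit arithmetic, here is a minimal sketch that assumes decimal (SI) prefixes, where each step up is a factor of 1,000. The `convert` helper is made up for this illustration.

```python
# Decimal (SI) storage units, expressed in bytes.
MB = 10**6
GB = 10**9
TB = 10**12
PB = 10**15


def convert(amount, from_unit, to_unit):
    """Convert an amount between two units given as byte counts."""
    return amount * from_unit // to_unit


# One petabyte in smaller units.
print(convert(1, PB, TB))  # 1000
print(convert(1, PB, GB))  # 1000000
print(convert(1, PB, MB))  # 1000000000

# The 2-petabyte figure quoted for Yahoo's database, in terabytes.
print(convert(2, PB, TB))  # 2000
```

With binary prefixes (factors of 1,024) the numbers would be slightly different, but the order of magnitude is the same.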

Yahoo uses the database, a specially built data warehouse, to analyze the behavior of its half-billion Web visitors per month. Yahoo bought a database start-up that had modified the open-source PostgreSQL to run as a column-based database instead of a row-based one, and it has continued to enhance the software with tighter data compression, more parallel data processing and more optimized queries. Yahoo also has other large databases to store unstructured data such as video and sound files.
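To give a rough idea of why a column-based layout helps with analysis and compression, here is a small sketch in plain Python. It is only an illustration, not Yahoo's or PostgreSQL's implementation. The sample records, the field names and the helper functions are all invented for the example.

```python
# Hypothetical page-view records stored row by row.
rows = [
    {"user_id": 1, "page": "mail", "ms": 120},
    {"user_id": 2, "page": "mail", "ms": 80},
    {"user_id": 3, "page": "news", "ms": 200},
    {"user_id": 4, "page": "news", "ms": 150},
    {"user_id": 5, "page": "news", "ms": 50},
]


def to_columns(records):
    """Regroup row records into one list of values per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# An aggregate over one field only needs to read that one column.
total_ms = sum(columns["ms"])

# Values in the same column tend to repeat, so they compress well.
page_rle = run_length_encode(columns["page"])

print(total_ms)  # 600
print(page_rle)  # [['mail', 2], ['news', 3]]
```

In a row-based layout, the same sum would have to read every field of every record. A real column store works on disk blocks and uses more elaborate compression, but the basic idea is the same.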

