Hadoop Architecturegasm

Posted by andry
on Friday, August 08

[Image: sketch]

The phrase “cloud computing” has been buzzing in my head ever since I played with EC2 and S3 in early 2007.

When the industrial revolution began more than two centuries ago, factories and farms built and maintained their own power generators. Later, when electricity companies arrived, factories began to dump their generators and “subscribe” to these electricity companies.

Computing needs will evolve the same way. IT companies will dump their server farms, keep a small number of servers with a lot of network peripherals, and “subscribe” to computing companies. No more headaches maintaining racks, power, and air cooling.

I’m not dreaming. This is happening. Right here, right now.

When you reach frightening numbers, say 5GB/day, you’ll realize that you need massive computing resources. When you reach 5TB/day, you know that everything they taught us in computer science school is obsolete.

RDBMS is irrelevant.
OOP is dead.
(What? Trying to load/stream 5TB with precious java.io.* is brutal suicide. A SAX-like event-driven parser is another way to go, although you’ll get a far more elegant death.)

Just after I looked at the sketch we made several days ago, I knew that I won’t be writing code, if I have to, the same way I wrote code years ago.

Comments

  1. silent August 08, 2008 @ 12:43 PM
    hmm... that would be another detik project, I guess...
  2. pebbie August 08, 2008 @ 02:50 PM
    OOP is dead. (What? Trying to load/stream 5TB with precious java.io.* is a way to brutal suicide. SAX-like event-driven parser is yet another way, although you’ll get far more elegant death).
    That doesn't necessarily kill the overweight giant. We might just need to lift ourselves higher and float within the cloud (no more standing on the shoulders of a giant), which means the rise of higher and more versatile abstractions. Back to Prolog (business rules/workflows)? Extend to distributed-collaborative logic? Thanks.
  3. Eftu August 08, 2008 @ 05:03 PM
    Ah, but Detik is just the same as ever :P
  4. andika August 08, 2008 @ 05:44 PM
    are you sure 5GB/day is large? :p http://fotografer.net/ can easily serve 20GB/day using only one box and I've served ~3GB/day since 2001 http://web.archive.org/web/20020528110624/http://mirrors.piksi.itb.ac.id/ (static content, though)
  5. Abe August 08, 2008 @ 09:22 PM
    Look, Grandpa andika has spoken. *salutes* Anyway, Grandpa, this is just a matter of age (if you'd rather not call it a matter of hair). Let's wait and see: by the time andry is Grandpa andika's age, he may be handling even more. Trust me... *oh right, gotta ruuun...*
  6. treespotter August 10, 2008 @ 07:05 PM
    now, that's an interesting revelation. I'm not so certain of the numbers, but well, yeah, the theory is right there.
  7. sufehmi August 11, 2008 @ 07:46 AM
    One of the problems with cloud computing is the perception that cloud computing = scalable + resilient. In fact, some implementations turn out to be only scalable but not resilient, plus various other problems. Examples: Amazon S3 downtime, Amazon EC2 down, EC2: no persistent storage, EC2: several problems. Perhaps that expectation exists because we have already been exposed to Google's architecture, which is not only scalable but also very resilient. So when we hear the term "cloud computing", what comes to mind is a service level like that too. There are now several cloud computing solutions that can be used for a range of purposes (fairly generic), are based on open source, and offer scalability + resiliency. Hopefully, before too long, these solutions will also become easier to implement.
  8. sufehmi August 11, 2008 @ 07:51 AM
    @andika - "http://fotografer.net/ can easily serve 20GB/day using only one box"
    Indeed, a client of mine is serving 60 GB/day, and it is increasing every day, using a server with a single-core processor. Loads of RAM though, so we are able to use caching in all parts of the system, from the database up to the reverse proxy.
  9. Andry August 12, 2008 @ 07:47 PM
    @andika: that's 5GB from just one Apache on just one mirror, sir :d

    @sufehmi: PLN's power flickers on and off all the time too, yet people still keep subscribing to it.
  10. Feris September 14, 2008 @ 11:38 PM
    Dear Pak Andry, Yes... I think our conversation at JUG several months ago makes sense to both of us. Very keen to see Detik implement Hadoop in its own way. Regards, Feris