-
Building Scalable Websites with Perl, Perrin Harkins
2004-07-29 17:59 in /tech/oscon
Big sites run on Perl: Yahoo / Overture, Amazon / IMDB, TicketMaster / CitySearch. Billions of requests a day served with Perl. How do we do it?
Scaling at this level requires more than just hardware. If your application is slow, you can’t just throw hardware at it. This is about high-level design patterns for applications.
Caching
It’s nearly always worthwhile to cache some data. Caching full pages is best if you can do it: pregenerate, or generate on the fly and use the mod_proxy cache. Usually you can’t, because you have at least some personalized data on every page. Partial page caching caches different page sections with different expirations. Mason provides this. Other modules: Cache::FastMmap (key/value storage in local shared memory), Cache::Memcached (distributed key/value storage).
(Aside: the tied BerkeleyDB interface is bad; the object interface is about 3x faster.)
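To make the partial page caching idea concrete, here is a minimal sketch, written in Python rather than Perl purely for illustration. It is not how Mason, Cache::FastMmap, or Cache::Memcached work internally; the class name FragmentCache, the function render_page, the section names, and the lifetimes are all made up for the example. It only shows the pattern: shared sections are cached with different expirations while the personalized section is built on every request.

```python
import time


class FragmentCache:
    """Tiny in-process key/value cache with a per-entry time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so expiry can be tested
        self._store = {}             # key -> (expires_at, value)

    def get_or_build(self, key, ttl_seconds, build):
        """Return the cached fragment, rebuilding it once it has expired."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = build()
        self._store[key] = (now + ttl_seconds, value)
        return value


def render_page(cache, user_name):
    # Shared sections are cached, each with its own (hypothetical) lifetime.
    header = cache.get_or_build("header", 3600, lambda: "<h1>Site</h1>")
    news = cache.get_or_build("news", 60, lambda: "<ul><li>headline</li></ul>")
    # The personalized section is generated on every request.
    greeting = "<p>Hello, %s</p>" % user_name
    return header + news + greeting
```

A distributed cache would swap the dictionary for a network store, but the get-or-build shape stays the same.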
Job Queuing
Bursty traffic is a real pain. If traffic load exceeds steady state capacity, you need to queue requests to be handled in turn. This decouples users from backend load. The typical “search in progress” approach is okay, but doesn’t scale unless you limit the number of worker threads / processes. You don’t want to fork 1000 processes if you have 1000 users waiting. Spread::Queue is one possibility.
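A rough sketch of the "limit the number of workers" point, in Python for illustration only. This is not Spread::Queue or anything shown in the talk; run_jobs, its parameters, and the default of 4 workers are assumptions for the example. The idea is that 1000 waiting requests go into a queue and a fixed pool drains it, instead of starting 1000 workers.

```python
import queue
import threading


def run_jobs(jobs, handler, num_workers=4):
    """Process every job using a fixed number of worker threads."""
    jobs = list(jobs)
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:              # sentinel: no more work
                return
            job_id, payload = item
            outcome = handler(payload)
            with lock:
                results[job_id] = outcome

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for job_id, payload in enumerate(jobs):
        pending.put((job_id, payload))
    for _ in workers:
        pending.put(None)                 # one sentinel per worker
    for w in workers:
        w.join()
    # Results come back in submission order, however they were scheduled.
    return [results[i] for i in range(len(jobs))]
```

The sketch does no error handling: a handler that raises would kill its worker, so a real queue would need to catch and record failures.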
Questions...
How do you deal with session data? General rule: keep as little session data as you possibly can.
Memcached vs. MySQL performance? About the same for small numbers of clients. Memcached ought to be better for large numbers (at least on BSD or Linux 2.6).
Is Spread ready for production? Seems to be.
-
IO::All
2004-07-29 14:59 in /tech/oscon
Damn... been coding for 5 minutes and I’m wishing I could use IO::All.
-
IO::All and Other Spiffy Modules, Brian Ingerson
2004-07-29 13:40 in /tech/oscon
Well, I’m figuring I picked the right talk for this time period, since Damian Conway is in the back row as well as a couple other luminaries. If it’s worth his time, it’s probably going to be interesting. (Correction: everyone is here)
Goal: make all I/O as easy as writing to STDOUT. E.g.
"Hello World\n" > io("greetings.txt");
Don’t worry, there are method versions of everything, but overloaded operators seem to make it clearer.
IO::All exports the io() function, which is just a factory for IO::All objects. IO::All objects can be files, directories, sockets, pipes, DBMs, web pages, and more.
Files can slurp(), >>, or scalar ref to get contents.
next() iterates through directories, returning more IO::All objects.
Stuff that just works:
io('-') > io('-');
$tmpio = io('?') < $str; # tmp file
$tmpio->seek(0,0);
$tmpio > io('='); # STDERR
<, >, and << on HTTP IO objects do GET, PUT, and POST.
2 line forking webserver with CGI support
“IO::All is not your mom”
Continuing on with Spiffy.pm. Sort of a super-exporter that works with inheritance. Also some nice OO conveniences. Implicit $self; can call super to call the method of the same name with the same arguments in the superclass. field and const define mutable and immutable object attributes. Provides mixins.
-
How to Manage over 1000 Servers in Your Spare Time, Sean Lynch
2004-07-29 11:18 in /tech/oscon
Another standing room only talk, but at least I got a seat. I’m not sure who is doing room assignments, but this seems like it should have been in a larger one. I think the problem may be that there are only small rooms and really big rooms, and no medium rooms.
This talk will cover a subset of best practices used by Ticketmaster to manage a large set of servers with a modest staff. Approx. 800 production Linux servers, 4 production clusters, 6000 tickets/minute (max or avg?), very bursty around onsale times. Application is divided into 24 systems running on 24 distinct classes of nodes.
Operational culture of TM: automation, open source, reliability, modularity
Best Practices: everything should be active, utilize a flexible system organization, automate everything
Best Practice 1: everything should be active. Use all your hardware resources. Better to automatically decommission known systems than to automatically commission unknown systems. Idle standby systems often decay, leading to surprises when you have to fail over. TM uses: multi-master Oracle replication, active/active VRRP, clustered NAS.
Best Practice 2: utilize a flexible system organization. Operations is often the last to hear about changes. Naming of servers: <class><node#>.<product>.<cluster>.<sa>. All nodes of same class are configured identically; all automation software keys off class. All nodes have two NFS shares: /<class>/local/, /<class>/shared/. Shared typically has code, configuration, templates. Local used for logging.
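Since all the automation keys off the class embedded in the hostname, a small sketch of pulling that apart might help. This is Python for illustration, not Ticketmaster's code. The hostname in the usage note is invented, and the regular expression makes assumptions the talk did not state: that the class is lowercase letters, that the node number is the digits right after it, and that the trailing <sa> part is whatever remains.

```python
import re
from typing import NamedTuple


class NodeName(NamedTuple):
    node_class: str
    node_number: int
    product: str
    cluster: str
    sa: str


# <class><node#>.<product>.<cluster>.<sa>  (character classes are assumptions)
_PATTERN = re.compile(r"^([a-z]+)(\d+)\.([^.]+)\.([^.]+)\.(.+)$")


def parse_node_name(hostname):
    """Split a hostname into the fields of the naming scheme."""
    match = _PATTERN.match(hostname)
    if match is None:
        raise ValueError("hostname does not follow the scheme: %r" % hostname)
    node_class, number, product, cluster, sa = match.groups()
    return NodeName(node_class, int(number), product, cluster, sa)


def share_paths(node_class):
    """Return the (local, shared) NFS share paths for a class."""
    return ("/%s/local/" % node_class, "/%s/shared/" % node_class)
```

For example, parse_node_name("web12.tickets.east.example") yields class "web" and node number 12, and share_paths("web") gives "/web/local/" and "/web/shared/".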
Best Practice 3: Automate everything.
Configuration management: Kickstart to do OS install. Per-class symlink to shared RPM repository. Overlay system copies m4 templates, by class, to root filesystem. All config files are templated. Can deploy a new server in 7 minutes.
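The "all config files are templated, by class" idea can be sketched in a few lines. This is only an illustration in Python: string.Template stands in for m4, and the class names, settings, and template text below are invented, not taken from the talk.

```python
from string import Template

# Hypothetical per-class settings; a real system would load these from
# wherever class configuration is kept.
CLASS_SETTINGS = {
    "web": {"max_clients": "150", "log_dir": "/web/local/logs"},
    "app": {"max_clients": "40", "log_dir": "/app/local/logs"},
}

# Hypothetical template shared by every class.
CONFIG_TEMPLATE = Template(
    "MaxClients $max_clients\n"
    "ErrorLog $log_dir/error.log\n"
)


def render_config(node_class):
    """Fill in the shared template with one class's settings."""
    # substitute() raises KeyError if a placeholder has no value, which
    # catches incomplete class definitions early.
    return CONFIG_TEMPLATE.substitute(CLASS_SETTINGS[node_class])
```

Because every node of a class renders the same template with the same values, two nodes of one class end up with identical files.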
Next gen CM (in development): always running to detect configuration drift. TT replaces m4, APT replaces symlinks. Configuration in a giant XML file with all information needed for all classes (is this a good idea?). Configuration authors must also provide auditing functionality. Define severity levels for changes which limit what time windows they can execute during.
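A rough sketch of two ideas from the next-gen system, drift detection and severity-limited time windows, in Python for illustration. Nothing here reflects the actual implementation: the function names, the severity labels, and the window hours are all assumptions.

```python
import datetime

# Hypothetical change windows per severity level: (start, end), inclusive.
CHANGE_WINDOWS = {
    "low": (datetime.time(0, 0), datetime.time(23, 59, 59)),
    "high": (datetime.time(2, 0), datetime.time(5, 0)),
}


def find_drift(expected, actual):
    """Return the paths whose actual content differs from what is expected.

    Both arguments map a file path to its content; a path missing from
    `actual` counts as drift.
    """
    return sorted(
        path
        for path, wanted in expected.items()
        if path not in actual or actual[path] != wanted
    )


def may_apply(severity, now):
    """Report whether a change of this severity may run at time `now`."""
    start, end = CHANGE_WINDOWS[severity]
    return start <= now <= end
```

An always-running agent would loop over find_drift and only correct a file when may_apply allows a change of that severity at the current time.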
Systems Engineering staff is 1 lead, 2 network admins, 3 sysadmins, and 1 developer.
-
Infinite In Three Dimensions
2004-07-29 09:36 in /tech/oscon
Esther Dyson is stuck in Dallas, so it’s just Freeman and George this morning.
Freeman is mostly interested in biotech currently. He has hopes that this technology will become as common as computers, giving home gardeners, pet breeders, etc. the ability to use it in their work and hobbies.
Tim asks about possible risks.
Freeman wonders whether it can be stopped, whether it should be stopped, and if so, how you set up the rules so that it can be used for good purposes and not dangerous ones.
Tim asks George about his experiences leaving the traditional education system.
George comments that he thinks there is a false dichotomy between doing work with your hands and with your mind. On the contrary, good mind work often happens while your hands are occupied with other things. He misses the days when people learned about technology by taking apart carburetors and the like. Now, shop classes use CAD instead of actually taking apart and building stuff.
Back to Freeman, Tim references the debates about nuclear technology and asks about how he views society and risk.
Freeman says people are really bad at assessing risk quantitatively. Our society is both extremely risk-averse and at the same time highly risk-taking, because people don’t think about things correctly.
Tim quoting Mitch Kapor: “Don’t need a Department of Homeland Security, need a Department of Homeland Arithmetic”.
-
Weds. Review
2004-07-29 09:33 in /tech/oscon
Yesterday was fairly good overall. I enjoyed most of the sessions I went to, and I didn’t overload myself. After the LJ talk, I ran into Geoff Young and chatted with him for a bit about an issue I’m trying to deal with. Unfortunately, I ended up rather late for the Parallel Sessions in Perl talk as a result. He was on slide 11 of 20 when I got there.
Aside: I almost ignored that talk entirely when I was looking over the schedule. That’s because at every other conference I’ve been to “Parallel Sessions in Foo” means something completely different.
After the sessions, I went off to dinner with a bunch of veggies. We attempted to go to the Calendula Cafe, but they were closed unexpectedly. Instead we went across the street to Vege Thai. I was worried for a bit when the vegans at the table started wondering whether the soy “meats” might not be vegan, and I’m just thinking, my god, we went all the way across town to a completely vegetarian restaurant and you people still can’t eat anything on the menu. (N.B., I am not a vegetarian, I just live with one.) But it turns out their fake meats don’t have egg in them, so it was all good.
Post dinner, all the Yahoo folks headed to a party hosted by O’Reilly, where we all talked to each other. Kinda lame. I wandered a little bit and chatted with a few people, but it was loud and I was having trouble focusing. There were a few people I would have liked to talk with, but I wasn’t feeling up to starting conversations.