About the Author

Chris Shiflett

Hi, I’m Chris: entrepreneur, community leader, husband, and father. I live and work in Boulder, CO.


All posts for Oct 2003

Virtual Machine War

Dave has a very interesting discussion on the "Virtual Machine War". Personally, I'm placing my bets and hopes on Parrot.

What Is Scalability?

There is an interesting article on ONJava.com entitled "The PHP Scalability Myth." The author describes scalability as follows:

There are a number of different aspects of scalability. It always starts with performance, which is what we will cover in this article. But it also covers issues such as code maintainability, fault tolerance, and the availability of programming staff.

Is this what scalability means? It's certainly not my definition. Do code maintainability, fault tolerance, and the availability of programming staff have something to do with scalability? They can if your definition of scalability takes human resources into account, which seems reasonable.

The definition I find on Dictionary.com describes scalability as:

How well a solution to some problem will work when the size of the problem increases.

This seems like a better definition. A textbook definition would be something to the effect of, "the ability to scale." This is probably a starting point that everyone can agree to. So why do some people argue that certain technologies (PHP, mod_perl, Java, etc.) don't scale? I have always assumed that these people define scalability as the ability of something to scale well and that they're using their own subjective opinions to define what scales well and what doesn't. This is where things go wrong. It also seems that more and more people treat scalability as a measure of performance, which it is not. Something that performs very poorly can still potentially scale very well. Scalability is a relative measurement.

Before I say more, I should describe what I think it means for something to scale well. Consider the following three figures:

Figure 1 represents a case where the amount of required resources grows exponentially with the number of users. This is bad. In Figure 2, the amount of required resources grows linearly. This is typical (the rate of growth can vary; smaller is better). In Figure 3, the amount of required resources grows logarithmically. This is very nice. My opinion is that both Figure 2 and Figure 3 represent something that scales well. Because I am a Web developer, a growing number of users is typically what it means for the "size of the problem" to increase. The term "resources" refers to many things, but most people are concerned with cost. Things that cost money include hardware, software, human resources, and time.
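
The three growth patterns described above can be sketched in a few lines of Python. This is only an illustration: the three cost functions and their constants are hypothetical, chosen to make the shapes visible, and do not model any real system. The sketch asks one question of each curve: if the number of users doubles, how many times more resources are needed?

```python
import math

# Hypothetical cost models: each maps a number of users to "resource units".
# The constants are arbitrary assumptions, not measurements.
def exponential_cost(users):
    return 2 ** (users / 1000)

def linear_cost(users):
    return 0.01 * users

def logarithmic_cost(users):
    return 10 * math.log2(users)

def growth_factor(cost, users):
    """How many times more resources are needed when the user count doubles."""
    return cost(2 * users) / cost(users)

if __name__ == "__main__":
    for users in (1_000, 10_000):
        print(
            f"{users:>6} users -> doubling costs "
            f"exponential x{growth_factor(exponential_cost, users):.2f}, "
            f"linear x{growth_factor(linear_cost, users):.2f}, "
            f"logarithmic x{growth_factor(logarithmic_cost, users):.2f}"
        )
```

With the linear model, doubling the users always doubles the resources. With the exponential model the penalty for doubling keeps getting worse as the starting point grows, and with the logarithmic model it keeps shrinking toward no extra cost at all.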

Lastly, let's look at an example. Consider two hypothetical technologies, Technology A and Technology B:

Resources required to build an application that supports 100,000 users a day:
Technology A: 10 servers, 5 developers, and 6 months of development time
Technology B: 40 servers, 10 developers, and 3 months of development time

Resources required to build an application that supports 250,000 users a day:
Technology A: 25 servers, 5 developers, and 9 months of development time
Technology B: 50 servers, 10 developers, and 6 months of development time

Which technology do you think scales better? Which appears to be the better choice when no more than 250,000 users a day need to be supported? Should things like maintainability and robustness be taken into consideration? How do you measure these things? If you are making decisions based on your assumptions about the scalability of certain technologies without asking these types of questions, you need to stop making such decisions.
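
One way to start on these questions is to put the numbers from the example side by side. The sketch below is in Python, and the names (`Requirements`, `growth`, `servers_per_10k_users`) are invented for illustration. It compares how each resource changes as the load goes from 100,000 to 250,000 users a day. Which of these ratios matters most is a judgment call that the code does not make.

```python
from dataclasses import dataclass

@dataclass
class Requirements:
    servers: int
    developers: int
    months: int

SMALL, LARGE = 100_000, 250_000  # users per day, from the example above

# Figures copied from the example; the data structure itself is an assumption.
technology_a = {SMALL: Requirements(10, 5, 6), LARGE: Requirements(25, 5, 9)}
technology_b = {SMALL: Requirements(40, 10, 3), LARGE: Requirements(50, 10, 6)}

def growth(tech):
    """Ratio of each requirement at the large load to the small load."""
    before, after = tech[SMALL], tech[LARGE]
    return {
        "users": LARGE / SMALL,
        "servers": after.servers / before.servers,
        "developers": after.developers / before.developers,
        "months": after.months / before.months,
    }

def servers_per_10k_users(tech, users):
    """Absolute hardware cost, normalized to 10,000 users a day."""
    return tech[users].servers / (users / 10_000)

if __name__ == "__main__":
    for name, tech in (("A", technology_a), ("B", technology_b)):
        print(f"Technology {name}: growth {growth(tech)}")
        for users in (SMALL, LARGE):
            print(f"  {users} users/day: "
                  f"{servers_per_10k_users(tech, users)} servers per 10,000 users")
```

While users grow by a factor of 2.5, Technology A's server count also grows by 2.5 and Technology B's by only 1.25, so by that one measure B scales better. Even so, B still needs twice as many servers as A at 250,000 users a day, which is why the relative and absolute questions have to be asked separately.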

Article Errata

My article about XSS and CSRF was published today (technically yesterday, since it is after midnight) in php|a. When reading through it, I couldn't help my perfectionist tendencies, and I found myself noticing a few minor errors. None of these exist in the original manuscript, but the complexities of the editorial process can sometimes introduce a few problems. I have found this to be true with both book publishers and magazine publishers. Just as with writing code, any step that involves a change can introduce bugs.

The reason I decided to write about this is that php|a offers some nice forums, and each article they publish is given its own forum. This provides a convenient place for follow-up questions and discussions about the article. It also provides a home for article errata.

I have found many articles on the Web with serious errors, and given how easily misinformation can mislead people, it would be nice if there were an easy way for people to find article errata (in cases where the article itself cannot be corrected). I have tried to contact the original author in a few cases, but it seems that almost every email address I find for this purpose is outdated.

Would a single source for such article errata be the best solution, or should each publisher/Web site provide its own? I'm not sure, but I may give it some more thought.

RAMP Training

NYPHP has announced RAMP training courses. Lasting only three hours each, these courses are intended for people who are already experienced but are looking for advanced instruction on very specific topics. The hope is that you can take a class in the morning and be applying what you have learned the same afternoon.

I will be teaching HTTP and State Management, which I hope will help people use PHP sessions more effectively. Once I cover some fundamental topics, I plan to focus on debugging techniques and methods of improving the security of your sessions.

These courses are scheduled for November 10 and 11 and are being taught in some nice training facilities in New York City.