Momentum

"An object at rest tends to stay at rest, and an object in motion tends to stay in motion." Isaac Newton's First Law of Motion seems to apply as well to projects as it does to physical objects. In my experience it is much easier to continue working on a project and keep it moving if it already has decent momentum. It is much harder to make any progress on a project that is languishing from lack of attention, or has hit a barrier that has stopped it completely. To keep a project that you care about moving, you need to at least maintain momentum, and preferably build it up.

In classical mechanics, the momentum of an object is simply its mass multiplied by its velocity:

p = mv

The equation couldn't be simpler, but it has powerful implications in physics. Those implications can also apply to projects.

Specifically, a project has properties that could be considered its mass and velocity. The mass of a software project could be the amount of code, design assets, and documentation that have been created for it. If the project has been released and is out in front of real users, the mass could also be the number of people using it. If the users are actually customers paying for the software, the mass could be the amount of revenue the software is bringing in. In any case, the mass of a project is most likely growing over time. Since the mass is increasing, according to the equation, momentum should also be increasing, but there is that other term to consider.

The velocity term is a vector that has a direction and a magnitude. The magnitude of the velocity (a.k.a. the speed) of a project is simply how fast it is moving over time. How quickly are features being added, bugs getting fixed, or interfaces getting polished? How fast are infrastructure and various support mechanisms being added to the project so that it can survive in the wild? Those are some of the things that determine the speed of a project.

As for the direction, that is determined by how the project is progressing towards the goals set out for it. Two things need to be considered here. The project could either be headed in the direction of the project goals or not, and the project goals could either be in the right place or not. Ideally we want the project goals to be in the right place and the project heading towards them, but that's not always reality.

We've now come to the first insight we can gain from this mental model. (It's obvious, but that's the way insights are sometimes.) If you realize a project is heading in the wrong direction or the goals are way off-base, then how easily the project can be course-corrected will depend on its mass and its velocity. The larger a project is or the faster it's moving, the more energy it will take to change its course because it has more momentum. Likewise, the larger the course correction needs to be, the more energy it will take to get it back on track. Small adjustments can be made even to large or fast-moving projects relatively easily, but if you have to turn a large project completely around, it is going to take a lot of effort.

Something else that happens to projects over time, other than needing course corrections, is that they grow. Much of the energy that a team puts into a project gets converted to mass in the form of the aforementioned materials, customers, and revenue. All of this accumulated mass makes projects more sluggish as they grow. It takes more and more energy to keep a growing project moving fast, which is why it seems physically impossible to make progress on huge, crusty projects with any kind of speed. In contrast, a greenfield project has almost zero mass, so it feels like making lightning fast progress takes no effort at all.

Those greenfield projects don't stay small for long, though. A new project that's making good progress will quickly increase in mass. As it grows, another force will come into play that is alluded to in Newton's First Law of Motion—friction. Since a project is never built in a vacuum, it will have to overcome all kinds of friction to stay in motion. As a project grows, its surface area will also grow, increasing the frictional forces on it.

Even projects that you wouldn't think would encounter more friction as they grow will, in fact, do so. For example, this blog seems to be adding friction as I write more and more posts. I didn't anticipate that. As I've learned things about writing and added new features to my later posts, I am acutely aware of the benefits of revisiting some of my older posts and updating them. I should also refresh my memory of things that I've already written. Both of these frictions continue to grow as I write more, and I'm sure other frictions will arise as I keep going. I also continue to deal with the multitude of distractions from everyday life that are pulling me in directions other than writing.

One way that I've found to keep the momentum up with writing—something I committed to from the start—is to write every single week, no excuses. I normally do even better than that by doing something blog related every single day. It's not always writing; I do that two to four nights a week. The rest of the days I'll make notes in Trello when an idea strikes me or I'm reading something and come across a good example for a future post. I also take time to think about my writing. It may not all seem like momentum, but it all adds up to a blog that keeps moving. It has worked pretty well for me.

I've found that that mentality works well for any kind of project, personal or work-related. Personal projects have all kinds of frictions, not the least of which is probably work and the energy drain that comes with it. It's so much easier to come home and veg out in front of the TV or drift on the internet than sit down and work on a project, but none of that will keep the momentum going, even if you care deeply about the project. The same is true for work projects. Frictions abound with meetings, email, and other (sometimes necessary) distractions.

The best way to overcome these frictions, whether for work or personal projects, is to do something for the project every day. It doesn't have to be a big thing every day. Some days are so packed with meetings or you're so exhausted when you get home that the best you can do is fix one simple little bug or think about the next feature for ten minutes, fifteen minutes tops. Do that. Even a small bit of progress will keep the momentum up on a project so that it's still moving when you have the time and energy to do something big for it.

The last thing you want is for the project to stop completely. Then you have to overcome the most devastating of all forces—static friction. Keeping momentum going is so much easier than starting up a project from a dead stop, especially as the project gets bigger (remember the project's mass). If you want a project to keep going, do at least one thing for it every day because the momentum will help carry your project along. Momentum is a powerful thing.

Leveling Up

As a programmer, I'm always trying to improve my abilities, to level up my programming skills. Reading a lot helps, but that only provides part of what's needed to become a better programmer. Reading about programming gives you the knowledge of what is out there and available to use, but it doesn't give you the ability to actually use any of that knowledge when you need it. To have the knowledge ready at your fingertips requires cold, hard practice.

I've written recently about the importance of knowing the basics. The main reason was that if you know the basics, you can derive the higher-level abstractions that are so useful in programming without having to memorize them all, giving you access to substantially more tools than you would otherwise have. But there is another reason that knowing the basics is so important. If you know the basics of programming so well that you don't so much recall them when needed as they flow from your fingertips into your text editor before you've even consciously decided to use them, then you will be able to think at a higher level while programming than would otherwise be possible.

Programming, Fast and Slow


Lately, I've been listening to the Stack Overflow podcasts in the car. I started from the beginning and have been going through them back-to-back. It's a Joel and Jeff overload, but they have some fascinating discussions pretty regularly. One discussion in Stack Overflow podcast #57 was on this very topic of good programmers being able to belt out basic programs quickly. It starts off with a question from a Stack Overflow user who is lamenting the fact that Stack Overflow seems to reward programmers that can answer questions really fast, basically promoting the fastest gun in the West without much regard for quality.

Regardless of whether or not that's true about incentives and quality, Joel took the opportunity to start analyzing what makes a good programmer and how to figure out if any given programmer is potentially good. He describes how he used to use an interview question with a very basic problem that could be coded up in a few if statements. He needed a way to filter out people from the dot-com crash that were looking for jobs but didn't actually know how to program, and he was hoping this simple problem would be a good first-stage filter. Here's what he did and what he found, edited a little for clarity:
It's sort of something like ... is John older than Mary?  <Laughter>  It's a little harder than that!  …[W]hat happens when I ask this question is that everybody gets it, but some people take a long time to think it out, and they're carefully thinking it through, and they really are struggling, and some people just write really fast and they're done.  And then we can move on to more advanced things.  And, it suddenly occurred to me, that there was a very high correlation between the people that I hired, and the people that could solve this problem as fast as they could write.
I don't think this means that you can study and practice the fundamentals exclusively, and magically you'll be able to solve the hardest programming problems. It doesn't work like that. I can certainly believe that great programmers are more likely to have a solid grasp of the fundamentals, so practicing the basics is a prerequisite that opens the door to higher levels of proficiency.

Joel continued with a fascinating story about a math professor teaching calculus:
Serge Lang is a math professor at Yale who is very interested in math education. Serge Lang, at the beginning of freshman calculus at Yale, gave people, for no credit, … told all the students to take out a piece of paper, and he put a fairly simple equation up on the board and said, "reduce this to its simplest terms." So this is basically 9th grade algebra. Then after like 30 seconds he said, "stop!" and took all their pieces of paper.

Some students were able to reduce this algebra equation to its simplest form as fast as they could write, and some of them really had to think about it and really had to work on it. … And he said that the people that could do it quickly was an almost perfect predictor of who would get an A in the course. In other words, that was as good a predictor of math ability as an entire semester in calculus with problem sets every week and probably two mid-terms and a final. 
This kind of predictor is fascinating, but after thinking about it for a while, it makes total sense. The students who did well in the course were nearly always the same students that did well in the simple algebra problem. That correlation doesn't necessarily mean that some students performed better because of some inherent proficiency in math. It could also mean that they did a much better job of internalizing the fundamentals of algebra so that by the time they needed to use algebra for calculus, they knew it backwards and forwards and could spend their mental energy on the new ideas of calculus.

Meanwhile, those students that struggled with the simple algebra problem probably continued to struggle with the algebra that they needed to use throughout the course. Because they never made the basic operations instinctive, they couldn't grasp the higher levels of thought necessary to learn calculus. Derivatives and integrals were beyond their reach because they still hadn't adequately learned how to simplify and balance equations.

Pretty much all of mathematics depends on knowing lower level operations and theory well to reach higher level theory. This process is obvious at the lowest levels of counting, arithmetic, and number systems, but it is also true at the highest levels of mathematical theory. Mathematicians often have a sense that a theorem is true long before they can prove it, and much of the work of proving it is building a foundation of simpler proofs and machinery to use to support the final theorem and discover a valid proof for it. Without the hard work of learning the basics, the higher-level proof would be unattainable.

From Dancing to Chess


This same concept of mastering the basics applies in any learning scenario that I can think of. Take dancing as an example. Most styles of dance have a basic step or basic set of moves that define that style. Swing has a basic step. The Tango has a basic step. Ballet has a basic set of movements. All of the other moves in a dance style branch off from that basic step.

I took some ballroom dance classes in college with my then-girlfriend, now-wife, and for each dance style we learned, we were constantly encouraged to practice the basic step until it was second nature. We should practice it until we could carry on a deep conversation without ever missing a step. It had to be automatic. If we had to think about what our feet were doing for the basic step, we had no chance of learning the fun dips and spins of the more advanced steps. Once we knew the basic step, it acted like a safety net that we could always fall back on if we got lost in a complicated move or made a misstep, and it gave us a lot of confidence to learn more challenging stuff.

Okay, enough about dancing; how about playing an instrument? The same idea applies. If you learn the basic chords on the guitar or the bow strokes and fingerings on the violin so well that you no longer have to think, "Okay now this finger goes here and this one goes here and I have to move my arm like so…," then you can focus on the music you're playing and actually think ahead to the measures you'll be playing next. When you've built up the muscle memory and your fingers are doing all of the thinking for you, that's when you can level up and really play.

Playing chess is another example where reaching higher levels requires knowing the fundamentals by heart. The best chess players in the world don't look at a position and see a bunch of pieces that need to be analyzed individually. They see patterns of moves, pawn structure, and collections of interacting pieces that allow them to see many moves ahead. All of that knowledge and ability to see patterns comes from thousands of hours of practicing the fundamentals of chess. They know the tactics of pins, forks, and skewers for all of the pieces. They thoroughly understand the implications of doubled, isolated, and passed pawns. They immediately see strategic advantages of space, outposts, and initiative.

Chess masters can clobber amateurs so quickly and decisively because they are so well-versed in the fundamentals. They don't have to think about their immediate next move because they can see the combinations in a position and know the consequences of different moves by instinct. All of their hard work has paid off in an ability to operate at a much higher level of chess thought.

Take Time Out to Level Up


All of these examples have at least one thing in common—lots of practice. It's like level grinding in a role-playing game. If you reach a point in the game where you can't seem to make any forward progress, you probably need to go off and fight a bunch of enemies, gain some experience, and grind out a few levels. When you come back to the point where you were stuck before, it will be much easier after you've leveled up.

Practicing the basics until they are second nature, until you no longer have to think about them at all to use them, is key to leveling up as a programmer. This is not as easy as it sounds. It can be incredibly boring if you don't use a wide variety of practice tools and incredibly frustrating when it feels like you're not making progress or are even regressing. That's why plenty of programmers plateau a few years out of school. Practicing is hard, but necessary, and the programmers that practice regularly and effectively will reach a much higher level of programming competence. They will be the ones that can solve the hard problems in programming now and in the future. Are you leveling up?

Test in Moderation

Software testing is a huge topic, and intense debates are going on all the time about the best way to do it. Test-Driven Development (TDD) has built up a strong following over the past 15 years, and with good reason. TDD can have a lot of benefits if done well. It also has its downsides, and if taken to an extreme, TDD can cause more harm than good. Programmers are justified in pushing back on the more extreme forms of TDD, but we don't want to go to the other extreme, either. Testing is a necessary part of a high-quality development process that increases the odds of releasing highly functional software. We need to do testing in some form. The real questions are, how should we test and how much?

The intention here is not to focus solely on TDD, although that will be difficult because it's something most programmers are familiar with. What I really want to talk about is testing in general, and TDD is but one piece of the complete testing picture. The rest of the picture includes things like usability testing, code reviews, and quality assurance (QA) testing. Proponents of TDD, when at their most extreme, claim that TDD subsumes and encompasses all other types of testing. They take every possible type of testing, define and categorize them, and put them all under the banner of TDD. It may be a convenient title, but in practice most programmers think of TDD as focusing on unit and integration testing because those are the tests that are most easily automated.

When referring to TDD, the concept I'm thinking of is essentially the same as the one used by Steve Freeman and Nat Pryce in Growing Object-Oriented Software, Guided by Tests:
When we’re implementing a feature, we start by writing an acceptance test, which exercises the functionality we want to build. While it’s failing, an acceptance test demonstrates that the system does not yet implement that feature; when it passes, we’re done. When working on a feature, we use its acceptance test to guide us as to whether we actually need the code we’re about to write—we only write code that’s directly relevant. Underneath the acceptance test, we follow the unit level test/implement/refactor cycle to develop the feature[.]
I'm using 'integration' whereas they use 'acceptance,' but they're pretty interchangeable. It's clear from this explanation of the TDD process that acceptance tests do not include getting the user interface in front of actual users, getting the code in front of coworkers, and getting the product in front of physical, human testers. That's okay. The definition of TDD should be focused, but we shouldn't forget about the rest of the testing picture.
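
To make that double loop concrete, here's a toy sketch in plain Ruby with Minitest (the cart domain and all of the names are invented for illustration; no Rails involved). The outer, acceptance-style test describes the feature from the outside, while the inner, unit-level test is part of the test/implement/refactor cycle underneath:
require "minitest/autorun"

# The feature being grown: a cart that can total up its items.
class Cart
  def initialize
    @prices = []
  end

  def add(price)
    @prices << price
  end

  def total
    @prices.sum
  end
end

# Outer, acceptance-style test: written first, exercises the feature end to end.
class CheckoutAcceptanceTest < Minitest::Test
  def test_customer_sees_a_total_for_their_items
    cart = Cart.new
    cart.add(500)
    cart.add(250)
    assert_equal 750, cart.total
  end
end

# Inner, unit-level test: drives out the small pieces needed to make the outer test pass.
class CartTest < Minitest::Test
  def test_total_is_zero_when_empty
    assert_equal 0, Cart.new.total
  end
end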

In the Grand Scheme of Things


One great way to step back and see the bigger picture is to do a postmortem on a project (referred to as a retrospective in agile development). Despite the name, don't wait until the project is finished (or dead) to do a postmortem. You can learn a lot from looking back on a project after certain milestones are reached, like major releases, and many software projects are never truly finished anyway.

When I was working for an ASIC design company, we sometimes did a postmortem right before we would release the second revision of silicon for prototyping. This was a perfect time to take a look back at what happened during the previous cycle of silicon evaluation because we were about to send off the design for fabrication again. It would be 6-8 weeks before we'd see the results, and it would be very unlikely that we could fix any mistakes we found after pulling the trigger.

I remember one postmortem in particular that was for a project with a high profile customer. We were aiming to complete the design in two and a half spins—that means the third revision would only change the metal interconnect layers of the design—and as always, we were schedule constrained to get it right. Before releasing the design for the second spin we took a look back at all of the bugs that we discovered during the first silicon evaluation. We had found 35 bugs (pretty respectable for first silicon), and they had varying degrees of severity and scope.

While we didn't discover anything that caused us to delay the next release, we did learn some interesting things about the source of the bugs. Part of the postmortem exercise involved figuring out how we could prevent those bugs that we found in the future. I can't remember the exact numbers, but we found that a significant number of those bugs, like more than half, could not have been prevented with better automated testing and simulations.

Some bugs we definitely should have caught. They were obvious after we had gotten silicon back, the fixes were obvious, and the simulations required to catch the bugs were obviously missing. It's your standard face-palm kind of stuff.

Other bugs had a root cause that didn't originate in the implementation, but in the specification. These were the inevitable misunderstandings of how the product should work, either on our part, the customer's part, or the communication between us. We continually strove to reduce these kinds of mistakes, but there's always room for improvement when it comes to product specifications and customer communication.

Then there were bugs that we could not have predicted would be an issue until we had physical silicon in our hands to test. Similar to most software development, we were doing something that had never been done before, at least by the collection of engineers participating in this project. We knew going in that there would be issues with the design, but we couldn't know what all of them would be. We did try to mitigate the risk by making the design as flexible as possible in the areas where we thought there would be trouble, and we planned for the inevitable bugs with the two and a half spin schedule. It's actually a tribute to the design team that there were so few bugs in first silicon and most of them were of this unpredictable variety.

We ended up shipping revision B2 as planned, and it was one of the most well-executed projects I've had the pleasure of working on. However, the take-away for this discussion is that in spite of our quite rigorous automated testing, we didn't catch every bug that we needed to, nor did we have any hope of doing so even if we devoted ten times the effort to testing simulations.

Imperfections in Testing


The equivalent automated testing in the software world is integration and unit testing, whether it's TDD or not, and Steve McConnell has some interesting statistics on the effectiveness of different types of testing in Code Complete 2:
…software testing has limited effectiveness when used alone—the average defect detection rate is only about 30 percent for unit testing, 35 percent for integration testing, and 35 percent for low-volume beta testing. In contrast, the average effectiveness of design and code inspections are 55 and 60 percent.
The defect detection rates for unit and integration testing are amazingly low. Similar rates are probably applicable for ASIC simulation testing, and that is why we would have had to put in so much more effort to catch the obvious bugs that got through. During development those bugs are not obvious, and you have to write a ton of extra tests and cover cases that aren't broken to find the few remaining cases that are hiding the bugs. Other types of testing can make those bugs more obvious.

One argument against TDD is based on this issue of missing obvious bugs. The argument goes something like this. If you write the minimum amount of test code to produce a failing test and then write the minimum amount of production code to make the failing test pass, what you’re actually doing is minimizing the amount of production code you write. In the end, you’ll probably write much less error checking code, and you’ll forget to write error checking code for certain conditions or in certain circumstances, letting bugs get through the development process.

I don't think this argument only applies to TDD. You have to think of the error condition where a bug might exist whether or not you're practicing TDD. The TDD approach just changes the way you handle error checking during development. If you think of a possible error condition in your code, you write a test for it first to make the failure a reality, and then write code to fix it. Humans are notoriously bad at predicting failure, so it may be better to make sure the bug is real before trying to protect against it. TDD doesn't relieve developers of the need to be diligent about error checking, although it may help eliminate unnecessary error checking code.
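
Here's a minimal sketch of that flow using Minitest and an invented parse_age helper: the test for the bad input is written first and watched fail, and only then is the error handling added to make it pass:
require "minitest/autorun"

# The rescue clause was added only after test_rejects_garbage_input below failed.
def parse_age(input)
  Integer(input)
rescue ArgumentError, TypeError
  raise ArgumentError, "age must be a whole number, got #{input.inspect}"
end

class ParseAgeTest < Minitest::Test
  def test_parses_a_numeric_string
    assert_equal 42, parse_age("42")
  end

  def test_rejects_garbage_input
    error = assert_raises(ArgumentError) { parse_age("forty-two") }
    assert_match(/whole number/, error.message)
  end
end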

Another argument against TDD came up in a discussion on Stack Overflow podcast #38. (I know it's old, but history can still be relevant.) Joel and Jeff talk about unit testing, TDD, and the SOLID principles. Joel questions whether TDD and SOLID are agile or not. It's an interesting thought, and I can see how he would come to that conclusion if these practices were taken to their extreme: 100% unit testing, TDD for every single line of production code, and all the SOLID principles applied to every class in the system. Such a code base would be an incredible, unmanageable mess. In particular, unit testing every single little thing about every class while splitting classes into thousands of tiny single-function elements would make the entire design rigid and fragile to change. I shudder at the thought of it.

While it is useful to have an automated test to check code that you've written so that you don't have to keep checking it manually all the time, that does not always trump other factors. Sometimes it is a piece of code that will never break again, and writing the test is way more complicated than the actual code. In some cases it makes sense to not write the test and instead write the code, check it manually, and be done with it. As a general rule, if a test is hard to write—much harder than writing the actual code—think about whether or not to actually write the test. It would have to provide some benefit, like if the code is especially risky, before it's valuable to write and verify a complicated test.

I like the balance that Michael Hartl strikes in his Ruby on Rails Tutorial. Sometimes he uses TDD, sometimes he writes integration tests or unit tests after coding the functionality, and his assertions have a light touch that leaves enough room for the code to be flexible.

Expanding the Testing Repertoire


When writing tests, it's best to KISS. If you get too wrapped up in testing, it's easy to go overboard, and that can waste a lot of time. There is some intersection of tests that catch bugs and code that is the source of bugs (or code changes that create bugs). That sounds like a Venn diagram to me.

 Venn diagram of bug catching tests vs buggy code

If you write tests for 100% code coverage, you're creating a huge number of tests and spending a correspondingly huge amount of time writing them, even though some percentage of those tests will never fail and are thus useless.

Another way to think about this problem is to consider how many bugs are caught as you ramp up the amount of time you spend using a given testing style. For unit testing, as you write more tests it gets harder and harder to find more bugs, and tests start to overlap in the bugs that they catch. This is why 100% code coverage doesn't mean 100% of bugs caught; it only means about 30% of bugs caught. Over time, testing efficiency tends to look like this:

Estimated plot of bugs caught vs. testing effort
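
As a rough illustration of that saturating shape, here's a tiny Ruby calculation using a simple diminishing-returns model of my own (the numbers are assumptions, not measurements): each unit of effort catches a fixed fraction of the bugs the testing style can still reach, so the curve levels off at that style's ceiling no matter how much more effort you pour in:
ceiling    = 0.30  # assumed ceiling: the fraction of all bugs this style can ever catch
catch_rate = 0.25  # assumed fraction of the remaining reachable bugs caught per unit of effort

(0..12).each do |effort|
  caught = ceiling * (1 - (1 - catch_rate)**effort)
  puts format("effort %2d -> %4.1f%% of all bugs caught", effort, caught * 100)
end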

At some point, putting in more effort doesn't give you any more progress. This trend is true of all types of testing. Whether it's unit testing, usability testing, code reviews, or QA testing, testing efficiency is going to saturate at some level below 100% bug-free because different testing styles will catch different classes of bugs. Instead of taking one testing style to the extreme, it's better to mix different testing styles together so that you're hitting your software from multiple angles. If you have a fixed amount of effort that you can dedicate to testing, different mixes of testing styles will result in varying degrees of quality in the final product.

Chart of testing efficiency for fixed effort

This chart represents what the space of testing efficiency might look like if you could spread out all variations in the mix of testing styles on one axis. Somewhere in that curve would be a point where you did 30% automated testing, 20% code reviews, 10% usability testing, 20% QA testing, and 20% other miscellaneous testing. Maybe 100% automated testing is near that valley on the left. It's really representing a multi-dimensional space on a single axis, and it's purposefully vague because the maximum testing efficiency is going to heavily depend on everything—the product, the problem domain, the experience and skill of the team, the schedule, the customer, and the list goes on. Experience is probably the best remedy for this dilemma because with experience you'll learn what a good mix of testing is for the contexts that you generally develop in.

One thing's for sure. It's not worth hammering on one testing style to the detriment of all others. That's a terrible waste of time. It's better to back off the 100% code coverage goal—trust me, it's a mirage because you're still far from 100% input coverage—and look for other ways to test more effectively. Automated tests are great, but they should be used judiciously where they give you the most benefit by making sure your stuff isn't fundamentally broken. Then, spending some time getting your product in front of real users or having another developer look over your code and walking them through it will provide much more value for your testing time than writing another test that verifies that, yes, your getters and setters do indeed work. Like all things, testing is best done in moderation.

A Microsecond in the Life of a Line of Code

We sit atop a large technology stack. Every layer in that stack is an abstraction of the layer below it that hides details and makes it easier to do more work with less effort. We have built up a great edifice of abstractions, and most people understand and work within a couple levels of this technology stack at most. No matter where you work in the stack, it's worthwhile to know something about what's going on below you.

Last week I talked about the importance of knowing the basics. Part of knowing the basics can be thought of as learning more about some of the layers of abstraction below your normal layer of expertise. I certainly cannot do justice to every layer of the computing tech stack, nor can I cover any layer in much detail, but it should still be fun to take a journey through the stack, stopping at each layer to see what goes on there.

We'll follow one line of code from a Ruby on Rails application and see how incredibly tall the tech stack has become. I only picked Rails as a starting point because it's one of the higher-level frameworks in a high-level language, and I happen to know it. Other choices exist, but it gives us something to focus on instead of talking in generalities all the way down. I'm also focusing on one line of code instead of following an operation like an HTTP request because it's fascinating that all of the actions on each of these layers are happening simultaneously, whereas an HTTP request has more sequential components to it. Maybe I can cover an HTTP request another time.

Let's get started on this incredible microsecond in the life of a line of code. With every level that we descend, things are getting more physical, and keep in mind all of these actions are happening at the same time on the same piece of silicon. Let's start peeling back the layers.

Ruby on Rails Application

Here we are on the top of the tech stack. Things are pretty cushy up here, and we can express an awful lot of work with only a few lines of code. For our example, we'll use this line of code:
respond_with @user
This piece of code sits in a controller that handles an HTTP POST request, takes the appropriate actions for the request, and renders a response to send back to the web browser. Because we're at such a high level of abstraction, this one line of code does a tremendous amount of work. To see what it does, we'll have to peel away a layer and look in the Rails framework.
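
For context, here's roughly what the surrounding controller might look like. The names are invented for illustration, and the class-level respond_to declaration matters because of a check we'll see in a moment (in later versions of Rails, respond_with was extracted into the separate responders gem):
class UsersController < ApplicationController
  respond_to :html, :json

  # Handles POST /users
  def create
    @user = User.new(user_params)
    @user.save
    respond_with @user  # renders a template, redirects, or returns JSON depending on the format
  end

  private

  def user_params
    params.require(:user).permit(:name, :email)
  end
end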

Ruby on Rails Framework

The respond_with method call is a call into the Ruby on Rails framework, and here's what the source code looks like:
# File actionpack/lib/action_controller/metal/
# mime_responds.rb, line 390
    def respond_with(*resources, &block)
      if self.class.mimes_for_respond_to.empty?
        raise "In order to use respond_with, first you " + 
              "need to declare the formats your controller " + 
              "responds to in the class level."
      end

      if collector = retrieve_collector_from_mimes(&block)
        options = resources.size == 1 ? {} : 
                    resources.extract_options!
        options = options.clone
        options[:default_response] = collector.response
        (options.delete(:responder) || self.class.responder).
          call(self, resources, options)
      end
    end
First, the method checks that it knows which mime types to generate the response for, and if none have been declared, it raises an error. Next, it calls another method deeper inside the Rails framework that executes the block of code provided with the call to respond_with, if there is one, and returns an object that knows the response type. Finally, a call is made to render the response from a template in the Rails application.

I'm glossing over a lot of stuff here, so if this explanation doesn't make sense, don't worry about it. I'm not trying to analyze Rails in depth, so we can just enjoy the view as we pass by. Many more method calls are happening inside the methods called from this method, and it's all thin layers of abstraction contained within Rails and the other gems that it depends on. Gems are collections of Ruby code that are packaged up into convenient chunks of functionality that can be included in other Ruby programs, and they could be considered a layer in and of themselves. We'll group all of those thin layers into one for the purposes of this discussion, otherwise this post will go on forever. The important thing to keep in mind is that a ton of stuff is going on at each level, and the sheer number of moving parts that are flying around as we continue down is astonishing.

Let's focus in on one line of code and move down to the next layer.

Ruby

This line looks fairly simple:
options = resources.size == 1 ? {} : resources.extract_options!
Ruby on Rails is written in a programming language called Ruby. (Bet you didn't already know that from the name, amiright?) The line of code we're looking at is a line of Ruby code. What does it do? It assigns a local variable called options one of two values depending on the size of resources. If the size is one, then options is an empty hash table, otherwise it gets assigned to the options that can be found in resources.

Ruby is made up of a set of thin layers of abstraction much like Rails, but with Ruby the layers are made up of the standard library and the core language features. The call to resources.size is actually a call to a standard library method for the Array class that is referred to as Array#size. Normally a language and its standard library go together, so we'll consider them one layer in the stack.
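
As a quick aside, extract_options! in that line isn't core Ruby at all; it's an ActiveSupport extension to Array that peels a trailing options hash off an argument list. A rough illustration, with made-up arguments (this needs the activesupport gem installed):
require "active_support/core_ext/array/extract_options"

resources = [:admin, :user, { location: "/users/1" }]
options = resources.size == 1 ? {} : resources.extract_options!

p options    # the trailing hash: { location: "/users/1" }
p resources  # now just [:admin, :user]; the hash has been removed

single = [:user]
options_for_single = single.size == 1 ? {} : single.extract_options!
p options_for_single  # the single-resource branch: an empty hash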

Ruby is an interpreted language, which means there is another program running—called the interpreter—that reads every line of code in a Ruby program file and figures out what it means and what it needs to do during run time. An interpreter can be implemented in a few different ways. In particular, Ruby has interpreters written in C (the MRI interpreter), Java (JRuby for the JVM), and Ruby (Rubinius—how's that for recursion). To make things a bit more interesting, let's look at the JRuby implementation.

JRuby Interpreter

The JRuby interpreter reads files of Ruby code, parses them, and executes instructions to do whatever the Ruby code says should be done. Since JRuby is written mostly in Java, it's compiled Java code that's doing the work of figuring out array sizes, deciding which values to use, and assigning hash tables to local variables in our line of code above. An interpreter has a ton of decisions to make, and a lot of code is being executed to make those higher-level Ruby statements happen.

What is essentially happening in the interpreter is that code in one language is getting translated into code in another language using a third language. In this case, Ruby code is translated into Java bytecode using Java. Java bytecode is compiled Java code that is similar to an assembly language. If we assume with our example line of Ruby code that the conditional assignment operator is implemented in the JRuby interpreter as a Java method with references for the empty hash and the options hash, then the method might look like this:
public Object conditionalAssign(int lhsCompare,
                                int rhsCompare,
                                Object trueObj,
                                Object falseObj) {
  if (lhsCompare == rhsCompare) {
    return trueObj;
  } else {
    return falseObj;
  }
}
This Java method returns a reference to an object dependent on the outcome of the equality comparison. That's what we want for the Ruby conditional assignment operator, but the interpreter wouldn't be outputting Java code; it would be outputting Java bytecode. The equivalent Java bytecode for the conditionalAssign method is this:
0: iload_1
1: iload_2
2: if_icmpne     7
5: aload_3
6: areturn
7: aload_4
8: areturn
This Java bytecode runs on a virtual machine, which brings us to the next layer in our tech stack.

JVM

The JVM is a virtual machine that models a physical processor in software so it can take in instructions represented as Java bytecode and output the right assembly code for the processor it's running on. The original idea with the JVM was that a software program could be compiled once for the JVM, and then it could run on any hardware that had the JVM implemented on it. This idea of write once, run anywhere didn't quite turn out as the JVM proponents hoped, but the JVM still has a lot of value because many languages can run on it and it runs on many hardware platforms.

Much like the JRuby interpreter, the JVM is doing a translation, this time from bytecode to assembly code, and the JVM is most likely written in yet another language—C. If the JVM happened to be running on an x64 processor, it might emit assembly code that looks like this:
 .globl conditionalAssign
 .type conditionalAssign, @function
conditionalAssign:
.LFB0:
 .cfi_startproc
 pushq %rbp
 .cfi_def_cfa_offset 16
 .cfi_offset 6, -16
 movq %rsp, %rbp
 .cfi_def_cfa_register 6
 movl %edi, -4(%rbp)
 movl %esi, -8(%rbp)
 movq %rdx, -16(%rbp)
 movq %rcx, -24(%rbp)
 movl -4(%rbp), %eax
 cmpl -8(%rbp), %eax
 jne .L2
 movq -16(%rbp), %rax
 jmp .L3
.L2:
 movq -24(%rbp), %rax
.L3:
 popq %rbp
 .cfi_def_cfa 7, 8
 ret
 .cfi_endproc
I know, things are starting to get ugly, but we're definitely making our way deep into the stack now. This is the kind of code that the actual physical microprocessor understands, but before we get to the processor, the assembly code has to get from a hard disk onto that processor, and that requires memory.

The Memory Hierarchy

The memory hierarchy of a modern processor is deep and complicated. We'll assume there's enough memory so that the entire state of the program—all of its program code and data—can be held in it without paging anything to disk. That simplifies things a bit since we can ignore things like disk I/O, virtual memory, and the TLB (translation look-aside buffer). Those things are very important to modern processors, so remember that they exist, but we'll focus on the rest of the memory hierarchy.

Computer motherboard diagram

The main goal of memory is to get instructions and data to the processor as fast as possible to keep it well fed. If the processor doesn't have the next instruction or piece of data that it needs, it will stall, and that's wasted cycles. Let's focus on a single assembly instruction at this point, the jne .L2 instruction. This instruction is short-hand for jump-not-equal and the target is the .L2 label. We'll get into what it does in the next layer. Right now we only need to know that the instruction isn't actually represented as text in memory. It's represented as a string of 1s and 0s called bits, and depending on the processor, instructions could be 16, 32, or 64 bits in length. Some processors even have variable length instructions.

So the processor needs this jne instruction, which has been loaded into memory. Normally it starts out in main program memory (DDR in the diagram above), which is very large but much slower than the processor. It then makes its way down the hierarchy to the L3, L2, and L1 caches on the processor. Each cache level is faster and smaller than the one above it, and depending on the processor, there may be fewer levels of caching. Each cache has a policy to decide whether to keep or remove instructions and data, and it has to keep track of whether cache lines have been written to or are stale. It all gets extremely complicated, but the basic idea is that the lowest level of cache should hold the instructions and data that are most often or most recently used. Ideally the L1 cache will always have the next instruction that the processor needs because it normally runs at or very near the processor's clock speed.

The Microprocessor

Once this jne instruction reaches the processor, it needs to be executed. For that to happen, the processor needs to know what the instruction is and where its inputs and outputs are. The instruction is decoded to figure this out. The processor looks at the series of 1s and 0s of the jne instruction and decides that it needs to look at some flags that were set by the previous instruction, and if the equal flag is set to 0, it will jump to the target address specified by the label .L2.
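
As a toy model of what executing jne means (hugely simplified, with made-up addresses and the processor flags reduced to a Ruby hash): the earlier cmpl sets a zero flag, and jne only redirects the instruction pointer when that flag is clear:
flags = { zero: false }  # the earlier cmpl found the two values were not equal
ip         = 0x400a10    # hypothetical address of the jne instruction
jne_target = 0x400a2c    # hypothetical address of the .L2 label
jne_length = 2           # assume a short, two-byte encoding of jne

# Fall through to the next instruction if equal, jump to .L2 if not.
ip = flags[:zero] ? ip + jne_length : jne_target
puts format("next instruction fetched from 0x%x", ip)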

Not all instructions are decoded into a single operation. Some instructions do more work than is reasonable to do all at once. These instructions are broken up into more bite-sized chunks of work, called microcode, before they are executed. Then all of these instructions are fed to the next layer of abstraction.

The Processor Pipeline

Instructions are no longer executed on a processor in a single clock cycle unless we're talking about a very simple processor, like a 16-bit microcontroller. Instead, some small amount of work is done for each instruction on each clock cycle. Fetching operands, executing operations, writing results, and even the initial decode steps are part of the pipeline. Breaking the work up into these pipeline stages allows the clock speed to be faster because it's no longer limited by the longest instruction, but by the longest pipeline stage. Naturally, hardware architects try to keep pipeline stages well balanced.
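
Here's the rough arithmetic behind that claim, with invented stage delays: without pipelining, the clock period has to cover the whole instruction, while with pipelining it only has to cover the slowest stage:
stage_delays_ns = { fetch: 0.3, decode: 0.25, execute: 0.4, memory: 0.35, writeback: 0.2 }

unpipelined_period = stage_delays_ns.values.sum  # about 1.5 ns per instruction
pipelined_period   = stage_delays_ns.values.max  # 0.4 ns, set by the slowest stage

puts "unpipelined clock: #{(1 / unpipelined_period).round(2)} GHz"  # ~0.67 GHz
puts "pipelined clock:   #{(1 / pipelined_period).round(2)} GHz"    # 2.5 GHz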

Intel Nehalem Microarchitecture
"Intel Nehalem arch" by Appaloosa - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Intel_Nehalem_arch.svg#mediaviewer/File:Intel_Nehalem_arch.svg

Modern processors also have stages that figure out if multiple instructions can be executed at once and feed them to multiple execution units. These dispatch stages can even decide to execute later instructions ahead of earlier ones if their dependencies are satisfied. In today's processors, hundreds of instructions can literally be in flight at once, starting and finishing out of order. If you thought you knew what your code micro-optimizations were doing for performance, you might want to reconsider that position. The processor is way ahead of you.

Our original Rails code is completely unrecognizable at this point, but we're not done, yet.

The Pipeline Stage

A pipeline stage consists of some combinational logic that runs between clock cycles, and a set of memory elements, called flip-flops, that store the results of that logic for the next pipeline stage. The results are latched into the flip-flops on every clock edge.

Combinational logic can do all kinds of operations, including logical operations, arithmetic, and shifting. This logic is normally described using an HDL (Hardware Description Language) like Verilog, which resembles other programming languages but adds a concept of a clock and statements executing in parallel between clock edges. A ton of stuff is happening at once in Verilog simulations because all combinational logic in the processor evaluates its inputs as soon as they change.

MIPS pipeline stages

But the processor isn't executing Verilog. The Verilog is synthesized into digital logic gates, and those are what make up combinational logic and our next layer of abstraction.

Digital Logic Gates

The basic digital logic gates are NAND, NOR, and NOT. NAND stands for NOT-AND and NOR stands for NOT-OR. Why aren't AND and OR gates fundamental? It turns out that transistor circuits that implement logic gates inherently invert their outputs from 0 to 1 and 1 to 0, so AND and OR gates require an extra NOT gate on the output to invert the result again.

Digital logic gates and truth tables

Many other logic gates can be built up from these three basic gates, and a digital synthesis library—used by the synthesizer to select gates to build up the pipeline stages—can consist of hundreds or even thousands of variations of logic gates. All of these gates need to be connected together to make a processor. Sometimes it's done by hand for regular structures like memory or ALUs (Arithmetic Logic Units), and for other blocks it's done with an automated place-and-route tool. Suffice it to say, this stuff gets complicated in a hurry.
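
As a small illustration of building other gates out of those basics, here's a toy model in Ruby (just lambdas on 1s and 0s, nothing to do with real synthesis libraries): AND and OR fall out of NAND and NOR plus the extra inversion mentioned above, and even XOR can be built from NANDs alone:
NOT  = ->(a)    { a == 1 ? 0 : 1 }
NAND = ->(a, b) { (a == 1 && b == 1) ? 0 : 1 }  # inverts naturally, like the transistor circuit
NOR  = ->(a, b) { (a == 1 || b == 1) ? 0 : 1 }

AND = ->(a, b) { NOT.(NAND.(a, b)) }  # NAND plus an inverter on the output
OR  = ->(a, b) { NOT.(NOR.(a, b)) }   # NOR plus an inverter on the output
XOR = ->(a, b) { NAND.(NAND.(a, NAND.(a, b)), NAND.(b, NAND.(a, b))) }  # four NANDs

[[0, 0], [0, 1], [1, 0], [1, 1]].each do |a, b|
  puts "a=#{a} b=#{b}  NAND=#{NAND.(a, b)}  AND=#{AND.(a, b)}  OR=#{OR.(a, b)}  XOR=#{XOR.(a, b)}"
end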

We're now ready to leave the land of ones and zeros because we've finally made it to the transistor.

The Transistor

Digital logic gates are normally made up of CMOS (Complementary Metal-Oxide Semiconductor) transistors. These transistors come in two varieties—NMOS and PMOS. They both have three terminals: the source, drain, and gate. A fourth connection always exists below transistors, and that is referred to as the bulk connection. If the gate voltage is greater than or equal to the source voltage for a PMOS or less than or equal to the source voltage for an NMOS, the transistor turns off. If the gate voltage goes in the opposite direction, the transistor turns on.

CMOS NAND gate

This schematic shows a NAND gate that is made up of four transistors. Two PMOS have their sources connected to Vdd (the digital supply voltage), and two NMOS are stacked together with the bottom one's source connected to ground. When an input voltage is at Vdd, that corresponds to a 1, and an input voltage at ground corresponds to a 0. In the case of the NAND gate, if either input is at ground, one of the NMOS will turn off and one of the PMOS will turn on, pulling the output up to Vdd. If both inputs are at Vdd, both NMOS turn on and both PMOS turn off, pulling the output down to ground. This behavior exactly matches what a NAND gate should do.

To understand how a transistor works, we need to descend another level.

Semiconductor Physics

A transistor is made up of a silicon substrate, a polysilicon gate sitting on top of a silicon oxide insulating layer, and two regions on either side of the gate that are doped (enriched) with ions that make a P-N junction. The doped regions are the source and drain of the transistor.

A P-N junction makes a diode where electrons will flow from the N-type side to the P-type side, but not the other way around. A PMOS needs to sit inside an extra well that is N-type so that the source and drain aren't shorted to the substrate and each other.

CMOS transistor cross sections

Considering just the NMOS, when the gate voltage is less than or equal to the source voltage (normally at ground), the source and drain are not connected and electrons can't flow from the source to the drain through the substrate. When the gate voltage rises enough above the source voltage, a channel forms between the source and drain, allowing electrons to flow and pulling down the drain voltage. The behavior of the PMOS is similar with the voltages reversed.

The Tower of Knowledge

There you have it. We've reached the bottom of the tech stack. This huge pile of technology and abstractions is sitting atop some basic materials and physics. When building an application in a high-level framework it is truly astounding how much other technology we depend on. The original respond_with method call in a Rails application is the tip of a very large iceberg.

Every layer in this stack is much more complex than I've described. You can read books, take courses, and spend years learning about the details of each layer and how to design within it. If you spend all of your time in one layer, it would probably be worthwhile to take a look at what's happening at least one or two layers below you to get a better understanding of how the things you depend on actually work. It's one way of learning the basics.

It may also be worth learning about the layer above where you work so that you can better appreciate the issues that designers in that layer have to deal with all the time. The better these layers of technology can interact and the better the abstractions become, the more progress we'll make designing new technologies. Knowledge is power, and there's an awful lot of knowledge to be had.