Saturday, January 28, 2017

Can you pair program at a company where pair programming isn't done?

I've been hooked on pair programming from the time I first tried it. I love pairing as a way to transfer knowledge (either about technology or about our product), build motivation, and build teams. Software development is a long series of decisions both large and small, many of which could plausibly go another way. When I'm soloing it is so much easier for me to get stuck on any one of them.

If you are in a company where pairing is the norm, you'll do it, but what if people are just curious about pairing, or willing to try but don't know much about it? Here's what has worked for me. First of all, I invite people to pair for 1.5-hour blocks, usually scheduled on our calendars. Shorter can work, but going longer (a) requires a break in the middle, and (b) requires more buy-in from my pair than I sometimes have. Secondly, when I'm asking a co-worker to pair, I ask them to pair on a specific task which I am up to speed on (for example, one which has been assigned to me). Ideally, the task also requires knowledge they have that I don't (familiarity with a particular part of the codebase, for example). During the pairing I apply pairing skills I've learned over the years (for example, handing the keyboard to a bored pair or saying "let's give it a try and see what happens" rather than "that won't work"). I wrap up by the end of the scheduled time (continuing after a break if both people are psyched is an option, but usually 1.5 hours is quite enough for people who aren't in the pairing habit). As we wrap up, I make sure to thank them and tell them how helpful it was (this is usually quite sincere - I did mention that I go faster when pairing than soloing, didn't I?). If the task isn't done, I usually finish it up soloing (especially if the remaining items are fairly straightforward once pairing has settled some of the bigger decisions).

Afterwards, I tell others, for example in a retrospective or a 1:1 with my manager, how much I enjoyed pairing and/or concrete benefits like "we were able to work out the interface between these two components much more easily than if I had been soloing on one side and you had been soloing on the other". The goal here is not to tell people they have to pair; the goal is to make it feel like they are missing out on something great if they don't.

Pairing got one of my teams out of a sticky trap. There was a section of the code which only one person understood. We saw this was a problem and the person who knew the code wanted to share his knowledge. For our first attempt, he explained it in a conference room with a whiteboard and a projector. Perhaps that helped a bit, but the explanation didn't make as much sense to the audience as to the presenter and we adjourned with confusion and frustration, or at least with limited comprehension. Later I had reason to do something to that code, and so I asked the expert whether he would pair on it. Mechanically, it was miserable. We didn't share a fluent spoken language and he used a customized setup (using vi and virtual machines) which meant that I mostly watched him type or told him what to type. A far cry from the easy flow between two people which sold me on pairing in the first place! But guess what? I learned a whole lot more about that code than I did from sitting in a conference room. Other people started working on that code and the person who had been the expert could get help and feel less alone. Here I started to formulate my belief that even a very small amount of pairing is better than none at all.

In my other example the surprise was even more pleasant, and delayed. We were in a company where pairing was often mentioned, sometimes practiced (at least in some teams or situations), but was certainly optional and not part of most people's habits on a regular basis. One of the people on my team was nice but also seemed like a loner: often wearing headphones, not speaking up much in meetings, and getting a lot done but in a heads-down kind of way. Not my first choice for someone to ask to pair. But in a few cases, I carefully came up with a focused, suitable task and asked him to pair. We paired maybe half a dozen times (if that) over a one-year period. It energized me and I appreciated his willingness to put up with my eccentric desire to pair. Fast forward a year or so: we now work for different companies, and he tells me that pairing with me was one of the highlights of his entire two years at the company! I was floored. I knew I enjoyed working with him in general but I was completely unaware of what he was getting out of pairing.

Do I recommend being a pairing pioneer? Well, it isn't always easy and to be perfectly honest, my current job search is for a situation where pairing is already more established and common. But if you like pairing and find yourself in a non-pairing or low-pairing situation? Sure, give it a shot. As long as people approach it with an open mind (on both sides), the only thing you are risking is 1.5 hours of your time.

Saturday, November 23, 2013

Securing package distribution with TUF

Suppose you are downloading a new fun game for your computer and you want to know whether it is going to do what it claims (clicking on cows, let's say) or whether it is going to send all your data (credit card numbers you type, let's say) to a shadowy cabal in Martha's Vineyard or Napa Valley or wherever shadowy cabals are found these days. For the sake of argument, let's say that you have heard good things about a (hypothetical) open source project called FreeCowClicker2 written by Ilia Bogomips and you want to try it out.

Well, in some cases the authors of FreeCowClicker2 might run a download site and you might get it there, but most of the time you'll probably be getting it from a package repository, such as a linux distribution, a programming-language-specific repository such as CPAN (perl), PyPI (python), rubygems (ruby), or something similar. How do I know I'm getting the package I want, if (a) I am connecting to a potentially dodgy WiFi access point or there is some other way in which the shadowy cabal has gotten into my network, or (b) one of the servers involved in serving up the files, or mirroring them, is under the control of our shadowy cabal?

If you are a little bit familiar with this stuff, you are probably saying "signed packages", as found in, for example, Fedora or Debian. And that indeed is what I'm getting at, specifically TUF (The Update Framework). TUF aims to be usable by any package repository, but most of the effort to date has gone into using it for PyPI.

As part of Square's hack week, which just concluded, a number of us looked into using TUF with rubygems, and wrote some code to that end. Hopefully that code helps clarify what this is all about, and there is a fair bit of documentation on the TUF site, so I'll just mention a few of the high points and interesting details:

  • TUF can upgrade its keys. Your package installer might find there are new keys (signed by the old keys) and switch to them.
  • There are multiple keys for different things. Anyone who contributes a package can have a key which is just good for signing that one package, there are separate keys which are used to say what packages were released as of a given time, and there are keys which are just used to sign the frequently used keys. Some of the keys can be kept offline, and only used maybe a few times a year.
  • TUF is fairly easy to work with. The public keys and signatures and such are kept in JSON, which means you can parse them with, say, ruby or jq.
  • One little example of something they thought of: when you get a reference to another TUF file you also get a signed length. That way an attacker can't substitute a multi-gigabyte file and cause a denial of service by tying up your network or computer. You just need to download as many bytes as the signature told you to, and can quit after that.
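To make the last two points a bit more concrete, here is a minimal sketch in ruby of the "signed length" idea. It is not the code from the hack week and it is not the TUF file format: the field names ("length", "hashes", "sha256") and the helper names (bounded_read, fetch_target) are assumptions made up for illustration, and signature checking of the metadata itself is left out entirely. The point is just that the client reads no more bytes than the metadata promised and then checks the hash of what it got.

```ruby
require 'json'
require 'digest'
require 'stringio'

class LengthMismatch < StandardError; end
class HashMismatch < StandardError; end

# Read exactly `expected_length` bytes from `io`, in chunks. We never ask
# the stream for more than the metadata promised, so an oversized file
# cannot tie up the network or fill the disk.
def bounded_read(io, expected_length, chunk_size: 4096)
  data = String.new(encoding: Encoding::BINARY)
  while data.bytesize < expected_length
    chunk = io.read([chunk_size, expected_length - data.bytesize].min)
    break if chunk.nil? # stream ended early
    data << chunk
  end
  if data.bytesize != expected_length
    raise LengthMismatch, "expected #{expected_length} bytes, got #{data.bytesize}"
  end
  data
end

# `entry` is a parsed JSON hash describing one file. In a real client this
# entry would come from metadata whose signature has already been verified
# (not shown here).
def fetch_target(io, entry)
  data = bounded_read(io, entry.fetch('length'))
  actual = Digest::SHA256.hexdigest(data)
  raise HashMismatch, 'content does not match metadata' unless actual == entry.fetch('hashes').fetch('sha256')
  data
end

# Hypothetical metadata for a hypothetical package, round-tripped through
# JSON to mimic reading it from a metadata file.
payload = 'moo' * 100
entry = JSON.parse(JSON.generate(
  'length' => payload.bytesize,
  'hashes' => { 'sha256' => Digest::SHA256.hexdigest(payload) }
))

fetch_target(StringIO.new(payload), entry) # returns the payload
```

If an attacker swaps in a much larger file, fetch_target stops after the promised number of bytes and the hash check then rejects what was read.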

There's plenty left to do to finish the job of getting rubygems to use TUF, so go look at the pull request and start pitching in if you feel so inclined. Based on a week of looking at it, TUF does look like a solid basis for a more secure rubygems which preserves all the cool things about rubygems like letting people release gems often, letting anyone author a gem, etc. Likewise, it also seems promising for other package repositories.

Friday, July 20, 2012

Parsing large XML files

Every once in a while, we need to parse large XML files. Here "large" means that the file won't fit in memory, so we can't just suck it in using nokogiri (or our favorite in-memory XML library). SAX is fine as a low-level parser to tell you where the tags start and end, but trying to do any significant processing will turn into spaghetti unless you have a bit of a framework.

The last time I visited this topic, I ended up writing a library, saxophone, which invoked callbacks when it encountered certain named tags. Saxophone is sitting in an obscure git repository; I could put it up as a gem if someone wants it; the big question is whether there is something better out there. The wasabi WSDL parser has been trying its own mini-framework (partially special purpose) described in this issue. But probably the best I've seen so far is sax-machine (specifically, the lazy option thereto). I haven't spent much time playing with it (at least not yet), but it seems like a better starting point than starting from scratch with a new gem.

If you do end up writing code directly on top of SAX, just remember this: keep a stack of the currently open tags, pushing on each start tag and popping on each end tag. Following this idiom might cut down on the buggy spaghetti that I've seen when I've tried to do without something like saxophone or sax-machine.

Update: I fixed the above link to the wasabi issue, which had changed. Not sure how long-lived any of these links are going to be, but here's another one: lib/wasabi/sax_parser.rb from the sax-parser branch. The key is the stack (pushed on start tag, popped on end tag) and the matchers.
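For what it's worth, here is a rough sketch of the stack-plus-matcher idiom. It is not saxophone, sax-machine, or the wasabi code; to keep it self-contained I've used the stream parser from REXML (which ships with ruby) rather than nokogiri's SAX interface, and the PathCollector class and the sample XML are made up for illustration. The listener pushes on every start tag, pops on every end tag, and treats "the top of the stack ends with these tag names" as its only matcher.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects the text of every element whose path
# (the stack of currently open tags) ends with the wanted tag names.
class PathCollector
  include REXML::StreamListener

  attr_reader :matches

  def initialize(*wanted_path)
    @wanted = wanted_path
    @stack = []    # names of the tags that are currently open
    @buffer = nil  # text collected for the element being matched, if any
    @matches = []
  end

  def tag_start(name, _attrs)
    @stack.push(name)
    @buffer = +'' if matching?
  end

  def text(chunk)
    @buffer << chunk if @buffer
  end

  def tag_end(_name)
    if @buffer && matching?
      @matches << @buffer.strip
      @buffer = nil
    end
    @stack.pop
  end

  private

  # The matcher: does the top of the stack look like the path we want?
  def matching?
    @stack.last(@wanted.size) == @wanted
  end
end

xml = <<~XML
  <library>
    <book><title>Dune</title></book>
    <title>Not a book</title>
    <book><title>Emma</title></book>
  </library>
XML

collector = PathCollector.new('book', 'title')
REXML::Document.parse_stream(xml, collector)
collector.matches
```

Because the listener only ever holds the stack and the one buffer it is filling, it never builds a tree of the whole document. parse_stream also accepts an IO (such as an open File) in place of the string, which is what you would pass for a file that is actually large.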