Nerdy Adventures of Wes: code

OK, this one's gonna be pretty nerdy, even for me. Yeah.

In the world of object-oriented software design, we talk about things we're modeling as falling into 1 of 2 categories: Entities and Value Objects. Entities have identity separate from their constituent attributes, while Value Objects do not. So I, for example, am an Entity, but the color magenta is a Value Object. Even if you list my attributes (first name: Wes, last name: Morgan, hair color: blond, eye color: hazel, height: 5'11"), that doesn't fully encompass who I am because there is probably at least one other guy out there matching that same description. Freaky Friday. But if you were trying to find me, that other guy just won't do. Magenta, however, is a color that can be represented by 3 red, green, and blue values, for example. Magenta's magenta. This is nice because you can create an immutable, throw-away magenta instance of a Color class to represent it in your code. Decide you like chartreuse better? Throw away your magenta instance and instantiate a new Color object for chartreuse. This has several handy side-effects for your code. Entities require much more care and feeding. I should know, I am one. And so is that jerk imposter whom I will crush.

Anyway! REST works very well on Entities. You have a URL like http://server.org/people/34 and you call various HTTP methods on person #34 (GET, PUT, POST, DELETE). How rude of you. But it all works. Wam, bam, thank ya ma'am.

But what about Value Objects? If you wanted to write an RGB to HTML hex value web service that converted colors from red, green, and blue values to their equivalent HTML hex value, how would make that RESTful? What's the resource? What operations do you map the HTTP verbs to? Here's a real-world example I ran into this week that highlights the conundrum and how I solved it:

I have a piece of software that converts ZIP codes into City, State pairs. This is how we in the U.S. of A. do postal codes. It can also take full street addresses and sanitize them and find out their 9-digit ZIP code. For those outside the U.S., the 9-digit ZIP code is extra special because it can tell you exactly where that address is geographically, but almost no one knows the final four digits of their code. So having software that can derive that code from their street address (which they do know) is super handy. My mission was to wrap this software in a Ruby on Rails layer to create RESTful web services out of these functions.

Step 1. ZIP -> City, State : ZIP codes are a great example of a Value Object. The ZIP code of my office in downtown Denver, CO is 80202. 80202 is the same as 80202, even though they're at different places on this page. You don't care which one I typed first, they're interchangeable. So it would be silly to create a RESTful web service that did something like this: GET /zips/89 which would return the ZIP code w/ id 89 and the associated city and state. Let's say it's 73120 in Oklahoma City, OK. Why assign ZIP codes an id number? Why not just do this: GET /zips/73120 and have that return Oklahoma City, OK? So that's what I did. The value *is* the id. That works great, moving on.

Step 2. Address -> Better Address (incl. the magical 9-digit ZIP code) : Now things get trickier. An address is still a value object. 123 Main St., Missoula, MT is the same as 123 Main St., Missoula, MT. We don't need to assign id's to these objects, they're just the sum of their parts. But unlike ZIP codes, more than one attribute is required before you can say you have an Address object. You need 1-2 lines of house number, street, apartment number, etc. information, then a city, state, and 5-9 digit ZIP code. That's a bit harder to put into a URL. Ideally we want to be able to label the different attributes so we can easily keep track of what's what. Something like this:
GET /addresses/123%20Main%20St.%20Missoula,%20MT%2059801 ...could work, but it's kind of a hassle because you then have to write a parsing routine on the server side. It would be nicer if you could just label the components of the address when you pass it in. Query parameters to the rescue! Here's what I did:
GET /addresses/1?addr1=123%20Main%20St.&addr2=&city=Missoula&state=MT&zip=59801
Now I can very easily pass this address into my sanitizer and get back the corrected version w/ the 9-digit ZIP. The "1" in the URL is a pseudo-id placeholder. If it bothers you, you could just as easily put "address" or "thisone" or "foo" there. But don't put "get" or "lookup" or any other verb there! That's not RESTful. Only the HTTP verb should define the action being taken on the resource.

So yeah, that's how I did it. I think it's still pretty RESTful. It only uses GET, but that's all you really need for read-only Value Objects. It would probably be better to do something like GET /zips/90210/city or GET /zips/80218/state, or maybe even GET /zips/73120/city_state to avoid the double call in the common case of wanting the city and state. I still don't really know of a cleaner way to do the address sanitizer piece. It works, but it feels a bit hackish.

I would like it if smarter types chimed in and critiqued my approach. Because this is all just, like, my opinion, man.

In my day job, we use an open-source content management system called WebGUI. They recently switched to git to manage their source code. "Great!" I thought, "Git's decentralized nature will make it super easy for me to maintain the local additions and modifications we've made to WebGUI over the years while still being able to update to new upstream releases as they come out."

Yeah, right.

I still haven't figured out how to do this correctly or easily (see update below). My first attempt was to branch at the v7.7.20 upstream release tag on the webgui-7.7 branch. My branch is called webgui-7.7-pin and I committed all my local additions and modifications to that branch. I then pushed it out to an EC2 instance so my WebGUI servers running in EC2 can pull their code from it when they launch. I committed a few more changes over time, and I even committed a couple of bug fixes upstream and to my local branch (because I want the bug fixes now, not in the next release). So far so good. But then 7.7.21 came out.

I tried doing (on the webgui-7.7-pin branch):

git rebase v7.7.21

That v7.7.21 tag name gets found by git on the webgui-7.7 branch and it rebases, or forward-ports, my local commits to the 7.7.21 tag. Git does this by rewinding all my commits, fast-forwarding from the v7.7.20 tag (the original branch point) and then re-applying my commits in order to that new tip. So I end up with exactly what I want, v7.7.21 but with my local changes applied on top of that. Then I tried to push it to my "cloud" remote (the EC2 instance serving my production servers). The push is rejected because it's not a fast-forward. As far as my cloud remote is concerned, my local copy of the webgui-7.7-pin branch is in chaos. It can't make heads or tails of my rebased code.

I haven't found much useful advice on the web. I'm really surprised this isn't a more common workflow pattern for folks using git. But maybe it is and I'm just going about it all wrong. Hopefully someone will smack me with a clue stick soon!

UPDATE: Many thanks to Haarg in #webgui for the clue stick beating! Turns out I was getting too big for my britches and trying to use rebase where a good ol' fashioned merge did the trick nicely. So here's the new workflow:

Upstream releases a new version, tags it v7.7.22 (for example)
I pull the latest changes into my local copy of the upstream branch:
git checkout webgui-7.7 git pull origin
I now checkout my local modifications branch:
git checkout webgui-7.7-pin
I then merge the new version tag into my local branch (git is smart enough to go find the "v7.7.22" tag on the webgui-7.7 branch):
git merge v7.7.22
Resolve any conflicts that created (it happens), commit them (not necessary if the merge created no conflicts), and then push to my cloud remote:
git push cloud
This time the push was a fast-forward merge and the cloud remote happily accepts it. Hooray!

Nerdy Adventures of Wes

Thursday, January 14, 2010

RESTful web services for value objects

Wednesday, September 30, 2009

(How not to) Use git to maintain local changes to an upstream codebase

Followers

Blog Archive

About Me