Ruby Sorting [2] – Common Mistakes When Sorting With Blocks

This sorting technique is one I’ve had a chance to use at work more lately. But what keeps tripping me up is using the block to sort primarily by one field, with a secondary sort on another field. Let’s say Fish has species and type, and we have these fish in our database:

species type
Platy Sunset
Platy Calico
Molly Dalmation
Platy Rainbow
Guppy Fancy Tail
Platy Mickey Mouse

When I sort first by species, then by type, I keep accidentally doing the following, which gives the wrong sorted results:

>> fishes = Fish.find(:all) 

>> fishes.sort do |a,b|
?>   a.species <=> b.species
>>   a.type <=> b.type
>> end

Doing that, I end up with a list like:

species type
Platy Calico
Molly Dalmation
Guppy Fancy Tail
Platy Mickey Mouse
Platy Rainbow
Platy Sunset

It ignores my first sort on species, and ends up sorting only by type!

Why? What’s wrong with that?  Well, the <=> comparison operator (also known informally as the “spaceship operator”) returns -1, 0 or 1, depending on whether the first value is less than, equal to, or greater than the other.  The block returns the value of the last statement evaluated.  So what happens is we compare species and throw that result away, then we compare type, and it is always the result of the type comparison (-1, 0 or 1) that is returned from the block.  The problem is, if the species are not equal, we want to stop there and return -1 or 1 accordingly, and not evaluate type at all.
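To convince myself of this, here is a small sketch that runs outside of Rails. The Struct-based Fish is a hypothetical stand-in for the real model (an assumption for illustration only). It shows that the broken block gives exactly the same order as sorting by type alone:

```ruby
# Hypothetical stand-in for the Fish model; only species and type matter here.
Fish = Struct.new(:species, :type)

fishes = [
  Fish.new("Platy", "Sunset"),
  Fish.new("Platy", "Calico"),
  Fish.new("Molly", "Dalmation"),
  Fish.new("Platy", "Rainbow"),
  Fish.new("Guppy", "Fancy Tail"),
  Fish.new("Platy", "Mickey Mouse")
]

# The species comparison is evaluated and thrown away;
# only the last expression becomes the block's return value.
buggy = fishes.sort do |a, b|
  a.species <=> b.species
  a.type <=> b.type
end

# Sorting on type alone, for comparison.
by_type_only = fishes.sort { |a, b| a.type <=> b.type }

puts buggy == by_type_only   # true
buggy.each { |fish| puts "#{fish.species} #{fish.type}" }
```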

A simple way to do this is to add “if result == 0” to the end of the type comparison, so that type is only evaluated if species was equal.

>> fishes = Fish.find(:all) 

>> fishes.sort do |a,b|
?>   result = a.species <=> b.species
>>   result = a.type <=> b.type if result == 0 
>>   result
>> end

This way, it will perform the first comparison by species, then only continue to the secondary comparison by type if the result of the first was zero, that is, the species were equal. And so I end up with a list like:

species type
Guppy Fancy Tail
Molly Dalmation
Platy Calico
Platy Mickey Mouse
Platy Rainbow
Platy Sunset
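
For what it’s worth, here is a runnable sketch of the fix next to two other ways of writing the same two-level sort. The Struct-based Fish is again a hypothetical stand-in for the real model, and the array-based variants are alternatives I’m adding for comparison, not something from my talk. They rely on arrays being compared element by element, so a tie on species falls through to type automatically:

```ruby
# Hypothetical stand-in for the Fish model.
Fish = Struct.new(:species, :type)

fishes = [
  Fish.new("Platy", "Sunset"),
  Fish.new("Platy", "Calico"),
  Fish.new("Molly", "Dalmation"),
  Fish.new("Platy", "Rainbow"),
  Fish.new("Guppy", "Fancy Tail"),
  Fish.new("Platy", "Mickey Mouse")
]

# The approach described above: only compare type on a species tie.
explicit = fishes.sort do |a, b|
  result = a.species <=> b.species
  result = a.type <=> b.type if result == 0
  result
end

# Alternative: compare [species, type] pairs; Array#<=> goes element by element.
with_arrays = fishes.sort { |a, b| [a.species, a.type] <=> [b.species, b.type] }

# Alternative: build the pair once per fish with sort_by.
with_sort_by = fishes.sort_by { |fish| [fish.species, fish.type] }

puts explicit == with_arrays && explicit == with_sort_by   # true
explicit.each { |fish| puts "#{fish.species} #{fish.type}" }
```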
Posted in code.

Ruby Sorting [1] – When and Why to use sort_by()

When I read the rdoc on sort_by, I understood the general idea that sort_by is more efficient in some situations. The specifics on why were still over my head, so I wasn’t planning to get into specifics during my recent talk on sorting. Yet just a few hours before my talk Jim Weirich was still trying to cajole me into using big words like “Schwartzian Transformation” in my talk because, he teased, “using big words makes you sound important” :)

The good thing is that this gave me a chance to talk it out with him, and actually understand it for real. It was too late for me to add that into my talk a few hours before I was to give it, but I do want to talk about it here now that I understand.

sort_by() is good if the values you’re sorting on require some kind of complex calculation or operation to get their value.

Let’s say you have an aquarium, and you save the dates of when each fish is born in a database. Later, you want to sort the list of fish by age. But you must calculate the age based on the birth date. So the Fish class has an age method:

class Fish
...
  def age
    (Date.today - birthday).to_i
  end
end

So when you sort like this:

>> fishes = Fish.find(:all)

>> fishes.sort do |a, b|
>>   a.age <=> b.age
>> end

It will calculate age over and over as it sorts. And if you’ve studied sorting algorithms, you know that the items in the list are compared with other list items repeatedly until it can be determined where the items go in the ordered list. So using this way of sorting, the age will be calculated a lot!
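
To get a feel for “a lot”, here is a rough sketch that counts the calls. This Fish is a plain Ruby class I made up for the experiment (the age_calls counter is hypothetical and not part of the real model), and the exact count will depend on the sorting algorithm and the order of the input:

```ruby
require 'date'

# Hypothetical Fish with a class-level counter so we can see
# how many times age gets calculated.
class Fish
  class << self
    attr_accessor :age_calls
  end
  self.age_calls = 0

  attr_reader :birthday

  def initialize(birthday)
    @birthday = birthday
  end

  def age
    Fish.age_calls += 1
    (Date.today - birthday).to_i
  end
end

today = Date.today
days_old = [300, 365, 225, 90, 410, 30, 180, 270]
fishes = days_old.map { |days| Fish.new(today - days) }

# Every comparison calls age twice, once for each side.
fishes.sort { |a, b| a.age <=> b.age }
sort_calls = Fish.age_calls

puts "#{fishes.size} fish, #{sort_calls} calls to age"
```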

When you use sort_by() instead:

>> fishes.sort_by do |a|
>>   a.age
>> end

It does 3 things:

1. It will first go through each item in fishes, calculate age, and put those values into a temporary array keyed by the value. Let’s say we have 3 fish, one 300 days old, one 365 days old, and one 225 days old. The temporary array looks like this:

[[300, #<Fish:A>], [365, #<Fish:B>], [225, #<Fish:C>]]

2. The complex calculation is now done, once for each fish. It sorts this temporary array by the first item in each sub array. Meaning, it sorts by the numbers 300, 365 and 225, without recalculating them.

[[225, #<Fish:C>], [300, #<Fish:A>], [365, #<Fish:B>]]

3. Lastly, it goes back through the array, grabbing the 2nd array elements (the actual Fish objects) and putting them in order into a flattened 1-dimensional array.

[#<Fish:C>, #<Fish:A>, #<Fish:B>]

So, that is how you end up with a sorted array without recalculating values more than you need to. And that is why sort_by() can be more efficient.
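
The three steps can be written out by hand with map and sort. This is only a sketch of the idea, not necessarily how Ruby implements sort_by internally, and the Struct-based Fish with its name field is a hypothetical stand-in for the real model:

```ruby
require 'date'

# Hypothetical stand-in for the Fish model: just a name and a birthday.
Fish = Struct.new(:name, :birthday) do
  def age
    (Date.today - birthday).to_i
  end
end

today = Date.today
fishes = [
  Fish.new("A", today - 300),
  Fish.new("B", today - 365),
  Fish.new("C", today - 225)
]

# 1. Calculate each age once and pair it with its fish.
pairs = fishes.map { |fish| [fish.age, fish] }

# 2. Sort the pairs by the precomputed age only.
sorted_pairs = pairs.sort { |x, y| x.first <=> y.first }

# 3. Throw the ages away and keep the fish, now in order.
by_hand = sorted_pairs.map { |pair| pair.last }

puts by_hand.map { |fish| fish.name }.inspect            # ["C", "A", "B"]
puts by_hand == fishes.sort_by { |fish| fish.age }       # true
```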

Posted in code.

Ruby Sorting [0] – Sorting a Hash

It figures that one of the questions someone asked following my talk on sorting last month was about something I specifically chose not to cover, for the sake of time and to be able to cover more valuable topics. So, when someone asked whether you can sort a Hash and how, I knew A) that you can, and B) that I had barely glanced at the rdoc on hash sorting and didn’t remember how it worked. So I had to simply suggest they go read about it themselves. But I was also curious to go read about it more myself.  So I took my own advice, and here’s what I came up with:

I define a hash:

irb(main):001:0> typical_fish_colors =
irb(main):002:0*    {"clownfish" => ["orange","white","black"],
irb(main):003:1*     "goldfish" => ["orange"],
irb(main):004:1*     "angelfish" => ["black","white"]
irb(main):005:1>    }

Default sorting on a hash – sorts the keys.

Pretty simple I guess. What you may not expect, depending on what you know of hashes, is that you don’t get a Hash object returned from calling .sort, you get an Array.  A nested Array of [key, value] pairs (three levels deep here, because my values are arrays themselves). This is because arrays are ordered, while hashes (at least in the Ruby 1.8 I’m using) are not.

irb(main):006:0> typical_fish_colors.sort
=> [["angelfish", ["black", "white"]], ["clownfish", ["orange", "white", "black"]],
   ["goldfish", ["orange"]]]
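
Here is a short sketch of working with that result. Walking the pairs should work on any Ruby version; the last step assumes Ruby 2.1 or newer, where Array#to_h exists and hashes keep their insertion order, so it will not work on the version shown in the irb sessions here:

```ruby
typical_fish_colors = {
  "clownfish" => ["orange", "white", "black"],
  "goldfish"  => ["orange"],
  "angelfish" => ["black", "white"]
}

sorted_pairs = typical_fish_colors.sort
puts sorted_pairs.class   # Array

# Each element is a two-item [key, value] array, so the block can take both.
sorted_pairs.each do |fish, colors|
  puts "#{fish}: #{colors.join(', ')}"
end

# Assumption: Ruby 2.1+. Rebuild a Hash whose keys come out in sorted order.
sorted_hash = sorted_pairs.to_h
puts sorted_hash.keys.inspect   # ["angelfish", "clownfish", "goldfish"]
```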

What you can’t do when sorting a hash is use sort! (sort-bang):

irb(main):007:0> typical_fish_colors.sort!
NoMethodError: undefined method `sort!' for #<Hash:0x2d17ea8>
 from (irb):7

What else you can’t do when sorting a hash by default is use symbols for keys, at least in Ruby 1.8, where Symbol has no <=> method:

irb(main):008:0> typical_fish_colors =
irb(main):009:0*     {:clownfish => ["orange","white","black"],
irb(main):010:1*      :goldfish => ["orange"],
irb(main):011:1*      :angelfish => ["black","white"]
irb(main):012:1>     }

irb(main):013:0> typical_fish_colors.sort
NoMethodError: undefined method `<=>' for :clownfish:Symbol
 from (irb):13:in `<=>'
 from (irb):13:in `sort'
 from (irb):13
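
One way around this, sketched here as a suggestion rather than something from my talk, is to sort on the string form of each key with sort_by. Strings do know how to compare themselves, so this does not depend on Symbol having a <=> method:

```ruby
typical_fish_colors = {
  :clownfish => ["orange", "white", "black"],
  :goldfish  => ["orange"],
  :angelfish => ["black", "white"]
}

# Compare the keys as strings; the result is still an array of [key, value] pairs.
by_name = typical_fish_colors.sort_by { |fish, colors| fish.to_s }

puts by_name.map { |fish, colors| fish }.inspect   # [:angelfish, :clownfish, :goldfish]
```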
Posted in code.

On Sorting in Ruby

After speaking at the Columbus Ruby Brigade last month on the topic of Ruby sorting, I’ve gotten a chance to use some of the sorting techniques I spoke about at work. There were a few things I didn’t get a chance to cover since I was only giving a 5-10 minute lightning talk. There are also a few things I’ve come to understand a little better (benefits and gotchas) since then.

Things I talked about (feel free to check out the slides from my talk for examples of some of these things):

  • Basic sorting with .sort and .sort!
  • Sorting your own complex types by defining a <=> method on your class
  • Sorting on the fly with blocks
  • Sorting nested objects
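
As a quick reminder of what the second item looks like, here is a minimal sketch. It is not taken from the slides; this Fish class and its fields are assumptions for illustration. Defining <=> is enough for .sort to work without a block, and including Comparable adds operators like < and > for free:

```ruby
# Hypothetical Fish class that knows how to compare itself to another Fish.
class Fish
  include Comparable

  attr_reader :species, :type

  def initialize(species, type)
    @species = species
    @type = type
  end

  # Order by species first, then by type.
  def <=>(other)
    [species, type] <=> [other.species, other.type]
  end

  def to_s
    "#{species} #{type}"
  end
end

fishes = [
  Fish.new("Platy", "Sunset"),
  Fish.new("Guppy", "Fancy Tail"),
  Fish.new("Molly", "Dalmation")
]

sorted_names = fishes.sort.map { |fish| fish.to_s }
puts sorted_names.inspect
puts fishes[0] > fishes[1]   # Platy sorts after Guppy, so this prints true
```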

Things I did not cover for various reasons, and the blog posts covering them in more detail after the fact:

Posted in code.

Speaking at the Columbus Ruby Brigade

Tonight I will be speaking at the Columbus Ruby Brigade (CRB) meeting.  I will be taking the lead from another CRB member, Kevin Munc, who has started a series of introductory talks focusing on taking one method in Ruby and doing a 5-10 minute lightning talk about that method.  “Ruby Method of the Month.”  Tonight I will be speaking about sorting in Ruby.

I will post a copy of my slides up here soon.  Not until after I’m done so nobody cheats :)

Update: August 18, 2009
Here are the slides from my talk, enjoy!

Helping my mother start a blog!

I’ve spent the last several weeks helping my mom set up a blog. She’s doing really well with all this stuff that’s new to her. I know she was apprehensive at first – but I also remember a time when she wasn’t sure she’d be able to do email and now she uses email all the time. :) She’ll be an expert blogger in no time I’m sure!

I wrote a little more about the personal story behind her blog on my other blog. But in summary: she wrote a book back in 2006 about planning church family camp and that’s what her blog covers.

But here, I thought I’d cover a few of the technical details and decisions made in helping set this up for her.

Using things she’s already familiar with

First, I wanted to make this process easy for her by keeping it as similar as possible to things she’s already familiar with.  Based on that, I decided to have her try out Windows Live Writer for her writing.  She’s very familiar with using MS Word, and Live Writer isn’t too unlike Word.  Not exactly the same, but closer than going to the web page and doing her writing inside a text box.

Easy to use over dial-up

In addition, I wanted to make it something easy for her to use with a dial-up internet connection.  Yes, my parents live so far out in the country that their internet options are basically dial-up or satellite.  They’re too far out for DSL, and their cell phone signal out there is unreliable.  Given that they mostly only use the internet for email, it’s hard to justify the cost of internet through the satellite.  If she were writing from her blog’s website, I’m not sure a dial-up connection would handle the site’s automatic periodic saves very well. With Live Writer, she can do all of her writing offline, and only connect right before she wants to publish her content.

Easy for me to support her

And I wanted to make this process easy for me.  Since I have blogs on both WordPress and Blogger, that narrowed my choice of where to host her blog down to one of those two.  I wanted something I was already familiar with to make it that much easier for me to be her “system administrator.”  :)   I went with WordPress because I’ve found it a little easier for me to configure, and because of a few extras like blog stats that are already there automatically.  I know it’s possible to get blog stats using Google tools and such, but then I have to go set that up separately and I can be lazy. :)   Bonus is I’ve learned a lot more about what WordPress can do through the process of setting things up for her.  Particularly in relation to Widgets.  So much can be done with Widgets, and anything that can’t already be done I can pretty much take care of myself with the catchall Text Widget coupled with my HTML skills.

Explaining the tools

At first she wasn’t sure how Live Writer and WordPress relate to each other, why she needed both, and what each one was really for.  I used an analogy to try to explain it to her.  I said to think of Live Writer as her spiral bound notebook.  (She wrote her book on the computer, I just went old-school with the analogy :) )  And think of  WordPress as her book publisher.  She can write whatever she wants to in her notebook.  Scratch things out.  Throw pages away.  Start new pages.  But only when she’s satisfied with a draft, she sends it to the publisher and they publish it for the world.

Ada Lovelace Day

While I didn’t officially “pledge” for this (bah), I thought I’d throw something out there for Ada Lovelace Day 2009 – March 24, and write about a woman in technology whom I admire.

When I first thought about participating in this, I had some severe writer’s block.  I just couldn’t think what or who I wanted to write about.  And as I reflected on why this was so tough, it occurred to me that most of that came from a thought that goes something like:

“dangit, people, why aren’t we writing about inspiring people in technology?  What about people who inspired my career?  Why does it matter if it’s a woman?  I guess I just don’t get it.”

I, for one, want to be defined by – I want to be remembered as – many things, but not necessarily because I am a woman.  Rather because I am a good developer.  Because I am intelligent.  Because I am a nice person.  Because I am a person who can figure things out.  Because I am a person who helps others figure things out.  Because I have good ideas.  But not because I am a woman.

Then, a person came to mind: the first woman programmer I ever worked alongside: Clar-René Sliper.  I was a little, junior co-op, still in college.  She was one of the senior members of our group.  Aside from me, she was the only female programmer there.  (There were other women around, just not programmers.)

I remember wondering how she got into IT.  After all, there were so few women in my college classes.  I can imagine that finding women in the field was even more rare when she got into IT.  Yet whenever she was around, it just never seemed like – nobody acted like – she was any different than anyone else on the team.  She was smart, I never saw her ideas ignored, she never got walked all over or had to put up with crap from anyone.  And she never had to put forth any attitude to get that respect.  The only difference was her name was Clar instead of Jim or Liem or André or Dean or Matt or Dave.

It’s not unusual for people’s first experiences with things to shape them going forward.  Maybe those early experiences paved the whole way for me thinking: “People – this is not a big deal. Quit making it one.  Singling people out only makes it worse.”  You should admire anyone around you who deserves it.  Man, woman, white, black, Asian, disabled, etc.  Appreciate anyone around you who deserves it.  And treat everyone around you with at least a basic level of respect, no matter what.

I appreciate Clar for unwittingly showing me that it is not that hard, or weird, to be as competent and capable as everyone else around me.

CodeMash 2009

CodeMash – I’ll be there!

Are you?

Using Selenium to check for the presence of an image

I spent a little bit of time yesterday trying to figure out how to use Selenium to check for the presence or absence of a specific image used as the background of an HTML table.  It wasn’t that hard to do in the end, but I got a little tripped up trying to check for the wrong thing.  Pretty cool once I got it done, so I figured I’d share here.

What I tried to do:

At first I tried to grab the HTML text inside the TD and verify the presence of the filename of the image as text.

What I should have done all along:

What ended up working was that I knew the table row and cell in which I was expecting the image, and I had to do a “verifyElementPresent” instead. (Thanks to Dean for helping me figure that out)

[Image: fishsel]


The only thing I don’t like about that approach (vs. checking for text) is that I can’t do a wildcard search for the text.  I have to check for the exact, full path to the file.  So, if we decide to re-organize the directory structure and move some images to a “saltwater” and some to a “freshwater” subdirectory, the Selenium test is fragile and breaks.  But chances are the images are going to remain pretty static, so I am happy enough with this approach.

Posted in code.

Great Lakes Ruby Bash

As I’ve mentioned before, I’ve enjoyed going beyond my local network of technical folks and attending regional events in Ohio and Michigan such as CodeMash, Cleveland Day of .Net (even though I’ve never been a .Net developer) and Agile Summer Camp.

It’s high time I expand my Ruby-specific regional network, especially now that I’m getting paid to write Ruby code! :)

This weekend, I’m looking forward to attending the Great Lakes Ruby Bash.

Posted in community.