Janis Miezitis blog

Ruby's Functional Sweetness and Modeling Clay

I enjoy data manipulation. You have data coming in and data coming out; it is like kneading modeling clay to create something meaningful from a formless shape.

Let’s say our task is to summarize data on a daily basis from an API endpoint that provides a list of activities, where any given day may contain more than one activity:

{"activities": [
  {
      "captured_at": 1392366463,
      "distance": "1.12",
      "duration": 12,
      "calories": 100
  },
  {
      "captured_at": 1392355727,
      "distance": "4.20",
      "duration": 80,
      "calories": 400
  },
  {
      "captured_at": 1392280163,
      "distance": "2.32",
      "duration": 20,
      "calories": 2200
  }
]}

Let’s parse our JSON data and get the array of activities:

activity_data = JSON.parse(json_data).fetch('activities')
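As a minimal sketch of this step, assuming the response body is already available as a string (the `json_data` literal below is a shortened stand-in for the payload above), note that `JSON.parse` returns string keys by default, which is why the later code fetches `"distance"` rather than `:distance`:

```ruby
require 'json'

# Shortened stand-in for the API response shown above.
json_data = '{"activities": [{"captured_at": 1392366463, "distance": "1.12", "duration": 12, "calories": 100}]}'

# fetch raises KeyError if the top-level "activities" key is missing.
activity_data = JSON.parse(json_data).fetch('activities')
activity_data.first.fetch('distance') # => "1.12"

# Optional: ask the parser for symbol keys instead.
symbolized = JSON.parse(json_data, symbolize_names: true)
symbolized.fetch(:activities).first.fetch(:duration) # => 12
```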

The first manipulation is to transform the data into the format we want to work with. From the API docs we know that this endpoint provides distance in miles, calories are self-explanatory, duration is in minutes, and captured_at is a UNIX timestamp. As a European and a sane person, I want to convert miles to the metric system, and our internal app uses meters to represent distance. We care about time only in the context of a date, so let’s convert it to a Date object.

activity_data.map! do |a|
  {
    distance: miles_to_meters(a.fetch("distance")),
    calories: a.fetch("calories").to_i,
    duration: a.fetch("duration").to_i,
    date: Time.at(a.fetch("captured_at")).to_date
  }
end
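The `miles_to_meters` helper is not shown in the post, so here is one hypothetical way to write it, together with the date conversion. The sketch assumes 1609.344 meters per mile and truncation to whole meters, which happens to reproduce the totals in the summary output; `timestamp_to_date` is an illustrative name, and it converts in UTC so the resulting date does not depend on the machine's time zone.

```ruby
require 'date' # Time#to_date is only available once 'date' is loaded

METERS_PER_MILE = 1609.344

# Hypothetical helper: accepts a String or Numeric number of miles
# and returns whole meters (fractional meters are truncated).
def miles_to_meters(miles)
  (Float(miles) * METERS_PER_MILE).to_i
end

# Illustrative helper: UNIX timestamp to a Date, evaluated in UTC.
def timestamp_to_date(timestamp)
  Time.at(timestamp).utc.to_date
end

miles_to_meters("1.12")       # => 1802
timestamp_to_date(1392366463) # Date for 2014-02-14
```

Plain `Time.at(timestamp).to_date`, as used in the post, works the same way but uses the local time zone, so an activity captured close to midnight could land on a different day depending on where the code runs.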

I use fetch here to get values from the hash because it acts as a sort of test that our assumptions about the structure of the data are correct. If any of the given keys is missing, or the third party changes their API, fetch raises a KeyError and we find out early.
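To make the difference concrete, here is a small sketch with a made-up activity hash that is missing the "calories" key:

```ruby
# Made-up record with "calories" missing, for illustration only.
activity = { "distance" => "1.12", "duration" => 12 }

silent_value = activity["calories"]          # => nil, the gap goes unnoticed
with_default = activity.fetch("calories", 0) # => 0, an explicit default

raised = false
begin
  activity.fetch("calories") # no default given, so this raises
rescue KeyError
  raised = true
end
```

With `[]` the nil would travel further into the code and only surface later, for example when the totals are added up.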

OK, so far, so good. Let’s get to the tricky part: summarizing the data. As a first step we need to group activities by date. That is pretty easy to do with Ruby’s group_by method. It returns a hash in which each key is a value returned by the block, and each value is an array of the elements for which the block returned that key.

groups_of_activities = activity_data.group_by {|a| a[:date]}
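To show the shape of the result, here is a sketch using already-transformed records (trimmed to two fields for brevity):

```ruby
require 'date'

# Transformed activities, reduced to distance and date for the example.
activity_data = [
  { distance: 1802, date: Date.new(2014, 2, 14) },
  { distance: 6759, date: Date.new(2014, 2, 14) },
  { distance: 3733, date: Date.new(2014, 2, 13) }
]

groups_of_activities = activity_data.group_by { |a| a[:date] }

# Keys are the distinct dates, in the order they were first seen.
groups_of_activities.keys.length                   # => 2
# Each value is the array of activities for that date.
groups_of_activities[Date.new(2014, 2, 14)].length # => 2
groups_of_activities[Date.new(2014, 2, 13)].length # => 1
```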

Each entry in groups_of_activities now holds all the data needed for a single day’s summary. We will iterate over the groups and transform each one into a single summary.

We can map over the groups. Inside the block we calculate each total by mapping the group to the primitive values (Integers) of that field and reducing them with a method reference passed as a symbol, &:+ in our case.

summaries = groups_of_activities.map do |date, activity_group|
  {
    distance: activity_group.map{|a| a[:distance]}.reduce(&:+),
    duration: activity_group.map{|a| a[:duration]}.reduce(&:+),
    calories: activity_group.map{|a| a[:calories]}.reduce(&:+),
    date: date
  }
end
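As a possible variation rather than the post's own approach, the three near-identical lines can be collapsed with `Enumerable#sum`, assuming Ruby 2.4 or newer. The `FIELDS` constant is introduced here only for illustration.

```ruby
require 'date'

# Grouped activities in the shape produced by group_by above.
groups_of_activities = {
  Date.new(2014, 2, 14) => [
    { distance: 1802, duration: 12, calories: 100 },
    { distance: 6759, duration: 80, calories: 400 }
  ],
  Date.new(2014, 2, 13) => [
    { distance: 3733, duration: 20, calories: 2200 }
  ]
}

FIELDS = %i[distance duration calories] # illustrative constant

summaries = groups_of_activities.map do |date, activity_group|
  # Build [field, total] pairs, then turn them into a hash.
  totals = FIELDS.map { |field| [field, activity_group.sum { |a| a[field] }] }.to_h
  totals.merge(date: date)
end

summaries.first[:distance] # => 8561
```

One behavioral difference is worth knowing: `[].reduce(:+)` returns nil, while `[].sum` returns 0. It does not matter here, since group_by never produces an empty group.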

Et voilà! We have a collection of activity summaries. I’ll print the summaries converted to JSON (I did the conversion behind the scenes):

[
   {
      "distance":8561,
      "duration":92,
      "calories":500,
      "date":"2014-02-14"
   },
   {
      "distance":3733,
      "duration":20,
      "calories":2200,
      "date":"2014-02-13"
   }
]
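The post does the JSON conversion off-screen. One way it might look, as a sketch that is not taken from the Gist: format the Date explicitly with `iso8601`, then pretty-print with `JSON.pretty_generate`.

```ruby
require 'json'
require 'date'

# Summaries in the shape built by the map step above.
summaries = [
  { distance: 8561, duration: 92, calories: 500, date: Date.new(2014, 2, 14) },
  { distance: 3733, duration: 20, calories: 2200, date: Date.new(2014, 2, 13) }
]

# Turn each Date into an ISO 8601 string so the output format is explicit.
printable = summaries.map { |s| s.merge(date: s[:date].iso8601) }

json_output = JSON.pretty_generate(printable)
puts json_output
```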

Here is a full Gist of the code with a simple spec.

UPD: Thanks for all the replies on Reddit.
