coding, javascript

Map Reduce is fun and practical in JavaScript

I’ll be honest, i’ve never used map..reduce in javascript. I wrote it in java and ruby (so far). So i had to try and i had an challenge in front of me that i needed to complete.

I turned to Mozilla for their wonderful JavaScript documentation.

This is an implementation of <array>.map

if (!Array.prototype.map)
{
  Array.prototype.map = function(fun /*, thisArg */)
  {
    "use strict";

    if (this === void 0 || this === null)
      throw new TypeError();

    var t = Object(this);
    var len = t.length >>> 0;
    if (typeof fun !== "function")
      throw new TypeError();

    var res = new Array(len);
    var thisArg = arguments.length >= 2 ? arguments[1] : void 0;
    for (var i = 0; i < len; i++)
    {
      // NOTE: Absolute correctness would demand Object.defineProperty
      //       be used.  But this method is fairly new, and failure is
      //       possible only if Object.prototype or Array.prototype
      //       has a property |i| (very unlikely), so use a less-correct
      //       but more portable alternative.
      if (i in t)
        res[i] = fun.call(thisArg, t[i], i, t);
    }

    return res;
  };
}

Now the fun part, how do i FILTER data, then map, then reduce and get the result back.

 

Challenge:

1. A bunch of data with object such as this:  ({name: “it”, salary: 100} )

2. Filter data by name “it”

3. Provide a total sum of all salaries for that name

 

Clearly this can be achieved in an simple data.forEach(function(item….)  but with map reduce + filter its a lot more elegant , though probably not as fast.

Here is my solution (after i sat down and refactored what i wrote during the challenge earlier )

 

var data = []

while( data.length < 100) {
   data.push({name: "it", salary: 33*data.length});
}
data.push({name: "accounting", salary: 100});

data.push({name: "acc", salary: 100});
var sum = data.filter(function(val){
	return val.name == "it"
})
.map(function(curr){
	return curr.salary;
})
.reduce(function(prev, curr){
	return prev +curr;
})

console.log(sum);

I generated a bunch of data and it prints the sum of all salaries for a name “it”.

For some reason, i thought that map and reduce would executed in parallel and would have a callback but that just means how heavily i am into NodeJS . On the next post, ill share how i truly write async code an how sorting, filtering, map/reduce can be achieved with callbacks.

 

Thanks and happy coding.

 

 

Share This:

coding, ruby

Ruby interview challenge

I had a pleasure of getting an interview with an upcoming startup (i won’t disclose which one). Besides implementing fizz buzz in ruby, i was asked to write a method that would check for input to be a palindrome.

Palindrome is a word, phrase, number, or other sequence of symbols or elements, whose meaning may be interpreted the same way in either forward or backward.

Keep in mind: using reverse is not allowed 🙂

I wrote two versions since i wasn’t pleased with my first one.

 

Rspec – testing driven development

require 'spec_helper'
require 'blah'

describe "Blah" do 

	it "should match reversed order of the word " do
		palindrome("abba").should == true
		palindrome("abcba").should == true
	end
	it "should reject if reversed order doesnt match" do 
		palindrome("abbac").should_not == true
	end

	it "should handle empty string with passing" do 
		palindrome("").should == true
	end

	it "should handle various cases " do
		palindrome("AbbA").should == true
	end

	it "should handle empty spaces " do
		palindrome("   Ab  bA").should == true
	end
end

 

Version 1

def palindrome2(word)

i = 0
last = -1
word.each_char do |c|
	if word[i] != word[last]
		return false
	end
	i+=1
	last -=1
end

return true

end

 

Version 2

def palindrome(word)
	word = word.downcase.gsub(" ","").chars 
	word.each{|c| return false if  word.shift != word.pop  }
	true
end

I have a feeling there is a better way of writing this.

Thanks

Share This:

bigdata, coding, javascript, nodejs

Using sumo logic to query bigdata

Main selling point of Sumologic is: real-time (near) big data forensic capability.

[pullquote]Log data is the fastest-growing and most under-utilized component of Big Data. And no one puts your machine-generated Big Data to work like Sumo Logic[/pullquote]

 

At Inpowered, we used Sumologic extensively, our brave and knowledgeable DevOps folks managed chef scripts that contained installation of Sumologic’s agents on most instances. What’s great about this:

  • Any application that writes any sort of log, be it a tomcat log (catalina.out)
    or custom log file (i wrote tons of json) , basically any data that’s structured or otherwise is welcome
  • Sumologic behind the scene processes your data seamlessly (with help of hadoop
    and other tools in the background) and you deal with your data using SQL-like language
  • Sumologic can retain gigabytes of data , although there are limits as to what is kept monthly
  • Sumologic has a robust set of functions , from basic avg, sum, count,
    it has PCT (percentile ) – pct(ntime, 90) gives you 90th percentile of some column
  • Sumo has a search API, allowing you to run your search query ,
    suspend process in the background and return
  • Sumo’s agent can be installed on hundreds of your ec2 machines (or whatever)
    and each machine can have multiple collectors (think of collector as a source of logs)
  • Besides an easy access to your data (through collectors on hundreds of machines) ,
    very useful dashboard with autocomplete field for your query is easy to use
  • Another cool feature is “Summarizing” within your search query,
    allowing you to group data via some sort of pattern into clusters
  • Oh! And you get to use timeslicing when dealing with your data

 

Getting started guide can be found here 

High level overview how Sumologic processes data behind the scene (img from sumologic)

valprop_anomaly

 How could we live without an API?!

 

Sumologic wouldn’t great if it hadn’t offered us to run queries ourselves using whatever tools we want.

This can be achieved fairly easily using their Search job API , here is an example that parses log files that contain 10.343 sec —-< action name> . Somewhat a common usecase where an app logs these things and i want to know which are the slowest, whats the 90th percentile and what were the actions within certain time range and sliced by hour so that i don’t get too much data. Just an example written in nodeJS.

query_messages – query that will return you all the messages with actions that were slow

query – query that will provide you statistics and 90th percentile, sorted result

var request = require('request'),
    username = "[email protected]",
    password = "somepass",
    url = "https://api.sumologic.com/api/v1/logs/search",
    query_messages = '_collector=somesystem-prd* _source=httpd "INFO"| parse "[* (" as ntype  | parse "--> *sec" as time | num(time) as ntime | timeslice by 1h |  where ntime > 7 | where !(ntype matches "Dont count me title")   | sort by ntime',
    query = '_collector=somesystem-prd* _source=httpd "INFO"| parse "[* (" as ntype  | parse "--> *sec" as time | num(time) as ntime | timeslice by 1h |  where ntime > 7 | where !(ntype matches "dont count me title")  | max(ntime), min(ntime), pct(ntime, 90)  by _timeslice | sort by _ntime_pct_90 desc'

var qs = require('querystring');
var util = require('util'); 
from = "2014-01-21T10:00:00";
to = "2013-01-21T17:00:00"

	var params = {
    		q: query_messages,
    		from: from,
    		to: to
    	};

    	params = qs.stringify(params);    
    url = url + "?"+ params;
    request.get(url,
    {
    	'auth': {
    		'user': username,
    		'pass': password,
    		'sendImmediately': false
    	},
    },
    function (error, response, body) {
    	if (!error && response.statusCode == 200) {
    		var json = JSON.parse(body);
    		insp(json);
    	}else{
    		console.log(">>> ERrror "+error + " code: "+response.statusCode);
    	}
    }
);

    function insp(obj){
	console.log(util.inspect(obj, false, null));
}

Now you have an example and you can work with your data , transform it, send a cool notification to a team, etc etc.

Thanks and enjoy Sumologic (free with 500Megs daily )

 

Share This: