Sunday, December 18, 2011

Achievement Unlocked

Sunday, December 11, 2011




the act or habit of procrasthesizing, or putting off or delaying the writing of a graduate thesis, academic paper, or other large document, in particular one having an imminent and impending deadline. In most cases procrasthesation involves the use of a web browser to read social networking update feeds, design and technology blogs, xkcd, slashdot.org, random youtube videos, overheardin[your hometown], writing a blog post such as this one, or all of the above.

"His constant procrasthesation made him miss his thesis deadlines, drop out of school, and end up as a travelling-used-tire-salesman, but boy did he send out the coolest links!"

Sunday, November 13, 2011

#SciFund Challenge: Crowdsourcing Funding for Science

Just found out about the #SciFund Challenge, in collaboration with Rockethub, to fund scientific research via crowdsourcing. I think it's currently only set up for the month of November.

It would be interesting to see how initiatives like this develop, whether an approach like this could be sustainable, and what would be the mechanisms to ensure the money is actually used for the right purpose...


Marketing lesson: If you want to get your stuff funded fast, then it's good to have words like "zombie" in your title, like this one: "Support Zombie Research!"


Further reading:
Scientific American Article
SciFund blog

/via slashdot

Sunday, October 9, 2011

New Chromebook Remote Desktop Extension

This new Google Chrome Remote Desktop extension just came out on Friday, and it could really come in handy.

A large part of my work these days is done on a remote server (mainly due to sensitive data analysis and computation power needs). Sometimes an SSH terminal is sufficient but many other times I find the graphical interface more productive, at least for me.

There are always issues with setting up remote desktop connections, especially across different operating systems. To connect to a Windows machine I use MS Remote Desktop Connection; for Linux I use NX and sometimes VNC (less secure, more bandwidth cost, but simpler to set up), etc. One of the most recent issues is OS X Lion breaking compatibility with the existing Free NX / Nomachine NX client setup, and so on.

So you can understand my excitement about this new extension. Finally, Chromoting, for real! I can't wait to try it out.

And who knows, maybe my Chromebook will be useful for doing something other than browsing after all!

To download from the Chrome webstore:


Wednesday, September 7, 2011

Spotify + Echo Nest = Echofi yourself

A recent report describes an independent developer's use of the Spotify and Echo Nest APIs to create "Echofi", a Pandora-like auto-generated playlist for Spotify.
Some quick and raw thoughts that come to mind:

  1. Kudos to developer Andy Smith for stitching these two APIs together!
  2. Back in 2009 Spotify and Echo Nest announced a collaboration on Spotify's playlist and music discovery functions, and even showed a prototype of this. This was way before Spotify hit the US. I'm not sure if they have launched anything public yet, but I would not be surprised if they are cooking something very similar to Echofi. If so, it will probably be baked into Spotify in a much more integrated manner.
  3. Yet more respect to Andy Smith for actually doing something public and working before an official thing is launched. And respect to the two companies for exposing the APIs that enable apps like this. Let's just hope the app isn't found to be violating some "terms of service" and shut down. Open API = Innovation.
  4. Pandora - Be afraid, be very afraid. This type of service was one of the key things I was missing in Spotify. Combining auto playlists with the ability to also play any specific song that comes to mind makes for a really powerful service, and Pandora's radio-station-like model of operation just can't compete with that.
  5. I really like Echo Nest, and not just because it was founded by two Media Lab alums. I really like them because they released the public "Million Song Dataset", which even made it onto Amazon Web Services as a public dataset. I played with it a bit during the Boston Hack/Reduce Event in June and it's really neat.
  6. This app could be a great opportunity to compare two competing approaches: Pandora's manually generated "music genome" vs. Echo Nest's algorithmic machine approach to understanding music and generating its feature space. 
Which leads me to my homework for you the reader: 
Pick a song, any song, and use it to seed a Pandora station and an Echofi playlist, see how they compare, and tell us what you think!

Saturday, August 20, 2011

From Agile Development to "Alive Development"

Om Malik and others have recently blogged about the emergence of the "Alive Web" - which is all about the live interaction with others, the real-time, the here-and-now. Chatroulette is among its representative examples.

People talk about how Google Plus also supports this trend, citing Google+ Hangouts as an example. I agree the Google+ launch has taken the Alive Web one giant leap forward - but for me the major "WOW" moment is not about any of its end-user features like Hangouts, Huddles, or the new Games. The most amazing leap for me is Google's "Alive" development and iteration of the Google+ product.

I have NEVER seen any company be as open and as interactive about the feature and UX design of a product at the scale of Google+. This is not about how high-ranking Googlers are publicly active on the product while some execs of other companies don't use their own product (+Thomas Hawk mentioned that Carol Bartz doesn't have a Flickr account). I am talking about how, from day one, the Google+ team has been on the front lines, using the product itself to generate a live conversation with the users: eliciting feedback through the feedback button, writing posts that ask for general or specific feedback, responding to users' posts with suggestions and/or complaints, and holding public hangouts for Q&A. Every few days we see another post and video with new features, directly responding to user comments and requests. Some recent examples by product manager +Shimrit Ben-Yair, designer +Jonathan Terlesky, and software wizard +Andy Hertzfeld: here, here, and here.
This is way beyond the traditional "agile development" methodologies. This is a near-realtime development cycle, with continuous user engagement, feedback, and response.
What we're seeing here is the Alive Development of the Alive Web.

Thursday, August 18, 2011

SXSW Talk Proposal

My SXSW plug:
I proposed a talk for the upcoming SXSW based on my PhD research and the work we are doing at the Human Dynamics group at the MIT Media Lab. I think it's going to be awesome, and I'm not biased at all.
Details below. Please vote!

Investigating Social Mechanisms with Mobile Phones
Imagine an imaging chamber placed around an entire community. What if we could, with permission, record and display nearly every facet of behavior, communication, and social interaction among its members as they live their everyday life? This potential would afford rich insights into humanity - how societies operate, how real world relationships form and change over time, and how behavior and choices spread from one person to another. We could diagnose the health of a community, and of its individuals. We could even measure the effects of feeding this information back to them.

At the MIT Media Lab, we have built the beginnings of what we call “The Social MRI.” You don’t need a huge chamber – just a bunch of modern smartphones. Using our mobile sensing software, we transformed a residential community into a living laboratory for over 15 months. Many signals were collected from each participant, altogether comprising what is, to date, the richest real-world dataset of its kind. As part of our continuing research, we are developing new tools to realize "the quantified self", and architectures to do all of this from a user centric perspective – where individuals own their data, and privacy is embedded into the framework.

This talk will highlight surprising results from the study, introduce our open source tools developed for data collection, and discuss how the lessons learned could extend to improve the consumer and business worlds.

Questions The Talk Will Answer:
  • How can we design new mechanisms of social support (e.g. for increasing physical activity), and measure their performance with a real community?
  • How can mobile phones be used to infer real-world social signals, relationships, and other personal and group characteristics?
  • How is it possible to preserve user privacy while still enabling today’s data collection and advertising-driven business models?
  • Who has more influence over the mobile apps that you install on your phone – your friends on Facebook, your “real” friends, or the people you just hang out with?
  • What tools can we provide to developers and researchers to build apps for smart mobile sensing that are both secure and battery efficient?

Vote for My SXSW Idea!

Sunday, August 14, 2011

Step aside "Wearable Computing", "Epidermal Computing" is the new buzz!

"Epidermal Electronics" - Remember this term, because I am sure we'll be hearing more of it in the future. Ars Technica writes about this amazing new technology, the "Epidermal Electronic System" (EES). Basically, it's a "technology that allows electrical measurements (and other measurements, such as temperature and strain) using ultra-thin polymers with embedded circuit elements. These devices connect to skin without adhesives, are practically unnoticeable, and can even be attached via temporary tattoo."

Check out the cool video by Northwestern:

Tattoo electronics could have medical applications from Northwestern News on Vimeo.

For an in-depth read and pictures that "show a lot of skin", I recommend diving into the full Science paper, and this Science perspective article that talks about the potential of the technology for medical applications.

What I thought was way cool is how they show proof of concept for solar power or inductive power sources (read: wireless power up, like RFIDs). I wonder if piezoelectrics could be used to power such epidermal devices from the motion of the wearer, or maybe harvest the person's body heat... (did anyone say Matrix?)

The authors discuss and show the feasibility of RF-based wireless communication, but I wonder if you could also do something like Body Area Networks, where multiple epidermal devices communicate with one another using the human body as their medium - so that you could have one device responsible for aggregating the sensor data and transmitting it out. And if we go this far, why not person-to-person communication, à la Jay Silver's ok2touch, but with all-epidermal computing:

How about epidermal peer-to-peer music sharing?

I can already picture Hallmark making a kiss-activated-epidermal-electronics-musical-greeting-card-tattoo for Valentine's Day (Hallmark, let's talk royalties. Call me).

And can you imagine how the TSA would react to this tech?

Do you have other ideas for EES applications?

Wednesday, May 11, 2011

funf: Open Sensing Framework site live, sneak-peek launched!

In the Human Dynamics group at the MIT Media Lab, under the direction of Prof. Alex (Sandy) Pentland, we work on sensing, learning, and gaining a better understanding of people and social systems. 

For the past year and a half I've been point-man on a very large scale and longitudinal experiment, a "living laboratory", or a "social observatory" if you will. The study team includes Wei Pan, Cory Ip, Cody Sumter, Maya Orbach, Yves-Alexandre de Montjoye, Inas Khayal, Sai Moturu, and many many others who have helped out in various capacities through this very challenging year and a half.

We equipped members of a residential community with mobile phones that are used as social and behavioral sensors (with participants' informed consent, and operating under strict privacy and experimental protocols supervised by MIT's Institutional Review Board, or IRB). We are developing mathematical models to better understand how people and communities organize and operate, and how things like news, ideas, habits, or diseases propagate through a real-world social network. We are also investigating ways to help people make use of the knowledge collected by their mobile phones and aggregate collective data to improve their lives in constructive ways.

You can read a bit more about our project and other projects at the Human Dynamics group in this recent Wall Street Journal article ("The Really Smart Phone"), and I'll probably write more about the study and the results coming out of it in future posts.

This post, however, is dedicated to the underlying Android-based mobile phone software that we developed for data collection and management. We have long wanted to release this codebase as an open framework that would help researchers, developers, and individual self-trackers collect data in an easy and safe way, and leverage the experience we have gained through our field deployments. We recently received a grant through Google Health dedicated to doing just that - opening up and extending the software platform.

There is still a lot of work ahead of us this upcoming summer. We want to implement the lessons we've learned, update the architecture, add a whole lot of documentation and tutorials, and do the many other things needed in the transition from an internal experimental platform to a configurable, reusable, and extensible sensing framework.

For those of you who don't have the patience to wait, and following many requests to try out the platform, we decided to do an early release of a "sneak peek" version so that you could get a feel of funf while we work to refactor the full source code for public consumption. We will gradually be adding functionality and tutorials, and of course the source code...

We thought that this year's Google I/O, Google's developer conference, would be the ideal opportunity for launching our framework and the sneak peek.

So here we go: World, meet "funf: Open Sensing Framework", currently tailored for Android devices. 
Check out our freshly minted website at

We hope you find it useful!

Wednesday, April 27, 2011

... And the Pursuit of Ha-PII-ness

Criminal law does not define in print every type of gun or object that might kill a person. It talks about the act itself, and evaluates intent and the surrounding circumstances to decide if something is a murder, homicide, manslaughter, self-defense, etc., whether there was so-called "malice aforethought" involved, and so on. Can you imagine someone arguing that running over another person with a "Killdozer" is not a crime since it's not on the predefined weapons list?
Privacy law should be treated in the same manner. Currently, there are lists of explicitly defined Personally Identifiable Information, or PII. PII refers to information that can be used to uniquely identify, contact, or locate a single person or can be used with other sources to uniquely identify a single individual. 
One of my criticisms of existing privacy legislation relates to cases where information types are explicitly defined (e.g. a person’s name, telephone number, etc.). Since some of these types of information are very dynamic and fleeting as technology advances, it becomes easy to bypass the explicitly stated data types by using ones that are not explicitly stated.

Here are four categories of workarounds, or types of information not covered by explicit rules:
  • Temporary Identifier Workarounds: Like the saying "The 'temporary' becomes the 'permanent'," many times it is possible to use "temporary" identifiers to uncover a permanent one, or link them together to create a persistent chain of temporary identifiers. There is the famous case of Internet IP addresses not being on the PII list. Many times, there is some second temporary identifier that remains constant between IP address changes and makes it possible to link the old and new IP addresses together - like a website cookie or a site's internal username identifier.
  • New and changing identifier types: Is your Facebook ID PII? More and more so. And you should be very wary of apps and referrers who leak it. By the time legislators add it to the PII list, Facebook might not even be that relevant (depending who you ask...). And what about your Android ID (unique identifier based on the specific handset and your Google account) or the iPhone's unique device ID (UDID)? It's hard to keep up with new services and their proprietary IDs.
  • "Joined Identifiers" (for lack of a better name): This is when pieces of information that seem innocuous when considered independently, generate a not-so-innocuous identifier when combined together. For example, a set of properties that your browser advertises, and could be used to "fingerprint" your computer, like this, or this (seriously, try it). There's also work like Sweeney's on how simple demographics like gender, date of birth, and zip code, could be used to uniquely identify a large percentage of the US population. This data-combo essentially becomes PII, and should be treated with the same respect.
  • "Inferred Identifiers": Big Data also means big data-mining and inference. Some of it can be very revealing about a person's identity, and can even be used for de-anonymizing users. Golle and Partridge show how people could be uniquely identified if the approximate locations of an individual's home and workplace can both be deduced from a location trace, combined with public US Census data. Narayanan and Shmatikov show how the supposedly anonymized Netflix Prize dataset could be de-anonymized. These are just a few examples.
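To make the "joined identifiers" point concrete, here is a minimal Python sketch that counts how many records in a table become unique once several innocuous-looking columns are combined. The records are made-up toy values and the function name `unique_fraction` is just an illustrative choice; this is not Sweeney's method or data, only the basic counting idea behind it.

```python
from collections import Counter

# Hypothetical toy records: (gender, date_of_birth, zip_code). All values are made up.
records = [
    ("F", "1980-03-14", "02139"),
    ("M", "1975-11-02", "02139"),
    ("F", "1980-03-14", "02139"),
    ("M", "1990-07-21", "02142"),
    ("F", "1985-01-30", "02142"),
]

def unique_fraction(rows, columns):
    """Return the fraction of rows whose values on `columns` occur exactly once."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)

if __name__ == "__main__":
    print("gender only:        ", unique_fraction(records, (0,)))
    print("zip code only:      ", unique_fraction(records, (2,)))
    print("gender + dob + zip: ", unique_fraction(records, (0, 1, 2)))
```

In this toy table nobody is unique by gender alone or by zip code alone, but most rows become unique once the three columns are joined. That is the sense in which the combination behaves like PII even though no single column does.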
We need laws and policies that are able to advance as fast as new technology, or even faster - we need them flexible enough to deal with developments before they happen.  I believe we need to take a step back from specific and explicit identifiers, and define our policies in more robust terms, and increase focus on use and intent rather than the identifier itself. 
That might be where updated definitions for Fair Information Practices Principles (FIPPs) come into play. US Department of Commerce's "Green Paper" on privacy from Dec 2010 talks about a dynamic privacy framework, but not exactly in the same sense that I am trying to get at. 
I wish we had a way to define our laws, or particularly the test of adherence to laws, like a modular computer program. There are inputs, and an evaluation "function" that takes the inputs and spits out a result: whether or not something is PII under a predefined context. A parameter that is considered PII under one context might not be considered PII in a different context. We (or "THEY") should be able to swap out the inputs and evaluation functions as new ones come into play, to evaluate PII.
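As a rough illustration of what such a swappable evaluation "function" could look like, here is a small Python sketch. Everything in it is hypothetical: the context names, the rule functions, and the field names are placeholders I made up for the example, not anything taken from actual legislation.

```python
from typing import Callable, Dict, List

# A rule takes a record (field name -> value) and says whether it counts as PII.
Rule = Callable[[Dict[str, str]], bool]

def explicit_list_rule(record: Dict[str, str]) -> bool:
    """Classic approach: a fixed list of explicitly named fields."""
    return any(field in record for field in ("name", "phone"))

def joined_demographics_rule(record: Dict[str, str]) -> bool:
    """Flags the gender + date of birth + zip code combination as PII."""
    return {"gender", "date_of_birth", "zip_code"}.issubset(record)

# Hypothetical contexts, each with its own swappable set of rules.
RULES_BY_CONTEXT: Dict[str, List[Rule]] = {
    "advertising": [explicit_list_rule, joined_demographics_rule],
    "aggregate_statistics": [explicit_list_rule],
}

def is_pii(record: Dict[str, str], context: str) -> bool:
    """Evaluate a record against whatever rules are registered for the context."""
    return any(rule(record) for rule in RULES_BY_CONTEXT[context])

if __name__ == "__main__":
    record = {"gender": "F", "date_of_birth": "1980-03-14", "zip_code": "02139"}
    for context in RULES_BY_CONTEXT:
        print(context, "->", is_pii(record, context))
```

The same record comes out as PII in one context and not in the other, and adding a new identifier type means registering one more rule rather than rewriting the whole list.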
Who knows, perhaps PII could/should be defined not in a discrete way, but in a statistical manner. For example, relatively accurate GPS coordinates of a person’s home in Manhattan would still make it very hard to extract the identity of the person, due to the population density in the area. However, GPS coordinates of a person living in a rural agricultural area might indicate very accurately who the person is, as they might point to the only house in the area. In that case, the accurate GPS coordinates should not be used, but they could be reduced in resolution to cover a much larger radius of confidence, one that includes a predefined minimal number of people, so that the information is not considered PII anymore. We would of course want to generalize this to more than just GPS input.
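Here is one way the resolution-reduction idea could be sketched in Python: snap a location to a grid cell, and keep enlarging the cell until it contains at least k known homes. The coordinates are made up, and the grid sizes, the threshold k, and the function names are all assumptions for illustration; a real scheme would need actual population data and a more careful treatment of cell boundaries.

```python
import math

def cell_of(lat, lon, cell_deg):
    """Grid cell index containing (lat, lon), for square cells of cell_deg degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def coarsen(lat, lon, homes, k, cell_sizes=(0.001, 0.01, 0.1, 1.0)):
    """Return (cell_deg, cell) for the finest grid whose cell holds at least k homes.

    Returns None if even the coarsest grid does not reach k homes.
    """
    for cell_deg in cell_sizes:
        target = cell_of(lat, lon, cell_deg)
        count = sum(1 for h in homes if cell_of(h[0], h[1], cell_deg) == target)
        if count >= k:
            return cell_deg, target
    return None

# Hypothetical home locations: a dense cluster and a few scattered rural homes.
homes = [
    (40.7501, -73.9902), (40.7503, -73.9905), (40.7506, -73.9908),
    (44.5234, -100.1234), (44.5734, -100.1634), (44.6234, -100.2234),
]

if __name__ == "__main__":
    print("dense area:", coarsen(40.7501, -73.9902, homes, k=3))
    print("rural area:", coarsen(44.5234, -100.1234, homes, k=3))
```

With these toy points the dense-area home can be reported at the finest grid, while the rural home has to be coarsened all the way to the largest cell before it shares that cell with at least three homes.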
Anyways, before I wade even deeper into meta-legislation and meta-nerdiness territory, here's the bottom line: The concept of Personally Identifiable Information is still very important, but should be radically redefined in order to remain relevant, together with an update of the principles of Fair Information Practices. It's good for the ecosystem going forward, it's good for end-users, and it's going to be good for businesses who play fair.

Further reading: 
- Bill proposal: Personal Data Privacy and Security Act of 2009 (I especially like the list of those who oppose)

Wednesday, April 20, 2011

The World is Flast

Yesterday was the day of the Boston Marathon, the world's oldest annual marathon.
Last Friday, as I was returning home from Logan airport, the shuttle busses and subway were already filled with marathon visitors, comparing stats, qualification times, and some other strange terms that I wasn't really familiar with, like "running", whatever that means...

Apparently, the marathon itself is not the only race in this story. What I wanted to focus on is not the marathon day, but the events surrounding this year's marathon registration back in the fall. The 2011 marathon sold out in a record 8 hours and 3 minutes, compared to the 2010 marathon which sold out in 65 days, also a record at the time. Many long-time Boston Marathon runners, and many others who had prepared for this marathon for months, even years, and met the required qualifications in preliminary competitions, were not able to register. Some complained about the fact that thousands of non-qualifying runners who signed up to promote various charities and fundraising efforts were taking spots that should have been reserved for the competitive runners. The organizers of the marathon even raised the bar for the qualifying scores of next year's marathon. However, I am not sure that the issue is with the Boston Marathon's organization. This is part of a much bigger trend.

Google's developer conference, Google I/O, sold out in 59 minutes. In 2009 it took 90 days to sell out, and in 2010 it took 50 days. I am happy to be one of the lucky ones going this year, but I also feel bad for so many developers who wanted to go. Here too, some of the disappointed ones pointed out that many registrants are doing it just because of the cool gifts and gadgets Google has been known to give out, while pushing serious developers aside. But again, I think it's more than that.

Apple's WWDC 2011 sold out in 10 hours, nearly 20 times faster than in 2010 (which took 8 days to sell out), nearly 70 times faster than in 2009 (one month), and nearly 150 times faster than the 2008 conference (two months). Burning Man Festival's first three tiers of tickets also sold out in record time, and there's even talk of reaching the cap on the total number of available tickets. Conan O'Brien's 42-show tour sold out in several hours. Charlie Sheen's tour sold out in 18 minutes.

The world is not just getting flat. It's getting fast, super fast. Like a good restaurant in New York City, once the word is out on a lucrative event - everybody gets in line. Information that used to spread in closed circles of the hard core fans and advocates is now instantly trending on Twitter and the other social sites, and the race is on. Scalpers and second market sellers are part of the game too. A Google I/O Academia ticket, originally priced at $450, sold for over $2000 on eBay, and many others for over $1000.

This whole thing is starting to remind me of high-speed trading on Wall Street. Firms placing their computers closer to the stock exchange to gain a few extra microseconds on a transaction. People modifying the TCP/IP stack to make their algorithm act fractions of a second faster. Already the tech savvy (and persistent) have an advantage - when Google I/O's servers crashed mid-transaction, some of them were able to rescue their unique registration key from the browser session that crashed, and re-enter it in the URL field to continue their registration even after the official site had already announced it was sold out. Are we going to start seeing software like eBay's automatic bidding apps (e.g. EZsniper) pop up for concerts and conferences? Will people start setting up their servers close to Ticketmaster headquarters?

The world is not just flat. The world is flast. And getting flaster.

From the desk of the Procrastinating Perfectionist

For years I have been accumulating heaps of notes and thoughts about everything ranging from my work and research topics, opinions on various issues, short prose, and even some ad campaigns and slogans for businesses that should never see the light of day (on that note, if anyone ever wants to open a strip club close to MIT campus, I got a full branding packet for "M.I.Tits". Call me.) These notes are in various forms of existence - from one-line idea sketches to fully baked essays, and for a long time I've been thinking of letting them loose somewhere. 

However, there had always been a long chain of prerequisites that I tied myself up with. I can't open a blog before I have the perfect theme for it. And I need to research what's the best blogging platform, and I probably want to set it up myself on my own server so I have the most flexibility with it, and I need to update my homepage first (it's currently about 3 years out of date, and the new homepage I've been working on is far from complete and already more than one year out of date...), and I need to buy a domain name, and so on. Setting up my online presence has always been something I do in my spare time. If you actually know me in person you would probably ask - "what spare time?", to which I answer - Exactly.

I've decided enough is enough. Time to focus on what actually matters, and the rest will follow. 
So here goes. In this blog I will try to put up some new reflections as well as some older content that hasn't yet seen the light of day, and I'll generally try to use this blog as a place for arranging my different thoughts and ideas.

This is me.
This is my blog.

Now let's see how long I can keep this up.