Author Archive
Note: While I am keeping this blog for the occasional longer piece, I have moved to Google+ (link to my Google+ profile) for the more casual use.

PyContracts

July 10th, 2011 andrea Comments off

In the past two years I have fallen in love with Python. In the course of my work I developed a few libraries that I use consistently across projects and that save me quite a bit of development time. Some of them are worth sharing, and I hope to have the time to do it (ah!).

PyContracts is one such library, stable enough to merit a version number larger than 1.0. I uploaded it to the PyPI software repository, so that anybody can install it using:

easy_install PyContracts

(or: pip install PyContracts)

PyContracts can be used to declare constraints on the arguments and return values of functions. While it can be used for type-checking, I found it much more valuable for expressing complicated constraints on the values rather than on the types. For example:

from contracts import contract

@contract(color='seq[3](>=0,<=1)')
def paint(color):
    ...

This checks that color is a sequence of 3 elements, and each element is a number between 0 and 1. Any sequence is allowed: lists, tuples, Numpy arrays, etc.

Note that it would be very tedious to do by hand:

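from collections.abc import Sequence

def paint(color):
    # Check the container first: a sequence of exactly 3 elements.
    if (not isinstance(color, Sequence)
            or len(color) != 3):
        raise ValueError('I need a sequence of length 3.')
    # Then check each element: a number in [0, 1].
    for c in color:
        if (not isinstance(c, (int, float))
                or c < 0 or c > 1):
            raise ValueError('Colors must be in [0,1]')
    ...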

All these tests are done automatically by PyContracts, and similar error messages are given.

Here’s a more complicated example. Suppose you want to write a function blend() that takes a list of RGB or RGBA images, blends them together, and returns the blended image.

There are several checks you need to do: the input is a list, there are at least two images, each image is RGB/RGBA, each image is the same size, and so on.

With PyContracts you can express all of that very compactly:

@contract(images="list[>=2]( array[HxWx(3|4)](uint8) )",
         returns="array[HxWx3](uint8)")
def blend(images):
    ''' Blends a series of images together. '''
    ...
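For comparison, here is a rough sketch of what the same checks might look like by hand. It assumes NumPy arrays as input; the helper name check_blend_inputs is hypothetical and not part of PyContracts, and it covers only the argument checks, not the return value.

```python
import numpy as np

def check_blend_inputs(images):
    """Hand-written version of the checks: list, >= 2 images, uint8 HxWx(3|4), same size."""
    if not isinstance(images, list):
        raise ValueError('images must be a list')
    if len(images) < 2:
        raise ValueError('need at least two images')
    height = width = None
    for i, image in enumerate(images):
        if not isinstance(image, np.ndarray) or image.dtype != np.uint8:
            raise ValueError('image %d must be a uint8 array' % i)
        if image.ndim != 3 or image.shape[2] not in (3, 4):
            raise ValueError('image %d must have shape HxWx3 or HxWx4' % i)
        if height is None:
            # The first image fixes H and W for all the others.
            height, width = image.shape[:2]
        elif image.shape[:2] != (height, width):
            raise ValueError('image %d has a different size' % i)
    return height, width

rgb = np.zeros((4, 5, 3), dtype=np.uint8)
rgba = np.zeros((4, 5, 4), dtype=np.uint8)
size = check_blend_inputs([rgb, rgba])
```

That is about twenty lines of boilerplate for what the contract string says in one.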

See the PyContracts home page for more information.

Categories: research Tags:

Estimating stress and procrastination levels from inbox data

February 7th, 2011 andrea 4 comments

I’ve been stuck home with a bad flu for the last few days. The headache and paracetamol intoxication impacted my abilities and willpower, to the point that the only productive thing I did was polishing my backup strategies.

I added duplicity to my Time Machine setup for day-to-day backups. I also found a way to back up my Gmail account using getmail. Getmail is written in Python, so I looked a little bit into the libraries that can be used to interact with IMAP. It’s actually very easy to retrieve messages or other statistics; so I had to do something with it.

You’re asking why? Why? Because I love data! I love data like Tarantino likes making movies.

I wrote a script that periodically logs the headers of flagged messages in my inbox, and another that reads the logged data, and plots a couple of statistics: stress, defined as the number of flagged messages; and procrastination, defined as the median age of those messages.
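The two statistics are simple to compute once the message dates are available. The following is only a sketch of the definitions above, not the actual script: the function name inbox_stats and the sample dates are made up for illustration, and fetching the headers over IMAP is left out.

```python
from datetime import datetime, timedelta
from statistics import median

def inbox_stats(flagged_dates, now):
    """Return (stress, procrastination) for the dates of the flagged messages.

    stress: number of flagged messages.
    procrastination: median age of those messages, in days.
    """
    stress = len(flagged_dates)
    if stress == 0:
        return 0, 0.0
    ages_days = [(now - d).total_seconds() / 86400.0 for d in flagged_dates]
    return stress, median(ages_days)

# Hypothetical sample: three flagged messages, 1, 3 and 10 days old.
now = datetime(2011, 2, 7, 12, 0)
flagged = [now - timedelta(days=1),
           now - timedelta(days=3),
           now - timedelta(days=10)]
stress, procrastination = inbox_stats(flagged, now)
```

The median is less sensitive than the mean to the one ancient message that never gets answered.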

Here’s a snapshot of the result:

For now, I only have three days of data, but you can see it live on my webpage and track my stats in the future.

This is especially useful if you’re waiting for an answer from me!

Categories: life, research Tags:

Tools of the trade

December 16th, 2010 andrea 5 comments

My old Macbook was in terrible condition after a long and productive life: it’s been powered on for an average of 16 hours per day for 3 years, and it has traveled to several continents. It will be fondly remembered.

I got a new Macbook pro and spent a considerable amount of time reinstalling my development and research environment. In the hope of saving some time during the next reinstall, I documented everything I installed. I thought I’d write a blog post about it, thinking that this list might be useful to someone else.

Developing environment

  • First things first: the developer tools from Apple; they come on the install CD.
  • macports is my choice to complete the Unix environment. Packages installed: wget, ffmpeg, mplayer, imagemagick, autossh, aspell.
  • python is my current language of choice. The indispensable packages to learn and use are numpy, scipy, matplotlib, pytables, opencv. The Enthought Python Distribution is the most convenient package, although I have been having a few small problems with the 64 bit version.
  • Eclipse is my favorite multilingual IDE, of course with pydev, the Python plugin.
  • textmate is the best casual editor you can find for OS X. The essential plugins are the matlab bundle, and the remate plugin, to work over sftp links.
  • The inconsolata fonts are my favorite programming fonts. On Mac there is already Consolas, but Inconsolata is a free equivalent that you can install on Linux as well.
  • cmake is the best solution for building and packaging C/C++ projects.
  • After years with subversion, I am now a git convert. I found the best solution for Mac is to just install the binaries from the official website.
  • hdfview is a nice HDF viewer. See a few reasons why you should use HDF.
  • I dumped Parallels for VirtualBox: it’s free, and better.
  • You should drop bash for zsh. The best theme/plugin manager for zsh is oh-my-zsh.

Networking

  • Chicken of the VNC is still the best VNC client around. In theory all VNC clients should be compatible with all servers; in practice, there are all sorts of incompatibilities, and Chicken of the VNC seems to play along well, especially with older Linux servers.
  • With synergy and a couple of computers, you can create a fancy Matrix-style C&C environment.
  • JungleDisk is my current favorite solution for automatic offsite backup (coupled with Time Machine for onsite backup).
  • expandrive is a must if you are working with multiple machines.
  • I switched from Firefox to Chrome; it’s faster, and has better memory handling. The indispensable plugin is adblockplus.
  • autossh might be useful if you need to establish SSH tunnels towards your machines (example: to secure VNC).

Doing research

  • Evernote saved my sanity! It is my solution for all the random bits of information (everything that I care about, but not so “stable” as to require a repository.)
  • Papers is the other piece of software that saved my sanity. Highly recommended.
  • Jabref is better than Papers at handling .bib references; useful at writing time when you want to have more control on formatting.
  • Skim is better than Preview for reading long PDF files.
  • Keynote has been a welcome liberation from PowerPoint. There was a period in which I used to do presentations with LaTeX using PowerDot, but nowadays I do everything with Keynote.
  • The best LaTeX distribution for Mac is MacTeX.
  • If you are writing raw LaTeX, you are probably wasting your time; LyX gives you a nice GUI environment, and you still can have the flexibility of LaTeX when you need it.
  • Staying up late in front of your screen screws up your melatonin production. Use flux to minimize this problem.
  • Use Freedom to cut your internet connection when you really have to work. (Unfortunately it is not free anymore.)
  • Use RescueTime to assess your productivity (warning: it can be very demoralizing!)
  • SimplyNoise is useful when your work environment gets too noisy.
  • Piping to mplayer/mencoder, with perhaps a second pass with ffmpeg, is the best solution to create high-quality videos for your research.
  • VLC plays everything.
  • Sometimes I forget huge log files in some remote directory. I solved this problem using GrandPerspective.

Software that I have to install but I get no joy in using

  • Adobe Acrobat is the most bloated piece of software I have ever seen. I hate its intrusive “update manager”. But it is sometimes useful for checking which fonts a PDF is using.
  • Matlab is something I swore to use again only when forced.
  • Mathematica is still enigmatic and profoundly unintuitive for me.
  • There’s no alternative to Skype.

Miscellaneous

Categories: research Tags:

Lessons learned

October 24th, 2010 andrea Comments off

Spot the bug:

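import numpy
import scipy.signal

def compute_derivative(x, dt):
  # Central-difference kernel, scaled by the time step.
  deriv_filter = numpy.array([-0.5, 0, 0.5]) / dt
  d = scipy.signal.convolve(x, deriv_filter, mode='same')
  # Copy the neighbors into the two boundary samples.
  d[0] = d[1]
  d[-1] = d[-2]
  return d

Hint: this snippet is correct: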

def second_derivative(x, dt):
  return compute_derivative(compute_derivative(x, dt), dt)

Moral of the story: always, always, always write unit tests, otherwise you run the risk of spending nights trying to understand why certain flies go left when they are supposed to go right, and in the end, who knows, they might prefer to go left, this is not an exact science, maybe this is an important discovery, perhaps I need more coffee.
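For the record, here is the kind of unit test that would have caught it. This is one reading of the puzzle, not something stated above: convolution flips the kernel, so the first derivative comes out with the wrong sign, and applying it twice cancels the error. The sketch swaps in numpy.convolve for scipy.signal.convolve to stay self-contained, and compute_derivative_flipped is a hypothetical fix.

```python
import numpy as np

def compute_derivative(x, dt):
    # The snippet from the post, with numpy.convolve swapped in.
    deriv_filter = np.array([-0.5, 0, 0.5]) / dt
    d = np.convolve(x, deriv_filter, mode='same')
    d[0] = d[1]
    d[-1] = d[-2]
    return d

def compute_derivative_flipped(x, dt):
    # Hypothetical fix: reverse the kernel, because convolution flips it again.
    deriv_filter = np.array([0.5, 0, -0.5]) / dt
    d = np.convolve(x, deriv_filter, mode='same')
    d[0] = d[1]
    d[-1] = d[-2]
    return d

dt = 0.1
t = np.arange(20) * dt
ramp = 3.0 * t  # the true derivative is +3 everywhere

original_slope = compute_derivative(ramp, dt)        # comes out negative
flipped_slope = compute_derivative_flipped(ramp, dt)  # comes out positive

# Two sign errors cancel: away from the borders, the second derivative of t**2 is 2.
second = compute_derivative(compute_derivative(t ** 2, dt), dt)
```

A test on a plain ramp is enough: the sign of the slope is the first thing to check, and the last thing one thinks of checking.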

Categories: research Tags:

Graduation day

May 27th, 2010 andrea Comments off

It is graduation day at Willow Garage: the first batch of PR-2 robots leaves home for their final destinations in research and industry. The money shot is at 4:23.

Categories: research Tags:

Midnight with Gray’s anatomy

May 9th, 2010 andrea Comments off

It’s midnight. I find myself in the library with “Gray’s anatomy”.

For my homework I have to draw the brain and label several areas. The result looks more like a pizza with too many toppings.

I think that everybody will turn in something like that. By chance, the only other person in the library is an undergrad doing the same homework. When I walk by, I glance at her work, and, well, it looks better than the tables in the book. I go home, I’m too old for this.

Categories: caltech Tags:

David MacKay: Sustainable energy without the hot air

April 7th, 2010 andrea Comments off

The other day I went to David MacKay’s talk “Sustainable energy without the hot air”. MacKay got a PhD in Computational and Neural Systems at Caltech and went on to become a leading researcher in estimation and machine learning. Later he became interested in sustainable energy; now he is Chief Scientific Advisor of Her Majesty’s Department of Energy and Climate Change.

His approach is to explain the problem of sustainability in terms of quantities to which every person can relate. This makes the discussion more meaningful and pragmatic. It works! Even if I’m not that good at remembering numbers, today I still remember that the average European consumes 125 kWh/day, and the average American about twice that. If I fly once internationally every year, that averages to about 25 kWh/day. The same can be done for energy sources: the output of each technology is measured in energy per square meter. For example, we learned that to supply the UK’s energy consumption with solar energy one would have to cover a huge part of the territory with solar panels.
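To make the units concrete, here is a small back-of-the-envelope sketch using only the figures quoted above. The variable names are mine, and the conversions are plain arithmetic (24 hours per day, 365 days per year), not numbers taken from the talk.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365

# Figures as remembered from the talk.
european_kwh_per_day = 125
american_kwh_per_day = 2 * european_kwh_per_day  # "about twice that"
flight_kwh_per_day = 25                          # one international flight per year

def kwh_per_day_to_kw(kwh_per_day):
    """Convert a daily energy budget into an equivalent continuous power draw."""
    return kwh_per_day / HOURS_PER_DAY

# Energy attributed to the single yearly flight.
flight_kwh_per_year = flight_kwh_per_day * DAYS_PER_YEAR

# Share of the average European budget taken by that flight.
flight_share = flight_kwh_per_day / european_kwh_per_day

european_kw = kwh_per_day_to_kw(european_kwh_per_day)
```

With these numbers, one flight per year is a fifth of the average European budget, and 125 kWh/day corresponds to a bit more than 5 kW drawn around the clock.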

The bottom line is that the only technologies that are sustainable, that is, they will produce energy for the next 1000 years with limited pollution, are nuclear energy, clean coal, and solar power “in other people’s deserts”. All other renewable sources just do not produce enough energy. His views are collected in a book that is available online.

It’s refreshing to see that one can start in a narrow field such as machine learning and go on to make some policy contribution for one of the main problems of mankind.

Categories: caltech Tags:

Robotic cloth folding

April 5th, 2010 andrea Comments off

Watch this video from Pieter Abbeel’s group at Berkeley.

Given Abbeel’s research curriculum, I jumped to reading the preprint to see if there was some learning involved. Unfortunately, it seems that the techniques used are fairly standard (precise state estimation + giant state graph + motion planning). Still, it is a very impressive application.

The contribution highlighted by the paper is the detection of appropriate folding points. While the robot is holding and slowly rotating the cloth, several cameras observe the scene and can reconstruct a 3D model. The analysis of the 3D model makes it possible to detect the towel’s borders, which are the appropriate folding points.

The paper certainly shows that Willow Garage’s robots, and accompanying software, are a robust platform for developing complex applications. For example, this application involves interplay between robot locomotion, stereo vision, and motion planning. The robot must move its arms to grab the towel at the desired folding point, but, at the same time, it must not collide with other parts of the towel. This implies that the (a priori unknown) shape of the towel must be included in the motion planner’s world representation.

All in all, it’s one step towards the robotic maid!

Categories: research Tags:

Judging at a middle school science fair

March 28th, 2010 andrea Comments off

Last week I was a judge for the science fair at Marshall Fundamental High School. The kids were in 6th to 8th grade (11 to 14 years old). No, there were no volcanoes.

There was much variety in the complexity of the projects. We went from the classics (“how does light influence plant growth”, “what is my dog’s favorite toy”, “which kitchen paper is more absorbent”) to the more innovative (“which home surface has more bacteria”, “which bridge design bears more load”). One girl tested which method best treats the discoloration of her blond hair after swimming.

The most courageous effort was by a group that found a way to justify six hours of video-game playing as a science project. I tried to insist on giving them a special mention, but the teacher didn’t want to. Probably she didn’t want to encourage insubordinate behavior.

Categories: caltech, life Tags:

Censi’s Census

March 19th, 2010 andrea Comments off

I felt this was the only correct answer.

Categories: caltech Tags: