Here are some of my current and recent projects and activities:

  • Preservation of knowledge & data – I'm part of the Caltech Digital Library Development team, have been active in community efforts such as Data Together, and have researched decentralized content-addressed networks.
  • CASICS – Comprehensive and Automated Software Inventory Creation System, a way to catalog software by leveraging ontologies and machine learning. The project has produced numerous reusable software modules:
    • Nostril, a Python module that infers whether a given short string of characters is likely to be random gibberish or something meaningful.
    • Dassie, a database of the subject term hierarchies found in the Library of Congress Subject Headings (LCSH).
    • Spiral, a library of functions for splitting class names, function names and other identifiers found in source code files.
  • SBML – A community standard format for exchanging computational models in biology. In addition to the SBML specifications, the notable open-source software I co-developed includes the following:
    • MOCCASIN, a program to convert certain classes of MATLAB ODE-based models into SBML. It does not require MATLAB. Some of its innovations include a Python-based parser for MATLAB.
    • SBML Test Suite. This includes a Test Runner written in Java using SWT GUI widgets and bundled as a self-contained desktop app.
    • libSBML, an API library for reading, writing, manipulating, and validating SBML files and data streams in many languages, including C++, Java, MATLAB, Python, and R.
  • COMBINE – The COmputational Modeling in BIology NEtwork, a community group to help coordinate the development of standards and resources for computational modeling in biology.