Here are some of my current and recent projects and activities:
- Preservation of knowledge & data – I'm part of the Caltech Digital Library Development team, have been active in community efforts such as Data Together, and have researched decentralized content-addressed networks.
- CASICS – Comprehensive and Automated Software Inventory Creation System, a way to catalog software by leveraging ontologies and machine learning. The project has produced numerous reusable software modules:
  - Nostril, a Python module that infers whether a given short string of characters is likely to be random gibberish or something meaningful.
  - Dassie, a database of the subject term hierarchies found in the Library of Congress Subject Headings (LCSH).
  - Spiral, a library of functions for splitting class names, function names, and other identifiers found in source code files.
- SBML – A community standard format for exchanging computational models in biology. In addition to the SBML specifications, notable open-source software I co-developed includes the following:
  - MOCCASIN, a program that converts certain classes of MATLAB ODE-based models into SBML. It does not require MATLAB; its innovations include a Python-based parser for MATLAB.
  - SBML Test Suite, which includes a Test Runner written in Java with SWT GUI widgets and bundled as a self-contained desktop app.
  - libSBML, an API library for reading, writing, manipulating, and validating SBML files and data streams in many languages, including C++, Java, MATLAB, Python, and R.
- COMBINE – The COmputational Modeling in BIology NEtwork, a community group to help coordinate the development of standards and resources for computational modeling in biology.