Portfolio — iOS & macOS apps

Subscript – A Freeform Note-Taking App

Screenshot of the Subscript note-taking app

Freeform, Human-Centric, and Built from the Ground Up

Subscript was born out of frustration with the rigidity of existing note-taking apps. I wanted a tool that respected the structure of serious, long-term note creation, especially for STEM content, and preserved the human presence behind each note. What emerged is a freeform, handwriting-based note-taking system that emphasizes authenticity, long-term readability, and device-independent rendering.

All input is stored in a scalable vector-based model space, making it resolution-independent and suitable for export to PDF or web. Each note is composed of entries—either structured headings or freeform strokes—recorded as a collection of stylus sample points. These strokes are rendered using a customized Swift port of the perfect-freehand algorithm, with user-defined pen tips, stroke widths, and color selections.
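As a rough illustration of what "resolution-independent" means here, the sketch below stores strokes in model space and projects them onto a concrete surface only at render time. It is written in Rust, the language of the dataset pipeline described later in this portfolio, rather than the Swift used by Subscript's renderer. The `SamplePoint`, `Stroke`, and `Viewport` types and their fields are hypothetical names chosen for this example, not Subscript's actual data model.

```rust
/// One stylus sample in model space (hypothetical layout; units are arbitrary).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SamplePoint {
    pub x: f64,
    pub y: f64,
    /// Normalized stylus pressure in 0.0..=1.0.
    pub pressure: f64,
}

/// A freeform stroke: an ordered run of samples plus styling.
#[derive(Debug, Clone, PartialEq)]
pub struct Stroke {
    pub points: Vec<SamplePoint>,
    /// Nominal width in model-space units.
    pub width: f64,
    /// Color packed as 0xRRGGBBAA.
    pub color: u32,
}

/// Maps model space onto one output surface (screen, PDF page, web canvas).
#[derive(Debug, Clone, Copy)]
pub struct Viewport {
    pub scale: f64,
    pub offset_x: f64,
    pub offset_y: f64,
}

impl Viewport {
    /// Project a model-space sample to device coordinates.
    pub fn to_device(&self, p: SamplePoint) -> (f64, f64) {
        (
            p.x * self.scale + self.offset_x,
            p.y * self.scale + self.offset_y,
        )
    }
}

/// Project a whole stroke; the stored stroke is never modified.
pub fn project_stroke(stroke: &Stroke, view: &Viewport) -> Vec<(f64, f64)> {
    stroke.points.iter().map(|p| view.to_device(*p)).collect()
}
```

Because only the `Viewport` changes between a phone screen and a PDF page, the same stored samples can be re-rendered at any size without loss.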

The system supports a semantic heading hierarchy (H1–H6), which auto-generates a navigable table of contents. This design aligns with the philosophy of LaTeX—separating content from presentation—while prioritizing a beautiful reading experience over time.
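One plausible way to derive a table of contents from a heading hierarchy is sketched below in Rust (again chosen to match the pipeline's language, not the app's). The `Heading` type and the numbering scheme are assumptions made for illustration; the text above only states that H1 to H6 headings produce a navigable outline.

```rust
/// A heading entry as it might appear in a note (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Heading {
    /// 1 for H1 through 6 for H6.
    pub level: u8,
    pub title: String,
}

/// Render an indented, numbered outline from headings in document order.
pub fn table_of_contents(headings: &[Heading]) -> Vec<String> {
    // counters[i] holds the running number for level i + 1.
    let mut counters = [0u32; 6];
    let mut lines = Vec::new();
    for h in headings {
        let level = h.level.clamp(1, 6) as usize;
        counters[level - 1] += 1;
        // A new heading restarts the numbering of every deeper level.
        for c in counters[level..].iter_mut() {
            *c = 0;
        }
        let number = counters[..level]
            .iter()
            .map(|n| n.to_string())
            .collect::<Vec<_>>()
            .join(".");
        let indent = "  ".repeat(level - 1);
        lines.push(format!("{}{} {}", indent, number, h.title));
    }
    lines
}
```

A skipped level (an H3 directly under an H1) would show up as a `0` component in the number; a real implementation might prefer to reject or renumber such cases.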

Subscript also captures something rare in today’s AI-saturated landscape: undeniable human authorship. No part of the note-taking flow can be faked by automation. It’s handwriting, structure, and thought—preserved faithfully and rendered precisely.

For a deeper look into the philosophy and design, watch the video demo.

Used in the Real World — Chemistry 1010 (Fall 2021)

Subscript wasn’t just a prototype; it was my primary note-taking tool during an actual college-level chemistry course. I used it to produce a complete, structured set of lecture notes for Chemistry 1010 (Fall 2021), combining handwritten formulas, semantic headings, and responsive layouts. The result is a document that remains readable, navigable, and presentation-ready years after it was written.

The chemistry notes showcase Subscript’s ability to manage structured academic content over time, preserving both spatial arrangement and semantic organization. This example is publicly available as a web export:

→ View the Chemistry 1010 Notes (Web Version)

My Bilingual Dictionary & Phrasebook App

App Overview

The Dictionary

The Phrasebook

Inside the 3in1Spanish Dataset Compiler

See my YouTube video for details and commentary.

What if generating a bilingual dictionary and phrasebook wasn’t about writing static content by hand, but about compiling it, like code? That’s the idea behind 3in1Spanish, a fully offline, native iOS/macOS app that bundles a dictionary, phrasebook, and flashcards into a single lightweight tool.

At its core is a dataset pipeline inspired by compiler design, not scraping or prompt spam. This post dives into how the system works, what problems it solves, and why it’s built to scale.

Motivation

There are plenty of language apps, but most of them do at least one of the following:

- Rely on huge editorial teams for content
- Depend on fragile scraping pipelines
- Require online access to fetch data

I wanted something different:

- Fully offline
- Fully scalable (10 → 10M entries)
- Generated programmatically
- Modular, inspectable, and cache-friendly

Thus: a compiler-style LLM pipeline.

Input: Raw Source Text (EPUBs)

The source input is clean digital books, primarily EPUBs. These are ZIP archives of structured HTML chapters, usually with cleaner markup than scraped websites.

I parse the HTML, normalize it, and feed it through deterministic preprocessors that extract usable text for downstream phrase mining.
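The parse, normalize, and extract steps can be pictured with the small Rust sketch below. It is a deliberately naive stand-in that uses only the standard library: it does not unzip the EPUB, decode HTML entities, or skip `<script>` and `<style>` content, all of which a real preprocessor would need. The function names `extract_text` and `split_sentences` are hypothetical.

```rust
/// Strip markup from one chapter and collapse whitespace.
/// Naive on purpose: no entity decoding, no <script>/<style> handling.
pub fn extract_text(html: &str) -> String {
    let mut text = String::with_capacity(html.len());
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => {
                in_tag = false;
                // Treat every tag as a word boundary so "</p><p>" does not glue words.
                text.push(' ');
            }
            _ if !in_tag => text.push(ch),
            _ => {}
        }
    }
    // split_whitespace collapses runs of spaces, tabs, and newlines.
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Split normalized text into rough sentences on '.', '!' or '?'.
pub fn split_sentences(text: &str) -> Vec<String> {
    let mut sentences = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        current.push(ch);
        if matches!(ch, '.' | '!' | '?') {
            let s = current.trim();
            if !s.is_empty() {
                sentences.push(s.to_string());
            }
            current.clear();
        }
    }
    // Keep any trailing text that has no closing punctuation.
    let tail = current.trim();
    if !tail.is_empty() {
        sentences.push(tail.to_string());
    }
    sentences
}
```

Both functions are pure and deterministic, so running them twice on the same chapter yields identical input for the later, LLM-driven stages. Treating every tag as a word boundary is a simplification that misplaces spaces around inline markup such as `<em>`.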

Architecture: A Compiler for Language Data

The pipeline has multiple stages, loosely inspired by compilers:

Preprocessing
- Extracts and cleans source sentences
- Normalizes punctuation, line breaks, headings
Phrase Mining
- LLM extracts semantically useful phrases (e.g. greetings, common patterns)
- Outputs structured JSON arrays
Metadata Generation
- For each phrase or word:
  - English translation
  - IPA + simplified pronunciation
  - Grammar category
  - Notes on silent letters or irregularities
  - Regional variation info
  - Syllabification (for TTS/fallback)
Categorization & Tagging
- LLM suggests category cluster (e.g. food, travel)
- Tag-based sorting for UI
Dictionary Compilation
- Lemmas + inflected forms
- Disambiguates homographs
- Long/short definition pairs
- Example usage with glosses
Flashcard Export
- JSON entries exported into spaced-repetition-ready format
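The stage list above could be modeled along the following lines. This Rust sketch assumes the stages run strictly in the order listed, which the description does not state outright, and the `EntryMetadata` field names and the choice of required fields are illustrative guesses based on the metadata bullets, not the pipeline's real schema.

```rust
/// Pipeline stages in the order listed above (assumed to run in sequence).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stage {
    Preprocessing,
    PhraseMining,
    MetadataGeneration,
    Categorization,
    DictionaryCompilation,
    FlashcardExport,
}

impl Stage {
    pub const ALL: [Stage; 6] = [
        Stage::Preprocessing,
        Stage::PhraseMining,
        Stage::MetadataGeneration,
        Stage::Categorization,
        Stage::DictionaryCompilation,
        Stage::FlashcardExport,
    ];

    /// Stages that consume this stage's output, directly or indirectly.
    pub fn downstream(self) -> Vec<Stage> {
        // The derived ordering follows declaration order.
        Stage::ALL.iter().copied().filter(|s| *s > self).collect()
    }
}

/// Metadata attached to one phrase or word (field names are illustrative).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct EntryMetadata {
    pub translation: String,
    pub ipa: Option<String>,
    pub simplified_pronunciation: Option<String>,
    pub grammar_category: String,
    pub notes: Vec<String>,
    pub regional_variants: Vec<String>,
    pub syllables: Vec<String>,
}

impl EntryMetadata {
    /// True when the fields assumed to be mandatory are present.
    pub fn is_usable(&self) -> bool {
        !self.translation.trim().is_empty() && !self.grammar_category.trim().is_empty()
    }
}
```

Modeling optional fields as `Option` makes gaps explicit, so later stages and the app can decide how to handle a missing IPA transcription instead of silently showing an empty string.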

Retry, Repair, and Idempotency

This isn’t just prompt → response → save.

The system validates LLM output against schemas. If JSON is malformed, it attempts:

- Auto-repair (via Mistral or system prompt reset)
- Conversational correction
- Retry with fallback prompt variants
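A validate-then-escalate loop of this kind might look like the Rust sketch below. The `Strategy` enum, the order in which strategies are tried, and the `generate_with_retries` signature are all assumptions made for illustration. The model call and the schema check are passed in as closures, so no particular LLM client or JSON library is implied.

```rust
/// How a response was requested (names and ordering are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    Initial,
    AutoRepair,
    ConversationalCorrection,
    FallbackPrompt,
}

/// Try each strategy in turn until the validator accepts the output.
///
/// `generate` receives the current strategy and the previous validation
/// error (if any), and returns the raw model output.
/// `validate` parses the raw output or explains why it is rejected.
pub fn generate_with_retries<T, G, V>(
    mut generate: G,
    validate: V,
) -> Result<(T, Strategy), Vec<(Strategy, String)>>
where
    G: FnMut(Strategy, Option<&str>) -> String,
    V: Fn(&str) -> Result<T, String>,
{
    let plan = [
        Strategy::Initial,
        Strategy::AutoRepair,
        Strategy::ConversationalCorrection,
        Strategy::FallbackPrompt,
    ];
    let mut failures = Vec::new();
    let mut last_error: Option<String> = None;
    for strategy in plan {
        let raw = generate(strategy, last_error.as_deref());
        match validate(&raw) {
            Ok(value) => return Ok((value, strategy)),
            Err(reason) => {
                // Record the failure so it can be logged or fed back to the model.
                failures.push((strategy, reason.clone()));
                last_error = Some(reason);
            }
        }
    }
    Err(failures)
}
```

Returning the full failure list when every strategy is exhausted keeps the bad outputs inspectable, so malformed data is never saved silently.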

Each content unit (phrase, word, definition) is identity-tracked, cached, and compiled independently. If you delete or edit an entry, only the units derived from it are invalidated and recompiled; everything else is served from the cache.
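Selective invalidation of this kind is commonly built on a dependency graph. The Rust sketch below is one minimal way to do it, assuming string identifiers such as `phrase:buenos-dias`. The `BuildGraph` type, its methods, and the ID format are hypothetical and not taken from the actual pipeline.

```rust
use std::collections::{BTreeSet, HashMap};

/// Tracks which units were compiled from which, and which caches are valid.
#[derive(Default)]
pub struct BuildGraph {
    /// Unit ID -> IDs of units compiled from it.
    dependents: HashMap<String, Vec<String>>,
    /// IDs whose cached output is still valid.
    cached: BTreeSet<String>,
}

impl BuildGraph {
    /// Record that `downstream` is compiled from `upstream`.
    pub fn add_dependency(&mut self, upstream: &str, downstream: &str) {
        self.dependents
            .entry(upstream.to_string())
            .or_default()
            .push(downstream.to_string());
    }

    pub fn mark_compiled(&mut self, id: &str) {
        self.cached.insert(id.to_string());
    }

    pub fn is_cached(&self, id: &str) -> bool {
        self.cached.contains(id)
    }

    /// Invalidate `id` and everything transitively built from it.
    /// Returns the set of IDs that must be recompiled.
    pub fn invalidate(&mut self, id: &str) -> BTreeSet<String> {
        let mut dropped = BTreeSet::new();
        let mut stack = vec![id.to_string()];
        while let Some(current) = stack.pop() {
            // Skip units already visited (also guards against cycles).
            if !dropped.insert(current.clone()) {
                continue;
            }
            self.cached.remove(&current);
            if let Some(children) = self.dependents.get(&current) {
                stack.extend(children.iter().cloned());
            }
        }
        dropped
    }
}
```

Editing one phrase then touches only its own metadata, dictionary rows, and flashcards, while unrelated entries keep their cached output.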

Output: SQLite Database

Final output is stored as a SQLite DB and bundled into the iOS/macOS app. This allows:

- Instant local access
- No network dependencies
- App-side fallbacks (e.g. show alt forms if IPA missing)
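The fallback idea can be pictured as a simple priority chain over one dictionary row. The sketch below is in Rust to match the pipeline, although the app frontend itself is Swift. The `DictionaryRow` fields, the order of the fallbacks, and the display format are assumptions for illustration only.

```rust
/// One row as the app might read it from the bundled database (hypothetical).
#[derive(Debug, Clone)]
pub struct DictionaryRow {
    pub headword: String,
    pub ipa: Option<String>,
    pub simplified_pronunciation: Option<String>,
    pub alt_forms: Vec<String>,
}

/// Choose what to show on the pronunciation line, degrading gracefully.
pub fn pronunciation_line(row: &DictionaryRow) -> String {
    // Treat blank strings the same as missing values.
    let non_blank = |s: &Option<String>| {
        s.as_deref()
            .map(str::trim)
            .filter(|t| !t.is_empty())
            .map(str::to_string)
    };
    if let Some(ipa) = non_blank(&row.ipa) {
        return format!("/{}/", ipa);
    }
    if let Some(simple) = non_blank(&row.simplified_pronunciation) {
        return simple;
    }
    if !row.alt_forms.is_empty() {
        return format!("also: {}", row.alt_forms.join(", "));
    }
    String::new()
}
```

Because the decision is made locally from data already in the row, the app never needs a network round trip to fill a gap.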

Stack

- Rust: pipeline logic, prompt orchestration, compilation
- Swift/SwiftUI: app frontend
- SQLite: offline dataset storage
- LLMs: OpenAI GPT-3.5, Mistral, Claude (experimental)
- Prompt DSL: an XML-inspired macro system for structured LLM interactions

Generalization Potential

Although I built this for Spanish-English, the pipeline is language-agnostic. With minor changes, it could generate:

- Medical glossaries
- Legal term dictionaries
- Niche phrasebooks (e.g. aviation, military, religious)
- Low-resource language datasets

It’s not just a phrasebook — it’s a general-purpose bilingual content compiler.

Status & Future

The current app works. The dataset pipeline is fully operational and idempotent. The UI is lightweight but serviceable.

Next steps:

- Polish flashcard interaction
- Add fuzzy search + transliteration
- Expand prompt set for rarer parts of speech
- Expose the compiler as a CLI tool

Interested?

If you work in:

- EdTech
- Localization
- NLP toolchains
- AI-powered language learning

…then I’d love to connect.

I'm especially interested in contract or freelance opportunities where this infrastructure — or its generalizations — can save teams time and money on multilingual content.

Bilingual Scripture App

Contact me for details. I plan to publish the initial version to the App Store in the near future; subsequent work will add more scripture content from across cultures.