Skip to the content.

Development notes for my iOS and macOS apps

Hi! I make iPhone and macOS apps at www.appblit.com and am writing here about the technologies used to make them happen. Some are for work, some for education (kids, K12 like PopMath and PopSpell).

Readable Transcripts - LLM are great to clean up transcripts

Turns out LLMs are great at fixing ASR text transcripts, if prompted the right way šŸ˜€

I open sourced Readable Transcripts. It uses the very fast and cheap Gemini 1.5 Flash-8B model to clean up transcripts from YouTube.

I included 4 examples of pre-processed transcripts for you to compare the LLM-corrected with the original transcripts.

But you can get your own free Gemini API KEY if you want to transcribe other videos.

Screegle with CoreMediaIO Camera Extensions and ScreenCaptureKit

CoreMediaIO Camera Extensions are a great addition to macOS. In addition, ScreenCaptureKit allows capturing several windows at 60 FPS without overloading the CPU. These 2 technologies are made for Screegle! https://www.appblit.com/screegle

Here Iā€™ll post macOS and Swift tips Iā€™ve learned while developing my side-project Screegle.

I published a free sample app showing how to use a sink and source streams cameraextension The sink is then fed CMSampleBuffers from the app, allowing you to show arbitrary content from your app to your camera.

Screegle, published on the App Store, uses this technique now and the CPU usage is ~10% while sharing several windows simultaneously.

PDFReflow: Reading PDFs on a small screen

This is perhaps the most advanced tech I developed for my apps (along with Screegle). PDFReflow images PDF images and cuts them out into pieces to visually reflow them to fit your iPhoneā€™s screen. It lets you read any PDF (well, when it works!) without having to zoom or pan around. Of course, because itā€™s using good-old heuristics, it doesnā€™t always work. Page segmentation is still an active research area: not surprisingly, modern approaches use deep learning to identify areas on each page (text, figures, captions, etc.). For now, PDFReflow uses heuristics: it looks at text bounding boxes (if found in the PDF), binarizes the images to compute the connected components, applies a famous page layout algorithm called XYCut. It then reflows the page by cutting each identified word into a tiny image, or leaves images as is (as one block).

ReaderView: a reader mode for Safari that works on many web sites

I really like the ā€œReader Modeā€ that comes with Mobile Safari, but it doesnā€™t work on all web sites, and you canā€™t personalize the articles with highlights, or save them for later. So ReaderView was born! I uses heuristics to identify articles and lets you swipe or select paragraphs to highlight them. It also saves images locally, so when you open the main app, you have access to your articles along with images.

PopSpell and SpriteKit

SpriteKit is an awesome framework to build 2D games, so I had to try it. My previous kids apps (PopMath, PopGeo, etc.) used UIKit to move sprites (UIButtons, really!) on the screen. But PopSpell uses SpriteKit, along with special effect like particles when you swipe through the letters. Check it out PopSpell

Contact

Find me on Twitter and Iā€™ll help you.