The Problem With Data | Christopher Symonds

OpenAI recently gave ChatGPT Plus users access to the code interpreter plugin. As a subscriber to that service, I've had the chance to experiment with the plugin, and it has given me the same sense of wonder that GPT-4 itself gave me. It's the feeling that I have been given a tool that is incredibly powerful, and that the possibilities are limited only by my capacity to think of uses for it. In one experiment, I asked ChatGPT to tell me something that most people are not aware of, but that could have a significant impact on their lives if they knew it. I told it to include any relevant data, and to show me any analysis that it thought would be helpful.

What it came back with was, unfortunately, a list of things that I've been hearing about for years. The kinds of things you tend to read in news articles or opinion pieces. "Compounding interest is a powerful force, particularly for the young saver." or "A good night's sleep is underrated for its impact on your health." I was underwhelmed by the advice. What it was able to do, though, was show me graphs and analysis that nicely displayed the data supporting the advice. It was then that I started to wonder what kind of data I could give it in order to tailor the advice to me. The code interpreter allows you to upload files so that ChatGPT can make use of them. It turns out... I don't have very much data on myself.

Oh, I can think of all kinds of data that I produce that gets shipped off to this or that service. I carry around a swarm of sensors with me throughout the day, all happily buzzing along, collecting every type of data they can, but all of it is shipped tidily off to some service that I have no access to. It seems crazy to me that I do not at least have a copy of my own data! So that is what I am currently thinking about: how do I get my own data so that I can do analysis on it and maybe do some good for myself?

The first thing I thought about was the Sharecast I do. I use the Marco Polo app, which has a really cool feature that lets you create a Sharecast just for the people you invite. Like a private YouTube channel. Of course, I talk about AI and anything related to it. So I wrote a little Python tool that will transcribe the audio for me. I recently finished running the hours of talking I've done through it, and now I have a text file with everything I've said, ready to do... some kind of analysis. I'm not sure what yet. I'm still thinking about it.

I've heard of people giving data to ChatGPT and asking "What trends do you see in this data? Show me an analysis and an explanation of what you find." Who knows? Maybe ChatGPT will surprise me with what it finds. In the meantime, I've made my tool available to anyone who wants to try it. It uses OpenAI's Whisper API to do the transcribing. Right now, it only converts .mp4s because that's what I work with, but it can be easily modified to work with other formats. I've given it the most adorabugly little UI that I'm rather fond of. We do love our own children, don't we? I hope someone, maybe you, finds it useful. For now, I'm off to think about data. I'll let you know what I come up with.

Scribe: .mp4 transcription using OpenAI's Whisper API
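
For anyone curious what a tool like this might look like inside, here is a minimal sketch. It is not Scribe's actual code, just one way the idea could be put together, and it skips the UI entirely. The function names (`find_mp4s`, `transcribe_file`, `transcribe_folder`, `main`) and the `videos` and `transcript.txt` paths are made up for illustration. It assumes the official `openai` Python package (version 1 or later) and an `OPENAI_API_KEY` environment variable, and it sends each .mp4 to the API as-is, which I believe only works for files under the API's upload size limit.

```python
from pathlib import Path


def find_mp4s(folder):
    """Return the .mp4 files directly inside `folder`, sorted by path."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".mp4")


def transcribe_file(client, path, model="whisper-1"):
    """Send one file to the transcription endpoint and return its text."""
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(model=model, file=audio)
    return result.text


def transcribe_folder(client, folder, out_path):
    """Transcribe every .mp4 in `folder` into a single text file.

    Each transcript is written under a heading with the source file name.
    Returns the number of files transcribed.
    """
    files = find_mp4s(folder)
    with open(out_path, "w", encoding="utf-8") as out:
        for path in files:
            text = transcribe_file(client, path)
            out.write(f"## {path.name}\n{text}\n\n")
    return len(files)


def main():
    # Imported here so the helpers above can be used without the package.
    from openai import OpenAI  # assumption: `pip install openai`, v1 or later

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    count = transcribe_folder(client, "videos", "transcript.txt")  # hypothetical paths
    print(f"Transcribed {count} file(s)")
```

Calling `main()` would transcribe everything in a `videos` folder into `transcript.txt`. Passing the client in as an argument, rather than creating it inside each function, is a deliberate choice in this sketch: it makes the helpers easy to try out with a stand-in client before spending any API credits.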