- We don’t really understand how GPT does what it does. By “we”, Wolfram means humanity. And that’s because we don’t fully understand how neural nets work. They seem to work like human brains, but as everyone knows, science knows little about how brains work.
- Training neural nets is more art (or craft) than science: lots of educated guesses and empirical observation. Wolfram uses the phrase “neural net lore” many times.
- After a certain point, training with more examples degrades performance.
- Training on text does not require tagging like it does with many image AI projects.
- Bigger problems can, ironically, be easier for them to solve than smaller ones, like generating a paragraph rather than a single sentence.
- We’re blown away by ChatGPT’s (and Midjourney’s) performance, so we conclude computers are vastly more powerful than we thought. But the reality is that these are easier problems than we thought.
- ChatGPT is feed-forward, so it’s only determining the next tokens (word fragments) to produce. The only thing resembling a loop is that when you respond to what it says, it rereads the entire conversation, including what it wrote, to determine which tokens come next (its response).
- GPT-3 has 175B connections/weights in its neural net, so it has to do on the order of 175B calculations per token produced, and a token may not even be a whole word. This is why longer responses have such lag.
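The token-at-a-time loop described above can be sketched in a few lines. This is purely illustrative: `next_token_distribution` is an invented stand-in for the real network, which would do its billions of weight multiplications at that step, and the whole token history is re-read on every pass.

```python
import random

VOCAB = ["Hello", ",", " world", "!", "<eos>"]

def next_token_distribution(tokens):
    # Hypothetical stand-in for the model: a real LLM would run a full
    # feed-forward pass over ALL tokens so far (~175B multiply-adds for GPT-3)
    # just to score the next token. We return a flat distribution instead.
    return [1.0] * len(VOCAB)

def generate(prompt_tokens, max_new_tokens=5):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # One expensive forward pass per token produced -- hence the lag
        # on long responses.
        probs = next_token_distribution(tokens)
        token = random.choices(VOCAB, weights=probs)[0]
        if token == "<eos>":
            break
        tokens.append(token)
    return tokens
```

Note there is no inner loop or planning step: each token is emitted once, and the only “memory” is the growing token list fed back in on the next pass.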
Like many, I was impressed with ChatGPT and Copilot, but not blown away. However, my brilliant friend Jim White was, and he’s far more qualified to judge, having studied this for decades. Some things he shared:
- Neural nets can be trained to use tools. Researchers have gotten them to use a calculator, Wikipedia, a Q&A site, search engines, a translator, and a calendar. And ChatGPT has added plugins for this, including Wolfram Alpha. This should cut down on “hallucinations”.
- Prompt writing is a skill that must be learned. If you’re not impressed, you probably have spent very little time learning it. And ironically, you can ask GPT for help with this.
- ChatGPT already beats humans in several things, like annotating text. This means for more specialized AI that does require tagging, you can get the AIs to collaborate and train each other. We’ve already seen this (or accusations of AIs “stealing” knowledge from other models).
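The tool-use idea above boils down to a simple protocol: the model emits a structured “call”, the host runs the tool, and the result is spliced back into the prompt before the model continues. Here is a minimal sketch under invented names (`fake_model`, the `CALL` convention) — real plugin protocols differ, but the loop is the same shape.

```python
# A canned stand-in for an LLM: it "decides" to call the calculator when it
# sees arithmetic it hasn't been given an answer for yet.
def fake_model(prompt):
    if "23 * 19" in prompt and "RESULT:" not in prompt:
        return "CALL calculator: 23 * 19"
    return "The answer is 437."

# The host-side tool registry. A real system would sandbox this, not eval().
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def run(prompt):
    while True:
        reply = fake_model(prompt)
        if reply.startswith("CALL "):
            name, _, args = reply[len("CALL "):].partition(": ")
            # Feed the tool's output back in, grounding the next generation.
            prompt += f"\nRESULT: {TOOLS[name](args)}"
        else:
            return reply
```

The point is that the “loop” lives in the host program, not the net: the model stays feed-forward, and the tool result just becomes more context on the next pass.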
This is important because OpenAI CEO Sam Altman has already said GPT-5 is a long way off. He doesn’t see bigger models bringing better performance, and I suppose GPT-4 costing more than $100M to train doesn’t put them in a mood to spend more. The advances will have to come from other ideas.
Regarding coding tools like Copilot and Amazon CodeWhisperer, I generally hear good things. However, they don’t take any feedback from the IDE, so they generate broken code, and I’m not sure this can be fixed. For one, given the feed-forward process, it could be computationally expensive, with many iterations (“do it again, but don’t use class DoesNotExist, these methods…”). GPT has been poor at this so far; it kind of loses track of the conversation. The IDE could provide more context to start with, as it has the full dependencies. However, when you’re adding new code, you’re going to need to add dependencies to your build file, more imports, etc. Same as when you cut and paste from Stack Overflow.
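The “feed the IDE’s feedback back in” idea could look something like the sketch below. Everything here is hypothetical: `llm_complete` and `compile_check` are invented stand-ins, not a real Copilot or IDE API. It also makes the cost concern concrete — every repair round is another full generation pass.

```python
def repair_loop(task, llm_complete, compile_check, max_rounds=3):
    """Regenerate code until the IDE's diagnostics come back clean."""
    prompt = task
    code = llm_complete(prompt)
    for _ in range(max_rounds):
        errors = compile_check(code)  # e.g. "class DoesNotExist not found"
        if not errors:
            return code
        # Append the failed attempt and its diagnostics, then try again.
        # Each round re-reads the whole (growing) prompt -- this is the
        # computational expense mentioned above.
        prompt += (f"\n# Previous attempt:\n{code}"
                   f"\n# Errors: {errors}\n# Fix and try again.")
        code = llm_complete(prompt)
    return code  # best effort after max_rounds
```

Whether a model can actually keep track of a prompt that keeps growing like this is exactly the “loses track of the conversation” problem above.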
Better (or more) training/weights could help. If you look at what the community has accomplished with Stable Diffusion by trading models, it’s quite impressive. I had thought Midjourney was far ahead, but a recent discussion on Hacker News suggested that was an antiquated view (from two months ago). The community’s new tools, models, etc. allow much better control.
My hope is we’ll get something similar out of the open-source LLaMA. Or maybe CodeWhisperer, which already has extra training on AWS APIs, will allow you to create a model weighted on your exact stack. It’s a hard problem, and maybe there will be better ways.
In the meantime, comments from some legendary programmers like Antirez suggest it’s already quite good for whipping up quick-and-dirty code in languages you aren’t familiar with. Think build files, bash scripts, Jenkinsfiles, etc. This could up the game of full-stack programmers, as just about everyone I talk to has a strong preference for (and experience with) one side of the stack.
What I am most curious about is if we can create new languages and frameworks that are designed to be correctly predicted. We’ve already heard talk that AI is going to kill low code platforms. But maybe low code platforms evolve to be even more productive with AI. Surely they are scrambling to figure that out.
I believe two companies are so well vertically integrated that they have the most potential: Microsoft and Amazon. Microsoft is the clear frontrunner. They are already investors in OpenAI and have GitHub Copilot. But they also own the entire .NET ecosystem and Azure. I can imagine an AI-powered, cloud-native platform based on TypeScript everywhere.
Amazon will want to compete and has already started with CodeWhisperer and Cloud9. However, the overall AWS ecosystem has a steep learning curve. Their attempts at low code with Amplify have not been well received by the larger community. But I think they started with the goal of making software development easier for AWS admins and devs building highly scalable systems. Google had similar struggles with App Engine. The majority needs a low learning curve and apps for maybe hundreds of thousands of users, not billions.
It’s early enough to be anyone’s game. Maybe Meta, shifting from the metaverse to AI, will win it all with a PHP stack?