Top-tier LLMs, Rust and Erlang NIFs; nifty, and night and day vs. C, but let me tell you about vibe coding…
After I submitted my blog post to Hacker News on using Grok 3 to generate C for an Erlang NIF with the help of code reviews by GPT-5 and Gemini 2.5 Flash, I received some interesting comments. Let’s review them before we dive deep into why vibe coding leaves me with a very sour taste, and why I will never engage in vibe coding again, if I can avoid it.
It was all unnecessary
“So all this arose because you didn’t read the docs and note that get_disk_info/1 immediately fetches the data when called? The every-30-minutes-by-default checks are for generating “disk usage is high” event conditions.” – juped
As it turns out, that’s correct. In my sleep-deprived state of mind, I didn’t actually read the documentation of :disksup.get_disk_info/1, which clearly states that the data the function returns is immediate, i.e. it returns fresh data on the state of the disk space there and then. It has nothing to do with the periodic interval configuration of disksup.
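For the record, a quick check in iex would have settled it. A minimal sketch (assuming OTP 26 or newer, where get_disk_info/1 exists; the exact numbers obviously depend on the machine):

# :disksup lives in the :os_mon application, so make sure it is running first.
{:ok, _apps} = Application.ensure_all_started(:os_mon)

# Returns fresh data immediately, e.g. [{~c"/", total_kib, available_kib, capacity_percent}],
# regardless of disksup's periodic check interval.
:disksup.get_disk_info(~c"/")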
Oh well.
I should have used Claude Code (?)
“This was built copy pasting results from chats? Not using an ide or cli like Claude Code or Amp? Why such a manual process. This isn’t 2023…” – wordofx
I hear you. All the cool kids are using Claude Code or IDE extensions to have LLMs interact directly with the code, and/or to let LLMs run amok on a codebase, and apparently also their PC.
First of all, would Claude Code be better than Grok 4 at writing C code, and/or better than Gemini 2.5 Flash and GPT-5 at reviewing C code and Makefiles?
Maybe; I don’t know, because I’ve never used Claude Code. As far as I know, it’s a paid-only service and even though there are ways to get started with some credit, I shudder at the idea of handing the reins to an LLM for coding. Even if it’s “the best”, how much better can it be, given that (so far) none of the top-tier LLMs stands out from the rest? What difference would it make, given that the C code that was generated was pretty terrible? More on that later.
Secondly, the later iterations of “The Process” described in yesterday’s post included feeding Grok, Gemini and GPT the errors of the automated build-and-test attempts on GitHub Actions, so that I could get the C source to compile on macOS and Windows. How would Claude Code have helped with that? In that case a human (me) in the loop was necessary to shovel errors from the build logs to the LLMs and have them generate recommendations that I then shoveled back to Grok 3 for code fixes. Most of those fixes actually made things worse, until they didn’t.
Lastly, the manual process is tiresome and annoying after the first 5-6 iterations (more on that below). It does have one upside though: you get to see how the sausage is (badly) made. Otherwise, just YOLO it and put your trust in GenAI completely–if, of course, you find a way (that may exist, I don’t care enough to look into it) to also automate the “shoveling” of build logs from GitHub Actions to the LLMs. By the time you get such a complex setup working, it might have been a better investment of time to learn Rust.
I could/should have used better LLMs
“It’s interesting why the author used weaker models (like Grok 3 when 4 is available, and Gemini 2.5 Flash when Pro is), since the difference in coding quality between these models is significant, and results could be much better.” – SweetSoftPillow
Yeah, it’s true. I pay for SuperGrok, but given how slow Grok 4 used to be right after it launched, I had set Grok 3 as the default model. I also didn’t want to exhaust the free (daily?) allotment of Gemini 2.5 Pro tokens for that exercise. And I ran out of free GPT-5 tokens shortly after I reached the state of v0.4.0 that compiled cleanly across the target OSs and Elixir/Erlang version combinations.
This comment nagged me though; could I have really gotten much better code from Grok 4, and better code reviews from Gemini 2.5 Pro?
In the end, I found out that the answer is yes. More on that below.
C and LLMs are a bad match
“I would never ever let an LLM anywhere near C code. If you need help from LLM to write a NIF that performs basic C calls to the OS, you probably can’t check if it’s safe. I mean, it needs at least to pass valgrind.” – drumnerd
Here I also have to concur. That was one of the earliest comments on my Hacker News submission and made me try out some tools to analyze the disk_space.c file of version 0.4.0. I used splint, and the results were not pretty. Examples:
disk_space.c:111:14: Operands of == have incompatible types (unsigned char,
int): (data[i] & 0xE0) == 0xC0
disk_space.c:144:16: Only storage bin.ref_bin (type void *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:148:16: Only storage bin.data (type unsigned char *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:148:16: Only storage bin.ref_bin (type void *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:152:11: Null storage returned as non-null: NULL
disk_space.c:152:16: Only storage bin.data (type unsigned char *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:152:16: Only storage bin.ref_bin (type void *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:156:15: Only storage bin.data (type unsigned char *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:156:15: Only storage bin.ref_bin (type void *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:160:7: Operand of ! is non-boolean (int): !enif_is_list(env, term)
disk_space.c:161:10: Null storage returned as non-null: NULL
disk_space.c:161:15: Only storage bin.data (type unsigned char *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:161:15: Only storage bin.ref_bin (type void *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:166:10: Null storage returned as non-null: NULL
disk_space.c:166:15: Only storage bin.data (type unsigned char *) derived from
variable declared in this scope is not released (memory leak)
disk_space.c:166:15: Only storage bin.ref_bin (type void *) derived from
variable declared in this scope is not released (memory leak)
Ouch. That made the next comment have more “bite” to it. I started thinking that releasing this had been a bad idea; positively negligent on my part.
It didn’t actually work
"‘it mostly worked’ is just a more nuanced way of saying ‘it didn’t work’. Apparently the author did eventually get something working, but it is false to say that the LLMs produced a working project." – flax
What is the definition of “a working project”? I don’t know what is meant by that. It’s obviously working, i.e. from within Elixir I get the map I want, with the data I intended to get.
When I added “it mostly worked” to the title, my intended meaning was “yeah it worked, but I had to do a lot of hand-holding” (in terms of using LLMs to generate C code) and “it passes the tests but I cannot tell if the code has memory leaks” (in terms of the generated C code quality).
But clearly, it didn’t actually work, when you consider that the generated C code does more than I wanted it to: memory leaks.
That made me pretty sure that releasing vibe-coded C code to the wild had been a terrible idea–especially given that the whole exercise had been unnecessary to begin with.
I should have used Rust or Zig
“Why C instead of Rust or Zig? Rustler and Zigler exist. I feel like a Vibecoded NIF in C is the absolute last thing I would want to expose the BEAM to.” – weatherlight
Well, I had to start somewhere, and that was the familiar Exqlite and its use of elixir_make and a Makefile to compile SQLite’s amalgamated C source. I had also never programmed in Rust or Zig, though I had been reading the Tigerbeetle articles as part of my (abandoned) tbapi REST API for it in Go many months ago, and I was seriously impressed by Zig.
As for Rust, I only know it as an (in-)famously difficult programming language with a very opinionated fanbase. However, since Rust is now part of the Linux kernel, it must have some merit.
Yeah yeah, OK “rustaceans”, I have heard all about its memory safety, if you use it correctly, so don’t come at me with your rusty pincers…
Grok 4 code, aided by GPT-5 and Gemini 2.5 Pro code and error reviews
Given all of the above, I thought I’d dedicate at most half a day to the next bout of this experiment: I’d have Grok 4 convert the C source into a lib.rs, then use Gemini 2.5 Pro and today’s GPT-5 free credits to review Grok 4’s code, supplemented by any (many) errors of the build-and-test workflow from GitHub Actions.

And that’s what I did. Grok 4 was capable of converting the C code into Rust code. As for whether it would compile and work for the disk_space Elixir project: that was the next piece of the puzzle.
Figuring out Rustler
I looked online to see how to use Rustler to compile and load the NIF into the DiskSpace Elixir module. There is a lot of outdated information in various articles and blogs about using Rustler, in terms of what you need to add to your Elixir project’s mix.exs. In particular, the line indicated below appears in many articles:
def project do
  [app: :my_app,
   version: "0.1.0",
   compilers: [:rustler] ++ Mix.compilers, # this one
   rustler_crates: rustler_crates(),
   deps: deps()]
end

defp rustler_crates do
  [io: [
    path: "native/io",
    mode: (if Mix.env == :prod, do: :release, else: :debug),
  ]]
end
This is how it used to work in older releases of Rustler. In the latest one, I was getting errors about a missing compile.rustler Mix task.

I looked into the documentation of the installed dependency’s version and saw that it’s much simpler nowadays. The standard example of NifIo also helped. I invoked mix rustler.new and then had to adapt the native/Cargo.toml file to also make lib.rs compile on Windows. Grok 4 did that in a single shot, but whether it actually worked (both the Rust code and the Cargo.toml) was still to be determined.
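For comparison, here is roughly what the modern setup boils down to: declare the crate in the NIF module itself and add the rustler dependency, with no custom compilers entry in mix.exs. This is a minimal sketch; the crate name, the function name and the version constraint are assumptions, not the exact contents of the repo:

# In mix.exs, only the dependency is needed nowadays, e.g.:
#   {:rustler, "~> 0.34", runtime: false}

defmodule DiskSpace do
  # Compiles and loads the Rust crate under native/disk_space at build time.
  use Rustler, otp_app: :disk_space, crate: "disk_space"

  # NIF stub, replaced by the Rust implementation when the NIF is loaded.
  def stat(_path), do: :erlang.nif_error(:nif_not_loaded)
end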
Right first time–but not on Windows
mix compile worked the first time around, with no compilation warnings even! mix test proved that the NIF also worked from within Elixir. Impressive!
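To give an idea of what “worked from within Elixir” means, the passing test looks roughly like this. The function name and map keys are illustrative; the real assertions live in the repo’s test suite:

defmodule DiskSpaceTest do
  use ExUnit.Case

  test "returns disk stats for an existing path" do
    # Hypothetical API: a map with byte counts for the given mount point.
    assert {:ok, %{total: total, available: available}} = DiskSpace.stat("/")
    assert total > 0 and available <= total
  end
end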
The situation with macOS and Windows was still unclear. I had Grok 4 rewrite the build.yml file and I saw how much shorter and simpler it became, since it did not include installing choco and then MinGW/gcc anymore. One preparation step for compile-time dependencies for all three target OSs. Neat!

Pushing to the git repo triggered the Build and Test workflow and it compiled on macOS/arm64 and Ubuntu/amd64 for all combinations of Elixir and Erlang that the old C source supported. I was impressed again.
I was not as impressed with the Windows build attempts.
Vibe coding once again
I started “The Process” again, this time only for the Windows-specific parts of lib.rs. The initial list of errors and warnings of the Windows builds was truly a sheet. Tons of them, for both Elixir 1.17 and 1.18 (and the corresponding Erlang/OTP 26 and 27).
Grok 4 was capable of fixing the mistakes it had made in its first version, but after a while got stuck. It would fix issues with GetDiskFreeSpaceExW in the code, but forget to set the correct imports. It would introduce something called “HLOCAL”, then attempt to get it to work within the make_winapi_error_tuple function again and again. It would fix one thing and break another.
Again, I had Gemini (this time, 2.5 Pro) and GPT-5 (as yesterday) review Grok 4’s Rust code and identify errors, without giving me code. I would shovel this feedback to Grok 4, ask it to implement these recommendations, push to GitHub, open all workflow tasks in separate tabs (otherwise GitHub could not show me the logs after the fact, for unknown reasons), then copy/paste the errors back to Gemini and GPT or Grok (depending on how serious the errors seemed), ad nauseam.
And I mean “ad nauseam” quite literally. It was nauseatingly tiresome, and disillusionment almost made me give up a few times. You only need to look at the GitHub Actions log to see the many failed attempts. All that took a few hours until finally Grok 4 reached a point at which all builds completed successfully.
Vibe coding for the last mile(s)
On that first “basecamp”, the Rust code would build and the Elixir package would pass the tests successfully across all Elixir versions from 1.14 to 1.18 (and with the corresponding latest-supported Erlang/OTP version), which was a major boon to the whole Rust port performed by the three LLMs in collaboration.
That first basecamp still had issues that Gemini and GPT identified, so I continued the code reviews by the two LLMs, this time asking Gemini and GPT “could this be considered production-grade?” and prompting Grok to “ensure that any changes you make DO NOT IN ANY CASE break the macOS and Linux builds”. I might have called Grok an idiot in there a few times.
On the way to what’s now version 1.0.0 of the DiskSpace Elixir package, I reached 4 more such “basecamps”, each of which was followed by a series of abject failures by Grok at fixing things without breaking other things, thankfully leaving the macOS and Linux builds unscathed, until the next basecamp.
I had kinda-timeboxed this second experiment to a few hours so that I would return to writing the book (which will now use disksup anyway), so once GPT and Gemini reached a point of recommending improvements that they considered minor, I considered their job done.
By that time, I had realized something. Vibe coding increasingly felt like something I had experienced (and escaped from, and avoided for evermore) before.
I really dislike vibe coding
After I started working as a development engineer for turbocharger components in 2008, I very quickly grew disenchanted with the role. Here’s what the typical trial-and-error design process involved:
1. Taking an existing design of a shaft, a turbine blade, a turbocharger casing, etc.
2. Simulating it with CFD and/or FEA to understand how it performs.
3. Spending time coming up with changes to design parameters.
4. Adapting the 3D CAD drawing.
5. Converting the design to geometry input files.
6. Preparing and meshing the simulation model (load cases, boundary conditions, etc.)
7. Sometimes, adapting the simulation setup (simulation parameters, etc.)
8. Running the simulation locally on the workstation (if expected to be short) or submitting the simulation job to an HPC cluster.
9. Waiting for anything from 15 minutes (for short FEA), to 2-4 hours (for more complex FEA), to 18 (!) hours (for high-fidelity CFD).
10. Reviewing the simulation logs for any simulation-run issues.
11. If issues prevented the calculation from completing successfully, going back to step 7 (ideally) or as far back as step 3 (worst case).
12. If no issues with the simulation, investigating the resulting multi-gigabyte files in the FEA/CFD suite’s visualization software.
13. Post-processing the files to generate various images and graphs.
14. Evaluating performance against the best design variant so far.
15. Spending time coming up with changes to design parameters (go back to step 3), or going to step 16.
16. Reviewing the entire process and the design and documenting it.
That was a slow process, with the human entirely in the loop and completely at the mercy of the key bottlenecks, which were:
- The manual preparation of input files with a lot of copy-pasting of text between Notepad++ (what a great piece of software!) and pre-processing software.
- The manual preparation or tweaking of the simulation setup.
- The manual review of simulation logs.
- The manual post-processing and evaluation of the simulation results.
- Most importantly, the ungodly amount of waiting time to get simulation results.
In my first role as a development engineer I was not designing turbine blades, and so I wasn’t subject to the torturous waiting time of CFD simulations. One week after moving to the turbine design team, I remember being so frustrated with the slow pace of the process that I told my supervisor that this cannot be how work gets done. He told me that this is what the work is, and that I would get faster as I learned more about turbine design.
He wasn’t wrong, but I was still pissed off, to say the least. Surely, this could not have been something that I had studied so hard for!
Not quite vibe coding
That short chat with my supervisor happened on a Friday. As soon as I boarded the tram from Baden to Zurich, I took out a notebook and wrote down the requirements for a semi-automated process, and what kind of Python (Python 2 at the time) code I’d need, in order to automate some parts of the process.
The idea was to automate some of the most boring and error-prone parts of the process, and to get faster overall by vetting designs with insanely faster low-fidelity evaluations before committing to a double-digit number of hours of waiting for the “proper” results. I would evaluate hundreds or thousands more designs well enough to better understand what was going on, avoid spending expensive simulation time on variants that, even with simplified simulation physics, had no chance of approaching the targeted spec improvements, and thus weed out the useless variants. What remained would then be worthy of higher-fidelity simulations for better insights.
I entered the office on Monday at 7:00 and by 12:30 I had a working prototype of a Particle Swarm Optimization loop in Python. It was crappy, hastily and enthusiastically written Python code that orchestrated a few other tools, such as:
- writing input files to a quasi-3D radial-turbine flow calculation for mass flow, efficiency and Mach number along the blade’s profile at different meridional heights,
- triggering multiple calculations in parallel with Python’s multiprocessing across the (IIRC) 4 or 8 cores of the Xeon workstation,
- gathering results from each calculation and aggregating them, determining the best one from that generation,
- feeding the last generation’s data into the next iteration of the PSO loop.
My gets-the-job-done visualization was an Excel spreadsheet that automatically refreshed the geometry data and the results of the “global best” simulation thus far from a text file, and plotted them.
It worked so well that I spent the 3.5 years after that automating the heck out of everything I touched. I had found a happy place: programming, engineering, problem-solving, understanding, documenting, improving–day in and day out. I lived and breathed engineering optimization; so much so that a colleague started calling me “Mister Optimizer” in a Swiss-German English accent.
Whatever I was tasked with designing, I would first set up lower-fidelity simulations, run parametric studies either locally or on the cluster, harvest the results and evaluate them in a semi-automated manner, select the best variants, etc.
Eventually, I reached the point where I had too much data to evaluate with heuristics or by eye/Excel, so I got into Machine Learning, building ANNs with libFANN, training them with early-stopping code I implemented in Python following a paper, building and querying regression models with Weka, clustering with k-means to identify “families” of designs, performing PCA to find out the most impactful design parameters, executing Monte Carlo simulations with millions of data points for probabilistic design, and much more.
This kind of work delivered great designs, and even made it possible to identify hitherto-unseen design regions of multi-physics turbine-blade design. It brings me great joy to know that some of “my” turbines are still spinning in engines’ turbochargers worldwide.
Had I not engaged in this semi-automation of my work while always being in the loop for the most knowledge-rich parts, I would have continued to be subject to the insanely slow speed of each iteration, severely limiting the rate at which I would be learning from the things I was developing. To be frank, I would have resigned and gone to do something more creative and intellectually demanding than shoveling data back and forth between pieces of software.
The equivalent of vibe coding in the world of mechanical engineering
At the time, a colleague in a different team of the R&D department had been the flag-bearer pushing for the “fully-automated” optimization suite of one of the largest simulation software vendors.
It all looked so enticing! You set up the optimization case, selected the optimization algorithm, and then it was supposedly “pushbutton optimization” from there on, while the optimization suite orchestrated the simulation software to perform automatic meshing and simulation preparation, and then ran numerous simulations consuming per-core licenses for hours or days (coincidentally very beneficial for the simulation software vendor). You just had to lean back and enjoy the pretty graphs. Why spend precious budget on engineers engineering things, when you can let the computer do everything automatically?
Does this remind you of a current situation by any chance?
Well, I can tell you that this did not go very far, because removing the human from the loop entirely in that manner turns a trained engineer into a machine operator; without agency, without understanding what is really going on, without even the possibility of understanding why some optimization algorithms are a bad fit for the task at hand, or why a multi-fidelity optimization strategy (at the time, not supported by the optimization suite) would make things better across the board. And forget about Machine Learning being part of any such automated software; no, this was an optimization suite by a simulation software vendor. Machine Learning? LOL.
Calculate more, you peasants!
Same old promises and worries, in different clothes
Vibe coding gives me a mix of the same vibes as both the old process I had been subjected to and the fully-automated approach.
At least when running through the old process you would gradually upskill; you would have to be either stupid (unlikely) or wilfully uninterested in learning to not gradually start seeing patterns between design parameters and decisions and the impact they have on the outcome of the design/development process.
“True” vibe coding, like the one espoused by those letting a Claude Code or some other “agentic” system run amok on a software-development problem, gives me the exact same vibes as the fully-automated engineering optimization approach. Only now it’s not simulation hours or simulation licenses, but tokens.
Spend more tokens, you peasants!
So I’ve seen this play out before; in the same way that you become a mere machine operator of a “pushbutton optimization suite” and a useful idiot for the software vendor (who sells the shovels in the gold rush) to utilize more and more of the licenses the company has paid for, when you hand the reins entirely to an agentic system and lean back and look at the beautiful results, you become a mere machine operator for a “pushbutton software development suite” and a useful idiot for the supply chain that spans from the agentic system’s vendor to the LLM vendor and all the way to Nvidia supplying the growing global thirst for compute capacity.
In case you are one of those people who are worried about the role of the software engineer becoming obsolete, you might have noticed that mechanical engineers still haven’t become obsolete, and probably won’t; and not due to a lack of affinity between software automation and engineering work. Engineering work changed to an insane degree between the 1990s and the 2000s. It has kept changing all throughout the last 20 years.
Engineers (at least, the better ones) kept evolving with it. Some branched out towards systems engineering, looking at the whole picture while having a deep understanding of the details. Others transitioned towards managing engineers, a job that includes bringing juniors up to speed. Yet others moved on to Machine Learning, which is seriously one of the best things you can do as a mechanical engineer; chemical engineers already had a better grasp of EDA and Data Mining, for example, and many found ML a natural expansion of their skills. Some others (like me) branched out to business and to software, and kept their engineering ethos alive.
My money is not on software engineers becoming obsolete. Software engineers, like every other discipline of engineer, will have to adapt to new technologies and new conditions; new expectations too! Much like, for the past 25 years, you could not be expected not to know how to use a piece of 3D CAD software competently, effectively and efficiently (as one trivial example), so it will at some point be unheard of not to know how to use LLMs, or whatever technology arrives over the next years, competently, effectively, and efficiently.
Why should the discipline and the job of a software engineer be any different? So don’t get all riled up by the current rhetoric espoused by those with the self-interest of convincing you that your only professional outlook is to become obsolete.
It’s not the only outlook; it’s one possible outcome though, which becomes more likely the more you buy into the hype and divert attention from evergreen values.
Is vibe coding engineering?
You may call vibe coding whatever you like, so that it gives less Gen-Z. Call it “automated code-generation”. Call it “LLM-assisted coding”. Call it “George”.
As long as the activity is dominated by pursuing full automation and “pushbutton solutions”, engineering it isn’t. And I by no means proclaim that engineering is the end-all be-all. Engineering is an expensive process that not all problems are worthy of, and the iron triangle of something that must/could/should be developed must include the cost of engineering, which depends on how you go about tackling the problem to be solved and accomplishing the task at hand.
There is also a flip-side to the cost; a side-effect if you will, if you like functional programming: the benefit of the engineering process is that you increase your skill in understanding anything from the physics and constraints that dominate turbine performance (e.g., mechanics and fluid mechanics, manufacturing and metal-casting tolerances) to the “physics” and constraints that dominate the performance and quality of the software you develop, including architecture, architectural fit, maintainability, business risks, and most importantly: whether it fulfills the requirements posed on it by the system the component (mechanical or electrical or electronic or software, etc.) is a part of.
And, regardless of whether we are talking about turbines, radars, an API or a SaaS or anything else, part of the cost of engineering includes understanding better and better what it is to develop, and this always requires talking to others; hashing out requirements, specifications and designs; iterating on concepts; trying out new things; seeing if they work, understanding if they “stick” and why, and especially why not.
The Incredible Story of Deft… for AI
If you think that an LLM will be able to give you a reliable verdict on a requirement, a spec, a user story or a system architecture just because you managed to find “the perfect prompt”, go out and touch grass.
Talk to people, be they colleagues, customers or other stakeholders, and see how ambiguous and contradictory the things you’ll hear are. As an engineer (or a product manager thinking as an engineer) it’s your job to figure things out, address uncertainty and ambiguity, and make decisions regardless of the noise inherent in anything involving human beings. Even though LLMs can give you a quick “lay of the land” to understand the various dimensions of a problem or a domain, especially if you know little about it, as long as they are not tapped into human brains to get feedback, this is a pure pipe-dream.
The world of software seems eternally a victim of its own fast-moving, explosive nature in the past decades and its own lack of awareness of industrial history (or of how things take place in industries other than software). Add to that the frothy valuations and insane hype of Venture Capital firms betting on the casino of startups to justify said valuations in the hope of an “exit” or passing off a hot potato to some other sucker before it explodes, and you have a potent mix of hyperbole, fad-chasing, misplaced expectations, and ultimately disillusionment.
I have a hunch that the software industry is repeating not only the same mistakes that other industries have learned painful and valuable lessons from, but is even cyclically repeating the same patterns that it itself went through earlier. Big fat promises of increased productivity; faster, cheaper, better products at lower cost. Less experienced developers pushing buttons without understanding, or even having the chance to gradually understand more of, what they are actually doing, because it’s all being taken care of by some “pushbutton” system that does the heavy lifting for them.
And don’t get this confused with a false analogy of “but you are using 3D CAD without ever having programmed a computational geometry kernel”. The vast majority of mechanical engineers, as one example, may not have written such a complex piece of software (a task better left to mathematicians or mathematically-inclined engineers, by the way), but I can assure you that most have taken lectures on numerical computation, linear algebra, materials science, thermodynamics and fluid dynamics, and have perhaps even written rudimentary FEA code, to be able to understand what the software vendors’ big impressive piece of simulation software is doing behind the scenes with high computational efficiency.
But how many of the pushbutton vibe-coders nowadays understand what’s happening, what’s really going on behind the scenes?
It’s a meme by now: the clueless enthusiastic beginner who posts “I can code anything with Claude Code” on X, Bluesky or LinkedIn–and then gets wrecked because of not understanding the fundamentals, such as authentication, securing a server, API keys and keeping them secret, etc., and instead relies on a myriad of subscriptions to the SaaS of startups that deliver “pushbutton experiences” and the illusion of competence–and a hefty monthly bill, be it $20 per month per developer at Vercel for the privilege of deploying a NextJS app (trivial with npm and pm2 on a VPS), or some managed database vendor to deploy PostgreSQL (trivial, except if you want to do clustering, which as an enthusiastic beginner you probably don’t even know what it is), or any of the agentic code-generation tools by vendors who will happily sell you that experience and illusion of “care-free deployments” and other marketing bromides that get your brain to switch off.
The two poles
It all sounds like a curmudgeonly rant to some, I know. But look at the polarized state of opinions about what is called “AI”. On the one side, we have:
- Enthusiastic beginners who think they found The Philosopher’s Stone. They don’t even know what they need to know. Dangerous, but primarily only to their own potential of developing skills and competencies to make a living–and to their own wallet.
- Bombastic executives who want to please shareholders with their visionary “AI strategy” (that a consulting firm will happily bill them for, to deliver boilerplate slides and executive absolution) and proclaim that this will revolutionize their company or industry; exactly like what they said about the last fad-du-jour, like Agile, Lean, IoT, Industry 4.0, Cloud, no-code, low-code, blockchain, etc.
- Middle managers who want to display that they are something more than an agency-less corporate drone and do exactly that: they drone on and on on Linkedin about things they most likely don’t understand; will never understand, except if they have spent or decide to invest years “in the trenches” getting acquainted with the “ground truth” of what any new “miracle cure” can actually do, and under which conditions.
- Individual contributors who jump on the latest shiny new object to add something new to their CV, and go negotiate a new position elsewhere, loaded with certificates, GitHub repos, etc. Fine–don’t hate the player, hate the game; but if you are stuck in tutorial hell and can never stick to something (a language, a framework, a domain) for longer than a few years, understand that you are jeopardizing your own overall competency, as well as your resilience outside of an organization that might happen to want X years of this or Y years of that. At least play the game in the long run.
- Consultants and coaches who somehow manage to change their expertise (and LinkedIn headline) insanely fast with every new wave of hyperbole; wow, these people are incredible–for example, ChatGPT launched end of 2022, and by the time they were back in their office after New Year’s, they managed to build enough understanding, expertise even, to rebrand themselves as “AI expert”. Amazing how inflation (of anything) literally reduces its value. Groundbreaking insight.
- Those software vendors (and their suppliers) profiting from all of the above.
Been there, seen that; see also: The Incredible Story of Deft. On the other side we have everyone else. Who is that?
- Anyone who has seen waves of hype and big promises come and go every 3 to 7 years.
- Anyone who understands the hype cycle and its links to human nature and to agency issues within and across organizations.
- Anyone who is enthusiastic about new technologies, but takes vendors’ and consultants’ promises with more than a few grains of salt.
- Anyone who understands that LLMs are tools, and tools are not universally applicable.
- Anyone who doesn’t want to become a mindless button-pushing drone, i.e. the exact kind of “worker” that historically, again and again, commoditizes themselves out of a living by chasing fads, neglecting to build competence, settling into the comfortably familiar world of patting themselves on the back as “visionary”, “expert”, etc., and gradually de-skilling themselves while keeping up appearances of always being up to date on things they have no chance of even understanding, because that takes time and “friction with the material”, as we say in Greece.
- Other.
Don’t become or remain a fool with a tool
All this to say: I really dislike vibe coding. Actually, during these past two days I came to hate it; more than once I got this disgusted knot in my throat. But “hate” is an expensive feeling, so let’s not inflate it. “Dislike” will do just fine.
I dislike that it mirrors the stupid and slow trial-and-error development process I strived so hard (and succeeded) to escape from almost 2 decades ago, in an industry that is not about software.
I dislike that the promised fully-automated “YOLO it and let the agent code” approach gets so many ooohs and aaahs despite the fact that my experiments from the past two days clearly demonstrate that the technology is amazing in its own right and also not what the “one side” above portrays it as, or wishes it would be.
I dislike more than anything else that (some) people once again seem to prefer taking the lazy path towards outcomes without spending time to understand the trade-offs, compromises and sacrifices their actions entail, not only for the outcome / the product, but for themselves too. And I would not care about that; you do you, if you are one of those people. But spare us the incessant vomit of regurgitated “thought leadership”.
Because it is one stance to knowingly admit “hey, I don’t have the time or interest to learn Rust, C, or OS internals to write a NIF, so I’ll let LLMs code it for me and see what I’ll get” and an entirely different and long-term detrimental stance to say, proudly even, “lol bro, why do I need to even understand what I’m doing? Claude Code will write it for me!”
A fool with a tool is still a fool, even if the tool is cool and makes the vendor drool. The technology of LLMs is amazing, especially if you (as I did years ago) played with NLP (NLTK and spaCy) and the early GPTs. At the same time, LLMs are an incredibly dangerous technology for those who use them in the wrong way; dangerous primarily for themselves, but secondarily also for everyone else who will be impacted by knee-jerk reactions to hyperbole.
So, again: LLMs are an amazing technology. My first computer having been an 8088, and having implemented a version of ELIZA in C 22 years ago, I have felt as if I’m living in a science-fiction novel since ChatGPT launched. It’s almost certainly a good idea to learn how to use the technology competently–and this includes when and for what not to use it, and when to use it with caveats, which means understanding whether you are in the “Danger Zone”, which I have clearly (and most importantly, knowingly) put myself into during the experiments of the past 2 days.
But, again, this “AGI” thing… sorry, this isn’t it. I remain unconvinced. Remain skeptical.
Rant over. Enjoy DiskSpace v1.0.0–even if it does something already done with :disksup.get_disk_info/1!
Will I vibe-code ever again?
Not if I can avoid it.