Following the recent release of v1.0, the 100%-complete version of the Northwind Elixir Traders book on modeling databases with Elixir, Ecto and SQLite, I was a guest on Jacob Luetzow’s “Elixir Mentor” podcast. Jacob is not only a great host, but he has also been pivotal in my path to becoming competent in Elixir. I have been following his YouTube channel since the days when it was called “Backend Stuff”. His early videos on developing a REST API with Phoenix were very useful in getting the lay of the land.
The following is a loose transcription of the video’s content, without all the uhmmms and aaaahs that Toastmasters had beaten out of me years ago. It has also been shortened to contain only the parts pertinent to the discussion about the book and development work.
“Can you give a quick introduction to yourself, who you are, and what your background is?”
I’m a mechanical engineer by training. I worked for many years abroad, and have been living in Greece for the past 5 years. I lived in Central Europe (Switzerland), traveled the world in my corporate career, and did various things, like R&D of turbochargers, product development, and Machine Learning in R&D–that’s how I programmed a lot in Python. I wrote a book that has been a surprising success, and that came after 2 years of getting immersed in the Elixir ecosystem.
“I started reading your book, haven’t gotten as far through it as I was hoping, but I think it’s a great resource for getting started, understanding how databases work, how Ecto works, and how to efficiently build your schemas and work with data in Elixir apps. I’m curious what made you want to go with the Northwind application.”
I know that at some point I remembered that there is a Northwind Traders database that I had in an old Microsoft Access 97 CD, from way back when. What happened is that I got into Elixir, Phoenix and Phoenix LiveView in 2022, and as I was following tutorials and documentation, I realized that I’m having trouble building things in the way that I imagined they should work. I would set out to create the ERD (Entity-Relationship Diagram) and then when I would transform that into the schemas and the migrations, it worked–but I always felt that I’m following something “by the book”, and what if I change something that the tutorial doesn’t cover? When I changed something, things sometimes broke. It’s not that difficult in the end, especially after you follow how the book explains things, which hopefully makes it clear. But it always nagged me that I’m following the “happy path”, and that I do not know what will happen if things break. How can this thing break? How far can it break, and how do I recover? What could I do differently? So I thought, let’s seriously explore how all this works, and its failure modes.
After trying things out in many projects, I thought OK, enough with all that copy-paste work–let’s sit down and take a database that is simple enough and laid out properly, and try to model it in Ecto. So I started doing this, and at some point I realized: well, I’m doing this anyway for myself, and it seems to help me understand how this works, so why don’t I document my learning process in a couple of chapters and put them up as an early book release on Leanpub? I published two chapters in April 2024, and the pick-up was massive, so it instantly became my “weekend project”. I kept updating the book, exploring and building it out over almost an entire year, with the exception of some months in the summer, when I was too busy with other Elixir work.
At some point, the aspiration went beyond documenting how Ecto works: I also wanted to use my Elixir skills, which had been growing over the months as I had been programming a lot in Elixir over the summer, to teach things that I had learned. For example, there’s a chapter in the book where we import the Northwind Traders database. You can either do this manually for every table, or you can implement an import_all() function that automatically determines the order in which the tables’ data need to be imported. So, I found myself writing an algorithm for determining the right order, and it turned out a few chapters later that it had a bug. This made me implement Depth-First Search in Elixir, in order to reliably determine an order in which the tables’ data should be imported.
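(ed.: as an illustration of the idea, and not the book’s actual implementation, here is a minimal depth-first sketch that orders tables so that every table comes after the tables it references via foreign keys; the dependency map in the example is made up.)

```elixir
# Depth-first ordering sketch: visit a table's foreign-key dependencies before
# the table itself, so referenced tables end up earlier in the import order.
# Assumes the dependency map has no cycles.
defmodule ImportOrder do
  def sort(deps) do
    deps
    |> Map.keys()
    |> Enum.reduce({[], MapSet.new()}, fn table, acc -> visit(table, deps, acc) end)
    |> elem(0)
    |> Enum.reverse()
  end

  defp visit(table, deps, {order, seen} = acc) do
    if MapSet.member?(seen, table) do
      acc
    else
      {order, seen} =
        deps
        |> Map.get(table, [])
        |> Enum.reduce({order, MapSet.put(seen, table)}, fn dep, inner_acc ->
          visit(dep, deps, inner_acc)
        end)

      {[table | order], seen}
    end
  end
end

# Example with a made-up subset of the dependencies:
# ImportOrder.sort(%{
#   "order_details" => ["orders", "products"],
#   "orders" => ["customers"],
#   "products" => ["categories"],
#   "customers" => [],
#   "categories" => []
# })
# returns a list where "customers" and "categories" come before "orders",
# "products", and "order_details".
```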
So, while the book’s scope started off as “Elixir, Ecto, and SQLite”, it later became something like “OK, you can practice Elixir, but also learn those things”. It’s a kind of recap of basic and a bit more advanced Elixir skills, and you also learn database modeling in the process. Plus, you have to fight SQLite in a couple of places–only twice–because it’s not as feature-rich as PostgreSQL. It has been a fun weekend project and I learned tons from it. I was telling my friends that after spending so many hours reworking the chapters and making sure that it all flows more sensibly, it’s as if Elixir is flowing out of my fingers on the keyboard. For example, I had to rewrite two chapters almost from scratch, because the way they were originally written caused a lot of confusing back-and-forth for the reader.
That’s how the book came to be, and it has so far received excellent feedback, so I’m happy to take this approach of writing technical educational books further, because it seems to be a missing ingredient in the market for technical education and self-study. I mean, there are many great books about Elixir–actually, there are few books overall about Elixir, compared to what you can do with Elixir–and there are tons of tutorials and short blog posts explaining how to do single things. But I always approach things from an exploratory instead of a prescriptive angle, which seems to be something that people appreciate.
“I love that–and yeah, what’s great is that there are few books on Elixir, and we’re really lucky that the ones that exist are amazing! I really haven’t run across a bad Elixir resource. And that says a lot about the Elixir community and the knowledge base that everyone is sharing.”
I think it’s also important to admit that different people have different preferred ways of learning. For example, I went through “Programming Ecto”, which is an excellent book, but doesn’t match the way I learn. It’s very completionist, very regimented, and it delivers what it promises: you go through it and you basically get to know all of Ecto–but that’s a lot. And if I don’t apply it on something that I need to achieve or solve, I cannot retain all that knowledge. It will all fade away too quickly compared to the effort I had to put into studying from this kind of book.
In contrast, the way that Northwind Elixir Traders is written is that it starts with a mission: to model the Northwind Traders database, and run difficult queries on it. If we need to use something specific, we’ll use it. We are not covering all of Ecto–but if you have gone through those basics, the documentation is one click away, so you can build out anything else you need.
“That’s great–I feel that our approach to teaching is very similar, because when I started my YouTube channel, it was because I realized there aren’t many resources outside of books. And I know that when I started programming, I enjoyed video tutorials, and the hand-holding, being shown how to do things, but also you don’t really have to deep-dive into things you are not using. So I thought hey, we’re going to build something useful and we’re only going to touch on the things that we touch on, and there’s no need to dive deep into things that we don’t need yet, because I feel that this can overwhelm your “knowledge pipeline”, however you’re learning. What I also think is funny is that Isaak and I have very similar backgrounds. He’s a mechanical engineer, I’m a structural engineer, and we’ve both worked for family businesses whose varied experiences shaped our paths into software. I started doing mechanical and electrical engineering work, because I would automate manufacturing lines, like rip out old hydraulic pumps and replace them with vector motors and PLCs and all these things, and I started implementing ladder logic, latches, timers, switches… and I realized that I want to do more. So I started teaching myself how to program and I started with Java–unfortunately–and slowly moved to JavaScript, Swift, and eventually ended up here, using Elixir. And now I don’t want to use anything else.”
Fun fact: I started with BASIC, exactly where I’m sitting right now. This apartment used to be the first office of the family business, and here where I’m sitting used to be a PC with a 286 CPU back in the late 80s. I used to program FOR loops to raise the pitch of beeps to simulate the sound of a revving engine. Then I didn’t do much programming for years while in school, and then I did a lot of C and C++ in my studies–but super superficially, not really programming something complex, just implementing algorithms.
Then, I got into Python in my R&D work. Back in 2009, that was seen as something “weird” in the world of mechanical engineering, because it was otherwise all FORTRAN and C++ for simulation code. I automated the heck out of everything with Python, because I found R&D work to be drudgery. For example, you sit there and input new design parameters for the shape of a turbine. Then you run it through a program that will generate a geometry and give you a simple estimate within one minute for how the design will perform–and then you take the geometry, you manually create a 3D model and mesh it, you send it off to a High-Performance Computing cluster, and have to wait for 18 hours to see how well your design performs. Rinse and repeat, tens of times. I thought that this is just stupid–at the very least I should let an optimization algorithm run the fast simulation and give me an allegedly optimal design, and only then wait for 18 hours on that allegedly optimal design.
So, for 4 years I avoided manual labor like the plague in everything I touched in R&D. That’s how I got into Python. In retrospect, the Python code I was writing was very “functionally written”–no classes, no object-oriented programming in that sense. Also, as engineers we don’t put much emphasis into becoming great software developers, we just want the code to work. If it works, it’s good code–that’s the sole quality criterion.
Then for many years I didn’t do anything with programming, and then again I touched Python in my startup on Machine Learning in R&D. After that, my programming skills became dormant, until in the family business in 2022 I got fed up with manual labor for offers, inventory management, product data management, and so on. Having spent 1.5 years by then getting to understand the business domain, I eventually sat down for one week and automated the core processes of the business. That came after failing with Django, which I found super weird to work with, because OOP in my mind has way too much abstraction. For example, Django is great thanks to the admin UI it provides out of the box: you start off quickly, but when you want to implement something complex, you suddenly need to first understand how the maintainers of Django have abstracted everything. So you don’t actually learn how to program in Python, but how to get into the brain of the maintainers and follow the abstractions they have created.
Whereas, with Phoenix it’s pretty straightforward, even though it has its own DSL and peculiarities with the Plug module and so on. You don’t have to think through a hundred abstractions to get something done. That’s also why I like Ecto: it’s very straightforward, once you get the hang of it.
In the summer of 2022, a friend of mine who comes from academia and programs in Scala professionally was telling me about functional programming. Soon thereafter I installed Elixir on my Debian Linux workstation, ran through some exercises on Exercism and through some online tutorials, and from the moment I saw the |> pipe operator, I got hooked. This was the moment! In my mind it worked like my old HP 48G calculator works with RPN. Since then I’ve been trying to program everything I can in Elixir.
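(ed.: a trivial illustration of the appeal: the same expression written as nested calls and then as a left-to-right pipeline.)

```elixir
# Nested calls read inside-out...
String.upcase(String.trim("  hello  "))

# ...while the pipe reads left to right, much like feeding values to an RPN calculator.
"  hello  " |> String.trim() |> String.upcase()
```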
I nowadays try to program only in Elixir, though that’s not always possible, as I’ve also been getting paid to program an app in PHP and NextJS for many months now. However, coming back to Elixir always feels like a breath of fresh air, because it’s so… so much cleaner. In Phoenix LiveView, you work on the backend, you don’t have to deal with pinging back and forth between different backend and frontend codebases, with JSON wrangling, and all that. It’s just a great programming language, and the community is amazing.
“Yeah, it’s great! The thing that got me hooked before I really truly understood all the power of OTP was the pipe operator, and when I wrapped my head around pattern-matching, that was a game-changer as well.”
There are some things that I used only later, for example the with construct. I have an app that I haven’t released that polls the Norwegian Meteorological Service’s REST API. You want to get the weather of a “data location” that might be shared among different actual real-estate property locations. For example, if you want to get the weather in a geohashed grid, you can have multiple real-estate locations within that grid. You don’t need a resolution of 100 meters–the weather is not going to differ between that and the other apartment across the road. Then the app follows what you could call a monad pattern. There’s a function pipeline that says “maybe there is a location”, “maybe there’s a data location”, “maybe fetch the weather for this data location”, “maybe update prior forecasts”, “maybe persist the new data in the database”.
So you start with a tuple of {latitude, longitude} that passes through this series of maybe functions in a pipeline, and the end result is that you get updated data in the database.
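(ed.: a hedged sketch of the shape described above; the module, the function names, and the placeholder bodies are illustrative, not the actual app’s code. The with construct mentioned earlier is another way to express the same flow.)

```elixir
defmodule WeatherFlow do
  # Each "maybe" step runs only if the previous one succeeded;
  # an {:error, reason} tuple falls through untouched to the end of the pipeline.
  def refresh({lat, lon}) do
    {:ok, %{lat: lat, lon: lon}}
    |> maybe_find_location()
    |> maybe_find_data_location()
    |> maybe_fetch_forecast()
    |> maybe_persist()
  end

  defp maybe_find_location({:ok, state}), do: {:ok, Map.put(state, :location, :some_location)}
  defp maybe_find_location(error), do: error

  defp maybe_find_data_location({:ok, state}), do: {:ok, Map.put(state, :data_location, :geohash_cell)}
  defp maybe_find_data_location(error), do: error

  defp maybe_fetch_forecast({:ok, state}), do: {:ok, Map.put(state, :forecast, %{})}
  defp maybe_fetch_forecast(error), do: error

  defp maybe_persist({:ok, state}), do: {:ok, state}
  defp maybe_persist(error), do: error
end

# WeatherFlow.refresh({59.91, 10.75}) returns {:ok, state}
# or the first {:error, reason} encountered along the way.
```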
The more I played with Elixir and learned to use its constructs, the more I sometimes over-used them. For example, when I discovered Enum.reduce/3, everything was done with Enum.reduce/3 for a while. Or Enum.reduce_while/3. Of course I will use a reduction everywhere now!
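(ed.: for reference, a contrived one-liner of the kind being described.)

```elixir
# Summing a list of prices with a reduction, even though Enum.sum/1 would do:
Enum.reduce([12.5, 7.0, 3.25], 0, fn price, acc -> acc + price end)
#=> 22.75
```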
“That’s why I tell everyone coming in from object-oriented programming languages not to rush into GenServers, because you’re gonna use them wrong.”
Yeah, and I did that too for a while. Last summer I was busy building a sister app for a startup I run on property management in Greece, Breek.gr. We had an idea with the co-founder to do something for the top-line growth of our users’ property-management businesses. So yes, with Breek.gr you have something that you can use to increase the operational efficiency of your office, but so far we are not giving you something to grow your business. The idea was–still is–to also give users something that can increase their visibility in the market of offering property-management services: a directory of property managers.
Before I decided to jump into professionally developing software, I had a decision to make: build this sister app on the existing PHP and NextJS stack, or use Elixir and Phoenix LiveView. I decided for the latter, and went through six weeks of frantic work, where I pulled out all the stops. GenServers, LiveComponents, caching with Cachex, multi-step forms–it was an incredible learning and developer experience!
As you mentioned, the GenServer is a tricky one! Once you learn it, you think “wow, everything can be a GenServer”, and you start to think of the GenServer like an object that does things for you. But if you do this, it becomes a bottleneck, as everything has to go synchronously through it.
But seriously–the Elixir programming language gives you so many amazing features. I cannot find something right now that I could not do with Elixir.
“I agree–there’s nothing you can’t use Elixir for. It’s pretty much a good solution for everything, at least with backend API work. Now, I’m curious: I think that data architecture and figuring out schemas and relationships, and everything you do between database tables… it’s a hard skill to learn, and it’s one of those things you can’t really learn without creating a bunch of projects and slowly learning how to lay things out. I’m curious, how do you approach that piece of it in your book?”
Given that the database schema is already fixed in the Northwind Traders database, there are some thoughts on how you could build it out, but we don’t do anything that goes far beyond the schema. OK, we do add one more table for countries, and import data into it from an online CSV file on GitHub. But we don’t do any re-architecting in terms of changing columns from being string values for the country name to a foreign-key field pointing to the countries table. We don’t normalize further. So the book is not so much about starting from scratch, it’s about modeling an existing database.
In other cases… when I was trying to learn Ecto and Phoenix, I don’t know how many trial projects and different ERDs I built, and stumbled again and again on misunderstandings I had about Ecto, because it had also been a long time since I learned about databases in my studies. However, now that I’m building more and more complex things, I find it’s really useful to start small.
Think big, but start small. What I do is lay it out in a diagramming tool like dbdiagram.io and think about what else might come, and try not to make it into a tangled mess, because that’s a huge risk–you can have so many different tables and normalize the database to an extent where you’d need to join so many tables to get the results you want… And you don’t need to make it that complex.
Then I try to zoom into what I need for the first version. My biggest fear has honestly been migrations with production data. This worried me, and that’s why there’s one chapter in the book where we change the :price field of a product from a :float to an :integer. That’s super simple–but it’s also a litmus test of dealing with data already present in the database. You have to write migrations to create a new field, then convert the price to cents by executing SQL, then rename the field, then drop the old column…
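(ed.: a hedged sketch of the general shape of such a change, not the book’s exact migrations; it assumes a SQLite version that supports DROP COLUMN and RENAME COLUMN, and it drops the old column before renaming the new one into place. The module name is illustrative.)

```elixir
defmodule NorthwindElixirTraders.Repo.Migrations.ConvertPriceToCents do
  use Ecto.Migration

  def up do
    alter table(:products) do
      add :price_cents, :integer
    end

    # Backfill the new integer column in cents from the old float column.
    execute "UPDATE products SET price_cents = CAST(ROUND(price * 100) AS INTEGER)"

    alter table(:products) do
      remove :price
    end

    rename table(:products), :price_cents, to: :price
  end

  def down do
    # The reverse steps (integer cents back to a float price) are omitted in this sketch.
  end
end
```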
I would say, aim to avoid useless migrations. If you can run ahead a bit and predict what you might need in the future that might require a migration, try and build the core so that you won’t have to change things as much–because it’s pretty much guaranteed that things will change at some point, for example for performance reasons.
One principle I follow is to avoid deciding too early, as in the Second Toyota Paradox: first forecast what kinds of decisions I will need to make, and then defer each decision until the absolute latest moment it becomes necessary, either by itself or because other decisions depend on it.
“I think people underestimate the power in planning, as well. When you’re creating a product you want your MVP to have as few features as possible, and that’s what you want your database to be as well–only data that you need. But you really want to think through, should this be in its own table? Can this be a column? Do query speeds matter? How often am I grabbing this data? There’s a lot that goes into the layout of your database architecture.”
The whole Agile, MVP, Lean Startup mindset–there is a downside to it, where people start to say “well, I don’t need to plan, I just do”. But if you “just do”, you aren’t being agile; you’re merely performing rework again and again, wasting valuable time and energy on things that you could have just as well avoided with some foresight.
“Is that waterfall?”
“Waterfall” is a strawman–people put it up and criticize it. But I don’t think that anyone does waterfall, not even in hardware anymore. People do a lot of rework in “Agile”, though!
I’ve done a lot of product development and some product management and the whole point is to go fast. I mean, we developed a hand-held radar to see into walls within 12 months! 12 months for software and hardware. Of course, we based it on something that already existed, so we didn’t do fundamental research on everything from scratch. But we had a clear product vision, and the idea was to always work with constraints and trade-offs.
For example, how big should the antenna be? Well, I don’t need to fix its footprint, i.e. so many centimeters by so many centimeters. I give a range to the RF engineer, and he needs to give me trade-off curves. If the antenna is larger, it has a different center frequency, and its depth penetration and resolution differs, and it hampers the ergonomics of the device. If it’s smaller than it needs to be, it differs in some other way, like limiting the size of the batteries you can use to power it. So, understand the trade-offs, and then gradually converge towards something that works and fulfills the requirements.
In software engineering, the way I think of it is: run ahead a bit, put down your product vision beyond the MVP, and know where you might end up. Already bake some things into the concept that will let you get to the next versions of the MVP or to the full product, without having to re-do everything. Re-doing is super expensive and always comes with bugs–and with migrations. I’m in the middle of something like that right now, actually.
“Migrations can be very difficult, especially if you’re changing types and altering data with them. There’s a big difference with just adding a column vs. migrating data.”
Yes, because you need to make sure that everything you migrated got converted correctly. And it doesn’t even need to be a migration in the sense of an Ecto migration. It can even be a different way of organizing data.
For example, I’m going to soon release a new version of the Breek.gr web app, where we have the ability to upload files as attachments to different entities: real-estate properties, tasks, financial transactions, occupancies… The way that the attachments of each type of entity were implemented for the MVP launch was OK for the MVP; it was done pragmatically–every entity has its own “embedded schema” for attachments, if you were to put it in Ecto lingo. It’s a Repeater field in WordPress ACF Pro.
Now the idea is to have centralized document management across all types of entities, with the same attributes and features, such as permissions and optional expiration dates, and other metadata, like a summary, or tags. So now comes the difficult task of re-architecting everything into a new table that represents documents in an S3 bucket, and their metadata: a file’s S3 path/key, its size and type, the permissions… And that’s not technically difficult–but making sure that everything will get migrated successfully and that the frontend will work correctly with the new concept is not trivial.
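(ed.: to make the target concrete, a hedged sketch of what such a centralized documents table could look like, expressed in Ecto terms even though the actual app is PHP/NextJS; all module and field names are illustrative.)

```elixir
defmodule Docs.Document do
  use Ecto.Schema

  schema "documents" do
    field :s3_key, :string            # path/key of the file in the S3 bucket
    field :size_bytes, :integer
    field :content_type, :string
    field :summary, :string
    field :tags, {:array, :string}
    field :permissions, {:array, :string}
    field :expires_at, :utc_datetime  # optional expiration date

    # In a real model each document would also reference the entity it is
    # attached to (property, task, transaction, occupancy), for example via
    # separate join tables or a polymorphic pair of columns.

    timestamps()
  end
end
```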
It’s not technically difficult–but you need to make sure that the app keeps working after the data have been migrated. Since we’ll have to migrate production data, I get this gut feeling that this needs to be done “first time right”. Of course, we first try things out in a staging environment–but the gut feeling of having to be extra careful and diligent is still there.
“With changing data… if you’re not just limited to web applications, you need to think about backward compatibility, because people don’t update their app… it can get tricky!”
I find that this is the most difficult problem with the split between frontend and backend, and it’s one that you can avoid when you implement your app in Elixir.
“Yeah, if everything’s a web app, everything on the frontend updates when you tell it to.”
That was another “aha moment” I had. One of the “trial balloon” Elixir apps that I wrote is Changelogrex. It downloads the Changelog files of the Linux kernel, then parses them into separate commits and shoves them into an SQLite database. You have the standard boilerplate Phoenix LiveView list of records, and you can add a new Changelog to download by entering the Linux kernel version. Then there’s a button you can press that will asynchronously and in parallel process all commits and persist them.
It was an “aha moment” when I was able to trigger an update of a LiveComponent that shows you how many commits are in the database from within IEx, and see the number go up. I mean, I know it can be done, but to see the number go up without writing a single line of JavaScript–it was fascinating!
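(ed.: a hedged sketch of the mechanism, shown with a plain LiveView for brevity; the module, topic, and assign names are made up, not Changelogrex’s actual code. A LiveComponent would be refreshed the same way, with the parent LiveView calling send_update/2 from its handle_info.)

```elixir
defmodule MyAppWeb.CommitsLive do
  use Phoenix.LiveView

  def mount(_params, _session, socket) do
    # Subscribe to a PubSub topic once the websocket is connected.
    if connected?(socket), do: Phoenix.PubSub.subscribe(MyApp.PubSub, "commits")
    {:ok, assign(socket, :commit_count, 0)}
  end

  # Any process on the node can broadcast to the topic; the count re-renders live.
  def handle_info({:commit_count, n}, socket) do
    {:noreply, assign(socket, :commit_count, n)}
  end

  def render(assigns) do
    ~H"<p>Commits in the database: <%= @commit_count %></p>"
  end
end

# From an IEx session attached to the running node:
# Phoenix.PubSub.broadcast(MyApp.PubSub, "commits", {:commit_count, 42_000})
```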
“I haven’t played much with SQLite. What made you go that route? Is it just a database you prefer using or is it the easiest to get up and running without having to run a separate instance of PostgreSQL?”
Spinning up a Docker container with PostgreSQL is trivial. It’s trivial even if you host it on bare metal, which I do. What is not so trivial is to make it work in a clustered setup–this is something that I’m still working on, to learn how to do it properly.
For the things I’ve been developing that have a very small scale, I find that it’s overkill to go to PostgreSQL if you can avoid it. And there’s a certain feeling of simplicity when you have everything in three files: the database, the .db-shm file, and the Write-Ahead Log. There are 3 files to back up, whereas with PostgreSQL you have files in some opaque directory structure. That’s one reason.
The other reason is that I used SQLite extensively when I was working in R&D, though not in complex setups. The way it worked is that you would send calculations to the HPC cluster, and after minutes or hours you’d get the results in a multi-gigabyte file. Then you’d open this file in a purpose-specific GUI of the software suite, and post-process the results. For example, you could run simple analyses on the Finite Element nodes, to see where the stresses and strains were high or low.
I had to make some complex calculations that needed stress and strain data from every node, to identify interesting regions of the 3D model, to calculate fatigue and so on. I couldn’t figure out how to apply complex criteria through the GUI, so I dumped the results into SQLite instead. Then I’d run SQL queries across the nodes–like “find me all the nodes where the stress in this direction is higher than a certain value, and the stress in the other direction is lower than some other value”.
And that’s how I learned SQLite, back in 2010. I mean, it’s not something difficult to learn… I still remembered it as a very reliable workhorse for some cases.
“Well, and it’s great for small projects, especially if you’re not planning on hosting thousands and thousands of users and having a billion records, it’s a really good option.”
In fact, there’s a guy who posted on HackerNews, who has 6.4 TB in a single SQLite database–but the thing is that his application is read-heavy. I think that this is the actual criterion. If you want to have multiple concurrent writes, then going with SQLite is not the right choice…
“Yeah, that makes sense.”
…but if you have a read-heavy application, or if your data doesn’t change much… For me, SQLite is like a spreadsheet in a way. What I also like to do is take CSV files or data that don’t change that much, import them into SQLite, and then I have a spreadsheet in a file that I can access programmatically and run queries on, without bothering about deploying PostgreSQL, managing it, backing it up… It’s quite handy.
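(ed.: a minimal sketch of that habit, assuming an Ecto repo backed by SQLite and an already existing countries table; the repo name, file name, and column names are made up, and the hand-rolled CSV parsing is deliberately naive.)

```elixir
rows =
  "countries.csv"
  |> File.stream!()
  |> Stream.drop(1)                          # skip the header line
  |> Enum.map(fn line ->
    [name, code] = line |> String.trim() |> String.split(",")
    %{name: name, code: code}
  end)

# Schemaless bulk insert into the SQLite-backed repo; the table can then be
# queried like any other Ecto source.
MyApp.Repo.insert_all("countries", rows)
```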
The other reason why I used it for the book is that I wanted to explore something that has constraints. I think you learn the most when you have to do things under constraints, because you need to be creative.
For example, there’s no support for ALTER COLUMN in SQLite. So when you want to change the type of a column, you need to create a new column, convert the data, then create another migration to drop the old column, then another one to rename the new column to the old column’s name.
There’s also no support for named constraints. So, when you use Ecto.Changeset’s validation functions, you cannot validate foreign-key constraints the way you do it with PostgreSQL, because Exqlite–actually, SQLite–will not report the name of the constraint being violated.
That’s great, because it gives us a reason in the book to roll up our sleeves and write a custom validation function that reports the proper error and field name, so that you can still validate that a foreign-key value exists in the target table when you’re importing data from Northwind Traders.
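(ed.: a minimal sketch of the idea, not the book’s actual implementation: since SQLite will not name the violated constraint, check explicitly that the referenced row exists and attach the error to the right field. The module names in the usage comment are illustrative.)

```elixir
defmodule Validations do
  import Ecto.Changeset
  import Ecto.Query

  # Usage (illustrative): validate_fk_exists(changeset, :supplier_id, Supplier, Repo)
  def validate_fk_exists(changeset, field, schema, repo) do
    validate_change(changeset, field, fn _, value ->
      if repo.exists?(from(r in schema, where: r.id == ^value)) do
        []
      else
        [{field, "references a missing row in #{schema.__schema__(:source)}"}]
      end
    end)
  end
end
```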
This was done as an excuse, in a way, to see how far you can push it, deal with the consequences, and see what works and what doesn’t. And there are some things that work really badly compared to your expectations. For example, SQLite doesn’t support INTERVAL–if you want to apply a window function across a date interval to calculate a moving average of orders’ revenues in the last N dates, you cannot do it as easily as with PostgreSQL.
In turn this means that you need to implement workarounds, and SQLite is not at all happy with those workarounds. The final queries get the job done, but are also super slow. But then again: this gives you an excuse in the book to profile those queries and their sub-queries, understand where the bottlenecks are coming from, learn how to profile functions with Erlang’s timer module and queries with SQLite’s EXPLAIN QUERY PLAN…
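(ed.: a hedged sketch of the two tools mentioned; the repo alias and the placeholder query are made up, not the book’s actual slow query.)

```elixir
import Ecto.Query
alias NorthwindElixirTraders.Repo  # assumed repo name

# Placeholder standing in for one of the slow, sub-query-heavy queries.
expensive_query = from(o in "orders", select: count(o.id))

# Erlang's :timer.tc/1 returns {elapsed_microseconds, result}.
{usec, _rows} = :timer.tc(fn -> Repo.all(expensive_query) end)
IO.puts("query took #{usec / 1_000} ms")

# Ask SQLite how it plans to execute the generated SQL.
{sql, params} = Ecto.Adapters.SQL.to_sql(:all, Repo, expensive_query)
Ecto.Adapters.SQL.query!(Repo, "EXPLAIN QUERY PLAN " <> sql, params)
```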
It’s all done for the purpose of learning and exploration–that has been the ethos of the book from the start.
“That’s actually really awesome, that you run into profiling queries and performance, because I feel that this is something I’ve never had to think about or experience, especially in a small-scale project–and for you to replicate that with this project, that’s awesome. People that are reading your book are going to be getting knowledge so much earlier in their career. Otherwise you don’t deal with that stuff until you have to, cause you have scaling issues, or you start dropping connections to your database, cause queries are taking too long…”
Yeah, and the idea is not to become an expert in query planning. I am not an expert. What I do is, I say “well, this is running super slow, let’s make some changes”. Oh, now it’s running even slower. OK, let’s drop the call to distinct/3 and see what happens. Wow, now it’s super fast–yeah, but we get wrong results…
There’s a whole lot of playing around with the way you formulate the query, because it contains 3 sub-queries, one of them a recursive CTE, and so on. So you then say “what can we use to understand why this thing is slow?”
The book is not going to teach you how to profile any query, but it plants a flag: if you ever stumble across something like this, there is a tool you can use, you can also profile your sub-queries with benchmarking in Elixir, and there are trade-offs that come with picking SQLite. We plant the flag so that in the future you know this can be done, and that you need to be somewhat of a detective to figure out why a query is slow, or why it’s not as fast as you expected.
The finding is that doing that kind of a query with SQLite, at least the way I had the idea to implement it, is not something that SQLite is happy about–but that’s OK! I mean, we had to go through almost 500 pages and stumbled across just two situations where SQLite is not as easy as PostgreSQL. And that’s actually fantastic!
Imagine: many people consider SQLite a “toy database” because it’s what everything uses to store data in your phone, but you can take it so far, and when you start to do complicated queries with recursive Common Table Expressions and so on–yes, you have an issue. Well, I consider this a win, that’s not a problem.
“I think it’s very important to know… a lot of people pick the popular choices for tech stack when they’re building a product, and it’s good to get your hands dirty and to know limitations of certain aspects of your stack before you are really tied into a decision down the road. For instance, MySQL versus PostgreSQL… do you know what limitations you’ll run into, choosing between these two databases? It’s important homework to do. It all depends on use cases of your app, and how many read queries vs. write queries… how you want to query data, how you’re filtering things… it’s good to start thinking about those things early on.”
Yeah, it’s about understanding requirements! And you’re probably not going to need to make that distinction between, for example, MySQL and PostgreSQL, until your app is so successful that it’s a nice problem to have.
“Then, it’s also a really hard problem…”
It is a hard problem, but by that time you should have the money to invest in it. Delaying a product launch because you are deliberating over one database or the other, though, is something you could consider premature optimization. There’s a whole tendency in the software development world to become fascinated with something or entrenched in a choice…
"…and get stuck with the planning instead of the doing, right?"
Yes… or to make a choice which supposedly will determine the success of the project. And I’ve done it–I’ve done it myself.
“Yeah… me too!”
It’s very difficult to not do it!
“When you’re an engineer, you want everything to be perfect from day one, right?”
Yes, but it never is, and there’s unfortunately a lot of what I call “chicken-clucking” online. You go on HackerNews and find indignant responses to someone mentioning MySQL… I mean, MySQL runs on how many millions or billions of installations. You’re probably not going to succeed or fail because you chose MySQL instead of PostgreSQL…
“And no matter what you choose, you’re always going to run into some nuance with either one… you’re not going to avoid it.”
…and that’s part of the mindset of the book: the engineering ethos. Yes, we want to do this perfectly from the start. That’s the aspiration. The reality is that you knock your head against the wall until you break through, and there are a couple of points in the book where we have to “advance backwards” and go and fix things, for example because the prioritization of the tables earlier was not robust.
In that specific case, I noted down in Chapter 10 that “it seems to work, but ideally we would have used an algorithm instead of a heuristic to determine the order in which the tables should get imported. It doesn’t seem to be needed, though”. Well, what do you know… A few chapters later–I didn’t know this at the time–this problem would bubble up, and that’s why later we implement Depth-First Search to determine a proper order of the table importing process.
But that’s OK! The book tries to also convey a mindset, not just Elixir skills: “you’ll figure it out”. It’s also like talking to myself, because I often want to do things “right the first time”…
“Right, but you can’t see everything, you can’t see into the future. You don’t know… what I think today could be completely wrong tomorrow, depending on shifting of the use case, or just querying performance, or maybe you’re using your data differently, and just have to shift.”
Or maybe a new feature comes in, like this document-management feature: it was never planned. It was never foreseen that we’d need to implement something like this. But, as it turns out, it’s an important feature that makes our users’ lives easier. Well, OK, this new feature’s requirements violate a lot of assumptions that we made when initially modeling the database. Let’s figure it out.
The beauty of programming is–barring physical constraints, like physics–if you can apply your mind to something, you will figure it out, somehow. Maybe it’s not as performant as you wanted; maybe it’s not as pretty as you wanted. Maybe it doesn’t conform to the “best practices” of the sexiest way of doing things, but if the customer or the user is happy, then you’ve achieved the goal. Then you can start looking into making it pretty and fast–if you can make it prettier and faster. Prettier usually goes well. Faster… not always.
“Yeah… I remember when I was building hardware and machinery, being the architect and the builder, you build it in a certain way. And then the second you let someone else test it or take it for a test drive, they use it in a completely different way than you originally intended, and they find all these edge cases and instantly break things. And that’s what’s fun about being an engineer. You don’t see different paths a lot of times when you’re building something, and then you learn to shift or avoid the wrong path.”
I think it has to do with how much you let form over function rule the game. I was watching a video two days ago about the iPhone 4–remember, when holding it in a certain way would attenuate the antenna signal?
“Oh yeah, and the famous Steve Jobs line, that you’re holding it wrong.”
Exactly! I mean, form can take different forms. It can be the design, it can be the architectural choices, as you said. In the end, developing products, developing software… developing anything that has uncertainty is about managing that uncertainty. If you don’t manage it correctly, you will overcomplicate things or lock yourself into something that backs you into a corner. And then, to come out of those corners can sometimes be super expensive. In hardware, we both know how expensive it gets… In software, you change the code and it’s fine. But in hardware, if you have plastic injection molding, or cast parts, or machined parts, or supply-chain lead times, or things that cost a lot of time–not only money–to process, there you are no longer so “fluid” or lax with “yeah, it’s OK, let’s be agile, we’ll figure it out later”…
“You can’t really be agile with parts, can you? With physical parts, not so easily. A little bit, with 3D printing, but that’s about it.”
What 3D printing does is allow you to speed up the feedback loop between the design and figuring out whether it works. But in the end it’s all about adapting to new knowledge. People use and abuse the term “Agile” a lot; it’s something I had been posting about for years. I even wrote books on this. But after I saw that the Agile community doesn’t change its mind because it’s caught in a certification game, I stopped posting about this. It’s just nonsense, because if you want to engineer something or develop anything, you need to manage your knowledge gaps.
I don’t want to quote Donald Rumsfeld, who probably wasn’t the best person ever, but he had this fantastic epistemological quote about the known knowns, the known unknowns, the unknown knowns, and the unknown unknowns.
There are some things you know you know. OK, nail them down, you don’t need to bother with them. Of course, qualify whether what you think you know is actually true.
Then there are some things you know you don’t know. OK, now you know what to research.
There are some things you don’t even know you know. That’s what happens in bigger companies, and I’ve seen this first-hand. There are companies where they don’t even know what they know! They don’t know that this guy has knowledge on this topic, or framework, or industry, or process. So it’s good for a company to map what it knows every now and then.
What kills you is usually what you don’t even know you don’t know…
“Yeah… that’s true!”
…which is Chapter 16 of the book, where I had no idea… I mean, in the beginning, I didn’t know that SQLite doesn’t support INTERVAL. Then I knew that SQLite doesn’t support INTERVAL, but I had no idea that SQLite, with this complicated query setup to replicate the INTERVAL function, would perform so abysmally.
That’s OK, you will always find things you don’t know, but that’s why you need to pull things forward and map all those four quadrants, so at least you don’t facepalm and say “oh my God, I should have read that book, or listened to that podcast, or talked to that guy, or asked for a second opinion…” And that’s the whole problem with development: mapping knowledge gaps.
(ed.: omitted content not relevant to the book or to development)
“We have a viewer question. He is curious about your startup philosophy and some of your previous books. I feel that’s a pretty open-ended question–but it sounds like you’ve been part of a lot of startups…”
Not really. Let me count… one, formally. But I’ve been in a lot of technology and product development and launch projects and in quite a few turnaround projects for process development, product development, hardware and software. I’ve also been developing the family business, of which I took over the management when I returned to Greece 5 years ago. That was not really a turnaround, but it was a lot of change that needed to happen after almost 30 years of stability and legacy work processes.
As for “startup philosophy”… I wouldn’t say I have one. I’ve seen some things that resemble “waterfall”–though it never is waterfall. I’ve lived through what is called “stage-gate” or “phase-gate”, where you have batched decision-making in meetings where you decide “should we continue the project?” And that’s one issue with this approach: due to the sunk-cost fallacy, the longer you’ve been at it, the more likely it is that you will continue the project, regardless of the realistic outlook on its outcome–this is a failure mode of that mental model of gradual de-risking of projects through big-batch decision-making. Most projects famously never get killed, so you often end up with crappier products than you expected, later than you expected, and with cost overruns.
The most important thing for me is to not get caught up in fads. That’s why I’ve also been so critical of Agile and even of Lean. Those are good ideas but they got perverted as soon as they became products being peddled by their respective cottage industry.
“I was gonna say, the problem with a lot of them is they get used improperly, right?”
Yeah… but then you get into the “no true Scotsman” fallacy. What is “true Agile”? What is “true Lean”? It doesn’t matter. I’d rather focus on principles. For example, is it a good principle to get competent people, let them work together, but also expect them and coach them and support them to work together in a way that makes great things happen? Yes.
It’s also great to have people on the team who are less experienced but can skill up through coaching and exposure to more senior members… This is all part of the story. Forget about the stupidity of modern HR practices of expecting X years in this framework or Y years in something else, regardless of whether X and Y are realistic… Or 10 years in LLMs… What? Who are you in that case, as a candidate?
I find it’s more useful to have principles, because it’s up to you to adapt how they are to be applied to any given situation and context. If a principle is sound, it will surely apply differently to different situations, but it will generally be a good rule to follow. For example, “decide as late as possible”. People in management hear this and they grow grey hair, because they want to know everything now, now, now, in order to decide and nail down how much the project will cost, how much budget to reserve, how much money the product will bring in (to set sales targets), and so on. But even though it might sound very controversial or counter-intuitive as a principle, once you put it into practice, you tackle the question “how would we go about deciding as late as possible?” Then you start to find the solution applicable to your business, your product, your situation, your people, your culture… and then you can turn it into a process, a guidebook, a hiring practice, into whatever.
I would say the one overarching principle is: instead of having philosophies, look at principles, and don’t follow a process just because somebody else did. Always have a product vision before you start, know that the product will probably not end up being exactly what you envisioned, and manage uncertainty. And: not everything has to succeed, and most things fail–that’s part of starting up anything.
“I think it’s also important to be able to see your idea be a failure and move on from it, instead of staying married to it and trying to make it work. I’ve been part of startups where they’ll try to reinvent their leverage into a market and almost invent a problem that doesn’t exist. You’re never going to have a successful project if you’re inventing problems. You actually have to be solving problems.”
Yeah, although the way the industry worked for many years under the Zero-Interest-Rate Policy created what I call “siliconvalley-nato” in Greek (σιλικονβαλλεϊνάτο)–it means Silicon-Valley-inspired, like raise millions, throw money at the problem, implement lots of bells and whistles, give in to Google envy… remember the time when all the startups wanted to be like Google internally, with foosball tables? That kind of frenzy of “if we behave like Google, then we are basically Google and we’ll raise tons of money”… During those years, a lot of cargo-culting and monkey-see-monkey-do happened across the whole stack, from the way you hire to the way you deploy. This created the perception of “copy-paste-ability” of approaches between companies, and an explosion of headcount that included roles that are in recent years being shed in droves–like Scrum Masters, Agile Coaches…
And you can see in modern software development that people copy-paste “because we might eventually be the next Facebook”, or “the Uber of X”… You don’t need a lot of things, you’ll probably never need the complexity.
“I think that’s great advice to wrap this up. I really appreciate your time Isaak, it’s been awesome chatting.”
Thank you for having me on the podcast!
- The book is available as PDF and ePub on Leanpub
- Join the discussion on elixirforum.com
- Leanpub invited me to a short interview about the book