A reference build we commissioned ourselves, because NDAs block most of the before-and-afters we’d otherwise publish.
We built the same booking app twice. First in base44, a vibe-coding platform that promises a working product in a single prompt. Then again, from scratch, in Next.js and Supabase, using the same product spec and the same screens.
The base44 version demoed fine. It was also unshippable: no way to export the code, no auth we could trust, validation that accepted anything, and the kind of soft failures you only find out about when a real customer hits them.
The rebuild is the version in the screenshots on this page. This is the gap we fix for a living.
Most of our rescue engagements start with a founder sending us a Lovable, Bolt, Cursor, or base44 repo and asking some version of, "can this be saved?" The answer is almost always yes. But the why is hard to explain without showing the work, and client NDAs mean we can’t always publish the before-and-after.
So we built one ourselves, on purpose, from a tool we hadn’t yet put through its paces. BookBase is a booking platform in the Calendly / Acuity / Square Appointments lane. We picked it because the category is common, the happy path looks easy, and the edge cases (double-booking, timezone drift, cancellation logic, payment validation, multi-tenant data isolation) are the exact places vibe-coded software falls apart.
The point isn’t to dunk on base44. Base44 is good at what it’s good at: going from zero to a working demo in about an hour. The point is what happens on day 90, when a real user submits a booking with a -$50 price in the notes field and the whole thing silently accepts it.
We gave base44 the same product brief we would give any engineer starting from scratch. It produced a working app in roughly an hour. Here is what we found when we stopped watching and started looking.
Base44 does not expose the generated source. You cannot clone the repo, audit the schema, inspect the auth flow, read the RLS policies, set up staging, or run it locally. You are renting an app you do not own. This is the single issue that makes every other issue unfixable.
Empty names were accepted through the API. Malformed emails never surfaced an error. The phone field accepted arbitrary text. Notes had no length cap. Past-date bookings via crafted requests went through. Double-booking was a race condition.
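To make those gaps concrete, here is a minimal sketch of the server-side checks that would reject each of the inputs above except the double-booking race, which needs a database-level guard. It is illustrative only, not the rebuild's code: the `BookingInput` shape, the field names, the regular expressions, and the 1,000-character cap are all assumptions chosen for the example.

```typescript
// Hypothetical booking payload; field names are assumptions for illustration.
interface BookingInput {
  name: string;
  email: string;
  phone?: string;
  notes?: string;
  startsAt: string; // ISO 8601 timestamp
}

const MAX_NOTES_LENGTH = 1000; // assumed cap
// Deliberately simple email check: one "@", no whitespace, a dot in the domain.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// Digits plus common separators, with an optional leading "+".
const PHONE_PATTERN = /^\+?[\d\s().-]+$/;

// Returns a list of human-readable problems; an empty list means the input is valid.
function validateBooking(input: BookingInput, now: Date = new Date()): string[] {
  const errors: string[] = [];

  // Whitespace-only names count as empty.
  if (input.name.trim().length === 0) errors.push("name is required");

  if (!EMAIL_PATTERN.test(input.email)) errors.push("email is malformed");

  // Phone is optional, but must look like a phone number when present.
  if (input.phone !== undefined && input.phone !== "") {
    const digits = input.phone.replace(/\D/g, "");
    if (!PHONE_PATTERN.test(input.phone) || digits.length < 7 || digits.length > 15) {
      errors.push("phone is malformed");
    }
  }

  if ((input.notes ?? "").length > MAX_NOTES_LENGTH) errors.push("notes are too long");

  // Reject unparseable dates and anything that is not in the future.
  const startsAt = new Date(input.startsAt);
  if (Number.isNaN(startsAt.getTime())) {
    errors.push("start time is not a valid date");
  } else if (startsAt.getTime() <= now.getTime()) {
    errors.push("start time is in the past");
  }

  return errors;
}
```

The important property is where this runs: on the server, on every request, regardless of what the form in the browser already checked.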
We could not verify any query-level scoping between tenants. The isolation guarantee was "the UI does not show you the other data." That is not a guarantee. Anyone with the API shape and a login token could have queried across tenants.
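As a rough illustration of what query-level scoping means, the sketch below forces every read through a filter keyed on the tenant taken from the authenticated session, never from the request. It uses an in-memory array as a stand-in for the database, and the `Booking`, `Session`, and `scopedBookings` names are hypothetical. In a Postgres-backed stack the same rule would more typically live in the database itself, for example as a row-level security policy.

```typescript
// Hypothetical row and session shapes, for illustration only.
interface Booking {
  id: string;
  tenantId: string;
  customerName: string;
}

interface Session {
  userId: string;
  tenantId: string; // derived from the verified login, not from request parameters
}

// Wraps the data source so that callers cannot read rows outside their tenant.
function scopedBookings(rows: readonly Booking[], session: Session) {
  const ownedByTenant = (row: Booking): boolean => row.tenantId === session.tenantId;
  return {
    // Lists only the current tenant's bookings.
    list: (): Booking[] => rows.filter(ownedByTenant),
    // A guessed ID from another tenant behaves exactly like a missing row.
    findById: (id: string): Booking | undefined =>
      rows.find((row) => row.id === id && ownedByTenant(row)),
  };
}
```

With this shape, hiding data in the UI becomes a convenience rather than the security boundary.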
When a booking failed, the user got either a spinner that never resolved or a generic "something went wrong." The server-side error was swallowed. No monitoring, no alerts. If a user hit an error, we would find out when they emailed us.
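The alternative to a swallowed error is small. The sketch below wraps an operation so that a failure is both reported and turned into a message the user can act on. It is a simplified, synchronous illustration: `reportFailure`, the `report` callback, and the reference string are hypothetical names, and a real app would pass the error on to whatever logging or alerting service it uses.

```typescript
// Either the operation's value, or a user-facing message plus a support reference.
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; userMessage: string; reference: string };

// Runs `action`; on failure, hands the error to `report` instead of dropping it.
function reportFailure<T>(
  action: () => T,
  report: (reference: string, error: unknown) => void,
  reference: string,
): Outcome<T> {
  try {
    return { ok: true, value: action() };
  } catch (error) {
    // The full error goes to the operators; the user sees a message they can act on.
    report(reference, error);
    return {
      ok: false,
      userMessage: `We could not complete your booking. Please try again, or contact support with reference ${reference}.`,
      reference,
    };
  }
}
```

The reference ties the user's report to the logged error, so a support email becomes a lookup instead of a guessing game.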
Not a unit test. Not an integration test. Not an end-to-end test. Every deploy is a leap of faith.
With a dozen bookings, everything was fast. We loaded it with 5,000. Dashboard queries took seconds. The appointments list had no pagination. Neither did the calendar view.
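For reference, pagination does not need to be elaborate. Below is a minimal sketch of cursor-based (keyset) paging over a list already sorted by ID, with an in-memory array standing in for the database. The `Appointment` shape and the `pageOfAppointments` function are assumptions made for the example, not code from either build.

```typescript
// Hypothetical appointment row.
interface Appointment {
  id: number;
  startsAt: string;
}

interface Page {
  items: Appointment[];
  nextCursor: number | null; // pass back as `afterId` to fetch the next page
}

// Returns up to `limit` appointments with an id greater than `afterId`.
// Assumes `sorted` is ordered by ascending id.
function pageOfAppointments(
  sorted: readonly Appointment[],
  limit: number,
  afterId: number | null = null,
): Page {
  const remaining = afterId === null ? sorted : sorted.filter((a) => a.id > afterId);
  const items = remaining.slice(0, limit);
  const last = items[items.length - 1];
  // Only hand out a cursor when there is at least one more row to fetch.
  const nextCursor = remaining.length > limit && last !== undefined ? last.id : null;
  return { items, nextCursor };
}
```

Each request then touches one page of rows instead of all 5,000, and the cost of a page stays flat as the table grows.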
No logs, no metrics, no uptime monitoring. The only way to diagnose a production issue was to refresh the app and hope you could reproduce it.
None of this is visible in a demo. All of it matters the moment you have a real customer.
Because base44 would not export the code, a refactor was not on the table. We rebuilt from scratch, using the base44 product as a working spec. Every choice below is defensible. None of it is exotic. This is the stack you would pick if you were planning to run the app for five years instead of five days.
A selection of the differences, in concrete numbers. The rebuild took two weeks. The base44 version took an hour.
The rebuild delivered every feature the base44 version had, plus ownership, auditability, tests, and observability. Full table in the write-up above.
A customer-facing booking form has exactly one job: collect a valid booking and put it in the database. Everything it gets wrong is visible to a paying customer.
In the base44 version, the name field accepted any string, including whitespace alone. The email field had no format validation, the phone field accepted text, notes had no length limit, and the time slot did not re-check availability on submit. If two people picked the same slot within a few seconds, one of them got a confirmation page for a booking that did not actually exist. The confirm button had no disabled state, so rapid clicks created duplicates.
In the rebuild, name is validated on both client and server. Email format is checked on both as well. Phone is optional but format-checked when present. Notes are capped at 1,000 characters with a visible counter. Slot availability is re-checked inside a database transaction at submission time: the booking commits atomically or fails with a clear error asking the user to pick a new slot. The submit button enters a loading state and is disabled until the request resolves. All of it is covered by end-to-end tests that simulate real-world race conditions.
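The re-check at submission time is the part that prevents phantom confirmations, so here is a minimal sketch of the idea. It uses an in-memory map as a stand-in for the database, where the availability check and the write happen in one uninterrupted step. In a real database that role is played by a transaction or a uniqueness constraint on the slot. The `createSlotBook` function and the slot-key format are hypothetical.

```typescript
// Either a confirmed booking or a clear, user-facing reason for the failure.
type BookingResult =
  | { ok: true; bookingId: string }
  | { ok: false; error: string };

// In-memory stand-in for the bookings table, keyed by slot.
function createSlotBook() {
  const taken = new Map<string, string>(); // slot key -> booking id
  let nextId = 1;

  return {
    // Checks availability and records the booking with nothing in between,
    // so two submissions for the same slot can never both succeed.
    book(slotKey: string): BookingResult {
      if (taken.has(slotKey)) {
        return { ok: false, error: "That slot was just taken. Please pick a new one." };
      }
      const bookingId = `booking-${nextId}`;
      nextId += 1;
      taken.set(slotKey, bookingId);
      return { ok: true, bookingId };
    },
  };
}
```

Whoever submits second gets an honest error and a chance to choose again, instead of a confirmation page for a booking that does not exist.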
This is the last 30% of the work. It does not look like anything in a demo. It is the entire difference between "works" and "ships."
Book a production audit. A senior engineer reads your code, runs the diagnostics, and tells you honestly whether a rescue, a rebuild, or staying on the platform for another month is the right call for your situation.