From proof-of-concept to production
Shyam exclaimed, “We have completed developing an entire application in three weeks meant for the proof-of-concept. This would have taken around six months without the help of vibe coding.” That was two months ago. The solution is still not live. Did the team lack motivation? Did they lose steam after the first demo version was built? Probably neither. We see this trend across the industry, especially in this era of vibe coding: getting the first working application running in the development environment takes only about 20% of the total time, while making it production-ready consumes the remaining 80%.
The challenges of the journey from proof of concept (POC) to production are often underestimated. This week, I list the features of a production-grade solution that developers often overlook when building a POC.
User management
An application intended for an initial demonstration of vendor capability or technical feasibility typically uses a dummy user, and usually only one, to show how the application is used. However, a production-grade solution must support multiple users with varying access levels.
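To make "varying access levels" concrete, here is a minimal sketch of a role check. The role names, the actions, and the `REQUIRED_ROLE` mapping are all hypothetical; a real solution would load them from its own user store and policy.

```python
from enum import IntEnum


class Role(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


# Hypothetical mapping of each action to the minimum role it requires.
REQUIRED_ROLE = {
    "read_report": Role.VIEWER,
    "edit_report": Role.EDITOR,
    "manage_users": Role.ADMIN,
}


def is_allowed(user_role: Role, action: str) -> bool:
    """Return True only if the role meets the minimum level for the action."""
    required = REQUIRED_ROLE.get(action)
    if required is None:
        return False  # deny by default: unknown actions are never permitted
    return user_role >= required
```

Denying unknown actions by default is a deliberate choice in this sketch: a forgotten mapping then fails closed rather than open.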
Security features
Developers often ignore the application’s security features when developing a POC. That is not permissible for an application that is about to be rolled out to production. Email address verification requires the application to send the user an email containing an activation link. Phone number verification requires an OTP (one-time password) to be sent to the user’s mobile number. Developers should also build in two-factor authentication (2FA), which requires two proofs of identity before the user can log in; authenticator apps or fingerprints can serve as the second factor.
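As an illustration of the OTP step, the sketch below issues a short-lived, single-use code and verifies it. The `OtpStore` class, the five-minute expiry, and the in-memory dictionary are assumptions made for the example; delivery by SMS or email is left out, and a production system would keep pending codes in a shared store.

```python
import hmac
import secrets
import time

OTP_TTL_SECONDS = 300  # assumed validity period: five minutes


def generate_otp(digits: int = 6) -> str:
    """Return a zero-padded numeric code from a cryptographically secure source."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


class OtpStore:
    """Hypothetical in-memory store of pending codes, keyed by user id."""

    def __init__(self, ttl: float = OTP_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._pending = {}  # user_id -> (code, expiry time)

    def issue(self, user_id: str) -> str:
        code = generate_otp()
        self._pending[user_id] = (code, self._clock() + self._ttl)
        # Returned here for illustration; a real system would send it to the user.
        return code

    def verify(self, user_id: str, submitted: str) -> bool:
        # pop() makes every code single-use, even after a failed attempt.
        entry = self._pending.pop(user_id, None)
        if entry is None:
            return False
        code, expires_at = entry
        if self._clock() > expires_at:
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        return hmac.compare_digest(code.encode(), submitted.encode())
```

In this sketch one wrong guess invalidates the code, so the user must request a new one. That is stricter than many applications, but it keeps the example short and limits guessing.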
Password and key hashing must be implemented: if the stored credentials ever leak, attackers face the much harder problem of cracking each hash instead of simply reading the passwords. Access to tables, columns, and database vectors must be controlled. Input validation, logging, and auditing cannot be skipped. Secure error handling and exception management are mandatory.
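One way to hash passwords with nothing but the Python standard library is PBKDF2 with a random salt per password, sketched below. The iteration count and the `iterations$salt$digest` storage format are assumptions for this example; many teams would instead reach for a dedicated library such as bcrypt or Argon2.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # illustrative work factor; tune it to your hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return a string holding the iteration count, the salt, and the hash."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    iterations_str, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations_str),
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Because each password gets its own salt, two users with the same password end up with different stored values, which is what forces an attacker to crack every hash separately.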
Rate limiting, abuse detection, and account lockout reduce brute-force attacks and API abuse. Rate limiting restricts the number of requests a service accepts in a specified time window, which also reduces the load on precious hardware resources.
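A sliding-window limiter is one common way to implement the rule described above. The sketch below is a single-process, in-memory version; the class name and parameters are hypothetical, and a deployment with several application servers would need a shared counter store instead.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False  # over the limit: reject without recording the request
        hits.append(now)
        return True
```

A typical use would be to call `allow(client_id)` at the start of each request handler and return an HTTP 429 response when it returns `False`.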
Guardrails, especially with LLMs
Inputs to LLMs and their responses must be guarded against profanity. Rate limits and caps on token usage per hour or per day must apply to LLMs as well.
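The two guardrails mentioned here can be sketched as a blocklist check and a per-user token budget. Everything named below is an assumption for illustration: `BLOCKED_TERMS` holds placeholder words, and real systems usually rely on a moderation model or service rather than a word list.

```python
import re
import time

BLOCKED_TERMS = {"badword", "slur"}  # placeholder entries, not a real blocklist


def contains_blocked_term(text: str) -> bool:
    """Check a prompt or a response against the blocklist, word by word."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in BLOCKED_TERMS for word in words)


class TokenBudget:
    """Cap the tokens each user may spend per fixed window (one hour by default)."""

    def __init__(self, tokens_per_window: int, window_seconds: float = 3600, clock=time.monotonic):
        self._limit = tokens_per_window
        self._window = window_seconds
        self._clock = clock
        self._usage = {}  # user_id -> (window start time, tokens used so far)

    def try_spend(self, user_id: str, tokens: int) -> bool:
        now = self._clock()
        start, used = self._usage.get(user_id, (now, 0))
        if now - start >= self._window:
            start, used = now, 0  # the previous window has ended: reset the count
        if used + tokens > self._limit:
            self._usage[user_id] = (start, used)
            return False
        self._usage[user_id] = (start, used + tokens)
        return True
```

The same `contains_blocked_term` check would run twice per call: once on the user's prompt before it reaches the LLM, and once on the response before it reaches the user.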
The architecture and hosting services
A POC is usually housed entirely on a single laptop: the application, the database, and the flat-file storage all run on the laptop itself. Often, the LLM also runs on the same laptop if the developer chooses an open-source LLM for independence, experimentation, and heavy usage. It is not the same when the solution goes live. Usually, all components (application, database, flat-file storage, the LLM, and, optionally, an LLM gateway) are housed on separate servers. The entire ecosystem must work together seamlessly.
Number of concurrent users
Often, during a POC, developers don’t think much about the load of concurrent user requests on the services mentioned earlier (application, database, flat-file storage, LLM, and LLM gateway). However, before rolling out to production, one must determine the solution’s hardware requirements. The hardware requirements are derived from the number of users expected to use the solution concurrently. Only then can the solution’s infrastructure be provisioned.
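A rough, back-of-the-envelope way to go from concurrent users to hardware is sketched below. It assumes each user alternates between waiting for a response and "thinking" before the next request, so the request rate is the number of users divided by the sum of those two times. All the figures in the example are made up, and the per-instance throughput would have to come from a load test of the actual solution.

```python
import math


def request_rate(concurrent_users: int, think_time_s: float, response_time_s: float) -> float:
    """Estimated requests per second generated by the concurrent users."""
    return concurrent_users / (think_time_s + response_time_s)


def required_instances(
    concurrent_users: int,
    think_time_s: float,
    response_time_s: float,
    rps_per_instance: float,
    headroom: float = 0.3,
) -> int:
    """Instances needed when each one is run at only (1 - headroom) of its measured capacity."""
    usable_rps = rps_per_instance * (1 - headroom)
    rate = request_rate(concurrent_users, think_time_s, response_time_s)
    return math.ceil(rate / usable_rps)


# Hypothetical figures: 500 users, 9 s think time, 1 s response time,
# and 20 requests/s per instance measured in a load test.
print(request_rate(500, 9, 1))            # 50.0 requests per second
print(required_instances(500, 9, 1, 20))  # 4 instances with 30% headroom
```

The same calculation has to be repeated for each component (application, database, LLM, and so on), because each has its own throughput limit.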
Solution monitoring features
One must monitor the solution for ML model degradation. The metrics to be monitored must cover both the technical and the business performance of the solution. Often, in thoughtfully developed solutions, the feedback loop is automated, and the decision to retrain the model is built in.
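A built-in retraining decision can be as simple as comparing a rolling accuracy against the accuracy measured at deployment. The sketch below assumes that ground-truth labels eventually arrive for each prediction; the class name, the window size, and the tolerance are illustrative choices, not recommendations.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag degradation."""

    def __init__(self, baseline_accuracy: float, tolerance: float = 0.05, window: int = 100):
        self._baseline = baseline_accuracy
        self._tolerance = tolerance
        self._outcomes = deque(maxlen=window)  # True where the prediction was correct

    def record(self, prediction, actual) -> None:
        self._outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing has been recorded."""
        if not self._outcomes:
            return None
        return sum(self._outcomes) / len(self._outcomes)

    def needs_retraining(self) -> bool:
        # Wait for a full window so a handful of early errors cannot trigger retraining.
        if len(self._outcomes) < self._outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self._baseline - self._tolerance
```

Accuracy stands in for the technical metric here; a business metric such as conversion rate could be tracked in the same way with its own baseline.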
Continuous integration and continuous deployment
The discussion would not be complete without mentioning CI/CD, an automated workflow that builds, tests, and deploys code changes frequently and reliably, enabling faster, safer releases to production.
What is missing from the above list? I would say the following: 1) Data discipline and governance, 2) Reliability guarantees and response latencies, 3) Cost control, 4) Observability depth, 5) Human and process integration. Remember HITL (human-in-the-loop)?
No wonder Shyam is still working to make his solution ready for production.
Disclaimer
Views expressed above are the author’s own.